The Form 1-K Files Dataset is a complete, monthly-refreshed archive of Form 1-K and Form 1-K/A submissions to EDGAR — the annual reports mandated by Rule 257(b)(1) of Regulation A for Tier 2 issuers under the Securities Act of 1933. Each record reproduces one EDGAR accession in its original structure: a JSON metadata sidecar, the canonical onekfiler XML cover, its XSL-rendered XHTML twin, and the narrative and exhibit HTML documents that make up the filing body. Filers are Tier 2 Regulation A issuers themselves, submitting under their own CIK following SEC qualification of a Form 1-A offering statement. The dataset begins in April 2016 — shortly after the first fiscal cycles under the 2015 Regulation A+ overhaul produced the first 1-K filings — and extends through the current monthly refresh, covering both original 1-K filings and 1-K/A amendments (including the Special Financial Report variant that shares the 1-K submission type).
The Form 1-K Files Dataset packages every Form 1-K and Form 1-K/A submission made to EDGAR since April 2016. Each record in the dataset corresponds to a single EDGAR accession — one Regulation A Tier 2 issuer's annual report for one fiscal year, or a single amendment to such a disclosure. Records live inside monthly ZIP containers at the path <YYYY>-<MM>/<accession-number-no-dashes>/, and each accession folder carries one metadata.json alongside the machine-readable cover, its rendered XHTML twin, and the narrative and exhibit HTML documents that make up the filing body. Inline image attachments are deliberately omitted from the packaged copy.
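As a rough illustration of the folder layout described above, the following sketch groups the entries of a monthly container by accession folder. It assumes the ZIP's internal paths follow the <YYYY>-<MM>/<accession-number-no-dashes>/ pattern exactly as stated; the in-memory archive and the function name group_by_accession are hypothetical stand-ins so the example runs without a download.

```python
import io
import zipfile
from collections import defaultdict


def group_by_accession(container: zipfile.ZipFile) -> dict:
    """Map '<YYYY>-<MM>/<accession>' folder keys to the file paths inside them."""
    records = defaultdict(list)
    for name in container.namelist():
        parts = name.split("/")
        # Expect at least <YYYY>-<MM>/<accession>/<file>; skip directory entries.
        if len(parts) < 3 or name.endswith("/"):
            continue
        folder = "/".join(parts[:2])
        records[folder].append("/".join(parts[2:]))
    return dict(records)


# Build a tiny in-memory container (hypothetical content) for demonstration.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("2025-04/000109690625001945/metadata.json", "{}")
    zf.writestr("2025-04/000109690625001945/primary_doc.xml", "<x/>")
    zf.writestr("2025-04/000109690625001945/xsl1-K_X01/primary_doc.xml", "<x/>")

with zipfile.ZipFile(buffer) as zf:
    records = group_by_accession(zf)

for folder, files in records.items():
    print(folder, files)
```

For a real container, replace the in-memory buffer with the path of a downloaded monthly ZIP.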
The underlying Form 1-K is the annual report mandated by Rule 257(b)(1) of Regulation A under the Securities Act of 1933. It is the ongoing-reporting counterpart to the Form 1-A offering statement, required of issuers that have conducted a Tier 2 offering (capped at $75 million in a rolling 12-month window under the 2021-amended rules, originally $50 million). The form must be filed within 120 calendar days after the issuer's fiscal year-end and plays a role broadly analogous to Form 10-K for Exchange Act reporters, but under the Regulation A disclosure regime rather than the Exchange Act periodic-reporting regime. Form 1-K combines a structured EDGAR cover — offering-qualification dates, securities sold, auditor and legal fees, net proceeds — with a narrative annual report in Part II and a directors/officers/compensation block in Part III, plus two years of audited financial statements and a defined set of exhibits. Form 1-K/A is the amendment, carrying the same internal structure but restating or correcting a previously filed 1-K. A "Special Financial Report" variant exists for issuers that qualified a Tier 2 offering shortly before their fiscal year-end and must file abbreviated financial statements to bridge the gap; this variant travels on the same 1-K submission type and is distinguished only by a cover-page indicator on the structured form, not by a separate form code.
The dataset is distributed as monthly ZIP containers. The file types present in the dataset are XML (the two primary-doc files), HTML (narrative body and exhibits, with the SGML wrapper), JSON (metadata.json), and TXT (referenced by URL for the SGML submission text, with the primary HTML content preserved in decomposed form inside the folder).
A single accession folder contains four overlapping content layers:
A JSON metadata sidecar (metadata.json) derived from the EDGAR submission header, which describes the filing, the filer(s), and itemizes every document that was part of the original submission, including items deliberately not packaged, such as inline images and the concatenated SGML submission text file.
The canonical cover XML (primary_doc.xml) conforming to EDGAR's onekfiler XML schema under the namespace http://www.sec.gov/edgar/rega/onekfiler. It carries the structured header (submission type, CIK, flags, reporting period) and structured form data (issuer identification, fiscal year, offering summary with dollar and share counts).
The rendered XHTML twin (xsl1-K_X01/primary_doc.xml), produced by rendering the canonical XML with EDGAR's Regulation A 1-K print stylesheet and linking to /css/REGA_1K_print.css. This is the human-readable rendering that EDGAR's viewer returns for the "Filing Details" link.
The narrative and exhibit HTML documents: a .htm document typed PART II, or a single combined document typed PART II AND III, plus zero or more exhibit HTML files. Every non-primary_doc.xml document preserves EDGAR's legacy SGML <DOCUMENT> wrapper around the HTML body, with <TYPE>, <SEQUENCE>, <FILENAME>, <DESCRIPTION>, and <TEXT> leader lines.
The concatenated SGML submission text file (<accession>.txt) that EDGAR assembles from all parts is referenced in metadata.json by URL but not included in the folder, and neither are the submission's image attachments (typically GRAPHIC .jpg or .png files used inline by the narrative).
metadata.json
The metadata sidecar is a flat JSON object with a small set of scalar header fields plus four arrays.
Scalar header fields identify the submission and link back to EDGAR:
formType: "1-K" for an original annual report or "1-K/A" for an amendment.
accessionNo: the canonical dashed accession number, e.g. 0001096906-25-001945.
filedAt: ISO-8601 filing timestamp with timezone offset, normally US Eastern.
periodOfReport: the fiscal year-end covered by the annual report, as an ISO date.
description: the short EDGAR description string (commonly "Form 1-K -").
linkToFilingDetails: URL to the XSL-rendered primary doc on sec.gov.
linkToHtml: URL to the EDGAR index page for the accession.
linkToTxt: URL to the complete SGML submission text file.
linkToXbrl: URL to an XBRL archive if any; Form 1-K is not an XBRL-bearing form, so this is typically empty.
id: a 32-character hex content identifier.
documentFormatFiles enumerates every document in the original EDGAR submission, including those intentionally excluded from the packaged folder. Each entry carries a sequence (1-based, with sequence "1" reserved for the form primary document and a blank sequence used for the complete-submission .txt), a type code drawn from EDGAR's Regulation A vocabulary (for example 1-K, PART II, PART III, PART II AND III, EX1K-2, EX1K-6 MAT CTRCT, EX1K-11, GRAPHIC, and blank for the submission .txt), a free-text description provided by the filer, a documentUrl pointing to the public sec.gov copy, and a size string in bytes. The canonical primary_doc.xml and its XSL-rendered sibling both appear with sequence: "1" and type: "1-K"; the XSL copy's size field is blank because EDGAR does not report one.
dataFiles is reserved for ancillary structured data files and is empty for Form 1-K submissions. seriesAndClassesContractsInformation is reserved for investment-company series/class metadata and is empty for operating-company issuers, which account for essentially all Regulation A Tier 2 filers.
entities describes each filer or co-filer on the submission — almost always a single operating-company issuer for a 1-K. Per-entity fields include companyName suffixed with a role tag such as "(Filer)", the unpadded cik, the submission type ("1-K"), the act under which the filing is made ("33", reflecting Regulation A's Securities Act basis), the Regulation A fileNo in the 24R-NNNNNN series, the EDGAR filmNo, irsNo, two-letter stateOfIncorporation, fiscalYearEnd as a four-character MMDD string, and a combined sic field pairing the four-digit code with its textual description.
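The per-entity conventions above (a role tag appended to companyName, an unpadded cik, a four-character MMDD fiscalYearEnd) usually need light normalization before joining against other sources. The sketch below shows one way to do that; the sample entity is hypothetical and only uses field names listed in the paragraph above, and normalize_entity is an illustrative helper, not part of any published client.

```python
import re

# Matches a trailing parenthesized role tag such as " (Filer)".
ROLE_TAG = re.compile(r"\s*\(([^()]*)\)\s*$")


def normalize_entity(entity: dict) -> dict:
    """Split the role tag off companyName and decode fiscalYearEnd (MMDD)."""
    raw_name = entity.get("companyName", "")
    match = ROLE_TAG.search(raw_name)
    name = raw_name[: match.start()] if match else raw_name
    role = match.group(1) if match else None

    fye = entity.get("fiscalYearEnd", "")
    valid = len(fye) == 4 and fye.isdigit()
    month_day = (int(fye[:2]), int(fye[2:])) if valid else None

    return {
        "name": name.strip(),
        "role": role,
        # Strip any zero padding so the value joins with the unpadded form.
        "cik": entity.get("cik", "").lstrip("0"),
        "fiscal_year_end": month_day,
        "file_no": entity.get("fileNo"),
    }


# Hypothetical entity shaped after the fields described above.
sample = {
    "companyName": "Example Issuer Inc. (Filer)",
    "cik": "1234567",
    "fileNo": "24R-000001",
    "fiscalYearEnd": "1231",
    "stateOfIncorporation": "DE",
}
entity = normalize_entity(sample)
print(entity)
```

The combined sic field is left untouched here because its exact delimiter is not specified above.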
primary_doc.xml (canonical)
The canonical primary document is a compact XML cover-page filing conforming to the EDGAR onekfiler schema. Its two top-level branches are headerData and formData.
headerData contains the submissionType (1-K or 1-K/A) and a filerInfo block with a liveTestFlag (LIVE in production filings, TEST in EDGAR test submissions), the zero-padded 10-digit cik at filer/issuerCredentials/cik, a set of boolean flags (shellCompanyFlag, confirmingCopyFlag, successorFilingFlag, returnCopyFlag, overrideInternetFlag, rendered as Y/N or true/false), and a reportingPeriod date formatted as MM-DD-YYYY.
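A minimal sketch of reading the headerData branch with Python's standard library follows. The element nesting mirrors the description above, but the root element name (edgarSubmission) and the exact placement of the flags container are assumptions; the sample document is hypothetical. The helper also strips the zero padding from cik so it lines up with the unpadded value in metadata.json, and converts the MM-DD-YYYY reportingPeriod to a date.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

NS = {"k": "http://www.sec.gov/edgar/rega/onekfiler"}

# Hypothetical cover XML; the root element name is an assumption.
SAMPLE = """<edgarSubmission xmlns="http://www.sec.gov/edgar/rega/onekfiler">
  <headerData>
    <submissionType>1-K</submissionType>
    <filerInfo>
      <liveTestFlag>LIVE</liveTestFlag>
      <filer><issuerCredentials><cik>0001234567</cik></issuerCredentials></filer>
      <flags><successorFilingFlag>N</successorFilingFlag></flags>
      <reportingPeriod>12-31-2024</reportingPeriod>
    </filerInfo>
  </headerData>
</edgarSubmission>"""


def as_bool(text):
    """Flags may be rendered as Y/N or true/false."""
    return (text or "").strip().lower() in {"y", "true"}


def read_header(xml_text: str) -> dict:
    root = ET.fromstring(xml_text)
    info = "k:headerData/k:filerInfo"
    period = root.findtext(f"{info}/k:reportingPeriod", namespaces=NS)
    cik = root.findtext(
        f"{info}/k:filer/k:issuerCredentials/k:cik", default="", namespaces=NS
    )
    successor = root.findtext(f"{info}/k:flags/k:successorFilingFlag", namespaces=NS)
    return {
        "submission_type": root.findtext("k:headerData/k:submissionType", namespaces=NS),
        "live": root.findtext(f"{info}/k:liveTestFlag", namespaces=NS) == "LIVE",
        "cik": cik.lstrip("0"),
        "successor": as_bool(successor),
        "period": datetime.strptime(period, "%m-%d-%Y").date() if period else None,
    }


header = read_header(SAMPLE)
print(header)
```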
formData carries the substantive cover-page items:
item1: issuer cover information. Indicates whether the filing is an Annual Report or a Special Financial Report, plus fiscal year-end, principal executive office address, telephone number, and the title of the securities reported on.
item1Info: structured issuer identification (issuerName, cik, jurisdictionOrganization, irsNum).
item2.regArule257: a boolean capturing whether the filing is made as a successor filing under Rule 257(b)(5) (relevant when a Regulation A issuer's reporting obligations are assumed by a successor in a business combination).
summaryInfo: optional but common for issuers that sold securities in the reporting period. Fields capture offering-level economics: commissionFileNumber, offeringQualificationDate, offeringCommenceDate, qualifiedSecuritiesSold (period count), offeringSecuritiesSold (cumulative count), pricePerSecurity, aggregrateOfferingPrice (EDGAR's original misspelling preserved), aggregrateOfferingPriceHolders, auditor/legal/blue-sky service-provider names and fees (auditorSpName/auditorFees, legalSpName/legalFees, blueSkySpName/blueSkyFees), issuerNetProceeds, and a clarificationResponses free-text block for narrative clarifications attached to the summary.
xsl1-K_X01/primary_doc.xml (rendered)
The XSL-rendered sibling is an XHTML document produced by applying EDGAR's Regulation A 1-K print stylesheet to the canonical XML. It starts with an XHTML doctype, links to /css/REGA_1K_print.css, and lays out the Form 1-K cover page exactly as EDGAR's viewer displays it. It is semantically redundant with the canonical XML but useful for human review and for consumers that prefer to render the cover without implementing the schema themselves.
The substantive annual-report content arrives as one or more HTML files carrying the narrative body. The dataset shows two common patterns: a single <issuer>_1k.htm typed PART II containing the full annual report (often including Part III content merged into the tail of Part II), or a single combined document such as partiiandiii.htm typed to span both parts. Each such file is prefixed by EDGAR's SGML wrapper:
<DOCUMENT>
<TYPE>PART II
<SEQUENCE>2
<FILENAME>gatc_1k.htm
<DESCRIPTION>PART II
<TEXT>
<HTML>... body ...</HTML>
</TEXT>
</DOCUMENT>
The HTML body carries the items required by Form 1-K Part II and, when merged, Part III:
When a separate Part III document is present, it typically carries items 3–7 and the exhibit index; small Regulation A issuers often combine all of Part II and Part III into a single HTML document for filing convenience.
Exhibits are delivered as additional .htm files (one per exhibit), each wrapped in its own SGML <DOCUMENT> block with a Regulation A exhibit <TYPE> code. Form 1-K's exhibit taxonomy (Part III, Item 17 of the form instructions) assigns a numeric slot to each category:
EX1K-1: Underwriting agreement. Uncommon for Regulation A Tier 2 issuers, which frequently self-offer.
EX1K-2: Charter and bylaws (including amendments).
EX1K-3: Instruments defining the rights of securityholders (indentures, specimen certificates, etc.).
EX1K-4: Subscription agreement.
EX1K-5: Voting trust agreement.
EX1K-6 MAT CTRCT: Material contracts. The most common exhibit category in practice; examples include license agreements, employment or consulting agreements, distribution agreements, and termination agreements.
EX1K-7: Plan of acquisition, reorganization, arrangement, liquidation, or succession.
EX1K-8: Escrow agreements.
EX1K-9: Letter regarding change in certifying accountant.
EX1K-10: Power of attorney.
EX1K-11: Consent of auditor (and, where applicable, consents of other named experts).
EX1K-12: Opinion regarding legality.
EX1K-13: "Testing the waters" materials used by the issuer prior to qualification.
EX1K-14: Consent to service of process (Form F-X equivalent) for foreign issuers.
EX1K-15: Additional exhibits, a catch-all slot used for anything not fitting the numbered categories above.
Each exhibit file preserves its SGML wrapper ahead of the HTML body. The filer-chosen filename (for example gatc_ex6z31.htm) has no semantic meaning beyond what <TYPE> and <DESCRIPTION> record. The <DESCRIPTION> line commonly carries rich context such as the counterparties and effective date of a material contract, which is often the most efficient way to triage exhibits without parsing the HTML body.
EDGAR assembles the complete submission into a single SGML text file conventionally named <accession>.txt, in which each constituent document appears inside a <DOCUMENT>...</DOCUMENT> block. The packaged record preserves this wrapper on every HTML document (but not on the XML primary docs, which are pure XML). Consumers that want clean HTML must strip everything before the first <HTML> tag and the trailing </TEXT></DOCUMENT> footer; consumers that want exhibit-type and description metadata without parsing the inner HTML can read <TYPE>, <SEQUENCE>, <FILENAME>, and <DESCRIPTION> directly from the leading SGML lines.
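The stripping step described above can be sketched as follows. This is one possible approach using regular expressions rather than a full SGML parser; it assumes the wrapper looks like the sample shown earlier (leader lines, then an <HTML> tag, then a </TEXT></DOCUMENT> footer), and the function name split_sgml_document is hypothetical.

```python
import re

# Leader lines carry their value on the same line, with no closing tag.
HEADER_LINE = re.compile(r"^<(TYPE|SEQUENCE|FILENAME|DESCRIPTION)>(.*)$", re.MULTILINE)
FOOTER = re.compile(r"\s*</TEXT>\s*</DOCUMENT>\s*$", re.IGNORECASE)


def split_sgml_document(raw: str):
    """Return (SGML header fields, HTML body) for one wrapped document."""
    start = raw.lower().find("<html")
    if start == -1:
        raise ValueError("no <HTML> tag found")
    leader, body = raw[:start], raw[start:]
    headers = {key: value.strip() for key, value in HEADER_LINE.findall(leader)}
    return headers, FOOTER.sub("", body)


SAMPLE = """<DOCUMENT>
<TYPE>PART II
<SEQUENCE>2
<FILENAME>gatc_1k.htm
<DESCRIPTION>PART II
<TEXT>
<HTML><BODY><P>Annual report</P></BODY></HTML>
</TEXT>
</DOCUMENT>
"""

headers, html = split_sgml_document(SAMPLE)
print(headers)
print(html)
```

Reading the leader lines first and passing only the remaining body to an HTML parser keeps document classification independent of the quality of the filer's markup.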
For each accession, the packaged folder contains:
metadata.json: always one per folder.
primary_doc.xml: the canonical EDGAR 1-K cover XML.
xsl1-K_X01/primary_doc.xml: the XSL-rendered XHTML form view.
The narrative and exhibit .htm documents, each with its SGML <DOCUMENT> wrapper preserved.
Two categories of content from the original EDGAR submission are referenced in metadata.json but not placed inside the accession folder:
Image attachments. GRAPHIC-type entries (.jpg, .png) used inline by the narrative or cover are enumerated in documentFormatFiles with their sec.gov URLs and sizes but are not copied into the ZIP. Narrative HTML that references these images by relative filename will therefore render with broken image links when viewed from the folder alone; canonical copies remain fetchable from the documentUrl values.
The complete submission .txt file. EDGAR's concatenated submission text carrying every document in a single SGML stream is referenced by URL (linkToTxt) but not included, because its content is already present in decomposed form across the individual XML and HTML files in the folder.
Form 1-K is not an XBRL-bearing form, so linkToXbrl is empty and no XBRL taxonomy or instance files appear in the folder.
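To make the packaged versus URL-only distinction concrete, the sketch below partitions documentFormatFiles by type. It assumes, as described above, that omitted images carry type GRAPHIC and that the complete-submission .txt carries a blank type; the sample entries and their example.invalid URLs are hypothetical.

```python
def split_packaged(document_format_files):
    """Separate entries expected in the folder from those referenced by URL only."""
    packaged, url_only = [], []
    for doc in document_format_files:
        doc_type = (doc.get("type") or "").strip()
        # GRAPHIC attachments and the blank-typed submission .txt are not packaged.
        if doc_type in ("GRAPHIC", ""):
            url_only.append(doc)
        else:
            packaged.append(doc)
    return packaged, url_only


# Hypothetical documentFormatFiles array for one accession.
sample = [
    {"sequence": "1", "type": "1-K", "documentUrl": "https://example.invalid/primary_doc.xml"},
    {"sequence": "2", "type": "PART II", "documentUrl": "https://example.invalid/gatc_1k.htm"},
    {"sequence": "3", "type": "GRAPHIC", "documentUrl": "https://example.invalid/logo.jpg"},
    {"sequence": "", "type": "", "documentUrl": "https://example.invalid/submission.txt"},
]

packaged, url_only = split_packaged(sample)
image_urls = [d["documentUrl"] for d in url_only if d["type"] == "GRAPHIC"]
print(image_urls)
```

The resulting image_urls list is what a consumer would fetch separately if the narrative HTML needs to render with its images.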
Form 1-K was introduced as part of the SEC's 2015 Regulation A+ overhaul (effective June 19, 2015), which split Regulation A into Tier 1 and Tier 2 and created the ongoing-reporting regime for Tier 2 issuers. The dataset begins in April 2016, shortly after the first fiscal cycles under that regime produced the first 1-K filings. The Part II item structure, the EX1K-1 through EX1K-15 exhibit taxonomy, and the onekfiler XML schema for the structured cover have been substantially stable across the dataset's coverage window; the record anatomy described above applies uniformly from April 2016 forward.
Two regulatory refinements affect values inside the summaryInfo block without changing the structural layout:
The 2021 increase of the Tier 2 cap from $50 million to $75 million widens the range of values that can appear in aggregrateOfferingPrice and related fields for issuers qualified under the revised cap, but no new XML elements were introduced.
Successor filings under Rule 257(b)(5) are signaled by successorFilingFlag in headerData and the item2.regArule257 boolean in formData; these are stable elements that distinguish successor filings from ordinary 1-Ks without altering the overall record shape.
Issuer-level variation is significant. Some filers split Part II, Part III, financial statements, and the auditor's report across multiple HTML documents with distinct <TYPE> codes. Others, particularly smaller issuers using common filing-agent templates, deliver a single combined narrative document covering the entire form. Exhibit counts range from zero to many, with EX1K-6 MAT CTRCT being by far the most frequently encountered type.
Form 1-K has been an HTML-and-XML form since inception — it was introduced after EDGAR's HTML era was fully established, so there is no ASCII/text-only filing era in this dataset. Every record uses the same mixed-format layout: EDGAR's onekfiler XML schema for the structured cover, an XSL-rendered XHTML copy for human display, HTML for narrative and exhibits, and SGML <DOCUMENT> wrappers around each non-XML document. The principal practical format wrinkle is therefore the persistent SGML wrapper on HTML documents, which must be stripped by any consumer that wants to parse the inner HTML as a standard web document.
The formType field in metadata.json and the submissionType element in primary_doc.xml are the only authoritative indicators that a record is an amendment. Amendments may restate the entire annual report or target specific items, and the form does not require a machine-readable diff, so comparing an amendment to its predecessor requires document-level analysis.
For the Special Financial Report variant, the item1 cover indicator in the canonical XML is the authoritative distinguisher, and the set of required financial statements differs (Special Financial Reports cover a shorter period, typically a single fiscal year).
headerData.filerInfo.flags.successorFilingFlag and formData.item2.regArule257 both capture the successor-filing case; the two should agree within a well-formed filing.
The summaryInfo block is populated when the issuer reports offering-level activity for the period. Its absence or sparseness does not indicate malformed data; it indicates that no qualified offering activity occurred in the period being reported on.
xsl1-K_X01/primary_doc.xml is semantically redundant with the canonical primary_doc.xml. For machine extraction of cover-page data, prefer the canonical XML; the XSL rendering is retained for human viewing and parity with EDGAR's public display.
Because GRAPHIC attachments are not packaged, narrative HTML that embeds images will render with broken image references when viewed from the folder. The image URLs remain resolvable via documentFormatFiles[].documentUrl on sec.gov.
To obtain clean HTML, strip everything before the first <HTML> tag and the trailing </TEXT></DOCUMENT> footer, or use an SGML-tolerant parser. The SGML header is the canonical source for <TYPE>, <SEQUENCE>, <FILENAME>, and <DESCRIPTION> on each document.
Filenames such as gatc_1k.htm or partiiandiii.htm reflect filer convention and filing-agent templates; they are not a reliable signal of content. Always use the SGML <TYPE> header or the documentFormatFiles[].type value in metadata.json to classify a document.
The summaryInfo block uses EDGAR's original aggregrateOfferingPrice and aggregrateOfferingPriceHolders spellings; consumers should not normalize these keys away when parsing the XML.
Part III content may arrive as a separate <TYPE>PART III document, may be fully merged into the Part II document, or may arrive under the combined <TYPE>PART II AND III code. Consumers should not rely on item-level boundaries being exposed by document boundaries; section extraction from the rendered HTML is usually required.
Each record in the Form 1-K Files Dataset is an annual report filed on EDGAR by a Regulation A Tier 2 issuer that is subject to the ongoing reporting obligations of Rule 257 following SEC qualification of a Form 1-A offering statement. The filer is always the issuer itself, submitting under its own CIK. A single record corresponds to one Form 1-K, Form 1-K/A amendment, or Special Financial Report submission.
The Form 1-K filer population consists of:
Entities outside the filer population:
The Form 1-K obligation is periodic and calendar-driven, tied to the issuer's status as a Tier 2 reporter:
The form is not event-driven. Transactions and material developments are reported on Form 1-U (current report) and Form 1-SA (semiannual).
The Form 1-K obligation ends when the issuer files Form 1-Z under Rule 257(d) and meets its conditions: all required ongoing reports have been filed for the shorter of the reporting period or the most recent fiscal year, and the issuer is not subject to Section 13 or 15(d). Once Form 1-Z is effective, no further 1-K filings are required. An issuer that becomes subject to Exchange Act reporting transitions off the 1-K regime; its Tier 2 annual obligation is satisfied by Form 10-K under Rule 257(b)(2)(iii).
Form 1-K sits entirely within Regulation A under the Securities Act of 1933, codified at 17 CFR 230.251–230.263, with ongoing reporting under Rule 257. Rule 257(b)(1) establishes the annual 1-K; Rule 257(b)(3)–(b)(4) establish semiannual 1-SA and current 1-U; Rule 257(d) governs the Form 1-Z exit. Regulation A was rewritten under Section 401 of the JOBS Act of 2012, with the current Tier 1/Tier 2 structure and the Form 1-K requirement taking effect June 19, 2015 ("Regulation A+"). The Tier 2 offering cap was initially $50 million in a 12-month period and was raised to $75 million in March 2021. The earliest 1-K filings on EDGAR date to April 2016, following the first post-effectiveness fiscal-year cycle; no pre-EDGAR history exists because the form did not exist under the pre-2015 Regulation A regime.
Form 1-K sits inside the Regulation A ongoing-reporting family and is easily confused with adjacent disclosure regimes. The closest comparisons are the other Reg A forms (1-A, 1-SA, 1-U, 1-Z), its own amendments (1-K/A), the Exchange Act periodic reports (10-K, 10-Q), and the Regulation Crowdfunding annual report (C-AR). The sections below mark the boundaries so 1-K records are not silently pooled with filings that carry different legal weight, disclosure depth, or filer coverage.
10-K is the annual report of fully reporting Exchange Act registrants under Section 13 or 15(d). The confusion is structural: both are annual reports with overlapping section names (business, risk factors, MD&A, audited financials). The differences that matter:
10-K is not a substitute. Using 10-Ks to benchmark 1-K issuers will systematically overstate disclosure depth.
1-A is the entry document that qualifies a Tier 1 or Tier 2 Reg A offering; 1-K is the annual report that follows Tier 2 qualification.
1-A and 1-K are complements, not substitutes: 1-A sets the baseline, 1-Ks track performance against it.
1-SA is the semi-annual report under Rule 257(b)(3), due within 90 days after the first six months of the fiscal year.
1-U is the Reg A Tier 2 current-event form under Rule 257(b)(4), analogous to Form 8-K but with a shorter, Reg A-specific event list (fundamental changes, bankruptcy, auditor change, non-reliance on prior financials, change in control, departure of principal officers, unregistered equity sales).
1-Z terminates Reg A reporting obligations or concludes a Tier 1 offering.
1-K/A filings are included in this dataset alongside originals.
Listed only to mark the boundary: Reg A issuers do not file Form 10-Qs. The finest regular Reg A cadence is semi-annual via 1-SA, supplemented by 1-U for events. No quarterly Reg A data exists. For Exchange Act quarterly data, see 10-Q filings.
C-AR is the closest structural analogue to 1-K outside Reg A: the annual report for issuers that raised capital under Section 4(a)(6) and Regulation CF.
Tier 1 issuers file 1-A and are subject to state blue-sky review, but they have no 1-K obligation; their post-qualification filings are limited to 1-Z. Any study framed as "all Reg A issuers" using 1-K alone will systematically exclude Tier 1 activity, which requires a 1-A dataset to observe.
Form 1-K is the only SEC annual-report form used by Tier 2 Regulation A issuers. It is narrower than a 10-K, more substantive than a 1-SA, periodic rather than event-driven unlike 1-U, continuing rather than terminal unlike 1-Z, and built on a different exemption than C-AR. Its filer universe is Tier 2 Reg A issuers from April 2016 forward, and does not meaningfully overlap with Exchange Act registrants or Reg CF issuers. The other Reg A forms are complements; Exchange Act and Reg CF annual reports describe different populations under different rules.
Form 1-K is the annual report for Reg A Tier 2 issuers, a segment dominated by small operating companies, early-stage real-estate vehicles, token-adjacent issuers, and companies that raised capital from retail investors on online platforms. Professional users of this dataset rarely overlap with the Form 10-K audience.
Build issuer coverage on Reg A names absent from mainstream feeds. They pull metadata.json and primary_doc.xml header fields (CIK, fiscal year end, jurisdiction, SIC, offering history) to assemble rosters, then extract Part II MD&A and Item 8 audited financials to compute revenue growth, cash burn, going-concern language, and capital-raise cadence. Output: coverage lists with standardized financial snapshots for allocator-facing research.
Track issuers that raised on online investment platforms, for which the 1-K is the only consistent post-offering disclosure. Focus on Part II business description, officer and director listings, related-party disclosure, and follow-on capital raised, reconciled against platform-side offering data. Supports survivorship analysis across crowdfunding vintages and detection of issuers graduating to Form S-1 or acquisition.
Benchmark peer drafting of Part II risk factors, MD&A liquidity and capital resources, related-party transactions, and smaller-reporting executive compensation. Review EX1K-2 (charter and bylaws), EX1K-3 (instruments defining security holder rights), EX1K-4 (subscription agreements), and EX1K-6 (material contracts) for recurring issues such as anti-dilution mechanics, token rights, and offering-circular amendments. Output: drafting precedent, completeness checklists, and Reg A eligibility advice.
Use 1-Ks as the primary public audited set for these issuers. Compare accounting policies, revenue recognition, and going-concern footnotes across peer filers; pull Item 8 statements and the auditor consent exhibit (EX1K-11) to track auditor turnover, qualified opinions, and restatements signaled by 1-K/A amendments. Supports peer review, inspection preparation, and engagement risk scoping.
Calibrate their own 1-K against peers on MD&A depth, officer and director tables, executive compensation presentation, and exhibit indexes, then use metadata.json filing dates to manage the 120-day post-fiscal-year-end deadline. Supports filing calendars, exhibit checklists, and responses to staff comment letters.
Ingest the corpus from 2016 forward to study Reg A's effectiveness as a capital-formation tool. Parse metadata.json and Item 8 financials and link CIKs across Form 1-A, 1-K, 1-SA, and 1-U. Outputs include working papers on adoption rates, disclosure readability, offering-to-survival correlation, and retail-investor protection under Tier 2.
Screen for warning signs: going-concern language, late filings, insider compensation outsized relative to revenue, related-party transactions, auditor changes, and 1-K/A restatements. Combine Part II narrative with EX1K-6 material contracts and the subsidiary list to build threads on specific issuers, including token issuers and real-estate sponsors.
Monitor Tier 2 issuers that are preempted from state merit review but remain subject to antifraud authority. Screen metadata.json for missed 120-day deadlines, reconcile 1-A offering claims against 1-K results, and cluster issuers sharing officers, auditors, or promoters. Supports exam priorities, enforcement referrals, and rulemaking analysis.
Due-diligence direct Reg A investments, secondary purchases of Reg A securities, and funds holding Reg A paper. Pull Item 8 audited financials, MD&A liquidity, officer and director biographies, and EX1K-6 material contracts to verify that operations, governance, and capital position match offering-stage claims. Supports investment-committee memos and post-investment monitoring for illiquid positions.
Meet listing and continued-inclusion standards that require current Reg A reports. Use metadata.json to confirm timely 1-Ks, flag 1-K/A restatements, and extract share-count and capitalization data from Item 8 and exhibits for order-book reference. Supports listing eligibility, disclosure-driven trading halts, and issuer profile pages.
Underwrite small operators, real-estate sponsors, and revenue-based finance borrowers that lack ratings or full public filings. Focus on Item 8 balance sheet and cash flows, debt and lease footnotes, related-party transactions, and going-concern language, tied back to covenant packages. Supports credit memos, covenant checks, and watchlist escalation.
Build retrieval systems and entity graphs for the small-issuer segment. Parse primary_doc.xml for structured metadata, chunk Part II HTML by item, extract Item 8 financial tables, and index EX1K exhibits by type for retrieval over material contracts, organizational documents, and subsidiary lists. Outputs: entity-resolved Reg A databases, fine-tuned small-issuer models, and retrieval pipelines for analyst assistants.
The workflows below show how practitioners put Form 1-K records to work. Each one ties to specific fields or document types inside the accession folder.
Private-markets analysts and platform researchers pull formData/summaryInfo from primary_doc.xml across successive 1-K vintages for the same CIK, reading qualifiedSecuritiesSold, offeringSecuritiesSold, aggregrateOfferingPrice, and issuerNetProceeds to build a per-issuer time series of offering uptake. Joined with periodOfReport and filedAt from metadata.json, the output is an issuer-level panel of annual and cumulative dollars raised, cost of capital (auditor, legal, and blue-sky fees as a share of gross proceeds), and investor count — used to rank platforms, benchmark raise efficiency, and flag offerings that stall after qualification.
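A small sketch of the fee-share calculation mentioned above follows. It assumes the summaryInfo amounts arrive as plain numeric strings and treats aggregrateOfferingPrice as the gross-proceeds denominator; both are assumptions to verify against real filings. The sample values and the helper names to_decimal and fee_share are hypothetical.

```python
from decimal import Decimal, InvalidOperation


def to_decimal(text):
    """Parse a summaryInfo amount; return None when missing or not numeric."""
    try:
        return Decimal((text or "").replace(",", "").strip())
    except InvalidOperation:
        return None


def fee_share(summary: dict):
    """Auditor, legal and blue-sky fees as a fraction of aggregrateOfferingPrice."""
    gross = to_decimal(summary.get("aggregrateOfferingPrice"))
    if not gross:  # covers both a missing value and zero
        return None
    fees = [to_decimal(summary.get(key)) for key in ("auditorFees", "legalFees", "blueSkyFees")]
    return sum((fee for fee in fees if fee is not None), Decimal(0)) / gross


# Hypothetical (periodOfReport, summaryInfo) pairs for one CIK.
filings = [
    ("2024-12-31", {"aggregrateOfferingPrice": "2000000.00", "auditorFees": "30000.00",
                    "legalFees": "50000.00", "blueSkyFees": ""}),
    ("2023-12-31", {"aggregrateOfferingPrice": "500000.00", "auditorFees": "25000.00",
                    "legalFees": "", "blueSkyFees": ""}),
]

# ISO dates sort chronologically as strings.
panel = [(period, fee_share(summary)) for period, summary in sorted(filings, key=lambda f: f[0])]
for period, share in panel:
    print(period, share)
```

Decimal is used instead of float so that ratios computed from dollar amounts stay exact.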
Securities lawyers and disclosure counsel extract the PART II (or PART II AND III) HTML document, strip the SGML wrapper, and segment Items 1, 2, and 8 by heading. They compare peer Tier 2 issuers on risk-factor taxonomy, MD&A liquidity discussion, and auditor going-concern paragraphs embedded in Item 8. The output is a precedent library and drafting checklist keyed to SIC and jurisdiction, used when preparing a client's next 1-K or responding to SEC staff comments.
Transactional attorneys and diligence teams walk documentFormatFiles in metadata.json, filtering for type values EX1K-2, EX1K-3, EX1K-4, and EX1K-6 MAT CTRCT, and use the <DESCRIPTION> SGML header on each exhibit to identify counterparties and effective dates before opening the body. The output is a searchable exhibit corpus — license agreements, employment contracts, subscription agreements, token instruments, indentures, and amended charters — used as drafting precedent for small-issuer deals where Reg A filings are the most accessible source of real-world templates.
Audit-firm inspection teams and investigative journalists combine three signals per issuer: presence of an EX1K-9 letter regarding change in certifying accountant, the auditorSpName field in summaryInfo, and any 1-K/A filings in formType. They diff auditor names year over year and link 1-K/A accessions back to the periodOfReport of the original 1-K to isolate restated periods. The output is a watchlist of Tier 2 issuers with auditor changes, withdrawn opinions, or material restatements within the 120-day window.
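The year-over-year auditor diff can be sketched as below. The input is a list of (periodOfReport, auditorSpName) pairs for one issuer; the light name normalization (upper-casing and dropping commas and periods) is an illustrative choice, not a documented rule, and the sample names are hypothetical.

```python
def normalize_name(name: str) -> str:
    """Collapse case and punctuation so trivial spelling variants compare equal."""
    return " ".join(name.upper().replace(",", " ").replace(".", " ").split())


def auditor_changes(history):
    """Return (periodOfReport, previous auditor, new auditor) for each change."""
    ordered = sorted(history)  # ISO dates sort chronologically as strings
    changes = []
    for (_, previous), (period, current) in zip(ordered, ordered[1:]):
        if normalize_name(previous) != normalize_name(current):
            changes.append((period, previous, current))
    return changes


# Hypothetical history for one CIK, deliberately out of order.
history = [
    ("2023-12-31", "Smith & Co., LLP"),
    ("2022-12-31", "Smith & Co LLP"),
    ("2024-12-31", "Jones CPA"),
]

changes = auditor_changes(history)
print(changes)
```

A flagged change is only a lead; confirming it still requires checking for an EX1K-9 exhibit and reading the filing.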
ATS and secondary-market operators use metadata.json fields formType, periodOfReport, and filedAt to verify that each listed issuer filed a 1-K within 120 days of fiscal year end, compute lateness, and flag any subsequent 1-K/A. They also parse Item 5 beneficial-ownership tables and Item 8 share counts from the Part II HTML to keep capitalization reference data current. The output feeds listing-eligibility decisions, disclosure-driven trading halts, and issuer profile pages.
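The lateness check reduces to simple date arithmetic on two metadata.json fields. The sketch below is a simplified version that counts plain calendar days; it does not model weekend or holiday roll-forward of the deadline and is meant for original 1-K filings rather than amendments. The sample dates are hypothetical.

```python
from datetime import date, datetime, timedelta

DEADLINE_DAYS = 120  # calendar days after fiscal year-end


def days_late(period_of_report: str, filed_at: str) -> int:
    """Days past the 120-day deadline (0 when filed on time)."""
    fiscal_year_end = date.fromisoformat(period_of_report)
    # filedAt carries a timezone offset; the date is taken in that local offset.
    filed = datetime.fromisoformat(filed_at).date()
    deadline = fiscal_year_end + timedelta(days=DEADLINE_DAYS)
    return max(0, (filed - deadline).days)


on_time = days_late("2024-12-31", "2025-04-30T16:05:12-04:00")
late = days_late("2024-12-31", "2025-05-15T09:30:00-04:00")
print(on_time, late)
```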
Specialty lenders ingest Item 8 audited balance sheets, cash-flow statements, and debt and lease footnotes from the Part II HTML, together with Item 6 related-party transactions and any EX1K-6 MAT CTRCT financing or guarantee agreements. Combined with stateOfIncorporation, sic, and fiscalYearEnd from metadata.json.entities, the output is a credit memo template populated with leverage, liquidity, covenant-relevant disclosures, and counterparty concentration for borrowers that have no 10-K and no rating.
Academic researchers and regulators join 1-K records to their predecessor 1-A offering statements on cik and commissionFileNumber (the 24R-NNNNNN series in summaryInfo). They reconcile forecast use-of-proceeds language in 1-A against realized issuerNetProceeds and MD&A narrative in 1-K, across multiple reporting years, to measure disclosure accuracy and survival. The output supports working papers on Reg A effectiveness, enforcement referrals where offering claims diverge sharply from results, and rulemaking analysis of the 2021 Tier 2 cap increase from $50 million to $75 million.
The Form 1-K Files Dataset is available through three access methods: a JSON index API for metadata and container discovery, a full dataset archive download, and individual per-month container downloads. The dataset covers Form 1-K and 1-K/A filings from April 2016 onward, distributed as monthly ZIP containers with XML, HTML, JSON, and TXT file types.
Dataset Index JSON API: https://api.sec-api.io/datasets/form-1k-files.json
Returns dataset-level metadata (name, description, last updated timestamp, earliest sample date, total records, total size, form types, container format, and file types), the full dataset download URL, and the complete list of container files. Each container entry includes its key (e.g., 2025/2025-11.zip), size, record count, updated timestamp, and individual download URL. This endpoint does not require an API key. Poll it to detect which monthly containers changed in the most recent refresh run, then download only those containers instead of the full archive.
{
  "datasetId": "1f13365b-9ae0-697d-9110-51baafb4e7e6",
  "datasetDownloadUrl": "https://api.sec-api.io/datasets/form-1k-files.zip",
  "name": "Form 1-K Files Dataset",
  "updatedAt": "2026-04-24T02:56:04.657Z",
  "earliestSampleDate": "2016-04-01",
  "totalRecords": 9393,
  "totalSize": 155262480,
  "formTypes": ["1-K", "1-K/A"],
  "containerFormat": "ZIP",
  "fileTypes": ["XML", "HTML", "JSON", "TXT"],
  "containers": [
    {
      "downloadUrl": "https://api.sec-api.io/datasets/form-1k-files/2026/2026-04.zip",
      "key": "2026/2026-04.zip",
      "size": 4821334,
      "records": 27,
      "updatedAt": "2026-04-24T02:56:04.657Z"
    }
  ]
}
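One way to act on the index response is to compare each container's updatedAt against the timestamp of your last successful sync. The sketch below uses only the Python standard library; fetch_index and containers_updated_since are hypothetical helper names, and the field names are taken from the sample response.

```python
import json
from datetime import datetime
from urllib.request import urlopen

INDEX_URL = "https://api.sec-api.io/datasets/form-1k-files.json"


def parse_ts(ts: str) -> datetime:
    """Parse an ISO-8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def fetch_index(url: str = INDEX_URL) -> dict:
    """Download and decode the dataset index (no API key needed)."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def containers_updated_since(index: dict, last_sync: str) -> list:
    """Return container entries whose updatedAt is later than last_sync."""
    cutoff = parse_ts(last_sync)
    return [
        c for c in index.get("containers", [])
        if parse_ts(c["updatedAt"]) > cutoff
    ]


# Typical use (performs a network request, so it is left commented out):
# index = fetch_index()
# for c in containers_updated_since(index, "2026-04-01T00:00:00Z"):
#     print(c["key"], c["records"], c["downloadUrl"])
```

Persisting the index-level updatedAt after each run gives a natural value for last_sync on the next poll.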
Download Entire Dataset: https://api.sec-api.io/datasets/form-1k-files.zip?token=YOUR_API_KEY
Downloads the complete archive containing every monthly container in a single ZIP file. This endpoint requires authentication via your API key passed as the token query parameter.
Download Single Container: https://api.sec-api.io/datasets/form-1k-files/2026/2026-03.zip?token=YOUR_API_KEY
Downloads one monthly container identified by its year and year-month key (e.g., 2025/2025-11.zip, 2024/2024-03.zip). Use the containers array returned by the index API to enumerate available keys. This endpoint requires authentication via your API key.
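A container download can be scripted by appending the token parameter to the downloadUrl from the index. The helper names below (with_token, download) are illustrative, not part of any official client, and YOUR_API_KEY is a placeholder.

```python
import shutil
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit
from urllib.request import urlopen


def with_token(download_url: str, api_key: str) -> str:
    """Return download_url with a token=<api_key> query parameter set."""
    parts = urlsplit(download_url)
    query = dict(parse_qsl(parts.query))
    query["token"] = api_key
    return urlunsplit(parts._replace(query=urlencode(query)))


def download(url: str, dest_path: str) -> None:
    """Stream the response body to dest_path without loading it into memory."""
    with urlopen(url, timeout=120) as resp, open(dest_path, "wb") as out:
        shutil.copyfileobj(resp, out)


url = with_token(
    "https://api.sec-api.io/datasets/form-1k-files/2026/2026-03.zip",
    "YOUR_API_KEY",
)
print(url)
# download(url, "2026-03.zip")  # network call, left commented out
```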
The dataset covers Form 1-K, the annual report mandated by Rule 257(b)(1) of Regulation A under the Securities Act of 1933, together with Form 1-K/A amendments. It also includes the Special Financial Report variant, which is filed on the same 1-K submission type but triggered by offering-statement qualification rather than a fiscal-year cycle.
One record is a single Form 1-K or Form 1-K/A submission to EDGAR, identified by its 18-digit accession number and packaged as a folder that reproduces the original submission's document set. Each accession folder contains metadata.json, the canonical primary_doc.xml cover, its XSL-rendered XHTML twin, and the narrative and exhibit HTML documents that make up the filing body.
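Because folder names use the accession number without dashes, it is often handy to convert them back to EDGAR's dashed form (10 digits, 2 digits, 6 digits) when cross-referencing other sources. A small sketch, with dashed_accession as a hypothetical helper name and an invented sample number:

```python
import re


def dashed_accession(folder_name: str) -> str:
    """Convert an 18-digit accession folder name to NNNNNNNNNN-YY-NNNNNN."""
    if not re.fullmatch(r"\d{18}", folder_name):
        raise ValueError(f"not an 18-digit accession number: {folder_name!r}")
    return f"{folder_name[:10]}-{folder_name[10:12]}-{folder_name[12:]}"


print(dashed_accession("000121390024012345"))  # 0001213900-24-012345
```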
Tier 2 Regulation A issuers that have qualified a Form 1-A offering statement are required to file Form 1-K under Rule 257 so long as they remain within the ongoing reporting window and have not filed a Form 1-Z exit report. Tier 1 Regulation A issuers, Exchange Act reporting companies, registered investment companies, and Regulation Crowdfunding issuers do not file 1-K.
Form 1-K must be filed within 120 calendar days after the issuer's fiscal year-end under Rule 257(b)(1). The deadline is uniform across all Tier 2 filers — unlike Form 10-K, Regulation A makes no distinction for accelerated, large accelerated, or non-accelerated filers.
The dataset begins with filings dated April 2016 — the first full fiscal-year cycle after the 2015 Regulation A+ overhaul — and extends through the most recent monthly refresh. No pre-2016 history exists because Form 1-K did not exist under the pre-2015 Regulation A regime.
The dataset is distributed as monthly ZIP containers. Inside each container, accession folders contain XML (the canonical primary_doc.xml and its XSL-rendered sibling), HTML (narrative and exhibit documents, each preserved inside its EDGAR SGML <DOCUMENT> wrapper), and JSON (metadata.json). The complete SGML submission .txt file is referenced by URL in metadata.json but not packaged, and GRAPHIC image attachments are likewise referenced but not included.
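To work with a downloaded container, the archive can be read in place with Python's zipfile module. The sketch below groups member files by accession folder and loads one folder's metadata.json; it assumes the accession folder is the immediate parent directory of each file, and the helper names are illustrative.

```python
import json
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath


def accession_files(zf: zipfile.ZipFile) -> dict:
    """Map each accession folder name to the file names it contains."""
    groups = defaultdict(list)
    for name in zf.namelist():
        path = PurePosixPath(name)
        if name.endswith("/") or len(path.parts) < 2:
            continue  # skip directory entries and top-level files
        groups[path.parent.name].append(path.name)
    return dict(groups)


def load_metadata(zf: zipfile.ZipFile, accession: str):
    """Return the parsed metadata.json for one accession, or None if absent."""
    for name in zf.namelist():
        path = PurePosixPath(name)
        if path.name == "metadata.json" and path.parent.name == accession:
            with zf.open(name) as fh:
                return json.load(fh)
    return None


# Typical use against a downloaded container (path is a placeholder):
# with zipfile.ZipFile("2026-04.zip") as zf:
#     for accession, files in accession_files(zf).items():
#         meta = load_metadata(zf, accession)
#         print(accession, len(files), meta.get("formType") if meta else None)
```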
Form 10-K is the Exchange Act annual report filed under Section 13 or 15(d) and follows the full Regulation S-K disclosure package with PCAOB-audited financials and Inline XBRL tagging. Form 1-K is a scaled-down Regulation A annual report with a 120-day deadline, no XBRL requirement, and audit standards that may be AICPA rather than PCAOB — so 10-K filings systematically carry deeper disclosure than 1-K filings and should not be pooled with them for peer benchmarking.