Form F-1 Files Dataset

The Form F-1 Files Dataset is the corpus of foreign private issuer (FPI) Securities Act registration statements filed with the SEC on Form F-1 and its pre- or post-effectiveness amendments on Form F-1/A. Each record is one complete EDGAR submission keyed by a single SEC accession number, materialized as one folder containing a metadata.json submission header and every text, HTML, XML, or PDF document attached to that submission, with image binaries omitted by design. Filers are non-U.S. operating companies, holding companies, and other commercial entities organized outside the United States that qualify as foreign private issuers under Rule 405 of the Securities Act and that are registering securities for offer or sale in the U.S. market under Section 5 of the Securities Act of 1933. The dataset begins in June 1996 — when EDGAR electronic filing became mandatory for most registrants — and is delivered as monthly ZIP containers covering both initial F-1 registrations and the F-1/A amendment chains they generate during SEC review.

Update Frequency: Daily
Updated at: 2026-05-20
Earliest Sample Date: 1996-06-01
Total Size: 4.1 GB
Total Records: 89,205
Container Format: ZIP
Content Types: TXT, JSON, HTML, PDF
Form Types: F-1, F-1/A

Dataset APIs

Programmatically retrieve the full list of dataset archive files, download URLs and dataset metadata.

Dataset Index JSON API

Download the entire dataset as a single archive file.

Download Entire Dataset:

Download a single container file (e.g. monthly archive) from the dataset.

Download Single Container:

Dataset Files

351 files · 4.1 GB
2026-05.zip · 12.9 MB · 358 records
2026-04.zip · 14.5 MB · 317 records
2026-03.zip · 34.6 MB · 742 records
2026-02.zip · 28.8 MB · 689 records
2026-01.zip · 31.5 MB · 703 records
2025-12.zip · 29.1 MB · 808 records
2025-11.zip · 21.8 MB · 359 records
2025-10.zip · 26.0 MB · 574 records
2025-09.zip · 63.3 MB · 1,376 records
2025-08.zip · 44.9 MB · 1,033 records
2025-07.zip · 37.7 MB · 737 records
2025-06.zip · 42.9 MB · 809 records
2025-05.zip · 31.5 MB · 760 records
2025-04.zip · 28.8 MB · 608 records
2025-03.zip · 46.1 MB · 949 records
2025-02.zip · 40.5 MB · 744 records
2025-01.zip · 36.3 MB · 701 records
2024-12.zip · 47.1 MB · 885 records
2024-11.zip · 44.1 MB · 981 records
2024-10.zip · 37.6 MB · 825 records
2024-09.zip · 51.2 MB · 1,106 records
2024-08.zip · 49.6 MB · 1,111 records
2024-07.zip · 31.8 MB · 742 records
2024-06.zip · 37.4 MB · 726 records
2024-05.zip · 40.7 MB · 870 records
2024-04.zip · 26.4 MB · 502 records
2024-03.zip · 41.2 MB · 742 records
2024-02.zip · 36.0 MB · 651 records
2024-01.zip · 36.3 MB · 652 records
2023-12.zip · 40.7 MB · 912 records
2023-11.zip · 38.5 MB · 815 records
2023-10.zip · 26.3 MB · 548 records
2023-09.zip · 39.3 MB · 688 records
2023-08.zip · 31.1 MB · 651 records
2023-07.zip · 21.9 MB · 535 records
2023-06.zip · 35.3 MB · 727 records
2023-05.zip · 20.1 MB · 410 records
2023-04.zip · 12.7 MB · 311 records
2023-03.zip · 54.7 MB · 1,170 records
2023-02.zip · 32.1 MB · 683 records
2023-01.zip · 20.8 MB · 419 records
2022-12.zip · 29.6 MB · 607 records
2022-11.zip · 30.0 MB · 553 records
2022-10.zip · 17.9 MB · 397 records
2022-09.zip · 31.6 MB · 583 records
2022-08.zip · 23.9 MB · 566 records
2022-07.zip · 26.2 MB · 540 records
2022-06.zip · 24.1 MB · 380 records
2022-05.zip · 19.0 MB · 368 records
2022-04.zip · 14.4 MB · 258 records
2022-03.zip · 22.6 MB · 436 records
2022-02.zip · 18.1 MB · 287 records
2022-01.zip · 22.1 MB · 390 records
2021-12.zip · 18.0 MB · 250 records
2021-11.zip · 33.7 MB · 459 records
2021-10.zip · 35.2 MB · 499 records
2021-09.zip · 28.9 MB · 413 records
2021-08.zip · 17.9 MB · 377 records
2021-07.zip · 26.3 MB · 415 records
2021-06.zip · 40.8 MB · 664 records
2021-05.zip · 28.5 MB · 566 records
2021-04.zip · 20.0 MB · 472 records
2021-03.zip · 32.8 MB · 717 records
2021-02.zip · 21.1 MB · 389 records
2021-01.zip · 36.8 MB · 603 records
2020-12.zip · 23.1 MB · 443 records
2020-11.zip · 16.6 MB · 334 records
2020-10.zip · 16.8 MB · 405 records
2020-09.zip · 30.6 MB · 598 records
2020-08.zip · 21.8 MB · 391 records
2020-07.zip · 20.3 MB · 444 records
2020-06.zip · 24.6 MB · 471 records
2020-05.zip · 11.4 MB · 221 records
2020-04.zip · 8.7 MB · 250 records
2020-03.zip · 10.9 MB · 241 records
2020-02.zip · 11.6 MB · 202 records
2020-01.zip · 13.7 MB · 231 records
2019-12.zip · 12.7 MB · 271 records
2019-11.zip · 19.7 MB · 367 records
2019-10.zip · 19.3 MB · 543 records
2019-09.zip · 17.9 MB · 425 records
2019-08.zip · 7.4 MB · 162 records
2019-07.zip · 25.5 MB · 382 records
2019-06.zip · 13.4 MB · 260 records
2019-05.zip · 11.5 MB · 296 records
2019-04.zip · 14.1 MB · 262 records
2019-03.zip · 12.9 MB · 263 records
2019-02.zip · 10.6 MB · 183 records
2019-01.zip · 5.8 MB · 177 records
2018-12.zip · 8.4 MB · 181 records
2018-11.zip · 13.7 MB · 237 records
2018-10.zip · 16.2 MB · 312 records
2018-09.zip · 19.8 MB · 430 records
2018-08.zip · 18.5 MB · 439 records
2018-07.zip · 14.0 MB · 267 records
2018-06.zip · 14.0 MB · 307 records
2018-05.zip · 13.4 MB · 296 records
2018-04.zip · 9.0 MB · 133 records
2018-03.zip · 19.2 MB · 345 records
2018-02.zip · 12.3 MB · 303 records
2018-01.zip · 18.2 MB · 275 records
2017-12.zip · 15.2 MB · 258 records
2017-11.zip · 12.3 MB · 250 records
2017-10.zip · 19.0 MB · 327 records
2017-09.zip · 17.7 MB · 454 records
2017-08.zip · 7.9 MB · 208 records
2017-07.zip · 5.3 MB · 86 records
2017-06.zip · 5.5 MB · 103 records
2017-05.zip · 6.9 MB · 134 records
2017-04.zip · 8.3 MB · 231 records
2017-03.zip · 10.7 MB · 179 records
2017-02.zip · 6.4 MB · 73 records
2017-01.zip · 6.3 MB · 95 records
2016-12.zip · 8.1 MB · 168 records
2016-11.zip · 6.7 MB · 113 records
2016-10.zip · 8.6 MB · 151 records
2016-09.zip · 8.7 MB · 114 records
2016-08.zip · 4.7 MB · 96 records
2016-07.zip · 7.7 MB · 132 records
2016-06.zip · 6.9 MB · 170 records
2016-05.zip · 5.5 MB · 109 records
2016-04.zip · 4.6 MB · 90 records
2016-03.zip · 5.3 MB · 76 records
2016-02.zip · 7.9 MB · 174 records
2016-01.zip · 5.5 MB · 81 records
2015-12.zip · 7.1 MB · 170 records
2015-11.zip · 12.6 MB · 248 records
2015-10.zip · 14.4 MB · 277 records
2015-09.zip · 11.6 MB · 287 records
2015-08.zip · 8.6 MB · 174 records
2015-07.zip · 14.5 MB · 329 records
2015-06.zip · 11.0 MB · 242 records
2015-05.zip · 10.4 MB · 227 records
2015-04.zip · 11.2 MB · 278 records
2015-03.zip · 9.2 MB · 259 records
2015-02.zip · 12.1 MB · 121 records
2015-01.zip · 12.6 MB · 178 records
2014-12.zip · 7.9 MB · 177 records
2014-11.zip · 15.8 MB · 234 records
2014-10.zip · 13.5 MB · 240 records
2014-09.zip · 19.1 MB · 331 records
2014-08.zip · 15.0 MB · 218 records
2014-07.zip · 22.1 MB · 425 records
2014-06.zip · 19.7 MB · 410 records
2014-05.zip · 14.5 MB · 309 records
2014-04.zip · 17.6 MB · 452 records
2014-03.zip · 17.6 MB · 447 records
2014-02.zip · 11.7 MB · 197 records
2014-01.zip · 8.1 MB · 155 records
2013-12.zip · 7.9 MB · 141 records
2013-11.zip · 14.2 MB · 301 records
2013-10.zip · 21.8 MB · 477 records
2013-09.zip · 11.4 MB · 234 records
2013-08.zip · 4.3 MB · 70 records
2013-07.zip · 8.1 MB · 112 records
2013-06.zip · 15.3 MB · 197 records
2013-05.zip · 13.7 MB · 208 records
2013-04.zip · 9.4 MB · 173 records
2013-03.zip · 4.9 MB · 127 records
2013-02.zip · 2.7 MB · 42 records
2013-01.zip · 7.8 MB · 78 records
2012-12.zip · 5.3 MB · 92 records
2012-11.zip · 7.4 MB · 112 records
2012-10.zip · 6.1 MB · 123 records
2012-09.zip · 8.8 MB · 151 records
2012-08.zip · 14.3 MB · 213 records
2012-07.zip · 5.0 MB · 69 records
2012-06.zip · 5.7 MB · 98 records
2012-05.zip · 10.0 MB · 192 records
2012-04.zip · 9.0 MB · 179 records
2012-03.zip · 12.4 MB · 313 records
2012-02.zip · 8.4 MB · 271 records
2012-01.zip · 6.8 MB · 175 records
2011-12.zip · 9.6 MB · 164 records
2011-11.zip · 2.0 MB · 32 records
2011-10.zip · 2.3 MB · 58 records
2011-09.zip · 2.0 MB · 76 records
2011-08.zip · 4.4 MB · 39 records
2011-07.zip · 8.0 MB · 146 records
2011-06.zip · 7.3 MB · 170 records
2011-05.zip · 16.2 MB · 359 records
2011-04.zip · 16.3 MB · 387 records
2011-03.zip · 16.5 MB · 415 records
2011-02.zip · 5.6 MB · 177 records
2011-01.zip · 17.6 MB · 324 records
2010-12.zip · 10.5 MB · 183 records
2010-11.zip · 26.7 MB · 694 records
2010-10.zip · 19.7 MB · 591 records
2010-09.zip · 18.3 MB · 553 records
2010-08.zip · 7.4 MB · 150 records
2010-07.zip · 12.4 MB · 261 records
2010-06.zip · 9.8 MB · 297 records
2010-05.zip · 12.9 MB · 177 records
2010-04.zip · 11.0 MB · 283 records
2010-03.zip · 10.7 MB · 289 records
2010-02.zip · 7.3 MB · 205 records
2010-01.zip · 10.6 MB · 262 records
2009-12.zip · 7.7 MB · 196 records
2009-11.zip · 10.9 MB · 260 records
2009-10.zip · 10.5 MB · 230 records
2009-09.zip · 5.7 MB · 127 records
2009-08.zip · 2.4 MB · 62 records
2009-07.zip · 1.5 MB · 23 records
2009-06.zip · 1.3 MB · 44 records
2009-05.zip · 751.5 KB · 24 records
2009-04.zip · 1.2 MB · 61 records
2009-03.zip · 1.4 MB · 64 records
2009-02.zip · 675.8 KB · 11 records
2009-01.zip · 1.2 MB · 43 records
2008-12.zip · 540.3 KB · 15 records
2008-11.zip · 1.2 MB · 40 records
2008-10.zip · 703.6 KB · 14 records
2008-09.zip · 5.5 MB · 153 records
2008-08.zip · 3.9 MB · 167 records
2008-07.zip · 5.5 MB · 191 records
2008-06.zip · 3.6 MB · 138 records
2008-05.zip · 5.1 MB · 190 records
2008-04.zip · 1.9 MB · 81 records
2008-03.zip · 1.6 MB · 78 records
2008-02.zip · 2.7 MB · 99 records
2008-01.zip · 9.7 MB · 335 records
2007-12.zip · 6.6 MB · 131 records
2007-11.zip · 16.3 MB · 468 records
2007-10.zip · 19.9 MB · 677 records
2007-09.zip · 7.2 MB · 234 records
2007-08.zip · 3.2 MB · 86 records
2007-07.zip · 9.9 MB · 293 records
2007-06.zip · 8.0 MB · 146 records
2007-05.zip · 9.2 MB · 301 records
2007-04.zip · 7.3 MB · 249 records
2007-03.zip · 6.1 MB · 157 records
2007-02.zip · 6.8 MB · 275 records
2007-01.zip · 8.6 MB · 243 records
2006-12.zip · 8.7 MB · 288 records
2006-11.zip · 10.8 MB · 270 records
2006-10.zip · 6.0 MB · 213 records
2006-09.zip · 8.2 MB · 248 records
2006-08.zip · 2.9 MB · 81 records
2006-07.zip · 5.0 MB · 117 records
2006-06.zip · 3.9 MB · 145 records
2006-05.zip · 2.4 MB · 89 records
2006-04.zip · 3.4 MB · 93 records
2006-03.zip · 7.3 MB · 170 records
2006-02.zip · 3.5 MB · 52 records
2006-01.zip · 6.6 MB · 163 records
2005-12.zip · 9.6 MB · 277 records
2005-11.zip · 10.0 MB · 289 records
2005-10.zip · 8.0 MB · 216 records
2005-09.zip · 6.1 MB · 238 records
2005-08.zip · 2.6 MB · 98 records
2005-07.zip · 4.5 MB · 170 records
2005-06.zip · 6.5 MB · 193 records
2005-05.zip · 2.1 MB · 75 records
2005-04.zip · 4.2 MB · 131 records
2005-03.zip · 3.6 MB · 92 records
2005-02.zip · 3.5 MB · 157 records
2005-01.zip · 4.9 MB · 294 records
2004-12.zip · 2.2 MB · 98 records
2004-11.zip · 5.7 MB · 203 records
2004-10.zip · 11.0 MB · 350 records
2004-09.zip · 7.1 MB · 210 records
2004-08.zip · 3.6 MB · 63 records
2004-07.zip · 3.3 MB · 119 records
2004-06.zip · 8.6 MB · 268 records
2004-05.zip · 1.8 MB · 51 records
2004-04.zip · 2.8 MB · 114 records
2004-03.zip · 2.9 MB · 43 records
2004-02.zip · 4.7 MB · 214 records
2004-01.zip · 1.5 MB · 46 records
2003-12.zip · 3.2 MB · 68 records
2003-11.zip · 3.0 MB · 133 records
2003-10.zip · 531.1 KB · 15 records
2003-09.zip · 1.4 MB · 66 records
2003-08.zip · 499.4 KB · 3 records
2003-07.zip · 747.1 KB · 15 records
2003-06.zip · 1.5 MB · 34 records
2003-05.zip · 612.9 KB · 12 records
2003-04.zip · 2.9 MB · 44 records
2003-03.zip · 14.6 MB · 67 records
2003-02.zip · 501.4 KB · 45 records
2003-01.zip · 11.2 MB · 51 records
2002-12.zip · 1.9 MB · 60 records
2002-11.zip · 1.3 MB · 32 records
2002-10.zip · 1.3 MB · 50 records
2002-09.zip · 710.5 KB · 33 records
2002-08.zip · 1.0 MB · 58 records
2002-07.zip · 1.1 MB · 39 records
2002-06.zip · 1.6 MB · 83 records
2002-05.zip · 1.0 MB · 21 records
2002-04.zip · 669.9 KB · 32 records
2002-03.zip · 570.6 KB · 23 records
2002-02.zip · 1.6 MB · 62 records
2002-01.zip · 495.8 KB · 12 records
2001-11.zip · 700.2 KB · 15 records
2001-10.zip · 1.1 MB · 52 records
2001-09.zip · 878.2 KB · 34 records
2001-08.zip · 840.0 KB · 34 records
2001-07.zip · 1.2 MB · 41 records
2001-06.zip · 924.4 KB · 52 records
2001-05.zip · 599.3 KB · 29 records
2001-04.zip · 181.4 KB · 7 records
2001-03.zip · 108.5 KB · 3 records
2001-02.zip · 153.5 KB · 2 records
2000-12.zip · 1.5 MB · 85 records
2000-11.zip · 2.5 MB · 85 records
2000-10.zip · 2.0 MB · 96 records
2000-09.zip · 5.2 MB · 115 records
2000-08.zip · 3.2 MB · 116 records
2000-07.zip · 1.5 MB · 69 records
2000-06.zip · 2.1 MB · 100 records
2000-05.zip · 906.8 KB · 37 records
2000-04.zip · 3.0 MB · 73 records
2000-03.zip · 5.1 MB · 171 records
2000-02.zip · 5.3 MB · 167 records
2000-01.zip · 3.3 MB · 116 records
1999-12.zip · 1.4 MB · 57 records
1999-11.zip · 3.1 MB · 161 records
1999-10.zip · 3.8 MB · 158 records
1999-09.zip · 456.1 KB · 20 records
1999-08.zip · 317.8 KB · 15 records
1999-07.zip · 3.8 MB · 132 records
1999-06.zip · 2.8 MB · 97 records
1999-05.zip · 1.4 MB · 67 records
1999-04.zip · 872.8 KB · 47 records
1999-03.zip · 2.1 MB · 54 records
1999-02.zip · 1.4 MB · 77 records
1998-10.zip · 279.1 KB · 7 records
1998-09.zip · 220.3 KB · 13 records
1998-07.zip · 278.5 KB · 15 records
1998-06.zip · 1.4 MB · 56 records
1998-05.zip · 763.3 KB · 59 records
1998-04.zip · 795.4 KB · 34 records
1998-03.zip · 1.2 MB · 60 records
1998-02.zip · 688.1 KB · 30 records
1997-12.zip · 731.6 KB · 44 records
1997-11.zip · 366.5 KB · 13 records
1997-10.zip · 1.1 MB · 63 records
1997-09.zip · 437.0 KB · 23 records
1997-08.zip · 231.9 KB · 23 records
1997-07.zip · 404.3 KB · 51 records
1997-06.zip · 445.4 KB · 30 records
1997-05.zip · 173.3 KB · 14 records
1997-03.zip · 268.0 KB · 5 records
1997-02.zip · 430.1 KB · 39 records
1997-01.zip · 327.4 KB · 21 records
1996-12.zip · 495.3 KB · 60 records
1996-11.zip · 525.1 KB · 41 records
1996-10.zip · 844.0 KB · 19 records
1996-09.zip · 56.7 KB · 3 records
1996-07.zip · 237.5 KB · 11 records
1996-06.zip · 327.9 KB · 23 records

What This Dataset Contains

Form F-1 is the registration statement prescribed under the Securities Act of 1933 for foreign private issuers for which no other Securities Act form (F-3, F-4, F-7, F-8, F-10) is authorized or available. It is the foreign-issuer counterpart to the domestic Form S-1 and is most commonly used by non-U.S. companies undertaking an initial public offering of equity securities in the United States, but is also used for follow-on offerings, ADR programs that require full registration, resale registrations on behalf of selling shareholders, and registrations of debt and convertible instruments. The form is filed under the Securities Act of 1933 (act = "33" in the EDGAR header) and is assigned a 333- series file number that carries through every subsequent Form F-1/A amendment in the same registration.

Each record corresponds 1:1 to a row of the EDGAR full-submission index for form types F-1 and F-1/A, not to a per-issuer or per-offering aggregation. A single offering frequently produces many records over its lifecycle: one initial F-1 followed by a sequence of F-1/A amendments, all sharing the same 333-XXXXXX SEC file number but each with a distinct accession number, and each appearing as its own record. The substantive content is governed by the items prescribed in Form F-1 itself, which in turn incorporate by reference the disclosure schedule of Form 20-F. The bulk of the registrant's narrative and financial disclosure is therefore organized around the Form 20-F item numbering (Items 3 through 19), wrapped inside an offering-specific prospectus that addresses the offering mechanics dictated by the Securities Act and Regulation S-K Item 501 et seq. The financial statements must be prepared under U.S. GAAP, IFRS as issued by the IASB, or home-country GAAP reconciled to U.S. GAAP, with an auditor's report from a PCAOB-registered public accounting firm.

The dataset is distributed in monthly ZIP containers. File types found inside the containers are HTML/HTM, JSON (the metadata file), TXT (full-submission bundles), and PDF (occasional exhibit attachments). XML and XSD artifacts referenced in dataFiles[] are sometimes present locally and sometimes only reachable via the SEC URL recorded in the metadata.
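
As a rough illustration of working with one monthly container, the sketch below groups ZIP members by accession folder and loads each record's metadata.json. It assumes the accession folder is the path component made of exactly 18 digits and that any wrapping top-level directory can be ignored; the function names group_members_by_record and load_metadata are hypothetical, not part of any published client.

```python
import io
import json
import zipfile
from collections import defaultdict


def group_members_by_record(zf):
    """Map each 18-digit accession folder name to the member paths beneath it."""
    records = defaultdict(list)
    for name in zf.namelist():
        if name.endswith("/"):
            continue  # skip explicit directory entries
        parts = name.split("/")
        # Assumption: the accession folder is the path component that is
        # exactly 18 digits; the real archive layout may differ.
        folder = next(
            (p for p in parts[:-1] if len(p) == 18 and p.isdigit()), None
        )
        if folder is not None:
            records[folder].append(name)
    return dict(records)


def load_metadata(zf, members):
    """Return the parsed metadata.json among the given members, or None."""
    for name in members:
        if name.rsplit("/", 1)[-1] == "metadata.json":
            return json.loads(zf.read(name))
    return None
```

A caller would typically open a downloaded container with zipfile.ZipFile("2025-11.zip"), pass it to group_members_by_record, and then call load_metadata once per folder.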

Content Structure of a Single Record

A single record is materialized as one folder whose name is the 18-digit, zero-padded form of the SEC accession number with the two hyphens stripped (e.g., accession 0001104659-25-114804 becomes folder 000110465925114804). Inside the folder sit a metadata.json capturing the EDGAR submission header and every text/HTML/XML document that was part of the original submission, with image binaries omitted. The canonical dashed accession number is preserved inside metadata.json under accessionNo.
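
The folder-naming rule can be sketched as a pair of small conversion helpers. This is a minimal illustration of the 10-2-6 digit layout described above; the function names accession_to_folder and folder_to_accession are hypothetical.

```python
import re

# Canonical dashed form: 10 digits, 2 digits, 6 digits.
_DASHED = re.compile(r"^(\d{10})-(\d{2})-(\d{6})$")


def accession_to_folder(accession_no):
    """Strip the two hyphens from a dashed accession number."""
    match = _DASHED.match(accession_no)
    if match is None:
        raise ValueError(f"not a dashed accession number: {accession_no!r}")
    return "".join(match.groups())


def folder_to_accession(folder):
    """Re-insert the hyphens into an 18-digit folder name."""
    if re.fullmatch(r"\d{18}", folder) is None:
        raise ValueError(f"not an 18-digit folder name: {folder!r}")
    return f"{folder[:10]}-{folder[10:12]}-{folder[12:]}"
```

With the example from the text, accession_to_folder("0001104659-25-114804") yields "000110465925114804", and folder_to_accession reverses it.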

Each record stacks two layers:

  1. A submission-level metadata layer (metadata.json) that mirrors the EDGAR SGML header and enumerates every document attached to the accession.
  2. A document layer composed of the main F-1 (or F-1/A) HTML body plus any number of separate exhibit documents, each wrapped in EDGAR's SGML <DOCUMENT> envelope.

Beneath those layers sit the analytic sub-structures that matter for interpretation: the prospectus narrative inside the main F-1 document, the Inline XBRL fact stream for filing-fee disclosures (and increasingly for cover-page tagging), the legal/tax/auditor opinion exhibits, the material-contract exhibits, and the filing-fee XML data files.

The Submission Metadata Layer

metadata.json captures the parsed EDGAR submission header. Top-level keys describe the filing as a whole:

  • formType — "F-1" or "F-1/A".
  • accessionNo — canonical dashed accession number (NNNNNNNNNN-YY-NNNNNN).
  • description — human-readable form description (e.g., "Form F-1 - Registration statement for certain foreign private issuers", with [Amend] appended on amendments).
  • filedAt — ISO-8601 timestamp with timezone offset (e.g., "2025-11-28T17:24:18-05:00").
  • linkToFilingDetails — absolute URL to the primary F-1 document.
  • linkToHtml — URL of the EDGAR -index.htm page.
  • linkToTxt — URL of the full-submission .txt bundle.
  • linkToXbrl — URL of the XBRL viewer; consistently empty for F-1 records even when XBRL is present, so XBRL discovery must rely on dataFiles[].
  • id — opaque 32-character hex identifier (sec-api internal record id).

The header carries an entities[] array, which for F-1 is typically a single (Filer) element representing the foreign private issuer. Per-entity fields include:

  • cik — issuer CIK without leading zeros.
  • companyName — company name with role suffix in parentheses (e.g., "SU Group Holdings Ltd (Filer)").
  • type — repeats the form type for that entity.
  • act — Securities Act under which it is filed ("33" for F-1).
  • fileNo — 333-XXXXXX registration file number.
  • filmNo — EDGAR film number.
  • irsNo — IRS Employer Identification Number (uniformly "000000000" for foreign issuers without a U.S. EIN).
  • fiscalYearEnd — "MMDD" four-digit string; sometimes absent on F-1/A amendment entities.
  • stateOfIncorporation — EDGAR jurisdiction code (e.g., E9 Cayman Islands, D8 British Virgin Islands, G7 Denmark, K3 Hong Kong).
  • sic — SIC code with description; ampersands appear HTML-entity encoded (&amp;).
  • tickers[] — array of ticker symbols associated with the filer; may be empty.

The seriesAndClassesContractsInformation array is almost always empty for F-1 records, because it is reserved for investment-company filings.
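
The per-entity quirks listed above (CIK without leading zeros, role suffix in companyName, entity-encoded sic, optional fiscalYearEnd, placeholder irsNo) can be normalized in one pass. The sketch below is only one possible approach; normalize_entity and its output keys are hypothetical names, and the 10-digit zero-padding of the CIK is a common EDGAR convention rather than something the dataset prescribes.

```python
import html
import re

# Trailing "(Role)" suffix, e.g. "SU Group Holdings Ltd (Filer)".
_ROLE_SUFFIX = re.compile(r"^(.*?)\s*\(([^()]*)\)\s*$")


def normalize_entity(entity):
    """Return a cleaned-up view of one entities[] element."""
    raw_name = entity.get("companyName", "")
    match = _ROLE_SUFFIX.match(raw_name)
    name, role = (match.group(1), match.group(2)) if match else (raw_name, None)

    # fiscalYearEnd is "MMDD" when present and may be missing on amendments.
    fye = entity.get("fiscalYearEnd")
    if fye and len(fye) == 4 and fye.isdigit():
        fiscal_year_end = (int(fye[:2]), int(fye[2:]))
    else:
        fiscal_year_end = None

    return {
        "cik10": str(entity.get("cik", "")).zfill(10),  # assumed padding width
        "name": name,
        "role": role,
        "fileNo": entity.get("fileNo"),
        "sic": html.unescape(entity.get("sic", "")),  # "&amp;" becomes "&"
        "fiscalYearEnd": fiscal_year_end,
        "hasUsEin": entity.get("irsNo") not in (None, "", "000000000"),
    }
```

Keeping the original dictionary untouched and returning a new one makes it easy to compare raw and normalized values during data-quality checks.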

The documentFormatFiles[] array enumerates the primary documents in the submission in EDGAR sequence order. Each entry carries:

  • sequence — numeric string ("1", "2", …); a single space " " is reserved for the trailing complete-submission .txt bundle row.
  • size — byte count as a string.
  • documentUrl — absolute SEC URL; Inline XBRL documents are exposed through the https://www.sec.gov/ix?doc=/Archives/... viewer rewriter.
  • description — free-form text such as "F-1", "EXHIBIT 10.15", "FILING FEE IXBRL", "GRAPHIC", or "Complete submission text file".
  • type — EDGAR document-type code drawn from a vocabulary that includes F-1, F-1/A, EX-1.1, EX-3.1, EX-4.x, EX-5.1, EX-8.1, EX-10.x, EX-16.1, EX-21.1, EX-23.1, EX-24.1, EX-99.x, EX-FILING FEES, GRAPHIC, etc.

The parallel dataFiles[] array uses the same shape but carries XBRL/XML auxiliary artifacts, with type values such as XML, EX-101.SCH, EX-101.CAL, EX-101.DEF, EX-101.LAB, and EX-101.PRE. For thin F-1/A amendments dataFiles[] is frequently empty.
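
The following sketch shows one way to walk documentFormatFiles[] and dataFiles[]. It unwraps the ix?doc= viewer form of documentUrl back to a plain archive URL, skips GRAPHIC rows and the trailing complete-submission row, and lists XBRL artifacts from dataFiles[]. The helper names are hypothetical, and treating the URL basename as the local file name is an assumption that should be checked against actual folders.

```python
from urllib.parse import parse_qs, urlparse


def raw_archive_url(document_url):
    """Unwrap the Inline XBRL viewer form (/ix?doc=/Archives/...) if present."""
    parsed = urlparse(document_url)
    if parsed.path == "/ix":
        doc_path = parse_qs(parsed.query).get("doc", [""])[0]
        if doc_path:
            return f"{parsed.scheme}://{parsed.netloc}{doc_path}"
    return document_url


def expected_local_files(metadata):
    """File names expected in the accession folder (assumed = URL basename)."""
    names = []
    for doc in metadata.get("documentFormatFiles", []):
        if doc.get("type") == "GRAPHIC":
            continue  # image binaries are omitted from the dataset
        if not doc.get("sequence", "").strip():
            continue  # blank sequence marks the complete-submission .txt row
        names.append(raw_archive_url(doc["documentUrl"]).rsplit("/", 1)[-1])
    return names


def xbrl_artifacts(metadata):
    """(type, url) pairs from dataFiles[], the XBRL discovery path."""
    return [
        (item.get("type"), raw_archive_url(item.get("documentUrl", "")))
        for item in metadata.get("dataFiles", [])
    ]
```

The complete-submission row is skipped here because the bundle is not guaranteed to be stored locally; callers that want it can read linkToTxt instead.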

The Document Layer

Each document referenced in documentFormatFiles[] is materialized as a file in the accession folder, with two structural exceptions: GRAPHIC entries (image binaries) are intentionally excluded from the dataset, and some dataFiles[] artifacts (the XBRL schema/calc/def/lab/pre set) may be referenced only by URL rather than stored locally. File names are filer-supplied and not standardized; common patterns include filing-agent ticket prefixes (e.g., tm2525392, mask, g085003, ea0267566) followed by a body-type suffix such as _f1, _f1a, formf-1, or formf-1a for the main registration statement, and ex<schedule>-<number> (e.g., ex5-1, ex10-15, ex21-1, ex23-1, ex99-1) for exhibits. The filing-fee exhibit appears under names like ex107, ex-fee, or ex-filingfees.

Every document — main F-1 included — begins with the EDGAR SGML <DOCUMENT> envelope: five unquoted header tags (TYPE, SEQUENCE, FILENAME, DESCRIPTION, TEXT) followed by the inner <HTML> payload, e.g.:

<DOCUMENT>
<TYPE>F-1
<SEQUENCE>1
<FILENAME>g085003_f1.htm
<DESCRIPTION>F-1
<TEXT>
<HTML>
...
</HTML>
</TEXT>
</DOCUMENT>

This wrapper must be stripped before the inner HTML can be fed into a standard HTML parser. Filing-fee exhibits keep the same wrapper but embed an Inline XBRL XHTML body whose root carries namespace declarations such as xmlns:ix="http://www.xbrl.org/2013/inlineXBRL" and xmlns:ffd="http://xbrl.sec.gov/ffd/2025", with <ix:header>, <ix:nonNumeric>, and <ix:nonFraction> tags surfacing the fee-calculation facts.
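
A minimal sketch of the stripping step, using only regular expressions from the standard library, is shown below. The function name split_sgml_document is hypothetical; the code assumes one <DOCUMENT> envelope per file and that the payload itself does not contain a literal </TEXT> string.

```python
import re

# Unquoted, unclosed header tags that precede the payload.
_HEADER_TAG = re.compile(
    r"^<(TYPE|SEQUENCE|FILENAME|DESCRIPTION)>(.*)$", re.MULTILINE
)
# Everything between <TEXT> and the first </TEXT>.
_TEXT_BODY = re.compile(r"<TEXT>\s*(.*?)\s*</TEXT>", re.DOTALL | re.IGNORECASE)


def split_sgml_document(raw):
    """Return (header_dict, payload) for one SGML-wrapped document.

    If no <TEXT> section is found, the input is returned unchanged with an
    empty header so callers can still attempt to parse it.
    """
    match = _TEXT_BODY.search(raw)
    if match is None:
        return {}, raw
    head = raw[: match.start()]
    header = {key.lower(): value.strip() for key, value in _HEADER_TAG.findall(head)}
    return header, match.group(1)
```

The returned payload can then be handed to any HTML parser, while the header dictionary gives the TYPE, SEQUENCE, FILENAME, and DESCRIPTION values for cross-checking against documentFormatFiles[].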

Section-by-Section Anatomy of the F-1 Prospectus

The main F-1 document is the prospectus that, in updated form, will be delivered to investors after the registration becomes effective. Although the SEC does not impose a rigid table of contents, the prospectus is conventionally organized in the following order:

  • Cover page. Issuer name, jurisdiction of incorporation, principal executive offices, telephone, agent for service of process in the United States, title and amount of securities being offered, proposed maximum offering price, name of the lead underwriter(s) or notice that the offering is best-efforts or self-underwritten, expected exchange listing and ticker, and the standard red-herring legend during pre-effective stages. Foreign-issuer cover pages also disclose whether the issuer qualifies as an emerging growth company, a foreign private issuer (FPI), and a smaller reporting company.
  • Prospectus summary. A condensed narrative describing the company's business, competitive strengths, growth strategy, summary of the offering (shares offered, use of proceeds, lock-up, listing), summary risk factors, corporate structure (often involving a Cayman or BVI holding-company chart), implications of being an emerging growth company and a foreign private issuer, and the holding-company structure with VIE arrangements where applicable.
  • Risk factors. A long enumeration of business, industry, regulatory, jurisdictional, and offering-related risks. For issuers operating in or organized under the laws of foreign jurisdictions, this section also covers PRC-, Cayman-, BVI-, or other-jurisdiction-specific risks: enforceability of judgments, foreign-exchange controls, dual-class structures, VIE-structure risks, and, post-2021, the Holding Foreign Companies Accountable Act (HFCAA) inspection-access risks for issuers with auditors located in jurisdictions that the PCAOB cannot fully inspect.
  • Special note on forward-looking statements. Safe-harbor framing for projections.
  • Use of proceeds. Itemized allocation of net proceeds across stated business uses.
  • Dividend policy.
  • Capitalization and dilution tables, including pro forma columns for the offering as adjusted.
  • Exchange rate information. Typical for non-U.S.-dollar functional-currency issuers.
  • Selected and supplemental financial data.
  • Management's Discussion and Analysis of Financial Condition and Results of Operations (MD&A), drawing on the operating and financial review prescribed by Form 20-F Item 5.
  • Business. Detailed description of the company's operations, products and services, customers, suppliers, properties, regulation, intellectual property, and competition (Form 20-F Item 4).
  • Regulation. Industry- and jurisdiction-specific regulatory environment.
  • Management. Directors, senior management, compensation, board practices, and employees (Form 20-F Items 6.A through 6.D).
  • Principal shareholders. Beneficial-ownership table.
  • Related-party transactions (Form 20-F Item 7.B).
  • Description of share capital. Rights of the registered class, comparison of home-jurisdiction corporate law to Delaware law (a standard inclusion for Cayman, BVI, and Bermuda issuers).
  • Description of American Depositary Shares, where the offering is structured as ADSs rather than ordinary shares, including a summary of the deposit agreement.
  • Shares eligible for future sale. Lock-up summary, Rule 144 / Rule 701 eligibility.
  • Taxation. U.S. federal income tax considerations and home-country tax considerations for U.S. holders.
  • Plan of distribution / Underwriting. Names of underwriters, allocation, discounts, over-allotment option, indemnification, lock-up, stabilization activities, electronic distribution.
  • Expenses related to the offering.
  • Legal matters and Experts (auditor, named experts).
  • Enforceability of civil liabilities against the foreign issuer and its directors and officers in the United States.
  • Where you can find more information.
  • Index to financial statements, followed by the audited consolidated financial statements with the auditor's report, accompanying notes, and any required interim unaudited financial statements.

The Part II portion of the registration statement (Information Not Required in Prospectus) sits at the end of the main document and covers indemnification of directors and officers, recent sales of unregistered securities, the exhibit and financial-statement schedules index, undertakings, and the signature block executed by the registrant, the principal executive officer, the principal financial officer, the principal accounting officer, a majority of the directors, and the authorized U.S. representative.

Exhibit Anatomy

Exhibits are filed as separate sequenced documents, each wrapped in its own SGML <DOCUMENT> envelope. For a Form F-1 the typical exhibit slate, mirroring the exhibit table required by Form F-1 Item 8 and Item 601 of Regulation S-K, includes:

  • EX-1.1 — Form of underwriting agreement.
  • EX-3.1 / EX-3.2 — Memorandum and articles of association (or equivalent constitutional documents under home-country law).
  • EX-4.x — Specimen share certificate, form of deposit agreement and form of ADR (for ADS offerings), forms of warrant and warrant agreement, indenture for debt offerings.
  • EX-5.1 — Opinion of counsel as to the legality of the securities being registered (typically issued by counsel in the issuer's home jurisdiction).
  • EX-8.1 / EX-8.2 — Tax opinions (U.S. federal and home-country) where required by Item 601.
  • EX-10.x — Material contracts: employment agreements, equity incentive plan documents, shareholder agreements, registration rights agreements, principal customer/supplier agreements, lease agreements for principal facilities, and (for issuers with VIE structures) the suite of contractual arrangements that establish the VIE.
  • EX-16.1 — Letter from a former independent registered public accounting firm, attached when the filing reflects a change in auditor and accordingly common on F-1/A amendments that update the auditor's report.
  • EX-21.1 — List of subsidiaries.
  • EX-23.1 / EX-23.2 — Consents of the independent registered public accounting firm and any other named experts.
  • EX-24.1 — Power of attorney (often integrated into the signature page).
  • EX-99.x — Additional miscellaneous exhibits, including consent of any director nominee, valuation reports, and translations of foreign-language source documents.
  • EX-FILING FEES (Exhibit 107) — The filing-fee exhibit, a mandatory standalone exhibit since 2022 and, in more recent filings, an Inline XBRL document.

Exhibit content varies in form: EX-3 documents are constitutional legal text; EX-5 and EX-8 are signed legal opinions; EX-10 documents are executed contracts with their own signature blocks, schedules, and exhibits; EX-21 is a tabular list of subsidiary names with their jurisdictions of incorporation; EX-23 is a brief one- to two-page auditor consent referencing the audit report and the use of the auditor's name; and EX-FILING FEES is a structured fee-calculation table rendered as Inline XBRL with the ffd: namespace.
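
For routing documents to different parsers, the type codes from documentFormatFiles[] and dataFiles[] can be bucketed by exhibit family. The mapping below is a sketch derived from the exhibit list above; the family labels and the function name exhibit_family are illustrative choices, not an official taxonomy.

```python
import re

# Exhibit-number prefix -> family label (labels are illustrative).
_EXHIBIT_FAMILIES = {
    "1": "underwriting agreement",
    "3": "constitutional documents",
    "4": "securities instruments",
    "5": "legality opinion",
    "8": "tax opinion",
    "10": "material contract",
    "16": "former-auditor letter",
    "21": "list of subsidiaries",
    "23": "consent",
    "24": "power of attorney",
    "99": "additional exhibit",
    "101": "XBRL schema or linkbase",
}


def exhibit_family(doc_type):
    """Bucket an EDGAR document-type code such as 'EX-10.15' into a family."""
    code = doc_type.strip().upper()
    if code in ("F-1", "F-1/A"):
        return "main registration statement"
    if code == "EX-FILING FEES":
        return "filing-fee exhibit"
    if code == "GRAPHIC":
        return "graphic"
    match = re.match(r"EX-(\d+)", code)
    if match:
        return _EXHIBIT_FAMILIES.get(match.group(1), "other exhibit")
    return "other"
```

Matching on the leading digits only means EX-10.1 and EX-10.15 land in the same bucket, while EX-1.1 and EX-101.SCH stay distinct.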

Included Content

Each accession folder includes metadata.json (always present) and every text/HTML/XML document of the original EDGAR submission that is not a GRAPHIC binary. This covers the main F-1 or F-1/A prospectus, the full slate of exhibits, the filing-fee Inline XBRL exhibit, and (where present) some of the XBRL schema and linkbase artifacts referenced in dataFiles[].

Excluded or Separate Content

Image binaries embedded in the submission (logos, photographs, signature scans, structural diagrams) are excluded from the dataset, even though their GRAPHIC entries remain enumerated in documentFormatFiles[] for reference. Some XBRL linkbase files (the _cal.xml, _def.xml, _lab.xml, _pre.xml set, and the corresponding .xsd schema) referenced in dataFiles[] may not be present locally for thinner amendments — only the SEC URL pointer is preserved. The full-submission .txt bundle is referenced via linkToTxt and as the trailing row of documentFormatFiles[]; whether the bundle itself is materialized inside the folder depends on the submission. Externally-incorporated documents (prior filings cross-referenced by the prospectus) are not pulled in; they remain at their original accession numbers. Confidential draft registration statements (DRS / DRSLTR) filed under the JOBS Act / FAST Act confidential-submission accommodation are out of scope and live under different EDGAR form types; the dataset covers only the public F-1 and F-1/A submissions that follow the public filing of the offering.

Form-Type Mix and the F-1 / F-1/A Relationship

F-1 and F-1/A records coexist in the same monthly partition and share an identical record shape. F-1/A folders are typically smaller because the amendment frequently re-files only the changed pages of the prospectus together with a refreshed auditor consent (EX-23.1), an updated legal opinion (EX-5.1), an updated filing-fee exhibit, and — when an auditor change has occurred — a former-auditor letter (EX-16.1). The amendment chain for a single offering is reconstructable from the shared 333-XXXXXX fileNo carried in entities[].fileNo, ordered chronologically by filedAt.
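
Reconstructing an amendment chain can be sketched as a group-by on fileNo followed by a sort on filedAt. The example assumes each record is an already-parsed metadata.json dictionary and that every filedAt value carries a timezone offset, as in the sample shown earlier; amendment_chains is a hypothetical helper name.

```python
from collections import defaultdict
from datetime import datetime


def amendment_chains(records):
    """Group parsed metadata dicts by fileNo, each chain ordered by filedAt."""
    chains = defaultdict(list)
    for meta in records:
        # A submission can list several entities; use each distinct fileNo once.
        file_nos = {
            entity.get("fileNo")
            for entity in meta.get("entities", [])
            if entity.get("fileNo")
        }
        for file_no in file_nos:
            chains[file_no].append(meta)
    for metas in chains.values():
        # fromisoformat parses offsets such as "-05:00" into aware datetimes.
        metas.sort(key=lambda m: datetime.fromisoformat(m["filedAt"]))
    return dict(chains)
```

Sorting on parsed datetimes rather than raw strings keeps the order correct when filings in one chain carry different UTC offsets (for example across a daylight-saving change).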

Changes in Required Content Over Time

The substantive content requirements of Form F-1 have evolved meaningfully across the dataset's coverage window (June 1996 to present), driven by both Form F-1 instructions and the underlying Form 20-F item set that F-1 incorporates by reference:

  • Sarbanes-Oxley Act (2002) introduced the Section 302 certification and Section 404 internal control over financial reporting (ICFR) concepts that are referenced in F-1 risk factors and MD&A. Foreign private issuers were not required to include Section 302/906 certifications inside the F-1 itself, but disclosure of internal-control framework status entered the prospectus narrative.
  • Form 20-F overhaul (2008) modernized the disclosure schedule that F-1 incorporates, changing item numbering, expanding executive-compensation disclosure, and aligning with IFRS-as-issued-by-the-IASB without U.S. GAAP reconciliation.
  • JOBS Act (2012) introduced the emerging-growth-company regime, adding cover-page check-boxes and the EGC scaled-disclosure regime (two years of audited financials, reduced executive-compensation tables, confidential submission accommodation for IPOs), all of which surface in F-1 prospectuses for qualifying foreign issuers.
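  • FAST Act (2015) further eased the IPO process for emerging growth companies, for example by shortening the required public-filing period before a roadshow from 21 to 15 days and by permitting omission of certain historical financial information. A 2017 SEC staff policy then extended nonpublic draft review to all issuers, expanding the use of confidentially submitted F-1 drafts that become public at the first public filing.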
  • Holding Foreign Companies Accountable Act (2020) and related PCAOB determinations (2021) prompted new mandatory risk-factor disclosure and cover-page representations for issuers whose auditors are located in jurisdictions that the PCAOB cannot fully inspect, and added jurisdiction-of-operations and CCP-controlled-entity disclosures particularly relevant to PRC-based foreign private issuers.
  • Filing-fee disclosure modernization (rule effective 2022) replaced the prior tabular filing-fee disclosure on the cover page with a separate Exhibit 107 (EX-FILING FEES). Structuring of the exhibit as Inline XBRL using the ffd taxonomy became mandatory later, on phased compliance dates in 2024 and 2025. The presence of this exhibit, and separately of its Inline XBRL structure, is therefore a temporal marker in the dataset.
  • Pay-versus-performance, cybersecurity, and clawback rules primarily target domestic and Form 20-F annual reporting and have only indirect effect on F-1 prospectus narrative.

Changes in Data Format Over Time

Early F-1 records (mid-1996 through roughly 2002) were filed predominantly as plain-text ASCII submissions, with a single .txt document carrying the entire registration statement and exhibits separated by EDGAR <DOCUMENT> envelopes. From the early 2000s onward, HTML became the default presentation format for the main F-1 and individual exhibits, and by the late 2000s essentially all F-1 documents were delivered as .htm files inside the same SGML envelope. Modern HTML payloads are styled with inline CSS and include extensive in-page tables for capitalization, dilution, executive compensation, principal shareholders, and the financial statements.

XBRL did not historically apply to Form F-1 except via the universal cover-page tagging requirements that the SEC extended progressively. The most material format change in recent years is the adoption of Inline XBRL for the filing-fee exhibit (Exhibit 107), in which the fee-calculation table is rendered as XHTML with <ix:> tags surfacing each fact (offering amount, fee rate, fee paid, offsets) under the ffd namespace. The main F-1 document itself can also carry Inline XBRL tags for cover-page facts; in such cases the documentUrl is exposed through the https://www.sec.gov/ix?doc=/... viewer rewriter. Auxiliary XBRL schema and linkbase artifacts (EX-101.SCH, EX-101.CAL, EX-101.DEF, EX-101.LAB, EX-101.PRE) appear in dataFiles[] for filings that include traditional, non-Inline XBRL data alongside the HTML.

Interpretation Notes

  • Amendments are first-class records. Each F-1/A is its own accession with its own folder; reconstructing an offering's full registration history requires grouping by the shared 333-XXXXXX fileNo across formType in (F-1, F-1/A) and ordering by filedAt.
  • HTML-strip step. Every .htm payload begins with the unquoted SGML <DOCUMENT> header (five header tags before <HTML>); a parser that does not strip this header before processing will fail or produce incorrect DOM output.
  • Inline XBRL surface. Filing-fee exhibits and many primary F-1 documents embed Inline XBRL with dei, ffd, xbrli, and ix namespaces. The fact stream is recoverable directly from the HTML body without a separate XML download.
  • dataFiles[] is the XBRL discovery path. Because linkToXbrl is consistently empty for F-1 records even when XBRL is present, all XBRL artifact discovery should go through the dataFiles[] array.
  • HTML entity encoding. Free-text fields in metadata.json (notably sic) preserve HTML-entity encoding such as &amp;; consumers rendering plain text should decode these.
  • Foreign-issuer fingerprints. irsNo is uniformly "000000000" for foreign issuers without a U.S. EIN; stateOfIncorporation carries EDGAR's foreign-jurisdiction code set (e.g., E9 Cayman Islands, D8 British Virgin Islands, G7 Denmark, F4 PRC), which is the most reliable jurisdiction signal at the metadata layer.
  • Exhibit numbering is filer-controlled within the schedule. The exhibit-type vocabulary (EX-5.1, EX-10.15, EX-99.1, EX-FILING FEES, etc.) is assigned by the filer and not strictly normalized; consumers building exhibit-type indexes should canonicalize against Item 601 categories rather than rely on exact string matching.
  • Image exclusion is structural. Any visual content that would have been part of the prospectus (photographs, structural diagrams, organizational charts, signature scans) is referenced by GRAPHIC entries in documentFormatFiles[] but not stored in the record; downstream consumers needing those binaries must fetch them from the SEC URL.
  • Folder name vs. accession number. metadata.accessionNo is dashed (NNNNNNNNNN-YY-NNNNNN); the folder name is the same digits without dashes. Conversion is a simple substitution.
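Three of the notes above (the HTML-strip step, HTML entity decoding, and the folder-name conversion) are mechanical enough to sketch in a few lines of Python. This is a minimal sketch, not the dataset's own tooling. It assumes the SGML envelope is whatever precedes the first <HTML> tag and that a trailing </TEXT></DOCUMENT> pair may follow the body. The function names are hypothetical.

```python
import html
import re

# Assumption: the envelope is everything before the first <html ...> tag, and
# an optional </TEXT></DOCUMENT> pair closes the payload.
_HTML_START = re.compile(r"<html\b", re.IGNORECASE)
_ENVELOPE_TAIL = re.compile(r"\s*</TEXT>\s*</DOCUMENT>\s*\Z", re.IGNORECASE)


def strip_sgml_header(payload: str) -> str:
    """Return the HTML body of an .htm payload without its SGML envelope."""
    match = _HTML_START.search(payload)
    if match is None:
        return payload  # no <html> tag found (e.g. plain text); leave untouched
    body = payload[match.start():]
    return _ENVELOPE_TAIL.sub("", body)


def folder_name(accession_no: str) -> str:
    """Convert a dashed accession number into the dash-free folder name."""
    return accession_no.replace("-", "")


def decode_text(value: str) -> str:
    """Decode HTML entities preserved in metadata.json free-text fields."""
    return html.unescape(value)
```

Searching for the first <html tag rather than counting header lines keeps the helper tolerant of payloads whose envelope differs slightly from the usual shape.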

Who Files or Publishes This Dataset, and When

The filer of a Form F-1 or Form F-1/A is the issuer-registrant: a non-U.S. company that qualifies as a foreign private issuer (FPI) under Rule 405 of the Securities Act and Rule 3b-4 of the Exchange Act, and that is registering securities for offer or sale in the United States under the Securities Act of 1933.

An issuer is an FPI unless both:

  • more than 50% of its outstanding voting securities are held of record by U.S. residents, and
  • one of the following is true: a majority of its executive officers or directors are U.S. citizens or residents; more than 50% of its assets are in the United States; or its business is principally administered in the United States.

FPI status is tested as of the last business day of the most recently completed second fiscal quarter.
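The two-prong test above reduces to a single boolean expression. The sketch below is illustrative only: the IssuerFacts fields are hypothetical names, the issuer is assumed to be organized outside the United States, and the inputs are assumed to be measured on the test date described above.

```python
from dataclasses import dataclass


@dataclass
class IssuerFacts:
    # Hypothetical field names; percentages are expressed on a 0-100 scale.
    us_record_holder_pct: float            # voting securities held of record by U.S. residents
    majority_us_officers_or_directors: bool
    us_asset_pct: float                    # share of assets located in the United States
    administered_principally_in_us: bool


def is_foreign_private_issuer(facts: IssuerFacts) -> bool:
    """An issuer is an FPI unless BOTH prongs of the test are met."""
    ownership_prong = facts.us_record_holder_pct > 50.0
    contacts_prong = (
        facts.majority_us_officers_or_directors
        or facts.us_asset_pct > 50.0
        or facts.administered_principally_in_us
    )
    return not (ownership_prong and contacts_prong)
```

An issuer with majority U.S. record ownership still qualifies as an FPI when none of the three business-contact conditions holds, which is why the final line negates the conjunction of the two prongs and not either prong alone.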

The filer population consists of operating companies, holding companies, and other commercial entities organized outside the United States, including:

  • non-U.S. operating companies undertaking a first U.S. registered offering (typically an IPO on Nasdaq or NYSE, often via American Depositary Shares);
  • non-U.S. issuers running follow-on, secondary, debt, convertible, or resale offerings when not eligible for the short-form Form F-3;
  • offshore holding companies (Cayman, BVI, Bermuda, Luxembourg, etc.) used to list operating businesses in mainland China, Europe, or elsewhere, where the holding company itself meets the FPI definition.

The filer is always the issuer. Underwriters, depositaries, selling shareholders, auditors, and counsel are named in the registration statement and may sign consents or exhibits, but they do not file Form F-1.

Who Does Not File Form F-1

The following parties register elsewhere and are outside this dataset:

  • Domestic U.S. issuers: Form S-1 (or S-3, S-4, S-8, S-11).
  • FPIs eligible for short-form registration: Form F-3 (generally requires at least 12 months of timely Exchange Act reporting plus a public-float or qualifying-debt test).
  • FPIs registering business combinations or exchange offers: Form F-4.
  • Eligible Canadian issuers under the Multijurisdictional Disclosure System: Forms F-7, F-8, F-10, or F-80.
  • Foreign governments and political subdivisions issuing debt: Schedule B.
  • ADS facilities themselves: registered by the depositary on Form F-6 (separate from the issuer's Form F-1 registering the underlying shares).

An FPI that loses FPI status migrates to the S-series; a domestic issuer that becomes an FPI may move to F-1 or F-3.

What Triggers a Filing

Form F-1 is transactional and event-driven, not periodic. The trigger is the issuer's decision to undertake a registered offering of securities into the U.S. market under Section 5 of the Securities Act, where no shorter or more specialized Securities Act form is available.

Common trigger events:

  • a first U.S. registered public offering by a non-U.S. company (the dominant case);
  • a follow-on or secondary offering by an FPI that is not yet F-3 eligible;
  • a resale registration covering shares issued in a private placement, PIPE, or business combination;
  • registration of debt or convertible securities by an FPI that cannot use F-3.

There is no recurring schedule. Filing volume tracks U.S. capital-markets activity and FPI listing windows.

When Records Are Created

The initial F-1 is filed when the issuer is ready to begin (or to make public) its SEC registration process. Under the JOBS Act and subsequent SEC staff policy (expanded in 2017 to all issuers), an issuer may first submit a draft F-1 confidentially; those drafts enter the public EDGAR record only when the issuer publicly files. The dataset reflects the public filing date, not the date of first SEC engagement.

A Form F-1/A is filed whenever the issuer needs to amend a pending F-1 before effectiveness. Typical drivers:

  • SEC staff comments — the Division of Corporation Finance issues comment letters; the issuer responds with one or more F-1/A amendments. This cycle typically runs two to six rounds over weeks or months.
  • Stale financial statements — audited financials must be refreshed under the SEC age-of-financial-statements rules before effectiveness.
  • Price-range and pricing amendments — a pre-effective F-1/A inserts the bona fide price range, share count, and use of proceeds; final terms are typically communicated through a Rule 424(b) prospectus.
  • Material developments — any material change in business, capitalization, risk factors, or deal structure must be reflected so the registration statement is not misleading at effectiveness (Section 11).
  • Mechanical updates — adding or removing securities, selling shareholders, or exhibits; signature page updates; "delaying amendments" under Rule 473.

There is no fixed deadline for an F-1/A; timing is driven by the SEC review cycle, financial-statement age-out, and the issuer's pricing schedule.

The typical chronology for an F-1 IPO runs:

  1. (Optional) confidential draft submission.
  2. Initial public Form F-1 filing on EDGAR.
  3. Multiple F-1/A amendments addressing comments and refreshing disclosure.
  4. Final pre-effective F-1/A with bona fide price range.
  5. SEC declares the registration effective under Section 8(a) (typically after an acceleration request under Rule 461).
  6. Issuer prices and files a Rule 424(b) prospectus.
  7. Closing (T+1 or T+2 after pricing).

Regulatory Basis

Form F-1 is a registration statement under the Securities Act of 1933. Key provisions:

  • Section 5 prohibits offers or sales without an effective registration statement (or exemption).
  • Section 6 governs the filing mechanics; Section 7 and Schedule A specify required content; Section 8 governs effectiveness and SEC stop-order authority; Section 10 prescribes prospectus content; Section 11 imposes liability for material misstatements at effectiveness.
  • Rule 405 supplies the FPI definition. Regulation S-X governs financial statements; Form 20-F supplies most line-item disclosure requirements; Regulation C governs filing and amendment mechanics.

The SEC's Division of Corporation Finance, including the Office of International Corporate Finance, reviews F-1 filings.

Important Distinctions

  • Filer is the issuer, not the underwriter. Underwriters bear Section 11 liability and appear in exhibits and the Plan of Distribution but do not file the form.
  • F-1 vs F-3. F-3 is the short-form, incorporation-by-reference registration available to FPIs with an established reporting history; FPIs without F-3 eligibility default to F-1.
  • F-1 vs F-4. F-4 covers business combinations and exchange offers, not primary cash offerings.
  • F-1 vs Form 20-F. Form 20-F is the FPI annual report under the Exchange Act and is periodic; F-1 borrows its disclosure scaffolding but is offering-driven.
  • MJDS forms (F-7/F-8/F-10/F-80). Eligible Canadian issuers use these instead of F-1.
  • Schedule B. Foreign sovereigns and their political subdivisions register debt there.
  • Withdrawn registrations. An issuer abandoning an offering files Form RW; the prior F-1 and F-1/A records remain in EDGAR.
  • Dataset start date. F-1 has existed since the early-1980s FPI registration overhaul, but the dataset begins in June 1996 when EDGAR electronic filing became mandatory for most registrants. Earlier F-1 filings exist only in paper form.
  • Persons disclosed are not filers. Directors, officers, principal and selling shareholders, depositaries, auditors, counsel, and underwriters appear in the registration statement but the filing entity is always the FPI itself.

How This Dataset Differs From Similar Datasets or Filings

Form F-1 sits at the intersection of two regimes: Securities Act registration and the foreign private issuer framework. Several adjacent datasets overlap with it in either purpose (registering securities) or filer population (FPIs). The comparisons below identify the closest neighbors and the precise boundary between each one and Form F-1.

Form S-1 (Domestic Long-Form Registration)

The closest functional analogue. S-1 and F-1 share nearly identical disclosure architecture: prospectus body, business description, risk factors, MD&A, plan of distribution, dilution, use of proceeds, and audited financials.

Key differences are filer eligibility and FPI accommodations:

  • F-1 filers may present financials under IFRS as issued by the IASB without U.S. GAAP reconciliation; S-1 filers cannot.
  • F-1 permits modified executive compensation and FPI governance disclosures; S-1 follows full Regulation S-K.
  • F-1 surfaces FPI-specific risk content (home-country regulation, currency convertibility, VIE structures, dual-class FPI arrangements) that does not appear in S-1.

Use S-1 for the U.S.-issuer IPO universe; F-1 for the FPI slice. The two are not interchangeable populations.

Form F-3 (Short-Form FPI Registration)

Same filer universe as F-1 but at the opposite end of the issuer-maturity curve. F-3 is available to seasoned FPIs meeting reporting-history and float thresholds and incorporates 20-F and 6-K content by reference rather than restating disclosures.

  • F-1: IPOs and FPIs not yet eligible for short-form registration; self-contained, disclosure-heavy.
  • F-3: follow-ons, shelf takedowns, ATMs by established FPIs; intentionally thin, reliant on the reporting record.

Use F-1 for first-time U.S. listings; F-3 for secondary capital raises by listed FPIs.

Form F-4 (FPI Business Combinations)

The FPI parallel to S-4. Both F-1 and F-4 are Securities Act registrations with prospectus-style disclosure, but the transaction context is different.

  • F-1 registers cash offerings (typically IPOs or non-shelf follow-ons).
  • F-4 registers share consideration in M&A, exchange offers, and reclassifications, and adds target-company financials, fairness opinions, merger-agreement summaries, and proxy/consent content.

Use F-4 for cross-border M&A consideration disclosures; F-1 will not capture them.

Form 20-F (FPI Annual Report)

Periodic, not transactional. The relationship to F-1 is sequential: an FPI files F-1 once (initial U.S. registration) and 20-F annually thereafter. F-3 takedowns frequently incorporate 20-F by reference.

Use 20-F as the longitudinal time series for an FPI's disclosure; use F-1 for the single registration event.

Form 6-K (FPI Interim/Current Reports)

Furnished, not filed. 6-K is keyed to home-country disclosure obligations and exchange announcements rather than the fixed event list that drives Form 8-K. It is a continuous stream of interim items (press releases, interim financials, material announcements).

No content overlap with F-1, but the two are complementary when reconstructing an FPI's information record around an offering. Form 6-K is not a substitute for F-1.

MJDS Forms: F-7, F-8, F-10, F-80 (Eligible Canadian Issuers)

Specialized registrations under the Multijurisdictional Disclosure System: F-10 (long-form), F-7 (rights offerings), F-8 and F-80 (business combinations and exchange offers, with F-80 available at a higher level of U.S. ownership than F-8). MJDS filings rely on Canadian disclosure documents and accept Canadian GAAP/IFRS without U.S.-style restructuring.

Eligible Canadian issuers generally choose MJDS over F-1 because of the lighter burden. As a result, the F-1 dataset systematically underrepresents MJDS-eligible Canadians and captures only Canadian issuers that opt out of or do not qualify for MJDS, plus all non-Canadian FPIs. Use MJDS-form datasets for Canadian cross-border filings.

F-1 vs. F-1/A (Intra-Dataset Distinction)

F-1/A filings are pre-effective (and occasionally post-effective) amendments to F-1 registration statements and make up a substantial share of records, since SEC review typically produces multiple amendment rounds.

The meaningful contrast is between:

  • the initial F-1 (often preceded by a confidential DRS for emerging growth companies under the JOBS Act), and
  • successive F-1/A versions that incorporate staff comments, updated financials, revised pricing ranges, and final pricing.

Studies of disclosure evolution, comment-letter response, or final-prospectus content must distinguish initial from final amendment. Treating all F-1 and F-1/A records as interchangeable conflates draft and final disclosure.
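Separating the initial filing from the final amendment starts with rebuilding each offering's chain. The sketch below groups already-loaded metadata.json dictionaries by their 333- file number and orders each group by filedAt. It assumes the file number sits in entities[].fileNo and that filedAt is an ISO-8601 string with a numeric UTC offset; the function names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def file_numbers(meta: dict) -> set:
    """Collect the Securities Act (333-) file numbers attached to a record."""
    return {
        entity["fileNo"]
        for entity in meta.get("entities", [])
        if str(entity.get("fileNo", "")).startswith("333-")
    }


def build_chains(records: list) -> dict:
    """Map each 333- file number to its F-1 / F-1/A records in filing order."""
    chains = defaultdict(list)
    for meta in records:
        if meta.get("formType") not in ("F-1", "F-1/A"):
            continue
        for file_no in file_numbers(meta):
            chains[file_no].append(meta)
    for chain in chains.values():
        # Assumption: every filedAt carries an offset, so values are comparable.
        chain.sort(key=lambda m: datetime.fromisoformat(m["filedAt"]))
    return dict(chains)
```

With the chains built, chain[0] is the earliest public filing and chain[-1] the latest amendment on file, which is the pair a draft-versus-final comparison would start from.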

Rule 424(b) Prospectuses

Sequential stages of the same offering. F-1 (and its amendments) is the pre-effective registration statement; 424(b) is the post-effective prospectus actually delivered to investors, capturing final pricing and any Rule 430A changes.

  • Use 424(b) for offering economics: final issue price, proceeds, underwriting compensation, primary/secondary allocation.
  • Use F-1 / F-1/A for pre-effective disclosure, risk-factor language, and registration narrative.

Neither substitutes for the other; together they describe the full offering lifecycle.

Form CB (Exempt Cross-Border Transactions)

A short notice form filed by foreign issuers or bidders for cross-border tender offers, exchange offers, rights offerings, or business combinations qualifying for Rule 13e-4(h)(8), Rule 14d-1(c), or Securities Act Rule 802 exemptions. The U.S. submission is a cover form attaching home-country offering documents, not a U.S.-format prospectus.

F-1 is the opposite case: no exemption applies and the issuer produces a full U.S. registration statement. Both can describe FPI offerings reaching U.S. holders, but they represent different regulatory pathways and are not interchangeable.

Boundary Summary

The Form F-1 Files Dataset is the corpus of long-form Securities Act registration statements and pre-effective amendments filed by foreign private issuers that do not qualify for short-form registration (F-3), MJDS treatment (F-7/F-8/F-10/F-80), or a Securities Act exemption (Form CB).

It is:

  • the FPI counterpart to S-1,
  • the upstream cousin of F-3,
  • the transactional sibling of F-4,
  • the registration-stage predecessor of Rule 424(b) prospectuses, and
  • a one-time event distinct from the periodic 20-F and continuous 6-K cycle.

For prospectus-level disclosure produced at the moment an FPI first registers a U.S. offering (or any later registered offering ineligible for short-form treatment), this dataset is the correct source. For any other FPI or registration scenario, one of the adjacent datasets above applies instead.

Who Uses This Dataset

The Form F-1 corpus is the canonical record of foreign private issuer registration in the United States. Each profession below works on a different slice of the filing — prospectus text, financial statements, exhibits, or metadata — and converts it into a specific output.

Capital Markets and Securities Lawyers

Attorneys representing FPIs, sponsors, and underwriters use the dataset as a precedent library. They pull recent F-1s from the same jurisdiction, sector, and size band to benchmark risk-factor language (country, VIE, exchange-control, sanctions, enforcement), related-party and controlling-shareholder disclosure, plan of distribution, lock-ups, dual-class structures, and use-of-proceeds wording. The exhibit index drives most drafting work: legal opinions (Ex. 5), tax opinions (Ex. 8), underwriting agreements (Ex. 1), articles and bylaws (Ex. 3), material contracts (Ex. 10), subsidiary lists (Ex. 21), and auditor consents. Diffing successive F-1/A amendments reconstructs the staff comment-and-response cycle, which trains associates and informs negotiation on live deals.

ECM Bankers and Syndicate Desks

IPO origination and equity capital markets teams build comparable-deal analyses for FPI mandates. From the cover, plan of distribution, and underwriting exhibit they extract offering size, primary/secondary mix, greenshoe, gross spread, syndicate composition, lock-up terms, pre-IPO cap tables, and insider holdings. MD&A and selected financial data feed growth and margin comps. These extracts populate pitch books, league-table materials, fee grids, and price-talk discussions.

Buy-Side IPO Analysts

Long-only fundamental, event-driven, and IPO-dedicated funds use F-1s as the primary diligence document for foreign offerings, where home-country disclosure is often non-English or sparse. They focus on the business description, customer concentration, audited financials and IFRS/GAAP reconciliations, jurisdiction-specific risks, governance disclosure around dual-class shares and FPI exemptions, related-party flows in VIE structures, and dilution. Comparing F-1/A amendments surfaces price-range cuts, share-count changes, restatements, and new risk factors as deal-momentum signals.

Underwriter Due-Diligence and Compliance Teams

Due-diligence committees, new-issue review desks, and broker-dealer compliance officers use the corpus to support Section 11 and Section 12 defenses and FINRA Rule 5110 filings. They rely on the full prospectus for material-disclosure verification, auditor consents and financials for going-concern flags, legal opinions, and the underwriting agreement for compensation and conflicts. Prior FPI precedents test the adequacy of their own diligence records.

Academic Finance, Accounting, and Securities-Law Researchers

Researchers use the dataset as an empirical corpus for studies on FPI underpricing and long-run performance, JOBS Act emerging-growth-company effects (confidential submission, scaled disclosure), cross-listing and bonding theory, VIE and dual-class governance, IFRS-to-GAAP comparability, and textual analysis of risk factors and boilerplate. Filer CIK, accession number, form type, and amendment number support panel construction; the document text supports NLP pipelines.

Corporate Development and Cross-Border M&A Teams

Strategic acquirers and cross-border M&A advisors mine F-1s for precedent on targets in specific jurisdictions. Pre-IPO ownership and shareholder agreements, Exhibit 21 subsidiary lists, material contracts, foreign-investment and CFIUS disclosures, and post-IPO change-of-control restrictions inform target screening, valuation, deal structuring, and reps-and-warranties drafting.

KYC, AML, and Sanctions Onboarding

Financial-crime teams at banks, prime brokers, custodians, and asset managers use F-1 disclosures to onboard FPIs and their controlling shareholders. They focus on beneficial-ownership and principal-shareholder tables, officer and director biographies, offshore holding structures, operations in sanctioned jurisdictions, state-owned counterparty dealings, PEP relationships, and pending legal or tax proceedings. The filings provide a reproducible primary-source record for enhanced-diligence files.

Credit and Convertible Analysts

Analysts evaluating convertible bonds, pre-IPO loans, and cornerstone tickets in FPIs use the financial statements, capitalization tables, and Exhibit 10 credit agreements to assess leverage, liquidity runway, debt maturity, and covenants, then layer on use-of-proceeds and dilution to model post-offering capital structure.

Data Engineers, Quants, and ML/RAG Developers

Financial-data engineering teams ingest the corpus to build FPI offering databases and event tables. Work includes parsing prospectuses into normalized fields (size, share count, range, sector, country, auditor), extracting syndicate and exhibit chains, backtesting IPO and post-IPO strategies, linking F-1 records to subsequent 20-F, 6-K, and Form 4 filings, and training or evaluating LLMs on prospectus and risk-factor text for retrieval-augmented systems serving lawyers, bankers, and analysts. The combination of full-text documents (TXT/HTML/PDF) and structured JSON metadata supports both rule-based and LLM-based pipelines.

In summary, lawyers treat the Form F-1 Files Dataset as a precedent library, bankers as a comp database, analysts as a diligence source, compliance teams as a liability-defense file, researchers as a corpus, and engineers as a training and extraction substrate. The 1996-onward completeness of F-1 and F-1/A filings, combined with the full exhibit set, is what makes a single dataset support all of these workflows.

Specific Use Cases

The following workflows draw on the prospectus body, exhibit slate, and submission metadata of the Form F-1 Files Dataset.

  • Benchmarking jurisdiction-specific risk-factor language for FPI prospectus drafting. Capital markets lawyers pull risk-factor sections from recent F-1s filed by issuers sharing the same stateOfIncorporation code (e.g., E9 Cayman, F4 PRC, D8 BVI) and SIC band, then diff the language to draft VIE, HFCAA, exchange-control, and enforceability-of-civil-liabilities risk factors. The output is a precedent bank that feeds first-draft risk-factor sections and staff-comment anticipation memos for live mandates.

  • Reconstructing the SEC comment-and-response cycle from F-1/A amendment chains. Grouping records by the shared 333-XXXXXX fileNo across formType in (F-1, F-1/A) and ordering by filedAt produces the full pre-effective amendment chain for a single offering. Section-by-section diffs across consecutive amendments surface staff-driven changes in disclosure (price ranges, share counts, new risk factors, restated financials, auditor changes signaled by EX-16.1), which trains associates, informs negotiation, and feeds academic studies of disclosure evolution.

  • Building an FPI IPO comparables database for ECM pitch books. Bankers parse the cover page, prospectus summary, plan of distribution, and EX-1.1 underwriting agreement to extract deal size, primary/secondary mix, greenshoe, gross spread, lead/co-manager syndicate, lock-up duration, and pre-IPO cap-table composition. Joined with entities[].sic, stateOfIncorporation, and ticker, these fields populate league-table extracts, fee grids, and price-talk decks for new FPI mandates.

  • Extracting Exhibit 21 subsidiary lists for cross-border corporate-structure mapping. EX-21.1 documents are pulled from each accession folder and parsed into subsidiary-name and jurisdiction pairs, then linked to the parent CIK and 333- file number. The resulting graph supports CFIUS screening, CFC/PFIC tax analysis, sanctions exposure mapping for KYC files, and identification of VIE-tier and offshore SPV layers in PRC and Cayman holding structures.

  • Harvesting Inline XBRL filing-fee facts from Exhibit 107. For filings post-2022, the EX-FILING FEES exhibit is parsed by reading the <ix:nonNumeric> and <ix:nonFraction> tags under the ffd: namespace inside the SGML-stripped XHTML body. Extracted facts (offering amount, fee rate, fee paid, prior-fee offsets) feed registered-offering size statistics, fee-aggregation across an amendment chain, and reconciliation against final 424(b) prospectuses.

  • Building a textual NLP corpus for FPI prospectus retrieval and LLM evaluation. Data engineers strip the leading <DOCUMENT> SGML envelope from each .htm payload, segment the prospectus into canonical sections (risk factors, MD&A, business, related-party transactions, taxation, enforceability, plan of distribution), and index the segments by CIK, jurisdiction, SIC, and filedAt. The corpus drives retrieval-augmented systems for FPI counsel and analysts, fine-tuning evaluations on cross-border disclosure, and large-scale textual studies of EGC scaled-disclosure adoption and HFCAA risk-factor diffusion.

  • Linking F-1 IPO records to subsequent 20-F, 6-K, and Form 4 filings for post-IPO event studies. Researchers and quants use issuer CIK as the join key to chain each initial F-1 to the issuer's later 20-F annual reports, 6-K interim items, and insider Form 4 filings. The resulting panel supports underpricing studies, lock-up-expiry insider-selling analysis, post-IPO restatement tracking, and long-run performance work on the foreign-issuer slice of the U.S. market from June 1996 forward.
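For the Exhibit 107 workflow in the list above, the tagged facts can be read with the standard-library HTML parser alone. The following is a rough sketch under stated assumptions: the input is the exhibit body with its SGML envelope already removed, only ix:nonFraction and ix:nonNumeric elements are collected, and the element names in the usage example are placeholders, not real ffd taxonomy concepts.

```python
from html.parser import HTMLParser

_FACT_TAGS = ("ix:nonfraction", "ix:nonnumeric")  # HTMLParser lowercases tag names


class FfdFactParser(HTMLParser):
    """Collect (name, text) pairs for Inline XBRL facts in the ffd namespace."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.facts = []    # finished facts, in closing order
        self._open = []    # stack of [name, text parts] for facts still open

    def handle_starttag(self, tag, attrs):
        if tag in _FACT_TAGS:
            self._open.append([dict(attrs).get("name") or "", []])

    def handle_data(self, data):
        for entry in self._open:  # text belongs to every enclosing fact
            entry[1].append(data)

    def handle_endtag(self, tag):
        if tag in _FACT_TAGS and self._open:
            name, parts = self._open.pop()
            if name.startswith("ffd:"):
                self.facts.append((name, "".join(parts).strip()))


def extract_ffd_facts(xhtml: str) -> list:
    parser = FfdFactParser()
    parser.feed(xhtml)
    parser.close()
    return parser.facts
```

A call such as extract_ffd_facts('<ix:nonFraction name="ffd:ExampleAmount">1,000.00</ix:nonFraction>') yields the displayed text of each fact. Scale, sign, and format attributes are deliberately ignored here, so numeric normalization would still be needed before aggregating fees.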

Dataset Access

The Form F-1 Files Dataset is available through three access methods: a JSON metadata index, a full archive download, and individual container downloads. Containers are ZIP files organized by year and month, covering filings from June 1996 to present.

Dataset Index JSON API: https://api.sec-api.io/datasets/form-f1-files.json

Returns dataset-level metadata (name, description, last updated timestamp, earliest sample date, total records, total size, form types, container format, file types) along with the full dataset download URL and the list of all container files. Each container entry includes its key, size, record count, last updated timestamp, and download URL. Use this endpoint to monitor which containers were updated in the most recent refresh run and decide which to re-download incrementally. This endpoint does not require an API key.

Example response:

{
  "datasetId": "1f13365b-9ae0-692b-92be-c06b5f5da436",
  "datasetDownloadUrl": "https://api.sec-api.io/datasets/form-f1-files.zip",
  "name": "Form F-1 Files Dataset",
  "updatedAt": "2026-05-07T02:50:21.261Z",
  "earliestSampleDate": "1996-06-01",
  "totalRecords": 88926,
  "totalSize": 4113149130,
  "formTypes": ["F-1", "F-1/A"],
  "containerFormat": "ZIP",
  "fileTypes": ["TXT", "JSON", "HTML", "PDF"],
  "containers": [
    {
      "downloadUrl": "https://api.sec-api.io/datasets/form-f1-files/2026/2026-04.zip",
      "key": "2026/2026-04.zip",
      "size": 13818783,
      "records": 154,
      "updatedAt": "2026-05-07T02:50:21.261Z"
    }
  ]
}
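One way to use the index for incremental syncs is to compare each container's updatedAt against the time of the previous sync. The sketch below assumes the response shape shown in the example and timestamps in ISO-8601 form with a trailing Z; the helper names and the last_sync bookkeeping are hypothetical.

```python
import json
from datetime import datetime
from urllib.request import urlopen

INDEX_URL = "https://api.sec-api.io/datasets/form-f1-files.json"


def parse_ts(value: str) -> datetime:
    """Parse an ISO-8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def fetch_index(url: str = INDEX_URL) -> dict:
    """Download and decode the dataset index (no API key needed per the text)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def containers_updated_since(index: dict, last_sync: str) -> list:
    """Return the container entries refreshed after the previous sync time."""
    cutoff = parse_ts(last_sync)
    return [
        container
        for container in index.get("containers", [])
        if parse_ts(container["updatedAt"]) > cutoff
    ]
```

Typical use would be stale = containers_updated_since(fetch_index(), "2026-05-01T00:00:00.000Z"), followed by downloading each entry's downloadUrl with the API token appended.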

Download Entire Dataset: https://api.sec-api.io/datasets/form-f1-files.zip?token=YOUR_API_KEY

Downloads the complete dataset as a single ZIP archive containing every monthly container from June 1996 onward. Given the dataset size, prefer this only for full local mirrors; for incremental syncs, use the per-container URLs from the index. This endpoint requires an API key.

Download Single Container: https://api.sec-api.io/datasets/form-f1-files/2026/2026-04.zip?token=YOUR_API_KEY

Downloads one monthly container ZIP. Each container holds the metadata file and all documents (excluding image files) for every F-1 and F-1/A accession in that month, grouped by accession number. This endpoint requires an API key.
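Once a container has been downloaded, its records can be walked without unpacking it to disk. This sketch assumes each accession folder inside the ZIP holds a file named metadata.json, possibly nested under further path components; that layout is inferred from the description above and has not been verified against an actual archive. The helper names are hypothetical.

```python
import io
import json
import zipfile
from pathlib import PurePosixPath


def open_container(data: bytes) -> zipfile.ZipFile:
    """Open a downloaded container held in memory."""
    return zipfile.ZipFile(io.BytesIO(data))


def iter_metadata(container: zipfile.ZipFile):
    """Yield (folder name, metadata dict) for every record in the container."""
    for name in container.namelist():
        path = PurePosixPath(name)
        if path.name != "metadata.json":
            continue
        with container.open(name) as handle:
            yield path.parent.name, json.load(handle)


def count_by_form_type(container: zipfile.ZipFile) -> dict:
    """Tally records per formType, e.g. to sanity-check a monthly download."""
    counts = {}
    for _folder, meta in iter_metadata(container):
        form_type = meta.get("formType", "unknown")
        counts[form_type] = counts.get(form_type, 0) + 1
    return counts
```

Comparing the total from count_by_form_type against the records value reported for the same container in the index is a cheap integrity check after each download.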

Frequently Asked Questions

What forms does this dataset cover?

The dataset covers Form F-1, the long-form Securities Act registration statement filed by foreign private issuers, and Form F-1/A, the pre- or post-effectiveness amendment to such a registration. Both form types coexist in the same monthly partition and share an identical record shape.

What does one record in this dataset represent?

One record is one complete EDGAR submission keyed by exactly one SEC accession number, materialized as a single folder containing a metadata.json submission header and every text, HTML, XML, or PDF document that was part of the original submission. A record corresponds 1:1 to a row of the EDGAR full-submission index for F-1 and F-1/A, not to a per-issuer or per-offering aggregation, so a single offering typically generates one initial F-1 plus a sequence of F-1/A amendment records.

Who is required to file Form F-1?

The filer is always the issuer-registrant: a non-U.S. company that qualifies as a foreign private issuer under Rule 405 of the Securities Act and that is registering securities for offer or sale in the United States under the Securities Act of 1933, where no shorter or more specialized Securities Act form (F-3, F-4, F-7, F-8, F-10) is available. Underwriters, depositaries, selling shareholders, auditors, and counsel may be named in the registration statement and may sign exhibits, but they do not file the form themselves.

What time period does the dataset cover?

The dataset begins in June 1996, when EDGAR electronic filing became mandatory for most registrants, and runs to present. Earlier F-1 filings exist only on paper and are not part of this corpus.

What file format is the dataset distributed in?

The dataset is distributed as monthly ZIP containers organized by year and month. File types found inside the containers are HTML/HTM, JSON (the metadata file), TXT (full-submission bundles), and PDF (occasional exhibit attachments). Image binaries enumerated as GRAPHIC entries in documentFormatFiles[] are intentionally excluded.

How does this dataset differ from Form S-1 and Form F-3 datasets?

Form S-1 is the closest functional analogue but covers domestic U.S. issuers, not FPIs, and its filers cannot present IFRS financial statements without U.S. GAAP reconciliation as F-1 filers can. Form F-3 covers the same FPI filer universe as F-1 but is a short-form, incorporation-by-reference registration available only to seasoned FPIs that meet reporting-history and float thresholds; F-1 is used by FPIs at first U.S. listing or by FPIs not yet eligible for F-3.

How do F-1/A amendments relate to the original F-1?

Each F-1/A is its own first-class record with its own accession number and folder; reconstructing a single offering's full registration history requires grouping records by the shared 333-XXXXXX fileNo carried in entities[].fileNo across formType in (F-1, F-1/A) and ordering them chronologically by filedAt. Treating all F-1 and F-1/A records as interchangeable conflates draft and final disclosure.