Form F-4 Files Dataset

The Form F-4 Files Dataset packages every Form F-4 and Form F-4/A registration statement submitted to EDGAR by foreign private issuers since October 1994. Form F-4 is the registration statement prescribed by 17 CFR 239.34 under the Securities Act of 1933 for foreign private issuers (FPIs) registering securities issued in business-combination transactions: statutory mergers, consolidations, exchange offers, and Rule 145 transactions. One record in the dataset is one EDGAR submission, identified by accession number and delivered as a folder containing the byte-faithful original documents (main registration statement, exhibits, XBRL data files) plus a generated metadata.json sidecar that restates the EDGAR submission header in structured form. Form F-4/A records are amendments (staff-comment responses, refreshed financials, revised exchange ratios, and pre- or post-effective amendments) and are included alongside initial F-4s, so the dataset captures the complete amendment chain of every cross-border business-combination registration filed on Form F-4. The dataset is distributed as ZIP containers with file types TXT, JSON, HTML, and PDF.

Update Frequency: Daily
Updated at: 2026-05-15
Earliest Sample Date: 1994-10-01
Total Size: 1.9 GB
Total Records: 33,881
Container Format: ZIP
Content Types: TXT, JSON, HTML, PDF
Form Types: F-4, F-4/A

Dataset APIs

Programmatically retrieve the full list of dataset archive files, download URLs and dataset metadata.

Dataset Index JSON API

Download the entire dataset as a single archive file.

Download Entire Dataset:

Download a single container file (e.g. monthly archive) from the dataset.

Download Single Container:

Dataset Files

354 files · 1.9 GB
2026-05.zip · 4.0 MB · 31 records
2026-04.zip · 6.8 MB · 72 records
2026-03.zip · 9.2 MB · 76 records
2026-02.zip · 10.2 MB · 117 records
2026-01.zip · 7.2 MB · 45 records
2025-12.zip · 5.4 MB · 70 records
2025-11.zip · 5.0 MB · 70 records
2025-10.zip · 6.5 MB · 103 records
2025-09.zip · 17.7 MB · 131 records
2025-08.zip · 7.4 MB · 84 records
2025-07.zip · 17.9 MB · 168 records
2025-06.zip · 20.5 MB · 203 records
2025-05.zip · 14.0 MB · 189 records
2025-04.zip · 16.1 MB · 182 records
2025-03.zip · 30.3 MB · 280 records
2025-02.zip · 17.0 MB · 199 records
2025-01.zip · 19.3 MB · 175 records
2024-12.zip · 24.7 MB · 267 records
2024-11.zip · 20.0 MB · 174 records
2024-10.zip · 19.5 MB · 139 records
2024-09.zip · 16.8 MB · 154 records
2024-08.zip · 19.8 MB · 191 records
2024-07.zip · 10.1 MB · 108 records
2024-06.zip · 11.2 MB · 109 records
2024-05.zip · 11.2 MB · 127 records
2024-04.zip · 5.3 MB · 48 records
2024-03.zip · 26.4 MB · 309 records
2024-02.zip · 17.9 MB · 153 records
2024-01.zip · 23.9 MB · 191 records
2023-12.zip · 24.5 MB · 254 records
2023-11.zip · 24.2 MB · 228 records
2023-10.zip · 17.6 MB · 326 records
2023-09.zip · 25.8 MB · 300 records
2023-08.zip · 26.3 MB · 290 records
2023-07.zip · 17.7 MB · 160 records
2023-06.zip · 31.4 MB · 323 records
2023-05.zip · 26.2 MB · 204 records
2023-04.zip · 10.8 MB · 178 records
2023-03.zip · 22.6 MB · 259 records
2023-02.zip · 16.3 MB · 127 records
2023-01.zip · 15.2 MB · 154 records
2022-12.zip · 27.8 MB · 356 records
2022-11.zip · 12.8 MB · 130 records
2022-10.zip · 13.9 MB · 148 records
2022-09.zip · 20.5 MB · 274 records
2022-08.zip · 15.6 MB · 201 records
2022-07.zip · 13.6 MB · 95 records
2022-06.zip · 12.9 MB · 100 records
2022-05.zip · 19.3 MB · 157 records
2022-04.zip · 11.0 MB · 107 records
2022-03.zip · 18.4 MB · 179 records
2022-02.zip · 14.5 MB · 166 records
2022-01.zip · 14.9 MB · 196 records
2021-12.zip · 17.3 MB · 245 records
2021-11.zip · 32.0 MB · 300 records
2021-10.zip · 13.4 MB · 163 records
2021-09.zip · 19.2 MB · 266 records
2021-08.zip · 19.7 MB · 212 records
2021-07.zip · 20.4 MB · 209 records
2021-06.zip · 17.5 MB · 239 records
2021-05.zip · 15.6 MB · 200 records
2021-04.zip · 8.1 MB · 112 records
2021-03.zip · 9.9 MB · 159 records
2021-02.zip · 8.7 MB · 118 records
2021-01.zip · 5.8 MB · 104 records
2020-12.zip · 6.5 MB · 69 records
2020-11.zip · 5.9 MB · 79 records
2020-10.zip · 6.7 MB · 92 records
2020-09.zip · 4.3 MB · 74 records
2020-08.zip · 2.1 MB · 71 records
2020-07.zip · 4.5 MB · 64 records
2020-06.zip · 2.6 MB · 40 records
2020-05.zip · 2.0 MB · 15 records
2020-04.zip · 1.4 MB · 27 records
2020-03.zip · 2.7 MB · 45 records
2020-02.zip · 5.4 MB · 62 records
2020-01.zip · 6.6 MB · 95 records
2019-12.zip · 2.5 MB · 28 records
2019-11.zip · 6.4 MB · 56 records
2019-10.zip · 1.2 MB · 29 records
2019-09.zip · 2.9 MB · 84 records
2019-08.zip · 1.7 MB · 19 records
2019-07.zip · 647.4 KB · 15 records
2019-06.zip · 1.9 MB · 46 records
2019-05.zip · 1.3 MB · 22 records
2019-04.zip · 2.4 MB · 37 records
2019-03.zip · 2.3 MB · 21 records
2019-02.zip · 2.8 MB · 41 records
2019-01.zip · 2.5 MB · 53 records
2018-12.zip · 3.1 MB · 44 records
2018-11.zip · 5.3 MB · 100 records
2018-10.zip · 1.5 MB · 31 records
2018-09.zip · 2.9 MB · 82 records
2018-08.zip · 3.6 MB · 60 records
2018-07.zip · 953.4 KB · 35 records
2018-06.zip · 2.2 MB · 67 records
2018-05.zip · 1.8 MB · 33 records
2018-04.zip · 3.6 MB · 81 records
2018-03.zip · 1.1 MB · 14 records
2018-02.zip · 1.8 MB · 55 records
2018-01.zip · 5.1 MB · 67 records
2017-12.zip · 3.7 MB · 46 records
2017-11.zip · 2.4 MB · 35 records
2017-10.zip · 1.5 MB · 46 records
2017-09.zip · 4.9 MB · 92 records
2017-08.zip · 2.8 MB · 48 records
2017-07.zip · 1.6 MB · 26 records
2017-06.zip · 4.5 MB · 68 records
2017-05.zip · 4.9 MB · 77 records
2017-04.zip · 3.3 MB · 66 records
2017-03.zip · 1.8 MB · 39 records
2017-02.zip · 1.5 MB · 27 records
2017-01.zip · 833.2 KB · 20 records
2016-12.zip · 569.0 KB · 7 records
2016-11.zip · 7.6 MB · 174 records
2016-10.zip · 2.7 MB · 82 records
2016-09.zip · 7.6 MB · 94 records
2016-08.zip · 6.0 MB · 102 records
2016-07.zip · 1.4 MB · 24 records
2016-06.zip · 1.1 MB · 25 records
2016-05.zip · 5.9 MB · 125 records
2016-04.zip · 5.6 MB · 89 records
2016-03.zip · 4.0 MB · 75 records
2016-02.zip · 4.0 MB · 90 records
2016-01.zip · 1.9 MB · 52 records
2015-12.zip · 3.2 MB · 44 records
2015-11.zip · 1.8 MB · 44 records
2015-10.zip · 6.0 MB · 91 records
2015-09.zip · 4.8 MB · 67 records
2015-08.zip · 5.2 MB · 105 records
2015-07.zip · 2.0 MB · 45 records
2015-06.zip · 1.4 MB · 43 records
2015-05.zip · 3.5 MB · 91 records
2015-04.zip · 2.2 MB · 83 records
2015-03.zip · 3.3 MB · 46 records
2015-02.zip · 2.3 MB · 23 records
2015-01.zip · 2.5 MB · 20 records
2014-12.zip · 4.0 MB · 26 records
2014-11.zip · 4.0 MB · 45 records
2014-10.zip · 4.6 MB · 111 records
2014-09.zip · 2.2 MB · 90 records
2014-08.zip · 482.6 KB · 18 records
2014-07.zip · 2.0 MB · 65 records
2014-06.zip · 698.2 KB · 9 records
2014-05.zip · 1.0 MB · 22 records
2014-04.zip · 662.5 KB · 26 records
2014-03.zip · 1.5 MB · 35 records
2014-02.zip · 2.4 MB · 23 records
2014-01.zip · 1.7 MB · 22 records
2013-12.zip · 1.2 MB · 20 records
2013-11.zip · 1.8 MB · 42 records
2013-08.zip · 4.7 MB · 50 records
2013-07.zip · 2.5 MB · 48 records
2013-06.zip · 3.1 MB · 68 records
2013-05.zip · 2.5 MB · 60 records
2013-04.zip · 1.0 MB · 41 records
2013-03.zip · 1.1 MB · 16 records
2013-02.zip · 1.3 MB · 32 records
2013-01.zip · 584.4 KB · 19 records
2012-12.zip · 6.8 MB · 198 records
2012-11.zip · 500.5 KB · 8 records
2012-10.zip · 481.8 KB · 6 records
2012-09.zip · 960.8 KB · 25 records
2012-08.zip · 2.2 MB · 28 records
2012-07.zip · 7.9 MB · 147 records
2012-06.zip · 8.9 MB · 161 records
2012-05.zip · 8.7 MB · 89 records
2012-04.zip · 7.1 MB · 114 records
2012-03.zip · 3.6 MB · 80 records
2012-02.zip · 15.1 MB · 1,016 records
2012-01.zip · 2.8 MB · 95 records
2011-11.zip · 9.2 MB · 377 records
2011-10.zip · 4.2 MB · 113 records
2011-09.zip · 4.7 MB · 109 records
2011-08.zip · 3.7 MB · 118 records
2011-07.zip · 2.2 MB · 90 records
2011-06.zip · 4.4 MB · 255 records
2011-05.zip · 4.9 MB · 128 records
2011-04.zip · 5.6 MB · 156 records
2011-03.zip · 2.2 MB · 62 records
2011-02.zip · 4.8 MB · 116 records
2011-01.zip · 4.6 MB · 142 records
2010-12.zip · 3.6 MB · 112 records
2010-11.zip · 4.2 MB · 71 records
2010-10.zip · 7.2 MB · 117 records
2010-09.zip · 2.3 MB · 25 records
2010-08.zip · 3.6 MB · 53 records
2010-07.zip · 1.1 MB · 81 records
2010-06.zip · 1.2 MB · 21 records
2010-05.zip · 3.5 MB · 86 records
2010-04.zip · 2.7 MB · 104 records
2010-03.zip · 3.9 MB · 84 records
2010-02.zip · 5.5 MB · 144 records
2010-01.zip · 2.8 MB · 52 records
2009-12.zip · 7.0 MB · 176 records
2009-11.zip · 5.7 MB · 90 records
2009-10.zip · 5.1 MB · 114 records
2009-09.zip · 2.0 MB · 45 records
2009-08.zip · 3.9 MB · 82 records
2009-07.zip · 5.5 MB · 146 records
2009-06.zip · 3.5 MB · 100 records
2009-05.zip · 1.8 MB · 42 records
2009-03.zip · 381.2 KB · 17 records
2009-02.zip · 979.7 KB · 22 records
2009-01.zip · 1.2 MB · 18 records
2008-12.zip · 782.1 KB · 17 records
2008-11.zip · 1.3 MB · 26 records
2008-10.zip · 855.6 KB · 20 records
2008-09.zip · 1.7 MB · 75 records
2008-08.zip · 341.1 KB · 7 records
2008-07.zip · 5.1 MB · 131 records
2008-06.zip · 4.9 MB · 91 records
2008-05.zip · 2.9 MB · 50 records
2008-04.zip · 1.9 MB · 97 records
2008-03.zip · 2.3 MB · 60 records
2008-02.zip · 7.8 MB · 153 records
2008-01.zip · 2.5 MB · 59 records
2007-12.zip · 4.2 MB · 76 records
2007-11.zip · 5.8 MB · 102 records
2007-10.zip · 3.7 MB · 90 records
2007-09.zip · 2.0 MB · 62 records
2007-08.zip · 3.5 MB · 137 records
2007-07.zip · 5.2 MB · 122 records
2007-06.zip · 1.7 MB · 59 records
2007-05.zip · 2.7 MB · 70 records
2007-04.zip · 3.3 MB · 120 records
2007-03.zip · 2.3 MB · 107 records
2007-02.zip · 134.8 KB · 4 records
2007-01.zip · 2.3 MB · 84 records
2006-12.zip · 675.6 KB · 23 records
2006-11.zip · 2.0 MB · 59 records
2006-10.zip · 637.2 KB · 19 records
2006-09.zip · 8.6 MB · 226 records
2006-08.zip · 2.0 MB · 45 records
2006-07.zip · 1.4 MB · 22 records
2006-06.zip · 3.9 MB · 129 records
2006-05.zip · 3.3 MB · 77 records
2006-04.zip · 1.0 MB · 16 records
2006-03.zip · 2.6 MB · 29 records
2006-02.zip · 2.4 MB · 50 records
2006-01.zip · 2.2 MB · 37 records
2005-12.zip · 4.5 MB · 71 records
2005-11.zip · 1.8 MB · 51 records
2005-10.zip · 3.5 MB · 109 records
2005-09.zip · 7.8 MB · 235 records
2005-08.zip · 2.8 MB · 51 records
2005-07.zip · 5.4 MB · 207 records
2005-06.zip · 5.4 MB · 193 records
2005-05.zip · 5.5 MB · 142 records
2005-04.zip · 4.4 MB · 138 records
2005-03.zip · 6.8 MB · 150 records
2005-02.zip · 5.2 MB · 138 records
2005-01.zip · 5.4 MB · 174 records
2004-12.zip · 6.5 MB · 176 records
2004-11.zip · 4.4 MB · 123 records
2004-10.zip · 1.7 MB · 64 records
2004-09.zip · 7.0 MB · 275 records
2004-08.zip · 3.3 MB · 102 records
2004-07.zip · 9.5 MB · 248 records
2004-06.zip · 9.2 MB · 321 records
2004-05.zip · 7.9 MB · 219 records
2004-04.zip · 6.7 MB · 157 records
2004-03.zip · 8.0 MB · 188 records
2004-02.zip · 3.9 MB · 141 records
2004-01.zip · 3.6 MB · 88 records
2003-12.zip · 7.2 MB · 281 records
2003-11.zip · 3.4 MB · 160 records
2003-10.zip · 4.4 MB · 139 records
2003-09.zip · 3.1 MB · 151 records
2003-08.zip · 5.5 MB · 213 records
2003-07.zip · 9.8 MB · 326 records
2003-06.zip · 5.2 MB · 195 records
2003-05.zip · 3.1 MB · 96 records
2003-04.zip · 1.6 MB · 49 records
2003-03.zip · 3.1 MB · 74 records
2003-02.zip · 5.9 MB · 173 records
2003-01.zip · 2.6 MB · 94 records
2002-12.zip · 4.8 MB · 132 records
2002-11.zip · 2.5 MB · 56 records
2002-10.zip · 2.8 MB · 64 records
2002-09.zip · 667.7 KB · 30 records
2002-08.zip · 2.2 MB · 61 records
2002-07.zip · 2.1 MB · 95 records
2002-06.zip · 1.1 MB · 31 records
2002-05.zip · 1.2 MB · 48 records
2002-04.zip · 540.9 KB · 29 records
2002-03.zip · 1.9 MB · 103 records
2002-02.zip · 2.5 MB · 80 records
2002-01.zip · 465.6 KB · 26 records
2001-12.zip · 22 B · 0 records
2001-11.zip · 1.2 MB · 38 records
2001-10.zip · 822.3 KB · 28 records
2001-09.zip · 478.5 KB · 28 records
2001-08.zip · 5.1 MB · 149 records
2001-07.zip · 3.8 MB · 115 records
2001-06.zip · 693.0 KB · 38 records
2001-05.zip · 1.0 MB · 33 records
2001-04.zip · 674.9 KB · 28 records
2001-03.zip · 1.4 MB · 38 records
2001-02.zip · 2.1 MB · 72 records
2001-01.zip · 742.1 KB · 19 records
2000-12.zip · 3.7 MB · 105 records
2000-11.zip · 2.4 MB · 84 records
2000-10.zip · 2.8 MB · 110 records
2000-09.zip · 3.8 MB · 79 records
2000-08.zip · 4.5 MB · 139 records
2000-07.zip · 1.7 MB · 54 records
2000-06.zip · 724.7 KB · 31 records
2000-05.zip · 1.1 MB · 51 records
2000-04.zip · 204.1 KB · 7 records
2000-03.zip · 1.7 MB · 53 records
2000-02.zip · 3.0 MB · 91 records
2000-01.zip · 1.6 MB · 56 records
1999-12.zip · 1.8 MB · 133 records
1999-11.zip · 916.0 KB · 33 records
1999-10.zip · 1.8 MB · 71 records
1999-09.zip · 569.2 KB · 27 records
1999-08.zip · 994.9 KB · 54 records
1999-07.zip · 528.2 KB · 6 records
1999-06.zip · 637.0 KB · 18 records
1999-05.zip · 1.7 MB · 43 records
1999-04.zip · 2.4 MB · 88 records
1999-03.zip · 2.3 MB · 78 records
1999-02.zip · 538.1 KB · 26 records
1999-01.zip · 601.5 KB · 28 records
1998-12.zip · 761.5 KB · 9 records
1998-11.zip · 1.6 MB · 52 records
1998-10.zip · 570.8 KB · 13 records
1998-09.zip · 1.9 MB · 106 records
1998-08.zip · 495.3 KB · 22 records
1998-07.zip · 1.1 MB · 39 records
1998-06.zip · 1.6 MB · 71 records
1998-05.zip · 1.1 MB · 38 records
1998-04.zip · 1.2 MB · 47 records
1998-03.zip · 652.6 KB · 35 records
1997-12.zip · 754.6 KB · 54 records
1997-11.zip · 693.7 KB · 8 records
1997-10.zip · 1.7 MB · 91 records
1997-09.zip · 1.8 MB · 78 records
1997-08.zip · 436.3 KB · 8 records
1997-07.zip · 762.4 KB · 32 records
1997-06.zip · 132.6 KB · 1 record
1997-05.zip · 274.8 KB · 3 records
1997-04.zip · 1.6 MB · 59 records
1997-03.zip · 1.1 MB · 60 records
1997-02.zip · 983.6 KB · 32 records
1996-12.zip · 360.5 KB · 8 records
1996-11.zip · 1.1 MB · 35 records
1996-10.zip · 395.6 KB · 12 records
1996-08.zip · 1.3 MB · 54 records
1996-07.zip · 6.4 KB · 1 record
1996-06.zip · 53.8 KB · 6 records
1996-05.zip · 164.8 KB · 10 records
1994-10.zip · 291.7 KB · 17 records

What This Dataset Contains

The dataset contains every Form F-4 and Form F-4/A registration-statement submission filed on EDGAR from October 1994 forward. Each record is a single accession-numbered submission rather than a single document: the registration statement as filed on EDGAR is packaged as one folder per accession number, holding byte-faithful documents and a structured metadata sidecar. The folder name is the eighteen-digit accession number with the dashes stripped; for example, 0001683168-25-008354 becomes 000168316825008354.
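The folder-naming rule can be sketched as a pair of small helpers. This is a minimal illustration, assuming the usual 10-2-6 digit grouping of a hyphenated accession number; the function names are hypothetical and not part of the dataset.

```python
import re

# Hyphenated accession numbers are grouped as 10 digits, 2 digits, 6 digits.
ACCESSION_RE = re.compile(r"^\d{10}-\d{2}-\d{6}$")


def accession_to_folder(accession_no: str) -> str:
    """Return the 18-digit folder name for a hyphenated accession number."""
    if not ACCESSION_RE.match(accession_no):
        raise ValueError(f"not a hyphenated accession number: {accession_no!r}")
    return accession_no.replace("-", "")


def folder_to_accession(folder: str) -> str:
    """Re-insert the dashes into an 18-digit folder name."""
    if not re.fullmatch(r"\d{18}", folder):
        raise ValueError(f"not an 18-digit folder name: {folder!r}")
    return f"{folder[:10]}-{folder[10:12]}-{folder[12:]}"


print(accession_to_folder("0001683168-25-008354"))
print(folder_to_accession("000168316825008354"))
```

The round trip is lossless, so either form can serve as a join key between the folder name and metadata.json.accessionNo.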

Form F-4 itself is a hybrid registration statement and prospectus. Part I is the prospectus delivered to security holders of the company being acquired; Part II contains supplementary information, undertakings, signatures, and the exhibit list filed with the SEC but not delivered. Because F-4 governs cross-border deals involving foreign issuers, the financial statements in its prospectus are either prepared under or reconciled to U.S. GAAP, or presented under IFRS as issued by the IASB without reconciliation, depending on the issuer's accounting framework under Items 17 and 18 of Form 20-F. Multi-issuer business combinations frequently produce co-registrant filings under a single accession number: a master file number such as 333-289108 combined with suffixed file numbers 333-289108-01, 333-289108-02 for each co-registrant.

The dataset stores every original document type except graphics. Image files (GRAPHIC documents — JPGs, GIFs, PNGs embedded as logos, signature images, or chart images) are excluded from the bundle but remain referenced by URL inside metadata.json and by inline <IMG SRC="…"> tags inside the HTML. The file types found in the dataset are TXT, JSON, HTML, and PDF, distributed in ZIP containers.

Content Structure of a Single Record

On-disk layout of one record

Each accession-number folder contains:

  • metadata.json — the structured index of the submission, described in detail below.
  • One main registration-statement document — typed F-4 or F-4/A in the EDGAR submission header. In the modern era this is an inline-XBRL-tagged XHTML file, often several megabytes, containing the full prospectus and Part II content.
  • Exhibit documents — one HTML file per exhibit, numbered under Form F-4 Item 21 / Regulation S-K Item 601: EX-2.x, EX-3.x, EX-4.x, EX-5.x, EX-8.x, EX-10.x, EX-21.x, EX-23.x, EX-25.x, EX-99.x, and EX-FILING FEES. Most exhibits are wrapped in EDGAR's SGML <DOCUMENT> envelope around an inner HTML body; the EX-FILING FEES exhibit is itself iXBRL.
  • XBRL data files — for filings that ship XBRL alongside the inline-tagged main document, the schema (.xsd) and the calculation, definition, label, and presentation linkbases (_cal.xml, _def.xml, _lab.xml, _pre.xml), plus extracted instance documents (*_htm.xml).
  • No image files. Graphic documents listed in the EDGAR submission are stripped from the bundle. Their URLs and filenames remain visible in metadata.json and in the HTML body, but the bytes are absent locally.

HTML/HTM is the dominant format for the main statement and exhibits in the modern era; JSON is the metadata sidecar; TXT covers the legacy ASCII-era filings and the complete-submission text URL listed in metadata.json; PDF appears for occasional supplemental exhibits where issuers were permitted to file in that format. XML files (taxonomy linkbases and extracted XBRL instances) ride alongside as data files and are listed under metadata.json.dataFiles[].
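A minimal sketch of walking one monthly container and grouping its members by record. It assumes each archive member path contains the 18-digit accession folder as a directory component; the exact path layout inside the ZIP is an assumption, as is the function name.

```python
import io
import zipfile
from collections import defaultdict


def files_by_record(container) -> dict[str, list[str]]:
    """Group archive member file names by their accession-number folder.

    `container` may be a path or a binary file-like object.
    """
    records = defaultdict(list)
    with zipfile.ZipFile(container) as zf:
        for name in zf.namelist():
            if name.endswith("/"):  # skip directory entries
                continue
            parts = name.split("/")
            # Assumed layout: some parent component is the 18-digit folder.
            folder = next(
                (p for p in parts[:-1] if len(p) == 18 and p.isdigit()), None
            )
            if folder is not None:
                records[folder].append(parts[-1])
    return dict(records)


# Demo on an in-memory archive with placeholder file names.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr("000168316825008354/metadata.json", "{}")
    zf.writestr("000168316825008354/main.htm", "<html/>")
demo.seek(0)
print(files_by_record(demo))
```

Grouping by folder rather than by file keeps the record (one EDGAR submission) as the unit of work, which matches how the dataset defines a record.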

The metadata.json shape

The sidecar restates the EDGAR submission header in JSON and adds dataset-level identifiers. Its top-level keys are:

Key | Type | Role
formType | string | "F-4" or "F-4/A".
accessionNo | string | Hyphenated EDGAR accession number, e.g. "0001683168-25-008354".
linkToFilingDetails | string | URL to the primary document on sec.gov, often prefixed with https://www.sec.gov/ix?doc=… for iXBRL.
description | string | Standard EDGAR description, e.g. "Form F-4/A — Registration of securities, foreign private issuers, business combinations: [Amend]".
linkToTxt | string | URL to the complete-submission .txt file on EDGAR.
filedAt | string | ISO-8601 timestamp with offset, e.g. "2025-11-14T07:26:12-05:00".
documentFormatFiles | array | One entry per non-data document in the submission, including graphics that are not redistributed.
dataFiles | array | XBRL/data documents (taxonomy linkbases, instance). May be empty for filings that do not ship XBRL.
entities | array | One entry per filer; multi-filer business combinations produce two or three entries.
seriesAndClassesContractsInformation | array | Series-and-class contract information; typically empty for F-4.
linkToHtml | string | URL to the EDGAR filing-index page.
linkToXbrl | string | URL to a separate XBRL package; commonly empty when XBRL is inline in the main document.
id | string | 32-character internal record identifier.
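As a rough illustration of reading these keys, the sketch below summarizes one sidecar. The formType, accessionNo, and filedAt values are the examples quoted above; the two size values and the function name are made up for the demo. In practice the dictionary would come from json.load on a record's metadata.json.

```python
import json
from datetime import datetime

# Illustrative sidecar; sizes are invented placeholder values.
SAMPLE = json.loads("""
{
  "formType": "F-4/A",
  "accessionNo": "0001683168-25-008354",
  "filedAt": "2025-11-14T07:26:12-05:00",
  "documentFormatFiles": [
    {"sequence": "1", "size": "1200", "type": "F-4/A"},
    {"sequence": "2", "size": "300", "type": "EX-5.1"}
  ],
  "dataFiles": []
}
""")


def summarize_metadata(meta: dict) -> dict:
    """Pull a few typed values out of the string-heavy sidecar."""
    docs = meta.get("documentFormatFiles") or []
    # Sizes are encoded as strings; ignore anything that is not all digits.
    sizes = [str(d.get("size") or "").strip() for d in docs]
    return {
        "accession_no": meta["accessionNo"],
        "is_amendment": meta["formType"].endswith("/A"),
        "filed_at": datetime.fromisoformat(meta["filedAt"]),
        "document_count": len(docs),
        "total_bytes": sum(int(s) for s in sizes if s.isdigit()),
        "has_xbrl_data_files": bool(meta.get("dataFiles")),
    }


print(summarize_metadata(SAMPLE))
```

datetime.fromisoformat keeps the UTC offset, so timestamps from different daylight-saving periods still sort correctly.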

documentFormatFiles[] and dataFiles[] items

Each item carries:

  • sequence — EDGAR sequence number as a string ("1", "2", …). The complete-submission text-file row uses a single space " " for both sequence and type.
  • size — byte count of the original document, encoded as a string.
  • documentUrl — direct URL to the file on sec.gov.
  • description — free-form description from the submission header, often an all-caps phrase (e.g. "AGREEMENT AND PLAN OF MERGER AND REORGANIZATION", "OPINION OF OGIER", "FORM OF PROXY CARD"), truncated at roughly 80 characters.
  • type — the EDGAR document-type code: F-4, F-4/A, EX-2.x, EX-3.x, EX-4.x, EX-5.x, EX-8.x, EX-10.x, EX-21.x, EX-23.x, EX-25.x, EX-99.x, EX-FILING FEES, GRAPHIC, XML, EX-101.SCH/CAL/DEF/LAB/PRE.

A large fraction of documentFormatFiles[] rows carry type: "GRAPHIC". Their bytes are not redistributed in the bundle, but the URL remains valid for re-fetching from EDGAR.
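One way to work out which documents to expect in a record folder is to drop the GRAPHIC rows and take the file name from each documentUrl. The sketch below assumes local file names equal the URL basename, and it also skips the blank-type complete-submission row; both are assumptions rather than documented behavior. The URLs and function names are placeholders.

```python
import posixpath
from urllib.parse import parse_qs, urlparse


def url_filename(url: str) -> str:
    """Return the file name a sec.gov document URL points at."""
    parsed = urlparse(url)
    # Inline-XBRL viewer links carry the real path in the ?doc= parameter.
    path = parse_qs(parsed.query).get("doc", [parsed.path])[0]
    return posixpath.basename(path)


def expected_local_files(meta: dict) -> list[tuple[str, str]]:
    """List (type, filename) for rows whose bytes should be in the folder."""
    rows = []
    for row in meta.get("documentFormatFiles") or []:
        doc_type = (row.get("type") or "").strip()
        if not doc_type or doc_type == "GRAPHIC":
            continue  # blank type: complete-submission row; GRAPHIC: stripped
        rows.append((doc_type, url_filename(row["documentUrl"])))
    return rows


# Placeholder rows; the URLs are illustrative, not real filings.
BASE = "https://www.sec.gov/Archives/edgar/data/0/000168316825008354/"
SAMPLE_META = {
    "documentFormatFiles": [
        {"type": "F-4/A", "documentUrl": "https://www.sec.gov/ix?doc=/Archives/edgar/data/0/000168316825008354/main.htm"},
        {"type": "EX-5.1", "documentUrl": BASE + "ex5-1.htm"},
        {"type": "GRAPHIC", "documentUrl": BASE + "image_001.jpg"},
        {"type": " ", "documentUrl": BASE + "0001683168-25-008354.txt"},
    ]
}
print(expected_local_files(SAMPLE_META))
```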

entities[] items

One entry per filer (registrant, co-registrant, subject company). Each entity carries:

  • companyName — preserves the EDGAR (Filer) role suffix verbatim (e.g. "YHNA MS I Ltd (Filer)").
  • cik — numeric CIK as a string, no zero-padding.
  • irsNo — IRS employer identification number; frequently "000000000" for foreign private issuers without a U.S. EIN.
  • fileNo — SEC file number, including any co-registrant suffix ("333-289108", "333-289108-01", "333-289108-02").
  • filmNo — EDGAR film number assigned at acceptance.
  • type — repeats the form type ("F-4" / "F-4/A").
  • act — Securities Act under which filed; "33" for F-4.
  • sic — SIC code combined with its textual label (e.g. "7371 Services-Computer Programming Services", "2834 Pharmaceutical Preparations"). May be absent for some co-filers.
  • stateOfIncorporation — EDGAR state/country code (D8 British Virgin Islands, E9 Cayman Islands, V8 Switzerland, A1 British Columbia, Z4 Canada at the federal level, etc.), preserved as the raw code rather than the human-readable jurisdiction.
  • fiscalYearEnd — MMDD string (e.g. "1231", "0731").
  • tickers — optional array of trading symbols (e.g. ["VACH", "VACHU", "VACHW"]); often absent for unlisted private targets and shell registrants.
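Because most entity fields arrive as raw strings, a small normalization step is often useful. The following is a sketch under the field conventions listed above; the output key names and the sample CIK are hypothetical.

```python
import re

# Trailing role suffix such as " (Filer)" on companyName.
ROLE_SUFFIX_RE = re.compile(r"\s*\(([^()]+)\)\s*$")


def normalize_entity(entity: dict) -> dict:
    """Split combined string fields of one entities[] item into typed parts."""
    name = entity.get("companyName") or ""
    role_match = ROLE_SUFFIX_RE.search(name)
    # "7371 Services-Computer Programming Services" -> code + label
    sic_code, _, sic_label = (entity.get("sic") or "").strip().partition(" ")
    fye = (entity.get("fiscalYearEnd") or "").strip()
    fye_ok = len(fye) == 4 and fye.isdigit()
    irs_no = (entity.get("irsNo") or "").strip()
    return {
        "cik10": str(entity["cik"]).zfill(10),  # zero-padded, EDGAR URL style
        "name": name[: role_match.start()] if role_match else name,
        "role": role_match.group(1) if role_match else None,
        "sic_code": sic_code or None,
        "sic_label": sic_label.strip() or None,
        "fye_month": int(fye[:2]) if fye_ok else None,
        "fye_day": int(fye[2:]) if fye_ok else None,
        "has_us_ein": bool(irs_no) and set(irs_no) != {"0"},
        "tickers": list(entity.get("tickers") or []),
    }


# The CIK below is a made-up placeholder.
print(normalize_entity({
    "companyName": "YHNA MS I Ltd (Filer)",
    "cik": "1234567",
    "irsNo": "000000000",
    "sic": "7371 Services-Computer Programming Services",
    "fiscalYearEnd": "1231",
}))
```

Optional fields (sic, tickers) fall back to None or an empty list, since the list above notes they may be absent for some co-filers.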

The multi-filer pattern is central to F-4. A typical business combination registration produces two or three entities[] entries that share a master fileNo prefix (e.g. 333-289108) with sequential suffixes, while SIC, ticker, state of incorporation, and fiscal-year-end vary across co-registrants because the entities sit in different industries and jurisdictions.

Section-by-section anatomy of the underlying F-4 filing

Filing header and registrant identification

The submission opens with EDGAR's SGML header — accession number, submission type, period of report (where applicable), public document count, filer blocks, and exhibit list. The dataset's metadata.json is the structured projection of this header.

Cover page of the registration statement

The first pages of the main document carry the Form F-4 cover page: the exact name of the registrant as specified in its charter and (where applicable) the English translation; the state or other jurisdiction of incorporation; the IRS employer identification number (often 00-0000000 for foreign issuers); the primary standard industrial classification code; the principal executive office and agent for service of process in the United States; the file number(s) assigned to the registration; the title of each class of securities being registered; the proposed maximum aggregate offering price; for pre-2022 filings, the calculation of the registration fee directly on the cover; and the box-check disclosures (delaying amendment, large accelerated filer status, emerging-growth-company status, and the foreign-private-issuer accommodations).

Part I — The prospectus

Part I, the prospectus delivered to security holders, conventionally includes:

  • Letter to shareholders of the company being acquired, summarizing the transaction.
  • Notice of meeting at which the transaction will be voted on, when applicable.
  • Questions and answers about the proposed transaction.
  • Summary describing the parties, transaction structure, consideration to be received, vote required, conditions, termination rights, and accounting treatment.
  • Risk factors specific to the combined company, the foreign-issuer jurisdiction, the consideration mix, currency exposure, integration risk, and (where relevant) PRC variable-interest-entity, sanctions, or home-country-regulator considerations.
  • Selected historical financial data for the registrant and the target.
  • Unaudited pro forma condensed combined financial information giving effect to the business combination.
  • Comparative per-share data.
  • Information about the parties — business descriptions, properties, legal proceedings, and management's discussion and analysis for both the registrant and the target.
  • Description of the transaction — background, recommendation of the board, opinion of the financial advisor, regulatory approvals, accounting treatment, material U.S. federal income tax consequences, material foreign tax consequences, and dissenters' or appraisal rights.
  • The merger agreement — narrative summary of representations, warranties, covenants, conditions to closing, termination rights, and termination fees.
  • Description of the registrant's securities to be issued as consideration.
  • Comparison of rights of security holders before and after the transaction (a defining feature of F-4, because rights frequently move from one corporate-law regime to another, e.g. Delaware to Cayman or BVI).
  • Audited financial statements of the registrant and the target — reconciled to U.S. GAAP or, where the registrant is an IFRS filer, presented under IFRS as issued by the IASB without reconciliation.
  • Notes to the financial statements, including segment data, share-based compensation, and tax disclosures.
  • Experts and legal matters.
  • Where you can find more information and incorporation-by-reference language.

Part II — Information not required in the prospectus

Part II contains:

  • Item 20 — Indemnification of directors and officers.
  • Item 21 — Exhibits and financial statement schedules, listing every exhibit by number and description.
  • Item 22 — Undertakings, including the Rule 145(c)/(d) undertakings specific to F-4.
  • Signatures — signed by the registrant, the principal executive officer, the principal financial officer, the principal accounting officer, a majority of the board, and the authorized representative in the United States, with the form of attestation prescribed by the Securities Act.

Exhibits (Item 21)

Exhibits follow Form F-4 Item 21 / Regulation S-K Item 601 numbering and appear in the folder as separate .htm files:

  • EX-2.x — Plan of acquisition, reorganization, arrangement, liquidation, or succession. The merger agreement, scheme of arrangement, or business combination agreement governing the transaction, often filed in redacted form with schedules omitted under Item 601(b)(2)(ii).
  • EX-3.x — Articles of incorporation, bylaws, articles of merger. The constituent documents of the registrant and, where relevant, of the surviving entity post-combination.
  • EX-4.x — Instruments defining the rights of security holders. Share-certificate specimens, indentures, supplemental indentures, warrant agreements; debt-heavy filings can carry many EX-4 exhibits (one supplemental indenture per outstanding note series).
  • EX-5.x — Opinion regarding legality. Counsel's validity opinion on the securities being registered, frequently issued by offshore counsel (Conyers Dill & Pearman, Ogier, Maples and Calder) for Cayman, Bermuda, or BVI registrants and by U.S. counsel for Delaware sub-issuers.
  • EX-8.x — Tax opinions. Material U.S. federal income-tax opinions on the qualification of the transaction (for example, as a reorganization under Section 368) and, where relevant, foreign-tax opinions.
  • EX-10.x — Material contracts. Standby equity purchase agreements, registration-rights agreements, employment agreements, lock-up agreements, escrow agreements, voting agreements, and similar instruments.
  • EX-21.x — Subsidiaries of the registrant. A list of significant subsidiaries with jurisdiction of incorporation.
  • EX-23.x — Consents of independent registered public accounting firms. One consent per audit firm whose opinion appears in the prospectus, covering both registrant and target audits — frequently multiple consents because the registrant and target use different auditors.
  • EX-25.x — Form T-1 statement of eligibility of the trustee. Filed only when debt securities are being registered.
  • EX-99.x — Additional exhibits. Forms of proxy card, press releases, consents of named experts (e.g. board nominees), fairness opinions, and supplemental disclosure documents.
  • EX-FILING FEES (EX-107). The structured fee-rate calculation under the SEC's Filing Fee Disclosure (FFD) taxonomy at xbrl.sec.gov/ffd/…, declaring ffd:SubmissnTp, ffd:FeeExhibitTp, ffd:RegnFileNb, ffd:OfferingTableNa, and the offering-table line items.

The exhibit set scales with deal complexity. SPAC and small-cap business combinations commonly ship only the main statement plus EX-5.1, one or more EX-23 consents, and EX-99.1 (form of proxy card). Operating-issuer combinations involving registered debt ship the full slate, including extensive EX-4 indenture exhibits and EX-25.1 trustee eligibility.

XBRL data files

For filings that ship XBRL, the folder additionally contains the taxonomy schema (.xsd), the calculation linkbase (_cal.xml), the definition linkbase (_def.xml), the label linkbase (_lab.xml), the presentation linkbase (_pre.xml), and the extracted instance documents (*_htm.xml). These are listed under metadata.json.dataFiles[] rather than documentFormatFiles[], which keeps narrative documents and structured-data documents on separate axes of the metadata.
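A simple way to tell these data files apart is by file-name suffix. The sketch below uses only the suffixes named above; the category labels, function name, and sample file names are hypothetical.

```python
# Suffix -> role, following the naming conventions listed above.
LINKBASE_SUFFIXES = {
    "_cal.xml": "calculation linkbase",
    "_def.xml": "definition linkbase",
    "_lab.xml": "label linkbase",
    "_pre.xml": "presentation linkbase",
}


def classify_data_file(filename: str) -> str:
    """Classify one dataFiles[] entry by its file-name suffix."""
    name = filename.lower()
    for suffix, kind in LINKBASE_SUFFIXES.items():
        if name.endswith(suffix):
            return kind
    if name.endswith("_htm.xml"):
        return "extracted instance"
    if name.endswith(".xsd"):
        return "taxonomy schema"
    if name.endswith(".xml"):
        return "other xml"
    return "not a data file"


# Placeholder file names for illustration.
for example in ("abc-20250630.xsd", "abc-20250630_lab.xml", "main_htm.xml"):
    print(example, "->", classify_data_file(example))
```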

Document encoding — iXBRL and SGML coexistence

Two distinct on-disk formats coexist within the same modern filing folder:

  1. Pure inline-XBRL XHTML. The main F-4 / F-4/A document and the EX-FILING FEES exhibit begin with an XML prolog (<?xml version='1.0' encoding='ASCII'?>) and an <html> root carrying the XBRL namespace family — ix (inline XBRL 2013), dei (xbrl.sec.gov/dei/…), us-gaap (fasb.org/us-gaap/…), srt, iso4217, an issuer-specific extension taxonomy, and, for fee exhibits, the ffd namespace. The body interleaves the prospectus narrative with <ix:nonNumeric>, <ix:nonFraction>, <ix:header>, and <ix:hidden> tags binding concepts such as dei:AmendmentFlag, dei:EntityCentralIndexKey, us-gaap:CommitmentsAndContingencies, and us-gaap:StockholdersEquity to context references that name reporting periods (e.g. From2025-04-29to2025-06-30). HTML tags are lowercase XHTML.

  2. SGML <DOCUMENT> wrapper around HTML. Most other exhibits (EX-5.x, EX-10.x, EX-23.x, EX-99.x) are stored in EDGAR's submission-file format: a header block of pseudo-tags — <TYPE>, <SEQUENCE>, <FILENAME>, <DESCRIPTION> — terminating with <TEXT>, followed by the body HTML, then </TEXT></DOCUMENT>. The header pseudo-tags are unclosed (this is SGML, not XML). The inner HTML uses uppercase tags (<HTML>, <BODY>, <P>, <TABLE>, <TR>, <TD>) and frequently contains <IMG SRC="image_NNN.jpg"> references that resolve on sec.gov but are absent locally because graphics are stripped from the bundle.
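The SGML wrapper described in item 2 can be unpacked with a few lines of string handling. This is a minimal sketch, assuming one <DOCUMENT> block per file and header pseudo-tags that each sit on their own line; the sample file name and body are placeholders.

```python
import re

# Unclosed header pseudo-tags, one per line, before <TEXT>.
HEADER_TAG_RE = re.compile(
    r"^<(TYPE|SEQUENCE|FILENAME|DESCRIPTION)>(.*)$", re.MULTILINE
)


def parse_sgml_document(raw: str) -> tuple[dict, str]:
    """Split an SGML-wrapped exhibit into (header dict, inner body)."""
    head, sep, rest = raw.partition("<TEXT>")
    if not sep:
        raise ValueError("no <TEXT> marker found")
    header = {tag.lower(): value.strip() for tag, value in HEADER_TAG_RE.findall(head)}
    # Drop the closing </TEXT></DOCUMENT> trailer if present.
    body = rest.rsplit("</TEXT>", 1)[0].strip()
    return header, body


SAMPLE = """<DOCUMENT>
<TYPE>EX-5.1
<SEQUENCE>2
<FILENAME>ex5-1.htm
<DESCRIPTION>OPINION OF OGIER
<TEXT>
<HTML><BODY><P>Opinion text</P></BODY></HTML>
</TEXT>
</DOCUMENT>
"""

header, body = parse_sgml_document(SAMPLE)
print(header)
print(body)
```

The header is handled with line-based matching rather than an XML parser because the pseudo-tags are never closed.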

EX-FILING FEES exhibits are commonly produced by the Novaworks Fee Exhibit Editor and carry editor-version metadata in HTML comments such as <!-- Field: Set; Name: Platform; Value: Novaworks Fee Exhibit Editor --> together with an MD5 of the source. Parsers should detect format per file (XML prolog vs. SGML <DOCUMENT> opener) rather than per record, because both dialects coexist inside the same accession-number folder.
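Per-file detection can be sketched by sniffing the first bytes of each file. The return labels and function name are hypothetical. Note that an XML prolog alone does not prove a file is iXBRL, since linkbase and instance files also start with one.

```python
def detect_format(head: bytes) -> str:
    """Classify a file from its first bytes.

    Returns "xml-prolog", "sgml-document", or "other".
    """
    # Tolerate a UTF-8 byte-order mark and leading whitespace.
    text = head.lstrip(b"\xef\xbb\xbf").lstrip()
    if text[:5].lower() == b"<?xml":
        return "xml-prolog"
    if text[:10].upper() == b"<DOCUMENT>":
        return "sgml-document"
    return "other"


print(detect_format(b"<?xml version='1.0' encoding='ASCII'?>\n<html>"))
print(detect_format(b"<DOCUMENT>\n<TYPE>EX-5.1\n"))
```

In practice the argument would be something like the first kilobyte read from each file in the folder, so the check stays cheap even for multi-megabyte main documents.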

What the dataset includes and excludes

Included in each record:

  • The full byte content of the main F-4 / F-4/A registration statement.
  • The full byte content of every exhibit document filed under the accession number.
  • The full byte content of every XBRL taxonomy file and extracted instance.
  • A metadata.json sidecar that re-states the EDGAR header, lists every original document (including those not redistributed), and exposes the structured filer/issuer fields.

Excluded from each record:

  • Image files (the GRAPHIC document type — JPG, GIF, PNG). The metadata still references them by URL, and inline <IMG> tags inside the HTML still point to the original filenames, so they can be retrieved from EDGAR if needed but are not present locally.
  • Pre-filing correspondence and SEC staff comment letters, which EDGAR files under separate accession numbers (CORRESP, UPLOAD) and which are not part of the F-4 submission itself.
  • Documents incorporated by reference to prior 20-F, 6-K, F-1, or similar filings — the prospectus carries the incorporation by reference language, but the incorporated documents are not duplicated into this record. They live under their own accession numbers in their own filings.
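To see which stripped graphics a document still refers to, the inline <IMG> references can be collected with the standard-library HTML parser. This is a sketch; the class and function names are hypothetical, and the comparison against local files assumes images are referenced by bare file name.

```python
import posixpath
from html.parser import HTMLParser


class ImgCollector(HTMLParser):
    """Collect the src attribute of every <img> tag (case-insensitive)."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "img":
            for key, value in attrs:
                if key == "src" and value:
                    self.sources.append(value)


def referenced_images(html_text: str) -> list[str]:
    collector = ImgCollector()
    collector.feed(html_text)
    collector.close()
    return collector.sources


def missing_images(html_text: str, present_files: set[str]) -> list[str]:
    """Return referenced images whose file name is not present locally."""
    return [
        src for src in referenced_images(html_text)
        if posixpath.basename(src) not in present_files
    ]


HTML = '<P><IMG SRC="image_001.jpg"></P><img src="image_002.jpg"/>'
print(missing_images(HTML, {"metadata.json", "ex5-1.htm"}))
```

The resulting names can then be matched against the GRAPHIC rows in metadata.json to obtain the URLs for re-fetching from EDGAR.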

Changes in required content over time

Form F-4's required content has accumulated several material layers since its introduction:

  • Original adoption (effective 1985, paralleling S-4 conventions). The form was created to give foreign private issuers an equivalent to S-4 for business-combination registrations, with reconciliation of foreign-GAAP financials to U.S. GAAP under Item 17 / Item 18 of Form 20-F.
  • Elimination of U.S. GAAP reconciliation for IFRS filers (2007). SEC Release No. 33-8879 eliminated the U.S. GAAP reconciliation requirement for foreign private issuers that prepare financial statements under IFRS as issued by the IASB. From this point, F-4 prospectuses fall into two financial-presentation regimes — reconciled-to-U.S.-GAAP and IFRS-as-issued-by-IASB — and both regimes appear in the dataset depending on filing date and issuer election.
  • Risk-factor and executive-compensation modernization. Successive amendments to Regulation S-K (Items 105, 402, 407) flowed into F-4 prospectus requirements, expanding risk-factor specificity, compensation discussion and analysis, and corporate-governance disclosure for the registrant.
  • JOBS Act accommodations (2012). Emerging-growth-company elections and reduced executive-compensation disclosure became available, surfacing on the cover page and in the management discussions.
  • Filing Fee Disclosure modernization (Release No. 33-10997, effective 2022). The fee table moved from a narrative table on the cover page into a dedicated structured exhibit, EX-FILING FEES (EX-107), tagged in inline XBRL under the FFD taxonomy. Pre-2022 records carry the fee calculation on the cover; post-2022 records carry EX-107 as a separate iXBRL document.
  • Universal proxy and modern governance disclosures. The exhibit set has expanded to include forms of proxy card under EX-99, registration-rights agreements, lock-up agreements, and standby equity purchase agreements increasingly common in SPAC-driven cross-border combinations.

Changes in data format over time

The dataset spans October 1994 to present, traversing every EDGAR document-format era:

  • 1994 – ~2000 — ASCII / SGML era. Early filings are complete-submission text files carrying SGML <DOCUMENT> blocks with plain-ASCII bodies. Tabular financial data is rendered with monospaced columns. No HTML, no XBRL.
  • ~2000 – ~2009 — HTML era. Filings transitioned to HTML inside the SGML wrapper, with embedded tables, fonts, and inline images. Exhibits are uniformly <DOCUMENT>-wrapped HTML. No XBRL.
  • ~2009 – ~2017 — Standalone XBRL era. Following the SEC's Interactive Data rule, foreign private issuers filing in U.S. GAAP and, later, in IFRS, began submitting an accompanying XBRL exhibit set (EX-101) — separate .xsd, _cal.xml, _def.xml, _lab.xml, _pre.xml, and instance .xml files — alongside the human-readable HTML.
  • ~2017 – present — Inline XBRL era. XBRL tagging migrated into the main HTML document via <ix:…> elements, producing iXBRL XHTML files declared with an XML prolog and many XBRL namespaces. Taxonomy linkbases continue to ride alongside as dataFiles, and extracted instance documents (*_htm.xml) are produced from the inline tags.
  • 2022 – present — Structured fee exhibit. EX-FILING FEES (EX-107) appears as a separate iXBRL document scoped to the ffd taxonomy, replacing the narrative fee table on the cover page.

Throughout these eras the dataset preserves the original byte content, so a record's encoding reflects the filing-era conventions: 1994 records read as plain ASCII text inside SGML wrappers, 2010-era records read as HTML inside SGML, and modern records read as a mix of pure iXBRL XHTML (main statement + fee exhibit) alongside SGML-wrapped HTML (most other exhibits).

Interpretation and extraction notes

  • Amendments are full re-submissions. An F-4/A is a complete registration-statement document, not a diff against the prior F-4. To reconstruct the amendment chain for a given deal, group records by the master fileNo (the 333-xxxxxx prefix in entities[].fileNo), not by accession number — each amendment receives its own accession number but reuses the same file number.
  • Co-registrants share a master file number. Multi-filer business combinations produce two or three entities[] entries that share a master fileNo and add suffixes -01, -02. SIC, ticker, state of incorporation, and fiscal-year-end vary across co-registrants because the entities sit in different industries and jurisdictions.
  • Incorporation by reference is pervasive. F-4 prospectuses commonly incorporate by reference the registrant's most recent Form 20-F and subsequent Form 6-K filings; those documents are not included in this record and must be retrieved separately under their own accession numbers.
  • Image references will not resolve locally. Both inline <IMG> tags inside the HTML and documentFormatFiles[] entries with type: "GRAPHIC" point to filenames whose bytes are absent from the bundle. Their documentUrl on sec.gov remains valid for re-fetching when needed.
  • Two HTML dialects coexist within one folder. The main statement and EX-FILING FEES are XML-prolog iXBRL XHTML with lowercase tags; most other exhibits are SGML-wrapped HTML with uppercase tags. Parsers should branch on the file's opening bytes rather than assume a single dialect per record.
  • iXBRL semantics live in the main document. Concept-level financial and disclosure facts are authored inline in the XHTML, bound to context references that name the reporting periods, rather than filed as a separate XBRL instance. The extracted *_htm.xml data file in dataFiles[] is the canonical instance document, but it is derived from those inline tags rather than supplied separately by the filer.
  • Fee tables migrated. For pre-2022 records, the registration-fee calculation appears on the cover page of the main document; for post-2022 records, it is carried in a separate EX-FILING FEES exhibit tagged under the FFD taxonomy.
  • EDGAR header preservation. Strings such as the (Filer) role suffix on companyName, the literal "000000000" IRS number for foreign issuers without a U.S. EIN, and the EDGAR state/country codes (D8, E9, V8, A1, Z4) are preserved verbatim from the submission header rather than normalized to human-readable values.
  • Foreign-GAAP reconciliation status. Whether the financial statements are reconciled to U.S. GAAP or presented under IFRS as issued by the IASB without reconciliation depends on the registrant's accounting framework election and on filing date relative to the 2007 reconciliation-elimination rule; both regimes appear in the dataset.
  • Financial statements are split between registrant and target. Unlike a single-issuer registration, an F-4 prospectus carries two complete sets of audited financials (registrant and target) plus pro forma combined statements. Extraction pipelines that assume a single primary issuer per record will misattribute target-company facts unless they segment the document by the section headings.
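The dialect branching recommended in these notes can be sketched as follows. This is a minimal illustration, assuming that iXBRL XHTML files open with an XML prolog (`<?xml`) and that SGML-wrapped files open with a `<DOCUMENT>` tag, as the notes describe. The function names `sniff_dialect` and `sniff_file` and the 256-byte read window are hypothetical choices, not part of the dataset.

```python
BOM = b"\xef\xbb\xbf"  # optional UTF-8 byte-order mark


def sniff_dialect(head: bytes) -> str:
    """Classify a document by its opening bytes.

    Returns "ixbrl-xhtml", "sgml-wrapped", or "unknown".
    """
    if head.startswith(BOM):
        head = head[len(BOM):]
    # Ignore leading whitespace and compare case-insensitively.
    start = head.lstrip()[:64].lower()
    if start.startswith(b"<?xml"):
        return "ixbrl-xhtml"      # assumption: XML prolog marks iXBRL XHTML
    if start.startswith(b"<document>"):
        return "sgml-wrapped"     # assumption: SGML wrapper comes first
    return "unknown"


def sniff_file(path: str) -> str:
    """Read only the first bytes of a file and classify it."""
    with open(path, "rb") as fh:
        return sniff_dialect(fh.read(256))
```

Files that come back as "unknown" (plain ASCII bodies, PDFs, JSON sidecars) would need their own handling; the sketch only separates the two HTML dialects.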

Who Files or Publishes This Dataset, and When

Who files

Each record is one EDGAR submission of Form F-4 (initial registration statement) or Form F-4/A (pre- or post-effective amendment) by a foreign private issuer registering securities to be issued in a business combination, exchange offer, or other Rule 145(a) transaction. Form F-4 is the FPI counterpart to Form S-4.

Eligibility to file is defined by Securities Act Rule 405 and Exchange Act Rule 3b-4(c). An issuer organized outside the United States qualifies as an FPI unless both (i) more than 50 percent of its outstanding voting securities are held of record by U.S. residents, and (ii) any one of these is true: a majority of executive officers or directors are U.S. citizens or residents, more than 50 percent of assets are located in the United States, or the business is administered principally in the United States. FPI status is retested as of the last business day of the second fiscal quarter; an issuer that loses FPI status moves to the domestic regime and would file S-4 instead of F-4.
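The two-prong test reads more easily as a boolean expression. The sketch below encodes the test exactly as paraphrased in the paragraph above; it is a simplification for illustration, and the class and field names (`IssuerFacts`, `us_record_holder_pct`, and so on) are hypothetical, not EDGAR or dataset fields.

```python
from dataclasses import dataclass


@dataclass
class IssuerFacts:
    organized_outside_us: bool
    us_record_holder_pct: float            # % of voting securities held of record by U.S. residents
    majority_officers_or_directors_us: bool
    us_assets_pct: float                   # % of assets located in the United States
    administered_principally_in_us: bool


def is_foreign_private_issuer(f: IssuerFacts) -> bool:
    """Return True when the issuer keeps FPI status under the paraphrased test."""
    if not f.organized_outside_us:
        return False
    # Prong (i): U.S. record ownership above 50 percent.
    ownership_prong = f.us_record_holder_pct > 50.0
    # Prong (ii): any one U.S. business-contact condition is true.
    contacts_prong = (
        f.majority_officers_or_directors_us
        or f.us_assets_pct > 50.0
        or f.administered_principally_in_us
    )
    # Status is lost only when BOTH prongs are met.
    return not (ownership_prong and contacts_prong)
```

Under this reading, an issuer with 60 percent U.S. record ownership but no U.S. business contacts still qualifies, because prong (ii) fails.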

The F-4 filer is the FPI issuing the securities being registered. In a merger that is typically the acquirer or a parent holding company; in an exchange offer it is the offeror; in a Rule 145(a) reclassification or transfer of assets it is the entity whose securities will be issued to voting holders. The target is generally not the F-4 filer, although its financial statements and MD&A are commonly incorporated into the F-4 prospectus. Foreign governments and political subdivisions are not FPIs and use Schedule B, not F-4.

Triggering events

Form F-4 is required when an FPI offers or sells securities in a transaction subject to registration under Section 5 of the Securities Act of 1933 and the transaction is within the scope of the form. The principal triggers are:

  • statutory mergers and consolidations in which the FPI's securities are issued to holders of another entity;
  • exchange offers in which the FPI offers its own securities (alone or with cash) for securities of a target;
  • Rule 145(a) transactions: reclassifications (other than stock splits, reverse splits, or par-value changes), mergers or consolidations, and transfers of assets in consideration for securities, where the matter is submitted to a holder vote or consent;
  • resales by persons (including affiliates) of securities acquired in a Rule 145(a) transaction, where registered on F-4 as permitted; and
  • cross-border offers and business combinations in which the FPI's securities are offered to U.S. holders and no exemption eliminates Securities Act registration (for example, Securities Act Rule 802, the registration-side counterpart to the Tier I tender-offer exemptions under Rules 13e-4(h)(8) and 14d-1(c)).

There is no periodic cadence; F-4 is entirely event-driven by the underlying transaction.

Regulatory framework and content

The disclosure obligation flows from Section 5 of the Securities Act, with Rule 145 confirming that certain holder-vote transactions are "offers" and "sales" of the new securities. Form F-4 integrates Securities Act registration with Regulation S-K and Regulation S-X disclosure, applied through the FPI financial-statement rules (U.S. GAAP, or IFRS as issued by the IASB without reconciliation; otherwise reconciliation to U.S. GAAP). The F-4 prospectus typically does double duty as the offer-to-exchange document or proxy/information statement delivered to target holders, so its content overlaps with Regulations 14A, 14D, and 14E where those regimes also apply.

Timing, effectiveness, and the F-4/A workflow

The standard sequence:

  • The FPI files the initial F-4 once the transaction agreement (merger agreement, exchange-offer terms) is signed or developed enough to support full prospectus disclosure. Confidential submission is available to FPIs in defined circumstances.
  • SEC staff review usually generates at least one round of comments. Each substantive change or response is filed as a pre-effective amendment on F-4/A, which is also used to refresh financial statements, revise exchange ratios, update fairness opinions, or reflect restructured terms.
  • The registration statement becomes effective only when declared effective by the staff (Rule 462 immediate effectiveness is rarely applicable to F-4). Effectiveness must precede mailing of the prospectus for a vote or commencement of an exchange offer that requires an effective registration statement.
  • After effectiveness, the prospectus is delivered to target holders, who vote (mergers and Rule 145 reclassifications) or tender (exchange offers). Consummation follows the vote or expiration of the offer.
  • Post-effective F-4/A amendments update the file for material changes before consummation, deregister unsold securities, or address issues during the offering period.

The original F-4 and its F-4/A amendments share a single registration-file lineage: each submission receives its own accession number, but all of them carry the same 333- file number.

Coverage

Rule 145 was adopted in 1972 (Securities Act Release No. 5316); Form F-4 was adopted in 1985, in parallel with Form S-4, as an extension of the integrated disclosure system's F-series for foreign private issuers (replacing prior use of forms such as S-14). The dataset's earliest records are from October 1994, reflecting EDGAR phase-in rather than the historical origin of the form; pre-1994 paper F-4 filings are not included.

How This Dataset Differs From Similar Datasets or Filings

Form F-4 sits at the crossing of two axes that govern most Securities Act registration choices: (1) issuer status — foreign private issuer versus U.S. domestic registrant, and (2) purpose — business-combination consideration versus general capital raising. Every adjacent form below differs from F-4 along one or both axes. Mapping those axes is the fastest way to know which filings overlap with F-4, which substitute for it, and which travel with it in the same deal.

Form S-4 — domestic counterpart (same purpose, different issuer status)

S-4 is the single closest analogue. Both register securities issued as consideration in mergers, exchange offers, and Rule 145 transactions, and both contain a prospectus with target financials, pro formas, fairness opinions, and merger agreements. The dividing line is Rule 405 FPI status: F-4 is filed only by registrants qualifying as FPIs; S-4 by all other registrants, chiefly U.S. domestic issuers. This drives accounting (IFRS or home-country GAAP with U.S. GAAP reconciliation on F-4; U.S. GAAP throughout on S-4) and incorporated-by-reference baseline (Form 20-F vs. Form 10-K). The two datasets together approximate the full universe of registered M&A consideration in the U.S., but they are mutually exclusive per filing and cannot be merged without normalizing for accounting regime.

Form F-1 — FPI general registration (same issuer status, different purpose)

Form F-1 is the FPI registration statement for offerings not covered by a more specialized form (typically IPOs and follow-ons for cash). Same filer population as F-4 and the same Securities Act mechanics, but no target financials, no pro forma combination, no exchange-ratio mechanics, no Rule 145 framework. An FPI raising primary capital files F-1; the same FPI issuing stock as deal consideration files F-4. They are substitutes only in the rare case where a transaction can be structured either as a primary offering or as a registered exchange.

Form F-3 — FPI shelf (same issuer status, different purpose and structure)

Form F-3 is the short-form FPI shelf available to seasoned issuers meeting reporting-history and float thresholds. It permits incorporation by reference of Exchange Act filings (notably Form 20-F) and supports continuous or delayed takedowns from a single base prospectus. F-4 is transaction-specific and long-form: even when it incorporates 20-F by reference, it must carry the deal prospectus, target financials, and pro formas that F-3 never includes. F-3 is generally not used for business combinations, although a registered acquirer may fund a cash-and-stock deal via a shelf takedown structured as a primary offering rather than an exchange.

Form S-1 — domestic general registration (different on both axes)

Form S-1 is two steps removed from F-4: domestic issuer and not business-combination specific. It is useful only as a contrast point clarifying that F-4 is doubly specialized — by FPI status and by transactional purpose — whereas S-1 is the unspecialized domestic baseline.

Form 425 — deal communications (complementary, not a substitute)

Form 425 is a filing wrapper for written communications relating to a business combination that constitute prospectuses or solicitation material under Rules 165, 166, and 425 (investor presentations, press releases, employee communications, transcripts). It is not a registration statement and never substitutes for F-4. The two are deeply complementary: 425s typically begin before the F-4 is filed and continue through closing, while the F-4 is the formal registration and definitive prospectus. Full deal reconstruction requires pairing them.

Schedule 14A / 14C — proxy and information statements (different statute, frequently combined)

When a business combination requires a U.S. shareholder vote, proxy materials enter the picture. A U.S. target whose shareholders must vote files Schedule 14A (PRE 14A, then DEF 14A); a 14C information statement applies when no solicitation occurs. F-4 governs issuance of securities under the Securities Act; 14A/14C governs solicitation of votes under the Exchange Act. The same physical document is routinely filed as a joint proxy statement / prospectus serving both regimes, but the filing identifiers and datasets remain distinct. FPI acquirer-side votes are usually governed by home-country law and rarely produce a 14A on the acquirer side, so 14A overlap is most common on the U.S. target side.

Schedule TO, TO-T, TO-I — tender offer schedules (different statute, sometimes combined)

Schedule TO is the Exchange Act framework for tender and exchange offers (TO-T for third-party offers; TO-I for issuer self-tenders). Overlap with F-4 occurs only in exchange offers: when a bidder offers its own securities for target securities, the F-4 registers the consideration securities and the TO discloses offer mechanics, with the F-4 prospectus incorporated into the TO. A pure cash tender offer requires TO only, no F-4. A one-step statutory merger requires F-4 (if stock is consideration) but no TO; instead 14A typically accompanies it. The choice between exchange-offer structure (F-4 + TO) and merger structure (F-4 + 14A) is driven by deal speed, squeeze-out mechanics, and target-shareholder dynamics.

Form F-8 / F-80 / F-10 — MJDS forms for Canadian issuers

Eligible Canadian issuers in cross-border business combinations may file Form F-8 or Form F-80 under the multijurisdictional disclosure system, with reduced disclosure, instead of F-4. Both forms register securities issued in exchange offers or business combinations; they differ mainly in the permitted level of U.S. holdings in the target. Form F-10 is the corresponding MJDS form for non-business-combination registered offerings. Where a cross-border exemption applies (U.S. ownership of the target is sufficiently limited and the transaction qualifies under the Tier I provisions of Rule 13e-4(h)(8) or Rule 14d-1(c), with Securities Act Rule 802 covering the securities issued), Securities Act registration on F-4 may not be required at all and the deal can proceed under home-country rules with a limited U.S. overlay.

Form 20-F — FPI annual report (periodic baseline, frequently incorporated)

Form 20-F is the Exchange Act annual report for FPIs. It is not a substitute for F-4 but is routinely incorporated by reference to supply registrant historical financials, MD&A, and risk factors. The relationship is hierarchical: 20-F is the periodic baseline; F-4 is the transactional event document that pulls 20-F content forward and layers in target financials, pro formas, and deal-specific items.

F-4/A — amendments (same dataset, different workflow stage)

F-4/A filings are amendments to previously filed F-4 registration statements and are included in this dataset alongside initial F-4s. Amendments respond to SEC staff comments, refresh stale financials, reflect repricing or revised exchange ratios, or incorporate post-signing changes to the merger agreement. A typical transaction produces one F-4 and several F-4/A filings before effectiveness. They should be treated as a sequenced trail of a single registration, not as duplicates.

Boundary summary

The F-4 dataset is defined by the intersection of FPI registrant status and business-combination purpose. It is not interchangeable with F-1 / F-3 (same FPI population, but general capital raising rather than deal consideration), S-4 (same business-combination purpose, but domestic registrants), S-1 (neither axis matches), 20-F (periodic reporting, not Securities Act registration), or 425, 14A/14C, and Schedule TO (companion filings under different statutes that travel with F-4 in the same deal but capture different content). Within its scope, F-4 is the authoritative, prospectus-level Securities Act document for cross-border and FPI-led registered M&A consideration — the only filing carrying the full registrant-and-target financial reconciliation, pro forma combination, and deal-mechanics disclosure for that population.

Who Uses This Dataset

Form F-4 filings bundle a transaction prospectus, dual-issuer financials presented under IFRS or reconciled to U.S. GAAP, pro formas, and a full exhibit set. Different professions mine different slices of the same record.

M&A bankers and corporate development teams

Sell-side bankers and in-house corp dev teams pull deal mechanics from the "Terms of the Transaction" and "The Merger Agreement" sections: exchange ratios, collars, walk-away rights, fiduciary-outs, termination and reverse termination fees, and closing conditions. The background-of-the-merger narrative and fairness-opinion exhibits feed precedent decks and break-fee benchmarking by jurisdiction, deal size, and consideration mix.

Merger arbitrage and event-driven funds

Risk-arb analysts size spreads and probability-weight outcomes using exchange ratio mechanics, collar formulas, election and proration rules, and regulatory conditions (antitrust, foreign investment review, sectoral approvals). EX-99 voting and support agreements quantify locked-up target shares. Historical F-4/F-4/A series support backtests of completion rates, time-to-close, and the predictive power of specific deal-protection terms.

Securities lawyers and disclosure counsel

Transactional counsel use the dataset as a precedent library for drafting, staff-comment response, and benchmarking. They search risk factors, tax-consequences disclosure, accounting treatment, and appraisal-rights summaries, and reuse EX-5 legality opinions, EX-8 tax opinions, EX-23 auditor consents, and EX-99 voting/support, lock-up, and stockholder agreements. F-4/A revision diffs expose the comment-and-response pattern, letting counsel anticipate review issues on comparable deals.

Equity and credit analysts

Equity research models the post-combination entity from pro forma financial statements, segment reconciliations, synergy disclosures, and the IFRS/U.S. GAAP reconciliation footnotes. Credit analysts pair the same financials with assumed indebtedness, change-of-control covenants in material debt agreements, and any new financing exhibits to reassess pro forma leverage and covenant headroom on both legs of the combination.

Technical accounting and external reporting teams

Controllers and technical accounting groups at foreign private issuers use the dataset as a working reference for IFRS-to-U.S. GAAP reconciliation, purchase-accounting disclosure, and pro forma adjustment practice. They benchmark reconciliation footnotes, goodwill and intangible allocation tables, IFRS 8 vs. ASC 280 segment reporting, and revenue recognition disclosure against peers in the same industry and home jurisdiction.

Compliance, FCPA, and sanctions reviewers

Compliance and FCPA teams diligence cross-border counterparties using risk factors on bribery, sanctions, and export controls, the legal proceedings section, related-party transactions, and any disclosure of internal investigations or government inquiries. Output feeds onboarding, sanctions screening, and ongoing monitoring of combined entities operating in higher-risk geographies.

Transfer agents, exchange agents, and depositaries

Operations teams running exchange-offer mechanics use the prospectus consideration and election sections plus EX-99 forms of letter of transmittal and exchange agent agreements to configure election deadlines, default elections, fractional-share treatment, and tender procedures for certificated and book-entry shares.

Quantitative researchers

Quant teams build historical libraries of cross-border deal terms and outcomes using cover-page fields, EX-FILING FEES exhibits for registered share counts and aggregate transaction value, and any inline XBRL data. Features feed completion classifiers, premium models, and post-merger drift signals conditioned on jurisdiction, consideration mix, and deal-protection strength.

ETL engineers at data vendors and asset managers

Data engineering teams use metadata.json (accession, filer, form type, exhibit-type tags) for indexing, then run document parsers on prospectus HTML and exhibit text to populate deal-terms, financials, and parties tables. Having every submitted document in one container simplifies snapshot rebuilds and reprocessing when extraction logic changes.

Legal-tech and precedent-search product teams

Engineering teams building clause-extraction and precedent-search products use exhibit-level segmentation across EX-5, EX-8, EX-23, and EX-99 merger, voting, support, and registration-rights agreements, plus the prospectus narrative for risk-factor and rationale clause libraries. F-4/A revision diffs provide labeled examples for revision-prediction and comment-response models.

LLM and RAG developers

Teams building retrieval systems on SEC content chunk the proxy statement/prospectus by section (summary, risk factors, the merger, material U.S. federal income tax consequences, accounting treatment, comparison of stockholder rights), embed exhibits separately by type, and link registration statements to their amendments via metadata. Mixed HTML/PDF/TXT formats and consistent EDGAR metadata make the corpus suitable for benchmarking parsing accuracy and grounded answers on cross-border M&A.

Academic researchers in finance, law, and accounting

Finance academics use deal terms and outcomes for premium, completion, and announcement-return studies. Legal scholars analyze deal-protection evolution, fiduciary-out drafting, and forum-selection clauses across F-4/A revisions. Accounting researchers study reconciliation quality and pro forma disclosure. Coverage from 1994 forward supports event-time, calendar-time, and panel designs across jurisdictions and industries.

Synthesis

Deal practitioners extract transaction terms and exhibit precedent; investors price announced deals; accounting and compliance functions handle reconciliation and counterparty risk; data, legal-tech, and research teams build structured products on top of the corpus. Each role keys into a different layer of the same record — cover page and metadata.json, prospectus narrative, pro forma and reconciliation financials, XBRL, or the EX-5/EX-8/EX-23/EX-99/EX-FILING FEES exhibit set — which is why the full document bundle, including F-4/A amendments, is the working unit.

Specific Use Cases

The following workflows show how teams operate on Form F-4 records in practice. Each ties to specific exhibits, sections, or metadata.json fields.

Cross-border deal-cost benchmarking from EX-FILING FEES

Parse the EX-107 (EX-FILING FEES) iXBRL exhibit to extract ffd:OfferingTableNa line items, registered share counts, aggregate transaction value, fee rate, and offsets. Joined to entities[].sic, entities[].stateOfIncorporation, and filedAt, this yields a panel of cross-border registered M&A volumes by jurisdiction and industry, plus per-deal SEC fee economics for budgeting and pitch decks.
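A minimal sketch of the extraction step, using only the Python standard library, might look like the following. It assumes the fee exhibit is well-formed XHTML that uses the Inline XBRL 1.1 namespace and that fee facts carry a `name` attribute prefixed with `ffd:`. The concept names in the sample (`ffd:HypotheticalSecurityTitle`, `ffd:HypotheticalFeeAmount`) are placeholders invented for illustration, not real FFD taxonomy elements.

```python
import xml.etree.ElementTree as ET

IX_NS = "http://www.xbrl.org/2013/inlineXBRL"
FACT_TAGS = {f"{{{IX_NS}}}nonFraction", f"{{{IX_NS}}}nonNumeric"}


def extract_ffd_facts(xhtml: bytes) -> list:
    """Collect inline facts whose concept name starts with 'ffd:'."""
    root = ET.fromstring(xhtml)
    facts = []
    for el in root.iter():
        if el.tag not in FACT_TAGS:
            continue
        name = el.get("name", "")
        if not name.startswith("ffd:"):
            continue
        facts.append({
            "concept": name,
            "contextRef": el.get("contextRef"),
            "text": "".join(el.itertext()).strip(),  # displayed value, unscaled
        })
    return facts


# Made-up sample document; concept names are placeholders.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:ix="http://www.xbrl.org/2013/inlineXBRL">
  <body>
    <ix:nonNumeric name="ffd:HypotheticalSecurityTitle" contextRef="c1">Ordinary shares</ix:nonNumeric>
    <ix:nonFraction name="ffd:HypotheticalFeeAmount" contextRef="c1" unitRef="USD" decimals="2">1,234.56</ix:nonFraction>
    <ix:nonFraction name="dei:Other" contextRef="c1">7</ix:nonFraction>
  </body>
</html>"""

facts = extract_ffd_facts(SAMPLE)
```

The sketch returns the displayed text only; applying `scale`, `sign`, and `format` attributes to recover numeric values is left out. Real exhibits that contain named HTML entities such as `&nbsp;` may fail strict XML parsing and need a more lenient parser.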

Merger-arb signal extraction from prospectus terms

Parse the "Terms of the Transaction," "The Merger Agreement," and consideration-election sections of the main F-4 document for exchange ratios, fixed/floating collars, walk-away thresholds, election and proration mechanics, termination fees, and regulatory closing conditions. Combined with EX-99 voting and support agreements (locked-up share counts), these features feed spread-sizing models, completion-probability classifiers, and time-to-close estimators.

SEC comment-letter response tracking via F-4/A revision diffs

Group records by master fileNo (the 333-xxxxxx prefix in entities[].fileNo) to assemble the F-4 / F-4/A amendment chain for a single registration, then diff successive prospectus and exhibit text to surface the staff-comment response pattern: added risk factors, refreshed financials, revised tax-consequences language, and reworked exchange-ratio mechanics. Output supports benchmarking of likely review issues on comparable in-flight deals and labeled training data for revision-prediction models.
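The grouping step could be sketched as below. The code assumes each record's metadata.json has been loaded into a dict with `entities[].fileNo` and `filedAt` as described in this document; the `accessionNo` and `formType` keys in the sample are assumed names for illustration, and the sample records are made up.

```python
import re
from collections import defaultdict

MASTER_RE = re.compile(r"^(333-\d+)")  # master prefix, ignoring -01/-02 suffixes


def master_file_no(file_no):
    """Return the 333-xxxxxx prefix of a file number, or None if absent."""
    m = MASTER_RE.match(file_no or "")
    return m.group(1) if m else None


def build_chains(records):
    """Map each master file number to its records, oldest first."""
    chains = defaultdict(list)
    for rec in records:
        masters = {master_file_no(e.get("fileNo")) for e in rec.get("entities", [])}
        for master in masters - {None}:
            chains[master].append(rec)
    for recs in chains.values():
        # Assumes filedAt strings share one ISO-8601 layout and UTC offset,
        # so lexicographic order matches chronological order.
        recs.sort(key=lambda r: r["filedAt"])
    return dict(chains)


# Made-up records for illustration.
records = [
    {"accessionNo": "0000000000-24-000002", "formType": "F-4/A",
     "filedAt": "2024-03-01T09:00:00-05:00",
     "entities": [{"fileNo": "333-111111"}, {"fileNo": "333-111111-01"}]},
    {"accessionNo": "0000000000-24-000001", "formType": "F-4",
     "filedAt": "2024-01-15T09:00:00-05:00",
     "entities": [{"fileNo": "333-111111"}]},
    {"accessionNo": "0000000000-24-000003", "formType": "F-4",
     "filedAt": "2024-02-01T09:00:00-05:00",
     "entities": [{"fileNo": "333-222222"}]},
]
chains = build_chains(records)
```

Successive entries in each chain can then be paired for text diffing, for example with `difflib` from the standard library.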

IFRS-to-U.S. GAAP reconciliation precedent library

Extract the audited financial statements, reconciliation footnotes, and pro forma combined statements from Part I of the prospectus, segmented by registrant vs. target headings. Filtered against entities[].sic and stateOfIncorporation, this builds a peer-keyed reference set of reconciliation entries, purchase-price allocation tables, IFRS 8 vs. ASC 280 segment mappings, and pro forma adjustment practice for use by technical accounting teams and external reporters.

EX-5 / EX-8 opinion-precedent search for cross-border counsel

Index EX-5.x legality opinions and EX-8.x tax opinions by issuing firm (Conyers Dill & Pearman, Ogier, Maples and Calder, Walkers, U.S. counsel), registrant stateOfIncorporation, and deal structure (Section 368 reorganization, scheme of arrangement, statutory merger). Counsel use the resulting precedent search to draft opinions for Cayman, BVI, Bermuda, and Delaware sub-issuer structures and to benchmark assumption and qualification language across firms.

RAG and LLM training on F-4 prospectus chunks

Chunk the main iXBRL XHTML prospectus by canonical section (Q&A, summary, risk factors, background of the merger, material U.S. federal income tax consequences, accounting treatment, comparison of rights of security holders, MD&A) and embed each exhibit type (EX-2, EX-5, EX-8, EX-10, EX-23, EX-99) as its own document class. Records are linked to their amendments through the master fileNo, producing a grounded retrieval corpus for cross-border M&A question-answering and a benchmark for parsing iXBRL XHTML alongside SGML-wrapped HTML in the same record.

Subsidiary and co-registrant graph construction

Combine EX-21.x subsidiary lists with the multi-entity entities[] array (master fileNo plus -01, -02 suffixes, divergent SIC and stateOfIncorporation) to build a deal-time corporate graph linking the registrant, target, surviving entity, and named subsidiaries by jurisdiction. Output supports counterparty-risk diligence, sanctions and FCPA screening of combined entities, and post-close entity-master maintenance at data vendors.
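As a starting point for the graph, the co-registrant nodes of one record can be derived from `entities[]` alone. The sketch below assumes the `fileNo`, `companyName`, `sic`, and `stateOfIncorporation` fields described in this document and strips the "(Filer)" role suffix from names; the helper name `co_registrant_nodes` and the sample values are hypothetical.

```python
import re

FILE_NO_RE = re.compile(r"^(333-\d+)(?:-(\d+))?$")   # master plus optional suffix
ROLE_RE = re.compile(r"\s*\(Filer\)\s*$")           # trailing role marker


def co_registrant_nodes(metadata: dict) -> list:
    """Return one node per entity, ordered by master file number and suffix."""
    nodes = []
    for e in metadata.get("entities", []):
        m = FILE_NO_RE.match(e.get("fileNo") or "")
        if not m:
            continue  # skip entities without a 333- file number
        nodes.append({
            "master": m.group(1),
            "suffix": m.group(2) or "",   # "" marks the master registrant
            "name": ROLE_RE.sub("", e.get("companyName") or ""),
            "sic": e.get("sic"),
            "stateOfIncorporation": e.get("stateOfIncorporation"),
        })
    return sorted(nodes, key=lambda n: (n["master"], n["suffix"]))


# Made-up metadata for illustration.
sample_metadata = {"entities": [
    {"companyName": "Example Merger Sub Inc (Filer)", "fileNo": "333-123456-01",
     "sic": "7372", "stateOfIncorporation": "DE"},
    {"companyName": "Example Holdings Ltd (Filer)", "fileNo": "333-123456",
     "sic": "6770", "stateOfIncorporation": "E9"},
]}
nodes = co_registrant_nodes(sample_metadata)
```

Subsidiaries parsed from EX-21.x text would be attached to these nodes in a later step; that parsing is not shown here.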

Dataset Access

Dataset Index JSON API: https://api.sec-api.io/datasets/form-f4-files.json

This endpoint returns the dataset's metadata, including its name, description, last updated timestamp, earliest sample date, total record and size counts, covered form types (F-4, F-4/A), container format (ZIP), and contained file types (TXT, JSON, HTML, PDF). The response also includes the full dataset download URL and a list of all individual container files with per-container size, record count, last updated timestamp, and download URL. This endpoint can be polled daily to identify which containers were updated in the most recent refresh, allowing incremental downloads instead of re-fetching the full archive. No API key is required to access this endpoint.

Example
{
  "datasetId": "1f13365b-9ae0-692c-99b0-82ddaf21130b",
  "datasetDownloadUrl": "https://api.sec-api.io/datasets/form-f4-files.zip",
  "name": "Form F-4 Files Dataset",
  "updatedAt": "2026-04-24T03:02:20.356Z",
  "earliestSampleDate": "1994-10-01",
  "totalRecords": 33850,
  "totalSize": 1923243320,
  "formTypes": ["F-4", "F-4/A"],
  "containerFormat": "ZIP",
  "fileTypes": ["TXT", "JSON", "HTML", "PDF"],
  "containers": [
    {
      "downloadUrl": "https://api.sec-api.io/datasets/form-f4-files/2026/2026-04.zip",
      "key": "2026/2026-04.zip",
      "size": 13818783,
      "records": 154,
      "updatedAt": "2026-04-24T03:02:20.356Z"
    }
  ]
}
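
The daily polling pattern described above could be sketched like this. The code relies only on the `containers[].updatedAt` and `containers[].key` fields shown in the example response; the function names and the idea of storing a "last seen" timestamp between runs are assumptions for illustration.

```python
import json
import urllib.request
from datetime import datetime

INDEX_URL = "https://api.sec-api.io/datasets/form-f4-files.json"


def fetch_index(url: str = INDEX_URL) -> dict:
    """Download and decode the dataset index (no API key needed)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def parse_ts(ts: str) -> datetime:
    """Parse timestamps such as 2026-04-24T03:02:20.356Z into aware datetimes."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def containers_updated_since(index: dict, since: str) -> list:
    """Return containers whose updatedAt is strictly later than `since`."""
    cutoff = parse_ts(since)
    return [
        c for c in index.get("containers", [])
        if parse_ts(c["updatedAt"]) > cutoff
    ]
```

A scheduled job might call `fetch_index()`, pass the result and the previous run's timestamp to `containers_updated_since`, and download only the containers returned.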

Download Entire Dataset: https://api.sec-api.io/datasets/form-f4-files.zip?token=YOUR_API_KEY

Downloads the complete Form F-4 Files dataset as a single ZIP archive covering all filings from October 1994 to the most recent refresh. This endpoint requires an API key.

Download Single Container: https://api.sec-api.io/datasets/form-f4-files/2026/2026-04.zip?token=YOUR_API_KEY

Downloads one monthly container file rather than the full archive, which is useful for retrieving only newly added or updated filings identified through the dataset index. This endpoint requires an API key.
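
A container download could be sketched with the standard library as follows. The URL layout and the `token` query parameter are taken from the endpoint shown above; the function names, timeout, and destination path are illustrative assumptions.

```python
import shutil
import urllib.parse
import urllib.request

BASE_URL = "https://api.sec-api.io/datasets/form-f4-files"


def container_url(key: str, token: str) -> str:
    """Build the download URL for a container key such as '2026/2026-04.zip'."""
    query = urllib.parse.urlencode({"token": token})
    return f"{BASE_URL}/{key}?{query}"


def download_container(key: str, token: str, dest: str) -> str:
    """Stream one container to disk without loading it fully into memory."""
    with urllib.request.urlopen(container_url(key, token), timeout=60) as resp, \
            open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest
```

For example, `download_container("2026/2026-04.zip", "YOUR_API_KEY", "2026-04.zip")` would write the April 2026 archive to the working directory, after which `zipfile.ZipFile` can list or extract the record folders.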

Frequently Asked Questions

What forms does this dataset cover?

The dataset covers Form F-4 (initial registration statement) and Form F-4/A (pre- or post-effective amendment) filings submitted to EDGAR. F-4 is the registration statement prescribed by 17 CFR 239.34 under the Securities Act of 1933 for foreign private issuers registering securities issued in business-combination transactions.

What does one record in the dataset represent?

One record is a single Form F-4 or Form F-4/A registration-statement submission, identified by its EDGAR accession number and packaged as one folder on disk. The folder contains the byte-faithful original-submission documents (main registration statement, every exhibit, and any XBRL data files) together with a generated metadata.json sidecar that re-states the EDGAR submission header and indexes every document the filer originally transmitted.

Who is required to file Form F-4?

A foreign private issuer (as defined in Securities Act Rule 405 and Exchange Act Rule 3b-4(c)) issuing securities in a business combination, exchange offer, or other Rule 145(a) transaction must file Form F-4 to register those securities under Section 5 of the Securities Act. The target company is generally not the F-4 filer, although its financial statements and MD&A are commonly incorporated into the F-4 prospectus.

How does this dataset differ from the S-4 dataset?

S-4 is the domestic counterpart to F-4 and is otherwise substantively parallel: both register securities issued as consideration in mergers, exchange offers, and Rule 145 transactions. The dividing line is Rule 405 FPI status (F-4 is filed only by foreign private issuers, S-4 by all other registrants, chiefly U.S. domestic issuers), which drives accounting (IFRS or home-country GAAP with U.S. GAAP reconciliation on F-4; U.S. GAAP throughout on S-4) and the incorporated-by-reference baseline (Form 20-F vs. Form 10-K).

What file formats are inside each record?

The file types found in the dataset are TXT, JSON, HTML, and PDF, packaged in ZIP containers. HTML/HTM is the dominant format for the main statement and exhibits in the modern era; JSON is the metadata sidecar; TXT covers legacy ASCII-era filings and the complete-submission text URL; PDF appears for occasional supplemental exhibits. XML files (XBRL taxonomy linkbases and extracted instances) ride alongside as data files listed under metadata.json.dataFiles[].

What time period does the dataset cover?

The dataset spans October 1994 to the present. The 1994 start date reflects EDGAR phase-in rather than the historical origin of the form (Form F-4 was adopted in 1985); pre-1994 paper F-4 filings are not included.

Are images included in the bundle?

No. Image files (the GRAPHIC document type — JPG, GIF, PNG) are excluded from each record. Their URLs and filenames remain visible in metadata.json and inside the HTML body via inline <IMG SRC="…"> tags, and they can be re-fetched from sec.gov if needed, but their bytes are not present locally.
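
To find which images would need re-fetching for a record, the GRAPHIC entries can be read out of metadata.json. This sketch assumes the `documentFormatFiles[]` entries carry the `type` and `documentUrl` fields mentioned in this document; the function name and the sample URLs are placeholders.

```python
def missing_graphic_urls(metadata: dict) -> list:
    """List documentUrl values for GRAPHIC entries, whose bytes are not bundled."""
    return [
        f["documentUrl"]
        for f in metadata.get("documentFormatFiles", [])
        if f.get("type") == "GRAPHIC" and f.get("documentUrl")
    ]


# Made-up metadata for illustration; URLs are placeholders.
sample = {"documentFormatFiles": [
    {"type": "F-4", "documentUrl": "https://example.com/main.htm"},
    {"type": "GRAPHIC", "documentUrl": "https://example.com/logo.jpg"},
    {"type": "GRAPHIC"},  # entry without a URL is skipped
]}
urls = missing_graphic_urls(sample)
```

Any actual re-fetch from sec.gov is left out of the sketch.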