The Form 425 Files dataset is a complete archive of EDGAR submissions filed on form type 425 — written communications made pursuant to Rule 425 under the Securities Act of 1933 in connection with proposed or pending business combinations (mergers, acquisitions, tender offers, exchange offers, and similar transactions). One record is a single EDGAR accession — a press release, investor presentation, analyst-call transcript, employee memo, shareholder letter, merger-agreement exhibit set, or other written communication about a deal — packaged as a directory of extracted filer-submitted documents plus a metadata.json descriptor. Records are filed by participants in the transaction, including the acquirer or issuer registering securities on Form S-4 or F-4, the target company, SPACs and de-SPAC targets, exchange-offer bidders, and affiliated parties such as controlling shareholders or joint bidders. The dataset begins at January 2000, tracking the start of meaningful filing activity under the SEC's 1999 M&A communications regime (Rel. No. 33-7760, effective January 24, 2000), and extends to the present. Records are grouped into monthly ZIP containers and distributed in the file types TXT, JSON, HTML, and PDF; image attachments are enumerated in metadata but stripped from the extracted tree.
The Form 425 Files dataset collects every EDGAR submission carrying form type 425 since January 2000. Each record corresponds to one accession number filed under Rule 425 in connection with a business combination. Physically, each record is a directory named for the 18-digit numeric accession number (for example 000119312525132373) containing a metadata.json descriptor plus the filer-submitted documents extracted from that accession. Accession directories are grouped into monthly ZIP containers. The file types preserved on disk are TXT, JSON, HTML, and PDF; image attachments (logos, slide-deck graphics, charts) that EDGAR received as GRAPHIC documents are enumerated in metadata.json but deliberately excluded from the extracted tree.
Form 425 is a cover designation EDGAR applies to written communications that constitute a prospectus or an offer to sell securities in connection with mergers, acquisitions, tender offers, exchange offers, or other business combinations. It is not a free-standing disclosure document in the sense of a Form 10-K or S-1; it is a filing wrapper around whatever written material the filer is releasing to the market — a press release, an investor presentation, a shareholder letter, a conference-call transcript, a social-media post, an employee memo, an advertisement, a website FAQ, or the text of an analyst-day deck. Rule 425 ensures that any such written communication used while a business-combination registration statement (typically an S-4 or F-4) is on file enters the public record carrying the prescribed legends and cross-references to that registration statement.
The record boundary is always the accession — never the transaction, never the filer, never an individual exhibit. A single multi-month merger campaign therefore produces many records (often dozens), one per 425 communication released as the deal progresses. Reconstructing a deal timeline requires grouping accessions by the filer/subject CIK pair and the registration-statement file number that each record carries. A substantial share of submissions are dual-designated 8-K/425, because the same written communication must be furnished under Form 8-K Item 7.01 (Regulation FD) or Item 8.01 (Other Events) while also satisfying Rule 425. In those cases the primary document is an 8-K body and the 425 content lives in the exhibits, but the EDGAR form type is still 425 for dataset-membership purposes.
Each accession directory contains three logical layers:
- A JSON descriptor (metadata.json) summarizing the EDGAR header, enumerating the filer and subject entities, and listing every physical document in the submission with its SGML type, sequence, original filename, filer-supplied description, direct URL, and byte size.
- The filer-submitted documents, predominantly .htm/.html, occasionally .txt, rarely .pdf, each wrapped in the SGML <DOCUMENT> envelope that EDGAR uses inside the full-submission text file.
- Image attachments that EDGAR received as GRAPHIC documents. These assets are enumerated in metadata.json but deliberately excluded from the extracted tree; only textual and document-format content is packaged.

The full-submission SGML text file (<accession>.txt) is enumerated as the trailing entry of documentFormatFiles but is not materialized on disk, because its content is equivalent to the concatenation of the per-document HTMLs together with their SGML headers.
metadata.json structure
metadata.json is a single JSON object per record. Its scalar top-level keys carry the EDGAR header values and pointers back to the original filing on sec.gov:
- formType: always "425" for this dataset.
- accessionNo: the dash-formatted accession number (for example "0001193125-25-132373").
- filedAt: ISO 8601 timestamp of acceptance by EDGAR, including the Eastern Time offset.
- description: the EDGAR form description, typically "Form 425 - Prospectuses and communications, business combinations".
- linkToFilingDetails: URL to the primary document on sec.gov.
- linkToTxt: URL to the full SGML .txt submission.
- linkToHtml: URL to the EDGAR filing-index page.
- id: internal deduplication identifier.

Two array-valued fields carry the analytically important content:
documentFormatFiles is the authoritative inventory of every physical document in the submission. Each element is an object with sequence (ordinal position as a string; the trailer entry for the complete-submission text file uses a blank " "), type (the EDGAR document type: "425", "EX-99.1", "EX-2.1", "EX-10.1", "EX-99.2", "GRAPHIC", or an empty string on the trailer), description (free text supplied by the filer), documentUrl (direct URL on sec.gov), and size (byte size as a string). Readers reconciling the manifest against the extracted tree should filter out entries where type == "GRAPHIC" and the blank-sequence trailer entry; what remains is the exact set of files that ship on disk alongside metadata.json.
entities is an array of filer/subject-company blocks. Typical 425 filings carry two entries — one for the acquirer/offeror and one for the target — disambiguated by a suffix appended to companyName: "(Filed by)" for the submitting party and "(Subject)" for the target. These suffixes are the authoritative role markers for the two sides of the transaction and must be stripped when joining to other EDGAR datasets on CIK. Each entity block exposes cik, companyName, fiscalYearEnd, stateOfIncorporation, act (the Securities/Exchange Act number, e.g. "34"), fileNo, irsNo, type (echoing the form type), sic (SIC code plus label), filmNo, and tickers (array of ticker symbols). fileNo is typically populated only on the (Subject) entity, where it references the active registration statement (for example 333-155412); this is the join key to the underlying S-4 or F-4 on the target's filing history. SPAC filings and other self-referential 425s may carry a single entity in which the filer and the subject are the same CIK.
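The role suffixes and the subject-side fileNo described above can be pulled apart with a few lines of Python. This is a minimal sketch, not the dataset's own tooling: the helper names, the sample entities, and the assumption that CIKs may arrive zero-padded are all hypothetical.

```python
import re

# Role markers that the dataset appends to companyName.
ROLE_MARKER = re.compile(r"\s*\((Filed by|Subject)\)")

def split_roles(company_name):
    """Return (clean_name, roles) with every role marker removed from the name."""
    roles = ROLE_MARKER.findall(company_name)
    clean = ROLE_MARKER.sub("", company_name).strip()
    return clean, roles

def summarize_entities(entities):
    """Collect filer CIKs, subject CIKs, and subject-side file numbers."""
    summary = {"filers": [], "subjects": [], "file_nos": []}
    for entity in entities:
        name, roles = split_roles(entity.get("companyName", ""))
        # Assumption: CIKs may or may not be zero-padded, so normalize them.
        cik = str(entity.get("cik", "")).lstrip("0")
        if "Filed by" in roles:
            summary["filers"].append((cik, name))
        if "Subject" in roles:
            summary["subjects"].append((cik, name))
            if entity.get("fileNo"):
                summary["file_nos"].append(entity["fileNo"])
    return summary

# Hypothetical sample data shaped like the entity blocks described above.
sample_entities = [
    {"cik": "0000001234", "companyName": "Acquirer Inc (Filed by)", "fileNo": ""},
    {"cik": "5678", "companyName": "Target Corp (Subject)", "fileNo": "333-155412"},
]
print(summarize_entities(sample_entities))
```

Because the regular expression removes every marker it finds, a single-entity record that carries both roles on one name would still be cleaned, although how such records are actually labeled is not confirmed here.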
Tail fields are always present but frequently empty on 425 records: seriesAndClassesContractsInformation: [], linkToXbrl: "", and dataFiles: [].
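As a small illustration of the scalar keys, the sketch below converts accessionNo into the 18-digit directory name and parses filedAt into a timezone-aware datetime. The helper names and the sample filedAt value are hypothetical; only the accession number is taken from the text above.

```python
from datetime import datetime

def accession_dir_name(accession_no):
    """Convert a dash-formatted accession number to the 18-digit directory name."""
    digits = accession_no.replace("-", "")
    if len(digits) != 18 or not digits.isdigit():
        raise ValueError(f"unexpected accession number: {accession_no!r}")
    return digits

def parse_filed_at(filed_at):
    """Parse an ISO 8601 timestamp that carries a UTC offset."""
    return datetime.fromisoformat(filed_at)

# Hypothetical record fragment; the filedAt value is made up for illustration.
record = {
    "accessionNo": "0001193125-25-132373",
    "filedAt": "2025-05-30T16:05:12-04:00",
}
print(accession_dir_name(record["accessionNo"]))
print(parse_filed_at(record["filedAt"]).isoformat())
```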
Every accession carries exactly one document of type: "425", and it is the sequence: "1" entry in documentFormatFiles. Each extracted HTML is preserved inside its SGML <DOCUMENT> envelope, so the file begins not with <!DOCTYPE html> or <HTML> but with the EDGAR header block:
    <DOCUMENT>
    <TYPE>425
    <SEQUENCE>1
    <FILENAME>d44940d425.htm
    <DESCRIPTION>425
    <TEXT>
    <HTML>...</HTML>
    </TEXT>
    </DOCUMENT>
The header is machine-readable before any HTML parsing and exposes four redundant-but-useful values: <TYPE> (EDGAR document type), <SEQUENCE> (ordinal within the submission), <FILENAME> (original EDGAR filename), and <DESCRIPTION> (filer free text). The <TEXT>...</TEXT> block wraps the HTML body.
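A parser can read those header values and isolate the HTML body before handing it to an HTML library. The following is a minimal sketch under the assumption that each header tag sits on its own line, as in the example above; the function name split_envelope is hypothetical.

```python
import re

HEADER_TAGS = ("TYPE", "SEQUENCE", "FILENAME", "DESCRIPTION")

def split_envelope(raw):
    """Split an extracted document into (header dict, body text).

    Returns ({}, raw) unchanged when no <TEXT> line is found.
    """
    lines = raw.splitlines()
    header = {}
    body_start = None
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.upper() == "<TEXT>":
            body_start = i + 1
            break
        match = re.match(r"<([A-Z]+)>(.*)$", stripped)
        if match and match.group(1) in HEADER_TAGS:
            header[match.group(1)] = match.group(2).strip()
    if body_start is None:
        return {}, raw
    body_lines = lines[body_start:]
    # Drop the closing </TEXT> and </DOCUMENT> lines (and trailing blanks).
    while body_lines and body_lines[-1].strip().upper() in ("</TEXT>", "</DOCUMENT>", ""):
        body_lines.pop()
    return header, "\n".join(body_lines)

sample = (
    "<DOCUMENT>\n<TYPE>425\n<SEQUENCE>1\n<FILENAME>d44940d425.htm\n"
    "<DESCRIPTION>425\n<TEXT>\n<HTML><BODY>x</BODY></HTML>\n</TEXT>\n</DOCUMENT>\n"
)
header, body = split_envelope(sample)
print(header)
print(body)
```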
Inside the HTML body, the primary 425 document is typically a short cover-and-legend page with five characteristic components in order:
- A filing legend: Filed by [Acquirer] / CIK No. [NNNNNNNNNN] / Pursuant to Rule 425 under the Securities Act of 1933 / Subject Company: [Target] / Registration Statement File Number: 333-NNNNNN.
- [...]
- An exhibit index listing each EX-99.x attachment with a short descriptor and (usually) an intra-filing anchor link.

When the submission is a dual 8-K/425, the primary document is an 8-K body rather than a Rule 425 cover page. It opens with an 8-K cover sheet and item heading (most commonly Item 7.01 Regulation FD Disclosure or Item 8.01 Other Events), carries a short narrative of the triggering event, ends with a signatures block under the reporting company's name, and includes its own exhibit index pointing into the merger-related attachments. The Rule 425 legends in that case migrate into the exhibits themselves rather than sitting on the cover.
The payload shape below the primary document varies with the nature of the written communication:
- An EX-99.1 exhibit carrying the substantive material, typically a press release, a shareholder letter, a translated foreign material-fact disclosure, or a transcript. The body of EX-99.1 is usually a standalone document with its own headline, narrative, tables of key dates or transaction terms, references to prior filings, and a repeat of the forward-looking-statements legend.
- EX-2.1 for the merger or business-combination agreement (often tens of thousands of words of operative contract text with schedules and exhibits of its own), EX-10.1 for voting and support agreements or sponsor lock-ups, EX-99.1 for the joint press release, EX-99.2 for the investor presentation, and additional EX-99.x exhibits for transcripts, FAQs, and employee communications. Investor-presentation exhibits drive most of the image-stripping activity, because every slide background and chart image is enumerated as a separate GRAPHIC entry.
- When the entities array lists differing filer and subject CIKs, the URL pointers inside metadata.json are not internally consistent: linkToTxt and linkToHtml reference the filer's CIK path on sec.gov, while linkToFilingDetails references the subject's CIK path. This asymmetry matters when joining 425 records to other CIK-keyed EDGAR datasets.
- Some records consist of metadata.json and a thin HTML shell even though documentFormatFiles enumerates many GRAPHIC entries. The enumerated but absent images signal that the full visual rendering is not reproducible from the dataset alone.

Every extracted document retains the EDGAR SGML <DOCUMENT> framing even though the dataset is distributed as per-document files rather than as one concatenated .txt. Naive HTML parsers will therefore encounter five or six non-HTML header lines (<DOCUMENT>, <TYPE>, <SEQUENCE>, <FILENAME>, an optional <DESCRIPTION>, and <TEXT>) before the first <HTML> or <html> tag and must tolerate or strip them. The envelope is nevertheless a reliable secondary source of document metadata (<TYPE>, <SEQUENCE>, <FILENAME>, and <DESCRIPTION>) that can be recovered from the file body even if metadata.json is lost.
HTML content inside <TEXT> is inconsistent across filers and eras. Some documents are clean semantic HTML produced by filing-agent software; others are word-processor exports littered with presentational <font> tags, fixed-width inline styles, and per-page page-break markers; older filings can contain blocks of pre-formatted text inside <PRE> tags rather than flowed HTML. Tables are common — exhibit indexes, key-terms summaries, cutover-date schedules, lists of prior filings — and in the modern era are almost always rendered as HTML <table> elements rather than as ASCII tables.
References to image assets inside the HTML (<img src="..."> tags pointing at the GRAPHIC filenames listed in metadata) resolve to broken links when the document is rendered locally, because those images are excluded from the extracted payload.
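One way to handle those broken references is to map each <img> source back to the matching GRAPHIC entry in metadata.json. The sketch below uses Python's standard html.parser module and assumes that the src value and the last path segment of documentUrl share a filename; that matching rule, the helper names, and the example.com URL are assumptions for illustration only.

```python
from html.parser import HTMLParser

class ImgCollector(HTMLParser):
    """Collect the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)

def remote_image_urls(html, metadata):
    """Map each <img> src to the documentUrl of the GRAPHIC entry, or None."""
    graphics = {
        doc["documentUrl"].rsplit("/", 1)[-1].lower(): doc["documentUrl"]
        for doc in metadata.get("documentFormatFiles", [])
        if doc.get("type") == "GRAPHIC"
    }
    collector = ImgCollector()
    collector.feed(html)
    return {
        src: graphics.get(src.rsplit("/", 1)[-1].lower())
        for src in collector.sources
    }

# Hypothetical manifest entry and HTML snippet.
sample_metadata = {
    "documentFormatFiles": [
        {"type": "GRAPHIC", "documentUrl": "https://example.com/filing/g44940g01.jpg"},
    ]
}
sample_html = '<HTML><BODY><IMG SRC="g44940g01.jpg"><img src="missing.gif"></BODY></HTML>'
print(remote_image_urls(sample_html, sample_metadata))
```

A None value marks an image reference that has no GRAPHIC entry to fall back on.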
Per the dataset's packaging policy, the file types preserved on disk are TXT, JSON, HTML, and PDF; image files are excluded. Concretely, every document whose EDGAR type is GRAPHIC — JPG, GIF, and PNG logos, slide backgrounds, and chart images — is dropped from the extracted tree but remains fully enumerated in documentFormatFiles with its filename, size, and URL. Two consequences follow:
- The files on disk are a subset of documentFormatFiles; reconciliation requires filtering out type == "GRAPHIC" entries.
- Any analysis that depends on visual content must fetch the images via their documentUrl values directly from sec.gov.

The complete-submission text file (for example 0001193125-25-132373.txt) is likewise enumerated in documentFormatFiles, typically as the trailing entry with a blank sequence of " " and an empty type, but is not materialized on disk. Its content equals the concatenation of the per-document HTMLs with their SGML headers.
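The reconciliation rule can be sketched as follows. It assumes that each on-disk filename equals the last path segment of the entry's documentUrl, which the text does not state explicitly; the function names and the example.com URLs are likewise hypothetical.

```python
import json
from pathlib import Path

def expected_files(metadata):
    """Filenames that should ship on disk according to the manifest."""
    names = set()
    for doc in metadata.get("documentFormatFiles", []):
        if doc.get("type") == "GRAPHIC":
            continue  # images are enumerated but not extracted
        if not str(doc.get("sequence", "")).strip():
            continue  # blank-sequence trailer: complete-submission .txt
        # Assumption: the on-disk name is the last segment of documentUrl.
        names.add(doc["documentUrl"].rsplit("/", 1)[-1])
    return names

def reconcile(record_dir):
    """Return (missing_on_disk, unexpected_on_disk) for one accession directory."""
    record_dir = Path(record_dir)
    metadata = json.loads((record_dir / "metadata.json").read_text(encoding="utf-8"))
    expected = expected_files(metadata)
    on_disk = {
        p.name for p in record_dir.iterdir()
        if p.is_file() and p.name != "metadata.json"
    }
    return sorted(expected - on_disk), sorted(on_disk - expected)

# Hypothetical manifest illustrating the three kinds of entries.
sample_manifest = {
    "documentFormatFiles": [
        {"sequence": "1", "type": "425", "documentUrl": "https://example.com/f/d44940d425.htm"},
        {"sequence": "2", "type": "EX-99.1", "documentUrl": "https://example.com/f/d44940dex991.htm"},
        {"sequence": "3", "type": "GRAPHIC", "documentUrl": "https://example.com/f/g44940g01.jpg"},
        {"sequence": " ", "type": "", "documentUrl": "https://example.com/f/0001193125-25-132373.txt"},
    ]
}
print(sorted(expected_files(sample_manifest)))
```

Two empty lists from reconcile would indicate that the directory matches its manifest.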
The Form 425 designation itself dates to the SEC's 1999 adoption of the Rule 165/Rule 425 communications regime (Rel. No. 33-7760), which took effect on January 24, 2000 and permitted parties to business combinations to freely disseminate written communications about a pending deal provided those communications were filed on EDGAR with the requisite legends. The dataset's January 2000 start date tracks the beginning of meaningful filing activity under that regime.
Several material shifts shape how record contents have evolved:
- In the oldest filings, content is often <PRE>-wrapped fixed-width text, tables are drawn with pipes and dashes, and embedded graphics are absent. The <DOCUMENT> envelope structure, however, is identical to the modern one.
- HTML filings then introduce <table> elements, inline styles, and the first GRAPHIC exhibits (logos and charts). By the late 2000s HTML is the overwhelming norm.
- Later filings add investor-presentation exhibits (with many GRAPHIC images), transcript exhibits, and occasional screenshot-style HTML pages.

Form 425 filings do not carry XBRL structured data; linkToXbrl and dataFiles appear in metadata.json for schema uniformity with other EDGAR datasets but are empty on this form type.
- The record boundary is the accession. Reconstructing a deal requires grouping accessions by the (Filed by)/(Subject) CIK pair and the registration-statement fileNo.
- Role suffixes are appended to companyName. The (Filed by) and (Subject) suffixes are the authoritative disambiguators for the two entities. Strip them when joining to other EDGAR datasets on CIK.
- The fileNo under the (Subject) entity is the join key to the underlying S-4 or F-4 on the target's filing history; the corresponding field on the (Filed by) entity is typically blank.
- linkToTxt and linkToHtml reference the filer's CIK path while linkToFilingDetails references the subject's CIK path. Consumers that canonicalize URLs should expect this inconsistency.
- Every extracted document begins with the SGML <DOCUMENT> header block; HTML renderers and parsers must skip or accept these pre-<HTML> lines. The envelope doubles as a fallback source of document-type and filename metadata.
- Images are excluded from the extracted tree, so <img> references resolve to broken links, and any analysis depending on visual content must fall back to documentUrl on sec.gov.
- Amendments bearing form type 425/A are not part of this dataset; only raw 425 form-type accessions are included.
- When a submission is dual-designated 8-K/425, there is exactly one accession and it is included here; when the filer chooses to submit the 8-K and the 425 separately, the 425 side is the accession that appears in this dataset.
- The files on disk equal documentFormatFiles minus (a) entries with type == "GRAPHIC" and (b) the blank-sequence trailer entry for the complete-submission text file. The resulting filtered list should match the non-metadata.json files present in the directory.

The filer of a Form 425 record is a participant in a specific business combination (or an affiliate or representative) whose written communication qualifies as a prospectus, offer, or proxy solicitation under Securities Act Rule 425. The defining trait of the filer population is transactional: each filer is tied to a specific business combination and is speaking publicly about it in writing. Ordinary-course capital raises use Form FWP (Rule 433), not Form 425.
Form 425 implements Securities Act Rule 425 (17 CFR 230.425), adopted in the SEC's 1999 M&A communications release (Rel. No. 33-7760, effective January 24, 2000). It interlocks with:
Statutory authority runs through Securities Act Sections 5, 10, and 12 and Exchange Act Sections 14(a) and 14(d)-(e). There is no pre-2000 history for the form; before the M&A release, equivalent communications were either withheld until the registration statement was effective or filed as Schedule 14A additional soliciting material.
The trigger is the first public use of a written communication that is a prospectus, offer to sell securities, or proxy solicitation in connection with a business combination within Rule 165(f). "Written communication" is read broadly and includes:
Each such communication triggers a same-day Form 425 filing on the date of first use. There is no next-business-day grace period. Form 425 is a "filed" document, not "furnished": it is incorporated by reference into the S-4/F-4 and carries Section 11 and Section 12(a)(2) liability, plus Section 14 liability where it also functions as proxy soliciting material.
Triggers by timeline:
Form 425 is defined by the form of a communication, not by the substance of a transaction. Rule 425 sweeps in any written communication a party disseminates about a registered business combination — press releases, slide decks, transcripts, employee FAQs, social media scripts — and requires it be filed as a prospectus. Because the same document frequently satisfies multiple regimes at once, Form 425 filings overlap heavily with registration statements, proxy material, tender offer schedules, 8-Ks, and free writing prospectuses. Choosing between Form 425 and its neighbors usually comes down to three questions: is the consideration stock or cash, is this a standalone communication or a full transactional document, and which statutory safe harbor is the filer trying to preserve.
S-4 (domestic) and F-4 (foreign private issuer) are the registration statements used when securities are issued as consideration in a business combination. An S-4 is typically a joint proxy/prospectus containing business descriptions, pro forma financials, fairness opinions, the merger agreement, and risk factors. Form 425 sits on top of that registration: Rule 425 treats pre-filing written communications about the deal as a prospectus that must be filed to preserve the Rule 165/166 safe harbor leading up to (and alongside) the S-4.
Reach for S-4/F-4 when you need definitive transaction terms and audited financial disclosure. Reach for Form 425 when you want the running stream of shorter communications the parties issue across the deal timeline.
Schedule 14A governs proxy solicitations under Section 14(a). Three variants are the closest proxy-side analogs to Form 425:
Reach for DEFM14A and the S-4 for the definitive deal narrative and terms. Reach for 425 (paired with DEFA14A) for the communications stream.
Schedule TO covers tender offers. TO-C is pre-commencement written communications — the tender-offer analog to Rule 425. TO-I is an issuer tender offer; TO-T is a third-party tender offer; 14D-9 is the target's recommendation statement.
The split between 425 and the TO family is transaction structure:
An analyst studying a hostile or two-step deal typically needs 425, TO-C, and Schedule 14D-9 together.
A single merger press release is commonly filed three ways at once: as an 8-K (Item 1.01 for entry into the merger agreement, Item 7.01 for Reg FD disclosure of the announcement, or Item 8.01 for other material events), as a Form 425, and often as a DEFA14A. The 8-K is an Exchange Act current report by a reporting company; the 425 is a Securities Act prospectus filing preserving the Rule 165/166 safe harbor for written offers.
The sharpest confusions:
Practical implication: an 8-K-only pull misses communications from non-reporting acquirers and supplemental materials (employee memos, investor decks, transcripts) that a filer elects to file only as 425. A 425-only pull misses the formal Item 1.01 contract disclosures and any non-425 exhibits to the 8-K. The two datasets are complementary, not substitutable.
FWP under Securities Act Rule 433 is the procedural cousin to 425: both are Securities Act communication filings sitting outside a statutory prospectus. The split is subject matter. FWP covers registered non-M&A offerings (debt, follow-on equity, structured notes, shelf takedowns); 425 covers business-combination communications. The two corpora do not meaningfully overlap in content.
Neither Schedule 13D nor Schedule 13E-3 is a communications filing. Schedule 13D is a beneficial-ownership filing triggered on crossing 5% ownership with non-passive intent, useful for tracking stake accumulation before any 425 traffic appears. Schedule 13E-3 is the going-private transaction statement required when an affiliate takes a public company private, typically filed alongside a Schedule TO or DEFM14A. Both are useful adjacencies when assembling a full deal file, but they answer ownership and transaction-status questions, not communication questions, and are not substitutes for 425.
Form 425 is the running written-communications layer around stock-for-stock mergers and exchange offers. It is narrower than 8-K (425 only covers written deal communications), narrower than S-4/DEFM14A (which are the comprehensive transaction documents), and narrower than Schedule TO (which covers tender offers regardless of consideration). It is broader only in volume: a single deal can generate dozens of 425 filings across months.
Reach for this dataset when you want:
Pair it with S-4/F-4 and DEFM14A for definitive terms, 8-K Items 1.01/7.01/8.01 and DEFA14A for overlapping formal disclosure, Schedule TO and 14D-9 for tender/exchange offers, 13D for stake accumulation, and 13E-3 for going-private mechanics.
Because Rule 425 captures every written communication that constitutes a prospectus or offer in a business combination, the Form 425 Files dataset is a near-complete archive of how deal parties describe and defend transactions from announcement to close. Different audiences mine different elements: cover-page registrant/subject-company CIKs, forward-looking and solicitation legends, press-release and deck exhibits, dual 425 / DEFA14A designations, and cross-links to the related S-4, F-4, or definitive proxy.
Arb traders price the spread and estimate break risk from announcement to close. They pull exchange ratios, collar mechanics, cash/stock mix, termination fees, and antitrust, financing, or regulatory conditions out of the press-release and Q&A exhibits, then track each incremental 425 (road-show scripts, town-hall decks, regulatory update letters) for tone shifts and revised timelines. Filing timestamps anchor event-study windows; spikes in 425 cadence near a vote flag re-pricing risk.
Bankers use the corpus as a precedent library. Juniors assemble pitches from comparable-deal 425 sets to show premium framing, synergy narratives, and strategic rationale. Fairness-opinion teams pull press-release exhibits and investor decks to reconstruct disclosed terms, accretion/dilution framing, and pro forma metrics that peer advisers signed off on. Analyst-day and joint investor materials filed on 425 benchmark disclosure depth for live mandates.
Deal counsel treat the dataset as a drafting precedent bank. They compare forward-looking-statement legends, participants-in-the-solicitation disclosures, no-offer/no-solicitation language, and additional-information legends across peer transactions to satisfy Rule 425, Rule 14a-12, and Regulation M-A. They also study how others handled employee letters, customer communications, and social-media posts filed as 425 exhibits, and whether peers cross-filed 425 as DEFA14A or as an S-4 communication.
In-house strategy and integration groups study how analogous deals communicated rationale and synergies. They harvest cost-synergy phasing, revenue-synergy caveats, integration-leadership announcements, cultural-fit messaging, and customer-retention commitments to shape board materials and their own future 425 playbook.
Compliance teams confirm every client communication in a live deal is filed, legended, and tied to the registration statement or proxy. Proxy solicitors track 425 cadence and content from both sides of a contested or friendly deal to calibrate outreach, vote recommendations, and meeting messaging, focusing on subject-company identification, the exhibit list, and each investor-facing piece.
Reporters and deal-database editorial teams reconstruct deal timelines from 425 press-release exhibits, joint statements, and slide decks, capturing the first public articulation of terms and every amendment. Near-real-time 425 feeds surface unexpected filings that signal proxy fights, counter-bids, or structural changes.
Researchers build multi-decade panels of merger communications. Structured metadata plus full text supports studies of premium disclosure, forward-looking-statement boilerplate evolution, communication intensity versus completion probability, linguistic features of completed versus terminated deals, and the role of employee and social-media communications as 425 exhibits. Legal scholars track how participants-in-the-solicitation disclosure has shifted since Regulation M-A modernization.
Machine-learning teams ingest the corpus to train entity extraction (acquirer/target CIK and name from the cover page), communication-type classification (press release, deck, employee memo, call transcript), quantitative deal-term extraction (exchange ratio, cash component, collars), and per-transaction clustering. Retrieval-augmented systems for M&A workflows use 425 text as grounded source, and CIK links to S-4 / F-4 and DEFA14A unify communications into transaction-level graphs.
Regulatory teams benchmark client legend language, safe-harbor statements, non-GAAP reconciliations, projections, and bring-down risk factors against large 425 samples, and trace how prior staff comments reshaped communication practice.
Plaintiffs' and defense counsel use 425 filings as primary evidence in disclosure litigation. Every communication is reviewed to reconstruct what shareholders knew, when, and from whom in Rule 14a-9 (Exchange Act Section 14(a)) and fiduciary-duty claims. Appraisal support teams trace publicly disclosed management projections, synergy representations, and process narratives against board-book and data-room evidence. The complete chronological record across paired acquirer and target filings exposes contradictions and shows whether supplemental 425 disclosures cured or compounded alleged omissions.
Each use case below ties a specific workflow to concrete record fields, exhibit types, or filing combinations in the Form 425 corpus.
Primary user: merger-arbitrage analyst on an event-driven desk.
Pull every 425 accession for the deal by grouping on the (Filed by) / (Subject) CIK pair and the fileNo on the subject entity (the S-4 or F-4 number). From the EX-99.1 press-release exhibit on the announcement 8-K/425, extract exchange ratio, collar mechanics, cash component, termination fees, and regulatory conditions; from each follow-on 425 (road-show transcripts, regulatory-update letters, town-hall decks) extract incremental timeline language. Use filedAt timestamps to anchor event-study windows and to flag cadence spikes near the shareholder vote that signal repricing risk.
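The grouping step can be sketched in Python as below. It is one possible reading of the workflow, not a prescribed method: the sketch keys on the subject-side fileNo when one is present and falls back to the CIK pair otherwise, and the function names and sample records are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

def deal_key(metadata):
    """Build a grouping key from the subject fileNo, else from the CIK pair."""
    filers, subjects, file_nos = set(), set(), set()
    for entity in metadata.get("entities", []):
        name = entity.get("companyName", "")
        cik = str(entity.get("cik", "")).lstrip("0")
        if "(Filed by)" in name:
            filers.add(cik)
        if "(Subject)" in name:
            subjects.add(cik)
            if entity.get("fileNo"):
                file_nos.add(entity["fileNo"])
    if file_nos:
        return ("fileNo", tuple(sorted(file_nos)))
    return ("ciks", tuple(sorted(filers)), tuple(sorted(subjects)))

def build_timelines(records):
    """Group metadata dicts by deal key and sort each group by filedAt."""
    timelines = defaultdict(list)
    for metadata in records:
        timelines[deal_key(metadata)].append(metadata)
    for filings in timelines.values():
        filings.sort(key=lambda m: datetime.fromisoformat(m["filedAt"]))
    return dict(timelines)

def _record(accession, filed_at):
    # Hypothetical record with made-up CIKs and timestamps.
    return {
        "accessionNo": accession,
        "filedAt": filed_at,
        "entities": [
            {"cik": "10", "companyName": "Acquirer Inc (Filed by)"},
            {"cik": "20", "companyName": "Target Corp (Subject)", "fileNo": "333-155412"},
        ],
    }

sample_records = [
    _record("later", "2025-05-02T09:00:00-04:00"),
    _record("earlier", "2025-05-01T17:00:00-04:00"),
]
timelines = build_timelines(sample_records)
```

Cadence analysis then reduces to differencing consecutive filedAt values within each timeline.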
Primary user: M&A securities counsel drafting deal communications.
Query the primary type: "425" cover document across accessions in the same SIC code and deal-size band, then pull the "Important Notices and Additional Information" and "Forward-Looking Statements" paragraphs plus the "Participants in the Solicitation" block. Compare legend wording, named directors/officers, and S-4 cross-references against current staff guidance to calibrate client drafts for Rule 425, Rule 14a-12, and Regulation M-A compliance. The dual 8-K/425 submissions expose where peers placed legends on the cover versus migrated them into exhibits.
Primary user: corporate-development analyst building a precedent library.
Filter documentFormatFiles for type == "EX-2.1" (merger or business-combination agreement) and type == "EX-10.1" (voting/support agreements, sponsor lock-ups) inside 8-K/425 combo accessions. The operative contract text — with schedules, termination triggers, and no-shop carve-outs — supports clause-level comparison of fiduciary outs, reverse-termination fees, and lock-up terms across peer transactions, feeding both pitch decks and board-memo templates without requiring a separate S-4 pull.
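A minimal sketch of that filter is shown below. The function name and the sample record are hypothetical, and matching on exact type strings is an assumption; filings with differently numbered exhibits (for example EX-2.2) would need a broader rule.

```python
CONTRACT_TYPES = {"EX-2.1", "EX-10.1"}

def contract_exhibits(metadata):
    """Return the manifest entries whose type marks an operative agreement."""
    hits = []
    for doc in metadata.get("documentFormatFiles", []):
        if str(doc.get("type", "")).strip().upper() in CONTRACT_TYPES:
            hits.append({
                "accessionNo": metadata.get("accessionNo"),
                "type": doc["type"],
                "description": doc.get("description", ""),
                "documentUrl": doc.get("documentUrl", ""),
            })
    return hits

# Hypothetical record with placeholder URLs.
sample_record = {
    "accessionNo": "0001193125-25-132373",
    "documentFormatFiles": [
        {"type": "425", "documentUrl": "https://example.com/cover.htm"},
        {"type": "EX-2.1", "description": "Merger agreement", "documentUrl": "https://example.com/ex21.htm"},
        {"type": "EX-99.1", "documentUrl": "https://example.com/ex991.htm"},
        {"type": "EX-10.1", "description": "Voting agreement", "documentUrl": "https://example.com/ex101.htm"},
    ],
}
for hit in contract_exhibits(sample_record):
    print(hit["type"], hit["documentUrl"])
```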
Primary user: SPAC-focused research analyst or compliance monitor.
Isolate SPAC-era accessions (2020-2022 surge) by filtering on single-entity metadata blocks where filer CIK equals subject CIK, or on SIC codes associated with blank-check companies. Track the sequence of 425 filings per target to map PIPE-subscription disclosures, projections updates, and sponsor-forfeiture amendments in the EX-99.2 investor-presentation exhibits. Cadence drops or projection revisions often precede redemption spikes and deal renegotiations.
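The two filters mentioned above can be sketched as follows. The sketch treats a record as self-referential when all of its entities share one CIK, and it checks for SIC code 6770 (Blank Checks) under the assumption that the sic string begins with the numeric code; both helper names and the sample records are hypothetical.

```python
def distinct_ciks(metadata):
    """Set of normalized CIKs across all entity blocks."""
    return {
        str(entity["cik"]).lstrip("0")
        for entity in metadata.get("entities", [])
        if entity.get("cik")
    }

def is_self_referential(metadata):
    """True when filer and subject resolve to a single CIK."""
    return len(distinct_ciks(metadata)) == 1

def looks_like_blank_check(metadata):
    """True when any entity reports SIC 6770 (assumes 'code + label' text)."""
    return any(
        str(entity.get("sic", "")).strip().startswith("6770")
        for entity in metadata.get("entities", [])
    )

# Hypothetical records.
spac_record = {"entities": [{"cik": "0001800000", "sic": "6770 Blank Checks"}]}
two_party_record = {
    "entities": [
        {"cik": "10", "sic": "7372 Services-Prepackaged Software"},
        {"cik": "20", "sic": "7372 Services-Prepackaged Software"},
    ]
}
print(is_self_referential(spac_record), looks_like_blank_check(spac_record))
```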
Primary user: plaintiffs' or defense counsel in merger-disclosure litigation.
Assemble the full chronological 425 stream for both sides of the transaction by joining on the shared fileNo, then diff successive EX-99.1 press releases, employee FAQs, and analyst-day transcript exhibits against the DEFM14A narrative to identify omissions, corrections, or supplemental disclosures. The paired acquirer/target filings expose contradictions between the two sides' characterizations of synergies, process, and projections, which form the evidentiary backbone of Rule 14a-9 claims and appraisal support.
Primary user: NLP engineer at a financial-data vendor.
Use the type field in documentFormatFiles as weak supervision for communication-class labels: EX-99.1 for press releases, EX-99.2 for investor decks, EX-2.1 for operative contracts, other EX-99.x for transcripts, FAQs, and employee memos. Entity extraction is grounded by the (Filed by) / (Subject) suffixes in companyName plus the CIK and ticker arrays; transaction-level clustering joins 425 records to the underlying S-4/F-4 via the subject's fileNo. The SGML <DOCUMENT> envelope provides fallback labels when metadata.json is unavailable.
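The type-to-label mapping described above might look like this in Python. The label strings and function names are hypothetical, and the labels are noisy by construction, since filers do not use exhibit numbers consistently.

```python
def weak_label(doc_type):
    """Map an EDGAR document type to a coarse communication-class label."""
    normalized = str(doc_type).strip().upper()
    if normalized == "425":
        return "cover"
    if normalized == "EX-99.1":
        return "press_release"
    if normalized == "EX-99.2":
        return "investor_deck"
    if normalized == "EX-2.1":
        return "contract"
    if normalized.startswith("EX-99."):
        return "other_communication"  # transcripts, FAQs, employee memos
    return None  # GRAPHIC, trailer entries, and anything unmapped

def label_documents(metadata):
    """Return (documentUrl, label) pairs for every labeled manifest entry."""
    pairs = []
    for doc in metadata.get("documentFormatFiles", []):
        label = weak_label(doc.get("type", ""))
        if label is not None:
            pairs.append((doc.get("documentUrl", ""), label))
    return pairs

# Hypothetical manifest with placeholder URLs.
sample_manifest = {
    "documentFormatFiles": [
        {"type": "425", "documentUrl": "https://example.com/a.htm"},
        {"type": "EX-99.3", "documentUrl": "https://example.com/b.htm"},
        {"type": "GRAPHIC", "documentUrl": "https://example.com/c.jpg"},
    ]
}
print(label_documents(sample_manifest))
```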
Primary user: financial journalist or deal-database editor.
Stream new 425 accessions by filedAt and flag any filing where a new acquirer CIK appears in the entities array against an existing target, or where a target suddenly files its own 425 under a different fileNo. Short primary 425 covers with a single EX-99.1 press-release exhibit bearing unexpected rationale language frequently mark topping bids, revised exchange ratios, or the opening salvo of a contested proxy solicitation before wire-service pickup.
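The first of those flags, a new filer CIK appearing against an already-known target, can be sketched as a small stateful scan. This is an illustrative heuristic only: the function names and sample CIKs are hypothetical, and the sketch ignores the target's own filings so that a target filing first does not trigger an alert for its original acquirer.

```python
from datetime import datetime

def roles(metadata):
    """Return (filer CIKs, subject CIKs) based on the companyName suffixes."""
    filers, subjects = set(), set()
    for entity in metadata.get("entities", []):
        name = entity.get("companyName", "")
        cik = str(entity.get("cik", "")).lstrip("0")
        if "(Filed by)" in name:
            filers.add(cik)
        if "(Subject)" in name:
            subjects.add(cik)
    return filers, subjects

def flag_new_filers(records):
    """Yield alerts when a previously unseen third party files against a target."""
    known = {}  # subject CIK -> third-party filer CIKs seen so far
    alerts = []
    ordered = sorted(records, key=lambda m: datetime.fromisoformat(m["filedAt"]))
    for metadata in ordered:
        filers, subjects = roles(metadata)
        for subject in subjects:
            seen = known.setdefault(subject, set())
            third_party = filers - {subject}
            newcomers = third_party - seen
            if seen and newcomers:
                alerts.append((metadata["accessionNo"], subject, sorted(newcomers)))
            seen |= third_party
    return alerts

def _filing(accession, filed_at, filer_cik):
    # Hypothetical filing against target CIK 20.
    return {
        "accessionNo": accession,
        "filedAt": filed_at,
        "entities": [
            {"cik": filer_cik, "companyName": "Bidder (Filed by)"},
            {"cik": "20", "companyName": "Target Corp (Subject)"},
        ],
    }

sample_stream = [
    _filing("acc-1", "2025-03-01T08:00:00-05:00", "10"),
    _filing("acc-2", "2025-03-05T08:00:00-05:00", "10"),
    _filing("acc-3", "2025-03-09T08:00:00-05:00", "30"),
]
print(flag_new_filers(sample_stream))
```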
Dataset Index JSON API: https://api.sec-api.io/datasets/form-425-files.json
This endpoint returns metadata describing the Form 425 Files dataset, including its name, last update timestamp, earliest sample date (2000-01-01), covered form types (425), container format (ZIP), included file types (TXT, JSON, HTML, PDF), the full dataset download URL, and the list of monthly container files. Each container entry includes its key (e.g., 2025/2025-05.zip), size, record count, last updated timestamp, and per-container download URL. The index can be polled to monitor which containers changed in the most recent refresh run, enabling incremental downloads of only the containers that have been updated. No API key is required to read the index.
{
  "datasetId": "1f13365b-9ae0-68e8-b024-63ccb2249299",
  "datasetDownloadUrl": "https://api.sec-api.io/datasets/form-425-files.zip",
  "name": "Form 425 Files Dataset",
  "updatedAt": "2026-04-23T02:56:30.711Z",
  "earliestSampleDate": "2000-01-01",
  "totalRecords": 133787,
  "totalSize": 2176724564,
  "formTypes": ["425"],
  "containerFormat": "ZIP",
  "fileTypes": ["TXT", "JSON", "HTML", "PDF"],
  "containers": [
    {
      "downloadUrl": "https://api.sec-api.io/datasets/form-425-files/2025/2025-05.zip",
      "key": "2025/2025-05.zip",
      "size": 13818783,
      "records": 154,
      "updatedAt": "2026-04-23T02:56:30.711Z"
    }
  ]
}
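Polling the index for incremental downloads can be sketched as follows. This is an illustrative outline using only the Python standard library and the key and updatedAt fields shown in the sample response; fetch_index and changed_containers are hypothetical helper names, and where the previous index snapshot is stored is left to the caller.

```python
import json
from urllib.request import urlopen

INDEX_URL = "https://api.sec-api.io/datasets/form-425-files.json"


def fetch_index(url: str = INDEX_URL) -> dict:
    """Download and parse the dataset index (no API key needed for the index)."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def changed_containers(previous: dict, current: dict) -> list[dict]:
    """Return container entries that are new or whose updatedAt differs."""
    old = {c["key"]: c["updatedAt"] for c in previous.get("containers", [])}
    return [
        c for c in current.get("containers", [])
        if old.get(c["key"]) != c["updatedAt"]
    ]
```

Comparing updatedAt for equality rather than ordering avoids any assumption about timestamp format; a container counts as changed whenever its value differs from the stored snapshot.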
Download Entire Dataset: https://api.sec-api.io/datasets/form-425-files.zip?token=YOUR_API_KEY
Downloads the complete Form 425 Files dataset as a single ZIP archive covering all filings from 2000 to present. This endpoint requires a valid API key passed via the token query parameter.
Download Single Container: https://api.sec-api.io/datasets/form-425-files/2025/2025-05.zip?token=YOUR_API_KEY
Downloads one monthly container ZIP file instead of the full archive. Container paths follow the <YEAR>/<YEAR>-<MONTH>.zip pattern shown under each container's key in the index response. Each container decompresses into per-accession folders, where every folder contains a metadata.json file alongside the filing's primary document and exhibits. This endpoint requires a valid API key passed via the token query parameter.
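As a rough sketch, the container path pattern and the per-accession layout can be handled with the standard library. The helpers container_url and iter_metadata are hypothetical, and the assumption that each metadata.json sits directly inside its accession folder (possibly under a parent folder in the archive) should be checked against a real container.

```python
import io
import json
import zipfile
from urllib.parse import urlencode

BASE_URL = "https://api.sec-api.io/datasets/form-425-files"


def container_url(year: int, month: int, token: str) -> str:
    """Build the download URL for one monthly container, e.g. 2025/2025-05.zip."""
    key = f"{year:04d}/{year:04d}-{month:02d}.zip"
    return f"{BASE_URL}/{key}?{urlencode({'token': token})}"


def iter_metadata(source):
    """Yield (accession folder name, parsed metadata.json) for each record.

    `source` is a path or a binary file-like object holding a container ZIP.
    """
    with zipfile.ZipFile(source) as zf:
        for name in zf.namelist():
            if name.endswith("/metadata.json"):
                # The folder that directly contains metadata.json is the accession.
                accession = name.rsplit("/", 2)[-2]
                yield accession, json.loads(zf.read(name))
```

A downloaded container can then be scanned with `for accession, meta in iter_metadata("2025-05.zip"): ...` without extracting it to disk.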
The dataset covers EDGAR submissions with form type 425: written communications filed pursuant to Securities Act Rule 425 in connection with proposed or pending business combinations (mergers, acquisitions, exchange offers, and similar transactions). Amendments bearing form type 425/A are out of scope; only accessions whose form type is exactly 425 are included.
One record is a single EDGAR accession — one submission filed under form type 425. Physically, each record is a directory named for the 18-digit numeric accession number containing a metadata.json descriptor plus the filer-submitted documents extracted from that accession (predominantly .htm/.html, occasionally .txt, rarely .pdf). The record boundary is always the accession, never the transaction, so a single merger typically produces many records over its lifecycle.
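Because the directory name is the accession number with its dashes removed, joining records to other EDGAR sources usually starts by restoring the conventional dashed form (10 digits, 2 digits, 6 digits). A minimal sketch, with dashed_accession as a hypothetical helper name:

```python
import re


def dashed_accession(dirname: str) -> str:
    """Convert an 18-digit accession directory name to EDGAR's dashed form."""
    if not re.fullmatch(r"\d{18}", dirname):
        raise ValueError(f"not an 18-digit accession number: {dirname!r}")
    # Layout: 10-digit submitter ID, 2-digit year, 6-digit sequence number.
    return f"{dirname[:10]}-{dirname[10:12]}-{dirname[12:]}"
```

For the example directory above, `dashed_accession("000119312525132373")` gives `0001193125-25-132373`.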
Form 425 is filed by participants in a business combination whose written communications qualify as a prospectus, offer, or proxy solicitation under Rule 425. Filers include the acquirer or issuer registering securities on Form S-4 or F-4, the target company, SPACs and de-SPAC targets, registered exchange-offer bidders, and affiliated parties such as parent companies, controlling shareholders, and joint bidders. Agents such as financial advisors and proxy solicitors do not file in their own names.
Each Form 425 filing is triggered on the date of first public use of a written communication about the business combination. The deadline is same-day — there is no next-business-day grace period. Form 425 is event-driven rather than periodic, and a single deal can generate many filings across the announcement-to-closing window.
Form 425 is a Securities Act prospectus filing that preserves the Rule 165 and Rule 166 safe harbors for written offers; Form 8-K is an Exchange Act current report filed by a reporting company; DEFA14A is additional soliciting material under Exchange Act Rule 14a-12. A single merger press release is often filed under all three form types at once. A 425-only pull misses formal 8-K Item 1.01 contract disclosures, while an 8-K-only pull misses communications from non-reporting acquirers and supplemental materials filed only as 425.
The dataset is distributed as monthly ZIP containers organized in a <YEAR>/<YEAR>-<MONTH>.zip path pattern. Preserved file types are TXT, JSON, HTML, and PDF. Image attachments (JPG, GIF, PNG) that EDGAR received as GRAPHIC documents are enumerated in each record's metadata.json under documentFormatFiles but are deliberately excluded from the extracted tree; reconstructing the original visual appearance requires fetching the documentUrl values directly from sec.gov.
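To see which images a record is missing, the documentFormatFiles entries can be filtered for graphics. The sketch below assumes each entry carries type and documentUrl fields as described above; missing_graphics is a hypothetical helper, and the file-extension fallback is an added assumption for entries whose type is not literally GRAPHIC.

```python
IMAGE_SUFFIXES = (".jpg", ".jpeg", ".gif", ".png")


def missing_graphics(metadata: dict) -> list[str]:
    """List documentUrl values for image attachments absent from the extracted tree."""
    urls = []
    for doc in metadata.get("documentFormatFiles", []):
        url = doc.get("documentUrl", "")
        is_graphic = doc.get("type", "").upper() == "GRAPHIC"
        looks_like_image = url.lower().endswith(IMAGE_SUFFIXES)
        if url and (is_graphic or looks_like_image):
            urls.append(url)
    return urls
```

The returned URLs point at sec.gov, so any fetching step should follow the SEC's own access guidelines; that step is deliberately left out of the sketch.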
The dataset begins in January 2000, the month in which the SEC's 1999 M&A communications release (Rel. No. 33-7760, effective January 24, 2000) took effect and established the Rule 165/Rule 425 regime. Coverage extends to the present and is refreshed with new monthly containers; the dataset index JSON can be polled to detect which containers have changed since the previous refresh.