The Form 424B4 Files Dataset is a complete corpus of final, priced prospectuses filed with the U.S. Securities and Exchange Commission under Rule 424(b)(4) of the Securities Act of 1933. Each record represents one Form 424B4 submission accepted by EDGAR — the canonical legal selling document of a registered offering, carrying the public offering price, final share count, underwriting discounts and commissions, net proceeds, and any substantive changes from the preliminary (red-herring) prospectus. Filings are made by the issuer-registrant of the underlying Securities Act registration statement (most often Form S-1, S-3, F-1, or F-3) once the SEC has declared that registration statement effective and the offering has priced.
The dataset is delivered as monthly ZIP containers, with each container holding accession-number folders that contain a structured metadata.json envelope plus the primary 424B4 prospectus document (typically a single self-contained .htm/.html file, occasionally .txt for very early ASCII filings, and rarely .pdf). Coverage begins with the earliest EDGAR 424B4 submissions in January 1994 and continues to the present, spanning IPOs, follow-on equity offerings, registered direct offerings, rights offerings, resale registrations, debt and convertible offerings, and fund and variable-product prospectuses.
Programmatically retrieve the full list of dataset archive files, download URLs and dataset metadata.
Dataset Index JSON API
Download the entire dataset as a single archive file.
Download Entire Dataset:
Download a single container file (e.g. monthly archive) from the dataset.
Download Single Container:
The dataset is scoped strictly to Form 424B4 accessions: the definitive prospectus filed under Rule 424(b)(4) after the SEC has declared the underlying registration statement effective with pricing-related information omitted under Rule 430A or Rule 430B. The (b)(4) designation signals that the final prospectus both supplies the previously omitted pricing information (the subject of Rule 424(b)(1)) and reflects a substantive change from, or addition to, the last form of prospectus on file (the subject of Rule 424(b)(3)). In substance, the 424B4 is the canonical legal selling document for the offering, the version that controls Section 11 and Section 12 liability under the Securities Act, and is used for IPOs, follow-on equity offerings, registered direct offerings, rights offerings, resale registrations on behalf of selling stockholders, debt offerings, convertible-security offerings, and fund/variable-product prospectuses.
The dataset materialises two artifacts per record: a metadata.json envelope describing the filing exactly as EDGAR received it, and the primary 424B4 prospectus document carrying the full narrative, tabular, and legal text of the priced prospectus. Records are delivered inside monthly ZIP containers keyed by year and month (for example, 2025/2025-11.zip). When unzipped, each record sits inside a top-level month folder (2025-11/) whose immediate children are the per-accession folders. Image attachments (GRAPHIC documents such as JPEG/PNG illustrations, charts, organisational diagrams, and issuer logos) referenced inside the metadata are inventoried but not materialised on disk; they remain accessible via the EDGAR archive URLs carried in each metadata entry. The file types present in the dataset are TXT, JSON, PDF, and HTML; Form 424B4 does not carry an XBRL instance, so XBRL data files are absent.
A single record in the Form 424B4 Files Dataset corresponds to one Form 424B4 submission accepted by EDGAR, identified by its accession number. On disk a record is a folder named after the 18-digit accession number with no dashes, and it contains exactly two artifacts: a metadata.json envelope describing the filing, and the primary 424B4 prospectus document — almost always a single self-contained .htm/.html file (rarely .txt for very old filings). The JSON gives a structured handle on filer identity, document inventory, and timestamps; the prospectus document carries the full narrative, tabular, and legal text of the priced prospectus exactly as it was filed on EDGAR.
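The month/accession/file layout described above can be walked without unpacking the container. The sketch below is a minimal illustration, assuming every member path has exactly three components (month folder, accession folder, filename); the function names are hypothetical and not part of the dataset.

```python
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath


def group_members(names):
    """Group ZIP member paths by accession folder.

    Assumes the layout <month>/<accession>/<file>. Returns
    {accession: {"metadata": path or None, "prospectus": [paths]}}.
    """
    records = defaultdict(lambda: {"metadata": None, "prospectus": []})
    for name in names:
        parts = PurePosixPath(name).parts
        if name.endswith("/") or len(parts) != 3:
            continue  # skip directory entries and unexpected depths
        _month, accession, filename = parts
        if filename == "metadata.json":
            records[accession]["metadata"] = name
        else:
            records[accession]["prospectus"].append(name)
    return dict(records)


def list_container(zip_path):
    """Open one monthly container (caller-supplied path) and group its members."""
    with zipfile.ZipFile(zip_path) as zf:
        return group_members(zf.namelist())
```

Calling `list_container("2025-11.zip")` on a downloaded container would return one entry per accession folder. Keeping the prospectus paths as a list makes it easy to spot any folder that does not hold exactly one document.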
Each accession folder beneath the month directory holds:
metadata.json: the structured filing envelope, always present.
The primary 424B4 prospectus document (.htm/.html, occasionally .txt for legacy ASCII filings), whose filename is whatever the filer or its EDGAR agent submitted. Common filename shapes include Donnelley/Edgar Agents-style (d60114d424b4.htm), Toppan Merrill-style (tm2512009-13_424b4.htm), EGS/EdgarAgents-style (ea0267241-424b4_endra.htm), Issuer Direct boilerplate (form424b4.htm), and issuer-specific names (apvo-2025-424b4_-_eloc2.htm, project_grace_424b4.htm, f424b4_110625.htm).
Image attachments (the JPEG/PNG illustrations, charts, organisational diagrams, and issuer logos that the filer also submitted as GRAPHIC documents) are not present in the ZIP. They are inventoried inside metadata.json but are not materialised locally; <IMG SRC="..."> tags inside the prospectus HTML therefore resolve only against the original EDGAR archive URLs, not against the local folder.
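Because the images live only on EDGAR, one way to render a prospectus offline-friendly is to rewrite each image reference to the absolute URL inventoried in metadata.json. The following is a rough sketch, assuming image filenames in the HTML match the basename of a GRAPHIC entry's documentUrl; the function name is hypothetical.

```python
import re
from posixpath import basename

# Capture the tag prefix up to src=, the quote character, and the src value.
IMG_SRC = re.compile(
    r"""(<img\b[^>]*?\bsrc\s*=\s*)(["'])(.*?)\2""",
    re.IGNORECASE | re.DOTALL,
)


def absolutize_images(html, document_format_files):
    """Point <IMG SRC> values at the EDGAR URLs listed in metadata.json."""
    url_by_name = {
        basename(entry["documentUrl"]).lower(): entry["documentUrl"]
        for entry in document_format_files
        if entry.get("type") == "GRAPHIC" and entry.get("documentUrl")
    }

    def swap(match):
        prefix, quote, src = match.groups()
        url = url_by_name.get(basename(src).lower())
        # Leave the reference untouched when no GRAPHIC entry matches.
        return f"{prefix}{quote}{url or src}{quote}"

    return IMG_SRC.sub(swap, html)
```

References with no matching GRAPHIC entry are left unchanged, so the function is safe to run over every record.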
metadata.json is a flat JSON object describing the filing as EDGAR received it. The fields are consistent across records:
formType: always "424B4" for this dataset.
accessionNo: the dashed accession number (e.g. "0001193125-25-288774"); the surrounding folder name is the digits-only form of the same identifier.
filedAt: ISO 8601 timestamp with EDGAR's eastern-time offset, marking the moment EDGAR accepted the filing.
description: human-readable form description, typically "Form 424B4 - Prospectus [Rule 424(b)(4)]".
linkToFilingDetails: direct URL to the primary prospectus document on EDGAR.
linkToTxt: URL to the full SGML-wrapped submission text bundle.
linkToHtml: URL to the EDGAR filing-index page.
linkToXbrl: empty for 424B4 prospectuses; the form does not carry an XBRL instance.
id: internal record identifier (md5-shaped).
documentFormatFiles: ordered array of every document EDGAR received with the submission.
dataFiles: empty for 424B4 (there are no associated XBRL data files).
seriesAndClassesContractsInformation: populated for fund and variable-product 424B4s with series and class identifiers; empty for operating-company prospectuses.
entities: array of registrant and co-filer descriptors.
documentFormatFiles
Each entry describes one attached document with a stringified 1-based sequence, a byte size, an absolute documentUrl on www.sec.gov/Archives/edgar/..., a description, and an EDGAR type. For 424B4 submissions the dominant types are 424B4 for the prospectus itself (whose description may be either the type literal "424B4" or "PROSPECTUS") and GRAPHIC for embedded JPEG/PNG illustrations referenced from the prospectus HTML. The trailing entry, with a single space " " as sequence, is a synthetic pointer to the EDGAR full-submission text bundle (the .txt file). The list is therefore a complete inventory of what EDGAR holds; the dataset materialises only the 424B4 document.
entities
Each entry corresponds to a registrant or co-filer recorded in the EDGAR header and carries: cik, companyName (with an inline role suffix in parentheses such as (Filer) or (Subject)), fileNo (the registration file number, e.g. 333-290663, that ties the 424B4 back to its underlying S-1/F-1/S-3), sic (EDGAR SIC code with its short label, such as "6331 Fire, Marine & Casualty Insurance" or "2834 Pharmaceutical Preparations"), tickers (array of trading symbols, with multi-class issuers listing each class, e.g. ["GLIBA","GLIBK","GLIBR","GLIBB"]), irsNo (EIN), stateOfIncorporation (which may be omitted for some entities), fiscalYearEnd in MMDD format, act (typically "33" for the Securities Act of 1933), type (form type for that entity), and filmNo (EDGAR film number). Together cik and fileNo uniquely tie the priced prospectus to a specific registration statement.
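A short sketch of reading the envelope follows. It uses only the field names listed above; the helper names and the shape of the returned summary are illustrative assumptions, and optional fields are read defensively because some (such as stateOfIncorporation) may be absent.

```python
import json
import re
from datetime import datetime

# Trailing role suffix in companyName, e.g. "Example Corp (Filer)".
ROLE_SUFFIX = re.compile(r"\s*\(([^()]*)\)\s*$")


def summarize(meta):
    """Build a small summary dict from a parsed metadata.json object."""
    docs = meta.get("documentFormatFiles", [])
    # Select the prospectus by EDGAR type, not by filename pattern.
    prospectus = next((d for d in docs if d.get("type") == "424B4"), None)
    entities = []
    for ent in meta.get("entities", []):  # a list: never assume one entity
        name = ent.get("companyName", "")
        role = ROLE_SUFFIX.search(name)
        entities.append({
            "cik": ent.get("cik"),
            "name": ROLE_SUFFIX.sub("", name),
            "role": role.group(1) if role else None,
            "fileNo": ent.get("fileNo"),
            "tickers": ent.get("tickers", []),
        })
    return {
        "accession_folder": meta["accessionNo"].replace("-", ""),
        "filed_at": datetime.fromisoformat(meta["filedAt"]),
        "prospectus_url": prospectus.get("documentUrl") if prospectus else None,
        "is_fund": bool(meta.get("seriesAndClassesContractsInformation")),
        "entities": entities,
    }


def load_summary(path):
    """Read metadata.json from a caller-supplied path and summarize it."""
    with open(path, encoding="utf-8") as fh:
        return summarize(json.load(fh))
```

The `is_fund` flag simply tests whether seriesAndClassesContractsInformation is non-empty, which is the signal the documentation gives for fund and variable-product prospectuses.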
Every prospectus file is a single self-contained document wrapped in the SGML envelope EDGAR uses inside .txt submissions. The opening lines are a fixed wrapper of the form:
<DOCUMENT>
<TYPE>424B4
<SEQUENCE>1
<FILENAME>...
<DESCRIPTION>PROSPECTUS
<TEXT>
<HTML> ... </HTML>
</TEXT>
</DOCUMENT>
Inside the SGML wrapper sits the full HTML body of the prospectus. The body is paginated for print using embedded <!-- Field: Page; Sequence: N --> / <!-- Field: /Page --> comments and <DIV STYLE="break-before: page; ..."> constructs. Different filer toolchains (Edgar Agents, Toppan Merrill, Donnelley/edgarfilings, Issuer Direct, EGS) leave distinguishable markup signatures, but the section structure is essentially the same across providers.
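An extraction pipeline will usually want the header tags and the HTML body separately, with the print-pagination comments removed. The sketch below assumes the wrapper looks exactly like the fixed form shown above and that page markers follow the "Field: Page" comment pattern; other toolchains may use different markers, so treat the regular expressions as a starting point.

```python
import re

TEXT_BODY = re.compile(r"<TEXT>(.*?)</TEXT>", re.IGNORECASE | re.DOTALL)
HEADER_TAG = re.compile(
    r"^<(TYPE|SEQUENCE|FILENAME|DESCRIPTION)>(.*)$", re.MULTILINE
)
PAGE_MARKER = re.compile(
    r"<!--\s*Field:\s*/?Page\b.*?-->", re.IGNORECASE | re.DOTALL
)


def unwrap(raw):
    """Split a wrapped prospectus file into (header dict, body text)."""
    match = TEXT_BODY.search(raw)
    if match is None:
        return {}, raw.strip()  # no wrapper found: return the input as body
    head = raw[: match.start()]
    header = {key: value.strip() for key, value in HEADER_TAG.findall(head)}
    return header, match.group(1).strip()


def strip_page_markers(body):
    """Remove '<!-- Field: Page ... -->' and '<!-- Field: /Page -->' comments."""
    return PAGE_MARKER.sub("", body)
```

Checking `header.get("TYPE") == "424B4"` after unwrapping is one way to confirm that a file is the prospectus without relying on its filename.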
A 424B4 prospectus is a continuous document, navigated via a hyperlinked TABLE OF CONTENTS near the front and anchored throughout with <A NAME="..."> targets. The canonical sections, in approximate order, are:
Cover page and offering terms. The first HTML page carries the EDGAR cover-page boilerplate ("Filed Pursuant to Rule 424(b)(4)", the registration number, the issuer name, the security and amount being offered) followed by a cover-page pricing table. For an underwritten cash offering the cover table presents, per share and per total, the public offering price, underwriting discount and commissions, and net proceeds to the issuer or to selling stockholders, with footnotes describing the underwriters' option to purchase additional shares (commonly 30 days, up to 15% of the base deal). For resale 424B4s the cover instead identifies the selling stockholders, the share categories being registered (PIPE shares, pre-funded warrant shares, common warrant shares, placement-agent warrant shares, etc.), the listed exchange and trading symbol, and the most recent reported sale price; for rights offerings it identifies the rights mechanics, exercise price, expiration date, and oversubscription privilege.
About this prospectus and cautionary note on forward-looking statements. Short framing pages describing how to use the document and setting out the standard PSLRA-style forward-looking statements safe-harbor language.
Prospectus Summary. A condensed overview of the issuer, the offering, and the strategic rationale; full-form on IPOs and follow-on offerings, often abbreviated or omitted on resale registrations and rights offerings.
The Offering. A stylised table-form summary box restating in catalog form the number and type of shares offered, who is offering them, the offering terms, use of proceeds, the listing exchange and ticker, the post-offering shares-outstanding count, and a pointer to the Risk Factors page.
Risk Factors. Typically anchored as <A NAME="a_004"> or similar and introduced by a short italic lead-in. On larger deals this is a multi-page enumeration of business, industry, regulatory, financial, and securities-related risks, sometimes running to dozens of pages and (since the 2020 amendments to Item 105 of Regulation S-K) prefaced by a summary of risk factors and grouped by category when the section exceeds 15 pages. On smaller deals — especially smaller-reporting-company resale registrations — Risk Factors may be brief and explicitly incorporate by reference the Risk Factors of the issuer's most recent 10-K and subsequent 10-Q filings.
Use of Proceeds. Identifies the dollar amount the issuer expects to receive net of underwriting discounts and offering expenses and how it intends to deploy that capital (working capital, capital expenditures, debt repayment, acquisitions, R&D). On resale 424B4s this section instead states that the issuer will receive no proceeds from the resale itself, sometimes noting that the issuer will receive proceeds from any cash exercise of warrants underlying the resale shares. On rights offerings it states the intended use of the rights-offering proceeds.
Capitalization. A tabular presentation of the issuer's cash and capitalization as of a recent balance-sheet date on an actual basis and on an as-adjusted basis giving effect to the offering. Standard for IPOs and primary follow-ons; routinely omitted on resale registrations and rights offerings.
Dilution. An IPO-specific arithmetic schedule comparing the public offering price per share to pro forma net tangible book value per share before and after the offering, quantifying the per-share dilution to new investors.
Management's Discussion and Analysis of Financial Condition and Results of Operations. Present in full on IPO 424B4s, where the prospectus must be a stand-alone disclosure document; typically incorporated by reference from periodic reports on follow-on prospectuses where Form S-3 eligibility permits incorporation.
Business. Description of the issuer's operations, products or services, markets, competition, intellectual property, regulatory environment, employees and human capital, properties, and material legal proceedings; full-form on IPOs, frequently incorporated by reference on follow-ons.
Management. Directors and executive officers with biographies, board structure, committee composition, and director independence; corporate-governance disclosures; executive-compensation summary, including (where applicable) pay-versus-performance, clawback-policy, and Item 105/Item 106 cybersecurity disclosures reflected by reference or in summary.
Principal and (where applicable) Selling Stockholders. Beneficial ownership tables identifying 5%-or-greater holders, directors, and officers, and — on resale 424B4s — the named selling stockholders together with shares owned before the offering, shares offered, and shares owned after the offering.
Description of Capital Stock (or Description of Securities / Description of Notes). Authorised and outstanding share counts, par value, voting rights, dividend rights, liquidation preferences, anti-takeover provisions, exchange listing, and transfer agent. For debt or convertible offerings this is replaced or supplemented by a Description of Notes / Description of the Offered Securities covering interest rate, maturity, ranking, covenants, conversion mechanics, redemption, and events of default. For multi-class structures each class is described separately.
Material U.S. Federal Income Tax Considerations. Tax-counsel disclosure addressing U.S. holders, non-U.S. holders, withholding, information reporting, and (on certain offerings) FATCA.
Underwriting / Plan of Distribution. On underwritten deals an UNDERWRITING section names the syndicate, sets out the firm-commitment or best-efforts structure, the underwriters' option to purchase additional shares, lock-up agreements with directors, officers, and major holders, indemnification, stabilisation and short-position mechanics, FINRA disclosures, and electronic distribution. On resale 424B4s a PLAN OF DISTRIBUTION section instead enumerates permitted resale methods (ordinary brokerage transactions, block trades, at-the-market sales through broker-dealers, privately negotiated transactions, exchange or off-exchange transactions, settlement of short sales) and the issuer's expense-reimbursement arrangement. Rights-offering 424B4s describe the subscription rights, oversubscription privilege, dealer-manager arrangement, and settlement mechanics.
Legal Matters and Experts. Names of issuer counsel, underwriters' counsel, and the independent registered public accounting firm whose audit report on the financial statements is referenced.
Where You Can Find More Information / Incorporation by Reference. Pointers to EDGAR and, on shelf-eligible 424B4s, an explicit list of incorporated periodic reports, current reports, and proxy materials.
Financial statements. On IPO 424B4s, audited annual statements (balance sheets, statements of operations, comprehensive income, stockholders' equity, and cash flows) with notes and the auditor's report, plus unaudited interim statements where applicable; on follow-on shelf prospectuses the statements are typically incorporated by reference rather than reproduced in full. Where present, the statements are rendered as HTML tables.
Back matter. Signature and date block and any final notices.
For fund and variable-insurance-product 424B4s (signalled by a non-empty seriesAndClassesContractsInformation), the section labels follow the relevant fund-prospectus item structure rather than the S-K-driven order: fee tables, investment objectives, principal investment strategies, principal risks, performance, portfolio management, purchase and sale information, tax information, payments to broker-dealers, and similar items.
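Since sections are reached through a hyperlinked table of contents and <A NAME="..."> targets, a section index can be recovered by pairing each TOC link's text with the anchor it points to. The following is a minimal sketch using only the standard library; it assumes TOC links use HREF="#id" fragments and keeps the first label seen per anchor, which may not hold for every filer toolchain.

```python
from html.parser import HTMLParser


class TocParser(HTMLParser):
    """Collect <A HREF="#id"> link text and <A NAME="id"> targets."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.labels = {}    # anchor id -> label text from the first TOC link
        self.targets = []   # anchor ids in document order
        self._open = None   # anchor id of the link currently being read
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)  # attribute names arrive lowercased
        href = attr.get("href") or ""
        if href.startswith("#") and len(href) > 1:
            self._open, self._buf = href[1:], []
        name = attr.get("name") or attr.get("id")
        if name:
            self.targets.append(name)

    def handle_data(self, data):
        if self._open is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._open is not None:
            label = " ".join("".join(self._buf).split())
            if label:
                self.labels.setdefault(self._open, label)
            self._open = None


def section_index(html):
    """Return [(anchor_id, label)] for anchors that a TOC link points to."""
    parser = TocParser()
    parser.feed(html)
    parser.close()
    return [(t, parser.labels[t]) for t in parser.targets if t in parser.labels]
```

Resolving labels from the TOC link rather than from the anchor string matters because anchor IDs such as a_004 are opaque and can differ from one filing to the next.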
A record materialises:
The JSON envelope (metadata.json) with form type, accession number, filing timestamp, EDGAR document inventory, registration file number, filer and co-filer identity, SIC, tickers, IRS number, state of incorporation, fiscal year end, film number, and (for fund filers) series-and-class identifiers.
The primary 424B4 prospectus document, exactly as it was filed on EDGAR.
GRAPHIC image files referenced in metadata.json -> documentFormatFiles are not stored in the ZIP. These are the embedded JPEG/PNG illustrations, organisational charts, product photos, maps, and issuer logos that the prospectus HTML references; they remain accessible only via the original EDGAR archive URLs carried in each documentFormatFiles entry. A single IPO 424B4 may list one prospectus document plus dozens of GRAPHIC entries; only the prospectus is materialised.
The trailing entry in documentFormatFiles is a reference to the EDGAR full-submission .txt bundle and is not stored as a separate artifact in the record.
The structural skeleton of a 424B4 prospectus has been stable since the form's introduction, but several SEC rule changes have shifted what appears inside it.
Form 424B4 filings span EDGAR from January 1994 to the present, and the on-disk format of the prospectus document evolved across that span:
Legacy ASCII filings: plain text with <S> / <C> column markers. Tables are rendered as fixed-width ASCII; there are no embedded images or hyperlinks. The SGML <DOCUMENT> / <TYPE>424B4 / <TEXT> envelope is already present, but the body is text rather than HTML.
HTML filings: HTML tables (<TABLE>), hyperlinked tables of contents with <A NAME> anchors, typographic styling, and embedded GRAPHIC references for illustrations and logos.
Modern HTML filings: print pagination via <DIV STYLE="break-before: page"> and toolchain-specific page-marker comments, frequent use of <TABLE> for capitalization, dilution, beneficial-ownership, and selling-stockholder schedules, and consistent cross-referencing via anchor IDs.
The dataset preserves each file as filed, so the era's formatting conventions are visible in each record. The file types found in the dataset are TXT, JSON, PDF, and HTML: TXT for legacy ASCII prospectuses, JSON for the per-record metadata envelope, HTML for the modern prospectus body, and PDF for rare submissions in which a filer attached a PDF rendition. Form 424B4 does not carry an XBRL instance, and linkToXbrl is correspondingly empty for every record.
The entities array can contain multiple registrants: parent and finance-subsidiary co-issuers, multi-class share structures with multiple tickers per issuer, or selling-shareholder entities tagged (Subject). Consumers should treat the array as a list rather than picking only the first entry.
Because GRAPHIC files are excluded, <IMG SRC="..."> tags inside the HTML resolve only against the EDGAR archive URL pattern, not against the local folder. Rendering the HTML offline therefore shows broken images while preserving all textual and tabular content.
Prospectus filenames are chosen by the filer. To locate the prospectus document, rely on the one non-metadata.json file in the folder, on the documentFormatFiles entry whose type is "424B4", or on the <TYPE>424B4 SGML header, rather than on filename pattern matching.
Page-marker comments and <DIV>-based page constructs are presentation-layer artifacts and should be stripped or treated as section delimiters, not as content, by extraction pipelines.
Section anchors are <A NAME="..."> targets (often opaque IDs such as a_001, a_004); the readable section name typically appears in the immediately surrounding <P>, <TR>, or <FONT> element rather than in the anchor itself, so extractors should resolve section boundaries from the table-of-contents pairing of anchor and label rather than from the anchor string.
When seriesAndClassesContractsInformation is non-empty, the prospectus is a fund or variable-insurance-product 424B4 whose internal structure (fee tables, investment objectives, principal investment strategies, principal risks, performance) differs materially from the operating-company anatomy described above; the SGML wrapper and JSON envelope are identical, but section labels follow fund-prospectus item conventions.
The filer of every Form 424B4 is the issuer-registrant of the underlying Securities Act registration statement. The dataset's filer population consists of entities that have an effective registration statement on file and are conducting a registered public offering whose final pricing terms were omitted from the registration statement under Rule 430A or Rule 430B.
In practice, that population includes:
The legal filer is always the registrant. Underwriters set the price and sell the securities, and selling security holders may be the economic sellers, but neither files the prospectus on EDGAR. They appear inside the document, not on the cover as filer of record.
Outside this population: private issuers relying on Regulation D, Regulation S, Rule 144A, or Section 4(a)(2) (no Securities Act registration, therefore no Rule 424 prospectus); and reporting companies whose only EDGAR activity is Exchange Act periodic reporting without a contemporaneous registered offering.
Form 424B4 is event-driven, not periodic. A record exists when three conditions align:
Filing deadline. Rule 424(b) requires the prospectus to be filed no later than the second business day following the earlier of (i) the date the prospectus is first used after effectiveness, or (ii) the date the offering price is determined.
Typical IPO cadence. The issuer prices after the close on day T, trading opens on T+1, and the 424B4 is filed on the morning of T+1 (or by T+2 at the latest) so the final statutory prospectus is publicly available at or before first sales.
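The deadline rule above reduces to simple date arithmetic: take the earlier of the pricing date and the first-use date, then count forward two business days. The sketch below treats weekends as non-business days and leaves the holiday calendar to the caller, which is a simplifying assumption; the function names are hypothetical.

```python
from datetime import date, timedelta


def add_business_days(start, n, holidays=frozenset()):
    """Return the date n business days after start (weekends and holidays skipped)."""
    current = start
    while n > 0:
        current += timedelta(days=1)
        if current.weekday() < 5 and current not in holidays:
            n -= 1
    return current


def filing_deadline(priced_on, first_used_on, holidays=frozenset()):
    """Second business day after the earlier of pricing and first use."""
    trigger = min(priced_on, first_used_on)
    return add_business_days(trigger, 2, holidays)


# Priced Thursday 2025-11-06, first used Friday 2025-11-07.
deadline = filing_deadline(date(2025, 11, 6), date(2025, 11, 7))
```

In this example the trigger is the Thursday pricing date, so the two business days counted are Friday and the following Monday, and `deadline` is 2025-11-10.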
Each priced offering generally produces one 424B4 filing. Additional 424B4 or 424B3 submissions may follow if the offering is upsized, repriced, or otherwise materially updated; each is a separate EDGAR submission with its own accession number.
Rule 424(b) routes post-effectiveness prospectuses to several paragraphs. The defining features of paragraph (b)(4) are the combination of:
That combination distinguishes 424(b)(4) from neighboring codes:
424(b)(4) is therefore the standard EDGAR code for an IPO issuer the morning after pricing, and for many priced follow-on offerings off an S-1 or non-automatic S-3 where additional substantive disclosure accompanies the omitted pricing data.
The Rule 424 filing obligation traces to the Securities Act of 1933; Rule 430A, which makes the (b)(4) code necessary by allowing pricing to be omitted at effectiveness, was adopted in 1987. Mandatory EDGAR filing of Securities Act prospectuses was phased in between 1993 and May 1996 under Regulation S-T. The earliest 424B4 records in this dataset accordingly date from 1994; coverage is effectively complete for domestic registrants from May 1996 onward and for foreign private issuers from the early 2000s.
Form 424B4 sits inside a tightly related family of Rule 424(b) prospectus variants, the registration statements they derive from, and the marketing documents filed alongside them. These neighbors share long blocks of identical text and are often filed in close sequence, but they differ in legal trigger, timing within the offering process, and what information they finalize.
The 424B4 prospectus is functionally a subset of the S-1 it points back to: it republishes the prospectus portion with previously omitted Rule 430A pricing terms filled in. S-1 and Form S-1/A are broader, including Part II content (unregistered sales, exhibits, undertakings, signatures) and presenting price, share count, and underwriting spread as ranges or placeholders. 424B4 is narrower, post-effective, and definitive, locking in the actual offering price, final share count, and underwriting discount.
424B1 is the final prospectus filed under Rule 424(b)(1) when its only new content is the pricing information omitted at effectiveness under Rule 430A. 424B4 is used when the final prospectus supplies that Rule 430A information and also makes other substantive changes or additions, the common pattern for traditional IPOs. A given final prospectus is filed under one paragraph or the other, not both.
424B5 is the closest functional sibling: a definitive priced prospectus, but for follow-on offerings off an effective S-3 or F-3 shelf. 424B4 pairs with S-1/F-1 and dominates IPOs and first-time registrants; 424B5 pairs with shelves and dominates seasoned-issuer follow-ons, ATMs, and block trades. A complete priced-equity dataset typically requires both.
Form 424B2 is a thinner supplement adding offering-specific terms on top of an already-effective base prospectus, used heavily for debt and continuous programs. 424B4, by contrast, republishes a full prospectus tied to a 430A-pending S-1.
Form 424B3 updates an existing prospectus for material developments or required updates (Section 10(a)(3)); it generally does not establish the priced terms of an offering. 424B3 filings are shorter, more frequent across an offering's life, and oriented to amendment rather than initial pricing.
424B7 supplements information about selling security holders in resale registrations (secondary distributions by existing holders rather than the issuer). Form 424B8 is the code for a Rule 424(b) prospectus that was not filed within the required time frame and is therefore filed late under paragraph (b)(8). Both differ from 424B4 in trigger and role: neither is the on-time, priced final prospectus of a Rule 430A offering.
FWPs are Rule 433 marketing communications — term sheets, roadshow decks, press releases — filed in the lead-up to pricing. They are not the statutory prospectus and do not establish binding terms. FWP and 424B4 are complementary: FWP captures the pre-pricing trail, 424B4 captures the definitive priced document.
F-1 is the foreign private issuer counterpart to S-1, with extra disclosures on home-country regulation, currency, and financial-statement reconciliation. The 424B4 dataset includes filings tied to both S-1 and F-1 parents; cross-border users should join 424B4 records to their parent F-1 rather than treat foreign filings as a separate corpus.
Form 8-A12B registers a class of securities under the Exchange Act and enables exchange listing. It is typically filed at or just before IPO pricing, paired in time with 424B4 but serving a distinct statutory purpose: 424B4 is the Securities Act prospectus describing the offering, 8-A12B triggers ongoing Exchange Act reporting and listing.
Within the 424(b) family, Form 424B4 is the definitive priced prospectus for offerings that relied on Rule 430A to defer pricing past effectiveness and whose final prospectus also carries other substantive changes, the canonical IPO pathway. It is distinct from 424B1 (Rule 430A information only, with no other substantive change), from 424B2 and 424B5 (shelf rather than S-1/F-1), from 424B3 (update rather than initial pricing), from 424B7 (selling-holder supplements) and 424B8 (late filings), and from FWP (marketing, not statutory). It is a post-effective subset of the S-1 record, focused on locking in the legal and economic terms of the offering at pricing. That makes 424B4 the canonical filing for IPO event studies, underwriting-spread research, and use-of-proceeds analysis, while related datasets (S-1, FWP, 8-A12B, periodic filings) supply context but do not substitute for it.
Form 424B4 is the definitive prospectus that closes out a registered securities offering. It carries the final price, share count, gross spread, syndicate, lockups, dilution math, use of proceeds, risk factors, and audited financials, so it is consumed by capital markets, legal, research, quantitative, and litigation functions.
ECM origination, syndicate, and pricing desks use the corpus as a precedent library for IPOs and follow-ons. They pull cover-page and underwriting fields (final price versus marketed range, primary/secondary split, over-allotment size, gross spread, and bookrunner roster) to build pricing comps, fee benchmarks, and live deal recommendations for pitch books and bake-offs.
Fundamental analysts and IPO desks read 424B4s as the primary source for deal-day decisions. The business overview, KPI tables, capitalization and dilution sections, financial statements, and non-boilerplate risk factors feed initiation notes, valuation models, and allocation requests sized against conviction.
Issuer and underwriter counsel mine the corpus for sector- and structure-specific risk-factor language, plan-of-distribution clauses, use-of-proceeds and dilution wording, EGC/SRC representations, and forum-selection or jury-trial waivers. Diffing the definitive 424B4 against earlier S-1/S-1A versions also surfaces the exact edits driven by SEC review, supporting first drafts, comment responses, and disclosure-committee review.
Plaintiff- and defense-side securities litigators treat the 424B4 as the legally operative offering document for Section 11 and Section 12(a)(2) claims. They map alleged misstatements to specific prospectus passages, compare the 424B4 to later 8-K and 10-Q filings to build corrective-disclosure timelines, and use the underwriter list to identify defendants. The dataset feeds complaints, defense memoranda, damages models, and expert reports.
Forensic accountants compare the audited financials, MD&A discussion, related-party notes, and contingent-liability disclosures inside 424B4 prospectuses against the registrant's later 10-K and 10-Q filings, looking for restatements, segment redefinitions, or going-concern shifts that surface post-IPO.
In-house finance, treasury, and IR teams at private issuers preparing to go public, and at recently public issuers planning follow-ons, benchmark comparable deals: price range, free float, primary-versus-secondary mix, lockup duration, use-of-proceeds language, and dilution per share. The work shapes filing timing, deal structure, and equity-story positioning.
Quant researchers parse the corpus into structured event tables of IPO and follow-on pricing, share counts, and proceeds; run NLP over risk factors and business descriptions to score novelty and similarity; and join new tickers to post-IPO price and volume data to study underpricing, lockup-expiration drift, and underwriter-reputation effects in factor research and systematic strategies.
Staff at supervisory bodies and academics in finance, accounting, and law construct longitudinal offering datasets from 1994 onward to study underpricing, gross-spread clustering at seven percent, hot-issue cycles, and the disclosure effects of regulatory changes such as the JOBS Act, producing working papers and policy reviews.
Broker-dealer compliance officers, syndicate supervisors, and exchange surveillance staff use the plan-of-distribution and lockup sections to verify allocation eligibility, build lockup-expiration calendars, and check whether insider sales fall outside contractual restrictions.
The Form 424B4 corpus supports a small set of high-value workflows where the priced prospectus is the canonical artifact. Each use case below ties to specific record fields or prospectus sections.
Parse the cover-page pricing table on every 424B4 to extract public offering price, base share count, and over-allotment shares, then join on entities[].tickers and entities[].cik to first-day trading data. Diff the final price against the filing-range midpoint disclosed in the parent S-1/S-1A to compute pricing revisions, and aggregate by entities[].sic, filedAt year, and lead-underwriter name parsed from the UNDERWRITING section. The resulting panel feeds underpricing studies, hot-issue cycle analysis, and partial-adjustment tests.
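Once the cover-page price and the filing range have been extracted, the two headline metrics are one-line ratios. The sketch below assumes the conventional definitions (revision relative to the range midpoint, first-day return relative to the offer price); the numbers in the usage lines are made up for illustration.

```python
def price_revision(final_price, range_low, range_high):
    """Final offer price relative to the midpoint of the filing range."""
    midpoint = (range_low + range_high) / 2
    return (final_price - midpoint) / midpoint


def first_day_return(offer_price, first_close):
    """First-day return (underpricing) relative to the offer price."""
    return (first_close - offer_price) / offer_price


# Hypothetical deal: range of $14 to $16, priced at $17, closed at $21.25.
revision = price_revision(17.0, 14.0, 16.0)
underpricing = first_day_return(17.0, 21.25)
```

Here the midpoint is $15, so the revision is 2/15 (about 13.3%) and the first-day return is 4.25/17, or 25%.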
Extract the per-share and aggregate "underwriting discounts and commissions" line from the cover-page pricing table and the syndicate roster from the UNDERWRITING section across every 424B4. Group by deal size buckets, SIC code, and year to build gross-spread comp tables, quantify clustering at seven percent for mid-cap IPOs, and rank bookrunners by deal count and proceeds. ECM desks plug the output directly into pitch-book fee pages and bake-off comps.
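The spread comparison described here can be prototyped in a few lines once per-share figures are in hand. This is a simplified sketch that groups by year only and uses a hypothetical tolerance of 0.01 percentage points for "at seven percent"; a fuller version would also bucket by deal size and SIC code.

```python
from collections import defaultdict


def gross_spread_pct(discount_per_share, price_per_share):
    """Underwriting discount as a percentage of the public offering price."""
    return 100.0 * discount_per_share / price_per_share


def seven_percent_share(deals, tol=0.01):
    """Fraction of deals per year whose spread is within tol points of 7%.

    deals is an iterable of (year, price_per_share, discount_per_share).
    """
    hits, totals = defaultdict(int), defaultdict(int)
    for year, price, discount in deals:
        totals[year] += 1
        if abs(gross_spread_pct(discount, price) - 7.0) <= tol:
            hits[year] += 1
    return {year: hits[year] / totals[year] for year in totals}


# Hypothetical deals: (year, offer price, discount per share).
sample = [(2024, 20.0, 1.40), (2024, 15.0, 0.90), (2025, 10.0, 0.70)]
shares = seven_percent_share(sample)
```

In the sample, one of the two 2024 deals sits at 7% (the other is at 6%) and the single 2025 deal sits at 7%.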
Segment Risk Factors using the <A NAME="a_004">-style anchors and the post-2020 summary-of-risk-factors structure, then run topic modeling and embedding similarity over the resulting corpus. Filter entities[].sic to a sector (e.g., 2834 Pharmaceutical Preparations, 7372 Prepackaged Software) and trace how clinical-trial, supply-chain, AI-model, or cybersecurity risk language enters and evolves year over year. Output supports counsel drafting precedent and quant signals on disclosure novelty.
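A first pass at anchor-based segmentation might look like the sketch below. It assumes sections are delimited by named anchors of the form shown above and strips tags with a simple regular expression; real prospectuses vary widely in markup, so treat the pattern and the helper name as illustrative rather than robust.

```python
import re

# Matches named anchors such as <A NAME="a_004">, case-insensitively.
ANCHOR = re.compile(r'<a\s+name="([^"]+)"[^>]*>', re.IGNORECASE)
TAG = re.compile(r"<[^>]+>")


def split_on_anchors(html: str) -> dict[str, str]:
    """Map each anchor name to the plain text that follows it, up to the next anchor."""
    matches = list(ANCHOR.finditer(html))
    sections = {}
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(html)
        fragment = TAG.sub(" ", html[match.end():end])  # drop remaining tags
        sections[match.group(1)] = " ".join(fragment.split())  # collapse whitespace
    return sections


sample = ('<A NAME="a_003"></A><P>Prospectus Summary</P>'
          '<A NAME="a_004"></A><P>Risk Factors</P><P>We have a history of losses.</P>')
sections = split_on_anchors(sample)
print(sections["a_004"])
```

Which anchor corresponds to Risk Factors differs from filing to filing, so the section would normally be located by its heading text or the table of contents rather than by a fixed anchor name.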
For a defendant issuer, pull the 424B4 by accessionNo and entities[].cik, then mechanically extract every numerical and forward-looking statement from Risk Factors, Business, MD&A, and the audited financials. Cross-reference each statement to subsequent 10-K, 10-Q, and 8-K disclosures to build a corrective-disclosure timeline, and use the UNDERWRITING section to enumerate Section 11 defendants. The mapping table feeds complaint drafting, motion-to-dismiss exhibits, and damages models.
Extract the Use of Proceeds dollar allocations (working capital, debt repayment, R&D, acquisitions) and the Dilution table's pro forma net tangible book value figures from each IPO 424B4. Branch the parser on deal type — underwritten cash, resale registration ("we will not receive proceeds"), or rights offering — using cover-page cues, then aggregate by sector and vintage. The output supports academic studies on capital deployment, IR-team benchmarking for follow-on filings, and lender-side analysis of post-IPO leverage trajectories.
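The branching step could start from a keyword classifier like the one below. Only the "we will not receive proceeds" cue comes from the description above; the rights-offering pattern, the labels, and the fallback to an underwritten cash deal are assumptions for the sake of the sketch.

```python
import re

# Ordered (label, pattern) cues; the first match wins.
CUES = [
    ("resale", re.compile(r"will not receive (any )?(of the )?proceeds", re.IGNORECASE)),
    ("rights_offering", re.compile(r"subscription rights|rights offering", re.IGNORECASE)),
]


def classify_deal(cover_text: str) -> str:
    """Return a coarse deal-type label from cover-page text."""
    for label, pattern in CUES:
        if pattern.search(cover_text):
            return label
    return "underwritten_cash"  # assumed default when no cue matches


print(classify_deal("We will not receive any proceeds from the sale of shares."))
print(classify_deal("We are distributing non-transferable subscription rights."))
print(classify_deal("We are offering 10,000,000 shares of our common stock."))
```

Mixed primary and secondary offerings also state that the issuer receives no proceeds from the selling stockholders' shares, so a production parser would need an additional check before labeling a filing as a pure resale.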
Parse the lockup paragraphs in the UNDERWRITING section to capture lockup duration, covered parties (directors, officers, 5% holders), and carve-outs, then key the schedule by filedAt plus lockup days. Combine with the Principal Stockholders table to identify covered share counts. Compliance and surveillance teams use the resulting calendar to flag Form 4 sales that fall inside contractual restrictions and to anticipate supply overhangs.
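The calendar logic reduces to date arithmetic once the lockup length is known. The sketch below keys the schedule on the filing date plus the lockup days, as described above, and assumes the restriction lapses at the start of the expiry date; the sale records are hypothetical.

```python
from datetime import date, timedelta


def lockup_expiry(filed_on: date, lockup_days: int) -> date:
    """Date on which the lockup is assumed to lapse."""
    return filed_on + timedelta(days=lockup_days)


def sales_inside_lockup(sales: list[tuple[date, str]], filed_on: date,
                        lockup_days: int) -> list[tuple[date, str]]:
    """Return the sales dated on or after filing and before the expiry date."""
    expiry = lockup_expiry(filed_on, lockup_days)
    return [sale for sale in sales if filed_on <= sale[0] < expiry]


# Hypothetical insider sales as (trade date, insider name).
filed = date(2025, 1, 15)
sales = [(date(2025, 6, 1), "Officer A"), (date(2025, 8, 1), "Director B")]
print(lockup_expiry(filed, 180))
print(sales_inside_lockup(sales, filed, 180))
```

Lockup agreements frequently contain early-release and carve-out provisions, so a flagged sale is a prompt for review rather than proof of a breach.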
Extract authorized share counts, multi-class voting structures, classified-board language, forum-selection clauses, jury-trial waivers, and supermajority provisions from the Description of Capital Stock section, keyed by entities[].stateOfIncorporation and IPO year. The resulting panel supports governance research, ISS/Glass Lewis-style scoring, and counsel precedent searches when drafting or negotiating charter provisions for new issuers.
Dataset Index JSON API: https://api.sec-api.io/datasets/form-424b4-files.json
Returns dataset-level metadata (name, description, last updated timestamp, earliest sample date, total records and size, form types, container format, and file types) along with the full dataset download URL and a list of individual monthly container files with per-container size, record counts, updated timestamps, and download URLs. Poll this endpoint to detect which containers were modified during the most recent refresh and download only those that changed. This endpoint does not require an API key.
{
  "datasetId": "1f13365b-9ae0-691e-9e25-f2510da800ec",
  "datasetDownloadUrl": "https://api.sec-api.io/datasets/form-424b4-files.zip",
  "name": "Form 424B4 Files Dataset",
  "updatedAt": "2026-04-28T02:57:27.107Z",
  "earliestSampleDate": "1994-01-01",
  "totalRecords": 13982,
  "totalSize": 2628326650,
  "formTypes": ["424B4"],
  "containerFormat": "ZIP",
  "fileTypes": ["TXT", "JSON", "PDF", "HTML"],
  "containers": [
    {
      "downloadUrl": "https://api.sec-api.io/datasets/form-424b4-files/2026/2026-04.zip",
      "key": "2026/2026-04.zip",
      "size": 13818783,
      "records": 154,
      "updatedAt": "2026-04-28T02:57:27.107Z"
    }
  ]
}
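The polling pattern described for the index endpoint can be sketched as a comparison of each container's updatedAt against the time of the last local sync. The snippet works on an already-parsed index document shaped like the response shown for this endpoint; the function names and the stored last-sync value are assumptions, and fetching the JSON over HTTP is left out.

```python
import json
from datetime import datetime


def parse_ts(value: str) -> datetime:
    """Parse an ISO-8601 timestamp with a trailing 'Z' (UTC)."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def changed_containers(index: dict, last_sync: str) -> list[dict]:
    """Return the containers updated after the given last-sync timestamp."""
    cutoff = parse_ts(last_sync)
    return [c for c in index.get("containers", [])
            if parse_ts(c["updatedAt"]) > cutoff]


# Trimmed copy of the index response, limited to the fields used here.
sample_index = json.loads("""
{
  "updatedAt": "2026-04-28T02:57:27.107Z",
  "containers": [
    {
      "downloadUrl": "https://api.sec-api.io/datasets/form-424b4-files/2026/2026-04.zip",
      "key": "2026/2026-04.zip",
      "records": 154,
      "updatedAt": "2026-04-28T02:57:27.107Z"
    }
  ]
}
""")

for container in changed_containers(sample_index, "2026-04-01T00:00:00Z"):
    print(container["key"], container["downloadUrl"])
```

Persisting the newest updatedAt seen after each run gives the next poll its cutoff.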
Download Entire Dataset: https://api.sec-api.io/datasets/form-424b4-files.zip?token=YOUR_API_KEY
Downloads the complete archive of all Form 424B4 filings from January 1994 to present in a single ZIP file. This endpoint requires an API key.
Download Single Container: https://api.sec-api.io/datasets/form-424b4-files/2026/2026-04.zip?token=YOUR_API_KEY
Downloads a single monthly container ZIP file. Each container holds accession-number subfolders, and each subfolder contains a metadata.json file and the primary 424B4 prospectus document (usually .htm). This endpoint requires an API key.
The dataset covers Form 424B4 — the definitive prospectus filed under Rule 424(b)(4) of the Securities Act of 1933. A 424B4 is filed after the SEC declares the underlying registration statement effective and the offering has priced, supplying the pricing information omitted from the registration statement under Rule 430A or 430B together with any substantive changes from the preliminary prospectus.
One record represents a single Form 424B4 submission accepted by EDGAR, identified by its 18-digit accession number. On disk the record is a folder containing exactly two artifacts: a metadata.json envelope describing the filing and the primary 424B4 prospectus document (typically a single self-contained .htm/.html file, occasionally .txt for legacy ASCII filings, and rarely .pdf).
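EDGAR accession numbers are conventionally written as 18 digits grouped 10-2-6 (filer identifier, two-digit year, sequence number). A small normalizer such as the one below can reconcile dashed and undashed forms before joining records; the function name is hypothetical, and whether folder names in the containers use dashes is not stated here.

```python
import re


def normalize_accession(raw: str) -> str:
    """Return the accession number in dashed 10-2-6 form, or raise ValueError."""
    digits = re.sub(r"\D", "", raw)  # keep digits only
    if len(digits) != 18:
        raise ValueError(f"expected 18 digits, got {len(digits)}: {raw!r}")
    return f"{digits[:10]}-{digits[10:12]}-{digits[12:]}"


print(normalize_accession("000123456726000001"))
print(normalize_accession("0001234567-26-000001"))
```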
The issuer-registrant of the underlying Securities Act registration statement is the legal filer. This includes domestic operating companies filing off Form S-1 or S-3, foreign private issuers filing off Form F-1 or F-3, IPO issuers, follow-on and secondary offering issuers, and issuers in registered resale offerings on behalf of selling security holders. Underwriters and selling stockholders appear inside the prospectus but never file it; the registrant is always the filer of record.
Rule 424(b) requires the prospectus to be filed no later than the second business day following the earlier of (i) the date the prospectus is first used after effectiveness, or (ii) the date the offering price is determined. In a typical IPO the issuer prices after the close on day T, trading opens on T+1, and the 424B4 is filed on the morning of T+1 (or by T+2 at the latest).
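The deadline arithmetic can be illustrated with a short business-day calculation. This sketch counts Monday through Friday only and ignores federal holidays and EDGAR closures, which a real calendar would have to handle; the function names are hypothetical.

```python
from datetime import date, timedelta


def add_business_days(start: date, days: int) -> date:
    """Advance `days` weekdays past `start` (weekends skipped, holidays ignored)."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            remaining -= 1
    return current


def filing_deadline(priced_on: date, first_used_on: date) -> date:
    """Second business day after the earlier of the pricing and first-use dates."""
    return add_business_days(min(priced_on, first_used_on), 2)


# Priced on Thursday 2025-11-06 and first used the next day.
print(filing_deadline(date(2025, 11, 6), date(2025, 11, 7)))
```

Comparing this date with filedAt gives a rough check on filing timeliness, provided the pricing date has been extracted from the prospectus.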
The earliest 424B4 records date from January 1994, when EDGAR began accepting Securities Act prospectuses electronically. Coverage is effectively complete for domestic registrants from May 1996 onward (after Regulation S-T phased in mandatory EDGAR filing) and for foreign private issuers from the early 2000s, and the dataset continues to the present.
The dataset is delivered as monthly ZIP containers keyed by year and month (for example, 2025/2025-11.zip). Inside each container are accession-number folders, each holding a metadata.json envelope and the primary prospectus document. The file types present are TXT, JSON, PDF, and HTML; Form 424B4 does not carry an XBRL instance, so no XBRL data files are included.
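Reading a container can be sketched with the standard zipfile module. The example builds a tiny in-memory archive to stand in for a downloaded monthly ZIP; the folder name, the prospectus file name, and the single metadata field shown are placeholders, and the code assumes only that each accession folder holds a metadata.json next to the prospectus document.

```python
import io
import json
import zipfile
from collections import defaultdict


def iter_records(zf: zipfile.ZipFile):
    """Yield (folder name, metadata dict, document paths) for each accession folder."""
    members_by_folder = defaultdict(list)
    for name in zf.namelist():
        if name.endswith("/"):  # skip explicit directory entries
            continue
        folder = name.rpartition("/")[0]
        members_by_folder[folder].append(name)
    for folder in sorted(members_by_folder):
        members = members_by_folder[folder]
        meta_path = next((m for m in members
                          if m.rpartition("/")[2] == "metadata.json"), None)
        metadata = json.loads(zf.read(meta_path)) if meta_path else {}
        documents = [m for m in members if m != meta_path]
        yield folder.rpartition("/")[2], metadata, documents


# Stand-in for a downloaded container such as 2025/2025-11.zip.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("0001234567-26-000001/metadata.json",
                json.dumps({"accessionNo": "0001234567-26-000001"}))
    zf.writestr("0001234567-26-000001/prospectus.htm",
                "<html><body>Prospectus</body></html>")

with zipfile.ZipFile(buffer) as zf:
    records = list(iter_records(zf))

for folder, metadata, documents in records:
    print(folder, metadata.get("accessionNo"), documents)
```

Grouping by parent directory rather than by a fixed path depth keeps the loop working if the archive nests the accession folders under a year or month prefix.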
Both 424B4 and 424B5 are definitive priced prospectuses, but they usually attach to different parent forms. 424B4 most often pairs with Form S-1 or F-1 and dominates IPOs and first-time registrants where pricing was deferred under Rule 430A. 424B5 most often pairs with the Form S-3 or F-3 shelf and dominates seasoned-issuer follow-ons, ATMs, and block trades. A complete priced-equity dataset typically requires both.
No. The dataset includes only the 424B4 prospectus and its metadata.json envelope. Documents from the underlying registration statement (S-1, F-1, S-3), preliminary prospectuses (424B1/B2/B3), other 424(b) supplements, free writing prospectuses, and any periodic reports incorporated by reference live in their own EDGAR filings and are outside the scope of this dataset.