Form 424B8 Files Dataset

The Form 424B8 Files Dataset is a collection of every late-cured prospectus filed to EDGAR under Rule 424(b)(8) of the Securities Act of 1933 from January 2006 to the present. One record is one EDGAR submission of Form 424B8, identified by an 18-digit accession number, and packaged as a folder containing a metadata.json header, the registrant-supplied 424B8 prospectus HTML, and — when the filer attached one — an Exhibit 107 EX-FILING FEES inline-XBRL HTML. The filer is always the Securities Act registrant whose effective registration statement covers the offering, most often a structured-note finance subsidiary or a bank holding company operating an automatic shelf. The corpus is delivered as monthly ZIP containers (YYYY/YYYY-MM.zip) through the sec-api.io datasets API and contains HTML, JSON, PDF, and TXT files. Because every record is a prospectus that missed its original (b)(1) through (b)(7) deadline and is being submitted "as soon as practicable" after discovery, the dataset is uniquely suited to compliance benchmarking, late-filing-gap analysis, and structured-product pricing-supplement extraction.

Update Frequency: Daily
Updated at: 2026-05-02
Earliest Sample Date: 2006-01-01
Total Size: 89.6 MB
Total Records: 2,780
Container Format: ZIP
Content Types: HTML, JSON, PDF, TXT
Form Types: 424B8

Dataset APIs

Programmatically retrieve the full list of dataset archive files, download URLs and dataset metadata.

Dataset Index JSON API

Download the entire dataset as a single archive file.

Download Entire Dataset:

Download a single container file (e.g. monthly archive) from the dataset.

Download Single Container:

Dataset Files

229 files · 89.6 MB
File · Size · Records
2026-05.zip · 507.2 KB · 2
2026-04.zip · 458.5 KB · 24
2026-03.zip · 248.6 KB · 12
2026-02.zip · 78.2 KB · 5
2026-01.zip · 1.3 MB · 18
2025-12.zip · 537.9 KB · 18
2025-11.zip · 263.1 KB · 5
2025-10.zip · 113.7 KB · 5
2025-09.zip · 836.1 KB · 23
2025-08.zip · 463.8 KB · 14
2025-07.zip · 499.7 KB · 25
2025-06.zip · 528.9 KB · 20
2025-05.zip · 205.0 KB · 13
2025-04.zip · 772.8 KB · 25
2025-03.zip · 390.5 KB · 21
2025-02.zip · 336.6 KB · 16
2025-01.zip · 243.9 KB · 11
2024-12.zip · 628.3 KB · 37
2024-11.zip · 402.5 KB · 21
2024-10.zip · 1.0 MB · 35
2024-09.zip · 321.7 KB · 21
2024-08.zip · 775.5 KB · 44
2024-07.zip · 533.5 KB · 31
2024-06.zip · 541.8 KB · 13
2024-05.zip · 589.1 KB · 14
2024-04.zip · 43.6 KB · 4
2024-03.zip · 302.8 KB · 19
2024-02.zip · 282.9 KB · 18
2024-01.zip · 261.8 KB · 19
2023-12.zip · 239.2 KB · 21
2023-11.zip · 236.9 KB · 18
2023-10.zip · 238.7 KB · 20
2023-09.zip · 317.6 KB · 11
2023-08.zip · 536.9 KB · 45
2023-07.zip · 1.3 MB · 46
2023-06.zip · 581.1 KB · 37
2023-05.zip · 163.8 KB · 21
2023-04.zip · 377.2 KB · 30
2023-03.zip · 303.4 KB · 30
2023-02.zip · 396.2 KB · 36
2023-01.zip · 396.2 KB · 42
2022-12.zip · 395.6 KB · 35
2022-11.zip · 575.6 KB · 57
2022-10.zip · 684.5 KB · 57
2022-09.zip · 545.4 KB · 44
2022-08.zip · 660.1 KB · 43
2022-07.zip · 708.6 KB · 27
2022-06.zip · 1.4 MB · 63
2022-05.zip · 1.2 MB · 21
2022-04.zip · 238.9 KB · 19
2022-03.zip · 474.4 KB · 33
2022-02.zip · 386.3 KB · 22
2022-01.zip · 113.6 KB · 6
2021-12.zip · 148.2 KB · 7
2021-11.zip · 187.0 KB · 9
2021-10.zip · 413.8 KB · 16
2021-09.zip · 732.9 KB · 15
2021-08.zip · 394.3 KB · 14
2021-07.zip · 635.8 KB · 19
2021-06.zip · 630.4 KB · 29
2021-05.zip · 356.2 KB · 13
2021-04.zip · 1.5 MB · 59
2021-03.zip · 407.4 KB · 13
2021-02.zip · 422.5 KB · 16
2021-01.zip · 491.6 KB · 18
2020-12.zip · 294.8 KB · 11
2020-11.zip · 172.2 KB · 7
2020-10.zip · 225.2 KB · 9
2020-09.zip · 478.7 KB · 21
2020-08.zip · 739.0 KB · 30
2020-07.zip · 215.8 KB · 11
2020-06.zip · 613.6 KB · 23
2020-05.zip · 712.2 KB · 22
2020-04.zip · 310.0 KB · 12
2020-03.zip · 1.4 MB · 49
2020-02.zip · 436.2 KB · 16
2020-01.zip · 289.3 KB · 13
2019-12.zip · 60.9 KB · 3
2019-11.zip · 523.3 KB · 19
2019-10.zip · 570.8 KB · 23
2019-09.zip · 264.1 KB · 13
2019-08.zip · 215.6 KB · 12
2019-07.zip · 485.1 KB · 11
2019-06.zip · 382.1 KB · 18
2019-05.zip · 181.3 KB · 8
2019-04.zip · 166.9 KB · 7
2019-03.zip · 306.5 KB · 5
2019-02.zip · 159.4 KB · 7
2019-01.zip · 271.2 KB · 14
2018-12.zip · 142.8 KB · 6
2018-11.zip · 296.9 KB · 18
2018-10.zip · 204.1 KB · 10
2018-09.zip · 179.9 KB · 11
2018-08.zip · 345.4 KB · 16
2018-07.zip · 105.8 KB · 4
2018-06.zip · 364.0 KB · 17
2018-05.zip · 268.2 KB · 12
2018-04.zip · 244.5 KB · 11
2018-03.zip · 312.2 KB · 12
2018-02.zip · 396.9 KB · 19
2018-01.zip · 566.5 KB · 15
2017-12.zip · 207.8 KB · 8
2017-11.zip · 168.9 KB · 8
2017-10.zip · 145.5 KB · 8
2017-09.zip · 214.0 KB · 7
2017-08.zip · 70.8 KB · 5
2017-07.zip · 121.8 KB · 4
2017-06.zip · 105.4 KB · 3
2017-05.zip · 87.9 KB · 5
2017-04.zip · 185.5 KB · 7
2017-03.zip · 164.3 KB · 4
2017-02.zip · 358.8 KB · 11
2017-01.zip · 121.6 KB · 5
2016-12.zip · 180.4 KB · 7
2016-11.zip · 317.2 KB · 18
2016-10.zip · 140.3 KB · 9
2016-09.zip · 117.0 KB · 5
2016-08.zip · 254.9 KB · 13
2016-07.zip · 57.3 KB · 4
2016-06.zip · 312.1 KB · 10
2016-05.zip · 118.5 KB · 8
2016-04.zip · 714.2 KB · 5
2016-03.zip · 147.5 KB · 20
2016-02.zip · 164.6 KB · 17
2016-01.zip · 200.9 KB · 8
2015-12.zip · 191.3 KB · 3
2015-11.zip · 289.1 KB · 7
2015-10.zip · 47.1 KB · 1
2015-09.zip · 399.7 KB · 8
2015-08.zip · 17.0 KB · 1
2015-07.zip · 56.1 KB · 3
2015-06.zip · 61.7 KB · 3
2015-05.zip · 364.7 KB · 3
2015-04.zip · 530.6 KB · 9
2015-03.zip · 271.1 KB · 9
2015-02.zip · 113.6 KB · 4
2015-01.zip · 99.4 KB · 3
2014-12.zip · 82.6 KB · 4
2014-11.zip · 194.5 KB · 6
2014-10.zip · 130.0 KB · 3
2014-09.zip · 190.6 KB · 3
2014-08.zip · 57.9 KB · 2
2014-07.zip · 77.9 KB · 3
2014-06.zip · 325.8 KB · 8
2014-04.zip · 88.4 KB · 2
2014-03.zip · 37.5 KB · 1
2014-02.zip · 172.0 KB · 4
2014-01.zip · 58.7 KB · 3
2013-12.zip · 97.6 KB · 4
2013-11.zip · 206.5 KB · 6
2013-10.zip · 39.6 KB · 2
2013-09.zip · 15.6 KB · 1
2013-08.zip · 122.4 KB · 6
2013-07.zip · 88.2 KB · 5
2013-06.zip · 90.5 KB · 7
2013-05.zip · 691.1 KB · 7
2013-04.zip · 609.2 KB · 9
2013-03.zip · 1.0 MB · 10
2013-02.zip · 735.2 KB · 3
2013-01.zip · 266.0 KB · 21
2012-12.zip · 942.0 KB · 14
2012-11.zip · 447.3 KB · 13
2012-10.zip · 1.8 MB · 6
2012-09.zip · 2.5 MB · 22
2012-08.zip · 260.1 KB · 5
2012-07.zip · 846.3 KB · 4
2012-06.zip · 3.6 MB · 11
2012-05.zip · 247.7 KB · 4
2012-04.zip · 123.6 KB · 5
2012-03.zip · 68.7 KB · 1
2012-02.zip · 205.2 KB · 4
2012-01.zip · 189.5 KB · 3
2011-12.zip · 171.8 KB · 3
2011-11.zip · 1.3 MB · 8
2011-10.zip · 1.6 MB · 9
2011-09.zip · 775.2 KB · 6
2011-08.zip · 550.8 KB · 9
2011-07.zip · 472.5 KB · 7
2011-06.zip · 575.2 KB · 7
2011-05.zip · 521.3 KB · 5
2011-04.zip · 914.6 KB · 12
2011-03.zip · 258.7 KB · 8
2011-02.zip · 123.7 KB · 5
2011-01.zip · 493.7 KB · 3
2010-12.zip · 5.6 KB · 2
2010-11.zip · 589.0 KB · 5
2010-10.zip · 635.1 KB · 4
2010-09.zip · 14.9 KB · 1
2010-07.zip · 135.4 KB · 3
2010-06.zip · 194.7 KB · 8
2010-05.zip · 560.3 KB · 9
2010-04.zip · 145.1 KB · 1
2010-03.zip · 53.7 KB · 1
2010-02.zip · 189.5 KB · 2
2009-12.zip · 260.2 KB · 3
2009-10.zip · 196.8 KB · 3
2009-09.zip · 9.0 KB · 3
2009-08.zip · 3.4 KB · 1
2009-06.zip · 20.9 KB · 1
2009-04.zip · 60.9 KB · 1
2008-10.zip · 398.1 KB · 6
2008-08.zip · 41.0 KB · 1
2008-06.zip · 12.4 KB · 1
2008-05.zip · 5.7 KB · 2
2008-04.zip · 1.7 MB · 23
2008-03.zip · 257.0 KB · 3
2008-02.zip · 242.3 KB · 4
2008-01.zip · 416.2 KB · 5
2007-12.zip · 20.1 KB · 1
2007-11.zip · 30.1 KB · 1
2007-10.zip · 97.9 KB · 3
2007-08.zip · 650.3 KB · 9
2007-07.zip · 282.9 KB · 2
2007-06.zip · 107.3 KB · 3
2007-05.zip · 130.1 KB · 1
2007-04.zip · 208.4 KB · 4
2007-03.zip · 455.5 KB · 5
2007-02.zip · 417.4 KB · 16
2007-01.zip · 399.2 KB · 3
2006-12.zip · 492.0 KB · 8
2006-11.zip · 948.5 KB · 7
2006-10.zip · 367.7 KB · 4
2006-09.zip · 829.9 KB · 3
2006-08.zip · 379.5 KB · 1
2006-06.zip · 8.6 KB · 1
2006-05.zip · 566.7 KB · 3
2006-04.zip · 74.1 KB · 2
2006-03.zip · 124.1 KB · 2
2006-01.zip · 186.1 KB · 1

What This Dataset Contains

The Form 424B8 Files Dataset packages every EDGAR submission whose form code is 424B8, that is, every prospectus filed pursuant to Rule 424(b)(8) of the Securities Act of 1933. Paragraph (b)(8) is a residual catch-up provision: it applies to a form of prospectus that was required to be filed under another paragraph of Rule 424(b) but was not transmitted within the time frame that paragraph requires. The underlying paragraph is most commonly (b)(2) (prospectuses reflecting offerings off a shelf), (b)(3) (substantive changes or material additions to a prospectus already on file), (b)(5) (prospectus supplements relating to shelf takedowns), or (b)(7) (selling-securityholder information for resale offerings). The prospectus must be filed as soon as practicable after the failure to timely file is discovered. Substantively, a 424B8 re-presents the same prospectus content the original paragraph would have required; only the filing identifier differs. The late-filing posture is visible in two places: the registrant's cover page carries a "Filed Pursuant to Rule 424(b)(8)" legend, and EDGAR records the submission under the form code 424B8.

The 424B8 population is dominated by structured-product and medium-term-note shelf programs. Large issuers — Citigroup Global Markets, JPMorgan Chase Financial, Bank of Montreal, GS Finance, Morgan Stanley Finance, Royal Bank of Canada, UBS AG, Barclays Bank — generate many small pricing supplements per month under universal shelf registrations, and the ones that miss the (b)(2)/(b)(5) window are re-filed under (b)(8). Consequently the "prospectus" inside a typical 424B8 record is a compact, product-specific pricing supplement (autocallable notes, contingent income notes, buffered enhanced-return notes, equity-linked notes, market-linked CDs) of a few dozen pages, not a stand-alone base prospectus.

The dataset covers all Form 424B8 filings submitted to EDGAR from January 2006 to present, packaged as monthly ZIP containers and refreshed daily as new filings arrive. The dominant file types are HTML (for both the prospectus body and the fee exhibit) and JSON (the per-record metadata); PDF and TXT files also appear.

Content Structure of a Single Record

What one record represents

One record in the Form 424B8 Files Dataset is a single EDGAR submission of Form 424B8 — a prospectus filed pursuant to Rule 424(b)(8) of the Securities Act of 1933 — identified by its 18-digit SEC accession number. On disk the record is a folder named with the unhyphenated accession number (for example 000095010325013761), placed inside a per-month directory (YYYY-MM/) that is the sole top-level entry of a monthly ZIP archive (YYYY/YYYY-MM.zip). Inside the accession folder sit a metadata.json describing the EDGAR submission as a whole and the registrant-supplied filing artifacts: the primary 424B8 prospectus document (HTML) and, when the filer furnished one, an Exhibit 107 EX-FILING FEES inline-XBRL HTML document. The unit of observation is the submission, not the issuer, the registration statement, or the underlying offering — every accession is one record, and an issuer that files multiple late prospectus supplements within a month appears as multiple, independent records.

Container and physical layout of one record

The dataset is delivered as a tree of monthly ZIPs organized by year — YYYY/YYYY-MM.zip — and each ZIP unpacks to a YYYY-MM/ directory containing one accession-numbered subfolder per filing. The accession folder is the record. Inside it appear:

  • metadata.json — always present, one per accession, capturing the full EDGAR submission metadata (see below).
  • The primary 424B8 document — always present, exactly one file with type == "424B8" in documentFormatFiles[]. The on-disk filename is the registrant's original EDGAR filename (for example dp236412_424b8-us2522940d.htm, ea0262163-01_424b8.htm, bmo4899_424b2-32282.htm); there is no normalized name, so the canonical way to locate it is to look up the entry whose type is 424B8 in metadata.documentFormatFiles[].
  • An EX-FILING FEES inline-XBRL HTML — present whenever the registrant filed an Exhibit 107 fee table. Names commonly follow the pattern *exfilingfees*.htm or ex-filingfees.htm and carry type == "EX-FILING FEES".
  • Occasionally other registrant-supplied exhibits referenced by documentFormatFiles[].
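The layout above can be walked with nothing more than the standard library. The following is a minimal sketch, assuming each ZIP member path has the shape YYYY-MM/<accession>/<filename> as described; the function names and the in-memory demo archive are illustrative and not part of the dataset tooling.

```python
import io
import json
import zipfile
from collections import defaultdict


def group_records(zip_source):
    """Map each accession folder to its parsed metadata and other file names."""
    records = defaultdict(lambda: {"metadata": None, "files": []})
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            parts = name.split("/")
            # Expected shape: YYYY-MM/<accession>/<filename>; skip anything else,
            # including bare directory entries that end with a slash.
            if len(parts) != 3 or not parts[2]:
                continue
            accession, filename = parts[1], parts[2]
            if filename == "metadata.json":
                records[accession]["metadata"] = json.loads(archive.read(name))
            else:
                records[accession]["files"].append(filename)
    return dict(records)


def build_demo_zip():
    """Build a tiny in-memory archive that imitates one monthly container."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "2025-10/000095010325013761/metadata.json",
            json.dumps({"formType": "424B8"}),
        )
        archive.writestr(
            "2025-10/000095010325013761/dp236412_424b8-us2522940d.htm",
            "<html></html>",
        )
    buffer.seek(0)
    return buffer


demo_records = group_records(build_demo_zip())
```

In real use, group_records would be given the path of a downloaded monthly container instead of the demo buffer.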

Three classes of artifacts present in the original EDGAR submission are deliberately omitted from the ZIP. GRAPHIC entries (GIF/JPG/PNG images embedded in the prospectus, e.g. image_001.jpg, bg1.jpg) are enumerated in metadata.documentFormatFiles[] but their bytes are not packaged. Likewise the EDGAR "complete submission text file" (the raw .txt wrapper hosted on sec.gov, listed in metadata with empty sequence and type fields) is not bundled; its URL is preserved under linkToTxt. Finally, standalone XBRL instance documents that accompany the EX-FILING FEES exhibit (such as *_exfilingfees_htm.xml) are referenced in metadata.dataFiles[] but, like images, are not shipped inside the ZIP; only the inline-XBRL HTML rendition is.

metadata.json anatomy

metadata.json mirrors the EDGAR submission header and the document-list table from the filing index. The top-level fields are:

  • formType — always the literal string "424B8".
  • accessionNo — the canonical hyphenated accession number (e.g. 0000950103-25-013761).
  • filedAt — ISO-8601 timestamp with timezone offset (e.g. 2025-10-28T18:08:00-04:00) reflecting EDGAR acceptance time.
  • description — short human-readable form description, typically "Form 424B8 - Prospectus filed pursuant to Rule 424(b)(8)".
  • linkToFilingDetails — sec.gov URL of the primary 424B8 document.
  • linkToHtml — sec.gov URL of the -index.htm landing page for the submission.
  • linkToTxt — sec.gov URL of the full submission .txt wrapper.
  • linkToXbrl — sec.gov URL of the XBRL viewer; an empty string when no viewer is exposed.
  • id — a stable 32-character hex hash uniquely identifying the record.
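As a small illustration of how these fields relate to the on-disk layout, the sketch below derives the accession folder name from accessionNo and guesses the monthly container from filedAt. The helper names are hypothetical, and the assumption that a record lands in the container matching the month of its filedAt timestamp is not confirmed by the source.

```python
from datetime import datetime


def accession_folder(accession_no: str) -> str:
    """Turn the hyphenated accession number into the unhyphenated folder name."""
    return accession_no.replace("-", "")


def container_path(filed_at: str) -> str:
    """Guess the monthly container (YYYY/YYYY-MM.zip) from the filedAt timestamp.

    Assumption: the container month follows the local date in filedAt.
    """
    timestamp = datetime.fromisoformat(filed_at)
    return f"{timestamp:%Y}/{timestamp:%Y-%m}.zip"


def record_path(accession_no: str, filed_at: str) -> str:
    """Path of the record folder inside its container (YYYY-MM/<accession>/)."""
    timestamp = datetime.fromisoformat(filed_at)
    return f"{timestamp:%Y-%m}/{accession_folder(accession_no)}/"


folder = accession_folder("0000950103-25-013761")
container = container_path("2025-10-28T18:08:00-04:00")
inner_path = record_path("0000950103-25-013761", "2025-10-28T18:08:00-04:00")
```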

Three array fields carry the structural detail:

  • documentFormatFiles[] enumerates every document in the original EDGAR submission. Each element has sequence (the EDGAR document ordinal as a string), size (bytes, as a string), documentUrl (sec.gov), description (e.g. PRICING SUPPLEMENT, PRELIMINARY PRICING SUPPLEMENT, EX-FILING FEES, GRAPHIC), and type (e.g. 424B8, EX-FILING FEES, GRAPHIC). The trailing element with empty sequence and type represents the complete submission text file.
  • entities[] lists the filer and any co-filers/co-registrants. For each entity the metadata carries cik, companyName (with a role suffix such as (Filer) or (Subject)), type (mirroring the form type), act (Securities Act citation, "33" for the 1933 Act), fileNo (the SEC file number — typically the registration statement number such as 333-270327, with -NN suffixes for co-registrant subsidiaries), filmNo, sic (industry code with description, e.g. "6021 National Commercial Banks"), irsNo, stateOfIncorporation (two-letter code, including non-US codes such as A6 for Ontario), fiscalYearEnd (MMDD), and an optional tickers[] array.
  • dataFiles[] enumerates structured XBRL or other data attachments associated with the submission, such as EXTRACTED XBRL INSTANCE DOCUMENT of type XML. The array is empty when the filer included no XBRL exhibit.

A fourth array, seriesAndClassesContractsInformation[], is reserved for investment-company series-and-class identifiers; it is empty for the operating-company and finance-subsidiary issuers that account for the vast majority of 424B8 filings.
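The lookup described for documentFormatFiles[] can be sketched as follows. This is a hedged example: the sample metadata is invented for illustration (the URLs use a placeholder host), and the idea that the on-disk filename equals the last path segment of documentUrl is an assumption rather than something the dataset guarantees.

```python
import posixpath
from urllib.parse import urlparse


def primary_document(metadata: dict) -> dict:
    """Return the single documentFormatFiles entry typed 424B8."""
    matches = [
        doc for doc in metadata.get("documentFormatFiles", [])
        if doc.get("type") == "424B8"
    ]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one 424B8 document, found {len(matches)}")
    return matches[0]


def local_filename(doc: dict) -> str:
    """Assumed on-disk name: the last path segment of documentUrl."""
    return posixpath.basename(urlparse(doc["documentUrl"]).path)


# Illustrative metadata only; the host and sizes are placeholders.
sample_metadata = {
    "formType": "424B8",
    "documentFormatFiles": [
        {"sequence": "1", "size": "48211", "type": "424B8",
         "description": "PRICING SUPPLEMENT",
         "documentUrl": "https://example.invalid/a/dp236412_424b8-us2522940d.htm"},
        {"sequence": "2", "size": "9120", "type": "GRAPHIC",
         "description": "GRAPHIC",
         "documentUrl": "https://example.invalid/a/image_001.jpg"},
        {"sequence": "", "size": "70412", "type": "",
         "description": "Complete submission text file",
         "documentUrl": "https://example.invalid/a/0000950103-25-013761.txt"},
    ],
}

primary = primary_document(sample_metadata)
primary_name = local_filename(primary)
primary_size = int(primary["size"])  # size is stored as a string
```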

Primary 424B8 document

The primary document is wrapped in the standard EDGAR SGML envelope around the registrant-supplied HTML body:

<DOCUMENT>
<TYPE>424B8
<SEQUENCE>1
<FILENAME>...
<DESCRIPTION>PRICING SUPPLEMENT
<TEXT>
<HTML>...prospectus body...</HTML>
</TEXT>
</DOCUMENT>
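Before parsing the prospectus, it can help to separate the SGML header tags from the HTML body. The sketch below is one way to do that with regular expressions, assuming the envelope has exactly the shape shown above; the function name and the sample filename are illustrative.

```python
import re

# Header tags sit on their own lines and have no closing tag.
HEADER_TAG = re.compile(r"^<(TYPE|SEQUENCE|FILENAME|DESCRIPTION)>(.*)$", re.MULTILINE)
TEXT_BLOCK = re.compile(r"<TEXT>\s*(.*?)\s*</TEXT>", re.DOTALL | re.IGNORECASE)


def split_envelope(raw: str):
    """Return (header dict, body) for a document wrapped in the SGML envelope.

    If no <TEXT> block is found, the input is returned unchanged as the body.
    """
    match = TEXT_BLOCK.search(raw)
    if match is None:
        return {}, raw
    header_part = raw[: match.start()]
    headers = {tag: value.strip() for tag, value in HEADER_TAG.findall(header_part)}
    return headers, match.group(1)


sample_document = """<DOCUMENT>
<TYPE>424B8
<SEQUENCE>1
<FILENAME>example_424b8.htm
<DESCRIPTION>PRICING SUPPLEMENT
<TEXT>
<HTML><BODY><P>Filed Pursuant to Rule 424(b)(8)</P></BODY></HTML>
</TEXT>
</DOCUMENT>
"""

envelope_headers, html_body = split_envelope(sample_document)
```

The body can then be handed to any HTML parser without the envelope tags getting in the way.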

The body is a prospectus. Because the (b)(8) catch-up provision is overwhelmingly used by structured-note shelf programs, the typical body is a pricing supplement to a previously filed base prospectus, prospectus supplement, and product supplement, rather than a stand-alone offering document. Common content blocks, in roughly the order they appear, are:

  • A cover page identifying the issuer (and any guarantor), the registration statement file number(s), the legend "Filed Pursuant to Rule 424(b)(8)", the product name (e.g. "Autocallable Contingent Coupon Equity-Linked Securities", "Buffered Enhanced Return Notes", "Market-Linked Notes"), and a brief overview paragraph.
  • A "Key Terms" or "Summary of Terms" table enumerating issuer, guarantor, principal amount, denomination, trade date, original issue date, valuation date(s), maturity date, underlying asset(s) (single equity, ETF, index, basket, currency pair), CUSIP/ISIN, coupon mechanics (fixed, contingent, conditional), call/autocall thresholds, barrier/buffer/threshold levels, payoff formulas at maturity, and the agent's discount or selling concession.
  • Hypothetical payoff tables and worked examples illustrating returns under different underlying scenarios.
  • A risk-factor section, normally cross-referencing the more comprehensive risk factors in the base prospectus, prospectus supplement, and product supplement, plus product-specific risks (issuer credit, liquidity, valuation, conflicts of interest, US federal income tax treatment).
  • "Use of Proceeds" and "Hedging" disclosures, often boilerplate cross-references.
  • "Plan of Distribution" / "Supplemental Plan of Distribution" describing the underwriter or agent (typically the issuer's own broker-dealer affiliate), commissions, and any conflicts-of-interest disclosure under FINRA Rule 5121.
  • Validity-of-securities and tax counsel statements, frequently incorporating opinions by reference.
  • Information about the underlying asset(s) and any historical performance data.
  • A signature block where the registrant (and guarantor, if any) attests to the filing — short relative to the cover and term descriptions, but always present.

Layout fidelity varies by filer because EDGAR restricts which HTML constructs may be submitted but does not prescribe a page layout. Three patterns recur:

  • Semantic HTML with traditional <TABLE> and <P> markup and registrant-branded color/typography (Citigroup-style filings).
  • Absolutely-positioned <DIV> layouts using point-sized inline typography to reproduce a print template (Bank of Montreal-style filings).
  • PDF-rendered HTML — the source PDF was converted to HTML by tooling that emits per-character absolutely-positioned <div class="t ..."> tiles over bg*.jpg background images (JPMorgan-style supplements). The text is fully present but heavily fragmented and hard to reflow. Because the GRAPHIC backgrounds are excluded from the ZIP, opening such files in a browser shows the text without the visual page background.

These differences are stylistic; the substantive prospectus content is the same regardless of the markup style.
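Because the three markup styles call for different extraction strategies, a rough triage step can be useful. The sketch below is a heuristic only: the signals it counts (tile divs with class "t ...", absolute positioning, table tags) come from the patterns described above, but the decision rule is an assumption and has not been validated against the corpus. A plain-text collector is included as well.

```python
import re
from html.parser import HTMLParser


def classify_layout(html: str) -> str:
    """Heuristically label a prospectus body by its markup style."""
    lowered = html.lower()
    tile_divs = len(re.findall(r'<div[^>]*class="t\b', lowered))
    absolute = len(re.findall(r"position\s*:\s*absolute", lowered))
    tables = lowered.count("<table")
    if tile_divs:
        return "pdf-rendered"
    if absolute > tables:
        return "absolute-div"
    return "semantic"


class _TextCollector(HTMLParser):
    """Collect visible text nodes, skipping script and style content."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def visible_text(html: str) -> str:
    """Join all visible text nodes with single spaces."""
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    return " ".join(collector.chunks)
```

For PDF-rendered bodies the collector returns every tile as a separate chunk, so the joined text may need further reassembly before terms can be matched.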

EX-FILING FEES (Exhibit 107) document

When a 424B8 carries fees, the registrant attaches an Exhibit 107 fee table. In this dataset that exhibit appears as a separate HTML file with type == "EX-FILING FEES". The file is authored as inline XBRL: it opens as XHTML with the inline-XBRL namespace xmlns:ix="http://www.xbrl.org/2013/inlineXBRL", declares one or more xbrli:context blocks (referencing the registrant CIK and filing date), and tags the SEC fee-table facts with the ffd: (filing-fees disclosure) and dei: taxonomies — for example ffd:SubmissnTp, ffd:FeeExhibitTp, ffd:RegnFileNb, dei:EntityCentralIndexKey. The visible rendering is the standard SEC fee table: form type, fee exhibit type, security type, security class, fee calculation rule, amount registered, proposed maximum offering price per unit, proposed maximum aggregate offering price, fee rate, and fee due, with a carry-forward block where applicable. The companion XBRL instance document (*_exfilingfees_htm.xml) is enumerated under metadata.dataFiles[] but is not packaged in the ZIP — the inline-XBRL HTML is the canonical artifact for fee data within the dataset.
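One way to read the tagged facts out of such an exhibit is to treat it as XML and collect the ix:nonNumeric and ix:nonFraction elements by their name attribute. The sketch below assumes the exhibit is well-formed XHTML with any SGML envelope already removed; real files that use HTML-only entities may need a more tolerant parser. The sample exhibit and its values are invented, and only tag names mentioned above are used.

```python
import xml.etree.ElementTree as ET

IX_NS = "http://www.xbrl.org/2013/inlineXBRL"


def fee_facts(xhtml: bytes) -> dict:
    """Map each inline-XBRL fact name to the list of displayed values."""
    root = ET.fromstring(xhtml)
    facts = {}
    for local_name in ("nonNumeric", "nonFraction"):
        for element in root.iter(f"{{{IX_NS}}}{local_name}"):
            name = element.get("name")
            if name:
                text = "".join(element.itertext()).strip()
                facts.setdefault(name, []).append(text)
    return facts


# Invented sample; values are placeholders, not taken from a real filing.
sample_exhibit = b"""<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:ix="http://www.xbrl.org/2013/inlineXBRL">
  <body>
    <table>
      <tr><td><ix:nonNumeric name="ffd:SubmissnTp" contextRef="c1">424B8</ix:nonNumeric></td></tr>
      <tr><td><ix:nonNumeric name="ffd:FeeExhibitTp" contextRef="c1">EX-FILING FEES</ix:nonNumeric></td></tr>
      <tr><td><ix:nonNumeric name="ffd:RegnFileNb" contextRef="c1">333-270327</ix:nonNumeric></td></tr>
      <tr><td><ix:nonNumeric name="dei:EntityCentralIndexKey" contextRef="c1">0000000000</ix:nonNumeric></td></tr>
    </table>
  </body>
</html>"""

sample_facts = fee_facts(sample_exhibit)
```

Values are returned as displayed text; numeric facts are not rescaled or converted in this sketch.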

Included content

  • The full metadata.json for the EDGAR submission, including filer/co-filer identification, document inventory, and external links.
  • The primary 424B8 prospectus document, in its registrant-authored HTML form, wrapped in the EDGAR SGML <DOCUMENT>...<TEXT>...</TEXT></DOCUMENT> envelope.
  • Any EX-FILING FEES inline-XBRL HTML exhibit (Exhibit 107) attached by the registrant.
  • Other registrant-supplied document attachments that EDGAR records under documentFormatFiles[] and that are not images or the raw submission wrapper.

Excluded or external content

  • GRAPHIC files (GIF/JPG/PNG) referenced by the prospectus or exhibit. They are enumerated in metadata.documentFormatFiles[] and remain accessible at their documentUrls on sec.gov but are not packaged in the ZIP. For PDF-rendered HTML supplements this means the page-background images are absent locally.
  • The EDGAR "complete submission text file" (the raw .txt envelope concatenating every document in the submission). Its URL is preserved as linkToTxt.
  • Standalone XBRL instance/schema/linkbase files associated with the EX-FILING FEES exhibit. They are listed in dataFiles[] and remain available at their EDGAR URLs.
  • Documents incorporated by reference into the prospectus — the base prospectus, prospectus supplement, product supplement, underlying-supplement, and the issuer's Exchange Act reports — are not duplicated inside the 424B8 record. The pricing supplement only references them.
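For consumers who need the omitted artifacts, the metadata already carries the URLs. The sketch below collects them into one list. It assumes dataFiles[] entries expose a documentUrl field like documentFormatFiles[] entries do, which the description above does not state explicitly; the sample metadata uses a placeholder host.

```python
def external_urls(metadata: dict) -> list:
    """List URLs of artifacts that are referenced in metadata but not packaged."""
    urls = []
    for doc in metadata.get("documentFormatFiles", []):
        if doc.get("type") == "GRAPHIC" and doc.get("documentUrl"):
            urls.append(doc["documentUrl"])
    for data_file in metadata.get("dataFiles", []):
        # Assumption: dataFiles entries carry a documentUrl field.
        if data_file.get("documentUrl"):
            urls.append(data_file["documentUrl"])
    if metadata.get("linkToTxt"):
        urls.append(metadata["linkToTxt"])
    return urls


# Illustrative metadata with placeholder URLs.
sample_record = {
    "linkToTxt": "https://example.invalid/a/0000950103-25-013761.txt",
    "documentFormatFiles": [
        {"type": "424B8", "documentUrl": "https://example.invalid/a/main.htm"},
        {"type": "GRAPHIC", "documentUrl": "https://example.invalid/a/bg1.jpg"},
    ],
    "dataFiles": [
        {"type": "XML", "documentUrl": "https://example.invalid/a/x_exfilingfees_htm.xml"},
    ],
}

missing_locally = external_urls(sample_record)
```

Any actual download from sec.gov should follow the SEC's published access rules; that step is left out here.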

Changes in required content and structure over time

Because Rule 424(b)(8) is a procedural catch-up paragraph rather than a content rule, the substantive disclosure required in a 424B8 is whatever the underlying paragraph (most often (b)(2), (b)(3), (b)(5), or (b)(7)) demands. The most material changes over the dataset's coverage period (January 2006 to present) are therefore changes to those underlying paragraphs and to surrounding rules:

  • The 2005 Securities Offering Reform overhaul that produced the modern Rule 424 framework had taken effect by December 2005, so the entire dataset operates under the post-Reform regime, including the Rule 405 Well-Known Seasoned Issuer concept and the Rule 415 shelf-takedown mechanics that drive the (b)(2)/(b)(5) volume.
  • The SEC's filing-fee modernization (Release No. 33-10997) — adopted in 2021 with phased compliance — replaced narrative fee disclosure on prospectus covers with a structured Exhibit 107 fee table tagged in inline XBRL under Rule 408 of Regulation S-T. After the relevant compliance date, 424B8 records that carry fees include the EX-FILING FEES iXBRL exhibit; earlier records do not.
  • Item 16.1 of Form S-1 (and the analogous items in Forms S-3, F-1, F-3) was conformed to require the filing-fee table as Exhibit 107, anchoring the new disclosure to the registration form rather than to the prospectus body.
  • The shift in market practice toward retail structured products materially changed the median content of a 424B8 over the period: early-period filings include more conventional shelf takedowns, whereas later-period filings are dominated by short, product-specific pricing supplements from a small set of large structured-note programs.

Changes in data format over time

Form 424B8 has been an EDGAR-accepted HTML/SGML submission throughout the dataset's coverage period. The principal format developments visible across records are:

  • Consistent use of the EDGAR SGML <DOCUMENT>...<TEXT>...</TEXT></DOCUMENT> envelope around the registrant-supplied HTML body, with <TYPE>424B8 driving form classification.
  • Increasing use of PDF-to-HTML conversion tools, which produces absolutely-positioned <div> tiles and image-backed pages — a stylistic change that affects parseability but not substance.
  • Adoption of inline XBRL for the Exhibit 107 filing-fee table after the SEC's fee modernization, embedded directly in an XHTML document with the iXBRL namespace and ffd:/dei: tagging. The 424B8 prospectus body itself is not XBRL-tagged.

Interpretation notes

  • "424B8" is a procedural label, not a content type. To understand what disclosure a particular record actually contains, read the cover-page legend and the document description in metadata.documentFormatFiles[*].description (e.g. PRICING SUPPLEMENT, PRELIMINARY PRICING SUPPLEMENT, PROSPECTUS SUPPLEMENT) — these reveal which underlying (b)(x) paragraph the filing was originally meant to satisfy.
  • The primary document is identified canonically by metadata.documentFormatFiles[*].type == "424B8" rather than by filename. Registrants reuse legacy filenames freely; for example a Bank of Montreal pricing supplement may carry a filename containing 424b2 while being typed 424B8 in metadata.
  • Co-registrants — typical for finance-subsidiary issuers with a parent guarantor (JPMorgan Chase Financial / JPMorgan Chase & Co., GS Finance Corp. / The Goldman Sachs Group, Citigroup Global Markets Holdings / Citigroup Inc.) — appear as multiple objects in entities[], with hyphen-suffixed file numbers (333-XXXXXX-NN) distinguishing each co-registrant on the same registration statement.
  • The seriesAndClassesContractsInformation[] array is a placeholder inherited from the EDGAR submission schema and is empty for the operating-company and finance-subsidiary issuers that produce nearly all 424B8 filings; it would only populate for investment-company filers reporting series and class identifiers.
  • Because GRAPHIC files are excluded, prospectuses that depend on background images (notably PDF-rendered JPMorgan supplements) render as fragmented text-only pages locally; the underlying text is fully present in the HTML and is suitable for extraction even without the images.
  • The EX-FILING FEES HTML is the only inline-XBRL artifact in a 424B8 record. The prospectus body itself carries no XBRL tagging, so financial extraction must rely on the iXBRL fee exhibit for structured offering-amount and fee data and on text/table parsing of the prospectus HTML for product terms (CUSIPs, dates, barrier levels, coupon formulas, payoff tables).
  • The linkToTxt, linkToHtml, linkToFilingDetails, and linkToXbrl URLs let a consumer round-trip back to EDGAR for the artifacts not packaged locally (image attachments, the raw submission wrapper, the standalone XBRL instance documents, and the EDGAR XBRL viewer rendering).
  • Amendments to a prior prospectus filing are filed under their own accession numbers as new submissions; the dataset does not link amendments to the originals beyond the shared registration fileNo carried in entities[*].fileNo.
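Since the shared registration file number is the only link between related submissions, grouping records by it is a natural first step. The following is a minimal sketch, assuming file numbers follow the 333-XXXXXX or 333-XXXXXX-NN shapes described above; the helper names and the sample records (including the second accession number) are hypothetical.

```python
import re
from collections import defaultdict

FILE_NO = re.compile(r"^(\d{3}-\d+)(?:-(\d+))?$")


def base_file_number(file_no: str) -> str:
    """Drop the co-registrant suffix: 333-270327-01 becomes 333-270327."""
    cleaned = file_no.strip()
    match = FILE_NO.match(cleaned)
    return match.group(1) if match else cleaned


def group_by_registration(records):
    """Map each base file number to the sorted accession numbers that cite it."""
    groups = defaultdict(set)
    for metadata in records:
        for entity in metadata.get("entities", []):
            if entity.get("fileNo"):
                groups[base_file_number(entity["fileNo"])].add(metadata["accessionNo"])
    return {file_no: sorted(accessions) for file_no, accessions in groups.items()}


# Hypothetical records for illustration.
sample_records = [
    {"accessionNo": "0000950103-25-013761",
     "entities": [{"fileNo": "333-270327"}, {"fileNo": "333-270327-01"}]},
    {"accessionNo": "0000950103-25-013999",
     "entities": [{"fileNo": "333-270327-01"}]},
]

by_registration = group_by_registration(sample_records)
```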

Who Files or Publishes This Dataset, and When

Who files the record

The filer is always the Securities Act registrant whose effective registration statement covers the offering. That is typically the issuer itself, or, in shelf and structured-product programs, the registrant on whose registration statement a takedown is being conducted (often a finance subsidiary, with a parent guarantor).

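The pool of registrants drawn into 424B8 filings is dominated by high-volume shelf issuers, chiefly the structured-note and medium-term-note programs of large banks and their finance subsidiaries (for example Citigroup Global Markets, JPMorgan Chase Financial, Bank of Montreal, GS Finance, Morgan Stanley Finance, Royal Bank of Canada, UBS AG, and Barclays Bank).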

Underwriters, dealers, selling securityholders, and parent guarantors may be named in the prospectus and may carry liability exposure, but they do not file Form 424B8 in their own right. The filing is made under the registrant's CIK.

What triggers the filing

Form 424B8 is a corrective submission. It exists solely to cure a missed Rule 424(b) deadline.

Rule 424(b)(1) through (b)(7) each prescribe how and when specific categories of prospectuses, supplements, and pricing materials must be filed in connection with an effective registration statement, generally within two business days (occasionally five) of the relevant pricing, sale, or first-use event. When a registrant fails to file within that window, Rule 424(b)(8) requires it to file the prospectus "as soon as practicable after the discovery of such failure to file," and the EDGAR submission is designated 424B8 rather than the originally applicable 424B1 through 424B7 type.

The trigger is therefore event-driven and two-step: (1) a missed original (b)(1)–(b)(7) deadline, and (2) the subsequent discovery of that lapse, which starts the "as soon as practicable" clock. There is no fixed numeric deadline for the 424B8 itself, and there is no voluntary or strategic reason to elect 424B8 in a timely-filing scenario. Choosing the 424B8 label is itself an admission that the original deadline was missed.
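The late-filing gap can be estimated once a pricing or first-use date has been extracted from the prospectus text. The sketch below is a simplified worked example: it assumes a two-business-day window, counts only weekends as non-business days (federal holidays and any EDGAR filing-date cutoff conventions are ignored), and takes the pricing date as a hypothetical input that the metadata does not supply.

```python
from datetime import date, datetime, timedelta


def add_business_days(start: date, days: int) -> date:
    """Advance by the given number of weekdays (holidays are ignored)."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:
            remaining -= 1
    return current


def business_days_between(start: date, end: date) -> int:
    """Count weekdays after start up to and including end; 0 if end <= start."""
    count = 0
    current = start
    while current < end:
        current += timedelta(days=1)
        if current.weekday() < 5:
            count += 1
    return count


def late_gap(pricing_date: date, filed_at: str, window: int = 2) -> int:
    """Business days by which the filing missed the assumed original deadline."""
    deadline = add_business_days(pricing_date, window)
    filed = datetime.fromisoformat(filed_at).date()
    return business_days_between(deadline, filed)


# Hypothetical pricing date paired with the filedAt example used earlier.
example_deadline = add_business_days(date(2025, 10, 23), 2)
example_gap = late_gap(date(2025, 10, 23), "2025-10-28T18:08:00-04:00")
```

With a Thursday pricing date of 2025-10-23, the assumed deadline falls on Monday 2025-10-27, so a filing accepted on 2025-10-28 is one business day late under these assumptions.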

Timing and cadence in the dataset

Filing dates do not follow a periodic schedule. They track the date a lapse is discovered, not the date of the original missed event, and they cluster around:

  • Operational lapses in high-volume shelf and structured-note programs
  • Internal compliance reviews and counsel sweeps that surface missed filings

Dataset coverage begins in January 2006, immediately after the SEC's 2005 Securities Offering Reform restructured the shelf and prospectus-supplement regime and clarified the (b)(8) corrective path. Earlier paper or pre-Reform filings are not included.

Important distinctions

  • 424B8 vs. 424B1–B7. The other seven 424B variants are routine, on-time filings tied to specific Rule 424(b) paragraphs. 424B8 is the corrective label used after one of those deadlines has been missed. The prospectus content is generally what would have been filed under the original paragraph; only the submission type changes.
  • Filer vs. participants. The registrant files. Selling shareholders, underwriters, and other named participants do not.
  • Issuer vs. parent. In structured-note programs the registrant is often a finance subsidiary; the 424B8 is filed under that subsidiary's CIK, not the parent's, unless the parent is a co-registrant.
  • Not a registration amendment. 424B8 is a prospectus filing under Rule 424(b). Material changes requiring a post-effective amendment to the registration statement itself must still be made through that amendment, not through 424(b)(8).
  • Not for non-Securities Act offerings. Rule 424(b) covers prospectuses used with Securities Act registration statements only. Regulation A offering circulars and other exempt-offering disclosures use separate submission types and are out of scope.
  • Foreign private issuers. FPIs file 424B8 on the same basis as domestic issuers when their Securities Act registration statements (F-1, F-3, F-4) require a prospectus supplement under Rule 424(b). This is distinct from their Exchange Act periodic reporting on Forms 20-F and 6-K.

How This Dataset Differs From Similar Datasets or Filings

Form 424B8 is not a distinct prospectus type. It is a cure-filing label used when a prospectus required under another paragraph of Rule 424(b), namely (b)(1) through (b)(7), missed its filing deadline and is being submitted late, "as soon as practicable" after discovery. The substantive content of any 424B8 mirrors whichever paragraph the late filing was supposed to satisfy. That single fact governs every comparison below.

424B1 — initial prospectus with Rule 430A pricing. The on-time filing for prospectuses adding information omitted from the effective registration under Rule 430A (typically IPO pricing). A 424B1 and a late-cured 424B8 derived from a missed (b)(1) deadline can be content-identical; only the timing posture differs.

424B2 — base prospectus plus shelf takedown pricing supplement (Rule 430B). The high-volume workhorse for shelf debt and MTN programs. A 424B8 carrying a missed (b)(2) takedown shows the same coupon, maturity, CUSIPs, and underwriters; the form code signals only the missed window.

424B3 — material changes or additions to a previously filed prospectus. Narrative-heavier than pricing supplements (updated risk factors, transaction changes, revised financials). A late (b)(3) filing becomes a 424B8, distinguishable from a missed (b)(2) or (b)(5) filing only by reading the document body.

424B4 — final priced prospectus where changes exceed Rule 430A scope. The IPO-pricing filing used when material information beyond 430A omissions is added. IPO-pricing studies should treat 424B8 filings whose underlying paragraph is (b)(1) or (b)(4) as part of the IPO population.

424B5 — shelf-takedown prospectus supplement. Overlaps heavily with 424B2 but applies to different combinations of base-prospectus reliance and Rule 430B mechanics. A missed (b)(5) deadline likewise produces a 424B8 carrying equivalent supplement content.

424B7 — selling-securityholder reoffer prospectus. Used for resales by selling holders (e.g., PIPE shares). Disclosure centers on the selling-holder table and resale mechanics rather than primary-issuer pricing. A late (b)(7) filing becomes a 424B8 retaining that selling-holder structure.

424A — preliminary prospectus (red herring). Upstream of all 424B paragraphs in the offering timeline. 424B8 has no preliminary-stage analog; it is always a final or supplemental prospectus filed late under one of the b-paragraphs. The two are not substitutes in any research design.

Adjacent but procedurally distinct

S-1 / S-3 — base registration statements. These authorize the offering; 424B filings (including 424B8) are the as-used prospectuses delivered after effectiveness or pricing. An S-1/S-3 dataset gives the registered universe and full legal disclosure; a 424B8 dataset gives only the late-cured prospectuses. Linking 424B8 records back to their underlying S-1/S-3 (via CIK and registration file number) is often necessary for full offering context.

FWP — free writing prospectus (Rule 433). A separate communications regime for term sheets, road show materials, and pricing communications outside the statutory prospectus. FWPs are not Rule 424 prospectuses, are not subject to the Rule 424(b) paragraph deadlines, and never trigger 424B8 cure filings, even when they accompany the same shelf takedown.

Form 497 — mutual fund prospectus. Procedurally analogous as a post-effective prospectus filing, but made under Securities Act Rule 497 by investment companies registered under the Investment Company Act of 1940. Filer population, content (objectives, fees, share classes), and downstream uses do not overlap with the 424B series.

What makes the Form 424B8 dataset distinct

The dataset is defined by timing failure, not content type. Every record is a prospectus that should have been filed under another 424(b) paragraph and was not filed on time. Three consequences follow:

  • The population is content-heterogeneous: a single 424B8 may functionally be an IPO pricing prospectus, a shelf takedown, a substantive update, or a selling-holder reoffer.
  • Offering type cannot be inferred from the form code; the underlying paragraph must be determined by parsing the document body.
  • 424B8 is rarely a substitute for any individual Rule 424(b) paragraph dataset. It is a complement: a complete study of, say, shelf-takedown supplements in a given year should union the on-time (b)(2)/(b)(5) population with the subset of 424B8 filings whose missed paragraph was (b)(2) or (b)(5).

No other SEC dataset occupies this niche of late-cured Rule 424(b) prospectuses.

Who Uses This Dataset

Every record in this corpus is a late-filed prospectus cured under Rule 424(b)(8), which makes it useful to a narrow set of professionals who care about filing-timeline discipline, structured-note terms, or peer cure behavior. Most workflows draw on three layers: the metadata.json header (entities[], cik, filedAt, accessionNo, formType), the primary 424B8 HTML prospectus body, and the Exhibit 107 inline XBRL filing-fee table when present.

Securities and capital-markets attorneys

Issuer- and underwriter-side disclosure counsel use the corpus as a structured ledger of cure filings to scope Section 11 and Section 12(a)(2) exposure. They reconcile the offering or pricing date stated in the prospectus body against filedAt to measure the gap beyond the original 424(b)(1)/(2)/(5)/(7) window, infer which paragraph should have applied, and benchmark how peers word the late-filing event in the cured supplement. Output: liability memos, Rule 172 access-equals-delivery analyses, and precedent banks of cure language.

Compliance and disclosure-controls officers

Compliance teams at issuers, broker-dealers, and underwriters aggregate entities[].companyName, entities[].cik, and filedAt to score their own filing-deadline performance against peers and to flag deal teams that repeatedly route through 424B8. Output: internal SLA dashboards, business cases for filing-automation tooling, and remediation evidence presented to internal audit and to examiners.

Structured-product analysts and desk strategists

Pricing supplements for structured notes, market-linked CDs, and shelf takedowns dominate this channel. Analysts parse the HTML body for payoff formulas, barriers, buffers, participation rates, observation dates, underlying baskets, reference indices, issuer credit terms, and CUSIP/ISIN identifiers, then key the extracted terms to cik and filedAt to build secondary-market reference tables, back-test payoffs against realized index paths, and detect competitor launches that only surface in this channel. Exhibit 107 supplies machine-readable offering size and fee class for issuance dashboards.

Quant researchers and applied NLP/LLM engineers

Teams building prospectus-summarization, risk-factor classification, and term-extraction models use the corpus as a small, well-labeled training and evaluation slice. Its bounded scope, small set of recurring prospectus layouts, and clean metadata.json labels make it suitable for supervised fine-tuning, RAG retrieval evaluation, and continued pre-training on payoff and indicative-terms language.

Regulatory examiners and market-oversight analysts

Examinations and market-oversight staff group by entities[].cik and filedAt to surface registrants whose 424B8 cadence suggests systemic prospectus-filing-control weakness, then compare the cured supplement against the base prospectus and registration statement to test whether terms changed materially. Output: examination scoping memos, deficiency letters, and referrals.

Syndicate and DCM professionals

Syndicate and debt-capital-markets bankers compare the offering or pricing date in the prospectus body against the EDGAR filedAt to identify peers that habitually cure late, informing competitive pitches to issuer clients and post-mortems on missed filing windows in the desk's own deal flow.

External auditors and assurance teams

Auditors covering registrants with active shelf programs use cik and filedAt to confirm whether a client filed via the 424B8 channel during the audit period, then read the prospectus body to test issuance-fee revenue-recognition timing, the completeness of offerings disclosed in financial-statement footnotes, and the design effectiveness of disclosure-controls procedures for ICFR walk-throughs.

Forensic accountants and securities litigation support

Experts supporting Section 11 and Section 12 cases use accessionNo and filedAt as chain-of-custody anchors that fix the EDGAR receipt time, and use the HTML body to compare the operative offering terms against the version delivered to investors. Both feed reliance and damages opinions in expert reports.

Prospectus-data vendors and fintech aggregators

Vendors building structured-note inventories and prospectus libraries normalize entities[] to internal issuer IDs, deduplicate on accessionNo against other 424(b) channels, and parse the HTML and Exhibit 107 fee tables to enrich product-master records with offering size, fee class, and underlying-asset metadata for downstream wealth-management feeds.

Disclosure-quality and governance researchers

Academic and practitioner researchers treat the corpus as a clean sample of self-cured disclosure failures. They link late-filing frequency and issuer characteristics from entities[] to outcomes such as restatements, enforcement actions, or shelf-program continuation, producing empirical work on Rule 424 compliance.

Specific Use Cases

Because every record is a late-cured Rule 424(b) prospectus, the dataset supports a tight set of workflows that combine the metadata.json header with the primary 424B8 HTML body and, when present, the EX-FILING FEES inline-XBRL exhibit.

Measuring late-filing gaps for Rule 424(b) cure analysis

A securities attorney scoping Section 11 and Section 12(a)(2) exposure for a shelf-program client extracts the trade date, original issue date, and pricing date from the "Key Terms" table in the primary 424B8 HTML body and measures the gap between each date and metadata.filedAt. Grouping by entities[].cik and the registration fileNo produces a per-issuer distribution of cure latencies, a precedent bank of how peers word the late-filing event, and inputs to access-equals-delivery memos under Rule 172.
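The gap calculation can be sketched in a few lines of Python. This is a minimal illustration, not the dataset's own tooling: it assumes filedAt is an ISO 8601 timestamp and that the pricing date has already been extracted from the prospectus body. The function name cure_gap_days and the sample values are hypothetical.

```python
from datetime import date, datetime


def cure_gap_days(filed_at: str, pricing_date: date) -> int:
    """Calendar days between a pricing date and the EDGAR filing date.

    Assumption: filed_at is an ISO 8601 timestamp. The result counts
    calendar days, not business days.
    """
    # Older Python versions reject a trailing "Z", so normalize it first.
    filed = datetime.fromisoformat(filed_at.replace("Z", "+00:00"))
    return (filed.date() - pricing_date).days


# Hypothetical record: priced on 2026-04-01, received by EDGAR on 2026-04-21.
gap = cure_gap_days("2026-04-21T16:05:12-04:00", date(2026, 4, 1))
print(gap)  # 20
```

Rule 424(b) deadlines are expressed in business days, so a production version would need a business-day calendar rather than a plain calendar-day difference.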

Building a structured-note product master from pricing supplements

A structured-product desk strategist iterates over every record where documentFormatFiles[*].description contains PRICING SUPPLEMENT, parses the prospectus HTML for CUSIP/ISIN, underlying basket, barrier/buffer levels, contingent-coupon thresholds, autocall dates, and payoff formulas, and joins each row to offering size and fee class extracted from the ffd: tags in the EX-FILING FEES iXBRL document. Output: a secondary-market reference table keyed by cik and filedAt that feeds payoff back-tests and competitor-launch monitors for issuers like JPMorgan Chase Financial, GS Finance, and Citigroup Global Markets Holdings.
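The record filter described above might look like the following sketch. It assumes metadata.json has already been loaded into a dict and that documentFormatFiles is a list of objects with a description string; the helper name is_pricing_supplement and the sample record are hypothetical.

```python
def is_pricing_supplement(metadata: dict) -> bool:
    """True if any attached document is described as a pricing supplement."""
    files = metadata.get("documentFormatFiles") or []
    return any(
        "PRICING SUPPLEMENT" in (f.get("description") or "").upper()
        for f in files
    )


# Hypothetical metadata records for illustration only.
records = [
    {"accessionNo": "A", "documentFormatFiles": [{"description": "Pricing Supplement"}]},
    {"accessionNo": "B", "documentFormatFiles": [{"description": "424B8"}]},
    {"accessionNo": "C", "documentFormatFiles": [{"description": None}]},
]
selected = [r["accessionNo"] for r in records if is_pricing_supplement(r)]
print(selected)  # ['A']
```

The comparison is case-insensitive because filer-supplied descriptions are free text and may not be consistently capitalized.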

Compliance benchmarking of filing-deadline discipline

A broker-dealer compliance officer aggregates entities[].cik, entities[].companyName, fileNo, and filedAt across the full corpus to compute monthly counts of 424B8 cures per registration statement, then ranks the firm against named peers. The resulting SLA dashboard identifies deal teams that route disproportionately through (b)(8), supports business cases for filing-automation tooling, and produces remediation evidence for examiners.
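A minimal sketch of the monthly aggregation follows. It assumes each metadata record carries filedAt as an ISO 8601 string and that cik and fileNo sit on each entities[] item; the function name monthly_cure_counts and the sample identifiers are placeholders.

```python
from collections import Counter


def monthly_cure_counts(records):
    """Count 424B8 filings per (cik, fileNo, YYYY-MM) key."""
    counts = Counter()
    for meta in records:
        month = meta["filedAt"][:7]  # "YYYY-MM" prefix of the ISO timestamp
        for entity in meta.get("entities", []):
            counts[(entity.get("cik"), entity.get("fileNo"), month)] += 1
    return counts


# Hypothetical records with placeholder CIKs and file numbers.
sample = [
    {"filedAt": "2026-04-02T09:00:00-04:00", "entities": [{"cik": "1", "fileNo": "333-000001"}]},
    {"filedAt": "2026-04-20T09:00:00-04:00", "entities": [{"cik": "1", "fileNo": "333-000001"}]},
    {"filedAt": "2026-05-01T09:00:00-04:00", "entities": [{"cik": "2", "fileNo": "333-000002"}]},
]
counts = monthly_cure_counts(sample)
print(counts.most_common(1))  # [(('1', '333-000001', '2026-04'), 2)]
```

Because co-registrants appear as separate entities, one accession can increment several keys. Deduplicating on accessionNo first is the safer choice when a per-filing count is needed.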

Co-registrant and guarantor mapping for finance subsidiaries

A reference-data engineer at a prospectus-data vendor walks entities[] for every record, splitting on the (Filer) versus (Subject) role suffix and on hyphen-suffixed fileNo values (333-XXXXXX-NN) to reconstruct issuer-guarantor pairs such as JPMorgan Chase Financial / JPMorgan Chase & Co. or GS Finance Corp. / The Goldman Sachs Group. The output is a normalized issuer-guarantor crosswalk plus SIC and stateOfIncorporation attributes used to enrich downstream wealth-management product feeds.
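The splitting step can be illustrated with two regular expressions. This sketch assumes the role appears as a trailing parenthesized suffix on companyName and that co-registrant file numbers follow the 333-XXXXXX-NN pattern mentioned above. The helper names, CIKs, and file numbers are placeholders.

```python
import re

ROLE_RE = re.compile(r"^(?P<name>.*?)\s*\((?P<role>[^()]*)\)\s*$")
FILE_NO_RE = re.compile(r"^(?P<base>\d{3}-\d+)(?:-(?P<suffix>\d+))?$")


def split_role(company_name):
    """Split 'Name (Filer)' into ('Name', 'Filer'); role is None if absent."""
    m = ROLE_RE.match(company_name)
    return (m["name"], m["role"]) if m else (company_name.strip(), None)


def split_file_no(file_no):
    """Split '333-123456-01' into ('333-123456', '01'); suffix may be None."""
    m = FILE_NO_RE.match(file_no)
    return (m["base"], m["suffix"]) if m else (file_no, None)


def group_by_registration(entities):
    """Group entities that share the same base registration file number."""
    groups = {}
    for e in entities:
        name, role = split_role(e.get("companyName", ""))
        base, suffix = split_file_no(e.get("fileNo", ""))
        groups.setdefault(base, []).append({"name": name, "role": role, "suffix": suffix})
    return groups


# Placeholder file numbers; real values come from metadata.json.
entities = [
    {"companyName": "JPMorgan Chase Financial Co. LLC (Filer)", "fileNo": "333-000000-01"},
    {"companyName": "JPMorgan Chase & Co. (Filer)", "fileNo": "333-000000"},
]
groups = group_by_registration(entities)
print([m["name"] for m in groups["333-000000"]])
```

Which entity in a group is the issuer and which is the guarantor still has to be confirmed from the prospectus body or the registration statement; the file-number suffix alone does not say.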

Training and evaluating prospectus-extraction models

An applied NLP team uses the corpus as a bounded fine-tuning slice for term-extraction and risk-factor classification on structured-note language. The HTML bodies cover the three distinct layout regimes described in the anatomy section (semantic HTML, absolutely-positioned <DIV> print templates, and PDF-rendered tile HTML), and metadata.json supplies clean labels (formType, entities[].sic, documentFormatFiles[*].description) for supervised fine-tuning, RAG retrieval evaluation, and layout-robustness testing.

Reconciling fee-table disclosures against prospectus body

An external auditor testing issuance-fee revenue-recognition timing for a shelf-program client filters records by entities[].cik and audit-period filedAt, opens the EX-FILING FEES iXBRL exhibit, and reads ffd:AggtSalesPric, ffd:FeeRate, and ffd:FeeAmt against the aggregate offering price and discount described in the "Plan of Distribution" section of the 424B8 HTML body. Discrepancies feed disclosure-controls walk-throughs for ICFR and completeness testing of offerings recorded in financial-statement footnotes.
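Before comparing the exhibit to the prospectus body, a first sanity check is whether the fee table is internally consistent, that is, whether the aggregate price times the fee rate reproduces the fee amount. The sketch below is an assumption-heavy illustration: it pulls ix:nonFraction facts with a regular expression, ignores iXBRL scale and sign attributes, and uses made-up values. A real pipeline should use a proper iXBRL parser.

```python
import re
from decimal import Decimal

# Simplified pattern: assumes plain numeric content and a double-quoted name attribute.
FACT_RE = re.compile(
    r'<ix:nonFraction\b[^>]*\bname="(ffd:\w+)"[^>]*>\s*([\d,.]+)\s*</ix:nonFraction>',
    re.IGNORECASE,
)


def extract_fee_facts(ixbrl_html: str) -> dict:
    """Map ffd: tag names to Decimal values (last occurrence wins)."""
    return {name: Decimal(raw.replace(",", "")) for name, raw in FACT_RE.findall(ixbrl_html)}


def fee_is_consistent(facts: dict, tolerance: Decimal = Decimal("0.01")) -> bool:
    """Check AggtSalesPric * FeeRate against FeeAmt within a tolerance."""
    expected = facts["ffd:AggtSalesPric"] * facts["ffd:FeeRate"]
    return abs(expected - facts["ffd:FeeAmt"]) <= tolerance


# Illustrative fragment with invented numbers, not taken from a real filing.
sample = (
    '<ix:nonFraction name="ffd:AggtSalesPric" contextRef="c1">10,000,000.00</ix:nonFraction>'
    '<ix:nonFraction name="ffd:FeeRate" contextRef="c1">0.00015310</ix:nonFraction>'
    '<ix:nonFraction name="ffd:FeeAmt" contextRef="c1">1,531.00</ix:nonFraction>'
)
facts = extract_fee_facts(sample)
print(fee_is_consistent(facts))  # True
```

Decimal is used instead of float so that rounding noise does not produce spurious discrepancies at the cent level.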

Dataset Access

The dataset is distributed through the sec-api.io datasets API. A JSON index endpoint exposes dataset metadata and all container download URLs, while the dataset itself is available either as one consolidated archive or as individual monthly container ZIPs. All download endpoints require a sec-api.io API key, passed either as a ?token=YOUR_API_KEY query parameter or via an Authorization header. The index endpoint itself is public and does not require authentication.

Dataset Index JSON API: https://api.sec-api.io/datasets/form-424b8-files.json

Returns dataset-level metadata (name, description, updatedAt, earliestSampleDate, totalRecords, totalSize, formTypes, containerFormat, fileTypes) and a containers array listing every monthly container with its key, downloadUrl, size, recordsCount, and updatedAt timestamp. Poll this endpoint to detect which containers changed in the latest refresh run and incrementally download only those containers.
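The incremental polling described above can be sketched as follows. The code assumes the caller stores a mapping of container key to updatedAt from the previous run; the names fetch_index and changed_containers and the stored-state format are illustrative choices, not part of the API.

```python
import json
from urllib.request import urlopen

INDEX_URL = "https://api.sec-api.io/datasets/form-424b8-files.json"


def fetch_index(url: str = INDEX_URL) -> dict:
    """Download and parse the public dataset index."""
    with urlopen(url) as resp:
        return json.load(resp)


def changed_containers(index: dict, last_seen: dict) -> list:
    """Return containers that are new or whose updatedAt differs from last_seen."""
    return [
        c for c in index.get("containers", [])
        if last_seen.get(c["key"]) != c["updatedAt"]
    ]


# Offline illustration using a hand-built index instead of a network call.
index = {
    "containers": [
        {"key": "2026/2026-03.zip", "updatedAt": "2026-04-01T02:00:00.000Z"},
        {"key": "2026/2026-04.zip", "updatedAt": "2026-04-21T02:54:31.354Z"},
    ]
}
last_seen = {"2026/2026-03.zip": "2026-04-01T02:00:00.000Z"}
print([c["key"] for c in changed_containers(index, last_seen)])  # ['2026/2026-04.zip']
```

After a successful download, the stored mapping should be updated with the new updatedAt values so that the next run skips unchanged containers.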

Example
{
  "datasetId": "1f13365b-9ae0-696d-8403-2189a750d9c1",
  "datasetDownloadUrl": "https://api.sec-api.io/datasets/form-424b8-files.zip",
  "name": "Form 424B8 Files Dataset",
  "updatedAt": "2026-04-21T02:54:31.354Z",
  "earliestSampleDate": "2006-01-01",
  "totalRecords": 2769,
  "totalSize": 88838022,
  "formTypes": ["424B8"],
  "containerFormat": "ZIP",
  "fileTypes": ["HTML", "JSON", "PDF", "TXT"],
  "containers": [
    {
      "downloadUrl": "https://api.sec-api.io/datasets/form-424b8-files/2026/2026-04.zip",
      "key": "2026/2026-04.zip",
      "size": 1248301,
      "recordsCount": 12,
      "updatedAt": "2026-04-21T02:54:31.354Z"
    }
  ]
}

Download Entire Dataset: https://api.sec-api.io/datasets/form-424b8-files.zip?token=YOUR_API_KEY

A single archive containing every monthly container ZIP from January 2006 to the latest refresh. Use this for an initial bulk load. Requires an API key.

Download Single Container: https://api.sec-api.io/datasets/form-424b8-files/2026/2026-04.zip?token=YOUR_API_KEY

Each container is a monthly ZIP at form-424b8-files/YYYY/YYYY-MM.zip. Inside, one folder per accession number holds a metadata.json file plus the filing's original EDGAR documents in HTML, TXT, JSON, and PDF form (image attachments excluded). Requires an API key.

Example with curl:

curl -o 2026-04.zip "https://api.sec-api.io/datasets/form-424b8-files/2026/2026-04.zip?token=YOUR_API_KEY"

Or with wget:

wget -O 2026-04.zip "https://api.sec-api.io/datasets/form-424b8-files/2026/2026-04.zip?token=YOUR_API_KEY"

For batch downloads, the helper script scripts/download-sec-api-file.js can be used to fetch one or more container files from the dataset index without manually composing each URL.
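Once a container has been downloaded, the per-accession metadata.json files can be read straight from the ZIP without unpacking it. The sketch below assumes the folder layout described above (one folder per accession number, each holding a metadata.json). The helper name iter_metadata and the demo archive contents are hypothetical.

```python
import io
import json
import zipfile


def iter_metadata(zip_source):
    """Yield (member_path, parsed_metadata) for every metadata.json in a container.

    zip_source may be a file path or a file-like object.
    """
    with zipfile.ZipFile(zip_source) as zf:
        for name in zf.namelist():
            if name == "metadata.json" or name.endswith("/metadata.json"):
                with zf.open(name) as fh:
                    yield name, json.load(fh)


# Build a tiny in-memory archive to demonstrate; the accession folder is a placeholder.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr(
        "2026-04/0000000000-26-000001/metadata.json",
        json.dumps({"formType": "424B8", "accessionNo": "0000000000-26-000001"}),
    )
    zf.writestr("2026-04/0000000000-26-000001/prospectus.htm", "<html></html>")
buffer.seek(0)

found = list(iter_metadata(buffer))
print(found[0][1]["formType"])  # 424B8
```

For a real container, pass the downloaded file path (for example "2026-04.zip") instead of the in-memory buffer.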

Frequently Asked Questions

What form does this dataset cover?

The dataset covers Form 424B8, the EDGAR submission code for prospectuses filed pursuant to Rule 424(b)(8) of the Securities Act of 1933. Rule 424(b)(8) is a residual catch-up paragraph used when a prospectus required under another paragraph of Rule 424(b) — typically (b)(2), (b)(3), (b)(5), or (b)(7) — was not filed within the time frame the underlying paragraph requires.

What does one record in this dataset represent?

One record is a single EDGAR submission of Form 424B8, identified by an 18-digit SEC accession number. On disk it is a folder containing a metadata.json describing the EDGAR submission, the primary 424B8 prospectus HTML, and, when the registrant attached one, an Exhibit 107 EX-FILING FEES inline-XBRL HTML document. Each accession is a separate record, even when the same issuer files several late prospectus supplements within the same month.

Who is required to file Form 424B8?

The filer is always the Securities Act registrant whose effective registration statement covers the offering — typically the issuer or, in shelf and structured-product programs, a finance subsidiary with a parent guarantor. The 424B8 population is dominated by well-known seasoned issuers (WKSIs) on automatic shelves and by bank holding companies and their finance subsidiaries issuing medium-term notes and structured notes. Underwriters, dealers, and selling securityholders may be named in the prospectus but do not file Form 424B8 in their own right.

What time period does the dataset cover?

The dataset includes all Form 424B8 filings submitted to EDGAR from January 2006 to the present, refreshed on an ongoing basis as new filings arrive. Coverage starts in 2006 because the SEC's 2005 Securities Offering Reform restructured the shelf and prospectus-supplement regime and clarified the (b)(8) corrective path; earlier paper or pre-Reform filings are not included.

How does Form 424B8 differ from Forms 424B1 through 424B7?

Forms 424B1 through 424B7 are routine, on-time prospectus filings tied to specific paragraphs of Rule 424(b) (initial pricing, shelf takedowns, material changes, selling-holder reoffers, and so on). Form 424B8 is the corrective label used when one of those deadlines has been missed, filed "as soon as practicable after the discovery of the failure to file." The substantive prospectus content is generally what would have been filed under the original paragraph; only the submission type changes.

What file format is the dataset distributed in?

The dataset is distributed as monthly ZIP containers named YYYY/YYYY-MM.zip. Each ZIP unpacks to a YYYY-MM/ directory containing one accession-numbered subfolder per filing, and each accession folder contains a metadata.json plus the registrant-supplied EDGAR documents in HTML, TXT, JSON, and PDF form. Image attachments (GRAPHIC entries) are excluded from the ZIP but remain accessible at their original sec.gov URLs.

How do I download the dataset?

The dataset is served by the sec-api.io datasets API. The public index endpoint at https://api.sec-api.io/datasets/form-424b8-files.json lists every monthly container with its download URL, size, record count, and update timestamp. Authenticated download endpoints require a sec-api.io API key passed as ?token=YOUR_API_KEY or via an Authorization header, and let you fetch either the consolidated archive (form-424b8-files.zip) or any single monthly container (form-424b8-files/YYYY/YYYY-MM.zip).