Form 10-K Files Dataset

The Form 10-K Files Dataset contains the complete filing documents for every Form 10-K and Form 10-K/A annual report submitted to SEC EDGAR by domestic registrants from November 1993 to the present. Each record corresponds to a single EDGAR submission identified by its accession number and includes a metadata.json index file plus all non-image document files from the filing: the primary annual report and exhibits such as officer certifications, subsidiary lists, and material contracts. The dataset covers annual reports required under Section 13 or Section 15(d) of the Securities Exchange Act of 1934, filed by operating companies, REITs, limited partnerships, SPACs, business development companies, and other domestic issuers. Records are distributed in ZIP containers organized by month, with files in HTML, TXT, PDF, JSON, and other non-image formats.

Update Frequency: Daily
Updated at: 2026-04-29
Earliest Sample Date: 1993-11-01
Total Size: 45.7 GB
Total Records: 2,000,092
Container Format: ZIP
Content Types: TXT, JSON, HTML, PDF, XFD, FRM
Form Types: 10-K, 10-K/A

Dataset APIs

Programmatically retrieve the full list of dataset archive files, download URLs and dataset metadata.

Dataset Index JSON API

Download the entire dataset as a single archive file.

Download Entire Dataset:

Download a single container file (e.g. monthly archive) from the dataset.

Download Single Container:

Dataset Files

390 files · 45.7 GB
2026-04.zip · 73.6 MB · 3,030 records
2026-03.zip · 991.6 MB · 37,866 records
2026-02.zip · 721.5 MB · 18,550 records
2026-01.zip · 71.9 MB · 727 records
2025-12.zip · 30.3 MB · 1,029 records
2025-11.zip · 46.9 MB · 1,360 records
2025-10.zip · 14.6 MB · 599 records
2025-09.zip · 37.9 MB · 1,359 records
2025-08.zip · 92.5 MB · 1,260 records
2025-07.zip · 22.4 MB · 866 records
2025-06.zip · 29.3 MB · 1,180 records
2025-05.zip · 40.7 MB · 1,619 records
2025-04.zip · 99.7 MB · 3,977 records
2025-03.zip · 965.3 MB · 36,071 records
2025-02.zip · 747.9 MB · 20,427 records
2025-01.zip · 24.2 MB · 893 records
2024-12.zip · 32.6 MB · 1,173 records
2024-11.zip · 47.0 MB · 1,590 records
2024-10.zip · 21.9 MB · 883 records
2024-09.zip · 49.0 MB · 1,564 records
2024-08.zip · 84.6 MB · 1,420 records
2024-07.zip · 30.3 MB · 1,185 records
2024-06.zip · 32.9 MB · 1,355 records
2024-05.zip · 42.8 MB · 1,759 records
2024-04.zip · 179.4 MB · 7,200 records
2024-03.zip · 784.5 MB · 32,343 records
2024-02.zip · 758.0 MB · 21,509 records
2024-01.zip · 29.3 MB · 970 records
2023-12.zip · 35.5 MB · 1,242 records
2023-11.zip · 41.2 MB · 1,456 records
2023-10.zip · 25.3 MB · 983 records
2023-09.zip · 38.1 MB · 1,329 records
2023-08.zip · 77.1 MB · 1,276 records
2023-07.zip · 25.7 MB · 885 records
2023-06.zip · 40.2 MB · 1,480 records
2023-05.zip · 58.1 MB · 2,148 records
2023-04.zip · 118.0 MB · 4,052 records
2023-03.zip · 1.0 GB · 37,940 records
2023-02.zip · 918.0 MB · 17,198 records
2023-01.zip · 26.1 MB · 796 records
2022-12.zip · 40.0 MB · 1,263 records
2022-11.zip · 55.5 MB · 1,435 records
2022-10.zip · 39.3 MB · 933 records
2022-09.zip · 43.0 MB · 1,443 records
2022-08.zip · 42.5 MB · 1,280 records
2022-07.zip · 22.2 MB · 837 records
2022-06.zip · 34.5 MB · 1,320 records
2022-05.zip · 48.2 MB · 1,762 records
2022-04.zip · 118.0 MB · 4,505 records
2022-03.zip · 1.3 GB · 39,635 records
2022-02.zip · 814.2 MB · 17,885 records
2022-01.zip · 26.9 MB · 896 records
2021-12.zip · 50.0 MB · 1,555 records
2021-11.zip · 51.1 MB · 1,458 records
2021-10.zip · 19.1 MB · 793 records
2021-09.zip · 36.8 MB · 1,499 records
2021-08.zip · 41.9 MB · 1,386 records
2021-07.zip · 23.4 MB · 906 records
2021-06.zip · 37.6 MB · 1,707 records
2021-05.zip · 67.4 MB · 2,377 records
2021-04.zip · 82.0 MB · 3,835 records
2021-03.zip · 839.3 MB · 34,564 records
2021-02.zip · 714.9 MB · 16,871 records
2021-01.zip · 23.7 MB · 925 records
2020-12.zip · 38.6 MB · 1,303 records
2020-11.zip · 50.5 MB · 1,468 records
2020-10.zip · 17.6 MB · 800 records
2020-09.zip · 29.6 MB · 1,297 records
2020-08.zip · 41.0 MB · 1,442 records
2020-07.zip · 20.5 MB · 963 records
2020-06.zip · 39.9 MB · 1,942 records
2020-05.zip · 58.5 MB · 2,617 records
2020-04.zip · 57.2 MB · 2,842 records
2020-03.zip · 649.4 MB · 30,662 records
2020-02.zip · 728.9 MB · 18,304 records
2020-01.zip · 39.3 MB · 1,076 records
2019-12.zip · 32.9 MB · 1,270 records
2019-11.zip · 69.9 MB · 1,832 records
2019-10.zip · 22.1 MB · 1,421 records
2019-09.zip · 26.5 MB · 1,166 records
2019-08.zip · 38.4 MB · 1,535 records
2019-07.zip · 17.6 MB · 1,048 records
2019-06.zip · 26.4 MB · 1,451 records
2019-05.zip · 35.9 MB · 1,613 records
2019-04.zip · 121.5 MB · 6,128 records
2019-03.zip · 560.1 MB · 28,097 records
2019-02.zip · 561.7 MB · 16,520 records
2019-01.zip · 20.6 MB · 1,004 records
2018-12.zip · 32.5 MB · 1,339 records
2018-11.zip · 40.3 MB · 1,600 records
2018-10.zip · 17.5 MB · 919 records
2018-09.zip · 28.2 MB · 1,282 records
2018-08.zip · 34.8 MB · 1,420 records
2018-07.zip · 16.9 MB · 896 records
2018-06.zip · 32.9 MB · 1,860 records
2018-05.zip · 36.4 MB · 1,667 records
2018-04.zip · 127.4 MB · 6,935 records
2018-03.zip · 568.6 MB · 27,456 records
2018-02.zip · 542.2 MB · 17,140 records
2018-01.zip · 20.1 MB · 1,032 records
2017-12.zip · 30.5 MB · 1,399 records
2017-11.zip · 40.3 MB · 1,663 records
2017-10.zip · 19.0 MB · 971 records
2017-09.zip · 34.2 MB · 1,561 records
2017-08.zip · 35.4 MB · 1,409 records
2017-07.zip · 17.5 MB · 983 records
2017-06.zip · 33.5 MB · 1,846 records
2017-05.zip · 40.8 MB · 1,817 records
2017-04.zip · 70.4 MB · 3,746 records
2017-03.zip · 602.3 MB · 29,774 records
2017-02.zip · 520.1 MB · 16,925 records
2017-01.zip · 22.2 MB · 1,053 records
2016-12.zip · 41.8 MB · 1,707 records
2016-11.zip · 39.0 MB · 1,688 records
2016-10.zip · 23.8 MB · 1,097 records
2016-09.zip · 35.3 MB · 1,791 records
2016-08.zip · 43.6 MB · 1,799 records
2016-07.zip · 19.1 MB · 1,062 records
2016-06.zip · 36.4 MB · 1,973 records
2016-05.zip · 39.7 MB · 1,731 records
2016-04.zip · 80.8 MB · 4,600 records
2016-03.zip · 540.3 MB · 26,864 records
2016-02.zip · 595.6 MB · 19,217 records
2016-01.zip · 24.5 MB · 1,269 records
2015-12.zip · 36.5 MB · 1,626 records
2015-11.zip · 41.0 MB · 1,737 records
2015-10.zip · 21.1 MB · 1,100 records
2015-09.zip · 41.5 MB · 2,031 records
2015-08.zip · 45.0 MB · 1,560 records
2015-07.zip · 37.1 MB · 1,344 records
2015-06.zip · 53.2 MB · 2,345 records
2015-05.zip · 40.9 MB · 1,842 records
2015-04.zip · 97.4 MB · 5,044 records
2015-03.zip · 667.6 MB · 29,424 records
2015-02.zip · 488.0 MB · 16,591 records
2015-01.zip · 27.0 MB · 1,492 records
2014-12.zip · 44.1 MB · 2,260 records
2014-11.zip · 48.8 MB · 1,927 records
2014-10.zip · 28.4 MB · 1,494 records
2014-09.zip · 63.5 MB · 2,236 records
2014-08.zip · 78.6 MB · 1,822 records
2014-07.zip · 32.8 MB · 1,375 records
2014-06.zip · 38.5 MB · 2,294 records
2014-05.zip · 46.7 MB · 2,390 records
2014-04.zip · 140.9 MB · 5,916 records
2014-03.zip · 645.2 MB · 27,482 records
2014-02.zip · 495.4 MB · 16,717 records
2014-01.zip · 28.4 MB · 1,572 records
2013-12.zip · 46.5 MB · 2,356 records
2013-11.zip · 50.7 MB · 2,013 records
2013-10.zip · 29.2 MB · 1,618 records
2013-09.zip · 45.1 MB · 2,285 records
2013-08.zip · 83.8 MB · 2,110 records
2013-07.zip · 31.6 MB · 1,839 records
2013-06.zip · 58.0 MB · 2,367 records
2013-05.zip · 44.7 MB · 2,252 records
2013-04.zip · 239.8 MB · 9,129 records
2013-03.zip · 611.5 MB · 23,565 records
2013-02.zip · 447.3 MB · 15,949 records
2013-01.zip · 29.9 MB · 1,563 records
2012-12.zip · 44.5 MB · 2,298 records
2012-11.zip · 51.0 MB · 2,083 records
2012-10.zip · 36.5 MB · 1,858 records
2012-09.zip · 43.7 MB · 2,362 records
2012-08.zip · 48.0 MB · 2,005 records
2012-07.zip · 30.7 MB · 1,684 records
2012-06.zip · 43.8 MB · 2,350 records
2012-05.zip · 43.0 MB · 2,223 records
2012-04.zip · 114.1 MB · 6,493 records
2012-03.zip · 562.8 MB · 25,602 records
2012-02.zip · 485.3 MB · 17,790 records
2012-01.zip · 29.4 MB · 1,739 records
2011-12.zip · 53.7 MB · 2,674 records
2011-11.zip · 52.1 MB · 2,272 records
2011-10.zip · 35.1 MB · 1,776 records
2011-09.zip · 68.2 MB · 2,947 records
2011-08.zip · 45.8 MB · 2,238 records
2011-07.zip · 32.2 MB · 1,751 records
2011-06.zip · 53.6 MB · 2,721 records
2011-05.zip · 50.1 MB · 2,845 records
2011-04.zip · 130.5 MB · 6,247 records
2011-03.zip · 656.6 MB · 29,922 records
2011-02.zip · 429.0 MB · 15,609 records
2011-01.zip · 39.5 MB · 2,080 records
2010-12.zip · 63.5 MB · 3,023 records
2010-11.zip · 48.9 MB · 2,492 records
2010-10.zip · 36.8 MB · 2,141 records
2010-09.zip · 57.2 MB · 3,009 records
2010-08.zip · 71.4 MB · 2,367 records
2010-07.zip · 33.3 MB · 1,960 records
2010-06.zip · 56.3 MB · 2,897 records
2010-05.zip · 48.2 MB · 2,390 records
2010-04.zip · 156.7 MB · 7,578 records
2010-03.zip · 705.7 MB · 31,145 records
2010-02.zip · 400.8 MB · 15,040 records
2010-01.zip · 42.0 MB · 2,116 records
2009-12.zip · 55.2 MB · 2,927 records
2009-11.zip · 54.4 MB · 2,584 records
2009-10.zip · 42.6 MB · 2,446 records
2009-09.zip · 56.9 MB · 3,118 records
2009-08.zip · 45.8 MB · 2,503 records
2009-07.zip · 41.0 MB · 2,516 records
2009-06.zip · 63.1 MB · 3,301 records
2009-05.zip · 54.1 MB · 2,913 records
2009-04.zip · 153.6 MB · 8,257 records
2009-03.zip · 753.0 MB · 37,032 records
2009-02.zip · 445.4 MB · 16,112 records
2009-01.zip · 38.0 MB · 2,061 records
2008-12.zip · 58.9 MB · 2,812 records
2008-11.zip · 35.9 MB · 1,920 records
2008-10.zip · 29.3 MB · 1,736 records
2008-09.zip · 52.8 MB · 2,788 records
2008-08.zip · 34.4 MB · 1,817 records
2008-07.zip · 33.4 MB · 2,272 records
2008-06.zip · 55.0 MB · 4,187 records
2008-05.zip · 44.1 MB · 3,371 records
2008-04.zip · 107.3 MB · 5,776 records
2008-03.zip · 604.7 MB · 42,302 records
2008-02.zip · 440.0 MB · 17,276 records
2008-01.zip · 20.9 MB · 1,200 records
2007-12.zip · 43.5 MB · 2,339 records
2007-11.zip · 39.2 MB · 1,881 records
2007-10.zip · 22.9 MB · 1,247 records
2007-09.zip · 40.9 MB · 2,550 records
2007-08.zip · 34.8 MB · 1,773 records
2007-07.zip · 27.5 MB · 1,489 records
2007-06.zip · 50.7 MB · 2,295 records
2007-05.zip · 39.6 MB · 2,079 records
2007-04.zip · 192.2 MB · 9,876 records
2007-03.zip · 738.5 MB · 37,727 records
2007-02.zip · 323.6 MB · 11,082 records
2007-01.zip · 20.3 MB · 1,191 records
2006-12.zip · 65.3 MB · 2,990 records
2006-11.zip · 23.3 MB · 1,225 records
2006-10.zip · 23.0 MB · 1,227 records
2006-09.zip · 52.3 MB · 2,605 records
2006-08.zip · 25.0 MB · 2,088 records
2006-07.zip · 22.6 MB · 1,166 records
2006-06.zip · 54.9 MB · 2,690 records
2006-05.zip · 35.9 MB · 2,159 records
2006-04.zip · 78.4 MB · 4,246 records
2006-03.zip · 795.4 MB · 42,320 records
2006-02.zip · 178.9 MB · 5,759 records
2006-01.zip · 21.3 MB · 1,388 records
2005-12.zip · 61.1 MB · 3,345 records
2005-11.zip · 26.0 MB · 1,505 records
2005-10.zip · 22.1 MB · 1,332 records
2005-09.zip · 59.8 MB · 3,230 records
2005-08.zip · 29.7 MB · 2,147 records
2005-07.zip · 29.0 MB · 1,704 records
2005-06.zip · 67.0 MB · 3,001 records
2005-05.zip · 39.6 MB · 2,852 records
2005-04.zip · 96.7 MB · 6,158 records
2005-03.zip · 815.7 MB · 45,804 records
2005-02.zip · 103.4 MB · 4,066 records
2005-01.zip · 21.2 MB · 1,534 records
2004-12.zip · 50.8 MB · 3,302 records
2004-11.zip · 25.4 MB · 1,447 records
2004-10.zip · 19.4 MB · 1,351 records
2004-09.zip · 55.8 MB · 3,314 records
2004-08.zip · 22.7 MB · 1,470 records
2004-07.zip · 22.3 MB · 1,515 records
2004-06.zip · 66.9 MB · 2,997 records
2004-05.zip · 30.6 MB · 1,909 records
2004-04.zip · 84.7 MB · 6,039 records
2004-03.zip · 834.3 MB · 45,184 records
2004-02.zip · 88.8 MB · 4,508 records
2004-01.zip · 23.4 MB · 1,783 records
2003-12.zip · 59.7 MB · 3,537 records
2003-11.zip · 19.0 MB · 1,106 records
2003-10.zip · 24.8 MB · 1,791 records
2003-09.zip · 59.0 MB · 3,534 records
2003-08.zip · 22.0 MB · 1,339 records
2003-07.zip · 27.2 MB · 1,370 records
2003-06.zip · 61.7 MB · 2,587 records
2003-05.zip · 39.5 MB · 1,958 records
2003-04.zip · 105.3 MB · 5,365 records
2003-03.zip · 734.8 MB · 35,052 records
2003-02.zip · 61.1 MB · 2,492 records
2003-01.zip · 24.7 MB · 1,319 records
2002-12.zip · 49.9 MB · 2,687 records
2002-11.zip · 23.3 MB · 1,590 records
2002-10.zip · 20.7 MB · 1,198 records
2002-09.zip · 48.2 MB · 2,619 records
2002-08.zip · 22.2 MB · 1,227 records
2002-07.zip · 28.6 MB · 1,242 records
2002-06.zip · 32.7 MB · 1,657 records
2002-05.zip · 59.7 MB · 1,608 records
2002-04.zip · 196.4 MB · 8,899 records
2002-03.zip · 300.1 MB · 13,979 records
2002-02.zip · 20.0 MB · 1,067 records
2002-01.zip · 15.7 MB · 954 records
2001-12.zip · 29.4 MB · 1,530 records
2001-11.zip · 11.9 MB · 601 records
2001-10.zip · 14.8 MB · 715 records
2001-09.zip · 27.4 MB · 1,523 records
2001-08.zip · 13.9 MB · 759 records
2001-07.zip · 16.5 MB · 851 records
2001-06.zip · 29.5 MB · 1,541 records
2001-05.zip · 28.6 MB · 1,290 records
2001-04.zip · 167.9 MB · 10,325 records
2001-03.zip · 251.7 MB · 14,392 records
2001-02.zip · 18.6 MB · 996 records
2001-01.zip · 14.1 MB · 894 records
2000-12.zip · 28.1 MB · 1,911 records
2000-11.zip · 10.4 MB · 653 records
2000-10.zip · 15.9 MB · 993 records
2000-09.zip · 32.9 MB · 2,152 records
2000-08.zip · 11.3 MB · 789 records
2000-07.zip · 12.8 MB · 883 records
2000-06.zip · 28.6 MB · 2,029 records
2000-05.zip · 22.7 MB · 1,359 records
2000-04.zip · 58.5 MB · 3,863 records
2000-03.zip · 379.5 MB · 24,042 records
2000-02.zip · 17.9 MB · 1,501 records
2000-01.zip · 15.7 MB · 1,169 records
1999-12.zip · 33.4 MB · 3,689 records
1999-11.zip · 11.0 MB · 788 records
1999-10.zip · 13.9 MB · 913 records
1999-09.zip · 32.6 MB · 2,297 records
1999-08.zip · 12.6 MB · 864 records
1999-07.zip · 13.8 MB · 1,013 records
1999-06.zip · 28.2 MB · 1,890 records
1999-05.zip · 15.5 MB · 996 records
1999-04.zip · 66.9 MB · 4,753 records
1999-03.zip · 353.0 MB · 24,660 records
1999-02.zip · 15.6 MB · 1,039 records
1999-01.zip · 15.0 MB · 1,235 records
1998-12.zip · 30.2 MB · 2,259 records
1998-11.zip · 11.1 MB · 795 records
1998-10.zip · 17.2 MB · 1,216 records
1998-09.zip · 34.6 MB · 2,684 records
1998-08.zip · 10.3 MB · 831 records
1998-07.zip · 15.3 MB · 1,116 records
1998-06.zip · 31.1 MB · 2,354 records
1998-05.zip · 24.5 MB · 1,941 records
1998-04.zip · 62.2 MB · 4,855 records
1998-03.zip · 345.6 MB · 26,965 records
1998-02.zip · 17.2 MB · 1,343 records
1998-01.zip · 17.8 MB · 1,335 records
1997-12.zip · 34.7 MB · 2,735 records
1997-11.zip · 11.1 MB · 806 records
1997-10.zip · 18.7 MB · 1,448 records
1997-09.zip · 35.7 MB · 2,833 records
1997-08.zip · 13.4 MB · 1,045 records
1997-07.zip · 14.8 MB · 1,334 records
1997-06.zip · 30.1 MB · 2,341 records
1997-05.zip · 24.7 MB · 1,987 records
1997-04.zip · 54.9 MB · 4,224 records
1997-03.zip · 313.3 MB · 23,772 records
1997-02.zip · 19.1 MB · 1,468 records
1997-01.zip · 18.5 MB · 1,536 records
1996-12.zip · 34.0 MB · 2,691 records
1996-11.zip · 11.6 MB · 915 records
1996-10.zip · 15.2 MB · 1,173 records
1996-09.zip · 33.5 MB · 2,901 records
1996-08.zip · 11.4 MB · 1,004 records
1996-07.zip · 18.2 MB · 1,441 records
1996-06.zip · 18.2 MB · 1,610 records
1996-05.zip · 15.7 MB · 1,260 records
1996-04.zip · 59.7 MB · 4,609 records
1996-03.zip · 124.6 MB · 10,089 records
1996-02.zip · 13.7 MB · 1,136 records
1996-01.zip · 9.3 MB · 832 records
1995-12.zip · 15.1 MB · 1,350 records
1995-11.zip · 5.3 MB · 444 records
1995-10.zip · 7.9 MB · 666 records
1995-09.zip · 17.3 MB · 1,514 records
1995-08.zip · 6.7 MB · 600 records
1995-07.zip · 5.6 MB · 439 records
1995-06.zip · 11.1 MB · 874 records
1995-05.zip · 8.5 MB · 743 records
1995-04.zip · 13.1 MB · 1,083 records
1995-03.zip · 138.8 MB · 9,020 records
1995-02.zip · 8.5 MB · 744 records
1995-01.zip · 6.2 MB · 579 records
1994-12.zip · 10.8 MB · 937 records
1994-11.zip · 3.9 MB · 332 records
1994-10.zip · 4.0 MB · 308 records
1994-09.zip · 11.2 MB · 941 records
1994-08.zip · 4.4 MB · 321 records
1994-07.zip · 4.0 MB · 297 records
1994-06.zip · 6.0 MB · 568 records
1994-05.zip · 7.5 MB · 504 records
1994-04.zip · 13.1 MB · 966 records
1994-03.zip · 130.3 MB · 8,801 records
1994-02.zip · 6.1 MB · 457 records
1994-01.zip · 2.8 MB · 207 records
1993-12.zip · 198.0 KB · 14 records
1993-11.zip · 66.1 KB · 3 records
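
For readers who want to work with the listing above programmatically (for example, to estimate download size for a date range), the following is a minimal sketch of a line parser. The function name parse_listing_line and the regular expression are illustrative and not part of any published API. The conversion between KB, MB, and GB assumes binary multiples of 1024, which the listing does not confirm.

```python
import re

# Matches one listing line, with or without separators between the columns,
# e.g. "2026-04.zip73.6 MB3,030 records" or "2026-04.zip · 73.6 MB · 3,030 records".
LINE_RE = re.compile(
    r"(?P<month>\d{4}-\d{2})\.zip\W*?"
    r"(?P<size>\d+(?:\.\d+)?) (?P<unit>KB|MB|GB)\W*?"
    r"(?P<records>[\d,]+) records"
)

# Assumption: sizes use binary multiples (1 GB = 1024 MB).
UNIT_TO_MB = {"KB": 1 / 1024, "MB": 1.0, "GB": 1024.0}


def parse_listing_line(line):
    """Return month, size in MB, and record count for one listing line, or None."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    return {
        "month": match["month"],
        "size_mb": float(match["size"]) * UNIT_TO_MB[match["unit"]],
        "records": int(match["records"].replace(",", "")),
    }
```

Summing the records field over all parsed lines gives a quick consistency check against the Total Records figure reported for the dataset.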

What This Dataset Contains

The Form 10-K Files Dataset is built from Form 10-K and Form 10-K/A filings as accepted by EDGAR. Form 10-K is the annual report providing a comprehensive financial and operational overview of the reporting company, including audited financial statements prepared under U.S. GAAP, narrative disclosures required by Regulation S-K, and financial statement presentation governed by Regulation S-X. Form 10-K/A is an amendment to a previously filed 10-K, typically correcting or supplementing specific items; it carries the same record structure but its formType field reads "10-K/A".

The dataset spans from November 1993, early in the phase-in of EDGAR electronic filing, to the present. Coverage of the earliest years is partial because electronic filing was phased in: all domestic registrants were required to file electronically by 1996, and filings made only on paper exist solely in the SEC's paper and microfiche archives and are not included. From that point on, the dataset covers the entire population of domestic registrants with active reporting obligations. The dataset is distributed as ZIP containers, with individual filing records containing files in HTML, TXT, PDF, JSON, XFD, and FRM formats.

Content Structure of a Single Form 10-K Filing Record

A single record in the Form 10-K Files Dataset is the complete set of files from one EDGAR submission of a Form 10-K or Form 10-K/A filing. The record is a folder named by the filing's accession number (zero-padded, dashes removed, 18 digits). It contains a metadata.json file and one or more document files (.htm, .txt, .pdf, or other non-image formats) that together reproduce the full non-image content of the original SEC submission. Each record maps one-to-one to a distinct EDGAR accession number.

Each record folder contains two categories of files:

1. metadata.json — a JSON object capturing filing-level attributes and serving as the structural index for all documents in the submission.

2. One or more document files — the primary annual report and any attached exhibits, each wrapped in EDGAR's SGML document envelope. Most are .htm files, but older filings may include .txt files and some submissions contain .pdf documents. Image files (GRAPHIC type) referenced in the metadata are excluded from the dataset; all other document files are present.

The number of document files per record varies substantially. A minimal filing may contain only metadata.json and a single annual report file. A typical large-accelerated-filer submission includes the primary 10-K document plus six to ten exhibit files covering certifications, subsidiary lists, auditor consents, compensation clawback policies, insider trading policies, and material contract descriptions.
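
The folder naming rule described above can be sketched as a pair of conversion helpers, together with a function that groups ZIP member paths by record. This is a minimal sketch: the function names are hypothetical, and the grouping assumes that each member path inside a monthly container includes the 18-digit record folder as one of its path components, which should be verified against an actual archive.

```python
import re
from collections import defaultdict

FOLDER_RE = re.compile(r"\d{18}")


def accession_to_folder(accession_no):
    """'0000320193-25-000079' -> '000032019325000079' (dashes removed)."""
    digits = accession_no.replace("-", "")
    if not FOLDER_RE.fullmatch(digits):
        raise ValueError(f"not an 18-digit accession number: {accession_no!r}")
    return digits


def folder_to_accession(folder):
    """Re-insert the dashes: 10-digit filer ID, 2-digit year, 6-digit sequence."""
    if not FOLDER_RE.fullmatch(folder):
        raise ValueError(f"not an 18-digit record folder: {folder!r}")
    return f"{folder[:10]}-{folder[10:12]}-{folder[12:]}"


def group_by_record(member_names):
    """Map record folder -> file paths relative to that folder.

    Assumption: member paths look like '<folder>/<file>' or
    '<prefix>/<folder>/<file>'. Directory entries are skipped.
    """
    records = defaultdict(list)
    for name in member_names:
        parts = [p for p in name.split("/") if p]
        for i, part in enumerate(parts[:-1]):
            if FOLDER_RE.fullmatch(part):
                records[part].append("/".join(parts[i + 1:]))
                break
    return dict(records)
```

The member_names argument can be taken from zipfile.ZipFile(path).namelist() in the standard library.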

Metadata.json in Detail

The metadata file is the authoritative index to the record. Its top-level scalar fields are:

  • formType: "10-K" or "10-K/A".
  • accessionNo: The SEC accession number in dash-delimited format (e.g., "0000320193-25-000079"), serving as the primary identifier linking the record to EDGAR.
  • description: A human-readable label for the form type.
  • filedAt: ISO 8601 timestamp with timezone offset indicating when EDGAR accepted the filing.
  • periodOfReport: The fiscal year-end date covered, in YYYY-MM-DD format.
  • id: A unique hexadecimal identifier assigned to the filing.
  • linkToFilingDetails: URL to the primary filing document on SEC.gov.
  • linkToTxt: URL to the complete EDGAR submission text file (not included in the record).
  • linkToHtml: URL to the EDGAR filing index page.
  • linkToXbrl: URL to an XBRL viewer; often an empty string for older filings or filings without interactive data.

Four array fields provide deeper structural information:

documentFormatFiles enumerates every document in the submission. Each entry carries sequence (ordering position), size (in bytes, as a string), documentUrl (direct SEC.gov link), description (human-readable label such as "ANNUAL REPORT" or "CERTIFICATION"), and type (the document type code: "10-K", "EX-31.1", "EX-21.1", "GRAPHIC", etc.). This array is the definitive manifest of the submission and includes image files that are excluded from the dataset on disk, enabling consumers to identify missing content and retrieve it from SEC.gov.

dataFiles lists XBRL taxonomy and instance files associated with the filing. For filings with structured data, this typically includes EX-101.SCH (schema), EX-101.CAL (calculation linkbase), EX-101.DEF (definition linkbase), EX-101.LAB (label linkbase), EX-101.PRE (presentation linkbase), and extracted XML instance documents. For filings without XBRL — common before the structured-data mandate and among certain smaller filers — this array is empty.

entities contains one or more objects identifying the filing registrant. Each entity object includes companyName, cik (Central Index Key), tickers (array of stock ticker symbols), sic (SIC code with industry description), stateOfIncorporation, fiscalYearEnd (MMDD format), act (typically "34" for the Exchange Act), fileNo, irsNo, and type.

seriesAndClassesContractsInformation is present but virtually always empty for 10-K filings, as it pertains to investment company series/class structures.
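
As an illustration of how these fields fit together, the sketch below condenses one parsed metadata.json object into a flat summary. The function name and the output keys are hypothetical. It assumes the primary document is the manifest entry whose type equals the record's formType, that the first entities entry is the registrant, and that size values that are not plain digit strings can be ignored.

```python
from datetime import datetime


def summarize_metadata(meta):
    """Condense a parsed metadata.json dict into a flat summary."""
    docs = meta.get("documentFormatFiles", [])
    # Assumption: the primary document carries the same type code as formType.
    primary = next((d for d in docs if d.get("type") == meta.get("formType")), None)
    # Assumption: the first entity is the filing registrant.
    entity = (meta.get("entities") or [{}])[0]
    # size is documented as a string; skip values that are not plain digits.
    sizes = [str(d.get("size", "")).strip() for d in docs]
    return {
        "accessionNo": meta.get("accessionNo"),
        "isAmendment": meta.get("formType") == "10-K/A",
        # filedAt is ISO 8601 with a timezone offset, so the result is timezone-aware.
        "filedAt": datetime.fromisoformat(meta["filedAt"]),
        "company": entity.get("companyName"),
        "cik": entity.get("cik"),
        "primaryDocumentUrl": primary.get("documentUrl") if primary else None,
        "totalBytes": sum(int(s) for s in sizes if s.isdigit()),
        "hasXbrlDataFiles": bool(meta.get("dataFiles")),
    }
```

A record's metadata can be loaded with json.load on the metadata.json file and passed straight to this function.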

Document File Structure: the SGML Document Wrapper

Each document file in the record is wrapped in EDGAR's SGML document envelope. The wrapper opens with <DOCUMENT> and contains structured header fields before the document body:

  • TYPE: The document or exhibit type code (e.g., 10-K, EX-31.1, EX-21.1, EX-23.1).
  • SEQUENCE: An integer indicating the document's position within the filing.
  • FILENAME: The original filename as submitted.
  • DESCRIPTION: An optional human-readable label.
  • TEXT: The wrapper for the actual content, which is typically full HTML markup but may be plain text in older filings.

The content inside the <TEXT> block is the substantive document. For the primary 10-K, this is the complete annual report. For exhibits, it contains the exhibit text.
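
A minimal sketch of reading this wrapper with the standard library follows. The function name parse_sgml_document is hypothetical, and the sketch assumes each header tag sits at the start of its own line with its value on the same line and no closing tag, as in typical EDGAR submissions.

```python
import re

# Header fields appear one per line, e.g. "<TYPE>10-K".
HEADER_RE = re.compile(r"^<(TYPE|SEQUENCE|FILENAME|DESCRIPTION)>(.*)$", re.MULTILINE)
# The body is everything between <TEXT> and the first </TEXT>.
TEXT_RE = re.compile(r"<TEXT>(.*?)</TEXT>", re.DOTALL | re.IGNORECASE)


def parse_sgml_document(raw):
    """Split one wrapped document file into header fields and body."""
    text_match = TEXT_RE.search(raw)
    # Only scan the part before <TEXT> so body markup cannot be mistaken for headers.
    header_part = raw[: text_match.start()] if text_match else raw
    fields = {name.lower(): value.strip() for name, value in HEADER_RE.findall(header_part)}
    fields["body"] = text_match.group(1).strip() if text_match else ""
    return fields
```

Because DESCRIPTION is optional, callers should read it with fields.get("description") rather than indexing.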

File naming conventions vary by filer and filing agent. Some registrants use ticker-based names (aapl-20250927.htm), others use form-type prefixes (form10-k.htm), and filing agents frequently impose their own patterns. The documentFormatFiles array in metadata.json provides the reliable mapping between filenames, types, and descriptions.

Structure of the Primary 10-K Annual Report Document

The primary annual report document is identified by type value "10-K" and typically assigned sequence "1". Its internal structure follows the form's required organization:

Cover page. The filing opens with registrant identification: legal name, state of incorporation, IRS employer identification number, principal office address, telephone number, Commission file number, and stock exchange listing information. The cover page also discloses the fiscal year-end date, filer status (large accelerated filer, accelerated filer, non-accelerated filer, smaller reporting company, or emerging growth company), and whether the registrant has filed all required reports during the preceding 12 months. In inline XBRL filings (2019 onward), cover-page data points are individually tagged with ix: elements.

Part I contains seven items:

  • Item 1 — Business. Description of the registrant's operations, products, services, competitive conditions, regulatory environment, human capital resources (required since November 2020), and material properties. Diversified companies often include segment-level breakdowns.
  • Item 1A — Risk Factors. Enumeration of material risks organized thematically (operational, financial, legal, regulatory, macroeconomic, cybersecurity, etc.). Formalized as a required standalone item under 2005 Regulation S-K amendments; earlier filings sometimes embedded risk discussion within Item 1 or Item 7.
  • Item 1B — Unresolved Staff Comments. Disclosure of any unresolved SEC staff comments on periodic reports. Often states that there are none.
  • Item 1C — Cybersecurity. Disclosure of cybersecurity risk management, strategy, and governance. Added for fiscal years ending on or after December 15, 2023; absent from earlier filings.
  • Item 2 — Properties. Description of the registrant's principal physical properties, including owned and leased facilities, locations, and general character of material properties.
  • Item 3 — Legal Proceedings. Disclosure of material pending legal proceedings, other than ordinary routine litigation, to which the registrant or its subsidiaries are a party.
  • Item 4 — Mine Safety Disclosures. Disclosure of mine safety violations and orders under the Federal Mine Safety and Health Act of 1977, as required by Section 1503(a) of the Dodd-Frank Act. Required for registrants operating mines; other registrants state the item is not applicable.

Part II contains nine items focused on financial performance and capital markets:

  • Item 5 — Market for Registrant's Common Equity, Related Stockholder Matters and Issuer Purchases of Equity Securities. Trading market information, dividends, stock repurchase programs, and a performance comparison graph (or cross-reference to it).
  • Item 6 — [Reserved]. Eliminated by SEC amendments effective February 2021; carried as reserved in post-2021 filings. Earlier filings included five-year selected financial data here.
  • Item 7 — Management's Discussion and Analysis of Financial Condition and Results of Operations (MD&A). The central narrative analysis of financial performance, covering revenue drivers, expense trends, liquidity, capital resources, critical accounting estimates, and known trends. Typically the longest narrative block in the filing.
  • Item 7A — Quantitative and Qualitative Disclosures About Market Risk. Exposure to interest rate, foreign currency, commodity price, and other market risks, including quantitative sensitivity analyses.
  • Item 8 — Financial Statements and Supplementary Data. The complete audited financial statements: balance sheets, income statements, statements of comprehensive income, statements of stockholders' equity, and cash flow statements, together with the accompanying notes to the financial statements. Also includes the independent auditor's report, which since 2019 (under PCAOB AS 3101) must communicate critical audit matters. This section constitutes the bulk of the filing's tabular content and, in inline XBRL filings, is densely tagged with ix:nonFraction and ix:nonNumeric elements.
  • Item 9 — Changes in and Disagreements With Accountants on Accounting and Financial Disclosure. Typically brief; discloses any auditor changes or disagreements.
  • Item 9A — Controls and Procedures. Evaluation of disclosure controls, management's report on internal control over financial reporting, and (for accelerated and large accelerated filers) the auditor's attestation report on internal control. Introduced by the Sarbanes-Oxley Act (2002).
  • Item 9B — Other Information. Catch-all for material information not reported on Form 8-K during the fourth quarter.
  • Item 9C — Disclosure Regarding Foreign Jurisdictions that Prevent Inspections. Added by the Holding Foreign Companies Accountable Act; discloses whether the registrant retained an auditor in a jurisdiction where PCAOB inspection is restricted.

Part III contains five items:

  • Item 10 — Directors, Executive Officers and Corporate Governance.
  • Item 11 — Executive Compensation.
  • Item 12 — Security Ownership of Certain Beneficial Owners and Management and Related Stockholder Matters.
  • Item 13 — Certain Relationships and Related Transactions, and Director Independence.
  • Item 14 — Principal Accountant Fees and Services.

Registrants frequently incorporate Part III items by reference from the proxy statement (DEF 14A). When incorporated by reference, these items contain only a brief cross-reference statement and no substantive content. When included directly, they contain full compensation tables, beneficial ownership tables, and governance disclosures. The proxy statement itself is not part of the 10-K record.

Part IV contains two items:

  • Item 15 — Exhibits and Financial Statement Schedules. Lists all exhibits filed with or incorporated by reference into the 10-K, including exhibit numbers, descriptions, and cross-references to prior filings. The exhibit index maps exhibit numbers (e.g., 3.1, 4.1, 10.1, 21.1, 23.1, 31.1, 32.1) to documents present in the filing or previously filed.
  • Item 16 — Form 10-K Summary. An optional summary of the annual report; rarely used in practice.
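
The item structure above lends itself to a heading-based splitter. The following is a heuristic sketch, not a robust parser: it assumes the document has already been converted to plain text, that each item heading starts its own line in the form "Item 1A." (or with a colon or dash after the number), and that when a label occurs more than once (for example in a table of contents and again in the body) the longest chunk is the real section. The function name split_items is hypothetical.

```python
import re

# "Item", a number with optional A/B/C suffix, then ".", ":", "-" or a dash character.
ITEM_RE = re.compile(
    r"^\s*item\s+(\d{1,2}[abc]?)\s*[.:\u2013\u2014-]",
    re.IGNORECASE | re.MULTILINE,
)


def split_items(plain_text):
    """Return {item label: text from that heading up to the next item heading}."""
    matches = list(ITEM_RE.finditer(plain_text))
    sections = {}
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(plain_text)
        label = match.group(1).upper()
        chunk = plain_text[match.start():end].strip()
        # Table-of-contents lines repeat the labels; keep the longest chunk per label.
        if len(chunk) > len(sections.get(label, "")):
            sections[label] = chunk
    return sections
```

Filings that incorporate items by reference, omit headings, or format them inside tables will defeat this approach, so results should be spot-checked.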

Signatures. The filing concludes with a signature block in which the principal executive officer, principal financial officer, principal accounting officer or controller, and a majority of the board of directors sign the report. Signing authenticates the filing and exposes the signatories to liability for misstatements under Section 18 of the Exchange Act; criminal liability under Section 906 of the Sarbanes-Oxley Act attaches to the separate EX-32 certifications. The signature block also includes the signing date and each signatory's title.

Exhibits Included in the Record

The document files beyond the primary 10-K are exhibits. The most common exhibit types are:

  • EX-31.1 and EX-31.2: Section 302 certifications by the principal executive officer and principal financial officer, certifying accuracy of the report and adequacy of disclosure controls. Short and formulaic.
  • EX-32.1 (sometimes EX-32.2): Section 906 certifications (18 U.S.C. 1350), certifying full compliance with Exchange Act requirements and fair presentation of financial condition.
  • EX-21.1: List of subsidiaries, typically a table of entity names, jurisdictions of organization, and sometimes ownership percentages.
  • EX-23.1: Consent of the independent registered public accounting firm authorizing incorporation of the audit report into registration statements.
  • EX-4.x: Descriptions of securities registered under Section 12, or the instruments themselves (indentures, share terms).
  • EX-10.x: Material contracts — employment agreements, credit facilities, lease agreements, licensing agreements.
  • EX-97: Compensation recovery (clawback) policy, required for fiscal years ending on or after December 1, 2023.
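  • EX-19.1: Insider trading policy, required beginning with the annual report covering the first full fiscal year that starts on or after April 1, 2023 (October 1, 2023 for smaller reporting companies).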

Not all exhibits listed in Item 15 are filed as separate documents within the submission. Exhibits incorporated by reference point to documents in prior filings, and no corresponding file appears in the record. The documentFormatFiles array is the definitive list of files actually present in the submission; the Item 15 exhibit index lists both filed and incorporated exhibits.

What Is Included in the Dataset Record

Each record includes the metadata.json file with all filing-level attributes and the complete document manifest, the primary 10-K annual report document, and all non-image document files submitted with the filing (exhibits, text files, and any other non-GRAPHIC documents). This covers the full narrative, tabular, and financial-statement content of the annual report and its exhibits.

What Is Excluded

Image files. Documents with type GRAPHIC (typically JPG or PNG files used for logos, charts, performance graphs, or signature images) are listed in documentFormatFiles but not present on disk. Their documentUrl values can be used to retrieve them from SEC.gov. Some older filings that rely heavily on scanned images may have sparse or incomplete HTML content without these files.
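
One way to find the excluded files for a given record is to compare the manifest against the files actually present in the record folder. The sketch below is illustrative: the function name is hypothetical, and it assumes the on-disk filename equals the last path segment of documentUrl. It also assumes that when documentUrl points at an SEC inline viewer link of the form ...?doc=/Archives/..., the real path is carried in the doc query parameter.

```python
import posixpath
from urllib.parse import parse_qs, urlparse


def missing_documents(meta, filenames_on_disk):
    """List manifest entries that have no matching file in the record folder."""
    on_disk = {name.lower() for name in filenames_on_disk}
    missing = []
    for doc in meta.get("documentFormatFiles", []):
        url = doc.get("documentUrl", "")
        parsed = urlparse(url)
        # Assumption: viewer-style links carry the document path in ?doc=...
        path = parse_qs(parsed.query).get("doc", [parsed.path])[0]
        filename = posixpath.basename(path)
        if filename and filename.lower() not in on_disk:
            missing.append({"type": doc.get("type"), "filename": filename, "url": url})
    return missing
```

Entries returned with type "GRAPHIC" are the expected exclusions; any other type in the result may indicate an incomplete record or a filename mismatch worth investigating.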

XBRL taxonomy and instance files. Structured data files listed in dataFiles (EX-101.SCH, EX-101.CAL, EX-101.DEF, EX-101.LAB, EX-101.PRE, and XML instance documents) are referenced in the metadata but not included as separate files in the record. For inline XBRL filings, the XBRL tags are embedded directly in the primary document and are therefore present within the record.

Complete submission text file. The monolithic EDGAR submission text file (all documents concatenated with SGML wrappers) is referenced via linkToTxt but not included.

Interpretation and Extraction Notes

SGML wrapper parsing. To extract usable content from each document file, consumers must strip the SGML envelope — the <DOCUMENT>, <TYPE>, <SEQUENCE>, <FILENAME>, <DESCRIPTION>, and <TEXT> tags. Substantive content begins after <TEXT> and ends before </TEXT>.

Inline XBRL tags in HTML. In post-2019 filings, the primary document's HTML contains XBRL namespace elements (<ix:nonNumeric>, <ix:nonFraction>, <ix:header>, <ix:hidden>, etc.) interleaved with standard HTML. Text extraction pipelines must account for these tags to avoid duplicating or mangling content. The <ix:header> block, typically placed near the top of the document, contains context and unit definitions that are not visible content.
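
A minimal sketch of one way to handle these tags before text extraction: drop the whole ix:header block, then unwrap every remaining ix: element so that its visible content stays in place. The function name is hypothetical, and a regex-based approach is a simplification chosen for brevity. It assumes attribute values contain no ">" character; a real pipeline may prefer a proper HTML or XML parser.

```python
import re

# The header block holds contexts, units, and hidden facts, none of it visible text.
IX_HEADER_RE = re.compile(r"<ix:header\b.*?</ix:header>", re.DOTALL | re.IGNORECASE)
# Any opening or closing tag in the ix: namespace, with its attributes.
IX_TAG_RE = re.compile(r"</?ix:[A-Za-z]+\b[^>]*>")


def strip_inline_xbrl(html):
    """Remove the ix:header block and unwrap remaining ix: tags, keeping their content."""
    without_header = IX_HEADER_RE.sub("", html)
    return IX_TAG_RE.sub("", without_header)
```

The remaining text shows values exactly as displayed in the filing; scale and sign attributes on ix:nonFraction are discarded, so this output is suited to text extraction rather than to recovering numeric facts.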

Incorporation by reference. When Part III items are deferred to the proxy statement, the 10-K record contains only a cross-reference statement for those items. The actual compensation, governance, and ownership data must be obtained from the registrant's separate DEF 14A filing, which is not part of this record.

Amendments (10-K/A). An amendment record typically restates only the amended items and exhibits, not the entire annual report. The formType field distinguishes amendments from original filings. The amendment's cover page usually identifies which items are being amended.
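
Because an amendment usually needs to be read alongside the original it modifies, it can help to group records before analysis. The sketch below pairs them by registrant and fiscal period. The pairing key (first entity's cik plus periodOfReport) is an assumption made for illustration, since the metadata described here has no explicit link from an amendment to its original. The function name is hypothetical.

```python
from collections import defaultdict


def group_amendments(metadata_list):
    """Group accession numbers by (cik, periodOfReport), split by form type."""
    groups = defaultdict(lambda: {"originals": [], "amendments": []})
    for meta in metadata_list:
        entities = meta.get("entities") or [{}]
        # Assumption: the first entity is the registrant.
        key = (entities[0].get("cik"), meta.get("periodOfReport"))
        bucket = "amendments" if meta.get("formType") == "10-K/A" else "originals"
        groups[key][bucket].append(meta["accessionNo"])
    return dict(groups)
```

Groups with amendments but no original can occur when the original was filed in an earlier month than the containers loaded, so the grouping is best run across all relevant monthly archives.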

Exhibit availability vs. exhibit index. The exhibit index in Item 15 lists both exhibits filed with the current submission and exhibits incorporated by reference from prior filings. Only the former have corresponding files in the record. The documentFormatFiles array reflects what was actually submitted; the Item 15 index is the broader catalog.

Format evolution. Record structure varies significantly across the dataset's time span. Filings from the mid-1990s may consist of a single plain-text or minimal-HTML file with rudimentary table formatting. Filings from the 2009–2018 period typically contain richly formatted HTML with separate XBRL taxonomy files referenced in dataFiles. Post-2019 filings contain inline-XBRL-tagged HTML where the primary document carries both human-readable content and machine-readable structured data in a single file. The SGML document wrapper persists across all eras.
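
A pipeline that handles all three eras may want to route each primary document to a suitable extractor. The following rough classifier sketches that idea. The category names and marker checks are illustrative heuristics rather than definitions taken from the dataset: the body argument is the content of the TEXT block, and has_data_files reflects whether the dataFiles array in metadata.json is non-empty.

```python
def classify_format(body, has_data_files):
    """Heuristically label a primary document by formatting era."""
    lowered = body.lower()
    # Inline XBRL documents contain elements in the ix: namespace.
    if "<ix:" in lowered:
        return "inline-xbrl-html"
    # Plain-text filings can contain SGML tags such as <TABLE> or <PAGE>,
    # so only markers specific to HTML documents are checked here.
    if any(marker in lowered for marker in ("<html", "<body", "<div")):
        return "html-with-separate-xbrl" if has_data_files else "html"
    return "plain-text"
```

Checking markers rather than filing dates keeps the classifier tolerant of filers who adopted a format earlier or later than the typical periods given above.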

Filer variation. Record complexity ranges enormously. A large-accelerated filer may produce a record with seven or more exhibit files, dense inline XBRL tagging, and hundreds of pages of financial content. A smaller filer may produce a record with only the metadata and a single document file containing the entire 10-K with minimal formatting.

Who Files or Publishes This Dataset, and When

A Form 10-K record represents an annual report (or amendment) filed on EDGAR by a domestic registrant with an active reporting obligation under the Securities Exchange Act of 1934. The filer is always the issuer itself — the company, trust, partnership, or other entity whose securities are registered. The trigger is the close of the registrant's fiscal year.

Filing Population

Form 10-K filers hold a reporting obligation under one of two Exchange Act provisions:

  • Section 13 filers have a class of securities registered under Section 12, either through exchange listing (Section 12(b)) or by exceeding the asset and holder-of-record thresholds of Section 12(g). The reporting obligation lasts as long as the Section 12 registration is effective.

  • Section 15(d) filers have an effective Securities Act registration statement (e.g., Form S-1, Form S-11) but no Section 12 registration. The obligation runs for the fiscal year in which the registration statement became effective and each subsequent year, unless suspended because the issuer has fewer than 300 holders of record (1,200 for banks, bank holding companies, and savings and loan holding companies) and is not exchange-listed.

Entity types in the filing population include operating companies (corporations, LLCs), REITs, limited partnerships and MLPs, SPACs, shell and development-stage companies, business development companies, and certain asset-backed issuers whose structure requires Form 10-K rather than specialized ABS forms.

Who Does Not File Form 10-K

  • Foreign private issuers file annual reports on Form 20-F (or Form 40-F for qualifying Canadian issuers under MJDS). Foreign private issuer status is tested under Rule 3b-4; an issuer that fails the test files on domestic forms regardless of where it is incorporated.
  • Registered investment companies (mutual funds, closed-end funds, ETFs, UITs) report on Form N-CSR or other fund-specific forms. The exception is business development companies, which file on Form 10-K.
  • Regulation A (Tier 2) issuers file annual reports on Form 1-K under a separate, lighter reporting regime.
  • Issuers that have terminated or suspended reporting (e.g., by filing Form 15) no longer file and drop out of the dataset.

Trigger and Deadlines

The filing obligation is codified in Rules 13a-1 and 15d-1. Each requires the issuer to file an annual report within a set number of days after its fiscal year end. The deadline depends on the issuer's filer category, determined under Rule 12b-2 based on public float as of the last business day of the most recently completed second fiscal quarter:

Filer category | Public float | Deadline
Large accelerated filer | >= $700 million | 60 days
Accelerated filer | $75 million to < $700 million | 75 days
Non-accelerated filer | < $75 million or not calculable | 90 days

Filer category can change year to year, with transition rules to prevent oscillation.
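
The float thresholds and deadlines above can be expressed as a small Python sketch. It is deliberately simplified: it classifies on public float alone, ignores the transition rules and the other Rule 12b-2 conditions, and returns the raw calendar date without rolling a weekend or holiday deadline to the next business day. The function names are hypothetical.

```python
from datetime import date, timedelta

LARGE_ACCELERATED_FLOOR = 700_000_000  # public float, USD
ACCELERATED_FLOOR = 75_000_000

DEADLINE_DAYS = {"large accelerated": 60, "accelerated": 75, "non-accelerated": 90}


def filer_category(public_float):
    """Classify by public float only; None means the float is not calculable."""
    if public_float is None or public_float < ACCELERATED_FLOOR:
        return "non-accelerated"
    if public_float < LARGE_ACCELERATED_FLOOR:
        return "accelerated"
    return "large accelerated"


def filing_deadline(fiscal_year_end: date, public_float) -> date:
    """Raw calendar deadline: fiscal year end plus 60, 75, or 90 days."""
    return fiscal_year_end + timedelta(days=DEADLINE_DAYS[filer_category(public_float)])


fye = date(2025, 12, 31)
for float_usd in (1_000_000_000, 200_000_000, None):
    print(filer_category(float_usd), filing_deadline(fye, float_usd))
```

For a December 31, 2025 year end this yields March 1, March 16, and March 31, 2026. March 1, 2026 falls on a Sunday, which is why a production version would need a business-day adjustment.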

Smaller reporting companies (public float < $250 million, or revenue < $100 million when float is < $700 million or not calculable) receive scaled disclosure relief under Regulation S-K. SRC status and accelerated-filer status are determined independently; an SRC that is an accelerated filer still faces the 75-day deadline.
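
The two-pronged SRC test above reads more easily as code. This is a minimal sketch of the thresholds exactly as stated in the paragraph; the function name is hypothetical, and the actual determination involves measurement dates and transition thresholds that are not modeled here.

```python
def is_smaller_reporting_company(public_float, annual_revenue):
    """Apply the SRC thresholds; public_float=None means not calculable."""
    # Prong 1: public float below $250 million.
    if public_float is not None and public_float < 250_000_000:
        return True
    # Prong 2: revenue below $100 million, with float below $700 million or not calculable.
    float_qualifies = public_float is None or public_float < 700_000_000
    return float_qualifies and annual_revenue < 100_000_000


# Float of $100 million: an SRC, yet still inside the accelerated-filer float range.
print(is_smaller_reporting_company(100_000_000, 500_000_000))
# Float of $500 million with $150 million revenue: neither prong is met.
print(is_smaller_reporting_company(500_000_000, 150_000_000))
```

The first call shows the independence noted above: the issuer qualifies for scaled disclosure while its float still places it in the 75-day deadline band.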

Emerging growth companies (recent IPO issuers with annual gross revenue below the inflation-adjusted threshold, currently $1.235 billion) are exempt from certain requirements, notably the SOX Section 404(b) auditor attestation on internal controls. EGC status expires at the earliest of: the last day of the fiscal year following the fifth anniversary of the IPO, crossing the revenue threshold, issuing more than $1 billion in non-convertible debt over three years, or becoming a large accelerated filer.

Filing Cadence

Form 10-K is filed once per fiscal year. The fiscal year end is chosen by the registrant and need not be December 31. Deadlines run from the issuer's own fiscal year end, so filings arrive throughout the year, with heavy concentration after December 31 and other common year ends.

A registrant must file for every fiscal year during which its reporting obligation is active, regardless of whether it has revenue or operations.

Amendments (Form 10-K/A)

Form 10-K/A amends a previously filed 10-K. Each amendment is a separate EDGAR record with its own accession number, referencing the same fiscal-year-end period as the original but carrying a later filing date. Common reasons for amendment:

  • Restatement of financial statements.
  • Correction of material misstatements or omissions.
  • Late filing of required exhibits (certifications, material contracts).
  • Supplying Part III disclosures (directors, compensation, beneficial ownership) that were omitted from the original 10-K under the 120-day incorporation-by-reference rule when the definitive proxy statement was not filed in time.

How This Dataset Differs From Similar Datasets or Filings

Form 10-Q (Quarterly Reports) — The nearest relative. Both are periodic Exchange Act reports filed by the same domestic registrants under Regulation S-K and S-X. The 10-Q covers a single fiscal quarter with unaudited interim financials reviewed (not audited) by the external auditor. The 10-K covers the full fiscal year with audited financial statements and an independent auditor's report. The 10-K is substantially more expansive: it requires the full business description (Item 1), complete risk factors (Item 1A), properties, legal proceedings, executive compensation, and exhibits rarely found on a 10-Q such as Exhibit 21 (subsidiary list) and auditor consent (Exhibit 23). The 10-Q provides interim updates between annual filings; the 10-K is the authoritative annual baseline. They complement each other but are not substitutes.

Form 20-F (Foreign Private Issuer Annual Reports) — The annual report equivalent for foreign private issuers (FPIs) listed on U.S. exchanges. Like the 10-K, it delivers audited financial statements, business descriptions, risk factors, and management discussion on an annual cycle. Several differences matter. 20-F financials may be prepared under IFRS as issued by the IASB (no reconciliation required) or under local GAAP with a U.S. GAAP reconciliation, rather than under U.S. GAAP outright. The filing deadline is four months after fiscal year-end, versus 60 to 90 days for the 10-K. The item numbering and disclosure structure differ, with FPI-specific content such as exchange controls and home-jurisdiction taxation. FPIs are also exempt from proxy rules and Section 16 short-swing profit reporting. The Form 10-K Files Dataset covers only domestic registrants; a cross-border annual-report analysis requires the 20-F dataset as well.

XBRL Financial Datasets (Structured Numeric Extracts) — The most important structural alternative for financial data extraction. XBRL datasets draw their source values from the same 10-K (and 10-Q) filings, but deliver pre-extracted, taxonomy-tagged numeric facts (revenue, net income, total assets, etc.) in tabular form ready for cross-company quantitative comparison. The Form 10-K Files Dataset preserves the complete filing documents: full narrative content (MD&A, risk factors, business descriptions, legal proceedings), all attached exhibits (material contracts, subsidiary lists, certifications), and the original document formatting. XBRL tagging does not capture narrative text or exhibit content. Use XBRL datasets for comparable financial line items at scale; use the Form 10-K Files Dataset for narrative analysis, exhibit content, NLP tasks, or original-presentation access.

Form 8-K (Current Reports) — The event-driven counterpart filed by the same domestic registrants. An 8-K is triggered by specific material events (changes in control, material agreements, officer departures) and must be filed within four business days. Some content overlaps: a material contract first disclosed on an 8-K often reappears as an exhibit on the next 10-K. But the 8-K provides real-time event notification on an irregular schedule, while the 10-K provides the comprehensive annual picture on a fixed calendar. The filing triggers, cadence, and content scope are fundamentally different.

Key Differences

Dimension | Form 10-K Files Dataset | Nearest alternatives
Periodicity | Annual (fiscal year-end) | 10-Q: quarterly; 8-K: event-driven; 20-F: annual
Filer population | Domestic registrants only | 20-F: foreign private issuers; 40-F: qualifying Canadian issuers
Financial statements | Audited, U.S. GAAP | 10-Q: unaudited interim; 20-F: may use IFRS
Content scope | Full documents, narratives, and exhibits | XBRL: tagged numeric values only
Coverage period | November 1993 to present (includes 10-K/A amendments) | Varies by dataset

Boundary Notes

Form 40-F (Canadian cross-listed issuers under the MJDS) serves a similar annual-report function but covers a small, distinct filer population using Canadian disclosure standards. It is a niche complement, not a substitute.

Form 10-KSB was the simplified annual report for small business issuers, retired in 2008. Since its elimination, all domestic filers have used the standard 10-K, with scaled-disclosure accommodations for smaller reporting companies. The Form 10-K Files Dataset does not include 10-KSB filings; pre-2009 research on small companies must account for this gap separately.

Annual Reports to Shareholders (ARS) are shareholder-facing documents, often produced as corporate marketing pieces, with a largely free format, narrower required content than Form 10-K, and no systematic EDGAR availability for most of this dataset's time span. The 10-K is the comprehensive, legally mandated annual disclosure; the ARS is not a substitute for it.

What Makes This Dataset Distinct

The Form 10-K Files Dataset is the document-level annual report archive for domestic SEC registrants, covering the full filing as submitted: narrative disclosures, exhibits, metadata, and inline XBRL where applicable. It is distinct from quarterly and event-driven datasets in periodicity, from foreign-issuer annual forms in filer population and accounting standards, and from XBRL extracts in preserving complete document content rather than only tagged numeric values. For any work requiring access to the actual annual report documents as filed — including exhibits, narrative sections, and original formatting — this is the primary source.

Who Uses This Dataset

The Form 10-K Files Dataset supports professionals who need full-text annual reports with historical depth back to 1993 and broad registrant coverage.

Equity Research Analysts and Fundamental Investors

Sell-side and buy-side analysts anchor company-level valuation on the audited financial statements, footnotes, and MD&A. Financial statements feed ratio analysis, earnings-quality checks, and DCF models. MD&A reveals management's view of operating trends and capital allocation. Analysts compare MD&A language year over year to detect tone shifts, new risks, or guidance changes, and use the dataset's longitudinal coverage to build multi-year financials and benchmark peers.

Credit Analysts

Credit teams at rating agencies, banks, and fixed-income managers assess debt-service capacity using debt schedules, lease and pension footnotes, off-balance-sheet arrangements, and covenant disclosures. The liquidity discussion in MD&A and risk factors (Item 1A) surface concentration, regulatory, and litigation risks relevant to repayment. The dataset's inclusion of 10-K/A amendments matters because restated financials can signal control weaknesses that affect credit quality.

Forensic Accountants and Audit Teams

Forensic accountants and assurance teams use footnotes (revenue recognition, related-party transactions, contingent liabilities, fair value), the auditor's report, and SOX Section 302/906 certifications to spot anomalies and benchmark accounting policy choices. Comparing footnote disclosures across years and peers reveals unusual estimate changes or aggressive recognition. 10-K/A amendments are especially valuable since they often reflect restatements or error corrections warranting deeper investigation.

Securities Lawyers and Disclosure Counsel

Securities attorneys benchmark risk factors (Item 1A), legal proceedings (Item 3), related-party disclosures (Item 13), and material contract exhibits across registrants to draft or review client filings. In litigation, historical 10-K filings establish what a company disclosed or omitted at specific points in time. Coverage back to 1993 and inclusion of amendments directly support litigation timelines.

M&A Due Diligence Teams

Due diligence teams at investment banks, private equity firms, and advisory practices review the target's business description, risk factors, legal proceedings, financial statements, and material contract exhibits. The subsidiary list (Exhibit 21) is critical for mapping entity structure and jurisdictional exposure. Comparing multiple years of filings surfaces changes in segment reporting, revenue mix, or risk disclosure that may not appear in management presentations.

Corporate Finance and Investor Relations Teams

Finance and IR professionals at public companies benchmark their own disclosures against peers. They review how comparable registrants present segment results, structure risk factors, and frame MD&A narratives around shared industry headwinds. Executive compensation disclosures (Item 11 or the incorporated proxy) inform pay-design benchmarking and governance discussions.

Corporate Governance Analysts

Governance analysts at proxy advisory firms and institutional stewardship teams review Items 10 through 14 for board composition, related-party transactions, audit committee designations, and code-of-ethics references. SOX certifications and the auditor's internal-control report provide additional governance signals. This information feeds company scoring, voting recommendations, and engagement reports.

Financial Data Engineers and Quantitative Researchers

Data engineering teams parse the HTML filing document and inline XBRL attachments to extract financial line items, segment data, risk-factor text, and metadata into structured databases. Quantitative researchers use the resulting data for factor construction, cross-sectional screening, and backtesting. The dataset's coverage from 1993 provides the long time series backtesting requires.

NLP and Machine Learning Engineers

ML teams use Form 10-K filings as a large-scale financial text corpus. MD&A, risk factors, and business descriptions offer long-form narrative with consistent structure across thousands of companies and decades. Common tasks include sentiment analysis, topic classification, named-entity extraction, disclosure similarity, and change detection. Teams building retrieval-augmented generation systems over SEC content use 10-K filings as a core retrieval corpus given the breadth of topics each filing covers.

Academic and Policy Researchers

Accounting and finance researchers study disclosure behavior, reporting quality, market reactions, and regulatory impact across the dataset's multi-decade span, which covers Sarbanes-Oxley adoption, segment-reporting standard changes, and the inline XBRL transition. Full filing text supports readability, tone, and boilerplate studies; financial statements support archival work on earnings quality and accruals. 10-K/A amendments enable research on restatement frequency and consequences.

Specific Use Cases

The Form 10-K Files Dataset supports workflows that require the full text of annual reports — narratives, exhibits, and metadata — rather than pre-extracted financial data points.

Tracking Risk Factor Evolution Across Years and Peers

Analysts and disclosure counsel extract Item 1A risk factors from the primary 10-K document across multiple filing years for the same registrant, or across peer companies in the same SIC code, to detect newly added risks, removed disclosures, or shifting language emphasis. The periodOfReport and entities.sic fields in metadata.json enable year-over-year and cross-industry filtering. This supports compliance benchmarking, competitive intelligence, and early identification of emerging sector-wide risks such as cybersecurity or supply-chain concentration.
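
A minimal sketch of the year-over-year comparison, assuming the Item 1A text for two filing years has already been extracted to plain text with blank lines between risk factors. It uses Python's standard difflib at paragraph granularity; the function names and sample paragraphs are hypothetical.

```python
import difflib


def split_paragraphs(text: str) -> list[str]:
    """Split on blank lines and normalize internal whitespace."""
    return [" ".join(p.split()) for p in text.split("\n\n") if p.strip()]


def diff_risk_factors(prior_text: str, current_text: str) -> dict:
    """Return paragraphs added to and removed from the current year's Item 1A."""
    prior = split_paragraphs(prior_text)
    current = split_paragraphs(current_text)
    matcher = difflib.SequenceMatcher(a=prior, b=current, autojunk=False)
    added, removed = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed.extend(prior[i1:i2])
        if tag in ("replace", "insert"):
            added.extend(current[j1:j2])
    return {"added": added, "removed": removed}


# Hypothetical Item 1A excerpts for two consecutive years.
prior_year = "We depend on a single supplier.\n\nOur industry is highly competitive."
current_year = "Our industry is highly competitive.\n\nWe face cybersecurity threats."

changes = diff_risk_factors(prior_year, current_year)
print(changes)
```

Exact paragraph matching treats any reworded risk factor as one removal plus one addition. A fuzzier comparison, for example scoring paragraph pairs with SequenceMatcher.ratio(), would be needed to separate edits from genuinely new risks.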

Building Subsidiary and Entity-Structure Maps From Exhibit 21

Due diligence teams and corporate researchers parse EX-21.1 exhibit files to extract subsidiary names, jurisdictions of organization, and ownership relationships. The documentFormatFiles array in metadata.json identifies which files carry the EX-21.1 type code. Comparing Exhibit 21 across consecutive annual filings reveals entity additions, disposals, or jurisdictional restructurings that surface acquisition activity or tax-planning changes not highlighted in press releases.
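
The lookup step might be sketched as below. The documentFormatFiles array is named in the text above, but the per-entry keys "type" and "documentUrl", the sample values, and the function name find_exhibits are assumptions made for illustration; check them against an actual metadata.json before relying on them.

```python
def find_exhibits(metadata: dict, prefix: str = "EX-21") -> list[dict]:
    """Return documentFormatFiles entries whose type code starts with the prefix."""
    return [
        entry
        for entry in metadata.get("documentFormatFiles", [])
        if str(entry.get("type", "")).upper().startswith(prefix)
    ]


# Hypothetical metadata.json content; key names inside each entry are assumed.
metadata = {
    "formType": "10-K",
    "documentFormatFiles": [
        {"type": "10-K", "documentUrl": "example10k.htm"},
        {"type": "EX-21.1", "documentUrl": "ex21-1.htm"},
        {"type": "EX-31.1", "documentUrl": "ex31-1.htm"},
    ],
}

subsidiary_exhibits = find_exhibits(metadata)
print([entry["documentUrl"] for entry in subsidiary_exhibits])
```

Matching on a prefix rather than the exact string is a design choice that also catches exhibits labeled without a sub-number.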

Training NLP Models on Structured Financial Narrative

Machine learning teams use the dataset as a large-scale corpus of long-form financial text with a consistent internal structure repeated across thousands of companies and over thirty years. MD&A (Item 7), risk factors (Item 1A), and business descriptions (Item 1) provide labeled narrative sections suitable for sentiment classification, topic modeling, named-entity extraction, and disclosure-similarity scoring. The SGML document wrapper and inline XBRL tags in post-2019 filings require preprocessing, but their predictable structure simplifies section-level segmentation at scale.
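
Section-level segmentation could start from a heading heuristic like the one sketched here. It assumes the primary document has already been reduced to plain text with each "Item N." heading at the start of a line, and it keeps the longest body per item so that table-of-contents entries do not win. The function name split_items is hypothetical, and real filings will need additional handling for headings that the pattern misses.

```python
import re

# Matches headings such as "Item 1.", "ITEM 1A.", or "Item 7:" at the start of a line.
ITEM_RE = re.compile(r"^\s*item\s+(\d{1,2}[a-c]?)\s*[.:\-]", re.IGNORECASE | re.MULTILINE)


def split_items(text: str) -> dict:
    """Map item numbers (e.g. "1A") to the text from that heading to the next one."""
    matches = list(ITEM_RE.finditer(text))
    sections = {}
    for index, match in enumerate(matches):
        end = matches[index + 1].start() if index + 1 < len(matches) else len(text)
        key = match.group(1).upper()
        body = text[match.start():end].strip()
        # Table-of-contents entries repeat each heading; keep the longest body.
        if len(body) > len(sections.get(key, "")):
            sections[key] = body
    return sections


# Hypothetical plain-text filing with a table of contents followed by the sections.
sample_text = """Item 1. Business
Item 1A. Risk Factors
Item 7. MD&A

Item 1. Business
We make widgets.

Item 1A. Risk Factors
Demand for widgets may fall.

Item 7. Management's Discussion and Analysis
Revenue grew.
"""

sections = split_items(sample_text)
print(sorted(sections))
```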

Detecting Restatements and Accounting-Policy Shifts via 10-K/A Amendments

Forensic accountants and credit analysts filter records where formType is "10-K/A" to isolate amended annual reports. Comparing the amendment's restated financial statements and footnotes against the original 10-K for the same periodOfReport reveals the nature and magnitude of corrections — restatement of revenue, reclassification of expenses, or changes to contingent-liability estimates. Amendment frequency by registrant or industry serves as an input to audit-quality scoring and credit-risk models.
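
The filter-and-match step might look like the following sketch. The formType and periodOfReport fields come from metadata.json as described above; the flattened "cik", "filedAt", and "accessionNo" keys, the placeholder accession numbers, and the function name pair_amendments are assumptions for illustration.

```python
from collections import defaultdict


def pair_amendments(records: list[dict]) -> list[tuple]:
    """Pair each registrant-period's 10-K/A filings with the original 10-K, if present."""
    originals = {}
    amendments = defaultdict(list)
    for record in records:
        key = (record["cik"], record["periodOfReport"])
        if record["formType"] == "10-K/A":
            amendments[key].append(record)
        elif record["formType"] == "10-K":
            originals[key] = record
    # The original may be missing (None) if it falls outside the loaded containers.
    return [
        (originals.get(key), sorted(filings, key=lambda r: r["filedAt"]))
        for key, filings in amendments.items()
    ]


# Hypothetical flattened metadata records with placeholder accession numbers.
records = [
    {"accessionNo": "0000000000-24-000001", "cik": "1234", "formType": "10-K",
     "periodOfReport": "2023-12-31", "filedAt": "2024-02-20"},
    {"accessionNo": "0000000000-24-000002", "cik": "1234", "formType": "10-K/A",
     "periodOfReport": "2023-12-31", "filedAt": "2024-04-29"},
    {"accessionNo": "0000000000-24-000003", "cik": "5678", "formType": "10-K",
     "periodOfReport": "2023-12-31", "filedAt": "2024-03-01"},
]

pairs = pair_amendments(records)
for original, amended in pairs:
    print(original["accessionNo"], [a["accessionNo"] for a in amended])
```

Because an amendment is usually filed in a later month than its original, the two records often sit in different monthly containers, so both must be loaded before pairing.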

Extracting Audited Financial Statements and Footnotes for Multi-Year Ratio Analysis

Equity research and credit teams parse the Item 8 section of the primary 10-K document to obtain audited balance sheets, income statements, and cash flow statements along with their accompanying footnotes. For inline XBRL filings (2019 onward), ix:nonFraction tags embedded in the HTML provide machine-readable values directly. For earlier filings, HTML table extraction is required. The dataset's coverage back to 1993 supports construction of long time-series financial databases for DCF modeling, earnings-quality analysis, and covenant-compliance monitoring.
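
For the inline XBRL case, a rough extraction sketch follows. It reads the name, contextRef, scale, and sign attributes of each ix:nonFraction element and assumes the displayed number uses commas as thousands separators and a dot as the decimal mark; other display formats are skipped rather than guessed. The function name extract_facts is hypothetical, and nested ix:nonFraction elements are not handled.

```python
import re

NONFRACTION_RE = re.compile(
    r"<ix:nonFraction\b([^>]*)>(.*?)</ix:nonFraction>", re.DOTALL | re.IGNORECASE
)
ATTR_RE = re.compile(r'([\w:\-]+)\s*=\s*"([^"]*)"')


def extract_facts(html: str) -> list[dict]:
    """Return name, contextRef, and scaled numeric value for each ix:nonFraction tag."""
    facts = []
    for attr_text, inner in NONFRACTION_RE.findall(html):
        attrs = dict(ATTR_RE.findall(attr_text))
        displayed = re.sub(r"<[^>]+>", "", inner).strip()  # drop any nested markup
        try:
            value = float(displayed.replace(",", ""))
        except ValueError:
            continue  # unsupported display format, e.g. a dash placeholder
        value *= 10 ** int(attrs.get("scale", "0"))
        if attrs.get("sign") == "-":
            value = -value
        facts.append(
            {"name": attrs.get("name"), "contextRef": attrs.get("contextRef"), "value": value}
        )
    return facts


# Hypothetical table cells, for illustration only.
html = (
    '<td><ix:nonFraction name="us-gaap:Revenues" contextRef="FY2024" '
    'unitRef="usd" scale="6" decimals="-6">1,234</ix:nonFraction></td>'
    '<td>(<ix:nonFraction name="us-gaap:NetIncomeLoss" contextRef="FY2024" '
    'unitRef="usd" scale="3" sign="-">500</ix:nonFraction>)</td>'
)

facts = extract_facts(html)
print(facts)
```

Applying scale and sign is the step most often missed: the displayed "1,234" with scale 6 represents 1,234,000,000, and a negative amount shown in parentheses carries its sign only in the attribute.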

Monitoring Auditor Changes and Internal-Control Opinions

Governance analysts and assurance teams review Item 9 (auditor disagreements), Item 9A (management's report on internal control and the auditor's attestation), and the independent auditor's report within Item 8 to track auditor rotations, qualified opinions, material weaknesses, and critical audit matters. SOX Section 302 and 906 certifications (EX-31.x and EX-32.x exhibits) confirm officer-level attestation. Screening these sections across the full registrant population flags companies with control deficiencies or recent auditor changes for deeper review.

Dataset Access

Dataset Index JSON API: https://api.sec-api.io/datasets/form-10k-files.json

This endpoint returns metadata about the Form 10-K Files Dataset and a list of all available container files. The response includes the dataset name, description, last updated timestamp, earliest sample date, total records and total size, covered form types, container format, and content file types. It also includes the download URL for the full dataset archive and a list of individual containers with their size, record count, last updated timestamp, and download URL. No API key is required to access this endpoint.

Use this endpoint to monitor which containers have been updated in the most recent refresh run, so you can selectively download only the containers that changed on a given day rather than re-downloading the entire dataset.

Example
{
  "datasetId": "1f13365b-9ade-61de-8797-ad37148434da",
  "datasetDownloadUrl": "https://api.sec-api.io/datasets/form-10k-files.zip",
  "name": "Form 10-K Files Dataset",
  "updatedAt": "2026-04-17T02:53:43.651Z",
  "earliestSampleDate": "1993-11-01",
  "totalRecords": 1999462,
  "totalSize": 45701904818,
  "formTypes": ["10-K", "10-K/A"],
  "containerFormat": "ZIP",
  "fileTypes": ["TXT", "JSON", "HTML", "PDF", "XFD", "FRM"],
  "containers": [
    {
      "downloadUrl": "https://api.sec-api.io/datasets/form-10k-files/2026/2026-04.zip",
      "key": "2026/2026-04.zip",
      "size": 14523891,
      "records": 187,
      "updatedAt": "2026-04-17T02:53:43.651Z"
    }
  ]
}
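
Given a response of this shape, the selective-refresh workflow could be sketched in Python as follows. The field names match the example; the function names and the second sample container (with its timestamp) are hypothetical, and the cutoff would normally be the time of your last successful sync.

```python
import json
import urllib.request
from datetime import datetime

INDEX_URL = "https://api.sec-api.io/datasets/form-10k-files.json"


def parse_timestamp(value: str) -> datetime:
    # Older Python versions do not accept a trailing "Z" in fromisoformat().
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def fetch_index(url: str = INDEX_URL) -> dict:
    """Download the dataset index; no API key is needed for this endpoint."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def changed_containers(index: dict, since: str) -> list[dict]:
    """Return containers whose updatedAt is later than the given ISO timestamp."""
    cutoff = parse_timestamp(since)
    return [c for c in index.get("containers", []) if parse_timestamp(c["updatedAt"]) > cutoff]


# Offline sample so the filter can be tried without a network call.
sample_index = {
    "containers": [
        {"key": "2026/2026-04.zip", "updatedAt": "2026-04-17T02:53:43.651Z"},
        {"key": "2026/2026-03.zip", "updatedAt": "2026-03-31T02:50:00.000Z"},
    ]
}

stale = changed_containers(sample_index, "2026-04-16T00:00:00.000Z")
print([c["key"] for c in stale])
```

In live use, replace sample_index with fetch_index() and persist the newest updatedAt you have processed as the next cutoff.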

Download Entire Dataset: https://api.sec-api.io/datasets/form-10k-files.zip?token=YOUR_API_KEY

Downloads the complete Form 10-K Files Dataset as a single ZIP archive containing all containers. This endpoint requires an API key passed via the token query parameter.

Download Single Container: https://api.sec-api.io/datasets/form-10k-files/2026/2026-04.zip?token=YOUR_API_KEY

Downloads one individual container file, such as a monthly archive, instead of the full dataset. Container paths are listed in the containers array returned by the dataset index JSON API. This endpoint requires an API key passed via the token query parameter.
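
A minimal download sketch, assuming the URL pattern shown above (base path, container key, and the token query parameter). The function names are hypothetical, YOUR_API_KEY is a placeholder, and error handling such as retries or HTTP status checks is omitted.

```python
import shutil
import urllib.parse
import urllib.request

BASE_URL = "https://api.sec-api.io/datasets/form-10k-files"


def container_url(key: str, token: str) -> str:
    """Build the download URL for a container key such as "2026/2026-04.zip"."""
    return f"{BASE_URL}/{key}?{urllib.parse.urlencode({'token': token})}"


def download_container(key: str, token: str, dest_path: str) -> str:
    """Stream one container to disk without loading it fully into memory."""
    with urllib.request.urlopen(container_url(key, token), timeout=60) as response:
        with open(dest_path, "wb") as out:
            shutil.copyfileobj(response, out)
    return dest_path


print(container_url("2026/2026-04.zip", "YOUR_API_KEY"))
```

Streaming with shutil.copyfileobj is a deliberate choice here, since the largest monthly containers approach one gigabyte.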

Frequently Asked Questions

What forms does the Form 10-K Files Dataset cover?

The dataset covers Form 10-K (annual reports) and Form 10-K/A (amendments to annual reports) filed with SEC EDGAR by domestic registrants under the Securities Exchange Act of 1934.

What does one record in this dataset represent?

A single record is the complete set of files from one EDGAR submission — a metadata.json index file plus all non-image document files (the primary annual report and exhibits) — identified by a unique accession number.

What time period does the dataset cover?

The Form 10-K Files Dataset covers filings from November 1993, when EDGAR electronic filing began, to the present. All domestic registrants were required to file electronically by 1996.

Who is required to file Form 10-K?

Domestic registrants with an active reporting obligation under Section 13 or Section 15(d) of the Securities Exchange Act of 1934, including operating companies, REITs, limited partnerships, SPACs, and business development companies. Foreign private issuers file on Form 20-F instead.

What is the filing deadline for Form 10-K?

The deadline depends on filer category: 60 days after fiscal year-end for large accelerated filers (public float >= $700 million), 75 days for accelerated filers ($75 million to < $700 million), and 90 days for non-accelerated filers.

How does this dataset differ from XBRL financial datasets?

XBRL datasets provide pre-extracted, taxonomy-tagged numeric facts (revenue, net income, total assets) in tabular form. The Form 10-K Files Dataset preserves the complete filing documents — full narrative content, all exhibits, and original formatting — supporting analysis that requires text, exhibits, or document-level context beyond structured numeric values.

What file format is the dataset distributed in?

The dataset is distributed as ZIP containers organized by month. Individual filing records contain files in HTML, TXT, PDF, JSON, XFD, and FRM formats, each wrapped in EDGAR's SGML document envelope.