The Form 10-K Files Dataset contains the complete filing documents for every Form 10-K and Form 10-K/A annual report submitted to SEC EDGAR by domestic registrants from November 1993 to the present. Each record corresponds to a single EDGAR submission identified by its accession number and includes a metadata.json index file plus all non-image document files from the filing — the primary annual report, exhibits such as officer certifications, subsidiary lists, and material contracts. The dataset covers annual reports required under Section 13 or Section 15(d) of the Securities Exchange Act of 1934, filed by operating companies, REITs, limited partnerships, SPACs, business development companies, and other domestic issuers. Records are distributed in ZIP containers organized by month, with files in HTML, TXT, PDF, JSON, and other text-based formats.
The Form 10-K Files Dataset is built from Form 10-K and Form 10-K/A filings as accepted by EDGAR. Form 10-K is the annual report providing a comprehensive financial and operational overview of the reporting company, including audited financial statements prepared under U.S. GAAP, narrative disclosures required by Regulation S-K, and financial statement presentation governed by Regulation S-X. Form 10-K/A is an amendment to a previously filed 10-K, typically correcting or supplementing specific items; it carries the same record structure but its formType field reads "10-K/A".
The dataset spans from November 1993 — the beginning of EDGAR electronic filing — to the present, covering the entire population of domestic registrants with active reporting obligations during that period. All domestic registrants were required to file electronically by 1996; pre-EDGAR filings exist only in the SEC's paper and microfiche archives and are not included. The dataset is distributed as ZIP containers, with individual filing records containing files in HTML, TXT, PDF, JSON, XFD, and FRM formats.
A single record in the Form 10-K Files Dataset is the complete set of files from one EDGAR submission of a Form 10-K or Form 10-K/A filing. The record is a folder named by the filing's accession number (zero-padded, dashes removed, 18 digits). It contains a metadata.json file and one or more document files (.htm, .txt, .pdf, or other text-based formats) that together reproduce the full non-image content of the original SEC submission. Each record maps one-to-one to a distinct EDGAR accession number.
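The folder-naming rule can be sketched as a pair of small helpers. This is a minimal illustration that assumes the folder name is exactly the dash-delimited accession number with the dashes removed, as described above; the function names are hypothetical and not part of any published client.

```python
import re


def accession_to_folder(accession_no: str) -> str:
    """Turn a dash-delimited accession number into the 18-digit folder name."""
    if not re.fullmatch(r"\d{10}-\d{2}-\d{6}", accession_no):
        raise ValueError(f"unexpected accession number: {accession_no!r}")
    return accession_no.replace("-", "")


def folder_to_accession(folder: str) -> str:
    """Rebuild the dash-delimited accession number from an 18-digit folder name."""
    if not re.fullmatch(r"\d{18}", folder):
        raise ValueError(f"unexpected folder name: {folder!r}")
    return f"{folder[:10]}-{folder[10:12]}-{folder[12:]}"
```

Validating the shape up front makes malformed identifiers fail loudly instead of silently producing a folder path that does not exist.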
Each record folder contains two categories of files:
1. metadata.json — a JSON object capturing filing-level attributes and serving as the structural index for all documents in the submission.
2. One or more document files — the primary annual report and any attached exhibits, each wrapped in EDGAR's SGML document envelope. Most are .htm files, but older filings may include .txt files and some submissions contain .pdf documents. Image files (GRAPHIC type) referenced in the metadata are excluded from the dataset; all other document files are present.
The number of document files per record varies substantially. A minimal filing may contain only metadata.json and a single annual report file. A typical large-accelerated-filer submission includes the primary 10-K document plus six to ten exhibit files covering certifications, subsidiary lists, auditor consents, compensation clawback policies, insider trading policies, and material contract descriptions.
The metadata file is the authoritative index to the record. Its top-level scalar fields are:
formType: "10-K" or "10-K/A".
accessionNo: The SEC accession number in dash-delimited format (e.g., "0000320193-25-000079"), serving as the primary identifier linking the record to EDGAR.
description: A human-readable label for the form type.
filedAt: ISO 8601 timestamp with timezone offset indicating when EDGAR accepted the filing.
periodOfReport: The fiscal year-end date covered, in YYYY-MM-DD format.
id: A unique hexadecimal identifier assigned to the filing.
linkToFilingDetails: URL to the primary filing document on SEC.gov.
linkToTxt: URL to the complete EDGAR submission text file (not included in the record).
linkToHtml: URL to the EDGAR filing index page.
linkToXbrl: URL to an XBRL viewer; often an empty string for older filings or filings without interactive data.
Four array fields provide deeper structural information:
documentFormatFiles enumerates every document in the submission. Each entry carries sequence (ordering position), size (in bytes, as a string), documentUrl (direct SEC.gov link), description (human-readable label such as "ANNUAL REPORT" or "CERTIFICATION"), and type (the document type code: "10-K", "EX-31.1", "EX-21.1", "GRAPHIC", etc.). This array is the definitive manifest of the submission and includes image files that are excluded from the dataset on disk, enabling consumers to identify missing content and retrieve it from SEC.gov.
dataFiles lists XBRL taxonomy and instance files associated with the filing. For filings with structured data, this typically includes EX-101.SCH (schema), EX-101.CAL (calculation linkbase), EX-101.DEF (definition linkbase), EX-101.LAB (label linkbase), EX-101.PRE (presentation linkbase), and extracted XML instance documents. For filings without XBRL — common before the structured-data mandate and among certain smaller filers — this array is empty.
entities contains one or more objects identifying the filing registrant. Each entity object includes companyName, cik (Central Index Key), tickers (array of stock ticker symbols), sic (SIC code with industry description), stateOfIncorporation, fiscalYearEnd (MMDD format), act (typically "34" for the Exchange Act), fileNo, irsNo, and type.
seriesAndClassesContractsInformation is present but virtually always empty for 10-K filings, as it pertains to investment company series/class structures.
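As a rough sketch of how the manifest can be used, the helper below splits documentFormatFiles into entries expected on disk and GRAPHIC entries that must be fetched from SEC.gov. It assumes the on-disk filename equals the last path segment of documentUrl (or of the `doc` query parameter when the URL points at the inline XBRL viewer); that mapping is an assumption, not something the dataset documentation states. The function names are hypothetical.

```python
import posixpath
from urllib.parse import parse_qs, urlparse


def filename_from_url(document_url: str) -> str:
    """Best-effort filename: last path segment, or the 'doc' parameter of a viewer URL."""
    parsed = urlparse(document_url)
    # Assumption: viewer-style URLs carry the real path in a 'doc' query parameter.
    path = parse_qs(parsed.query).get("doc", [parsed.path])[0]
    return posixpath.basename(path)


def split_manifest(metadata: dict):
    """Return (expected_on_disk, excluded_graphics) from a parsed metadata.json."""
    on_disk, excluded = [], []
    for doc in metadata.get("documentFormatFiles", []):
        url = doc.get("documentUrl", "")
        entry = {
            "filename": filename_from_url(url),
            "type": (doc.get("type") or "").strip(),
            "url": url,
        }
        # GRAPHIC documents are listed in the manifest but not shipped in the record.
        target = excluded if entry["type"] == "GRAPHIC" else on_disk
        target.append(entry)
    return on_disk, excluded
```

In practice the `on_disk` list should still be checked against the actual folder contents, since entries such as the complete submission text file may appear in a manifest without a matching file.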
Each document file in the record is wrapped in EDGAR's SGML document envelope. The wrapper opens with <DOCUMENT> and contains structured header fields before the document body:
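The header fields are <TYPE> (the document type code, e.g., 10-K, EX-31.1, EX-21.1, EX-23.1), <SEQUENCE>, <FILENAME>, and <DESCRIPTION>, followed by the <TEXT> block.
The content inside the <TEXT> block is the substantive document. For the primary 10-K, this is the complete annual report. For exhibits, it contains the exhibit text.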
File naming conventions vary by filer and filing agent. Some registrants use ticker-based names (aapl-20250927.htm), others use form-type prefixes (form10-k.htm), and filing agents frequently impose their own patterns. The documentFormatFiles array in metadata.json provides the reliable mapping between filenames, types, and descriptions.
The primary annual report document is identified by type value "10-K" and typically assigned sequence "1". Its internal structure follows the form's required organization:
Cover page. The filing opens with registrant identification: legal name, state of incorporation, IRS employer identification number, principal office address, telephone number, Commission file number, and stock exchange listing information. The cover page also discloses the fiscal year-end date, filer status (large accelerated filer, accelerated filer, non-accelerated filer, smaller reporting company, or emerging growth company), and whether the registrant has filed all required reports during the preceding 12 months. In inline XBRL filings (2019 onward), cover-page data points are individually tagged with ix: elements.
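Part I contains seven items: Item 1 (Business), Item 1A (Risk Factors), Item 1B (Unresolved Staff Comments), Item 1C (Cybersecurity), Item 2 (Properties), Item 3 (Legal Proceedings), and Item 4 (Mine Safety Disclosures).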
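Part II contains nine items focused on financial performance and capital markets: Item 5 (Market for Registrant's Common Equity, Related Stockholder Matters and Issuer Purchases of Equity Securities), Item 6 (Reserved), Item 7 (Management's Discussion and Analysis of Financial Condition and Results of Operations), Item 7A (Quantitative and Qualitative Disclosures About Market Risk), Item 8 (Financial Statements and Supplementary Data), Item 9 (Changes in and Disagreements with Accountants on Accounting and Financial Disclosure), Item 9A (Controls and Procedures), Item 9B (Other Information), and Item 9C (Disclosure Regarding Foreign Jurisdictions that Prevent Inspections). In inline XBRL filings, financial values in these items are tagged with ix:nonFraction and ix:nonNumeric elements.
Part III contains five items: Item 10 (Directors, Executive Officers and Corporate Governance), Item 11 (Executive Compensation), Item 12 (Security Ownership of Certain Beneficial Owners and Management and Related Stockholder Matters), Item 13 (Certain Relationships and Related Transactions, and Director Independence), and Item 14 (Principal Accountant Fees and Services).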
Registrants frequently incorporate Part III items by reference from the proxy statement (DEF 14A). When incorporated by reference, these items contain only a brief cross-reference statement and no substantive content. When included directly, they contain full compensation tables, beneficial ownership tables, and governance disclosures. The proxy statement itself is not part of the 10-K record.
Part IV contains two items: Item 15 (Exhibits and Financial Statement Schedules) and Item 16 (Form 10-K Summary).
Signatures. The filing concludes with a signature block in which the principal executive officer, principal financial officer, and a majority of the board of directors sign the report, authenticating the filing and triggering personal liability under Section 18 of the Securities Exchange Act of 1934 and Section 906 of the Sarbanes-Oxley Act. The signature block also includes the signing date and each signatory's title.
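The document files beyond the primary 10-K are exhibits. The most common exhibit types are officer certifications under Sarbanes-Oxley Sections 302 and 906 (EX-31.x and EX-32.x), the subsidiary list (EX-21), auditor consents (EX-23), and material contracts (EX-10.x).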
Not all exhibits listed in Item 15 are filed as separate documents within the submission. Exhibits incorporated by reference point to documents in prior filings, and no corresponding file appears in the record. The documentFormatFiles array is the definitive list of files actually present in the submission; the Item 15 exhibit index lists both filed and incorporated exhibits.
Each record includes the metadata.json file with all filing-level attributes and the complete document manifest, the primary 10-K annual report document, and all non-image document files submitted with the filing (exhibits, text files, and any other non-GRAPHIC documents). This covers the full narrative, tabular, and financial-statement content of the annual report and its exhibits.
Image files. Documents with type GRAPHIC (typically JPG or PNG files used for logos, charts, performance graphs, or signature images) are listed in documentFormatFiles but not present on disk. Their documentUrl values can be used to retrieve them from SEC.gov. Some older filings that rely heavily on scanned images may have sparse or incomplete HTML content without these files.
XBRL taxonomy and instance files. Structured data files listed in dataFiles (EX-101.SCH, EX-101.CAL, EX-101.DEF, EX-101.LAB, EX-101.PRE, and XML instance documents) are referenced in the metadata but not included as separate files in the record. For inline XBRL filings, the XBRL tags are embedded directly in the primary document and are therefore present within the record.
Complete submission text file. The monolithic EDGAR submission text file (all documents concatenated with SGML wrappers) is referenced via linkToTxt but not included.
SGML wrapper parsing. To extract usable content from each document file, consumers must strip the SGML envelope — the <DOCUMENT>, <TYPE>, <SEQUENCE>, <FILENAME>, <DESCRIPTION>, and <TEXT> tags. Substantive content begins after <TEXT> and ends before </TEXT>.
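The stripping step described above can be sketched with a few lines of standard-library Python. This is a minimal illustration that assumes each header tag sits on its own line with its value running to the end of that line, and that the file holds a single <DOCUMENT> envelope; the function name is hypothetical.

```python
import re

_HEADER_TAGS = ("TYPE", "SEQUENCE", "FILENAME", "DESCRIPTION")


def parse_sgml_document(raw: str) -> dict:
    """Split one SGML-wrapped document file into header fields and body text."""
    head, sep, rest = raw.partition("<TEXT>")
    if not sep:
        raise ValueError("no <TEXT> block found")
    # Keep everything up to the last </TEXT>; tolerate a missing closing tag.
    body = rest.rsplit("</TEXT>", 1)[0]
    fields = {}
    for tag in _HEADER_TAGS:
        # Header tags are unclosed; the value runs to the end of the line.
        match = re.search(rf"<{tag}>([^\r\n<]*)", head)
        fields[tag.lower()] = match.group(1).strip() if match else None
    fields["text"] = body.strip()
    return fields
```

Missing header fields come back as None rather than raising, since not every document carries a <DESCRIPTION> line.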
Inline XBRL tags in HTML. In post-2019 filings, the primary document's HTML contains XBRL namespace elements (<ix:nonNumeric>, <ix:nonFraction>, <ix:header>, <ix:hidden>, etc.) interleaved with standard HTML. Text extraction pipelines must account for these tags to avoid duplicating or mangling content. The <ix:header> block, typically placed near the top of the document, contains context and unit definitions that are not visible content.
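One possible way to handle this during text extraction is to drop everything inside <ix:header> and treat the remaining ix: elements as transparent wrappers around visible text. The sketch below uses only the standard-library html.parser (which lowercases tag names); the class and function names, and the set of tags treated as block-level, are illustrative assumptions.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collect visible text, skipping ix:header, script, and style content."""

    SKIP = {"ix:header", "script", "style"}
    BLOCK = {"p", "div", "br", "tr", "td", "th", "li", "table"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag in self.BLOCK and not self._skip_depth:
            self._chunks.append(" ")  # keep words in adjacent blocks apart

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self) -> str:
        # Join without separators so inline ix: tags do not split tokens.
        return " ".join("".join(self._chunks).split())


def visible_text(html: str) -> str:
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Joining chunks without a separator keeps a tagged number attached to its currency symbol, which matters for downstream tokenization.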
Incorporation by reference. When Part III items are deferred to the proxy statement, the 10-K record contains only a cross-reference statement for those items. The actual compensation, governance, and ownership data must be obtained from the registrant's separate DEF 14A filing, which is not part of this record.
Amendments (10-K/A). An amendment record typically restates only the amended items and exhibits, not the entire annual report. The formType field distinguishes amendments from original filings. The amendment's cover page usually identifies which items are being amended.
Exhibit availability vs. exhibit index. The exhibit index in Item 15 lists both exhibits filed with the current submission and exhibits incorporated by reference from prior filings. Only the former have corresponding files in the record. The documentFormatFiles array reflects what was actually submitted; the Item 15 index is the broader catalog.
Format evolution. Record structure varies significantly across the dataset's time span. Filings from the mid-1990s may consist of a single plain-text or minimal-HTML file with rudimentary table formatting. Filings from the 2009–2018 period typically contain richly formatted HTML with separate XBRL taxonomy files referenced in dataFiles. Post-2019 filings contain inline-XBRL-tagged HTML where the primary document carries both human-readable content and machine-readable structured data in a single file. The SGML document wrapper persists across all eras.
Filer variation. Record complexity ranges enormously. A large-accelerated filer may produce a record with seven or more exhibit files, dense inline XBRL tagging, and hundreds of pages of financial content. A smaller filer may produce a record with only the metadata and a single document file containing the entire 10-K with minimal formatting.
A Form 10-K record represents an annual report (or amendment) filed on EDGAR by a domestic registrant with an active reporting obligation under the Securities Exchange Act of 1934. The filer is always the issuer itself — the company, trust, partnership, or other entity whose securities are registered. The trigger is the close of the registrant's fiscal year.
Form 10-K filers hold a reporting obligation under one of two Exchange Act provisions:
Section 13 filers have a class of securities registered under Section 12, either through exchange listing (Section 12(b)) or, historically, through exceeding asset and holder-of-record thresholds (Section 12(g)). The reporting obligation lasts as long as the Section 12 registration is effective.
Section 15(d) filers have an effective Securities Act registration statement (e.g., Form S-1, Form S-11) but no Section 12 registration. The obligation runs for the fiscal year in which the registration statement became effective and each subsequent year, unless suspended because the issuer has fewer than 300 holders of record (1,200 for certain issuers) and is not exchange-listed.
Entity types in the filing population include operating companies (corporations, LLCs), REITs, limited partnerships and MLPs, SPACs, shell and development-stage companies, business development companies, and certain asset-backed issuers whose structure requires Form 10-K rather than specialized ABS forms.
The filing obligation is codified in Rules 13a-1 and 15d-1. Each requires the issuer to file an annual report within a set number of days after its fiscal year end. The deadline depends on the issuer's filer category, determined under Rule 12b-2 based on public float as of the last business day of the most recently completed second fiscal quarter:
| Filer category | Public float | Deadline |
|---|---|---|
| Large accelerated filer | >= $700 million | 60 days |
| Accelerated filer | $75 million to < $700 million | 75 days |
| Non-accelerated filer | < $75 million or not calculable | 90 days |
Filer category can change year to year, with transition rules to prevent oscillation.
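The table can be turned into a small calculation. The sketch below is a simplification: it classifies on public float alone and adds calendar days to the fiscal year end, ignoring the transition rules, the revenue test that can affect accelerated-filer status, and the roll-forward that applies when a due date falls on a weekend or holiday. Function and constant names are hypothetical.

```python
from datetime import date, timedelta
from typing import Optional

DEADLINE_DAYS = {
    "large accelerated filer": 60,
    "accelerated filer": 75,
    "non-accelerated filer": 90,
}


def filer_category(public_float_usd: Optional[float]) -> str:
    """Classify by public float only; None means the float is not calculable."""
    if public_float_usd is None or public_float_usd < 75_000_000:
        return "non-accelerated filer"
    if public_float_usd < 700_000_000:
        return "accelerated filer"
    return "large accelerated filer"


def nominal_due_date(fiscal_year_end: date, public_float_usd: Optional[float]) -> date:
    """Fiscal year end plus the category's day count (no weekend/holiday adjustment)."""
    days = DEADLINE_DAYS[filer_category(public_float_usd)]
    return fiscal_year_end + timedelta(days=days)
```

For a December 31, 2024 year end, the nominal dates under these assumptions are March 1, March 16, and March 31, 2025 for the three categories.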
Smaller reporting companies (public float < $250 million, or revenue < $100 million when float is < $700 million or not calculable) receive scaled disclosure relief under Regulation S-K. SRC status and accelerated-filer status are determined independently; an SRC that is an accelerated filer still faces the 75-day deadline.
Emerging growth companies (recent IPO issuers with annual gross revenue below the inflation-adjusted threshold, currently approximately $1.235 billion) are exempt from certain requirements, notably the SOX Section 404(b) auditor attestation on internal controls. EGC status expires at the earliest of: five years post-IPO, crossing the revenue threshold, issuing more than $1 billion in non-convertible debt over three years, or becoming a large accelerated filer.
Form 10-K is filed once per fiscal year. The fiscal year end is chosen by the registrant and need not be December 31. Deadlines run from the issuer's own fiscal year end, so filings arrive throughout the year, with heavy concentration after December 31 and other common year ends.
A registrant must file for every fiscal year during which its reporting obligation is active, regardless of whether it has revenue or operations.
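Form 10-K/A amends a previously filed 10-K. Each amendment is a separate EDGAR record with its own accession number, referencing the same fiscal-year-end period as the original but carrying a later filing date. Common reasons for amendment include restating financial statements, correcting errors, and supplementing items omitted from the original filing, such as Part III information initially incorporated by reference from the proxy statement.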
Form 10-Q (Quarterly Reports) — The nearest relative. Both are periodic Exchange Act reports filed by the same domestic registrants under Regulation S-K and S-X. The 10-Q covers a single fiscal quarter with unaudited interim financials reviewed (not audited) by the external auditor. The 10-K covers the full fiscal year with audited financial statements and an independent auditor's report. The 10-K is substantially more expansive: it requires the full business description (Item 1), complete risk factors (Item 1A), properties, legal proceedings, executive compensation, and exhibits rarely found on a 10-Q such as Exhibit 21 (subsidiary list) and auditor consent (Exhibit 23). The 10-Q provides interim updates between annual filings; the 10-K is the authoritative annual baseline. They complement each other but are not substitutes.
Form 20-F (Foreign Private Issuer Annual Reports) — The annual report equivalent for foreign private issuers (FPIs) listed on U.S. exchanges. Like the 10-K, it delivers audited financial statements, business descriptions, risk factors, and management discussion on an annual cycle. Key differences: 20-F financials may use IFRS as issued by the IASB without reconciliation, or local GAAP with a U.S. GAAP reconciliation, rather than requiring U.S. GAAP outright. The filing deadline is four months after fiscal year-end versus 60 to 90 days for the 10-K. The item numbering and disclosure structure differ, with FPI-specific content such as exchange controls and home-jurisdiction taxation. FPIs are also exempt from proxy rules and Section 16 short-swing profit reporting. The Form 10-K Files Dataset covers only domestic registrants; a cross-border annual-report analysis requires the 20-F dataset as well.
XBRL Financial Datasets (Structured Numeric Extracts) — The most important structural alternative for financial data extraction. XBRL datasets draw their source values from the same 10-K (and 10-Q) filings, but deliver pre-extracted, taxonomy-tagged numeric facts (revenue, net income, total assets, etc.) in tabular form ready for cross-company quantitative comparison. The Form 10-K Files Dataset preserves the complete filing documents: full narrative content (MD&A, risk factors, business descriptions, legal proceedings), all attached exhibits (material contracts, subsidiary lists, certifications), and the original document formatting. XBRL tagging does not capture narrative text or exhibit content. Use XBRL datasets for comparable financial line items at scale; use the Form 10-K Files Dataset for narrative analysis, exhibit content, NLP tasks, or original-presentation access.
Form 8-K (Current Reports) — The event-driven counterpart filed by the same domestic registrants. An 8-K is triggered by specific material events (changes in control, material agreements, officer departures) and must be filed within four business days. Some content overlaps: a material contract first disclosed on an 8-K often reappears as an exhibit on the next 10-K. But the 8-K provides real-time event notification on an irregular schedule, while the 10-K provides the comprehensive annual picture on a fixed calendar. The filing triggers, cadence, and content scope are fundamentally different.
| Dimension | Form 10-K Files Dataset | Nearest alternatives |
|---|---|---|
| Periodicity | Annual (fiscal year-end) | 10-Q: quarterly; 8-K: event-driven; 20-F: annual |
| Filer population | Domestic registrants only | 20-F: foreign private issuers; 40-F: qualifying Canadian issuers |
| Financial statements | Audited, U.S. GAAP | 10-Q: unaudited interim; 20-F: may use IFRS |
| Content scope | Full documents, narratives, and exhibits | XBRL: tagged numeric values only |
| Coverage period | November 1993 to present (includes 10-K/A amendments) | Varies by dataset |
Form 40-F (Canadian cross-listed issuers under the MJDS) serves a similar annual-report function but covers a small, distinct filer population using Canadian disclosure standards. It is a niche complement, not a substitute.
Form 10-KSB was the simplified annual report for small business issuers, phased out in 2008. After its elimination, all domestic filers use the standard 10-K with scaled-disclosure accommodations for smaller reporting companies. The Form 10-K Files Dataset does not include 10-KSB filings; pre-2009 research on small companies must account for this gap separately.
Annual Reports to Shareholders (ARS) are corporate marketing documents with no prescribed format, no mandated content, and no systematic EDGAR availability. The 10-K is the legally mandated, regulation-compliant annual disclosure; the ARS is not.
The Form 10-K Files Dataset is the document-level annual report archive for domestic SEC registrants, covering the full filing as submitted: narrative disclosures, exhibits, metadata, and inline XBRL where applicable. It is distinct from quarterly and event-driven datasets in periodicity, from foreign-issuer annual forms in filer population and accounting standards, and from XBRL extracts in preserving complete document content rather than only tagged numeric values. For any work requiring access to the actual annual report documents as filed — including exhibits, narrative sections, and original formatting — this is the primary source.
The Form 10-K Files Dataset supports professionals who need full-text annual reports with historical depth back to 1993 and broad registrant coverage.
Sell-side and buy-side analysts anchor company-level valuation on the audited financial statements, footnotes, and MD&A. Financial statements feed ratio analysis, earnings-quality checks, and DCF models. MD&A reveals management's view of operating trends and capital allocation. Analysts compare MD&A language year over year to detect tone shifts, new risks, or guidance changes, and use the dataset's longitudinal coverage to build multi-year financials and benchmark peers.
Credit teams at rating agencies, banks, and fixed-income managers assess debt-service capacity using debt schedules, lease and pension footnotes, off-balance-sheet arrangements, and covenant disclosures. The liquidity discussion in MD&A and risk factors (Item 1A) surface concentration, regulatory, and litigation risks relevant to repayment. The dataset's inclusion of 10-K/A amendments matters because restated financials can signal control weaknesses that affect credit quality.
Forensic accountants and assurance teams use footnotes (revenue recognition, related-party transactions, contingent liabilities, fair value), the auditor's report, and SOX Section 302/906 certifications to spot anomalies and benchmark accounting policy choices. Comparing footnote disclosures across years and peers reveals unusual estimate changes or aggressive recognition. 10-K/A amendments are especially valuable since they often reflect restatements or error corrections warranting deeper investigation.
Securities attorneys benchmark risk factors (Item 1A), legal proceedings (Item 3), related-party disclosures (Item 13), and material contract exhibits across registrants to draft or review client filings. In litigation, historical 10-K filings establish what a company disclosed or omitted at specific points in time. Coverage back to 1993 and inclusion of amendments directly support litigation timelines.
Due diligence teams at investment banks, private equity firms, and advisory practices review the target's business description, risk factors, legal proceedings, financial statements, and material contract exhibits. The subsidiary list (Exhibit 21) is critical for mapping entity structure and jurisdictional exposure. Comparing multiple years of filings surfaces changes in segment reporting, revenue mix, or risk disclosure that may not appear in management presentations.
Finance and IR professionals at public companies benchmark their own disclosures against peers. They review how comparable registrants present segment results, structure risk factors, and frame MD&A narratives around shared industry headwinds. Executive compensation disclosures (Item 11 or the incorporated proxy) inform pay-design benchmarking and governance discussions.
Governance analysts at proxy advisory firms and institutional stewardship teams review Items 10 through 14 for board composition, related-party transactions, audit committee designations, and code-of-ethics references. SOX certifications and the auditor's internal-control report provide additional governance signals. This information feeds company scoring, voting recommendations, and engagement reports.
Data engineering teams parse the HTML filing documents and the inline XBRL tags embedded in them to extract financial line items, segment data, risk-factor text, and metadata into structured databases. Quantitative researchers use the resulting data for factor construction, cross-sectional screening, and backtesting. The dataset's coverage from 1993 provides the long time series backtesting requires.
ML teams use Form 10-K filings as a large-scale financial text corpus. MD&A, risk factors, and business descriptions offer long-form narrative with consistent structure across thousands of companies and decades. Common tasks include sentiment analysis, topic classification, named-entity extraction, disclosure similarity, and change detection. Teams building retrieval-augmented generation systems over SEC content use 10-K filings as a core retrieval corpus given the breadth of topics each filing covers.
Accounting and finance researchers study disclosure behavior, reporting quality, market reactions, and regulatory impact across the dataset's multi-decade span, which covers Sarbanes-Oxley adoption, segment-reporting standard changes, and the inline XBRL transition. Full filing text supports readability, tone, and boilerplate studies; financial statements support archival work on earnings quality and accruals. 10-K/A amendments enable research on restatement frequency and consequences.
The Form 10-K Files Dataset supports workflows that require the full text of annual reports — narratives, exhibits, and metadata — rather than pre-extracted financial data points.
Analysts and disclosure counsel extract Item 1A risk factors from the primary 10-K document across multiple filing years for the same registrant, or across peer companies in the same SIC code, to detect newly added risks, removed disclosures, or shifting language emphasis. The periodOfReport and entities.sic fields in metadata.json enable year-over-year and cross-industry filtering. This supports compliance benchmarking, competitive intelligence, and early identification of emerging sector-wide risks such as cybersecurity or supply-chain concentration.
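A filter of this kind might look like the sketch below, which groups accession numbers by report year for one SIC code across a list of parsed metadata.json objects. It assumes the sic field begins with the numeric code followed by the industry description; the exact string layout is an assumption, and the function names are hypothetical.

```python
import re
from collections import defaultdict


def sic_code(entity: dict):
    """Extract the leading numeric SIC code from an entity's sic field, if any."""
    match = re.match(r"\s*(\d{3,4})", entity.get("sic") or "")
    return match.group(1) if match else None


def group_by_report_year(records: list, sic: str) -> dict:
    """Map report year -> accession numbers for filings whose registrant has this SIC code."""
    by_year = defaultdict(list)
    for meta in records:
        if not any(sic_code(e) == sic for e in meta.get("entities", [])):
            continue
        period = meta.get("periodOfReport") or ""
        if len(period) >= 4:
            by_year[period[:4]].append(meta["accessionNo"])
    return dict(by_year)
```

The resulting year buckets give the set of primary documents from which Item 1A text can then be extracted and compared.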
Due diligence teams and corporate researchers parse EX-21.1 exhibit files to extract subsidiary names, jurisdictions of organization, and ownership relationships. The documentFormatFiles array in metadata.json identifies which files carry the EX-21.1 type code. Comparing Exhibit 21 across consecutive annual filings reveals entity additions, disposals, or jurisdictional restructurings that surface acquisition activity or tax-planning changes not highlighted in press releases.
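The lookup and year-over-year comparison can be sketched as follows. Parsing the exhibit body into subsidiary names is filer-specific and is left out; the diff helper assumes names have already been extracted. Both function names are hypothetical.

```python
def find_exhibits(metadata: dict, exhibit: str = "EX-21") -> list:
    """Return manifest entries typed exactly `exhibit` or `exhibit.N` (e.g. EX-21.1)."""
    matches = []
    for doc in metadata.get("documentFormatFiles", []):
        doc_type = (doc.get("type") or "").strip().upper()
        if doc_type == exhibit or doc_type.startswith(exhibit + "."):
            matches.append(doc)
    return matches


def diff_subsidiaries(previous: set, current: set):
    """Return (added, removed) subsidiary names between two annual filings."""
    return sorted(current - previous), sorted(previous - current)
```

Matching on the exact code or a dotted suffix avoids accidentally picking up unrelated exhibits whose type merely shares a prefix.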
Machine learning teams use the dataset as a large-scale corpus of long-form financial text with a consistent internal structure repeated across thousands of companies and over thirty years. MD&A (Item 7), risk factors (Item 1A), and business descriptions (Item 1) provide labeled narrative sections suitable for sentiment classification, topic modeling, named-entity extraction, and disclosure-similarity scoring. The SGML document wrapper and inline XBRL tags in post-2019 filings require preprocessing, but their predictable structure simplifies section-level segmentation at scale.
Forensic accountants and credit analysts filter records where formType is "10-K/A" to isolate amended annual reports. Comparing the amendment's restated financial statements and footnotes against the original 10-K for the same periodOfReport reveals the nature and magnitude of corrections — restatement of revenue, reclassification of expenses, or changes to contingent-liability estimates. Amendment frequency by registrant or industry serves as an input to audit-quality scoring and credit-risk models.
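A simple pairing of amendments with their originals might look like this. The sketch keys on the first entity's CIK plus periodOfReport, which is an assumption: filings with several co-registrants, or several originals for the same period, would need a more careful rule. Function names are hypothetical.

```python
def _registrant_key(meta: dict):
    """(CIK without leading zeros, periodOfReport) for the first listed entity."""
    entities = meta.get("entities") or [{}]
    cik = str(entities[0].get("cik", "")).lstrip("0")
    return cik, meta.get("periodOfReport")


def pair_amendments(records: list) -> list:
    """Return (amendment accessionNo, original accessionNo or None) tuples."""
    originals = {}
    for meta in records:
        if meta.get("formType") == "10-K":
            # Keep the first original seen for each registrant and period.
            originals.setdefault(_registrant_key(meta), meta["accessionNo"])
    return [
        (meta["accessionNo"], originals.get(_registrant_key(meta)))
        for meta in records
        if meta.get("formType") == "10-K/A"
    ]
```

An amendment paired with None indicates that the original 10-K for that period was not among the records supplied.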
Equity research and credit teams parse the Item 8 section of the primary 10-K document to obtain audited balance sheets, income statements, and cash flow statements along with their accompanying footnotes. For inline XBRL filings (2019 onward), ix:nonFraction tags embedded in the HTML provide machine-readable values directly. For earlier filings, HTML table extraction is required. The dataset's coverage back to 1993 supports construction of long time-series financial databases for DCF modeling, earnings-quality analysis, and covenant-compliance monitoring.
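For the inline XBRL case, a minimal extractor could collect each ix:nonFraction element with its attributes and a scaled numeric value. The sketch assumes comma-grouped, dot-decimal display values and applies the scale and sign attributes; other display formats (for example dashes used for zero) are returned with a value of None rather than guessed. It does not handle nested elements, and the class and function names are hypothetical.

```python
from decimal import Decimal, InvalidOperation
from html.parser import HTMLParser


class NonFractionCollector(HTMLParser):
    """Collect ix:nonFraction facts (html.parser lowercases tag and attribute names)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.facts = []
        self._current = None  # attributes of the open ix:nonFraction, if any
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "ix:nonfraction":
            self._current = dict(attrs)
            self._buffer = []

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != "ix:nonfraction" or self._current is None:
            return
        attrs, self._current = self._current, None
        raw = "".join(self._buffer).strip()
        try:
            value = Decimal(raw.replace(",", ""))
            value *= Decimal(10) ** int(attrs.get("scale") or 0)
            if attrs.get("sign") == "-":
                value = -value
        except (InvalidOperation, ValueError):
            value = None  # unrecognized display format; leave for manual handling
        self.facts.append({
            "name": attrs.get("name"),
            "context": attrs.get("contextref"),
            "unit": attrs.get("unitref"),
            "raw": raw,
            "value": value,
        })


def extract_facts(html: str) -> list:
    parser = NonFractionCollector()
    parser.feed(html)
    parser.close()
    return parser.facts
```

Using Decimal rather than float keeps reported amounts exact, which matters when values are later summed or compared against filed totals.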
Governance analysts and assurance teams review Item 9 (auditor disagreements), Item 9A (management's report on internal control and the auditor's attestation), and the independent auditor's report within Item 8 to track auditor rotations, qualified opinions, material weaknesses, and critical audit matters. SOX Section 302 and 906 certifications (EX-31.x and EX-32.x exhibits) confirm officer-level attestation. Screening these sections across the full registrant population flags companies with control deficiencies or recent auditor changes for deeper review.
Dataset Index JSON API: https://api.sec-api.io/datasets/form-10k-files.json
This endpoint returns metadata about the Form 10-K Files Dataset and a list of all available container files. The response includes the dataset name, description, last updated timestamp, earliest sample date, total records and total size, covered form types, container format, and content file types. It also includes the download URL for the full dataset archive and a list of individual containers with their size, record count, last updated timestamp, and download URL. No API key is required to access this endpoint.
Use this endpoint to monitor which containers have been updated in the most recent refresh run, so you can selectively download only the containers that changed on a given day rather than re-downloading the entire dataset.
{
  "datasetId": "1f13365b-9ade-61de-8797-ad37148434da",
  "datasetDownloadUrl": "https://api.sec-api.io/datasets/form-10k-files.zip",
  "name": "Form 10-K Files Dataset",
  "updatedAt": "2026-04-17T02:53:43.651Z",
  "earliestSampleDate": "1993-11-01",
  "totalRecords": 1999462,
  "totalSize": 45701904818,
  "formTypes": ["10-K", "10-K/A"],
  "containerFormat": "ZIP",
  "fileTypes": ["TXT", "JSON", "HTML", "PDF", "XFD", "FRM"],
  "containers": [
    {
      "downloadUrl": "https://api.sec-api.io/datasets/form-10k-files/2026/2026-04.zip",
      "key": "2026/2026-04.zip",
      "size": 14523891,
      "records": 187,
      "updatedAt": "2026-04-17T02:53:43.651Z"
    }
  ]
}
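Selective refresh against this index could be sketched as below: fetch the JSON, then keep only containers whose updatedAt is later than the last successful sync. The field names come from the example response above; the function names, the timeout, and the idea of storing a "last sync" timestamp are assumptions for illustration.

```python
import json
from datetime import datetime
from urllib.request import urlopen

INDEX_URL = "https://api.sec-api.io/datasets/form-10k-files.json"


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def fetch_index(url: str = INDEX_URL) -> dict:
    """Download and decode the dataset index (no API key needed for this endpoint)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def containers_updated_since(index: dict, since: str) -> list:
    """Return container entries whose updatedAt is strictly later than `since`."""
    cutoff = parse_timestamp(since)
    return [
        container
        for container in index.get("containers", [])
        if parse_timestamp(container["updatedAt"]) > cutoff
    ]
```

A typical loop would call `fetch_index()`, pass the result to `containers_updated_since()` with the stored timestamp, download the returned containers, and then record the index's own updatedAt as the new sync point.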
Download Entire Dataset: https://api.sec-api.io/datasets/form-10k-files.zip?token=YOUR_API_KEY
Downloads the complete Form 10-K Files Dataset as a single ZIP archive containing all containers. This endpoint requires an API key passed via the token query parameter.
Download Single Container: https://api.sec-api.io/datasets/form-10k-files/2026/2026-04.zip?token=YOUR_API_KEY
Downloads one individual container file, such as a monthly archive, instead of the full dataset. Container paths are listed in the containers array returned by the dataset index JSON API. This endpoint requires an API key passed via the token query parameter.
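A download helper might be sketched as follows. It builds the container URL from a key in the containers array and streams the response to disk; the base URL and token parameter follow the endpoint shown above, while the function names and timeout are illustrative assumptions. Error handling and retries are omitted.

```python
import shutil
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE_URL = "https://api.sec-api.io/datasets/form-10k-files"


def container_url(key: str, token: str) -> str:
    """Build the download URL for one container key, e.g. '2026/2026-04.zip'."""
    return f"{BASE_URL}/{quote(key)}?{urlencode({'token': token})}"


def download_container(key: str, token: str, dest_path: str) -> str:
    """Stream one container ZIP to dest_path without loading it into memory."""
    with urlopen(container_url(key, token), timeout=60) as response, \
            open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest_path
```

Streaming with `shutil.copyfileobj` avoids holding a multi-megabyte archive in memory, which matters more for the full-dataset endpoint than for a single month.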
The dataset covers Form 10-K (annual reports) and Form 10-K/A (amendments to annual reports) filed with SEC EDGAR by domestic registrants under the Securities Exchange Act of 1934.
A single record is the complete set of files from one EDGAR submission — a metadata.json index file plus all non-image document files (the primary annual report and exhibits) — identified by a unique accession number.
The Form 10-K Files Dataset covers filings from November 1993, when EDGAR electronic filing began, to the present. All domestic registrants were required to file electronically by 1996.
Domestic registrants with an active reporting obligation under Section 13 or Section 15(d) of the Securities Exchange Act of 1934, including operating companies, REITs, limited partnerships, SPACs, and business development companies. Foreign private issuers file on Form 20-F instead.
The deadline depends on filer category: 60 days after fiscal year-end for large accelerated filers (public float >= $700 million), 75 days for accelerated filers ($75 million to < $700 million), and 90 days for non-accelerated filers.
XBRL datasets provide pre-extracted, taxonomy-tagged numeric facts (revenue, net income, total assets) in tabular form. The Form 10-K Files Dataset preserves the complete filing documents — full narrative content, all exhibits, and original formatting — supporting analysis that requires text, exhibits, or document-level context beyond structured numeric values.
The dataset is distributed as ZIP containers organized by month. Individual filing records contain files in HTML, TXT, PDF, JSON, XFD, and FRM formats, each wrapped in EDGAR's SGML document envelope.