The Form 11-K Files Dataset is a comprehensive archive of every Form 11-K and Form 11-K/A annual report filed on EDGAR by employee stock purchase, savings, and similar benefit plans whose participation interests are registered as securities under the Securities Act of 1933. One record represents a single EDGAR submission — an accession-named folder containing a normalized metadata.json descriptor plus the full set of primary and exhibit documents exactly as transmitted to EDGAR (minus image files). The filers are the plans themselves, acting as distinct Exchange Act registrants under Section 15(d), with reports signed by plan administrators or plan-committee members rather than by the sponsoring issuer's officers. Coverage begins in January 1994, when the phased EDGAR mandate brought Form 11-K filers into electronic filing, and extends to the present. The dataset is distributed as monthly ZIP containers keyed YYYY/YYYY-MM.zip, and record payloads include TXT, JSON, HTML, PDF, and XML file types.
Form 11-K is the annual report mandated by Section 15(d) of the Securities Exchange Act of 1934 for employee stock purchase, savings, profit-sharing, 401(k), and similar plans when interests in those plans constitute securities registered under the Securities Act of 1933 (typically via Form S-8). Rule 15d-21 sets the filing window at 90 days after the plan's fiscal year end, extended to 180 days for plans subject to ERISA. Substantively the filing is a thin cover-and-signature wrapper around a set of plan-level audited financial statements: a Statement of Net Assets Available for Benefits, a Statement of Changes in Net Assets Available for Benefits, the accompanying notes, and a Report of an Independent Registered Public Accounting Firm. Form 11-K/A amends a previously filed 11-K; it re-files the same form type with corrected financials, a restated auditor report, or late-supplied ERISA supplemental schedules, and follows identical structural conventions. Because the form is seasonal and ERISA-driven, filings cluster heavily in late June for calendar-year plans.
The dataset captures every Form 11-K and Form 11-K/A submission on EDGAR from January 1994 forward. Paper Form 11-K filings from earlier years are not included. Each month's filings are packaged into a ZIP container keyed YYYY/YYYY-MM.zip, and every accession folder inside preserves the submission exactly as transmitted to EDGAR — metadata descriptor, SGML-wrapped primary document, and non-image exhibits — so that downstream parsers can reconstruct the original filing byte-for-byte aside from excluded graphics.
One record in the Form 11-K Files Dataset is a complete EDGAR submission of either Form 11-K (Annual Report of Employee Stock Purchase, Savings, and Similar Plans) or Form 11-K/A (its amendment), identified by an 18-digit zero-padded EDGAR accession number such as 000007426025000006. Physically, a record is an accession-named subfolder inside a month-slice ZIP container (keyed YYYY/YYYY-MM.zip, e.g. 2025/2025-01.zip). Each accession folder holds a metadata.json descriptor alongside the full set of primary and exhibit documents exactly as transmitted to EDGAR, minus image files. A record therefore carries two concentric layers: a dataset-packaging layer (the accession folder and its parsed metadata) and a filing-content layer (the SGML-wrapped 11-K narrative, its financial statements, and its exhibits).
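The two accession-number spellings (the 18-digit folder name and the dashed form used in metadata.json) and the container key can be converted mechanically. The following is a minimal sketch; the helper names `undash`, `dash`, and `container_key` are hypothetical and are not part of the dataset or any API.

```python
import re


def undash(accession_no: str) -> str:
    """Turn '0000074260-25-000006' into the 18-digit folder-name form."""
    digits = accession_no.replace("-", "")
    if not re.fullmatch(r"\d{18}", digits):
        raise ValueError(f"not an 18-digit accession number: {accession_no!r}")
    return digits


def dash(folder_name: str) -> str:
    """Turn '000007426025000006' into the dashed 10-2-6 form."""
    digits = undash(folder_name)  # also validates the input
    return f"{digits[:10]}-{digits[10:12]}-{digits[12:]}"


def container_key(year: int, month: int) -> str:
    """Build the monthly container key, e.g. '2025/2025-01.zip'."""
    return f"{year:04d}/{year:04d}-{month:02d}.zip"


print(dash("000007426025000006"))  # 0000074260-25-000006
print(container_key(2025, 1))      # 2025/2025-01.zip
```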
At the outermost level a record is an accession folder holding a small, self-contained bundle:
YYYY-MM/
  {18-digit-accession}/
    metadata.json
    {primary-11K-document}.htm (or .pdf / .xml)
    {exhibit-document}.htm (e.g. EX-23 auditor consent)
    ...
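A monthly container can be walked without unpacking it to disk. The sketch below groups member names by accession folder; it assumes only that each record sits under a path component made of 18 digits, as in the layout above. The function name `group_by_accession` is hypothetical.

```python
import zipfile
from collections import defaultdict


def group_by_accession(zip_file) -> dict[str, list[str]]:
    """Map each 18-digit accession folder to the file names it contains.

    `zip_file` may be a path or a file-like object.
    """
    records = defaultdict(list)
    with zipfile.ZipFile(zip_file) as zf:
        for name in zf.namelist():
            parts = [p for p in name.split("/") if p]
            # Look for the accession folder among the parent components;
            # bare directory entries have no file part and are skipped.
            for i, part in enumerate(parts[:-1]):
                if len(part) == 18 and part.isdigit():
                    records[part].append("/".join(parts[i + 1:]))
                    break
    return dict(records)
```

Calling `group_by_accession("2025-01.zip")` on a downloaded container would return one key per record, each mapped to names such as `metadata.json` and the primary document file.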
Inside the folder, three content layers sit on top of each other:
A metadata.json descriptor, a pre-parsed rendering of the EDGAR SGML submission header plus the filing-index table.
An SGML document envelope (<DOCUMENT>…<TYPE>…<SEQUENCE>…<FILENAME>…<DESCRIPTION>…<TEXT>…</TEXT></DOCUMENT>) wrapping each preserved document exactly as submitted.
The payload inside each <TEXT> block: HTML (overwhelmingly common in the modern era), PDF, or XML for the primary 11-K; HTML or PDF for exhibits such as the auditor's consent.
metadata.json descriptor
metadata.json is a single JSON object that normalizes the EDGAR submission header and filing index into explicit fields.
Scalar fields identify the filing and its timing:
formType: "11-K" or "11-K/A".
accessionNo: dashed EDGAR accession number (e.g. "0000074260-25-000006").
id: a stable internal dataset record identifier (a 32-character hexadecimal string).
description: a human-readable form description (e.g. "Form 11-K - Annual report of employee stock purchase, savings and similar plans").
filedAt: ISO-8601 timestamp with offset ("2025-01-28T17:16:40-05:00").
periodOfReport: the plan's fiscal period end ("2024-08-01" on a short-period terminating plan; frequently non-calendar).
linkToFilingDetails, linkToHtml, linkToTxt, linkToXbrl: convenience URLs pointing at, respectively, the primary document on EDGAR, the filing-index page, the consolidated SGML full-submission text, and the XBRL instance. linkToXbrl is empty for this form type, since Form 11-K has never been subject to XBRL or Inline XBRL tagging.
Two nested arrays carry the richer structure.
documentFormatFiles[] is a row-for-row mirror of the EDGAR filing-index table. Each entry has:
sequence: ordinal position as a string ("1" for the primary document, higher integers for exhibits, blank for the consolidated full-submission text file).
size: byte count as a string.
documentUrl: the document's URL on EDGAR.
description: the filer-supplied label.
type: the EDGAR exhibit-type code (11-K, 11-K/A, EX-23, EX-99, GRAPHIC, or blank for the consolidated submission text).
A typical primary-document entry:
{
  "sequence": "1",
  "size": "127874",
  "documentUrl": "https://www.sec.gov/Archives/edgar/data/74260/000007426025000006/rmicprofitsharingplan2024.htm",
  "description": "11-K",
  "type": "11-K"
}
entities[] is an array of filing entities (one filer in the common case; more when co-registrants or subject companies are present). Fields per entity include:
companyName: name with a role-tag suffix such as "(Filer)" or "(Subject Company)".
cik: Central Index Key.
irsNo: IRS employer identification number.
fileNo: SEC file number (e.g. "001-10607").
filmNo: EDGAR film number.
type: form type in the entity's filer role.
act: registration act ("34" for the Exchange Act).
sic: SIC code and label, e.g. "6351 Surety Insurance".
stateOfIncorporation: two-letter code.
fiscalYearEnd: MMDD string (e.g. "1231").
tickers: array of ticker symbols (e.g. ["ORI"]).
Two further arrays, seriesAndClassesContractsInformation and dataFiles, are present for structural uniformity across datasets but are empty for Form 11-K records, because plan filings do not carry fund-series registrants and Form 11-K has no XBRL data files.
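Reading the descriptor usually comes down to two lookups: the primary document row and the filer entities. The sketch below shows both under the field names described above. The helper names and the sample entity (name and CIK) are made up for illustration, and only the document row reuses values from the example entry.

```python
import json
from typing import Optional

# Illustrative descriptor: the entity name and CIK are placeholders.
SAMPLE = json.loads("""
{
  "formType": "11-K",
  "accessionNo": "0000074260-25-000006",
  "documentFormatFiles": [
    {"sequence": "1", "size": "127874",
     "documentUrl": "https://www.sec.gov/Archives/edgar/data/74260/000007426025000006/rmicprofitsharingplan2024.htm",
     "description": "11-K", "type": "11-K"}
  ],
  "entities": [
    {"companyName": "Example Savings Plan (Filer)", "cik": "1234567"}
  ]
}
""")


def primary_document(meta: dict) -> Optional[dict]:
    """Return the documentFormatFiles row whose sequence is "1", if any."""
    for row in meta.get("documentFormatFiles", []):
        if row.get("sequence") == "1":
            return row
    return None


def filer_ciks(meta: dict) -> list:
    """Collect CIKs of every entity tagged "(Filer)", iterating the whole array."""
    return [
        entity["cik"]
        for entity in meta.get("entities", [])
        if "(Filer)" in entity.get("companyName", "") and "cik" in entity
    ]


print(primary_document(SAMPLE)["type"])  # 11-K
print(filer_ciks(SAMPLE))                # ['1234567']
```

Sizes and sequences are strings in the descriptor, so numeric comparisons need an explicit `int(...)` conversion.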
Every non-metadata file inside the accession folder preserves EDGAR's SGML document envelope. The primary 11-K opens with a block such as:
<DOCUMENT>
<TYPE>11-K
<SEQUENCE>1
<FILENAME>rmicprofitsharingplan2024.htm
<DESCRIPTION>11-K
<TEXT>
<html><head>
<!-- Document created using Wdesk -->
...
</TEXT>
</DOCUMENT>
The header tags restate what is also listed in documentFormatFiles[]: <TYPE> is the EDGAR exhibit-type code, <SEQUENCE> is the ordinal position, <FILENAME> is the on-disk filename, and <DESCRIPTION> is the filer-supplied label. The actual filing payload lives strictly between <TEXT> and </TEXT>. Downstream parsers must strip this envelope before feeding the inner HTML, XML, or PDF to a renderer or extractor. Exhibits follow the identical pattern; an EX-23 auditor consent, for example, opens with <TYPE>EX-23, <SEQUENCE>2, <DESCRIPTION>EX-23.
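Stripping the envelope can be done with two regular expressions. This is a minimal sketch that assumes one <DOCUMENT> block per file and a payload that does not itself contain a literal </TEXT>; the function name `split_envelope` is hypothetical.

```python
import re

# Header tags sit one per line, with no closing tag.
HEADER_TAG = re.compile(r"^<(TYPE|SEQUENCE|FILENAME|DESCRIPTION)>(.*)$", re.MULTILINE)
# The payload is everything between <TEXT> and the first </TEXT>.
TEXT_BLOCK = re.compile(r"<TEXT>\r?\n?(.*?)\r?\n?</TEXT>", re.DOTALL)


def split_envelope(raw: str) -> tuple[dict, str]:
    """Return (header fields, inner payload) for one SGML-wrapped document."""
    match = TEXT_BLOCK.search(raw)
    if match is None:
        return {}, raw  # no envelope found; hand the input back unchanged
    header_part = raw[: match.start()]
    header = {tag.lower(): value.strip() for tag, value in HEADER_TAG.findall(header_part)}
    return header, match.group(1)


wrapped = (
    "<DOCUMENT>\n<TYPE>11-K\n<SEQUENCE>1\n"
    "<FILENAME>rmicprofitsharingplan2024.htm\n<DESCRIPTION>11-K\n"
    "<TEXT>\n<html><body>FORM 11-K</body></html>\n</TEXT>\n</DOCUMENT>\n"
)
header, payload = split_envelope(wrapped)
print(header["type"], header["filename"])
print(payload)
```

The returned payload is what should go to an HTML parser or text extractor; PDF payloads may need further decoding that this sketch does not attempt.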
Inside the <TEXT> block of the primary document, the Form 11-K narrative follows a stable canonical order derived from the SEC form instructions:
Cover page. Opens with "SECURITIES AND EXCHANGE COMMISSION / Washington, D.C. 20549 / FORM 11-K", followed by [X] checkboxes for Annual Report vs. Transition Report under Section 15(d) of the Exchange Act, the commission file number of the issuer whose securities underlie the plan (e.g. 001-10607), the full plan name (e.g. "THE REPUBLIC MORTGAGE INSURANCE COMPANY AND AFFILIATED COMPANIES PROFIT SHARING PLAN"), and the sponsoring issuer's name and principal executive office address. A "Total Pages" count is often included.
Signature block. Under the form's required signature clause, the Plan Administrator — or a designated member of the plan's administration/investment committee — signs and dates on behalf of the plan. Although the form's instructions call for signatures at the end, many filings place the signature page directly after the cover (the financial statements then being introduced as attachments), while others place it after the notes; both orderings are common and extraction heuristics should tolerate either.
Financial Statements Index. A short table of contents listing the required statements and their page ranges, typically "Report of Independent Registered Public Accounting Firm", "Statements of Net Assets Available for Benefits", "Statement of Changes in Net Assets Available for Benefits", "Notes to Financial Statements", and — for ERISA plans — any supplemental schedules required by 29 C.F.R. 2520.103-10 (Schedule H, Line 4i Schedule of Assets (Held at End of Year); Schedule of Reportable Transactions; Schedule of Delinquent Participant Contributions). When no schedules are required, the index explicitly states "not applicable".
Report of Independent Registered Public Accounting Firm. The auditor's opinion letter, addressed to the Plan Administrator and Participants, structured in PCAOB-prescribed form: an Opinion on the Financial Statements paragraph, a Basis for Opinion paragraph referencing PCAOB standards and auditor independence, an auditor-tenure statement ("We have served as the Plan's auditor since ..."), any emphasis-of-matter paragraphs (e.g. a Plan Merger paragraph when the plan was merged into a successor plan mid-period), and the audit firm's name, city, and report date. For ERISA plans the report usually adds a separate section on the supplemental information required by the Department of Labor, and for Section 103(a)(3)(C) limited-scope engagements a separate two-pronged auditor's report structure applies.
Statements of Net Assets Available for Benefits. A comparative two-column statement of plan assets and liabilities as of the current and prior fiscal period ends. Line items typically include cash and cash equivalents, participant-directed investments at fair value categorized by vehicle (insurance company guaranteed interest accounts, insurance company pooled separate accounts, mutual funds, collective investment trusts, employer stock funds such as a sponsor's common-stock fund), notes receivable from participants (participant loans), accrued contributions and income receivable, and accrued plan expenses payable; a "Net assets available for benefits" total closes the statement.
Statement of Changes in Net Assets Available for Benefits. A single-period roll-forward. Additions capture investment income (dividends, interest, net appreciation/depreciation in fair value of investments), participant and employer contributions, and rollovers. Deductions capture benefits paid to participants, administrative expenses, and deemed loan distributions. Plan mergers, spin-offs, or transfers in/out appear as separate line items (e.g. Transfers out of the Plan to a successor plan). The net change reconciles to the beginning and ending "Net assets available for benefits" balances carried from the prior statement.
Notes to Financial Statements. A numbered series of notes following U.S. GAAP for defined-contribution plans (ASC 962). Note 1 is almost always a "Description of the Plan" covering plan type, eligibility, contributions (employee pre-tax, Roth, after-tax; employer matching and profit-sharing formulas), vesting, investment options, participant accounts, loans, forfeitures, and termination provisions. Subsequent notes cover Summary of Significant Accounting Policies (basis of accounting, use of estimates, investment valuation and fair-value hierarchy, income recognition, payment of benefits, administrative expenses), Investments, Fair Value Measurements (with Level 1/2/3 hierarchy tables), Related-Party and Party-in-Interest Transactions, Tax Status (referencing IRS determination or opinion letters and qualification under IRC Sections 401(a) and 401(k)), Risks and Uncertainties, Plan Termination, Plan Mergers or Amendments, Reconciliation to Form 5500, and Subsequent Events.
ERISA supplemental schedules (when applicable). Most commonly the Schedule H, Line 4i — Schedule of Assets (Held at End of Year), listing each investment identity, shares or units, cost, and current value; and, when required, a Schedule of Reportable Transactions (5% reportable transactions) or a Schedule of Delinquent Participant Contributions. These schedules typically trail the notes and are separately referenced in the auditor's report on supplemental information. Plans that are winding down, or whose sponsors are non-ERISA, may mark these "not applicable".
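The roll-forward in the Statement of Changes in Net Assets Available for Benefits gives extractors a cheap consistency check: beginning net assets plus additions, minus deductions, plus net transfers should equal ending net assets. A small sketch follows; the function name and the figures are invented for illustration and do not come from any filing.

```python
from decimal import Decimal


def rollforward_gap(beginning, additions, deductions, transfers_net, ending) -> Decimal:
    """Return ending minus (beginning + additions - deductions + net transfers).

    A result of zero means the extracted figures reconcile; anything else
    points at a parsing error or an unmodelled line item.
    """
    expected = (
        Decimal(beginning) + Decimal(additions) - Decimal(deductions) + Decimal(transfers_net)
    )
    return Decimal(ending) - expected


# Hypothetical plan: transfers out to a successor plan are entered as negative.
gap = rollforward_gap("1000000", "250000", "100000", "-50000", "1100000")
print(gap)  # 0
```

Decimal is used instead of float so that cent-level amounts parsed from text compare exactly.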
Form 11-K primary documents do not carry a cover-page financial-data highlights table; the form is not XBRL-tagged, which is why linkToXbrl and dataFiles are uniformly empty in metadata.json.
The dominant exhibit is EX-23, Consent of Independent Registered Public Accounting Firm, permitting the auditor's report to be incorporated by reference into the sponsoring issuer's Form S-8 registration statement (or other registration statement) that originally registered the plan interests. Textually, EX-23 is short — a header ("CONSENT OF INDEPENDENT REGISTERED PUBLIC ACCOUNTING FIRM"), one or two sentences of consent language referencing the S-8 registration number (e.g. "Registration Statement (No. 333-37210) on Form S-8"), the audit firm's signature block, city, and date. Rendering varies materially: many filers produce clean typed HTML, while others submit the consent as a scanned image referenced by <IMG src="...jpg"> inside an otherwise minimal HTML shell, sometimes with the typed text mirrored into a hidden white-on-white <FONT> block so that EDGAR's full-text search can still index the words. Other exhibits occasionally appear — EX-99 attaching a separate financial-statement document, a re-filed plan instrument, or plan-amendment attachments — but EX-23 dominates the exhibit population.
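Because some consents are little more than an image reference plus hidden text, a text extractor for EX-23 should collect every text node regardless of styling and record which images were referenced. The sketch below uses only the standard-library HTML parser; the class and function names are hypothetical, and the sample markup is invented to mimic the pattern described above.

```python
from html.parser import HTMLParser


class ConsentTextCollector(HTMLParser):
    """Collect all text nodes (visible or not) and all <img> sources."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self.image_sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.image_sources.append(src)

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def consent_text(html: str):
    """Return (joined text, list of referenced image sources)."""
    parser = ConsentTextCollector()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks), parser.image_sources


sample = (
    '<html><body><IMG src="consent.jpg">'
    '<FONT color="white">CONSENT OF INDEPENDENT REGISTERED PUBLIC ACCOUNTING FIRM</FONT>'
    "</body></html>"
)
text, images = consent_text(sample)
print(text)
print(images)
```

An empty text result combined with a non-empty image list is a reasonable signal that the consent exists only as a scanned image, which the dataset does not bundle.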
A record includes the metadata.json descriptor plus every document file from the original EDGAR submission retaining its SGML envelope: the primary 11-K or 11-K/A document (HTML, PDF, or XML payload) and all non-image exhibits (most commonly EX-23, occasionally EX-99 and others). The file-types found across the dataset are TXT, JSON, HTML, PDF, and XML, though in modern filings most accession folders consist of one or two HTML documents plus the JSON descriptor. Pre-2000 records more frequently use plain ASCII .txt payloads.
Image files (GIF, JPG, PNG) are excluded dataset-wide even when documentFormatFiles[] lists them with type: "GRAPHIC". Consumers diffing the accession folder's contents against the metadata descriptor should expect missing files wherever type is GRAPHIC (for instance, a scanned consent letterhead referenced by an EX-23 HTML will not be bundled).
The consolidated {accessionNo}.txt full-submission SGML file — the aggregate archive EDGAR exposes at the submission level — is likewise not bundled into the accession folder even though it typically appears in documentFormatFiles[] with blank sequence and blank type; it is retrievable via linkToTxt.
Upstream EDGAR machine artifacts (R-files, FilingSummary.xml, financial-report metadata) do not exist for this form type because Form 11-K does not pass through the EDGAR Renderer / XBRL pipeline.
Content that the original 11-K incorporates by reference — for example, prior-period financial statements referenced rather than re-filed, or exhibits referenced to the sponsor's Form S-8 — is not materialized inside the record; only the incorporating language is present.
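The exclusions above imply a simple reconciliation rule between documentFormatFiles[] and the folder contents: skip GRAPHIC rows and rows with a blank sequence, then compare what remains. The sketch below assumes the on-disk file name equals the last path segment of documentUrl, which matches the example entry but is not guaranteed by the source; the function names are hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def expected_files(meta: dict) -> set:
    """File names the folder should contain according to the descriptor."""
    names = set()
    for row in meta.get("documentFormatFiles", []):
        if row.get("type") == "GRAPHIC":
            continue  # images are excluded dataset-wide
        if not (row.get("sequence") or "").strip():
            continue  # blank sequence: consolidated full-submission .txt
        name = PurePosixPath(urlparse(row.get("documentUrl", "")).path).name
        if name:
            names.add(name)
    return names


def reconcile(meta: dict, on_disk) -> dict:
    """Compare descriptor rows with the names found in the accession folder."""
    expected = expected_files(meta)
    present = set(on_disk) - {"metadata.json"}
    return {
        "missing": sorted(expected - present),
        "unexpected": sorted(present - expected),
    }
```

Both lists coming back empty means the folder matches the descriptor once the documented exclusions are taken into account.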
Form 11-K's skeleton has remained stable since the 1980s, but several rule changes have reshaped the interior content:
The cover page, signature page, and the basic ordering of statements → notes → schedules have not changed materially since EDGAR's inception.
Form 11-K has followed EDGAR's general format evolution. Filings from 1994 through the late 1990s are almost universally plain ASCII .txt inside the SGML envelope, with financial statements rendered as fixed-width text tables using spaces and hyphens for column alignment. HTML filings became permissible in the late 1990s (EDGAR began accepting HTML under Release No. 33-7472, effective January 1998) and came to dominate during the 2000s, introducing inline CSS styling, proper <table> markup for financial statements, and later vendor-specific HTML (Workiva/Wdesk, Toppan Merrill, Donnelley) recognizable by embedded generator comments such as <!-- Document created using Wdesk -->. PDF attachments appear intermittently across all eras, typically when audit firms submit scanned originals or when a filer selects PDF as the primary document; these retain the SGML envelope around a base64- or binary-encoded PDF payload. XML payloads are rare and generally limited to vendor-produced structured renderings. The dataset makes no attempt to normalize across these formats — each document is preserved in the format in which it was originally submitted.
Form 11-K has never been subject to XBRL or Inline XBRL tagging, so records do not carry .xsd, XBRL instance, _htm.xml, or _cal/_def/_lab/_pre.xml taxonomy files, and linkToXbrl / dataFiles are uniformly empty. This is a permanent structural property of the form, not a missing feature of the dataset.
Amendments use the identical accession-folder / metadata / SGML-document architecture. The formType field flips to "11-K/A", the primary document's type and description in documentFormatFiles[] become 11-K/A, and the <TYPE> tag in the SGML wrapper reads 11-K/A. Content-wise, amendments typically restate one or more of the financial statements (for example to correct misclassifications between pooled separate accounts and mutual funds, to revise fair-value levelling, or to reflect a reissued auditor's report) and often include an "Explanatory Note" section on the cover page or a dedicated "Restatement" note. A re-dated auditor report with a dual-date or reissued date is common, and the EX-23 consent is typically re-filed alongside. In some cases the 11-K/A is filed solely to add previously omitted ERISA supplemental schedules.
Every document file keeps its <DOCUMENT>…<TEXT>…</TEXT></DOCUMENT> wrapper; parsers that feed the file directly to an HTML or PDF renderer without stripping the wrapper will fail outright or misrender the top of the document.
periodOfReport is plan-fiscal, not calendar. Many plans have non-calendar fiscal years, and terminating or merging plans often file short final periods. Do not assume periodOfReport aligns with the sponsor issuer's fiscal year.
When an EX-23 is an HTML shell around an <IMG> reference to an excluded JPG, the only machine-readable text may live in a hidden <FONT color="white"> block; extractors that ignore hidden text will miss the consent wording entirely.
documentFormatFiles[] vs. on-disk contents. Reconciliation must tolerate gaps: type: "GRAPHIC" rows refer to excluded image files, and rows with blank sequence refer to the consolidated full-submission .txt, which is also not bundled.
entities[] is typically a single filer, but the array shape supports co-registrants and subject companies; consumers should iterate rather than index position zero.
Each record in this dataset is one Form 11-K or Form 11-K/A annual report filed on EDGAR by an employee benefit plan whose participation interests are separately registered securities. The filer is the plan itself, acting as a distinct Exchange Act registrant with its own CIK, not the operating company that sponsors the plan.
Form 11-K is filed by employee benefit plans whose interests have been registered under the Securities Act of 1933, typically on Form S-8 together with the underlying employer shares. The plan population includes:
The gating condition is registration of the plan's interests as securities. A plan that holds employer stock but does not offer separately registered participation interests, and any plan with no employer securities at all, is outside the Form 11-K regime.
Sponsors of these plans are ordinarily Exchange Act reporting issuers, most often U.S. public companies, though foreign private issuers that register plan interests are equally subject to Form 11-K even while the sponsor itself reports on Form 20-F or Form 40-F.
The plan appears in EDGAR as its own filer (for example, "XYZ Corp. Savings and Investment Plan") with a separate CIK and accession number. Because the plan has no officers or directors, Form 11-K is signed on behalf of the plan by the plan administrator or by members of the committee designated in the plan document (benefits, plan, or investment committee). Signers act in that fiduciary capacity, not as officers of the sponsor, even when the same individuals also hold officer roles at the sponsor.
Filing is triggered by the close of the plan's fiscal year, not by participant activity or corporate events at the sponsor. The obligation arises under Section 15(d) of the Securities Exchange Act of 1934, which imposes periodic reporting on any issuer (including a plan) that has an effective Securities Act registration statement. Rule 15d-21 permits the plan to satisfy that annual reporting obligation by filing Form 11-K separately, in lieu of including the plan's financial statements in the sponsor's Form 10-K. The sponsor's 10-K does not, by itself, discharge the plan's reporting duty when the plan's interests are registered.
Each plan files one Form 11-K per fiscal year for as long as it remains subject to Section 15(d), plus any amendments.
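Two deadlines apply depending on whether the plan is governed by ERISA and how its financial statements are prepared: 90 days after the plan's fiscal year end by default, and 180 days after fiscal year end for plans subject to ERISA.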
Most filings in the dataset cluster in June and July, reflecting the 180-day window for calendar-year ERISA plans, with a smaller late-March cluster from 90-day filers. Plans with non-calendar fiscal years appear throughout the year.
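The 90-day and 180-day windows can be checked directly from filedAt and periodOfReport. The sketch below is deliberately naive: it ignores any roll-forward for weekends or holidays and any filing extension, and it takes ERISA status as an input because metadata.json does not carry it. The function names are hypothetical.

```python
from datetime import date, datetime


def days_after_period_end(filed_at: str, period_of_report: str) -> int:
    """Calendar days between the plan's fiscal period end and the filing date."""
    filed = datetime.fromisoformat(filed_at).date()  # date in the timestamp's own offset
    period_end = date.fromisoformat(period_of_report)
    return (filed - period_end).days


def is_late(filed_at: str, period_of_report: str, erisa_plan: bool) -> bool:
    """True when the filing falls outside the 180-day (ERISA) or 90-day window."""
    window = 180 if erisa_plan else 90
    return days_after_period_end(filed_at, period_of_report) > window


# Values from the field examples: filed 2025-01-28 for a period ending 2024-08-01.
print(days_after_period_end("2025-01-28T17:16:40-05:00", "2024-08-01"))  # 180
print(is_late("2025-01-28T17:16:40-05:00", "2024-08-01", erisa_plan=True))  # False
```

Results near the boundary deserve manual review, since the sketch counts raw calendar days only.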
Form 11-K/A records amend a previously filed annual report for the same plan and plan year. Common causes include restated financial statements, replacement auditor consents or reports, corrected signatures, added exhibits, or responses to SEC staff comments. Each amendment is a distinct EDGAR accession, and a single plan-year may be amended more than once.
The dataset is limited to EDGAR-native records. The earliest filings date to January 1994, following the phased EDGAR mandate that brought Form 11-K filers into electronic filing in the mid-1990s; paper Form 11-K filings from earlier years are not included.
Form 11-K sits in a narrow slot on EDGAR: it is the only filing type that carries audited, plan-level financial statements for employee stock purchase, savings, and similar plans whose interests are registered as securities. Because its subject matter brushes against sponsor annual reporting, plan registration, insider and proxy disclosure, and Department of Labor plan returns, it is routinely confused with several adjacent datasets. The comparisons below isolate where overlap ends.
10-K reports the consolidated operating company; 11-K reports the benefit plan as a separate financial reporting entity.
S-8 is the upstream trigger for 11-K: registering plan shares on S-8 creates the Section 15(d) obligation that 11-K satisfies annually.
11-K/A is the amendment variant and is included in this dataset.
Form 5500 is the closest functional analog, but it is filed with the Department of Labor, outside the SEC.
XBRL-based products (Financial Statement Data Sets, company facts) are built from 10-K/10-Q inline XBRL.
Form 11-K is distinct because of three attributes that no other EDGAR dataset combines: the plan (not the issuer) as the reporting entity, a standalone audit opinion on that plan's financial statements, and the ERISA supplemental schedules when applicable. 10-K and its XBRL derivatives describe the sponsor. S-8 registers plan securities but never audits them. DEF 14A and Form 4 cover governance and insider activity, not plan financials. Form 5500 covers the same plan but lives at DOL, in a different format, without the GAAP audit opinion. 11-K/A is in-scope here. For audited financial statements of SEC-registered benefit plans from 1994 forward, this dataset is the sole comprehensive source on EDGAR.
Form 11-K filings contain audited plan financial statements, ERISA supplemental schedules, and notes on employer-stock holdings. The dataset serves a narrow professional community working on benefit plans, plan audits, and ERISA compliance.
Counsel reviews peer 11-K language when drafting plan documents, advising on SEC registration for company-stock funds, and preparing Form S-8 registration statements that incorporate 11-K financials by reference. They focus on plan description notes, vesting and loan policies, prohibited transaction disclosures, and the Schedule of Assets Held at End of Year and Schedule of Reportable Transactions.
Accounting firms performing ERISA limited-scope and full-scope audits benchmark the report of the independent registered public accounting firm, the statement of net assets available for benefits, and the statement of changes in net assets. Technical accounting groups use the corpus to calibrate fair value hierarchy disclosures, subsequent-event language, and adoption patterns for reporting on benefit-responsive contracts, master trust interests, and self-directed brokerage accounts.
Consultants advising plan sponsors on design, vendor selection, and investment menu construction study peer investment options, fee disclosures, matching formulas, and employer-stock administration. Outputs include fee benchmarks, RFP comparisons, and fiduciary review memos measuring a plan against similarly sized peers.
Research analysts at recordkeepers mine investment schedules and fund-level holdings to track menu composition, target-date adoption, and employer-stock concentration. The data feeds competitive intelligence, defined contribution market sizing, and product development for stable value, target-date, and managed account offerings.
Portfolio managers, stewardship teams, and risk officers use 11-K filings to evaluate employer-stock concentrations inside plans of portfolio companies. The schedule of assets and statement of net assets show the absolute value and percentage of plan assets in employer securities, supporting engagement on diversification, proxy voting on stock-fund matters, and assessment of litigation risk tied to concentration.
Researchers studying 401(k) design and employer-stock concentration use the bulk corpus for panel datasets on asset growth, allocation trends, participant loans, and menu evolution. Line items from the statement of changes in net assets, contribution and rollover notes, and the ERISA supplemental schedules feed econometric work.
Actuaries working on hybrid plans, ESOPs, and nonqualified deferred compensation study contribution patterns and valuation methods for non-readily-marketable securities. ESOP appraisal disclosures and benefit payment schedules feed cash flow and funded-status models.
Enforcement and compliance analysts monitor filings for late submissions, missing audit reports, qualified or adverse opinions, disclosed prohibited transactions, and Form 11-K/A amendments that may signal restatements. Surveillance workflows flag plans outside the 90-day or 180-day deadline, modified auditor opinions, and repeat amenders. Plan sponsor compliance teams run the same checks to benchmark their own filing timeliness.
Buy-side diligence teams review target-company 11-K filings for inherited plan liabilities: plan termination provisions, outstanding participant loans, pending litigation disclosed in plan notes, prohibited transactions, and employer-stock concentrations that must be addressed at closing. Outputs include diligence memos, integration plans, and reps-and-warranties allocation of plan liabilities.
Litigation teams in stock-drop, excessive-fee, and fiduciary-breach cases use historical 11-K filings as admissible evidence of plan asset composition at specific fiscal year ends. Audited financials, supplemental schedules, and fiduciary governance notes establish the timeline of employer-stock holdings, fee arrangements, and named fiduciaries.
Engineers building structured retirement plan databases and retrieval systems over ERISA content use the dataset as a primary source. TXT, HTML, JSON, XML, and PDF variants per accession number support parsing pipelines that extract net-assets statements, investment schedules, and auditor opinion text into structured tables or RAG indexes for benefits attorneys, auditors, and consultants.
Each role draws on specific filing elements: the auditor's report, the statements of net assets and changes in net assets, the ERISA supplemental schedules, the notes on employer stock and prohibited transactions, and the 11-K/A amendment stream. Together these support workflows from longitudinal research and opinion surveillance to plan-level concentration review and transactional diligence.
The Form 11-K Files Dataset supports the concrete workflows below. Each use case ties to specific record elements: metadata.json fields, the primary 11-K narrative, the auditor's report, the plan financial statements, the ERISA supplemental schedules, or the EX-23 consent.
Parse the Schedule H, Line 4i Schedule of Assets (Held at End of Year) and the Statement of Net Assets Available for Benefits to extract the current value of the sponsor's common-stock fund and its percentage of total plan assets. Build a longitudinal panel keyed by entities[].cik and periodOfReport to measure concentration drift, forced diversification post-stock-drop, and divestiture pacing. Feeds stewardship engagement memos, proxy-voting rationales on stock-fund proposals, and fiduciary litigation exposure scoring.
Extract the "We have served as the Plan's auditor since ..." sentence, firm name, and report date from the Report of Independent Registered Public Accounting Firm, plus the EX-23 signatory, to build an auditor-market map of ERISA plan audits. Detect auditor changes year-over-year, flag ERISA Section 103(a)(3)(C) engagements (the former limited-scope audits, as reframed by AICPA SAS 136), and surface emphasis-of-matter paragraphs (plan merger, plan termination, going concern). Supports audit-firm competitive intelligence and PCAOB inspection targeting.
Compare filedAt against periodOfReport plus the 90-day (non-ERISA) and 180-day (ERISA) statutory windows to flag delinquent filings. Cross-reference formType "11-K/A" records against prior "11-K" accession numbers for the same plan and fiscal year to identify restatements and repeat amenders, and inspect the "Explanatory Note" or "Restatement" note to categorize the cause (misclassification, reissued opinion, missing supplemental schedules). Produces compliance dashboards for sponsors, enforcement triage lists, and counsel's risk-of-claim matrix.
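Linking amendments to their originals can be sketched as a grouping on filer CIK and periodOfReport. The code below assumes a list of already-parsed metadata.json dictionaries and treats entities whose companyName contains "(Filer)" as the plan; the function name and the sample records are invented for illustration.

```python
from collections import defaultdict
from datetime import datetime


def amendment_chains(records) -> dict:
    """Group accession numbers by (filer CIK, periodOfReport).

    Only plan-years with at least one 11-K/A are returned; accession
    numbers inside each bucket are ordered by filing time.
    """
    chains = defaultdict(lambda: {"originals": [], "amendments": []})
    ordered = sorted(records, key=lambda m: datetime.fromisoformat(m["filedAt"]))
    for meta in ordered:
        bucket = "amendments" if meta["formType"] == "11-K/A" else "originals"
        for entity in meta.get("entities", []):
            if "(Filer)" in entity.get("companyName", ""):
                key = (entity["cik"], meta["periodOfReport"])
                chains[key][bucket].append(meta["accessionNo"])
    return {key: value for key, value in chains.items() if value["amendments"]}


# Made-up records for one plan-year with a single amendment.
original = {"formType": "11-K", "accessionNo": "A-1", "filedAt": "2024-06-20T10:00:00-04:00",
            "periodOfReport": "2023-12-31",
            "entities": [{"companyName": "Example Plan (Filer)", "cik": "1"}]}
amended = dict(original, formType="11-K/A", accessionNo="A-2",
               filedAt="2024-09-01T10:00:00-04:00")
print(amendment_chains([amended, original]))
```

A plan-year with more than one entry under "amendments" is a repeat amender; an amendment with no original in the same key may indicate a corrected periodOfReport and is worth inspecting by hand.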
Mine the Note 1 "Description of the Plan" and the Summary of Significant Accounting Policies for employer-match formulas, vesting schedules, eligibility, Roth availability, and administrative fee allocation; mine the investment detail in the Statement of Net Assets and the Schedule of Assets for menu composition (target-date suites, collective investment trusts, stable-value vehicles, self-directed brokerage). Consultants produce RFP benchmarks and fiduciary review memos positioning a sponsor's plan against similarly sized peers by SIC code (entities[].sic) and plan size band.
Pull the audited Statements of Net Assets and the Schedule of Assets at specific fiscal year ends to establish employer-stock holdings, total plan assets, and participant loan balances on the dates central to a complaint. Combine with the fiduciary-governance language in plan-description and related-party notes to identify named plan administrators and committee members. Output is an exhibit-ready chronology of holdings, fees, and fiduciary structure drawn directly from SEC-filed documents.
For a target company identified by cik or ticker, retrieve the most recent 11-K and any outstanding 11-K/A to review plan termination provisions, participant-loan receivables, prohibited transaction disclosures, delinquent contribution schedules, and pending plan litigation surfaced in the notes. Feed findings into reps-and-warranties negotiation, purchase-price allocation of plan liabilities, and post-close integration plans for plan mergers or spin-offs (the "Transfers out of the Plan" line on the Statement of Changes is a direct signal of prior restructurings).
Use the preserved SGML envelope plus the primary HTML/PDF payload and EX-23 exhibit as a parsing corpus for retrieval systems serving benefits attorneys and plan auditors. The stable canonical order (cover, signature, auditor report, statements, notes, schedules) and the normalized metadata.json fields (formType, periodOfReport, entities[]) support chunking strategies keyed to section headers, while the 1994-forward history provides longitudinal coverage no XBRL-based dataset can substitute for.
Dataset Index JSON API: https://api.sec-api.io/datasets/form-11k-files.json
The dataset index endpoint returns metadata describing the Form 11-K Files Dataset, including the dataset name, description, last updated timestamp, earliest sample date (1994-01-01), total records, total size, covered form types (11-K and 11-K/A), container format (ZIP), and included file types (TXT, JSON, HTML, PDF, XML). It also returns the download URL for the full dataset archive and a list of every individual container file with per-container metadata such as size, record count, last updated timestamp, and a direct download URL.
This endpoint does not require an API key. It can be polled regularly to monitor which containers were updated in the most recent refresh run, so only changed containers need to be re-downloaded day to day.
Example response:
{
  "datasetId": "1f13365b-9ae0-6901-b951-fe8c346ad25f",
  "datasetDownloadUrl": "https://api.sec-api.io/datasets/form-11k-files.zip",
  "name": "Form 11-K Files Dataset",
  "updatedAt": "2026-04-14T14:41:54.992Z",
  "earliestSampleDate": "1994-01-01",
  "totalRecords": 84751,
  "totalSize": 1077637938,
  "formTypes": ["11-K", "11-K/A"],
  "containerFormat": "ZIP",
  "fileTypes": ["TXT", "JSON", "HTML", "PDF", "XML"],
  "containers": [
    {
      "downloadUrl": "https://api.sec-api.io/datasets/form-11k-files/2026/2026-04.zip",
      "key": "2026/2026-04.zip",
      "size": 13818783,
      "records": 154,
      "updatedAt": "2026-04-14T14:41:54.992Z"
    }
  ]
}
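The polling pattern described above can be sketched with the standard library alone. The field names (`containers`, `key`, `updatedAt`) are taken from the example response; the `seen` mapping is a hypothetical local record of the updatedAt values observed on the previous poll.

```python
import json
from urllib.request import urlopen

INDEX_URL = "https://api.sec-api.io/datasets/form-11k-files.json"

def fetch_index(url=INDEX_URL):
    """Fetch and decode the dataset index (no API key needed for this endpoint)."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)

def changed_containers(index, seen):
    """Return containers whose updatedAt differs from the last poll.

    seen maps container key -> updatedAt string from the previous run;
    containers never seen before are always returned.
    """
    return [
        c for c in index.get("containers", [])
        if seen.get(c["key"]) != c["updatedAt"]
    ]
```

After re-downloading the returned containers, a caller would store each container's `key` and `updatedAt` back into `seen` (for example in a small JSON file) so the next poll only picks up newer changes.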
Download Entire Dataset: https://api.sec-api.io/datasets/form-11k-files.zip?token=YOUR_API_KEY
Downloads the complete dataset as a single ZIP archive containing every Form 11-K and Form 11-K/A filing from January 1994 to the present. This endpoint requires an API key.
Download Single Container: https://api.sec-api.io/datasets/form-11k-files/2026/2026-04.zip?token=YOUR_API_KEY
Downloads one individual monthly container archive instead of the full dataset, which is useful for incremental updates or for retrieving a specific time period. This endpoint requires an API key.
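As an illustration, the container URL can be built from a year and month following the YYYY/YYYY-MM.zip key pattern and streamed to disk. The function names and the `dest` path are hypothetical; the URL shape and `token` query parameter follow the example above.

```python
import shutil
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "https://api.sec-api.io/datasets/form-11k-files"

def container_url(year, month, token):
    """Build the download URL for one monthly container."""
    key = f"{year:04d}/{year:04d}-{month:02d}.zip"
    return f"{BASE}/{key}?{urlencode({'token': token})}"

def download_container(year, month, token, dest):
    """Stream one monthly ZIP to dest without loading it fully into memory."""
    url = container_url(year, month, token)
    with urlopen(url, timeout=60) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest
```

Calling `container_url(2026, 4, "YOUR_API_KEY")` reproduces the example URL shown above. In practice the `downloadUrl` values returned by the index endpoint can be used directly instead of reconstructing the path.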
The dataset covers Form 11-K (Annual Report of Employee Stock Purchase, Savings, and Similar Plans) and its amendment, Form 11-K/A. Form 11-K is the annual report mandated by Section 15(d) of the Securities Exchange Act of 1934 for benefit plans whose interests are registered as securities under the Securities Act of 1933.
One record is a complete EDGAR submission of Form 11-K or Form 11-K/A, identified by an 18-digit zero-padded accession number. Physically it is an accession-named subfolder inside a monthly ZIP container, containing a normalized metadata.json descriptor plus the primary filing document and all non-image exhibits preserved with their original EDGAR SGML envelopes.
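A sketch of walking the records in one monthly container follows. It assumes each file's immediate parent directory inside the ZIP is the accession-named folder and that the descriptor is named metadata.json, as described above; any additional top-level directory nesting is tolerated but not verified against a real container.

```python
import json
import zipfile
from collections import defaultdict

def iter_records(zip_source):
    """Yield (accession, metadata, document_names) for each record in a container.

    zip_source is a path or a file-like object. The accession is taken to be
    the name of the folder that directly contains the files (an assumption
    about the archive layout).
    """
    with zipfile.ZipFile(zip_source) as zf:
        folders = defaultdict(list)
        for name in zf.namelist():
            if name.endswith("/"):  # skip directory entries
                continue
            parts = name.split("/")
            if len(parts) < 2:  # file at archive root, not part of a record
                continue
            folders[parts[-2]].append(name)

        for accession, names in sorted(folders.items()):
            meta_name = next((n for n in names if n.endswith("/metadata.json")), None)
            if meta_name is None:  # folder without a descriptor; skip it
                continue
            metadata = json.loads(zf.read(meta_name))
            documents = [n for n in names if n != meta_name]
            yield accession, metadata, documents
```

Because the function is a generator, the archive stays open only while records are being consumed, and large containers are never fully extracted to disk.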
Employee benefit plans whose participation interests are separately registered under the Securities Act of 1933 — typically via the sponsor's Form S-8 — must file Form 11-K. This includes employee stock purchase plans (ESPPs), 401(k) and savings plans with a registered employer-stock fund, profit-sharing plans, stock bonus plans, and ESOPs. Plans without registered interests, and defined-benefit pensions with no employer securities, do not file Form 11-K.
Form 11-K is due 90 days after the plan's fiscal year end, extended to 180 days for plans subject to ERISA that present financial statements under the ERISA financial reporting framework. Because most ERISA plans use calendar fiscal years, dataset filings cluster heavily in June and July, with a smaller late-March cluster for non-ERISA 90-day filers.
Form 5500 is filed with the Department of Labor through EFAST2, not with the SEC through EDGAR, and it is a structured line-numbered return with schedules (A, C, H, I, R, SB, MB). Form 11-K is a narrative GAAP financial-statement document with an independent auditor's opinion. Form 5500 covers essentially all ERISA plans, while Form 11-K covers only the narrower slice of plans whose interests are registered securities. Form 5500 is not included in this dataset.
Form 11-K filings in this dataset do not contain XBRL data. The form has never been subject to XBRL or Inline XBRL tagging, so records do not carry .xsd, XBRL instance, _htm.xml, or taxonomy files, and the linkToXbrl and dataFiles fields in metadata.json are uniformly empty. Quantitative work on plan investment composition or employer-security concentration must parse the 11-K document text, tables, and exhibits directly.
Records include TXT, JSON, HTML, PDF, and XML. Modern filings are overwhelmingly HTML for the primary 11-K document and for the dominant EX-23 auditor consent exhibit, with the JSON descriptor (metadata.json) alongside. Pre-2000 filings more frequently use plain ASCII .txt payloads inside the SGML envelope. Image files (GIF, JPG, PNG) are excluded dataset-wide even when they are listed as GRAPHIC in documentFormatFiles[].
Coverage begins January 1, 1994 — the earliest EDGAR-native Form 11-K filings — and extends to the present, refreshed as new filings arrive. Paper Form 11-K filings from before the phased EDGAR mandate in the mid-1990s are not included.