The Form N-30D Files dataset is a per-filing archive of every Form N-30D and Form N-30D/A submission accepted by EDGAR. These forms carry the annual and semi-annual shareholder reports filed by registered management investment companies pursuant to Rule 30e-1 under the Investment Company Act of 1940. Each record corresponds to one EDGAR submission (one accession number) and bundles a metadata.json header together with the documents that originally constituted the submission, with image attachments removed. The EDGAR filer is the registrant fund itself: open-end mutual funds, closed-end funds, business development companies, and certain insurance company separate accounts registered as management investment companies. Coverage begins on 1994-01-01, the EDGAR phase-in date for investment company filers, with the dense core of records concentrated between January 1994 and the mid-2003 cutover to Form N-CSR/N-CSRS, and a long tail of late filings and amendments thereafter. The dataset is delivered in ZIP container format with monthly per-container archives and a continuously refreshed JSON index.
The Form N-30D Files dataset preserves three decades of as-filed fund shareholder reports submitted electronically to EDGAR. Form N-30D was the historical SEC form by which registered management investment companies (primarily open-end mutual funds, closed-end funds, and certain insurance company separate accounts registered as management investment companies) delivered their periodic shareholder reports to the SEC. Rule 30e-1 (originally Rule 30d-1) requires every such company to transmit a report to shareholders at least semi-annually, within 60 days of the close of the reporting period, and to file a copy with the Commission no later than 10 days after that transmission. The form is a wrapper around the substantive shareholder report, itself a hybrid narrative and financial document combining the portfolio manager's discussion, audited or unaudited financial statements prepared under Regulation S-X Article 6, a complete schedule of portfolio investments, expense and per-share financial highlights, and governance disclosures concerning the fund's directors, trustees, and officers. Form N-30D/A is the amendment variant, used to correct or supplement a previously filed shareholder report; it carries the same internal structure and references the original accession being amended.
N-30D was the dominant fund shareholder-report vehicle from the inception of EDGAR filings in January 1994 until 2003, when the SEC's certified-shareholder-report rules implemented under the Sarbanes-Oxley Act (and accompanying amendments to Regulation S-X) introduced Form N-CSR and N-CSRS as the required forms for transmittal of shareholder reports. The vast majority of records in this dataset therefore cluster in the 1994–2003 window, with a long, thinning tail of late filings, stragglers, and amendments after 2003.
The dataset is distributed as ZIP-per-month containers. The file types found inside the containers are TXT, JSON, HTML, and PDF: modern filings consist of an HTML primary document plus the JSON manifest; older filings from the mid-1990s are predominantly plain-ASCII TXT submissions; PDF appears sporadically as a primary or supplementary report document.
A record has two structural layers:
metadata.json: captures the EDGAR submission envelope, registrant identification, period and date fields, document manifest, and reference URLs back to sec.gov.
Filing documents: the documents that originally constituted the submission, such as the primary N-30D report and any supplemental documents. Image attachments and the complete-submission .txt bundle are intentionally omitted from on-disk extraction; they remain accessible only as URL references inside metadata.json.
On disk, the record is a per-filing folder named after the 18-digit normalized EDGAR accession number (the dashed accession 0001206774-25-000089 becomes the folder 000120677425000089). Each ZIP container holds a single top-level folder matching the period (e.g., 2025-02/) with one accession-keyed subfolder per filing. Monthly volume is highly uneven: months aligned with calendar quarter-ends and common fiscal-year cutovers (December, March, June, September) carry many filings as funds cluster their semi-annual and annual transmittals, while off-cycle months may carry only a handful of records.
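The on-disk layout described above can be walked with the standard library alone. The following is a minimal sketch, assuming the container layout is exactly `<period>/<accession folder>/metadata.json`; the helper names `accession_to_folder` and `iter_metadata` are illustrative and not part of the dataset.

```python
import json
import zipfile


def accession_to_folder(accession_no: str) -> str:
    """Turn a dashed accession number into the 18-digit folder name."""
    folder = accession_no.replace("-", "")
    if len(folder) != 18 or not folder.isdigit():
        raise ValueError(f"unexpected accession number: {accession_no!r}")
    return folder


def iter_metadata(container: zipfile.ZipFile):
    """Yield (folder name, metadata dict) for every record in one container."""
    for name in container.namelist():
        parts = name.split("/")
        # Assumed layout: <period>/<accession folder>/metadata.json
        if len(parts) == 3 and parts[2] == "metadata.json":
            with container.open(name) as fh:
                yield parts[1], json.load(fh)
```

Calling `iter_metadata(zipfile.ZipFile("2025-02.zip"))` on a downloaded container would then yield one pair per filing, without extracting the archive to disk first.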
metadata.json
Always present, exactly one per record. The file is a single JSON object combining EDGAR header fields with dataset-specific extensions. The salient fields are:
formType: "N-30D" for original filings, "N-30D/A" for amendments.
accessionNo: canonical dashed accession number, e.g., "0001206774-25-000089".
filedAt: ISO-8601 timestamp with timezone reflecting when EDGAR accepted the submission.
effectivenessDate: date the filing became effective on EDGAR.
periodOfReport: the reporting period the shareholder report covers, expressed as the period-end date (e.g., "2024-12-31" for an annual report through year-end 2024).
description: human-readable form description from the EDGAR header, typically "Form N-30D - Annual and semi-annual reports mailed to shareholders [Rule 30d-1]".
linkToFilingDetails, linkToTxt, linkToHtml, linkToXbrl: direct sec.gov URLs to the primary document, the complete-submission text bundle, the EDGAR filing-index page, and any XBRL data; the XBRL link is empty for N-30D since the form is not part of the XBRL regime.
id: opaque dataset-internal record identifier.
documentFormatFiles: array enumerating every document originally submitted to EDGAR. Each entry carries sequence, size, documentUrl, description, and type (e.g., N-30D, GRAPHIC, EX-99.CERT). The trailing entry uses a single space " " for both sequence and type and a description of "Complete submission text file"; it is a sentinel pointing at the submission-level .txt wrapper, not a missing value. GRAPHIC entries are listed here even though their bytes are not stored on disk.
entities: array of registrant entities. Each entry carries cik, companyName (with the EDGAR role suffix such as "(Filer)" appended), fiscalYearEnd in MMDD form, stateOfIncorporation, act (Investment Company Act = "40"), fileNo, irsNo, type (form type as filed by that entity), and filmNo. Routine fund filings produce a single-element array; joint filings produce multiple entries.
seriesAndClassesContractsInformation: array of series/class identifiers under the SEC's series and class taxonomy. Empty for trust-style filers and pre-series-regime filings; populated for fund families that file under the series/class scheme.
dataFiles: array of structured data attachments such as XBRL/XML; empty for N-30D.
The substantive content of the record is the N-30D shareholder report itself. On modern filings this is an HTML document (filenames typically follow the filer's own pattern, e.g., *-n30d.htm) wrapped in EDGAR's SGML document envelope. The wrapper takes the form:
<DOCUMENT>
<TYPE>N-30D
<SEQUENCE>1
<FILENAME>...-n30d.htm
<DESCRIPTION>N-30D
<TEXT>
<HTML>
<HEAD>...</HEAD>
<BODY>... full report content ...</BODY>
</HTML>
</TEXT>
</DOCUMENT>
The <TYPE>, <SEQUENCE>, <FILENAME>, and <DESCRIPTION> tags repeat the manifest entry for the document inside the file body itself; the rendered report sits inside the <TEXT> block. Pre-HTML-era filings replace the <HTML>...</HTML> payload with plain ASCII text (or, less often, an embedded PDF reference) but retain the same outer SGML wrapper.
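The envelope described above can be separated from its payload before any HTML or text parsing. The following is a minimal sketch, assuming each header tag sits on its own line ahead of `<TEXT>` as in the wrapper shown earlier; `split_envelope` is an illustrative name, and the HTML check is a simple heuristic rather than a definitive format detector.

```python
import re

# Header tags that precede <TEXT>; each one is assumed to occupy a single line.
HEADER_TAG = re.compile(r"^<(TYPE|SEQUENCE|FILENAME|DESCRIPTION)>(.*)$", re.MULTILINE)
TEXT_BLOCK = re.compile(r"<TEXT>\r?\n?(.*?)</TEXT>", re.DOTALL | re.IGNORECASE)


def split_envelope(raw: str):
    """Return (header dict, payload, payload kind) for one document file."""
    match = TEXT_BLOCK.search(raw)
    if match is None:
        # No SGML wrapper found: treat the whole file as the payload.
        header, payload = {}, raw
    else:
        preamble = raw[: match.start()]
        header = {tag.lower(): value.strip() for tag, value in HEADER_TAG.findall(preamble)}
        payload = match.group(1)
    # Heuristic: an <HTML> tag marks an HTML-era payload, anything else is text.
    kind = "html" if re.search(r"<html[\s>]", payload, re.IGNORECASE) else "text"
    return header, payload, kind
```

Working from the inner payload keeps strict HTML parsers away from the leading SGML lines, and the same function serves ASCII-era files because only the payload kind differs.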
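Inside the HTML or text payload, the N-30D report follows a conventional mutual-fund shareholder-report ordering. Exact section names and depth vary by issuer and time period, but the structure typically runs from the Letter to Shareholders and Manager's Report, through the Schedule of Investments, the financial statements (including the Statement of Assets and Liabilities and the Statement of Operations), the Financial Highlights, and the Notes to Financial Statements, to the auditor's report in annual filings and the Trustees-and-Officers section.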
Layout in HTML-era filings is achieved with inline-styled <TABLE> and <DIV> elements; financial statements and portfolio schedules are dense, multi-column tables that often span many printed pages. In ASCII-era filings, the same content is rendered with monospaced column alignment and ruled lines drawn from hyphens and equal signs.
Beyond the primary *-n30d.htm (or .txt) report, individual filings may include supplemental documents enumerated in documentFormatFiles. These can include president/treasurer certifications (in late N-30D filings made under emerging Sarbanes-Oxley guidance prior to the formal switch to N-CSR), cover letters, and occasional exhibits. Image attachments — performance-graph GIFs, fund-logo JPGs, signature PNGs — appear in the manifest but are excluded from on-disk extraction; references to them inside the HTML body therefore appear as broken <IMG SRC="..."> links when the HTML is rendered standalone, which is expected behavior.
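Because the manifest lists more than the folder holds, it can help to partition documentFormatFiles before touching the file system. The following is a minimal sketch that relies only on the type and description conventions described in this document; `classify_manifest` is an illustrative name.

```python
def classify_manifest(metadata: dict):
    """Split documentFormatFiles into stored documents, images, and the sentinel."""
    stored, images, sentinel = [], [], None
    for entry in metadata.get("documentFormatFiles", []):
        doc_type = entry.get("type", "")
        if doc_type.strip() == "" and entry.get("description") == "Complete submission text file":
            sentinel = entry      # submission-level .txt wrapper, not stored on disk
        elif doc_type == "GRAPHIC":
            images.append(entry)  # listed in the manifest, bytes not extracted
        else:
            stored.append(entry)  # candidate for a file next to metadata.json
    return stored, images, sentinel
```

A rendering pipeline that needs the performance graphs could then fetch the documentUrl of each entry in `images`, while a parser iterates only over `stored`.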
For each accession number, the dataset bundles:
metadata.json, capturing the EDGAR header, registrant identification, period and date metadata, document manifest, and back-references to sec.gov URLs.
The documents that originally constituted the submission: the primary N-30D report and any supplemental documents, with image attachments removed.
The following are deliberately not stored on disk inside the per-filing folder:
Image attachments (GRAPHIC type: JPG, GIF, PNG) referenced by the report. They appear in documentFormatFiles with their documentUrl and remain retrievable from sec.gov, but the binary bytes are not extracted into the dataset. Performance graphs and fund logos are therefore absent from the on-disk content.
The complete-submission .txt bundle: the concatenated SGML envelope EDGAR generates that contains every document of the submission inline. It appears as the trailing sentinel entry in documentFormatFiles (with sequence and type set to a single space and description "Complete submission text file") and is reachable via linkToTxt, but it is not stored as its own file because the individual constituent documents are already extracted.
dataFiles entries are listed in metadata but not written to disk; for N-30D this array is routinely empty.
The internal structure of the N-30D shareholder report evolved meaningfully across its EDGAR-era life (1994–2003 dominant phase, with a long tail thereafter).
The form's source-file presentation tracks the broader EDGAR format evolution:
In ASCII-era filings, the <TEXT> payload is plain text rather than HTML.
In HTML-era filings, the <TEXT> block carries <HTML>...</HTML> content, with inline-styled tables for financial statements and richer typography. Some filings carry both an HTML primary document and ASCII or PDF supplements.
The SGML <DOCUMENT> / <TYPE> / <SEQUENCE> / <FILENAME> / <DESCRIPTION> / <TEXT> framing is the most reliable structural anchor for parsers regardless of payload format.
The formType field is the discriminator between original filings and amendments. Amendments may replace, supplement, or correct specific sections of the underlying report; reconciling an amendment to the original requires matching by registrant CIK and periodOfReport.
documentFormatFiles is the authoritative enumeration of what was originally filed. On-disk content is a subset that excludes images and the complete-submission wrapper. A document listed in the manifest is not guaranteed to be present as a file in the folder, but its documentUrl always resolves on sec.gov.
In documentFormatFiles, the trailing entry with single-space sequence and type denoting the complete-submission text file is a structural sentinel, not malformed data. Parsers should detect and skip it explicitly rather than treat it as a missing-value error.
HTML documents carry SGML envelope lines (<DOCUMENT>, <TYPE>, etc.) before the <HTML> opening tag. Strict HTML parsers may misinterpret the leading SGML lines; robust extraction either strips the SGML preamble first or operates from the inner <TEXT> block onward.
HTML bodies contain <IMG> tags pointing to sibling JPG/GIF files that are absent from the on-disk folder. This is intentional dataset behavior; rendering pipelines that need the images must fetch them via the documentUrl values in documentFormatFiles.
seriesAndClassesContractsInformation is empty for filers outside the SEC series/class regime (single-trust funds, pre-2002 filers) and populated for multi-series fund-family filings. Linking a record to a specific series or share class therefore requires consulting this array first and falling back to entity-level CIK matching when it is empty.
periodOfReport is the substantive temporal anchor for analytical use (it identifies the period the financials cover), while filedAt and effectivenessDate reflect when the report reached EDGAR. Annual reports typically carry periodOfReport equal to the fund's fiscal-year end and filedAt within roughly 70 days afterward (the 60-day shareholder-transmittal window plus the 10-day SEC filing window).
Because payloads may be plain text, HTML, or occasionally PDF, format-aware dispatch on the <TEXT> payload is the typical handling pattern.
The entities array is single-element for routine fund filings but can carry multiple entries when several registrants file the report jointly (common for fund complexes that share a registrant umbrella). All entities listed share the same accession and the same on-disk record; downstream linking should iterate the array rather than assume a single CIK.
Each record is a Form N-30D or N-30D/A submission filed on EDGAR by a registered management investment company under the Investment Company Act of 1940. The EDGAR filer is the registrant fund itself, identified by its registrant CIK. A single submission can carry the shareholder report for one series, multiple series sharing a fiscal period, or, in early EDGAR years, an entire family of series under one trust.
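In-scope filer classes: open-end mutual funds, closed-end funds, business development companies, and certain insurance company separate accounts registered as management investment companies.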
The shareholder report inside each filing is prepared by the fund's adviser and officers, audited by the independent registered public accountant for annual reports, and approved by the fund's board. None of those parties is the EDGAR filer; only the registrant fund is.
The obligation is periodic and tied to the fund's own fiscal calendar, not the calendar year. Section 30(e) of the Investment Company Act of 1940 and its implementing rule require every registered management investment company to transmit reports to its shareholders at least semi-annually. Two reporting periods trigger an N-30D: the close of the fund's fiscal year (annual report) and the close of its fiscal half-year (semi-annual report).
Because fund fiscal year-ends are staggered (commonly December 31, October 31, June 30, or March 31), N-30D submissions appear on EDGAR throughout the year, with no single calendar-year peak.
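Deadlines: the report must be transmitted to shareholders within 60 days after the close of the reporting period, and a copy must be filed with the Commission within 10 days after that transmission.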
The EDGAR filing date therefore reflects the 10-day window after actual transmission, not a fixed offset from period-end. Late transmittals still fix the SEC filing window at 10 days from actual transmission.
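The 60-day transmittal window and the 10-day filing window stated in this document translate into simple date arithmetic. The following is a minimal sketch, assuming both windows are counted in calendar days (an assumption made for illustration, not a statement of how the rule counts days); `filing_windows` is an illustrative name.

```python
from datetime import date, timedelta
from typing import Optional

TRANSMIT_WINDOW = timedelta(days=60)  # period end -> report sent to shareholders
FILING_WINDOW = timedelta(days=10)    # transmission -> copy filed with the SEC


def filing_windows(period_end: date, transmitted: Optional[date] = None):
    """Return (latest transmittal date, latest SEC filing date).

    If the actual transmission date is known, the filing window runs from it;
    otherwise it runs from the last permitted transmittal date.
    """
    transmit_by = period_end + TRANSMIT_WINDOW
    basis = transmitted if transmitted is not None else transmit_by
    return transmit_by, basis + FILING_WINDOW
```

For a period ending 2002-12-31, the outer bounds are 2003-03-01 for transmittal and 2003-03-11 for filing; a report actually sent on 2003-02-14 would instead be due at the SEC by 2003-02-24. This is why filedAt is better compared against a range than against a fixed offset from periodOfReport.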
The shareholder-report obligation originates in Section 30 of the Investment Company Act of 1940. The implementing rule was originally Rule 30d-1 under the Investment Company Act, with the modern semi-annual-transmission and 60-day/10-day filing structure settled by the Commission's 1972 release IC-7113. Form N-30D was the designated form for these transmittals.
In 2002, the SEC renumbered the rule from Rule 30d-1 to Rule 30e-1, carrying over the substantive obligations.
The Sarbanes-Oxley Act of 2002 then changed the form. SEC release 33-8188 / IC-25914, "Certification of Disclosure in Certain Exchange Act Reports" (effective 2003), created Form N-CSR as the new certified shareholder-report vehicle for management investment companies. For reporting periods ending on or after the 2003 compliance date, registered management investment companies file Form N-CSR (annual reports) and Form N-CSRS (semi-annual reports) instead of Form N-30D. From that point, N-30D was no longer the primary shareholder-report form.
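Consequences for this dataset: the vast majority of records cluster in the 1994–2003 window, followed by a long, thinning tail of late filings, stragglers, and amendments after 2003.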
A Form N-30D/A is an amendment to a previously filed N-30D, filed by the same registrant fund and tied to the same fiscal reporting period. Typical reasons are the correction or supplementation of a previously filed shareholder report.
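Adjacent registered investment company populations use different reporting paths and are out of scope. The main example is UITs and other non-management vehicles, whose periodic reports flow through the N-30B-2 family rather than N-30D.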
The Investment Company Act shareholder-report obligation dates to 1940, and the 60-day/10-day cadence was settled by 1972 under IC-7113, but pre-EDGAR N-30D reports were filed on paper and microfiche and are not in this dataset. Electronic N-30D submissions begin in January 1994 with the EDGAR phase-in for investment company filers. The dataset's earliest sample date of 1994-01-01 reflects that EDGAR boundary, not the legal origin of the obligation.
Form N-30D sits inside a tightly clustered family of investment-company disclosure regimes: direct successors that replaced it after 2003, siblings that carry pieces of fund disclosure into more structured channels, and adjacent forms that look similar but cover a different filer population. The comparisons below focus on the closest neighbors a researcher would confuse with, substitute for, or pair against an N-30D corpus.
N-CSR (annual) and N-CSRS (semi-annual) are the post-Sarbanes-Oxley replacements, adopted in SEC Release Nos. 33-8188 / IC-25914 and effective for periods ending on or after July 9, 2003. They share the Rule 30e-1 transmittal framework and the same core body: shareholder letter, financial statements, schedule of investments, financial highlights, and expense disclosure.
The differences are additive on the N-CSR side: Sections 302 and 906 CEO/CFO certifications, internal-control disclosure, audit-committee financial-expert identification, principal-accountant fee disclosure, code-of-ethics representation, and (later) the tailored shareholder report regime and inline XBRL-tagged financial exhibits.
Coverage is sequential, not parallel. N-CSR/N-CSRS is authoritative from mid-2003 forward; N-30D is the only source from 1994 through mid-2003. A long-horizon shareholder-report corpus must splice the two, accepting that the earlier segment lacks officer certifications, audit-committee disclosure, and structured tagging.
N-Q required management investment companies to file complete portfolio holdings as of the first and third fiscal quarter-ends, the two quarters not covered by an N-CSR/N-CSRS report. It was rescinded effective May 2019 once N-PORT public reporting matured.
Form N-Q overlaps with N-30D only on the schedule of investments. It carries no shareholder letter, no narrative, no financial statements, and no expense disclosure. As a research complement, it fills intra-year holdings gaps in the 2004-2016 window; it does not substitute for N-30D's narrative content.
Form N-PORT, adopted in SEC Release IC-32314 under Rule 30b1-9, requires registered management funds (excluding money market funds and small business investment companies) to report monthly portfolio holdings, risk metrics, derivatives detail, securities-lending activity, and liquidity classifications in structured XML, with public release on a calendar quarter delay.
N-PORT is the modern structured analogue of the schedule-of-investments portion of N-30D — dramatically broader on per-position risk and liquidity, dramatically narrower on everything else (no narrative, no shareholder letter, no financial statements). For quantitative holdings analysis it is far superior to N-30D; for qualitative fund disclosure it contains nothing comparable.
Form N-CEN, adopted alongside N-PORT, is a structured (XML) annual census of fund identifiers, fund type, service providers, securities-lending agents, fees, and exemptive orders relied upon. It replaced N-SAR for fiscal periods ending on or after June 1, 2018.
N-CEN does not duplicate N-30D. It carries no performance discussion, no security-level holdings, and no financial statements — it is operational metadata. The two are complementary: N-30D tells you how a fund described results to investors; N-CEN describes its structural and operational characteristics.
Form N-SAR, the N-CEN predecessor, was a fixed-field structured filing covering transfer agents, custodians, sales loads, expenses paid, and portfolio turnover. For the 1994-2003 N-30D era, N-SAR is the natural pairing for operational and fee data, occupying the slot N-CEN now fills. It is not a substitute for N-30D content — no narrative, no shareholder letter, no audited statements.
Form N-PX, filed annually under Rule 30b1-4, reports each fund's proxy votes across portfolio companies. There is no content overlap with N-30D; the only link is shared filer population. Voting research cannot use N-30D, and performance, expense, or holdings research cannot use N-PX.
N-30B-2 collects periodic reports from registered investment companies — commonly UITs and other non-management vehicles — whose shareholder-style reports flow under Investment Company Act provisions outside the Rule 30e-1 management-company channel that drives N-30D. N-30D-1 was a related historical filing pathway.
The boundary is filer population, not content. N-30D (and successor N-CSR/N-CSRS) covers registered management investment companies — mutual funds, closed-end funds, ETFs organized as management companies. Comprehensive RIC shareholder-report coverage requires the N-30B-2 family alongside N-30D; using N-30D alone systematically omits the UIT segment.
Form 10-K and Form 10-Q are the Exchange Act Sections 13(a)/15(d) periodic-reporting framework for operating companies and do not apply to registered management investment companies. The structures look parallel — periodic financials, MD&A, operational results — but the filer populations are disjoint. The comparison matters only because new users sometimes look for "fund 10-Ks"; the correct equivalents are N-30D for 1994-2003 and N-CSR for the modern era.
N-30D is the unstructured, uncertified, narrative-plus-financials shareholder report of the pre-Sarbanes-Oxley fund-disclosure era — a single form bundling the letter to shareholders, MD&A-style discussion, financial highlights, audited financial statements, and schedule of investments in free-form HTML/TXT/PDF. Its modern function has been split: the certified narrative report now lives in N-CSR/N-CSRS, while structured carve-outs of holdings, census, and voting have moved to N-PORT, N-CEN, and N-PX respectively.
That makes N-30D the only source for 1994 through mid-2003 fund shareholder-report content — N-CSR cannot reach back, N-PORT and N-CEN did not yet exist, and N-SAR carries only operational data. From mid-2003 onward, the same content is available with greater rigor and structure across the successor forms, but no single modern form replaces N-30D on its own. The practical rule: treat N-30D as the canonical fund-narrative-and-financials source for 1994-2003, pair it with N-SAR for operational data of that era, and switch to N-CSR/N-CSRS plus N-PORT, N-CEN, and N-PX for the same content from mid-2003 forward.
Form N-30D shareholder reports are the main public record of how registered management investment companies described portfolios, performance, expenses, and strategy from 1994 until structured forms (N-CSR, N-Q, N-PORT) replaced them. Each accession bundles a metadata file, the primary HTML or TXT report, financial statements, the schedule of investments, and any amendments, and different professions extract different layers.
Quantitative finance researchers build survivorship-bias-free fund panels reaching back to 1994, before N-Q (2004) and N-PORT (2019). They parse the schedule of investments in the primary HTML/TXT for CUSIPs, share counts, and market values, and pull total return, expense ratio, and turnover from the financial highlights. Outputs: performance-attribution panels, fund-flow studies, and replication files for papers on window dressing, fee competition, and stale-pricing arbitrage.
Data teams at multi-manager platforms, fund-of-funds, and outsourced CIOs enrich competitor and target-fund coverage where modern structured holdings are missing. They normalize the schedule of investments into a holdings table keyed by accession and period, lift expense and waiver figures from the financial highlights, and extract the Letter to Shareholders for stated strategy. Output: holdings-overlap analytics, manager-style classification feeds, and sub-adviser due-diligence dossiers.
Commercial fund-analytics shops backfill performance, expense, and holdings series for funds liquidated, merged, or rebranded before structured filings. They parse the metadata file for CIK, series, and class IDs, the financial highlights for per-share data and total returns, and the schedule of investments for asset-class composition. Output: extended fund histories and historical peer-group statistics fed into commercial platforms.
Forensic accountants and litigation experts use the filings in 401(k) excessive-fee class actions, early-2000s market-timing and late-trading matters, soft-dollar disputes, and fiduciary-breach claims. They focus on 12b-1 and expense disclosures in the financial highlights, the schedule of investments around alleged market-timing windows, related-party and affiliated-broker notes, and the auditor's report. Output: expert reports, damages models, and exhibits keyed to as-filed contemporaneous disclosures.
Fund counsel, compliance officers, and product staff diligence acquired, reorganized, or merged predecessor funds. They review the Letter to Shareholders for stated objectives and strategy changes, the schedule of investments for legacy positions with tax or compliance carryovers, and N-30D/A amendments to see what was restated. Output: 15(c) board materials, merger files, and predecessor-performance disclosure in registration statements.
Financial historians and policy researchers study the post-1987 fund expansion, the dot-com bubble, the 2000-2002 bear market, and the rise of indexing. They read the Letter to Shareholders for contemporaneous manager narrative, the schedule of investments for sector and geographic shifts, and the financial highlights for asset-growth trajectories. Output: monographs and policy papers on industry evolution, retail participation, and monetary-policy transmission through pooled vehicles.
ML teams use shareholder letters and Manager's Reports inside the primary HTML/TXT as training and evaluation data for tone classification, risk-language extraction, and disclosure parsing. They pair narrative passages with the same filing's schedule of investments and financial highlights to link commentary to realized portfolio changes. Output: fine-tuned summarization models, RAG benchmarks over fund disclosures, and labeled corpora for disclosure-quality research.
ESG researchers reconstruct fund-family historical positioning on sectors and issuers before mainstream sustainability regimes. They mine the schedule of investments for fossil-fuel, tobacco, defense, and controversial-issuer exposures and the Letter to Shareholders for early references to social or environmental screens. Output: longitudinal greenwashing studies, SRI evolution research, and historical baselines for current ESG fund classifications.
Fund-accounting and assurance teams use the dataset as a reference library when onboarding legacy series, reconciling historical NAV series, or answering inspection requests on prior-period disclosure. They pull the audited financial statements, fair-valuation and related-party notes, and the fiscal-year-end schedule of investments, using the metadata file to align accessions with reporting periods. Output: reconciliation memos, audit working papers, and historical pricing-policy reviews.
Each group reaches for a different layer of the same filing — the metadata file, the schedule of investments, the financial statements, or the Letter to Shareholders — but all depend on the dataset because no other source preserves three decades of as-filed registered investment company shareholder reports in a uniformly accessible form.
The workflows below draw on specific record components — metadata.json, the primary HTML/TXT shareholder report, the schedule of investments, the financial highlights, or the manager's letter — and produce a defined artifact.
A quantitative finance researcher reconstructing mutual-fund performance back to 1994 — the era before N-Q (2004) and N-PORT (2019) — iterates the dataset by periodOfReport and entities[].cik, pulls total return, expense ratio, net investment income ratio, and portfolio turnover from the Financial Highlights table, and parses the Schedule of Investments for asset-class weights. Filings from liquidated and merged funds, indexed by accession and CIK, are retained alongside survivors. Output: a fund-period panel feeding fee-competition, window-dressing, and stale-pricing replication studies.
A fund-research vendor extending Morningstar-style coverage backward parses the Schedule of Investments inside each *-n30d.htm (or ASCII payload) into a normalized holdings table keyed by accessionNo, periodOfReport, and CIK, joining seriesAndClassesContractsInformation to map share classes. The Financial Highlights block supplies per-share NAV history. Output: extended historical holdings and total-return series for funds that disappeared before N-CSR/N-PORT, loaded into a commercial fund-analytics platform.
A forensic accountant supporting an ERISA excessive-fee class action or an early-2000s market-timing matter pulls the audited Statement of Operations and Financial Highlights for 12b-1 fees, advisory fees, and expense waivers, the Notes to Financial Statements for affiliated-broker and related-party transactions, and the Schedule of Investments dated near alleged market-timing windows. The auditor's opinion and formType = N-30D/A amendments anchor what was restated and when. Output: damages-model exhibits and expert-report appendices citing contemporaneous as-filed disclosure rather than reconstructed data.
A fund-adviser product or compliance team preparing a registration statement for a reorganized or merged fund reads the Letter to Shareholders for stated investment objective and strategy at the predecessor, the Schedule of Investments for legacy positions with tax-lot or compliance carryovers, the Trustees-and-Officers section for governance continuity, and any N-30D/A amendments tied to the same CIK and periodOfReport. Output: 15(c) board memos, predecessor-performance disclosure for the new prospectus, and merger-file working papers.
An ML team building a fund-disclosure summarization or risk-language model treats the Letter to Shareholders and Manager's Report sections inside the primary HTML/TXT document as a labeled corpus, paired per accession with quantitative outcomes lifted from the same filing's Financial Highlights and Schedule of Investments. Format-aware dispatch on the SGML <DOCUMENT><TYPE>N-30D</TYPE> envelope handles ASCII-era and HTML-era payloads uniformly. Output: fine-tuned tone and risk-disclosure classifiers, and a RAG benchmark linking narrative claims to realized portfolio changes.
An ESG researcher studying sector exposures and early SRI language across the 1994–2003 fund universe filters the dataset to relevant fund families via entities[].companyName and cik, mines each filing's Schedule of Investments for fossil-fuel, tobacco, defense, and other controversial issuers (matched on issuer name strings and, where available, CUSIPs), and scans the Letter to Shareholders for early screening or stewardship language. Output: longitudinal greenwashing studies and historical exposure baselines used to validate or challenge present-day ESG fund classifications.
A fund-accounting or assurance team onboarding a legacy series, or responding to an SEC inspection touching a pre-2003 period, uses metadata.json to align accessions with periodOfReport and fiscal-year ends, pulls the Statement of Assets and Liabilities and the fair-valuation footnote from the Notes to Financial Statements, and ties the period-end Schedule of Investments to historical pricing sources. Output: NAV reconciliation memos, valuation-policy review notes, and audit working papers citing the original audited filing.
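Several of the workflows above begin by aligning accessions on registrant CIK and periodOfReport and by pairing N-30D/A amendments with the filings they amend. The following is a minimal sketch of that grouping step, assuming the input is an iterable of parsed metadata.json objects; `group_by_fund_period` is an illustrative name.

```python
from collections import defaultdict


def group_by_fund_period(records):
    """Group accessions by (cik, periodOfReport), separating originals from amendments."""
    groups = defaultdict(lambda: {"originals": [], "amendments": []})
    for meta in records:
        bucket = "amendments" if meta.get("formType") == "N-30D/A" else "originals"
        # Joint filings list several registrants, so index under every CIK.
        for entity in meta.get("entities", []):
            key = (entity.get("cik"), meta.get("periodOfReport"))
            groups[key][bucket].append(meta["accessionNo"])
    return dict(groups)
```

A key with amendments but no originals is worth flagging: the original may sit in a different monthly container or predate the period under review.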
The Form N-30D Files dataset is accessible through three endpoints: a JSON index for metadata and container listings, a single archive containing the full dataset, and per-container archives for selective downloads. The dataset covers Form N-30D and N-30D/A filings from January 1994 onward, is delivered in ZIP container format, and is refreshed continuously. Use the updatedAt field on the index to track refreshes and decide which containers to re-download.
Dataset Index JSON API: https://api.sec-api.io/datasets/form-n30d-files.json
Returns dataset-level metadata (name, description, updatedAt, earliestSampleDate, totalRecords, totalSize, formTypes, containerFormat, fileTypes), the full dataset download URL, and the list of all container files with per-container key, size, records, updatedAt, and downloadUrl. This endpoint does not require an API key and can be polled to determine which containers were updated in the most recent refresh.
Example response:
{
  "datasetId": "1f13365b-9ae0-68f7-9e74-7824e75dadf8",
  "datasetDownloadUrl": "https://api.sec-api.io/datasets/form-n30d-files.zip",
  "name": "Form N-30D Files Dataset",
  "updatedAt": "2026-04-14T13:28:43.950Z",
  "earliestSampleDate": "1994-01-01",
  "totalRecords": 78245,
  "totalSize": 2543345574,
  "formTypes": ["N-30D", "N-30D/A"],
  "containerFormat": "ZIP",
  "fileTypes": ["TXT", "JSON", "HTML", "PDF"],
  "containers": [
    {
      "downloadUrl": "https://api.sec-api.io/datasets/form-n30d-files/2026/2026-03.zip",
      "key": "2026/2026-03.zip",
      "size": 13818783,
      "records": 154,
      "updatedAt": "2026-04-14T13:28:43.950Z"
    }
  ]
}
Fetch the index with curl:
curl https://api.sec-api.io/datasets/form-n30d-files.json
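The refresh-tracking workflow described above can be sketched in Python. This is a minimal illustration, not an official client: the helper names (parse_ts, fetch_index, containers_updated_since) are hypothetical, and it assumes the index always has the shape shown in the example response, with an ISO 8601 updatedAt on every container.

```python
import json
import urllib.request
from datetime import datetime, timezone

INDEX_URL = "https://api.sec-api.io/datasets/form-n30d-files.json"


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2026-04-14T13:28:43.950Z."""
    # Older Python versions reject the trailing "Z", so normalise it first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def fetch_index(url: str = INDEX_URL) -> dict:
    """Download and decode the dataset index (no API key needed)."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def containers_updated_since(index: dict, last_sync: datetime) -> list:
    """Return the containers whose updatedAt is newer than last_sync."""
    return [
        c for c in index.get("containers", [])
        if parse_ts(c["updatedAt"]) > last_sync
    ]


if __name__ == "__main__":
    # Hypothetical cutoff: the time of the previous local sync.
    cutoff = datetime(2026, 1, 1, tzinfo=timezone.utc)
    for container in containers_updated_since(fetch_index(), cutoff):
        print(container["key"], container["downloadUrl"])
```

Persisting the index-level updatedAt after each run and passing it back as last_sync keeps later polls limited to containers that actually changed.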
Download Entire Dataset: https://api.sec-api.io/datasets/form-n30d-files.zip
Downloads the complete dataset as a single ZIP archive containing every container. This endpoint requires authentication via an Authorization: Bearer YOUR_API_KEY header.
curl -H "Authorization: Bearer YOUR_API_KEY" \
  -o form-n30d-files.zip \
  https://api.sec-api.io/datasets/form-n30d-files.zip
Download Single Container: https://api.sec-api.io/datasets/form-n30d-files/2026/2026-03.zip
Downloads a single container archive (a monthly slice, as in the 2026-03 example) instead of the full dataset. Use the downloadUrl values from the index to target specific containers. This endpoint also requires the Authorization: Bearer YOUR_API_KEY header.
curl -H "Authorization: Bearer YOUR_API_KEY" \
  -o 2026-03.zip \
  https://api.sec-api.io/datasets/form-n30d-files/2026/2026-03.zip
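The same download can be scripted. The sketch below uses only the Python standard library and streams the response to disk, so the archive is never held fully in memory. The function names build_container_request and download_container are illustrative, not part of any published SDK.

```python
import shutil
import urllib.request


def build_container_request(download_url: str, api_key: str) -> urllib.request.Request:
    """Attach the Bearer token that the download endpoints require."""
    return urllib.request.Request(
        download_url,
        headers={"Authorization": f"Bearer {api_key}"},
    )


def download_container(download_url: str, api_key: str, dest_path: str) -> str:
    """Stream one container ZIP to dest_path and return that path."""
    request = build_container_request(download_url, api_key)
    with urllib.request.urlopen(request) as resp, open(dest_path, "wb") as out:
        shutil.copyfileobj(resp, out)  # copies in chunks
    return dest_path


if __name__ == "__main__":
    download_container(
        "https://api.sec-api.io/datasets/form-n30d-files/2026/2026-03.zip",
        "YOUR_API_KEY",
        "2026-03.zip",
    )
```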
Form N-30D is the historical SEC form by which registered management investment companies — open-end mutual funds, closed-end funds, business development companies, and certain insurance company separate accounts — delivered their annual and semi-annual shareholder reports to the SEC. It is governed by Section 30(e) of the Investment Company Act of 1940 and Rule 30e-1 (originally Rule 30d-1), which require transmission of the report to shareholders within 60 days after the close of the reporting period and filing with the Commission within 10 days after that transmission.
One record corresponds to one EDGAR submission of Form N-30D or Form N-30D/A — a single annual or semi-annual shareholder report (or amendment) filed by a registered management investment company. On disk, each record is a per-filing folder named after the 18-digit normalized accession number, containing a metadata.json header plus all non-image documents that originally constituted the EDGAR submission.
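Given that layout, an extracted container can be walked one record at a time. The following is a sketch under stated assumptions: it takes the container to be already unzipped under some root directory, makes no assumption about intermediate directory levels, and treats any folder with an 18-digit name and a metadata.json inside it as a record. The helper names are hypothetical.

```python
import json
import re
from pathlib import Path

ACCESSION_DIR = re.compile(r"\d{18}")


def dashed_accession(folder_name: str) -> str:
    """Convert an 18-digit folder name to EDGAR's dashed 10-2-6 form."""
    return f"{folder_name[:10]}-{folder_name[10:12]}-{folder_name[12:]}"


def iter_records(root):
    """Yield (accession, metadata, document_names) for each filing folder."""
    for meta_path in sorted(Path(root).rglob("metadata.json")):
        folder = meta_path.parent
        if not ACCESSION_DIR.fullmatch(folder.name):
            continue  # not a per-filing folder
        with meta_path.open(encoding="utf-8") as fh:
            metadata = json.load(fh)
        documents = sorted(
            p.name for p in folder.iterdir()
            if p.is_file() and p.name != "metadata.json"
        )
        yield dashed_accession(folder.name), metadata, documents
```

Each yielded tuple pairs the dashed accession number with the parsed header and the names of the non-image documents stored alongside it.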
N-CSR (annual) and N-CSRS (semi-annual) are the post-Sarbanes-Oxley successors to N-30D, effective for reporting periods ending on or after July 9, 2003. They share the Rule 30e-1 transmittal framework and the same core body (shareholder letter, financial statements, schedule of investments, financial highlights) but add CEO/CFO certifications under Section 302 and Section 906, internal-control disclosure, audit-committee financial-expert identification, principal-accountant fee disclosure, and inline-XBRL-tagged exhibits. Coverage is sequential: N-30D is the only source for 1994 through mid-2003, and N-CSR and N-CSRS take over from there.
No. Form N-30D predates the XBRL regime for fund filings, so the linkToXbrl field in metadata.json is empty and the dataFiles array is routinely empty. Structured XBRL-tagged financial exhibits only appear later under the N-CSR/N-CSRS and N-PORT regimes.
The dataset contains TXT, JSON, HTML, and PDF files. Modern filings consist of an HTML primary document plus the metadata.json manifest; filings from the mid-1990s through roughly 1999 are predominantly plain-ASCII TXT submissions; PDF appears sporadically as a primary or supplementary report document. Image attachments (JPG, GIF, PNG) referenced in the manifest are intentionally excluded from on-disk extraction but remain retrievable from sec.gov via the documentUrl values in documentFormatFiles.
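To recover the excluded images, the manifest can be scanned for image URLs. The sketch below relies only on the documentFormatFiles array and its documentUrl values, both named above. Filtering by file extension is an assumption made for illustration, and image_document_urls is a hypothetical helper.

```python
from urllib.parse import urlparse

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".gif", ".png")


def image_document_urls(metadata: dict) -> list:
    """Return documentUrl values whose path ends in an image extension."""
    urls = []
    for doc in metadata.get("documentFormatFiles") or []:
        url = doc.get("documentUrl") or ""
        # Compare on the URL path only, case-insensitively.
        if urlparse(url).path.lower().endswith(IMAGE_EXTENSIONS):
            urls.append(url)
    return urls
```

When fetching the resulting URLs from sec.gov, send a descriptive User-Agent header and keep the request rate modest, in line with the SEC's fair-access guidance.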
The dataset's earliest sample date is 1994-01-01, reflecting the EDGAR phase-in for investment company filers. Records concentrate between January 1994 and the mid-2003 to 2004 cutover to Form N-CSR, with a long, thinning tail of late filings and N-30D/A amendments after 2003. Pre-EDGAR shareholder reports filed on paper or microfiche are not in the dataset.
Three endpoints are available. The JSON index at https://api.sec-api.io/datasets/form-n30d-files.json lists all container files and their metadata and does not require an API key. The full dataset is available as a single ZIP at https://api.sec-api.io/datasets/form-n30d-files.zip, and individual monthly containers are downloadable at URLs of the form https://api.sec-api.io/datasets/form-n30d-files/YYYY/YYYY-MM.zip. Both download endpoints require an Authorization: Bearer YOUR_API_KEY header.
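For bulk pulls over a date range, the YYYY/YYYY-MM.zip pattern can be expanded programmatically. This is a sketch that assumes every month in the range has a container. The index's downloadUrl values remain the authoritative list, so a URL built this way may not exist for months with no filings. Both function names are illustrative.

```python
BASE_URL = "https://api.sec-api.io/datasets/form-n30d-files"


def container_url(year: int, month: int) -> str:
    """Build the monthly container URL for one year and month."""
    return f"{BASE_URL}/{year:04d}/{year:04d}-{month:02d}.zip"


def container_urls(start: tuple, end: tuple) -> list:
    """List URLs for every month from start to end, inclusive.

    start and end are (year, month) tuples.
    """
    year, month = start
    urls = []
    while (year, month) <= end:
        urls.append(container_url(year, month))
        # Roll over to January of the next year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return urls
```

For example, container_urls((1994, 1), (2003, 12)) covers the dense core of the dataset described above.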