The Form NT-NSAR Files Dataset is a closed historical archive of every Form NT-NSAR and Form NT-NSAR/A submission filed on EDGAR between January 1994 and June 2018, when the SEC retired the underlying Form N-SAR and replaced it with Form N-CEN. Each record corresponds to a single Rule 12b-25 notification of late filing of Form N-SAR — the semiannual or annual operational report historically required of registered management investment companies and unit investment trusts under Section 30 of the Investment Company Act of 1940 — or to an amendment of such a notification (NT-NSAR/A). Filings are made by the registered investment company itself, executed by an authorized officer such as the principal financial officer, treasurer, or fund administrator. A record is delivered as an EDGAR accession folder inside a monthly ZIP container and bundles a normalized metadata.json descriptor together with the original EDGAR-submitted documents (TXT, occasionally HTML or PDF; image files excluded). The dataset is static: no new NT-NSAR filings have been generated since the June 1, 2018 cutover to Form NT-NCEN.
The dataset captures the complete population of EDGAR submissions carrying form type NT-NSAR (original notifications) and NT-NSAR/A (amendments) across the entire active lifetime of the form. The underlying filing is a Form 12b-25 notification of late filing — the universal Rule 12b-25 cover-sheet under the Securities Exchange Act of 1934 that is also used to notify late filing of 10-K, 10-Q, 20-F, and other periodic reports — restricted in this dataset to the variant that targets Form N-SAR. The substantive content of an NT-NSAR filing is not a financial report. It is a short administrative document, typically one to three printed pages of plain English, whose only legal purpose is to invoke the Rule 12b-25 extension and disclose the cause of the delay.
The dataset covers the full perimeter of the form: every registered investment company filing a Rule 12b-25 notification against Form N-SAR between January 1994 and June 2018. Records are organized into monthly ZIP containers named YYYY-MM.zip. Inside each container, every record is a self-contained accession folder holding a metadata.json descriptor plus the original EDGAR documents. The dataset terminates organically in mid-2018: the SEC's adoption of Form N-CEN under the October 2016 investment company reporting modernization rules rescinded Form N-SAR effective June 1, 2018, after which late-filing notifications for the new annual census report take the form type NT-NCEN rather than NT-NSAR.
One record in the Form NT-NSAR Files Dataset is a single EDGAR submission of either Form NT-NSAR or Form NT-NSAR/A, identified by its 18-digit SEC accession number and delivered as a self-contained folder inside a monthly ZIP container. The folder bundles a normalized JSON descriptor of the EDGAR submission header together with the documents that the filer originally transmitted to EDGAR, image files excluded. A record therefore corresponds one-to-one to a single Rule 12b-25 notification of late filing of Form N-SAR by a registered investment company, or to an amendment of such a notification when the form type is NT-NSAR/A.
Within a monthly ZIP archive the record manifests as a directory whose name is the EDGAR accession number with hyphens stripped (for example 000158064216012479 for accession number 0001580642-16-012479). Inside that directory sit two kinds of artifact: a single metadata.json file that always exists, and one or more EDGAR-submitted documents — most commonly a single short plain-text file named ntnsar.txt, occasionally accompanied by HTML or PDF exhibits.
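The folder layout described above can be walked with the standard library alone. The following is a minimal sketch, assuming the container follows the YYYY-MM/<accession folder>/<file> shape described in this section; the function name group_by_accession is illustrative and not part of any published tooling.

```python
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_accession(zip_path):
    """Map each accession-folder name to the member paths it contains.

    Assumes members look like 'YYYY-MM/<accession folder>/<file>'.
    `zip_path` may be a filesystem path or a binary file-like object.
    """
    records = defaultdict(list)
    with zipfile.ZipFile(zip_path) as zf:
        for name in zf.namelist():
            if name.endswith("/"):
                continue  # skip explicit directory entries
            parts = PurePosixPath(name).parts
            if len(parts) >= 3:
                # parts[0] is the YYYY-MM root, parts[1] the accession folder
                records[parts[1]].append(name)
    return dict(records)
```

Each value in the returned mapping should contain one metadata.json path plus one or more document paths, which gives a quick completeness check per record.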
A record has three nested layers, and an extraction pipeline traverses all three:
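- Container: a monthly ZIP named YYYY-MM.zip whose root contains a single subdirectory YYYY-MM/, under which sit the accession folders for that month.
- Accession folder: a directory named after the hyphen-stripped accession number, holding metadata.json and the EDGAR-submitted documents.
- Document: each EDGAR document is wrapped in an SGML <DOCUMENT> envelope; the substantive Form 12b-25 prose sits inside the <TEXT> block of that envelope.

The hyphenated accession number is preserved in the JSON descriptor (accessionNo field) while the folder name strips hyphens, so the folder name and the metadata field together form the canonical join key between the on-disk layout and the structured header once one is normalized to the other.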
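Converting between the two accession-number spellings is a small string operation. The sketch below assumes the standard 10-2-6 digit grouping seen in the example accession number; both helper names are illustrative.

```python
import re


def folder_name(accession_no: str) -> str:
    """Turn '0001580642-16-012479' into the on-disk folder name."""
    return accession_no.replace("-", "")


def hyphenate(folder: str) -> str:
    """Turn an 18-digit folder name back into the hyphenated accessionNo form."""
    if not re.fullmatch(r"\d{18}", folder):
        raise ValueError(f"not an 18-digit accession folder name: {folder!r}")
    # Assumed grouping: 10 digits, 2 digits, 6 digits
    return f"{folder[:10]}-{folder[10:12]}-{folder[12:]}"
```

Normalizing to the hyphen-free form on both sides before joining avoids silent misses when folder names and accessionNo values are compared directly.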
The metadata.json descriptor

metadata.json is the structured mirror of the EDGAR submission header. Its fields are produced from the SGML header EDGAR generates at the moment of filing acceptance, normalized into JSON. The fields that carry meaningful content for an NT-NSAR record are:
- formType: either "NT-NSAR" or "NT-NSAR/A". This is the only place in the record where the amendment distinction is captured structurally; the folder layout itself does not separate amendments from originals.
- accessionNo: the hyphenated EDGAR accession number (18 digits plus two hyphens), e.g. "0001580642-16-012479".
- filedAt: ISO-8601 timestamp with timezone offset (Eastern Time), recording the exact moment EDGAR accepted the submission.
- periodOfReport: the fiscal period-end date of the underlying Form N-SAR whose filing is being delayed, formatted YYYY-MM-DD. Because N-SAR was semiannual, this date is overwhelmingly March 31, June 30, September 30, or December 31.
- description: a human-readable label such as "Form NT-NSAR/A - Notification of inability to timely file Form N-SAR: [Amend]".
- linkToFilingDetails: URL to the primary document (the ntnsar.txt) on EDGAR.
- linkToTxt: URL to the complete SGML submission text file on EDGAR.
- linkToHtml: URL to the EDGAR filing-index HTML page for the accession.
- linkToXbrl: empty string on every record; the form does not carry XBRL.
- documentFormatFiles: array of every document in the EDGAR submission, each entry carrying sequence, size (byte count, as a string), documentUrl, description, and type. The primary notification is always sequence "1" with type matching the form type and a filer-supplied description. A second entry, with blank sequence and type and description "Complete submission text file", points to the consolidated SGML wrapper EDGAR builds around the whole submission.
- dataFiles: an empty array for every NT-NSAR record. The field exists because the metadata schema is shared with form types that carry XBRL data exhibits; it should be treated as optional and consistently empty for this form family.
- seriesAndClassesContractsInformation: usually an empty array. Fund filers that elect to register at the series/class level can populate it with series IDs and class IDs, but the form imposes no requirement to do so on a notification of late filing.
- entities: an array of filer entities. For each entity the descriptor records companyName (with a parenthesized role suffix such as (Filer)), cik (zero-padded 10-digit central index key), irsNo (often "000000000" for fund shells), fileNo (SEC file number, typically prefixed 811- for registered management investment companies, 814- for business development companies), act (almost always "40", marking the Investment Company Act of 1940), stateOfIncorporation, filmNo (EDGAR film/control number), type (the entity-level relationship to this filing, mirroring the form type), and tickers (an array, frequently multi-valued because a single fund commonly issues several share classes under distinct tickers).
- id: an internal content hash of the record (MD5-like hex string).

The combination of cik, fileNo, periodOfReport, and formType uniquely places each record in the fund's filing timeline; the 811-/814- file-number prefix together with act="40" confirms that the filer is a registered management investment company governed by the Investment Company Act of 1940, the population to which Form N-SAR (and therefore Form NT-NSAR) applied.
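To make the field list concrete, here is a minimal sketch of flattening one descriptor into the handful of fields most analyses need. It assumes the field names listed above and treats the first entry of entities as the filer; summarize_metadata and load_metadata are hypothetical helper names.

```python
import json


def load_metadata(path):
    """Read one metadata.json file into a dict."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def summarize_metadata(meta: dict) -> dict:
    """Flatten a descriptor; optional or empty fields never raise."""
    entities = meta.get("entities") or []
    filer = entities[0] if entities else {}  # assumption: first entity is the filer
    return {
        "accessionNo": meta.get("accessionNo"),
        "formType": meta.get("formType"),
        "isAmendment": meta.get("formType") == "NT-NSAR/A",
        "filedAt": meta.get("filedAt"),
        "periodOfReport": meta.get("periodOfReport"),
        "cik": filer.get("cik"),
        "fileNo": filer.get("fileNo"),
        "tickers": list(filer.get("tickers") or []),
        # Usually empty for this form family, so report presence only
        "hasSeriesInfo": bool(meta.get("seriesAndClassesContractsInformation")),
    }
```

Using .get() throughout reflects the guidance that dataFiles, linkToXbrl, and seriesAndClassesContractsInformation are optional rather than missing data.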
Primary document (ntnsar.txt)

The primary document is a plain-text file wrapped in EDGAR's SGML <DOCUMENT> envelope. The envelope carries five header tags (<TYPE>, <SEQUENCE>, <FILENAME>, <DESCRIPTION>, and <TEXT>), and the entire substantive content sits between <TEXT> and </TEXT>. Inside that block lies the literal Form 12b-25 text, organized into the four numbered parts of the SEC paper form, with a literal <PAGE> token marking page breaks from the original paginated layout.
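As a rough illustration of the envelope described above, the sketch below pulls the header tags and the <TEXT> body out of one document with regular expressions. It assumes each header tag sits on its own line with its value following the tag, which is the usual EDGAR SGML shape; parse_document is an illustrative name, and a production parser may need to handle more variants.

```python
import re

HEADER_TAGS = ("TYPE", "SEQUENCE", "FILENAME", "DESCRIPTION")


def parse_document(raw: str) -> dict:
    """Extract header tag values and the cleaned <TEXT> body from one document."""
    header = raw.split("<TEXT>", 1)[0]  # only look for header tags before the body
    doc = {}
    for tag in HEADER_TAGS:
        match = re.search(rf"<{tag}>([^\r\n]*)", header)
        if match:
            doc[tag.lower()] = match.group(1).strip()
    body = re.search(r"<TEXT>(.*?)</TEXT>", raw, flags=re.DOTALL)
    text = body.group(1) if body else ""
    # Treat <PAGE> as a soft separator rather than a section boundary
    doc["text"] = text.replace("<PAGE>", "\n").strip()
    return doc
```

The returned text value is the linearized Form 12b-25 body, ready for part-level segmentation.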
The opening of the body reproduces the SEC heading and the checkbox grid that selects which report is being delayed:
    U.S SECURITIES AND EXCHANGE COMMISSION
    WASHINGTON, D.C. 20549

    FORM 12b-25
    NOTIFICATION OF LATE FILING
On the checkbox grid the box for Form N-SAR is marked, and a For the Period Ended: line carries the fiscal period whose filing is being delayed. The remainder of the document follows the four standard parts.
Part I supplies the registrant's legal identifying block: full company name and the address of the principal executive office (street, city, state, ZIP). Layout is plain-text prose with the same line breaks as the SEC paper form. SEC file number and IRS employer identification number sometimes appear here as well, although most NT-NSAR registrants leave the IRS line zeroed because the filer is an investment-company shell rather than an operating entity.
Part II contains the formal Rule 12b-25 representations. The filer affirms, by checking the corresponding boxes, that (a) the report could not be filed on the prescribed date without unreasonable effort or expense, (b) the report will be filed within the 15-day extension window permitted by Rule 12b-25(b)(2)(ii) for annual reports (or the shorter window for quarterly-style reports), and (c) any required accountant's statement is attached as Exhibit 1 to the notification. For an N-SAR notification the (c) representation is generally not invoked because N-SAR does not require an auditor's statement, but the boilerplate boxes remain on the form.
Part III is the free-text core of the filing: a short prose explanation of why the underlying N-SAR could not be filed on time without unreasonable effort or expense. Typical content includes late receipt of audited financial statements from the fund's accountant or sub-administrator, changes in fund administrator or accounting agent, delays in obtaining sub-adviser confirmations, system or staffing disruptions, internal-controls remediation, mergers and reorganizations affecting the fund complex, and back-testing of fair-value valuations. This is the only free-form narrative in the entire filing and the only place where the substantive cause of the late filing is disclosed.
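Isolating the Part III narrative usually comes down to finding the part headings in the body text. The sketch below is a heuristic only: it assumes the headings appear in upper case at the start of a line (for example "PART III - NARRATIVE"), which filers do not follow uniformly, and it matches case-sensitively so that mixed-case cross-references to "Part III" inside Part II are not mistaken for the heading. extract_part_iii is a hypothetical helper name.

```python
import re

# Heading at line start, rest of the heading line skipped, body captured
# up to the next line that starts with a PART IV heading (or end of text).
PART_III = re.compile(
    r"^[ \t]*PART\s+III\b[^\n]*\n(.*?)(?=^[ \t]*PART\s+IV\b|\Z)",
    flags=re.MULTILINE | re.DOTALL,
)


def extract_part_iii(text: str):
    """Return the Part III narrative, or None when no heading is found."""
    match = PART_III.search(text)
    return match.group(1).strip() if match else None
```

Records where the function returns None are worth routing to manual review rather than dropping, since heading styles vary across filers and years.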
Part IV captures three small pieces of structured information:
/s/-prefixed electronic signature line.

The signature date frequently precedes filedAt by a day or two, since filers sign internally and then transmit to EDGAR shortly afterwards.
A record includes the complete header metadata from the EDGAR submission (in metadata.json), the primary NT-NSAR notification document in its original SGML-wrapped plain-text form, and any HTML or PDF exhibits the filer attached to the original submission. The file types present in the dataset are TXT, JSON, HTML, and PDF; in practice the overwhelming majority of records consist of a metadata.json plus a single ntnsar.txt, because NT-NSAR notifications rarely have exhibits. When HTML or PDF documents do appear they are preserved verbatim from the EDGAR submission and listed in documentFormatFiles with their original sequence numbers, byte sizes, descriptions, and URLs.
Image files (GIF, JPG, and similar binary image formats) are deliberately stripped from each accession folder. These are almost never substantive for NT-NSAR — typically scanned letterheads or signature images on the rare PDF exhibits — and removing them keeps containers small. Beyond images, no other EDGAR content is removed: the SGML-wrapped notification, the complete-submission text file, and any HTML or PDF exhibits are all retained.
The dataset does not include the underlying Form N-SAR itself. Form N-SAR was a separate filing with its own EDGAR submissions and its own structured exhibits, and is outside the scope of this dataset. Nor does the dataset include other 12b-25 variants such as Form NT 10-K, Form NT 10-Q, Form NT 20-F, or NT-NCEN, which target different periodic reports and live in their own datasets.
The form's required content stayed remarkably stable across the 1994 to 2018 window. The Rule 12b-25 framework, the four-part layout of Form 12b-25, and the checkbox grid identifying which periodic report is being delayed have remained essentially unchanged for the entirety of the dataset's date range. What changed is the surrounding regulatory context rather than the body of the form:
- After the June 1, 2018 retirement of Form N-SAR, late-filing notifications for the successor annual report take the form type NT-NCEN rather than NT-NSAR. The Form NT-NSAR Files Dataset therefore terminates organically in mid-2018; later notifications by fund filers do not appear here.
- The filer population is identified by act="40" and the 811- (open- and closed-end management companies, unit investment trusts) or 814- (business development companies) file-number prefix in the metadata's entities block.

Because Form NT-NSAR is a short administrative notification, it did not participate in the structured-data modernization that reshaped operating-company filings:
- The primary document stays wrapped in EDGAR's SGML <DOCUMENT> envelope. The four-part Form 12b-25 body is uniformly ASCII, with <PAGE> tokens between parts.
- ntnsar.txt retains the plain-text format throughout the dataset's history; HTML or PDF were never widely adopted for the primary document of this form, in contrast to operating-company periodic reports where HTML supplanted ASCII as the dominant primary-document format during the early 2000s.

Several practical considerations matter for downstream use of a record:
- NT-NSAR/A filings are amendments to prior NT-NSAR notifications. The amendment relationship is not encoded as a parent-accession reference inside metadata.json; reconstructing the amendment chain requires linking on cik and periodOfReport across the record set. Amendments typically re-state the entire Part III narrative rather than diffing against the prior filing.
- Part III is unstructured prose inside the <TEXT> body. Common cue phrases (unreasonable effort or expense, auditor, subadviser, fair value, internal control) cluster in identifiable patterns but are never tagged.
- <PAGE> tokens. Page-break markers embedded in the body should be treated as soft separators when reconstructing the document into a linear text; they do not delimit semantically meaningful sections.
- The tickers array on a single filer entity is routinely multi-valued because closed-end funds and open-end funds with multiple share classes register a ticker per class. Reading the ticker as the single identifier of the filer is incorrect; the canonical filer identifier is cik, with fileNo providing the SEC file-number anchor.
- periodOfReport semantics. periodOfReport refers to the period of the delayed Form N-SAR, not the date the notification itself was filed. The two can differ by several months when the filer is significantly late.
- dataFiles, linkToXbrl, and seriesAndClassesContractsInformation are inherited from the broader EDGAR metadata schema and are usually empty on this form family; they should be parsed as optional rather than treated as missing data.
- Folder names strip the hyphens that are preserved in the accessionNo field. Any pipeline that walks the folder tree must normalize one form to the other to align them.
- The signature date in the document body and the filedAt timestamp in metadata.json are independent; downstream timelines should distinguish "date signed" from "date accepted by EDGAR".

The filer is always the registered investment company itself (the registrant), identified by its EDGAR CIK and its 1940 Act file number (typically the 811- prefix). The form is executed by an authorized officer of the registrant, most often the principal financial officer, treasurer, or fund administrator.
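The filing population is limited to entities subject to Form N-SAR under Section 30 of the Investment Company Act of 1940:
- open-end and closed-end management investment companies, which filed N-SAR semi-annually under Rule 30b1-1;
- registered unit investment trusts, which filed N-SAR annually under Rule 30a-1.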
For series trusts, the NT-NSAR is filed at the registrant level even where the underlying delay relates to specific series or classes.
Entities outside this population do not appear in the dataset. Operating-company Exchange Act issuers used Form NT 10-K, NT 10-Q, or NT 20-F. Business development companies (BDCs), although investment companies, file Exchange Act periodic reports and used the NT 10-K / NT 10-Q regime rather than NT-NSAR.
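The triggering event is the registrant's determination that it cannot timely file Form N-SAR for a given reporting period without unreasonable effort or expense. The underlying N-SAR deadlines are:
- management investment companies: a semi-annual N-SAR, due within 60 days after the close of each reporting period;
- unit investment trusts: an annual N-SAR, due within 60 days after the close of the year.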
When the registrant cannot meet that 60-day deadline, Rule 12b-25 under the Securities Exchange Act of 1934 requires it to file a notification of late filing on the prescribed form. For N-SAR filers, that form is Form NT-NSAR.
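Deadline mechanics:
- the notification is due no later than one business day after the N-SAR due date;
- a timely notification carrying the Part II representations extends the N-SAR deadline by up to 15 calendar days.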
Form NT-NSAR/A is an amendment to a previously filed NT-NSAR, used to correct disclosure errors, revise the anticipated N-SAR filing date, or update identifying information. It is not an independent late-filing notification.
The dataset is a closed historical archive covering EDGAR submissions from January 1994 through June 2018. The earliest records reflect the phase-in of electronic filing for investment company reports; the latest records correspond to the final N-SAR cycle before the SEC retired Form N-SAR and replaced it with Form N-CEN as part of the 2016 investment company reporting modernization (effective June 1, 2018). No new NT-NSAR filings are accepted after the cutover, and the dataset is static.
Form NT-NSAR sits at the intersection of two regulatory regimes: the Rule 12b-25 late-filing framework under the Securities Exchange Act of 1934, and the investment company periodic reporting obligations under the Investment Company Act of 1940. The closest comparison targets fall into three groups: the underlying report whose lateness is being noticed (N-SAR), the successor regime that replaced it (N-CEN and NT-NCEN), and the parallel Rule 12b-25 notifications filed for other report types (NT-NCSR, NT-NCSRS, NT 10-K, NT 10-Q, NT 20-F).
N-SAR was the semi-annual report (annual for unit investment trusts) required of registered management investment companies and UITs under Section 30 of the Investment Company Act and Rule 30b1-1. It is the substantive report; NT-NSAR is the procedural notification that the substantive report will arrive late.
The two datasets are tightly coupled but share almost no content. N-SAR carried structured operational and financial data in a fixed answer-format: portfolio turnover, sales loads, advisory fees, custodian and transfer agent identity, distribution arrangements, securities lending activity, and similar fields. NT-NSAR contains only the 12b-25 cover-sheet narrative — a brief explanation of the delay, the expected filing date, and a representation about whether the lateness signals a material change in results. Both forms were discontinued together in June 2018.
Use N-SAR for fund operations and economics; use NT-NSAR for compliance behavior, late-filing patterns, and audit-cycle disruption.
N-CEN replaced N-SAR effective June 1, 2018 and is the modern annual census-style report for registered investment companies, filed in structured XML under Rule 30a-1. N-CEN is the successor to N-SAR, not to NT-NSAR — it is substantive, not a notification. The relevance is temporal: NT-NSAR is a closed historical series (1994 to mid-2018), and any continuous time series of investment-company late-filing behavior must stitch NT-NSAR to NT-NCEN. Note that the underlying cadence also shifts from semi-annual (and annual for UITs) under N-SAR to annual under N-CEN, which changes the denominator of any lateness rate.
NT-NCEN is the direct functional successor to NT-NSAR — the same Rule 12b-25 notification mechanism, redirected at the new annual N-CEN report. Structure and narrative content are nearly identical. The differences are temporal and quantitative: NT-NCEN covers a single annual filing per registrant rather than semi-annual N-SAR filings, and it remains an active series while NT-NSAR is closed. NT-NCEN is the natural forward extension of this dataset for continuity research; it is out of scope for pure historical N-SAR-era analysis.
Form NT-NCSR and NT-NCSRS are Rule 12b-25 notifications filed by the same investment-company population as NT-NSAR, but for different underlying reports: Form N-CSR (certified annual shareholder reports) and Form N-CSRS (certified semi-annual shareholder reports) under Rule 30b2-1. The content of those underlying reports — audited or unaudited shareholder financial statements, management discussion, and Sarbanes-Oxley certifications — differs fundamentally from the operational census data on N-SAR.
A fund late on its shareholder report files NT-NCSR or NT-NCSRS; a fund late on its operational census report filed NT-NSAR. The two notification streams are independent and can co-occur for the same fund in the same period. Unlike NT-NSAR, NT-NCSR and NT-NCSRS remain active.
NT 10-K and NT 10-Q are the operating-company analogs of NT-NSAR. All three use the identical Rule 12b-25 cover-sheet structure and Part III narrative explaining inability to file timely. The boundary is the filer population and underlying report: NT 10-K and NT 10-Q come from operating companies filing 1934 Act annual or quarterly reports; NT-NSAR came from registered investment companies filing a 1940 Act report.
A pooled Rule 12b-25 dataset can be useful for studying notification mechanics, but mixing operating-company and fund-complex late filings without segmentation will conflate very different drivers — audit timing, restatements, and M&A on the operating-company side versus fund mergers, adviser changes, and transfer agent transitions on the fund side.
NT 20-F is the foreign private issuer analog within the same Rule 12b-25 framework, covering late filings of the annual Form 20-F. It shares the cover-sheet design with NT-NSAR but applies to a disjoint filer population — FPIs reporting under the 1934 Act, not registered investment companies. It is relevant only for mapping the perimeter of the 12b-25 notification family, not as a substitute or complement for NT-NSAR research.
The Form NT-NSAR dataset captures a narrow conjunction: a registered investment company or UIT, invoking Rule 12b-25, to delay Form N-SAR, between January 1994 and June 2018. No other dataset isolates this slice. N-SAR holds the substantive report but not the lateness signal. NT-NCEN carries the lateness signal forward but onto a different report on a different cadence. NT-NCSR and NT-NCSRS share the filer population but cover shareholder reports rather than census reports. NT 10-K, NT 10-Q, and NT 20-F share the 12b-25 architecture but apply to entirely different filer regimes. NT-NSAR is best treated as a compliance-behavior series, complementary to N-SAR rather than substitutable for it, and sharply bounded by the 2018 retirement of the N-SAR regime.
Because the dataset is a closed historical archive, users work retrospectively: reconstructing fund-complex reporting histories, studying the N-SAR regime, or completing Rule 12b-25 archives. The fields that matter most across user types are CIK and registrant identity, periodOfReport, the 12b-25(b)/(c) elections, and the Part III narrative explanation.
Compliance teams use the dataset during internal audits, exam responses covering legacy periods, and diligence on funds inherited through adviser transactions or sub-advisory changes. They pull every NT-NSAR under a fund family by CIK, map missed N-SAR semiannual periods via periodOfReport, and use Part III narratives to characterize stated causes (system conversions, service-provider transitions, audit delays, fair-valuation issues). Output is a reporting-history memo or representation to a successor adviser or trustee.
Regulatory researchers treat the dataset as a complete population of fund-side Rule 12b-25 notifications across the form's active window. They analyze filedAt and periodOfReport for seasonality and clustering, quantify 12b-25(b) versus (c) elections, and code Part III narratives to track root causes over time. Output is longitudinal analysis of late-filing behavior and historical context for current N-CEN exception-reporting policy.
Fund administrators and accountants join entities and CIK to provider-to-fund maps to test whether NT-NSAR filings cluster around specific administrators, accounting agents, or platform transitions. periodOfReport and Part III explanations drive the analysis. Output is a service-provider reliability review or benchmarking input for RFPs and contract renewals.
Academics merge NT-NSAR records with N-SAR fund-level data, returns, board composition, and adviser identity to test whether late-filing events correlate with subsequent adviser turnover, board changes, or performance. Part III narratives support hand-coded studies of stated causes; the 12b-25(b) representation distinguishes routine extensions from serious filing breakdowns.
Forensic accounting and litigation-support teams construct authoritative reporting timelines in disputes involving fund advisers, administrators, or auditors. They align every NT-NSAR for a CIK against the eventual N-SAR or N-SAR/A filings to compute actual versus permitted delay, then mine Part III for the registrant's contemporaneous explanation. The Part IV representation on anticipated changes in results of operations is read carefully because it signals whether the lateness accompanies a material change. Output is a timeline exhibit or expert report appendix.
Fund-adviser M&A diligence teams surface historical reporting friction in a target adviser by running CIK searches across every fund it manages, clustering NT-NSAR filings by periodOfReport, and reading Part III for patterns of operational stress around semiannual N-SAR cycles. Output feeds purchase-price adjustments, reps and warranties, or post-closing remediation plans.
Teams completing Rule 12b-25 archives use this dataset to fill the registered-investment-company portion of the Rule 12b-25 universe (alongside NT 10-K, NT 10-Q, NT 20-F, NT-NCSR) for the 1994 to 2018 window. They consume accessionNo, filedAt, CIK, and entity metadata to normalize records and ingest document payloads for downstream text extraction.
LLM and RAG developers use the Part III narratives as a compact, well-bounded corpus tied to structured metadata for training classifiers that label delay reasons, summarize operational disruptions, or link narrative content to periodOfReport and 12b-25 election flags.
The use cases below anchor in the dataset's specific structure: CIK and file-number identity, periodOfReport, the 12b-25(b)/(c) representations, and the Part III free-text explanation.
A fund-adviser M&A diligence team enumerates every CIK managed by a target adviser, pulls all NT-NSAR and NT-NSAR/A records for those CIKs, and clusters them by periodOfReport. Repeated lateness around the same March 31 or September 30 N-SAR cycles, or recurring Part III references to audit firms or sub-administrator transitions, becomes evidence for purchase-price adjustments, reps and warranties on regulatory compliance, or post-closing remediation conditions.
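The clustering step can be as simple as counting notifications per reporting cycle. This is a minimal sketch, assuming each record has already been flattened to a dict with cik and periodOfReport keys; cluster_by_cycle is an illustrative name.

```python
from collections import Counter


def cluster_by_cycle(records, target_ciks):
    """Count notifications per MM-DD reporting cycle for the given CIKs."""
    ciks = set(target_ciks)
    counts = Counter(
        rec["periodOfReport"][5:]  # 'YYYY-MM-DD' -> 'MM-DD'
        for rec in records
        if rec["cik"] in ciks and rec.get("periodOfReport")
    )
    return counts.most_common()  # most frequent cycle first
```

A cycle that dominates the result across several years is a prompt to read the matching Part III narratives, not a finding by itself.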
Regulatory researchers and academics treat Part III as a closed-corpus NLP task. They strip <PAGE> tokens from ntnsar.txt, segment on standard cue phrases (unreasonable effort or expense, auditor, sub-adviser, fair value, internal control, merger), and hand-label or LLM-classify a sample to produce a stated-cause taxonomy. Outputs are time series of delay causes from 1994 to 2018 and inputs to studies of N-SAR-era operational risk and the design rationale for N-CEN exception reporting.
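A first-pass cue-phrase tagger can be sketched with a few regular expressions. The patterns and label names below are illustrative assumptions built from the cue phrases listed above, not a validated taxonomy; they are meant as a pre-filter ahead of hand-labeling or model-based classification.

```python
import re

# Label -> pattern; spelling variants (sub-adviser/subadvisor) are folded together
CUES = {
    "unreasonable_effort": r"unreasonable\s+effort\s+or\s+expense",
    "auditor": r"\bauditors?\b",
    "sub_adviser": r"\bsub-?advis[eo]rs?\b",
    "fair_value": r"\bfair[\s-]+valu(?:e|ation)s?\b",
    "internal_control": r"\binternal[\s-]+controls?\b",
    "merger": r"\bmergers?\b",
}


def tag_cues(text: str) -> list:
    """Return the sorted labels whose pattern occurs in the text."""
    return sorted(
        label
        for label, pattern in CUES.items()
        if re.search(pattern, text, flags=re.IGNORECASE)
    )
```

Keyword hits over-count boilerplate (for example, the Part II representation repeats "unreasonable effort or expense" on every filing), so tagging is best applied to the Part III text only.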
Forensic accounting and litigation-support teams join NT-NSAR records (by cik and periodOfReport) against the eventual N-SAR or N-SAR/A acceptance dates for the same period. Rule 12b-25(b) measures the 15-calendar-day extension from the prescribed N-SAR due date (60 days after periodOfReport), not from the notification's filedAt; that extended date is the "permitted" cutoff, and the gap between it and the substantive N-SAR's filedAt quantifies the breach. The 12b-25(c) election and Part III narrative supply the registrant's contemporaneous explanation for expert reports and timeline exhibits.
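The delay arithmetic can be sketched in a few lines. This version assumes the extension is measured from the prescribed N-SAR due date, taken as 60 calendar days after periodOfReport, plus 15 calendar days; it ignores weekend and holiday roll-forward, so results near the boundary should be checked by hand. Both function names are illustrative.

```python
from datetime import date, timedelta

N_SAR_DUE_DAYS = 60   # assumed: N-SAR due 60 calendar days after period end
EXTENSION_DAYS = 15   # assumed: Rule 12b-25 extension in calendar days


def permitted_cutoff(period_of_report: str) -> date:
    """Latest date the N-SAR could arrive and still fall within the extension."""
    period_end = date.fromisoformat(period_of_report)
    due = period_end + timedelta(days=N_SAR_DUE_DAYS)
    return due + timedelta(days=EXTENSION_DAYS)


def days_past_cutoff(period_of_report: str, nsar_filed_at: str) -> int:
    """Calendar days by which the N-SAR missed the extended deadline (0 if met)."""
    filed = date.fromisoformat(nsar_filed_at[:10])  # keep the date part of the timestamp
    return max(0, (filed - permitted_cutoff(period_of_report)).days)
```

For a period ending 2016-06-30, the sketch places the due date at 2016-08-29 and the extended cutoff at 2016-09-13.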
A fund administrator or accountant joins the entities block (CIK, fund file numbers under 811- and 814-) against an external map of fund-to-administrator and fund-to-auditor assignments for the relevant period. Counting NT-NSAR filings per provider-year, weighted by fund complex size, produces a reliability scorecard. Part III text explicitly naming administrator transitions, conversions of accounting platforms, or auditor handoffs adds qualitative evidence for RFP responses and contract renewals.
Compliance teams and historians building a single 1994-to-present view of registered-investment-company late filings concatenate NT-NSAR with NT-NCEN starting June 1, 2018. The dataset's formType, periodOfReport, cik, and fileNo align cleanly with NT-NCEN's schema, but the underlying cadence shifts from semi-annual N-SAR to annual N-CEN, so the analyst must renormalize the lateness denominator at the regime break.
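The renormalization amounts to dividing notifications by the number of reports that were actually due under each regime. A minimal sketch, assuming two expected N-SAR filings per year for management companies, one for unit investment trusts, and one N-CEN per year; the function names and the registrant counts are inputs the analyst must supply.

```python
def expected_filings_per_year(regime: str, is_uit: bool = False) -> int:
    """Reports due per registrant per year under each regime."""
    if regime == "N-SAR":
        return 1 if is_uit else 2  # annual for UITs, semi-annual otherwise
    if regime == "N-CEN":
        return 1
    raise ValueError(f"unknown regime: {regime!r}")


def lateness_rate(notifications: int, registrants: int, regime: str, is_uit: bool = False) -> float:
    """Late-filing notifications per report due in one year."""
    expected = registrants * expected_filings_per_year(regime, is_uit)
    return notifications / expected if expected else 0.0
```

With the same 30 notifications across 1,000 management companies, the rate is 0.015 under N-SAR and 0.03 under N-CEN, which is why raw counts are not comparable across the break.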
Because metadata.json does not carry a parent-accession reference, reconstructing amendment chains requires linking all NT-NSAR/A records back to the original NT-NSAR on the same cik plus periodOfReport. The output is a per-fund amendment ledger that distinguishes notifications corrected for clerical reasons from those whose Part III narrative was materially re-stated, supporting compliance memos and exam-response packages covering legacy periods.
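One way to sketch the linking step is to group records on (cik, periodOfReport) and order each group by acceptance time. The example assumes records are flat dicts carrying cik, periodOfReport, formType, filedAt, and accessionNo; build_amendment_ledger is an illustrative name, and the grouping key is a heuristic that can mis-link records if a registrant filed for overlapping periods.

```python
from collections import defaultdict
from datetime import datetime


def build_amendment_ledger(records):
    """Group notifications by (cik, periodOfReport) and split originals from amendments."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["cik"], rec["periodOfReport"])].append(rec)

    ledger = {}
    for key, recs in groups.items():
        # Parse timestamps so ordering respects differing UTC offsets
        ordered = sorted(recs, key=lambda r: datetime.fromisoformat(r["filedAt"]))
        originals = [r["accessionNo"] for r in ordered if r["formType"] == "NT-NSAR"]
        amendments = [r["accessionNo"] for r in ordered if r["formType"] == "NT-NSAR/A"]
        ledger[key] = {
            "originals": originals,
            "amendments": amendments,
            # An amendment with no matching original needs manual review
            "orphaned": bool(amendments) and not originals,
        }
    return ledger
```

Orphaned entries typically mean the original was filed under a different period date or falls outside the record set being scanned.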
LLM and RAG developers use the Part III bodies as a small, clean fine-tuning or evaluation corpus for compliance-text classification: each ntnsar.txt is paired with structured labels from metadata.json (form type, period quarter-end, file-number prefix distinguishing 811- management companies from 814- BDCs). The resulting classifier can be redeployed against active NT-NCEN, NT-NCSR, and NT 10-K filings, with NT-NSAR providing a labeled historical baseline.
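Pairing a narrative with its structured labels might look like the sketch below. It assumes the metadata.json field names described earlier and that the Part III text has already been extracted; make_example and the label keys are hypothetical choices, not a published schema.

```python
def make_example(meta: dict, part_iii_text: str) -> dict:
    """Build one labeled training example from a descriptor and its narrative."""
    entities = meta.get("entities") or []
    filer = entities[0] if entities else {}  # assumption: first entity is the filer
    file_no = filer.get("fileNo") or ""
    period = meta.get("periodOfReport") or ""
    return {
        "text": part_iii_text.strip(),
        "labels": {
            "formType": meta.get("formType"),
            "periodMonthDay": period[5:],             # e.g. '06-30'
            "fileNoPrefix": file_no.split("-", 1)[0],  # e.g. '811'
        },
    }
```

Keeping labels in a nested dict makes it straightforward to add hand-coded cause labels later without touching the metadata-derived ones.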
Dataset Index JSON API: https://api.sec-api.io/datasets/form-ntnsar-files.json
This endpoint returns the dataset's metadata along with the full list of container files. The metadata includes the dataset name, description, last updated timestamp, earliest sample date (1994-01-01), total record and size counters, the form types covered (NT-NSAR, NT-NSAR/A), the container format (ZIP), and the file types contained inside each container (TXT, JSON, HTML, PDF). The containers array lists every monthly ZIP archive with its key, size, records, updatedAt timestamp, and downloadUrl. Use this endpoint to monitor which containers changed in the latest refresh run so you can selectively re-download only what's new.
This endpoint does not require an API key.
Example response:
    {
      "datasetId": "1f13365b-9ae0-6977-b435-8aa67686a569",
      "datasetDownloadUrl": "https://api.sec-api.io/datasets/form-ntnsar-files.zip",
      "name": "Form NT-NSAR Files Dataset",
      "updatedAt": "2026-04-15T08:04:25.558Z",
      "earliestSampleDate": "1994-01-01",
      "totalRecords": 2760,
      "totalSize": 7985619,
      "formTypes": ["NT-NSAR", "NT-NSAR/A"],
      "containerFormat": "ZIP",
      "fileTypes": ["TXT", "JSON", "HTML", "PDF"],
      "containers": [
        {
          "downloadUrl": "https://api.sec-api.io/datasets/form-ntnsar-files/2018/2018-06.zip",
          "key": "2018/2018-06.zip",
          "size": 45213,
          "records": 12,
          "updatedAt": "2026-04-15T08:04:25.558Z"
        }
      ]
    }
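Selective re-downloading can be sketched by comparing each container's updatedAt against the time of the last local sync. The example uses only the standard library and the index URL given above; fetch_index and changed_containers are illustrative names, and the last-sync timestamp is something the caller is assumed to persist.

```python
import json
from datetime import datetime
from urllib.request import urlopen

INDEX_URL = "https://api.sec-api.io/datasets/form-ntnsar-files.json"


def parse_ts(value: str) -> datetime:
    """Parse an ISO-8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def fetch_index(url: str = INDEX_URL) -> dict:
    """Download and decode the dataset index (no API key required)."""
    with urlopen(url) as resp:
        return json.load(resp)


def changed_containers(index: dict, last_sync: str) -> list:
    """Return the container entries updated after the last sync time."""
    cutoff = parse_ts(last_sync)
    return [
        c for c in index.get("containers", [])
        if parse_ts(c["updatedAt"]) > cutoff
    ]
```

A typical run would call changed_containers(fetch_index(), last_sync) and download only the keys that come back.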
Download Entire Dataset: https://api.sec-api.io/datasets/form-ntnsar-files.zip?token=YOUR_API_KEY
Downloads the complete dataset as a single ZIP archive covering all NT-NSAR and NT-NSAR/A filings from January 1994 through the form's discontinuation in June 2018. The full archive is small, so a bulk download is trivial and usually the simplest way to work with this dataset. This endpoint requires an API key.
Download Single Container: https://api.sec-api.io/datasets/form-ntnsar-files/2018/2018-06.zip?token=YOUR_API_KEY
Each container is a year-month ZIP archive that bundles all filings submitted in that month. Pass the key value from the containers array of the dataset index JSON to download an individual archive instead of the full dataset. This endpoint requires an API key.
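Building the container URL from a key is a matter of joining it to the dataset base path and appending the token. A minimal sketch using the URL pattern shown above; container_url and download_container are illustrative names, and the API key is supplied by the caller.

```python
import shutil
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE_URL = "https://api.sec-api.io/datasets/form-ntnsar-files"


def container_url(key: str, api_key: str) -> str:
    """Build the download URL for one container key, e.g. '2018/2018-06.zip'."""
    return f"{BASE_URL}/{quote(key)}?{urlencode({'token': api_key})}"


def download_container(key: str, api_key: str, dest_path: str) -> str:
    """Stream one container to disk and return the destination path."""
    with urlopen(container_url(key, api_key)) as resp, open(dest_path, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest_path
```

Streaming with copyfileobj avoids holding the archive in memory, although the containers in this dataset are small enough that either approach works.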
The dataset covers Form NT-NSAR and its amendment variant Form NT-NSAR/A — the Rule 12b-25 notification of late filing submitted by registered investment companies when they cannot timely file Form N-SAR, the periodic operational report historically required under Section 30 of the Investment Company Act of 1940.
One record is a single EDGAR submission of Form NT-NSAR or Form NT-NSAR/A, identified by its 18-digit SEC accession number and delivered as a self-contained accession folder inside a monthly ZIP container. The folder bundles a metadata.json descriptor with the original EDGAR documents, most commonly a single plain-text ntnsar.txt notification.
The filer is always the registered investment company itself: open-end and closed-end management investment companies (which filed N-SAR semi-annually under Rule 30b1-1) and registered unit investment trusts (which filed N-SAR annually under Rule 30a-1). The form is executed by an authorized officer such as the principal financial officer, treasurer, or fund administrator. Operating companies and foreign private issuers used different NT forms (NT 10-K, NT 10-Q, NT 20-F) and do not appear in this dataset.
The dataset is a closed historical archive of EDGAR submissions from January 1, 1994 through June 2018. It terminates when the SEC retired Form N-SAR under the 2016 investment company reporting modernization rules, replacing it with Form N-CEN effective June 1, 2018; subsequent late-filing notifications use Form NT-NCEN, which is not included here.
The dataset is distributed as monthly ZIP containers named YYYY-MM.zip. Inside, each accession folder contains a normalized metadata.json plus the original EDGAR documents. File types present are TXT (the SGML-wrapped ntnsar.txt notification), JSON (the metadata descriptor), and occasionally HTML or PDF exhibits. Image files (GIF, JPG, and similar) are deliberately stripped.
Form N-SAR is the substantive operational report; Form NT-NSAR is only the procedural notification that the N-SAR will arrive late. The two share filer population and time window but carry almost no overlapping content. Form NT-NCEN is the direct functional successor to NT-NSAR after June 2018, using the same Rule 12b-25 mechanism but targeting the new annual Form N-CEN; for any continuous fund-side 12b-25 time series, NT-NSAR must be stitched to NT-NCEN at the June 1, 2018 regime break.
The substantive cause is disclosed in Part III of the Form 12b-25 body inside ntnsar.txt — the only free-text narrative section in the filing. Typical disclosures include late receipt of audited financials, changes in fund administrator or accounting agent, sub-adviser confirmation delays, internal-controls remediation, fund mergers, and fair-value back-testing. Because Part III is unstructured prose, automated analysis requires natural-language processing of the SGML <TEXT> body.