Form NTFNSAR Files Dataset

The Form NTFNSAR Files Dataset is a closed corpus of Rule 12b-25 late-filing notifications submitted to EDGAR by registered investment companies that could not timely file their Form N-SAR semi-annual or annual report. Each record is one NTFNSAR submission — a registrant's Form 12b-25 notification with the "Form N-SAR" box checked — identified by EDGAR accession number and shipped as a folder containing a metadata.json submission header plus one or more .txt documents wrapped in EDGAR SGML envelopes. The filers are the same population that filed N-SAR itself: registered management investment companies (open-end mutual funds, closed-end funds, small business investment companies) and registered unit investment trusts, with submissions frequently routed through fund administrators or third-party EDGAR filing agents. The dataset spans April 1995 through June 2019, beginning with the phased EDGAR rollout for investment companies and ending with the rescission of Form N-SAR by Release 33-10231, which replaced N-SAR with Form N-CEN and retired the NTFNSAR submission code. Records are distributed in monthly ZIP containers organized by year, with TXT and JSON file types only.

Update Frequency: Daily
Updated at: 2026-04-15
Earliest Sample Date: 1995-04-01
Total Size: 428.5 KB
Total Records: 146
Container Format: ZIP
Content Types: TXT, JSON
Form Types: NTFNSAR

Dataset Files

33 files · 428.5 KB

2005-09.zip · 22 B · 0 records
2005-08.zip · 22 B · 0 records
2001-03.zip · 27.1 KB · 8 records
2001-01.zip · 32.4 KB · 12 records
2000-09.zip · 3.9 KB · 1 record
2000-03.zip · 5.0 KB · 2 records
1999-12.zip · 2.7 KB · 1 record
1999-11.zip · 13.4 KB · 5 records
1999-10.zip · 16.0 KB · 6 records
1999-08.zip · 94.2 KB · 33 records
1999-07.zip · 4.8 KB · 2 records
1999-06.zip · 7.7 KB · 3 records
1999-05.zip · 52.1 KB · 16 records
1999-03.zip · 6.9 KB · 3 records
1999-01.zip · 2.2 KB · 1 record
1998-12.zip · 7.5 KB · 3 records
1998-09.zip · 31.5 KB · 10 records
1998-05.zip · 3.5 KB · 1 record
1998-03.zip · 32.5 KB · 10 records
1998-02.zip · 5.4 KB · 2 records
1997-12.zip · 4.9 KB · 2 records
1997-11.zip · 19.5 KB · 6 records
1997-07.zip · 6.0 KB · 2 records
1997-06.zip · 944 B · 1 record
1997-05.zip · 3.4 KB · 1 record
1997-03.zip · 22.0 KB · 7 records
1997-01.zip · 2.6 KB · 1 record
1996-11.zip · 2.5 KB · 1 record
1996-09.zip · 2.9 KB · 1 record
1996-05.zip · 3.6 KB · 1 record
1995-08.zip · 2.5 KB · 1 record
1995-05.zip · 5.5 KB · 2 records
1995-04.zip · 3.4 KB · 1 record

What This Dataset Contains

NTFNSAR is the EDGAR submission type used when a registered investment company files Form 12b-25 ("Notification of Late Filing") to notify the Commission that it cannot timely file its Form N-SAR — the semi-annual and annual report formerly required of management investment companies and unit investment trusts under Section 30 of the Investment Company Act of 1940 and Rule 30a-1 thereunder. The filing is therefore not a bespoke form: it is the generic Form 12b-25 template with the "Form N-SAR" checkbox marked and routed to EDGAR under the NTFNSAR code. Each document is short — typically two to four printed pages of monospaced ASCII — and procedurally invokes Rule 12b-25 to obtain a 15-day grace period for an annual or semi-annual N-SAR (or 5 days for a quarterly report) by representing that the delay could not be eliminated without unreasonable effort or expense and committing to a forthcoming filing date.

The corpus is closed: it spans April 1995 through June 2019, the cutoff being the rescission of Form N-SAR by Release 33-10231 (which replaced N-SAR with Form N-CEN) and therefore of the NTFNSAR submission code itself. Records are distributed inside monthly ZIP containers organized by year (e.g., 2001/2001-01.zip), with one accession folder per filing and no top-level index file inside the ZIP. The file types found in the dataset are TXT and JSON; image attachments from the original EDGAR submission are excluded.

Content Structure of a Single NTFNSAR Record

What one record represents

A single record is one Form NTFNSAR submission accepted by EDGAR, identified by its accession number and materialized on disk as a folder whose name is the 18-digit "no-dash" form of that accession (e.g., accession 0000950147-01-500011 becomes folder 000095014701500011). Each accession folder holds exactly one metadata.json describing the EDGAR submission header plus one or more .txt documents carrying the actual filing content.
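The mapping between a dashed accession number and its folder name can be sketched as follows. This is a minimal illustration, assuming the 10-2-6 digit grouping seen in the example accession above; the function names are hypothetical and not part of any published client.

```python
import re


def accession_to_folder(accession_no: str) -> str:
    """Turn a dashed accession (10-2-6 digits) into the 18-digit folder name."""
    if not re.fullmatch(r"\d{10}-\d{2}-\d{6}", accession_no):
        raise ValueError(f"not a dashed accession number: {accession_no!r}")
    return accession_no.replace("-", "")


def folder_to_accession(folder: str) -> str:
    """Rebuild the dashed accession from an 18-digit folder name."""
    if not re.fullmatch(r"\d{18}", folder):
        raise ValueError(f"not an 18-digit accession folder: {folder!r}")
    return f"{folder[:10]}-{folder[10:12]}-{folder[12:]}"


folder = accession_to_folder("0000950147-01-500011")
roundtrip = folder_to_accession(folder)
```

Validating the shape before converting keeps stray directory names from being mistaken for accessions.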

Container and file types

Decompressing a monthly ZIP yields one subfolder per accession. Inside each accession folder there are exactly two kinds of files:

  • metadata.json — a flat per-filing JSON object with EDGAR submission metadata. Always present, exactly one per accession.
  • *.txt — the NTFNSAR document(s), each wrapped in EDGAR's SGML <DOCUMENT> envelope. The primary document keeps the registrant's internal filename from the original submission (e.g., e-5991.txt) rather than a normalized name.

The "complete submission text file" (the full SGML bundle EDGAR builds for each accession, named <accession>.txt, e.g. 0000950147-01-500011.txt) is not bundled into the ZIP. It is referenced by URL in metadata.json (linkToTxt), but only the per-document .txt extracted from it is shipped locally. There is no separate HTML rendering of the form inside the container; the EDGAR filing-index page is referenced by URL via linkToHtml.
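A container can be walked without unpacking it to disk. The sketch below groups ZIP members by accession folder using only the standard library. It assumes the accession folder is the directory that directly contains each file, whether or not the archive adds a further parent directory; the helper name and the in-memory demo archive are illustrative.

```python
import io
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_accession(zip_source) -> dict:
    """Map each accession folder name to the member paths stored under it."""
    groups = defaultdict(list)
    with zipfile.ZipFile(zip_source) as zf:
        for name in zf.namelist():
            if name.endswith("/"):
                continue  # skip explicit directory entries
            parts = PurePosixPath(name).parts
            if len(parts) < 2:
                continue  # a file with no enclosing folder is not a record
            groups[parts[-2]].append(name)
    return dict(groups)


# Demo on an in-memory archive that mimics one accession folder.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr("000095014701500011/metadata.json", "{}")
    zf.writestr("000095014701500011/e-5991.txt", "<DOCUMENT>")
groups = group_by_accession(demo)
```

An archive with no members simply yields an empty mapping, so zero-record containers need no special handling.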

Per-filing metadata (metadata.json)

metadata.json is a flat JSON object that mirrors the EDGAR submission header for the accession. Every record carries the following top-level keys:

  • formType — always the string "NTFNSAR".
  • accessionNo — dashed EDGAR accession (e.g. 0000950147-01-500011); the enclosing folder name is the same value with dashes removed.
  • description — fixed boilerplate "Form NTFNSAR - [Notices of Late Filings of Form N-SAR](https://www.sec.gov/about/forms/formn-sar.pdf)".
  • filedAt — ISO-8601 timestamp of acceptance, including timezone offset.
  • periodOfReport — YYYY-MM-DD date of the N-SAR reporting period the registrant could not file on time, normally a fund fiscal half-year or year end.
  • linkToFilingDetails — URL of the primary NTFNSAR document on EDGAR (the same .txt mirrored locally in the folder).
  • linkToTxt — URL of the full SGML "complete submission text file" on EDGAR.
  • linkToHtml — URL of the EDGAR filing-index page (...-index.htm).
  • linkToXbrl — empty string for this form.
  • documentFormatFiles[] — one entry per document in the original EDGAR submission. Each entry carries sequence, size (bytes, encoded as a string), documentUrl, description, and type. The first entry is the primary NTFNSAR document (sequence "1", type "NTFNSAR"); a second entry typically describes the complete-submission text file with sequence and type set to a single space " " and description "Complete submission text file".
  • entities[] — registrant-level identifiers parsed from the EDGAR header. Each entity object carries companyName (suffixed with the EDGAR role, e.g. "(Filer)"), cik (10-digit, zero-padded), type (the form type filed by that entity, here "NTFNSAR"), fileNo (Investment Company Act file number, conventionally prefixed 811-), act ("40" for the Investment Company Act of 1940), irsNo, stateOfIncorporation, fiscalYearEnd (MMDD), sic (often "0000" for fund registrants), and filmNo. irsNo, sic, and stateOfIncorporation are optional and may be absent for some registrants; missing keys should be treated as absent rather than blank.
  • id — opaque 32-character hex record identifier.
  • dataFiles — always an empty array for this form.

formType, description, and the act value "40" are invariant across the corpus; linkToXbrl and dataFiles are always empty.
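Reading the fields described above might look like the following sketch. The key names come from the field list; the sample values (CIK, file number, sizes, period, example.invalid URLs) are placeholders invented for illustration, and summarize_metadata is a hypothetical helper.

```python
def summarize_metadata(meta: dict) -> dict:
    """Pull the primary document and the filer entity out of one metadata.json."""
    docs = meta.get("documentFormatFiles", [])
    primary = next((d for d in docs if d.get("type") == "NTFNSAR"), None)
    entities = meta.get("entities", [])
    filer = next(
        (e for e in entities if e.get("companyName", "").endswith("(Filer)")), None
    )
    return {
        "accessionNo": meta["accessionNo"],
        "folder": meta["accessionNo"].replace("-", ""),
        "periodOfReport": meta.get("periodOfReport"),
        # fall back to linkToFilingDetails when no typed entry is found
        "primaryUrl": primary["documentUrl"] if primary else meta.get("linkToFilingDetails"),
        # size is encoded as a string, so convert it explicitly
        "primarySizeBytes": int(primary["size"]) if primary else None,
        "filerCik": filer.get("cik") if filer else None,
        # optional keys may be missing entirely; .get() returns None for them
        "irsNo": filer.get("irsNo") if filer else None,
    }


# Placeholder record shaped like the description above (values are made up).
sample = {
    "formType": "NTFNSAR",
    "accessionNo": "0000950147-01-500011",
    "periodOfReport": "2000-12-31",
    "linkToFilingDetails": "https://example.invalid/e-5991.txt",
    "documentFormatFiles": [
        {"sequence": "1", "size": "5120", "type": "NTFNSAR",
         "documentUrl": "https://example.invalid/e-5991.txt", "description": "FORM 12B-25"},
        {"sequence": " ", "size": "6144", "type": " ",
         "documentUrl": "https://example.invalid/complete.txt",
         "description": "Complete submission text file"},
    ],
    "entities": [
        {"companyName": "EXAMPLE FUND INC (Filer)", "cik": "0000000000",
         "type": "NTFNSAR", "fileNo": "811-00000", "act": "40", "fiscalYearEnd": "0630"},
    ],
}
summary = summarize_metadata(sample)
```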

TXT document anatomy

Each primary .txt is a single EDGAR SGML envelope wrapping the monospaced ASCII body of Form 12b-25. The wrapper has the conventional shape:

<DOCUMENT>
<TYPE>NTFNSAR
<SEQUENCE>1
<FILENAME>e-5991.txt
<DESCRIPTION>FORM 12B-25 OF PILGRIM ADVISORY FUNDS, INC.
<TEXT>
UNITED STATES SECURITIES AND EXCHANGE COMMISSION
WASHINGTON, D.C. 20549

FORM 12b-25
NOTIFICATION OF LATE FILING
... (form body, monospaced ASCII with <PAGE> page breaks) ...
</TEXT>
</DOCUMENT>

The header tags (<TYPE>, <SEQUENCE>, <FILENAME>, <DESCRIPTION>) are EDGAR SGML metadata, not HTML; <PAGE> markers inside the body are SGML page-break tokens. The body inside <TEXT>...</TEXT> is the standard, generic Form 12b-25 template — there is no NTFNSAR-specific layout. Its blocks, in the order encountered, are:

  1. Cover header — printed SEC address (Washington, D.C. 20549), the form title "FORM 12b-25", and the subtitle "NOTIFICATION OF LATE FILING".
  2. Commission File Number — the registrant's 811- Investment Company Act file number (the same value carried in entities[0].fileNo).
  3. "Check One" checkbox group — selecting which delinquent report the notification covers (Form 10-K, 10-Q, 20-F, 11-K, N-SAR, N-CSR, etc.). For NTFNSAR records the [X] is always on Form N-SAR.
  4. "For Period Ended:" line — the fiscal period of the late N-SAR, mirrored in metadata.json as periodOfReport.
  5. Transition-report checkboxes — a small block used when the notification covers a transition report; typically all unchecked for NTFNSAR.
  6. PART I — REGISTRANT INFORMATION — full registrant name, an optional "former name" line if applicable, principal-office street address, and the city/state/zip block.
  7. PART II — RULES 12b-25(b) AND (c) — three attestation checkboxes covering (a) that the delay could not be eliminated without unreasonable effort or expense, (b) that the report will be filed within the Rule 12b-25 grace window (15 calendar days for annual or semi-annual reports such as N-SAR; 5 days for quarterly reports), and (c) whether an accountant's statement explaining the delay is attached.
  8. PART III — NARRATIVE — free-text explanation of the reason for the delay. Entries are usually terse (a sentence or two; phrasings such as "Need resolution of audit issues." are common).
  9. PART IV — OTHER INFORMATION — registrant contact name and phone, plus two yes/no questions: whether any other periodic report is also expected to be late, and whether any significant change in results of operations is anticipated relative to the comparable prior period (with space for an explanation if "yes").
  10. Signature block — registrant name, signature line, name and title of the signing officer, and date.

A <PAGE> separator typically falls between Part II and Part III to mark the page break in the printed form.
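The envelope described in this section can be separated from the form body with a few regular expressions. This is a rough sketch that assumes each header tag sits on its own line before <TEXT>, as in the wrapper shown above; parse_envelope is a hypothetical name and the shortened body is illustrative.

```python
import re

HEADER_TAGS = ("TYPE", "SEQUENCE", "FILENAME", "DESCRIPTION")


def parse_envelope(raw: str) -> dict:
    """Split one EDGAR SGML <DOCUMENT> into header fields and body pages."""
    header = raw.split("<TEXT>", 1)[0]  # only look for tags ahead of the body
    doc = {}
    for tag in HEADER_TAGS:
        match = re.search(rf"^<{tag}>(.*)$", header, flags=re.MULTILINE)
        doc[tag.lower()] = match.group(1).strip() if match else None
    body = re.search(r"<TEXT>\n?(.*?)</TEXT>", raw, flags=re.DOTALL)
    text = body.group(1) if body else ""
    # <PAGE> markers are page-break tokens; use them to split the body
    pages = re.split(r"^<PAGE>.*$", text, flags=re.MULTILINE)
    doc["pages"] = [page.strip("\n") for page in pages]
    return doc


raw = (
    "<DOCUMENT>\n<TYPE>NTFNSAR\n<SEQUENCE>1\n<FILENAME>e-5991.txt\n"
    "<DESCRIPTION>FORM 12B-25 OF PILGRIM ADVISORY FUNDS, INC.\n<TEXT>\n"
    "FORM 12b-25\nNOTIFICATION OF LATE FILING\n<PAGE>\n"
    "PART III - NARRATIVE\n</TEXT>\n</DOCUMENT>\n"
)
parsed = parse_envelope(raw)
```

Restricting the tag search to the text before <TEXT> avoids false matches on body lines that happen to start with an angle bracket.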

Included content

Per accession the bundle includes the parsed EDGAR submission header as metadata.json plus every textual document the registrant submitted in that accession as .txt, including the primary NTFNSAR document and any additional textual exhibits or attachments that were part of the original submission. Each .txt retains its original EDGAR filename and SGML envelope, so the document type, sequence number, filename, and description tags remain machine-readable.

Excluded or separate content

  • Image attachments from the original EDGAR submission are not included.
  • The full SGML "complete submission text file" (<accession>.txt) is not bundled locally; only its URL is exposed via linkToTxt.
  • The EDGAR filing-index page is not rendered locally; only its URL is exposed via linkToHtml.
  • There are no XBRL instance documents, schemas, or rendering files associated with NTFNSAR; the form has never carried structured financial data, so linkToXbrl and dataFiles are always empty.

Structural stability

Form 12b-25 has had a stable, generic layout throughout the EDGAR era, and because Form N-SAR — and with it the NTFNSAR submission code — was rescinded in June 2019, the corpus is closed. The four-Part body, the "Check One" group, the Rules 12b-25(b)/(c) attestation checkboxes, the transition-report checkboxes, and the signature block appear in the same order and with the same semantics across the entire 1995–2019 range. Records remain plain-text Form 12b-25 instances inside an EDGAR SGML <DOCUMENT> envelope from beginning to end; there is no transition to HTML and no later structural redesign to account for. The only meaningful variation across records is content-level (registrant identity, the N-SAR reporting period, the narrative reason for the delay, and which optional fields such as irsNo or the former-name line are populated), not template-level.

Interpretation and extraction notes

  • The folder name (no-dash accession) is the canonical join key between filesystem layout and metadata.json.accessionNo once dashes are removed. Document filenames inside the folder are not normalized; consumers that need a stable handle to the primary document should rely on the documentFormatFiles[] entry whose type is "NTFNSAR" (or equivalently the URL in linkToFilingDetails) rather than parsing the filename.
  • The second entry in documentFormatFiles[] describes the complete-submission text file, not a separate exhibit; its sequence and type are a single-space placeholder (" ") and its documentUrl points to the EDGAR-hosted SGML bundle that is not present locally.
  • Bulk filings by fund families are common: a single filer-services CIK frequently submits many NTFNSARs on the same day for related registrants in a fund complex, all sharing the same periodOfReport and an identical or near-identical Part III narrative. Treat the accession, not the calendar day or the narrative text, as the unit of observation.
  • entities[] may include multiple objects when more than one registrant or related entity appears in the EDGAR header; the form-level filer is conventionally the entity whose companyName is suffixed "(Filer)". Optional entity fields (irsNo, sic, stateOfIncorporation) should be treated as absent rather than blank when missing.
  • The <PAGE> markers inside the body are SGML artifacts and should be stripped (or used as page-break hints) by extractors that operate on the body text. The four PART headers, the "Check One" group, the Rules 12b-25(b)/(c) attestation checkboxes, the "For Period Ended:" line, and the PART IV yes/no block are reliable textual anchors because they are consistent across the corpus.
  • All substantive content beyond the metadata header lives in the free-text body of the .txt document — particularly the Part I address block, the Part III narrative, and the Part IV contact and yes/no answers; the body itself is unstructured ASCII and any field-level extraction must be performed on the text.
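As a starting point for field-level extraction, the textual anchors listed above can drive a simple regex pass. The sketch below assumes the PART III and PART IV headers each start a line and that the period follows "For Period Ended:" on the same or the next line; real filings vary in spacing and punctuation, so treat the patterns as a first approximation. The function name and sample text are illustrative, and the date is made up.

```python
import re


def extract_fields(body: str) -> dict:
    """Pull the period line and the Part III narrative out of a form body."""
    text = re.sub(r"^<PAGE>.*$", "", body, flags=re.MULTILINE)  # drop page breaks
    period = re.search(r"For Period Ended:\s*(.+)", text, flags=re.IGNORECASE)
    part3 = re.search(
        r"^\s*PART\s+III\b[^\n]*\n(.*?)(?=^\s*PART\s+IV\b)",
        text,
        flags=re.IGNORECASE | re.DOTALL | re.MULTILINE,
    )
    return {
        "period_ended": period.group(1).strip() if period else None,
        # collapse hard-wrapped ASCII lines into a single string
        "narrative": " ".join(part3.group(1).split()) if part3 else None,
    }


body = (
    "For Period Ended: December 31, 2000\n\n"
    "PART II - RULES 12b-25(b) AND (c)\n[X] (a) ...\n<PAGE>\n"
    "PART III - NARRATIVE\n\nNeed resolution of audit issues.\n\n"
    "PART IV - OTHER INFORMATION\n(1) Name and telephone number ...\n"
)
fields = extract_fields(body)
```

On real documents the captured narrative may include the printed instruction text under the Part III header, which a second pass would need to trim.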

Who Files or Publishes This Dataset, and When

Who files

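The NTFNSAR population mirrors the N-SAR filer population. Under Section 30 of the Investment Company Act of 1940 and the rules thereunder, N-SAR was filed by:

  • Registered management investment companies: open-end mutual funds, closed-end funds, and small business investment companies.
  • Registered unit investment trusts.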

The actual EDGAR submitter is often a fund administrator or third-party filer agent rather than the fund itself. NTFNSAR filings therefore commonly appear in same-day clusters across many CIKs in one fund complex when a shared service provider, auditor, or sub-adviser causes the delay.

Operating-company late notifications (NT 10-K, NT 10-Q, NT 20-F) are not in this dataset. Business development companies file 10-K/10-Q and use NT 10-K/NT 10-Q, so they fall outside NTFNSAR despite being 1940 Act entities.

Regulatory framework

NTFNSAR sits at the intersection of two regimes:

  • The substantive obligation to file N-SAR came from Section 30 of the Investment Company Act of 1940 and Rules 30a-1 (annual reports) and 30b1-1 (semi-annual reports).
  • The late-filing relief mechanism is Rule 12b-25 under the Securities Exchange Act of 1934. Although N-SAR is a 1940 Act report, the SEC applied the Form 12b-25 framework to it. EDGAR routes 12b-25 notifications relating to N-SAR under the dedicated submission type NTFNSAR so they are searchable separately.

The notification represents that the report could not be filed by its due date without unreasonable effort or expense, states the reasons, and asserts a good-faith expectation of filing within the grace period.

Trigger and timing

The trigger is event-driven: the registrant determines it cannot file N-SAR by the due date set under Rule 30a-1 or 30b1-1. Rule 12b-25(a) then dictates the deadline structure:

  • The notification must be filed no later than one business day after the original due date of the N-SAR report.
  • If timely filed with the required good-faith representation, the N-SAR is treated as timely when filed within 15 calendar days after the original due date. This window applies to both annual and semi-annual N-SAR reports; the shorter 5-calendar-day window in Rule 12b-25 applies to quarterly reports such as Form 10-Q.

Missing the one-business-day window forecloses the relief; an NTFNSAR submitted after that point cannot cure the lateness. Once accepted, NTFNSAR submissions are publicly disseminated immediately, with no staff review gate.
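The deadline arithmetic can be illustrated with a small date calculation. This sketch treats a business day as any weekday and ignores federal holidays and EDGAR closures, which a production calendar would need to handle. The grace window is a parameter (defaulting to 15 calendar days), the function names are hypothetical, and the due date is an arbitrary example.

```python
from datetime import date, timedelta


def next_business_day(d: date) -> date:
    """First weekday strictly after d (weekends skipped, holidays ignored)."""
    d += timedelta(days=1)
    while d.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        d += timedelta(days=1)
    return d


def rule_12b25_dates(due: date, grace_days: int = 15) -> dict:
    """Latest notification date and extended deadline for one late report."""
    return {
        "notification_deadline": next_business_day(due),
        "extended_deadline": due + timedelta(days=grace_days),
    }


# Example: a report originally due on Friday, 2 March 2001.
dates = rule_12b25_dates(date(2001, 3, 2))
```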

Common stated reasons include incomplete audits, late data from sub-advisers or sub-custodians, accounting system migrations, fund reorganizations, or service-provider transitions. Because filing is conditional on an actual late event, the dataset is sparse relative to the underlying N-SAR population.

Coverage window

The dataset spans April 1995 through June 2019:

  • 1995 start: earliest NTFNSAR submissions coincide with the SEC's phased rollout of mandatory EDGAR filing for investment companies. Pre-EDGAR paper notifications under the same 12b-25 framework are not included.
  • 2019 end: in the Investment Company Reporting Modernization rulemaking (Release No. 33-10231, adopted October 2016, with phased compliance through 2018-2019), the SEC rescinded Form N-SAR and replaced it with Form N-CEN, an annual census report. With N-SAR rescinded, NTFNSAR was retired. Late notifications relating to N-CEN are filed under separate EDGAR submission types and are not in this dataset.

Important distinctions

  • Registrant vs. series. The record is keyed to the registrant CIK; a single NTFNSAR can cover an N-SAR that, when filed, will report on dozens of series.
  • Filer agent vs. registrant. The CIK is the registrant's, but the EDGAR submitter is often a fund administrator. Bulk same-day clusters typically reflect this.
  • Amendments. NTFNSAR/A is the amendment sub-type, used to update the stated reason for delay or expected filing date. An amendment does not extend the 15-day grace period.
  • Notification only. NTFNSAR contains no N-SAR financial or operational data; the substantive N-SAR is a separate filing event.
  • Not for operating companies or BDCs. 10-K/10-Q/20-F filers — including BDCs and asset-backed issuers — use NT 10-K, NT 10-Q, or NT 20-F.
  • Closed dataset. Post-June 2019 late filings relating to investment-company census reports use N-CEN-specific late-filing submission types, not NTFNSAR.

How This Dataset Differs From Similar Datasets or Filings

NTFNSAR belongs to the small family of Rule 12b-25 late-filing notifications. Its closest neighbors are not investment-company disclosures in general but the other "NT-" forms that share the same 12b-25 mechanism while attaching to different underlying late reports. Comparisons below are ordered by closeness.

Form NT-NCEN — the temporal successor

Same Rule 12b-25 mechanism, same form architecture, but points at Form N-CEN, the structured XML annual census report that replaced N-SAR effective June 1, 2018. NTFNSAR and NT-NCEN are temporally exclusive: NTFNSAR closes in June 2019, and NT-NCEN takes over from the transition onward. A continuous time series of "registered fund late annual filings" requires concatenating the two at the 2019 boundary; neither alone covers the full history.

Form NT-NCSR — the most-confused contemporary

Rule 12b-25 notification for late filing of Form N-CSR, the certified shareholder report under Rule 30e-1 / 30b2-1. N-CSR carries semi-annual financial statements, MD&A-style discussion, and Sarbanes-Oxley certifications sent to shareholders; N-SAR carried operational and structural fund data filed only with the SEC. NT-NCSR (introduced 2003) co-existed with NTFNSAR for most of the window, and the same fund could file both in the same year for different overdue reports. The form faces look nearly identical — the only reliable discriminator is the underlying report identified inside the filing.

Form NT 10-K and Form NT 10-Q — same mechanism, different filer universe

Rule 12b-25 notifications for late annual (10-K) and quarterly (10-Q) reports under the Exchange Act. Identical 12b-25 architecture, but the filer population is public operating companies rather than registered investment companies, and the underlying reports are Exchange Act periodic reports rather than Investment Company Act reports. Use NT 10-K / NT 10-Q for corporate-issuer late-filing analysis; use NTFNSAR only for late N-SAR filings by funds.

Adjacent reports (not 12b-25 notifications)

Form N-SAR — the underlying substantive report

The report whose lateness triggers an NTFNSAR. N-SAR contains substantive fund operations data: identifiers, service providers, sales loads, expense ratios, portfolio turnover, structural disclosures. NTFNSAR contains none of this — only the 12b-25 narrative (identification of the missed report, reason for delay, expected filing date). The two should be linked on registrant CIK and reporting period, never substituted.

Form N-CEN — the post-2019 substantive successor

Replaced N-SAR as the annual census report starting June 1, 2018; structured (XML) and broader in scope. N-CEN is the substantive report itself, not a 12b-25 notification; its late-filing counterpart is NT-NCEN, not NTFNSAR. For any reporting period after the N-SAR sunset, NTFNSAR is unavailable by construction.

Same dataset, different filing event

NTFNSAR/A — amendments

Amendments to a previously submitted NTFNSAR, typically correcting the stated reason for delay, the expected filing date, or registrant identifiers. They share form-type lineage with NTFNSAR but carry an /A suffix and reference the original accession. Useful for tracking revised expected-filing dates, but should not be double-counted when counting unique late-filing events.

Boundary summary

NTFNSAR is defined narrowly along three axes that together exclude every neighbor:

  1. Mechanism: Rule 12b-25 notification (excludes N-SAR and N-CEN themselves, which are substantive reports).
  2. Underlying late report: Form N-SAR specifically (excludes NT-NCSR, NT-NCEN, NT 10-K, NT 10-Q).
  3. Time window: April 1995 to June 2019, closed (excludes any post-sunset late filings, which appear on NT-NCEN).

Critically, NTFNSAR is procedural, not substantive. The records do not describe what a fund is or how it performed; they describe why its statutorily required N-SAR is going to be late and when it is expected. Actual fund operations data lives in the eventual N-SAR filing, not in the NTFNSAR. Researchers wanting substantive fund data must pull N-SAR (or N-CEN) and use NTFNSAR only as a timing or compliance overlay; researchers building a complete 12b-25 corpus must combine NTFNSAR with NT 10-K, NT 10-Q, NT-NCSR, and NT-NCEN, since the identity of the underlying late report is what defines each dataset.

Who Uses This Dataset

Workflows on the NTFNSAR corpus are retrospective. Users key on a small set of fields: registrant CIK, file number, the underlying N-SAR period of report, the stated reason for delay, and the represented expected filing date.

Fund-compliance officers

In-house compliance staff at fund complexes pull every NTFNSAR filed under their CIKs to reconstruct historical late-filing behavior for examination prep, board reports, and 38a-1 program documentation. The reason text characterizes each event; peer-family pulls show whether a delay was idiosyncratic or industry-wide (audit-firm transitions, sub-adviser changes, system conversions).

Securities lawyers and regulatory counsel

External counsel advising investment companies treat the corpus as a Rule 12b-25 precedent library for late-N-SAR practice. They mine the reason narratives across the records to catalog accepted framings (auditor delays, sub-adviser transitions, accounting conversions, custody reconciliations, fund mergers) and study NTFNSAR/A amendments to see how estimated filing dates were revised. The patterns inform drafting of current NT-NCEN and NT-NCSR narratives.

Academic researchers in finance and accounting

Researchers in fund governance, audit quality, and disclosure use NTFNSAR as a clean event flag over a complete 1995–2019 panel. CIK joins to fund-level holdings and return databases; period-of-report aligns event windows; reason text supports hand-coded category labels. Typical tests: whether late filings predict liquidations, board turnover, adviser replacement, audit-firm change, or restatement. The bounded universe suits cross-sectional and longitudinal designs with stable denominators.

SEC and regulatory analysts

Analysts studying the N-SAR era as a baseline for the current N-CEN regime compute base rates: filings per year, distribution of stated reasons, share of filers that landed within the 15-day grace period versus those that slipped further, and repeat-filer concentration by fund family. Outputs feed internal memos comparing late-filing populations across the two regimes.

Fund administrators and EDGAR filing agents

Service providers that prepared NTFNSAR filings on behalf of fund clients reconcile internal submission logs against the EDGAR record using accession numbers, CIKs, and timestamps. The reconciliation supports client audit responses, SOC 1 control evidence, and historical service inquiries.

Data journalists and financial historians

Reporters covering the asset-management industry use NTFNSAR clusters to map operational stress — post-dot-com, the 2008–2009 crisis, the 2015–2016 commodity-fund period. Registrant name and CIK identify affected fund families; reason narratives describe the failure mode; period-of-report anchors each event to a reporting cycle.

Quantitative researchers

Quant teams building fund-quality, operational-risk, or adviser-due-diligence models use NTFNSAR as a sparse feature: a CIK-month flag for a late-filing event, expected-filing-date for window construction, and family-level filing counts as a complex-level risk indicator. Used in backtests and as a supervised label, not in live signals.

NLP corpus builders

Teams training domain models and RAG systems for U.S. securities regulation use NTFNSAR as a tightly scoped corpus: one rule (12b-25), one underlying form (N-SAR), one outcome (reason plus expected date). It supports reason-category classifiers, field-extraction models for registrant and date parsing, and reproducible RAG benchmarks. The closed, finite population is attractive because evaluation sets do not drift.

Specific Use Cases

The records are small, structurally uniform, and bounded, so most workflows treat the corpus as a closed event table joined to outside data on CIK, file number, and periodOfReport. The use cases below describe specific analyses the dataset directly supports.

1. Promised-vs-actual delay reconstruction

A fund-governance researcher pairs each NTFNSAR with the eventual N-SAR for the same CIK and periodOfReport. From the NTFNSAR they extract the Part II expected-filing date and from the matched N-SAR the EDGAR filedAt. The output is a per-event delta (days promised under Rule 12b-25 vs. days actually elapsed), aggregated to test whether 15-day grace-period representations were honored, slipped, or were followed by additional NTFNSAR/A amendments revising the date.
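The pairing step might be sketched as below. It assumes the expected date has already been extracted from each notification body into a hypothetical expectedDate field (it is not a metadata.json key), and that the matched N-SAR records carry cik, periodOfReport, and an ISO-8601 filedAt. All sample values are invented.

```python
from datetime import date, datetime


def measure_slippage(notices: list, reports: list) -> list:
    """Pair notifications with N-SAR filings and compute days past the promised date."""
    by_key = {(r["cik"], r["periodOfReport"]): r for r in reports}
    results = []
    for n in notices:
        key = (n["cik"], n["periodOfReport"])
        report = by_key.get(key)
        slip = None  # stays None when no matching N-SAR was found
        if report is not None:
            filed = datetime.fromisoformat(report["filedAt"]).date()
            slip = (filed - date.fromisoformat(n["expectedDate"])).days
        results.append({"cik": key[0], "periodOfReport": key[1], "slip_days": slip})
    return results


notices = [
    {"cik": "0000000001", "periodOfReport": "2000-12-31", "expectedDate": "2001-03-17"},
    {"cik": "0000000002", "periodOfReport": "2000-12-31", "expectedDate": "2001-03-17"},
]
reports = [
    {"cik": "0000000001", "periodOfReport": "2000-12-31",
     "filedAt": "2001-03-20T16:05:00-05:00"},
]
slippage = measure_slippage(notices, reports)
```

A positive slip_days means the report arrived after the promised date, zero or negative means the representation was honored, and None flags an unmatched notification.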

2. Late-filing reason classification corpus for an LLM fine-tune

An NLP team treats the Part III narratives as a tight supervised corpus for regulatory English. Each record yields a (registrant, period, reason-text) triple; hand-coded labels cover audit-firm transitions, sub-adviser changes, accounting/system conversions, custody reconciliations, and fund-merger work. The fine-tuned classifier is reused on the larger NT-NCEN and NT-NCSR populations, where the same taxonomy of delay reasons recurs.

3. Fund-family compliance benchmark

A compliance officer at a fund complex pulls every NTFNSAR whose entities[] carries one of the family's CIKs, then pulls peer families' filings for comparison. Counts per fiscal year, distribution of periodOfReport half-years, and reason-narrative categories form a one-page board exhibit showing whether the family's late-filing footprint is in line with peers or an outlier — useful for 38a-1 program reviews and SEC examination prep.

4. Stress-window clustering across the 1995–2019 panel

A data journalist or quant researcher buckets accessions by filedAt quarter and overlays the counts on macro stress windows: dot-com aftermath (2001–2002), the global financial crisis (2008–2009), and the 2015–2016 commodity-fund stress. Registrant names and CIKs identify which complexes drove each spike, and Part III narratives describe the failure mode (auditor resignation, valuation issues, adviser change), producing a chronology of operational stress in the registered-fund industry.
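Bucketing by quarter needs only the filedAt timestamps. The sketch below assumes filedAt values carry a numeric timezone offset, as described for metadata.json, and uses the local calendar date of each timestamp; the helper name and sample timestamps are illustrative.

```python
from collections import Counter
from datetime import datetime


def quarter_counts(filed_at_values) -> dict:
    """Count filings per calendar quarter, keyed as 'YYYY-Qn'."""
    counts = Counter()
    for ts in filed_at_values:
        d = datetime.fromisoformat(ts)
        counts[f"{d.year}-Q{(d.month - 1) // 3 + 1}"] += 1
    return dict(sorted(counts.items()))


counts = quarter_counts([
    "1999-08-30T12:00:00-04:00",
    "1999-08-31T09:00:00-04:00",
    "2001-01-29T16:00:00-05:00",
])
```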

5. Securities-counsel precedent search for Rule 12b-25 narratives

External counsel preparing an NT-NCEN or NT-NCSR for a current client searches the NTFNSAR Part III narratives for analogous fact patterns — e.g., "auditor resignation mid-period," "fund reorganization pending," "sub-adviser transition." Because Form 12b-25's architecture is unchanged across NT- forms, accepted N-SAR-era framings serve as drafting precedent for current filings. CIK and accessionNo give each precedent a citable EDGAR handle.

6. Reconstructing the operational history of a defunct fund complex

For due diligence on a successor adviser or in litigation involving a wound-down complex, an analyst pulls every NTFNSAR whose filer CIK or 811- file number maps to the target family. Sequenced by filedAt and periodOfReport, the records produce a timeline of reporting failures, the reasons given, and the registrants affected — often the only public artifact of late-stage operational distress for funds that left no surviving N-SAR for the final period.

7. Bulk-filing detection inside fund families

A researcher counting unique late-filing events groups accessions by (filer CIK, filedAt date, periodOfReport) to identify same-day batch submissions across related registrants in a fund complex. Near-identical Part III text across the batch confirms the cluster. The deduplicated event count — accession-level, not narrative-level — is the correct denominator for any base-rate or hazard-model analysis.
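One way to surface such batches is sketched below. It treats the first ten digits of the accession number as the submitter identifier, which is an assumption about how the submitting agent can be recognized, and it takes the date portion of filedAt as the filing day. The function name and sample records are hypothetical.

```python
from collections import defaultdict


def find_batches(records: list, min_size: int = 2) -> dict:
    """Group accessions by (submitter, filing day, period) and keep the clusters."""
    groups = defaultdict(list)
    for meta in records:
        submitter = meta["accessionNo"].split("-")[0]  # assumed submitter prefix
        key = (submitter, meta["filedAt"][:10], meta.get("periodOfReport"))
        groups[key].append(meta["accessionNo"])
    return {k: sorted(v) for k, v in groups.items() if len(v) >= min_size}


records = [
    {"accessionNo": "0000950147-01-500011", "filedAt": "2001-01-29T00:00:00-05:00",
     "periodOfReport": "2000-11-30"},
    {"accessionNo": "0000950147-01-500012", "filedAt": "2001-01-29T00:00:00-05:00",
     "periodOfReport": "2000-11-30"},
    {"accessionNo": "0000000000-01-000001", "filedAt": "2001-01-29T00:00:00-05:00",
     "periodOfReport": "2000-11-30"},
]
batches = find_batches(records)
```

Each accession stays a separate observation; the grouping only labels which observations arrived together.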

8. Linking NTFNSAR flags to downstream fund outcomes

A quant or academic researcher builds a CIK-month event flag from each record's CIK and periodOfReport and joins it to CRSP fund returns, N-SAR or N-CEN holdings, and form-history tables. The test asks whether a late-filing event predicts subsequent liquidation, board turnover, adviser replacement, auditor change, or N-SAR/A restatement. The closed 1995–2019 panel keeps the denominator stable across specifications.

Dataset Access

The Form NTFNSAR Files Dataset is published as a dataset index JSON document, a single full-archive ZIP, and a set of per-period container ZIPs. Because the dataset is small, most users will simply download the full archive.

Dataset Index JSON API: https://api.sec-api.io/datasets/form-ntfnsar-files.json

Returns dataset-level metadata (name, description, last update timestamp, earliest sample date, total records and total size, form types covered, container format, and file types) along with the full-archive download URL and the list of individual container files. Each container entry includes its key, size, record count, last updated timestamp, and a download URL. Polling this endpoint is the recommended way to detect which containers changed in the most recent refresh and to decide which files to re-download. This endpoint does not require an API key.

Example response:

{
  "datasetId": "1f13365b-9ae0-69fd-aa78-b5a820da3ad5",
  "datasetDownloadUrl": "https://api.sec-api.io/datasets/form-ntfnsar-files.zip",
  "name": "Form NTFNSAR Files Dataset",
  "updatedAt": "2026-04-15T18:18:39.195Z",
  "earliestSampleDate": "1995-04-01",
  "totalRecords": 146,
  "totalSize": 428455,
  "formTypes": ["NTFNSAR"],
  "containerFormat": "ZIP",
  "fileTypes": ["TXT", "JSON"],
  "containers": [
    {
      "downloadUrl": "https://api.sec-api.io/datasets/form-ntfnsar-files/2019/2019-06.zip",
      "key": "2019/2019-06.zip",
      "size": 12483,
      "records": 4,
      "updatedAt": "2026-04-15T18:18:39.195Z"
    }
  ]
}
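One way to act on the index is sketched below in Python: fetch the JSON, then compare each container's updatedAt against the value recorded on the previous poll. The `last_seen` mapping and how it is persisted are assumptions; the field names come from the response shown for this endpoint.

```python
import json
import urllib.request

INDEX_URL = "https://api.sec-api.io/datasets/form-ntfnsar-files.json"

def fetch_index(url=INDEX_URL):
    """Download and parse the dataset index (no API key needed for this endpoint)."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def changed_containers(index, last_seen):
    """Return containers that are new or whose updatedAt differs from last_seen.

    last_seen maps container key -> updatedAt string from the previous poll.
    """
    return [c for c in index["containers"]
            if last_seen.get(c["key"]) != c["updatedAt"]]
```

After downloading the changed containers, the caller would store each key and updatedAt pair for the next comparison.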

Download Entire Dataset: https://api.sec-api.io/datasets/form-ntfnsar-files.zip?token=YOUR_API_KEY

Downloads every Form NTFNSAR filing from April 1995 through June 2019 as a single ZIP archive. This endpoint requires an API key. Given the small total size, this is the simplest way to obtain the full corpus.

curl -O "https://api.sec-api.io/datasets/form-ntfnsar-files.zip?token=YOUR_API_KEY"

Download Single Container: https://api.sec-api.io/datasets/form-ntfnsar-files/2019/2019-06.zip

Downloads one monthly container ZIP instead of the full archive. Each container URL is taken from the containers[].downloadUrl field in the dataset index. This endpoint requires an API key passed via an Authorization: Bearer header.

curl -H "Authorization: Bearer YOUR_API_KEY" \
  -O "https://api.sec-api.io/datasets/form-ntfnsar-files/2019/2019-06.zip"
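The same request can be sketched in Python with the standard library. The helper names `container_request` and `download_container` are illustrative, and the header format simply mirrors the curl command above.

```python
import shutil
import urllib.request

def container_request(download_url, api_key):
    """Build a request carrying the API key as a Bearer token."""
    return urllib.request.Request(
        download_url, headers={"Authorization": f"Bearer {api_key}"}
    )

def download_container(download_url, api_key, dest_path):
    """Stream one container ZIP to dest_path."""
    req = container_request(download_url, api_key)
    with urllib.request.urlopen(req) as resp, open(dest_path, "wb") as out:
        shutil.copyfileobj(resp, out)
```

A call would look like download_container(container["downloadUrl"], "YOUR_API_KEY", "2019-06.zip"), with the URL taken from the dataset index.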

Frequently Asked Questions

What form does this dataset cover?

The dataset covers Form NTFNSAR, the EDGAR submission type used when a registered investment company files Form 12b-25 ("Notification of Late Filing") to report that it cannot timely file its Form N-SAR semi-annual or annual report. Each filing is the generic Form 12b-25 template with the "Form N-SAR" checkbox marked.

What does one record in this dataset represent?

One record is a single NTFNSAR submission accepted by EDGAR, identified by accession number and stored as a folder containing exactly one metadata.json (the EDGAR submission header) and one or more .txt documents wrapped in an EDGAR SGML <DOCUMENT> envelope. The folder name is the 18-digit no-dash form of the accession number.
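Converting a folder name back to the dashed accession number is a matter of splitting the 18 digits into the standard 10-2-6 groups. A short Python sketch follows; the function name is illustrative.

```python
import re

def dashed_accession(folder_name):
    """Convert an 18-digit folder name to the dashed EDGAR accession number."""
    if not re.fullmatch(r"\d{18}", folder_name):
        raise ValueError(f"not an 18-digit accession folder: {folder_name!r}")
    # Filer-agent ID (10 digits), two-digit year, six-digit sequence number.
    return f"{folder_name[:10]}-{folder_name[10:12]}-{folder_name[12:]}"
```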

Who is required to file Form NTFNSAR?

NTFNSAR is filed by the same population that filed Form N-SAR: registered management investment companies (open-end mutual funds, closed-end funds, and small business investment companies under Rules 30a-1 and 30b1-1) and registered unit investment trusts. Operating companies and business development companies do not file NTFNSAR — they use NT 10-K or NT 10-Q for late Exchange Act reports.

What time period does the dataset cover, and why is it closed?

The dataset spans April 1995 through June 2019. The 1995 start coincides with the SEC's phased EDGAR rollout for investment companies; the 2019 end reflects the SEC's rescission of Form N-SAR under Release 33-10231, which replaced N-SAR with Form N-CEN and retired the NTFNSAR submission code. Late-filing notifications relating to N-CEN are filed under NT-NCEN, not NTFNSAR.

How does NTFNSAR differ from NT-NCSR and NT-NCEN?

All three are Rule 12b-25 late-filing notifications and share the same Form 12b-25 architecture, but they attach to different underlying reports. NTFNSAR points at Form N-SAR (operational/structural fund data), NT-NCSR points at Form N-CSR (certified shareholder reports with financial statements), and NT-NCEN points at Form N-CEN (the structured XML annual census report that replaced N-SAR). The only reliable discriminator across the three is the underlying late report identified inside each filing.

What file format is the dataset distributed in?

The dataset is distributed as ZIP containers organized by year and month (e.g., 2019/2019-06.zip), with one accession folder per filing inside each ZIP. The file types inside the containers are TXT (the NTFNSAR documents in EDGAR SGML envelopes) and JSON (the per-filing metadata.json submission header). Image attachments and the full SGML "complete submission text file" are not bundled locally; only their EDGAR URLs are exposed in metadata.json.
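A container can be read without extracting it to disk. The Python sketch below assumes that each file's immediate parent directory inside the ZIP is the accession folder; any further nesting above that level is ignored. The function name and the returned structure are illustrative.

```python
import json
import zipfile
from collections import defaultdict

def read_container(zip_source):
    """Return {accession_folder: {"metadata": dict or None, "documents": {name: text}}}."""
    filings = defaultdict(lambda: {"metadata": None, "documents": {}})
    with zipfile.ZipFile(zip_source) as zf:
        for name in zf.namelist():
            parts = name.split("/")
            # Skip directory entries and files that sit outside any folder.
            if name.endswith("/") or len(parts) < 2:
                continue
            folder, filename = parts[-2], parts[-1]
            if filename == "metadata.json":
                filings[folder]["metadata"] = json.loads(zf.read(name))
            elif filename.lower().endswith(".txt"):
                text = zf.read(name).decode("utf-8", errors="replace")
                filings[folder]["documents"][filename] = text
    return dict(filings)
```

`zip_source` may be a path or a file-like object, so the same function works on a downloaded file or an in-memory buffer.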

Does NTFNSAR contain the actual fund operations data from N-SAR?

No. NTFNSAR is procedural, not substantive: each filing identifies the missed N-SAR report, states a reason for the delay, and represents an expected filing date within the Rule 12b-25 extension period. That period is 15 calendar days for both annual and semi-annual N-SAR reports; the 5-day period in Rule 12b-25 applies to quarterly reports such as Form 10-Q. The substantive fund operations data (service providers, sales loads, expense ratios, portfolio turnover) lives in the eventual N-SAR filing itself, which must be linked to the NTFNSAR on registrant CIK and periodOfReport.
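The link on registrant CIK and periodOfReport can be sketched in Python as a keyed lookup. The N-SAR records are not part of this dataset and are assumed to come from a separate source; the `cik` key name and the flat record layout are also assumptions.

```python
def link_to_nsar(ntf_records, nsar_records):
    """Pair each NTFNSAR record with N-SAR records sharing its CIK and period."""
    by_key = {}
    for n in nsar_records:
        by_key.setdefault((int(n["cik"]), n["periodOfReport"]), []).append(n)
    # An empty match list means no N-SAR was found for that CIK and period.
    return [
        (r, by_key.get((int(r["cik"]), r["periodOfReport"]), []))
        for r in ntf_records
    ]
```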