Form N-CEN Files Dataset

The Form N-CEN Files dataset is a structured collection of annual census reports filed by registered investment companies on EDGAR under Rule 30a-1 of the Investment Company Act of 1940. Each record packages a single Form N-CEN filing (or its amendment, Form N-CEN/A) as a self-contained accession-number folder containing the canonical XML payload, a normalized metadata header, an EDGAR-rendered XHTML view, and any attachments such as auditor Internal Control Reports and Other Required Information exhibits. The filers are open-end mutual funds, closed-end funds, exchange-traded funds, unit investment trusts, insurance company separate accounts, and registered small business investment companies, each submitting through its officers within 75 days after the registrant's fiscal year end (or calendar year end for UITs). The dataset begins with the first wave of N-CEN filings in September 2018, a few months after Form N-CEN replaced the legacy semi-annual Form N-SAR in June 2018, and continues to the present in monthly containers. It is the canonical structured source for fund identity, organizational form, board composition, service-provider relationships, securities-lending and credit-line activity, exemptive-rule reliance, and lifecycle events across the registered investment-company population.

Update Frequency: Daily
Updated at: 2026-05-16
Earliest Sample Date: 2018-09-01
Total Size: 2.7 GB
Total Records: 90,391
Container Format: ZIP
Content Types: XML, HTML, JSON, TXT, PDF
Form Types: N-CEN, N-CEN/A

Dataset APIs

Programmatically retrieve the full list of dataset archive files, download URLs and dataset metadata.

Dataset Index JSON API

Download the entire dataset as a single archive file.

Download Entire Dataset:

Download a single container file (e.g. monthly archive) from the dataset.

Download Single Container:

Dataset Files

93 files · 2.7 GB
2026-05.zip · 14.9 MB · 469 records
2026-04.zip · 8.2 MB · 241 records
2026-03.zip · 91.6 MB · 3,545 records
2026-02.zip · 13.3 MB · 723 records
2026-01.zip · 55.2 MB · 1,276 records
2025-12.zip · 23.9 MB · 901 records
2025-11.zip · 29.4 MB · 606 records
2025-10.zip · 21.5 MB · 575 records
2025-09.zip · 15.9 MB · 561 records
2025-08.zip · 17.2 MB · 498 records
2025-07.zip · 18.2 MB · 412 records
2025-06.zip · 30.9 MB · 907 records
2025-05.zip · 28.1 MB · 446 records
2025-04.zip · 8.5 MB · 242 records
2025-03.zip · 85.8 MB · 3,518 records
2025-02.zip · 15.9 MB · 890 records
2025-01.zip · 54.1 MB · 1,391 records
2024-12.zip · 25.2 MB · 880 records
2024-11.zip · 28.3 MB · 630 records
2024-10.zip · 23.1 MB · 626 records
2024-09.zip · 15.8 MB · 530 records
2024-08.zip · 16.6 MB · 497 records
2024-07.zip · 15.5 MB · 378 records
2024-06.zip · 33.1 MB · 858 records
2024-05.zip · 16.2 MB · 423 records
2024-04.zip · 8.9 MB · 260 records
2024-03.zip · 85.0 MB · 3,578 records
2024-02.zip · 23.6 MB · 1,075 records
2024-01.zip · 53.6 MB · 1,365 records
2023-12.zip · 25.1 MB · 907 records
2023-11.zip · 29.3 MB · 647 records
2023-10.zip · 25.9 MB · 1,193 records
2023-09.zip · 16.1 MB · 642 records
2023-08.zip · 15.7 MB · 470 records
2023-07.zip · 15.4 MB · 383 records
2023-06.zip · 33.0 MB · 880 records
2023-05.zip · 28.0 MB · 583 records
2023-04.zip · 4.4 MB · 188 records
2023-03.zip · 85.8 MB · 3,727 records
2023-02.zip · 15.2 MB · 805 records
2023-01.zip · 49.3 MB · 1,351 records
2022-12.zip · 23.5 MB · 890 records
2022-11.zip · 26.9 MB · 630 records
2022-10.zip · 20.3 MB · 618 records
2022-09.zip · 14.2 MB · 579 records
2022-08.zip · 14.6 MB · 476 records
2022-07.zip · 20.4 MB · 496 records
2022-06.zip · 29.0 MB · 884 records
2022-05.zip · 15.4 MB · 462 records
2022-04.zip · 7.5 MB · 253 records
2022-03.zip · 84.6 MB · 3,640 records
2022-02.zip · 13.8 MB · 764 records
2022-01.zip · 50.9 MB · 1,503 records
2021-12.zip · 21.5 MB · 836 records
2021-11.zip · 27.6 MB · 686 records
2021-10.zip · 20.1 MB · 547 records
2021-09.zip · 13.2 MB · 548 records
2021-08.zip · 15.1 MB · 513 records
2021-07.zip · 16.5 MB · 456 records
2021-06.zip · 25.3 MB · 749 records
2021-05.zip · 17.2 MB · 487 records
2021-04.zip · 7.5 MB · 259 records
2021-03.zip · 84.1 MB · 3,705 records
2021-02.zip · 15.0 MB · 789 records
2021-01.zip · 48.2 MB · 1,446 records
2020-12.zip · 22.2 MB · 848 records
2020-11.zip · 24.4 MB · 661 records
2020-10.zip · 20.4 MB · 592 records
2020-09.zip · 13.7 MB · 542 records
2020-08.zip · 14.5 MB · 505 records
2020-07.zip · 20.4 MB · 690 records
2020-06.zip · 33.4 MB · 922 records
2020-05.zip · 17.6 MB · 514 records
2020-04.zip · 7.8 MB · 275 records
2020-03.zip · 84.8 MB · 3,778 records
2020-02.zip · 20.3 MB · 836 records
2020-01.zip · 48.9 MB · 1,525 records
2019-12.zip · 26.1 MB · 926 records
2019-11.zip · 27.5 MB · 739 records
2019-10.zip · 31.7 MB · 590 records
2019-09.zip · 17.0 MB · 668 records
2019-08.zip · 15.8 MB · 534 records
2019-07.zip · 58.9 MB · 1,000 records
2019-06.zip · 26.8 MB · 941 records
2019-05.zip · 27.1 MB · 625 records
2019-04.zip · 16.0 MB · 593 records
2019-03.zip · 87.9 MB · 4,058 records
2019-02.zip · 13.8 MB · 1,002 records
2019-01.zip · 46.2 MB · 1,841 records
2018-12.zip · 20.3 MB · 925 records
2018-11.zip · 23.5 MB · 774 records
2018-10.zip · 14.2 MB · 630 records
2018-09.zip · 17.7 MB · 564 records

What This Dataset Contains

The dataset is built from Form N-CEN, the structured XML annual census report adopted by the SEC's October 2016 Investment Company Reporting Modernization rule (Release IC-32314) and effective June 1, 2018. Form N-CEN replaced the legacy semi-annual Form N-SAR — a fixed-field text questionnaire designed for the ASCII EDGAR era — with an annual XML report built on a published schema with named elements, validation rules, and a dedicated rendering stylesheet. Every registered management investment company and every registered unit investment trust is in scope; the dataset covers the entire filer population defined by Section 30 of the Investment Company Act and Rule 30a-1 thereunder.

The dataset is distributed as a sequence of monthly ZIP containers organized by year (YYYY/YYYY-MM.zip), beginning with September 2018 — the first month in which N-CEN filings actually arrived in EDGAR — and continuing on a monthly cadence. Filings made on the legacy Form N-SAR before mid-2018 are out of scope and are not present in the dataset. File types found inside each container are XML, HTML, JSON, TXT, and PDF. Form N-CEN collects identifying information about the registrant trust and each of its series, operational metrics for the fiscal year, names and regulatory identifiers (CIK, file number, CRD, LEI, RSSD, PCAOB) of every material service provider, governance details (directors, chief compliance officer, principal underwriter, public accountant), securities-lending and credit-line activity, exemptive rule reliance, and a battery of yes/no flags that surface compliance, custody, valuation, and lifecycle events. Several items require attachments — most prominently the auditor's Internal Control Report — that are filed as separate documents inside the same EDGAR submission and appear as separate files inside the dataset's accession folder.
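The YYYY/YYYY-MM.zip layout described above can be turned into a small path helper. This is a minimal sketch, not the publisher's tooling: the function names are hypothetical, and it assumes a filing lands in the container for the month in which it was filed, which the description implies but does not state outright.

```python
from datetime import date


def container_path(filed: date) -> str:
    """Relative path of the monthly ZIP for a given date (assumed: the filing date)."""
    return f"{filed.year:04d}/{filed.year:04d}-{filed.month:02d}.zip"


def container_paths(start: date, end: date) -> list[str]:
    """Monthly container paths from start's month through end's month, inclusive."""
    paths = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        paths.append(container_path(date(year, month, 1)))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return paths
```

Enumerating from September 2018 through May 2026 with this helper yields 93 paths, which agrees with the 93 files in the listing above.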

Content Structure of a Single Record

What one record represents

A single record in the Form N-CEN Files Dataset is one EDGAR filing of Form N-CEN, or its amendment Form N-CEN/A, addressed by its accession number and packaged as a self-contained accession folder inside a monthly ZIP. The on-disk unit is therefore one annual census report submitted by a registered investment company under Rule 30a-1 — covering open-end funds, closed-end funds, exchange-traded funds, unit investment trusts, separate accounts, and small business investment companies — filed within 75 days after the registrant's fiscal year end.

Although the dataset's record unit is the filing (one accession folder), the underlying N-CEN payload is hierarchical: a single filing routinely reports on a registrant trust that holds many fund series, and each series carries its own block of operational data. Series-level analytics therefore require unrolling a repeating XML block inside the record rather than treating accession folders as series-level rows.

Files inside a single accession folder

Each accession folder is a flat directory named after the dashless EDGAR accession number (for example 000089418925010351/), sitting one level deep inside the per-month ZIP. Two files are always present, and several optional files appear depending on which N-CEN items were answered:

  • metadata.json — the normalized header extracted by the SEC API pipeline from the EDGAR submission envelope. Always present.
  • primary_doc.xml — the canonical N-CEN XML payload conforming to the http://www.sec.gov/edgar/ncen schema. Always present.
  • xslFormN-CEN_X05/primary_doc.xml — an EDGAR-rendered XHTML view of the XML produced by applying the NCEN_print.css stylesheet. The file extension is .xml but the content is XHTML 1.0 Strict; the folder name encodes the schema version (X05). Absent for some older or paper-style submissions, so this folder should be treated as optional.
  • One or more Internal Control Report attachments (.htm, .txt, or .pdf), each carrying SGML document headers identifying its <TYPE> as INTERNAL CONTROL RPT. Trusts that share an auditor across all series file a single consolidated report; multi-family trusts whose sub-portfolios use different auditors carry one report per sub-portfolio.
  • Zero or more OTHER REQUIRED INFO exhibits (.htm or .txt) carrying explanatory notes, item-specific addenda, or amendment rationales.

Image files (logos, signature scans) from the original EDGAR submission are not retained, and the full SGML-wrapped submission text file (*.txt) referenced from documentFormatFiles[] is not materialized as a separate file inside the folder — the dataset retains the parsed components rather than the concatenated SGML envelope. Every non-XML attachment, however, preserves its own EDGAR SGML document header (<DOCUMENT>, <TYPE>, <SEQUENCE>, <FILENAME>, <DESCRIPTION>, <TEXT> wrappers) inline, so the authoritative document-type label travels with the file content rather than being inferred from the extension.
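The folder layout above lends itself to a quick structural check. The sketch below groups ZIP member names by accession folder and reports which of the two always-present files are missing. The helper names are hypothetical, and the code locates the accession folder by looking for an 18-digit path component rather than assuming an exact nesting depth.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath

# A dashless EDGAR accession number is 18 digits.
ACCESSION_DIR = re.compile(r"^\d{18}$")
REQUIRED = {"metadata.json", "primary_doc.xml"}


def group_by_accession(names):
    """Map each accession folder to its files, as paths relative to that folder."""
    records = defaultdict(list)
    for name in names:
        parts = PurePosixPath(name).parts
        # parts[:-1] skips the final component, so bare directory entries are ignored.
        for i, part in enumerate(parts[:-1]):
            if ACCESSION_DIR.match(part):
                records[part].append("/".join(parts[i + 1:]))
                break
    return dict(records)


def missing_required(files):
    """Return the always-present files that are absent from a folder's file list."""
    return REQUIRED - set(files)
```

In practice the input would come from zipfile.ZipFile(path).namelist(). Note that the rendered view at xslFormN-CEN_X05/primary_doc.xml does not satisfy the check for the top-level primary_doc.xml, because paths are compared relative to the accession folder.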

The metadata.json header

The metadata file is a flat JSON object that lifts EDGAR submission-header fields into stable, named keys.

Top-level scalar fields:

  • formType: N-CEN or N-CEN/A.
  • accessionNo — the EDGAR accession in dashed form (0000894189-25-010351).
  • filedAt — ISO-8601 timestamp with timezone offset.
  • effectivenessDate — date the filing becomes effective.
  • periodOfReport — fiscal-year end the report covers.
  • description — human-readable filing description from EDGAR.
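  • linkToTxt, linkToHtml, linkToFilingDetails, linkToXbrl: canonical EDGAR URLs. linkToTxt points at the full SGML-wrapped submission, linkToHtml at the filing-index HTML page, and linkToFilingDetails at the XSL-rendered XML view. linkToXbrl is not expected to be populated, because N-CEN carries no XBRL data.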
  • id — internal pipeline identifier for the record.

Array fields enumerate the submission's components:

  • documentFormatFiles[] lists every document attached to the EDGAR submission, with sequence, size in bytes (as a string), documentUrl, type (for example N-CEN/A, INTERNAL CONTROL RPT, OTHER REQUIRED INFO), and an optional description. A trailing entry with a blank type points at the full SGML-wrapped submission text file on EDGAR.
  • dataFiles[] lists XBRL data attachments when present. N-CEN does not carry XBRL data, so this array is empty.

Array fields that expose the registrant graph:

  • entities[] records the filer entity (or entities) lifted from the EDGAR header, with cik, companyName, irsNo, fileNo, filmNo, fiscalYearEnd as MMDD, stateOfIncorporation, the SEC act (40 for the Investment Company Act of 1940), and a type field that echoes the form type.
  • seriesAndClassesContractsInformation[] contains one entry per fund series belonging to the registrant trust, each with the series S-number, the series name, and a nested classesContracts[] array of {ticker?, name, classContract} triples (the C-number identifying each share class). Tickers are omitted for share classes that are not publicly traded, so the field is sparse rather than universally populated.

The metadata header is therefore sufficient on its own to answer "what trust, which fiscal year, which series and classes, which attachments" without parsing the XML body.
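As an illustration of answering "which series and classes" from the header alone, the sketch below flattens seriesAndClassesContractsInformation[] into one row per share class. The keys classesContracts, classContract, name, ticker, and accessionNo come from the description above; the series-level keys "series" and "name" are assumptions, and the sample values are made up.

```python
def flatten_classes(metadata: dict) -> list[dict]:
    """Return one row per share class listed in a metadata.json object."""
    rows = []
    for series in metadata.get("seriesAndClassesContractsInformation", []):
        for cls in series.get("classesContracts", []):
            rows.append({
                "accessionNo": metadata.get("accessionNo"),
                "seriesId": series.get("series"),    # assumed key for the S-number
                "seriesName": series.get("name"),    # assumed key for the series name
                "classId": cls.get("classContract"),  # the C-number
                "className": cls.get("name"),
                "ticker": cls.get("ticker"),          # None when the ticker is omitted
            })
    return rows


# Hypothetical record for illustration only; not taken from a real filing.
SAMPLE_METADATA = {
    "accessionNo": "0000000000-25-000001",
    "seriesAndClassesContractsInformation": [
        {
            "series": "S000000001",
            "name": "Example Growth Fund",
            "classesContracts": [
                {"ticker": "EXMPX", "name": "Class A", "classContract": "C000000001"},
                {"name": "Class R6", "classContract": "C000000002"},
            ],
        }
    ],
}
```

Using .get() throughout keeps the sparse ticker field from raising KeyError; a real pipeline would load the object with json.load() from metadata.json first.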

The primary_doc.xml payload

The canonical regulatory body of an N-CEN filing is a single XML document rooted at <edgarSubmission xmlns="http://www.sec.gov/edgar/ncen"> with a <schemaVersion> child (recent filings use X0505). The root has two principal children: <headerData> carrying the submission envelope, and <formData> carrying the substantive census disclosures. The element order is fixed by the schema. A trust covering many series can produce an XML document several thousand lines long, with the bulk of the volume concentrated in the per-series block.
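Because every element sits in the http://www.sec.gov/edgar/ncen namespace, lookups must be namespace-qualified. The following is a minimal sketch using Python's standard xml.etree.ElementTree; the function name and the sample document are illustrative only.

```python
import xml.etree.ElementTree as ET

NS = {"ncen": "http://www.sec.gov/edgar/ncen"}
ROOT_TAG = "{http://www.sec.gov/edgar/ncen}edgarSubmission"


def read_schema_version(xml_source):
    """Parse an N-CEN payload (str or bytes) and return its schemaVersion text."""
    root = ET.fromstring(xml_source)
    if root.tag != ROOT_TAG:
        raise ValueError(f"unexpected root element: {root.tag}")
    return root.findtext("ncen:schemaVersion", namespaces=NS)


# Skeleton document for illustration; real payloads are far larger.
SAMPLE_XML = (
    '<edgarSubmission xmlns="http://www.sec.gov/edgar/ncen">'
    "<schemaVersion>X0505</schemaVersion>"
    "<headerData/><formData/>"
    "</edgarSubmission>"
)
```

A parser that handles several years of filings could read this value first and branch on it before extracting items that only exist in later schema versions.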

<headerData>

The header block restates the submission-level identification:

  • <submissionType>: N-CEN or N-CEN/A.
  • <accessionNumber> — for amendments, this carries the original N-CEN accession that the amendment supersedes, not the amendment's own accession. The amendment's own accession remains in metadata.json and in the folder name. This split lets downstream consumers stitch corrections back to the filing they correct.
  • <filerInfo> — wraps liveTestFlag, a filer/issuerCredentials/cik element, a redacted ccc (CIK confirmation code), and flag children overrideInternetFlag and confirmingCopyFlag. It also carries investmentCompanyType, encoding the registrant's registration form: N-1A for open-end management investment companies, N-2 for closed-end management investment companies, N-3 for separate accounts organized as management companies, N-4 for variable annuity separate accounts, N-5 for SBICs, and N-6 for variable-life separate accounts.
  • <seriesClass>/<reportSeriesClass> — a list of <rptSeriesClassInfo> entries pairing each seriesId reported on with includeAllClassesFlag or with explicit class IDs.

<formData>

<formData> is structurally divided into a small number of trust-level subsections and a repeating per-series block.

<generalInfo> — a single attribute-bearing element exposing reportEndingPeriod (the fiscal-year end the report covers) and isReportPeriodLt12 (a flag indicating the report covers a short period of less than twelve months, typically used in first or final filings).

<registrantInfo> — trust-level data. Disclosures about the registered investment company as a whole, stated once per record:

  • Identity: registrantFullName, investmentCompFileNo (the 811-… Investment Company Act file number), registrantCik, registrantLei, full street/city/state/zip/country address, phone, and websites/website/@webpage URLs.
  • Books-and-records locations: <locationBooksRecords>/<locationBooksRecord>, one entry per office holding books and records under Section 31(a) of the Investment Company Act, each with address, officeStateCountry, phone, and a free-text booksRecordsDesc naming the role (adviser, custodian, administrator, distributor, legal counsel, transfer agent).
  • Lifecycle flags: isRegistrantFirstFiling, isRegistrantLastFiling, isRegistrantFamilyInvComp (membership in a fund family).
  • Trust-level classification: registrantClassificationType, totalSeries (count of series reported), isSecuritiesActRegistration (whether shares are registered under the Securities Act of 1933).
  • Board roster: <directors>/<director> with directorName, crdNumber, isDirectorInterestedPerson (the Section 2(a)(19) interested person determination), and fileNumbers/fileNumberInfo cross-references to the other registrants on whose boards the director also serves.
  • Chief compliance officer: <chiefComplianceOfficers>, required by Rule 38a-1, with name, CRD, address, phone, the list of employers (CCOs may serve multiple investment companies), and isCcoChangedSinceLastFiling.
  • Legal and risk flags: securityMatterSeriesInfo, isPreviousLegalProceeding, isPreviousProceedingTerminated, isClaimFiled, coveredByInsurancePolicy, isFinancialSupportDuringPeriod, isExemptionFromAct.
  • Principal underwriter: <principalUnderwriters> with name, SEC file number, CRD, LEI, FDIC/Federal Reserve RSSD ID, state, and isPrincipalUnderwriterAffiliatedWithRegistrant, paired with isUnderwriterHiredOrTerminated.
  • Independent accountant: <publicAccountants> with name, pcaobNumber, LEI, RSSD ID, state, and isPublicAccountantChanged (the N-CEN analogue of an 8-K Item 4.01 auditor-change event).
  • Reporting-quality flags: isMaterialWeakness, isOpinionOffered, isMaterialChange, isAccountingPrincipleChange, isPaymentErrorInNetAssetValue, isPaymentDividend.

<managementInvestmentQuestionSeriesInfo> — the repeating per-series block. The substantive heart of the form: one <managementInvestmentQuestion> element per series under the registrant. Each block repeats the same structure:

  • Series identity: mgmtInvFundName, mgmtInvSeriesId (the S-number), mgmtInvLei, isFirstFilingByFund.
  • Class composition: numAuthorizedClass, numAddedClass, numTerminatedClass, plus a <sharesOutstandings>/<sharesOutstanding> list with one element per share class carrying className, the C-number class ID, and ticker.
  • Fund classification: fundType, isNonDiversifiedCompany, isForeignSubsidiary, plus securities lending fields (isFundSecuritiesLending, didFundLendSecurities, paymentToAgentManagerType, avgPortfolioSecuritiesValue, netIncomeSecuritiesLending).
  • Rule reliance: <relyOnRuleTypes>/<relyOnRuleType> enumerating the 1940 Act rules the fund relied on during the year (for example Rule 12d1-1 for fund-of-funds investments in money-market funds, Rule 32a-4 for the audit-committee exemption, Rule 17a-7 for cross-trades).
  • Fee/expense flags: isExpenseLimitationInPlace, isExpenseReducedOrWaived, isFeesWaivedRecoupable, isExpenseWaivedRecoupable.
  • Service-provider blocks, each following a parallel structure of name, file numbers, CRD, LEI, RSSD, stateCountry, affiliation flags, and sub-provider information, paired with a top-level hire-or-terminate boolean:
    • <investmentAdvisers>/<investmentAdviser> + isAdviserHiredOrTerminated
    • <transferAgents>/<transferAgent> + isTransferAgentHiredOrTerminated
    • <pricingServices>/<pricingService> + isPricingServiceHiredOrTerminated
    • <custodians>/<custodian> (carries an additional custodyType distinguishing self-custody, foreign sub-custodian, depository) + hire/terminate flag
    • <shareholderServicingAgents>/<shareholderServicingAgent> + hire/terminate flag
    • <admins>/<admin> (fund administrators) + hire/terminate flag
  • Brokerage activity: <brokers>/<broker> listing top broker-dealers by commission paid, each with file number, CRD, LEI, RSSD, state, and grossCommission, alongside an <aggregateCommission> total. <principalTransactions>/<principalTransaction> lists counterparties for principal trades, with principalAggregatePurchase aggregating volume. isBrokerageResearchPayment records Section 28(e) soft-dollar arrangements. mnthlyAvgNetAssets carries monthly average net assets.
  • Line of credit: <lineOfCredit hasLineOfCredit="…">. When the fund has a facility, a nested <lineOfCreditDetails>/<lineOfCreditDetail> carries the committed/uncommitted flag, lineOfCreditSize, lender names, a sharedCreditType distinguishing sole from shared facilities (with <creditUser> siblings naming every other fund participating in a shared facility), and a creditLineUsed group reporting isCreditLineUsed, averageCreditLineUsed, and daysCreditUsed.
  • Interfund and pricing flags: isInterfundLending, isInterfundBorrowing, isSwingPricing.

Because this block repeats per series, service-provider identities, brokerage economics, lending and credit activity, and operational flags are all series-scoped — even when the trust shares advisers or custodians across its series.
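Unrolling the repeating block into series-level rows might look like the sketch below. The names mgmtInvFundName, mgmtInvSeriesId, mgmtInvLei, and investmentAdviser come from the description above. Whether each value is stored as a child element or as an attribute is an assumption, so the hypothetical field() helper tries both. The sample document is invented.

```python
import xml.etree.ElementTree as ET

NCEN = "{http://www.sec.gov/edgar/ncen}"


def field(elem, name):
    """Read a value stored either as a child element's text or as an attribute."""
    child = elem.find(f"{NCEN}{name}")
    if child is not None and child.text is not None:
        return child.text.strip()
    return elem.get(name)


def series_rows(xml_source):
    """Return one dict per managementInvestmentQuestion block in the payload."""
    root = ET.fromstring(xml_source)
    rows = []
    for block in root.iter(f"{NCEN}managementInvestmentQuestion"):
        rows.append({
            "seriesId": field(block, "mgmtInvSeriesId"),
            "fundName": field(block, "mgmtInvFundName"),
            "lei": field(block, "mgmtInvLei"),
            # Count adviser entries listed inside this series block.
            "adviserCount": sum(1 for _ in block.iter(f"{NCEN}investmentAdviser")),
        })
    return rows


# Invented two-series payload showing both storage styles.
SAMPLE_XML = """
<edgarSubmission xmlns="http://www.sec.gov/edgar/ncen">
  <formData>
    <managementInvestmentQuestionSeriesInfo>
      <managementInvestmentQuestion>
        <mgmtInvFundName>Example Growth Fund</mgmtInvFundName>
        <mgmtInvSeriesId>S000000001</mgmtInvSeriesId>
        <investmentAdvisers><investmentAdviser/></investmentAdvisers>
      </managementInvestmentQuestion>
      <managementInvestmentQuestion mgmtInvFundName="Example Income Fund"
                                    mgmtInvSeriesId="S000000002"/>
    </managementInvestmentQuestionSeriesInfo>
  </formData>
</edgarSubmission>
""".strip()
```

Joining these rows to the trust-level fields in registrantInfo (stated once per record) gives a series-level table without losing the registrant context.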

<exchangeSeriesInfo> — relevant for closed-end funds and exchange-listed series. The element is self-closing when no series is exchange-listed; when populated, it lists listing exchanges per series along with ticker symbols and trading-status information.

<attachmentsTab> — a block of boolean indicators identifying which exhibits accompany the form. Flags align with the actual files in the accession folder — for example isIPAReportInternalControl (the auditor's Internal Control Report under Section 17(f) of the Investment Company Act and Item G.1.a.iii) and isOtherInfoRequired (the Other Required Information exhibit). These booleans provide a cheap consistency check against the on-disk attachment set.

<signature> — a single self-closing attribute-only element carrying registrantSignedName, signedDate, signature (the typed /s/ form), and title. There is no separate signature page or notarization.

The XSL-rendered HTML view

The xslFormN-CEN_X05/primary_doc.xml companion is EDGAR's human-readable rendering of the same XML payload, produced by applying the NCEN_print.css stylesheet during filing. The content is XHTML 1.0 Strict despite the .xml extension: tables, divs, and "fakeBox" widgets render the form's questions and answers, and boolean fields are displayed as radio-button images rather than text values. The view exposes exactly the same data as primary_doc.xml and is convenient when a downstream consumer wants a printable document rather than a parsed XML tree. It is a derived rendering and adds no regulatory information beyond what is in the canonical XML.

Attachments: Internal Control Reports and Other Required Information

Non-XML attachments inside an accession folder fall into a small number of EDGAR document types, identified authoritatively by the <TYPE> line in the SGML document envelope:

  • INTERNAL CONTROL RPT — the auditor's Internal Control Report, required when the fund's custody arrangements implicate Section 17(f) of the Investment Company Act and Item G.1.a.iii of N-CEN. The report is signed by the registrant's independent registered public accounting firm and follows the standard PCAOB-style "Report of Independent Registered Public Accounting Firm" header. Trusts whose sub-portfolios use different auditors carry several INTERNAL CONTROL RPT attachments with different filenames. The report may be rendered as HTML, plain text, or PDF depending on the auditor's preferred format.
  • OTHER REQUIRED INFO — supplemental exhibits used for item-specific addenda, explanatory notes about specific responses, G.1.a.vi narratives that require free-text elaboration, or rationales accompanying an N-CEN/A that explain what was amended.
  • N-CEN / N-CEN/A — the form's own XML body, always indexed by sequence number 1.

Each attachment is wrapped in the EDGAR SGML document envelope (<DOCUMENT>, <TYPE>, <SEQUENCE>, <FILENAME>, <DESCRIPTION>, <TEXT> lines) so downstream consumers can identify the document type and sequence even when the filename is not self-describing. Plain-text attachments use the same envelope around fixed-width formatted text rather than HTML. Because the same content type can appear with several extensions across records, the SGML <TYPE> line is the authoritative discriminator rather than the file extension.
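Reading the document type from the envelope rather than the extension could be sketched as follows. The tag names are the ones listed above; the one-tag-per-line layout, the function name, and the sample text are assumptions for illustration. The sketch covers text and HTML attachments only and does not attempt PDF content.

```python
import re

# Header tags appear one per line, with the value following the tag.
HEADER_LINE = re.compile(r"^<(TYPE|SEQUENCE|FILENAME|DESCRIPTION)>(.*)$")


def parse_sgml_header(text: str) -> dict:
    """Extract the SGML header fields that precede the <TEXT> marker."""
    header = {}
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.upper().startswith("<TEXT>"):
            break  # the document body starts here
        match = HEADER_LINE.match(stripped)
        if match:
            header[match.group(1).lower()] = match.group(2).strip()
    return header


# Invented attachment text for illustration.
SAMPLE_ATTACHMENT = """<DOCUMENT>
<TYPE>INTERNAL CONTROL RPT
<SEQUENCE>2
<FILENAME>example_icr.htm
<DESCRIPTION>Example internal control report
<TEXT>
<html><body>Report of Independent Registered Public Accounting Firm</body></html>
</TEXT>
</DOCUMENT>"""
```

Stopping at the <TEXT> marker prevents tag-like strings in the document body from being mistaken for header fields.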

What is included and what is excluded

Included in each record:

  • the EDGAR-normalized metadata.json,
  • the canonical primary_doc.xml,
  • the XSL-rendered XHTML view where EDGAR produced one,
  • every Internal Control Report attachment, with its SGML envelope preserved,
  • every OTHER REQUIRED INFO exhibit,
  • and any other text or HTML documents that accompanied the original submission.

Excluded:

  • image files (logos, signature graphics, scanned letterhead) that appeared in the original submission,
  • the full concatenated SGML-wrapped submission .txt file referenced from documentFormatFiles[] — the dataset retains the parsed components instead.

How amendments (N-CEN/A) are represented

Amendments are first-class records in the dataset and appear as their own accession folders. Two pointers identify the amendment relationship:

  • metadata.json.formType is N-CEN/A and metadata.json.accessionNo is the amendment's own accession (matching the folder name).
  • Inside primary_doc.xml, <headerData>/<accessionNumber> carries the original N-CEN accession that the amendment supersedes.

Beyond these identification fields, an N-CEN/A is structurally identical to an N-CEN — the same XML schema, the same <formData> subsections, the same attachment shape. Amendments routinely include an OTHER REQUIRED INFO exhibit narrating the rationale for the amendment and identifying which items were restated. There is no rule that an amendment must restate the entire form, but in practice the full payload is re-submitted with the corrected values.
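The two pointers can be combined to group amendments under the filing they correct. In this sketch the input records are plain dicts with hypothetical keys: formType and accessionNo from metadata.json, and headerAccessionNumber holding the value read from the XML header. Both accession forms are normalized to the dashed style, since the format used inside the XML is not specified above. The sample accession numbers are invented.

```python
from collections import defaultdict


def dashed(accession: str) -> str:
    """Normalize a dashed or dashless accession number to the dashed form."""
    digits = accession.replace("-", "")
    if len(digits) != 18 or not digits.isdigit():
        raise ValueError(f"not an accession number: {accession!r}")
    return f"{digits[:10]}-{digits[10:12]}-{digits[12:]}"


def amendment_chains(records):
    """Map each original accession to the amendments that point back at it."""
    chains = defaultdict(list)
    for rec in records:
        if rec["formType"] == "N-CEN/A":
            original = dashed(rec["headerAccessionNumber"])
            chains[original].append(dashed(rec["accessionNo"]))
    return dict(chains)


# Invented records for illustration.
SAMPLE_RECORDS = [
    {"formType": "N-CEN", "accessionNo": "0000000000-24-000010",
     "headerAccessionNumber": None},
    {"formType": "N-CEN/A", "accessionNo": "0000000000-24-000042",
     "headerAccessionNumber": "000000000024000010"},
    {"formType": "N-CEN/A", "accessionNo": "0000000000-25-000003",
     "headerAccessionNumber": "0000000000-24-000010"},
]
```

Sorting each chain by filedAt would then identify the latest correction for a given fiscal-year report.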

Schema evolution since June 2018

Within the N-CEN era the form has gone through several schema-version revisions. The schema version is exposed in <schemaVersion> at the root of the XML and echoed in the rendering folder name (xslFormN-CEN_X05/). Successive revisions reflect SEC rulemakings that incrementally added items: the 2020 Fair Valuation rule (Rule 2a-5) introduced fair value designation questions; the 2023 Tailored Shareholder Reports rule added items about shareholder-report formats; the 2023 Names Rule amendments added Rule 35d-1 compliance items; and 2024 ETF and tender-offer fund rulemakings introduced operational disclosures for liquidity-management arrangements. The schema is backward-compatible at the element level — older elements persist — but new elements are added with each schema version, so consumers parsing across years should treat <schemaVersion> as the authoritative guide to which elements to expect.

Interpretation and extraction notes

  • Accession folder is not series-level. One N-CEN filing typically reports on many series under a single trust. Trust-level fields in <registrantInfo> are stated once, but the <managementInvestmentQuestion> block repeats per series. Series-level analytics require unrolling this repeating block; treating the folder count as a count of funds will materially undercount the series universe.
  • Header vs. body identifiers diverge for amendments. The <accessionNumber> inside <headerData> of an N-CEN/A is the original filing's accession, not the amendment's. The amendment's own accession lives in metadata.json and in the folder name. Use both pointers to stitch correction chains.
  • Mixed file extensions for the same content type. Internal Control Reports may be .htm, .txt, or .pdf; the XSL rendering uses a .xml extension but carries XHTML. The SGML <TYPE> line is the authoritative classifier; content type cannot be inferred from the extension.
  • Service-provider sub-structures are uniform. Every service-provider type (adviser, transfer agent, pricing service, custodian, shareholder servicing agent, administrator) uses the same name / file-numbers / CRD / LEI / RSSD / state / affiliation-flags shape and is paired with a hire-or-terminate boolean inside the same series block. This uniformity allows consistent extraction of service-provider time series across series and years.
  • Free-text fields are bounded but present. Most of N-CEN is structured, but a handful of fields — booksRecordsDesc, the OTHER REQUIRED INFO exhibit body, and the auditor's report body — carry free-form text that requires NLP rather than schema-driven extraction.
  • The XSL HTML view is rendering-only. All regulatory facts live in primary_doc.xml; the rendered XHTML is a derived view and adds no information.
  • <attachmentsTab> flags align with the folder contents. When isIPAReportInternalControl is true, an INTERNAL CONTROL RPT attachment is present; when isOtherInfoRequired is true, an OTHER REQUIRED INFO exhibit is present. These flags provide a quick consistency check against the on-disk attachment set.
  • First- and last-filing flags signal lifecycle events. isRegistrantFirstFiling and isRegistrantLastFiling at the trust level, plus isFirstFilingByFund at the series level, identify new-fund and terminating-fund events without requiring a join against historical filings.
  • Schema version varies by filing period. Filings submitted in different years may use different <schemaVersion> values and therefore expose slightly different element sets. Parsers should branch on schema version when extracting items added by post-2018 rulemakings.
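The attachment consistency check mentioned in the notes above can be sketched as a comparison between the attachmentsTab flags and the document types listed in metadata.json. The flag names and type labels come from this description. How booleans are encoded in the XML is an assumption (the sketch treats "Y", "yes", "true", and "1" as set), and the flags are passed in as an already-extracted dict.

```python
# Flag name -> EDGAR document type expected when the flag is set.
EXPECTED = {
    "isIPAReportInternalControl": "INTERNAL CONTROL RPT",
    "isOtherInfoRequired": "OTHER REQUIRED INFO",
}
TRUTHY = {"y", "yes", "true", "1"}  # assumed encodings of a set flag


def attachment_mismatches(flags: dict, metadata: dict) -> list[str]:
    """List disagreements between attachmentsTab flags and documentFormatFiles[]."""
    present = {
        (doc.get("type") or "").strip().upper()
        for doc in metadata.get("documentFormatFiles", [])
    }
    problems = []
    for flag, doc_type in EXPECTED.items():
        flagged = str(flags.get(flag, "")).strip().lower() in TRUTHY
        if flagged and doc_type not in present:
            problems.append(f"{flag} set but no {doc_type} document listed")
        elif not flagged and doc_type in present:
            problems.append(f"{doc_type} document listed but {flag} not set")
    return problems
```

An empty result means the flags and the listed attachments agree; anything else is worth inspecting by hand before trusting either source.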

Who Files or Publishes This Dataset, and When

Who files

Form N-CEN is filed by registered investment companies themselves, acting through their officers. The filing population, defined by Section 30 of the Investment Company Act of 1940 and Rule 30a-1 thereunder, consists of:

  • Open-end management investment companies — mutual funds, including most ETFs (which are organized as open-end funds and complete ETF-specific items on authorized participants, creation/redemption baskets, and exchange listing).
  • Closed-end management investment companies — listed closed-end funds, interval funds, and tender-offer funds.
  • Unit investment trusts (UITs) — including insurance company separate accounts organized as UITs that fund variable annuity and variable life contracts, plus a small legacy population of UIT-structured ETFs.
  • Small business investment companies (SBICs) that are also registered under the 1940 Act, which complete SBIC-specific items.

For series trusts, a single N-CEN submission from the registrant covers all series and classes under one accession number.

Who does not file

Entities that appear inside an N-CEN but do not file one include the fund's service providers (advisers, sub-advisers, custodians, transfer agents, administrators, underwriters, auditors, and pricing services) and, for variable insurance products, the sponsoring insurance company, since the separate account is the filer. Vehicles that are not registered under the Investment Company Act, such as private funds relying on Section 3(c)(1) or 3(c)(7) and business development companies, fall outside Rule 30a-1 and are likewise absent. Investment advisers report on Form ADV and operating companies on Form 10-K; neither population files N-CEN.

Regulatory framework

The obligation is rooted in Section 30 of the Investment Company Act of 1940 and Rule 30a-1, which requires every registered management investment company and every registered UIT to file an annual census report on Form N-CEN. N-CEN is the census layer of the 1940 Act reporting stack. It sits alongside, but is separate from, Form N-PORT (monthly portfolio holdings, Rule 30b1-9), Form N-CSR (certified shareholder reports, Rule 30b2-1), Form N-PX (proxy voting), and the prospectus forms (N-1A, N-2, N-3, N-4, N-6, N-8B-2).

Trigger and timing

The trigger is purely periodic, not event-driven: each registrant owes one N-CEN per year.

  • Management investment companies (open-end funds, closed-end funds, ETFs, registered SBICs) must file Form N-CEN within 75 days after the end of the fund's fiscal year.
  • Unit investment trusts must file within 75 days after the end of the calendar year (UITs report on a calendar-year basis under Rule 30a-1).

Because fund fiscal year-ends are spread across the calendar (with clusters at December, June, September, and October), N-CEN filings appear throughout the year, with predictable peaks roughly 75 days after each common year-end.
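The 75-day arithmetic is simple enough to check directly. The sketch below computes the nominal deadline from a fiscal year-end, and from a calendar year for UITs. The function names are hypothetical, and the sketch ignores any adjustment for deadlines that fall on weekends or holidays.

```python
from datetime import date, timedelta

FILING_WINDOW = timedelta(days=75)


def ncen_due_date(fiscal_year_end: date) -> date:
    """Nominal N-CEN deadline for a management investment company."""
    return fiscal_year_end + FILING_WINDOW


def uit_due_date(calendar_year: int) -> date:
    """Nominal N-CEN deadline for a UIT reporting on the given calendar year."""
    return date(calendar_year, 12, 31) + FILING_WINDOW
```

A December 31, 2024 year-end gives a nominal deadline of March 16, 2025, which is consistent with the March containers being the largest in the file listing.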

Form N-CEN/A is an amendment. It is filed when the registrant corrects or updates a prior N-CEN — for example, service-provider data, CCO information, or series/class identifiers. Amendments are corrective rather than calendar-driven, have no fixed deadline, and one fiscal-year report may receive multiple N-CEN/A filings over time.

Important distinctions

  • Filer vs. service providers: the registrant is the filer. Advisers, sub-advisers, custodians, transfer agents, administrators, underwriters, auditors, and pricing services are disclosed within the form but do not file it.
  • Series trusts: one N-CEN covers all series and classes of the trust; counting funds requires unpacking series-level XML inside each submission.
  • Master-feeder structures: master and each feeder file their own N-CEN if each is separately registered.
  • ETFs structured as UITs file on the calendar-year cycle; ETFs structured as open-end funds file on the fiscal-year cycle.
  • Variable insurance products: the insurance company separate account (as a UIT) is the filer — not the insurance company and not the underlying funds it holds.
  • Newly registered funds owe no N-CEN until their first fiscal year-end passes; the first filing is due 75 days later.
  • Deregistering funds: once Form N-8F deregistration is effective, the Rule 30a-1 obligation ends; a final N-CEN covers the partial year through deregistration where applicable.

How This Dataset Differs From Similar Datasets or Filings

Form N-CEN sits inside a tight cluster of Investment Company Act reporting forms covering the same filer universe (registered funds and unit investment trusts) but splitting the disclosure surface across census data, financial statements, holdings, voting, and registration. The closest comparisons are with other N-series forms. Form ADV and Form 10-K are commonly confused with N-CEN but cover different filer populations entirely.

Form N-SAR (legacy predecessor)

N-SAR was the semi-annual census report for registered management investment companies, retired in June 2018 and replaced by N-CEN. Key differences: cadence (semi-annual to annual), structure (fixed-field plain text to XML), and scope (N-CEN substantially expands service-provider and operational fields). The two are not field-compatible; pre-2018 historical research requires a manual crosswalk, not concatenation.

Form N-PORT (monthly portfolio holdings)

N-PORT is the monthly position-level holdings report; only the report for the third month of each quarter is made public. Overlap with N-CEN is limited to filer population. N-PORT carries security identifiers, fair values, derivatives exposures, risk metrics, and liquidity classifications; N-CEN carries none of these. The pairing is standard: N-CEN for fund identity, structure, and service providers; N-PORT for what the fund holds.

Form N-PX (proxy voting record)

N-PX is the annual log of proxy votes cast on portfolio securities, filed by registered funds and (post-2022 expansion) certain institutional managers. Cadence and filer population align with N-CEN, but the content is orthogonal: N-PX is a per-meeting, per-ballot voting record; N-CEN is a census of fund structure and service providers.

Form N-CSR and N-CSRS (certified shareholder reports)

N-CSR (annual) and N-CSRS (semi-annual) carry audited or unaudited financial statements, schedules of investments, shareholder-letter narrative, and Sarbanes-Oxley certifications. Same fund population, same annual cadence for N-CSR, but a different payload type: N-CSR is narrative and financial-statement driven (text and PDF), while N-CEN is structured XML census data. For a given fiscal year, N-CSR delivers the audited financials; N-CEN delivers the operational/structural facts. Complements, not substitutes.

Form N-1A and the N-2/N-3/N-4/N-6 registration family

These are the registration and prospectus forms for open-end mutual funds (Form N-1A), closed-end funds (Form N-2), managed separate accounts (Form N-3), variable annuity separate accounts (Form N-4), and variable life separate accounts (Form N-6). They cover the same registrant universe as N-CEN but serve a different regulatory purpose: initial and updated offering disclosures, including prospectus narrative, fee tables, investment objectives, and risk factors. They are filed as needed rather than on a periodic cadence and are predominantly narrative. N-CEN provides the standardized annual operational snapshot of the same registrants the registration statements introduce.

Form N-Q (legacy quarterly holdings)

Form N-Q was the quarterly portfolio holdings report, rescinded in 2019 once N-PORT became public. It does not overlap with N-CEN in content (holdings only) but is part of the same legacy ecosystem alongside N-SAR. For pre-2019 research, N-Q plus N-SAR roughly parallel today's N-PORT plus N-CEN split.

Form ADV (investment adviser registration)

ADV is filed by investment advisers, not by funds. This is the most frequent confusion: ADV describes the management company; N-CEN describes the fund. N-CEN itemizes the fund's adviser among its service providers, supplying a natural join key (CRD or SEC file number) into ADV. ADV carries adviser-level AUM, client mix, disciplinary history, and Part 2 brochure; N-CEN carries fund-level structural facts.

Form 10-K (operating company annual report)

10-K is the Exchange Act annual report for operating companies. Filer populations are disjoint: 10-K filers are operating enterprises; N-CEN filers are registered investment companies under the 1940 Act. The mental pairing exists only because both are "annual reports." Content (audited financials, MD&A, risk factors, business description) and statutory basis differ entirely. Listed here only to disambiguate.

Key differences at a glance

  • Versus N-SAR: same purpose, replaced; annual vs semi-annual, XML vs plain text, wider field set.
  • Versus N-PORT: census vs holdings; annual vs monthly; structural vs position-level.
  • Versus N-PX: census vs voting record; same cadence, no content overlap.
  • Versus N-CSR: structured operational facts vs narrative financial statements; same cadence, complementary.
  • Versus N-1A/2/3/4/6: periodic census vs as-needed registration narrative.
  • Versus ADV: fund-level vs adviser-level; linked by adviser identifier.
  • Versus 10-K: 1940 Act fund census vs 1934 Act operating-company report; no substitution.

What makes Form N-CEN distinct

N-CEN is the only SEC form providing a standardized, structured-XML annual census of registered investment companies focused on operational and structural attributes rather than financials, holdings, or votes. Its defining features:

  • Census scope: fund identity, organizational form, series and class identifiers, and reporting-period flags.
  • Service-provider granularity: advisers, sub-advisers, custodians, administrators, transfer agents, pricing services, principal underwriters, and independent public accountants are each itemized with identifiers, making N-CEN the canonical source for fund-to-service-provider linkages.
  • Field-level machine readability: XML across the entire registrant population, unlike narrative-heavy N-CSR or registration filings.
  • Amendment support via N-CEN/A: captures restatements of census data when service-provider or organizational entries are corrected post-filing.

N-CEN is not interchangeable with N-PORT (holdings), N-CSR (financials), N-PX (votes), or the N-1A/N-2/N-3/N-4/N-6 family (offering disclosures). It is the structural backbone that anchors those disclosures to a consistent annual snapshot of fund identity and operational arrangement.

Who Uses This Dataset

Because each Form N-CEN filing names the adviser, subadvisers, custodians, transfer agent, auditor, principal underwriter, CCO, and authorized participants alongside flags for securities lending, derivatives, in-kind activity, and reliance on specific 1940 Act rules, it is consumed by professionals who need to know who runs, services, and oversees each fund.

Competitive intelligence analysts at fund complexes

Strategy and CI teams at asset managers pivot Item E (adviser, subadviser, custodian, sub-custodian, administrator, transfer agent, pricing agent, principal underwriter, auditor) against Items B and C (CIK, LEI, series and class identifiers) to build a peer service-provider matrix. Year-over-year diffs flag when a peer migrates a vendor, which feeds renegotiation lists and pricing benchmarks for senior leadership.

Fund board and governance researchers

Independent-trustee staff and proxy-governance analysts focus on CCO identification, CCO turnover, third-party CCO compensation, reliance on exemptive orders, Rule 12b-1 plan use, and payments to affiliated persons. They use these fields to score CCO independence, flag compliance-function turnover as a governance risk, and benchmark trustee oversight across trust complexes.

ETF market-structure researchers

Researchers covering ETF plumbing exploit the ETF section: authorized participants by name and LEI, dollar value of creations and redemptions per AP, cash-versus-in-kind split, AP agreement requirements, and master-feeder or fund-of-funds status. Joined to series and class data, these fields produce AP concentration metrics, in-kind ratios, and launch-versus-liquidation profiles that feed market-structure papers and liquidity models.

Fund compliance and disclosure counsel

Internal compliance teams and fund counsel use peer N-CEN responses to calibrate their own filings. Key items include reliance flags for Rules 10f-3, 17a-7, 17e-1, 12d1-1, 22e-3, 32a-4, and 15a-4; securities lending and indemnification disclosures; line-of-credit and interfund lending participation; and liquidity rule program administrator items. Comparing peer answers helps validate internal controls and surfaces operational differences worth investigating.

Audit-market analysts

Accounting-policy researchers and audit-firm competitive teams use Item E.6 (independent public accountant name, PCAOB ID, location, change-during-period flag) to build auditor switch panels, tenure distributions, and concentration metrics for the fund-audit market. Outputs include audit-quality studies and win/loss tracking for audit-firm leadership.

Custodian and fund-administrator business development

BD teams at custody banks and administrators pull Item E custodian, sub-custodian (with foreign-custody location and affiliation flag), administrator, and transfer-agent disclosures, then join to series counts and net assets to size each relationship. Year-over-year deltas drive account-defense alerts and prospecting lists for renewal windows.

Transfer agent and sub-accounting product teams

TA product managers map served versus unserved share classes using Item E transfer-agent identification, affiliation flags, account-type fields, and class-level load and distribution data. They use this to estimate sub-accounting volume and prioritize prospecting at complexes whose stated arrangement fits their operating model.

Securities-lending agent analysts

Analysts at securities-lending agents and prime brokerage units use the lending section: lending activity flag, lending agent name, compensation basis, indemnification terms, collateral managers, cash collateral reinvestment vehicles, and revenue earned. These fields support agent league tables, complex-level revenue estimates, and mandate-migration tracking.

Academic researchers in fund industrial organization

Researchers studying fund governance, ETF microstructure, and service-provider competition use the full N-CEN schema as a machine-readable panel: adviser and subadviser identifiers and affiliation flags, board structure items, fee waivers, derivative counterparty disclosures, and ETF AP and basket data. Continuous coverage from the post-N-SAR transition supports difference-in-differences and event-study designs.

Data journalists covering asset management

Reporters at financial news outlets use N-CEN to substantiate stories on auditor rotations, adviser and subadviser changes, CCO turnover, fund liquidations and mergers, and service-provider concentration. The fund-action items (liquidation, merger, reorganization, termination flags) plus the Item E grid let reporters produce industry-wide rankings and multi-year narratives.

Taken together, the dataset serves any role that needs registrant-identified, multi-year, structured disclosure of the operational architecture of a fund: who advises it, who services it, who audits it, who oversees it, and which 1940 Act rules and ETF mechanics it relies on. Each profession draws on a different slice of the same Item E and operational-flag fields.

Specific Use Cases

Form N-CEN's structured XML payload supports a set of concrete, repeatable workflows centered on fund identity, service-provider mapping, and operational flags. The use cases below tie each question to specific fields in primary_doc.xml and the surrounding accession folder.

Building an auditor-change panel for the fund-audit market

Unroll <publicAccountants> per series and read isPublicAccountantChanged alongside pcaobNumber, name, and lei to detect auditor switches each fiscal year. Join records on registrantCik and mgmtInvSeriesId across consecutive filings to construct switch pairs (predecessor PCAOB ID, successor PCAOB ID, switch date from metadata.json.periodOfReport). Output feeds audit-firm league tables, tenure-distribution studies, and win/loss dashboards for Big Four and non-Big-Four audit practices.
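The switch-pair construction can be sketched as follows. This is a minimal illustration that assumes the accountant fields have already been extracted into one flat row per series and filing; the row layout, the "Y"/"N" flag encoding, and all sample values are assumptions, not the dataset's actual output.

```python
from collections import defaultdict


def auditor_switches(rows):
    """Return one record per change of PCAOB ID between consecutive filings of a series."""
    by_series = defaultdict(list)
    for row in rows:
        by_series[(row["registrantCik"], row["mgmtInvSeriesId"])].append(row)

    switches = []
    for (cik, series_id), filings in by_series.items():
        # ISO-formatted dates sort chronologically as plain strings.
        filings.sort(key=lambda r: r["periodOfReport"])
        for prev, curr in zip(filings, filings[1:]):
            if prev["pcaobNumber"] != curr["pcaobNumber"]:
                switches.append({
                    "registrantCik": cik,
                    "mgmtInvSeriesId": series_id,
                    "predecessor": prev["pcaobNumber"],
                    "successor": curr["pcaobNumber"],
                    "switchPeriod": curr["periodOfReport"],
                    # Keep the filer's own flag so mismatches can be reviewed.
                    "flagged": curr["isPublicAccountantChanged"],
                })
    return switches


# Hypothetical rows: series 1 changes auditor, series 2 does not.
rows = [
    {"registrantCik": "0000000001", "mgmtInvSeriesId": "S000000001",
     "periodOfReport": "2022-12-31", "pcaobNumber": "100", "isPublicAccountantChanged": "N"},
    {"registrantCik": "0000000001", "mgmtInvSeriesId": "S000000001",
     "periodOfReport": "2023-12-31", "pcaobNumber": "200", "isPublicAccountantChanged": "Y"},
    {"registrantCik": "0000000001", "mgmtInvSeriesId": "S000000002",
     "periodOfReport": "2022-12-31", "pcaobNumber": "100", "isPublicAccountantChanged": "N"},
    {"registrantCik": "0000000001", "mgmtInvSeriesId": "S000000002",
     "periodOfReport": "2023-12-31", "pcaobNumber": "100", "isPublicAccountantChanged": "N"},
]
switches = auditor_switches(rows)
```

Comparing PCAOB IDs directly, rather than trusting the flag alone, lets a pipeline surface filings where the flag and the identifiers disagree.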

Mapping the fund-to-service-provider graph for BD targeting

For each series, extract the full service-provider block (<investmentAdvisers>, <custodians> with custodyType, <admins>, <transferAgents>, <pricingServices>, <shareholderServicingAgents>, <principalUnderwriters>) along with CRD, LEI, and RSSD identifiers, then key year-over-year diffs on the paired isXHiredOrTerminated flags. The resulting hire/terminate timeline drives prospecting lists for custody banks, administrators, and TA shops at renewal windows, and supports account-defense alerts when a competitor wins a complex.
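As a cross-check on the hire/terminate flags, the provider identifiers themselves can be diffed between two fiscal years. The sketch below assumes each series-year has been reduced to a mapping from role to a set of provider identifiers; the role labels and identifier values are placeholders.

```python
def provider_changes(prev, curr):
    """Compare two {role: set_of_provider_ids} snapshots for one series.

    Returns, per role, the identifiers that appear only in the later
    snapshot ("hired") or only in the earlier one ("terminated").
    """
    changes = {}
    for role in sorted(set(prev) | set(curr)):
        before = prev.get(role, set())
        after = curr.get(role, set())
        hired = after - before
        terminated = before - after
        if hired or terminated:
            changes[role] = {"hired": sorted(hired), "terminated": sorted(terminated)}
    return changes


# Hypothetical snapshots for one series in two consecutive fiscal years.
fy_2022 = {"custodian": {"LEI_BANK_A"}, "transferAgent": {"LEI_TA_1"}}
fy_2023 = {"custodian": {"LEI_BANK_B"}, "transferAgent": {"LEI_TA_1"}, "admin": {"LEI_ADMIN_9"}}

changes = provider_changes(fy_2022, fy_2023)
```

Roles with no change are omitted from the result, so the output doubles as an alert list.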

Tracking CCO turnover and shared-CCO networks

Read <chiefComplianceOfficers> for each registrant: name, CRD, employer list, address, and isCcoChangedSinceLastFiling. Aggregate by CCO CRD across trusts to identify multi-trust CCO arrangements (a single individual serving as CCO for many fund families) and to flag turnover events. Output supports governance-risk scoring for independent-trustee staff and feeds compliance-function concentration studies.
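A minimal sketch of the aggregation step, assuming one flat row per registrant and CCO has already been extracted. The key names ccoCrd and registrantCik in the rows, the "Y"/"N" flag encoding, and the sample values are assumptions for illustration.

```python
from collections import defaultdict


def shared_ccos(rows, min_registrants=2):
    """Map each CCO CRD to the registrants it serves, keeping multi-registrant CCOs only."""
    registrants_by_cco = defaultdict(set)
    for row in rows:
        registrants_by_cco[row["ccoCrd"]].add(row["registrantCik"])
    return {
        crd: sorted(ciks)
        for crd, ciks in registrants_by_cco.items()
        if len(ciks) >= min_registrants
    }


def turnover_events(rows):
    """Registrants whose filing reports a CCO change since the last filing."""
    return sorted({r["registrantCik"] for r in rows if r["isCcoChangedSinceLastFiling"] == "Y"})


# Hypothetical extracted rows.
rows = [
    {"registrantCik": "0000000001", "ccoCrd": "111", "isCcoChangedSinceLastFiling": "N"},
    {"registrantCik": "0000000002", "ccoCrd": "111", "isCcoChangedSinceLastFiling": "Y"},
    {"registrantCik": "0000000003", "ccoCrd": "222", "isCcoChangedSinceLastFiling": "N"},
]
shared = shared_ccos(rows)
turnover = turnover_events(rows)
```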

Constructing a securities-lending revenue dataset

Per series, pull isFundSecuritiesLending, didFundLendSecurities, paymentToAgentManagerType, avgPortfolioSecuritiesValue, and netIncomeSecuritiesLending, joined to the lending agent name and LEI inside the same block. Aggregate netIncomeSecuritiesLending by lending agent to estimate complex-level lending revenue and produce agent league tables; cross-reference against custodian identity to separate captive from third-party lending arrangements.
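The league-table aggregation might look like the sketch below. It assumes one flat row per series with the lending agent's LEI attached under a hypothetical agentLei key, a "Y"/"N" flag encoding, and numeric income values; none of these details are confirmed by the schema description above.

```python
from collections import defaultdict


def lending_league_table(rows):
    """Sum netIncomeSecuritiesLending by lending agent, largest first."""
    totals = defaultdict(float)
    for row in rows:
        if row["didFundLendSecurities"] != "Y":
            continue  # skip series that were authorized to lend but did not
        totals[row["agentLei"]] += row["netIncomeSecuritiesLending"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical per-series rows.
rows = [
    {"agentLei": "LEI_AGENT_A", "didFundLendSecurities": "Y", "netIncomeSecuritiesLending": 100.0},
    {"agentLei": "LEI_AGENT_B", "didFundLendSecurities": "Y", "netIncomeSecuritiesLending": 120.0},
    {"agentLei": "LEI_AGENT_A", "didFundLendSecurities": "Y", "netIncomeSecuritiesLending": 50.0},
    {"agentLei": "LEI_AGENT_B", "didFundLendSecurities": "N", "netIncomeSecuritiesLending": 0.0},
]
table = lending_league_table(rows)
```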

Detecting fund-lifecycle events without a historical join

Filter on isRegistrantFirstFiling, isRegistrantLastFiling, and per-series isFirstFilingByFund, paired with numAddedClass and numTerminatedClass, to enumerate trust launches, liquidations, and class-level changes for a given period. Combine with <attachmentsTab> flags and any OTHER REQUIRED INFO narrative to characterize the event (merger, reorganization, liquidation). Output supports M&A trackers, mortality-rate studies, and news desks covering fund liquidations.
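A minimal sketch of the flag filter, assuming each filing has been parsed into a dictionary with a list of series entries. Treating numAddedClass and numTerminatedClass as per-series counts, and the "Y"/"N" encoding, are assumptions; the event labels are invented for illustration.

```python
def lifecycle_events(filing):
    """List lifecycle events implied by the first/last-filing flags and class counts."""
    events = []
    if filing.get("isRegistrantFirstFiling") == "Y":
        events.append("registrant_first_filing")
    if filing.get("isRegistrantLastFiling") == "Y":
        events.append("registrant_last_filing")
    for series in filing.get("series", []):
        series_id = series["mgmtInvSeriesId"]
        if series.get("isFirstFilingByFund") == "Y":
            events.append(f"series_first_filing:{series_id}")
        if int(series.get("numAddedClass", 0)) > 0:
            events.append(f"classes_added:{series_id}")
        if int(series.get("numTerminatedClass", 0)) > 0:
            events.append(f"classes_terminated:{series_id}")
    return events


# Hypothetical parsed filing.
filing = {
    "isRegistrantFirstFiling": "N",
    "isRegistrantLastFiling": "N",
    "series": [
        {"mgmtInvSeriesId": "S000000001", "isFirstFilingByFund": "Y", "numAddedClass": "2"},
        {"mgmtInvSeriesId": "S000000002", "isFirstFilingByFund": "N", "numTerminatedClass": "1"},
    ],
}
events = lifecycle_events(filing)
```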

Benchmarking exemptive-rule reliance across peer funds

For each series, enumerate <relyOnRuleTypes> (Rules 12d1-1, 17a-7, 17e-1, 32a-4, 10f-3, 22e-3, 15a-4) and join to fund classification (fundType, isNonDiversifiedCompany). Build a peer matrix at the strategy level showing reliance frequency; compliance counsel use the matrix to calibrate internal disclosures, while academics use it as a difference-in-differences treatment indicator around specific SEC rulemakings.
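The peer matrix reduces to a frequency table: for each fund type, the share of funds relying on each rule. The sketch below assumes each series has been flattened to a fundType string and a list of rule labels; the fundType values and rule label strings are placeholders, not the schema's enumerations.

```python
from collections import Counter, defaultdict


def reliance_matrix(funds):
    """Return {fundType: {rule: share_of_funds_relying_on_it}}."""
    totals = Counter()
    hits = defaultdict(Counter)
    for fund in funds:
        totals[fund["fundType"]] += 1
        for rule in set(fund["relyOnRuleTypes"]):  # de-duplicate within a fund
            hits[fund["fundType"]][rule] += 1
    return {
        fund_type: {rule: count / n for rule, count in sorted(hits[fund_type].items())}
        for fund_type, n in totals.items()
    }


# Hypothetical series-level rows.
funds = [
    {"fundType": "ETF", "relyOnRuleTypes": ["Rule 12d1-1", "Rule 17a-7"]},
    {"fundType": "ETF", "relyOnRuleTypes": ["Rule 12d1-1"]},
    {"fundType": "Closed-End", "relyOnRuleTypes": []},
]
matrix = reliance_matrix(funds)
```

Fund types with no reported reliance still appear in the output with an empty mapping, which keeps the denominator visible.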

Stitching N-CEN/A amendment chains to corrected fields

Treat metadata.json.formType == "N-CEN/A" records as corrections and use <headerData>/<accessionNumber> (carrying the original accession) to link each amendment to its predecessor. Diff the <formData> subtrees between the original and amended payloads to isolate which items were restated, and parse any OTHER REQUIRED INFO exhibit for the rationale. Output supports data-quality pipelines that flag stale service-provider or auditor records and supplies a corrected snapshot for downstream joins.
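The subtree diff can be sketched with the standard library alone. The example flattens each <formData> tree into path/value pairs and reports paths whose values differ. The child element names in the sample XML (admins, admin, adminName) are hypothetical, and real payloads may carry XML namespaces, which would simply appear inside the path strings.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def flatten(elem, prefix=""):
    """Flatten an element tree into {indexed_path: text_or_attribute_value}."""
    out = {}
    seen = Counter()
    for child in elem:
        seen[child.tag] += 1  # index repeated siblings so paths stay unique
        path = f"{prefix}/{child.tag}[{seen[child.tag]}]"
        for attr, value in child.attrib.items():
            out[f"{path}/@{attr}"] = value
        if len(child):
            out.update(flatten(child, path))
        else:
            out[path] = (child.text or "").strip()
    return out


def diff_form_data(original_xml, amended_xml):
    """Return {path: (original_value, amended_value)} for every changed path."""
    before = flatten(ET.fromstring(original_xml))
    after = flatten(ET.fromstring(amended_xml))
    return {
        path: (before.get(path), after.get(path))
        for path in sorted(set(before) | set(after))
        if before.get(path) != after.get(path)
    }


# Hypothetical fragments standing in for the original and amended payloads.
original = ("<formData><admins><admin><adminName>Old Admin LLC</adminName></admin></admins>"
            "<isRegistrantLastFiling>N</isRegistrantLastFiling></formData>")
amended = ("<formData><admins><admin><adminName>New Admin LLC</adminName></admin></admins>"
           "<isRegistrantLastFiling>N</isRegistrantLastFiling></formData>")

restated = diff_form_data(original, amended)
```

Positional indexing means an inserted or removed repeating element shifts later siblings; a production diff would likely key repeating blocks on an identifier instead.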

Sizing top-broker concentration and soft-dollar usage

Unroll <brokers> per series for grossCommission by broker (with CRD/LEI), the <aggregateCommission> total, and <principalTransactions> counterparty volumes, alongside the isBrokerageResearchPayment Section 28(e) flag. Aggregate commissions by broker LEI across all series in a complex to produce concentration metrics; intersect with the soft-dollar flag to identify funds where research payments are bundled into commission flow.
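A minimal sketch of the concentration step, assuming one flat row per series and broker with numeric commissions. The key names seriesId and brokerLei, the "Y"/"N" flag encoding, and the figures are illustrative assumptions.

```python
from collections import defaultdict


def commission_shares(rows):
    """Share of total gross commissions paid to each broker, largest first."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["brokerLei"]] += row["grossCommission"]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return {lei: amount / grand_total for lei, amount in ranked}


def soft_dollar_series(rows):
    """Series that report paying for brokerage research through commissions."""
    return sorted({r["seriesId"] for r in rows if r["isBrokerageResearchPayment"] == "Y"})


# Hypothetical rows for one complex.
rows = [
    {"seriesId": "S000000001", "brokerLei": "LEI_BROKER_A", "grossCommission": 600.0,
     "isBrokerageResearchPayment": "Y"},
    {"seriesId": "S000000002", "brokerLei": "LEI_BROKER_A", "grossCommission": 200.0,
     "isBrokerageResearchPayment": "N"},
    {"seriesId": "S000000002", "brokerLei": "LEI_BROKER_B", "grossCommission": 200.0,
     "isBrokerageResearchPayment": "N"},
]
shares = commission_shares(rows)
soft_dollar = soft_dollar_series(rows)
```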

Dataset Access

The Form N-CEN Files dataset is available through three access methods: a JSON metadata endpoint, a full dataset archive download, and per-container downloads. Containers are organized by month from September 2018 to the present and follow the naming pattern YYYY/YYYY-MM.zip.

Dataset Index JSON API: https://api.sec-api.io/datasets/form-ncen-files.json

This endpoint returns dataset-level metadata along with the complete list of container files and their individual download URLs, sizes, record counts, and last-updated timestamps. Use it to discover available monthly containers and to monitor which containers were refreshed in recent runs, so you can incrementally download only the containers that changed. This endpoint does not require an API key.

Example response:

{
  "datasetId": "1f13365b-9ae0-6911-80df-116297a4e0c0",
  "datasetDownloadUrl": "https://api.sec-api.io/datasets/form-ncen-files.zip",
  "name": "Form N-CEN Files Dataset",
  "updatedAt": "2026-05-16T03:02:38.497Z",
  "earliestSampleDate": "2018-09-01",
  "totalRecords": 90391,
  "totalSize": 2652158043,
  "formTypes": ["N-CEN", "N-CEN/A"],
  "containerFormat": "ZIP",
  "fileTypes": ["XML", "HTML", "JSON", "TXT", "PDF"],
  "containers": [
    {
      "downloadUrl": "https://api.sec-api.io/datasets/form-ncen-files/2026/2026-05.zip",
      "key": "2026/2026-05.zip",
      "size": 13818783,
      "records": 154,
      "updatedAt": "2026-05-16T03:02:38.497Z"
    }
  ]
}
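The incremental-download check described above can be sketched as a comparison between each container's updatedAt and the timestamp recorded on the previous run. The helper names and the shape of the locally stored state (a mapping from container key to the last seen timestamp) are assumptions; the field names come from the example response.

```python
import json
import urllib.request
from datetime import datetime

INDEX_URL = "https://api.sec-api.io/datasets/form-ncen-files.json"


def fetch_index(url=INDEX_URL):
    """Download and parse the dataset index (no API key needed per the text)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def parse_timestamp(value):
    """Parse timestamps such as 2026-05-16T03:02:38.497Z into aware datetimes."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def changed_containers(index, last_seen):
    """Containers that are new or were refreshed since the stored timestamps.

    last_seen is a hypothetical local state: {container key: updatedAt string}.
    """
    changed = []
    for container in index["containers"]:
        previous = last_seen.get(container["key"])
        if previous is None or parse_timestamp(container["updatedAt"]) > parse_timestamp(previous):
            changed.append(container)
    return changed


# Offline demonstration using the container entry from the example response.
sample_index = {
    "containers": [
        {"key": "2026/2026-05.zip", "updatedAt": "2026-05-16T03:02:38.497Z"},
    ]
}
stale_state = {"2026/2026-05.zip": "2026-05-15T03:00:00.000Z"}
to_download = changed_containers(sample_index, stale_state)
```

In a live run, `changed_containers(fetch_index(), stored_state)` would return the containers to re-download, after which the stored state is updated with the new timestamps.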

Download Entire Dataset: https://api.sec-api.io/datasets/form-ncen-files.zip?token=YOUR_API_KEY

This URL returns the full dataset as a single ZIP archive containing every monthly container from September 2018 onward. This endpoint requires an API key passed via the token query parameter.

Download Single Container: https://api.sec-api.io/datasets/form-ncen-files/2026/2026-03.zip?token=YOUR_API_KEY

Use this URL pattern to download an individual monthly container. Replace the year and month in the path to fetch any container listed in the index. To iterate the full history, walk months from 2018/2018-09.zip through the latest container reported by the JSON index, using the updatedAt field on each container to detect changes between refresh runs. This endpoint requires an API key passed via the token query parameter.
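Walking the monthly history amounts to generating keys in the YYYY/YYYY-MM.zip pattern. A minimal sketch follows; the helper names are illustrative, the end month should in practice come from the JSON index, and YOUR_API_KEY is the placeholder used above.

```python
BASE_URL = "https://api.sec-api.io/datasets/form-ncen-files"


def month_keys(start=(2018, 9), end=(2026, 5)):
    """Container keys for every month from start to end, inclusive."""
    year, month = start
    keys = []
    while (year, month) <= end:
        keys.append(f"{year}/{year}-{month:02d}.zip")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return keys


def container_url(key, token):
    """Download URL for one container, with the API key as the token parameter."""
    return f"{BASE_URL}/{key}?token={token}"


keys = month_keys()
first_url = container_url(keys[0], "YOUR_API_KEY")
```

With the end set to May 2026, the walk produces 93 keys, which matches the file count shown in the dataset listing. Because this assumes no gaps in the monthly sequence, the keys reported by the index remain the authoritative list.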

Frequently Asked Questions

What form does this dataset cover?

The dataset packages every EDGAR submission of Form N-CEN and its amendment Form N-CEN/A, the structured XML annual census report adopted under SEC Release IC-32314 and effective June 1, 2018. Form N-CEN replaced the legacy semi-annual Form N-SAR and is the canonical Investment Company Act census filing under Section 30 and Rule 30a-1.

What does one record in this dataset represent?

One record is a single EDGAR filing of Form N-CEN (or N-CEN/A), addressed by its accession number and packaged as a self-contained accession folder containing metadata.json, the canonical primary_doc.xml, the EDGAR-rendered XHTML view, and any Internal Control Report or Other Required Information attachments. Because a single N-CEN filing typically reports on a registrant trust holding many series, series-level analysis requires unrolling the repeating <managementInvestmentQuestion> block inside the XML rather than treating the folder as a single fund.
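Unrolling the repeating block might look like the sketch below, which emits one row per <managementInvestmentQuestion> element. The sample XML is synthetic: the namespace URI, the wrapper element, and the mgmtInvFundName child are assumptions, and matching on local tag names is a convenience for not hard-coding a namespace.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an optional {namespace} prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def unroll_series(xml_text):
    """Return one dict of leaf fields per managementInvestmentQuestion block."""
    root = ET.fromstring(xml_text)
    rows = []
    for elem in root.iter():
        if local_name(elem.tag) == "managementInvestmentQuestion":
            rows.append({
                local_name(child.tag): (child.text or "").strip()
                for child in elem
                if len(child) == 0  # leaf fields only; nested blocks need their own pass
            })
    return rows


# Synthetic payload with two series under one registrant.
sample_xml = """
<edgarSubmission xmlns="urn:example:ncen">
  <formData>
    <managementInvestmentQuestion>
      <mgmtInvFundName>Example Fund A</mgmtInvFundName>
      <mgmtInvSeriesId>S000000001</mgmtInvSeriesId>
    </managementInvestmentQuestion>
    <managementInvestmentQuestion>
      <mgmtInvFundName>Example Fund B</mgmtInvFundName>
      <mgmtInvSeriesId>S000000002</mgmtInvSeriesId>
    </managementInvestmentQuestion>
  </formData>
</edgarSubmission>
"""
series_rows = unroll_series(sample_xml)
```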

Who is required to file Form N-CEN?

Registered management investment companies (open-end mutual funds, closed-end funds, ETFs, and registered SBICs) and registered unit investment trusts (including insurance company separate accounts that fund variable annuity and variable life contracts) file Form N-CEN. Business development companies, private funds under Section 3(c)(1) or 3(c)(7), face-amount certificate companies, and operating companies do not. Advisers, custodians, transfer agents, administrators, and auditors are disclosed within the form but do not file it.

When is Form N-CEN due?

Management investment companies must file within 75 days after the end of their fiscal year, and unit investment trusts must file within 75 days after the end of the calendar year. Because fund fiscal year-ends are spread throughout the calendar, N-CEN filings appear continuously, with peaks roughly 75 days after common year-ends in December, June, September, and October.

What time period does the dataset cover?

The dataset begins with the first wave of N-CEN filings in September 2018, about three months after N-CEN replaced N-SAR on June 1, 2018, and continues to the present with monthly containers. Legacy N-SAR filings from before mid-2018 are out of scope and are not included.

How does Form N-CEN differ from Form N-PORT and Form N-CSR?

Form N-CEN is the structural and operational census of the fund: identity, organizational form, board composition, service providers, exemptive-rule reliance, securities-lending, and credit-line participation. Form N-PORT is the monthly position-level holdings report (security identifiers, fair values, derivatives exposures, liquidity classifications), and Form N-CSR is the certified shareholder report carrying audited financial statements and shareholder-letter narrative. The three forms are complements rather than substitutes — N-CEN anchors fund identity and service-provider relationships, N-PORT carries portfolio holdings, and N-CSR carries audited financials.

What file formats are inside each container?

Each monthly ZIP container holds one accession folder per filing. Inside each folder, the file types are XML (the canonical primary_doc.xml and the XSL-rendered view), JSON (the metadata.json header), HTML, plain text, and PDF. Internal Control Reports and Other Required Information exhibits may appear as HTML, plain text, or PDF, and are identified authoritatively by the SGML <TYPE> line rather than by file extension.
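Grouping a container's contents by accession folder can be sketched with the standard zipfile module. The example builds a tiny in-memory ZIP so that it runs without a download. The accession numbers are placeholders, and the assumption that each file's immediate parent directory is its accession folder should be checked against a real container.

```python
import io
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath


def files_by_accession(zip_source):
    """Map each accession folder to the sorted list of file names inside it."""
    folders = defaultdict(list)
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # skip explicit directory entries
            path = PurePosixPath(name)
            folders[path.parent.name].append(path.name)
    return {accession: sorted(names) for accession, names in folders.items()}


# Build a small stand-in container in memory.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("0000000000-24-000001/metadata.json", "{}")
    archive.writestr("0000000000-24-000001/primary_doc.xml", "<edgarSubmission/>")
    archive.writestr("0000000000-24-000002/metadata.json", "{}")
buffer.seek(0)

listing = files_by_accession(buffer)
```

The same function accepts a path to a downloaded container such as 2026-03.zip in place of the in-memory buffer.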