The Form N-CEN Files dataset is a structured collection of annual census reports filed by registered investment companies on EDGAR under Rule 30a-1 of the Investment Company Act of 1940. Each record packages a single Form N-CEN filing (or its amendment, Form N-CEN/A) as a self-contained accession-number folder containing the canonical XML payload, an EDGAR-normalized metadata header, an EDGAR-rendered XHTML view, and any attachments such as auditor Internal Control Reports and Other Required Information exhibits. The filers are open-end mutual funds, closed-end funds, exchange-traded funds, unit investment trusts, insurance company separate accounts, and registered small business investment companies, each filing through its officers within 75 days after the registrant's fiscal year end (or calendar year end for UITs). The dataset begins with the first wave of N-CEN filings in September 2018, the first month in which filings arrived after Form N-CEN replaced the legacy semi-annual Form N-SAR in June 2018, and continues to the present in monthly containers. It is the canonical structured source for fund identity, organizational form, board composition, service-provider relationships, securities-lending and credit-line activity, exemptive-rule reliance, and lifecycle events across the registered investment-company population.
The dataset is built from Form N-CEN, the structured XML annual census report adopted by the SEC's October 2016 Investment Company Reporting Modernization rule (Release IC-32314) and effective June 1, 2018. Form N-CEN replaced the legacy semi-annual Form N-SAR — a fixed-field text questionnaire designed for the ASCII EDGAR era — with an annual XML report built on a published schema with named elements, validation rules, and a dedicated rendering stylesheet. Every registered management investment company and every registered unit investment trust is in scope; the dataset covers the entire filer population defined by Section 30 of the Investment Company Act and Rule 30a-1 thereunder.
The dataset is distributed as a sequence of monthly ZIP containers organized by year (YYYY/YYYY-MM.zip), beginning with September 2018 — the first month in which N-CEN filings actually arrived in EDGAR — and continuing on a monthly cadence. Filings made on the legacy Form N-SAR before mid-2018 are out of scope and are not present in the dataset. File types found inside each container are XML, HTML, JSON, TXT, and PDF. Form N-CEN collects identifying information about the registrant trust and each of its series, operational metrics for the fiscal year, names and regulatory identifiers (CIK, file number, CRD, LEI, RSSD, PCAOB) of every material service provider, governance details (directors, chief compliance officer, principal underwriter, public accountant), securities-lending and credit-line activity, exemptive rule reliance, and a battery of yes/no flags that surface compliance, custody, valuation, and lifecycle events. Several items require attachments — most prominently the auditor's Internal Control Report — that are filed as separate documents inside the same EDGAR submission and appear as separate files inside the dataset's accession folder.
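As a minimal sketch of the YYYY/YYYY-MM.zip layout described above, the following Python helper enumerates the relative container paths from September 2018 through a chosen month. The function name `container_paths` is hypothetical, and the paths are left relative because the download host is not stated in this description.

```python
from datetime import date


def container_paths(end, start=date(2018, 9, 1)):
    """List relative monthly container paths (YYYY/YYYY-MM.zip) from start to end."""
    paths = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        paths.append(f"{year:04d}/{year:04d}-{month:02d}.zip")
        # Roll over to January of the next year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return paths


print(container_paths(date(2019, 1, 15)))
```

Any month before September 2018 yields an empty list, which matches the statement that legacy N-SAR filings are out of scope.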
A single record in the Form N-CEN Files Dataset is one EDGAR filing of Form N-CEN, or its amendment Form N-CEN/A, addressed by its accession number and packaged as a self-contained accession folder inside a monthly ZIP. The on-disk unit is therefore one annual census report submitted by a registered investment company under Rule 30a-1 — covering open-end funds, closed-end funds, exchange-traded funds, unit investment trusts, separate accounts, and small business investment companies — filed within 75 days after the registrant's fiscal year end.
Although the dataset's record unit is the filing (one accession folder), the underlying N-CEN payload is hierarchical: a single filing routinely reports on a registrant trust that holds many fund series, and each series carries its own block of operational data. Series-level analytics therefore require unrolling a repeating XML block inside the record rather than treating accession folders as series-level rows.
Each accession folder is a flat directory named after the dashless EDGAR accession number (for example 000089418925010351/), sitting one level deep inside the per-month ZIP. Two files are always present, and several optional files appear depending on which N-CEN items were answered:
- metadata.json: the normalized header extracted by the SEC API pipeline from the EDGAR submission envelope. Always present.
- primary_doc.xml: the canonical N-CEN XML payload conforming to the http://www.sec.gov/edgar/ncen schema. Always present.
- xslFormN-CEN_X05/primary_doc.xml: an EDGAR-rendered XHTML view of the XML produced by applying the NCEN_print.css stylesheet. The file extension is .xml but the content is XHTML 1.0 Strict; the folder name encodes the schema version (X05). Absent for some older or paper-style submissions, so this folder should be treated as optional.
- Internal Control Report attachments (.htm, .txt, or .pdf), each carrying SGML document headers identifying its <TYPE> as INTERNAL CONTROL RPT. Trusts that share an auditor across all series file a single consolidated report; multi-family trusts whose sub-portfolios use different auditors carry one report per sub-portfolio.
- OTHER REQUIRED INFO exhibits (.htm or .txt) carrying explanatory notes, item-specific addenda, or amendment rationales.

Image files (logos, signature scans) from the original EDGAR submission are not retained, and the full SGML-wrapped submission text file (*.txt) referenced from documentFormatFiles[] is not materialized as a separate file inside the folder; the dataset retains the parsed components rather than the concatenated SGML envelope. Every non-XML attachment, however, preserves its own EDGAR SGML document header (<DOCUMENT>, <TYPE>, <SEQUENCE>, <FILENAME>, <DESCRIPTION>, <TEXT> wrappers) inline, so the authoritative document-type label travels with the file content rather than being inferred from the extension.
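The folder layout above can be walked with the standard library alone. The sketch below is illustrative rather than a reference reader: it assumes an accession folder is any 18-digit path component inside the ZIP (so it tolerates an extra wrapping directory), and the helper names `index_accession_folders` and `missing_required` are hypothetical. A tiny in-memory archive stands in for a real monthly container.

```python
import io
import zipfile
from collections import defaultdict

REQUIRED = ("metadata.json", "primary_doc.xml")


def index_accession_folders(zip_source):
    """Map each accession folder to the sorted relative paths of its files."""
    folders = defaultdict(list)
    with zipfile.ZipFile(zip_source) as zf:
        for name in zf.namelist():
            if name.endswith("/"):
                continue  # skip explicit directory entries
            parts = name.split("/")
            # Assumption: the first 18-digit component is the dashless accession.
            for i, part in enumerate(parts[:-1]):
                if len(part) == 18 and part.isdigit():
                    folders[part].append("/".join(parts[i + 1:]))
                    break
    return {acc: sorted(files) for acc, files in folders.items()}


def missing_required(files):
    """Return the always-present files that are absent from a folder listing."""
    return [name for name in REQUIRED if name not in files]


# Demo container built in memory so the sketch runs without a download.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    for member in (
        "000089418925010351/metadata.json",
        "000089418925010351/primary_doc.xml",
        "000089418925010351/xslFormN-CEN_X05/primary_doc.xml",
    ):
        zf.writestr(member, "")

demo = index_accession_folders(buf)
print(demo)
```

Treating the rendering folder and attachments as optional, and only the two core files as required, mirrors the presence rules stated in the list above.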
metadata.json header

The metadata file is a single JSON object that lifts EDGAR submission-header fields into stable, named keys: a set of top-level scalars plus a few arrays.
Top-level scalar fields:
- formType: N-CEN or N-CEN/A.
- accessionNo: the EDGAR accession in dashed form (0000894189-25-010351).
- filedAt: ISO-8601 timestamp with timezone offset.
- effectivenessDate: date the filing becomes effective.
- periodOfReport: fiscal-year end the report covers.
- description: human-readable filing description from EDGAR.
- linkToTxt, linkToHtml, linkToFilingDetails, linkToXbrl: canonical EDGAR URLs for the full SGML-wrapped submission, the filing-index HTML page, and the XSL-rendered XML view.
- id: internal pipeline identifier for the record.

Array fields enumerate the submission's components:
- documentFormatFiles[] lists every document attached to the EDGAR submission, with sequence, size in bytes (as a string), documentUrl, type (for example N-CEN/A, INTERNAL CONTROL RPT, OTHER REQUIRED INFO), and an optional description. A trailing entry with a blank type points at the full SGML-wrapped submission text file on EDGAR.
- dataFiles[] lists XBRL data attachments when present. N-CEN does not carry XBRL data, so this array is empty.

Array fields that expose the registrant graph:
- entities[] records the filer entity (or entities) lifted from the EDGAR header, with cik, companyName, irsNo, fileNo, filmNo, fiscalYearEnd as MMDD, stateOfIncorporation, the SEC act (40 for the Investment Company Act of 1940), and a type field that echoes the form type.
- seriesAndClassesContractsInformation[] contains one entry per fund series belonging to the registrant trust, each with the series S-number, the series name, and a nested classesContracts[] array of {ticker?, name, classContract} triples (the C-number identifying each share class). Tickers are omitted for share classes that are not publicly traded, so the field is sparse rather than universally populated.

The metadata header is therefore sufficient on its own to answer "what trust, which fiscal year, which series and classes, which attachments" without parsing the XML body.
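To illustrate how the header alone answers "which series and classes", here is a hedged Python sketch that flattens seriesAndClassesContractsInformation[] into one row per share class. The keys `series` and `name` on each series entry are assumptions (the description names the values but not the keys), the function name `flatten_classes` is hypothetical, and the sample record is invented for the demonstration.

```python
def flatten_classes(metadata):
    """Return one row per share class listed in a metadata.json object."""
    rows = []
    for series in metadata.get("seriesAndClassesContractsInformation", []):
        for cls in series.get("classesContracts", []):
            rows.append({
                "accessionNo": metadata.get("accessionNo"),
                "seriesId": series.get("series"),    # assumed key for the S-number
                "seriesName": series.get("name"),    # assumed key for the series name
                "classId": cls.get("classContract"),  # the C-number
                "className": cls.get("name"),
                "ticker": cls.get("ticker"),          # None when the ticker is omitted
            })
    return rows


# Invented sample; real records come from json.load() on metadata.json.
sample_metadata = {
    "accessionNo": "0000000000-25-000001",
    "seriesAndClassesContractsInformation": [
        {
            "series": "S000000001",
            "name": "Example Fund",
            "classesContracts": [
                {"ticker": "EXMPX", "name": "Class A", "classContract": "C000000001"},
                {"name": "Class I", "classContract": "C000000002"},
            ],
        }
    ],
}

class_rows = flatten_classes(sample_metadata)
print(class_rows)
```

Reading the ticker with `.get` keeps the sparse field from raising on classes that are not publicly traded.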
primary_doc.xml payload

The canonical regulatory body of an N-CEN filing is a single XML document rooted at <edgarSubmission xmlns="http://www.sec.gov/edgar/ncen"> with a <schemaVersion> child (recent filings use X0505). The root has two principal children: <headerData> carrying the submission envelope, and <formData> carrying the substantive census disclosures. The element order is fixed by the schema. A trust covering many series can produce an XML document several thousand lines long, with the bulk of the volume concentrated in the per-series block.

<headerData>

The header block restates the submission-level identification:
- <submissionType>: N-CEN or N-CEN/A.
- <accessionNumber>: for amendments, this carries the original N-CEN accession that the amendment supersedes, not the amendment's own accession. The amendment's own accession remains in metadata.json and in the folder name. This split lets downstream consumers stitch corrections back to the filing they correct.
- <filerInfo>: wraps liveTestFlag, a filer/issuerCredentials/cik element, a redacted ccc (CIK confirmation code), and flag children overrideInternetFlag and confirmingCopyFlag. It also carries investmentCompanyType, encoding the registrant's registration form: N-1A for open-end management investment companies, N-2 for closed-end management investment companies, N-3 for separate accounts organized as management companies, N-4 for variable annuity separate accounts, N-5 for SBICs, and N-6 for variable-life separate accounts.
- <seriesClass>/<reportSeriesClass>: a list of <rptSeriesClassInfo> entries pairing each seriesId reported on with includeAllClassesFlag or with explicit class IDs.

<formData>

<formData> is structurally divided into a small number of trust-level subsections and a repeating per-series block.
<generalInfo> — a single attribute-bearing element exposing reportEndingPeriod (the fiscal-year end the report covers) and isReportPeriodLt12 (a flag indicating the report covers a short period of less than twelve months, typically used in first or final filings).
<registrantInfo> — trust-level data. Disclosures about the registered investment company as a whole, stated once per record:
- registrantFullName, investmentCompFileNo (the 811-… Investment Company Act file number), registrantCik, registrantLei, full street/city/state/zip/country address, phone, and websites/website/@webpage URLs.
- <locationBooksRecords>/<locationBooksRecord>, one entry per office holding books and records under Section 31(a) of the Investment Company Act, each with address, officeStateCountry, phone, and a free-text booksRecordsDesc naming the role (adviser, custodian, administrator, distributor, legal counsel, transfer agent).
- isRegistrantFirstFiling, isRegistrantLastFiling, isRegistrantFamilyInvComp (membership in a fund family).
- registrantClassificationType, totalSeries (count of series reported), isSecuritiesActRegistration (whether shares are registered under the Securities Act of 1933).
- <directors>/<director> with directorName, crdNumber, isDirectorInterestedPerson (the Section 2(a)(19) interested person determination), and fileNumbers/fileNumberInfo cross-references to the other registrants on whose boards the director also serves.
- <chiefComplianceOfficers>, required by Rule 38a-1, with name, CRD, address, phone, the list of employers (CCOs may serve multiple investment companies), and isCcoChangedSinceLastFiling.
- securityMatterSeriesInfo, isPreviousLegalProceeding, isPreviousProceedingTerminated, isClaimFiled, coveredByInsurancePolicy, isFinancialSupportDuringPeriod, isExemptionFromAct.
- <principalUnderwriters> with name, SEC file number, CRD, LEI, FDIC/Federal Reserve RSSD ID, state, and isPrincipalUnderwriterAffiliatedWithRegistrant, paired with isUnderwriterHiredOrTerminated.
- <publicAccountants> with name, pcaobNumber, LEI, RSSD ID, state, and isPublicAccountantChanged (the N-CEN analogue of an 8-K Item 4.01 auditor-change event).
- isMaterialWeakness, isOpinionOffered, isMaterialChange, isAccountingPrincipleChange, isPaymentErrorInNetAssetValue, isPaymentDividend.

<managementInvestmentQuestionSeriesInfo>: the repeating per-series block.
The substantive heart of the form: one <managementInvestmentQuestion> element per series under the registrant. Each block repeats the same structure:
- mgmtInvFundName, mgmtInvSeriesId (the S-number), mgmtInvLei, isFirstFilingByFund.
- numAuthorizedClass, numAddedClass, numTerminatedClass, plus a <sharesOutstandings>/<sharesOutstanding> list with one element per share class carrying className, the C-number class ID, and ticker.
- fundType, isNonDiversifiedCompany, isForeignSubsidiary, plus securities lending fields (isFundSecuritiesLending, didFundLendSecurities, paymentToAgentManagerType, avgPortfolioSecuritiesValue, netIncomeSecuritiesLending).
- <relyOnRuleTypes>/<relyOnRuleType> enumerating the 1940 Act rules the fund relied on during the year (for example Rule 12d1-1 for fund-of-funds investments in money-market funds, Rule 32a-4 for the audit-committee exemption, Rule 17a-7 for cross-trades).
- isExpenseLimitationInPlace, isExpenseReducedOrWaived, isFeesWaivedRecoupable, isExpenseWaivedRecoupable.
- Service-provider lists, each entry carrying name, file numbers, CRD, LEI, RSSD, stateCountry, affiliation flags, and sub-provider information, paired with a top-level hire-or-terminate boolean:
  - <investmentAdvisers>/<investmentAdviser> + isAdviserHiredOrTerminated
  - <transferAgents>/<transferAgent> + isTransferAgentHiredOrTerminated
  - <pricingServices>/<pricingService> + isPricingServiceHiredOrTerminated
  - <custodians>/<custodian> (carries an additional custodyType distinguishing self-custody, foreign sub-custodian, depository) + hire/terminate flag
  - <shareholderServicingAgents>/<shareholderServicingAgent> + hire/terminate flag
  - <admins>/<admin> (fund administrators) + hire/terminate flag
- <brokers>/<broker> listing top broker-dealers by commission paid, each with file number, CRD, LEI, RSSD, state, and grossCommission, alongside an <aggregateCommission> total. <principalTransactions>/<principalTransaction> lists counterparties for principal trades, with principalAggregatePurchase aggregating volume. isBrokerageResearchPayment records Section 28(e) soft-dollar arrangements. mnthlyAvgNetAssets carries monthly average net assets.
- <lineOfCredit hasLineOfCredit="…">. When the fund has a facility, a nested <lineOfCreditDetails>/<lineOfCreditDetail> carries the committed/uncommitted flag, lineOfCreditSize, lender names, a sharedCreditType distinguishing sole from shared facilities (with <creditUser> siblings naming every other fund participating in a shared facility), and a creditLineUsed group reporting isCreditLineUsed, averageCreditLineUsed, and daysCreditUsed.
- isInterfundLending, isInterfundBorrowing, isSwingPricing.

Because this block repeats per series, service-provider identities, brokerage economics, lending and credit activity, and operational flags are all series-scoped, even when the trust shares advisers or custodians across its series.
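Unrolling the repeating per-series block might look like the following Python sketch. It is written under stated assumptions: the description does not say whether fields such as mgmtInvSeriesId are child elements or attributes, so the hypothetical `field` helper tries both; the sample XML is invented, and the "Y" value for hasLineOfCredit is a placeholder rather than a documented encoding.

```python
import xml.etree.ElementTree as ET

NS = "http://www.sec.gov/edgar/ncen"


def field(elem, name):
    """Read `name` from a child element's text or, failing that, an attribute."""
    child = elem.find(f"{{{NS}}}{name}")
    if child is not None and child.text:
        return child.text.strip()
    return elem.get(name)


def unroll_series(xml_text):
    """Return one dict per <managementInvestmentQuestion> block in a filing."""
    root = ET.fromstring(xml_text)
    rows = []
    for q in root.iter(f"{{{NS}}}managementInvestmentQuestion"):
        credit = q.find(f"{{{NS}}}lineOfCredit")
        advisers = q.findall(f"{{{NS}}}investmentAdvisers/{{{NS}}}investmentAdviser")
        rows.append({
            "seriesId": field(q, "mgmtInvSeriesId"),
            "fundName": field(q, "mgmtInvFundName"),
            "lei": field(q, "mgmtInvLei"),
            "adviserCount": len(advisers),
            "hasLineOfCredit": None if credit is None else credit.get("hasLineOfCredit"),
        })
    return rows


# Invented two-series sample; the second series uses attributes on purpose.
SAMPLE_XML = """<edgarSubmission xmlns="http://www.sec.gov/edgar/ncen"><formData>
<managementInvestmentQuestionSeriesInfo>
<managementInvestmentQuestion>
<mgmtInvFundName>Example Fund A</mgmtInvFundName>
<mgmtInvSeriesId>S000000001</mgmtInvSeriesId>
<investmentAdvisers><investmentAdviser/><investmentAdviser/></investmentAdvisers>
<lineOfCredit hasLineOfCredit="Y"/>
</managementInvestmentQuestion>
<managementInvestmentQuestion mgmtInvFundName="Example Fund B" mgmtInvSeriesId="S000000002"/>
</managementInvestmentQuestionSeriesInfo></formData></edgarSubmission>"""

series_rows = unroll_series(SAMPLE_XML)
print(series_rows)
```

One filing yields as many rows as it has series, which is why counting accession folders understates the number of funds.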
<exchangeSeriesInfo> — relevant for closed-end funds and exchange-listed series. The element is self-closing when no series is exchange-listed; when populated, it lists listing exchanges per series along with ticker symbols and trading-status information.
<attachmentsTab> — a block of boolean indicators identifying which exhibits accompany the form. Flags align with the actual files in the accession folder — for example isIPAReportInternalControl (the auditor's Internal Control Report under Section 17(f) of the Investment Company Act and Item G.1.a.iii) and isOtherInfoRequired (the Other Required Information exhibit). These booleans provide a cheap consistency check against the on-disk attachment set.
<signature> — a single self-closing attribute-only element carrying registrantSignedName, signedDate, signature (the typed /s/ form), and title. There is no separate signature page or notarization.
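The consistency check that the <attachmentsTab> flags make possible can be sketched as a pure function over two inputs: the flag values read from the XML and the document types listed in metadata.json documentFormatFiles[]. The set of truthy spellings is an assumption, since the flag encoding is not specified here, and the names `FLAG_TO_TYPE` and `attachment_mismatches` are hypothetical.

```python
# Assumption: the boolean encoding is not documented here, so accept common spellings.
TRUTHY = {"y", "yes", "true", "1"}

# Pairings taken from the description of <attachmentsTab> above.
FLAG_TO_TYPE = {
    "isIPAReportInternalControl": "INTERNAL CONTROL RPT",
    "isOtherInfoRequired": "OTHER REQUIRED INFO",
}


def attachment_mismatches(flags, document_types):
    """Compare attachment flags with the document types actually listed."""
    present = {t.strip().upper() for t in document_types if t}
    problems = []
    for flag, doc_type in FLAG_TO_TYPE.items():
        flagged = str(flags.get(flag, "")).strip().lower() in TRUTHY
        found = doc_type in present
        if flagged and not found:
            problems.append((flag, doc_type, "flag set but attachment missing"))
        elif found and not flagged:
            problems.append((flag, doc_type, "attachment present but flag not set"))
    return problems


flags = {"isIPAReportInternalControl": "true", "isOtherInfoRequired": "false"}
print(attachment_mismatches(flags, ["N-CEN", "INTERNAL CONTROL RPT", ""]))
print(attachment_mismatches(flags, ["N-CEN"]))
```

The blank type in the first call stands for the trailing documentFormatFiles[] entry that points at the full submission text file.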
The xslFormN-CEN_X05/primary_doc.xml companion is EDGAR's human-readable rendering of the same XML payload, produced by applying the NCEN_print.css stylesheet during filing. The content is XHTML 1.0 Strict despite the .xml extension: tables, divs, and "fakeBox" widgets render the form's questions and answers, and boolean fields are displayed as radio-button images rather than text values. The view exposes exactly the same data as primary_doc.xml and is convenient when a downstream consumer wants a printable document rather than a parsed XML tree. It is a derived rendering and adds no regulatory information beyond what is in the canonical XML.
Non-XML attachments inside an accession folder fall into a small number of EDGAR document types, identified authoritatively by the <TYPE> line in the SGML document envelope:
- INTERNAL CONTROL RPT: the auditor's Internal Control Report, required when the fund's custody arrangements implicate Section 17(f) of the Investment Company Act and Item G.1.a.iii of N-CEN. The report is signed by the registrant's independent registered public accounting firm and follows the standard PCAOB-style "Report of Independent Registered Public Accounting Firm" header. Trusts whose sub-portfolios use different auditors carry several INTERNAL CONTROL RPT attachments with different filenames. The report may be rendered as HTML, plain text, or PDF depending on the auditor's preferred format.
- OTHER REQUIRED INFO: supplemental exhibits used for item-specific addenda, explanatory notes about specific responses, G.1.a.vi narratives that require free-text elaboration, or rationales accompanying an N-CEN/A that explain what was amended.
- N-CEN / N-CEN/A: the form's own XML body, always indexed by sequence number 1.

Each attachment is wrapped in the EDGAR SGML document envelope (<DOCUMENT>, <TYPE>, <SEQUENCE>, <FILENAME>, <DESCRIPTION>, <TEXT> lines) so downstream consumers can identify the document type and sequence even when the filename is not self-describing. Plain-text attachments use the same envelope around fixed-width formatted text rather than HTML. Because the same content type can appear with several extensions across records, the SGML <TYPE> line is the authoritative discriminator rather than the file extension.
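A small Python sketch of reading that envelope follows. It assumes each header tag sits on its own line with the value following the tag, and that the header ends at the <TEXT> line; the sample attachment and the function name `read_sgml_header` are invented for illustration. For PDF attachments, decoding only the leading bytes with errors="replace" before calling it would be one way to avoid choking on binary content.

```python
import re

# Header tags named in the description of the SGML document envelope.
_HEADER_LINE = re.compile(r"^<(TYPE|SEQUENCE|FILENAME|DESCRIPTION)>(.*)$")


def read_sgml_header(text):
    """Collect header fields from the lines that precede the <TEXT> wrapper."""
    header = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.upper().startswith("<TEXT>"):
            break  # everything after this point is document content
        match = _HEADER_LINE.match(line)
        if match:
            header[match.group(1)] = match.group(2).strip()
    return header


# Invented sample attachment.
SAMPLE_ATTACHMENT = """<DOCUMENT>
<TYPE>INTERNAL CONTROL RPT
<SEQUENCE>2
<FILENAME>example_icr.htm
<DESCRIPTION>Internal control report
<TEXT>
<html><body>Report of Independent Registered Public Accounting Firm</body></html>
"""

sgml_header = read_sgml_header(SAMPLE_ATTACHMENT)
print(sgml_header["TYPE"])
```

Classifying on the parsed TYPE value rather than the filename suffix follows the guidance above that the extension is not a reliable discriminator.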
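Included in each record:
- metadata.json
- primary_doc.xml
- the xslFormN-CEN_X05/primary_doc.xml rendering, when EDGAR produced one
- INTERNAL CONTROL RPT attachments, when filed
- OTHER REQUIRED INFO exhibits, when filed

Excluded:
- image files (logos, signature scans) from the original EDGAR submission
- the full SGML-wrapped submission .txt file referenced from documentFormatFiles[]; the dataset retains the parsed components instead

Amendments are first-class records in the dataset and appear as their own accession folders. Two pointers identify the amendment relationship: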
- metadata.json.formType is N-CEN/A and metadata.json.accessionNo is the amendment's own accession (matching the folder name).
- In primary_doc.xml, <headerData>/<accessionNumber> carries the original N-CEN accession that the amendment supersedes.

Beyond these identification fields, an N-CEN/A is structurally identical to an N-CEN: the same XML schema, the same <formData> subsections, the same attachment shape. Amendments routinely include an OTHER REQUIRED INFO exhibit narrating the rationale for the amendment and identifying which items were restated. There is no rule that an amendment must restate the entire form, but in practice the full payload is re-submitted with the corrected values.
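The two pointers can be combined into correction chains along these lines. This is a sketch under assumptions: each record is a plain dict that a caller has already assembled from metadata.json (formType, accessionNo, filedAt) plus the value of <headerData>/<accessionNumber>, stored under the hypothetical key `headerAccessionNumber`; dashes are stripped because the header value's formatting is not stated. All sample accessions are invented.

```python
from collections import defaultdict
from datetime import datetime


def dashless(accession):
    """Normalize an accession number to its 18-digit dashless form."""
    return accession.replace("-", "").strip()


def amendment_chains(records):
    """Map each original accession to its amendments, ordered by filing time."""
    chains = defaultdict(list)
    ordered = sorted(records, key=lambda r: datetime.fromisoformat(r["filedAt"]))
    for rec in ordered:
        original = rec.get("headerAccessionNumber")
        if rec["formType"] == "N-CEN/A" and original:
            chains[dashless(original)].append(dashless(rec["accessionNo"]))
    return dict(chains)


# Invented sample records: one original and two later amendments.
records = [
    {"formType": "N-CEN/A", "accessionNo": "0000000000-24-000003",
     "filedAt": "2024-06-01T09:00:00-04:00",
     "headerAccessionNumber": "0000000000-24-000001"},
    {"formType": "N-CEN", "accessionNo": "0000000000-24-000001",
     "filedAt": "2024-03-01T09:00:00-05:00", "headerAccessionNumber": None},
    {"formType": "N-CEN/A", "accessionNo": "0000000000-24-000002",
     "filedAt": "2024-04-15T16:30:00-04:00",
     "headerAccessionNumber": "0000000000-24-000001"},
]

chains = amendment_chains(records)
print(chains)
```

Taking the last accession in each chain gives the most recent restatement of a fiscal-year report, which is usually the version analysts want.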
Within the N-CEN era the form has gone through several schema-version revisions. The schema version is exposed in <schemaVersion> at the root of the XML and echoed in the rendering folder name (xslFormN-CEN_X05/). Successive revisions reflect SEC rulemakings that incrementally added items: the 2020 Fair Valuation rule (Rule 2a-5) introduced fair value designation questions; the 2023 Tailored Shareholder Reports rule added items about shareholder-report formats; the 2023 Names Rule amendments added Rule 35d-1 compliance items; and 2024 ETF and tender-offer fund rulemakings introduced operational disclosures for liquidity-management arrangements. The schema is backward-compatible at the element level — older elements persist — but new elements are added with each schema version, so consumers parsing across years should treat <schemaVersion> as the authoritative guide to which elements to expect.
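One defensive way to handle schema drift is to record the declared version and then probe for the elements a given analysis needs, instead of hard-coding a version-to-element table that this description does not provide. The Python sketch below assumes <schemaVersion> is a child of the root (with a tree-wide search as a fallback); `schema_profile` and the element name `someNewerItem` are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = "http://www.sec.gov/edgar/ncen"


def schema_profile(xml_text, optional_elements):
    """Return the declared schema version and which optional elements occur."""
    root = ET.fromstring(xml_text)
    version = root.findtext(f"{{{NS}}}schemaVersion")
    if version is None:
        # Fall back to a search anywhere in the tree.
        node = next(root.iter(f"{{{NS}}}schemaVersion"), None)
        version = node.text if node is not None else None
    present = {
        name: root.find(f".//{{{NS}}}{name}") is not None
        for name in optional_elements
    }
    return (version.strip() if version else None), present


# Invented sample document.
SAMPLE_DOC = """<edgarSubmission xmlns="http://www.sec.gov/edgar/ncen">
<schemaVersion>X0505</schemaVersion>
<formData><isSwingPricing>N</isSwingPricing></formData>
</edgarSubmission>"""

version, present = schema_profile(SAMPLE_DOC, ["isSwingPricing", "someNewerItem"])
print(version, present)
```

Storing the version next to every extracted row makes it possible to tell a genuinely absent answer from an item that did not yet exist in that filing's schema.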
- Trust-level versus series-level data: fields in <registrantInfo> are stated once, but the <managementInvestmentQuestion> block repeats per series. Series-level analytics require unrolling this repeating block; treating the folder count as a count of funds will materially undercount the series universe.
- <accessionNumber> inside <headerData> of an N-CEN/A is the original filing's accession, not the amendment's. The amendment's own accession lives in metadata.json and in the folder name. Use both pointers to stitch correction chains.
- Attachments may arrive as .htm, .txt, or .pdf; the XSL rendering uses a .xml extension but carries XHTML. The SGML <TYPE> line is the authoritative classifier; content type cannot be inferred from the extension.
- Every service-provider list follows the same name / file-numbers / CRD / LEI / RSSD / state / affiliation-flags shape and is paired with a hire-or-terminate boolean inside the same series block. This uniformity allows consistent extraction of service-provider time series across series and years.
- A few fields (booksRecordsDesc, the OTHER REQUIRED INFO exhibit body, and the auditor's report body) carry free-form text that requires NLP rather than schema-driven extraction.
- The canonical source is primary_doc.xml; the rendered XHTML is a derived view and adds no information.
- <attachmentsTab> flags align with the folder contents. When isIPAReportInternalControl is true, an INTERNAL CONTROL RPT attachment is present; when isOtherInfoRequired is true, an OTHER REQUIRED INFO exhibit is present. These flags provide a quick consistency check against the on-disk attachment set.
- isRegistrantFirstFiling and isRegistrantLastFiling at the trust level, plus isFirstFilingByFund at the series level, identify new-fund and terminating-fund events without requiring a join against historical filings.
- Filings from different years carry different <schemaVersion> values and therefore expose slightly different element sets. Parsers should branch on schema version when extracting items added by post-2018 rulemakings.

Form N-CEN is filed by registered investment companies themselves, acting through their officers.
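The filing population, defined by Section 30 of the Investment Company Act of 1940 and Rule 30a-1 thereunder, consists of open-end mutual funds, closed-end funds, exchange-traded funds, unit investment trusts, insurance company separate accounts, and registered small business investment companies.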
For series trusts, a single N-CEN submission from the registrant covers all series and classes under one accession number.
The obligation is rooted in Section 30 of the Investment Company Act of 1940 and Rule 30a-1, which requires every registered management investment company and every registered UIT to file an annual census report on Form N-CEN. N-CEN is the census layer of the 1940 Act reporting stack. It sits alongside, but is separate from, Form N-PORT (monthly portfolio holdings, Rule 30b1-9), Form N-CSR (certified shareholder reports, Rule 30b2-1), Form N-PX (proxy voting), and the prospectus forms (N-1A, N-2, N-3, N-4, N-6, N-8B-2).
The trigger is purely periodic, not event-driven: each registrant owes one N-CEN per year.
Because fund fiscal year-ends are spread across the calendar (with clusters at December, June, September, and October), N-CEN filings appear throughout the year, with predictable peaks roughly 75 days after each common year-end.
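The 75-day arithmetic behind those peaks is easy to make concrete. The Python sketch below simply adds 75 calendar days to a fiscal year end; it does not model any weekend or holiday adjustment, and the function name `ncen_due_date` is hypothetical.

```python
from datetime import date, timedelta

FILING_WINDOW_DAYS = 75  # the 75-day window stated above


def ncen_due_date(period_end):
    """Return the date 75 calendar days after a fiscal (or calendar) year end."""
    return period_end + timedelta(days=FILING_WINDOW_DAYS)


print(ncen_due_date(date(2018, 6, 30)))   # 2018-09-13
print(ncen_due_date(date(2024, 12, 31)))  # 2025-03-16
```

A June 30, 2018 year end lands in mid-September 2018, which is consistent with the first filings appearing that month, and December year ends produce the mid-March cluster.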
Form N-CEN/A is an amendment. It is filed when the registrant corrects or updates a prior N-CEN — for example, service-provider data, CCO information, or series/class identifiers. Amendments are corrective rather than calendar-driven, have no fixed deadline, and one fiscal-year report may receive multiple N-CEN/A filings over time.
Form N-CEN sits inside a tight cluster of Investment Company Act reporting forms covering the same filer universe (registered funds and unit investment trusts) but splitting the disclosure surface across census data, financial statements, holdings, voting, and registration. The closest comparisons are with other N-series forms. Form ADV and Form 10-K are commonly confused with N-CEN but cover different filer populations entirely.
N-SAR was the semi-annual census report for registered management investment companies, retired in June 2018 and replaced by N-CEN. Key differences: cadence (semi-annual to annual), structure (fixed-field plain text to XML), and scope (N-CEN substantially expands service-provider and operational fields). The two are not field-compatible; pre-2018 historical research requires a manual crosswalk, not concatenation.
N-PORT is the monthly position-level holdings report (only the report for the third month of each quarter is made public). Overlap with N-CEN is limited to filer population. N-PORT carries security identifiers, fair values, derivatives exposures, risk metrics, and liquidity classifications; N-CEN carries none of these. The pairing is standard: N-CEN for fund identity, structure, and service providers; N-PORT for what the fund holds.
N-PX is the annual log of proxy votes cast on portfolio securities, filed by registered funds and (post-2022 expansion) certain institutional managers. Cadence and filer population align with N-CEN, but the content is orthogonal: N-PX is a per-meeting, per-ballot voting record; N-CEN is a census of fund structure and service providers.
N-CSR (annual) and N-CSRS (semi-annual) carry audited or unaudited financial statements, schedules of investments, shareholder-letter narrative, and Sarbanes-Oxley certifications. Same fund population, same annual cadence for N-CSR, but a different payload type: N-CSR is narrative and financial-statement driven (text and PDF), while N-CEN is structured XML census data. For a given fiscal year, N-CSR delivers the audited financials; N-CEN delivers the operational/structural facts. Complements, not substitutes.
The registration and prospectus forms cover open-end mutual funds (Form N-1A), closed-end funds (Form N-2), managed separate accounts (Form N-3), variable annuity separate accounts (Form N-4), and variable life separate accounts (Form N-6). They share N-CEN's registrant universe but serve a different regulatory purpose: initial and updated offering disclosures, including prospectus narrative, fee tables, investment objectives, and risk factors. They are filed as needed rather than on a periodic cadence and are predominantly narrative. N-CEN provides the standardized annual operational snapshot of the same registrants the registration statements introduce.
Form N-Q was the quarterly portfolio holdings report, rescinded in 2019 once N-PORT became public. It does not overlap with N-CEN in content (holdings only) but is part of the same legacy ecosystem alongside N-SAR. For pre-2019 research, N-Q plus N-SAR roughly parallel today's N-PORT plus N-CEN split.
ADV is filed by investment advisers, not by funds. This is the most frequent confusion: ADV describes the management company; N-CEN describes the fund. N-CEN itemizes the fund's adviser among its service providers, supplying a natural join key (CRD or SEC file number) into ADV. ADV carries adviser-level AUM, client mix, disciplinary history, and Part 2 brochure; N-CEN carries fund-level structural facts.
10-K is the Exchange Act annual report for operating companies. Filer populations are disjoint: 10-K filers are operating enterprises; N-CEN filers are registered investment companies under the 1940 Act. The mental pairing exists only because both are "annual reports." Content (audited financials, MD&A, risk factors, business description) and statutory basis differ entirely. Listed here only to disambiguate.
N-CEN is the only SEC dataset providing a standardized, structured-XML annual census of registered investment companies focused on operational and structural attributes rather than financials, holdings, or votes. Its defining features:
N-CEN is not interchangeable with N-PORT (holdings), N-CSR (financials), N-PX (votes), or the N-1A/N-2/N-3/N-4/N-6 family (offering disclosures). It is the structural backbone that anchors those disclosures to a consistent annual snapshot of fund identity and operational arrangement.
Because each Form N-CEN filing names the adviser, subadvisers, custodians, transfer agent, auditor, principal underwriter, CCO, and authorized participants alongside flags for securities lending, derivatives, in-kind activity, and reliance on specific 1940 Act rules, it is consumed by professionals who need to know who runs, services, and oversees each fund.
Strategy and competitive-intelligence (CI) teams at asset managers pivot Item E (adviser, subadviser, custodian, sub-custodian, administrator, transfer agent, pricing agent, principal underwriter, auditor) against Items B and C (CIK, LEI, series and class identifiers) to build a peer service-provider matrix. Year-over-year diffs flag when a peer migrates a vendor, which feeds renegotiation lists and pricing benchmarks for senior leadership.
Independent-trustee staff and proxy-governance analysts focus on CCO identification, CCO turnover, third-party CCO compensation, reliance on exemptive orders, Rule 12b-1 plan use, and payments to affiliated persons. They use these fields to score CCO independence, flag compliance-function turnover as a governance risk, and benchmark trustee oversight across trust complexes.
Researchers covering ETF plumbing exploit the ETF section: authorized participants by name and LEI, dollar value of creations and redemptions per AP, cash-versus-in-kind split, AP agreement requirements, and master-feeder or fund-of-funds status. Joined to series and class data, these fields produce AP concentration metrics, in-kind ratios, and launch-versus-liquidation profiles that feed market-structure papers and liquidity models.
Internal compliance teams and fund counsel use peer N-CEN responses to calibrate their own filings. Key items include reliance flags for Rules 10f-3, 17a-7, 17e-1, 12d1-1, 22e-3, 32a-4, and 15a-4; securities lending and indemnification disclosures; line-of-credit and interfund lending participation; and liquidity rule program administrator items. Comparing peer answers helps validate internal controls and surfaces operational differences worth investigating.
Accounting-policy researchers and audit-firm competitive teams use Item E.6 (independent public accountant name, PCAOB ID, location, change-during-period flag) to build auditor switch panels, tenure distributions, and concentration metrics for the fund-audit market. Outputs include audit-quality studies and win/loss tracking for audit-firm leadership.
Business-development (BD) teams at custody banks and administrators pull Item E custodian, sub-custodian (with foreign-custody location and affiliation flag), administrator, and transfer-agent disclosures, then join to series counts and net assets to size each relationship. Year-over-year deltas drive account-defense alerts and prospecting lists for renewal windows.
Transfer-agent (TA) product managers map served versus unserved share classes using Item E transfer-agent identification, affiliation flags, account-type fields, and class-level load and distribution data. They use this to estimate sub-accounting volume and prioritize prospecting at complexes whose stated arrangement fits their operating model.
Analysts at securities-lending agents and prime brokerage units use the lending section: lending activity flag, lending agent name, compensation basis, indemnification terms, collateral managers, cash collateral reinvestment vehicles, and revenue earned. These fields support agent league tables, complex-level revenue estimates, and mandate-migration tracking.
Researchers studying fund governance, ETF microstructure, and service-provider competition use the full N-CEN schema as a machine-readable panel: adviser and subadviser identifiers and affiliation flags, board structure items, fee waivers, derivative counterparty disclosures, and ETF AP and basket data. Continuous coverage from the post-N-SAR transition supports difference-in-differences and event-study designs.
Reporters at financial news outlets use N-CEN to substantiate stories on auditor rotations, adviser and subadviser changes, CCO turnover, fund liquidations and mergers, and service-provider concentration. The fund-action items (liquidation, merger, reorganization, termination flags) plus the Item E grid let reporters produce industry-wide rankings and multi-year narratives.
Taken together, the dataset serves any role that needs registrant-identified, multi-year, structured disclosure of the operational architecture of a fund: who advises it, who services it, who audits it, who oversees it, and which 1940 Act rules and ETF mechanics it relies on. Each profession draws on a different slice of the same Item E and operational-flag fields.
Form N-CEN's structured XML payload supports a set of concrete, repeatable workflows centered on fund identity, service-provider mapping, and operational flags. The use cases below tie each question to specific fields in primary_doc.xml and the surrounding accession folder.
Unroll <publicAccountants> per series and read isPublicAccountantChanged alongside pcaobNumber, name, and lei to detect auditor switches each fiscal year. Join records on registrantCik and mgmtInvSeriesId across consecutive filings to construct switch pairs (predecessor PCAOB ID, successor PCAOB ID, switch date from metadata.json.periodOfReport). Output feeds audit-firm league tables, tenure-distribution studies, and win/loss dashboards for Big Four and non-Big-Four audit practices.
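The switch-pair join can be sketched as follows. This is a minimal illustration, assuming the per-series accountant fields have already been flattened into dictionaries. The row layout, the function name auditor_switch_pairs, and the sample PCAOB numbers are hypothetical, and periodOfReport is assumed to be an ISO date string so that it sorts chronologically.

```python
from itertools import groupby
from operator import itemgetter


def auditor_switch_pairs(rows):
    """Return one record per auditor change between consecutive filings.

    Each row is assumed (hypothetical layout) to carry registrantCik,
    mgmtInvSeriesId, periodOfReport (ISO date string) and pcaobNumber.
    """
    series_key = itemgetter("registrantCik", "mgmtInvSeriesId")
    ordered = sorted(rows, key=lambda r: (series_key(r), r["periodOfReport"]))
    pairs = []
    for (cik, series_id), group in groupby(ordered, key=series_key):
        filings = list(group)
        # Compare each filing with the next one for the same series.
        for prev, curr in zip(filings, filings[1:]):
            if prev["pcaobNumber"] != curr["pcaobNumber"]:
                pairs.append({
                    "registrantCik": cik,
                    "mgmtInvSeriesId": series_id,
                    "predecessor": prev["pcaobNumber"],
                    "successor": curr["pcaobNumber"],
                    "switchPeriod": curr["periodOfReport"],
                })
    return pairs


# Hypothetical sample rows; identifiers are made up for illustration.
sample_rows = [
    {"registrantCik": "1", "mgmtInvSeriesId": "S1", "periodOfReport": "2022-12-31", "pcaobNumber": "100"},
    {"registrantCik": "1", "mgmtInvSeriesId": "S1", "periodOfReport": "2023-12-31", "pcaobNumber": "200"},
    {"registrantCik": "1", "mgmtInvSeriesId": "S2", "periodOfReport": "2022-12-31", "pcaobNumber": "100"},
    {"registrantCik": "1", "mgmtInvSeriesId": "S2", "periodOfReport": "2023-12-31", "pcaobNumber": "100"},
]
```

Comparing PCAOB numbers directly gives an independent check on isPublicAccountantChanged; rows where the two signals disagree are worth reviewing by hand.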
For each series, extract the full service-provider block (<investmentAdvisers>, <custodians> with custodyType, <admins>, <transferAgents>, <pricingServices>, <shareholderServicingAgents>, <principalUnderwriters>) along with CRD, LEI, and RSSD identifiers, then key year-over-year diffs on the paired isXHiredOrTerminated flags (where X stands for the provider type). The resulting hire/terminate timeline drives prospecting lists for custody banks, administrators, and transfer agents at renewal windows, and supports account-defense alerts when a competitor wins a complex.
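A rough sketch of the extraction step is shown below. It assumes each provider container holds one child element per provider and that the provider's fields are leaf elements. The container names come from the description above, but the nesting, the leaf names in the sample, and the helper names local and provider_rows are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

PROVIDER_BLOCKS = {
    "investmentAdvisers", "custodians", "admins", "transferAgents",
    "pricingServices", "shareholderServicingAgents", "principalUnderwriters",
}


def local(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def provider_rows(series_elem):
    """Collect one dict of leaf fields per provider found in a series element."""
    rows = []
    for block in series_elem.iter():
        if local(block.tag) not in PROVIDER_BLOCKS:
            continue
        for provider in block:
            fields = {
                local(leaf.tag): (leaf.text or "").strip()
                for leaf in provider.iter()
                if len(leaf) == 0  # keep leaf elements only
            }
            fields["block"] = local(block.tag)
            rows.append(fields)
    return rows


# Hypothetical fragment: leaf element names are invented for this example.
SAMPLE_SERIES = """
<managementInvestmentQuestion>
  <mgmtInvSeriesId>S000000001</mgmtInvSeriesId>
  <custodians>
    <custodian>
      <custodianName>Example Bank</custodianName>
      <isCustodianHiredOrTerminated>N</isCustodianHiredOrTerminated>
    </custodian>
  </custodians>
  <transferAgents>
    <transferAgent>
      <transferAgentName>Example TA</transferAgentName>
    </transferAgent>
  </transferAgents>
</managementInvestmentQuestion>
""".strip()

rows = provider_rows(ET.fromstring(SAMPLE_SERIES))
```

Collecting every leaf generically avoids hard-coding field names that should be checked against the published schema before use.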
Read <chiefComplianceOfficers> for each registrant: name, CRD, employer list, address, and isCcoChangedSinceLastFiling. Aggregate by CCO CRD across trusts to identify multi-trust CCO arrangements (a single individual serving as CCO for many fund families) and to flag turnover events. Output supports governance-risk scoring for independent-trustee staff and feeds compliance-function concentration studies.
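The aggregation by CCO CRD might look like the sketch below. It assumes the CCO entries have already been extracted into dictionaries; the keys crd and registrantCik, the function name multi_trust_ccos, and the sample values are hypothetical.

```python
from collections import defaultdict


def multi_trust_ccos(rows, min_registrants=2):
    """Map each CCO CRD to the registrants it serves, keeping shared CCOs only."""
    by_crd = defaultdict(set)
    for row in rows:
        if row.get("crd"):  # skip entries with no CRD reported
            by_crd[row["crd"]].add(row["registrantCik"])
    return {
        crd: sorted(ciks)
        for crd, ciks in by_crd.items()
        if len(ciks) >= min_registrants
    }


# Hypothetical sample rows; identifiers are made up.
cco_rows = [
    {"crd": "111", "registrantCik": "A"},
    {"crd": "111", "registrantCik": "B"},
    {"crd": "222", "registrantCik": "C"},
    {"crd": "", "registrantCik": "D"},
]
shared = multi_trust_ccos(cco_rows)
```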
Per series, pull isFundSecuritiesLending, didFundLendSecurities, paymentToAgentManagerType, avgPortfolioSecuritiesValue, and netIncomeSecuritiesLending, joined to the lending agent name and LEI inside the same block. Aggregate netIncomeSecuritiesLending by lending agent to estimate complex-level lending revenue and produce agent league tables; cross-reference against custodian identity to separate captive from third-party lending arrangements.
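As an illustration of the league-table step, the sketch below sums netIncomeSecuritiesLending by lending agent. It assumes series-level rows have already been flattened; the keys agentLei and agentName, the function name, and the sample figures are hypothetical.

```python
from collections import defaultdict


def lending_league_table(rows):
    """Sum lending income by agent LEI and rank agents from highest to lowest."""
    totals = defaultdict(float)
    names = {}
    for row in rows:
        income = row.get("netIncomeSecuritiesLending")
        if income is None:  # series reported no lending income
            continue
        totals[row["agentLei"]] += income
        names[row["agentLei"]] = row["agentName"]
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [(names[lei], lei, total) for lei, total in ranked]


# Hypothetical sample rows; agents and amounts are made up.
lending_rows = [
    {"agentLei": "LEI-A", "agentName": "Agent A", "netIncomeSecuritiesLending": 100.0},
    {"agentLei": "LEI-B", "agentName": "Agent B", "netIncomeSecuritiesLending": 250.0},
    {"agentLei": "LEI-A", "agentName": "Agent A", "netIncomeSecuritiesLending": 50.0},
    {"agentLei": "LEI-C", "agentName": "Agent C", "netIncomeSecuritiesLending": None},
]
table = lending_league_table(lending_rows)
```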
Filter on isRegistrantFirstFiling, isRegistrantLastFiling, and per-series isFirstFilingByFund, paired with numAddedClass and numTerminatedClass, to enumerate trust launches, liquidations, and class-level changes for a given period. Combine with <attachmentsTab> flags and any OTHER REQUIRED INFO narrative to characterize the event (merger, reorganization, liquidation). Output supports M&A trackers, mortality-rate studies, and news desks covering fund liquidations.
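A minimal sketch of the flag filter follows. It assumes each filing has been parsed into a dictionary with a series list, and that the flags are encoded as "Y"/"N" strings; that encoding, the record layout, and the event labels are assumptions rather than documented behavior.

```python
def lifecycle_events(record):
    """Translate lifecycle flags on one parsed filing into event labels."""
    events = []
    if record.get("isRegistrantFirstFiling") == "Y":
        events.append("registrant_first_filing")
    if record.get("isRegistrantLastFiling") == "Y":
        events.append("registrant_last_filing")
    for series in record.get("series", []):
        series_id = series.get("mgmtInvSeriesId")
        if series.get("isFirstFilingByFund") == "Y":
            events.append(f"series_first_filing:{series_id}")
        if int(series.get("numAddedClass") or 0) > 0:
            events.append(f"classes_added:{series_id}")
        if int(series.get("numTerminatedClass") or 0) > 0:
            events.append(f"classes_terminated:{series_id}")
    return events


# Hypothetical parsed filing; values are made up.
sample_record = {
    "isRegistrantFirstFiling": "N",
    "isRegistrantLastFiling": "Y",
    "series": [
        {"mgmtInvSeriesId": "S1", "isFirstFilingByFund": "N",
         "numAddedClass": "0", "numTerminatedClass": "2"},
    ],
}
events = lifecycle_events(sample_record)
```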
For each series, enumerate <relyOnRuleTypes> (Rules 12d1-1, 17a-7, 17e-1, 32a-4, 10f-3, 22e-3, 15a-4) and join to fund classification (fundType, isNonDiversifiedCompany). Build a peer matrix at the strategy level showing reliance frequency; compliance counsel use the matrix to calibrate internal disclosures, while academics use it as a difference-in-differences treatment indicator around specific SEC rulemakings.
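The peer matrix can be approximated as a reliance frequency per fund classification. The sketch below assumes each series has been reduced to a fundType value plus a list of rule labels; the row layout, the fundType values, and the function name reliance_matrix are hypothetical.

```python
from collections import Counter, defaultdict


def reliance_matrix(rows):
    """Return {fundType: {rule: share of series relying on that rule}}."""
    totals = Counter()
    hits = defaultdict(Counter)
    for row in rows:
        totals[row["fundType"]] += 1
        for rule in set(row["rules"]):  # count each rule once per series
            hits[row["fundType"]][rule] += 1
    return {
        fund_type: {
            rule: hits[fund_type][rule] / totals[fund_type]
            for rule in sorted(hits[fund_type])
        }
        for fund_type in totals
    }


# Hypothetical sample rows; fundType labels are invented for illustration.
reliance_rows = [
    {"fundType": "Equity", "rules": ["17a-7", "10f-3"]},
    {"fundType": "Equity", "rules": ["17a-7"]},
    {"fundType": "Money Market", "rules": ["12d1-1"]},
]
matrix = reliance_matrix(reliance_rows)
```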
Treat metadata.json.formType == "N-CEN/A" records as corrections and use <headerData>/<accessionNumber> (carrying the original accession) to link each amendment to its predecessor. Diff the <formData> subtrees between the original and amended payloads to isolate which items were restated, and parse any OTHER REQUIRED INFO exhibit for the rationale. Output supports data-quality pipelines that flag stale service-provider or auditor records and supplies a corrected snapshot for downstream joins.
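One way to diff the <formData> subtrees is to flatten each into path/value pairs and compare them. The sketch below is an assumption-laden illustration: it ignores XML attributes, indexes repeated siblings by position, and uses a hypothetical wrapper element and hypothetical leaf names in the sample.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def flatten(elem, prefix=""):
    """Map positional paths such as '/admins[1]/admin[1]/name[1]' to leaf text."""
    out, counts = {}, {}
    for child in elem:
        name = local(child.tag)
        counts[name] = counts.get(name, 0) + 1
        path = f"{prefix}/{name}[{counts[name]}]"
        if len(child) == 0:
            out[path] = (child.text or "").strip()
        else:
            out.update(flatten(child, path))
    return out


def restated_items(original_xml, amended_xml):
    """Return {path: (original value, amended value)} for every difference."""
    def form_data(text):
        root = ET.fromstring(text)
        return next(e for e in root.iter() if local(e.tag) == "formData")

    before = flatten(form_data(original_xml))
    after = flatten(form_data(amended_xml))
    return {
        path: (before.get(path), after.get(path))
        for path in sorted(before.keys() | after.keys())
        if before.get(path) != after.get(path)
    }


# Hypothetical payloads; the wrapper and leaf names are made up.
ORIGINAL = "<submission><formData><admins><admin><adminName>Old Admin LLC</adminName></admin></admins></formData></submission>"
AMENDED = "<submission><formData><admins><admin><adminName>New Admin LLC</adminName></admin></admins></formData></submission>"
changes = restated_items(ORIGINAL, AMENDED)
```

Positional paths are sensitive to reordering: if an amendment inserts a provider ahead of existing ones, every later sibling will show as changed, so keying repeated blocks on an identifier such as LEI may be more robust.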
Unroll <brokers> per series for grossCommission by broker (with CRD/LEI), the <aggregateCommission> total, and <principalTransactions> counterparty volumes, alongside the isBrokerageResearchPayment Section 28(e) flag. Aggregate commissions by broker LEI across all series in a complex to produce concentration metrics; intersect with the soft-dollar flag to identify funds where research payments are bundled into commission flow.
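The text does not name a specific concentration metric; the sketch below uses the Herfindahl-Hirschman index (sum of squared shares) as one common choice. It assumes broker rows have been flattened into dictionaries; the key brokerLei, the function name, and the sample amounts are hypothetical.

```python
from collections import defaultdict


def commission_concentration(rows):
    """Return (share of commissions by broker LEI, Herfindahl-Hirschman index)."""
    by_broker = defaultdict(float)
    for row in rows:
        by_broker[row["brokerLei"]] += row["grossCommission"]
    total = sum(by_broker.values())
    if total == 0:
        return {}, 0.0
    shares = {lei: amount / total for lei, amount in by_broker.items()}
    hhi = sum(share * share for share in shares.values())
    return shares, hhi


# Hypothetical sample rows across two series in one complex.
broker_rows = [
    {"brokerLei": "LEI-A", "grossCommission": 30.0},
    {"brokerLei": "LEI-A", "grossCommission": 30.0},
    {"brokerLei": "LEI-B", "grossCommission": 40.0},
]
shares, hhi = commission_concentration(broker_rows)
```

An index near 1 indicates that a single broker receives almost all commission flow; values near 0 indicate a widely spread broker list.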
The Form N-CEN Files dataset is available through three access methods: a JSON metadata endpoint, a full dataset archive download, and per-container downloads. Containers are organized by month from September 2018 to the present and follow the naming pattern YYYY/YYYY-MM.zip.
Dataset Index JSON API: https://api.sec-api.io/datasets/form-ncen-files.json
This endpoint returns dataset-level metadata along with the complete list of container files and their individual download URLs, sizes, record counts, and last-updated timestamps. Use it to discover available monthly containers and to monitor which containers were refreshed in recent runs, so you can incrementally download only the containers that changed. This endpoint does not require an API key.
Example response:
{
  "datasetId": "1f13365b-9ae0-6911-80df-116297a4e0c0",
  "datasetDownloadUrl": "https://api.sec-api.io/datasets/form-ncen-files.zip",
  "name": "Form N-CEN Files Dataset",
  "updatedAt": "2026-05-16T03:02:38.497Z",
  "earliestSampleDate": "2018-09-01",
  "totalRecords": 90391,
  "totalSize": 2652158043,
  "formTypes": ["N-CEN", "N-CEN/A"],
  "containerFormat": "ZIP",
  "fileTypes": ["XML", "HTML", "JSON", "TXT", "PDF"],
  "containers": [
    {
      "downloadUrl": "https://api.sec-api.io/datasets/form-ncen-files/2026/2026-05.zip",
      "key": "2026/2026-05.zip",
      "size": 13818783,
      "records": 154,
      "updatedAt": "2026-05-16T03:02:38.497Z"
    }
  ]
}
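Incremental refresh against the index can be sketched as below. The field names (containers, key, updatedAt) follow the example response; the function names and the idea of persisting a key-to-timestamp map between runs are assumptions, and fetch_index is defined but not called here so the sketch runs offline.

```python
import json
from urllib.request import urlopen

INDEX_URL = "https://api.sec-api.io/datasets/form-ncen-files.json"


def fetch_index(url=INDEX_URL):
    """Download and parse the dataset index (no API key required)."""
    with urlopen(url) as response:
        return json.load(response)


def changed_containers(index, last_seen):
    """Return containers whose updatedAt differs from the previous run.

    last_seen maps container key -> updatedAt recorded on the last run;
    containers missing from it are treated as new.
    """
    return [
        container
        for container in index["containers"]
        if last_seen.get(container["key"]) != container["updatedAt"]
    ]


# Offline sample shaped like the example response.
sample_index = {
    "containers": [
        {"key": "2026/2026-04.zip", "updatedAt": "2026-05-01T03:00:00.000Z"},
        {"key": "2026/2026-05.zip", "updatedAt": "2026-05-16T03:02:38.497Z"},
    ]
}
previous_run = {"2026/2026-04.zip": "2026-05-01T03:00:00.000Z"}
to_download = changed_containers(sample_index, previous_run)
```

After a successful download, writing each container's key and updatedAt back to the stored map keeps the next run limited to containers that were refreshed.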
Download Entire Dataset: https://api.sec-api.io/datasets/form-ncen-files.zip?token=YOUR_API_KEY
This URL returns the full dataset as a single ZIP archive containing every monthly container from September 2018 onward. This endpoint requires an API key passed via the token query parameter.
Download Single Container: https://api.sec-api.io/datasets/form-ncen-files/2026/2026-03.zip?token=YOUR_API_KEY
Use this URL pattern to download an individual monthly container. Replace the year and month in the path to fetch any container listed in the index. To iterate the full history, walk months from 2018/2018-09.zip through the latest container reported by the JSON index, using the updatedAt field on each container to detect changes between refresh runs. This endpoint requires an API key passed via the token query parameter.
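Walking the monthly history can be sketched as follows. The URL pattern and the token parameter come from the text above; the helper names month_keys and container_url are hypothetical, and in practice the end month would be read from the latest key in the JSON index.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.sec-api.io/datasets/form-ncen-files"


def month_keys(first, last):
    """List container keys from first to last, each given as (year, month)."""
    (year, month), end = first, last
    keys = []
    while (year, month) <= end:
        keys.append(f"{year}/{year}-{month:02d}.zip")
        # Roll over to January of the next year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return keys


def container_url(key, token):
    """Build the download URL for one container key."""
    return f"{BASE_URL}/{key}?{urlencode({'token': token})}"


keys = month_keys((2018, 11), (2019, 2))
url = container_url("2026/2026-03.zip", "YOUR_API_KEY")
```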
The dataset packages every EDGAR submission of Form N-CEN and its amendment Form N-CEN/A, the structured XML annual census report adopted under SEC Release IC-32314 and effective June 1, 2018. Form N-CEN replaced the legacy semi-annual Form N-SAR and is the canonical Investment Company Act census filing under Section 30 and Rule 30a-1.
One record is a single EDGAR filing of Form N-CEN (or N-CEN/A), addressed by its accession number and packaged as a self-contained accession folder containing metadata.json, the canonical primary_doc.xml, the EDGAR-rendered XHTML view, and any Internal Control Report or Other Required Information attachments. Because a single N-CEN filing typically reports on a registrant trust holding many series, series-level analysis requires unrolling the repeating <managementInvestmentQuestion> block inside the XML rather than treating the folder as a single fund.
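Unrolling the series blocks might look like the sketch below. It assumes each <managementInvestmentQuestion> element carries its scalar fields as direct leaf children and matches elements by local name so that any XML namespace is ignored; the root element, namespace URI, and the seriesName leaf in the sample are hypothetical.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def unroll_series(xml_text):
    """Return one dict of direct leaf fields per managementInvestmentQuestion."""
    root = ET.fromstring(xml_text)
    rows = []
    for block in root.iter():
        if local(block.tag) != "managementInvestmentQuestion":
            continue
        rows.append({
            local(child.tag): (child.text or "").strip()
            for child in block
            if len(child) == 0  # nested blocks are handled separately
        })
    return rows


# Hypothetical payload with two series under one registrant.
SAMPLE_FILING = """
<submission xmlns="http://example.com/hypothetical-ncen">
  <formData>
    <managementInvestmentQuestion>
      <mgmtInvSeriesId>S000000001</mgmtInvSeriesId>
      <seriesName>Example Growth Fund</seriesName>
    </managementInvestmentQuestion>
    <managementInvestmentQuestion>
      <mgmtInvSeriesId>S000000002</mgmtInvSeriesId>
      <seriesName>Example Income Fund</seriesName>
    </managementInvestmentQuestion>
  </formData>
</submission>
""".strip()

series_rows = unroll_series(SAMPLE_FILING)
```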
Registered management investment companies (open-end mutual funds, closed-end funds, ETFs, and registered SBICs) and registered unit investment trusts (including insurance company separate accounts that fund variable annuity and variable life contracts) file Form N-CEN. Business development companies, private funds under Section 3(c)(1) or 3(c)(7), face-amount certificate companies, and operating companies do not. Advisers, custodians, transfer agents, administrators, and auditors are disclosed within the form but do not file it.
Management investment companies must file within 75 days after the end of their fiscal year, and unit investment trusts must file within 75 days after the end of the calendar year. Because fund fiscal year-ends are spread throughout the calendar, N-CEN filings appear continuously, with peaks roughly 75 days after common year-ends in December, June, September, and October.
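The 75-day window is simple calendar arithmetic, sketched below. The function name ncen_due_date is hypothetical, and the sketch counts calendar days only; it does not adjust for weekends or holidays.

```python
from datetime import date, timedelta

FILING_WINDOW_DAYS = 75


def ncen_due_date(period_end):
    """Return the date 75 calendar days after a fiscal or calendar year-end."""
    return period_end + timedelta(days=FILING_WINDOW_DAYS)


december_due = ncen_due_date(date(2023, 12, 31))  # leap-year February in the window
june_due = ncen_due_date(date(2018, 6, 30))
```

A June 30, 2018 fiscal year-end maps to a mid-September 2018 deadline, which is consistent with the first wave of filings appearing in the September 2018 container.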
The dataset begins with the first wave of N-CEN filings in September 2018, about three months after N-CEN replaced N-SAR on June 1, 2018, and continues to the present with monthly containers. Legacy N-SAR filings from before mid-2018 are out of scope and are not included.
Form N-CEN is the structural and operational census of the fund: identity, organizational form, board composition, service providers, exemptive-rule reliance, securities-lending, and credit-line participation. Form N-PORT is the monthly position-level holdings report (security identifiers, fair values, derivatives exposures, liquidity classifications), and Form N-CSR is the certified shareholder report carrying audited financial statements and shareholder-letter narrative. The three forms are complements rather than substitutes — N-CEN anchors fund identity and service-provider relationships, N-PORT carries portfolio holdings, and N-CSR carries audited financials.
Each monthly ZIP container holds one accession folder per filing. Inside each folder, the file types are XML (the canonical primary_doc.xml and the XSL-rendered view), JSON (the metadata.json header), HTML, plain text, and PDF. Internal Control Reports and Other Required Information exhibits may appear in any of these formats; the SGML <TYPE> line, not the file extension, identifies them authoritatively.
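Listing a container's contents by accession folder can be sketched as below. It assumes each file sits directly inside a folder named after its accession number; that path layout, the function name files_by_accession, and the sample accession number are assumptions for illustration.

```python
import io
import zipfile
from collections import defaultdict


def files_by_accession(zip_source):
    """Group file names in a container by their immediate parent folder."""
    folders = defaultdict(list)
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            parts = name.split("/")
            if name.endswith("/") or len(parts) < 2:
                continue  # skip directory entries and loose top-level files
            folders[parts[-2]].append(parts[-1])
    return dict(folders)


def build_sample_container():
    """Create a tiny in-memory ZIP with a made-up accession folder."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("0001234567-24-000001/metadata.json", "{}")
        archive.writestr("0001234567-24-000001/primary_doc.xml", "<submission/>")
    buffer.seek(0)
    return buffer


listing = files_by_accession(build_sample_container())
```

The same function accepts a path to a downloaded monthly ZIP in place of the in-memory buffer.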