The Form U5S Files Dataset is a closed historical archive of every Form U5S annual report and Form U5S/A amendment filed on EDGAR by registered public utility holding companies under Section 5(c) of the Public Utility Holding Company Act of 1935 (PUHCA 1935). One record in the dataset is a single EDGAR submission keyed by accession number, packaging the parsed submission header together with the primary form body and every non-image exhibit the registrant transmitted. The filings were made by the top-tier registered holding company of each multi-state utility system, covering all "system companies" — the registered parent, intermediate holding companies, public-utility subsidiaries, non-utility subsidiaries, and mutual or service companies — on a consolidated calendar-year basis. The dataset begins with EDGAR adoption in January 1994 and ends when the SEC discontinued Form U5S after the Energy Policy Act of 2005 repealed PUHCA 1935 effective February 8, 2006. It is distributed as ZIP containers with TXT, JSON, HTML, and PDF file types.
The dataset is built from Form U5S, the annual disclosure vehicle prescribed by Section 5(c) of PUHCA 1935 and administered by the SEC's Division of Investment Management. Each registered holding-company system used the form to report, on a consolidated calendar-year basis, the structure and activities of every system company — the registered parent, any intermediate subsidiary holding companies, public-utility subsidiaries, non-utility subsidiaries, and mutual or service companies. The form was prescriptive: a fixed sequence of numbered Items, each calling for a specific disclosure, supplemented by lettered Exhibits (typically Exhibit A through Exhibit H) and a schedule of additional or incorporated exhibits.
The underlying source corpus is closed. Form U5S was discontinued when PUHCA 1935 was repealed by the Energy Policy Act of 2005, effective February 8, 2006, so the dataset covers every U5S and U5S/A submission accepted by EDGAR from January 1994 through the repeal date in February 2006, including late amendments that trailed into the repeal window. Pre-1994 Form U5S annual reports exist only on paper and are not part of the dataset. The dataset is delivered in ZIP containers, with TXT, JSON, HTML, and PDF as the file types found across records.
One record in the Form U5S Files Dataset is a single EDGAR submission of either a Form U5S (annual report of a registered public utility holding company under PUHCA 1935) or a Form U5S/A (an amendment to a previously filed U5S). The record unit is the accession-level filing, keyed by EDGAR's 18-digit accession number and materialized as one folder that bundles the parsed EDGAR submission header together with every non-image document the registrant transmitted under that accession. One record therefore corresponds to one registered holding-company system's annual filing event for one fiscal year — or one amendment to such a filing — not to a per-Item observation, a per-subsidiary row, or a discrete event extracted from the narrative.
Each record folder contains two structural layers:
- A metadata.json file holding the parsed EDGAR submission header (form type, accession number, filing timestamp, period of report, the filer-entity block, and a per-document manifest).
- The document files themselves, stored as .txt, .html, .htm, or .pdf files. Image-format exhibits referenced in the original filing (.jpg, .gif, and similar raster formats) are omitted by design.

The folder is named with the un-dashed 18-digit accession number and sits inside a year-month parent folder (for example, 2004-10/000095013704008206/). The accession folder is self-contained: nothing about a single record requires reading any other folder.
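The naming rule above can be sketched in a few lines of Python. This is a minimal sketch, not part of the dataset tooling: the helper names `undash_accession`, `dash_accession`, and `find_record_folder` are hypothetical, and the lookup assumes only what the text states, namely that each accession folder sits one level below a year-month folder.

```python
import re
from pathlib import Path


def undash_accession(accession_no: str) -> str:
    """Convert '0000950137-04-008206' to the 18-digit folder name."""
    digits = accession_no.replace("-", "")
    if not re.fullmatch(r"\d{18}", digits):
        raise ValueError(f"not an 18-digit accession number: {accession_no!r}")
    return digits


def dash_accession(digits: str) -> str:
    """Convert an 18-digit folder name back to the dashed 10-2-6 form."""
    return f"{digits[:10]}-{digits[10:12]}-{digits[12:]}"


def find_record_folder(root, accession_no: str):
    """Search every year-month folder under root for the accession folder."""
    pattern = f"*/{undash_accession(accession_no)}"
    matches = sorted(Path(root).glob(pattern))
    return matches[0] if matches else None
```

Searching with a glob avoids guessing which year-month folder a record belongs to, since the text does not say which date drives that partition.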
A complete Form U5S is organized as a cover page, a numbered sequence of Items, a signature block, and a sequence of lettered Exhibits, each attached as a separate <DOCUMENT> block within the same SGML envelope. The substantive contents typically present, in order, are:
Each Exhibit appears in the EDGAR submission as its own <DOCUMENT> block inside the same SGML envelope as the primary form. A full annual filing therefore yields many document files (one per exhibit); a narrow amendment may contain only the cover page and the single Item or Exhibit being corrected.
metadata.json contents
The metadata file is a parsed, denormalized form of the EDGAR submission header. The fields populated for U5S records are:
- formType: "U5S" for an original annual report or "U5S/A" for an amendment.
- accessionNo: Dashed accession number, 18 digits in three hyphen-separated groups (for example, "0000950137-04-008206"); the record folder uses the same number without dashes.
- filedAt: ISO-8601 timestamp with an Eastern Time offset capturing when EDGAR accepted the submission.
- periodOfReport: Fiscal year-end date covered by the report (YYYY-MM-DD).
- description: Human-readable form label, for example "Form U5S/A - Annual report for holding companies [Section 5]: [Amend]".
- linkToFilingDetails, linkToTxt, linkToHtml, linkToXbrl: Stable SEC.gov URLs to, respectively, the primary document, the complete submission text file (full SGML envelope), the EDGAR filing-index HTML page, and the XBRL instance. linkToXbrl is empty across the dataset because U5S filings predate any XBRL mandate that could apply to the form.
- documentFormatFiles[]: One descriptor per document in EDGAR's primary-document table. Each entry carries sequence, size in bytes (as a string), documentUrl, a filer-supplied description (e.g. "AMENDMENT TO FORM U5S", "EXHIBIT E - TAX ALLOCATION", "Complete submission text file"), and an EDGAR document type code.
- entities[]: One object per filer or co-registrant. PUHCA holding-company filings commonly enumerate multiple registered system companies (for example, NiSource Inc. and Columbia Energy Group together). Each entity object carries companyName (with a parenthesized role suffix such as "(Filer)"), zero-padded ten-digit cik, irsNo, EDGAR fileNo of the form 030-XXXXX (the 030- prefix is the PUHCA file-number block), filmNo, sic code with label, stateOfIncorporation, four-digit fiscalYearEnd (MMDD), act ("35" for PUHCA 1935), type (form type repeated at the entity level), and tickers[] for any associated trading symbols.
- dataFiles[]: Reserved for structured data/XBRL exhibits; empty across this dataset.
- id: 32-character hexadecimal identifier assigned by the dataset publisher to uniquely key the record.

Every non-metadata file in the accession folder is an EDGAR document wrapper. Even files with a .txt extension are pre-tagged with SGML markers rather than free text: each begins with <DOCUMENT>, followed by <TYPE> (e.g. U5S, U5S/A, EX-99), <SEQUENCE>, <FILENAME>, an optional <DESCRIPTION>, and a <TEXT>...</TEXT> block enclosing the rendered exhibit body. The body is in turn paginated by <PAGE> markers. HTML exhibits embed <HTML>...</HTML> inside the <TEXT> block; PDF exhibits are base64-encoded within <PDF>...</PDF> segments inside the complete-submission representation but appear as decoded .pdf files when split out per document. For each accession, a "Complete submission text file" entry in documentFormatFiles corresponds to the concatenated multi-document SGML envelope that EDGAR produced at acceptance.
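The wrapper layout described above lends itself to a small parser. The following is a minimal sketch under stated assumptions: the function name `parse_document_wrapper` is hypothetical, each header tag is assumed to sit at the start of its own line, and each file is assumed to hold a single <TEXT>...</TEXT> block. Real filings may deviate, so treat it as a starting point.

```python
import re

# Header tags that precede the <TEXT> block in each document wrapper.
HEADER_TAG = re.compile(r"^<(TYPE|SEQUENCE|FILENAME|DESCRIPTION)>(.*)$", re.MULTILINE)
TEXT_BLOCK = re.compile(r"<TEXT>(.*?)</TEXT>", re.DOTALL)


def parse_document_wrapper(raw: str) -> dict:
    """Split one SGML-wrapped document into header fields and pages."""
    head, _, _ = raw.partition("<TEXT>")
    parsed = {tag.lower(): value.strip() for tag, value in HEADER_TAG.findall(head)}

    match = TEXT_BLOCK.search(raw)
    body = match.group(1) if match else ""

    # <PAGE> markers paginate the body; drop pages that are only whitespace.
    pages = [page.strip("\n") for page in body.split("<PAGE>")]
    parsed["pages"] = [page for page in pages if page.strip()]
    return parsed
```

Restricting the header search to the text before <TEXT> keeps tag-like strings inside the exhibit body from being mistaken for header fields.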
For a full annual U5S, the accession folder typically contains one document file for the primary form body plus one additional file per attached exhibit (Exhibits A through H, organization charts, tax allocation agreements, and so on); the number of document files mirrors how many exhibits the registrant elected to attach.
Each record packages: the structured EDGAR header (metadata.json); the primary U5S or U5S/A document body inside its SGML wrapper; and every non-image exhibit document that was part of the original EDGAR submission, in its native file format (TXT/SGML, HTML/HTM, or PDF).
The following categories of content sit outside the record folder:
- Image exhibits in .jpg, .gif, or other raster formats are omitted. Their presence in the original filing is sometimes still visible in the SGML <DOCUMENT> list inside the complete-submission text or referenced from the narrative.
- Structured data and XBRL: dataFiles[] is uniformly empty and linkToXbrl is uniformly an empty string for this form type.

The prescribed Items and Exhibits remained largely stable across the dataset's 1994 to 2006 window because the underlying PUHCA Section 5(c) instructions were not materially overhauled. The most visible structural variation across records is practical rather than regulatory: amendments (U5S/A) typically carve out and refile only the affected portion (often Item 10 plus a single lettered exhibit such as a corrected Exhibit E tax allocation), so an amendment record is usually a small fraction of the size of the original annual report it amends. Over the dataset's lifetime, EDGAR's own header schema accumulated additional fields (film numbers, SIC labels, ticker associations) that are reflected in metadata.json for later filings even when the underlying U5S content did not change. The form was retired entirely effective February 8, 2006 with PUHCA 1935's repeal; the dataset terminates there and never transitioned to PUHCA 2005 reporting, which is administered by FERC rather than the SEC and does not use Form U5S.
Form U5S was filed on EDGAR exclusively in SGML-wrapped text and, later, HTML and PDF exhibit formats. The earliest filings (1994 through the late 1990s) are almost entirely ASCII text inside SGML <DOCUMENT> blocks, with tabular Items rendered as fixed-width ASCII tables and organization charts approximated with text-art. Beginning in the late 1990s and accelerating through the 2000s, registrants increasingly submitted HTML versions of the primary form and PDF copies of signed exhibits, tax allocation agreements, and consolidating financial statements.
- Amendments can be matched to the annual report they amend using periodOfReport and registrant CIK.
- A single filing often names several registered system companies; the entities[] array therefore commonly contains multiple (Filer) and (Subject) entries for a single accession, each with its own CIK, file number, and SIC code. The form body is filed jointly on behalf of all enumerated registrants.
- The 030- prefix on PUHCA file numbers is a stable signal that a filing belongs to the registered holding-company population; every fileNo in this dataset carries that prefix.
- Working with text filings means first stripping the <DOCUMENT>/<TYPE>/<TEXT>/<PAGE> wrappers, then parsing the fixed-width tables that constitute Items 1, 3, 5, 6, and 8. Exhibit A consolidating financial statements are reliably the densest tabular content per record.
- Tabular content appears in <TABLE> blocks without column-name metadata; extracting clean rows for system-company rosters or intercompany service charges requires per-filer normalization rather than a single shared schema.
- ... &, reflecting how EDGAR emitted the original header text.

Form U5S was filed exclusively by registered public utility holding companies under PUHCA 1935. A filer had to:
The filing population was small: it consisted of the multi-state electric and gas utility holding-company systems that did not qualify for, or had not obtained, a Section 3 exemption. The top-tier registered holding company filed on behalf of the entire system. Subsidiary utility operating companies, intermediate holding companies, service companies, non-utility subsidiaries, foreign utility companies, and exempt wholesale generators were disclosed inside the filing but did not file Form U5S in their own right. A U5S/A is an amendment correcting, restating, or supplementing a previously filed annual report.
Form U5S was a periodic, calendar-driven annual report, not an event-triggered filing. The obligation arose each year from the registrant's continuing status as a registered holding company under Section 5 of PUHCA 1935. The form covered the most recently completed fiscal year on a system-wide basis and was due annually on the date set by the instructions (historically May 1 following the year covered, subject to Commission extensions).
The filing obligation ended when:
The latter occurred when the Energy Policy Act of 2005 repealed PUHCA 1935, effective February 8, 2006. The repeal eliminated Section 5(c) as a statutory basis for annual reporting, and the SEC discontinued Form U5S. PUHCA 2005, administered primarily by FERC, imposes no equivalent SEC annual report.
U5S/A amendments were not periodic. They were filed when needed to correct or supplement a prior U5S, or in response to SEC staff comments, with no fixed deadline. A small number of late U5S/A filings may appear near the February 2006 discontinuation date as registrants closed out their pre-repeal cycle. The dataset's electronic coverage begins with EDGAR adoption in 1994 and runs through the form's discontinuation in February 2006.
Form U5S sits inside a narrow, now-closed reporting regime. The most useful comparison points are the other PUHCA-family forms (U-1, U5B, U-13-60, U-3A-2), the Exchange Act periodic reports that the same companies filed in parallel, and the post-February 2006 FERC successor regime.
Form U-3A-2 was filed by holding companies claiming an exemption from PUHCA registration. The populations are mutually exclusive: Form U-3A-2 filers were avoiding the registered regime; U5S filers were inside it. U-3A-2 is a short annual certification; U5S is a multi-hundred-page system compendium with consolidating financials and exhibits. Use U-3A-2 to map exempt structures; use U5S to study the internal economics of registered systems.
Forms U5B and U5A are one-time entry filings into the PUHCA registered regime: Form U5B is the registration statement, Form U5A the notification. They establish the baseline corporate-structure record at the moment of registration. U5S is the recurring annual update of that same structure. For continuity, U5B/U5A provide the starting point; U5S provides the time series.
Form U-1 filings are event-driven, transaction-specific authorization filings for securities issuances, acquisitions, intra-system financings, and service arrangements. Form U-1 is prospective and per-transaction; U5S is retrospective and aggregated at fiscal year-end. The two are complementary: U-1 documents how each transaction was authorized; U5S documents the cumulative consequences across the system.
Form U-13-60 is the annual filing by service companies inside registered systems, disclosing cost allocations, billings, and personnel arrangements with affiliates. It covers the same underlying activity as the affiliate-transaction sections of U5S, but at the service-company level with granular cost-of-service tables. U5S aggregates the same flows at the parent holding company level. U-13-60 is the detailed companion; U5S is the system-wide summary.
Form U5S/A filings are restatements or corrections to a previously filed U5S, sometimes submitted years later. They are included in this dataset alongside originals. Researchers performing longitudinal analysis must reconcile amendments against the original filings rather than treat each accession as independent.
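One way to approach that reconciliation is sketched below. It is a heuristic, not a documented linkage: the function names are hypothetical, and the matching rule (same CIK, same periodOfReport, original filed earlier) is an assumption, since no field in the metadata described here points an amendment at its original. The input is a list of parsed metadata.json dictionaries.

```python
from collections import defaultdict
from datetime import datetime


def filer_ciks(metadata):
    """Return the set of CIKs listed in a record's entities[] block."""
    return {e["cik"] for e in metadata.get("entities", []) if e.get("cik")}


def pair_amendments(records):
    """Map each U5S/A accession to candidate original U5S accessions."""
    # Index originals by (CIK, periodOfReport).
    originals = defaultdict(list)
    for meta in records:
        if meta["formType"] == "U5S":
            for cik in filer_ciks(meta):
                originals[(cik, meta["periodOfReport"])].append(meta)

    pairs = {}
    for meta in records:
        if meta["formType"] != "U5S/A":
            continue
        filed = datetime.fromisoformat(meta["filedAt"])
        candidates = set()
        for cik in filer_ciks(meta):
            for orig in originals.get((cik, meta["periodOfReport"]), []):
                # Assumption: an amendment is accepted after its original.
                if datetime.fromisoformat(orig["filedAt"]) < filed:
                    candidates.add(orig["accessionNo"])
        pairs[meta["accessionNo"]] = sorted(candidates)
    return pairs
```

An amendment that maps to an empty list has no original in the input under this rule and is worth reviewing by hand.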
Registered holding companies with publicly traded securities filed both 10-K and U5S each year. 10-K is investor-facing under Regulation S-K/S-X: business description, MD&A, risk factors, audited consolidated financials. U5S is PUHCA-staff-facing: system-company rosters, asset transaction tables, cross-system officer and director listings, and consolidating (not merely consolidated) financials designed to expose intra-system flows. The 10-K is not a substitute for U5S on system structure or affiliate transactions. After February 2006, much of the U5S content disappeared from SEC disclosure entirely.
After PUHCA 2005 moved oversight from the SEC to FERC, centralized service companies began filing FERC Form 60 annually with FERC. It is not on EDGAR and is not an SEC dataset, but it is the closest substantive successor for service-company cost allocations previously visible through U5S and U-13-60. Any analysis spanning the 2006 transition must pair the U5S dataset with FERC data; SEC sources alone cannot bridge the regime change.
U5S is distinct from every adjacent SEC dataset on three points: (1) it covers a terminated regime with a fixed February 2006 end date, so the record set is historically bounded; (2) it captures system-wide disclosures — consolidating financials, complete subsidiary and officer/director rosters across the entire registered system — at a granularity the Exchange Act forms never required; and (3) its content is organized around PUHCA's substantive concerns (integration, simplification, affiliate transactions, service-company allocations) rather than investor protection. The other PUHCA-family forms cover adjacent slices of the same regime at different cadences and levels of detail; Exchange Act forms cover the same companies through a different disclosure lens. For research on the internal structure and intra-system economics of registered utility holding companies between 1994 and 2006, U5S is the primary record and the others are partial complements rather than substitutes.
The Form U5S Files Dataset is a primary-source archive for users reconstructing pre-2006 utility-system structures, intercompany financings, and consolidating financials. Although Form U5S has been discontinued for two decades, its closed corpus remains the authoritative record of registered utility holding-company structure between 1994 and 2006 and supports a narrow but well-defined community of researchers and practitioners.
Used to establish baselines for affiliate transactions, service-company cost allocations, and intercompany charges in state PUC rate cases and FERC proceedings. Key fields: consolidating statements, intersystem investment schedules, and service-company allocation disclosures. Supports prudency reviews and affiliate-cost benchmarking against the pre-repeal regulatory record.
Counsel for companies that inherited assets from former registered systems use system-company schedules, exhibit indexes, and asset-transaction disclosures to trace prior corporate structures. Supports historical due diligence, successor-liability research, indemnity-claim review under legacy purchase agreements, and discovery in long-tail utility litigation.
Used to trace cash flows, capital contributions, dividends, and intercompany loans across multi-year windows when contemporaneous internal records are unavailable. Key fields: consolidating financials, intercompany debt schedules, and asset-sale disclosures. Supports expert reports, damages models, and reconstruction of affiliate balances.
Used to confirm ownership chains, surviving intercompany obligations, and legacy guarantees at targets whose corporate lineage runs through a former holding system. Key fields: exhibit lists, system-company rosters, and contingent-obligation disclosures. Supports corporate-history memos and chain-of-title work for utility assets.
Fixed-income analysts build long-run histories of leverage, capitalization, intercompany debt, and dividend upstreaming for issuers whose current structure descends from a PUHCA-era system. Key fields: consolidating financials, debt schedules, and capitalization disclosures. Feeds historical credit narratives and recovery analyses tied to legacy entities.
Used as the canonical record of registered holding-company organization before repeal. Key fields: system-company lists, ownership percentages, and consolidating balance sheets. Supports studies of divestitures, industry concentration, and the structural effects of the Energy Policy Act of 2005.
Officer-and-director sections list senior personnel across every system company on a consistent annual basis, supporting director-interlock studies and longitudinal board-composition research for the utility sector that is difficult to assemble from proxy filings alone.
Used as a structured archive of subsidiary lists, intercompany investments, and affiliate rosters for the 1994 to 2006 window. Key fields: metadata, exhibit indexes, and tabular schedules. Feeds entity-resolution pipelines, CIK-to-subsidiary mapping, and holding-company graph databases.
The TXT, HTML, and PDF corpus is a finite, document-bounded archive of an extinct disclosure regime with dense PUHCA and utility-accounting vocabulary. Supports fine-tuning sets for regulatory QA, retrieval benchmarks over historical filings, and entity-extraction evaluation on subsidiary and officer rosters.
Internal teams at companies that succeeded former registered systems use system-company schedules, officer rosters, and exhibits to respond to auditor and regulator inquiries, satisfy records requests, and maintain reliable references for legacy entities and agreements.
The Form U5S Files Dataset supports a small set of operational workflows that all depend on its system-level disclosures: consolidating financials, system-company rosters, intercompany schedules, and officer/director listings across the 1994 to February 2006 registered-holding-company population.
Reconstructing pre-repeal holding-company structures for rate cases. Pull Item 1 system-company rosters and Exhibit A consolidating balance sheets across consecutive fiscal years for a given registered system to establish year-by-year parent-subsidiary ownership percentages, intercompany investment book values, and asset placements. Output supports affiliate-cost benchmarks and prudency arguments in state PUC and FERC proceedings against the pre-2006 baseline.
Building a service-company affiliate-transaction baseline. Extract Item 8 service, sales, and construction contract tables together with Exhibit E tax-allocation schedules for each registered system over multi-year windows. Used to quantify intra-system service billings and tax allocations under PUHCA Sections 13(a)/13(b), then compared against post-2006 FERC Form 60 disclosures to bridge the regime change.
Tracing legacy intercompany debt and capital flows for forensic and credit work. Use Item 3 (issuances, pledges, guarantees) and Item 4 (redemptions and retirements) plus Exhibit A column-level company breakouts to reconstruct intercompany loans, dividend upstreaming, and capital contributions for issuers whose current structure descends from a PUHCA-era system. Feeds damages models, long-tail litigation expert reports, and historical credit narratives.
Chain-of-title and successor-liability diligence for utility M&A. Cross-reference Item 1 ownership tables and Item 2 acquisitions-or-sales-of-utility-assets disclosures across the full history of a target's predecessor system to confirm asset transfers, surviving guarantees, and contingent obligations. Supports corporate-history memos, indemnity-claim review under legacy purchase agreements, and discovery responses.
Director-interlock and governance panels for the utility sector. Parse Item 6 officer-and-director rosters, which list each individual's positions across every system company plus outside directorships, into a longitudinal interlock graph keyed by CIK and fiscal year. Produces a consistent 1994 to 2006 panel that proxy filings alone cannot assemble.
Subsidiary-mapping and entity-resolution reference data. Normalize Item 1 system-company tables and entities[] blocks from metadata.json into a CIK-to-subsidiary graph, using the stable 030- PUHCA file-number prefix to scope the population. Feeds historical entity-reference databases, ownership graphs, and reconciliation against modern EDGAR filer records.
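As a minimal sketch of the first normalization step in that workflow, the snippet below flattens entities[] into one row per entity, separates the parenthesized role suffix from companyName, and flags the 030- prefix. The function name `entity_rows` and the output keys are hypothetical; the input field names follow the metadata.json description.

```python
import re

# Trailing parenthesized role, e.g. "NiSource Inc. (Filer)" -> "Filer".
ROLE_SUFFIX = re.compile(r"\s*\(([^()]+)\)\s*$")


def entity_rows(metadata):
    """Flatten entities[] into one row per filer or co-registrant."""
    rows = []
    for entity in metadata.get("entities", []):
        name = entity.get("companyName", "")
        match = ROLE_SUFFIX.search(name)
        file_no = entity.get("fileNo") or ""
        rows.append({
            "accessionNo": metadata.get("accessionNo"),
            "cik": entity.get("cik"),
            "name": name[:match.start()] if match else name,
            "role": match.group(1) if match else None,
            "fileNo": file_no,
            # The 030- prefix marks the PUHCA file-number block.
            "isPuhcaFileNo": file_no.startswith("030-"),
        })
    return rows
```

Rows produced this way can be loaded into a graph or relational store with (cik, accessionNo) as the edge between a registrant and a filing.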
Retrieval and extraction benchmarks for PUHCA-era regulatory NLP. Treat the closed TXT/HTML/PDF corpus as a fixed evaluation set for RAG over an extinct disclosure regime, with the fixed Item/Exhibit structure providing ground-truth segmentation for tasks such as subsidiary-roster extraction, tax-allocation parsing, and PUHCA-release citation linking.
Dataset Index JSON API: https://api.sec-api.io/datasets/form-u5s-files.json
This endpoint returns dataset-level metadata (name, description, last updated timestamp, earliest sample date, total records, total size, form types covered, container format, and content file types), the full dataset archive download URL, and the complete list of container files with per-container size, record count, updated timestamp, and individual download URL. Use it to monitor which containers changed in the most recent refresh run so you can pull only the updated files day by day. This endpoint does not require an API key.
{
  "datasetId": "1f13365b-9ae0-69dd-8628-93362f40cc7a",
  "datasetDownloadUrl": "https://api.sec-api.io/datasets/form-u5s-files.zip",
  "name": "Form U5S Files Dataset",
  "updatedAt": "2026-04-15T12:31:32.924Z",
  "earliestSampleDate": "1994-01-01",
  "totalRecords": 4457,
  "totalSize": 66659391,
  "formTypes": ["U5S", "U5S/A"],
  "containerFormat": "ZIP",
  "fileTypes": ["TXT", "JSON", "HTML", "PDF"],
  "containers": [
    {
      "downloadUrl": "https://api.sec-api.io/datasets/form-u5s-files/2005/2005-05.zip",
      "key": "2005/2005-05.zip",
      "size": 8421573,
      "records": 41,
      "updatedAt": "2026-04-15T12:31:32.924Z"
    }
  ]
}
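The change-monitoring use described for the index endpoint might look like the sketch below. It assumes the response has the shape of the sample above (a containers list whose entries carry key and updatedAt); the helper names `fetch_index`, `parse_ts`, and `changed_containers` are hypothetical and not part of any published client.

```python
import json
from datetime import datetime
from urllib.request import urlopen

INDEX_URL = "https://api.sec-api.io/datasets/form-u5s-files.json"


def fetch_index(url: str = INDEX_URL) -> dict:
    """Download and parse the dataset index (no API key needed)."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def parse_ts(value: str) -> datetime:
    """Parse timestamps such as '2026-04-15T12:31:32.924Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def changed_containers(index: dict, since: str) -> list:
    """Return keys of containers updated after the given timestamp."""
    cutoff = parse_ts(since)
    return [
        container["key"]
        for container in index.get("containers", [])
        if parse_ts(container["updatedAt"]) > cutoff
    ]
```

A typical run would call `changed_containers(fetch_index(), last_sync)` and then download only the returned keys.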
Download Entire Dataset: https://api.sec-api.io/datasets/form-u5s-files.zip?token=YOUR_API_KEY
Downloads the full Form U5S Files dataset as a single ZIP archive covering all U5S and U5S/A filings from January 1994 through the form's discontinuation in February 2006. This endpoint requires an API key.
Download Single Container: https://api.sec-api.io/datasets/form-u5s-files/2005/2005-05.zip?token=YOUR_API_KEY
Downloads one monthly container ZIP instead of the full dataset, which is useful for retrieving filings from a specific period or refreshing only updated containers. This endpoint requires an API key.
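A container download can be sketched with the standard library alone. The URL pattern and the token query parameter come from the examples above; the function names `container_url` and `download_container` are hypothetical, and error handling (HTTP failures, retries) is left out for brevity.

```python
import shutil
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE_URL = "https://api.sec-api.io/datasets/form-u5s-files"


def container_url(key: str, token: str) -> str:
    """Build the download URL for a container key such as '2005/2005-05.zip'."""
    return f"{BASE_URL}/{quote(key)}?{urlencode({'token': token})}"


def download_container(key: str, token: str, dest: str) -> str:
    """Stream one container ZIP to dest without loading it into memory."""
    with urlopen(container_url(key, token), timeout=60) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest
```

For example, `download_container("2005/2005-05.zip", "YOUR_API_KEY", "2005-05.zip")` would save the May 2005 container to the working directory.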
The dataset covers Form U5S (the annual report of registered public utility holding companies under Section 5(c) of PUHCA 1935) and Form U5S/A (amendments to a previously filed U5S). No other PUHCA-family forms — including U5B/U5A registration statements, U-1 transaction applications, U-13 service-company filings, or U-9C-3 quarterly reports — are included.
One record is a single EDGAR submission of a Form U5S or U5S/A, keyed by EDGAR's 18-digit accession number and materialized as one folder. The folder bundles a parsed metadata.json header together with every non-image document the registrant transmitted (the primary form body and each lettered Exhibit) in its native TXT/SGML, HTML, or PDF format.
Only top-tier registered public utility holding companies under Section 5 of PUHCA 1935 filed Form U5S. Section 3 exempt holding companies, subsidiary operating companies, intermediate holding companies, service companies, and foreign utility holding companies did not file U5S in their own right, although they were disclosed inside the parent's filing.
The dataset covers every U5S and U5S/A filing accepted by EDGAR from January 1994 — when electronic filing began — through the SEC's discontinuation of the form in February 2006, after the Energy Policy Act of 2005 repealed PUHCA 1935 effective February 8, 2006. Pre-1994 paper U5S annual reports are not included.
The dataset is a frozen historical archive. Because Form U5S was retired in February 2006, no new annual reports are generated; only error corrections or backfill of late-filed amendments would alter the population.
The dataset is delivered as ZIP containers, organized into monthly partitions (for example, 2005/2005-05.zip). Inside each container, individual records appear as accession folders containing a metadata.json file alongside SGML-wrapped TXT, HTML, or PDF document files.
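Records can also be read straight from a container without unpacking it. The sketch below is a minimal illustration with a hypothetical function name; it assumes that each file's immediate parent folder inside the ZIP is the accession folder, which matches the layout described here but has not been verified against every container.

```python
import json
import zipfile
from collections import defaultdict


def records_in_container(zip_source) -> dict:
    """Group a container's files by accession folder.

    zip_source may be a path or a binary file-like object.
    """
    records = defaultdict(lambda: {"metadata": None, "documents": []})
    with zipfile.ZipFile(zip_source) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            parts = info.filename.strip("/").split("/")
            if len(parts) < 2:
                continue  # file at archive root, not inside a record folder
            accession, name = parts[-2], parts[-1]
            if name == "metadata.json":
                records[accession]["metadata"] = json.loads(archive.read(info))
            else:
                records[accession]["documents"].append(name)
    return dict(records)
```

Each value holds the parsed header plus the names of the document files, which is enough to decide which exhibits to open next.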
Registered holding companies with publicly traded securities filed both a 10-K and a U5S each year. The 10-K is investor-facing under Regulation S-K/S-X (business description, MD&A, risk factors, audited consolidated financials), while U5S is PUHCA-staff-facing and uniquely provides system-company rosters, asset-transaction tables, cross-system officer and director listings, and consolidating financials that expose intra-system flows — disclosures the Exchange Act forms never required.