The Form N-5 Files Dataset is a complete EDGAR collection of Form N-5 and Form N-5/A registration statements filed by small business investment companies (SBICs) under both the Securities Act of 1933 and the Investment Company Act of 1940. One record is a single EDGAR submission of Form N-5 or Form N-5/A by an SBIC, materialized as a per-accession folder containing the primary registration document, all non-image exhibit documents, and a metadata.json file derived from the EDGAR submission header. The filers are SBICs licensed by the U.S. Small Business Administration (SBA) under the Small Business Investment Act of 1958, together with entities that have received SBA preliminary approval to apply for such a license. Coverage begins in March 1996 — the earliest electronic Form N-5 filings on EDGAR — and runs to the present, reflecting both the small SBIC universe and the fact that an issuer registers once and amends infrequently.
The dataset packages every EDGAR Form N-5 and Form N-5/A submission as a structured, parseable record. Form N-5 is a dual-purpose registration statement filed under both the Securities Act of 1933 and the Investment Company Act of 1940. It is available only to companies licensed under the Small Business Investment Act of 1958, or to entities that have received preliminary approval from the SBA to apply for such a license. Because the form serves two statutory regimes simultaneously, the same registrant is registered both as a securities issuer (33-Act) and as an investment company (40-Act) through one document, and the filing carries two SEC file numbers — a 333-xxxxxx series file number for the 33-Act registration and an 811-xxxxx series file number for the 40-Act registration.
The form is short and prescriptive relative to general-purpose registration forms such as S-1 or N-2: it consists of a fixed cover page, a numbered schedule of disclosure items, an exhibit index, an undertakings block, and a signatures block. The items cover the registrant's organization and capital structure, fundamental investment policies, business activities and SBA-related financing arrangements, financial statements, and required exhibits. Form N-5/A is the amendment variant; it carries the same skeleton but is filed to update, correct, or supplement a previously submitted N-5, including pre-effective amendments responding to SBA or SEC staff comments.
Records are organized inside monthly ZIP containers keyed by year and month (YYYY/YYYY-MM.zip). Within a container, the top-level directory is the month (YYYY-MM/), and each filing lives in a subfolder named for its 18-digit accession number with the dashes stripped (for example, 000119312506043511/). A record is therefore uniquely identified by the tuple of container month and accession number, and it is fully reconstructible from the container without reference to external EDGAR resources. Coverage starts in March 1996, in the early years of electronic filing on EDGAR, and continues to the present.
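The layout above can be sketched as a small path helper. This is a minimal illustration, not part of the dataset tooling: the function name record_location is hypothetical, and the year_month argument is assumed to be the month of the container that holds the filing.

```python
import re

def record_location(accession_no: str, year_month: str) -> tuple:
    """Return (container key, folder path inside the container) for one filing.

    accession_no may be dashed or undashed; year_month is "YYYY-MM".
    """
    if not re.fullmatch(r"\d{4}-\d{2}", year_month):
        raise ValueError(f"expected YYYY-MM, got {year_month!r}")
    folder = accession_no.replace("-", "")  # folder names drop the dashes
    if not re.fullmatch(r"\d{18}", folder):
        raise ValueError(f"expected an 18-digit accession number, got {accession_no!r}")
    year = year_month[:4]
    return f"{year}/{year_month}.zip", f"{year_month}/{folder}/"
```

Calling record_location("0001193125-06-043511", "2006-03") uses a placeholder month purely for illustration; the real month has to come from the container in which the filing is found.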
A record's folder contains three classes of artifact:
metadata.json: always present, exactly one per record. A single JSON object that mirrors the EDGAR submission header and enumerates every document originally filed.
Primary document: typed N-5 or N-5/A in its SGML envelope and sequenced 1, carrying the registration statement body.
Exhibit documents: each with an SGML envelope carrying its exhibit type (EX-11, EX-99, etc.) and a body holding the exhibit text.
Every document file is an EDGAR SGML-wrapped object rather than pure HTML or pure text. Each begins with a <DOCUMENT> block whose envelope tags (<TYPE>, <SEQUENCE>, <FILENAME>, <DESCRIPTION>) precede a <TEXT> block containing either an HTML body (<HTML>...</HTML>) or, for earlier-era filings, plain ASCII text. The envelope terminates with </TEXT></DOCUMENT>. Consumers that want the rendered body must strip or parse the envelope before handing the inner payload to an HTML renderer.
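The envelope handling described above can be sketched as follows. This is a minimal example that assumes each envelope tag sits on its own line with its value running to the end of that line; the function name parse_sgml_document is hypothetical.

```python
import re

ENVELOPE_TAGS = ("TYPE", "SEQUENCE", "FILENAME", "DESCRIPTION")

def parse_sgml_document(raw: str) -> dict:
    """Split one SGML-wrapped document into envelope fields and the inner body."""
    text_open = re.search(r"<TEXT>", raw, flags=re.IGNORECASE)
    if text_open is None:
        raise ValueError("no <TEXT> block found")
    header, rest = raw[:text_open.start()], raw[text_open.end():]

    fields = {}
    for tag in ENVELOPE_TAGS:
        # Envelope tags are not closed: the value runs to the end of the line.
        match = re.search(rf"<{tag}>([^\r\n<]*)", header, flags=re.IGNORECASE)
        if match:
            fields[tag.lower()] = match.group(1).strip()

    # Everything up to the first </TEXT> is the payload (HTML or plain text).
    text_close = re.search(r"</TEXT>", rest, flags=re.IGNORECASE)
    body = rest[:text_close.start()] if text_close else rest
    fields["body"] = body.strip()
    return fields
```

The returned body can then be passed to an HTML renderer or treated as plain text, while the envelope fields stay available for matching against metadata.json.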
Two classes of EDGAR content present in the original submission are excluded from the record folder:
Image documents (GRAPHIC, JPG, GIF): omitted dataset-wide. They remain referenced in metadata.json under documentFormatFiles so that the record's relationship to the original submission is auditable, but the binary files themselves are not packaged.
The combined .txt wrapper: the aggregate single-file representation of the EDGAR submission (the file whose name matches the dashed accession number with a .txt suffix) is not duplicated inside the folder. The per-document files supersede it for any extraction purpose.
The body of the N-5 or N-5/A primary document follows a fixed cover-and-schedule pattern. In order, a reader of the rendered document encounters:
"As filed" banner — a centered line reading "As filed with the Securities and Exchange Commission on [date]" together with the two SEC file numbers (33-Act and 40-Act).
Form title and amendment header — the form name ("FORM N-5" or "FORM N-5/A"), and, for amendments, the amendment number and stage (e.g., "Pre-Effective Amendment No. 1").
Registrant identification block — full legal name of the SBIC, state or jurisdiction of incorporation, address of principal executive offices, telephone number, and IRS Employer Identification Number.
Agent for service of process — name, address, and contact details of the person designated to receive service on behalf of the registrant.
Approximate date of proposed sale to the public — a short box or sentence indicating proposed offering timing.
Explanatory note (when applicable) — a brief narrative explaining the purpose of an amendment or the scope of the registration.
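Numbered schedule of disclosure items — the heart of the form, presented as a numbered list whose items correspond to Form N-5's prescribed schedule. The items cover the registrant's organization and capital structure, fundamental investment policies, business activities and SBA-related financing arrangements, financial statements, and required exhibits.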
Each item is presented either inline as a short narrative response or by reference to an exhibit filed separately with the submission.
Exhibit index — a tabular list mapping each exhibit number to its description and (where the filer chose) to the exhibit's filename within the submission.
Undertakings — the registrant's formal statutory undertakings, including the standard Securities Act undertakings adapted for the SBIC context.
Signatures — the registrant's signature, the date and city of execution, and the signatures of officers and a majority of directors as required by Section 6(a) of the Securities Act, frequently executed under powers of attorney.
Because Form N-5 leans heavily on incorporation by reference, much of the substantive disclosure lives in the attached exhibits rather than inside the primary document itself. The primary document often functions more as an index and certification wrapper than as a standalone narrative.
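Exhibits are stored as separate files within the same accession folder, one file per exhibit, each wrapped in its own SGML envelope. The typical exhibit set for an N-5 or N-5/A includes legal opinions, charter documents, bylaws, investment advisory agreements, consents, and SBA debenture and participating-security commitments.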
Each exhibit file's SGML envelope identifies the exhibit by EDGAR type code (for example, <TYPE>EX-11, <TYPE>EX-99.1), preserves the original filename as submitted, and labels it with the descriptive text the filer attached. The body is whatever the filer submitted — most commonly an HTML rendering of a legal letter, a contract, or a tabular consent.
metadata.json contents
The metadata.json file is a single JSON object that captures the EDGAR submission header in structured form. Its top-level fields include:
formType: "N-5" or "N-5/A".
accessionNo: the dashed 18-digit accession identifier (e.g., 0001193125-06-043511).
filedAt: ISO 8601 datetime including timezone offset, reflecting EDGAR's accept-time.
description: the human-readable description carried on the EDGAR submission (e.g., "Form N-5/A - Registration statement for small business investment companies: [Amend]").
linkToFilingDetails: URL to the primary HTML document on sec.gov.
linkToTxt: URL to the full EDGAR submission .txt wrapper.
linkToHtml: URL to the EDGAR filing index page (the *-index.htm).
linkToXbrl: URL to the XBRL instance if present; empty across the dataset.
id: a 32-character hex identifier, stable per filing.
documentFormatFiles: an ordered array with one entry per document in the original EDGAR submission, including documents the dataset does not physically materialize (image graphics and the combined .txt wrapper). Each entry carries sequence, size, documentUrl, description, and type. This array allows a consumer to reconcile the on-disk files against the original submission and to retrieve omitted images directly from sec.gov if needed.
dataFiles: an array reserved for XBRL or other structured data attachments; empty across the dataset.
entities: an array of filer entities. Because Form N-5 is a dual registration, the same registrant CIK typically appears twice in entities, once with act: "33" (Securities Act of 1933) and once with act: "40" (Investment Company Act of 1940). Each entry carries cik (zero-padded to ten digits), companyName with the EDGAR role suffix in parentheses (e.g., "(Filer)"), type (mirroring formType), fileNo (the SEC file number scoped to that act: 333- for 33-Act, 811- for 40-Act), filmNo (EDGAR film number for that act entry), stateOfIncorporation (two-letter state code), fiscalYearEnd (MMDD string), irsNo (digits-only EIN), and tickers (an array of ticker symbols, often empty for SBICs that are not exchange-listed).
The dual-act representation in entities is the distinguishing structural feature of N-5 metadata: the registrant is the same legal entity but registers under two statutes simultaneously, and each statute's file number, film number, and act code is materialized as a separate entity row.
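The dual-act entities array can be collapsed into a per-act lookup. The sketch below relies only on the act and fileNo fields described above; the function name file_numbers_by_act is hypothetical, and any values used to exercise it are placeholders rather than real file numbers.

```python
def file_numbers_by_act(metadata: dict) -> dict:
    """Map each act code ("33" or "40") to the file number of its entity row."""
    result = {}
    for entity in metadata.get("entities", []):
        act = entity.get("act")
        file_no = entity.get("fileNo")
        if act and file_no:
            # Keep the first row seen for each act; later duplicates are ignored.
            result.setdefault(act, file_no)
    return result
```

A record whose result lacks either the "33" or the "40" key departs from the typical dual-registration pattern and may deserve a manual look.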
A record includes the complete, parseable representation of the registration statement and all of its non-image attachments: the SGML-wrapped primary document with its cover page, identification block, schedule of disclosure responses, exhibit index, undertakings, and signatures; every textual or HTML exhibit referenced in the exhibit index; and the structured metadata.json header. This is sufficient to render the filing in its original form, to extract the schedule items and exhibit contents to text, and to cross-reference the registrant against EDGAR's company database via CIK and file numbers.
A record excludes binary image attachments (GIF, JPEG, and other GRAPHIC documents), even when the primary document references them inline — typical excluded items include scanned signatures, logos, and occasional photographs. The combined accession-level .txt submission wrapper produced by EDGAR is also not duplicated inside the folder, since the per-document files together carry the same content. Both excluded classes are still listed in documentFormatFiles with their original URLs on sec.gov, preserving a complete audit trail of what was filed.
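The audit trail described above can be checked mechanically. The following sketch lists the documentFormatFiles entries that have no counterpart in the record folder. It assumes that an on-disk file name equals the last path segment of the entry's documentUrl, which this document does not state explicitly; missing_documents is a hypothetical name.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

def missing_documents(metadata: dict, on_disk: set) -> list:
    """Return documentFormatFiles entries with no matching file in the folder.

    on_disk is the set of file names present in the accession folder.
    """
    missing = []
    for entry in metadata.get("documentFormatFiles", []):
        # Assumption: the local file name is the final segment of documentUrl.
        name = PurePosixPath(urlparse(entry.get("documentUrl", "")).path).name
        if name and name not in on_disk:
            missing.append(entry)
    return missing
```

Under that assumption, the result should contain only image entries and the combined .txt wrapper; anything else would point to an incomplete record.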
Form N-5's substantive disclosure schedule has remained stable over the dataset's coverage period. The numbered disclosure framework, the dual-act registration architecture, the exhibit index, and the undertakings and signatures blocks have not undergone the kind of rule-driven expansion that 10-K, 20-F, or N-2 filings have experienced. The only material structural variation across years is in amendment headers (which appear or vanish depending on whether the filing is an initial N-5 or an N-5/A) and in the exhibit set, which varies with the issuer's specific charter documents, contracts, and counsel arrangements. Because the SBIC registration framework is narrowly defined by the Small Business Investment Act of 1958 and the SBA's licensing regime, the form has not absorbed disclosure mandates from later SEC rule changes targeted at operating companies.
Filings from the late 1990s are predominantly ASCII text inside the SGML document envelope — the <TEXT> block carries unformatted plain text with section headers in upper case, indentation produced by spaces, and tabular content laid out with column alignment rather than HTML tables. As HTML filing became the norm in the early 2000s, the <TEXT> block transitioned to carrying <HTML>...</HTML> content with explicit tags, inline styling, and HTML tables; file extensions correspondingly shift from .txt to .htm. Across both eras the outer SGML envelope is preserved, so a single parsing strategy — strip the SGML envelope, then dispatch on whether the body opens with <HTML> or with plain ASCII — handles every record.
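The dispatch step of that strategy might look like the sketch below, which takes a body that has already had its SGML envelope removed. It uses only the Python standard library; body_to_text is a hypothetical name, and the HTML branch is deliberately crude (it collapses whitespace and does not reproduce table layout).

```python
import re
from html.parser import HTMLParser

class _TextCollector(HTMLParser):
    """Collects the character data between tags."""
    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

def body_to_text(body: str) -> tuple:
    """Return (kind, text), where kind is "html" or "ascii"."""
    if re.match(r"\s*<HTML\b", body, flags=re.IGNORECASE):
        collector = _TextCollector()
        collector.feed(body)
        collector.close()
        # Collapse runs of whitespace left behind by removed tags.
        return "html", " ".join(" ".join(collector.parts).split())
    # Plain ASCII filings rely on spacing for layout, so leave them untouched.
    return "ascii", body
```

Leaving the ASCII branch untouched preserves the space-aligned columns that late-1990s filings use in place of HTML tables.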
Several characteristics of the underlying form bear on how records should be read:
The two SEC file numbers (333- and 811-) chain each amendment to the originating registration.
The 333- 33-Act file number and the 811- 40-Act file number serve different downstream lookups: the former joins to securities registration records, the latter to investment company registrations and related filings (N-SAR, N-CEN, and successors where applicable).
Treating the SGML-wrapped .htm files as standalone HTML will produce parse errors or stray tag text at the head of the document. The envelope must be removed before HTML rendering, but its tags (<TYPE>, <SEQUENCE>, <FILENAME>, <DESCRIPTION>) carry useful metadata that aligns with documentFormatFiles in metadata.json.
Because image files are excluded, any <img src="..."> reference inside a rendered N-5 or N-5/A body resolves to a missing local file; the corresponding URL on sec.gov is available through the documentFormatFiles entry for that graphic.
Each record in this dataset is a registration statement (Form N-5) or an amendment to one (Form N-5/A) filed on EDGAR by a small business investment company ("SBIC") acting as the issuer-registrant. The SBIC signs and submits the filing in its own name. Sponsors, advisers, custodians, and underwriters may be named inside the document as related persons, but they are not the filer.
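Form N-5 is restricted to two registrant categories: SBICs that hold an operating license issued by the SBA under the Small Business Investment Act of 1958, and entities that have received written preliminary approval from the SBA to submit a license application.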
Both types operate in substance as closed-end investment companies investing in eligible small business concerns under 13 C.F.R. Part 107. Required signatures are those of the registrant, its principal executive, financial, and accounting officers, and a majority of its board.
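Form N-5 filings are event-driven, not periodic. The trigger is the SBIC's need to publicly offer securities while simultaneously being subject to the Investment Company Act of 1940. A single Form N-5 satisfies two registration requirements at once: registration of the offered securities under the Securities Act of 1933 and registration of the issuer as an investment company under the Investment Company Act of 1940.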
The form exists because the SEC designed a combined vehicle for issuers caught by both regimes; filing separately under each statute would be duplicative for this narrow class.
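Specific trigger points are the initial registration on Form N-5 and each later Form N-5/A filed to update, correct, or supplement it, including pre-effective amendments responding to SBA or SEC staff comments.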
There is no annual Form N-5 filing analogous to Form 10-K. Form N-5 is exclusively a registration-statement form; N-5/A is its sole amendment channel.
Form N-5's narrow filer population and dual-statute role create partial overlap with other investment-company registration forms, general Securities Act registrations, Exchange Act reports, and SBA submissions. The closest neighbors are listed below in order of substantive proximity.
Form N-2 registers closed-end management investment companies and BDCs under both the 1933 and 1940 Acts. It is the nearest substantive cousin to N-5: both are dual-statute registrations for non-redeemable vehicles that invest in illiquid portfolio companies, and SBICs sometimes migrate to N-2 (e.g., when operating as a BDC without SBA licensing).
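Key differences: Form N-2 covers all closed-end funds, listed BDCs, interval funds, and tender-offer funds and generates far more filings, while Form N-5 is restricted to SBA-licensed or pre-approved SBICs and uses a shorter SBIC-specific format rather than the full prospectus/SAI framework.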
The two are not substitutes in either direction: N-2 will not surface SBIC-specific filings, and N-5 captures only a niche corner of the closed-end universe.
N-1A is also a dual 1933/1940 Act registration, but for open-end mutual funds and most ETFs. It differs from N-5 in product structure (continuously offered, daily-redeemable shares vs. non-redeemable SBIC securities), filer population, disclosure depth (full statutory prospectus plus SAI), and filing cadence (continuous 485-series amendments). N-1A is one of the highest-volume registration families on EDGAR; N-5 is among the lowest. Shared regulatory framework, fundamentally different fund structures.
Forms N-CSR, N-CEN, and N-PORT are recurring post-registration reports: certified shareholder reports (N-CSR), annual census data (N-CEN), and monthly portfolio holdings (N-PORT). N-5 is a one-time-plus-amendments registration document, not periodic reporting. SBICs registered on N-5 are generally outside the full periodic-reporting regime that applies to typical 1940 Act registrants, so SBICs rarely appear in N-PORT or N-CEN. Researchers seeking ongoing SBIC holdings or financial data will generally not find them in those datasets.
S-1 (operating companies) and S-11 (real estate) register securities under the 1933 Act only. The defining boundary is the 1940 Act overlay: N-5 simultaneously satisfies Investment Company Act registration; S-1 and S-11 do not. Their disclosure architecture (MD&A, business description, operating-company or REIT-framed risk factors) does not address 1940 Act fund-governance content, and an SBA-licensed SBIC issuing securities cannot substitute S-1 for N-5.
Forms 10-K and 10-Q are the most common source of confusion for users seeking SBIC financial reporting over time. Form N-5 does not contain Exchange Act periodic reporting; an SBIC with Exchange Act-registered securities files 10-K and 10-Q separately. For audited annual financials and MD&A across multiple years, 10-K is the correct source. N-5 is the registration document; 10-K/10-Q are the ongoing reports.
8-A registers a class of securities under the 1934 Act for issuers that already have an effective 1933 Act registration, typically tied to exchange listing. It is complementary to N-5, not a substitute: an SBIC listing publicly may file both. 8-A is a brief, incorporation-by-reference filing; N-5 carries the substantive disclosure.
Schedule 14A documents specific shareholder-vote events (board elections, advisory contract approvals, fundamental policy changes). N-5 documents the entity at registration or amendment. The overlap is event-driven and the disclosure content (vote items, meeting-tied director information, compensation tables) is structurally distinct from registration disclosure. Proxy filings complement N-5 but do not replace it.
SBA license applications, SBA Form 468 financial reports, and related SBIC submissions cover the same filer population and statutory regime (Small Business Investment Act of 1958), but are filed with the SBA, not the SEC, and do not appear on EDGAR. They lie entirely outside the N-5 dataset. A complete view of an SBIC requires pairing N-5 with SBA-source data.
Form N-5 is distinct because it captures a statutorily narrow, dual-statute (1933 and 1940 Act) registration used only by SBA-licensed or pre-approved SBICs. It is not periodic reporting, not a general fund-registration corpus, and not a full SBIC dataset. For broader closed-end and BDC coverage, use N-2; for ongoing SBIC financials, use 10-K/10-Q where available; for SBA-side regulatory data, no SEC dataset suffices. The N-5 Files dataset spans from 1996 to present and reflects the small, specialized SBIC registration population on EDGAR; it is best treated as a niche reference set rather than a substitute for any larger N-series or Securities Act corpus.
The user base is narrow but distinct: a handful of legal, regulatory, academic, and engineering roles, each reading a different slice of the Form N-5 corpus.
Fund-formation attorneys drafting or amending an N-5 use the dataset as a precedent library. They study the primary document for business descriptions, capital structure, investment policies, and the interaction between SBIC disclosures and Investment Company Act Section 8 election language. Exhibits, consents, and signature blocks inform charter conventions and SBA representations. Supports redlines across N-5 and N-5/A filings on a form where market practice cannot be reconstructed from any single filing.
Sponsor-side counsel review peer disclosures of SBA-guaranteed debenture leverage, affiliated-adviser conflicts, fee structures, and the licensing-versus-registration interplay. They focus on risk factors, related-party disclosures, and the description of securities. Supports the choice between Form N-5 and alternative registration pathways, and diligence on legacy SBIC entities being revived or restructured.
Federal small-business program staff and economists track which SBICs entered public registration, when, and in what corporate form. They use filer metadata (CIK, registrant name, state, filing date), business descriptions, and capitalization sections. Supports longitudinal program analysis from 1996 forward and cross-referencing EDGAR against SBA licensing records.
Finance and public-policy academics treat the corpus as a primary source. They read the full primary document for investment objectives, portfolio targets, and management compensation, and use filing and amendment dates to study registration timing. Given the small filer population, the dataset suits qualitative content analysis and case-based research rather than large-sample econometrics.
Historians of U.S. capital markets and federal small-business policy use the dataset to trace the post-1996 trajectory of an instrument grounded in 1958 legislation. They focus on registrant identities, filing dates, business-description language, and early-filing exhibits as archival primary documents.
Operations and compliance staff servicing SBIC clients verify registration status and locate operative statements and amendments. They use the filing index, accession numbers, effective and amendment dates, and cover-page identifiers for onboarding diligence and recordkeeping reconciliations.
Deal lawyers and analysts pull a target's full N-5 history, reading the original primary document and tracing N-5/A amendments to see how disclosures evolved. Supports diligence memos, reps and warranties, and disclosure schedules for transactions involving SBIC vehicles.
Engineers building EDGAR ingestion pipelines and filing-search tools use the dataset to validate handling of low-frequency form types. They consume structured metadata (CIK, form type, filing date, accession number, primary document URL) and exhibit indices as QA fixtures for parsing, coverage, and registrant-level filing-history construction.
Teams building retrieval-augmented systems for securities and investment-company law embed N-5 primary documents and exhibits as a small, high-signal corpus. Routing SBIC, dual-statute, and 1958 Act queries against this corpus reduces hallucination on a form too obscure for general-purpose models.
The Form N-5 dataset's value lies in completeness across a thin, hard-to-assemble filing population. The use cases below tie to concrete sections of the record (primary schedule items, exhibits, metadata.json entities) and to specific downstream workflows.
Fund-formation counsel pull every N-5 and N-5/A in the corpus, extract the numbered disclosure schedule from each primary document, and align responses item-by-item across filers. Output is a redline-ready template that captures market practice for investment-policy language, SBA debenture leverage descriptions, Section 8 election wording, and standard undertakings — precedent that no single filing can supply on a form with such a small population.
Compliance and diligence teams join records on the 333- (33-Act) and 811- (40-Act) file numbers carried in metadata.json entities to chain each N-5/A amendment back to its originating N-5. This produces a registrant-level filing history with effective dates and amendment sequence, used for onboarding diligence, recordkeeping reconciliations, and target-entity history in SBIC M&A.
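A registrant-level history of this kind can be sketched by grouping metadata.json objects on the 40-Act file number and ordering each group by filedAt. Keying on the 811- number rather than the 333- number is a design choice made for this illustration, not something the dataset prescribes, and filing_histories is a hypothetical name.

```python
from collections import defaultdict
from datetime import datetime

def filing_histories(records: list) -> dict:
    """Group metadata.json objects by 40-Act file number, oldest filing first."""
    histories = defaultdict(list)
    for meta in records:
        for entity in meta.get("entities", []):
            if entity.get("act") == "40" and entity.get("fileNo"):
                histories[entity["fileNo"]].append(meta)
                break  # one 40-Act row per filing is enough to place it
    for chain in histories.values():
        # filedAt is ISO 8601 with a timezone offset, so the values compare correctly.
        chain.sort(key=lambda m: datetime.fromisoformat(m["filedAt"]))
    return dict(histories)
```

Each resulting list would normally begin with the original N-5 followed by its N-5/A amendments. A chain that begins with an amendment suggests the original filing falls outside the set of records passed in.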
Deal lawyers and sponsor counsel index records by exhibit type (EX-11 legal opinions, charter documents, bylaws, investment advisory agreements, SBA debenture and participating-security commitments) and pull the per-exhibit HTML bodies from each accession folder. Output is a per-entity exhibit set used in diligence memos, disclosure schedules, and benchmarking of affiliated-adviser fee structures across SBIC sponsors.
SBA program staff and economists consume the structured entities, filedAt, cik, stateOfIncorporation, and fileNo fields from metadata.json to build a roster of SBICs that entered SEC registration from March 1996 forward. The roster joins to SBA license lists for longitudinal program analysis of which licensees pursued public-market capital and when.
LLM and RAG teams embed the SGML-stripped primary documents and exhibit bodies as a closed corpus for routing dual-statute, SBA-licensed, and 1958 Act queries. Because general-purpose models underrepresent Form N-5, this small targeted index materially reduces hallucination on questions about SBIC capital structure, investment-policy language, and 33/40-Act dual registration mechanics.
Engineers building EDGAR ingestion pipelines use the dataset to test parsing of the SGML envelope-plus-body pattern across both 1990s ASCII filings and modern HTML filings, validate handling of the dual-act entities array, and confirm that image-excluded records still reconcile against documentFormatFiles. The compact dataset scope makes full-corpus regression testing tractable in a single run.
The Form N-5 Files dataset is accessible through three endpoints: a JSON index for metadata discovery, a full archive download, and per-container ZIP downloads.
Dataset Index JSON API: https://api.sec-api.io/datasets/form-n5-files.json
Returns dataset-level metadata (name, description, last updated timestamp, earliest sample date, total records, total size, form types covered, container format, file types) together with the full dataset download URL and the list of individual container files. Each container entry includes its key, size, record count, updated timestamp, and download URL. This endpoint can be polled to detect which containers were touched in the most recent refresh and to selectively download only those updated containers. This endpoint does not require an API key.
Example response:
{
  "datasetId": "1f13365b-9ae0-6a5b-860b-473e5fcfdf29",
  "datasetDownloadUrl": "https://api.sec-api.io/datasets/form-n5-files.zip",
  "name": "Form N-5 Files Dataset",
  "description": "Form N-5 is a registration statement filed by small business investment companies under both the Securities Act of 1933 and the Investment Company Act of 1940.",
  "updatedAt": "2026-04-16T08:47:31.117Z",
  "earliestSampleDate": "1996-03-01",
  "totalRecords": 84,
  "totalSize": 1561887,
  "formTypes": ["N-5", "N-5/A"],
  "containerFormat": "ZIP",
  "fileTypes": ["TXT", "JSON", "HTML"],
  "containers": [
    {
      "downloadUrl": "https://api.sec-api.io/datasets/form-n5-files/2026/2026-03.zip",
      "key": "2026/2026-03.zip",
      "size": 13818,
      "records": 2,
      "updatedAt": "2026-04-16T08:47:31.117Z"
    }
  ]
}
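Polling the index for refreshed containers could be sketched as below. The field names come from the example response; the helper names (fetch_index, containers_updated_since) are hypothetical, and the code assumes that every container entry carries an updatedAt timestamp in the format shown.

```python
import json
from datetime import datetime
from urllib.request import urlopen

INDEX_URL = "https://api.sec-api.io/datasets/form-n5-files.json"

def _parse_ts(value: str) -> datetime:
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))

def fetch_index(url: str = INDEX_URL) -> dict:
    """Download and decode the dataset index (no API key needed)."""
    with urlopen(url) as response:
        return json.load(response)

def containers_updated_since(index: dict, since: str) -> list:
    """Return container entries whose updatedAt is later than `since`."""
    cutoff = _parse_ts(since)
    return [
        container
        for container in index.get("containers", [])
        if _parse_ts(container["updatedAt"]) > cutoff
    ]
```

A typical loop would store the timestamp of the last successful sync, call containers_updated_since(fetch_index(), last_sync), and download only the downloadUrl of each returned entry.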
Download Entire Dataset: https://api.sec-api.io/datasets/form-n5-files.zip?token=YOUR_API_KEY
Downloads the complete dataset as a single ZIP archive containing all N-5 and N-5/A filings since 1996. This endpoint requires an API key.
Download Single Container: https://api.sec-api.io/datasets/form-n5-files/2026/2026-03.zip?token=YOUR_API_KEY
Downloads one individual container ZIP file (for example a monthly archive) instead of the full dataset. Container paths are listed in the dataset index JSON response. This endpoint requires an API key.
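Once a container has been downloaded, its records can be enumerated from the ZIP listing. The sketch below assumes the YYYY-MM/accession/file layout described earlier in this document and works on bytes already in memory; open_container and records_in_container are hypothetical names.

```python
import io
import zipfile
from collections import defaultdict

def open_container(data: bytes) -> zipfile.ZipFile:
    """Open a downloaded container held in memory."""
    return zipfile.ZipFile(io.BytesIO(data))

def records_in_container(zf: zipfile.ZipFile) -> dict:
    """Map each accession folder name to the sorted file names inside it."""
    records = defaultdict(list)
    for name in zf.namelist():
        parts = name.split("/")
        # Expected layout: YYYY-MM/<18-digit accession>/<file>
        if len(parts) == 3 and parts[2]:
            records[parts[1]].append(parts[2])
    return {accession: sorted(files) for accession, files in records.items()}
```

Every accession folder returned should include metadata.json; reading that member first gives the form type and entity rows needed to interpret the remaining files.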
The dataset covers Form N-5 and its amendment variant Form N-5/A — a dual-purpose registration statement filed simultaneously under the Securities Act of 1933 and the Investment Company Act of 1940. The form is exclusively a registration-statement form, with N-5/A as its sole amendment channel.
One record is a single EDGAR submission of Form N-5 or Form N-5/A by a small business investment company, materialized as a per-accession folder containing the primary registration document, all non-image exhibit documents, and a metadata.json file derived from the EDGAR submission header. The record unit is a complete filing, not an extracted item or event observation.
Only two registrant categories may file: SBICs that hold an operating license issued by the SBA under the Small Business Investment Act of 1958, and entities that have received written preliminary approval from the SBA to submit a license application. Private SBICs that rely on exclusions and never register publicly, BDC-elected issuers, and non-SBIC closed-end funds do not file Form N-5.
Coverage begins in March 1996 — the earliest electronic Form N-5 filings on EDGAR — and continues to the present. Pre-1996 paper filings by SBICs are not included. Filing density is low and bursty; monthly containers are often sparse or empty.
Records are organized inside monthly ZIP containers keyed by year and month (YYYY/YYYY-MM.zip). Each filing lives in an accession-numbered subfolder containing SGML-wrapped document files (primary .htm or .txt plus exhibit files) and a metadata.json file. Image attachments and the combined accession-level .txt wrapper are excluded but remain referenced in metadata.json.
Both Form N-5 and Form N-2 are dual-statute (1933 and 1940 Act) registrations for non-redeemable investment vehicles, but Form N-2 covers all closed-end funds, listed BDCs, interval funds, and tender-offer funds and generates far more filings, while Form N-5 is restricted to SBA-licensed or pre-approved SBICs and uses a shorter SBIC-specific format rather than the full prospectus/SAI framework.
Form N-5 is the registration document only and does not contain Exchange Act periodic reporting. An SBIC with Exchange Act-registered securities files 10-K and 10-Q separately for audited annual financials and MD&A. SBA-side regulatory data — including SBA Form 468 financial reports — is filed with the SBA rather than the SEC and does not appear on EDGAR or in this dataset.