The Form S-4EF Files Dataset is a complete corpus of EDGAR submissions on Form S-4EF and its pre-effective amendment Form S-4EF/A — the auto-effective Securities Act registration statement used when a depository institution reorganizes into a bank holding company or savings and loan holding company under General Instruction G of Form S-4. Each record is one EDGAR accession: the primary registration statement, every separately filed exhibit (plan of reorganization, charter and bylaws, legality and tax opinions, auditor consents, revocable proxy), and a metadata.json manifest capturing the EDGAR header. Filers are newly chartered shell holding companies, and the registered securities are the holding-company shares issued to the predecessor bank's or thrift's shareholders in a share-for-share exchange. The dataset spans EDGAR submissions from February 1, 1994 to present, distributed as monthly ZIP containers with TXT, JSON, and HTML file types.
Form S-4EF is a specialized, auto-effective registration statement filed under the Securities Act of 1933 to register securities issued in connection with the formation of a bank or savings-and-loan holding company. It is governed by General Instruction G of Form S-4, which restricts eligibility to transactions in which a depository institution reorganizes into a holding-company structure on a share-for-share basis with no other proposals attached — no anti-takeover charter amendments, no unrelated mergers, no compensation plan adoption tied to the reorganization vote. When those eligibility conditions are satisfied, the registration statement becomes effective automatically upon filing, with no SEC staff review cycle. The S-4EF/A variant is a pre-effective amendment used to correct or supplement an S-4EF that has already been submitted; it carries the same content envelope as the original S-4EF and is filed against the same registration file number (a 333- series number).
Substantively, an S-4EF is a combined registration statement and proxy statement/prospectus. The same document both registers the new holding-company shares with the SEC and solicits the predecessor bank's shareholders to approve the plan of reorganization. The disclosure scaffolding follows the line items of Form S-4 as adapted for a holding-company formation, but the narrative is short relative to a full M&A S-4 because the underlying transaction is purely structural.
The dataset includes every Form S-4EF and Form S-4EF/A submission accepted by EDGAR from February 1, 1994 forward. For each accession, the dataset packages every text-based document from the original EDGAR submission together with a canonical metadata manifest. Image files referenced inside the prospectus (signature graphics, logos, organizational diagrams) are excluded by design, as is the rolled-up "complete submission" .txt file that EDGAR generates as a concatenation of every document in the submission. Records are distributed as monthly ZIP containers, and file types inside each container are TXT, JSON, and HTML.
A single record in the Form S-4EF Files Dataset is one complete EDGAR submission of either a Form S-4EF or a Form S-4EF/A registration statement, identified by its 18-digit EDGAR accession number. Each record materializes as one accession-level folder containing every text-based document that was part of the original EDGAR submission together with a metadata.json manifest that captures the EDGAR header. The unit is therefore the filing as a whole, not the individual exhibit, not a single data point inside the prospectus, and not an event extracted from the document. One filing equals one folder equals one record.
Records are distributed inside monthly ZIP containers following the path convention <dataset>/<YYYY>/<YYYY-MM>.zip. Inside each archive the top-level folder repeats the year-month (for example 2005-03/), and beneath that every record is stored as a folder whose name is the EDGAR accession number written as 18 digits with no dashes (for example 000090883405000201). The dashed form of the same accession number is preserved inside metadata.json as accessionNo.
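The folder-name and path conventions above can be handled with a few small helpers. This is a minimal sketch, assuming member paths inside a container have exactly three components (year-month, accession folder, file name) as described; the function names are illustrative and not part of any published API.

```python
import re
from pathlib import PurePosixPath


def to_dashed(accession: str) -> str:
    """Convert an 18-digit folder name to the dashed EDGAR form (10-2-6 digits)."""
    if not re.fullmatch(r"\d{18}", accession):
        raise ValueError(f"not an 18-digit accession number: {accession!r}")
    return f"{accession[:10]}-{accession[10:12]}-{accession[12:]}"


def to_folder_name(accession_no: str) -> str:
    """Convert the dashed accessionNo from metadata.json back to the folder name."""
    return accession_no.replace("-", "")


def split_member_path(member: str):
    """Split a ZIP member path into (year_month, dashed_accession, filename)."""
    parts = PurePosixPath(member).parts
    if len(parts) != 3:
        raise ValueError(f"unexpected member path: {member!r}")
    year_month, folder, filename = parts
    return year_month, to_dashed(folder), filename


print(split_member_path("2005-03/000090883405000201/metadata.json"))
```

Validating the 18-digit shape up front makes stray files at the top of an archive fail loudly instead of producing a malformed accession number.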
Each accession folder contains:
- A metadata.json file, the canonical EDGAR header and document manifest for the submission.
- One .txt file per document in the original submission, including the primary registration statement and each separately filed exhibit.
- Where present, .htm/.html document files in addition to or in place of .txt wrappers.

The file types found in the dataset are TXT, JSON, and HTML. Image files (typically .jpg, .gif, and .png graphics referenced by the prospectus, such as signature blocks, logos, or organizational diagrams) are excluded from the dataset by design. The rolled-up "complete submission text file" that EDGAR generates as a concatenation of every document in the submission is also not materialized into the folder; it is referenced by URL inside metadata.json but omitted to avoid duplicating content already broken out per document. The structured-data attachment slot is empty: S-4EF filings carry no XBRL or other machine-readable financial-data exhibits, and the dataset surfaces this as an empty dataFiles array and an empty linkToXbrl string.
The local file names of the document files are filer-driven slugs (typical patterns combine an issuer abbreviation, an exhibit hint, and the filing date, e.g. ucs4efa_0307.txt, ex5_0307.txt, ex991_0307.txt). These names are not authoritative for identifying the role of a document; the canonical mapping from filename to document type is the documentFormatFiles[] manifest in metadata.json, joined to the folder by the trailing path component of each documentUrl.
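The join between the manifest and the files on disk can be sketched as follows. This is a minimal illustration, assuming metadata.json has already been loaded into a dict; the sample URLs are placeholders invented for the example, and rows with a blank type are skipped on the assumption that they point at the complete-submission rollup rather than a file in the folder.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def filename_to_type(metadata: dict) -> dict:
    """Map each on-disk filename to its EDGAR document type."""
    roles = {}
    for entry in metadata.get("documentFormatFiles", []):
        doc_type = entry.get("type", "").strip()
        if not doc_type:
            continue  # blank type: complete-submission row, not a file in the folder
        # The trailing path component of documentUrl is the local filename.
        filename = PurePosixPath(urlparse(entry["documentUrl"]).path).name
        roles[filename] = doc_type
    return roles


# Placeholder manifest for illustration; the URLs are not real sec.gov paths.
sample_metadata = {
    "documentFormatFiles": [
        {"type": "S-4EF/A", "documentUrl": "https://example.invalid/x/ucs4efa_0307.txt"},
        {"type": "EX-5", "documentUrl": "https://example.invalid/x/ex5_0307.txt"},
        {"type": "EX-99.1", "documentUrl": "https://example.invalid/x/ex991_0307.txt"},
        {"type": "", "documentUrl": "https://example.invalid/x/0000908834-05-000201.txt"},
    ]
}

print(filename_to_type(sample_metadata))
```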
metadata.json is the canonical pivot for a record. Each manifest carries a flat block of header fields plus three arrays.
Header fields:
- formType: either S-4EF or S-4EF/A.
- accessionNo: the dashed EDGAR accession (for example 0000908834-05-000201).
- filedAt: the EDGAR acceptance timestamp as ISO-8601 with offset.
- description: the human-readable form description echoed from EDGAR, typically containing the General Instruction G phrasing and an [Amend] suffix on /A filings.
- linkToFilingDetails: URL to the primary document on sec.gov.
- linkToTxt: URL to the rolled-up complete-submission .txt on sec.gov.
- linkToHtml: URL to the EDGAR -index.htm landing page for the accession.
- linkToXbrl: empty string for this dataset.
- id: an opaque hexadecimal hash, unique per record.

Arrays:

- documentFormatFiles[]: one entry per constituent document in the submission, plus a trailing manifest row for the complete-submission rollup.
- dataFiles[]: empty array; S-4EF carries no structured-data attachments.
- entities[]: one entry per filing party drawn from the EDGAR <COMPANY-DATA> header block.

Each documentFormatFiles[] entry describes one constituent document and carries:

- sequence: the document's position in the submission, as a string from the EDGAR header.
- size: byte size as a string.
- documentUrl: direct sec.gov URL for the document; its trailing path component is the on-disk filename inside the accession folder.
- description: the filer-supplied caption (for example B&T OPINION, B&T TAX OPINION, REVOCABLE PROXY, or a registrant-specific <NAME> FORM S-4EF/A label).
- type: the EDGAR document type code (for example S-4EF, S-4EF/A, EX-2, EX-3.1, EX-5, EX-8, EX-23, EX-99.1).

The manifest always contains a trailing entry whose type and sequence are blank strings and whose description is Complete submission text file. This row references the complete-submission rollup by URL only; it is a normal artifact of EDGAR's submission packaging and should be skipped by extraction tools that iterate the manifest.
Each entities[] entry captures one filer's <COMPANY-DATA> block:
- companyName: filer name with a role suffix in parentheses (for example ... (Filer)).
- cik: zero-padded ten-digit CIK.
- irsNo: the filer's EIN.
- fileNo: the SEC file number; for S-4EF this is typically a 333- registration file number.
- filmNo: the EDGAR film number assigned at acceptance.
- act: the Securities Act designator, 33 for S-4EF.
- type: the form type for this entity in this submission.
- sic: the four-digit SIC code with label, almost always a banking SIC such as 6022 State Commercial Banks, 6020, 6021, or a savings-institution code.

Every text document in an accession folder is an EDGAR SGML-wrapped plain-text file. The wrapper is a short, fixed header followed by the document body and a closing tag:
<DOCUMENT>
<TYPE>EX-5
<SEQUENCE>2
<FILENAME>ex5_0307.txt
<DESCRIPTION>B&T OPINION
<TEXT>
... document body ...
</TEXT>
</DOCUMENT>
The four header tags appear consistently: TYPE carries the EDGAR document type (the same value that appears in documentFormatFiles[].type), SEQUENCE is the position of the document within the submission, FILENAME is the on-disk name, and DESCRIPTION is the filer-supplied caption. The body inside <TEXT>...</TEXT> can take one of three forms:

- Plain ASCII prose.
- EDGAR SGML table markup (<TABLE>, <CAPTION>, <S>, <C>, and </TABLE>), used to produce column-aligned cover-page tables, capitalization tables, fee tables, and selected-financial-data tables in the primary registration statement.
- HTML (<html>...<body>... documents, frequently with inline <table> blocks) for filings made after the EDGAR HTML era began, still nested inside the same <DOCUMENT>...</DOCUMENT> SGML wrapper.

The primary document (the one whose type is S-4EF or S-4EF/A) carries the substantive disclosure and follows a recognizable order driven by the line items of Form S-4 under General Instruction G:
Facing page and cover. The opening block identifies the registrant (the to-be-formed holding company), state of incorporation, primary SIC code, IRS Employer Identification Number, principal executive office address, agent for service, and the registration fee table listing each title of securities to be registered, the amount, the proposed maximum offering price per unit and in aggregate, and the calculated registration fee. On /A amendments this section repeats with the same caption plus an "Amendment No. N" annotation.
Cross-reference sheet. The Form S-4 line-item-to-prospectus-page cross-reference required by Item 501-style instructions, indicating where each required disclosure is found in the prospectus.
Letter to shareholders and notice of meeting. Because the S-4EF doubles as a proxy statement, it almost always opens with a transmittal letter from the bank's chief executive and a formal notice of the special shareholder meeting at which the plan of reorganization will be voted on, including record date, meeting date, location, and quorum and vote requirements.
Prospectus / proxy statement. The substantive narrative, organized as:
Information not required in the prospectus (Part II). Indemnification of directors and officers, exhibits index, undertakings, and signatures of the registrant, principal executive officer, principal financial officer, principal accounting officer, and a majority of the board.
Exhibits index. A schedule cross-referencing each exhibit number to the document filed and to incorporation-by-reference statements where applicable.
The financial statements of the predecessor institution — typically audited balance sheets and statements of income, changes in shareholders' equity, and cash flows for the prior fiscal years required by Regulation S-X for a bank, plus interim unaudited statements where the filing date warrants — are embedded inside the prospectus body. In the legacy ASCII era they appear as <TABLE> SGML markup; in the modern era they appear as HTML <table> blocks. They are not a separate file.
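Because the statements sit inside the primary document rather than in a file of their own, the first step in reaching them is unwrapping the SGML envelope. The following is a minimal sketch, assuming the wrapper layout shown earlier (uppercase header tags, one per line, ahead of a single <TEXT> block); the function name is illustrative.

```python
import re

HEADER_TAGS = ("TYPE", "SEQUENCE", "FILENAME", "DESCRIPTION")


def parse_sgml_document(raw: str) -> dict:
    """Split one SGML-wrapped document into its header fields and body."""
    head, sep, rest = raw.partition("<TEXT>")
    if not sep:
        raise ValueError("no <TEXT> tag found")
    fields = {}
    for tag in HEADER_TAGS:
        # Header values run from the tag to the end of the line.
        match = re.search(rf"^<{tag}>(.*)$", head, flags=re.MULTILINE)
        fields[tag.lower()] = match.group(1).strip() if match else None
    # Use the last </TEXT> so markup inside the body cannot truncate it.
    end = rest.rfind("</TEXT>")
    body = rest[:end] if end != -1 else rest
    fields["body"] = body.strip("\n")
    return fields


sample = """<DOCUMENT>
<TYPE>EX-5
<SEQUENCE>2
<FILENAME>ex5_0307.txt
<DESCRIPTION>B&T OPINION
<TEXT>
... document body ...
</TEXT>
</DOCUMENT>
"""

doc = parse_sgml_document(sample)
print(doc["type"], doc["filename"], repr(doc["body"]))
```

Missing header tags come back as None rather than raising, since filers do not always supply a DESCRIPTION.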
S-4EF filings carry a small, predictable exhibit set, each shipped as its own SGML-wrapped document inside the accession folder:
- Plan of reorganization, typically filed as EX-2 or EX-2.1.
- Charter and bylaws of the holding company, typically filed as EX-3.1 and EX-3.2. These describe the authorized capital, classes of stock, board structure, and any provisions relevant to shareholder rights.
- Legality opinion (EX-5), tax opinion (EX-8), and auditor consents (EX-23).
- Additional exhibits: the form of proxy (commonly EX-99.1 with description REVOCABLE PROXY), the form of letter of transmittal for share exchange, and any press release announcing the reorganization. These appear because the S-4EF doubles as a proxy solicitation.

The exact set varies by registrant: some filings consolidate the charter and bylaws into one exhibit, some omit EX-8 where the tax discussion in the prospectus is adequate, and amendments may refile only the exhibits that have changed and incorporate the rest by reference to the original S-4EF.
Each record contains the EDGAR header (via metadata.json), the primary S-4EF or S-4EF/A registration statement document in its SGML-wrapped form, every separately filed exhibit submitted to EDGAR as a text or HTML document, and the manifest that links them together. Filer-supplied descriptions and exhibit captions are preserved exactly as submitted, allowing reconstruction of the issuer's labeling conventions. The predecessor institution's financial statements are present inside the body of the primary document.
The complete-submission .txt rollup that EDGAR concatenates from the constituent documents is referenced by URL in the manifest but is not extracted into the folder, since it duplicates content already broken out per document. Image files of any kind — logos, signature graphics, organizational diagrams, scanned documents — are excluded by dataset policy. Documents physically incorporated by reference from earlier filings (most often financial statements drawn from a predecessor's bank call reports or prior registration filings) are not re-shipped; the prospectus contains the textual incorporation-by-reference language and the substantive content lives in the referenced filing. There are no XBRL or other structured-data attachments for this form type, so dataFiles[] is always empty.
The dataset spans February 1994 to the present, and the filing format evolves across that window even though the underlying disclosure content stays largely stable.
In the legacy ASCII era (roughly 1994 through the late 1990s), the primary S-4EF and every exhibit were submitted as plain ASCII text inside the EDGAR SGML <DOCUMENT> wrapper. Tables were rendered as fixed-pitch text aligned with spaces; where EDGAR-flavored <TABLE><CAPTION><S><C> markup was used, it produced multi-column blocks intended for monospace rendering. Cover pages, fee tables, capitalization tables, and selected financial data all relied on this convention. Legal opinions, tax opinions, and proxy cards were free-form ASCII with hand-laid line breaks and underscores standing in for signature lines.
After EDGAR began accepting HTML-formatted documents in the late 1990s, S-4EF filings progressively migrated toward HTML for the primary document, with <table>, <font>, and inline-style markup replacing the ASCII column alignment. Exhibits followed at varying speed; opinions and proxies sometimes remained as plain text inside the SGML wrapper for years after the prospectus had moved to HTML. Both formats coexist within the dataset, sometimes within a single accession (for example an HTML primary document alongside ASCII opinion exhibits).
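Since ASCII, EDGAR SGML table markup, and HTML can all appear in the same accession, a parser usually needs to classify each body before choosing an extraction path. The sketch below is a heuristic under stated assumptions, not a definitive classifier: it treats an <html> root or <tr>/<td> tags as HTML, and a <TABLE> accompanied by <S>, <C>, or <CAPTION> markers as legacy SGML table markup. The function name and the three labels are illustrative.

```python
import re


def detect_body_format(body: str) -> str:
    """Classify a document body as 'html', 'sgml_table', or 'ascii'."""
    lowered = body.lower()
    # An <html> root near the top, or row/cell tags anywhere, indicates HTML.
    if re.search(r"<html[\s>]", lowered[:4000]) or re.search(r"<t[dr][\s>]", lowered):
        return "html"
    # Legacy EDGAR tables mark columns with <S>/<C> and carry no row tags.
    if "<table>" in lowered and re.search(r"<(s|c|caption)>", lowered):
        return "sgml_table"
    return "ascii"


print(detect_body_format("<TABLE>\n<CAPTION>\n<S>          <C>\nTotal assets  100\n</TABLE>"))
```

A body classified as sgml_table may still contain long runs of ASCII prose around the tables, so the label describes the markup present rather than the whole document.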
The substantive disclosure scaffolding — cover page, cross-reference sheet, summary, risk factors, plan of reorganization, description of holding-company securities, predecessor financial statements, federal tax consequences, regulatory approvals, voting procedures, Part II items, signatures, exhibit index — has remained essentially constant across the period because General Instruction G of Form S-4 narrowly defines what an S-4EF can contain. The auto-effective character of the form means there are no SEC staff comment-and-response cycles to lengthen or restructure later filings.
A few patterns matter for downstream use of these records:
- The 333- registration fileNo on entities[] is the linking key across the original and any amendments.
- Document roles should be resolved through the documentFormatFiles[].type field in metadata.json, never through filename heuristics, because filenames are filer-driven slugs with no enforced convention.
- The trailing entry in documentFormatFiles[] whose type and sequence are blank strings represents the complete-submission rollup, not a malformed row, and should be skipped by extraction tools.
- To extract financial statements, parse the primary document's <TABLE> (legacy) or <table> (HTML) blocks rather than expect a dedicated financial-statements file.
- When reading the body inside <TEXT>...</TEXT>, callers should detect whether the payload begins with ASCII prose, EDGAR SGML table markup, or an <html> root, since all three appear within this dataset and may even appear in different documents of the same filing.

The filer of record is the newly organized holding company being created by the reorganization. It is almost always a shell entity, freshly chartered under state law for the sole purpose of becoming the parent of an existing depository institution. At the time of filing it has no operating history, only nominal capital, and no assets beyond the contractual right to acquire the predecessor institution at closing.
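The predecessor whose shareholders receive the registered securities falls into one of two categories:

- A commercial bank, national bank, state-chartered bank, or trust company, which forms a bank holding company.
- A savings association, savings bank, or thrift, which forms a savings and loan holding company.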
The shareholders of the predecessor exchange their existing equity for newly issued holding company shares; those holding company shares are the securities registered on the S-4EF. Form S-4EF/A filings are pre-effective amendments to a previously filed S-4EF, submitted by the same holding company registrant.
Form S-4EF is not available to general operating companies, foreign private issuers, investment companies, insurance holding company formations, or any reorganization other than a single-institution depository holding company formation.
The filing is triggered by a defined corporate-action sequence:
In timing terms, the S-4EF is filed after both boards approve the plan, before or contemporaneously with the mailing of solicitation materials, and before closing of the share exchange. S-4EF/A amendments cluster between the original filing and the shareholder vote or closing, and are used to refresh financial statements, respond to regulatory developments, or correct disclosure.
Banking-regulatory approvals proceed in parallel: Federal Reserve approval under the Bank Holding Company Act for bank holding company formations, the appropriate federal regulator under the Home Owners' Loan Act for savings and loan holding company formations, plus any required state banking approvals. These approvals are independent of, and not displaced by, Securities Act effectiveness.
To use S-4EF, the transaction must satisfy each condition of General Instruction G of Form S-4. The principal constraints are:
If any condition fails, the registrant must use the standard Form S-4, which requires staff review and a formal effectiveness order rather than auto-effectiveness.
Form S-4EF sits at the intersection of Securities Act registration, business combination disclosure, and bank/thrift holding company formation. The comparisons below cover the filings most likely to be confused with, or treated as substitutes for, an S-4EF dataset. Each is contrasted on what actually triggers the choice between forms.
S-4 is the parent form; S-4EF is its narrow auto-effective sub-variant under General Instruction G.
S-4 datasets are the broader, heterogeneous population; S-4EF is the auto-effective tail.
F-4 is the foreign private issuer counterpart to S-4. Same transaction type (combinations and exchanges), but disclosure incorporates Form 20-F items rather than 10-K/10-Q, and the filer must qualify as an FPI. F-4 has no auto-effective formation-only variant, and bank/thrift holding company formations are overwhelmingly U.S. domestic, so F-4 and S-4EF rarely overlap in practice.
S-1 and S-4EF both register securities under the 1933 Act, but the trigger is different: S-1 is for primary cash offerings (IPOs and many secondaries), while S-4EF is for a share-for-share exchange in a holding company reorganization. S-1 requires registrant financials and an underwriting framework; S-4EF carries predecessor-institution financials and no underwriting because nothing is sold for cash. S-1 is staff-reviewed; S-4EF is auto-effective.
Form S-8 shares the auto-effective-on-filing mechanic but covers employee benefit plan securities. No business combination content, no predecessor financials, no holding company context. Useful only as a structural analogue for the effectiveness mechanism, not a content substitute.
A holding company formation usually requires shareholder approval, so an S-4EF transaction often has a parallel proxy record.
Complement, not substitute.
Form 425 carries Rule 425 communications (press releases, investor decks, deal letters) related to business combinations. It is not a registration statement: it has no prospectus, no fee, and no effectiveness mechanic. 425 filings may accompany an S-4EF, but the S-4EF is the operative registration document. 425 is event- and communications-driven; S-4EF is the formal record.
Sequential, not substitutive. After (or alongside) S-4EF, the new holding company typically files an 8-A to register its common stock under Section 12 of the Exchange Act and assume the predecessor's reporting obligations (often with Rule 12g-3 succession).
A full lifecycle reconstruction needs both.
S-4EF is a one-time pre-formation Securities Act filing; Form 10-K is the recurring post-formation Exchange Act periodic disclosure of the new holding company. Different statute, different cadence, different content. Useful for tracing an issuer's filing history but not an alternative view of the same disclosure event.
S-4EF is uniquely defined by four reinforcing constraints that no neighboring dataset satisfies simultaneously:
Adjacent datasets serve as complements: 14A for shareholder solicitation, 425 for transaction communications, 8-A for post-formation Exchange Act registration, 10-K for ongoing reporting. None substitutes for S-4EF when the research question concerns the registered formation transaction itself.
The dataset's narrow scope and consistent structure make it useful to a small set of specialized professionals working on bank and thrift holding company formations.
Counsel structuring community bank and thrift reorganizations use the corpus as a precedent library. They focus on the plan of reorganization exhibit, the share exchange mechanics in the prospectus, and form legal opinions to draft fractional-share provisions, dissenters' rights language under state banking codes, and conditions precedent tied to Federal Reserve and state regulator approvals. The records also guide how concurrent ESOP, stock option, and director plan adjustments are folded into the reorganization.
Drafters of S-4EF prospectuses compare risk factor catalogs, federal tax-consequence sections, and shareholder rights comparison tables across peer filings. They focus on the cover page, Q&A summary, and description of permitted holding company activities to confirm drafting choices stay within General Instruction G eligibility and avoid charter changes that would force conversion to a regular Form S-4.
Supervisory staff cross-reference S-4EF filings against parallel applications under the Bank Holding Company Act, the Home Owners' Loan Act, and state holding company statutes. They focus on cover page entity identifiers (predecessor name, charter type, state, EIN), the post-reorganization structure, and disclosed regulatory approval conditions to reconcile EDGAR disclosure against supervisory case files.
Examiners assigned to a newly formed parent build an initial supervisory profile from the predecessor's audited balance sheets, regulatory capital tables, loan portfolio descriptions, and disclosed concentration and interest-rate risks. Because the holding company has no operating history of its own at formation, the prospectus financials are the cleanest contemporaneous public record for scoping the first post-formation exam cycle.
Researchers studying community bank consolidation, mutual-to-stock conversions, and the diffusion of the holding company form treat the corpus as a longitudinal panel. They use filing dates, predecessor charter types, geographic data from the cover page, and stated rationales (tax flexibility, capital raising, expansion authority) for event studies, post-formation panel analyses, and qualitative coding of motivations.
Chroniclers of regional banking and the post-cleanup thrift industry use prospectus narratives as primary sources for predecessor history, market area, and board composition. The dataset preserves disclosure-quality records of small and rural institutions that left little other public trace.
Investors holding shares in a reorganizing target read the predecessor financial statements, pro forma capitalization, dividend history, and disclosed plans for repurchases, ESOPs, or future capital actions. The prospectus is the definitive source for how the reorganization changes the legal entity held, the dividend pathway, and the menu of available corporate actions.
Analysts covering structural choices among small depositories use cover page data, predecessor asset size, and disclosures about anticipated non-banking activities to size the market for vendors selling to holding companies versus standalone banks and to study how charter form interacts with technology adoption.
Onboarding and counterparty diligence teams use S-4EF and S-4EF/A filings to establish entity lineage from the predecessor depository institution to the current registrant. They focus on the cover page, plan of reorganization, legal opinion on valid share issuance, and disclosures of officers, directors, and significant shareholders at formation.
Vendors building clause extraction and retrieval systems for banking transactional work use the corpus as a tightly scoped training and evaluation set. The uniform transaction type supports supervised extraction of plan-of-reorganization clauses, conditions precedent, regulatory approval language, tax opinion structures, and shareholder rights comparisons, and supports benchmarks that distinguish holding company formation language from other Securities Act registration contexts.
The dataset's narrow eligibility scope and consistent structure support a small set of focused workflows.
Bank transactional counsel pull every EX-2 exhibit across the corpus to compare share-exchange mechanics, fractional-share treatment, conditions precedent tied to Federal Reserve and state regulator approvals, and termination provisions. Filtering documentFormatFiles[] on type equal to EX-2 yields the operative agreements directly; the resulting clause library feeds first drafts of new reorganization agreements and redline checks against state-specific dissenters' rights provisions.
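The EX-2 filter described above might look like the following when run against one monthly container. This is a sketch under assumptions: the container follows the year-month/accession/file layout described earlier, types such as EX-2.1 are treated as belonging to the EX-2 family, and the demo record (including the ex2_0307.txt file name and the placeholder URLs) is invented for illustration.

```python
import io
import json
import zipfile
from pathlib import PurePosixPath
from urllib.parse import urlparse


def iter_exhibits(container, type_prefix="EX-2"):
    """Yield (accessionNo, filename, text) for matching exhibits in one container."""
    with zipfile.ZipFile(container) as zf:
        names = set(zf.namelist())
        for member in sorted(names):
            if PurePosixPath(member).name != "metadata.json":
                continue
            folder = PurePosixPath(member).parent
            meta = json.loads(zf.read(member))
            for entry in meta.get("documentFormatFiles", []):
                doc_type = entry.get("type", "")
                # Accept the exact type or a dotted sub-number such as EX-2.1.
                if doc_type != type_prefix and not doc_type.startswith(type_prefix + "."):
                    continue
                filename = PurePosixPath(urlparse(entry["documentUrl"]).path).name
                path = str(folder / filename)
                if path in names:
                    text = zf.read(path).decode("utf-8", errors="replace")
                    yield meta.get("accessionNo"), filename, text


# Demo with an in-memory container holding one made-up record.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    folder = "2005-03/000090883405000201"
    zf.writestr(f"{folder}/metadata.json", json.dumps({
        "accessionNo": "0000908834-05-000201",
        "documentFormatFiles": [
            {"type": "EX-2", "documentUrl": "https://example.invalid/a/ex2_0307.txt"},
            {"type": "EX-5", "documentUrl": "https://example.invalid/a/ex5_0307.txt"},
            {"type": "", "documentUrl": "https://example.invalid/a/0000908834-05-000201.txt"},
        ],
    }))
    zf.writestr(f"{folder}/ex2_0307.txt", "PLAN OF REORGANIZATION")
    zf.writestr(f"{folder}/ex5_0307.txt", "OPINION")
buf.seek(0)

hits = list(iter_exhibits(buf))
print(hits)
```

Passing a file path instead of the in-memory buffer works the same way, since zipfile.ZipFile accepts either.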
Securities disclosure counsel extract the risk factors section from the primary S-4EF document and the EX-8 tax opinion to compare phrasing on double-leverage, dividend-flow constraints, regulatory-supervision differences, and Section 368(a) tax-free reorganization characterization. Side-by-side comparisons across registrants of similar SIC code (6020, 6021, 6022, savings-institution codes) test whether new draft language stays within General Instruction G eligibility and matches the conventions courts and regulators expect.
Diligence teams trace a current bank holding company back to its predecessor depository institution using the entities[] block (CIK, EIN, SIC, 333- file number) joined to the prospectus narrative on the share exchange and the EX-5 legality opinion. Linking the S-4EF accession to the subsequent 8-A12B/8-A12G and ongoing 10-K filings produces a clean lineage record for onboarding files and beneficial-ownership reviews.
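One way to assemble the S-4EF side of that lineage is to group accessions by the 333- file number carried on entities[]. The sketch below assumes each record is a loaded metadata.json dict and that filedAt carries a numeric UTC offset, as described earlier; the records, CIK, and file number in the demo are made up.

```python
from collections import defaultdict
from datetime import datetime


def group_by_file_number(records):
    """Map each fileNo to its filings, ordered from earliest to latest."""
    groups = defaultdict(list)
    for meta in records:
        # A submission can list several entities; collect distinct file numbers.
        file_numbers = {e["fileNo"] for e in meta.get("entities", []) if e.get("fileNo")}
        for file_no in file_numbers:
            groups[file_no].append(meta)
    for filings in groups.values():
        # Parse the timestamp so differing UTC offsets still sort correctly.
        filings.sort(key=lambda m: datetime.fromisoformat(m["filedAt"]))
    return dict(groups)


# Made-up records for illustration only.
records = [
    {"formType": "S-4EF/A", "accessionNo": "0000000000-05-000002",
     "filedAt": "2005-03-07T10:00:00-05:00",
     "entities": [{"fileNo": "333-000001", "cik": "0000000001"}]},
    {"formType": "S-4EF", "accessionNo": "0000000000-05-000001",
     "filedAt": "2005-01-10T10:00:00-05:00",
     "entities": [{"fileNo": "333-000001", "cik": "0000000001"}]},
]

chains = group_by_file_number(records)
print([m["formType"] for m in chains["333-000001"]])
```

The CIK from the same entities[] entry is the natural key for joining each chain to later 8-A and 10-K filings, which fall outside this dataset.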
Researchers build a panel keyed on filedAt (1994 to present), predecessor SIC, and state of incorporation drawn from the cover page, then code stated formation rationales (tax flexibility, expansion authority, capital raising) extracted from the prospectus summary. The panel supports event studies around holding-company formation and tracks how mutual-to-stock thrift conversions and community-bank consolidation moved through specific regions and decades.
Legal-tech vendors use the uniform transaction type as a tightly scoped supervised dataset. Documents grouped by type (EX-2, EX-3.1, EX-5, EX-8, EX-23, EX-99.1) provide labeled examples for clause extractors targeting plan-of-reorganization mechanics, charter provisions, legality opinions, tax opinions, auditor consents, and revocable proxies. The dataset also supports format-detection benchmarks because primary documents and exhibits span ASCII, EDGAR SGML <TABLE> markup, and HTML within a single corpus.
Bank examiners assigned to a newly formed parent extract the predecessor's audited balance sheets, regulatory capital tables, loan portfolio descriptions, and concentration and interest-rate risk disclosures from the embedded <TABLE> (legacy) or <table> (HTML) blocks inside the primary S-4EF document. Because the holding company has no operating history at formation, this content scopes the first post-formation examination cycle and seeds peer-group financial comparisons.
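Locating the embedded table blocks can be sketched with a single case-insensitive pattern that covers both the legacy <TABLE> and the HTML <table> form. This is a rough first pass under an assumption that tables are not nested; nested HTML tables would be cut at the first closing tag, so a real HTML parser is a better fit for those documents. Identifying which blocks are financial statements is left to downstream logic.

```python
import re

# Non-greedy match from an opening table tag to the nearest closing tag.
TABLE_RE = re.compile(r"<table\b[^>]*>.*?</table>", re.IGNORECASE | re.DOTALL)


def extract_table_blocks(body: str) -> list:
    """Return every <TABLE>...</TABLE> or <table>...</table> block in order."""
    return TABLE_RE.findall(body)


body = (
    "Selected data follow.\n"
    "<TABLE>\n<CAPTION>\n<S>           <C>\nTotal assets  100\n</TABLE>\n"
    "More prose.\n"
    '<table border="0"><tr><td>Total loans</td><td>60</td></tr></table>\n'
)

blocks = extract_table_blocks(body)
print(len(blocks))
```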
The Form S-4EF Files Dataset is distributed through three complementary endpoints: a JSON index for dataset and container metadata, a single archive download for the entire dataset, and per-container ZIP downloads for incremental access. Filings cover form types S-4EF and S-4EF/A starting from 1994-02-01, and each container is delivered in ZIP format containing TXT, JSON, and HTML file types.
Dataset Index JSON API: https://api.sec-api.io/datasets/form-s4ef-files.json
This endpoint returns dataset-level metadata (name, description, last update timestamp, earliest sample date, total record count, total size, form types covered, container format, and file types), the download URL for the full dataset archive, and the list of all available container files with per-container size, record count, last update timestamp, and download URL. Poll this endpoint to detect which containers were refreshed in the latest run and to selectively download only the changed containers. This endpoint does not require an API key.
Example response:
{
  "datasetId": "1f13365b-9ae0-6a24-a736-c0f8f5b6e76e",
  "datasetDownloadUrl": "https://api.sec-api.io/datasets/form-s4ef-files.zip",
  "name": "Form S-4EF Files Dataset",
  "updatedAt": "2026-04-16T08:30:27.681Z",
  "earliestSampleDate": "1994-02-01",
  "totalRecords": 628,
  "totalSize": 12767660,
  "formTypes": ["S-4EF", "S-4EF/A"],
  "containerFormat": "ZIP",
  "fileTypes": ["TXT", "JSON", "HTML"],
  "containers": [
    {
      "downloadUrl": "https://api.sec-api.io/datasets/form-s4ef-files/2026/2026-03.zip",
      "key": "2026/2026-03.zip",
      "size": 13818783,
      "records": 154,
      "updatedAt": "2026-04-16T08:30:27.681Z"
    }
  ]
}
Fetch the index with curl:
curl https://api.sec-api.io/datasets/form-s4ef-files.json
Download Entire Dataset: https://api.sec-api.io/datasets/form-s4ef-files.zip?token=YOUR_API_KEY
Use this URL to download the full dataset as a single ZIP archive containing all containers. An SEC API key is required and must be passed via the token query parameter.
curl -o form-s4ef-files.zip \
  "https://api.sec-api.io/datasets/form-s4ef-files.zip?token=YOUR_API_KEY"
Download Single Container: https://api.sec-api.io/datasets/form-s4ef-files/2026/2026-03.zip?token=YOUR_API_KEY
Use the downloadUrl from any container in the index response to fetch a single monthly archive instead of the full dataset. This is useful for incremental updates after consulting the index for changed containers. An SEC API key is required.
Python example that reads the index and downloads only recently updated containers:
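import requests
from datetime import datetime, timedelta, timezone

API_KEY = "YOUR_API_KEY"
# "Recently updated" is taken here to mean the last 7 days; adjust as needed.
cutoff = datetime.now(timezone.utc) - timedelta(days=7)

index = requests.get("https://api.sec-api.io/datasets/form-s4ef-files.json").json()

for container in index["containers"]:
    updated = datetime.fromisoformat(container["updatedAt"].replace("Z", "+00:00"))
    if updated < cutoff:
        continue  # unchanged since the cutoff, skip the download
    url = f'{container["downloadUrl"]}?token={API_KEY}'
    with requests.get(url, stream=True) as r:
        r.raise_for_status()
        # Flatten the key (e.g. 2026/2026-03.zip) into a local file name.
        with open(container["key"].replace("/", "_"), "wb") as f:
            for chunk in r.iter_content(chunk_size=1 << 20):
                f.write(chunk)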
JavaScript equivalent using fetch:
const API_KEY = "YOUR_API_KEY";
const index = await fetch(
  "https://api.sec-api.io/datasets/form-s4ef-files.json"
).then((r) => r.json());

for (const c of index.containers) {
  const res = await fetch(`${c.downloadUrl}?token=${API_KEY}`);
  // pipe res.body to disk or process in memory
}
The dataset covers Form S-4EF and Form S-4EF/A. S-4EF is the auto-effective Securities Act registration statement used to register securities issued in the formation of a bank or savings and loan holding company under General Instruction G of Form S-4. S-4EF/A is the pre-effective amendment to a previously filed S-4EF, submitted by the same registrant against the same 333- registration file number.
One record is one complete EDGAR submission, identified by its 18-digit accession number and stored as one accession-level folder. Each folder contains a metadata.json manifest plus every text-based document from the original submission — the primary registration statement and each separately filed exhibit — in their EDGAR SGML-wrapped form.
The filer of record is the newly organized shell holding company being created by the reorganization, chartered under state law solely to become the parent of an existing depository institution. The predecessor whose shareholders receive the registered shares is either a commercial bank, national bank, state-chartered bank, or trust company (forming a bank holding company), or a savings association, savings bank, or thrift (forming a savings and loan holding company). General operating companies, foreign private issuers, investment companies, and insurance holding companies cannot use S-4EF.
The dataset includes EDGAR submissions of S-4EF and S-4EF/A from February 1, 1994 to present. Records are added on an ongoing basis as new filings are accepted by EDGAR.
The dataset is distributed as monthly ZIP containers following the path convention <dataset>/<YYYY>/<YYYY-MM>.zip. File types inside each container are TXT (EDGAR SGML-wrapped documents), JSON (the per-record metadata.json manifest), and HTML (modern-era document bodies). Image files referenced by prospectuses and the rolled-up complete-submission .txt are excluded by design.
Standard Form S-4 covers the broad universe of business combinations and exchange offers, requires SEC staff review, and typically includes negotiation history, fairness opinions, and projections for two operating entities. Form S-4EF is a narrow auto-effective sub-variant restricted by General Instruction G to single-institution bank or thrift holding company formations with no other proposals attached. S-4EF becomes effective automatically on filing and the disclosure is short, formulaic, and centered on a single predecessor depository institution.
The predecessor institution's financial statements — audited balance sheets and statements of income, changes in shareholders' equity, and cash flows, plus interim unaudited statements where applicable — are embedded inside the body of the primary S-4EF or S-4EF/A document, not as a separate exhibit. In legacy ASCII-era filings they appear as EDGAR <TABLE> SGML markup; in later HTML-era filings they appear as inline <table> blocks. There are no XBRL or other structured-data attachments, so dataFiles[] is always empty.