The Form DRS Files Dataset is a collection of confidential draft registration statements submitted to SEC EDGAR under form type DRS (an original draft) or DRS/A (an amended draft), the channel through which issuers transmit a registration statement for nonpublic SEC staff review before any public filing exists. A single record is one complete EDGAR submission identified by its accession number, packaged as an accession-numbered folder holding a metadata.json descriptor plus the full set of non-image documents — the draft registration statement and its exhibits — that made up the original confidential submission. The filer is the issuer planning a registered securities offering, most often a pre-IPO operating company, transmitting its draft through securities counsel and a filing agent under Securities Act authority. In substance each draft is a registration statement — most commonly a draft Form S-1, but also drafts of Form F-1, Form S-11, Form 10, and similar forms — distinguished from a public filing only by its confidential-submission posture and cover legend. Dataset coverage begins in October 2012, immediately following the JOBS Act's creation of the confidential draft submission process, and runs to the present.
The Form DRS Files Dataset packages confidential draft registration statements filed on SEC EDGAR as form type DRS or DRS/A. Form DRS is the vehicle through which an issuer submits a registration statement to the SEC confidentially, for nonpublic staff review, before any public filing. The mechanism originates in Section 6(e) of the Securities Act of 1933 as amended by the Jumpstart Our Business Startups (JOBS) Act of 2012, which initially permitted only emerging growth companies to submit registration statements as confidential drafts. The Division of Corporation Finance subsequently extended the accommodation to substantially all issuers, so the population of DRS submissions spans IPO candidates and other registrants pursuing confidential pre-effective review.
A DRS submission is, in substance, a registration statement — most commonly a draft Form S-1, but also drafts of Form F-1, Form S-11, Form 10, and similar registration forms — wrapped in the DRS submission type so that it is processed nonpublicly rather than entered into the public filing stream. The draft typically carries the full prospectus, the issuer's audited and interim financial statements, risk factors, use-of-proceeds disclosure, and management and capitalization narratives, together with the exhibits the staff needs to review. A DRS/A is a successive draft of a previously submitted registration statement, reflecting the issuer's responses to staff comments or its own revisions; structurally it mirrors the original DRS, often including a marked or clean revised prospectus and updated exhibits.
A defining characteristic of the primary document is its confidential-submission cover language. The draft registration statement carries a legend on or near its cover page stating that the draft has not been publicly filed and is being submitted confidentially for SEC review under the relevant Securities Act provision (typically phrased "This draft registration statement has not been filed publicly..."). This legend distinguishes a DRS primary document from an ordinary publicly filed registration statement even though the body content is otherwise identical in form. Dataset coverage begins in October 2012; the form has been HTML-based throughout that window, and the dataset is distributed as monthly ZIP containers holding HTML, JSON, TXT, and PDF content.
A single record in the Form DRS Files Dataset is one complete EDGAR submission of a draft registration statement, identified by its accession number. Each record corresponds to either a DRS filing (an original confidential draft registration statement) or a DRS/A filing (an amended draft). In the packaged dataset a record materializes as one accession-numbered folder containing a metadata.json descriptor plus the full set of non-image documents that made up the original EDGAR submission. The folder name is the 18-digit accession number with dashes removed (for example, accession 0001628279-25-000707 becomes folder 000162827925000707), while the metadata retains the canonical dashed form. The unit of the record is therefore the submission, not the individual document: a record bundles the primary draft registration statement together with every exhibit and the structured header that EDGAR captured at the moment of confidential submission.
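The folder-naming rule described above is a plain dash removal, so it can be reversed mechanically. A minimal sketch, assuming the standard 10-2-6 digit accession layout; the helper names `accession_to_folder` and `folder_to_accession` are illustrative and not part of the dataset:

```python
import re


def accession_to_folder(accession_no: str) -> str:
    """Turn a dashed accession number into the dash-free folder name."""
    if not re.fullmatch(r"\d{10}-\d{2}-\d{6}", accession_no):
        raise ValueError(f"unexpected accession format: {accession_no!r}")
    return accession_no.replace("-", "")


def folder_to_accession(folder: str) -> str:
    """Rebuild the canonical dashed form from an 18-digit folder name."""
    if not re.fullmatch(r"\d{18}", folder):
        raise ValueError(f"unexpected folder name: {folder!r}")
    return f"{folder[:10]}-{folder[10:12]}-{folder[12:]}"


print(accession_to_folder("0001628279-25-000707"))
print(folder_to_accession("000162827925000707"))
```

The round trip is useful when joining on-disk folders back to the accessionNo value stored in metadata.json.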
Each record is organized in three layers:
- metadata.json, present exactly once per accession folder, mirroring the EDGAR submission header and document index.
- filename1.htm, holding the draft registration statement itself.
- filename2.htm onward, each a distinct exhibit carried within the same submission.

All document files are renamed sequentially and contiguously from filename1.htm, regardless of their original EDGAR filenames, so the local numbering is a clean ordinal sequence even where the original submission interleaved images among the documents.
The documents are not raw HTML on disk. Each .htm file is enclosed in the EDGAR SGML document wrapper — the same <DOCUMENT> envelope EDGAR uses inside the complete submission text file. The wrapper preserves the original document-level metadata as tagged header lines preceding the HTML payload:
<DOCUMENT>
<TYPE>DRS
<SEQUENCE>1
<FILENAME>filename1.htm
<TEXT>
<html> ... full HTML body of the draft registration statement ... </html>
</TEXT>
</DOCUMENT>
The <TYPE> line carries the EDGAR document type, the <SEQUENCE> line carries the original EDGAR sequence number, and the <FILENAME> line carries the renamed local filename. The actual HTML body sits inside the <TEXT> block. For the primary document, <TYPE> matches the submission form type (DRS or DRS/A). For exhibits, <TYPE> carries the exhibit designation such as EX-10.1 or EX-10.2.
One nuance follows directly from the wrapper: the <SEQUENCE> values preserve the original EDGAR ordering and can therefore be non-contiguous within a folder, because image documents that occupied intervening sequence numbers in the original submission have been removed. The local filenameN.htm numbering, by contrast, is always contiguous from 1. To reconcile a packaged document back to its place in the original submission, the <SEQUENCE> value inside the wrapper — not the local filename ordinal — is the authoritative cross-reference, and the full original ordering can be reconstructed from the documentFormatFiles array in the metadata.
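Reading a packaged document therefore means splitting the tagged header lines from the HTML payload. The following is a minimal sketch that assumes each file follows the wrapper layout shown above, with one tag per header line and a single <TEXT> block; `WrappedDocument` and `parse_wrapped_document` are hypothetical names introduced here:

```python
import re
from typing import NamedTuple, Optional


class WrappedDocument(NamedTuple):
    doc_type: str            # value of the <TYPE> line, e.g. "DRS" or "EX-10.1"
    sequence: Optional[int]  # original EDGAR sequence number, if present
    filename: str            # renamed local filename from <FILENAME>
    html: str                # payload found inside the <TEXT> block


_HEADER_LINE = re.compile(r"^<(TYPE|SEQUENCE|FILENAME)>(.*)$", re.MULTILINE)


def parse_wrapped_document(raw: str) -> WrappedDocument:
    """Split one SGML-wrapped file into header fields and HTML payload."""
    head, marker, rest = raw.partition("<TEXT>")
    if not marker:
        raise ValueError("no <TEXT> block found")
    fields = {name: value.strip() for name, value in _HEADER_LINE.findall(head)}
    # Keep everything before the last </TEXT>; this also drops </DOCUMENT>.
    body = rest.rsplit("</TEXT>", 1)[0].strip()
    seq = fields.get("SEQUENCE", "")
    return WrappedDocument(
        doc_type=fields.get("TYPE", ""),
        sequence=int(seq) if seq.isdigit() else None,
        filename=fields.get("FILENAME", ""),
        html=body,
    )
```

The returned `sequence` value, not the ordinal in the local filename, is what should be matched against the sequence entries in documentFormatFiles.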
filename1.htm is the draft registration statement and the substantive core of the record. Internally it follows the structure of whatever registration form is being drafted. For the common S-1-style draft this includes the prospectus cover and confidential-submission legend, the prospectus summary, risk factors, use of proceeds, dividend policy, capitalization and dilution tables, management's discussion and analysis, business description, management and executive compensation disclosure, principal and selling stockholder tables, description of capital stock, and the financial statements with accompanying notes. The primary document is frequently the largest file in the folder — sometimes several megabytes — because it embeds the full prospectus and financial statements as a single HTML document. The HTML payload commonly bears generator artifacts from filing-preparation platforms (for example Workiva/Wdesk comment markers), which are incidental to the disclosure content.
The exhibit documents, filename2.htm onward, are the registration statement's exhibits submitted for staff review. They are typed by their EDGAR exhibit codes and span the usual registration-statement exhibit families: underwriting agreements, charter and bylaw documents, opinions of counsel, material contracts in the EX-10 series, subsidiary lists, consents, and similar items. The number of exhibit documents varies widely across records — many filings carry only the primary document with no separate exhibit files, while others carry a dozen or more, and the largest submissions extend well beyond two dozen documents. Each exhibit is independently SGML-wrapped and independently typed, so the exhibit composition of a record can be read directly from the <TYPE> lines of its documents or from the metadata index.
metadata.json is the structured spine of each record and reproduces the EDGAR submission header together with a complete index of the original documents.
Its top-level fields identify the submission and locate its source:
- formType: DRS or DRS/A.
- accessionNo: the dashed accession number (e.g. 0001628279-25-000707).
- description: the human-readable form description such as Form DRS - Draft Registration Statement.
- filedAt: the filing timestamp with timezone offset (e.g. 2025-10-24T20:56:26-04:00).
- linkToFilingDetails, linkToTxt, linkToHtml: links back to the source on EDGAR (the primary document, the complete submission text file, and the filing-index page, respectively).
- linkToXbrl, dataFiles, seriesAndClassesContractsInformation: generally empty for these filings.
- id: an internal record identifier.

Two array fields carry the richest structure. documentFormatFiles enumerates every document in the original EDGAR submission, with each entry exposing sequence (the EDGAR sequence number), size (byte size as a string), documentUrl (the original sec.gov location), type (the EDGAR document type), and, on some entries, a description. This index is a faithful record of the original submission and therefore lists documents that are not materialized on disk: GRAPHIC (image) entries appear in the array even though their files are deliberately excluded from the packaged record, and the final entry is always the complete submission text file (<accession>.txt), referenced by URL (with a blank sequence and type) rather than packaged as a separate filenameN document. The array is thus the authoritative bridge between the trimmed on-disk file set and the full original submission inventory.
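To see which index entries correspond to files on disk, the documentFormatFiles array can be split into packaged and excluded entries. This is a rough sketch under two assumptions drawn from the description above: image entries carry the type GRAPHIC, and the complete submission text file has both a blank sequence and a blank type. The function name `split_document_index` is hypothetical:

```python
def split_document_index(metadata: dict) -> tuple:
    """Return (packaged, excluded) lists of documentFormatFiles entries."""
    packaged, excluded = [], []
    for entry in metadata.get("documentFormatFiles", []):
        doc_type = str(entry.get("type") or "").strip()
        sequence = str(entry.get("sequence") or "").strip()
        is_image = doc_type.upper() == "GRAPHIC"
        # Assumption: only the complete submission text file has both blank.
        is_full_text = not doc_type and not sequence
        if is_image or is_full_text:
            excluded.append(entry)
        else:
            packaged.append(entry)
    return packaged, excluded
```

The packaged list keeps the original EDGAR order, so its sequence values can be compared against the <SEQUENCE> lines inside the wrapped documents.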
The entities array carries filer and subject-company information parsed from the EDGAR header. Each entry can include companyName (with a role suffix such as (Filer)), cik, the entity-level type, act (the securities act, typically 33), fileNo (the SEC file number, which for confidential pre-IPO drafts commonly falls in the 377- range), filmNo, irsNo, stateOfIncorporation, fiscalYearEnd as an MMDD value, sic with its descriptive label, and tickers as an array. These fields supply the issuer-identification context that the draft registration statement itself does not present in structured form.
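A few of these fields need light normalization before they are useful as keys. The sketch below is illustrative only: it assumes the role suffix is the final parenthetical in companyName, that fiscalYearEnd is a four-digit MMDD string, and that draft file numbers start with 377-. The name `summarize_entity` and the sample values in the usage example are hypothetical:

```python
import re

# Greedy match so only the LAST parenthetical is treated as the role suffix.
_ROLE_SUFFIX = re.compile(r"^(.*\S)\s*\(([^()]+)\)\s*$")


def summarize_entity(entity: dict) -> dict:
    """Normalize one entry of the entities array into plain fields."""
    raw_name = entity.get("companyName") or ""
    match = _ROLE_SUFFIX.match(raw_name)
    # Caveat: a name ending in a parenthetical with no role would be misread.
    name, role = (match.group(1), match.group(2)) if match else (raw_name, None)

    fye = entity.get("fiscalYearEnd") or ""
    fiscal_year_end = (int(fye[:2]), int(fye[2:])) if re.fullmatch(r"\d{4}", fye) else None

    file_no = entity.get("fileNo") or ""
    return {
        "name": name,
        "role": role,
        "cik": entity.get("cik"),
        "file_no": file_no or None,
        "fiscal_year_end": fiscal_year_end,  # (month, day) or None
        "is_draft_series": file_no.startswith("377-"),
    }


example = {"companyName": "Example Holdings Inc. (Filer)", "cik": "1234567",
           "fileNo": "377-01234", "fiscalYearEnd": "1231"}
print(summarize_entity(example))
```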
A record includes the structured metadata.json descriptor and every non-image document from the original submission, each preserved with its EDGAR SGML wrapper and original document-type and sequence metadata. This always covers the primary draft registration statement and, where present, the full set of exhibit documents.
Two categories of original content are intentionally excluded from the materialized files:
- Image files (GRAPHIC documents) are not packaged in the record, although they remain enumerated in the documentFormatFiles index for completeness.
- The complete submission text file (<accession>.txt) is not packaged as a separate document; it remains reachable through the linkToTxt URL in the metadata.

Everything substantive to the disclosure (the prospectus narrative, financial statements, risk factors, and textual or tabular exhibits) is retained, since these are carried as HTML rather than as images.
Dataset coverage begins in October 2012, immediately following the JOBS Act's creation of the confidential draft submission process, and the form has been HTML-based throughout that window. Because DRS submissions did not exist before the JOBS Act, the record set does not reach back into the earlier ASCII/plain-text EDGAR era; from the outset, draft registration statements were filed as HTML documents wrapped in the SGML document envelope, which is why every materialized document in the dataset is HTML inside an SGML wrapper rather than legacy plain text.
The most material change over the dataset's span is in the eligible filer population rather than in the document format. At inception the DRS accommodation was confined to emerging growth companies, so early records overwhelmingly represent EGC IPO candidates. The later expansion of confidential review to all issuers broadened the population of registrants and, correspondingly, the mix of underlying registration forms drafted (beyond S-1 to F-1, S-11, Form 10, and others). The internal anatomy of a record — confidential legend, prospectus, financial statements, exhibits, SGML-wrapped HTML — is stable across this expansion; what changes is the diversity of issuers and registration types feeding into it.
Several nuances matter for accurate use of a record:
- A DRS/A record is an amendment in the sense of a successive draft. Successive drafts of the same registration share lineage through their common SEC file number and CIK rather than through a single accession, so the relationship between an original DRS and its amendments must be reconstructed from the entity-level fileNo and cik, not inferred from the accession number alone.
- The local filenameN.htm ordering is a contiguous re-numbering and does not equal the original EDGAR sequence. The <SEQUENCE> header inside each wrapper and the documentFormatFiles array preserve the original ordering and are the correct references when mapping documents back to the source submission.
- Code that reads the .htm files must first strip the SGML <DOCUMENT>/<TEXT> wrapper before treating the payload as HTML.

Each record is a confidential submission to SEC EDGAR carrying form type DRS or DRS/A. A DRS is an issuer's draft registration statement, transmitted for nonpublic SEC staff review before any public filing exists. A DRS/A is an amended draft of a prior submission, resubmitted after the issuer revises the document in response to staff comments or its own updates. A single contemplated offering typically generates one DRS followed by several DRS/A amendments over the review period.
The filer is the issuer planning a registered securities offering that elects confidential staff review of its draft before going public — most often a pre-IPO operating company preparing a first public equity offering. The record is transmitted by the issuer's officers and securities counsel through a filing agent. Underwriters, selling shareholders, and exchanges appear inside the document but do not file it; the reporting party is always the registrant whose securities are to be registered.
The population spans domestic operating companies, foreign private issuers (submitting draft F-1 or 20-F equivalents), and various issuer structures (corporations, holding companies). EDGAR metadata records these under Securities Act ("act": "33") authority and assigns a file number in the 377- series, reserved for draft/confidential submissions, distinct from the 333- series used for public Securities Act registrations.
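Eligibility has two phases:
- October 2012 to mid-2017: only emerging growth companies could submit confidential drafts.
- Since July 2017: the Division of Corporation Finance has extended nonpublic review to all issuers.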
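A DRS is offering-driven and elective: it is not periodic, and it is not forced by any external event or deadline. A record arises when:
- an issuer submits a draft registration statement for nonpublic staff review for the first time, producing a DRS;
- the issuer revises a previously submitted draft, in response to staff comments or its own updates, and resubmits it, producing a DRS/A.

There is no calendar cadence. The initial DRS can be filed months before any intended pricing; DRS/A amendments follow the irregular rhythm of the staff comment cycle, not statutory deadlines.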
The confidential phase ends when the issuer either converts to public filing or abandons the offering. On conversion, the issuer publicly files the registration statement and all prior drafts become public, commonly at least 15 days before the roadshow (or before the requested effective date if there is no roadshow). The public-filing condition is what eventually moves these contents into the public record; an abandoned registration may leave the draft as the only surviving evidence of the transaction.
Records begin in October 2012, when the mechanism first became available. There is no pre-EDGAR analog: confidential draft submission was created in the EDGAR era as an electronic nonpublic channel, so no earlier records exist.
- Public Securities Act registrations carry a 333- or other public file-number series. A DRS is the confidential 377- series counterpart. An issuer that used confidential review will typically have both DRS submissions and a later public registration for the same offering; one that did not will have no DRS record.
- DRS is the first confidential draft for a registration; DRS/A is any amended draft of it. Amendment volume reflects the depth of staff review, not a defect.

Form DRS holds a precise spot in the registration lifecycle: a complete registration statement in substance, but submitted confidentially and kept out of the public EDGAR stream until the issuer chooses to go public. That dual nature places it between two families of comparison targets: the public registration statements a DRS eventually becomes (S-1, F-1, and their amendments), and the confidential or pre-public submission mechanisms that share the "not yet visible to the market" attribute but apply to different document types.
Form S-1 / S-1/A (public domestic registration). A DRS is the same instrument as an S-1: a Securities Act of 1933 registration statement carrying a prospectus, risk factors, use-of-proceeds, financials, and exhibits (the samples show full prospectuses with EX-3, EX-4, EX-10 attachments). The difference is posture and timing, not content. A DRS is submitted confidentially for nonpublic staff review; an S-1 is the public filing that triggers prospectus delivery and the market-disclosure regime. Issuers typically run one or more DRS/DRS/A rounds, then publicly file an S-1 incorporating staff comments and refile the prior drafts (commonly at least 15 days before a roadshow). The two file-number series make the split visible: DRS carries a 377- draft number, the public registration a 333-.
Form F-1 / F-1/A (foreign private issuer registration). The same draft-to-public relationship governs FPIs, which go public on F-1 rather than S-1. Critically, the DRS form type does not encode whether the underlying registration is domestic or foreign, so the DRS dataset mixes both populations under one label; only the issuer's incorporation and SIC metadata separate them. An FPI-only pipeline cannot be built by form type alone.
Form DRS/A vs. S-1/A and F-1/A (amendments). A DRS/A is a revised confidential draft, almost always responding to a staff comment round before any public filing exists. An S-1/A or F-1/A amends an already-public statement, advancing it toward effectiveness, often near pricing. The DRS dataset (DRS plus DRS/A) therefore captures the confidential iteration history — how many draft rounds occurred and how disclosure evolved before going public — which public amendment datasets cannot show.
Confidential-treatment requests (Rule 406 / Rule 24b-2). DRS is often confused with confidential treatment, but they are inverses. A CTR withholds specific portions of an otherwise public filing — typically sensitive terms redacted from a contract exhibit. With DRS, the entire statement is nonpublic during review and nothing is redacted; it is simply not yet disseminated, then later refiled in full. CTR answers "which terms were withheld from a public document"; DRS answers "which complete offering documents were under confidential review." They can co-occur — a refiled draft may later apply confidential treatment to certain exhibit terms — making them complementary, not interchangeable.
Review correspondence (CORRESP / UPLOAD) and Reg A drafts (Form 1-A). CORRESP and UPLOAD hold the staff–issuer comment letters around a registration, including the confidential phase — the dialogue about a DRS, not the draft itself. Form 1-A also allows confidential Draft Offering Statements, but under the Regulation A exempt-offering regime with different disclosure scope and dollar caps, outside the S-1/F-1 track DRS feeds. Reach for these only when you need the surrounding correspondence or the Reg A analog.
The Form DRS Files Dataset is the confidential, pre-public draft layer of Securities Act registration — full registration statements (prospectus, financials, risk factors, use of proceeds, exhibits) for both domestic and foreign IPOs, with their DRS/A rounds, from October 2012 to present. It differs from public S-1/F-1 data by posture and timing rather than content, from S-1/A and F-1/A by belonging to the confidential rather than public amendment track, from confidential-treatment requests by withholding the whole document instead of redacting passages, and from review correspondence by containing the draft rather than the dialogue. Its distinct value is visibility into offerings at their earliest disclosed stage — including drafts that never went public — making it a complement to, not a substitute for, the public registration datasets that capture the same deals only once the issuer elects to go public.
The Form DRS Files Dataset captures the earliest formal stage of the U.S. IPO pipeline: the confidential draft registration statement an issuer submits for nonpublic SEC staff review before any public S-1 appears on EDGAR. Each accession holds a near-complete draft prospectus — risk factors, use of proceeds, MD&A, audited financials, and exhibits — and DRS/A amendments record how that draft evolves under staff comment. Different professionals read different parts: capital-markets teams track the issuer roster and timing, lawyers mine drafting and disclosure content, investors read the financials of not-yet-public companies, and data teams use the structured metadata and the draft-versus-public comparison.
ECM desks and IPO-focused sell-side analysts use the dataset as an early-warning map of the new-issue pipeline. A DRS submission starts the SEC review clock months before any public filing, so analysts mine the entities[] block — companyName, cik, sic, stateOfIncorporation, tickers — to maintain a running roster of issuers in confidential review. They read the draft for sector, business model, and use-of-proceeds to estimate raise size, and track DRS/A cadence as a proxy for review progress. The output is a forward IPO calendar that feeds coverage-initiation and new-issue conviction calls.
Bankers in coverage, ECM, and syndicate roles use Form DRS for competitive intelligence and benchmarking. Reading comparable issuers' draft statements shows how peers frame the equity story, structure risk factors, present segment economics in the MD&A, and word use-of-proceeds and dilution. SIC codes and filedAt timestamps cluster comparable deals and inform marketing timing. The dataset also reveals which issuers in a target sector are quietly preparing to come to market — deal-flow intelligence that shapes pitch timing, comp selection, and valuation framing.
Capital-markets attorneys use the draft statements and amendments as a drafting and precedent library. They study how registrants in a given industry word risk factors, related-party and lock-up language, dual-class governance, and material-agreement exhibits (EX-10.x, charters, bylaws). Comparing a DRS against its DRS/A versions — and the later public S-1 — shows how disclosure was revised under staff comment, helping counsel anticipate likely SEC questions on a live engagement. The dataset supports drafting benchmarking, completeness review, and precedent research.
Late-stage venture, growth-equity, crossover, and secondaries teams use Form DRS to access detailed, audited disclosure on portfolio companies and competitors not yet public. The draft prospectus exposes revenue, margins, unit metrics, capitalization, and risk factors months ahead of the public filing. These teams focus on the financial statements, MD&A, and capitalization disclosure, plus entities[] metadata to confirm issuer identity and industry. The output is pre-IPO diligence, secondary-market pricing, and competitive mapping of a holding's peer set.
Teams maintaining IPO calendars and new-issue databases treat the dataset as a structured feed of pipeline events. The metadata.json fields — formType (DRS vs DRS/A), accessionNo, filedAt, entities[] (CIK, name, SIC, state, file number) — let them detect new submissions and amendments, deduplicate issuers by CIK, and timestamp each step. DRS/A counts and timing model review progress and forecast pricing windows. The output is a continuously updated pipeline product with alerts and dashboards on top.
Quant and event-driven teams aggregate filedAt and entities[].sic across the full October 2012-to-present history to measure pipeline intensity by sector, study issuance cycles, and link confidential-submission activity to later public filings and post-IPO performance — producing factor signals and issuance-cycle indicators. Academic researchers use the same record to study the JOBS Act and Section 6(e) confidential-review regime: which issuers elect confidential submission, the lag from first DRS to public filing, and how draft disclosure differs from the eventual version. Both rely on the clean metadata.json for panel construction and the draft text as a feature and study source.
Engineering teams integrating SEC content use the dataset as a deduplicated, ready-to-ingest package. The one-metadata.json-per-accession structure, SGML-wrapped documents with explicit <TYPE> tags, and the documentFormatFiles[] index give stable keys (accessionNo, cik, fileNo) for entity resolution, document typing, and linkage to later public filings by the same CIK. Teams building retrieval and extraction systems use the sectioned prospectus text for reliable chunking, full-text indices, and draft-versus-public diffing tools that surface what changed between a DRS, its amendments, and the public filing.
Concrete ways the Form DRS Files Dataset is put to work, each tied to specific parts of the record.
Build a forward IPO pipeline calendar. Poll new DRS accessions and read the entities[] block (companyName, cik, sic, stateOfIncorporation, tickers) and filedAt to maintain a running roster of issuers in confidential review. Deduplicate by cik and group DRS/A filings under their original draft via the shared fileNo (typically 377-). The output is a dated pipeline feed that surfaces issuers months before any public S-1 appears.
Track review progress through amendment cadence. Count DRS/A rounds for a given issuer and measure the time gaps between successive filedAt timestamps to model how far a draft has moved through staff comment. Rising amendment activity flags deals approaching a public filing and pricing window, feeding IPO-timing forecasts and new-issue alerts.
Extract pre-public financials and disclosure. Strip the SGML <DOCUMENT>/<TEXT> wrapper from filename1.htm and pull the prospectus financial statements, MD&A, capitalization, and risk factors for companies that are not yet public. This supports pre-IPO diligence, secondary-market pricing, and competitive mapping using audited numbers available ahead of the public registration.
Diff a draft against its amendments and the eventual public filing. Align a DRS primary document with its DRS/A versions, then link to the issuer's later public S-1/F-1 by cik, to trace how risk factors, related-party language, and use-of-proceeds disclosure were revised under staff comment. Counsel uses this to anticipate likely SEC questions and to benchmark drafting on live engagements.
Assemble an industry precedent library of exhibits and disclosure. Filter records by entities[].sic, then read the exhibit set by <TYPE> (EX-3 charters and bylaws, EX-4, EX-10.x material agreements, opinions and consents) to compare how peer registrants word governance, lock-up, and material-contract terms. The result is a searchable precedent set for drafting benchmarking and completeness review.
Construct sector-level issuance panels for research. Aggregate filedAt and entities[].sic across the full October 2012-to-present history to measure confidential-submission intensity by sector and study the lag from first DRS to public filing. Stable keys (accessionNo, cik, fileNo) and the documentFormatFiles[] index support clean panel construction for issuance-cycle indicators and JOBS Act / Section 6(e) studies.
Power a draft-text retrieval and extraction pipeline. Use the one-metadata.json-per-accession layout, contiguous filenameN.htm numbering, and explicit document <TYPE> tags to ingest, type, and chunk the sectioned prospectus text. This feeds full-text indices and document-Q&A systems, with cik-based linkage to later public filings for draft-versus-public comparison tooling.
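The amendment-cadence use case above reduces to measuring the gaps between successive filedAt timestamps for one issuer. A minimal sketch, assuming the filedAt strings are ISO 8601 values with a timezone offset as in the metadata example; `amendment_gaps` is a hypothetical helper and the dates in the usage lines are invented:

```python
from datetime import datetime


def amendment_gaps(filed_at_values: list) -> list:
    """Days between consecutive submissions, oldest first."""
    stamps = sorted(datetime.fromisoformat(value) for value in filed_at_values)
    return [
        (later - earlier).total_seconds() / 86400
        for earlier, later in zip(stamps, stamps[1:])
    ]


# Invented timestamps for one DRS and two DRS/A rounds, given out of order.
gaps = amendment_gaps([
    "2025-02-09T12:00:00-05:00",
    "2025-01-10T12:00:00-05:00",
    "2025-02-24T12:00:00-05:00",
])
print(gaps)  # [30.0, 15.0]
```

Whether shrinking gaps actually signal an approaching public filing is an empirical question for the user's own model; the function only supplies the raw intervals.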
Dataset Index JSON API: https://api.sec-api.io/datasets/form-drs-files.json
This endpoint returns the dataset metadata — name, description, last updated timestamp, earliest sample date (2012-10-01), total record and size counts, the form types covered (DRS, DRS/A), the container format (ZIP), and the content file types (HTML, JSON, TXT, PDF). It also includes the download URL for the full dataset and the list of individual container files, each with its size, record count, updated timestamp, and download URL. You can poll this endpoint daily to detect which containers changed in the most recent refresh run and decide which ones to download. This endpoint does not require an API key.
{
  "datasetId": "1f13365b-9ae0-6924-981f-395e0538435d",
  "datasetDownloadUrl": "https://api.sec-api.io/datasets/form-drs-files.zip",
  "name": "Form DRS Files Dataset",
  "updatedAt": "2026-06-20T02:46:53.108Z",
  "earliestSampleDate": "2012-10-01",
  "totalRecords": 70609,
  "totalSize": 5234934958,
  "formTypes": ["DRS", "DRS/A"],
  "containerFormat": "ZIP",
  "fileTypes": ["HTML", "JSON", "TXT", "PDF"],
  "containers": [
    {
      "downloadUrl": "https://api.sec-api.io/datasets/form-drs-files/2026/2026-06.zip",
      "key": "2026/2026-06.zip",
      "size": 13818783,
      "records": 154,
      "updatedAt": "2026-06-20T02:46:53.108Z"
    }
  ]
}
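The daily polling described above can be sketched as a comparison of each container's updatedAt against the value recorded on the previous run. The endpoint URL and field names come from the response shown here; the function names and the idea of persisting a key-to-updatedAt mapping are assumptions made for illustration:

```python
import json
from urllib.request import urlopen

INDEX_URL = "https://api.sec-api.io/datasets/form-drs-files.json"


def fetch_index(url: str = INDEX_URL) -> dict:
    """Download and decode the dataset index (no API key per the text above)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def changed_containers(index: dict, last_seen: dict) -> list:
    """Containers that are new or whose updatedAt differs from the last poll.

    last_seen maps container key -> updatedAt string saved by the caller.
    """
    return [
        container
        for container in index.get("containers", [])
        if last_seen.get(container["key"]) != container["updatedAt"]
    ]


def snapshot(index: dict) -> dict:
    """Build the key -> updatedAt mapping to persist for the next poll."""
    return {c["key"]: c["updatedAt"] for c in index.get("containers", [])}
```

A caller would run `fetch_index()`, pass the result and the stored mapping to `changed_containers`, download what it returns, and then save `snapshot(index)` for the next run.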
Download Entire Dataset: https://api.sec-api.io/datasets/form-drs-files.zip?token=YOUR_API_KEY
Downloads the complete dataset as a single ZIP archive. This endpoint requires an API key.
Download Single Container: https://api.sec-api.io/datasets/form-drs-files/2026/2026-06.zip?token=YOUR_API_KEY
Downloads one individual monthly container file instead of the full dataset. Use the container keys listed in the index JSON API to target specific months. This endpoint requires an API key.
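A small helper can turn a container key from the index into the download URL shown above. This is a sketch rather than an official client: the URL pattern is taken from the example, the token is passed as the query parameter named in it, and `container_url` and `download_container` are hypothetical names:

```python
import shutil
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE_URL = "https://api.sec-api.io/datasets/form-drs-files"


def container_url(key: str, token: str) -> str:
    """Build the download URL for one container key, e.g. '2026/2026-06.zip'."""
    return f"{BASE_URL}/{quote(key)}?{urlencode({'token': token})}"


def download_container(key: str, token: str, dest_path: str) -> None:
    """Stream one container ZIP to disk without loading it fully into memory."""
    with urlopen(container_url(key, token), timeout=60) as response:
        with open(dest_path, "wb") as out:
            shutil.copyfileobj(response, out)


print(container_url("2026/2026-06.zip", "YOUR_API_KEY"))
```

Streaming with `shutil.copyfileobj` is a design choice for large archives; the per-container sizes reported in the index can be used to decide whether that matters.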
The dataset covers EDGAR form types DRS and DRS/A — confidential draft registration statements and their amendments, submitted to the SEC for nonpublic staff review before any public filing. The underlying drafts are most commonly draft Form S-1 registrations, but also drafts of Form F-1, Form S-11, and Form 10.
One record is a single complete EDGAR submission identified by its accession number, packaged as an accession-numbered folder. The folder holds exactly one metadata.json descriptor plus the full set of non-image documents — the primary draft registration statement as filename1.htm and any exhibits as filename2.htm onward.
No one is required to file it; a DRS is an elective, offering-driven submission by an issuer that chooses confidential staff review before going public — most often a pre-IPO operating company. From October 2012 to mid-2017 only emerging growth companies could use the channel; since July 2017 the Division of Corporation Finance has extended nonpublic review to all issuers.
Coverage begins in October 2012 (earliest sample date 2012-10-01), immediately following the JOBS Act's creation of the confidential draft submission process, and runs to the present. There is no pre-2012 analog because the confidential draft channel did not exist before then.
The dataset is distributed as monthly ZIP containers holding HTML, JSON, TXT, and PDF content. Each document .htm file is wrapped in the EDGAR SGML <DOCUMENT>/<TEXT> envelope, which must be stripped before the payload is parsed as HTML; image and graphic files are excluded, though they remain listed in the documentFormatFiles metadata index.
A DRS is the same instrument as a public S-1 or F-1 (same prospectus, financials, risk factors, and exhibits) but is submitted confidentially and identified by a 377- series file number rather than the public 333- series. The DRS dataset captures offerings at their earliest disclosed stage, including the full confidential amendment history and drafts for IPOs that were ultimately abandoned and never filed publicly.
Successive drafts share lineage through their common SEC file number (fileNo, typically in the 377- series) and CIK, not through the accession number. To reconstruct an offering's full confidential iteration history, group DRS and DRS/A records by the entity-level fileNo and cik rather than inferring relationships from accession numbers.
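The grouping rule can be sketched as follows, assuming each record is a parsed metadata.json dictionary with the filedAt and entities fields described earlier. The function name `group_by_lineage` is hypothetical, and a record with several entities is placed in one group per distinct (cik, fileNo) pair:

```python
from collections import defaultdict
from datetime import datetime


def group_by_lineage(records: list) -> dict:
    """Group DRS and DRS/A records by (cik, fileNo), oldest submission first."""
    groups = defaultdict(list)
    for record in records:
        # A set avoids adding the same record twice for repeated entity rows.
        keys = {
            (entity["cik"], entity["fileNo"])
            for entity in record.get("entities", [])
            if entity.get("cik") and entity.get("fileNo")
        }
        for key in keys:
            groups[key].append(record)
    for members in groups.values():
        members.sort(key=lambda r: datetime.fromisoformat(r["filedAt"]))
    return dict(groups)
```

Within each group, the first record is normally the original DRS and the rest are its DRS/A rounds, which can be confirmed from the formType field.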