Form DOS Files Dataset

The Form DOS Files dataset captures every EDGAR submission of Form DOS (a draft offering statement filed for non-public staff review under Rule 252(d) under the Securities Act) and Form DOS/A (any amendment to a previously submitted draft offering statement). One record is one accession-numbered EDGAR submission: the structured filing header, the primary Regulation A "oneafiler" XML, the preliminary offering circular, every Part III exhibit, and any cover correspondence transmitted with the submission. Filers are prospective Regulation A issuers, that is, privately held operating companies organized in the U.S. or Canada that have not previously had a Regulation A offering statement qualified or a Securities Act registration statement declared effective. Coverage begins in June 2015: the earliest possible record dates to June 19, 2015, the day the amended Regulation A rules implementing Title IV of the JOBS Act, including Rule 252(d), took effect. Records continue through the present, packaged as monthly ZIP archives containing XML, HTML, JSON, TXT, and PDF documents.

Update Frequency
Daily
Updated at
2026-05-16
Earliest Sample Date
2015-06-01
Total Size
106.9 MB
Total Records
4,068
Container Format
ZIP
Content Types
XML, HTML, JSON, TXT, PDF
Form Types
DOS, DOS/A

Dataset APIs

Programmatically retrieve the full list of dataset archive files, download URLs and dataset metadata.

Dataset Index JSON API

Download the entire dataset as a single archive file.

Download Entire Dataset:

Download a single container file (e.g. monthly archive) from the dataset.

Download Single Container:

Dataset Files

120 files · 106.9 MB
2026-04.zip · 262.7 KB · 23 records
2025-09.zip · 275.3 KB · 4 records
2025-08.zip · 365.0 KB · 6 records
2025-06.zip · 174.0 KB · 3 records
2025-05.zip · 153.8 KB · 11 records
2025-04.zip · 108.0 KB · 7 records
2025-03.zip · 488.1 KB · 20 records
2025-02.zip · 832.8 KB · 29 records
2025-01.zip · 127.0 KB · 11 records
2024-12.zip · 505.5 KB · 21 records
2024-11.zip · 275.2 KB · 19 records
2024-10.zip · 564.6 KB · 33 records
2024-09.zip · 856.4 KB · 48 records
2024-08.zip · 970.5 KB · 61 records
2024-07.zip · 1.1 MB · 61 records
2024-06.zip · 1.6 MB · 81 records
2024-05.zip · 1.1 MB · 66 records
2024-04.zip · 1.2 MB · 44 records
2024-02.zip · 529.0 KB · 25 records
2024-01.zip · 395.3 KB · 17 records
2023-11.zip · 306.8 KB · 9 records
2023-10.zip · 583.5 KB · 29 records
2023-09.zip · 430.2 KB · 20 records
2023-08.zip · 162.3 KB · 6 records
2023-07.zip · 1.2 MB · 66 records
2023-06.zip · 1.2 MB · 47 records
2023-05.zip · 585.2 KB · 22 records
2023-04.zip · 354.9 KB · 17 records
2023-03.zip · 688.2 KB · 38 records
2023-02.zip · 739.3 KB · 28 records
2023-01.zip · 97.5 KB · 2 records
2022-12.zip · 309.2 KB · 9 records
2022-11.zip · 3.4 MB · 15 records
2022-09.zip · 654.0 KB · 26 records
2022-08.zip · 272.9 KB · 12 records
2022-07.zip · 178.2 KB · 5 records
2022-06.zip · 353.7 KB · 18 records
2022-05.zip · 1.2 MB · 53 records
2022-03.zip · 352.8 KB · 14 records
2022-02.zip · 323.9 KB · 8 records
2022-01.zip · 313.4 KB · 10 records
2021-12.zip · 122.5 KB · 2 records
2021-11.zip · 336.4 KB · 5 records
2021-10.zip · 2.9 MB · 59 records
2021-09.zip · 798.4 KB · 32 records
2021-08.zip · 1.4 MB · 81 records
2021-07.zip · 311.3 KB · 18 records
2021-06.zip · 940.3 KB · 39 records
2021-05.zip · 428.3 KB · 27 records
2021-04.zip · 481.5 KB · 22 records
2021-03.zip · 884.9 KB · 53 records
2021-02.zip · 893.7 KB · 38 records
2021-01.zip · 943.4 KB · 43 records
2020-12.zip · 964.3 KB · 37 records
2020-11.zip · 1.4 MB · 78 records
2020-10.zip · 1.5 MB · 70 records
2020-09.zip · 1.9 MB · 85 records
2020-08.zip · 847.8 KB · 31 records
2020-07.zip · 927.7 KB · 43 records
2020-06.zip · 307.0 KB · 20 records
2020-05.zip · 464.4 KB · 12 records
2020-04.zip · 1.2 MB · 53 records
2020-03.zip · 1.2 MB · 66 records
2020-02.zip · 1.1 MB · 53 records
2020-01.zip · 582.7 KB · 15 records
2019-12.zip · 1.2 MB · 73 records
2019-11.zip · 738.9 KB · 31 records
2019-10.zip · 371.0 KB · 17 records
2019-09.zip · 372.2 KB · 5 records
2019-08.zip · 1.1 MB · 39 records
2019-07.zip · 394.1 KB · 4 records
2019-06.zip · 658.6 KB · 36 records
2019-05.zip · 1.4 MB · 41 records
2019-04.zip · 1.2 MB · 49 records
2019-03.zip · 1.7 MB · 38 records
2019-02.zip · 1.4 MB · 46 records
2019-01.zip · 694.6 KB · 18 records
2018-12.zip · 2.1 MB · 79 records
2018-11.zip · 2.2 MB · 42 records
2018-10.zip · 1.2 MB · 50 records
2018-09.zip · 2.0 MB · 65 records
2018-08.zip · 1.7 MB · 74 records
2018-07.zip · 1.1 MB · 47 records
2018-06.zip · 1.1 MB · 31 records
2018-05.zip · 1.0 MB · 32 records
2018-04.zip · 406.0 KB · 11 records
2018-03.zip · 556.1 KB · 37 records
2018-02.zip · 465.6 KB · 7 records
2018-01.zip · 276.5 KB · 8 records
2017-12.zip · 2.3 MB · 98 records
2017-11.zip · 807.0 KB · 35 records
2017-10.zip · 1.2 MB · 50 records
2017-09.zip · 1.2 MB · 55 records
2017-08.zip · 1.7 MB · 87 records
2017-07.zip · 318.5 KB · 20 records
2017-06.zip · 1.6 MB · 70 records
2017-05.zip · 1.6 MB · 52 records
2017-04.zip · 220.4 KB · 17 records
2017-03.zip · 1.8 MB · 44 records
2017-02.zip · 1.1 MB · 24 records
2017-01.zip · 791.0 KB · 10 records
2016-12.zip · 1.4 MB · 33 records
2016-11.zip · 1.9 MB · 51 records
2016-10.zip · 779.1 KB · 39 records
2016-09.zip · 1.3 MB · 59 records
2016-08.zip · 562.9 KB · 25 records
2016-07.zip · 369.2 KB · 22 records
2016-06.zip · 775.6 KB · 25 records
2016-05.zip · 2.7 MB · 59 records
2016-04.zip · 423.0 KB · 18 records
2016-03.zip · 855.9 KB · 6 records
2016-02.zip · 567.3 KB · 41 records
2016-01.zip · 952.6 KB · 19 records
2015-12.zip · 1.4 MB · 64 records
2015-11.zip · 1.2 MB · 59 records
2015-10.zip · 1.1 MB · 26 records
2015-09.zip · 664.9 KB · 19 records
2015-08.zip · 123.6 KB · 9 records
2015-07.zip · 955.4 KB · 18 records
2015-06.zip · 590.8 KB · 8 records

What This Dataset Contains

This dataset is the corpus of confidential, pre-public draft offering statements submitted under Regulation A. Form DOS is the EDGAR submission type for a draft offering statement; it was created by the Regulation A+ amendments adopted on March 25, 2015 and effective June 19, 2015. Rule 252(d) under the Securities Act of 1933 lets an issuer whose securities have not previously been sold under a qualified Regulation A offering statement or under an effective Securities Act registration statement submit its draft offering statement to the Division of Corporation Finance for non-public review before any public Form 1-A filing. Form DOS/A is the submission type for any subsequent amendment to that draft.

Substantively, a DOS submission carries the same content as a Form 1-A: the same Part I notification of sale (XML-encoded), the same Part II preliminary offering circular, and the same Part III exhibits. The difference is the regulatory track — DOS records are the artifact of confidential pre-public review. When the issuer is ready to launch the offering publicly, the draft and its amendments are typically refiled in whole or by reference on Form 1-A; that subsequent public filing is a separate filing under a separate form type and is not part of this dataset.

Records are delivered as monthly ZIP archives. Each archive contains a top-level folder named for the year-month (for example 2025-04/); inside that folder is one subfolder per filing, named with the 18-digit accession number with no dashes. Each accession folder is the record. The file types found in the dataset are XML, HTML, JSON, TXT, and PDF.
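The layout above can be walked with the standard library alone. The following is a minimal sketch, assuming member paths of the form <year-month>/<accession>/<file> exactly as described; the function name records_in_archive is illustrative and not part of any dataset tooling.

```python
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath


def records_in_archive(zip_source):
    """Map each accession folder name to the sorted file names inside it.

    Assumes member paths look like '<year-month>/<accession>/<file>'.
    zip_source may be a path or a binary file-like object.
    """
    records = defaultdict(list)
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            parts = PurePosixPath(name).parts
            # Skip directory entries and anything not nested as expected.
            if name.endswith("/") or len(parts) != 3:
                continue
            _month, accession, filename = parts
            records[accession].append(filename)
    return {accession: sorted(files) for accession, files in records.items()}
```

Calling records_in_archive("2025-04.zip") on a downloaded archive would return one key per accession folder, so the number of keys can be compared with the record count listed for that month.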

Content Structure of a Single Record

What one record represents

A single record in the Form DOS Files dataset is one complete EDGAR submission of either Form DOS or Form DOS/A. The record unit is the accession-numbered folder bundling the entire submission as transmitted to EDGAR: the structured filing header, the primary Regulation A "oneafiler" XML form data, the preliminary offering circular, every Part III exhibit attached to that submission, and any cover correspondence carried in the EDGAR envelope. One folder corresponds to exactly one accession number. An issuer that goes through several rounds of confidential review surfaces as a chain of distinct records (one DOS plus a sequence of DOS/A amendments) rather than as a single consolidated dossier.

Container and record packaging

Inside each accession folder are:

  • metadata.json — the structured filing-header file.
  • filename1.xml — the EDGAR Regulation A "oneafiler" primary submission XML.
  • filename<N>.htm (occasionally .txt or .pdf) — the Part II preliminary offering circular and each Part III exhibit, one document per file.

Document filenames follow the EDGAR filename<sequence>.<ext> pattern, where the sequence number aligns with the <SEQUENCE> token of the SGML wrapper around the document and with the sequence field of the corresponding documentFormatFiles[] entry in the manifest.

Each non-JSON document file begins with the EDGAR SGML wrapper:

<DOCUMENT>
<TYPE>...
<SEQUENCE>...
<FILENAME>...
<TEXT>
... payload (HTML, XML, plain text, or base64-encoded PDF) ...
</TEXT>
</DOCUMENT>

The <TYPE> token is the canonical EDGAR exhibit-type label and aligns one-for-one with the type field on the corresponding documentFormatFiles[] entry. Image binaries that appear in EDGAR as GRAPHIC document entries are intentionally excluded from the ZIP; their URLs remain in the manifest so the reference graph stays intact even though the pixel data is not packaged.
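As an illustration of consuming that wrapper, here is a minimal sketch that splits one document file into its header tokens and payload. It assumes the tokens appear one per line in the order shown above; the optional <DESCRIPTION> line is an assumption added for tolerance, and the function name parse_sgml_document is hypothetical. The payload is returned undecoded.

```python
import re

# One wrapper per file; header tokens are single-line, the payload is not.
_WRAPPER = re.compile(
    r"<DOCUMENT>\s*"
    r"<TYPE>(?P<type>[^\n]*)\n"
    r"<SEQUENCE>(?P<sequence>[^\n]*)\n"
    r"<FILENAME>(?P<filename>[^\n]*)\n"
    r"(?:<DESCRIPTION>[^\n]*\n)?"  # assumption: tolerated if present
    r"<TEXT>\n?(?P<payload>.*?)\n?</TEXT>\s*"
    r"</DOCUMENT>",
    re.DOTALL,
)


def parse_sgml_document(text):
    """Return the wrapper tokens and the raw inner payload of one document."""
    match = _WRAPPER.search(text)
    if match is None:
        raise ValueError("no <DOCUMENT> wrapper found")
    fields = {
        key: match.group(key).strip()
        for key in ("type", "sequence", "filename")
    }
    fields["payload"] = match.group("payload")
    return fields
```

Keeping the type token next to the payload lets later steps route a document by its exhibit role instead of guessing from the file name.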

metadata.json — the filing header

metadata.json is the structured header for the record. It exposes filing-level identifiers and a manifest of every attachment EDGAR received:

  • formType: DOS or DOS/A.
  • accessionNo: the dashed accession number, 18 digits in three hyphen-separated groups (for example 0001213900-25-037107).
  • description — the EDGAR human-readable form description; carries the [Amend] suffix on DOS/A.
  • filedAt — ISO 8601 timestamp with timezone offset for the EDGAR acceptance moment.
  • linkToFilingDetails, linkToTxt, linkToHtml — URLs to the primary submission document, the full submission text bundle, and the EDGAR filing index page on sec.gov, respectively.
  • linkToXbrl — empty; draft offering statements are not subject to a structured-financial-data tagging mandate.
  • documentFormatFiles[] — one entry per attachment, each carrying sequence, size (bytes, as a string), documentUrl, type (the EDGAR exhibit-type code), and an optional description. The trailing entry with empty sequence and type is the complete-submission text bundle.
  • dataFiles[] — typically empty for DOS records (the bucket EDGAR uses for structured data attachments such as XBRL or XML data exhibits, none of which apply to draft offering statements).
  • seriesAndClassesContractsInformation[] — typically empty for DOS (this header element applies primarily to investment-company series/class structures).
  • entities[]: one record per filer entity associated with the submission, carrying cik, companyName (with the role suffix EDGAR appends, for example (Filer)), the entity-level type (form type for that filer), fileNo (the EDGAR file number, in the 367- range that EDGAR assigns to draft offering statement submissions), irsNo, sic (industry classification code with its descriptive name), stateOfIncorporation, fiscalYearEnd, act (33 for the Securities Act), and filmNo.
  • id — a 32-character hex identifier for the record within the dataset.

When a co-issuer or guarantor is involved, entities[] carries one or more secondary entities; the primary issuer is the entity whose role suffix in companyName reads (Filer).
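A minimal sketch of reading the header follows, using only the field names listed above. The function name summarize_metadata and the shape of its return value are illustrative choices, and the sketch assumes the complete-submission bundle is the entry whose sequence and type are both empty.

```python
def summarize_metadata(meta):
    """Pick out the primary filer and split attachments from image entries."""
    entities = meta.get("entities", [])
    # The primary issuer carries the "(Filer)" role suffix in companyName.
    primary = next(
        (e for e in entities if e.get("companyName", "").endswith("(Filer)")),
        None,
    )
    documents, graphics = [], []
    for entry in meta.get("documentFormatFiles", []):
        if not entry.get("sequence") and not entry.get("type"):
            continue  # trailing complete-submission text bundle
        target = graphics if entry.get("type") == "GRAPHIC" else documents
        target.append(entry.get("documentUrl"))
    return {
        "formType": meta.get("formType"),
        "accessionNo": meta.get("accessionNo"),
        "primaryCik": primary.get("cik") if primary else None,
        "documents": documents,
        "graphics": graphics,
    }
```

The input is the dictionary produced by json.load on a record's metadata.json. Entries under graphics are the ones whose bytes are not packaged in the ZIP.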

The primary "oneafiler" XML

The lead document, filename1.xml, is an EDGAR Regulation A "oneafiler" submission (default namespace http://www.sec.gov/edgar/rega/oneafiler). It encodes the structured Part I of Form 1-A — the notification of sale and offering data the issuer entered into EDGAR's online forms — and is the dataset's only field-level representation of the offering. Its main branches are:

  • headerData/submissionType: DOS or DOS/A, mirroring the manifest's formType.
  • headerData/filerInfo/filer/issuerCredentials — issuer CIK and a redacted CCC.
  • formData/employeesInfo — issuer legal name, state or country of organization, year of incorporation, CIK, SIC, IRS number, and full-time and part-time employee counts.
  • formData/issuerInfo — mailing address, contact, industry group, plus a compact balance-sheet/income-statement snapshot at the most recent reporting date: cash and equivalents, receivables, property/plant/equipment, total assets, accounts payable, long-term debt, total liabilities, stockholders' equity, revenues, cost of goods sold, depreciation/amortization, net income, basic and diluted EPS, and the auditor's name.
  • formData/commonEquity, formData/preferredEquity, formData/debtSecurities — repeatable per-class blocks listing outstanding share counts, CUSIPs, and public-trading status.
  • formData/issuerEligibility and formData/applicationRule262 — the certifications that the issuer is eligible to use Regulation A and that no Rule 262 bad-actor disqualifying event applies.
  • formData/summaryInfo — the offering economics: Tier 1 vs. Tier 2 election, audit status, security type, best-efforts and delayed-offering flags, securities offered, price per security, aggregate offering amount, plus the names and compensation of the underwriter, sales agents, auditor, legal counsel, promoters, and blue-sky service providers, the broker-dealer CRD, and estimated net proceeds.
  • formData/juridictionSecuritiesOffered — two state-by-state lists: issueJuridicationSecuritiesOffering (states in which the issuer intends to offer) and dealersJuridicationSecuritiesOffering (states in which the registered dealers intend to offer). The element-name spellings preserve EDGAR's schema as filed.
  • formData/unregisteredSecurities, repeated securitiesIssued blocks, and formData/unregisteredSecuritiesAct/securitiesActExcemption — the issuer's prior unregistered-securities issuances within the look-back window and the exemption (commonly Regulation D Rule 506) relied upon for each.

Everything else in the record is narrative or document-formatted prose.
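To show how the branches listed above might be reached, here is a minimal sketch using the standard library XML parser. It matches elements by local name so that it does not depend on which namespace each element carries, and it does not hard-code any leaf element names inside summaryInfo, since those are not enumerated here. It expects the XML payload after the SGML wrapper has been removed. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def find_path(root, path):
    """Follow a '/'-separated path of local element names below root."""
    node = root
    for step in path.split("/"):
        node = next((c for c in node if local_name(c.tag) == step), None)
        if node is None:
            return None
    return node


def summarize_oneafiler(xml_text):
    """Return the submission type and the leaf values under summaryInfo."""
    root = ET.fromstring(xml_text.strip())
    sub_type = find_path(root, "headerData/submissionType")
    summary = find_path(root, "formData/summaryInfo")
    fields = {}
    if summary is not None:
        for child in summary:
            # Keep simple leaf elements only; names are taken as found.
            if len(child) == 0 and child.text:
                fields[local_name(child.tag)] = child.text.strip()
    has_type = sub_type is not None and sub_type.text
    return {
        "submissionType": sub_type.text.strip() if has_type else None,
        "summaryInfo": fields,
    }
```

Values come back as strings; converting offering amounts or prices to numbers is left to the caller because the exact field formats are not specified above.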

The Part II offering circular and Part III exhibits

The remaining files in an accession folder are the Part II preliminary offering circular and the Part III exhibits, each delivered as an SGML-wrapped HTML document (or, less commonly, plain text or PDF) whose <TYPE> token tags its role.

The offering circular itself is tagged PART II AND III and is by far the largest single document in a typical record. Its narrative interior follows the Form 1-A Part II disclosure schema, in roughly this order: cover page; offering circular summary; risk factors; dilution; plan of distribution; use of proceeds to issuer; description of business; description of property; management's discussion and analysis of financial condition and results of operations; directors, executive officers, and significant employees; compensation of directors and executive officers; security ownership of management and certain securityholders; interest of management and others in certain transactions; securities being offered; and the issuer's financial statements with notes. Under Tier 2 the financial statements must be audited; under Tier 1 they may be unaudited, in which case unaudited financial-statement schedules also appear in this section. Tabular content (capitalization tables, use-of-proceeds tables, selling-securityholder tables, executive-compensation tables, beneficial-ownership tables, and the financial-statement tables themselves) is rendered inline in the HTML rather than as separate exhibits.

The Part III exhibits are numbered under the Form 1-A EX1A- family. Common <TYPE> tokens include:

  • EX1A-1 — underwriting agreements.
  • EX1A-2A — charter document (certificate of incorporation, articles of organization, or trust agreement).
  • EX1A-2B BYLAWS — bylaws or operating agreement.
  • EX1A-3 — instruments defining the rights of securityholders, indentures, and similar.
  • EX1A-4 SUBS AGMT — subscription agreement.
  • EX1A-5: voting trust agreements.
  • EX1A-6 MAT CTRCT — material contracts.
  • EX1A-7 — plans of acquisition, reorganization, arrangement, liquidation, or succession.
  • EX1A-8: escrow agreements.
  • EX1A-9 — letters of eligibility from a transfer agent or trustee, where applicable.
  • EX1A-10: powers of attorney.
  • EX1A-11 CONSENT — consents of independent accountants, counsel, and other named experts.
  • EX1A-12 OPN CNSL: legality opinion of counsel as to the validity of the securities being offered.
  • EX1A-13: testing-the-waters materials, when used.
  • EX1A-14 — appointment of agent for service of process.
  • EX1A-15 — additional exhibits not captured by the other categories.

GRAPHIC entries in the manifest correspond to embedded images (logos, charts, signature blocks, scanned exhibits) referenced from inside the HTML by <IMG> tags such as <IMG SRC="image_001.jpg"> or <IMG SRC="ex12-1_001.jpg">. The HTML keeps the references; the dataset, by design, omits the binary image payloads.
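Because the image bytes are absent, a renderer needs to map each image reference back to a manifest URL. The sketch below is one way to do that, assuming the documentUrl of a GRAPHIC entry ends with the same file name the HTML references; that assumption, and the function name resolve_image_urls, are illustrative.

```python
from html.parser import HTMLParser
from posixpath import basename
from urllib.parse import urlparse


class _ImgCollector(HTMLParser):
    """Collect the src attribute of every image tag in a document."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <IMG SRC=...> matches.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def resolve_image_urls(html_text, document_format_files):
    """Map each image reference to its GRAPHIC manifest URL, or None."""
    collector = _ImgCollector()
    collector.feed(html_text)
    by_name = {
        basename(urlparse(entry["documentUrl"]).path): entry["documentUrl"]
        for entry in document_format_files
        if entry.get("type") == "GRAPHIC" and entry.get("documentUrl")
    }
    return {src: by_name.get(basename(src)) for src in collector.sources}
```

References that map to None have no matching GRAPHIC entry and would need manual inspection.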

Signatures appear inside the documents that require them: the issuer's principal executive, principal financial, and principal accounting officer signatures, plus a majority of the board, are reproduced at the end of the PART II AND III document; counsel, auditor, and other expert signatures appear in their respective EX1A-11 and EX1A-12 exhibits; powers of attorney, where filed, appear in EX1A-10.

What is included versus excluded

Included in the record: the structured header (metadata.json); the primary Regulation A "oneafiler" XML; the SGML-wrapped HTML (or, occasionally, plain-text or PDF) preliminary offering circular; every numbered Part III exhibit attached to the submission; and the original EDGAR file naming and SGML envelope around each document, which preserves the <TYPE>, <SEQUENCE>, and <FILENAME> lineage.

Excluded from the record: binary image attachments referenced as GRAPHIC files in the manifest (their URLs are kept in documentFormatFiles[] but the bytes are not bundled); any non-public correspondence between the issuer and the staff that is not part of an accepted EDGAR submission; and the staff's review comments themselves, which travel through a separate channel and never enter the EDGAR submission stream. Subsequent amendments live in their own DOS/A records rather than being merged into the parent DOS record, and the eventual public Form 1-A filing — even when it incorporates the draft by reference — is a different filing belonging to a different dataset.

DOS versus DOS/A — structural difference

A DOS record is the issuer's first confidential draft. A DOS/A record is an amendment to that draft and is structurally identical: same "oneafiler" XML schema, same Part II/Part III exhibit conventions, same SGML wrappers, same metadata layout. The amendment carries an [Amend] suffix in the EDGAR description, the submissionType element in the XML reads DOS/A, and exhibits that have not changed since the prior round are typically dropped from the new submission rather than refiled. Because each amendment is its own accession number, a single offering's confidential-review history surfaces as a chain of related records that share the issuer CIK and Regulation A file number (367-NNNNN) but differ in accessionNo and filedAt.
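The chain described above can be rebuilt from the headers alone. This is a minimal sketch, assuming each input is a parsed metadata.json dictionary, that filedAt always includes a timezone offset, and that the first entity with the (Filer) suffix is the primary issuer; build_review_chains is a hypothetical name.

```python
from collections import defaultdict
from datetime import datetime


def build_review_chains(records):
    """Group records by (cik, fileNo) and order each group by filedAt."""
    chains = defaultdict(list)
    for meta in records:
        for entity in meta.get("entities", []):
            if entity.get("companyName", "").endswith("(Filer)"):
                key = (entity.get("cik"), entity.get("fileNo"))
                chains[key].append(meta)
                break  # use the first filer entity only
    for chain in chains.values():
        # Offsets are parsed, so mixed UTC offsets still sort correctly.
        chain.sort(key=lambda m: datetime.fromisoformat(m["filedAt"]))
    return dict(chains)
```

Each value is then an oldest-to-newest list in which the first element is normally the DOS and the rest are DOS/A amendments.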

Evolution of required content and structure

Form DOS has had a relatively short and stable life compared with longer-standing forms. The form was created by the Regulation A+ amendments adopted on March 25, 2015 and effective June 19, 2015; the dataset's earliest records date to that effective period. A few content shifts since 2015 are worth noting:

  • The Tier 1 / Tier 2 split is reflected directly inside the "oneafiler" XML's summaryInfo block via an explicit Tier election. Tier 2 elections trigger audited financial-statement requirements inside the Part II offering circular; Tier 1 elections do not. The structural slot is the same; the substantive content of the Part II financial statements differs.
  • The SEC raised the Tier 2 offering ceiling from $50 million to $75 million effective March 15, 2021 (the "harmonization amendments"). That change is invisible at the schema level but widens the range of aggregate offering amounts that can appear in formData/summaryInfo for records filed on or after that date.
  • The Rule 262 bad-actor certification block (applicationRule262) and the structured listing of prior unregistered securities issuances (unregisteredSecurities / securitiesIssued / securitiesActExcemption) have been part of the form since its inception and remain stable.
  • The Part III exhibit numbering scheme (the EX1A-1 through EX1A-15 family) has remained the canonical taxonomy throughout the dataset's history; minor additions to the EDGAR exhibit-type vocabulary over time appear as new <TYPE> tokens but do not displace existing categories.
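When screening aggregate offering amounts across the whole date range, the Tier 2 ceiling change noted above has to be taken into account. The sketch below keys the ceiling to the record's filedAt date, which is a simplifying assumption made for illustration; the function names are hypothetical.

```python
from datetime import date, datetime

# Effective date of the increase from $50 million to $75 million.
TIER2_CEILING_CHANGE = date(2021, 3, 15)


def tier2_ceiling(filed_at):
    """Return the Tier 2 ceiling in dollars for an ISO 8601 filedAt value."""
    filed_on = datetime.fromisoformat(filed_at).date()
    if filed_on >= TIER2_CEILING_CHANGE:
        return 75_000_000
    return 50_000_000


def exceeds_tier2_ceiling(aggregate_amount, filed_at):
    """Flag an offering amount above the ceiling in force on the filing date."""
    return aggregate_amount > tier2_ceiling(filed_at)
```

A $60 million draft filed in 2020 would be flagged, while the same amount filed in 2022 would not.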

The confidential-review mechanism itself (Rule 252(d)) has not been substantially restructured since 2015, so the gross anatomy — header XML, Part II circular, numbered Part III exhibits — is consistent across the entire dataset.

Evolution of data format

Because Form DOS was introduced in 2015, after EDGAR had completed its migration away from ASCII-only filings, the dataset has never carried text-only or paper-style submissions. From the first accepted DOS filing onward, the primary submission has been the Regulation A "oneafiler" XML, the Part II preliminary offering circular has been HTML, and exhibits have been delivered as HTML, plain text, or PDF, each wrapped in the standard EDGAR <DOCUMENT><TYPE><SEQUENCE><FILENAME><TEXT> envelope. The format profile of a DOS record has been stable since June 2015; the only meaningful evolution has been at the EDGAR vocabulary level (the set of exhibit-type tokens) rather than at the file-format level.

Interpretation notes

A few nuances matter for working with these records:

  • The "oneafiler" XML and the Part II HTML circular partially duplicate each other. The XML carries structured field-level facts (offering size, security type, fees, jurisdictions, prior issuances); the HTML carries the narrative disclosure those facts are embedded within. For numerical extraction, the XML is the authoritative source; for risk-factor, MD&A, or business-description text, the HTML is the only source.
  • The SGML envelope is not optional flavor. It is the boundary that delimits one logical document inside a multi-document submission. Parsers should consume the <DOCUMENT>...</DOCUMENT> block before attempting to interpret the inner HTML or XML, and should associate the <TYPE> token with the inner payload rather than relying on filename heuristics.
  • Image references inside the HTML resolve to filenames (image_001.jpg, ex12-1_001.jpg, and similar) that are intentionally absent from the ZIP. Renderers that care about visual fidelity must fetch the binaries separately from the EDGAR URLs preserved in documentFormatFiles[].
  • DOS/A amendments are independent records, not deltas. Reconstructing the full history of one offering requires joining records by issuer CIK and Regulation A fileNo (367-NNNNN) and ordering them by filedAt. Exhibits that the issuer chose not to refile in an amendment must be retrieved from the prior record in the chain.
  • A draft offering statement is, by regulatory design, not yet public at the moment of filing; the dataset reflects the documents as they were transmitted to the staff for non-public review. Any subsequent public Form 1-A filing for the same offering is a separate filing under a different form type and is not part of this dataset.
  • Because Regulation A confidential review is available only to first-time Regulation A users (those who have not previously sold under a qualified offering statement or under an effective Securities Act registration statement), every DOS record comes from an issuer that had not yet completed a qualified or registered offering when it submitted the draft. There is no post-qualification "subsequent-offering" DOS pattern to disentangle, although an issuer that abandons a draft before qualification may reappear later with a new DOS.
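The note on amendments being independent records implies a carry-forward step when assembling the current state of an offering. The following is a minimal sketch of that step, assuming the input is one offering's metadata dictionaries ordered oldest to newest and that a newer record's documents of a given type supersede all older documents of that type. That supersession rule is a simplification, and latest_exhibits is a hypothetical name.

```python
def latest_exhibits(chain):
    """For each document type, keep the URLs from the newest record that has it."""
    latest = {}
    for meta in chain:
        by_type = {}
        for entry in meta.get("documentFormatFiles", []):
            doc_type = entry.get("type")
            # Skip the complete-submission bundle and image entries.
            if not doc_type or doc_type == "GRAPHIC":
                continue
            by_type.setdefault(doc_type, []).append(entry.get("documentUrl"))
        for doc_type, urls in by_type.items():
            latest[doc_type] = {
                "accessionNo": meta.get("accessionNo"),
                "documentUrls": urls,
            }
    return latest
```

Exhibit types that appear only in an earlier record keep pointing at that record's accession number, which is where the unchanged exhibit has to be fetched from.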

Who Files or Publishes This Dataset, and When

Each Form DOS record is a draft Regulation A offering statement submitted to the SEC for non-public staff review. The filer is the prospective Regulation A issuer itself, acting through its officers and counsel. Form DOS is the EDGAR submission type for the initial draft; DOS/A is the submission type for any subsequent draft amendment by the same issuer during non-public review. The SEC Division of Corporation Finance is the recipient and reviewer, not a co-filer. Underwriters, placement agents, selling securityholders, auditors, and counsel may be described in the draft but are not filers of the DOS record.

Filer eligibility

Confidential draft submission under Rule 252(d) of Regulation A is restricted to issuers that have not previously:

  • had a Regulation A offering statement qualified by the SEC, or
  • had a Securities Act registration statement (Form S-1, S-11, F-1, etc.) declared effective.

This effectively limits Form DOS to first-time public-market issuers. Eligible filers are typically privately held operating companies organized in the U.S. or Canada.

Issuers excluded from Regulation A altogether cannot file Form DOS, including registered investment companies, business development companies, blank check companies, issuers of fractional interests in oil/gas/mineral rights, Rule 262 "bad actor"-disqualified issuers, and issuers outside the U.S./Canada perimeter.

Trigger and timing

Form DOS is event-driven and entirely optional:

  • Initial DOS: filed when an eligible issuer chooses, before any public Form 1-A filing, to invoke Rule 252(d) non-public review. There is no statutory deadline; filing a DOS is itself elective.
  • DOS/A: filed when the issuer revises a previously submitted draft, typically in response to SEC staff comment letters or to update financial statements, deal terms, or other disclosure. Multiple amendment rounds are common until staff comments are resolved.
  • Transition out: when the issuer is ready to proceed publicly, it files Form 1-A on EDGAR. Rule 252(d) requires that all confidential drafts and related staff correspondence be publicly filed as exhibits to the Form 1-A at least 21 calendar days before qualification of the offering statement.

The earliest possible Form DOS record dates to June 19, 2015, when amended Regulation A and Rule 252(d) took effect under Title IV of the JOBS Act. There is no paper-era predecessor.

Important distinctions

  • DOS vs. DRS: DOS covers draft Regulation A offering statements (Form 1-A drafts) under Rule 252(d). DRS covers draft Securities Act registration statements (such as Form S-1) submitted under JOBS Act Section 106 and its later non-EGC expansion. Parallel but distinct regimes.
  • DOS vs. Form 1-A: Form 1-A is the public Reg A offering statement; DOS is the confidential draft of the same content. Issuers that go directly to a public Form 1-A do not appear in the DOS dataset.
  • DOS/A vs. 1-A/A: DOS/A amends a confidential draft during non-public review; 1-A/A amends a publicly filed Form 1-A. They are separate EDGAR submission types.
  • Repeat issuers: an issuer that previously qualified a Reg A offering or had a Securities Act registration statement declared effective is ineligible under Rule 252(d) and must file Form 1-A publicly from the outset, even for a new Reg A offering.
  • Confidentiality: while in the non-public queue, a DOS is not visible on EDGAR's public interface. Public availability arises only through the later Form 1-A exhibit filing, which is a separate record from the DOS submission captured here.

How This Dataset Differs From Similar Datasets or Filings

Form DOS occupies one narrow slot in the Regulation A lifecycle: the non-public, pre-qualification draft an issuer submits to SEC staff under Rule 252(d). The forms and datasets most easily confused with it either sit elsewhere on the same Reg A timeline, replicate the confidential-draft mechanic for a different registration regime, or document the staff dialogue running alongside the draft.

Form 1-A and 1-A/A (public Reg A offering statements). 1-A is the public successor to DOS. The same Reg A schema (Parts I, II, III) and frequently the same document text apply — but 1-A is the filing the SEC actually qualifies, and it must be public at least 21 calendar days before qualification. DOS produces nothing effective on its own; it only feeds the eventual 1-A. 1-A/A amends the public version; DOS/A amends the non-public draft. The two amendment streams are mechanically parallel but live on opposite sides of the public/non-public boundary.

DRS and DRSLTR (draft registration statements under Securities Act §6(e), added by JOBS Act Section 106). DRS is the structural cousin of DOS for full Securities Act registration: confidential drafts of S-1, F-1, etc., reviewed by staff before public filing, with DRSLTR carrying the related issuer correspondence. The confidentiality mechanic is nearly identical, but the underlying regime is not. DRS belongs to full registration (large disclosure burden, IPO-oriented, no offering cap), while DOS belongs to the Reg A exemption (Tier 1/Tier 2 ceilings, lighter "oneafiler" XML schema). An issuer picks one path or the other at the exemption-vs-registration level, not within a single offering.

Form S-1 (standard Securities Act registration). Sometimes confused with DOS because both are capital-raising disclosure documents under staff review, but S-1 is public on filing (unless first submitted as a DRS), governed by full registration rules, and produces an effective registration statement. DOS is non-public, governed by Reg A, and qualifies nothing. S-1 data is not a substitute for DOS data — it is a different track.

Reg A ongoing reports (1-K, 1-SA, 1-U, 1-Z). Same regime, opposite end of the lifecycle: these are post-qualification, periodic, and public (annual, semiannual, current, exit). DOS is pre-qualification, one-shot, and confidential. A complete Reg A research set chains DOS and DOS/A → 1-A and 1-A/A → 1-K / 1-SA / 1-U / 1-Z.

CORRESP and UPLOAD (staff correspondence). These run alongside DOS during confidential review — CORRESP holds issuer letters to staff, UPLOAD holds staff comment letters to issuers. They are procedural dialogue, not draft offering documents. DOS supplies the substantive draft text and exhibits; CORRESP/UPLOAD record the back-and-forth that drives the revisions visible in DOS/A and the eventual 1-A.

Key differences at a glance

  • Public vs non-public: DOS is confidential at submission; 1-A, 1-A/A, S-1, and the 1-K family are public.
  • Regime: DOS and the 1-A/1-K family sit inside Regulation A; DRS, DRSLTR, and S-1 sit inside full Securities Act registration.
  • Lifecycle position: DOS is pre-qualification draft; 1-A is the qualification filing; 1-K/1-SA/1-U/1-Z are post-qualification reporting.
  • Effect: DOS qualifies nothing on its own; only the subsequent public 1-A can be qualified.
  • Document role: DOS is the draft offering statement itself; CORRESP/UPLOAD are correspondence about it.
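The lifecycle positions above can be encoded as a small lookup for ordering one issuer's Regulation A filings. This is a minimal sketch; the stage labels come from the list above, while the function name and the (form type, filedAt) input shape are assumptions. Form types outside the Regulation A family, including correspondence, are skipped.

```python
from datetime import datetime

STAGE_LABELS = (
    "pre-qualification draft",
    "qualification filing",
    "post-qualification reporting",
)

REG_A_STAGES = {
    "DOS": 0, "DOS/A": 0,
    "1-A": 1, "1-A/A": 1,
    "1-K": 2, "1-SA": 2, "1-U": 2, "1-Z": 2,
}


def lifecycle_timeline(filings):
    """Order (form_type, filed_at) pairs by time and label their stage."""
    kept = [(form, at) for form, at in filings if form in REG_A_STAGES]
    kept.sort(key=lambda item: datetime.fromisoformat(item[1]))
    return [(form, STAGE_LABELS[REG_A_STAGES[form]], at) for form, at in kept]
```

Joining such a timeline on issuer CIK across datasets is what allows a draft to be followed through to its public filing and later reports.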

Boundary summary

Form DOS is defined by the intersection of three attributes: non-public at submission, Regulation A exemption framework, and pre-qualification draft stage. That intersection separates it cleanly from 1-A and 1-A/A (same regime, public stage), DRS and DRSLTR (same confidentiality mechanic, different regime), S-1 (different regime, public track), the 1-K/1-SA/1-U/1-Z series (same regime, post-qualification reporting), and CORRESP/UPLOAD (correspondence, not the draft document). DOS is most useful on its own for studying the confidential pre-public stage, and most powerful when joined to the subsequent public 1-A filings and Reg A periodic reports to reconstruct the full Regulation A lifecycle.

Who Uses This Dataset

Form DOS and DOS/A filings expose Regulation A offerings before they become public on Form 1-A, so the dataset draws a narrow professional audience tied to small-issuer capital formation.

Securities lawyers and Reg A practitioners

Counsel advising emerging-growth issuers, microcap sponsors, and crowdfunding portals use the corpus as a precedent library. They mine peer drafts for Tier 1 versus Tier 2 disclosure conventions, risk-factor framing, securities descriptions, and exhibit lists. The DOS-to-DOS/A amendment trail is the most-used signal: it implicitly reveals what staff is asking for in non-public review and informs comment-response strategy.

Issuer-side CFOs and finance leads

CFOs and founder-led finance teams weighing Reg A use the dataset to calibrate offering size, security type, use-of-proceeds language, and the realistic timeline from draft to qualification. The exhibit list and amendment cadence help them scope counsel engagement and internal disclosure work before committing to a public filing.

Placement agents and broker-dealers

Reg A-active placement agents and broker-dealers treat DOS filings as an early-warning pipeline. The CIK, proposed offering size, and security type let them identify issuers ahead of the public Form 1-A and pitch distribution, escrow, or transfer-agent services. Underwriting committees pre-screen candidates against suitability and reputational criteria.

Deal sourcing at venture, growth, and SPAC sponsors

Pipeline analysts at venture funds, growth-equity platforms, and SPAC sponsors scan draft business descriptions, capital structures, and use-of-proceeds sections for pre-public issuers in target verticals. The data supports outbound sourcing for follow-on private rounds, PIPE financing alongside qualification, and SPAC combination conversations.

Analytics teams at retail Reg A distribution platforms

Analytics teams at platforms that list qualified Reg A deals to retail investors enrich issuer profiles, forecast inventory, and estimate qualification probability from amendment cadence. They also build diff dashboards comparing the draft against the eventual Form 1-A to flag material changes introduced during staff review.

Academic researchers in finance, law, and accounting

Researchers studying small-issuer capital formation use the DOS-to-DOS/A-to-1-A linkage to build longitudinal samples on staff comment intensity (proxied by amendments and document changes), drop-off rates, disclosure quality, and the impact of confidential pre-filing review. Stable CIK identification makes panel construction tractable.

Competitive intelligence at peer issuers

Corporate strategy teams use draft business descriptions, capital structures, and use-of-proceeds disclosures to detect competitors preparing to raise public capital ahead of the Form 1-A signal. This matters most in concentrated Reg A verticals such as cannabis, real estate funds, fintech, and consumer brands.

RegTech and disclosure-tooling vendors

Vendors building EDGAR pipelines, issuer authoring tools, and Reg A workflow software use the corpus to train extraction and classification models. The mixed XML, HTML, and PDF documents support parser development; the DOS-versus-DOS/A distinction supports versioning and redline workflows; and the exhibit inventory feeds document-type taxonomies.

Compliance and surveillance teams

Compliance teams at broker-dealers and Reg A platforms monitor issuers whose draft offerings are amended, withdrawn, or materially restructured. Movement between DOS and DOS/A can signal valuation drift, sponsor turnover, or disclosure risk that warrants re-underwriting before the deal goes public, with CIK linkage tying activity back to KYC and AML records.

LLM and RAG developers for capital-markets workflows

Teams building retrieval and question-answering systems use Form DOS as training and evaluation material for draft-stage offering language. Multiple drafts of the same issuer make it well suited to fine-tuning models that recognize disclosure edits, track risk-factor changes, and extract structured offering terms from semi-structured prose.

The dataset's users share one interest: visibility into the pre-public phase of Reg A offerings. Lawyers, issuers, intermediaries, sourcing analysts, researchers, and technology vendors each draw on different fields — the proposed terms, the exhibit list, the amendment trail, or the CIK linkage — to support drafting, sourcing, surveillance, or empirical work that no public source otherwise enables.

Specific Use Cases

The dataset's value lies in its narrow but unique window: the confidential, pre-public stage of Regulation A offerings. The use cases below describe concrete workflows that practitioners run against the records.

Building a precedent library for Reg A drafting

Securities counsel pull the Part II offering circulars and Part III exhibits across DOS records filtered by sic and Tier election in the oneafiler XML's summaryInfo block. Risk-factor sections, securities descriptions, subscription agreements (EX1A-4 SUBS AGMT), and legality opinions (EX1A-12 OPN CNSL) become a peer-comparable corpus for first drafts and exhibit checklists, with the SIC and state-of-incorporation fields in metadata.json providing the slicing keys.

Inferring staff comment pressure from the DOS-to-DOS/A chain

Researchers and law firms join records by issuer CIK and Regulation A fileNo (367-NNNNN), order them by filedAt, and diff successive Part II circulars and oneafiler XML payloads. The number, cadence, and locus of edits between DOS and each DOS/A serve as a proxy for staff comment intensity, supporting both empirical work on confidential review and tactical comment-response planning at the issuer level.
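The join, order, and diff steps described above can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes each record has already been loaded into a dict with the keys cik, fileNo, filedAt, formType, and circular_text. Those key names, and the idea of holding the Part II text in a single string, are assumptions made for the example.

```python
import difflib
from collections import defaultdict


def build_chains(records):
    """Group records by (cik, fileNo) and sort each group by filedAt."""
    chains = defaultdict(list)
    for rec in records:
        chains[(rec["cik"], rec["fileNo"])].append(rec)
    for chain in chains.values():
        # ISO-8601 timestamps in one consistent format sort correctly as strings.
        chain.sort(key=lambda r: r["filedAt"])
    return dict(chains)


def edit_counts(chain):
    """Count added and removed lines between each pair of successive drafts."""
    results = []
    for prev, curr in zip(chain, chain[1:]):
        old_lines = prev["circular_text"].splitlines()
        new_lines = curr["circular_text"].splitlines()
        matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
        added = removed = 0
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag in ("replace", "delete"):
                removed += i2 - i1
            if tag in ("replace", "insert"):
                added += j2 - j1
        results.append((prev["formType"], curr["formType"], added, removed))
    return results
```

Line-level counts are a coarse measure. A real study would likely strip HTML first and diff at the section level so that edits can be attributed to risk factors, use of proceeds, and so on.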

Early-pipeline sourcing for placement agents and PIPE investors

Placement agents, transfer agents, and growth-stage investors monitor new DOS records weekly, extracting issuer CIK, aggregate offering amount, securityType, Tier election, and the juridictionSecuritiesOffered state lists from formData/summaryInfo. The output is a ranked outreach list of issuers months ahead of any public Form 1-A signal, scoped to verticals, deal sizes, and jurisdictions the agent or fund can serve.

Diff dashboards from draft to qualified offering

Analytics teams at retail Reg A distribution platforms join DOS and DOS/A records to the eventual public Form 1-A by issuer CIK and fileNo, then surface field-level deltas — changes in offering size, price, security class, use-of-proceeds tables, executive compensation, and risk factors — to flag what staff review actually altered. The output feeds investor-facing materiality flags and internal qualification-probability scores.
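A field-level delta of this kind can be computed by flattening the structured form data from each filing and comparing the two key sets. The sketch below assumes both the draft and the public filing have been parsed into nested dicts. The field names used in the usage note are hypothetical and are not taken from the oneafiler schema.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts into {"a.b.c": value} form."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def field_deltas(draft, final):
    """Return {field: (draft_value, final_value)} for every field that differs.

    A field present on only one side is reported with None for the other side.
    """
    flat_draft, flat_final = flatten(draft), flatten(final)
    deltas = {}
    for key in sorted(set(flat_draft) | set(flat_final)):
        before, after = flat_draft.get(key), flat_final.get(key)
        if before != after:
            deltas[key] = (before, after)
    return deltas
```

For example, comparing {"summaryInfo": {"tier": "Tier2", "price": 5.0}} with {"summaryInfo": {"tier": "Tier2", "price": 4.0}} reports a single delta on summaryInfo.price.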

Training extraction and redline models for RegTech tooling

LLM and RegTech vendors use the corpus as labeled training data: the structured oneafiler XML supplies ground-truth offering terms (Tier, price, aggregate amount, jurisdictions, exemption claims) for the same content that appears in narrative form inside the Part II HTML circular. Pairing the two yields supervised data for offering-term extraction; pairing DOS with DOS/A yields supervised data for redline detection and disclosure-change classification.

Competitor surveillance in concentrated Reg A verticals

Corporate strategy teams in cannabis, real estate funds, fintech, and consumer brands run keyword and SIC filters across new DOS filings to catch peers preparing to raise public capital. The Part II business description, capitalization table, and use-of-proceeds section provide enough signal to brief leadership before the issuer's public Form 1-A appears, and the entities[] block ties co-issuers and guarantors back to corporate-family graphs.

Dataset Access

Dataset Index JSON API: https://api.sec-api.io/datasets/form-dos-files.json

This endpoint returns the dataset's metadata and the full list of available container files. The response includes the dataset name, description, last updated timestamp, earliest sample date, total record count and size, covered form types (DOS, DOS/A), the container format (ZIP), and the file types found inside containers (XML, HTML, JSON, TXT, PDF). It also includes the download URL for the full dataset archive and an array of containers, each with its key, size, records, updatedAt, and downloadUrl. Use this endpoint to monitor which containers were refreshed in the most recent run and to decide which files to pull on a day-by-day basis.

This endpoint does not require an API key.

Example response:

{
  "datasetId": "1f13365b-9ae0-69c5-a5a6-b53ccae7fe43",
  "datasetDownloadUrl": "https://api.sec-api.io/datasets/form-dos-files.zip",
  "name": "Form DOS Files Dataset",
  "updatedAt": "2026-04-25T03:02:38.726Z",
  "earliestSampleDate": "2015-06-01",
  "totalRecords": 4054,
  "totalSize": 106638474,
  "formTypes": ["DOS", "DOS/A"],
  "containerFormat": "ZIP",
  "fileTypes": ["XML", "HTML", "JSON", "TXT", "PDF"],
  "containers": [
    {
      "downloadUrl": "https://api.sec-api.io/datasets/form-dos-files/2026/2026-04.zip",
      "key": "2026/2026-04.zip",
      "size": 2184356,
      "records": 18,
      "updatedAt": "2026-04-25T03:02:38.726Z"
    }
  ]
}
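
One way to use the index for incremental pulls is to compare each container's updatedAt against the time of your last sync. The sketch below uses only the Python standard library and the response fields described above. The function names and the cutoff convention are assumptions for the example, not part of the API.

```python
import json
from datetime import datetime
from urllib.request import urlopen

INDEX_URL = "https://api.sec-api.io/datasets/form-dos-files.json"


def fetch_index(url=INDEX_URL):
    """Download and parse the dataset index (no API key required)."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def parse_ts(ts):
    """Parse timestamps such as 2026-04-25T03:02:38.726Z into aware datetimes."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def refreshed_since(index, cutoff):
    """Return the keys of containers updated after the cutoff timestamp."""
    cutoff_dt = parse_ts(cutoff)
    return [
        container["key"]
        for container in index["containers"]
        if parse_ts(container["updatedAt"]) > cutoff_dt
    ]
```

Calling refreshed_since(fetch_index(), "2026-04-24T00:00:00.000Z") against the example response above would return ["2026/2026-04.zip"].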

Download Entire Dataset: https://api.sec-api.io/datasets/form-dos-files.zip?token=YOUR_API_KEY

Downloads the complete Form DOS Files dataset as a single ZIP archive containing all monthly containers from June 2015 to the present. This endpoint requires an API key.

Download Single Container: https://api.sec-api.io/datasets/form-dos-files/2026/2026-04.zip?token=YOUR_API_KEY

Downloads one individual monthly container instead of the full archive. Use the key value from any container entry in the index response to construct the URL, or pass it to node scripts/download-sec-api-file.js to fetch a specific month. This endpoint requires an API key.
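
Constructing the container URL from a key takes only a few lines. The sketch below follows the URL pattern shown above. The helper names are illustrative, and YOUR_API_KEY remains a placeholder.

```python
import shutil
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE_URL = "https://api.sec-api.io/datasets/form-dos-files"


def container_url(key, token):
    """Build the download URL for one container key, e.g. 2026/2026-04.zip."""
    return f"{BASE_URL}/{quote(key)}?{urlencode({'token': token})}"


def download_container(key, token, dest_path):
    """Stream one monthly container to disk and return the destination path."""
    with urlopen(container_url(key, token), timeout=60) as resp, open(dest_path, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest_path
```

Streaming with shutil.copyfileobj avoids holding the whole archive in memory, which matters more for the full-dataset download than for a single month.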

Frequently Asked Questions

What forms does this dataset cover?

The dataset covers Form DOS (a draft Regulation A offering statement submitted under Rule 252(d) for non-public SEC staff review) and Form DOS/A (any subsequent amendment to that draft). Both share the same EDGAR Regulation A "oneafiler" XML schema and the same Part II / Part III document conventions.

What does one record represent?

One record is one complete EDGAR submission — a single accession-numbered folder containing metadata.json, the primary filename1.xml "oneafiler" submission, the SGML-wrapped Part II preliminary offering circular, and every Part III exhibit attached to that submission. An issuer that goes through several rounds of confidential review surfaces as a chain of distinct records (one DOS plus a sequence of DOS/A amendments) rather than as a single consolidated dossier.
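
To enumerate the records in a downloaded monthly container, the files can be grouped by their parent folder. This sketch assumes that each record sits in its own accession-numbered folder inside the ZIP, as described above. The exact nesting depth is an assumption, so the code keys on the immediate parent folder of each file.

```python
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath


def records_in_container(zip_file):
    """Map each record folder name to the sorted list of file names inside it.

    zip_file may be a path or a binary file-like object.
    """
    records = defaultdict(list)
    with zipfile.ZipFile(zip_file) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            path = PurePosixPath(info.filename)
            if len(path.parts) < 2:
                continue  # skip files that are not inside any folder
            # Assumption: the immediate parent folder is the accession-numbered record.
            records[path.parent.name].append(path.name)
    return {folder: sorted(names) for folder, names in records.items()}
```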

Who is eligible to file Form DOS?

Form DOS is restricted to issuers that have not previously had a Regulation A offering statement qualified by the SEC and have not previously had a Securities Act registration statement (Form S-1, S-11, F-1, etc.) declared effective. Eligible filers are typically privately held domestic operating companies organized in the U.S. or Canada that are considering a Tier 1 (up to $20 million in 12 months) or Tier 2 (up to $75 million in 12 months) Regulation A offering.

What time period does the dataset cover?

The earliest filings date from June 19, 2015, when amended Regulation A and Rule 252(d) took effect under Title IV of the JOBS Act; the index API reports the earliest sample date as 2015-06-01, the start of the first monthly container. There is no paper-era predecessor, so the dataset is complete from inception forward and continues through the present.

What file formats are inside a record?

A record contains a metadata.json filing header, a filename1.xml "oneafiler" primary submission, and one HTML, plain-text, or PDF document per Part II circular and Part III exhibit, each wrapped in the standard EDGAR <DOCUMENT><TYPE><SEQUENCE><FILENAME><TEXT> SGML envelope. The dataset's overall file-type set is XML, HTML, JSON, TXT, and PDF, packaged in monthly ZIP container archives.
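
The SGML envelope can be unpacked with a small regular-expression parser. The sketch below assumes the conventional EDGAR layout in which each header tag (TYPE, SEQUENCE, FILENAME, and optionally DESCRIPTION) occupies its own line with no closing tag, followed by a TEXT block. It is a minimal illustration and does not handle uuencoded binary payloads.

```python
import re

DOC_RE = re.compile(r"<DOCUMENT>(.*?)</DOCUMENT>", re.DOTALL)
HEADER_RE = re.compile(r"^<(TYPE|SEQUENCE|FILENAME|DESCRIPTION)>(.*)$", re.MULTILINE)
TEXT_RE = re.compile(r"<TEXT>\n?(.*?)\n?</TEXT>", re.DOTALL)


def parse_documents(sgml):
    """Split an SGML-wrapped string into one dict per <DOCUMENT> block."""
    docs = []
    for match in DOC_RE.finditer(sgml):
        block = match.group(1)
        text_match = TEXT_RE.search(block)
        # Only read header tags that appear before <TEXT>, so that tags
        # inside the document body are never mistaken for headers.
        header_part = block[: text_match.start()] if text_match else block
        doc = {name.lower(): value.strip() for name, value in HEADER_RE.findall(header_part)}
        doc["text"] = text_match.group(1) if text_match else ""
        docs.append(doc)
    return docs
```

Each returned dict carries the lowercase header names as keys plus a text key holding the unwrapped HTML, plain-text, or PDF payload.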

How does Form DOS differ from Form 1-A?

Form 1-A is the public Regulation A offering statement that the SEC actually qualifies; Form DOS is the confidential draft of the same content, submitted for non-public staff review before any public filing. DOS qualifies nothing on its own — Rule 252(d) requires that all confidential drafts be publicly refiled as exhibits to the eventual Form 1-A at least 21 calendar days before the issuer requests qualification. Issuers that go directly to a public Form 1-A do not appear in this dataset.

How do I reconstruct the full history of one offering?

Each DOS and DOS/A record carries its own accession number and stands alone; it is not a delta against the previous submission. To reconstruct one offering's confidential-review history, join records by issuer CIK and Regulation A fileNo (the 367-NNNNN value in entities[]) and order them by filedAt. Exhibits that an issuer chose not to refile in a given amendment must be retrieved from the prior record in the chain.
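
A sketch of that reconstruction, including the fallback to earlier records for exhibits that were not refiled, might look like the following. It assumes each record is a dict with accessionNo, filedAt, an entities list whose items carry cik and fileNo, and an exhibits list of exhibit type strings. These key names are assumptions for the example and may not match metadata.json exactly.

```python
def offering_history(records, cik, file_no):
    """Return the records for one offering, ordered from first draft to last."""
    chain = [
        rec
        for rec in records
        if any(
            entity.get("cik") == cik and entity.get("fileNo") == file_no
            for entity in rec.get("entities", [])
        )
    ]
    return sorted(chain, key=lambda rec: rec["filedAt"])


def latest_exhibit_sources(chain):
    """Map each exhibit type to the accession number that most recently filed it.

    Expects the chain in filing order; later records overwrite earlier ones,
    so an exhibit missing from an amendment still points to the earlier record.
    """
    sources = {}
    for rec in chain:
        for exhibit_type in rec.get("exhibits", []):
            sources[exhibit_type] = rec["accessionNo"]
    return sources
```

If the first DOS carried both EX1A-4 SUBS AGMT and EX1A-12 OPN CNSL and a later DOS/A refiled only the subscription agreement, the result points the legality opinion back to the original DOS accession.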