Form SB-2 Files Dataset

The Form SB-2 Files Dataset is a closed archive of every Securities Act of 1933 registration statement on Form SB-2 and every amendment on Form SB-2/A submitted to EDGAR by "small business issuers" under former Regulation S-B. Each record is a single EDGAR submission: one accession-number folder containing a structured metadata.json plus every document the registrant filed with the SEC, except binary image files. The dataset spans April 1995 (the earliest SB-2 submissions in the archive, made during the phase-in of mandatory EDGAR filing) through February 2008, when the form's rescission took effect; submissions dated after the rescission are generally amendments tied to pre-rescission registration statements. Filers are issuers of the securities being registered: micro-cap companies, shell vehicles, recent reverse-merger entities, and resource-exploration startups that qualified for the Regulation S-B scaled-disclosure regime. The corpus is distributed as monthly ZIP containers organized by year and covers form types SB-2 and SB-2/A.

Update Frequency: Daily
Updated at: 2026-04-14
Earliest Sample Date: 1995-04-01
Total Size: 3.1 GB
Total Records: 143,958
Container Format: ZIP
Content Types: TXT, JSON, HTML, XFD, PDF, FRM
Form Types: SB-2, SB-2/A

Dataset APIs

Programmatically retrieve the full list of dataset archive files, download URLs and dataset metadata.

Dataset Index JSON API

Download the entire dataset as a single archive file.

Download Entire Dataset:

Download a single container file (e.g. monthly archive) from the dataset.

Download Single Container:

Dataset Files

155 files · 3.1 GB
2008-02.zip · 5.6 MB · 255 records
2008-01.zip · 46.9 MB · 1,360 records
2007-12.zip · 40.1 MB · 1,397 records
2007-11.zip · 37.0 MB · 1,341 records
2007-10.zip · 49.6 MB · 1,366 records
2007-09.zip · 32.3 MB · 1,053 records
2007-08.zip · 43.1 MB · 1,211 records
2007-07.zip · 38.3 MB · 1,210 records
2007-06.zip · 46.0 MB · 1,285 records
2007-05.zip · 58.3 MB · 1,429 records
2007-04.zip · 34.3 MB · 1,266 records
2007-03.zip · 37.2 MB · 1,128 records
2007-02.zip · 46.6 MB · 1,457 records
2007-01.zip · 47.0 MB · 1,452 records
2006-12.zip · 46.6 MB · 1,567 records
2006-11.zip · 35.9 MB · 1,383 records
2006-10.zip · 41.1 MB · 1,427 records
2006-09.zip · 33.7 MB · 1,275 records
2006-08.zip · 42.8 MB · 1,565 records
2006-07.zip · 41.4 MB · 1,468 records
2006-06.zip · 48.0 MB · 1,755 records
2006-05.zip · 46.7 MB · 1,702 records
2006-04.zip · 38.6 MB · 1,444 records
2006-03.zip · 34.3 MB · 1,320 records
2006-02.zip · 40.1 MB · 1,632 records
2006-01.zip · 52.8 MB · 1,457 records
2005-12.zip · 37.2 MB · 1,485 records
2005-11.zip · 31.4 MB · 1,485 records
2005-10.zip · 34.0 MB · 1,149 records
2005-09.zip · 29.7 MB · 1,390 records
2005-08.zip · 37.7 MB · 1,719 records
2005-07.zip · 31.8 MB · 1,364 records
2005-06.zip · 38.9 MB · 1,414 records
2005-05.zip · 32.7 MB · 1,258 records
2005-04.zip · 30.1 MB · 1,535 records
2005-03.zip · 23.9 MB · 1,041 records
2005-02.zip · 27.6 MB · 1,329 records
2005-01.zip · 27.0 MB · 1,300 records
2004-12.zip · 32.3 MB · 1,572 records
2004-11.zip · 30.0 MB · 1,348 records
2004-10.zip · 24.6 MB · 1,205 records
2004-09.zip · 22.7 MB · 1,136 records
2004-08.zip · 27.3 MB · 1,206 records
2004-07.zip · 30.2 MB · 1,239 records
2004-06.zip · 29.2 MB · 1,338 records
2004-05.zip · 21.8 MB · 1,183 records
2004-04.zip · 17.5 MB · 812 records
2004-03.zip · 17.6 MB · 811 records
2004-02.zip · 28.0 MB · 919 records
2004-01.zip · 17.3 MB · 796 records
2003-12.zip · 17.8 MB · 813 records
2003-11.zip · 10.5 MB · 553 records
2003-10.zip · 18.3 MB · 897 records
2003-09.zip · 14.3 MB · 812 records
2003-08.zip · 13.9 MB · 844 records
2003-07.zip · 12.8 MB · 691 records
2003-06.zip · 16.0 MB · 810 records
2003-05.zip · 14.5 MB · 771 records
2003-04.zip · 17.5 MB · 746 records
2003-03.zip · 11.2 MB · 603 records
2003-02.zip · 11.9 MB · 648 records
2003-01.zip · 14.6 MB · 738 records
2002-12.zip · 12.9 MB · 698 records
2002-11.zip · 17.1 MB · 761 records
2002-10.zip · 18.9 MB · 911 records
2002-09.zip · 15.1 MB · 856 records
2002-08.zip · 18.7 MB · 761 records
2002-07.zip · 16.8 MB · 1,079 records
2002-06.zip · 15.7 MB · 975 records
2002-05.zip · 17.8 MB · 979 records
2002-04.zip · 15.3 MB · 956 records
2002-03.zip · 11.4 MB · 663 records
2002-02.zip · 14.9 MB · 937 records
2002-01.zip · 13.6 MB · 857 records
2001-12.zip · 13.5 MB · 796 records
2001-11.zip · 17.2 MB · 936 records
2001-10.zip · 18.2 MB · 1,029 records
2001-09.zip · 14.9 MB · 872 records
2001-08.zip · 17.8 MB · 1,091 records
2001-07.zip · 15.6 MB · 979 records
2001-06.zip · 14.6 MB · 825 records
2001-05.zip · 16.0 MB · 991 records
2001-04.zip · 11.8 MB · 896 records
2001-03.zip · 13.1 MB · 917 records
2001-02.zip · 17.5 MB · 1,187 records
2001-01.zip · 13.9 MB · 890 records
2000-12.zip · 17.4 MB · 1,212 records
2000-11.zip · 15.7 MB · 1,139 records
2000-10.zip · 15.3 MB · 1,004 records
2000-09.zip · 17.7 MB · 1,191 records
2000-08.zip · 16.5 MB · 1,115 records
2000-07.zip · 12.0 MB · 756 records
2000-06.zip · 14.9 MB · 1,000 records
2000-05.zip · 14.3 MB · 990 records
2000-04.zip · 14.1 MB · 1,029 records
2000-03.zip · 10.3 MB · 735 records
2000-02.zip · 14.3 MB · 968 records
2000-01.zip · 10.0 MB · 672 records
1999-12.zip · 16.5 MB · 1,135 records
1999-11.zip · 11.1 MB · 836 records
1999-10.zip · 11.0 MB · 724 records
1999-09.zip · 13.1 MB · 905 records
1999-08.zip · 9.8 MB · 596 records
1999-07.zip · 11.4 MB · 771 records
1999-06.zip · 8.9 MB · 603 records
1999-05.zip · 11.3 MB · 632 records
1999-04.zip · 10.0 MB · 604 records
1999-03.zip · 8.5 MB · 581 records
1999-02.zip · 11.5 MB · 769 records
1999-01.zip · 8.9 MB · 516 records
1998-12.zip · 11.6 MB · 656 records
1998-11.zip · 12.5 MB · 712 records
1998-10.zip · 10.6 MB · 635 records
1998-09.zip · 13.9 MB · 808 records
1998-08.zip · 12.9 MB · 829 records
1998-07.zip · 16.0 MB · 1,008 records
1998-06.zip · 12.6 MB · 829 records
1998-05.zip · 15.9 MB · 859 records
1998-04.zip · 11.5 MB · 683 records
1998-03.zip · 12.1 MB · 808 records
1998-02.zip · 14.5 MB · 682 records
1998-01.zip · 12.4 MB · 758 records
1997-12.zip · 13.0 MB · 820 records
1997-11.zip · 14.3 MB · 859 records
1997-10.zip · 15.9 MB · 942 records
1997-09.zip · 15.8 MB · 944 records
1997-08.zip · 15.8 MB · 917 records
1997-07.zip · 15.4 MB · 925 records
1997-06.zip · 13.2 MB · 750 records
1997-05.zip · 16.0 MB · 918 records
1997-04.zip · 11.8 MB · 665 records
1997-03.zip · 14.8 MB · 882 records
1997-02.zip · 15.8 MB · 939 records
1997-01.zip · 18.6 MB · 1,108 records
1996-12.zip · 16.0 MB · 951 records
1996-11.zip · 18.4 MB · 1,014 records
1996-10.zip · 20.6 MB · 1,267 records
1996-09.zip · 16.6 MB · 1,033 records
1996-08.zip · 13.1 MB · 852 records
1996-07.zip · 14.2 MB · 787 records
1996-06.zip · 12.0 MB · 659 records
1996-05.zip · 9.8 MB · 529 records
1996-04.zip · 187.0 KB · 9 records
1996-03.zip · 434.7 KB · 31 records
1996-02.zip · 598.8 KB · 30 records
1996-01.zip · 506.8 KB · 23 records
1995-12.zip · 651.2 KB · 32 records
1995-11.zip · 910.2 KB · 51 records
1995-10.zip · 665.9 KB · 47 records
1995-09.zip · 1.1 MB · 55 records
1995-08.zip · 1.3 MB · 64 records
1995-07.zip · 734.4 KB · 34 records
1995-06.zip · 469.0 KB · 31 records
1995-05.zip · 423.5 KB · 29 records
1995-04.zip · 75.6 KB · 4 records

What This Dataset Contains

The dataset packages the complete EDGAR submission for every SB-2 and SB-2/A accession across the form's EDGAR-era lifespan. Form SB-2 was the long-form Securities Act registration statement for small business issuers under former Regulation S-B, the scaled-disclosure regime that applied to companies with revenues and public float each below $25 million. Functionally it served the same purpose as Form S-1 (registering securities for sale to the public) but with reduced obligations: no selected financial data requirement, two years of audited financial statements rather than three, simpler executive compensation disclosure, and abbreviated business and MD&A requirements. The form was promulgated alongside Regulation S-B in 1992, appears on EDGAR from April 1995 onward, and was rescinded effective February 4, 2008, when the SEC eliminated the "small business issuer" category and replaced it with the "smaller reporting company" framework folded into Regulation S-K.

Every record bundles three layers in one place: a structured JSON manifest describing the filing and the filer, the prospectus and exhibit content of the registration statement itself, and the SGML/EDGAR header that wraps every document body. The dataset preserves the underlying filing as filed — SGML envelopes intact, filer-controlled filenames unchanged, and exhibit ordering as submitted — so it functions as a source-of-truth bundle rather than an extracted slice. Containers are monthly ZIP files (YYYY/YYYY-MM.zip) covering form types SB-2 and SB-2/A, with file payloads in TXT, JSON, HTML, XFD, PDF, and FRM formats.
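As an illustration of the container layout, the monthly paths can be enumerated directly from the YYYY/YYYY-MM.zip naming convention. This is a minimal sketch, assuming the first and last containers are 1995-04 and 2008-02 as shown in the file listing; the helper name container_paths is hypothetical, and the base download URL is left out because it is not given here.

```python
from typing import Iterator, Tuple


def container_paths(first: Tuple[int, int] = (1995, 4),
                    last: Tuple[int, int] = (2008, 2)) -> Iterator[str]:
    """Yield relative container paths such as '1995/1995-04.zip', oldest first."""
    year, month = first
    while (year, month) <= last:
        yield f"{year:04d}/{year:04d}-{month:02d}.zip"
        # Advance one calendar month, rolling over into the next year.
        month += 1
        if month > 12:
            year, month = year + 1, 1


paths = list(container_paths())
print(paths[0], paths[-1], len(paths))
```

The count produced by the default range matches the 155 files reported in the listing, which is a cheap sanity check before starting a bulk download.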

Content Structure of a Single Record

What one record represents

One record is a single EDGAR submission of either Form SB-2 (an initial small-business issuer registration statement under the Securities Act of 1933) or Form SB-2/A (a pre-effectiveness or post-effectiveness amendment to a previously filed SB-2). On disk, a record is one accession-number subdirectory within a monthly ZIP container. The subdirectory is named with the 18-digit EDGAR accession number with dashes stripped (for example, 0001144204-08-006062 becomes 000114420408006062). Inside that folder sit a single metadata.json plus every document the registrant submitted to EDGAR for that accession, with the sole exception of binary image files.
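The folder naming rule is easy to apply in both directions. The sketch below converts between the dashed accession number and the 18-digit folder name, and groups the members of a container by accession folder. It assumes the accession folder is the directory immediately above each file inside the ZIP (the exact internal path prefix is not documented here); the function names and the in-memory demo archive are illustrative only.

```python
import io
import re
import zipfile
from collections import defaultdict
from typing import Dict, List

ACCESSION_DASHED = re.compile(r"^(\d{10})-(\d{2})-(\d{6})$")
ACCESSION_FOLDER = re.compile(r"^\d{18}$")


def accession_to_folder(accession_no: str) -> str:
    """'0001144204-08-006062' -> '000114420408006062'."""
    match = ACCESSION_DASHED.match(accession_no)
    if match is None:
        raise ValueError(f"not a dashed accession number: {accession_no!r}")
    return "".join(match.groups())


def folder_to_accession(folder: str) -> str:
    """Inverse mapping: re-insert dashes after the 10th and 12th digits."""
    if ACCESSION_FOLDER.match(folder) is None:
        raise ValueError(f"not an 18-digit folder name: {folder!r}")
    return f"{folder[:10]}-{folder[10:12]}-{folder[12:]}"


def group_by_accession(archive: zipfile.ZipFile) -> Dict[str, List[str]]:
    """Map each accession folder in a container to the file names inside it."""
    records: Dict[str, List[str]] = defaultdict(list)
    for name in archive.namelist():
        parts = name.split("/")
        # Assumption: the accession folder is the directory directly above the file.
        if len(parts) >= 2 and parts[-1] and ACCESSION_FOLDER.match(parts[-2]):
            records[parts[-2]].append(parts[-1])
    return dict(records)


# Demonstration on a tiny in-memory ZIP with made-up member names.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("000114420408006062/metadata.json", "{}")
    zf.writestr("000114420408006062/example_sb2.htm", "<DOCUMENT>")
with zipfile.ZipFile(buffer) as zf:
    grouped = group_by_accession(zf)
print(grouped)
```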

The underlying filing

The underlying document is a prospectus-centric registration statement. It opens with the EDGAR/SGML cover and the form facing page (registrant name, state of incorporation, IRS number, primary SIC code, principal executive offices, agent for service, and the calculation-of-registration-fee table). It then steps through the prospectus proper, which under Regulation S-B item-numbered scaled disclosure typically contains: prospectus cover and outside back cover; prospectus summary; risk factors; cautionary language regarding forward-looking statements; use of proceeds; determination of offering price; dilution; selling shareholders table (where resale is contemplated); plan of distribution; description of securities to be registered; interests of named experts and counsel; description of business; description of property; legal proceedings; market for common equity and related shareholder matters; management's discussion and analysis or, for issuers without revenue history, the SB-2-specific shorter "plan of operation"; changes in and disagreements with accountants; directors, executive officers, promoters and control persons; executive compensation; security ownership of certain beneficial owners and management; certain relationships and related transactions; and the audited financial statements with notes. After the prospectus, Part II of the registration statement contains indemnification of directors and officers, recent sales of unregistered securities, the exhibit index, undertakings, and the signature block executed by the registrant, principal executive officer, principal financial officer, principal accounting officer, and a majority of the directors.

Folder layout

Each accession folder contains exactly one metadata.json and a flat set of document files; there are no nested subfolders. The metadata.json is the structured anchor: it enumerates every document the registrant filed under the accession, identifies the filer entity, and carries the form type, filed-at timestamp, accession number, and SEC file number. The companion document files carry the prospectus and exhibit content. Every document file — whether named with .htm, .html, .txt, .frm, .xfd, or .pdf — is wrapped in EDGAR's SGML <DOCUMENT> envelope, so each file begins with the pattern

<DOCUMENT>
<TYPE>SB-2
<SEQUENCE>1
<FILENAME>...
<DESCRIPTION>...
<TEXT>
... document body ...
</TEXT>
</DOCUMENT>

The <TYPE> tag holds the EDGAR document classifier (SB-2, SB-2/A, EX-3.1, EX-5.1, EX-10.3, EX-23.1, EX-24.1, EX-99, CORRESP, etc.), and <SEQUENCE>1 is reserved for the main registration statement. Inside <TEXT>, the body is either a full <html>…</html> document (often emitted by filing-agent tooling such as Vintage Filings's EDGARizer, RDG, or Donnelley, with embedded inline CSS and styled tables) or a legacy fixed-width ASCII layout using EDGAR's <TABLE>/<S>/<C> financial-table markers to delimit columnar financial data.
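Reading the envelope header is a line-oriented task, since each tag sits on its own line before <TEXT>. The following is a minimal sketch, assuming the header tags are uppercase and unclosed as in the pattern above; parse_envelope_header and the sample document are illustrative, not part of any published tooling.

```python
import re
from typing import Dict

HEADER_TAG = re.compile(r"^<(TYPE|SEQUENCE|FILENAME|DESCRIPTION)>(.*)$")


def parse_envelope_header(raw: str) -> Dict[str, str]:
    """Collect the header tags that appear before <TEXT>."""
    header: Dict[str, str] = {}
    for line in raw.splitlines():
        stripped = line.strip()
        if stripped.upper().startswith("<TEXT>"):
            break  # the document body starts here
        match = HEADER_TAG.match(stripped)
        if match:
            header[match.group(1).lower()] = match.group(2).strip()
    return header


# Made-up exhibit file used only to exercise the parser.
sample = """<DOCUMENT>
<TYPE>EX-5.1
<SEQUENCE>3
<FILENAME>legalopinion.htm
<DESCRIPTION>OPINION OF COUNSEL
<TEXT>
<html><body>...</body></html>
</TEXT>
</DOCUMENT>
"""
header = parse_envelope_header(sample)
print(header)
```

Stopping at <TEXT> matters because prospectus bodies can contain lines that look like tags; only the lines ahead of the body belong to the envelope.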

metadata.json

metadata.json mirrors the SEC-API filing-object shape and exposes the following top-level fields:

  • id — opaque hex hash identifier, stable per filing.
  • accessionNo — canonical EDGAR accession number with dashes (e.g., 0001144204-08-006062).
  • formType — SB-2 or SB-2/A.
  • description — human-readable form description.
  • filedAt — ISO 8601 timestamp with timezone (Eastern, matching EDGAR acceptance).
  • linkToFilingDetails — absolute URL of the primary registration statement on EDGAR.
  • linkToTxt — URL of the consolidated SGML submission .txt on EDGAR.
  • linkToHtml — URL of the EDGAR accession -index.htm page.
  • linkToXbrl — empty for this form type; SB-2 predates XBRL applicability and was rescinded before any phase-in could reach it.
  • documentFormatFiles — array of every document referenced by the submission.
  • dataFiles — empty across SB-2 records (no XBRL or financial-data files were ever attached to the form).
  • entities — array of registrant/filer records.

Each documentFormatFiles[] element carries sequence (numeric string; the synthetic "Complete submission text file" entry uses a blank sequence), type (the EDGAR document classifier), documentUrl (absolute EDGAR URL), size (bytes as a string), and an optional description (e.g., "OPINION OF QUARLES & BRADY LLP AS TO THE LEGALITY OF SECURITIES BEING REGISTERED", "GRAPHIC", "Complete submission text file").

Each entities[] element carries companyName (display name with parenthesized role suffix such as "(Filer)"), cik (10-digit zero-padded), irsNo (9-digit EIN, or "000000000" if not supplied), fileNo (the Securities Act registration file number, characteristically prefixed 333-), filmNo (8-digit EDGAR film number), act (always "33" because SB-2 is a Securities Act registration), type (form-type echo), sic (SIC code plus label, sometimes omitted), stateOfIncorporation (two-letter code), fiscalYearEnd (MMDD; sometimes omitted for shell companies), and tickers (often empty because SB-2 issuers were frequently not yet trading at the time of registration).
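To make the field layout concrete, here is a sketch that loads a metadata.json-shaped object and pulls out the values most often needed for indexing. The sample record is hypothetical and heavily trimmed (every value in it is made up), and summarize is an illustrative helper; only the field names come from the description above.

```python
import json
from datetime import datetime

# Hypothetical, heavily trimmed metadata.json; every value below is made up.
SAMPLE_METADATA = json.loads("""
{
  "accessionNo": "0000000000-08-000001",
  "formType": "SB-2/A",
  "filedAt": "2008-01-15T16:30:00-05:00",
  "documentFormatFiles": [
    {"sequence": "1", "type": "SB-2/A", "documentUrl": "https://example.invalid/main.htm", "size": "512000"},
    {"sequence": "2", "type": "EX-5.1", "documentUrl": "https://example.invalid/ex51.htm", "size": "8000"},
    {"sequence": " ", "type": "SB-2/A", "documentUrl": "https://example.invalid/full.txt", "size": "530000",
     "description": "Complete submission text file"}
  ],
  "entities": [
    {"companyName": "Example Corp (Filer)", "cik": "0000000000", "fileNo": "333-000000", "act": "33"}
  ]
}
""")


def summarize(metadata: dict) -> dict:
    """Pull out the fields most often needed to index a record."""
    filed_at = datetime.fromisoformat(metadata["filedAt"])
    numbered_docs = [
        (int(doc["sequence"]), doc["type"])
        for doc in metadata.get("documentFormatFiles", [])
        if doc.get("sequence", "").strip()  # skip the blank-sequence synthetic entry
    ]
    file_numbers = sorted(
        {e["fileNo"] for e in metadata.get("entities", []) if e.get("fileNo")}
    )
    return {
        "accession": metadata["accessionNo"],
        "form": metadata["formType"],
        "filed_date": filed_at.date().isoformat(),
        "file_numbers": file_numbers,
        "documents": sorted(numbered_docs),
    }


summary = summarize(SAMPLE_METADATA)
print(summary)
```

Because sequence and size are strings, they are converted explicitly; the blank-sequence entry is filtered out before the integer conversion so it cannot raise.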

Primary registration document (sequence 1)

The sequence-1 document is the registration statement and prospectus itself, with <TYPE> set to SB-2 or SB-2/A. For HTML filings, the body is a single long HTML document containing the entire prospectus and Part II in reading order: cover page and calculation-of-registration-fee table at the top, prospectus narrative items in Regulation S-B order, audited financial statements as an embedded section (rendered as inline HTML tables or as image-free ASCII tables converted to HTML), and Part II followed by signatures. For plain-text filings, the same content is rendered as a fixed-width ASCII document. Amendment filings (SB-2/A) restate the entire registration statement rather than only the changed pages; redlines or bracketed change indicators are filer-discretionary and not standardized.

Exhibits

Exhibits are filed as additional sequenced documents in the same accession folder, each in its own SGML-wrapped file. The exhibit numbering follows Item 601 of Regulation S-B, which mirrored Regulation S-K with adjustments. The exhibit taxonomy used in this dataset is:

  • EX-3.x — articles of incorporation, certificates of designation, bylaws, and committee charters. Multiple EX-3 exhibits often appear (EX-3.1 through EX-3.4 are common).
  • EX-4.x — instruments defining the rights of security holders: specimen stock certificates, warrant agreements, indentures, registration-rights agreements.
  • EX-5.1 — opinion of counsel as to the legality of the securities being registered. Required and almost universally present.
  • EX-10.x — material contracts: consulting and employment agreements, share-purchase agreements, leases, broker-dealer placement agreements, license agreements. Frequently the largest exhibit set by count.
  • EX-21.x — list of subsidiaries (often omitted for shell-company filers with no subsidiaries).
  • EX-23.x — consents of independent registered public accountants and consents of named legal counsel. Auditor consent (EX-23.1) is functionally always present because audited financials are incorporated.
  • EX-24.1 — power of attorney granted by directors and officers to allow execution of subsequent amendments.
  • EX-99 and EX-99.x — additional ancillary documents: subscription agreements, escrow agreements, sample investor forms, press releases.
  • CORRESP — correspondence with the SEC staff, treated as a document type within the submission rather than a numbered prospectus exhibit.
  • GRAPHIC — image entries (JPG, GIF) referenced by the prospectus for logos, cover-page artwork, geological maps, and similar visuals. The GRAPHIC entries appear in metadata.json, but the binary image files are intentionally excluded from the on-disk record.
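The taxonomy above lends itself to a small lookup keyed on the exhibit number before the decimal point. This is a sketch under the assumption that type strings follow the EX-N or EX-N.M shape listed here; the family labels and the function name classify_document_type are illustrative choices, not fields of the dataset.

```python
import re

# Labels are shorthand for the families described above.
EXHIBIT_FAMILIES = {
    "3": "charter documents and bylaws",
    "4": "instruments defining security-holder rights",
    "5": "legality opinion",
    "10": "material contracts",
    "21": "subsidiaries",
    "23": "consents",
    "24": "power of attorney",
    "99": "additional exhibits",
}
EXHIBIT_TYPE = re.compile(r"^EX-(\d+)(?:\.\d+)?", re.IGNORECASE)


def classify_document_type(doc_type: str) -> str:
    """Map an EDGAR document type string to a coarse family label."""
    normalized = doc_type.strip().upper()
    if normalized in {"SB-2", "SB-2/A"}:
        return "registration statement"
    if normalized in {"CORRESP", "GRAPHIC"}:
        return normalized.lower()
    match = EXHIBIT_TYPE.match(normalized)
    if match:
        return EXHIBIT_FAMILIES.get(match.group(1), "other exhibit")
    return "unclassified"


for example in ("SB-2/A", "EX-3.1", "EX-10.3", "EX-99", "GRAPHIC", "EX-16.1"):
    print(example, "->", classify_document_type(example))
```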

SGML envelope

Every body file in the folder, regardless of file extension, opens with the same <DOCUMENT>/<TYPE>/<SEQUENCE>/<FILENAME>/<DESCRIPTION>/<TEXT> header block and closes with </TEXT></DOCUMENT>. The envelope is preserved as filed — it is not stripped during dataset assembly. Consumers extracting the raw prospectus or exhibit body must skip past the <TEXT> opening tag and stop at the closing </TEXT>. For HTML payloads the inner content is a self-contained <html> document and may include filer-tool signatures (e.g., <!-- Document Created using EDGARizer HTML 3.0.4.0 -->) that are useful as provenance markers.
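Extracting the body and the optional tool signature can be sketched with two regular expressions. The code assumes one <TEXT> block per file, as described above, and takes the first closing </TEXT> it finds, so a body that quotes that tag literally would be truncated; extract_body and filing_tool are hypothetical helper names.

```python
import re
from typing import Optional

TEXT_BLOCK = re.compile(r"<TEXT>\s*(.*?)\s*</TEXT>", re.IGNORECASE | re.DOTALL)
CREATED_USING = re.compile(r"<!--\s*Document Created using\s+(.*?)\s*-->", re.IGNORECASE)


def extract_body(raw: str) -> str:
    """Return the content between <TEXT> and the first </TEXT>."""
    match = TEXT_BLOCK.search(raw)
    if match is None:
        raise ValueError("no <TEXT>...</TEXT> block found")
    return match.group(1)


def filing_tool(body: str) -> Optional[str]:
    """Return the tool named in a 'Document Created using' comment, if any."""
    match = CREATED_USING.search(body)
    return match.group(1) if match else None


# Made-up file content used only to exercise the helpers.
raw = (
    "<DOCUMENT>\n<TYPE>SB-2\n<SEQUENCE>1\n<FILENAME>example.htm\n<TEXT>\n"
    "<html>\n<!-- Document Created using EDGARizer HTML 3.0.4.0 -->\n"
    "<body>Prospectus</body>\n</html>\n</TEXT>\n</DOCUMENT>\n"
)
body = extract_body(raw)
print(filing_tool(body))
```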

What is included

  • The full structured metadata.json for every accession.
  • The primary SB-2 or SB-2/A registration statement, in whatever format the registrant filed (HTML or plain text).
  • All exhibit text and HTML documents referenced in the EDGAR submission, including legal opinions, consents, charters, bylaws, material contracts, and subscription agreements.
  • Correspondence (CORRESP) documents when present in the submission.
  • Occasional XFD (paper-form facsimile data) and FRM files when the registrant attached them.
  • PDF attachments where the registrant filed a PDF as part of the submission.
  • The complete SGML document headers wrapping each body, preserving the as-filed <TYPE>, <SEQUENCE>, <FILENAME>, and <DESCRIPTION> values.

What is excluded or structurally separate

  • Image files referenced by the prospectus (JPG, GIF, PNG) are not redistributed inside the record. The GRAPHIC-typed entries remain in documentFormatFiles[] with their EDGAR URLs, so consumers can still fetch them from EDGAR if needed, but the bytes are not in the ZIP.
  • The synthetic "Complete submission text file" — the single concatenated SGML .txt that EDGAR generates for an entire accession — is referenced in metadata.json (with a blank sequence value) but is not redistributed as a separate file in the folder. The same content is reconstructible by concatenating the included document files together with their preserved SGML envelopes.
  • XBRL instance documents and related taxonomy files do not exist for this form type because SB-2 was rescinded before XBRL applied to it; dataFiles is empty and linkToXbrl is the empty string in every record.
  • Successor registration filings on Form S-1 (post-rescission) and other smaller-reporting-company filings are not part of this dataset; they live in the S-1 corpus.
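The exclusions above can be checked mechanically by reconciling documentFormatFiles[] against the files actually present in a folder. The sketch below assumes that the on-disk file name equals the last path segment of documentUrl, which is an assumption rather than a documented guarantee; explain_missing and the sample entries are hypothetical.

```python
import posixpath
from typing import Iterable, List, Tuple
from urllib.parse import urlparse


def explain_missing(document_format_files: Iterable[dict],
                    on_disk: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (url, reason) for each referenced document absent from the folder."""
    present = {name.lower() for name in on_disk}
    missing = []
    for doc in document_format_files:
        url = doc["documentUrl"]
        # Assumption: the local file name is the final segment of the EDGAR URL.
        filename = posixpath.basename(urlparse(url).path)
        if filename.lower() in present:
            continue
        if doc.get("type", "").upper() == "GRAPHIC":
            reason = "image excluded from redistribution"
        elif doc.get("description", "") == "Complete submission text file":
            reason = "synthetic complete submission not redistributed"
        else:
            reason = "unexpected: not covered by the documented exclusions"
        missing.append((url, reason))
    return missing


# Made-up entries shaped like documentFormatFiles[] elements.
docs = [
    {"type": "SB-2", "documentUrl": "https://example.invalid/a/main.htm"},
    {"type": "GRAPHIC", "documentUrl": "https://example.invalid/a/logo.jpg"},
    {"type": "SB-2", "documentUrl": "https://example.invalid/a/full.txt",
     "description": "Complete submission text file"},
]
result = explain_missing(docs, ["metadata.json", "main.htm"])
for url, reason in result:
    print(url, "->", reason)
```

Anything reported with the "unexpected" reason deserves a closer look, since the two documented exclusions should account for every gap.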

Changes in required content and structure over time

The SB-2 corpus has a tightly bounded lifespan, from April 1995 (the first SB-2 submissions on EDGAR in this archive) through February 2008 (form rescission), and the disclosure requirements were comparatively stable across that window because Regulation S-B itself was largely unchanged after its 1992 promulgation. The substantive shifts within the corpus are:

  • Sarbanes-Oxley (2002) added Section 302 and Section 906 certifications to periodic reports, but SB-2, as a Securities Act registration statement, did not pick up Section 302/906 certifications. Attestations were absorbed into the signature block and the auditor consent regime, and the auditor consent (EX-23.1) became more legally consequential after PCAOB-registered-firm requirements took effect.
  • The 2005 Securities Offering Reform did not extend WKSI-style scaled benefits to SB-2 the way it did to S-3, but it did formalize free writing prospectuses and incorporation conventions that are sometimes visible in late-corpus filings.
  • The 2007–2008 SEC rulemaking that rescinded Regulation S-B, eliminated the "small business issuer" category, and created the "smaller reporting company" classification under Item 10(f) of Regulation S-K and Article 8 of Regulation S-X terminated SB-2 effective February 4, 2008. After that date, a smaller reporting company seeking to register securities filed Form S-1 with the same scaled accommodations folded into the unified Regulation S-K. The dataset's distribution accordingly drops sharply in its final container: February 2008 holds 255 records against 1,360 in January 2008, with stragglers and late amendments making up much of that final month. The companion S-1 dataset picks up the post-rescission successor filings.
  • Within Regulation S-B itself, the option to substitute a "plan of operation" for full MD&A (available to issuers without revenue from operations during each of the past two fiscal years) and the two-year (rather than three-year) audited financial statement requirement persisted throughout the form's life and explain why early-stage and shell SB-2 filings have visibly thinner financial sections than equivalent S-1 filings.

Changes in data format over time

  • Earliest filings (April 1995 onward) are uniformly plain-text SGML submissions, with the <TEXT> block carrying fixed-width ASCII pages and EDGAR <TABLE>/<S>/<C> financial-table tags. Image content was effectively absent from this era.
  • HTML filing began to be accepted by EDGAR in the late 1990s and became increasingly common through the early 2000s. By the mid-2000s a sizable share of SB-2 filings ship the prospectus as a single styled HTML document produced by a filing agent's tooling, while smaller and shell-company filers continued to submit plain text through the form's rescission. The late corpus is therefore a mixed-format population, with both HTML and legacy ASCII coexisting and both wrapped in identical SGML envelopes.
  • XBRL is not applicable to this form type. SB-2 was rescinded before XBRL phase-in reached small registrants, so no record carries an XBRL instance document or inline-XBRL structured data, and dataFiles and linkToXbrl are uniformly empty.
  • Filenames inside accession folders are filer-controlled and not normalized: filing-agent prefixes (Vintage Filings's v######_sb2.htm, Donnelley's d-prefixed dsb2.htm / dex51.htm), descriptive lowercase names (legalopinion.htm, auditorconsent.htm), generic placeholders (filename2.htm, filename13.htm) emitted when the original filename slot was empty, and bare 8.3-style names (g2177.txt, ex5-1.txt) all coexist. The authoritative document classification is always the type field in metadata.json and the <TYPE> tag inside each file's SGML envelope, never the filename.
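Since neither the era nor the file extension reliably indicates the payload format, a content sniff is the safer test. The heuristic below is a sketch: it treats any body containing an <html> tag as HTML and otherwise looks for the legacy <TABLE> ... <S> marker pair. The labels returned and the name payload_format are illustrative, and HTML fragments that omit the <html> tag would be misread as ASCII.

```python
import re

HTML_START = re.compile(r"<html\b", re.IGNORECASE)
# Legacy financial tables open with <TABLE> and mark the stub column with <S>.
ASCII_TABLE = re.compile(r"<TABLE>.*?<S>", re.IGNORECASE | re.DOTALL)


def payload_format(body: str) -> str:
    """Classify a document body as 'html', 'ascii-with-tables', or 'ascii'."""
    if HTML_START.search(body):
        return "html"
    if ASCII_TABLE.search(body):
        return "ascii-with-tables"
    return "ascii"


legacy = "<TABLE>\n<CAPTION>\n<S>            <C>\nCash         100\n</TABLE>"
print(payload_format("<HTML><BODY>Prospectus</BODY></HTML>"))
print(payload_format(legacy))
print(payload_format("PROSPECTUS\n\n1,000,000 Shares of Common Stock"))
```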

Interpretation notes

  • The amendment chain is not always discoverable from metadata.json alone. The formType field flattens every amendment to SB-2/A regardless of ordinal; only fileNo plus filedAt ordering reveal the sequence. Filers occasionally encode the amendment number in the filename (e.g., strasbaugh_sb2a3-020508.txt for a third amendment dated 2008-02-05), but this is convention, not specification.
  • Because amendments restate the entire registration statement, a single fileNo (the 333- Securities Act registration number) typically resolves to multiple records across the original SB-2 and several SB-2/A accessions. Deduplicating to "the registration" requires grouping on fileNo.
  • The documentFormatFiles[] array is the source of truth for which files exist in the EDGAR submission, including the ones omitted from redistribution (images and the synthetic complete-submission text). When the on-disk folder is missing a file referenced in documentFormatFiles[], that file is either a GRAPHIC entry or the trailing Complete submission text file entry; both can be retrieved by following documentUrl to EDGAR.
  • The SGML <DOCUMENT> envelope must be parsed out before HTML parsers will accept the body cleanly. HTML payloads usually start at the first <html> or <HTML> tag inside <TEXT>; plain-text payloads should be read between the <TEXT> and </TEXT> markers and treated as fixed-width ASCII with the EDGAR financial-table tags optionally promoted to columnar tables.
  • Where EX-23.1 is missing or where audited financials appear unsigned, the filing is almost always an early-stage SB-2 that was withdrawn or never declared effective; the dataset preserves these as filed.
  • CORRESP documents reflect issuer-to-staff communication and are not part of the prospectus; treat them as supplementary metadata about the comment-letter process rather than as disclosure to investors.
  • Issuer-specific variation is high because SB-2 was used predominantly by micro-cap and shell issuers with heterogeneous filing-agent tooling. Robust extraction should rely on the metadata.json type field and the SGML <TYPE> tag for document classification, on the entities[] block for issuer identification, and on filename matching only as a last resort.
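The grouping on fileNo and ordering on filedAt described in the first two notes can be sketched as follows. The records are hypothetical and trimmed to the three fields used; amendment_chains is an illustrative helper, and it assumes filedAt always carries a UTC offset so the timestamps compare correctly.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, List, Tuple


def amendment_chains(records: Iterable[dict]) -> Dict[str, List[Tuple[datetime, str, str]]]:
    """Group records by fileNo and sort each group by filing time."""
    chains = defaultdict(list)
    for record in records:
        filed_at = datetime.fromisoformat(record["filedAt"])
        # A set avoids double counting when several entities share one fileNo.
        file_numbers = {e["fileNo"] for e in record.get("entities", []) if e.get("fileNo")}
        for file_no in file_numbers:
            chains[file_no].append((filed_at, record["formType"], record["accessionNo"]))
    return {file_no: sorted(items) for file_no, items in chains.items()}


# Made-up records, deliberately out of order.
records = [
    {"accessionNo": "0000000000-07-000010", "formType": "SB-2",
     "filedAt": "2007-10-01T09:00:00-04:00", "entities": [{"fileNo": "333-000001"}]},
    {"accessionNo": "0000000000-08-000002", "formType": "SB-2/A",
     "filedAt": "2008-01-15T17:00:00-05:00", "entities": [{"fileNo": "333-000001"}]},
    {"accessionNo": "0000000000-07-000020", "formType": "SB-2/A",
     "filedAt": "2007-12-03T12:00:00-05:00", "entities": [{"fileNo": "333-000001"}]},
]
chains = amendment_chains(records)
for position, (filed_at, form, accession) in enumerate(chains["333-000001"]):
    print(position, filed_at.date(), form, accession)
```

The position within a chain is only an inferred amendment ordinal; it will be off if an earlier submission in the chain is absent from the data at hand.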

Who Files or Publishes This Dataset, and When

Who files the record

Each record is a Securities Act of 1933 registration statement on Form SB-2, or an amendment on Form SB-2/A, submitted to EDGAR by an issuer that qualified as a "small business issuer" under former Regulation S-B. The filer is always the issuer of the securities being registered, never the underwriter, selling shareholder, or auditor (although those parties are named, and accountants and counsel provide consents).

To use the form, an issuer had to satisfy every prong of Item 10(a)(1) of Regulation S-B (cross-referenced in Securities Act Rule 405 and Exchange Act Rule 12b-2) at the time of filing:

  • revenues under $25 million in the most recent fiscal year
  • aggregate market value of voting and non-voting common equity held by non-affiliates (public float) under $25 million
  • incorporated or organized in the United States, a U.S. state or territory, or Canada or a Canadian province or territory
  • not registered, or required to be registered, under the Investment Company Act of 1940
  • if a majority-owned subsidiary, its parent must also qualify as a small business issuer

Issuers exceeding either threshold, foreign private issuers other than Canadian filers, registered investment companies, business development companies, and asset-backed issuers were excluded and used Forms S-1, S-3, S-4, S-11, F-1, or N-1A/N-2 as applicable.
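The eligibility prongs listed above amount to a conjunction of simple tests, which the sketch below encodes. It is a simplification for illustration and not a legal determination: the Issuer fields, the two-letter domicile codes, and the function name are assumptions, and "US" is taken to cover states and territories.

```python
from dataclasses import dataclass
from typing import Optional

THRESHOLD_USD = 25_000_000          # applies to both revenues and public float
ELIGIBLE_DOMICILES = {"US", "CA"}   # simplified country codes (assumption)


@dataclass
class Issuer:
    revenues_usd: float
    public_float_usd: float
    domicile: str
    is_investment_company: bool = False
    majority_parent: Optional["Issuer"] = None  # set only for majority-owned subsidiaries


def is_small_business_issuer(issuer: Issuer) -> bool:
    """Apply every prong; failing any one of them disqualifies the issuer."""
    if issuer.revenues_usd >= THRESHOLD_USD:
        return False
    if issuer.public_float_usd >= THRESHOLD_USD:
        return False
    if issuer.domicile not in ELIGIBLE_DOMICILES:
        return False
    if issuer.is_investment_company:
        return False
    # A majority-owned subsidiary qualifies only if its parent also qualifies.
    if issuer.majority_parent is not None:
        return is_small_business_issuer(issuer.majority_parent)
    return True


large_parent = Issuer(revenues_usd=400e6, public_float_usd=900e6, domicile="US")
print(is_small_business_issuer(Issuer(10e6, 5e6, "US")))
print(is_small_business_issuer(Issuer(10e6, 5e6, "US", majority_parent=large_parent)))
```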

When the record is created or required

Form SB-2 is event-driven, not periodic. Section 5 of the Securities Act prohibits offers of securities in interstate commerce unless a registration statement has been filed (Section 5(c)) and sales unless that registration statement is effective (Section 5(a)). An eligible issuer therefore filed Form SB-2 whenever it elected to register a public offering under the scaled Regulation S-B regime, including:

  • initial public offerings
  • follow-on or secondary offerings
  • resale registrations for selling shareholders
  • shares underlying warrants, options, or convertible instruments
  • securities issued in mergers, acquisitions, or reorganizations not eligible for Form S-4

Effectiveness ran on the Section 8(a) twenty-day clock from the most recent filing or amendment, but in practice nearly all issuers requested acceleration under Rule 461 after staff review concluded. The acceleration request was submitted jointly by the issuer and any managing underwriters and granted by an SEC notice of effectiveness.
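The statutory default is plain date arithmetic: the clock restarts with each pre-effective amendment and runs twenty days from the latest one. The sketch below shows only that default, ignoring acceleration and any delaying language a registrant may have included; the function name and the sample dates are made up.

```python
from datetime import date, timedelta
from typing import Iterable


def default_effective_date(filing_dates: Iterable[date]) -> date:
    """Twentieth day after the most recent pre-effective filing or amendment."""
    return max(filing_dates) + timedelta(days=20)


# Original filing on 2007-10-01, one pre-effective amendment on 2007-12-03.
effective = default_effective_date([date(2007, 10, 1), date(2007, 12, 3)])
print(effective)  # 2007-12-23
```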

Form SB-2/A amendments are filed whenever the registration statement must be revised before or after effectiveness, including in response to:

  • SEC staff comment letters during review (risk factors, financial statements, use of proceeds, dilution, compensation, related-party transactions)
  • material changes in the issuer's business, financial condition, capitalization, or offering terms during the review period
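  • pricing and other offering-term changes made by pre-effective amendment (pricing information omitted under Rule 430A and supplied after effectiveness is filed as a Rule 424(b) prospectus, which is a separate EDGAR form type and not an SB-2/A)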
  • the addition of selling shareholders or new classes of securities
  • post-effective amendments to update the prospectus, add offerings, or deregister unsold securities

Each SB-2/A receives its own EDGAR accession number but shares file-number lineage with the original SB-2.

Operational submission requirements

Form SB-2 filings were submitted through EDGAR with:

  • the registration fee calculated under Securities Act Rule 457, generally on the maximum aggregate offering price
  • signatures of the issuer, its principal executive, financial, and accounting officers, and a majority of the board, as required by the form's signature block
  • consents of independent accountants under Rule 436 covering audited financial statements
  • consents of counsel and other named experts, including the Item 601(b)(5) legality opinion
  • the full Item 601 exhibit set (legal opinions, material contracts, charter documents, underwriting agreements)

Regulatory framework and historical scope

Form SB-2 was adopted in 1992 under the Small Business Initiatives release (Release No. 33-6949), which created Regulation S-B and the integrated small business disclosure system. EDGAR coverage of SB-2 filings begins in April 1995 with the phase-in of mandatory electronic filing.

The form was rescinded effective February 4, 2008 by Release No. 33-8876 (Smaller Reporting Company Regulatory Relief and Simplification), which eliminated the "small business issuer" category and Regulation S-B and folded scaled disclosure into Regulation S-K under the new "smaller reporting company" definition (initially a $75 million public float test). Former SB-2 filers transitioned to Form S-1 with SRC scaled disclosure available within that form. No new originating SB-2 filings have been accepted by EDGAR since the rescission; SB-2 or SB-2/A submissions in the dataset dated after the rescission took effect are generally post-effective amendments or wind-down filings tied to pre-rescission registration statements.

Important distinctions

  • Form 10-SB was the small business Exchange Act registration statement under Section 12(b) or Section 12(g); it registered a class of securities and triggered periodic reporting but did not register an offering. SB-2 effectiveness did not register a class under Section 12, but it did bring the issuer within Section 15(d), which imposes periodic reporting once a Securities Act registration statement becomes effective.
  • Form SB-1 covered offerings up to $10 million, was used far less often than SB-2, and was rescinded at the same time.
  • An issuer that lost small-business-issuer status mid-review could be required to convert to Form S-1, though staff often permitted the SB-2 to proceed if the issuer qualified at original filing.
  • Form SB-2/A is not a new offering; it amends an existing registration statement and shares file-number continuity with the original SB-2.

How This Dataset Differs From Similar Datasets or Filings

The Form SB-2 Files Dataset sits in a tight cluster of Securities Act registration filings. Its closest neighbors fall into three groups: other registration forms in use during SB-2's lifespan (1992-2008), the broader Regulation S-B small-business reporting regime, and exempt offering frameworks that competed with SB-2 for small-issuer capital formation. The distinctions below are rules-based: eligibility, trigger, scope, disclosure scaling, and timing.

Form S-1 / S-1/A

S-1 is the general-purpose Securities Act registration statement, available to any issuer including those eligible for SB-2. The structural overlap is heavy: prospectus, use of proceeds, risk factors, business, MD&A, audited financials, management, exhibits. The distinction is the disclosure rulebook. SB-2 ran on Regulation S-B (two years of audited statements rather than three, scaled MD&A, lighter executive compensation tables); S-1 ran on full Regulation S-K and Regulation S-X. After SB-2's 2008 rescission, the small-issuer population migrated to S-1 under the new smaller-reporting-company accommodations within S-K. The post-2008 S-1 corpus therefore absorbs SB-2's filer base; SB-2 is its predecessor, not a substitute.

Form SB-1

SB-1 was the smaller small-business form, capped at $10 million in any rolling twelve-month period and permitting a question-and-answer prospectus format. SB-2 had no offering-size ceiling for qualifying small-business issuers and required a conventional narrative prospectus. Both used Regulation S-B scaling, but SB-1 generated a much smaller and shallower historical population.

Form 10-SB

10-SB is an Exchange Act Section 12(g) registration, not a Securities Act offering registration. It registers a class of securities to make the issuer a reporting company; it does not register a transaction. There is no offering price, no use of proceeds, no underwriting section. It is the small-business analogue of Form 10, not of S-1. Use 10-SB to identify when a small issuer became a reporting company; use SB-2 to study how a small issuer raised capital.

Form S-3 / S-3/A

S-3 is the short-form registration for seasoned issuers meeting eligibility tests: an Exchange Act reporting history, timely filings, and a qualifying public float (historically $75 million). It incorporates by reference from Exchange Act filings rather than restating business and financial information. SB-2 sat on the opposite side of that eligibility line, designed for issuers without S-3 qualifications and often without any Exchange Act reporting history. SB-2 filings are self-contained and information-dense; S-3 filings are thin and reference-driven. The two datasets cover non-overlapping issuer populations.

Form S-11

S-11 is the dedicated registration statement for REITs and other real-estate-focused issuers, with real-estate-specific schedules (property tables, occupancy data, Schedule III). A small real estate issuer qualifying under Regulation S-B could in some cases register on SB-2 instead of S-11, creating overlap at that boundary. For real estate offering research, S-11 is the complete corpus; SB-2 captures only the small-issuer slice and lacks the standardized real-estate schedules.

Form F-1

Form F-1 is the Securities Act registration for foreign private issuers, following Form 20-F-aligned disclosure with IFRS or reconciled GAAP financials. SB-2 was never available to foreign private issuers; the Regulation S-B small-business definition applied only to U.S. and Canadian issuers meeting specific tests. The two datasets are mutually exclusive by filer geography and reporting regime.

Form 10-KSB and Form 10-QSB

Form 10-KSB and Form 10-QSB were the Regulation S-B annual and quarterly reports, the periodic counterparts to SB-2's registration role. They shared the same scaled disclosure regime, the same 2008 rescission, and the same migration path to 10-K and 10-Q with smaller-reporting-company accommodations. SB-2 captures the registration event; 10-KSB and 10-QSB capture the ongoing reporting that followed. The two are complementary, not substitutable.

Form 1-A (Regulation A)

Form 1-A is the offering statement for exempt offerings under Regulation A (and, since 2015, the amended regime known as Regulation A+). Reg A offerings are exempt from Section 5 registration and are qualified by staff rather than declared effective. Disclosure is scaled below SB-2, financials are lighter, and offering size has always been capped ($5 million during the SB-2 era; since 2015, $20 million for Tier 1 and $50 million for Tier 2, with the Tier 2 cap raised to $75 million in 2021). For small-issuer capital formation, Reg A is a parallel path, but legally and structurally distinct from SB-2.

Other "Files" datasets versus extracted-section datasets

The SB-2 Files Dataset packages the complete EDGAR submission per accession (metadata plus every document except images). Extracted datasets — prospectus-only, Item-level, exhibit-only — discard surrounding documents in exchange for cleaner, narrower content. The files dataset is the source-of-truth bundle; extracted variants are downstream refinements. They are complements.

Boundary summary

The Form SB-2 Files Dataset is defined by the intersection of three constraints, each of which a neighboring dataset breaks:

  1. Securities Act registration of an offering — excludes 10-SB (Exchange Act) and 1-A (exempt).
  2. Regulation S-B small-business issuer eligibility — excludes S-3 (seasoned), F-1 (foreign), most S-11 (real estate specialty), and the unscaled portion of S-1.
  3. Closed historical window from SB-2's introduction through its 2008 rescission — after which the population migrates to smaller-reporting-company S-1, 10-K, and 10-Q.

Within that intersection, the dataset preserves the full EDGAR submission rather than an extracted slice, which is what distinguishes it from prospectus-only or exhibit-only derivatives of the same filings.

Who Uses This Dataset

Users are concentrated in roles that work on small-issuer disclosure, micro-cap securities, historical enforcement, and the SB-2 to smaller reporting company transition. The closed 1995 to 2008 archive is dominated by micro-cap, penny-stock, early-stage, and shell or near-shell filers, and that filer profile shapes the user base.

Securities lawyers and capital markets paralegals

Used as a precedent library for small-issuer registered offerings. Drafting teams pull historical prospectuses to model risk factors, plan of distribution, selling shareholder tables, and lock-up language for resale registrations, equity lines, PIPE warrant registrations, and best-efforts deals. Paralegals mine exhibit indexes for EX-5.1 legality opinions, EX-10 templates (subscription, registration rights, equity line, finder agreements), and EX-23 auditor consents. The SB-2 to SB-2/A amendment chain is itself a workflow input: comparing successive amendments shows how disclosure shifted in response to staff comments.

Micro-cap and OTC equity research analysts

Used to reconstruct share-count and capital-structure history for long-listed issuers that went through multiple SB-2 registrations. The capitalization table, use-of-proceeds, and selling shareholder table together expose registered share counts, prior placement prices, warrant overhang, and the dilution path into today's float. For coverage initiations, SB-2 filings are often the only structured source for founder holdings, early seed positions, and pre-IPO convertible terms. Risk factors are read against current management narratives.

Forensic accountants and litigation-support teams

Used on penny-stock fraud, pump-and-dump, manipulation, and restatement matters, since many micro-cap enforcement cases trace back to an SB-2 or SB-2/A. Selling shareholder tables, plan of distribution, and EX-10 contracts (consulting, share-issuance, debt-conversion) reconstruct how shares moved into the float and who was paid in stock. Risk factors and use-of-proceeds are compared against actual cash use and later restatements. EX-23 consents and the disclosed auditor identity link issuers to small audit firms with their own PCAOB or SEC histories. Output is issuer chronologies and exhibit material for expert reports.

Shell-company and reverse-merger diligence teams

Used to triage issuers organized as shells or that became shells post-effectiveness and later served as reverse-merger vehicles. Diligence teams pull original promoter, auditor, CIK, file number, state of incorporation, and SIC code from metadata.json, then read use-of-proceeds and business description against subsequent operating history. Classic shell precursors (minimal proceeds, vague business plans, promoter-dominated selling shareholder tables) drive watchlists and target lists for custodianship petitions and shell revival projects.

Corporate development and M&A diligence teams

Used when acquiring long-listed micro-cap targets or evaluating shell vehicles for go-public transactions. EX-10 material contracts surface legacy registration rights, anti-dilution clauses, board observer rights, and consultant share grants that may still be live. Selling shareholder tables identify legacy holders potentially still on the register. Metadata fields (CIK, file numbers, state of incorporation, fiscal year end, SIC) confirm continuity of the legal entity. Output is a legacy-securities diligence memo and a remediation plan for surviving registration rights.

Broker-dealer compliance and penny-stock supervision

Used for Section 15(g) penny-stock supervision, Rule 144 tacking analysis for customers reselling shares, and KYC on issuers onboarded for market-making or DVP settlement. Selling shareholder tables establish original holders and cost-basis representations; plan of distribution and legend disclosures establish whether shares were registered for resale or restricted. EX-5.1 confirms the legality basis for registered shares. CIK and file numbers in metadata tie SB-2 history to later trading symbols for surveillance and SRO inquiries.

Regulators, SRO staff, and enforcement researchers

Used for historical lookups tied to enforcement, registration revocations, and trading suspensions of SB-2-era issuers. Metadata (CIK, file numbers, SIC, state of incorporation, fiscal year end) indexes the corpus; prospectus body and exhibits drive substantive review. Section 12(j) deregistration, manipulation, and gatekeeper matters rely on EX-5.1 and EX-23 to identify the attorneys and auditors who signed offerings, feeding pattern analysis across issuers tied to the same gatekeepers.

Academic researchers in finance, law, and economics

Used as a primary corpus for small-issuer going-public dynamics, micro-cap IPO underpricing, scaled-disclosure cost, staff-review effectiveness, and the 2008 shift to the smaller reporting company (SRC) regime. The closed 1995 to 2008 window suits difference-in-differences and event-study designs. Risk factor text, executive compensation tables, capitalization tables, and offering-size figures are merged with later trading data. SB-2/A chains support studies on staff review intensity and the substantive effect of comment-letter cycles.

Financial NLP and data engineering teams

Used as training and evaluation data for extracting offering size and use-of-proceeds categories, classifying risk factors by topic, detecting shell-company language, and parsing capitalization and selling shareholder tables from heterogeneous HTML and TXT. The 1995 to 2008 span and per-accession metadata.json provide format-diverse coverage and ground-truth labels for entity linking and SIC classification. Shell-detection models use later observed outcomes (reverse mergers, deregistrations, enforcement) to label positives and negatives.

Specific Use Cases

The workflows below draw on the prospectus body, the EX-3 through EX-99 exhibit set, the per-accession metadata.json, and the closed 1995 to 2008 amendment chains.

Reconstructing capitalization and dilution histories for surviving micro-cap tickers

Pull every accession sharing a fileNo (the 333- Securities Act number), order by filedAt, and walk the SB-2 to SB-2/A chain to extract the calculation-of-registration-fee table, selling shareholder table, and capitalization section from each filing in the chain. The output is a per-issuer share-issuance ledger keyed by CIK that ties registered share counts, warrant coverage, and named selling holders to the prices on the cover page. Used by micro-cap analysts to attribute today's float and overhang back to specific 2003 to 2008 placements.
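The grouping and ordering step can be sketched as follows. This is a minimal illustration, not the dataset's reference tooling: it assumes each metadata.json has already been loaded into a dict, that filedAt is an ISO 8601 timestamp, and that fileNo appears on each entry of an entities list. That nesting and the accessionNo key are assumptions, so adjust the lookup to the actual manifest layout.

```python
from collections import defaultdict
from datetime import datetime


def group_by_file_number(records):
    """Group metadata records by fileNo and sort each chain by filedAt."""
    chains = defaultdict(list)
    for rec in records:
        # Assumption: fileNo is carried on each entry of an "entities" list.
        file_numbers = {
            e["fileNo"] for e in rec.get("entities", []) if e.get("fileNo")
        }
        for file_no in file_numbers:
            chains[file_no].append(rec)
    for chain in chains.values():
        # Parse the timestamp so chains sort chronologically, not lexically.
        chain.sort(key=lambda r: datetime.fromisoformat(r["filedAt"]))
    return dict(chains)


# Hypothetical sample records, deliberately out of order.
sample_records = [
    {"accessionNo": "0000000000-07-000002", "formType": "SB-2/A",
     "filedAt": "2007-03-01T09:00:00-05:00",
     "entities": [{"fileNo": "333-000001"}]},
    {"accessionNo": "0000000000-07-000001", "formType": "SB-2",
     "filedAt": "2007-01-15T09:00:00-05:00",
     "entities": [{"fileNo": "333-000001"}]},
]
sample_chains = group_by_file_number(sample_records)
```

Each value in the returned dict is one registration's amendment chain in filing order, ready for table extraction filing by filing.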

Mining EX-10 material contracts for small-issuer deal precedent

Filter documentFormatFiles[] on type values matching EX-10.* and pull the SGML-wrapped bodies for equity line agreements, registration rights agreements, finder agreements, consulting share-issuance contracts, and Standby Equity Distribution Agreements. Drafting teams cluster these by counterparty (Cornell Capital, Dutchess, YA Global, etc.) to model current PIPE and ELOC documents on language that survived staff review. The exhibit description field in metadata.json accelerates the initial filter before any document is opened.
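A minimal sketch of the exhibit filter, assuming a metadata.json manifest has been loaded into a dict and that each documentFormatFiles entry carries type and description strings as described above. The regular expression is one possible reading of "EX-10.*": it accepts EX-10 and EX-10.x but not unrelated types that merely start with the same characters.

```python
import re

# Matches "EX-10" and "EX-10.<n>", but not e.g. "EX-101".
EX10_PATTERN = re.compile(r"^EX-10(\.|$)")


def material_contracts(metadata):
    """Return the documentFormatFiles entries typed as EX-10 exhibits."""
    return [
        doc
        for doc in metadata.get("documentFormatFiles", [])
        if EX10_PATTERN.match(doc.get("type", ""))
    ]


# Hypothetical manifest fragment for illustration.
sample_metadata = {
    "documentFormatFiles": [
        {"sequence": "1", "type": "SB-2", "description": "REGISTRATION STATEMENT"},
        {"sequence": "2", "type": "EX-5.1", "description": "LEGAL OPINION"},
        {"sequence": "3", "type": "EX-10.1", "description": "EQUITY LINE AGREEMENT"},
        {"sequence": "4", "type": "EX-10", "description": "CONSULTING AGREEMENT"},
    ]
}
sample_contracts = material_contracts(sample_metadata)
```

A second pass over the description field can then narrow the result to a specific agreement type before any document body is opened.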

Building a comment-letter response corpus from SB-2/A diffs

Group accessions by fileNo, then diff the sequence-1 prospectus across consecutive SB-2/A filings and align the diffs against any CORRESP documents in the same accession folders. The result is a labeled dataset of "staff comment leads to disclosure change" pairs covering risk factors, use of proceeds, plan of distribution, and going-concern language. Securities counsel use this to anticipate staff requests on small-issuer registrations; academic researchers use it to study comment-letter effectiveness across the 1995 to 2008 window.
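The diff step could look like the sketch below, which uses Python's standard difflib module on two prospectus texts that have already been read into strings. The function name and labels are illustrative, and a line-level unified diff is only one choice; sentence-level alignment may suit heavily reflowed filings better.

```python
import difflib


def prospectus_diff(old_text, new_text, old_label, new_label):
    """Return unified-diff lines between two prospectus versions."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile=old_label,
            tofile=new_label,
            lineterm="",
            n=0,  # no context lines: keep only the changed text
        )
    )


# Hypothetical excerpts from an SB-2 and its first amendment.
original = "RISK FACTORS\nWe have a limited operating history.\nUSE OF PROCEEDS"
amended = (
    "RISK FACTORS\n"
    "We have a limited operating history and our auditor has expressed "
    "substantial doubt about our ability to continue as a going concern.\n"
    "USE OF PROCEEDS"
)
sample_diff = prospectus_diff(original, amended, "SB-2", "SB-2/A No. 1")
```

Lines beginning with "-" or "+" (other than the two file headers) are the removed and added disclosure text to pair with a staff comment.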

Identifying shell-company precursors for custodianship and reverse-merger targeting

Score each record on shell-precursor signals extracted from sequence-1: a "plan of operation" rather than full MD&A, two-year audited statements with minimal revenue, vague Item 101 business descriptions, promoter-dominated selling shareholder tables, and missing EX-21 subsidiary lists. Cross-reference cik, stateOfIncorporation, and sic from metadata.json against later trading suspensions and Section 12(j) deregistrations. Output is a ranked watchlist of dormant CIKs for custodianship petitions and revival diligence.
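As a rough illustration of the scoring idea, the sketch below checks three of the signals named above with simple string tests. The phrases, the equal weighting, and the function name are all hypothetical choices made for the example; a production screen would need tuned patterns and validation against known outcomes.

```python
def shell_precursor_score(prospectus_text, metadata):
    """Count simple shell-precursor signals; return (score, signal flags)."""
    text = prospectus_text.lower()
    exhibit_types = [
        doc.get("type", "") for doc in metadata.get("documentFormatFiles", [])
    ]
    signals = {
        # "Plan of operation" disclosure instead of full MD&A.
        "plan_of_operation": "plan of operation" in text,
        # Hypothetical phrases indicating minimal or no revenue.
        "no_revenue_language": (
            "no revenues" in text or "not generated any revenue" in text
        ),
        # No EX-21 subsidiary list among the exhibits.
        "missing_ex21": not any(t.startswith("EX-21") for t in exhibit_types),
    }
    return sum(signals.values()), signals


# Hypothetical inputs for illustration.
sample_text = "PLAN OF OPERATION\nWe have not generated any revenue to date."
sample_manifest = {"documentFormatFiles": [{"type": "SB-2"}, {"type": "EX-5.1"}]}
sample_score, sample_signals = shell_precursor_score(sample_text, sample_manifest)
```

Returning the individual flags alongside the total keeps the watchlist explainable: a reviewer can see which signals fired for each CIK.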

Gatekeeper pattern analysis across EX-5.1 and EX-23.1

Parse the EX-5.1 legality opinion and EX-23.1 auditor consent from every accession to extract the issuing law firm and the consenting audit firm, then aggregate by gatekeeper across the corpus. The resulting issuer-to-counsel and issuer-to-auditor adjacency lists feed enforcement research on small audit firms and securities counsel who appeared repeatedly on filings later tied to manipulation, restatement, or deregistration matters. The description field on each exhibit (e.g., "OPINION OF ... AS TO THE LEGALITY OF SECURITIES BEING REGISTERED") gives a reliable starting filter.

NLP training data for scaled-disclosure extraction

Use the format-diverse population (legacy fixed-width ASCII with EDGAR <TABLE>/<S>/<C> markers from the late 1990s alongside filing-agent HTML from EDGARizer, RDG, and Donnelley in the mid-2000s) to train extractors for offering size, use-of-proceeds categories, risk-factor topics, and selling shareholder rows. The per-accession metadata.json supplies ground-truth formType, sic, stateOfIncorporation, and entity labels; the SGML <TYPE> tag inside each document body provides authoritative document classification independent of filer-controlled filenames.
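Reading the SGML <TYPE> tag can be sketched with a small regular expression. The example assumes the usual envelope layout in which each header tag sits at the start of its own line with the value running to the end of that line; the function name is illustrative.

```python
import re

# <TYPE> at the start of a line; the value runs to the end of that line.
TYPE_TAG = re.compile(r"^<TYPE>([^\r\n<]+)", re.MULTILINE)


def sgml_document_type(document_text):
    """Return the value of the first <TYPE> tag, or None if absent."""
    match = TYPE_TAG.search(document_text)
    return match.group(1).strip() if match else None


# Hypothetical document envelope for illustration.
sample_document = (
    "<DOCUMENT>\n"
    "<TYPE>EX-5.1\n"
    "<SEQUENCE>3\n"
    "<FILENAME>opinion.txt\n"
    "<TEXT>\n"
    "Ladies and Gentlemen: ...\n"
    "</TEXT>\n"
    "</DOCUMENT>\n"
)
sample_type = sgml_document_type(sample_document)
```

Comparing this value with the type recorded in metadata.json is a cheap consistency check before a document is used as a training label.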

Dataset Access

The Form SB-2 Files Dataset is accessible through three endpoints: a JSON metadata index, a full archive download, and per-container downloads. Containers are monthly ZIP files organized by year, covering filings from April 1995 onward across SB-2 and SB-2/A form types.

Dataset Index JSON API: https://api.sec-api.io/datasets/form-sb2-files.json

Returns dataset-level metadata (name, description, last updated timestamp, earliest sample date, total record count, total size, form types, container format, and file types) along with the full dataset download URL and a list of all monthly container files. Each container entry includes its key, size, record count, last updated timestamp, and individual download URL. Poll this endpoint to detect which monthly containers were refreshed in the most recent run, and download only the changed containers on a daily basis. This endpoint does not require an API key.

Example response:

{
  "datasetId": "1f13365b-9ae0-6917-849d-750c95918b65",
  "datasetDownloadUrl": "https://api.sec-api.io/datasets/form-sb2-files.zip",
  "name": "Form SB-2 Files Dataset",
  "updatedAt": "2026-04-14T15:11:27.498Z",
  "earliestSampleDate": "1995-04-01",
  "totalRecords": 143958,
  "totalSize": 3060933189,
  "formTypes": ["SB-2", "SB-2/A"],
  "containerFormat": "ZIP",
  "fileTypes": ["TXT", "JSON", "HTML", "XFD", "PDF", "FRM"],
  "containers": [
    {
      "downloadUrl": "https://api.sec-api.io/datasets/form-sb2-files/2008/2008-02.zip",
      "key": "2008/2008-02.zip",
      "size": 13818783,
      "records": 154,
      "updatedAt": "2026-04-14T15:11:27.498Z"
    }
  ]
}
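The polling pattern described above might be implemented along these lines. It is a sketch using only the standard library: fetch_index calls the index URL given in this section, while last_seen is a hypothetical local record (container key mapped to the updatedAt value seen on the previous run) that the caller would persist between runs.

```python
import json
import urllib.request

INDEX_URL = "https://api.sec-api.io/datasets/form-sb2-files.json"


def fetch_index(url=INDEX_URL):
    """Fetch the dataset index JSON (this endpoint needs no API key)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def changed_containers(index, last_seen):
    """Return keys of containers whose updatedAt differs from last_seen.

    last_seen is a hypothetical dict: container key -> updatedAt string.
    Containers never seen before are also reported as changed.
    """
    return [
        container["key"]
        for container in index.get("containers", [])
        if last_seen.get(container["key"]) != container["updatedAt"]
    ]
```

A daily job would call fetch_index(), pass the result to changed_containers() with the stored state, download only the returned keys, and then save the new updatedAt values.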

Download Entire Dataset: https://api.sec-api.io/datasets/form-sb2-files.zip?token=YOUR_API_KEY

Downloads the complete dataset as a single ZIP archive containing every monthly container. This endpoint requires an API key.

Download Single Container: https://api.sec-api.io/datasets/form-sb2-files/2008/2008-02.zip?token=YOUR_API_KEY

Downloads one monthly container archive identified by its YYYY/YYYY-MM.zip key, allowing incremental retrieval of only the months that changed. This endpoint requires an API key.
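A single-container download could be sketched as below. The URL pattern and token query parameter come from this section; the function names and the destination path argument are illustrative, and error handling (HTTP errors, retries) is left out.

```python
import urllib.parse
import urllib.request

BASE_URL = "https://api.sec-api.io/datasets/form-sb2-files"


def container_url(key, token):
    """Build the download URL for a container key such as '2008/2008-02.zip'."""
    query = urllib.parse.urlencode({"token": token})
    return f"{BASE_URL}/{key}?{query}"


def download_container(key, token, dest_path):
    """Download one monthly container to dest_path and return that path."""
    urllib.request.urlretrieve(container_url(key, token), dest_path)
    return dest_path
```

For example, download_container("2008/2008-02.zip", "YOUR_API_KEY", "2008-02.zip") would save that month's archive in the working directory.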

Frequently Asked Questions

What forms does this dataset cover?

The dataset covers Form SB-2 (the initial small-business issuer registration statement under the Securities Act of 1933) and Form SB-2/A (pre-effectiveness and post-effectiveness amendments to a previously filed SB-2). Both form types share fileNo lineage when they belong to the same registration.

What does one record in this dataset represent?

One record is a single EDGAR submission identified by an 18-digit accession number. On disk it is one accession-number folder containing a metadata.json manifest plus every document the registrant submitted to EDGAR for that accession, except binary image files. Each record bundles a structured JSON manifest, the prospectus and Part II content, exhibits, and the SGML envelopes wrapping every document body.

Who was required to file Form SB-2?

The filer was always the issuer of the securities being registered, and only issuers that qualified as "small business issuers" under Regulation S-B could use the form — meaning revenues and public float each below $25 million, organization in the United States or Canada, and not registered under the Investment Company Act. Foreign private issuers (other than Canadian filers), registered investment companies, business development companies, and asset-backed issuers were ineligible.

What time period does the dataset cover?

The dataset begins in April 1995, when EDGAR electronic filing became mandatory, and runs through Form SB-2's rescission in 2008 under Release No. 33-8876. Post-2008 records in the dataset are wind-down amendments tied to pre-rescission registration statements; no new originating SB-2 filings have been accepted by EDGAR since the rescission.

What file format is the dataset distributed in?

The dataset is distributed as monthly ZIP containers organized by year, with keys of the form YYYY/YYYY-MM.zip. Inside each container, every accession is its own subdirectory containing a metadata.json plus document files in TXT, HTML, XFD, PDF, or FRM format, each wrapped in EDGAR's SGML <DOCUMENT> envelope.
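Walking a downloaded container can be sketched with the standard zipfile module. The example assumes each accession's manifest sits at a path ending in "<accession folder>/metadata.json" inside the archive; any additional leading directories are tolerated, but the exact internal layout should be confirmed against a real container.

```python
import io
import json
import zipfile


def iter_metadata(container):
    """Yield (accession_folder, metadata dict) for each manifest in a ZIP.

    container may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(container) as archive:
        for name in archive.namelist():
            if name.endswith("/metadata.json"):
                # The folder directly above metadata.json names the accession.
                folder = name.rsplit("/", 2)[-2]
                with archive.open(name) as handle:
                    yield folder, json.load(handle)


# Build a tiny in-memory container with hypothetical contents to try it out.
demo_buffer = io.BytesIO()
with zipfile.ZipFile(demo_buffer, "w") as demo_zip:
    demo_zip.writestr(
        "0000000000-07-000001/metadata.json", json.dumps({"formType": "SB-2"})
    )
    demo_zip.writestr("0000000000-07-000001/prospectus.txt", "PROSPECTUS ...")
demo_buffer.seek(0)
demo_manifests = list(iter_metadata(demo_buffer))
```

Because the function is a generator, a full monthly archive can be scanned for manifests without extracting the document files to disk.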

How does this dataset differ from the Form S-1 Files Dataset?

Form S-1 is the general-purpose Securities Act registration statement and runs on full Regulation S-K and S-X disclosure; Form SB-2 ran on the scaled Regulation S-B regime, with two years of audited financials rather than three and lighter MD&A and compensation requirements. After SB-2's 2008 rescission, the small-issuer population migrated to S-1 with smaller-reporting-company accommodations folded into Regulation S-K, so the post-2008 S-1 corpus absorbs SB-2's filer base — SB-2 is its predecessor, not a substitute.

Does the dataset contain XBRL data?

No. Form SB-2 was rescinded before XBRL phase-in reached small registrants, so no record carries an XBRL instance document or inline-XBRL structured data. The dataFiles array and linkToXbrl field in metadata.json are uniformly empty across the corpus.