The Form C Files Dataset contains the complete filing packages for every Regulation Crowdfunding (Regulation CF) offering statement submitted to the SEC on EDGAR. Each record corresponds to a single Form C or Form C/A filing and includes the structured XML offering statement, an XSL-rendered HTML presentation, a provider-generated metadata file, and all non-image exhibit documents such as offering memoranda, financial statements, and corporate formation documents. Filers are the issuing companies themselves — overwhelmingly early-stage startups, small businesses, and single-asset LLCs raising capital from the public through SEC-registered funding portals. The dataset covers filings from May 2016, when Regulation CF took effect, to the present.
The Form C Files Dataset packages every Form C and Form C/A filing submitted to EDGAR since May 16, 2016, when the SEC's final Regulation Crowdfunding rules took effect. Form C is the offering statement required under Section 4(a)(6) of the Securities Act of 1933; Form C/A is its amendment variant, used to update or correct offering terms, financial data, or issuer details during an active campaign. Both form types are structurally identical and appear in the dataset.
Each record is a folder identified by accession number and contains the machine-readable XML offering statement, a styled HTML rendering of that XML, a JSON metadata enrichment file, and all non-image exhibit documents from the original EDGAR submission — predominantly PDFs of offering memoranda, financial statements, CPA review letters, and corporate governance documents. The dataset is distributed in monthly ZIP containers, with each container covering one calendar month of filings.
Each record in the Form C Files Dataset is a folder corresponding to a single EDGAR submission of Form C or Form C/A, identified by accession number. The folder contains the structured XML offering statement, an XSL-rendered HTML presentation of that XML, a dataset-provider-generated metadata file, and all non-image exhibit documents from the original EDGAR submission. One record therefore captures the complete machine-readable and human-readable disclosure package for a single Regulation Crowdfunding offering statement or amendment.
Each record folder is named by accession number (digits only, no hyphens) and contains these files:
metadata.json — A dataset-provider enrichment file (not part of the original EDGAR submission) providing structured metadata about the filing. Top-level fields include formType, accessionNo, filedAt (ISO 8601 timestamp), description, and URL links: linkToFilingDetails (the XSL-rendered HTML on EDGAR), linkToTxt (the monolithic SGML submission text file), linkToHtml (the EDGAR filing index page), and linkToXbrl (always empty for Form C). The documentFormatFiles array enumerates every document in the submission with its sequence number, size in bytes, documentUrl, type (e.g., C, C/A, EX-99), and optional description. The entities array typically contains a single entry for the filing issuer, with fields including cik, companyName, fileNo, irsNo, sic (SIC code and description), stateOfIncorporation, fiscalYearEnd, act (always "33" for Securities Act filings), type, and filmNo. The dataFiles, seriesAndClassesContractsInformation, and linkToXbrl fields are consistently empty for Form C filings.
primary_doc.xml (root level) — The machine-readable structured XML offering statement as filed with EDGAR. This is the core data artifact of the record. It uses the http://www.sec.gov/edgar/formc namespace (with http://www.sec.gov/edgar/common for address elements) and contains the full set of structured disclosures required by Form C. Its internal structure is described in detail below.
xslC_X01/primary_doc.xml — An XSL-transformed HTML rendering of the same XML data, presenting the offering statement as a styled web page with labeled tables, checkbox-style indicators, and formatted text. This mirrors what a user sees when viewing the filing on the EDGAR website. It contains no information beyond the raw XML but provides a human-readable presentation layer. The XSL stylesheet identifier xslC_X01 has been stable since Form C's inception.
Exhibit documents — Zero or more supplemental files attached to the filing, predominantly PDFs. These are typically typed as EX-99 in the EDGAR submission. Common exhibit types include offering memoranda, financial statements, CPA review or audit letters, certificates of incorporation, corporate bylaws, and subscription agreements. The number of exhibits per filing ranges from zero to over a dozen. Some filings also include HTML stub files such as imageattachments.htm (which reference image files via <img> tags) or documents_list.htm (a table-of-contents page, often nearly empty). HTML and plain-text exhibit files may retain EDGAR's SGML document envelope, with <DOCUMENT>, <TYPE>, <SEQUENCE>, <FILENAME>, <DESCRIPTION>, and <TEXT> tags wrapping the inner content. PDFs are always stored as raw binaries without an SGML wrapper.
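The file list above can be navigated from metadata.json alone. The following is a minimal sketch, assuming that accessionNo is stored in the hyphenated EDGAR form and that each documentFormatFiles entry carries the documentUrl, type, and description fields described above. The helper names and the sample record are hypothetical and exist only for illustration.

```python
import posixpath
from urllib.parse import urlparse


def folder_name(accession_no: str) -> str:
    """Record folders are named by accession number with hyphens removed."""
    return accession_no.replace("-", "")


def exhibits(metadata: dict) -> list:
    """Return (file name, EDGAR type, description) for every exhibit entry."""
    rows = []
    for entry in metadata.get("documentFormatFiles", []):
        doc_type = entry.get("type") or ""
        if not doc_type.upper().startswith("EX-"):
            continue  # skip the primary C or C/A document
        url_path = urlparse(entry.get("documentUrl", "")).path
        rows.append((posixpath.basename(url_path), doc_type,
                     entry.get("description") or ""))
    return rows


# Hypothetical sample shaped like the fields described above.
sample = {
    "accessionNo": "0001234567-24-000123",
    "formType": "C",
    "documentFormatFiles": [
        {"sequence": "1", "type": "C",
         "documentUrl": "https://www.sec.gov/Archives/edgar/data/1234567/"
                        "000123456724000123/primary_doc.xml"},
        {"sequence": "2", "type": "EX-99", "description": "Offering Memorandum",
         "documentUrl": "https://www.sec.gov/Archives/edgar/data/1234567/"
                        "000123456724000123/offeringmemoformc.pdf"},
    ],
}

print(folder_name(sample["accessionNo"]))
print(exhibits(sample))
```

For a real record, load the dictionary with json.load from the metadata.json file inside the record folder instead of using the inline sample.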
The root element is <edgarSubmission>. Its major children, in document order:
headerData — Identifies the filing at the EDGAR system level. Contains submissionType (C or C/A), and within filerInfo/filer: the filerCredentials block with filerCik (zero-padded 10-digit CIK) and filerCcc (always masked as XXXXXXXX in public filings), plus fileNumber (the SEC file number, formatted as 020-XXXXX). The liveTestFlag element indicates whether the filing is LIVE or TEST. A flags sub-element contains three boolean flags: confirmingCopyFlag, returnCopyFlag, and overrideInternetFlag.
formData/issuerInformation — Identifies the issuer, the intermediary, and (for amendments) the nature of the amendment. Key sub-elements:
isAmendment — Boolean (true/false). Present in C/A filings to mark the submission as an amendment.
natureOfAmendment — Free-text description of what the amendment changes (e.g., "Extending Campaign End Date", "Updating Form C financial information"). This is an unstructured string, not a fixed enumeration; issuers describe the amendment purpose in their own words.
issuerInfo — Contains nameOfIssuer, a legalStatus block (legalStatusForm with values such as Corporation, Limited Liability Company, Limited Partnership, or Other; legalStatusOtherDesc for non-standard entity types; jurisdictionOrganization as a state/territory code; and dateIncorporation in MM-DD-YYYY format), issuerAddress (using the com: namespace for street1, optional street2, city, stateOrCountry, zipCode), and issuerWebsite.
isCoIssuer — Y or N, indicating whether a co-issuer exists.
The intermediary is identified by companyName (the funding portal or broker-dealer name), commissionCik (the intermediary's CIK), commissionFileNumber (the intermediary's SEC file number), and optionally crdNumber (the intermediary's FINRA CRD number, present when the intermediary is a registered broker-dealer).
formData/offeringInformation — Discloses the economic terms of the offering. Key elements:
compensationAmount — Free-text description of the intermediary's compensation structure (e.g., percentage of amount raised, flat fees, payment processing fees).
financialInterest — Free-text description of any financial interest the intermediary holds in the issuer, or "None"/"No" if none.
securityOfferedType — The category of security: values include Common Equity, Preferred Equity, Debt, SAFE, or Other. When Other, the securityOfferedOtherDesc element provides a free-text description (e.g., "Series CF Preferred Stock", "Convertible Note", "Revenue Share Agreement").
noOfSecurityOffered — Number of securities offered (may be absent for debt or SAFE instruments where unit count is not meaningful).
price — Price per security as a decimal value.
priceDeterminationMethod — Free-text explanation of how the price was determined (e.g., "Fixed price", "Pro-rated portion of the total principal", or "N/A").
offeringAmount — Target offering amount in dollars.
overSubscriptionAccepted — Y or N.
overSubscriptionAllocationType — Describes the allocation method if oversubscription is accepted (e.g., "First-come, first-served basis", "Other").
descOverSubscription — Optional free-text elaboration on oversubscription terms when allocation type is "Other".
maximumOfferingAmount — Maximum dollar amount the issuer will accept.
deadlineDate — Offering deadline in MM-DD-YYYY format.
formData/annualReportDisclosureRequirements — Contains the issuer's employee count and two fiscal years of summary financial data as numeric values. The financial fields, each present for both MostRecentFiscalYear and PriorFiscalYear, are: totalAsset, cashEqui (cash and cash equivalents), actReceived (accounts receivable), shortTermDebt, longTermDebt, revenue, costGoodsSold, taxPaid, and netIncome. All amounts are in dollars. The currentEmployees field is not consistently formatted: it may appear as a plain integer or with decimal places (e.g., 2 or 8.00), so parsers should read it as a number rather than as an integer string.
This section also contains a repeating issueJurisdictionSecuritiesOffering element listing every jurisdiction where the securities will be offered. Values are two-character EDGAR state and country codes: U.S. state postal codes and DC (e.g., CA, NY, DC), U.S. territory codes (e.g., PR, GU, VI), and EDGAR's codes for Canadian provinces and territories (A0 through A9, B0) and for Canada at the federal level (Z4). Most issuers list all 50 states plus DC; some also list the territory and Canadian codes.
formData/signatureInfo — Contains the issuer-level signature and individual signatory information. The issuerSignature block includes issuer (entity name), issuerSignature (the signatory's name or /s/ signature), and issuerTitle. The signaturePersons block contains one or more signaturePerson entries, each with personSignature, personTitle, and signatureDate (MM-DD-YYYY). Multiple signatories appear when directors or officers beyond the principal executive sign the filing.
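The structure above can be read with Python's standard library. The following is a minimal sketch, not a reference parser. It matches elements by local name so that it does not depend on the exact nesting or on namespace prefixes, and it assumes the fiscal-year fields are named by appending MostRecentFiscalYear or PriorFiscalYear to the base names listed above. The function names and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

FIN_FIELDS = ("totalAsset", "cashEqui", "actReceived", "shortTermDebt",
              "longTermDebt", "revenue", "costGoodsSold", "taxPaid", "netIncome")
YEARS = ("MostRecentFiscalYear", "PriorFiscalYear")  # assumed suffix convention


def local(tag: str) -> str:
    """Drop the '{namespace}' prefix ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def all_text(root, name):
    return [(el.text or "").strip() for el in root.iter() if local(el.tag) == name]


def first_text(root, name):
    values = all_text(root, name)
    return values[0] if values else None


def to_number(text):
    """Parse numeric fields such as '8.00' or '2'; return None if absent or invalid."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return None


def summarize(xml_text: str) -> dict:
    root = ET.fromstring(xml_text)
    out = {n: first_text(root, n) for n in
           ("submissionType", "filerCik", "nameOfIssuer",
            "securityOfferedType", "deadlineDate")}
    for n in ("offeringAmount", "maximumOfferingAmount", "price", "currentEmployees"):
        out[n] = to_number(first_text(root, n))
    for field in FIN_FIELDS:
        for year in YEARS:
            out[field + year] = to_number(first_text(root, field + year))
    out["jurisdictions"] = all_text(root, "issueJurisdictionSecuritiesOffering")
    return out


# Hypothetical, heavily abbreviated document for illustration only.
SAMPLE = """<edgarSubmission xmlns="http://www.sec.gov/edgar/formc">
  <headerData><submissionType>C</submissionType>
    <filerInfo><filer><filerCredentials><filerCik>0001234567</filerCik>
    </filerCredentials></filer></filerInfo></headerData>
  <formData>
    <issuerInformation><issuerInfo><nameOfIssuer>Example Co</nameOfIssuer>
    </issuerInfo></issuerInformation>
    <offeringInformation><securityOfferedType>Common Equity</securityOfferedType>
      <offeringAmount>50000.00</offeringAmount>
      <maximumOfferingAmount>1000000.00</maximumOfferingAmount>
      <deadlineDate>12-31-2024</deadlineDate></offeringInformation>
    <annualReportDisclosureRequirements><currentEmployees>8.00</currentEmployees>
      <totalAssetMostRecentFiscalYear>12500.00</totalAssetMostRecentFiscalYear>
      <issueJurisdictionSecuritiesOffering>CA</issueJurisdictionSecuritiesOffering>
      <issueJurisdictionSecuritiesOffering>NY</issueJurisdictionSecuritiesOffering>
    </annualReportDisclosureRequirements>
  </formData>
</edgarSubmission>"""

summary = summarize(SAMPLE)
print(summary["nameOfIssuer"], summary["offeringAmount"], summary["jurisdictions"])
```

Matching by local name trades precision for robustness: it returns the first element with a given name anywhere in the tree, which is adequate for names that occur once but would need path-based lookups for names that repeat in different blocks.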
The non-XML, non-metadata files in a record folder constitute the exhibit layer. Their content is predominantly narrative and unstructured:
Offering memoranda — The most common and typically most substantive exhibit. These PDF documents (sometimes styled as "Form C" or "Offering Circular") present investor-facing disclosure including a business description, risk factors, description of the securities, use-of-proceeds breakdown, capitalization table, management biographies, and financial statements. Length ranges from a few pages to over fifty.
Financial statements — Standalone PDF files containing balance sheets, income statements, and cash-flow statements, sometimes with notes. Depending on the offering amount and whether the issuer is a first-time filer, these may be self-certified by the principal executive officer, reviewed by an independent public accountant, or audited. CPA review or audit letters, when present, appear as separate PDF exhibits.
Corporate formation documents — Certificates of incorporation, articles of organization, operating agreements, and corporate bylaws.
Other exhibits — Subscription agreements, investor questionnaires, term sheets, progress update narratives, and miscellaneous attachments. File naming conventions are not standardized across issuers or intermediaries. Some portals use descriptive names (offeringmemoformc.pdf, reviewletter.pdf), others use sequential names (document_1.pdf through document_13.pdf), and others use issuer-specific or date-stamped names. The documentFormatFiles array in metadata.json and the <DESCRIPTION> tag within SGML-wrapped documents provide the most reliable mapping between filenames and document roles.
Each dataset record includes the metadata file, the raw XML offering statement, the XSL-rendered HTML version of the XML, and all non-image documents from the original EDGAR submission. PDFs are stored as raw binaries. HTML and plain-text exhibits that were originally wrapped in EDGAR's SGML document envelope retain that wrapper in the dataset copy, preserving the <TYPE>, <SEQUENCE>, <FILENAME>, and <DESCRIPTION> metadata within the file itself.
Image files (JPEG, PNG, GIF) referenced by or embedded in the EDGAR submission are excluded from the dataset. HTML stub files that reference images (such as imageattachments.htm containing <img> tags) may be present, but the image binaries they reference are not included. The complete EDGAR submission text file (the monolithic SGML archive concatenating all documents) is not stored in the record folder, though its URL is available in the linkToTxt field of metadata.json.
Amendment linkage — Form C/A filings are amendments to a prior Form C but do not contain an explicit reference to the parent filing's accession number. Linking amendments to their original offering requires matching on issuer CIK (or company name) and examining chronological filing order. The natureOfAmendment field is free text and not machine-parseable into fixed categories.
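One way to approximate the linkage described above is sketched here, under stated assumptions: the input is a list of metadata.json dictionaries, the issuer is the first entry of entities, and every Form C opens a new offering to which later C/A filings by the same CIK attach. This is a heuristic, not a rule stated by the filings themselves, and it will mis-assign amendments for an issuer that runs overlapping offerings. All function names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def _filed_at(record: dict) -> datetime:
    # filedAt is an ISO 8601 timestamp; normalize a trailing 'Z' for older Pythons.
    return datetime.fromisoformat(record["filedAt"].replace("Z", "+00:00"))


def group_by_issuer(records) -> dict:
    """Group metadata records by issuer CIK, each group sorted by filing time."""
    by_cik = defaultdict(list)
    for rec in records:
        entities = rec.get("entities") or [{}]
        cik = str(entities[0].get("cik", "")).lstrip("0")
        by_cik[cik].append(rec)
    for filings in by_cik.values():
        filings.sort(key=_filed_at)
    return dict(by_cik)


def offering_chains(filings: list) -> list:
    """Split one issuer's sorted filings into chains of a Form C plus its C/A filings."""
    chains = []
    for rec in filings:
        if rec["formType"] == "C" or not chains:
            chains.append([rec])      # a new Form C starts a new chain
        else:
            chains[-1].append(rec)    # a C/A attaches to the most recent chain
    return chains


# Hypothetical records, deliberately out of order.
recs = [
    {"accessionNo": "a3", "formType": "C", "filedAt": "2024-06-01T10:00:00-04:00",
     "entities": [{"cik": "1234567"}]},
    {"accessionNo": "a1", "formType": "C", "filedAt": "2024-01-10T10:00:00-05:00",
     "entities": [{"cik": "1234567"}]},
    {"accessionNo": "a4", "formType": "C/A", "filedAt": "2024-07-01T10:00:00-04:00",
     "entities": [{"cik": "1234567"}]},
    {"accessionNo": "a2", "formType": "C/A", "filedAt": "2024-02-01T10:00:00-05:00",
     "entities": [{"cik": "1234567"}]},
]
chains = offering_chains(group_by_issuer(recs)["1234567"])
print([[r["accessionNo"] for r in chain] for chain in chains])
```

The last element of each chain is then the candidate for the current disclosure state of that offering.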
Financial data: XML versus exhibits — The XML annualReportDisclosureRequirements section provides machine-extractable summary financial figures (total assets, revenue, net income, etc.) for two fiscal years. The PDF exhibits may contain more detailed financial statements with notes, auditor opinions, and comparative schedules. These two sources are complementary: the XML gives structured summary fields; the PDFs give the full financial picture in narrative and tabular form.
Intermediary variation — The crowdfunding portal or broker-dealer through which the offering is conducted (e.g., StartEngine, Wefunder, Republic, Honeycomb, DealMaker) significantly influences exhibit formatting, naming conventions, number of attached documents, and the level of detail in free-text XML fields like compensationAmount. Systematic differences by intermediary are common.
SGML-wrapped versus raw files — HTML and text exhibit files may retain the EDGAR SGML document wrapper (<DOCUMENT>...</DOCUMENT> with <TYPE>, <SEQUENCE>, <FILENAME>, <TEXT> tags), while PDFs are always raw binaries. Parsers extracting content from HTML or text exhibits should detect and strip the SGML envelope to reach the inner content.
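Detecting and stripping the envelope can be done with two regular expressions. The sketch below assumes the usual EDGAR layout, in which each header tag sits on its own line with its value and no closing tag, and the content sits between <TEXT> and </TEXT>. The function name unwrap_sgml is hypothetical.

```python
import re

_TEXT_RE = re.compile(r"<TEXT>(.*?)</TEXT>", re.DOTALL | re.IGNORECASE)
_HEADER_RE = re.compile(r"^<(TYPE|SEQUENCE|FILENAME|DESCRIPTION)>(.*)$", re.MULTILINE)


def unwrap_sgml(raw: str):
    """Return (header fields, inner content); pass unwrapped files through unchanged."""
    if not raw.lstrip()[:20].upper().startswith("<DOCUMENT>"):
        return {}, raw
    match = _TEXT_RE.search(raw)
    if match is None:
        return {}, raw
    # Header lines appear before the <TEXT> block.
    header = {key: value.strip()
              for key, value in _HEADER_RE.findall(raw[:match.start()])}
    return header, match.group(1).strip()


wrapped = (
    "<DOCUMENT>\n<TYPE>EX-99\n<SEQUENCE>3\n<FILENAME>documents_list.htm\n"
    "<DESCRIPTION>Documents\n<TEXT>\n<html><body>Hi</body></html>\n</TEXT>\n"
    "</DOCUMENT>\n"
)
header, inner = unwrap_sgml(wrapped)
print(header["TYPE"], header["FILENAME"])
print(inner)
```

Only HTML and text exhibits should be passed through this function; PDFs are binary and should be read as bytes without any unwrapping.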
Jurisdiction codes — The issueJurisdictionSecuritiesOffering elements use two-character EDGAR state and country codes. Standard U.S. state postal abbreviations, DC, and territory codes such as PR, GU, and VI are straightforward; A0 through A9 and B0 are EDGAR's codes for Canadian provinces and territories, and Z4 denotes Canada at the federal level. Because Regulation CF preempts state registration requirements, most issuers list all jurisdictions rather than indicating geographic targeting.
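A simple way to work with these values is to separate the 50 state codes and DC from everything else. The sketch below does only that: it does not attempt to name the remaining codes, and the function names are hypothetical.

```python
US_STATES_AND_DC = frozenset(
    "AL AK AZ AR CA CO CT DE DC FL GA HI ID IL IN IA KS KY LA ME MD MA MI MN MS "
    "MO MT NE NV NH NJ NM NY NC ND OH OK OR PA RI SC SD TN TX UT VT VA WA WV WI WY"
    .split()
)


def split_jurisdictions(codes):
    """Return (state or DC codes, all other codes), preserving input order."""
    states, other = [], []
    for code in codes:
        code = code.strip().upper()
        (states if code in US_STATES_AND_DC else other).append(code)
    return states, other


def covers_all_states(codes) -> bool:
    """True when every one of the 50 states plus DC is listed."""
    return US_STATES_AND_DC <= {c.strip().upper() for c in codes}


states, other = split_jurisdictions(["CA", "NY", "A0", "DC"])
print(states, other)
```

Because most issuers list every jurisdiction, covers_all_states is mainly useful for finding the minority of filings that restrict the offering geographically.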
Offering outcome not recorded — The XML deadlineDate records when the offering period ends, but the filing itself does not indicate whether the offering was successfully completed, withdrawn, or expired. Outcome data must be inferred from subsequent filings (Form C-U progress updates, Form C/A termination amendments) or external sources.
No XBRL component — Form C filings contain no XBRL or Inline XBRL data. The linkToXbrl field in metadata is always empty. The XML offering statement uses a Form C-specific schema, not the XBRL taxonomy used by other SEC filings.
Date format — All dates within the XML (incorporation date, signature dates, offering deadline) use MM-DD-YYYY format, not the ISO 8601 format used in the metadata filedAt timestamp.
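Normalizing the two date conventions is a one-liner each with the standard library. This is a small sketch with hypothetical helper names; the timestamp values are made-up examples in the formats described above.

```python
from datetime import date, datetime


def parse_xml_date(text: str) -> date:
    """Parse an MM-DD-YYYY value such as deadlineDate or signatureDate."""
    return datetime.strptime(text.strip(), "%m-%d-%Y").date()


def parse_filed_at(text: str) -> datetime:
    """Parse the ISO 8601 filedAt timestamp from metadata.json."""
    # Older Python versions do not accept a trailing 'Z', so normalize it.
    return datetime.fromisoformat(text.strip().replace("Z", "+00:00"))


deadline = parse_xml_date("12-31-2024")
filed = parse_filed_at("2024-03-15T16:30:12-04:00")
print(deadline.isoformat(), filed.date().isoformat())
```

Converting both to date or datetime objects before comparison avoids the ordering errors that string comparison of MM-DD-YYYY values would produce.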
The filer is the issuer: the company or entity raising capital from the public under Regulation Crowdfunding (Regulation CF). The issuer's CIK and the filing's accession number appear in the EDGAR submission header.
Form C filers are overwhelmingly small, private, domestic companies: early-stage startups, small businesses, single-asset real estate LLCs, and similar ventures. Entity types include corporations, LLCs, partnerships, and public benefit corporations organized under U.S. state or territorial law.
Ineligible issuers (entities that cannot file Form C):
Every Regulation CF offering must be conducted through an SEC-registered intermediary — a broker-dealer or a funding portal registered with the SEC and FINRA. The intermediary's name, CIK, Commission file number (007- series for funding portals), and CRD number appear in the filing, but the intermediary is not the filer. The filing obligation belongs solely to the issuer. In practice, many platforms help issuers prepare and submit filings, but the issuer remains the legal filer of record.
The trigger is transactional, not periodic. An issuer files Form C when it decides to conduct a Regulation CF offering. The filing must be made on EDGAR and provided to the intermediary before the offering can be presented to investors. There is no minimum advance-filing period; the filing and the opening of the offering may occur on the same day.
Each new offering requires a separate Form C, even by the same issuer. There is no fixed calendar or quarterly cycle. Filings arrive at EDGAR on a rolling basis year-round.
An issuer files Form C/A to disclose any material change to a pending offering statement. Material changes include revisions to offering terms (price, amount, deadline), changes in officers or directors, updated financial statements, a change of intermediary, or corrections of material errors. Amendments must be filed promptly when a material change occurs. Investors then have five business days to reconfirm their commitments or have them cancelled. Amendments are event-driven and may occur at any point before the offering closes or is withdrawn.
This dataset contains only Form C and Form C/A filings. Related forms filed under Regulation CF but not included are:
The forms and datasets most likely to overlap or cause confusion with the Form C Files Dataset are other exempt-offering notice filings, scaled offering statements, and the Form C reporting family itself.
Form D (Regulation D notice filings). The nearest neighbor. Both Form C and Form D are exempt-offering filings by private, often early-stage companies, and both record basic offering metadata (amount, security type, issuer identity). However, Form D is a short structured notice with no substantive disclosure — no financial statements, no business narrative, no use-of-proceeds detail. Form C is a full offering statement: structured XML with financials, officer/director disclosure, intermediary identification, and detailed offering terms, plus attached PDFs (offering memoranda, review letters, governance documents). Form D covers Regulation D placements (Rules 504, 506(b), 506(c)) with no hard dollar cap and investor restrictions based on accreditation; Form C covers Regulation CF offerings capped at $5 million annually and open to non-accredited investors through registered portals. A Form D dataset tracks placement volume; the Form C Files Dataset provides campaign-level disclosure content.
Form 1-A (Regulation A offering statements). Also an exempt-offering disclosure document filed before securities are sold, containing financial statements, business descriptions, use of proceeds, and management information. The key differences are scale and ongoing obligations. Regulation A permits raises up to $75 million (Tier 2) versus Regulation CF's $5 million cap. Form 1-A filers tend to be more mature, and Tier 2 issuers must file audited financials and periodic reports (Form 1-K annually, Form 1-SA semiannually). Form C filers often submit only reviewed or officer-certified financials, and their ongoing obligations are lighter: an annual report on Form C-AR plus Form C-U progress updates. Form 1-A resembles a scaled-down registration statement; Form C resembles a standardized pitch document with basic financials.
Form C-U (Regulation CF progress reports). Part of the same regulatory family but not included in this dataset. Form C-U reports the total amount of securities actually sold in a completed or terminated offering. It is a brief structured filing with no financial statements or exhibits. The Form C Files Dataset captures the offering as proposed; Form C-U captures the outcome. Researchers tracking which campaigns reached their targets need both.
Form C/A (amendments). Included in this dataset alongside original Form C filings. Amendments revise offering terms, financials, intermediary details, or other material elements during an active offering. The dataset labels these with form type C/A but does not indicate what changed — identifying revisions requires comparing amendment content against the original. A single offering may produce multiple accession numbers (one Form C plus one or more C/A amendments), and the latest C/A represents the current disclosure state.
| Dimension | Form C | Form D | Form 1-A |
|---|---|---|---|
| Regulation | Reg CF | Reg D (504, 506) | Reg A (Tier 1/2) |
| Offering cap | $5M/year | None (504: $10M) | $20M/$75M |
| Disclosure depth | Structured XML + PDF exhibits | Minimal notice | Full offering circular |
| Financial statements | Officer-certified, reviewed, or audited | None | Audited (Tier 2) |
| Ongoing reporting | Form C-AR (annual), Form C-U | None | Form 1-K, 1-SA |
| Investor restrictions | Open to all via portal | Accredited (506) or limited | Open to all (Tier 2) |
| Typical filer | Pre-revenue / seed-stage | Private companies, all stages | Growth-stage private |
Structured financial data. Form C XML contains nine summary financial line items per fiscal year (total assets, cash, accounts receivable, short-term and long-term debt, revenue, cost of goods sold, taxes paid, and net income) in a Form C-specific XML schema, not XBRL. No XBRL-based financial dataset (derived from 10-K, 10-Q, or Form 1-A filings) includes Form C issuers. Detailed financials for crowdfunding issuers exist only in the attached PDF exhibits, which vary widely in format.
The Form C Files Dataset is the only SEC dataset that packages Regulation Crowdfunding offering statements with their full set of attached documents — offering memoranda, review letters, governance filings, and financial exhibits. It is distinguished from Form D by disclosure depth, from Form 1-A by offering scale and reporting lifecycle, and from XBRL financial datasets by schema and filer population. Within the Regulation CF family, it covers the offering-stage disclosure (Form C and C/A) but not the outcome reporting (Form C-U).
Compliance officers and in-house counsel at SEC-registered funding portals and broker-dealers use Form C filings to verify issuer disclosures before campaigns launch and to monitor Form C/A amendments for material mid-offering changes. Key fields: issuer jurisdiction, officer and director disclosures, financial statement tier, offering limits, and use-of-proceeds descriptions. These feed ongoing checks against Regulation CF caps, disclosure completeness, and bad-actor disqualification rules.
Attorneys advising early-stage issuers benchmark disclosure language against peer filings — risk factors, business descriptions, pricing methods, and use of proceeds. Amendment patterns across Form C/A filings reveal which disclosure areas most often require revision, informing drafting conventions for new offerings.
Analysts at venture funds, angel syndicates, and family offices screen the dataset for new deal flow. They track offerings by industry, geography, and size, focusing on business descriptions, target amounts, implied valuations, management backgrounds, and financials. The Form C Files Dataset supports pipeline generation and competitive intelligence on companies that may later seek institutional capital.
Quantitative researchers and data scientists parse Form C XML and structured text to build databases of offering terms, founder characteristics, and capital-raising outcomes. Extracted fields — target and maximum amounts, deadlines, securities types, financial metrics — feed scoring models, dashboards, and crowdfunding market reports.
Finance economists study Regulation CF as a small-business capital-formation channel. The dataset supports analysis of offering success rates, geographic and industry concentration, disclosure quality versus fundraising outcomes, and the effects of regulatory changes. Longitudinal coverage from 2016 onward enables time-series work on post-JOBS Act market evolution.
Staff at state securities divisions identify in-state issuers, cross-reference offering details against state registration requirements, and flag potential disclosure deficiencies or offering-limit violations. Primary fields: issuer jurisdiction, offering amounts, and financial statements.
Data engineering teams ingest Form C filings as a source of structured and semi-structured small-company records. The mix of XML metadata, HTML offering circulars, and PDF financial statements provides material for entity extraction, document classification, and retrieval-augmented generation systems. The standardized form structure paired with wide variation in issuer quality makes the corpus useful for training and evaluating NLP models.
Reporters covering startup finance use the dataset to surface patterns such as repeat issuers, geographic clusters, outlier offering sizes, and filings amended multiple times. Business descriptions, officer fields, and financials anchor stories on individual companies or broader crowdfunding market dynamics.
Investment analysts extract offeringAmount, maximumOfferingAmount, securityOfferedType, and price from the XML to filter new Regulation CF campaigns by size, security structure, and implied valuation. Pairing these fields with nameOfIssuer, SIC code, and issuerAddress enables systematic pipeline generation by industry and geography. Offering memorandum PDFs attached as EX-99 exhibits supply management bios, capitalization tables, and business narratives for deeper screening.
Securities attorneys compare risk factors, use-of-proceeds descriptions, and pricing-method disclosures across Form C filings to establish drafting norms for new Regulation CF campaigns. The priceDeterminationMethod and compensationAmount free-text fields reveal how peer issuers describe valuation rationale and intermediary fee structures. Tracking Form C/A amendment patterns — specifically the natureOfAmendment field — identifies which disclosure areas most frequently require mid-offering revision.
Data engineers parse the primary_doc.xml across all records to extract machine-readable fields: target and maximum offering amounts, security types, deadline dates, intermediary identity, and the two-year summary financials (revenue, net income, total assets, debt, cash) from annualReportDisclosureRequirements. This structured extract powers dashboards, market reports, and scoring models covering the full Regulation CF market from 2016 onward — a population not covered by any XBRL-based financial dataset.
Compliance teams at funding portals and state securities regulators track Form C/A filings by matching amendments to original offerings on issuer CIK and chronological order. Comparing XML field values between an original Form C and its amendments surfaces changes to offering amounts, deadline extensions, updated financials, or revised intermediary terms. The isAmendment flag and natureOfAmendment text identify which filings are amendments and what the issuer says changed.
NLP and data science teams use the dataset's mix of structured XML, HTML renderings, and unstructured PDF exhibits as a labeled corpus for document processing tasks. The documentFormatFiles array in metadata.json maps each file to its EDGAR document type and sequence number, providing ground-truth labels for classifying offering memoranda, financial statements, CPA review letters, and corporate formation documents. The standardized XML schema paired with wide variation in exhibit quality and formatting across intermediaries creates a useful training and evaluation set.
Dataset Index JSON API: https://api.sec-api.io/datasets/form-c-files.json
This endpoint returns metadata about the Form C Files Dataset, including the dataset name, description, last updated timestamp, earliest sample date, total records, total size, covered form types (C, C/A), container format (ZIP), and content file types (XML, PDF, HTML, JSON, TXT). It also returns the download URL for the entire dataset and a list of all individual container files with per-container metadata such as size, record count, last updated timestamp, and download URL. This endpoint does not require an API key.
Use this endpoint to monitor which containers have been updated in the most recent refresh run, so you can decide on a daily basis which containers to download rather than re-downloading the full dataset each time.
{
  "datasetId": "1f13365b-9ae0-6919-9991-a7a48f300d3e",
  "datasetDownloadUrl": "https://api.sec-api.io/datasets/form-c-files.zip",
  "name": "Form C Files Dataset",
  "updatedAt": "2026-04-17T02:57:24.606Z",
  "earliestSampleDate": "2016-05-01",
  "totalRecords": 133200,
  "totalSize": 345212926344,
  "formTypes": ["C", "C/A"],
  "containerFormat": "ZIP",
  "fileTypes": ["XML", "PDF", "HTML", "JSON", "TXT"],
  "containers": [
    {
      "downloadUrl": "https://api.sec-api.io/datasets/form-c-files/2026/2026-04.zip",
      "key": "2026/2026-04.zip",
      "size": 48291037,
      "records": 312,
      "updatedAt": "2026-04-17T02:57:24.606Z"
    }
  ]
}
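The incremental-refresh idea can be sketched as follows: fetch the index, then keep only the containers whose updatedAt is later than the time of your last sync. The endpoint URL and the field names come from the response shown above; the function names and the sample timestamps are hypothetical, and fetch_index is defined but not called here so the example runs offline.

```python
import json
import urllib.request
from datetime import datetime

INDEX_URL = "https://api.sec-api.io/datasets/form-c-files.json"


def fetch_index(url: str = INDEX_URL) -> dict:
    """Download the dataset index (no API key needed for this endpoint)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def _ts(text: str) -> datetime:
    # Normalize a trailing 'Z' so older Python versions can parse it.
    return datetime.fromisoformat(text.replace("Z", "+00:00"))


def containers_updated_since(index: dict, since_iso: str) -> list:
    """Return container entries whose updatedAt is strictly after since_iso."""
    cutoff = _ts(since_iso)
    return [c for c in index.get("containers", []) if _ts(c["updatedAt"]) > cutoff]


# Hypothetical index fragment in the shape shown above.
index = {"containers": [
    {"key": "2026/2026-03.zip", "updatedAt": "2026-04-01T02:00:00.000Z"},
    {"key": "2026/2026-04.zip", "updatedAt": "2026-04-17T02:57:24.606Z"},
]}
stale = containers_updated_since(index, "2026-04-16T00:00:00Z")
print([c["key"] for c in stale])
```

In a scheduled job, the cutoff would be the updatedAt value recorded at the end of the previous run, and index would come from fetch_index().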
Download Entire Dataset: https://api.sec-api.io/datasets/form-c-files.zip?token=YOUR_API_KEY
Downloads the full dataset as a single ZIP archive containing all Form C and C/A filing containers. This endpoint requires an API key passed as the token query parameter.
Download Single Container: https://api.sec-api.io/datasets/form-c-files/2026/2026-04.zip?token=YOUR_API_KEY
Downloads an individual monthly container file instead of the full dataset. Each container is a ZIP archive covering one month of filings. Replace the year and month in the URL path to target a specific period. This endpoint requires an API key passed as the token query parameter.
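Building the container URL for a given month follows directly from the path pattern above. The sketch below assumes the pattern YYYY/YYYY-MM.zip holds for every month, as the example URL suggests. The function names are hypothetical, and download is defined but not called so that nothing is fetched when the example runs.

```python
import shutil
import urllib.request
from urllib.parse import urlencode

BASE_URL = "https://api.sec-api.io/datasets/form-c-files"


def container_url(year: int, month: int, token: str) -> str:
    """Build the download URL for one monthly container."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    query = urlencode({"token": token})
    return f"{BASE_URL}/{year:04d}/{year:04d}-{month:02d}.zip?{query}"


def download(url: str, dest_path: str) -> None:
    """Stream a container to disk without loading it fully into memory."""
    with urllib.request.urlopen(url, timeout=60) as resp, open(dest_path, "wb") as out:
        shutil.copyfileobj(resp, out)


print(container_url(2026, 4, "YOUR_API_KEY"))
```

Streaming with shutil.copyfileobj matters mostly for the full-dataset archive, whose totalSize in the index is far larger than typical available memory.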
The dataset covers Form C and Form C/A, the offering statement and amendment forms required under Regulation Crowdfunding (Regulation CF), codified under Section 4(a)(6) of the Securities Act of 1933.
Each record is a folder identified by accession number containing the structured XML offering statement, an XSL-rendered HTML presentation, a JSON metadata file, and all non-image exhibit documents (primarily PDFs) from a single Form C or Form C/A EDGAR submission.
The filer is the issuing company — typically an early-stage startup, small business, or LLC organized under U.S. state or territorial law — that intends to raise capital from the public through a Regulation CF offering conducted via an SEC-registered funding portal or broker-dealer.
The dataset covers filings from May 2016, when the SEC's Regulation Crowdfunding rules took effect, to the present. Form C has no pre-EDGAR or paper history.
No. Form C filings contain no XBRL or Inline XBRL data. Financial figures are provided as numeric values in a Form C-specific XML schema within the annualReportDisclosureRequirements section, and as narrative statements in attached PDF exhibits.
Form D is a minimal notice filing with no financial statements, business narratives, or exhibits. Form C is a full offering statement with structured XML financials, detailed offering terms, intermediary identification, and attached PDF documents such as offering memoranda and CPA review letters. Form D covers Regulation D placements; Form C covers Regulation CF offerings capped at $5 million annually.
The dataset is distributed as monthly ZIP containers. Each container holds record folders containing XML, HTML, JSON, PDF, and TXT files. The full dataset can also be downloaded as a single ZIP archive.
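Once a container is downloaded, its record folders can be enumerated without extracting anything. The sketch below makes one assumption that is not stated above: a record folder is recognized as any path component consisting of exactly 18 digits (an accession number without hyphens), wherever it sits inside the archive. The function name and the in-memory demo archive are hypothetical.

```python
import io
import re
import zipfile
from collections import defaultdict

ACCESSION_DIR = re.compile(r"^\d{18}$")  # accession number with hyphens removed


def records_in_container(zf: zipfile.ZipFile) -> dict:
    """Map each accession folder to the relative paths of the files inside it."""
    records = defaultdict(list)
    for name in zf.namelist():
        if name.endswith("/"):
            continue  # skip explicit directory entries
        parts = name.split("/")
        for i, part in enumerate(parts[:-1]):
            if ACCESSION_DIR.match(part):
                records[part].append("/".join(parts[i + 1:]))
                break
    return dict(records)


# Build a tiny in-memory archive that imitates one record folder.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    for path in ("2026-04/000123456724000123/metadata.json",
                 "2026-04/000123456724000123/primary_doc.xml",
                 "2026-04/000123456724000123/xslC_X01/primary_doc.xml",
                 "2026-04/000123456724000123/offeringmemoformc.pdf"):
        zf.writestr(path, "")

with zipfile.ZipFile(buf) as zf:
    demo = records_in_container(zf)
print(demo)
```

For a real container, open the downloaded file with zipfile.ZipFile(path) and read individual members with zf.open(...) as needed.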