The Form C-U Files dataset is a complete, monthly-refreshed archive of every Form C-U progress-update filing submitted to EDGAR by Regulation Crowdfunding ("Reg CF") issuers since the form first became available. One record represents one Form C-U submission (a single milestone, close, or correction in a Reg CF offering's life), packaged as an accession-named folder containing the canonical structured XML form, an XSL-rendered XHTML view, and a sec-api metadata.json index. Form C-U is filed by the issuer (not the funding portal) under Rule 203(a)(3) of Regulation Crowdfunding within five business days of reaching 50% and 100% of the target offering amount, and again at offering close to report the total sold; the 50% and 100% filings are not required when the intermediary already publishes frequent progress updates on its platform. The dataset's earliest sample is dated 2016-08-01, shortly after Reg CF took effect in May 2016. It is delivered as monthly ZIP archives containing XML, JSON, HTML, and (occasionally) PDF artifacts.
The dataset packages every Form C-U progress-update submission filed on EDGAR by Regulation Crowdfunding issuers, beginning in August 2016 and continuing on a monthly refresh cadence. Form C-U is the "progress update" variant of the Form C family used to administer Regulation Crowdfunding offerings under Section 4(a)(6) of the Securities Act of 1933. Each filing is event-driven — triggered by a specific funding milestone or by the close of the offering — rather than periodic, so the dataset accumulates as issuers hit thresholds rather than on a calendar.
The dataset is distributed as monthly ZIP archives named YYYY/YYYY-MM.zip. Each archive unpacks to a single top-level YYYY-MM/ directory whose immediate children are accession-number folders, one per filing. Accession folders are named with the 18-digit dash-stripped form of the EDGAR accession number (e.g., the dashed accession 0001859901-25-000002 becomes the folder 000185990125000002). The first ten digits identify the filer (or filing agent) CIK that submitted the envelope, the next two encode the year of filing, and the last six are the sequence within that filer-year. There is no nested form-type directory and no index file at the month level — the accession folders themselves are the only contents under YYYY-MM/. The dataset's file types are XML, JSON, HTML, and PDF, although in practice Form C-U records consist almost entirely of an XML/JSON/HTML triplet, with PDF appearances limited to the unusual filings where an issuer attaches a supplementary document.
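The naming rule above can be sketched in a few lines. This is a minimal illustration, not part of any sec-api client: the helper names (`folder_name`, `split_folder`, `archive_location`) are hypothetical, and the month passed in the usage example is made up, since the text does not say which archive holds that accession.

```python
import re
from typing import NamedTuple


class Accession(NamedTuple):
    filer_cik: str  # first ten digits: CIK of the filer or filing agent
    year: str       # next two digits: two-digit filing year
    sequence: str   # last six digits: sequence within that filer-year


def folder_name(dashed: str) -> str:
    """Turn a dashed accession number into the 18-digit folder name."""
    if not re.fullmatch(r"\d{10}-\d{2}-\d{6}", dashed):
        raise ValueError(f"not a dashed accession number: {dashed!r}")
    return dashed.replace("-", "")


def split_folder(name: str) -> Accession:
    """Split an 18-digit accession folder name into its three parts."""
    if not re.fullmatch(r"\d{18}", name):
        raise ValueError(f"not an 18-digit accession folder: {name!r}")
    return Accession(name[:10], name[10:12], name[12:])


def archive_location(dashed: str, month: str) -> tuple[str, str]:
    """Return (archive key, folder path inside the archive) for a YYYY-MM month."""
    return f"{month[:4]}/{month}.zip", f"{month}/{folder_name(dashed)}/"


# The month below is an assumption chosen only for illustration.
print(archive_location("0001859901-25-000002", "2025-01"))
```

Keeping the CIK as a zero-padded string rather than an integer avoids losing the leading zeros that the folder name depends on.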
Unlike the issuer's initial Form C, Form C-U is a thin, highly structured progress-reporting form. It does not re-disclose offering documents, financials, or risk factors as exhibits, but it does carry the full annual-report financial-disclosure block and offering-terms block from the original Form C so each progress update is internally self-contained. EDGAR also accepts a C-U/A amendment variant; both surface in the Form C-U universe and must be disambiguated by reading the progress-update prose and the form-type metadata.
One record in the Form C-U Files dataset is a single Form C-U progress-update submission filed on EDGAR by a Regulation Crowdfunding issuer. The record is identified by its EDGAR accession number and materializes on disk as an accession-named folder containing the canonical structured XML form, the EDGAR XSL-rendered XHTML view of that XML, and a sec-api metadata.json index file. The accession folder is the canonical record unit: it corresponds one-to-one with a Form C-U filing event, whether that event is the issuer's first 50%-of-target progress notice, a 100%-of-target update, a final close-of-offering report (successful or unsuccessful), or an amendment to an earlier Form C-U.
Inside every accession folder, the file set is uniform across the dataset:
- metadata.json: sec-api filing-level index record describing the filing envelope, filer/issuer entity, and document inventory.
- primary_doc.xml: the canonical EDGAR Form C-U XML submission containing all structured form data.
- xslC_X01/primary_doc.xml: the EDGAR-rendered XHTML view of the same form, produced by EDGAR's xslC_X01.xsl stylesheet. Despite its .xml extension, the file is XHTML (<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" ...>) and is intended for human reading.

Form C-U is a structured form with no separate exhibits in practice; the XML form and its HTML rendering together constitute the entire filing. The dataset omits image files from the archive, but this is rarely material for Form C-U because the form is essentially metadata-only and does not commonly carry image, PDF, or other binary attachments.
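A quick way to confirm that an archive follows this layout is to list its members and group them by accession folder. The sketch below uses only the standard library; the function names are hypothetical, and the in-memory archive is built from made-up contents so the example runs without downloading anything.

```python
import io
import zipfile
from collections import defaultdict

# The three files the text above says every accession folder contains.
EXPECTED = {"metadata.json", "primary_doc.xml", "xslC_X01/primary_doc.xml"}


def records_in_archive(zf: zipfile.ZipFile) -> dict[str, set[str]]:
    """Map each accession folder to the relative paths stored under it."""
    records: dict[str, set[str]] = defaultdict(set)
    for name in zf.namelist():
        parts = name.split("/", 2)  # YYYY-MM / accession / relative path
        if len(parts) == 3 and parts[2] and not name.endswith("/"):
            records[parts[1]].add(parts[2])
    return dict(records)


def incomplete_records(zf: zipfile.ZipFile) -> dict[str, set[str]]:
    """Return accession folders that lack one of the expected files."""
    found = records_in_archive(zf)
    return {acc: EXPECTED - files for acc, files in found.items() if EXPECTED - files}


# Build a tiny in-memory archive with placeholder contents (not real filings).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    for rel in sorted(EXPECTED):
        demo.writestr(f"2025-01/000185990125000002/{rel}", "placeholder")
    demo.writestr("2025-01/000000000025000001/metadata.json", "{}")

with zipfile.ZipFile(buffer) as demo:
    print(incomplete_records(demo))
```

Reading member names with `namelist()` avoids extracting the archive just to audit its structure.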
metadata.json

The metadata file is a flat JSON object describing the filing envelope and document inventory:

- formType: fixed at "C-U".
- accessionNo: dashed accession number (e.g., "0001859901-25-000002").
- description: typically "Form C-U - Progress Update".
- filedAt: ISO 8601 timestamp with US Eastern offset.
- linkToFilingDetails: URL to the rendered HTML on EDGAR (.../xslC_X01/primary_doc.xml).
- linkToHtml: URL to the EDGAR filing index page (...-index.htm).
- linkToTxt: URL to the consolidated SGML .txt submission envelope on EDGAR.
- linkToXbrl: empty string for Form C-U; the form does not carry XBRL exhibits.
- documentFormatFiles[]: array describing the documents in the EDGAR submission. A typical Form C-U has three entries: the rendered HTML (type: "C-U", size carried as a single-space placeholder rather than a byte count), the raw XML (type: "C-U", byte size), and the consolidated SGML envelope (type: " ", description: "Complete submission text file"). The rendered-HTML entry's space-valued size should be interpreted as "size unknown" rather than as a malformed integer.
- entities[]: issuer/filer descriptors, normally length one. Common keys include cik, companyName (suffixed with (Filer)), irsNo, fileNo (the Reg CF file number, typically of the form 020-NNNNN), act ("33" for the Securities Act of 1933), stateOfIncorporation, fiscalYearEnd in MMDD form, filmNo, and type ("C-U"). For the small subset of issuers that are also public reporters, the entity object additionally carries sic (industry code) and tickers[]; the presence of a ticker on a Form C-U record is a signal that the issuer has a parallel reporting history outside Regulation Crowdfunding.
- seriesAndClassesContractsInformation[]: present but empty for Form C-U.
- dataFiles[]: present but empty for Form C-U.
- id: sec-api's 32-character hex internal record identifier.

primary_doc.xml

The XML body is rooted at <edgarSubmission> under the namespace http://www.sec.gov/edgar/formc, with a secondary namespace xmlns:com="http://www.sec.gov/edgar/common" used for shared address-block elements. The root element has two children: <headerData> and <formData>.
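Because the form lives in a default namespace, plain tag names will not match when parsing. The sketch below shows one way to handle that with the standard library; the sample document is hand-written to follow the description above and is not a real filing, and the prefix `c` is an arbitrary local alias.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as quoted in the text; the prefixes are local aliases.
NS = {
    "c": "http://www.sec.gov/edgar/formc",
    "com": "http://www.sec.gov/edgar/common",
}

# Hand-written skeleton, not a real filing.
SAMPLE = """<edgarSubmission xmlns="http://www.sec.gov/edgar/formc"
                 xmlns:com="http://www.sec.gov/edgar/common">
  <headerData><submissionType>C-U</submissionType></headerData>
  <formData/>
</edgarSubmission>"""


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def top_level_sections(xml_text: str) -> list[str]:
    """Return the local names of the root's children, in document order."""
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{NS['c']}}}edgarSubmission":
        raise ValueError(f"unexpected root element: {root.tag}")
    return [local_name(child.tag) for child in root]


sections = top_level_sections(SAMPLE)
submission_type = ET.fromstring(SAMPLE).findtext(
    "c:headerData/c:submissionType", namespaces=NS
)
print(sections, submission_type)
```

Checking the root tag up front also catches the case where the XHTML rendering was opened by mistake.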
<headerData>: filing envelope

<headerData> declares the submission type and filer credentials. It contains <submissionType>C-U</submissionType> and a <filerInfo> block holding <filer><filerCredentials><filerCik> (the issuer CIK, zero-padded to ten digits) and <filerCcc> (the EDGAR Confirmation Code, redacted as XXXXXXXX in the public archive by design), plus <fileNumber> carrying the Reg CF file number. <filerInfo> also holds <liveTestFlag> (LIVE for production filings) and a <flags> element with boolean <confirmingCopyFlag>, <returnCopyFlag>, and <overrideInternetFlag> values that govern EDGAR's submission-handling behavior.
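A minimal sketch of reading the envelope fields follows. The sample is hand-written from the description above, with a placeholder file number; the exact position of <fileNumber> inside <filerInfo> is an assumption, which is why the code searches descendants instead of relying on a fixed path. The function name `parse_header` is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"c": "http://www.sec.gov/edgar/formc"}

# Hand-written sample; nesting of <fileNumber> is assumed, values are placeholders.
SAMPLE = """<edgarSubmission xmlns="http://www.sec.gov/edgar/formc">
  <headerData>
    <submissionType>C-U</submissionType>
    <filerInfo>
      <filer>
        <filerCredentials>
          <filerCik>0001859901</filerCik>
          <filerCcc>XXXXXXXX</filerCcc>
        </filerCredentials>
        <fileNumber>020-00000</fileNumber>
      </filer>
      <liveTestFlag>LIVE</liveTestFlag>
      <flags>
        <confirmingCopyFlag>false</confirmingCopyFlag>
        <returnCopyFlag>false</returnCopyFlag>
        <overrideInternetFlag>false</overrideInternetFlag>
      </flags>
    </filerInfo>
  </headerData>
  <formData/>
</edgarSubmission>"""


def parse_header(xml_text: str) -> dict:
    """Extract envelope fields from <headerData>; missing fields become None."""
    header = ET.fromstring(xml_text).find("c:headerData", NS)
    if header is None:
        raise ValueError("no <headerData> element found")

    def text(path: str):
        value = header.findtext(path, namespaces=NS)
        return value.strip() if value and value.strip() else None

    # Each child of <flags> is read as a boolean; anything but "true" is False.
    flags = {
        el.tag.rsplit("}", 1)[-1]: (el.text or "").strip().lower() == "true"
        for el in header.findall(".//c:flags/*", NS)
    }
    return {
        "submission_type": text("c:submissionType"),
        "filer_cik": text(".//c:filerCik"),
        "file_number": text(".//c:fileNumber"),
        "is_live": text(".//c:liveTestFlag") == "LIVE",
        "flags": flags,
    }


header = parse_header(SAMPLE)
print(header)
```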
<formData>: Form C-U body

<formData> is divided into four ordered subsections that mirror the visual sections of the Form C-U as rendered by EDGAR.

<issuerInformation>

This block carries the progress update itself plus the identification of the issuer and the funding-portal intermediary.
- <progressUpdate>: free-text update describing the status of the offering. This is the most semantically important field in the form. Issuers convey final-close summaries (e.g., "At the close of the offering, the Issuer closed on $377,412.95 and 119,713 number of securities"), corrective amendments (e.g., "Adjustment to the initial Form C-U filing..."), and failed-offering outcomes ("Offering closed unsuccessful") all through this single prose field. There is no separate enumerated status code; success, failure, intermediate milestones, and amendment context are all inferred from the text.
- <issuerInfo>: issuer identity. Children include <nameOfIssuer>; a <legalStatus> wrapper holding <legalStatusForm> (Corporation, Limited Liability Company, etc.), <jurisdictionOrganization> (two-letter state or territory code), and <dateIncorporation> in MM-DD-YYYY; an <issuerAddress> element using the shared com: namespace (com:street1, com:city, com:stateOrCountry, com:zipCode); and <issuerWebsite>.
- <isCoIssuer>: Y / N flag indicating whether a co-issuer is involved.
- <companyName>, <commissionCik>, <commissionFileNumber>: the funding-portal intermediary's name (e.g., [StartEngine Primary, LLC](https://brokercheck.finra.org/firm/summary/291773), [Issuance Express](https://issuanceexpress.com/), [Honeycomb Portal LLC](https://www.finra.org/about/entities-we-regulate/funding-portals-we-regulate)), its EDGAR CIK, and its SEC file number. An optional <crdNumber> carries the FINRA CRD identifier when the intermediary is a broker-dealer-affiliated portal. The intermediary identification is structurally distinct from the issuer entity in metadata.json entities[], which always describes the issuer.

<offeringInformation>

This block restates the economic terms of the underlying Reg CF offering, mirroring the original Form C's offering-terms section.
- <compensationAmount>: fee paid to the intermediary, expressed as free text (e.g., "7 - 13 percent" or a multi-clause fee schedule). Optional; absence is not zero.
- <financialInterest>: any equity or other financial stake held by the intermediary; commonly "N/A" or "None", otherwise descriptive prose.
- <securityOfferedType>: enumerated: Preferred Stock, Debt, Other, etc.
- <securityOfferedOtherDesc>: free-text description, present only when <securityOfferedType> is Other (e.g., "Series A-4 Preferred Stock").
- <noOfSecurityOffered>: integer count of securities offered. Sometimes omitted, particularly for debt offerings priced at $1 per unit.
- <price>: per-security price in decimal dollars.
- <priceDeterminationMethod>: free-text description of how the price was set.
- <offeringAmount>: target dollar amount of the offering.
- <overSubscriptionAccepted>: Y / N.
- <overSubscriptionAllocationType>: First-come, first-served basis, Pro-rata basis, or Other.
- <descOverSubscription>: free-text description, present only when allocation type is Other.
- <maximumOfferingAmount>: dollar cap on the offering.
- <deadlineDate>: offering deadline in MM-DD-YYYY form.

<annualReportDisclosureRequirements>

This block carries the issuer's annual-report financial disclosures. Each financial line item appears as a paired "MostRecentFiscalYear" / "PriorFiscalYear" element so the form is self-contained for two fiscal years:
- <currentEmployees> (decimal headcount),
- <totalAssetMostRecentFiscalYear> / <totalAssetPriorFiscalYear>,
- <cashEquiMostRecentFiscalYear> / <cashEquiPriorFiscalYear> (cash and cash equivalents),
- <actReceivedMostRecentFiscalYear> / <actReceivedPriorFiscalYear> (accounts receivable; the truncated element name is the canonical EDGAR schema name, not a transcription artifact),
- <shortTermDebtMostRecentFiscalYear> / <shortTermDebtPriorFiscalYear>,
- <longTermDebtMostRecentFiscalYear> / <longTermDebtPriorFiscalYear>,
- <revenueMostRecentFiscalYear> / <revenuePriorFiscalYear>,
- <costGoodsSoldMostRecentFiscalYear> / <costGoodsSoldPriorFiscalYear>,
- <taxPaidMostRecentFiscalYear> / <taxPaidPriorFiscalYear>,
- <netIncomeMostRecentFiscalYear> / <netIncomePriorFiscalYear> (often negative for early-stage issuers).

This is followed by zero or more <issueJurisdictionSecuritiesOffering> elements, each holding a single two-letter US state or territory code identifying a jurisdiction in which the securities are being offered (Blue Sky coverage). The list is bimodal in practice: filings either omit it entirely or enumerate the full set of US jurisdictions (typically all 50 states plus DC and PR). Partial selections are uncommon.
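Since every financial line item follows the same stem plus "MostRecentFiscalYear" / "PriorFiscalYear" pattern, the pairs can be read generically. The sketch below is one way to do that; the sample values are invented, the descendant search (`.//`) is used because the exact nesting is taken from the description rather than from a schema, and missing elements are returned as None rather than zero.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal, InvalidOperation

NS = {"c": "http://www.sec.gov/edgar/formc"}

# Element-name stems as listed in the text above.
STEMS = ["totalAsset", "cashEqui", "actReceived", "shortTermDebt", "longTermDebt",
         "revenue", "costGoodsSold", "taxPaid", "netIncome"]

# Hand-written sample with invented numbers, not a real filing.
SAMPLE = """<edgarSubmission xmlns="http://www.sec.gov/edgar/formc">
  <formData>
    <annualReportDisclosureRequirements>
      <currentEmployees>3.00</currentEmployees>
      <totalAssetMostRecentFiscalYear>1000.50</totalAssetMostRecentFiscalYear>
      <totalAssetPriorFiscalYear>0.00</totalAssetPriorFiscalYear>
      <netIncomeMostRecentFiscalYear>-250.00</netIncomeMostRecentFiscalYear>
      <netIncomePriorFiscalYear>-100.00</netIncomePriorFiscalYear>
    </annualReportDisclosureRequirements>
  </formData>
</edgarSubmission>"""


def to_decimal(text):
    """Parse a decimal string; return None for missing or unparseable values."""
    if text is None or not text.strip():
        return None
    try:
        return Decimal(text.strip())
    except InvalidOperation:
        return None


def fiscal_pairs(xml_text: str) -> dict:
    """Return {stem: (most_recent, prior)} with None for absent elements."""
    root = ET.fromstring(xml_text)
    pairs = {}
    for stem in STEMS:
        recent = root.findtext(f".//c:{stem}MostRecentFiscalYear", namespaces=NS)
        prior = root.findtext(f".//c:{stem}PriorFiscalYear", namespaces=NS)
        pairs[stem] = (to_decimal(recent), to_decimal(prior))
    return pairs


pairs = fiscal_pairs(SAMPLE)
print(pairs["netIncome"], pairs["revenue"])
```

Using Decimal instead of float keeps dollar amounts exact when they are later summed or compared.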
<signatureInfo>

The signature block has two parts. <issuerSignature> wraps an <issuer> element (the entity name), an inner <issuerSignature> element holding the signer's name (sometimes prefixed with /s/ to indicate a transcribed signature), and <issuerTitle>. A subsequent <signaturePersons> element holds one or more <signaturePerson> records, each containing <personSignature>, <personTitle>, and <signatureDate> in MM-DD-YYYY format. The number of signers ranges from a single officer to four-person boards, so extraction must iterate the list rather than assume a single signer.
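Iterating the signer list might look like the sketch below. The sample names and dates are invented, the nesting follows the description above rather than a published schema, and stripping the /s/ prefix is a presentation choice, not something the form requires.

```python
import re
import xml.etree.ElementTree as ET
from datetime import date, datetime

NS = {"c": "http://www.sec.gov/edgar/formc"}

# Hand-written sample with invented signers, not a real filing.
SAMPLE = """<edgarSubmission xmlns="http://www.sec.gov/edgar/formc">
  <formData>
    <signatureInfo>
      <issuerSignature>
        <issuer>Example Co</issuer>
        <issuerSignature>/s/ Jane Doe</issuerSignature>
        <issuerTitle>CEO</issuerTitle>
      </issuerSignature>
      <signaturePersons>
        <signaturePerson>
          <personSignature>/s/ Jane Doe</personSignature>
          <personTitle>CEO</personTitle>
          <signatureDate>01-15-2025</signatureDate>
        </signaturePerson>
        <signaturePerson>
          <personSignature>John Roe</personSignature>
          <personTitle>Director</personTitle>
          <signatureDate>01-15-2025</signatureDate>
        </signaturePerson>
      </signaturePersons>
    </signatureInfo>
  </formData>
</edgarSubmission>"""


def clean_name(raw: str) -> str:
    """Drop a leading '/s/' marker and surrounding whitespace."""
    return re.sub(r"^\s*/s/\s*", "", raw).strip()


def parse_mdy(raw: str) -> date:
    """Parse the form's MM-DD-YYYY date format."""
    return datetime.strptime(raw.strip(), "%m-%d-%Y").date()


def signers(xml_text: str) -> list[dict]:
    """Return one dict per <signaturePerson>, however many there are."""
    root = ET.fromstring(xml_text)
    result = []
    for person in root.findall(".//c:signaturePersons/c:signaturePerson", NS):
        result.append({
            "name": clean_name(person.findtext("c:personSignature", "", NS)),
            "title": person.findtext("c:personTitle", "", NS).strip(),
            "signed": parse_mdy(person.findtext("c:signatureDate", "", NS)),
        })
    return result


people = signers(SAMPLE)
print(people)
```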
xslC_X01/primary_doc.xml is the EDGAR-rendered XHTML view of the same form, generated by stylesheet xslC_X01.xsl. Its header reads FORM C / FORM C-U and references an OMB control number. Each form field is rendered inside <div class="fakeBox">, fakeBox2, or fakeBox3 wrappers so the page resembles a fillable government form; Yes/No questions render as paired <img src="/Images/radio-checked.jpg"> / radio-unchecked.jpg icons. The state-jurisdiction list is rebuilt as a sequence of single-row <table> elements with the human-readable state name (ALABAMA, ALASKA, ...) rather than the two-letter code used in the XML. Signatures appear at the bottom under a Form C: Signature heading, each signer represented by a three-row Signature/Title/Date table. The HTML carries no information beyond what is in the XML; it is a presentational mirror. Parsers should sniff the content or check the DOCTYPE rather than dispatch solely on the .xml file extension.
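One simple way to tell the two primary_doc.xml files apart is to look at the first bytes of the file. The sketch below is a heuristic under the assumption that the rendering starts with an HTML DOCTYPE or <html> tag and the form XML starts with <edgarSubmission>; the function name and return labels are hypothetical.

```python
def sniff_primary_doc(head: bytes) -> str:
    """Classify the leading bytes of a primary_doc.xml file.

    Returns "xhtml-rendering", "form-xml", or "unknown".
    """
    text = head[:2048].decode("utf-8", errors="replace").lower()
    if "<!doctype html" in text or "<html" in text:
        return "xhtml-rendering"
    if "<edgarsubmission" in text:
        return "form-xml"
    return "unknown"


rendered = b'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "...">'
structured = b'<?xml version="1.0"?><edgarSubmission xmlns="http://www.sec.gov/edgar/formc">'
print(sniff_primary_doc(rendered), sniff_primary_doc(structured))
```

In practice the caller would pass something like the first few kilobytes read from the file opened in binary mode.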
Each record includes the structured XML form with all four <formData> subsections, the EDGAR XHTML rendering of the same form, and the sec-api metadata.json index. The documentFormatFiles[] array additionally references the consolidated SGML .txt submission envelope on EDGAR for traceability, but only by URL — the SGML envelope itself is not packaged into the accession folder.
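When reading the document inventory from metadata.json, the blank size placeholder needs explicit handling. The sketch below assumes each documentFormatFiles[] entry exposes documentUrl, type, and size keys as described in this document; the sample record uses invented sizes and clearly fake URLs, and whether real sizes arrive as strings or integers is not stated, so both are accepted.

```python
import json

# Invented sample shaped like the description above; URLs are placeholders.
SAMPLE = json.loads("""
{
  "formType": "C-U",
  "accessionNo": "0001859901-25-000002",
  "documentFormatFiles": [
    {"documentUrl": "https://example.invalid/xslC_X01/primary_doc.xml", "type": "C-U", "size": " "},
    {"documentUrl": "https://example.invalid/primary_doc.xml", "type": "C-U", "size": "3456"},
    {"documentUrl": "https://example.invalid/submission.txt", "type": " ",
     "description": "Complete submission text file", "size": "5120"}
  ]
}
""")


def parse_size(raw):
    """Return a byte count, or None when the value is a blank placeholder."""
    if isinstance(raw, int):
        return raw
    if raw is None:
        return None
    raw = str(raw).strip()
    return int(raw) if raw.isdigit() else None


def document_inventory(metadata: dict) -> list[dict]:
    """Normalize documentFormatFiles[] entries; blank fields become None."""
    return [
        {
            "url": doc.get("documentUrl"),
            "type": (doc.get("type") or "").strip() or None,
            "size": parse_size(doc.get("size")),
        }
        for doc in metadata.get("documentFormatFiles", [])
    ]


inventory = document_inventory(SAMPLE)
print(inventory)
```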
Image files referenced or embedded in the original EDGAR submission are stripped from the archive. The original Form C and any earlier Form C-U filings from the same offering are not packaged with a given Form C-U record; each accession folder is one filing event in isolation, and lineage must be reconstructed by the consumer from issuer CIK, file number, and progress-update text. Exhibit attachments and PDF supplements, when they exist on EDGAR, are not retained in the archive. The linkToFilingDetails, linkToHtml, linkToTxt, and documentFormatFiles[].documentUrl fields in metadata.json provide URLs back to EDGAR for any content not packaged locally. The <filerCcc> access code in <headerData> is intentionally redacted as XXXXXXXX; this is by design and not a data-quality issue.
- Failed offerings are reported only in <progressUpdate> (e.g., "Offering closed unsuccessful"); there is no enumerated success/failure flag, so downstream classification must read the free text.
- Some corrections are filed under the plain C-U form type with corrective wording in <progressUpdate> (e.g., "Adjustment to the initial Form C-U filing..."). Formally amended progress updates are submitted as C-U/A; both variants surface in the dataset and require disambiguation by form-type metadata and progress-update text.
- <compensationAmount> and <noOfSecurityOffered> are omitted in some filings; absence is not an error and should not be treated as zero. <descOverSubscription> and <securityOfferedOtherDesc> only appear when their parent enumerated value is Other.
- For issuers that are also public reporters, metadata.json entities[0] carries an sic industry code and a tickers[] array.
- The intermediary is identified by <companyName> inside <issuerInformation>, with <commissionCik>, <commissionFileNumber>, and optionally <crdNumber>. This is structurally distinct from the entities[] issuer block in metadata.json.
- <actReceivedMostRecentFiscalYear> and its prior-year counterpart are the canonical EDGAR schema names for accounts-receivable disclosure; do not treat the truncation as a transcription error to correct.
- xslC_X01/primary_doc.xml is XHTML despite the .xml extension. Detect content by DOCTYPE or MIME sniffing, not by extension alone.
- The http://www.sec.gov/edgar/formc namespace and the four-section <formData> body have remained broadly stable across the coverage window. Reg CF's substantive thresholds have shifted over the form's lifetime (most notably the 2021 increase of the maximum offering amount from $1.07 million to $5 million), changing the dollar values that appear in <offeringAmount> and <maximumOfferingAmount> without altering the XML structure. The _X01 suffix on the rendering path reflects EDGAR's current XSL stylesheet generation; the structured XML remains authoritative regardless of any rendering variation in older filings.

Each Form C-U is filed by the issuer conducting a Regulation Crowdfunding offering, that is, the company offering and selling its own securities to retail investors through a registered crowdfunding intermediary. The intermediary itself does not file Form C-U; funding portals registered with the SEC and FINRA, and broker-dealers acting as Reg CF intermediaries, host the offering and may surface progress on their platforms, but the filing obligation rests entirely with the issuer. Investors, lead investors, and promoters are not filers.
The filer population is therefore dominated by small, early-stage, privately held operating companies: startups, pre-revenue ventures, small consumer brands, real-estate single-purpose entities, local food and beverage businesses, indie media projects, and similar issuers that use Reg CF rather than a registered, Regulation D, or Regulation A offering. By design these issuers are non-reporting: they have no Section 13 or Section 15(d) reporting obligation and exist on EDGAR only because of the Form C series. A small number of issuers with a separate Exchange Act reporting history do appear, but they are uncommon; Rule 100(b) makes Reg CF unavailable to issuers that are subject to Section 13 or Section 15(d) reporting at the time of the offering.
Form C-U exists under Regulation Crowdfunding (17 CFR Part 227), adopted by the SEC in 2015 to implement Title III of the JOBS Act of 2012, which created the Section 4(a)(6) Securities Act of 1933 exemption for crowdfunding offerings. The specific obligation to file Form C-U flows from Rule 203(a)(3) of Regulation Crowdfunding. Within Rule 203, paragraph (a)(1) requires the initial Form C offering statement, paragraph (a)(2) governs amendments on Form C/A, and paragraph (a)(3) requires the Form C-U progress update. Filings are submitted on EDGAR with submission type "C-U" (or "C-U/A" for amendments).
The aggregate Reg CF offering cap was originally $1 million per issuer in any 12-month period, was inflation-adjusted to $1.07 million in 2017, and was raised to $5 million effective March 15, 2021. Form C-U dollar disclosures reflect the cap in force when the offering was conducted.
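Form C-U is event-driven, not periodic. Each filing is triggered by a milestone in the life of a Reg CF offering, and the filing window is five business days after the triggering event: reaching 50 percent of the target offering amount, reaching 100 percent of the target, or the close of the offering.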
There is no annual or quarterly cadence; cadence is entirely a function of when offerings hit milestones or close. Corrections to a previously filed progress update are made on Form C-U/A under the same Rule 203(a)(3) basis.
The dataset begins in August 2016 because Regulation Crowdfunding only became effective on May 16, 2016; Form C-U has no pre-EDGAR analog and has always been filed electronically. Filings appear continuously but irregularly throughout the year, with monthly volume trending upward after the 2021 cap increase.
Form C-U sits inside the Regulation Crowdfunding family on EDGAR, alongside other Form C variants and adjacent exemption regimes (Reg D, Reg A) that cover the same issuers or offering goals through different mechanics. The closest comparison targets are the other Form C variants, then Reg D and Reg A filings, and finally the unofficial platform-side updates that resemble Form C-U content but carry no SEC status.
Form C (initial offering disclosure). The front-of-offering Rule 201 disclosure filed before any securities are sold; narrative-heavy, one-time, and contains the target amount that Form C-U later measures progress against. Form C-U is downstream, numeric, and may be filed multiple times against a single Form C.
Form C/A (amendment to Form C). Amends material offering terms (target, deadline, price, financials) and triggers a five-business-day investor reconfirmation under Rule 304(c). Form C-U never changes terms; it reports progress toward an unchanged target.
Form C-U/A (amendment to a Form C-U). Identical scope and fields to Form C-U but corrects a previously filed progress report. For analysis, C-U and C-U/A should be collapsed by accession lineage with the amendment treated as superseding.
Form C-AR (annual report). Calendar-triggered Rule 202 issuer disclosure filed within 120 days of fiscal year-end and continuing for years after the offering closes; Form C-U is event-triggered and ends at offering close.
Form C-AR/A. Amends a prior annual report, so it differs from Form C-U/A in both subject (yearly issuer disclosure vs. offering-period progress) and trigger (fiscal year-end vs. funding milestone).
Form C-TR (termination of reporting). A Rule 202(b) one-line filing that ends the issuer's ongoing C-AR obligation; Form C-U marks milestones inside a single offering, not the end of issuer-level reporting.
Form C-W (withdrawal). Pulls a Reg CF offering before close and is mutually exclusive with a final Form C-U on the same offering: C-W signals abandonment, the final C-U signals completion with a recorded total sold.
Form D (Reg D, Rules 506(b)/506(c)). A one-time accredited-investor private-placement notice with no funding-portal requirement, no offering cap, and no progress-update regime; covers a non-overlapping issuer population and offering structure from Reg CF.
Form 1-A (Reg A, Tier 1 and Tier 2). SEC-qualified "mini-IPO" offering circular permitting up to $20M/$75M per twelve months with no 50%/100%-of-target SEC progress filings; Reg A's periodic stack (1-K, 1-SA, 1-U) parallels C-AR, not C-U.
Form 1-Z (Reg A termination). Terminates Tier 2 reporting and discloses final amount sold, functioning as a Reg A hybrid of C-U (final tally) and C-TR (end of reporting); the Reg CF equivalent of its final-tally role lives in the closing Form C-U.
Funding portal platform updates (Wefunder, StartEngine, Republic, Honeycomb, etc.). Real-time amount-raised counters and free-text issuer posts that are more frequent than Form C-U but unstandardized, off-EDGAR, and unauditable through SEC channels; Form C-U is the only authoritative, time-stamped record of milestone and closing amounts for a Reg CF raise.
Form C-U is distinguished by three combined features: it is event-triggered by specific numeric thresholds (50%, 100%, close) rather than by calendar period or term changes; it is exclusive to Securities Act Section 4(a)(6) and has no analogue under Reg D or Reg A; and it is offering-scoped, attaching to a single Form C accession lineage, whereas C-AR and C-TR are issuer-scoped. For measuring how much a Reg CF campaign actually raised, the final Form C-U is the canonical record with no substitute inside or outside the Reg CF stack.
Form C-U documents progress and close events for Reg CF offerings, so it draws a narrow set of users tied to that single regulatory event.
Counsel advising Reg CF issuers use Form C-U as the primary record for whether Rule 203(a)(3) progress-update obligations were met on time. They check the filing date against the target-hit date for the five-business-day window, compare <offeringAmount> with the dollar amount reported in <progressUpdate> to identify which trigger (50 percent, 100 percent, or final close) the C-U represents, and use issuer CIK and the Reg CF file number to chain the C-U back to its Form C and any C/A amendments. The dataset supports late-filing diagnostics before a follow-on round, C-U/A drafting, and cap-table reconciliation against the annual Reg CF cap when rolling into Reg D or Reg A+.
Compliance and operations teams at FINRA-registered funding portals and broker-dealer intermediaries filter on <commissionCik> to isolate offerings hosted on their platform, then track the raised and sold amounts reported in <progressUpdate> alongside filing timestamps. The workflow covers confirming that issuers filed required 50 percent and 100 percent C-Us in window, populating books-and-records files, escalating missed closing C-Us, and reconciling escrow release events with the issuer's public disclosure. Cross-platform monitoring uses <commissionCik> to track C-U volume and final raise sizes by competing portal.
Academic finance and entrepreneurship researchers, policy analysts, and market analysts join Form C, C/A, and C-U on issuer CIK to build offering-level panels covering Reg CF since August 2016. They aggregate <offeringAmount>, <maximumOfferingAmount>, the raised and sold amounts parsed from <progressUpdate>, <currentEmployees>, and close dates to compute success rates, time-to-target, undersubscription rates, and intermediary market share. Outputs include peer-reviewed studies, capital-formation policy briefs, and longitudinal market reports.
Analysts at seed funds, micro-VCs, and angel groups use C-U as a sourcing signal. They flag issuers whose 50 percent C-U lands within days of launch (raise-velocity signal), issuers whose final C-U falls well below <maximumOfferingAmount> (potential discounted bridge), and issuers whose <currentEmployees> and total amount sold match a thesis stage. The workflow ingests new C-Us as each monthly refresh lands, joins to Form C issuer data, ranks on amount-raised-per-day, and routes candidates to deal pipelines.
Engineering and product teams building Reg CF dashboards and exempt-offering trackers consume the XML and JSON as the canonical feed for offering progress and close events. They render <offeringAmount>, the dollar amounts parsed from <progressUpdate>, and filing date into "amount raised," "percentage of target," and "offering closed" tiles, drive notifications off filing timestamps, and use the dataset to backfill historical states, validate scrapers against EDGAR, and power downstream APIs.
Reporters covering startup finance and alternative investing use C-U to verify crowdfunding milestones: fast targets, closes well below maximum, raises near the annual Reg CF cap, and quarterly success-rate trends. They cross-check issuer press releases against the figures the issuer filed with the SEC in <progressUpdate>, and use <commissionCik> to attribute milestones to a portal.
Corp-dev analysts scanning for capital-stack signals use C-U to flag issuers that just closed a Reg CF round. They read the total sold from <progressUpdate> and the close date to estimate cash on hand, infer post-money valuation from offering price carried through the chain, and size targets via <currentEmployees>. Issuers closing well below target surface as more receptive to acquisition or licensing conversations than to another raise.
Quant teams treat C-U as a structured event source. They engineer features such as the raised amount parsed from <progressUpdate> over <offeringAmount>, days from launch to 50 percent C-U, days from 50 percent to final C-U, employee-count change between Form C and closing C-U, and intermediary fixed effects from <commissionCik>. These feed models for follow-on financing probability, issuer-note default risk, and intermediary-level success rates, with full-population coverage from August 2016 supporting clean backtests.
Lawyers and portal compliance staff treat the dataset as compliance evidence; researchers and quants treat it as an offering-outcomes panel; investors and corp-dev teams treat it as a sourcing signal; journalists and product teams treat it as the authoritative number behind any Reg CF milestone. The value concentrates in a small field set: issuer CIK, <commissionCik>, <offeringAmount>, the dollar figures in <progressUpdate>, <currentEmployees>, and filing date.
The dataset supports a narrow set of high-value workflows tied to Reg CF progress and close events. Each use case below names the fields consumed and the artifact produced.
Building an offering-outcome panel for Reg CF research. Join Form C and Form C-U records on issuer CIK and Reg CF file number, take the latest C-U (or C-U/A) per offering, and pull <offeringAmount>, <maximumOfferingAmount>, the dollar and security counts in <progressUpdate>, and <deadlineDate> to compute success rate, time-to-target, and undersubscription rate. Output: a longitudinal panel covering August 2016 onward suitable for capital-formation studies and Reg CF policy briefs.
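Because the amount raised lives in prose, the panel step starts with text extraction. The sketch below pulls dollar figures out of a progress-update string with a regular expression and relates one to a target; it is a heuristic, the pattern is an assumption about how issuers write amounts, and the target used in the usage line is invented.

```python
import re
from decimal import Decimal

# Matches "$377,412.95", "$ 50000", "$1,000" and similar; an assumed pattern.
DOLLARS = re.compile(r"\$\s*(\d[\d,]*(?:\.\d+)?)")


def dollar_amounts(text: str) -> list[Decimal]:
    """Return every dollar figure found in the text, in order of appearance."""
    return [Decimal(match.replace(",", "")) for match in DOLLARS.findall(text)]


def percent_of_target(raised, target):
    """Raised amount as a percentage of target; None if either is unusable."""
    if raised is None or target is None or target == 0:
        return None
    return raised / target * 100


update = ("At the close of the offering, the Issuer closed on $377,412.95 "
          "and 119,713 number of securities")
amounts = dollar_amounts(update)
print(amounts, percent_of_target(Decimal("25000"), Decimal("50000")))
```

Texts with several dollar figures need a further rule for picking the right one, which is left to the consumer here.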
Auditing the five-business-day Rule 203(a)(3) filing window. For each issuer, compare metadata.json.filedAt against the date the offering crossed 50 percent, 100 percent, or final close as inferred from the <progressUpdate> text and dollar figures. Output: a per-issuer late-filing diagnostic that compliance counsel can use before a follow-on round, plus a population-level distribution of filing-window adherence by year and intermediary.
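The window check itself reduces to business-day arithmetic. The sketch below assumes a business day is Monday through Friday excluding any holidays the caller supplies, and that counting starts the day after the trigger; both are modeling assumptions rather than a statement of how the rule is interpreted, and the function names are hypothetical.

```python
from datetime import date, datetime, timedelta


def add_business_days(start: date, days: int, holidays=frozenset()) -> date:
    """Return the date `days` business days after `start` (start not counted)."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5 and current not in holidays:
            remaining -= 1
    return current


def filed_in_window(trigger: date, filed_at: str, holidays=frozenset()) -> bool:
    """Compare metadata.json filedAt (ISO 8601 with offset) to the deadline."""
    filed = datetime.fromisoformat(filed_at).date()
    return filed <= add_business_days(trigger, 5, holidays)


# Invented trigger date and timestamps for illustration.
trigger = date(2025, 1, 6)  # a Monday
print(add_business_days(trigger, 5))
print(filed_in_window(trigger, "2025-01-13T16:05:11-05:00"))
```

The trigger date is the hard part in practice, since it has to be inferred from the progress-update text or from outside sources.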
Funding-portal market-share and league tables. Filter on the <commissionCik> and <companyName> fields inside <issuerInformation> to attribute each progress update to its intermediary, then aggregate final-close totals (parsed from <progressUpdate> of the last C-U per offering) by portal and quarter. Output: a quarterly league table of StartEngine, Wefunder, Republic, Honeycomb, and other portals ranked by closed dollars, offering count, and average raise size.
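Once final closes have been attributed to a portal, the league table is a grouped sum. The sketch below assumes the upstream parsing has already produced one dict per closed offering with the portal CIK, filing date, and raised amount; the key names and all sample values, including the CIKs, are invented.

```python
from collections import defaultdict
from datetime import date
from decimal import Decimal


def quarter(d: date) -> str:
    """Label a date with its calendar quarter, e.g. 2025-Q1."""
    return f"{d.year}-Q{(d.month - 1) // 3 + 1}"


def league_table(closes: list[dict]) -> list[dict]:
    """Aggregate closed dollars and offering counts by quarter and portal."""
    totals = defaultdict(lambda: {"dollars": Decimal("0"), "offerings": 0})
    for close in closes:
        bucket = totals[(quarter(close["filed"]), close["portal_cik"])]
        bucket["dollars"] += close["raised"]
        bucket["offerings"] += 1
    rows = [
        {"quarter": q, "portal_cik": cik, **stats,
         "average": stats["dollars"] / stats["offerings"]}
        for (q, cik), stats in totals.items()
    ]
    # Within each quarter, rank portals by closed dollars, largest first.
    rows.sort(key=lambda r: (r["quarter"], -r["dollars"]))
    return rows


# Invented closes with placeholder portal CIKs.
closes = [
    {"portal_cik": "0000000001", "filed": date(2025, 1, 15), "raised": Decimal("100000")},
    {"portal_cik": "0000000001", "filed": date(2025, 3, 3), "raised": Decimal("300000")},
    {"portal_cik": "0000000002", "filed": date(2025, 2, 10), "raised": Decimal("250000")},
    {"portal_cik": "0000000001", "filed": date(2025, 4, 1), "raised": Decimal("50000")},
]
table = league_table(closes)
for row in table:
    print(row)
```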
Failed-offering classification and post-mortem corpus. Run a text classifier over <progressUpdate> for phrases such as "closed unsuccessful" and "no securities sold," then segment by issuer attributes from <issuerInfo> (<legalStatusForm>, <jurisdictionOrganization>, <dateIncorporation>) and the financial block (<totalAssetMostRecentFiscalYear>, <revenueMostRecentFiscalYear>, <netIncomeMostRecentFiscalYear>). Output: a labeled corpus of failed Reg CF raises and a feature table for failure-prediction models.
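Before training anything, a rule-based baseline is often enough to seed labels. The sketch below is a keyword heuristic, not a trained classifier: the label names and patterns are assumptions built around the example phrases quoted in this document, and rule order matters because the first match wins.

```python
import re

# Ordered rules; the first pattern that matches decides the label.
RULES = [
    ("failed", re.compile(r"closed unsuccessful|no securities (were )?sold", re.I)),
    ("correction", re.compile(r"\b(adjustment|correct(ion|ed|ing)?|amend(ed|ment|ing)?)\b", re.I)),
    ("close", re.compile(r"\bclos(e|ed|ing)\b", re.I)),
    ("milestone", re.compile(r"\b(50|100)\s*(%|percent)", re.I)),
]


def classify(progress_update: str) -> str:
    """Assign a coarse label to a <progressUpdate> string."""
    for label, pattern in RULES:
        if pattern.search(progress_update):
            return label
    return "unknown"


examples = [
    "Offering closed unsuccessful",
    "Adjustment to the initial Form C-U filing...",
    "At the close of the offering, the Issuer closed on $377,412.95 and 119,713 number of securities",
    "The issuer has reached 50% of the target offering amount.",
]
labels = [classify(text) for text in examples]
print(labels)
```

Putting the failure rule first keeps "closed unsuccessful" from being swallowed by the more general close rule.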
Sourcing feed for early-stage investors. Ingest new C-U accession folders after each monthly refresh, parse <progressUpdate> for milestone type and dollars raised, and rank issuers on raise velocity (days from the Form C filing date to the 50-percent C-U) and on final raise versus <maximumOfferingAmount>. Output: a per-refresh candidate list of fast-funding issuers and undersubscribed closes routed into seed-fund, micro-VC, and angel deal pipelines.
Reg CF cap monitoring for issuer counsel. Sum the total amount sold, parsed from <progressUpdate> of each issuer's final C-U, over a rolling twelve-month window keyed on issuer CIK, to confirm the issuer remained under the Reg CF annual cap before stacking a new Form C, Reg D, or Reg A offering. Output: a cap-utilization worksheet attached to the issuer's books-and-records file and used in transition memos.
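The rolling-window sum can be sketched as follows. The cap is passed in as a parameter (defaulting to the $5 million figure cited in this document) so earlier periods can be checked against their own limit; treating the window as "more than one year before the as-of date, up to and including it" is an assumption about boundary handling, and the sample closes are invented.

```python
from datetime import date
from decimal import Decimal


def one_year_before(d: date) -> date:
    """Same calendar day one year earlier; Feb 29 falls back to Feb 28."""
    try:
        return d.replace(year=d.year - 1)
    except ValueError:
        return d.replace(year=d.year - 1, day=28)


def sold_in_window(closes, as_of: date) -> Decimal:
    """Sum amounts whose close date lies in (as_of minus one year, as_of]."""
    start = one_year_before(as_of)
    return sum(
        (amount for closed, amount in closes if start < closed <= as_of),
        Decimal("0"),
    )


def headroom(closes, as_of: date, cap=Decimal("5000000")) -> Decimal:
    """Remaining room under the cap as of the given date."""
    return cap - sold_in_window(closes, as_of)


# Invented (close date, total sold) pairs for a single issuer CIK.
closes = [
    (date(2024, 1, 10), Decimal("1000000")),
    (date(2024, 6, 1), Decimal("2500000")),
    (date(2025, 2, 1), Decimal("500000")),
]
print(sold_in_window(closes, date(2025, 3, 1)), headroom(closes, date(2025, 3, 1)))
```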
Two-year financial snapshot dataset for early-stage benchmarking. Extract the paired <...MostRecentFiscalYear> / <...PriorFiscalYear> elements (assets, cash, receivables, short- and long-term debt, revenue, COGS, taxes paid, net income) plus <currentEmployees> from each C-U, deduplicated to the latest filing per issuer-year. Output: a benchmark table of crowdfunded-issuer financials by SIC, state of incorporation, and vintage for use in valuation comps and academic studies.
Real-time Reg CF dashboard backfill. Use the monthly ZIP archives as the canonical historical feed to populate "amount raised," "percent of target," and "offering closed" tiles in a fintech product, joining <offeringAmount>, the dollars in <progressUpdate>, and metadata.json.filedAt. Output: a backfilled time series for every Reg CF offering since 2016 and a validation harness for the live EDGAR scraper.
C-U / C-U/A lineage reconstruction. Group records by issuer CIK and <fileNumber>, order by filedAt, and use metadata.json.formType together with corrective phrasing in <progressUpdate> (for example, "Adjustment to the initial Form C-U filing") to collapse amendments onto their underlying progress event. Output: a deduplicated milestone timeline per offering where each 50-percent, 100-percent, and close event is represented by its latest authoritative filing.
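One possible shape for the grouping and collapsing step is shown below. It assumes each record has already been flattened into a dict with cik, file_no, filed_at, form_type, and progress_update keys (hypothetical names), and it makes a simplifying assumption that a correction amends the filing immediately before it in the same offering, which will not hold for every lineage.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed corrective vocabulary; extend as real filings require.
CORRECTIVE = re.compile(r"\b(adjustment|correct\w*|amend\w*)\b", re.IGNORECASE)


def is_correction(record: dict) -> bool:
    """True for C-U/A filings or progress updates with corrective wording."""
    return (record["form_type"].upper().endswith("/A")
            or bool(CORRECTIVE.search(record["progress_update"])))


def build_timelines(records: list[dict]) -> dict:
    """Group by (cik, file_no), order by filedAt, fold corrections backward."""
    groups = defaultdict(list)
    for record in records:
        groups[(record["cik"], record["file_no"])].append(record)
    timelines = {}
    for key, items in groups.items():
        items.sort(key=lambda r: datetime.fromisoformat(r["filed_at"]))
        events = []
        for record in items:
            if is_correction(record) and events:
                events[-1] = record  # the later filing supersedes the earlier one
            else:
                events.append(record)
        timelines[key] = events
    return timelines


# Invented records for one offering.
records = [
    {"cik": "0001859901", "file_no": "020-00000", "form_type": "C-U",
     "filed_at": "2025-03-01T10:00:00-05:00", "progress_update": "Offering closed."},
    {"cik": "0001859901", "file_no": "020-00000", "form_type": "C-U",
     "filed_at": "2025-01-10T10:00:00-05:00", "progress_update": "Reached 50% of target."},
    {"cik": "0001859901", "file_no": "020-00000", "form_type": "C-U",
     "filed_at": "2025-01-20T10:00:00-05:00",
     "progress_update": "Adjustment to the initial Form C-U filing..."},
]
timelines = build_timelines(records)
print(timelines)
```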
Blue Sky coverage analysis. Read the <issueJurisdictionSecuritiesOffering> list on each C-U and classify offerings as all-states, none, or partial. Output: a coverage map showing how Reg CF issuers use the Section 18(b)(4)(C) preemption in practice and a flag for the rare partial-jurisdiction filings that warrant manual review.
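The lineage reconstruction use case above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: it assumes each record has already been flattened into a dict carrying cik, fileNumber, filedAt, and formType, plus a hypothetical milestone label (for example "50", "100", or "close") that a separate step derived from the <progressUpdate> text. It also assumes every filedAt value is an ISO 8601 timestamp with a UTC offset or a trailing "Z".

```python
from collections import defaultdict
from datetime import datetime


def parse_ts(value):
    """Parse an ISO 8601 timestamp; a trailing 'Z' is treated as UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def collapse_lineage(records):
    """Keep the latest filing per (cik, fileNumber, milestone).

    `milestone` is a hypothetical, pre-computed label; it is not a field
    of the dataset itself.
    """
    latest = {}
    # Process oldest first so later filings (typically C-U/A) overwrite earlier ones.
    for rec in sorted(records, key=lambda r: parse_ts(r["filedAt"])):
        latest[(rec["cik"], rec["fileNumber"], rec["milestone"])] = rec

    timelines = defaultdict(list)
    for (cik, file_number, _milestone), rec in latest.items():
        timelines[(cik, file_number)].append(rec)
    for events in timelines.values():
        events.sort(key=lambda r: parse_ts(r["filedAt"]))
    return dict(timelines)
```

Each value in the returned dict is one offering's milestone timeline, ordered by filing time, with amended events represented by their most recent filing.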
The Form C-U Files dataset is available through three access methods: a JSON index API for metadata and container discovery, a full dataset archive download, and per-container ZIP downloads for incremental retrieval.
Dataset Index JSON API: https://api.sec-api.io/datasets/form-cu-files.json
This endpoint returns dataset-level metadata along with the list of all monthly container files. The metadata includes the dataset name, description, last update timestamp, earliest sample date (2016-08-01), total records, total size, covered form types (C-U), container format (ZIP), and file types contained in each archive (XML, JSON, HTML, PDF). Each container entry includes its key, size, record count, last updated timestamp, and direct download URL. Poll this endpoint to monitor which containers have changed in the latest refresh and selectively download only the updated months. This endpoint does not require an API key.
Example response:
{
  "datasetId": "1f13365b-9ae0-694e-a2b8-265f1c8e2522",
  "datasetDownloadUrl": "https://api.sec-api.io/datasets/form-cu-files.zip",
  "name": "Form C-U Files Dataset",
  "description": "Form C-U filings provide progress updates on securities offerings conducted under Regulation Crowdfunding...",
  "updatedAt": "2026-04-24T03:01:42.414Z",
  "earliestSampleDate": "2016-08-01",
  "totalRecords": 9385,
  "totalSize": 33394120,
  "formTypes": ["C-U"],
  "containerFormat": "ZIP",
  "fileTypes": ["XML", "JSON", "HTML", "PDF"],
  "containers": [
    {
      "downloadUrl": "https://api.sec-api.io/datasets/form-cu-files/2026/2026-04.zip",
      "key": "2026/2026-04.zip",
      "size": 412883,
      "records": 118,
      "updatedAt": "2026-04-24T03:01:42.414Z"
    }
  ]
}
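The polling workflow described for the index endpoint can be sketched as follows. This is a minimal example under stated assumptions: the helper names fetch_index and changed_containers are illustrative, and last_seen is a hypothetical mapping of container key to the updatedAt value recorded on a previous poll. Only the fields shown in the example response are used.

```python
import json
from urllib.request import urlopen

INDEX_URL = "https://api.sec-api.io/datasets/form-cu-files.json"


def fetch_index(url=INDEX_URL):
    """Download and parse the dataset index (no API key required for this endpoint)."""
    with urlopen(url) as resp:
        return json.load(resp)


def changed_containers(index, last_seen):
    """Return container entries that are new or whose updatedAt has changed.

    `last_seen` maps container key -> updatedAt string from the previous poll.
    """
    return [
        entry
        for entry in index["containers"]
        if last_seen.get(entry["key"]) != entry["updatedAt"]
    ]
```

A sync job might call changed_containers(fetch_index(), last_seen), download each returned entry, and then store the new updatedAt values for the next run.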
Download Entire Dataset: https://api.sec-api.io/datasets/form-cu-files.zip?token=YOUR_API_KEY
Downloads the complete Form C-U Files dataset as a single ZIP archive containing every monthly container from 2016 to the most recent refresh. This endpoint requires an API key.
Download Single Container: https://api.sec-api.io/datasets/form-cu-files/2026/2026-04.zip?token=YOUR_API_KEY
Downloads one monthly container ZIP, useful for incremental syncs where only specific months need to be retrieved or refreshed. Replace the year and month segments with any container key returned by the dataset index API. This endpoint requires an API key.
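As an illustration of an incremental sync, the sketch below builds the download URL from a container key and streams the ZIP to disk. The function names are hypothetical, and the code assumes the token is passed as the token query parameter exactly as in the URL pattern above; error handling and retries are left out.

```python
import shutil
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE_URL = "https://api.sec-api.io/datasets/form-cu-files"


def container_url(key, api_key):
    """Build the download URL for a container key such as '2026/2026-04.zip'."""
    return f"{BASE_URL}/{quote(key)}?{urlencode({'token': api_key})}"


def download_container(key, api_key, dest_path):
    """Stream one monthly ZIP to dest_path without loading it fully into memory."""
    with urlopen(container_url(key, api_key)) as resp, open(dest_path, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest_path
```

For example, download_container("2026/2026-04.zip", "YOUR_API_KEY", "2026-04.zip") would fetch the April 2026 container, given a valid key.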
The dataset covers Form C-U, the progress-update form filed on EDGAR by Regulation Crowdfunding issuers under Rule 203(a)(3) of Regulation Crowdfunding. The EDGAR C-U/A amendment variant is also included and surfaces alongside the base C-U form type, requiring disambiguation by form-type metadata and the wording of the <progressUpdate> field.
One record represents one Form C-U filing event — for example, the issuer's first 50%-of-target progress notice, a 100%-of-target update, a final close-of-offering report (successful or unsuccessful), or an amendment to an earlier Form C-U. On disk, each record materializes as an accession-named folder containing metadata.json, the canonical primary_doc.xml, and the EDGAR XSL-rendered XHTML view at xslC_X01/primary_doc.xml.
The issuer conducting a Regulation Crowdfunding offering files Form C-U; the funding portal or broker-dealer intermediary does not. Filings are due within five business days of the triggering event — reaching the target offering amount, hitting the 50 percent or 100 percent target marks (when the intermediary does not provide frequent platform updates), or closing the offering — and a final Form C-U is required even on undersubscribed or canceled offerings.
Form C is the one-time, narrative-heavy initial offering disclosure filed before any securities are sold; Form C-U is the downstream, numeric, repeatable progress and close report against that Form C's target. Form C-AR is the calendar-triggered annual report filed after the offering closes, while Form C-U is event-triggered and ends at offering close. Form D covers Regulation D private placements, which have no funding-portal requirement, no offering cap, and no progress-update regime, so it describes a different offering structure from Reg CF even when the same issuer uses both exemptions.
The dataset's earliest sample is dated 2016-08-01, shortly after Regulation Crowdfunding became effective on May 16, 2016. Form C-U has no pre-EDGAR analog and has always been filed electronically, so the dataset constitutes the full population of Form C-U filings since the regime's inception, refreshed monthly thereafter.
Each accession folder contains a sec-api metadata.json index, the canonical EDGAR Form C-U primary_doc.xml, and an XSL-rendered XHTML view at xslC_X01/primary_doc.xml (XHTML despite its .xml extension). The dataset's file types are XML, JSON, HTML, and PDF, but in practice Form C-U records consist almost entirely of the XML/JSON/HTML triplet, with PDF appearing only on the rare filings that include a supplementary attachment. The dataset is distributed as monthly ZIP archives named YYYY/YYYY-MM.zip.
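A record can be loaded from an unpacked accession folder along the following lines. This is a sketch under assumptions: the file names come from the description above, but the XML namespace and element nesting are not specified here, so the code matches elements by local name only and takes the first occurrence of each. The load_record helper and its return shape are illustrative.

```python
import json
import xml.etree.ElementTree as ET
from pathlib import Path

# Element names mentioned in this documentation; extend as needed.
WANTED = {"progressUpdate", "offeringAmount", "maximumOfferingAmount"}


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def load_record(folder):
    """Read metadata.json and selected values from primary_doc.xml."""
    folder = Path(folder)
    metadata = json.loads((folder / "metadata.json").read_text(encoding="utf-8"))
    root = ET.parse(str(folder / "primary_doc.xml")).getroot()

    fields = {}
    for element in root.iter():
        name = local_name(element.tag)
        text = (element.text or "").strip()
        # Keep the first non-empty occurrence of each wanted element.
        if name in WANTED and text and name not in fields:
            fields[name] = text
    return {"accession": folder.name, "metadata": metadata, "fields": fields}
```

Matching on local names keeps the sketch independent of whichever namespace URI the filings declare, at the cost of ignoring where in the tree an element appears.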
There is no enumerated success/failure flag. Outcome is conveyed exclusively through the prose of <progressUpdate> — for example, "At the close of the offering, the Issuer closed on $377,412.95 and 119,713 number of securities" for a successful close, or "Offering closed unsuccessful" for a failure. Downstream classification must read this free-text field, often in combination with the dollar figures it contains and the <offeringAmount> and <maximumOfferingAmount> from <offeringInformation>.
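Because outcome lives only in free text, a first-pass classifier can be a simple keyword and regex heuristic. The sketch below is illustrative only: the outcome labels and keyword rules are assumptions chosen to fit the two example phrasings above, and real filings will need a broader rule set and manual review of edge cases.

```python
import re

# Matches dollar figures such as "$377,412.95"; bare numbers are ignored.
DOLLAR_RE = re.compile(r"\$\s?(\d[\d,]*(?:\.\d+)?)")


def dollar_amounts(text):
    """Return every dollar-prefixed figure in the text as a float."""
    return [float(match.replace(",", "")) for match in DOLLAR_RE.findall(text)]


def classify_progress_update(text):
    """Heuristically label a <progressUpdate> string.

    Returns (outcome, amounts); the outcome labels are hypothetical.
    """
    lowered = text.lower()
    if "unsuccessful" in lowered:
        outcome = "closed_unsuccessful"
    elif "close" in lowered:
        outcome = "closed"
    else:
        outcome = "progress_or_other"
    return outcome, dollar_amounts(text)
```

The extracted amounts can then be compared against <offeringAmount> and <maximumOfferingAmount> to confirm or override the keyword-based label.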