The SEC Comment Letters and Correspondence Dataset contains the complete public record of written communications between SEC staff and registrants during the SEC's filing review process. It covers all EDGAR submissions of form types CORRESP (registrant response letters) and LETTER (SEC staff comment letters) from March 2004 through the present — approximately 600,000 files totaling ~18 GB. The dataset gives researchers, lawyers, accountants, and investors direct access to the regulator's own accounting questions, disclosure insufficiency determinations, and interpretive positions applied to specific company filings.
Programmatically retrieve the full list of dataset archive files, download URLs and dataset metadata.
Dataset Index JSON API
Download the entire dataset as a single archive file.
Download Entire Dataset:
Download a single container file (e.g. monthly archive) from the dataset.
Download Single Container:
The dataset captures both sides of the SEC staff review dialogue:
SEC staff comment letters (LETTER) are formal written communications from Division of Corporation Finance, Division of Investment Management, or Division of Trading and Markets staff to a registrant. Each letter identifies the filing under review, lists numbered comment items citing specific sections or disclosure elements, directs the registrant to respond within 10 business days, and identifies the reviewing staff attorney or accountant by name and title.
Registrant response letters (CORRESP) are filed by the issuer — or its outside securities counsel — in reply to an SEC staff comment letter. Each response reproduces the SEC's comment items verbatim and addresses them in order, either committing to revised disclosure, providing supplemental financial analysis, or arguing that existing disclosure is adequate.
Together, these two form types reconstruct the complete regulatory review dialogue: the SEC's questions and the registrant's answers. No other EDGAR dataset type captures this exchange. Primary filings (10-K, S-1, proxy) present the registrant's voice; comment letters and correspondence present the regulator's evaluation of those disclosures.
Each record corresponds to one EDGAR filing submission. Submissions from 2004–2010 are predominantly plain TXT; post-2010 submissions are predominantly HTML. Recent staff comment letters, which are published as PDF files, are included as PDF-to-text converted TXT files.
A typical LETTER submission contains:
A typical CORRESP submission contains:
The following topics appear with sufficient regularity across the corpus to constitute a practical taxonomy of SEC disclosure focus areas:
SEC staff originate comment letters (LETTER). The Division of Corporation Finance (Corp Fin) handles the largest volume by far, organized into industry offices (Office of Manufacturing, Office of Finance, Office of Technology, etc.) that each review a defined registrant population. The Division of Investment Management reviews investment company and investment adviser filings. The Division of Trading and Markets reviews broker-dealer and exchange filings. Letters are typically signed jointly by a Staff Attorney or Staff Accountant and a Branch Chief.
Registrants file response letters (CORRESP). The full range of public issuers appears: domestic operating companies of all sizes, foreign private issuers (FPIs) on Form 20-F/40-F/F-1, registered investment companies, SPACs (particularly 2020–2022), emerging growth companies in IPO review, REITs, banks, utilities, and shell companies. CORRESP submissions are sometimes drafted and submitted by outside securities counsel, whose letterhead appears on the document, though the filing appears under the registrant's CIK.
Comment letters are not routine deadline events — they are initiated at SEC staff discretion:
For Securities Act registration statements, the SEC targets an initial comment letter within 30 calendar days of filing. Response deadlines are almost always 10 business days from the letter date, with extensions routinely available. Full review cycles run 4–16 weeks depending on form type and complexity; complex S-1 IPO reviews can extend over several months through 3–5 rounds of correspondence.
Comment letters and responses are withheld from EDGAR public release until at least 20 business days after the SEC formally closes the review. This creates a rolling lag of roughly 4–6 weeks at the leading edge of the dataset, so correspondence from recently closed reviews does not appear immediately.
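The 20-business-day lag can be turned into an approximate earliest release date with simple weekday arithmetic. The following is a minimal sketch, assuming that only weekends are non-business days (federal holidays are not modeled, so actual release can be later); `earliest_public_release` is a hypothetical helper name, not part of the dataset or any API.

```python
from datetime import date, timedelta


def earliest_public_release(review_closed: date, business_days: int = 20) -> date:
    """Return the date `business_days` weekdays after `review_closed`.

    Assumption: only Saturdays and Sundays are skipped. Federal holidays
    are ignored, so the real release date may fall later.
    """
    current = review_closed
    remaining = business_days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 .. Friday=4
            remaining -= 1
    return current


# A review closed on Monday 2025-03-03 becomes releasable four weeks later.
print(earliest_public_release(date(2025, 3, 3)))  # 2025-03-31
```

Because this is only a lower bound, a review closed within the last month or so should be treated as possibly missing from the dataset.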
The process rests on Securities Exchange Act of 1934 Section 13(a) (continuous reporting obligations), Securities Act Section 8 (registration review authority), Sarbanes-Oxley Act Section 408 (three-year review mandate), and the SEC staff policy, announced in 2004, of publicly releasing comment letters and responses on EDGAR. Comment letters are not enforcement actions and carry no formal sanctions in themselves, though unresolved issues can be a precursor to investigation referrals.
The comment letters corpus is structurally unlike every other major EDGAR dataset because it represents the regulator's evaluation of disclosures, not the disclosures themselves.
vs. Primary periodic and registration filings (Form 10-K, Form 10-Q, S-1, 20-F, DEF 14A): Primary filings are the registrant's voice — disclosures presented to investors. Comment letters are the SEC's response — evaluative judgments about whether those disclosures are adequate, accurate, and compliant. Combining both enables attribution analysis: identifying which disclosure changes between consecutive filings were SEC-directed.
vs. Registration statement amendments (S-1/A, F-1/A, S-4/A): Amendments show what changed in a disclosure; comment letters explain why the change was required. Without the comment letter, a disclosure revision in an S-1/A is visible but causally unexplained.
vs. No-action letters: No-action letters are voluntary and prospective — a company asks whether a contemplated action would trigger enforcement. Comment letters are mandatory and reactive — the SEC initiates them after a filing is made. No-action letters are published on SEC.gov in a separate database, not through EDGAR CORRESP/LETTER submissions, and address exemptive relief and rule interpretation rather than disclosure quality.
vs. SEC enforcement actions: Enforcement actions involve formal investigation, carry potential penalties, and represent rare, escalated outcomes. Comment letters represent the routine, high-volume oversight process — the vast majority of comment letter reviews are resolved without enforcement referral. In the SEC oversight pipeline, comment letters are the early-stage signal; enforcement actions are the downstream outcome.
| Dimension | Comment Letters (CORRESP, LETTER) | Primary Filings (10-K, S-1) | No-Action Letters | Enforcement Actions |
|---|---|---|---|---|
| Initiating party | SEC staff | Registrant | Registrant / counsel | SEC Enforcement |
| Nature | Regulatory review dialogue | Disclosure to investors | Advisory guidance | Adjudicatory proceeding |
| EDGAR form types | CORRESP, LETTER | 10-K, S-1, 8-K, etc. | Not on EDGAR | AP, Admin Proc |
| Record count | ~600,000 (2004–present) | Millions | Thousands | Hundreds/year |
| Structured as dialogue | Yes — matched Q&A | No | Partially | No |
| Reflects SEC's view | Directly, per-filing | Implicitly (by acceptance) | Prospectively | By adjudication |
| Public release lag | 20 business days post-closure | Same day as filing | Published as issued | Published when filed |
The comment letters corpus attracts a different user profile than most EDGAR financial datasets. Because it captures the SEC's regulatory evaluation of disclosures rather than the disclosures themselves, the primary users are professionals whose work depends on understanding what SEC staff expects, has questioned, or has required companies to change.
Securities lawyers and outside counsel are the heaviest practical users. Before preparing a registration statement or annual report, counsel reviews prior comment letters received by the same client and by comparable issuers to identify areas likely to attract staff attention, research Corp Fin's current interpretation of specific rules, build firm-internal precedent databases, and draft preemptive disclosure language that has resolved similar comments in prior CORRESP filings.
In-house securities and disclosure counsel use comment letter data for ongoing disclosure governance — reviewing the company's own correspondence history, identifying active SEC focus areas relevant to their sector, and preparing disclosure committee meeting materials grounded in current staff expectations rather than generic checklists.
Audit partners and technical accounting teams at Big Four and national firms use the corpus to track SEC staff interpretations of US GAAP that are not yet reflected in FASB standards or formal guidance — monitoring how staff applies ASC 606, ASC 842, ASC 805, and ASC 740 in practice across a wide range of registrants, and using this to prepare audit committee briefings and update firm accounting positions.
Investment banking ECM and capital markets teams use the dataset to reduce IPO registration review risk — pulling recent LETTER filings from comparable S-1 issuers to build pre-filing checklists, estimating review timelines, and identifying business-model-specific disclosure issues the SEC has flagged for similar companies.
Academic researchers in accounting, finance, and law use the 22-year longitudinal corpus for empirical studies of SEC oversight behavior — testing whether comment receipt predicts restatements or enforcement, analyzing variation in comment intensity by auditor quality or firm size, and studying how accounting standard adoptions and regulatory events propagate into comment letter topics.
Financial journalists and investigative reporters use comment letters to surface regulatory scrutiny that companies have not publicly highlighted — finding SEC questions about transactions, accounting treatments, or disclosures that later proved problematic, and corroborating investigative reporting with contemporaneous regulatory documentary evidence.
Compliance officers and disclosure advisors use the dataset as a rolling regulatory intelligence feed — running periodic sweeps of recent LETTER filings in their industry to update disclosure checklists, benchmark their company's MD&A and risk factors, and prepare board presentations on current SEC disclosure expectations.
Hedge funds, short sellers, and activist investors use comment letter histories to identify companies with recurring SEC accounting questions, unresolved comment rounds, or staff scrutiny of the exact financial statement items the investor wants to examine. Multi-round review histories and comments touching goodwill impairment, related-party transactions, or going concern assessments are high-value signals.
M&A due diligence counsel and financial advisors review a target company's full CORRESP/LETTER history to assess disclosure quality, unresolved regulatory concerns, and accounting treatments questioned by the SEC that may affect post-acquisition financial statement presentation.
Securities counsel and ECM teams use the dataset before finalizing an S-1 to build a pre-filing comment risk checklist. They pull all LETTER filings from the prior 12–24 months directed at comparable-industry S-1 issuers, systematically extract comment topics and resolution approaches from matching CORRESP responses, and identify patterns — revenue recognition policy, non-GAAP metric definitions, risk factor specificity, segment disclosure, use-of-proceeds language — that consistently appear across comparable companies. The output is a disclosure review checklist specific to the issuer's sector and business model, grounded in recent SEC review experience rather than generic templates. Well-prepared filings clear in 1–2 comment rounds; unprepared filings typically run 3–5 rounds over 3–5 months.
Compliance officers and disclosure advisors retrieve all LETTER filings from the prior two years directed at peer companies in the same SIC code, filtered to form type 10-K. They catalog comment items by section (MD&A, risk factors, financial statement notes, executive compensation) and identify which disclosure patterns consistently attract comment. Comparing the company's own current disclosure against those patterns identifies specific gaps — for example, if multiple peers were asked to provide more specific liquidity runway analysis, and the company's MD&A uses similarly general language. The output is an annual disclosure gap analysis grounded in current SEC comment experience, supporting the disclosure committee review with specific, citation-backed recommendations.
Audit firm national offices and law firm regulatory practices extract LETTER filings from the most recent 12 months and apply NLP topic modeling or keyword frequency analysis to identify rising and falling comment themes. The approach has identified major shifts before formal guidance materialized: the emergence of climate-related disclosure comments in 2022 before formal rule finalization; the concentration of warrant accounting comments on SPAC filings in early 2021; the 2023–2024 surge in comments on AI-related business description language. The 2004–present time series enables distinguishing genuinely new topics from recurring cyclical themes. Output is quarterly regulatory intelligence reporting used to update client alerts, audit practice guidance, and firm-wide disclosure advisory materials.
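The keyword-frequency variant of this workflow can be sketched in a few lines. This is a minimal illustration, assuming the letters have already been loaded as `(year, text)` pairs; the function name `keyword_share_by_year`, the keyword list, and the sample strings are all made up for the example, and plain substring matching stands in for real topic modeling.

```python
from collections import Counter, defaultdict


def keyword_share_by_year(letters, keywords):
    """For each year, return the fraction of letters mentioning each keyword.

    `letters` is an iterable of (year, text) pairs. Matching is a simple
    case-insensitive substring test (an illustrative simplification).
    """
    totals = Counter()            # letters seen per year
    hits = defaultdict(Counter)   # year -> keyword -> letters containing it
    for year, text in letters:
        totals[year] += 1
        lowered = text.lower()
        for kw in keywords:
            if kw.lower() in lowered:
                hits[year][kw] += 1
    return {
        year: {kw: hits[year][kw] / totals[year] for kw in keywords}
        for year in sorted(totals)
    }


# Invented sample text, for illustration only.
sample = [
    (2021, "Please revise your accounting for the warrants."),
    (2021, "Expand your discussion of revenue recognition."),
    (2022, "Tell us how you considered climate-related risks."),
]
shares = keyword_share_by_year(sample, ["warrant", "climate"])
print(shares)
```

Normalizing by the number of letters per year, rather than using raw counts, keeps a rising theme distinguishable from a year that simply had more letters.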
M&A counsel and due diligence teams retrieve all CORRESP and LETTER filings for a target company's CIK from 2004 to present, reconstructing the full review history. The assessment covers: total number of review cycles; rounds per cycle; topics consistently raised across multiple reviews; instances where the registrant pushed back rather than amended; and any reviews not yet publicly released. Accounting-focused comments on revenue recognition, segment reporting, and related-party transactions receive particular scrutiny as indicators of post-acquisition financial reporting risk. Output is a structured comment letter history exhibit in the due diligence report, with flagged material recurring issues and document citations.
Plaintiff and defense counsel in securities class actions retrieve all LETTER and CORRESP filings for the defendant company's CIK for the relevant period, examining whether the SEC ever questioned the specific accounting treatment or disclosure at issue. A comment letter predating the class period may establish knowledge or recklessness; a review occurring without comment on the relevant item may support a reasonable-reliance defense. CORRESP responses containing confidential supplemental analysis (even if redacted in the public record) may be relevant to discovery about what information the SEC possessed. Output is a chronological record of the company's regulatory review dialogue with annotations to items intersecting the litigation claims.
Accounting and finance researchers use the 600,000+ record, 22-year corpus for empirical studies at a scale not achievable with manually collected samples. Representative study designs: panel regressions testing whether comment receipt predicts future restatements or enforcement actions; cross-sectional analysis of comment intensity by auditor quality, firm size, or financial distress; event studies of market reaction to comment letter receipt; difference-in-differences designs around new accounting standard adoptions (ASC 606, ASC 842) or regulatory events (SPAC boom). The matched LETTER-CORRESP structure enables dialogue analysis — studying whether concession vs. pushback response strategies differ in achieving review closure.
Short-seller analysts and activist research teams build systematic screens across the comment letter corpus: identifying companies that received three or more rounds of comments in a single review cycle, received repeated comments on the same accounting topic across multiple review cycles, filed CORRESP responses defending a treatment the SEC later required to be revised, or had a comment letter directly preceding a restatement announcement. For specific investment targets, analysts read the full LETTER-CORRESP thread to extract the SEC's own framing of accounting questions — often more pointed and specific than anything in the public filings. The SEC's own words about a company's accounting provide analytically and rhetorically powerful support for short or activist theses.
The dataset is distributed as a single master ZIP archive (sec-comment-letters.zip) containing monthly sub-archives organized by year and month (e.g., 2026/2026-02.zip). Each monthly archive contains the EDGAR submission files for CORRESP and LETTER filings accepted in that month.
Full dataset download: https://api.sec-api.io/datasets/sec-comment-letters.zip
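Once the master archive is on disk, the nested layout can be walked with Python's standard `zipfile` module. This is a minimal sketch under stated assumptions: the archive has already been downloaded (any authentication the URL may require is not covered here), monthly sub-archives are stored as `.zip` members as described above, and the file naming inside each monthly archive is unknown, so the helper just yields every member it finds. `iter_submission_files` is a hypothetical name.

```python
import io
import zipfile


def iter_submission_files(master_zip):
    """Yield (monthly_archive_name, member_name, raw_bytes) for every file
    inside every monthly sub-archive of the master ZIP.

    `master_zip` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(master_zip) as master:
        for monthly_name in sorted(master.namelist()):
            if not monthly_name.lower().endswith(".zip"):
                continue  # skip directories and any non-archive entries
            # Each monthly archive is read fully into memory before opening.
            monthly_bytes = master.read(monthly_name)
            with zipfile.ZipFile(io.BytesIO(monthly_bytes)) as monthly:
                for member in monthly.namelist():
                    if member.endswith("/"):
                        continue  # directory entry, no content
                    yield monthly_name, member, monthly.read(member)
```

Typical usage would be `for month, name, data in iter_submission_files("sec-comment-letters.zip"): ...`. Processing one monthly archive at a time keeps memory use bounded by the largest month rather than by the full dataset.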
Content is provided as-filed from EDGAR. No NLP pre-processing, comment item extraction, or structured field parsing is applied. Users building structured analytical datasets must apply their own text processing to primary document files.
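As an example of the kind of text processing users need to supply, a first pass at splitting a plain-text staff letter into its numbered comment items might look like the sketch below. It assumes each comment starts on its own line with a number followed by a period (for example `1. Please revise ...`); real letters vary, HTML documents would need tag stripping first, and `split_comment_items` is a hypothetical helper.

```python
import re

# Assumption: a comment item begins at the start of a line with a number
# and a period, followed by whitespace, e.g. "1. Please revise ...".
ITEM_START = re.compile(r"^[ \t]*(\d{1,3})\.[ \t]+", re.MULTILINE)


def split_comment_items(letter_text):
    """Return a list of (item_number, item_text) tuples found in the text."""
    matches = list(ITEM_START.finditer(letter_text))
    items = []
    for i, m in enumerate(matches):
        # An item runs until the next item marker or the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(letter_text)
        body = letter_text[m.end():end]
        items.append((int(m.group(1)), " ".join(body.split())))
    return items
```

Note that the last item absorbs whatever follows it, such as the closing paragraph and signature block, so a production parser would need an additional rule to cut the letter's closing.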
How are comment letters and response letters linked in the dataset? The link is implicit, not explicit. A staff letter and its CORRESP response share the same registrant CIK. The response text cites the SEC letter's date and sometimes its accession number. Reconstructing a complete review thread requires parsing document text and applying date-sequence logic — there is no dedicated metadata field connecting a LETTER to its corresponding CORRESP.
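A rough version of the date-sequence step is sketched below. It assumes each submission has already been reduced to a CIK, a form type, and a date; the `Record` fields and the 90-day gap threshold are illustrative assumptions, not dataset fields or SEC rules. Parsing the letter date cited inside each CORRESP response would be needed to refine the grouping.

```python
from dataclasses import dataclass
from datetime import date
from itertools import groupby


@dataclass(frozen=True)
class Record:
    cik: str
    form_type: str  # "LETTER" or "CORRESP"
    filed: date


def build_threads(records, max_gap_days=90):
    """Group records into candidate review threads per CIK.

    Heuristic (an assumption, not part of the dataset): documents for the
    same CIK belong to one thread while consecutive dates are at most
    `max_gap_days` apart; a larger gap starts a new thread.
    """
    threads = []
    ordered = sorted(records, key=lambda r: (r.cik, r.filed))
    for _, group in groupby(ordered, key=lambda r: r.cik):
        current = []
        for rec in group:
            if current and (rec.filed - current[-1].filed).days > max_gap_days:
                threads.append(current)
                current = []
            current.append(rec)
        threads.append(current)  # each CIK group has at least one record
    return threads
```

Counting the `LETTER` records in each resulting thread gives an approximate number of comment rounds for that review.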
Do the records include the full text of the letters? Yes. The primary document in each submission contains the full letter body as filed on EDGAR, not excerpts or summaries. The original HTML or TXT format is preserved, and staff letters published as PDF are included as PDF-to-text converted TXT files.
Are all portions of every letter publicly available? Not necessarily. Some letters contain passages subject to confidential treatment orders, which are redacted (typically shown as bracketed placeholders or omitted passages) before public EDGAR release. The dataset reflects the public EDGAR record, including any such redactions.
How current is the dataset? Comment letters are withheld from EDGAR public release for 20 business days after SEC staff formally closes the review. The most recent 4–6 weeks of completed reviews are typically not yet visible. The dataset is updated monthly.
Does the dataset distinguish SEC-authored letters from registrant responses? Yes. Form type LETTER identifies SEC staff comment letters; form type CORRESP identifies registrant responses. Both are filed under the registrant's CIK, but the form type field provides an unambiguous distinction.
Which SEC divisions are represented? Primarily Division of Corporation Finance (the dominant source), with Division of Investment Management and Division of Trading and Markets also represented. The letter body identifies the originating office and staff member.
Is this dataset useful for non-US companies? Yes. Foreign private issuers (FPIs) appear extensively in the corpus, particularly companies filing on Forms 20-F and F-1. Comment letters directed at FPIs often address IFRS reconciliation, home-country exemptions, and VIE structure disclosures relevant to China-based issuers.
Can this dataset be used to build AI/NLP models? Yes. The corpus size and consistent dialogue structure make it well-suited for NLP applications including topic classification, regulatory question extraction, comment-response matching, and training data for compliance-oriented language models.