The Form DEF 14A dataset contains every definitive proxy statement filed on EDGAR from 1994 to the present — approximately 197,700 filings across all SEC-reporting registrants subject to the proxy solicitation rules under Section 14(a) of the Securities Exchange Act of 1934. Each record preserves the complete textual and tabular content of the proxy statement in its original HTML or plain-text format, covering executive compensation, board composition, corporate governance, auditor relationships, beneficial ownership, shareholder proposals, and all voting matters submitted for shareholder approval. Images such as proxy voting cards are excluded. The dataset spans publicly listed operating companies, closed-end funds, REITs, BDCs, SPACs, trusts, limited partnerships, and all other issuers subject to SEC proxy rules. It is survivorship-bias free, includes filings by entities that have since ceased reporting, and is organized into monthly ZIP containers updated daily.
The dataset is organized into monthly ZIP containers (e.g., 2025-11.zip). Each ZIP extracts into a top-level folder named by year-month (e.g., 2025-11/), containing one subfolder per filing named by accession number with dashes removed (e.g., 000121390025114245 for accession 0001213900-25-114245). Each subfolder holds exactly two files: a metadata.json file with structured filing metadata and one .htm file containing the full proxy statement content.
2025-11.zip
└── 2025-11/
    ├── 000089687825000057/
    │   ├── metadata.json
    │   └── intu-20251126.htm
    ├── 000121390025114245/
    │   ├── metadata.json
    │   └── ea0266162-def14a_nukkleus.htm
    └── ... (one folder per filing)
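The layout above can be walked without extracting the archive to disk. The following is a minimal sketch, assuming every filing folder sits exactly two levels deep and holds one metadata.json plus one document file. The function name iter_filings is hypothetical and not part of any published client library.

```python
import io
import json
import zipfile
from pathlib import PurePosixPath


def iter_filings(zip_source):
    """Yield (folder_name, metadata, document_name, document_bytes) per filing.

    Assumes the documented layout: <year-month>/<accession>/metadata.json
    plus one proxy statement document in the same folder.
    """
    with zipfile.ZipFile(zip_source) as archive:
        # Group member names by filing folder (the second path component).
        folders = {}
        for name in archive.namelist():
            parts = PurePosixPath(name).parts
            if len(parts) != 3:  # skip directory entries and unexpected depths
                continue
            folders.setdefault(parts[1], []).append(name)

        for folder, members in sorted(folders.items()):
            meta = next((m for m in members if m.endswith("/metadata.json")), None)
            doc = next((m for m in members if not m.endswith("/metadata.json")), None)
            if meta is None or doc is None:
                continue  # incomplete folder: skipped rather than guessed at
            yield (
                folder,
                json.loads(archive.read(meta)),
                PurePosixPath(doc).name,
                archive.read(doc),
            )
```

A caller would pass a path such as "2025-11.zip" (or any file-like object) and loop over the yielded tuples. Reading members on demand keeps memory use bounded by the largest single filing instead of the whole container.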
Each record represents a single Form DEF 14A filing — a definitive proxy statement — as submitted to EDGAR by a registrant. One record corresponds to one accession number and contains the complete textual content of the proxy statement in its original HTML or plain-text format. Images (proxy voting cards, logos, performance graphs rendered as image files, signature graphics) are excluded; all narrative, tabular, and disclosure content is preserved.
The metadata.json file contains fields including id (unique MD5 hash), accessionNo (SEC accession number), cik, ticker, companyName, companyNameLong, formType, description, filedAt (ISO 8601 timestamp), periodOfReport, linkToFilingDetails, linkToTxt, linkToHtml, linkToXbrl, documentFormatFiles (array listing all documents in the filing package with sequence, size, documentUrl, description, and type for each entry), dataFiles, entities (array with companyName, cik, sic, stateOfIncorporation, fiscalYearEnd, fileNo, irsNo, act, type, and filmNo for each entity), and seriesAndClassesContractsInformation. The .htm file contains the complete HTML markup of the DEF 14A proxy statement as published on EDGAR, beginning with EDGAR document tags and including all textual and tabular content. The .htm filename varies by filer (e.g., formdef14a.htm, intu-20251126.htm, aspi_def14a.htm).
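As an illustration of how these fields might be consumed, the sketch below flattens one parsed metadata.json record into a few commonly used values. It relies only on field names listed above. The helper name summarize_metadata, the output keys, and the choice to read sic from the first entity are assumptions made for the example, and the exact value types (for instance, whether cik is a string) are not specified here.

```python
from datetime import datetime


def summarize_metadata(metadata):
    """Return a flat dict of selected fields from one metadata.json record."""
    entities = metadata.get("entities") or []
    first_entity = entities[0] if entities else {}
    accession = metadata.get("accessionNo") or ""
    filed_at = metadata.get("filedAt")
    if filed_at:
        # Normalize a trailing "Z" so older Python versions can parse it.
        filed_at = datetime.fromisoformat(filed_at.replace("Z", "+00:00"))
    return {
        "accession_no": accession or None,
        # Folder names in the ZIP are the accession number without dashes.
        "folder_name": accession.replace("-", ""),
        "cik": metadata.get("cik"),
        "ticker": metadata.get("ticker"),
        "filed_at": filed_at or None,
        "sic": first_entity.get("sic"),  # assumption: first entity is the filer
        "document_urls": [
            d.get("documentUrl") for d in metadata.get("documentFormatFiles") or []
        ],
    }
```

Using .get throughout means a record with missing fields yields None values instead of raising, which is convenient when scanning a corpus that spans three decades.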
Form DEF 14A is the definitive proxy statement filed under Section 14(a) of the Securities Exchange Act of 1934 and Rule 14a-6. It is the principal disclosure document through which registrants solicit shareholder votes — at annual meetings, special meetings, or for written-consent solicitations. The proxy statement accompanies or precedes the proxy card and provides the information shareholders need to vote on matters such as director elections, executive compensation approval (say-on-pay), auditor ratification, equity plan approvals, mergers, charter amendments, and shareholder proposals.
Unlike periodic financial reports (Form 10-K, 10-Q), the proxy statement is centered on corporate governance, compensation, voting mechanics, and the relationship between management, the board, and shareholders. It is the single most important annual disclosure document for corporate governance analysis.
The .htm file within each record contains the full proxy statement. A typical DEF 14A follows a structured order, though registrants have flexibility in arranging certain sections:
Cover page and filing metadata — registrant identification, meeting date, record date, soliciting person(s), and the date proxy materials are first distributed.
Notice of meeting — meeting date, time, location or virtual platform, matters to be voted on, record date, and voting instructions.
Voting and proxy information — solicitation mechanics: who may vote, how to submit or revoke a proxy, quorum requirements, vote tabulation, broker non-votes, routine vs. non-routine matters, record vs. street-name ownership, solicitation costs, and proxy solicitor engagement.
Election of directors — biographical information for each nominee and continuing director: name, age, principal occupation, committee memberships, tenure, other directorships, qualifications considered by the nominating committee, independence status under NYSE or NASDAQ listing standards, and board leadership structure.
Corporate governance — board independence and composition, board diversity data (including the NASDAQ board diversity matrix since 2022), committee descriptions (audit, compensation, nominating/governance, and other standing committees), meeting attendance, risk oversight (including cybersecurity risk oversight), shareholder communication procedures, and director nomination process.
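Executive compensation — typically the longest and most complex section, including the Compensation Discussion and Analysis (CD&A), the Summary Compensation Table, and pay-versus-performance tables.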
Director compensation — table and narrative for non-employee director pay: retainers, meeting fees, equity grants, and other benefits.
Security ownership — beneficial ownership of equity securities by each director, nominee, NEO, all directors and officers as a group, and known holders of more than 5% of voting securities. Tables show shares owned, options exercisable within 60 days, unvested RSUs, and percentages.
Related-party transactions — transactions exceeding $120,000 involving directors, officers, 5% shareholders, or their immediate family members (Item 404 of Regulation S-K), plus review and approval policies.
Auditor ratification — identification of the audit firm, two-year audit and non-audit fee breakdown (audit fees, audit-related fees, tax fees, all other fees), and pre-approval policies.
Say-on-pay vote — non-binding advisory vote on NEO compensation under Section 14A of the Exchange Act.
Shareholder proposals — proposals submitted under Rule 14a-8, each with the proponent's supporting statement and the board's response. Common topics include climate disclosure, political spending, lobbying, board chairman independence, special meeting rights, and compensation clawback policies.
Equity plan proposals — proposals for new equity incentive plans, plan amendments, or share authorization increases, with plan descriptions, dilution analysis, and fair value estimates.
Merger and transaction proposals — when a merger, asset sale, or extraordinary transaction requires shareholder approval, the proxy statement contains transaction background, fairness opinions, merger agreement terms, pro forma financials, and risk factors. These filings can exceed 500 pages.
Audit committee report — furnished (not filed) statement confirming oversight of financial reporting and discussions with independent auditors.
Section 16(a) delinquent filings — disclosure of late Form 3, 4, or 5 filings by officers, directors, or 10% shareholders.
Additional information and appendices — shareholder proposal submission deadlines, Form 10-K availability, householding notices, and appended plan documents or charter provisions.
The DEF 14A's disclosure scope has expanded substantially since 1994.
Filing format evolved from plain ASCII text (1994 to early 2000s) to HTML (early 2000s onward), with increasing CSS sophistication and inline XBRL cover-page tagging in recent years. The proxy statement body is not subject to full inline XBRL tagging.
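Because the corpus mixes plain-text and HTML documents, text pipelines usually need to normalize both. The sketch below is one possible approach using only the Python standard library. The tag list in looks_like_html is a heuristic chosen for this example (it deliberately omits table, since older plain-text filings can carry table-style markers), and both function names are hypothetical.

```python
import re
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collect visible text, skipping <script> and <style> content."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def looks_like_html(document):
    """Heuristic: look for common HTML tags near the start of the document."""
    head = document[:20000]
    return re.search(r"<\s*(html|body|div|td|font)\b", head, re.IGNORECASE) is not None


def to_plain_text(document):
    """Return plain-text filings unchanged; strip markup from HTML filings."""
    if not looks_like_html(document):
        return document
    parser = _TextCollector()
    parser.feed(document)
    parser.close()
    return re.sub(r"\s+", " ", " ".join(parser.chunks)).strip()
```

Collapsing whitespace discards table alignment, so this suits narrative analysis; extracting compensation or fee tables would call for a parser that preserves row and cell structure.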
The definitive proxy statement is filed by registrants subject to the proxy solicitation rules under Section 14(a) of the Securities Exchange Act of 1934. The filing population includes domestic publicly traded operating companies with Section 12-registered equity, closed-end investment companies, REITs, BDCs, SPACs, publicly traded trusts, and limited partnerships with Exchange Act-registered securities.
Foreign private issuers are exempt from Section 14(a) proxy rules and do not file DEF 14A; their governance disclosures appear in Form 20-F or Form 6-K. Section 15(d)-only reporters (typically debt-only or ABS issuers) and open-end mutual funds also do not file DEF 14A.
The primary trigger is the annual meeting of shareholders. Companies with calendar fiscal years typically hold annual meetings between March and June, filing the definitive proxy statement 30 to 60 days before the meeting. The proxy must be filed with the SEC no later than the date it is first sent or given to shareholders. Special meetings triggered by mergers, asset sales, charter amendments, or contested elections produce additional DEF 14A filings.
The regulatory framework is Regulation 14A under the Exchange Act (17 CFR 240.14a-1 through 240.14a-21). Rule 14a-3 requires furnishing a proxy statement meeting Schedule 14A disclosure requirements. Rule 14a-6 governs filing timing. Rule 14a-8 governs shareholder proposals. Rule 14a-21 implements say-on-pay and say-on-frequency votes. This dataset contains only DEF 14A (definitive filings); preliminary proxy statements (PRE 14A) are a separate form type not included.
Form PRE 14A (Preliminary Proxy Statement) is the preliminary version of the same proxy statement, filed for SEC staff review at least 10 days before the definitive version is mailed. It is not distributed to shareholders. The DEF 14A dataset contains the final shareholder-facing disclosure.
Form DEFA14A (Additional Soliciting Materials) consists of supplemental materials filed after the definitive proxy: shareholder letters, press releases, investor presentations, and responses to proxy advisory firms. These are short, targeted documents, not comprehensive proxy statements.
Form 10-K (Annual Report) covers business operations, financial condition, risk factors, and financial statements. Part III of the 10-K (Items 10-14) is routinely satisfied by incorporation by reference to the DEF 14A. The two forms are complementary: the 10-K provides the financial and operational picture, the DEF 14A provides governance and compensation.
Form 20-F serves as the annual report for foreign private issuers, who are exempt from the proxy rules. Users seeking governance disclosures for FPIs must use the 20-F dataset.
Form DEF 14C (Information Statement) is the definitive information statement filed when no proxy solicitation is needed — typically because a controlling shareholder holds sufficient votes. It contains similar disclosures but omits the proxy card and solicitation request.
Form N-14 is used by registered investment companies for fund mergers and reorganizations. The DEF 14A dataset includes proxy statements from exchange-listed closed-end funds but not N-14 filings.
The DEF 14A dataset is the definitive source for the full text of proxy statements distributed to shareholders across all domestic SEC-reporting issuers. No other single filing type provides the same breadth of governance, compensation, ownership, audit relationship, and shareholder voting disclosure.
Corporate governance analysts and proxy advisory firms analyze board composition, director independence, committee structures, board diversity, and shareholder rights to generate governance scores and voting recommendations.
Executive compensation analysts and consultants extract pay data from compensation tables and the CD&A to benchmark CEO and NEO pay, evaluate incentive plan design, and assess pay-for-performance alignment across industry peers.
Securities and disclosure lawyers benchmark proxy disclosure practices, draft proxy materials, track SEC staff comment themes, and review merger and contested-election proxy disclosures.
Institutional stewardship teams use proxy statements to inform voting decisions on director elections, compensation proposals, and environmental and social shareholder proposals.
Fundamental equity research analysts examine management incentive structures, director ownership levels, related-party transactions, and governance frameworks to assess shareholder alignment.
Activist investors analyze governance weaknesses, board vulnerabilities, compensation misalignment, and entrenching provisions to support shareholder campaigns and contested elections.
Forensic accountants and audit fee analysts track audit and non-audit fees, monitor auditor concentration and market share, and analyze fee ratios as measures of auditor independence risk.
Financial data engineers and NLP/LLM researchers extract structured tabular data and use narrative sections (CD&A, shareholder proposals, governance discussions) for text classification, entity extraction, RAG applications, and governance-specific model training across the dataset's 30+ year span.
Academic researchers use the survivorship-bias-free dataset for empirical studies on executive compensation, board diversity, shareholder proposal outcomes, governance spillover effects, and disclosure evolution.
Benchmarking executive compensation across peer groups. Extract CEO and NEO pay data from Summary Compensation Tables and pay-versus-performance tables across all proxy statements in a sector and fiscal year. Compare total compensation levels, incentive plan structures, performance metric selection, and pay-for-performance alignment for advisory engagements or compensation committee deliberations.
Tracking board diversity and composition trends over time. Parse director biographies, board diversity matrices, and committee membership disclosures across the full filer population from 1994 to present. Measure the longitudinal evolution of board diversity, identify industry-level patterns, and assess the impact of NASDAQ's board diversity rules on disclosure rates and actual composition changes.
Analyzing shareholder proposal outcomes and trends. Extract shareholder proposals submitted under Rule 14a-8, classify by topic (climate, political spending, compensation, governance reform), record voting outcomes, and track multi-year trends in submission rates, support levels, and board adoption of majority-supported proposals.
Building structured audit fee databases. Extract audit fees, audit-related fees, tax fees, and all-other-fees from auditor ratification sections. Aggregate by audit firm, industry, issuer size, and year to analyze market share, fee trends, and non-audit-to-audit fee ratios.
Monitoring governance provisions and entrenchment features. Identify classified boards, poison pills, supermajority vote requirements, dual-class structures, proxy access provisions, and special meeting rights across the full filer population. Track adoption and removal over time to study governance convergence and the impact of activist campaigns.
Extracting merger proxy disclosures for deal analysis. Identify DEF 14A filings containing merger or acquisition proposals. Extract transaction terms, fairness opinion summaries, transaction background narratives, and pro forma financial data for deal benchmarking and premium analysis.
Training governance-focused NLP and RAG systems. Use the full-text corpus — 30+ years and nearly 200,000 filings — as training and retrieval data for LLMs specialized in corporate governance, compensation, and shareholder voting language.
Dataset Index JSON API: https://api.sec-api.io/datasets/form-def-14a-filings-html-and-text-only.json
The dataset index API returns metadata about the dataset, the download URL for the full archive, and a list of all individual container files with their sizes, record counts, and last-updated timestamps. This endpoint does not require an API key. Use it to monitor daily which containers have been refreshed and decide which files to download incrementally.
{
  "datasetId": "1f123a72-5a90-6320-9511-8a7d5030157a",
  "datasetDownloadUrl": "https://api.sec-api.io/datasets/form-def-14a-filings-html-and-text-only.zip",
  "name": "Form DEF 14A Filings - Definitive Proxy Statements - HTML and Text Only",
  "updatedAt": "2026-03-28T08:41:57.000Z",
  "earliestSampleDate": "1994-01-01",
  "totalRecords": 197729,
  "totalSize": 11036470451,
  "formTypes": ["DEF 14A"],
  "containerFormat": "ZIP",
  "fileTypes": ["TXT", "HTML", "PAPER"],
  "containers": [
    {
      "downloadUrl": "https://api.sec-api.io/datasets/form-def-14a-filings-html-and-text-only/2026/2026-03.zip",
      "key": "2026/2026-03.zip",
      "size": 48219384,
      "records": 312,
      "updatedAt": "2026-03-28T08:41:57.000Z"
    }
  ]
}
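The incremental-download decision described above can be sketched as a comparison between each container's updatedAt value and the time of the last local sync. The example assumes the response shape shown in the sample. The function names and the idea of storing the last sync time as an ISO 8601 string are illustrative choices, not part of the API.

```python
import json
import urllib.request
from datetime import datetime

INDEX_URL = "https://api.sec-api.io/datasets/form-def-14a-filings-html-and-text-only.json"


def parse_timestamp(value):
    """Parse timestamps such as 2026-03-28T08:41:57.000Z into aware datetimes."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def fetch_index(url=INDEX_URL):
    """Fetch the dataset index (no API key is needed for this endpoint)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def containers_to_refresh(index, last_sync):
    """Return keys of containers updated after the last_sync timestamp."""
    cutoff = parse_timestamp(last_sync)
    return [
        container["key"]
        for container in index.get("containers", [])
        if parse_timestamp(container["updatedAt"]) > cutoff
    ]
```

A daily job could call containers_to_refresh(fetch_index(), last_sync) and download only the returned keys, then record the index's top-level updatedAt as the new sync time.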
Download Entire Dataset: https://api.sec-api.io/datasets/form-def-14a-filings-html-and-text-only.zip?token=YOUR_API_KEY
Downloads the full dataset as a single ZIP archive containing all 197,729 records (~11 GB compressed). Requires an API key.
Download Single Container: https://api.sec-api.io/datasets/form-def-14a-filings-html-and-text-only/2026/2026-03.zip?token=YOUR_API_KEY
Downloads one monthly container instead of the full archive. Each container is a ZIP file covering one month of filings. Requires an API key.
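A single-container download could look like the sketch below, which builds the URL from a container key (as listed in the index) and streams the response to disk. It follows the URL pattern shown above. The function names, timeout, and destination handling are assumptions for the example, and error handling (for instance, an invalid API key) is left out.

```python
import shutil
import urllib.parse
import urllib.request

BASE_URL = "https://api.sec-api.io/datasets/form-def-14a-filings-html-and-text-only"


def container_url(key, api_key):
    """Build the download URL for one container key, e.g. 2026/2026-03.zip."""
    query = urllib.parse.urlencode({"token": api_key})
    return f"{BASE_URL}/{key}?{query}"


def download_container(key, api_key, destination):
    """Stream one monthly ZIP container to the destination path."""
    url = container_url(key, api_key)
    with urllib.request.urlopen(url, timeout=60) as response:
        with open(destination, "wb") as out_file:
            # Copy in chunks so the archive is never held fully in memory.
            shutil.copyfileobj(response, out_file)
    return destination
```

For example, download_container("2026/2026-03.zip", "YOUR_API_KEY", "2026-03.zip") would save the March 2026 container in the working directory.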
What format are the proxy statements in? Each record is in HTML or plain text, matching the original EDGAR submission format. Earlier filings (1994 to early 2000s) are predominantly plain ASCII text; later filings are in HTML with styled tables and formatting.
Are images included? No. Images such as proxy voting cards, company logos, performance graphs rendered as images, and signature graphics are excluded. All textual and tabular content is preserved.
Does the dataset include preliminary proxy statements (PRE 14A)? No. The dataset contains only DEF 14A — definitive proxy statements actually distributed to shareholders. Preliminary filings are a separate form type.
Does the dataset cover foreign private issuers? No. Foreign private issuers are exempt from Section 14(a) proxy rules and do not file DEF 14A. Their governance disclosures appear in Form 20-F or Form 6-K.
How far back does the dataset go? The dataset covers EDGAR filings from 1994 to the present, spanning over 30 years of proxy statement disclosure.
How often is the dataset updated? The dataset is updated daily. The dataset index JSON API shows which monthly containers have been refreshed in the latest update cycle.
What is the total size of the dataset? 197,729 records totaling ~11 GB compressed across all monthly ZIP containers. Both figures grow as new filings are added daily.
Do proxy statements vary significantly across issuer types? Yes. Operating company proxy statements emphasize executive compensation and corporate governance. Closed-end fund proxies focus on advisory agreements and fund governance. REIT proxies reflect FFO-linked compensation. SPAC merger proxies contain extensive transaction disclosures and can exceed 500 pages. The proxy statement's content emphasis depends on the registrant type and the matters submitted for shareholder approval.