The Form DEF 14A Files dataset is a collection of definitive proxy statements filed on EDGAR under Section 14(a) of the Securities Exchange Act of 1934 and distributed to shareholders ahead of annual or special meetings. One record is one complete EDGAR submission of a DEF 14A, uniquely keyed by SEC accession number, delivered as a single folder that holds a metadata.json descriptor, the primary Inline XBRL HTML proxy statement, every textual exhibit filed under the accession, and — for a small minority of large-cap filings — a registrant-supplied courtesy PDF. Filers are overwhelmingly domestic operating companies with equity registered under Section 12(b) or Section 12(g), along with closed-end funds, business development companies, REITs, and similar Section 12-registered entities that solicit shareholder votes. Submissions are packaged as monthly ZIP containers and cover the period from January 1, 1994 to the present, aligning with the EDGAR rollout of mandatory electronic proxy filing in the mid-1990s.
The dataset is built from definitive proxy statements filed on Form DEF 14A. Form DEF 14A is the "definitive proxy statement" furnished under Section 14(a) of the Securities Exchange Act of 1934 and Rule 14a-3, sent to shareholders in advance of an annual or special meeting so that they can vote on matters such as director elections, auditor ratification, Say-on-Pay and Say-on-Frequency votes, equity compensation plan approvals, charter and bylaw amendments, stock splits, mergers, and shareholder proposals. Its substantive content is governed by Schedule 14A (Rule 14a-101), the enumerated Items of Regulation S-K (notably 201(d), 401, 402, 403, 404, and 407), and Rule 14a-21 for compensation-related advisory votes. Since 2023, Item 402(v) adds a structured, iXBRL-tagged pay-versus-performance disclosure.
The dataset contains only DEF 14A submissions. It deliberately excludes the sibling proxy form types that share the same Section 14(a) framework: PRE 14A (preliminary), DEFA14A (additional definitive soliciting materials), DEFM14A (merger-specific), DEFC14A (contested), and DEF 14A/A (amendment). Coverage begins January 1, 1994, the start of mandatory electronic proxy filing on EDGAR, and extends to the present. Filings are distributed as monthly ZIP containers; each container unpacks into one folder per accession, holding the EDGAR-native descriptor plus the primary textual artifacts of the submission.
One record in the Form DEF 14A Files dataset is one complete EDGAR submission of a definitive proxy statement, uniquely keyed by its SEC accession number. Inside a monthly ZIP container, the record materializes as a single folder whose name is the 18-digit dashless form of the accession (for example 000004846525000074 for accession 0000048465-25-000074). The folder holds a metadata.json descriptor plus every textual and PDF document the registrant filed under that accession: the primary proxy statement, any textual exhibits, and, when produced, a registrant-supplied courtesy PDF. Image graphics (EDGAR GRAPHIC artifacts) and the XBRL linkbase sidecars are catalogued in the metadata but deliberately omitted from the payload. The record unit is therefore the filing-as-delivered-to-EDGAR, flattened into a single folder of primary textual artifacts plus a self-describing manifest.
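The mapping between the dashed accession number and the 18-digit folder name can be sketched as follows. This is a minimal illustration; the helper names `to_dashless` and `to_dashed` are hypothetical and not part of any published client library.

```python
import re


def to_dashless(accession_no: str) -> str:
    """Turn '0000048465-25-000074' into the 18-digit folder name."""
    digits = accession_no.replace("-", "")
    if not re.fullmatch(r"\d{18}", digits):
        raise ValueError(f"not an 18-digit accession number: {accession_no!r}")
    return digits


def to_dashed(folder_name: str) -> str:
    """Turn an 18-digit folder name back into the dashed 10-2-6 form."""
    digits = to_dashless(folder_name)  # validates; also accepts already-dashed input
    return f"{digits[:10]}-{digits[10:12]}-{digits[12:]}"
```

Normalizing to one form up front avoids silent mismatches when folder names are joined against `accessionNo` values from metadata.json.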
At the filesystem level, one record consists of up to four kinds of artifact:
metadata.json descriptor (always present; exactly one per record).
Primary proxy statement: .htm file in Inline XBRL HTML (always present for modern filings).
Textual exhibits: .htm file wrapped in an EDGAR SGML <DOCUMENT> envelope.
Courtesy PDF: registrant-supplied .pdf (present only for a small minority of filings).
The typical footprint is two files (metadata.json plus the primary .htm). In the December 2025 sample container, 201 of 210 indexed submissions resolved to folders (the remaining nine were image-only or late-arriving), producing 204 HTML documents, 201 metadata JSONs, and 6 courtesy PDFs across 411 files. Folders grow to four or five files only when textual exhibits (charter, bylaws, material contracts, press releases) or a courtesy PDF accompany the primary document.
metadata.json descriptor
The metadata.json file captures the EDGAR-level descriptor of the submission and acts as both a record header and a pointer registry to assets that are not physically shipped. Top-level fields:
id: 32-character hexadecimal internal record identifier used by sec-api.
formType: always "DEF 14A".
accessionNo: dashed SEC accession number (e.g. "0000048465-25-000074").
description: EDGAR submission description, typically "Form DEF 14A - Other definitive proxy statements".
filedAt: ISO-8601 timestamp including the EDGAR receive timezone (e.g. "2025-12-17T16:30:40-05:00").
periodOfReport: date string; for DEF 14A this is the shareholder meeting or record date, not a fiscal-period end.
linkToFilingDetails: canonical EDGAR URL for the primary document (often the iXBRL viewer).
linkToTxt: URL of the complete SGML submission text file.
linkToHtml: URL of the EDGAR -index.htm filing-index page.
linkToXbrl: URL of a standalone XBRL viewer; frequently an empty string for DEF 14A, because tagging is carried inline in the primary HTML rather than in a separate instance.
entities: array of every participating entity (filer, subject, filing agent). Each object carries companyName (with an EDGAR role suffix such as "(Filer)" or "(Subject)"), cik, fileNo, irsNo, stateOfIncorporation, fiscalYearEnd (MMDD), sic (numeric code plus human label), act, type, filmNo, a tickers array that may enumerate multiple classes (e.g. ["OPTX","OLIT","OLITU","OPTXW","OLITW"]), and the EDGAR industry-office label.
seriesAndClassesContractsInformation: array that is empty for operating-company proxies and populated only when the filer is a registered investment company disclosing series/class identifiers.
documentFormatFiles: ordered manifest of every textual and binary document EDGAR recorded for the submission. Each entry has sequence (string; blank for the synthetic complete-submission .txt), size (bytes, stringified), documentUrl, type (e.g. "DEF 14A", "GRAPHIC", "EX-3.1", "EX-99.1"), and an optional description. The entry with sequence "1" is always the primary proxy; GRAPHIC entries are listed for completeness even though the image bytes are not inside the ZIP.
dataFiles: array of XBRL sidecar artifacts: the schema (EX-101.SCH, .xsd), the definition/label/presentation linkbases (EX-101.DEF, EX-101.LAB, EX-101.PRE, each .xml), and an extracted XBRL instance (for example *_htm.xml). These XML artifacts are referenced by URL only and are not carried in the payload.
Together these fields make metadata.json a self-describing index of what is present on disk, a pointer registry for excluded assets, and the EDGAR-native view of the submission.
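A minimal sketch of reading the descriptor is shown below. It relies only on the field names listed above; the helper names `load_metadata` and `summarize_metadata` are hypothetical, and the code assumes that the filer entity is the one whose companyName contains the "(Filer)" suffix and that the file name is the last path segment of documentUrl.

```python
import json
from pathlib import Path


def load_metadata(folder: Path) -> dict:
    """Read metadata.json from one accession folder."""
    return json.loads((folder / "metadata.json").read_text(encoding="utf-8"))


def summarize_metadata(meta: dict) -> dict:
    """Pull the reporting registrant and the primary document out of the descriptor."""
    entities = meta.get("entities", [])
    # Assumption: the registrant is the entity carrying the "(Filer)" role suffix.
    filer = next((e for e in entities if "(Filer)" in e.get("companyName", "")), None)
    docs = meta.get("documentFormatFiles", [])
    # The entry with sequence "1" is the primary proxy statement.
    primary = next((d for d in docs if d.get("sequence") == "1"), None)
    return {
        "accessionNo": meta.get("accessionNo"),
        "filedAt": meta.get("filedAt"),
        "filerName": filer["companyName"].replace("(Filer)", "").strip() if filer else None,
        "filerCik": filer.get("cik") if filer else None,
        # Assumption: the on-disk file name is the last path segment of documentUrl.
        "primaryFile": primary.get("documentUrl", "").rsplit("/", 1)[-1] if primary else None,
    }
```

Returning `None` for absent pieces, instead of raising, keeps a batch run over a whole container from stopping on a single unusual record.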
Primary proxy statement (.htm)
The primary document is a single Inline XBRL HTML file: a complete <html> page whose body is rendered for human reading while simultaneously exposing machine-readable XBRL facts through elements in the http://www.xbrl.org/2013/inlineXBRL namespace (<ix:nonNumeric>, <ix:nonFraction>, <ix:header>, <ix:hidden>, <ix:references>, <ix:resources>). The root element declares XBRL taxonomy namespaces: dei, us-gaap, ecd (the Executive Compensation Disclosure taxonomy used for pay-versus-performance), srt, any company-specific extension taxonomy, and iso4217 for currency units. An <ix:hidden> block near the top carries at minimum the cover-page facts dei:DocumentType (value "DEF 14A"), dei:AmendmentFlag, and dei:EntityCentralIndexKey; a reporting context (for example <xbrli:context id="c-1">) supplies the fiscal year start and end dates that any ecd:* pay-versus-performance facts reference.
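As a rough illustration of how the inline facts can be read without a dedicated XBRL library, the sketch below collects <ix:nonNumeric> and <ix:nonFraction> elements with Python's standard html.parser. The class and function names are hypothetical. The values returned are the displayed text only: scale, sign, and format attributes are not applied, so a production pipeline would more likely use an iXBRL-aware parser.

```python
from html.parser import HTMLParser


class IxFactParser(HTMLParser):
    """Collect inline XBRL facts (name, contextRef, displayed text)."""

    # HTMLParser lowercases tag and attribute names.
    FACT_TAGS = {"ix:nonnumeric", "ix:nonfraction"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.facts = []
        self._open = []  # stack of facts currently open (facts may nest)

    def handle_starttag(self, tag, attrs):
        if tag in self.FACT_TAGS:
            a = dict(attrs)
            self._open.append(
                {"name": a.get("name"), "contextRef": a.get("contextref"), "text": []}
            )

    def handle_data(self, data):
        # Text belongs to every fact element that is currently open.
        for fact in self._open:
            fact["text"].append(data)

    def handle_endtag(self, tag):
        if tag in self.FACT_TAGS and self._open:
            fact = self._open.pop()
            fact["value"] = "".join(fact.pop("text")).strip()
            self.facts.append(fact)


def extract_ix_facts(html: str) -> list[dict]:
    """Return facts in the order their closing tags appear in the document."""
    parser = IxFactParser()
    parser.feed(html)
    parser.close()
    return parser.facts
```

Filtering the result on names that start with `ecd:` isolates the pay-versus-performance facts, while `dei:` names give the cover-page facts from the hidden block.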
Filenames follow two conventions driven by the filing agent. Large-company filings produced by Workiva or Donnelley use {ticker}-{yyyymmdd}.htm (e.g. hrl-20251216.htm, apd-20251210.htm, lits-20251230.htm). Smaller filers using Broadridge, Toppan Merrill, or EdgarAgents use generic names such as formdef14a.htm, def14aproxystatement.htm, tm{id}_def14a.htm, or ea{id}_def14a.htm.
The rendered body follows the customary Schedule 14A section ordering, with substantial registrant-specific variation in depth and labelling.
Textual exhibits (ex*.htm)
Textual exhibits are preserved inside the EDGAR SGML <DOCUMENT> envelope. Each exhibit file opens with a header block of the form:
<DOCUMENT>
<TYPE>EX-3.1
<SEQUENCE>2
<FILENAME>ex3-1.htm
<DESCRIPTION>EX-3.1
<TEXT>
<HTML> ... </HTML>
</TEXT>
</DOCUMENT>
The <TYPE>, <SEQUENCE>, <FILENAME>, and <DESCRIPTION> lines correspond one-to-one with entries in metadata.json -> documentFormatFiles, enabling deterministic joining between the manifest and the filesystem. Common exhibit types observed in DEF 14A submissions include EX-3.1 and EX-3.3 (amended certificates of incorporation, bylaws), EX-10.* (material agreements such as equity incentive plan texts), and EX-99.* (press releases, voting-result supplements, supplemental disclosures). Exhibits are never iXBRL-tagged; only the primary proxy carries inline XBRL.
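The envelope described above can be separated from the HTML body with a few lines of standard-library Python. This is a sketch under the assumption that each header tag sits on its own line, as in the sample; `split_sgml_exhibit` is a hypothetical helper name.

```python
import re

_HEADER_TAGS = ("TYPE", "SEQUENCE", "FILENAME", "DESCRIPTION")


def split_sgml_exhibit(raw: str) -> tuple[dict, str]:
    """Split an exhibit file into (SGML header fields, inner body).

    Files without a <TEXT> marker are returned unchanged with an empty header.
    """
    head, marker, rest = raw.partition("<TEXT>")
    if not marker:
        return {}, raw
    header = {}
    for tag in _HEADER_TAGS:
        # Each header value runs from the tag to the end of its line.
        match = re.search(rf"^<{tag}>(.*)$", head, flags=re.MULTILINE)
        if match:
            header[tag.lower()] = match.group(1).strip()
    # Drop the closing </TEXT> (and the </DOCUMENT> that follows it), if present.
    body = rest.rsplit("</TEXT>", 1)[0].strip()
    return header, body
```

The returned `type`, `sequence`, and `filename` values can then be matched against the corresponding documentFormatFiles entries, and the body can be passed to an HTML renderer without the leading SGML lines.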
A small fraction of records (6 of 201 folders in the December 2025 sample, all large-cap filers such as Air Products, Accenture, Rockwell Automation, Mueller Water, and Spire) include a registrant-produced courtesy PDF with a file name matching *courtesy*pdf.pdf. This is a standard binary PDF (magic bytes %PDF-1.7) that reproduces the proxy with the filer's print-oriented typesetting and includes the image graphics (director photographs, infographics, governance icons) that are filtered out of the HTML-only payload. Content parity with the primary .htm is intended by the filer but not enforced; the courtesy PDF is useful for visually faithful rendering and for recovering embedded images.
The record ships with the metadata.json descriptor, the primary Inline XBRL HTML proxy statement, every textual exhibit filed under the accession, and any courtesy PDF. All textual content — cover page, proposals, compensation tables, governance narrative, ownership tables, audit disclosure, shareholder proposals, appendices, and every textual exhibit — is present end-to-end, including the inline XBRL tagging on the cover page and the pay-versus-performance table. The EDGAR-level index (documentFormatFiles, dataFiles) is fully preserved in metadata even for artifacts that are not physically shipped, so the record is self-describing with respect to what the original submission contained.
Three categories of material referenced by metadata.json are intentionally not shipped inside the ZIP:
Image graphics (type: "GRAPHIC"): JPEG, GIF, and similar artifacts referenced by the primary HTML. Omitted to reduce container size. Consumers needing the images can follow documentUrl in documentFormatFiles to fetch them from EDGAR, or use the courtesy PDF (when present) as a print-ready surrogate.
XBRL sidecars listed in dataFiles: the schema (.xsd), the definition/label/presentation linkbases, and the extracted XBRL instance XML. The inline XBRL facts themselves are still available because they live inside the primary .htm.
Complete-submission .txt file: EDGAR's concatenation of every document in the submission. Referenced via linkToTxt but not replicated, because its content is the union of the payload files plus the image and XML assets already excluded.
Separately, sibling filings associated with the same meeting (PRE 14A (preliminary), DEFA14A (additional soliciting material), DEF 14A/A (amendment), DEFM14A (merger-specific), DEFC14A (contested), as well as Form 4 updates and Schedule 13D/G disclosures around the meeting) are not part of this dataset. Consumers who need a complete meeting file must retrieve companions from adjacent datasets and group by meeting date or registrant CIK.
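The split between shipped files and reference-only assets within a single record can be sketched as below. The function name `reconcile_manifest` is hypothetical, and two assumptions are made: that the shipped payload consists of .htm and .pdf files, and that dataFiles entries expose a documentUrl field in the same way documentFormatFiles entries do.

```python
from pathlib import Path

# Assumption: only these extensions are physically shipped in the folder.
SHIPPED_SUFFIXES = {".htm", ".pdf"}


def reconcile_manifest(meta: dict, files_on_disk: set[str]) -> dict:
    """Classify manifest entries as shipped, reference-only, or unexpectedly missing."""
    shipped, referenced_only, missing = [], [], []
    for doc in meta.get("documentFormatFiles", []):
        url = doc.get("documentUrl", "")
        name = url.rsplit("/", 1)[-1]
        suffix = Path(name).suffix.lower()
        if doc.get("type") == "GRAPHIC" or suffix not in SHIPPED_SUFFIXES:
            referenced_only.append(url)  # images, complete-submission .txt
        elif name in files_on_disk:
            shipped.append(name)
        else:
            missing.append(name)  # listed as shippable but not found in the folder
    # Assumption: dataFiles entries carry documentUrl; they are never shipped.
    referenced_only += [d.get("documentUrl", "") for d in meta.get("dataFiles", [])]
    return {"shipped": shipped, "referenced_only": referenced_only, "missing": missing}
```

A non-empty `missing` list is a cue to inspect the record, since it means the manifest names a textual document that did not arrive in the folder.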
Schedule 14A content has expanded materially between 1994 and the present; a 1995 DEF 14A is structurally far shorter and less codified than one filed in 2025, even though the envelope form type is unchanged. Key inflection points:
Pay versus performance under Item 402(v): Inline XBRL ecd:* facts covering Compensation Actually Paid, Total Shareholder Return, net income, and a company-selected measure over a rolling multi-year window.
The presentation format of DEF 14A filings on EDGAR has evolved through three distinct eras:
Plain-text era: primary documents were ASCII .txt documents. The proxy body was rendered with fixed-width columns, dashes as table rules, and minimal typography. Biographical blocks, compensation tables, and ownership grids appeared as text art. Exhibits were concatenated into the same SGML stream, delimited by <DOCUMENT>/<TYPE>/<SEQUENCE> headers. Early records in the dataset resolve to primary documents that are HTML only to the extent the registrant voluntarily filed HTML.
HTML era: .htm files used inline CSS and HTML tables for layout; graphics were filed as separate GRAPHIC documents and referenced via relative <img> tags. The SGML <DOCUMENT> envelope remained around each attached exhibit. This era is the most consistent in rendering fidelity across the dataset.
Inline XBRL era: filers embed cover-page facts in <ix:hidden>, and tag the pay-versus-performance table with ecd:* facts. Large-cap filers increasingly attach a courtesy PDF as a print-typeset companion. linkToXbrl is usually empty for this form because the tagging lives inline, with no separate XBRL instance.
Across all eras, the SGML <DOCUMENT> wrapping of textual exhibits remains stable, and metadata.json normalizes the EDGAR-level view regardless of the era the filing belongs to.
Accession number forms: folder names use the 18-digit dashless accession, while the dashed form appears in metadata.json.accessionNo; canonicalize to one form before joining against other datasets.
Manifest versus payload: documentFormatFiles enumerates every document EDGAR recorded, including assets not physically shipped (images, the complete-submission .txt). The effective on-disk payload is the subset with extension .htm or .pdf; joining on sequence or filename against the files actually present in the folder yields the reconciled list.
XBRL extraction: the sidecar set listed in dataFiles is not shipped, but structured extraction can proceed directly against the inline iXBRL facts inside the primary .htm using any iXBRL-aware parser.
Multiple entities: when several entities are listed, the object carrying the (Filer) role suffix is the reporting registrant. Subject companies and filing agents appear with distinct role suffixes. Registered investment company filings populate seriesAndClassesContractsInformation; operating-company filings do not.
periodOfReport semantics: for DEF 14A this field carries the meeting or record date, not a fiscal-period end. Treating it as a fiscal date will produce misaligned joins against 10-K or 10-Q datasets.
SGML envelope on exhibits: exhibit files carry an SGML <DOCUMENT> header before the <HTML> body. Parsers that feed the file directly into an HTML renderer must strip (or ignore) the leading SGML lines; parsers that tokenize the SGML envelope can use <TYPE>, <SEQUENCE>, and <FILENAME> to reconcile against documentFormatFiles.
Each DEF 14A record is a definitive proxy statement filed on EDGAR by a party soliciting votes, consents, or authorizations from security holders of a class registered under Section 12 of the Securities Exchange Act of 1934. In the vast majority of filings the filer is the issuer itself.
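The filer population includes domestic operating companies with equity registered under Section 12(b) or Section 12(g), closed-end funds, business development companies, REITs, and similar Section 12-registered entities that solicit shareholder votes.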
Issuers that report only under Section 15(d) (no Section 12-registered class) are not subject to the Section 14(a) proxy rules and do not file DEF 14A, even though they file periodic reports. Foreign private issuers are exempt under Rule 3a12-3(b) and furnish home-country proxy materials on Form 6-K instead; Canadian MJDS filers likewise do not file DEF 14A. The DEF 14A population is therefore effectively a domestic-issuer population.
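DEF 14A is event-driven, not calendar-driven. The trigger is any solicitation of proxies, consents, or authorizations from holders of a Section 12-registered class. Typical triggers are annual and special meetings at which holders vote on matters such as director elections, auditor ratification, advisory compensation votes, equity compensation plan approvals, and charter or bylaw amendments.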
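Timing under Rule 14a-6: the definitive proxy statement is filed no later than the date it is first sent or given to security holders; where a preliminary filing is required, it must be on file at least 10 calendar days before the definitive materials are first sent.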
There is no fixed annual deadline, but exchange-listing rules generally require annual meetings and Rule 14a-8 proposal windows assume an anniversary cadence. For calendar-year issuers, DEF 14A filings concentrate in March through May ahead of spring annual-meeting season. Subsequent communications are filed as DEFA14A (additional soliciting material) or, for material changes to the proxy statement itself, a revised DEF 14A / DEFR14A.
The governing framework is Section 14(a), Regulation 14A (Rule 14a-1 through 14a-21), and Schedule 14A, which incorporates Items 402, 404, and 407 of Regulation S-K for executive compensation, related-person transactions, and corporate governance disclosure.
Mandatory electronic filing of definitive proxy statements was phased in during the EDGAR rollout of the mid-1990s. DEF 14A records on EDGAR begin in 1994, aligning with this dataset's coverage start of January 1, 1994. Earlier proxy statements exist only in the SEC's paper archives.
DEF 14A sits inside a family of Section 14(a) proxy materials that share overlapping content and timing. The comparisons below clarify when DEF 14A is the correct dataset and when an adjacent filing type better fits the question.
PRE 14A is the same document at an earlier stage. It is filed when proposals go beyond routine meeting business (mergers, charter amendments, equity plan approvals) and SEC staff review is expected. It may contain bracketed placeholders and text that shifts in response to staff comments. DEF 14A is the version mailed to shareholders and used for the vote; PRE 14A is the working draft. Use DEF 14A for final shareholder-facing disclosure, PRE 14A for staff-review and language-evolution studies.
DEFA14A covers supplemental campaign materials filed after the definitive proxy: press releases, investor decks, shareholder letters, proxy advisor rebuttals. These belong to the same solicitation but are not the proxy statement itself. DEF 14A is the single comprehensive voting document; DEFA14A is an expanding stream of follow-on communications, often concentrated near the record date or during contested votes. Full contested-meeting coverage requires both.
DEFM14A and DEFC14A are definitive proxies with the same legal basis as DEF 14A but distinct form codes: DEFM14A for merger/acquisition votes, DEFC14A for contested solicitations. They carry specialized disclosure (fairness opinions, transaction backgrounds, dissident nominees) and are excluded from a DEF 14A dataset. PREM14A and PREC14A are their preliminary counterparts and are likewise excluded.
Schedule 14C information statements are used when shareholder approval is secured without soliciting proxies, typically because a controlling holder has already consented. A 14C discloses similar meeting and transaction information but includes no proxy card and no solicitation. DEF 14A implies an active solicitation of proxies; 14C implies a fait accompli.
Form 10-K is frequently issued in sequence with DEF 14A and linked to it through Part III incorporation by reference, which creates overlap on executive compensation, director biographies, and related-party transactions. The purpose differs: 10-K is a financial and business report (audited statements, MD&A, risk factors); DEF 14A is a voting document (meeting mechanics, nominees, compensation governance, shareholder proposals). Use DEF 14A for say-on-pay, director elections, and proposal studies; 10-K for financial performance.
Form 8-K Item 5.07 reports the numeric tally after the meeting, typically within four business days. DEF 14A describes what will be voted on; 8-K 5.07 reports what was decided. The two are complementary, not substitutes: DEF 14A supplies proposal text, board recommendations, and supporting disclosure; 8-K 5.07 supplies the outcome.
Form N-PX is the voter-side mirror of DEF 14A. Institutional managers and funds report how they voted on proxy proposals. A common research pattern is to link N-PX vote records back to DEF 14A proposal text, but N-PX contains no proposal content of its own.
The DEF 14A Files dataset captures the final, shareholder-distributed proxy statement and its non-image exhibits for routine annual and ordinary special meetings of U.S. domestic reporting companies, from 1994 onward. It is narrower than the full proxy ecosystem (excludes PRE 14A, DEFA14A, DEFM14A, DEFC14A, and Schedule 14C) and broader than any extracted section (retains the full filing rather than isolating compensation tables or proposals). It sits between the event-driven outcome disclosure of 8-K Item 5.07 and the financial disclosure of 10-K, and is the authoritative source for the text that governs each vote.
A DEF 14A package binds together director biographies, executive pay tables, equity plan proposals, auditor ratification, shareholder proposals, and beneficial ownership. Each professional group below extracts a specific slice for a defined workflow.
Firms advising boards and compensation committees build peer-pay benchmarks from the Summary Compensation Table, Grants of Plan-Based Awards, Outstanding Equity Awards, Pay Ratio, and Pay Versus Performance disclosures. Output: Say-on-Pay recommendations, plan-design memos, and realizable-pay comparisons across named peer groups.
ISS, Glass Lewis, and internal stewardship teams use director biographies, committee assignments, overboarding data, equity plan dilution metrics, and shareholder proposal text to produce vote recommendations and governance scorecards. Clawback policies, independence, and diversity disclosures feed recurring scoring models.
Outside counsel and in-house disclosure teams use the corpus as a precedent bank when drafting CD&A narratives, risk oversight sections, perquisite disclosures, change-in-control arrangements, and opposition statements to shareholder proposals. Supports first-draft generation and disclosure gap reviews against current peer conventions.
Activist research desks, defense banks, law firms, and proxy solicitors map board vulnerabilities: staggered boards, advance notice bylaws, proxy access, CEO pay outliers, weak Say-on-Pay history, and ownership concentration from beneficial ownership tables. Feeds target screens, white papers, and contested-election scenarios.
Buy-side and sell-side analysts consult proxies for incentive metrics, equity dilution, insider ownership, and related-party exposure that the 10-K does not cover. Informs investment theses where management incentives or capital allocation materially affect valuation, plus engagement letters before annual meetings.
Forensic accountants mine proxies for red flags that complement 10-K review: related-party transactions with officers or principal stockholders, auditor changes, unusual perquisites, retroactive equity grants, and related-party employment agreements. Longitudinal comparison across years at one issuer is the core workflow.
Total-rewards groups at registrants benchmark their own pay, equity plan features, and severance arrangements against self-selected peers. Summary Compensation Table data, share-reserve histories, and pay ratio inputs feed committee decks and plan-design stress tests.
IR and solicitation teams study peer vote outcomes, engagement cadence disclosures, and management proposal framing to prepare shareholder communications, meeting Q&A, and vote projections.
Academic researchers use the dataset for large-sample studies of pay-for-performance sensitivity, board composition, director networks, diversity, voting behavior, and activism. Full 1994-to-present coverage supports panel data and event studies around plan adoption, director turnover, and proxy contests.
Quantitative and governance analytics teams extract structured features from compensation tables, director networks, and ownership concentrations to build governance factors for return, volatility, and accounting-quality signals.
M&A transaction teams quantify the change-in-control cost stack: accelerated vesting, severance multipliers, excise tax gross-ups, and Item 402(t) golden-parachute disclosures. They also review D&O indemnification and related-party agreements requiring unwind at close. The output feeds transaction cost models and retention-package design.
Teams building governance copilots, CD&A summarizers, and compensation-extraction pipelines use the full filings plus exhibits in HTML and PDF as a training and evaluation corpus for proxy question answering and structured table extraction.
The workflows below illustrate how the Form DEF 14A Files dataset is put to operational use. Each example ties a specific record component to a concrete output.
Compensation consultants and internal total-rewards teams extract the Summary Compensation Table, Grants of Plan-Based Awards, and Outstanding Equity Awards tables from the primary .htm across a self-selected peer list, and join them to the Item 402(v) Pay-Versus-Performance block read from the inline XBRL ecd:* facts. The result feeds Say-on-Pay recommendations, realizable-pay comparisons, and committee decks sized against the same fiscal window the peers disclose.
Quant and governance analytics teams parse the <ix:nonFraction> and <ix:nonNumeric> elements tagged with the ecd taxonomy inside the primary proxy to pull Compensation Actually Paid, Total Shareholder Return, peer-group TSR, net income, and the company-selected measure for each year in the rolling window. This produces a cross-sectional pay-for-performance panel without layout-dependent HTML table scraping.
Proxy solicitors and activist-defense advisers pair each DEF 14A with the matching Form 8-K Item 5.07 outcome, keyed on registrant CIK and periodOfReport (meeting date). The proxy supplies proposal text, board recommendation, committee memberships, and overboarding disclosures; the 8-K supplies the tally. The combined panel drives vote projections, failed-vote watchlists, and director-level support trends.
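The join described above can be sketched as follows. The shape of the vote-result records (`cik`, `meetingDate`) is entirely hypothetical, since the 8-K Item 5.07 data comes from outside this dataset, and the code assumes periodOfReport is an ISO date string that begins with YYYY-MM-DD.

```python
def _norm_cik(cik) -> str:
    """Compare CIKs without leading zeros ('0000048465' == '48465')."""
    return str(cik).lstrip("0")


def proxy_key(meta: dict):
    """Build the (CIK, meeting date) key from a DEF 14A metadata.json dict."""
    filer = next(
        (e for e in meta.get("entities", []) if "(Filer)" in e.get("companyName", "")),
        None,
    )
    if filer is None or not meta.get("periodOfReport"):
        return None
    # Assumption: periodOfReport starts with an ISO date (YYYY-MM-DD).
    return (_norm_cik(filer["cik"]), meta["periodOfReport"][:10])


def pair_with_vote_results(proxy_metas: list[dict], vote_results: list[dict]) -> list[tuple]:
    """Return (accessionNo, matching vote-result record or None) for each proxy."""
    # Hypothetical vote-result shape: {"cik": ..., "meetingDate": "YYYY-MM-DD", ...}
    index = {(_norm_cik(v["cik"]), v["meetingDate"][:10]): v for v in vote_results}
    pairs = []
    for meta in proxy_metas:
        key = proxy_key(meta)
        pairs.append((meta.get("accessionNo"), index.get(key) if key else None))
    return pairs
```

Because periodOfReport may hold the record date instead of the meeting date, unmatched proxies are worth a second pass with a tolerance window on the date.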
Transaction teams pull the Potential Payments upon Termination or Change in Control section and the Item 402(t) golden-parachute disclosure from target proxies to quantify accelerated vesting, severance multipliers, and excise tax gross-ups. Material equity plan exhibits attached as EX-10.* documents inside the same accession folder supply the underlying plan text used to validate acceleration triggers and retention-package design.
Securities lawyers query the full-text corpus for CD&A narratives, clawback policy language, perquisite disclosures, and opposition statements to Rule 14a-8 shareholder proposals filed by peers in the prior season. Because the dataset preserves every textual exhibit and the full proxy body (not extracted sections), counsel can pull complete passages in surrounding context for first-draft generation and disclosure-gap reviews.
Teams building CD&A summarizers, proposal classifiers, and compensation-extraction models use the paired metadata.json plus primary .htm (and courtesy PDF where present) as a training and evaluation corpus. The documentFormatFiles manifest, combined with stable SGML <DOCUMENT> wrappers on exhibits, provides deterministic document-type labels (DEF 14A, EX-3.1, EX-10.x, EX-99.x) for supervised extraction tasks and retrieval chunking.
Forensic accountants iterate the same registrant's DEF 14A filings year over year to flag newly disclosed related-person transactions under Item 404, auditor changes reflected in the audit committee report and fee table, retroactive or off-cycle equity grants in the Grants of Plan-Based Awards table, and unusual perquisite growth in the Summary Compensation Table. The 1994-to-present coverage supports multi-decade panels around a single CIK.
The Form DEF 14A Files dataset is available through a JSON metadata endpoint, a full archive download, and per-container downloads. Filings are packaged as monthly ZIP containers and cover submissions from January 1994 to the present.
Dataset Index JSON API: https://api.sec-api.io/datasets/form-def-14a-files.json
This endpoint returns dataset-level metadata (name, description, last updated timestamp, earliest sample date, total records, total size, form types, container format, and file types), the full dataset download URL, and the list of all individual container files with their size, record count, updated timestamp, and download URL. It is useful for monitoring which containers changed in the most recent daily refresh and for deciding which containers to pull incrementally. This endpoint does not require an API key.
Example response:
{
  "datasetId": "1f13365b-9ae0-68e1-a294-5c6dfc349088",
  "datasetDownloadUrl": "https://api.sec-api.io/datasets/form-def-14a-files.zip",
  "name": "Form DEF 14A Files Dataset",
  "updatedAt": "2026-04-16T02:57:49.783Z",
  "earliestSampleDate": "1994-01-01",
  "totalRecords": 204986,
  "totalSize": 22305757239,
  "formTypes": ["DEF 14A"],
  "containerFormat": "ZIP",
  "fileTypes": ["TXT", "JSON", "HTML", "PDF"],
  "containers": [
    {
      "downloadUrl": "https://api.sec-api.io/datasets/form-def-14a-files/2026/2026-04.zip",
      "key": "2026/2026-04.zip",
      "size": 13818783,
      "records": 154,
      "updatedAt": "2026-04-16T02:57:49.783Z"
    }
  ]
}
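One way to use this response for incremental pulls is sketched below, using only the standard library. The function names are hypothetical; the field names (`containers`, `updatedAt`, `downloadUrl`) are taken from the example response, and the cutoff timestamp would be whatever the consumer recorded at its last successful sync.

```python
import json
from datetime import datetime
from urllib.request import urlopen

INDEX_URL = "https://api.sec-api.io/datasets/form-def-14a-files.json"


def parse_ts(value: str) -> datetime:
    """Parse timestamps such as '2026-04-16T02:57:49.783Z' into aware datetimes."""
    # datetime.fromisoformat rejects a trailing 'Z' before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def fetch_index(url: str = INDEX_URL) -> dict:
    """Download and decode the dataset index (no API key required)."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def containers_updated_since(index: dict, since: str) -> list[dict]:
    """Return container entries whose updatedAt is later than the cutoff."""
    cutoff = parse_ts(since)
    return [
        c for c in index.get("containers", [])
        if parse_ts(c["updatedAt"]) > cutoff
    ]
```

A typical call would be `containers_updated_since(fetch_index(), last_sync)`, followed by downloading each returned `downloadUrl` with the API key appended.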
Download Entire Dataset: https://api.sec-api.io/datasets/form-def-14a-files.zip?token=YOUR_API_KEY
Downloads the complete dataset as a single ZIP archive containing all monthly containers from 1994 onward. This endpoint requires an API key.
Download Single Container: https://api.sec-api.io/datasets/form-def-14a-files/2026/2026-04.zip?token=YOUR_API_KEY
Downloads one monthly container ZIP, which is useful for incremental updates or when only a specific time window is needed. This endpoint requires an API key.
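Once a container has been downloaded, its records can be enumerated by grouping archive members under their accession folder. The sketch below makes no assumption about any directory levels above the accession folder: it simply looks for an 18-digit path component. The helper names are hypothetical.

```python
import io
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath


def open_container(data: bytes) -> zipfile.ZipFile:
    """Open a container that was downloaded into memory."""
    return zipfile.ZipFile(io.BytesIO(data))


def group_by_accession(zf: zipfile.ZipFile) -> dict[str, list[str]]:
    """Map each 18-digit accession folder to the file names it contains."""
    records = defaultdict(list)
    for info in zf.infolist():
        if info.is_dir():
            continue
        parts = PurePosixPath(info.filename).parts
        # The record folder is the path component made of exactly 18 digits.
        folder = next((p for p in parts[:-1] if len(p) == 18 and p.isdigit()), None)
        if folder is not None:
            records[folder].append(parts[-1])
    return dict(records)
```

For a container saved to disk, `zipfile.ZipFile(path)` can be passed to `group_by_accession` directly; the in-memory variant is convenient when the response body is processed without writing it out first.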
The dataset covers Form DEF 14A, the definitive proxy statement furnished under Section 14(a) of the Securities Exchange Act of 1934 and Rule 14a-3. It excludes the related form types PRE 14A, DEFA14A, DEFM14A, DEFC14A, and DEF 14A/A, each of which is filed under a separate form code.
One record is one complete EDGAR submission of a DEF 14A, uniquely keyed by SEC accession number and materialized as a single folder whose name is the 18-digit dashless accession. The folder holds a metadata.json descriptor, the primary Inline XBRL HTML proxy statement, any textual exhibits filed under the accession, and — optionally — a registrant-supplied courtesy PDF.
DEF 14A is filed by parties soliciting votes, consents, or authorizations from holders of a class registered under Section 12 of the Exchange Act — overwhelmingly the issuer itself. The population covers domestic operating companies listed under Section 12(b), Section 12(g) registrants, closed-end funds, business development companies, REITs, and similar Section 12-registered entities. Foreign private issuers and Section 15(d)-only reporters do not file DEF 14A.
Coverage begins January 1, 1994, aligning with the EDGAR rollout of mandatory electronic proxy filing, and extends to the present. Earlier proxy statements exist only in the SEC's paper archives and are not part of the dataset.
The dataset is distributed as monthly ZIP containers. Each container unpacks into per-accession folders that contain a metadata.json descriptor, one or more .htm files (the primary Inline XBRL proxy plus any textual exhibits wrapped in SGML <DOCUMENT> envelopes), and, for a small minority of filings, a courtesy .pdf. Image graphics and XBRL linkbase sidecars are referenced in metadata but not shipped inside the ZIP.
DEF 14A is the definitive proxy statement mailed to shareholders and used for the vote. PRE 14A is the earlier preliminary draft filed when SEC staff review is expected on non-routine matters. DEFA14A is the stream of additional soliciting materials — press releases, investor decks, advisor rebuttals — filed after the definitive proxy. This dataset contains only DEF 14A submissions; reconstructing a full meeting file requires pulling companions from adjacent datasets.
Containers are updated on a daily refresh cadence. The JSON index endpoint at https://api.sec-api.io/datasets/form-def-14a-files.json reports the updatedAt timestamp for each monthly container, which is the recommended way to detect which containers changed and to pull incremental updates.