Form DEF 14A Files Dataset

The Form DEF 14A Files dataset is a collection of definitive proxy statements filed on EDGAR under Section 14(a) of the Securities Exchange Act of 1934 and distributed to shareholders ahead of annual or special meetings. One record is one complete EDGAR submission of a DEF 14A, uniquely keyed by SEC accession number, delivered as a single folder that holds a metadata.json descriptor, the primary Inline XBRL HTML proxy statement, every textual exhibit filed under the accession, and — for a small minority of large-cap filings — a registrant-supplied courtesy PDF. Filers are overwhelmingly domestic operating companies with equity registered under Section 12(b) or Section 12(g), along with closed-end funds, business development companies, REITs, and similar Section 12-registered entities that solicit shareholder votes. Submissions are packaged as monthly ZIP containers and cover the period from January 1, 1994 to the present, aligning with the EDGAR rollout of mandatory electronic proxy filing in the mid-1990s.

Update Frequency
Daily
Updated at
2026-05-09
Earliest Sample Date
1994-01-01
Total Size
22.8 GB
Total Records
207,751
Container Format
ZIP
Content Types
TXT, JSON, HTML, PDF
Form Types
DEF 14A

Dataset APIs

Programmatically retrieve the full list of dataset archive files, download URLs and dataset metadata.

Dataset Index JSON API

Download the entire dataset as a single archive file.

Download Entire Dataset:

Download a single container file (e.g. monthly archive) from the dataset.

Download Single Container:

Dataset Files

389 files · 22.8 GB
2026-05.zip · 12.0 MB · 144 records
2026-04.zip · 966.3 MB · 4,335 records
2026-03.zip · 850.9 MB · 1,005 records
2026-02.zip · 84.7 MB · 133 records
2026-01.zip · 40.3 MB · 175 records
2025-12.zip · 57.5 MB · 210 records
2025-11.zip · 30.3 MB · 174 records
2025-10.zip · 76.0 MB · 290 records
2025-09.zip · 34.5 MB · 239 records
2025-08.zip · 27.4 MB · 192 records
2025-07.zip · 43.5 MB · 208 records
2025-06.zip · 54.7 MB · 220 records
2025-05.zip · 37.2 MB · 270 records
2025-04.zip · 949.4 MB · 2,301 records
2025-03.zip · 883.2 MB · 1,045 records
2025-02.zip · 113.0 MB · 134 records
2025-01.zip · 72.1 MB · 177 records
2024-12.zip · 77.7 MB · 217 records
2024-11.zip · 18.4 MB · 193 records
2024-10.zip · 53.2 MB · 285 records
2024-09.zip · 36.9 MB · 213 records
2024-08.zip · 37.1 MB · 215 records
2024-07.zip · 135.8 MB · 195 records
2024-06.zip · 83.6 MB · 253 records
2024-05.zip · 54.9 MB · 277 records
2024-04.zip · 846.9 MB · 2,363 records
2024-03.zip · 658.2 MB · 1,029 records
2024-02.zip · 42.3 MB · 153 records
2024-01.zip · 60.7 MB · 220 records
2023-12.zip · 89.2 MB · 222 records
2023-11.zip · 18.0 MB · 224 records
2023-10.zip · 49.2 MB · 285 records
2023-09.zip · 44.8 MB · 240 records
2023-08.zip · 38.1 MB · 245 records
2023-07.zip · 31.3 MB · 241 records
2023-06.zip · 34.8 MB · 267 records
2023-05.zip · 76.9 MB · 488 records
2023-04.zip · 784.4 MB · 2,250 records
2023-03.zip · 550.7 MB · 1,213 records
2023-02.zip · 37.8 MB · 158 records
2023-01.zip · 42.5 MB · 229 records
2022-12.zip · 57.0 MB · 252 records
2022-11.zip · 120.5 MB · 256 records
2022-10.zip · 153.8 MB · 269 records
2022-09.zip · 35.4 MB · 226 records
2022-08.zip · 28.0 MB · 207 records
2022-07.zip · 50.1 MB · 217 records
2022-06.zip · 37.7 MB · 217 records
2022-05.zip · 39.5 MB · 395 records
2022-04.zip · 583.3 MB · 2,379 records
2022-03.zip · 420.6 MB · 1,110 records
2022-02.zip · 38.1 MB · 155 records
2022-01.zip · 128.1 MB · 179 records
2021-12.zip · 28.8 MB · 171 records
2021-11.zip · 17.9 MB · 184 records
2021-10.zip · 70.5 MB · 241 records
2021-09.zip · 21.5 MB · 187 records
2021-08.zip · 32.0 MB · 196 records
2021-07.zip · 37.7 MB · 221 records
2021-06.zip · 45.6 MB · 255 records
2021-05.zip · 27.2 MB · 215 records
2021-04.zip · 704.7 MB · 2,245 records
2021-03.zip · 486.0 MB · 1,175 records
2021-02.zip · 18.5 MB · 114 records
2021-01.zip · 33.0 MB · 165 records
2020-12.zip · 48.5 MB · 182 records
2020-11.zip · 18.5 MB · 159 records
2020-10.zip · 74.7 MB · 253 records
2020-09.zip · 22.4 MB · 196 records
2020-08.zip · 24.8 MB · 173 records
2020-07.zip · 26.3 MB · 218 records
2020-06.zip · 38.4 MB · 234 records
2020-05.zip · 30.4 MB · 316 records
2020-04.zip · 373.5 MB · 2,018 records
2020-03.zip · 314.1 MB · 1,123 records
2020-02.zip · 18.0 MB · 135 records
2020-01.zip · 26.4 MB · 176 records
2019-12.zip · 26.0 MB · 187 records
2019-11.zip · 11.3 MB · 135 records
2019-10.zip · 49.4 MB · 231 records
2019-09.zip · 23.4 MB · 178 records
2019-08.zip · 20.9 MB · 153 records
2019-07.zip · 20.1 MB · 175 records
2019-06.zip · 36.6 MB · 199 records
2019-05.zip · 21.9 MB · 228 records
2019-04.zip · 332.9 MB · 2,088 records
2019-03.zip · 334.1 MB · 1,171 records
2019-02.zip · 24.6 MB · 173 records
2019-01.zip · 25.8 MB · 171 records
2018-12.zip · 29.0 MB · 165 records
2018-11.zip · 12.1 MB · 123 records
2018-10.zip · 23.9 MB · 223 records
2018-09.zip · 26.2 MB · 190 records
2018-08.zip · 22.2 MB · 204 records
2018-07.zip · 11.6 MB · 162 records
2018-06.zip · 44.3 MB · 210 records
2018-05.zip · 19.5 MB · 259 records
2018-04.zip · 299.6 MB · 2,033 records
2018-03.zip · 331.8 MB · 1,192 records
2018-02.zip · 13.8 MB · 152 records
2018-01.zip · 19.7 MB · 181 records
2017-12.zip · 22.2 MB · 171 records
2017-11.zip · 12.7 MB · 125 records
2017-10.zip · 23.5 MB · 227 records
2017-09.zip · 16.7 MB · 197 records
2017-08.zip · 11.9 MB · 170 records
2017-07.zip · 14.6 MB · 167 records
2017-06.zip · 19.3 MB · 226 records
2017-05.zip · 26.8 MB · 353 records
2017-04.zip · 249.3 MB · 1,896 records
2017-03.zip · 195.4 MB · 1,272 records
2017-02.zip · 12.2 MB · 150 records
2017-01.zip · 19.5 MB · 202 records
2016-12.zip · 21.9 MB · 171 records
2016-11.zip · 9.5 MB · 152 records
2016-10.zip · 33.7 MB · 239 records
2016-09.zip · 15.3 MB · 202 records
2016-08.zip · 11.0 MB · 158 records
2016-07.zip · 13.9 MB · 174 records
2016-06.zip · 15.8 MB · 215 records
2016-05.zip · 23.8 MB · 287 records
2016-04.zip · 226.6 MB · 2,104 records
2016-03.zip · 180.7 MB · 1,252 records
2016-02.zip · 10.8 MB · 153 records
2016-01.zip · 18.5 MB · 201 records
2015-12.zip · 16.5 MB · 194 records
2015-11.zip · 51.6 MB · 129 records
2015-10.zip · 21.5 MB · 271 records
2015-09.zip · 23.3 MB · 215 records
2015-08.zip · 16.6 MB · 164 records
2015-07.zip · 17.5 MB · 221 records
2015-06.zip · 16.8 MB · 248 records
2015-05.zip · 17.3 MB · 244 records
2015-04.zip · 200.7 MB · 2,149 records
2015-03.zip · 174.5 MB · 1,200 records
2015-02.zip · 14.5 MB · 199 records
2015-01.zip · 14.8 MB · 206 records
2014-12.zip · 14.3 MB · 183 records
2014-11.zip · 7.9 MB · 144 records
2014-10.zip · 19.0 MB · 264 records
2014-09.zip · 53.8 MB · 198 records
2014-08.zip · 17.0 MB · 177 records
2014-07.zip · 13.6 MB · 194 records
2014-06.zip · 21.4 MB · 250 records
2014-05.zip · 15.3 MB · 234 records
2014-04.zip · 168.3 MB · 2,106 records
2014-03.zip · 117.9 MB · 1,191 records
2014-02.zip · 14.0 MB · 160 records
2014-01.zip · 13.5 MB · 209 records
2013-12.zip · 13.3 MB · 197 records
2013-11.zip · 8.6 MB · 157 records
2013-10.zip · 21.8 MB · 297 records
2013-09.zip · 11.9 MB · 169 records
2013-08.zip · 12.7 MB · 187 records
2013-07.zip · 12.0 MB · 193 records
2013-06.zip · 16.7 MB · 251 records
2013-05.zip · 17.4 MB · 268 records
2013-04.zip · 164.1 MB · 2,028 records
2013-03.zip · 107.4 MB · 1,154 records
2013-02.zip · 12.3 MB · 203 records
2013-01.zip · 14.4 MB · 229 records
2012-12.zip · 11.7 MB · 169 records
2012-11.zip · 10.0 MB · 148 records
2012-10.zip · 22.9 MB · 304 records
2012-09.zip · 18.1 MB · 235 records
2012-08.zip · 12.6 MB · 181 records
2012-07.zip · 17.1 MB · 237 records
2012-06.zip · 15.4 MB · 224 records
2012-05.zip · 14.5 MB · 243 records
2012-04.zip · 159.2 MB · 2,072 records
2012-03.zip · 114.0 MB · 1,276 records
2012-02.zip · 9.1 MB · 148 records
2012-01.zip · 14.1 MB · 213 records
2011-12.zip · 12.8 MB · 197 records
2011-11.zip · 9.2 MB · 166 records
2011-10.zip · 22.7 MB · 382 records
2011-09.zip · 17.9 MB · 245 records
2011-08.zip · 12.4 MB · 211 records
2011-07.zip · 13.3 MB · 215 records
2011-06.zip · 17.7 MB · 302 records
2011-05.zip · 23.2 MB · 371 records
2011-04.zip · 155.5 MB · 2,073 records
2011-03.zip · 106.9 MB · 1,254 records
2011-02.zip · 8.7 MB · 178 records
2011-01.zip · 14.0 MB · 230 records
2010-12.zip · 14.7 MB · 217 records
2010-11.zip · 13.4 MB · 226 records
2010-10.zip · 24.8 MB · 357 records
2010-09.zip · 16.1 MB · 290 records
2010-08.zip · 13.4 MB · 205 records
2010-07.zip · 14.0 MB · 239 records
2010-06.zip · 19.5 MB · 298 records
2010-05.zip · 16.8 MB · 279 records
2010-04.zip · 158.9 MB · 2,219 records
2010-03.zip · 108.0 MB · 1,284 records
2010-02.zip · 12.8 MB · 227 records
2010-01.zip · 15.1 MB · 251 records
2009-12.zip · 21.2 MB · 282 records
2009-11.zip · 13.2 MB · 236 records
2009-10.zip · 24.6 MB · 493 records
2009-09.zip · 16.6 MB · 320 records
2009-08.zip · 21.5 MB · 312 records
2009-07.zip · 16.7 MB · 278 records
2009-06.zip · 18.4 MB · 303 records
2009-05.zip · 21.1 MB · 392 records
2009-04.zip · 154.9 MB · 2,341 records
2009-03.zip · 103.0 MB · 1,263 records
2009-02.zip · 12.4 MB · 248 records
2009-01.zip · 16.8 MB · 274 records
2008-12.zip · 17.1 MB · 293 records
2008-11.zip · 12.1 MB · 227 records
2008-10.zip · 25.2 MB · 408 records
2008-09.zip · 14.9 MB · 240 records
2008-08.zip · 13.6 MB · 257 records
2008-07.zip · 16.1 MB · 287 records
2008-06.zip · 17.4 MB · 307 records
2008-05.zip · 23.1 MB · 425 records
2008-04.zip · 159.2 MB · 2,551 records
2008-03.zip · 99.0 MB · 1,293 records
2008-02.zip · 11.9 MB · 225 records
2008-01.zip · 20.0 MB · 308 records
2007-12.zip · 15.9 MB · 241 records
2007-11.zip · 14.6 MB · 264 records
2007-10.zip · 21.0 MB · 384 records
2007-09.zip · 13.6 MB · 252 records
2007-08.zip · 26.4 MB · 396 records
2007-07.zip · 18.4 MB · 307 records
2007-06.zip · 18.9 MB · 332 records
2007-05.zip · 21.9 MB · 392 records
2007-04.zip · 156.0 MB · 2,556 records
2007-03.zip · 95.6 MB · 1,381 records
2007-02.zip · 11.2 MB · 247 records
2007-01.zip · 15.8 MB · 347 records
2006-12.zip · 13.3 MB · 279 records
2006-11.zip · 14.4 MB · 229 records
2006-10.zip · 23.7 MB · 450 records
2006-09.zip · 15.5 MB · 316 records
2006-08.zip · 27.0 MB · 265 records
2006-07.zip · 15.5 MB · 278 records
2006-06.zip · 17.5 MB · 430 records
2006-05.zip · 31.0 MB · 642 records
2006-04.zip · 123.2 MB · 2,234 records
2006-03.zip · 79.7 MB · 1,485 records
2006-02.zip · 11.5 MB · 242 records
2006-01.zip · 15.0 MB · 307 records
2005-12.zip · 13.1 MB · 274 records
2005-11.zip · 10.1 MB · 250 records
2005-10.zip · 24.4 MB · 487 records
2005-09.zip · 18.8 MB · 410 records
2005-08.zip · 12.6 MB · 311 records
2005-07.zip · 14.9 MB · 322 records
2005-06.zip · 27.5 MB · 474 records
2005-05.zip · 27.3 MB · 628 records
2005-04.zip · 109.6 MB · 2,423 records
2005-03.zip · 74.3 MB · 1,469 records
2005-02.zip · 10.9 MB · 250 records
2005-01.zip · 12.3 MB · 321 records
2004-12.zip · 12.4 MB · 289 records
2004-11.zip · 10.8 MB · 250 records
2004-10.zip · 20.3 MB · 463 records
2004-09.zip · 20.3 MB · 350 records
2004-08.zip · 10.7 MB · 248 records
2004-07.zip · 23.1 MB · 382 records
2004-06.zip · 20.5 MB · 384 records
2004-05.zip · 21.7 MB · 449 records
2004-04.zip · 117.0 MB · 2,476 records
2004-03.zip · 77.6 MB · 1,589 records
2004-02.zip · 14.0 MB · 243 records
2004-01.zip · 12.8 MB · 323 records
2003-12.zip · 10.8 MB · 312 records
2003-11.zip · 11.8 MB · 275 records
2003-10.zip · 16.5 MB · 479 records
2003-09.zip · 12.4 MB · 343 records
2003-08.zip · 12.5 MB · 301 records
2003-07.zip · 18.8 MB · 350 records
2003-06.zip · 19.8 MB · 394 records
2003-05.zip · 19.8 MB · 499 records
2003-04.zip · 97.9 MB · 2,524 records
2003-03.zip · 82.2 MB · 1,602 records
2003-02.zip · 12.8 MB · 294 records
2003-01.zip · 13.0 MB · 337 records
2002-12.zip · 11.9 MB · 322 records
2002-11.zip · 13.3 MB · 289 records
2002-10.zip · 15.3 MB · 480 records
2002-09.zip · 15.0 MB · 413 records
2002-08.zip · 12.3 MB · 331 records
2002-07.zip · 14.4 MB · 365 records
2002-06.zip · 21.5 MB · 438 records
2002-05.zip · 18.9 MB · 608 records
2002-04.zip · 89.7 MB · 2,668 records
2002-03.zip · 68.0 MB · 1,624 records
2002-02.zip · 7.5 MB · 243 records
2002-01.zip · 9.5 MB · 329 records
2001-12.zip · 7.9 MB · 290 records
2001-11.zip · 8.0 MB · 291 records
2001-10.zip · 12.9 MB · 461 records
2001-09.zip · 12.3 MB · 376 records
2001-08.zip · 10.8 MB · 301 records
2001-07.zip · 10.7 MB · 375 records
2001-06.zip · 14.3 MB · 426 records
2001-05.zip · 19.1 MB · 646 records
2001-04.zip · 80.8 MB · 2,712 records
2001-03.zip · 69.8 MB · 1,707 records
2001-02.zip · 7.5 MB · 251 records
2001-01.zip · 9.0 MB · 340 records
2000-12.zip · 10.0 MB · 328 records
2000-11.zip · 8.7 MB · 319 records
2000-10.zip · 12.8 MB · 494 records
2000-09.zip · 11.8 MB · 429 records
2000-08.zip · 10.7 MB · 346 records
2000-07.zip · 11.7 MB · 391 records
2000-06.zip · 13.9 MB · 522 records
2000-05.zip · 24.4 MB · 871 records
2000-04.zip · 61.9 MB · 2,336 records
2000-03.zip · 59.7 MB · 1,995 records
2000-02.zip · 8.9 MB · 280 records
2000-01.zip · 8.4 MB · 325 records
1999-12.zip · 13.6 MB · 379 records
1999-11.zip · 8.0 MB · 293 records
1999-10.zip · 12.4 MB · 485 records
1999-09.zip · 12.6 MB · 453 records
1999-08.zip · 9.3 MB · 302 records
1999-07.zip · 11.2 MB · 365 records
1999-06.zip · 11.6 MB · 456 records
1999-05.zip · 17.3 MB · 654 records
1999-04.zip · 66.4 MB · 2,616 records
1999-03.zip · 51.7 MB · 1,961 records
1999-02.zip · 9.1 MB · 281 records
1999-01.zip · 9.7 MB · 362 records
1998-12.zip · 10.7 MB · 415 records
1998-11.zip · 7.7 MB · 281 records
1998-10.zip · 13.2 MB · 480 records
1998-09.zip · 11.9 MB · 465 records
1998-08.zip · 9.5 MB · 341 records
1998-07.zip · 11.3 MB · 410 records
1998-06.zip · 12.8 MB · 492 records
1998-05.zip · 18.0 MB · 662 records
1998-04.zip · 67.3 MB · 2,620 records
1998-03.zip · 51.0 MB · 1,993 records
1998-02.zip · 9.5 MB · 304 records
1998-01.zip · 9.7 MB · 382 records
1997-12.zip · 9.7 MB · 409 records
1997-11.zip · 10.0 MB · 320 records
1997-10.zip · 12.4 MB · 508 records
1997-09.zip · 13.2 MB · 485 records
1997-08.zip · 10.1 MB · 342 records
1997-07.zip · 10.5 MB · 375 records
1997-06.zip · 13.2 MB · 464 records
1997-05.zip · 16.6 MB · 660 records
1997-04.zip · 75.2 MB · 2,608 records
1997-03.zip · 53.8 MB · 2,027 records
1997-02.zip · 8.3 MB · 322 records
1997-01.zip · 9.7 MB · 400 records
1996-12.zip · 8.4 MB · 367 records
1996-11.zip · 9.4 MB · 332 records
1996-10.zip · 14.4 MB · 536 records
1996-09.zip · 12.8 MB · 527 records
1996-08.zip · 10.5 MB · 381 records
1996-07.zip · 12.1 MB · 429 records
1996-06.zip · 11.1 MB · 402 records
1996-05.zip · 13.6 MB · 507 records
1996-04.zip · 27.4 MB · 1,155 records
1996-03.zip · 33.8 MB · 1,354 records
1996-02.zip · 5.9 MB · 236 records
1996-01.zip · 4.7 MB · 206 records
1995-12.zip · 5.9 MB · 242 records
1995-11.zip · 3.8 MB · 164 records
1995-10.zip · 6.4 MB · 262 records
1995-09.zip · 6.3 MB · 275 records
1995-08.zip · 4.6 MB · 176 records
1995-07.zip · 5.4 MB · 197 records
1995-06.zip · 4.8 MB · 193 records
1995-05.zip · 6.5 MB · 299 records
1995-04.zip · 17.9 MB · 793 records
1995-03.zip · 29.5 MB · 1,162 records
1995-02.zip · 3.3 MB · 107 records
1995-01.zip · 1.6 MB · 67 records
1994-12.zip · 2.3 MB · 91 records
1994-11.zip · 1.3 MB · 49 records
1994-10.zip · 1.8 MB · 66 records
1994-09.zip · 3.6 MB · 117 records
1994-08.zip · 2.5 MB · 71 records
1994-07.zip · 1.4 MB · 52 records
1994-06.zip · 6.5 MB · 136 records
1994-05.zip · 5.0 MB · 147 records
1994-04.zip · 10.9 MB · 401 records
1994-03.zip · 20.7 MB · 811 records
1994-02.zip · 1.6 MB · 61 records
1994-01.zip · 1.3 MB · 55 records
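
The container names in the listing follow a YYYY-MM.zip pattern. As a quick consistency check, the sketch below enumerates one name per calendar month between the first and last containers shown and counts them. The function name monthly_containers is illustrative, and the bounds are simply read off the listing.

```python
def monthly_containers(first=(1994, 1), last=(2026, 5)):
    """Return one 'YYYY-MM.zip' name per month from first to last, inclusive."""
    names = []
    year, month = first
    while (year, month) <= last:
        names.append(f"{year:04d}-{month:02d}.zip")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return names


names = monthly_containers()
print(names[0], names[-1], len(names))  # 1994-01.zip 2026-05.zip 389
```

The count of 389 matches the "389 files" figure in the listing header, which is consistent with one container per month and no gaps.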

What This Dataset Contains

The dataset is built from definitive proxy statements filed on Form DEF 14A. Form DEF 14A is the "definitive proxy statement" furnished under Section 14(a) of the Securities Exchange Act of 1934 and Rule 14a-3, sent to shareholders in advance of an annual or special meeting so that they can vote on matters such as director elections, auditor ratification, Say-on-Pay and Say-on-Frequency votes, equity compensation plan approvals, charter and bylaw amendments, stock splits, mergers, and shareholder proposals. Its substantive content is governed by Schedule 14A (Rule 14a-101), the enumerated Items of Regulation S-K (notably 201(d), 401, 402, 403, 404, and 407), and Rule 14a-21 for compensation-related advisory votes. Since 2023, Item 402(v) adds a structured, iXBRL-tagged pay-versus-performance disclosure.

The dataset contains only DEF 14A submissions. It deliberately excludes the sibling proxy form types that share the same Section 14(a) framework: PRE 14A (preliminary), DEFA14A (additional definitive soliciting materials), DEFM14A (merger-specific), DEFC14A (contested), and DEF 14A/A (amendment). Coverage begins January 1, 1994, during the mid-1990s phase-in of mandatory electronic filing on EDGAR, and extends to the present. Filings are distributed as monthly ZIP containers; each container unpacks into one folder per accession, holding the EDGAR-native descriptor plus the primary textual artifacts of the submission.

Content Structure of a Single Record

What one record represents

One record in the Form DEF 14A Files dataset is one complete EDGAR submission of a definitive proxy statement, uniquely keyed by its SEC accession number. Inside a monthly ZIP container, the record materializes as a single folder whose name is the 18-digit dashless form of the accession (for example 000004846525000074 for accession 0000048465-25-000074). The folder holds a metadata.json descriptor plus every textual and PDF document the registrant filed under that accession: the primary proxy statement, any textual exhibits, and, when produced, a registrant-supplied courtesy PDF. Image graphics (EDGAR GRAPHIC artifacts) and the XBRL linkbase sidecars are catalogued in the metadata but deliberately omitted from the payload. The record unit is therefore the filing-as-delivered-to-EDGAR, flattened into a single folder of primary textual artifacts plus a self-describing manifest.
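
The mapping between the folder name and the dashed accession number is purely positional (10, 2, and 6 digits). A minimal sketch of the conversion in both directions follows; the helper names to_dashless and to_dashed are hypothetical and not part of any dataset tooling.

```python
import re

_DASHED = re.compile(r"^(\d{10})-(\d{2})-(\d{6})$")
_DASHLESS = re.compile(r"^\d{18}$")


def to_dashless(accession: str) -> str:
    """Return the 18-digit folder-name form of an accession number."""
    if _DASHLESS.match(accession):
        return accession
    match = _DASHED.match(accession)
    if not match:
        raise ValueError(f"not an accession number: {accession!r}")
    return "".join(match.groups())


def to_dashed(accession: str) -> str:
    """Return the dashed 10-2-6 form used in metadata.json."""
    digits = to_dashless(accession)
    return f"{digits[:10]}-{digits[10:12]}-{digits[12:]}"


print(to_dashless("0000048465-25-000074"))  # 000004846525000074
print(to_dashed("000004846525000074"))      # 0000048465-25-000074
```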

Record-level content structure

At the filesystem level, one record consists of up to four kinds of artifact:

  1. A metadata.json descriptor (always present; exactly one per record).
  2. The primary proxy statement, a single .htm file in Inline XBRL HTML (always present for modern filings).
  3. Zero or more textual exhibits, each an .htm file wrapped in an EDGAR SGML <DOCUMENT> envelope.
  4. Optionally, one courtesy PDF reproducing the proxy with print-oriented typesetting and embedded images.

The typical footprint is two files (metadata.json plus the primary .htm). In the December 2025 sample container, 201 of 210 indexed submissions resolved to folders (the remaining nine were image-only or late-arriving), producing 204 HTML documents, 201 metadata JSONs, and 6 courtesy PDFs across 411 files. Folders grow to four or five files only when textual exhibits (charter, bylaws, material contracts, press releases) or a courtesy PDF accompany the primary document.
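
Counts like these can be reproduced by walking a container and grouping its members by record folder. The sketch below uses only the standard library. It assumes each file's immediate parent directory inside the ZIP is the accession folder (any extra top-level prefix is ignored), and the function name group_by_record is illustrative.

```python
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_record(zip_source):
    """Map each record folder name to the list of file names inside it.

    zip_source may be a path or a binary file-like object.
    """
    records = defaultdict(list)
    with zipfile.ZipFile(zip_source) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            path = PurePosixPath(info.filename)
            # Assumption: the parent directory is the 18-digit accession folder.
            records[path.parent.name].append(path.name)
    return dict(records)


def count_by_extension(records):
    """Tally file extensions (lowercased) across all record folders."""
    counts = defaultdict(int)
    for names in records.values():
        for name in names:
            counts[PurePosixPath(name).suffix.lower()] += 1
    return dict(counts)
```

Calling count_by_extension(group_by_record("2025-12.zip")) on a locally downloaded container would give a per-extension tally comparable to the sample figures above.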

Component-by-component breakdown

metadata.json descriptor

The metadata.json file captures the EDGAR-level descriptor of the submission and acts as both a record header and a pointer registry to assets that are not physically shipped. Top-level fields:

  • id — 32-character hexadecimal internal record identifier used by sec-api.
  • formType — always "DEF 14A".
  • accessionNo — dashed SEC accession number (e.g. "0000048465-25-000074").
  • description — EDGAR submission description, typically "Form DEF 14A - Other definitive proxy statements".
  • filedAt — ISO-8601 timestamp including the EDGAR receive timezone (e.g. "2025-12-17T16:30:40-05:00").
  • periodOfReport — date string; for DEF 14A this is the shareholder meeting or record date, not a fiscal-period end.
  • linkToFilingDetails — canonical EDGAR URL for the primary document (often the iXBRL viewer).
  • linkToTxt — URL of the complete SGML submission text file.
  • linkToHtml — URL of the EDGAR -index.htm filing-index page.
  • linkToXbrl — URL of a standalone XBRL viewer; frequently an empty string for DEF 14A, because tagging is carried inline in the primary HTML rather than in a separate instance.
  • entities — array of every participating entity (filer, subject, filing agent). Each object carries companyName (with an EDGAR role suffix such as "(Filer)" or "(Subject)"), cik, fileNo, irsNo, stateOfIncorporation, fiscalYearEnd (MMDD), sic (numeric code plus human label), act, type, filmNo, a tickers array that may enumerate multiple classes (e.g. ["OPTX","OLIT","OLITU","OPTXW","OLITW"]), and the EDGAR industry-office label.
  • seriesAndClassesContractsInformation — array that is empty for operating-company proxies and populated only when the filer is a registered investment company disclosing series/class identifiers.
  • documentFormatFiles — ordered manifest of every textual and binary document EDGAR recorded for the submission. Each entry has sequence (string; blank for the synthetic complete-submission .txt), size (bytes, stringified), documentUrl, type (e.g. "DEF 14A", "GRAPHIC", "EX-3.1", "EX-99.1"), and an optional description. The sequence: "1" entry is always the primary proxy; GRAPHIC entries are listed for completeness even though the image bytes are not inside the ZIP.
  • dataFiles — array of XBRL sidecar artifacts: the schema (EX-101.SCH, .xsd), the definition/label/presentation linkbases (EX-101.DEF, EX-101.LAB, EX-101.PRE, each .xml), and an extracted XBRL instance (for example *_htm.xml). These XML artifacts are referenced by URL only and are not carried in the payload.

Together these fields make metadata.json a self-describing index of what is present on disk, a pointer registry for excluded assets, and the EDGAR-native view of the submission.
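
As an illustration of reading the descriptor, the sketch below parses a metadata.json string, converts filedAt to a timezone-aware datetime, and selects the entity whose companyName ends in the "(Filer)" role suffix. The function name summarize_metadata and the shape of the returned dictionary are assumptions made for this example; only the field names come from the list above.

```python
import json
from datetime import datetime

FILER_SUFFIX = "(Filer)"


def summarize_metadata(raw_json: str) -> dict:
    """Extract a few header fields and the reporting registrant from metadata.json."""
    meta = json.loads(raw_json)
    filers = [
        entity for entity in meta.get("entities", [])
        if entity.get("companyName", "").endswith(FILER_SUFFIX)
    ]
    filer = filers[0] if filers else None
    filer_name = None
    if filer is not None:
        # Drop the role suffix to leave the bare company name.
        filer_name = filer["companyName"][: -len(FILER_SUFFIX)].strip()
    return {
        "accessionNo": meta["accessionNo"],
        # fromisoformat keeps the UTC offset carried in the timestamp.
        "filedAt": datetime.fromisoformat(meta["filedAt"]),
        "filerName": filer_name,
        "filerCik": filer.get("cik") if filer else None,
    }
```

If no entity carries the "(Filer)" suffix, filerName and filerCik come back as None rather than raising, so callers can decide how to handle unusual submissions.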

Primary proxy statement (.htm)

The primary document is a single Inline XBRL HTML file: a complete <html> page whose body is rendered for human reading while simultaneously exposing machine-readable XBRL facts through elements in the http://www.xbrl.org/2013/inlineXBRL namespace (<ix:nonNumeric>, <ix:nonFraction>, <ix:header>, <ix:hidden>, <ix:references>, <ix:resources>). The root element declares XBRL taxonomy namespaces — dei, us-gaap, ecd (the Executive Compensation Disclosure taxonomy used for pay-versus-performance), srt, any company-specific extension taxonomy, and iso4217 for currency units. An <ix:hidden> block near the top carries at minimum the cover-page facts dei:DocumentType (value "DEF 14A"), dei:AmendmentFlag, and dei:EntityCentralIndexKey; a reporting context (for example <xbrli:context id="c-1">) supplies the fiscal year start and end dates that any ecd:* pay-versus-performance facts reference.
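
For a rough look at the tagged facts without a full iXBRL toolchain, the standard-library HTML parser can collect the text inside ix:nonNumeric and ix:nonFraction elements. The sketch below is a simplification: it returns the displayed text only and does not apply scale, sign, or format attributes, so a dedicated iXBRL-aware parser is the better choice for numeric work. The class name InlineFactParser is hypothetical.

```python
from html.parser import HTMLParser


class InlineFactParser(HTMLParser):
    """Collect (name, contextRef, text) for inline XBRL fact elements."""

    # HTMLParser lowercases tag and attribute names before dispatch.
    FACT_TAGS = {"ix:nonnumeric", "ix:nonfraction"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.facts = []    # completed facts
        self._open = []    # stack of [name, contextRef, text chunks]

    def handle_starttag(self, tag, attrs):
        if tag in self.FACT_TAGS:
            attributes = dict(attrs)
            self._open.append(
                [attributes.get("name"), attributes.get("contextref"), []]
            )

    def handle_data(self, data):
        # Text belongs to every fact element that is currently open.
        for entry in self._open:
            entry[2].append(data)

    def handle_endtag(self, tag):
        if tag in self.FACT_TAGS and self._open:
            name, context, chunks = self._open.pop()
            self.facts.append((name, context, "".join(chunks).strip()))


def extract_facts(html_text: str):
    parser = InlineFactParser()
    parser.feed(html_text)
    parser.close()
    return parser.facts
```

Filtering the result for names beginning with "dei:" yields the cover-page facts, and names beginning with "ecd:" yield the pay-versus-performance facts where the filing carries them.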

Filenames follow two conventions driven by the filing agent. Large-company filings produced by Workiva or Donnelley use {ticker}-{yyyymmdd}.htm (e.g. hrl-20251216.htm, apd-20251210.htm, lits-20251230.htm). Smaller filers using Broadridge, Toppan Merrill, or EdgarAgents use generic names such as formdef14a.htm, def14aproxystatement.htm, tm{id}_def14a.htm, or ea{id}_def14a.htm.
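
A filename in the first convention can be recognized with a simple pattern, which is sometimes handy for a quick guess at the ticker before opening metadata.json. The sketch below assumes the ticker part is alphanumeric and the date part is a valid eight-digit calendar date; the function name parse_ticker_date_name is illustrative, and anything that does not match is treated as the generic convention.

```python
import re
from datetime import datetime

_TICKER_DATE = re.compile(
    r"^(?P<ticker>[a-z0-9]+)-(?P<ymd>\d{8})\.htm$", re.IGNORECASE
)


def parse_ticker_date_name(filename: str):
    """Return (ticker, date) for '{ticker}-{yyyymmdd}.htm' names, else None."""
    match = _TICKER_DATE.match(filename)
    if not match:
        return None
    try:
        parsed = datetime.strptime(match["ymd"], "%Y%m%d").date()
    except ValueError:
        return None  # eight digits that are not a real calendar date
    return match["ticker"].lower(), parsed


print(parse_ticker_date_name("hrl-20251216.htm"))  # ('hrl', datetime.date(2025, 12, 16))
print(parse_ticker_date_name("formdef14a.htm"))    # None
```

The tickers array in metadata.json remains the authoritative source; the filename is only a hint.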

The rendered body follows the customary Schedule 14A section ordering, with substantial registrant-specific variation in depth and labelling:

  • Cover page and notice of annual (or special) meeting of stockholders — meeting date, time, and location (physical address or virtual meeting URL), record date, and enumerated proposals to be voted on.
  • Proxy statement summary / proxy highlights — an executive overview common among large-cap filers since the early 2010s.
  • Questions and answers about the meeting and voting — voting mechanics by proxy, in person, by phone, or online; discussion of broker non-votes, revocation, and quorum rules.
  • Voting proposals — each introduced by a narrative description, the board's recommendation, and the required vote standard. Proposal 1 is almost always director elections; subsequent proposals typically cover auditor ratification, say-on-pay, equity plan approvals, and shareholder proposals.
  • Director nominee section — biographies, skills matrices, committee memberships, independence determinations, attendance records, and diversity tables.
  • Corporate governance — board leadership structure, risk oversight, committee charters, related-person transactions policy, ESG and sustainability oversight, shareholder engagement.
  • Beneficial ownership tables — 5% holders and directors/officers under Item 403.
  • Executive compensation under Item 402 — Compensation Discussion and Analysis (CD&A), Summary Compensation Table, Grants of Plan-Based Awards, Outstanding Equity Awards at Fiscal Year-End, Option Exercises and Stock Vested, Pension Benefits, Nonqualified Deferred Compensation, Potential Payments upon Termination or Change in Control, Director Compensation Table, CEO pay-ratio disclosure (Item 402(u)), and the Pay-Versus-Performance table (Item 402(v), iXBRL-tagged).
  • Audit committee report and auditor ratification — including fees paid to the independent registered public accounting firm broken out by category.
  • Equity compensation plan information table — under Item 201(d).
  • Shareholder proposals and management responses — each with proponent identification (where required) and the board's opposing or supporting statement.
  • Administrative disclosures — householding, stockholder communications procedures, advance-notice bylaw summaries, and Rule 14a-8 deadlines for submitting proposals for the next annual meeting.
  • Appendices / annexes — non-GAAP reconciliations, full text of equity plans being approved, or proposed charter or bylaw amendments presented as marked redlines.

Textual exhibits (ex*.htm)

Textual exhibits are preserved inside the EDGAR SGML <DOCUMENT> envelope. Each exhibit file opens with a header block of the form:

<DOCUMENT>
<TYPE>EX-3.1
<SEQUENCE>2
<FILENAME>ex3-1.htm
<DESCRIPTION>EX-3.1
<TEXT>
<HTML> ... </HTML>
</TEXT>
</DOCUMENT>

The <TYPE>, <SEQUENCE>, <FILENAME>, and <DESCRIPTION> lines correspond one-to-one with entries in metadata.json -> documentFormatFiles, enabling deterministic joining between the manifest and the filesystem. Common exhibit types observed in DEF 14A submissions include EX-3.1 and EX-3.3 (amended certificates of incorporation, bylaws), EX-10.* (material agreements such as equity incentive plan texts), and EX-99.* (press releases, voting-result supplements, supplemental disclosures). Exhibits are never iXBRL-tagged; only the primary proxy carries inline XBRL.
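
A minimal sketch of separating the SGML header from the HTML body is shown below. It assumes the header tags each sit on their own line ahead of <TEXT>, as in the envelope shown above; the function name split_exhibit is hypothetical. Files without an envelope are returned unchanged with an empty header.

```python
import re

_HEADER_TAG = re.compile(r"^<(TYPE|SEQUENCE|FILENAME|DESCRIPTION)>(.*)$")
_CLOSERS = ("</TEXT>", "</DOCUMENT>", "")


def split_exhibit(raw: str):
    """Return (header dict, body text) for one SGML-wrapped exhibit file."""
    header = {}
    lines = raw.splitlines()
    body_start = None
    for index, line in enumerate(lines):
        stripped = line.strip()
        if stripped.upper() == "<TEXT>":
            body_start = index + 1
            break
        match = _HEADER_TAG.match(stripped)
        if match:
            header[match.group(1).lower()] = match.group(2).strip()
    if body_start is None:
        return {}, raw  # no envelope found; treat the whole file as body
    body_lines = lines[body_start:]
    # Drop trailing </TEXT>, </DOCUMENT>, and blank lines.
    while body_lines and body_lines[-1].strip().upper() in _CLOSERS:
        body_lines.pop()
    return header, "\n".join(body_lines)
```

The returned header keys (type, sequence, filename, description) can then be compared with the matching documentFormatFiles entry, and the body can be handed to an HTML parser without the leading SGML lines.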

Courtesy PDF (optional)

A small fraction of records (6 of 201 folders in the December 2025 sample, all large-cap filers such as Air Products, Accenture, Rockwell Automation, Mueller Water, and Spire) include a registrant-produced courtesy PDF with a file name matching *courtesy*pdf.pdf. This is a standard binary PDF (magic bytes %PDF-1.7) that reproduces the proxy with the filer's print-oriented typesetting and includes the image graphics (director photographs, infographics, governance icons) that are filtered out of the HTML-only payload. Content parity with the primary .htm is intended by the filer but not enforced; the courtesy PDF is useful for visually faithful rendering and for recovering embedded images.
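
A courtesy PDF can be picked out of a record folder by combining the filename pattern with a check on the leading bytes. The sketch below is one possible approach, not a documented rule of the dataset; the function name is_courtesy_pdf is illustrative, and it accepts any PDF version rather than only 1.7.

```python
from fnmatch import fnmatchcase

COURTESY_PATTERN = "*courtesy*pdf.pdf"


def is_courtesy_pdf(filename: str, first_bytes: bytes) -> bool:
    """True when the name matches the courtesy pattern and the bytes look like a PDF."""
    name_matches = fnmatchcase(filename.lower(), COURTESY_PATTERN)
    looks_like_pdf = first_bytes.startswith(b"%PDF-")
    return name_matches and looks_like_pdf
```

Reading only the first eight bytes of each candidate file is enough for the magic-byte check, so the test stays cheap even for large PDFs.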

What the dataset record includes

The record ships with the metadata.json descriptor, the primary Inline XBRL HTML proxy statement, every textual exhibit filed under the accession, and any courtesy PDF. All textual content — cover page, proposals, compensation tables, governance narrative, ownership tables, audit disclosure, shareholder proposals, appendices, and every textual exhibit — is present end-to-end, including the inline XBRL tagging on the cover page and the pay-versus-performance table. The EDGAR-level index (documentFormatFiles, dataFiles) is fully preserved in metadata even for artifacts that are not physically shipped, so the record is self-describing with respect to what the original submission contained.

What is excluded or structurally separate

Three categories of material referenced by metadata.json are intentionally not shipped inside the ZIP:

  • Image graphics (type: "GRAPHIC") — JPEG, GIF, and similar artifacts referenced by the primary HTML. Omitted to reduce container size. Consumers needing the images can follow documentUrl in documentFormatFiles to fetch them from EDGAR, or use the courtesy PDF (when present) as a print-ready surrogate.
  • XBRL sidecar files listed in dataFiles — the schema (.xsd), the definition/label/presentation linkbases, and the extracted XBRL instance XML. The inline XBRL facts themselves are still available because they live inside the primary .htm.
  • The synthetic complete-submission .txt file — EDGAR's concatenation of every document in the submission. Referenced via linkToTxt but not replicated, because its content is the union of the payload files plus the image and XML assets already excluded.

Separately, sibling filings associated with the same meeting are not part of this dataset: PRE 14A (preliminary), DEFA14A (additional soliciting material), DEF 14A/A (amendment), DEFM14A (merger-specific), and DEFC14A (contested), as well as Form 4 updates and Schedule 13D/G disclosures around the meeting. Consumers who need a complete meeting file must retrieve these companions from adjacent datasets and group them by registrant CIK and meeting date.

Changes in required content over time

Schedule 14A content has expanded materially between 1994 and the present; a 1995 DEF 14A is structurally far shorter and less codified than one filed in 2025, even though the envelope form type is unchanged. Key inflection points:

  • 1996 — Rule 14a-4 amendments tightened the presentation of ballot items and discretionary voting authority.
  • 2003 — Disclosure of nominating committee processes and shareholder communications with directors became mandatory.
  • 2006 — Release 33-8732A (Executive Compensation and Related Person Disclosure) overhauled Item 402, introducing the modern CD&A, the Summary Compensation Table format, and the $120,000 related-person transaction threshold; also added director compensation tables and relocated Section 16(a) delinquency reporting into Schedule 14A.
  • 2009 — Release 33-9089 added required disclosure of board leadership structure, risk oversight, diversity considerations, and compensation consultant fees.
  • 2011 — Dodd-Frank implementation (Rule 14a-21) introduced Say-on-Pay and Say-on-Frequency advisory votes, embedding a recurring compensation vote block in almost every DEF 14A.
  • 2018 — CEO pay-ratio disclosure under Item 402(u) first appeared.
  • 2022 — Rule 14a-19 (universal proxy) restructured director-election ballots in contested elections; Rule 10D-1 added clawback policy disclosure.
  • 2023 — Release 33-11126 made the Item 402(v) pay-versus-performance table mandatory, introducing a structured iXBRL-tagged block with ecd:* facts covering Compensation Actually Paid, Total Shareholder Return, net income, and a company-selected measure over a rolling multi-year window.
  • Recent years — Climate, cybersecurity oversight, and human-capital-management disclosures have increasingly appeared in the governance section, though DEF 14A is not the primary locus for those mandates.

Changes in source-file format over time

The presentation format of DEF 14A filings on EDGAR has evolved through three distinct eras:

  • 1994 — c. 2001 — Filings were submitted as plain-ASCII SGML .txt documents. The proxy body was rendered with fixed-width columns, dashes as table rules, and minimal typography. Biographical blocks, compensation tables, and ownership grids appeared as text art. Exhibits were concatenated into the same SGML stream, delimited by <DOCUMENT>/<TYPE>/<SEQUENCE> headers. Early records in the dataset resolve to primary documents that are HTML only to the extent the registrant voluntarily filed HTML.
  • c. 2002 — 2019 — HTML became the dominant presentation format. Primary .htm files used inline CSS and HTML tables for layout; graphics were filed as separate GRAPHIC documents and referenced via relative <img> tags. The SGML <DOCUMENT> envelope remained around each attached exhibit. This era is the most consistent in rendering fidelity across the dataset.
  • 2020 — present — Inline XBRL (iXBRL) adoption accelerated, driven first by the cover-page tagging rule (Release 33-10618, phased in 2019-2020) and then by Item 402(v) pay-versus-performance tagging (2023). Modern primary documents declare XBRL namespaces at the root, place cover-page facts in <ix:hidden>, and tag the pay-versus-performance table with ecd:* facts. Large-cap filers increasingly attach a courtesy PDF as a print-typeset companion. linkToXbrl is usually empty for this form because the tagging lives inline, with no separate XBRL instance.

Across all eras, the SGML <DOCUMENT> wrapping of textual exhibits remains stable, and metadata.json normalizes the EDGAR-level view regardless of the era the filing belongs to.

Interpretation and extraction notes

  • Accession canonicalization — The 18-digit dashless folder name is equivalent to the dashed accession number in metadata.json.accessionNo; canonicalize to one form before joining against other datasets.
  • Effective payload reconciliation — documentFormatFiles enumerates every document EDGAR recorded, including assets not physically shipped (images, the complete-submission .txt). The effective on-disk payload is the subset with extension .htm or .pdf; joining on sequence or filename against the files actually present in the folder yields the reconciled list.
  • XBRL extraction — The linkbase XML in dataFiles is not shipped, but structured extraction can proceed directly against the inline iXBRL facts inside the primary .htm using any iXBRL-aware parser.
  • Entity disambiguation — When multiple entities are listed, the object carrying the (Filer) role suffix is the reporting registrant. Subject companies and filing agents appear with distinct role suffixes. Registered investment company filings populate seriesAndClassesContractsInformation; operating-company filings do not.
  • Courtesy PDF semantics — The courtesy PDF is a second rendering of the same proxy, not an independent exhibit. It should not be double-counted when measuring distinct proposals or sections, and its content parity with the primary HTML is filer-controlled rather than enforced.
  • periodOfReport semantics — For DEF 14A this field carries the meeting or record date, not a fiscal-period end. Treating it as a fiscal date will produce misaligned joins against 10-K or 10-Q datasets.
  • Exhibit wrapper handling — Textual exhibits carry the SGML <DOCUMENT> header before the <HTML> body. Parsers that feed the file directly into an HTML renderer must strip (or ignore) the leading SGML lines; parsers that tokenize the SGML envelope can use <TYPE>, <SEQUENCE>, and <FILENAME> to reconcile against documentFormatFiles.
  • Table extraction — The pay-versus-performance table and cover-page facts are iXBRL-tagged and extractable as structured data; other compensation, ownership, and governance tables remain HTML-only and require layout-sensitive extraction.
  • Meeting-file completeness — Revised definitive proxies (DEFR14A), additional soliciting materials (DEFA14A), preliminary drafts (PRE 14A), and contested/merger variants are filed under separate form types and are not part of this dataset. Reconstructing a full meeting context requires pulling companions from adjacent datasets keyed on CIK and meeting date.
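
The accession canonicalization and payload reconciliation notes above can be sketched in a few lines of Python. This is a minimal illustration, not the publisher's tooling: the helper names (dashless, dashed, reconcile) are hypothetical, and the assumption that each documentFormatFiles entry carries a documentUrl key ending in the filename is not confirmed by this section.

```python
import re

def dashless(accession: str) -> str:
    """Return the 18-digit folder-name form of an accession number."""
    digits = re.sub(r"\D", "", accession)
    if len(digits) != 18:
        raise ValueError(f"expected 18 digits, got {accession!r}")
    return digits

def dashed(accession: str) -> str:
    """Return the dashed form (10-2-6 digit groups)."""
    d = dashless(accession)
    return f"{d[:10]}-{d[10:12]}-{d[12:]}"

def reconcile(metadata: dict, present_names):
    """Split documentFormatFiles into entries shipped in the folder and entries not shipped.

    Assumption: each entry has a 'documentUrl' whose last path segment is the filename.
    """
    present = {name.lower() for name in present_names}
    shipped, not_shipped = [], []
    for doc in metadata.get("documentFormatFiles", []):
        filename = doc.get("documentUrl", "").rsplit("/", 1)[-1].lower()
        (shipped if filename in present else not_shipped).append(doc)
    return shipped, not_shipped
```

In practice present_names would come from listing the accession folder (for example with os.listdir), and the folder name itself can be checked against dashless(metadata["accessionNo"]).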

Who Files or Publishes This Dataset, and When

Each DEF 14A record is a definitive proxy statement filed on EDGAR by a party soliciting votes, consents, or authorizations from security holders of a class registered under Section 12 of the Securities Exchange Act of 1934. In the vast majority of filings the filer is the issuer itself.

The filer population includes:

  • Domestic operating companies with equity registered under Section 12(b) (NYSE, Nasdaq, and other national exchanges). This is the dominant filer class.
  • Domestic issuers registered under Section 12(g) meeting the asset and holder-of-record thresholds.
  • Registered closed-end funds, business development companies (BDCs), and other registered investment companies that hold shareholder meetings. Open-end mutual funds file on DEF 14A when soliciting fund-shareholder votes.
  • REITs, master limited partnerships, and similar Section 12-registered entities that solicit unitholder or shareholder votes.
  • Non-issuer soliciting persons (for example, dissident shareholders in a contest). These more typically appear as PREC14A / DEFC14A, but can file DEF 14A.

Issuers that report only under Section 15(d) (no Section 12-registered class) are not subject to the Section 14(a) proxy rules and do not file DEF 14A, even though they file periodic reports. Foreign private issuers are exempt under Rule 3a12-3(b) and furnish home-country proxy materials on Form 6-K instead; Canadian MJDS filers likewise do not file DEF 14A. The DEF 14A population is therefore effectively a domestic-issuer population.

When the record is created or required

DEF 14A is event-driven, not calendar-driven. The trigger is any solicitation of proxies, consents, or authorizations from holders of a Section 12-registered class. Typical triggers:

  • Annual meetings (director elections, auditor ratification, Say-on-Pay, Say-on-Frequency, shareholder proposals, equity plan approvals). This drives the bulk of DEF 14A volume.
  • Special meetings for mergers, acquisitions, asset sales, charter amendments, reverse splits, share-authorization increases, or reincorporations.
  • Written consent solicitations in lieu of a meeting, where state law and the charter permit.

Timing under Rule 14a-6:

  • For routine matters (director elections, auditor ratification, Say-on-Pay, compensation plans, Rule 14a-8 proposals), the issuer may file DEF 14A directly, with no preliminary step, on or before the date proxy materials are first sent to shareholders.
  • For non-routine matters (M&A, charter amendments affecting shareholder rights, going-private transactions), the issuer must first file a preliminary proxy statement (PRE 14A) at least 10 calendar days before definitive materials are disseminated; the DEF 14A follows.
  • In all cases, DEF 14A must be filed no later than the date it is first sent or given to shareholders.

There is no fixed annual deadline, but exchange-listing rules generally require annual meetings, and Rule 14a-8 proposal windows assume an anniversary cadence. For calendar-year issuers, DEF 14A filings concentrate in March through May, ahead of the spring annual-meeting season. Subsequent communications are filed as DEFA14A (additional soliciting material) or, for material changes to the proxy statement itself, as a revised definitive proxy (DEFR14A).

The governing framework is Section 14(a), Regulation 14A (Rules 14a-1 through 14a-21), and Schedule 14A, which incorporates Items 402, 404, and 407 of Regulation S-K for executive compensation, related-person transactions, and corporate governance disclosure.

Important distinctions

  • PRE 14A / DEFA14A / DEFR14A: Preliminary, supplemental, and revised filings in the same proxy chain. Only DEF 14A is the definitive proxy statement sent to shareholders as the voting document; this dataset covers DEF 14A only.
  • DEF 14C: Filed when the issuer is not soliciting proxies (for example, when a controlling holder has already executed a written consent). Outside this dataset.
  • DEFC14A / PREC14A: Contested proxy filings. Both issuer and dissident may file; designation depends on the filer and contest posture.
  • Schedule 13D / 14D-9: Beneficial-ownership reports and tender-offer recommendation statements arise under Sections 13(d) and 14(d)/(e), not 14(a).
  • Form N-PX: Reports how funds and certain managers voted proxies. It is a holder-side reporting regime, not an issuer solicitation filing.
  • Rule 13e-3 / Schedule 13E-3: Going-private transactions trigger additional disclosure typically combined with the DEF 14A.
  • Smaller reporting companies and emerging growth companies: File on the same framework but may use scaled compensation disclosure and are exempt from certain Dodd-Frank items (e.g., pay ratio, some Say-on-Pay mechanics during EGC status).
  • Foreign private issuers: Do not file DEF 14A; proxy-related materials appear on Form 6-K.

EDGAR coverage

Mandatory electronic filing of definitive proxy statements was phased in during the EDGAR rollout of the mid-1990s. DEF 14A records on EDGAR begin in 1994, aligning with this dataset's coverage start of January 1, 1994. Earlier proxy statements exist only in the SEC's paper archives.

How This Dataset Differs From Similar Datasets or Filings

DEF 14A sits inside a family of Section 14(a) proxy materials that share overlapping content and timing. The comparisons below clarify when DEF 14A is the correct dataset and when an adjacent filing type better fits the question.

PRE 14A (Preliminary Proxy Statement)

Same document, earlier stage. PRE 14A is filed when proposals go beyond routine meeting business (mergers, charter amendments affecting shareholder rights, going-private transactions) and SEC staff review is expected. It may contain bracketed placeholders and text that shifts in response to staff comments. DEF 14A is the version mailed to shareholders and used for the vote; PRE 14A is the working draft. Use DEF 14A for final shareholder-facing disclosure, PRE 14A for staff-review and language-evolution studies.

DEFA14A (Additional Definitive Proxy Soliciting Materials)

Supplemental campaign materials filed after the definitive proxy: press releases, investor decks, shareholder letters, proxy advisor rebuttals. Same solicitation, not the proxy statement itself. DEF 14A is the single comprehensive voting document; DEFA14A is an expanding stream of follow-on communications, often concentrated near the record date or during contested votes. Full contested-meeting coverage requires both.

DEFM14A and DEFC14A

Definitive proxies with the same legal basis as DEF 14A but distinct form codes: DEFM14A for merger/acquisition votes, DEFC14A for contested solicitations. They carry specialized disclosure (fairness opinions, transaction backgrounds, dissident nominees) and are excluded from a DEF 14A dataset. PREM14A and PREC14A are their preliminary counterparts and are likewise excluded.

Schedule 14C (Information Statement)

Used when shareholder approval is secured without soliciting proxies, typically because a controlling holder has already consented. It discloses similar meeting and transaction information but includes no proxy card and no solicitation. DEF 14A implies an active solicitation of proxies; Schedule 14C implies a fait accompli.

Form 10-K

Frequently issued in sequence with DEF 14A and linked through Part III incorporation by reference, which creates overlap on executive compensation, director biographies, and related-party transactions. Purpose differs: 10-K is a financial and business report (audited statements, MD&A, risk factors); DEF 14A is a voting document (meeting mechanics, nominees, compensation governance, shareholder proposals). Use DEF 14A for say-on-pay, director elections, and proposal studies; 10-K for financial performance.

Form 8-K Item 5.07 (Vote Results)

Reports the numeric tally after the meeting, typically within four business days. DEF 14A describes what will be voted on; 8-K 5.07 reports what was decided. Complementary, not substitutes: DEF 14A supplies proposal text, board recommendations, and supporting disclosure; 8-K 5.07 supplies the outcome.

Form N-PX

The voter-side mirror of DEF 14A. Institutional managers and funds report how they voted on proxy proposals. A common research pattern is to link N-PX vote records back to DEF 14A proposal text, because N-PX contains no proposal content of its own.

Boundary summary

The DEF 14A Files dataset captures the final, shareholder-distributed proxy statement and its non-image exhibits for routine annual and ordinary special meetings of U.S. domestic reporting companies, from 1994 onward. It is narrower than the full proxy ecosystem (excludes PRE 14A, DEFA14A, DEFM14A, DEFC14A, and Schedule 14C) and broader than any extracted section (retains the full filing rather than isolating compensation tables or proposals). It sits between the event-driven outcome disclosure of 8-K Item 5.07 and the financial disclosure of 10-K, and is the authoritative source for the text that governs each vote.

Who Uses This Dataset

A DEF 14A package binds together director biographies, executive pay tables, equity plan proposals, auditor ratification, shareholder proposals, and beneficial ownership. Each professional group below extracts a specific slice for a defined workflow.

Executive compensation consultants

Firms advising boards and compensation committees build peer-pay benchmarks from the Summary Compensation Table, Grants of Plan-Based Awards, Outstanding Equity Awards, Pay Ratio, and Pay Versus Performance disclosures. Output: Say-on-Pay recommendations, plan-design memos, and realizable-pay comparisons across named peer groups.

Proxy advisory and governance research analysts

ISS, Glass Lewis, and internal stewardship teams use director biographies, committee assignments, overboarding data, equity plan dilution metrics, and shareholder proposal text to produce vote recommendations and governance scorecards. Clawback policies, independence, and diversity disclosures feed recurring scoring models.

Securities lawyers and disclosure counsel

Outside counsel and in-house disclosure teams use the corpus as a precedent bank when drafting CD&A narratives, risk oversight sections, perquisite disclosures, change-in-control arrangements, and opposition statements to shareholder proposals. Supports first-draft generation and disclosure gap reviews against current peer conventions.

Activist investors and activist-defense advisers

Activist research desks, defense banks, law firms, and proxy solicitors map board vulnerabilities: staggered boards, advance notice bylaws, proxy access, CEO pay outliers, weak Say-on-Pay history, and ownership concentration from beneficial ownership tables. Feeds target screens, white papers, and contested-election scenarios.

Fundamental equity analysts and portfolio managers

Buy-side and sell-side analysts consult proxies for incentive metrics, equity dilution, insider ownership, and related-party exposure that the 10-K does not cover. Informs investment theses where management incentives or capital allocation materially affect valuation, plus engagement letters before annual meetings.

Forensic accountants and fraud researchers

Forensic teams mine proxies for red flags that complement 10-K review: related-party transactions with officers or principal stockholders, auditor changes, unusual perquisites, retroactive equity grants, and employment agreements with related parties. Longitudinal comparison across years at one issuer is the core workflow.

Internal executive compensation and HR analytics teams

Total-rewards groups at registrants benchmark their own pay, equity plan features, and severance arrangements against self-selected peers. Summary Compensation Table data, share-reserve histories, and pay ratio inputs feed committee decks and plan-design stress tests.

Investor relations and proxy solicitors

IR and solicitation teams study peer vote outcomes, engagement cadence disclosures, and management proposal framing to prepare shareholder communications, meeting Q&A, and vote projections.

Academic researchers in corporate finance and governance

Researchers use the corpus for large-sample studies of pay-for-performance sensitivity, board composition, director networks, diversity, voting behavior, and activism. Full 1994-to-present coverage supports panel data and event studies around plan adoption, director turnover, and proxy contests.

Quantitative and alternative-data teams

These teams extract structured features from compensation tables, director networks, and ownership concentrations to build governance factors for return, volatility, and accounting-quality signals.

M&A and private equity diligence teams

Diligence teams quantify the change-in-control cost stack: accelerated vesting, severance multipliers, excise tax gross-ups, and Item 402(t) golden-parachute disclosures. They also review D&O indemnification and related-party agreements that must be unwound at close. The results feed transaction cost models and retention-package design.

LLM and RAG developers

Teams building governance copilots, CD&A summarizers, and compensation-extraction pipelines use the full filings plus exhibits in HTML and PDF as a training and evaluation corpus for proxy question answering and structured table extraction.

Specific Use Cases

The workflows below illustrate how the Form DEF 14A Files dataset is put to operational use. Each example ties a specific record component to a concrete output.

Peer-pay benchmarking from Summary Compensation Tables

Compensation consultants and internal total-rewards teams extract the Summary Compensation Table, Grants of Plan-Based Awards, and Outstanding Equity Awards tables from the primary .htm across a self-selected peer list, and join them to the Item 402(v) Pay-Versus-Performance block read from the inline XBRL ecd:* facts. The result feeds Say-on-Pay recommendations, realizable-pay comparisons, and committee decks sized against the same fiscal window the peers disclose.

Pay-Versus-Performance extraction via inline XBRL

Quant and governance analytics teams parse the <ix:nonFraction> and <ix:nonNumeric> elements tagged with the ecd taxonomy inside the primary proxy to pull Compensation Actually Paid, Total Shareholder Return, peer-group TSR, net income, and the company-selected measure for each year in the rolling window. This produces a cross-sectional pay-for-performance panel without layout-dependent HTML table scraping.
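
As a rough sketch of this extraction, the standard-library HTML parser is enough to collect the tagged facts without an XBRL toolkit. The class and helper names are hypothetical, the concept names used in any sample input are illustrative, and the numeric conversion assumes plain comma-grouped values with the standard scale and sign attributes; a production pipeline would more likely use an iXBRL-aware parser that also resolves contexts and format attributes.

```python
from html.parser import HTMLParser

class EcdFactParser(HTMLParser):
    """Collect ix:nonFraction / ix:nonNumeric facts whose name starts with 'ecd:'."""

    # html.parser lowercases tag and attribute names.
    FACT_TAGS = {"ix:nonfraction", "ix:nonnumeric"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.facts = []
        self._open = []  # stack of facts whose closing tag has not been seen yet

    def handle_starttag(self, tag, attrs):
        if tag in self.FACT_TAGS:
            a = dict(attrs)
            self._open.append({
                "name": a.get("name") or "",
                "context": a.get("contextref"),
                "scale": a.get("scale"),
                "sign": a.get("sign"),
                "text": "",
            })

    def handle_data(self, data):
        # Text inside nested markup (spans, fonts) still belongs to the open fact(s).
        for fact in self._open:
            fact["text"] += data

    def handle_endtag(self, tag):
        if tag in self.FACT_TAGS and self._open:
            fact = self._open.pop()
            fact["text"] = fact["text"].strip()
            if fact["name"].startswith("ecd:"):
                self.facts.append(fact)

def to_number(fact):
    """Convert a numeric fact to float, assuming comma-grouped digits."""
    digits = fact["text"].replace(",", "").replace("$", "").strip()
    value = float(digits) * 10 ** int(fact["scale"] or 0)
    return -value if fact["sign"] == "-" else value

def extract_ecd_facts(html_text):
    parser = EcdFactParser()
    parser.feed(html_text)
    parser.close()
    return parser.facts
```

Calling extract_ecd_facts on the text of the primary .htm returns one dictionary per ecd fact; grouping those by context then gives one row per fiscal year in the pay-versus-performance window.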

Say-on-Pay and director-election vote modeling

Proxy solicitors and activist-defense advisers pair each DEF 14A with the matching Form 8-K Item 5.07 outcome, keyed on registrant CIK and periodOfReport (meeting date). The proxy supplies proposal text, board recommendation, committee memberships, and overboarding disclosures; the 8-K supplies the tally. The combined panel drives vote projections, failed-vote watchlists, and director-level support trends.
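
A minimal sketch of that pairing follows. It assumes flat records: periodOfReport comes from metadata.json, while the cik key and the meetingDate key on the 8-K side are hypothetical names chosen for illustration. Because periodOfReport can carry a record date rather than the meeting date, the sketch matches each proxy to the nearest vote result on or after that date within a tolerance window instead of requiring exact equality.

```python
from datetime import date

def pair_with_vote_results(proxies, vote_results, max_gap_days=90):
    """Pair each proxy with the closest same-CIK vote result dated on or after periodOfReport.

    proxies:      iterable of dicts with 'cik' and 'periodOfReport' (YYYY-MM-DD)
    vote_results: iterable of dicts with 'cik' and 'meetingDate' (YYYY-MM-DD, hypothetical key)
    Returns a list of (proxy, result_or_None) tuples.
    """
    by_cik = {}
    for result in vote_results:
        by_cik.setdefault(result["cik"], []).append(result)

    pairs = []
    for proxy in proxies:
        anchor = date.fromisoformat(proxy["periodOfReport"])
        best = None  # (gap_in_days, result)
        for result in by_cik.get(proxy["cik"], []):
            gap = (date.fromisoformat(result["meetingDate"]) - anchor).days
            if 0 <= gap <= max_gap_days and (best is None or gap < best[0]):
                best = (gap, result)
        pairs.append((proxy, best[1] if best else None))
    return pairs
```

The 90-day default is an arbitrary illustrative tolerance; unmatched proxies come back paired with None so they can be reviewed rather than silently dropped.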

Change-in-control cost models for M&A diligence

Transaction teams pull the Potential Payments upon Termination or Change in Control section and the Item 402(t) golden-parachute disclosure from target proxies to quantify accelerated vesting, severance multipliers, and excise tax gross-ups. Material equity plan exhibits attached as EX-10.* documents inside the same accession folder supply the underlying plan text used to validate acceleration triggers and retention-package design.

Precedent mining for disclosure counsel

Securities lawyers query the full-text corpus for CD&A narratives, clawback policy language, perquisite disclosures, and opposition statements to Rule 14a-8 shareholder proposals filed by peers in the prior season. Because the dataset preserves every textual exhibit and the full proxy body (not extracted sections), counsel can pull complete passages in surrounding context for first-draft generation and disclosure-gap reviews.

Governance-signal training corpus for LLM and RAG pipelines

Teams building CD&A summarizers, proposal classifiers, and compensation-extraction models use the paired metadata.json plus primary .htm (and courtesy PDF where present) as a training and evaluation corpus. The documentFormatFiles manifest, combined with stable SGML <DOCUMENT> wrappers on exhibits, provides deterministic document-type labels (DEF 14A, EX-3.1, EX-10.x, EX-99.x) for supervised extraction tasks and retrieval chunking.
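
One way to turn the SGML wrapper into labels is sketched below. It assumes the envelope places one header tag per line (<TYPE>, <SEQUENCE>, <FILENAME>, optionally <DESCRIPTION>) before the <HTML> body, and that a trailing </TEXT></DOCUMENT> pair may close the file; the function name is hypothetical.

```python
import re

# One header tag per line, e.g. "<TYPE>EX-99.1"
HEADER_TAG = re.compile(r"^<(TYPE|SEQUENCE|FILENAME|DESCRIPTION)>(.*)$", re.MULTILINE)
# Optional closing tags of the envelope at the very end of the file
TRAILER = re.compile(r"</TEXT>\s*</DOCUMENT>\s*$", re.IGNORECASE)

def split_sgml_envelope(raw: str):
    """Return (header_dict, html_body) for an exhibit wrapped in an SGML <DOCUMENT> envelope.

    If no <HTML> tag is found, the header is empty and the input is returned as the body.
    """
    match = re.search(r"<html\b", raw, re.IGNORECASE)
    start = match.start() if match else 0
    header_text, body = raw[:start], raw[start:]
    header = {tag.lower(): value.strip() for tag, value in HEADER_TAG.findall(header_text)}
    return header, TRAILER.sub("", body)
```

The header's type value can serve as the document-type label, and sequence or filename can be checked against the documentFormatFiles manifest before the body is handed to an HTML parser or chunker.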

Longitudinal red-flag screening for forensic review

Forensic accountants iterate the same registrant's DEF 14A filings year over year to flag newly disclosed related-person transactions under Item 404, auditor changes reflected in the audit committee report and fee table, retroactive or off-cycle equity grants in the Grants of Plan-Based Awards table, and unusual perquisite growth in the Summary Compensation Table. The 1994-to-present coverage supports multi-decade panels around a single CIK.

Dataset Access

The Form DEF 14A Files dataset is available through a JSON metadata endpoint, a full archive download, and per-container downloads. Filings are packaged as monthly ZIP containers and cover submissions from January 1994 to the present.

Dataset Index JSON API: https://api.sec-api.io/datasets/form-def-14a-files.json

This endpoint returns dataset-level metadata (name, description, last updated timestamp, earliest sample date, total records, total size, form types, container format, and file types), the full dataset download URL, and the list of all individual container files with their size, record count, updated timestamp, and download URL. It is useful for monitoring which containers changed in the most recent daily refresh and for deciding which containers to pull incrementally. This endpoint does not require an API key.

Example response:

{
  "datasetId": "1f13365b-9ae0-68e1-a294-5c6dfc349088",
  "datasetDownloadUrl": "https://api.sec-api.io/datasets/form-def-14a-files.zip",
  "name": "Form DEF 14A Files Dataset",
  "updatedAt": "2026-04-16T02:57:49.783Z",
  "earliestSampleDate": "1994-01-01",
  "totalRecords": 204986,
  "totalSize": 22305757239,
  "formTypes": ["DEF 14A"],
  "containerFormat": "ZIP",
  "fileTypes": ["TXT", "JSON", "HTML", "PDF"],
  "containers": [
    {
      "downloadUrl": "https://api.sec-api.io/datasets/form-def-14a-files/2026/2026-04.zip",
      "key": "2026/2026-04.zip",
      "size": 13818783,
      "records": 154,
      "updatedAt": "2026-04-16T02:57:49.783Z"
    }
  ]
}
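
The index can be used to decide which containers to re-pull after a refresh. The sketch below relies only on the fields shown in the example response (containers, key, updatedAt); the function names and the idea of persisting a "last sync" timestamp are assumptions for illustration.

```python
import json
from datetime import datetime
from urllib.request import urlopen

INDEX_URL = "https://api.sec-api.io/datasets/form-def-14a-files.json"

def fetch_index(url=INDEX_URL):
    """Download and parse the dataset index (no API key needed for this endpoint)."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)

def parse_ts(value):
    """Parse timestamps such as '2026-04-16T02:57:49.783Z' into aware datetimes."""
    # Replace the trailing 'Z' so older Python versions of fromisoformat accept it.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))

def changed_containers(index, since):
    """Return container entries whose updatedAt is later than the 'since' timestamp."""
    cutoff = parse_ts(since)
    return [c for c in index.get("containers", []) if parse_ts(c["updatedAt"]) > cutoff]
```

A typical loop would call fetch_index(), pass the result to changed_containers with the timestamp of the previous sync, and download only the returned keys.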

Download Entire Dataset: https://api.sec-api.io/datasets/form-def-14a-files.zip?token=YOUR_API_KEY

Downloads the complete dataset as a single ZIP archive containing all monthly containers from 1994 onward. This endpoint requires an API key.

Download Single Container: https://api.sec-api.io/datasets/form-def-14a-files/2026/2026-04.zip?token=YOUR_API_KEY

Downloads one monthly container ZIP, which is useful for incremental updates or when only a specific time window is needed. This endpoint requires an API key.
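
A hedged sketch of pulling one container and grouping its members by accession folder is shown below. The URL pattern and token query parameter come from the endpoint above; the function names are hypothetical, and the sketch assumes each file's immediate parent directory inside the ZIP is the accession folder. It reads the whole archive into memory, which suits small months but not the largest containers, where streaming to disk would be the safer choice.

```python
import io
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://api.sec-api.io/datasets/form-def-14a-files"

def container_url(key, token):
    """Build the download URL for a container key such as '2026/2026-04.zip'."""
    return f"{BASE_URL}/{key}?{urlencode({'token': token})}"

def download_container(key, token):
    """Download one monthly container and open it as an in-memory ZipFile."""
    with urlopen(container_url(key, token), timeout=300) as resp:
        return zipfile.ZipFile(io.BytesIO(resp.read()))

def group_by_accession(zf):
    """Map each accession folder name to the list of filenames it contains."""
    folders = defaultdict(list)
    for name in zf.namelist():
        if name.endswith("/"):  # skip directory entries
            continue
        path = PurePosixPath(name)
        folders[path.parent.name].append(path.name)
    return dict(folders)
```

For example, group_by_accession(download_container("2026/2026-04.zip", "YOUR_API_KEY")) would yield one entry per filing, each listing metadata.json, the primary .htm, and any exhibits or courtesy PDF.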

Frequently Asked Questions

What form does this dataset cover?

The dataset covers Form DEF 14A, the definitive proxy statement furnished under Section 14(a) of the Securities Exchange Act of 1934 and Rule 14a-3. It excludes the related form types PRE 14A, DEFA14A, DEFM14A, DEFC14A, and DEFR14A, each of which is filed under a separate form code.

What does one record in this dataset represent?

One record is one complete EDGAR submission of a DEF 14A, uniquely keyed by SEC accession number and materialized as a single folder whose name is the 18-digit dashless accession. The folder holds a metadata.json descriptor, the primary Inline XBRL HTML proxy statement, any textual exhibits filed under the accession, and — optionally — a registrant-supplied courtesy PDF.

Who is required to file Form DEF 14A?

DEF 14A is filed by parties soliciting votes, consents, or authorizations from holders of a class registered under Section 12 of the Exchange Act — overwhelmingly the issuer itself. The population covers domestic operating companies listed under Section 12(b), Section 12(g) registrants, closed-end funds, business development companies, REITs, and similar Section 12-registered entities. Foreign private issuers and Section 15(d)-only reporters do not file DEF 14A.

What time period does the dataset cover?

Coverage begins January 1, 1994, aligning with the EDGAR rollout of mandatory electronic proxy filing, and extends to the present. Earlier proxy statements exist only in the SEC's paper archives and are not part of the dataset.

What file format is the dataset distributed in?

The dataset is distributed as monthly ZIP containers. Each container unpacks into per-accession folders that contain a metadata.json descriptor, one or more .htm files (the primary Inline XBRL proxy plus any textual exhibits wrapped in SGML <DOCUMENT> envelopes), and, for a small minority of filings, a courtesy .pdf. Image graphics and XBRL linkbase sidecars are referenced in metadata but not shipped inside the ZIP.

How does this dataset differ from a DEFA14A or PRE 14A dataset?

DEF 14A is the definitive proxy statement mailed to shareholders and used for the vote. PRE 14A is the earlier preliminary draft filed when SEC staff review is expected on non-routine matters. DEFA14A is the stream of additional soliciting materials — press releases, investor decks, advisor rebuttals — filed after the definitive proxy. This dataset contains only DEF 14A submissions; reconstructing a full meeting file requires pulling companions from adjacent datasets.

How often is the dataset refreshed?

Containers are updated on a daily refresh cadence. The JSON index endpoint at https://api.sec-api.io/datasets/form-def-14a-files.json reports the updatedAt timestamp for each monthly container, which is the recommended way to detect which containers changed and to pull incremental updates.