The Form PREA14A Files Dataset is a closed, terminal archive of every EDGAR submission ever accepted under form type PREA14A — revised or additional preliminary proxy soliciting material filed under Regulation 14A of the Securities Exchange Act of 1934. A single record represents one complete EDGAR submission, identified by a unique accession number, packaged as a JSON metadata sidecar plus one or more plain-text body documents reconstructed from the original SGML. The form was active on EDGAR only from February 1994 through February 2000, when its function was absorbed into other Regulation 14A submission codes (PRER14A for revised preliminaries, DEFA14A for post-definitive additions), so the record universe is fixed and no new filings will ever be added. Filers are registrants and other soliciting persons subject to Section 14(a) — public-company issuers, closed-end funds and BDCs, and third-party contestants in proxy contests — who needed to put revised or supplemental preliminary material in front of SEC staff before mailing the definitive proxy statement (DEF 14A) to shareholders. The dataset is distributed as monthly ZIP containers holding per-accession folders of TXT body documents and a metadata.json sidecar.
This dataset packages every EDGAR submission of form type PREA14A — preliminary additional or revised preliminary proxy soliciting material filed under Rule 14a-6 of the Exchange Act. Two documentary roles flow through the single form type. The first is a revised preliminary proxy statement, a re-issued Schedule 14A preliminary proxy statement typically filed after SEC staff comments on an earlier PRE 14A and submitted for further staff review before the registrant distributes the definitive proxy statement to shareholders. The second is additional soliciting material accompanying a preliminary proxy — supplemental letters to shareholders, press releases, media transcripts, or notices of meeting adjournment submitted for staff review as part of the same preliminary phase. In both roles the underlying paper document conforms to Schedule 14A: a structured cover page with the prescribed checkbox grid, a registrant identification block, a filing-fee election and computation table, a notice of the shareholders' meeting, and a body addressing the matters to be voted on.
Because PREA14A was retired as a submission type in February 2000, the dataset is closed and terminal. The record universe is fixed at the filings EDGAR accepted under this designation, all packaged identically. Every record is from the pre-modern EDGAR era: submissions were authored as flat ASCII inside an SGML wrapper, hard-wrapped at roughly 60–72 characters per line. The format did not evolve within the form's lifecycle: from the first 1994 filings to the last 2000 filings, PREA14A submissions remained plain text inside SGML, so the record set never passes through the ASCII-to-HTML-to-iXBRL transition. The file types present in any record are limited to JSON (the sidecar) and TXT (the body documents); HTML, XBRL, iXBRL, and image formats are absent, either because they did not yet exist on EDGAR during the form's active years or because they were explicitly excluded from packaging (images).
A single record in the Form PREA14A Files dataset is one complete EDGAR submission of form type PREA14A — one filing of revised or additional preliminary proxy soliciting material made by a registrant under Regulation 14A of the Securities Exchange Act of 1934, identified by a unique EDGAR accession number. Each record bundles two layers: a structured JSON sidecar (metadata.json) carrying the filing's header-level facts (form type, accession, filer identity, filing date, document inventory, EDGAR back-links) and one or more plain-text body documents (document-1.txt, document-2.txt, ...) carrying the proxy content extracted from the original SGML submission. Together these files reconstruct the SEC submission as it appeared on EDGAR between February 1994 and February 2000, minus image attachments.
Records live inside monthly ZIP containers organized by calendar year. The path hierarchy from container down to a single record is:
<year>/<year>-<month>.zip
└── <year>-<month>/
    └── <accession-no-undashed>/
        ├── metadata.json
        └── document-1.txt (and document-2.txt, ... if multi-document)
Each accession folder is named with the 18-digit dash-stripped EDGAR accession number (for example 000095014994000088 corresponds to the canonical dashed form 0000950149-94-000088). Inside that folder, exactly one metadata.json is always present, accompanied by one or more document-N.txt files whose numeric suffix mirrors the sequence value EDGAR assigned to each document inside the original SGML submission. There is no top-level manifest or per-month index; the folder structure itself is the index.
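Because the folder structure is the only index, locating a record means converting between the dashed and un-dashed accession forms and building the container path by hand. The following is a minimal sketch of that conversion; the function names are illustrative and not part of any published client library.

```python
import re

def undash(accession_no: str) -> str:
    """Convert '0000950149-94-000088' to the 18-digit folder name."""
    digits = accession_no.replace("-", "")
    if not re.fullmatch(r"\d{18}", digits):
        raise ValueError(f"not an 18-digit accession number: {accession_no!r}")
    return digits

def dash(folder_name: str) -> str:
    """Convert an 18-digit folder name to the NNNNNNNNNN-YY-NNNNNN form."""
    if not re.fullmatch(r"\d{18}", folder_name):
        raise ValueError(f"not an 18-digit folder name: {folder_name!r}")
    return f"{folder_name[:10]}-{folder_name[10:12]}-{folder_name[12:]}"

def container_key(year: int, month: int) -> str:
    """Relative key of a monthly container, e.g. '1994/1994-07.zip'."""
    return f"{year:04d}/{year:04d}-{month:02d}.zip"

def record_folder(year: int, month: int, accession_no: str) -> str:
    """Folder of one record inside its monthly container."""
    return f"{year:04d}-{month:02d}/{undash(accession_no)}"
```

The year and month passed to these helpers are the container's, which the caller has to know or discover by scanning containers; the accession number alone does not encode the filing month.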
metadata.json sidecar
metadata.json is a flat JSON object describing one filing. Its top-level fields are:
- formType: the literal string "PREA14A" for every record.
- accessionNo: the canonical dashed accession identifier in the format NNNNNNNNNN-YY-NNNNNN. Note this differs from the un-dashed accession-folder name.
- id: a 32-character hexadecimal hash serving as the internal dataset record identifier.
- description: a human-readable expansion of the form type, typically "Form PREA14A - Additional Preliminary Proxy Solicitation Material".
- filedAt: an ISO-8601 timestamp with an Eastern-time offset (-04:00 or -05:00 depending on daylight saving). The time component is pinned to T00:00:00 for these records; only the date carries information.
- linkToFilingDetails, linkToTxt, linkToHtml: EDGAR back-references to the filing's archive folder, its combined .txt submission, and its -index.htm index page respectively.
- documentFormatFiles: an array of document descriptors. Each element carries sequence, size, documentUrl, description, and type. The array typically contains one entry per real document (with sequence "1", "2", ... and a meaningful type such as "PREA14A") plus one trailing entry describing the combined .txt submission, where both sequence and type are the single space character " " rather than null or an empty string. This is an EDGAR header-parser convention for the "complete submission text file" pseudo-document. The description on the primary entry is the most reliable signal of what the document actually is (e.g., "REVISED PRELIMINARY PROXY STATEMENT", supplemental soliciting text, or a notice of meeting adjournment).
- entities: an array of entity records, one per party named in the EDGAR header (filer and any subject companies, with a role suffix such as (Filer) appended to companyName). Each entity carries cik, companyName, irsNo, fileNo, filmNo, sic (numeric code concatenated with its human-readable label), stateOfIncorporation, fiscalYearEnd (as MMDD), act ("34" for Exchange Act filings), type, and an optional tickers array. The tickers key is omitted for entities that have no associated ticker rather than being present-but-empty.

Three modern structural fields are present in the schema but uniformly empty across this dataset, reflecting the 1994–2000 vintage:
- linkToXbrl: always an empty string.
- dataFiles: always an empty array.
- seriesAndClassesContractsInformation: always an empty array; the series/class disclosure framework for investment-company filers post-dates these filings.

Each document-N.txt file holds the plain-text content of one document from the original SGML submission. The outer SGML wrapper (<DOCUMENT> / <TYPE> / <SEQUENCE> / <FILENAME> / <TEXT> ... </TEXT> / </DOCUMENT>) has been stripped, and the wrapper's metadata (TYPE, SEQUENCE, FILENAME, DESCRIPTION) has been hoisted into metadata.json → documentFormatFiles[*]. The body file therefore begins directly with the proxy content. Original EDGAR filenames are replaced with a synthetic document-<sequence>.txt derived from the sequence value; single-document records appear as document-1.txt.
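The relationship between documentFormatFiles and the document-N.txt files can be sketched as follows. This is an illustration under stated assumptions, not a reference loader: the helper names are hypothetical, and reading the files as UTF-8 with replacement is a guess that should be safe for flat ASCII bodies.

```python
import json
from pathlib import Path

def real_documents(metadata: dict) -> list:
    """Descriptors of actual documents, skipping the trailing
    'complete submission text file' entry whose sequence is ' '."""
    return [
        d for d in metadata.get("documentFormatFiles", [])
        if (d.get("sequence") or "").strip()
    ]

def document_filenames(metadata: dict) -> list:
    """Synthetic body filenames derived from each sequence value."""
    return [f"document-{d['sequence'].strip()}.txt" for d in real_documents(metadata)]

def load_record(folder: Path):
    """Read metadata.json plus whichever body files are present in one
    accession folder. Returns (metadata, {filename: text})."""
    metadata = json.loads((folder / "metadata.json").read_text(encoding="utf-8"))
    bodies = {}
    for name in document_filenames(metadata):
        path = folder / name
        if path.exists():
            bodies[name] = path.read_text(encoding="utf-8", errors="replace")
    return metadata, bodies
```

The pseudo-document entry is filtered for the purpose of locating body files only; the metadata dictionary itself is returned untouched so the " " values remain visible to downstream consumers.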
A substantive PREA14A body opens with the prescribed Schedule 14A cover page. A characteristic opening looks like:
<PAGE> 1

SCHEDULE 14A INFORMATION

PROXY STATEMENT PURSUANT TO SECTION 14(A) OF THE SECURITIES EXCHANGE ACT OF 1934

Filed by the Registrant [X]

Filed by a Party other than the Registrant [ ]

Check the appropriate box:

[X] Revised Preliminary Proxy Statement
[ ] Definitive Proxy Statement
[ ] Definitive Additional Materials
[ ] Soliciting Material Pursuant to Section 240.14a-11(c) or Section 240.14a-12
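After the cover header, a substantive document follows the canonical Schedule 14A ordering: the registrant identification block, the filing-fee election and computation table, the notice of the shareholders' meeting, and a body addressing the matters to be voted on.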
For an "additional" PREA14A — a supplemental soliciting letter, press release, transcript of a media appearance, or notice of meeting adjournment — the body is typically much shorter, often only a paragraph or two of supplemental disclosure prefaced by the Schedule 14A cover, and the conventional proxy-statement body sections (compensation tables, ownership tables, proposal descriptions) are absent.
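One way to tell the two roles apart without reading the whole body is to look at which boxes are ticked on the Schedule 14A cover. The sketch below assumes boxes are rendered as [X] and [ ], as in the sample cover shown earlier; filings that draw their boxes differently would need additional patterns. The function name and the 4,000-character scan window are arbitrary choices for illustration.

```python
import re

# A box before its label: "[X] Revised Preliminary Proxy Statement"
LEADING = re.compile(
    r"^[ \t]*\[[ \t]*([Xx]?)[ \t]*\][ \t]*(\S.*?)[ \t]*$", re.MULTILINE
)
# A box after its label: "Filed by the Registrant [X]"
TRAILING = re.compile(
    r"^[ \t]*(\S.*?)[ \t]*\[[ \t]*([Xx]?)[ \t]*\][ \t]*$", re.MULTILINE
)

def cover_checkboxes(body: str, scan_chars: int = 4000) -> dict:
    """Map each checkbox label near the top of the body to True/False."""
    head = body[:scan_chars]
    boxes = {}
    for mark, label in LEADING.findall(head):
        boxes[label] = bool(mark)
    for label, mark in TRAILING.findall(head):
        boxes[label] = bool(mark)
    return boxes
```

The result is a heuristic hint only. The description on the primary documentFormatFiles entry remains the more reliable signal of what a record contains.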
Throughout, the body is interleaved with <PAGE> paginator markers — lines such as <PAGE> 1, <PAGE> 2, and so on. These are legacy EDGAR pagination sentinels retained inside the extracted text rather than true SGML elements with closing tags; they should be treated as page-break markers.
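Treating the markers as page breaks can be sketched with a single regular expression. This assumes each marker sits on its own line in the form <PAGE> followed by an optional number, as in the examples above; the function names are illustrative.

```python
import re

# A line consisting only of "<PAGE>" and an optional page number.
PAGE_MARKER = re.compile(r"^[ \t]*<PAGE>[ \t]*\d*[ \t]*$", re.MULTILINE)

def split_pages(body: str) -> list:
    """Split a body on <PAGE> lines, dropping the markers and empty pages."""
    pieces = PAGE_MARKER.split(body)
    return [p.strip("\n") for p in pieces if p.strip()]

def strip_page_markers(body: str) -> str:
    """Remove the marker lines but keep the text as one string."""
    return PAGE_MARKER.sub("", body)
```

Splitting is useful when page boundaries matter, for example to isolate the cover page; stripping is the simpler choice before full-text indexing.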
Each record packages:
- One metadata.json sidecar, fully populated with header-derived structured fields.
- One document-N.txt file for each <DOCUMENT> block in the original SGML.

The following are not carried over:
- Image attachments from the original submission are not preserved, though textual references to them inside the prose remain.
- The SGML envelope (<SEC-DOCUMENT>, <SEC-HEADER>, and per-document <DOCUMENT>...</DOCUMENT> wrappers) is not retained verbatim in any file. Its header content has been parsed into metadata.json; its per-document wrapper metadata has been parsed into documentFormatFiles[*]; and the inner <TEXT> payload has been written out as document-N.txt.
- XBRL data and series/class information are absent (linkToXbrl is always an empty string, dataFiles is always an empty array, and seriesAndClassesContractsInformation is always an empty array).

The PREA14A submission type was used on EDGAR only from February 1994 through February 2000. After that the SEC consolidated additional preliminary proxy soliciting material into other Regulation 14A submission categories, and PREA14A was retired. Consequently:
- Body text is hard-wrapped ASCII. Duplicated passages, where they occur, are an artifact of the original <TEXT> block, not a packaging error, and line breaks frequently fall mid-clause.
- The documentFormatFiles[0].description field is the most reliable single signal of what a record actually contains, because the form type alone (PREA14A) covers a range of sub-uses: a full revised preliminary proxy statement, a brief notice of meeting adjournment, a supplemental letter to shareholders, and a freestanding soliciting transcript can all share the same formType value.
- A documentFormatFiles entry whose sequence and type are both " " (a single space character) is a verbatim preservation of the EDGAR header-parser convention for the "complete submission text file" pseudo-document. Consumers should expect this entry and not normalize it to null or empty string.
- <PAGE> markers inside document-N.txt are page-break sentinels, not true SGML tags; they have no closing form and should be stripped or used as paragraph-break heuristics during text extraction.
- The entities array can contain more than one party. Where it does, role suffixes appended to companyName (for example (Filer), (Subject)) carry the distinction; the suffix is part of the string and must be parsed off if the bare legal name is needed.
- The sic field concatenates the four-digit code with its human label and occasionally contains typographical artifacts carried over from the original EDGAR header (for example a misspelled industry label such as "6512 Opeators of Nonresidential Buildings"). Parsers that need a clean SIC code should split on the leading numeric prefix.
- Accession numbers appear in two forms: un-dashed (the folder name, e.g., 000095014994000088) and canonically dashed (the accessionNo field inside metadata.json, e.g., 0000950149-94-000088). Cross-referencing across these two forms requires normalizing one to the other.

The filing population is limited to persons subject to Section 14(a) of the Exchange Act and Regulation 14A when they solicit proxies, consents, or authorizations with respect to a security registered under Section 12 of the Exchange Act. In practice that means public-company issuers, closed-end funds and BDCs, and third-party contestants in proxy contests.
A single contested solicitation can therefore produce PREA14A filings from more than one party; each EDGAR submission identifies its own filer in the header.
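Since each submission names its own filer in the header, and the role lives inside the companyName string, separating the parties means parsing that suffix off. The sketch below also splits the sic field on its leading numeric prefix. Both helpers are illustrative; the company name in the usage note is made up, and a legal name that itself ends in parentheses would be misread as a role.

```python
import re
from typing import Optional, Tuple

ROLE_SUFFIX = re.compile(r"^(?P<name>.*?)\s*\((?P<role>[^()]+)\)\s*$")
SIC_PREFIX = re.compile(r"^\s*(?P<code>\d{4})\s*(?P<label>.*?)\s*$")

def split_company_name(company_name: str) -> Tuple[str, Optional[str]]:
    """Return (bare_name, role), where role is e.g. 'Filer' or None."""
    m = ROLE_SUFFIX.match(company_name)
    if m:
        return m.group("name"), m.group("role")
    return company_name.strip(), None

def split_sic(sic: Optional[str]) -> Tuple[Optional[str], str]:
    """Return (four_digit_code, label); code is None if no prefix is found.
    The label is passed through as-is, typos included."""
    m = SIC_PREFIX.match(sic or "")
    if m:
        return m.group("code"), m.group("label")
    return None, (sic or "").strip()

def filers(metadata: dict) -> list:
    """Bare names of entities whose role suffix is 'Filer'."""
    out = []
    for entity in metadata.get("entities", []):
        name, role = split_company_name(entity.get("companyName", ""))
        if role == "Filer":
            out.append(name)
    return out
```

For example, split_company_name("EXAMPLE HOLDINGS INC (Filer)") separates the hypothetical name from its role, and split_sic keeps the misspelled label from the header rather than attempting to correct it.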
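PREA14A is event-driven, not periodic. A filing is generated only when a soliciting person needs to put revised or additional preliminary material in front of SEC staff before disseminating the definitive proxy statement. The two principal triggers are SEC staff comments on an earlier PRE 14A, which lead to a revised preliminary proxy statement, and supplemental soliciting material (letters to shareholders, press releases, media transcripts, or notices of meeting adjournment) submitted for staff review during the same preliminary phase.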
Timing fits within the Rule 14a-6(a) window: preliminary copies of the proxy statement and form of proxy must be filed at least 10 calendar days before definitive copies are first sent to security holders, unless an exemption applies. PREA14A submissions sit inside that 10-day window, between the initial PRE 14A and the eventual DEF 14A. The cadence of any given filing is driven by the staff-comment iteration cycle or by the filer's own revision schedule; some solicitations produced no PREA14A filings and others produced several.
PREA14A itself was an EDGAR taxonomy code — the identifier the Commission used to distinguish preliminary additional or revised preliminary proxy submissions from the initial PRE 14A and the definitive DEF 14A.
PREA14A sits at a specific cell of a two-dimensional grid: preliminary vs definitive on one axis, initial vs additional/revised on the other. It occupies the preliminary × additional/revised cell, used on EDGAR only from February 1994 to February 2000 before its function dispersed into PRER14A and DEFA14A. The most useful comparisons are to the other cells of that grid and to nearby soliciting-material codes.
PRE 14A sits in the same row as PREA14A (preliminary, pre-mailing, addressed to SEC staff), but in the initial column. It is filed under Rule 14a-6(a) at least 10 calendar days before mailing the definitive proxy when non-routine matters are on the ballot. PRE 14A is the first preliminary draft; PREA14A is the supplemental or revised material layered on top of it. For the base preliminary text, use PRE 14A; for the iteration history, use PREA14A.
PRER14A is the modern successor for revised preliminary material and the closest functional twin to PREA14A. Both can carry revisions to pre-mailing proxy content. The difference is generational: after February 2000, revised preliminary proxies are filed as PRER14A, while PREA14A no longer exists. For the 1994–2000 window, conceptually similar revisions live only under PREA14A; afterward, only under PRER14A.
DEF 14A sits in the same column as PRE 14A (initial), but in the definitive row. It is the mailable, shareholder-facing document that controls the actual solicitation. Content overlaps with PREA14A (agenda, compensation tables, nominees, plan descriptions), but DEF 14A reflects the post-comment, post-revision version shareholders received. PREA14A exposes the drafting layer that DEF 14A flattens away.
DEFA14A is the structural mirror of PREA14A across the preliminary/definitive line: same additional/revised column, opposite row. DEFA14A carries supplemental letters, press releases, investor decks, and contest pieces filed after the definitive proxy is out, and it is itself distributed to shareholders. PREA14A is submitted before mailing, for SEC staff review, never to shareholders. After February 2000, post-definitive additions flow to DEFA14A and revised preliminaries to PRER14A; together the two codes absorb PREA14A's role.
PX14A6G and PX14A6N are filed by shareholders, activists, or proxy advisors, not by the registrant: PX14A6G carries written exempt solicitation material under Rule 14a-6(g), and PX14A6N is a notice of exempt solicitation. They share the "soliciting material" label with PREA14A but differ in filer population (non-management vs registrant), regime (exempt solicitation vs Rule 14a-6(a) pre-mailing review), and audience (public/shareholder vs SEC staff). They are not part of the registrant's drafting cycle.
Form 8-K Item 5.07 sits downstream of the entire proxy cycle: a post-meeting, event-driven tabular report of vote counts (for, against, abstain, broker non-vote) per proposal. PREA14A holds the draft questions before mailing; Item 5.07 holds the answers after the vote. The two are complementary across the timeline, never interchangeable.
Schedules 13D and 13G, and their amendments, are beneficial-ownership filings under Sections 13(d) and 13(g), filed by holders crossing the 5 percent threshold, not by the registrant. A 13D/A may disclose intent to oppose management or run a slate, which can intersect a contested vote, but the regime, filer, and content (Schedule 13D / Schedule 13G ownership, purpose, agreements) are unrelated to the registrant's proxy drafting captured in PREA14A.
PREA14A is the only EDGAR code from 1994–2000 that combines two attributes: preliminary (pre-mailing, for SEC staff, not for shareholders) and additional or revised (layered on a prior PRE 14A, not the initial submission). It is always a registrant or soliciting-person filing, narrative in form, and historically bounded — a closed archive. It cannot substitute for PRE 14A (initial preliminary), DEF 14A (mailed definitive), DEFA14A (post-definitive additions), or PRER14A (post-2000 revised preliminaries); each occupies a different cell of the proxy-filing grid.
The closed Feb 1994 – Feb 2000 archive supports a narrow, historically oriented audience rather than real-time research. The following roles get concrete value from it.
Proxy disclosure counsel use the archive to trace the historical use of preliminary additional soliciting material under Regulation 14A. Counsel read the Schedule 14A cover, the notice of meeting, and any transmittal language to identify what prompted a refile rather than a jump to definitive proxy. Edits visible in the executive compensation tables and proposal responses reveal staff-comment patterns. Output: precedent memos and internal training notes on the lineage of current Schedule 14A practice.
Academic and think-tank researchers treat the records as a concentrated pre-2000 slice. They mine the summary compensation, option grant, and long-term incentive tables, plus board and related-party disclosures, to study option repricing, golden parachutes, and director-independence language at the inflection point between the 1992 compensation rules and post-2000 reforms. The filedAt field anchors longitudinal comparisons; entities[].sic allows industry slicing.
EDGAR taxonomy historians use the archive to document why PREA14A existed and why its function was later dispersed into PRER14A and DEFA14A. They read formType, filedAt, and accession metadata to map the form's lifecycle, and read the document bodies to reconstruct the SGML-era staff comment-and-revise workflow that predated full-text EDGAR search.
NLP researchers and corpus builders use the records as labeled, well-understood anchor examples at the early end of a multi-decade proxy corpus (combined with PRE 14A, DEF 14A, PRER14A). The plain-text documents help calibrate parsers that must handle legacy SGML encoding, pre-XBRL tables, and pre-2000 compensation layouts. The dataset is too small for standalone modeling but useful as ground truth.
Senior staff at solicitation and IR advisory firms consult the records for institutional memory of pre-2000 contest dynamics. They focus on the matters-to-be-voted section, supplemental soliciting language, and cover narratives explaining the refile, then file the examples into internal precedent banks used to brief newer practitioners.
Compliance trainers use the records as teaching examples for associates and junior compliance officers learning Schedule 14A and Rule 14a-6 mechanics. Trainees compare the fee table, notice of meeting, compensation tables, and proposal language across the filings to see what a pre-definitive proxy package looked like and what triggered the additional preliminary filing.
Concrete workflows the PREA14A archive supports. Each use case treats the dataset as a closed historical artifact, not a streaming source.
A taxonomy historian wants to document why PREA14A existed alongside PRE 14A before its February 2000 retirement. They pull formType, filedAt, accessionNo, and documentFormatFiles[0].description across the records to plot the filings on a timeline, then read each body's Schedule 14A cover sheet and any cover transmittal language to classify each record as either a full revised preliminary proxy or a short supplemental letter. The output is a lifecycle memo mapping PREA14A onto its post-2000 successors (PRER14A and DEFA14A).
An NLP researcher assembling a multi-decade Schedule 14A corpus needs labeled 1990s examples that exercise SGML-era quirks: hard line-wrapped ASCII, <PAGE> paginator markers, ASCII compensation tables, and duplicate-paste artifacts. They use the document-N.txt files as canonical fixtures for parser regression tests, with the documentFormatFiles[*].description values in metadata.json as the type label. The records calibrate the parser's behavior on the oldest stratum of a corpus that later transitions into HTML and iXBRL filings.
A proxy disclosure counsel preparing an internal memo on the evolution of Rule 14a-6 practice contrasts PREA14A revision patterns with present-day PRER14A and DEFA14A workflow. They diff the Schedule 14A cover-page checkbox grid, the filing-fee election block, the notice of meeting, and the executive compensation tables across the records to identify what kinds of edits historically prompted a refile rather than a direct move to DEF 14A. The result is a precedent note describing how today's bifurcated post-2000 regime absorbed PREA14A's two roles.
A governance researcher studying option repricing and golden-parachute language at the 1992-rules-to-post-2000-reforms inflection point treats the dataset as a small but uniform pre-2000 slice. They extract the Summary Compensation Table, Option Grants table, Option Exercises and Year-End Values table, and beneficial-ownership table from each substantive body, joined to entities[].sic for industry slicing and entities[].stateOfIncorporation for charter-law context. Outputs feed a working paper using PREA14A records as historical anchors against later DEF 14A samples.
A compliance trainer building a CLE module on preliminary-proxy mechanics assigns trainees to walk through one record end-to-end: the checkbox grid, the registrant identification block, the fee-computation table, the notice of meeting, the proposal section, and the signature block. They compare a full revised preliminary proxy against a short "additional" supplemental letter from the same dataset to show how one formType covers two documentary roles. The exercise produces annotated reference packets used in associate onboarding.
A solicitation-firm analyst maintains an internal precedent bank of contested- and non-routine-vote situations. They index each PREA14A record by the matters-to-be-voted list in the notice of meeting, the supplemental soliciting language in any additional filing, and the entities block identifying registrant and any subject company. The precedent bank is consulted when briefing newer staff on how revised preliminary materials were historically positioned before mailing.
The dataset is distributed as ZIP containers organized by month. Each container contains per-accession folders holding the original EDGAR submission documents (TXT) and a metadata file (JSON) describing the filing. The dataset can be accessed in three ways: by retrieving the dataset index JSON, by downloading the full dataset archive, or by downloading individual monthly containers.
Dataset Index JSON API: https://api.sec-api.io/datasets/form-prea14a-files.json
This endpoint returns dataset-level metadata and the full list of available monthly containers, including each container's download URL, size, record count, and last updated timestamp. Use it to discover available containers and to monitor which containers have been updated in the latest refresh run, so you can fetch only the changed containers. This endpoint does not require an API key.
Example response:
{
  "datasetId": "1f13365b-9ae0-6a72-83c2-eb5dc995c0c8",
  "datasetDownloadUrl": "https://api.sec-api.io/datasets/form-prea14a-files.zip",
  "name": "Form PREA14A Files Dataset",
  "updatedAt": "2026-04-16T08:54:08.435Z",
  "earliestSampleDate": "1994-02-01",
  "totalRecords": 11,
  "totalSize": 266185,
  "formTypes": ["PREA14A"],
  "containerFormat": "ZIP",
  "fileTypes": ["TXT", "JSON"],
  "containers": [
    {
      "downloadUrl": "https://api.sec-api.io/datasets/form-prea14a-files/1994/1994-07.zip",
      "key": "1994/1994-07.zip",
      "size": 38421,
      "records": 2,
      "updatedAt": "2026-04-16T08:54:08.435Z"
    }
  ]
}
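Using the index to fetch only changed containers could look like the sketch below. It relies only on the field names visible in the example response (containers, key, updatedAt); the helper names are illustrative, error handling is omitted, and the cutoff is assumed to be a timezone-aware ISO-8601 string.

```python
import json
from datetime import datetime
from urllib.request import urlopen

INDEX_URL = "https://api.sec-api.io/datasets/form-prea14a-files.json"

def fetch_index(url: str = INDEX_URL, timeout: float = 30.0) -> dict:
    """Download and decode the dataset index (no API key needed)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)

def parse_ts(value: str) -> datetime:
    """Parse an ISO-8601 timestamp; a trailing 'Z' is rewritten to
    '+00:00' because older Python versions reject it."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))

def containers_updated_since(index: dict, since: str) -> list:
    """Containers whose updatedAt is strictly later than `since`."""
    cutoff = parse_ts(since)
    return [
        c for c in index.get("containers", [])
        if parse_ts(c["updatedAt"]) > cutoff
    ]
```

A caller would store the timestamp of its last successful sync, pass it as since, and download only the returned containers. Because the archive is closed, most runs should return an empty list.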
Download Entire Dataset: https://api.sec-api.io/datasets/form-prea14a-files.zip?token=YOUR_API_KEY
Downloads the complete dataset as a single ZIP archive covering all Form PREA14A filings from February 1994 through February 2000. This endpoint requires an API key, passed as the token query parameter or via the standard sec-api.io authentication header.
Download Single Container: https://api.sec-api.io/datasets/form-prea14a-files/1994/1994-07.zip?token=YOUR_API_KEY
Downloads one monthly container ZIP instead of the full archive, which is useful for targeted historical retrieval or for fetching only containers that have been updated since the last run. Inside each container, filings are organized as per-accession folders containing the original EDGAR TXT documents and a JSON metadata file. This endpoint requires an API key.
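Downloading one container and walking its per-accession folders can be sketched with the standard library alone. The URL shape and token parameter follow the examples above; the function names are illustrative, the whole ZIP is held in memory (reasonable at the sizes shown in the index), and decoding bodies as UTF-8 with replacement is an assumption.

```python
import io
import json
import zipfile
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://api.sec-api.io/datasets/form-prea14a-files"

def container_url(key: str, token: str) -> str:
    """Build the download URL for a container key such as '1994/1994-07.zip'."""
    return f"{BASE_URL}/{key}?{urlencode({'token': token})}"

def download_container(key: str, token: str, timeout: float = 60.0) -> bytes:
    """Fetch one monthly container and return the raw ZIP bytes."""
    with urlopen(container_url(key, token), timeout=timeout) as resp:
        return resp.read()

def iter_records(zip_bytes: bytes):
    """Yield (accession_folder, metadata, {filename: text}) per record."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        folders = {}
        for name in zf.namelist():
            if name.endswith("/"):
                continue  # directory entry, not a file
            folder, _, filename = name.rpartition("/")
            folders.setdefault(folder, {})[filename] = name
        for folder in sorted(folders):
            files = folders[folder]
            if "metadata.json" not in files:
                continue  # not an accession folder
            metadata = json.loads(zf.read(files["metadata.json"]))
            bodies = {
                fn: zf.read(full).decode("utf-8", errors="replace")
                for fn, full in files.items()
                if fn.endswith(".txt")
            }
            yield folder.rsplit("/", 1)[-1], metadata, bodies
```

A typical call would be iter_records(download_container("1994/1994-07.zip", token)), with token read from an environment variable rather than hard-coded.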
The dataset covers EDGAR form type PREA14A — revised or additional preliminary proxy soliciting material filed under Regulation 14A of the Securities Exchange Act of 1934. A single form type encompasses two documentary roles: a full revised preliminary proxy statement (typically refiled after SEC staff comments) and shorter additional soliciting material such as supplemental shareholder letters, press releases, transcripts, or notices of meeting adjournment submitted for staff review before the definitive proxy is mailed.
PREA14A was an active EDGAR submission type only from February 1994 through February 2000. The dataset is closed and terminal: every record EDGAR ever accepted under this designation is included, and no new filings will ever be added.
The SEC retired the PREA14A submission type from the EDGAR taxonomy in February 2000 and dispersed its function into other Regulation 14A codes. Revised preliminary proxy statements now flow to PRER14A, and additional soliciting material filed after the definitive proxy is mailed flows to DEFA14A. PREA14A simply no longer exists as a filing category, so the historical population is the complete population.
One record is one complete EDGAR submission of form type PREA14A, identified by a unique accession number. Each record consists of a metadata.json sidecar carrying header-level facts (form type, accession number, filer identity, filing date, document inventory, EDGAR back-links, filer and subject-company entities) plus one or more document-N.txt plain-text body files extracted from the original SGML submission.
Two file types only: JSON for the metadata sidecar and TXT for the body documents. HTML, XBRL, and iXBRL renditions do not exist for these records because EDGAR did not require or accept those formats during the form's active years. Image attachments from the original SGML submission are excluded by design, although textual references to them inside the prose remain.
PRE 14A is the initial preliminary proxy statement; PREA14A is the revised or additional preliminary material layered on top of it; DEF 14A is the definitive shareholder-facing proxy that is actually mailed. PREA14A sits between PRE 14A and DEF 14A in the Rule 14a-6(a) 10-day pre-mailing window — it is for SEC staff review, never distributed to shareholders, and exposes the drafting layer that DEF 14A flattens away.
PREA14A's two roles split across two successor codes. Revised preliminary proxy statements now file as PRER14A; supplemental soliciting material filed after the definitive proxy is mailed files as DEFA14A. Together those two codes absorb the function PREA14A served during 1994–2000.