The Form 5 Files Dataset is a historical corpus of SEC Form 5 and Form 5/A submissions — the "Annual Statement of Changes in Beneficial Ownership of Securities" required under Section 16(a) of the Securities Exchange Act of 1934. Each record is a single EDGAR submission, identified by its 18-digit SEC accession number and packaged as a per-accession folder that contains the EDGAR envelope metadata, the structured ownershipDocument XML, and the pre-rendered HTML view produced by EDGAR's xslF345X05 stylesheet. Filings are contributed by the insiders themselves — directors, Section 16 officers, and greater-than-10% beneficial owners of a Section 12-registered equity class — and report transactions during the issuer's fiscal year that were exempt from real-time Form 4 reporting or that were missed and reported late. The dataset covers every Form 5 and Form 5/A submission on EDGAR from February 1996 through the current month, organized on disk as monthly ZIP containers under <year>/<year>-<month>.zip.
The dataset is built around the complete EDGAR submission for every Form 5 and Form 5/A accepted since February 1996. One record is a single Form 5 or Form 5/A filing, and the dataset treats the whole EDGAR submission — filing envelope metadata, the structured XML ownership document, and EDGAR's pre-rendered HTML view of that document — as the atomic unit. Each record corresponds to exactly one annual Section 16(a) ownership statement by an individual insider reporting transactions in the issuer's securities for the issuer's fiscal year that were either exempt from Form 4's two-business-day reporting requirement or that should have been, but were not, reported on a timely Form 4.
Records live on disk under <year>/<year>-<month>.zip. Inside each monthly archive, each submission is stored in a folder whose name is the accession number with dashes removed (for example, 0001899338-25-000018 becomes 000189933825000018/). The folder always contains metadata.json, one primary XML document carrying the structured ownership data, and an xslF345X05/ subdirectory holding the pre-rendered HTML view. For a Form 5/A amendment, the same structure applies: the XML sets documentType = "5/A" and adds a dateOfOriginalSubmission element pointing to the Form 5 being amended.
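The path layout above can be sketched in a few lines of Python. The helper names are hypothetical, and two details are assumptions not confirmed by the text: that the month in the archive name is zero-padded, and that archives are keyed by filing month.

```python
from datetime import date


def accession_folder(accession_no: str) -> str:
    """Return the per-accession folder name: the accession number without dashes."""
    folder = accession_no.replace("-", "")
    if len(folder) != 18 or not folder.isdigit():
        raise ValueError(f"not an 18-digit accession number: {accession_no!r}")
    return folder + "/"


def monthly_archive(filed: date) -> str:
    """Return the <year>/<year>-<month>.zip path for a filing date.

    Assumption: two-digit, zero-padded month, keyed by filing month.
    """
    return f"{filed.year}/{filed.year}-{filed.month:02d}.zip"


print(accession_folder("0001899338-25-000018"))
print(monthly_archive(date(2025, 2, 14)))
```

If the real archives use a different month format, only monthly_archive needs to change.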
Form 5 is prescribed under Section 16(a) of the Exchange Act and Rules 16a-3(f) and 16a-8. It captures anything Section 16 reporting persons were not required to report in real time on Form 4 or in opening statements on Form 3: small acquisitions under de minimis thresholds, certain gifts and inheritances, transactions exempt under Rules 16a-11 through 16b-3, discretionary-transaction exemptions, and transactions that should have been reported on Form 4 but were omitted or late. Form 5 is due within 45 days after the issuer's fiscal year end, and a reporting person with nothing to report may claim the Rule 16a-3(f)(2) "no Form 5 required" certification and file nothing — in which case no record appears in this dataset. The EDGAR submission shares the same XML wrapper used for Forms 3 and 4: it conforms to the shared ownershipDocument schema family (historically X0202, X0203, X0206, X0306, X0405, and currently X0508) and is rendered by EDGAR's xslF345X05 stylesheet.
A record is built from three co-resident components inside the accession folder:
metadata.json: a flat JSON object describing the EDGAR submission envelope (form type, accession number, filing timestamp, period of report, links back to the canonical EDGAR artifacts, a documentFormatFiles manifest enumerating every attached file, and an entities array carrying structured identifiers for both the issuer and the reporting person).
Primary XML document: the structured ownership data, tagged with an ownershipDocument schema version. Every filing observed in the recent sample uses schemaVersion = X0508; older filings carry earlier schema versions. The filename is not standardized: primary_doc.xml (EDGAR online wizard), ownership.xml (common for fund/RIC filers), doc5a.xml (some amendments), Workiva's wk-form5_<epoch>.xml, and financial-printer names such as fp0096404-2_5.xml all occur. The correct primary XML is whichever XML file lives directly under the accession folder, never the same-named file inside xslF345X05/.
xslF345X05/: a complete <!DOCTYPE html> document that EDGAR produces by applying its xslF345X05 stylesheet server-side to the ownership document. Despite the filename mirroring the primary XML and retaining the .xml extension, the content is HTML, not XML. It reproduces the printed Form 5 layout (issuer and reporting-person boxes, the director/officer/10% relationship indicators, Table I (non-derivative), Table II (derivative), the footnote list, remarks, and the signature line) with EDGAR's inline CSS classes (.FormData, .FootnoteData). No separate XSLT stylesheet, external CSS, or JavaScript is bundled; the HTML is self-contained.
The shortened index declares TXT, JSON, HTML, PDF, and XML as possible file types across the dataset, but any individual accession typically materializes only a subset, most commonly metadata.json plus two identically named XML-extension files (the true XML at the root and the rendered HTML under xslF345X05/). Image files from the original EDGAR submission are excluded by dataset policy.
metadata.json
The metadata file is a flat JSON object. Top-level fields:
formType: "5" or "5/A".
accessionNo: dashed 18-digit SEC accession number.
description: EDGAR form description (e.g. "Form 5 - Annual statement of changes in beneficial ownership of securities", with ": [Amend]" appended for amendments).
filedAt: ISO-8601 filing timestamp with EDGAR-reported timezone offset.
periodOfReport: ISO date of the fiscal year end covered.
linkToFilingDetails, linkToTxt, linkToHtml, linkToXbrl: absolute SEC.gov URLs for, respectively, the EDGAR-rendered primary document, the concatenated SGML .txt submission, the filing index page, and an XBRL link (typically empty for Section 16 forms).
documentFormatFiles: ordered array of {sequence, size, documentUrl, description, type} entries, one per file EDGAR associates with the submission (the XSLT-rendered view, the raw XML, and the complete submission .txt).
dataFiles, seriesAndClassesContractsInformation: typically empty arrays for Form 5.
entities: array of entity objects (see below).
id: 32-character hexadecimal dataset-internal record identifier.
The entities array consistently holds two objects, distinguished by a suffix on companyName: the issuer ("<NAME> (Issuer)") and the reporting person ("<NAME> (Reporting)").
Issuer entity: cik (CIK), companyName, irsNo, fiscalYearEnd (MMDD), stateOfIncorporation, sic (industry code and label, e.g. "6021 National Commercial Banks"), and tickers (array; populated for operating companies such as ["WBHC"] or ["ADBE"], sometimes absent for closed-end funds).
Reporting-person entity: cik, companyName (individual name with (Reporting) suffix), act (typically "34" for the Exchange Act), fileNo (EDGAR file number on the issuer: 000-* for operating companies, 811-* for registered investment companies), type (mirrors formType), and filmNo (EDGAR film number).
ownershipDocument XML
The primary XML root is <ownershipDocument> and carries every piece of structured content that appears on the rendered form.
Top-level scalar elements.
schemaVersion: pins the document to an ownership schema. X0508 is the version seen across current filings; historical filings use earlier versions (X0202, X0203, X0206, X0306, X0405) in the order those schemas were in force.
documentType: 5 or 5/A.
periodOfReport: ISO date for the issuer's fiscal year end.
dateOfOriginalSubmission: present only on amendments; ISO date of the original Form 5 being amended.
notSubjectToSection16: 0/1 flag (sometimes omitted on fund-insider filings).
form3HoldingsReported, form4TransactionsReported: Form 5-specific 0/1 flags indicating whether this annual filing is picking up holdings or transactions that belonged on earlier Section 16 filings.
aff10b5One: 0/1 or true/false flag encoding the Rule 10b5-1(c) plan affirmation that became mandatory following the SEC's December 2022 amendments (effective April 2023).
Boolean encoding is not uniform: most filers emit numeric 1/0, while some (notably fund-complex filers) emit literal true/false. Parsers must accept both.
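A tolerant flag parser covering both encodings might look like the sketch below. The function name parse_flag is hypothetical, and treating a missing or empty element as None (rather than False) is a design assumption, chosen so that "omitted" stays distinguishable from "0".

```python
def parse_flag(text):
    """Map ownership-XML boolean text to bool; None when absent or empty."""
    if text is None:
        return None
    value = text.strip().lower()
    if value == "":
        return None
    if value in ("1", "true"):
        return True
    if value in ("0", "false"):
        return False
    # Surface anything unexpected instead of silently guessing.
    raise ValueError(f"unexpected boolean encoding: {text!r}")


for raw in ("1", "0", "true", "false", None):
    print(repr(raw), "->", parse_flag(raw))
```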
issuer. Three children: issuerCik (zero-padded 10-digit CIK), issuerName, and issuerTradingSymbol (literal "none" when the issuer has no public ticker).
reportingOwner. Three nested blocks per reporting person:
reportingOwnerId: rptOwnerCik, rptOwnerName.
reportingOwnerAddress: rptOwnerStreet1/2, rptOwnerCity, rptOwnerState, rptOwnerZipCode. For foreign addresses, rptOwnerState = "X0" plus a free-text rptOwnerStateDescription naming the country.
reportingOwnerRelationship: boolean flags isDirector, isOfficer, isTenPercentOwner, isOther, plus optional free-text officerTitle (e.g. "Chief Marketing Officer & EVP", "PRESIDENT") and otherText (e.g. "Portfolio Manager") when the corresponding flags are set.
nonDerivativeTable (Table I). Zero or more <nonDerivativeTransaction> and, optionally, <nonDerivativeHolding> children. Each transaction contains:
securityTitle.value: title of the non-derivative class (e.g. "Common Stock", "Common Shares", "Shares of China Fund, Inc (CHN)").
transactionDate.value and optional deemedExecutionDate.value (ISO dates).
transactionCoding: transactionFormType (5), a single-letter transactionCode from the SEC Section 16 code set (e.g. A acquisition from issuer, P open-market purchase, S open-market sale, J other, U tender disposition, W acquisition or disposition by will or the laws of descent and distribution, G bona fide gift, M exercise of derivative), an equitySwapInvolved boolean, and optional footnoteId references.
transactionTimeliness.value: optional; the value "L" marks a late-reported transaction (one that should have been on a Form 4), almost always paired with an explanatory footnote.
transactionAmounts: transactionShares.value, transactionPricePerShare.value, and transactionAcquiredDisposedCode.value (A acquired / D disposed). Any of these may carry an attached <footnoteId>.
postTransactionAmounts: typically sharesOwnedFollowingTransaction.value; for non-share or value-denominated positions, valueOwnedFollowingTransaction.value is used instead. The two are mutually exclusive.
ownershipNature: directOrIndirectOwnership.value (D direct / I indirect); when I, natureOfOwnership.value carries a free-text string describing the relationship ("by spouse", "By Spouse IRA", "By child's account", "By trust", "By 401(k) plan", etc.).
derivativeTable (Table II). Always present as an element, often empty. When populated, each <derivativeTransaction> (or <derivativeHolding>) mirrors the non-derivative fields and adds derivative-specific ones: conversionOrExercisePrice.value, exerciseDate.value, expirationDate.value, and an underlyingSecurity block containing underlyingSecurityTitle.value and underlyingSecurityShares.value (or underlyingSecurityValue.value). This is where options, warrants, RSUs, convertible instruments, and similar derivative security types are reported.
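As a rough illustration of the Table I fields, a transaction can be flattened into a dictionary with the standard library's ElementTree. The SAMPLE document is invented for illustration (not taken from a real filing), the function name is hypothetical, and the sketch assumes the ownership XML carries no XML namespace.

```python
import xml.etree.ElementTree as ET

# Invented sample shaped like the element names described above.
SAMPLE = """<ownershipDocument>
  <nonDerivativeTable>
    <nonDerivativeTransaction>
      <securityTitle><value>Common Stock</value></securityTitle>
      <transactionDate><value>2024-06-03</value></transactionDate>
      <transactionCoding>
        <transactionFormType>5</transactionFormType>
        <transactionCode>G</transactionCode>
        <equitySwapInvolved>0</equitySwapInvolved>
      </transactionCoding>
      <transactionAmounts>
        <transactionShares><value>500</value></transactionShares>
        <transactionPricePerShare><value>0</value></transactionPricePerShare>
        <transactionAcquiredDisposedCode><value>D</value></transactionAcquiredDisposedCode>
      </transactionAmounts>
      <postTransactionAmounts>
        <sharesOwnedFollowingTransaction><value>12000</value></sharesOwnedFollowingTransaction>
      </postTransactionAmounts>
      <ownershipNature>
        <directOrIndirectOwnership><value>I</value></directOrIndirectOwnership>
        <natureOfOwnership><value>By trust</value></natureOfOwnership>
      </ownershipNature>
    </nonDerivativeTransaction>
  </nonDerivativeTable>
  <derivativeTable/>
</ownershipDocument>"""


def parse_transaction(tx):
    """Flatten one <nonDerivativeTransaction>; missing fields come back as None."""
    get = tx.findtext
    return {
        "security": get("securityTitle/value"),
        "date": get("transactionDate/value"),
        "deemed_date": get("deemedExecutionDate/value"),
        "code": get("transactionCoding/transactionCode"),
        "timeliness": get("transactionTimeliness/value"),
        "shares": get("transactionAmounts/transactionShares/value"),
        "price": get("transactionAmounts/transactionPricePerShare/value"),
        "acquired_disposed": get("transactionAmounts/transactionAcquiredDisposedCode/value"),
        # Only one of the next two is expected to be populated.
        "post_shares": get("postTransactionAmounts/sharesOwnedFollowingTransaction/value"),
        "post_value": get("postTransactionAmounts/valueOwnedFollowingTransaction/value"),
        "ownership": get("ownershipNature/directOrIndirectOwnership/value"),
        "nature": get("ownershipNature/natureOfOwnership/value"),
    }


root = ET.fromstring(SAMPLE)
rows = [parse_transaction(tx) for tx in root.iter("nonDerivativeTransaction")]
print(rows[0])
```

Values are kept as strings on purpose, since some fields may hold only a footnote reference rather than a number.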
footnotes. Zero or more <footnote id="F1">…</footnote> elements referenced from transaction or holding fields via <footnoteId id="F1"/> siblings. Footnote content carries disclaimers of beneficial ownership, DRIP explanations, vesting schedules, family-relationship disclosures, power-of-attorney attribution, and the textual justifications for late (L-coded) transactions.
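Resolving footnote references back to their text could be sketched as follows. The sample XML and the function names are illustrative assumptions; the sketch again assumes no XML namespace.

```python
import xml.etree.ElementTree as ET

# Invented sample: one transaction referencing two footnotes.
SAMPLE = """<ownershipDocument>
  <nonDerivativeTable>
    <nonDerivativeTransaction>
      <transactionDate><value>2024-03-01</value><footnoteId id="F1"/></transactionDate>
      <transactionAmounts>
        <transactionShares><value>10</value><footnoteId id="F1"/><footnoteId id="F2"/></transactionShares>
      </transactionAmounts>
    </nonDerivativeTransaction>
  </nonDerivativeTable>
  <footnotes>
    <footnote id="F1">Acquired through a dividend reinvestment plan.</footnote>
    <footnote id="F2">The reporting person disclaims beneficial ownership.</footnote>
  </footnotes>
</ownershipDocument>"""


def footnote_texts(root):
    """Map footnote id -> footnote text for the whole document."""
    return {fn.get("id"): "".join(fn.itertext()).strip() for fn in root.iter("footnote")}


def footnotes_for(node, texts):
    """Return (id, text) pairs referenced anywhere under node, in first-seen order."""
    seen = []
    for ref in node.iter("footnoteId"):
        fid = ref.get("id")
        if fid is not None and fid not in seen:
            seen.append(fid)
    return [(fid, texts.get(fid)) for fid in seen]


root = ET.fromstring(SAMPLE)
texts = footnote_texts(root)
tx = root.find("nonDerivativeTable/nonDerivativeTransaction")
for fid, text in footnotes_for(tx, texts):
    print(fid, text)
```

Deduplicating by id keeps a footnote that is referenced from several fields of the same transaction from appearing more than once.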
remarks and ownerSignature. remarks is an optional free-text block, often empty. ownerSignature carries signatureName (plain name, "/s/ <name>" signature line, or "<name>, Attorney-in-Fact" for agent-filed forms) and signatureDate (ISO date).
xslF345X05/ rendered HTML
The file under xslF345X05/ shares the primary XML's filename but contains a complete HTML document produced by EDGAR's shared Forms 3/4/5 rendering stylesheet. It reproduces the classical printed Form 5 layout: a banner with issuer CIK and ticker, the reporting-person name/address block, the officer/director/10%/other relationship indicators, Table I (non-derivative), Table II (derivative), the footnote list, the remarks area, and the signature line, styled with EDGAR's inline .FormData and .FootnoteData classes. It is useful as a human-readable view; all machine-readable content is already present in the primary XML. The folder holds the already-transformed HTML output, not an XSLT stylesheet; no .xsl file is included.
No xslF345X05 XSL source file: the xslF345X05/ folder contains the already-transformed HTML output.
The .txt submission wrapper and the filing-index HTML page are referenced by linkToTxt and linkToHtml in metadata.json and may or may not be materialized as files inside the accession folder; when present, they are enumerated in documentFormatFiles.
The ownershipDocument schema is itself the structured representation, so linkToXbrl is typically empty and no XBRL instance or taxonomy files are attached.
rptOwnerCik and issuerCik values.
Paper origins (1996–2002). When EDGAR accepted its earliest Form 5 filings in February 1996, Section 16 forms were still fundamentally paper documents; filers could submit either a hardship-exempt paper Form 5 or a voluntary electronic SGML/ASCII version. The statutory content requirements (issuer and reporting-person identification, Table I for non-derivative transactions, Table II for derivative transactions, footnotes, and a manual signature) were stable from the 1991 overhaul of Section 16 (Exchange Act Release 34-28869) onward, but pre-2003 electronic submissions were free-form text with labeled items rather than typed XML.
Sarbanes-Oxley and mandatory EDGAR filing (2002–2003). Section 403 of the Sarbanes-Oxley Act of 2002 rewrote Section 16(a) to accelerate insider reporting. The related SEC release (No. 34-46421, with final rules in Release 33-8230) mandated electronic filing of Forms 3, 4, and 5 through EDGAR effective June 30, 2003, along with issuer-website posting. This is the single most important structural inflection point in the dataset: filings before mid-2003 are largely paper or ad hoc SGML, while filings from mid-2003 onward conform to the structured ownershipDocument XML schema.
Schema evolution (X02xx through X0508). The shared EDGAR ownership schema for Forms 3, 4, and 5 has been versioned as Section 16 disclosure requirements have expanded. The earliest structured filings used X0202; subsequent revisions (X0203, X0206, X0306, X0405, and currently X0508) added or refined fields. Notable additions across versions include the notSubjectToSection16 flag, the form3HoldingsReported and form4TransactionsReported scalars that make catch-up filings explicit, finer-grained transactionTimeliness coding (including the L late marker), the aff10b5One flag introduced to capture the Rule 10b5-1(c) plan affirmation mandated by the December 2022 rule amendments (effective April 2023), and expanded reportingOwnerAddress handling for foreign addresses via rptOwnerState = "X0" plus a country-name description. Records retain whatever schemaVersion EDGAR accepted at filing time; consumers joining across decades should expect drift in element presence, optionality, and boolean encoding.
Signature and attribution conventions. The 2003 mandate standardized electronic signature conventions: agent-filed forms carry "<name>, Attorney-in-Fact" in signatureName, and the /s/ prefix is common for self-filed forms. Earlier paper filings typically carried a scanned manual signature rendered as text.
1996–2002 (SGML/ASCII era). Early EDGAR Form 5 submissions appear as flat ASCII or SGML documents in the concatenated .txt submission stream, with form content expressed as labeled text blocks rather than typed elements. No machine-readable ownership XML exists for this period.
2003–present (structured XML era). From the June 2003 mandate onward, every Form 5 carries a structured ownershipDocument XML instance, and EDGAR renders the XML to HTML on demand via the xslF345X05 stylesheet. The dataset captures both the authoritative XML and the pre-rendered HTML view so downstream consumers do not need to apply the stylesheet themselves. XBRL and inline XBRL do not apply to Section 16 forms in either era — the ownership schema is itself the structured representation.
Filename conventions. The primary XML filename is not standardized by EDGAR; it reflects the filing agent or tool. The file under xslF345X05/ is always named identically to the primary XML, retains the .xml extension despite containing HTML, and is always co-located in that subfolder. To locate the primary XML, take the XML at the accession-folder root rather than hard-coding a filename.
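The root-level rule above might be applied to an archive listing as in this sketch. The function name is hypothetical, and the sketch assumes archive entries are stored as "<accession folder>/<file>" at the top level of the monthly ZIP; with a real archive, the names list would come from zipfile.ZipFile(path).namelist().

```python
def primary_xml_paths(names, folder):
    """Return XML entries that sit directly under the accession folder."""
    prefix = folder.rstrip("/") + "/"
    hits = []
    for name in names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        if "/" in rest:
            # Nested entry, e.g. xslF345X05/<same name>.xml, which is HTML.
            continue
        if rest.lower().endswith(".xml"):
            hits.append(name)
    return hits


listing = [
    "000189933825000018/metadata.json",
    "000189933825000018/primary_doc.xml",
    "000189933825000018/xslF345X05/primary_doc.xml",
]
print(primary_xml_paths(listing, "000189933825000018/"))
```

Returning a list rather than a single path lets the caller decide what to do if a folder unexpectedly holds zero or several root-level XML files.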
The file under xslF345X05/ is an HTML document despite its .xml extension.
dateOfOriginalSubmission identifies the original Form 5's filing date, but the amendment carries its own accession number, its own accession folder, and its own metadata.json. When reconstructing a reporting person's fiscal-year activity, the most recent amendment for a given issuer and period should supersede any earlier Form 5 for the same period.
L transactions and Form 4 catch-up. A transaction whose transactionTimeliness.value is "L" is a late-reported transaction that should have been filed on a Form 4, typically accompanied by a footnote explaining the delay. Combined with form4TransactionsReported = 1 at the document level, these signals identify annual filings that carry content originally due in real time.
isDirector, isOfficer, isTenPercentOwner, isOther, equitySwapInvolved, notSubjectToSection16, form3HoldingsReported, form4TransactionsReported, and aff10b5One appear as 1/0 in most filings and as true/false in others (notably fund-complex filers). Parsers must accept both.
Footnotes are referenced from transaction or holding fields via <footnoteId id="Fx"/> siblings. A transaction may reference multiple footnotes, and a footnote may be referenced from multiple transactions. Preserving these references is essential because disclaimers of beneficial ownership, DRIP explanations, and late-reporting justifications live in footnote text, not in structured fields.
When directOrIndirectOwnership.value = "I", the paired natureOfOwnership.value string is the only place the relationship (spouse, child, trust, retirement account, LLC, partnership, etc.) is recorded. These strings are free text and vary in capitalization and phrasing.
Non-share or value-denominated positions use valueOwnedFollowingTransaction.value in place of sharesOwnedFollowingTransaction.value. Treat the two as mutually exclusive post-transaction state fields.
Some fund-complex filers use transaction code A for DRIP reinvestments with transactionPricePerShare = 0 and carry disclaimers of beneficial ownership in footnotes; several of these filers also emit boolean flags as true/false.
rptOwnerState = "X0" paired with a free-text rptOwnerStateDescription encodes non-US addresses. The country name appears as a description string rather than an ISO code.
Form 5 is an insider filing under Section 16(a) of the Exchange Act of 1934. Each record in this dataset corresponds to a single EDGAR submission on Form 5 or Form 5/A by, or on behalf of, a reporting person of a Section 12-registered issuer. The filer is the insider, not the issuer; the issuer appears only as the reference company whose registered equity is reported.
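The reporting population has three classes: directors, Section 16 officers, and greater-than-10% beneficial owners of a Section 12-registered equity class.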
Entities file Form 5 when the entity itself is the 10% holder (e.g., a holding company, investment vehicle, or family limited partnership). Issuers frequently prepare and transmit the filing via filing agents on behalf of the insider, but the filer of record remains the reporting person.
Form 5 is periodic, not event-driven. Rule 16a-3(f) requires it within 45 days after the issuer's fiscal year end — mid-February for a calendar-year issuer. It sweeps up categories of transactions deferrable from Form 4, plus any Form 3 or Form 4 items that should have been filed earlier but were missed.
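Transactions typically reported on Form 5: small acquisitions under Rule 16a-6, bona fide gifts, inheritances, certain Rule 16b-3 exempt items, and transactions that should have been reported on a timely Form 4 but were not.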
Rule 16a-3(f)(2) "no Form 5 required" representation. If every reportable transaction was already timely reported on Form 4 and there are no deferrable items, Form 5 is not required. The insider may deliver a written representation to the issuer, which the issuer relies on for its Item 405 of Regulation S-K disclosure in the proxy statement or Form 10-K.
Form 5/A (amendment). Filed to correct errors or omissions in a previously filed Form 5. No fixed deadline — the amendment must be filed promptly upon discovery. The amendment inherits the original filer identity and fiscal year; it cannot be used to add a different insider or a different year.
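Because an amendment is a separate submission for the same filer and fiscal year, one simple way to apply it when reconstructing activity is to keep only the most recently filed submission per reporting person, issuer, and period. The sketch below assumes a hypothetical flattened record shape (reportingCik and issuerCik are illustrative keys, not metadata.json field names), and whole-filing replacement is itself a simplifying assumption.

```python
from datetime import datetime

# Hypothetical flattened records; accession numbers are placeholders.
filings = [
    {"accessionNo": "0000000002-25-000001", "formType": "5",
     "filedAt": "2025-02-10T16:00:00-05:00",
     "reportingCik": "2", "issuerCik": "1", "periodOfReport": "2024-12-31"},
    {"accessionNo": "0000000002-25-000007", "formType": "5/A",
     "filedAt": "2025-03-01T09:30:00-05:00",
     "reportingCik": "2", "issuerCik": "1", "periodOfReport": "2024-12-31"},
    {"accessionNo": "0000000002-25-000003", "formType": "5",
     "filedAt": "2025-02-12T11:15:00-05:00",
     "reportingCik": "2", "issuerCik": "9", "periodOfReport": "2024-12-31"},
]


def latest_per_period(records):
    """Keep the most recently filed submission per (owner, issuer, period)."""
    best = {}
    for rec in records:
        key = (rec["reportingCik"], rec["issuerCik"], rec["periodOfReport"])
        filed = datetime.fromisoformat(rec["filedAt"])  # offset-aware timestamp
        if key not in best or filed > best[key][0]:
            best[key] = (filed, rec)
    return [rec for _, rec in best.values()]


for rec in latest_per_period(filings):
    print(rec["accessionNo"], rec["formType"])
```

An amendment that restates only part of the original filing would need a row-level merge instead.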
Tenure. Reporting ends when the person ceases to be an officer or director, but transactions executed while subject to Section 16 remain reportable after departure, and certain transactions within six months of departure may still need reporting. A person who becomes or ceases to be an insider mid-year reports only the portion of the year during which they were subject to Section 16.
Form 5 sits inside a tightly connected family of insider and beneficial ownership disclosures. The most useful comparisons are to the other two Section 16(a) forms (Forms 3 and 4), to the Section 13 beneficial ownership schedules, to Form 144 sale notices, and to the issuer-side ownership and delinquency disclosures in proxies and 10-Ks. Each captures a different slice of the same underlying phenomenon — insider and large-holder positions in registered equity — at different triggers, cadences, and levels of granularity.
Form 3 is the one-time initial statement of beneficial ownership, filed within 10 days of becoming an officer, director, or greater-than-10% holder (or when a class first registers). It is a pure holdings snapshot on the date the reporting obligation attaches, with no transaction history.
Form 3 and Form 5 share the X0508 XML schema family, the same filer population, and the same security-level granularity. The distinction is temporal and functional: Form 3 is the entry-point snapshot; Form 5 is the annual residual of transactions not captured on Form 4. Use the Form 3 Files Dataset to answer "who is an insider and what did they hold on day one." Use Form 5 to answer "what exempt or late transactions occurred during the fiscal year."
Form 4 is the real-time statement of changes in beneficial ownership, filed within two business days of most reportable transactions. It is by far the largest Section 16 dataset and captures the bulk of open-market purchases, sales, option exercises, and grants.
Form 4 and Form 5 are complementary, never substitutable. Form 4 reports non-exempt transactions on a near-real-time basis; Form 5 reports transactions that were exempt from Form 4 (small acquisitions under Rule 16a-6, gifts, inheritances, certain Rule 16b-3 items) plus transactions that should have been on a timely Form 4 but were not. No transaction appears on both. Volume differs by orders of magnitude — Form 5 filings are a small fraction of Form 4 transaction rows. Use the Form 4 Files Dataset for insider trading signals, short-swing profit analysis, and open-market behavior. Reach for Form 5 when the research question specifically involves gifts, exempt grants, inheritances, or delinquent reporting.
Schedule 13D and Schedule 13G are Section 13 filings triggered by crossing a 5% beneficial ownership threshold in a registered class, regardless of insider status. 13D covers active holders; 13G is the short form for passive investors, qualified institutions, and exempt holders.
Overlap with Form 5 is narrow — both concern beneficial ownership in registered equity. The filer populations diverge: 13D/G filers are typically institutional investors, activists, and acquirers, often not officers or directors. 13D/G reports aggregate position and control intent and does not itemize transactions; cadence is event-driven rather than an annual true-up. Use 13D/G for 5%+ block ownership, activist campaigns, or institutional concentration. Use Form 5 for officer and director transaction completeness.
Form 144 is a forward-looking notice of proposed sale of restricted or control securities under Rule 144, filed by affiliates (a population that substantially overlaps with Section 16 insiders) above de minimis thresholds.
The purposes are inverted. Form 144 announces an intent to sell; the sale may or may not be executed as described. Form 5 is a backward-looking record of transactions that already occurred, including non-sale events. Form 144 is narrow to Rule 144 sales and does not report gifts, grants, or exempt acquisitions. Use Form 144 for proposed affiliate sales and Rule 144 compliance. Use Form 5 for anything else in the affiliate transaction universe.
Item 405 of Regulation S-K requires issuers to disclose, in their annual proxy or 10-K, any known failures by Section 16(a) reporting persons to file Forms 3, 4, or 5 on time.
Item 405 is the issuer's compliance report card; Form 5 is the insider's own catch-up filing. Item 405 is prose, issuer-authored, typically naming delinquent insiders and counting late filings in aggregate. Form 5 is the insider's structured XML, itemizing specific transactions. Item 405 commonly references Form 5 filings but contains none of the underlying transaction data. Use Item 405 for compliance culture and governance studies. Use Form 5 when the actual late or exempt transaction records are required.
Item 403 requires issuers to disclose, in tabular form, beneficial ownership above 5% by any person and beneficial ownership of all directors, nominees, and named executive officers. These tables appear in the proxy and, by incorporation, in the 10-K.
Item 403 is an annual aggregated snapshot as of a record date, reported in shares and percent of class, prepared by the issuer. It enumerates no transactions. Form 5 is filed by the insider and itemizes exempt or late transactions during the fiscal year. Use Item 403 of Regulation S-K for a single-date ownership snapshot across the executive team. Use Form 5 (with Forms 3 and 4) to reconstruct transaction-level ownership changes.
Three features set Form 5 apart from every comparison above:
Residual by design. Form 5 exists to capture what Forms 3 and 4 did not — transactions exempt from two-business-day reporting and transactions that should have been on a timely Form 4 but were not. Its content is defined by what is missing elsewhere.
Frequently absent. Under Rule 16a-3(f)(2), an insider with no reportable transactions for the year — and who timely reported all Form 4 transactions — is not required to file. Absence is not non-compliance, and the dataset is therefore a record of exempt-or-late activity, not a census of insiders.
Schema-compatible with Forms 3 and 4. Form 5 uses the same X0508 XML family, footnote conventions, derivative/non-derivative split, and transaction codes, allowing trivial joins on CIK, issuer, and security across the three-form Section 16 corpus.
Form 5 is narrow but not substitutable. Form 4 cannot cover exempt transactions; Form 3 cannot cover any transaction reporting; 13D/G and 144 address different regimes; and Items 405 and 403 describe the same universe from the issuer side without the underlying transaction detail. Any study of Section 16 completeness, exempt insider transactions, officer and director gift activity, or late-reporting behavior should treat Form 5 as a primary source, typically joined against Forms 3 and 4.
Form 5 captures the Section 16(a) transactions that never appear in the Form 4 feed: exempt small acquisitions, bona fide gifts, inheritances, benefit-plan activity, and Form 4 trades reported late. The users below rely on it to close the gap between real-time insider reporting and true year-end ownership.
Quants building insider-sentiment factors merge Form 5 with Form 4 to correct survivorship and delay bias. They key on transactionCode (G for gifts, A/F for plan activity), transactionTimeliness=L to isolate delinquent trades rolled into the annual filing, the nonDerivativeTable and derivativeTable blocks, and reportingOwnerRelationship.isDirector/isOfficer/isTenPercentOwner for cohort construction. Output: point-in-time insider holdings panels and cleaned factor inputs.
Single-name analysts use Form 5 to surface off-wire transfers: director and officer gifts to trusts, foundations, and family members, plus inheritance events. The footnotes field typically carries the substantive narrative (trust restructuring, estate distribution, divorce transfer); derivativeTable discloses year-end option and RSU positions. These feed qualitative flags in research notes, including gifts timed ahead of price-sensitive events.
In-house compliance staff use the dataset to benchmark their own insiders against peers and to support Item 405 proxy disclosure. They read periodOfReport, transactionTimeliness, and reportingOwnerRelationship flags to separate genuinely exempt activity from trades that should have been on Form 4, and they track Rule 16a-3(f)(2) representation letters and prior-year Form 5 remediations for delinquent Form 4s.
Outside counsel drafting and reviewing Form 5 filings use the corpus as precedent. They study transactionCode usage, footnote language citing Rule 16a-6, Rule 16b-3, or code G gifts, and the Form 5/A amendment chain to see how peers correct errors. The dataset also supports diligence on new officer and director hires with prior filing histories.
Regulators mine Section 16(a) filings to identify recidivist late filers. Useful fields include transactionTimeliness, per-CIK multi-year filing patterns, isTenPercentOwner, and amendment frequency via Form 5/A. The data supports compliance sweeps, referrals on chronic non-filers, and staff guidance from Corp Fin.
Governance teams treat Form 5 hygiene as a signal of broader control-environment quality. They pull reporting-owner identity, relationship flags, postTransactionAmounts, and footnote narratives to score insider filing discipline, quantify gifting behavior, and flag directors tied to Section 16(a) delinquencies in voting recommendations.
Researchers use the historical panel for event studies on late-reported trades, gift timing around price moves, and the link between Section 16(a) compliance and firm governance. They rely on structured XML tables, periodOfReport, acceptance timestamps, and transaction codes to separate open-market activity from plan, gift, and inheritance transactions.
Expert witnesses reconstructing insider ownership for M&A disputes, securities class actions, derivative suits, and Section 16(b) disgorgement claims need a trail that Form 4 alone cannot provide. They use postTransactionAmounts, the full derivativeTable (annual option grants, exercises, expirations), and footnotes describing trust and indirect beneficial ownership to build damages models and date-specific holdings reports.
Private-client advisors to officers, directors, and principal shareholders use Form 5 to research peer practice around GRATs, CRTs, charitable gifts, and estate transfers. The footnote text describing transfer structure and the derivativeTable for outstanding options and RSUs drive gift-timing memos and coordination with Section 16 counsel.
Engineering teams building insider-transaction products ingest Form 5 and Form 5/A to produce normalized feeds. They consume filing metadata (CIK, accession, acceptance time, periodOfReport, issuer CIK), the full XML transaction tables, rptOwnerState=X0 and similar non-US indicators for foreign private issuers and foreign insiders, and amendment links from 5/A back to the original Form 5, feeding deduplicated timelines and late-filing classifiers.
Teams building retrieval systems over SEC filings index Form 5 to extend coverage beyond Form 4 so that insider QA correctly handles gifts, inheritances, and late reports. The structured XML supports deterministic extraction; footnotes supply natural-language context; metadata enables indexing by reporting owner, issuer, and period for compliance copilots and analyst assistants that reconcile Form 4 and Form 5 into a single insider narrative.
Taken together, the Form 5 Files Dataset serves a narrow professional audience — Section 16 insider compliance and legal teams on both the issuer and outside-counsel side, regulators, governance and forensic analysts, academic researchers, tax and wealth advisors to insiders, quantitative and event-driven investors, and the data engineers packaging insider activity for all of them. They share one need: visibility into the exempt, gifted, inherited, plan-driven, and late-reported transactions that only surface in the annual statement and its Form 5/A amendments.
The Form 5 Files Dataset supports the practical workflows below. Each ties to specific fields in the ownershipDocument XML, the metadata.json envelope, or the Form 5/A amendment chain.
Filter records where transactionTimeliness.value = "L" inside nonDerivativeTable or derivativeTable, optionally gated by form4TransactionsReported = 1 at the document level. Join the matching rows to the referenced <footnote> text to extract the filer's explanation for the delay. Output: an insider-year panel of late transactions with issuer CIK, reporting-owner CIK, transaction date, deemed-execution date, and narrative reason — feeding 10-K/DEF 14A Item 405 cross-checks, enforcement triage, and governance scorecards.
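The late-transaction filter can be sketched with Python's standard-library XML parser. This is a minimal illustration, not a reference implementation: SAMPLE_XML is a hypothetical, heavily trimmed ownershipDocument, the function name late_transactions is invented here, and the placement of footnoteId inside transactionTimeliness is an assumption about how a filer might attach the explanation.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed ownershipDocument used only for illustration.
SAMPLE_XML = """<ownershipDocument>
  <documentType>5</documentType>
  <periodOfReport>2024-12-31</periodOfReport>
  <form4TransactionsReported>1</form4TransactionsReported>
  <issuer><issuerCik>0000000001</issuerCik></issuer>
  <reportingOwner>
    <reportingOwnerId><rptOwnerCik>0000000002</rptOwnerCik></reportingOwnerId>
  </reportingOwner>
  <nonDerivativeTable>
    <nonDerivativeTransaction>
      <transactionDate><value>2024-03-15</value></transactionDate>
      <transactionTimeliness><value>L</value><footnoteId id="F1"/></transactionTimeliness>
    </nonDerivativeTransaction>
    <nonDerivativeTransaction>
      <transactionDate><value>2024-06-01</value></transactionDate>
      <transactionTimeliness><value></value></transactionTimeliness>
    </nonDerivativeTransaction>
  </nonDerivativeTable>
  <footnotes><footnote id="F1">Late due to administrative error.</footnote></footnotes>
</ownershipDocument>"""


def late_transactions(xml_text):
    """Return one row per transaction whose transactionTimeliness value is 'L'."""
    root = ET.fromstring(xml_text)
    footnotes = {f.get("id"): (f.text or "").strip() for f in root.iter("footnote")}
    rows = []
    for table in ("nonDerivativeTable", "derivativeTable"):
        # Holdings rows have no transactionTimeliness and are skipped by the check below.
        for txn in root.findall(f"./{table}/*"):
            timeliness = (txn.findtext("transactionTimeliness/value") or "").strip()
            if timeliness != "L":
                continue
            note_ids = [n.get("id") for n in txn.iter("footnoteId")]
            rows.append({
                "issuerCik": root.findtext("issuer/issuerCik"),
                "ownerCik": root.findtext("reportingOwner/reportingOwnerId/rptOwnerCik"),
                "table": table,
                "transactionDate": txn.findtext("transactionDate/value"),
                "deemedExecutionDate": txn.findtext("deemedExecutionDate/value"),
                "reasons": [footnotes[i] for i in note_ids if i in footnotes],
            })
    return rows


for row in late_transactions(SAMPLE_XML):
    print(row)
```

Gating on form4TransactionsReported would be one extra check on the root element before the loop.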
Select transactions where transactionCode = "G" (bona fide gift) or transactionCode = "W" (acquisition or disposition by will or the laws of descent and distribution), partitioned by reportingOwnerRelationship.isDirector and isOfficer. Pull transactionShares.value, transactionDate.value, directOrIndirectOwnership.value, and the natureOfOwnership.value string (e.g. "By GRAT", "By spouse", "By family foundation") together with the attached footnote text. Output: a transfer-level dataset supporting research on gift timing around price-sensitive events, estate-planning structures (GRATs, CRTs, family LLCs), and foundation-funding patterns.
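A rough sketch of the gift and inheritance selection follows. The sample document, the function name gift_transfers, and the GIFT_CODES constant are illustrative assumptions. The descendant search (".//") is used so the lookup does not depend on the exact nesting of the value elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed ownershipDocument used only for illustration.
SAMPLE_XML = """<ownershipDocument>
  <reportingOwner>
    <reportingOwnerRelationship>
      <isDirector>1</isDirector><isOfficer>0</isOfficer>
    </reportingOwnerRelationship>
  </reportingOwner>
  <nonDerivativeTable>
    <nonDerivativeTransaction>
      <transactionDate><value>2024-11-20</value></transactionDate>
      <transactionCoding><transactionCode>G</transactionCode></transactionCoding>
      <transactionAmounts><transactionShares><value>5000</value></transactionShares></transactionAmounts>
      <ownershipNature>
        <directOrIndirectOwnership><value>I</value></directOrIndirectOwnership>
        <natureOfOwnership><value>By GRAT</value></natureOfOwnership>
      </ownershipNature>
    </nonDerivativeTransaction>
    <nonDerivativeTransaction>
      <transactionDate><value>2024-12-02</value></transactionDate>
      <transactionCoding><transactionCode>P</transactionCode></transactionCoding>
      <transactionAmounts><transactionShares><value>100</value></transactionShares></transactionAmounts>
    </nonDerivativeTransaction>
  </nonDerivativeTable>
</ownershipDocument>"""

GIFT_CODES = {"G", "W"}  # gift; will or laws of descent and distribution


def gift_transfers(xml_text):
    """Return code G / W rows with share count and ownership-nature fields."""
    root = ET.fromstring(xml_text)
    relationship = "reportingOwner/reportingOwnerRelationship/"
    rows = []
    for txn in root.findall("./nonDerivativeTable/nonDerivativeTransaction"):
        code = (txn.findtext(".//transactionCode") or "").strip()
        if code not in GIFT_CODES:
            continue
        shares = txn.findtext(".//transactionShares/value")
        rows.append({
            "code": code,
            "isDirector": root.findtext(relationship + "isDirector"),
            "isOfficer": root.findtext(relationship + "isOfficer"),
            "transactionDate": txn.findtext("transactionDate/value"),
            "shares": float(shares) if shares else None,
            "directOrIndirect": txn.findtext(".//directOrIndirectOwnership/value"),
            "nature": txn.findtext(".//natureOfOwnership/value"),
        })
    return rows


print(gift_transfers(SAMPLE_XML))
```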
For each accession, walk derivativeTable to collect securityTitle, conversionOrExercisePrice.value, exerciseDate.value, expirationDate.value, underlyingSecurity.underlyingSecurityShares.value, and postTransactionAmounts.sharesOwnedFollowingTransaction.value. Combine with reportingOwnerRelationship.officerTitle and periodOfReport. Output: a fiscal-year-end options, warrants, and RSU inventory per officer — used for overhang modeling, peer-company compensation benchmarking, and damages reconstructions in Section 16(b) disgorgement claims.
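The derivative-table walk might look like the sketch below. The FIELDS mapping, the output column names, and the sample document are assumptions made for illustration. Values are returned as the raw strings found in the XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed ownershipDocument used only for illustration.
SAMPLE_XML = """<ownershipDocument>
  <periodOfReport>2024-12-31</periodOfReport>
  <reportingOwner>
    <reportingOwnerRelationship><officerTitle>Chief Financial Officer</officerTitle></reportingOwnerRelationship>
  </reportingOwner>
  <derivativeTable>
    <derivativeTransaction>
      <securityTitle><value>Stock Option (right to buy)</value></securityTitle>
      <conversionOrExercisePrice><value>12.50</value></conversionOrExercisePrice>
      <exerciseDate><value>2025-01-15</value></exerciseDate>
      <expirationDate><value>2034-01-15</value></expirationDate>
      <underlyingSecurity><underlyingSecurityShares><value>10000</value></underlyingSecurityShares></underlyingSecurity>
      <postTransactionAmounts><sharesOwnedFollowingTransaction><value>10000</value></sharesOwnedFollowingTransaction></postTransactionAmounts>
    </derivativeTransaction>
  </derivativeTable>
</ownershipDocument>"""

# Output column name -> path relative to each derivativeTable entry.
FIELDS = {
    "securityTitle": "securityTitle/value",
    "exercisePrice": "conversionOrExercisePrice/value",
    "exerciseDate": "exerciseDate/value",
    "expirationDate": "expirationDate/value",
    "underlyingShares": "underlyingSecurity/underlyingSecurityShares/value",
    "sharesAfter": "postTransactionAmounts/sharesOwnedFollowingTransaction/value",
}


def derivative_inventory(xml_text):
    """Return one row per derivativeTable entry (transactions and holdings)."""
    root = ET.fromstring(xml_text)
    header = {
        "periodOfReport": root.findtext("periodOfReport"),
        "officerTitle": root.findtext(
            "reportingOwner/reportingOwnerRelationship/officerTitle"),
    }
    rows = []
    for entry in root.findall("./derivativeTable/*"):
        row = dict(header, entryType=entry.tag)
        for name, path in FIELDS.items():
            row[name] = entry.findtext(path)  # None when the element is absent
        rows.append(row)
    return rows


print(derivative_inventory(SAMPLE_XML))
```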
Use reportingOwner.reportingOwnerId.rptOwnerCik and issuer.issuerCik to join Form 5 records to the same individual's Form 4 and Form 3 filings. Flag Form 5 non-derivative and derivative transactions that were not previously reported on Form 4 by that insider for the same issuer and fiscal year. Output: a reconciled insider-transaction timeline that closes the gap between the real-time Form 4 feed and the annual true-up — the canonical input for point-in-time holdings panels and insider-sentiment factors.
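One way to flag Form 5 rows with no Form 4 counterpart is a set difference over a matching key. The sketch below assumes transactions have already been flattened into dictionaries. The key fields chosen in match_key and the sample rows are hypothetical, and a production matcher would likely need fuzzier rules (rounding, split adjustments, amended rows).

```python
def match_key(row):
    """Assumed identity of a transaction across Form 4 and Form 5."""
    return (
        row["issuerCik"],
        row["ownerCik"],
        row["transactionDate"],
        row["transactionCode"],
        row["shares"],
    )


def not_on_form4(form5_rows, form4_rows):
    """Return Form 5 rows whose key never appears among the Form 4 rows."""
    seen = {match_key(r) for r in form4_rows}
    return [r for r in form5_rows if match_key(r) not in seen]


# Hypothetical flattened rows for a single insider and issuer.
form4_rows = [
    {"issuerCik": "1", "ownerCik": "2", "transactionDate": "2024-05-01",
     "transactionCode": "S", "shares": 1000.0},
]
form5_rows = [
    {"issuerCik": "1", "ownerCik": "2", "transactionDate": "2024-05-01",
     "transactionCode": "S", "shares": 1000.0},
    {"issuerCik": "1", "ownerCik": "2", "transactionDate": "2024-11-20",
     "transactionCode": "G", "shares": 5000.0},
]

print(not_on_form4(form5_rows, form4_rows))
```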
Group records by (issuer.issuerCik, reportingOwner.reportingOwnerId.rptOwnerCik, periodOfReport) and use documentType = "5/A" plus dateOfOriginalSubmission to link each amendment back to its original Form 5. Diff the transaction tables and footnote text between versions to identify corrected share counts, added late transactions, revised ownership-nature descriptions, or withdrawn entries. Output: an amendment-history dataset for compliance-quality scoring, proxy advisor governance reports, and litigation diligence on specific insiders.
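The grouping and linking step can be sketched as follows. The record dictionaries, the filedDate field, and the placeholder accession numbers are hypothetical. The sketch also assumes dateOfOriginalSubmission equals the original filing's date exactly, which may not hold for every filing.

```python
from collections import defaultdict


def link_amendments(records):
    """Map each 5/A accession to the original Form 5 accession, or None."""
    groups = defaultdict(list)
    for rec in records:
        key = (rec["issuerCik"], rec["ownerCik"], rec["periodOfReport"])
        groups[key].append(rec)

    links = {}
    for members in groups.values():
        # Index originals in this group by their filing date.
        originals = {
            rec["filedDate"]: rec["accessionNo"]
            for rec in members
            if rec["documentType"] == "5"
        }
        for rec in members:
            if rec["documentType"] == "5/A":
                links[rec["accessionNo"]] = originals.get(
                    rec.get("dateOfOriginalSubmission"))
    return links


# Hypothetical records with placeholder accession numbers.
records = [
    {"accessionNo": "0000000000-25-000001", "documentType": "5",
     "issuerCik": "1", "ownerCik": "2", "periodOfReport": "2024-12-31",
     "filedDate": "2025-02-10"},
    {"accessionNo": "0000000000-25-000002", "documentType": "5/A",
     "issuerCik": "1", "ownerCik": "2", "periodOfReport": "2024-12-31",
     "filedDate": "2025-03-03", "dateOfOriginalSubmission": "2025-02-10"},
]

print(link_amendments(records))
```

Amendments that map to None are candidates for manual review or a looser fallback, such as the earliest Form 5 in the same group.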
Index <footnote> text across the full corpus, keyed by the transactionCode and transactionTimeliness values of the transactions that reference them. Retrieve precedent language for Rule 16a-6 de minimis exemptions, Rule 16b-3 employee benefit plan transactions, DRIP reinvestments (code A with price zero), indirect ownership disclaimers, and explanations for late filings. Output: a drafting-assistant corpus or RAG index that returns peer-filer wording for specific exempt-transaction scenarios.
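A minimal footnote index keyed by (transactionCode, transactionTimeliness) could be built like this. The sample document and the function name index_footnotes are illustrative assumptions. A real retrieval system would add tokenization or embeddings on top of this grouping.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical, trimmed ownershipDocument used only for illustration.
SAMPLE_XML = """<ownershipDocument>
  <nonDerivativeTable>
    <nonDerivativeTransaction>
      <transactionCoding><transactionCode>G</transactionCode><footnoteId id="F1"/></transactionCoding>
      <transactionTimeliness><value></value></transactionTimeliness>
    </nonDerivativeTransaction>
    <nonDerivativeTransaction>
      <transactionCoding><transactionCode>P</transactionCode></transactionCoding>
      <transactionTimeliness><value>L</value><footnoteId id="F2"/></transactionTimeliness>
    </nonDerivativeTransaction>
  </nonDerivativeTable>
  <footnotes>
    <footnote id="F1">Bona fide gift to a family trust.</footnote>
    <footnote id="F2">Inadvertently not reported on a timely Form 4.</footnote>
  </footnotes>
</ownershipDocument>"""

TRANSACTION_TAGS = {"nonDerivativeTransaction", "derivativeTransaction"}


def index_footnotes(xml_text, index=None):
    """Add footnote texts to index[(code, timeliness)] and return the index."""
    index = defaultdict(list) if index is None else index
    root = ET.fromstring(xml_text)
    notes = {f.get("id"): (f.text or "").strip() for f in root.iter("footnote")}
    for txn in root.iter():
        if txn.tag not in TRANSACTION_TAGS:
            continue
        code = (txn.findtext("transactionCoding/transactionCode") or "").strip()
        timeliness = (txn.findtext("transactionTimeliness/value") or "").strip()
        for ref in txn.iter("footnoteId"):
            text = notes.get(ref.get("id"))
            if text and text not in index[(code, timeliness)]:
                index[(code, timeliness)].append(text)
    return index


index = index_footnotes(SAMPLE_XML)
print(index[("P", "L")])
```

Passing the same index into repeated calls accumulates footnotes across many filings.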
Filter records where reportingOwnerRelationship.isTenPercentOwner = 1 (accepting both numeric and true/false encodings) and aggregate by issuer CIK, SIC code from entities, and periodOfReport. Track changes in postTransactionAmounts.sharesOwnedFollowingTransaction.value and any code G or J transactions through trusts or affiliated entities described in natureOfOwnership.value. Output: a block-holder activity feed complementing Schedule 13D/G monitoring, focused on exempt transfers invisible to the 13D/G regime.
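The flag handling for isTenPercentOwner can be sketched as a small helper that accepts both encodings mentioned above. The helper names and the _doc sample builder are hypothetical. The SIC join from entities is omitted because its location is not shown here.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def is_flag_set(text):
    """Treat '1' and 'true' (any case, padded or not) as a set flag."""
    return (text or "").strip().lower() in {"1", "true"}


def ten_percent_owner_counts(xml_docs):
    """Count filings with a ten-percent owner per (issuerCik, periodOfReport)."""
    counts = Counter()
    for xml_text in xml_docs:
        root = ET.fromstring(xml_text)
        if any(is_flag_set(f.text) for f in root.iter("isTenPercentOwner")):
            key = (root.findtext("issuer/issuerCik"), root.findtext("periodOfReport"))
            counts[key] += 1
    return counts


def _doc(issuer_cik, flag):
    """Build a hypothetical, trimmed ownershipDocument for the demo."""
    return (
        "<ownershipDocument><periodOfReport>2024-12-31</periodOfReport>"
        f"<issuer><issuerCik>{issuer_cik}</issuerCik></issuer>"
        "<reportingOwner><reportingOwnerRelationship>"
        f"<isTenPercentOwner>{flag}</isTenPercentOwner>"
        "</reportingOwnerRelationship></reportingOwner></ownershipDocument>"
    )


docs = [_doc("0000000001", "1"), _doc("0000000001", "true"), _doc("0000000002", "0")]
print(ten_percent_owner_counts(docs))
```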
The Form 5 Files Dataset is available through three access methods: a JSON metadata endpoint, a full-archive download, and individual monthly container downloads. Downloads require an sec-api.io API key, which can be passed either via the Authorization HTTP header or as a token query parameter. The dataset index JSON endpoint is public and does not require an API key.
Dataset Index JSON API: https://api.sec-api.io/datasets/form-5-files.json
This endpoint returns dataset-level metadata together with the full containers array listing every monthly archive, its size, record count, last-updated timestamp, and direct download URL. It can be polled to detect which monthly containers have changed in the most recent refresh and to drive incremental download workflows. No API key is required to call this endpoint.
{
  "datasetId": "1f13365b-9ade-61e1-a9a3-1c7a385da679",
  "datasetDownloadUrl": "https://api.sec-api.io/datasets/form-5-files.zip",
  "name": "Form 5 Files Dataset",
  "updatedAt": "2026-04-21T02:54:54.000Z",
  "earliestSampleDate": "1996-02-01",
  "totalRecords": 225750,
  "totalSize": 666115590,
  "formTypes": ["5", "5/A"],
  "containerFormat": "ZIP",
  "fileTypes": ["TXT", "JSON", "HTML", "PDF", "XML"],
  "containers": [
    {
      "downloadUrl": "https://api.sec-api.io/datasets/form-5-files/2026/2026-04.zip",
      "key": "2026/2026-04.zip",
      "size": 13818783,
      "records": 154,
      "updatedAt": "2026-04-21T02:54:54.000Z"
    }
  ]
}
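Polling the index for changed containers might be sketched as below. The function names and the idea of keeping a local last_seen mapping (container key to the updatedAt value at last download) are assumptions, not part of the API. fetch_index performs a network call and is defined but not invoked here; the demo runs on a trimmed copy of the response shown above.

```python
import json
import urllib.request

INDEX_URL = "https://api.sec-api.io/datasets/form-5-files.json"


def fetch_index(url=INDEX_URL):
    """Download and parse the dataset index (network call, no API key needed)."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def changed_containers(index, last_seen):
    """Return containers whose updatedAt differs from the locally stored value."""
    return [
        c for c in index["containers"]
        if last_seen.get(c["key"]) != c["updatedAt"]
    ]


# Trimmed copy of the example response, used instead of a live request.
sample_index = {
    "containers": [
        {
            "downloadUrl": "https://api.sec-api.io/datasets/form-5-files/2026/2026-04.zip",
            "key": "2026/2026-04.zip",
            "updatedAt": "2026-04-21T02:54:54.000Z",
        }
    ]
}

last_seen = {"2026/2026-04.zip": "2026-04-20T02:00:00.000Z"}  # hypothetical local state
for container in changed_containers(sample_index, last_seen):
    print("needs refresh:", container["key"])
```

After a successful download, the caller would store the container's new updatedAt value in last_seen.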
Download Entire Dataset: https://api.sec-api.io/datasets/form-5-files.zip?token=YOUR_API_KEY
Downloads the full dataset as a single ZIP archive covering all Form 5 and Form 5/A filings from February 1996 to the present. This endpoint requires a valid API key.
Download Single Container: https://api.sec-api.io/datasets/form-5-files/2025/2025-12.zip?token=YOUR_API_KEY
Downloads one monthly container archive instead of the full dataset. This is the preferred approach for incremental updates or scoped backfills, since container keys follow a predictable <year>/<year>-<month>.zip pattern and can be combined with the updatedAt values from the index JSON to pull only what has changed. This endpoint requires a valid API key.
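Because the key pattern is predictable, container URLs for a scoped backfill can be generated rather than looked up. The sketch below uses the token query parameter shown above. The function names are illustrative, and the actual download step is left out.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.sec-api.io/datasets/form-5-files"


def container_key(year, month):
    """Build the <year>/<year>-<month>.zip key for one monthly container."""
    return f"{year:04d}/{year:04d}-{month:02d}.zip"


def container_url(year, month, api_key):
    """Build the download URL, passing the API key as the token parameter."""
    return f"{BASE_URL}/{container_key(year, month)}?{urlencode({'token': api_key})}"


def month_range(start, end):
    """Yield (year, month) pairs from start to end, both inclusive."""
    year, month = start
    while (year, month) <= end:
        yield year, month
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)


for year, month in month_range((2025, 11), (2026, 2)):
    print(container_url(year, month, "YOUR_API_KEY"))
```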
The dataset covers SEC Form 5 and its amendment variant Form 5/A — the "Annual Statement of Changes in Beneficial Ownership of Securities" prescribed under Section 16(a) of the Securities Exchange Act of 1934 and Rules 16a-3(f) and 16a-8. The filings are submitted on EDGAR and share the ownershipDocument XML schema family used by Forms 3 and 4.
One record is a single Form 5 or Form 5/A submission filed on EDGAR, identified by its 18-digit SEC accession number. Each submission is stored as a per-accession folder inside a monthly ZIP container and includes the metadata.json envelope, the primary ownershipDocument XML, and the pre-rendered HTML view under xslF345X05/.
Form 5 is filed by Section 16(a) reporting persons: directors of a Section 12-registered issuer, officers meeting the functional definition in Rule 16a-1(f), and beneficial owners of more than 10% of a registered equity class. Issuers do not file Form 5 themselves; their counterpart obligation is the Item 405 disclosure in the proxy or 10-K.
Form 5 is due within 45 days after the issuer's fiscal year end, per Rule 16a-3(f) — around mid-February for calendar-year issuers. An insider who has no reportable items and timely filed every Form 4 may instead deliver the Rule 16a-3(f)(2) "no Form 5 required" representation to the issuer, in which case no EDGAR submission is generated and no record appears in this dataset.
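The 45-day rule translates into simple date arithmetic. The sketch below also rolls a weekend due date forward to the next weekday, which is an assumption about how such deadlines are usually treated; federal holidays are not handled, so results near a holiday should be checked by hand.

```python
from datetime import date, timedelta


def form5_due_date(fiscal_year_end):
    """Fiscal year end plus 45 days, rolled forward past Saturday/Sunday."""
    due = fiscal_year_end + timedelta(days=45)
    while due.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        due += timedelta(days=1)
    return due


def is_past_due(filed_on, fiscal_year_end):
    """True when the filing date falls after the computed due date."""
    return filed_on > form5_due_date(fiscal_year_end)


print(form5_due_date(date(2024, 12, 31)))  # calendar-year issuer
print(form5_due_date(date(2024, 6, 30)))   # June fiscal year end
```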
Form 4 is the real-time report of non-exempt insider transactions, due within two business days. Form 5 is the annual catch-up for transactions exempt from Form 4 (small acquisitions under Rule 16a-6, gifts, inheritances, certain Rule 16b-3 items) and for transactions that should have been on a timely Form 4 but were not. A transaction already reported on Form 4 is not repeated on Form 5.
The dataset includes every Form 5 and Form 5/A submission to EDGAR from February 1, 1996 through the current month, with monthly containers keyed as <year>/<year>-<month>.zip. Filings before the June 2003 EDGAR mandate under Section 403 of the Sarbanes-Oxley Act of 2002 are largely paper or ad hoc SGML; filings from mid-2003 onward conform to the structured ownershipDocument XML schema.
A typical accession folder contains metadata.json, the primary ownershipDocument XML at the folder root, and a same-named .xml-extension file inside xslF345X05/ that is actually an HTML rendering. The dataset index declares TXT, JSON, HTML, PDF, and XML as possible file types, and containers are distributed as ZIP archives. Image files from the original EDGAR submission are excluded by dataset policy.
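Walking a monthly container can be sketched with the standard zipfile module. The sketch assumes each accession folder sits at the top level of the ZIP; the accession number, file names, and file contents in the in-memory demo archive are placeholders. The looks_like_html check reflects the note above that the file under xslF345X05/ carries an .xml extension but holds HTML.

```python
import io
import zipfile
from collections import defaultdict


def group_by_accession(zf):
    """Map each top-level folder name to the relative paths it contains."""
    folders = defaultdict(list)
    for name in zf.namelist():
        if name.endswith("/"):  # skip explicit directory entries
            continue
        accession, _, rest = name.partition("/")
        folders[accession].append(rest)
    return dict(folders)


def looks_like_html(data):
    """Heuristic: an <html tag appears near the start of the payload."""
    return b"<html" in data[:512].lower()


# Build a tiny in-memory container with one hypothetical accession folder.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("000000000025000001/metadata.json", "{}")
    zf.writestr("000000000025000001/form5.xml", "<ownershipDocument/>")
    zf.writestr("000000000025000001/xslF345X05/form5.xml", "<html><body/></html>")

with zipfile.ZipFile(buffer) as zf:
    for accession, files in group_by_accession(zf).items():
        print(accession, sorted(files))
        rendered = zf.read(f"{accession}/xslF345X05/form5.xml")
        print("rendered view is HTML:", looks_like_html(rendered))
```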