The Form 4 Files Dataset contains the complete filing packages for every Form 4 and Form 4/A submitted to SEC EDGAR from January 1996 to the present. Each record represents a single EDGAR submission — one insider's Statement of Changes in Beneficial Ownership of Securities — and includes the primary XML ownership document, a filing-level metadata JSON file, a pre-rendered HTML view, and any exhibit attachments such as powers of attorney. Form 4 is filed by officers, directors, and ten-percent beneficial owners of issuers with a Section 12-registered equity class, within two business days of each reportable transaction in the issuer's equity securities. The dataset preserves each filing in its original structured form, supporting both machine extraction from XML and visual inspection through the rendered HTML.
The Form 4 Files Dataset is built from Form 4 and Form 4/A filings as submitted to EDGAR. Form 4 is the Statement of Changes in Beneficial Ownership of Securities, required under Section 16(a) of the Securities Exchange Act of 1934. It discloses transactions in an issuer's equity securities by corporate insiders: officers, directors, and beneficial owners of more than ten percent of a registered equity class. Form 4/A is an amendment that fully restates the original Form 4 disclosure.
The dataset spans all Form 4 and Form 4/A filings from January 1996 to the present. Filings before mid-2003 represent voluntary electronic submissions and do not capture all Form 4 filers from that period; from mid-2003 onward — when the SEC mandated electronic filing — coverage is essentially complete. The dataset is distributed as monthly ZIP containers, each containing one folder per filing. Each folder is named by the zero-padded, dash-stripped accession number (e.g., accession 0001185185-25-001587 becomes folder 000118518525001587).
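The folder-naming rule above can be sketched in a few lines of Python. The function names are illustrative and not part of the dataset; the format check assumes the usual 10-2-6 digit accession layout shown in the example.

```python
import re

def accession_to_folder(accession_no: str) -> str:
    """Convert a dash-delimited accession number to the record folder name."""
    if not re.fullmatch(r"\d{10}-\d{2}-\d{6}", accession_no):
        raise ValueError(f"unexpected accession number format: {accession_no!r}")
    return accession_no.replace("-", "")

def folder_to_accession(folder: str) -> str:
    """Restore the dash-delimited form from an 18-digit folder name."""
    if not re.fullmatch(r"\d{18}", folder):
        raise ValueError(f"unexpected folder name: {folder!r}")
    return f"{folder[:10]}-{folder[10:12]}-{folder[12:]}"
```

For the example in the text, accession_to_folder("0001185185-25-001587") yields the folder name 000118518525001587, and folder_to_accession reverses it.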
A single record in the Form 4 Files Dataset corresponds to one EDGAR submission of Form 4 or Form 4/A, identified by its accession number. Each record folder contains a metadata.json file with structured filing-index data, the primary XML ownership document, a pre-rendered HTML view of that document in an xslF345X05/ subdirectory, and occasionally one or more exhibit files (most commonly a power-of-attorney under Exhibit 24). Image files referenced in the original EDGAR submission are excluded. The record unit is the complete textual content of a single Form 4 or Form 4/A filing as submitted to EDGAR, minus graphic attachments.
Each record folder contains three to five files:
Always present:
metadata.json: filing-level metadata drawn from the EDGAR filing index, including accession number, form type, filing timestamp, period of report, entity identifiers, document inventory, and SEC.gov links.
Primary XML ownership document (form4.xml, ownership.xml, primary_doc.xml, primarydocument.xml, rdgdoc.xml, edgardoc.xml, marketforms-*.xml, wk-form4_*.xml, tm*_4seq1.xml, fp*_4.xml, or other filing-agent-specific names): the machine-readable ownership document conforming to the SEC's ownershipDocument XML schema.
xslF345X05/<same-filename>.xml: a pre-rendered HTML file produced by applying the SEC's F345X05 XSL stylesheet to the XML. Despite retaining a .xml extension, this file contains HTML markup (with a <!DOCTYPE html> declaration) and replicates the tabular layout of the official printed Form 4.
Occasionally present (roughly 7% of filings):
Exhibit 24 files in .htm, .html, or .txt format, containing a limited power of attorney authorizing a named individual to execute and file Section 16 forms on the reporting person's behalf. These files are wrapped in the SEC's SGML <DOCUMENT> envelope carrying <TYPE>, <SEQUENCE>, <FILENAME>, and optionally <DESCRIPTION> headers before the <TEXT> block with the exhibit content.
Excluded:
Image files (.jpg, .gif, .png) referenced in the EDGAR submission, typically signature images, letterheads, or notary seals embedded in power-of-attorney exhibits, appear in metadata.json document listings but are not included in the dataset.
metadata.json
The metadata.json file provides the filing-level index record. Key fields:
Top-level scalars. formType — "4" or "4/A". accessionNo — the SEC accession number in dash-delimited form. id — a 32-character hexadecimal identifier. filedAt — ISO 8601 timestamp with timezone offset recording when the filing was accepted by EDGAR. periodOfReport — the transaction date or report period in YYYY-MM-DD format. description — human-readable filing description; amendments typically include "[Amend]". linkToFilingDetails — URL to the XSL-rendered view on SEC.gov. linkToTxt — URL to the full-text SGML submission file. linkToHtml — URL to the EDGAR filing index page. linkToXbrl — always an empty string, since Form 4 is not filed in XBRL.
Entities array. The entities array typically contains two objects: the issuer and the reporting owner, distinguished by the "(Issuer)" or "(Reporting)" suffix on companyName. Element ordering within the array is not guaranteed. The issuer entity carries cik, sic (SIC code with industry description), irsNo, stateOfIncorporation, fiscalYearEnd (in MMDD format), and a tickers array listing trading symbols (absent or empty for issuers without listed equity). The reporting-owner entity carries cik, act (typically "34" for the Exchange Act), fileNo, type (form type), and filmNo.
Document inventory. The documentFormatFiles array lists all documents in the submission. Each entry includes sequence, size (in bytes; may be a blank string for the XSL-rendered view), documentUrl, type (e.g., "4", "EX-24", "GRAPHIC"), and description. GRAPHIC entries appear in this inventory even though the image files are excluded from the dataset. The complete submission text file appears as an entry with blank sequence and type. The dataFiles and seriesAndClassesContractsInformation arrays are always empty for Form 4 filings.
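A minimal sketch of reading these fields in Python follows. The sample record is hypothetical (made-up names, CIKs, and URLs) and only mirrors the field names described above; the helper names are illustrative. Because element order in the entities array is not guaranteed, the issuer and reporting owners are separated by the companyName suffix rather than by position.

```python
import json

# Hypothetical metadata.json excerpt; every value here is made up.
SAMPLE_METADATA = json.loads("""
{
  "formType": "4",
  "accessionNo": "0000000000-25-000001",
  "entities": [
    {"companyName": "Doe Jane (Reporting)", "cik": "2"},
    {"companyName": "Example Corp (Issuer)", "cik": "1", "tickers": ["EXMP"]}
  ],
  "documentFormatFiles": [
    {"sequence": "1", "type": "4", "documentUrl": "https://example.invalid/form4.xml"},
    {"sequence": "2", "type": "EX-24", "documentUrl": "https://example.invalid/poa.htm"},
    {"sequence": "3", "type": "GRAPHIC", "documentUrl": "https://example.invalid/sig.jpg"},
    {"sequence": "", "type": "", "documentUrl": "https://example.invalid/full.txt"}
  ]
}
""")

def split_entities(metadata):
    """Return (issuer, reporting_owners) using the companyName suffix."""
    issuer, owners = None, []
    for entity in metadata.get("entities", []):
        name = entity.get("companyName", "")
        if name.endswith("(Issuer)"):
            issuer = entity
        elif name.endswith("(Reporting)"):
            owners.append(entity)
    return issuer, owners

def non_graphic_documents(metadata):
    """Inventory entries other than GRAPHIC, which are listed but not shipped."""
    return [d for d in metadata.get("documentFormatFiles", [])
            if d.get("type") != "GRAPHIC"]
```

The entry with blank sequence and type (the complete submission text file) is kept by non_graphic_documents; callers that only want typed documents can filter on a non-empty type.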
The core of each Form 4 Files Dataset record is the XML ownership document, conforming to the SEC's ownershipDocument schema (version X0508 in current filings). It contains all substantive Form 4 disclosure, organized into the following sections in document order:
Schema version and document header. schemaVersion identifies the XML schema version. documentType is 4 or 4/A. periodOfReport states the transaction date. notSubjectToSection16 is a boolean flag (0 or 1) that corresponds to the form's "no longer subject to Section 16" checkbox, set when the reporting person's Section 16 status has ended. These elements appear at the top of the document, before the issuer block.
Issuer block. The issuer element contains issuerCik (zero-padded CIK), issuerName, and issuerTradingSymbol. The trading symbol may be absent, empty, or contain placeholder text for issuers without a listed ticker.
Reporting owner block. The reportingOwner element nests three sub-blocks. reportingOwnerId contains rptOwnerCik and rptOwnerName. reportingOwnerAddress provides street lines, city, state, zip code, and an optional rptOwnerStateDescription (used for foreign addresses). reportingOwnerRelationship contains four boolean flags — isDirector, isOfficer, isTenPercentOwner, isOther — plus officerTitle when the person is an officer (e.g., "Chairman, CEO, and Secretary") and otherText when isOther is set. Multiple reportingOwner blocks appear in joint filings, though these are uncommon on Form 4.
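The issuer and reporting-owner blocks can be read with Python's standard xml.etree.ElementTree, as in the sketch below. The sample document is hypothetical and heavily trimmed. The flag parser accepts "true" as well as "1"; that is an assumption that goes beyond the 0/1 encoding described above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed ownershipDocument; names and CIKs are made up.
SAMPLE_XML = """
<ownershipDocument>
  <schemaVersion>X0508</schemaVersion>
  <documentType>4</documentType>
  <issuer>
    <issuerCik>0000000001</issuerCik>
    <issuerName>Example Corp</issuerName>
    <issuerTradingSymbol>EXMP</issuerTradingSymbol>
  </issuer>
  <reportingOwner>
    <reportingOwnerId>
      <rptOwnerCik>0000000002</rptOwnerCik>
      <rptOwnerName>Doe Jane</rptOwnerName>
    </reportingOwnerId>
    <reportingOwnerRelationship>
      <isDirector>1</isDirector>
      <isOfficer>true</isOfficer>
      <officerTitle>CFO</officerTitle>
    </reportingOwnerRelationship>
  </reportingOwner>
</ownershipDocument>
"""

def flag(parent, tag):
    """Read a boolean flag; a missing element counts as False."""
    if parent is None:
        return False
    text = (parent.findtext(tag) or "").strip().lower()
    return text in {"1", "true"}  # "true" is an assumed alternative encoding

def parse_owners(root):
    """Return one dict per reportingOwner block (joint filings have several)."""
    owners = []
    for owner in root.findall("reportingOwner"):
        rel = owner.find("reportingOwnerRelationship")
        owners.append({
            "cik": owner.findtext("reportingOwnerId/rptOwnerCik"),
            "name": owner.findtext("reportingOwnerId/rptOwnerName"),
            "is_director": flag(rel, "isDirector"),
            "is_officer": flag(rel, "isOfficer"),
            "is_ten_percent_owner": flag(rel, "isTenPercentOwner"),
            "is_other": flag(rel, "isOther"),
            "officer_title": rel.findtext("officerTitle") if rel is not None else None,
        })
    return owners
```

Returning a list rather than a single owner keeps joint filings intact, so downstream code can decide how to attribute transactions.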
Rule 10b5-1 flag. The aff10b5One element (0 or 1) indicates whether reported transactions were effected pursuant to a Rule 10b5-1(c) trading plan. This element appears between the reporting-owner block and the transaction tables. It was introduced in 2023 and is absent in older filings.
Non-derivative table. The nonDerivativeTable contains zero or more nonDerivativeTransaction and nonDerivativeHolding entries. Each transaction includes: securityTitle (e.g., "Common Stock"), transactionDate, optional deemedExecutionDate, transactionCoding (with transactionFormType, transactionCode, and equitySwapInvolved), optional transactionTimeliness, transactionAmounts (with transactionShares, transactionPricePerShare, and transactionAcquiredDisposedCode — A for acquired, D for disposed), postTransactionAmounts (with sharesOwnedFollowingTransaction), and ownershipNature (with directOrIndirectOwnership — D for direct, I for indirect — and optionally natureOfOwnership describing the indirect arrangement). Standard transaction codes include P (purchase), S (sale), A (grant/award), M (exercise/conversion of derivative), F (tax-withholding disposition), G (gift), J (other), C (conversion), among others. Holding entries report positions without an associated transaction and lack transaction-specific fields. Individual value elements may carry both a <value> child and one or more <footnoteId> references.
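As a rough illustration of the structure above, the following Python sketch flattens nonDerivativeTransaction entries into dictionaries. The sample XML is hypothetical. The code assumes that transactionCode holds its text directly while the other fields wrap it in a <value> child; holdings are skipped because they lack the transaction-specific fields.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal, InvalidOperation

# Hypothetical, trimmed sample with one purchase and one holding line.
SAMPLE_XML = """
<ownershipDocument>
  <nonDerivativeTable>
    <nonDerivativeTransaction>
      <securityTitle><value>Common Stock</value></securityTitle>
      <transactionDate><value>2025-03-06</value></transactionDate>
      <transactionCoding><transactionCode>P</transactionCode></transactionCoding>
      <transactionAmounts>
        <transactionShares><value>1000</value></transactionShares>
        <transactionPricePerShare><value>12.50</value><footnoteId id="F1"/></transactionPricePerShare>
        <transactionAcquiredDisposedCode><value>A</value></transactionAcquiredDisposedCode>
      </transactionAmounts>
      <postTransactionAmounts>
        <sharesOwnedFollowingTransaction><value>5000</value></sharesOwnedFollowingTransaction>
      </postTransactionAmounts>
      <ownershipNature>
        <directOrIndirectOwnership><value>D</value></directOrIndirectOwnership>
      </ownershipNature>
    </nonDerivativeTransaction>
    <nonDerivativeHolding>
      <securityTitle><value>Common Stock</value></securityTitle>
    </nonDerivativeHolding>
  </nonDerivativeTable>
</ownershipDocument>
"""

def value_of(node, path):
    """Text of the <value> child under path, or None if absent or blank."""
    text = node.findtext(f"{path}/value")
    return text.strip() if text and text.strip() else None

def to_decimal(text):
    """Parse a numeric field; None when missing or not a number."""
    try:
        return Decimal(text) if text is not None else None
    except InvalidOperation:
        return None

def parse_non_derivative(root):
    rows = []
    for tx in root.findall("nonDerivativeTable/nonDerivativeTransaction"):
        rows.append({
            "security": value_of(tx, "securityTitle"),
            "date": value_of(tx, "transactionDate"),
            # Assumption: transactionCode carries its text directly, no <value>.
            "code": tx.findtext("transactionCoding/transactionCode"),
            "shares": to_decimal(value_of(tx, "transactionAmounts/transactionShares")),
            "price": to_decimal(value_of(tx, "transactionAmounts/transactionPricePerShare")),
            "acquired_disposed": value_of(tx, "transactionAmounts/transactionAcquiredDisposedCode"),
            "owned_after": to_decimal(value_of(tx, "postTransactionAmounts/sharesOwnedFollowingTransaction")),
            "ownership": value_of(tx, "ownershipNature/directOrIndirectOwnership"),
        })
    return rows
```

Decimal is used instead of float so that share counts and prices keep the precision given in the filing.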
Derivative table. The derivativeTable contains zero or more derivativeTransaction and derivativeHolding entries. These carry the same transactional fields as non-derivative entries plus derivative-specific elements: conversionOrExercisePrice, exerciseDate, expirationDate, and an underlyingSecurity block (with underlyingSecurityTitle and underlyingSecurityShares). Derivative fields frequently contain only a <footnoteId> reference instead of a literal <value>, particularly for exercise dates, expiration dates, and conversion prices that depend on vesting schedules or plan terms.
Footnotes. The footnotes element contains footnote elements keyed by id attributes (F1, F2, F3, etc.). These provide narrative explanations referenced throughout the transaction and holding entries — describing vesting conditions, indirect ownership arrangements, plan-based transaction details, price computation methods, tax-withholding mechanics, and other qualifications. Footnotes are a critical interpretive layer: many derivative-table fields contain only a footnote reference rather than a literal value, making footnote resolution essential for meaningful data extraction.
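Footnote resolution can be sketched as a lookup table plus a small accessor that returns both the literal value (if any) and the referenced footnote texts. The sample XML is hypothetical, and the code assumes each <footnoteId> reference carries the footnote key in an id attribute.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed sample: exerciseDate holds only a footnote reference.
SAMPLE_XML = """
<ownershipDocument>
  <derivativeTable>
    <derivativeTransaction>
      <securityTitle><value>Stock Option (Right to Buy)</value></securityTitle>
      <conversionOrExercisePrice><value>10.00</value></conversionOrExercisePrice>
      <exerciseDate><footnoteId id="F1"/></exerciseDate>
      <expirationDate><value>2033-02-01</value></expirationDate>
    </derivativeTransaction>
  </derivativeTable>
  <footnotes>
    <footnote id="F1">The option vests in four equal
      annual installments.</footnote>
  </footnotes>
</ownershipDocument>
"""

def footnote_map(root):
    """Map footnote ids (F1, F2, ...) to whitespace-normalized text."""
    return {
        fn.get("id"): " ".join("".join(fn.itertext()).split())
        for fn in root.findall("footnotes/footnote")
    }

def resolve(field, notes):
    """Return (value, footnote_texts) for a field that may hold either or both."""
    if field is None:
        return None, []
    value = field.findtext("value")
    # Assumption: the reference key lives in the footnoteId "id" attribute.
    texts = [notes.get(ref.get("id"), "") for ref in field.findall("footnoteId")]
    return (value.strip() if value else None), texts
```

Keeping the value and the footnote texts side by side lets a pipeline store a null exercise date together with the narrative that explains it, rather than dropping the row.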
Owner signature. The ownerSignature element contains signatureName (typically /s/ FIRSTNAME LASTNAME, or with "by" attribution when signed under power of attorney) and signatureDate. Multiple signature blocks appear when multiple reporting owners are listed.
Remarks. An optional remarks element may contain free-text commentary; it is rarely populated.
The xslF345X05/ subdirectory holds a copy of the primary document that has been transformed through the SEC's F345X05 XSL stylesheet into a complete HTML page. The file retains the original .xml filename but contains HTML content. It presents the Form 4 data in the standard SEC tabular layout: a header block with issuer and reporting-person identification, Table I (Non-Derivative Securities Acquired, Disposed of, or Beneficially Owned), Table II (Derivative Securities Acquired, Disposed of, or Beneficially Owned), footnotes, and the signature block. The rendered view is useful for visual inspection but contains no data beyond what the raw XML provides.
Approximately 7% of Form 4 filings include Exhibit 24 (EX-24) attachments — limited powers of attorney authorizing a named individual to execute and file Section 16 forms on the reporting person's behalf. The exhibit content is typically a short legal instrument naming the grantor, the attorney-in-fact, the scope of authority (limited to Section 16 filings), the grantor's signature, and in many cases a notarial acknowledgment with notary signature and commission expiration date. Powers of attorney may cover a single filing or authorize ongoing filings for a stated period.
Each exhibit file is wrapped in the SEC's SGML <DOCUMENT> envelope with <TYPE>, <SEQUENCE>, <FILENAME>, and <TEXT> tags. The exhibit body within <TEXT> is usually HTML or plain text.
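One way to unwrap the envelope is a pair of regular expressions, sketched below in Python. The sample exhibit is hypothetical, and the code assumes the body is closed by a </TEXT> tag; if no closing tag is found, everything after <TEXT> is treated as the body.

```python
import re

# Hypothetical exhibit file content; the power-of-attorney text is made up.
SAMPLE_EXHIBIT = """<DOCUMENT>
<TYPE>EX-24
<SEQUENCE>2
<FILENAME>poa.htm
<DESCRIPTION>POWER OF ATTORNEY
<TEXT>
<html><body><p>LIMITED POWER OF ATTORNEY</p></body></html>
</TEXT>
</DOCUMENT>
"""

HEADER = re.compile(r"^<(TYPE|SEQUENCE|FILENAME|DESCRIPTION)>(.*)$", re.MULTILINE)

def parse_exhibit(raw):
    """Split an SGML-wrapped exhibit into header fields and body text."""
    head, _, rest = raw.partition("<TEXT>")
    fields = {name.lower(): value.strip() for name, value in HEADER.findall(head)}
    # Assumption: the body ends at </TEXT>; otherwise keep the remainder.
    body = rest.split("</TEXT>", 1)[0]
    fields["body"] = body.strip()
    return fields
```

Header tags are searched only in the part before <TEXT>, so a stray tag-like line inside the exhibit body cannot overwrite the real header values.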
Other exhibit types are rare on Form 4. The dataset excludes GRAPHIC-type attachments (signature images, letterheads, notary seals) that occasionally accompany power-of-attorney exhibits.
Pre-XML era (1996 to mid-2003). Form 4 was originally filed as flat text or HTML without structured XML. Filings from 1996 through approximately June 2003 are plain-text or HTML submissions presenting ownership-change data in unstructured or semi-structured tabular formats. These filings lack the ownershipDocument XML schema and require text or HTML parsing for field-level extraction.
Mandatory XML schema (June 30, 2003 onward). The SEC's mandate for electronic filing of Section 16 forms introduced the ownershipDocument XML schema. From this date forward, Form 4 filings are structured XML, enabling direct machine extraction of transaction data, security titles, share counts, prices, ownership types, and relationship flags. The schema has been revised incrementally through versions X0306, X0407, and X0508, each adding or refining elements.
Rule 10b5-1 disclosure (2023). Amendments to Section 16 rules effective in 2023, implementing the SEC's December 2022 rulemaking on insider trading arrangements, added the aff10b5One element. Filings before this rule change do not contain the element.
XSL stylesheet versions. The stylesheet identifier in the filing path has changed across schema versions (xslF345X02, xslF345X03, xslF345X05, etc.), with xslF345X05 current. Each version corresponds to layout refinements in the rendered HTML.
Filing-agent naming conventions. The primary XML filename varies by filing agent and era. Common patterns include form4.xml, ownership.xml, primary_doc.xml, primarydocument.xml, rdgdoc.xml, edgardoc.xml, marketforms-*.xml, wk-form4_*.xml (Workiva), tm*_4seq1.xml (Toppan Merrill), and fp*_4.xml (FilingPoint). This variation is cosmetic and does not affect XML content or schema conformance.
Amendments are full restatements. A Form 4/A replaces the entire original disclosure, not just the changed fields. The dataset stores each amendment as a separate record with its own accession number. The amendment XML does not contain a pointer to the original filing's accession number; linking amendments to originals requires matching on issuer CIK, reporting-owner CIK, and period of report.
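The matching rule can be sketched as a grouping step. The record layout below (form_type, accession_no, issuer_cik, owner_cik, period_of_report, filed_at) is a hypothetical flattened form of fields from metadata.json and the XML, not a structure the dataset ships. The function returns candidate originals rather than a single match, since the key is not guaranteed to be unique.

```python
from collections import defaultdict
from datetime import datetime

def link_amendments(records):
    """Map each 4/A accession number to earlier Form 4s sharing its key."""
    groups = defaultdict(list)
    for rec in records:
        key = (rec["issuer_cik"], rec["owner_cik"], rec["period_of_report"])
        groups[key].append(rec)

    links = {}
    for group in groups.values():
        originals = [r for r in group if r["form_type"] == "4"]
        for rec in group:
            if rec["form_type"] != "4/A":
                continue
            filed = datetime.fromisoformat(rec["filed_at"])
            earlier = [o for o in originals
                       if datetime.fromisoformat(o["filed_at"]) < filed]
            earlier.sort(key=lambda o: datetime.fromisoformat(o["filed_at"]))
            links[rec["accession_no"]] = [o["accession_no"] for o in earlier]
    return links
```

An amendment that corrects the period of report itself will not match on this key, so unmatched amendments (empty candidate lists) deserve a manual or fuzzier second pass.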
Footnote-dependent fields. Many fields in the derivative table — particularly conversionOrExercisePrice, exerciseDate, and expirationDate — contain only a <footnoteId> element without a <value>. Non-derivative fields such as transactionShares, transactionPricePerShare, and sharesOwnedFollowingTransaction may also carry supplementary footnote references alongside their literal values. Extracting complete data from derivative transactions requires resolving footnote references to their narrative text.
Multiple transactions per filing. A single Form 4 may report multiple transactions across both tables. Each transaction is a separate XML element within its table. Transactions within one filing may span different dates, different security titles, and different transaction codes.
Direct versus indirect ownership. The directOrIndirectOwnership field distinguishes direct holdings (D) from indirect holdings (I). Indirect holdings include shares held through a spouse, trust, partnership, or other entity, with natureOfOwnership describing the arrangement. A single filing may report both direct and indirect positions in the same security as separate transaction or holding lines.
Ticker availability. The tickers array in issuer metadata and issuerTradingSymbol in the XML are populated when the issuer has listed equity. For issuers without a listed class — certain investment funds, private issuers with registered debt, or delisting-stage issuers — these fields may be absent, empty, or contain placeholder text such as "N/A" or "None". The tickers array may be omitted entirely from the issuer entity in metadata.json rather than appearing as an empty array.
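A small normalization step along these lines can be sketched in Python. The placeholder set contains only the examples named above plus the empty string; real data may need more entries, and the helper names are illustrative.

```python
# Placeholder strings treated as "no ticker"; extend as needed (assumption).
PLACEHOLDERS = {"", "N/A", "NONE"}

def clean_ticker(raw):
    """Return an upper-cased ticker, or None for missing or placeholder text."""
    if raw is None:
        return None
    ticker = raw.strip().upper()
    return None if ticker in PLACEHOLDERS else ticker

def issuer_ticker(issuer_entity, xml_symbol):
    """Prefer the metadata tickers array, then fall back to issuerTradingSymbol."""
    candidates = list(issuer_entity.get("tickers") or []) + [xml_symbol]
    for candidate in candidates:
        ticker = clean_ticker(candidate)
        if ticker:
            return ticker
    return None
```

Using .get("tickers") with a fallback covers both cases described above: the key missing entirely and the key present with an empty array.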
File sizes. Primary XML files are typically 2 to 10 KB. Metadata JSON files run 1 to 3 KB. Power-of-attorney exhibits range from 3 to 8 KB. The XSL-rendered HTML views are similar in size to the primary XML. Records are lightweight; aggregate dataset size reflects the volume of filings rather than per-record bulk.
The reporting person — not the issuer — files Form 4 on EDGAR. The obligation arises under Section 16(a) of the Securities Exchange Act of 1934 and applies to three classes of corporate insiders of issuers with a Section 12-registered equity class:
Officers. As defined in Rule 16a-1(f): the president, principal financial officer, principal accounting officer, any vice president heading a principal business unit or function, and any other officer performing a policy-making function. This is narrower than general corporate usage; not every person with an officer title is a Section 16 reporting person.
Directors. Every board member, including directors of a corporate general partner of a limited partnership issuer or of a managing member entity.
Ten-percent beneficial owners. Any person — natural or legal — beneficially owning more than ten percent of any Section 12-registered equity class. Beneficial ownership for threshold purposes is measured under Section 13(d). Groups acting together may trigger the threshold on aggregate holdings.
In practice, filings are often prepared by the issuer's counsel or a filing agent under power of attorney, but the legal obligation remains the reporting person's. Each filing is indexed on EDGAR under the reporting person's CIK, with the issuer identified by its own CIK, name, and ticker within the document.
Form 4 applies only to securities of issuers with a Section 12-registered equity class — primarily companies listed on NYSE, Nasdaq, or other national securities exchanges, plus companies registered under Section 12(g). The issuer is identified in the filing but is never the filer.
Excluded from the Form 4 regime:
Form 4 is event-driven, not periodic. A filing is triggered by any change in the reporting person's beneficial ownership of the issuer's equity securities, including:
Transactions exempt from short-swing profit liability under Section 16(b) — such as Rule 16b-3 employee benefit plan acquisitions — still require Form 4 reporting. The Sarbanes-Oxley Act of 2002 moved most formerly deferrable transactions from Form 5 to Form 4; only a narrow residual category (certain small acquisitions, certain exempt transactions) remains eligible for deferred annual reporting on Form 5.
Form 4 must be filed before the end of the second business day after the transaction date (trade date, not settlement date). This accelerated deadline took effect under the Sarbanes-Oxley Act, replacing the former tenth-of-the-following-month rule.
Multiple transactions in one issuer's securities, including transactions on different dates, can be reported on a single filing as long as each is reported within its own deadline. Transactions in different issuers require separate filings.
Form 4 is part of the Section 16 form series:
This dataset covers only Form 4 and Form 4/A (amendment) filings.
Form 4/A corrects a prior Form 4 — fixing transaction dates, share amounts, prices, or ownership figures. Each amendment contains the complete corrected disclosure, not just changed fields. Both originals and amendments appear in the dataset as separate records.
Form 4 sits within the SEC's ownership-disclosure regime alongside several overlapping filing types. The closest comparisons are Forms 3 and 5 (the other Section 16 forms), Schedules 13D/13G (large beneficial ownership), Form 144 (notice of proposed sale), and structured transaction extracts derived from Form 4 XML itself.
Form 3 shares Form 4's filer population (officers, directors, ten-percent holders), XML schema, and document structure. The sole difference is trigger: Form 3 is filed once, within ten days of becoming a Section 16 insider, and reports only existing holdings at that moment — no transactions. Form 4 is filed within two business days of each subsequent reportable transaction. Form 3 establishes the baseline; Form 4 records every change after it. Form 3 filings are sparse (one per insider-issuer relationship), while a single insider may generate dozens of Form 4 filings over time.
Form 5 uses the same XML schema and filer population as Forms 3 and 4. It captures transactions eligible for deferred reporting (small acquisitions, inheritance transfers, certain benefit-plan transactions, and, before the 2023 rule change that moved them to Form 4, bona fide gifts) and is filed within 45 days of the issuer's fiscal year-end.
Since the SEC shortened the Form 4 deadline to two business days in 2002 and narrowed deferral eligibility, Form 5 volume has dropped substantially. Form 4 now captures nearly all insider transactions in near-real time. Relying solely on the Form 4 dataset misses only a small residual of deferred items that appear exclusively on Form 5.
Schedules 13D/13G overlap with Form 4 only for the narrow subset of filers who are both Section 16 insiders and holders above five percent. The two regimes differ on every major dimension:
Form 4 delivers high-frequency transactional granularity; Schedules 13D/13G deliver ownership-level snapshots with qualitative context about intent and control.
Form 144 is filed by affiliates and insiders intending to sell restricted or control securities under Rule 144. Many planned insider sales trigger both a Form 144 (filed at or before the sale) and a Form 4 (filed within two business days of execution). The key distinction is temporal: Form 144 is a forward-looking notice of intent; Form 4 is a backward-looking report of a completed transaction. Form 144 also covers restricted-security sales by affiliates who may lack Section 16 status and thus never file Form 4. Form 4 is the more reliable record of actual transactions; Form 144 signals planned but not necessarily executed dispositions.
Some data providers parse Form 4 XML into flat tabular records (insider name, ticker, transaction date, code, shares, price). The Form 4 Files Dataset differs by preserving the complete filing package: raw XML, metadata JSON, rendered HTML, and exhibits such as powers of attorney. This retains footnotes, derivative-table structures, multi-transaction filing context, amendment chains, and full reporting-owner metadata. Tabular extracts are more convenient for quantitative screening; the full-file dataset is necessary when footnote text, exhibit content, or original document structure matters.
The Form 4 Files Dataset serves professionals who track, analyze, or act on insider activity across the full spectrum of Section 16 reporting persons.
Quant teams use Form 4 data as a signal source. They extract transaction dates, transaction codes (open-market purchases, dispositions, option exercises), share volumes, and prices to build insider-sentiment indicators at the issuer, sector, or market level. The reporting person's role — CEO, CFO, director, or large holder — drives signal weighting, since C-suite purchases are typically treated as more informative than routine option exercises. The full 1996-to-present history and structured XML support clean backtesting with minimal manual cleanup.
Equity analysts at asset managers and sell-side firms monitor Form 4 filings for companies they cover. A cluster of insider purchases by multiple officers reinforces a bullish thesis; large CEO or CFO dispositions near earnings dates prompt scrutiny. Key fields: transaction type, dollar size relative to the insider's existing holdings, and post-transaction ownership totals. These feed into investment notes, earnings previews, and recommendation changes.
Event-driven analysts use the dataset to detect early accumulation or disposition by ten-percent holders. A series of open-market purchases disclosed on Form 4 may precede a 13D filing or activist campaign. They track the reporting person's identity, cumulative ownership changes over rolling windows, and the issuer's share count to estimate evolving stakes. Historical depth supports pattern recognition across prior contested situations.
Disclosure counsel review Section 16 filing patterns for clients and counterparties. They verify two-business-day filing deadlines, check whether amendments corrected material errors, and assess consistency with insider trading policies and Rule 10b5-1 plans. In enforcement defense or internal investigations, lawyers reconstruct full trading timelines by pulling every Form 4 for a reporting person or issuer. Transaction codes, footnotes, and amendment history are the critical fields.
Governance teams at institutional asset managers and proxy advisory firms examine insider transaction patterns to assess board and management alignment with shareholders. They focus on the relationship field (officer, director, or large holder), transaction direction and size, and post-transaction holdings to gauge ongoing equity exposure. The analysis feeds proxy voting recommendations, engagement agendas, and governance scoring models.
Broker-dealer compliance departments and exchange surveillance units cross-reference Form 4 filings against unusual trading activity. When a suspicious price move occurs, they check transaction dates, codes, and reporting-person roles for any insider trades within the relevant window. The dataset also supports routine monitoring for late filings and Section 16(a) noncompliance patterns.
Data engineering teams at financial data vendors and institutional research platforms ingest the full dataset to build structured insider-transaction databases. They parse XML submissions to extract transaction tables, reporting-person metadata, issuer identifiers (CIK, ticker), and footnote text, then normalize and deduplicate across originals and amendments. The resulting cleaned data powers screening tools, alerting systems, and API endpoints consumed by analysts and compliance teams.
Forensic teams reconstruct insider trading timelines during fraud investigations, SEC enforcement matters, and shareholder litigation. They look for abnormal selling ahead of negative earnings surprises, restatements, or regulatory actions. Transaction price, date, and volume fields enable precise calculation of insider profits or avoided losses. Amendments (Form 4/A) are especially relevant, since corrections to previously reported transactions can indicate disclosure problems.
Diligence teams review Form 4 filings for a target company's insiders to identify unusual transaction activity before deal announcement. They also verify timeliness and accuracy of Section 16 reporting — late or amended filings can signal weak internal controls. Post-transaction holdings help estimate management equity stakes relevant to rollover and retention negotiations.
IR professionals at public companies monitor their own insiders' filings for accuracy and timeliness, and review peer-company insider activity to anticipate market interpretation. Before earnings calls or investor meetings, they track recent insider sales likely to draw analyst questions, focusing on transaction type, timing, size, and any footnote indicating a 10b5-1 plan.
Finance and economics researchers use nearly three decades of insider transaction records to study information asymmetry, market efficiency, executive compensation, and governance. Structured fields — dates, prices, volumes, transaction codes, and reporting-person roles — support panel datasets, event studies, and natural-experiment designs around Section 16 regulatory changes.
Teams building language-model applications for financial use cases use the dataset as a structured-text corpus for training, fine-tuning, and retrieval pipelines. Filings combine tabular transaction data with free-text footnotes, supporting entity extraction, transaction classification, and question-answering over insider-activity records. The mix of XML, HTML, and plain-text formats across millions of filings provides both structured grounding fields and unstructured text for model development.
Equity analysts and quant teams screen for multiple officers or directors purchasing shares of the same issuer within a short window. The reportingOwnerRelationship flags (isDirector, isOfficer, officerTitle) identify each buyer's role, while transactionCode isolates open-market purchases (P) from grants and option exercises. Aggregating transactionShares and transactionPricePerShare across filings for a single issuer CIK over rolling periods produces a cluster score that feeds buy-side conviction models or systematic trading signals.
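One possible form of such a cluster score is sketched below in Python. The input layout (issuer_cik, owner_cik, date, code, shares, price) is a hypothetical flattened record, and the scoring rule (most distinct buyers in any trailing window, with dollar value as a tie-breaker) is only one of many reasonable definitions, not a method prescribed by the dataset.

```python
from collections import defaultdict
from datetime import date, timedelta

def cluster_scores(transactions, window_days=30):
    """Per issuer, find the trailing window with the most distinct buyers."""
    buys_by_issuer = defaultdict(list)
    for tx in transactions:
        if tx["code"] == "P":  # open-market purchases only
            buys_by_issuer[tx["issuer_cik"]].append(tx)

    scores = {}
    for issuer, buys in buys_by_issuer.items():
        best = (0, 0.0)
        for anchor in buys:
            start = anchor["date"] - timedelta(days=window_days)
            window = [b for b in buys if start <= b["date"] <= anchor["date"]]
            buyers = {b["owner_cik"] for b in window}
            value = sum(b["shares"] * b["price"] for b in window)
            best = max(best, (len(buyers), value))
        scores[issuer] = {"buyers": best[0], "value": best[1]}
    return scores
```

The quadratic scan over purchases is fine for a single issuer's history; a sorted sliding window would be the natural replacement at larger scale.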
Quantitative researchers construct multi-decade panel datasets by parsing the nonDerivativeTransaction and derivativeTransaction elements from XML filings back to the June 2003 mandatory-XML cutoff. Transaction codes, share volumes, prices, and postTransactionAmounts supply the core signal features, while the reportingOwnerRelationship fields enable weighting by insider seniority. The periodOfReport date anchors each observation, and the aff10b5One flag (available from 2023 onward) allows researchers to control for planned-trade effects in recent data.
Securities lawyers and compliance teams verify whether filings meet the two-business-day deadline by comparing transactionDate in the XML against filedAt in metadata.json. Amendment records (Form 4/A) are identified by formType and linked to originals by matching on issuer CIK, reporting-owner CIK, and periodOfReport. Patterns of late filings or frequent amendments for a particular reporting person or issuer flag potential Section 16(a) noncompliance or weak internal controls during due diligence reviews.
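A first-pass timeliness check might look like the Python sketch below. It counts only weekdays as business days and ignores federal holidays and EDGAR acceptance cutoff times, both simplifying assumptions, so borderline results need manual review. The filing date is taken in the timestamp's own timezone offset.

```python
from datetime import date, datetime, timedelta

def second_business_day_after(day):
    """Deadline date: two weekdays after the transaction date (no holidays)."""
    remaining = 2
    while remaining:
        day += timedelta(days=1)
        if day.weekday() < 5:  # Monday=0 ... Friday=4
            remaining -= 1
    return day

def is_late(transaction_date, filed_at):
    """Compare an XML transactionDate with the filedAt timestamp from metadata.json."""
    tx_day = date.fromisoformat(transaction_date)
    filed_day = datetime.fromisoformat(filed_at).date()
    return filed_day > second_business_day_after(tx_day)
```

For a filing with several transactions, the check would normally be run per transaction, since each one carries its own deadline.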
Event-driven analysts monitor ten-percent owners by filtering on the isTenPercentOwner flag and tracking cumulative acquired shares (transactionAcquiredDisposedCode = A) across sequential filings for the same reporting-owner CIK and issuer. Rising sharesOwnedFollowingTransaction totals over rolling windows can signal stake-building ahead of a Schedule 13D filing or proxy contest. Historical filing depth supports pattern matching against prior contested situations for the same issuer or holder.
Data engineers building structured insider-transaction databases parse the derivativeTable to capture option exercises, conversions, and award vesting. Fields such as conversionOrExercisePrice, exerciseDate, and expirationDate frequently contain only a footnoteId reference rather than a literal value. Resolving these references against the footnotes element recovers vesting schedules, plan names, and price computation formulas needed to produce complete derivative-transaction records for downstream analytics and screening tools.
Forensic accountants and investigators pull every Form 4 and Form 4/A for a given reporting-owner CIK to build a chronological record of all transactions across issuers. Transaction dates, prices, share volumes, and transaction codes enable precise calculation of profits or avoided losses relative to material events such as earnings misses or restatements. Amendment history is critical: corrections disclosed in Form 4/A filings can reveal originally misreported transactions relevant to enforcement proceedings or shareholder litigation.
Dataset Index JSON API: https://api.sec-api.io/datasets/form-4-files.json
This endpoint returns metadata about the Form 4 Files Dataset, including its name, description, last updated timestamp, earliest sample date, total records, total size, form types covered (4 and 4/A), container format (ZIP), and content file types (TXT, JSON, HTML, PDF, XML). It also returns the download URL for the entire dataset and a list of all individual container files with per-container metadata such as size, record count, last updated timestamp, and download URL. This endpoint does not require an API key.
Use this API to monitor which containers have been updated in the most recent daily refresh, so you can selectively download only the containers that changed rather than re-downloading the full dataset each time.
{
  "datasetId": "1f1333bd-dbdb-6340-ba36-580af17fba9d",
  "datasetDownloadUrl": "https://api.sec-api.io/datasets/form-4-files.zip",
  "name": "Form 4 Files Dataset",
  "updatedAt": "2026-04-17T02:54:16.820Z",
  "earliestSampleDate": "1996-01-01",
  "totalRecords": 9579934,
  "totalSize": 28354410534,
  "formTypes": ["4", "4/A"],
  "containerFormat": "ZIP",
  "fileTypes": ["TXT", "JSON", "HTML", "PDF", "XML"],
  "containers": [
    {
      "downloadUrl": "https://api.sec-api.io/datasets/form-4-files/2025/2025-10.zip",
      "key": "2025/2025-10.zip",
      "size": 135291847,
      "records": 29543,
      "updatedAt": "2026-04-17T02:54:16.820Z"
    }
  ]
}
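The selective-refresh idea can be sketched in Python using only the standard library. The index URL and the token query parameter come from this page; the function names and the idea of persisting a last-sync timestamp on the caller's side are illustrative assumptions.

```python
import json
from datetime import datetime
from urllib.request import urlopen

INDEX_URL = "https://api.sec-api.io/datasets/form-4-files.json"

def fetch_index(url=INDEX_URL):
    """Download the dataset index (no API key required for this endpoint)."""
    with urlopen(url) as resp:
        return json.load(resp)

def parse_ts(ts):
    """Parse timestamps such as 2026-04-17T02:54:16.820Z."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

def containers_to_refresh(index, last_sync):
    """Containers whose updatedAt is later than the caller's last sync time."""
    cutoff = parse_ts(last_sync)
    return [c for c in index.get("containers", [])
            if parse_ts(c["updatedAt"]) > cutoff]

def with_token(download_url, api_key):
    """Append the API key as the token query parameter."""
    return f"{download_url}?token={api_key}"
```

A typical loop would call fetch_index(), pass the result and the stored timestamp to containers_to_refresh, and download each with_token(c["downloadUrl"], api_key) before recording the new sync time.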
Download Entire Dataset: https://api.sec-api.io/datasets/form-4-files.zip?token=YOUR_API_KEY
Downloads the full dataset as a single ZIP archive containing all containers. This endpoint requires an API key passed as the token query parameter.
Download Single Container: https://api.sec-api.io/datasets/form-4-files/2025/2025-10.zip?token=YOUR_API_KEY
Downloads one individual monthly container instead of the full dataset. Use the container paths returned by the dataset index JSON API to construct the download URL for any specific month. This endpoint requires an API key passed as the token query parameter.
The Form 4 Files Dataset covers SEC Form 4 (Statement of Changes in Beneficial Ownership of Securities) and Form 4/A (amendments). These are the filings required under Section 16(a) of the Securities Exchange Act of 1934 to report insider transactions.
One record corresponds to a single EDGAR submission of Form 4 or Form 4/A, identified by accession number. Each record folder contains a metadata.json file, the primary XML ownership document, a pre-rendered HTML view, and occasionally Exhibit 24 (power-of-attorney) attachments.
Three classes of corporate insiders must file: officers (as defined in Rule 16a-1(f)), directors, and beneficial owners of more than ten percent of any Section 12-registered equity class. Foreign private issuers and their insiders are exempt.
Form 4 must be filed before the end of the second business day after the transaction date. This accelerated deadline was established by the Sarbanes-Oxley Act of 2002.
The dataset spans January 1996 to the present. Filings before mid-2003 represent voluntary electronic submissions; from June 30, 2003 onward — when the SEC mandated electronic filing of Section 16 forms — coverage is essentially complete.
The dataset is distributed as monthly ZIP containers. Each container holds one folder per filing, containing XML, JSON, and HTML files. The primary ownership document is structured XML conforming to the SEC's ownershipDocument schema (for filings from mid-2003 onward); earlier filings are plain text or HTML.
Tabular extracts flatten Form 4 XML into row-per-transaction records (insider, ticker, date, code, shares, price). The Form 4 Files Dataset preserves the complete filing package — raw XML, metadata JSON, rendered HTML, and exhibits — retaining footnotes, derivative-table structures, multi-transaction context, amendment chains, and power-of-attorney documents that tabular extracts discard.