The Form 497K3A Files dataset is a closed historical corpus of mutual fund profile filings made on EDGAR under submission type 497K3A. Each record is one EDGAR accession submitted by an open-end management investment company registered on Form N-1A, packaging the fund profile authorized by the original Rule 498 — a concise, plain-English alternative to the full statutory prospectus that the SEC permitted from December 1998 until the profile framework was replaced by the summary prospectus regime in March 2009. The legal filer is the registrant (typically a Massachusetts business trust, Delaware statutory trust, or Maryland corporation), and a single 497K3A submission may carry profile content for multiple share classes or multiple series of an umbrella trust. The dataset's earliest sample date is December 1, 1998, and coverage tracks the full operational lifetime of the form through its retirement in March 2009. Files are distributed as monthly ZIP containers; record content includes a parsed metadata.json header alongside each non-image body document from the original EDGAR submission, in TXT, JSON, and HTML form.
The Form 497K3A Files dataset captures every Form 497K3A filing accepted by EDGAR during the form's eleven-year operational window. Form 497K3A is a Rule 497 filing under the Securities Act of 1933 used by open-end management investment companies registered on Form N-1A to file a fund "profile" — the concise, plain-English alternative to a full statutory prospectus authorised by SEC Rule 498 in 1998. The "K3A" suffix corresponds to Rule 497(k)(1)(iii)(A), which prescribed the profile's required content and its standardised question-and-answer ordering. A 497K3A submission is the public dissemination vehicle for that profile: the actual document filed is the profile itself, not a registration statement, and its role is to satisfy prospectus delivery for investors who elected the profile alternative.
The form was operative from December 1998, when Rule 498 first took effect, until March 2009, when the SEC replaced the profile framework with the summary prospectus regime under amended Rule 498 (the "Form N-1A summary prospectus"). The dataset's temporal envelope tracks the full operational lifetime of the form; no records exist after the March 2009 retirement, and the dataset is closed-ended and historical. It is distributed as monthly ZIP archives, with per-record content shipped as JSON metadata plus the original-name body documents (TXT or HTML) preserved inside their EDGAR SGML envelopes. Image attachments listed in the EDGAR submission as GRAPHIC documents are intentionally excluded, but their inventory entries remain in the per-record metadata so that consumers can fetch them on demand from EDGAR.
One record in the Form 497K3A Files dataset is a single EDGAR filing of submission type 497K3A, identified by its accession number and materialised on disk as one accession-keyed folder. The folder bundles a parsed-header metadata.json together with every non-image document that was part of the original EDGAR submission, each preserved under its original filename and inside its original EDGAR SGML envelope. The unit of record is therefore the filing as a whole — not an extracted section, not a per-document row, not a per-fund or per-share-class observation. When a single 497K3A submission carried profiles for multiple share classes or multiple portfolios in one umbrella document, all of that content remains together in the same record because the underlying EDGAR submission was itself a single accession.
The dataset is distributed as monthly ZIP archives organised under a YYYY/YYYY-MM.zip path scheme. Extraction yields a top-level YYYY-MM/ directory whose immediate children are accession-number folders. Folder names use the path-safe 18-digit form of the accession number with no dashes (e.g. 000031321207000095), while metadata.json stores the canonical hyphenated form (0000313212-07-000095). Inside each accession folder sit metadata.json and one or more original-name body documents from the EDGAR submission. Because 497K3A is a low-volume form throughout its life, monthly containers are small.
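The two accession-number spellings can be converted mechanically. The following is a minimal sketch, assuming the 10-2-6 digit grouping shown above; the helper names (`to_dashless`, `to_hyphenated`, `record_dir`) are illustrative and not part of the dataset, and the `2007-04` month in the usage note is a made-up placeholder.

```python
import re
from pathlib import PurePosixPath


def to_dashless(accession_no: str) -> str:
    """Return the 18-digit folder-name form of an accession number."""
    digits = accession_no.replace("-", "")
    if not re.fullmatch(r"\d{18}", digits):
        raise ValueError(f"not an 18-digit accession number: {accession_no!r}")
    return digits


def to_hyphenated(folder_name: str) -> str:
    """Return the canonical NNNNNNNNNN-YY-NNNNNN form stored in metadata.json."""
    digits = to_dashless(folder_name)  # also validates the input
    return f"{digits[:10]}-{digits[10:12]}-{digits[12:]}"


def record_dir(year_month: str, accession_no: str) -> PurePosixPath:
    """Path of an accession folder inside an extracted YYYY-MM/ directory."""
    return PurePosixPath(year_month) / to_dashless(accession_no)
```

For example, `record_dir("2007-04", "0000313212-07-000095")` yields `2007-04/000031321207000095`, which can then be joined to wherever the monthly archive was extracted.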
The file types present in the dataset are TXT, JSON, and HTML. JSON is reserved for the per-record metadata.json. HTML (most often with an .htm extension) is the dominant body-document format from the mid-2000s onward, while flat TXT is most common in the earliest years of the form. Image attachments listed in the EDGAR submission as GRAPHIC documents (typically .gif or .jpg) are intentionally excluded from the archive, although their inventory entries remain in metadata.json with their original filenames and direct EDGAR URLs so consumers can fetch them on demand.
metadata.json is a compact JSON object derived from the EDGAR submission header and document index. Its fields describe the filing and enumerate every document EDGAR received, regardless of whether the dataset ships that document.
- formType: always 497K3A for this dataset.
- accessionNo: the canonical hyphenated accession number, e.g. 0000313212-07-000095.
- filedAt: an ISO-8601 timestamp with timezone offset capturing the EDGAR acceptance time.
- description: the standard Rule 497 description string ("Form 497K3A — Profiles for certain open-end management investment companies, [Rule 497(k)(1)(iii)(A)]").
- linkToFilingDetails, linkToTxt, linkToHtml, linkToXbrl: back-references to the EDGAR filing detail page, the full submission .txt, the rendered HTML, and the XBRL instance. linkToXbrl is empty for every record because XBRL was never required for 497K3A filings.
- documentFormatFiles: an array of objects describing each document in the EDGAR submission: sequence number, document type (e.g. 497K3A, GRAPHIC), original filename, byte size, and document URL on EDGAR. The final entry typically points at the complete *.txt submission file. GRAPHIC entries remain listed even though their bytes are not shipped, providing a complete inventory and EDGAR URLs for downstream retrieval.
- entities: an array of filer entity objects. Each entity carries cik, companyName (suffixed with the EDGAR role marker, e.g. (Filer)), fileNo (1933 Act file number), irsNo, fiscalYearEnd (as MMDD), act (commonly 33 for the Securities Act of 1933), type (the form type as reported by EDGAR for that entity), and filmNo (the SEC film identifier).
- dataFiles: an empty array for this form type, because 497K3A carries no XBRL data files.
- id: an opaque internal identifier used for deduplication and indexing.

Each non-image document listed in documentFormatFiles is shipped under its original EDGAR filename in the accession folder. Despite the .htm extension on most modern filings, the bytes on disk are not pure HTML: they are EDGAR SGML, with the document body bracketed by an opening <DOCUMENT> block whose header tags are unclosed in the EDGAR style and a <TEXT> payload region. Only </TEXT> and </DOCUMENT> carry explicit close tags:
```
<DOCUMENT>
<TYPE>497K3A
<SEQUENCE>1
<FILENAME>inteqpro07ame.htm
<TEXT>
... document payload ...
</TEXT>
</DOCUMENT>
```
Inside <TEXT>, the payload is either an HTML document (mid-2000s onward) or a flat ASCII text block with EDGAR-style monospaced table markup (early-era filings). To parse the file as HTML, consumers strip the leading <DOCUMENT>…<TEXT> envelope and the trailing </TEXT></DOCUMENT> lines and feed the inner content to an HTML parser.
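The stripping step can be sketched as follows. This is a minimal illustration, assuming exactly one <DOCUMENT> block per file and the three header tags shown in the example above; `split_envelope` is a hypothetical helper name, not something the dataset provides.

```python
import re

# Header tags are unclosed: the value runs to the end of the line.
_HEADER_TAG = re.compile(r"^<(TYPE|SEQUENCE|FILENAME)>(.*)$", re.MULTILINE)


def split_envelope(raw: str) -> tuple[dict[str, str], str]:
    """Split an SGML-wrapped body document into (header fields, inner payload)."""
    start = raw.find("<TEXT>")
    end = raw.rfind("</TEXT>")
    if start == -1 or end == -1 or end < start:
        raise ValueError("no <TEXT>...</TEXT> region found")
    header = {
        m.group(1).lower(): m.group(2).strip()
        for m in _HEADER_TAG.finditer(raw[:start])
    }
    # Drop only the newlines that belong to the wrapper lines themselves.
    payload = raw[start + len("<TEXT>"):end].strip("\r\n")
    return header, payload
```

The returned payload can be passed to an HTML parser for HTML-era filings or treated as plain text for early-era filings; the header dictionary helps confirm that the file on disk matches the entry listed in metadata.json.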
The HTML payloads characteristic of profile filings are heavily print-oriented. They rely on inline <font> and <div style="…"> markup with explicit point sizes, named typefaces (e.g. Berkeley Book, Trajan, MetaPlusLF-MediumRoman), explicit color attributes, and numeric HTML entities such as &#160; (non-breaking space) and &#8212; (em-dash). EDGAR redlining markers carried over from authoring tools, in the form of escaped &lt;R&gt; … &lt;/R&gt; pairs, frequently bracket recently revised passages. Embedded <img src="…gif"> tags remain in the HTML and refer to the omitted GRAPHIC files; the URLs in metadata.json make those images recoverable from EDGAR.
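One way to get readable text out of such a payload is sketched below, using only the Python standard library. It assumes the SGML envelope has already been removed and that the redlining markers are entity-escaped, so the parser sees them as text rather than tags. `visible_text` and its `keep_redline` flag are illustrative names.

```python
import re
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects character data; numeric and named entities are decoded."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


# After entity decoding, escaped redlining markers appear as literal <R> / </R>.
_REDLINE = re.compile(r"</?R>")


def visible_text(html_payload: str, keep_redline: bool = False) -> str:
    """Return whitespace-normalised text, optionally keeping redline markers."""
    parser = _TextCollector()
    parser.feed(html_payload)
    parser.close()
    text = "".join(parser.chunks)
    if not keep_redline:
        text = _REDLINE.sub("", text)
    text = text.replace("\u00a0", " ")  # non-breaking space to plain space
    return re.sub(r"\s+", " ", text).strip()
```

Because the markers are removed from the extracted text and never re-injected as markup, they cannot be mistaken for real tags by a later cleanup step. Pass `keep_redline=True` when the revision boundaries themselves are of interest.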
Encoding is ASCII with numeric HTML entities for any non-ASCII characters. There is no byte-order mark and no explicit <meta charset> declaration. Line endings are predominantly \n. Body files are often a single long logical line of HTML, with newline-terminated breaks limited to the SGML wrapper itself, so line-oriented tools may report only a handful of lines for files that are hundreds of kilobytes long.
Rule 498 fixed the profile's required disclosures and ordered them as a short series of plain-language questions, so most 497K3A documents follow a recognisable internal sequence: objective, strategies, risks, performance, fees, and purchase/redemption procedures.
Filings range from compact single-fund profiles to multi-share-class or multi-portfolio profiles in which several funds share one umbrella document. Tabular sections (fees, performance) are rendered with HTML tables in modern filings and with monospaced ASCII alignment in early-era filings.
Each record contains the complete parsed header (metadata.json), every non-image body document from the EDGAR submission preserved in its original SGML-wrapped form under its original filename, and the full document inventory in documentFormatFiles listing every file EDGAR received including those not shipped. The hyphenated and non-hyphenated accession numbers, filer CIK and entity attributes, file number, IRS number, fiscal year end, film number, and EDGAR-side URLs are all retained.
Image documents listed as GRAPHIC in EDGAR (typically .gif and .jpg files used for logos, charts, and decorative typography) are intentionally omitted from the accession folder. Their entries remain in metadata.json's documentFormatFiles array with their original filenames and direct EDGAR URLs, allowing any consumer to retrieve them from EDGAR if needed. No XBRL instance documents exist for this form because 497K3A was never subject to XBRL tagging requirements; dataFiles is uniformly empty and linkToXbrl is uniformly an empty string.
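The omitted images can be enumerated from metadata.json alone. The sketch below assumes each documentFormatFiles entry exposes its document type under a `type` key and its EDGAR URL under a `documentUrl` key; the text above describes these attributes but does not name the keys, so both are assumptions to verify against a real record. `omitted_graphics` is an illustrative helper name.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def omitted_graphics(metadata: dict) -> list[tuple[str, str]]:
    """Return (filename, EDGAR URL) for every GRAPHIC entry in the inventory."""
    found = []
    for doc in metadata.get("documentFormatFiles", []):
        # "type" and "documentUrl" are assumed key names (see lead-in).
        if str(doc.get("type", "")).upper() != "GRAPHIC":
            continue
        url = doc.get("documentUrl", "")
        filename = PurePosixPath(urlparse(url).path).name
        found.append((filename, url))
    return found
```

A rendering pipeline could use the returned filenames to match `<img src="…">` references in the body HTML and fetch only the images it actually needs.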
The dataset packages per-document files individually rather than the full EDGAR submission .txt concatenation. The metadata still records a reference and URL to the original full submission text file, but that consolidated artefact is not duplicated alongside the per-document files.
The 497K3A profile's core content requirements were fixed by Rule 498 at the form's inception in December 1998 and remained largely stable across the form's eleven-year life. The set of disclosures — objective, strategies, risks, performance, fees, purchase/redemption procedures — and their ordered question-style presentation persisted from the earliest 1998 filings through the final March 2009 filings. Substantive changes were incremental: increased granularity in risk disclosure, evolution of fee-table conventions in line with broader Form N-1A amendments, and gradual standardisation of the performance-table benchmark and example formats. The form was discontinued when the SEC restructured prospectus delivery around the new summary prospectus under amended Rule 498, which absorbed the profile's role and triggered the removal of 497K3A as a valid EDGAR submission type.
Across the form's lifetime the on-disk presentation of the body document evolved while the SGML envelope remained constant.
- Early years of the form: the profile was filed as flat ASCII text inside the <TEXT> block, often within a <TYPE>497K3A document whose payload used monospaced text and EDGAR table conventions for fee and performance tables. The body document is therefore a flat text file in this era.
- Mid-2000s onward: the profile was filed as HTML inside the same <DOCUMENT>…<TEXT>…</TEXT></DOCUMENT> envelope. HTML markup is print-oriented and heavily inline-styled, with named typefaces, explicit point sizes, color attributes, numeric entities, and embedded <img> references to GRAPHIC files. EDGAR redlining tags surface as escaped <R> markers around revised passages.

Several nuances matter for working with these records.
- SGML envelope: before parsing a body document as HTML, strip the <DOCUMENT>, <TYPE>, <SEQUENCE>, <FILENAME>, and <TEXT> opening tags and the closing </TEXT></DOCUMENT> lines.
- Redlining: escaped <R> redlining markers may interfere with naive HTML cleanup if a consumer unescapes entities indiscriminately; preserving them as literal text is usually safer.
- Missing images: <img> references inside the HTML point at files that are not in the accession folder, so any rendering pipeline must either suppress the missing images or fetch them from the EDGAR URLs preserved in documentFormatFiles.
- Line counts: because the HTML body is often a single long line, line-oriented tools (wc -l, naive line splitters) will dramatically under-report the size and structure of the payload; byte- or token-oriented tooling is more reliable.
- Accession numbers: folder names use the dashless 18-digit form while metadata.json stores the hyphenated form, and joining the two requires inserting (or removing) the standard NNNNNNNNNN-YY-NNNNNN dashes.

Each Form 497K3A submission is made by an open-end management investment company registered on Form N-1A, which in practice means a mutual fund. The legal filer is the registrant itself, typically organized as a Massachusetts business trust, Delaware statutory trust, or Maryland corporation. A single registrant often operates as a series trust with many funds and share classes, so one 497K3A accession may carry profile content for one or several series.
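The form is not used by closed-end funds, business development companies, unit investment trusts, variable insurance separate accounts, or operating-company issuers.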
A filing agent or financial printer normally transmits the submission to EDGAR, but the disclosure obligation rests with the registrant and, indirectly, the principal underwriter that distributes the shares. Officers and trustees of the registrant are responsible for the underlying profile content.
Form 497K3A is a submission type under Rule 497 of the Securities Act of 1933, the rule that requires registered investment companies to file with the SEC the prospectus-related materials they actually use in offering their shares.
The "K" family of 497 suffixes was created to carry fund profiles authorized under the original Rule 498, adopted by the SEC in March 1998. That version of Rule 498 let an N-1A-registered fund prepare a short, standardized profile summarizing its objective, strategies, risks, performance, fees, and purchase/redemption procedures.
Funds could deliver this profile to prospective investors in lieu of the full statutory prospectus, provided the statutory prospectus was made available on request and was delivered with the confirmation of the initial purchase.
Within the 497K suffix family, the trailing characters identified the role of the profile being filed (initial profile, revised profile, profile filed alongside other materials, etc.). 497K3A was one of these operational sub-codes for profile filings made against an already-effective N-1A registration statement. All 497K-suffix submissions in this 1998-2009 regime share the same legal authority: a Rule 498 profile filed pursuant to Rule 497.
A 497K3A filing is event-driven, not periodic. Rule 497 required any prospectus or profile used after the registration statement's effective date to be filed no later than the date it was first used. The trigger is therefore the fund's actual deployment of the profile — once the fund, through its principal underwriter or selling intermediaries, began distributing the profile, the registrant was obligated to file the corresponding 497K3A on EDGAR.
In practice, profiles were refreshed alongside the fund's annual N-1A update and its statutory prospectus, so 497K3A filings cluster around annual prospectus cycles, with additional filings whenever a profile was reissued or corrected mid-cycle.
The original Rule 498 profile regime, adopted in March 1998, remained in place until it was replaced. In January 2009 the SEC adopted amendments to Form N-1A and a rewritten Rule 498 that established the summary prospectus: a short document that is itself part of the statutory prospectus and whose delivery (combined with online posting of the full prospectus) satisfies prospectus delivery obligations. The summary prospectus replaced the older profile concept entirely.
As funds transitioned during 2009, the profile-specific submission types, including 497K3A, were retired. The last 497K3A filings on EDGAR appear in early 2009. The 497K3A dataset is therefore closed and covers a fixed historical window from late 1998 through early 2009.
Post-2009 "497K" is a different animal. After the 2009 amendments, EDGAR continues to accept a submission type labeled Form 497K, but those filings are summary prospectuses filed under the new Rule 498(k), not Rule 498 profiles. Despite the visually similar code, post-2009 497K filings are not part of this dataset and are governed by a different regulatory regime. The 497K3A code itself was not reused.
497K3A is not a registration filing. It is the EDGAR copy of a prospectus-equivalent document associated with an already-effective N-1A registration. It does not effect registration, does not move effective dates, and does not trigger staff review the way an N-1A post-effective amendment under Rule 485 does.
Profile use was optional. Funds that did not adopt the profile alternative filed nothing in the 497K family; they continued to file full statutory prospectuses and stickers under other Rule 497 suffixes (e.g., 497, 497J) and amended their registration statements under Rule 485. The 497K3A population is therefore narrower than the universe of N-1A registrants.
Series-trust filings. When a series trust filed one 497K3A covering multiple series or share classes, a single accession number was generated even though each series remained the substantive subject of its own disclosure.
Corrections. Amendments to a previously filed profile were handled by filing a new 497-family submission, not by amending the original accession in place.
No withdrawals at sunset. The discontinuation of 497K3A in 2009 is a regulatory boundary, not a corporate event. Funds did not withdraw existing 497K3A filings; they simply stopped producing new ones once the summary prospectus regime took effect.
Form 497K3A sits inside a tightly clustered family of mutual fund prospectus submission types filed under Rule 497 of the Securities Act of 1933, layered on top of the Form N-1A registration regime. The most useful comparisons are to other Rule 497 sub-types, to the post-effective amendments that contain the underlying statutory prospectus, to Form N-1A itself, and to the post-2009 summary prospectus framework that replaced the profile.
Form 497 is the parent submission type for definitive materials filed under Rule 497 after a registration statement is effective: full prospectuses, supplements, stickers, and certain sales literature. A bare 497 tag does not indicate which document type was filed.
497K3A is narrower on two axes: it is restricted to the fund profile authorized by the original Rule 498 (adopted 1998), and it encodes a specific delivery posture within the K-suffix taxonomy. Researchers using a generic 497 corpus capture 497K3A as a sibling category but cannot isolate profile documents without the suffix.
The sibling 497K-suffix submission types come from the same original Rule 498 profile regime and all carry profile content with substantially similar disclosure (objectives, strategies, risks, performance, fees, purchase/redemption procedures). The suffix encodes the operational and delivery context in which the profile was filed, not the substantive content.
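The precise procedural meaning of each numeric suffix is an EDGAR submission-type convention rather than a category defined on the face of Rule 498 itself, and SEC public materials do not give a clean one-line definition for each suffix. What can be said accurately is that every K-suffix submission in the 1998-2009 regime rests on the same legal authority (a Rule 498 profile filed pursuant to Rule 497) and that the suffix distinguishes the operational and delivery context of the filing, not its substantive content.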
For content-level analysis the suffixes should generally be combined; for compliance or delivery-mechanics research, the specific suffix is what carries the distinction. Avoid attributing precise rule citations to individual K-suffixes beyond what EDGAR submission-type documentation supports.
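The amended Rule 498, adopted January 2009 and effective March 31, 2009, replaced the profile with the summary prospectus. The submission type 497K (no numeric suffix) is reused on EDGAR for these post-2009 filings, but the legal substance changed: the summary prospectus is itself part of the statutory prospectus, is keyed to Form N-1A Items 2-8, and, paired with online posting of the full prospectus, satisfies prospectus-delivery obligations on different terms than the original profile.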
Form 497K-SP appears in some EDGAR records in connection with summary prospectus filings under the post-2009 regime and belongs to that successor universe, not the 497K3A peer set.
The 497K3A dataset terminates in March 2009 because the original profile framework was rescinded. The post-2009 497K corpus is the conceptual successor but is not content-interchangeable; cross-regime comparisons must account for the document-format change.
Form 485APOS and Form 485BPOS are post-effective amendments to Form N-1A registration statements. 485APOS is filed under Rule 485(a) and is subject to a delayed effective date pending staff review; 485BPOS is filed under Rule 485(b) and goes effective immediately or on a date certain. Both carry the full statutory prospectus, the Statement of Additional Information, and Part C.
The relationship to 497K3A is hierarchical: the profile summarizes information that lives in greater detail in the corresponding 485BPOS prospectus. 485BPOS is the source-of-truth offering document; 497K3A is the investor-facing summary derived from it. They are complementary, not substitutable.
Form N-1A is the registration form for open-end management investment companies. It defines the disclosure architecture that flows into both the 485-series amendments and the 497-series filings. N-1A as a registration statement covers Part A (the prospectus), Part B (the SAI), and Part C; 497K3A captures only the profile document and its EDGAR submission metadata, and does not contain N-1A's full content.
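Form 497K3A is distinct on four axes at once: document type (a fund profile rather than a full prospectus), legal regime (the original Rule 498), delivery posture (encoded in the K-suffix), and period (December 1998 to March 2009).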
It is not a substitute for 485BPOS when full prospectus detail is needed, not interchangeable with the post-2009 497K summary prospectus corpus, and not interchangeable with sibling K-suffixes when delivery-posture distinctions matter. It is the closed historical record of the pre-2009 profile experiment that preceded and informed the summary prospectus regime.
Because the Form 497K3A corpus is closed and bounded by the December 1998 to March 2009 profile era, its users are professionals who need precise retrieval of how funds disclosed objectives, strategies, risks, fees, and performance during that decade.
Securities lawyers advising open-end funds use the corpus as a precedent set for Rule 498 profile drafting. Disclosure counsel compare the narrative HTML sections (investment objectives, principal strategies, principal risks, performance, fee table) against current Item 2-Item 8 summary prospectus language to trace which conventions migrated into the 2009 regime and which were dropped. Litigation counsel handling claims tied to the 1998-2009 window pull profiles by accession number, using metadata.json (CIK, filing date, submission type) to anchor the evidentiary chain and the HTML body to quote risk language verbatim.
In-house compliance officers at fund complexes that survived the profile era use the dataset as a precedent library for legacy series. They reference older 497K3A filings for the same family to reconstruct earlier disclosures of strategies, derivatives use, redemption fees, and sales loads, then verify continuity (or document changes) against current filings. The fee and expense table and the principal risks section receive the heaviest use.
Policy staff and securities-regulation academics study the profile experiment itself: why it was adopted, how it was used, and why it was abandoned after eleven years. The closed corpus is small enough to read exhaustively and large enough to code qualitatively. The standardized structure — objectives, strategies, risks, performance, fees, purchase/redemption — permits direct cross-fund and cross-year comparison. Outputs include retrospectives on plain-English disclosure and comment-letter submissions on prospectus rulemakings.
Quant teams reconstructing historical fund universes mine the fee table for total annual operating expenses, management fees, 12b-1 fees, and the one/three/five/ten-year cost example, and the past-performance bar chart and average annual total returns table for point-in-time, as-disclosed return histories. These structured fields cross-check commercial fund databases and support work on fee dispersion, share-class economics, and disclosed-versus-realized risk.
Financial data engineering teams use the dataset as a bounded, text-only fixture (images excluded) for 497K3A handling. Every accession is enumerated, so metadata.json serves as a complete CIK/date lookup and the HTML and TXT documents act as a regression suite for parser changes, section extractors, and cross-form joins to N-1A and later 497K filings.
Plaintiff and defense counsel in fund mis-selling, suitability, fee, and fiduciary-duty matters from the 1998-2009 period retrieve the exact profile delivered at a given date. Expert witnesses preparing damages or disclosure-adequacy reports rely on the fee table, performance table, and principal-risks language to establish what was and was not said; metadata.json filing date and accession number anchor the timeline.
Historians of the U.S. fund industry use the narrative sections — especially principal strategies and principal risks — as primary source material on how funds described themselves through the late-1990s expansion and the 2008 crisis. Faculty teaching securities regulation and investment-company law use individual filings as self-contained classroom artifacts: the fixed profile order makes one filing a complete teaching example and a handful sufficient for comparative exercises.
Teams building retrieval-augmented systems and fine-tuning corpora for financial language models use the dataset as a structurally consistent, pre-2009 disclosure source. The uniform profile schema yields clean pairs for section classification, fee extraction, and risk-language summarization, and pairs naturally with later summary prospectus corpora to expose models to the predecessor style.
The common thread across these audiences is authoritative access to the same recurring artifacts: the metadata.json identifiers, the fee and expense table, the performance bar chart and returns table, and the principal-risks narrative. The dataset's value is being complete, bounded, and standardized.
The following workflows show how the closed Form 497K3A archive (December 1998 — March 2009) is operated on in practice. Each ties to specific record fields — metadata.json identifiers and the standardized profile sections inside the SGML-wrapped body documents.
Plaintiff and defense counsel working fund mis-selling or fee-disclosure cases from the profile era retrieve the exact document delivered to investors on a given date. The workflow joins metadata.json filedAt, accessionNo, and entities[].cik to a calendar of investor transactions, then quotes the principal risks narrative and fee table from the body HTML verbatim. Output is an exhibit-ready packet: filing-date provenance from the metadata header plus a clean text extraction of the relevant section.
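The date join in this workflow reduces to finding the most recent filing at or before a given moment. The sketch below assumes filedAt values carry an explicit UTC offset, as described for metadata.json, and that the caller has already filtered the filings to one CIK; the function name and the sample values in the usage note are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime


def latest_filing_before(filings, when):
    """Return the accessionNo of the latest filing with filedAt <= when.

    filings: iterable of (filedAt ISO-8601 string, accessionNo) pairs.
    when:    timezone-aware datetime of the investor transaction.
    Returns None when no filing precedes the transaction.
    """
    ordered = sorted((datetime.fromisoformat(ts), acc) for ts, acc in filings)
    times = [ts for ts, _ in ordered]
    idx = bisect_right(times, when)  # number of filings at or before `when`
    return ordered[idx - 1][1] if idx else None
```

For instance, given filings at `2006-05-01T09:30:00-04:00` and `2007-04-30T16:05:00-04:00`, a transaction dated 15 January 2007 resolves to the 2006 filing. Whether that filing was the document actually delivered to the investor is a factual question the metadata alone cannot settle.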
Quant researchers parse the standardized fee and expense table out of each profile to assemble a panel of management fees, 12b-1 fees, other expenses, total annual operating expenses, and the 1/3/5/10-year dollar example. CIK and fiscalYearEnd from entities anchor each row to a fund family; filedAt provides the as-disclosed timestamp. The resulting table cross-checks commercial fund databases and supports analyses of fee dispersion and share-class economics during 1998-2009.
Investment-management lawyers drafting current summary prospectuses use the corpus as a precedent library. The workflow pulls the principal investment strategies and principal risks sections from each body HTML, segments them by fund family via CIK, and diffs phrasing on derivatives use, foreign-currency exposure, or redemption-fee mechanics across the family's profile history. Output is a precedent file showing which conventions migrated into the post-2009 summary prospectus and which were abandoned.
Performance-research teams extract the bar chart of annual total returns and the average-annual-total-return table (including best-quarter / worst-quarter call-outs and the benchmark column) from the past performance section. Pairing that with the filedAt timestamp yields an as-disclosed return record uncontaminated by later restatements, useful for studies of disclosed-versus-realized return and benchmark drift.
Data engineers maintaining EDGAR ingestion pipelines use the closed record set as a bounded test fixture. The mix of early-era flat-ASCII payloads and mid-2000s heavily inline-styled HTML inside the <DOCUMENT>...<TEXT> envelope exercises SGML-wrapper stripping, redlining-marker handling (<R>), GRAPHIC reference resolution from documentFormatFiles, and section detection by text pattern. Every accession is enumerated, so metadata.json doubles as the ground-truth CIK/date/accession lookup for parser regression tests.
Regulatory economists and academic researchers study the transition from profile to summary prospectus by aligning each 497K3A profile against the same fund family's first post-March-2009 497K filing. The fixed profile section order (objective, strategies, risks, performance, fees, purchase/redemption) maps onto Form N-1A Items 2-8, enabling structured side-by-side coding of which disclosures were retained, compressed, or dropped under amended Rule 498.
The dataset is distributed as ZIP containers organized by month, covering filings from December 1998 through March 2009 when Form 497K3A was discontinued. Because the dataset is compact, most users will download the full archive directly, but the index JSON and per-container endpoints remain available for incremental workflows.
Dataset Index JSON API: https://api.sec-api.io/datasets/form-497k3a-files.json
Returns dataset metadata (name, description, last updated timestamp, earliest sample date, total records, total size, form types, container format, and file types), the download URL for the full dataset, and the list of monthly container files with per-container size, record count, updated timestamp, and download URL. Use this endpoint to monitor which containers were touched in the latest refresh run and to decide which monthly archives to fetch on a given day. This endpoint does not require an API key.
Example response:
```json
{
  "datasetId": "1f13365b-9ae0-6a2e-b0a3-867d7bd2e7ab",
  "datasetDownloadUrl": "https://api.sec-api.io/datasets/form-497k3a-files.zip",
  "name": "Form 497K3A Files Dataset",
  "updatedAt": "2026-04-16T08:33:43.524Z",
  "earliestSampleDate": "1998-12-01",
  "totalRecords": 52,
  "totalSize": 1258065,
  "formTypes": ["497K3A"],
  "containerFormat": "ZIP",
  "fileTypes": ["TXT", "JSON", "HTML"],
  "containers": [
    {
      "downloadUrl": "https://api.sec-api.io/datasets/form-497k3a-files/2009/2009-03.zip",
      "key": "2009/2009-03.zip",
      "size": 24576,
      "records": 1,
      "updatedAt": "2026-04-16T08:33:43.524Z"
    }
  ]
}
```
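A refresh check against this index might look like the following sketch. It relies only on the fields visible in the example response (`containers`, `key`, `updatedAt`); the function names are illustrative, and the 30-second timeout is an arbitrary choice.

```python
import json
from datetime import datetime
from urllib.request import urlopen

INDEX_URL = "https://api.sec-api.io/datasets/form-497k3a-files.json"


def fetch_index(url: str = INDEX_URL) -> dict:
    """Download and decode the dataset index (no API key needed)."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def _parse_ts(ts: str) -> datetime:
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def containers_updated_since(index: dict, since_iso: str) -> list[str]:
    """Return container keys whose updatedAt is strictly later than since_iso."""
    since = _parse_ts(since_iso)
    return [
        c["key"]
        for c in index.get("containers", [])
        if _parse_ts(c["updatedAt"]) > since
    ]
```

Calling `containers_updated_since(fetch_index(), "2026-01-01T00:00:00Z")` would list the monthly archives worth re-downloading since the start of 2026.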
Download Entire Dataset: https://api.sec-api.io/datasets/form-497k3a-files.zip?token=YOUR_API_KEY
Downloads the complete dataset as a single ZIP archive. Given the small overall size, this is typically the most convenient way to obtain the full corpus in one request. This endpoint requires an API key.
Download Single Container: https://api.sec-api.io/datasets/form-497k3a-files/2009/2009-03.zip?token=YOUR_API_KEY
Downloads one monthly container ZIP (for example, March 2009) instead of the full dataset. Use the downloadUrl values from the index JSON response to fetch specific months. This endpoint requires an API key.
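Once a container has been downloaded, its records can be grouped by accession folder without extracting to disk. This is a minimal sketch, assuming the YYYY-MM/accession/filename layout described earlier; `container_url` and `read_records` are hypothetical helper names, and the download step itself is left to the caller.

```python
import io
import json
import zipfile
from urllib.parse import urlencode

BASE_URL = "https://api.sec-api.io/datasets/form-497k3a-files"


def container_url(key: str, token: str) -> str:
    """Build the download URL for one monthly container, e.g. '2009/2009-03.zip'."""
    return f"{BASE_URL}/{key}?{urlencode({'token': token})}"


def read_records(zip_bytes: bytes) -> dict:
    """Map each accession folder to its parsed metadata and body filenames."""
    records = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for info in zf.infolist():
            parts = info.filename.split("/")
            # Expect exactly YYYY-MM/<accession>/<file>; skip anything else.
            if info.is_dir() or len(parts) != 3:
                continue
            _, accession, name = parts
            rec = records.setdefault(accession, {"metadata": None, "documents": []})
            if name == "metadata.json":
                rec["metadata"] = json.loads(zf.read(info))
            else:
                rec["documents"].append(name)
    return records
```

The bytes passed to `read_records` can come from any HTTP client pointed at `container_url(key, token)`; keeping the network call separate makes the grouping logic easy to test against a locally built archive.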
The dataset covers EDGAR submission type 497K3A, a Rule 497 filing under the Securities Act of 1933 used by open-end management investment companies registered on Form N-1A to file the fund "profile" — the concise plain-English alternative to a full statutory prospectus authorised by the original SEC Rule 498. The "K3A" suffix corresponds to Rule 497(k)(1)(iii)(A), which governed the profile's required content and its question-and-answer ordering.
One record is a single EDGAR 497K3A filing, identified by accession number and materialised as one accession-keyed folder containing a parsed metadata.json header and every non-image body document from the original EDGAR submission, preserved under its original filename inside its EDGAR SGML envelope. When a single submission carried profiles for multiple share classes or multiple series in one umbrella document, all of that content remains together in the same record.
The legal filer is an open-end management investment company (a mutual fund) registered on Form N-1A — typically organised as a Massachusetts business trust, Delaware statutory trust, or Maryland corporation. Closed-end funds, business development companies, unit investment trusts, variable insurance separate accounts, and operating-company issuers do not use the form. Profile use was optional, so the 497K3A population is narrower than the universe of N-1A registrants.
The dataset covers the full operational lifetime of Form 497K3A, beginning at the December 1998 effective date of the original Rule 498 and ending in early 2009, when the SEC's January 2009 amendments replaced the profile framework with the summary prospectus regime under amended Rule 498 (effective March 31, 2009). The earliest sample date is December 1, 1998. No new accessions appear after the form's retirement; the dataset is closed-ended and historical.
Records are distributed as monthly ZIP containers under a YYYY/YYYY-MM.zip path scheme. Inside each accession folder, file types are TXT, JSON, and HTML: JSON for the per-record metadata.json, HTML (typically .htm) for body documents from the mid-2000s onward, and flat TXT for body documents in the earliest years of the form. Body documents are wrapped in an EDGAR SGML <DOCUMENT>...<TEXT>...</TEXT></DOCUMENT> envelope that must be stripped before HTML parsing.
Despite the visually similar code, post-2009 497K filings on EDGAR are summary prospectuses filed under the rewritten Rule 498(k), not Rule 498 profiles. The summary prospectus is keyed to Form N-1A Items 2–8 and, paired with website posting of the full statutory prospectus, satisfies prospectus-delivery obligations on different terms than the original profile. Post-2009 497K filings are not part of this dataset and must be analysed as a separate, non-interchangeable corpus.
Image attachments and XBRL files are not included. Image documents listed as GRAPHIC in EDGAR (typically .gif and .jpg files used for logos, charts, and decorative typography) are intentionally omitted, although their inventory entries remain in metadata.json's documentFormatFiles array with their original filenames and direct EDGAR URLs. No XBRL instance documents exist for this form because 497K3A was never subject to XBRL tagging requirements; dataFiles is uniformly empty and linkToXbrl is uniformly an empty string.