Search SEC Filings With Python
This page provides ready-to-execute examples demonstrating how to search and retrieve metadata of EDGAR filings from 1993 to present with Python. Various use cases are covered, such as how to find SEC filings by ticker, form type, and publication date, how to download historical 10-Q filings over an extended period, or how to find Form 8-K filings by specific event items.
Quick Start
The example below demonstrates how to use the sec-api
Python package to interact with the Query API. It shows how to retrieve metadata for the latest 50 EDGAR filings for Tesla using the get_filings(search_parameters)
function from the QueryApi
class. The search parameters are passed to the get_filings
function and include:
query
: The Lucene search query, such asticker:TSLA
.from
: The starting index for pagination of search results.size
: The number of results to return per request (maximum of 50).sort
: The field to sort the results by, defaulting tofiledAt
in descending order.
Refer to the documentation for more details on the API request format.
pip install sec-api
from sec_api import QueryApi
queryApi = QueryApi(api_key="YOUR_API_KEY")
search_params = {
"query": "ticker:TSLA",
"from": "0",
"size": "50",
"sort": [{"filedAt": {"order": "desc"}}],
}
response = queryApi.get_filings(search_params)
print("Response object fields:\n", list(response.keys()))
Response object fields:
['total', 'query', 'filings']
print("Number of filings matching the search criteria:\n", response['total']['value'])
Number of filings matching the search criteria:
1565
print("Number of filings returned in this response:\n", len(response["filings"]))
Number of filings returned in this response:
50
import pandas as pd
metadata = pd.DataFrame.from_records(response["filings"])
print("Filing metadata objects converted to a DataFrame:")
metadata.head(3)
Filing metadata objects converted to a DataFrame:
ticker | formType | accessionNo | cik | companyNameLong | companyName | linkToFilingDetails | description | linkToTxt | filedAt | documentFormatFiles | periodOfReport | entities | id | seriesAndClassesContractsInformation | items | linkToHtml | linkToXbrl | dataFiles | effectivenessDate | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | TSLA | 8-K | 0001628280-24-041816 | 1318605 | Tesla, Inc. (Filer) | Tesla, Inc. | https://www.sec.gov/Archives/edgar/data/131860... | Form 8-K - Current report - Item 2.02 Item 9.01 | https://www.sec.gov/Archives/edgar/data/131860... | 2024-10-02T09:06:34-04:00 | [{'sequence': '1', 'size': '34449', 'documentU... | 2024-10-02 | [{'fiscalYearEnd': '1231', 'stateOfIncorporati... | 347fb19c073dc3f3e8be14aeec9d026c | [] | [Item 2.02: Results of Operations and Financia... | https://www.sec.gov/Archives/edgar/data/131860... | [{'sequence': '3', 'size': '2509', 'documentUr... | NaN | |
1 | TSLA | 4 | 0001771340-24-000004 | 1318605 | Tesla, Inc. (Issuer) | Tesla, Inc. | https://www.sec.gov/Archives/edgar/data/131860... | Form 4 - Statement of changes in beneficial ow... | https://www.sec.gov/Archives/edgar/data/131860... | 2024-09-25T19:48:42-04:00 | [{'sequence': '1', 'size': ' ', 'documentUrl':... | 2024-09-23 | [{'fiscalYearEnd': '1231', 'stateOfIncorporati... | e61d1cf43894398a94c4094bcaf71052 | [] | NaN | https://www.sec.gov/Archives/edgar/data/131860... | [] | NaN | |
2 | TSLA | 144 | 0001950047-24-007342 | 1318605 | Tesla, Inc. (Subject) | Tesla, Inc. | https://www.sec.gov/Archives/edgar/data/131860... | Form 144 - Report of proposed sale of securities | https://www.sec.gov/Archives/edgar/data/131860... | 2024-09-23T16:27:57-04:00 | [{'sequence': '1', 'size': ' ', 'documentUrl':... | NaN | [{'fiscalYearEnd': '1231', 'stateOfIncorporati... | 4e9e5cbfa0992bf5306f62d530b35d82 | [] | NaN | https://www.sec.gov/Archives/edgar/data/131860... | [] | NaN |
available_fields = metadata.columns
print("Available filing metadata fields for this response:")
list(available_fields)
Available filing metadata fields for this response:
['ticker',
'formType',
'accessionNo',
'cik',
'companyNameLong',
'companyName',
'linkToFilingDetails',
'description',
'linkToTxt',
'filedAt',
'documentFormatFiles',
'periodOfReport',
'entities',
'id',
'seriesAndClassesContractsInformation',
'items',
'linkToHtml',
'linkToXbrl',
'dataFiles',
'effectivenessDate']
Search SEC Filings by Ticker, Form Type and Publication Date
This examples shows a how to find filter the EDGAR filings database by ticker symbol, form type and publication date. For this example, we will search for all 10-K filings for Apple Inc. (ticker: AAPL) between 2019-01-01 and 2023-12-31. The corresponding search query is:
ticker:AAPL AND formType:"10-K" AND filedAt:[2019-01-01 TO 2023-12-31]
Remember to enclose the formType
value in double quotes ("10-K"
) to ensure that the search matches 10-K filings, and not "10" or "K" filings.
from sec_api import QueryApi
queryApi = QueryApi(api_key="YOUR_API_KEY")
search_query = 'ticker:AAPL AND formType:"10-K" AND filedAt:[2019-01-01 TO 2023-12-31]'
parameters = {
"query": search_query,
"from": "0",
"size": "50",
"sort": [{"filedAt": {"order": "desc"}}],
}
response = queryApi.get_filings(parameters)
print(f"Number of 10-K filings from Apple between 2019 and 2023:\n {response['total']['value']}")
Number of 10-K filings from Apple between 2019 and 2023:
5
import json
print("Apple's 10-K filing metdata:")
print(json.dumps(response["filings"][0], indent=2))
Apple 10-K filing metdata:
{
"ticker": "AAPL",
"formType": "10-K",
"accessionNo": "0000320193-23-000106",
"cik": "320193",
"companyNameLong": "Apple Inc. (Filer)",
"companyName": "Apple Inc.",
"linkToFilingDetails": "https://www.sec.gov/Archives/edgar/data/320193/000032019323000106/aapl-20230930.htm",
"description": "Form 10-K - Annual report [Section 13 and 15(d), not S-K Item 405]",
"linkToTxt": "https://www.sec.gov/Archives/edgar/data/320193/000032019323000106/0000320193-23-000106.txt",
"filedAt": "2023-11-02T18:08:27-04:00",
"documentFormatFiles": [
{
"sequence": "1",
"size": "1558924",
"documentUrl": "https://www.sec.gov/ix?doc=/Archives/edgar/data/320193/000032019323000106/aapl-20230930.htm",
"description": "10-K",
"type": "10-K"
},
{
"sequence": "2",
"size": "121966",
"documentUrl": "https://www.sec.gov/Archives/edgar/data/320193/000032019323000106/a10-kexhibit4109302023.htm",
"description": "EX-4.1",
"type": "EX-4.1"
},
{
"sequence": "3",
"size": "11229",
"documentUrl": "https://www.sec.gov/Archives/edgar/data/320193/000032019323000106/a10-kexhibit21109302023.htm",
"description": "EX-21.1",
"type": "EX-21.1"
},
{
"sequence": "4",
"size": "5451",
"documentUrl": "https://www.sec.gov/Archives/edgar/data/320193/000032019323000106/a10-kexhibit23109302023.htm",
"description": "EX-23.1",
"type": "EX-23.1"
},
{
"sequence": "5",
"size": "10507",
"documentUrl": "https://www.sec.gov/Archives/edgar/data/320193/000032019323000106/a10-kexhibit31109302023.htm",
"description": "EX-31.1",
"type": "EX-31.1"
},
{
"sequence": "6",
"size": "10543",
"documentUrl": "https://www.sec.gov/Archives/edgar/data/320193/000032019323000106/a10-kexhibit31209302023.htm",
"description": "EX-31.2",
"type": "EX-31.2"
},
{
"sequence": "7",
"size": "8356",
"documentUrl": "https://www.sec.gov/Archives/edgar/data/320193/000032019323000106/a10-kexhibit32109302023.htm",
"description": "EX-32.1",
"type": "EX-32.1"
},
{
"sequence": "13",
"size": "10963",
"documentUrl": "https://www.sec.gov/Archives/edgar/data/320193/000032019323000106/aapl-20230930_g1.jpg",
"type": "GRAPHIC"
},
{
"sequence": "14",
"size": "146631",
"documentUrl": "https://www.sec.gov/Archives/edgar/data/320193/000032019323000106/aapl-20230930_g2.jpg",
"type": "GRAPHIC"
},
{
"sequence": "\u00a0",
"size": "9569569",
"documentUrl": "https://www.sec.gov/Archives/edgar/data/320193/000032019323000106/0000320193-23-000106.txt",
"description": "Complete submission text file",
"type": "\u00a0"
}
],
"periodOfReport": "2023-09-30",
"entities": [
{
"fiscalYearEnd": "0930",
"stateOfIncorporation": "CA",
"act": "34",
"cik": "320193",
"fileNo": "001-36743",
"irsNo": "942404110",
"companyName": "Apple Inc. (Filer)",
"type": "10-K",
"sic": "3571 Electronic Computers",
"filmNo": "231373899"
}
],
"id": "0de409aa5047085970060a9efa218f8b",
"seriesAndClassesContractsInformation": [],
"linkToHtml": "https://www.sec.gov/Archives/edgar/data/320193/000032019323000106/0000320193-23-000106-index.htm",
"linkToXbrl": "",
"dataFiles": [
{
"sequence": "8",
"size": "59730",
"documentUrl": "https://www.sec.gov/Archives/edgar/data/320193/000032019323000106/aapl-20230930.xsd",
"description": "XBRL TAXONOMY EXTENSION SCHEMA DOCUMENT",
"type": "EX-101.SCH"
},
{
"sequence": "9",
"size": "155393",
"documentUrl": "https://www.sec.gov/Archives/edgar/data/320193/000032019323000106/aapl-20230930_cal.xml",
"description": "XBRL TAXONOMY EXTENSION CALCULATION LINKBASE DOCUMENT",
"type": "EX-101.CAL"
},
{
"sequence": "10",
"size": "233471",
"documentUrl": "https://www.sec.gov/Archives/edgar/data/320193/000032019323000106/aapl-20230930_def.xml",
"description": "XBRL TAXONOMY EXTENSION DEFINITION LINKBASE DOCUMENT",
"type": "EX-101.DEF"
},
{
"sequence": "11",
"size": "854271",
"documentUrl": "https://www.sec.gov/Archives/edgar/data/320193/000032019323000106/aapl-20230930_lab.xml",
"description": "XBRL TAXONOMY EXTENSION LABEL LINKBASE DOCUMENT",
"type": "EX-101.LAB"
},
{
"sequence": "12",
"size": "516226",
"documentUrl": "https://www.sec.gov/Archives/edgar/data/320193/000032019323000106/aapl-20230930_pre.xml",
"description": "XBRL TAXONOMY EXTENSION PRESENTATION LINKBASE DOCUMENT",
"type": "EX-101.PRE"
},
{
"sequence": "90",
"size": "1432664",
"documentUrl": "https://www.sec.gov/Archives/edgar/data/320193/000032019323000106/aapl-20230930_htm.xml",
"description": "EXTRACTED XBRL INSTANCE DOCUMENT",
"type": "XML"
}
]
}
Find URLs of SEC Filings Over an Extended Period
A common use case for the Query API is retrieving filing URLs and accession numbers from a large range of filings over an extended disclosure period, and saving the metadata locally for further processing.
Use cases for such data include:
- Downloading filings via the Filing Download API
- Generating PDF versions of filings and exhibits using the PDF Generator API
- Extracting content sections with the Extractor API
- Accessing structured XBRL data in JSON format from filings with the XBRL-JSON API
- And more
One such example is to locate all Form 10-Q filings submitted to the SEC between 2014 and 2023, and save the URLs pointing to the HTML version of each filing to a local file. The initial search query is:
formType:"10-Q" AND filedAt:[2014-01-01 TO 2023-12-31]
By default, the Query API can return a maximum of 10,000 filings per search query. Since the API returns 50 filings per request, pagination is necessary to retrieve more than 50 results. The from
parameter is used to paginate through the results, specifying the starting position for each batch, and is incremented by 50 with each subsequent API call.
However, because there are typically three 10-Q filings per company each quarter, and more than 5,000 active companies per year, this query will return far more than 10,000 filings, meaning pagination alone is insufficient. To retrieve all filings, the query needs to be broken down into smaller chunks, such as by year and month, with pagination applied to each chunk. Since no single month contains more than 10,000 Form 10-Q filings, pagination within each month ensures complete retrieval.
For example, the search queries for 10-Q filings filed in January and February of 2014 would look like this:
# Search for 10-Q filings filed in January 2014
formType:"10-Q" AND filedAt:[2014-01-01 TO 2014-01-31]
# Search for 10-Q filings filed in February 2014
formType:"10-Q" AND filedAt:[2014-02-01 TO 2014-02-28]
The following approach iterates over the years from 2014 to 2023, and for each year, loops through the months from January to December. For each month, multiple Query API requests are executed by incrementing the from
parameter in steps of 50 to capture all 10-Q filings for that month. The URLs of the filings are then saved to a file on the local disk.
The same approach can be applied to other filing types and filtering criteria, such as Form 10-K filings, Form 8-K filings, filings by specific companies, and more.
pip install sec-api
from sec_api import QueryApi
queryApi = QueryApi(api_key="YOUR_API_KEY")
search_parameters = {
"query": "<PLACEHOLDER>", # will be set during runtime
"from": "0", # will be incremented by 50 during runtime
"size": "50",
"sort": [{ "filedAt": { "order": "desc" } }]
}
# open the file to store the filing URLs
log_file = open("filing_urls.csv", "a")
# fetch filings filed in 2023, then 2022, 2021, ... up to 2014
# uncomment line below to fetch all filings filed between 2014 and 2023
# for year in range(2023, 2013, -1):
for year in range(2023, 2022, -1):
print(f"Starting search for: {year}")
# iterate over all months in a year and
# build a query to fetch all 10-Q filings
# filed in that month and year
for month in range(1, 13, 1):
search_parameters["from"] = 0
form_type_query = 'formType:"10-Q"'
date_range_query = f"filedAt:[{year}-{month:02d}-01 TO {year}-{month:02d}-31]"
search_parameters["query"] = form_type_query + " AND " + date_range_query
print("Starting filing search for: ", search_parameters["query"])
# paginate through results by increasing "from" parameter
# until all results are fecthed and no more filings are returned
# uncomment line below to fetch all 10,000 filings per month
# for from_param in range(0, 9950, 50):
for from_param in range(0, 50, 50):
search_parameters["from"] = from_param
response = queryApi.get_filings(search_parameters)
# stop if no more filings are returned
if len(response["filings"]) == 0:
break
# for each filing, get the URL of the filing
# set in the dict key "linkToFilingDetails"
urls_list = list(
map(lambda x: x["linkToFilingDetails"], response["filings"])
)
# transform the list of URLs into a single string by
# joining all list elements, add a new-line character between each element
urls_string = "\n".join(urls_list) + "\n"
log_file.write(urls_string)
log_file.close()
Starting search for: 2023
Starting filing search for: formType:"10-Q" AND filedAt:[2023-01-01 TO 2023-01-31]
Starting filing search for: formType:"10-Q" AND filedAt:[2023-02-01 TO 2023-02-31]
Starting filing search for: formType:"10-Q" AND filedAt:[2023-03-01 TO 2023-03-31]
Starting filing search for: formType:"10-Q" AND filedAt:[2023-04-01 TO 2023-04-31]
Starting filing search for: formType:"10-Q" AND filedAt:[2023-05-01 TO 2023-05-31]
Starting filing search for: formType:"10-Q" AND filedAt:[2023-06-01 TO 2023-06-31]
Starting filing search for: formType:"10-Q" AND filedAt:[2023-07-01 TO 2023-07-31]
Starting filing search for: formType:"10-Q" AND filedAt:[2023-08-01 TO 2023-08-31]
Starting filing search for: formType:"10-Q" AND filedAt:[2023-09-01 TO 2023-09-31]
Starting filing search for: formType:"10-Q" AND filedAt:[2023-10-01 TO 2023-10-31]
Starting filing search for: formType:"10-Q" AND filedAt:[2023-11-01 TO 2023-11-31]
Starting filing search for: formType:"10-Q" AND filedAt:[2023-12-01 TO 2023-12-31]
# let's inspect the content of file
log_file = open("filing_urls.csv", "r")
urls = log_file.readlines()
# replace \n at the end of each line with empty string
urls = list(map(lambda x: x.replace("\n", ""), urls))
print("Number of URLs fetched:\n", len(urls))
print("Snippet of 10-Q filing URLs fetched: ")
urls[:10]
Number of URLs fetched:
600
Snippet of 10-Q filing URLs fetched:
['https://www.sec.gov/Archives/edgar/data/1596993/000159699323000011/lpg-20221231x10q.htm',
'https://www.sec.gov/Archives/edgar/data/1666138/000166613823000032/atkr-20221230.htm',
'https://www.sec.gov/Archives/edgar/data/96021/000009602123000033/syy-20221231.htm',
'https://www.sec.gov/Archives/edgar/data/50725/000005072523000006/gff-20221231.htm',
'https://www.sec.gov/Archives/edgar/data/17313/000001731323000006/cswc-20221231.htm',
'https://www.sec.gov/Archives/edgar/data/1169561/000116956123000013/cvlt-20221231.htm',
'https://www.sec.gov/Archives/edgar/data/1680873/000168087323000008/hffg-20220930.htm',
'https://www.sec.gov/Archives/edgar/data/1680873/000168087323000007/hffg-20220630.htm',
'https://www.sec.gov/Archives/edgar/data/1680873/000168087323000006/hffg-20220331.htm',
'https://www.sec.gov/Archives/edgar/data/1681622/000168162223000013/var-20221230.htm']
How to Export SEC Filing Metadata to a CSV File
SEC EDGAR filing metadata, including the filer's CIK, ticker, URL to the filing content, filing date, form type, and other parameters, can be exported to a CSV file for analysis in different tools. This example demonstrates how to export SEC filing metadata to a CSV file using the sec-api
library.
The Query API is used to retrieve SEC filings that match a specified search criteria, such as all filings from Microsoft in January 2024.
ticker:MSFT AND filedAt:[2024-01-01 TO 2024-01-31]
A field_names
list is defined, with each entry representing a filing field and column to be saved to the CSV file. Pandas is used to create a DataFrame from the search results and export the data to a CSV file filings.csv
.
pip install sec-api
from sec_api import QueryApi
import pandas as pd
queryApi = QueryApi(api_key="YOUR_API_KEY")
search_parameters = {
"query": "ticker:MSFT AND filedAt:[2024-01-01 TO 2024-03-31]",
"from": "0",
"size": "50",
"sort": [{"filedAt": {"order": "desc"}}],
}
response = queryApi.get_filings(search_parameters)
filings = pd.DataFrame(response["filings"])
print("Number of filings: ", len(filings))
filings.head(3)
Number of filings: 42
ticker | formType | accessionNo | cik | companyNameLong | companyName | linkToFilingDetails | description | linkToTxt | filedAt | documentFormatFiles | periodOfReport | entities | id | seriesAndClassesContractsInformation | linkToHtml | linkToXbrl | dataFiles | items | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | MSFT | 11-K | 0000950170-24-034807 | 789019 | MICROSOFT CORP (Filer) | MICROSOFT CORP | https://www.sec.gov/Archives/edgar/data/789019... | Form 11-K - Annual report of employee stock pu... | https://www.sec.gov/Archives/edgar/data/789019... | 2024-03-21T16:43:04-04:00 | [{'sequence': '1', 'size': '169301', 'document... | 2023-12-31 | [{'fiscalYearEnd': '0630', 'stateOfIncorporati... | 6bfec9c9284b22aaa31b585dfc2367e1 | [] | https://www.sec.gov/Archives/edgar/data/789019... | [] | NaN | |
1 | MSFT | 4 | 0001062993-24-006558 | 789019 | MICROSOFT CORP (Issuer) | MICROSOFT CORP | https://www.sec.gov/Archives/edgar/data/789019... | Form 4 - Statement of changes in beneficial ow... | https://www.sec.gov/Archives/edgar/data/789019... | 2024-03-15T18:14:39-04:00 | [{'sequence': '1', 'size': ' ', 'documentUrl':... | 2024-03-14 | [{'fiscalYearEnd': '0630', 'stateOfIncorporati... | 97a378d6579c752fb2b8dc472895dbaf | [] | https://www.sec.gov/Archives/edgar/data/789019... | [] | NaN | |
2 | MSFT | 4 | 0001062993-24-006557 | 789019 | MICROSOFT CORP (Issuer) | MICROSOFT CORP | https://www.sec.gov/Archives/edgar/data/789019... | Form 4 - Statement of changes in beneficial ow... | https://www.sec.gov/Archives/edgar/data/789019... | 2024-03-15T18:14:13-04:00 | [{'sequence': '1', 'size': ' ', 'documentUrl':... | 2024-03-14 | [{'fiscalYearEnd': '0630', 'stateOfIncorporati... | f214c7b6a44d43eda13f77d3c06e5570 | [] | https://www.sec.gov/Archives/edgar/data/789019... | [] | NaN |
# every entry in field_names represents a filing parameter
# and the column name in the CSV file
field_names = ["id", "accessionNo", "formType",
"filedAt", "items", "periodOfReport",
"companyName", "cik", "ticker",
"linkToFilingDetails", "linkToTxt",
"documentFormatFiles", "dataFiles", "entities"]
# write the data to a CSV file
filings[field_names].to_csv('filings.csv', index=False)
# read data from the CSV file
filings_from_csv = pd.read_csv('filings.csv')
print("Filing metadata of 10 most recent filings by Tesla saved to filings.csv:\n")
filings_from_csv.head(3)
Filing metadata of 10 most recent filings by Tesla saved to filings.csv:
id | accessionNo | formType | filedAt | items | periodOfReport | companyName | cik | ticker | linkToFilingDetails | linkToTxt | documentFormatFiles | dataFiles | entities | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 6bfec9c9284b22aaa31b585dfc2367e1 | 0000950170-24-034807 | 11-K | 2024-03-21T16:43:04-04:00 | NaN | 2023-12-31 | MICROSOFT CORP | 789019 | MSFT | https://www.sec.gov/Archives/edgar/data/789019... | https://www.sec.gov/Archives/edgar/data/789019... | [{'sequence': '1', 'size': '169301', 'document... | [] | [{'fiscalYearEnd': '0630', 'stateOfIncorporati... |
1 | 97a378d6579c752fb2b8dc472895dbaf | 0001062993-24-006558 | 4 | 2024-03-15T18:14:39-04:00 | NaN | 2024-03-14 | MICROSOFT CORP | 789019 | MSFT | https://www.sec.gov/Archives/edgar/data/789019... | https://www.sec.gov/Archives/edgar/data/789019... | [{'sequence': '1', 'size': '\xa0', 'documentUr... | [] | [{'fiscalYearEnd': '0630', 'stateOfIncorporati... |
2 | f214c7b6a44d43eda13f77d3c06e5570 | 0001062993-24-006557 | 4 | 2024-03-15T18:14:13-04:00 | NaN | 2024-03-14 | MICROSOFT CORP | 789019 | MSFT | https://www.sec.gov/Archives/edgar/data/789019... | https://www.sec.gov/Archives/edgar/data/789019... | [{'sequence': '1', 'size': '\xa0', 'documentUr... | [] | [{'fiscalYearEnd': '0630', 'stateOfIncorporati... |
How to Find XSD URLs of Annual Reports on Form 10-K
This example demonstrates how to locate and download URLs for XSD files containing the XBRL taxonomy extension schema from 10-K filings submitted between 2020 and 2023. The located URLs are saved in a local file, xsd_urls.csv
, and can be used to download the XSD files via the Download API.
The link to the content of an XSD file can be found in the documentUrl
field within the first item of the dataFiles
array of a filing metadata object, as returned by the Query API.
The Python code example below shows how to use the Query API to search for 10-K filings within the specified date range and extract the URLs of the XSD files. Search queries are constructed for each month from 2020 to 2023, and all filings for that month are retrieved through pagination by incrementing the from
parameter until all filings are collected. The XSD URLs are then extracted from the dataFiles
array in the metadata and stored in a local CSV file.
The QUICK_RUN
variable allows for a test run by skipping 90% of the URLs within the time range. Set it to False
to retrieve all URLs.
pip install sec-api
from sec_api import QueryApi
queryApi = QueryApi(api_key="YOUR_API_KEY")
# file to write XSD URLs to
XSD_FILE_NAME = "xsd_urls.csv"
# filing type to extract XSD URLs from
FILING_TYPE = "10-K"
# search range
YEAR_START = 2022
YEAR_END = 2023
# set True if you want to test the program and skip 90% of filings (quick run)
# set False if you want to download all URLs (takes up to 10 minutes)
QUICK_RUN = True
search_params = {
"query": "PLACEHOLDER", # this will be set during runtime
"from": "0",
"size": "50",
"sort": [{ "filedAt": { "order": "desc" } }]
}
# open the file we use to store the XSD URLs
log_file = open(XSD_FILE_NAME, "a")
# helper function to extract the XSD URL of the first item
# in the dataFiles list of a filing
def get_xsd_url(filing):
if len(filing["dataFiles"]) > 0:
return filing["dataFiles"][0]["documentUrl"]
return None
for year in range(YEAR_START, YEAR_END + 1, 1):
print(f"Starting {year}")
# a single search universe is represented as a month of the given year
for month in range(1, 13, 1):
search_params["from"] = 0
form_type_query = f'formType:"{FILING_TYPE}"'
date_range_query = f"filedAt:[{year}-{month:02d}-01 TO {year}-{month:02d}-31]"
# example query: "formType:"10-K" AND filedAt:[2021-01-01 TO 2021-01-31]"
search_params["query"] = f"{form_type_query} AND {date_range_query}"
print("Starting filing search for: ", search_params["query"])
# paginate through results by increasing "from" parameter
# until all filings are fetched
end = 50 if QUICK_RUN else 10000
for from_batch in range(0, end, 50):
search_params["from"] = from_batch
response = queryApi.get_filings(search_params)
if len(response["filings"]) == 0:
break
# for each filing, get the URL of the XSD file
urls_list = list(map(get_xsd_url, response["filings"]))
# remove empty URLS
urls_list = list(filter(None, urls_list))
# transform list of URLs into one string by joining all list elements
# and add a new-line character between each element.
urls_string = "\n".join(urls_list) + "\n"
log_file.write(urls_string)
break
log_file.close()
print("All done!")
Starting 2022
Starting filing search for: formType:"10-K" AND filedAt:[2022-01-01 TO 2022-01-31]
Starting filing search for: formType:"10-K" AND filedAt:[2022-02-01 TO 2022-02-31]
Starting filing search for: formType:"10-K" AND filedAt:[2022-03-01 TO 2022-03-31]
Starting filing search for: formType:"10-K" AND filedAt:[2022-04-01 TO 2022-04-31]
Starting filing search for: formType:"10-K" AND filedAt:[2022-05-01 TO 2022-05-31]
Starting filing search for: formType:"10-K" AND filedAt:[2022-06-01 TO 2022-06-31]
Starting filing search for: formType:"10-K" AND filedAt:[2022-07-01 TO 2022-07-31]
Starting filing search for: formType:"10-K" AND filedAt:[2022-08-01 TO 2022-08-31]
Starting filing search for: formType:"10-K" AND filedAt:[2022-09-01 TO 2022-09-31]
Starting filing search for: formType:"10-K" AND filedAt:[2022-10-01 TO 2022-10-31]
Starting filing search for: formType:"10-K" AND filedAt:[2022-11-01 TO 2022-11-31]
Starting filing search for: formType:"10-K" AND filedAt:[2022-12-01 TO 2022-12-31]
Starting 2023
Starting filing search for: formType:"10-K" AND filedAt:[2023-01-01 TO 2023-01-31]
Starting filing search for: formType:"10-K" AND filedAt:[2023-02-01 TO 2023-02-31]
Starting filing search for: formType:"10-K" AND filedAt:[2023-03-01 TO 2023-03-31]
Starting filing search for: formType:"10-K" AND filedAt:[2023-04-01 TO 2023-04-31]
Starting filing search for: formType:"10-K" AND filedAt:[2023-05-01 TO 2023-05-31]
Starting filing search for: formType:"10-K" AND filedAt:[2023-06-01 TO 2023-06-31]
Starting filing search for: formType:"10-K" AND filedAt:[2023-07-01 TO 2023-07-31]
Starting filing search for: formType:"10-K" AND filedAt:[2023-08-01 TO 2023-08-31]
Starting filing search for: formType:"10-K" AND filedAt:[2023-09-01 TO 2023-09-31]
Starting filing search for: formType:"10-K" AND filedAt:[2023-10-01 TO 2023-10-31]
Starting filing search for: formType:"10-K" AND filedAt:[2023-11-01 TO 2023-11-31]
Starting filing search for: formType:"10-K" AND filedAt:[2023-12-01 TO 2023-12-31]
All done!
import pandas as pd
xsd_urls = pd.read_csv(XSD_FILE_NAME, header=None, names=["URL"])
print("Number of XSD File URLs:\n", xsd_urls.shape[0])
print("First 5 XSD File URLs:")
xsd_urls.head()
Number of XSD File URLs:
120
First 5 XSD File URLs:
URL | |
---|---|
0 | https://www.sec.gov/Archives/edgar/data/12239/... |
1 | https://www.sec.gov/Archives/edgar/data/831489... |
2 | https://www.sec.gov/Archives/edgar/data/106508... |
3 | https://www.sec.gov/Archives/edgar/data/109006... |
4 | https://www.sec.gov/Archives/edgar/data/12927/... |
How to Get a List of Fiscal Period End Dates of Companies
The fiscal period end date, also referred to as "period of report" in EDGAR filings, is stated in Form 10-Q, 10-K, 13F-HR and other SEC filings. In the case of SEC filings, the period of report defines the time period the filing applies to, and is reported as a single date rather than an actual time period. The property periodOfReport
in a filing object returned by the Query API represents its value.
In the following example, the Query API is used to find and list the period of reports of companies that disclosed the 50 latest 10-K filings filed in 2023. The approach can be adapted to extract period end dates for any filing type, such as 10-Q filings, across different years.
To speed up the collection of data points, 10-K filings without their variants, such as 10-K/A and late submission notifications NT 10-K, are considered. After fetching the metadata of the filings and extracting the period end date values, a pandas DataFrame is created, holding the fiscal year end periods per CIK and per ticker of the companies, respectively.
pip install sec-api
from sec_api import QueryApi
queryApi = QueryApi(api_key="YOUR_API_KEY")
search_params = {
"query": "formType:"10-K" " +
"AND NOT formType:"NT 10-K" " +
"AND NOT formType:"10-K/A" " +
"AND filedAt:[2023-01-01 TO 2023-12-31]",
"from": "0",
"size": "50",
"sort": [{ "filedAt": { "order": "desc" } }]
}
response = queryApi.get_filings(search_params)
The Query API response contains a list of filings under response["filings"]
. To inspect the response, a subset of properties from each filing—specifically formType
and periodOfReport
—can be printed.
For those unfamiliar with map
and lambda
, here's a brief explanation. The map
function applies a lambda
function to each filing in the list. In this case, the lambda
function returns a new dictionary for each filing, extracting the formType
and periodOfReport
properties and setting them as values in the new dictionary. The result is then converted into a list for further use.
list(
map(
lambda x: {"formType": x["formType"], "periodOfReport": x["periodOfReport"]},
response["filings"],
)
)[:5]
[{'formType': '10-K', 'periodOfReport': '2023-09-30'},
{'formType': '10-K', 'periodOfReport': '2023-09-30'},
{'formType': '10-K', 'periodOfReport': '2023-07-31'},
{'formType': '10-K', 'periodOfReport': '2023-09-30'},
{'formType': '10-K', 'periodOfReport': '2023-09-30'}]
The same data, containing the formType
and periodOfReport
properties, can also be retrieved and processed using pandas DataFrames. This approach provides a more structured and flexible way to handle and analyze the data.
import pandas as pd
filings = pd.DataFrame(response["filings"])
filings[["cik", "ticker", "formType", "periodOfReport"]].head()
cik | ticker | formType | periodOfReport | |
---|---|---|---|---|
0 | 90168 | SIF | 10-K | 2023-09-30 |
1 | 1967306 | MSBB | 10-K | 2023-09-30 |
2 | 1787412 | WBBA | 10-K | 2023-07-31 |
3 | 12040 | BDL | 10-K | 2023-09-30 |
4 | 1929589 | MRDB | 10-K | 2023-09-30 |
print("Fiscal Period End Dates of Companies by CIK:")
filings[["cik", "periodOfReport"]].head()
Fiscal Period End Dates of Companies by CIK:
cik | periodOfReport | |
---|---|---|
0 | 90168 | 2023-09-30 |
1 | 1967306 | 2023-09-30 |
2 | 1787412 | 2023-07-31 |
3 | 12040 | 2023-09-30 |
4 | 1929589 | 2023-09-30 |
print("Fiscal Period End Dates of Companies by Ticker:")
filings[["ticker", "periodOfReport"]].head()
Fiscal Period End Dates of Companies by Ticker:
ticker | periodOfReport | |
---|---|---|
0 | SIF | 2023-09-30 |
1 | MSBB | 2023-09-30 |
2 | WBBA | 2023-07-31 |
3 | BDL | 2023-09-30 |
4 | MRDB | 2023-09-30 |
Finding EDGAR Filings by Item
The following examples show how to search for specific SEC filings based on their items using the Query API. These examples focus on locating 8-K filings that include Item 1.03, "Bankruptcy or Receivership," as well as Form D filings referencing items 6b and 3C.1. The approach can be adapted to search for any EDGAR filings that reference different items.
The EDGAR form types that include or reference specific items and can be searched by item are as follows:
- 8-K
- 8-K12B
- 8-K12G3
- 8F-2 ORDR
- D
- APP ORDR
- APP NTC
- APP WDG
- 1-U
- ABS-15G
- N-8F ORDR
- N-8F NTC
- 6B ORDR
pip install -q sec-api
from sec_api import QueryApi
queryApi = QueryApi(api_key="YOUR_API_API_KEY")
Finding 8-K Filings by Event Items
This example demonstrates how to search for Form 8-K filings based on the event item IDs disclosed within the filing. The URLs and accession numbers of the located filings can then be used in combination with the Extractor API to extract and download the desired content sections of the event items.
For instance, the following search query retrieves the metadata of 8-K filings that include Item 1.03, "Bankruptcy or Receivership":
formType:"8-K" AND items:"1.03"
This query filters filings with a form type of "8-K" that also contain Item 1.03. To search for other event items, simply replace 1.03
with the desired item ID. For example, items:"3.01"
corresponds to the section "Notice of Delisting or Failure to Satisfy a Continued Listing Rule or Standard; Transfer of Listing."
To further refine the search by date, a date range can be added:
filedAt:[2019-01-01 TO 2022-12-31]
This ensures that filings within the specified date range are retrieved, allowing for more focused results when running the query.
search_params = {
"query": 'formType:"8-K" AND items:"1.03" AND filedAt:[2021-01-01 TO 2023-12-31]',
"from": "0",
"size": "50",
"sort": [{"filedAt": {"order": "desc"}}],
}
response = queryApi.get_filings(search_params)
import pandas as pd
filings = pd.json_normalize(response['filings'])
print('Total filings matching the search query:', response['total']['value'])
print('Filings returned per API call:', len(filings))
Total filings matching the search query: 282
Filings returned per API call: 50
A total of 282 Form 8-K filings with Item 1.03 were filed between 2021-01-01 and 2023-12-31. The Query API limits the number of filings returned per request to a maximum of 50. In order to retrieve all 282 filings, we need to implement pagination by adjusting the from
parameter in subsequent requests while keeping the size
parameter constant at 50
. The from
parameter indicates the starting position or offset in the search results.
To fetch all 282 Form 8-K filings, we can define a function called get_filings(query)
that handles the pagination process. This function will retrieve the next set of 50 filings by incrementing the from
parameter in each subsequent request until no more filings are returned.
def get_filings(query):
search_params = {
"query": query,
"from": 0,
"size": "50",
"sort": [{"filedAt": {"order": "desc"}}],
}
all_filings = []
while True:
response = queryApi.get_filings(search_params)
filings = response["filings"]
if len(filings) == 0:
break
all_filings.extend(filings)
search_params["from"] += 50
return pd.json_normalize(all_filings)
search_query = 'formType:"8-K" AND items:"1.03" AND filedAt:[2021-01-01 TO 2023-12-31]'
form_8K_with_1_03 = get_filings(search_query)
print('Number of 8-K filings with Item 1.03 between 2021 and 2023:
', len(form_8K_with_1_03))
Number of 8-K filings with Item 1.03 between 2021 and 2023:
282
print('Metadata of 8-K filings')
print('------------------------')
form_8K_with_1_03[["formType", "accessionNo", "cik", "ticker", "filedAt", "items", "linkToFilingDetails"]].head()
Metadata of 8-K filings
------------------------
formType | accessionNo | cik | ticker | filedAt | items | linkToFilingDetails | |
---|---|---|---|---|---|---|---|
0 | 8-K | 0001193125-23-304154 | 716006 | YELLQ | 2023-12-27T17:20:57-05:00 | [Item 1.03: Bankruptcy or Receivership, Item 2... | https://www.sec.gov/Archives/edgar/data/716006... |
1 | 8-K | 0001104659-23-129388 | 84129 | RADCQ | 2023-12-27T06:03:09-05:00 | [Item 1.01: Entry into a Material Definitive A... | https://www.sec.gov/Archives/edgar/data/84129/... |
2 | 8-K/A | 0001861449-23-000222 | 1861449 | BRDSQ | 2023-12-26T21:05:59-05:00 | [Item 1.01: Entry into a Material Definitive A... | https://www.sec.gov/Archives/edgar/data/186144... |
3 | 8-K | 0001213900-23-098373 | 1698113 | IDICQ | 2023-12-26T09:09:08-05:00 | [Item 1.01: Entry into a Material Definitive A... | https://www.sec.gov/Archives/edgar/data/169811... |
4 | 8-K | 0001861449-23-000220 | 1861449 | BRDSQ | 2023-12-26T08:54:08-05:00 | [Item 1.01: Entry into a Material Definitive A... | https://www.sec.gov/Archives/edgar/data/186144... |
print('Items disclosed in the first 2 filings:')
list(form_8K_with_1_03['items'])[:2]
Items disclosed in the first 2 filings:
[['Item 1.03: Bankruptcy or Receivership',
'Item 2.01: Completion of Acquisition or Disposition of Assets'],
['Item 1.01: Entry into a Material Definitive Agreement',
'Item 1.03: Bankruptcy or Receivership',
'Item 8.01: Other Events',
'Item 9.01: Financial Statements and Exhibits']]
The linkToFilingDetails
field in the response contains the URLs to the 8-K filings and items
provides the item IDs of the triggering events that are disclosed in each filing. These URLs can be used to extract and download the textual content of Item 1.03 from all the located filings using the Extractor API.
Find Form D Filings by Item
The next example demonstrates how to find Form D filings by items specified under "Item 6 - Federal Exemptions and Exclusions Claimed". The following list represents commonly used items:
04
04.1
04.2
04.3
4a5
06b
06c
3C
3C.1
3C.3
3C.4
3C.5
3C.6
3C.7
3C.9
The following query will search for Form D filings (formType:D
) that include both 06b
and 3C.1
items. Modify the query to include other items as needed.
formType:D AND items:"06b" AND items:"3C.1"
search_query = (
'formType:D AND items:"06b" AND items:"3C.1" AND filedAt:[2023-01-01 TO 2023-01-31]'
)
form_D_filings = get_filings(search_query)
print('Number of Form D filings found:', len(form_D_filings))
Number of Form D filings found: 1166
1166 Form D filings were found that match the search criteria. To validate and ensure that all filings include items 6b and 3C.1 in the items
field, inspect the items
field of the first five filings using the following code snippet:
list(form_D_filings['items'])[:5]
[['Item 06b: ',
'Item 3C: Investment Company Act Section 3(c)',
'Item 3C.1: Section 3(c)(1)'],
['Item 06b: ',
'Item 3C: Investment Company Act Section 3(c)',
'Item 3C.1: Section 3(c)(1)'],
['Item 06b: ',
'Item 3C: Investment Company Act Section 3(c)',
'Item 3C.1: Section 3(c)(1)'],
['Item 06b: ',
'Item 3C: Investment Company Act Section 3(c)',
'Item 3C.1: Section 3(c)(1)',
'Item 3C.7: Section 3(c)(7)'],
['Item 06b: ',
'Item 3C: Investment Company Act Section 3(c)',
'Item 3C.1: Section 3(c)(1)']]