Download SEC Filings as PDFs
This guide demonstrates how to download SEC EDGAR filings, such as Form 10-K filings, in PDF format using Python and the PDF Generator API.
Overview of SEC Filing Formats
Most SEC filings, such as annual and quarterly reports on Forms 10-K and 10-Q, are published in HTML format. Older filings, particularly those before 2005, are often only available as text files (.txt). Some current forms, like Form 4 (insider trading reports), Form 13F (institutional portfolio holdings), and Form N-PORT (mutual fund portfolio updates), are available exclusively in XML format. In rare cases, filings such as SEC staff actions (ORDER form type) are directly available as PDFs.
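As a rough illustration of the formats listed above, the sketch below guesses a document's source format from the file extension in its EDGAR URL. The function name and the extension mapping are assumptions made for this example; they are not part of the sec-api package, and an extension is only a hint about the actual content.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical mapping from file extension to the source formats described above.
SOURCE_FORMATS = {
    ".htm": "html",
    ".html": "html",
    ".txt": "text",
    ".xml": "xml",
    ".pdf": "pdf",
}


def guess_source_format(file_url: str) -> str:
    """Guess a filing's source format from the extension in its URL path."""
    suffix = PurePosixPath(urlparse(file_url).path).suffix.lower()
    return SOURCE_FORMATS.get(suffix, "unknown")
```

Such a check can help decide whether a document needs conversion at all, since files that are already PDFs can be downloaded as they are.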
Why Conversion to PDF is Necessary
Because most SEC filings, exhibits, and attachments are not published as PDFs, their HTML, XML, or text-based content must be converted before it can be downloaded in PDF format.
However, most off-the-shelf PDF converters face significant challenges when handling SEC filings due to several limitations:
- Image processing: Large filings, such as Form 10-K annual reports or DEF14A proxy statements, often contain more than 10 images. Off-the-shelf converters may exceed the SEC’s rate limit of 10 requests per second when attempting to download all images during conversion. This frequently results in failed conversions, as the EDGAR system blocks requests beyond this limit.
- Image scaling: Images in the original HTML filings are often not scalable to fit standard A4 page dimensions, requiring adjustments before conversion.
- Links: SEC filings often contain non-standard URL targets, which can result in broken links in converted PDFs if not properly standardized.
- Fonts: Font types in SEC filings vary significantly and are frequently used to style checkboxes and lists. If not handled properly during PDF conversion, these elements may become invisible, leading to incomplete or unreadable documents.
- File size and complexity: Large filings such as prospectuses, registration statements, and annual reports can overwhelm conventional PDF converters, causing memory overflow issues. Additionally, inline XBRL metadata bloats the resulting file if it is not removed before conversion.
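To illustrate the rate-limit problem from the first point, here is a minimal sketch of a client-side throttle that spaces out requests so that no more than 10 are sent per second. The class name and its parameters are assumptions made for this example. It is a generic technique that a custom converter could apply when fetching images from EDGAR, not a description of how the PDF Generator API works internally.

```python
import time


class RequestThrottle:
    """Spaces out calls so that at most max_per_second happen per second."""

    def __init__(self, max_per_second=10.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock      # injectable for testing
        self._sleep = sleep      # injectable for testing
        self._next_allowed = 0.0

    def wait(self):
        """Block until the next request may be sent."""
        now = self._clock()
        if now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        # Reserve the next slot one interval after this request.
        self._next_allowed = now + self.min_interval
```

Calling `throttle.wait()` before each image request keeps a single-threaded downloader under the limit. A multi-threaded downloader would additionally need a lock around `wait()`.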
PDF Generator API
The PDF Generator API overcomes these limitations by converting HTML, XML, and text-based SEC filings and exhibits into PDFs while preserving the original formatting, and optimizing images and tables. The API supports downloading all EDGAR form types as PDFs, such as Forms 10-K, 10-Q, 8-K, DEF14A, and more, published from 1993 to present.
Key features include:
- Optimized images: Images are scaled and optimized for high-quality printing, particularly in forms like proxy statements.
- File size management: Invisible inline XBRL tags are removed to reduce file size and avoid unnecessary bloat.
- Preservation of original content: All original content is retained without alteration.
- Links and Fonts: Links are standardized, and fonts are processed to ensure visibility of checkboxes and lists.
- Shareable and printable PDFs: The resulting PDFs are easily shareable, printable, and suitable for archiving.
Note: Legacy text-based filings (.txt) were not designed for print-friendly PDF output. As a result, table rows may span multiple lines, potentially exceeding A4 page width. For tasks involving natural language processing (NLP), large language models (LLMs), or retrieval-augmented generation (RAG), using the original text format may yield better accuracy than converting to PDF.
Quick Start
To download EDGAR filings and exhibits as PDFs, use the .get_pdf(file_url) function from the PdfGeneratorApi class in the sec-api Python package. The file_url can represent any EDGAR filing or exhibit from 1993 to present, such as Tesla's 2024 10-K filing:
https://www.sec.gov/Archives/edgar/data/1318605/000162828024002390/tsla-20231231.htm
Below are examples of various use cases for the PDF Generator API. Before proceeding, make sure to install the sec-api package via pip:
pip install sec-api
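The examples on this page pass the API key as a string literal. One common alternative is to read the key from an environment variable so that it does not end up in source control. The sketch below assumes a variable named SEC_API_KEY; both that name and the helper function are hypothetical choices for this example and are not defined by the sec-api package.

```python
import os


def load_api_key(env_var: str = "SEC_API_KEY") -> str:
    """Return the API key stored in the given environment variable."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return key


# Usage (requires `pip install sec-api`):
# from sec_api import PdfGeneratorApi
# pdfGeneratorApi = PdfGeneratorApi(load_api_key())
```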
Download 10-K Filings as PDFs
This example demonstrates how to download SEC 10-K filings (annual reports) as PDFs. Form 10-K filings are published on EDGAR in HTML format, or in text-based format (.txt) for older filings. These filings are not available as PDFs by default, so their HTML or text-based versions must be converted to PDF to download them in that format. Annual reports often span hundreds of pages and include numerous tables and images that require form-type-specific adjustments before PDF conversion. This can overwhelm standard PDF converters or cause issues when saving the webpage as a PDF through a browser. Additionally, Form 10-K filings often contain invisible metadata that needs to be removed to prevent bloated file sizes.
The PDF Generator API addresses these challenges by converting 10-K filings into properly formatted PDFs while optimizing the file size. Simply provide the URL of the 10-K filing on EDGAR, use the .get_pdf(filing_10K_url) method, and save the resulting PDF locally. For example, Tesla's 2024 10-K filing, which contains over 100 pages, is easily converted to a PDF with a final file size of just 4 MB. The converted PDF of the 10-K filing is accessible here for direct download and evaluation.
from sec_api import PdfGeneratorApi
pdfGeneratorApi = PdfGeneratorApi("YOUR_API_KEY")
# example URL of Tesla's 2024 10-K filing
filing_10K_url = "https://www.sec.gov/Archives/edgar/data/1318605/000162828024002390/tsla-20231231.htm"
# download 10-K filing as PDF
pdf_10K_filing = pdfGeneratorApi.get_pdf(filing_10K_url)
# save PDF of 10-K filing to disk
with open("tesla_10K.pdf", "wb") as file:
    file.write(pdf_10K_filing)
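The example above hard-codes the output file name. When downloading many filings, a name derived from the URL can avoid collisions. The sketch below relies on the path layout visible in the URLs on this page (/Archives/edgar/data/, then the CIK, the accession number without dashes, and the document name). The helper name and the naming scheme are assumptions for illustration only.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def pdf_name_for(file_url: str) -> str:
    """Build a file name like '<cik>_<accession>_<document>.pdf' from an EDGAR URL."""
    parts = PurePosixPath(urlparse(file_url).path).parts
    # Expected layout: ('/', 'Archives', 'edgar', 'data', cik, accession, ..., document)
    if len(parts) < 7 or parts[1:4] != ("Archives", "edgar", "data"):
        raise ValueError(f"Not a recognized EDGAR archive URL: {file_url}")
    cik, accession = parts[4], parts[5]
    stem = PurePosixPath(parts[-1]).stem  # document name without its extension
    return f"{cik}_{accession}_{stem}.pdf"
```

For the Tesla URL used above, this yields 1318605_000162828024002390_tsla-20231231.pdf.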
Download Proxy Statements (DEF 14A) as PDFs
This example demonstrates how to download proxy statements (Form DEF 14A) as PDFs. Proxy statements typically include a "Scan to View Materials & Vote" image, as well as an image of the proxy voting card. When a proxy statement is downloaded as a PDF using a browser or standard PDF converter, the voting cards are often not correctly scaled for A4 page dimensions, so portions of the image are cut off. This makes the document unfit for printing and returning for voting purposes.
The PDF Generator API addresses this issue by scaling all images, including voting cards, to fit A4 page dimensions, ensuring that the proxy statement is both printable and easily shareable as a PDF.
In the example below, Nvidia’s 2024 80-page proxy statement is downloaded as a PDF using the .get_pdf(proxy_statement_url) function from the PdfGeneratorApi class in the sec-api package, allowing the proxy filing to be saved in PDF format. The converted PDF of the proxy statement is available here for direct download and evaluation.
from sec_api import PdfGeneratorApi
pdfGeneratorApi = PdfGeneratorApi("YOUR_API_KEY")
# example URL of Nvidia's 2024 proxy statement (DEF14A)
proxy_statement_url = "https://www.sec.gov/Archives/edgar/data/1045810/000104581024000104/nvda-20240514.htm"
# download proxy statement as PDF
pdf_proxy_filing = pdfGeneratorApi.get_pdf(proxy_statement_url)
# save PDF of proxy statement to disk
with open("nvidia_proxy_statement.pdf", "wb") as file:
    file.write(pdf_proxy_filing)
Download 8-K Filings as PDFs
This example demonstrates how to download SEC Form 8-K filings as PDFs. Form 8-K filings disclose material events, such as director changes or cybersecurity incidents, for publicly traded companies. There are 33 triggering events that require companies to file an 8-K, providing detailed information about the event. For instance, in the case of director changes, the filing may include details about who is leaving, whether a replacement has been appointed, and the reasons for the departure.
Like most EDGAR form types, 8-K filings are published in HTML format rather than as PDFs. If specific sections of the filing, such as the content of Item 1.05, need to be extracted, the Extractor API can be used.
In this example, Microsoft's Form 8-K filing with Item 1.05, disclosing a cybersecurity incident, is converted to PDF. The converted PDF of the Form 8-K filing is available here for download and review.
from sec_api import PdfGeneratorApi
pdfGeneratorApi = PdfGeneratorApi("YOUR_API_KEY")
# example URL of Microsoft's Form 8-K filing disclosing a cybersecurity incident
form_8K_filing_url = "https://www.sec.gov/Archives/edgar/data/789019/000119312524011295/d708866d8k.htm"
# download Form 8-K filing as PDF
pdf_8K_filing = pdfGeneratorApi.get_pdf(form_8K_filing_url)
# save PDF of 8-K to disk
with open("microsoft_8K.pdf", "wb") as file:
    file.write(pdf_8K_filing)
Download Form 4 Filings as PDFs
This example demonstrates how to download SEC Form 4 filings as PDFs. SEC Form 4 filings are submitted when an insider, such as a director or a 10% shareholder, buys or sells shares in a company. These filings provide details such as the insider's name and position, the company involved, and specifics about the transaction: how many shares were purchased or sold, the price, and the insider's remaining holdings after the transaction.
Like Forms 3 and 5, Form 4 filings are only available in XML format. To download them as PDFs, they must first be converted.
This example demonstrates how to download the Form 4 filing disclosing Berkshire Hathaway's $87 million purchase of SIRI stock. The PDF of the Form 4 filing is available here for download and review.
from sec_api import PdfGeneratorApi
pdfGeneratorApi = PdfGeneratorApi("YOUR_API_KEY")
# Form 4 filing disclosing Berkshire Hathaway's $87 million purchase of SIRI stock
filing_4_url = "https://www.sec.gov/Archives/edgar/data/315090/000095017024114414/xslF345X05/ownership.xml"
# download Form 4 filing as PDF
pdf_form_4_filing = pdfGeneratorApi.get_pdf(filing_4_url)
# save PDF of Form 4 filing
with open("berkshire_form-4_filing.pdf", "wb") as file:
    file.write(pdf_form_4_filing)
Download Exhibits of SEC Filings as PDFs
This example demonstrates how to download SEC filing exhibits, such as Exhibit 99, as PDFs. SEC filing exhibits encompass a wide range of information, including press releases, material contracts, underwriting agreements, bylaws, tax opinions, letters regarding unaudited interim financials, changes in certifying accountants, filing fee tables, and more. Different form types contain different types of exhibits, most of which are published in HTML format. To download exhibits as PDFs, conversion is required.
The example below demonstrates how to download AMD's press release on its Q2 2024 financial results from Exhibit 99 of the company's Form 8-K filing and convert it to PDF. The generated PDF of the press release can be downloaded and reviewed here.
from sec_api import PdfGeneratorApi
pdfGeneratorApi = PdfGeneratorApi("YOUR_API_KEY")
# URL of AMD's press release about its Q2 2024 financial results
amd_pr_financial_results_url = "https://www.sec.gov/Archives/edgar/data/2488/000000248824000121/q22024991.htm"
# download PDF of AMD's press release for second quarter 2024 financial results
pdf_press_release = pdfGeneratorApi.get_pdf(amd_pr_financial_results_url)
# save PDF of press release
with open("amd_financial_results_press_release.pdf", "wb") as file:
    file.write(pdf_press_release)
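All examples on this page follow the same two steps: call .get_pdf(url), then write the returned bytes to disk. For several filings at once, those steps can be wrapped in a small loop that records failures instead of stopping at the first one. The sketch below is one possible way to do this. The download_pdfs helper is hypothetical, it accepts any client object that exposes a get_pdf(url) method returning bytes, and it catches exceptions broadly because the error types raised by the client are not covered here.

```python
from pathlib import Path


def download_pdfs(client, jobs, out_dir="pdfs"):
    """Download each URL in jobs (output file name -> EDGAR URL) as a PDF.

    Returns a tuple (saved_paths, failures), where failures maps the
    output file name to the error message.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved, failed = [], {}
    for name, url in jobs.items():
        try:
            pdf_bytes = client.get_pdf(url)
        except Exception as exc:  # broad on purpose: error types are an unknown here
            failed[name] = str(exc)
            continue
        target = out / name
        target.write_bytes(pdf_bytes)
        saved.append(target)
    return saved, failed


# Usage (requires `pip install sec-api` and a valid API key):
# from sec_api import PdfGeneratorApi
# saved, failed = download_pdfs(
#     PdfGeneratorApi("YOUR_API_KEY"),
#     {"tesla_10K.pdf": "https://www.sec.gov/Archives/edgar/data/1318605/000162828024002390/tsla-20231231.htm"},
# )
```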