How to Download Exhibit 99 Files from Form 8-K Filings

            This tutorial guides you through the process of finding and downloading Exhibit 99 files attached to EDGAR Form 8-K filings. It is split into two main parts: locating the relevant filings and downloading the corresponding exhibit files.

            Tutorial Overview

            1. Finding Exhibit 99 Files: We use the EDGAR Filing Query API to search for Form 8-K filings that include Exhibit 99 files. The API provides access to the entire EDGAR filings database, from 1993 to the present. We will retrieve a list of URLs that point directly to these Exhibit 99 files.

            2. Downloading the Exhibit 99 Files: Next, we utilize the Download API to download the files. This API supports downloading up to 40 files in parallel directly from our servers, and is not restricted by the SEC's rate limits. The files will be saved and organized into folders based on year, month, and filer for easy access.

            What is Exhibit 99?

            Exhibit 99 is classified as "additional exhibits" under § 229.601 (Item 601) Exhibits of Regulation S-K. These exhibits contain supplementary information that, while not filed directly as an EDGAR filing, is attached to a filing for further context. Examples include:

            • Investor presentations
            • Press releases
            • Supplier agreements
            • Reports from independent accounting firms

            Each triggering event category for a Form 8-K filing can include an Exhibit 99 attachment. For example, a company might file a press release as Exhibit 99 under Item 5.02 when announcing leadership changes.

            Example Exhibit 99 Files

            • Private placement announcement
            • Investor presentation
            • Report of independent public accounting firm & results of operations
            • Update on debt status
            • Press release about purchase agreement
            • Mortgage loan agreement

            Setup

            We start by installing the sec-api Python package and importing the necessary libraries, such as pandas. Be sure to replace YOUR_API_KEY with your personal API key, which you can find in your profile on sec-api.io/signup.

            !pip install -q sec-api pandas
            API_KEY = "YOUR_API_KEY"

            import pandas as pd
            from sec_api import QueryApi, RenderApi

            queryApi = QueryApi(API_KEY)
            renderApi = RenderApi(API_KEY)

            Find URLs of Exhibit 99 Files

            We will use the Filing Query API to retrieve URLs for Exhibit 99 files attached to Form 8-K filings. For reproducibility, we'll focus on the date range from January 1, 2020, to December 31, 2020. This ensures that running the code at any time will yield the same results as shown in this tutorial.

            The Query API allows searches based on various parameters such as form type, filing date, and company name. You can explore all available search options in the API documentation.

            To search for Form 8-K filings with Exhibit 99 files filed during 2020, we will use the following query:

            formType:"8-K" AND documentFormatFiles.type:*99* AND filedAt:[2020-01-01 TO 2020-12-31]

            The query uses Lucene syntax, where field:value specifies the field to search in and the value that field should contain. For example, formType:"8-K" searches for records where the formType field exactly matches "8-K". Multiple conditions can be combined using logical operators like AND, OR, and NOT. For a more detailed explanation of Lucene syntax, refer to this guide.

            Now, let's fetch the 10 most recent Form 8-K filings with Exhibit 99 from 2020 using a simple example.

            lucene_query = 'formType:"8-K" AND documentFormatFiles.type:*99* AND filedAt:[2020-01-01 TO 2020-12-31]'

            query = {
                "query": lucene_query,
                # defines the start record to fetch. Used for pagination.
                "from": "0",
                # defines how many records to return. Maximum is 50.
                "size": "10",
                # sort results by filedAt, starting with the most recent filings.
                "sort": [{"filedAt": {"order": "desc"}}],
            }

            response = queryApi.get_filings(query)

            print(f"Number of Form 8-K filings with exhibit 99 in 2020: {response['total']}")
            Number of Form 8-K filings with exhibit 99 in 2020: {'value': 10000, 'relation': 'gte'}

            The Query API identified over 10,000 Form 8-K filings with Exhibit 99 attachments published in 2020, as indicated by the gte (greater than or equal to) value in the relation field of the response.
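The total object can also be interpreted in code. The following is a minimal sketch, assuming the response shape shown in the output above (a dictionary with value and relation keys). The helper name describe_total is hypothetical, and treating any relation other than "gte" as an exact count is an assumption, since only "gte" appears in this tutorial.

```python
def describe_total(total):
    """Turn the Query API's `total` object into a readable string.

    Assumes the shape shown in the tutorial output:
    {"value": <int>, "relation": <str>}. Any relation other than "gte"
    is treated as an exact count, which is an assumption.
    """
    value = total["value"]
    if total.get("relation") == "gte":
        # "gte" means the real number of matches is at least `value`.
        return f"at least {value:,}"
    return f"exactly {value:,}"


print(describe_total({"value": 10000, "relation": "gte"}))
```

A result reported as "at least 10,000" is a signal that the search should be narrowed, for example by date, before paging through it.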

            For use cases requiring Exhibit 99 files tied to a specific event category, such as Item 5.02 (Departure of Directors or Certain Officers), you can refine the query by adding an items filter like this:

            formType:"8-K" AND documentFormatFiles.type:*99* AND filedAt:[2020-01-01 TO 2020-12-31] AND items:"5.02"
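Because the query is just a string of clauses joined with AND, it can be assembled programmatically. The sketch below is one possible way to do that; the function name build_exhibit_99_query and its parameters are hypothetical and not part of the sec-api package.

```python
def build_exhibit_99_query(start_date, end_date, item=None):
    """Assemble the Lucene query string used in this tutorial.

    `item` is an optional 8-K item number such as "5.02".
    """
    clauses = [
        'formType:"8-K"',
        "documentFormatFiles.type:*99*",
        f"filedAt:[{start_date} TO {end_date}]",
    ]
    if item is not None:
        # Restrict results to filings reporting a specific item.
        clauses.append(f'items:"{item}"')
    return " AND ".join(clauses)


print(build_exhibit_99_query("2020-01-01", "2020-12-31"))
print(build_exhibit_99_query("2020-01-01", "2020-12-31", item="5.02"))
```

The two printed strings correspond to the unfiltered query and the Item 5.02 variant shown above.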

            The API response includes two main fields:

            • total: the total number of filings matching the query.
            • filings: a list of filings that meet the search criteria.

            Each filing object contains over 20 metadata fields, including details such as the unique accession number, the CIK, ticker and name of the filer, filing date, information about attached exhibits and files, and more.

            Next, we'll convert the JSON response from the Query API into a DataFrame and display the first 5 rows for review.

            filings = pd.DataFrame(response["filings"])
            filings[
                [
                    "accessionNo",
                    "filedAt",
                    "companyName",
                    "cik",
                    "ticker",
                    "items",
                    "documentFormatFiles",
                ]
            ].head()
            Out:
               accessionNo           filedAt                    companyName                           cik      ticker  items                                              documentFormatFiles
            0  0001654954-20-014094  2020-12-31T17:20:28-05:00  TOMI Environmental Solutions, Inc.    314227   TOMZ    [Item 5.07: Submission of Matters to a Vote of...  [{'sequence': '1', 'size': '49284', 'documentU...
            1  0001104659-20-141038  2020-12-31T17:10:56-05:00  GOLD RESOURCE CORP                    1160791  GORO    [Item 5.02: Departure of Directors or Certain ...  [{'sequence': '1', 'size': '40187', 'documentU...
            2  0001104659-20-141037  2020-12-31T17:08:55-05:00  McEwen Mining Inc.                    314203   MUX     [Item 3.02: Unregistered Sales of Equity Secur...  [{'sequence': '1', 'size': '27630', 'documentU...
            3  0001640334-20-003199  2020-12-31T17:07:50-05:00  Lexaria Bioscience Corp.              1348362  LEXX    [Item 7.01: Regulation FD Disclosure, Item 9.0...  [{'sequence': '1', 'size': '16594', 'documentU...
            4  0001493152-20-024694  2020-12-31T17:04:37-05:00  MONMOUTH REAL ESTATE INVESTMENT CORP  67625    MNR.PC  [Item 7.01: Regulation FD Disclosure, Item 8.0...  [{'sequence': '1', 'size': '41725', 'documentU...

            We are primarily interested in the documentUrl field within the documentFormatFiles array of a filing. This field contains the URL of any attached files, including Exhibit 99 files. Each object inside the documentFormatFiles array includes the following fields:

            • sequence (string, optional): The sequence number of the file attached to the filing, e.g., "1".
            • description (string, optional): A brief description of the file, e.g., "EXHIBIT 31.1".
            • documentUrl (string): The URL of the file hosted on SEC.gov.
            • type (string, optional): The type of the file, e.g., "EX-32.1", "GRAPHIC", or "10-Q".
            • size (string, optional): The file size in bytes, e.g., "6627216".

            For example, here are the first three objects in the documentFormatFiles array of the first filing object:

            filings["documentFormatFiles"][0][:3]
            Out:
            [{'sequence': '1',
              'size': '49284',
              'documentUrl': 'https://www.sec.gov/Archives/edgar/data/314227/000165495420014094/tomi_8k.htm',
              'description': 'CURRENT REPORT',
              'type': '8-K'},
             {'sequence': '2',
              'size': '24539',
              'documentUrl': 'https://www.sec.gov/Archives/edgar/data/314227/000165495420014094/tomi_ex991.htm',
              'description': 'PRESENTATION',
              'type': 'EX-99.1'},
             {'sequence': '3',
              'size': '35390',
              'documentUrl': 'https://www.sec.gov/Archives/edgar/data/314227/000165495420014094/tomi_ex991000.jpg',
              'description': 'IMAGE',
              'type': 'GRAPHIC'}]

            Note: A single filing may include multiple Exhibit 99 files, resulting in multiple EX-99 types within the documentFormatFiles array.

            Let's extract the URLs of all Exhibit 99 files from the response and create a new DataFrame that includes the exhibit URLs along with other relevant metadata fields, such as the filing date, CIK and ticker of the filer, and accession number.

            def extract_ex_99_urls(row):
                urls = []

                for file in row["documentFormatFiles"]:
                    # "type" is optional, so fall back to an empty string if it is missing
                    if "EX-99" in file.get("type", ""):
                        urls.append(
                            {
                                "filedAt": row["filedAt"],
                                "accessionNo": row["accessionNo"],
                                "cik": row["cik"],
                                "ticker": row["ticker"],
                                "type": file["type"],
                                "exhibit99Url": file["documentUrl"],
                            }
                        )

                return urls


            exhibit_99_urls = filings.apply(lambda row: extract_ex_99_urls(row), axis=1)
            exhibit_99_urls = pd.DataFrame(exhibit_99_urls.explode().to_list())
            exhibit_99_urls
            Out:
                filedAt                    accessionNo           cik      ticker  type     exhibit99Url
            0   2020-12-31T17:20:28-05:00  0001654954-20-014094  314227   TOMZ    EX-99.1  https://www.sec.gov/Archives/edgar/data/314227...
            1   2020-12-31T17:10:56-05:00  0001104659-20-141038  1160791  GORO    EX-99.1  https://www.sec.gov/Archives/edgar/data/116079...
            2   2020-12-31T17:08:55-05:00  0001104659-20-141037  314203   MUX     EX-99.1  https://www.sec.gov/Archives/edgar/data/314203...
            3   2020-12-31T17:08:55-05:00  0001104659-20-141037  314203   MUX     EX-99.2  https://www.sec.gov/Archives/edgar/data/314203...
            4   2020-12-31T17:07:50-05:00  0001640334-20-003199  1348362  LEXX    EX-99.1  https://www.sec.gov/Archives/edgar/data/134836...
            5   2020-12-31T17:04:37-05:00  0001493152-20-024694  67625    MNR.PC  EX-99.1  https://www.sec.gov/Archives/edgar/data/67625/...
            6   2020-12-31T17:01:35-05:00  0001469709-20-000101  1647705  GBBT    EX-99.1  https://www.sec.gov/Archives/edgar/data/164770...
            7   2020-12-31T17:01:35-05:00  0001469709-20-000101  1647705  GBBT    EX-99.2  https://www.sec.gov/Archives/edgar/data/164770...
            8   2020-12-31T17:00:11-05:00  0001104659-20-141025  1815903  PTPI    EX-99.1  https://www.sec.gov/Archives/edgar/data/181590...
            9   2020-12-31T17:00:11-05:00  0001104659-20-141025  1815903  PTPI    EX-99.2  https://www.sec.gov/Archives/edgar/data/181590...
            10  2020-12-31T17:00:10-05:00  0001477932-20-007599  1281984  WDLF    EX-99.1  https://www.sec.gov/Archives/edgar/data/128198...
            11  2020-12-31T16:55:00-05:00  0001104659-20-141020  837852   IDEX    EX-99.1  https://www.sec.gov/Archives/edgar/data/837852...
            12  2020-12-31T16:43:06-05:00  0001580695-20-000463  1372183  NXTP    EX-99.1  https://www.sec.gov/Archives/edgar/data/137218...
            13  2020-12-31T16:43:06-05:00  0001580695-20-000463  1372183  NXTP    EX-99.2  https://www.sec.gov/Archives/edgar/data/137218...
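The 10 filings produced 14 rows because several filings carry more than one Exhibit 99 file. A quick way to see this is to count rows per accession number. The sketch below uses only the standard library and a handful of rows copied from the output above (URLs omitted); the variable names are illustrative.

```python
from collections import Counter

# A few rows copied from the output above (URLs omitted for brevity).
rows = [
    {"accessionNo": "0001654954-20-014094", "ticker": "TOMZ", "type": "EX-99.1"},
    {"accessionNo": "0001104659-20-141037", "ticker": "MUX", "type": "EX-99.1"},
    {"accessionNo": "0001104659-20-141037", "ticker": "MUX", "type": "EX-99.2"},
    {"accessionNo": "0001640334-20-003199", "ticker": "LEXX", "type": "EX-99.1"},
]

# Count how many Exhibit 99 files each filing contributes.
exhibits_per_filing = Counter(row["accessionNo"] for row in rows)

for accession_no, count in exhibits_per_filing.items():
    print(accession_no, count)
```

With the full DataFrame, the same count could be obtained by grouping on the accessionNo column.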

            Now that we've successfully extracted the Exhibit 99 URLs for 10 filings, let's implement a function to retrieve Exhibit 99 URLs for all Form 8-K filings in 2020. The Query API limits results to 10,000 entries per search expression. Since our initial search found more than 10,000 Form 8-K filings with Exhibit 99 files in 2020, we need to refine our search to reduce the result set. A simple solution is to query filings on a month-by-month basis, ensuring that we capture all Exhibit 99 files by running the query for each month separately.
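The month-by-month split comes down to generating one filedAt range per calendar month. The code later in this tutorial uses day 31 as the end of every month; the sketch below is an alternative that computes the actual last day with the standard library. The helper name month_date_filter is hypothetical.

```python
import calendar


def month_date_filter(year, month):
    """Return a filedAt range filter covering one calendar month."""
    # monthrange returns (weekday of first day, number of days in month).
    last_day = calendar.monthrange(year, month)[1]
    return f"filedAt:[{year}-{month:02d}-01 TO {year}-{month:02d}-{last_day:02d}]"


for month in (1, 2, 12):
    print(month_date_filter(2020, month))
```

Using real calendar dates keeps the filter valid regardless of how strictly a date parser treats values such as February 31.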

            To speed up the process, we'll use the pandarallel package to parallelize the pandas apply function across multiple processes. pandarallel extends pandas DataFrame objects with a parallel_apply method, which behaves similarly to the standard apply method but processes each row in parallel across multiple threads. The number of threads (or processes) is controlled by the nb_workers parameter. In our case, we set nb_workers=10, instructing pandarallel to spawn ten threads. Once parallel_apply is invoked, pandarallel efficiently distributes the rows among the threads and waits for all threads to complete before returning the results.

            In this scenario, the fetch_exhibit_99_urls function is applied to each row in the queries DataFrame via parallel_apply, enabling us to execute 10 Query API requests concurrently to retrieve the Exhibit 99 URLs for each query.
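Each concurrent request covers one page of up to 50 results, so the fan-out relies on one request body per page offset. A minimal sketch of that step follows; the function name page_queries and the example total of 120 filings are made up for illustration.

```python
def page_queries(lucene_query, total_filings, page_size=50):
    """Build one Query API request body per result page."""
    return [
        {
            "query": lucene_query,
            "from": offset,  # offsets 0, 50, 100, ...
            "size": str(page_size),
            "sort": [{"filedAt": {"order": "desc"}}],
        }
        for offset in range(0, total_filings, page_size)
    ]


# Hypothetical total of 120 matches, which needs three pages.
pages = page_queries('formType:"8-K"', total_filings=120)
print([page["from"] for page in pages])
```

Each dictionary in the list can then be handed to a worker independently, since no page depends on the result of another.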

            Note: For simplicity, this tutorial uses the terms processes, workers, and threads interchangeably, although they have distinct technical meanings. Strictly speaking, pandarallel distributes the work across separate worker processes rather than threads.

            !pip install -q pandarallel ipywidgets
            from pandarallel import pandarallel

            pandarallel.initialize(nb_workers=10, progress_bar=False)
            INFO: Pandarallel will run on 10 workers.
            INFO: Pandarallel will use standard multiprocessing data transfer (pipe) to transfer data between the main process and workers.


            def fetch_exhibit_99_urls(query, retry_counter=0):
                """
                Fetches the exhibit 99 URLs for a given Query API query.

                Parameters:
                - query (dict): The Query API query to be used to fetch filing metadata and exhibit 99 URLs.
                - retry_counter (int): The number of times the function has retried to fetch data.

                Returns:
                - list: A list of exhibit 99 URLs for the filings returned by the query.
                """
                try:
                    response = queryApi.get_filings(query)
                except Exception as e:
                    if retry_counter < 3:
                        print(f"Retrying... {retry_counter}")
                        return fetch_exhibit_99_urls(query, retry_counter + 1)
                    else:
                        print(f"Failed to fetch data after {retry_counter} retries")
                        return []

                if len(response["filings"]) == 0:
                    return []

                filings = pd.DataFrame(response["filings"])
                return filings.apply(lambda row: extract_ex_99_urls(row), axis=1).explode().to_list()


            def fetch_all_exhibit_99_urls(start_year, end_year):
                """
                Fetches all exhibit 99 URLs of Form 8-K filings for the specified range of years.

                Parameters:
                - start_year (int): The start year of the range (inclusive).
                - end_year (int): The end year (inclusive).

                Returns:
                - pandas.DataFrame: One row per exhibit 99 URL, sorted by filing date in ascending order.
                """
                if start_year > end_year:
                    raise ValueError("start_year must be less than or equal to end_year")

                all_exhibit_99_urls = []

                for year in range(start_year, end_year + 1):
                    print(f"Fetching exhibit 99 URLs for year {year}")
                    for month in range(1, 13):
                        print(f" Processing month: {month}")

                        queries = []

                        query_from = 0
                        form_type_filter = 'formType:"8-K"'
                        file_filter = "documentFormatFiles.type:*99*"
                        date_filter = f"filedAt:[{year}-{month:02d}-01 TO {year}-{month:02d}-31]"

                        lucene_query = f"{form_type_filter} AND {file_filter} AND {date_filter}"

                        query = {
                            "query": lucene_query,
                            "from": query_from,
                            "size": "50",
                            "sort": [{"filedAt": {"order": "desc"}}],
                        }

                        response = queryApi.get_filings(query)

                        total_filings = response["total"]["value"]

                        print(f" Found {total_filings} filings in {year}-{month:02d}")

                        if total_filings == 0:
                            continue

                        # create one query per result page, with from values of 0, 50, 100, etc.
                        for i in range(0, total_filings, 50):
                            queries.append(
                                {
                                    "query": {
                                        "query": lucene_query,
                                        "from": i,
                                        "size": "50",
                                        "sort": [{"filedAt": {"order": "desc"}}],
                                    }
                                }
                            )

                        queries = pd.DataFrame(queries)

                        # use pandarallel to parallelize the fetching of exhibit 99 URLs
                        exhibit_99_urls = queries["query"].parallel_apply(fetch_exhibit_99_urls)
                        all_exhibit_99_urls.extend(exhibit_99_urls)

                # flatten, filter, and sort the exhibit 99 URLs
                all_exhibit_99_urls_flat = [item for sublist in all_exhibit_99_urls for item in sublist]
                all_exhibit_99_urls_flat = [item for item in all_exhibit_99_urls_flat if isinstance(item, dict)]
                all_exhibit_99_urls_df = pd.DataFrame(all_exhibit_99_urls_flat)
                all_exhibit_99_urls_df["filedAt"] = pd.to_datetime(all_exhibit_99_urls_df["filedAt"], utc=True)
                all_exhibit_99_urls_df["filedAt"] = all_exhibit_99_urls_df["filedAt"].dt.tz_convert("America/New_York")
                all_exhibit_99_urls_df = all_exhibit_99_urls_df.sort_values("filedAt", ascending=True)

                return all_exhibit_99_urls_df


            exhibit_99_urls_2020 = fetch_all_exhibit_99_urls(2020, 2020)
            Fetching exhibit 99 URLs for year 2020
              Processing month: 1
              Found 3068 filings in 2020-01
              Processing month: 2
              Found 4037 filings in 2020-02
              Processing month: 3
              Found 3762 filings in 2020-03
              Processing month: 4
              Found 4023 filings in 2020-04
              Processing month: 5
              Found 4768 filings in 2020-05
              Processing month: 6
              Found 2673 filings in 2020-06
              Processing month: 7
              Found 3688 filings in 2020-07
              Processing month: 8
              Found 4376 filings in 2020-08
              Processing month: 9
              Found 2489 filings in 2020-09
              Processing month: 10
              Found 3899 filings in 2020-10
              Processing month: 11
              Found 4426 filings in 2020-11
              Processing month: 12
              Found 2762 filings in 2020-12
            print(f"{len(exhibit_99_urls_2020):,} exhibit 99 URLs fetched for 2020")
            exhibit_99_urls_2020
            53,166 exhibit 99 URLs fetched for 2020
            Out:
                   filedAt                    accessionNo           cik      ticker  type     exhibit99Url
            3726   2020-01-02 06:03:38-05:00  0001104659-20-000041  1526113  GNL     EX-99.1  https://www.sec.gov/Archives/edgar/data/152611...
            3725   2020-01-02 06:04:33-05:00  0001104659-20-000050  1568162  RTL     EX-99.1  https://www.sec.gov/Archives/edgar/data/156816...
            3724   2020-01-02 06:38:00-05:00  0000052795-20-000004  52795    AXE     EX-99.2  https://www.sec.gov/Archives/edgar/data/52795/...
            3723   2020-01-02 06:38:00-05:00  0000052795-20-000004  52795    AXE     EX-99.1  https://www.sec.gov/Archives/edgar/data/52795/...
            3722   2020-01-02 06:41:40-05:00  0001193125-20-000100  1337553  AERI    EX-99.1  https://www.sec.gov/Archives/edgar/data/133755...
            ...    ...                        ...                   ...      ...     ...      ...
            49778  2020-12-31 17:07:50-05:00  0001640334-20-003199  1348362  LEXX    EX-99.1  https://www.sec.gov/Archives/edgar/data/134836...
            49777  2020-12-31 17:08:55-05:00  0001104659-20-141037  314203   MUX     EX-99.2  https://www.sec.gov/Archives/edgar/data/314203...
            49776  2020-12-31 17:08:55-05:00  0001104659-20-141037  314203   MUX     EX-99.1  https://www.sec.gov/Archives/edgar/data/314203...
            49775  2020-12-31 17:10:56-05:00  0001104659-20-141038  1160791  GORO    EX-99.1  https://www.sec.gov/Archives/edgar/data/116079...
            49774  2020-12-31 17:20:28-05:00  0001654954-20-014094  314227   TOMZ    EX-99.1  https://www.sec.gov/Archives/edgar/data/314227...

            53166 rows × 6 columns
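The numbers in the log can be cross-checked with a little arithmetic: how many filings were found in total, how many paged requests of 50 results that implies, and how many Exhibit 99 files an average filing carries. The sketch below only re-uses figures printed above; the variable names are illustrative.

```python
import math

# Monthly filing counts copied from the log output above (January to December 2020).
monthly_filings = [3068, 4037, 3762, 4023, 4768, 2673, 3688, 4376, 2489, 3899, 4426, 2762]
exhibit_urls = 53166  # number of rows in the resulting DataFrame
page_size = 50  # maximum number of records per Query API request

total_filings = sum(monthly_filings)
paged_requests = sum(math.ceil(n / page_size) for n in monthly_filings)

print(f"Filings: {total_filings:,}")
print(f"Paged requests: {paged_requests}")
print(f"Exhibit 99 files per filing: {exhibit_urls / total_filings:.2f}")
```

The paged request count excludes the one initial request per month that the function issues to read the total.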

            Download Exhibit 99 Files

            With the list of Exhibit 99 URLs created, we can now download the files in parallel using the Download API. The Download API supports downloading up to 40 EDGAR filings or exhibit files simultaneously.

            To optimize this process, we use pandarallel again to spawn 10 threads, enabling parallel downloads. Additionally, we'll activate the progress bar feature to monitor the download progress.

            Note: When using standard for or while loops, downloads occur sequentially, meaning each download waits for the previous one to complete. By using pandarallel or other multiprocessing techniques, multiple downloads can run concurrently, each in its own thread or process, greatly speeding up the process.
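To illustrate the idea without pandarallel, here is a small standard-library sketch that runs simulated downloads through a pool of 10 workers. It is not the approach used in this tutorial: fake_download and the example.com URLs are placeholders, and a real version would call the Download API instead of sleeping.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_download(url):
    """Stand-in for a real download call; sleeps briefly to mimic network I/O."""
    time.sleep(0.05)
    return f"downloaded {url}"


# Placeholder URLs, used only to drive the simulation.
urls = [f"https://example.com/file-{i}.htm" for i in range(20)]

# Up to 10 downloads are in flight at once; map() returns results in input order.
with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(fake_download, urls))

print(len(results), results[0])
```

Run sequentially, the 20 simulated downloads would take about one second in total; with 10 workers they finish in roughly two rounds of 0.05 seconds each.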

            We organize the downloaded Exhibit 99 files using the following directory structure:

            ex-99-files/
            └── 2020/
                ├── 01/
                │   └── CIK-1/
                │       └── accessionNo_ex99-filename.html
                ├── 02/
                ├── ...
                └── 12/

            This structure ensures that files are sorted by year, month, and company (CIK), making them easier to manage and reference.
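The path for a single file can be derived from the filing date, the CIK, the accession number, and the file name at the end of the exhibit URL. The sketch below shows one way to build it with pathlib; the helper name exhibit_file_path is hypothetical, and the sample values are taken from the first filing shown earlier.

```python
from datetime import datetime
from pathlib import PurePosixPath


def exhibit_file_path(filed_at, cik, accession_no, exhibit_url, root="ex-99-files"):
    """Build the year/month/CIK path for one exhibit file."""
    # The original file name is the last segment of the exhibit URL.
    filename = exhibit_url.split("/")[-1]
    return (
        PurePosixPath(root)
        / f"{filed_at.year}"
        / f"{filed_at.month:02d}"
        / str(cik)
        / f"{accession_no}_{filename}"
    )


path = exhibit_file_path(
    filed_at=datetime(2020, 12, 31, 17, 20, 28),
    cik="314227",
    accession_no="0001654954-20-014094",
    exhibit_url="https://www.sec.gov/Archives/edgar/data/314227/000165495420014094/tomi_ex991.htm",
)
print(path)
```

Prefixing the file name with the accession number keeps files unique when a filer reuses the same exhibit file name across filings.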

            import os

            pandarallel.initialize(nb_workers=10, progress_bar=True)


            def download_ex_99_file(row, retry_counter=0):
                accession_no = row["accessionNo"]
                cik = row["cik"]
                exhibit_99_url = row["exhibit99Url"]
                exhibit_99_filename = exhibit_99_url.split("/")[-1]
                publication_year = row["filedAt"].year
                publication_month = row["filedAt"].month
                file_name = f"{accession_no}_{exhibit_99_filename}"
                file_path = (
                    f"ex-99-files/{publication_year}/{publication_month:02d}/{cik}/{file_name}"
                )

                os.makedirs(os.path.dirname(file_path), exist_ok=True)

                content = None
                try:
                    content = renderApi.get_filing(exhibit_99_url)
                except Exception as e:
                    if retry_counter < 3:
                        return download_ex_99_file(row, retry_counter + 1)
                    else:
                        print(f"Failed: {exhibit_99_url}")
                        return

                with open(file_path, "wb") as f:
                    f.write(content.encode("utf-8"))


            # download sample of 1000
            exhibit_99_urls_2020[:1000].parallel_apply(download_ex_99_file, axis=1)
            # download all
            # exhibit_99_urls_2020.parallel_apply(download_ex_99_file, axis=1)

            print("Download complete")
            INFO: Pandarallel will run on 10 workers.
            INFO: Pandarallel will use standard multiprocessing data transfer (pipe) to transfer data between the main process and workers.
            Download complete

            © 2025 sec-api.io by Data2Value GmbH. All rights reserved.

            SEC® and EDGAR® are registered trademarks of the U.S. Securities and Exchange Commission (SEC).

            EDGAR is the Electronic Data Gathering, Analysis, and Retrieval system operated by the SEC.

            sec-api.io and Data2Value GmbH are independent of, and not affiliated with, sponsored by, or endorsed by the U.S. Securities and Exchange Commission.

            sec-api.io is classified under SIC code 7375 (Information Retrieval Services), providing on-demand access to structured data and online information services.