Extract Textual Data from EDGAR 10-K Filings Using Python
This Python tutorial demonstrates how to extract specific sections of textual data from SEC EDGAR 10-K filings, without relying on regular expressions or custom BeautifulSoup extractors.
The tutorial covers the extraction of any of the 20 10-K item sections, from "Item 1 - Business" to "Item 7 - Management’s Discussion and Analysis of Financial Condition and Results of Operations (MD&A)" to "Item 14 - Principal Accountant Fees and Services".
Our approach uses the Extractor API to extract sections in either HTML format, which includes iXBRL data, or text format. The text format contains the raw text without HTML, with the start and end of each table marked. The tutorial also touches on using pandas to convert extracted financial statements from HTML tables into dataframes.
Although an alternative extraction approach using BeautifulSoup and regular expressions is possible, it is error-prone: every filing has a unique structure, and regular expressions typically capture only about 30% of the content. There are several related Stack Overflow questions, but none of them provide a solution that covers the majority of EDGAR filings.
Getting Started
The first step is to install the sec-api Python package, which provides access to the ExtractorApi.
!pip install -q sec-api
from sec_api import ExtractorApi
# replace with your own sec-api.io API key
API_KEY = 'YOUR_API_KEY'
extractorApi = ExtractorApi(API_KEY)
We define a pprint helper function that converts long, single-line text into a multi-line, easily readable format. It is used below to print the extracted text sections, which is especially handy when running the code in a Jupyter notebook.
# helper function to pretty print long, single-line text as multi-line text
def pprint(text, line_length=100):
    words = text.split(' ')
    lines = []
    current_line = ''
    for word in words:
        if len(current_line + ' ' + word) <= line_length:
            current_line += ' ' + word
        else:
            lines.append(current_line.strip())
            current_line = word
    if current_line:
        lines.append(current_line.strip())
    print('\n'.join(lines))
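For example, a quick check of the helper with a shorter line length (the sample string is arbitrary):
pprint('This is a long single-line string that the helper wraps into several shorter lines.', line_length=40)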
Extract "Item 1 - Business" from 10-K Filings
We will begin by extracting the business section (Item 1) from a 10-K filing using the .get_section(filing_url, section_id, output_type) function. It takes the URL of the 10-K filing, the ID of the item section to be extracted, and the desired output type (HTML or text), and returns the extracted section. Refer to the documentation for a complete list of all 10-K item section IDs.
As an example, let's extract Item 1 as text from Tesla's 10-K filing.
# URL of Tesla's 10-K filing
filing_10_k_url = 'https://www.sec.gov/Archives/edgar/data/1318605/000156459021004599/tsla-10k_20201231.htm'
# extract text section "Item 1 - Business" from 10-K
item_1_text = extractorApi.get_section(filing_10_k_url, '1', 'text')
print('Extracted Item 1 (Text)')
print('-----------------------')
pprint(item_1_text[0:1500])
print('\n... cut for brevity')
print('-----------------------')
Extracted Item 1 (Text)
-----------------------
ITEM 1.
BUSINESS
##TABLE_END
Overview
We design, develop, manufacture, sell and lease
high-performance fully electric vehicles and energy generation and storage systems, and offer
services related to our sustainable energy products. We generally sell our products directly to
customers, including through our website and retail locations. We also continue to grow our
customer-facing infrastructure through a global network of vehicle service centers, Mobile Service
technicians, body shops, Supercharger stations and Destination Chargers to accelerate the widespread
adoption of our products. We emphasize performance, attractive styling and the safety of our users
and workforce in the design and manufacture of our products and are continuing to develop full
self-driving technology for improved safety. We also strive to lower the cost of ownership for our
customers through continuous efforts to reduce manufacturing costs and by offering financial
services tailored to our products. Our mission to accelerate the world’s transition to
sustainable energy, engineering expertise, vertically integrated business model and focus on user
experience differentiate us from other companies.
Segment Information
We operate as two
reportable segments: (i) automotive and (ii) energy generation and storage.
The automotive segment
includes the design, development, manufacturing, sales and leasing of electric vehicles as well as
sales of automotive regulatory credits. Additionally, the
... cut for brevity
-----------------------
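The text output marks tables, as the ##TABLE_END marker above shows. If you only want the narrative text, you can strip the table blocks with a small regular expression. This is a minimal sketch that assumes each table is delimited by a matching ##TABLE_START / ##TABLE_END pair:
import re
# remove table blocks delimited by ##TABLE_START / ##TABLE_END (assumed marker pair)
item_1_without_tables = re.sub(r'##TABLE_START.*?##TABLE_END', '', item_1_text, flags=re.DOTALL)
pprint(item_1_without_tables[0:1500])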
Now let's see how we can extract the same section "Item 1 - Business" in HTML format.
from IPython.display import display, HTML
# extract HTML section "Item 1 - Business" from 10-K
item_1_html = extractorApi.get_section(filing_10_k_url, '1', 'html')
print('Extracted Item 1 (HTML)')
print('-----------------------')
display(HTML(item_1_html[0:3000]))
print('\n... cut for brevity')
print('-----------------------')
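If you prefer to inspect the section outside the notebook, you can also write the extracted HTML to a file and open it in a browser (the file name below is just an example):
# save the extracted HTML section to a file for inspection in a browser
with open('tesla_10k_item_1.html', 'w', encoding='utf-8') as f:
    f.write(item_1_html)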
Extract "Item 6 - Financial Data" from 10-K Filings
In this example, we'll show you how to extract the "Selected Financial Data" section (Item 6) of a 10-K filing in HTML format. This section includes financial statements in the form of HTML tables. We can then use BeautifulSoup to convert the tables and access the financial data in a pandas dataframe.
# extract the HTML version of section "Item 6 - Selected Financial Data"
item_6_html = extractorApi.get_section(filing_10_k_url, '6', 'html')
print('Extracted Item 6 (HTML)')
print('-----------------------')
display(HTML(item_6_html[0:150000]))
print('\n... cut for brevity')
print('-----------------------')
Let's now take a look at the actual content of the extracted section. This will display the raw HTML of the section, including all HTML elements and their corresponding style attributes.
print('Extracted Content of Item 6 (HTML)')
print('-----------------------')
pprint(item_6_html[0:1000])
print('\n... cut for brevity')
print('-----------------------')
Extracted Content of Item 6 (HTML)
-----------------------
<span style="font-weight:bold;font-family:Times New Roman
Bold;font-size:10pt;font-style:normal;text-transform:none;font-variant: normal;">ITEM
6.</span></p></td> <td valign="top"> <p
style="margin-bottom:0pt;margin-top:0pt;font-weight:bold;font-style:normal;text-transform:none;font-variant:
normal;font-family:Times New Roman Bold;font-size:10pt;"
id="ITEM_6_SELECTED_CONSOLIDATED_FINANCIAL_D">SELECTED CONSOLIDATED FINANCIAL
DATA</p></td></tr></table></div> <p
style="margin-top:4pt;margin-bottom:0pt;text-indent:4.54%;font-family:Times New
Roman;font-size:10pt;font-weight:normal;font-style:normal;text-transform:none;font-variant:
normal;">The following selected consolidated financial data should be read in conjunction with
“Management’s Discussion and Analysis of Financial Condition and Results of
Operations” and the consolidated financial statements and the related notes included elsewhere
in this Annual Report on Form 10-K and from the historical consolidated financia
... cut for brevity
-----------------------
Convert Financial Statements in an HTML Table to a DataFrame
Next, you can use pandas to convert the financial statements table from Item 6 of the filing, which is an HTML table, into a dataframe. Using pd.read_html(item_6_html), pandas locates and converts all HTML tables in the string into dataframes. Once the financial statements table is extracted, we clean the dataframe by removing empty columns and unnecessary NaN values. This makes it easy to access all financial data, such as total revenue, gross profit, or net income.
import pandas as pd
# read all HTML tables from the extracted section and convert them to dataframes
tables = pd.read_html(item_6_html)
# the first table contains the selected financial data
df = tables[0]
# drop columns whose values below the first (header) row are all NaN
mask = df.iloc[1:, :].isna().all(axis=0)
financial_statements = df.drop(df.columns[mask], axis=1).fillna('')
print('Consolidated financial statements as dataframe:')
financial_statements
Consolidated financial statements as dataframe:
 | 0 | 2 | 3 | 6 | 7 | 8 | 10 | 11 | 12 | 14 | 15 | 16 | 18 | 19 | 20
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
0 | Year Ended December 31, | Year Ended December 31, | Year Ended December 31, | Year Ended December 31, | Year Ended December 31, | Year Ended December 31, | Year Ended December 31, | Year Ended December 31, | Year Ended December 31, | Year Ended December 31, | Year Ended December 31, | Year Ended December 31, | Year Ended December 31, | ||
1 | 2020 | 2020 | 2019 (3) | 2019 (3) | 2018 (2) | 2018 (2) | 2017 | 2017 | 2016 (1) | 2016 (1) | |||||
2 | Consolidated Statements of Operations Data: | ||||||||||||||
3 | Total revenues | $ | 31536 | $ | 24578 | $ | 21461 | $ | 11759 | $ | 7000 | ||||
4 | Gross profit | $ | 6630 | $ | 4069 | $ | 4042 | $ | 2223 | $ | 1599 | ||||
5 | Income (loss) from operations | $ | 1994 | $ | (69 | ) | $ | (388 | ) | $ | (1,632 | ) | $ | (667 | ) |
6 | Net income (loss) attributable to common stock... | $ | 721 | $ | (862 | ) | $ | (976 | ) | $ | (1,962 | ) | $ | (675 | ) |
7 | Net income (loss) per share of common stock at... | ||||||||||||||
8 | Basic | $ | 0.74 | $ | (0.98 | ) | $ | (1.14 | ) | $ | (2.37 | ) | $ | (0.94 | ) |
9 | Diluted | $ | 0.64 | $ | (0.98 | ) | $ | (1.14 | ) | $ | (2.37 | ) | $ | (0.94 | ) |
10 | Weighted average shares used in computing net ... | ||||||||||||||
11 | Basic | 933 | 887 | 853 | 829 | 721 | |||||||||
12 | Diluted | 1083 | 887 | 853 | 829 | 721 |
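With the cleaned dataframe in place, individual line items can be looked up by their row label. A minimal sketch (the exact column layout varies between filings):
# locate the row labeled "Total revenues" in the first column and print its cells
revenues_row = financial_statements[financial_statements.iloc[:, 0].astype(str).str.contains('Total revenues')]
print(revenues_row.iloc[0].tolist())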
Extract Other Text Sections from 10-K Filings
We conclude the tutorial with an overview of all the 10-K item section IDs supported by the extractor, in both text and HTML output.
# extract text sections
item_1_text = extractorApi.get_section(filing_10_k_url, '1', 'text')
item_1_a_text = extractorApi.get_section(filing_10_k_url, '1A', 'text')
item_1_b_text = extractorApi.get_section(filing_10_k_url, '1B', 'text')
item_2_text = extractorApi.get_section(filing_10_k_url, '2', 'text')
item_3_text = extractorApi.get_section(filing_10_k_url, '3', 'text')
item_4_text = extractorApi.get_section(filing_10_k_url, '4', 'text')
item_5_text = extractorApi.get_section(filing_10_k_url, '5', 'text')
item_6_text = extractorApi.get_section(filing_10_k_url, '6', 'text')
item_7_text = extractorApi.get_section(filing_10_k_url, '7', 'text')
item_7_a_text = extractorApi.get_section(filing_10_k_url, '7A', 'text')
item_8_text = extractorApi.get_section(filing_10_k_url, '8', 'text')
item_9_text = extractorApi.get_section(filing_10_k_url, '9', 'text')
item_9_a_text = extractorApi.get_section(filing_10_k_url, '9A', 'text')
item_9_b_text = extractorApi.get_section(filing_10_k_url, '9B', 'text')
item_10_text = extractorApi.get_section(filing_10_k_url, '10', 'text')
item_11_text = extractorApi.get_section(filing_10_k_url, '11', 'text')
item_12_text = extractorApi.get_section(filing_10_k_url, '12', 'text')
item_13_text = extractorApi.get_section(filing_10_k_url, '13', 'text')
item_14_text = extractorApi.get_section(filing_10_k_url, '14', 'text')
item_15_text = extractorApi.get_section(filing_10_k_url, '15', 'text')
# extract HTML sections
item_1_html = extractorApi.get_section(filing_10_k_url, '1', 'html')
item_1_a_html = extractorApi.get_section(filing_10_k_url, '1A', 'html')
item_1_b_html = extractorApi.get_section(filing_10_k_url, '1B', 'html')
item_2_html = extractorApi.get_section(filing_10_k_url, '2', 'html')
item_3_html = extractorApi.get_section(filing_10_k_url, '3', 'html')
item_4_html = extractorApi.get_section(filing_10_k_url, '4', 'html')
item_5_html = extractorApi.get_section(filing_10_k_url, '5', 'html')
item_6_html = extractorApi.get_section(filing_10_k_url, '6', 'html')
item_7_html = extractorApi.get_section(filing_10_k_url, '7', 'html')
item_7_a_html = extractorApi.get_section(filing_10_k_url, '7A', 'html')
item_8_html = extractorApi.get_section(filing_10_k_url, '8', 'html')
item_9_html = extractorApi.get_section(filing_10_k_url, '9', 'html')
item_9_a_html = extractorApi.get_section(filing_10_k_url, '9A', 'html')
item_9_b_html = extractorApi.get_section(filing_10_k_url, '9B', 'html')
item_10_html = extractorApi.get_section(filing_10_k_url, '10', 'html')
item_11_html = extractorApi.get_section(filing_10_k_url, '11', 'html')
item_12_html = extractorApi.get_section(filing_10_k_url, '12', 'html')
item_13_html = extractorApi.get_section(filing_10_k_url, '13', 'html')
item_14_html = extractorApi.get_section(filing_10_k_url, '14', 'html')
item_15_html = extractorApi.get_section(filing_10_k_url, '15', 'html')
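Instead of one call per variable, the same extraction can be written as a loop over the section IDs, with the results stored in a dictionary. A short sketch using the IDs listed above:
# fetch every 10-K section in one loop and keep the results in a dict
section_ids = ['1', '1A', '1B', '2', '3', '4', '5', '6', '7', '7A',
               '8', '9', '9A', '9B', '10', '11', '12', '13', '14', '15']
sections_text = {section_id: extractorApi.get_section(filing_10_k_url, section_id, 'text')
                 for section_id in section_ids}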