Extract XBRL Data From SEC Filings With Python
The following examples show you how to access, extract, parse and convert XBRL data from SEC filings with the XBRL to JSON converter in Python.
Quick Start
The following tutorial illustrates how to access and extract XBRL data from SEC filings using Python and our XBRL-to-JSON Converter API.
Overview: XBRL Converter in Python
The following example provides a quick overview on how to extract XBRL data from SEC filings and convert XBRL data to JSON.
!pip -q install sec-api
from sec_api import XbrlApi
xbrlApi = XbrlApi("YOUR_API_KEY")
# 10-K HTM File URL example
htm_url="https://www.sec.gov/Archives/edgar/data/320193/000032019320000096/aapl-20200926.htm"
xbrl_json = xbrlApi.xbrl_to_json(htm_url=htm_url)
# access income statement, balance sheet and cash flow statement
print(xbrl_json["StatementsOfIncome"])
print(xbrl_json["BalanceSheets"])
print(xbrl_json["StatementsOfCashFlows"])
# 10-K XBRL File URL example
xbrl_url="https://www.sec.gov/Archives/edgar/data/1318605/000156459021004599/tsla-10k_20201231_htm.xml"
xbrl_json = xbrlApi.xbrl_to_json(xbrl_url=xbrl_url)
# 10-K accession number example
accession_no="0001564590-21-004599"
xbrl_json = xbrlApi.xbrl_to_json(accession_no=accession_no)
The XbrlApi
interface provides a .xbrl_to_json()
method to convert XBRL data of an SEC filing into a standardized JSON object. The XBRL-JSON output can be converted into a pandas DataFrame, exported to a CSV or Excel file, and more.
The API fetches all XML-XBRL data, merges the relevant files, such as calculation, schema and instance document, and returns a single, standardized JSON output.
Method parameters:
htm_url
(string) - The URL of the filing from which XBRL data should be extracted and converted. Example:https://www.sec.gov/Archives/edgar/data/320193/000032019320000096/aapl-20200926.htm
xbrl_url
(string) - The URL of any XML file containing XBRL information. This can be the XBRL instance document, XBRL schema file (.xsd
), XBRL calculation file, XBRL definition file, XBRL label file or the XBRL presentation file of a filing.accession_no
(string) - The accession number of the filing from which XBRL data should be extracted and converted.
Only one parameter needs to be set. For example, if you generated a local list of accession numbers, simply call .xbrl_to_json(accession_no=...)
for each accession number to access the complete XBRL data in Python.
The XBRL-to-JSON Converter supports all SEC form types. Refer to the section "Supported XBRL Filing Types" on the overview page to find a complete list of supported SEC form types.
Convert XBRL to JSON with Python by providing the URL of an SEC filing
Accessing the income statements, balance sheets and cash flow statements of annual and quarterly reports disclosed in 10-K, 40-F, 10-Q and 20-F SEC filings, respectively, is as simple as calling three lines of Python code.
The below table shows the mapping between the financial statements and their corresponding keys in the XBRL-JSON output.
Financial Statement | XBRL-JSON Key |
---|---|
Income statement | xbrl_json["StatementsOfIncome"] |
Balance sheet | xbrl_json["BalanceSheets"] |
Cash flow statement | xbrl_json["StatementsOfCashFlows"] |
Let's start with looking at all available keys of the converted XBRL object while using one of Apple's 10-K filings as an example.
# 10-K HTM File URL example
htm_url = "https://www.sec.gov/Archives/edgar/data/320193/000032019320000096/aapl-20200926.htm"
xbrl_json = xbrlApi.xbrl_to_json(htm_url=htm_url)
print("Keys of the XBRL-JSON dictionary of Apple's 10-K filing")
print('---------------------------------')
print(*list(xbrl_json.keys()), sep='\n')
Keys of the XBRL-JSON dictionary of Apple's 10-K filing
---------------------------------
CoverPage
StatementsOfIncome
StatementsOfComprehensiveIncome
BalanceSheets
BalanceSheetsParenthetical
StatementsOfShareholdersEquity
StatementsOfCashFlows
SummaryofSignificantAccountingPolicies
SummaryofSignificantAccountingPoliciesPolicies
SummaryofSignificantAccountingPoliciesTables
SummaryofSignificantAccountingPoliciesAdditionalInformationDetails
SummaryofSignificantAccountingPoliciesComputationofBasicandDilutedEarningsPerShareDetails
RevenueRecognition
RevenueRecognitionTables
RevenueRecognitionAdditionalInformationDetails
RevenueRecognitionDeferredRevenueExpectedTimingofRealizationDetails
RevenueRecognitionNetSalesDisaggregatedbySignificantProductsandServicesDetails
FinancialInstruments
FinancialInstrumentsTables
FinancialInstrumentsCashCashEquivalentsandMarketableSecuritiesDetails
FinancialInstrumentsAdditionalInformationDetails
FinancialInstrumentsRestrictedCashDetails
FinancialInstrumentsDerivativeInstrumentsatGrossFairValueDetails
FinancialInstrumentsPreTaxGainsandLossesofDerivativeandNonDerivativeInstrumentsDesignatedasCashFlowandNetInvestmentHedgesDetails
FinancialInstrumentsDerivativeInstrumentsDesignatedasFairValueHedgesandRelatedHedgedItemsDetails
FinancialInstrumentsNotionalAmountsandCreditRiskAmountsAssociatedwithDerivativeInstrumentsDetails
ConsolidatedFinancialStatementDetails
ConsolidatedFinancialStatementDetailsTables
ConsolidatedFinancialStatementDetailsPropertyPlantandEquipmentNetDetails
ConsolidatedFinancialStatementDetailsOtherNonCurrentLiabilitiesDetails
ConsolidatedFinancialStatementDetailsOtherIncomeExpenseNetDetails
IncomeTaxes
IncomeTaxesTables
IncomeTaxesProvisionforIncomeTaxesDetails
IncomeTaxesAdditionalInformationDetails
IncomeTaxesReconciliationoftheProvisionforIncomeTaxesDetails
IncomeTaxesSignificantComponentsofDeferredTaxAssetsandLiabilitiesDetails
IncomeTaxesAggregateChangesinGrossUnrecognizedTaxBenefitsDetails
Debt
DebtTables
DebtAdditionalInformationDetails
DebtSummaryofCashFlowsAssociatedwithCommercialPaperDetails
DebtSummaryofTermDebtDetails
DebtFuturePrincipalPaymentsforTermDebtDetails
ShareholdersEquity
ShareholdersEquityTables
ShareholdersEquityAdditionalInformationDetails
ShareholdersEquitySharesofCommonStockDetails
ComprehensiveIncome
ComprehensiveIncomeTables
ComprehensiveIncomePretaxAmountsReclassifiedfromAOCIintotheConsolidatedStatementsofOperationsDetails
ComprehensiveIncomeChangeinAOCIbyComponentDetails
BenefitPlans
BenefitPlansTables
BenefitPlansAdditionalInformationDetails
BenefitPlansRestrictedStockUnitsActivityandRelatedInformationDetails
BenefitPlansSummaryofShareBasedCompensationExpenseandtheRelatedIncomeTaxBenefitDetails
CommitmentsandContingencies
CommitmentsandContingenciesTables
CommitmentsandContingenciesChangesinAccruedWarrantiesandRelatedCostsDetails
CommitmentsandContingenciesAdditionalInformationDetails
CommitmentsandContingenciesFuturePaymentsUnderUnconditionalPurchaseObligationsDetails
SegmentInformationandGeographicData
SegmentInformationandGeographicDataTables
SegmentInformationandGeographicDataSummaryInformationbyReportableSegmentDetails
SegmentInformationandGeographicDataReconciliationofSegmentOperatingIncometotheConsolidatedStatementsofOperationsDetails
SegmentInformationandGeographicDataNetSalesDetails
SegmentInformationandGeographicDataLongLivedAssetsDetails
Leases
LeasesTables
LeasesAdditionalInformationDetails
LeasesROUAssetsandLeaseLiabilitiesDetails
LeasesLeaseLiabilityMaturitiesDetails
SelectedQuarterlyFinancialInformationUnaudited
SelectedQuarterlyFinancialInformationUnauditedTables
SelectedQuarterlyFinancialInformationUnauditedSummaryofQuarterlyFinancialInformationDetails
We continue with listing all US GAAP items of the income statement in xbrl_json["StatementsOfIncome"]
.
print("Keys of income statement in XBRL")
print('---------------------------------')
print(*list(xbrl_json["StatementsOfIncome"].keys()), sep='\n')
Keys of income statement in XBRL
---------------------------------
RevenueFromContractWithCustomerExcludingAssessedTax
CostOfGoodsAndServicesSold
GrossProfit
ResearchAndDevelopmentExpense
SellingGeneralAndAdministrativeExpense
OperatingExpenses
OperatingIncomeLoss
NonoperatingIncomeExpense
IncomeLossFromContinuingOperationsBeforeIncomeTaxesExtraordinaryItemsNoncontrollingInterest
IncomeTaxExpenseBenefit
NetIncomeLoss
EarningsPerShareBasic
EarningsPerShareDiluted
WeightedAverageNumberOfSharesOutstandingBasic
WeightedAverageNumberOfDilutedSharesOutstanding
Now let's inspect the content of the revenue item of the income statement.
print("Revenue items of income statement in XBRL")
print('---------------------------------')
print(*list(xbrl_json["StatementsOfIncome"]["RevenueFromContractWithCustomerExcludingAssessedTax"]), sep='\n')
Keys of income statement in XBRL
---------------------------------
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2019-09-29', 'endDate': '2020-09-26'}, 'segment': {'dimension': 'srt:ProductOrServiceAxis', 'value': 'us-gaap:ProductMember'}, 'value': '220747000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2018-09-30', 'endDate': '2019-09-28'}, 'segment': {'dimension': 'srt:ProductOrServiceAxis', 'value': 'us-gaap:ProductMember'}, 'value': '213883000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2017-10-01', 'endDate': '2018-09-29'}, 'segment': {'dimension': 'srt:ProductOrServiceAxis', 'value': 'us-gaap:ProductMember'}, 'value': '225847000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2019-09-29', 'endDate': '2020-09-26'}, 'segment': {'dimension': 'srt:ProductOrServiceAxis', 'value': 'us-gaap:ServiceMember'}, 'value': '53768000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2018-09-30', 'endDate': '2019-09-28'}, 'segment': {'dimension': 'srt:ProductOrServiceAxis', 'value': 'us-gaap:ServiceMember'}, 'value': '46291000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2017-10-01', 'endDate': '2018-09-29'}, 'segment': {'dimension': 'srt:ProductOrServiceAxis', 'value': 'us-gaap:ServiceMember'}, 'value': '39748000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2019-09-29', 'endDate': '2020-09-26'}, 'value': '274515000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2018-09-30', 'endDate': '2019-09-28'}, 'value': '260174000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2017-10-01', 'endDate': '2018-09-29'}, 'value': '265595000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2019-09-29', 'endDate': '2020-09-26'}, 'segment': {'dimension': 'srt:ProductOrServiceAxis', 'value': 'aapl:IPhoneMember'}, 'value': '137781000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2018-09-30', 'endDate': '2019-09-28'}, 'segment': {'dimension': 'srt:ProductOrServiceAxis', 'value': 'aapl:IPhoneMember'}, 'value': '142381000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2017-10-01', 'endDate': '2018-09-29'}, 'segment': {'dimension': 'srt:ProductOrServiceAxis', 'value': 'aapl:IPhoneMember'}, 'value': '164888000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2019-09-29', 'endDate': '2020-09-26'}, 'segment': {'dimension': 'srt:ProductOrServiceAxis', 'value': 'aapl:MacMember'}, 'value': '28622000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2018-09-30', 'endDate': '2019-09-28'}, 'segment': {'dimension': 'srt:ProductOrServiceAxis', 'value': 'aapl:MacMember'}, 'value': '25740000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2017-10-01', 'endDate': '2018-09-29'}, 'segment': {'dimension': 'srt:ProductOrServiceAxis', 'value': 'aapl:MacMember'}, 'value': '25198000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2019-09-29', 'endDate': '2020-09-26'}, 'segment': {'dimension': 'srt:ProductOrServiceAxis', 'value': 'aapl:IPadMember'}, 'value': '23724000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2018-09-30', 'endDate': '2019-09-28'}, 'segment': {'dimension': 'srt:ProductOrServiceAxis', 'value': 'aapl:IPadMember'}, 'value': '21280000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2017-10-01', 'endDate': '2018-09-29'}, 'segment': {'dimension': 'srt:ProductOrServiceAxis', 'value': 'aapl:IPadMember'}, 'value': '18380000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2019-09-29', 'endDate': '2020-09-26'}, 'segment': {'dimension': 'srt:ProductOrServiceAxis', 'value': 'aapl:WearablesHomeandAccessoriesMember'}, 'value': '30620000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2018-09-30', 'endDate': '2019-09-28'}, 'segment': {'dimension': 'srt:ProductOrServiceAxis', 'value': 'aapl:WearablesHomeandAccessoriesMember'}, 'value': '24482000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2017-10-01', 'endDate': '2018-09-29'}, 'segment': {'dimension': 'srt:ProductOrServiceAxis', 'value': 'aapl:WearablesHomeandAccessoriesMember'}, 'value': '17381000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2019-09-29', 'endDate': '2020-09-26'}, 'segment': {'dimension': 'us-gaap:StatementBusinessSegmentsAxis', 'value': 'aapl:AmericasSegmentMember'}, 'value': '124556000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2018-09-30', 'endDate': '2019-09-28'}, 'segment': {'dimension': 'us-gaap:StatementBusinessSegmentsAxis', 'value': 'aapl:AmericasSegmentMember'}, 'value': '116914000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2017-10-01', 'endDate': '2018-09-29'}, 'segment': {'dimension': 'us-gaap:StatementBusinessSegmentsAxis', 'value': 'aapl:AmericasSegmentMember'}, 'value': '112093000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2019-09-29', 'endDate': '2020-09-26'}, 'segment': {'dimension': 'us-gaap:StatementBusinessSegmentsAxis', 'value': 'aapl:EuropeSegmentMember'}, 'value': '68640000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2018-09-30', 'endDate': '2019-09-28'}, 'segment': {'dimension': 'us-gaap:StatementBusinessSegmentsAxis', 'value': 'aapl:EuropeSegmentMember'}, 'value': '60288000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2017-10-01', 'endDate': '2018-09-29'}, 'segment': {'dimension': 'us-gaap:StatementBusinessSegmentsAxis', 'value': 'aapl:EuropeSegmentMember'}, 'value': '62420000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2019-09-29', 'endDate': '2020-09-26'}, 'segment': {'dimension': 'us-gaap:StatementBusinessSegmentsAxis', 'value': 'aapl:GreaterChinaSegmentMember'}, 'value': '40308000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2018-09-30', 'endDate': '2019-09-28'}, 'segment': {'dimension': 'us-gaap:StatementBusinessSegmentsAxis', 'value': 'aapl:GreaterChinaSegmentMember'}, 'value': '43678000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2017-10-01', 'endDate': '2018-09-29'}, 'segment': {'dimension': 'us-gaap:StatementBusinessSegmentsAxis', 'value': 'aapl:GreaterChinaSegmentMember'}, 'value': '51942000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2019-09-29', 'endDate': '2020-09-26'}, 'segment': {'dimension': 'us-gaap:StatementBusinessSegmentsAxis', 'value': 'aapl:JapanSegmentMember'}, 'value': '21418000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2018-09-30', 'endDate': '2019-09-28'}, 'segment': {'dimension': 'us-gaap:StatementBusinessSegmentsAxis', 'value': 'aapl:JapanSegmentMember'}, 'value': '21506000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2017-10-01', 'endDate': '2018-09-29'}, 'segment': {'dimension': 'us-gaap:StatementBusinessSegmentsAxis', 'value': 'aapl:JapanSegmentMember'}, 'value': '21733000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2019-09-29', 'endDate': '2020-09-26'}, 'segment': {'dimension': 'us-gaap:StatementBusinessSegmentsAxis', 'value': 'aapl:RestOfAsiaPacificSegmentMember'}, 'value': '19593000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2018-09-30', 'endDate': '2019-09-28'}, 'segment': {'dimension': 'us-gaap:StatementBusinessSegmentsAxis', 'value': 'aapl:RestOfAsiaPacificSegmentMember'}, 'value': '17788000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2017-10-01', 'endDate': '2018-09-29'}, 'segment': {'dimension': 'us-gaap:StatementBusinessSegmentsAxis', 'value': 'aapl:RestOfAsiaPacificSegmentMember'}, 'value': '17407000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2019-09-29', 'endDate': '2020-09-26'}, 'segment': {'dimension': 'srt:StatementGeographicalAxis', 'value': 'country:US'}, 'value': '109197000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2018-09-30', 'endDate': '2019-09-28'}, 'segment': {'dimension': 'srt:StatementGeographicalAxis', 'value': 'country:US'}, 'value': '102266000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2017-10-01', 'endDate': '2018-09-29'}, 'segment': {'dimension': 'srt:StatementGeographicalAxis', 'value': 'country:US'}, 'value': '98061000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2019-09-29', 'endDate': '2020-09-26'}, 'segment': {'dimension': 'srt:StatementGeographicalAxis', 'value': 'country:CN'}, 'value': '40308000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2018-09-30', 'endDate': '2019-09-28'}, 'segment': {'dimension': 'srt:StatementGeographicalAxis', 'value': 'country:CN'}, 'value': '43678000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2017-10-01', 'endDate': '2018-09-29'}, 'segment': {'dimension': 'srt:StatementGeographicalAxis', 'value': 'country:CN'}, 'value': '51942000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2019-09-29', 'endDate': '2020-09-26'}, 'segment': {'dimension': 'srt:StatementGeographicalAxis', 'value': 'aapl:OtherCountriesMember'}, 'value': '125010000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2018-09-30', 'endDate': '2019-09-28'}, 'segment': {'dimension': 'srt:StatementGeographicalAxis', 'value': 'aapl:OtherCountriesMember'}, 'value': '114230000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2017-10-01', 'endDate': '2018-09-29'}, 'segment': {'dimension': 'srt:StatementGeographicalAxis', 'value': 'aapl:OtherCountriesMember'}, 'value': '115592000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2020-06-28', 'endDate': '2020-09-26'}, 'value': '64698000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2020-03-29', 'endDate': '2020-06-27'}, 'value': '59685000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2019-12-29', 'endDate': '2020-03-28'}, 'value': '58313000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2019-09-29', 'endDate': '2019-12-28'}, 'value': '91819000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2019-06-30', 'endDate': '2019-09-28'}, 'value': '64040000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2019-03-31', 'endDate': '2019-06-29'}, 'value': '53809000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2018-12-30', 'endDate': '2019-03-30'}, 'value': '58015000000'}
{'decimals': '-6', 'unitRef': 'usd', 'period': {'startDate': '2018-09-30', 'endDate': '2018-12-29'}, 'value': '84310000000'}
Convert XBRL data into a pandas DataFrame
In the next step, we extract XBRL data from Apple's 10-Q filing and convert it into a pandas DataFrame allowing us to access information such as total revenue per quarter, revenue per product, such as Mac or iPad, and revenue per region, such as America, Europe, Greater China, Japan and more.
aapl_10k = "https://www.sec.gov/Archives/edgar/data/320193/000032019323000006/aapl-20221231.htm"
aapl_xbrl = xbrlApi.xbrl_to_json(htm_url=aapl_10k)
import pandas as pd
revenue = pd.json_normalize(aapl_xbrl["StatementsOfIncome"]["RevenueFromContractWithCustomerExcludingAssessedTax"])
revenue
decimals | unitRef | value | period.startDate | period.endDate | segment.dimension | segment.value | |
---|---|---|---|---|---|---|---|
0 | -6 | usd | 96388000000 | 2022-09-25 | 2022-12-31 | srt:ProductOrServiceAxis | us-gaap:ProductMember |
1 | -6 | usd | 104429000000 | 2021-09-26 | 2021-12-25 | srt:ProductOrServiceAxis | us-gaap:ProductMember |
2 | -6 | usd | 20766000000 | 2022-09-25 | 2022-12-31 | srt:ProductOrServiceAxis | us-gaap:ServiceMember |
3 | -6 | usd | 19516000000 | 2021-09-26 | 2021-12-25 | srt:ProductOrServiceAxis | us-gaap:ServiceMember |
4 | -6 | usd | 117154000000 | 2022-09-25 | 2022-12-31 | NaN | NaN |
5 | -6 | usd | 123945000000 | 2021-09-26 | 2021-12-25 | NaN | NaN |
6 | -6 | usd | 65775000000 | 2022-09-25 | 2022-12-31 | srt:ProductOrServiceAxis | aapl:IPhoneMember |
7 | -6 | usd | 71628000000 | 2021-09-26 | 2021-12-25 | srt:ProductOrServiceAxis | aapl:IPhoneMember |
8 | -6 | usd | 7735000000 | 2022-09-25 | 2022-12-31 | srt:ProductOrServiceAxis | aapl:MacMember |
9 | -6 | usd | 10852000000 | 2021-09-26 | 2021-12-25 | srt:ProductOrServiceAxis | aapl:MacMember |
10 | -6 | usd | 9396000000 | 2022-09-25 | 2022-12-31 | srt:ProductOrServiceAxis | aapl:IPadMember |
11 | -6 | usd | 7248000000 | 2021-09-26 | 2021-12-25 | srt:ProductOrServiceAxis | aapl:IPadMember |
12 | -6 | usd | 13482000000 | 2022-09-25 | 2022-12-31 | srt:ProductOrServiceAxis | aapl:WearablesHomeandAccessoriesMember |
13 | -6 | usd | 14701000000 | 2021-09-26 | 2021-12-25 | srt:ProductOrServiceAxis | aapl:WearablesHomeandAccessoriesMember |
14 | -6 | usd | 49278000000 | 2022-09-25 | 2022-12-31 | us-gaap:StatementBusinessSegmentsAxis | aapl:AmericasSegmentMember |
15 | -6 | usd | 51496000000 | 2021-09-26 | 2021-12-25 | us-gaap:StatementBusinessSegmentsAxis | aapl:AmericasSegmentMember |
16 | -6 | usd | 27681000000 | 2022-09-25 | 2022-12-31 | us-gaap:StatementBusinessSegmentsAxis | aapl:EuropeSegmentMember |
17 | -6 | usd | 29749000000 | 2021-09-26 | 2021-12-25 | us-gaap:StatementBusinessSegmentsAxis | aapl:EuropeSegmentMember |
18 | -6 | usd | 23905000000 | 2022-09-25 | 2022-12-31 | us-gaap:StatementBusinessSegmentsAxis | aapl:GreaterChinaSegmentMember |
19 | -6 | usd | 25783000000 | 2021-09-26 | 2021-12-25 | us-gaap:StatementBusinessSegmentsAxis | aapl:GreaterChinaSegmentMember |
20 | -6 | usd | 6755000000 | 2022-09-25 | 2022-12-31 | us-gaap:StatementBusinessSegmentsAxis | aapl:JapanSegmentMember |
21 | -6 | usd | 7107000000 | 2021-09-26 | 2021-12-25 | us-gaap:StatementBusinessSegmentsAxis | aapl:JapanSegmentMember |
22 | -6 | usd | 9535000000 | 2022-09-25 | 2022-12-31 | us-gaap:StatementBusinessSegmentsAxis | aapl:RestOfAsiaPacificSegmentMember |
23 | -6 | usd | 9810000000 | 2021-09-26 | 2021-12-25 | us-gaap:StatementBusinessSegmentsAxis | aapl:RestOfAsiaPacificSegmentMember |
Extract an income statement from XBRL data and convert it to a pandas DataFrame
The next example shows the function get_income_statement()
that extracts the income statement from the XBRL data of the 10-Q SEC filing and converts the statement into a pandas DataFrame.
# convert XBRL-JSON of income statement to pandas dataframe
def get_income_statement(xbrl_json):
income_statement_store = {}
# iterate over each US GAAP item in the income statement
for usGaapItem in xbrl_json['StatementsOfIncome']:
values = []
indicies = []
for fact in xbrl_json['StatementsOfIncome'][usGaapItem]:
# only consider items without segment. not required for our analysis.
if 'segment' not in fact:
index = fact['period']['startDate'] + '-' + fact['period']['endDate']
# ensure no index duplicates are created
if index not in indicies:
values.append(fact['value'])
indicies.append(index)
income_statement_store[usGaapItem] = pd.Series(values, index=indicies)
income_statement = pd.DataFrame(income_statement_store)
# switch columns and rows so that US GAAP items are rows and each column header represents a date range
return income_statement.T
income_statement = get_income_statement(aapl_xbrl)
income_statement
2022-09-25-2022-12-31 | 2021-09-26-2021-12-25 | |
---|---|---|
RevenueFromContractWithCustomerExcludingAssessedTax | 117154000000 | 123945000000 |
CostOfGoodsAndServicesSold | 66822000000 | 69702000000 |
GrossProfit | 50332000000 | 54243000000 |
ResearchAndDevelopmentExpense | 7709000000 | 6306000000 |
SellingGeneralAndAdministrativeExpense | 6607000000 | 6449000000 |
OperatingExpenses | 14316000000 | 12755000000 |
OperatingIncomeLoss | 36016000000 | 41488000000 |
NonoperatingIncomeExpense | -393000000 | -247000000 |
IncomeLossFromContinuingOperationsBeforeIncomeTaxesExtraordinaryItemsNoncontrollingInterest | 35623000000 | 41241000000 |
IncomeTaxExpenseBenefit | 5625000000 | 6611000000 |
NetIncomeLoss | 29998000000 | 34630000000 |
EarningsPerShareBasic | 1.89 | 2.11 |
EarningsPerShareDiluted | 1.88 | 2.10 |
WeightedAverageNumberOfSharesOutstandingBasic | 15892723000 | 16391724000 |
WeightedAverageNumberOfDilutedSharesOutstanding | 15955718000 | 16519291000 |
Extract an income statement from an SEC filing and export it to Excel
The next example illustrates how to extract the income statement from the XBRL data of an SEC filing, such as a 10-Q or 10-K report, and export the statement to an Excel file.
income_statement.to_excel('income-statement.xlsx')
The resulting Excel file income-statement.xlsx
looks like this.
Extract a balance sheet from XBRL data and convert it to a pandas DataFrame
The following function get_balance_sheet()
extracts the balance sheet from the XBRL data of the 10-Q SEC filing and converts it into a pandas DataFrame.
# convert XBRL-JSON of balance sheet to pandas dataframe
def get_balance_sheet(xbrl_json):
balance_sheet_store = {}
for usGaapItem in xbrl_json['BalanceSheets']:
values = []
indicies = []
for fact in xbrl_json['BalanceSheets'][usGaapItem]:
# only consider items without segment.
if 'segment' not in fact:
index = fact['period']['instant']
# avoid duplicate indicies with same values
if index in indicies:
continue
# add 0 if value is nil
if "value" not in fact:
values.append(0)
else:
values.append(fact['value'])
indicies.append(index)
balance_sheet_store[usGaapItem] = pd.Series(values, index=indicies)
balance_sheet = pd.DataFrame(balance_sheet_store)
# switch columns and rows so that US GAAP items are rows and each column header represents a date instant
return balance_sheet.T
balance_sheet = get_balance_sheet(aapl_xbrl)
balance_sheet
2021-09-25 | 2021-12-25 | 2022-09-24 | 2022-12-31 | |
---|---|---|---|---|
CashAndCashEquivalentsAtCarryingValue | NaN | NaN | 23646000000 | 20535000000 |
MarketableSecuritiesCurrent | NaN | NaN | 24658000000 | 30820000000 |
AccountsReceivableNetCurrent | NaN | NaN | 28184000000 | 23752000000 |
InventoryNet | NaN | NaN | 4946000000 | 6820000000 |
NontradeReceivablesCurrent | NaN | NaN | 32748000000 | 30428000000 |
OtherAssetsCurrent | NaN | NaN | 21223000000 | 16422000000 |
AssetsCurrent | NaN | NaN | 135405000000 | 128777000000 |
MarketableSecuritiesNoncurrent | NaN | NaN | 120805000000 | 114095000000 |
PropertyPlantAndEquipmentNet | NaN | NaN | 42117000000 | 42951000000 |
OtherAssetsNoncurrent | NaN | NaN | 54428000000 | 60924000000 |
AssetsNoncurrent | NaN | NaN | 217350000000 | 217970000000 |
Assets | NaN | NaN | 352755000000 | 346747000000 |
AccountsPayableCurrent | NaN | NaN | 64115000000 | 57918000000 |
OtherLiabilitiesCurrent | NaN | NaN | 60845000000 | 59893000000 |
ContractWithCustomerLiabilityCurrent | NaN | NaN | 7912000000 | 7992000000 |
CommercialPaper | NaN | NaN | 9982000000 | 1743000000 |
LongTermDebtCurrent | NaN | NaN | 11128000000 | 9740000000 |
LiabilitiesCurrent | NaN | NaN | 153982000000 | 137286000000 |
LongTermDebtNoncurrent | NaN | NaN | 98959000000 | 99627000000 |
OtherLiabilitiesNoncurrent | NaN | NaN | 49142000000 | 53107000000 |
LiabilitiesNoncurrent | NaN | NaN | 148101000000 | 152734000000 |
Liabilities | NaN | NaN | 302083000000 | 290020000000 |
CommitmentsAndContingencies | NaN | NaN | 0.0 | 0.0 |
CommonStocksIncludingAdditionalPaidInCapital | NaN | NaN | 64849000000 | 66399000000 |
RetainedEarningsAccumulatedDeficit | NaN | NaN | -3068000000 | 3240000000 |
AccumulatedOtherComprehensiveIncomeLossNetOfTax | NaN | NaN | -11109000000 | -12912000000 |
StockholdersEquity | 63090000000 | 71932000000 | 50672000000 | 56727000000 |
LiabilitiesAndStockholdersEquity | NaN | NaN | 352755000000 | 346747000000 |
Extract the address of a company from its SEC filing
The next example illustrates how to extract the address of a company or filer from the cover page of an XBRL instance document attached to an SEC filing. For this example, we use one of Google's 10-K filings.
The XBRL cover page includes information such as:
- Company's fiscal year end
- Trading symbols and exchanges
- Tax identification number
- Phone number
- Business and filer category, e.g. large accelerated filer or emerging growth company.
goog_10k = "https://www.sec.gov/Archives/edgar/data/1652044/000165204423000016/goog-20221231.htm"
goog_xbrl = xbrlApi.xbrl_to_json(htm_url=goog_10k)
print("Cover page items")
print('---------------------------------')
print(*list(goog_xbrl["CoverPage"]), sep='\n')
Cover page items
---------------------------------
DocumentType
DocumentAnnualReport
DocumentPeriodEndDate
CurrentFiscalYearEndDate
DocumentTransitionReport
EntityFileNumber
EntityRegistrantName
EntityIncorporationStateCountryCode
EntityTaxIdentificationNumber
EntityAddressAddressLine1
EntityAddressCityOrTown
EntityAddressStateOrProvince
EntityAddressPostalZipCode
CityAreaCode
LocalPhoneNumber
Security12bTitle
TradingSymbol
SecurityExchangeName
EntityWellKnownSeasonedIssuer
EntityVoluntaryFilers
EntityCurrentReportingStatus
EntityInteractiveDataCurrent
EntityFilerCategory
EntitySmallBusiness
EntityEmergingGrowthCompany
IcfrAuditorAttestationFlag
EntityShellCompany
EntityPublicFloat
EntityCommonStockSharesOutstanding
DocumentsIncorporatedByReferenceTextBlock
AmendmentFlag
DocumentFiscalYearFocus
DocumentFiscalPeriodFocus
EntityCentralIndexKey
We identify five relevant keys in the cover page that include information about the company's address:
- EntityAddressAddressLine1
- EntityAddressCityOrTown
- EntityAddressStateOrProvince
- EntityAddressPostalZipCode
- CityAreaCode
cover = goog_xbrl["CoverPage"]
print("Google's address:")
print(cover["EntityAddressAddressLine1"])
print(cover["EntityAddressCityOrTown"])
print(cover["EntityAddressStateOrProvince"])
print(cover["EntityAddressPostalZipCode"])
print(cover["CityAreaCode"])
Google's address:
1600 Amphitheatre Parkway
Mountain View
CA
94043
650