sec-api.ioSEC API by D2V
FilingsPricingSandboxDocs
Log inGet Free API Key
  1. Home
  2. Tutorials

Extract Google's Revenue Metrics from 10-K Filings with Python

Open In Colab   Download Notebook

On this page:

  • Getting Started
    • Extract Income Statements from Google's 10-K Filings
      • Visualize Google's Revenue by Region from 2016 to 2022
        • Visualize Google's Revenue by Product Category from 2016 to 2022
          • Create a Figure with Two Revenue Charts: Region and Product

            Welcome to this tutorial on how to extract revenue metrics from Google's 10-K SEC EDGAR filings and visualize them using Python's matplotlib library.

            In this tutorial, we will explore how to extract revenue data by product categories and geographical regions, and create bar charts to show the progression of revenue over the years. We will look at income statements from 2016 to 2022 reported in 10-K filings from 2018 to 2022. Using the XBRL-to-JSON Converter API, we will extract income statements and revenue metrics from the XBRL data of 10-K filings and convert them into a standardized JSON format, which can be easily imported into pandas dataframes. Finally, we will extract the relevant US GAAP items and visualize the revenue data in a stacked bar chart.

            Below is an illustrative example of the two bar charts that we will create to visualize Google's revenue by product categories and revenue by geographical regions, respectively.

            google-revenue-by-region-and-product

            Getting Started

            While similar to our previous tutorial, Extract Financial Statements from SEC Filings with Python, this tutorial is tailored specifically towards Google's 10-K and XBRL structure.

            For those already familiar with the previous tutorial, feel free to skip ahead. Otherwise, let's get started by installing the sec-api Python package and creating an instance of XbrlApi, which exposes the .xbrl_to_json(htm_url) function to convert XBRL data to JSON format. We will then define the function get_income_statement(xbrl_json) to extract the income statement from the XBRL-converted JSON data and create a pandas dataframe for us to hold all relevant metrics, such as revenue and research and development expenses per year for 2020 to 2022.

            API_KEY = 'YOUR_API_KEY'
            !pip install -q sec-api
            from sec_api import XbrlApi

            xbrlApi = XbrlApi(API_KEY)
            # URL of Google's 10-K filings
            url_10k = 'https://www.sec.gov/Archives/edgar/data/1652044/000165204423000016/goog-20221231.htm'

            xbrl_json = xbrlApi.xbrl_to_json(htm_url=url_10k)
            import pandas as pd 

            # convert XBRL-JSON of income statement to pandas dataframe
            def get_income_statement(xbrl_json):
                income_statement_store = {}

                # iterate over each US GAAP item in the income statement
                for usGaapItem in xbrl_json['StatementsOfIncome']:
                    values = []
                    indicies = []

                    for fact in xbrl_json['StatementsOfIncome'][usGaapItem]:
                        # only consider items without segment. not required for our analysis.
                        if 'segment' not in fact:
                            index = fact['period']['startDate'] + '-' + fact['period']['endDate']
                            # ensure no index duplicates are created
                            if index not in indicies:
                                values.append(fact['value'])
                                indicies.append(index)

                    income_statement_store[usGaapItem] = pd.Series(values, index=indicies)

                income_statement = pd.DataFrame(income_statement_store)
                # switch columns and rows so that US GAAP items are rows and each column header represents a date range
                return income_statement.T


            income_statement_google = get_income_statement(xbrl_json)


            print("Income statement from Google's 2022 10-K filing as dataframe")
            print('------------------------------------------------------------')
            income_statement_google
            Income statement from Google's 2022 10-K filing as dataframe
            ------------------------------------------------------------
            Out:
            2020-01-01-2020-12-312021-01-01-2021-12-312022-01-01-2022-12-31
            RevenueFromContractWithCustomerExcludingAssessedTax182527000000257637000000282836000000
            CostOfRevenue84732000000110939000000126203000000
            ResearchAndDevelopmentExpense275730000003156200000039500000000
            SellingAndMarketingExpense179460000002291200000026567000000
            GeneralAndAdministrativeExpense110520000001351000000015724000000
            CostsAndExpenses141303000000178923000000207994000000
            OperatingIncomeLoss412240000007871400000074842000000
            NonoperatingIncomeExpense685800000012020000000-3514000000
            IncomeLossFromContinuingOperationsBeforeIncomeTaxesExtraordinaryItemsNoncontrollingInterest480820000009073400000071328000000
            IncomeTaxExpenseBenefit78130000001470100000011356000000
            NetIncomeLoss402690000007603300000059972000000
            EarningsPerShareBasic2.965.694.59
            EarningsPerShareDiluted2.935.614.56

            Extract Income Statements from Google's 10-K Filings

            In the following section, we will extract all XBRL data from Google's 10-K filings filed between 2018 and 2022. We will use the XBRL-to-JSON Converter API to convert this data and create a consolidated dataframe that holds multiple years of income statement data ranging from 2016 to 2022. Furthermore, we will merge all US GAAP revenue items across different XBRL dimensions into a single dataframe, which will assist us in filtering revenue items based on product and region.

            # Google's 10Ks of the last 5 years, 2018 to 2022, with data from 2016 to 2022
            url_10k_2018 = 'https://www.sec.gov/Archives/edgar/data/1652044/000165204419000004/goog10-kq42018.htm'
            url_10k_2019 = 'https://www.sec.gov/Archives/edgar/data/1652044/000165204420000008/goog10-k2019.htm'
            url_10k_2020 = 'https://www.sec.gov/Archives/edgar/data/1652044/000165204421000010/goog-20201231.htm'
            url_10k_2021 = 'https://www.sec.gov/Archives/edgar/data/1652044/000165204422000019/goog-20211231.htm'
            url_10k_2022 = 'https://www.sec.gov/Archives/edgar/data/1652044/000165204423000016/goog-20221231.htm'

            xbrl_json_2018 = xbrlApi.xbrl_to_json(htm_url=url_10k_2018)
            xbrl_json_2019 = xbrlApi.xbrl_to_json(htm_url=url_10k_2019)
            xbrl_json_2020 = xbrlApi.xbrl_to_json(htm_url=url_10k_2020)
            xbrl_json_2021 = xbrlApi.xbrl_to_json(htm_url=url_10k_2021)
            xbrl_json_2022 = xbrlApi.xbrl_to_json(htm_url=url_10k_2022)

            In 2020, Google made a change to its reporting naming convention by using the key Revenues instead of the previously used RevenueFromContractWithCustomerExcludingAssessedTax. Therefore, in this tutorial, we will address this naming inconsistency by simply renaming the element.

            # fix naming inconsistency
            xbrl_json_2020['StatementsOfIncome']['RevenueFromContractWithCustomerExcludingAssessedTax'] = xbrl_json_2020['StatementsOfIncome']['Revenues']
            del xbrl_json_2020['StatementsOfIncome']['Revenues']
            xbrl_json_2021['StatementsOfIncome']['RevenueFromContractWithCustomerExcludingAssessedTax'] = xbrl_json_2021['StatementsOfIncome']['Revenues']
            del xbrl_json_2021['StatementsOfIncome']['Revenues']
            income_statement_2018 = get_income_statement(xbrl_json_2018)
            income_statement_2019 = get_income_statement(xbrl_json_2019)
            income_statement_2020 = get_income_statement(xbrl_json_2020)
            income_statement_2021 = get_income_statement(xbrl_json_2021)
            income_statement_2022 = get_income_statement(xbrl_json_2022)
            income_statements_merged = pd.concat([income_statement_2018, 
                                                  income_statement_2019,
                                                  income_statement_2020,
                                                  income_statement_2021,
                                                  income_statement_2022], axis=0, sort=False)

            # sort & reset the index of the merged dataframe
            income_statements_merged = income_statements_merged.sort_index().reset_index()

            # convert cells to float
            income_statements_merged = income_statements_merged.applymap(lambda x: pd.to_numeric(x, errors='ignore'))


            print("Merged, uncleaned financials of all income statements")
            print('-----------------------------------------------------')
            income_statements_merged.head(10)
            Merged, uncleaned financials of all income statements
            -----------------------------------------------------
            Out:
            index2016-01-01-2016-12-312017-01-01-2017-12-312018-01-01-2018-12-312019-01-01-2019-12-312020-01-01-2020-12-312021-01-01-2021-12-312022-01-01-2022-12-31
            0CostOfRevenueNaN4.558300e+105.954900e+107.189600e+10NaNNaNNaN
            1CostOfRevenue3.513800e+104.558300e+105.954900e+10NaNNaNNaNNaN
            2CostOfRevenueNaNNaNNaNNaN8.473200e+101.109390e+111.262030e+11
            3CostOfRevenueNaNNaN5.954900e+107.189600e+108.473200e+10NaNNaN
            4CostOfRevenueNaNNaNNaN7.189600e+108.473200e+101.109390e+11NaN
            5CostsAndExpensesNaNNaNNaN1.276260e+111.413030e+111.789230e+11NaN
            6CostsAndExpenses6.655600e+108.470900e+101.104980e+11NaNNaNNaNNaN
            7CostsAndExpensesNaNNaN1.092950e+111.276260e+111.413030e+11NaNNaN
            8CostsAndExpensesNaN8.467700e+101.092950e+111.276260e+11NaNNaNNaN
            9CostsAndExpensesNaNNaNNaNNaN1.413030e+111.789230e+112.079940e+11
            income_statements = income_statements_merged.groupby('index').max()

            # reindex the merged dataframe using the index of the first dataframe
            income_statements = income_statements.reindex(income_statement_2019.index)

            # loop over the columns
            for col in income_statements.columns[1:]:
                # extract start and end dates from the column label
                splitted = col.split('-')
                start = '-'.join(splitted[:3])
                end = '-'.join(splitted[3:])

                # convert start and end dates to datetime objects
                start_date = pd.to_datetime(start)
                end_date = pd.to_datetime(end)

                # calculate the duration between start and end dates
                duration = (end_date - start_date).days / 360

                # drop the column if duration is less than a year
                if duration < 1:
                    income_statements.drop(columns=[col], inplace=True)

            # convert datatype of cells to readable format, e.g. "2.235460e+11" becomes "223546000000"
            income_statements = income_statements.apply(lambda row: pd.to_numeric(row, errors='coerce', downcast='integer').astype(str), axis=1)


            print("Income statements from Google's 10-K filings (2016 to 2022) as dataframe")
            print('------------------------------------------------------------------------')
            income_statements
            Income statements from Google's 10-K filings (2016 to 2022) as dataframe
            ------------------------------------------------------------------------
            Out:
            2016-01-01-2016-12-312017-01-01-2017-12-312018-01-01-2018-12-312019-01-01-2019-12-312020-01-01-2020-12-312021-01-01-2021-12-312022-01-01-2022-12-31
            RevenueFromContractWithCustomerExcludingAssessedTax90272000000110855000000136819000000161857000000182527000000257637000000282836000000
            CostOfRevenue3513800000045583000000595490000007189600000084732000000110939000000126203000000
            ResearchAndDevelopmentExpense13948000000166250000002141900000026018000000275730000003156200000039500000000
            SellingAndMarketingExpense10485000000128930000001633300000018464000000179460000002291200000026567000000
            GeneralAndAdministrativeExpense6985000000687200000081260000009551000000110520000001351000000015724000000
            LossContingencyLossInPeriod0.02736000000.05071000000.01697000000.00.00.0nan
            CostsAndExpenses6655600000084709000000110498000000127626000000141303000000178923000000207994000000
            OperatingIncomeLoss23716000000261780000002752400000034231000000412240000007871400000074842000000
            NonoperatingIncomeExpense434000000104700000085920000005394000000685800000012020000000-3514000000
            IncomeLossFromContinuingOperationsBeforeIncomeTaxesMinorityInterestAndIncomeLossFromEquityMethodInvestments24150000000.027193000000.034913000000.039625000000.048082000000.0nannan
            IncomeTaxExpenseBenefit4672000000145310000004177000000528200000078130000001470100000011356000000
            NetIncomeLoss19478000000126620000003073600000034343000000402690000007603300000059972000000
            EarningsPerShareBasic28.3218.2744.2249.5959.15113.884.59
            EarningsPerShareDiluted27.8518.043.749.1658.61112.24.56
            # fix naming inconsistency
            # xbrl_json_2021['StatementsOfIncome']['RevenueFromContractWithCustomerExcludingAssessedTax'] = xbrl_json_2021['StatementsOfIncome']['Revenues']
            # del xbrl_json_2021['StatementsOfIncome']['Revenues']

            Visualize Google's Revenue by Region from 2016 to 2022

            We combine all US GAAP elements in each RevenueFromContractWithCustomerExcludingAssessedTax list in the XBRL-JSON data for each year into a single array and then use the pd.json_normalize() function to create a dataframe.

            However, some segment objects in the XBRL data have multiple items with dimension and value keys, forming an array. For such cases, we use the .explode() method to create new rows for each array item.

            For instance, if we have one row with a cell containing the array

            [
            {
            "dimension": "srt:ProductOrServiceAxis",
            "value": "goog:AdvertisingRevenueMember"
            },
            {
            "dimension": "us-gaap:StatementBusinessSegmentsAxis",
            "value": "goog:GoogleInc.Member"
            }
            ]

            we would create two new rows: the first with dimension: srt:ProductOrServiceAxis, value: goog:AdvertisingRevenueMember, and the second with dimension: us-gaap:StatementBusinessSegmentsAxis, value: goog:GoogleInc.Member.

            all_revenues_json = xbrl_json_2018['StatementsOfIncome']['RevenueFromContractWithCustomerExcludingAssessedTax'] + \
                                xbrl_json_2019['StatementsOfIncome']['RevenueFromContractWithCustomerExcludingAssessedTax'] + \
                                xbrl_json_2020['StatementsOfIncome']['RevenueFromContractWithCustomerExcludingAssessedTax'] + \
                                xbrl_json_2021['StatementsOfIncome']['RevenueFromContractWithCustomerExcludingAssessedTax'] + \
                                xbrl_json_2022['StatementsOfIncome']['RevenueFromContractWithCustomerExcludingAssessedTax']

            all_revenues = pd.json_normalize(all_revenues_json)

            # explode segment list of dictionaries
            all_revenues = all_revenues.explode('segment')

            segment_split = all_revenues['segment'].apply(pd.Series)
            segment_split = segment_split.rename(columns={'dimension': 'segment.dimension', 'value': 'segment.value'})
            segment_split = segment_split.drop(0, axis=1)

            all_revenues = all_revenues.combine_first(segment_split)
            all_revenues = all_revenues.drop('segment', axis=1)
            all_revenues = all_revenues.reset_index(drop=True)

            all_revenues
            Out:
            decimalsperiod.endDateperiod.startDatesegment.dimensionsegment.valueunitRefvalue
            0-62016-12-312016-01-01NaNNaNusd90272000000
            1-62016-12-312016-01-01srt:ProductOrServiceAxisgoog:AdvertisingRevenueMemberusd79383000000
            2-62016-12-312016-01-01us-gaap:StatementBusinessSegmentsAxisgoog:GoogleInc.Memberusd79383000000
            3-62016-12-312016-01-01srt:ProductOrServiceAxisgoog:GoogleNetworkMembersPropertiesMemberusd15598000000
            4-62016-12-312016-01-01us-gaap:StatementBusinessSegmentsAxisgoog:GoogleInc.Memberusd15598000000
            ........................
            211-62021-12-312021-01-01us-gaap:ReclassificationOutOfAccumulatedOtherC...us-gaap:ReclassificationOutOfAccumulatedOtherC...usd165000000
            212-62021-12-312021-01-01us-gaap:StatementEquityComponentsAxisus-gaap:AccumulatedGainLossNetCashFlowHedgePar...usd165000000
            213-62022-12-312022-01-01us-gaap:DerivativeInstrumentRiskAxisus-gaap:ForeignExchangeContractMemberusd2046000000
            214-62022-12-312022-01-01us-gaap:ReclassificationOutOfAccumulatedOtherC...us-gaap:ReclassificationOutOfAccumulatedOtherC...usd2046000000
            215-62022-12-312022-01-01us-gaap:StatementEquityComponentsAxisus-gaap:AccumulatedGainLossNetCashFlowHedgePar...usd2046000000

            216 rows × 7 columns

            Next, we aim to gain insight into the data's structure and potential values to develop a strategy for the best extraction and visualization of the revenue metrics. To achieve this, we create a pivot table along the segment.dimension and segment.value columns, which enables us to identify all the potential values for each XBRL dimension.

            xbrl_dimensions = all_revenues.pivot(columns='segment.dimension', values='segment.value').dropna(how='all')

            xbrl_dimensions = pd.DataFrame([(col, xbrl_dimensions[col].unique()) for col in xbrl_dimensions.columns], \
                                         columns=['segment.dimension', 'segment.values'])

            xbrl_dimensions
            Out:
            segment.dimensionsegment.values
            0NaN[nan]
            1srt:ProductOrServiceAxis[goog:AdvertisingRevenueMember, nan, goog:Goog...
            2srt:StatementGeographicalAxis[nan, country:US, goog:AmericasExcludingUnited...
            3us-gaap:DerivativeInstrumentRiskAxis[nan, us-gaap:ForeignExchangeContractMember]
            4us-gaap:IncomeStatementLocationAxis[nan, us-gaap:SalesMember]
            5us-gaap:ReclassificationOutOfAccumulatedOtherC...[nan, us-gaap:ReclassificationOutOfAccumulated...
            6us-gaap:StatementBusinessSegmentsAxis[nan, goog:GoogleInc.Member, us-gaap:AllOtherS...
            7us-gaap:StatementEquityComponentsAxis[nan, us-gaap:AccumulatedGainLossNetCashFlowHe...
            # list(all_revenues['segment.value'].drop_duplicates())

            Based on the pivot table above, we can observe that the srt:StatementGeographicalAxis dimension has four different values, namely

            • country:US
            • goog:AmericasExcludingUnitedStatesMember
            • srt:AsiaPacificMember
            • us-gaap:EMEAMember

            To focus only on the revenue generated from Google's geographical regions, we filter all revenue GAAP items by srt:StatementGeographicalAxis and create a new pivot table with period.endDate (the revenue reporting years) as the index and segment.value (the regions) as the columns. Each cell in the pivot table represents the revenue generated by Google in a specific region during a particular year.

            all_revenues['value'] = all_revenues['value'].astype(int)

            mask = all_revenues['segment.dimension'] == 'srt:StatementGeographicalAxis'

            revenue_region = all_revenues[mask]

            revenue_region = revenue_region.drop_duplicates(subset=['period.endDate', 'segment.value'])

            # pivot the dataframe to create a new dataframe with period.endDate as the index,
            # segment.value as the columns, and value as the values
            revenue_region_pivot = revenue_region.pivot(index='period.endDate', columns='segment.value', values='value')


            print("Google's revenues by region from 2016 to 2022")
            print('---------------------------------------------')
            revenue_region_pivot
            Google's revenues by region from 2016 to 2022
            ---------------------------------------------
            Out:
            segment.valuecountry:USgoog:AmericasExcludingUnitedStatesMembersrt:AsiaPacificMemberus-gaap:EMEAMember
            period.endDate
            2016-12-314278100000046280000001255900000030304000000
            2017-12-315244900000061250000001623500000036046000000
            2018-12-316326900000076090000002137400000044567000000
            2019-12-317484300000089860000002692800000050645000000
            2020-12-318501400000094170000003255000000055370000000
            2021-12-31117854000000144040000004612300000079107000000
            2022-12-31134814000000169760000004702400000082062000000
            import matplotlib.pyplot as plt
            import matplotlib.ticker as ticker

            # plot the histogram bar chart
            ax = revenue_region_pivot.plot(kind='bar', stacked=True, figsize=(8, 6))

            # rotate the x-axis labels by 0 degrees
            plt.xticks(rotation=0)

            # set the title and labels for the chart
            ax.set_title("Google's Revenue by Region", fontsize=16, fontweight='bold')
            ax.set_xlabel('Period End Date', fontsize=12)
            ax.set_ylabel('Revenue (USD)', fontsize=12)

            # set the legend properties
            ax.legend(title='Product Category', loc='upper left', fontsize='small', title_fontsize=10)

            # add gridlines
            ax.grid(axis='y', linestyle='--', alpha=0.3)

            # remove the top and right spines
            ax.spines['top'].set_visible(False)
            ax.spines['right'].set_visible(False)

            # format y-axis ticks to show values in millions in dollars
            formatter = ticker.FuncFormatter(lambda x, pos: '$%1.0fB' % (x*1e-9))
            plt.gca().yaxis.set_major_formatter(formatter)

            # map the original labels to new labels
            label_map = {
                'country:US': 'US',
                'goog:AmericasExcludingUnitedStatesMember': 'Americas Excluding US',
                'srt:AsiaPacificMember': 'Asia Pacific',
                'us-gaap:EMEAMember': 'EMEA'
            }

            # create a list of new labels based on the original labels
            new_labels = [label_map[label] for label in sorted(revenue_region['segment.value'].unique())]
            handles, _ = ax.get_legend_handles_labels()
            plt.legend(handles=handles[::-1], labels=new_labels[::-1])

            # add the values in billions of dollars to each part of the bar
            for p in ax.containers:
                ax.bar_label(p, labels=['%.1f' % (v/1e9) for v in p.datavalues],
                             label_type='center', fontsize=8)

            plt.show()

            google-revenue-by-region

            Visualize Google's Revenue by Product Category from 2016 to 2022

            In this section, we will repeat most of the steps from the previous section but with a revenue item filter that only considers segment.dimension revenue items with srt:ProductOrServiceAxis or us-gaap:StatementBusinessSegmentsAxis as values since Google reported their cloud revenue in the us-gaap:StatementBusinessSegmentsAxis dimension, while it reported all other metrics such as YouTube advertising revenue under the srt:ProductOrServiceAxis dimension.

            To make it easier for you to compare the extracted values to the human-readable values in Google's 2022 10-K, we have added a screenshot of the financial metrics below.

            revenue-by-product-from-10k

            To facilitate a better understanding, we have provided a mapping between the human-readable labels and their corresponding XBRL element names, as the element names of the US GAAP items inside the XBRL data differ from the financial results table in the 10-K filing.

            | Human-readable metric | XBRL metric | -- | -- | | Google Search & other | GoogleSearchOtherMember | | YouTube ads | YouTubeAdvertisingRevenueMember | | Google Network | GoogleNetworkMembersPropertiesMember (2016-2019), GoogleNetworkMember (2020-2022) | | Google advertising | GoogleAdvertisingRevenueMember | | Google other | OtherRevenuesMember | | Google Cloud | GoogleCloudMember |

            GoogleAdvertisingRevenueMember = 
                GoogleSearchOtherMember
              + YouTubeAdvertisingRevenueMember
              + GoogleNetworkMembersPropertiesMember OR GoogleNetworkMember

            An inconsistency in the naming convention can be observed as the revenue reported for Google Network was previously referred to as GoogleNetworkMembersPropertiesMember from 2016 to 2019, which changed to GoogleNetworkMember since 2020. To address this inconsistency, we will merge the values of both items and create a new consolidated item named GoogleNetwork.

            all_revenues['value'] = all_revenues['value'].astype(int)

            mask_1 = all_revenues['segment.dimension'] == 'srt:ProductOrServiceAxis'
            mask_2 = all_revenues['segment.dimension'] == 'us-gaap:StatementBusinessSegmentsAxis'

            revenue_product = all_revenues[mask_1 | mask_2]

            revenue_product = revenue_product.drop_duplicates(subset=['period.endDate', 'segment.value'])

            # pivot the dataframe to create a new dataframe with period.endDate as the index,
            # segment.value as the columns, and value as the values
            revenue_product_pivot = revenue_product.pivot(index='period.endDate', columns='segment.value', values='value')


            print("Google's revenues by product from 2016 to 2022")
            print('-----------------------------------------------')
            revenue_product_pivot
            Google's revenues by product from 2016 to 2022
            -----------------------------------------------
            Out:
            segment.valuegoog:AdvertisingRevenueMembergoog:GoogleAdvertisingRevenueMembergoog:GoogleCloudMembergoog:GoogleInc.Membergoog:GoogleNetworkMembergoog:GoogleNetworkMembersPropertiesMembergoog:GooglePropertiesMembergoog:GoogleSearchOtherMembergoog:GoogleServicesMembergoog:OtherRevenuesMembergoog:YouTubeAdvertisingRevenueMemberus-gaap:AllOtherSegmentsMember
            period.endDate
            2016-12-317.938300e+10NaNNaN7.938300e+10NaN1.559800e+106.378500e+10NaNNaN1.060100e+10NaN2.880000e+08
            2017-12-319.537500e+109.557700e+104.056000e+099.537500e+10NaN1.758700e+107.778800e+106.981100e+10NaN1.500300e+108.150000e+094.770000e+08
            2018-12-311.163180e+111.164610e+115.838000e+091.163180e+11NaN1.998200e+109.633600e+108.529600e+10NaN1.990600e+101.115500e+105.950000e+08
            2019-12-31NaN1.348110e+118.918000e+099.811500e+10NaN2.154700e+101.132640e+119.811500e+10NaN1.701400e+101.514900e+106.590000e+08
            2020-12-31NaN1.469240e+111.305900e+10NaN2.309000e+10NaNNaN1.040620e+111.040620e+112.171100e+101.977200e+106.570000e+08
            2021-12-31NaN2.094970e+111.920600e+10NaN3.170100e+10NaNNaN1.489510e+111.489510e+112.803200e+102.884500e+107.530000e+08
            2022-12-31NaN2.244730e+112.628000e+10NaN3.278000e+10NaNNaN1.624500e+111.624500e+112.905500e+102.924300e+101.068000e+09
            # merge: goog:GoogleCloudMember + goog:GoogleNetworkMembersPropertiesMember
            revenue_product_pivot['goog:GoogleNetwork'] = revenue_product_pivot['goog:GoogleNetworkMember'].fillna(revenue_product_pivot['goog:GoogleNetworkMembersPropertiesMember'])
            revenue_product_pivot = revenue_product_pivot.drop(['goog:GoogleNetworkMember', 'goog:GoogleNetworkMembersPropertiesMember'], axis=1)

            revenue_product_pivot = revenue_product_pivot[['goog:GoogleSearchOtherMember',
                                                           'goog:YouTubeAdvertisingRevenueMember',
                                                           'goog:GoogleNetwork',
                                                           'goog:GoogleCloudMember',
                                                           'goog:OtherRevenuesMember']]

            # remove 2016 row
            revenue_product_pivot = revenue_product_pivot.iloc[1:]

            revenue_product_pivot
            Out:
            segment.valuegoog:GoogleSearchOtherMembergoog:YouTubeAdvertisingRevenueMembergoog:GoogleNetworkgoog:GoogleCloudMembergoog:OtherRevenuesMember
            period.endDate
            2017-12-316.981100e+108.150000e+091.758700e+104.056000e+091.500300e+10
            2018-12-318.529600e+101.115500e+101.998200e+105.838000e+091.990600e+10
            2019-12-319.811500e+101.514900e+102.154700e+108.918000e+091.701400e+10
            2020-12-311.040620e+111.977200e+102.309000e+101.305900e+102.171100e+10
            2021-12-311.489510e+112.884500e+103.170100e+101.920600e+102.803200e+10
            2022-12-311.624500e+112.924300e+103.278000e+102.628000e+102.905500e+10
            sorted(list(revenue_product_pivot.columns.unique()))
            Out:
            ['goog:GoogleCloudMember',
             'goog:GoogleNetwork',
             'goog:GoogleSearchOtherMember',
             'goog:OtherRevenuesMember',
             'goog:YouTubeAdvertisingRevenueMember']
            # plot the histogram bar chart
            ax = revenue_product_pivot.plot(kind='bar', stacked=True, figsize=(8, 6))

            # rotate the x-axis labels by 0 degrees
            plt.xticks(rotation=0)

            # set the title and labels for the chart
            ax.set_title("Google's Revenue by Product", fontsize=16, fontweight='bold')
            ax.set_xlabel('Period End Date', fontsize=12)
            ax.set_ylabel('Revenue (USD)', fontsize=12)

            # set the legend properties
            ax.legend(title='Product Category', loc='upper left', fontsize='small', title_fontsize=10)

            # add gridlines
            ax.grid(axis='y', linestyle='--', alpha=0.3)

            # remove the top and right spines
            ax.spines['top'].set_visible(False)
            ax.spines['right'].set_visible(False)

            # format y-axis ticks to show values in millions in dollars
            formatter = ticker.FuncFormatter(lambda x, pos: '$%1.0fB' % (x*1e-9))
            plt.gca().yaxis.set_major_formatter(formatter)

            # map the original labels to new labels
            label_map = {
                'goog:GoogleCloudMember': 'Google Cloud',
                'goog:GoogleNetwork': 'Google Network',
                'goog:YouTubeAdvertisingRevenueMember': 'YouTube Ads',
                'goog:GoogleSearchOtherMember': 'Google Search & other',
                'goog:OtherRevenuesMember': 'Google other',
            }

            # create a list of new labels based on the original labels
            new_labels = [label_map[label] for label in list(revenue_product_pivot.columns.unique())]
            handles, _ = ax.get_legend_handles_labels()
            plt.legend(handles=handles[::-1], labels=new_labels[::-1])

            # add the values in billions of dollars to each part of the bar
            for p in ax.containers:
                ax.bar_label(p, labels=['%.1f' % (v/1e9) for v in p.datavalues],
                             label_type='center', fontsize=8)

            plt.show()

            google-revenue-by-product

            Create a Figure with Two Revenue Charts: Region and Product

            We will now combine both charts into a single figure, presenting them side by side. To ensure that the x-axis is aligned (i.e., the years from 2017 to 2022), we will remove the 2016 data points from revenue_region_pivot, since we don't have reliable data in revenue_product_pivot for that year.

            fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=(12, 5))

            revenue_region_pivot.iloc[1:].plot(kind='bar', stacked=True, ax=ax1)
            revenue_product_pivot.plot(kind='bar', stacked=True, ax=ax2)

            # set the title and labels for the chart
            ax1.set_title("Google's Revenue by Region", fontsize=16, fontweight='bold')
            ax1.set_xlabel('Period End Date', fontsize=12)
            ax1.set_ylabel('Revenue (USD)', fontsize=12)
            ax2.set_title("Google's Revenue by Product", fontsize=16, fontweight='bold')
            ax2.set_xlabel('Period End Date', fontsize=12)

            # set the legend properties
            ax1.legend(title='Region', loc='upper left', fontsize='small', title_fontsize=10)
            ax2.legend(title='Product Category', loc='upper left', fontsize='small', title_fontsize=10)

            # add gridlines
            ax1.grid(axis='y', linestyle='--', alpha=0.3)
            ax2.grid(axis='y', linestyle='--', alpha=0.3)

            # remove the top and right spines
            ax1.spines['top'].set_visible(False)
            ax1.spines['right'].set_visible(False)
            ax2.spines['top'].set_visible(False)
            ax2.spines['right'].set_visible(False)

            # map the original labels to new labels
            label_map_1 = {
                'country:US': 'US',
                'goog:AmericasExcludingUnitedStatesMember': 'Americas Excluding US',
                'srt:AsiaPacificMember': 'Asia Pacific',
                'us-gaap:EMEAMember': 'EMEA'
            }
            label_map_2 = {
                'goog:GoogleCloudMember': 'Google Cloud',
                'goog:GoogleNetwork': 'Google Network',
                'goog:YouTubeAdvertisingRevenueMember': 'YouTube Ads',
                'goog:GoogleSearchOtherMember': 'Google Search & other',
                'goog:OtherRevenuesMember': 'Google other',
            }

            # create a list of new labels based on the original labels
            new_labels1 = [label_map_1[label] for label in list(revenue_region_pivot.columns.unique())]
            new_labels2 = [label_map_2[label] for label in list(revenue_product_pivot.columns.unique())]
            handles1, _ = ax1.get_legend_handles_labels()
            handles2, _ = ax2.get_legend_handles_labels()
            ax1.legend(handles=handles1[::-1], labels=new_labels1[::-1])
            ax2.legend(handles=handles2[::-1], labels=new_labels2[::-1])

            # add the values in billions of dollars to each part of the bar
            for p in ax1.containers:
                ax1.bar_label(p, labels=['%.1f' % (v/1e9) for v in p.datavalues],
                             label_type='center', fontsize=8)

            for p in ax2.containers:
                ax2.bar_label(p, labels=['%.1f' % (v/1e9) for v in p.datavalues],
                             label_type='center', fontsize=8)

            # format y-axis ticks to show values in millions in dollars
            formatter = ticker.FuncFormatter(lambda x, pos: '$%1.0fB' % (x*1e-9))
            ax1.yaxis.set_major_formatter(formatter)
            ax2.yaxis.set_major_formatter(formatter)

            # rotate the x-axis labels by 0 degrees
            ax1.tick_params(axis='x', labelrotation=30)
            ax2.tick_params(axis='x', labelrotation=30)

            plt.show()

            google-revenue-by-region-product

            Footer

            Products

            • EDGAR Filing Search API
            • Full-Text Search API
            • Real-Time Filing Stream API
            • Filing Download & PDF Generator API
            • XBRL-to-JSON Converter
            • 10-K/10-Q/8-K Item Extractor
            • Investment Adviser & Form ADV API
            • Insider Trading Data - Form 3, 4, 5
            • Restricted Sales Notifications - Form 144
            • Institutional Holdings - Form 13F
            • Form N-PORT API - Investment Company Holdings
            • Form N-PX API - Proxy Voting Records
            • Form 13D/13G API
            • Form S-1/424B4 - IPOs, Debt & Rights Offerings
            • Form C - Crowdfunding Offerings
            • Form D - Private Placements & Exempt Offerings
            • Regulation A Offering Statements API
            • Changes in Auditors & Accountants
            • Non-Reliance on Prior Financial Statements
            • Executive Compensation Data API
            • Directors & Board Members Data
            • Company Subsidiaries Database
            • Outstanding Shares & Public Float
            • SEC Enforcement Actions
            • Accounting & Auditing Enforcement Releases (AAERs)
            • SRO Filings
            • CIK, CUSIP, Ticker Mapping

            General

            • Pricing
            • Features
            • Supported Filings
            • EDGAR Filing Statistics

            Account

            • Sign Up - Start Free Trial
            • Log In
            • Forgot Password

            Developers

            • API Sandbox
            • Documentation
            • Resources & Tutorials
            • Python API SDK
            • Node.js API SDK

            Legal

            • Terms of Service
            • Privacy Policy

            Legal

            • Terms of Service
            • Privacy Policy

            SEC API

            © 2025 sec-api.io by Data2Value GmbH. All rights reserved.

            SEC® and EDGAR® are registered trademarks of the U.S. Securities and Exchange Commission (SEC).

            EDGAR is the Electronic Data Gathering, Analysis, and Retrieval system operated by the SEC.

            sec-api.io and Data2Value GmbH are independent of, and not affiliated with, sponsored by, or endorsed by the U.S. Securities and Exchange Commission.

            sec-api.io is classified under SIC code 7375 (Information Retrieval Services), providing on-demand access to structured data and online information services.