sec-api.ioSEC API by D2V
FilingsPricingSandboxDocs
Log inGet Free API Key
API Documentation
Introduction

Filing Query API
Full-Text Search API
Stream API
Download & PDF Generator API
XBRL-to-JSON Converter 
Extractor API 

Form ADV API - Investment Advisers

Form 3/4/5 API - Insider Trading
Form 144 API - Restricted Sales
Form 13F API - Institut. Holdings
Form 13D/13G API - Activist Invst.
Form N-PORT API - Mutual Funds

Form N-CEN API - Annual Reports
Form N-PX API - Proxy Voting

Form S-1/424B4 API - IPOs, Notes
Form C API - Crowdfunding
Form D API - Private Sec. Offerings
Form 1-A/1-K/1-Z - Reg A Offerings

Form 8-K API - Item 4.01
Form 8-K API - Item 4.02
Form 8-K API - Item 5.02
Overview
Example: Python

Executive Compensation API
Directors & Board Members
Company Subsidiaries
Outstanding Shares & Public Float

SEC Enforcement Actions
SEC Litigation Releases
SEC Administrative Proceedings
AAER Database API
SRO Filings Database

CIK, CUSIP, Ticker Mapping API
EDGAR Entities Database

Financial Statements

Analysis of Officer and Director Change Reports

Open In Colab   Download Notebook

On this page:
  • Data Loading and Preparation
    • Vizualization of Officer and Director Change Disclosures over Time
      • Type of Personnel Change
        • Appointed Positions
          • Compensation
            • Departures
              • Departures connected to disagreements
                • Amendments
                  • Organizational Changes
                    • Bonus Plans

                      We illustrate how to perform an exploratory data analysis on disclosures informing investors about changes of certain officers and directors of publicly traded companies on U.S. stock exchanges. These changes are disclosed in Form 8-K filings with the SEC, specifically under Item 5.02, titled "Departure of Directors or Certain Officers; Election of Directors; Appointment of Certain Officers: Compensatory Arrangements of Certain Officers." These disclosures are presented in text form by companies. Utilizing our Structured Data API, we extract and structure the relevant information from the text, making it available for detailed analysis.

                      Our analysis will focus on several key areas:

                      • Number of Item 5.02 disclosures made each year from 2004 to 2024, per quarter, month and at what time of the day (pre-market, regular market, after-market)
                      • Distribution of disclosures across structured data fields, such as the proportion of disclosures reporting appointments and departures.
                      • Appointments: number per year to certain position and statistics such as histograms of the age of the appointed person and the annual compensation.
                      • Departures: number per per year and their reasons and the rate if disagreements in connection with these departures.
                      • Amendments: distribution of amendments across term duration change, compensation changes and compensation types.
                      • Changes of the Board of Directors size
                      • Eligibility of bonus plans

                      Data Loading and Preparation

                      To load and prepare the dataset of over 250,000 structured data objects from Item 5.02 disclosures spanning 2004 to 2024, we utilize the Form 8-K Item 5.02 Structured Data API. The following code handles data loading and preparation by executing multiple download processes in parallel, significantly reducing downloading time. Once downloaded, all structured data objects are saved in JSONL format to ./form-8k-item-5-02-structured-data.jsonl, which will serve as the primary dataset for the analysis. Downloading the entire dataset may take up to 10 minutes.

                      import os
                      import json
                      import random
                      import time
                      import sys
                      import re
                      # from multiprocessing import Pool # use in .py files only
                      from concurrent.futures import ThreadPoolExecutor
                      import pandas as pd
                      import numpy as np
                      import matplotlib.pyplot as plt
                      import matplotlib.style as style
                      import matplotlib.ticker as mtick
                      import seaborn as sns

                      style.use("default")

                      params = {
                          "axes.labelsize": 8, "font.size": 8, "legend.fontsize": 8,
                          "xtick.labelsize": 8, "ytick.labelsize": 8, "font.family": "sans-serif",
                          "axes.spines.top": False, "axes.spines.right": False, "grid.color": "grey",
                          "axes.grid": True, "axes.grid.axis": "y", "grid.alpha": 0.5, "grid.linestyle": ":",
                      }

                      plt.rcParams.update(params)
                      !pip install sec-api
                      from sec_api import Form_8K_Item_X_Api

                      item_X_api = Form_8K_Item_X_Api("YOUR_API_KEY")

                      YEARS = range(2024, 2003, -1) # from 2024 to 2004
                      TEMP_FILE_TEMPLATE = "./temp_file_{}.jsonl"
                      TARGET_FILE = "./form-8k-item-5-02-structured-data.jsonl"


                      def process_year(year):
                          backoff_time = random.randint(10, 800) / 1000
                          print(f"Starting year {year} with backoff time {backoff_time:,}s"); sys.stdout.flush()
                          time.sleep(backoff_time)

                          tmp_filename = TEMP_FILE_TEMPLATE.format(year)
                          tmp_file = open(tmp_filename, "a")

                          for month in range(12, 0, -1):
                              search_from = 0
                              month_counter = 0

                              while True:
                                  query = f"item5_02:* AND filedAt:[{year}-{month:02d}-01 TO {year}-{month:02d}-31]"
                                  searchRequest = {
                                      "query": query,
                                      "from": search_from,
                                      "size": "50",
                                      "sort": [{"filedAt": {"order": "desc"}}],
                                  }

                                  response = None
                                  try:
                                      response = item_X_api.get_data(searchRequest)
                                  except Exception as e:
                                      print(f"{year}-{month:02d} error: {e}"); sys.stdout.flush()
                                      continue

                                  if response == None or len(response["data"]) == 0:
                                      break

                                  search_from += 50
                                  month_counter += len(response["data"])
                                  jsonl_data = "\n".join([json.dumps(entry) for entry in response["data"]])
                                  tmp_file.write(jsonl_data + "\n")

                              print(f"Finished loading {month_counter} Item 5.02 for {year}-{month:02d}")
                              sys.stdout.flush()

                          tmp_file.close()

                          return year



                      if not os.path.exists(TARGET_FILE):
                          with ThreadPoolExecutor(max_workers=4) as pool:
                              processed_years = list(pool.map(process_year, YEARS))
                          print("Finished processing all years.", processed_years)

                          # merge the temporary files into one final file
                          with open(TARGET_FILE, "a") as outfile:
                              for year in YEARS:
                                  temp_file = TEMP_FILE_TEMPLATE.format(year)
                                  if os.path.exists(temp_file):
                                      with open(temp_file, "r") as infile:
                                          outfile.write(infile.read())
                      else:
                          print("File already exists. Skipping download.")
                      File already exists. Skipping download.
                      structured_data = pd.read_json(TARGET_FILE, lines=True)

                      structured_data["filedAt"] = pd.to_datetime(structured_data["filedAt"], utc=True)
                      structured_data["filedAt"] = structured_data["filedAt"].dt.tz_convert("US/Eastern")
                      structured_data = structured_data.sort_values("filedAt", ascending=True).reset_index(drop=True)
                      structured_data.drop_duplicates("accessionNo", keep="first", inplace=True)
                      structured_data["year"] = structured_data["filedAt"].dt.year
                      structured_data["month"] = structured_data["filedAt"].dt.month
                      structured_data["dayOfWeek"] = structured_data["filedAt"].dt.day_name()
                      # filedAtClass: preMarket (4:00AM-9:30AM), regularMarket (9:30AM-4:00PM), afterMarket (4:00PM-8:00PM)
                      structured_data["filedAtClass"] = structured_data["filedAt"].apply(
                          lambda x: (
                              "preMarket"
                              if x.hour < 9 or (x.hour == 9 and x.minute < 30)
                              else (
                                  "regularMarket"
                                  if x.hour < 16
                                  else "afterMarket" if x.hour < 20 else "other"
                              )
                          )
                      )
                      # convert long-form of each item into item id only, e.g. "Item 4.02: ..." => "4.02"
                      structured_data["items"] = structured_data["items"].apply(
                          lambda items: [re.search(r"\d+\.\d+", x).group(0) if x else None for x in items]
                      )
                      # explode column "item5_02" into multiple columns
                      # where each column is a key-value pair of the JSON object
                      # and drop all structured data columns for items, eg "item4_01"
                      item_cols = list(
                          structured_data.columns[
                              structured_data.columns.str.contains(r"item\d+_", case=False)
                          ]
                      )
                      structured_data = pd.concat(
                          [
                              structured_data.drop(item_cols, axis=1),
                              structured_data["item5_02"].apply(pd.Series),
                          ],
                          axis=1,
                      )
                      unique_years = structured_data["year"].nunique()
                      unique_companies = structured_data["cik"].nunique()
                      unique_filings = structured_data["accessionNo"].nunique()
                      min_year = structured_data["year"].min()
                      max_year = structured_data["year"].max()
                      print("Loaded dataframe with structured personnel change data from Form 8-K Item 5.02 filings")
                      print(f"Number of filings: {unique_filings:,}")
                      print(f"Number of records: {len(structured_data):,}")
                      print(f"Number of years: {unique_years:,} ({min_year}-{max_year})")
                      print(f"Number of unique companies: {unique_companies:,}")

                      structured_data.head()
                      Loaded dataframe with structured personnel change data from Form 8-K Item 5.02 filings
                      Number of filings: 250,956
                      Number of records: 250,956
                      Number of years: 21 (2004-2024)
                      Number of unique companies: 18,056
                      Out[6]:
                      idaccessionNoformTypefiledAtperiodOfReportciktickercompanyNameitemsyearmonthdayOfWeekfiledAtClasskeyComponentspersonnelChangesattachmentsorganizationChangesbonusPlans
                      01f87b5f2eb5ebcbe4a33a3b7a62daf640001125282-04-0041098-K2004-08-23 09:51:12-04:002004-08-231022671STLDSTEEL DYNAMICS INC[5.02]20048MondayregularMarketTracy L. Shellabarger resigned as CFO to pursu...[{'type': 'departure', 'departureType': 'resig...NaNNaNNaN
                      134f998c74cf305575e1c99af886877410001104659-04-0252828-K2004-08-23 11:31:33-04:002004-08-231038363NaNMETALS USA INC[5.02]20048MondayregularMarketMetals USA announced the retirement of E. L. (...[{'type': 'departure', 'departureType': 'retir...[Text of Press Release Dated August 16, 2004]NaNNaN
                      2aaf76daa3dbf9e814ba40d9b220a75c40000899715-04-0001648-K2004-08-23 12:35:47-04:002004-08-23899715SKTTANGER FACTORY OUTLET CENTERS INC[5.02]20048MondayregularMarketThe Board of Directors expanded from five to s...[{'type': 'appointment', 'effectiveDate': '200...[Press release announcing expansion of the Com...{'organ': 'Board of Directors', 'details': 'Ex...NaN
                      306f7d324ce85534dc7fc2cc0393db1530001019056-04-0011278-K2004-08-23 13:16:24-04:002004-08-23870228SYBRSYNERGY BRANDS INC[5.02]20048MondayregularMarketMichael Ferrone resigned as a director on the ...[{'type': 'departure', 'departureType': 'resig...NaNNaNNaN
                      405035be548ee5e117923f7b6ca1030d10001104659-04-0253258-K2004-08-23 16:14:05-04:002004-08-171003214SIMGSILICON IMAGE INC[1.01, 1.02, 5.02, 9.01]20048MondayafterMarketDale Brown was appointed as the Chief Accounti...[{'type': 'appointment', 'effectiveDate': '200...[Amended and Restated Employment Agreement, da...NaN[{'specificIndividuals': True, 'eligibleIndivi...

                      Vizualization of Officer and Director Change Disclosures over Time

                      item_5_02_counts = (
                          structured_data.drop_duplicates(subset=["accessionNo"])
                          .groupby(["year"])
                          .size()
                          .to_frame(name="count")
                      )

                      print(f"Item 5.02 counts from 2004 to 2024.")
                      item_5_02_counts.T
                      Item 5.02 counts from 2004 to 2024.
                      Out[7]:
                      year2004200520062007200820092010201120122013...2015201620172018201920202021202220232024
                      count298898051144117137175891456613294132371283212586...12611116181135311171109761072111456113421104410279

                      1 rows × 21 columns

                      def plot_timeseries(ts, title):
                          fig, ax = plt.subplots(figsize=(4, 2.5))
                          ts["count"].plot(ax=ax, legend=False)
                          ax.set_title(title)
                          ax.set_xlabel("Year")
                          ax.set_ylabel("Number of\nItem 5.02 Filings")
                          ax.set_xticks(np.arange(2004, 2025, 2))
                          ax.yaxis.set_major_formatter(mtick.StrMethodFormatter("{x:,.0f}"))
                          ax.set_xlim(2003, 2025)
                          ax.grid(axis="x")
                          ax.set_axisbelow(True)
                          plt.xticks(rotation=45, ha="right")

                          for year in range(2004, 2025, 1):
                              year_y_max = ts.loc[year, "count"]
                              ax.vlines(year, 0, year_y_max, linestyles=":", colors="grey", alpha=0.5, lw=1)

                          plt.tight_layout()
                          plt.show()


                      plot_timeseries(
                          item_5_02_counts,
                          title="Item 5.02 Disclosures per Year (2004-2024)",
                      )
                      structured_data["qtr"] = structured_data["month"].apply(lambda x: (x - 1) // 3 + 1)

                      counts_qtr_yr_piv = (
                          structured_data.groupby(["year", "qtr"]).size().unstack().fillna(0)
                      ).astype(int)

                      print(f"Item 5.02 counts by quarter from 2004 to 2024.")
                      counts_qtr_yr_piv.T
                      Item 5.02 counts by quarter from 2004 to 2024.
                      Out[9]:
                      year2004200520062007200820092010201120122013...2015201620172018201920202021202220232024
                      qtr
                      10239727374772508845733837393937363543...3548338232533164307129433109327930842794
                      20251325754398459136143472340133283329...3305305830352881294828312947299529202833
                      3736241625073831371729742911292027322683...2832252923952452246624162623253424462237
                      42252247936224136419334053074297730363031...2926264926702674249125312777253425942415

                      4 rows × 21 columns

                      plt.figure(figsize=(8, 2))
                      sns.heatmap(
                          counts_qtr_yr_piv.T,
                          annot=True, # Display the cell values
                          fmt="d", # Integer formatting
                          cmap="magma", # Color map
                          cbar_kws={"label": "Count"}, # Colorbar label
                          mask=counts_qtr_yr_piv.T == 0, # Mask the cells with value 0
                          cbar=False,
                          annot_kws={"fontsize": 7},
                      )
                      plt.grid(False)
                      plt.title("Item 5.02 Counts by Quarter (2004-2024)")
                      plt.xlabel("Year")
                      plt.ylabel("Quarter")
                      plt.tight_layout()
                      plt.show()
                      counts_qtr_yr = counts_qtr_yr_piv.stack().reset_index(name="count")

                      fig, ax = plt.subplots(figsize=(6, 2.5))
                      counts_qtr_yr_piv.plot(kind="bar", ax=ax, legend=True)
                      ax.legend(title="Quarter", loc="upper right", bbox_to_anchor=(1.15, 1))
                      ax.set_title("Number of Item 5.02 Disclosures per Quarter\n(2004 - 2024)")
                      ax.set_xlabel("Year")
                      ax.set_ylabel("Number of\nItem 5.02 Filings")
                      ax.yaxis.set_major_formatter(mtick.StrMethodFormatter("{x:,.0f}"))
                      ax.grid(axis="x")
                      ax.set_axisbelow(True)
                      plt.tight_layout()
                      plt.show()
                      counts_month_yr_piv = (
                          structured_data.groupby(["year", "month"]).size().unstack().fillna(0)
                      ).astype(int)
                      plt.figure(figsize=(6, 4))
                      sns.heatmap(
                          counts_month_yr_piv,
                          annot=True,
                          fmt="d",
                          cmap="magma",
                          cbar_kws={"label": "Count"},
                          mask=counts_month_yr_piv == 0,
                          cbar=False,
                          annot_kws={"size": 7},
                      )
                      # convert x-labels to month names: 1 => Jan, 2 => Feb, etc.
                      plt.xticks(
                          ticks=np.arange(0.5, 12.5, 1),
                          labels=[pd.to_datetime(str(i), format="%m").strftime("%b") for i in range(1, 13)],
                      )
                      plt.grid(False)
                      plt.title("Item 5.02 Counts by Month (2004-2024)")
                      plt.xlabel("")
                      plt.ylabel("Year")
                      plt.tight_layout()
                      plt.show()
                      print(f"Descriptive statistics for Item 5.02 counts by month from 2004 to 2024.")
                      month_stats = (
                          counts_month_yr_piv.loc[2004:]
                          .describe(percentiles=[0.025, 0.975])
                          .round(0)
                          .astype(int)
                      )
                      month_stats
                      Descriptive statistics for Item 5.02 counts by month from 2004 to 2024.
                      Out[14]:
                      month123456789101112
                      count212121212121212121212121
                      mean10301125116797411139808609048669009361096
                      std320363341291326265253229135168151234
                      min0000000137599699751782
                      2.5%360408430404423401366458643713760834
                      50%100611021190949113410028399088408729101019
                      97.5%152617411666149016511353130013191156131812601626
                      max154018531695154516891357130114331211134313341748
                      def plot_box_plot_as_line(
                          data: pd.DataFrame,
                          x_months=True,
                          title="",
                          x_label="",
                          x_pos_mean_label=2,
                          pos_labels=None,
                          pos_high_low=None,
                          y_label="",
                          y_formatter=lambda x, p: "{:.0f}".format(int(x) / 1000),
                          show_high_low_labels=True,
                          show_inline_labels=True,
                          show_bands=True,
                          figsize=(4, 2.5),
                          line_source="mean",
                      ):
                          fig, ax = plt.subplots(figsize=figsize)

                          line_to_plot = data[line_source]
                          lower_label = "2.5%"
                          upper_label = "97.5%"
                          lower = data[lower_label]
                          upper = data[upper_label]

                          line_to_plot.plot(ax=ax)

                          if show_bands:
                              ax.fill_between(line_to_plot.index, lower, upper, alpha=0.2)

                          if x_months:
                              ax.set_xlim(0.5, 12.5)
                              ax.set_xticks(range(1, 13))
                              ax.set_xticklabels(["J", "F", "M", "A", "M", "J", "J", "A", "S", "O", "N", "D"])

                          ax.yaxis.set_major_formatter(mtick.FuncFormatter(y_formatter))
                          ax.set_ylabel(y_label)
                          ax.set_xlabel(x_label)
                          ax.set_title(title)

                          ymin, ymax = ax.get_ylim()
                          y_scale = ymax - ymin

                          max_x = int(line_to_plot.idxmax())
                          max_y = line_to_plot.max()
                          min_x = int(line_to_plot.idxmin())
                          min_y = line_to_plot.min()

                          ax.axvline(
                              max_x,
                              ymin=0,
                              ymax=((max_y - ymin) / (ymax - ymin)),
                              linestyle="dashed",
                              color="tab:blue",
                              alpha=0.5,
                          )
                          ax.scatter(max_x, max_y, color="tab:blue", s=10)
                          ax.axvline(
                              min_x,
                              ymin=0,
                              ymax=((min_y - ymin) / (ymax - ymin)),
                              linestyle="dashed",
                              color="tab:blue",
                              alpha=0.5,
                          )
                          ax.scatter(min_x, min_y, color="tab:blue", s=10)

                          x_pos_mean_label_int = int(x_pos_mean_label)
                          if show_inline_labels:
                              mean_x = x_pos_mean_label
                              mean_y = line_to_plot.iloc[x_pos_mean_label_int] * 1.02
                              upper_x = x_pos_mean_label
                              upper_y = upper.iloc[x_pos_mean_label_int]
                              lower_x = x_pos_mean_label
                              lower_y = lower.iloc[x_pos_mean_label_int] * 0.95

                              if pos_labels:
                                  mean_x = pos_labels["mean"]["x"]
                                  mean_y = pos_labels["mean"]["y"]
                                  upper_x = pos_labels["upper"]["x"]
                                  upper_y = pos_labels["upper"]["y"]
                                  lower_x = pos_labels["lower"]["x"]
                                  lower_y = pos_labels["lower"]["y"]

                              ax.text(mean_x, mean_y, "Mean", color="tab:blue", fontsize=8)
                              ax.text(upper_x, upper_y, upper_label, color="tab:blue", fontsize=8)
                              ax.text(lower_x, lower_y, lower_label, color="tab:blue", fontsize=8)

                          if show_high_low_labels:
                              high_x_origin = max_x
                              high_y_origin = max_y
                              high_x_label = high_x_origin + 0.5
                              high_y_label = high_y_origin + 0.1 * y_scale
                              if pos_high_low:
                                  high_x_label = pos_high_low["high"]["x"]
                                  high_y_label = pos_high_low["high"]["y"]
                              ax.annotate(
                                  "High",
                                  (high_x_origin, high_y_origin),
                                  xytext=(high_x_label, high_y_label),
                                  arrowprops=dict(facecolor="black", arrowstyle="->"),
                              )

                              low_x_origin = min_x * 1.01
                              low_y_origin = min_y
                              low_x_label = low_x_origin + 1.5
                              low_y_label = low_y_origin - 0.1 * y_scale
                              if pos_high_low:
                                  low_x_label = pos_high_low["low"]["x"]
                                  low_y_label = pos_high_low["low"]["y"]
                              ax.annotate(
                                  "Low",
                                  (low_x_origin, low_y_origin),
                                  xytext=(low_x_label, low_y_label),
                                  arrowprops=dict(facecolor="black", arrowstyle="->"),
                              )

                          ax.grid(axis="x")
                          ax.set_axisbelow(True)

                          plt.tight_layout()
                          plt.show()


                      plot_box_plot_as_line(
                          data=month_stats.T,
                          title="Descriptive Statistics for Item 5.02 Filings by Month\n(2005 - 2024)",
                          x_label="Month",
                          y_label="Number of\nItem 5.02 Filings",
                          y_formatter=lambda x, p: "{:.0f}".format(int(x)),
                          x_pos_mean_label=5,
                      )
                      fig, ax = plt.subplots(figsize=(3.5, 3))

                      counts_month_yr_piv.loc[2005:].boxplot(
                          ax=ax,
                          grid=False,
                          showfliers=False,
                          flierprops=dict(marker="o", markersize=3),
                          patch_artist=True,
                          boxprops=dict(facecolor="white", color="tab:blue"),
                          showmeans=True,
                          meanline=True,
                          meanprops={"color": "tab:blue", "linestyle": ":"},
                          medianprops={"color": "black"},
                          capprops={"color": "none"},
                      )

                      ax.set_title("Item 5.02 Filings by Month\n(2005 - 2024)")
                      ax.set_xlabel("")
                      ax.set_ylabel("Item 5.02 Count")
                      xticklabels = [pd.to_datetime(str(x), format="%m").strftime("%b") for x in range(1, 13)]
                      ax.set_xticklabels(xticklabels)
                      plt.xticks(rotation=45)
                      plt.tight_layout()
                      plt.show()
                      counts_filedAtClass = (
                          structured_data.drop_duplicates(subset=["accessionNo"])
                          .groupby(["filedAtClass"])
                          .size()
                          .sort_values(ascending=False)
                          .to_frame(name="Count")
                      ).rename_axis("Publication Time")
                      counts_filedAtClass["Pct"] = (
                          counts_filedAtClass["Count"].astype(int)
                          / counts_filedAtClass["Count"].astype(int).sum()
                      ).map("{:.0%}".format)
                      counts_filedAtClass["Count"] = counts_filedAtClass["Count"].map(lambda x: f"{x:,}")
                      counts_filedAtClass.index = (
                          counts_filedAtClass.index.str.replace("preMarket", "Pre-Market (4:00 - 9:30 AM)")
                          .str.replace("marketHours", "Market Hours (9:30 AM - 4:00 PM)")
                          .str.replace("afterMarket", "After Market (4:00 - 8:00 PM)")
                      )
                      counts_filedAtClass = counts_filedAtClass.reindex(counts_filedAtClass.index[::-1])

                      print(
                          f"Item 5.02 counts by pre-market, regular market hours,\nand after-market publication time (2004 - 2024)."
                      )
                      counts_filedAtClass
                      Item 5.02 counts by pre-market, regular market hours,
                      and after-market publication time (2004 - 2024).
                      Out[16]:
                      CountPct
                      Publication Time
                      other4,2202%
                      Pre-Market (4:00 - 9:30 AM)31,00512%
                      regularMarket67,42427%
                      After Market (4:00 - 8:00 PM)148,30759%
                      counts_dayOfWeek = (
                          structured_data.drop_duplicates(subset=["accessionNo"])
                          .groupby(["dayOfWeek"])
                          .size()
                          .to_frame(name="Count")
                      ).rename_axis("Day of the Week")
                      counts_dayOfWeek["Pct"] = (
                          counts_dayOfWeek["Count"].astype(int) / counts_dayOfWeek["Count"].astype(int).sum()
                      ).map("{:.0%}".format)
                      counts_dayOfWeek["Count"] = counts_dayOfWeek["Count"].map(lambda x: f"{x:,}")

                      print(f"Item 5.02 disclosures by day of the week (2004 - 2024).")
                      counts_dayOfWeek.loc[["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]]
                      Item 5.02 disclosures by day of the week (2004 - 2024).
                      Out[17]:
                      CountPct
                      Day of the Week
                      Monday46,84019%
                      Tuesday50,63420%
                      Wednesday47,35119%
                      Thursday50,74920%
                      Friday55,38022%

                      Type of Personnel Change

                      personnel_change_types = (
                          structured_data[["year", "accessionNo", "personnelChanges"]]
                          .explode(column="personnelChanges")
                          .reset_index(drop=True)
                      )
                      personnel_change_types = pd.concat(
                          [
                              personnel_change_types.drop(columns="personnelChanges"),
                              pd.json_normalize(personnel_change_types["personnelChanges"]),
                          ],
                          axis=1,
                      )
                      print(f"{len(personnel_change_types):,} personnel changes loaded")
                      personnel_change_types.head()
                      474,524 personnel changes loaded
                      Out[161]:
                      yearaccessionNotypedepartureTypepositionsreasoncontinuedConsultingRoledisagreementsperson.nameinterim...compensationIncreasedcompensationDecreasedcompensation.onetimecompensation.noCompensationperson.positionsAtOtherCompaniestermEndDateconsultingEndDateperson.familyRelationshipsamendmentSummaryoldTermEndDate
                      020040001125282-04-004109departureresignation[Chief Financial Officer]Personal considerations to pursue other busine...TrueFalseTracy L. ShellabargerNaN...NaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
                      120040001125282-04-004109appointmentNaN[Acting Chief Financial Officer]NaNNaNNaNTheresa E. WaglerTrue...NaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
                      220040001104659-04-025282departureretirement[Senior Vice President, President, Building Pr...NaNNaNNaNE. L. (Tom) ThompsonNaN...NaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
                      320040001104659-04-025282appointmentNaN[Senior Vice President, President, Building Pr...NaNNaNNaNRobert C. McPherson IIINaN...NaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
                      420040000899715-04-000164appointmentNaN[Board of Directors]NaNNaNNaNAllan L. SchumanNaN...NaNNaNNaNNaNNaNNaNNaNNaNNaNNaN

                      5 rows × 32 columns

                      changeType = personnel_change_types["type"].value_counts().to_frame()
                      changeType.index.name = "Personnel Change Type"
                      changeType.columns = ["Count"]
                      changeType["Pct."] = changeType["Count"] / changeType["Count"].sum() * 100
                      changeType["Pct."] = changeType["Pct."].round(1)
                      changeType["Count"] = changeType["Count"].apply(lambda x: f"{x:,.0f}")

                      print(f"Types of personnel change if stated in the Item 5.02 filings (2004 - 2024):")
                      changeType
                      Types of personnel change if stated in the Item 5.02 filings (2004 - 2024):
                      Out[62]:
                      CountPct.
                      Personnel Change Type
                      appointment210,15647.5
                      departure178,77740.4
                      amendment41,9379.5
                      bonus8,9832.0
                      nomination1,9080.4
                      refusal2190.0
                      change_type_year = personnel_change_types[["type", "year", "accessionNo"]]
                      changeType_year_pivot = pd.pivot_table(
                          change_type_year,
                          index="type",
                          columns="year",
                          values="accessionNo",
                          aggfunc="count",
                          fill_value=0,
                      )

                      changeType_year_pivot["total"] = changeType_year_pivot.sum(axis=1)
                      changeType_year_pivot = changeType_year_pivot.sort_values(by="total", ascending=False)
                      # remove artifacts
                      changeType_year_pivot = changeType_year_pivot[changeType_year_pivot["total"] >= 18]

                      changeType_year_pivot
                      Out[63]:
                      year2004200520062007200820092010201120122013...201620172018201920202021202220232024total
                      type
                      appointment3132100891146613076128411055210678107181036510075...9603930694559390897511357970893568551210156
                      departure227377318578112111168798009031924789018731...847282438169817374647690851889208170178777
                      amendment351428353860448133202722259324382262...19431893171515712006155715791310134241937
                      bonus001911085822637658607548483...4894243342493082872952332638983
                      nomination16105114128173121936690116...7784979493906250521908
                      refusal5232416162127611...1181012521237219

                      6 rows × 22 columns

                      fig, ax = plt.subplots(figsize=(5, 3))

                      changeType_year_pivot.head(5).drop(columns="total").T.plot(
                          kind="bar", stacked=True, ax=ax
                      )

                      ax.set_title("Top 5 number of Item 5.02 Filings\nby Personnel Change Type and Year")
                      ax.set_xlabel("Year")
                      ax.set_ylabel("Number of Changes")
                      ax.xaxis.grid(False)
                      ax.set_axisbelow(True)
                      ax.yaxis.set_major_formatter(mtick.StrMethodFormatter("{x:,.0f}"))
                      handles, labels = ax.get_legend_handles_labels() # reverse order of legend items

                      ax.legend(
                          [h for h in reversed(handles)],
                          [l for l in reversed(labels)],
                          title="Auditor",
                          bbox_to_anchor=(1.05, 1),
                          labelspacing=0.3,
                          fontsize=8,
                      )

                      plt.tight_layout()
                      plt.show()

                      Appointed Positions

                      appointments = personnel_change_types[
                          personnel_change_types["type"] == "appointment"
                      ].copy()

                      ceo_patterns = ["ceo", "chief executive officer"]
                      # check if one of these pattersn is in the array in the 'positions' column, set values to NaN if there is an error
                      appointments.loc[:, "isCEO"] = appointments["positions"].apply(
                          lambda x: (
                              any([pattern in str(x).lower() for pattern in ceo_patterns])
                              if isinstance(x, list)
                              else False
                          )
                      )
                      cfo_patterns = ["cfo", "chief financial officer"]
                      appointments.loc[:, "isCFO"] = appointments["positions"].apply(
                          lambda x: (
                              any([pattern in str(x).lower() for pattern in cfo_patterns])
                              if isinstance(x, list)
                              else False
                          )
                      )
                      board_patterns = ["board", "director", "chairman"]
                      appointments.loc[:, "isBoardMember"] = appointments["positions"].apply(
                          lambda x: (
                              any([pattern in str(x).lower() for pattern in board_patterns])
                              if isinstance(x, list)
                              else False
                          )
                      )
                      appointments.loc[:, "isOther"] = ~appointments["isCEO"] & ~appointments["isBoardMember"]
                      # assign roles based on conditions
                      appointments.loc[:, "role"] = appointments.apply(
                          lambda row: (
                              "CEO" if row["isCEO"] else ("CFO" if row["isCFO"] else ("Board Member" if row["isBoardMember"] else "Other"))
                          ),
                          axis=1,
                      )
                      appointments_role_year = appointments.groupby(["role", "year"]).size().unstack().fillna(0)
                      appointments_role_year.loc["total"] = appointments_role_year.sum()
                      fig, ax = plt.subplots(figsize=(5, 3))

                      appointments_role_year.T.drop(columns="total").plot(kind="bar", stacked=True, ax=ax)

                      ax.set_title("Types of Appointments in 5.02 Filings")
                      ax.set_xlabel("Year")
                      ax.set_ylabel("Number of Appointments")
                      ax.xaxis.grid(False)
                      ax.set_axisbelow(True)
                      ax.yaxis.set_major_formatter(mtick.StrMethodFormatter("{x:,.0f}"))
                      handles, labels = ax.get_legend_handles_labels() # reverse order of legend items

                      ax.legend(
                          [h for h in reversed(handles)],
                          [l for l in reversed(labels)],
                          title="Appointed as",
                          bbox_to_anchor=(1.05, 1),
                          labelspacing=0.3,
                          fontsize=8,
                      )

                      plt.tight_layout()
                      plt.show()

                      Compensation

                      In this section we analyze the compensation for new appointments including

                      • one-time payments
                      • annual cash values
                      • equity compensation
                      appointments = personnel_change_types[personnel_change_types["type"] == "appointment"]

                      fig, ax = plt.subplots(figsize=(4, 2.5))
                      ax.hist(appointments["person.age"].dropna(), bins=20, edgecolor="darkgrey")
                      ax.set_title("Histogram of Ages of Persons Appointed")
                      ax.set_xlabel("Age")
                      ax.set_ylabel("Number of Persons")
                      ax.grid(axis="y")
                      ax.set_axisbelow(True)
                      ax.yaxis.set_major_formatter(mtick.StrMethodFormatter("{x:,.0f}"))
                      plt.tight_layout()
                      plt.show()

                      # exclude outliers below 18 and above 100
                      appointments = appointments[
                          (appointments["person.age"] >= 18) & (appointments["person.age"] < 100)
                      ]

                      mean_age = appointments["person.age"].mean()
                      min_age = appointments["person.age"].min()
                      max_age = appointments["person.age"].max()
                      median_age = appointments["person.age"].median()

                      print(f"Average Age: {mean_age:.2f}")
                      print(f"Min Age: {min_age:.2f}")
                      print(f"Max Age: {max_age:.2f}")
                      print(f"Median Age: {median_age:.2f}")
                      Average Age: 50.51
                      Min Age: 20.00
                      Max Age: 93.00
                      Median Age: 50.00
                      fig, ax = plt.subplots(figsize=(4, 2.5))

                      # convert the amounts to numbers, stripping dollar signs
                      compensation_annual = pd.to_numeric(
                          appointments["compensation.annual"].str.replace(r"[\$,]", "", regex=True),
                          errors="coerce",
                      ).dropna()

                      ax.hist(compensation_annual, bins=20, edgecolor="darkgrey")
                      ax.set_title("Histogram of Annual Compensation of Persons Appointed")
                      ax.set_xlabel("Annual Compensation (USD in Millions)")
                      ax.set_ylabel("Number of Persons")
                      ax.set_yscale("log")
                      ax.grid(axis="y")
                      ax.set_axisbelow(True)
                      ax.xaxis.set_major_formatter(mtick.FuncFormatter(lambda x, p: f"{x/1_000_000:,.0f}"))
                      ax.yaxis.set_major_formatter(mtick.StrMethodFormatter("{x:,.0f}"))
                      plt.tight_layout()
                      plt.show()

                      # exclude outliers below 0 and above 1e7
                      # appointments = appointments[(appointments["compensation.annual"] >= 0) & (appointments["compensation.annual"] < 1e7)]

                      mean_compensation = compensation_annual.mean()
                      min_compensation = compensation_annual.min()
                      max_compensation = compensation_annual.max()
                      median_compensation = compensation_annual.median()

                      print(f"Average Compensation: {mean_compensation:,.2f}")
                      print(f"Min Compensation: {min_compensation:,.2f}")
                      print(f"Max Compensation: {max_compensation:,.2f}")
                      print(f"Median Compensation: {median_compensation:,.2f}")
                      Average Compensation: 383,013.58
                      Min Compensation: 0.00
                      Max Compensation: 12,000,000.00
                      Median Compensation: 330,000.00
                      # check how many compensations have an equity component
                      numEquityComponent = appointments["compensation.equity"].notnull().sum()

                      print(
                          f"Appointments with including equity component in the Item 5.02 filings (2004 - 2024):"
                      )
                      print(
                          f"Number of appointments with an equity component: {numEquityComponent:,} = {numEquityComponent / len(appointments):.1%} of all appointments."
                      )
                      Appointments with including equity component in the Item 5.02 filings (2004 - 2024):
                      Number of appointments with an equity component: 20,845 = 30.1% of all appointments.

                      Departures

                      departures = personnel_change_types[personnel_change_types["type"] == "departure"]
                      departureType_year = personnel_change_types[
                          ["departureType", "year", "accessionNo", "disagreements"]
                      ].explode("departureType")

                      departureType_year_pivot = pd.pivot_table(
                          departureType_year,
                          index="departureType",
                          columns="year",
                          values="accessionNo",
                          aggfunc="count",
                          fill_value=0,
                      )

                      departureType_year_pivot["total"] = departureType_year_pivot.sum(axis=1)
                      departureType_year_pivot = departureType_year_pivot.sort_values(
                          by="total", ascending=False
                      )

                      departureType_year_pivot
                      Out[99]:
                      year2004200520062007200820092010201120122013...201620172018201920202021202220232024total
                      departureType
                      resignation1578530456707297785564385833589855745287...470445954348430339893956424846224256106144
                      other314109813841840168315291642163714521645...19171823190819821803193723772200204235805
                      retirement28299610871381136310751011113212361260...12661270140014031217137213381368126525347
                      termination68208293500611619390404475397...4093723643193122293555664288052

                      4 rows × 22 columns

                      fig, ax = plt.subplots(figsize=(5, 3))

                      departureType_year_pivot.drop(columns="total").T.plot(kind="bar", stacked=True, ax=ax)

                      ax.set_title("Departures in 5.02 Filings\nby Departure Type and Year")
                      ax.set_xlabel("Year")
                      ax.set_ylabel("Number of Departures")
                      ax.xaxis.grid(False)
                      ax.set_axisbelow(True)
                      ax.yaxis.set_major_formatter(mtick.StrMethodFormatter("{x:,.0f}"))
                      handles, labels = ax.get_legend_handles_labels() # reverse order of legend items

                      ax.legend(
                          [h for h in reversed(handles)],
                          [l for l in reversed(labels)],
                          title="Departure Type",
                          bbox_to_anchor=(1.05, 1),
                          labelspacing=0.3,
                          fontsize=8,
                      )

                      plt.tight_layout()
                      plt.show()
                      percentag_departure_pivot = departureType_year_pivot.copy()
                      percentag_departure_pivot.loc["total"] = percentag_departure_pivot.sum(axis=0)
                      percentag_departure_pivot = (
                          percentag_departure_pivot.div(percentag_departure_pivot.loc["total"], axis=1) * 100
                      )

                      fig, ax = plt.subplots(figsize=(4, 2.5))
                      percentag_departure_pivot.drop("total").T.plot(kind="line", stacked=False, ax=ax)
                      ax.legend(title="Departure Type")
                      ax.set_ylabel("Percentage of Departures")
                      ax.set_title("Percentage of Departures in 5.02 Filings\nby Departure Type and Year")
                      plt.xticks(rotation=45, ha="right")
                      plt.tight_layout()
                      plt.show()

                      Departures connected to disagreements

                      disagreements_year_pivot = departures.pivot_table(
                          # departures,
                          index="year",
                          values=["disagreements"],
                          aggfunc=lambda x: (x == True).sum(),
                          fill_value=0,
                      )

                      departures_year_pivot = departures.pivot_table(
                          index="year",
                          values=["accessionNo"],
                          aggfunc="count",
                          fill_value=0,
                      )

                      disagreements_year_pivot = disagreements_year_pivot.T
                      disagreements_year_pivot["total"] = disagreements_year_pivot.sum(axis=1)
                      disagreements_year_pivot = disagreements_year_pivot.sort_values(
                          by="total", ascending=False
                      )
                      disagreements_year_pivot
                      Out[108]:
                      year2004200520062007200820092010201120122013...201620172018201920202021202220232024total
                      disagreements256184871039081546148...6557443245177046481212

                      1 rows × 22 columns

                      fig, ax = plt.subplots(figsize=(5, 3))

                      disagreements_year_pivot.T.drop("total").plot(kind="bar", stacked=False, ax=ax)

                      ax.set_title(
                          "Number of Departures in Item 5.02 Filings\nwith Disclosed Disagreements by Year"
                      )
                      ax.set_xlabel("Year")
                      ax.set_ylabel("Number of Departures")
                      ax.xaxis.grid(False)
                      ax.set_axisbelow(True)
                      plt.tight_layout()
                      plt.show()
                      disagreements_year_pivot = departures.pivot_table(
                          # departures,
                          index="year",
                          values=["disagreements"],
                          aggfunc=lambda x: (x == True).sum(),
                          fill_value=0,
                      )

                      departures_year_pivot = departures.pivot_table(
                          index="year",
                          values=["accessionNo"],
                          aggfunc="count",
                          fill_value=0,
                      ).rename(columns={"accessionNo": "departures"})

                      percentage_disagreements = (
                          disagreements_year_pivot["disagreements"]
                          / departures_year_pivot["departures"]
                          * 100
                      ).fillna(0)
                      percentage_disagreements = percentage_disagreements.apply(lambda x: f"{x:.2f}")

                      departures_year_pivot["disagreements"] = disagreements_year_pivot["disagreements"]
                      departures_year_pivot["disagreements_percentage"] = percentage_disagreements
                      departures_year_pivot
                      Out[113]:
                      departuresdisagreementsdisagreements_percentage
                      year
                      20042273251.10
                      20057731610.79
                      20068578840.98
                      200711211870.78
                      2008116871030.88
                      20099800900.92
                      20109031810.90
                      20119247540.58
                      20128901610.69
                      20138731480.55
                      20148747530.61
                      20159021410.45
                      20168472650.77
                      20178243570.69
                      20188169440.54
                      20198173320.39
                      20207464450.60
                      20217690170.22
                      20228518700.82
                      20238920460.52
                      20248170480.59

                      Amendments

                      This section includes statistics of amendments of contracts. First we will investigate boolean statistics and then analyze which part of the compensation was changed.

                      amendments = personnel_change_types[personnel_change_types["type"] == "amendment"]
                      columns_to_check_amendments = [
                          "termShortened",
                          "termExtended",
                          "compensationIncreased",
                          "compensationDecreased",
                      ]
                      amendments_summary = (
                          amendments[columns_to_check_amendments]
                          .apply(lambda x: x.value_counts())
                          .T.fillna(0)
                          .astype(int)
                      )
                      amendments_summary["False % tot."] = (
                          (amendments_summary[False] / len(amendments)) * 100
                      ).map("{:.2f}".format)
                      amendments_summary["True % tot."] = (
                          (amendments_summary[True] / len(amendments)) * 100
                      ).map("{:.2f}".format)
                      print(f"Summary of the {len(amendments):,} amendments in Item 5.02 filings (2004 - 2024).")
                      amendments_summary
                      Summary of the 41,937 amendments in Item 5.02 filings (2004 - 2024).
                      Out[115]:
                      FalseTrueFalse % tot.True % tot.
                      termShortened893982.130.23
                      termExtended80239971.919.53
                      compensationIncreased101051002.4112.16
                      compensationDecreased96810492.312.50
                      amendments = personnel_change_types[personnel_change_types["type"] == "amendment"]
                      columns_to_check_amendments = ["compensation.onetime", "compensation.annual", "compensation.equity"]
                      compensation_summary = amendments[columns_to_check_amendments].count().astype(int)
                      percentage_summary = (compensation_summary / len(amendments) * 100).round(2)
                      amendments_summary_df = pd.DataFrame([compensation_summary, percentage_summary], index=["Count", "Percentage"]).T
                      amendments_summary_df["Count"] = amendments_summary_df["Count"].astype(int)
                      amendments_summary_df
                      Out[117]:
                      CountPercentage
                      compensation.onetime19884.74
                      compensation.annual1845344.00
                      compensation.equity925122.06

                      Organizational Changes

                      In this section we have a look at reported organizational changes.

                      org_changes = (
                          structured_data[["accessionNo", "year", "organizationChanges"]]
                          .dropna(subset="organizationChanges")
                          .copy()
                          .reset_index(drop=True)
                      )
                      org_changes = pd.concat(
                          [
                              org_changes[["accessionNo", "year"]],
                              pd.json_normalize(org_changes["organizationChanges"]),
                          ],
                          axis=1,
                      )
                      org_changes

                      print(
                          f"There are {len(org_changes):,} unique Form 8-K filings with a disclosed organization change."
                      )
                      org_changes
                      There are 39,640 unique Form 8-K filings with a disclosed organization change.
                      Out[139]:
                      accessionNoyearorgandetailssizeIncreaseaffectedPersonnelsizeDecreasecreatedabolished
                      00000899715-04-0001642004Board of DirectorsExpansion of the boardTrue[Allan L. Schuman]NaNNaNNaN
                      10001193125-04-1457942004Board of DirectorsAppointment of Marvin S. Hausman, MD to the BoardTrue[Marvin S. Hausman, MD]NaNNaNNaN
                      20000950137-04-0070892004Board of DirectorsElection of Frederick H. Schneider as a Class ...TrueNaNNaNNaNNaN
                      30000066756-04-0000852004Board of DirectorsAmendments to bylaws including shareholder nom...FalseNaNFalseNaNNaN
                      40001104659-04-0257312004Board of DirectorsReduction in size from ten members to nineNaNNaNTrueNaNNaN
                      ..............................
                      396350001171843-24-0071282024Audit CommitteeThe Company intends to maintain a two-member A...False[W. Thorpe McKenzie, Jan-Paul Waldin]FalseFalseFalse
                      396360001331757-24-0002392024Board of DirectorsCommittee assignments for 2025 were finalized.False[Jeff Austin III, J. Mark Riebe, Rufus Cormier...FalseFalseFalse
                      396370001104659-24-1328722024Board of DirectorsTermination of the International Seaways, Inc....FalseNaNFalseFalseFalse
                      396380001193125-24-2871882024Board of DirectorsIncreased the size of the Board to fifteen dir...True[Jacqueline Allard, Somesh Khanna]FalseFalseFalse
                      396390001104659-24-1329032024NaNNaNFalseNaNFalseFalseFalse

                      39640 rows × 9 columns

                      org_changes["organ"].value_counts().head()
                      Out[140]:
                      organ
                      Board of Directors 31805
                      Compensation Committee 452
                      Audit Committee 404
                      Board of Trustees 253
                      Equity Incentive Plan 228
                      Name: count, dtype: int64
                      board_changes = org_changes[org_changes["organ"] == "Board of Directors"].copy()

                      # Define classification function
                      def classify_size_change(row):
                          
                          if row["sizeIncrease"]:
                              return "increase"
                          elif row["sizeDecrease"]:
                              return "decrease"
                          else:
                              return "no size change"
                          
                      board_changes["sizeChange"] = board_changes.apply(classify_size_change, axis=1)

                      print("Size changes of the Board of Directors disclosed in Item 5.02 over the full period (2004 - 2024).")
                      print(board_changes["sizeChange"].value_counts())
                      Size changes of the Board of Directors disclosed in Item 5.02 over the full period (2004 - 2024).
                      sizeChange
                      increase 25566
                      no size change 5518
                      decrease 721
                      Name: count, dtype: int64
                      counts_board_size_change_yr_piv = (
                          board_changes.drop_duplicates(subset=["accessionNo"])
                          .groupby(["year", "sizeChange"])
                          .size()
                          .unstack()
                          .fillna(0)
                      ).astype(int)
                      print(f"Size changes of the Board of Directors disclosed in Item 5.02 filings by year (2004 - 2024).")
                      counts_board_size_change_yr_piv.T
                      Size changes of the Board of Directors disclosed in Item 5.02 filings by year (2004 - 2024).
                      Out[142]:
                      year2004200520062007200820092010201120122013...2015201620172018201920202021202220232024
                      sizeChange
                      decrease4122035283214292132...43294039404144515882
                      increase29698311321229127511071136118312341226...1249127912591304127613351648141713941296
                      no size change21100141286352294297281252268...281249280247308309281330329348

                      3 rows × 21 columns

                      counts_bsc_yr = counts_board_size_change_yr_piv.stack().reset_index(name="count")

                      fig, ax = plt.subplots(figsize=(7, 2.5))
                      counts_board_size_change_yr_piv.plot(kind="bar", ax=ax, legend=True)
                      ax.legend(title="Change", loc="upper right", bbox_to_anchor=(1.32, 1))
                      ax.set_title("Number of Board of Directors changes disclosed in Item 5.02\nby size change and year (2004 - 2024)")
                      ax.set_xlabel("Year")
                      ax.set_ylabel("Number of\nItem 5.02 Filings")
                      ax.yaxis.set_major_formatter(mtick.StrMethodFormatter("{x:,.0f}"))
                      ax.grid(axis="x")
                      ax.set_axisbelow(True)
                      plt.tight_layout()
                      plt.show()

                      Bonus Plans

                      This section gives a quick overview of the bonus plans reported in the Item 5.02 filings.

                      For this section, we downloaded all filings including a bonus plan change from the Form 8-K Item 5.02 Structured Data API and prepared a second dataframe. For this, we exploded the list of bonus plans and saved it to disk. This way, we can start with the analysis right away.

                      bonus_data = (
                          structured_data[["accessionNo", "year", "bonusPlans"]]
                          .copy()
                          .explode(column="bonusPlans")
                          .dropna(subset=["bonusPlans"])
                          .reset_index(drop=True)
                      )

                      bonus_data = pd.concat(
                          [
                              bonus_data[["accessionNo", "year"]],
                              pd.json_normalize(bonus_data["bonusPlans"]),
                          ],
                          axis=1,
                      )
                      print(f"{len(bonus_data):,} bonus plan disclosures loaded.")
                      bonus_data
                      50,189 bonus plan disclosures loaded.
                      Out[162]:
                      accessionNoyearspecificIndividualseligibleIndividualscompensation.cashconditionalconditionscompensation.equitycompensation.equityDetailsspecificRolesgeneralEmployeeeligibleRoles
                      00001104659-04-0253252004True[Dale Brown]20% of annual salaryNaNNaNNaNNaNNaNNaNNaN
                      10000950123-04-0102492004True[Michael P. Huseby]Annual bonus with a target of 80% of his annua...TruePerformance-based vesting for conjunctive righ...Annual awards of 10,000 restricted shares of C...Four-year cliff vestingNaNNaNNaN
                      20000703701-04-0000422004True[Benjamin M. Cutler]Minimum Annual Bonus of $166,666 for 2004 and ...TrueAchievement of performance targetsOptions to purchase 9.9% of the common stockVesting over 5 yearsNaNNaNNaN
                      30001204560-04-0000312004True[Shant Koumriqian]Targeted annual bonus of 40%NaNNaNNaNNaNNaNNaNNaN
                      40001104659-04-0270022004True[Dr. Hahn, Mr. Howard, Mr. Raymond, Mr. Rice]$15,000 annual retainerFalseNaN30,000 stock options25% vesting immediately with an additional 25%...FalseFalseNaN
                      .......................................
                      501840001493152-24-0527072024True[Pete O'Heeron, Hamid Khoja]NaNNaNNaN406,339 shares of common stock for Pete O'Heer...One fourth (1/4th) vest on December 27, 2025, ...NaNNaNNaN
                      501850001193125-24-2871832024True[Matt Reback]$175,000TrueSubject to the conditions set forth in the Emp...NaNNaNFalseFalseNaN
                      501860001213900-24-1140152024True[Dr. William J. McGann]5% of annual base salaryNaNNaNNaNNaNNaNNaNNaN
                      501870001193125-24-2871882024True[Christopher M. Gorman, Clark Khayat, Andrew J...NaNTrueAchievement of regulatory capital requirements...Share-settled performance-based equity awardsVesting in January 2027 based on regulatory ca...FalseFalseNaN
                      501880001104659-24-1329032024True[Marc Holliday]NaNTrueAchievement of specific operational or financi...Performance-Based Class O LTIP Units with a gr...Subject to time-based and performance-based ve...FalseFalseNaN

                      50189 rows × 12 columns

                      bool_variables_to_analyze = [
                          "specificRoles",
                          "generalEmployee",
                          "specificIndividuals",
                          "conditional",
                      ]

                      var_to_label = {
                          "specificRoles": "Specific Positions are eligible",
                          "generalEmployee": "General Employees are eligible",
                          "specificIndividuals": "Specific Individuals are eligible",
                          "conditional": "Conditional Bonus",
                      }


                      total_samples = len(bonus_data)
                      # Create a row for the total samples
                      total_row = pd.DataFrame(
                          {
                              "Samples": [f"{total_samples:,.0f}"],
                              "Pct.": [""],
                              "Pct. tot.": [100],
                          },
                          index=pd.MultiIndex.from_tuples([("Total", "")], names=["Variable", "Value"]),
                      )


                      bool_variables_stats = []

                      for variable in bool_variables_to_analyze:
                          variable_stats = (
                              bonus_data[variable]
                              .value_counts()
                              .to_frame()
                              .reset_index()
                              .rename(columns={variable: "value"})
                          )
                          variable_stats = variable_stats.sort_values(by="value", ascending=False)
                          variable_stats["pct"] = (
                              variable_stats["count"] / variable_stats["count"].sum() * 100
                          ).round(1)
                          variable_stats["pct_tot"] = (variable_stats["count"] / total_samples * 100).round(1)
                          variable_stats.index = pd.MultiIndex.from_tuples(
                              [(variable, row["value"]) for _, row in variable_stats.iterrows()],
                          )
                          variable_stats.drop(columns="value", inplace=True)

                          bool_variables_stats.append(variable_stats)

                      bool_variables_stats = pd.concat(bool_variables_stats, axis=0)
                      bool_variables_stats.index.set_names(["Variable", "Value"], inplace=True)
                      bool_variables_stats.rename(
                          index=var_to_label,
                          columns={"count": "Samples", "pct": "Pct.", "pct_tot": "Pct. tot."},
                          inplace=True,
                      )
                      bool_variables_stats["Samples"] = bool_variables_stats["Samples"].apply(
                          lambda x: f"{x:,.0f}"
                      )


                      bool_variables_stats = pd.concat([total_row, bool_variables_stats])


                      print(
                          f"Number of Bonus Plans disclosed in Item 5.02\nby their disclosed characteristics (2004 - 2024):"
                      )
                      bool_variables_stats
                      Number of Bonus Plans disclosed in Item 5.02
                      by their disclosed characteristics (2004 - 2024):
                      Out[157]:
                      SamplesPct.Pct. tot.
                      VariableValue
                      Total50,189100.0
                      Specific Positions are eligibleTrue9,18127.418.3
                      False24,28072.648.4
                      General Employees are eligibleTrue6,63222.213.2
                      False23,29577.846.4
                      Specific Individuals are eligibleTrue32,64269.665.0
                      False14,23030.428.4
                      Conditional BonusTrue32,41477.864.6
                      False9,25422.218.4
                      columns_to_check_bonus = ["compensation.cash", "compensation.equity"]
                      compensation_summary = bonus_data[columns_to_check_bonus].count().astype(int)
                      percentage_summary = (compensation_summary / len(bonus_data) * 100).round(2)
                      bonus_summary_df = pd.DataFrame([compensation_summary, percentage_summary], index=["Count", "Percentage"]).T
                      bonus_summary_df["Count"] = bonus_summary_df["Count"].astype(int)
                      bonus_summary_df.rename(columns={"Count": "Total Count", "Percentage": "Share (%)"}, inplace=True)
                      bonus_summary_df = bonus_summary_df.rename(index={"compensation.cash": "Cash Compensation",
                                                                        "compensation.equity": "Equity Compensation"})
                      print("Number of Bonus Plans disclosed in Item 5.02 filings (2004 - 2024) with compensation details.")
                      bonus_summary_df
                      Number of Bonus Plans disclosed in Item 5.02 filings (2004 - 2024) with compensation details.
                      Out[158]:
                      Total CountShare (%)
                      Cash Compensation2949458.77
                      Equity Compensation2640852.62

                      Footer

                      Products

                      • EDGAR Filing Search API
                      • Full-Text Search API
                      • Real-Time Filing Stream API
                      • Filing Download & PDF Generator API
                      • XBRL-to-JSON Converter
                      • 10-K/10-Q/8-K Item Extractor
                      • Investment Adviser & Form ADV API
                      • Insider Trading Data - Form 3, 4, 5
                      • Restricted Sales Notifications - Form 144
                      • Institutional Holdings - Form 13F
                      • Form N-PORT API - Investment Company Holdings
                      • Form N-CEN API - Annual Reports by Investment Companies
                      • Form N-PX API - Proxy Voting Records
                      • Form 13D/13G API
                      • Form S-1/424B4 - IPOs, Debt & Rights Offerings
                      • Form C - Crowdfunding Offerings
                      • Form D - Private Placements & Exempt Offerings
                      • Regulation A Offering Statements API
                      • Changes in Auditors & Accountants
                      • Non-Reliance on Prior Financial Statements
                      • Executive Compensation Data API
                      • Directors & Board Members Data
                      • Company Subsidiaries Database
                      • Outstanding Shares & Public Float
                      • SEC Enforcement Actions
                      • Accounting & Auditing Enforcement Releases (AAERs)
                      • SRO Filings
                      • CIK, CUSIP, Ticker Mapping

                      General

                      • Pricing
                      • Features
                      • Supported Filings
                      • EDGAR Filing Statistics

                      Account

                      • Sign Up - Start Free Trial
                      • Log In
                      • Forgot Password

                      Developers

                      • API Sandbox
                      • Documentation
                      • Resources & Tutorials
                      • Python API SDK
                      • Node.js API SDK

                      Legal

                      • Terms of Service
                      • Privacy Policy

                      Legal

                      • Terms of Service
                      • Privacy Policy

                      SEC API

                      © 2025 sec-api.io by Data2Value GmbH. All rights reserved.

                      SEC® and EDGAR® are registered trademarks of the U.S. Securities and Exchange Commission (SEC).

                      EDGAR is the Electronic Data Gathering, Analysis, and Retrieval system operated by the SEC.

                      sec-api.io and Data2Value GmbH are independent of, and not affiliated with, sponsored by, or endorsed by the U.S. Securities and Exchange Commission.

                      sec-api.io is classified under SIC code 7375 (Information Retrieval Services), providing on-demand access to structured data and online information services.