Analyzing SEC Litigation Releases in Python
In this guide, we will analyze all SEC litigation releases from 1995 to 2025 obtained from the litigation database. The dataset contains information about all litigation releases published by the U.S. Securities and Exchange Commission (SEC) from 1995 to 2025 and includes information about the defendants, charges, penalty amounts, and other relevant details.
Our analysis will focus on several key aspects of the dataset, including:
- The number of litigation releases published each year and month
- Litigations by settlement type and agreement to pay penalties
- Penalty amounts by year
- Top 10 penalties by amount
- Categories of requested reliefs, violations, and more
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.style as style
style.use("default")
params = {
"axes.labelsize": 8,
"font.size": 8,
"legend.fontsize": 8,
"xtick.labelsize": 8,
"ytick.labelsize": 8,
"text.usetex": False,
"font.family": "sans-serif",
"axes.spines.top": False,
"axes.spines.right": False,
"grid.color": "grey",
"axes.grid": True,
"axes.grid.axis": "y",
"axes.grid.axis": "x",
"grid.alpha": 0.5,
"grid.linestyle": ":",
}
plt.rcParams.update(params)
For the sake of brevity, we downloaded all SEC litigation data from 1995 to 2025 and saved it as a JSONL file. We will load this data into a pandas DataFrame and perform our analysis using Python.
df = pd.read_json(
# TODO: replace with your path
"/path/to/sec-litigation-releases.jsonl",
lines=True,
)
# convert "releasedAt" to datetime and to EST timezone
df["releasedAt"] = pd.to_datetime(df["releasedAt"], utc=True)
df["releasedAt"] = df["releasedAt"].dt.tz_convert("US/Eastern")
df["releasedAtYear"] = df["releasedAt"].dt.year
df["releasedAtMonth"] = df["releasedAt"].dt.month
df["releasedAtYearMonth"] = df["releasedAt"].dt.to_period("M")
# Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday
df["releasedAtDay"] = df["releasedAt"].dt.day_name()
df["hasAgreedToSettlement"] = df["hasAgreedToSettlement"].astype(bool)
df["hasAgreedToPayPenalty"] = df["hasAgreedToPayPenalty"].astype(bool)
df['caseCitation'] = df['caseCitations'].map(lambda x: x[0] if x else None)
print(f"Loaded {len(df):,.0f} SEC litigation releases from 1995 to 2025\n")
print(df.info())
Loaded 11,536 SEC litigation releases from 1995 to 2025
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 11536 entries, 0 to 11535
Data columns (total 26 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 id 11536 non-null object
1 releaseNo 11536 non-null object
2 releasedAt 11536 non-null datetime64[ns, US/Eastern]
3 url 11536 non-null object
4 title 11535 non-null object
5 subTitle 11536 non-null object
6 caseCitations 11536 non-null object
7 resources 11536 non-null object
8 summary 11524 non-null object
9 tags 11524 non-null object
10 entities 11524 non-null object
11 complaints 11524 non-null object
12 parallelActionsTakenBy 11524 non-null object
13 hasAgreedToSettlement 11536 non-null bool
14 hasAgreedToPayPenalty 11536 non-null bool
15 penaltyAmounts 11524 non-null object
16 requestedRelief 11524 non-null object
17 violatedSections 11524 non-null object
18 investigationConductedBy 11524 non-null object
19 litigationLedBy 11524 non-null object
20 otherAgenciesInvolved 11524 non-null object
21 releasedAtYear 11536 non-null int32
22 releasedAtMonth 11536 non-null int32
23 releasedAtYearMonth 11536 non-null period[M]
24 releasedAtDay 11536 non-null object
25 caseCitation 11192 non-null object
dtypes: bool(2), datetime64[ns, US/Eastern](1), int32(2), object(20), period[M](1)
memory usage: 2.0+ MB
None
/var/folders/q3/bt7922t52p78qdm75h_8m5yh0000gn/T/ipykernel_39183/1704038985.py:6: UserWarning: Converting to PeriodArray/Index representation will drop timezone information.
df["releasedAtYearMonth"] = df["releasedAt"].dt.to_period("M")
df.head()
Out:
id | releaseNo | releasedAt | url | title | subTitle | caseCitations | resources | summary | tags | ... | requestedRelief | violatedSections | investigationConductedBy | litigationLedBy | otherAgenciesInvolved | releasedAtYear | releasedAtMonth | releasedAtYearMonth | releasedAtDay | caseCitation | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | c3772c013f5a90d0c1c0170bbed8ad6a | LR-26231 | 2025-01-22 21:17:05-05:00 | https://www.sec.gov/enforcement-litigation/lit... | Gabriel Rebeiz | SEC Charges Technical Consultant with Insider ... | [Securities and Exchange Commission v. Gabriel... | [{'label': 'SEC Complaint', 'url': 'https://ww... | The SEC filed settled charges against Gabriel ... | [insider trading] | ... | [disgorgement of profits, civil penalties, per... | [Section 10(b) of the Securities Exchange Act ... | [Sara Kalin, John Rymas, Stephen Kam] | [Diana Tani, Joseph Sansone] | [{'name': 'Financial Industry Regulatory Autho... | 2025 | 1 | 2025-01 | Wednesday | Securities and Exchange Commission v. Gabriel ... |
1 | 860c50b5c1bfa963e68568f2b6008ba4 | LR-26230 | 2025-01-21 15:00:18-05:00 | https://www.sec.gov/enforcement-litigation/lit... | Old South Trading Co., LLC; Brendan H. Church;... | SEC Charges Father and Son for $25.8 Million U... | [Securities and Exchange Commission v. Old Sou... | [{'label': 'SEC Complaint', 'url': 'https://ww... | The SEC has charged Old South Trading Co., LLC... | [unregistered securities offering, unregistere... | ... | [injunctive relief, disgorgement of allegedly ... | [Sections 5(a) and (c) of the Securities Act o... | [Jonathan Shapiro, Andrew Elliott, Margaret Vi... | [Dean Conway] | [] | 2025 | 1 | 2025-01 | Tuesday | Securities and Exchange Commission v. Old Sout... |
2 | 1485373724567b64d7cee1c8ce853fe3 | LR-26229 | 2025-01-18 19:09:46-05:00 | https://www.sec.gov/enforcement-litigation/lit... | Nova Labs, Inc. | SEC Charges Nova Labs, Inc. with Fraud and Reg... | [Securities and Exchange Commission v. Nova La... | [{'label': 'SEC Complaint', 'url': 'https://ww... | The SEC has charged Nova Labs, Inc. with fraud... | [fraud, registration violations, crypto] | ... | [permanent and conduct-based injunctions, disg... | [Sections 5(a), 5(c), and 17(a)(2) of the Secu... | [Emmy E. Rush, Christopher Colorado, Kim Han, ... | [Emmy E. Rush, Christopher Colorado, Peter Man... | [] | 2025 | 1 | 2025-01 | Saturday | Securities and Exchange Commission v. Nova Lab... |
3 | 35ebfbcaf74cef9401e78f8e92b934e6 | LR-26228 | 2025-01-17 22:20:10-05:00 | https://www.sec.gov/enforcement-litigation/lit... | Arete Wealth Management LLC; Arete Wealth Advi... | SEC Charges Arete Wealth Broker-Dealer and Adv... | [Securities and Exchange Commission v. Arete W... | [{'label': 'SEC Complaint', 'url': 'https://ww... | The SEC has charged Arete Wealth Management LL... | [fraud, illegal securities offering, recordkee... | ... | [permanent injunctions, civil penalties, condu... | [Sections 206(1) and 206(2) of the Investment ... | [Theresa H. Gue, Austin Thompson, Christopher ... | [Oren Gleich, Preethi Krishnamurthy] | [{'name': 'U.S. Attorney's Office for the East... | 2025 | 1 | 2025-01 | Friday | Securities and Exchange Commission v. Arete We... |
4 | 60cabddfdd7abcf4c1da9bffebc55a1e | LR-26227 | 2025-01-17 21:56:17-05:00 | https://www.sec.gov/enforcement-litigation/lit... | Naufal Sanaullah | SEC Charges Fund Executive with Making False S... | [Securities and Exchange Commission v. Naufal ... | [{'label': 'SEC Complaint', 'url': 'https://ww... | The SEC charged Naufal Sanaullah with making f... | [disclosure fraud, securities fraud] | ... | [permanent injunctive relief, disgorgement alo... | [Section 17(a) of the Securities Act of 1933, ... | [Heather Marlow, Kimberly L. Frederick, Nichol... | [Zachary Carlyle, Gregory A. Kasper, Nicholas ... | [{'name': 'Federal Bureau of Investigation', '... | 2025 | 1 | 2025-01 | Friday | Securities and Exchange Commission v. Naufal S... |
5 rows × 26 columns
# piv table with years = col, months = index, values = count of cases
df_year_month = df.pivot_table(
index="releasedAtYear",
columns="releasedAtMonth",
values="id",
aggfunc="count",
fill_value=0,
)
total_col = df_year_month.sum(axis=1)
mean_col = round(df_year_month.mean(axis=1), 0)
median_col = round(df_year_month.median(axis=1), 0)
df_year_month["total"] = total_col
df_year_month["mean"] = mean_col
df_year_month["median"] = median_col
total_row = df_year_month.sum(axis=0)
mean_row = round(df_year_month.mean(axis=0), 0)
median_row = round(df_year_month.median(axis=0), 0)
df_year_month.loc["total"] = total_row
df_year_month.loc["mean"] = mean_row
df_year_month.loc["median"] = median_row
df_year_month = df_year_month.astype(int)
print("SEC Litigation Releases by Year and Month")
df_year_month
SEC Litigation Releases by Year and Month
Out:
releasedAtMonth | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | total | mean | median |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
releasedAtYear | |||||||||||||||
1995 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 23 | 39 | 36 | 25 | 123 | 10 | 0 |
1996 | 37 | 22 | 30 | 36 | 32 | 40 | 27 | 30 | 73 | 42 | 26 | 34 | 429 | 36 | 33 |
1997 | 37 | 32 | 39 | 46 | 24 | 20 | 28 | 39 | 53 | 27 | 26 | 29 | 400 | 33 | 30 |
1998 | 28 | 20 | 35 | 35 | 34 | 31 | 30 | 36 | 53 | 44 | 19 | 25 | 390 | 32 | 32 |
1999 | 38 | 24 | 31 | 27 | 40 | 33 | 24 | 42 | 52 | 23 | 31 | 21 | 386 | 32 | 31 |
2000 | 23 | 41 | 38 | 32 | 45 | 40 | 25 | 34 | 68 | 44 | 27 | 34 | 451 | 38 | 36 |
2001 | 39 | 35 | 28 | 32 | 41 | 36 | 27 | 30 | 55 | 49 | 35 | 41 | 448 | 37 | 36 |
2002 | 58 | 38 | 60 | 49 | 42 | 55 | 52 | 60 | 49 | 65 | 45 | 44 | 617 | 51 | 50 |
2003 | 48 | 49 | 49 | 65 | 41 | 45 | 60 | 51 | 64 | 58 | 40 | 49 | 619 | 52 | 49 |
2004 | 34 | 40 | 50 | 39 | 40 | 44 | 43 | 48 | 53 | 36 | 36 | 32 | 495 | 41 | 40 |
2005 | 39 | 46 | 59 | 46 | 31 | 50 | 27 | 36 | 57 | 39 | 30 | 38 | 498 | 42 | 39 |
2006 | 34 | 31 | 53 | 42 | 34 | 38 | 32 | 35 | 39 | 40 | 32 | 26 | 436 | 36 | 34 |
2007 | 30 | 36 | 41 | 35 | 39 | 39 | 42 | 46 | 51 | 38 | 24 | 40 | 461 | 38 | 39 |
2008 | 26 | 27 | 42 | 33 | 57 | 26 | 35 | 32 | 63 | 34 | 22 | 22 | 419 | 35 | 32 |
2009 | 36 | 42 | 63 | 38 | 44 | 48 | 49 | 29 | 40 | 40 | 43 | 42 | 514 | 43 | 42 |
2010 | 41 | 29 | 44 | 40 | 30 | 45 | 24 | 27 | 43 | 38 | 41 | 37 | 439 | 37 | 39 |
2011 | 36 | 35 | 44 | 38 | 32 | 39 | 32 | 26 | 31 | 33 | 27 | 41 | 414 | 34 | 34 |
2012 | 33 | 27 | 40 | 36 | 32 | 24 | 23 | 34 | 36 | 24 | 30 | 27 | 366 | 30 | 31 |
2013 | 25 | 21 | 34 | 31 | 18 | 24 | 31 | 23 | 38 | 34 | 19 | 20 | 318 | 26 | 24 |
2014 | 14 | 20 | 24 | 25 | 27 | 24 | 18 | 22 | 25 | 25 | 18 | 24 | 266 | 22 | 24 |
2015 | 16 | 25 | 15 | 24 | 22 | 22 | 19 | 16 | 44 | 24 | 17 | 26 | 270 | 22 | 22 |
2016 | 16 | 20 | 33 | 17 | 24 | 35 | 21 | 21 | 39 | 9 | 16 | 18 | 269 | 22 | 20 |
2017 | 21 | 26 | 34 | 24 | 30 | 22 | 20 | 36 | 28 | 21 | 21 | 25 | 308 | 26 | 24 |
2018 | 15 | 21 | 31 | 36 | 25 | 27 | 40 | 34 | 47 | 28 | 34 | 17 | 355 | 30 | 30 |
2019 | 7 | 26 | 21 | 29 | 22 | 40 | 24 | 34 | 46 | 23 | 24 | 22 | 318 | 26 | 24 |
2020 | 31 | 20 | 32 | 25 | 15 | 21 | 15 | 20 | 59 | 14 | 17 | 32 | 301 | 25 | 20 |
2021 | 16 | 19 | 24 | 23 | 18 | 26 | 24 | 37 | 47 | 13 | 22 | 23 | 292 | 24 | 23 |
2022 | 24 | 16 | 16 | 25 | 24 | 30 | 27 | 33 | 53 | 20 | 16 | 21 | 305 | 25 | 24 |
2023 | 23 | 26 | 33 | 16 | 38 | 28 | 30 | 23 | 57 | 11 | 17 | 14 | 316 | 26 | 24 |
2024 | 17 | 11 | 15 | 29 | 23 | 25 | 27 | 28 | 53 | 22 | 14 | 24 | 288 | 24 | 24 |
2025 | 25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 25 | 2 | 0 |
total | 867 | 825 | 1058 | 973 | 924 | 977 | 876 | 962 | 1439 | 957 | 805 | 873 | 11536 | 957 | 910 |
mean | 28 | 27 | 34 | 31 | 30 | 32 | 28 | 31 | 46 | 31 | 26 | 28 | 372 | 31 | 29 |
median | 28 | 26 | 34 | 32 | 31 | 31 | 27 | 33 | 49 | 33 | 26 | 26 | 386 | 32 | 31 |
fig, ax = plt.subplots(figsize=(5, 3))
df_year_month.loc[1995:2024]["total"].plot(
kind="line",
ax=ax,
marker="o",
markersize=3,
linewidth=1,
)
ax.set_title("SEC Litigation Releases per Year")
ax.set_xlabel("Year")
ax.set_ylabel("Number of Cases")
plt.tight_layout()
plt.grid(axis="x")
ax.set_axisbelow(True)
plt.show()

# piv table for hasAgreedToSettlement
# col = hasAgreedToSettlement (true, false), index = year, values = count of cases
df_year_settlement = df.pivot_table(
index="releasedAtYear",
columns="hasAgreedToSettlement",
values="id",
aggfunc="count",
fill_value=0,
)
total_col = df_year_settlement.sum(axis=1)
mean_col = round(df_year_settlement.mean(axis=1), 0)
median_col = round(df_year_settlement.median(axis=1), 0)
total_row = df_year_settlement.sum(axis=0)
mean_row = round(df_year_settlement.mean(axis=0), 0)
median_row = round(df_year_settlement.median(axis=0), 0)
df_year_settlement.loc["total"] = total_row
df_year_settlement.loc["mean"] = mean_row
df_year_settlement.loc["median"] = median_row
df_year_settlement = df_year_settlement.astype(int)
df_year_settlement
Out:
hasAgreedToSettlement | False | True |
---|---|---|
releasedAtYear | ||
1995 | 63 | 60 |
1996 | 215 | 214 |
1997 | 195 | 205 |
1998 | 209 | 181 |
1999 | 196 | 190 |
2000 | 239 | 212 |
2001 | 214 | 234 |
2002 | 337 | 280 |
2003 | 322 | 297 |
2004 | 244 | 251 |
2005 | 241 | 257 |
2006 | 177 | 259 |
2007 | 210 | 251 |
2008 | 188 | 231 |
2009 | 273 | 241 |
2010 | 205 | 234 |
2011 | 194 | 220 |
2012 | 190 | 176 |
2013 | 161 | 157 |
2014 | 135 | 131 |
2015 | 160 | 110 |
2016 | 149 | 120 |
2017 | 174 | 134 |
2018 | 177 | 178 |
2019 | 171 | 147 |
2020 | 161 | 140 |
2021 | 173 | 119 |
2022 | 165 | 140 |
2023 | 165 | 151 |
2024 | 146 | 142 |
2025 | 7 | 18 |
total | 5856 | 5680 |
mean | 189 | 183 |
median | 188 | 181 |
# stacked bar chart for hasAgreedToSettlement
fig, ax = plt.subplots(figsize=(6, 4))
df_year_settlement.drop(["total", "mean", "median"]).plot(
kind="bar", stacked=False, ax=ax, color=["#1f77b4", "#ff7f0e"]
)
plt.title("SEC Litigation Releases by Year and Settlement")
plt.xlabel("Year")
plt.ylabel("Count")
plt.legend(["No Settlement", "Settlement"], loc="upper right")
plt.grid(axis="x")
ax.set_axisbelow(True)
plt.tight_layout()
plt.show()

# piv table for hasAgreedToPayPenalty
# col = hasAgreedToPayPenalty (true, false), index = year, values = count of cases
df_year_penalty = df.pivot_table(
index="releasedAtYear",
columns="hasAgreedToPayPenalty",
values="id",
aggfunc="count",
fill_value=0,
)
total_col = df_year_penalty.sum(axis=1)
mean_col = round(df_year_penalty.mean(axis=1), 0)
median_col = round(df_year_penalty.median(axis=1), 0)
total_row = df_year_penalty.sum(axis=0)
mean_row = round(df_year_penalty.mean(axis=0), 0)
median_row = round(df_year_penalty.median(axis=0), 0)
df_year_penalty.loc["total"] = total_row
df_year_penalty.loc["mean"] = mean_row
df_year_penalty.loc["median"] = median_row
df_year_penalty = df_year_penalty.astype(int)
df_year_penalty
Out:
hasAgreedToPayPenalty | False | True |
---|---|---|
releasedAtYear | ||
1995 | 88 | 35 |
1996 | 296 | 133 |
1997 | 266 | 134 |
1998 | 254 | 136 |
1999 | 222 | 164 |
2000 | 285 | 166 |
2001 | 251 | 197 |
2002 | 375 | 242 |
2003 | 339 | 280 |
2004 | 270 | 225 |
2005 | 243 | 255 |
2006 | 205 | 231 |
2007 | 230 | 231 |
2008 | 224 | 195 |
2009 | 308 | 206 |
2010 | 225 | 214 |
2011 | 219 | 195 |
2012 | 183 | 183 |
2013 | 172 | 146 |
2014 | 144 | 122 |
2015 | 154 | 116 |
2016 | 162 | 107 |
2017 | 173 | 135 |
2018 | 200 | 155 |
2019 | 182 | 136 |
2020 | 171 | 130 |
2021 | 191 | 101 |
2022 | 165 | 140 |
2023 | 192 | 124 |
2024 | 144 | 144 |
2025 | 11 | 14 |
total | 6544 | 4992 |
mean | 211 | 161 |
median | 205 | 146 |
# stacked bar chart for hasAgreedToSettlement
fig, ax = plt.subplots(figsize=(6, 4))
df_year_penalty.drop(["total", "mean", "median"]).plot(
kind="bar", stacked=False, ax=ax, color=["#1f77b4", "#ff7f0e"]
)
plt.title("SEC Litigation Releases by Year and Penalty")
plt.xlabel("Year")
plt.ylabel("Count")
plt.legend(["Did not agree to penalty", "Agreed to penalty"], loc="upper right")
plt.grid(axis="x")
ax.set_axisbelow(True)
plt.tight_layout()
plt.show()

all_penalties = []
case_citations = {}
# iterate over all rows, extract penalties and append to all_penalties
for i, row in df.drop_duplicates(subset=["caseCitation"], keep="last").iterrows():
penaltyAmounts = row["penaltyAmounts"]
if isinstance(penaltyAmounts, list):
for penalty in penaltyAmounts:
if "penaltyAmount" in penalty:
# find entity with "name" == "imposedOn"
entity = list(filter(lambda x: x["name"] == penalty["imposedOn"], row["entities"]))
entity_type = entity[0]["type"] if entity else None
all_penalties.append(
{
"caseCitation": row['caseCitation'],
"releaseNo": row["releaseNo"],
"releasedAt": row["releasedAt"],
"releasedAtYear": row["releasedAtYear"],
"releasedAtMonth": row["releasedAtMonth"],
"amount": penalty["penaltyAmount"],
"imposedOn": penalty["imposedOn"],
"imposedOnType": entity_type,
"url": row["url"],
}
)
all_penalties_df = pd.DataFrame(all_penalties)
all_penalties_df["amount"] = all_penalties_df["amount"].astype(float)
# remove all doubles by caseCitation and imposedOn
all_penalties_df = all_penalties_df.drop_duplicates(
subset=["caseCitation", "imposedOn", "amount"], keep="last"
)
all_penalties_df
Out:
caseCitation | releaseNo | releasedAt | releasedAtYear | releasedAtMonth | amount | imposedOn | imposedOnType | url | |
---|---|---|---|---|---|---|---|---|---|
0 | Securities and Exchange Commission v. Gabriel ... | LR-26231 | 2025-01-22 21:17:05-05:00 | 2025 | 1 | 360673.00 | Gabriel Rebeiz | individual | https://www.sec.gov/enforcement-litigation/lit... |
1 | Securities and Exchange Commission v. Arete We... | LR-26228 | 2025-01-17 22:20:10-05:00 | 2025 | 1 | 200000.00 | Michael Sealy | individual | https://www.sec.gov/enforcement-litigation/lit... |
2 | Securities and Exchange Commission v. American... | LR-26225 | 2025-01-17 17:57:15-05:00 | 2025 | 1 | 1876115.22 | Ross C. Miles, American Equities, Inc., and Am... | None | https://www.sec.gov/enforcement-litigation/lit... |
3 | Securities and Exchange Commission v. American... | LR-26225 | 2025-01-17 17:57:15-05:00 | 2025 | 1 | 1146307.10 | Ross C. Miles, American Equities, Inc., and Am... | None | https://www.sec.gov/enforcement-litigation/lit... |
4 | Securities and Exchange Commission v. American... | LR-26225 | 2025-01-17 17:57:15-05:00 | 2025 | 1 | 230464.00 | Ross C. Miles, American Equities, Inc., and Am... | None | https://www.sec.gov/enforcement-litigation/lit... |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
9421 | SECURITIES AND EXCHANGE COMMISSION v. CROSS FI... | LR-14649 | 1995-09-21 14:18:04-04:00 | 1995 | 9 | 8600000.00 | Douglas S. Cross | individual | https://www.sec.gov/files/litigation/litreleas... |
9422 | SECURITIES AND EXCHANGE COMMISSION v. CROSS FI... | LR-14649 | 1995-09-21 14:18:04-04:00 | 1995 | 9 | 2600000.00 | Michael J. Colello | individual | https://www.sec.gov/files/litigation/litreleas... |
9423 | SECURITIES AND EXCHANGE COMMISSION v. ROBERT M... | LR-14644 | 1995-09-20 14:18:03-04:00 | 1995 | 9 | 922741.00 | Stifel, Nicolaus and Company, Incorporated | company | https://www.sec.gov/files/litigation/litreleas... |
9424 | SECURITIES AND EXCHANGE COMMISSION v. ROBERT M... | LR-14644 | 1995-09-20 14:18:03-04:00 | 1995 | 9 | 263637.00 | Stifel, Nicolaus and Company, Incorporated | company | https://www.sec.gov/files/litigation/litreleas... |
9425 | SECURITIES AND EXCHANGE COMMISSION v. ROBERT M... | LR-14644 | 1995-09-20 14:18:03-04:00 | 1995 | 9 | 250000.00 | Stifel, Nicolaus and Company, Incorporated | company | https://www.sec.gov/files/litigation/litreleas... |
9291 rows × 9 columns
# aggregate amount by year
penalties_year = all_penalties_df.groupby("releasedAtYear")["amount"].sum()
penalties_year = penalties_year.astype(int)
penalties_year = pd.DataFrame(penalties_year)
penalties_year["amount"] = round(penalties_year["amount"] / 1_000_000, 2)
print("Total Penalties in Million USD by Year")
penalties_year
Total Penalties in Million USD by Year
Out:
amount | |
---|---|
releasedAtYear | |
1995 | 107.50 |
1996 | 187.45 |
1997 | 119.11 |
1998 | 201.08 |
1999 | 384.33 |
2000 | 365.14 |
2001 | 398.37 |
2002 | 1051.64 |
2003 | 7185.29 |
2004 | 1265.45 |
2005 | 2510.20 |
2006 | 4326.68 |
2007 | 1033.73 |
2008 | 2178.02 |
2009 | 1787.05 |
2010 | 3566.43 |
2011 | 1355.98 |
2012 | 1681.49 |
2013 | 1485.04 |
2014 | 1824.30 |
2015 | 690.43 |
2016 | 3168.99 |
2017 | 934.54 |
2018 | 411.99 |
2019 | 495.31 |
2020 | 1043.57 |
2021 | 803.07 |
2022 | 757.16 |
2023 | 539.52 |
2024 | 1111.01 |
2025 | 44.36 |
fig, ax = plt.subplots(figsize=(5, 3))
penalties_year.loc[1995:2024]["amount"].plot(
kind="line",
ax=ax,
marker="o",
markersize=3,
linewidth=1,
)
# format y labels to , notation
ax.get_yaxis().set_major_formatter(plt.FuncFormatter(lambda x, loc: "{:,}".format(int(x))))
ax.set_title("SEC Penalties per Year")
ax.set_xlabel("Year")
ax.set_ylabel("Penalty Amount\nin Million USD")
plt.tight_layout()
plt.grid(axis="x")
ax.set_axisbelow(True)
plt.show()

# sort all penalties by amount, show top 10
top_10_penalties = all_penalties_df.sort_values("amount", ascending=False).head(10)
top_10_penalties["amount"] = round(top_10_penalties["amount"] / 1_000_000, 2)
top_10_penalties["amount"] = top_10_penalties["amount"].map("{:,.1f}".format)
print("Top 10 SEC Penalties in Million USD between 1995 and 2024")
# show all columns in full width
pd.set_option("display.max_colwidth", None)
top_10_penalties[['amount', 'imposedOn', 'releasedAt', 'releaseNo', 'url']]
Top 10 SEC Penalties in Million USD between 1995 and 2024
Out:
amount | imposedOn | releasedAt | releaseNo | url | |
---|---|---|---|---|---|
6846 | 2,250.0 | WorldCom, Inc. | 2003-08-07 14:21:04-04:00 | LR-18277 | https://www.sec.gov/enforcement-litigation/litigation-releases/lr-18277 |
5887 | 1,600.0 | American International Group, Inc. | 2006-02-09 14:22:14-05:00 | LR-19560 | https://www.sec.gov/enforcement-litigation/litigation-releases/lr-19560 |
6974 | 1,510.0 | WorldCom Inc. | 2003-05-19 14:20:57-04:00 | LR-18147 | https://www.sec.gov/enforcement-litigation/litigation-releases/lr-18147 |
2056 | 957.0 | Braskem S.A. | 2016-12-21 14:26:01-05:00 | LR-23705 | https://www.sec.gov/enforcement-litigation/litigation-releases/lr-23705 |
6731 | 894.0 | ten firms, Grubman and Blodget | 2003-10-31 14:21:13-05:00 | LR-18438 | https://www.sec.gov/enforcement-litigation/litigation-releases/lr-18438 |
2502 | 806.2 | Marlon Quan and the other defendants | 2014-09-25 14:25:28-04:00 | LR-23093 | https://www.sec.gov/enforcement-litigation/litigation-releases/lr-23093 |
5888 | 800.0 | American International Group, Inc. | 2006-02-09 14:22:14-05:00 | LR-19560 | https://www.sec.gov/enforcement-litigation/litigation-releases/lr-19560 |
2058 | 632.0 | Braskem S.A. | 2016-12-21 14:26:01-05:00 | LR-23705 | https://www.sec.gov/enforcement-litigation/litigation-releases/lr-23705 |
1012 | 601.0 | CR Intrinsic and the relief defendants | 2021-02-03 14:27:14-05:00 | LR-25022 | https://www.sec.gov/enforcement-litigation/litigation-releases/lr-25022 |
4626 | 569.0 | Siemens Aktiengesellschaft | 2008-12-15 14:23:23-05:00 | LR-20829 | https://www.sec.gov/enforcement-litigation/litigation-releases/lr-20829 |
# reset display options
pd.reset_option("display.max_colwidth")
# create piv table across years and imposedOnType, with values = sum of penalties
penalties_entity_type = all_penalties_df.pivot_table(
index="releasedAtYear",
columns="imposedOnType",
values="amount",
aggfunc="sum",
fill_value=0,
)
total_col = penalties_entity_type.sum(axis=1)
mean_col = round(penalties_entity_type.mean(axis=1), 0)
median_col = round(penalties_entity_type.median(axis=1), 0)
total_row = penalties_entity_type.sum(axis=0)
mean_row = round(penalties_entity_type.mean(axis=0), 0)
median_row = round(penalties_entity_type.median(axis=0), 0)
penalties_entity_type.loc["total"] = total_row
penalties_entity_type.loc["mean"] = mean_row
penalties_entity_type.loc["median"] = median_row
# format to million and , notation
penalties_entity_type = round(penalties_entity_type / 1_000_000, 2)
penalties_entity_type = penalties_entity_type.map("{:,.1f}".format)
print("SEC Penalties in Million USD by Year and Entity Type")
penalties_entity_type = penalties_entity_type[['company', 'individual']]
penalties_entity_type
SEC Penalties in Million USD by Year and Entity Type
Out:
imposedOnType | company | individual |
---|---|---|
releasedAtYear | ||
1995 | 32.4 | 53.1 |
1996 | 37.4 | 58.8 |
1997 | 21.2 | 93.3 |
1998 | 36.6 | 71.4 |
1999 | 173.0 | 186.1 |
2000 | 44.2 | 259.9 |
2001 | 122.8 | 236.4 |
2002 | 284.8 | 680.6 |
2003 | 5,197.4 | 1,043.4 |
2004 | 994.0 | 95.8 |
2005 | 856.7 | 1,648.6 |
2006 | 3,749.3 | 540.9 |
2007 | 572.6 | 451.6 |
2008 | 1,893.3 | 245.2 |
2009 | 1,261.2 | 358.7 |
2010 | 2,650.8 | 451.2 |
2011 | 959.1 | 305.9 |
2012 | 964.6 | 515.1 |
2013 | 1,032.5 | 358.3 |
2014 | 225.9 | 537.7 |
2015 | 259.6 | 374.4 |
2016 | 2,760.4 | 338.6 |
2017 | 140.3 | 732.5 |
2018 | 34.3 | 348.1 |
2019 | 230.3 | 200.8 |
2020 | 788.1 | 104.7 |
2021 | 71.2 | 96.3 |
2022 | 85.3 | 320.9 |
2023 | 199.3 | 239.7 |
2024 | 781.9 | 307.7 |
2025 | 5.2 | 36.0 |
total | 26,465.8 | 11,291.5 |
mean | 853.7 | 364.2 |
median | 259.6 | 307.7 |
fig, ax = plt.subplots(figsize=(5, 3))
data_to_plot = penalties_entity_type.loc[1995:2024].map(
lambda x: float(x.replace(",", ""))
)
data_to_plot.plot(kind="bar", stacked=False, ax=ax, color=["#1f77b4", "#ff7f0e"])
plt.title("SEC Penalties by Year and Entity Type")
plt.xlabel("Year")
plt.ylabel("Penalty Amount\nin Million USD")
plt.legend(["Company", "Individual"], loc="upper right")
plt.grid(axis="x")
ax.set_axisbelow(True)
plt.tight_layout()
plt.show()

all_tags = []
for i, row in df.iterrows():
tags = row["tags"]
if isinstance(tags, list):
all_tags.extend(tags)
all_tags = pd.Series(all_tags)
all_tags = all_tags.value_counts().reset_index()
all_tags.columns = ["tag", "count"]
print("Top 10 Tags in SEC Litigation Releases")
all_tags.head(10)
Top 10 Tags in SEC Litigation Releases
Out:
tag | count | |
---|---|---|
0 | disclosure fraud | 5875 |
1 | securities fraud | 3389 |
2 | insider trading | 1900 |
3 | fraud | 1454 |
4 | ponzi scheme | 932 |
5 | accounting fraud | 676 |
6 | unregistered securities | 656 |
7 | misappropriation | 474 |
8 | investment fraud | 404 |
9 | securities violation | 312 |
# count all unique requestedRelief
all_requested_relief = []
for i, row in df.iterrows():
requestedRelief = row["requestedRelief"]
if isinstance(requestedRelief, list):
all_requested_relief.extend(requestedRelief)
all_requested_relief = pd.Series(all_requested_relief)
all_requested_relief = all_requested_relief.value_counts().reset_index()
all_requested_relief.columns = ["requestedRelief", "count"]
print("Top 10 Requested Reliefs in SEC Litigation Releases")
all_requested_relief.head(10)
Top 10 Requested Reliefs in SEC Litigation Releases
Out:
requestedRelief | count | |
---|---|---|
0 | permanent injunctions | 6522 |
1 | civil penalties | 5632 |
2 | disgorgement of profits | 3931 |
3 | permanent injunction | 1274 |
4 | prejudgment interest | 1185 |
5 | disgorgement of ill-gotten gains | 979 |
6 | disgorgement | 911 |
7 | asset freeze | 576 |
8 | civil money penalties | 401 |
9 | permanent injunctive relief | 390 |
# count all unique violatedSections
all_violated_sections = []
for i, row in df.iterrows():
violatedSections = row["violatedSections"]
if isinstance(violatedSections, list):
all_violated_sections.extend(violatedSections)
all_violated_sections = pd.Series(all_violated_sections)
all_violated_sections = all_violated_sections.value_counts().reset_index()
all_violated_sections.columns = ["violatedSections", "count"]
print("Top 10 Violated Securities Laws in SEC Litigation Releases")
all_violated_sections.head(10)
Top 10 Violated Securities Laws in SEC Litigation Releases
Out:
violatedSections | count | |
---|---|---|
0 | Rule 10b-5 | 6185 |
1 | Section 10(b) of the Securities Exchange Act o... | 5684 |
2 | Section 17(a) of the Securities Act of 1933 | 3827 |
3 | Sections 5(a), 5(c) and 17(a) of the Securitie... | 690 |
4 | Sections 206(1) and 206(2) of the Investment A... | 523 |
5 | Sections 5(a), 5(c), and 17(a) of the Securiti... | 469 |
6 | Section 15(a) of the Exchange Act | 410 |
7 | Section 17(a) of the Securities Act | 402 |
8 | Sections 10(b) and 13(b)(5) of the Securities ... | 311 |
9 | Sections 5(a) and 5(c) of the Securities Act o... | 309 |
all_investigation_conducted_by = []
for i, row in df.iterrows():
investigationConductedBy = row["investigationConductedBy"]
if isinstance(investigationConductedBy, list):
all_investigation_conducted_by.extend(investigationConductedBy)
all_investigation_conducted_by = pd.Series(all_investigation_conducted_by)
all_investigation_conducted_by = all_investigation_conducted_by.value_counts().reset_index()
all_investigation_conducted_by.columns = ["investigationConductedBy", "count"]
print("Top 10 Investigation Conducted By in SEC Litigation Releases")
all_investigation_conducted_by.head(10)
Top 10 Investigation Conducted By in SEC Litigation Releases
Out:
investigationConductedBy | count | |
---|---|---|
0 | Securities and Exchange Commission | 437 |
1 | Division of Enforcement | 256 |
2 | U.S. Securities and Exchange Commission | 161 |
3 | SEC | 103 |
4 | John Rymas | 83 |
5 | Amy Gwiazda | 81 |
6 | New York Regional Office | 72 |
7 | Federal Bureau of Investigation | 61 |
8 | Boston Regional Office | 59 |
9 | Miami Regional Office | 55 |
# count unique otherAgenciesInvolved
all_other_agencies_involved = []
for i, row in df.iterrows():
otherAgenciesInvolved = row["otherAgenciesInvolved"]
if isinstance(otherAgenciesInvolved, list):
all_other_agencies_involved.extend(otherAgenciesInvolved)
all_other_agencies_involved = pd.DataFrame(all_other_agencies_involved)
all_other_agencies_involved = all_other_agencies_involved['name'].value_counts().reset_index()
all_other_agencies_involved.columns = ["otherAgenciesInvolved", "count"]
print("Top 10 Other Agencies Involved in SEC Litigation Releases")
all_other_agencies_involved.head(10)
Top 10 Other Agencies Involved in SEC Litigation Releases
Out:
otherAgenciesInvolved | count | |
---|---|---|
0 | Federal Bureau of Investigation | 1172 |
1 | Financial Industry Regulatory Authority | 552 |
2 | British Columbia Securities Commission | 98 |
3 | Ontario Securities Commission | 82 |
4 | Options Regulatory Surveillance Authority | 78 |
5 | Texas State Securities Board | 77 |
6 | New York Stock Exchange | 77 |
7 | Internal Revenue Service | 71 |
8 | U.S. Attorney's Office for the Southern Distri... | 71 |
9 | U.S. Postal Inspection Service | 69 |