Monitors SEC EDGAR Form 4 filings in near real-time, detects insider buy clusters, sends Slack alerts, and optionally executes trades via Alpaca. Copying the idea from https://insidercopytrading.com
Dominik Roth 56ec0b4a81 docs: linkify insidercopytrading.com, fix punctuation, add compliment
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 18:03:09 +02:00


Smaug

Monitors SEC EDGAR Form 4 filings in near real-time, detects insider buy clusters, sends Slack alerts, and optionally executes trades via Alpaca. Copying the idea from insidercopytrading.com. Available at insidercopytradingcopy.com.

Architecture

EDGAR (Form 4 feed)
      |
      v
ingestion/edgar_poller.py    -- polls every 10 min, dedupes by accession
ingestion/sec_bulk_ingest.py -- bulk historical ingest via quarterly form.idx archives
      |
      v
ingestion/form4_parser.py    -- parses XML, detects 10b5-1 plans, extracts tx_code
      |
      v
db/models.py + db/db.py      -- SQLAlchemy ORM: filings, signals, price_cache tables
      |
      v
signals/filter_engine.py     -- buy-only, open-market (P) only, exclude 10b5-1,
signals/cluster_detector.py    min $50k, role-weighted scoring, as-of-date aware
      |
      +---> alerts/slack_alert.py   -- POST to Slack webhook when score >= threshold
      +---> broker/alpaca_client.py -- paper/live order (NOT FULLY IMPLEMENTED -- see Results)

backtest/backtest.py         -- per-signal return / alpha vs SPY
backtest/simulate.py         -- portfolio simulation with configurable transaction costs
backtest/plot.py             -- HP sweep heatmap + equity curve plots
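The filter stage in the diagram above (buy-only, open-market `P` only, exclude 10b5-1, min $50k) can be sketched roughly as follows. This is a minimal illustration, not the code in signals/filter_engine.py: the `Transaction` dataclass, its field names, and `passes_filters` are hypothetical.

```python
from dataclasses import dataclass

MIN_TRANSACTION_VALUE = 50_000  # USD, mirrors the "min $50k" rule


@dataclass
class Transaction:
    """Hypothetical, simplified view of one parsed Form 4 transaction."""
    ticker: str
    insider: str
    tx_code: str      # Form 4 transaction code; "P" is an open-market purchase
    acquired: bool    # True for an acquisition, False for a disposal
    shares: float
    price: float
    is_10b5_1: bool   # True if the parser flagged a 10b5-1 trading plan

    @property
    def value(self) -> float:
        """Dollar value of the transaction."""
        return self.shares * self.price


def passes_filters(tx: Transaction, min_value: float = MIN_TRANSACTION_VALUE) -> bool:
    """Keep only discretionary open-market buys at or above min_value."""
    return (
        tx.acquired
        and tx.tx_code == "P"
        and not tx.is_10b5_1
        and tx.value >= min_value
    )
```

A $100k open-market purchase outside a 10b5-1 plan passes; the same purchase made under a plan, or an option exercise (code `M`), does not.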

Usage

pip install -r requirements.txt
cp .env.example .env  # fill in credentials

# Live polling (every 10 min)
python main.py run

# Bulk-ingest historical filings
python main.py backfill --years 2023 2024
python main.py backfill --year 2024 --quarter 1

# Per-signal backtest: win rate, alpha vs SPY
python main.py backtest

# Portfolio simulation with transaction cost modelling
python main.py simulate [options]

# Generate HP heatmap + equity curve plots (saves to plots/)
python main.py plot
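The backfill commands above work from EDGAR's quarterly form.idx archives. As a rough sketch of how a year/quarter selection maps onto index files: the URL layout below is EDGAR's public full-index path, but whether ingestion/sec_bulk_ingest.py builds its URLs this way is an assumption, and `form_idx_urls` is a hypothetical helper.

```python
EDGAR_FULL_INDEX = "https://www.sec.gov/Archives/edgar/full-index"


def form_idx_urls(years, quarter=None):
    """Return form.idx URLs for the given years.

    If quarter is None, all four quarters of each year are included.
    """
    quarters = [1, 2, 3, 4] if quarter is None else [quarter]
    for q in quarters:
        if q not in (1, 2, 3, 4):
            raise ValueError(f"quarter must be 1-4, got {q!r}")
    return [
        f"{EDGAR_FULL_INDEX}/{year}/QTR{q}/form.idx"
        for year in years
        for q in quarters
    ]


if __name__ == "__main__":
    # Roughly what `backfill --year 2024 --quarter 1` would need to fetch.
    for url in form_idx_urls([2024], quarter=1):
        print(url)
```

The sketch only builds URLs. Any real fetch should send a descriptive User-Agent header, which the SEC asks of automated clients.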

Simulate options

Strategy:
  --holding-days N      Days to hold each position (default: 7)
  --buy-delay N         Days after signal to enter (default: 1)
  --position-size F     Fraction of available cash per trade (default: 0.10)
  --min-score F         Minimum signal score (default: 0.0)
  --min-cluster N       Minimum cluster size (default: 1)
  --capital F           Initial capital (default: 100000)

Transaction costs:
  --spread F            One-way bid-ask half-spread at entry and exit (default: 0.003)
  --slippage F          Entry slippage / market impact (default: 0.002)
  --commission F        Per-trade commission as fraction of notional (default: 0.001)

Round-trip = spread x 2 + slippage + commission x 2.
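As a worked instance of the round-trip formula, here is a small sketch using the defaults listed above. The function name `round_trip_cost` is illustrative and not taken from backtest/simulate.py.

```python
def round_trip_cost(spread: float = 0.003,
                    slippage: float = 0.002,
                    commission: float = 0.001) -> float:
    """Round-trip cost as a fraction of notional.

    The half-spread and the commission are paid on both entry and exit;
    slippage is charged once, at entry.
    """
    return spread * 2 + slippage + commission * 2


if __name__ == "__main__":
    # Defaults: 0.6% spread + 0.2% slippage + 0.2% commission.
    print(f"default:         {round_trip_cost():.2%}")
    # Same spread and slippage with zero commission.
    print(f"zero commission: {round_trip_cost(commission=0.0):.2%}")
```

With the defaults this comes to about 1.0% per round trip; setting the commission to zero leaves about 0.8%.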

Setup

cp .env.example .env
pip install -r requirements.txt

| Variable | Default | Description |
|----------|---------|-------------|
| SLACK_WEBHOOK_URL | | Incoming webhook URL for alerts |
| ALPACA_KEY | | Alpaca API key |
| ALPACA_SECRET | | Alpaca API secret |
| ALPACA_BASE_URL | https://paper-api.alpaca.markets | Paper or live endpoint |
| DB_PATH | insider.db | SQLite database path |
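One way to read these variables with the listed defaults is sketched below. It assumes the values are already present in the process environment (for example, exported by a dotenv loader); how config.py actually reads .env is not shown here, and `load_settings` is a hypothetical name.

```python
import os


def load_settings(env=None) -> dict:
    """Read the variables listed above, falling back to the documented defaults.

    `env` is any mapping; it defaults to os.environ.
    """
    env = os.environ if env is None else env
    return {
        "SLACK_WEBHOOK_URL": env.get("SLACK_WEBHOOK_URL", ""),
        "ALPACA_KEY": env.get("ALPACA_KEY", ""),
        "ALPACA_SECRET": env.get("ALPACA_SECRET", ""),
        "ALPACA_BASE_URL": env.get("ALPACA_BASE_URL", "https://paper-api.alpaca.markets"),
        "DB_PATH": env.get("DB_PATH", "insider.db"),
    }


if __name__ == "__main__":
    settings = load_settings()
    # Avoid printing secrets; only show which endpoint and DB are in use.
    print(settings["ALPACA_BASE_URL"], settings["DB_PATH"])
```

Passing a plain dict instead of `os.environ` makes the defaults easy to check in isolation.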

Key config (config.py)

| Parameter | Default | Description |
|-----------|---------|-------------|
| MIN_TRANSACTION_VALUE | $50,000 | Ignore buys below this |
| MIN_CLUSTER_SIZE | 1 | Unique insiders before a signal fires |
| CLUSTER_WINDOW_DAYS | 30 | Rolling window for cluster counting |
| HOLDING_PERIOD_DAYS | 90 | Days held per position |
| POSITION_SIZE_PCT | 2% | Fraction of portfolio per trade |
| SCORE_ALERT_THRESHOLD | 5.0 | Minimum score to trigger alert |
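To illustrate how MIN_CLUSTER_SIZE and CLUSTER_WINDOW_DAYS interact, here is a minimal sketch of as-of-date cluster counting. It is not the code in signals/cluster_detector.py: the tuple layout, the function names, and the exact window boundaries (exclusive start, inclusive end) are assumptions.

```python
from datetime import date, timedelta

CLUSTER_WINDOW_DAYS = 30
MIN_CLUSTER_SIZE = 1


def cluster_size(buys, ticker, as_of, window_days=CLUSTER_WINDOW_DAYS):
    """Count unique insiders who bought `ticker` in the window ending at `as_of`.

    `buys` is an iterable of (ticker, insider, filing_date) tuples. Filings
    dated after `as_of` are ignored, so a backtest cannot look ahead.
    """
    start = as_of - timedelta(days=window_days)
    insiders = {
        insider
        for t, insider, filed in buys
        if t == ticker and start < filed <= as_of
    }
    return len(insiders)


def signal_fires(buys, ticker, as_of, min_cluster=MIN_CLUSTER_SIZE):
    """True once enough distinct insiders have bought within the window."""
    return cluster_size(buys, ticker, as_of) >= min_cluster


if __name__ == "__main__":
    # Illustrative data only.
    buys = [
        ("ACME", "alice", date(2024, 3, 1)),
        ("ACME", "bob", date(2024, 3, 10)),
        ("ACME", "alice", date(2024, 3, 12)),  # repeat buyer, counted once
        ("ACME", "carol", date(2024, 4, 20)),  # after the as-of date below
    ]
    print(cluster_size(buys, "ACME", as_of=date(2024, 3, 15)))
```

Counting unique insiders rather than filings means one executive buying repeatedly does not inflate the cluster.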

Scoring

score = role_weight * log(total_value) * (1 + 0.5 * (cluster_size - 1))

Role weights: CEO 3.0, CFO/President 2.5, COO 2.0, Director 1.5, VP 1.2, 10% owner 1.0
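A direct transcription of the scoring formula is sketched below. Two things here are assumptions rather than facts about this codebase: the README does not state the base of the logarithm (the sketch uses the natural log), and it does not say which role weight applies when a cluster mixes roles (the sketch takes a single role per signal). The names `ROLE_WEIGHTS`, `score`, and `should_alert` are illustrative.

```python
import math

# Role weights as listed above; the key spellings are illustrative.
ROLE_WEIGHTS = {
    "CEO": 3.0,
    "CFO": 2.5,
    "President": 2.5,
    "COO": 2.0,
    "Director": 1.5,
    "VP": 1.2,
    "10% owner": 1.0,
}
SCORE_ALERT_THRESHOLD = 5.0


def score(role: str, total_value: float, cluster_size: int) -> float:
    """score = role_weight * log(total_value) * (1 + 0.5 * (cluster_size - 1)).

    Assumption: natural logarithm.
    """
    cluster_bonus = 1 + 0.5 * (cluster_size - 1)
    return ROLE_WEIGHTS[role] * math.log(total_value) * cluster_bonus


def should_alert(s: float, threshold: float = SCORE_ALERT_THRESHOLD) -> bool:
    """Alert when the score reaches the threshold."""
    return s >= threshold


if __name__ == "__main__":
    s = score("CEO", 250_000, cluster_size=3)
    print(f"score={s:.2f} alert={should_alert(s)}")
```

Each additional insider in the cluster adds half of the single-insider score, so a three-insider cluster scores twice what a lone buyer of the same role and value would.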

No Hosted Version

There is no hosted version of Smaug. You have to run it yourself. See Usage, then check the Results to decide if you actually want to.

Results

16,279 signals from 302k Form 4 filings (2020-2025).

Per-signal stats (pre-cost)

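| Hold | Avg return | Alpha vs SPY | Sharpe | Win rate |
|------|------------|--------------|--------|----------|
| 3d | +0.61% | +0.52% | ~0.80 | ~53% |
| 7d | +1.19% | +0.68% | ~1.05 | ~54% |
| 14d | +1.41% | +0.55% | ~0.90 | ~54% |
| 30d | +1.89% | +0.41% | ~0.70 | ~54% |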

The signal exists. It just does not survive transaction costs.

Portfolio simulation (7d hold, 1d delay, 10% of cash per signal)

[Figure: HP Sweep]

[Figure: Equity Curves]

Alpaca charges $0 commission on US equities. Real costs are spread + slippage only:

| Scenario | RT cost | Ann. return | vs SPY |
|----------|---------|-------------|--------|
| Theoretical (no costs) | 0% | +177% | +151% |
| Alpaca, large-cap | ~0.2% | ~+20% | ~+4% |
| Alpaca, mid-cap | ~0.5% | ~+5% | -11% |
| Alpaca, small-cap | ~0.7-1.0% | -1% to -8% | -17% to -24% |

SPY annualised over the same period: ~+16%.

Break-even is roughly 0.3-0.5% round-trip. On Alpaca that means large-cap stocks only -- but most insider buying happens in small and mid-cap names, so filtering aggressively kills signal count.

Is insidercopytrading.com a scam?

Kind of, yes.

Their website shows backtested returns that significantly outperform the market. Those numbers are real in the sense that the simulation ran correctly. They are not real in the sense that you could ever achieve them:

  • Same-day entry. Form 4 filings are submitted after market close or intraday. By the time you see the filing and place an order, the earliest realistic entry is the next morning's open. Their simulations use the closing price on the filing date -- a price you cannot buy at.
  • No spread or slippage. They assume you transact at the closing mid-price with zero friction. In reality, on the small-cap and micro-cap stocks where most insider buying happens, the bid-ask spread alone is 0.3-0.8% each way.
  • No market impact. Their signals all execute at the same price regardless of how many people are following the service. If a meaningful number of subscribers act on the same signal, they move the stock against themselves.
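The first two points can be made concrete with a small sketch that computes the same trade two ways: entered at the filing-day close with no costs, and entered at the next open with a round-trip cost subtracted. All prices below are made up for illustration, and `trade_return` is a hypothetical helper, not part of backtest/simulate.py.

```python
def trade_return(entry_price: float, exit_price: float,
                 round_trip_cost: float = 0.0) -> float:
    """Simple return of one long trade, net of a round-trip cost fraction."""
    return exit_price / entry_price - 1.0 - round_trip_cost


if __name__ == "__main__":
    # Illustrative prices only, not taken from any real filing.
    filing_day_close = 10.00   # the price an idealized backtest enters at
    next_day_open = 10.15      # the earliest price a follower can realistically get
    exit_price = 10.30

    idealized = trade_return(filing_day_close, exit_price)
    realistic = trade_return(next_day_open, exit_price, round_trip_cost=0.008)

    print(f"idealized: {idealized:.2%}")
    print(f"realistic: {realistic:.2%}")
```

Both effects push in the same direction: a later entry after the price has already moved shrinks the gross return, and the round-trip cost is then taken out of what remains.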

Under realistic assumptions, with a 1-day entry delay and real bid-ask costs on Alpaca, our simulation shows the strategy underperforms SPY across all tested holding periods and produces negative absolute returns for any round-trip cost above ~0.5%. For the small and mid-cap stocks that dominate insider buying signals, you will not get round-trip costs below 0.5%.

This is not a unique failure of this implementation. It is a fundamental property of the strategy: the edge (~0.7% per 7-day trade) is smaller than the friction of executing it in real markets. insidercopytrading.com either does not know this or does not want you to know it. Either way, they are charging a subscription for backtested numbers that cannot be reproduced with real money. Their website is rather pretty though.

Alpaca integration exists in the codebase (broker/alpaca_client.py) but is not fully implemented or tested, for the above reason. Wiring up live execution to a strategy that burns money seemed like a bad idea.

Modules

| Path | Purpose |
|------|---------|
| config.py | Thresholds and env-var loading |
| ingestion/edgar_poller.py | EDGAR Atom feed polling |
| ingestion/sec_bulk_ingest.py | Bulk historical ingest via form.idx |
| ingestion/form4_parser.py | Form 4 XML parser; 10b5-1 detection |
| db/models.py | SQLAlchemy ORM models |
| db/db.py | DB access layer |
| signals/filter_engine.py | Filing to signal pipeline |
| signals/cluster_detector.py | Cluster detection |
| alerts/slack_alert.py | Slack webhook |
| broker/alpaca_client.py | Alpaca order execution |
| backtest/backtest.py | Per-signal backtest |
| backtest/simulate.py | Portfolio simulator |
| backtest/plot.py | Plot generator |
| main.py | CLI: run / backfill / backtest / simulate / plot |

Requirements

Python 3.11+. See requirements.txt.