smaug/README.md
2026-05-04 20:02:54 +02:00



Smaug

Monitors SEC EDGAR Form 4 filings in near real-time, detects insider buy clusters, sends Slack alerts, and optionally executes trades via Alpaca.
The idea is borrowed from insidercopytrading.com. This project is available at insidercopytradingcopy.com.

Architecture

EDGAR (Form 4 feed)
      │
      ▼
ingestion/edgar_poller.py   ← polls every 10 min, dedupes by accession
      │
      ▼
ingestion/form4_parser.py   ← parses XML, detects 10b5-1 plans
      │
      ▼
db/models.py + db/db.py     ← SQLAlchemy ORM: filings, signals, price_cache tables
      │
      ▼
signals/filter_engine.py    ← buy-only, exclude 10b5-1, min $50k, role-weighted scoring
signals/cluster_detector.py ← counts unique insiders per ticker in rolling 30-day window
      │
      ├──► alerts/slack_alert.py   ← POST to Slack webhook when score ≥ threshold
      └──► broker/alpaca_client.py ← paper/live order: 2% position size, 10% per-ticker cap
                                        positions auto-closed after holding period expires
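
The rolling-window count performed by signals/cluster_detector.py can be pictured with a small sketch. This is not the project's code: the function name `cluster_size`, the dict keys (`ticker`, `insider`, `filed_date`), and the inclusive window boundaries are all assumptions made for illustration.

```python
from datetime import date, timedelta

def cluster_size(filings, ticker, as_of, window_days=30):
    """Count unique insiders who bought `ticker` in the window ending at `as_of`.

    `filings` is a list of dicts with hypothetical keys: ticker, insider, filed_date.
    """
    start = as_of - timedelta(days=window_days)
    insiders = {
        f["insider"]
        for f in filings
        if f["ticker"] == ticker and start <= f["filed_date"] <= as_of
    }
    return len(insiders)

filings = [
    {"ticker": "ACME", "insider": "A. Smith", "filed_date": date(2026, 4, 20)},
    {"ticker": "ACME", "insider": "B. Jones", "filed_date": date(2026, 4, 28)},
    {"ticker": "ACME", "insider": "A. Smith", "filed_date": date(2026, 5, 1)},   # repeat buyer, counted once
    {"ticker": "ACME", "insider": "C. Lee", "filed_date": date(2026, 3, 1)},     # outside the window
    {"ticker": "OTHR", "insider": "D. Kim", "filed_date": date(2026, 5, 1)},     # different ticker
]
print(cluster_size(filings, "ACME", date(2026, 5, 4)))
```

Using a set means repeat purchases by the same insider do not inflate the cluster size.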

Setup

cp .env.example .env
# edit .env with your credentials
pip install -r requirements.txt

Environment variables (.env)

| Variable | Required | Default | Description |
|---|---|---|---|
| SLACK_WEBHOOK_URL | optional | | Incoming webhook URL for alerts |
| ALPACA_KEY | optional | | Alpaca API key |
| ALPACA_SECRET | optional | | Alpaca API secret |
| ALPACA_BASE_URL | optional | https://paper-api.alpaca.markets | Use paper or live endpoint |
| DB_PATH | optional | insider.db | SQLite database file path |
| DATA_DIR | optional | data/filings | Directory for cached raw XML filings |
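
As a rough illustration of how these variables and their defaults might be resolved, here is a minimal sketch that reads from any mapping (such as `os.environ`). The function `load_settings` and the `alerts_enabled` / `trading_enabled` flags are hypothetical; the project's own config.py may load the .env file differently (for example via python-dotenv).

```python
import os

# Defaults taken from the table above.
DEFAULTS = {
    "ALPACA_BASE_URL": "https://paper-api.alpaca.markets",
    "DB_PATH": "insider.db",
    "DATA_DIR": "data/filings",
}

def load_settings(env=None):
    """Resolve settings from a mapping, falling back to the documented defaults."""
    env = os.environ if env is None else env
    settings = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    # Credentials have no default; treat empty strings as unset.
    for name in ("SLACK_WEBHOOK_URL", "ALPACA_KEY", "ALPACA_SECRET"):
        settings[name] = env.get(name) or None
    # Hypothetical convenience flags: each optional feature is on only when configured.
    settings["alerts_enabled"] = settings["SLACK_WEBHOOK_URL"] is not None
    settings["trading_enabled"] = bool(settings["ALPACA_KEY"] and settings["ALPACA_SECRET"])
    return settings

print(load_settings({})["ALPACA_BASE_URL"])
```

Because every variable is optional, a sketch like this lets the pipeline run ingestion and scoring alone when neither Slack nor Alpaca credentials are present.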

Usage

# Initialize DB and ingest current EDGAR feed (one shot)
python main.py fetch-once

# Run continuous polling loop (every 10 minutes)
python main.py run

# Backtest signals already in the DB against historical prices
python main.py backtest

Key configuration (config.py)

| Parameter | Default | Description |
|---|---|---|
| EDGAR_POLL_INTERVAL | 600 s | Polling cadence |
| MIN_TRANSACTION_VALUE | $50,000 | Ignore buys below this |
| MIN_CLUSTER_SIZE | 1 | Minimum unique insiders before a signal fires |
| CLUSTER_WINDOW_DAYS | 30 | Rolling window for cluster counting |
| HOLDING_PERIOD_DAYS | 90 | Days held per position (backtest + auto-close trigger) |
| POSITION_SIZE_PCT | 2% | Fraction of portfolio per trade |
| MAX_POSITIONS | 20 | Hard position limit |
| SCORE_ALERT_THRESHOLD | 5.0 | Minimum score to trigger Slack alert |
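
To show how the sizing limits could interact (2% per trade, the 10% per-ticker cap from the architecture diagram, and the 20-position limit), here is a minimal sketch. The function `order_notional` and its arguments are hypothetical, and how broker/alpaca_client.py actually combines these limits is an assumption.

```python
def order_notional(equity, ticker_exposure, open_positions,
                   position_size_pct=0.02, ticker_cap_pct=0.10, max_positions=20):
    """Dollar amount to buy, or 0.0 if a limit blocks the trade.

    equity: total portfolio value; ticker_exposure: dollars already held in this ticker.
    """
    if open_positions >= max_positions:
        return 0.0  # hard position limit reached
    target = equity * position_size_pct               # standard 2% slice
    room = equity * ticker_cap_pct - ticker_exposure  # remaining headroom under the 10% cap
    return max(0.0, min(target, room))

print(order_notional(100_000, 0, 0))      # fresh ticker, full-size order
print(order_notional(100_000, 9_500, 5))  # nearly at the per-ticker cap, order is trimmed
print(order_notional(100_000, 0, 20))     # position limit hit, no order
```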

Scoring

score = role_weight × log(total_value) × (1 + 0.5 × (cluster_size − 1))

Role weights: CEO 3.0 · CFO/President 2.5 · COO 2.0 · Director 1.5 · VP 1.2 · 10% owner 1.0
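
The formula and role weights translate into a few lines of Python. This sketch assumes the natural logarithm, since the README does not state the base (a different base rescales every score and therefore changes how SCORE_ALERT_THRESHOLD bites). The names `ROLE_WEIGHTS` and `score` are illustrative and may not match signals/filter_engine.py.

```python
import math

# Role weights as listed above; CFO and President share a weight.
ROLE_WEIGHTS = {
    "CEO": 3.0,
    "CFO": 2.5,
    "President": 2.5,
    "COO": 2.0,
    "Director": 1.5,
    "VP": 1.2,
    "10% owner": 1.0,
}

def score(role, total_value, cluster_size):
    """role_weight * log(total_value) * (1 + 0.5 * (cluster_size - 1)).

    Assumes natural log; total_value is in dollars and must be positive.
    """
    cluster_boost = 1 + 0.5 * (cluster_size - 1)
    return ROLE_WEIGHTS[role] * math.log(total_value) * cluster_boost

solo = score("Director", 100_000, cluster_size=1)
trio = score("Director", 100_000, cluster_size=3)
print(trio / solo)  # each additional insider adds 0.5 to the multiplier
```

With a cluster of one, the multiplier is exactly 1, so a lone buyer is scored on role and transaction size alone.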

Backtesting

The backtest loads signals from the DB and fetches OHLC data via yfinance. Prices are cached in the price_cache table, so date ranges that are already complete are served entirely from the DB on repeat runs, with no redundant network calls. The entry price is the closing price on the first trading day on or after the signal date; the exit price is the closing price on the last trading day on or before the exit date. Raw XML filings are cached in DATA_DIR (data/filings/ by default), keyed by accession number.
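
The entry and exit rule can be sketched with a binary search over the available trading days. This is an illustration only: `entry_exit_closes` is a hypothetical helper, and the prices here come from a plain dict rather than from yfinance or the price_cache table.

```python
from bisect import bisect_left, bisect_right
from datetime import date

def entry_exit_closes(closes, signal_date, exit_date):
    """Return (entry_close, exit_close), or None if no valid pair exists.

    closes: dict mapping each trading day (date) to its closing price.
    """
    days = sorted(closes)
    i = bisect_left(days, signal_date)     # first trading day on or after the signal date
    j = bisect_right(days, exit_date) - 1  # last trading day on or before the exit date
    if i >= len(days) or j < i:
        return None
    return closes[days[i]], closes[days[j]]

closes = {
    date(2026, 5, 1): 10.0,   # Friday
    date(2026, 5, 4): 10.5,   # Monday
    date(2026, 5, 5): 11.0,
    date(2026, 5, 8): 12.0,   # Friday
    date(2026, 5, 11): 11.5,  # Monday
}
# Signal on a Saturday, exit date on a Sunday: both snap to adjacent trading days.
print(entry_exit_closes(closes, date(2026, 5, 2), date(2026, 5, 10)))
```

Snapping the entry forward and the exit backward keeps the simulated holding window inside the nominal one, so the backtest never uses a price from after the exit date.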

The EDGAR poller also skips fetching XML for filings older than the newest filed_date already stored in the DB, so incremental runs only process truly new filings.
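
A minimal sketch of that incremental filter, combined with the accession-number dedupe mentioned in the architecture diagram, might look like this. The function `filings_to_fetch` and the entry keys (`accession`, `filed_date`) are assumptions, not the poller's real interface.

```python
from datetime import date

def filings_to_fetch(feed_entries, seen_accessions, newest_filed_date):
    """Keep only feed entries whose XML still needs to be downloaded."""
    todo = []
    for entry in feed_entries:
        if entry["accession"] in seen_accessions:
            continue  # already ingested
        if newest_filed_date is not None and entry["filed_date"] < newest_filed_date:
            continue  # strictly older than the newest stored filing
        todo.append(entry)
    return todo

feed = [
    {"accession": "A-1", "filed_date": date(2026, 5, 1)},
    {"accession": "A-2", "filed_date": date(2026, 5, 3)},
    {"accession": "A-3", "filed_date": date(2026, 5, 4)},
    {"accession": "A-4", "filed_date": date(2026, 5, 4)},
]
new = filings_to_fetch(feed, seen_accessions={"A-3"}, newest_filed_date=date(2026, 5, 4))
print([e["accession"] for e in new])
```

Entries filed on the same day as the newest stored filing are kept in this sketch and left to the accession check, so a same-day filing that arrives late is not dropped.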

Metrics reported: win rate, average return, average alpha vs SPY, Sharpe ratio.
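
These four metrics could be computed from per-trade returns roughly as follows. The sketch assumes a per-trade Sharpe ratio with a zero risk-free rate and no annualization, and alpha as the simple difference from SPY's return over the same holding window; backtest/backtest.py may define either differently. The function name `summarize` is hypothetical.

```python
import math
from statistics import mean, stdev

def summarize(trade_returns, spy_returns):
    """Summarize fractional returns (0.10 means +10%), paired trade-by-trade with SPY."""
    alphas = [t - s for t, s in zip(trade_returns, spy_returns)]
    sd = stdev(trade_returns) if len(trade_returns) > 1 else 0.0
    return {
        "win_rate": sum(r > 0 for r in trade_returns) / len(trade_returns),
        "avg_return": mean(trade_returns),
        "avg_alpha": mean(alphas),
        # Assumption: unannualized, zero risk-free rate; undefined without dispersion.
        "sharpe": mean(trade_returns) / sd if sd else math.nan,
    }

stats = summarize([0.10, -0.05, 0.20, 0.05], [0.02, 0.01, 0.05, 0.00])
print(stats)
```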

Position lifecycle

Positions are tracked in the signals table. When a trade is executed, executed_at is recorded. On each poll cycle, the poller looks for positions whose executed_at is more than HOLDING_PERIOD_DAYS in the past, calls Alpaca to close them, and marks them closed=1 in the DB.
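
The expiry check can be sketched as a simple filter over signal rows. The row shape (dicts with `executed_at` and `closed`) and the function `positions_to_close` are assumptions; the real code presumably queries the signals table through SQLAlchemy and then calls Alpaca for each match.

```python
from datetime import datetime, timedelta

def positions_to_close(signals, now, holding_period_days=90):
    """Return executed, still-open signals whose holding period has expired."""
    cutoff = now - timedelta(days=holding_period_days)
    return [
        s for s in signals
        if s["executed_at"] is not None   # a trade was actually placed
        and not s["closed"]               # not already closed
        and s["executed_at"] < cutoff     # held longer than the holding period
    ]

now = datetime(2026, 5, 4, 20, 0)
signals = [
    {"id": 1, "executed_at": datetime(2026, 1, 15, 15, 0), "closed": 0},  # expired
    {"id": 2, "executed_at": datetime(2026, 4, 1, 15, 0), "closed": 0},   # still within 90 days
    {"id": 3, "executed_at": datetime(2026, 1, 2, 15, 0), "closed": 1},   # already closed
    {"id": 4, "executed_at": None, "closed": 0},                          # alert only, never traded
]
print([s["id"] for s in positions_to_close(signals, now)])
```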

Modules

| Path | Purpose |
|---|---|
| config.py | All thresholds and env-var loading |
| ingestion/edgar_poller.py | EDGAR Atom feed polling and deduplication |
| ingestion/form4_parser.py | Form 4 XML → structured dict; 10b5-1 detection |
| db/models.py | SQLAlchemy ORM models (Filing, Signal, PriceCache) |
| db/db.py | DB access layer (SQLAlchemy sessions) |
| signals/filter_engine.py | Filing → signal pipeline |
| signals/cluster_detector.py | Cluster detection from DB |
| alerts/slack_alert.py | Slack webhook alert |
| broker/alpaca_client.py | Alpaca order execution + position exit |
| backtest/backtest.py | Historical backtest runner |
| main.py | CLI entry point |

Requirements

  • Python 3.11+
  • See requirements.txt: requests, lxml, yfinance, python-dotenv, alpaca-trade-api, sqlalchemy