Smaug
Monitors SEC EDGAR Form 4 filings in near real-time, detects insider buy clusters, sends Slack alerts, and optionally executes trades via Alpaca.
The idea is borrowed from insidercopytrading.com. This project is available at insidercopytradingcopy.com.
Architecture
EDGAR (Form 4 feed)
│
▼
ingestion/edgar_poller.py ← polls every 10 min, dedupes by accession
│
▼
ingestion/form4_parser.py ← parses XML, detects 10b5-1 plans
│
▼
db/models.py + db/db.py ← SQLAlchemy ORM: filings, signals, price_cache tables
│
▼
signals/filter_engine.py ← buy-only, exclude 10b5-1, min $50k, role-weighted scoring
signals/cluster_detector.py ← counts unique insiders per ticker in rolling 30-day window
│
├──► alerts/slack_alert.py ← POST to Slack webhook when score ≥ threshold
└──► broker/alpaca_client.py ← paper/live order: 2% position size, 10% per-ticker cap
positions auto-closed after holding period expires
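The cluster-counting step in the diagram can be sketched in a few lines. This is a minimal illustration, not the code in signals/cluster_detector.py: the `cluster_size` function and the dictionary keys (`ticker`, `insider`, `date`) are hypothetical names chosen for the example, and the window is assumed to end on the evaluation date and include both endpoints.

```python
from datetime import date, timedelta

CLUSTER_WINDOW_DAYS = 30  # README default for the rolling window


def cluster_size(filings, ticker, as_of, window_days=CLUSTER_WINDOW_DAYS):
    """Count distinct insiders who bought `ticker` in the window ending at `as_of`.

    `filings` is an iterable of dicts with hypothetical keys
    'ticker', 'insider' and 'date' (a datetime.date).
    """
    start = as_of - timedelta(days=window_days)
    insiders = {
        f["insider"]
        for f in filings
        if f["ticker"] == ticker and start <= f["date"] <= as_of
    }
    return len(insiders)


filings = [
    {"ticker": "ACME", "insider": "A. Smith", "date": date(2024, 3, 1)},
    {"ticker": "ACME", "insider": "B. Jones", "date": date(2024, 3, 15)},
    {"ticker": "ACME", "insider": "A. Smith", "date": date(2024, 3, 20)},  # repeat buyer, counted once
    {"ticker": "ACME", "insider": "C. Lee", "date": date(2024, 1, 5)},     # outside the 30-day window
]

print(cluster_size(filings, "ACME", date(2024, 3, 25)))  # prints 2
```

Using a set means repeat purchases by the same insider do not inflate the cluster size, which matches the "unique insiders" wording above.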
Setup
cp .env.example .env
# edit .env with your credentials
pip install -r requirements.txt
Environment variables (.env)
| Variable | Required | Default | Description |
|---|---|---|---|
| SLACK_WEBHOOK_URL | optional | — | Incoming webhook URL for alerts |
| ALPACA_KEY | optional | — | Alpaca API key |
| ALPACA_SECRET | optional | — | Alpaca API secret |
| ALPACA_BASE_URL | optional | https://paper-api.alpaca.markets | Use paper or live endpoint |
| DB_PATH | optional | insider.db | SQLite database file path |
| DATA_DIR | optional | data/filings | Directory for cached raw XML filings |
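As a rough sketch of how these variables and their defaults might be read, the snippet below uses only `os.environ`. The `load_settings` function and its dictionary keys are hypothetical; the actual config.py may be organised differently. Since python-dotenv is listed in the requirements, the project presumably calls `load_dotenv()` first so that values from .env end up in the environment.

```python
import os


def load_settings(env=os.environ):
    """Read the documented variables, applying the defaults from the table.

    `env` is any mapping; passing a plain dict makes the function easy to test.
    """
    return {
        "slack_webhook_url": env.get("SLACK_WEBHOOK_URL"),  # None disables alerts
        "alpaca_key": env.get("ALPACA_KEY"),
        "alpaca_secret": env.get("ALPACA_SECRET"),
        "alpaca_base_url": env.get(
            "ALPACA_BASE_URL", "https://paper-api.alpaca.markets"
        ),
        "db_path": env.get("DB_PATH", "insider.db"),
        "data_dir": env.get("DATA_DIR", "data/filings"),
    }


settings = load_settings({"DB_PATH": "/tmp/test.db"})
print(settings["db_path"], settings["alpaca_base_url"])
```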
Usage
# Initialize DB and ingest current EDGAR feed (one shot)
python main.py fetch-once
# Run continuous polling loop (every 10 minutes)
python main.py run
# Backtest signals already in the DB against historical prices
python main.py backtest
Key configuration (config.py)
| Parameter | Default | Description |
|---|---|---|
| EDGAR_POLL_INTERVAL | 600 s | Polling cadence |
| MIN_TRANSACTION_VALUE | $50,000 | Ignore buys below this |
| MIN_CLUSTER_SIZE | 1 | Minimum unique insiders before a signal fires |
| CLUSTER_WINDOW_DAYS | 30 | Rolling window for cluster counting |
| HOLDING_PERIOD_DAYS | 90 | Days held per position (backtest + auto-close trigger) |
| POSITION_SIZE_PCT | 2% | Fraction of portfolio per trade |
| MAX_POSITIONS | 20 | Hard position limit |
| SCORE_ALERT_THRESHOLD | 5.0 | Minimum score to trigger Slack alert |
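To show how the sizing limits could interact, here is a hedged sketch that combines POSITION_SIZE_PCT and MAX_POSITIONS with the 10% per-ticker cap mentioned in the architecture diagram. The `order_notional` function and its arguments are hypothetical and are not taken from broker/alpaca_client.py; the order in which the real code applies these checks is an assumption.

```python
def order_notional(
    equity,
    ticker_exposure,
    open_positions,
    size_pct=0.02,        # POSITION_SIZE_PCT
    ticker_cap_pct=0.10,  # per-ticker cap from the architecture diagram
    max_positions=20,     # MAX_POSITIONS
):
    """Return the dollar amount to buy, or 0.0 if a limit blocks the trade."""
    if open_positions >= max_positions:
        return 0.0  # hard position limit reached
    target = equity * size_pct
    # Remaining room under the per-ticker cap (never negative).
    room = max(0.0, equity * ticker_cap_pct - ticker_exposure)
    return min(target, room)


print(order_notional(100_000, ticker_exposure=0, open_positions=3))
print(order_notional(100_000, ticker_exposure=9_500, open_positions=3))
print(order_notional(100_000, ticker_exposure=0, open_positions=20))
```

With $100,000 of equity the first call sizes a full 2% position, the second is trimmed to the room left under the ticker cap, and the third is blocked by the position limit.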
Scoring
score = role_weight × log(total_value) × (1 + 0.5 × (cluster_size − 1))
Role weights: CEO 3.0 · CFO/President 2.5 · COO 2.0 · Director 1.5 · VP 1.2 · 10% owner 1.0
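The formula translates directly into code. The sketch below assumes `log` means the natural logarithm (the README does not state the base) and uses a hypothetical `score` function with a plain dictionary for the role weights; how signals/filter_engine.py maps raw Form 4 titles onto these roles is not shown here.

```python
import math

# Role weights as listed above.
ROLE_WEIGHTS = {
    "CEO": 3.0,
    "CFO": 2.5,
    "President": 2.5,
    "COO": 2.0,
    "Director": 1.5,
    "VP": 1.2,
    "10% owner": 1.0,
}


def score(role, total_value, cluster_size):
    """score = role_weight * log(total_value) * (1 + 0.5 * (cluster_size - 1)).

    Assumption: natural log; total_value is in dollars and must be > 0.
    """
    cluster_bonus = 1 + 0.5 * (cluster_size - 1)
    return ROLE_WEIGHTS[role] * math.log(total_value) * cluster_bonus


solo = score("CEO", 50_000, cluster_size=1)
trio = score("CEO", 50_000, cluster_size=3)
print(round(solo, 2), round(trio, 2))
```

Each additional insider in the cluster adds 50% of the base score, so a three-insider cluster scores exactly twice what a lone buyer with the same role and value would.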
Backtesting
The backtest loads signals from the DB and fetches OHLC data via yfinance. Prices are cached in the price_cache table, so completed date ranges are served entirely from the DB on repeat runs, avoiding redundant network calls. Entry price is the closing price on the first trading day on or after the signal date; exit price is the closing price on the last trading day on or before the exit date. Raw XML filings are cached in DATA_DIR (data/filings/ by default), keyed by accession number.
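The entry and exit rule can be illustrated with a small lookup over a sorted list of trading-day closes. This is a sketch under assumptions: the `entry_exit_prices` function is hypothetical, the exit date is taken to be the signal date plus the holding period in calendar days, and the price data is a plain list rather than the price_cache table.

```python
from bisect import bisect_left, bisect_right
from datetime import date, timedelta


def entry_exit_prices(closes, signal_date, holding_days):
    """Pick entry and exit closes from `closes`, a date-sorted list of (date, close).

    Entry: first trading day on or after the signal date.
    Exit: last trading day on or before signal_date + holding_days (assumed).
    Returns None when the data does not cover the trade.
    """
    days = [d for d, _ in closes]
    exit_date = signal_date + timedelta(days=holding_days)
    i = bisect_left(days, signal_date)     # first index with day >= signal_date
    j = bisect_right(days, exit_date) - 1  # last index with day <= exit_date
    if i >= len(days) or j < i:
        return None
    return closes[i][1], closes[j][1]


closes = [
    (date(2024, 3, 1), 100.0),   # Friday
    (date(2024, 3, 4), 102.0),   # Monday
    (date(2024, 3, 5), 101.0),
    (date(2024, 3, 8), 105.0),   # Friday
    (date(2024, 3, 11), 107.0),  # Monday
]

# Signal on Saturday 2 March, 7-day hold: enter Monday 4 March, exit Friday 8 March.
print(entry_exit_prices(closes, date(2024, 3, 2), 7))  # prints (102.0, 105.0)
```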
The EDGAR poller also skips fetching XML for filings older than the newest filed_date already stored in the DB, so incremental runs only process truly new filings.
Metrics reported: win rate, average return, average alpha vs SPY, Sharpe ratio.
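For orientation, the reported metrics could be computed from per-trade returns roughly as follows. The `summarize` function is hypothetical, and the Sharpe figure here is a simple per-trade ratio (mean over sample standard deviation, zero risk-free rate, no annualisation), which may differ from what backtest/backtest.py does.

```python
from statistics import mean, stdev


def summarize(trade_returns, spy_returns):
    """Summarise paired per-trade returns and SPY returns over the same windows."""
    alphas = [r - s for r, s in zip(trade_returns, spy_returns)]
    wins = sum(1 for r in trade_returns if r > 0)
    # stdev needs at least two observations; treat a single trade as undefined.
    sd = stdev(trade_returns) if len(trade_returns) > 1 else 0.0
    return {
        "win_rate": wins / len(trade_returns),
        "avg_return": mean(trade_returns),
        "avg_alpha": mean(alphas),
        "sharpe": mean(trade_returns) / sd if sd else float("nan"),
    }


trade_returns = [0.10, -0.05, 0.20, 0.05]
spy_returns = [0.02, 0.01, 0.03, 0.02]
print(summarize(trade_returns, spy_returns))
```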
Position lifecycle
Positions are tracked in the signals table. When a trade is executed, executed_at is recorded. On each poll cycle the poller checks for positions where executed_at is older than HOLDING_PERIOD_DAYS and calls Alpaca to close them, marking closed=1 in the DB.
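The expiry check described above amounts to a cutoff comparison. The sketch below works on plain dictionaries with hypothetical keys mirroring the `executed_at` and `closed` columns; the real code queries the signals table through SQLAlchemy and then calls Alpaca, neither of which is reproduced here.

```python
from datetime import datetime, timedelta

HOLDING_PERIOD_DAYS = 90  # README default


def expired_positions(positions, now, holding_days=HOLDING_PERIOD_DAYS):
    """Return open positions whose executed_at is at least `holding_days` old."""
    cutoff = now - timedelta(days=holding_days)
    return [
        p
        for p in positions
        if p["executed_at"] is not None  # a trade was actually placed
        and not p["closed"]              # not already closed
        and p["executed_at"] <= cutoff
    ]


positions = [
    {"ticker": "ACME", "executed_at": datetime(2024, 1, 2), "closed": 0},
    {"ticker": "INIT", "executed_at": datetime(2024, 3, 20), "closed": 0},  # too recent
    {"ticker": "OLDC", "executed_at": datetime(2023, 11, 1), "closed": 1},  # already closed
    {"ticker": "NOEX", "executed_at": None, "closed": 0},                   # never executed
]

due = expired_positions(positions, now=datetime(2024, 4, 5))
print([p["ticker"] for p in due])  # prints ['ACME']
```

After the broker confirms the close, each returned row would be marked closed=1 so it is not picked up again on the next poll cycle.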
Modules
| Path | Purpose |
|---|---|
| config.py | All thresholds and env-var loading |
| ingestion/edgar_poller.py | EDGAR Atom feed polling and deduplication |
| ingestion/form4_parser.py | Form 4 XML → structured dict; 10b5-1 detection |
| db/models.py | SQLAlchemy ORM models (Filing, Signal, PriceCache) |
| db/db.py | DB access layer (SQLAlchemy sessions) |
| signals/filter_engine.py | Filing → signal pipeline |
| signals/cluster_detector.py | Cluster detection from DB |
| alerts/slack_alert.py | Slack webhook alert |
| broker/alpaca_client.py | Alpaca order execution + position exit |
| backtest/backtest.py | Historical backtest runner |
| main.py | CLI entry point |
Requirements
- Python 3.11+
- See requirements.txt: requests, lxml, yfinance, python-dotenv, alpaca-trade-api, sqlalchemy