feat: HP sweep heatmap + equity curve plots, scam analysis in README

- backtest/plot.py: generates two plots saved to plots/
  - hp_sweep.png: 7x7 heatmap of holding_days x round-trip cost, showing
    annualised excess vs SPY and raw annualised return per cell
  - equity_curves.png: portfolio equity vs SPY for 4 cost scenarios
- backtest/simulate.py: accept pre-loaded prices dict to avoid reloading
  on every sweep iteration; return equity_curve in result
- main.py: add `plot` command
- README: updated results section with Alpaca-specific cost breakdown
  (zero commission, costs are spread+slippage only); added honest analysis
  of why insidercopytrading.com-style services show outperformance that
  cannot be replicated in practice; note Alpaca integration not finished

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Dominik Moritz Roth 2026-05-26 17:59:18 +02:00
parent 4d111e0a3a
commit 399f69b817
6 changed files with 355 additions and 147 deletions

264
README.md

@ -4,222 +4,196 @@
<b>Smaug</b> <b>Smaug</b>
</h1> </h1>
Monitors SEC EDGAR Form 4 filings in near real-time, detects insider buy clusters, sends Slack alerts, and optionally executes trades via Alpaca.
Copying the idea from [insidercopytrading.com](https://insidercopytrading.com/). Available at [insidercopytradingcopy.com](#no-hosted-version).
Monitors SEC EDGAR Form 4 filings in near real-time, detects insider buy clusters, sends Slack alerts, and optionally executes trades via Alpaca. ## No Hosted Version
Copying the idea from [insidercopytrading.com](https://insidercopytrading.com/). Available at [insidercopytradingcopy.com](https://www.youtube.com/watch?v=dQw4w9WgXcQ)
There is no hosted version of Smaug. You have to run it yourself.
You probably should not bother. After modelling realistic transaction costs, the strategy **underperforms SPY** in all tested configurations. See the [results](#results).
If you still want to run it, see [Usage](#usage).
## Architecture ## Architecture
``` ```
EDGAR (Form 4 feed) EDGAR (Form 4 feed)
|
v
ingestion/edgar_poller.py polls every 10 min, dedupes by accession ingestion/edgar_poller.py -- polls every 10 min, dedupes by accession
ingestion/sec_bulk_ingest.py bulk historical ingest via quarterly form.idx archives ingestion/sec_bulk_ingest.py -- bulk historical ingest via quarterly form.idx archives
|
v
ingestion/form4_parser.py parses XML, detects 10b5-1 plans, extracts tx_code ingestion/form4_parser.py -- parses XML, detects 10b5-1 plans, extracts tx_code
|
v
db/models.py + db/db.py SQLAlchemy ORM: filings, signals, price_cache tables db/models.py + db/db.py -- SQLAlchemy ORM: filings, signals, price_cache tables
|
v
signals/filter_engine.py buy-only, open-market (P) only, exclude 10b5-1, signals/filter_engine.py -- buy-only, open-market (P) only, exclude 10b5-1,
signals/cluster_detector.py min $50k, role-weighted scoring, as-of-date aware signals/cluster_detector.py min $50k, role-weighted scoring, as-of-date aware
|
├──► alerts/slack_alert.py ← POST to Slack webhook when score ≥ threshold +---> alerts/slack_alert.py -- POST to Slack webhook when score >= threshold
└──► broker/alpaca_client.py ← paper/live order: 2% position size, 10% per-ticker cap +---> broker/alpaca_client.py -- paper/live order (NOT FULLY IMPLEMENTED -- see Results)
positions auto-closed after holding period expires
backtest/backtest.py ← per-signal return / alpha vs SPY analysis backtest/backtest.py -- per-signal return / alpha vs SPY
backtest/simulate.py ← realistic portfolio simulation with transaction costs backtest/simulate.py -- portfolio simulation with configurable transaction costs
backtest/plot.py -- HP sweep heatmap + equity curve plots
``` ```
## Setup
```bash
cp .env.example .env
# edit .env with your credentials
pip install -r requirements.txt
```
### Environment variables (`.env`)
| Variable | Required | Default | Description |
|---|---|---|---|
| `SLACK_WEBHOOK_URL` | optional | — | Incoming webhook URL for alerts |
| `ALPACA_KEY` | optional | — | Alpaca API key |
| `ALPACA_SECRET` | optional | — | Alpaca API secret |
| `ALPACA_BASE_URL` | optional | `https://paper-api.alpaca.markets` | Use paper or live endpoint |
| `DB_PATH` | optional | `insider.db` | SQLite database file path |
| `DATA_DIR` | optional | `data/filings` | Directory for cached raw XML filings |
## Usage ## Usage
```bash ```bash
# Initialize DB and start continuous polling (every 10 minutes) pip install -r requirements.txt
cp .env.example .env # fill in credentials
# Live polling (every 10 min)
python main.py run python main.py run
# Bulk-ingest historical Form 4 filings from SEC EDGAR quarterly archives # Bulk-ingest historical filings
python main.py backfill --years 2023 2024 # full year range python main.py backfill --years 2023 2024
python main.py backfill --year 2024 --quarter 1 # single quarter python main.py backfill --year 2024 --quarter 1
# Per-signal backtest: win rate, alpha vs SPY # Per-signal backtest: win rate, alpha vs SPY
python main.py backtest python main.py backtest
# Portfolio simulation with configurable strategy and cost params # Portfolio simulation with transaction cost modelling
python main.py simulate [options] python main.py simulate [options]
# Generate HP heatmap + equity curve plots (saves to plots/)
python main.py plot
``` ```
### Simulate options ### Simulate options
``` ```
Strategy: Strategy:
--holding-days N Calendar days to hold each position (default: 7) --holding-days N Days to hold each position (default: 7)
--buy-delay N Days after signal trigger to enter (default: 1) --buy-delay N Days after signal to enter (default: 1)
--position-size F Fraction of available cash per trade (default: 0.10) --position-size F Fraction of available cash per trade (default: 0.10)
--min-score F Minimum signal score filter (default: 0.0) --min-score F Minimum signal score (default: 0.0)
--min-cluster N Minimum cluster size filter (default: 1) --min-cluster N Minimum cluster size (default: 1)
--capital F Initial capital in USD (default: 100000) --capital F Initial capital (default: 100000)
Transaction costs: Transaction costs:
--spread F One-way bid-ask half-spread paid at entry and exit (default: 0.003) --spread F One-way bid-ask half-spread at entry and exit (default: 0.003)
--slippage F Entry slippage / market impact (default: 0.002) --slippage F Entry slippage / market impact (default: 0.002)
--commission F Per-trade commission as fraction of notional (default: 0.001) --commission F Per-trade commission as fraction of notional (default: 0.001)
Round-trip cost = spread×2 + slippage + commission×2
``` ```
## Key configuration (`config.py`) Round-trip = spread x 2 + slippage + commission x 2.
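As a quick check of the round-trip formula, here is a minimal sketch that computes it from the three cost flags. The function name `round_trip_cost` is hypothetical and not part of the repository; the defaults are the ones listed under "Simulate options".

```python
def round_trip_cost(spread: float, slippage: float, commission: float) -> float:
    """Round-trip cost as a fraction of notional.

    Spread and commission are paid on both legs (entry and exit);
    slippage is charged once, at entry.
    """
    return 2 * spread + slippage + 2 * commission


# CLI defaults: --spread 0.003 --slippage 0.002 --commission 0.001
default_rt = round_trip_cost(spread=0.003, slippage=0.002, commission=0.001)

# Same spread and slippage with zero commission (the Alpaca case in the README)
zero_commission_rt = round_trip_cost(spread=0.003, slippage=0.002, commission=0.0)

print(f"default: {default_rt:.2%}, zero commission: {zero_commission_rt:.2%}")
```

With the defaults this comes to about 1% round-trip; dropping the commission leaves about 0.8%.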
## Setup
```bash
cp .env.example .env
pip install -r requirements.txt
```
| Variable | Default | Description |
|---|---|---|
| `SLACK_WEBHOOK_URL` | | Incoming webhook URL for alerts |
| `ALPACA_KEY` | | Alpaca API key |
| `ALPACA_SECRET` | | Alpaca API secret |
| `ALPACA_BASE_URL` | `https://paper-api.alpaca.markets` | Paper or live endpoint |
| `DB_PATH` | `insider.db` | SQLite database path |
## Key config (`config.py`)
| Parameter | Default | Description | | Parameter | Default | Description |
|---|---|---| |---|---|---|
| `EDGAR_POLL_INTERVAL` | 600 s | Polling cadence |
| `MIN_TRANSACTION_VALUE` | $50,000 | Ignore buys below this | | `MIN_TRANSACTION_VALUE` | $50,000 | Ignore buys below this |
| `MIN_CLUSTER_SIZE` | 1 | Minimum unique insiders before a signal fires | | `MIN_CLUSTER_SIZE` | 1 | Unique insiders before a signal fires |
| `CLUSTER_WINDOW_DAYS` | 30 | Rolling window for cluster counting | | `CLUSTER_WINDOW_DAYS` | 30 | Rolling window for cluster counting |
| `HOLDING_PERIOD_DAYS` | 90 | Days held per position (backtest + auto-close trigger) | | `HOLDING_PERIOD_DAYS` | 90 | Days held per position |
| `POSITION_SIZE_PCT` | 2% | Fraction of portfolio per trade | | `POSITION_SIZE_PCT` | 2% | Fraction of portfolio per trade |
| `MAX_POSITIONS` | 20 | Hard position limit | | `SCORE_ALERT_THRESHOLD` | 5.0 | Minimum score to trigger alert |
| `SCORE_ALERT_THRESHOLD` | 5.0 | Minimum score to trigger Slack alert |
## Scoring ## Scoring
``` ```
score = role_weight × log(total_value) × (1 + 0.5 × (cluster_size - 1)) score = role_weight * log(total_value) * (1 + 0.5 * (cluster_size - 1))
``` ```
Role weights: CEO 3.0 · CFO/President 2.5 · COO 2.0 · Director 1.5 · VP 1.2 · 10% owner 1.0 Role weights: CEO 3.0, CFO/President 2.5, COO 2.0, Director 1.5, VP 1.2, 10% owner 1.0
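The scoring formula can be sketched as a small function. This is an illustration only: the names `ROLE_WEIGHTS` and `signal_score` are hypothetical, and the README does not state the base of the logarithm, so the natural log is assumed here.

```python
import math

# Role weights as listed in the README; the dict keys are illustrative.
ROLE_WEIGHTS = {
    "CEO": 3.0,
    "CFO": 2.5,
    "President": 2.5,
    "COO": 2.0,
    "Director": 1.5,
    "VP": 1.2,
    "10% owner": 1.0,
}


def signal_score(role: str, total_value: float, cluster_size: int) -> float:
    """score = role_weight * log(total_value) * (1 + 0.5 * (cluster_size - 1)).

    ASSUMPTION: natural logarithm; the source does not specify the base.
    """
    cluster_multiplier = 1 + 0.5 * (cluster_size - 1)
    return ROLE_WEIGHTS[role] * math.log(total_value) * cluster_multiplier


single = signal_score("CEO", 100_000, cluster_size=1)
cluster = signal_score("CEO", 100_000, cluster_size=3)
print(f"single insider: {single:.2f}, cluster of 3: {cluster:.2f}")
```

Each additional insider in the cluster adds half of the single-insider score, so a cluster of three scores exactly twice a lone buy of the same total value.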
## Backtesting ## Results
The backtest loads signals from the DB and fetches OHLC data via `yfinance`. Prices are cached in the `price_cache` table — completed date ranges are served entirely from the DB on repeat runs. Entry price is the closing price on the first trading day on or after the signal date; exit price is the closing price on the last trading day before or on the exit date. 16,279 signals from 302k Form 4 filings (2020-2025).
## Results (2023-2024 backtest, 302k filings ingested) ### Per-signal stats (pre-cost)
> **⚠ Read the caveats below before drawing conclusions.** | Hold | Avg return | Alpha vs SPY | Sharpe | Win rate |
|------|-----------|--------------|--------|----------|
| 3d | +0.61% | +0.52% | ~0.80 | ~53% |
| 7d | +1.19% | +0.68% | ~1.05 | ~54% |
| 14d | +1.41% | +0.55% | ~0.90 | ~54% |
| 30d | +1.89% | +0.41% | ~0.70 | ~54% |
### Per-signal statistics (pre-cost) The signal exists. It just does not survive transaction costs.
Across 16,279 signals generated from 302k Form 4 filings (2023-2024): ### Portfolio simulation (7d hold, 1d delay, 10% of cash per signal)
| Hold | Avg return | Avg alpha vs SPY | Sharpe | Win rate | ![HP Sweep](plots/hp_sweep.png)
|------|-----------|-----------------|--------|----------|
| 3 d | +0.61% | +0.52% | ~0.80 | ~53% |
| 7 d | +1.19% | +0.68% | ~1.05 | ~54% |
| 14 d | +1.41% | +0.55% | ~0.90 | ~54% |
| 30 d | +1.89% | +0.41% | ~0.70 | ~54% |
| 90 d | +5.8% | +1.0% | ~0.55 | ~57% |
Alpha is strongest and most consistent at 3-14 day holds. Beyond 30 days, market beta dominates. Signal quality is broadly robust across `min_score` and `min_cluster` filter values. ![Equity Curves](plots/equity_curves.png)
### Portfolio simulation (1-day lag, 7-day hold, 10% of cash per signal) Alpaca charges $0 commission on US equities. Real costs are spread + slippage only:
Pre-cost simulation on the same period: | Scenario | RT cost | Ann. return | vs SPY |
|----------|---------|-------------|--------|
| Theoretical (no costs) | 0% | +177% | +151% |
| Alpaca, large-cap | ~0.2% | ~+20% | ~+4% |
| Alpaca, mid-cap | ~0.5% | ~+5% | -11% |
| Alpaca, small-cap | ~0.7-1.0% | -1% to -8% | -17% to -24% |
| With commission (non-Alpaca) | ~1.5% | -2.5% | -19% |
| Metric | Value | SPY annualised over the same period: ~+16%.
|--------|-------|
| Initial capital | $100,000 |
| Final value | $782,097 |
| Total return | +682% |
| Annualized return | +177% |
| SPY annualized | +25.9% |
| Max drawdown | 12.8% |
| Sharpe | 4.67 |
| Trades executed | 13,766 |
After realistic transaction costs (~1% round-trip), expected annualized return drops to roughly **20-60%** depending on assumed spread and slippage. Run the simulator to check your specific assumptions: Break-even is roughly 0.3-0.5% round-trip. On Alpaca that means large-cap stocks only -- but most insider buying happens in small and mid-cap names, so filtering aggressively kills signal count.
```bash ### Is insidercopytrading.com a scam?
# Conservative (liquid mid-caps, ~1% round-trip)
python main.py simulate --spread 0.003 --slippage 0.002 --commission 0.001
# Realistic small-cap (~1.5% round-trip) Kind of, yes.
python main.py simulate --spread 0.007 --slippage 0.005 --commission 0.001
```
### Reality check: with costs this strategy underperforms SPY Their website shows backtested returns that significantly outperform the market. Those numbers are real in the sense that the simulation ran correctly. They are not real in the sense that you could ever achieve them:
Actual simulation results on the full dataset (2020-2025, 16,556 signals) with a realistic 1.5% round-trip cost: - **Same-day entry.** Form 4 filings are submitted after market close or intraday. By the time you see the filing and place an order, the earliest realistic entry is the next morning's open. Their simulations use the closing price on the filing date -- a price you cannot buy at.
- **No spread or slippage.** They assume you transact at the closing mid-price with zero friction. In reality, on the small-cap and micro-cap stocks where most insider buying happens, the bid-ask spread alone is 0.3-0.8% each way.
- **No market impact.** Their signals all execute at the same price regardless of how many people are following the service. If a meaningful number of subscribers act on the same signal, they move the stock against themselves.
| Config | Ann. return | SPY | Excess | Sharpe | Under realistic assumptions with a 1-day entry delay and real bid-ask costs on Alpaca, our simulation shows the strategy **underperforms SPY across all tested holding periods and produces negative absolute returns for any round-trip cost above ~0.5%**. For the small and mid-cap stocks that dominate insider buying signals, you are not reaching 0.5%.
|--------|-------------|-----|--------|--------|
| 7d hold, 0d delay, 1.5% cost | +5.8% | +16.1% | -10.2% | 0.45 |
| 7d hold, 1d delay, 1.5% cost | -2.5% | +16.2% | -18.7% | -1.55 |
| 3d hold, 1d delay, 1.5% cost | -21.1% | +16.2% | -37.3% | -6.45 |
| 3d hold, 1d delay, 0.67% cost | +8.9% | +16.2% | -7.3% | 0.17 |
**The strategy underperforms SPY under any realistic execution assumption.** Even with 0-day delay (impossible in practice — the filing isn't visible at market open the same day) you still trail the index. This is not a unique failure of this implementation. It is a fundamental property of the strategy: the edge (~0.7% per 7-day trade) is smaller than the friction of executing it in real markets. Insider-following services either do not know this or do not want you to know it.
The signal exists — insiders outperform at ~0.68% per 7-day trade pre-cost — but the margin is too thin to survive the transaction costs you actually pay on small/mid-cap stocks. Alpaca integration exists in the codebase (`broker/alpaca_client.py`) but is not fully implemented or tested, for the above reason. Wiring up live execution to a strategy that burns money seemed like a bad idea.
### Why sites like insidercopytrading.com show outperformance ### Other caveats
Services that claim strong returns from following insider filings typically: - **Bull market.** 2020-2025 was mostly up. Long-only bias on insider buys gets free beta. Expect worse in flat or down markets.
- Use close-on-filing-date entry (impossible: filings arrive after hours or mid-day, you execute next open at best) - **Survivorship bias.** Delisted/acquired tickers are underrepresented in the price cache, which slightly flatters returns.
- Omit bid-ask spread and slippage from their simulations - **Concentrated portfolio.** At 10% per signal with 7d holds you run ~7-10 positions simultaneously.
- Cherry-pick a bull market period or high-score signal subset
- Show gross returns without benchmarking against SPY
None of that is necessarily fraudulent — it's just not what you'd actually earn. Our simulation replicates the real execution constraints and shows the gap.
### Caveats
1. **Transaction costs are everything.** Average alpha per 7-day trade is ~0.68%. A round-trip on small/mid caps costs 0.6-1.5% (spread + slippage + commission). At the high end this strategy is negative after costs. The 177% pre-cost figure is not achievable in practice.
2. **2023-2024 was an exceptional bull market.** SPY returned +25.9% annualized. The long-only bias in insider buys captured broad market momentum. Expected performance in flat or down markets is lower and untested.
3. **Survivorship bias.** Tickers that were delisted, halted, or acquired may be underrepresented in the price cache. This slightly flatters results by dropping the worst outcomes.
4. **No slippage on popular signals.** When multiple insiders at the same company buy on the same day, the stock may have already moved before you execute. The 1-day delay helps but doesn't fully resolve this.
5. **Concentrated portfolio.** At 10% of cash per signal with 7-day holds, you run ~7-10 simultaneous positions on average. Individual position variance is high.
6. **Long-only.** Excess return over SPY is not directly capturable without shorting SPY, which has its own carry cost.
## Position lifecycle
Positions are tracked in the `signals` table. When a trade is executed, `executed_at` is recorded. On each poll cycle the poller checks for positions where `executed_at` is older than `HOLDING_PERIOD_DAYS` and calls Alpaca to close them, marking `closed=1` in the DB.
## Modules ## Modules
| Path | Purpose | | Path | Purpose |
|---|---| |---|---|
| `config.py` | All thresholds and env-var loading | | `config.py` | Thresholds and env-var loading |
| `ingestion/edgar_poller.py` | EDGAR Atom feed polling and deduplication | | `ingestion/edgar_poller.py` | EDGAR Atom feed polling |
| `ingestion/sec_bulk_ingest.py` | Bulk historical ingest via quarterly form.idx archives | | `ingestion/sec_bulk_ingest.py` | Bulk historical ingest via form.idx |
| `ingestion/form4_parser.py` | Form 4 XML → structured dict; 10b5-1 detection; tx_code extraction | | `ingestion/form4_parser.py` | Form 4 XML parser; 10b5-1 detection |
| `db/models.py` | SQLAlchemy ORM models (`Filing`, `Signal`, `PriceCache`) | | `db/models.py` | SQLAlchemy ORM models |
| `db/db.py` | DB access layer — dedup-safe inserts, chunked IN queries, price cache | | `db/db.py` | DB access layer |
| `signals/filter_engine.py` | Filing → signal pipeline (open-market-only, as-of-date aware) | | `signals/filter_engine.py` | Filing to signal pipeline |
| `signals/cluster_detector.py` | Cluster detection from DB (as-of-date aware) | | `signals/cluster_detector.py` | Cluster detection |
| `alerts/slack_alert.py` | Slack webhook alert | | `alerts/slack_alert.py` | Slack webhook |
| `broker/alpaca_client.py` | Alpaca order execution + position exit | | `broker/alpaca_client.py` | Alpaca order execution |
| `backtest/backtest.py` | Per-signal historical backtest runner | | `backtest/backtest.py` | Per-signal backtest |
| `backtest/simulate.py` | Portfolio simulator with configurable costs | | `backtest/simulate.py` | Portfolio simulator |
| `main.py` | CLI entry point (`run` / `backfill` / `backtest` / `simulate`) | | `backtest/plot.py` | Plot generator |
| `main.py` | CLI: `run / backfill / backtest / simulate / plot` |
## Requirements ## Requirements
- Python 3.11+ Python 3.11+. See `requirements.txt`.
- See `requirements.txt`: `requests`, `lxml`, `cssselect`, `yfinance`, `python-dotenv`, `alpaca-trade-api`, `sqlalchemy`

225
backtest/plot.py Normal file

@ -0,0 +1,225 @@
"""
Generate performance plots for the insider-copytrade strategy.

    python main.py plot        # saves to plots/
    python backtest/plot.py    # same
"""
import logging
import os
import sys
from datetime import datetime

# Make the repository root importable when this file is run as a script.
sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))

import config
from backtest.simulate import Strategy, _load_all_prices, simulate

logger = logging.getLogger(__name__)

PLOTS_DIR = os.path.join(os.path.dirname(os.path.dirname(__file__)), "plots")


def _get_matplotlib():
    # Imported lazily so the rest of the CLI works without plotting deps.
    try:
        import matplotlib
        import matplotlib.pyplot as plt
        import matplotlib.dates as mdates
        import numpy as np
        return matplotlib, plt, mdates, np
    except ImportError as exc:
        raise ImportError("pip install matplotlib numpy") from exc


def plot_hp_heatmap(prices: dict, out_dir: str = PLOTS_DIR) -> str:
    """
    Sweep holding_days x round-trip cost, plot annualized excess vs SPY.
    Each cell is also annotated with the raw annualized return.
    """
    matplotlib, plt, mdates, np = _get_matplotlib()
    from matplotlib.colors import TwoSlopeNorm

    hold_days = [3, 5, 7, 10, 14, 21, 30]
    rt_pcts = [0.3, 0.5, 0.7, 1.0, 1.2, 1.5, 2.0]

    # decompose round-trip into (spread, slippage, commission) that sum correctly:
    #   roundtrip = 2*spread + slippage + 2*commission
    # allocate 40% spread, 40% slippage, 20% commission (all relative to RT)
    #   => spread     = RT*0.4/2 = RT*0.2  (one-way)
    #   => slippage   = RT*0.4
    #   => commission = RT*0.2/2 = RT*0.1  (one-way)
    # verify: 2*0.2 + 0.4 + 2*0.1 = 0.4+0.4+0.2 = 1.0 * RT ✓
    def _costs(rt):
        return dict(spread=rt * 0.2, slippage=rt * 0.4, commission=rt * 0.1)

    rows_excess = []
    rows_ann = []
    total = len(hold_days) * len(rt_pcts)
    done = 0

    for hd in hold_days:
        row_e, row_a = [], []
        for rt_pct in rt_pcts:
            rt = rt_pct / 100.0
            s = Strategy(holding_days=hd, buy_delay=1, **_costs(rt))
            r = simulate(s, prices=prices)
            perf = r.get("performance", {})
            row_e.append(perf.get("excess_return_pct", 0.0))
            row_a.append(perf.get("annualized_return_pct", 0.0))
            done += 1
            logger.info(
                f"[{done}/{total}] hold={hd}d rt={rt_pct}% "
                f"ann={row_a[-1]:.1f}% excess={row_e[-1]:+.1f}%"
            )
        rows_excess.append(row_e)
        rows_ann.append(row_a)

    Z_excess = np.array(rows_excess)
    Z_ann = np.array(rows_ann)

    fig, axes = plt.subplots(1, 2, figsize=(15, 6))

    for ax, Z, title in [
        (axes[0], Z_excess, "Excess return vs SPY (annualised %)"),
        (axes[1], Z_ann, "Strategy annualised return (%)"),
    ]:
        vmax = float(max(abs(Z.max()), abs(Z.min()), 5))
        if "Excess" in title:
            # Symmetric colour scale centred on zero excess return.
            norm = TwoSlopeNorm(vmin=-vmax, vcenter=0, vmax=vmax)
        else:
            # Centre the colour scale on an approximate SPY annualised return.
            spy_approx = 16.0
            norm = TwoSlopeNorm(
                vmin=min(float(Z.min()), -5),
                vcenter=spy_approx,
                vmax=max(float(Z.max()), spy_approx + 5),
            )

        im = ax.imshow(Z, cmap="RdYlGn", norm=norm, aspect="auto")
        cb = plt.colorbar(im, ax=ax)
        cb.set_label("%")

        ax.set_xticks(range(len(rt_pcts)))
        ax.set_xticklabels([f"{r}%" for r in rt_pcts], fontsize=9)
        ax.set_yticks(range(len(hold_days)))
        ax.set_yticklabels([f"{h}d" for h in hold_days], fontsize=9)
        ax.set_xlabel("Round-trip transaction cost")
        ax.set_ylabel("Holding period")
        ax.set_title(title, fontsize=11)

        # Annotate each cell; use white text on the dark ends of the colormap.
        for i in range(len(hold_days)):
            for j in range(len(rt_pcts)):
                val = Z[i, j]
                txt = f"{val:+.1f}" if "Excess" in title else f"{val:.1f}"
                brightness = norm(val)
                color = "white" if brightness < 0.35 or brightness > 0.75 else "black"
                ax.text(j, i, txt, ha="center", va="center", fontsize=7.5, color=color)

    fig.suptitle(
        "HP sweep: 1-day entry delay, 10% position size, buy filter only",
        fontsize=12,
    )
    plt.tight_layout()

    os.makedirs(out_dir, exist_ok=True)
    out = os.path.join(out_dir, "hp_sweep.png")
    plt.savefig(out, dpi=150, bbox_inches="tight")
    plt.close()
    logger.info(f"Saved {out}")
    return out


def plot_equity_curves(prices: dict, out_dir: str = PLOTS_DIR) -> str:
    """
    Plot portfolio equity curves for several cost scenarios vs SPY buy-and-hold.
    """
    matplotlib, plt, mdates, np = _get_matplotlib()

    scenarios = [
        {"label": "0% RT cost (theoretical)", "spread": 0, "slippage": 0, "commission": 0},
        {"label": "0.67% RT (best case)", "spread": 0.0014, "slippage": 0.0027, "commission": 0.0007},
        {"label": "1.0% RT (mid)", "spread": 0.002, "slippage": 0.004, "commission": 0.001},
        {"label": "1.5% RT (realistic small-cap)", "spread": 0.003, "slippage": 0.006, "commission": 0.0015},
    ]

    fig, ax = plt.subplots(figsize=(13, 7))
    colors = ["#2ecc71", "#3498db", "#e67e22", "#e74c3c"]
    sim_start = sim_end = None

    for sc, color in zip(scenarios, colors):
        s = Strategy(
            holding_days=7, buy_delay=1,
            spread=sc["spread"], slippage=sc["slippage"], commission=sc["commission"],
        )
        r = simulate(s, prices=prices)
        curve = r.get("equity_curve", [])
        if not curve:
            continue
        sim_start = sim_start or r["period"]["start"]
        sim_end = r["period"]["end"]
        dates = [datetime.strptime(d, "%Y-%m-%d") for d, _ in curve]
        values = [v for _, v in curve]
        # Index each curve to 100 at its first point so scenarios are comparable.
        base = values[0]
        ax.plot(dates, [v / base * 100 for v in values],
                label=sc["label"], color=color, linewidth=1.8)

    # SPY buy-and-hold overlay
    spy_px = prices.get("SPY", {})
    if spy_px and sim_start and sim_end:
        spy_dates = sorted(d for d in spy_px if sim_start <= d <= sim_end)
        if spy_dates:
            base = spy_px[spy_dates[0]]
            ax.plot(
                [datetime.strptime(d, "%Y-%m-%d") for d in spy_dates],
                [spy_px[d] / base * 100 for d in spy_dates],
                label="SPY buy & hold", color="black", linewidth=2.2, linestyle="--",
            )

    ax.axhline(100, color="gray", linewidth=0.8, linestyle=":")
    ax.set_xlabel("Date", fontsize=11)
    ax.set_ylabel("Portfolio value (indexed to 100)", fontsize=11)
    ax.set_title(
        "Insider Copytrade: equity curves vs SPY (7d hold, 1d delay, 10% position size)",
        fontsize=12,
    )
    ax.legend(fontsize=10)
    ax.grid(True, alpha=0.25)
    ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m"))
    ax.xaxis.set_major_locator(mdates.MonthLocator(interval=6))
    plt.xticks(rotation=30)
    plt.tight_layout()

    os.makedirs(out_dir, exist_ok=True)
    out = os.path.join(out_dir, "equity_curves.png")
    plt.savefig(out, dpi=150, bbox_inches="tight")
    plt.close()
    logger.info(f"Saved {out}")
    return out


def main():
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s [%(levelname)s] %(name)s: %(message)s",
    )
    from db.db import init_db
    init_db()

    logger.info("Loading price cache...")
    prices = _load_all_prices()

    logger.info("Generating HP heatmap (49 simulations)...")
    p1 = plot_hp_heatmap(prices)

    logger.info("Generating equity curves (4 simulations)...")
    p2 = plot_equity_curves(prices)

    print(f"\nPlots saved:\n  {p1}\n  {p2}\n")


if __name__ == "__main__":
    main()
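
The 40/40/20 split used by `_costs` in `backtest/plot.py` can be checked in isolation. The sketch below is a standalone illustration, not code from the repository: `split_round_trip` and `round_trip` are hypothetical names that mirror the comment block above `_costs`.

```python
import math


def split_round_trip(rt: float) -> dict:
    """Split a round-trip cost into one-way spread, entry slippage and
    one-way commission using the 40% / 40% / 20% allocation."""
    return {"spread": rt * 0.2, "slippage": rt * 0.4, "commission": rt * 0.1}


def round_trip(spread: float, slippage: float, commission: float) -> float:
    """Recombine the parts: spread and commission are paid on both legs."""
    return 2 * spread + slippage + 2 * commission


# Same grid of round-trip costs (in percent) as the heatmap sweep.
rt_pcts = [0.3, 0.5, 0.7, 1.0, 1.2, 1.5, 2.0]
recombined = [round_trip(**split_round_trip(p / 100.0)) for p in rt_pcts]

for pct, value in zip(rt_pcts, recombined):
    # Each recombined value should match the requested round-trip cost.
    assert math.isclose(value, pct / 100.0)
```

Note that this allocation always assigns 20% of the round-trip cost to commission, so a zero-commission broker would need a different split to reach the same total.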

View File

@ -118,7 +118,7 @@ class Strategy:
return self.entry_cost + self.exit_cost return self.entry_cost + self.exit_cost
def simulate(strategy: Strategy) -> dict: def simulate(strategy: Strategy, prices: dict = None) -> dict:
signals = get_signals_for_backtest(strategy.min_score, strategy.min_cluster) signals = get_signals_for_backtest(strategy.min_score, strategy.min_cluster)
# Filter malformed dates # Filter malformed dates
@ -137,7 +137,8 @@ def simulate(strategy: Strategy) -> dict:
if not signals: if not signals:
return {"error": "No signals after filtering"} return {"error": "No signals after filtering"}
prices = _load_all_prices() if prices is None:
prices = _load_all_prices()
# Build trade list: {entry_date_str: [(ticker, exit_date_str, signal)]} # Build trade list: {entry_date_str: [(ticker, exit_date_str, signal)]}
trades_by_entry: dict[str, list] = defaultdict(list) trades_by_entry: dict[str, list] = defaultdict(list)
@ -313,6 +314,7 @@ def simulate(strategy: Strategy) -> dict:
"win_rate_pct": round(win_rate * 100, 2), "win_rate_pct": round(win_rate * 100, 2),
"avg_net_return_pct": round(avg_net_return * 100, 3), "avg_net_return_pct": round(avg_net_return * 100, 3),
}, },
"equity_curve": equity_curve,
} }
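
The optional `prices` argument exists so a sweep can load the price cache once and reuse it. A minimal, self-contained sketch of that pattern follows; `load_all_prices` and this toy `simulate` are hypothetical stand-ins, not the real functions in `backtest/simulate.py`.

```python
from typing import Optional

LOAD_CALLS = 0  # counts how often the (pretend) expensive loader runs


def load_all_prices() -> dict:
    """Hypothetical stand-in for an expensive price-cache load."""
    global LOAD_CALLS
    LOAD_CALLS += 1
    return {"SPY": {"2024-01-02": 100.0}}


def simulate(holding_days: int, prices: Optional[dict] = None) -> dict:
    """Toy simulator: only loads prices when the caller did not supply them."""
    if prices is None:
        prices = load_all_prices()
    return {"holding_days": holding_days, "n_tickers": len(prices)}


# Load once, then reuse the same dict for every sweep iteration.
prices = load_all_prices()
results = [simulate(hd, prices=prices) for hd in [3, 7, 14]]
print(LOAD_CALLS, len(results))
```

Callers that omit `prices` keep the old behaviour, so existing call sites do not need to change.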

View File

@ -132,11 +132,18 @@ def cmd_simulate():
sim_main() sim_main()
def cmd_plot():
"""Generate HP heatmap and equity curve plots. Saves PNGs to plots/."""
from backtest.plot import main as plot_main
plot_main()
COMMANDS = { COMMANDS = {
"run": cmd_run, "run": cmd_run,
"backfill": cmd_backfill, "backfill": cmd_backfill,
"backtest": cmd_backtest, "backtest": cmd_backtest,
"simulate": cmd_simulate, "simulate": cmd_simulate,
"plot": cmd_plot,
} }
if __name__ == "__main__": if __name__ == "__main__":

BIN
plots/equity_curves.png Normal file

Binary file not shown.

Width:  |  Height:  |  Size: 186 KiB

BIN
plots/hp_sweep.png Normal file

Binary file not shown.

Width:  |  Height:  |  Size: 127 KiB