feat: HP sweep heatmap + equity curve plots, scam analysis in README

- backtest/plot.py: generates two plots saved to plots/
  - hp_sweep.png: 7x7 heatmap of holding_days x round-trip cost, showing annualised excess vs SPY and raw annualised return per cell
  - equity_curves.png: portfolio equity vs SPY for 4 cost scenarios
- backtest/simulate.py: accept pre-loaded prices dict to avoid reloading on every sweep iteration; return equity_curve in result
- main.py: add `plot` command
- README: updated results section with Alpaca-specific cost breakdown (zero commission, costs are spread+slippage only); added honest analysis of why insidercopytrading.com-style services show outperformance that cannot be replicated in practice; note Alpaca integration not finished

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 17:59:18 +02:00
commit 399f69b817
parent 4d111e0a3a
6 changed files with 355 additions and 147 deletions
--- a/README.md
+++ b/README.md
@@ -4,222 +4,196 @@
  <b>Smaug</b>
 </h1>
 Monitors SEC EDGAR Form 4 filings in near real-time, detects insider buy clusters, sends Slack alerts, and optionally executes trades via Alpaca.
 Copying the idea from [insidercopytrading.com](https://insidercopytrading.com/). Available at [insidercopytradingcopy.com](#no-hosted-version).
-Monitors SEC EDGAR Form 4 filings in near real-time, detects insider buy clusters, sends Slack alerts, and optionally executes trades via Alpaca.  
+## No Hosted Version
-Copying the idea from [insidercopytrading.com](https://insidercopytrading.com/). Available at [insidercopytradingcopy.com](https://www.youtube.com/watch?v=dQw4w9WgXcQ)
+
 There is no hosted version of Smaug. You have to run it yourself.
 You probably should not bother. After modelling realistic transaction costs, the strategy **underperforms SPY** in all tested configurations. See the [results](#results).
 If you still want to run it, see [Usage](#usage).
 ## Architecture
 ```
 EDGAR (Form 4 feed)
-      │
+      |
-      ▼
+      v
-ingestion/edgar_poller.py    ← polls every 10 min, dedupes by accession
+ingestion/edgar_poller.py    -- polls every 10 min, dedupes by accession
-ingestion/sec_bulk_ingest.py ← bulk historical ingest via quarterly form.idx archives
+ingestion/sec_bulk_ingest.py -- bulk historical ingest via quarterly form.idx archives
-      │
+      |
-      ▼
+      v
-ingestion/form4_parser.py    ← parses XML, detects 10b5-1 plans, extracts tx_code
+ingestion/form4_parser.py    -- parses XML, detects 10b5-1 plans, extracts tx_code
-      │
+      |
-      ▼
+      v
-db/models.py + db/db.py      ← SQLAlchemy ORM: filings, signals, price_cache tables
+db/models.py + db/db.py      -- SQLAlchemy ORM: filings, signals, price_cache tables
-      │
+      |
-      ▼
+      v
-signals/filter_engine.py     ← buy-only, open-market (P) only, exclude 10b5-1,
+signals/filter_engine.py     -- buy-only, open-market (P) only, exclude 10b5-1,
 signals/cluster_detector.py    min $50k, role-weighted scoring, as-of-date aware
-      │
+      |
-      ├──► alerts/slack_alert.py   ← POST to Slack webhook when score ≥ threshold
+      +---> alerts/slack_alert.py   -- POST to Slack webhook when score >= threshold
-      └──► broker/alpaca_client.py ← paper/live order: 2% position size, 10% per-ticker cap
+      +---> broker/alpaca_client.py -- paper/live order (NOT FULLY IMPLEMENTED -- see Results)
                                        positions auto-closed after holding period expires
-backtest/backtest.py         ← per-signal return / alpha vs SPY analysis
+backtest/backtest.py         -- per-signal return / alpha vs SPY
-backtest/simulate.py         ← realistic portfolio simulation with transaction costs
+backtest/simulate.py         -- portfolio simulation with configurable transaction costs
 backtest/plot.py             -- HP sweep heatmap + equity curve plots
 ```
 ## Setup
 ```bash
 cp .env.example .env
 # edit .env with your credentials
 pip install -r requirements.txt
 ```
 ### Environment variables (`.env`)
 | Variable | Required | Default | Description |
 |---|---|---|---|
 | `SLACK_WEBHOOK_URL` | optional | — | Incoming webhook URL for alerts |
 | `ALPACA_KEY` | optional | — | Alpaca API key |
 | `ALPACA_SECRET` | optional | — | Alpaca API secret |
 | `ALPACA_BASE_URL` | optional | `https://paper-api.alpaca.markets` | Use paper or live endpoint |
 | `DB_PATH` | optional | `insider.db` | SQLite database file path |
 | `DATA_DIR` | optional | `data/filings` | Directory for cached raw XML filings |
 ## Usage
 ```bash
-# Initialize DB and start continuous polling (every 10 minutes)
+pip install -r requirements.txt
 cp .env.example .env  # fill in credentials
 # Live polling (every 10 min)
 python main.py run
-# Bulk-ingest historical Form 4 filings from SEC EDGAR quarterly archives
+# Bulk-ingest historical filings
-python main.py backfill --years 2023 2024        # full year range
+python main.py backfill --years 2023 2024
-python main.py backfill --year 2024 --quarter 1  # single quarter
+python main.py backfill --year 2024 --quarter 1
 # Per-signal backtest: win rate, alpha vs SPY
 python main.py backtest
-# Portfolio simulation with configurable strategy and cost params
+# Portfolio simulation with transaction cost modelling
 python main.py simulate [options]
 # Generate HP heatmap + equity curve plots (saves to plots/)
 python main.py plot
 ```
 ### Simulate options
 ```
 Strategy:
-  --holding-days N      Calendar days to hold each position (default: 7)
+  --holding-days N      Days to hold each position (default: 7)
-  --buy-delay N         Days after signal trigger to enter (default: 1)
+  --buy-delay N         Days after signal to enter (default: 1)
  --position-size F     Fraction of available cash per trade (default: 0.10)
-  --min-score F         Minimum signal score filter (default: 0.0)
+  --min-score F         Minimum signal score (default: 0.0)
-  --min-cluster N       Minimum cluster size filter (default: 1)
+  --min-cluster N       Minimum cluster size (default: 1)
-  --capital F           Initial capital in USD (default: 100000)
+  --capital F           Initial capital (default: 100000)
 Transaction costs:
-  --spread F            One-way bid-ask half-spread paid at entry and exit (default: 0.003)
+  --spread F            One-way bid-ask half-spread at entry and exit (default: 0.003)
  --slippage F          Entry slippage / market impact (default: 0.002)
  --commission F        Per-trade commission as fraction of notional (default: 0.001)
 Round-trip cost = spread×2 + slippage + commission×2
 ```
-## Key configuration (`config.py`)
+Round-trip = spread x 2 + slippage + commission x 2.
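As a quick check of the formula above, here is a minimal sketch that plugs in the simulator defaults. The helper name `round_trip_cost` is made up for illustration and is not a function in `backtest/simulate.py`.

```python
def round_trip_cost(spread: float, slippage: float, commission: float) -> float:
    """Round-trip cost as a fraction of notional.

    Spread and commission are paid on both entry and exit;
    slippage is counted once, matching the formula above.
    """
    return 2 * spread + slippage + 2 * commission


# Simulator defaults: --spread 0.003 --slippage 0.002 --commission 0.001
default_rt = round_trip_cost(0.003, 0.002, 0.001)
print(f"default round-trip: {default_rt:.2%}")

# Same spread and slippage with zero commission
zero_commission_rt = round_trip_cost(0.003, 0.002, 0.0)
print(f"zero-commission round-trip: {zero_commission_rt:.2%}")
```

With the defaults this comes to about 1% per round trip. Dropping commission only removes 0.2 percentage points, so spread and slippage dominate.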
 ## Setup
 ```bash
 cp .env.example .env
 pip install -r requirements.txt
 ```
 | Variable | Default | Description |
 |---|---|---|
 | `SLACK_WEBHOOK_URL` | | Incoming webhook URL for alerts |
 | `ALPACA_KEY` | | Alpaca API key |
 | `ALPACA_SECRET` | | Alpaca API secret |
 | `ALPACA_BASE_URL` | `https://paper-api.alpaca.markets` | Paper or live endpoint |
 | `DB_PATH` | `insider.db` | SQLite database path |
 ## Key config (`config.py`)
 | Parameter | Default | Description |
 |---|---|---|
 | `EDGAR_POLL_INTERVAL` | 600 s | Polling cadence |
 | `MIN_TRANSACTION_VALUE` | $50,000 | Ignore buys below this |
-| `MIN_CLUSTER_SIZE` | 1 | Minimum unique insiders before a signal fires |
+| `MIN_CLUSTER_SIZE` | 1 | Unique insiders before a signal fires |
 | `CLUSTER_WINDOW_DAYS` | 30 | Rolling window for cluster counting |
-| `HOLDING_PERIOD_DAYS` | 90 | Days held per position (backtest + auto-close trigger) |
+| `HOLDING_PERIOD_DAYS` | 90 | Days held per position |
 | `POSITION_SIZE_PCT` | 2% | Fraction of portfolio per trade |
-| `MAX_POSITIONS` | 20 | Hard position limit |
+| `SCORE_ALERT_THRESHOLD` | 5.0 | Minimum score to trigger alert |
 | `SCORE_ALERT_THRESHOLD` | 5.0 | Minimum score to trigger Slack alert |
 ## Scoring
 ```
-score = role_weight × log(total_value) × (1 + 0.5 × (cluster_size − 1))
+score = role_weight * log(total_value) * (1 + 0.5 * (cluster_size - 1))
 ```
-Role weights: CEO 3.0 · CFO/President 2.5 · COO 2.0 · Director 1.5 · VP 1.2 · 10% owner 1.0
+Role weights: CEO 3.0, CFO/President 2.5, COO 2.0, Director 1.5, VP 1.2, 10% owner 1.0
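A minimal sketch of the scoring formula above, not the code in `signals/`. It makes two assumptions: `log` is the natural logarithm (the base is not stated here), and a single role weight is passed in per signal, since how a multi-insider cluster picks its weight is not shown. The names `ROLE_WEIGHTS` and `signal_score` are illustrative.

```python
import math

# Role weights as listed above; the dictionary keys are illustrative labels.
ROLE_WEIGHTS = {
    "CEO": 3.0,
    "CFO": 2.5,
    "President": 2.5,
    "COO": 2.0,
    "Director": 1.5,
    "VP": 1.2,
    "10% owner": 1.0,
}


def signal_score(role: str, total_value: float, cluster_size: int) -> float:
    """score = role_weight * log(total_value) * (1 + 0.5 * (cluster_size - 1))."""
    role_weight = ROLE_WEIGHTS[role]
    # Each additional insider in the cluster adds 50% of the base score.
    cluster_bonus = 1 + 0.5 * (cluster_size - 1)
    # Assumption: natural log; the real implementation may use another base.
    return role_weight * math.log(total_value) * cluster_bonus


solo = signal_score("Director", 100_000, 1)
trio = signal_score("Director", 100_000, 3)
print(f"single director: {solo:.2f}, three-insider cluster: {trio:.2f}")
```

Because the dollar value enters through a logarithm, the role weight and cluster size move the score far more than the size of the purchase does.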
-## Backtesting
+## Results
-The backtest loads signals from the DB and fetches OHLC data via `yfinance`. Prices are cached in the `price_cache` table — completed date ranges are served entirely from the DB on repeat runs. Entry price is the closing price on the first trading day on or after the signal date; exit price is the closing price on the last trading day before or on the exit date.
+16,279 signals from 302k Form 4 filings (2020-2025).
-## Results (2023–2024 backtest, 302k filings ingested)
+### Per-signal stats (pre-cost)
-> **⚠ Read the caveats below before drawing conclusions.**
+| Hold | Avg return | Alpha vs SPY | Sharpe | Win rate |
 |------|-----------|--------------|--------|----------|
 | 3d   | +0.61%    | +0.52%       | ~0.80  | ~53%     |
 | 7d   | +1.19%    | +0.68%       | ~1.05  | ~54%     |
 | 14d  | +1.41%    | +0.55%       | ~0.90  | ~54%     |
 | 30d  | +1.89%    | +0.41%       | ~0.70  | ~54%     |
-### Per-signal statistics (pre-cost)
+The signal exists. It just does not survive transaction costs.
-Across 16,279 signals generated from 302k Form 4 filings (2023–2024):
+### Portfolio simulation (7d hold, 1d delay, 10% of cash per signal)
-| Hold | Avg return | Avg alpha vs SPY | Sharpe | Win rate |
+![HP Sweep](plots/hp_sweep.png)
 |------|-----------|-----------------|--------|----------|
 | 3 d  | +0.61%    | +0.52%          | ~0.80  | ~53%     |
 | 7 d  | +1.19%    | +0.68%          | ~1.05  | ~54%     |
 | 14 d | +1.41%    | +0.55%          | ~0.90  | ~54%     |
 | 30 d | +1.89%    | +0.41%          | ~0.70  | ~54%     |
 | 90 d | +5.8%     | +1.0%           | ~0.55  | ~57%     |
-Alpha is strongest and most consistent at 3–14 day holds. Beyond 30 days, market beta dominates. Signal quality is broadly robust across `min_score` and `min_cluster` filter values.
+![Equity Curves](plots/equity_curves.png)
-### Portfolio simulation (1-day lag, 7-day hold, 10% of cash per signal)
+Alpaca charges $0 commission on US equities. Real costs are spread + slippage only:
-Pre-cost simulation on the same period:
+| Scenario | RT cost | Ann. return | vs SPY |
 |----------|---------|-------------|--------|
 | Theoretical (no costs) | 0% | +177% | +151% |
 | Alpaca, large-cap | ~0.2% | ~+20% | ~+4% |
 | Alpaca, mid-cap | ~0.5% | ~+5% | -11% |
 | Alpaca, small-cap | ~0.7-1.0% | -1% to -8% | -17% to -24% |
 | With commission (non-Alpaca) | ~1.5% | -2.5% | -19% |
-| Metric | Value |
+SPY annualised over the same period: ~+16%.
 |--------|-------|
 | Initial capital | $100,000 |
 | Final value | $782,097 |
 | Total return | +682% |
 | Annualized return | +177% |
 | SPY annualized | +25.9% |
 | Max drawdown | 12.8% |
 | Sharpe | 4.67 |
 | Trades executed | 13,766 |
-After realistic transaction costs (~1% round-trip), expected annualized return drops to roughly **20–60%** depending on assumed spread and slippage. Run the simulator to check your specific assumptions:
+Break-even is roughly 0.3-0.5% round-trip. On Alpaca that means large-cap stocks only -- but most insider buying happens in small and mid-cap names, so filtering aggressively kills signal count.
-```bash
+### Is insidercopytrading.com a scam?
 # Conservative (liquid mid-caps, ~1% round-trip)
 python main.py simulate --spread 0.003 --slippage 0.002 --commission 0.001
-# Realistic small-cap (~1.5% round-trip)
+Kind of, yes.
 python main.py simulate --spread 0.007 --slippage 0.005 --commission 0.001
 ```
-### Reality check: with costs this strategy underperforms SPY
+Their website shows backtested returns that significantly outperform the market. Those numbers are real in the sense that the simulation ran correctly. They are not real in the sense that you could ever achieve them:
-Actual simulation results on the full dataset (2020–2025, 16,556 signals) with a realistic 1.5% round-trip cost:
+- **Same-day entry.** Form 4 filings are submitted after market close or intraday. By the time you see the filing and place an order, the earliest realistic entry is the next morning's open. Their simulations use the closing price on the filing date -- a price you cannot buy at.
 - **No spread or slippage.** They assume you transact at the closing mid-price with zero friction. In reality, on the small-cap and micro-cap stocks where most insider buying happens, the bid-ask spread alone is 0.3-0.8% each way.
 - **No market impact.** Their signals all execute at the same price regardless of how many people are following the service. If a meaningful number of subscribers act on the same signal, they move the stock against themselves.
-| Config | Ann. return | SPY | Excess | Sharpe |
+Under realistic assumptions with a 1-day entry delay and real bid-ask costs on Alpaca, our simulation shows the strategy **underperforms SPY across all tested holding periods and produces negative absolute returns for any round-trip cost above ~0.5%**. For the small and mid-cap stocks that dominate insider buying signals, you are not reaching 0.5%.
 |--------|-------------|-----|--------|--------|
 | 7d hold, 0d delay, 1.5% cost | +5.8% | +16.1% | -10.2% | 0.45 |
 | 7d hold, 1d delay, 1.5% cost | -2.5% | +16.2% | -18.7% | -1.55 |
 | 3d hold, 1d delay, 1.5% cost | -21.1% | +16.2% | -37.3% | -6.45 |
 | 3d hold, 1d delay, 0.67% cost | +8.9% | +16.2% | -7.3% | 0.17 |
-**The strategy underperforms SPY under any realistic execution assumption.** Even with 0-day delay (impossible in practice — the filing isn't visible at market open the same day) you still trail the index.
+This is not a unique failure of this implementation. It is a fundamental property of the strategy: the edge (~0.7% per 7-day trade) is smaller than the friction of executing it in real markets. Insider-following services either do not know this or do not want you to know it.
-The signal exists — insiders outperform at ~0.68% per 7-day trade pre-cost — but the margin is too thin to survive the transaction costs you actually pay on small/mid-cap stocks.
+Alpaca integration exists in the codebase (`broker/alpaca_client.py`) but is not fully implemented or tested, for the above reason. Wiring up live execution to a strategy that burns money seemed like a bad idea.
-### Why sites like insidercopytrading.com show outperformance
+### Other caveats
-Services that claim strong returns from following insider filings typically:
+- **Bull market.** 2020-2025 was mostly up. Long-only bias on insider buys gets free beta. Expect worse in flat or down markets.
- Use close-on-filing-date entry (impossible: filings arrive after hours or mid-day, you execute next open at best)
+- **Survivorship bias.** Delisted/acquired tickers are underrepresented in the price cache, which slightly flatters returns.
- Omit bid-ask spread and slippage from their simulations
+- **Concentrated portfolio.** At 10% per signal with 7d holds you run ~7-10 positions simultaneously.
 - Cherry-pick a bull market period or high-score signal subset
 - Show gross returns without benchmarking against SPY
 None of that is necessarily fraudulent — it's just not what you'd actually earn. Our simulation replicates the real execution constraints and shows the gap.
 ### Caveats
 1. **Transaction costs are everything.** Average alpha per 7-day trade is ~0.68%. A round-trip on small/mid caps costs 0.6–1.5% (spread + slippage + commission). At the high end this strategy is negative after costs. The 177% pre-cost figure is not achievable in practice.
 2. **2023–2024 was an exceptional bull market.** SPY returned +25.9% annualized. The long-only bias in insider buys captured broad market momentum. Expected performance in flat or down markets is lower and untested.
 3. **Survivorship bias.** Tickers that were delisted, halted, or acquired may be underrepresented in the price cache. This slightly flatters results by dropping the worst outcomes.
 4. **No slippage on popular signals.** When multiple insiders at the same company buy on the same day, the stock may have already moved before you execute. The 1-day delay helps but doesn't fully resolve this.
 5. **Concentrated portfolio.** At 10% of cash per signal with 7-day holds, you run ~7–10 simultaneous positions on average. Individual position variance is high.
 6. **Long-only.** Excess return over SPY is not directly capturable without shorting SPY, which has its own carry cost.
 ## Position lifecycle
 Positions are tracked in the `signals` table. When a trade is executed, `executed_at` is recorded. On each poll cycle the poller checks for positions where `executed_at` is older than `HOLDING_PERIOD_DAYS` and calls Alpaca to close them, marking `closed=1` in the DB.
 ## Modules
 | Path | Purpose |
 |---|---|
-| `config.py` | All thresholds and env-var loading |
+| `config.py` | Thresholds and env-var loading |
-| `ingestion/edgar_poller.py` | EDGAR Atom feed polling and deduplication |
+| `ingestion/edgar_poller.py` | EDGAR Atom feed polling |
-| `ingestion/sec_bulk_ingest.py` | Bulk historical ingest via quarterly form.idx archives |
+| `ingestion/sec_bulk_ingest.py` | Bulk historical ingest via form.idx |
-| `ingestion/form4_parser.py` | Form 4 XML → structured dict; 10b5-1 detection; tx_code extraction |
+| `ingestion/form4_parser.py` | Form 4 XML parser; 10b5-1 detection |
-| `db/models.py` | SQLAlchemy ORM models (`Filing`, `Signal`, `PriceCache`) |
+| `db/models.py` | SQLAlchemy ORM models |
-| `db/db.py` | DB access layer — dedup-safe inserts, chunked IN queries, price cache |
+| `db/db.py` | DB access layer |
-| `signals/filter_engine.py` | Filing → signal pipeline (open-market-only, as-of-date aware) |
+| `signals/filter_engine.py` | Filing to signal pipeline |
-| `signals/cluster_detector.py` | Cluster detection from DB (as-of-date aware) |
+| `signals/cluster_detector.py` | Cluster detection |
-| `alerts/slack_alert.py` | Slack webhook alert |
+| `alerts/slack_alert.py` | Slack webhook |
-| `broker/alpaca_client.py` | Alpaca order execution + position exit |
+| `broker/alpaca_client.py` | Alpaca order execution |
-| `backtest/backtest.py` | Per-signal historical backtest runner |
+| `backtest/backtest.py` | Per-signal backtest |
-| `backtest/simulate.py` | Portfolio simulator with configurable costs |
+| `backtest/simulate.py` | Portfolio simulator |
-| `main.py` | CLI entry point (`run` / `backfill` / `backtest` / `simulate`) |
+| `backtest/plot.py` | Plot generator |
 | `main.py` | CLI: `run / backfill / backtest / simulate / plot` |
 ## Requirements
- Python 3.11+
+Python 3.11+. See `requirements.txt`.
 - See `requirements.txt`: `requests`, `lxml`, `cssselect`, `yfinance`, `python-dotenv`, `alpaca-trade-api`, `sqlalchemy`
--- a/backtest/plot.py
+++ b/backtest/plot.py
@@ -0,0 +1,225 @@
 """
 Generate performance plots for the insider-copytrade strategy.
    python main.py plot              # saves to plots/
    python backtest/plot.py          # same
 """
 import logging
 import os
 import sys
 from datetime import datetime
 sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))
 import config
 from backtest.simulate import Strategy, _load_all_prices, simulate
 logger = logging.getLogger(__name__)
 PLOTS_DIR = os.path.join(os.path.dirname(os.path.dirname(__file__)), "plots")
 def _get_matplotlib():
    try:
        import matplotlib
        import matplotlib.pyplot as plt
        import matplotlib.dates as mdates
        import numpy as np
        return matplotlib, plt, mdates, np
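    except ImportError as exc:
        # Chain the original error so the missing module name stays visible
        raise ImportError(
            "plotting requires matplotlib and numpy: pip install matplotlib numpy"
        ) from exc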
 def plot_hp_heatmap(prices: dict, out_dir: str = PLOTS_DIR) -> str:
    """
    Sweep holding_days x round-trip cost, plot annualized excess vs SPY.
    Each cell is also annotated with the raw annualized return.
    """
    matplotlib, plt, mdates, np = _get_matplotlib()
    hold_days   = [3, 5, 7, 10, 14, 21, 30]
    rt_pcts     = [0.3, 0.5, 0.7, 1.0, 1.2, 1.5, 2.0]
    # decompose round-trip into (spread, slippage, commission) that sum correctly:
    # roundtrip = 2*spread + slippage + 2*commission
    # allocate  40% spread, 40% slippage, 20% commission  (all relative to RT)
    # => spread = RT*0.4/2 = RT*0.2  (one-way)
    # => slippage = RT*0.4
    # => commission = RT*0.2/2 = RT*0.1  (one-way)
    # verify: 2*0.2 + 0.4 + 2*0.1 = 0.4+0.4+0.2 = 1.0 * RT  ✓
    def _costs(rt):
        return dict(spread=rt * 0.2, slippage=rt * 0.4, commission=rt * 0.1)
    rows_excess = []
    rows_ann    = []
    total = len(hold_days) * len(rt_pcts)
    done  = 0
    for hd in hold_days:
        row_e, row_a = [], []
        for rt_pct in rt_pcts:
            rt = rt_pct / 100.0
            s = Strategy(holding_days=hd, buy_delay=1, **_costs(rt))
            r = simulate(s, prices=prices)
            perf = r.get("performance", {})
            row_e.append(perf.get("excess_return_pct", 0.0))
            row_a.append(perf.get("annualized_return_pct", 0.0))
            done += 1
            logger.info(
                f"[{done}/{total}] hold={hd}d rt={rt_pct}% "
                f"ann={row_a[-1]:.1f}% excess={row_e[-1]:+.1f}%"
            )
        rows_excess.append(row_e)
        rows_ann.append(row_a)
    Z_excess = np.array(rows_excess)
    Z_ann    = np.array(rows_ann)
    fig, axes = plt.subplots(1, 2, figsize=(15, 6))
    for ax, Z, title in [
        (axes[0], Z_excess, "Excess return vs SPY (annualised %)"),
        (axes[1], Z_ann,    "Strategy annualised return (%)"),
    ]:
        from matplotlib.colors import TwoSlopeNorm
        if "Excess" in title:
            # Symmetric diverging scale centred on zero excess return
            vmax = float(max(abs(Z.max()), abs(Z.min()), 5))
            norm = TwoSlopeNorm(vmin=-vmax, vcenter=0, vmax=vmax)
        else:
            # Centre the colour scale on the approximate SPY annualised return
            spy_approx = 16.0
            norm = TwoSlopeNorm(
                vmin=min(float(Z.min()), -5),
                vcenter=spy_approx,
                vmax=max(float(Z.max()), spy_approx + 5),
            )
        im = ax.imshow(Z, cmap="RdYlGn", norm=norm, aspect="auto")
        cb = plt.colorbar(im, ax=ax)
        cb.set_label("%")
        ax.set_xticks(range(len(rt_pcts)))
        ax.set_xticklabels([f"{r}%" for r in rt_pcts], fontsize=9)
        ax.set_yticks(range(len(hold_days)))
        ax.set_yticklabels([f"{h}d" for h in hold_days], fontsize=9)
        ax.set_xlabel("Round-trip transaction cost")
        ax.set_ylabel("Holding period")
        ax.set_title(title, fontsize=11)
        for i in range(len(hold_days)):
            for j in range(len(rt_pcts)):
                val = Z[i, j]
                txt = f"{val:+.1f}" if "Excess" in title else f"{val:.1f}"
                brightness = norm(val)
                color = "white" if brightness < 0.35 or brightness > 0.75 else "black"
                ax.text(j, i, txt, ha="center", va="center", fontsize=7.5, color=color)
    fig.suptitle(
        "HP sweep: 1-day entry delay, 10% position size, buy filter only",
        fontsize=12,
    )
    plt.tight_layout()
    os.makedirs(out_dir, exist_ok=True)
    out = os.path.join(out_dir, "hp_sweep.png")
    plt.savefig(out, dpi=150, bbox_inches="tight")
    plt.close()
    logger.info(f"Saved {out}")
    return out
 def plot_equity_curves(prices: dict, out_dir: str = PLOTS_DIR) -> str:
    """
    Plot portfolio equity curves for several cost scenarios vs SPY buy-and-hold.
    """
    matplotlib, plt, mdates, np = _get_matplotlib()
    scenarios = [
        {"label": "0% RT cost (theoretical)",    "spread": 0,      "slippage": 0,      "commission": 0},
        {"label": "0.67% RT (best case)",         "spread": 0.0014, "slippage": 0.0027, "commission": 0.0007},
        {"label": "1.0% RT (mid)",                "spread": 0.002,  "slippage": 0.004,  "commission": 0.001},
        {"label": "1.5% RT (realistic small-cap)","spread": 0.003,  "slippage": 0.006,  "commission": 0.0015},
    ]
    fig, ax = plt.subplots(figsize=(13, 7))
    colors  = ["#2ecc71", "#3498db", "#e67e22", "#e74c3c"]
    sim_start = sim_end = None
    for sc, color in zip(scenarios, colors):
        s = Strategy(
            holding_days=7, buy_delay=1,
            spread=sc["spread"], slippage=sc["slippage"], commission=sc["commission"],
        )
        r = simulate(s, prices=prices)
        curve = r.get("equity_curve", [])
        if not curve:
            continue
        sim_start = sim_start or r["period"]["start"]
        sim_end   = r["period"]["end"]
        dates  = [datetime.strptime(d, "%Y-%m-%d") for d, _ in curve]
        values = [v for _, v in curve]
        base   = values[0]
        ax.plot(dates, [v / base * 100 for v in values],
                label=sc["label"], color=color, linewidth=1.8)
    # SPY buy-and-hold overlay
    spy_px = prices.get("SPY", {})
    if spy_px and sim_start and sim_end:
        spy_dates = sorted(d for d in spy_px if sim_start <= d <= sim_end)
        if spy_dates:
            base = spy_px[spy_dates[0]]
            ax.plot(
                [datetime.strptime(d, "%Y-%m-%d") for d in spy_dates],
                [spy_px[d] / base * 100 for d in spy_dates],
                label="SPY buy & hold", color="black", linewidth=2.2, linestyle="--",
            )
    ax.axhline(100, color="gray", linewidth=0.8, linestyle=":")
    ax.set_xlabel("Date", fontsize=11)
    ax.set_ylabel("Portfolio value (indexed to 100)", fontsize=11)
    ax.set_title(
        "Insider Copytrade: equity curves vs SPY  (7d hold, 1d delay, 10% position size)",
        fontsize=12,
    )
    ax.legend(fontsize=10)
    ax.grid(True, alpha=0.25)
    ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m"))
    ax.xaxis.set_major_locator(mdates.MonthLocator(interval=6))
    plt.xticks(rotation=30)
    plt.tight_layout()
    os.makedirs(out_dir, exist_ok=True)
    out = os.path.join(out_dir, "equity_curves.png")
    plt.savefig(out, dpi=150, bbox_inches="tight")
    plt.close()
    logger.info(f"Saved {out}")
    return out
 def main():
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s [%(levelname)s] %(name)s: %(message)s",
    )
    from db.db import init_db
    init_db()
    logger.info("Loading price cache...")
    prices = _load_all_prices()
    logger.info("Generating HP heatmap (49 simulations)...")
    p1 = plot_hp_heatmap(prices)
    logger.info("Generating equity curves (4 simulations)...")
    p2 = plot_equity_curves(prices)
    print(f"\nPlots saved:\n  {p1}\n  {p2}\n")
 if __name__ == "__main__":
    main()
--- a/backtest/simulate.py
+++ b/backtest/simulate.py
@@ -118,7 +118,7 @@ class Strategy:
        return self.entry_cost + self.exit_cost
-def simulate(strategy: Strategy) -> dict:
+def simulate(strategy: Strategy, prices: dict | None = None) -> dict:
    signals = get_signals_for_backtest(strategy.min_score, strategy.min_cluster)
    # Filter malformed dates
@@ -137,7 +137,8 @@ def simulate(strategy: Strategy) -> dict:
    if not signals:
        return {"error": "No signals after filtering"}
-    prices = _load_all_prices()
+    if prices is None:
        prices = _load_all_prices()
    # Build trade list: {entry_date_str: [(ticker, exit_date_str, signal)]}
    trades_by_entry: dict[str, list] = defaultdict(list)
@@ -313,6 +314,7 @@ def simulate(strategy: Strategy) -> dict:
            "win_rate_pct": round(win_rate * 100, 2),
            "avg_net_return_pct": round(avg_net_return * 100, 3),
        },
        "equity_curve": equity_curve,
    }
--- a/main.py
+++ b/main.py
@@ -132,11 +132,18 @@ def cmd_simulate():
    sim_main()
 def cmd_plot():
    """Generate HP heatmap and equity curve plots. Saves PNGs to plots/."""
    from backtest.plot import main as plot_main
    plot_main()
 COMMANDS = {
    "run": cmd_run,
    "backfill": cmd_backfill,
    "backtest": cmd_backtest,
    "simulate": cmd_simulate,
    "plot": cmd_plot,
 }
 if __name__ == "__main__":
--- a/plots/equity_curves.png
+++ b/plots/equity_curves.png
--- a/plots/hp_sweep.png
+++ b/plots/hp_sweep.png