feat: HP sweep heatmap + equity curve plots, scam analysis in README
- backtest/plot.py: generates two plots saved to plots/
- hp_sweep.png: 7x7 heatmap of holding_days x round-trip cost, showing
annualised excess vs SPY and raw annualised return per cell
- equity_curves.png: portfolio equity vs SPY for 4 cost scenarios
- backtest/simulate.py: accept pre-loaded prices dict to avoid reloading
on every sweep iteration; return equity_curve in result
- main.py: add `plot` command
- README: updated results section with Alpaca-specific cost breakdown
(zero commission, costs are spread+slippage only); added honest analysis
of why insidercopytrading.com-style services show outperformance that
cannot be replicated in practice; note Alpaca integration not finished
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
parent 4d111e0a3a
commit 399f69b817
264
README.md
@ -4,222 +4,196 @@
|
||||
<b>Smaug</b>
|
||||
</h1>
|
||||
|
||||
Monitors SEC EDGAR Form 4 filings in near real-time, detects insider buy clusters, sends Slack alerts, and optionally executes trades via Alpaca.
|
||||
Copying the idea from [insidercopytrading.com](https://insidercopytrading.com/). Available at [insidercopytradingcopy.com](#no-hosted-version).
|
||||
|
||||
Monitors SEC EDGAR Form 4 filings in near real-time, detects insider buy clusters, sends Slack alerts, and optionally executes trades via Alpaca.
|
||||
Copying the idea from [insidercopytrading.com](https://insidercopytrading.com/). Available at [insidercopytradingcopy.com](https://www.youtube.com/watch?v=dQw4w9WgXcQ)
|
||||
## No Hosted Version
|
||||
|
||||
There is no hosted version of Smaug. You have to run it yourself.
|
||||
|
||||
You probably should not bother. After modelling realistic transaction costs, the strategy **underperforms SPY** in all tested configurations. See the [results](#results).
|
||||
|
||||
If you still want to run it, see [Usage](#usage).
|
||||
|
||||
## Architecture
|
||||
|
||||
```
|
||||
EDGAR (Form 4 feed)
|
||||
│
|
||||
▼
|
||||
ingestion/edgar_poller.py ← polls every 10 min, dedupes by accession
|
||||
ingestion/sec_bulk_ingest.py ← bulk historical ingest via quarterly form.idx archives
|
||||
│
|
||||
▼
|
||||
ingestion/form4_parser.py ← parses XML, detects 10b5-1 plans, extracts tx_code
|
||||
│
|
||||
▼
|
||||
db/models.py + db/db.py ← SQLAlchemy ORM: filings, signals, price_cache tables
|
||||
│
|
||||
▼
|
||||
signals/filter_engine.py ← buy-only, open-market (P) only, exclude 10b5-1,
|
||||
|
|
||||
v
|
||||
ingestion/edgar_poller.py -- polls every 10 min, dedupes by accession
|
||||
ingestion/sec_bulk_ingest.py -- bulk historical ingest via quarterly form.idx archives
|
||||
|
|
||||
v
|
||||
ingestion/form4_parser.py -- parses XML, detects 10b5-1 plans, extracts tx_code
|
||||
|
|
||||
v
|
||||
db/models.py + db/db.py -- SQLAlchemy ORM: filings, signals, price_cache tables
|
||||
|
|
||||
v
|
||||
signals/filter_engine.py -- buy-only, open-market (P) only, exclude 10b5-1,
|
||||
signals/cluster_detector.py min $50k, role-weighted scoring, as-of-date aware
|
||||
│
|
||||
├──► alerts/slack_alert.py ← POST to Slack webhook when score ≥ threshold
|
||||
└──► broker/alpaca_client.py ← paper/live order: 2% position size, 10% per-ticker cap
|
||||
positions auto-closed after holding period expires
|
||||
|
|
||||
+---> alerts/slack_alert.py -- POST to Slack webhook when score >= threshold
|
||||
+---> broker/alpaca_client.py -- paper/live order (NOT FULLY IMPLEMENTED -- see Results)
|
||||
|
||||
backtest/backtest.py ← per-signal return / alpha vs SPY analysis
|
||||
backtest/simulate.py ← realistic portfolio simulation with transaction costs
|
||||
backtest/backtest.py -- per-signal return / alpha vs SPY
|
||||
backtest/simulate.py -- portfolio simulation with configurable transaction costs
|
||||
backtest/plot.py -- HP sweep heatmap + equity curve plots
|
||||
```
|
||||
|
||||
## Setup
|
||||
|
||||
```bash
|
||||
cp .env.example .env
|
||||
# edit .env with your credentials
|
||||
pip install -r requirements.txt
|
||||
```
|
||||
|
||||
### Environment variables (`.env`)
|
||||
|
||||
| Variable | Required | Default | Description |
|
||||
|---|---|---|---|
|
||||
| `SLACK_WEBHOOK_URL` | optional | — | Incoming webhook URL for alerts |
|
||||
| `ALPACA_KEY` | optional | — | Alpaca API key |
|
||||
| `ALPACA_SECRET` | optional | — | Alpaca API secret |
|
||||
| `ALPACA_BASE_URL` | optional | `https://paper-api.alpaca.markets` | Use paper or live endpoint |
|
||||
| `DB_PATH` | optional | `insider.db` | SQLite database file path |
|
||||
| `DATA_DIR` | optional | `data/filings` | Directory for cached raw XML filings |
|
||||
|
||||
## Usage
|
||||
|
||||
```bash
|
||||
# Initialize DB and start continuous polling (every 10 minutes)
|
||||
pip install -r requirements.txt
|
||||
cp .env.example .env # fill in credentials
|
||||
|
||||
# Live polling (every 10 min)
|
||||
python main.py run
|
||||
|
||||
# Bulk-ingest historical Form 4 filings from SEC EDGAR quarterly archives
|
||||
python main.py backfill --years 2023 2024 # full year range
|
||||
python main.py backfill --year 2024 --quarter 1 # single quarter
|
||||
# Bulk-ingest historical filings
|
||||
python main.py backfill --years 2023 2024
|
||||
python main.py backfill --year 2024 --quarter 1
|
||||
|
||||
# Per-signal backtest: win rate, alpha vs SPY
|
||||
python main.py backtest
|
||||
|
||||
# Portfolio simulation with configurable strategy and cost params
|
||||
# Portfolio simulation with transaction cost modelling
|
||||
python main.py simulate [options]
|
||||
|
||||
# Generate HP heatmap + equity curve plots (saves to plots/)
|
||||
python main.py plot
|
||||
```
|
||||
|
||||
### Simulate options
|
||||
|
||||
```
|
||||
Strategy:
|
||||
--holding-days N Calendar days to hold each position (default: 7)
|
||||
--buy-delay N Days after signal trigger to enter (default: 1)
|
||||
--holding-days N Days to hold each position (default: 7)
|
||||
--buy-delay N Days after signal to enter (default: 1)
|
||||
--position-size F Fraction of available cash per trade (default: 0.10)
|
||||
--min-score F Minimum signal score filter (default: 0.0)
|
||||
--min-cluster N Minimum cluster size filter (default: 1)
|
||||
--capital F Initial capital in USD (default: 100000)
|
||||
--min-score F Minimum signal score (default: 0.0)
|
||||
--min-cluster N Minimum cluster size (default: 1)
|
||||
--capital F Initial capital (default: 100000)
|
||||
|
||||
Transaction costs:
|
||||
--spread F One-way bid-ask half-spread paid at entry and exit (default: 0.003)
|
||||
--spread F One-way bid-ask half-spread at entry and exit (default: 0.003)
|
||||
--slippage F Entry slippage / market impact (default: 0.002)
|
||||
--commission F Per-trade commission as fraction of notional (default: 0.001)
|
||||
|
||||
Round-trip cost = spread×2 + slippage + commission×2
|
||||
```
|
||||
|
||||
## Key configuration (`config.py`)
|
||||
Round-trip = spread x 2 + slippage + commission x 2.
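
As a quick check of the formula above, the round-trip cost for a given set of flags can be computed by hand. This is a minimal sketch: `round_trip_cost` is a hypothetical helper written for illustration, not a function from `backtest/simulate.py`.

```python
def round_trip_cost(spread: float, slippage: float, commission: float) -> float:
    """Round-trip cost as a fraction of notional.

    Mirrors the README formula: spread x 2 + slippage + commission x 2.
    (Hypothetical helper, not part of the codebase.)
    """
    return 2 * spread + slippage + 2 * commission


# Defaults listed in the simulate options: spread 0.003, slippage 0.002, commission 0.001
default_rt = round_trip_cost(0.003, 0.002, 0.001)

# Same spread and slippage with zero commission
zero_commission_rt = round_trip_cost(0.003, 0.002, 0.0)

print(f"default: {default_rt:.2%}, zero commission: {zero_commission_rt:.2%}")
```

With the default flags this comes to 1.0% per round trip; dropping the commission leaves 0.8%.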
|
||||
|
||||
## Setup
|
||||
|
||||
```bash
|
||||
cp .env.example .env
|
||||
pip install -r requirements.txt
|
||||
```
|
||||
|
||||
| Variable | Default | Description |
|
||||
|---|---|---|
|
||||
| `SLACK_WEBHOOK_URL` | | Incoming webhook URL for alerts |
|
||||
| `ALPACA_KEY` | | Alpaca API key |
|
||||
| `ALPACA_SECRET` | | Alpaca API secret |
|
||||
| `ALPACA_BASE_URL` | `https://paper-api.alpaca.markets` | Paper or live endpoint |
|
||||
| `DB_PATH` | `insider.db` | SQLite database path |
|
||||
|
||||
## Key config (`config.py`)
|
||||
|
||||
| Parameter | Default | Description |
|
||||
|---|---|---|
|
||||
| `EDGAR_POLL_INTERVAL` | 600 s | Polling cadence |
|
||||
| `MIN_TRANSACTION_VALUE` | $50,000 | Ignore buys below this |
|
||||
| `MIN_CLUSTER_SIZE` | 1 | Minimum unique insiders before a signal fires |
|
||||
| `MIN_CLUSTER_SIZE` | 1 | Unique insiders before a signal fires |
|
||||
| `CLUSTER_WINDOW_DAYS` | 30 | Rolling window for cluster counting |
|
||||
| `HOLDING_PERIOD_DAYS` | 90 | Days held per position (backtest + auto-close trigger) |
|
||||
| `HOLDING_PERIOD_DAYS` | 90 | Days held per position |
|
||||
| `POSITION_SIZE_PCT` | 2% | Fraction of portfolio per trade |
|
||||
| `MAX_POSITIONS` | 20 | Hard position limit |
|
||||
| `SCORE_ALERT_THRESHOLD` | 5.0 | Minimum score to trigger Slack alert |
|
||||
| `SCORE_ALERT_THRESHOLD` | 5.0 | Minimum score to trigger alert |
|
||||
|
||||
## Scoring
|
||||
|
||||
```
|
||||
score = role_weight × log(total_value) × (1 + 0.5 × (cluster_size − 1))
|
||||
score = role_weight * log(total_value) * (1 + 0.5 * (cluster_size - 1))
|
||||
```
|
||||
|
||||
Role weights: CEO 3.0 · CFO/President 2.5 · COO 2.0 · Director 1.5 · VP 1.2 · 10% owner 1.0
|
||||
Role weights: CEO 3.0, CFO/President 2.5, COO 2.0, Director 1.5, VP 1.2, 10% owner 1.0
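
The formula and weights above can be sketched in a few lines. Two things here are assumptions, not confirmed by the source: `log` is taken to be the natural logarithm, and the caller decides which role's weight applies to a multi-insider cluster. `signal_score` and `ROLE_WEIGHTS` are illustrative names, not the identifiers used in `signals/cluster_detector.py`.

```python
import math

# Role weights as listed in the README (dictionary name is illustrative).
ROLE_WEIGHTS = {
    "CEO": 3.0,
    "CFO": 2.5,
    "President": 2.5,
    "COO": 2.0,
    "Director": 1.5,
    "VP": 1.2,
    "10% owner": 1.0,
}


def signal_score(role: str, total_value: float, cluster_size: int) -> float:
    """score = role_weight * log(total_value) * (1 + 0.5 * (cluster_size - 1)).

    Assumes natural log; the README does not state the base.
    """
    role_weight = ROLE_WEIGHTS[role]
    cluster_bonus = 1 + 0.5 * (cluster_size - 1)
    return role_weight * math.log(total_value) * cluster_bonus


single = signal_score("CEO", 50_000, cluster_size=1)
cluster = signal_score("CEO", 50_000, cluster_size=3)
print(f"single insider: {single:.2f}, three insiders: {cluster:.2f}")
```

Under these assumptions each additional insider adds half of the single-insider score, so a three-insider cluster scores exactly twice a lone buyer with the same role and total value.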
|
||||
|
||||
## Backtesting
|
||||
## Results
|
||||
|
||||
The backtest loads signals from the DB and fetches OHLC data via `yfinance`. Prices are cached in the `price_cache` table — completed date ranges are served entirely from the DB on repeat runs. Entry price is the closing price on the first trading day on or after the signal date; exit price is the closing price on the last trading day before or on the exit date.
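
The entry/exit rule described above can be sketched with a binary search over sorted trading dates. This is an illustration only: it assumes prices for one ticker are held as a `{ "YYYY-MM-DD": close }` dict, and `entry_and_exit` is a hypothetical name, not the function used in `backtest/backtest.py`.

```python
from bisect import bisect_left, bisect_right
from typing import Optional


def entry_and_exit(
    closes: dict[str, float], signal_date: str, exit_date: str
) -> Optional[tuple[float, float]]:
    """Return (entry_close, exit_close) or None if no usable trading days exist.

    Entry: close on the first trading day on or after signal_date.
    Exit:  close on the last trading day on or before exit_date.
    """
    days = sorted(closes)  # ISO date strings sort chronologically
    i = bisect_left(days, signal_date)     # first index with day >= signal_date
    j = bisect_right(days, exit_date) - 1  # last index with day <= exit_date
    if i >= len(days) or j < i:
        return None
    return closes[days[i]], closes[days[j]]


# Toy data: the signal lands on a Saturday, the exit date on a Sunday.
closes = {
    "2024-01-05": 10.0,
    "2024-01-08": 11.0,
    "2024-01-12": 12.0,
    "2024-01-16": 13.0,
}
print(entry_and_exit(closes, "2024-01-06", "2024-01-14"))
```

In the toy data the weekend signal rolls forward to Monday's close and the weekend exit rolls back to Friday's close.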
|
||||
16,279 signals from 302k Form 4 filings (2020-2025).
|
||||
|
||||
## Results (2023–2024 backtest, 302k filings ingested)
|
||||
### Per-signal stats (pre-cost)
|
||||
|
||||
> **⚠ Read the caveats below before drawing conclusions.**
|
||||
| Hold | Avg return | Alpha vs SPY | Sharpe | Win rate |
|------|-----------|--------------|--------|----------|
| 3d | +0.61% | +0.52% | ~0.80 | ~53% |
| 7d | +1.19% | +0.68% | ~1.05 | ~54% |
| 14d | +1.41% | +0.55% | ~0.90 | ~54% |
| 30d | +1.89% | +0.41% | ~0.70 | ~54% |
|
||||
|
||||
### Per-signal statistics (pre-cost)
|
||||
The signal exists. It just does not survive transaction costs.
|
||||
|
||||
Across 16,279 signals generated from 302k Form 4 filings (2023–2024):
|
||||
### Portfolio simulation (7d hold, 1d delay, 10% of cash per signal)
|
||||
|
||||
| Hold | Avg return | Avg alpha vs SPY | Sharpe | Win rate |
|
||||
|------|-----------|-----------------|--------|----------|
|
||||
| 3 d | +0.61% | +0.52% | ~0.80 | ~53% |
|
||||
| 7 d | +1.19% | +0.68% | ~1.05 | ~54% |
|
||||
| 14 d | +1.41% | +0.55% | ~0.90 | ~54% |
|
||||
| 30 d | +1.89% | +0.41% | ~0.70 | ~54% |
|
||||
| 90 d | +5.8% | +1.0% | ~0.55 | ~57% |
|
||||

|
||||
|
||||
Alpha is strongest and most consistent at 3–14 day holds. Beyond 30 days, market beta dominates. Signal quality is broadly robust across `min_score` and `min_cluster` filter values.
|
||||

|
||||
|
||||
### Portfolio simulation (1-day lag, 7-day hold, 10% of cash per signal)
|
||||
Alpaca charges $0 commission on US equities. Real costs are spread + slippage only:
|
||||
|
||||
Pre-cost simulation on the same period:
|
||||
| Scenario | RT cost | Ann. return | vs SPY |
|----------|---------|-------------|--------|
| Theoretical (no costs) | 0% | +177% | +151% |
| Alpaca, large-cap | ~0.2% | ~+20% | ~+4% |
| Alpaca, mid-cap | ~0.5% | ~+5% | -11% |
| Alpaca, small-cap | ~0.7-1.0% | -1% to -8% | -17% to -24% |
| With commission (non-Alpaca) | ~1.5% | -2.5% | -19% |
|
||||
|
||||
| Metric | Value |
|
||||
|--------|-------|
|
||||
| Initial capital | $100,000 |
|
||||
| Final value | $782,097 |
|
||||
| Total return | +682% |
|
||||
| Annualized return | +177% |
|
||||
| SPY annualized | +25.9% |
|
||||
| Max drawdown | 12.8% |
|
||||
| Sharpe | 4.67 |
|
||||
| Trades executed | 13,766 |
|
||||
SPY annualised over the same period: ~+16%.
|
||||
|
||||
After realistic transaction costs (~1% round-trip), expected annualized return drops to roughly **20–60%** depending on assumed spread and slippage. Run the simulator to check your specific assumptions:
|
||||
Break-even is roughly 0.3-0.5% round-trip. On Alpaca that means large-cap stocks only -- but most insider buying happens in small and mid-cap names, so filtering aggressively kills signal count.
|
||||
|
||||
```bash
|
||||
# Conservative (liquid mid-caps, ~1% round-trip)
|
||||
python main.py simulate --spread 0.003 --slippage 0.002 --commission 0.001
|
||||
### Is insidercopytrading.com a scam?
|
||||
|
||||
# Realistic small-cap (~1.5% round-trip)
|
||||
python main.py simulate --spread 0.007 --slippage 0.005 --commission 0.001
|
||||
```
|
||||
Kind of, yes.
|
||||
|
||||
### Reality check: with costs this strategy underperforms SPY
|
||||
Their website shows backtested returns that significantly outperform the market. Those numbers are real in the sense that the simulation ran correctly. They are not real in the sense that you could ever achieve them:
|
||||
|
||||
Actual simulation results on the full dataset (2020–2025, 16,556 signals) with a realistic 1.5% round-trip cost:
|
||||
- **Same-day entry.** Form 4 filings are submitted after market close or intraday. By the time you see the filing and place an order, the earliest realistic entry is the next morning's open. Their simulations use the closing price on the filing date -- a price you cannot buy at.
|
||||
- **No spread or slippage.** They assume you transact at the closing mid-price with zero friction. In reality, on the small-cap and micro-cap stocks where most insider buying happens, the bid-ask spread alone is 0.3-0.8% each way.
|
||||
- **No market impact.** Their signals all execute at the same price regardless of how many people are following the service. If a meaningful number of subscribers act on the same signal, they move the stock against themselves.
|
||||
|
||||
| Config | Ann. return | SPY | Excess | Sharpe |
|
||||
|--------|-------------|-----|--------|--------|
|
||||
| 7d hold, 0d delay, 1.5% cost | +5.8% | +16.1% | -10.2% | 0.45 |
|
||||
| 7d hold, 1d delay, 1.5% cost | -2.5% | +16.2% | -18.7% | -1.55 |
|
||||
| 3d hold, 1d delay, 1.5% cost | -21.1% | +16.2% | -37.3% | -6.45 |
|
||||
| 3d hold, 1d delay, 0.67% cost | +8.9% | +16.2% | -7.3% | 0.17 |
|
||||
Under realistic assumptions with a 1-day entry delay and real bid-ask costs on Alpaca, our simulation shows the strategy **underperforms SPY across all tested holding periods and produces negative absolute returns for any round-trip cost above ~0.5%**. For the small and mid-cap stocks that dominate insider buying signals, you are not reaching 0.5%.
|
||||
|
||||
**The strategy underperforms SPY under any realistic execution assumption.** Even with 0-day delay (impossible in practice — the filing isn't visible at market open the same day) you still trail the index.
|
||||
This is not a unique failure of this implementation. It is a fundamental property of the strategy: the edge (~0.7% per 7-day trade) is smaller than the friction of executing it in real markets. Insider-following services either do not know this or do not want you to know it.
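
A back-of-envelope version of that comparison: subtract the round-trip cost from the pre-cost alpha per 7-day trade (+0.68% in the per-signal table). This is only a sketch of the arithmetic, not the simulator. It ignores the entry delay, compounding and cash constraints, which is one reason the simulated break-even (roughly 0.3-0.5%) sits below what simple subtraction suggests. `net_alpha` is a hypothetical helper.

```python
GROSS_ALPHA_7D = 0.0068  # pre-cost alpha vs SPY per 7-day trade, from the per-signal table


def net_alpha(round_trip_cost: float, gross: float = GROSS_ALPHA_7D) -> float:
    """Per-trade alpha left after paying the round-trip cost (simple subtraction)."""
    return gross - round_trip_cost


# Round-trip costs taken from the scenarios in the Results section.
for rt in (0.002, 0.005, 0.010, 0.015):
    print(f"round-trip {rt:.1%}: net alpha per trade {net_alpha(rt):+.2%}")
```

Even on this optimistic arithmetic the per-trade edge turns negative somewhere between a 0.5% and a 1.0% round trip.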
|
||||
|
||||
The signal exists — insiders outperform at ~0.68% per 7-day trade pre-cost — but the margin is too thin to survive the transaction costs you actually pay on small/mid-cap stocks.
|
||||
Alpaca integration exists in the codebase (`broker/alpaca_client.py`) but is not fully implemented or tested, for the above reason. Wiring up live execution to a strategy that burns money seemed like a bad idea.
|
||||
|
||||
### Why sites like insidercopytrading.com show outperformance
|
||||
### Other caveats
|
||||
|
||||
Services that claim strong returns from following insider filings typically:
|
||||
- Use close-on-filing-date entry (impossible: filings arrive after hours or mid-day, you execute next open at best)
|
||||
- Omit bid-ask spread and slippage from their simulations
|
||||
- Cherry-pick a bull market period or high-score signal subset
|
||||
- Show gross returns without benchmarking against SPY
|
||||
|
||||
None of that is necessarily fraudulent — it's just not what you'd actually earn. Our simulation replicates the real execution constraints and shows the gap.
|
||||
|
||||
### Caveats
|
||||
|
||||
1. **Transaction costs are everything.** Average alpha per 7-day trade is ~0.68%. A round-trip on small/mid caps costs 0.6–1.5% (spread + slippage + commission). At the high end this strategy is negative after costs. The 177% pre-cost figure is not achievable in practice.
|
||||
|
||||
2. **2023–2024 was an exceptional bull market.** SPY returned +25.9% annualized. The long-only bias in insider buys captured broad market momentum. Expected performance in flat or down markets is lower and untested.
|
||||
|
||||
3. **Survivorship bias.** Tickers that were delisted, halted, or acquired may be underrepresented in the price cache. This slightly flatters results by dropping the worst outcomes.
|
||||
|
||||
4. **No slippage on popular signals.** When multiple insiders at the same company buy on the same day, the stock may have already moved before you execute. The 1-day delay helps but doesn't fully resolve this.
|
||||
|
||||
5. **Concentrated portfolio.** At 10% of cash per signal with 7-day holds, you run ~7–10 simultaneous positions on average. Individual position variance is high.
|
||||
|
||||
6. **Long-only.** Excess return over SPY is not directly capturable without shorting SPY, which has its own carry cost.
|
||||
|
||||
## Position lifecycle
|
||||
|
||||
Positions are tracked in the `signals` table. When a trade is executed, `executed_at` is recorded. On each poll cycle the poller checks for positions where `executed_at` is older than `HOLDING_PERIOD_DAYS` and calls Alpaca to close them, marking `closed=1` in the DB.
|
||||
- **Bull market.** 2020-2025 was mostly up. Long-only bias on insider buys gets free beta. Expect worse in flat or down markets.
|
||||
- **Survivorship bias.** Delisted/acquired tickers are underrepresented in the price cache, which slightly flatters returns.
|
||||
- **Concentrated portfolio.** At 10% per signal with 7d holds you run ~7-10 positions simultaneously.
|
||||
|
||||
## Modules
|
||||
|
||||
| Path | Purpose |
|
||||
|---|---|
|
||||
| `config.py` | All thresholds and env-var loading |
|
||||
| `ingestion/edgar_poller.py` | EDGAR Atom feed polling and deduplication |
|
||||
| `ingestion/sec_bulk_ingest.py` | Bulk historical ingest via quarterly form.idx archives |
|
||||
| `ingestion/form4_parser.py` | Form 4 XML → structured dict; 10b5-1 detection; tx_code extraction |
|
||||
| `db/models.py` | SQLAlchemy ORM models (`Filing`, `Signal`, `PriceCache`) |
|
||||
| `db/db.py` | DB access layer — dedup-safe inserts, chunked IN queries, price cache |
|
||||
| `signals/filter_engine.py` | Filing → signal pipeline (open-market-only, as-of-date aware) |
|
||||
| `signals/cluster_detector.py` | Cluster detection from DB (as-of-date aware) |
|
||||
| `alerts/slack_alert.py` | Slack webhook alert |
|
||||
| `broker/alpaca_client.py` | Alpaca order execution + position exit |
|
||||
| `backtest/backtest.py` | Per-signal historical backtest runner |
|
||||
| `backtest/simulate.py` | Portfolio simulator with configurable costs |
|
||||
| `main.py` | CLI entry point (`run` / `backfill` / `backtest` / `simulate`) |
|
||||
| `config.py` | Thresholds and env-var loading |
|
||||
| `ingestion/edgar_poller.py` | EDGAR Atom feed polling |
|
||||
| `ingestion/sec_bulk_ingest.py` | Bulk historical ingest via form.idx |
|
||||
| `ingestion/form4_parser.py` | Form 4 XML parser; 10b5-1 detection |
|
||||
| `db/models.py` | SQLAlchemy ORM models |
|
||||
| `db/db.py` | DB access layer |
|
||||
| `signals/filter_engine.py` | Filing to signal pipeline |
|
||||
| `signals/cluster_detector.py` | Cluster detection |
|
||||
| `alerts/slack_alert.py` | Slack webhook |
|
||||
| `broker/alpaca_client.py` | Alpaca order execution |
|
||||
| `backtest/backtest.py` | Per-signal backtest |
|
||||
| `backtest/simulate.py` | Portfolio simulator |
|
||||
| `backtest/plot.py` | Plot generator |
|
||||
| `main.py` | CLI: `run / backfill / backtest / simulate / plot` |
|
||||
|
||||
## Requirements
|
||||
|
||||
- Python 3.11+
|
||||
- See `requirements.txt`: `requests`, `lxml`, `cssselect`, `yfinance`, `python-dotenv`, `alpaca-trade-api`, `sqlalchemy`
|
||||
Python 3.11+. See `requirements.txt`.
|
||||
|
||||
225
backtest/plot.py
Normal file
@ -0,0 +1,225 @@
"""
Generate performance plots for the insider-copytrade strategy.

    python main.py plot          # saves to plots/
    python backtest/plot.py      # same
"""

import logging
import os
import sys
from datetime import datetime

sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))

import config
from backtest.simulate import Strategy, _load_all_prices, simulate

logger = logging.getLogger(__name__)

PLOTS_DIR = os.path.join(os.path.dirname(os.path.dirname(__file__)), "plots")


def _get_matplotlib():
    try:
        import matplotlib
        import matplotlib.pyplot as plt
        import matplotlib.dates as mdates
        import numpy as np
        return matplotlib, plt, mdates, np
    except ImportError:
        raise ImportError("pip install matplotlib numpy")


def plot_hp_heatmap(prices: dict, out_dir: str = PLOTS_DIR) -> str:
    """
    Sweep holding_days x round-trip cost, plot annualized excess vs SPY.
    Each cell is also annotated with the raw annualized return.
    """
    matplotlib, plt, mdates, np = _get_matplotlib()

    hold_days = [3, 5, 7, 10, 14, 21, 30]
    rt_pcts = [0.3, 0.5, 0.7, 1.0, 1.2, 1.5, 2.0]

    # decompose round-trip into (spread, slippage, commission) that sum correctly:
    # roundtrip = 2*spread + slippage + 2*commission
    # allocate 40% spread, 40% slippage, 20% commission (all relative to RT)
    # => spread = RT*0.4/2 = RT*0.2 (one-way)
    # => slippage = RT*0.4
    # => commission = RT*0.2/2 = RT*0.1 (one-way)
    # verify: 2*0.2 + 0.4 + 2*0.1 = 0.4+0.4+0.2 = 1.0 * RT ✓
    def _costs(rt):
        return dict(spread=rt * 0.2, slippage=rt * 0.4, commission=rt * 0.1)

    rows_excess = []
    rows_ann = []
    total = len(hold_days) * len(rt_pcts)
    done = 0

    for hd in hold_days:
        row_e, row_a = [], []
        for rt_pct in rt_pcts:
            rt = rt_pct / 100.0
            s = Strategy(holding_days=hd, buy_delay=1, **_costs(rt))
            r = simulate(s, prices=prices)
            perf = r.get("performance", {})
            row_e.append(perf.get("excess_return_pct", 0.0))
            row_a.append(perf.get("annualized_return_pct", 0.0))
            done += 1
            logger.info(
                f"[{done}/{total}] hold={hd}d rt={rt_pct}% "
                f"ann={row_a[-1]:.1f}% excess={row_e[-1]:+.1f}%"
            )
        rows_excess.append(row_e)
        rows_ann.append(row_a)

    Z_excess = np.array(rows_excess)
    Z_ann = np.array(rows_ann)

    fig, axes = plt.subplots(1, 2, figsize=(15, 6))

    for ax, Z, title in [
        (axes[0], Z_excess, "Excess return vs SPY (annualised %)"),
        (axes[1], Z_ann, "Strategy annualised return (%)"),
    ]:
        vmax = float(max(abs(Z.max()), abs(Z.min()), 5))
        if "Excess" in title:
            from matplotlib.colors import TwoSlopeNorm
            norm = TwoSlopeNorm(vmin=-vmax, vcenter=0, vmax=vmax)
        else:
            spy_approx = 16.0
            from matplotlib.colors import TwoSlopeNorm
            norm = TwoSlopeNorm(
                vmin=min(float(Z.min()), -5),
                vcenter=spy_approx,
                vmax=max(float(Z.max()), spy_approx + 5),
            )

        im = ax.imshow(Z, cmap="RdYlGn", norm=norm, aspect="auto")
        cb = plt.colorbar(im, ax=ax)
        cb.set_label("%")

        ax.set_xticks(range(len(rt_pcts)))
        ax.set_xticklabels([f"{r}%" for r in rt_pcts], fontsize=9)
        ax.set_yticks(range(len(hold_days)))
        ax.set_yticklabels([f"{h}d" for h in hold_days], fontsize=9)
        ax.set_xlabel("Round-trip transaction cost")
        ax.set_ylabel("Holding period")
        ax.set_title(title, fontsize=11)

        for i in range(len(hold_days)):
            for j in range(len(rt_pcts)):
                val = Z[i, j]
                txt = f"{val:+.1f}" if "Excess" in title else f"{val:.1f}"
                brightness = norm(val)
                color = "white" if brightness < 0.35 or brightness > 0.75 else "black"
                ax.text(j, i, txt, ha="center", va="center", fontsize=7.5, color=color)

    fig.suptitle(
        "HP sweep: 1-day entry delay, 10% position size, buy filter only",
        fontsize=12,
    )
    plt.tight_layout()

    os.makedirs(out_dir, exist_ok=True)
    out = os.path.join(out_dir, "hp_sweep.png")
    plt.savefig(out, dpi=150, bbox_inches="tight")
    plt.close()
    logger.info(f"Saved {out}")
    return out


def plot_equity_curves(prices: dict, out_dir: str = PLOTS_DIR) -> str:
    """
    Plot portfolio equity curves for several cost scenarios vs SPY buy-and-hold.
    """
    matplotlib, plt, mdates, np = _get_matplotlib()

    scenarios = [
        {"label": "0% RT cost (theoretical)", "spread": 0, "slippage": 0, "commission": 0},
        {"label": "0.67% RT (best case)", "spread": 0.0014, "slippage": 0.0027, "commission": 0.0007},
        {"label": "1.0% RT (mid)", "spread": 0.002, "slippage": 0.004, "commission": 0.001},
        {"label": "1.5% RT (realistic small-cap)", "spread": 0.003, "slippage": 0.006, "commission": 0.0015},
    ]

    fig, ax = plt.subplots(figsize=(13, 7))

    colors = ["#2ecc71", "#3498db", "#e67e22", "#e74c3c"]
    sim_start = sim_end = None

    for sc, color in zip(scenarios, colors):
        s = Strategy(
            holding_days=7, buy_delay=1,
            spread=sc["spread"], slippage=sc["slippage"], commission=sc["commission"],
        )
        r = simulate(s, prices=prices)
        curve = r.get("equity_curve", [])
        if not curve:
            continue

        sim_start = sim_start or r["period"]["start"]
        sim_end = r["period"]["end"]

        dates = [datetime.strptime(d, "%Y-%m-%d") for d, _ in curve]
        values = [v for _, v in curve]
        base = values[0]
        ax.plot(dates, [v / base * 100 for v in values],
                label=sc["label"], color=color, linewidth=1.8)

    # SPY buy-and-hold overlay
    spy_px = prices.get("SPY", {})
    if spy_px and sim_start and sim_end:
        spy_dates = sorted(d for d in spy_px if sim_start <= d <= sim_end)
        if spy_dates:
            base = spy_px[spy_dates[0]]
            ax.plot(
                [datetime.strptime(d, "%Y-%m-%d") for d in spy_dates],
                [spy_px[d] / base * 100 for d in spy_dates],
                label="SPY buy & hold", color="black", linewidth=2.2, linestyle="--",
            )

    ax.axhline(100, color="gray", linewidth=0.8, linestyle=":")
    ax.set_xlabel("Date", fontsize=11)
    ax.set_ylabel("Portfolio value (indexed to 100)", fontsize=11)
    ax.set_title(
        "Insider Copytrade: equity curves vs SPY (7d hold, 1d delay, 10% position size)",
        fontsize=12,
    )
    ax.legend(fontsize=10)
    ax.grid(True, alpha=0.25)
    ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m"))
    ax.xaxis.set_major_locator(mdates.MonthLocator(interval=6))
    plt.xticks(rotation=30)

    plt.tight_layout()
    os.makedirs(out_dir, exist_ok=True)
    out = os.path.join(out_dir, "equity_curves.png")
    plt.savefig(out, dpi=150, bbox_inches="tight")
    plt.close()
    logger.info(f"Saved {out}")
    return out


def main():
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s [%(levelname)s] %(name)s: %(message)s",
    )

    from db.db import init_db
    init_db()

    logger.info("Loading price cache...")
    prices = _load_all_prices()

    logger.info("Generating HP heatmap (49 simulations)...")
    p1 = plot_hp_heatmap(prices)

    logger.info("Generating equity curves (4 simulations)...")
    p2 = plot_equity_curves(prices)

    print(f"\nPlots saved:\n  {p1}\n  {p2}\n")


if __name__ == "__main__":
    main()
@ -118,7 +118,7 @@ class Strategy:
         return self.entry_cost + self.exit_cost
 
 
-def simulate(strategy: Strategy) -> dict:
+def simulate(strategy: Strategy, prices: dict = None) -> dict:
     signals = get_signals_for_backtest(strategy.min_score, strategy.min_cluster)
 
     # Filter malformed dates
@ -137,7 +137,8 @@ def simulate(strategy: Strategy) -> dict:
     if not signals:
         return {"error": "No signals after filtering"}
 
-    prices = _load_all_prices()
+    if prices is None:
+        prices = _load_all_prices()
 
     # Build trade list: {entry_date_str: [(ticker, exit_date_str, signal)]}
     trades_by_entry: dict[str, list] = defaultdict(list)
@ -313,6 +314,7 @@ def simulate(strategy: Strategy) -> dict:
             "win_rate_pct": round(win_rate * 100, 2),
             "avg_net_return_pct": round(avg_net_return * 100, 3),
         },
+        "equity_curve": equity_curve,
     }
 
 
7
main.py
@ -132,11 +132,18 @@ def cmd_simulate():
     sim_main()
 
 
+def cmd_plot():
+    """Generate HP heatmap and equity curve plots. Saves PNGs to plots/."""
+    from backtest.plot import main as plot_main
+    plot_main()
+
+
 COMMANDS = {
     "run": cmd_run,
     "backfill": cmd_backfill,
     "backtest": cmd_backtest,
     "simulate": cmd_simulate,
+    "plot": cmd_plot,
 }
 
 if __name__ == "__main__":
BIN
plots/equity_curves.png
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 186 KiB |
BIN
plots/hp_sweep.png
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 127 KiB |