Commit Graph

27 Commits

Author SHA1 Message Date
120a77aba3 plots: regenerate with cap-tier equity curves and SPY end-date fix
- equity_curves.png: now shows large/mid/small cap tiers with Alpaca costs
  vs theoretical no-cost baseline; SPY clamped to last strategy data point
- hp_sweep.png: updated to Alpaca zero-commission cost decomposition

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 18:54:07 +02:00
046c286fb6 docs: update results table with real cap-tier simulation numbers
Large- and mid-cap tiers underperform SPY significantly. Micro-cap surprisingly beats
the market by +12% despite the highest round-trip (RT) costs: per-trade alpha in small
stocks is large enough to survive friction, but missing price data and real illiquidity
are bigger concerns than the simulation can capture.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 18:37:08 +02:00
fdd03940b8 docs: make no-hosted-version section pranky
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 18:12:29 +02:00
d0e98b9cb7 feat: cap-tier filtering, Alpaca cost model, README cleanup
- simulate.py: --cap-tier large|mid|small|micro; yfinance market cap fetch
  with DB cache (ticker_meta table); argv fix for main.py dispatch
- plot.py: equity curves now show cap tiers with Alpaca costs (zero commission);
  HP sweep uses Alpaca cost decomposition; SPY line clamped to last strategy date
- db/models.py: TickerMeta table
- db/db.py: get_cached_market_caps, upsert_market_caps
- README: add --cap-tier to simulate docs; backfill note (~3 days for 2 years
  at SEC 10 req/s limit); remove duplicate setup block; remove em-dashes in prose;
  results table tilde estimates to be updated once cap-tier sims complete

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 18:10:09 +02:00
56ec0b4a81 docs: linkify insidercopytrading.com, fix punctuation, add compliment
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 18:03:09 +02:00
a0c6efc4ec docs: move no-hosted-version section above results, remove spoiler
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 18:02:15 +02:00
03de1bd9c3 docs: drop other caveats section
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 18:01:28 +02:00
04b44bfb0f docs: name insidercopytrading.com explicitly in scam analysis
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 18:01:16 +02:00
1cbf6fe91c docs: drop non-Alpaca commission row from results table
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 18:00:23 +02:00
399f69b817 feat: HP sweep heatmap + equity curve plots, scam analysis in README
- backtest/plot.py: generates two plots saved to plots/
  - hp_sweep.png: 7x7 heatmap of holding_days x round-trip cost, showing
    annualised excess vs SPY and raw annualised return per cell
  - equity_curves.png: portfolio equity vs SPY for 4 cost scenarios
- backtest/simulate.py: accept pre-loaded prices dict to avoid reloading
  on every sweep iteration; return equity_curve in result
- main.py: add `plot` command
- README: updated results section with Alpaca-specific cost breakdown
  (zero commission, costs are spread+slippage only); added honest analysis
  of why insidercopytrading.com-style services show outperformance that
  cannot be replicated in practice; note Alpaca integration not finished

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 17:59:18 +02:00
4d111e0a3a docs: add reality-check results showing strategy underperforms SPY after costs
Actual simulation results with 1.5% round-trip show -2.5% annualized (vs SPY +16%).
The per-trade signal exists but the margin (~0.68% alpha) is too thin to survive
realistic small-cap execution costs and a 1-day entry delay.

Also explains why insider-copytrade sites report outperformance: they use same-day
entry and omit spread/slippage from their simulations.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 17:50:03 +02:00
e340d59a69 docs: update README with results section, simulator usage, and caveats
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 17:49:29 +02:00
8f666130b9 feat(cli): add backfill and simulate commands; historical signal reprocessing
- backfill: bulk-ingest SEC EDGAR quarterly archives (--years / --year --quarter),
  then regenerate signals with as-of-date awareness
- simulate: delegate to backtest/simulate.py with full cost params
- _run_signals: deduplicates (ticker, date) pairs, slices dates to 10 chars to
  avoid strptime crash on timezone-suffixed transaction_date values
- Remove fetch-once command (superseded by backfill)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 17:49:23 +02:00
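The date handling that commit 8f666130b9 describes for `_run_signals` can be sketched as below. This is a minimal illustration, not the project's code: the helper names `parse_tx_date` and `dedupe_ticker_dates`, the sample ticker, and the `-05:00` suffix are assumptions chosen to show why slicing to 10 characters avoids the `strptime` crash.

```python
from datetime import date, datetime


def parse_tx_date(raw: str) -> date:
    """Parse the leading YYYY-MM-DD part of a transaction_date string."""
    # Keep only the first 10 characters so a trailing offset such as
    # "-05:00" is dropped; strptime would otherwise raise ValueError
    # ("unconverted data remains").
    return datetime.strptime(raw[:10], "%Y-%m-%d").date()


def dedupe_ticker_dates(rows):
    """Return unique (ticker, date) pairs, preserving first-seen order."""
    seen = set()
    unique = []
    for ticker, raw_date in rows:
        key = (ticker, parse_tx_date(raw_date))
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


# Hypothetical input rows: the second row is the same day as the first,
# but carries a timezone suffix.
rows = [
    ("ACME", "2024-03-15"),
    ("ACME", "2024-03-15-05:00"),
    ("ACME", "2024-03-18"),
]
pairs = dedupe_ticker_dates(rows)
```

Normalising to a `date` before deduplicating means the plain and the suffixed form of the same day collapse into one (ticker, date) pair.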
1467033aa2 feat(backtest): portfolio simulator with configurable strategy and transaction costs
Event-driven simulation: 1-day buy delay, N-day hold, position-size % of cash.
Models entry cost (spread + slippage + commission) and exit cost (spread + commission)
so round-trip is fully parameterised from the CLI.

Reports: annualized return, SPY benchmark, excess return, max drawdown, Sharpe,
per-trade win rate and avg net return.

CLI: python main.py simulate [--holding-days 7] [--spread 0.003] [--slippage 0.002] ...
Also runnable directly: python backtest/simulate.py --help

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 17:49:14 +02:00
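The cost model in commit 1467033aa2 (entry cost = spread + slippage + commission, exit cost = spread + commission) can be illustrated with a small worked example. The sketch assumes each cost is a fraction applied multiplicatively to the fill price; the function name and that multiplicative form are assumptions, and only the 0.003 spread and 0.002 slippage values come from the CLI example in the commit message.

```python
def net_trade_return(entry_price, exit_price,
                     spread=0.003, slippage=0.002, commission=0.0):
    """Net return of one long trade after entry and exit costs (fractions)."""
    entry_cost = spread + slippage + commission  # paid when buying
    exit_cost = spread + commission              # paid when selling
    paid = entry_price * (1 + entry_cost)        # effective buy price
    received = exit_price * (1 - exit_cost)      # effective sell price
    return received / paid - 1


# A trade that gains 5% before costs.
gross = 105.0 / 100.0 - 1
net = net_trade_return(100.0, 105.0)

# Approximate round-trip cost implied by the defaults (0.8%).
round_trip = (0.003 + 0.002 + 0.0) + (0.003 + 0.0)
```

Under these assumptions the round trip costs roughly 0.8% of the position, so a trade has to gain more than that before it contributes a positive net return.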
fb86443987 fix(backtest): squeeze yfinance Close series to avoid DataFrame iteration error
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 17:49:05 +02:00
727ad7cd6d feat(signals): as-of-date aware cluster detection, open-market-only filter
- cluster_detector: pass as_of_date through to DB query so historical signal
  reprocessing doesn't look into the future
- filter_engine: accept as_of_date; skip non-open-market tx_codes (keep only "P" or empty);
  reject placeholder tickers (NONE, N/A); propagate as_of_date to cluster detection

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 17:48:59 +02:00
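The as-of-date idea in commit 727ad7cd6d, clamping the cluster query so historical reprocessing cannot see later filings, might look like the sketch below. The function name `is_cluster`, the record layout, the 14-day window, and the two-insider threshold are all hypothetical; the commit message does not state the project's actual values.

```python
from datetime import date, timedelta


def is_cluster(buys, ticker, as_of_date, window_days=14, min_insiders=2):
    """True if enough distinct insiders bought `ticker` in the window
    ending at `as_of_date`. Buys dated after `as_of_date` are ignored,
    which is what prevents look-ahead during historical reprocessing."""
    window_start = as_of_date - timedelta(days=window_days)
    insiders = {
        b["insider"]
        for b in buys
        if b["ticker"] == ticker and window_start <= b["date"] <= as_of_date
    }
    return len(insiders) >= min_insiders


# Hypothetical buys: insider B only buys on 10 March.
buys = [
    {"ticker": "ACME", "insider": "A", "date": date(2024, 3, 1)},
    {"ticker": "ACME", "insider": "B", "date": date(2024, 3, 10)},
]

seen_on_mar_5 = is_cluster(buys, "ACME", date(2024, 3, 5))
seen_on_mar_12 = is_cluster(buys, "ACME", date(2024, 3, 12))
```

Evaluated as of 5 March, the second buy does not exist yet, so no cluster is reported; a query without the upper bound would have flagged one and leaked future information into the backtest.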
b5268f063e feat(ingestion): bulk historical ingest, form4 tx_code, parser fixes
- sec_bulk_ingest.py: new module — downloads quarterly form.idx from SEC EDGAR,
  filters Form 4/4A, fetches each filing's SGML/XML, parses and stores.
  Adaptive token-bucket rate limiter (backs off on 429/5xx, ramps on success).
  Uses filter_new_accessions for fast quarter-level dedup before any HTTP.
  Marks derivative-only filings as seen so they're skipped on resume.
- form4_parser: extract tx_code (transactionCode) from each transaction row;
  fix role extraction (Director/10%owner/Officer fallback); fix _text() to
  handle <value> sub-elements; fix footnote text extraction
- edgar_poller: filter feed entries to Form 4/4A only; skip XSLT stylesheet URLs
  when resolving XML filing links

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 17:48:51 +02:00
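The adaptive token-bucket limiter mentioned in commit b5268f063e (back off on 429/5xx, ramp on success) could be sketched roughly as follows. This is not the code in sec_bulk_ingest.py: the class name, the halving and +0.1 adjustments, and the default rates are assumptions, with the 10 req/s ceiling taken from the SEC limit quoted elsewhere in this log.

```python
import time


class AdaptiveTokenBucket:
    """Token bucket whose refill rate adapts to server responses."""

    def __init__(self, rate=5.0, min_rate=0.5, max_rate=10.0,
                 capacity=10.0, clock=time.monotonic):
        self.rate = rate            # tokens (requests) added per second
        self.min_rate = min_rate
        self.max_rate = max_rate
        self.capacity = capacity    # maximum burst size
        self.tokens = capacity
        self.clock = clock          # injectable so tests need not sleep
        self.last = clock()

    def _refill(self):
        now = self.clock()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now

    def wait_time(self):
        """Return 0.0 and consume a token if a request may go out now,
        otherwise return the seconds to wait before trying again."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return 0.0
        return (1.0 - self.tokens) / self.rate

    def record(self, status):
        """Adjust the rate from an HTTP status code."""
        if status == 429 or 500 <= status < 600:
            # Multiplicative backoff on throttling or server errors.
            self.rate = max(self.min_rate, self.rate / 2)
        elif 200 <= status < 300:
            # Small additive ramp-up on success.
            self.rate = min(self.max_rate, self.rate + 0.1)
```

A caller would sleep for whatever `wait_time()` returns before each request and pass the response status to `record()` afterwards. Halving on failure and creeping up on success is one common choice for this kind of limiter; the real module may tune it differently.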
0fa36a3390 feat(db): dedup-safe inserts, filter_new_accessions, mark_accession_seen, as-of-date queries
- insert_filing: catch IntegrityError on duplicate accession instead of crashing
- filter_new_accessions: bulk pre-filter entire quarter against DB in chunked IN queries
  (avoids 30min per-row accession_exists loop during resume)
- mark_accession_seen: store placeholder row for derivative-only/empty filings so they
  aren't re-fetched on every resume
- get_recent_buys_for_ticker: accept as_of_date to clamp queries for historical signal gen
- get_all_buys_for_reprocess: return all buy filings ordered by transaction_date for backfill

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 17:48:33 +02:00
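Two of the ideas in commit 0fa36a3390, the chunked `IN` pre-filter and the duplicate-safe insert, are sketched below. The project uses SQLAlchemy; this illustration deliberately uses the standard-library `sqlite3` module so it stays self-contained, and the `filings` table, its single `accession` column, the chunk size of 500, and the sample accession strings are assumptions.

```python
import sqlite3


def filter_new_accessions(conn, accessions, chunk_size=500):
    """Return the accessions not yet stored, using one IN query per chunk
    instead of one existence check per row."""
    existing = set()
    for i in range(0, len(accessions), chunk_size):
        chunk = accessions[i:i + chunk_size]
        placeholders = ",".join("?" * len(chunk))
        query = f"SELECT accession FROM filings WHERE accession IN ({placeholders})"
        existing.update(row[0] for row in conn.execute(query, chunk))
    return [a for a in accessions if a not in existing]


def insert_filing(conn, accession):
    """Insert a filing; return False instead of raising on a duplicate."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO filings (accession) VALUES (?)", (accession,))
        return True
    except sqlite3.IntegrityError:
        return False


# Demo against an in-memory database with made-up accession numbers.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE filings (accession TEXT PRIMARY KEY)")
first = insert_filing(conn, "0000000000-24-000001")
duplicate = insert_filing(conn, "0000000000-24-000001")
quarter = ["0000000000-24-000001", "0000000000-24-000002", "0000000000-24-000003"]
todo = filter_new_accessions(conn, quarter, chunk_size=2)
```

Chunking keeps each statement under the database's bound-parameter limit while still replacing thousands of single-row lookups with a handful of queries, which is the resume speed-up the commit message refers to.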
2e640b86d0 chore: gitignore data/, .claude/, WAL sidecar files; add cssselect dep
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 17:48:23 +02:00
08abb341f2 add joke to README
2026-05-04 20:02:54 +02:00
e383cd4845 add icon
2026-05-04 19:59:45 +02:00
cc4343d805 Merge pull request 'feat: Insider Copytrade POC + PLAN.md' (#2) from claude/issue-1-insider-copytrade-poc into master
Reviewed-on: #2
2026-05-04 19:38:21 +02:00
b119b9abae feat: SQLAlchemy ORM models, filing cache incremental fetch, yfinance price cache
- Replace db/schema.sql + raw sqlite3 with SQLAlchemy ORM (db/models.py)
  - Filing, Signal, PriceCache models with proper indexes
  - db/db.py uses SQLAlchemy sessions throughout; no raw SQL strings
- Add PriceCache table: stores daily close prices per ticker
  - backtest._fetch_prices checks DB first; skips yfinance for completed ranges
  - New data persisted via upsert_prices()
  - get_cached_prices() / upsert_prices() added to db.py
- EDGAR poller incremental fetch: get_latest_filed_date() returns newest
  filed_date in DB; fetch_and_store_new_filings skips entries older than
  that cutoff before even checking accession_exists
- Add get_signals_for_backtest() to db.py; backtest no longer opens its
  own sqlite3 connection
- requirements.txt: add sqlalchemy>=2.0.0

Co-authored-by: dodox <dodox@users.noreply.local>
2026-05-04 17:21:23 +00:00
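The incremental fetch described in commit b119b9abae, which skips feed entries older than the newest stored `filed_date` before any per-accession check, can be sketched as follows. The function name `select_new_entries`, the dict layout of a feed entry, and the sample accession strings are assumptions; dates are ISO `YYYY-MM-DD` strings, which compare correctly as plain strings.

```python
def select_new_entries(entries, latest_filed_date, known_accessions):
    """Return feed entries that still need to be fetched and stored."""
    new = []
    for entry in entries:
        # Cheap cutoff first: anything strictly older than the newest
        # stored filed_date is assumed to be ingested already.
        if latest_filed_date is not None and entry["filed_date"] < latest_filed_date:
            continue
        # Entries on or after the cutoff still need the accession check,
        # since several filings can share the cutoff date.
        if entry["accession"] in known_accessions:
            continue
        new.append(entry)
    return new


# Hypothetical feed and database state.
feed = [
    {"accession": "0000000000-24-000001", "filed_date": "2024-03-14"},
    {"accession": "0000000000-24-000002", "filed_date": "2024-03-15"},
    {"accession": "0000000000-24-000003", "filed_date": "2024-03-15"},
    {"accession": "0000000000-24-000004", "filed_date": "2024-03-16"},
]
known = {"0000000000-24-000001", "0000000000-24-000002"}
to_fetch = select_new_entries(feed, "2024-03-15", known)
```

With an empty database `latest_filed_date` would be `None`, and every entry falls through to the accession check.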
2e2be3e9c7 fix: address sanity-check issues + rebrand to Smaug
Co-authored-by: dodox <dodox@users.noreply.local>
2026-05-04 16:32:00 +00:00
8c0085e503 docs: add README
Co-authored-by: dodox <dodox@users.noreply.local>
2026-05-04 16:24:25 +00:00
7e9221a914 feat: add PLAN.md and insider copytrade POC implementation
- PLAN.md: full implementation plan from issue
- config.py: configurable thresholds, API keys via .env
- ingestion/: EDGAR RSS poller + Form 4 XML parser
- db/: SQLite schema + interface (WAL mode)
- signals/: filter engine (buy/10b5-1/value/role) + cluster detector
- alerts/: Slack webhook alert with score gating
- broker/: Alpaca paper/live trade execution
- backtest/: historical signal backtesting with yfinance
- main.py: CLI entrypoint (run | fetch-once | backtest)
2026-05-04 16:15:22 +00:00
7ddf89ebfb Initial commit
2026-05-04 18:07:44 +02:00