2 Commits

Author SHA1 Message Date
b5268f063e feat(ingestion): bulk historical ingest, form4 tx_code, parser fixes
- sec_bulk_ingest.py: new module that downloads the quarterly form.idx from SEC EDGAR,
  filters Form 4/4A, fetches each filing's SGML/XML, then parses and stores it.
  Adaptive token-bucket rate limiter (backs off on 429/5xx, ramps on success).
  Uses filter_new_accessions for fast quarter-level dedup before any HTTP.
  Marks derivative-only filings as seen so they're skipped on resume.
- form4_parser: extract tx_code (transactionCode) from each transaction row;
  fix role extraction (Director / 10% owner / Officer fallback); fix _text() to
  handle <value> sub-elements; fix footnote text extraction
- edgar_poller: filter feed entries to Form 4/4A only; skip XSLT stylesheet URLs
  when resolving XML filing links

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 17:48:51 +02:00
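
For readers unfamiliar with the rate limiter mentioned in this commit, here is a rough sketch of an adaptive token bucket. The class name, the halving on 429/5xx, the +0.5 ramp on success, and the default rates are all assumptions chosen for illustration; the commit message does not state the actual parameters used in sec_bulk_ingest.py.

```python
import time


class AdaptiveTokenBucket:
    """Token bucket whose refill rate shrinks on errors and grows on success.

    Hypothetical sketch; names and constants are not taken from the repository.
    """

    def __init__(self, rate=5.0, min_rate=0.5, max_rate=10.0, capacity=5.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.rate = rate            # tokens (requests) added per second
        self.min_rate = min_rate    # floor after repeated back-offs
        self.max_rate = max_rate    # ceiling after repeated successes
        self.capacity = capacity    # maximum burst size
        self.tokens = capacity      # start with a full bucket
        self._clock = clock
        self._sleep = sleep
        self._last = clock()

    def acquire(self) -> None:
        """Block until one token is available, then consume it."""
        while True:
            now = self._clock()
            elapsed = now - self._last
            self._last = now
            self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
            if self.tokens >= 1.0:
                self.tokens -= 1.0
                return
            # Sleep just long enough for the missing fraction of a token.
            self._sleep((1.0 - self.tokens) / self.rate)

    def record(self, status: int) -> None:
        """Adjust the refill rate based on the HTTP status of the last call."""
        if status == 429 or 500 <= status < 600:
            self.rate = max(self.min_rate, self.rate * 0.5)   # back off
        elif 200 <= status < 300:
            self.rate = min(self.max_rate, self.rate + 0.5)   # ramp up


bucket = AdaptiveTokenBucket()
bucket.acquire()       # bucket starts full, so this returns immediately
bucket.record(429)     # server pushed back: halve the rate
bucket.record(200)     # success: ramp up a little
```

A caller would wrap each HTTP request with acquire() before and record(status) after. Multiplicative decrease with additive increase is one common choice here; other back-off curves would also fit the description in the commit message.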
7e9221a914 feat: add PLAN.md and insider copytrade POC implementation
- PLAN.md: full implementation plan from issue
- config.py: configurable thresholds, API keys via .env
- ingestion/: EDGAR RSS poller + Form 4 XML parser
- db/: SQLite schema + interface (WAL mode)
- signals/: filter engine (buy/10b5-1/value/role) + cluster detector
- alerts/: Slack webhook alert with score gating
- broker/: Alpaca paper/live trade execution
- backtest/: historical signal backtesting with yfinance
- main.py: CLI entrypoint (run | fetch-once | backtest)
2026-05-04 16:15:22 +00:00
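
As a small illustration of the "SQLite schema + interface (WAL mode)" item, the sketch below opens a database with write-ahead logging enabled using the standard sqlite3 module. The open_db name, the filings table, and its columns are hypothetical; the real schema lives in db/ and is not shown in this log.

```python
import os
import sqlite3
import tempfile


def open_db(path: str) -> sqlite3.Connection:
    """Open (or create) a SQLite database and switch it to WAL journaling."""
    conn = sqlite3.connect(path)
    # The PRAGMA returns the journal mode actually in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    if mode.lower() != "wal":
        conn.close()
        raise RuntimeError(f"could not enable WAL, journal_mode={mode}")
    # Hypothetical table, only here so the example has something to write to.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS filings ("
        "accession TEXT PRIMARY KEY, tx_code TEXT)"
    )
    conn.commit()
    return conn


with tempfile.TemporaryDirectory() as tmp:
    conn = open_db(os.path.join(tmp, "demo.db"))
    # INSERT OR IGNORE makes re-ingesting the same accession a no-op.
    for _ in range(2):
        conn.execute(
            "INSERT OR IGNORE INTO filings VALUES (?, ?)",
            ("0000000000-00-000000", "P"),
        )
    conn.commit()
    row_count = conn.execute("SELECT COUNT(*) FROM filings").fetchone()[0]
    conn.close()
```

WAL mode needs a file-backed database; an in-memory database reports its journal mode as "memory", which is why the sketch uses a temporary file.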