Commit Graph

2 Commits

Author SHA1 Message Date
55d6e8708e fix: kNN zero-variance dims get inf std; hot-start SAC from saved model
- nucon/model.py: constant input dimensions (zero variance in training
  data) now get std=inf, so they contribute 0 to the normalised kNN
  distance instead of being divided by a tiny float epsilon, which made
  queries look catastrophically out-of-distribution (OOD)
- scripts/train_sac.py: add --load, --steps, --out CLI args; --load
  hot-starts actor/critic weights from a previous run (learning_starts=0)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 12:44:26 +01:00
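
The zero-variance fix described in the commit above can be illustrated with a minimal sketch. It is not the code in nucon/model.py: the function names `fit_normaliser` and `normalised_sq_dist` and the `eps` threshold are hypothetical, chosen only to show why std=inf makes a constant dimension drop out of the normalised distance.

```python
import numpy as np

def fit_normaliser(X, eps=1e-8):
    """Per-dimension mean and std; constant dimensions get std = inf.

    `eps` is an assumed threshold for deciding that a dimension is constant.
    """
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    # Dividing by inf maps any finite offset to 0, so a constant dimension
    # contributes nothing to the distance instead of exploding it.
    std = np.where(std < eps, np.inf, std)
    return mean, std

def normalised_sq_dist(X, q, mean, std):
    """Squared L2 distance from query q to every row of X, after normalising."""
    diff = (X - mean) / std - (q - mean) / std
    return np.einsum("ij,ij->i", diff, diff)

# Second column is constant in the training data.
X = np.array([[0.0, 5.0], [1.0, 5.0], [2.0, 5.0]])
mean, std = fit_normaliser(X)

# The query is slightly off in the constant dimension only.
q = np.array([1.0, 5.001])
d = normalised_sq_dist(X, q, mean, std)
nearest = int(np.argmin(d))
```

Had the constant column been given a tiny epsilon as its std, the 0.001 offset in the query would have dominated every distance, which is the failure mode the commit message describes.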
0932bb353a feat: SAC+HER training on kNN-GP sim with direct bypass and scripts/
- nucon/rl.py: add delta_action_scale action space; threshold boolean
  values at >= 0.5; read and write the sim directly, bypassing HTTP, for
  ~2000fps env throughput; remove uncertainty_abort from training
  (penalty only); use larger default batch sizes; fix _read_obs and step
  for in-process sim
- nucon/model.py: optimise _lookup with einsum squared-L2, vectorised
  rbf kernel; forward_with_uncertainty uses pre-built normalised arrays
- nucon/sim.py: _update_reactor_state writes outputs via setattr directly
- scripts/train_sac.py: moved from root; full SAC+HER example with kNN-GP
  sim, delta actions, uncertainty penalty, init_states
- scripts/collect_dataset.py: CLI tool to collect dynamics dataset from
  live game session (--steps, --delta, --out, --merge)
- README.md: add Scripts section, reference both scripts in training loop

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 20:43:37 +01:00
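
The "einsum squared-L2" and "vectorised rbf kernel" items in the commit above can be sketched as follows. This is an illustration under assumptions, not the contents of nucon/model.py: `sq_l2`, `knn_lookup`, `rbf_kernel` and the `length_scale` parameter are hypothetical names, and the expansion ||x - q||^2 = ||x||^2 - 2 x.q + ||q||^2 is one common way to avoid materialising the difference array.

```python
import numpy as np

def sq_l2(X, q):
    """Squared L2 distance from q to each row of X via the expanded form."""
    x_sq = np.einsum("ij,ij->i", X, X)   # ||x||^2 per row
    q_sq = np.einsum("j,j->", q, q)      # ||q||^2
    d = x_sq - 2.0 * (X @ q) + q_sq
    # Clamp small negatives caused by floating-point cancellation.
    return np.maximum(d, 0.0)

def knn_lookup(X, q, k):
    """Indices and squared distances of the k nearest rows (requires k <= len(X))."""
    d = sq_l2(X, q)
    idx = np.argpartition(d, k - 1)[:k]  # unordered k smallest
    idx = idx[np.argsort(d[idx])]        # sort those k by distance
    return idx, d[idx]

def rbf_kernel(A, B, length_scale=1.0):
    """RBF kernel matrix between all rows of A and all rows of B, without loops."""
    a_sq = np.einsum("ij,ij->i", A, A)[:, None]
    b_sq = np.einsum("ij,ij->i", B, B)[None, :]
    d = np.maximum(a_sq - 2.0 * (A @ B.T) + b_sq, 0.0)
    return np.exp(-0.5 * d / length_scale**2)

X = np.array([[0.0, 0.0], [3.0, 4.0], [1.0, 0.0]])
idx, dist = knn_lookup(X, np.array([0.0, 0.0]), k=2)
K = rbf_kernel(X, X)
```

In a kNN-GP setting the kernel matrix would typically be built only over the k neighbours returned by the lookup, which keeps the linear solve small.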