- NuconGoalEnv now accepts objectives/objective_weights; the weighted
objectives are added on top of the goal reward, same interface as NuconEnv
- README: use UncertaintyPenalty/UncertaintyAbort correctly (via objectives
and terminators, not as constructor params that don't exist)
- Step 3 prose updated to reference composable callables
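
A minimal sketch of what "additive on top of the goal reward" could look like. The callable signature (observation dict in, float out), the default weight of 1.0, and the helper names `shaped_reward` and `uncertainty_penalty` are assumptions for illustration, not the actual NuconGoalEnv code:

```python
from typing import Callable, Mapping, Optional, Sequence

# Assumed signature: an objective maps an observation dict to a scalar.
Objective = Callable[[Mapping[str, float]], float]


def shaped_reward(
    goal_reward: float,
    obs: Mapping[str, float],
    objectives: Sequence[Objective],
    objective_weights: Optional[Sequence[float]] = None,
) -> float:
    """Add weighted objective terms on top of the goal reward."""
    if objective_weights is None:
        # Assumption: unweighted objectives count with weight 1.0.
        objective_weights = [1.0] * len(objectives)
    if len(objective_weights) != len(objectives):
        raise ValueError("objectives and objective_weights differ in length")
    extra = sum(w * obj(obs) for w, obj in zip(objective_weights, objectives))
    return goal_reward + extra


def uncertainty_penalty(obs: Mapping[str, float]) -> float:
    """Hypothetical stand-in for UncertaintyPenalty: negative uncertainty."""
    return -obs.get("SIM_UNCERTAINTY", 0.0)


reward = shaped_reward(-1.0, {"SIM_UNCERTAINTY": 0.25}, [uncertainty_penalty], [2.0])
```

With an empty `objectives` list the function returns the goal reward unchanged, which is the behaviour one would expect from an additive design.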
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add NuconGoalEnv for goal-conditioned HER training (SAC + HER)
- Add UncertaintyPenalty and UncertaintyAbort composable callables;
SIM_UNCERTAINTY injected into obs dict when simulator is active
- Fix rl.py: str-typed params crash, missing Enum import, write-only
params in action space, broken step() iteration order
- Remove uncertainty state from sim (update() now returns it instead)
- Rename kNN -> kNN-GP throughout README; add model selection note
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The prior sections already have full code examples; the training loop
section now just describes each step concisely and links back to them.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Documents the iterative sim-to-real workflow:
1. Human data collection during gameplay
2. Initial model fitting (kNN or NN)
3. RL training in simulator (SAC + HER)
4. Eval in game while collecting new data
5. Refit model, repeat
Includes ASCII flow diagram, code for each step, and a convergence
criterion (low kNN uncertainty throughout episode).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Model type is irrelevant during data collection. Models are now created
lazily on first use: train_model() creates a ReactorDynamicsModel,
fit_knn(k) creates a ReactorKNNModel. load_model() detects type by
file extension as before. drop_well_fitted() now checks that a model exists.
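
The lazy-creation pattern described above, as a rough sketch. `LazyModelOwner` and the two stand-in classes are hypothetical placeholders for the real collector, ReactorDynamicsModel and ReactorKNNModel; only the method names train_model(), fit_knn(k) and drop_well_fitted() come from this change:

```python
class DynamicsModelStandIn:
    """Hypothetical placeholder for the neural-network model."""


class KNNModelStandIn:
    """Hypothetical placeholder for the kNN model."""

    def __init__(self, k):
        self.k = k


class LazyModelOwner:
    def __init__(self):
        # Nothing is built up front: data collection needs no model.
        self.model = None

    def train_model(self):
        # The NN model is created on first use only.
        if self.model is None:
            self.model = DynamicsModelStandIn()
        return self.model

    def fit_knn(self, k):
        # Fitting a kNN model creates it with the requested k.
        self.model = KNNModelStandIn(k)
        return self.model

    def drop_well_fitted(self):
        # Guard: filtering samples requires an existing model.
        if self.model is None:
            raise RuntimeError("no model yet: call train_model() or fit_knn(k) first")
        # The actual filtering logic is not shown in this sketch.


owner = LazyModelOwner()
owner.fit_knn(5)
```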
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Describe both NuconEnv and NuconGoalEnv with their obs/action spaces
- Explain goal-conditioned approach and why HER is appropriate
- Add SAC + HerReplayBuffer usage example with recommended hyperparams
- Show how to inject a custom goal at inference time
- List registered goal env presets
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
rl.py:
- Add missing `from enum import Enum`
- Skip str-typed params in obs/action space construction (was crashing)
- Guard action space: exclude write-only (is_readable=False) and cheat params
- Fix step() param lookup (no longer iterates over Nucon; uses the _parameters dict directly)
- Correct sim-speed time dilation in real-game sleep
- Extract _build_param_space() helper shared by NuconEnv and NuconGoalEnv
- Add NuconGoalEnv: goal-conditioned env with normalised achieved/desired goal
vectors, compatible with SB3 HerReplayBuffer; goals sampled per episode
- Register Nucon-goal_power-v0 and Nucon-goal_temp-v0 presets
- Enum params in obs/action space are now encoded as a scalar index (not one-hot)
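
A small sketch of scalar-index encoding for enum-typed params. `PumpState` and both helpers are hypothetical, and using the member's position in definition order (rather than its raw value) is an assumption about how rl.py does it:

```python
from enum import Enum


class PumpState(Enum):
    """Hypothetical enum param; raw values are deliberately non-contiguous."""
    OFF = 0
    STARTING = 10
    RUNNING = 20


def enum_to_index(value):
    # Position of the member in definition order, usable as a scalar obs.
    return list(type(value)).index(value)


def index_to_enum(enum_cls, index):
    # Round first so that float actions from a Box space map to a member.
    return list(enum_cls)[int(round(index))]


idx = enum_to_index(PumpState.RUNNING)
state = index_to_enum(PumpState, 1.0)
```

Compared with one-hot encoding this keeps one dimension per param, at the cost of implying an ordering between members.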
sim.py:
- Store self.port and self.host on NuconSimulator
- Add set_model() to accept a pre-loaded model directly
- load_model() detects type by extension (.pkl → kNN, anything else → torch NN)
and reads new checkpoint format with embedded input/output param lists
- _update_reactor_state() uses model.input_params (not all readable params),
calls .forward() directly for both NN and kNN, guards torch.no_grad per type
- Import ReactorKNNModel and pickle
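
The extension rule used by load_model() can be sketched as follows. `detect_model_kind` is a hypothetical helper, and the case-sensitive suffix match is an assumption:

```python
from pathlib import Path


def detect_model_kind(path):
    """Return "knn" for .pkl files and "nn" for everything else."""
    return "knn" if Path(path).suffix == ".pkl" else "nn"


kinds = [detect_model_kind(p) for p in ("reactor_knn.pkl", "reactor_nn.pth")]
```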
model.py:
- save_model() embeds input_params/output_params in NN checkpoint dict
- load_model() handles new checkpoint format (state_dict key) with fallback
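
A sketch of the new-format check with fallback, using plain dicts in place of a real torch checkpoint. `unpack_checkpoint` is a hypothetical helper; the key names state_dict, input_params and output_params are taken from the notes above, and returning None for missing param lists is an assumption:

```python
def unpack_checkpoint(checkpoint):
    """Return (state_dict, input_params, output_params) for either format."""
    if isinstance(checkpoint, dict) and "state_dict" in checkpoint:
        # New format: weights plus embedded param lists.
        return (
            checkpoint["state_dict"],
            checkpoint.get("input_params"),
            checkpoint.get("output_params"),
        )
    # Legacy format: the object is the bare state dict, no param lists.
    return checkpoint, None, None


new_style = {
    "state_dict": {"layer.weight": [0.1, 0.2]},
    "input_params": ["CORE_TEMP"],    # illustrative param names
    "output_params": ["CORE_PRESSURE"],
}
legacy = {"layer.weight": [0.1, 0.2]}

new_result = unpack_checkpoint(new_style)
legacy_result = unpack_checkpoint(legacy)
```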
README.md:
- Update note: RODS_POS_ORDERED is no longer the only writable param;
game v2.2.25.213 exposes rod banks, pumps, MSCVs, switches and more
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>