Commit Graph

82 Commits

Author SHA1 Message Date
f93d4bb119 chore: replace logo with minimal SVG reactor cross-section
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 20:46:33 +01:00
0932bb353a feat: SAC+HER training on kNN-GP sim with direct bypass and scripts/
- nucon/rl.py: delta_action_scale action space, bool handling (>=0.5),
  direct sim read/write bypassing HTTP for ~2000fps env throughput;
  remove uncertainty_abort from training (use penalty-only), larger
  default batch sizes; fix _read_obs and step for in-process sim
- nucon/model.py: optimise _lookup with einsum squared-L2, vectorised
  rbf kernel; forward_with_uncertainty uses pre-built normalised arrays
- nucon/sim.py: _update_reactor_state writes outputs via setattr directly
- scripts/train_sac.py: moved from root; full SAC+HER example with kNN-GP
  sim, delta actions, uncertainty penalty, init_states
- scripts/collect_dataset.py: CLI tool to collect dynamics dataset from
  live game session (--steps, --delta, --out, --merge)
- README.md: add Scripts section, reference both scripts in training loop
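
The einsum squared-L2 lookup mentioned for nucon/model.py could look roughly like the sketch below. This is an illustration only, not the repository's `_lookup`; the function name `knn_lookup` and the array shapes are assumptions.

```python
import numpy as np

def knn_lookup(query, states, k):
    """Return indices and squared L2 distances of the k rows of `states` nearest to `query`.

    Hypothetical helper: `states` has shape (n, d), `query` has shape (d,).
    """
    diff = states - query                        # (n, d), query broadcast over rows
    sq_dist = np.einsum('ij,ij->i', diff, diff)  # row-wise sum of squares, no sqrt needed
    idx = np.argsort(sq_dist)[:k]                # indices of the k smallest distances
    return idx, sq_dist[idx]

# Tiny example dataset (made up for illustration).
states = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]])
idx, d2 = knn_lookup(np.array([0.0, 0.0]), states, k=2)
```

Ranking by squared distance gives the same neighbours as ranking by distance, so the square root can be skipped entirely.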

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 20:43:37 +01:00
3dfe1aa673 fix: flat Box action space, SB3/HER compatibility, sim uninitialized param defaults
rl.py:
- Action space is now a flat Box (SAC/PPO require this, not Dict)
- _build_flat_action_space + _unflatten_action helpers shared by both envs
- Params with undefined bounds excluded from action space (SAC needs finite bounds)
- Fix _build_param_space: use `is not None` check instead of falsy `or` (0 is valid min_val)
- NuconGoalEnv obs params default to simulator.model.input_params when sim provided;
  obs_params kwarg overrides for real-game deployment with same param set
- SIM_UNCERTAINTY kept out of policy obs vector (not available at deployment);
  available in reward_obs passed to objectives/terminators/reward_fn
- _read_obs returns (gym_obs, reward_obs) cleanly instead of smuggling via dict
- NuconGoalEnv additional_objectives wired into step()
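
The `is not None` fix noted above addresses a common Python pitfall: `or` treats 0 as missing. A minimal sketch, with hypothetical helper names and a made-up fallback value:

```python
def bound_with_or(min_val, fallback):
    # Buggy pattern: 0 is falsy, so a legitimate bound of 0 is replaced by the fallback.
    return min_val or fallback

def bound_with_none_check(min_val, fallback):
    # Fixed pattern: only a missing bound (None) triggers the fallback.
    return min_val if min_val is not None else fallback

lo_buggy = bound_with_or(0, -1e6)          # the valid lower bound 0 is lost
lo_fixed = bound_with_none_check(0, -1e6)  # the valid lower bound 0 is kept
```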

sim.py:
- Uninitialized params return type-default (0/False/first-enum) instead of "None"
- Enum params serialised as integer value, not repr string
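
One way to implement the type-default and enum-serialisation behaviour described above is sketched here. `PumpStatus`, `type_default` and `serialise` are hypothetical names for illustration and are not taken from sim.py.

```python
from enum import Enum

class PumpStatus(Enum):  # hypothetical enum, for illustration only
    OFF = 0
    ON = 1

def type_default(param_type):
    """Type-appropriate default for a parameter that has not been initialised."""
    if issubclass(param_type, Enum):
        return next(iter(param_type))  # first member in definition order
    if param_type is bool:
        return False
    return param_type()                # int() -> 0, float() -> 0.0

def serialise(value):
    """Send enum members as their integer value rather than their repr string."""
    return value.value if isinstance(value, Enum) else value
```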

README.md:
- Fix HerReplayBuffer import path (sb3 2.x: her.her_replay_buffer)
- Remove non-existent simulator.run() call
- Fix broken anchor links, remove "work in progress" from intro

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 19:16:07 +01:00
845ca708a7 remove UncertaintyPenalty/Abort aliases; use Parameterized_Objectives/Terminators dicts
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 18:58:22 +01:00
2c1bbc1a31 refactor: move UncertaintyPenalty/Abort into Parameterized_Objectives/Terminators dicts
- uncertainty_penalty -> Parameterized_Objectives['uncertainty_penalty']
- uncertainty_abort   -> Parameterized_Terminators['uncertainty_abort']
- Add Parameterized_Terminators dict (same pattern as Parameterized_Objectives)
- Keep UncertaintyPenalty / UncertaintyAbort as convenience aliases

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 18:57:44 +01:00
041e0ec1bd rename: objectives -> additional_objectives in NuconGoalEnv
Clarifies that the goal reward is the primary built-in objective;
additional_objectives are additive on top of it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 18:56:34 +01:00
36a33e74e5 fix: add objectives support to NuconGoalEnv; fix README uncertainty example
- NuconGoalEnv now accepts objectives/objective_weights; additive on top
  of the goal reward, same interface as NuconEnv
- README: use UncertaintyPenalty/UncertaintyAbort correctly (via objectives
  and terminators, not as constructor params that don't exist)
- Step 3 prose updated to reference composable callables

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 18:55:16 +01:00
f4d45d3cfd feat: NuconGoalEnv, composable uncertainty helpers, kNN-GP naming
- Add NuconGoalEnv for goal-conditioned HER training (SAC + HER)
- Add UncertaintyPenalty and UncertaintyAbort composable callables;
  SIM_UNCERTAINTY injected into obs dict when simulator is active
- Fix rl.py: str-typed params crash, missing Enum import, write-only
  params in action space, broken step() iteration order
- Remove uncertainty state from sim (return value from update() instead)
- Rename kNN -> kNN-GP throughout README; add model selection note

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 18:51:13 +01:00
1b93699501 docs: mention uncertainty penalty/abort in training loop section
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 18:37:25 +01:00
65190dffea feat: uncertainty-aware training with penalty and abort
sim.py:
- simulator.update(return_uncertainty=True) calls forward_with_uncertainty
  on kNN models and returns the GP std; returns None for NN or when not
  requested (no extra cost if unused)
- No state stored on simulator; caller decides what to do with the value

rl.py (NuconEnv and NuconGoalEnv):
- uncertainty_penalty_start: above this GP std, subtract a linear penalty
  from the reward (scaled by uncertainty_penalty_scale, default 1.0)
- uncertainty_abort: at or above this GP std, set truncated=True
- Only calls update(return_uncertainty=True) when either threshold is set
- Uncertainty only applies when using a simulator (kNN model); ignored otherwise

Example:
    simulator = NuconSimulator()
    simulator.load_model('reactor_knn.pkl')
    env = NuconGoalEnv(..., simulator=simulator,
                       uncertainty_penalty_start=0.3,
                       uncertainty_abort=0.7,
                       uncertainty_penalty_scale=2.0)
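
The penalty and abort thresholds described in this commit can be sketched as a standalone function. The exact penalty formula is an assumption (scale times the excess over the start threshold), and `apply_uncertainty` is a hypothetical name, not part of rl.py.

```python
def apply_uncertainty(reward, gp_std, penalty_start=None, abort=None, scale=1.0):
    """Return (shaped_reward, truncated) for one environment step."""
    if gp_std is None:  # NN model, or uncertainty was not requested
        return reward, False
    if penalty_start is not None and gp_std > penalty_start:
        reward -= scale * (gp_std - penalty_start)  # assumed: linear in the excess std
    truncated = abort is not None and gp_std >= abort  # "at or above" the abort threshold
    return reward, truncated

shaped, truncated = apply_uncertainty(1.0, 0.5, penalty_start=0.3, abort=0.7, scale=2.0)
```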

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 18:37:09 +01:00
6cb93ad56d feat: abort trajectory on high kNN uncertainty in simulator
NuconSimulator now accepts uncertainty_threshold (default None = disabled).
When set and using a kNN model, _update_reactor_state() calls
forward_with_uncertainty() and raises HighUncertaintyError if the GP
posterior std exceeds the threshold.

NuconEnv and NuconGoalEnv catch HighUncertaintyError in step() and
return truncated=True, so SB3 bootstraps the value rather than treating
OOD regions as terminal states.

Usage:
    simulator = NuconSimulator(uncertainty_threshold=0.3)
    # episodes are cut short when the policy wanders OOD
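
Why truncation rather than termination matters can be seen in a one-step TD target. The sketch below shows the general idea only and is not SB3's code; `td_target` is a hypothetical name.

```python
def td_target(reward, next_value, terminated, gamma=0.99):
    """One-step TD target: only true termination zeroes the bootstrap term."""
    bootstrap = 0.0 if terminated else 1.0
    return reward + gamma * next_value * bootstrap

# An episode cut short for high uncertainty is truncated, not terminated,
# so the value of the next state still contributes to the target.
cut_short = td_target(1.0, 10.0, terminated=False)
true_end = td_target(1.0, 10.0, terminated=True)
```

If out-of-distribution cut-offs were marked as terminal, the critic would learn that those states have zero future value, which is a statement about the model's coverage rather than about the reactor.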

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 18:29:54 +01:00
e2e8db1f04 docs: remove WIP labels and clean up stale transitional prose
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 18:22:25 +01:00
7ee8272034 docs: replace step-by-step code blocks in training loop with prose
The prior sections already have full code examples; the training loop
section now just describes each step concisely and links back to them.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 18:20:10 +01:00
f0cc7ba9c4 docs: replace em-dashes in body text with natural punctuation
Keep em-dashes in step headings, replace in prose with ;/:/./,

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 18:19:04 +01:00
3eb0cc7b60 README ascii art fixes 2026-03-12 18:15:01 +01:00
a4f898c3ad docs: add full training loop section to README
Documents the iterative sim-to-real workflow:
1. Human data collection during gameplay
2. Initial model fitting (kNN or NN)
3. RL training in simulator (SAC + HER)
4. Eval in game while collecting new data
5. Refit model, repeat

Includes ASCII flow diagram, code for each step, and a convergence
criterion (low kNN uncertainty throughout episode).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 18:13:12 +01:00
c3111ad5be fix: retry game connection in __init__ as well as collect_data
The None-param filtering probe at init also needs to wait for the game
to be reachable, not just the collection loop.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 17:53:35 +01:00
088b7d4733 fix: make collect_data resilient to game crashes
- Save dataset every N steps (default 10) so a disconnect loses at most
  one checkpoint's worth of samples instead of everything
- Retry _get_state() on ConnectionError/Timeout rather than crashing,
  resuming automatically once the game comes back up
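
The two resilience measures above, periodic saving and retrying on connection loss, can be sketched as follows. This is not the code from `collect_data`: the function names are hypothetical, and Python's built-in `ConnectionError` and `TimeoutError` stand in for whatever exceptions the HTTP client actually raises.

```python
import time

def get_state_with_retry(get_state, retry_delay=1.0, sleep=time.sleep):
    """Call get_state() until it succeeds, waiting between failed attempts."""
    while True:
        try:
            return get_state()
        except (ConnectionError, TimeoutError):
            sleep(retry_delay)  # game unreachable; wait, then try again

def collect(get_state, save, steps, save_every=10, sleep=time.sleep):
    """Collect `steps` samples, checkpointing every `save_every` samples."""
    data = []
    for step in range(1, steps + 1):
        data.append(get_state_with_retry(get_state, sleep=sleep))
        if step % save_every == 0:
            save(list(data))    # a crash after this loses at most save_every samples
    if steps % save_every != 0:
        save(list(data))        # final partial checkpoint
    return data
```

Passing `sleep` as a parameter keeps the retry loop testable without real waiting.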

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 17:52:17 +01:00
ce2019e060 refactor: remove model_type from NuconModelLearner.__init__
Model type is irrelevant during data collection. Models are now created
lazily on first use: train_model() creates a ReactorDynamicsModel,
fit_knn(k) creates a ReactorKNNModel. load_model() detects type by
file extension as before. drop_well_fitted() now checks model exists.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 17:50:31 +01:00
1f7ecc301f docs: document NuconGoalEnv and HER training in README
- Describe both NuconEnv and NuconGoalEnv with their obs/action spaces
- Explain goal-conditioned approach and why HER is appropriate
- Add SAC + HerReplayBuffer usage example with recommended hyperparams
- Show how to inject a custom goal at inference time
- List registered goal env presets

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 17:38:20 +01:00
0dab7a6cec Fix rl/sim blockers and add NuconGoalEnv for HER training
rl.py:
- Add missing `from enum import Enum`
- Skip str-typed params in obs/action space construction (was crashing)
- Guard action space: exclude write-only (is_readable=False) and cheat params
- Fix step() param lookup (no longer iterates Nucon, uses _parameters dict directly)
- Correct sim-speed time dilation in real-game sleep
- Extract _build_param_space() helper shared by NuconEnv and NuconGoalEnv
- Add NuconGoalEnv: goal-conditioned env with normalised achieved/desired goal
  vectors, compatible with SB3 HerReplayBuffer; goals sampled per episode
- Register Nucon-goal_power-v0 and Nucon-goal_temp-v0 presets
- Enum obs/action space now scalar index (not one-hot)
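
Normalised goal vectors of the kind NuconGoalEnv uses can be illustrated with a short sketch. The bounds, the helper names, and the negative-distance reward are assumptions for illustration; the commit does not specify the reward form.

```python
import numpy as np

def normalise(values, lows, highs):
    """Map raw parameter values into [0, 1] using per-parameter bounds."""
    values, lows, highs = (np.asarray(a, dtype=float) for a in (values, lows, highs))
    return (values - lows) / (highs - lows)

def goal_reward(achieved_goal, desired_goal):
    """Assumed reward: negative Euclidean distance between normalised goals."""
    return -float(np.linalg.norm(np.asarray(achieved_goal) - np.asarray(desired_goal)))

achieved = normalise([50.0], lows=[0.0], highs=[100.0])
desired = normalise([75.0], lows=[0.0], highs=[100.0])
reward = goal_reward(achieved, desired)
```

Normalising both vectors puts parameters with very different physical ranges on a comparable scale before distances are computed.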

sim.py:
- Store self.port and self.host on NuconSimulator
- Add set_model() to accept a pre-loaded model directly
- load_model() detects type by extension (.pkl → kNN, else → NN torch)
  and reads new checkpoint format with embedded input/output param lists
- _update_reactor_state() uses model.input_params (not all readable params),
  calls .forward() directly for both NN and kNN, guards torch.no_grad per type
- Import ReactorKNNModel and pickle

model.py:
- save_model() embeds input_params/output_params in NN checkpoint dict
- load_model() handles new checkpoint format (state_dict key) with fallback

README.md:
- Update note: RODS_POS_ORDERED is no longer the only writable param;
  game v2.2.25.213 exposes rod banks, pumps, MSCVs, switches and more

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 17:37:16 +01:00
7fcc809852 Update README: valve API, cheat_mode, model learning overhaul
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 17:16:25 +01:00
31cb6862e1 Overhaul model learning: kNN+GP, uncertainty, dataset pruning, sim-speed fix
Data collection:
- time_delta is now target game-time; wall sleep = game_delta / sim_speed
  so stored deltas are uniform regardless of GAME_SIM_SPEED setting
- Auto-exclude junk params (GAME_VERSION, TIME, ALARMS_ACTIVE, …) and
  params returning None (uninstalled subsystems)
- Optional include_valve_states=True adds all 53 valve positions as inputs
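
The sleep computation in the first bullet is a single division; a sketch with a hypothetical function name:

```python
def wall_sleep_seconds(game_delta, sim_speed):
    """Wall-clock seconds to sleep so that `game_delta` seconds of game time elapse."""
    if sim_speed <= 0:
        raise ValueError("sim_speed must be positive")
    return game_delta / sim_speed

# At 2x simulation speed, 10 s of game time passes in 5 s of wall time.
sleep_s = wall_sleep_seconds(10.0, 2.0)
```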

Model backends (model_type='nn' or 'knn'):
- ReactorKNNModel: k-nearest neighbours with GP interpolation
  - Finds k nearest states, computes per-second transition rates,
    linearly scales to requested game_delta (linear-in-time assumption)
  - forward_with_uncertainty() returns (prediction_dict, gp_std)
    where std≈0 = on known data, std≈1 = out of distribution
- NN training fixed: loss computed in tensor space, mse_loss per batch
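
The kNN-with-GP idea above can be sketched in NumPy. This is an assumption-laden illustration, not ReactorKNNModel: it uses a unit-variance RBF kernel (so the posterior std runs from about 0 on known data to about 1 far from it), a zero-mean GP over per-second rates, and hypothetical names throughout.

```python
import numpy as np

def rbf(a, b, length_scale=1.0):
    """Unit-variance RBF kernel between rows of a (m, d) and b (n, d)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale**2)

def knn_gp_predict(query, states, next_states, dts, game_delta, k=3, jitter=1e-6):
    """Predict the state after `game_delta` seconds and return (prediction, gp_std)."""
    d2 = ((states - query) ** 2).sum(axis=1)
    idx = np.argsort(d2)[:k]                         # k nearest stored states
    X = states[idx]
    rates = (next_states[idx] - X) / dts[idx, None]  # per-second transition rates
    K = rbf(X, X) + jitter * np.eye(len(idx))        # jitter keeps the solve stable
    k_star = rbf(query[None, :], X)[0]
    weights = np.linalg.solve(K, k_star)             # K^-1 k*
    rate = weights @ rates                           # GP posterior mean of the rate
    var = 1.0 - k_star @ weights                     # posterior variance (prior variance 1)
    std = float(np.sqrt(max(var, 0.0)))
    return query + rate * game_delta, std            # linear-in-time scaling

# Made-up dataset: every transition moves the first coordinate by 0.1 per second.
states = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
dts = np.full(4, 2.0)
next_states = states + np.array([0.2, 0.0])
pred_near, std_near = knn_gp_predict(np.array([0.0, 0.0]), states, next_states, dts, 1.0)
pred_far, std_far = knn_gp_predict(np.array([100.0, 100.0]), states, next_states, dts, 1.0)
```

With a zero-mean prior, a query far from all data gets a predicted rate near zero together with a std near 1, which is the signal the uncertainty penalty and abort logic rely on.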

Dataset management:
- drop_well_fitted(error_threshold): drop samples model predicts well,
  keep hard cases (useful for NN curriculum)
- drop_redundant(min_state_distance, min_output_distance): drop samples
  that are close in BOTH input state AND output transition space, keeping
  genuinely different dynamics even at the same input state
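
The "close in BOTH spaces" rule for drop_redundant can be sketched as a greedy filter. This is a simple O(n²) illustration under assumed Euclidean distances, not the repository's implementation.

```python
import numpy as np

def drop_redundant(inputs, outputs, min_state_distance, min_output_distance):
    """Return indices of samples to keep.

    A sample is dropped only if some already-kept sample is closer than both
    thresholds, i.e. near in input state AND near in output transition.
    """
    kept = []
    for i in range(len(inputs)):
        redundant = any(
            np.linalg.norm(inputs[i] - inputs[j]) < min_state_distance
            and np.linalg.norm(outputs[i] - outputs[j]) < min_output_distance
            for j in kept
        )
        if not redundant:
            kept.append(i)
    return kept

# Made-up data: sample 1 duplicates sample 0; sample 2 shares its input
# state with sample 0 but shows different dynamics, so it must be kept.
inputs = np.array([[0.0, 0.0], [0.01, 0.0], [0.01, 0.0], [5.0, 5.0]])
outputs = np.array([[1.0], [1.001], [3.0], [1.0]])
kept = drop_redundant(inputs, outputs, 0.1, 0.1)
```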

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 17:16:22 +01:00
c78106dffc Add valve API, cheat_mode, and write-only param fixes
- Rename is_admin/admin_mode -> is_cheat/cheat_mode (only FUN_* event
  triggers are cheat params, not operational commands like SCRAM)
- Fix steam ejector valve write commands: int 0-100, not bool
- Move SCRAM, EMERGENCY_STOP, bay hatches, turbine trip etc. to normal
  write-only (not cheat-gated)
- Add FUN_IS_ENABLED to readable params (it appears in GET list)
- Add get_valve/get_valves, open/close/off_valve(s) methods with correct
  actuator semantics: OPEN/CLOSE powers motor, OFF holds position

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 17:16:08 +01:00
2ec68ff2e5 Add is_admin flag and admin_mode for dangerous write-only parameters
Parameters like CORE_SCRAM_BUTTON, CORE_EMERGENCY_STOP, bay hatch/fuel
loading, VALVE_OPEN/CLOSE/OFF, STEAM_TURBINE_TRIP, and all FUN_* event
triggers are now marked is_admin=True. Writing to them is blocked unless
the Nucon instance has admin_mode=True or force=True is used.

Normal control setpoints (MSCV_*, STEAM_TURBINE_*_BYPASS_ORDERED,
CHEM_BORON_*) remain write-only but are not admin-gated.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 16:42:13 +01:00
90616dcf69 Full parameter coverage compatible with game V2.2.25.213
- Add ~300 missing parameters with types, ranges, and units
- Add SPECIAL_VARIABLES frozenset to block non-param game endpoints
- Fix batch query to handle {"values": {...}} wrapper
- Fix str-typed params falling back to individual GET (batch returns int codes)
- Handle null/empty values from uninstalled subsystems
- Add is_readable field to NuconParameter for write-only support
- Add 57 write-only parameters: SCRAM, emergency stop, bay hatches/fuel loading,
  RODS_ALL_POS_ORDERED, MSCVs, steam turbine bypass/trip, ejector valves,
  VALVE_OPEN/CLOSE/OFF, chemistry rates, FUN_* event triggers
- Update get_all/get_all_readable/get_all_iter to skip write-only params
- __len__ now reflects readable param count (consistent with get_all)
- Update tests to skip write-only params in write test, handle None values

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 16:27:57 +01:00
5cfedceab7 Dev and simulator fixes
- Export ParameterEnum from __init__
- Add flask and numpy to dev dependencies
- Fix sim: remove run() call from test fixture, handle WEBSERVER_LIST_VARIABLES and WEBSERVER_BATCH_GET, normalize variable names to uppercase
- Remove RODS params from sim state (no longer part of sim model)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 16:27:48 +01:00
c180fe4434 Logo! 2025-08-27 12:47:13 +02:00
d5a0212824 Update exposed vars to match game V2 2025-02-18 21:22:43 +01:00
9adee35a6f Fix tests: Nucon is now instantiated 2025-02-18 21:22:16 +01:00
a6b5b68777 Fix: Also export BreakerStatus in __init__ 2025-02-18 21:21:45 +01:00
08ecbb461d Param Repr should contain is_writable not writable 2024-10-11 08:57:40 +02:00
e4c9f047d0 Add PPO example to README 2024-10-10 17:27:12 +02:00
5dfd85a5af Fix README 2024-10-10 17:17:21 +02:00
7e0d85acc7 Update README 2024-10-10 17:14:26 +02:00
878fb9cf4f Fix: Repr for Parameter should contain param_type, not type 2024-10-10 17:14:00 +02:00
81398225ec Ensure we expose the correct default modules 2024-10-10 17:13:35 +02:00
b750e80c80 Updated README 2024-10-10 16:54:43 +02:00
70fd128465 Expanded README 2024-10-10 16:52:37 +02:00
e625c994df Expanded README 2024-10-10 16:50:30 +02:00
ccbb83674a Fix typo 2024-10-10 16:46:13 +02:00
4791b2a4b6 Updated README 2024-10-10 16:45:06 +02:00
6d1df49ede More sensible bool values for enums 2024-10-10 16:43:50 +02:00
502c8a1c78 Fix typo 2024-10-10 16:40:13 +02:00
66481d8486 Fix typo + better install instructions 2024-10-08 17:21:02 +02:00
c0a9ec33a0 README: Added Note about current capabilities 2024-10-07 17:11:44 +02:00
2e759215a8 Rearrange README 2024-10-03 23:35:39 +02:00
fb71780563 Better optional dependency management 2024-10-03 23:33:20 +02:00
f9288bf611 More extra deps 2024-10-03 23:33:11 +02:00
a43c9550ac Fix typo in meme 2024-10-03 23:28:29 +02:00