NuCon

Author	SHA1	Message	Date
Dominik Roth	1b93699501	docs: mention uncertainty penalty/abort in training loop section Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 18:37:25 +01:00
Dominik Roth	65190dffea	feat: uncertainty-aware training with penalty and abort sim.py: - simulator.update(return_uncertainty=True) calls forward_with_uncertainty on kNN models and returns the GP std; returns None for NN or when not requested (no extra cost if unused) - No state stored on simulator; caller decides what to do with the value rl.py (NuconEnv and NuconGoalEnv): - uncertainty_penalty_start: above this GP std, subtract a linear penalty from the reward (scaled by uncertainty_penalty_scale, default 1.0) - uncertainty_abort: at or above this GP std, set truncated=True - Only calls update(return_uncertainty=True) when either threshold is set - Uncertainty only applies when using a simulator (kNN model); ignored otherwise Example: simulator = NuconSimulator() simulator.load_model('reactor_knn.pkl') env = NuconGoalEnv(..., simulator=simulator, uncertainty_penalty_start=0.3, uncertainty_abort=0.7, uncertainty_penalty_scale=2.0) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 18:37:09 +01:00
Dominik Roth	6cb93ad56d	feat: abort trajectory on high kNN uncertainty in simulator NuconSimulator now accepts uncertainty_threshold (default None = disabled). When set and using a kNN model, _update_reactor_state() calls forward_with_uncertainty() and raises HighUncertaintyError if the GP posterior std exceeds the threshold. NuconEnv and NuconGoalEnv catch HighUncertaintyError in step() and return truncated=True, so SB3 bootstraps the value rather than treating OOD regions as terminal states. Usage: simulator = NuconSimulator(uncertainty_threshold=0.3) # episodes are cut short when the policy wanders OOD Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 18:29:54 +01:00
Dominik Roth	e2e8db1f04	docs: remove WIP labels and clean up stale transitional prose Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 18:22:25 +01:00
Dominik Roth	7ee8272034	docs: replace step-by-step code blocks in training loop with prose The prior sections already have full code examples; the training loop section now just describes each step concisely and links back to them. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 18:20:10 +01:00
Dominik Roth	f0cc7ba9c4	docs: replace em-dashes in body text with natural punctuation Keep em-dashes in step headings, replace in prose with ;/:/./, Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 18:19:04 +01:00
Dominik Roth	3eb0cc7b60	README ascii art fixes	2026-03-12 18:15:01 +01:00
Dominik Roth	a4f898c3ad	docs: add full training loop section to README Documents the iterative sim-to-real workflow: 1. Human data collection during gameplay 2. Initial model fitting (kNN or NN) 3. RL training in simulator (SAC + HER) 4. Eval in game while collecting new data 5. Refit model, repeat Includes ASCII flow diagram, code for each step, and a convergence criterion (low kNN uncertainty throughout episode). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 18:13:12 +01:00
Dominik Roth	c3111ad5be	fix: retry game connection in __init__ as well as collect_data The None-param filtering probe at init also needs to wait for the game to be reachable, not just the collection loop. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 17:53:35 +01:00
Dominik Roth	088b7d4733	fix: make collect_data resilient to game crashes - Save dataset every N steps (default 10) so a disconnect loses at most one checkpoint's worth of samples instead of everything - Retry _get_state() on ConnectionError/Timeout rather than crashing, resuming automatically once the game comes back up Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 17:52:17 +01:00
Dominik Roth	ce2019e060	refactor: remove model_type from NuconModelLearner.__init__ Model type is irrelevant during data collection. Models are now created lazily on first use: train_model() creates a ReactorDynamicsModel, fit_knn(k) creates a ReactorKNNModel. load_model() detects type by file extension as before. drop_well_fitted() now checks model exists. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 17:50:31 +01:00
Dominik Roth	1f7ecc301f	docs: document NuconGoalEnv and HER training in README - Describe both NuconEnv and NuconGoalEnv with their obs/action spaces - Explain goal-conditioned approach and why HER is appropriate - Add SAC + HerReplayBuffer usage example with recommended hyperparams - Show how to inject a custom goal at inference time - List registered goal env presets Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 17:38:20 +01:00
Dominik Roth	0dab7a6cec	Fix rl/sim blockers and add NuconGoalEnv for HER training rl.py: - Add missing `from enum import Enum` - Skip str-typed params in obs/action space construction (was crashing) - Guard action space: exclude write-only (is_readable=False) and cheat params - Fix step() param lookup (no longer iterates Nucon, uses _parameters dict directly) - Correct sim-speed time dilation in real-game sleep - Extract _build_param_space() helper shared by NuconEnv and NuconGoalEnv - Add NuconGoalEnv: goal-conditioned env with normalised achieved/desired goal vectors, compatible with SB3 HerReplayBuffer; goals sampled per episode - Register Nucon-goal_power-v0 and Nucon-goal_temp-v0 presets - Enum obs/action space now scalar index (not one-hot) sim.py: - Store self.port and self.host on NuconSimulator - Add set_model() to accept a pre-loaded model directly - load_model() detects type by extension (.pkl → kNN, else → NN torch) and reads new checkpoint format with embedded input/output param lists - _update_reactor_state() uses model.input_params (not all readable params), calls .forward() directly for both NN and kNN, guards torch.no_grad per type - Import ReactorKNNModel and pickle model.py: - save_model() embeds input_params/output_params in NN checkpoint dict - load_model() handles new checkpoint format (state_dict key) with fallback README.md: - Update note: RODS_POS_ORDERED is no longer the only writable param; game v2.2.25.213 exposes rod banks, pumps, MSCVs, switches and more Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 17:37:16 +01:00
Dominik Roth	7fcc809852	Update README: valve API, cheat_mode, model learning overhaul Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 17:16:25 +01:00
Dominik Roth	31cb6862e1	Overhaul model learning: kNN+GP, uncertainty, dataset pruning, sim-speed fix Data collection: - time_delta is now target game-time; wall sleep = game_delta / sim_speed so stored deltas are uniform regardless of GAME_SIM_SPEED setting - Auto-exclude junk params (GAME_VERSION, TIME, ALARMS_ACTIVE, …) and params returning None (uninstalled subsystems) - Optional include_valve_states=True adds all 53 valve positions as inputs Model backends (model_type='nn' or 'knn'): - ReactorKNNModel: k-nearest neighbours with GP interpolation - Finds k nearest states, computes per-second transition rates, linearly scales to requested game_delta (linear-in-time assumption) - forward_with_uncertainty() returns (prediction_dict, gp_std) where std≈0 = on known data, std≈1 = out of distribution - NN training fixed: loss computed in tensor space, mse_loss per batch Dataset management: - drop_well_fitted(error_threshold): drop samples model predicts well, keep hard cases (useful for NN curriculum) - drop_redundant(min_state_distance, min_output_distance): drop samples that are close in BOTH input state AND output transition space, keeping genuinely different dynamics even at the same input state Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 17:16:22 +01:00
Dominik Roth	c78106dffc	Add valve API, cheat_mode, and write-only param fixes - Rename is_admin/admin_mode -> is_cheat/cheat_mode (only FUN_* event triggers are cheat params, not operational commands like SCRAM) - Fix steam ejector valve write commands: int 0-100, not bool - Move SCRAM, EMERGENCY_STOP, bay hatches, turbine trip etc. to normal write-only (not cheat-gated) - Add FUN_IS_ENABLED to readable params (it appears in GET list) - Add get_valve/get_valves, open/close/off_valve(s) methods with correct actuator semantics: OPEN/CLOSE powers motor, OFF holds position Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 17:16:08 +01:00
Dominik Roth	2ec68ff2e5	Add is_admin flag and admin_mode for dangerous write-only parameters Parameters like CORE_SCRAM_BUTTON, CORE_EMERGENCY_STOP, bay hatch/fuel loading, VALVE_OPEN/CLOSE/OFF, STEAM_TURBINE_TRIP, and all FUN_* event triggers are now marked is_admin=True. Writing to them is blocked unless the Nucon instance has admin_mode=True or force=True is used. Normal control setpoints (MSCV_*, STEAM_TURBINE_*_BYPASS_ORDERED, CHEM_BORON_*) remain write-only but are not admin-gated. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 16:42:13 +01:00
Dominik Roth	90616dcf69	Full parameter coverage compatible with game V2.2.25.213 - Add ~300 missing parameters with types, ranges, and units - Add SPECIAL_VARIABLES frozenset to block non-param game endpoints - Fix batch query to handle {"values": {...}} wrapper - Fix str-typed params falling back to individual GET (batch returns int codes) - Handle null/empty values from uninstalled subsystems - Add is_readable field to NuconParameter for write-only support - Add 57 write-only parameters: SCRAM, emergency stop, bay hatches/fuel loading, RODS_ALL_POS_ORDERED, MSCVs, steam turbine bypass/trip, ejector valves, VALVE_OPEN/CLOSE/OFF, chemistry rates, FUN_* event triggers - Update get_all/get_all_readable/get_all_iter to skip write-only params - __len__ now reflects readable param count (consistent with get_all) - Update tests to skip write-only params in write test, handle None values Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 16:27:57 +01:00
Dominik Roth	5cfedceab7	Dev and simulator fixes - Export ParameterEnum from __init__ - Add flask and numpy to dev dependencies - Fix sim: remove run() call from test fixture, handle WEBSERVER_LIST_VARIABLES and WEBSERVER_BATCH_GET, normalize variable names to uppercase - Remove RODS params from sim state (no longer part of sim model) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 16:27:48 +01:00
Dominik Roth	c180fe4434	Logo!	2025-08-27 12:47:13 +02:00
Dominik Roth	d5a0212824	Update exposed vars to match game V2	2025-02-18 21:22:43 +01:00
Dominik Roth	9adee35a6f	Fix tests: Nucon is now instantiated	2025-02-18 21:22:16 +01:00
Dominik Roth	a6b5b68777	Fix: Also export BreakerStatus in __init__	2025-02-18 21:21:45 +01:00
Dominik Roth	08ecbb461d	Param Repr should contain is_writable not writable	2024-10-11 08:57:40 +02:00
Dominik Roth	e4c9f047d0	Add PPO example to README	2024-10-10 17:27:12 +02:00
Dominik Roth	5dfd85a5af	Fix README	2024-10-10 17:17:21 +02:00
Dominik Roth	7e0d85acc7	Update README	2024-10-10 17:14:26 +02:00
Dominik Roth	878fb9cf4f	Fix: Repr for Parameter should contain param_type, not type	2024-10-10 17:14:00 +02:00
Dominik Roth	81398225ec	Ensure we expose the correct default modules	2024-10-10 17:13:35 +02:00
Dominik Roth	b750e80c80	Updated README	2024-10-10 16:54:43 +02:00
Dominik Roth	70fd128465	Expanded README	2024-10-10 16:52:37 +02:00
Dominik Roth	e625c994df	Expanded README	2024-10-10 16:50:30 +02:00
Dominik Roth	ccbb83674a	Fix typo	2024-10-10 16:46:13 +02:00
Dominik Roth	4791b2a4b6	Updated README	2024-10-10 16:45:06 +02:00
Dominik Roth	6d1df49ede	More sensible bool values for enums	2024-10-10 16:43:50 +02:00
Dominik Roth	502c8a1c78	Fix typo	2024-10-10 16:40:13 +02:00
Dominik Roth	66481d8486	Fix typo + better install instructions	2024-10-08 17:21:02 +02:00
Dominik Roth	c0a9ec33a0	README: Added Note about current capabilities	2024-10-07 17:11:44 +02:00
Dominik Roth	2e759215a8	Rearrange README	2024-10-03 23:35:39 +02:00
Dominik Roth	fb71780563	Better optional dependency management	2024-10-03 23:33:20 +02:00
Dominik Roth	f9288bf611	More extra deps	2024-10-03 23:33:11 +02:00
Dominik Roth	a43c9550ac	Fix typo in meme	2024-10-03 23:28:29 +02:00
Dominik Roth	9b62a141fa	Include assets when installing	2024-10-03 23:26:14 +02:00
Dominik Roth	70ed9d38ed	Update README	2024-10-03 23:26:07 +02:00
Dominik Roth	e7e7c81d29	Impl drake	2024-10-03 23:25:53 +02:00
Dominik Roth	f467e9cbcb	Fix typo	2024-10-03 22:00:51 +02:00
Dominik Roth	c66a4f9e7d	Updated README	2024-10-03 21:59:22 +02:00
Dominik Roth	e665a457dc	Extended Test Suite	2024-10-03 21:57:08 +02:00
Dominik Roth	03da3415c8	Updated __init__	2024-10-03 21:56:46 +02:00
Dominik Roth	4c3ad983fc	Morer objectives and fixes	2024-10-03 21:56:27 +02:00

74 Commits
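
Commit 65190dffea describes the uncertainty handling in words only: above `uncertainty_penalty_start` a linear penalty scaled by `uncertainty_penalty_scale` is subtracted from the reward, and at or above `uncertainty_abort` the step is marked `truncated=True`. The sketch below is one plausible reading of that description, not the code in `rl.py`. The function name `apply_uncertainty` is hypothetical, and the exact penalty form `scale * (std - start)` is an assumption, since the commit message only says "linear".

```python
from typing import Optional, Tuple


def apply_uncertainty(
    reward: float,
    gp_std: Optional[float],
    penalty_start: Optional[float] = None,
    abort: Optional[float] = None,
    penalty_scale: float = 1.0,
) -> Tuple[float, bool]:
    """Return (shaped_reward, truncated) for a single environment step.

    Hypothetical helper: mirrors the behaviour described in the commit
    message, but the formula itself is an assumption.
    """
    # No uncertainty available (e.g. NN model, or it was not requested):
    # leave the reward untouched and do not truncate.
    if gp_std is None:
        return reward, False

    # Linear penalty that grows with the distance above the start threshold.
    if penalty_start is not None and gp_std > penalty_start:
        reward -= penalty_scale * (gp_std - penalty_start)

    # Truncate (rather than terminate) at or above the abort threshold.
    truncated = abort is not None and gp_std >= abort
    return reward, truncated


if __name__ == "__main__":
    # Thresholds taken from the example in the commit message.
    for std in (0.1, 0.5, 0.7):
        print(std, apply_uncertainty(1.0, std, 0.3, 0.7, 2.0))
```

Returning `truncated` instead of a terminal flag matches the rationale given in commit 6cb93ad56d: the learner can still bootstrap the value at the cut-off point rather than treating an out-of-distribution region as a terminal state.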