NuCon

Author	SHA1	Message	Date
Dominik Roth	2c1bbc1a31	refactor: move UncertaintyPenalty/Abort into Parameterized_Objectives/Terminators dicts - uncertainty_penalty -> Parameterized_Objectives['uncertainty_penalty'] - uncertainty_abort -> Parameterized_Terminators['uncertainty_abort'] - Add Parameterized_Terminators dict (same pattern as Parameterized_Objectives) - Keep UncertaintyPenalty / UncertaintyAbort as convenience aliases Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 18:57:44 +01:00
Dominik Roth	041e0ec1bd	rename: objectives -> additional_objectives in NuconGoalEnv Clarifies that the goal reward is the primary built-in objective; additional_objectives are additive on top of it. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 18:56:34 +01:00
Dominik Roth	36a33e74e5	fix: add objectives support to NuconGoalEnv; fix README uncertainty example - NuconGoalEnv now accepts objectives/objective_weights; additive on top of the goal reward, same interface as NuconEnv - README: use UncertaintyPenalty/UncertaintyAbort correctly (via objectives and terminators, not as constructor params that don't exist) - Step 3 prose updated to reference composable callables Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 18:55:16 +01:00
Dominik Roth	f4d45d3cfd	feat: NuconGoalEnv, composable uncertainty helpers, kNN-GP naming - Add NuconGoalEnv for goal-conditioned HER training (SAC + HER) - Add UncertaintyPenalty and UncertaintyAbort composable callables; SIM_UNCERTAINTY injected into obs dict when simulator is active - Fix rl.py: str-typed params crash, missing Enum import, write-only params in action space, broken step() iteration order - Remove uncertainty state from sim (return value from update() instead) - Rename kNN -> kNN-GP throughout README; add model selection note Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 18:51:13 +01:00
Dominik Roth	65190dffea	feat: uncertainty-aware training with penalty and abort sim.py: - simulator.update(return_uncertainty=True) calls forward_with_uncertainty on kNN models and returns the GP std; returns None for NN or when not requested (no extra cost if unused) - No state stored on simulator; caller decides what to do with the value rl.py (NuconEnv and NuconGoalEnv): - uncertainty_penalty_start: above this GP std, subtract a linear penalty from the reward (scaled by uncertainty_penalty_scale, default 1.0) - uncertainty_abort: at or above this GP std, set truncated=True - Only calls update(return_uncertainty=True) when either threshold is set - Uncertainty only applies when using a simulator (kNN model); ignored otherwise Example: simulator = NuconSimulator() simulator.load_model('reactor_knn.pkl') env = NuconGoalEnv(..., simulator=simulator, uncertainty_penalty_start=0.3, uncertainty_abort=0.7, uncertainty_penalty_scale=2.0) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 18:37:09 +01:00
Dominik Roth	6cb93ad56d	feat: abort trajectory on high kNN uncertainty in simulator NuconSimulator now accepts uncertainty_threshold (default None = disabled). When set and using a kNN model, _update_reactor_state() calls forward_with_uncertainty() and raises HighUncertaintyError if the GP posterior std exceeds the threshold. NuconEnv and NuconGoalEnv catch HighUncertaintyError in step() and return truncated=True, so SB3 bootstraps the value rather than treating OOD regions as terminal states. Usage: simulator = NuconSimulator(uncertainty_threshold=0.3) # episodes are cut short when the policy wanders OOD Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 18:29:54 +01:00
Dominik Roth	0dab7a6cec	Fix rl/sim blockers and add NuconGoalEnv for HER training rl.py: - Add missing `from enum import Enum` - Skip str-typed params in obs/action space construction (was crashing) - Guard action space: exclude write-only (is_readable=False) and cheat params - Fix step() param lookup (no longer iterates Nucon, uses _parameters dict directly) - Correct sim-speed time dilation in real-game sleep - Extract _build_param_space() helper shared by NuconEnv and NuconGoalEnv - Add NuconGoalEnv: goal-conditioned env with normalised achieved/desired goal vectors, compatible with SB3 HerReplayBuffer; goals sampled per episode - Register Nucon-goal_power-v0 and Nucon-goal_temp-v0 presets - Enum obs/action space now scalar index (not one-hot) sim.py: - Store self.port and self.host on NuconSimulator - Add set_model() to accept a pre-loaded model directly - load_model() detects type by extension (.pkl → kNN, else → NN torch) and reads new checkpoint format with embedded input/output param lists - _update_reactor_state() uses model.input_params (not all readable params), calls .forward() directly for both NN and kNN, guards torch.no_grad per type - Import ReactorKNNModel and pickle model.py: - save_model() embeds input_params/output_params in NN checkpoint dict - load_model() handles new checkpoint format (state_dict key) with fallback README.md: - Update note: RODS_POS_ORDERED is no longer the only writable param; game v2.2.25.213 exposes rod banks, pumps, MSCVs, switches and more Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 17:37:16 +01:00
Dominik Roth	4c3ad983fc	Morer objectives and fixes	2024-10-03 21:56:27 +02:00
Dominik Roth	d580f77fce	Allowed weighted objectives	2024-10-02 19:31:19 +02:00
Dominik Roth	dc59173fe7	Better parameterized objectives and gym bindings	2024-10-02 19:22:23 +02:00
Dominik Roth	08828e2dec	RL oh yeah	2024-10-02 18:45:06 +02:00

11 Commits
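
The uncertainty handling described in commit 65190dffea (linear penalty above one GP std threshold, truncation at or above another) can be sketched roughly as below. This is a minimal illustration and not the code in rl.py: the function name `apply_uncertainty` and its argument names are hypothetical, and the exact penalty form, `scale * (std - start)`, is an assumption based on the commit message's wording.

```python
from typing import Optional, Tuple


def apply_uncertainty(
    reward: float,
    gp_std: Optional[float],
    penalty_start: Optional[float] = None,
    abort: Optional[float] = None,
    penalty_scale: float = 1.0,
) -> Tuple[float, bool]:
    """Return (adjusted_reward, truncated) for one environment step.

    Hypothetical helper: names and the exact penalty formula are assumptions.
    """
    truncated = False

    # No uncertainty available (e.g. NN model, or not requested): leave reward as is.
    if gp_std is None:
        return reward, truncated

    # Above the penalty threshold, subtract a penalty that grows linearly
    # with the excess uncertainty.
    if penalty_start is not None and gp_std > penalty_start:
        reward -= penalty_scale * (gp_std - penalty_start)

    # At or above the abort threshold, mark the step as truncated (not terminated),
    # so a learner can still bootstrap the value of the last state.
    if abort is not None and gp_std >= abort:
        truncated = True

    return reward, truncated


if __name__ == "__main__":
    # Thresholds taken from the example in the commit message.
    print(apply_uncertainty(1.0, 0.5, penalty_start=0.3, abort=0.7, penalty_scale=2.0))
    print(apply_uncertainty(1.0, 0.8, penalty_start=0.3, abort=0.7, penalty_scale=2.0))
```

Returning the truncation flag instead of raising an exception follows the direction of the later commits, where the simulator stores no uncertainty state and the caller decides what to do with the value.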