NuCon

Author	SHA1	Message	Date
Dominik Roth	646399dcc7	feat: improve NN dynamics model and SAC training - ReactorDynamicsNet: add dropout (0.3) for regularisation - ReactorDynamicsModel: z-score normalisation of inputs/outputs, predict per-second rates of change, forward_with_uncertainty() stub - rl.py: misc SAC training improvements - sim.py: minor fixes - train_sac.py: updated training loop Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-15 11:18:15 +01:00
Dominik Roth	55d6e8708e	fix: kNN zero-variance dims get inf std; hot-start SAC from saved model - nucon/model.py: constant input dimensions (zero variance in training data) now get std=inf so they contribute 0 to normalised kNN distance instead of causing catastrophic OOD from tiny float epsilon - scripts/train_sac.py: add --load, --steps, --out CLI args; --load hot-starts actor/critic weights from a previous run (learning_starts=0) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-13 12:44:26 +01:00
Dominik Roth	0932bb353a	feat: SAC+HER training on kNN-GP sim with direct bypass and scripts/ - nucon/rl.py: delta_action_scale action space, bool handling (>=0.5), direct sim read/write bypassing HTTP for ~2000fps env throughput; remove uncertainty_abort from training (use penalty-only), larger default batch sizes; fix _read_obs and step for in-process sim - nucon/model.py: optimise _lookup with einsum squared-L2, vectorised rbf kernel; forward_with_uncertainty uses pre-built normalised arrays - nucon/sim.py: _update_reactor_state writes outputs via setattr directly - scripts/train_sac.py: moved from root; full SAC+HER example with kNN-GP sim, delta actions, uncertainty penalty, init_states - scripts/collect_dataset.py: CLI tool to collect dynamics dataset from live game session (--steps, --delta, --out, --merge) - README.md: add Scripts section, reference both scripts in training loop Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 20:43:37 +01:00
Dominik Roth	c3111ad5be	fix: retry game connection in __init__ as well as collect_data The None-param filtering probe at init also needs to wait for the game to be reachable, not just the collection loop. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 17:53:35 +01:00
Dominik Roth	088b7d4733	fix: make collect_data resilient to game crashes - Save dataset every N steps (default 10) so a disconnect loses at most one checkpoint's worth of samples instead of everything - Retry _get_state() on ConnectionError/Timeout rather than crashing, resuming automatically once the game comes back up Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 17:52:17 +01:00
Dominik Roth	ce2019e060	refactor: remove model_type from NuconModelLearner.__init__ Model type is irrelevant during data collection. Models are now created lazily on first use: train_model() creates a ReactorDynamicsModel, fit_knn(k) creates a ReactorKNNModel. load_model() detects type by file extension as before. drop_well_fitted() now checks model exists. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 17:50:31 +01:00
Dominik Roth	0dab7a6cec	Fix rl/sim blockers and add NuconGoalEnv for HER training rl.py: - Add missing `from enum import Enum` - Skip str-typed params in obs/action space construction (was crashing) - Guard action space: exclude write-only (is_readable=False) and cheat params - Fix step() param lookup (no longer iterates Nucon, uses _parameters dict directly) - Correct sim-speed time dilation in real-game sleep - Extract _build_param_space() helper shared by NuconEnv and NuconGoalEnv - Add NuconGoalEnv: goal-conditioned env with normalised achieved/desired goal vectors, compatible with SB3 HerReplayBuffer; goals sampled per episode - Register Nucon-goal_power-v0 and Nucon-goal_temp-v0 presets - Enum obs/action space now scalar index (not one-hot) sim.py: - Store self.port and self.host on NuconSimulator - Add set_model() to accept a pre-loaded model directly - load_model() detects type by extension (.pkl → kNN, else → NN torch) and reads new checkpoint format with embedded input/output param lists - _update_reactor_state() uses model.input_params (not all readable params), calls .forward() directly for both NN and kNN, guards torch.no_grad per type - Import ReactorKNNModel and pickle model.py: - save_model() embeds input_params/output_params in NN checkpoint dict - load_model() handles new checkpoint format (state_dict key) with fallback README.md: - Update note: RODS_POS_ORDERED is no longer the only writable param; game v2.2.25.213 exposes rod banks, pumps, MSCVs, switches and more Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 17:37:16 +01:00
Dominik Roth	31cb6862e1	Overhaul model learning: kNN+GP, uncertainty, dataset pruning, sim-speed fix Data collection: - time_delta is now target game-time; wall sleep = game_delta / sim_speed so stored deltas are uniform regardless of GAME_SIM_SPEED setting - Auto-exclude junk params (GAME_VERSION, TIME, ALARMS_ACTIVE, …) and params returning None (uninstalled subsystems) - Optional include_valve_states=True adds all 53 valve positions as inputs Model backends (model_type='nn' or 'knn'): - ReactorKNNModel: k-nearest neighbours with GP interpolation - Finds k nearest states, computes per-second transition rates, linearly scales to requested game_delta (linear-in-time assumption) - forward_with_uncertainty() returns (prediction_dict, gp_std) where std≈0 = on known data, std≈1 = out of distribution - NN training fixed: loss computed in tensor space, mse_loss per batch Dataset management: - drop_well_fitted(error_threshold): drop samples model predicts well, keep hard cases (useful for NN curriculum) - drop_redundant(min_state_distance, min_output_distance): drop samples that are close in BOTH input state AND output transition space, keeping genuinely different dynamics even at the same input state Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 17:16:22 +01:00
Dominik Roth	60cd44cc9e	Implemenetd Model Learning	2024-10-03 21:55:59 +02:00

9 Commits
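
Note on commit 55d6e8708e: the message describes giving constant input dimensions std=inf so that they contribute 0 to the normalised kNN distance. The snippet below is a minimal stdlib-only sketch of that idea, not the code in nucon/model.py. The helper names `column_stds` and `normalised_sq_dist`, the toy data, and the `1e-8` epsilon used for comparison are all assumptions made for illustration.

```python
import math
from statistics import pstdev

def column_stds(rows):
    """Per-dimension population std; constant dimensions get inf.

    Hypothetical helper: mirrors the convention described in the commit
    message, not the actual NuCon implementation.
    """
    stds = []
    for col in zip(*rows):
        s = pstdev(col)
        stds.append(s if s > 0.0 else math.inf)
    return stds

def normalised_sq_dist(a, b, stds):
    """Squared L2 distance after dividing each dimension by its std."""
    # A finite difference divided by inf is 0.0 in IEEE floats, so
    # constant dimensions add nothing to the distance.
    return sum(((x - y) / s) ** 2 for x, y, s in zip(a, b, stds))

# Toy training set: dimension 0 varies, dimension 1 is constant.
train = [
    [1.0, 50.0],
    [2.0, 50.0],
    [3.0, 50.0],
]
stds = column_stds(train)

# The query differs from train[1] only by a tiny offset in the constant dimension.
query = [2.0, 50.001]

# With std=inf the offset is ignored.
d_inf = normalised_sq_dist(query, train[1], stds)

# With a tiny epsilon std instead (assumed value), the same offset dominates.
d_eps = normalised_sq_dist(query, train[1], [stds[0], 1e-8])

print(d_inf, d_eps)
```

With the epsilon variant, a drift of 0.001 in a dimension that never changed during training yields a huge normalised distance, which is the failure the commit message calls catastrophic OOD. With std=inf the same query is treated as sitting on known data.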