NuCon

Author	SHA1	Message	Date
Dominik Roth	0932bb353a	feat: SAC+HER training on kNN-GP sim with direct bypass and scripts/ - nucon/rl.py: delta_action_scale action space, bool handling (>=0.5), direct sim read/write bypassing HTTP for ~2000fps env throughput; remove uncertainty_abort from training (use penalty-only), larger default batch sizes; fix _read_obs and step for in-process sim - nucon/model.py: optimise _lookup with einsum squared-L2, vectorised rbf kernel; forward_with_uncertainty uses pre-built normalised arrays - nucon/sim.py: _update_reactor_state writes outputs via setattr directly - scripts/train_sac.py: moved from root; full SAC+HER example with kNN-GP sim, delta actions, uncertainty penalty, init_states - scripts/collect_dataset.py: CLI tool to collect dynamics dataset from live game session (--steps, --delta, --out, --merge) - README.md: add Scripts section, reference both scripts in training loop Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 20:43:37 +01:00
Dominik Roth	3dfe1aa673	fix: flat Box action space, SB3/HER compatibility, sim uninitialized param defaults. rl.py: - Action space is now a flat Box (SAC/PPO require this, not Dict) - _build_flat_action_space + _unflatten_action helpers shared by both envs - Params with undefined bounds excluded from action space (SAC needs finite bounds) - Fix _build_param_space: use `is not None` check instead of falsy `or` (0 is valid min_val) - NuconGoalEnv obs params default to simulator.model.input_params when sim provided; obs_params kwarg overrides for real-game deployment with same param set - SIM_UNCERTAINTY kept out of policy obs vector (not available at deployment); available in reward_obs passed to objectives/terminators/reward_fn - _read_obs returns (gym_obs, reward_obs) cleanly instead of smuggling via dict - NuconGoalEnv additional_objectives wired into step(). sim.py: - Uninitialized params return type-default (0/False/first-enum) instead of "None" - Enum params serialised as integer value, not repr string. README.md: - Fix HerReplayBuffer import path (sb3 2.x: her.her_replay_buffer) - Remove non-existent simulator.run() call - Fix broken anchor links, remove "work in progress" from intro. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 19:16:07 +01:00
Dominik Roth	845ca708a7	remove UncertaintyPenalty/Abort aliases; use Parameterized_Objectives/Terminators dicts Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 18:58:22 +01:00
Dominik Roth	041e0ec1bd	rename: objectives -> additional_objectives in NuconGoalEnv. Clarifies that the goal reward is the primary built-in objective; additional_objectives are additive on top of it. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 18:56:34 +01:00
Dominik Roth	36a33e74e5	fix: add objectives support to NuconGoalEnv; fix README uncertainty example - NuconGoalEnv now accepts objectives/objective_weights; additive on top of the goal reward, same interface as NuconEnv - README: use UncertaintyPenalty/UncertaintyAbort correctly (via objectives and terminators, not as constructor params that don't exist) - Step 3 prose updated to reference composable callables Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 18:55:16 +01:00
Dominik Roth	f4d45d3cfd	feat: NuconGoalEnv, composable uncertainty helpers, kNN-GP naming - Add NuconGoalEnv for goal-conditioned HER training (SAC + HER) - Add UncertaintyPenalty and UncertaintyAbort composable callables; SIM_UNCERTAINTY injected into obs dict when simulator is active - Fix rl.py: str-typed params crash, missing Enum import, write-only params in action space, broken step() iteration order - Remove uncertainty state from sim (return value from update() instead) - Rename kNN -> kNN-GP throughout README; add model selection note Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 18:51:13 +01:00
Dominik Roth	1b93699501	docs: mention uncertainty penalty/abort in training loop section Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 18:37:25 +01:00
Dominik Roth	e2e8db1f04	docs: remove WIP labels and clean up stale transitional prose Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 18:22:25 +01:00
Dominik Roth	7ee8272034	docs: replace step-by-step code blocks in training loop with prose. The prior sections already have full code examples; the training loop section now just describes each step concisely and links back to them. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 18:20:10 +01:00
Dominik Roth	f0cc7ba9c4	docs: replace em-dashes in body text with natural punctuation. Keep em-dashes in step headings, replace in prose with ;/:/./, Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 18:19:04 +01:00
Dominik Roth	3eb0cc7b60	README ascii art fixes	2026-03-12 18:15:01 +01:00
Dominik Roth	a4f898c3ad	docs: add full training loop section to README. Documents the iterative sim-to-real workflow: 1. Human data collection during gameplay; 2. Initial model fitting (kNN or NN); 3. RL training in simulator (SAC + HER); 4. Eval in game while collecting new data; 5. Refit model, repeat. Includes ASCII flow diagram, code for each step, and a convergence criterion (low kNN uncertainty throughout episode). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 18:13:12 +01:00
Dominik Roth	ce2019e060	refactor: remove model_type from NuconModelLearner.__init__. Model type is irrelevant during data collection. Models are now created lazily on first use: train_model() creates a ReactorDynamicsModel, fit_knn(k) creates a ReactorKNNModel. load_model() detects type by file extension as before. drop_well_fitted() now checks model exists. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 17:50:31 +01:00
Dominik Roth	1f7ecc301f	docs: document NuconGoalEnv and HER training in README - Describe both NuconEnv and NuconGoalEnv with their obs/action spaces - Explain goal-conditioned approach and why HER is appropriate - Add SAC + HerReplayBuffer usage example with recommended hyperparams - Show how to inject a custom goal at inference time - List registered goal env presets Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 17:38:20 +01:00
Dominik Roth	0dab7a6cec	Fix rl/sim blockers and add NuconGoalEnv for HER training. rl.py: - Add missing `from enum import Enum` - Skip str-typed params in obs/action space construction (was crashing) - Guard action space: exclude write-only (is_readable=False) and cheat params - Fix step() param lookup (no longer iterates Nucon, uses _parameters dict directly) - Correct sim-speed time dilation in real-game sleep - Extract _build_param_space() helper shared by NuconEnv and NuconGoalEnv - Add NuconGoalEnv: goal-conditioned env with normalised achieved/desired goal vectors, compatible with SB3 HerReplayBuffer; goals sampled per episode - Register Nucon-goal_power-v0 and Nucon-goal_temp-v0 presets - Enum obs/action space now scalar index (not one-hot). sim.py: - Store self.port and self.host on NuconSimulator - Add set_model() to accept a pre-loaded model directly - load_model() detects type by extension (.pkl → kNN, else → NN torch) and reads new checkpoint format with embedded input/output param lists - _update_reactor_state() uses model.input_params (not all readable params), calls .forward() directly for both NN and kNN, guards torch.no_grad per type - Import ReactorKNNModel and pickle. model.py: - save_model() embeds input_params/output_params in NN checkpoint dict - load_model() handles new checkpoint format (state_dict key) with fallback. README.md: - Update note: RODS_POS_ORDERED is no longer the only writable param; game v2.2.25.213 exposes rod banks, pumps, MSCVs, switches and more. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 17:37:16 +01:00
Dominik Roth	7fcc809852	Update README: valve API, cheat_mode, model learning overhaul Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-12 17:16:25 +01:00
Dominik Roth	c180fe4434	Logo!	2025-08-27 12:47:13 +02:00
Dominik Roth	08ecbb461d	Param Repr should contain is_writable not writable	2024-10-11 08:57:40 +02:00
Dominik Roth	e4c9f047d0	Add PPO example to README	2024-10-10 17:27:12 +02:00
Dominik Roth	5dfd85a5af	Fix README	2024-10-10 17:17:21 +02:00
Dominik Roth	7e0d85acc7	Update README	2024-10-10 17:14:26 +02:00
Dominik Roth	b750e80c80	Updated README	2024-10-10 16:54:43 +02:00
Dominik Roth	70fd128465	Expanded README	2024-10-10 16:52:37 +02:00
Dominik Roth	e625c994df	Expanded README	2024-10-10 16:50:30 +02:00
Dominik Roth	ccbb83674a	Fix typo	2024-10-10 16:46:13 +02:00
Dominik Roth	4791b2a4b6	Updated README	2024-10-10 16:45:06 +02:00
Dominik Roth	502c8a1c78	Fix typo	2024-10-10 16:40:13 +02:00
Dominik Roth	66481d8486	Fix typo + better install instructions	2024-10-08 17:21:02 +02:00
Dominik Roth	c0a9ec33a0	README: Added Note about current capabilities	2024-10-07 17:11:44 +02:00
Dominik Roth	2e759215a8	Rearrange README	2024-10-03 23:35:39 +02:00
Dominik Roth	fb71780563	Better optional dependency management	2024-10-03 23:33:20 +02:00
Dominik Roth	a43c9550ac	Fix typo in meme	2024-10-03 23:28:29 +02:00
Dominik Roth	70ed9d38ed	Update README	2024-10-03 23:26:07 +02:00
Dominik Roth	f467e9cbcb	Fix typo	2024-10-03 22:00:51 +02:00
Dominik Roth	c66a4f9e7d	Updated README	2024-10-03 21:59:22 +02:00
Dominik Roth	b0a2ac7574	Fix port in README	2024-10-02 22:36:54 +02:00
Dominik Roth	42f91a2279	Typo in README	2024-10-02 19:57:28 +02:00
Dominik Roth	08a60e2850	Cite?	2024-10-02 19:47:49 +02:00
Dominik Roth	d580f77fce	Allowed weighted objectives	2024-10-02 19:31:19 +02:00
Dominik Roth	dc8bafbbfe	Updated README	2024-10-02 19:22:38 +02:00
Dominik Roth	d941b2c9af	Added RL to README	2024-10-02 18:45:11 +02:00
Dominik Roth	3fa12f65bc	Updated README	2024-10-02 17:05:51 +02:00
Dominik Roth	9e88f68fe0	Update README	2024-10-02 17:00:13 +02:00
Dominik Roth	da871655e6	Updated README	2024-10-02 16:51:57 +02:00
Dominik Roth	312de01f44	Updated README	2024-10-02 16:28:06 +02:00
Dominik Roth	7a34be0f09	Initial commit	2024-10-02 16:25:45 +02:00

46 Commits