NuCon (Nucleares Controller) is a Python library designed to interface with and control parameters in Nucleares, a nuclear reactor simulation game. It provides a robust, type-safe foundation for reading and writing game parameters, allowing users to easily create their own automations and control systems.
NuCon further provides a reinforcement learning environment for training control policies and a simulator based on model learning.
Features
- Enum-based parameter system for type safety and code clarity
- Support for various parameter types including floats, integers, booleans, strings, and custom enums
- Read and write capabilities for game parameters
- Reinforcement learning environment for training control policies
- Built-in simulator for rapid prototyping and testing
- Model learning for dynamics prediction
Installation
To install NuCon, clone this repository and install via pip:
git clone https://git.dominik-roth.eu/dodox/NuCon
cd NuCon
pip install -e .
Usage
Here's a basic example of how to use NuCon:
from nucon import Nucon
nucon = Nucon()
# or nucon = Nucon(host='localhost', port=8786)
# Enable dummy mode for testing (optional)
nucon.set_dummy_mode(True)
# Read a parameter
core_temp = nucon.CORE_TEMP.value
print(f"Core Temperature: {core_temp}")
# >> Core Temperature: 500.0
# Read a parameter with an enum type
pump_status = nucon.COOLANT_CORE_CIRCULATION_PUMP_0_STATUS.value
print(f"Pump 0 Status: {pump_status}")
# >> Pump 0 Status: PumpStatus.INACTIVE
if pump_status:
    print('Pump 0 is active.')
# Write to a parameter (has no effect in dummy mode)
nucon.RODS_POS_ORDERED.value = 50
print(f"Rods Position Ordered: {nucon.RODS_POS_ORDERED.value}")
# >> Rods Position Ordered: 50.0
# The repr of a parameter contains all of its info
nucon.CORE_TEMP
# >> NuconParameter(id='CORE_TEMP', value=500.0, param_type=float, is_writable=False)
API Reference
The nucon instance contains all available parameters.
Parameter properties:
- nucon.<PARAMETER>.value: Get or set the current value of the parameter. Assigning a new value will write it to the game.
- nucon.<PARAMETER>.param_type: Get the type of the parameter.
- nucon.<PARAMETER>.is_writable: Check if the parameter is writable.
- nucon.<PARAMETER>.is_readable: False for write-only parameters (e.g. VALVE_OPEN, CORE_SCRAM_BUTTON). Reading raises AttributeError.
- nucon.<PARAMETER>.is_cheat: True for game-event triggers (all FUN_*). Writing raises ValueError unless cheat_mode=True.
- nucon.<PARAMETER>.enum_type: Get the enum type of the parameter if it's an enum, otherwise None.
- nucon.<PARAMETER>.unit: Unit string if defined (e.g. '°C', 'bar', '%').
Parameter methods:
- nucon.<PARAMETER>.read(): Get the current value of the parameter (alias for value).
- nucon.<PARAMETER>.write(new_value, force=False): Write a new value to the parameter. force will try to write even if the parameter is known as non-writable or out of the known allowed range.
Class methods:
- nucon.get(parameter): Get the value of a specific parameter. Also accepts string parameter names.
- nucon.set(parameter, value, force=False): Set the value of a specific parameter. Also accepts string parameter names. force bypasses writable/range/cheat checks.
- nucon.get_all_readable(): Get a dict of all readable parameters.
- nucon.get_all_writable(): Get a dict of all writable parameters (includes write-only params).
- nucon.get_all(): Get all readable parameter values as a dictionary.
- nucon.get_all_iter(): Get all readable parameter values as a generator.
- nucon.get_multiple(params): Get values for multiple specified parameters.
- nucon.get_multiple_iter(params): Get values for multiple specified parameters as a generator.
- nucon.get_game_variable_names(): Query the game for all exposed variable names (GET and POST), excluding special endpoints.
- nucon.set_dummy_mode(dummy_mode): In dummy mode, returns sensible values without connecting to the game and silently ignores writes.
- nucon.set_cheat_mode(cheat_mode): Enable writing to cheat parameters (FUN_* event triggers). Default False.
Valve API (motorized actuators: OPEN/CLOSE powers the motor, OFF holds current position):
- nucon.get_valve(name): Get the state dict for a single valve (Value, IsOpened, IsClosed, Stuck, …).
- nucon.get_valves(): Get state dicts for all 53 valves.
- nucon.open_valve(name) / nucon.open_valves(names): Power the actuator toward open.
- nucon.close_valve(name) / nucon.close_valves(names): Power the actuator toward closed.
- nucon.off_valve(name) / nucon.off_valves(names): Cut actuator power and hold the current position (normal resting state).
Custom Enum Types:
- PumpStatus: Enum for pump status (INACTIVE, ACTIVE_NO_SPEED_REACHED*, ACTIVE_SPEED_REACHED*, REQUIRES_MAINTENANCE, NOT_INSTALLED, INSUFFICIENT_ENERGY)
- PumpDryStatus: Enum for pump dry status (ACTIVE_WITHOUT_FLUID*, INACTIVE_OR_ACTIVE_WITH_FLUID)
- PumpOverloadStatus: Enum for pump overload status (ACTIVE_AND_OVERLOAD*, INACTIVE_OR_ACTIVE_NO_OVERLOAD)
- BreakerStatus: Enum for breaker status (OPEN*, CLOSED)
*: Truthy value (will be treated as true in e.g. if statements).
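As an illustration of how such truthy/falsy enum members can work (a minimal sketch of the idea, not NuCon's actual implementation), `__bool__` can be overridden on the enum class:

```python
from enum import Enum

class PumpStatus(Enum):
    # Illustrative subset of the members listed above
    INACTIVE = 0
    ACTIVE_NO_SPEED_REACHED = 1
    ACTIVE_SPEED_REACHED = 2

    def __bool__(self):
        # Only the ACTIVE_* members (marked * above) are truthy
        return self in (PumpStatus.ACTIVE_NO_SPEED_REACHED,
                        PumpStatus.ACTIVE_SPEED_REACHED)
```

This is what makes `if nucon.COOLANT_CORE_CIRCULATION_PUMP_0_STATUS.value:` read naturally in the usage example above.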
So if you're not in the mood to play the game manually, this API can be used to easily create your own automations and control systems. Maybe a little PID controller for the rods? Or, if you wanna go crazy, why not try some
Reinforcement Learning
NuCon includes a Reinforcement Learning (RL) environment based on the Gymnasium (formerly OpenAI Gym) interface. This allows you to train control policies for the Nucleares game instead of writing them yourself. Requires additional dependencies.
Additional Dependencies
To use it, you'll need to install gymnasium and numpy. You can do so via
pip install -e '.[rl]'
Environments
Two environment classes are provided in nucon/rl.py:
NuconEnv: classic fixed-objective environment. You define one or more objectives at construction time (e.g. maximise power output, keep temperature in range). The agent always trains toward the same goal.
- Observation space: all readable numeric parameters (~290 dims).
- Action space: all writable parameters whose values can be read back (~30 dims): 9 individual rod bank positions, 3 MSCVs, 3 turbine bypass valves, 6 coolant pump speeds, condenser pump, freight/vent switches, resistor banks, and more.
- Objectives: predefined strings ('max_power', 'episode_time') or arbitrary callables (obs) -> float. Multiple objectives are weighted-summed.
NuconGoalEnv: goal-conditioned environment. The desired goal (e.g. target generator output) is sampled at the start of each episode and provided as part of the observation. A single policy learns to reach any goal in the specified range, making it far more useful than a fixed-objective agent. Designed for training with Hindsight Experience Replay (HER), which makes sparse-reward goal-conditioned training tractable.
- Observation space: Dict with keys observation (non-goal params), achieved_goal (current goal param values, normalised to [0,1]), desired_goal (target, normalised to [0,1]).
- Goals are sampled uniformly from the specified goal_range each episode.
- Reward defaults to negative L2 distance in normalised goal space (dense). Pass tolerance for a sparse {0, -1} reward; this works particularly well with HER.
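The reward rule just described can be sketched in a few lines. This is illustrative only, not the code in nucon/rl.py; in particular, whether the sparse tolerance check uses the L2 distance (as assumed here) or a per-dimension check is an assumption:

```python
import numpy as np

def goal_reward(achieved, desired, tolerance=None):
    """Reward in the normalised [0, 1] goal space.

    tolerance=None  -> dense: negative L2 distance to the goal
    tolerance=0.05  -> sparse: 0 if within 5% of range, else -1
    """
    dist = np.linalg.norm(np.asarray(achieved) - np.asarray(desired))
    if tolerance is None:
        return -dist
    return 0.0 if dist <= tolerance else -1.0
```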
NuconEnv Usage
from nucon.rl import NuconEnv, Parameterized_Objectives
env = NuconEnv(objectives=['max_power'], seconds_per_step=5)
# env2 = gym.make('Nucon-max_power-v0')
# env3 = NuconEnv(objectives=[Parameterized_Objectives['target_temperature'](goal_temp=350)], objective_weights=[1.0], seconds_per_step=5)
obs, info = env.reset()
for _ in range(1000):
    action = env.action_space.sample()  # Your agent here (instead of random)
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()
env.close()
The objectives argument takes either names of predefined objectives as strings, or callables that take an observation and return a scalar reward. The final reward is the (weighted) sum across all objectives. info['objectives'] contains each objective and its value.
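Conceptually, the combination boils down to a weighted sum over callables. A sketch (the parameter names and scaling factors here are made up for illustration, not NuCon's built-in objectives):

```python
# Each objective maps an observation dict to a scalar reward.
objectives = [
    lambda obs: obs['GENERATOR_0_KW'] / 1000.0,   # reward power output
    lambda obs: -abs(obs['CORE_TEMP'] - 350.0),   # penalise temp deviation
]
weights = [1.0, 0.1]

def combined_reward(obs):
    # Weighted sum across all objectives, as NuconEnv does internally
    return sum(w * f(obs) for w, f in zip(weights, objectives))
```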
You can, for example, train a PPO agent using the stable-baselines3 (sb3) implementation:
from nucon.rl import NuconEnv
from stable_baselines3 import PPO
env = NuconEnv(objectives=['max_power'], seconds_per_step=5)
model = PPO(
"MlpPolicy",
env,
verbose=1,
learning_rate=3e-4,
n_steps=2048,
batch_size=64,
n_epochs=10,
gamma=0.99,
gae_lambda=0.95,
clip_range=0.2,
ent_coef=0.01,
)
model.learn(total_timesteps=100_000)
obs, info = env.reset()
for _ in range(1000):
    action, _states = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()
env.close()
NuconGoalEnv + HER Usage
HER works by relabelling past trajectories with the goal that was actually achieved, turning every episode into useful training signal even when the agent never reaches the intended target. This makes it much more sample-efficient than standard RL for goal-reaching tasks. This matters a lot given how slow the real game is.
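The relabelling trick can be sketched as follows (illustrative only; SB3's HerReplayBuffer handles this for you, and its actual implementation differs):

```python
import numpy as np

def her_relabel(episode, tolerance=0.05):
    """Relabel each step with a goal achieved later in the same episode
    (the 'future' strategy) and recompute the sparse reward.

    episode: list of (obs, achieved_goal, desired_goal) tuples.
    """
    relabelled = []
    for t, (obs, achieved, _old_goal) in enumerate(episode):
        # Pretend the goal was whatever a random future step achieved
        future = episode[np.random.randint(t, len(episode))][1]
        dist = np.linalg.norm(np.asarray(achieved) - np.asarray(future))
        reward = 0.0 if dist <= tolerance else -1.0
        relabelled.append((obs, achieved, future, reward))
    return relabelled
```

Because the relabelled transitions end in "success" by construction, the agent gets a learning signal even from episodes that never reached the originally intended target.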
from nucon.rl import NuconGoalEnv, Parameterized_Objectives, Parameterized_Terminators
from stable_baselines3 import SAC
from stable_baselines3.her.her_replay_buffer import HerReplayBuffer
env = NuconGoalEnv(
goal_params=['GENERATOR_0_KW', 'GENERATOR_1_KW', 'GENERATOR_2_KW'],
goal_range={
'GENERATOR_0_KW': (0.0, 1200.0),
'GENERATOR_1_KW': (0.0, 1200.0),
'GENERATOR_2_KW': (0.0, 1200.0),
},
tolerance=0.05, # sparse: within 5% of range counts as success (recommended with HER)
seconds_per_step=5,
simulator=simulator, # use a pre-trained simulator for fast pre-training
# Keep policy within the simulator's known data distribution.
# SIM_UNCERTAINTY (kNN-GP posterior std) is injected into obs when a simulator is active.
# Tune start/scale/threshold to taste.
additional_objectives=[Parameterized_Objectives['uncertainty_penalty'](start=0.3, scale=1.0)],
terminators=[Parameterized_Terminators['uncertainty_abort'](threshold=0.7)],
)
# Or use a preset: env = gym.make('Nucon-goal_power-v0', simulator=simulator)
model = SAC(
'MultiInputPolicy',
env,
replay_buffer_class=HerReplayBuffer,
replay_buffer_kwargs={'n_sampled_goal': 4, 'goal_selection_strategy': 'future'},
verbose=1,
learning_rate=1e-3,
batch_size=256,
tau=0.005,
gamma=0.98,
train_freq=1,
gradient_steps=1,
)
model.learn(total_timesteps=500_000)
At inference time, inject any target by constructing the observation manually:
import numpy as np
obs, _ = env.reset()
# Override the desired goal (values are normalised to [0,1] within goal_range)
obs['desired_goal'] = np.array([0.8, 0.8, 0.8], dtype=np.float32) # ~960 kW per generator
action, _ = model.predict(obs, deterministic=True)
Predefined goal environments:
- Nucon-goal_power-v0: target total generator output (3 × 0–1200 kW)
- Nucon-goal_temp-v0: target core temperature (280–380 °C)
RL algorithms require a huge number of training steps, and Nucleares is slow and cannot be trivially parallelised. That's why NuCon provides a built-in simulator.
Simulator
NuCon provides a built-in simulator to address the challenge of slow training times in the actual Nucleares game. This simulator allows for rapid prototyping and testing of control policies without the need for the full game environment. Key features include:
- Mimics the behavior of the Nucleares game API
- Configurable initial states and operating modes
- Faster than real-time simulation
- Supports parallel execution for increased training throughput
Additional Dependencies
To use it, you'll need to install torch and flask. You can do so via
pip install -e '.[sim]'
Usage
To use the NuCon simulator:
from nucon import Nucon
from nucon.sim import NuconSimulator, OperatingState
# Create a simulator instance
simulator = NuconSimulator()
# Load a dynamics model (explained later)
simulator.load_model('path/to/model.pth')
# Set initial state (optional)
simulator.set_state(OperatingState.NOMINAL)
# The web server starts automatically in __init__; access via nucon using the simulator's port
nucon = Nucon(port=simulator.port)
# Or use the simulator with NuconEnv
from nucon.rl import NuconEnv
env = NuconEnv(simulator=simulator)  # with a simulator, each step tells it to skip forward instead of waiting on the game
# Train your RL agent using the simulator
# ...
The simulator needs an accurate dynamics model of the game. NuCon provides tools to learn one from real gameplay data.
Model Learning
To address the challenge of unknown game dynamics, NuCon provides tools for collecting data, creating datasets, and training models to learn the reactor dynamics. Key features include:
- Data collection: Gathers state transitions from human play or automated agents. time_delta is specified in game-time seconds; wall-clock sleep is automatically adjusted for GAME_SIM_SPEED so collected deltas are uniform regardless of simulation speed.
- Automatic param filtering: Junk params (GAME_VERSION, TIME, ALARMS_ACTIVE, …) and params from uninstalled subsystems (which return None) are automatically excluded from model inputs/outputs.
- Two model backends: Neural network (NN) or a local Gaussian Process approximated via k-Nearest Neighbours (kNN-GP).
- Uncertainty estimation: The kNN-GP backend returns a GP posterior standard deviation alongside each prediction; 0 means the query lies on known data, ~1 means it is out of distribution.
- Dataset management: Tools for saving, loading, merging, and pruning datasets.
Additional Dependencies
pip install -e '.[model]'
Model selection
kNN-GP (the ReactorKNNModel backend) is a local Gaussian Process: it finds the k nearest neighbours in the training set, fits an RBF kernel on them, and returns a prediction plus a GP posterior std as uncertainty. It works well from a few hundred samples and requires no training. NN needs input normalisation and several thousand samples to generalise; use it once you have a large dataset. For initial experiments, start with kNN-GP (k=10).
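To make the mechanism concrete, here is a minimal local-GP prediction in the spirit of the kNN-GP backend. This is a sketch with made-up kernel parameters (unit prior variance, fixed length scale), not ReactorKNNModel's actual code:

```python
import numpy as np

def knn_gp_predict(X, y, x_query, k=10, length_scale=1.0, noise=1e-6):
    """Local GP: fit an RBF-kernel GP on the k nearest neighbours of
    x_query and return (posterior mean, posterior std)."""
    X, y = np.asarray(X, float), np.asarray(y, float)

    def rbf(a, b):
        return np.exp(-np.sum((a[:, None] - b[None]) ** 2, axis=-1)
                      / (2 * length_scale ** 2))

    # 1. Select the k nearest neighbours of the query point
    idx = np.argsort(np.linalg.norm(X - x_query, axis=1))[:k]
    Xk, yk = X[idx], y[idx]
    # 2. GP posterior on the local subset (prior variance 1)
    K = rbf(Xk, Xk) + noise * np.eye(len(Xk))
    k_star = rbf(Xk, np.asarray([x_query], float))[:, 0]
    mean = k_star @ np.linalg.solve(K, yk)
    var = 1.0 - k_star @ np.linalg.solve(K, k_star)
    return mean, np.sqrt(max(var, 0.0))
```

On a query near the training data the posterior std collapses toward 0; far from all neighbours it approaches the prior std of 1, matching the "0 = known, ~1 = out of distribution" convention described above.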
Usage
from nucon.model import NuconModelLearner
# --- Data collection ---
learner = NuconModelLearner(
time_delta=10.0, # 10 game-seconds per step (wall sleep auto-scales with sim speed)
include_valve_states=False, # set True to include all 53 valve positions as model inputs
)
learner.collect_data(num_steps=1000)
learner.save_dataset('reactor_dataset.pkl')
# Merge datasets collected across multiple sessions
learner.merge_datasets('other_session.pkl')
# --- Neural network backend ---
nn_learner = NuconModelLearner(dataset_path='reactor_dataset.pkl')
nn_learner.train_model(batch_size=32, num_epochs=50) # creates NN model on first call
# Drop samples the NN already predicts well (keep hard cases for further training)
nn_learner.drop_well_fitted(error_threshold=1.0)
nn_learner.save_model('reactor_nn.pth')
# --- kNN-GP backend ---
knn_learner = NuconModelLearner(dataset_path='reactor_dataset.pkl')
# Drop near-duplicate samples before fitting (keeps diverse coverage).
# A sample is dropped only if BOTH its input state AND output transition
# are within the given distances of an already-kept sample.
knn_learner.drop_redundant(min_state_distance=0.1, min_output_distance=0.05)
knn_learner.fit_knn(k=10) # creates kNN-GP model on first call
# Point prediction
state = knn_learner._get_state()
pred = knn_learner.model.forward(state, time_delta=10.0)
# Prediction with uncertainty
pred, uncertainty = knn_learner.predict_with_uncertainty(state, time_delta=10.0)
print(f"CORE_TEMP: {pred['CORE_TEMP']:.1f} ± {uncertainty:.3f} (std, GP posterior)")
# uncertainty ≈ 0: confident (query near known data)
# uncertainty ≈ 1: out of distribution
knn_learner.save_model('reactor_knn.pkl')
The trained models can be integrated into the NuconSimulator to provide accurate dynamics based on real game data.
Full Training Loop
The recommended end-to-end workflow for training an RL operator is an iterative cycle of real-game data collection, model fitting, and simulated training. The real game is slow and cannot be parallelised, so the bulk of RL training happens in the simulator. The game is used only as an oracle for data and evaluation.
┌────────────────────────────────────────────────────────────┐
│ 1. Human dataset collection                                │
│    Play the game: start up the reactor, operate it across  │
│    a range of states. NuCon records state transitions.     │
└───────────────────────┬────────────────────────────────────┘
                        │
                        ▼
┌────────────────────────────────────────────────────────────┐
│ 2. Initial model fitting                                   │
│    Fit NN or kNN dynamics model to the collected dataset.  │
│    kNN is instant; NN needs gradient steps but generalises │
│    better with more data.                                  │
└───────────────────────┬────────────────────────────────────┘
                        │
              ┌─────────▼──────────┐
              │ 3. Train RL        │◄───────────────────────┐
              │    in simulator    │                        │
              │    (fast, many     │                        │
              │    trajectories)   │                        │
              └─────────┬──────────┘                        │
                        │                                   │
                        ▼                                   │
              ┌─────────────────────┐                       │
              │ 4. Eval in game     │                       │
              │ + collect new data  │                       │
              │ (merge & prune      │                       │
              │    dataset)         │                       │
              └─────────┬───────────┘                       │
                        │                                   │
                        ▼                                   │
              ┌─────────────────────┐   model improved?     │
              │ 5. Refit model      ├──────── yes ──────────┘
              │    on expanded data │
              └─────────────────────┘
Step 1 — Human dataset collection: Run scripts/collect_dataset.py during your play session (see Scripts). Cover a wide range of states: startup from cold, ramping power, individual rod bank adjustments. Diversity in the dataset directly determines simulator accuracy. See Model Learning for collection details.
Step 2 — Initial model fitting: Fit a kNN-GP model (instant) or NN (better extrapolation with larger datasets) using fit_knn() or train_model(). Prune near-duplicate samples with drop_redundant() before fitting. See Model Learning.
Step 3 — Train RL in simulator: Load the fitted model into NuconSimulator, then train a NuconGoalEnv policy with SAC + HER. The simulator runs far faster than the real game, allowing many trajectories in reasonable time. Pass Parameterized_Objectives['uncertainty_penalty'] and Parameterized_Terminators['uncertainty_abort'] as additional objectives/terminators to discourage the policy from wandering into regions the model hasn't seen; SIM_UNCERTAINTY is automatically injected into the obs dict when a simulator is active. See NuconGoalEnv + HER Usage and scripts/train_sac.py for a complete example.
Step 4 — Eval in game + collect new data: Run the trained policy against the real game. This validates simulator accuracy and simultaneously collects new data from states the policy visits, which may be regions the original dataset missed. Run a second NuconModelLearner in a background thread to collect concurrently.
Step 5 — Refit model on expanded data: Merge new data into the original dataset with merge_datasets(), prune with drop_redundant(), and refit. Then return to Step 3 with the improved model. Each iteration the simulator gets more accurate and the policy improves.
Stop when the policy performs well in the real game and kNN-GP uncertainty stays low throughout an episode, indicating the policy stays within the known data distribution.
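That stopping rule can be phrased as a simple check over an evaluation episode (illustrative; the uncertainty threshold is a made-up value to tune for your setup):

```python
def should_stop(episode_uncertainties, episode_success,
                max_uncertainty=0.3):
    """Stop iterating when the policy succeeds in the real game AND the
    kNN-GP uncertainty stays low for the whole episode, i.e. the policy
    stays inside the known data distribution."""
    return episode_success and max(episode_uncertainties) <= max_uncertainty
```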
Scripts
Ready-to-run scripts in the scripts/ directory covering the most common workflows.
scripts/collect_dataset.py — collect a dynamics dataset while playing the game:
python scripts/collect_dataset.py --steps 1000 --delta 10 --out reactor_dataset.pkl
# Ctrl-C to stop early; data is saved on exit
# Merge a previous session: --merge previous.pkl
scripts/train_sac.py — train a SAC + HER goal-conditioned policy on the kNN-GP simulator:
python scripts/train_sac.py
# Expects /tmp/reactor_knn.pkl and /tmp/nucon_dataset.pkl
# Saves trained policy to /tmp/sac_nucon_knn.zip
This script is the most elaborate end-to-end example: it loads a pre-fitted kNN-GP model, seeds episode resets from dataset states, uses delta actions and an uncertainty penalty, and configures SAC + HER for fast sim training.
Testing
NuCon includes a test suite to verify its functionality and compatibility with the Nucleares game.
Running Tests
To run the tests:
- Ensure the Nucleares game is running and accessible at http://localhost:8785/ (or update the URL in the test setup).
- Install pytest: pip install pytest (or pip install -e '.[dev]')
- Run the tests:
pytest test/test_core.py
pytest test/test_sim.py
Test Coverage
The tests verify:
- Parameter types match their definitions in NuCon
- Writable parameters can be written to
- Non-writable parameters cannot be written to, even when force-writing
- Enum parameters and their custom truthy values behave correctly
- Simulator functionality and consistency
Drake Meme Generator
To use it, you'll need to install pillow. You can do so via
pip install -e '.[drake]'
Usage:
from nucon.drake import create_drake_meme
items = [
(False, "Play Nucleares manually"),
(True, "Automate it with a script"),
(False, "But the web interface is tedious to use"),
(True, "Write an elegant library to interface with the game and then use that to write the script"),
(False, "But I would still need to write the control policy by hand"),
(True, "Let's extend the library such that it trains a policy via Reinforcement Learning"),
(False, "But RL takes a huge number of training samples"),
(True, "Extend the library to also include an efficient simulator"),
(False, "But I don't know what the actual internal dynamics are"),
(True, "Extend the library once more to also include a neural network dynamics model"),
(True, "And I'm gonna put a drake meme on the README"),
(False, "Online meme generators only support a single yes/no pair"),
(True, "Let's also add a drake meme generator to the library"),
]
meme = create_drake_meme(items)
meme.save("README_meme.jpg")
Disclaimer
NuCon is an unofficial tool and is not affiliated with or endorsed by the creators of Nucleares.
Citing
What? Why would you wanna cite it? What are you even doing?
@misc{nucon,
title = {NuCon},
author = {Dominik Roth},
abstract = {NuCon is a Python library to interface with and control Nucleares, a nuclear reactor simulation game. Includes gymnasium bindings for Reinforcement Learning.},
url = {https://git.dominik-roth.eu/dodox/NuCon},
year = {2024},
}
