NuCon (Nucleares Controller) is a Python library designed to interface with and control parameters in Nucleares, a nuclear reactor simulation game. It provides a robust, type-safe foundation for reading and writing game parameters, allowing users to easily create their own automations and control systems.
NuCon further provides a reinforcement learning environment for training control policies and a simulator based on model learning.
Features
- Enum-based parameter system for type safety and code clarity
- Support for various parameter types including floats, integers, booleans, strings, and custom enums
- Read and write capabilities for game parameters
- Reinforcement learning environment for training control policies
- Built-in simulator for rapid prototyping and testing
- Model learning for dynamics prediction
Installation
To install NuCon, clone this repository and install via pip:
git clone https://git.dominik-roth.eu/dodox/NuCon
cd NuCon
pip install -e .
Usage
Here's a basic example of how to use NuCon:
from nucon import Nucon
nucon = Nucon()
# or nucon = Nucon(host='localhost', port=8786)
# Enable dummy mode for testing (optional)
nucon.set_dummy_mode(True)
# Read a parameter
core_temp = nucon.CORE_TEMP.value
print(f"Core Temperature: {core_temp}")
# >> Core Temperature: 500.0
# Read a parameter with an enum type
pump_status = nucon.COOLANT_CORE_CIRCULATION_PUMP_0_STATUS.value
print(f"Pump 0 Status: {pump_status}")
# >> Pump 0 Status: PumpStatus.INACTIVE
if pump_status:
    print('Pump 0 is active.')
# Write to a parameter (has no effect in dummy mode)
nucon.RODS_POS_ORDERED.value = 50
print(f"Rods Position Ordered: {nucon.RODS_POS_ORDERED.value}")
# >> Rods Position Ordered: 50.0
# The repr of a parameter contains all of its info
nucon.CORE_TEMP
# >> NuconParameter(id='CORE_TEMP', value=500.0, param_type=float, is_writable=False)
API Reference
The nucon instance contains all available parameters.
Parameter properties:
- nucon.<PARAMETER>.value: Get or set the current value of the parameter. Assigning a new value will write it to the game.
- nucon.<PARAMETER>.param_type: Get the type of the parameter.
- nucon.<PARAMETER>.is_writable: Check if the parameter is writable.
- nucon.<PARAMETER>.is_readable: False for write-only parameters (e.g. VALVE_OPEN, CORE_SCRAM_BUTTON). Reading raises AttributeError.
- nucon.<PARAMETER>.is_cheat: True for game-event triggers (all FUN_*). Writing raises ValueError unless cheat_mode=True.
- nucon.<PARAMETER>.enum_type: Get the enum type of the parameter if it's an enum, otherwise None.
- nucon.<PARAMETER>.unit: Unit string if defined (e.g. '°C', 'bar', '%').
Parameter methods:
- nucon.<PARAMETER>.read(): Get the current value of the parameter (alias for value).
- nucon.<PARAMETER>.write(new_value, force=False): Write a new value to the parameter. force will try to write even if the parameter is known as non-writable or out of the known allowed range.
Class methods:
- nucon.get(parameter): Get the value of a specific parameter. Also accepts string parameter names.
- nucon.set(parameter, value, force=False): Set the value of a specific parameter. Also accepts string parameter names. force bypasses writable/range/cheat checks.
- nucon.get_all_readable(): Get a dict of all readable parameters.
- nucon.get_all_writable(): Get a dict of all writable parameters (includes write-only params).
- nucon.get_all(): Get all readable parameter values as a dictionary.
- nucon.get_all_iter(): Get all readable parameter values as a generator.
- nucon.get_multiple(params): Get values for multiple specified parameters.
- nucon.get_multiple_iter(params): Get values for multiple specified parameters as a generator.
- nucon.get_game_variable_names(): Query the game for all exposed variable names (GET and POST), excluding special endpoints.
- nucon.set_dummy_mode(dummy_mode): In dummy mode, returns sensible values without connecting to the game and silently ignores writes.
- nucon.set_cheat_mode(cheat_mode): Enable writing to cheat parameters (FUN_* event triggers). Default False.
Valve API (motorized actuators: OPEN/CLOSE powers the motor, OFF holds current position):
- nucon.get_valve(name): Get the state dict for a single valve (Value, IsOpened, IsClosed, Stuck, …).
- nucon.get_valves(): Get state dicts for all 53 valves.
- nucon.open_valve(name) / nucon.open_valves(names): Power the actuator toward open.
- nucon.close_valve(name) / nucon.close_valves(names): Power the actuator toward closed.
- nucon.off_valve(name) / nucon.off_valves(names): Cut actuator power and hold the current position (normal resting state).
Custom Enum Types:
- PumpStatus: Enum for pump status (INACTIVE, ACTIVE_NO_SPEED_REACHED*, ACTIVE_SPEED_REACHED*, REQUIRES_MAINTENANCE, NOT_INSTALLED, INSUFFICIENT_ENERGY)
- PumpDryStatus: Enum for pump dry status (ACTIVE_WITHOUT_FLUID*, INACTIVE_OR_ACTIVE_WITH_FLUID)
- PumpOverloadStatus: Enum for pump overload status (ACTIVE_AND_OVERLOAD*, INACTIVE_OR_ACTIVE_NO_OVERLOAD)
- BreakerStatus: Enum for breaker status (OPEN*, CLOSED)
*: Truthy value (will be treated as true in e.g. if statements).
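As an illustration of how such truthy/falsy enum members can work (a minimal sketch of the idea, not NuCon's actual implementation), `__bool__` can be overridden on the enum class:

```python
from enum import Enum

class PumpStatus(Enum):
    # Illustrative subset of the members listed above
    INACTIVE = 0
    ACTIVE_NO_SPEED_REACHED = 1
    ACTIVE_SPEED_REACHED = 2

    def __bool__(self):
        # Only the ACTIVE_* members (marked * above) are truthy
        return self in (PumpStatus.ACTIVE_NO_SPEED_REACHED,
                        PumpStatus.ACTIVE_SPEED_REACHED)
```

This is what makes `if nucon.COOLANT_CORE_CIRCULATION_PUMP_0_STATUS.value:` read naturally in the usage example above.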
So if you're not in the mood to play the game manually, this API can be used to easily create your own automations and control systems. Maybe a little PID controller for the rods? Or, if you wanna go crazy, why not try some
Reinforcement Learning
NuCon includes a Reinforcement Learning (RL) environment based on the Gymnasium (formerly OpenAI Gym) interface. This allows you to train control policies for the Nucleares game instead of writing them yourself. Requires additional dependencies.
Additional Dependencies
To use it, you'll need to install gymnasium and numpy. You can do so via
pip install -e '.[rl]'
Environments
Two environment classes are provided in nucon/rl.py:
NuconEnv: classic fixed-objective environment. You define one or more objectives at construction time (e.g. maximise power output, keep temperature in range). The agent always trains toward the same goal.
- Observation space: all readable numeric parameters (~290 dims).
- Action space: all writable parameters whose values can be read back (~30 dims): 9 individual rod bank positions, 3 MSCVs, 3 turbine bypass valves, 6 coolant pump speeds, condenser pump, freight/vent switches, resistor banks, and more.
- Objectives: predefined strings ('max_power', 'episode_time') or arbitrary callables (obs) -> float. Multiple objectives are weighted-summed.
NuconGoalEnv: goal-conditioned environment. The desired goal (e.g. target generator output) is sampled at the start of each episode and provided as part of the observation. A single policy learns to reach any goal in the specified range, making it far more useful than a fixed-objective agent. Designed for training with Hindsight Experience Replay (HER), which makes sparse-reward goal-conditioned training tractable.
- Observation space: Dict with keys observation (non-goal params), achieved_goal (current goal param values, normalised to [0,1]), desired_goal (target, normalised to [0,1]).
- Goals are sampled uniformly from the specified goal_range each episode.
- Reward defaults to negative L2 distance in normalised goal space (dense). Pass tolerance for a sparse {0, -1} reward; this works particularly well with HER.
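The reward rule just described can be sketched in a few lines. This is illustrative only, not the code in nucon/rl.py; in particular, whether the sparse tolerance check uses the L2 distance (as assumed here) or a per-dimension check is an assumption:

```python
import numpy as np

def goal_reward(achieved, desired, tolerance=None):
    """Reward in the normalised [0, 1] goal space.

    tolerance=None  -> dense: negative L2 distance to the goal
    tolerance=0.05  -> sparse: 0 if within 5% of range, else -1
    """
    dist = np.linalg.norm(np.asarray(achieved) - np.asarray(desired))
    if tolerance is None:
        return -dist
    return 0.0 if dist <= tolerance else -1.0
```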
NuconEnv Usage
from nucon.rl import NuconEnv, Parameterized_Objectives
env = NuconEnv(objectives=['max_power'], seconds_per_step=5)
# env2 = gym.make('Nucon-max_power-v0')
# env3 = NuconEnv(objectives=[Parameterized_Objectives['target_temperature'](goal_temp=350)], objective_weights=[1.0], seconds_per_step=5)
obs, info = env.reset()
for _ in range(1000):
    action = env.action_space.sample()  # Your agent here (instead of random)
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()
env.close()
The objectives argument takes either names of predefined objectives as strings, or callables that take an observation and return a scalar reward. The final reward is the (weighted) sum across all objectives. info['objectives'] contains each objective and its value.
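Conceptually, the combination boils down to a weighted sum over callables. A sketch (the parameter names and scaling factors here are made up for illustration, not NuCon's built-in objectives):

```python
# Each objective maps an observation dict to a scalar reward.
objectives = [
    lambda obs: obs['GENERATOR_0_KW'] / 1000.0,   # reward power output
    lambda obs: -abs(obs['CORE_TEMP'] - 350.0),   # penalise temp deviation
]
weights = [1.0, 0.1]

def combined_reward(obs):
    # Weighted sum across all objectives, as NuconEnv does internally
    return sum(w * f(obs) for w, f in zip(weights, objectives))
```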
You can, for example, train a PPO agent using the stable-baselines3 (sb3) implementation:
from nucon.rl import NuconEnv
from stable_baselines3 import PPO
env = NuconEnv(objectives=['max_power'], seconds_per_step=5)
model = PPO(
"MlpPolicy",
env,
verbose=1,
learning_rate=3e-4,
n_steps=2048,
batch_size=64,
n_epochs=10,
gamma=0.99,
gae_lambda=0.95,
clip_range=0.2,
ent_coef=0.01,
)
model.learn(total_timesteps=100_000)
obs, info = env.reset()
for _ in range(1000):
    action, _states = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()
env.close()
NuconGoalEnv + HER Usage
HER works by relabelling past trajectories with the goal that was actually achieved, turning every episode into useful training signal even when the agent never reaches the intended target. This makes it much more sample-efficient than standard RL for goal-reaching tasks. This matters a lot given how slow the real game is.
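The relabelling trick can be sketched as follows (illustrative only; SB3's HerReplayBuffer handles this for you, and its actual implementation differs):

```python
import numpy as np

def her_relabel(episode, tolerance=0.05):
    """Relabel each step with a goal achieved later in the same episode
    (the 'future' strategy) and recompute the sparse reward.

    episode: list of (obs, achieved_goal, desired_goal) tuples.
    """
    relabelled = []
    for t, (obs, achieved, _old_goal) in enumerate(episode):
        # Pretend the goal was whatever a random future step achieved
        future = episode[np.random.randint(t, len(episode))][1]
        dist = np.linalg.norm(np.asarray(achieved) - np.asarray(future))
        reward = 0.0 if dist <= tolerance else -1.0
        relabelled.append((obs, achieved, future, reward))
    return relabelled
```

Because the relabelled transitions end in "success" by construction, the agent gets a learning signal even from episodes that never reached the originally intended target.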
from nucon.rl import NuconGoalEnv, Parameterized_Objectives, Parameterized_Terminators
from stable_baselines3 import SAC
from stable_baselines3.her.her_replay_buffer import HerReplayBuffer
env = NuconGoalEnv(
goal_params=['GENERATOR_0_KW', 'GENERATOR_1_KW', 'GENERATOR_2_KW'],
goal_range={
'GENERATOR_0_KW': (0.0, 1200.0),
'GENERATOR_1_KW': (0.0, 1200.0),
'GENERATOR_2_KW': (0.0, 1200.0),
},
tolerance=0.05, # sparse: within 5% of range counts as success (recommended with HER)
seconds_per_step=5,
simulator=simulator, # use a pre-trained simulator for fast pre-training
# Keep policy within the simulator's known data distribution.
# SIM_UNCERTAINTY (kNN-GP posterior std) is injected into obs when a simulator is active.
# Tune start/scale/threshold to taste.
additional_objectives=[Parameterized_Objectives['uncertainty_penalty'](start=0.3, scale=1.0)],
terminators=[Parameterized_Terminators['uncertainty_abort'](threshold=0.7)],
)
# Or use a preset: env = gym.make('Nucon-goal_power-v0', simulator=simulator)
model = SAC(
'MultiInputPolicy',
env,
replay_buffer_class=HerReplayBuffer,
replay_buffer_kwargs={'n_sampled_goal': 4, 'goal_selection_strategy': 'future'},
verbose=1,
learning_rate=1e-3,
batch_size=256,
tau=0.005,
gamma=0.98,
train_freq=1,
gradient_steps=1,
)
model.learn(total_timesteps=500_000)
At inference time, inject any target by constructing the observation manually:
import numpy as np
obs, _ = env.reset()
# Override the desired goal (values are normalised to [0,1] within goal_range)
obs['desired_goal'] = np.array([0.8, 0.8, 0.8], dtype=np.float32) # ~960 kW per generator
action, _ = model.predict(obs, deterministic=True)
Predefined goal environments:
- Nucon-goal_power-v0: target total generator output (3 × 0–1200 kW)
- Nucon-goal_temp-v0: target core temperature (280–380 °C)
RL algorithms require a huge number of training steps, and Nucleares is slow and cannot be trivially parallelised. That's why NuCon provides a built-in simulator.
Simulator
NuCon provides a built-in simulator to address the challenge of slow training times in the actual Nucleares game. This simulator allows for rapid prototyping and testing of control policies without the need for the full game environment. Key features include:
- Mimics the behavior of the Nucleares game API
- Configurable initial states and operating modes
- Faster than real-time simulation
- Supports parallel execution for increased training throughput
Additional Dependencies
To use it, you'll need to install torch and flask. You can do so via
pip install -e '.[sim]'
Usage
To use the NuCon simulator:
from nucon import Nucon
from nucon.sim import NuconSimulator, OperatingState
# Create a simulator instance
simulator = NuconSimulator()
# Load a dynamics model (explained later)
simulator.load_model('path/to/model.pth')
# Set initial state (optional)
simulator.set_state(OperatingState.NOMINAL)
# The web server starts automatically in __init__; access via nucon using the simulator's port
nucon = Nucon(port=simulator.port)
# Or use the simulator with NuconEnv
from nucon.rl import NuconEnv
env = NuconEnv(simulator=simulator)  # with a simulator, each step tells it to skip forward instead of waiting on the game
# Train your RL agent using the simulator
# ...
The simulator needs an accurate dynamics model of the game. NuCon provides tools to learn one from real gameplay data.
Model Learning
To address the challenge of unknown game dynamics, NuCon provides tools for collecting data, creating datasets, and training models to learn the reactor dynamics. Key features include:
- Data collection: Gathers state transitions from human play or automated agents. time_delta is specified in game-time seconds; wall-clock sleep is automatically adjusted for GAME_SIM_SPEED so collected deltas are uniform regardless of simulation speed.
- Automatic param filtering: Junk params (GAME_VERSION, TIME, ALARMS_ACTIVE, …) and params from uninstalled subsystems (which return None) are automatically excluded from model inputs/outputs.
- Two model backends: Neural network (NN) or a local Gaussian Process approximated via k-Nearest Neighbours (kNN-GP).
- Uncertainty estimation: The kNN-GP backend returns a GP posterior standard deviation alongside each prediction; 0 means the query lies on known data, ~1 means it is out of distribution.
- Dataset management: Tools for saving, loading, merging, and pruning datasets.
Additional Dependencies
pip install -e '.[model]'
Model selection
kNN-GP (the ReactorKNNModel backend) is a local Gaussian Process: it finds the k nearest neighbours in the training set, fits an RBF kernel on them, and returns a prediction plus a GP posterior std as uncertainty. It works well from a few hundred samples and requires no training. NN needs input normalisation and several thousand samples to generalise; use it once you have a large dataset. For initial experiments, start with kNN-GP (k=10).
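To make the mechanism concrete, here is a minimal local-GP prediction in the spirit of the kNN-GP backend. This is a sketch with made-up kernel parameters (unit prior variance, fixed length scale), not ReactorKNNModel's actual code:

```python
import numpy as np

def knn_gp_predict(X, y, x_query, k=10, length_scale=1.0, noise=1e-6):
    """Local GP: fit an RBF-kernel GP on the k nearest neighbours of
    x_query and return (posterior mean, posterior std)."""
    X, y = np.asarray(X, float), np.asarray(y, float)

    def rbf(a, b):
        return np.exp(-np.sum((a[:, None] - b[None]) ** 2, axis=-1)
                      / (2 * length_scale ** 2))

    # 1. Select the k nearest neighbours of the query point
    idx = np.argsort(np.linalg.norm(X - x_query, axis=1))[:k]
    Xk, yk = X[idx], y[idx]
    # 2. GP posterior on the local subset (prior variance 1)
    K = rbf(Xk, Xk) + noise * np.eye(len(Xk))
    k_star = rbf(Xk, np.asarray([x_query], float))[:, 0]
    mean = k_star @ np.linalg.solve(K, yk)
    var = 1.0 - k_star @ np.linalg.solve(K, k_star)
    return mean, np.sqrt(max(var, 0.0))
```

On a query near the training data the posterior std collapses toward 0; far from all neighbours it approaches the prior std of 1, matching the "0 = known, ~1 = out of distribution" convention described above.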
Usage
from nucon.model import NuconModelLearner
# --- Data collection ---
learner = NuconModelLearner(
time_delta=10.0, # 10 game-seconds per step (wall sleep auto-scales with sim speed)
include_valve_states=False, # set True to include all 53 valve positions as model inputs
)
learner.collect_data(num_steps=1000)
learner.save_dataset('reactor_dataset.pkl')
# Merge datasets collected across multiple sessions
learner.merge_datasets('other_session.pkl')
# --- Neural network backend ---
nn_learner = NuconModelLearner(dataset_path='reactor_dataset.pkl')
nn_learner.train_model(batch_size=32, num_epochs=50) # creates NN model on first call
# Drop samples the NN already predicts well (keep hard cases for further training)
nn_learner.drop_well_fitted(error_threshold=1.0)
nn_learner.save_model('reactor_nn.pth')
# --- kNN-GP backend ---
knn_learner = NuconModelLearner(dataset_path='reactor_dataset.pkl')
# Drop near-duplicate samples before fitting (keeps diverse coverage).
# A sample is dropped only if BOTH its input state AND output transition
# are within the given distances of an already-kept sample.
knn_learner.drop_redundant(min_state_distance=0.1, min_output_distance=0.05)
knn_learner.fit_knn(k=10) # creates kNN-GP model on first call
# Point prediction
state = knn_learner._get_state()
pred = knn_learner.model.forward(state, time_delta=10.0)
# Prediction with uncertainty
pred, uncertainty = knn_learner.predict_with_uncertainty(state, time_delta=10.0)
print(f"CORE_TEMP: {pred['CORE_TEMP']:.1f} ± {uncertainty:.3f} (std, GP posterior)")
# uncertainty ≈ 0: confident (query near known data)
# uncertainty ≈ 1: out of distribution
knn_learner.save_model('reactor_knn.pkl')
The trained models can be integrated into the NuconSimulator to provide accurate dynamics based on real game data.
Full Training Loop
The recommended end-to-end workflow for training an RL operator is an iterative cycle of real-game data collection, model fitting, and simulated training. The real game is slow and cannot be parallelised, so the bulk of RL training happens in the simulator. The game is used only as an oracle for data and evaluation.
┌────────────────────────────────────────────────────────────┐
│ 1. Human dataset collection                                │
│    Play the game: start up the reactor, operate it across  │
│    a range of states. NuCon records state transitions.     │
└───────────────────────┬────────────────────────────────────┘
                        │
                        ▼
┌────────────────────────────────────────────────────────────┐
│ 2. Initial model fitting                                   │
│    Fit NN or kNN dynamics model to the collected dataset.  │
│    kNN is instant; NN needs gradient steps but generalises │
│    better with more data.                                  │
└───────────────────────┬────────────────────────────────────┘
                        │
              ┌─────────▼──────────┐
              │ 3. Train RL        │◄───────────────────────┐
              │    in simulator    │                        │
              │    (fast, many     │                        │
              │    trajectories)   │                        │
              └─────────┬──────────┘                        │
                        │                                   │
                        ▼                                   │
              ┌─────────────────────┐                       │
              │ 4. Eval in game     │                       │
              │ + collect new data  │                       │
              │ (merge & prune      │                       │
              │    dataset)         │                       │
              └─────────┬───────────┘                       │
                        │                                   │
                        ▼                                   │
              ┌─────────────────────┐   model improved?     │
              │ 5. Refit model      ├──────── yes ──────────┘
              │    on expanded data │
              └─────────────────────┘
Step 1 — Human dataset collection: Run scripts/collect_dataset.py during your play session (see Scripts). Cover a wide range of states: startup from cold, ramping power, individual rod bank adjustments. Diversity in the dataset directly determines simulator accuracy. See Model Learning for collection details.
Step 2 — Initial model fitting: Fit a kNN-GP model (instant) or NN (better extrapolation with larger datasets) using fit_knn() or train_model(). Prune near-duplicate samples with drop_redundant() before fitting. See Model Learning.
Step 3 — Train RL in simulator: Load the fitted model into NuconSimulator, then train a NuconGoalEnv policy with SAC + HER. The simulator runs far faster than the real game, allowing many trajectories in reasonable time. Pass Parameterized_Objectives['uncertainty_penalty'] and Parameterized_Terminators['uncertainty_abort'] as additional objectives/terminators to discourage the policy from wandering into regions the model hasn't seen; SIM_UNCERTAINTY is automatically injected into the obs dict when a simulator is active. See NuconGoalEnv + HER Usage and scripts/train_sac.py for a complete example.
Step 4 — Eval in game + collect new data: Run the trained policy against the real game. This validates simulator accuracy and simultaneously collects new data from states the policy visits, which may be regions the original dataset missed. Run a second NuconModelLearner in a background thread to collect concurrently.
Step 5 — Refit model on expanded data: Merge new data into the original dataset with merge_datasets(), prune with drop_redundant(), and refit. Then return to Step 3 with the improved model. Each iteration the simulator gets more accurate and the policy improves.
Stop when the policy performs well in the real game and kNN-GP uncertainty stays low throughout an episode, indicating the policy stays within the known data distribution.
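That stopping rule can be phrased as a simple check over an evaluation episode (illustrative; the uncertainty threshold is a made-up value to tune for your setup):

```python
def should_stop(episode_uncertainties, episode_success,
                max_uncertainty=0.3):
    """Stop iterating when the policy succeeds in the real game AND the
    kNN-GP uncertainty stays low for the whole episode, i.e. the policy
    stays inside the known data distribution."""
    return episode_success and max(episode_uncertainties) <= max_uncertainty
```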
Scripts
Ready-to-run scripts in the scripts/ directory covering the most common workflows.
scripts/collect_dataset.py — collect a dynamics dataset while playing the game:
python scripts/collect_dataset.py --steps 1000 --delta 10 --out reactor_dataset.pkl
# Ctrl-C to stop early; data is saved on exit
# Merge a previous session: --merge previous.pkl
scripts/train_sac.py — train a SAC + HER goal-conditioned policy on the kNN-GP simulator:
python scripts/train_sac.py
# Expects /tmp/reactor_knn.pkl and /tmp/nucon_dataset.pkl
# Saves trained policy to /tmp/sac_nucon_knn.zip
This script is the most elaborate end-to-end example: it loads a pre-fitted kNN-GP model, seeds episode resets from dataset states, uses delta actions and an uncertainty penalty, and configures SAC + HER for fast sim training.
Testing
NuCon includes a test suite to verify its functionality and compatibility with the Nucleares game.
Running Tests
To run the tests:
- Ensure the Nucleares game is running and accessible at http://localhost:8785/ (or update the URL in the test setup).
- Install pytest: pip install pytest (or pip install -e '.[dev]')
- Run the tests:
pytest test/test_core.py
pytest test/test_sim.py
Test Coverage
The tests verify:
- Parameter types match their definitions in NuCon
- Writable parameters can be written to
- Non-writable parameters cannot be written to, even when force-writing
- Enum parameters and their custom truthy values behave correctly
- Simulator functionality and consistency
Drake Meme Generator
To use it, you'll need to install pillow. You can do so via
pip install -e '.[drake]'
Usage:
from nucon.drake import create_drake_meme
items = [
(False, "Play Nucleares manually"),
(True, "Automate it with a script"),
(False, "But the web interface is tedious to use"),
(True, "Write an elegant library to interface with the game and then use that to write the script"),
(False, "But I would still need to write the control policy by hand"),
(True, "Let's extend the library such that it trains a policy via Reinforcement Learning"),
(False, "But RL takes a huge number of training samples"),
(True, "Extend the library to also include an efficient simulator"),
(False, "But I don't know what the actual internal dynamics are"),
(True, "Extend the library once more to also include a neural network dynamics model"),
(True, "And I'm gonna put a drake meme on the README"),
(False, "Online meme generators only support a single yes/no pair"),
(True, "Let's also add a drake meme generator to the library"),
]
meme = create_drake_meme(items)
meme.save("README_meme.jpg")
Disclaimer
NuCon is an unofficial tool and is not affiliated with or endorsed by the creators of Nucleares.
Citing
What? Why would you wanna cite it? What are you even doing?
@misc{nucon,
title = {NuCon},
author = {Dominik Roth},
abstract = {NuCon is a Python library to interface with and control Nucleares, a nuclear reactor simulation game. Includes gymnasium bindings for Reinforcement Learning.},
url = {https://git.dominik-roth.eu/dodox/NuCon},
year = {2024},
}
