remove UncertaintyPenalty/Abort aliases; use Parameterized_Objectives/Terminators dicts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 18:58:22 +01:00 · 2026-03-12 18:58:22 +01:00 · 845ca708a7
commit 845ca708a7
parent 2c1bbc1a31
2 changed files with 3 additions and 6 deletions
--- a/README.md
+++ b/README.md
@ -196,7 +196,7 @@ env.close()
 HER works by relabelling past trajectories with the goal that was *actually achieved*, turning every episode into useful training signal even when the agent never reaches the intended target. This makes it much more sample-efficient than standard RL for goal-reaching tasks. This matters a lot given how slow the real game is.
 ```python
-from nucon.rl import NuconGoalEnv, UncertaintyPenalty, UncertaintyAbort
+from nucon.rl import NuconGoalEnv, Parameterized_Objectives, Parameterized_Terminators
 from stable_baselines3 import SAC
 from stable_baselines3.common.buffers import HerReplayBuffer
@ -214,8 +214,8 @@ env = NuconGoalEnv(
    # Keep policy within the simulator's known data distribution.
    # SIM_UNCERTAINTY (kNN-GP posterior std) is injected into obs when a simulator is active.
    # Tune start/scale/threshold to taste.
-    additional_objectives=[UncertaintyPenalty(start=0.3, scale=1.0)],  # L2 penalty above soft threshold
+    additional_objectives=[Parameterized_Objectives['uncertainty_penalty'](start=0.3, scale=1.0)],
-    terminators=[UncertaintyAbort(threshold=0.7)],            # abort episode at hard threshold
+    terminators=[Parameterized_Terminators['uncertainty_abort'](threshold=0.7)],
 )
 # Or use a preset: env = gym.make('Nucon-goal_power-v0', simulator=simulator)
--- a/nucon/rl.py
+++ b/nucon/rl.py
@ -43,9 +43,6 @@ Parameterized_Terminators = {
    "uncertainty_abort": _uncertainty_abort,  # (threshold,) -> (obs) -> float
 }
 # Convenience aliases
 UncertaintyPenalty = _uncertainty_penalty
 UncertaintyAbort   = _uncertainty_abort
 # ---------------------------------------------------------------------------