diff --git a/docs/build/doctrees/environment.pickle b/docs/build/doctrees/environment.pickle
index ae5c3a7..204d567 100644
Binary files a/docs/build/doctrees/environment.pickle and b/docs/build/doctrees/environment.pickle differ
diff --git a/docs/build/doctrees/envs/fancy/mujoco.doctree b/docs/build/doctrees/envs/fancy/mujoco.doctree
index 12330b9..90ac0e6 100644
Binary files a/docs/build/doctrees/envs/fancy/mujoco.doctree and b/docs/build/doctrees/envs/fancy/mujoco.doctree differ
diff --git a/docs/build/html/_sources/envs/fancy/mujoco.md.txt b/docs/build/html/_sources/envs/fancy/mujoco.md.txt
index 89faeee..6401cdc 100644
--- a/docs/build/html/_sources/envs/fancy/mujoco.md.txt
+++ b/docs/build/html/_sources/envs/fancy/mujoco.md.txt
@@ -18,6 +18,12 @@ A composite reward function serves as the performance metric for the RL system.
Variations of this environment are available, differing in reward structures and the optionality of randomizing the box's initial position. These variations are purposefully designed to challenge RL algorithms, enhancing their generalization and adaptation capabilities. Temporally sparse environments only provide a reward at the last timestep. Spatially sparse environments only provide a reward, if the goal is almost reached, the box is close enought to the goal and somewhat correctly aligned.
+These environments all provide smoothness metrics as part of the returned infos:
+
+- mean_squared_jerk: Averages the square of jerk (rate of acceleration change) across the motion. Lower values indicate smoother movement.
+- maximum_jerk: Identifies the highest jerk value encountered.
+- dimensionless_jerk: Normalizes the summed squared jerk by the motion's duration and peak velocity, offering a scale-independent metric of smoothness.
+
| Name | Description | Horizon | Action Dimension | Observation Dimension |
| ------------------------------------------ | -------------------------------------------------------------------- | ------- | ---------------- | --------------------- |
| `fancy/BoxPushingDense-v0` | Custom Box-pushing task with dense rewards | 100 | 3 | 13 |
@@ -49,6 +55,9 @@ Variations of the table tennis environment are available to cater to different r
| `fancy/TableTennisWind-v0` | Table Tennis task with wind effects, based on a custom environment for table tennis | 350 | 7 | 19 |
| `fancy/TableTennisGoalSwitching-v0` | Table Tennis task with goal switching, based on a custom environment for table tennis | 350 | 7 | 19 |
| `fancy/TableTennisWindReplan-v0` | Table Tennis task with wind effects and replanning, based on a custom environment for table tennis | 350 | 7 | 19 |
+| `fancy/TableTennisRndRobot-v0` | Table Tennis task with random initial robot joint positions \* | 350 | 7 | 19 |
+
+\* Random initialization of the robot's joint positions and velocities can be enabled by passing `random_pos_scale` / `random_vel_scale` to make. `TableTennisRndRobot` is equivalent to `TableTennis4D`, except that `random_pos_scale` defaults to 0.1 instead of 0.
---
@@ -89,8 +98,9 @@ A successful throw in this task is determined by the ball landing in the cup at
| `fancy/Reacher5dSparse-v0` | Sparse Reacher task with 5 links, based on Gymnasium's `gym.envs.mujoco.ReacherEnv` | 200 | 5 | 20 |
| `fancy/Reacher7d-v0` | Reacher task with 7 links, based on Gymnasium's `gym.envs.mujoco.ReacherEnv` | 200 | 7 | 22 |
| `fancy/Reacher7dSparse-v0` | Sparse Reacher task with 7 links, based on Gymnasium's `gym.envs.mujoco.ReacherEnv` | 200 | 7 | 22 |
-| `fancy/HopperJumpSparse-v0` | Hopper Jump task with sparse rewards, based on Gymnasium's `gym.envs.mujoco.Hopper` | 250 | 3 | 15 / 16\* |
| `fancy/HopperJump-v0` | Hopper Jump task with continuous rewards, based on Gymnasium's `gym.envs.mujoco.Hopper` | 250 | 3 | 15 / 16\* |
+| `fancy/HopperJumpMarkov-v0` | `fancy/HopperJump-v0`, but with an alternative reward that is Markovian. | 250 | 3 | 15 / 16\* |
+| `fancy/HopperJumpSparse-v0` | Hopper Jump task with sparse rewards, based on Gymnasium's `gym.envs.mujoco.Hopper` | 250 | 3 | 15 / 16\* |
| `fancy/AntJump-v0` | Ant Jump task, based on Gymnasium's `gym.envs.mujoco.Ant` | 200 | 8 | 119 |
| `fancy/HalfCheetahJump-v0` | HalfCheetah Jump task, based on Gymnasium's `gym.envs.mujoco.HalfCheetah` | 100 | 6 | 112 |
| `fancy/HopperJumpOnBox-v0` | Hopper Jump on Box task, based on Gymnasium's `gym.envs.mujoco.Hopper` | 250 | 4 | 16 / 100\* |
diff --git a/docs/build/html/envs/fancy/mujoco.html b/docs/build/html/envs/fancy/mujoco.html
index 11d2273..f147a6f 100644
--- a/docs/build/html/envs/fancy/mujoco.html
+++ b/docs/build/html/envs/fancy/mujoco.html
@@ -135,6 +135,12 @@
The observation space includes the sine and cosine values of the robotic joint angles, their velocities, and quaternion orientations for the end-effector and the box. The action space describes the applied torques for each joint.
A composite reward function serves as the performance metric for the RL system. It accounts for the distance to the goal, the box’s orientation, maintaining a rod within the box, achieving the rod’s desired orientation, and includes penalties for joint position and velocity limit violations, as well as an action cost for energy expenditure.
Variations of this environment are available, differing in reward structures and the optionality of randomizing the box’s initial position. These variations are purposefully designed to challenge RL algorithms, enhancing their generalization and adaptation capabilities. Temporally sparse environments only provide a reward at the last timestep. Spatially sparse environments only provide a reward, if the goal is almost reached, the box is close enought to the goal and somewhat correctly aligned.
+These environments all provide smoothness metrics as part of the returned infos:
+
+mean_squared_jerk: Averages the square of jerk (rate of acceleration change) across the motion. Lower values indicate smoother movement.
+maximum_jerk: Identifies the highest jerk value encountered.
+dimensionless_jerk: Normalizes the summed squared jerk by the motion’s duration and peak velocity, offering a scale-independent metric of smoothness.
+
Name |
@@ -228,8 +234,15 @@
7 |
19 |
+fancy/TableTennisRndRobot-v0
|
+Table Tennis task with random initial robot joint positions * |
+350 |
+7 |
+19 |
+
+* Random initialization of the robot's joint positions and velocities can be enabled by passing random_pos_scale
/ random_vel_scale
to make. TableTennisRndRobot
is equivalent to TableTennis4D
, except that random_pos_scale
defaults to 0.1 instead of 0.
@@ -335,49 +348,55 @@
7 |
22 |
+fancy/HopperJump-v0
|
+Hopper Jump task with continuous rewards, based on Gymnasium’s gym.envs.mujoco.Hopper |
+250 |
+3 |
+15 / 16* |
+
+fancy/HopperJumpMarkov-v0
|
+fancy/HopperJump-v0, but with an alternative reward that is Markovian.
|
+250 |
+3 |
+15 / 16* |
+
fancy/HopperJumpSparse-v0
|
Hopper Jump task with sparse rewards, based on Gymnasium’s gym.envs.mujoco.Hopper |
250 |
3 |
15 / 16* |
-fancy/HopperJump-v0
|
-Hopper Jump task with continuous rewards, based on Gymnasium’s gym.envs.mujoco.Hopper |
-250 |
-3 |
-15 / 16* |
-
-fancy/AntJump-v0
|
+
fancy/AntJump-v0
|
Ant Jump task, based on Gymnasium’s gym.envs.mujoco.Ant |
200 |
8 |
119 |
-fancy/HalfCheetahJump-v0
|
+
fancy/HalfCheetahJump-v0
|
HalfCheetah Jump task, based on Gymnasium’s gym.envs.mujoco.HalfCheetah |
100 |
6 |
112 |
-fancy/HopperJumpOnBox-v0
|
+
fancy/HopperJumpOnBox-v0
|
Hopper Jump on Box task, based on Gymnasium’s gym.envs.mujoco.Hopper |
250 |
4 |
16 / 100* |
-fancy/HopperThrow-v0
|
+
fancy/HopperThrow-v0
|
Hopper Throw task, based on Gymnasium’s gym.envs.mujoco.Hopper |
250 |
3 |
18 / 100* |
-fancy/HopperThrowInBasket-v0
|
+
fancy/HopperThrowInBasket-v0
|
Hopper Throw in Basket task, based on Gymnasium’s gym.envs.mujoco.Hopper |
250 |
3 |
18 / 100* |
-fancy/Walker2DJump-v0
|
+
fancy/Walker2DJump-v0
|
Walker 2D Jump task, based on Gymnasium’s gym.envs.mujoco.Walker2d |
300 |
6 |
diff --git a/docs/build/html/searchindex.js b/docs/build/html/searchindex.js
index 209c151..3c9754c 100644
--- a/docs/build/html/searchindex.js
+++ b/docs/build/html/searchindex.js
@@ -1 +1 @@
-Search.setIndex({"docnames": ["api", "envs/dmc", "envs/fancy/airhockey", "envs/fancy/classic_control", "envs/fancy/index", "envs/fancy/mujoco", "envs/meta", "envs/open_ai", "examples/dmc", "examples/general", "examples/metaworld", "examples/movement_primitives", "examples/mp_params_tuning", "examples/open_ai", "examples/pd_control_gain_tuning", "examples/replanning_envs", "generated/fancy_gym.envs", "generated/fancy_gym.register", "generated/fancy_gym.upgrade", "guide/basic_usage", "guide/episodic_rl", "guide/installation", "guide/upgrading_envs", "index"], "filenames": ["api.rst", "envs/dmc.md", "envs/fancy/airhockey.rst", "envs/fancy/classic_control.md", "envs/fancy/index.rst", "envs/fancy/mujoco.md", "envs/meta.md", "envs/open_ai.md", "examples/dmc.rst", "examples/general.rst", "examples/metaworld.rst", "examples/movement_primitives.rst", "examples/mp_params_tuning.rst", "examples/open_ai.rst", "examples/pd_control_gain_tuning.rst", "examples/replanning_envs.rst", "generated/fancy_gym.envs.rst", "generated/fancy_gym.register.rst", "generated/fancy_gym.upgrade.rst", "guide/basic_usage.rst", "guide/episodic_rl.rst", "guide/installation.rst", "guide/upgrading_envs.rst", "index.rst"], "titles": ["API", "DeepMind Control (DMC)", "AirHockey", "Classic Control", "Fancy", "Mujoco", "Metaworld", "Gymnasium", "DeepMind Control Examples", "General Usage Examples", "Metaworld Examples", "Movement Primitives Examples", "MP Params Tuning Example", "OpenAI Envs Examples", "PD Control Gain Tuning Example", "Replanning Example", "fancy_gym.envs", "fancy_gym.register", "fancy_gym.upgrade", "Basic Usage", "What is Episodic RL?", "Installation", "Creating new MP Environments", "Fancy Gym"], "terms": {"These": [1, 2, 3, 5, 7, 20], "ar": [1, 2, 3, 4, 5, 7, 8, 10, 11, 14, 17, 19, 20, 21, 22], "wrapper": [1, 8, 10, 11, 15, 17, 18, 22], "select": [1, 7, 22], "order": 1, "us": [1, 2, 5, 6, 9, 11, 15, 17, 18, 19, 20, 21, 22, 23], "our": [1, 8, 9, 10, 11, 20, 23], "motion": [1, 5, 20], 
"primit": [1, 8, 10, 13, 17, 18, 20, 22, 23], "gym": [1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 19, 22], "interfac": [1, 6, 11, 22, 23], "them": [1, 5, 6, 7, 8, 10, 11, 19, 23], "when": [1, 5, 8, 9, 10, 17, 22], "instal": [1, 10, 23], "fancy_gym": [1, 6, 8, 9, 10, 11, 12, 13, 14, 15, 19, 21, 22, 23], "option": [1, 5, 17, 18, 19, 21], "extra": 1, "e": [1, 8, 10, 11, 21, 22], "g": [1, 8, 10, 11, 22], "pip": [1, 21, 23], "all": [1, 5, 6, 9, 10, 19, 21, 23], "regular": [1, 19, 23], "task": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 19, 22], "avaibl": [1, 6, 21], "via": [1, 3, 6, 19, 21, 22, 23], "shimmi": 1, "name": [1, 3, 5, 6, 7, 8, 10, 19], "descript": [1, 3, 5, 6, 7, 19], "action": [1, 3, 5, 6, 7, 8, 9, 10, 11, 14, 15, 19, 20, 22, 23], "dim": 1, "observ": [1, 2, 3, 5, 6, 8, 9, 10, 11, 19, 20, 22, 23], "dm_control": [1, 8, 19], "acrobot": 1, "swingup": 1, "v0": [1, 2, 3, 4, 5, 7, 8, 9, 11, 12, 14, 15, 17, 18, 19, 22, 23], "underactu": 1, "doubl": 1, "pendulum": [1, 9], "torqu": [1, 5, 20], "appli": [1, 5], "second": 1, "joint": [1, 5, 22], "swing": 1, "up": [1, 4, 6, 21], "balanc": 1, "1": [1, 5, 7, 8, 9, 10, 11, 13, 14, 15, 19, 22, 23], "6": [1, 5, 6], "swingup_spars": 1, "similar": 1, "spars": [1, 5], "reward": [1, 3, 5, 8, 9, 10, 11, 13, 15, 19, 22, 23], "achiev": [1, 5, 8, 10], "ball_in_cup": [1, 8, 19], "catch": [1, 8, 19], "planar": 1, "ball": [1, 5], "cup": [1, 5], "where": [1, 2, 3, 6], "receptacl": 1, "must": [1, 6], "2": [1, 3, 5, 7, 8, 9, 10, 11, 13, 22], "8": [1, 5, 15], "cartpol": 1, "cart": 1, "pole": 1, "goal": [1, 3, 5, 10], "i": [1, 2, 5, 6, 8, 9, 10, 11, 13, 15, 17, 18, 19, 22, 23], "an": [1, 5, 6, 7, 8, 10, 17, 18, 19, 20, 22, 23], "unactu": 1, "move": 1, "start": [1, 22], "upright": 1, "5": [1, 3, 5, 8, 10, 11, 14, 15, 19, 22], "balance_spars": 1, "downward": 1, "requir": [1, 2, 3, 5, 6, 8, 10, 11, 19, 20, 22], "two_pol": 1, "extens": 1, "domain": 1, "two": [1, 5], "serial": 1, "connect": 1, "increas": [1, 9], "challeng": [1, 2, 5, 23], 
"three_pol": 1, "three": [1, 2], "further": [1, 19, 20], "11": [1, 8], "cheetah": 1, "run": [1, 8, 9, 10, 11, 13, 15], "biped": 1, "robot": [1, 2, 5, 6, 20, 23], "The": [1, 2, 3, 5, 6, 8, 10, 11, 17, 18, 19, 20, 22, 23], "proport": 1, "forward": 1, "veloc": [1, 5, 11, 14, 15, 19, 20, 22], "maximum": [1, 15], "speed": 1, "17": 1, "dog": 1, "stand": 1, "focus": [1, 2], "postur": 1, "38": 1, "223": 1, "walk": 1, "coordin": [1, 5], "movement": [1, 8, 10, 13, 17, 18, 20, 22, 23], "trot": 1, "perform": [1, 2, 5], "gait": 1, "combin": 1, "stabil": 1, "fetch": 1, "plai": [1, 5, 6], "involv": [1, 2, 6], "locomot": 1, "object": [1, 5, 6, 20], "interact": [1, 19], "232": 1, "finger": 1, "spin": 1, "rotat": 1, "bodi": 1, "hing": 1, "9": [1, 3], "turn_easi": 1, "align": [1, 5, 20], "tip": 1, "free": [1, 19, 22], "target": [1, 14], "easier": 1, "version": [1, 7, 8, 10, 13, 17, 18, 19, 21, 22], "larger": 1, "12": 1, "turn_hard": 1, "smaller": 1, "difficulti": [1, 23], "fish": [1, 8], "right": [1, 20], "itself": [1, 3], "fluid": 1, "21": [1, 5], "swim": [1, 8], "incorpor": 1, "dynam": [1, 2, 20, 23], "24": 1, "hopper": [1, 5], "One": 1, "leg": 1, "minim": 1, "torso": 1, "height": 1, "4": [1, 5, 6, 7, 9, 11, 15, 22], "15": [1, 5, 14], "hop": 1, "humanoid": 1, "simplifi": 1, "maintain": [1, 5, 19, 23], "67": 1, "specifi": [1, 5, 8, 10, 18], "aim": [1, 2], "high": [1, 3, 14], "horizont": 1, "run_pure_st": 1, "focu": [1, 3], "pure": 1, "state": [1, 15, 19], "55": 1, "humanoid_cmu": 1, "advanc": [1, 5, 6], "cmu": 1, "model": [1, 2], "56": 1, "137": 1, "lqr": 1, "lqr_2_1": 1, "linear": [1, 8, 10, 11, 22], "quadrat": 1, "regul": 1, "mass": 1, "actuat": [1, 2], "posit": [1, 5, 14, 19, 20, 22], "optim": [1, 20], "lqr_6_2": 1, "more": [1, 9, 13, 19, 20, 22, 23], "complex": [1, 2, 3, 5], "manipul": [1, 5, 6, 8, 9], "bring_bal": 1, "bring": 1, "locat": [1, 5], "initi": [1, 5], "variat": [1, 4], "44": 1, "bring_peg": 1, "peg": [1, 6], "insert_bal": 1, "insert": [1, 6], "basket": [1, 5], 
"insert_peg": 1, "slot": 1, "classic": [1, 4, 20, 23], "invert": 1, "limit": [1, 2, 5], "multipl": [1, 5, 8, 10, 11, 13, 18, 19, 22], "3": [1, 2, 5, 22], "point_mass": 1, "easi": [1, 22, 23], "point": [1, 3, 17, 22], "correspond": 1, "global": 1, "x": [1, 5], "y": [1, 5], "ax": [1, 5, 14], "hard": 1, "random": [1, 5], "gain": [1, 23], "per": [1, 5], "episod": [1, 5, 8, 9, 10, 11, 14, 19, 23], "memoryless": 1, "agent": [1, 2, 3], "quadrup": 1, "four": 1, "78": 1, "escap": 1, "environment": 1, "101": 1, "90": 1, "reacher": [1, 5, 7, 11, 13, 19], "link": [1, 3, 5], "sphere": 1, "stacker": 1, "stack_2": 1, "stack": [1, 9], "box": [1, 4, 6, 11, 20, 23], "correct": [1, 14], "placement": 1, "gripper": 1, "49": 1, "stack_4": 1, "63": 1, "swimmer": 1, "swimmer6": 1, "six": 1, "nose": 1, "insid": 1, "25": [1, 3, 5, 15], "swimmer15": 1, "fifteen": 1, "extend": 1, "14": 1, "61": 1, "walker": [1, 5], "trajectori": [1, 3, 7, 8, 10, 11, 13, 14, 19, 20, 22, 23], "horizon": [1, 3, 5, 6, 7], "dimens": [1, 3, 5, 6, 7, 22], "context": [1, 3, 5, 6, 11, 19, 20, 22], "dm_control_prodmp": 1, "A": [1, 3, 5, 6, 7, 22], "promp": [1, 7, 8, 10, 11, 13, 17, 18, 19, 20, 22, 23], "wrap": [1, 7], "1000": [1, 8, 9, 10, 11, 19, 23], "10": [1, 8, 9, 10, 11, 13, 23], "dm_control_dmp": [1, 19], "dmp": [1, 3, 6, 8, 9, 10, 11, 17, 18, 19, 20, 22, 23], "fanci": [2, 3, 5, 9, 11, 15, 19], "provid": [2, 3, 5, 7, 8, 10, 11, 17, 18, 19, 21], "access": [2, 19, 22, 23], "rang": [2, 5, 8, 9, 10, 11, 13, 15, 19, 22, 23], "environ": [2, 4, 8, 9, 10, 11, 13, 14, 15, 17, 18, 20, 21], "air": 2, "hockei": 2, "close": [2, 5, 6, 8, 10, 11, 15], "gap": 2, "between": [2, 5, 14, 19], "simul": [2, 3, 6], "learn": [2, 3, 5, 6, 11, 19, 20, 23], "real": [2, 14], "world": [2, 10], "applic": 2, "variou": [2, 5, 23], "aspect": 2, "oper": [2, 20], "deal": 2, "disturb": 2, "nois": 2, "safeti": 2, "avail": [2, 5, 19, 22], "through": [2, 11], "allow": [2, 3, 8, 10, 11, 17, 18, 19, 22], "develop": 2, "capabl": [2, 5], "differ": [2, 5, 
8, 14, 18, 20], "level": [2, 19], "includ": [2, 5, 9, 17, 18, 23], "hit": [2, 5], "defend": 2, "both": [2, 22, 23], "degre": [2, 5, 23], "freedom": [2, 5], "dof": [2, 5], "seven": [2, 5], "7": [2, 5], "configur": [2, 5, 17, 18, 22], "base": [2, 4, 8, 9, 10, 11, 13, 15, 17, 18, 20, 22, 23], "kuka": 2, "iiwa14": 2, "which": [2, 3, 5, 8, 10, 11, 13, 17], "repres": [2, 20, 22], "higher": [2, 23], "control": [2, 4, 19, 20, 22, 23], "akin": 2, "set": [2, 8, 9, 10, 17, 19, 20, 23], "particip": 2, "strategi": 2, "enabl": [2, 11, 19], "react": 2, "adapt": [2, 4, 5], "within": [2, 5], "final": [2, 5], "phase": 2, "tournament": 2, "test": [2, 19, 21], "comprehens": [2, 5, 23], "game": [2, 5, 6], "scenario": 2, "top": [2, 5, 6], "team": 2, "actual": 2, "system": [2, 5], "For": [2, 5, 8, 10, 13, 22], "detail": [2, 19, 22], "inform": [2, 5, 13, 14, 19], "rule": 2, "stage": 2, "submiss": [2, 23], "pleas": [2, 14, 18, 22], "visit": 2, "offici": 2, "websit": 2, "follow": [2, 8, 10, 11, 22], "7dof": 2, "3dof": 2, "airhockit2023": 2, "foundat": [3, 5, 21, 23], "platform": 3, "explor": [3, 23], "experi": 3, "rl": [3, 5, 23], "algorithm": [3, 5], "design": [3, 4, 5, 6, 20], "simpl": 3, "research": [3, 5, 23], "practition": 3, "fundament": 3, "principl": 3, "without": [3, 19, 22], "dimension": [3, 22], "physic": 3, "simplereach": 3, "reach": [3, 5, 6, 19], "ani": [3, 9, 17, 18, 19], "until": 3, "150": [3, 6], "time": [3, 5, 8, 10, 11, 19, 23], "thi": [3, 5, 6, 8, 9, 10, 11, 14, 19, 20, 22, 23], "space": [3, 5, 11, 20, 22], "precis": [3, 5], "toward": 3, "end": [3, 5], "200": [3, 5, 9], "longsimplereach": 3, "18": [3, 5], "viapointreach": 3, "leverag": [3, 9], "support": [3, 6, 10, 19, 20, 22, 23], "self": [3, 22], "collis": 3, "detect": 3, "onli": [3, 5, 8, 10, 11, 17, 19, 21, 22], "100": [3, 5, 7, 15], "199": 3, "viapoint": 3, "respect": 3, "holereach": [3, 9, 11], "effector": [3, 5], "need": [3, 5, 8, 10, 18, 22], "narrow": 3, "hole": [3, 6], "colld": 3, "wall": [3, 6], "fancy_dmp": 
[3, 5, 11], "holereacherfixedgo": 3, "fix": [3, 5], "attractor": 3, "30": 3, "add": [4, 8, 10, 19, 22], "coupl": 4, "new": [4, 11, 18, 19, 20, 23], "some": [4, 11, 14, 19], "exist": [4, 6, 8, 10, 11, 17, 18, 19, 22], "while": [4, 5, 15, 19, 20], "other": [4, 8, 10, 19, 22, 23], "were": 4, "build": [4, 22], "u": 4, "from": [4, 5, 6, 8, 9, 10, 14, 19, 20, 22, 23], "ground": 4, "push": [4, 6, 23], "boxpushingdens": [4, 5, 15, 23], "mujoco": [4, 9, 11, 15, 21, 23], "step": [4, 8, 9, 10, 11, 13, 14, 15, 17, 18, 20, 22, 23], "tabl": [4, 23], "tenni": [4, 23], "beer": 4, "pong": 4, "mp": [4, 8, 10, 11, 14, 17, 18, 19, 20, 23], "airhockei": [4, 23], "present": [5, 20, 23], "reinforc": [5, 6, 23], "util": 5, "versatil": 5, "franka": 5, "emika": 5, "panda": [5, 23], "arm": [5, 6], "boast": 5, "orient": 5, "defin": [5, 11, 18, 22], "its": 5, "constrain": 5, "certain": 5, "along": 5, "encompass": 5, "full": [5, 8, 10, 11, 13, 19, 22, 23], "360": 5, "z": 5, "axi": [5, 14], "": [5, 20, 23], "mission": 5, "accuraci": 5, "centimet": 5, "0": [5, 8, 9, 10, 11, 13, 14, 15, 19, 22], "radian": 5, "sine": 5, "cosin": 5, "valu": [5, 9, 14, 19], "angl": 5, "quaternion": 5, "describ": 5, "each": [5, 19], "composit": 5, "function": [5, 9, 11], "serv": 5, "metric": 5, "It": [5, 8, 10, 11, 22], "account": 5, "distanc": 5, "rod": 5, "desir": [5, 15], "penalti": 5, "violat": 5, "well": [5, 19, 22], "cost": 5, "energi": 5, "expenditur": 5, "structur": [5, 6, 8, 10, 11], "purposefulli": 5, "enhanc": [5, 20], "gener": [5, 11, 15, 19, 20, 22, 23], "tempor": 5, "last": [5, 11], "timestep": 5, "spatial": 5, "almost": 5, "enought": 5, "somewhat": 5, "correctli": 5, "custom": [5, 8, 9, 10, 11, 15, 18, 19, 22, 23], "dens": 5, "13": 5, "boxpushingtemporalspars": [5, 11], "boxpushingtemporalspatialspars": 5, "offer": [5, 23], "equip": [5, 6], "respond": 5, "incom": 5, "return": [5, 8, 9, 10, 11, 12, 13, 19, 22], "accur": 5, "oppon": 5, "side": [5, 6], "meter": 5, "65": 5, "compris": [5, 6], "decis": 5, 
"consid": 5, "successfulli": 5, "complet": [5, 20], "land": 5, "also": [5, 6, 8, 9, 10, 11, 17, 18, 19, 21], "tight": 5, "margin": 5, "20": [5, 11], "reflect": 5, "condit": [5, 15], "whether": [5, 17, 22, 23], "wa": 5, "proxim": 5, "cater": 5, "addit": [5, 17, 18, 19], "overcom": 5, "tabletennis2d": 5, "2d": 5, "350": 5, "19": 5, "tabletennis2dreplan": 5, "replan": [5, 11, 19, 23], "tabletennis4d": [5, 11, 12], "4d": 5, "22": 5, "tabletennis4dreplan": [5, 11], "tabletenniswind": 5, "wind": 5, "effect": [5, 22], "tabletennisgoalswitch": 5, "switch": 5, "tabletenniswindreplan": [5, 11], "upon": [5, 23], "throw": 5, "place": [5, 6], "larg": 5, "establish": 5, "42": [5, 18], "05": [5, 14], "angular": 5, "rel": [5, 22], "bottom": 5, "current": [5, 6, 8, 10, 19, 20, 22], "method": [5, 8, 10, 11, 20, 23], "paramet": [5, 8, 10, 11, 18, 22, 23], "expand": 5, "weight": 5, "basi": [5, 11, 20], "durat": 5, "releas": 5, "implement": [5, 11, 19, 22], "form": 5, "squar": 5, "sum": [5, 11], "across": 5, "penal": 5, "excess": 5, "forc": 5, "encourag": [5, 23], "effici": [5, 6], "t": [5, 11, 14, 15], "befor": 5, "non": [5, 18], "markovian": 5, "compon": [5, 6], "assess": 5, "chosen": [5, 20], "ensur": 5, "fall": 5, "reason": 5, "overal": 5, "specif": [5, 13, 20], "success": 5, "determin": [5, 22], "conclus": 5, "showcas": 5, "abil": 5, "predict": [5, 20], "execut": [5, 11, 19, 20, 23], "popular": 5, "parti": [5, 21], "beerpong": 5, "300": 5, "29": 5, "beerpongstepbas": 5, "beerpongfixedreleas": 5, "modifi": 5, "gymnasium": [5, 8, 9, 10, 11, 12, 13, 14, 15, 17, 18, 19, 22, 23], "v2": [5, 6, 7, 9, 10, 13, 19], "reacherspars": 5, "same": [5, 8, 10, 11, 17, 18, 19, 22], "longreach": 5, "27": 5, "longreacherspars": 5, "reacher5d": [5, 9, 11, 14, 19], "env": [5, 6, 8, 9, 10, 11, 14, 15, 17, 18, 19, 22, 23], "reacherenv": 5, "reacher5dspars": 5, "reacher7d": 5, "reacher7dspars": 5, "hopperjumpspars": 5, "jump": 5, "250": [5, 8], "16": [5, 9], "hopperjump": 5, "continu": 5, "antjump": 5, 
"ant": 5, "119": 5, "halfcheetahjump": 5, "halfcheetah": [5, 9], "112": 5, "hopperjumponbox": 5, "hopperthrow": 5, "hopperthrowinbasket": 5, "walker2djump": 5, "walker2d": 5, "depend": [5, 20, 21], "most": 5, "variant": [5, 6, 19, 23], "refer": [5, 6, 7], "fancy_promp": [5, 11, 12, 14, 19, 23], "fancy_prodmp": [5, 11, 12, 15], "dial": 6, "turn": [6, 19], "open": [6, 19, 22], "sourc": [6, 17, 18], "benchmark": [6, 23], "meta": [6, 10], "multi": 6, "50": [6, 7], "divers": 6, "featur": 6, "univers": 6, "tabletop": 6, "sawyer": 6, "varieti": [6, 11], "everydai": 6, "share": 6, "pivot": 6, "reus": 6, "acquir": 6, "relat": 6, "make": [6, 8, 9, 10, 11, 12, 13, 14, 15, 19, 22, 23], "ml1": [6, 19], "standard": [6, 8, 10, 23], "assembli": 6, "assembl": 6, "39": 6, "basketbal": 6, "bin": 6, "pick": [6, 18], "button": [6, 10], "press": [6, 10], "topdown": 6, "down": 6, "perspect": 6, "coffe": 6, "machin": 6, "pull": 6, "lever": 6, "disassembl": 6, "door": 6, "lock": 6, "unlock": 6, "hand": [6, 22], "drawer": 6, "faucet": 6, "hammer": 6, "handl": [6, 14], "out": [6, 23], "back": [6, 11], "backward": 6, "plate": 6, "slide": 6, "unplug": 6, "soccer": 6, "stick": 6, "against": 6, "shelf": 6, "sweep": 6, "contain": 6, "window": 6, "metaworld_promp": [6, 10], "metaworld_prodmp": [6, 19], "now": [6, 11], "lunar": 7, "lander": 7, "lunarland": 7, "we": [7, 8, 10, 11, 18, 19, 20, 21, 22, 23], "farama": [7, 21], "previous": 7, "openai": [7, 9, 19, 23], "doc": 7, "overview": 7, "counterpart": 7, "gym_promp": [7, 13, 19], "continuousmountaincar": 7, "fetchslidedens": 7, "v1": [7, 9, 10], "fetchreachdens": 7, "import": [8, 9, 10, 11, 12, 13, 14, 15, 19, 22, 23], "def": [8, 9, 10, 11, 12, 13, 15, 22], "example_dmc": 8, "env_id": [8, 9, 10, 11, 13, 14], "seed": [8, 9, 10, 11, 13, 14, 15, 19], "iter": [8, 9, 10, 11, 15], "render": [8, 9, 10, 11, 13, 14, 15, 19, 23], "true": [8, 9, 10, 11, 12, 13, 14, 15, 17, 19], "dmc": [8, 9, 21, 23], "ha": [8, 10, 21, 22], "domain_nam": [8, 9], "task_nam": 
[8, 9, 10], "environment_nam": [8, 9], "arg": [8, 9, 10, 11, 13, 17, 18], "either": [8, 9, 14], "determinist": [8, 9, 10, 11], "behaviour": [8, 9, 10, 11], "number": [8, 9, 10, 11, 13, 15, 19, 22], "rollout": [8, 9, 10, 11], "render_mod": [8, 9, 10, 11, 13, 15, 23], "human": [8, 9, 10, 11, 13, 15, 19, 23], "els": [8, 9, 10, 11, 13, 15], "none": [8, 9, 10, 11, 13, 15, 17, 18, 19], "ob": [8, 9, 10, 11, 13, 15], "reset": [8, 9, 10, 11, 13, 14, 15, 19, 22, 23], "print": [8, 9, 10, 11, 13, 17, 19, 22], "shape": [8, 9, 10, 14, 22], "observation_spac": [8, 9, 10, 22], "action_spac": [8, 9, 10, 11, 13, 14, 15, 19, 22, 23], "ac": [8, 10, 11, 13, 15, 22], "sampl": [8, 9, 10, 11, 13, 14, 15, 19, 22, 23], "termin": [8, 9, 10, 11, 13, 15, 19, 22, 23], "truncat": [8, 9, 10, 11, 13, 15, 19, 22, 23], "info": [8, 9, 10, 11, 13, 15, 19, 22, 23], "del": [8, 10, 15], "example_custom_dmc_and_mp": 8, "alreadi": [8, 10, 11, 13, 17, 18, 19, 22], "regist": [8, 10, 11, 13, 15, 18, 22, 23], "henc": [8, 10, 11, 19], "adjust": [8, 10, 11], "hyperparamet": [8, 10, 11], "yet": [8, 10, 11, 21, 22], "recommend": [8, 10, 11, 22, 23], "abov": [8, 9, 10, 11, 19], "you": [8, 10, 11, 17, 18, 19, 21, 22, 23], "just": [8, 10, 11, 19], "interest": [8, 10, 11], "chain": [8, 10], "those": [8, 10, 11, 21], "appreci": [8, 10, 11, 23], "pr": [8, 10, 11, 22, 23], "especi": [8, 10, 11], "repo": [8, 10, 11], "http": [8, 10, 11, 21, 23], "github": [8, 10, 11, 21, 23], "com": [8, 10, 11, 21, 23], "alrhub": [8, 10, 11, 21, 23], "accord": [8, 10], "base_env_id": [8, 10, 11, 15], "replac": [8, 10], "your": [8, 10, 14, 22, 23], "inherit": [8, 10], "rawinterfacewrapp": [8, 10, 17, 18, 22], "can": [8, 10, 11, 15, 17, 18, 19, 21, 22, 23], "case": [8, 10, 19, 22], "thei": [8, 10, 11, 20, 21], "suit": [8, 20, 23], "mpwrapper": [8, 10, 11, 15], "trajectory_generator_kwarg": [8, 10, 11, 15], "trajectory_generator_typ": [8, 10, 11, 15], "phase_generator_kwarg": [8, 10, 11, 15, 22], "phase_generator_typ": [8, 10, 11, 15, 22], 
"controller_kwarg": [8, 10, 11, 14, 15, 22], "controller_typ": [8, 10, 11, 15], "motor": 8, "p_gain": [8, 14, 22], "d_gain": [8, 14, 22], "basis_generator_kwarg": [8, 10, 11, 15, 22], "basis_generator_typ": [8, 10, 11, 15], "zero_rbf": [8, 10, 11], "num_basi": [8, 10, 11, 15, 22], "num_basis_zero_start": [8, 10, 11, 22], "exp": [8, 10, 11, 15], "alpha_phas": [8, 10, 11], "rbf": [8, 10, 11], "base_env": [8, 10, 15], "make_bb": [8, 10, 15], "black_box_kwarg": [8, 10, 15], "traj_gen_kwarg": [8, 10, 15], "phase_kwarg": [8, 10, 15], "basis_kwarg": [8, 10, 15], "call": [8, 10, 11, 19], "onc": [8, 10, 11, 19, 20], "begin": [8, 10, 11, 19], "everi": [8, 10, 11, 19, 20], "consecut": [8, 10, 11], "mode": [8, 10, 11, 14, 19], "possibl": [8, 10, 11], "chang": [8, 10, 11, 19, 22], "nth": [8, 10], "should": [8, 10, 18, 22], "displai": [8, 10], "main": [8, 9, 10, 11, 13, 15], "fals": [8, 9, 10, 11, 15, 17], "disclaim": 8, "vision": 8, "integr": [8, 22, 23], "yield": 8, "error": 8, "reach_site_featur": 8, "hybrid": [8, 10, 19], "framework": [8, 9, 10, 20, 22, 23], "dm_control_promp": 8, "becaus": 8, "longer": [8, 19], "combo": 8, "__name__": [8, 9, 10, 11, 12, 13, 15], "__main__": [8, 9, 10, 11, 12, 13, 15], "collect": [9, 14, 19, 23], "defaultdict": 9, "numpi": [9, 14, 22], "np": [9, 14, 22], "example_gener": 9, "make_env": 9, "id": [9, 15, 17, 18, 19, 22], "example_async": 9, "n_cpu": 9, "int": [9, 22], "533d": 9, "n_sampl": 9, "800": 9, "vector": 9, "multiprocess": 9, "faster": 9, "Be": 9, "awar": 9, "reduc": 9, "total": [9, 19], "length": [9, 19], "individu": [9, 20], "cpu": 9, "core": 9, "parallel": 9, "tupl": [9, 22], "done": 9, "type": [9, 17, 18, 19, 22], "ndarrai": [9, 22], "asyncvectorenv": 9, "make_rank": 9, "OR": 9, "plot": [9, 12, 14], "zero": [9, 14], "buffer": 9, "list": [9, 17, 18, 19], "would": 9, "than": 9, "request": 9, "num_env": 9, "repeat": 9, "ceil": 9, "append": 9, "f": [9, 14], "do": [9, 22], "threshold": 9, "map": 9, "lambda": [9, 15], "v": 9, "basic": 
[9, 23], "example_meta": 10, "alwai": [10, 19], "found": [10, 19, 20, 23], "here": [10, 11, 19, 20, 22, 23], "arxiv": 10, "org": 10, "pdf": 10, "1910": 10, "10897": 10, "io": 10, "todo": [10, 14], "work": [10, 14, 19], "due": 10, "issu": [10, 19], "code": 10, "example_custom_meta_and_mp": 10, "goal_object_change_mp_wrapp": 10, "might": [10, 14], "necessari": [10, 19, 22], "opengl": 10, "export": 10, "ld_preload": 10, "usr": 10, "lib": 10, "x86_64": 10, "linux": 10, "gnu": 10, "libglew": 10, "so": [10, 22], "500": [10, 11], "example_mp": [11, 13], "env_nam": [11, 13, 15], "black": [11, 20, 23], "equival": 11, "have": [11, 20, 21, 22], "creat": [11, 17, 19, 23], "take": 11, "care": 11, "extern": 11, "raw": [11, 17, 18], "parametr": [11, 20], "give": 11, "sub": [11, 19], "equal": 11, "default": [11, 17, 18, 19, 22], "over": 11, "wise": [11, 19], "aggreg": 11, "example_custom_mp": 11, "argument": [11, 17, 19], "mp_config_overrid": [11, 14, 17, 18], "wai": [11, 14, 19], "mani": 11, "class": [11, 17, 18, 22], "custom_mpwrapp": 11, "mp_config": [11, 22], "weights_scal": [11, 15], "example_fully_custom_mp": 11, "custom_env_id": 11, "custom_env_id_dmp": 11, "custom_env_id_promp": 11, "upgrad": [11, 17, 22, 23], "mp_wrapper": [11, 15, 17, 18, 22], "add_mp_typ": [11, 17, 18], "base_id": [11, 18], "try": [11, 19, 23], "don": 11, "correlcti": 11, "except": [11, 19], "pass": [11, 17], "example_fully_custom_mp_altern": 11, "instead": [11, 17, 18, 20, 22], "mp_arg": 11, "dure": 11, "registr": [11, 18], "prodmp": [11, 15, 17, 18, 19, 20, 22, 23], "boxpushingdensereplan": [11, 15], "alter": 11, "obs1": 11, "compare_bases_shap": 12, "env1_id": 12, "env2_id": 12, "env1": 12, "traj_gen": [12, 13], "show_scaled_basi": 12, "env2": 12, "stuff": 13, "look": [13, 19, 22], "boolean": [13, 22], "ordereddict": 14, "matplotlib": 14, "pyplot": 14, "plt": 14, "howev": [14, 19, 22], "verifi": 14, "extract": 14, "below": 14, "w": 14, "po": [14, 15], "vel": [14, 15], "get_trajectori": 14, 
"base_shap": 14, "actual_po": 14, "len": 14, "actual_vel": 14, "act": 14, "ion": 14, "fig": 14, "figur": 14, "add_subplot": 14, "img": 14, "imshow": 14, "rgb_arrai": 14, "show": [14, 19], "des_po": 14, "des_vel": 14, "enumer": 14, "zip": 14, "tracking_control": 14, "get_act": 14, "current_po": [14, 22], "current_vel": [14, 22], "clip": 14, "low": 14, "set_data": 14, "canva": 14, "draw": 14, "flush_ev": 14, "figsiz": 14, "subplot": 14, "131": 14, "titl": [14, 23], "p1": 14, "c": 14, "c0": 14, "label": 14, "p2": 14, "c1": 14, "xlabel": 14, "gca": 14, "get_legend_handles_label": 14, "by_label": 14, "legend": 14, "kei": [14, 19], "132": 14, "133": 14, "std": 14, "example_run_replanning_env": 15, "break": 15, "example_custom_replanning_env": 15, "box_push": 15, "max_planning_tim": 15, "plan": 15, "replanning_schedul": 15, "trigger": 15, "condition_on_desir": 15, "boundari": [15, 23], "next": 15, "str": [17, 18], "entry_point": [17, 22], "union": [17, 22], "callabl": 17, "black_box": [17, 18], "raw_interface_wrapp": [17, 18], "registri": [17, 18], "defaultmpwrapp": [17, 18], "register_step_bas": 17, "bool": [17, 22], "dict": [17, 18], "kwarg": 17, "If": [17, 19, 21, 22, 23], "want": [17, 21, 23], "uniqu": [17, 18, 20], "identifi": [17, 18], "entri": 17, "srtep": 17, "dictionari": [17, 18, 19], "overrid": [17, 18], "keyword": 17, "constructor": 17, "note": [17, 18], "otherwis": [17, 18], "given": [17, 19, 22], "string": 17, "notat": 17, "warn": 17, "messag": 17, "suggest": 17, "exampl": [17, 18, 19, 22], "To": [17, 18, 19, 23], "myenv": [17, 18], "myenvclass": 17, "my_modul": 17, "expect": 18, "known_mp": 18, "Will": [18, 23], "match": [18, 22], "wish": 18, "one": [18, 22, 23], "alongsid": 18, "custommpwrapp": 18, "param": [18, 23], "prepar": 19, "ad": 19, "namespac": 19, "legaci": [19, 21], "rais": [19, 22], "metaworld": [19, 20, 21, 23], "n": 19, "cumul": 19, "part": [19, 22], "mainli": 19, "meant": 19, "debug": 19, "log": 19, "train": 19, "step_act": 19, "output": 19, 
"step_observ": 19, "intermedi": 19, "step_reward": 19, "trajectory_length": 19, "underli": 19, "origin": 19, "In": [19, 22], "miss": 19, "fill": 19, "_": 19, "keep": 19, "mind": 19, "process": 19, "split": 19, "lean": 19, "still": [19, 22], "beta": 19, "feel": [19, 22], "problem": 19, "occur": 19, "directli": [19, 22], "gym_": 19, "again": 19, "conveni": 19, "variabl": 19, "store": 19, "all_movement_primitive_environ": 19, "all_fancy_movement_primitive_environ": 19, "all_gym_movement_primitive_environ": 19, "deepmind": [19, 23], "all_dmc_movement_primitive_environ": 19, "all_metaworld_movement_primitive_environ": 19, "movement_primitive_environments_for_n": 19, "my_custom_namespac": 19, "tradit": 20, "concept": 20, "stochast": 20, "search": 20, "commonli": 20, "produc": 20, "like": [20, 21], "probabilist": [20, 23], "convert": 20, "track": 20, "pd": [20, 23], "tailor": 20, "addition": 20, "special": 20, "overarch": 20, "remain": 20, "polici": 20, "craft": 20, "accommod": 20, "contextu": [20, 22], "At": 20, "onset": 20, "subset": 20, "demand": 20, "virtual": 21, "venv": 21, "3rd": 21, "altern": [21, 23], "poetri": 21, "conda": 21, "few": 21, "choos": 21, "box2d": 21, "jax": 21, "automat": 21, "date": 21, "sinc": 21, "git": 21, "c822f28f582ba1ad49eb5dcf61016566f28003ba": 21, "egg": 21, "clone": 21, "repositori": 21, "go": 21, "folder": 21, "cd": 21, "manual": 21, "guid": 22, "explain": 22, "how": 22, "abc": 22, "abstractmethod": 22, "properti": 22, "context_mask": 22, "mask": 22, "filter": 22, "unwant": 22, "unnecessari": 22, "after": 22, "first": 22, "receiv": 22, "arrai": 22, "indic": 22, "ones": 22, "dtype": 22, "float": 22, "exclus": 22, "regardless": 22, "indirectli": 22, "notimplementederror": 22, "overitten": 22, "attribut": 22, "document": 22, "mp_pytorch": 22, "userguid": 22, "anoth": 22, "merg": 22, "num_basis_zero_go": 22, "rough": 22, "outlin": 22, "shown": 22, "simpli": 22, "cool_new_env": 22, "my_custom_mpwrapp": 22, "my_custom_env": 22, 
"custom_prodmp": 22, "built": 23, "fork": 23, "renown": 23, "librari": 23, "sever": 23, "etc": 23, "With": 23, "straightforward": 23, "transform": 23, "compat": 23, "contribut": 23, "own": 23, "re": 23, "inspir": 23, "assist": 23, "highli": 23, "randomli": 23, "sleep": 23, "metadata": 23, "render_fp": 23, "about": 23, "pypi": 23, "master": 23, "what": 23, "usag": 23, "tune": 23, "public": 23, "softwar": 23, "author": 23, "otto": 23, "fabian": 23, "celik": 23, "onur": 23, "roth": 23, "dominik": 23, "zhou": 23, "hongyi": 23, "abstract": 23, "unifi": 23, "approach": 23, "url": 23, "organ": 23, "autonom": 23, "lab": 23, "alr": 23, "kit": 23}, "objects": {"fancy_gym": [[16, 0, 0, "-", "envs"], [17, 1, 1, "", "register"], [18, 1, 1, "", "upgrade"]]}, "objtypes": {"0": "py:module", "1": "py:function"}, "objnames": {"0": ["py", "module", "Python module"], "1": ["py", "function", "Python function"]}, "titleterms": {"api": [0, 23], "deepmind": [1, 8], "control": [1, 3, 8, 14], "dmc": 1, "step": [1, 3, 5, 6, 7, 19], "base": [1, 3, 5, 6, 7, 19], "environ": [1, 3, 5, 6, 7, 19, 22, 23], "mp": [1, 3, 5, 6, 7, 12, 22], "airhockei": 2, "classic": 3, "fanci": [4, 23], "mujoco": 5, "box": [5, 19], "push": 5, "tabl": 5, "tenni": 5, "beer": 5, "pong": 5, "variat": 5, "exist": 5, "metaworld": [6, 10], "gymnasium": 7, "exampl": [8, 9, 10, 11, 12, 13, 14, 15, 23], "gener": 9, "usag": [9, 19], "movement": 11, "primit": 11, "param": 12, "tune": [12, 14], "openai": 13, "env": [13, 16], "pd": 14, "gain": 14, "replan": 15, "fancy_gym": [16, 17, 18], "regist": 17, "upgrad": 18, "basic": 19, "black": 19, "what": 20, "i": 20, "episod": 20, "rl": 20, "instal": 21, "from": 21, "pypi": 21, "recommend": 21, "master": 21, "creat": 22, "new": 22, "gym": 23, "kei": 23, "featur": 23, "quickstart": 23, "guid": 23, "user": 23, "cite": 23, "project": 23, "icon": 23, "attribut": 23}, "envversion": {"sphinx.domains.c": 2, "sphinx.domains.changeset": 1, "sphinx.domains.citation": 1, "sphinx.domains.cpp": 8, 
"sphinx.domains.index": 1, "sphinx.domains.javascript": 2, "sphinx.domains.math": 2, "sphinx.domains.python": 3, "sphinx.domains.rst": 2, "sphinx.domains.std": 2, "sphinx.ext.viewcode": 1, "sphinx": 57}, "alltitles": {"API": [[0, "api"], [23, null]], "DeepMind Control (DMC)": [[1, "deepmind-control-dmc"]], "Step-Based Environments": [[1, "step-based-environments"], [3, "step-based-environments"], [5, "step-based-environments"], [6, "step-based-environments"], [7, "step-based-environments"], [19, "step-based-environments"]], "MP Environments": [[1, "mp-environments"], [3, "mp-environments"], [5, "mp-environments"], [6, "mp-environments"], [7, "mp-environments"]], "AirHockey": [[2, "airhockey"]], "Classic Control": [[3, "classic-control"]], "Fancy": [[4, "fancy"]], "Mujoco": [[5, "mujoco"]], "Box Pushing": [[5, "box-pushing"]], "Table Tennis": [[5, "table-tennis"]], "Beer Pong": [[5, "beer-pong"]], "Variations of existing environments": [[5, "variations-of-existing-environments"]], "Metaworld": [[6, "metaworld"]], "Gymnasium": [[7, "gymnasium"]], "DeepMind Control Examples": [[8, "deepmind-control-examples"]], "General Usage Examples": [[9, "general-usage-examples"]], "Metaworld Examples": [[10, "metaworld-examples"]], "Movement Primitives Examples": [[11, "movement-primitives-examples"]], "MP Params Tuning Example": [[12, "mp-params-tuning-example"]], "OpenAI Envs Examples": [[13, "openai-envs-examples"]], "PD Control Gain Tuning Example": [[14, "pd-control-gain-tuning-example"]], "Replanning Example": [[15, "replanning-example"]], "fancy_gym.envs": [[16, "module-fancy_gym.envs"]], "fancy_gym.register": [[17, "fancy-gym-register"]], "fancy_gym.upgrade": [[18, "fancy-gym-upgrade"]], "Basic Usage": [[19, "basic-usage"]], "Black-Box Environments": [[19, "black-box-environments"]], "What is Episodic RL?": [[20, "what-is-episodic-rl"]], "Installation": [[21, "installation"]], "Installation from PyPI (recommended)": [[21, "installation-from-pypi-recommended"]], 
"Installation from master": [[21, "installation-from-master"]], "Creating new MP Environments": [[22, "creating-new-mp-environments"]], "Fancy Gym": [[23, "fancy-gym"]], "Key Features": [[23, "key-features"]], "Quickstart Guide": [[23, "quickstart-guide"]], "User Guide": [[23, null]], "Environments": [[23, null]], "Examples": [[23, null]], "Citing the Project": [[23, "citing-the-project"]], "Icon Attribution": [[23, "icon-attribution"]]}, "indexentries": {"fancy_gym.envs": [[16, "module-fancy_gym.envs"]], "module": [[16, "module-fancy_gym.envs"]], "register() (in module fancy_gym)": [[17, "fancy_gym.register"]], "upgrade() (in module fancy_gym)": [[18, "fancy_gym.upgrade"]]}})
\ No newline at end of file
+Search.setIndex({"docnames": ["api", "envs/dmc", "envs/fancy/airhockey", "envs/fancy/classic_control", "envs/fancy/index", "envs/fancy/mujoco", "envs/meta", "envs/open_ai", "examples/dmc", "examples/general", "examples/metaworld", "examples/movement_primitives", "examples/mp_params_tuning", "examples/open_ai", "examples/pd_control_gain_tuning", "examples/replanning_envs", "generated/fancy_gym.envs", "generated/fancy_gym.register", "generated/fancy_gym.upgrade", "guide/basic_usage", "guide/episodic_rl", "guide/installation", "guide/upgrading_envs", "index"], "filenames": ["api.rst", "envs/dmc.md", "envs/fancy/airhockey.rst", "envs/fancy/classic_control.md", "envs/fancy/index.rst", "envs/fancy/mujoco.md", "envs/meta.md", "envs/open_ai.md", "examples/dmc.rst", "examples/general.rst", "examples/metaworld.rst", "examples/movement_primitives.rst", "examples/mp_params_tuning.rst", "examples/open_ai.rst", "examples/pd_control_gain_tuning.rst", "examples/replanning_envs.rst", "generated/fancy_gym.envs.rst", "generated/fancy_gym.register.rst", "generated/fancy_gym.upgrade.rst", "guide/basic_usage.rst", "guide/episodic_rl.rst", "guide/installation.rst", "guide/upgrading_envs.rst", "index.rst"], "titles": ["API", "DeepMind Control (DMC)", "AirHockey", "Classic Control", "Fancy", "Mujoco", "Metaworld", "Gymnasium", "DeepMind Control Examples", "General Usage Examples", "Metaworld Examples", "Movement Primitives Examples", "MP Params Tuning Example", "OpenAI Envs Examples", "PD Control Gain Tuning Example", "Replanning Example", "fancy_gym.envs", "fancy_gym.register", "fancy_gym.upgrade", "Basic Usage", "What is Episodic RL?", "Installation", "Creating new MP Environments", "Fancy Gym"], "terms": {"These": [1, 2, 3, 5, 7, 20], "ar": [1, 2, 3, 4, 5, 7, 8, 10, 11, 14, 17, 19, 20, 21, 22], "wrapper": [1, 8, 10, 11, 15, 17, 18, 22], "select": [1, 7, 22], "order": 1, "us": [1, 2, 5, 6, 9, 11, 15, 17, 18, 19, 20, 21, 22, 23], "our": [1, 8, 9, 10, 11, 20, 23], "motion": [1, 5, 20], 
"primit": [1, 8, 10, 13, 17, 18, 20, 22, 23], "gym": [1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 19, 22], "interfac": [1, 6, 11, 22, 23], "them": [1, 5, 6, 7, 8, 10, 11, 19, 23], "when": [1, 5, 8, 9, 10, 17, 22], "instal": [1, 10, 23], "fancy_gym": [1, 6, 8, 9, 10, 11, 12, 13, 14, 15, 19, 21, 22, 23], "option": [1, 5, 17, 18, 19, 21], "extra": 1, "e": [1, 8, 10, 11, 21, 22], "g": [1, 8, 10, 11, 22], "pip": [1, 21, 23], "all": [1, 5, 6, 9, 10, 19, 21, 23], "regular": [1, 19, 23], "task": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 19, 22], "avaibl": [1, 6, 21], "via": [1, 3, 6, 19, 21, 22, 23], "shimmi": 1, "name": [1, 3, 5, 6, 7, 8, 10, 19], "descript": [1, 3, 5, 6, 7, 19], "action": [1, 3, 5, 6, 7, 8, 9, 10, 11, 14, 15, 19, 20, 22, 23], "dim": 1, "observ": [1, 2, 3, 5, 6, 8, 9, 10, 11, 19, 20, 22, 23], "dm_control": [1, 8, 19], "acrobot": 1, "swingup": 1, "v0": [1, 2, 3, 4, 5, 7, 8, 9, 11, 12, 14, 15, 17, 18, 19, 22, 23], "underactu": 1, "doubl": 1, "pendulum": [1, 9], "torqu": [1, 5, 20], "appli": [1, 5], "second": 1, "joint": [1, 5, 22], "swing": 1, "up": [1, 4, 6, 21], "balanc": 1, "1": [1, 5, 7, 8, 9, 10, 11, 13, 14, 15, 19, 22, 23], "6": [1, 5, 6], "swingup_spars": 1, "similar": 1, "spars": [1, 5], "reward": [1, 3, 5, 8, 9, 10, 11, 13, 15, 19, 22, 23], "achiev": [1, 5, 8, 10], "ball_in_cup": [1, 8, 19], "catch": [1, 8, 19], "planar": 1, "ball": [1, 5], "cup": [1, 5], "where": [1, 2, 3, 6], "receptacl": 1, "must": [1, 6], "2": [1, 3, 5, 7, 8, 9, 10, 11, 13, 22], "8": [1, 5, 15], "cartpol": 1, "cart": 1, "pole": 1, "goal": [1, 3, 5, 10], "i": [1, 2, 5, 6, 8, 9, 10, 11, 13, 15, 17, 18, 19, 22, 23], "an": [1, 5, 6, 7, 8, 10, 17, 18, 19, 20, 22, 23], "unactu": 1, "move": 1, "start": [1, 22], "upright": 1, "5": [1, 3, 5, 8, 10, 11, 14, 15, 19, 22], "balance_spars": 1, "downward": 1, "requir": [1, 2, 3, 5, 6, 8, 10, 11, 19, 20, 22], "two_pol": 1, "extens": 1, "domain": 1, "two": [1, 5], "serial": 1, "connect": 1, "increas": [1, 9], "challeng": [1, 2, 5, 23], 
"three_pol": 1, "three": [1, 2], "further": [1, 19, 20], "11": [1, 8], "cheetah": 1, "run": [1, 8, 9, 10, 11, 13, 15], "biped": 1, "robot": [1, 2, 5, 6, 20, 23], "The": [1, 2, 3, 5, 6, 8, 10, 11, 17, 18, 19, 20, 22, 23], "proport": 1, "forward": 1, "veloc": [1, 5, 11, 14, 15, 19, 20, 22], "maximum": [1, 15], "speed": [1, 5], "17": 1, "dog": 1, "stand": 1, "focus": [1, 2], "postur": 1, "38": 1, "223": 1, "walk": 1, "coordin": [1, 5], "movement": [1, 5, 8, 10, 13, 17, 18, 20, 22, 23], "trot": 1, "perform": [1, 2, 5], "gait": 1, "combin": 1, "stabil": 1, "fetch": 1, "plai": [1, 5, 6], "involv": [1, 2, 6], "locomot": 1, "object": [1, 5, 6, 20], "interact": [1, 19], "232": 1, "finger": 1, "spin": 1, "rotat": 1, "bodi": 1, "hing": 1, "9": [1, 3], "turn_easi": 1, "align": [1, 5, 20], "tip": 1, "free": [1, 19, 22], "target": [1, 14], "easier": 1, "version": [1, 7, 8, 10, 13, 17, 18, 19, 21, 22], "larger": 1, "12": 1, "turn_hard": 1, "smaller": 1, "difficulti": [1, 23], "fish": [1, 8], "right": [1, 20], "itself": [1, 3], "fluid": 1, "21": [1, 5], "swim": [1, 8], "incorpor": 1, "dynam": [1, 2, 20, 23], "24": 1, "hopper": [1, 5], "One": 1, "leg": 1, "minim": 1, "torso": 1, "height": 1, "4": [1, 5, 6, 7, 9, 11, 15, 22], "15": [1, 5, 14], "hop": 1, "humanoid": 1, "simplifi": 1, "maintain": [1, 5, 19, 23], "67": 1, "specifi": [1, 5, 8, 10, 18], "aim": [1, 2], "high": [1, 3, 14], "horizont": 1, "run_pure_st": 1, "focu": [1, 3], "pure": 1, "state": [1, 15, 19], "55": 1, "humanoid_cmu": 1, "advanc": [1, 5, 6], "cmu": 1, "model": [1, 2], "56": 1, "137": 1, "lqr": 1, "lqr_2_1": 1, "linear": [1, 8, 10, 11, 22], "quadrat": 1, "regul": 1, "mass": 1, "actuat": [1, 2], "posit": [1, 5, 14, 19, 20, 22], "optim": [1, 20], "lqr_6_2": 1, "more": [1, 9, 13, 19, 20, 22, 23], "complex": [1, 2, 3, 5], "manipul": [1, 5, 6, 8, 9], "bring_bal": 1, "bring": 1, "locat": [1, 5], "initi": [1, 5], "variat": [1, 4], "44": 1, "bring_peg": 1, "peg": [1, 6], "insert_bal": 1, "insert": [1, 6], "basket": [1, 
5], "insert_peg": 1, "slot": 1, "classic": [1, 4, 20, 23], "invert": 1, "limit": [1, 2, 5], "multipl": [1, 5, 8, 10, 11, 13, 18, 19, 22], "3": [1, 2, 5, 22], "point_mass": 1, "easi": [1, 22, 23], "point": [1, 3, 17, 22], "correspond": 1, "global": 1, "x": [1, 5], "y": [1, 5], "ax": [1, 5, 14], "hard": 1, "random": [1, 5], "gain": [1, 23], "per": [1, 5], "episod": [1, 5, 8, 9, 10, 11, 14, 19, 23], "memoryless": 1, "agent": [1, 2, 3], "quadrup": 1, "four": 1, "78": 1, "escap": 1, "environment": 1, "101": 1, "90": 1, "reacher": [1, 5, 7, 11, 13, 19], "link": [1, 3, 5], "sphere": 1, "stacker": 1, "stack_2": 1, "stack": [1, 9], "box": [1, 4, 6, 11, 20, 23], "correct": [1, 14], "placement": 1, "gripper": 1, "49": 1, "stack_4": 1, "63": 1, "swimmer": 1, "swimmer6": 1, "six": 1, "nose": 1, "insid": 1, "25": [1, 3, 5, 15], "swimmer15": 1, "fifteen": 1, "extend": 1, "14": 1, "61": 1, "walker": [1, 5], "trajectori": [1, 3, 7, 8, 10, 11, 13, 14, 19, 20, 22, 23], "horizon": [1, 3, 5, 6, 7], "dimens": [1, 3, 5, 6, 7, 22], "context": [1, 3, 5, 6, 11, 19, 20, 22], "dm_control_prodmp": 1, "A": [1, 3, 5, 6, 7, 22], "promp": [1, 7, 8, 10, 11, 13, 17, 18, 19, 20, 22, 23], "wrap": [1, 7], "1000": [1, 8, 9, 10, 11, 19, 23], "10": [1, 8, 9, 10, 11, 13, 23], "dm_control_dmp": [1, 19], "dmp": [1, 3, 6, 8, 9, 10, 11, 17, 18, 19, 20, 22, 23], "fanci": [2, 3, 5, 9, 11, 15, 19], "provid": [2, 3, 5, 7, 8, 10, 11, 17, 18, 19, 21], "access": [2, 19, 22, 23], "rang": [2, 5, 8, 9, 10, 11, 13, 15, 19, 22, 23], "environ": [2, 4, 8, 9, 10, 11, 13, 14, 15, 17, 18, 20, 21], "air": 2, "hockei": 2, "close": [2, 5, 6, 8, 10, 11, 15], "gap": 2, "between": [2, 5, 14, 19], "simul": [2, 3, 6], "learn": [2, 3, 5, 6, 11, 19, 20, 23], "real": [2, 14], "world": [2, 10], "applic": 2, "variou": [2, 5, 23], "aspect": 2, "oper": [2, 20], "deal": 2, "disturb": 2, "nois": 2, "safeti": 2, "avail": [2, 5, 19, 22], "through": [2, 11], "allow": [2, 3, 8, 10, 11, 17, 18, 19, 22], "develop": 2, "capabl": [2, 5], "differ": [2, 
5, 8, 14, 18, 20], "level": [2, 19], "includ": [2, 5, 9, 17, 18, 23], "hit": [2, 5], "defend": 2, "both": [2, 22, 23], "degre": [2, 5, 23], "freedom": [2, 5], "dof": [2, 5], "seven": [2, 5], "7": [2, 5], "configur": [2, 5, 17, 18, 22], "base": [2, 4, 8, 9, 10, 11, 13, 15, 17, 18, 20, 22, 23], "kuka": 2, "iiwa14": 2, "which": [2, 3, 5, 8, 10, 11, 13, 17], "repres": [2, 20, 22], "higher": [2, 23], "control": [2, 4, 19, 20, 22, 23], "akin": 2, "set": [2, 5, 8, 9, 10, 17, 19, 20, 23], "particip": 2, "strategi": 2, "enabl": [2, 5, 11, 19], "react": 2, "adapt": [2, 4, 5], "within": [2, 5], "final": [2, 5], "phase": 2, "tournament": 2, "test": [2, 19, 21], "comprehens": [2, 5, 23], "game": [2, 5, 6], "scenario": 2, "top": [2, 5, 6], "team": 2, "actual": 2, "system": [2, 5], "For": [2, 5, 8, 10, 13, 22], "detail": [2, 19, 22], "inform": [2, 5, 13, 14, 19], "rule": 2, "stage": 2, "submiss": [2, 23], "pleas": [2, 14, 18, 22], "visit": 2, "offici": 2, "websit": 2, "follow": [2, 8, 10, 11, 22], "7dof": 2, "3dof": 2, "airhockit2023": 2, "foundat": [3, 5, 21, 23], "platform": 3, "explor": [3, 23], "experi": 3, "rl": [3, 5, 23], "algorithm": [3, 5], "design": [3, 4, 5, 6, 20], "simpl": 3, "research": [3, 5, 23], "practition": 3, "fundament": 3, "principl": 3, "without": [3, 19, 22], "dimension": [3, 22], "physic": 3, "simplereach": 3, "reach": [3, 5, 6, 19], "ani": [3, 9, 17, 18, 19], "until": 3, "150": [3, 6], "time": [3, 5, 8, 10, 11, 19, 23], "thi": [3, 5, 6, 8, 9, 10, 11, 14, 19, 20, 22, 23], "space": [3, 5, 11, 20, 22], "precis": [3, 5], "toward": 3, "end": [3, 5], "200": [3, 5, 9], "longsimplereach": 3, "18": [3, 5], "viapointreach": 3, "leverag": [3, 9], "support": [3, 6, 10, 19, 20, 22, 23], "self": [3, 22], "collis": 3, "detect": 3, "onli": [3, 5, 8, 10, 11, 17, 19, 21, 22], "100": [3, 5, 7, 15], "199": 3, "viapoint": 3, "respect": 3, "holereach": [3, 9, 11], "effector": [3, 5], "need": [3, 5, 8, 10, 18, 22], "narrow": 3, "hole": [3, 6], "colld": 3, "wall": [3, 6], 
"fancy_dmp": [3, 5, 11], "holereacherfixedgo": 3, "fix": [3, 5], "attractor": 3, "30": 3, "add": [4, 8, 10, 19, 22], "coupl": 4, "new": [4, 11, 18, 19, 20, 23], "some": [4, 11, 14, 19], "exist": [4, 6, 8, 10, 11, 17, 18, 19, 22], "while": [4, 5, 15, 19, 20], "other": [4, 8, 10, 19, 22, 23], "were": 4, "build": [4, 22], "u": 4, "from": [4, 5, 6, 8, 9, 10, 14, 19, 20, 22, 23], "ground": 4, "push": [4, 6, 23], "boxpushingdens": [4, 5, 15, 23], "mujoco": [4, 9, 11, 15, 21, 23], "step": [4, 8, 9, 10, 11, 13, 14, 15, 17, 18, 20, 22, 23], "tabl": [4, 23], "tenni": [4, 23], "beer": 4, "pong": 4, "mp": [4, 8, 10, 11, 14, 17, 18, 19, 20, 23], "airhockei": [4, 23], "present": [5, 20, 23], "reinforc": [5, 6, 23], "util": 5, "versatil": 5, "franka": 5, "emika": 5, "panda": [5, 23], "arm": [5, 6], "boast": 5, "orient": 5, "defin": [5, 11, 18, 22], "its": 5, "constrain": 5, "certain": 5, "along": 5, "encompass": 5, "full": [5, 8, 10, 11, 13, 19, 22, 23], "360": 5, "z": 5, "axi": [5, 14], "": [5, 20, 23], "mission": 5, "accuraci": 5, "centimet": 5, "0": [5, 8, 9, 10, 11, 13, 14, 15, 19, 22], "radian": 5, "sine": 5, "cosin": 5, "valu": [5, 9, 14, 19], "angl": 5, "quaternion": 5, "describ": 5, "each": [5, 19], "composit": 5, "function": [5, 9, 11], "serv": 5, "metric": 5, "It": [5, 8, 10, 11, 22], "account": 5, "distanc": 5, "rod": 5, "desir": [5, 15], "penalti": 5, "violat": 5, "well": [5, 19, 22], "cost": 5, "energi": 5, "expenditur": 5, "structur": [5, 6, 8, 10, 11], "purposefulli": 5, "enhanc": [5, 20], "gener": [5, 11, 15, 19, 20, 22, 23], "tempor": 5, "last": [5, 11], "timestep": 5, "spatial": 5, "almost": 5, "enought": 5, "somewhat": 5, "correctli": 5, "smooth": 5, "part": [5, 19, 22], "return": [5, 8, 9, 10, 11, 12, 13, 19, 22], "info": [5, 8, 9, 10, 11, 13, 15, 19, 22, 23], "mean_squared_jerk": 5, "averag": 5, "squar": 5, "jerk": 5, "rate": 5, "acceler": 5, "chang": [5, 8, 10, 11, 19, 22], "across": 5, "lower": 5, "indic": [5, 22], "smoother": 5, "maximum_jerk": 5, 
"identifi": [5, 17, 18], "highest": 5, "encount": 5, "dimensionless_jerk": 5, "normal": 5, "sum": [5, 11], "over": [5, 11], "durat": 5, "peak": 5, "offer": [5, 23], "scale": 5, "independ": 5, "custom": [5, 8, 9, 10, 11, 15, 18, 19, 22, 23], "dens": 5, "13": 5, "boxpushingtemporalspars": [5, 11], "boxpushingtemporalspatialspars": 5, "equip": [5, 6], "respond": 5, "incom": 5, "accur": 5, "oppon": 5, "side": [5, 6], "meter": 5, "65": 5, "compris": [5, 6], "decis": 5, "consid": 5, "successfulli": 5, "complet": [5, 20], "land": 5, "also": [5, 6, 8, 9, 10, 11, 17, 18, 19, 21], "tight": 5, "margin": 5, "20": [5, 11], "reflect": 5, "condit": [5, 15], "whether": [5, 17, 22, 23], "wa": 5, "proxim": 5, "cater": 5, "addit": [5, 17, 18, 19], "overcom": 5, "tabletennis2d": 5, "2d": 5, "350": 5, "19": 5, "tabletennis2dreplan": 5, "replan": [5, 11, 19, 23], "tabletennis4d": [5, 11, 12], "4d": 5, "22": 5, "tabletennis4dreplan": [5, 11], "tabletenniswind": 5, "wind": 5, "effect": [5, 22], "tabletennisgoalswitch": 5, "switch": 5, "tabletenniswindreplan": [5, 11], "tabletennisrndrobot": 5, "can": [5, 8, 10, 11, 15, 17, 18, 19, 21, 22, 23], "random_pos_scal": 5, "random_vel_scal": 5, "make": [5, 6, 8, 9, 10, 11, 12, 13, 14, 15, 19, 22, 23], "equival": [5, 11], "except": [5, 11, 19], "instead": [5, 11, 17, 18, 20, 22], "default": [5, 11, 17, 18, 19, 22], "upon": [5, 23], "throw": 5, "place": [5, 6], "larg": 5, "establish": 5, "42": [5, 18], "05": [5, 14], "angular": 5, "rel": [5, 22], "bottom": 5, "current": [5, 6, 8, 10, 19, 20, 22], "method": [5, 8, 10, 11, 20, 23], "paramet": [5, 8, 10, 11, 18, 22, 23], "expand": 5, "weight": 5, "basi": [5, 11, 20], "releas": 5, "implement": [5, 11, 19, 22], "form": 5, "penal": 5, "excess": 5, "forc": 5, "encourag": [5, 23], "effici": [5, 6], "t": [5, 11, 14, 15], "befor": 5, "non": [5, 18], "markovian": 5, "compon": [5, 6], "assess": 5, "chosen": [5, 20], "ensur": 5, "fall": 5, "reason": 5, "overal": 5, "specif": [5, 13, 20], "success": 5, 
"determin": [5, 22], "conclus": 5, "showcas": 5, "abil": 5, "predict": [5, 20], "execut": [5, 11, 19, 20, 23], "popular": 5, "parti": [5, 21], "beerpong": 5, "300": 5, "29": 5, "beerpongstepbas": 5, "beerpongfixedreleas": 5, "modifi": 5, "gymnasium": [5, 8, 9, 10, 11, 12, 13, 14, 15, 17, 18, 19, 22, 23], "v2": [5, 6, 7, 9, 10, 13, 19], "reacherspars": 5, "same": [5, 8, 10, 11, 17, 18, 19, 22], "longreach": 5, "27": 5, "longreacherspars": 5, "reacher5d": [5, 9, 11, 14, 19], "env": [5, 6, 8, 9, 10, 11, 14, 15, 17, 18, 19, 22, 23], "reacherenv": 5, "reacher5dspars": 5, "reacher7d": 5, "reacher7dspars": 5, "hopperjump": 5, "jump": 5, "continu": 5, "250": [5, 8], "16": [5, 9], "hopperjumpmarkov": 5, "altern": [5, 21, 23], "hopperjumpspars": 5, "antjump": 5, "ant": 5, "119": 5, "halfcheetahjump": 5, "halfcheetah": [5, 9], "112": 5, "hopperjumponbox": 5, "hopperthrow": 5, "hopperthrowinbasket": 5, "walker2djump": 5, "walker2d": 5, "depend": [5, 20, 21], "most": 5, "variant": [5, 6, 19, 23], "refer": [5, 6, 7], "fancy_promp": [5, 11, 12, 14, 19, 23], "fancy_prodmp": [5, 11, 12, 15], "dial": 6, "turn": [6, 19], "open": [6, 19, 22], "sourc": [6, 17, 18], "benchmark": [6, 23], "meta": [6, 10], "multi": 6, "50": [6, 7], "divers": 6, "featur": 6, "univers": 6, "tabletop": 6, "sawyer": 6, "varieti": [6, 11], "everydai": 6, "share": 6, "pivot": 6, "reus": 6, "acquir": 6, "relat": 6, "ml1": [6, 19], "standard": [6, 8, 10, 23], "assembli": 6, "assembl": 6, "39": 6, "basketbal": 6, "bin": 6, "pick": [6, 18], "button": [6, 10], "press": [6, 10], "topdown": 6, "down": 6, "perspect": 6, "coffe": 6, "machin": 6, "pull": 6, "lever": 6, "disassembl": 6, "door": 6, "lock": 6, "unlock": 6, "hand": [6, 22], "drawer": 6, "faucet": 6, "hammer": 6, "handl": [6, 14], "out": [6, 23], "back": [6, 11], "backward": 6, "plate": 6, "slide": 6, "unplug": 6, "soccer": 6, "stick": 6, "against": 6, "shelf": 6, "sweep": 6, "contain": 6, "window": 6, "metaworld_promp": [6, 10], "metaworld_prodmp": [6, 19], 
"now": [6, 11], "lunar": 7, "lander": 7, "lunarland": 7, "we": [7, 8, 10, 11, 18, 19, 20, 21, 22, 23], "farama": [7, 21], "previous": 7, "openai": [7, 9, 19, 23], "doc": 7, "overview": 7, "counterpart": 7, "gym_promp": [7, 13, 19], "continuousmountaincar": 7, "fetchslidedens": 7, "v1": [7, 9, 10], "fetchreachdens": 7, "import": [8, 9, 10, 11, 12, 13, 14, 15, 19, 22, 23], "def": [8, 9, 10, 11, 12, 13, 15, 22], "example_dmc": 8, "env_id": [8, 9, 10, 11, 13, 14], "seed": [8, 9, 10, 11, 13, 14, 15, 19], "iter": [8, 9, 10, 11, 15], "render": [8, 9, 10, 11, 13, 14, 15, 19, 23], "true": [8, 9, 10, 11, 12, 13, 14, 15, 17, 19], "dmc": [8, 9, 21, 23], "ha": [8, 10, 21, 22], "domain_nam": [8, 9], "task_nam": [8, 9, 10], "environment_nam": [8, 9], "arg": [8, 9, 10, 11, 13, 17, 18], "either": [8, 9, 14], "determinist": [8, 9, 10, 11], "behaviour": [8, 9, 10, 11], "number": [8, 9, 10, 11, 13, 15, 19, 22], "rollout": [8, 9, 10, 11], "render_mod": [8, 9, 10, 11, 13, 15, 23], "human": [8, 9, 10, 11, 13, 15, 19, 23], "els": [8, 9, 10, 11, 13, 15], "none": [8, 9, 10, 11, 13, 15, 17, 18, 19], "ob": [8, 9, 10, 11, 13, 15], "reset": [8, 9, 10, 11, 13, 14, 15, 19, 22, 23], "print": [8, 9, 10, 11, 13, 17, 19, 22], "shape": [8, 9, 10, 14, 22], "observation_spac": [8, 9, 10, 22], "action_spac": [8, 9, 10, 11, 13, 14, 15, 19, 22, 23], "ac": [8, 10, 11, 13, 15, 22], "sampl": [8, 9, 10, 11, 13, 14, 15, 19, 22, 23], "termin": [8, 9, 10, 11, 13, 15, 19, 22, 23], "truncat": [8, 9, 10, 11, 13, 15, 19, 22, 23], "del": [8, 10, 15], "example_custom_dmc_and_mp": 8, "alreadi": [8, 10, 11, 13, 17, 18, 19, 22], "regist": [8, 10, 11, 13, 15, 18, 22, 23], "henc": [8, 10, 11, 19], "adjust": [8, 10, 11], "hyperparamet": [8, 10, 11], "yet": [8, 10, 11, 21, 22], "recommend": [8, 10, 11, 22, 23], "abov": [8, 9, 10, 11, 19], "you": [8, 10, 11, 17, 18, 19, 21, 22, 23], "just": [8, 10, 11, 19], "interest": [8, 10, 11], "chain": [8, 10], "those": [8, 10, 11, 21], "appreci": [8, 10, 11, 23], "pr": [8, 10, 11, 22, 
23], "especi": [8, 10, 11], "repo": [8, 10, 11], "http": [8, 10, 11, 21, 23], "github": [8, 10, 11, 21, 23], "com": [8, 10, 11, 21, 23], "alrhub": [8, 10, 11, 21, 23], "accord": [8, 10], "base_env_id": [8, 10, 11, 15], "replac": [8, 10], "your": [8, 10, 14, 22, 23], "inherit": [8, 10], "rawinterfacewrapp": [8, 10, 17, 18, 22], "case": [8, 10, 19, 22], "thei": [8, 10, 11, 20, 21], "suit": [8, 20, 23], "mpwrapper": [8, 10, 11, 15], "trajectory_generator_kwarg": [8, 10, 11, 15], "trajectory_generator_typ": [8, 10, 11, 15], "phase_generator_kwarg": [8, 10, 11, 15, 22], "phase_generator_typ": [8, 10, 11, 15, 22], "controller_kwarg": [8, 10, 11, 14, 15, 22], "controller_typ": [8, 10, 11, 15], "motor": 8, "p_gain": [8, 14, 22], "d_gain": [8, 14, 22], "basis_generator_kwarg": [8, 10, 11, 15, 22], "basis_generator_typ": [8, 10, 11, 15], "zero_rbf": [8, 10, 11], "num_basi": [8, 10, 11, 15, 22], "num_basis_zero_start": [8, 10, 11, 22], "exp": [8, 10, 11, 15], "alpha_phas": [8, 10, 11], "rbf": [8, 10, 11], "base_env": [8, 10, 15], "make_bb": [8, 10, 15], "black_box_kwarg": [8, 10, 15], "traj_gen_kwarg": [8, 10, 15], "phase_kwarg": [8, 10, 15], "basis_kwarg": [8, 10, 15], "call": [8, 10, 11, 19], "onc": [8, 10, 11, 19, 20], "begin": [8, 10, 11, 19], "everi": [8, 10, 11, 19, 20], "consecut": [8, 10, 11], "mode": [8, 10, 11, 14, 19], "possibl": [8, 10, 11], "nth": [8, 10], "should": [8, 10, 18, 22], "displai": [8, 10], "main": [8, 9, 10, 11, 13, 15], "fals": [8, 9, 10, 11, 15, 17], "disclaim": 8, "vision": 8, "integr": [8, 22, 23], "yield": 8, "error": 8, "reach_site_featur": 8, "hybrid": [8, 10, 19], "framework": [8, 9, 10, 20, 22, 23], "dm_control_promp": 8, "becaus": 8, "longer": [8, 19], "combo": 8, "__name__": [8, 9, 10, 11, 12, 13, 15], "__main__": [8, 9, 10, 11, 12, 13, 15], "collect": [9, 14, 19, 23], "defaultdict": 9, "numpi": [9, 14, 22], "np": [9, 14, 22], "example_gener": 9, "make_env": 9, "id": [9, 15, 17, 18, 19, 22], "example_async": 9, "n_cpu": 9, "int": [9, 22], 
"533d": 9, "n_sampl": 9, "800": 9, "vector": 9, "multiprocess": 9, "faster": 9, "Be": 9, "awar": 9, "reduc": 9, "total": [9, 19], "length": [9, 19], "individu": [9, 20], "cpu": 9, "core": 9, "parallel": 9, "tupl": [9, 22], "done": 9, "type": [9, 17, 18, 19, 22], "ndarrai": [9, 22], "asyncvectorenv": 9, "make_rank": 9, "OR": 9, "plot": [9, 12, 14], "zero": [9, 14], "buffer": 9, "list": [9, 17, 18, 19], "would": 9, "than": 9, "request": 9, "num_env": 9, "repeat": 9, "ceil": 9, "append": 9, "f": [9, 14], "do": [9, 22], "threshold": 9, "map": 9, "lambda": [9, 15], "v": 9, "basic": [9, 23], "example_meta": 10, "alwai": [10, 19], "found": [10, 19, 20, 23], "here": [10, 11, 19, 20, 22, 23], "arxiv": 10, "org": 10, "pdf": 10, "1910": 10, "10897": 10, "io": 10, "todo": [10, 14], "work": [10, 14, 19], "due": 10, "issu": [10, 19], "code": 10, "example_custom_meta_and_mp": 10, "goal_object_change_mp_wrapp": 10, "might": [10, 14], "necessari": [10, 19, 22], "opengl": 10, "export": 10, "ld_preload": 10, "usr": 10, "lib": 10, "x86_64": 10, "linux": 10, "gnu": 10, "libglew": 10, "so": [10, 22], "500": [10, 11], "example_mp": [11, 13], "env_nam": [11, 13, 15], "black": [11, 20, 23], "have": [11, 20, 21, 22], "creat": [11, 17, 19, 23], "take": 11, "care": 11, "extern": 11, "raw": [11, 17, 18], "parametr": [11, 20], "give": 11, "sub": [11, 19], "equal": 11, "wise": [11, 19], "aggreg": 11, "example_custom_mp": 11, "argument": [11, 17, 19], "mp_config_overrid": [11, 14, 17, 18], "wai": [11, 14, 19], "mani": 11, "class": [11, 17, 18, 22], "custom_mpwrapp": 11, "mp_config": [11, 22], "weights_scal": [11, 15], "example_fully_custom_mp": 11, "custom_env_id": 11, "custom_env_id_dmp": 11, "custom_env_id_promp": 11, "upgrad": [11, 17, 22, 23], "mp_wrapper": [11, 15, 17, 18, 22], "add_mp_typ": [11, 17, 18], "base_id": [11, 18], "try": [11, 19, 23], "don": 11, "correlcti": 11, "pass": [11, 17], "example_fully_custom_mp_altern": 11, "mp_arg": 11, "dure": 11, "registr": [11, 18], "prodmp": [11, 
15, 17, 18, 19, 20, 22, 23], "boxpushingdensereplan": [11, 15], "alter": 11, "obs1": 11, "compare_bases_shap": 12, "env1_id": 12, "env2_id": 12, "env1": 12, "traj_gen": [12, 13], "show_scaled_basi": 12, "env2": 12, "stuff": 13, "look": [13, 19, 22], "boolean": [13, 22], "ordereddict": 14, "matplotlib": 14, "pyplot": 14, "plt": 14, "howev": [14, 19, 22], "verifi": 14, "extract": 14, "below": 14, "w": 14, "po": [14, 15], "vel": [14, 15], "get_trajectori": 14, "base_shap": 14, "actual_po": 14, "len": 14, "actual_vel": 14, "act": 14, "ion": 14, "fig": 14, "figur": 14, "add_subplot": 14, "img": 14, "imshow": 14, "rgb_arrai": 14, "show": [14, 19], "des_po": 14, "des_vel": 14, "enumer": 14, "zip": 14, "tracking_control": 14, "get_act": 14, "current_po": [14, 22], "current_vel": [14, 22], "clip": 14, "low": 14, "set_data": 14, "canva": 14, "draw": 14, "flush_ev": 14, "figsiz": 14, "subplot": 14, "131": 14, "titl": [14, 23], "p1": 14, "c": 14, "c0": 14, "label": 14, "p2": 14, "c1": 14, "xlabel": 14, "gca": 14, "get_legend_handles_label": 14, "by_label": 14, "legend": 14, "kei": [14, 19], "132": 14, "133": 14, "std": 14, "example_run_replanning_env": 15, "break": 15, "example_custom_replanning_env": 15, "box_push": 15, "max_planning_tim": 15, "plan": 15, "replanning_schedul": 15, "trigger": 15, "condition_on_desir": 15, "boundari": [15, 23], "next": 15, "str": [17, 18], "entry_point": [17, 22], "union": [17, 22], "callabl": 17, "black_box": [17, 18], "raw_interface_wrapp": [17, 18], "registri": [17, 18], "defaultmpwrapp": [17, 18], "register_step_bas": 17, "bool": [17, 22], "dict": [17, 18], "kwarg": 17, "If": [17, 19, 21, 22, 23], "want": [17, 21, 23], "uniqu": [17, 18, 20], "entri": 17, "srtep": 17, "dictionari": [17, 18, 19], "overrid": [17, 18], "keyword": 17, "constructor": 17, "note": [17, 18], "otherwis": [17, 18], "given": [17, 19, 22], "string": 17, "notat": 17, "warn": 17, "messag": 17, "suggest": 17, "exampl": [17, 18, 19, 22], "To": [17, 18, 19, 23], "myenv": 
[17, 18], "myenvclass": 17, "my_modul": 17, "expect": 18, "known_mp": 18, "Will": [18, 23], "match": [18, 22], "wish": 18, "one": [18, 22, 23], "alongsid": 18, "custommpwrapp": 18, "param": [18, 23], "prepar": 19, "ad": 19, "namespac": 19, "legaci": [19, 21], "rais": [19, 22], "metaworld": [19, 20, 21, 23], "n": 19, "cumul": 19, "mainli": 19, "meant": 19, "debug": 19, "log": 19, "train": 19, "step_act": 19, "output": 19, "step_observ": 19, "intermedi": 19, "step_reward": 19, "trajectory_length": 19, "underli": 19, "origin": 19, "In": [19, 22], "miss": 19, "fill": 19, "_": 19, "keep": 19, "mind": 19, "process": 19, "split": 19, "lean": 19, "still": [19, 22], "beta": 19, "feel": [19, 22], "problem": 19, "occur": 19, "directli": [19, 22], "gym_": 19, "again": 19, "conveni": 19, "variabl": 19, "store": 19, "all_movement_primitive_environ": 19, "all_fancy_movement_primitive_environ": 19, "all_gym_movement_primitive_environ": 19, "deepmind": [19, 23], "all_dmc_movement_primitive_environ": 19, "all_metaworld_movement_primitive_environ": 19, "movement_primitive_environments_for_n": 19, "my_custom_namespac": 19, "tradit": 20, "concept": 20, "stochast": 20, "search": 20, "commonli": 20, "produc": 20, "like": [20, 21], "probabilist": [20, 23], "convert": 20, "track": 20, "pd": [20, 23], "tailor": 20, "addition": 20, "special": 20, "overarch": 20, "remain": 20, "polici": 20, "craft": 20, "accommod": 20, "contextu": [20, 22], "At": 20, "onset": 20, "subset": 20, "demand": 20, "virtual": 21, "venv": 21, "3rd": 21, "poetri": 21, "conda": 21, "few": 21, "choos": 21, "box2d": 21, "jax": 21, "automat": 21, "date": 21, "sinc": 21, "git": 21, "c822f28f582ba1ad49eb5dcf61016566f28003ba": 21, "egg": 21, "clone": 21, "repositori": 21, "go": 21, "folder": 21, "cd": 21, "manual": 21, "guid": 22, "explain": 22, "how": 22, "abc": 22, "abstractmethod": 22, "properti": 22, "context_mask": 22, "mask": 22, "filter": 22, "unwant": 22, "unnecessari": 22, "after": 22, "first": 22, "receiv": 22, 
"arrai": 22, "ones": 22, "dtype": 22, "float": 22, "exclus": 22, "regardless": 22, "indirectli": 22, "notimplementederror": 22, "overitten": 22, "attribut": 22, "document": 22, "mp_pytorch": 22, "userguid": 22, "anoth": 22, "merg": 22, "num_basis_zero_go": 22, "rough": 22, "outlin": 22, "shown": 22, "simpli": 22, "cool_new_env": 22, "my_custom_mpwrapp": 22, "my_custom_env": 22, "custom_prodmp": 22, "built": 23, "fork": 23, "renown": 23, "librari": 23, "sever": 23, "etc": 23, "With": 23, "straightforward": 23, "transform": 23, "compat": 23, "contribut": 23, "own": 23, "re": 23, "inspir": 23, "assist": 23, "highli": 23, "randomli": 23, "sleep": 23, "metadata": 23, "render_fp": 23, "about": 23, "pypi": 23, "master": 23, "what": 23, "usag": 23, "tune": 23, "public": 23, "softwar": 23, "author": 23, "otto": 23, "fabian": 23, "celik": 23, "onur": 23, "roth": 23, "dominik": 23, "zhou": 23, "hongyi": 23, "abstract": 23, "unifi": 23, "approach": 23, "url": 23, "organ": 23, "autonom": 23, "lab": 23, "alr": 23, "kit": 23}, "objects": {"fancy_gym": [[16, 0, 0, "-", "envs"], [17, 1, 1, "", "register"], [18, 1, 1, "", "upgrade"]]}, "objtypes": {"0": "py:module", "1": "py:function"}, "objnames": {"0": ["py", "module", "Python module"], "1": ["py", "function", "Python function"]}, "titleterms": {"api": [0, 23], "deepmind": [1, 8], "control": [1, 3, 8, 14], "dmc": 1, "step": [1, 3, 5, 6, 7, 19], "base": [1, 3, 5, 6, 7, 19], "environ": [1, 3, 5, 6, 7, 19, 22, 23], "mp": [1, 3, 5, 6, 7, 12, 22], "airhockei": 2, "classic": 3, "fanci": [4, 23], "mujoco": 5, "box": [5, 19], "push": 5, "tabl": 5, "tenni": 5, "beer": 5, "pong": 5, "variat": 5, "exist": 5, "metaworld": [6, 10], "gymnasium": 7, "exampl": [8, 9, 10, 11, 12, 13, 14, 15, 23], "gener": 9, "usag": [9, 19], "movement": 11, "primit": 11, "param": 12, "tune": [12, 14], "openai": 13, "env": [13, 16], "pd": 14, "gain": 14, "replan": 15, "fancy_gym": [16, 17, 18], "regist": 17, "upgrad": 18, "basic": 19, "black": 19, "what": 20, "i": 
20, "episod": 20, "rl": 20, "instal": 21, "from": 21, "pypi": 21, "recommend": 21, "master": 21, "creat": 22, "new": 22, "gym": 23, "kei": 23, "featur": 23, "quickstart": 23, "guid": 23, "user": 23, "cite": 23, "project": 23, "icon": 23, "attribut": 23}, "envversion": {"sphinx.domains.c": 2, "sphinx.domains.changeset": 1, "sphinx.domains.citation": 1, "sphinx.domains.cpp": 8, "sphinx.domains.index": 1, "sphinx.domains.javascript": 2, "sphinx.domains.math": 2, "sphinx.domains.python": 3, "sphinx.domains.rst": 2, "sphinx.domains.std": 2, "sphinx.ext.viewcode": 1, "sphinx": 57}, "alltitles": {"API": [[0, "api"], [23, null]], "DeepMind Control (DMC)": [[1, "deepmind-control-dmc"]], "Step-Based Environments": [[1, "step-based-environments"], [3, "step-based-environments"], [5, "step-based-environments"], [6, "step-based-environments"], [7, "step-based-environments"], [19, "step-based-environments"]], "MP Environments": [[1, "mp-environments"], [3, "mp-environments"], [5, "mp-environments"], [6, "mp-environments"], [7, "mp-environments"]], "AirHockey": [[2, "airhockey"]], "Classic Control": [[3, "classic-control"]], "Fancy": [[4, "fancy"]], "Mujoco": [[5, "mujoco"]], "Box Pushing": [[5, "box-pushing"]], "Table Tennis": [[5, "table-tennis"]], "Beer Pong": [[5, "beer-pong"]], "Variations of existing environments": [[5, "variations-of-existing-environments"]], "Metaworld": [[6, "metaworld"]], "Gymnasium": [[7, "gymnasium"]], "DeepMind Control Examples": [[8, "deepmind-control-examples"]], "General Usage Examples": [[9, "general-usage-examples"]], "Metaworld Examples": [[10, "metaworld-examples"]], "Movement Primitives Examples": [[11, "movement-primitives-examples"]], "MP Params Tuning Example": [[12, "mp-params-tuning-example"]], "OpenAI Envs Examples": [[13, "openai-envs-examples"]], "PD Control Gain Tuning Example": [[14, "pd-control-gain-tuning-example"]], "Replanning Example": [[15, "replanning-example"]], "fancy_gym.envs": [[16, "module-fancy_gym.envs"]], 
"fancy_gym.register": [[17, "fancy-gym-register"]], "fancy_gym.upgrade": [[18, "fancy-gym-upgrade"]], "Basic Usage": [[19, "basic-usage"]], "Black-Box Environments": [[19, "black-box-environments"]], "What is Episodic RL?": [[20, "what-is-episodic-rl"]], "Installation": [[21, "installation"]], "Installation from PyPI (recommended)": [[21, "installation-from-pypi-recommended"]], "Installation from master": [[21, "installation-from-master"]], "Creating new MP Environments": [[22, "creating-new-mp-environments"]], "Fancy Gym": [[23, "fancy-gym"]], "Key Features": [[23, "key-features"]], "Quickstart Guide": [[23, "quickstart-guide"]], "User Guide": [[23, null]], "Environments": [[23, null]], "Examples": [[23, null]], "Citing the Project": [[23, "citing-the-project"]], "Icon Attribution": [[23, "icon-attribution"]]}, "indexentries": {"fancy_gym.envs": [[16, "module-fancy_gym.envs"]], "module": [[16, "module-fancy_gym.envs"]], "register() (in module fancy_gym)": [[17, "fancy_gym.register"]], "upgrade() (in module fancy_gym)": [[18, "fancy_gym.upgrade"]]}})
\ No newline at end of file