Merge pull request #42 from ALRhub/clean_api

Fancy_gym is born
2022-09-23 09:01:47 +02:00
commit a4f696dee4
parent d313795cec eaedd58e73
274 changed files with 5947 additions and 6763 deletions
--- a/README.md
+++ b/README.md
@@ -1,48 +1,40 @@
-## ALR Robotics Control Environments
+# Fancy Gym

-This project offers a large variety of reinforcement learning environments under the unifying interface of [OpenAI gym](https://gym.openai.com/).
-Besides, we also provide support (under the OpenAI interface) for the benchmark suites
+`fancy_gym` offers a large variety of reinforcement learning environments under the unifying interface
+of [OpenAI gym](https://gym.openai.com/). We provide support (under the OpenAI gym interface) for the benchmark suites
 [DeepMind Control](https://deepmind.com/research/publications/2020/dm-control-Software-and-Tasks-for-Continuous-Control)
-(DMC) and [Metaworld](https://meta-world.github.io/). Custom (Mujoco) gym environments can be created according
-to [this guide](https://github.com/openai/gym/blob/master/docs/creating-environments.md). Unlike existing libraries, we
-additionally support to control agents with Dynamic Movement Primitives (DMPs) and Probabilistic Movement Primitives (ProMP,
-we only consider the mean usually).
+(DMC) and [Metaworld](https://meta-world.github.io/). If those are not sufficient and you want to create your own custom
+gym environments, use [this guide](https://www.gymlibrary.ml/content/environment_creation/). We would highly appreciate
+it if you then submitted a PR for this environment to become part of `fancy_gym`.  
+In comparison to existing libraries, we additionally support controlling agents with movement primitives, such as Dynamic
+Movement Primitives (DMPs) and Probabilistic Movement Primitives (ProMP).

-## Motion Primitive Environments (Episodic environments)
+## Movement Primitive Environments (Episode-Based/Black-Box Environments)

-Unlike step-based environments, motion primitive (MP) environments are closer related to stochastic search, black box
-optimization, and methods that are often used in robotics. MP environments are trajectory-based and always execute a full
-trajectory, which is generated by a Dynamic Motion Primitive (DMP) or a Probabilistic Motion Primitive (ProMP). The
-generated trajectory is translated into individual step-wise actions by a controller. The exact choice of controller is,
-however, dependent on the type of environment. We currently support position, velocity, and PD-Controllers for position,
-velocity, and torque control, respectively. The goal of all MP environments is still to learn a policy. Yet, an action
-represents the parametrization of the motion primitives to generate a suitable trajectory. Additionally, in this
-framework we support all of this also for the contextual setting, for which we expose all changing substates of the
-task as a single observation in the beginning. This requires to predict a new action/MP parametrization for each
-trajectory. All environments provide next to the cumulative episode reward all collected information from each
-step as part of the info dictionary. This information should, however, mainly be used for debugging and logging.
-
-|Key| Description|
-|---|---|
-`trajectory`| Generated trajectory from MP
-`step_actions`| Step-wise executed action based on controller output
-`step_observations`| Step-wise intermediate observations
-`step_rewards`| Step-wise rewards
-`trajectory_length`| Total number of environment interactions
-`other`| All other information from the underlying environment are returned as a list with length `trajectory_length` maintaining the original key. In case some information are not provided every time step, the missing values are filled with `None`.
+Unlike step-based environments, movement primitive (MP) environments are more closely related to stochastic search,
+black-box optimization, and methods that are often used in traditional robotics and control. MP environments are typically
+episode-based and execute a full trajectory, which is generated by a trajectory generator, such as a Dynamic Movement
+Primitive (DMP) or a Probabilistic Movement Primitive (ProMP). The generated trajectory is translated into individual
+step-wise actions by a trajectory tracking controller. The exact choice of controller, however, depends on the type
+of environment. We currently support position, velocity, and PD-Controllers for position, velocity, and torque control,
+respectively, as well as a special controller for the MetaWorld control suite.  
+The goal of all MP environments is still to learn an optimal policy. Yet, an action represents the parametrization of
+the movement primitives to generate a suitable trajectory. Additionally, in this framework we support all of this also for
+the contextual setting, i.e. we expose the context space - a subset of the observation space - at the beginning of the
+episode. This requires predicting a new action/MP parametrization for each context.

 ## Installation

 1. Clone the repository

 ```bash 
-git clone git@github.com:ALRhub/alr_envs.git
+git clone git@github.com:ALRhub/fancy_gym.git
 ```

 2. Go to the folder

 ```bash 
-cd alr_envs
+cd fancy_gym
 ```

 3. Install with
@@ -51,104 +43,137 @@ cd alr_envs
 pip install -e .
 ```

-## Using the framework
+In case you want to use dm_control or metaworld, you can install them by specifying extras:

-We prepared [multiple examples](alr_envs/examples/), please have a look there for more specific examples.
+```bash 
+pip install -e ".[dmc,metaworld]"
+```

-### Step-wise environments
+> **Note:**   
+> While our library already fully supports the new mujoco bindings, metaworld still relies on
+> [mujoco_py](https://github.com/openai/mujoco-py), hence make sure to have mujoco 2.1 installed beforehand.
+
+## How to use Fancy Gym
+
+We only show the basics here and have prepared [multiple examples](fancy_gym/examples/) for a more detailed look.
+
+### Step-wise Environments

 ```python
-import alr_envs
+import fancy_gym

-env = alr_envs.make('HoleReacher-v0', seed=1)
-state = env.reset()
+env = fancy_gym.make('Reacher5d-v0', seed=1)
+obs = env.reset()

 for i in range(1000):
-    state, reward, done, info = env.step(env.action_space.sample())
+    action = env.action_space.sample()
+    obs, reward, done, info = env.step(action)
    if i % 5 == 0:
        env.render()

    if done:
-        state = env.reset()
+        obs = env.reset()
 ``` 

-For Deepmind control tasks we expect the `env_id` to be specified as `domain_name-task_name` or for manipulation tasks
-as `manipulation-environment_name`. All other environments can be created based on their original name.
+When using `dm_control` tasks, we expect the `env_id` to be specified as `dmc:domain_name-task_name` or, for manipulation
+tasks, as `dmc:manipulation-environment_name`. For `metaworld` tasks, we require the structure `metaworld:env_id-v2`. Our
+custom tasks and standard gym environments can be created without prefixes.

-Existing MP tasks can be created the same way as above. Just keep in mind, calling `step()` always executs a full
-trajectory.
+### Black-box Environments
+
+By default, all environments provide the cumulative episode reward; this can, however, be changed if necessary. Optionally,
+each environment returns all collected information from each step as part of the infos. This information is, however,
+mainly meant for debugging and logging, not for training.
+
+|Key| Description|Type|
+|---|---|---|
+|`positions`| Positions of the trajectory generated by the MP | Optional|
+|`velocities`| Velocities of the trajectory generated by the MP | Optional|
+|`step_actions`| Step-wise executed action based on controller output | Optional|
+|`step_observations`| Step-wise intermediate observations | Optional|
+|`step_rewards`| Step-wise rewards | Optional|
+|`trajectory_length`| Total number of environment interactions | Always|
+|`other`| All other information from the underlying environment is returned as a list of length `trajectory_length`, maintaining the original key. If some information is not provided at every time step, the missing values are filled with `None`. | Always|
+
+Existing MP tasks can be created the same way as above. Just keep in mind, calling `step()` executes a full trajectory.
+
+> **Note:**   
+> Currently, we are also in the process of enabling replanning as well as learning of sub-trajectories.
+> This allows splitting the episode into multiple trajectories and is a hybrid setting between step-based and
+> black-box learning.
+> While this is already implemented, it is still in beta and requires further testing.
+> Feel free to try it and open an issue with any problems that occur.

 ```python
-import alr_envs
+import fancy_gym

-env = alr_envs.make('HoleReacherProMP-v0', seed=1)
-# render() can be called once in the beginning with all necessary arguments. To turn it of again just call render(None). 
-env.render()
+env = fancy_gym.make('Reacher5dProMP-v0', seed=1)
+# render() can be called once in the beginning with all necessary arguments.
+# To turn it off again just call render() without any arguments. 
+env.render(mode='human')

-state = env.reset()
+# This returns the context information, not the full state observation
+obs = env.reset()

 for i in range(5):
-    state, reward, done, info = env.step(env.action_space.sample())
+    action = env.action_space.sample()
+    obs, reward, done, info = env.step(action)

-    # Not really necessary as the environments resets itself after each trajectory anyway.
-    state = env.reset()
+    # Done is always True as we are working on the episode level, hence we always reset()
+    obs = env.reset()
 ```

-To show all available environments, we provide some additional convenience. Each value will return a dictionary with two
-keys `DMP` and `ProMP` that store a list of available environment names.
+To show all available environments, we provide some additional convenience variables. All of them return a dictionary
+with two keys `DMP` and `ProMP` that store a list of available environment ids.

 ```python
-import alr_envs
+import fancy_gym

-print("Custom MP tasks:")
-print(alr_envs.ALL_ALR_MOTION_PRIMITIVE_ENVIRONMENTS)
+print("Fancy Black-box tasks:")
+print(fancy_gym.ALL_FANCY_MOVEMENT_PRIMITIVE_ENVIRONMENTS)

-print("OpenAI Gym MP tasks:")
-print(alr_envs.ALL_GYM_MOTION_PRIMITIVE_ENVIRONMENTS)
+print("OpenAI Gym Black-box tasks:")
+print(fancy_gym.ALL_GYM_MOVEMENT_PRIMITIVE_ENVIRONMENTS)

-print("Deepmind Control MP tasks:")
-print(alr_envs.ALL_DEEPMIND_MOTION_PRIMITIVE_ENVIRONMENTS)
+print("Deepmind Control Black-box tasks:")
+print(fancy_gym.ALL_DMC_MOVEMENT_PRIMITIVE_ENVIRONMENTS)

-print("MetaWorld MP tasks:")
-print(alr_envs.ALL_METAWORLD_MOTION_PRIMITIVE_ENVIRONMENTS)
+print("MetaWorld Black-box tasks:")
+print(fancy_gym.ALL_METAWORLD_MOVEMENT_PRIMITIVE_ENVIRONMENTS)
 ```

 ### How to create a new MP task

 In case a required task is not supported yet in the MP framework, it can be created relatively easily. For the task at
-hand, the following interface needs to be implemented.
+hand, the following [interface](fancy_gym/black_box/raw_interface_wrapper.py) needs to be implemented.

 ```python
+from abc import abstractmethod
+from typing import Union, Tuple
+
+import gym
 import numpy as np
-from mp_env_api import MPEnvWrapper


-class MPWrapper(MPEnvWrapper):
+class RawInterfaceWrapper(gym.Wrapper):

    @property
-    def active_obs(self):
+    def context_mask(self) -> np.ndarray:
        """
-            Returns boolean mask for each substate in the full observation.
-            It determines whether the observation is returned for the contextual case or not.
-            This effectively allows to filter unwanted or unnecessary observations from the full step-based case.
-            E.g. Velocities starting at 0 are only changing after the first action. Given we only receive the first  
-            observation, the velocities are not necessary in the observation for the MP task.
+        Returns boolean mask of the same shape as the observation space.
+        It determines whether the observation is returned for the contextual case or not.
+        This effectively allows filtering unwanted or unnecessary observations from the full step-based case.
+        E.g. Velocities starting at 0 are only changing after the first action. Given we only receive the 
+        context/part of the first observation, the velocities are not necessary in the observation for the task.
+        Returns:
+            bool array representing the indices of the observations
+
        """
-        return np.ones(self.observation_space.shape, dtype=bool)
+        return np.ones(self.env.observation_space.shape[0], dtype=bool)

    @property
-    def current_vel(self):
-        """
-            Returns the current velocity of the action/control dimension. 
-            The dimensionality has to match the action/control dimension.
-            This is not required when exclusively using position control, 
-            it should, however, be implemented regardless.
-            E.g. The joint velocities that are directly or indirectly controlled by the action.
-        """
-        raise NotImplementedError()
-
-    @property
-    def current_pos(self):
+    @abstractmethod
+    def current_pos(self) -> Union[float, int, np.ndarray, Tuple]:
        """
            Returns the current position of the action/control dimension.
            The dimensionality has to match the action/control dimension.
@@ -159,41 +184,35 @@ class MPWrapper(MPEnvWrapper):
        raise NotImplementedError()

    @property
-    def goal_pos(self):
+    @abstractmethod
+    def current_vel(self) -> Union[float, int, np.ndarray, Tuple]:
        """
-            Returns a predefined final position of the action/control dimension.
-            This is only required for the DMP and is most of the time learned instead.
-        """
-        raise NotImplementedError()
-
-    @property
-    def dt(self):
-        """
-            Returns the time between two simulated steps of the environment
+            Returns the current velocity of the action/control dimension.
+            The dimensionality has to match the action/control dimension.
+            This is not required when exclusively using position control,
+            it should, however, be implemented regardless.
+            E.g. The joint velocities that are directly or indirectly controlled by the action.
        """
        raise NotImplementedError()

 ```

-If you created a new task wrapper, feel free to open a PR, so we can integrate it for others to use as well. 
-Without the integration the task can still be used. A rough outline can be shown here, for more details we recommend 
-having a look at the [examples](alr_envs/examples/).
+If you created a new task wrapper, feel free to open a PR, so we can integrate it for others to use as well. Without the
+integration the task can still be used. A rough outline is shown here; for more details, we recommend having a look
+at the [examples](fancy_gym/examples/).

 ```python
-import alr_envs
+import fancy_gym

 # Base environment name, according to structure of above example
 base_env_id = "ball_in_cup-catch"

-# Replace this wrapper with the custom wrapper for your environment by inheriting from the MPEnvWrapper.
+# Replace this wrapper with the custom wrapper for your environment by inheriting from the RawInterfaceWrapper.
 # You can also add other gym.Wrappers in case they are needed, 
 # e.g. gym.wrappers.FlattenObservation for dict observations
-wrappers = [alr_envs.dmc.suite.ball_in_cup.MPWrapper]
-mp_kwargs = {...}
+wrappers = [fancy_gym.dmc.suite.ball_in_cup.MPWrapper]
 kwargs = {...}
-env = alr_envs.make_dmp_env(base_env_id, wrappers=wrappers, seed=1, mp_kwargs=mp_kwargs, **kwargs)
-# OR for a deterministic ProMP (other mp_kwargs are required):
-# env = alr_envs.make_promp_env(base_env, wrappers=wrappers, seed=seed, mp_kwargs=mp_args)
+env = fancy_gym.make_bb(base_env_id, wrappers=wrappers, seed=0, **kwargs)

 rewards = 0
 obs = env.reset()
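
The episode-based interface described in the README changes above can be sketched without any simulator. The following is a minimal toy and not `fancy_gym`'s implementation: `ToyStepEnv`, `ToyBlackBoxEnv`, the straight-line "trajectory generator", and the pass-through position controller are all hypothetical stand-ins. It only illustrates that a single `step()` call takes a parametrization, rolls out the whole trajectory, and returns the summed reward together with per-step information in `info`.

```python
from typing import Dict, List, Tuple


class ToyStepEnv:
    """Hypothetical 1-D step-based environment: move a point towards a goal at 1.0."""

    def __init__(self, horizon: int = 10) -> None:
        self.horizon = horizon
        self.pos = 0.0
        self.t = 0

    def reset(self) -> float:
        self.pos, self.t = 0.0, 0
        return self.pos

    def step(self, action: float) -> Tuple[float, float, bool, Dict]:
        self.pos += action
        self.t += 1
        reward = -abs(1.0 - self.pos)  # negative distance to the goal
        done = self.t >= self.horizon
        return self.pos, reward, done, {}


class ToyBlackBoxEnv:
    """Hypothetical episode-level wrapper: one step() call runs a whole trajectory."""

    def __init__(self, env: ToyStepEnv) -> None:
        self.env = env

    def reset(self) -> float:
        return self.env.reset()

    def step(self, slope: float) -> Tuple[float, float, bool, Dict]:
        # Toy "trajectory generator": a straight line, standing in for an MP.
        desired = [slope * (k + 1) for k in range(self.env.horizon)]
        step_rewards: List[float] = []
        obs = self.env.pos
        for target in desired:
            # Toy "tracking controller": command the remaining position error.
            obs, reward, done, _ = self.env.step(target - self.env.pos)
            step_rewards.append(reward)
            if done:
                break
        info = {"step_rewards": step_rewards, "trajectory_length": len(step_rewards)}
        # Working on the episode level, so done is always True.
        return obs, sum(step_rewards), True, info


bb_env = ToyBlackBoxEnv(ToyStepEnv(horizon=10))
bb_env.reset()
final_obs, episode_return, episode_done, episode_info = bb_env.step(0.1)
```

Here the action is a single slope value; in the real black-box environments it would be the MP parametrization, and the controller would be one of the position, velocity, or PD controllers mentioned in the README.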
--- a/alr_envs/__init__.py
+++ b/alr_envs/__init__.py
@@ -1,15 +0,0 @@
-from alr_envs import dmc, meta, open_ai
-from alr_envs.utils.make_env_helpers import make, make_dmp_env, make_promp_env, make_rank
-from alr_envs.utils import make_dmc
-
-# Convenience function for all MP environments
-from .alr import ALL_ALR_MOTION_PRIMITIVE_ENVIRONMENTS
-from .dmc import ALL_DEEPMIND_MOTION_PRIMITIVE_ENVIRONMENTS
-from .meta import ALL_METAWORLD_MOTION_PRIMITIVE_ENVIRONMENTS
-from .open_ai import ALL_GYM_MOTION_PRIMITIVE_ENVIRONMENTS
-
-ALL_MOTION_PRIMITIVE_ENVIRONMENTS = {
-    key: value + ALL_DEEPMIND_MOTION_PRIMITIVE_ENVIRONMENTS[key] +
-         ALL_GYM_MOTION_PRIMITIVE_ENVIRONMENTS[key] +
-         ALL_METAWORLD_MOTION_PRIMITIVE_ENVIRONMENTS[key]
-    for key, value in ALL_ALR_MOTION_PRIMITIVE_ENVIRONMENTS.items()}
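
The removed dictionary comprehension concatenates the per-suite id lists key by key (`DMP` and `ProMP`). A rough, standalone equivalent is sketched below; the helper name `merge_mp_registries` and the `placeholder_...` id are made up for illustration, and only the two `HoleReacher` ids come from the registrations in this diff.

```python
from typing import Dict, List


def merge_mp_registries(*registries: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Concatenate per-suite id lists key by key, keeping the argument order."""
    merged: Dict[str, List[str]] = {"DMP": [], "ProMP": []}
    for registry in registries:
        for key in merged:
            # .get() tolerates a suite that lacks a key; the removed code indexed directly.
            merged[key].extend(registry.get(key, []))
    return merged


alr = {"DMP": ["HoleReacherDMP-v0"], "ProMP": ["HoleReacherProMP-v0"]}
# Placeholder id, invented for this example only.
dmc = {"DMP": ["placeholder_dmc_dmp-v0"], "ProMP": []}
all_envs = merge_mp_registries(alr, dmc)
```

Unlike the `value + other[key]` form, this variant never raises a `KeyError` for an incomplete suite dictionary, which may or may not be the desired behavior.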
--- a/alr_envs/alr/__init__.py
+++ b/alr_envs/alr/__init__.py
@@ -1,499 +0,0 @@
-import numpy as np
-from gym import register
-
-from . import classic_control, mujoco
-from .classic_control.hole_reacher.hole_reacher import HoleReacherEnv
-from .classic_control.simple_reacher.simple_reacher import SimpleReacherEnv
-from .classic_control.viapoint_reacher.viapoint_reacher import ViaPointReacherEnv
-from .mujoco.ball_in_a_cup.ball_in_a_cup import ALRBallInACupEnv
-from .mujoco.ball_in_a_cup.biac_pd import ALRBallInACupPDEnv
-from .mujoco.reacher.alr_reacher import ALRReacherEnv
-from .mujoco.reacher.balancing import BalancingEnv
-
-from .mujoco.table_tennis.tt_gym import MAX_EPISODE_STEPS
-
-ALL_ALR_MOTION_PRIMITIVE_ENVIRONMENTS = {"DMP": [], "ProMP": []}
-
-# Classic Control
-## Simple Reacher
-register(
-    id='SimpleReacher-v0',
-    entry_point='alr_envs.alr.classic_control:SimpleReacherEnv',
-    max_episode_steps=200,
-    kwargs={
-        "n_links": 2,
-    }
-)
-
-register(
-    id='SimpleReacher-v1',
-    entry_point='alr_envs.alr.classic_control:SimpleReacherEnv',
-    max_episode_steps=200,
-    kwargs={
-        "n_links": 2,
-        "random_start": False
-    }
-)
-
-register(
-    id='LongSimpleReacher-v0',
-    entry_point='alr_envs.alr.classic_control:SimpleReacherEnv',
-    max_episode_steps=200,
-    kwargs={
-        "n_links": 5,
-    }
-)
-
-register(
-    id='LongSimpleReacher-v1',
-    entry_point='alr_envs.alr.classic_control:SimpleReacherEnv',
-    max_episode_steps=200,
-    kwargs={
-        "n_links": 5,
-        "random_start": False
-    }
-)
-
-## Viapoint Reacher
-
-register(
-    id='ViaPointReacher-v0',
-    entry_point='alr_envs.alr.classic_control:ViaPointReacherEnv',
-    max_episode_steps=200,
-    kwargs={
-        "n_links": 5,
-        "allow_self_collision": False,
-        "collision_penalty": 1000
-    }
-)
-
-## Hole Reacher
-register(
-    id='HoleReacher-v0',
-    entry_point='alr_envs.alr.classic_control:HoleReacherEnv',
-    max_episode_steps=200,
-    kwargs={
-        "n_links": 5,
-        "random_start": True,
-        "allow_self_collision": False,
-        "allow_wall_collision": False,
-        "hole_width": None,
-        "hole_depth": 1,
-        "hole_x": None,
-        "collision_penalty": 100,
-    }
-)
-
-register(
-    id='HoleReacher-v1',
-    entry_point='alr_envs.alr.classic_control:HoleReacherEnv',
-    max_episode_steps=200,
-    kwargs={
-        "n_links": 5,
-        "random_start": False,
-        "allow_self_collision": False,
-        "allow_wall_collision": False,
-        "hole_width": 0.25,
-        "hole_depth": 1,
-        "hole_x": None,
-        "collision_penalty": 100,
-    }
-)
-
-register(
-    id='HoleReacher-v2',
-    entry_point='alr_envs.alr.classic_control:HoleReacherEnv',
-    max_episode_steps=200,
-    kwargs={
-        "n_links": 5,
-        "random_start": False,
-        "allow_self_collision": False,
-        "allow_wall_collision": False,
-        "hole_width": 0.25,
-        "hole_depth": 1,
-        "hole_x": 2,
-        "collision_penalty": 1,
-    }
-)
-
-# Mujoco
-
-## Reacher
-register(
-    id='ALRReacher-v0',
-    entry_point='alr_envs.alr.mujoco:ALRReacherEnv',
-    max_episode_steps=200,
-    kwargs={
-        "steps_before_reward": 0,
-        "n_links": 5,
-        "balance": False,
-    }
-)
-
-register(
-    id='ALRReacherSparse-v0',
-    entry_point='alr_envs.alr.mujoco:ALRReacherEnv',
-    max_episode_steps=200,
-    kwargs={
-        "steps_before_reward": 200,
-        "n_links": 5,
-        "balance": False,
-    }
-)
-
-register(
-    id='ALRReacherSparseBalanced-v0',
-    entry_point='alr_envs.alr.mujoco:ALRReacherEnv',
-    max_episode_steps=200,
-    kwargs={
-        "steps_before_reward": 200,
-        "n_links": 5,
-        "balance": True,
-    }
-)
-
-register(
-    id='ALRLongReacher-v0',
-    entry_point='alr_envs.alr.mujoco:ALRReacherEnv',
-    max_episode_steps=200,
-    kwargs={
-        "steps_before_reward": 0,
-        "n_links": 7,
-        "balance": False,
-    }
-)
-
-register(
-    id='ALRLongReacherSparse-v0',
-    entry_point='alr_envs.alr.mujoco:ALRReacherEnv',
-    max_episode_steps=200,
-    kwargs={
-        "steps_before_reward": 200,
-        "n_links": 7,
-        "balance": False,
-    }
-)
-
-register(
-    id='ALRLongReacherSparseBalanced-v0',
-    entry_point='alr_envs.alr.mujoco:ALRReacherEnv',
-    max_episode_steps=200,
-    kwargs={
-        "steps_before_reward": 200,
-        "n_links": 7,
-        "balance": True,
-    }
-)
-
-## Balancing Reacher
-
-register(
-    id='Balancing-v0',
-    entry_point='alr_envs.alr.mujoco:BalancingEnv',
-    max_episode_steps=200,
-    kwargs={
-        "n_links": 5,
-    }
-)
-
-## Table Tennis
-register(id='TableTennis2DCtxt-v0',
-         entry_point='alr_envs.alr.mujoco:TTEnvGym',
-         max_episode_steps=MAX_EPISODE_STEPS,
-         kwargs={'ctxt_dim': 2})
-
-register(id='TableTennis2DCtxt-v1',
-         entry_point='alr_envs.alr.mujoco:TTEnvGym',
-         max_episode_steps=MAX_EPISODE_STEPS,
-         kwargs={'ctxt_dim': 2, 'fixed_goal': True})
-
-register(id='TableTennis4DCtxt-v0',
-         entry_point='alr_envs.alr.mujoco:TTEnvGym',
-         max_episode_steps=MAX_EPISODE_STEPS,
-         kwargs={'ctxt_dim': 4})
-
-## BeerPong
-difficulties = ["simple", "intermediate", "hard", "hardest"]
-
-for v, difficulty in enumerate(difficulties):
-    register(
-        id='ALRBeerPong-v{}'.format(v),
-        entry_point='alr_envs.alr.mujoco:ALRBeerBongEnv',
-        max_episode_steps=600,
-        kwargs={
-            "difficulty": difficulty,
-            "reward_type": "staged",
-        }
-    )
-
-# Motion Primitive Environments
-
-## Simple Reacher
-_versions = ["SimpleReacher-v0", "SimpleReacher-v1", "LongSimpleReacher-v0", "LongSimpleReacher-v1"]
-for _v in _versions:
-    _name = _v.split("-")
-    _env_id = f'{_name[0]}DMP-{_name[1]}'
-    register(
-        id=_env_id,
-        entry_point='alr_envs.utils.make_env_helpers:make_dmp_env_helper',
-        # max_episode_steps=1,
-        kwargs={
-            "name": f"alr_envs:{_v}",
-            "wrappers": [classic_control.simple_reacher.MPWrapper],
-            "mp_kwargs": {
-                "num_dof": 2 if "long" not in _v.lower() else 5,
-                "num_basis": 5,
-                "duration": 2,
-                "alpha_phase": 2,
-                "learn_goal": True,
-                "policy_type": "motor",
-                "weights_scale": 50,
-                "policy_kwargs": {
-                    "p_gains": .6,
-                    "d_gains": .075
-                }
-            }
-        }
-    )
-    ALL_ALR_MOTION_PRIMITIVE_ENVIRONMENTS["DMP"].append(_env_id)
-
-    _env_id = f'{_name[0]}ProMP-{_name[1]}'
-    register(
-        id=_env_id,
-        entry_point='alr_envs.utils.make_env_helpers:make_promp_env_helper',
-        kwargs={
-            "name": f"alr_envs:{_v}",
-            "wrappers": [classic_control.simple_reacher.MPWrapper],
-            "mp_kwargs": {
-                "num_dof": 2 if "long" not in _v.lower() else 5,
-                "num_basis": 5,
-                "duration": 2,
-                "policy_type": "motor",
-                "weights_scale": 1,
-                "zero_start": True,
-                "policy_kwargs": {
-                    "p_gains": .6,
-                    "d_gains": .075
-                }
-            }
-        }
-    )
-    ALL_ALR_MOTION_PRIMITIVE_ENVIRONMENTS["ProMP"].append(_env_id)
-
-# Viapoint reacher
-register(
-    id='ViaPointReacherDMP-v0',
-    entry_point='alr_envs.utils.make_env_helpers:make_dmp_env_helper',
-    # max_episode_steps=1,
-    kwargs={
-        "name": "alr_envs:ViaPointReacher-v0",
-        "wrappers": [classic_control.viapoint_reacher.MPWrapper],
-        "mp_kwargs": {
-            "num_dof": 5,
-            "num_basis": 5,
-            "duration": 2,
-            "learn_goal": True,
-            "alpha_phase": 2,
-            "policy_type": "velocity",
-            "weights_scale": 50,
-        }
-    }
-)
-ALL_ALR_MOTION_PRIMITIVE_ENVIRONMENTS["DMP"].append("ViaPointReacherDMP-v0")
-
-register(
-    id="ViaPointReacherProMP-v0",
-    entry_point='alr_envs.utils.make_env_helpers:make_promp_env_helper',
-    kwargs={
-        "name": f"alr_envs:ViaPointReacher-v0",
-        "wrappers": [classic_control.viapoint_reacher.MPWrapper],
-        "mp_kwargs": {
-            "num_dof": 5,
-            "num_basis": 5,
-            "duration": 2,
-            "policy_type": "velocity",
-            "weights_scale": 1,
-            "zero_start": True
-        }
-    }
-)
-ALL_ALR_MOTION_PRIMITIVE_ENVIRONMENTS["ProMP"].append("ViaPointReacherProMP-v0")
-
-## Hole Reacher
-_versions = ["v0", "v1", "v2"]
-for _v in _versions:
-    _env_id = f'HoleReacherDMP-{_v}'
-    register(
-        id=_env_id,
-        entry_point='alr_envs.utils.make_env_helpers:make_dmp_env_helper',
-        # max_episode_steps=1,
-        kwargs={
-            "name": f"alr_envs:HoleReacher-{_v}",
-            "wrappers": [classic_control.hole_reacher.MPWrapper],
-            "mp_kwargs": {
-                "num_dof": 5,
-                "num_basis": 5,
-                "duration": 2,
-                "learn_goal": True,
-                "alpha_phase": 2.5,
-                "bandwidth_factor": 2,
-                "policy_type": "velocity",
-                "weights_scale": 50,
-                "goal_scale": 0.1
-            }
-        }
-    )
-    ALL_ALR_MOTION_PRIMITIVE_ENVIRONMENTS["DMP"].append(_env_id)
-
-    _env_id = f'HoleReacherProMP-{_v}'
-    register(
-        id=_env_id,
-        entry_point='alr_envs.utils.make_env_helpers:make_promp_env_helper',
-        kwargs={
-            "name": f"alr_envs:HoleReacher-{_v}",
-            "wrappers": [classic_control.hole_reacher.MPWrapper],
-            "mp_kwargs": {
-                "num_dof": 5,
-                "num_basis": 5,
-                "duration": 2,
-                "policy_type": "velocity",
-                "weights_scale": 5,
-                "zero_start": True
-            }
-        }
-    )
-    ALL_ALR_MOTION_PRIMITIVE_ENVIRONMENTS["ProMP"].append(_env_id)
-
-## ALRReacher
-_versions = ["ALRReacher-v0", "ALRLongReacher-v0", "ALRReacherSparse-v0", "ALRLongReacherSparse-v0"]
-for _v in _versions:
-    _name = _v.split("-")
-    _env_id = f'{_name[0]}DMP-{_name[1]}'
-    register(
-        id=_env_id,
-        entry_point='alr_envs.utils.make_env_helpers:make_dmp_env_helper',
-        # max_episode_steps=1,
-        kwargs={
-            "name": f"alr_envs:{_v}",
-            "wrappers": [mujoco.reacher.MPWrapper],
-            "mp_kwargs": {
-                "num_dof": 5 if "long" not in _v.lower() else 7,
-                "num_basis": 2,
-                "duration": 4,
-                "alpha_phase": 2,
-                "learn_goal": True,
-                "policy_type": "motor",
-                "weights_scale": 5,
-                "policy_kwargs": {
-                    "p_gains": 1,
-                    "d_gains": 0.1
-                }
-            }
-        }
-    )
-    ALL_ALR_MOTION_PRIMITIVE_ENVIRONMENTS["DMP"].append(_env_id)
-
-    _env_id = f'{_name[0]}ProMP-{_name[1]}'
-    register(
-        id=_env_id,
-        entry_point='alr_envs.utils.make_env_helpers:make_promp_env_helper',
-        kwargs={
-            "name": f"alr_envs:{_v}",
-            "wrappers": [mujoco.reacher.MPWrapper],
-            "mp_kwargs": {
-                "num_dof": 5 if "long" not in _v.lower() else 7,
-                "num_basis": 1,
-                "duration": 4,
-                "policy_type": "motor",
-                "weights_scale": 5,
-                "zero_start": True,
-                "policy_kwargs": {
-                    "p_gains": 1,
-                    "d_gains": 0.1
-                }
-            }
-        }
-    )
-    ALL_ALR_MOTION_PRIMITIVE_ENVIRONMENTS["ProMP"].append(_env_id)
-
-## Beerpong
-_versions = ["v0", "v1", "v2", "v3"]
-for _v in _versions:
-    _env_id = f'BeerpongProMP-{_v}'
-    register(
-        id=_env_id,
-        entry_point='alr_envs.utils.make_env_helpers:make_promp_env_helper',
-        kwargs={
-            "name": f"alr_envs:ALRBeerPong-{_v}",
-            "wrappers": [mujoco.beerpong.MPWrapper],
-            "mp_kwargs": {
-                "num_dof": 7,
-                "num_basis": 2,
-                "duration": 1,
-                "post_traj_time": 2,
-                "policy_type": "motor",
-                "weights_scale": 1,
-                "zero_start": True,
-                "zero_goal": False,
-                "policy_kwargs": {
-                    "p_gains": np.array([       1.5,   5,   2.55,    3,   2.,    2,   1.25]),
-                    "d_gains": np.array([0.02333333, 0.1, 0.0625, 0.08, 0.03, 0.03, 0.0125])
-                }
-            }
-        }
-    )
-    ALL_ALR_MOTION_PRIMITIVE_ENVIRONMENTS["ProMP"].append(_env_id)
-
-## Table Tennis
-ctxt_dim = [2, 4]
-for _v, cd in enumerate(ctxt_dim):
-    _env_id = f'TableTennisProMP-v{_v}'
-    register(
-        id=_env_id,
-        entry_point='alr_envs.utils.make_env_helpers:make_promp_env_helper',
-        kwargs={
-            "name": "alr_envs:TableTennis{}DCtxt-v0".format(cd),
-            "wrappers": [mujoco.table_tennis.MPWrapper],
-            "mp_kwargs": {
-                "num_dof": 7,
-                "num_basis": 2,
-                "duration": 1.25,
-                "post_traj_time": 4.5,
-                "policy_type": "motor",
-                "weights_scale": 1.0,
-                "zero_start": True,
-                "zero_goal": False,
-                "policy_kwargs": {
-                    "p_gains": 0.5*np.array([1.0, 4.0, 2.0, 4.0, 1.0, 4.0, 1.0]),
-                    "d_gains": 0.5*np.array([0.1, 0.4, 0.2, 0.4, 0.1, 0.4, 0.1])
-                }
-            }
-        }
-    )
-    ALL_ALR_MOTION_PRIMITIVE_ENVIRONMENTS["ProMP"].append(_env_id)
-
-register(
-    id='TableTennisProMP-v2',
-    entry_point='alr_envs.utils.make_env_helpers:make_promp_env_helper',
-    kwargs={
-        "name": "alr_envs:TableTennis2DCtxt-v1",
-        "wrappers": [mujoco.table_tennis.MPWrapper],
-        "mp_kwargs": {
-            "num_dof": 7,
-            "num_basis": 2,
-            "duration": 1.,
-            "post_traj_time": 2.5,
-            "policy_type": "motor",
-            "weights_scale": 1,
-            "off": -0.05,
-            "bandwidth_factor": 3.5,
-            "zero_start": True,
-            "zero_goal": False,
-            "policy_kwargs": {
-                "p_gains": 0.5*np.array([1.0, 4.0, 2.0, 4.0, 1.0, 4.0, 1.0]),
-                "d_gains": 0.5*np.array([0.1, 0.4, 0.2, 0.4, 0.1, 0.4, 0.1])
-            }
-        }
-    }
-)
-ALL_ALR_MOTION_PRIMITIVE_ENVIRONMENTS["ProMP"].append("TableTennisProMP-v2")
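
Most of the removed registration loops derive the MP environment id from the step-based id by inserting the primitive type in front of the version suffix, and the simple reacher loops pick `num_dof` from the id string. The sketch below restates that convention in isolation; the function names `mp_env_id` and `simple_reacher_num_dof` are hypothetical and do not exist in the package.

```python
def mp_env_id(base_id: str, mp_type: str) -> str:
    """Turn e.g. 'SimpleReacher-v0' into 'SimpleReacherDMP-v0'."""
    # Split on the last hyphen only, so names containing hyphens stay intact.
    name, version = base_id.rsplit("-", 1)
    return f"{name}{mp_type}-{version}"


def simple_reacher_num_dof(base_id: str) -> int:
    """Same rule as the removed loop: 5 links for the 'Long' variants, else 2."""
    return 2 if "long" not in base_id.lower() else 5


dmp_id = mp_env_id("SimpleReacher-v0", "DMP")
promp_id = mp_env_id("ALRLongReacherSparse-v0", "ProMP")
```

Not every registration follows this pattern: the Beerpong ids, for example, are built as `BeerpongProMP-v*` from base ids named `ALRBeerPong-v*`.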
--- a/alr_envs/alr/classic_control/hole_reacher/mp_wrapper.py
+++ b/alr_envs/alr/classic_control/hole_reacher/mp_wrapper.py
@ -1,43 +0,0 @@
-from typing import Tuple, Union
-
-import numpy as np
-
-from mp_env_api import MPEnvWrapper
-
-
-class MPWrapper(MPEnvWrapper):
-    @property
-    def active_obs(self):
-        return np.hstack([
-            [self.env.random_start] * self.env.n_links,  # cos
-            [self.env.random_start] * self.env.n_links,  # sin
-            [self.env.random_start] * self.env.n_links,  # velocity
-            [self.env.initial_width is None],  # hole width
-            # [self.env.hole_depth is None],  # hole depth
-            [True] * 2,  # x-y coordinates of target distance
-            [False]  # env steps
-        ])
-
-    # @property
-    # def current_pos(self) -> Union[float, int, np.ndarray, Tuple]:
-    #     return self._joint_angles.copy()
-    #
-    # @property
-    # def current_vel(self) -> Union[float, int, np.ndarray, Tuple]:
-    #     return self._angle_velocity.copy()
-
-    @property
-    def current_pos(self) -> Union[float, int, np.ndarray, Tuple]:
-        return self.env.current_pos
-
-    @property
-    def current_vel(self) -> Union[float, int, np.ndarray, Tuple]:
-        return self.env.current_vel
-
-    @property
-    def goal_pos(self) -> Union[float, int, np.ndarray, Tuple]:
-        raise ValueError("Goal position is not available and has to be learnt based on the environment.")
-
-    @property
-    def dt(self) -> Union[float, int]:
-        return self.env.dt
--- a/alr_envs/alr/mujoco/__init__.py
+++ b/alr_envs/alr/mujoco/__init__.py
@ -1,6 +0,0 @@
-from .reacher.alr_reacher import ALRReacherEnv
-from .reacher.balancing import BalancingEnv
-from .ball_in_a_cup.ball_in_a_cup import ALRBallInACupEnv
-from .ball_in_a_cup.biac_pd import ALRBallInACupPDEnv
-from .table_tennis.tt_gym import TTEnvGym
-from .beerpong.beerpong import ALRBeerBongEnv
--- a/alr_envs/alr/mujoco/alr_reward_fct.py
+++ b/alr_envs/alr/mujoco/alr_reward_fct.py
@ -1,21 +0,0 @@
-class AlrReward:
-    """
-    A base class for non-Markovian reward functions which may need trajectory information to calculate an episodic
-    reward. Call the methods in reset() and step() of the environment.
-    """
-
-    # methods to override:
-    # ----------------------------
-    def reset(self, *args, **kwargs):
-        """
-        Reset the reward function, empty state buffers before an episode, set contexts that influence reward, etc.
-        """
-        raise NotImplementedError
-
-    def compute_reward(self, *args, **kwargs):
-        """
-
-        Returns: Useful things to return are reward values, success flags or crash flags
-
-        """
-        raise NotImplementedError
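Reviewer note: the removed `AlrReward` base class only defines the `reset()` / `compute_reward()` contract. For readers who have not seen a concrete subclass, here is a minimal sketch of how one might look. `MinDistanceReward`, its `goal` argument, the `is_last_step` flag, and the 0.05 success threshold are all hypothetical and are not taken from this repository; the base class is repeated only so the snippet runs on its own.

```python
import math


class AlrReward:
    """Base class, repeated from the removed file so the sketch is self-contained."""

    def reset(self, *args, **kwargs):
        raise NotImplementedError

    def compute_reward(self, *args, **kwargs):
        raise NotImplementedError


class MinDistanceReward(AlrReward):
    """Hypothetical episodic reward: negative closest distance to a goal."""

    def __init__(self, goal):
        self.goal = tuple(goal)
        self.positions = []

    def reset(self, *args, **kwargs):
        # Empty the trajectory buffer before each episode.
        self.positions = []

    def compute_reward(self, position, is_last_step=False):
        # Meant to be called from the environment's step(); buffers the trajectory.
        self.positions.append(tuple(position))
        if not is_last_step:
            return 0.0, False
        # Non-Markovian part: the reward depends on the whole trajectory.
        closest = min(math.dist(p, self.goal) for p in self.positions)
        success = closest < 0.05  # assumed threshold, for illustration only
        return -closest, success


reward_fn = MinDistanceReward(goal=(1.0, 0.0))
reward_fn.reset()
trajectory = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (1.5, 0.0)]
for i, pos in enumerate(trajectory):
    last = i == len(trajectory) - 1
    reward, success = reward_fn.compute_reward(pos, is_last_step=last)
```

The reward stays at zero until the final step, where it is computed from the buffered trajectory; this is the kind of episodic signal the docstring describes, returned together with a success flag.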
--- a/alr_envs/alr/mujoco/ball_in_a_cup/assets/biac_base.xml
+++ b/alr_envs/alr/mujoco/ball_in_a_cup/assets/biac_base.xml
@ -1,361 +0,0 @@
-<mujoco model="wam(v1.31)">
-    <compiler angle="radian" meshdir="../../meshes/wam/" />
-    <option timestep="0.0005" integrator="Euler" />
-    <size njmax="500" nconmax="100" />
-    <default class="main">
-        <joint limited="true" frictionloss="0.001" />
-        <default class="viz">
-            <geom type="mesh" contype="0" conaffinity="0" group="1" rgba="0.7 0.7 0.7 1" />
-        </default>
-        <default class="col">
-            <geom type="mesh" contype="0" rgba="0.5 0.6 0.7 1" />
-        </default>
-    </default>
-    <asset>
-        <texture type="2d" name="groundplane" builtin="checker" mark="edge" rgb1="0.25 0.26 0.25" rgb2="0.22 0.22 0.22" markrgb="0.3 0.3 0.3" width="100" height="100" />
-        <material name="MatGnd" texture="groundplane" texrepeat="5 5" specular="1" shininess="0.3" reflectance="1e-05" />
-        <mesh name="base_link_fine" file="base_link_fine.stl" />
-        <mesh name="base_link_convex" file="base_link_convex.stl" />
-        <mesh name="shoulder_link_fine" file="shoulder_link_fine.stl" />
-        <mesh name="shoulder_link_convex_decomposition_p1" file="shoulder_link_convex_decomposition_p1.stl" />
-        <mesh name="shoulder_link_convex_decomposition_p2" file="shoulder_link_convex_decomposition_p2.stl" />
-        <mesh name="shoulder_link_convex_decomposition_p3" file="shoulder_link_convex_decomposition_p3.stl" />
-        <mesh name="shoulder_pitch_link_fine" file="shoulder_pitch_link_fine.stl" />
-        <mesh name="shoulder_pitch_link_convex" file="shoulder_pitch_link_convex.stl" />
-        <mesh name="upper_arm_link_fine" file="upper_arm_link_fine.stl" />
-        <mesh name="upper_arm_link_convex_decomposition_p1" file="upper_arm_link_convex_decomposition_p1.stl" />
-        <mesh name="upper_arm_link_convex_decomposition_p2" file="upper_arm_link_convex_decomposition_p2.stl" />
-        <mesh name="elbow_link_fine" file="elbow_link_fine.stl" />
-        <mesh name="elbow_link_convex" file="elbow_link_convex.stl" />
-        <mesh name="forearm_link_fine" file="forearm_link_fine.stl" />
-        <mesh name="forearm_link_convex_decomposition_p1" file="forearm_link_convex_decomposition_p1.stl" />
-        <mesh name="forearm_link_convex_decomposition_p2" file="forearm_link_convex_decomposition_p2.stl" />
-        <mesh name="wrist_yaw_link_fine" file="wrist_yaw_link_fine.stl" />
-        <mesh name="wrist_yaw_link_convex_decomposition_p1" file="wrist_yaw_link_convex_decomposition_p1.stl" />
-        <mesh name="wrist_yaw_link_convex_decomposition_p2" file="wrist_yaw_link_convex_decomposition_p2.stl" />
-        <mesh name="wrist_pitch_link_fine" file="wrist_pitch_link_fine.stl" />
-        <mesh name="wrist_pitch_link_convex_decomposition_p1" file="wrist_pitch_link_convex_decomposition_p1.stl" />
-        <mesh name="wrist_pitch_link_convex_decomposition_p2" file="wrist_pitch_link_convex_decomposition_p2.stl" />
-        <mesh name="wrist_pitch_link_convex_decomposition_p3" file="wrist_pitch_link_convex_decomposition_p3.stl" />
-        <mesh name="wrist_palm_link_fine" file="wrist_palm_link_fine.stl" />
-        <mesh name="wrist_palm_link_convex" file="wrist_palm_link_convex.stl" />
-        <mesh name="cup1" file="cup_split1.stl" scale="0.001 0.001 0.001" />
-        <mesh name="cup2" file="cup_split2.stl" scale="0.001 0.001 0.001" />
-        <mesh name="cup3" file="cup_split3.stl" scale="0.001 0.001 0.001" />
-        <mesh name="cup4" file="cup_split4.stl" scale="0.001 0.001 0.001" />
-        <mesh name="cup5" file="cup_split5.stl" scale="0.001 0.001 0.001" />
-        <mesh name="cup6" file="cup_split6.stl" scale="0.001 0.001 0.001" />
-        <mesh name="cup7" file="cup_split7.stl" scale="0.001 0.001 0.001" />
-        <mesh name="cup8" file="cup_split8.stl" scale="0.001 0.001 0.001" />
-        <mesh name="cup9" file="cup_split9.stl" scale="0.001 0.001 0.001" />
-        <mesh name="cup10" file="cup_split10.stl" scale="0.001 0.001 0.001" />
-        <mesh name="cup11" file="cup_split11.stl" scale="0.001 0.001 0.001" />
-        <mesh name="cup12" file="cup_split12.stl" scale="0.001 0.001 0.001" />
-        <mesh name="cup13" file="cup_split13.stl" scale="0.001 0.001 0.001" />
-        <mesh name="cup14" file="cup_split14.stl" scale="0.001 0.001 0.001" />
-        <mesh name="cup15" file="cup_split15.stl" scale="0.001 0.001 0.001" />
-        <mesh name="cup16" file="cup_split16.stl" scale="0.001 0.001 0.001" />
-        <mesh name="cup17" file="cup_split17.stl" scale="0.001 0.001 0.001" />
-        <mesh name="cup18" file="cup_split18.stl" scale="0.001 0.001 0.001" />
-    </asset>
-
-    <worldbody>
-        <geom name="ground" size="1.5 2 1" type="plane" material="MatGnd" />
-        <light pos="0.1 0.2 1.3" dir="-0.0758098 -0.32162 -0.985527" directional="true" cutoff="60" exponent="1" diffuse="1 1 1" specular="0.1 0.1 0.1" />
-
-        <body name="wam/base_link" pos="0 0 0.6">
-            <inertial pos="6.93764e-06 0.0542887 0.076438" quat="0.496481 0.503509 -0.503703 0.496255" mass="27.5544" diaginertia="0.432537 0.318732 0.219528" />
-            <geom class="viz" quat="0.707107 0 0 -0.707107" mesh="base_link_fine" />
-            <geom class="col" quat="0.707107 0 0 -0.707107" mesh="base_link_convex" />
-            <body name="wam/shoulder_yaw_link" pos="0 0 0.16" quat="0.707107 0 0 -0.707107">
-                <inertial pos="-0.00443422 -0.00066489 -0.12189" quat="0.999995 0.000984795 0.00270132 0.00136071" mass="10.7677" diaginertia="0.507411 0.462983 0.113271" />
-                <joint name="wam/base_yaw_joint" pos="0 0 0" axis="0 0 1" range="-2.6 2.6" />
-                <geom class="viz" pos="0 0 0.186" mesh="shoulder_link_fine" />
-                <geom class="col" pos="0 0 0.186" mesh="shoulder_link_convex_decomposition_p1" />
-                <geom class="col" pos="0 0 0.186" mesh="shoulder_link_convex_decomposition_p2" />
-                <geom class="col" pos="0 0 0.186" mesh="shoulder_link_convex_decomposition_p3" />
-                <body name="wam/shoulder_pitch_link" pos="0 0 0.184" quat="0.707107 -0.707107 0 0">
-                    <inertial pos="-0.00236983 -0.0154211 0.0310561" quat="0.961781 -0.272983 0.0167269 0.0133385" mass="3.87494" diaginertia="0.0214207 0.0167101 0.0126465" />
-                    <joint name="wam/shoulder_pitch_joint" pos="0 0 0" axis="0 0 1" range="-1.985 1.985" />
-                    <geom class="viz" mesh="shoulder_pitch_link_fine" />
-                    <geom class="col" mesh="shoulder_pitch_link_convex" />
-                    <body name="wam/upper_arm_link" pos="0 -0.505 0" quat="0.707107 0.707107 0 0">
-                        <inertial pos="-0.0382586 3.309e-05 -0.207508" quat="0.705455 0.0381914 0.0383402 0.706686" mass="1.80228" diaginertia="0.0665697 0.0634285 0.00622701" />
-                        <joint name="wam/shoulder_yaw_joint" pos="0 0 0" axis="0 0 1" range="-2.8 2.8" />
-                        <geom class="viz" pos="0 0 -0.505" mesh="upper_arm_link_fine" />
-                        <geom class="col" pos="0 0 -0.505" mesh="upper_arm_link_convex_decomposition_p1" />
-                        <geom class="col" pos="0 0 -0.505" mesh="upper_arm_link_convex_decomposition_p2" />
-                        <body name="wam/forearm_link" pos="0.045 0 0.045" quat="0.707107 -0.707107 0 0">
-                            <inertial pos="0.00498512 -0.132717 -0.00022942" quat="0.546303 0.447151 -0.548676 0.447842" mass="2.40017" diaginertia="0.0196896 0.0152225 0.00749914" />
-                            <joint name="wam/elbow_pitch_joint" pos="0 0 0" axis="0 0 1" range="-0.9 3.14159" />
-                            <geom class="viz" mesh="elbow_link_fine" />
-                            <geom class="col" mesh="elbow_link_convex" />
-                            <geom class="viz" pos="-0.045 -0.073 0" quat="0.707388 0.706825 0 0" mesh="forearm_link_fine" />
-                            <geom class="col" pos="-0.045 -0.073 0" quat="0.707388 0.706825 0 0" mesh="forearm_link_convex_decomposition_p1" name="forearm_link_convex_decomposition_p1_geom" />
-                            <geom class="col" pos="-0.045 -0.073 0" quat="0.707388 0.706825 0 0" mesh="forearm_link_convex_decomposition_p2" name="forearm_link_convex_decomposition_p2_geom" />
-                            <body name="wam/wrist_yaw_link" pos="-0.045 0 0" quat="0.707107 0.707107 0 0">
-                                <inertial pos="8.921e-05 0.00435824 -0.00511217" quat="0.708528 -0.000120667 0.000107481 0.705683" mass="0.12376" diaginertia="0.0112011 0.0111887 7.58188e-05" />
-                                <joint name="wam/wrist_yaw_joint" pos="0 0 0" axis="0 0 1" range="-4.55 1.25" />
-                                <geom class="viz" pos="0 0 0.3" mesh="wrist_yaw_link_fine" />
-                                <geom class="col" pos="0 0 0.3" mesh="wrist_yaw_link_convex_decomposition_p1" name="wrist_yaw_link_convex_decomposition_p1_geom" />
-                                <geom class="col" pos="0 0 0.3" mesh="wrist_yaw_link_convex_decomposition_p2" name="wrist_yaw_link_convex_decomposition_p2_geom" />
-                                <body name="wam/wrist_pitch_link" pos="0 0 0.3" quat="0.707107 -0.707107 0 0">
-                                    <inertial pos="-0.00012262 -0.0246834 -0.0170319" quat="0.994687 -0.102891 0.000824211 -0.00336105" mass="0.417974" diaginertia="0.000555166 0.000463174 0.00023407" />
-                                    <joint name="wam/wrist_pitch_joint" pos="0 0 0" axis="0 0 1" range="-1.5707 1.5707" />
-                                    <geom class="viz" mesh="wrist_pitch_link_fine" />
-                                    <geom class="col" mesh="wrist_pitch_link_convex_decomposition_p1" name="wrist_pitch_link_convex_decomposition_p1_geom" />
-                                    <geom class="col" mesh="wrist_pitch_link_convex_decomposition_p2" name="wrist_pitch_link_convex_decomposition_p2_geom" />
-                                    <geom class="col" mesh="wrist_pitch_link_convex_decomposition_p3" name="wrist_pitch_link_convex_decomposition_p3_geom" />
-                                    <body name="wam/wrist_palm_link" pos="0 -0.06 0" quat="0.707107 0.707107 0 0">
-                                        <inertial pos="-7.974e-05 -0.00323552 -0.00016313" quat="0.594752 0.382453 0.382453 0.594752" mass="0.0686475" diaginertia="7.408e-05 3.81466e-05 3.76434e-05" />
-                                        <joint name="wam/palm_yaw_joint" pos="0 0 0" axis="0 0 1" range="-2.7 2.7" />
-                                        <geom class="viz" pos="0 0 -0.06" mesh="wrist_palm_link_fine" />
-                                        <geom class="col" pos="0 0 -0.06" mesh="wrist_palm_link_convex" name="wrist_palm_link_convex_geom" />
-                                        <body name="cup" pos="0 0 0" quat="-0.000203673 0 0 1">
-                                            <inertial pos="-3.75236e-10 8.27811e-05 0.0947015" quat="0.999945 -0.0104888 0 0" mass="0.132" diaginertia="0.000285643 0.000270485 9.65696e-05" />
-                                            <geom name="cup_geom1" pos="0 0.05 0.055" euler="-1.57 0 0" solref="-10000 -100" type="mesh" mesh="cup1" />
-                                            <geom name="cup_geom2" pos="0 0.05 0.055" euler="-1.57 0 0" solref="-10000 -100" type="mesh" mesh="cup2" />
-                                            <geom name="cup_geom3" pos="0 0.05 0.055" euler="-1.57 0 0" solref="-10000 -100" type="mesh" mesh="cup3" />
-                                            <geom name="cup_geom4" pos="0 0.05 0.055" euler="-1.57 0 0" solref="-10000 -100" type="mesh" mesh="cup4" />
-                                            <geom name="cup_geom5" pos="0 0.05 0.055" euler="-1.57 0 0" solref="-10000 -100" type="mesh" mesh="cup5" />
-                                            <geom name="cup_geom6" pos="0 0.05 0.055" euler="-1.57 0 0" solref="-10000 -100" type="mesh" mesh="cup6" />
-                                            <geom name="cup_geom7" pos="0 0.05 0.055" euler="-1.57 0 0" solref="-10000 -100" type="mesh" mesh="cup7" />
-                                            <geom name="cup_geom8" pos="0 0.05 0.055" euler="-1.57 0 0" solref="-10000 -100" type="mesh" mesh="cup8" />
-                                            <geom name="cup_geom9" pos="0 0.05 0.055" euler="-1.57 0 0" solref="-10000 -100" type="mesh" mesh="cup9" />
-                                            <geom name="cup_geom10" pos="0 0.05 0.055" euler="-1.57 0 0" solref="-10000 -100" type="mesh" mesh="cup10" />
-                                            <geom name="cup_base" pos="0 -0.035 0.1165" euler="-1.57 0 0" type="cylinder" size="0.038 0.0045" solref="-10000 -100"/>
-<!--                                            <geom name="cup_base_contact" pos="0 -0.025 0.1165" euler="-1.57 0 0" type="cylinder" size="0.03 0.0005" solref="-10000 -100" rgba="0 0 255 1"/>-->
-                                            <geom name="cup_base_contact" pos="0 -0.005 0.1165" euler="-1.57 0 0" type="cylinder" size="0.02 0.0005" solref="-10000 -100" rgba="0 0 255 1"/>
-                                            <geom name="cup_base_contact_below" pos="0 -0.04 0.1165" euler="-1.57 0 0" type="cylinder" size="0.035 0.001" solref="-10000 -100" rgba="255 0 255 1"/>
-<!--                                            <geom name="cup_geom11" pos="0 0.05 0.055" euler="-1.57 0 0" solref="-10000 -100" type="mesh" mesh="cup11" />-->
-<!--                                            <geom name="cup_geom12" pos="0 0.05 0.055" euler="-1.57 0 0" solref="-10000 -100" type="mesh" mesh="cup12" />-->
-<!--                                            <geom name="cup_geom13" pos="0 0.05 0.055" euler="-1.57 0 0" solref="-10000 -100" type="mesh" mesh="cup13" />-->
-<!--                                            <geom name="cup_geom14" pos="0 0.05 0.055" euler="-1.57 0 0" solref="-10000 -100" type="mesh" mesh="cup14" />-->
-
-                                            <geom name="cup_geom15" pos="0 0.05 0.055" euler="-1.57 0 0" solref="-10000 -100" type="mesh" mesh="cup15" />
-                                            <geom name="cup_geom16" pos="0 0.05 0.055" euler="-1.57 0 0" solref="-10000 -100" type="mesh" mesh="cup16" />
-                                            <geom name="cup_geom17" pos="0 0.05 0.055" euler="-1.57 0 0" solref="-10000 -100" type="mesh" mesh="cup17" />
-                                            <geom name="cup_geom18" pos="0 0.05 0.055" euler="-1.57 0 0" solref="-10000 -100" type="mesh" mesh="cup18" />
-                                            <site name="cup_goal" pos="0 0.05 0.1165" rgba="255 0 0 1"/>
-                                            <site name="cup_goal_final" pos="0 -0.025 0.1165" rgba="0 255 0 1"/>
-                                            <body name="B0" pos="0 -0.045 0.1165" quat="0.707388 0 0 -0.706825">
-                                                <inertial pos="0 0 0" quat="0.707107 0 0.707107 0" mass="7.4927e-05" diaginertia="5.87e-10 5.87e-10 3.74635e-11" />
-                                                <geom name="G0" size="0.001 0.00427" quat="0.707107 0 0.707107 0" type="capsule" rgba="0.8 0.2 0.1 1" />
-                                                <body name="B1" pos="0.0107 0 0">
-                                                    <inertial pos="0 0 0" quat="0.707107 0 0.707107 0" mass="7.4927e-05" diaginertia="5.87e-10 5.87e-10 3.74635e-11" />
-                                                    <joint name="J0_1" pos="-0.00535 0 0" axis="0 1 0" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                    <joint name="J1_1" pos="-0.00535 0 0" axis="0 0 1" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                    <geom name="G1" size="0.001 0.00427" quat="0.707107 0 0.707107 0" type="capsule" rgba="0.8 0.2 0.1 1" />
-                                                    <body name="B2" pos="0.0107 0 0">
-                                                        <inertial pos="0 0 0" quat="0.707107 0 0.707107 0" mass="7.4927e-05" diaginertia="5.87e-10 5.87e-10 3.74635e-11" />
-                                                        <joint name="J0_2" pos="-0.00535 0 0" axis="0 1 0" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                        <joint name="J1_2" pos="-0.00535 0 0" axis="0 0 1" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                        <geom name="G2" size="0.001 0.00427" quat="0.707107 0 0.707107 0" type="capsule" rgba="0.8 0.2 0.1 1" />
-                                                        <body name="B3" pos="0.0107 0 0">
-                                                            <inertial pos="0 0 0" quat="0.707107 0 0.707107 0" mass="7.4927e-05" diaginertia="5.87e-10 5.87e-10 3.74635e-11" />
-                                                            <joint name="J0_3" pos="-0.00535 0 0" axis="0 1 0" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                            <joint name="J1_3" pos="-0.00535 0 0" axis="0 0 1" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                            <geom name="G3" size="0.001 0.00427" quat="0.707107 0 0.707107 0" type="capsule" rgba="0.8 0.2 0.1 1" />
-                                                            <body name="B4" pos="0.0107 0 0">
-                                                                <inertial pos="0 0 0" quat="0.707107 0 0.707107 0" mass="7.4927e-05" diaginertia="5.87e-10 5.87e-10 3.74635e-11" />
-                                                                <joint name="J0_4" pos="-0.00535 0 0" axis="0 1 0" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                <joint name="J1_4" pos="-0.00535 0 0" axis="0 0 1" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                <geom name="G4" size="0.001 0.00427" quat="0.707107 0 0.707107 0" type="capsule" rgba="0.8 0.2 0.1 1" />
-                                                                <body name="B5" pos="0.0107 0 0">
-                                                                    <inertial pos="0 0 0" quat="0.707107 0 0.707107 0" mass="7.4927e-05" diaginertia="5.87e-10 5.87e-10 3.74635e-11" />
-                                                                    <joint name="J0_5" pos="-0.00535 0 0" axis="0 1 0" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                    <joint name="J1_5" pos="-0.00535 0 0" axis="0 0 1" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                    <geom name="G5" size="0.001 0.00427" quat="0.707107 0 0.707107 0" type="capsule" rgba="0.8 0.2 0.1 1" />
-                                                                    <body name="B6" pos="0.0107 0 0">
-                                                                        <inertial pos="0 0 0" quat="0.707107 0 0.707107 0" mass="7.4927e-05" diaginertia="5.87e-10 5.87e-10 3.74635e-11" />
-                                                                        <joint name="J0_6" pos="-0.00535 0 0" axis="0 1 0" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                        <joint name="J1_6" pos="-0.00535 0 0" axis="0 0 1" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                        <geom name="G6" size="0.001 0.00427" quat="0.707107 0 0.707107 0" type="capsule" rgba="0.8 0.2 0.1 1" />
-                                                                        <body name="B7" pos="0.0107 0 0">
-                                                                            <inertial pos="0 0 0" quat="0.707107 0 0.707107 0" mass="7.4927e-05" diaginertia="5.87e-10 5.87e-10 3.74635e-11" />
-                                                                            <joint name="J0_7" pos="-0.00535 0 0" axis="0 1 0" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                            <joint name="J1_7" pos="-0.00535 0 0" axis="0 0 1" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                            <geom name="G7" size="0.001 0.00427" quat="0.707107 0 0.707107 0" type="capsule" rgba="0.8 0.2 0.1 1" />
-                                                                            <body name="B8" pos="0.0107 0 0">
-                                                                                <inertial pos="0 0 0" quat="0.707107 0 0.707107 0" mass="7.4927e-05" diaginertia="5.87e-10 5.87e-10 3.74635e-11" />
-                                                                                <joint name="J0_8" pos="-0.00535 0 0" axis="0 1 0" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                <joint name="J1_8" pos="-0.00535 0 0" axis="0 0 1" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                <geom name="G8" size="0.001 0.00427" quat="0.707107 0 0.707107 0" type="capsule" rgba="0.8 0.2 0.1 1" />
-                                                                                <body name="B9" pos="0.0107 0 0">
-                                                                                    <inertial pos="0 0 0" quat="0.707107 0 0.707107 0" mass="7.4927e-05" diaginertia="5.87e-10 5.87e-10 3.74635e-11" />
-                                                                                    <joint name="J0_9" pos="-0.00535 0 0" axis="0 1 0" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                    <joint name="J1_9" pos="-0.00535 0 0" axis="0 0 1" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                    <geom name="G9" size="0.001 0.00427" quat="0.707107 0 0.707107 0" type="capsule" rgba="0.8 0.2 0.1 1" />
-                                                                                    <body name="B10" pos="0.0107 0 0">
-                                                                                        <inertial pos="0 0 0" quat="0.707107 0 0.707107 0" mass="7.4927e-05" diaginertia="5.87e-10 5.87e-10 3.74635e-11" />
-                                                                                        <joint name="J0_10" pos="-0.00535 0 0" axis="0 1 0" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                        <joint name="J1_10" pos="-0.00535 0 0" axis="0 0 1" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                        <geom name="G10" size="0.001 0.00427" quat="0.707107 0 0.707107 0" type="capsule" rgba="0.8 0.2 0.1 1" />
-                                                                                        <body name="B11" pos="0.0107 0 0">
-                                                                                            <inertial pos="0 0 0" quat="0.707107 0 0.707107 0" mass="7.4927e-05" diaginertia="5.87e-10 5.87e-10 3.74635e-11" />
-                                                                                            <joint name="J0_11" pos="-0.00535 0 0" axis="0 1 0" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                            <joint name="J1_11" pos="-0.00535 0 0" axis="0 0 1" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                            <geom name="G11" size="0.001 0.00427" quat="0.707107 0 0.707107 0" type="capsule" rgba="0.8 0.2 0.1 1" />
-                                                                                            <body name="B12" pos="0.0107 0 0">
-                                                                                                <inertial pos="0 0 0" quat="0.707107 0 0.707107 0" mass="7.4927e-05" diaginertia="5.87e-10 5.87e-10 3.74635e-11" />
-                                                                                                <joint name="J0_12" pos="-0.00535 0 0" axis="0 1 0" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                <joint name="J1_12" pos="-0.00535 0 0" axis="0 0 1" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                <geom name="G12" size="0.001 0.00427" quat="0.707107 0 0.707107 0" type="capsule" rgba="0.8 0.2 0.1 1" />
-                                                                                                <body name="B13" pos="0.0107 0 0">
-                                                                                                    <inertial pos="0 0 0" quat="0.707107 0 0.707107 0" mass="7.4927e-05" diaginertia="5.87e-10 5.87e-10 3.74635e-11" />
-                                                                                                    <joint name="J0_13" pos="-0.00535 0 0" axis="0 1 0" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                    <joint name="J1_13" pos="-0.00535 0 0" axis="0 0 1" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                    <geom name="G13" size="0.001 0.00427" quat="0.707107 0 0.707107 0" type="capsule" rgba="0.8 0.2 0.1 1" />
-                                                                                                    <body name="B14" pos="0.0107 0 0">
-                                                                                                        <inertial pos="0 0 0" quat="0.707107 0 0.707107 0" mass="7.4927e-05" diaginertia="5.87e-10 5.87e-10 3.74635e-11" />
-                                                                                                        <joint name="J0_14" pos="-0.00535 0 0" axis="0 1 0" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                        <joint name="J1_14" pos="-0.00535 0 0" axis="0 0 1" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                        <geom name="G14" size="0.001 0.00427" quat="0.707107 0 0.707107 0" type="capsule" rgba="0.8 0.2 0.1 1" />
-                                                                                                        <body name="B15" pos="0.0107 0 0">
-                                                                                                            <inertial pos="0 0 0" quat="0.707107 0 0.707107 0" mass="7.4927e-05" diaginertia="5.87e-10 5.87e-10 3.74635e-11" />
-                                                                                                            <joint name="J0_15" pos="-0.00535 0 0" axis="0 1 0" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                            <joint name="J1_15" pos="-0.00535 0 0" axis="0 0 1" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                            <geom name="G15" size="0.001 0.00427" quat="0.707107 0 0.707107 0" type="capsule" rgba="0.8 0.2 0.1 1" />
-                                                                                                            <body name="B16" pos="0.0107 0 0">
-                                                                                                                <inertial pos="0 0 0" quat="0.707107 0 0.707107 0" mass="7.4927e-05" diaginertia="5.87e-10 5.87e-10 3.74635e-11" />
-                                                                                                                <joint name="J0_16" pos="-0.00535 0 0" axis="0 1 0" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                                <joint name="J1_16" pos="-0.00535 0 0" axis="0 0 1" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                                <geom name="G16" size="0.001 0.00427" quat="0.707107 0 0.707107 0" type="capsule" rgba="0.8 0.2 0.1 1" />
-                                                                                                                <body name="B17" pos="0.0107 0 0">
-                                                                                                                    <inertial pos="0 0 0" quat="0.707107 0 0.707107 0" mass="7.4927e-05" diaginertia="5.87e-10 5.87e-10 3.74635e-11" />
-                                                                                                                    <joint name="J0_17" pos="-0.00535 0 0" axis="0 1 0" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                                    <joint name="J1_17" pos="-0.00535 0 0" axis="0 0 1" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                                    <geom name="G17" size="0.001 0.00427" quat="0.707107 0 0.707107 0" type="capsule" rgba="0.8 0.2 0.1 1" />
-                                                                                                                    <body name="B18" pos="0.0107 0 0">
-                                                                                                                        <inertial pos="0 0 0" quat="0.707107 0 0.707107 0" mass="7.4927e-05" diaginertia="5.87e-10 5.87e-10 3.74635e-11" />
-                                                                                                                        <joint name="J0_18" pos="-0.00535 0 0" axis="0 1 0" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                                        <joint name="J1_18" pos="-0.00535 0 0" axis="0 0 1" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                                        <geom name="G18" size="0.001 0.00427" quat="0.707107 0 0.707107 0" type="capsule" rgba="0.8 0.2 0.1 1" />
-                                                                                                                        <body name="B19" pos="0.0107 0 0">
-                                                                                                                            <inertial pos="0 0 0" quat="0.707107 0 0.707107 0" mass="7.4927e-05" diaginertia="5.87e-10 5.87e-10 3.74635e-11" />
-                                                                                                                            <joint name="J0_19" pos="-0.00535 0 0" axis="0 1 0" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                                            <joint name="J1_19" pos="-0.00535 0 0" axis="0 0 1" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                                            <geom name="G19" size="0.001 0.00427" quat="0.707107 0 0.707107 0" type="capsule" rgba="0.8 0.2 0.1 1" />
-                                                                                                                            <body name="B20" pos="0.0107 0 0">
-                                                                                                                                <inertial pos="0 0 0" quat="0.707107 0 0.707107 0" mass="7.4927e-05" diaginertia="5.87e-10 5.87e-10 3.74635e-11" />
-                                                                                                                                <joint name="J0_20" pos="-0.00535 0 0" axis="0 1 0" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                                                <joint name="J1_20" pos="-0.00535 0 0" axis="0 0 1" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                                                <geom name="G20" size="0.001 0.00427" quat="0.707107 0 0.707107 0" type="capsule" rgba="0.8 0.2 0.1 1" />
-                                                                                                                                <body name="B21" pos="0.0107 0 0">
-                                                                                                                                    <inertial pos="0 0 0" quat="0.707107 0 0.707107 0" mass="7.4927e-05" diaginertia="5.87e-10 5.87e-10 3.74635e-11" />
-                                                                                                                                    <joint name="J0_21" pos="-0.00535 0 0" axis="0 1 0" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                                                    <joint name="J1_21" pos="-0.00535 0 0" axis="0 0 1" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                                                    <geom name="G21" size="0.001 0.00427" quat="0.707107 0 0.707107 0" type="capsule" rgba="0.8 0.2 0.1 1" />
-                                                                                                                                    <body name="B22" pos="0.0107 0 0">
-                                                                                                                                        <inertial pos="0 0 0" quat="0.707107 0 0.707107 0" mass="7.4927e-05" diaginertia="5.87e-10 5.87e-10 3.74635e-11" />
-                                                                                                                                        <joint name="J0_22" pos="-0.00535 0 0" axis="0 1 0" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                                                        <joint name="J1_22" pos="-0.00535 0 0" axis="0 0 1" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                                                        <geom name="G22" size="0.001 0.00427" quat="0.707107 0 0.707107 0" type="capsule" rgba="0.8 0.2 0.1 1" />
-                                                                                                                                        <body name="B23" pos="0.0107 0 0">
-                                                                                                                                            <inertial pos="0 0 0" quat="0.707107 0 0.707107 0" mass="7.4927e-05" diaginertia="5.87e-10 5.87e-10 3.74635e-11" />
-                                                                                                                                            <joint name="J0_23" pos="-0.00535 0 0" axis="0 1 0" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                                                            <joint name="J1_23" pos="-0.00535 0 0" axis="0 0 1" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                                                            <geom name="G23" size="0.001 0.00427" quat="0.707107 0 0.707107 0" type="capsule" rgba="0.8 0.2 0.1 1" />
-                                                                                                                                            <body name="B24" pos="0.0107 0 0">
-                                                                                                                                                <inertial pos="0 0 0" quat="0.707107 0 0.707107 0" mass="7.4927e-05" diaginertia="5.87e-10 5.87e-10 3.74635e-11" />
-                                                                                                                                                <joint name="J0_24" pos="-0.00535 0 0" axis="0 1 0" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                                                                <joint name="J1_24" pos="-0.00535 0 0" axis="0 0 1" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                                                                <geom name="G24" size="0.001 0.00427" quat="0.707107 0 0.707107 0" type="capsule" rgba="0.8 0.2 0.1 1" />
-                                                                                                                                                <body name="B25" pos="0.0107 0 0">
-                                                                                                                                                    <inertial pos="0 0 0" quat="0.707107 0 0.707107 0" mass="7.4927e-05" diaginertia="5.87e-10 5.87e-10 3.74635e-11" />
-                                                                                                                                                    <joint name="J0_25" pos="-0.00535 0 0" axis="0 1 0" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                                                                    <joint name="J1_25" pos="-0.00535 0 0" axis="0 0 1" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                                                                    <geom name="G25" size="0.001 0.00427" quat="0.707107 0 0.707107 0" type="capsule" rgba="0.8 0.2 0.1 1" />
-                                                                                                                                                    <body name="B26" pos="0.0107 0 0">
-                                                                                                                                                        <inertial pos="0 0 0" quat="0.707107 0 0.707107 0" mass="7.4927e-05" diaginertia="5.87e-10 5.87e-10 3.74635e-11" />
-                                                                                                                                                        <joint name="J0_26" pos="-0.00535 0 0" axis="0 1 0" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                                                                        <joint name="J1_26" pos="-0.00535 0 0" axis="0 0 1" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                                                                        <geom name="G26" size="0.001 0.00427" quat="0.707107 0 0.707107 0" type="capsule" rgba="0.8 0.2 0.1 1" />
-                                                                                                                                                        <body name="B27" pos="0.0107 0 0">
-                                                                                                                                                            <inertial pos="0 0 0" quat="0.707107 0 0.707107 0" mass="7.4927e-05" diaginertia="5.87e-10 5.87e-10 3.74635e-11" />
-                                                                                                                                                            <joint name="J0_27" pos="-0.00535 0 0" axis="0 1 0" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                                                                            <joint name="J1_27" pos="-0.00535 0 0" axis="0 0 1" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                                                                            <geom name="G27" size="0.001 0.00427" quat="0.707107 0 0.707107 0" type="capsule" rgba="0.8 0.2 0.1 1" />
-                                                                                                                                                            <body name="B28" pos="0.0107 0 0">
-                                                                                                                                                                <inertial pos="0 0 0" quat="0.707107 0 0.707107 0" mass="7.4927e-05" diaginertia="5.87e-10 5.87e-10 3.74635e-11" />
-                                                                                                                                                                <joint name="J0_28" pos="-0.00535 0 0" axis="0 1 0" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                                                                                <joint name="J1_28" pos="-0.00535 0 0" axis="0 0 1" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                                                                                <geom name="G28" size="0.001 0.00427" quat="0.707107 0 0.707107 0" type="capsule" rgba="0.8 0.2 0.1 1" />
-                                                                                                                                                                <body name="B29" pos="0.0107 0 0">
-                                                                                                                                                                    <inertial pos="0 0 0" quat="0.707107 0 0.707107 0" mass="7.4927e-05" diaginertia="5.87e-10 5.87e-10 3.74635e-11" />
-                                                                                                                                                                    <joint name="J0_29" pos="-0.00535 0 0" axis="0 1 0" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                                                                                    <joint name="J1_29" pos="-0.00535 0 0" axis="0 0 1" group="3" limited="false" damping="0.0001" frictionloss="0" />
-                                                                                                                                                                    <geom name="G29" size="0.001 0.00427" quat="0.707107 0 0.707107 0" type="capsule" rgba="0.8 0.2 0.1 1" />
-                                                                                                                                                                    <body name="ball">
-                                                                                                                                                                        <geom name="ball_geom" type="sphere" size="0.02" mass="0.015" rgba="0.8 0.2 0.1 1"/>
-                                                                                                                                                                    </body>
-                                                                                                                                                                </body>
-                                                                                                                                                            </body>
-                                                                                                                                                        </body>
-                                                                                                                                                    </body>
-                                                                                                                                                </body>
-                                                                                                                                            </body>
-                                                                                                                                        </body>
-                                                                                                                                    </body>
-                                                                                                                                </body>
-                                                                                                                            </body>
-                                                                                                                        </body>
-                                                                                                                    </body>
-                                                                                                                </body>
-                                                                                                            </body>
-                                                                                                        </body>
-                                                                                                    </body>
-                                                                                                </body>
-                                                                                            </body>
-                                                                                        </body>
-                                                                                    </body>
-                                                                                </body>
-                                                                            </body>
-                                                                        </body>
-                                                                    </body>
-                                                                </body>
-                                                            </body>
-                                                        </body>
-                                                    </body>
-                                                </body>
-                                            </body>
-                                        </body>
-                                    </body>
-                                </body>
-                            </body>
-                        </body>
-                    </body>
-                </body>
-            </body>
-        </body>
-<!--        <site name="context_point" pos="-0.20869846 -0.66376693 1.18088501" euler="-1.57 0 0" size="0.015" rgba="1 0 0 0.6" type="sphere"/>-->
-<!--        <site name="context_point1" pos="-0.5 -0.85 0.8165" euler="-1.57 0 0" size="0.015" rgba="0 1 0 0.3" type="sphere"/>-->
-<!--        <site name="context_point2" pos="-0.5 -0.85 1.4165" euler="-1.57 0 0" size="0.015" rgba="0 1 0 0.3" type="sphere"/>-->
-<!--        <site name="context_point3" pos="-0.5 -0.35 0.8165" euler="-1.57 0 0" size="0.015" rgba="0 1 0 0.3" type="sphere"/>-->
-<!--        <site name="context_point4" pos="-0.5 -0.35 1.4165" euler="-1.57 0 0" size="0.015" rgba="0 1 0 0.3" type="sphere"/>-->
-<!--        <site name="context_point5" pos="0.5 -0.85 0.8165" euler="-1.57 0 0" size="0.015" rgba="0 1 0 0.3" type="sphere"/>-->
-<!--        <site name="context_point6" pos="0.5 -0.85 1.4165" euler="-1.57 0 0" size="0.015" rgba="0 1 0 0.3" type="sphere"/>-->
-<!--        <site name="context_point7" pos="0.5 -0.35 0.8165" euler="-1.57 0 0" size="0.015" rgba="0 1 0 0.3" type="sphere"/>-->
-<!--        <site name="context_point8" pos="0.5 -0.35 1.4165" euler="-1.57 0 0" size="0.015" rgba="0 1 0 0.3" type="sphere"/>-->
-<!--        <site name="context_space" pos="0 -0.6 1.1165" euler="0 0 0" size="0.5 0.25 0.3" rgba="0 0 1 0.15" type="box"/>-->
-        <camera name="visualization" mode="targetbody" target="wam/wrist_yaw_link" pos="1.5 -0.4 2.2"/>
-        <camera name="experiment" mode="fixed" quat="0.44418059 0.41778323 0.54301123 0.57732103" pos="1.5 -0.3 1.33" />
-    </worldbody>
-
-    <actuator>
-<!--        <motor ctrllimited="true" ctrlrange="-150 150" joint="wam/base_yaw_joint"/>-->
-<!--        <motor ctrllimited="true" ctrlrange="-125 125" joint="wam/shoulder_pitch_joint"/>-->
-<!--        <motor ctrllimited="true" ctrlrange="-40 40" joint="wam/shoulder_yaw_joint"/>-->
-<!--        <motor ctrllimited="true" ctrlrange="-60 60" joint="wam/elbow_pitch_joint"/>-->
-<!--        <motor ctrllimited="true" ctrlrange="-5 5" joint="wam/wrist_yaw_joint"/>-->
-<!--        <motor ctrllimited="true" ctrlrange="-5 5" joint="wam/wrist_pitch_joint"/>-->
-<!--        <motor ctrllimited="true" ctrlrange="-2 2" joint="wam/palm_yaw_joint"/>-->
-        <motor ctrllimited="true" ctrlrange="-1.0 1.0" gear="150.0" joint="wam/base_yaw_joint"/>
-        <motor ctrllimited="true" ctrlrange="-1.0 1.0" gear="125.0" joint="wam/shoulder_pitch_joint"/>
-        <motor ctrllimited="true" ctrlrange="-1.0 1.0" gear="40.0" joint="wam/shoulder_yaw_joint"/>
-        <motor ctrllimited="true" ctrlrange="-1.0 1.0" gear="60.0" joint="wam/elbow_pitch_joint"/>
-        <motor ctrllimited="true" ctrlrange="-1.0 1.0" gear="5.0" joint="wam/wrist_yaw_joint"/>
-        <motor ctrllimited="true" ctrlrange="-1.0 1.0" gear="5.0" joint="wam/wrist_pitch_joint"/>
-        <motor ctrllimited="true" ctrlrange="-1.0 1.0" gear="2.0" joint="wam/palm_yaw_joint"/>
-    </actuator>
-</mujoco>
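
Note on the removed model: the string holding the ball is written out by hand as a chain of nested `body` elements, each with one `inertial`, two `joint`s and one capsule `geom`. A minimal sketch of how such a chain could be generated instead is given here. The attribute values are copied from the removed XML; the helper name `add_string_chain`, the parent body `chain_root`, and the choice of starting index are assumptions made for illustration and are not part of the repository.

```python
import xml.etree.ElementTree as ET


def add_string_chain(parent, first_index, n_links, link_length=0.0107):
    """Append n_links nested capsule bodies below parent; return the innermost body.

    Hypothetical helper: it only mirrors the repeated pattern of the removed XML.
    """
    body = parent
    for i in range(first_index, first_index + n_links):
        # Each link is a child of the previous one, offset along x.
        body = ET.SubElement(body, "body", name=f"B{i}", pos=f"{link_length} 0 0")
        ET.SubElement(body, "inertial", pos="0 0 0", quat="0.707107 0 0.707107 0",
                      mass="7.4927e-05",
                      diaginertia="5.87e-10 5.87e-10 3.74635e-11")
        # Two joints per link, about the y and z axes.
        for k, axis in enumerate(("0 1 0", "0 0 1")):
            ET.SubElement(body, "joint", name=f"J{k}_{i}", pos="-0.00535 0 0",
                          axis=axis, group="3", limited="false",
                          damping="0.0001", frictionloss="0")
        ET.SubElement(body, "geom", name=f"G{i}", size="0.001 0.00427",
                      quat="0.707107 0 0.707107 0", type="capsule",
                      rgba="0.8 0.2 0.1 1")
    return body


# Assumed parent element; in the removed file the chain hangs below earlier bodies.
root = ET.Element("body", name="chain_root")
tip = add_string_chain(root, first_index=15, n_links=15)

# The ball is attached to the last link, as in the removed XML.
ball = ET.SubElement(tip, "body", name="ball")
ET.SubElement(ball, "geom", name="ball_geom", type="sphere", size="0.02",
              mass="0.015", rgba="0.8 0.2 0.1 1")

xml_text = ET.tostring(root, encoding="unicode")
```

Generating the chain this way keeps the link count and per-link parameters in one place, at the cost of no longer having a static asset file that can be opened directly in a viewer.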
--- a/alr_envs/alr/mujoco/ball_in_a_cup/ball_in_a_cup.py
+++ b/alr_envs/alr/mujoco/ball_in_a_cup/ball_in_a_cup.py
@@ -1,196 +0,0 @@
-from gym import utils
-import os
-import numpy as np
-from gym.envs.mujoco import MujocoEnv
-
-
-
-class ALRBallInACupEnv(MujocoEnv, utils.EzPickle):
-    def __init__(self, n_substeps=4, apply_gravity_comp=True, simplified: bool = False,
-                 reward_type: str = None, context: np.ndarray = None):
-        utils.EzPickle.__init__(**locals())
-        self._steps = 0
-
-        self.xml_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), "assets", "biac_base.xml")
-
-        self._q_pos = []
-        self._q_vel = []
-        # self.weight_matrix_scale = 50
-        self.max_ctrl = np.array([150., 125., 40., 60., 5., 5., 2.])
-
-        self.j_min = np.array([-2.6, -1.985, -2.8, -0.9, -4.55, -1.5707, -2.7])
-        self.j_max = np.array([2.6, 1.985, 2.8, 3.14159, 1.25, 1.5707, 2.7])
-
-        self.context = context
-
-        alr_mujoco_env.AlrMujocoEnv.__init__(self,
-                                             self.xml_path,
-                                             apply_gravity_comp=apply_gravity_comp,
-                                             n_substeps=n_substeps)
-        self._start_pos = np.array([0.0, 0.58760536, 0.0, 1.36004913, 0.0, -0.32072943, -1.57])
-        self._start_vel = np.zeros(7)
-
-        self.simplified = simplified
-
-        self.sim_time = 8  # seconds
-        self.sim_steps = int(self.sim_time / self.dt)
-        if reward_type == "no_context":
-            from alr_envs.alr.mujoco.ball_in_a_cup.ball_in_a_cup_reward_simple import BallInACupReward
-            reward_function = BallInACupReward
-        elif reward_type == "contextual_goal":
-            from alr_envs.alr.mujoco.ball_in_a_cup.ball_in_a_cup_reward import BallInACupReward
-            reward_function = BallInACupReward
-        else:
-            raise ValueError("Unknown reward type: {}".format(reward_type))
-        self.reward_function = reward_function(self.sim_steps)
-
-    @property
-    def start_pos(self):
-        if self.simplified:
-            return self._start_pos[1::2]
-        else:
-            return self._start_pos
-
-    @property
-    def start_vel(self):
-        if self.simplified:
-            return self._start_vel[1::2]
-        else:
-            return self._start_vel
-
-    @property
-    def current_pos(self):
-        return self.sim.data.qpos[0:7].copy()
-
-    @property
-    def current_vel(self):
-        return self.sim.data.qvel[0:7].copy()
-
-    def reset(self):
-        self.reward_function.reset(None)
-        return super().reset()
-
-    def reset_model(self):
-        init_pos_all = self.init_qpos.copy()
-        init_pos_robot = self._start_pos
-        init_vel = np.zeros_like(init_pos_all)
-
-        self._steps = 0
-        self._q_pos = []
-        self._q_vel = []
-
-        start_pos = init_pos_all
-        start_pos[0:7] = init_pos_robot
-
-        self.set_state(start_pos, init_vel)
-
-        return self._get_obs()
-
-    def step(self, a):
-        reward_dist = 0.0
-        angular_vel = 0.0
-        reward_ctrl = - np.square(a).sum()
-
-        crash = self.do_simulation(a)
-        # joint_cons_viol = self.check_traj_in_joint_limits()
-
-        self._q_pos.append(self.sim.data.qpos[0:7].ravel().copy())
-        self._q_vel.append(self.sim.data.qvel[0:7].ravel().copy())
-
-        ob = self._get_obs()
-
-        if not crash:
-            reward, success, is_collided = self.reward_function.compute_reward(a, self)
-            done = success or self._steps == self.sim_steps - 1 or is_collided
-            self._steps += 1
-        else:
-            reward = -2000
-            success = False
-            is_collided = False
-            done = True
-        return ob, reward, done, dict(reward_dist=reward_dist,
-                                      reward_ctrl=reward_ctrl,
-                                      velocity=angular_vel,
-                                      # traj=self._q_pos,
-                                      action=a,
-                                      q_pos=self.sim.data.qpos[0:7].ravel().copy(),
-                                      q_vel=self.sim.data.qvel[0:7].ravel().copy(),
-                                      is_success=success,
-                                      is_collided=is_collided, sim_crash=crash)
-
-    def check_traj_in_joint_limits(self):
-        return any(self.current_pos > self.j_max) or any(self.current_pos < self.j_min)
-
-    # TODO: extend observation space
-    def _get_obs(self):
-        theta = self.sim.data.qpos.flat[:7]
-        return np.concatenate([
-            np.cos(theta),
-            np.sin(theta),
-            # self.get_body_com("target"),  # only return target to make problem harder
-            [self._steps],
-        ])
-
-    # TODO
-    @property
-    def active_obs(self):
-        return np.hstack([
-            [False] * 7,  # cos
-            [False] * 7,  # sin
-            # [True] * 2,  # x-y coordinates of target distance
-            [False]  # env steps
-        ])
-
-    # These functions are for the task with 3 joint actuations
-    def extend_des_pos(self, des_pos):
-        des_pos_full = self._start_pos.copy()
-        des_pos_full[1] = des_pos[0]
-        des_pos_full[3] = des_pos[1]
-        des_pos_full[5] = des_pos[2]
-        return des_pos_full
-
-    def extend_des_vel(self, des_vel):
-        des_vel_full = self._start_vel.copy()
-        des_vel_full[1] = des_vel[0]
-        des_vel_full[3] = des_vel[1]
-        des_vel_full[5] = des_vel[2]
-        return des_vel_full
-
-    def render(self, render_mode, **render_kwargs):
-        if render_mode == "plot_trajectory":
-            if self._steps == 1:
-                import matplotlib.pyplot as plt
-                # plt.ion()
-                self.fig, self.axs = plt.subplots(3, 1)
-
-            if self._steps <= 1750:
-                for ax, cp in zip(self.axs, self.current_pos[1::2]):
-                    ax.scatter(self._steps, cp, s=2, marker=".")
-
-            # self.fig.show()
-
-        else:
-            super().render(render_mode, **render_kwargs)
-
-
-if __name__ == "__main__":
-    env = ALRBallInACupEnv()
-    ctxt = np.array([-0.20869846, -0.66376693, 1.18088501])
-
-    env.configure(ctxt)
-    env.reset()
-    # env.render()
-    for i in range(16000):
-        # test with random actions
-        ac = 0.001 * env.action_space.sample()[0:7]
-        # ac = env.start_pos
-        # ac[0] += np.pi/2
-        obs, rew, d, info = env.step(ac)
-        # env.render()
-
-        print(rew)
-
-        if d:
-            break
-
-    env.close()
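
Note on the removed environment: in the `simplified` setting only joints 1, 3 and 5 of the seven are commanded. `start_pos` returns `_start_pos[1::2]`, and `extend_des_pos` writes the three commanded values back into a copy of the full start configuration. The sketch below reproduces that mapping in isolation, assuming only NumPy; the constant names `START_POS` and `ACTIVE_JOINTS` are illustrative and do not appear in the repository.

```python
import numpy as np

# Start configuration copied from the removed environment (7 joints).
START_POS = np.array([0.0, 0.58760536, 0.0, 1.36004913, 0.0, -0.32072943, -1.57])

# Indices of the commanded joints in the simplified task (illustrative name).
ACTIVE_JOINTS = np.array([1, 3, 5])


def extend_des_pos(des_pos, start_pos=START_POS):
    """Expand a 3-joint desired position to the full 7-joint vector.

    Uncommanded joints keep their start value, as in the removed method.
    """
    full = start_pos.copy()
    full[ACTIVE_JOINTS] = des_pos
    return full


# The reduced start position is the slice used by the removed `start_pos` property.
reduced = START_POS[1::2]

# Shift the three commanded joints and map back to the full joint space.
full = extend_des_pos(reduced + 0.1)
```

The slice `[1::2]` and the index list `[1, 3, 5]` select the same entries for a seven-element vector, so the reduction and the extension are inverses of each other on the commanded joints.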
--- a/alr_envs/alr/mujoco/ball_in_a_cup/ball_in_a_cup_mp_wrapper.py
+++ b/alr_envs/alr/mujoco/ball_in_a_cup/ball_in_a_cup_mp_wrapper.py
@@ -1,42 +0,0 @@
-from typing import Tuple, Union
-
-import numpy as np
-
-from mp_env_api import MPEnvWrapper
-
-
-class BallInACupMPWrapper(MPEnvWrapper):
-
-    @property
-    def active_obs(self):
-        # TODO: @Max Filter observations correctly
-        return np.hstack([
-            [False] * 7,  # cos
-            [False] * 7,  # sin
-            # [True] * 2,  # x-y coordinates of target distance
-            [False]  # env steps
-        ])
-
-    @property
-    def start_pos(self):
-        if self.simplified:
-            return self._start_pos[1::2]
-        else:
-            return self._start_pos
-
-    @property
-    def current_pos(self) -> Union[float, int, np.ndarray, Tuple]:
-        return self.sim.data.qpos[0:7].copy()
-
-    @property
-    def current_vel(self) -> Union[float, int, np.ndarray, Tuple]:
-        return self.sim.data.qvel[0:7].copy()
-
-    @property
-    def goal_pos(self):
-        # TODO: @Max I think the default value of returning to the start is reasonable here
-        raise ValueError("Goal position is not available and has to be learnt based on the environment.")
-
-    @property
-    def dt(self) -> Union[float, int]:
-        return self.env.dt
--- a/alr_envs/alr/mujoco/ball_in_a_cup/ball_in_a_cup_reward.py
+++ b/alr_envs/alr/mujoco/ball_in_a_cup/ball_in_a_cup_reward.py
@@ -1,142 +0,0 @@
-import numpy as np
-from alr_envs.alr.mujoco import alr_reward_fct
-
-
-class BallInACupReward(alr_reward_fct.AlrReward):
-    def __init__(self, sim_time):
-        self.sim_time = sim_time
-
-        self.collision_objects = ["cup_geom1", "cup_geom2", "wrist_palm_link_convex_geom",
-                                  "wrist_pitch_link_convex_decomposition_p1_geom",
-                                  "wrist_pitch_link_convex_decomposition_p2_geom",
-                                  "wrist_pitch_link_convex_decomposition_p3_geom",
-                                  "wrist_yaw_link_convex_decomposition_p1_geom",
-                                  "wrist_yaw_link_convex_decomposition_p2_geom",
-                                  "forearm_link_convex_decomposition_p1_geom",
-                                  "forearm_link_convex_decomposition_p2_geom"]
-
-        self.ball_id = None
-        self.ball_collision_id = None
-        self.goal_id = None
-        self.goal_final_id = None
-        self.collision_ids = None
-
-        self.ball_traj = None
-        self.dists = None
-        self.dists_ctxt = None
-        self.dists_final = None
-        self.costs = None
-
-        self.reset(None)
-
-    def reset(self, context):
-        self.ball_traj = np.zeros(shape=(self.sim_time, 3))
-        self.cup_traj = np.zeros(shape=(self.sim_time, 3))
-        self.dists = []
-        self.dists_ctxt = []
-        self.dists_final = []
-        self.costs = []
-        self.context = context
-        self.ball_in_cup = False
-        self.ball_above_threshold = False
-        self.dist_ctxt = 3
-        self.action_costs = []
-        self.cup_angles = []
-
-    def compute_reward(self, action, sim, step):
-        action_cost = np.sum(np.square(action))
-        self.action_costs.append(action_cost)
-
-        stop_sim = False
-        success = False
-
-        self.ball_id = sim.model._body_name2id["ball"]
-        self.ball_collision_id = sim.model._geom_name2id["ball_geom"]
-        self.goal_id = sim.model._site_name2id["cup_goal"]
-        self.goal_final_id = sim.model._site_name2id["cup_goal_final"]
-        self.collision_ids = [sim.model._geom_name2id[name] for name in self.collision_objects]
-
-        if self.check_collision(sim):
-            reward = - 1e-3 * action_cost - 1000
-            stop_sim = True
-            return reward, success, stop_sim
-
-        # Compute the current distance from the ball to the inner part of the cup
-        goal_pos = sim.data.site_xpos[self.goal_id]
-        ball_pos = sim.data.body_xpos[self.ball_id]
-        goal_final_pos = sim.data.site_xpos[self.goal_final_id]
-        self.dists.append(np.linalg.norm(goal_pos - ball_pos))
-        self.dists_final.append(np.linalg.norm(goal_final_pos - ball_pos))
-        self.dists_ctxt.append(np.linalg.norm(ball_pos - self.context))
-        self.ball_traj[step, :] = np.copy(ball_pos)
-        self.cup_traj[step, :] = np.copy(goal_pos)  # ?
-        cup_quat = np.copy(sim.data.body_xquat[sim.model._body_name2id["cup"]])
-        self.cup_angles.append(np.arctan2(2 * (cup_quat[0] * cup_quat[1] + cup_quat[2] * cup_quat[3]),
-                                          1 - 2 * (cup_quat[1] ** 2 + cup_quat[2] ** 2)))
-
-        # Determine the first time when ball is in cup
-        if not self.ball_in_cup:
-            ball_in_cup = self.check_ball_in_cup(sim, self.ball_collision_id)
-            self.ball_in_cup = ball_in_cup
-            if ball_in_cup:
-                dist_to_ctxt = np.linalg.norm(ball_pos - self.context)
-                self.dist_ctxt = dist_to_ctxt
-
-        if step == self.sim_time - 1:
-            t_min_dist = np.argmin(self.dists)
-            angle_min_dist = self.cup_angles[t_min_dist]
-            cost_angle = (angle_min_dist - np.pi / 2) ** 2
-
-            min_dist = np.min(self.dists)
-            dist_final = self.dists_final[-1]
-            # dist_ctxt = self.dists_ctxt[-1]
-
-            #  # max distance between ball and cup and cup height at that time
-            # ball_to_cup_diff = self.ball_traj[:, 2] - self.cup_traj[:, 2]
-            # t_max_diff = np.argmax(ball_to_cup_diff)
-            # t_max_ball_height = np.argmax(self.ball_traj[:, 2])
-            # max_ball_height = np.max(self.ball_traj[:, 2])
-
-            # cost = self._get_stage_wise_cost(ball_in_cup, min_dist, dist_final, dist_ctxt)
-            cost = 0.5 * min_dist + 0.5 * dist_final + 0.3 * np.minimum(self.dist_ctxt, 3) + 0.01 * cost_angle
-            reward = np.exp(-2 * cost) - 1e-3 * action_cost
-            # if max_ball_height < self.context[2] or ball_to_cup_diff[t_max_ball_height] < 0:
-            #     reward -= 1
-
-            success = dist_final < 0.05 and self.dist_ctxt < 0.05
-        else:
-            reward = - 1e-3 * action_cost
-            success = False
-
-        return reward, success, stop_sim
-
-    def _get_stage_wise_cost(self, ball_in_cup, min_dist, dist_final, dist_to_ctxt):
-        if not ball_in_cup:
-            cost = 3 + 2*(0.5 * min_dist**2 + 0.5 * dist_final**2)
-        else:
-            cost = 2 * dist_to_ctxt ** 2
-            print('Context Distance:', dist_to_ctxt)
-        return cost
-
-    def check_ball_in_cup(self, sim, ball_collision_id):
-        cup_base_collision_id = sim.model._geom_name2id["cup_base_contact"]
-        for coni in range(0, sim.data.ncon):
-            con = sim.data.contact[coni]
-
-            collision = con.geom1 == cup_base_collision_id and con.geom2 == ball_collision_id
-            collision_trans = con.geom1 == ball_collision_id and con.geom2 == cup_base_collision_id
-
-            if collision or collision_trans:
-                return True
-        return False
-
-    def check_collision(self, sim):
-        for coni in range(0, sim.data.ncon):
-            con = sim.data.contact[coni]
-
-            collision = con.geom1 in self.collision_ids and con.geom2 == self.ball_collision_id
-            collision_trans = con.geom1 == self.ball_collision_id and con.geom2 in self.collision_ids
-
-            if collision or collision_trans:
-                return True
-        return False
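Note on the removed `BallInACupReward`: the cup tilt is read from the cup body's orientation quaternion with the standard roll-angle formula, and the final-step reward is an exponential of a weighted cost. The sketch below restates that arithmetic outside MuJoCo so it can be checked in isolation. It assumes the quaternion is ordered (w, x, y, z), as MuJoCo stores it; the function names `cup_roll_angle` and `final_step_reward` are hypothetical and not part of the removed file.

```python
import numpy as np


def cup_roll_angle(quat):
    """Roll angle (rotation about the x-axis) of a unit quaternion (w, x, y, z)."""
    w, x, y, z = quat
    return np.arctan2(2 * (w * x + y * z), 1 - 2 * (x ** 2 + y ** 2))


def final_step_reward(min_dist, dist_final, dist_ctxt, angle, action_cost):
    """Final-step reward with the same weights as the removed compute_reward."""
    # Squared deviation of the cup angle from pi/2
    cost_angle = (angle - np.pi / 2) ** 2
    # The context distance is clipped at 3, as in the removed code
    cost = (0.5 * min_dist + 0.5 * dist_final
            + 0.3 * np.minimum(dist_ctxt, 3) + 0.01 * cost_angle)
    return np.exp(-2 * cost) - 1e-3 * action_cost
```

A cup rotated by 90 degrees about its x-axis yields a roll of pi/2, which is the angle the cost term treats as ideal.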
--- a/alr_envs/alr/mujoco/ball_in_a_cup/ball_in_a_cup_reward_simple.py
+++ b/alr_envs/alr/mujoco/ball_in_a_cup/ball_in_a_cup_reward_simple.py
@@ -1,116 +0,0 @@
-import numpy as np
-from alr_envs.alr.mujoco import alr_reward_fct
-
-
-class BallInACupReward(alr_reward_fct.AlrReward):
-    def __init__(self, env):
-        self.env = env
-        self.collision_objects = ["cup_geom1", "cup_geom2", "cup_base_contact_below",
-                                  "wrist_palm_link_convex_geom",
-                                  "wrist_pitch_link_convex_decomposition_p1_geom",
-                                  "wrist_pitch_link_convex_decomposition_p2_geom",
-                                  "wrist_pitch_link_convex_decomposition_p3_geom",
-                                  "wrist_yaw_link_convex_decomposition_p1_geom",
-                                  "wrist_yaw_link_convex_decomposition_p2_geom",
-                                  "forearm_link_convex_decomposition_p1_geom",
-                                  "forearm_link_convex_decomposition_p2_geom"]
-
-        self.ball_id = None
-        self.ball_collision_id = None
-        self.goal_id = None
-        self.goal_final_id = None
-        self.collision_ids = None
-        self._is_collided = False
-        self.collision_penalty = 1000
-
-        self.ball_traj = None
-        self.dists = None
-        self.dists_final = None
-        self.costs = None
-
-        self.reset(None)
-
-    def reset(self, context):
-        # self.sim_time = self.env.sim.dtsim_time
-        self.ball_traj = []  # np.zeros(shape=(self.sim_time, 3))
-        self.dists = []
-        self.dists_final = []
-        self.costs = []
-        self.action_costs = []
-        self.angle_costs = []
-        self.cup_angles = []
-
-    def compute_reward(self, action):
-        self.ball_id = self.env.sim.model._body_name2id["ball"]
-        self.ball_collision_id = self.env.sim.model._geom_name2id["ball_geom"]
-        self.goal_id = self.env.sim.model._site_name2id["cup_goal"]
-        self.goal_final_id = self.env.sim.model._site_name2id["cup_goal_final"]
-        self.collision_ids = [self.env.sim.model._geom_name2id[name] for name in self.collision_objects]
-
-        ball_in_cup = self.check_ball_in_cup(self.env.sim, self.ball_collision_id)
-
-        # Compute the current distance from the ball to the inner part of the cup
-        goal_pos = self.env.sim.data.site_xpos[self.goal_id]
-        ball_pos = self.env.sim.data.body_xpos[self.ball_id]
-        goal_final_pos = self.env.sim.data.site_xpos[self.goal_final_id]
-        self.dists.append(np.linalg.norm(goal_pos - ball_pos))
-        self.dists_final.append(np.linalg.norm(goal_final_pos - ball_pos))
-        # self.ball_traj[self.env._steps, :] = ball_pos
-        self.ball_traj.append(ball_pos)
-        cup_quat = np.copy(self.env.sim.data.body_xquat[self.env.sim.model._body_name2id["cup"]])
-        cup_angle = np.arctan2(2 * (cup_quat[0] * cup_quat[1] + cup_quat[2] * cup_quat[3]),
-                                          1 - 2 * (cup_quat[1]**2 + cup_quat[2]**2))
-        cost_angle = (cup_angle - np.pi / 2) ** 2
-        self.angle_costs.append(cost_angle)
-        self.cup_angles.append(cup_angle)
-
-        action_cost = np.sum(np.square(action))
-        self.action_costs.append(action_cost)
-
-        self._is_collided = self.check_collision(self.env.sim) or self.env.check_traj_in_joint_limits()
-
-        if self.env._steps == self.env.ep_length - 1 or self._is_collided:
-            t_min_dist = np.argmin(self.dists)
-            angle_min_dist = self.cup_angles[t_min_dist]
-            # cost_angle = (angle_min_dist - np.pi / 2)**2
-
-
-            # min_dist = self.dists[t_min_dist]
-            dist_final = self.dists_final[-1]
-            min_dist_final = np.min(self.dists_final)
-
-            # cost = 0.5 * dist_final + 0.05 * cost_angle  # TODO: Increase cost_angle weight  # 0.5 * min_dist +
-            # reward = np.exp(-2 * cost) - 1e-2 * action_cost - self.collision_penalty * int(self._is_collided)
-            # reward = - dist_final**2 - 1e-4 * cost_angle - 1e-5 * action_cost - self.collision_penalty * int(self._is_collided)
-            reward = - dist_final**2 - min_dist_final**2 - 1e-4 * cost_angle - 1e-3 * action_cost - self.collision_penalty * int(self._is_collided)
-            success = dist_final < 0.05 and ball_in_cup and not self._is_collided
-            crash = self._is_collided
-        else:
-            reward = - 1e-3 * action_cost - 1e-4 * cost_angle  # TODO: increase action_cost weight
-            success = False
-            crash = False
-
-        return reward, success, crash
-
-    def check_ball_in_cup(self, sim, ball_collision_id):
-        cup_base_collision_id = sim.model._geom_name2id["cup_base_contact"]
-        for coni in range(0, sim.data.ncon):
-            con = sim.data.contact[coni]
-
-            collision = con.geom1 == cup_base_collision_id and con.geom2 == ball_collision_id
-            collision_trans = con.geom1 == ball_collision_id and con.geom2 == cup_base_collision_id
-
-            if collision or collision_trans:
-                return True
-        return False
-
-    def check_collision(self, sim):
-        for coni in range(0, sim.data.ncon):
-            con = sim.data.contact[coni]
-
-            collision = con.geom1 in self.collision_ids and con.geom2 == self.ball_collision_id
-            collision_trans = con.geom1 == self.ball_collision_id and con.geom2 in self.collision_ids
-
-            if collision or collision_trans:
-                return True
-        return False
--- a/alr_envs/alr/mujoco/ball_in_a_cup/biac_pd.py
+++ b/alr_envs/alr/mujoco/ball_in_a_cup/biac_pd.py
@@ -1,205 +0,0 @@
-import os
-
-import gym.envs.mujoco
-import gym.envs.mujoco as mujoco_env
-import mujoco_py.builder
-import numpy as np
-from gym import utils
-
-from mp_env_api.mp_wrappers.detpmp_wrapper import DetPMPWrapper
-from mp_env_api.utils.policies import PDControllerExtend
-
-
-def make_detpmp_env(**kwargs):
-    name = kwargs.pop("name")
-    _env = gym.make(name)
-    policy = PDControllerExtend(_env, p_gains=kwargs.pop('p_gains'), d_gains=kwargs.pop('d_gains'))
-    kwargs['policy_type'] = policy
-    return DetPMPWrapper(_env, **kwargs)
-
-
-class ALRBallInACupPDEnv(mujoco_env.MujocoEnv, utils.EzPickle):
-    def __init__(self, frame_skip=4, apply_gravity_comp=True, simplified: bool = False,
-                 reward_type: str = None, context: np.ndarray = None):
-        utils.EzPickle.__init__(**locals())
-        self._steps = 0
-
-        self.xml_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), "assets", "biac_base.xml")
-
-        self.max_ctrl = np.array([150., 125., 40., 60., 5., 5., 2.])
-
-        self.j_min = np.array([-2.6, -1.985, -2.8, -0.9, -4.55, -1.5707, -2.7])
-        self.j_max = np.array([2.6, 1.985, 2.8, 3.14159, 1.25, 1.5707, 2.7])
-
-        self.context = context
-        self.apply_gravity_comp = apply_gravity_comp
-        self.simplified = simplified
-
-        self._start_pos = np.array([0.0, 0.58760536, 0.0, 1.36004913, 0.0, -0.32072943, -1.57])
-        self._start_vel = np.zeros(7)
-
-        self.sim_time = 8  # seconds
-        self._dt = 0.02
-        self.ep_length = 4000  # based on 8 seconds with dt = 0.02 int(self.sim_time / self.dt)
-        if reward_type == "no_context":
-            from alr_envs.alr.mujoco.ball_in_a_cup.ball_in_a_cup_reward_simple import BallInACupReward
-            reward_function = BallInACupReward
-        elif reward_type == "contextual_goal":
-            from alr_envs.alr.mujoco.ball_in_a_cup.ball_in_a_cup_reward import BallInACupReward
-            reward_function = BallInACupReward
-        else:
-            raise ValueError("Unknown reward type: {}".format(reward_type))
-        self.reward_function = reward_function(self)
-
-        mujoco_env.MujocoEnv.__init__(self, self.xml_path, frame_skip)
-
-    @property
-    def dt(self):
-        return self._dt
-
-    # TODO: @Max is this even needed?
-    @property
-    def start_vel(self):
-        if self.simplified:
-            return self._start_vel[1::2]
-        else:
-            return self._start_vel
-
-    # def _set_action_space(self):
-    #     if self.simplified:
-    #         bounds = self.model.actuator_ctrlrange.copy().astype(np.float32)[1::2]
-    #     else:
-    #         bounds = self.model.actuator_ctrlrange.copy().astype(np.float32)
-    #     low, high = bounds.T
-    #     self.action_space = spaces.Box(low=low, high=high, dtype=np.float32)
-    #     return self.action_space
-
-    def reset(self):
-        self.reward_function.reset(None)
-        return super().reset()
-
-    def reset_model(self):
-        init_pos_all = self.init_qpos.copy()
-        init_pos_robot = self._start_pos
-        init_vel = np.zeros_like(init_pos_all)
-
-        self._steps = 0
-        self._q_pos = []
-        self._q_vel = []
-
-        start_pos = init_pos_all
-        start_pos[0:7] = init_pos_robot
-
-        self.set_state(start_pos, init_vel)
-
-        return self._get_obs()
-
-    def step(self, a):
-        reward_dist = 0.0
-        angular_vel = 0.0
-        reward_ctrl = - np.square(a).sum()
-
-        # if self.simplified:
-        #     tmp = np.zeros(7)
-        #     tmp[1::2] = a
-        #     a = tmp
-
-        if self.apply_gravity_comp:
-            a += self.sim.data.qfrc_bias[:len(a)].copy() / self.model.actuator_gear[:, 0]
-
-        crash = False
-        try:
-            self.do_simulation(a, self.frame_skip)
-        except mujoco_py.builder.MujocoException:
-            crash = True
-        # joint_cons_viol = self.check_traj_in_joint_limits()
-
-        ob = self._get_obs()
-
-        if not crash:
-            reward, success, is_collided = self.reward_function.compute_reward(a)
-            done = success or is_collided  # self._steps == self.sim_steps - 1
-            self._steps += 1
-        else:
-            reward = -2000
-            success = False
-            is_collided = False
-            done = True
-
-        return ob, reward, done, dict(reward_dist=reward_dist,
-                                      reward_ctrl=reward_ctrl,
-                                      velocity=angular_vel,
-                                      # traj=self._q_pos,
-                                      action=a,
-                                      q_pos=self.sim.data.qpos[0:7].ravel().copy(),
-                                      q_vel=self.sim.data.qvel[0:7].ravel().copy(),
-                                      is_success=success,
-                                      is_collided=is_collided, sim_crash=crash)
-
-    def check_traj_in_joint_limits(self):
-        return any(self.current_pos > self.j_max) or any(self.current_pos < self.j_min)
-
-    # TODO: extend observation space
-    def _get_obs(self):
-        theta = self.sim.data.qpos.flat[:7]
-        return np.concatenate([
-            np.cos(theta),
-            np.sin(theta),
-            # self.get_body_com("target"),  # only return target to make problem harder
-            [self._steps],
-        ])
-
-    # These functions are for the task with 3 joint actuations
-    def extend_des_pos(self, des_pos):
-        des_pos_full = self._start_pos.copy()
-        des_pos_full[1] = des_pos[0]
-        des_pos_full[3] = des_pos[1]
-        des_pos_full[5] = des_pos[2]
-        return des_pos_full
-
-    def extend_des_vel(self, des_vel):
-        des_vel_full = self._start_vel.copy()
-        des_vel_full[1] = des_vel[0]
-        des_vel_full[3] = des_vel[1]
-        des_vel_full[5] = des_vel[2]
-        return des_vel_full
-
-    def render(self, render_mode, **render_kwargs):
-        if render_mode == "plot_trajectory":
-            if self._steps == 1:
-                import matplotlib.pyplot as plt
-                # plt.ion()
-                self.fig, self.axs = plt.subplots(3, 1)
-
-            if self._steps <= 1750:
-                for ax, cp in zip(self.axs, self.current_pos[1::2]):
-                    ax.scatter(self._steps, cp, s=2, marker=".")
-
-            # self.fig.show()
-
-        else:
-            super().render(render_mode, **render_kwargs)
-
-
-if __name__ == "__main__":
-    env = ALRBallInACupPDEnv(reward_type="no_context", simplified=True)
-    # env = gym.make("alr_envs:ALRBallInACupPDSimpleDetPMP-v0")
-    # ctxt = np.array([-0.20869846, -0.66376693, 1.18088501])
-
-    # env.configure(ctxt)
-    env.reset()
-    env.render("human")
-    for i in range(16000):
-        # test with random actions
-        ac = 0.02 * env.action_space.sample()[0:7]
-        # ac = env.start_pos
-        # ac[0] += np.pi/2
-        obs, rew, d, info = env.step(ac)
-        env.render("human")
-
-        print(rew)
-
-        if d:
-            break
-
-    env.close()
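Note on `ep_length` in the removed `ALRBallInACupPDEnv`: the inline comment describes 8 seconds at `dt = 0.02`, which works out to 400 steps, while the hard-coded value is 4000. Which of the two was intended cannot be told from the diff alone. Below is a small sketch of deriving the step count from the two quantities instead of hard-coding it; the helper name `episode_length` is hypothetical.

```python
def episode_length(sim_time: float, dt: float) -> int:
    """Number of control steps covering sim_time seconds at step size dt."""
    if dt <= 0:
        raise ValueError("dt must be positive")
    # round() guards against floating-point quotients that land just below an integer
    return int(round(sim_time / dt))
```

With the values from the diff, `episode_length(8, 0.02)` gives 400 and `episode_length(3, 0.005)` gives 600; the latter matches the `ep_length` of the removed beer pong environment.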
--- a/alr_envs/alr/mujoco/beerpong/beerpong.py
+++ b/alr_envs/alr/mujoco/beerpong/beerpong.py
@@ -1,193 +0,0 @@
-import mujoco_py.builder
-import os
-
-import numpy as np
-from gym import utils
-from gym.envs.mujoco import MujocoEnv
-
-
-class ALRBeerBongEnv(MujocoEnv, utils.EzPickle):
-    def __init__(self, frame_skip=1, apply_gravity_comp=True, reward_type: str = "staged", noisy=False,
-                 context: np.ndarray = None, difficulty='simple'):
-        self._steps = 0
-
-        self.xml_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), "assets",
-                                     "beerpong_wo_cup" + ".xml")
-
-        self.j_min = np.array([-2.6, -1.985, -2.8, -0.9, -4.55, -1.5707, -2.7])
-        self.j_max = np.array([2.6, 1.985, 2.8, 3.14159, 1.25, 1.5707, 2.7])
-
-        self.context = context
-        self.apply_gravity_comp = apply_gravity_comp
-        self.add_noise = noisy
-
-        self._start_pos = np.array([0.0, 1.35, 0.0, 1.18, 0.0, -0.786, -1.59])
-        self._start_vel = np.zeros(7)
-
-        self.ball_site_id = 0
-        self.ball_id = 11
-
-        self._release_step = 175  # time step of ball release
-
-        self.sim_time = 3  # seconds
-        self.ep_length = 600  # based on 3 seconds with dt = 0.005 int(self.sim_time / self.dt)
-        self.cup_table_id = 10
-
-        if noisy:
-            self.noise_std = 0.01
-        else:
-            self.noise_std = 0
-
-        if difficulty == 'simple':
-            self.cup_goal_pos = np.array([0, -1.7, 0.840])
-        elif difficulty == 'intermediate':
-            self.cup_goal_pos = np.array([0.3, -1.5, 0.840])
-        elif difficulty == 'hard':
-            self.cup_goal_pos = np.array([-0.3, -2.2, 0.840])
-        elif difficulty == 'hardest':
-            self.cup_goal_pos = np.array([-0.3, -1.2, 0.840])
-
-        if reward_type == "no_context":
-            from alr_envs.alr.mujoco.beerpong.beerpong_reward import BeerPongReward
-            reward_function = BeerPongReward
-        elif reward_type == "staged":
-            from alr_envs.alr.mujoco.beerpong.beerpong_reward_staged import BeerPongReward
-            reward_function = BeerPongReward
-        else:
-            raise ValueError("Unknown reward type: {}".format(reward_type))
-        self.reward_function = reward_function()
-
-        MujocoEnv.__init__(self, self.xml_path, frame_skip)
-        utils.EzPickle.__init__(self)
-
-    @property
-    def start_pos(self):
-        return self._start_pos
-
-    @property
-    def start_vel(self):
-        return self._start_vel
-
-    @property
-    def current_pos(self):
-        return self.sim.data.qpos[0:7].copy()
-
-    @property
-    def current_vel(self):
-        return self.sim.data.qvel[0:7].copy()
-
-    def reset(self):
-        self.reward_function.reset(self.add_noise)
-        return super().reset()
-
-    def reset_model(self):
-        init_pos_all = self.init_qpos.copy()
-        init_pos_robot = self.start_pos
-        init_vel = np.zeros_like(init_pos_all)
-
-        self._steps = 0
-
-        start_pos = init_pos_all
-        start_pos[0:7] = init_pos_robot
-
-        self.set_state(start_pos, init_vel)
-        self.sim.model.body_pos[self.cup_table_id] = self.cup_goal_pos
-        start_pos[7::] = self.sim.data.site_xpos[self.ball_site_id, :].copy()
-        self.set_state(start_pos, init_vel)
-        return self._get_obs()
-
-    def step(self, a):
-        reward_dist = 0.0
-        angular_vel = 0.0
-        reward_ctrl = - np.square(a).sum()
-
-        if self.apply_gravity_comp:
-            a = a + self.sim.data.qfrc_bias[:len(a)].copy() / self.model.actuator_gear[:, 0]
-        try:
-            self.do_simulation(a, self.frame_skip)
-            if self._steps < self._release_step:
-                self.sim.data.qpos[7::] = self.sim.data.site_xpos[self.ball_site_id, :].copy()
-                self.sim.data.qvel[7::] = self.sim.data.site_xvelp[self.ball_site_id, :].copy()
-            elif self._steps == self._release_step and self.add_noise:
-                 self.sim.data.qvel[7::] += self.noise_std * np.random.randn(3)
-            crash = False
-        except mujoco_py.builder.MujocoException:
-            crash = True
-        # joint_cons_viol = self.check_traj_in_joint_limits()
-
-        ob = self._get_obs()
-
-        if not crash:
-            reward, reward_infos = self.reward_function.compute_reward(self, a)
-            success = reward_infos['success']
-            is_collided = reward_infos['is_collided']
-            ball_pos = reward_infos['ball_pos']
-            ball_vel = reward_infos['ball_vel']
-            done = is_collided or self._steps == self.ep_length - 1
-            self._steps += 1
-        else:
-            reward = -30
-            reward_infos = dict()
-            success = False
-            is_collided = False
-            done = True
-            ball_pos = np.zeros(3)
-            ball_vel = np.zeros(3)
-
-        infos = dict(reward_dist=reward_dist,
-                     reward_ctrl=reward_ctrl,
-                     reward=reward,
-                     velocity=angular_vel,
-                     # traj=self._q_pos,
-                     action=a,
-                     q_pos=self.sim.data.qpos[0:7].ravel().copy(),
-                     q_vel=self.sim.data.qvel[0:7].ravel().copy(),
-                     ball_pos=ball_pos,
-                     ball_vel=ball_vel,
-                     success=success,
-                     is_collided=is_collided, sim_crash=crash)
-        infos.update(reward_infos)
-
-        return ob, reward, done, infos
-
-    def check_traj_in_joint_limits(self):
-        return any(self.current_pos > self.j_max) or any(self.current_pos < self.j_min)
-
-    # TODO: extend observation space
-    def _get_obs(self):
-        theta = self.sim.data.qpos.flat[:7]
-        return np.concatenate([
-            np.cos(theta),
-            np.sin(theta),
-            # self.get_body_com("target"),  # only return target to make problem harder
-            [self._steps],
-        ])
-
-    # TODO
-    @property
-    def active_obs(self):
-        return np.hstack([
-            [False] * 7,  # cos
-            [False] * 7,  # sin
-            # [True] * 2,  # x-y coordinates of target distance
-            [False]  # env steps
-        ])
-
-
-if __name__ == "__main__":
-    env = ALRBeerBongEnv(reward_type="staged", difficulty='hardest')
-
-    # env.configure(ctxt)
-    env.reset()
-    env.render("human")
-    for i in range(800):
-        ac = 10 * env.action_space.sample()[0:7]
-        obs, rew, d, info = env.step(ac)
-        env.render("human")
-
-        print(rew)
-
-        if d:
-            break
-
-    env.close()
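Note on gravity compensation: the removed `ALRBeerBongEnv.step` builds a new array with `a = a + ...`, whereas the removed `ALRBallInACupPDEnv.step` uses `a += ...`, which modifies the caller's array in place when `a` is a NumPy array. Below is a minimal sketch of the non-mutating form, using plain arrays in place of `sim.data.qfrc_bias` and `model.actuator_gear[:, 0]`; the function name `with_gravity_comp` is hypothetical.

```python
import numpy as np


def with_gravity_comp(action, qfrc_bias, gear):
    """Return action plus the bias force scaled by actuator gear, without mutating the input."""
    action = np.asarray(action, dtype=float)
    bias = np.asarray(qfrc_bias, dtype=float)[:len(action)]
    gear = np.asarray(gear, dtype=float)
    # Binary + allocates a new array, so the caller's action stays untouched
    return action + bias / gear
```

Returning a fresh array keeps the action that ends up in the `infos` dictionary separate from whatever buffer the policy passed in.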
--- a/alr_envs/alr/mujoco/beerpong/beerpong_reward.py
+++ b/alr_envs/alr/mujoco/beerpong/beerpong_reward.py
@@ -1,171 +0,0 @@
-import numpy as np
-
-
-class BeerPongReward:
-    def __init__(self):
-
-        self.robot_collision_objects = ["wrist_palm_link_convex_geom",
-                                        "wrist_pitch_link_convex_decomposition_p1_geom",
-                                        "wrist_pitch_link_convex_decomposition_p2_geom",
-                                        "wrist_pitch_link_convex_decomposition_p3_geom",
-                                        "wrist_yaw_link_convex_decomposition_p1_geom",
-                                        "wrist_yaw_link_convex_decomposition_p2_geom",
-                                        "forearm_link_convex_decomposition_p1_geom",
-                                        "forearm_link_convex_decomposition_p2_geom",
-                                        "upper_arm_link_convex_decomposition_p1_geom",
-                                        "upper_arm_link_convex_decomposition_p2_geom",
-                                        "shoulder_link_convex_decomposition_p1_geom",
-                                        "shoulder_link_convex_decomposition_p2_geom",
-                                        "shoulder_link_convex_decomposition_p3_geom",
-                                        "base_link_convex_geom", "table_contact_geom"]
-
-        self.cup_collision_objects = ["cup_geom_table3", "cup_geom_table4", "cup_geom_table5", "cup_geom_table6",
-                                      "cup_geom_table7", "cup_geom_table8", "cup_geom_table9", "cup_geom_table10",
-                                      # "cup_base_table", "cup_base_table_contact",
-                                      "cup_geom_table15",
-                                      "cup_geom_table16",
-                                      "cup_geom_table17", "cup_geom1_table8",
-                                      # "cup_base_table_contact",
-                                      # "cup_base_table"
-                                      ]
-
-
-        self.ball_traj = None
-        self.dists = None
-        self.dists_final = None
-        self.costs = None
-        self.action_costs = None
-        self.angle_rewards = None
-        self.cup_angles = None
-        self.cup_z_axes = None
-        self.collision_penalty = 500
-        self.reset(None)
-
-    def reset(self, context):
-        self.ball_traj = []
-        self.dists = []
-        self.dists_final = []
-        self.costs = []
-        self.action_costs = []
-        self.angle_rewards = []
-        self.cup_angles = []
-        self.cup_z_axes = []
-        self.ball_ground_contact = False
-        self.ball_table_contact = False
-        self.ball_wall_contact = False
-        self.ball_cup_contact = False
-
-    def compute_reward(self, env, action):
-        self.ball_id = env.sim.model._body_name2id["ball"]
-        self.ball_collision_id = env.sim.model._geom_name2id["ball_geom"]
-        self.goal_id = env.sim.model._site_name2id["cup_goal_table"]
-        self.goal_final_id = env.sim.model._site_name2id["cup_goal_final_table"]
-        self.cup_collision_ids = [env.sim.model._geom_name2id[name] for name in self.cup_collision_objects]
-        self.cup_table_id = env.sim.model._body_name2id["cup_table"]
-        self.table_collision_id = env.sim.model._geom_name2id["table_contact_geom"]
-        self.wall_collision_id = env.sim.model._geom_name2id["wall"]
-        self.cup_table_collision_id = env.sim.model._geom_name2id["cup_base_table_contact"]
-        self.init_ball_pos_site_id = env.sim.model._site_name2id["init_ball_pos_site"]
-        self.ground_collision_id = env.sim.model._geom_name2id["ground"]
-        self.robot_collision_ids = [env.sim.model._geom_name2id[name] for name in self.robot_collision_objects]
-
-        goal_pos = env.sim.data.site_xpos[self.goal_id]
-        ball_pos = env.sim.data.body_xpos[self.ball_id]
-        ball_vel = env.sim.data.body_xvelp[self.ball_id]
-        goal_final_pos = env.sim.data.site_xpos[self.goal_final_id]
-        self.dists.append(np.linalg.norm(goal_pos - ball_pos))
-        self.dists_final.append(np.linalg.norm(goal_final_pos - ball_pos))
-
-        action_cost = np.sum(np.square(action))
-        self.action_costs.append(action_cost)
-
-        ball_table_bounce = self._check_collision_single_objects(env.sim, self.ball_collision_id,
-                                                                 self.table_collision_id)
-
-        if ball_table_bounce:  # or ball_cup_table_cont or ball_wall_con
-            self.ball_table_contact = True
-
-        ball_cup_cont = self._check_collision_with_set_of_objects(env.sim, self.ball_collision_id,
-                                                                        self.cup_collision_ids)
-        if ball_cup_cont:
-            self.ball_cup_contact = True
-
-        ball_wall_cont = self._check_collision_single_objects(env.sim, self.ball_collision_id, self.wall_collision_id)
-        if ball_wall_cont and not self.ball_table_contact:
-            self.ball_wall_contact = True
-
-        ball_ground_contact = self._check_collision_single_objects(env.sim, self.ball_collision_id,
-                                                                   self.ground_collision_id)
-        if ball_ground_contact and not self.ball_table_contact:
-            self.ball_ground_contact = True
-
-        self._is_collided = self._check_collision_with_itself(env.sim, self.robot_collision_ids)
-        if env._steps == env.ep_length - 1 or self._is_collided:
-
-            min_dist = np.min(self.dists)
-
-            ball_in_cup = self._check_collision_single_objects(env.sim, self.ball_collision_id, self.cup_table_collision_id)
-
-            cost_offset = 0
-
-            if self.ball_ground_contact:  #  or self.ball_wall_contact:
-                cost_offset += 2
-
-            if not self.ball_table_contact:
-                cost_offset += 2
-            
-            if not ball_in_cup:
-                cost_offset += 2
-                cost = cost_offset + min_dist ** 2 + 0.5 * self.dists_final[-1] ** 2 + 1e-4 * action_cost  # + min_dist ** 2
-            else:
-                if self.ball_cup_contact:
-                    cost_offset += 1
-                cost = cost_offset + self.dists_final[-1] ** 2 + 1e-4 * action_cost
-
-            reward = - 1*cost - self.collision_penalty * int(self._is_collided)
-            success = ball_in_cup and not self.ball_ground_contact and not self.ball_wall_contact and not self.ball_cup_contact
-        else:
-            reward = - 1e-4 * action_cost
-            success = False
-
-        infos = {}
-        infos["success"] = success
-        infos["is_collided"] = self._is_collided
-        infos["ball_pos"] = ball_pos.copy()
-        infos["ball_vel"] = ball_vel.copy()
-        infos["action_cost"] = 5e-4 * action_cost
-
-        return reward, infos
-
-    def _check_collision_single_objects(self, sim, id_1, id_2):
-        for coni in range(0, sim.data.ncon):
-            con = sim.data.contact[coni]
-
-            collision = con.geom1 == id_1 and con.geom2 == id_2
-            collision_trans = con.geom1 == id_2 and con.geom2 == id_1
-
-            if collision or collision_trans:
-                return True
-        return False
-
-    def _check_collision_with_itself(self, sim, collision_ids):
-        col_1, col_2 = False, False
-        for j, id in enumerate(collision_ids):
-            col_1 = self._check_collision_with_set_of_objects(sim, id, collision_ids[:j])
-            if j != len(collision_ids) - 1:
-                col_2 = self._check_collision_with_set_of_objects(sim, id, collision_ids[j + 1:])
-            else:
-                col_2 = False
-        collision = True if col_1 or col_2 else False
-        return collision
-
-    def _check_collision_with_set_of_objects(self, sim, id_1, id_list):
-        for coni in range(0, sim.data.ncon):
-            con = sim.data.contact[coni]
-
-            collision = con.geom1 in id_list and con.geom2 == id_1
-            collision_trans = con.geom1 == id_1 and con.geom2 in id_list
-
-            if collision or collision_trans:
-                return True
-        return False
--- a/alr_envs/alr/mujoco/beerpong/beerpong_reward_simple.py
+++ b/alr_envs/alr/mujoco/beerpong/beerpong_reward_simple.py
@ -1,141 +0,0 @@
-import numpy as np
-from alr_envs.alr.mujoco import alr_reward_fct
-
-
-class BeerpongReward(alr_reward_fct.AlrReward):
-    def __init__(self, sim, sim_time):
-
-        self.sim = sim
-        self.sim_time = sim_time
-
-        self.collision_objects = ["cup_geom1", "cup_geom2", "wrist_palm_link_convex_geom",
-                                  "wrist_pitch_link_convex_decomposition_p1_geom",
-                                  "wrist_pitch_link_convex_decomposition_p2_geom",
-                                  "wrist_pitch_link_convex_decomposition_p3_geom",
-                                  "wrist_yaw_link_convex_decomposition_p1_geom",
-                                  "wrist_yaw_link_convex_decomposition_p2_geom",
-                                  "forearm_link_convex_decomposition_p1_geom",
-                                  "forearm_link_convex_decomposition_p2_geom"]
-
-        self.ball_id = None
-        self.ball_collision_id = None
-        self.goal_id = None
-        self.goal_final_id = None
-        self.collision_ids = None
-
-        self.ball_traj = None
-        self.dists = None
-        self.dists_ctxt = None
-        self.dists_final = None
-        self.costs = None
-
-        self.reset(None)
-
-    def reset(self, context):
-        self.ball_traj = np.zeros(shape=(self.sim_time, 3))
-        self.dists = []
-        self.dists_ctxt = []
-        self.dists_final = []
-        self.costs = []
-        self.action_costs = []
-        self.context = context
-        self.ball_in_cup = False
-        self.dist_ctxt = 5
-        self.bounce_dist = 2
-        self.min_dist = 2
-        self.dist_final = 2
-        self.table_contact = False
-
-        self.ball_id = self.sim.model._body_name2id["ball"]
-        self.ball_collision_id = self.sim.model._geom_name2id["ball_geom"]
-        self.cup_robot_id = self.sim.model._site_name2id["cup_robot_final"]
-        self.goal_id = self.sim.model._site_name2id["cup_goal_table"]
-        self.goal_final_id = self.sim.model._site_name2id["cup_goal_final_table"]
-        self.collision_ids = [self.sim.model._geom_name2id[name] for name in self.collision_objects]
-        self.cup_table_id = self.sim.model._body_name2id["cup_table"]
-        self.bounce_table_id = self.sim.model._site_name2id["bounce_table"]
-
-    def compute_reward(self, action, sim, step):
-        action_cost = np.sum(np.square(action))
-        self.action_costs.append(action_cost)
-
-        stop_sim = False
-        success = False
-
-        if self.check_collision(sim):
-            reward = - 1e-2 * action_cost - 10
-            stop_sim = True
-            return reward, success, stop_sim
-
-        # Compute the current distance from the ball to the inner part of the cup
-        goal_pos = sim.data.site_xpos[self.goal_id]
-        ball_pos = sim.data.body_xpos[self.ball_id]
-        bounce_pos = sim.data.site_xpos[self.bounce_table_id]
-        goal_final_pos = sim.data.site_xpos[self.goal_final_id]
-        self.dists.append(np.linalg.norm(goal_pos - ball_pos))
-        self.dists_final.append(np.linalg.norm(goal_final_pos - ball_pos))
-        self.ball_traj[step, :] = ball_pos
-
-        ball_in_cup = self.check_ball_in_cup(sim, self.ball_collision_id)
-        table_contact = self.check_ball_table_contact(sim, self.ball_collision_id)
-
-        if table_contact and not self.table_contact:
-            self.bounce_dist = np.minimum((np.linalg.norm(bounce_pos - ball_pos)), 2)
-            self.table_contact = True
-
-        if step == self.sim_time - 1:
-            min_dist = np.min(self.dists)
-            self.min_dist = min_dist
-            dist_final = self.dists_final[-1]
-            self.dist_final = dist_final
-
-            cost = 0.33 * min_dist + 0.33 * dist_final + 0.33 * self.bounce_dist
-            reward = np.exp(-2 * cost) - 1e-2 * action_cost
-            success = self.bounce_dist < 0.05 and dist_final < 0.05 and ball_in_cup
-        else:
-            reward = - 1e-2 * action_cost
-            success = False
-
-        return reward, success, stop_sim
-
-    def _get_stage_wise_cost(self, ball_in_cup, min_dist, dist_final, dist_to_ctxt):
-        if not ball_in_cup:
-            cost = 3 + 2*(0.5 * min_dist**2 + 0.5 * dist_final**2)
-        else:
-            cost = 2 * dist_to_ctxt ** 2
-            print('Context Distance:', dist_to_ctxt)
-        return cost
-
-    def check_ball_table_contact(self, sim, ball_collision_id):
-        table_collision_id = sim.model._geom_name2id["table_contact_geom"]
-        for coni in range(0, sim.data.ncon):
-            con = sim.data.contact[coni]
-            collision = con.geom1 == table_collision_id and con.geom2 == ball_collision_id
-            collision_trans = con.geom1 == ball_collision_id and con.geom2 == table_collision_id
-
-            if collision or collision_trans:
-                return True
-        return False
-
-    def check_ball_in_cup(self, sim, ball_collision_id):
-        cup_base_collision_id = sim.model._geom_name2id["cup_base_table_contact"]
-        for coni in range(0, sim.data.ncon):
-            con = sim.data.contact[coni]
-
-            collision = con.geom1 == cup_base_collision_id and con.geom2 == ball_collision_id
-            collision_trans = con.geom1 == ball_collision_id and con.geom2 == cup_base_collision_id
-
-            if collision or collision_trans:
-                return True
-        return False
-
-    def check_collision(self, sim):
-        for coni in range(0, sim.data.ncon):
-            con = sim.data.contact[coni]
-
-            collision = con.geom1 in self.collision_ids and con.geom2 == self.ball_collision_id
-            collision_trans = con.geom1 == self.ball_collision_id and con.geom2 in self.collision_ids
-
-            if collision or collision_trans:
-                return True
-        return False
--- a/alr_envs/alr/mujoco/beerpong/beerpong_reward_staged.py
+++ b/alr_envs/alr/mujoco/beerpong/beerpong_reward_staged.py
@ -1,158 +0,0 @@
-import numpy as np
-
-
-class BeerPongReward:
-    def __init__(self):
-
-        self.robot_collision_objects = ["wrist_palm_link_convex_geom",
-                                        "wrist_pitch_link_convex_decomposition_p1_geom",
-                                        "wrist_pitch_link_convex_decomposition_p2_geom",
-                                        "wrist_pitch_link_convex_decomposition_p3_geom",
-                                        "wrist_yaw_link_convex_decomposition_p1_geom",
-                                        "wrist_yaw_link_convex_decomposition_p2_geom",
-                                        "forearm_link_convex_decomposition_p1_geom",
-                                        "forearm_link_convex_decomposition_p2_geom",
-                                        "upper_arm_link_convex_decomposition_p1_geom",
-                                        "upper_arm_link_convex_decomposition_p2_geom",
-                                        "shoulder_link_convex_decomposition_p1_geom",
-                                        "shoulder_link_convex_decomposition_p2_geom",
-                                        "shoulder_link_convex_decomposition_p3_geom",
-                                        "base_link_convex_geom", "table_contact_geom"]
-
-        self.cup_collision_objects = ["cup_geom_table3", "cup_geom_table4", "cup_geom_table5", "cup_geom_table6",
-                                      "cup_geom_table7", "cup_geom_table8", "cup_geom_table9", "cup_geom_table10",
-                                      # "cup_base_table", "cup_base_table_contact",
-                                      "cup_geom_table15",
-                                      "cup_geom_table16",
-                                      "cup_geom_table17", "cup_geom1_table8",
-                                      # "cup_base_table_contact",
-                                      # "cup_base_table"
-                                      ]
-
-
-        self.ball_traj = None
-        self.dists = None
-        self.dists_final = None
-        self.costs = None
-        self.action_costs = None
-        self.angle_rewards = None
-        self.cup_angles = None
-        self.cup_z_axes = None
-        self.collision_penalty = 500
-        self.reset(None)
-
-    def reset(self, noisy):
-        self.ball_traj = []
-        self.dists = []
-        self.dists_final = []
-        self.costs = []
-        self.action_costs = []
-        self.angle_rewards = []
-        self.cup_angles = []
-        self.cup_z_axes = []
-        self.ball_ground_contact = False
-        self.ball_table_contact = False
-        self.ball_wall_contact = False
-        self.ball_cup_contact = False
-        self.noisy_bp = noisy
-        self._t_min_final_dist = -1
-
-    def compute_reward(self, env, action):
-        self.ball_id = env.sim.model._body_name2id["ball"]
-        self.ball_collision_id = env.sim.model._geom_name2id["ball_geom"]
-        self.goal_id = env.sim.model._site_name2id["cup_goal_table"]
-        self.goal_final_id = env.sim.model._site_name2id["cup_goal_final_table"]
-        self.cup_collision_ids = [env.sim.model._geom_name2id[name] for name in self.cup_collision_objects]
-        self.cup_table_id = env.sim.model._body_name2id["cup_table"]
-        self.table_collision_id = env.sim.model._geom_name2id["table_contact_geom"]
-        self.wall_collision_id = env.sim.model._geom_name2id["wall"]
-        self.cup_table_collision_id = env.sim.model._geom_name2id["cup_base_table_contact"]
-        self.init_ball_pos_site_id = env.sim.model._site_name2id["init_ball_pos_site"]
-        self.ground_collision_id = env.sim.model._geom_name2id["ground"]
-        self.robot_collision_ids = [env.sim.model._geom_name2id[name] for name in self.robot_collision_objects]
-
-        goal_pos = env.sim.data.site_xpos[self.goal_id]
-        ball_pos = env.sim.data.body_xpos[self.ball_id]
-        ball_vel = env.sim.data.body_xvelp[self.ball_id]
-        goal_final_pos = env.sim.data.site_xpos[self.goal_final_id]
-        self.dists.append(np.linalg.norm(goal_pos - ball_pos))
-        self.dists_final.append(np.linalg.norm(goal_final_pos - ball_pos))
-
-        action_cost = np.sum(np.square(action))
-        self.action_costs.append(action_cost)
-
-        if not self.ball_table_contact:
-            self.ball_table_contact = self._check_collision_single_objects(env.sim, self.ball_collision_id,
-                                                                           self.table_collision_id)
-
-        self._is_collided = self._check_collision_with_itself(env.sim, self.robot_collision_ids)
-        if env._steps == env.ep_length - 1 or self._is_collided:
-
-            min_dist = np.min(self.dists)
-            final_dist = self.dists_final[-1]
-
-            ball_in_cup = self._check_collision_single_objects(env.sim, self.ball_collision_id,
-                                                               self.cup_table_collision_id)
-
-            # encourage bounce before falling into cup
-            if not ball_in_cup:
-                if not self.ball_table_contact:
-                    reward = 0.2 * (1 - np.tanh(min_dist ** 2)) + 0.1 * (1 - np.tanh(final_dist ** 2))
-                else:
-                    reward = (1 - np.tanh(min_dist ** 2)) + 0.5 * (1 - np.tanh(final_dist ** 2))
-            else:
-                if not self.ball_table_contact:
-                    reward = 2 * (1 - np.tanh(final_dist ** 2)) + 1 * (1 - np.tanh(min_dist ** 2)) + 1
-                else:
-                    reward = 2 * (1 - np.tanh(final_dist ** 2)) + 1 * (1 - np.tanh(min_dist ** 2)) + 3
-
-            # reward = - 1 * cost - self.collision_penalty * int(self._is_collided)
-            success = ball_in_cup
-            crash = self._is_collided
-        else:
-            reward = - 1e-2 * action_cost
-            success = False
-            crash = False
-
-        infos = {}
-        infos["success"] = success
-        infos["is_collided"] = self._is_collided
-        infos["ball_pos"] = ball_pos.copy()
-        infos["ball_vel"] = ball_vel.copy()
-        infos["action_cost"] = action_cost
-        infos["task_reward"] = reward
-
-        return reward, infos
-
-    def _check_collision_single_objects(self, sim, id_1, id_2):
-        for coni in range(0, sim.data.ncon):
-            con = sim.data.contact[coni]
-
-            collision = con.geom1 == id_1 and con.geom2 == id_2
-            collision_trans = con.geom1 == id_2 and con.geom2 == id_1
-
-            if collision or collision_trans:
-                return True
-        return False
-
-    def _check_collision_with_itself(self, sim, collision_ids):
-        col_1, col_2 = False, False
-        for j, id in enumerate(collision_ids):
-            col_1 = self._check_collision_with_set_of_objects(sim, id, collision_ids[:j])
-            if j != len(collision_ids) - 1:
-                col_2 = self._check_collision_with_set_of_objects(sim, id, collision_ids[j + 1:])
-            else:
-                col_2 = False
-        collision = True if col_1 or col_2 else False
-        return collision
-
-    def _check_collision_with_set_of_objects(self, sim, id_1, id_list):
-        for coni in range(0, sim.data.ncon):
-            con = sim.data.contact[coni]
-
-            collision = con.geom1 in id_list and con.geom2 == id_1
-            collision_trans = con.geom1 == id_1 and con.geom2 in id_list
-
-            if collision or collision_trans:
-                return True
-        return False
--- a/alr_envs/alr/mujoco/beerpong/beerpong_simple.py
+++ b/alr_envs/alr/mujoco/beerpong/beerpong_simple.py
@ -1,166 +0,0 @@
-from gym import utils
-import os
-import numpy as np
-from gym.envs.mujoco import MujocoEnv
-
-
-class ALRBeerpongEnv(MujocoEnv, utils.EzPickle):
-    def __init__(self, n_substeps=4, apply_gravity_comp=True, reward_function=None):
-        self._steps = 0
-
-        self.xml_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), "assets",
-                                     "beerpong" + ".xml")
-
-        self.start_pos = np.array([0.0, 1.35, 0.0, 1.18, 0.0, -0.786, -1.59])
-        self.start_vel = np.zeros(7)
-
-        self._q_pos = []
-        self._q_vel = []
-        # self.weight_matrix_scale = 50
-        self.max_ctrl = np.array([150., 125., 40., 60., 5., 5., 2.])
-        self.p_gains = 1 / self.max_ctrl * np.array([200, 300, 100, 100, 10, 10, 2.5])
-        self.d_gains = 1 / self.max_ctrl * np.array([7, 15, 5, 2.5, 0.3, 0.3, 0.05])
-
-        self.j_min = np.array([-2.6, -1.985, -2.8, -0.9, -4.55, -1.5707, -2.7])
-        self.j_max = np.array([2.6, 1.985, 2.8, 3.14159, 1.25, 1.5707, 2.7])
-
-        self.context = None
-
-        # alr_mujoco_env.AlrMujocoEnv.__init__(self,
-        #                                      self.xml_path,
-        #                                      apply_gravity_comp=apply_gravity_comp,
-        #                                      n_substeps=n_substeps)
-
-        self.sim_time = 8  # seconds
-        # self.sim_steps = int(self.sim_time / self.dt)
-        if reward_function is None:
-            from alr_envs.alr.mujoco.beerpong.beerpong_reward_simple import BeerpongReward
-            reward_function = BeerpongReward
-        self.reward_function = reward_function(self.sim, self.sim_steps)
-        self.cup_robot_id = self.sim.model._site_name2id["cup_robot_final"]
-        self.ball_id = self.sim.model._body_name2id["ball"]
-        self.cup_table_id = self.sim.model._body_name2id["cup_table"]
-        # self.bounce_table_id = self.sim.model._body_name2id["bounce_table"]
-
-        MujocoEnv.__init__(self, model_path=self.xml_path, frame_skip=n_substeps)
-        utils.EzPickle.__init__(self)
-
-    @property
-    def current_pos(self):
-        return self.sim.data.qpos[0:7].copy()
-
-    @property
-    def current_vel(self):
-        return self.sim.data.qvel[0:7].copy()
-
-    def configure(self, context):
-        if context is None:
-            context = np.array([0, -2, 0.840])
-        self.context = context
-        self.reward_function.reset(context)
-
-    def reset_model(self):
-        init_pos_all = self.init_qpos.copy()
-        init_pos_robot = self.start_pos
-        init_vel = np.zeros_like(init_pos_all)
-
-        self._steps = 0
-        self._q_pos = []
-        self._q_vel = []
-
-        start_pos = init_pos_all
-        start_pos[0:7] = init_pos_robot
-        # start_pos[7:] = np.copy(self.sim.data.site_xpos[self.cup_robot_id, :]) + np.array([0., 0.0, 0.05])
-
-        self.set_state(start_pos, init_vel)
-
-        ball_pos = np.copy(self.sim.data.site_xpos[self.cup_robot_id, :]) + np.array([0., 0.0, 0.05])
-        self.sim.model.body_pos[self.ball_id] = ball_pos.copy()
-        self.sim.model.body_pos[self.cup_table_id] = self.context.copy()
-        # self.sim.model.body_pos[self.bounce_table_id] = self.context.copy()
-
-        self.sim.forward()
-
-        return self._get_obs()
-
-    def step(self, a):
-        reward_dist = 0.0
-        angular_vel = 0.0
-        reward_ctrl = - np.square(a).sum()
-        action_cost = np.sum(np.square(a))
-
-        crash = self.do_simulation(a, self.frame_skip)
-        joint_cons_viol = self.check_traj_in_joint_limits()
-
-        self._q_pos.append(self.sim.data.qpos[0:7].ravel().copy())
-        self._q_vel.append(self.sim.data.qvel[0:7].ravel().copy())
-
-        ob = self._get_obs()
-
-        if not crash and not joint_cons_viol:
-            reward, success, stop_sim = self.reward_function.compute_reward(a, self.sim, self._steps)
-            done = success or self._steps == self.sim_steps - 1 or stop_sim
-            self._steps += 1
-        else:
-            reward = -10 - 1e-2 * action_cost
-            success = False
-            done = True
-        return ob, reward, done, dict(reward_dist=reward_dist,
-                                      reward_ctrl=reward_ctrl,
-                                      velocity=angular_vel,
-                                      traj=self._q_pos, is_success=success,
-                                      is_collided=crash or joint_cons_viol)
-
-    def check_traj_in_joint_limits(self):
-        return any(self.current_pos > self.j_max) or any(self.current_pos < self.j_min)
-
-    def extend_des_pos(self, des_pos):
-        des_pos_full = self.start_pos.copy()
-        des_pos_full[1] = des_pos[0]
-        des_pos_full[3] = des_pos[1]
-        des_pos_full[5] = des_pos[2]
-        return des_pos_full
-
-    def extend_des_vel(self, des_vel):
-        des_vel_full = self.start_vel.copy()
-        des_vel_full[1] = des_vel[0]
-        des_vel_full[3] = des_vel[1]
-        des_vel_full[5] = des_vel[2]
-        return des_vel_full
-
-    def _get_obs(self):
-        theta = self.sim.data.qpos.flat[:7]
-        return np.concatenate([
-            np.cos(theta),
-            np.sin(theta),
-            # self.get_body_com("target"),  # only return target to make problem harder
-            [self._steps],
-        ])
-
-
-
-if __name__ == "__main__":
-    env = ALRBeerpongEnv()
-    ctxt = np.array([0, -2, 0.840])    # initial
-
-    env.configure(ctxt)
-    env.reset()
-    env.render()
-    for i in range(16000):
-        # test with random actions
-        ac = 0.0 * env.action_space.sample()[0:7]
-        ac[1] = -0.01
-        ac[3] = - 0.01
-        ac[5] = -0.01
-        # ac = env.start_pos
-        # ac[0] += np.pi/2
-        obs, rew, d, info = env.step(ac)
-        env.render()
-
-        print(rew)
-
-        if d:
-            break
-
-    env.close()
-
--- a/alr_envs/alr/mujoco/beerpong/mp_wrapper.py
+++ b/alr_envs/alr/mujoco/beerpong/mp_wrapper.py
@ -1,39 +0,0 @@
-from typing import Tuple, Union
-
-import numpy as np
-
-from mp_env_api.interface_wrappers.mp_env_wrapper import MPEnvWrapper
-
-
-class MPWrapper(MPEnvWrapper):
-
-    @property
-    def active_obs(self):
-        # TODO: @Max Filter observations correctly
-        return np.hstack([
-            [False] * 7,  # cos
-            [False] * 7,  # sin
-            # [True] * 2,  # x-y coordinates of target distance
-            [False]  # env steps
-        ])
-
-    @property
-    def start_pos(self):
-        return self._start_pos
-
-    @property
-    def current_pos(self) -> Union[float, int, np.ndarray, Tuple]:
-        return self.sim.data.qpos[0:7].copy()
-
-    @property
-    def current_vel(self) -> Union[float, int, np.ndarray, Tuple]:
-        return self.sim.data.qvel[0:7].copy()
-
-    @property
-    def goal_pos(self):
-        # TODO: @Max I think the default value of returning to the start is reasonable here
-        raise ValueError("Goal position is not available and has to be learnt based on the environment.")
-
-    @property
-    def dt(self) -> Union[float, int]:
-        return self.env.dt
--- a/alr_envs/alr/mujoco/gym_table_tennis/__init__.py
+++ b/alr_envs/alr/mujoco/gym_table_tennis/__init__.py
@ -1 +0,0 @@
-
--- a/alr_envs/alr/mujoco/gym_table_tennis/envs/__init__.py
+++ b/alr_envs/alr/mujoco/gym_table_tennis/envs/__init__.py
@ -1 +0,0 @@
-
--- a/alr_envs/alr/mujoco/gym_table_tennis/envs/assets/include_7_motor_actuator.xml
+++ b/alr_envs/alr/mujoco/gym_table_tennis/envs/assets/include_7_motor_actuator.xml
@ -1,12 +0,0 @@
-<mujocoinclude>
-    <actuator boastype="none">
-        <motor name="wam/shoulder_yaw_link_right_motor" joint="wam/base_yaw_joint_right"/>
-        <motor name="wam/shoulder_pitch_joint_right_motor" joint='wam/shoulder_pitch_joint_right'/>
-        <motor name="wam/shoulder_yaw_joint_right_motor" joint='wam/shoulder_yaw_joint_right'/>
-        <motor name="wam/elbow_pitch_joint_right_motor" joint='wam/elbow_pitch_joint_right'/>
-        <motor name="wam/wrist_yaw_joint_right_motor" joint='wam/wrist_yaw_joint_right'/>
-        <motor name="wam/wrist_pitch_joint_right_motor" joint='wam/wrist_pitch_joint_right'/>
-        <motor name="wam/palm_yaw_joint_right_motor" joint='wam/palm_yaw_joint_right'/>
-    </actuator>
-</mujocoinclude>
-
--- a/alr_envs/alr/mujoco/gym_table_tennis/envs/assets/include_barrett_wam_7dof_left.xml
+++ b/alr_envs/alr/mujoco/gym_table_tennis/envs/assets/include_barrett_wam_7dof_left.xml
@ -1,76 +0,0 @@
-<mujocoinclude>
-    <body name="wam/base_link_left" pos="-2.5 0 2" quat="0 1 0 0" childclass="wam">
-            <inertial pos="0 0 0" mass="1" diaginertia="0.1 0.1 0.1"/>
-            <geom class="viz" mesh="base_link_fine" rgba="0.5 0.5 0.5 0"/>
-            <geom class="col" mesh="base_link_convex" rgba="0.5 0.5 0.5 1"/>
-            <body name="wam/shoulder_yaw_link" pos="0 0 0.346">
-                <inertial pos="-0.00443422 -0.00066489 -0.128904" quat="0.69566 0.716713 -0.0354863 0.0334839" mass="5"
-                          diaginertia="0.135089 0.113095 0.0904426"/>
-                <joint name="wam/base_yaw_joint" range="-2.6 2.6" damping="1.98"/>
-                <geom class="viz" mesh="shoulder_link_fine" rgba="1 1 1 0"/>
-                <geom class="col" mesh="shoulder_link_convex_decomposition_p1"/>
-                <geom class="col" mesh="shoulder_link_convex_decomposition_p2"/>
-                <geom class="col" mesh="shoulder_link_convex_decomposition_p3"/>
-                <body name="wam/shoulder_pitch_link" pos="0 0 0" quat="0.707107 -0.707107 0 0">
-                    <inertial pos="-0.00236981 -0.0154211 0.0310561" quat="0.961794 0.273112 -0.0169316 0.00866592"
-                              mass="3.87494" diaginertia="0.0214195 0.0167127 0.0126452"/> <!--seems off-->
-                    <joint name="wam/shoulder_pitch_joint" range="-1.985 1.985" damping="0.55"/>
-                    <geom class="viz" mesh="shoulder_pitch_link_fine" rgba="1 1 1 0"/>
-                    <geom class="col" mesh="shoulder_pitch_link_convex"/>
-                    <body name="wam/upper_arm_link" pos="0 0 0" quat="0.707107 0.707107 0 0">
-                        <inertial pos="0.00683259 3.309e-005 0.392492" quat="0.647136 0.0170822 0.0143038 0.762049"
-                                  mass="2.20228" diaginertia="0.0592718 0.0592207 0.00313419"/>
-                        <joint name="wam/shoulder_yaw_joint" range="-2.8 2.8" damping="1.65"/>
-                        <geom class="viz" mesh="upper_arm_link_fine" rgba="1 1 1 0"/>
-                        <geom class="col" mesh="upper_arm_link_convex_decomposition_p1" rgba="0.094 0.48 0.804 1"/>
-                        <geom class="col" mesh="upper_arm_link_convex_decomposition_p2" rgba="0.094 0.48 0.804 1"/>
-                        <body name="wam/forearm_link" pos="0.045 0 0.55" quat="0.707107 -0.707107 0 0">
-                            <inertial pos="-0.0400149 -0.142717 -0.00022942"
-                                      quat="0.704281 0.706326 0.0180333 0.0690353" mass="0.500168"
-                                      diaginertia="0.0151047 0.0148285 0.00275805"/>
-                            <joint name="wam/elbow_pitch_joint" range="-0.9 3.14159" damping="0.88"/>
-                            <geom class="viz" mesh="elbow_link_fine" rgba="1 1 1 0"/>
-                            <geom class="col" mesh="elbow_link_convex"/>
-                            <geom class="viz" mesh="forearm_link_fine" pos="-.045 -0.0730 0" euler="1.57 0 0" rgba="1 1 1 0"/>
-                            <geom class="col" mesh="forearm_link_convex_decomposition_p1" pos="-0.045 -0.0730 0"
-                                  euler="1.57 0 0" rgba="0.094 0.48 0.804 1"/>
-                            <geom class="col" mesh="forearm_link_convex_decomposition_p2" pos="-.045 -0.0730 0"
-                                  euler="1.57 0 0" rgba="0.094 0.48 0.804 1"/>
-                            <body name="wam/wrist_yaw_link" pos="-0.045 -0.3 0" quat="0.707107 0.707107 0 0">
-                                <inertial pos="8.921e-005 0.00435824 -0.00511217"
-                                          quat="0.630602 0.776093 0.00401969 -0.002372" mass="1.05376"
-                                          diaginertia="0.000555168 0.00046317 0.000234072"/> <!--this is an approximation-->
-                                <joint name="wam/wrist_yaw_joint" range="-4.55 1.25" damping="0.55"/>
-                                <geom class="viz" mesh="wrist_yaw_link_fine" rgba="1 1 1 0"/>
-                                <geom class="col" mesh="wrist_yaw_link_convex_decomposition_p1"/>
-                                <geom class="col" mesh="wrist_yaw_link_convex_decomposition_p2"/>
-                                <body name="wam/wrist_pitch_link" pos="0 0 0" quat="0.707107 -0.707107 0 0">
-                                    <inertial pos="-0.00012262 -0.0246834 -0.0170319"
-                                              quat="0.630602 0.776093 0.00401969 -0.002372" mass="0.517974"
-                                              diaginertia="0.000555168 0.00046317 0.000234072"/>
-                                    <joint name="wam/wrist_pitch_joint" range="-1.5707 1.5707" damping="0.11"/>
-                                    <geom class="viz" mesh="wrist_pitch_link_fine" rgba="1 1 1 0"/>
-                                    <geom class="col" mesh="wrist_pitch_link_convex_decomposition_p1" rgba="1 0.5 0.313 1"/>
-                                    <geom class="col" mesh="wrist_pitch_link_convex_decomposition_p2" rgba="1 0.5 0.313 1"/>
-                                    <geom class="col" mesh="wrist_pitch_link_convex_decomposition_p3" rgba="1 0.5 0.313 1"/>
-                                    <body name="wam/wrist_palm_link" pos="0 0 0" quat="0.707107 0.707107 0 0">
-                                        <inertial pos="0 0 0.055" quat="0.707107 0 0 0.707107" mass="0.0828613"
-                                                  diaginertia="0.00020683 0.00010859 0.00010851"/>
-                                        <joint name="wam/palm_yaw_joint" range="-3 3" damping="0.11"/>
-                                        <geom class="viz" mesh="wrist_palm_link_fine" rgba="1 1 1 0"/>
-                                        <geom class="col" mesh="wrist_palm_link_convex"/>
-                                        <body name="paddle_left" pos="0 0 0.26" childclass="contact_geom">
-                                            <geom name="bat_left" type="cylinder" size="0.075 0.0015" rgba="1 0 0 1"
-                                                  quat="0.71 0 0.71 0"/>
-                                            <geom name="handle_left" type="box" size="0.005 0.01 0.05" pos="0 0 -0.08"
-                                                  rgba="1 1 1 1"/>
-                                        </body>
-                                    </body>
-                                </body>
-                            </body>
-                        </body>
-                    </body>
-                </body>
-            </body>
-        </body>
-</mujocoinclude>
--- a/alr_envs/alr/mujoco/gym_table_tennis/envs/assets/include_barrett_wam_7dof_right.xml
+++ b/alr_envs/alr/mujoco/gym_table_tennis/envs/assets/include_barrett_wam_7dof_right.xml
@ -1,95 +0,0 @@
-<mujocoinclue>
-    <body name="wam/base_link_right" pos="2.5 0 2" quat="0 0 1 0" childclass="wam" >
-            <inertial pos="0 0 0" mass="1" diaginertia="0.1 0.1 0.1"/>
-            <geom name="base_link_fine" class="viz" mesh="base_link_fine" rgba="0.5 0.5 0.5 0"/>
-            <geom name="base_link_convex" class="col" mesh="base_link_convex" rgba="0.5 0.5 0.5 1"/>
-            <body name="wam/shoulder_yaw_link_right" pos="0 0 0.346">
-                <inertial pos="-0.00443422 -0.00066489 -0.128904" quat="0.69566 0.716713 -0.0354863 0.0334839" mass="5"
-                          diaginertia="0.135089 0.113095 0.0904426"/>
-                <joint name="wam/base_yaw_joint_right" range="-2.6 2.6" damping="1.98"/>
-                <geom name="shoulder_link_fine" class="viz" mesh="shoulder_link_fine" rgba="1 1 1 0"/>
-                <geom name="shoulder_link_convex_decomposition_p1" class="col"
-                      mesh="shoulder_link_convex_decomposition_p1"/>
-                <geom name="shoulder_link_convex_decomposition_p2" class="col"
-                      mesh="shoulder_link_convex_decomposition_p2"/>
-                <geom name="shoulder_link_convex_decomposition_p3" class="col"
-                      mesh="shoulder_link_convex_decomposition_p3"/>
-                <body name="wam/shoulder_pitch_link_right" pos="0 0 0" quat="0.707107 -0.707107 0 0">
-                    <inertial pos="-0.00236981 -0.0154211 0.0310561" quat="0.961794 0.273112 -0.0169316 0.00866592"
-                              mass="3.87494" diaginertia="0.0214195 0.0167127 0.0126452"/> <!--seems off-->
-                    <joint name="wam/shoulder_pitch_joint_right" range="-2 2" damping="0.55"/>
-                    <geom name="shoulder_pitch_link_fine" class="viz" mesh="shoulder_pitch_link_fine" rgba="1 1 1 0"/>
-                    <geom name="shoulder_pitch_link_convex" class="col" mesh="shoulder_pitch_link_convex"/>
-                    <body name="wam/upper_arm_link_right" pos="0 0 0" quat="0.707107 0.707107 0 0">
-                        <inertial pos="0.00683259 3.309e-005 0.392492" quat="0.647136 0.0170822 0.0143038 0.762049"
-                                  mass="2.20228" diaginertia="0.0592718 0.0592207 0.00313419"/>
-                        <joint name="wam/shoulder_yaw_joint_right" range="-2.8 2.8" damping="1.65"/>
-                        <geom name="upper_arm_link_fine" class="viz" mesh="upper_arm_link_fine" rgba="1 1 1 0"/>
-                        <geom name="upper_arm_link_convex_decomposition_p1" class="col"
-                              mesh="upper_arm_link_convex_decomposition_p1" rgba="0.094 0.48 0.804 1"/>
-                        <geom name="upper_arm_link_convex_decomposition_p2" class="col"
-                              mesh="upper_arm_link_convex_decomposition_p2" rgba="0.094 0.48 0.804 1"/>
-                        <body name="wam/forearm_link_right" pos="0.045 0 0.55" quat="0.707107 -0.707107 0 0">
-                            <inertial pos="-0.0400149 -0.142717 -0.00022942"
-                                      quat="0.704281 0.706326 0.0180333 0.0690353" mass="0.500168"
-                                      diaginertia="0.0151047 0.0148285 0.00275805"/>
-                            <joint name="wam/elbow_pitch_joint_right" range="-0.9 3.1" damping="0.88"/>
-                            <geom name="elbow_link_fine" class="viz" mesh="elbow_link_fine" rgba="1 1 1 0"/>
-                            <geom name="elbow_link_convex" class="col" mesh="elbow_link_convex"/>
-                            <geom name="forearm_link_fine" class="viz" mesh="forearm_link_fine" pos="-.045 -0.0730 0"
-                                  euler="1.57 0 0" rgba="1 1 1 0"/>
-                            <geom name="forearm_link_convex_decomposition_p1" class="col"
-                                  mesh="forearm_link_convex_decomposition_p1" pos="-0.045 -0.0730 0"
-                                  euler="1.57 0 0" rgba="0.094 0.48 0.804 1"/>
-                            <geom name="forearm_link_convex_decomposition_p2" class="col"
-                                  mesh="forearm_link_convex_decomposition_p2" pos="-.045 -0.0730 0"
-                                  euler="1.57 0 0" rgba="0.094 0.48 0.804 1"/>
-                            <body name="wam/wrist_yaw_link_right" pos="-0.045 -0.3 0" quat="0.707107 0.707107 0 0">
-                                <inertial pos="8.921e-005 0.00435824 -0.00511217"
-                                          quat="0.630602 0.776093 0.00401969 -0.002372" mass="1.05376"
-                                          diaginertia="0.000555168 0.00046317 0.000234072"/> <!--this is an approximation-->
-                                <joint name="wam/wrist_yaw_joint_right" range="-4.8 1.3" damping="0.55"/>
-                                <geom name="wrist_yaw_link_fine" class="viz" mesh="wrist_yaw_link_fine" rgba="1 1 1 0"/>
-                                <geom name="wrist_yaw_link_convex_decomposition_p1" class="col"
-                                      mesh="wrist_yaw_link_convex_decomposition_p1"/>
-                                <geom name="wrist_yaw_link_convex_decomposition_p2" class="col"
-                                      mesh="wrist_yaw_link_convex_decomposition_p2"/>
-                                <body name="wam/wrist_pitch_link_right" pos="0 0 0" quat="0.707107 -0.707107 0 0">
-                                    <inertial pos="-0.00012262 -0.0246834 -0.0170319"
-                                              quat="0.630602 0.776093 0.00401969 -0.002372" mass="0.517974"
-                                              diaginertia="0.000555168 0.00046317 0.000234072"/>
-                                    <joint name="wam/wrist_pitch_joint_right" range="-1.6 1.6" damping="0.11"/>
-                                    <geom name="wrist_pitch_link_fine" class="viz" mesh="wrist_pitch_link_fine"
-                                          rgba="1 1 1 0"/>
-                                    <geom name="wrist_pitch_link_convex_decomposition_p1" rgba="1 0.5 0.313 1"
-                                          class="col" mesh="wrist_pitch_link_convex_decomposition_p1"/>
-                                    <geom name="wrist_pitch_link_convex_decomposition_p2" rgba="1 0.5 0.313 1"
-                                          class="col" mesh="wrist_pitch_link_convex_decomposition_p2"/>
-                                    <geom name="wrist_pitch_link_convex_decomposition_p3" rgba="1 0.5 0.313 1"
-                                          class="col" mesh="wrist_pitch_link_convex_decomposition_p3"/>
-                                    <body name="wam/wrist_palm_link_right" pos="0 0 0" quat="0.707107 0.707107 0 0">
-                                        <inertial pos="0 0 0.055" quat="0.707107 0 0 0.707107" mass="0.0828613"
-                                                  diaginertia="0.00020683 0.00010859 0.00010851"/>
-                                        <joint name="wam/palm_yaw_joint_right" range="-2.2 2.2" damping="0.11"/>
-                                        <geom name="wrist_palm_link_fine" class="viz" mesh="wrist_palm_link_fine"
-                                              rgba="1 1 1 0"/>
-                                        <geom name="wrist_palm_link_convex" class="col" mesh="wrist_palm_link_convex"/>
-<!--                                        EE=wam/paddle, configure name to the end effector name-->
-                                        <body name="EE" pos="0 0 0.26" childclass="contact_geom">
-                                            <geom name="bat" type="cylinder" size="0.075 0.005" rgba="1 0 0 1"
-                                                  quat="0.71 0 0.71 0"/>
-                                            <geom name="wam/paddle_handle" type="box" size="0.005 0.01 0.05" pos="0 0 -0.08"
-                                                  rgba="1 1 1 1"/>
-                                            <!--                                            Extract information for sampling goals.-->
-                                            <site name="wam/paddle_center" pos="0 0 0" rgba="1 1 1 1" size="0.00001"/>
-
-                                        </body>
-                                    </body>
-                                </body>
-                            </body>
-                        </body>
-                    </body>
-                </body>
-            </body>
-        </body>
-</mujocoinclue>
--- a/alr_envs/alr/mujoco/gym_table_tennis/envs/assets/include_table.xml
+++ b/alr_envs/alr/mujoco/gym_table_tennis/envs/assets/include_table.xml
@ -1,38 +0,0 @@
-<mujocoinclude>
-    <body name="table_tennis_table" pos="0 0 0">
-        <geom class="contact_geom" name="table_base_1" type="box" size="0.05 0.05 .375" rgba="1 1 1 1"
-              pos="1 0.7 0.375"/>
-        <geom class="contact_geom" name="table_base_2" type="box" size="0.05 0.05 .375" rgba="1 1 1 1"
-              pos="1 -0.7 0.375"/>
-        <geom class="contact_geom" name="table_base_3" type="box" size="0.05 0.05 .375" rgba="1 1 1 1"
-              pos="-1 -0.7 0.375"/>
-        <geom class="contact_geom" name="table_base_4" type="box" size="0.05 0.05 .375" rgba="1 1 1 1"
-              pos="-1 0.7 0.375"/>
-        <body name="table_top" pos="0 0 0.76">
-            <geom class="contact_geom" name="table_tennis_table" type="box" size="1.37 .7625 .01" rgba="0 0 0.5 1"
-                  pos="0 0 0"/>
-<!--            <geom class="contact_geom" name="table_tennis_table_right_side" type="box" size="0.685 .7625 .01"-->
-<!--                  rgba="0.5 0 0 1" pos="0.685 0 0"/>-->
-<!--            <geom class="contact_geom" name="table_tennis_table_left_side" type="box" size="0.685 .7625 .01"-->
-<!--                  rgba="0 0.5 0 1" pos="-0.685 0 0"/>-->
-            <site name="left_up_corner" pos="-1.37 .7625 0.01" rgba="1 1 1 1" size="0.00001"/>
-            <site name="middle_up_corner" pos="0 .7625 0.01" rgba="1 1 1 1" size="0.00001"/>
-            <site name="left_down_corner" pos="-1.37 -0.7625 0.01" rgba="1 1 1 1" size="0.00001"/>
-            <site name="middle_down_corner" pos="0 -.7625 0.01" rgba="1 1 1 1" size="0.00001"/>
-            <geom class="contact_geom" name="table_tennis_net" type="box" size="0.01 0.915 0.07625"
-                  material="floor_plane"
-                  rgba="0 0 1 0.5"
-                  pos="0 0 0.08625"/>
-            <geom class="contact_geom" name="left_while_line" type="box" size="1.37 .02 .001" rgba="1 1 1 1"
-                  pos="0 -0.7425 0.01"/>
-            <geom class="contact_geom" name="center_while_line" type="box" size="1.37 .01 .001" rgba="1 1 1 1"
-                  pos="0 0 0.01"/>
-            <geom class="contact_geom" name="right_while_line" type="box" size="1.37 .02 .001" rgba="1 1 1 1"
-                  pos="0 0.7425 0.01"/>
-            <geom class="contact_geom" name="right_side_line" type="box" size="0.02 .7625 .001" rgba="1 1 1 1"
-                  pos="1.35 0 0.01"/>
-            <geom class="contact_geom" name="left_side_line" type="box" size="0.02 .7625 .001" rgba="1 1 1 1"
-                  pos="-1.35 0 0.01"/>
-        </body>
-    </body>
-</mujocoinclude>
--- a/alr_envs/alr/mujoco/gym_table_tennis/envs/assets/include_target_ball.xml
+++ b/alr_envs/alr/mujoco/gym_table_tennis/envs/assets/include_target_ball.xml
@ -1,10 +0,0 @@
-<mujocoinclude>
-    <body name="target_ball" pos="-1.2 -0.6 1.5">
-        <joint axis="1 0 0" damping="0.0" name="tar:x" pos="0 0 0" stiffness="0" type="slide" frictionloss="0"/>
-        <joint axis="0 1 0" damping="0.0" name="tar:y" pos="0 0 0" stiffness="0" type="slide" frictionloss="0"/>
-        <joint axis="0 0 1" damping="0.0" name="tar:z" pos="0 0 0" stiffness="0" type="slide" frictionloss="0"/>
-        <geom size="0.025 0.025 0.025" type="sphere" condim="4" name="target_ball" rgba="1 1 0 1" mass="0.1"
-              friction="0.1 0.1 0.1" solimp="1 1 0" solref="0.1 0.03"/>
-        <site name="target_ball" pos="0 0 0" size="0.02 0.02 0.02" rgba="1 0 0 1" type="sphere"/>
-    </body>
-</mujocoinclude>
--- a/alr_envs/alr/mujoco/gym_table_tennis/envs/assets/include_test_balls.xml
+++ b/alr_envs/alr/mujoco/gym_table_tennis/envs/assets/include_test_balls.xml
@ -1,80 +0,0 @@
-<mujocoinclude>
-    <body name="test_ball_table" pos="1 0 4">
-        <joint axis="1 0 0" damping="0.0" name="tar:x_test_ball_table" pos="0 0 0" stiffness="0" type="slide"
-               frictionloss="0"/>
-        <joint axis="0 1 0" damping="0.0" name="tar:y_test_ball_table" pos="0 0 0" stiffness="0" type="slide"
-               frictionloss="0"/>
-        <joint axis="0 0 1" damping="0.0" name="tar:z_test_ball_table" pos="0 0 0" stiffness="0" type="slide"
-               frictionloss="0"/>
-        <geom size="0.025 0.025 0.025" type="sphere" condim="4" name="test_ball_table" rgba="0 1 0 1" mass="0.1"
-              friction="0.1 0.1 0.1" solimp="1 1 0" solref="0.1 0.03"/>
-        <site name="test_ball_table" pos="0 0 0" size="0.02 0.02 0.02" rgba="0 1 0 1" type="sphere"/>
-    </body>
-    <body name="test_ball_net" pos="0 0 4">
-        <joint axis="1 0 0" damping="0.0" name="tar:x_test_ball_net" pos="0 0 0" stiffness="0" type="slide"
-               frictionloss="0"/>
-        <joint axis="0 1 0" damping="0.0" name="tar:y_test_ball_net" pos="0 0 0" stiffness="0" type="slide"
-               frictionloss="0"/>
-        <joint axis="0 0 1" damping="0.0" name="tar:z_test_ball_net" pos="0 0 0" stiffness="0" type="slide"
-               frictionloss="0"/>
-        <geom size="0.025 0.025 0.025" type="sphere" condim="4" name="test_ball_net" rgba="1 1 0 1" mass="0.1"
-              friction="0.1 0.1 0.1" solimp="1 1 0" solref="0.1 0.03"/>
-        <site name="test_ball_net" pos="0 0 0" size="0.02 0.02 0.02" rgba="0 1 0 1" type="sphere"/>
-    </body>
-    <body name="test_ball_racket_0" pos="2.54919187 0.81642672 4">
-        <joint axis="1 0 0" damping="0.0" name="tar:x_test_ball_racket_0" pos="0 0 0" stiffness="0" type="slide"
-               frictionloss="0"/>
-        <joint axis="0 1 0" damping="0.0" name="tar:y_test_ball_racket_0" pos="0 0 0" stiffness="0" type="slide"
-               frictionloss="0"/>
-        <joint axis="0 0 1" damping="0.0" name="tar:z_test_ball_racket_0" pos="0 0 0" stiffness="0" type="slide"
-               frictionloss="0"/>
-        <geom size="0.025 0.025 0.025" type="sphere" condim="4" name="test_ball_racket_0" rgba="1 0 1 1" mass="0.1"
-              friction="0.1 0.1 0.1" solimp="1 1 0" solref="0.1 0.03"/>
-        <site name="test_ball_racket_0" pos="0 0 0" size="0.02 0.02 0.02" rgba="0 1 0 1" type="sphere"/>
-    </body>
-        <body name="test_ball_racket_1" pos="2.54919187 0.81642672 4.5">
-        <joint axis="1 0 0" damping="0.0" name="tar:x_test_ball_racket_1" pos="0 0 0" stiffness="0" type="slide"
-               frictionloss="0"/>
-        <joint axis="0 1 0" damping="0.0" name="tar:y_test_ball_racket_1" pos="0 0 0" stiffness="0" type="slide"
-               frictionloss="0"/>
-        <joint axis="0 0 1" damping="0.0" name="tar:z_test_ball_racket_1" pos="0 0 0" stiffness="0" type="slide"
-               frictionloss="0"/>
-        <geom size="0.025 0.025 0.025" type="sphere" condim="4" name="test_ball_racket_1" rgba="1 0 1 1" mass="0.1"
-              friction="0.1 0.1 0.1" solimp="1 1 0" solref="0.1 0.03"/>
-        <site name="test_ball_racket_1" pos="0 0 0" size="0.02 0.02 0.02" rgba="0 1 0 1" type="sphere"/>
-    </body>
-        <body name="test_ball_racket_2" pos="2.54919187 0.81642672 3">
-        <joint axis="1 0 0" damping="0.0" name="tar:x_test_ball_racket_2" pos="0 0 0" stiffness="0" type="slide"
-               frictionloss="0"/>
-        <joint axis="0 1 0" damping="0.0" name="tar:y_test_ball_racket_2" pos="0 0 0" stiffness="0" type="slide"
-               frictionloss="0"/>
-        <joint axis="0 0 1" damping="0.0" name="tar:z_test_ball_racket_2" pos="0 0 0" stiffness="0" type="slide"
-               frictionloss="0"/>
-        <geom size="0.025 0.025 0.025" type="sphere" condim="4" name="test_ball_racket_2" rgba="1 0 1 1" mass="0.1"
-              friction="0.1 0.1 0.1" solimp="1 1 0" solref="0.1 0.03"/>
-        <site name="test_ball_racket_2" pos="0 0 0" size="0.02 0.02 0.02" rgba="0 1 0 1" type="sphere"/>
-    </body>
-        <body name="test_ball_racket_3" pos="2.54919187 0.81642672 10">
-        <joint axis="1 0 0" damping="0.0" name="tar:x_test_ball_racket_3" pos="0 0 0" stiffness="0" type="slide"
-               frictionloss="0"/>
-        <joint axis="0 1 0" damping="0.0" name="tar:y_test_ball_racket_3" pos="0 0 0" stiffness="0" type="slide"
-               frictionloss="0"/>
-        <joint axis="0 0 1" damping="0.0" name="tar:z_test_ball_racket_3" pos="0 0 0" stiffness="0" type="slide"
-               frictionloss="0"/>
-        <geom size="0.025 0.025 0.025" type="sphere" condim="4" name="test_ball_racket_3" rgba="1 0 1 1" mass="0.1"
-              friction="0.1 0.1 0.1" solimp="1 1 0" solref="0.1 0.03"/>
-        <site name="test_ball_racket_3" pos="0 0 0" size="0.02 0.02 0.02" rgba="0 1 0 1" type="sphere"/>
-    </body>
-<!--            <body name="test_ball_racket_4" pos="2.54919187 0.81642672 4">-->
-<!--        <joint axis="1 0 0" damping="0.0" name="tar:x_test_ball_racket_4" pos="0 0 0" stiffness="0" type="slide"-->
-<!--               frictionloss="0"/>-->
-<!--        <joint axis="0 1 0" damping="0.0" name="tar:y_test_ball_racket_4" pos="0 0 0" stiffness="0" type="slide"-->
-<!--               frictionloss="0"/>-->
-<!--        <joint axis="0 0 1" damping="0.0" name="tar:z_test_ball_racket_4" pos="0 0 0" stiffness="0" type="slide"-->
-<!--               frictionloss="0"/>-->
-<!--        <geom size="0.025 0.025 0.025" type="sphere" condim="4" name="test_ball_racket_4" rgba="1 0 0 1" mass="0.1"-->
-<!--              friction="0.1 0.1 0.1" solimp="1 1 0" solref="0.1 0.03"/>-->
-<!--        <site name="test_ball_racket_4" pos="0 0 0" size="0.02 0.02 0.02" rgba="0 1 0 1" type="sphere"/>-->
-<!--    </body>-->
-
-</mujocoinclude>
--- a/alr_envs/alr/mujoco/gym_table_tennis/envs/assets/right_arm_actuator.xml
+++ b/alr_envs/alr/mujoco/gym_table_tennis/envs/assets/right_arm_actuator.xml
@ -1,19 +0,0 @@
-<mujocoinclude>
-    <actuator>
-
-        <!--        <position ctrlrange="-2.6 2.6" joint="wam/base_yaw_joint_right" kp="100.0" />-->
-        <!--        <position ctrlrange="-1.985 1.985" joint="wam/shoulder_pitch_joint_right" kp="162.0" />-->
-        <!--        <position ctrlrange="-2.8 2.8" joint="wam/shoulder_yaw_joint_right" kp="100.0" />-->
-        <!--        <position ctrlrange="-0.9 3.14159" joint="wam/elbow_pitch_joint_right" kp="122.0" />-->
-        <!--        <position ctrlrange="-4.55 1.25" joint="wam/wrist_yaw_joint_right" kp="100.0" />-->
-        <!--        <position ctrlrange="-1.5707 1.5707" joint="wam/wrist_pitch_joint_right" kp="102.0" />-->
-        <!--        <position ctrlrange="-3 3" joint="wam/palm_yaw_joint_right" kp="100.0" />-->
-        <position ctrlrange="-2.6 2.6" joint="wam/base_yaw_joint_right" kp="151.0"/>
-        <position ctrlrange="-1.985 1.985" joint="wam/shoulder_pitch_joint_right" kp="125.0"/>
-        <position ctrlrange="-2.8 2.8" joint="wam/shoulder_yaw_joint_right" kp="122.0"/>
-        <position ctrlrange="-0.9 3.14159" joint="wam/elbow_pitch_joint_right" kp="121.0"/>
-        <position ctrlrange="-4.55 1.25" joint="wam/wrist_yaw_joint_right" kp="99.0"/>
-        <position ctrlrange="-1.5707 1.5707" joint="wam/wrist_pitch_joint_right" kp="103.0"/>
-        <position ctrlrange="-3 3" joint="wam/palm_yaw_joint_right" kp="99.0"/>
-    </actuator>
-</mujocoinclude>
--- a/alr_envs/alr/mujoco/gym_table_tennis/envs/assets/shared.xml
+++ b/alr_envs/alr/mujoco/gym_table_tennis/envs/assets/shared.xml
@ -1,49 +0,0 @@
-<mujocoinclude>
-    <default>
-        <default class="wam">
-            <joint type="hinge" limited="true" pos="0 0 0" axis="0 0 1"/>
-        </default>
-        <default class="viz">
-            <geom type="mesh" contype="0" conaffinity="0" group="1" rgba="1 1 1 1"/>
-        </default>
-        <default class="col">
-            <geom type="mesh" contype="0" conaffinity="1" group="0" rgba="1 1 1 1"/>
-        </default>
-        <default class="contact_geom">
-            <geom condim="4" friction="0.1 0.1 0.1" margin="0" solimp="1 1 0" solref="0.1 0.03"/>
-            <!--            <geom condim="4" friction="0 0 0" margin="0" solimp="1 1 0" solref="0.01 1.1"/>-->
-
-        </default>
-    </default>
-    <asset>
-        <mesh file="base_link_fine.stl"/>
-        <mesh file="base_link_convex.stl"/>
-        <mesh file="shoulder_link_fine.stl"/>
-        <mesh file="shoulder_link_convex_decomposition_p1.stl"/>
-        <mesh file="shoulder_link_convex_decomposition_p2.stl"/>
-        <mesh file="shoulder_link_convex_decomposition_p3.stl"/>
-        <mesh file="shoulder_pitch_link_fine.stl"/>
-        <mesh file="shoulder_pitch_link_convex.stl"/>
-        <mesh file="upper_arm_link_fine.stl"/>
-        <mesh file="upper_arm_link_convex_decomposition_p1.stl"/>
-        <mesh file="upper_arm_link_convex_decomposition_p2.stl"/>
-        <mesh file="elbow_link_fine.stl"/>
-        <mesh file="elbow_link_convex.stl"/>
-        <mesh file="forearm_link_fine.stl"/>
-        <mesh file="forearm_link_convex_decomposition_p1.stl"/>
-        <mesh file="forearm_link_convex_decomposition_p2.stl"/>
-        <mesh file="wrist_yaw_link_fine.stl"/>
-        <mesh file="wrist_yaw_link_convex_decomposition_p1.stl"/>
-        <mesh file="wrist_yaw_link_convex_decomposition_p2.stl"/>
-        <mesh file="wrist_pitch_link_fine.stl"/>
-        <mesh file="wrist_pitch_link_convex_decomposition_p1.stl"/>
-        <mesh file="wrist_pitch_link_convex_decomposition_p2.stl"/>
-        <mesh file="wrist_pitch_link_convex_decomposition_p3.stl"/>
-        <mesh file="wrist_palm_link_fine.stl"/>
-        <mesh file="wrist_palm_link_convex.stl"/>
-
-        <texture builtin="checker" height="512" name="texplane" rgb1=".2 .3 .4" rgb2=".1 0.15 0.2" type="2d"
-                 width="512"/>
-        <material name="floor_plane" reflectance="0.5" texrepeat="1 1" texture="texplane" texuniform="true"/>
-    </asset>
-</mujocoinclude>
--- a/alr_envs/alr/mujoco/gym_table_tennis/envs/assets/table_tennis_env.xml
+++ b/alr_envs/alr/mujoco/gym_table_tennis/envs/assets/table_tennis_env.xml
@ -1,41 +0,0 @@
-<mujoco model="table_tennis(v0.1)">
-    <compiler angle="radian" coordinate="local" meshdir="meshes/" />
-
-    <option gravity="0 0 -9.81" timestep="0.002">
-        <flag warmstart="enable" />
-    </option>
-
-
-    <custom>
-        <numeric data="0 0 0 0 0 0 0" name="START_ANGLES" />
-    </custom>
-
-
-    <include file="shared.xml" />
-
-    <worldbody>
-        <light cutoff="60" diffuse="1 1 1" dir="-.1 -.2 -1.3" directional="true" exponent="1" pos=".1 .2 1.3" specular=".1 .1 .1" />
-        <geom conaffinity="1" contype="1" material="floor_plane" name="floor" pos="0 0 0" size="10 5 1" type="plane" />
-
-
-        <include file="include_table.xml" />
-
-
-
-        <include file="include_barrett_wam_7dof_right.xml" />
-
-        <include file="include_target_ball.xml" />
-
-
-
-
-
-    </worldbody>
-
-
-
-
-    <include file="right_arm_actuator.xml" />
-
-
-</mujoco>
--- a/alr_envs/alr/mujoco/gym_table_tennis/envs/table_tennis_env.py
+++ b/alr_envs/alr/mujoco/gym_table_tennis/envs/table_tennis_env.py
@ -1,244 +0,0 @@
-import numpy as np
-from gym import spaces
-from gym.envs.robotics import robot_env, utils
-# import xml.etree.ElementTree as ET
-from alr_envs.alr.mujoco.gym_table_tennis.utils.rewards.hierarchical_reward import HierarchicalRewardTableTennis
-import glfw
-from alr_envs.alr.mujoco.gym_table_tennis.utils.experiment import ball_initialize
-from pathlib import Path
-import os
-
-
-class TableTennisEnv(robot_env.RobotEnv):
-    """Class for Table Tennis environment.
-    """
-    def __init__(self, n_substeps=1,
-                 model_path=None,
-                 initial_qpos=None,
-                 initial_ball_state=None,
-                 config=None,
-                 reward_obj=None
-                 ):
-        """Initializes a new mujoco based Table Tennis environment.
-
-        Args:
-            model_path (string): path to the environments XML file
-            initial_qpos (dict): a dictionary of joint names and values that define the initial configuration
-            n_actions: Number of joints
-            n_substeps (int): number of substeps the simulation runs on every call to step
-            scale (double): limit maximum change in position
-            initial_ball_state: to reset the ball state
-        """
-        # self.config = config.config
-        if model_path is None:
-            path_cws = Path.cwd()
-            print(path_cws)
-            current_dir = Path(os.path.split(os.path.realpath(__file__))[0])
-            table_tennis_env_xml_path = current_dir / "assets"/"table_tennis_env.xml"
-            model_path = str(table_tennis_env_xml_path)
-        self.config = config
-        action_space = True  # self.config['trajectory']['args']['action_space']
-        time_step = 0.002  # self.config['mujoco_sim_env']['args']["time_step"]
-        if initial_qpos is None:
-            initial_qpos = {"wam/base_yaw_joint_right": 1.5,
-                            "wam/shoulder_pitch_joint_right": 1,
-                            "wam/shoulder_yaw_joint_right": 0,
-                            "wam/elbow_pitch_joint_right": 1,
-                            "wam/wrist_yaw_joint_right": 1,
-                            "wam/wrist_pitch_joint_right": 0,
-                            "wam/palm_yaw_joint_right": 0}
-            # initial_qpos = [1.5, 1, 0, 1, 1, 0, 0]  # self.config['robot_config']['args']['initial_qpos']
-
-        # TODO should read all configuration in config
-        assert initial_qpos is not None, "Must initialize the initial q position of robot arm"
-        n_actions = 7
-        self.initial_qpos_value = np.array(list(initial_qpos.values())).copy()
-        # self.initial_qpos_value = np.array(initial_qpos)
-        # # change time step in .xml file
-        # tree = ET.parse(model_path)
-        # root = tree.getroot()
-        # for option in root.findall('option'):
-        #     option.set("timestep", str(time_step))
-        #
-        # tree.write(model_path)
-
-        super(TableTennisEnv, self).__init__(
-            model_path=model_path, n_substeps=n_substeps, n_actions=n_actions,
-            initial_qpos=initial_qpos)
-
-        if action_space:
-            self.action_space = spaces.Box(low=np.array([-2.6, -2.0, -2.8, -0.9, -4.8, -1.6, -2.2]),
-                                           high=np.array([2.6, 2.0, 2.8, 3.1, 1.3, 1.6, 2.2]),
-                                           dtype='float64')
-        else:
-            self.action_space = spaces.Box(low=np.array([-np.inf] * 7),
-                                           high=np.array([np.inf] * 7),
-                                           dtype='float64')
-        self.scale = None
-        self.desired_pos = None
-        self.n_actions = n_actions
-        self.action = None
-        self.time_step = time_step
-        self._dt = time_step
-        self.paddle_center_pos = self.sim.data.get_site_xpos('wam/paddle_center')
-        if reward_obj is None:
-            self.reward_obj = HierarchicalRewardTableTennis()
-        else:
-            self.reward_obj = reward_obj
-
-        if initial_ball_state is not None:
-            self.initial_ball_state = initial_ball_state
-        else:
-            self.initial_ball_state = ball_initialize(random=False)
-        self.target_ball_pos = self.sim.data.get_site_xpos("target_ball")
-        self.racket_center_pos = self.sim.data.get_site_xpos("wam/paddle_center")
-
-    def close(self):
-        if self.viewer is not None:
-            glfw.destroy_window(self.viewer.window)
-            # self.viewer.window.close()
-            self.viewer = None
-            self._viewers = {}
-
-    # GoalEnv methods
-    # ----------------------------
-    def compute_reward(self, achieved_goal, goal, info):
-        # reset the reward, if action valid
-        # right_court_contact_obj = ["target_ball", "table_tennis_table_right_side"]
-        # right_court_contact_detector = self.reward_obj.contact_detection(self, right_court_contact_obj)
-        # if right_court_contact_detector:
-        #     print("can detect the table ball contact")
-        self.reward_obj.total_reward = 0
-        # Stage 1 Hitting
-        self.reward_obj.hitting(self)
-        # if the ball was not hit, return the reward accumulated so far
-        if not self.reward_obj.goal_achievement:
-            # return self.reward_obj.highest_reward
-            return self.reward_obj.total_reward
-        # # Stage 2 Right Table Contact
-        # self.reward_obj.right_table_contact(self)
-        # if not self.reward_obj.goal_achievement:
-        #     return self.reward_obj.highest_reward
-        # # Stage 2 Net Contact
-        # self.reward_obj.net_contact(self)
-        # if not self.reward_obj.goal_achievement:
-        #     return self.reward_obj.highest_reward
-        # Stage 3 Opponent court Contact
-        # self.reward_obj.landing_on_opponent_court(self)
-        # if not self.reward_obj.goal_achievement:
-        # print("self.reward_obj.highest_reward: ", self.reward_obj.highest_reward)
-        # TODO
-        self.reward_obj.target_achievement(self)
-        # return self.reward_obj.highest_reward
-        return self.reward_obj.total_reward
-
-    def _reset_sim(self):
-        self.sim.set_state(self.initial_state)
-        [initial_x, initial_y, initial_z, v_x, v_y, v_z] = self.initial_ball_state
-        self.sim.data.set_joint_qpos('tar:x', initial_x)
-        self.sim.data.set_joint_qpos('tar:y', initial_y)
-        self.sim.data.set_joint_qpos('tar:z', initial_z)
-        self.energy_corrected = True
-        self.give_reflection_reward = False
-
-        # velocity is positive direction
-        self.sim.data.set_joint_qvel('tar:x', v_x)
-        self.sim.data.set_joint_qvel('tar:y', v_y)
-        self.sim.data.set_joint_qvel('tar:z', v_z)
-
-        # Apply gravity compensation
-        if not np.array_equal(self.sim.data.qfrc_applied[:7], self.sim.data.qfrc_bias[:7]):
-            self.sim.data.qfrc_applied[:7] = self.sim.data.qfrc_bias[:7]
-        self.sim.forward()
-        return True
-
-    def _env_setup(self, initial_qpos):
-        for name, value in initial_qpos.items():
-            self.sim.data.set_joint_qpos(name, value)
-
-        # Apply gravity compensation
-        if not np.array_equal(self.sim.data.qfrc_applied[:7], self.sim.data.qfrc_bias[:7]):
-            self.sim.data.qfrc_applied[:7] = self.sim.data.qfrc_bias[:7]
-        self.sim.forward()
-
-        # Get the target position
-        self.initial_paddle_center_xpos = self.sim.data.get_site_xpos('wam/paddle_center').copy()
-        self.initial_paddle_center_vel = None  # self.sim.get_site_
-
-    def _sample_goal(self):
-        goal = self.initial_paddle_center_xpos[:3] + self.np_random.uniform(-0.2, 0.2, size=3)
-        return goal.copy()
-
-    def _get_obs(self):
-
-        # positions of racket center
-        paddle_center_pos = self.sim.data.get_site_xpos('wam/paddle_center')
-        ball_pos = self.sim.data.get_site_xpos("target_ball")
-
-        dt = self.sim.nsubsteps * self.sim.model.opt.timestep
-        paddle_center_velp = self.sim.data.get_site_xvelp('wam/paddle_center') * dt
-        robot_qpos, robot_qvel = utils.robot_get_obs(self.sim)
-
-        wrist_state = robot_qpos[-3:]
-        wrist_vel = robot_qvel[-3:] * dt  # change to a scalar if the gripper is made symmetric
-
-        # achieved_goal = paddle_body_EE_pos
-        obs = np.concatenate([
-            paddle_center_pos, paddle_center_velp, wrist_state, wrist_vel
-        ])
-
-        out_dict = {
-            'observation': obs.copy(),
-            'achieved_goal': paddle_center_pos.copy(),
-            'desired_goal': self.goal.copy(),
-            'q_pos': self.sim.data.qpos[:].copy(),
-            "ball_pos": ball_pos.copy(),
-            # "hitting_flag": self.reward_obj.hitting_flag
-        }
-
-        return out_dict
-
-    def _step_callback(self):
-        pass
-
-    def _set_action(self, action):
-        # Apply gravity compensation
-        if not np.array_equal(self.sim.data.qfrc_applied[:7], self.sim.data.qfrc_bias[:7]):
-            self.sim.data.qfrc_applied[:7] = self.sim.data.qfrc_bias[:7]
-        # print("set action process running")
-        assert action.shape == (self.n_actions,)
-        self.action = action.copy()  # ensure that we don't change the action outside of this scope
-        pos_ctrl = self.action[:]  # limit maximum change in position
-        pos_ctrl = np.clip(pos_ctrl, self.action_space.low, self.action_space.high)
-
-        # get desired trajectory
-        self.sim.data.qpos[:7] = pos_ctrl
-        self.sim.forward()
-        self.desired_pos = self.sim.data.get_site_xpos('wam/paddle_center').copy()
-
-        self.sim.data.ctrl[:] = pos_ctrl
-
-    def _is_success(self, achieved_goal, desired_goal):
-        pass
-
-
-if __name__ == '__main__':
-    render_mode = "human"  # "human" or "partial" or "final"
-    env = TableTennisEnv()
-    env.reset()
-    # env.render(mode=render_mode)
-
-    for i in range(500):
-        # objective.load_result("/tmp/cma")
-        # test with random actions
-        ac = env.action_space.sample()
-        # ac[0] += np.pi/2
-        obs, rew, d, info = env.step(ac)
-        env.render(mode=render_mode)
-
-        print(rew)
-
-        if d:
-            break
-
-    env.close()
--- a/alr_envs/alr/mujoco/gym_table_tennis/utils/experiment.py
+++ b/alr_envs/alr/mujoco/gym_table_tennis/utils/experiment.py
@ -1,83 +0,0 @@
-import numpy as np
-from gym.utils import seeding
-from alr_envs.alr.mujoco.gym_table_tennis.utils.util import read_yaml, read_json
-from pathlib import Path
-
-
-def ball_initialize(random=False, scale=False, context_range=None, scale_value=None):
-    if random:
-        if scale:
-            # if scale_value is None:
-            scale_value = context_scale_initialize(context_range)
-            v_x, v_y, v_z = np.array([2.5, 2, 0.5]) * scale_value
-            dx = 1
-            dy = 0
-            dz = 0.05
-        else:
-            seed = None
-            np_random, seed = seeding.np_random(seed)
-            dx = np_random.uniform(-0.1, 0.1)
-            dy = np_random.uniform(-0.1, 0.1)
-            dz = np_random.uniform(-0.1, 0.1)
-
-            v_x = np_random.uniform(1.7, 1.8)
-            v_y = np_random.uniform(0.7, 0.8)
-            v_z = np_random.uniform(0.1, 0.2)
-        # print(dx, dy, dz, v_x, v_y, v_z)
-    # else:
-    #     dx = -0.1
-    #     dy = 0.05
-    #     dz = 0.05
-    #     v_x = 1.5
-    #     v_y = 0.7
-    #     v_z = 0.06
-    # initial_x = -0.6 + dx
-    # initial_y = -0.3 + dy
-    # initial_z = 0.8 + dz
-    else:
-        if scale:
-            v_x, v_y, v_z = np.array([2.5, 2, 0.5]) * scale_value
-        else:
-            v_x = 2.5
-            v_y = 2
-            v_z = 0.5
-        dx = 1
-        dy = 0
-        dz = 0.05
-
-    initial_x = 0 + dx
-    initial_y = -0.2 + dy
-    initial_z = 0.3 + dz
-    # print("initial ball state: ", initial_x, initial_y, initial_z, v_x, v_y, v_z)
-    initial_ball_state = np.array([initial_x, initial_y, initial_z, v_x, v_y, v_z])
-    return initial_ball_state
-
-
-def context_scale_initialize(range):
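-    """Sample a context scale uniformly from the given range.
-
-    Args:
-        range: pair (low, high) bounding the sampled scale.
-
-    Returns:
-        np.ndarray of shape (1,) holding the sampled scale.
-    """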
-    low, high = range
-    scale = np.random.uniform(low, high, 1)
-    return scale
-
-
-def config_handle_generation(config_file_path):
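-    """Generate config handle for multiprocessing
-
-    Args:
-        config_file_path: path to a ".json" or ".yml" configuration file.
-
-    Returns:
-        The parsed configuration.
-
-    Raises:
-        ValueError: if the file suffix is neither ".json" nor ".yml".
-    """
-    cfg_fname = Path(config_file_path)
-    # .json and .yml file
-    if cfg_fname.suffix == ".json":
-        config = read_json(cfg_fname)
-    elif cfg_fname.suffix == ".yml":
-        config = read_yaml(cfg_fname)
-    else:
-        raise ValueError("Unsupported config file suffix: {}".format(cfg_fname.suffix))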
-
-    return config
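The scaled branches of `ball_initialize` above multiply a base velocity `[2.5, 2, 0.5]` by a context scale. As a minimal sketch (the names `BASE_VELOCITY` and `scaled_ball_velocity` are hypothetical and not part of the removed module), the scaling can be written so that it accepts either a plain float or the shape-`(1,)` array returned by `context_scale_initialize`. A plain Python list multiplied by a float raises `TypeError`, which is why the sketch converts to a NumPy array first.

```python
import numpy as np

# Base ball velocity (v_x, v_y, v_z) as used in the scaled branches above.
BASE_VELOCITY = np.array([2.5, 2.0, 0.5])


def scaled_ball_velocity(scale_value):
    """Scale the base velocity by a float or a one-element array (hypothetical helper)."""
    # np.squeeze turns a shape-(1,) array into a 0-d array that float() accepts.
    scale = float(np.squeeze(scale_value))
    return BASE_VELOCITY * scale


v_x, v_y, v_z = scaled_ball_velocity(np.array([2.0]))
```

Converting the scale to a scalar up front keeps the three unpacked components plain NumPy floats regardless of how the scale was produced.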
--- a/alr_envs/alr/mujoco/gym_table_tennis/utils/rewards/hierarchical_reward.py
+++ b/alr_envs/alr/mujoco/gym_table_tennis/utils/rewards/hierarchical_reward.py
@ -1,402 +0,0 @@
-import numpy as np
-import logging
-
-
-class HierarchicalRewardTableTennis(object):
-    """Class for hierarchical reward function for table tennis experiment.
-
-    Return Highest Reward.
-    Reward = 0
-    Step 1: Action Valid. Upper Bound 0
-                [-∞, 0]
-                Reward += -1 * |hit_duration - hit_duration_threshold| * |hit_duration < hit_duration_threshold| * 10
-    Step 2: Hitting. Upper Bound 2
-                if hitting:
-                    [0, 2]
-                    Reward = 2 * (1 - tanh(|shortest_hitting_dist|))
-                if not hitting:
-                    [0, 0.2]
-                    Reward = 0.2 * (1 - tanh(|shortest_hitting_dist|))
-    Step 3: Target Point Achievement. Upper Bound 6
-                [0, 4]
-                if table_contact_detector:
-                    Reward += 1
-                    Reward += (1 - tanh(|dist_target|)) * 2
-                    if contact_coordinate[0] < 0:
-                        Reward += 1
-                    else:
-                        Reward += 0
-                elif floor_contact_detector:
-                    Reward += (1 - tanh(|dist_target_virtual|))
-    """
-
-    def __init__(self):
-        self.reward = None
-        self.goal_achievement = False
-        self.total_reward = 0
-        self.shortest_hitting_dist = 1000
-        self.highest_reward = -1000
-        self.lowest_corner_dist = 100
-        self.right_court_contact_detector = False
-        self.table_contact_detector = False
-        self.floor_contact_detector = False
-        self.radius = 0.025
-        self.min_ball_x_pos = 100
-        self.hit_contact_detector = False
-        self.net_contact_detector = False
-        self.ratio = 1
-        self.lowest_z = 100
-        self.target_flag = False
-        self.dist_target_virtual = 100
-        self.ball_z_pos_lowest = 100
-        self.hitting_flag = False
-        self.hitting_time_point = None
-        self.ctxt_dim = None
-        self.context_range_bounds = None
-        # self.ctxt_out_of_range_punishment = None
-        # self.ctxt_in_side_of_range_punishment = None
-    #
-    # def check_where_invalid(self, ctxt, context_range_bounds, set_to_valid_region=False):
-    #     idx_max = []
-    #     idx_min = []
-    #     for dim in range(self.ctxt_dim):
-    #         min_dim = context_range_bounds[0][dim]
-    #         max_dim = context_range_bounds[1][dim]
-    #         idx_max_c = np.where(ctxt[:, dim] > max_dim)[0]
-    #         idx_min_c = np.where(ctxt[:, dim] < min_dim)[0]
-    #         if set_to_valid_region:
-    #             if idx_max_c.shape[0] != 0:
-    #                 ctxt[idx_max_c, dim] = max_dim
-    #             if idx_min_c.shape[0] != 0:
-    #                 ctxt[idx_min_c, dim] = min_dim
-    #         idx_max.append(idx_max_c)
-    #         idx_min.append(idx_min_c)
-    #     return idx_max, idx_min, ctxt
-
-    def check_valid(self, scale, context_range_bounds):
-
-        min_dim = context_range_bounds[0][0]
-        max_dim = context_range_bounds[1][0]
-        valid = (scale < max_dim) and (scale > min_dim)
-        return valid
-
-    @classmethod
-    def goal_distance(cls, goal_a, goal_b):
-        assert goal_a.shape == goal_b.shape
-        return np.linalg.norm(goal_a - goal_b, axis=-1)
-
-    def refresh_highest_reward(self):
-        if self.total_reward >= self.highest_reward:
-            self.highest_reward = self.total_reward
-
-    def duration_valid(self):
-        pass
-
-    def huge_value_unstable(self):
-        self.total_reward += -10
-        self.highest_reward = -1
-
-    def context_valid(self, context):
-        valid = self.check_valid(context.copy(), context_range_bounds=self.context_range_bounds)
-        # when using dirac punishments
-        if valid:
-            self.total_reward += 1 # If Action Valid and Context Valid, total_reward = 0
-        else:
-            self.total_reward += 0
-        self.refresh_highest_reward()
-
-
-
-        # If in the ctxt, add 1, otherwise, 0
-
-    def action_valid(self, durations=None):
-        """Ensure the execution of the robot movement with parameters which are in a valid domain.
-
-        Time should always be positive,
-        the joint position of the robot should be a subset of [−π, π].
-        if all parameters are valid, the robot gets a zero score,
-        otherwise it gets a negative score proportional to how much it is beyond the valid parameter domain.
-
-        Returns:
-            rewards: if valid, reward is equal to 0.
-            if not valid, reward is negative and proportional to the distance beyond the valid parameter domain
-        """
-        assert durations.shape[0] == 2, "durations type should be np.array and the shape should be 2"
-        # pre_duration = durations[0]
-        hit_duration = durations[1]
-        # pre_duration_thres = 0.01
-        hit_duration_thres = 1
-        # self.goal_achievement = np.all(
-        #     [(pre_duration > pre_duration_thres), (hit_duration > hit_duration_thres), (0.3 < pre_duration < 0.6)])
-        self.goal_achievement = (hit_duration > hit_duration_thres)
-        if self.goal_achievement:
-            self.total_reward = -1
-            self.goal_achievement = True
-        else:
-            # self.total_reward += -1 * ((np.abs(pre_duration - pre_duration_thres) * int(
-            #     pre_duration < pre_duration_thres) + np.abs(hit_duration - hit_duration_thres) * int(
-            #     hit_duration < hit_duration_thres)) * 10)
-            self.total_reward = -1 * ((np.abs(hit_duration - hit_duration_thres) * int(
-                hit_duration < hit_duration_thres)) * 10)
-            self.total_reward += -1
-            self.goal_achievement = False
-        self.refresh_highest_reward()
-
-    def motion_penalty(self, action, high_motion_penalty):
-        """Protects the robot from high acceleration and dangerous movement.
-        """
-        if not high_motion_penalty:
-            reward_ctrl = - 0.05 * np.square(action).sum()
-        else:
-            reward_ctrl = - 0.075 * np.square(action).sum()
-        self.total_reward += reward_ctrl
-        self.refresh_highest_reward()
-        self.goal_achievement = True
-
-    def hitting(self, env):  # , target_ball_pos, racket_center_pos, hit_contact_detector=False
-        """Hitting reward calculation
-
-        If the racket has hit the ball, the reward is 2 * (1 - tanh(shortest_dist)), which lies in [0, 2].
-        Otherwise the same distance term between the center of the racket and the center of the ball
-        is scaled by 0.2, so the reward lies in [0, 0.2].
-
-
-        Args:
-            env:
-
-        Returns:
-
-        """
-
-        hit_contact_obj = ["target_ball", "bat"]
-        target_ball_pos = env.target_ball_pos
-        racket_center_pos = env.racket_center_pos
-        # hit contact detection
-        # Record the hitting history
-        self.hitting_flag = False
-        if not self.hit_contact_detector:
-            self.hit_contact_detector = self.contact_detection(env, hit_contact_obj)
-            if self.hit_contact_detector:
-                print("First time detect hitting")
-                self.hitting_flag = True
-        if self.hit_contact_detector:
-
-            # TODO
-            dist = self.goal_distance(target_ball_pos, racket_center_pos)
-            if dist < 0:
-                dist = 0
-            # print("goal distance is:", dist)
-            if dist <= self.shortest_hitting_dist:
-                self.shortest_hitting_dist = dist
-            # print("shortest_hitting_dist is:", self.shortest_hitting_dist)
-            # Keep the shortest hitting distance.
-            dist_reward = 2 * (1 - np.tanh(np.abs(self.shortest_hitting_dist)))
-
-            # TODO sparse
-            # dist_reward = 2
-
-            self.total_reward += dist_reward
-            self.goal_achievement = True
-
-            # if self.hitting_time_point is not None and self.hitting_time_point > 600:
-            #     self.total_reward += 1
-
-        else:
-            dist = self.goal_distance(target_ball_pos, racket_center_pos)
-            if dist <= self.shortest_hitting_dist:
-                self.shortest_hitting_dist = dist
-            dist_reward = 1 - np.tanh(self.shortest_hitting_dist)
-            reward = 0.2 * dist_reward  # because it does not hit the ball, so multiply 0.2
-            self.total_reward += reward
-            self.goal_achievement = False
-
-        self.refresh_highest_reward()
-
-    @classmethod
-    def relu(cls, x):
-        return np.maximum(0, x)
-
-    # def right_table_contact(self, env):
-    #     right_court_contact_obj = ["target_ball", "table_tennis_table_right_side"]
-    #     if env.target_ball_pos[0] >= 0 and env.target_ball_pos[2] >= 0.7:
-    #         # update right court contact detection
-    #         if not self.right_court_contact_detector:
-    #             self.right_court_contact_detector = self.contact_detection(env, right_court_contact_obj)
-    #             if self.right_court_contact_detector:
-    #                 self.contact_x_pos = env.target_ball_pos[0]
-    #         if self.right_court_contact_detector:
-    #             self.total_reward += 1 - norm(0.685, 1).pdf(self.contact_x_pos)  # x axis middle of right table
-    #             self.goal_achievement = False
-    #         else:
-    #             self.total_reward += 1
-    #             self.goal_achievement = True
-    #     # else:
-    #     #     self.total_reward += 0
-    #     #     self.goal_achievement = False
-    #     self.refresh_highest_reward()
-
-    # def net_contact(self, env):
-    #     net_contact_obj = ["target_ball", "table_tennis_net"]
-    #     # net_contact_detector = self.contact_detection(env, net_contact_obj)
-    #     # ball_x_pos = env.target_ball_pos[0]
-    #     # if self.min_ball_x_pos >= ball_x_pos:
-    #     #     self.min_ball_x_pos = ball_x_pos
-    #     # table_left_edge_x_pos = -1.37
-    #     # if np.abs(ball_x_pos) <= 0.01:  # x threshold of net
-    #     #     if self.lowest_z >= env.target_ball_pos[2]:
-    #     #         self.lowest_z = env.target_ball_pos[2]
-    #     #     # construct a gaussian distribution of z
-    #     #     z_reward = 4 - norm(0, 0.1).pdf(self.lowest_z - 0.07625)  # maximum 4
-    #     #     self.total_reward += z_reward
-    #     # self.total_reward += 2 - np.minimum(1, self.relu(np.abs(self.min_ball_x_pos)))
-    #     if not self.net_contact_detector:
-    #         self.net_contact_detector = self.contact_detection(env, net_contact_obj)
-    #     if self.net_contact_detector:
-    #         self.total_reward += 0  # very high cost
-    #         self.goal_achievement = False
-    #     else:
-    #         self.total_reward += 1
-    #         self.goal_achievement = True
-    #     self.refresh_highest_reward()
-
-    # def landing_on_opponent_court(self, env):
-    #     # Very sparse reward
-    #     # don't contact the right side court
-    #     # right_court_contact_obj = ["target_ball", "table_tennis_table_right_side"]
-    #     # right_court_contact_detector = self.contact_detection(env, right_court_contact_obj)
-    #     left_court_contact_obj = ["target_ball", "table_tennis_table_left_side"]
-    #     # left_court_contact_detector = self.contact_detection(env, left_court_contact_obj)
-    #     # record the contact history
-    #     # if not self.right_court_contact_detector:
-    #     #     self.right_court_contact_detector = self.contact_detection(env, right_court_contact_obj)
-    #     if not self.table_contact_detector:
-    #         self.table_contact_detector = self.contact_detection(env, left_court_contact_obj)
-    #
-    #     dist_left_up_corner = self.goal_distance(env.target_ball_pos, env.sim.data.get_site_xpos("left_up_corner"))
-    #     dist_middle_up_corner = self.goal_distance(env.target_ball_pos, env.sim.data.get_site_xpos("middle_up_corner"))
-    #     dist_left_down_corner = self.goal_distance(env.target_ball_pos, env.sim.data.get_site_xpos("left_down_corner"))
-    #     dist_middle_down_corner = self.goal_distance(env.target_ball_pos,
-    #                                                  env.sim.data.get_site_xpos("middle_down_corner"))
-    #     dist_array = np.array(
-    #         [dist_left_up_corner, dist_middle_up_corner, dist_left_down_corner, dist_middle_down_corner])
-    #     dist_corner = np.amin(dist_array)
-    #     if self.lowest_corner_dist >= dist_corner:
-    #         self.lowest_corner_dist = dist_corner
-    #
-    #     right_contact_cost = 1
-    #     left_contact_reward = 2
-    #     dist_left_table_reward = (2 - np.tanh(self.lowest_corner_dist))
-    #     # TODO Try multi dimensional gaussian distribution
-    #     # contact only the left side court
-    #     if self.right_court_contact_detector:
-    #         self.total_reward += 0
-    #         self.goal_achievement = False
-    #         if self.table_contact_detector:
-    #             self.total_reward += left_contact_reward
-    #             self.goal_achievement = False
-    #         else:
-    #             self.total_reward += dist_left_table_reward
-    #             self.goal_achievement = False
-    #     else:
-    #         self.total_reward += right_contact_cost
-    #         if self.table_contact_detector:
-    #             self.total_reward += left_contact_reward
-    #             self.goal_achievement = True
-    #         else:
-    #             self.total_reward += dist_left_table_reward
-    #             self.goal_achievement = False
-    #     self.refresh_highest_reward()
-    #     # if self.left_court_contact_detector and not self.right_court_contact_detector:
-    #     #     self.total_reward += self.ratio * left_contact_reward
-    #     #     print("only left court reward return!!!!!!!!!")
-    #     #     print("contact only left court!!!!!!")
-    #     #     self.goal_achievement = True
-    #     # # no contact with table
-    #     # elif not self.right_court_contact_detector and not self.left_court_contact_detector:
-    #     #     self.total_reward += 0 + self.ratio * dist_left_table_reward
-    #     #     self.goal_achievement = False
-    #     # # contact both side
-    #     # elif self.right_court_contact_detector and self.left_court_contact_detector:
-    #     #     self.total_reward += self.ratio * (left_contact_reward - right_contact_cost)  # cost of contact of right court
-    #     #     self.goal_achievement = False
-    #     # # contact only the right side court
-    #     # elif self.right_court_contact_detector and not self.left_court_contact_detector:
-    #     #     self.total_reward += 0 + self.ratio * (
-    #     #                 dist_left_table_reward - right_contact_cost)  # cost of contact of right court
-    #     #     self.goal_achievement = False
-
-    def target_achievement(self, env):
-        target_coordinate = np.array([-0.5, -0.5])
-        # net_contact_obj = ["target_ball", "table_tennis_net"]
-        table_contact_obj = ["target_ball", "table_tennis_table"]
-        floor_contact_obj = ["target_ball", "floor"]
-
-        if 0.78 < env.target_ball_pos[2] < 0.8:
-            dist_target_virtual = np.linalg.norm(env.target_ball_pos[:2] - target_coordinate)
-            if self.dist_target_virtual > dist_target_virtual:
-                self.dist_target_virtual = dist_target_virtual
-        if -0.07 < env.target_ball_pos[0] < 0.07 and env.sim.data.get_joint_qvel('tar:x') < 0:
-            if self.ball_z_pos_lowest > env.target_ball_pos[2]:
-                self.ball_z_pos_lowest = env.target_ball_pos[2].copy()
-        # if not self.net_contact_detector:
-        #     self.net_contact_detector = self.contact_detection(env, net_contact_obj)
-        if not self.table_contact_detector:
-            self.table_contact_detector = self.contact_detection(env, table_contact_obj)
-        if not self.floor_contact_detector:
-            self.floor_contact_detector = self.contact_detection(env, floor_contact_obj)
-        if not self.target_flag:
-            # Table Contact Reward.
-            if self.table_contact_detector:
-                self.total_reward += 1
-                # only update when the first contact because of the flag
-                contact_coordinate = env.target_ball_pos[:2].copy()
-                print("contact table ball coordinate: ", env.target_ball_pos)
-                logging.info("contact table ball coordinate: {}".format(env.target_ball_pos))
-                dist_target = np.linalg.norm(contact_coordinate - target_coordinate)
-                self.total_reward += (1 - np.tanh(dist_target)) * 2
-                self.target_flag = True
-                # Net Contact Reward. Precondition: Table Contact exists.
-                if contact_coordinate[0] < 0:
-                    print("left table contact")
-                    logging.info("~~~~~~~~~~~~~~~left table contact~~~~~~~~~~~~~~~")
-                    self.total_reward += 1
-                    # TODO Z coordinate reward
-                    # self.total_reward += np.maximum(np.tanh(self.ball_z_pos_lowest), 0)
-                    self.goal_achievement = True
-                else:
-                    print("right table contact")
-                    logging.info("~~~~~~~~~~~~~~~right table contact~~~~~~~~~~~~~~~")
-                    self.total_reward += 0
-                    self.goal_achievement = False
-                # if self.net_contact_detector:
-                #     self.total_reward += 0
-                #     self.goal_achievement = False
-                # else:
-                #     self.total_reward += 1
-                #     self.goal_achievement = False
-            # Floor Contact Reward. Reached only when no table contact has been detected.
-            elif self.floor_contact_detector:
-                self.total_reward += (1 - np.tanh(self.dist_target_virtual))
-                self.target_flag = True
-                self.goal_achievement = False
-            # No Contact of Floor or Table, flying
-            else:
-                pass
-        # else:
-        # print("Flag is True already")
-        self.refresh_highest_reward()
-
-    def distance_to_target(self):
-        pass
-
-    @classmethod
-    def contact_detection(cls, env, goal_contact):
-        for i in range(env.sim.data.ncon):
-            contact = env.sim.data.contact[i]
-            achieved_geom1_name = env.sim.model.geom_id2name(contact.geom1)
-            achieved_geom2_name = env.sim.model.geom_id2name(contact.geom2)
-            if (achieved_geom1_name in goal_contact) and (achieved_geom2_name in goal_contact):
-                print("contact of " + achieved_geom1_name + " " + achieved_geom2_name)
-                return True
-        # Checked every active contact without finding the requested pair.
-        return False
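The hitting term of the hierarchical reward above depends only on the shortest racket-to-ball distance and on whether a hit was detected. A minimal standalone sketch of that term follows; the function name `hitting_reward` is hypothetical, and the sketch leaves out the state tracking (`shortest_hitting_dist`, `total_reward`) that the removed class performs.

```python
import numpy as np


def hitting_reward(shortest_dist: float, hit: bool) -> float:
    """Distance-shaped hitting term (hypothetical standalone helper)."""
    # 1 - tanh(d) equals 1 at zero distance and decays towards 0 as d grows.
    shaped = 1.0 - np.tanh(abs(shortest_dist))
    # A detected hit scales the term by 2; a miss scales it by 0.2.
    return float(2.0 * shaped if hit else 0.2 * shaped)


reward_on_hit = hitting_reward(0.05, hit=True)
reward_on_miss = hitting_reward(0.05, hit=False)
```

Because a miss is capped at 0.2, getting closer to the ball without touching it never outweighs a hit at close range, which keeps the stages of the reward ordered.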
--- a/alr_envs/alr/mujoco/gym_table_tennis/utils/rewards/rewards.py
+++ b/alr_envs/alr/mujoco/gym_table_tennis/utils/rewards/rewards.py
@ -1,136 +0,0 @@
-# Copyright 2017 The dm_control Authors.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#    http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or  implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-# ============================================================================
-
-# """Soft indicator function evaluating whether a number is within bounds."""
-#
-# from __future__ import absolute_import
-# from __future__ import division
-# from __future__ import print_function
-
-# Internal dependencies.
-import numpy as np
-
-# The value returned by tolerance() at `margin` distance from `bounds` interval.
-_DEFAULT_VALUE_AT_MARGIN = 0.1
-
-
-def _sigmoids(x, value_at_1, sigmoid):
-  """Returns 1 when `x` == 0, between 0 and 1 otherwise.
-
-  Args:
-    x: A scalar or numpy array.
-    value_at_1: A float between 0 and 1 specifying the output when `x` == 1.
-    sigmoid: String, choice of sigmoid type.
-
-  Returns:
-    A numpy array with values between 0.0 and 1.0.
-
-  Raises:
-    ValueError: If not 0 < `value_at_1` < 1, except for `linear`, `cosine` and
-      `quadratic` sigmoids which allow `value_at_1` == 0.
-    ValueError: If `sigmoid` is of an unknown type.
-  """
-  if sigmoid in ('cosine', 'linear', 'quadratic'):
-    if not 0 <= value_at_1 < 1:
-      raise ValueError('`value_at_1` must be nonnegative and smaller than 1, '
-                       'got {}.'.format(value_at_1))
-  else:
-    if not 0 < value_at_1 < 1:
-      raise ValueError('`value_at_1` must be strictly between 0 and 1, '
-                       'got {}.'.format(value_at_1))
-
-  if sigmoid == 'gaussian':
-    scale = np.sqrt(-2 * np.log(value_at_1))
-    return np.exp(-0.5 * (x*scale)**2)
-
-  elif sigmoid == 'hyperbolic':
-    scale = np.arccosh(1/value_at_1)
-    return 1 / np.cosh(x*scale)
-
-  elif sigmoid == 'long_tail':
-    scale = np.sqrt(1/value_at_1 - 1)
-    return 1 / ((x*scale)**2 + 1)
-
-  elif sigmoid == 'cosine':
-    scale = np.arccos(2*value_at_1 - 1) / np.pi
-    scaled_x = x*scale
-    return np.where(abs(scaled_x) < 1, (1 + np.cos(np.pi*scaled_x))/2, 0.0)
-
-  elif sigmoid == 'linear':
-    scale = 1-value_at_1
-    scaled_x = x*scale
-    return np.where(abs(scaled_x) < 1, 1 - scaled_x, 0.0)
-
-  elif sigmoid == 'quadratic':
-    scale = np.sqrt(1-value_at_1)
-    scaled_x = x*scale
-    return np.where(abs(scaled_x) < 1, 1 - scaled_x**2, 0.0)
-
-  elif sigmoid == 'tanh_squared':
-    scale = np.arctanh(np.sqrt(1-value_at_1))
-    return 1 - np.tanh(x*scale)**2
-
-  else:
-    raise ValueError('Unknown sigmoid type {!r}.'.format(sigmoid))
-
-
-def tolerance(x, bounds=(0.0, 0.0), margin=0.0, sigmoid='gaussian',
-              value_at_margin=_DEFAULT_VALUE_AT_MARGIN):
-  """Returns 1 when `x` falls inside the bounds, between 0 and 1 otherwise.
-
-  Args:
-    x: A scalar or numpy array.
-    bounds: A tuple of floats specifying inclusive `(lower, upper)` bounds for
-      the target interval. These can be infinite if the interval is unbounded
-      at one or both ends, or they can be equal to one another if the target
-      value is exact.
-    margin: Float. Parameter that controls how steeply the output decreases as
-      `x` moves out-of-bounds.
-      * If `margin == 0` then the output will be 0 for all values of `x`
-        outside of `bounds`.
-      * If `margin > 0` then the output will decrease sigmoidally with
-        increasing distance from the nearest bound.
-    sigmoid: String, choice of sigmoid type. Valid values are: 'gaussian',
-       'linear', 'hyperbolic', 'long_tail', 'cosine', 'quadratic', 'tanh_squared'.
-    value_at_margin: A float between 0 and 1 specifying the output value when
-      the distance from `x` to the nearest bound is equal to `margin`. Ignored
-      if `margin == 0`.
-
-  Returns:
-    A float or numpy array with values between 0.0 and 1.0.
-
-  Raises:
-    ValueError: If `bounds[0] > bounds[1]`.
-    ValueError: If `margin` is negative.
-  """
-  lower, upper = bounds
-  if lower > upper:
-    raise ValueError('Lower bound must be <= upper bound.')
-  if margin < 0:
-    raise ValueError('`margin` must be non-negative.')
-
-  in_bounds = np.logical_and(lower <= x, x <= upper)
-  if margin == 0:
-    value = np.where(in_bounds, 1.0, 0.0)
-  else:
-    d = np.where(x < lower, lower - x, x - upper) / margin
-    value = np.where(in_bounds, 1.0, _sigmoids(d, value_at_margin, sigmoid))
-
-  return float(value) if np.isscalar(x) else value
-
-
-
-
-
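To illustrate how `margin` and `value_at_margin` interact in `tolerance` above, here is a reduced sketch of the Gaussian case only. The function `gaussian_tolerance` is a hypothetical restatement for illustration, not the dm_control API; it always returns a NumPy array rather than converting scalars back to `float`.

```python
import numpy as np


def gaussian_tolerance(x, lower, upper, margin, value_at_margin=0.1):
    """Hypothetical restatement of tolerance(..., sigmoid='gaussian')."""
    x = np.asarray(x, dtype=float)
    in_bounds = (lower <= x) & (x <= upper)
    if margin == 0:
        # Hard indicator: 1 inside the bounds, 0 outside.
        return np.where(in_bounds, 1.0, 0.0)
    # Distance to the nearest bound, measured in units of `margin`.
    d = np.where(x < lower, lower - x, x - upper) / margin
    # Chosen so that exp(-0.5 * scale**2) == value_at_margin at d == 1.
    scale = np.sqrt(-2.0 * np.log(value_at_margin))
    return np.where(in_bounds, 1.0, np.exp(-0.5 * (d * scale) ** 2))


inside = gaussian_tolerance(0.5, lower=0.0, upper=1.0, margin=0.5)
at_margin = gaussian_tolerance(1.5, lower=0.0, upper=1.0, margin=0.5)
```

With these inputs `inside` is 1 and `at_margin` equals `value_at_margin`, since 1.5 lies exactly one margin beyond the upper bound.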
--- a/alr_envs/alr/mujoco/gym_table_tennis/utils/util.py
+++ b/alr_envs/alr/mujoco/gym_table_tennis/utils/util.py
@ -1,49 +0,0 @@
-import json
-import yaml
-import xml.etree.ElementTree as ET
-from collections import OrderedDict
-from pathlib import Path
-
-
-def read_json(fname):
-    fname = Path(fname)
-    with fname.open('rt') as handle:
-        return json.load(handle, object_hook=OrderedDict)
-
-
-def write_json(content, fname):
-    fname = Path(fname)
-    with fname.open('wt') as handle:
-        json.dump(content, handle, indent=4, sort_keys=False)
-
-
-def read_yaml(fname):
-    fname = Path(fname)
-    with fname.open('rt') as handle:
-        return yaml.load(handle, Loader=yaml.FullLoader)
-
-
-def write_yaml(content, fname):
-    fname = Path(fname)
-    with fname.open('wt') as handle:
-        yaml.dump(content, handle)
-
-
-def config_save(dir_path, config):
-    dir_path = Path(dir_path)
-    config_path_json = dir_path / "config.json"
-    config_path_yaml = dir_path / "config.yml"
-    # .json and .yml file,save 2 version of configuration.
-    write_json(config, config_path_json)
-    write_yaml(config, config_path_yaml)
-
-
-def change_kp_in_xml(kp_list,
-                     model_path="/home/zhou/slow/table_tennis_rl/simulation/gymTableTennis/gym_table_tennis/simple_reacher/robotics/assets/table_tennis/right_arm_actuator.xml"):
-    tree = ET.parse(model_path)
-    root = tree.getroot()
-    # for actuator in root.find("actuator"):
-    for position, kp in zip(root.iter('position'), kp_list):
-        position.set("kp", str(kp))
-    tree.write(model_path)
-
--- a/alr_envs/alr/mujoco/meshes/wam/base_link_convex.stl
+++ b/alr_envs/alr/mujoco/meshes/wam/base_link_convex.stl
--- a/alr_envs/alr/mujoco/meshes/wam/base_link_fine.stl
+++ b/alr_envs/alr/mujoco/meshes/wam/base_link_fine.stl
--- a/alr_envs/alr/mujoco/meshes/wam/bhand_finger_dist_link_fine.stl
+++ b/alr_envs/alr/mujoco/meshes/wam/bhand_finger_dist_link_fine.stl
--- a/alr_envs/alr/mujoco/meshes/wam/bhand_finger_med_link_convex.stl
+++ b/alr_envs/alr/mujoco/meshes/wam/bhand_finger_med_link_convex.stl
--- a/alr_envs/alr/mujoco/meshes/wam/bhand_finger_med_link_fine.stl
+++ b/alr_envs/alr/mujoco/meshes/wam/bhand_finger_med_link_fine.stl
--- a/alr_envs/alr/mujoco/meshes/wam/bhand_finger_prox_link_convex_decomposition_p1.stl
+++ b/alr_envs/alr/mujoco/meshes/wam/bhand_finger_prox_link_convex_decomposition_p1.stl
--- a/alr_envs/alr/mujoco/meshes/wam/bhand_finger_prox_link_convex_decomposition_p3.stl
+++ b/alr_envs/alr/mujoco/meshes/wam/bhand_finger_prox_link_convex_decomposition_p3.stl
--- a/alr_envs/alr/mujoco/meshes/wam/bhand_finger_prox_link_fine.stl
+++ b/alr_envs/alr/mujoco/meshes/wam/bhand_finger_prox_link_fine.stl
--- a/alr_envs/alr/mujoco/meshes/wam/bhand_palm_link_convex_decomposition_p1.stl
+++ b/alr_envs/alr/mujoco/meshes/wam/bhand_palm_link_convex_decomposition_p1.stl
--- a/alr_envs/alr/mujoco/meshes/wam/bhand_palm_link_convex_decomposition_p2.stl
+++ b/alr_envs/alr/mujoco/meshes/wam/bhand_palm_link_convex_decomposition_p2.stl
--- a/alr_envs/alr/mujoco/meshes/wam/bhand_palm_link_convex_decomposition_p3.stl
+++ b/alr_envs/alr/mujoco/meshes/wam/bhand_palm_link_convex_decomposition_p3.stl
--- a/alr_envs/alr/mujoco/meshes/wam/bhand_palm_link_convex_decomposition_p4.stl
+++ b/alr_envs/alr/mujoco/meshes/wam/bhand_palm_link_convex_decomposition_p4.stl
--- a/alr_envs/alr/mujoco/meshes/wam/elbow_link_convex.stl
+++ b/alr_envs/alr/mujoco/meshes/wam/elbow_link_convex.stl
--- a/alr_envs/alr/mujoco/meshes/wam/forearm_link_convex_decomposition_p1.stl
+++ b/alr_envs/alr/mujoco/meshes/wam/forearm_link_convex_decomposition_p1.stl
--- a/alr_envs/alr/mujoco/meshes/wam/forearm_link_convex_decomposition_p2.stl
+++ b/alr_envs/alr/mujoco/meshes/wam/forearm_link_convex_decomposition_p2.stl
--- a/alr_envs/alr/mujoco/meshes/wam/forearm_link_fine.stl
+++ b/alr_envs/alr/mujoco/meshes/wam/forearm_link_fine.stl
--- a/alr_envs/alr/mujoco/meshes/wam/shoulder_link_convex_decomposition_p1.stl
+++ b/alr_envs/alr/mujoco/meshes/wam/shoulder_link_convex_decomposition_p1.stl
--- a/alr_envs/alr/mujoco/meshes/wam/shoulder_link_convex_decomposition_p2.stl
+++ b/alr_envs/alr/mujoco/meshes/wam/shoulder_link_convex_decomposition_p2.stl
--- a/alr_envs/alr/mujoco/meshes/wam/shoulder_link_convex_decomposition_p3.stl
+++ b/alr_envs/alr/mujoco/meshes/wam/shoulder_link_convex_decomposition_p3.stl
--- a/alr_envs/alr/mujoco/meshes/wam/shoulder_link_fine.stl
+++ b/alr_envs/alr/mujoco/meshes/wam/shoulder_link_fine.stl
--- a/alr_envs/alr/mujoco/meshes/wam/shoulder_pitch_link_convex.stl
+++ b/alr_envs/alr/mujoco/meshes/wam/shoulder_pitch_link_convex.stl
--- a/alr_envs/alr/mujoco/meshes/wam/shoulder_pitch_link_fine.stl
+++ b/alr_envs/alr/mujoco/meshes/wam/shoulder_pitch_link_fine.stl
--- a/alr_envs/alr/mujoco/meshes/wam/upper_arm_link_convex_decomposition_p1.stl
+++ b/alr_envs/alr/mujoco/meshes/wam/upper_arm_link_convex_decomposition_p1.stl
--- a/alr_envs/alr/mujoco/meshes/wam/upper_arm_link_convex_decomposition_p2.stl
+++ b/alr_envs/alr/mujoco/meshes/wam/upper_arm_link_convex_decomposition_p2.stl
--- a/alr_envs/alr/mujoco/meshes/wam/upper_arm_link_fine.stl
+++ b/alr_envs/alr/mujoco/meshes/wam/upper_arm_link_fine.stl
--- a/alr_envs/alr/mujoco/meshes/wam/wrist_palm_link_convex.stl
+++ b/alr_envs/alr/mujoco/meshes/wam/wrist_palm_link_convex.stl
--- a/alr_envs/alr/mujoco/meshes/wam/wrist_pitch_link_convex_decomposition_p1.stl
+++ b/alr_envs/alr/mujoco/meshes/wam/wrist_pitch_link_convex_decomposition_p1.stl
--- a/alr_envs/alr/mujoco/meshes/wam/wrist_pitch_link_convex_decomposition_p2.stl
+++ b/alr_envs/alr/mujoco/meshes/wam/wrist_pitch_link_convex_decomposition_p2.stl
--- a/alr_envs/alr/mujoco/meshes/wam/wrist_pitch_link_convex_decomposition_p3.stl
+++ b/alr_envs/alr/mujoco/meshes/wam/wrist_pitch_link_convex_decomposition_p3.stl
--- a/alr_envs/alr/mujoco/meshes/wam/wrist_pitch_link_fine.stl
+++ b/alr_envs/alr/mujoco/meshes/wam/wrist_pitch_link_fine.stl
--- a/alr_envs/alr/mujoco/meshes/wam/wrist_yaw_link_convex_decomposition_p1.stl
+++ b/alr_envs/alr/mujoco/meshes/wam/wrist_yaw_link_convex_decomposition_p1.stl
--- a/alr_envs/alr/mujoco/meshes/wam/wrist_yaw_link_convex_decomposition_p2.stl
+++ b/alr_envs/alr/mujoco/meshes/wam/wrist_yaw_link_convex_decomposition_p2.stl
--- a/alr_envs/alr/mujoco/meshes/wam/wrist_yaw_link_fine.stl
+++ b/alr_envs/alr/mujoco/meshes/wam/wrist_yaw_link_fine.stl
--- a/alr_envs/alr/mujoco/reacher/alr_reacher.py
+++ b/alr_envs/alr/mujoco/reacher/alr_reacher.py
@ -1,139 +0,0 @@
-import os
-
-import numpy as np
-from gym import utils
-from gym.envs.mujoco import MujocoEnv
-
-import alr_envs.utils.utils as alr_utils
-
-
-class ALRReacherEnv(MujocoEnv, utils.EzPickle):
-    def __init__(self, steps_before_reward=200, n_links=5, balance=False):
-        utils.EzPickle.__init__(**locals())
-
-        self._steps = 0
-        self.steps_before_reward = steps_before_reward
-        self.n_links = n_links
-
-        self.balance = balance
-        self.balance_weight = 1.0
-
-        self.reward_weight = 1
-        if steps_before_reward == 200:
-            self.reward_weight = 200
-        elif steps_before_reward == 50:
-            self.reward_weight = 50
-
-        if n_links == 5:
-            file_name = 'reacher_5links.xml'
-        elif n_links == 7:
-            file_name = 'reacher_7links.xml'
-        else:
-            raise ValueError(f"Invalid number of links {n_links}, only 5 or 7 allowed.")
-
-        MujocoEnv.__init__(self, os.path.join(os.path.dirname(__file__), "assets", file_name), 2)
-
-    def step(self, a):
-        self._steps += 1
-
-        reward_dist = 0.0
-        angular_vel = 0.0
-        reward_balance = 0.0
-        if self._steps >= self.steps_before_reward:
-            vec = self.get_body_com("fingertip") - self.get_body_com("target")
-            reward_dist -= self.reward_weight * np.linalg.norm(vec)
-            if self.steps_before_reward > 0:
-                # avoid giving this penalty for normal step based case
-                # angular_vel -= 10 * np.linalg.norm(self.sim.data.qvel.flat[:self.n_links])
-                angular_vel -= 10 * np.square(self.sim.data.qvel.flat[:self.n_links]).sum()
-        reward_ctrl = - 10 * np.square(a).sum()
-
-        if self.balance:
-            reward_balance -= self.balance_weight * np.abs(
-                alr_utils.angle_normalize(np.sum(self.sim.data.qpos.flat[:self.n_links]), type="rad"))
-
-        reward = reward_dist + reward_ctrl + angular_vel + reward_balance
-        self.do_simulation(a, self.frame_skip)
-        ob = self._get_obs()
-        done = False
-        return ob, reward, done, dict(reward_dist=reward_dist, reward_ctrl=reward_ctrl,
-                                      velocity=angular_vel, reward_balance=reward_balance,
-                                      end_effector=self.get_body_com("fingertip").copy(),
-                                      goal=self.goal if hasattr(self, "goal") else None)
-
-    def viewer_setup(self):
-        self.viewer.cam.trackbodyid = 0
-
-    # def reset_model(self):
-    #     qpos = self.init_qpos
-    #     if not hasattr(self, "goal"):
-    #         self.goal = np.array([-0.25, 0.25])
-    #         # self.goal = self.init_qpos.copy()[:2] + 0.05
-    #     qpos[-2:] = self.goal
-    #     qvel = self.init_qvel
-    #     qvel[-2:] = 0
-    #     self.set_state(qpos, qvel)
-    #     self._steps = 0
-    #
-    #     return self._get_obs()
-
-    def reset_model(self):
-        qpos = self.init_qpos.copy()
-        while True:
-            self.goal = self.np_random.uniform(low=-self.n_links / 10, high=self.n_links / 10, size=2)
-            # self.goal = self.np_random.uniform(low=0, high=self.n_links / 10, size=2)
-            # self.goal = np.random.uniform(low=[-self.n_links / 10, 0], high=[0, self.n_links / 10], size=2)
-            if np.linalg.norm(self.goal) < self.n_links / 10:
-                break
-        qpos[-2:] = self.goal
-        qvel = self.init_qvel.copy()
-        qvel[-2:] = 0
-        self.set_state(qpos, qvel)
-        self._steps = 0
-
-        return self._get_obs()
-
-    # def reset_model(self):
-    #     qpos = self.np_random.uniform(low=-0.1, high=0.1, size=self.model.nq) + self.init_qpos
-    #     while True:
-    #         self.goal = self.np_random.uniform(low=-self.n_links / 10, high=self.n_links / 10, size=2)
-    #         if np.linalg.norm(self.goal) < self.n_links / 10:
-    #             break
-    #     qpos[-2:] = self.goal
-    #     qvel = self.init_qvel + self.np_random.uniform(low=-.005, high=.005, size=self.model.nv)
-    #     qvel[-2:] = 0
-    #     self.set_state(qpos, qvel)
-    #     self._steps = 0
-    #
-    #     return self._get_obs()
-
-    def _get_obs(self):
-        theta = self.sim.data.qpos.flat[:self.n_links]
-        target = self.get_body_com("target")
-        return np.concatenate([
-            np.cos(theta),
-            np.sin(theta),
-            target[:2],  # x-y of goal position
-            self.sim.data.qvel.flat[:self.n_links],  # angular velocity
-            self.get_body_com("fingertip") - target,  # goal distance
-            [self._steps],
-        ])
-
-
-if __name__ == '__main__':
-    nl = 5
-    render_mode = "human"  # "human" or "partial" or "final"
-    env = ALRReacherEnv(n_links=nl)
-    obs = env.reset()
-
-    for i in range(2000):
-        # objective.load_result("/tmp/cma")
-        # test with random actions
-        ac = env.action_space.sample()
-        obs, rew, d, info = env.step(ac)
-        if i % 10 == 0:
-            env.render(mode=render_mode)
-        if d:
-            env.reset()
-
-    env.close()
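Review note on the removed `reset_model` in `alr_reacher.py`: the goal is drawn by rejection sampling, proposing a point in the square `[-n_links / 10, n_links / 10]^2` and accepting it only if it lies inside the disc of the same radius. A minimal standard-library sketch of that loop follows; the function name `sample_goal_in_disc` and the use of `random.Random` in place of `self.np_random` are assumptions for illustration, not part of the removed code.

```python
import math
import random


def sample_goal_in_disc(n_links, rng):
    """Draw a 2D goal from the disc of radius n_links / 10 by rejection sampling."""
    radius = n_links / 10
    while True:
        # Propose a point from the enclosing square, as the removed reset_model did.
        goal = (rng.uniform(-radius, radius), rng.uniform(-radius, radius))
        # Accept only proposals strictly inside the disc.
        if math.hypot(goal[0], goal[1]) < radius:
            return goal


rng = random.Random(0)  # hypothetical seed, chosen only for reproducibility
goals = [sample_goal_in_disc(5, rng) for _ in range(1000)]
```

Each proposal is accepted with probability pi / 4 (about 79%), so the loop is expected to finish after very few iterations.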
--- a/alr_envs/alr/mujoco/reacher/balancing.py
+++ b/alr_envs/alr/mujoco/reacher/balancing.py
@ -1,53 +0,0 @@
-import os
-
-import numpy as np
-from gym import utils
-from gym.envs.mujoco import mujoco_env
-
-import alr_envs.utils.utils as alr_utils
-
-
-class BalancingEnv(mujoco_env.MujocoEnv, utils.EzPickle):
-    def __init__(self, n_links=5):
-        utils.EzPickle.__init__(**locals())
-
-        self.n_links = n_links
-
-        if n_links == 5:
-            file_name = 'reacher_5links.xml'
-        elif n_links == 7:
-            file_name = 'reacher_7links.xml'
-        else:
-            raise ValueError(f"Invalid number of links {n_links}, only 5 or 7 allowed.")
-
-        mujoco_env.MujocoEnv.__init__(self, os.path.join(os.path.dirname(__file__), "assets", file_name), 2)
-
-    def step(self, a):
-        angle = alr_utils.angle_normalize(np.sum(self.sim.data.qpos.flat[:self.n_links]), type="rad")
-        reward = - np.abs(angle)
-
-        self.do_simulation(a, self.frame_skip)
-        ob = self._get_obs()
-        done = False
-        return ob, reward, done, dict(angle=angle, end_effector=self.get_body_com("fingertip").copy())
-
-    def viewer_setup(self):
-        self.viewer.cam.trackbodyid = 1
-
-    def reset_model(self):
-        # This also generates a goal, we however do not need/use it
-        qpos = self.np_random.uniform(low=-0.1, high=0.1, size=self.model.nq) + self.init_qpos
-        qpos[-2:] = 0
-        qvel = self.init_qvel + self.np_random.uniform(low=-.005, high=.005, size=self.model.nv)
-        qvel[-2:] = 0
-        self.set_state(qpos, qvel)
-
-        return self._get_obs()
-
-    def _get_obs(self):
-        theta = self.sim.data.qpos.flat[:self.n_links]
-        return np.concatenate([
-            np.cos(theta),
-            np.sin(theta),
-            self.sim.data.qvel.flat[:self.n_links],  # this is angular velocity
-        ])
--- a/alr_envs/alr/mujoco/reacher/mp_wrapper.py
+++ b/alr_envs/alr/mujoco/reacher/mp_wrapper.py
@ -1,43 +0,0 @@
-from typing import Union
-
-import numpy as np
-from mp_env_api import MPEnvWrapper
-
-
-class MPWrapper(MPEnvWrapper):
-
-    @property
-    def active_obs(self):
-        return np.concatenate([
-            [False] * self.n_links,  # cos
-            [False] * self.n_links,  # sin
-            [True] * 2,  # goal position
-            [False] * self.n_links,  # angular velocity
-            [False] * 3,  # goal distance
-            # self.get_body_com("target"),  # only return target to make problem harder
-            [False],  # step
-        ])
-
-    # @property
-    # def active_obs(self):
-    #     return np.concatenate([
-    #         [True] * self.n_links,  # cos, True
-    #         [True] * self.n_links,  # sin, True
-    #         [True] * 2,  # goal position
-    #         [True] * self.n_links,  # angular velocity, True
-    #         [True] * 3,  # goal distance
-    #         # self.get_body_com("target"),  # only return target to make problem harder
-    #         [False],  # step
-    #     ])
-
-    @property
-    def current_vel(self) -> Union[float, int, np.ndarray]:
-        return self.sim.data.qvel.flat[:self.n_links]
-
-    @property
-    def current_pos(self) -> Union[float, int, np.ndarray]:
-        return self.sim.data.qpos.flat[:self.n_links]
-
-    @property
-    def dt(self) -> Union[float, int]:
-        return self.env.dt
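Review note on the removed reacher `MPWrapper.active_obs`: the mask has to line up entry by entry with `ALRReacherEnv._get_obs`, which concatenates `cos`, `sin`, goal x-y, angular velocity, goal distance and the step counter. The sketch below rebuilds the mask with plain lists and applies it to a stand-in observation; `goal_only_mask` and `apply_mask` are hypothetical helper names, not functions from `mp_env_api`.

```python
def goal_only_mask(n_links):
    """Boolean mask over the reacher observation that keeps only the goal x-y."""
    return (
        [False] * n_links    # cos(theta)
        + [False] * n_links  # sin(theta)
        + [True] * 2         # goal position
        + [False] * n_links  # angular velocity
        + [False] * 3        # goal distance
        + [False]            # step counter
    )


def apply_mask(obs, mask):
    """Keep the observation entries whose mask value is True."""
    return [value for value, keep in zip(obs, mask) if keep]


mask = goal_only_mask(5)
obs = list(range(len(mask)))  # stand-in observation: each entry equals its index
context = apply_mask(obs, mask)
```

For `n_links=5` the mask has 21 entries and selects indices 10 and 11, the goal x-y slots.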
--- a/alr_envs/alr/mujoco/table_tennis/mp_wrapper.py
+++ b/alr_envs/alr/mujoco/table_tennis/mp_wrapper.py
@ -1,38 +0,0 @@
-from typing import Tuple, Union
-
-import numpy as np
-
-from mp_env_api.interface_wrappers.mp_env_wrapper import MPEnvWrapper
-
-
-class MPWrapper(MPEnvWrapper):
-
-    @property
-    def active_obs(self):
-        # TODO: @Max Filter observations correctly
-        return np.hstack([
-            [True] * 7,  # Joint Pos
-            [True] * 3, # Ball pos
-            [True] * 3  # goal pos
-        ])
-
-    @property
-    def start_pos(self):
-        return self.init_qpos_tt
-
-    @property
-    def current_pos(self) -> Union[float, int, np.ndarray, Tuple]:
-        return self.sim.data.qpos[:7].copy()
-
-    @property
-    def current_vel(self) -> Union[float, int, np.ndarray, Tuple]:
-        return self.sim.data.qvel[:7].copy()
-
-    @property
-    def goal_pos(self):
-        # TODO: @Max I think the default value of returning to the start is reasonable here
-        raise ValueError("Goal position is not available and has to be learnt based on the environment.")
-
-    @property
-    def dt(self) -> Union[float, int]:
-        return self.env.dt
--- a/alr_envs/alr/mujoco/table_tennis/tt_gym.py
+++ b/alr_envs/alr/mujoco/table_tennis/tt_gym.py
@ -1,180 +0,0 @@
-import os
-
-import numpy as np
-import mujoco_py
-from gym import utils, spaces
-from gym.envs.mujoco import MujocoEnv
-
-from alr_envs.alr.mujoco.table_tennis.tt_utils import ball_init
-from alr_envs.alr.mujoco.table_tennis.tt_reward import TT_Reward
-
-# TODO: Check for simulation stability. Make sure the code keeps running even if the simulation crashes.
-
-MAX_EPISODE_STEPS = 1750
-BALL_NAME_CONTACT = "target_ball_contact"
-BALL_NAME = "target_ball"
-TABLE_NAME = 'table_tennis_table'
-FLOOR_NAME = 'floor'
-PADDLE_CONTACT_1_NAME = 'bat'
-PADDLE_CONTACT_2_NAME = 'bat_back'
-RACKET_NAME = 'bat'
-# CONTEXT_RANGE_BOUNDS_2DIM = np.array([[-1.2, -0.6], [-0.2, 0.6]])
-CONTEXT_RANGE_BOUNDS_2DIM = np.array([[-1.2, -0.6], [-0.2, 0.0]])
-CONTEXT_RANGE_BOUNDS_4DIM = np.array([[-1.35, -0.75, -1.25, -0.75], [-0.1, 0.75, -0.1, 0.75]])
-
-
-class TTEnvGym(MujocoEnv, utils.EzPickle):
-
-    def __init__(self, ctxt_dim=2, fixed_goal=False):
-        model_path = os.path.join(os.path.dirname(__file__), "xml", 'table_tennis_env.xml')
-
-        self.ctxt_dim = ctxt_dim
-        self.fixed_goal = fixed_goal
-        if ctxt_dim == 2:
-            self.context_range_bounds = CONTEXT_RANGE_BOUNDS_2DIM
-            if self.fixed_goal:
-                self.goal = np.array([-1, -0.1, 0])
-            else:
-                self.goal = np.zeros(3)  # 2 x,y + 1z
-        elif ctxt_dim == 4:
-            self.context_range_bounds = CONTEXT_RANGE_BOUNDS_4DIM
-            self.goal = np.zeros(3)
-        else:
-            raise ValueError("Only 2- or 4-dimensional contexts are available.")
-
-        # has no effect as it is overwritten in init of super
-        # action_space_low = np.array([-2.6, -2.0, -2.8, -0.9, -4.8, -1.6, -2.2])
-        # action_space_high = np.array([2.6, 2.0, 2.8, 3.1, 1.3, 1.6, 2.2])
-        # self.action_space = spaces.Box(low=action_space_low, high=action_space_high, dtype='float64')
-
-        self.time_steps = 0
-        self.init_qpos_tt = np.array([0, 0, 0, 1.5, 0, 0, 1.5, 0, 0, 0])
-        self.init_qvel_tt = np.zeros(10)
-
-        self.reward_func = TT_Reward(self.ctxt_dim)
-        self.ball_landing_pos = None
-        self.hit_ball = False
-        self.ball_contact_after_hit = False
-        self._ids_set = False
-        super(TTEnvGym, self).__init__(model_path=model_path, frame_skip=1)
-        self.ball_id = self.sim.model._body_name2id[BALL_NAME]  # find the proper -> not protected func.
-        self.ball_contact_id = self.sim.model._geom_name2id[BALL_NAME_CONTACT]
-        self.table_contact_id = self.sim.model._geom_name2id[TABLE_NAME]
-        self.floor_contact_id = self.sim.model._geom_name2id[FLOOR_NAME]
-        self.paddle_contact_id_1 = self.sim.model._geom_name2id[PADDLE_CONTACT_1_NAME]  # check if we need both or only this
-        self.paddle_contact_id_2 = self.sim.model._geom_name2id[PADDLE_CONTACT_2_NAME]  # check if we need both or only this
-        self.racket_id = self.sim.model._geom_name2id[RACKET_NAME]
-
-    def _set_ids(self):
-        self.ball_id = self.sim.model._body_name2id[BALL_NAME]  # find the proper -> not protected func.
-        self.table_contact_id = self.sim.model._geom_name2id[TABLE_NAME]
-        self.floor_contact_id = self.sim.model._geom_name2id[FLOOR_NAME]
-        self.paddle_contact_id_1 = self.sim.model._geom_name2id[PADDLE_CONTACT_1_NAME]  # check if we need both or only this
-        self.paddle_contact_id_2 = self.sim.model._geom_name2id[PADDLE_CONTACT_2_NAME]  # check if we need both or only this
-        self.racket_id = self.sim.model._geom_name2id[RACKET_NAME]
-        self.ball_contact_id = self.sim.model._geom_name2id[BALL_NAME_CONTACT]
-        self._ids_set = True
-
-    def _get_obs(self):
-        ball_pos = self.sim.data.body_xpos[self.ball_id]
-        obs = np.concatenate([self.sim.data.qpos[:7].copy(),  # 7 joint positions
-                              ball_pos,
-                              self.goal.copy()])
-        return obs
-
-    def sample_context(self):
-        return self.np_random.uniform(self.context_range_bounds[0], self.context_range_bounds[1], size=self.ctxt_dim)
-
-    def reset_model(self):
-        self.set_state(self.init_qpos_tt, self.init_qvel_tt)    # reset to initial sim state
-        self.time_steps = 0
-        self.ball_landing_pos = None
-        self.hit_ball = False
-        self.ball_contact_after_hit = False
-        if self.fixed_goal:
-            self.goal = self.goal[:2]
-        else:
-            self.goal = self.sample_context()[:2]
-        if self.ctxt_dim == 2:
-            initial_ball_state = ball_init(random=False)  # fixed velocity, fixed position
-        elif self.ctxt_dim == 4:
-            initial_ball_state = ball_init(random=False)  # raise NotImplementedError
-
-        self.sim.data.set_joint_qpos('tar:x', initial_ball_state[0])
-        self.sim.data.set_joint_qpos('tar:y', initial_ball_state[1])
-        self.sim.data.set_joint_qpos('tar:z', initial_ball_state[2])
-
-        self.sim.data.set_joint_qvel('tar:x', initial_ball_state[3])
-        self.sim.data.set_joint_qvel('tar:y', initial_ball_state[4])
-        self.sim.data.set_joint_qvel('tar:z', initial_ball_state[5])
-
-        z_extended_goal_pos = np.concatenate((self.goal[:2], [0.77]))
-        self.goal = z_extended_goal_pos
-        self.sim.model.body_pos[5] = self.goal[:3]          # Desired Landing Position, Yellow
-        self.sim.model.body_pos[3] = np.array([0, 0, 0.5])  # Outgoing Ball Landing Position, Green
-        self.sim.model.body_pos[4] = np.array([0, 0, 0.5])  # Incoming Ball Landing Position, Red
-        self.sim.forward()
-
-        self.reward_func.reset(self.goal)                   # reset the reward function
-        return self._get_obs()
-
-    def _contact_checker(self, id_1, id_2):
-        for coni in range(0, self.sim.data.ncon):
-            con = self.sim.data.contact[coni]
-            collision = con.geom1 == id_1 and con.geom2 == id_2
-            collision_trans = con.geom1 == id_2 and con.geom2 == id_1
-            if collision or collision_trans:
-                return True
-        return False
-
-    def step(self, action):
-        if not self._ids_set:
-            self._set_ids()
-        done = False
-        episode_end = self.time_steps + 1 >= MAX_EPISODE_STEPS
-        if not self.hit_ball:
-            self.hit_ball = self._contact_checker(self.ball_contact_id, self.paddle_contact_id_1) # check for one side
-            if not self.hit_ball:
-                self.hit_ball = self._contact_checker(self.ball_contact_id, self.paddle_contact_id_2) # check for other side
-        if self.hit_ball:
-            if not self.ball_contact_after_hit:
-                if self._contact_checker(self.ball_contact_id, self.floor_contact_id):  # first check contact with floor
-                    self.ball_contact_after_hit = True
-                    self.ball_landing_pos = self.sim.data.body_xpos[self.ball_id]
-                elif self._contact_checker(self.ball_contact_id, self.table_contact_id): # second check contact with table
-                    self.ball_contact_after_hit = True
-                    self.ball_landing_pos = self.sim.data.body_xpos[self.ball_id]
-        c_ball_pos = self.sim.data.body_xpos[self.ball_id]
-        racket_pos = self.sim.data.geom_xpos[self.racket_id]        # TODO: use this to reach out the position of the paddle?
-        if self.ball_landing_pos is not None:
-            done = True
-            episode_end = True
-        reward = self.reward_func.get_reward(episode_end, c_ball_pos, racket_pos, self.hit_ball, self.ball_landing_pos)
-        self.time_steps += 1
-        # gravity compensation on joints:
-        #action += self.sim.data.qfrc_bias[:7].copy()
-        try:
-            self.do_simulation(action, self.frame_skip)
-        except mujoco_py.MujocoException:
-            print('Simulation became unstable, terminating episode')
-            done = True
-            reward = -25
-        ob = self._get_obs()
-        info = {"hit_ball": self.hit_ball,
-                "q_pos": np.copy(self.sim.data.qpos[:7]),
-                "ball_pos": np.copy(self.sim.data.qpos[7:])}
-        return ob, reward, done, info # might add some information here ....
-
-    def set_context(self, context):
-        old_state = self.sim.get_state()
-        qpos = old_state.qpos.copy()
-        qvel = old_state.qvel.copy()
-        self.set_state(qpos, qvel)
-        self.goal = context
-        z_extended_goal_pos = np.concatenate((self.goal[:self.ctxt_dim], [0.77]))
-        if self.ctxt_dim == 4:
-            z_extended_goal_pos = np.concatenate((z_extended_goal_pos, [0.77]))
-        self.goal = z_extended_goal_pos
-        self.sim.model.body_pos[5] = self.goal[:3]      # TODO: Missing: Setting the desired incoming landing position
-        self.sim.forward()
-        return self._get_obs()
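Review note on the removed `TTEnvGym._contact_checker`: a contact pair may be reported in either geom order, so the check has to be symmetric. A minimal sketch of the same test without a simulator is shown below; the `Contact` namedtuple is a hypothetical stand-in for the entries of `sim.data.contact`, and `in_contact` is an illustrative name.

```python
from collections import namedtuple

# Hypothetical stand-in for a MuJoCo contact record; only the two geom ids matter here.
Contact = namedtuple("Contact", ["geom1", "geom2"])


def in_contact(contacts, id_1, id_2):
    """Return True if any contact joins the two geoms, in either order."""
    for con in contacts:
        if (con.geom1, con.geom2) in ((id_1, id_2), (id_2, id_1)):
            return True
    return False


contacts = [Contact(2, 7), Contact(4, 9)]
hit_forward = in_contact(contacts, 2, 7)
hit_reversed = in_contact(contacts, 7, 2)
no_hit = in_contact(contacts, 4, 7)
```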
--- a/alr_envs/alr/mujoco/table_tennis/tt_reward.py
+++ b/alr_envs/alr/mujoco/table_tennis/tt_reward.py
@ -1,48 +0,0 @@
-import numpy as np
-
-
-class TT_Reward:
-
-    def __init__(self, ctxt_dim):
-        self.ctxt_dim = ctxt_dim
-        self.c_goal = None          # current desired landing point
-        self.c_ball_traj = []
-        self.c_racket_traj = []
-        self.constant = 8
-
-    def get_reward(self, episode_end, ball_position, racket_pos, hited_ball, ball_landing_pos):
-        self.c_ball_traj.append(ball_position.copy())
-        self.c_racket_traj.append(racket_pos.copy())
-        if not episode_end:
-            return 0
-        else:
-            # # seems to work for episodic case
-            min_r_b_dist = np.min(np.linalg.norm(np.array(self.c_ball_traj) - np.array(self.c_racket_traj), axis=1))
-            if not hited_ball:
-                return 0.2 * (1 - np.tanh(min_r_b_dist**2))
-            else:
-                if ball_landing_pos is None:
-                    min_b_des_b_dist = np.min(np.linalg.norm(np.array(self.c_ball_traj)[:,:2] - self.c_goal[:2], axis=1))
-                    return 2 * (1 - np.tanh(min_r_b_dist ** 2)) + (1 - np.tanh(min_b_des_b_dist**2))
-                else:
-                    min_b_des_b_land_dist = np.linalg.norm(self.c_goal[:2] - ball_landing_pos[:2])
-                    over_net_bonus = int(ball_landing_pos[0] < 0)
-                    return 2 * (1 - np.tanh(min_r_b_dist ** 2)) + 4 * (1 - np.tanh(min_b_des_b_land_dist ** 2)) + over_net_bonus
-
-
-            # if not hited_ball:
-            #     min_r_b_dist = 1 + np.min(np.linalg.norm(np.array(self.c_ball_traj) - np.array(self.c_racket_traj), axis=1))
-            #     return -min_r_b_dist
-            # else:
-            #     if ball_landing_pos is None:
-            #         dist_to_des_pos = 1-np.power(np.linalg.norm(self.c_goal - ball_position), 0.75)/self.constant
-            #     else:
-            #         dist_to_des_pos = 1-np.power(np.linalg.norm(self.c_goal - ball_landing_pos), 0.75)/self.constant
-            #     if dist_to_des_pos < -0.2:
-            #         dist_to_des_pos = -0.2
-            #     return -dist_to_des_pos
-
-    def reset(self, goal):
-        self.c_goal = goal.copy()
-        self.c_ball_traj = []
-        self.c_racket_traj = []
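Review note on the removed `TT_Reward.get_reward`: every term has the form `1 - tanh(d ** 2)`, which maps a distance `d` to a score in `(0, 1]`, and the three cases (no hit, hit without landing, hit with landing) only differ in the weights. The sketch below restates that structure with scalar distances already computed; the function names and the keyword arguments are assumptions for illustration, while the weights 0.2, 2, 1 and 4 and the over-net bonus come from the removed code.

```python
import math


def closeness(distance):
    """Map a distance in [0, inf) to a score in (0, 1]; 1 means zero distance."""
    return 1 - math.tanh(distance ** 2)


def final_reward(racket_ball_dist, hit_ball, goal_dist=None,
                 landing_dist=None, landed_over_net=False):
    """End-of-episode reward with the weights of the removed TT_Reward."""
    if not hit_ball:
        # Only reward getting the racket close to the ball.
        return 0.2 * closeness(racket_ball_dist)
    if landing_dist is None:
        # Ball was hit but never landed: use its closest approach to the goal.
        return 2 * closeness(racket_ball_dist) + closeness(goal_dist)
    # Ball was hit and landed: reward landing accuracy plus the over-net bonus.
    return (2 * closeness(racket_ball_dist)
            + 4 * closeness(landing_dist)
            + int(landed_over_net))


best = final_reward(0.0, True, landing_dist=0.0, landed_over_net=True)
miss = final_reward(0.0, False)
```

With all distances at zero and the ball landing over the net, the reward reaches its maximum of 7.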
--- a/alr_envs/alr/mujoco/table_tennis/tt_utils.py
+++ b/alr_envs/alr/mujoco/table_tennis/tt_utils.py
@ -1,26 +0,0 @@
-import numpy as np
-
-
-def ball_init(random=False, context_range=None):
-    if random:
-        dx = np.random.uniform(-0.1, 0.1)           # TODO: clarify these numbers?
-        dy = np.random.uniform(-0.1, 0.1)           # TODO: clarify these numbers?
-        dz = np.random.uniform(-0.1, 0.1)           # TODO: clarify these numbers?
-
-        v_x = np.random.uniform(1.7, 1.8)
-        v_y = np.random.uniform(0.7, 0.8)
-        v_z = np.random.uniform(0.1, 0.2)
-    else:
-        dx = 1
-        dy = 0
-        dz = 0.05
-
-        v_x = 2.5
-        v_y = 2
-        v_z = 0.5
-
-    initial_x = 0 + dx - 1.2
-    initial_y = -0.2 + dy - 0.6
-    initial_z = 0.3 + dz + 1.5
-    initial_ball_state = np.array([initial_x, initial_y, initial_z, v_x, v_y, v_z])
-    return initial_ball_state
--- a/alr_envs/alr/mujoco/table_tennis/xml/include_7_motor_actuator.xml
+++ b/alr_envs/alr/mujoco/table_tennis/xml/include_7_motor_actuator.xml
@ -1,12 +0,0 @@
-<mujocoinclude>
-    <actuator>
-        <motor name="wam/base_motor" joint="wam/base_yaw_joint_right" ctrllimited="true" ctrlrange="-1.0 1.0" gear="150.0"/>
-        <motor name="wam/shoulder_pitch_motor" joint='wam/shoulder_pitch_joint_right' ctrllimited="true" ctrlrange="-1.0 1.0" gear="125.0"/>
-        <motor name="wam/shoulder_yaw_motor" joint='wam/shoulder_yaw_joint_right' ctrllimited="true" ctrlrange="-1.0 1.0" gear="40.0"/>
-        <motor name="wam/elbow_pitch_motor" joint='wam/elbow_pitch_joint_right' ctrllimited="true" ctrlrange="-1.0 1.0" gear="60.0"/>
-        <motor name="wam/wrist_yaw_motor" joint='wam/wrist_yaw_joint_right' ctrllimited="true" ctrlrange="-1.0 1.0" gear="5.0"/>
-        <motor name="wam/wrist_pitch_motor" joint='wam/wrist_pitch_joint_right' ctrllimited="true" ctrlrange="-1.0 1.0" gear="5.0"/>
-        <motor name="wam/palm_yaw_motor" joint='wam/palm_yaw_joint_right' ctrllimited="true" ctrlrange="-1.0 1.0" gear="2.0"/>
-    </actuator>
-</mujocoinclude>
-
--- a/alr_envs/alr/mujoco/table_tennis/xml/include_barrett_wam_7dof_right.xml
+++ b/alr_envs/alr/mujoco/table_tennis/xml/include_barrett_wam_7dof_right.xml
@ -1,103 +0,0 @@
-<mujocoinclue>
-    <body name="wam/base_link_right" pos="2.1 0 2.0" quat="0 0 1 0" childclass="wam" >
-            <inertial pos="0 0 0" mass="1" diaginertia="0.1 0.1 0.1"/>
-            <geom name="base_link_fine" class="viz" mesh="base_link_fine" rgba="0.5 0.5 0.5 0"/>
-            <geom name="base_link_convex" class="col" mesh="base_link_convex" rgba="0.5 0.5 0.5 1"/>
-            <body name="wam/shoulder_yaw_link_right" pos="0 0 0.346">
-                <inertial pos="-0.00443422 -0.00066489 -0.128904" quat="0.69566 0.716713 -0.0354863 0.0334839" mass="5"
-                          diaginertia="0.135089 0.113095 0.0904426"/>
-<!--                control 0: 1.6-->
-                <joint name="wam/base_yaw_joint_right" range="-2.6 2.6" damping="1.98"/>
-                <geom name="shoulder_link_fine" class="viz" mesh="shoulder_link_fine" rgba="1 1 1 0"/>
-                <geom name="shoulder_link_convex_decomposition_p1" class="col"
-                      mesh="shoulder_link_convex_decomposition_p1"/>
-                <geom name="shoulder_link_convex_decomposition_p2" class="col"
-                      mesh="shoulder_link_convex_decomposition_p2"/>
-                <geom name="shoulder_link_convex_decomposition_p3" class="col"
-                      mesh="shoulder_link_convex_decomposition_p3"/>
-                <body name="wam/shoulder_pitch_link_right" pos="0 0 0" quat="0.707107 -0.707107 0 0">
-                    <inertial pos="-0.00236981 -0.0154211 0.0310561" quat="0.961794 0.273112 -0.0169316 0.00866592"
-                              mass="3.87494" diaginertia="0.0214195 0.0167127 0.0126452"/> <!--seems off-->
-<!--                    control 1: 0-->
-                    <joint name="wam/shoulder_pitch_joint_right" range="-2 2" damping="0.55"/>
-                    <geom name="shoulder_pitch_link_fine" class="viz" mesh="shoulder_pitch_link_fine" rgba="1 1 1 0"/>
-                    <geom name="shoulder_pitch_link_convex" class="col" mesh="shoulder_pitch_link_convex"/>
-                    <body name="wam/upper_arm_link_right" pos="0 0 0" quat="0.707107 0.707107 0 0">
-                        <inertial pos="0.00683259 3.309e-005 0.392492" quat="0.647136 0.0170822 0.0143038 0.762049"
-                                  mass="2.20228" diaginertia="0.0592718 0.0592207 0.00313419"/>
-<!--                        control 2: 0-->
-                        <joint name="wam/shoulder_yaw_joint_right" range="-2.8 2.8" damping="1.65"/>
-                        <geom name="upper_arm_link_fine" class="viz" mesh="upper_arm_link_fine" rgba="1 1 1 0"/>
-                        <geom name="upper_arm_link_convex_decomposition_p1" class="col"
-                              mesh="upper_arm_link_convex_decomposition_p1" rgba="0.094 0.48 0.804 1"/>
-                        <geom name="upper_arm_link_convex_decomposition_p2" class="col"
-                              mesh="upper_arm_link_convex_decomposition_p2" rgba="0.094 0.48 0.804 1"/>
-                        <body name="wam/forearm_link_right" pos="0.045 0 0.55" quat="0.707107 -0.707107 0 0">
-                            <inertial pos="-0.0400149 -0.142717 -0.00022942"
-                                      quat="0.704281 0.706326 0.0180333 0.0690353" mass="0.500168"
-                                      diaginertia="0.0151047 0.0148285 0.00275805"/>
-<!--                            control 3: 2.4-->
-                            <joint name="wam/elbow_pitch_joint_right" range="-0.9 3.1" damping="0.88"/>
-                            <geom name="elbow_link_fine" class="viz" mesh="elbow_link_fine" rgba="1 1 1 0"/>
-                            <geom name="elbow_link_convex" class="col" mesh="elbow_link_convex"/>
-                            <geom name="forearm_link_fine" class="viz" mesh="forearm_link_fine" pos="-.045 -0.0730 0"
-                                  euler="1.57 0 0" rgba="1 1 1 0"/>
-                            <geom name="forearm_link_convex_decomposition_p1" class="col"
-                                  mesh="forearm_link_convex_decomposition_p1" pos="-0.045 -0.0730 0"
-                                  euler="1.57 0 0" rgba="0.094 0.48 0.804 1"/>
-                            <geom name="forearm_link_convex_decomposition_p2" class="col"
-                                  mesh="forearm_link_convex_decomposition_p2" pos="-.045 -0.0730 0"
-                                  euler="1.57 0 0" rgba="0.094 0.48 0.804 1"/>
-                            <body name="wam/wrist_yaw_link_right" pos="-0.045 -0.3 0" quat="0.707107 0.707107 0 0">
-                                <inertial pos="8.921e-005 0.00435824 -0.00511217"
-                                          quat="0.630602 0.776093 0.00401969 -0.002372" mass="1.05376"
-                                          diaginertia="0.000555168 0.00046317 0.000234072"/> <!--this is an approximation-->
-<!--                                control 4: 0-->
-                                <joint name="wam/wrist_yaw_joint_right" range="-4.8 1.3" damping="0.55"/>
-                                <geom name="wrist_yaw_link_fine" class="viz" mesh="wrist_yaw_link_fine" rgba="1 1 1 0"/>
-                                <geom name="wrist_yaw_link_convex_decomposition_p1" class="col"
-                                      mesh="wrist_yaw_link_convex_decomposition_p1"/>
-                                <geom name="wrist_yaw_link_convex_decomposition_p2" class="col"
-                                      mesh="wrist_yaw_link_convex_decomposition_p2"/>
-                                <body name="wam/wrist_pitch_link_right" pos="0 0 0" quat="0.707107 -0.707107 0 0">
-                                    <inertial pos="-0.00012262 -0.0246834 -0.0170319"
-                                              quat="0.630602 0.776093 0.00401969 -0.002372" mass="0.517974"
-                                              diaginertia="0.000555168 0.00046317 0.000234072"/>
-<!--                                    control 5: 0-->
-                                    <joint name="wam/wrist_pitch_joint_right" range="-1.6 1.6" damping="0.11"/>
-                                    <geom name="wrist_pitch_link_fine" class="viz" mesh="wrist_pitch_link_fine"
-                                          rgba="1 1 1 0"/>
-                                    <geom name="wrist_pitch_link_convex_decomposition_p1" rgba="1 0.5 0.313 1"
-                                          class="col" mesh="wrist_pitch_link_convex_decomposition_p1"/>
-                                    <geom name="wrist_pitch_link_convex_decomposition_p2" rgba="1 0.5 0.313 1"
-                                          class="col" mesh="wrist_pitch_link_convex_decomposition_p2"/>
-                                    <geom name="wrist_pitch_link_convex_decomposition_p3" rgba="1 0.5 0.313 1"
-                                          class="col" mesh="wrist_pitch_link_convex_decomposition_p3"/>
-                                    <body name="wam/wrist_palm_link_right" pos="0 0 0" quat="0.707107 0.707107 0 0">
-                                        <inertial pos="0 0 0.055" quat="0.707107 0 0 0.707107" mass="0.0828613"
-                                                  diaginertia="0.00020683 0.00010859 0.00010851"/>
-<!--                                        control 6: 1.8-->
-                                        <joint name="wam/palm_yaw_joint_right" range="-2.2 2.2" damping="0.11"/>
-                                        <geom name="wrist_palm_link_fine" class="viz" mesh="wrist_palm_link_fine"
-                                              rgba="1 1 1 0"/>
-                                        <geom name="wrist_palm_link_convex" class="col" mesh="wrist_palm_link_convex"/>
-<!--                                        EE=wam/paddle, configure name to the end effector name-->
-                                        <body name="EE" pos="0 0 0.26" childclass="contact_geom">
-                                            <geom name="bat" type="cylinder" size="0.075 0.005" rgba="1 0 0 1"
-                                                  quat="0.71 0 0.71 0"/>
-                                            <geom name="bat_back" type="cylinder" size="0.0749 0.0025" rgba="0 1 0 1"
-                                                  quat="0.71 0 0.71 0" pos="-0.0026 0 0"/>
-                                            <geom name="wam/paddle_handle" type="box" size="0.005 0.01 0.05" pos="0 0 -0.08"
-                                                  rgba="1 1 1 1"/>
-                                            <!--                                            Extract information for sampling goals.-->
-                                            <site name="wam/paddle_center" pos="0 0 0" rgba="1 1 1 1" size="0.00001"/>
-                                        </body>
-                                    </body>
-                                </body>
-                            </body>
-                        </body>
-                    </body>
-                </body>
-            </body>
-        </body>
-</mujocoinclude>
--- a/alr_envs/alr/mujoco/table_tennis/xml/include_table.xml
+++ b/alr_envs/alr/mujoco/table_tennis/xml/include_table.xml
@ -1,30 +0,0 @@
-<mujocoinclude>
-    <body name="table_tennis_table" pos="0 0 0">
-        <geom class="viz" name="table_base_1" pos="1 0.7 0.375" rgba="1 1 1 1" size="0.05 0.05 .375" type="box" />
-        <geom class="viz" name="table_base_2" pos="1 -0.7 0.375" rgba="1 1 1 1" size="0.05 0.05 .375" type="box" />
-        <geom class="viz" name="table_base_3" pos="-1 -0.7 0.375" rgba="1 1 1 1" size="0.05 0.05 .375" type="box" />
-        <geom class="viz" name="table_base_4" pos="-1 0.7 0.375" rgba="1 1 1 1" size="0.05 0.05 .375" type="box" />
-        <body name="table_top" pos="0 0 0.76">
-            <geom class="contact_geom" name="table_tennis_table" pos="0 0 0" rgba="0 0 0.5 1" size="1.37 .7625 .01" type="box" />
-            <site name="left_up_corner" pos="-1.37 .7625 0.01" rgba="1 1 1 1" size="0.00001" />
-            <site name="middle_up_corner" pos="0 .7625 0.01" rgba="1 1 1 1" size="0.00001" />
-            <site name="left_down_corner" pos="-1.37 -0.7625 0.01" rgba="1 1 1 1" size="0.00001" />
-            <site name="middle_down_corner" pos="0 -.7625 0.01" rgba="1 1 1 1" size="0.00001" />
-            <geom class="contact_geom" material="floor_plane" name="table_te_context_spacennis_net" pos="0 0 0.08625" rgba="0 0 1 0.5" size="0.01 0.915 0.07625" type="box" />
-            <geom class="viz" name="left_while_line" pos="0 -0.7425 0.01" rgba="1 1 1 1" size="1.37 .02 .001" type="box" />
-            <geom class="viz" name="center_while_line" pos="0 0 0.01" rgba="1 1 1 1" size="1.37 .01 .001" type="box" />
-            <geom class="viz" name="right_while_line" pos="0 0.7425 0.01" rgba="1 1 1 1" size="1.37 .02 .001" type="box" />
-            <geom class="viz" name="right_side_line" pos="1.35 0 0.01" rgba="1 1 1 1" size="0.02 .7625 .001" type="box" />
-            <geom class="viz" name="left_side_line" pos="-1.35 0 0.01" rgba="1 1 1 1" size="0.02 .7625 .001" type="box" />
-        </body>
-        <body name="achieved_pos" pos="0 0 0.5">
-            <geom class="viz" name="achieved_point_geom" pos="0 0 0" rgba="0 1 0 1" size="0.02 0.001" type="cylinder" />
-        </body>
-        <body name="right_achieved_pos" pos="0 0 0.5">
-            <geom class="viz" name="hitting_achieved_point_geom" pos="0 0 0" rgba="1 0 0 1" size="0.02 0.001" type="cylinder" />
-        </body>
-        <body name="target_point" pos="0 0 0.5">
-            <geom class="viz" name="target_point_geom" pos="0 0 0" rgba="1 1 0 1" size="0.02 0.001" type="cylinder" />
-        </body>
-    </body>
-</mujocoinclude>
--- a/alr_envs/alr/mujoco/table_tennis/xml/include_target_ball.xml
+++ b/alr_envs/alr/mujoco/table_tennis/xml/include_target_ball.xml
@ -1,10 +0,0 @@
-<mujocoinclude>
-    <body name="target_ball" pos="0 0 0">
-        <joint axis="1 0 0" damping="0.0" name="tar:x" pos="0 0 0" stiffness="0" type="slide" frictionloss="0"/>
-        <joint axis="0 1 0" damping="0.0" name="tar:y" pos="0 0 0" stiffness="0" type="slide" frictionloss="0"/>
-        <joint axis="0 0 1" damping="0.0" name="tar:z" pos="0 0 0" stiffness="0" type="slide" frictionloss="0"/>
-        <geom size="0.025 0.025 0.025" type="sphere" condim="4" name="target_ball_contact" rgba="1 1 0 1" mass="0.1"
-              friction="0.1 0.1 0.1" solimp="0.9 0.95 0.001 0.5 2" solref="0.1 0.03" priority="1"/>
-        <site name="target_ball" pos="0 0 0" size="0.02 0.02 0.02" rgba="1 0 0 1" type="sphere"/>
-    </body>
-</mujocoinclude>
--- a/alr_envs/alr/mujoco/table_tennis/xml/right_arm_actuator.xml
+++ b/alr_envs/alr/mujoco/table_tennis/xml/right_arm_actuator.xml
@ -1,47 +0,0 @@
-<mujocoinclude>
-    <actuator>
-
-<!--                        <position ctrlrange="-2.6 2.6" joint="wam/base_yaw_joint_right" kp="100.0" />-->
-<!--                        <position ctrlrange="-1.985 1.985" joint="wam/shoulder_pitch_joint_right" kp="162.0" />-->
-<!--                        <position ctrlrange="-2.8 2.8" joint="wam/shoulder_yaw_joint_right" kp="100.0" />-->
-<!--                        <position ctrlrange="-0.9 3.14159" joint="wam/elbow_pitch_joint_right" kp="122.0" />-->
-<!--                        <position ctrlrange="-4.55 1.25" joint="wam/wrist_yaw_joint_right" kp="100.0" />-->
-<!--                        <position ctrlrange="-1.5707 1.5707" joint="wam/wrist_pitch_joint_right" kp="102.0" />-->
-<!--                        <position ctrlrange="-3 3" joint="wam/palm_yaw_joint_right" kp="100.0" />-->
-
-<!--                <position ctrlrange="-2.6 2.6" joint="wam/base_yaw_joint_right" kp="151.0" ctrllimited="true"/>-->
-<!--                <position ctrlrange="-1.985 1.985" joint="wam/shoulder_pitch_joint_right" kp="125.0" ctrllimited="true"/>-->
-<!--                <position ctrlrange="-2.8 2.8" joint="wam/shoulder_yaw_joint_right" kp="122.0" ctrllimited="true"/>-->
-<!--                <position ctrlrange="-0.9 3.14159" joint="wam/elbow_pitch_joint_right" kp="121.0" ctrllimited="true"/>-->
-<!--                <position ctrlrange="-4.55 1.25" joint="wam/wrist_yaw_joint_right" kp="99.0" ctrllimited="true"/>-->
-<!--                <position ctrlrange="-1.5707 1.5707" joint="wam/wrist_pitch_joint_right" kp="103.0" ctrllimited="true"/>-->
-<!--                <position ctrlrange="-3 3" joint="wam/palm_yaw_joint_right" kp="99.0" ctrllimited="true"/>-->
-
-<!--                        <position ctrlrange="-2.6 2.6" joint="wam/base_yaw_joint_right" kp="100.0" ctrllimited="true"/>-->
-<!--                <position ctrlrange="-1.985 1.985" joint="wam/shoulder_pitch_joint_right" kp="600.0" ctrllimited="true"/>-->
-<!--                <position ctrlrange="-2.8 2.8" joint="wam/shoulder_yaw_joint_right" kp="122.0" ctrllimited="true"/>-->
-<!--                <position ctrlrange="-0.9 3.14159" joint="wam/elbow_pitch_joint_right" kp="500.0" ctrllimited="true"/>-->
-<!--                <position ctrlrange="-4.55 1.25" joint="wam/wrist_yaw_joint_right" kp="99.0" ctrllimited="true"/>-->
-<!--                <position ctrlrange="-1.5707 1.5707" joint="wam/wrist_pitch_joint_right" kp="103.0" ctrllimited="true"/>-->
-<!--                <position ctrlrange="-3 3" joint="wam/palm_yaw_joint_right" kp="99.0" ctrllimited="true"/>-->
-
-                <position ctrlrange="-2.6 2.6" joint="wam/base_yaw_joint_right" kp="800.0" ctrllimited="true"/>
-                <position ctrlrange="-1.985 1.985" joint="wam/shoulder_pitch_joint_right" kp="800.0" ctrllimited="true"/>
-                <position ctrlrange="-2.8 2.8" joint="wam/shoulder_yaw_joint_right" kp="800.0" ctrllimited="true"/>
-                <position ctrlrange="-0.9 3.14159" joint="wam/elbow_pitch_joint_right" kp="800.0" ctrllimited="true"/>
-                <position ctrlrange="-4.55 1.25" joint="wam/wrist_yaw_joint_right" kp="100.0" ctrllimited="true"/>
-                <position ctrlrange="-1.5707 1.5707" joint="wam/wrist_pitch_joint_right" kp="1000.0" ctrllimited="true"/>
-                <position ctrlrange="-3 3" joint="wam/palm_yaw_joint_right" kp="100.0" ctrllimited="true"/>
-
-
-<!--        <position ctrlrange="-2.6 2.6" joint="wam/base_yaw_joint_right" kp="1600.0" ctrllimited="true"/>-->
-<!--&lt;!&ndash;        <velocity ctrlrange="-50 50" joint="wam/base_yaw_joint_right" kv="100" ctrllimited="true"/>&ndash;&gt;-->
-
-<!--        <position ctrlrange="-1.985 1.985" joint="wam/shoulder_pitch_joint_right" kp="2000.0" ctrllimited="true"/>-->
-<!--        <position ctrlrange="-2.8 2.8" joint="wam/shoulder_yaw_joint_right" kp="800.0" ctrllimited="true"/>-->
-<!--        <position ctrlrange="-0.9 3.14159" joint="wam/elbow_pitch_joint_right" kp="1200.0" ctrllimited="true"/>-->
-<!--        <position ctrlrange="-4.55 1.25" joint="wam/wrist_yaw_joint_right" kp="100.0" ctrllimited="true"/>-->
-<!--        <position ctrlrange="-1.5707 1.5707" joint="wam/wrist_pitch_joint_right" kp="2000.0" ctrllimited="true"/>-->
-<!--        <position ctrlrange="-3 3" joint="wam/palm_yaw_joint_right" kp="100.0" ctrllimited="true"/>-->
-    </actuator>
-</mujocoinclude>
--- a/alr_envs/alr/mujoco/table_tennis/xml/shared.xml
+++ b/alr_envs/alr/mujoco/table_tennis/xml/shared.xml
@ -1,46 +0,0 @@
-<mujocoinclude>
-    <default>
-        <default class="wam">
-            <joint type="hinge" limited="true" pos="0 0 0" axis="0 0 1"/>
-        </default>
-        <default class="viz">
-            <geom type="mesh" contype="0" conaffinity="0" group="1" rgba="1 1 1 1"/>
-        </default>
-        <default class="col">
-            <geom type="mesh" contype="0" conaffinity="1" group="0" rgba="1 1 1 1"/>
-        </default>
-        <default class="contact_geom">
-            <geom condim="4" friction="0.1 0.1 0.1" margin="0" solimp="1 1 0" solref="0.1 0.03"/>
-        </default>
-    </default>
-    <asset>
-        <mesh file="base_link_fine.stl"/>
-        <mesh file="base_link_convex.stl"/>
-        <mesh file="shoulder_link_fine.stl"/>
-        <mesh file="shoulder_link_convex_decomposition_p1.stl"/>
-        <mesh file="shoulder_link_convex_decomposition_p2.stl"/>
-        <mesh file="shoulder_link_convex_decomposition_p3.stl"/>
-        <mesh file="shoulder_pitch_link_fine.stl"/>
-        <mesh file="shoulder_pitch_link_convex.stl"/>
-        <mesh file="upper_arm_link_fine.stl"/>
-        <mesh file="upper_arm_link_convex_decomposition_p1.stl"/>
-        <mesh file="upper_arm_link_convex_decomposition_p2.stl"/>
-        <mesh file="elbow_link_fine.stl"/>
-        <mesh file="elbow_link_convex.stl"/>
-        <mesh file="forearm_link_fine.stl"/>
-        <mesh file="forearm_link_convex_decomposition_p1.stl"/>
-        <mesh file="forearm_link_convex_decomposition_p2.stl"/>
-        <mesh file="wrist_yaw_link_fine.stl"/>
-        <mesh file="wrist_yaw_link_convex_decomposition_p1.stl"/>
-        <mesh file="wrist_yaw_link_convex_decomposition_p2.stl"/>
-        <mesh file="wrist_pitch_link_fine.stl"/>
-        <mesh file="wrist_pitch_link_convex_decomposition_p1.stl"/>
-        <mesh file="wrist_pitch_link_convex_decomposition_p2.stl"/>
-        <mesh file="wrist_pitch_link_convex_decomposition_p3.stl"/>
-        <mesh file="wrist_palm_link_fine.stl"/>
-        <mesh file="wrist_palm_link_convex.stl"/>
-        <texture builtin="checker" height="512" name="texplane" rgb1=".2 .3 .4" rgb2=".1 0.15 0.2" type="2d"
-                 width="512"/>
-        <material name="floor_plane" reflectance="0.5" texrepeat="1 1" texture="texplane" texuniform="true"/>
-    </asset>
-</mujocoinclude>
--- a/alr_envs/alr/mujoco/table_tennis/xml/table_tennis_env.xml
+++ b/alr_envs/alr/mujoco/table_tennis/xml/table_tennis_env.xml
@ -1,18 +0,0 @@
-<mujoco model="table_tennis(v0.1)">
-    <compiler angle="radian" coordinate="local" meshdir="../../meshes/wam" />
-    <option gravity="0 0 -9.81" timestep="0.002">
-        <flag warmstart="enable" />
-    </option>
-    <custom>
-        <numeric data="0 0 0 0 0 0 0" name="START_ANGLES" />
-    </custom>
-    <include file="shared.xml" />
-    <worldbody>
-        <light cutoff="60" diffuse="1 1 1" dir="-.1 -.2 -1.3" directional="true" exponent="1" pos=".1 .2 1.3" specular=".1 .1 .1" />
-        <geom conaffinity="1" contype="1" material="floor_plane" name="floor" pos="0 0 0" size="10 5 1" type="plane" />
-        <include file="include_table.xml" />
-        <include file="include_barrett_wam_7dof_right.xml" />
-        <include file="include_target_ball.xml" />
-    </worldbody>
-    <include file="include_7_motor_actuator.xml" />
-</mujoco>
--- a/alr_envs/dmc/__init__.py
+++ b/alr_envs/dmc/__init__.py
@ -1,378 +0,0 @@
-from . import manipulation, suite
-
-ALL_DEEPMIND_MOTION_PRIMITIVE_ENVIRONMENTS = {"DMP": [], "ProMP": []}
-
-from gym.envs.registration import register
-
-# DeepMind Control Suite (DMC)
-
-register(
-    id=f'dmc_ball_in_cup-catch_dmp-v0',
-    entry_point='alr_envs.utils.make_env_helpers:make_dmp_env_helper',
-    # max_episode_steps=1,
-    kwargs={
-        "name": f"ball_in_cup-catch",
-        "time_limit": 20,
-        "episode_length": 1000,
-        "wrappers": [suite.ball_in_cup.MPWrapper],
-        "mp_kwargs": {
-            "num_dof": 2,
-            "num_basis": 5,
-            "duration": 20,
-            "learn_goal": True,
-            "alpha_phase": 2,
-            "bandwidth_factor": 2,
-            "policy_type": "motor",
-            "goal_scale": 0.1,
-            "policy_kwargs": {
-                "p_gains": 50,
-                "d_gains": 1
-            }
-        }
-    }
-)
-ALL_DEEPMIND_MOTION_PRIMITIVE_ENVIRONMENTS["DMP"].append("dmc_ball_in_cup-catch_dmp-v0")
-
-register(
-    id=f'dmc_ball_in_cup-catch_promp-v0',
-    entry_point='alr_envs.utils.make_env_helpers:make_promp_env_helper',
-    kwargs={
-        "name": f"ball_in_cup-catch",
-        "time_limit": 20,
-        "episode_length": 1000,
-        "wrappers": [suite.ball_in_cup.MPWrapper],
-        "mp_kwargs": {
-            "num_dof": 2,
-            "num_basis": 5,
-            "duration": 20,
-            "policy_type": "motor",
-            "zero_start": True,
-            "policy_kwargs": {
-                "p_gains": 50,
-                "d_gains": 1
-            }
-        }
-    }
-)
-ALL_DEEPMIND_MOTION_PRIMITIVE_ENVIRONMENTS["ProMP"].append("dmc_ball_in_cup-catch_promp-v0")
-
-register(
-    id=f'dmc_reacher-easy_dmp-v0',
-    entry_point='alr_envs.utils.make_env_helpers:make_dmp_env_helper',
-    # max_episode_steps=1,
-    kwargs={
-        "name": f"reacher-easy",
-        "time_limit": 20,
-        "episode_length": 1000,
-        "wrappers": [suite.reacher.MPWrapper],
-        "mp_kwargs": {
-            "num_dof": 2,
-            "num_basis": 5,
-            "duration": 20,
-            "learn_goal": True,
-            "alpha_phase": 2,
-            "bandwidth_factor": 2,
-            "policy_type": "motor",
-            "weights_scale": 50,
-            "goal_scale": 0.1,
-            "policy_kwargs": {
-                "p_gains": 50,
-                "d_gains": 1
-            }
-        }
-    }
-)
-ALL_DEEPMIND_MOTION_PRIMITIVE_ENVIRONMENTS["DMP"].append("dmc_reacher-easy_dmp-v0")
-
-register(
-    id=f'dmc_reacher-easy_promp-v0',
-    entry_point='alr_envs.utils.make_env_helpers:make_promp_env_helper',
-    kwargs={
-        "name": f"reacher-easy",
-        "time_limit": 20,
-        "episode_length": 1000,
-        "wrappers": [suite.reacher.MPWrapper],
-        "mp_kwargs": {
-            "num_dof": 2,
-            "num_basis": 5,
-            "duration": 20,
-            "policy_type": "motor",
-            "weights_scale": 0.2,
-            "zero_start": True,
-            "policy_kwargs": {
-                "p_gains": 50,
-                "d_gains": 1
-            }
-        }
-    }
-)
-ALL_DEEPMIND_MOTION_PRIMITIVE_ENVIRONMENTS["ProMP"].append("dmc_reacher-easy_promp-v0")
-
-register(
-    id=f'dmc_reacher-hard_dmp-v0',
-    entry_point='alr_envs.utils.make_env_helpers:make_dmp_env_helper',
-    # max_episode_steps=1,
-    kwargs={
-        "name": f"reacher-hard",
-        "time_limit": 20,
-        "episode_length": 1000,
-        "wrappers": [suite.reacher.MPWrapper],
-        "mp_kwargs": {
-            "num_dof": 2,
-            "num_basis": 5,
-            "duration": 20,
-            "learn_goal": True,
-            "alpha_phase": 2,
-            "bandwidth_factor": 2,
-            "policy_type": "motor",
-            "weights_scale": 50,
-            "goal_scale": 0.1,
-            "policy_kwargs": {
-                "p_gains": 50,
-                "d_gains": 1
-            }
-        }
-    }
-)
-ALL_DEEPMIND_MOTION_PRIMITIVE_ENVIRONMENTS["DMP"].append("dmc_reacher-hard_dmp-v0")
-
-register(
-    id=f'dmc_reacher-hard_promp-v0',
-    entry_point='alr_envs.utils.make_env_helpers:make_promp_env_helper',
-    kwargs={
-        "name": f"reacher-hard",
-        "time_limit": 20,
-        "episode_length": 1000,
-        "wrappers": [suite.reacher.MPWrapper],
-        "mp_kwargs": {
-            "num_dof": 2,
-            "num_basis": 5,
-            "duration": 20,
-            "policy_type": "motor",
-            "weights_scale": 0.2,
-            "zero_start": True,
-            "policy_kwargs": {
-                "p_gains": 50,
-                "d_gains": 1
-            }
-        }
-    }
-)
-ALL_DEEPMIND_MOTION_PRIMITIVE_ENVIRONMENTS["ProMP"].append("dmc_reacher-hard_promp-v0")
-
-_dmc_cartpole_tasks = ["balance", "balance_sparse", "swingup", "swingup_sparse"]
-
-for _task in _dmc_cartpole_tasks:
-    _env_id = f'dmc_cartpole-{_task}_dmp-v0'
-    register(
-        id=_env_id,
-        entry_point='alr_envs.utils.make_env_helpers:make_dmp_env_helper',
-        # max_episode_steps=1,
-        kwargs={
-            "name": f"cartpole-{_task}",
-            # "time_limit": 1,
-            "camera_id": 0,
-            "episode_length": 1000,
-            "wrappers": [suite.cartpole.MPWrapper],
-            "mp_kwargs": {
-                "num_dof": 1,
-                "num_basis": 5,
-                "duration": 10,
-                "learn_goal": True,
-                "alpha_phase": 2,
-                "bandwidth_factor": 2,
-                "policy_type": "motor",
-                "weights_scale": 50,
-                "goal_scale": 0.1,
-                "policy_kwargs": {
-                    "p_gains": 10,
-                    "d_gains": 10
-                }
-            }
-        }
-    )
-    ALL_DEEPMIND_MOTION_PRIMITIVE_ENVIRONMENTS["DMP"].append(_env_id)
-
-    _env_id = f'dmc_cartpole-{_task}_promp-v0'
-    register(
-        id=_env_id,
-        entry_point='alr_envs.utils.make_env_helpers:make_promp_env_helper',
-        kwargs={
-            "name": f"cartpole-{_task}",
-            # "time_limit": 1,
-            "camera_id": 0,
-            "episode_length": 1000,
-            "wrappers": [suite.cartpole.MPWrapper],
-            "mp_kwargs": {
-                "num_dof": 1,
-                "num_basis": 5,
-                "duration": 10,
-                "policy_type": "motor",
-                "weights_scale": 0.2,
-                "zero_start": True,
-                "policy_kwargs": {
-                    "p_gains": 10,
-                    "d_gains": 10
-                }
-            }
-        }
-    )
-    ALL_DEEPMIND_MOTION_PRIMITIVE_ENVIRONMENTS["ProMP"].append(_env_id)
-
-_env_id = f'dmc_cartpole-two_poles_dmp-v0'
-register(
-    id=_env_id,
-    entry_point='alr_envs.utils.make_env_helpers:make_dmp_env_helper',
-    # max_episode_steps=1,
-    kwargs={
-        "name": f"cartpole-two_poles",
-        # "time_limit": 1,
-        "camera_id": 0,
-        "episode_length": 1000,
-        "wrappers": [suite.cartpole.TwoPolesMPWrapper],
-        "mp_kwargs": {
-            "num_dof": 1,
-            "num_basis": 5,
-            "duration": 10,
-            "learn_goal": True,
-            "alpha_phase": 2,
-            "bandwidth_factor": 2,
-            "policy_type": "motor",
-            "weights_scale": 50,
-            "goal_scale": 0.1,
-            "policy_kwargs": {
-                "p_gains": 10,
-                "d_gains": 10
-            }
-        }
-    }
-)
-ALL_DEEPMIND_MOTION_PRIMITIVE_ENVIRONMENTS["DMP"].append(_env_id)
-
-_env_id = f'dmc_cartpole-two_poles_promp-v0'
-register(
-    id=_env_id,
-    entry_point='alr_envs.utils.make_env_helpers:make_promp_env_helper',
-    kwargs={
-        "name": f"cartpole-two_poles",
-        # "time_limit": 1,
-        "camera_id": 0,
-        "episode_length": 1000,
-        "wrappers": [suite.cartpole.TwoPolesMPWrapper],
-        "mp_kwargs": {
-            "num_dof": 1,
-            "num_basis": 5,
-            "duration": 10,
-            "policy_type": "motor",
-            "weights_scale": 0.2,
-            "zero_start": True,
-            "policy_kwargs": {
-                "p_gains": 10,
-                "d_gains": 10
-            }
-        }
-    }
-)
-ALL_DEEPMIND_MOTION_PRIMITIVE_ENVIRONMENTS["ProMP"].append(_env_id)
-
-_env_id = f'dmc_cartpole-three_poles_dmp-v0'
-register(
-    id=_env_id,
-    entry_point='alr_envs.utils.make_env_helpers:make_dmp_env_helper',
-    # max_episode_steps=1,
-    kwargs={
-        "name": f"cartpole-three_poles",
-        # "time_limit": 1,
-        "camera_id": 0,
-        "episode_length": 1000,
-        "wrappers": [suite.cartpole.ThreePolesMPWrapper],
-        "mp_kwargs": {
-            "num_dof": 1,
-            "num_basis": 5,
-            "duration": 10,
-            "learn_goal": True,
-            "alpha_phase": 2,
-            "bandwidth_factor": 2,
-            "policy_type": "motor",
-            "weights_scale": 50,
-            "goal_scale": 0.1,
-            "policy_kwargs": {
-                "p_gains": 10,
-                "d_gains": 10
-            }
-        }
-    }
-)
-ALL_DEEPMIND_MOTION_PRIMITIVE_ENVIRONMENTS["DMP"].append(_env_id)
-
-_env_id = f'dmc_cartpole-three_poles_promp-v0'
-register(
-    id=_env_id,
-    entry_point='alr_envs.utils.make_env_helpers:make_promp_env_helper',
-    kwargs={
-        "name": f"cartpole-three_poles",
-        # "time_limit": 1,
-        "camera_id": 0,
-        "episode_length": 1000,
-        "wrappers": [suite.cartpole.ThreePolesMPWrapper],
-        "mp_kwargs": {
-            "num_dof": 1,
-            "num_basis": 5,
-            "duration": 10,
-            "policy_type": "motor",
-            "weights_scale": 0.2,
-            "zero_start": True,
-            "policy_kwargs": {
-                "p_gains": 10,
-                "d_gains": 10
-            }
-        }
-    }
-)
-ALL_DEEPMIND_MOTION_PRIMITIVE_ENVIRONMENTS["ProMP"].append(_env_id)
-
-# DeepMind Manipulation
-
-register(
-    id=f'dmc_manipulation-reach_site_dmp-v0',
-    entry_point='alr_envs.utils.make_env_helpers:make_dmp_env_helper',
-    # max_episode_steps=1,
-    kwargs={
-        "name": f"manipulation-reach_site_features",
-        # "time_limit": 1,
-        "episode_length": 250,
-        "wrappers": [manipulation.reach_site.MPWrapper],
-        "mp_kwargs": {
-            "num_dof": 9,
-            "num_basis": 5,
-            "duration": 10,
-            "learn_goal": True,
-            "alpha_phase": 2,
-            "bandwidth_factor": 2,
-            "policy_type": "velocity",
-            "weights_scale": 50,
-            "goal_scale": 0.1,
-        }
-    }
-)
-ALL_DEEPMIND_MOTION_PRIMITIVE_ENVIRONMENTS["DMP"].append("dmc_manipulation-reach_site_dmp-v0")
-
-register(
-    id=f'dmc_manipulation-reach_site_promp-v0',
-    entry_point='alr_envs.utils.make_env_helpers:make_promp_env_helper',
-    kwargs={
-        "name": f"manipulation-reach_site_features",
-        # "time_limit": 1,
-        "episode_length": 250,
-        "wrappers": [manipulation.reach_site.MPWrapper],
-        "mp_kwargs": {
-            "num_dof": 9,
-            "num_basis": 5,
-            "duration": 10,
-            "policy_type": "velocity",
-            "weights_scale": 0.2,
-            "zero_start": True,
-        }
-    }
-)
-ALL_DEEPMIND_MOTION_PRIMITIVE_ENVIRONMENTS["ProMP"].append("dmc_manipulation-reach_site_promp-v0")
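The registrations removed above repeat nearly the same `kwargs` dictionary for every task. As a rough illustration of that shared structure, the sketch below assembles the ProMP variant with a hypothetical helper, `promp_kwargs`, which is not part of `alr_envs`. The keys and values are copied from the `dmc_reacher-easy_promp-v0` entry, and the wrapper is passed as a plain string placeholder so the snippet runs without `gym` or `dm_control`.

```python
from typing import Any, Dict, Optional


def promp_kwargs(name: str, wrapper: Any, num_dof: int, duration: float,
                 episode_length: int = 1000,
                 time_limit: Optional[float] = None,
                 weights_scale: Optional[float] = None,
                 gains: Optional[Dict[str, float]] = None,
                 policy_type: str = "motor") -> Dict[str, Any]:
    """Hypothetical helper: build the kwargs dict of one ProMP registration."""
    mp_kwargs: Dict[str, Any] = {
        "num_dof": num_dof,
        "num_basis": 5,  # all registrations shown above use 5 basis functions
        "duration": duration,
        "policy_type": policy_type,
        "zero_start": True,
    }
    # Optional entries are only added when the registration defines them
    if weights_scale is not None:
        mp_kwargs["weights_scale"] = weights_scale
    if gains is not None:
        mp_kwargs["policy_kwargs"] = dict(gains)

    kwargs: Dict[str, Any] = {
        "name": name,
        "episode_length": episode_length,
        "wrappers": [wrapper],
        "mp_kwargs": mp_kwargs,
    }
    if time_limit is not None:
        kwargs["time_limit"] = time_limit
    return kwargs


# Placeholder string instead of the real suite.reacher.MPWrapper class
reacher_easy = promp_kwargs(
    "reacher-easy", wrapper="suite.reacher.MPWrapper", num_dof=2, duration=20,
    time_limit=20, weights_scale=0.2, gains={"p_gains": 50, "d_gains": 1},
)
```

A dictionary built this way could be passed as `kwargs=` to `register(...)`. Whether such a helper is worth the indirection depends on how many tasks share the same defaults.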
--- a/alr_envs/dmc/dmc_wrapper.py
+++ b/alr_envs/dmc/dmc_wrapper.py
@ -1,206 +0,0 @@
-# Adapted from: https://github.com/denisyarats/dmc2gym/blob/master/dmc2gym/wrappers.py
-# License: MIT
-# Copyright (c) 2020 Denis Yarats
-import collections
-from typing import Any, Dict, Tuple
-
-import numpy as np
-from dm_control import manipulation, suite
-from dm_env import specs
-from gym import core, spaces
-
-
-def _spec_to_box(spec):
-    def extract_min_max(s):
-        assert s.dtype == np.float64 or s.dtype == np.float32, f"Only float64 and float32 types are allowed, instead {s.dtype} was found"
-        dim = int(np.prod(s.shape))
-        if type(s) == specs.Array:
-            bound = np.inf * np.ones(dim, dtype=s.dtype)
-            return -bound, bound
-        elif type(s) == specs.BoundedArray:
-            zeros = np.zeros(dim, dtype=s.dtype)
-            return s.minimum + zeros, s.maximum + zeros
-
-    mins, maxs = [], []
-    for s in spec:
-        mn, mx = extract_min_max(s)
-        mins.append(mn)
-        maxs.append(mx)
-    low = np.concatenate(mins, axis=0)
-    high = np.concatenate(maxs, axis=0)
-    assert low.shape == high.shape
-    return spaces.Box(low, high, dtype=s.dtype)
-
-
-def _flatten_obs(obs: collections.MutableMapping):
-    """
-    Flattens an observation of type MutableMapping, e.g. a dict to a 1D array.
-    Args:
-        obs: observation to flatten
-
-    Returns: 1D array of observation
-
-    """
-
-    if not isinstance(obs, collections.MutableMapping):
-        raise ValueError(f'Requires dict-like observations structure. {type(obs)} found.')
-
-    # Keep key order consistent for non OrderedDicts
-    keys = obs.keys() if isinstance(obs, collections.OrderedDict) else sorted(obs.keys())
-
-    obs_vals = [np.array([obs[key]]) if np.isscalar(obs[key]) else obs[key].ravel() for key in keys]
-    return np.concatenate(obs_vals)
-
-
-class DMCWrapper(core.Env):
-    def __init__(
-            self,
-            domain_name: str,
-            task_name: str,
-            task_kwargs: dict = {},
-            visualize_reward: bool = True,
-            from_pixels: bool = False,
-            height: int = 84,
-            width: int = 84,
-            camera_id: int = 0,
-            frame_skip: int = 1,
-            environment_kwargs: dict = None,
-            channels_first: bool = True
-    ):
-        assert 'random' in task_kwargs, 'Please specify a seed for deterministic behavior.'
-        self._from_pixels = from_pixels
-        self._height = height
-        self._width = width
-        self._camera_id = camera_id
-        self._frame_skip = frame_skip
-        self._channels_first = channels_first
-
-        # create task
-        if domain_name == "manipulation":
-            assert not from_pixels and not task_name.endswith("_vision"), \
-                "TODO: Vision interface for manipulation is different to suite and needs to be implemented"
-            self._env = manipulation.load(environment_name=task_name, seed=task_kwargs['random'])
-        else:
-            self._env = suite.load(domain_name=domain_name, task_name=task_name, task_kwargs=task_kwargs,
-                                   visualize_reward=visualize_reward, environment_kwargs=environment_kwargs)
-
-        # action and observation space
-        self._action_space = _spec_to_box([self._env.action_spec()])
-        self._observation_space = _spec_to_box(self._env.observation_spec().values())
-
-        self._last_state = None
-        self.viewer = None
-
-        # set seed
-        self.seed(seed=task_kwargs.get('random', 1))
-
-    def __getattr__(self, item):
-        """Propagate only non-existent properties to wrapped env."""
-        if item.startswith('_'):
-            raise AttributeError("attempted to get missing private attribute '{}'".format(item))
-        if item in self.__dict__:
-            return getattr(self, item)
-        return getattr(self._env, item)
-
-    def _get_obs(self, time_step):
-        if self._from_pixels:
-            obs = self.render(
-                mode="rgb_array",
-                height=self._height,
-                width=self._width,
-                camera_id=self._camera_id
-            )
-            if self._channels_first:
-                obs = obs.transpose(2, 0, 1).copy()
-        else:
-            obs = _flatten_obs(time_step.observation).astype(self.observation_space.dtype)
-        return obs
-
-    @property
-    def observation_space(self):
-        return self._observation_space
-
-    @property
-    def action_space(self):
-        return self._action_space
-
-    @property
-    def dt(self):
-        return self._env.control_timestep() * self._frame_skip
-
-    @property
-    def base_step_limit(self):
-        """
-        Returns: max_episode_steps of the underlying DMC env
-
-        """
-        # Accessing private attribute because DMC does not expose time_limit or step_limit.
-        # Only the current time_step/time as well as the control_timestep can be accessed.
-        try:
-            return (self._env._step_limit + self._frame_skip - 1) // self._frame_skip
-        except AttributeError as e:
-            return self._env._time_limit / self.dt
-
-    def seed(self, seed=None):
-        self._action_space.seed(seed)
-        self._observation_space.seed(seed)
-
-    def step(self, action) -> Tuple[np.ndarray, float, bool, Dict[str, Any]]:
-        assert self._action_space.contains(action)
-        reward = 0
-        extra = {'internal_state': self._env.physics.get_state().copy()}
-
-        for _ in range(self._frame_skip):
-            time_step = self._env.step(action)
-            reward += time_step.reward or 0.
-            done = time_step.last()
-            if done:
-                break
-
-        self._last_state = _flatten_obs(time_step.observation)
-        obs = self._get_obs(time_step)
-        extra['discount'] = time_step.discount
-        return obs, reward, done, extra
-
-    def reset(self) -> np.ndarray:
-        time_step = self._env.reset()
-        self._last_state = _flatten_obs(time_step.observation)
-        obs = self._get_obs(time_step)
-        return obs
-
-    def render(self, mode='rgb_array', height=None, width=None, camera_id=0):
-        if self._last_state is None:
-            raise ValueError('Environment not ready to render. Call reset() first.')
-
-        camera_id = camera_id or self._camera_id
-
-        # assert mode == 'rgb_array', 'only support rgb_array mode, given %s' % mode
-        if mode == "rgb_array":
-            height = height or self._height
-            width = width or self._width
-            return self._env.physics.render(height=height, width=width, camera_id=camera_id)
-
-        elif mode == 'human':
-            if self.viewer is None:
-                # pylint: disable=import-outside-toplevel
-                # pylint: disable=g-import-not-at-top
-                from gym.envs.classic_control import rendering
-                self.viewer = rendering.SimpleImageViewer()
-            # Render max available buffer size. Larger is only possible by altering the XML.
-            img = self._env.physics.render(height=self._env.physics.model.vis.global_.offheight,
-                                           width=self._env.physics.model.vis.global_.offwidth,
-                                           camera_id=camera_id)
-            self.viewer.imshow(img)
-            return self.viewer.isopen
-
-    def close(self):
-        super().close()
-        if self.viewer is not None and self.viewer.isopen:
-            self.viewer.close()
-
-    @property
-    def reward_range(self) -> Tuple[float, float]:
-        reward_spec = self._env.reward_spec()
-        if isinstance(reward_spec, specs.BoundedArray):
-            return reward_spec.minimum, reward_spec.maximum
-        return -float('inf'), float('inf')
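Review note: the removed `step()` above repeats one action for up to `frame_skip` low-level steps, sums the rewards (treating a `None` reward as 0), and stops early once the episode ends. A minimal, self-contained sketch of that pattern follows. `FakeTimeStep` and `FakeEnv` are hypothetical stand-ins invented for illustration and are not part of DMC or of this repository.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FakeTimeStep:
    """Hypothetical stand-in for a DMC time step (illustration only)."""
    reward: Optional[float]
    is_last: bool

    def last(self) -> bool:
        return self.is_last


class FakeEnv:
    """Hypothetical environment that terminates after `horizon` low-level steps."""

    def __init__(self, horizon: int):
        self.horizon = horizon
        self.t = 0

    def step(self, action) -> FakeTimeStep:
        self.t += 1
        return FakeTimeStep(reward=1.0, is_last=self.t >= self.horizon)


def step_with_frame_skip(env, action, frame_skip: int) -> Tuple[float, bool]:
    """Repeat `action` up to `frame_skip` times and accumulate the reward."""
    reward = 0.0
    done = False
    for _ in range(frame_skip):
        time_step = env.step(action)
        reward += time_step.reward or 0.0  # a None reward counts as 0
        done = time_step.last()
        if done:
            break  # do not step past the end of the episode
    return reward, done
```

With a horizon of 5 and `frame_skip=4`, the first call consumes four low-level steps and the second call stops after a single step because the episode ends.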
--- a/alr_envs/examples/examples_motion_primitives.py
+++ b/alr_envs/examples/examples_motion_primitives.py
@ -1,164 +0,0 @@
-import alr_envs
-
-
-def example_mp(env_name="alr_envs:HoleReacherDMP-v1", seed=1, iterations=1, render=True):
-    """
-    Example for running a motion primitive based environment, which is already registered
-    Args:
-        env_name: DMP env_id
-        seed: seed for deterministic behaviour
-        iterations: Number of rollout steps to run
-        render: Render the episode
-
-    Returns:
-
-    """
-    # gym.make() also works in this case, but we recommend our custom make() function.
-    # First, it already takes care of seeding; second, it enables the use of DMC tasks within the gym interface.
-    env = alr_envs.make(env_name, seed)
-
-    rewards = 0
-    # env.render(mode=None)
-    obs = env.reset()
-
-    # number of samples/full trajectories (multiple environment steps)
-    for i in range(iterations):
-
-        if render and i % 2 == 0:
-            # This renders the full MP trajectory
-            # It is only required to call render() once in the beginning, which renders every consecutive trajectory.
-            # Rendering can be switched off again with render(mode=None).
-            # It is also possible to change the mode multiple times when
-            # e.g. only every second trajectory should be displayed, such as here
-            # Just make sure the correct mode is set before executing the step.
-            env.render(mode="human")
-        else:
-            env.render(mode=None)
-
-        ac = env.action_space.sample()
-        obs, reward, done, info = env.step(ac)
-        rewards += reward
-
-        if done:
-            print(rewards)
-            rewards = 0
-            obs = env.reset()
-
-
-def example_custom_mp(env_name="alr_envs:HoleReacherDMP-v1", seed=1, iterations=1, render=True):
-    """
-    Example for running a motion primitive based environment, which is already registered
-    Args:
-        env_name: DMP env_id
-        seed: seed for deterministic behaviour
-        iterations: Number of rollout steps to run
-        render: Render the episode
-
-    Returns:
-
-    """
-    # Changing the mp_kwargs is possible by providing them to gym.
-    # E.g. here by providing way too many basis functions
-    mp_kwargs = {
-        "num_dof": 5,
-        "num_basis": 1000,
-        "duration": 2,
-        "learn_goal": True,
-        "alpha_phase": 2,
-        "bandwidth_factor": 2,
-        "policy_type": "velocity",
-        "weights_scale": 50,
-        "goal_scale": 0.1
-    }
-    env = alr_envs.make(env_name, seed, mp_kwargs=mp_kwargs)
-
-    # This time rendering every trajectory
-    if render:
-        env.render(mode="human")
-
-    rewards = 0
-    obs = env.reset()
-
-    # number of samples/full trajectories (multiple environment steps)
-    for i in range(iterations):
-        ac = env.action_space.sample()
-        obs, reward, done, info = env.step(ac)
-        rewards += reward
-
-        if done:
-            print(rewards)
-            rewards = 0
-            obs = env.reset()
-
-
-def example_fully_custom_mp(seed=1, iterations=1, render=True):
-    """
-    Example for running a custom motion primitive based environment.
-    Our already registered environments follow the same structure.
-    Hence, this also allows adjusting the hyperparameters of the motion primitives.
-    Yet, we recommend the method above if you are just interested in changing those parameters for existing tasks.
-    We appreciate PRs for custom environments (especially MP wrappers of existing tasks)
-    for our repo: https://github.com/ALRhub/alr_envs/
-    Args:
-        seed: seed
-        iterations: Number of rollout steps to run
-        render: Render the episode
-
-    Returns:
-
-    """
-
-    base_env = "alr_envs:HoleReacher-v1"
-
-    # Replace this wrapper with the custom wrapper for your environment by inheriting from the MPEnvWrapper.
-    # You can also add other gym.Wrappers in case they are needed.
-    wrappers = [alr_envs.alr.classic_control.hole_reacher.MPWrapper]
-    mp_kwargs = {
-        "num_dof": 5,
-        "num_basis": 5,
-        "duration": 2,
-        "learn_goal": True,
-        "alpha_phase": 2,
-        "bandwidth_factor": 2,
-        "policy_type": "velocity",
-        "weights_scale": 50,
-        "goal_scale": 0.1
-    }
-    env = alr_envs.make_dmp_env(base_env, wrappers=wrappers, seed=seed, mp_kwargs=mp_kwargs)
-    # OR for a deterministic ProMP:
-    # env = make_promp_env(base_env, wrappers=wrappers, seed=seed, mp_kwargs=mp_kwargs)
-
-    if render:
-        env.render(mode="human")
-
-    rewards = 0
-    obs = env.reset()
-
-    # number of samples/full trajectories (multiple environment steps)
-    for i in range(iterations):
-        ac = env.action_space.sample()
-        obs, reward, done, info = env.step(ac)
-        rewards += reward
-
-        if done:
-            print(rewards)
-            rewards = 0
-            obs = env.reset()
-
-
-if __name__ == '__main__':
-    render = False
-    # DMP
-    example_mp("alr_envs:HoleReacherDMP-v1", seed=10, iterations=1, render=render)
-
-    # ProMP
-    example_mp("alr_envs:HoleReacherProMP-v1", seed=10, iterations=1, render=render)
-
-    # DetProMP
-    example_mp("alr_envs:HoleReacherDetPMP-v1", seed=10, iterations=1, render=render)
-
-    # Altered basis functions
-    example_custom_mp("alr_envs:HoleReacherDMP-v1", seed=10, iterations=1, render=render)
-
-    # Custom MP
-    example_fully_custom_mp(seed=10, iterations=1, render=render)
--- a/alr_envs/examples/examples_open_ai.py
+++ b/alr_envs/examples/examples_open_ai.py
@ -1,41 +0,0 @@
-import alr_envs
-
-
-def example_mp(env_name, seed=1):
-    """
-    Example for running a motion primitive based version of an OpenAI gym environment that is already registered.
-    For more details on motion primitives, see the MP examples.
-    Args:
-        env_name: ProMP env_id
-        seed: seed
-
-    Returns:
-
-    """
-    # gym.make() also works in this case, but we recommend our custom make() function.
-    env = alr_envs.make(env_name, seed)
-
-    rewards = 0
-    obs = env.reset()
-
-    # number of samples/full trajectories (multiple environment steps)
-    for i in range(10):
-        ac = env.action_space.sample()
-        obs, reward, done, info = env.step(ac)
-        rewards += reward
-
-        if done:
-            print(rewards)
-            rewards = 0
-            obs = env.reset()
-
-
-if __name__ == '__main__':
-    # DMP - not supported yet
-    # example_mp("ReacherDMP-v2")
-
-    # ProMP
-    example_mp("ContinuousMountainCarProMP-v0")
-    example_mp("ReacherProMP-v2")
-    example_mp("FetchReachDenseProMP-v1")
-    example_mp("FetchSlideDenseProMP-v1")
--- a/alr_envs/examples/pd_control_gain_tuning.py
+++ b/alr_envs/examples/pd_control_gain_tuning.py
@ -1,100 +0,0 @@
-import numpy as np
-from matplotlib import pyplot as plt
-
-from alr_envs import dmc, meta
-from alr_envs.alr import mujoco
-from alr_envs.utils.make_env_helpers import make_promp_env
-
-
-def visualize(env):
-    t = env.t
-    pos_features = env.mp.basis_generator.basis(t)
-    plt.plot(t, pos_features)
-    plt.show()
-
-
-# This might work for some environments. In any case, please verify that the correct trajectory information
-# for your environment is extracted below.
-SEED = 1
-# env_id = "ball_in_cup-catch"
-env_id = "ALRReacherSparse-v0"
-env_id = "button-press-v2"
-wrappers = [mujoco.reacher.MPWrapper]
-wrappers = [meta.goal_object_change_mp_wrapper.MPWrapper]
-
-mp_kwargs = {
-    "num_dof": 4,
-    "num_basis": 5,
-    "duration": 6.25,
-    "policy_type": "metaworld",
-    "weights_scale": 10,
-    "zero_start": True,
-    # "policy_kwargs": {
-    #     "p_gains": 1,
-    #     "d_gains": 0.1
-    # }
-}
-
-# kwargs = dict(time_limit=4, episode_length=200)
-kwargs = {}
-
-env = make_promp_env(env_id, wrappers, seed=SEED, mp_kwargs=mp_kwargs, **kwargs)
-env.action_space.seed(SEED)
-
-# Plot difference between real trajectory and target MP trajectory
-env.reset()
-w = env.action_space.sample()  # N(0,1)
-visualize(env)
-pos, vel = env.mp_rollout(w)
-
-base_shape = env.full_action_space.shape
-actual_pos = np.zeros((len(pos), *base_shape))
-actual_vel = np.zeros((len(pos), *base_shape))
-act = np.zeros((len(pos), *base_shape))
-
-plt.ion()
-fig = plt.figure()
-ax = fig.add_subplot(1, 1, 1)
-img = ax.imshow(env.env.render("rgb_array"))
-fig.show()
-
-for t, pos_vel in enumerate(zip(pos, vel)):
-    actions = env.policy.get_action(pos_vel[0], pos_vel[1])
-    actions = np.clip(actions, env.full_action_space.low, env.full_action_space.high)
-    _, _, _, _ = env.env.step(actions)
-    if t % 15 == 0:
-        img.set_data(env.env.render("rgb_array"))
-        fig.canvas.draw()
-        fig.canvas.flush_events()
-    act[t, :] = actions
-    # TODO verify for your environment
-    actual_pos[t, :] = env.current_pos
-    actual_vel[t, :] = 0  # env.current_vel
-
-plt.figure(figsize=(15, 5))
-
-plt.subplot(131)
-plt.title("Position")
-p1 = plt.plot(actual_pos, c='C0', label="true")
-# plt.plot(actual_pos_ball, label="true pos ball")
-p2 = plt.plot(pos, c='C1', label="MP")  # , label=["MP" if i == 0 else None for i in range(np.prod(base_shape))])
-plt.xlabel("Episode steps")
-# plt.legend()
-handles, labels = plt.gca().get_legend_handles_labels()
-from collections import OrderedDict
-
-by_label = OrderedDict(zip(labels, handles))
-plt.legend(by_label.values(), by_label.keys())
-
-plt.subplot(132)
-plt.title("Velocity")
-plt.plot(actual_vel, c='C0', label="true")
-plt.plot(vel, c='C1', label="MP")
-plt.xlabel("Episode steps")
-
-plt.subplot(133)
-plt.title(f"Actions {np.std(act, axis=0)}")
-plt.plot(act, c="C0"),  # label=[f"actions" if i == 0 else "" for i in range(np.prod(base_action_shape))])
-plt.xlabel("Episode steps")
-# plt.legend()
-plt.show()
--- a/alr_envs/open_ai/init.py
+++ b/alr_envs/open_ai/init.py
@ -1,154 +0,0 @@
-from gym import register
-from gym.wrappers import FlattenObservation
-
-from . import classic_control, mujoco, robotics
-
-ALL_GYM_MOTION_PRIMITIVE_ENVIRONMENTS = {"DMP": [], "ProMP": []}
-
-# Short Continuous Mountain Car
-register(
-    id="MountainCarContinuous-v1",
-    entry_point="gym.envs.classic_control:Continuous_MountainCarEnv",
-    max_episode_steps=100,
-    reward_threshold=90.0,
-)
-
-# Open AI
-# Classic Control
-register(
-    id='ContinuousMountainCarProMP-v1',
-    entry_point='alr_envs.utils.make_env_helpers:make_promp_env_helper',
-    kwargs={
-        "name": "alr_envs:MountainCarContinuous-v1",
-        "wrappers": [classic_control.continuous_mountain_car.MPWrapper],
-        "mp_kwargs": {
-            "num_dof": 1,
-            "num_basis": 4,
-            "duration": 2,
-            "post_traj_time": 0,
-            "zero_start": True,
-            "policy_type": "motor",
-            "policy_kwargs": {
-                "p_gains": 1.,
-                "d_gains": 1.
-            }
-        }
-    }
-)
-ALL_GYM_MOTION_PRIMITIVE_ENVIRONMENTS["ProMP"].append("ContinuousMountainCarProMP-v1")
-
-register(
-    id='ContinuousMountainCarProMP-v0',
-    entry_point='alr_envs.utils.make_env_helpers:make_promp_env_helper',
-    kwargs={
-        "name": "gym.envs.classic_control:MountainCarContinuous-v0",
-        "wrappers": [classic_control.continuous_mountain_car.MPWrapper],
-        "mp_kwargs": {
-            "num_dof": 1,
-            "num_basis": 4,
-            "duration": 19.98,
-            "post_traj_time": 0,
-            "zero_start": True,
-            "policy_type": "motor",
-            "policy_kwargs": {
-                "p_gains": 1.,
-                "d_gains": 1.
-            }
-        }
-    }
-)
-ALL_GYM_MOTION_PRIMITIVE_ENVIRONMENTS["ProMP"].append("ContinuousMountainCarProMP-v0")
-
-register(
-    id='ReacherProMP-v2',
-    entry_point='alr_envs.utils.make_env_helpers:make_promp_env_helper',
-    kwargs={
-        "name": "gym.envs.mujoco:Reacher-v2",
-        "wrappers": [mujoco.reacher_v2.MPWrapper],
-        "mp_kwargs": {
-            "num_dof": 2,
-            "num_basis": 6,
-            "duration": 1,
-            "post_traj_time": 0,
-            "zero_start": True,
-            "policy_type": "motor",
-            "policy_kwargs": {
-                "p_gains": .6,
-                "d_gains": .075
-            }
-        }
-    }
-)
-ALL_GYM_MOTION_PRIMITIVE_ENVIRONMENTS["ProMP"].append("ReacherProMP-v2")
-
-register(
-    id='FetchSlideDenseProMP-v1',
-    entry_point='alr_envs.utils.make_env_helpers:make_promp_env_helper',
-    kwargs={
-        "name": "gym.envs.robotics:FetchSlideDense-v1",
-        "wrappers": [FlattenObservation, robotics.fetch.MPWrapper],
-        "mp_kwargs": {
-            "num_dof": 4,
-            "num_basis": 5,
-            "duration": 2,
-            "post_traj_time": 0,
-            "zero_start": True,
-            "policy_type": "position"
-        }
-    }
-)
-ALL_GYM_MOTION_PRIMITIVE_ENVIRONMENTS["ProMP"].append("FetchSlideDenseProMP-v1")
-
-register(
-    id='FetchSlideProMP-v1',
-    entry_point='alr_envs.utils.make_env_helpers:make_promp_env_helper',
-    kwargs={
-        "name": "gym.envs.robotics:FetchSlide-v1",
-        "wrappers": [FlattenObservation, robotics.fetch.MPWrapper],
-        "mp_kwargs": {
-            "num_dof": 4,
-            "num_basis": 5,
-            "duration": 2,
-            "post_traj_time": 0,
-            "zero_start": True,
-            "policy_type": "position"
-        }
-    }
-)
-ALL_GYM_MOTION_PRIMITIVE_ENVIRONMENTS["ProMP"].append("FetchSlideProMP-v1")
-
-register(
-    id='FetchReachDenseProMP-v1',
-    entry_point='alr_envs.utils.make_env_helpers:make_promp_env_helper',
-    kwargs={
-        "name": "gym.envs.robotics:FetchReachDense-v1",
-        "wrappers": [FlattenObservation, robotics.fetch.MPWrapper],
-        "mp_kwargs": {
-            "num_dof": 4,
-            "num_basis": 5,
-            "duration": 2,
-            "post_traj_time": 0,
-            "zero_start": True,
-            "policy_type": "position"
-        }
-    }
-)
-ALL_GYM_MOTION_PRIMITIVE_ENVIRONMENTS["ProMP"].append("FetchReachDenseProMP-v1")
-
-register(
-    id='FetchReachProMP-v1',
-    entry_point='alr_envs.utils.make_env_helpers:make_promp_env_helper',
-    kwargs={
-        "name": "gym.envs.robotics:FetchReach-v1",
-        "wrappers": [FlattenObservation, robotics.fetch.MPWrapper],
-        "mp_kwargs": {
-            "num_dof": 4,
-            "num_basis": 5,
-            "duration": 2,
-            "post_traj_time": 0,
-            "zero_start": True,
-            "policy_type": "position"
-        }
-    }
-)
-ALL_GYM_MOTION_PRIMITIVE_ENVIRONMENTS["ProMP"].append("FetchReachProMP-v1")
--- a/alr_envs/open_ai/classic_control/init.py
+++ b/alr_envs/open_ai/classic_control/init.py
@ -1 +0,0 @@
-from . import continuous_mountain_car
--- a/alr_envs/open_ai/classic_control/continuous_mountain_car/init.py
+++ b/alr_envs/open_ai/classic_control/continuous_mountain_car/init.py
@ -1 +0,0 @@
-from .mp_wrapper import MPWrapper
--- a/alr_envs/open_ai/classic_control/continuous_mountain_car/mp_wrapper.py
+++ b/alr_envs/open_ai/classic_control/continuous_mountain_car/mp_wrapper.py
@ -1,22 +0,0 @@
-from typing import Union
-
-import numpy as np
-from mp_env_api import MPEnvWrapper
-
-
-class MPWrapper(MPEnvWrapper):
-    @property
-    def current_vel(self) -> Union[float, int, np.ndarray]:
-        return np.array([self.state[1]])
-
-    @property
-    def current_pos(self) -> Union[float, int, np.ndarray]:
-        return np.array([self.state[0]])
-
-    @property
-    def goal_pos(self):
-        raise ValueError("Goal position is not available and has to be learnt based on the environment.")
-
-    @property
-    def dt(self) -> Union[float, int]:
-        return 0.02
--- a/alr_envs/open_ai/mujoco/reacher_v2/init.py
+++ b/alr_envs/open_ai/mujoco/reacher_v2/init.py
@ -1 +0,0 @@
-from .mp_wrapper import MPWrapper
--- a/alr_envs/open_ai/mujoco/reacher_v2/mp_wrapper.py
+++ b/alr_envs/open_ai/mujoco/reacher_v2/mp_wrapper.py
@ -1,19 +0,0 @@
-from typing import Union
-
-import numpy as np
-from mp_env_api import MPEnvWrapper
-
-
-class MPWrapper(MPEnvWrapper):
-
-    @property
-    def current_vel(self) -> Union[float, int, np.ndarray]:
-        return self.sim.data.qvel[:2]
-
-    @property
-    def current_pos(self) -> Union[float, int, np.ndarray]:
-        return self.sim.data.qpos[:2]
-
-    @property
-    def dt(self) -> Union[float, int]:
-        return self.env.dt
--- a/alr_envs/open_ai/robotics/fetch/init.py
+++ b/alr_envs/open_ai/robotics/fetch/init.py
@ -1 +0,0 @@
-from .mp_wrapper import MPWrapper
--- a/alr_envs/utils/init.py
+++ b/alr_envs/utils/init.py
@ -1,66 +0,0 @@
-import re
-from typing import Union
-
-import gym
-from gym.envs.registration import register
-
-from alr_envs.utils.make_env_helpers import make
-
-
-def make_dmc(
-        id: str,
-        seed: int = 1,
-        visualize_reward: bool = True,
-        from_pixels: bool = False,
-        height: int = 84,
-        width: int = 84,
-        camera_id: int = 0,
-        frame_skip: int = 1,
-        episode_length: Union[None, int] = None,
-        environment_kwargs: dict = {},
-        time_limit: Union[None, float] = None,
-        channels_first: bool = True
-):
-    # Adapted from: https://github.com/denisyarats/dmc2gym/blob/master/dmc2gym/__init__.py
-    # License: MIT
-    # Copyright (c) 2020 Denis Yarats
-
-    assert re.match(r"\w+-\w+", id), "env_id does not have the following structure: 'domain_name-task_name'"
-    domain_name, task_name = id.split("-")
-
-    env_id = f'dmc_{domain_name}_{task_name}_{seed}-v1'
-
-    if from_pixels:
-        assert not visualize_reward, 'cannot use visualize reward when learning from pixels'
-
-    # shorten episode length
-    if episode_length is None:
-        # The default episode length is 1000 for the benchmarking suite and 250 for manipulation tasks
-        episode_length = 250 if domain_name == "manipulation" else 1000
-
-    max_episode_steps = (episode_length + frame_skip - 1) // frame_skip
-    if env_id not in gym.envs.registry.env_specs:
-        task_kwargs = {'random': seed}
-        # if seed is not None:
-        #     task_kwargs['random'] = seed
-        if time_limit is not None:
-            task_kwargs['time_limit'] = time_limit
-        register(
-            id=env_id,
-            entry_point='alr_envs.dmc.dmc_wrapper:DMCWrapper',
-            kwargs=dict(
-                domain_name=domain_name,
-                task_name=task_name,
-                task_kwargs=task_kwargs,
-                environment_kwargs=environment_kwargs,
-                visualize_reward=visualize_reward,
-                from_pixels=from_pixels,
-                height=height,
-                width=width,
-                camera_id=camera_id,
-                frame_skip=frame_skip,
-                channels_first=channels_first,
-            ),
-            max_episode_steps=max_episode_steps,
-        )
-    return gym.make(env_id)
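Review note: `max_episode_steps` in the removed `make_dmc` is an integer ceiling division, so a final, shorter block of frames still counts as one gym step. A small sketch of just that arithmetic follows; the function name `ceil_div_steps` is made up for illustration.

```python
def ceil_div_steps(episode_length: int, frame_skip: int) -> int:
    """Number of gym steps needed to cover `episode_length` low-level steps.

    Equivalent to ceil(episode_length / frame_skip) without floating point.
    """
    if frame_skip < 1:
        raise ValueError("frame_skip must be a positive integer")
    return (episode_length + frame_skip - 1) // frame_skip


# 1000 low-level steps with frame_skip=3 need 334 gym steps:
# 333 full blocks of 3 plus one final block containing a single step.
steps = ceil_div_steps(1000, 3)
```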
--- a/alr_envs/utils/make_env_helpers.py
+++ b/alr_envs/utils/make_env_helpers.py
@ -1,224 +0,0 @@
-import warnings
-from typing import Iterable, Type, Union
-
-import gym
-import numpy as np
-from gym.envs.registration import EnvSpec
-
-from mp_env_api import MPEnvWrapper
-from mp_env_api.mp_wrappers.dmp_wrapper import DmpWrapper
-from mp_env_api.mp_wrappers.promp_wrapper import ProMPWrapper
-
-
-def make_rank(env_id: str, seed: int, rank: int = 0, return_callable=True, **kwargs):
-    """
-    TODO: Do we need this?
-    Generate a callable to create a new gym environment with a given seed.
-    The rank is added to the seed and can be used for example when using vector environments.
-    E.g. [make_rank("my_env_name-v0", 123, i) for i in range(8)] creates a list of 8 environments
-    with seeds 123 through 130.
-    Hence, testing environments should be seeded with a value which is offset by the number of training environments.
-    Here e.g. [make_rank("my_env_name-v0", 123 + 8, i) for i in range(5)] for 5 testing environments
-
-    Args:
-        env_id: name of the environment
-        seed: seed for deterministic behaviour
-        rank: environment rank, added to the seed for deterministic behaviour across multiple environments
-        return_callable: If True returns a callable to create the environment instead of the environment itself.
-
-    Returns:
-
-    """
-
-    def f():
-        return make(env_id, seed + rank, **kwargs)
-
-    return f if return_callable else f()
-
-
-def make(env_id: str, seed, **kwargs):
-    """
-    Converts an env_id to an environment with the gym API.
-    This also works for DeepMind Control Suite interface_wrappers
-    for which domain name and task name are expected to be separated by "-".
-    Args:
-        env_id: gym name or env_id of the form "domain_name-task_name" for DMC tasks
-        **kwargs: Additional kwargs for the constructor such as pixel observations, etc.
-
-    Returns: Gym environment
-
-    """
-    if any([det_pmp in env_id for det_pmp in ["DetPMP", "detpmp"]]):
-        warnings.warn("DetPMP is deprecated and converted to ProMP")
-        env_id = env_id.replace("DetPMP", "ProMP")
-        env_id = env_id.replace("detpmp", "promp")
-
-    try:
-        # Add seed to kwargs in case it is a predefined gym+dmc hybrid environment.
-        if env_id.startswith("dmc"):
-            kwargs.update({"seed": seed})
-
-        # Gym
-        env = gym.make(env_id, **kwargs)
-        env.seed(seed)
-        env.action_space.seed(seed)
-        env.observation_space.seed(seed)
-    except gym.error.Error:
-
-        # MetaWorld env
-        import metaworld
-        if env_id in metaworld.ML1.ENV_NAMES:
-            env = metaworld.envs.ALL_V2_ENVIRONMENTS_GOAL_OBSERVABLE[env_id + "-goal-observable"](seed=seed, **kwargs)
-            # setting this avoids generating the same initialization after each reset
-            env._freeze_rand_vec = False
-            # Manually set spec, as metaworld environments are not registered via gym
-            env.unwrapped.spec = EnvSpec(env_id)
-            # Set Timelimit based on the maximum allowed path length of the environment
-            env = gym.wrappers.TimeLimit(env, max_episode_steps=env.max_path_length)
-            env.seed(seed)
-            env.action_space.seed(seed)
-            env.observation_space.seed(seed)
-            env.goal_space.seed(seed)
-
-        else:
-            # DMC
-            from alr_envs import make_dmc
-            env = make_dmc(env_id, seed=seed, **kwargs)
-
-            assert env.base_step_limit == env.spec.max_episode_steps, \
-                f"The specified 'episode_length' of {env.spec.max_episode_steps} steps for gym is different from " \
-                f"the DMC environment specification of {env.base_step_limit} steps."
-
-    return env
-
-
-def _make_wrapped_env(env_id: str, wrappers: Iterable[Type[gym.Wrapper]], seed=1, **kwargs):
-    """
-    Helper function for creating a wrapped gym environment using MPs.
-    It adds all provided wrappers to the specified environment and verifies at least one MPEnvWrapper is
-    provided to expose the interface for MPs.
-
-    Args:
-        env_id: name of the environment
-        wrappers: list of wrappers (at least an MPEnvWrapper),
-        seed: seed of environment
-
-    Returns: gym environment with all specified wrappers applied
-
-    """
-    # _env = gym.make(env_id)
-    _env = make(env_id, seed, **kwargs)
-
-    assert any(issubclass(w, MPEnvWrapper) for w in wrappers), \
-        "At least one MPEnvWrapper is required in order to leverage motion primitive environments."
-    for w in wrappers:
-        _env = w(_env)
-
-    return _env
-
-
-def make_dmp_env(env_id: str, wrappers: Iterable, seed=1, mp_kwargs={}, **kwargs):
-    """
-    This can also be used standalone for manually building a custom DMP environment.
-    Args:
-        env_id: base_env_name,
-        wrappers: list of wrappers (at least an MPEnvWrapper),
-        seed: seed of environment
-        mp_kwargs: dict of at least {num_dof: int, num_basis: int} for DMP
-
-    Returns: DMP wrapped gym env
-
-    """
-    _verify_time_limit(mp_kwargs.get("duration", None), kwargs.get("time_limit", None))
-
-    _env = _make_wrapped_env(env_id=env_id, wrappers=wrappers, seed=seed, **kwargs)
-
-    _verify_dof(_env, mp_kwargs.get("num_dof"))
-
-    return DmpWrapper(_env, **mp_kwargs)
-
-
-def make_promp_env(env_id: str, wrappers: Iterable, seed=1, mp_kwargs={}, **kwargs):
-    """
-    This can also be used standalone for manually building a custom ProMP environment.
-    Args:
-        env_id: base_env_name,
-        wrappers: list of wrappers (at least an MPEnvWrapper),
-        mp_kwargs: dict of at least {num_dof: int, num_basis: int, width: int}
-
-    Returns: ProMP wrapped gym env
-
-    """
-    _verify_time_limit(mp_kwargs.get("duration", None), kwargs.get("time_limit", None))
-
-    _env = _make_wrapped_env(env_id=env_id, wrappers=wrappers, seed=seed, **kwargs)
-
-    _verify_dof(_env, mp_kwargs.get("num_dof"))
-
-    return ProMPWrapper(_env, **mp_kwargs)
-
-
-def make_dmp_env_helper(**kwargs):
-    """
-    Helper function for registering DMP gym environments.
-    Args:
-        **kwargs: expects at least the following:
-        {
-        "name": base_env_name,
-        "wrappers": list of wrappers (at least an MPEnvWrapper),
-        "mp_kwargs": dict of at least {num_dof: int, num_basis: int} for DMP
-        }
-
-    Returns: DMP wrapped gym env
-
-    """
-    seed = kwargs.pop("seed", None)
-    return make_dmp_env(env_id=kwargs.pop("name"), wrappers=kwargs.pop("wrappers"), seed=seed,
-                        mp_kwargs=kwargs.pop("mp_kwargs"), **kwargs)
-
-
-def make_promp_env_helper(**kwargs):
-    """
-    Helper function for registering ProMP gym environments.
-    This can also be used standalone for manually building a custom ProMP environment.
-    Args:
-        **kwargs: expects at least the following:
-        {
-        "name": base_env_name,
-        "wrappers": list of wrappers (at least an MPEnvWrapper),
-        "mp_kwargs": dict of at least {num_dof: int, num_basis: int, width: int}
-        }
-
-    Returns: ProMP wrapped gym env
-
-    """
-    seed = kwargs.pop("seed", None)
-    return make_promp_env(env_id=kwargs.pop("name"), wrappers=kwargs.pop("wrappers"), seed=seed,
-                          mp_kwargs=kwargs.pop("mp_kwargs"), **kwargs)
-
-
-def _verify_time_limit(mp_time_limit: Union[None, float], env_time_limit: Union[None, float]):
-    """
-    When using DMC check if a manually specified time limit matches the trajectory duration the MP receives.
-    Mostly, the time_limit for DMC is not specified and the default values from DMC are taken.
-    This check, however, can only be done after instantiating the environment.
-    It can be found in the BaseMP class.
-
-    Args:
-        mp_time_limit: max trajectory length of mp in seconds
-        env_time_limit: max trajectory length of DMC environment in seconds
-
-    Returns:
-
-    """
-    if mp_time_limit is not None and env_time_limit is not None:
-        assert mp_time_limit == env_time_limit, \
-            f"The specified 'time_limit' of {env_time_limit}s does not match " \
-            f"the duration of {mp_time_limit}s for the MP."
-
-
-def _verify_dof(base_env: gym.Env, dof: int):
-    action_shape = np.prod(base_env.action_space.shape)
-    assert dof == action_shape, \
-        f"The specified degrees of freedom ('num_dof') {dof} do not match " \
-        f"the action space size {action_shape} of the base environment"
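Review note: the `make_rank` docstring in the removed file describes a seeding scheme in which each environment gets `seed + rank`, and test environments start at an offset equal to the number of training environments. The sketch below only reproduces that seed arithmetic; `rank_seeds` is a hypothetical helper and does not create any environments.

```python
from typing import List


def rank_seeds(base_seed: int, n_envs: int) -> List[int]:
    """Seeds that `n_envs` ranked environments would receive: base_seed + rank."""
    return [base_seed + rank for rank in range(n_envs)]


n_train, n_test = 8, 5
train_seeds = rank_seeds(123, n_train)           # 123 through 130
test_seeds = rank_seeds(123 + n_train, n_test)   # 131 through 135

# Offsetting by the number of training environments keeps both sets disjoint.
overlap = set(train_seeds) & set(test_seeds)
```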
--- a/alr_envs/utils/utils.py
+++ b/alr_envs/utils/utils.py
@ -1,21 +0,0 @@
-import numpy as np
-
-
-def angle_normalize(x, type="deg"):
-    """
-    Normalize angle x to the half-open interval [-pi, pi).
-    Args:
-        x: Angle in either degrees or radians
-        type: one of "deg" or "rad" for x being in degrees or radians
-
-    Returns:
-
-    """
-
-    if type not in ["deg", "rad"]: raise ValueError(f"Invalid type {type}. Choose one of 'deg' or 'rad'.")
-
-    if type == "deg":
-        x = np.deg2rad(x)  # x * pi / 180
-
-    two_pi = 2 * np.pi
-    return x - two_pi * np.floor((x + np.pi) / two_pi)
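Review note: the removed `angle_normalize` wraps an angle with `x - 2*pi * floor((x + pi) / (2*pi))`, which maps any input into the half-open interval [-pi, pi). The following is a standard-library sketch of the same formula for scalar inputs. It uses `math` instead of NumPy, and the name `wrap_angle` is chosen here for illustration.

```python
import math


def wrap_angle(x: float, unit: str = "deg") -> float:
    """Wrap a scalar angle into [-pi, pi) and return it in radians."""
    if unit not in ("deg", "rad"):
        raise ValueError(f"Invalid unit {unit}. Choose one of 'deg' or 'rad'.")
    if unit == "deg":
        x = math.radians(x)
    two_pi = 2 * math.pi
    # Shift by pi, count the full turns, and subtract them again.
    return x - two_pi * math.floor((x + math.pi) / two_pi)


# 270 degrees is the same direction as -90 degrees.
quarter_turn_back = wrap_angle(270.0)
```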
--- a/fancy_gym/init.py
+++ b/fancy_gym/init.py
@ -0,0 +1,13 @@
+from fancy_gym import dmc, meta, open_ai
+from fancy_gym.utils.make_env_helpers import make, make_bb, make_rank
+from .dmc import ALL_DMC_MOVEMENT_PRIMITIVE_ENVIRONMENTS
+# Convenience function for all MP environments
+from .envs import ALL_FANCY_MOVEMENT_PRIMITIVE_ENVIRONMENTS
+from .meta import ALL_METAWORLD_MOVEMENT_PRIMITIVE_ENVIRONMENTS
+from .open_ai import ALL_GYM_MOVEMENT_PRIMITIVE_ENVIRONMENTS
+
+ALL_MOVEMENT_PRIMITIVE_ENVIRONMENTS = {
+    key: value + ALL_DMC_MOVEMENT_PRIMITIVE_ENVIRONMENTS[key] +
+         ALL_GYM_MOVEMENT_PRIMITIVE_ENVIRONMENTS[key] +
+         ALL_METAWORLD_MOVEMENT_PRIMITIVE_ENVIRONMENTS[key]
+    for key, value in ALL_FANCY_MOVEMENT_PRIMITIVE_ENVIRONMENTS.items()}
--- a/fancy_gym/black_box/init.py
+++ b/fancy_gym/black_box/init.py
--- a/fancy_gym/black_box/black_box_wrapper.py
+++ b/fancy_gym/black_box/black_box_wrapper.py
@ -0,0 +1,182 @@
+from typing import Tuple, Optional
+
+import gym
+import numpy as np
+from gym import spaces
+from mp_pytorch.mp.mp_interfaces import MPInterface
+
+from fancy_gym.black_box.controller.base_controller import BaseController
+from fancy_gym.black_box.raw_interface_wrapper import RawInterfaceWrapper
+from fancy_gym.utils.utils import get_numpy
+
+
+class BlackBoxWrapper(gym.ObservationWrapper):
+
+    def __init__(self,
+                 env: RawInterfaceWrapper,
+                 trajectory_generator: MPInterface,
+                 tracking_controller: BaseController,
+                 duration: float,
+                 verbose: int = 1,
+                 learn_sub_trajectories: bool = False,
+                 replanning_schedule: Optional[callable] = None,
+                 reward_aggregation: callable = np.sum
+                 ):
+        """
+        gym.Wrapper for leveraging a black box approach with a trajectory generator.
+
+        Args:
+            env: The (wrapped) environment this wrapper is applied on
+            trajectory_generator: Generates the full or partial trajectory
+            tracking_controller: Translates the desired trajectory to raw action sequences
+            duration: Length of the trajectory of the movement primitive in seconds
+            verbose: level of detail for returned values in info dict.
+            learn_sub_trajectories: Transforms full episode learning into learning sub-trajectories, similar to
+                step-based learning
+            replanning_schedule: callable that receives
+            reward_aggregation: function that takes the np.ndarray of step rewards as input and returns the trajectory
+                reward, default summation over all values.
+        """
+        super().__init__(env)
+
+        self.duration = duration
+        self.learn_sub_trajectories = learn_sub_trajectories
+        self.do_replanning = replanning_schedule is not None
+        self.replanning_schedule = replanning_schedule or (lambda *x: False)
+        self.current_traj_steps = 0
+
+        # trajectory generation
+        self.traj_gen = trajectory_generator
+        self.tracking_controller = tracking_controller
+        # self.time_steps = np.linspace(0, self.duration, self.traj_steps)
+        # self.traj_gen.set_mp_times(self.time_steps)
+        self.traj_gen.set_duration(self.duration - self.dt, self.dt)
+
+        # reward computation
+        self.reward_aggregation = reward_aggregation
+
+        # spaces
+        self.return_context_observation = not (learn_sub_trajectories or self.do_replanning)
+        self.traj_gen_action_space = self._get_traj_gen_action_space()
+        self.action_space = self._get_action_space()
+        self.observation_space = self._get_observation_space()
+
+        # rendering
+        self.render_kwargs = {}
+        self.verbose = verbose
+
+    def observation(self, observation):
+        # return only the context observation if we neither learn sub-trajectories nor replan
+        if self.return_context_observation:
+            observation = observation[self.env.context_mask]
+        # cast dtype because metaworld returns an incorrect dtype, which makes gym raise an error
+        return observation.astype(self.observation_space.dtype)
+
+    def get_trajectory(self, action: np.ndarray) -> Tuple:
+        clipped_params = np.clip(action, self.traj_gen_action_space.low, self.traj_gen_action_space.high)
+        self.traj_gen.set_params(clipped_params)
+        bc_time = np.array(0 if not self.do_replanning else self.current_traj_steps * self.dt)
+        # TODO we could think about initializing with the previous desired value in order to have a smooth transition
+        #  at least from the planning point of view.
+        self.traj_gen.set_boundary_conditions(bc_time, self.current_pos, self.current_vel)
+        duration = None if self.learn_sub_trajectories else self.duration
+        self.traj_gen.set_duration(duration, self.dt)
+        # traj_dict = self.traj_gen.get_trajs(get_pos=True, get_vel=True)
+        trajectory = get_numpy(self.traj_gen.get_traj_pos())
+        velocity = get_numpy(self.traj_gen.get_traj_vel())
+
+        # Remove first element of trajectory as this is the current position and velocity
+        # trajectory = trajectory[1:]
+        # velocity = velocity[1:]
+
+        return trajectory, velocity
+
+    def _get_traj_gen_action_space(self):
+        """This function can be used to set up an individual space for the parameters of the traj_gen."""
+        min_action_bounds, max_action_bounds = self.traj_gen.get_params_bounds()
+        action_space = gym.spaces.Box(low=min_action_bounds.numpy(), high=max_action_bounds.numpy(),
+                                      dtype=self.env.action_space.dtype)
+        return action_space
+
+    def _get_action_space(self):
+        """
+        This function can be used to modify the action space for considering actions which are not learned via movement
+        primitives. E.g. ball releasing time for the beer pong task. By default, it is the parameter space of the
+        movement primitive.
+        Only needs to be overwritten if the action space needs to be modified.
+        """
+        try:
+            return self.traj_gen_action_space
+        except AttributeError:
+            return self._get_traj_gen_action_space()
+
+    def _get_observation_space(self):
+        if self.return_context_observation:
+            mask = self.env.context_mask
+            # restrict the observation bounds to the context dimensions
+            min_obs_bound = self.env.observation_space.low[mask]
+            max_obs_bound = self.env.observation_space.high[mask]
+            return spaces.Box(low=min_obs_bound, high=max_obs_bound, dtype=self.env.observation_space.dtype)
+        return self.env.observation_space
+
+    def step(self, action: np.ndarray):
+        """Generate a trajectory from the MP parameters and step the wrapped environment along it."""
+
+        # TODO remove this part, right now only needed for beer pong
+        mp_params, env_spec_params = self.env.episode_callback(action, self.traj_gen)
+        trajectory, velocity = self.get_trajectory(mp_params)
+
+        trajectory_length = len(trajectory)
+        rewards = np.zeros(shape=(trajectory_length,))
+        if self.verbose >= 2:
+            actions = np.zeros(shape=(trajectory_length,) + self.env.action_space.shape)
+            observations = np.zeros(shape=(trajectory_length,) + self.env.observation_space.shape,
+                                    dtype=self.env.observation_space.dtype)
+
+        infos = dict()
+        done = False
+
+        for t, (pos, vel) in enumerate(zip(trajectory, velocity)):
+            step_action = self.tracking_controller.get_action(pos, vel, self.current_pos, self.current_vel)
+            c_action = np.clip(step_action, self.env.action_space.low, self.env.action_space.high)
+            obs, c_reward, done, info = self.env.step(c_action)
+            rewards[t] = c_reward
+
+            if self.verbose >= 2:
+                actions[t, :] = c_action
+                observations[t, :] = obs
+
+            for k, v in info.items():
+                elems = infos.get(k, [None] * trajectory_length)
+                elems[t] = v
+                infos[k] = elems
+
+            if self.render_kwargs:
+                self.env.render(**self.render_kwargs)
+
+            if done or self.replanning_schedule(self.current_pos, self.current_vel, obs, c_action,
+                                                t + 1 + self.current_traj_steps):
+                break
+
+        # keep the entries of all executed steps, including the last one at index t
+        infos.update({k: v[:t + 1] for k, v in infos.items()})
+        self.current_traj_steps += t + 1
+
+        if self.verbose >= 2:
+            infos['positions'] = trajectory
+            infos['velocities'] = velocity
+            infos['step_actions'] = actions[:t + 1]
+            infos['step_observations'] = observations[:t + 1]
+            infos['step_rewards'] = rewards[:t + 1]
+
+        infos['trajectory_length'] = t + 1
+        trajectory_return = self.reward_aggregation(rewards[:t + 1])
+        return self.observation(obs), trajectory_return, done, infos
+
+    def render(self, **kwargs):
+        """Only set render options here, such that they can be used during the rollout.
+        This only needs to be called once"""
+        self.render_kwargs = kwargs
+
+    def reset(self, *, seed: Optional[int] = None, return_info: bool = False, options: Optional[dict] = None):
+        self.current_traj_steps = 0
+        return super(BlackBoxWrapper, self).reset()
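The rollout loop in `step` above can be sketched without gym or a trajectory generator. This is a simplified illustration, not the wrapper itself: `rollout` and `toy_step` are hypothetical names, positions and velocities are passed to the schedule as `None`, and Python's built-in `sum` stands in for `np.sum`. It shows the argument order that a `replanning_schedule` callable receives and how the per-step info values are truncated to the executed steps.

```python
from typing import Any, Callable, Dict, List, Sequence, Tuple


def rollout(step_fn: Callable[[Any], Tuple[Any, float, bool, Dict[str, Any]]],
            actions: Sequence[Any],
            replanning_schedule: Callable[..., bool] = lambda *x: False,
            reward_aggregation: Callable[[List[float]], float] = sum,
            steps_so_far: int = 0) -> Tuple[float, bool, Dict[str, Any]]:
    """Execute `actions` step by step until done or until the schedule asks to replan."""
    rewards: List[float] = []
    infos: Dict[str, Any] = {}
    done = False
    t = -1  # stays -1 if there is nothing to execute
    for t, action in enumerate(actions):
        obs, reward, done, info = step_fn(action)
        rewards.append(reward)
        # collect every info key into a list with one slot per planned step
        for k, v in info.items():
            infos.setdefault(k, [None] * len(actions))[t] = v
        # same argument order as in the wrapper: pos, vel, obs, action, step count
        if done or replanning_schedule(None, None, obs, action, t + 1 + steps_so_far):
            break
    executed = t + 1
    # keep only the slots of steps that were actually executed
    infos = {k: v[:executed] for k, v in infos.items()}
    infos["trajectory_length"] = executed
    return reward_aggregation(rewards), done, infos


def toy_step(action):
    """Hypothetical environment step: constant reward, never terminates."""
    return action, 1.0, False, {"step": action}


# Replan after every third executed step.
every_three = lambda pos, vel, obs, action, steps: steps % 3 == 0
ret, done, infos = rollout(toy_step, list(range(10)), replanning_schedule=every_three)
print(ret, done, infos)
```

With this schedule only three of the ten planned steps are executed before the loop breaks, so the aggregated reward and the info lists cover exactly those three steps.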
--- a/alr_envs/alr/classic_control/base_reacher/__init__.py
+++ b/alr_envs/alr/classic_control/base_reacher/__init__.py