Creating new MP Environments
----------------------------
In case a required task is not supported yet in the MP framework, it can
be created relatively easily. For the task at hand, the following
`interface <https://github.com/ALRhub/fancy_gym/tree/master/fancy_gym/black_box/raw_interface_wrapper.py>`__
needs to be implemented.

.. code:: python

    from abc import abstractmethod
    from typing import Union, Tuple

    import gymnasium as gym
    import numpy as np


    class RawInterfaceWrapper(gym.Wrapper):
        mp_config = {
            'ProMP': {},
            'DMP': {},
            'ProDMP': {},
        }

        @property
        def context_mask(self) -> np.ndarray:
            """
            Returns a boolean mask of the same shape as the observation space.
            It determines whether the observation is returned for the contextual case or not.
            This effectively allows filtering unwanted or unnecessary observations from the full step-based case.
            E.g. velocities starting at 0 only change after the first action. Given we only receive the
            context/part of the first observation, the velocities are not necessary in the observation for the task.

            Returns:
                bool array representing the indices of the observations
            """
            return np.ones(self.env.observation_space.shape[0], dtype=bool)

        @property
        @abstractmethod
        def current_pos(self) -> Union[float, int, np.ndarray, Tuple]:
            """
            Returns the current position of the action/control dimension.
            The dimensionality has to match the action/control dimension.
            This is not required when exclusively using velocity control;
            it should, however, be implemented regardless.
            E.g. the joint positions that are directly or indirectly controlled by the action.
            """
            raise NotImplementedError()

        @property
        @abstractmethod
        def current_vel(self) -> Union[float, int, np.ndarray, Tuple]:
            """
            Returns the current velocity of the action/control dimension.
            The dimensionality has to match the action/control dimension.
            This is not required when exclusively using position control;
            it should, however, be implemented regardless.
            E.g. the joint velocities that are directly or indirectly controlled by the action.
            """
            raise NotImplementedError()

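As a concrete illustration, a wrapper for a hypothetical 7-DoF reaching task
could look as follows. This is only a sketch: the joint layout, the assumed
observation layout (7 joint positions followed by 7 joint velocities), and the
``qpos``/``qvel`` attributes of the wrapped environment are assumptions made
for this example and are not part of fancy_gym.

.. code:: python

    from typing import Tuple, Union

    import numpy as np

    from fancy_gym.black_box.raw_interface_wrapper import RawInterfaceWrapper


    class MyReachingMPWrapper(RawInterfaceWrapper):
        mp_config = {
            'ProMP': {},
            'DMP': {},
            'ProDMP': {},
        }

        @property
        def context_mask(self) -> np.ndarray:
            # Assumed observation layout: 7 joint positions, then 7 joint velocities.
            # The velocities start at 0 and carry no contextual information,
            # so they are masked out.
            return np.concatenate([
                np.ones(7, dtype=bool),   # joint positions -> keep as context
                np.zeros(7, dtype=bool),  # joint velocities -> drop
            ])

        @property
        def current_pos(self) -> Union[float, int, np.ndarray, Tuple]:
            # Hypothetical attributes of the wrapped step-based environment.
            return self.env.unwrapped.data.qpos[:7].copy()

        @property
        def current_vel(self) -> Union[float, int, np.ndarray, Tuple]:
            return self.env.unwrapped.data.qvel[:7].copy()
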
Default configurations for MPs can be overwritten by defining attributes
in mp_config. Available parameters are documented in the `MP_PyTorch
Userguide <https://github.com/ALRhub/MP_PyTorch/blob/main/doc/README.md>`__.

.. code:: python

    class RawInterfaceWrapper(gym.Wrapper):
        mp_config = {
            'ProMP': {
                'phase_generator_kwargs': {
                    'phase_generator_type': 'linear'
                    # When selecting another generator type, the default configuration will not be merged for the attribute.
                },
                'controller_kwargs': {
                    'p_gains': 0.5 * np.array([1.0, 4.0, 2.0, 4.0, 1.0, 4.0, 1.0]),
                    'd_gains': 0.5 * np.array([0.1, 0.4, 0.2, 0.4, 0.1, 0.4, 0.1]),
                },
                'basis_generator_kwargs': {
                    'num_basis': 3,
                    'num_basis_zero_start': 1,
                    'num_basis_zero_goal': 1,
                },
            },
            'DMP': {},
            'ProDMP': {},
        }

        [...]

If you created a new task wrapper, feel free to open a PR so we can
integrate it for others to use as well. Without the integration, the task
can still be used. A rough outline is shown here; for more details,
we recommend having a look at the
`examples <https://github.com/ALRhub/fancy_gym/tree/master/fancy_gym/examples/>`__.

If the step-based environment is already registered with gym, you can simply do
the following:

.. code:: python

    fancy_gym.upgrade(
        id='custom/cool_new_env-v0',
        mp_wrapper=my_custom_MPWrapper
    )

If the step-based environment is not yet registered with gym, we can add both the
step-based and the MP-versions via

.. code:: python

    fancy_gym.register(
        id='custom/cool_new_env-v0',
        entry_point=my_custom_env,
        mp_wrapper=my_custom_MPWrapper
    )

From this point on, you can access the MP-versions of your environment via

.. code:: python

    env = gym.make('custom_ProDMP/cool_new_env-v0')

    rewards = 0
    observation, info = env.reset()

    # number of samples/full trajectories (multiple environment steps)
    for i in range(5):
        ac = env.action_space.sample()
        observation, reward, terminated, truncated, info = env.step(ac)
        rewards += reward

        if terminated or truncated:
            print(rewards)
            rewards = 0
            observation, info = env.reset()
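
For completeness, the step-based version registered alongside the MP-versions
remains available under its original id. A minimal sketch of a plain
per-step rollout:

.. code:: python

    env = gym.make('custom/cool_new_env-v0')

    observation, info = env.reset()

    # plain per-step interaction instead of full-trajectory MP samples
    for _ in range(200):
        action = env.action_space.sample()
        observation, reward, terminated, truncated, info = env.step(action)

        if terminated or truncated:
            observation, info = env.reset()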