'])
\ No newline at end of file
diff --git a/docs/source/guide/episodic_rl.rst b/docs/source/guide/episodic_rl.rst
new file mode 100644
index 0000000..4c0b7bc
--- /dev/null
+++ b/docs/source/guide/episodic_rl.rst
@@ -0,0 +1,50 @@
+What is Episodic RL?
+--------------------
+
+Movement primitive (MP) environments differ from traditional step-based
+environments. They align more with concepts from stochastic search,
+black-box optimization, and methods commonly found in classical robotics
+and control. Instead of individual steps, MP environments operate on an
+episode basis, executing complete trajectories. These trajectories are
+produced by trajectory generators like Dynamic Movement Primitives
+(DMP), Probabilistic Movement Primitives (ProMP) or Probabilistic
+Dynamic Movement Primitives (ProDMP).
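+
+As a minimal sketch (``fancy_ProMP/Reacher5d-v0`` is one of the
+provided tasks), a single ``env.step()`` call consumes one set of MP
+parameters and rolls out the complete trajectory:
+
+.. code:: python
+
+    import gymnasium as gym
+    import fancy_gym  # noqa: F401, registers the fancy_* namespaces on import
+
+    env = gym.make('fancy_ProMP/Reacher5d-v0')
+    observation, info = env.reset()
+
+    # One action parametrizes a whole trajectory, not a single control step.
+    action = env.action_space.sample()
+    observation, reward, terminated, truncated, info = env.step(action)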
+
+Once generated, these trajectories are converted into step-by-step
+actions using a trajectory tracking controller. The specific controller
+chosen depends on the environment’s requirements. Currently, we support
+position, velocity, and PD-Controllers tailored for position, velocity,
+and torque control. Additionally, we have a specialized controller
+designed for the MetaWorld control suite.
+
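+As a rough sketch of how this looks in practice (the keys follow the
+``mp_config`` mechanism described in "Creating new MP Environments";
+the controller type name and the gain values here are illustrative
+assumptions), the tracking controller can be selected and tuned via a
+wrapper's configuration:
+
+.. code:: python
+
+    import numpy as np
+
+    # Illustrative override inside a wrapper's mp_config; values are placeholders.
+    mp_config = {
+        'ProMP': {
+            'controller_kwargs': {
+                'controller_type': 'motor',  # assumed name of the PD-controller
+                'p_gains': np.array([1.0, 1.0]),
+                'd_gains': np.array([0.1, 0.1]),
+            },
+        },
+    }
+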
+While the overarching objective of MP environments remains the learning
+of an optimal policy, the actions here represent the parametrization of
+motion primitives to craft the right trajectory. Our framework further
+enhances this by accommodating a contextual setting. At the episode’s
+onset, we present the context space—a subset of the observation space.
+This demands the prediction of a new action or MP parametrization for
+every unique context.
+
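+A sketch of the contextual setting (again using the provided
+``fancy_ProMP/Reacher5d-v0`` task): each reset yields a new context,
+for which a new MP parametrization has to be predicted.
+
+.. code:: python
+
+    import gymnasium as gym
+    import fancy_gym  # noqa: F401
+
+    env = gym.make('fancy_ProMP/Reacher5d-v0')
+
+    for _ in range(3):
+        # The returned observation is the context for this episode.
+        context, info = env.reset()
+        # A policy would map this context to MP parameters; we sample randomly.
+        action = env.action_space.sample()
+        observation, reward, terminated, truncated, info = env.step(action)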
\ No newline at end of file
diff --git a/docs/source/guide/installation.rst b/docs/source/guide/installation.rst
new file mode 100644
index 0000000..5885d43
--- /dev/null
+++ b/docs/source/guide/installation.rst
@@ -0,0 +1,72 @@
+Installation
+------------
+
+We recommend installing ``fancy_gym`` into a virtual environment as
+provided by `venv <https://docs.python.org/3/library/venv.html>`__.
+Third-party alternatives to venv like `Poetry <https://python-poetry.org/>`__
+or `Conda <https://docs.conda.io/>`__ can also be used.
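+
+For example, a fresh venv can be created and activated via
+
+.. code:: bash
+
+   python3 -m venv env
+   source env/bin/activate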
+
+Installation from PyPI (recommended)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Install ``fancy_gym`` via
+
+.. code:: bash
+
+ pip install fancy_gym
+
+We have a few optional dependencies. If you also want to install
+those, use
+
+.. code:: bash
+
+ # to install all optional dependencies
+ pip install 'fancy_gym[all]'
+
+ # or choose only those you want
+ pip install 'fancy_gym[dmc,box2d,mujoco-legacy,jax,testing]'
+
+Pip cannot automatically install up-to-date versions of metaworld,
+since they are not available on PyPI yet. Install metaworld via
+
+.. code:: bash
+
+ pip install metaworld@git+https://github.com/Farama-Foundation/Metaworld.git@d155d0051630bb365ea6a824e02c66c068947439#egg=metaworld
+
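+To quickly check the installation, you can create one of the provided
+environments (``fancy/Reacher5d-v0`` is used here as an example):
+
+.. code:: python
+
+    import gymnasium as gym
+    import fancy_gym  # noqa: F401, registers the fancy_* namespaces on import
+
+    env = gym.make('fancy/Reacher5d-v0')
+    observation, info = env.reset()
+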
+Installation from master
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+1. Clone the repository
+
+.. code:: bash
+
+ git clone git@github.com:ALRhub/fancy_gym.git
+
+2. Go to the folder
+
+.. code:: bash
+
+ cd fancy_gym
+
+3. Install with
+
+.. code:: bash
+
+ pip install -e .
+
+We have a few optional dependencies. If you also want to install
+those, use
+
+.. code:: bash
+
+ # to install all optional dependencies
+ pip install -e '.[all]'
+
+ # or choose only those you want
+ pip install -e '.[dmc,box2d,mujoco-legacy,jax,testing]'
+
+Metaworld has to be installed manually with
+
+.. code:: bash
+
+ pip install metaworld@git+https://github.com/Farama-Foundation/Metaworld.git@d155d0051630bb365ea6a824e02c66c068947439#egg=metaworld
\ No newline at end of file
diff --git a/docs/source/guide/upgrading_envs.rst b/docs/source/guide/upgrading_envs.rst
new file mode 100644
index 0000000..f04e8a0
--- /dev/null
+++ b/docs/source/guide/upgrading_envs.rst
@@ -0,0 +1,136 @@
+Creating new MP Environments
+----------------------------
+
+In case a required task is not yet supported in the MP framework, it
+can be created relatively easily. For the task at hand, the following
+``RawInterfaceWrapper`` interface needs to be implemented.
+
+.. code:: python
+
+ from abc import abstractmethod
+ from typing import Union, Tuple
+
+ import gymnasium as gym
+ import numpy as np
+
+
+ class RawInterfaceWrapper(gym.Wrapper):
+ mp_config = {
+ 'ProMP': {},
+ 'DMP': {},
+ 'ProDMP': {},
+ }
+
+ @property
+ def context_mask(self) -> np.ndarray:
+ """
+ Returns boolean mask of the same shape as the observation space.
+ It determines whether the observation is returned for the contextual case or not.
+        This effectively allows filtering unwanted or unnecessary observations from the full step-based case.
+        E.g. velocities that start at 0 only change after the first action. Since we only receive the
+        context, i.e. part of the first observation, the velocities are not necessary in the observation for the task.
+ Returns:
+ bool array representing the indices of the observations
+ """
+ return np.ones(self.env.observation_space.shape[0], dtype=bool)
+
+ @property
+ @abstractmethod
+ def current_pos(self) -> Union[float, int, np.ndarray, Tuple]:
+ """
+ Returns the current position of the action/control dimension.
+ The dimensionality has to match the action/control dimension.
+        This is not required when exclusively using velocity control;
+        it should, however, be implemented regardless.
+ E.g. The joint positions that are directly or indirectly controlled by the action.
+ """
+ raise NotImplementedError()
+
+ @property
+ @abstractmethod
+ def current_vel(self) -> Union[float, int, np.ndarray, Tuple]:
+ """
+ Returns the current velocity of the action/control dimension.
+ The dimensionality has to match the action/control dimension.
+        This is not required when exclusively using position control;
+        it should, however, be implemented regardless.
+ E.g. The joint velocities that are directly or indirectly controlled by the action.
+ """
+ raise NotImplementedError()
+
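+As a concrete illustration, a minimal wrapper for a hypothetical
+two-joint MuJoCo-based task could look as follows. The import path and
+the ``data.qpos``/``data.qvel`` access are assumptions; adapt them to
+your environment.
+
+.. code:: python
+
+    import numpy as np
+    from fancy_gym.black_box.raw_interface_wrapper import RawInterfaceWrapper  # assumed path
+
+
+    class MyMPWrapper(RawInterfaceWrapper):
+
+        @property
+        def context_mask(self) -> np.ndarray:
+            # Expose only the first two entries (e.g. joint positions) as context.
+            mask = np.zeros(self.env.observation_space.shape[0], dtype=bool)
+            mask[:2] = True
+            return mask
+
+        @property
+        def current_pos(self) -> np.ndarray:
+            # Joint positions controlled by the action (hypothetical layout).
+            return self.env.unwrapped.data.qpos[:2].copy()
+
+        @property
+        def current_vel(self) -> np.ndarray:
+            # Matching joint velocities.
+            return self.env.unwrapped.data.qvel[:2].copy()
+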
+Default configurations for MPs can be overwritten by defining attributes
+in ``mp_config``. Available parameters are documented in the
+`MP_PyTorch <https://github.com/ALRhub/MP_PyTorch>`__ Userguide.
+
+.. code:: python
+
+ class RawInterfaceWrapper(gym.Wrapper):
+ mp_config = {
+ 'ProMP': {
+ 'phase_generator_kwargs': {
+ 'phase_generator_type': 'linear'
+ # When selecting another generator type, the default configuration will not be merged for the attribute.
+ },
+ 'controller_kwargs': {
+ 'p_gains': 0.5 * np.array([1.0, 4.0, 2.0, 4.0, 1.0, 4.0, 1.0]),
+ 'd_gains': 0.5 * np.array([0.1, 0.4, 0.2, 0.4, 0.1, 0.4, 0.1]),
+ },
+ 'basis_generator_kwargs': {
+ 'num_basis': 3,
+ 'num_basis_zero_start': 1,
+ 'num_basis_zero_goal': 1,
+ },
+ },
+ 'DMP': {},
+        'ProDMP': {},
+ }
+
+ [...]
+
+If you create a new task wrapper, feel free to open a PR so we can
+integrate it for others to use as well. Even without the integration,
+the task can still be used. A rough outline is shown below; for more
+details, we recommend having a look at the examples shipped with
+``fancy_gym``.
+
+If the step-based environment is already registered with gym, you can
+simply do the following:
+
+.. code:: python
+
+ fancy_gym.upgrade(
+ id='custom/cool_new_env-v0',
+ mp_wrapper=my_custom_MPWrapper
+ )
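+
+This should register MP versions of the environment (e.g.
+``custom_ProDMP/cool_new_env-v0``, as used below) alongside the
+existing step-based one.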
+
+If the step-based environment is not yet registered with gym, we can
+add both the step-based and MP versions via
+
+.. code:: python
+
+ fancy_gym.register(
+ id='custom/cool_new_env-v0',
+ entry_point=my_custom_env,
+ mp_wrapper=my_custom_MPWrapper
+ )
+
+From this point on, you can access the MP versions of your environments via
+
+.. code:: python
+
+ env = gym.make('custom_ProDMP/cool_new_env-v0')
+
+ rewards = 0
+ observation, info = env.reset()
+
+ # number of samples/full trajectories (multiple environment steps)
+ for i in range(5):
+ ac = env.action_space.sample()
+ observation, reward, terminated, truncated, info = env.step(ac)
+ rewards += reward
+
+ if terminated or truncated:
+ print(rewards)
+ rewards = 0
+ observation, info = env.reset()
\ No newline at end of file