
<h1 align="center">
<br>
<img src='https://raw.githubusercontent.com/ALRhub/fancy_gym/master/icon.svg' width="250px">
<br><br>
<b>Fancy Gym</b>
<br><br>
</h1>
Built upon the foundation of [Gymnasium](https://gymnasium.farama.org/) (a maintained fork of OpenAI's renowned Gym library), `fancy_gym` offers a comprehensive collection of reinforcement learning environments.
**Key Features**:
- **New Challenging Environments**: `fancy_gym` includes several new environments (Panda Box Pushing, Table Tennis, etc.) that present a higher degree of difficulty, pushing the boundaries of reinforcement learning research.
- **Support for Movement Primitives**: `fancy_gym` supports a range of movement primitives (MPs), including Dynamic Movement Primitives (DMPs), Probabilistic Movement Primitives (ProMP), and Probabilistic Dynamic Movement Primitives (ProDMP).
- **Upgrade to Movement Primitives**: With our framework, it's straightforward to transform standard Gymnasium environments into environments that support movement primitives.
- **Benchmark Suite Compatibility**: `fancy_gym` makes it easy to access renowned benchmark suites such as [DeepMind Control](https://deepmind.com/research/publications/2020/dm-control-Software-and-Tasks-for-Continuous-Control) and [Metaworld](https://meta-world.github.io/), whether you want to use them in the regular step-based setting or using MPs.
- **Contribute Your Own Environments**: If you're inspired to create custom gym environments, both step-based and with movement primitives, this [guide](https://gymnasium.farama.org/tutorials/gymnasium_basics/environment_creation/) will assist you. We encourage and highly appreciate submissions via PRs to integrate these environments into `fancy_gym`.
## Movement Primitive Environments (Episode-Based/Black-Box Environments)
<p align="justify">
In step-based environments, actions are determined step by step, with each individual observation directly mapped to a corresponding action. Contrary to this, in episodic MP-based (Movement Primitive-based) environments, the process is different. Here, rather than responding to individual observations, a broader context is considered at the start of each episode. This context is used to define parameters for Movement Primitives (MPs), which then describe a complete trajectory. The trajectory is executed over the entire episode using a tracking controller, allowing for the enactment of complex, continuous sequences of actions. This approach contrasts with the discrete, moment-to-moment decision-making of step-based environments and integrates concepts from stochastic search and black-box optimization, commonly found in classical robotics and control.
</p>
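To make this concrete, here is a minimal, self-contained toy sketch of the MP idea (this is not `fancy_gym` code; all function names and the linear context-to-weights map are hypothetical simplifications): an episode context is mapped to a weight vector, and normalized radial basis functions turn those weights into a complete trajectory that a tracking controller could then execute over the episode.

```python
import numpy as np

def rbf_basis(t, n_basis=5, width=0.02):
    # Radial basis functions spread over the normalized episode time [0, 1]
    centers = np.linspace(0, 1, n_basis)
    phi = np.exp(-(t[:, None] - centers[None, :]) ** 2 / (2 * width))
    return phi / phi.sum(axis=1, keepdims=True)  # normalize per time step

def context_to_trajectory(context, horizon=100, n_basis=5):
    # Toy "policy": a fixed linear map from the episode context to MP weights
    rng = np.random.default_rng(0)
    W = rng.standard_normal((n_basis, context.shape[0]))
    weights = W @ context                   # one parameter vector per episode
    t = np.linspace(0, 1, horizon)
    return rbf_basis(t, n_basis) @ weights  # full trajectory for the episode

# One episode: the context is observed once, then the whole trajectory is fixed
traj = context_to_trajectory(np.array([0.3, -0.7]))
```

In a real MP environment the weights come from a learned policy and the primitive is a DMP/ProMP/ProDMP rather than plain basis functions, but the episode-level structure is the same.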
For a more extensive explanation, please have a look at our Documentation-TODO:Link.
## Installation
We recommend installing `fancy_gym` into a virtual environment as provided by [venv](https://docs.python.org/3/library/venv.html). 3rd party alternatives to venv like [Poetry](https://python-poetry.org/) or [Conda](https://docs.conda.io/en/latest/) can also be used.
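For example, a typical venv setup could look like this (the folder name `.venv` is just a common convention, not a requirement):

```shell
# Create a new virtual environment in the folder .venv
python3 -m venv .venv
# Activate it (Linux/macOS; on Windows use .venv\Scripts\activate)
. .venv/bin/activate
```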
### Installation from PyPI (recommended)
Install `fancy_gym` via
```bash
pip install fancy_gym
```
We have a few optional dependencies. If you also want to install those, use
```bash
# to install all optional dependencies
pip install 'fancy_gym[all]'
# or choose only those you want
pip install 'fancy_gym[dmc,box2d,mujoco-legacy,jax,testing]'
```
Pip cannot automatically install up-to-date versions of metaworld, since they are not available on PyPI yet.
Install metaworld via
```bash
pip install metaworld@git+https://github.com/Farama-Foundation/Metaworld.git@d155d0051630bb365ea6a824e02c66c068947439#egg=metaworld
```
### Installation from master
1. Clone the repository
```bash
git clone git@github.com:ALRhub/fancy_gym.git
```
2. Go to the folder
```bash
cd fancy_gym
```
3. Install with
```bash
pip install -e .
```
We have a few optional dependencies. If you also want to install those, use
```bash
# to install all optional dependencies
pip install -e '.[all]'
# or choose only those you want
pip install -e '.[dmc,box2d,mujoco-legacy,jax,testing]'
```
Metaworld has to be installed manually with
```bash
pip install metaworld@git+https://github.com/Farama-Foundation/Metaworld.git@d155d0051630bb365ea6a824e02c66c068947439#egg=metaworld
```
## How to use Fancy Gym
Documentation for `fancy_gym` is available at TODO:Link. Usage examples can be found here-TODO:Link.
### Step-Based Environments
Regular step-based environments added by Fancy Gym are registered under the `fancy/` namespace.
| &#x2757; Legacy versions of Fancy Gym used `fancy_gym.make(...)`. This is no longer supported and will raise an Exception on new versions. |
| ------------------------------------------------------------------------------------------------------------------------------------------ |
```python
import gymnasium as gym
import fancy_gym
env = gym.make('fancy/Reacher5d-v0')
# or env = gym.make('metaworld/reach-v2') # fancy_gym allows access to all metaworld ML1 tasks via the metaworld/ NS
# or env = gym.make('dm_control/ball_in_cup-catch-v0')
# or env = gym.make('Reacher-v2')
observation, info = env.reset(seed=1)
for i in range(1000):
action = env.action_space.sample()
observation, reward, terminated, truncated, info = env.step(action)
if i % 5 == 0:
env.render()
if terminated or truncated:
observation, info = env.reset()
```
A list of all included environments is available here-TODO:Link.
### Black-box Environments
Existing MP tasks can be created the same way as above. The namespace of an MP-variant of an environment is given by `<original namespace>_<MP name>/`.
Just keep in mind that calling `step()` executes a full trajectory.
```python
import gymnasium as gym
import fancy_gym
env = gym.make('fancy_ProMP/Reacher5d-v0')
# or env = gym.make('metaworld_ProDMP/reach-v2')
# or env = gym.make('dm_control_DMP/ball_in_cup-catch-v0')
# or env = gym.make('gym_ProMP/Reacher-v2') # mp versions of envs added directly by gymnasium are in the gym_<MP-type> NS
# render() can be called once in the beginning with all necessary arguments.
# To turn it off again, just call render() without any arguments.
env.render(mode='human')
# This returns the context information, not the full state observation
observation, info = env.reset(seed=1)
for i in range(5):
action = env.action_space.sample()
observation, reward, terminated, truncated, info = env.step(action)
# terminated or truncated is always True as we are working on the episode level, hence we always reset()
observation, info = env.reset()
```
A list of all included MP environments is available here-TODO:Link.
### How to create a new MP task
We refer to our Documentation for a complete description-TODO:Link.
If the step-based environment is already registered with gym, you can simply do the following:
```python
fancy_gym.upgrade(
id='custom/cool_new_env-v0',
mp_wrapper=my_custom_MPWrapper
)
```
If the step-based environment is not yet registered with gym, we can add both the step-based and MP-versions via
```python
fancy_gym.register(
id='custom/cool_new_env-v0',
entry_point=my_custom_env,
mp_wrapper=my_custom_MPWrapper
)
```
As for how to write custom MP-Wrappers, please have a look at our Documentation-TODO:Link.
From this point on, you can access the MP-versions of your environments via
```python
env = gym.make('custom_ProDMP/cool_new_env-v0')
rewards = 0
observation, info = env.reset()
# number of samples/full trajectories (multiple environment steps)
for i in range(5):
ac = env.action_space.sample()
observation, reward, terminated, truncated, info = env.step(ac)
rewards += reward
if terminated or truncated:
print(rewards)
rewards = 0
observation, info = env.reset()
```
## Citing the Project
To cite this repository in publications:
```bibtex
@software{fancy_gym,
title = {Fancy Gym},
author = {Otto, Fabian and Celik, Onur and Roth, Dominik and Zhou, Hongyi},
abstract = {Fancy Gym: Unifying interface for various RL benchmarks with support for Black Box approaches.},
url = {https://github.com/ALRhub/fancy_gym},
organization = {Autonomous Learning Robots Lab (ALR) at KIT},
}
```
## Icon Attribution
The icon is based on the [Gymnasium](https://github.com/Farama-Foundation/Gymnasium) icon as can be found [here](https://gymnasium.farama.org/_static/img/gymnasium_black.svg).