More README

This commit is contained in:
kngwyu 2020-12-11 21:45:57 +09:00
parent f52213b16d
commit 0f09ba6547

View File

@ -6,7 +6,24 @@
Some maze environments for reinforcement learning(RL) using [mujoco-py] and Some maze environments for reinforcement learning(RL) using [mujoco-py] and
[openai gym][gym]. [openai gym][gym].
Thankfully, this project is based on the code from [rllab] and [tensorflow/models][models]. Thankfully, this project is based on the code from [rllab] and
[tensorflow/models][models].
Note that [d4rl][d4rl] and [dm_control][dm_control] have similar maze
environment, and you can also check them.
But if you want more customizable or minimal one, I recommend this.
## Usage
Importing `mujoco_maze` registers environments and environments listed
below are available via `gym.make`.
E.g.,
```python
import gym
import mujoco_maze # noqa
env = gym.make("Ant4Rooms-v0")
```
## Environments ## Environments
@ -42,6 +59,56 @@ Thankfully, this project is based on the code from [rllab] and [tensorflow/mode
- PointBilliard-v1 (Goal-based Reward) - PointBilliard-v1 (Goal-based Reward)
- PointBilliard-v2 (Multiple Goals (0.5 pt or 1.0 pt)) - PointBilliard-v2 (Multiple Goals (0.5 pt or 1.0 pt))
## Customize Environments
You can define your own task by using components in `maze_task.py`,
like:
```
import gym
import numpy as np
from mujoco_maze.maze_env_utils import MazeCell
from mujoco_maze.maze_task import MazeGoal, MazeTask
from mujoco_maze.point import PointEnv
class GoalRewardEMaze(MazeTask):
REWARD_THRESHOLD: float = 0.9
PENALTY: float = -0.0001
def __init__(self, scale):
super().__init__(scale)
self.goals = [MazeGoal(np.array([0.0, 4.0]) * scale)]
def reward(self, obs):
return 1.0 if self.termination(obs) else self.PENALTY
@staticmethod
def create_maze():
E, B, R = MazeCell.EMPTY, MazeCell.BLOCK, MazeCell.ROBOT
return [
[B, B, B, B, B],
[B, R, E, E, B],
[B, B, B, E, B],
[B, E, E, E, B],
[B, B, B, E, B],
[B, E, E, E, B],
[B, B, B, B, B],
]
gym.envs.register(
id="PointEMaze-v0",
entry_point="mujoco_maze.maze_env:MazeEnv",
kwargs=dict(
model_cls=PointEnv,
maze_task=GoalRewardEMaze,
maze_size_scaling=GoalRewardEMaze.MAZE_SIZE_SCALING.point,
inner_reward_scaling=GoalRewardEMaze.INNER_REWARD_SCALING,
)
)
```
You can also customize models. See `point.py` or so.
## Warning ## Warning
This project has some other environments (e.g., reacher and swimmer) This project has some other environments (e.g., reacher and swimmer)
but if they are not on README, they are work in progress and but if they are not on README, they are work in progress and
@ -51,6 +118,8 @@ not tested well.
This project is licensed under Apache License, Version 2.0 This project is licensed under Apache License, Version 2.0
([LICENSE-APACHE](LICENSE) or http://www.apache.org/licenses/LICENSE-2.0). ([LICENSE-APACHE](LICENSE) or http://www.apache.org/licenses/LICENSE-2.0).
[d4rl]: https://github.com/rail-berkeley/d4rl
[dm_control]: https://github.com/deepmind/dm_control
[gym]: https://github.com/openai/gym [gym]: https://github.com/openai/gym
[models]: https://github.com/tensorflow/models/tree/master/research/efficient-hrl [models]: https://github.com/tensorflow/models/tree/master/research/efficient-hrl
[mujoco-py]: https://github.com/openai/mujoco-py [mujoco-py]: https://github.com/openai/mujoco-py