Metastable Baselines 2
An extension to Stable Baselines 3. Based on Metastable Baselines 1.
This repo provides:
- An implementation of "Differentiable Trust Region Layers for Deep Reinforcement Learning" by Fabian Otto et al. (TRPL)
- Support for Prior Conditioned Annealing (WIP)
- Support for Contextual Covariances (Planned)
- Support for Full Covariances (Planned)
The resulting algorithms can then be tested for their exploration ability in the environments provided by Fancy Gym or Project Columbus.
Installation
Install dependency: Metastable Projections
Follow the installation instructions for Metastable Projections (GitHub Mirror). KL projections require ALR's ITPAL as an additional dependency.
Install as a package
Then install this repo as a package:
pip install -e .
Usage
TRPL can be used just like SB3's PPO:
import gymnasium as gym
from metastable_baselines2 import TRPL
# env_id, mean_bound and cov_bound must be defined by the caller
projection = 'Wasserstein'  # or 'Frobenius' or 'KL'
model = TRPL(
    "MlpPolicy",
    env_id,
    n_steps=128,
    seed=0,
    policy_kwargs=dict(net_arch=[16]),
    projection_class=projection,
    projection_kwargs={'mean_bound': mean_bound, 'cov_bound': cov_bound},
    verbose=1,
)
model.learn(total_timesteps=100)
For available projection_kwargs, have a look at Metastable Projections.
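The usage snippet leaves `env_id`, `mean_bound` and `cov_bound` undefined. The following is a minimal sketch of one way to fill them in, not a recommended configuration. The environment id `"Pendulum-v1"` and the bound values `0.03` and `1e-3` are illustrative assumptions, and the helper names `build_trpl_kwargs` and `train` are hypothetical and not part of this repo. Only the `TRPL(...)` arguments shown in the snippet are used.

```python
# Names accepted for projection_class, as listed in the usage snippet.
PROJECTIONS = ("Wasserstein", "Frobenius", "KL")


def build_trpl_kwargs(projection="Wasserstein", mean_bound=0.03, cov_bound=1e-3):
    """Collect the keyword arguments from the usage snippet in one dict.

    The default bounds are illustrative placeholders (assumptions),
    not values recommended by this repo.
    """
    if projection not in PROJECTIONS:
        raise ValueError(
            f"unknown projection {projection!r}, expected one of {PROJECTIONS}"
        )
    return dict(
        n_steps=128,
        seed=0,
        policy_kwargs=dict(net_arch=[16]),
        projection_class=projection,
        projection_kwargs={"mean_bound": mean_bound, "cov_bound": cov_bound},
        verbose=1,
    )


def train(env_id="Pendulum-v1", total_timesteps=100, **kwargs):
    """Train TRPL if the package is installed; return None otherwise.

    env_id defaults to a Gymnasium environment chosen only for illustration.
    """
    try:
        from metastable_baselines2 import TRPL
    except ImportError:
        return None
    model = TRPL("MlpPolicy", env_id, **build_trpl_kwargs(**kwargs))
    model.learn(total_timesteps=total_timesteps)
    return model
```

Calling `train(projection="KL")` would then select the KL projection, which, as noted in the installation section, additionally requires ITPAL.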
License
Since this repo is an extension of Stable Baselines 3 by DLR-RM, it contains some of its code. SB3 is licensed under the MIT License.