dodox/metastable-baselines2

Dominik Roth 85e9e1033d Extended README

2024-03-14 17:35:07 +01:00

Metastable Baselines 2

An extension to Stable Baselines 3. Based on Metastable Baselines 1.

This repo provides:

An implementation of "Differentiable Trust Region Layers for Deep Reinforcement Learning" by Fabian Otto et al. (TRPL)
Support for Prior Conditioned Annealing (WIP)
Support for Contextual Covariances (Planned)
Support for Full Covariances (Planned)

The resulting algorithms can then be tested for their exploration ability in the environments provided by Fancy Gym or Project Columbus.

Installation

Install dependency: Metastable Projections

Follow the installation instructions for Metastable Projections (GitHub Mirror). KL projections require ALR's ITPAL as an additional dependency.

Install as a package

Then install this repo as a package:

pip install -e .

Usage

TRPL can be used just like SB3's PPO:

import gymnasium as gym
from metastable_baselines2 import TRPL

env_id = 'Pendulum-v1'  # example only; any continuous-control Gymnasium ID
projection = 'Wasserstein'  # or 'Frobenius' or 'KL'
mean_bound, cov_bound = 0.03, 1e-3  # illustrative trust-region bounds, not tuned values

model = TRPL("MlpPolicy", env_id, n_steps=128, seed=0,
             policy_kwargs=dict(net_arch=[16]),
             projection_class=projection,
             projection_kwargs={'mean_bound': mean_bound, 'cov_bound': cov_bound},
             verbose=1)

model.learn(total_timesteps=100)
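
Since the projection is selected through a single argument, comparing the three variants only requires swapping `projection_class`. The following is a minimal sketch of how such configurations could be collected; `trpl_kwargs` is a hypothetical helper that is not part of this repo, and the bound values are placeholders rather than recommended settings.

```python
PROJECTIONS = ("Wasserstein", "Frobenius", "KL")  # names from the usage example above


def trpl_kwargs(projection, mean_bound, cov_bound):
    """Hypothetical helper: build the projection-related keyword arguments for TRPL."""
    if projection not in PROJECTIONS:
        raise ValueError(f"unknown projection: {projection!r}")
    return {
        "projection_class": projection,
        "projection_kwargs": {"mean_bound": mean_bound, "cov_bound": cov_bound},
    }


# Illustrative bounds only; tune them for your task.
configs = {name: trpl_kwargs(name, mean_bound=0.03, cov_bound=1e-3) for name in PROJECTIONS}

for name, kwargs in configs.items():
    print(name, kwargs["projection_kwargs"])
```

Each entry can then be unpacked into the constructor, for example `TRPL("MlpPolicy", env_id, **configs["KL"])`.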

For the available projection_kwargs, have a look at Metastable Projections.
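
To give an intuition for what a bound such as `mean_bound` controls, the sketch below rescales a proposed mean update so that its squared Mahalanobis distance to the old mean does not exceed the bound, in the spirit of the mean part of a trust region projection. It is a simplified, diagonal-covariance illustration in plain Python, not the code used by this repo or by Metastable Projections; `project_mean` and its arguments are hypothetical names.

```python
import math


def project_mean(mean, old_mean, old_var, mean_bound):
    """Pull `mean` back toward `old_mean` if it left the trust region.

    Distance used: sum_i (mean_i - old_mean_i)^2 / old_var_i (diagonal covariance).
    """
    dist = sum((m - o) ** 2 / v for m, o, v in zip(mean, old_mean, old_var))
    if dist <= mean_bound:
        return list(mean)  # already inside the trust region, nothing to do
    omega = math.sqrt(dist / mean_bound) - 1.0
    # Interpolating with weight omega shrinks the step by 1 / (1 + omega),
    # which places the projected mean exactly on the trust-region boundary.
    return [(m + omega * o) / (1.0 + omega) for m, o in zip(mean, old_mean)]


old_mean, old_var = [0.0, 0.0], [1.0, 1.0]
projected = project_mean([3.0, 4.0], old_mean, old_var, mean_bound=1.0)
print(projected)
```

A smaller `mean_bound` therefore allows a smaller step of the policy mean per update; the covariance bound plays the analogous role for the covariance.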

License

Since this repo is an extension to Stable Baselines 3 by DLR-RM, it contains some of its code. SB3 is licensed under the MIT License.