Dominik Roth e3f4c511bf Better default HPs for TRPL		2024-01-23 09:20:34 +01:00
common	Minor bug fixes	2024-01-22 19:58:08 +01:00
ppo	Implement Importance Sampling for PCA	2024-01-16 15:13:06 +01:00
sac	Updating to new sb3 version	2023-11-19 18:34:15 +01:00
trpl	Better default HPs for TRPL	2024-01-23 09:20:34 +01:00
__init__.py	Expose TRPL class	2024-01-16 15:34:12 +01:00
README.md	Implement Importance Sampling for PCA	2024-01-16 15:13:06 +01:00

README.md

Metastable Baselines 2

An extension to Stable Baselines 3. Based on Metastable Baselines 1.

While training an RL agent, we follow the gradient of the loss, which leads us to a minimum. If that minimum is merely a local one, it can be seen as a false vacuum in our loss space. Exploration mechanisms try to let the training procedure escape these stable states, making them metastable.
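
To make the local-minimum picture concrete, here is a minimal sketch that is not taken from this repo: plain gradient descent on a hypothetical one-dimensional toy loss with two minima. The function, learning rate, and step count are all assumptions chosen for illustration.

```python
def loss(x):
    # Toy loss with a shallow minimum (x > 0) and a deeper one (x < 0).
    return x**4 - 3 * x**2 + x


def grad(x):
    # Analytic derivative of the toy loss.
    return 4 * x**3 - 6 * x + 1


def gradient_descent(x, lr=0.01, steps=2000):
    # Follow the negative gradient for a fixed number of steps.
    for _ in range(steps):
        x -= lr * grad(x)
    return x


x_right = gradient_descent(2.0)   # settles in the shallow minimum
x_left = gradient_descent(-2.0)   # settles in the deeper minimum
print(x_right, loss(x_right))
print(x_left, loss(x_left))
```

Started from x = 2.0, the descent stays in the shallow basin even though a lower loss exists elsewhere. Without some perturbation or exploration it has no way to leave that state.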

To achieve this, this repo contains some extensions for Stable Baselines 3 by DLR-RM.
These extensions include:

An implementation of "Differentiable Trust Region Layers for Deep Reinforcement Learning" by Fabian Otto et al. (TRPL)
Support for Prior Conditioned Annealing
Support for Contextual Covariances (Planned)
Support for Full Covariances (Planned)

The resulting algorithms can then be tested for their ability to explore in the environments provided by Fancy Gym or Project Columbus.
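
As a rough illustration of what a trust region projection does, the sketch below handles only the mean of a Gaussian policy with a diagonal covariance. It is a simplified stand-in and not the code of this repo or of Metastable Projections; `project_mean`, `eps_mean`, and the example values are hypothetical. If the new mean lies further from the old mean than the bound allows (measured as squared Mahalanobis distance under the old covariance), it is interpolated back towards the old mean until it sits exactly on the bound.

```python
import math


def project_mean(mean, old_mean, old_var, eps_mean):
    """Pull `mean` back towards `old_mean` if it leaves the trust region.

    The distance is the squared Mahalanobis distance under the old
    (diagonal) covariance; `eps_mean` is the allowed bound.
    """
    dist = sum((m - o) ** 2 / v for m, o, v in zip(mean, old_mean, old_var))
    if dist <= eps_mean:
        return list(mean)  # already inside the trust region
    # Interpolation weight chosen so the projected mean lands on the bound.
    omega = math.sqrt(dist / eps_mean) - 1.0
    return [(m + omega * o) / (1.0 + omega) for m, o in zip(mean, old_mean)]


old_mean, old_var = [0.0, 0.0], [1.0, 1.0]
projected = project_mean([3.0, 4.0], old_mean, old_var, eps_mean=0.25)
print(projected)
```

Because the projection is a closed-form expression of the inputs, gradients can flow through it. The covariance part of the projection is omitted here.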

Installation

Install dependency: Metastable Projections

Follow the instructions for Metastable Projections (GitHub Mirror). KL projections require ALR's ITPAL as an additional dependency.

Install as a package

Then install this repo as a package:

pip install -e .

License

Since this repo is an extension to Stable Baselines 3 by DLR-RM, it contains some of its code. SB3 is licensed under the MIT License.