Fancy RL is a minimal, efficient implementation of Proximal Policy Optimization (PPO) and Trust Region Policy Layers (TRPL) built on primitives from [torchrl](https://pypi.org/project/torchrl/). Soft Actor-Critic (SAC) is planned but not yet implemented. The library aims for clean, understandable code while building on torchrl's functionality.
Integration with wandb (Weights & Biases) is optional.
## Installation
Fancy RL requires Python 3.7 to 3.11; TorchRL does not currently support Python 3.12. Install it in editable mode from the root of the repository:
```bash
pip install -e .
```
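Before installing, it can help to confirm that the active interpreter falls inside the supported range. The following is a minimal sketch that assumes only the 3.7 to 3.11 requirement stated above; `is_supported` is a hypothetical helper written for illustration and is not part of Fancy RL.

```python
import sys

# Supported range taken from the requirement stated above (3.7 through 3.11).
MIN_VERSION = (3, 7)
MAX_VERSION = (3, 11)


def is_supported(version=None):
    """Return True if the (major, minor) version lies in the supported range.

    Hypothetical helper for illustration; defaults to the running interpreter.
    """
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_VERSION <= major_minor <= MAX_VERSION


if __name__ == "__main__":
    current = "{}.{}".format(*sys.version_info[:2])
    status = "supported" if is_supported() else "not supported"
    print(f"Python {current} is {status} by Fancy RL")
```

If the check reports an unsupported version, creating a virtual environment with a supported interpreter before running `pip install -e .` is one way to proceed.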
## Usage
Here's a basic example of how to train a PPO agent with Fancy RL: