From df1ba6fe53c1fd3c2b490a28df0e586a49a48df8 Mon Sep 17 00:00:00 2001
From: Dominik Roth
Date: Mon, 21 Oct 2024 15:25:01 +0200
Subject: [PATCH] Updated README

---
 README.md | 23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 3559bfc..4b259d8 100644
--- a/README.md
+++ b/README.md
@@ -52,17 +52,34 @@ To run the test suite:
 pytest test/test_ppo.py
 ```
 
-## TODO
+## Status
+### Implemented Features
+- Proximal Policy Optimization (PPO) algorithm
+- Trust Region Policy Layers (TRPL) algorithm (WIP)
+- Support for continuous and discrete action spaces
+- Multiple projection methods (rewritten for MIT License compatibility):
+  - KL Divergence projection
+  - Frobenius norm projection
+  - Wasserstein distance projection
+  - Identity projection (equivalent to PPO)
+- Configurable neural network architectures for actor and critic
+- Logging support (Terminal and WandB, extendable)
+
+### TODO
+- [ ] All PPO tests green
 - [ ] Better / more logging
 - [ ] Test / Benchmark PPO
 - [ ] Refactor Modules for TRPL
 - [ ] Get TRPL working
-- [ ] Test / Benchmark TRPL
+- [ ] All TRPL tests green
 - [ ] Make contextual covariance optional
 - [ ] Allow full-cov via chol
+- [ ] Test / Benchmark TRPL
 - [ ] Write docs / extend README
-- [ ] (Implement SAC?)
+- [ ] Test functionality of non-gym envs
+- [ ] Implement SAC
+- [ ] Implement VLEARN
 
 ## Contributing