Commit Graph

2 Commits

Author SHA1 Message Date
Younggyo Seo
cef44108d8
Support MTBench (#15)
This PR incorporates MTBench into the current codebase as a demonstration of how to use FastTD3 in a multi-task setup.

- Add support for MTBench along with its wrapper
- Add support for a per-task reward normalizer, which is useful for multi-task RL and is motivated by the BRC paper (https://arxiv.org/abs/2505.23150v1)
2025-06-20 21:52:43 -07:00
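
For readers unfamiliar with the idea behind the per-task reward normalizer in the commit above, here is a minimal sketch of one way to do it. It is not the code added in this PR and not the exact formulation from the BRC paper. The class name `PerTaskRewardNormalizer` and its methods are hypothetical. The assumption is that each task keeps its own running reward statistics, so that tasks with very different reward scales produce comparable learning signals.

```python
import math


class PerTaskRewardNormalizer:
    """Hypothetical helper: running reward statistics kept separately per task."""

    def __init__(self, num_tasks, eps=1e-8):
        self.eps = eps
        self.count = [0] * num_tasks
        self.mean = [0.0] * num_tasks
        self.m2 = [0.0] * num_tasks  # running sum of squared deviations (Welford)

    def update(self, task_id, reward):
        # Welford's online update, applied only to the statistics of this task.
        self.count[task_id] += 1
        delta = reward - self.mean[task_id]
        self.mean[task_id] += delta / self.count[task_id]
        self.m2[task_id] += delta * (reward - self.mean[task_id])

    def std(self, task_id):
        # Fall back to 1.0 until at least two rewards have been observed.
        if self.count[task_id] < 2:
            return 1.0
        return math.sqrt(self.m2[task_id] / self.count[task_id])

    def normalize(self, task_id, reward):
        # Scale the reward by the running standard deviation of its own task.
        return reward / (self.std(task_id) + self.eps)


normalizer = PerTaskRewardNormalizer(num_tasks=2)
for r in [1.0, 2.0, 3.0, 4.0]:
    normalizer.update(0, r)          # task 0: small reward scale
    normalizer.update(1, 100.0 * r)  # task 1: rewards 100x larger

small = normalizer.normalize(0, 2.0)
large = normalizer.normalize(1, 200.0)
```

In this toy run the two tasks differ in reward scale by a factor of 100, yet `small` and `large` come out nearly equal after normalization, which is the property a multi-task learner benefits from.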
Younggyo Seo
6e890eebd2
Support FastTD3 + SimbaV2 (#13)
- Support hyperspherical normalization
- Support loading FastTD3 + SimbaV2 for both training and inference
- Support (experimental) reward normalization that uses SimbaV2's formulation; it does not work that well, though
- Updated README for FastTD3 + SimbaV2
2025-06-15 12:49:59 -07:00
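
As a rough illustration of the hyperspherical normalization mentioned in the commit above: the sketch below assumes it refers to L2 normalization, that is, rescaling a vector so that it lies on the unit hypersphere. The function name `l2_normalize` is hypothetical. The code is plain Python for clarity and does not reflect how this repository or SimbaV2 implements it.

```python
import math


def l2_normalize(vector, eps=1e-8):
    """Project a vector onto the unit hypersphere by dividing by its L2 norm."""
    norm = math.sqrt(sum(x * x for x in vector))
    # eps guards against division by zero for an all-zero vector.
    return [x / (norm + eps) for x in vector]


features = [3.0, 4.0]
unit_features = l2_normalize(features)
unit_norm = math.sqrt(sum(x * x for x in unit_features))
```

After the projection only the direction of `features` is kept; its magnitude is discarded, so `unit_norm` is approximately 1.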