This PR incorporates MTBench into the current codebase as a demonstration of how to use FastTD3 in a multi-task setup.
- Add support for MTBench along with its wrapper
- Add support for a per-task reward normalizer, useful for multi-task RL and motivated by the BRC paper (https://arxiv.org/abs/2505.23150v1)
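The per-task reward normalizer can be sketched roughly as follows. This is an illustrative assumption, not the PR's actual class: it keeps one running estimate of discounted-return variance per task (in the style of Gymnasium's `NormalizeReward` wrapper, but keyed by task id) and divides each reward by its own task's return standard deviation, so tasks with very different reward scales contribute comparably to the TD loss. The class and method names here are hypothetical.

```python
import numpy as np


class PerTaskRewardNormalizer:
    """Illustrative sketch: per-task running return statistics,
    used to rescale rewards by each task's return std."""

    def __init__(self, num_tasks: int, gamma: float = 0.99, eps: float = 1e-8):
        self.gamma = gamma
        self.eps = eps
        self.ret = np.zeros(num_tasks)    # running discounted return per task
        self.count = np.zeros(num_tasks)  # samples seen per task
        self.mean = np.zeros(num_tasks)   # running mean of returns per task
        self.var = np.ones(num_tasks)     # running variance of returns per task

    def normalize(self, task_id: int, reward: float, done: bool) -> float:
        # accumulate the discounted return for this task's reward stream
        self.ret[task_id] = self.ret[task_id] * self.gamma + reward
        # Welford-style update of this task's return mean/variance
        self.count[task_id] += 1
        delta = self.ret[task_id] - self.mean[task_id]
        self.mean[task_id] += delta / self.count[task_id]
        self.var[task_id] += (
            delta * (self.ret[task_id] - self.mean[task_id]) - self.var[task_id]
        ) / self.count[task_id]
        if done:
            self.ret[task_id] = 0.0
        # scale the reward by the task-specific return std
        return reward / np.sqrt(self.var[task_id] + self.eps)
```

For example, if task 0 emits rewards around 1.0 while task 1 emits rewards around 100.0, both streams end up with roughly unit-scale normalized rewards, since each is divided by its own task's return std rather than a single shared statistic.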
15 lines
212 B
Plaintext
gymnasium<1.0.0
jax-jumpy==1.0.0 ; python_version >= "3.8" and python_version < "3.11"
matplotlib
moviepy
numpy<2.0
pandas
protobuf
pygame
stable-baselines3
tqdm
wandb
torchrl==0.5.0
tensordict==0.5.0
tyro
loguru