FastTD3/experiment_plan.md
ys1087@partner.kit.edu e7e3ae48f1 Add FastTD3 HoReKa experiment management system
- Fixed JAX/PyTorch dtype mismatch for successful training
- Added experiment plan with paper-accurate hyperparameters
- Created batch submission and monitoring scripts
- Cleaned up log files and updated gitignore
- Ready for systematic paper replication
2025-07-22 17:08:03 +02:00

FastTD3 HoReKa Experiment Plan

Added by Dominik - Paper Replication Study

Proof of Concept Results

Initial Success: HoReKa Dev Run

  • Task: T1JoystickFlatTerrain
  • Duration: 7 minutes (5000 timesteps)
  • Throughput: ~29 it/s, training ran without errors
  • Key fix: resolved the JAX/PyTorch dtype mismatch by removing the JAX_ENABLE_X64 setting
  • Status: Environment working, ready for full-scale experiments
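
The dtype fix depends on JAX staying in its default 32-bit mode. A minimal pre-flight sketch, assuming the flag is only ever enabled through the JAX_ENABLE_X64 environment variable; the helper name `x64_requested` and the set of truthy spellings are illustrative assumptions, not part of the FastTD3 code:

```python
import os

# Spellings treated as "enabled" in this sketch (an assumption).
_TRUTHY = {"1", "true", "yes", "on", "y", "t"}


def x64_requested(environ=None):
    """Return True if JAX_ENABLE_X64 is set to a truthy value in `environ`."""
    env = os.environ if environ is None else environ
    return env.get("JAX_ENABLE_X64", "").strip().lower() in _TRUTHY


if x64_requested():
    print("Warning: JAX_ENABLE_X64 is set; unset it before training.")
else:
    print("JAX_ENABLE_X64 is not set in this environment.")
```

Calling this at the top of a job script would catch a stale export in a shell profile or batch template before any compute time is spent.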

Experiments to Replicate

Phase 1: MuJoCo Playground (Figure 11 from paper)

  • T1JoystickFlatTerrain (3600s)
  • T1JoystickRoughTerrain (3600s)
  • G1JoystickFlatTerrain (3600s)
  • G1JoystickRoughTerrain (3600s)

Hyperparameters (from paper):

  • total_timesteps: 500000
  • num_envs: 2048
  • batch_size: 32768
  • buffer_size: 102400 (50 × num_envs)
  • eval_interval: 25000
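
The Phase 1 settings can be written down as a small typed record, which makes derived quantities easy to check. This is a sketch only: the `PhaseConfig` class and its property names are hypothetical and are not taken from the FastTD3 code base, and whether the training script treats buffer_size as a total or a per-environment count is not stated here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhaseConfig:
    """Hypothetical container for one phase's hyperparameters."""
    total_timesteps: int
    num_envs: int
    batch_size: int
    buffer_size: int
    eval_interval: int

    @property
    def num_evals(self) -> int:
        # Evaluation points if one evaluation runs every eval_interval steps.
        return self.total_timesteps // self.eval_interval

    @property
    def buffer_to_env_ratio(self) -> float:
        # Plain ratio of the two listed values; no semantics implied.
        return self.buffer_size / self.num_envs


PHASE1 = PhaseConfig(
    total_timesteps=500_000,
    num_envs=2048,
    batch_size=32_768,
    buffer_size=102_400,
    eval_interval=25_000,
)

PHASE1_TASKS = [
    "T1JoystickFlatTerrain",
    "T1JoystickRoughTerrain",
    "G1JoystickFlatTerrain",
    "G1JoystickRoughTerrain",
]

print(PHASE1.num_evals, PHASE1.buffer_to_env_ratio)
```

With the listed values this gives 20 evaluation points per run and a buffer-to-environment ratio of 50.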

Phase 2: IsaacLab (Figure 10 from paper)

  • Isaac-Velocity-Flat-G1-v0 (3600s)
  • Isaac-Velocity-Rough-G1-v0 (3600s)
  • Isaac-Repose-Cube-Allegro-Direct-v0 (3600s)
  • Isaac-Repose-Cube-Shadow-Direct-v0 (3600s)
  • Isaac-Velocity-Flat-H1-v0 (3600s)
  • Isaac-Velocity-Rough-H1-v0 (3600s)

Hyperparameters:

  • total_timesteps: 1000000
  • num_envs: 1024
  • batch_size: 32768
  • buffer_size: 51200
  • eval_interval: 50000

Phase 3: HumanoidBench (Figure 9 from paper - subset)

  • h1hand-walk (10800s)
  • h1hand-run (10800s)
  • h1hand-hurdle (10800s)
  • h1hand-stair (10800s)
  • h1hand-slide (10800s)

Hyperparameters:

  • total_timesteps: 2000000
  • num_envs: 256
  • batch_size: 16384
  • buffer_size: 12800
  • eval_interval: 100000
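
For cluster planning it helps to total the time budgets across the three phases. The sketch below assumes the value in parentheses after each task is a per-run wall-clock limit in seconds and that every run uses its full limit; the function name `phase_budget_hours` is illustrative.

```python
# Per-task budgets in seconds, copied from the three phase lists.
BUDGETS_S = {
    1: {
        "T1JoystickFlatTerrain": 3600,
        "T1JoystickRoughTerrain": 3600,
        "G1JoystickFlatTerrain": 3600,
        "G1JoystickRoughTerrain": 3600,
    },
    2: {
        "Isaac-Velocity-Flat-G1-v0": 3600,
        "Isaac-Velocity-Rough-G1-v0": 3600,
        "Isaac-Repose-Cube-Allegro-Direct-v0": 3600,
        "Isaac-Repose-Cube-Shadow-Direct-v0": 3600,
        "Isaac-Velocity-Flat-H1-v0": 3600,
        "Isaac-Velocity-Rough-H1-v0": 3600,
    },
    3: {
        "h1hand-walk": 10800,
        "h1hand-run": 10800,
        "h1hand-hurdle": 10800,
        "h1hand-stair": 10800,
        "h1hand-slide": 10800,
    },
}


def phase_budget_hours(phase, seeds=1):
    """Upper bound on compute hours for one phase, summed over tasks and seeds."""
    return sum(BUDGETS_S[phase].values()) * seeds / 3600


for phase in sorted(BUDGETS_S):
    print(f"Phase {phase}: {phase_budget_hours(phase, seeds=3):.0f} h for 3 seeds")
```

Under these assumptions the budgets come to 4 h, 6 h, and 15 h per seed for Phases 1 to 3, or 75 h in total for three seeds if nothing runs in parallel.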

Usage

Submit Phase 1:

python submit_experiment_batch.py --phase 1 --seeds 3

Monitor progress:

python monitor_experiments.py --watch
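
To queue the later phases in the same way, the submit command can be generated per phase. This sketch only builds and prints the command lines and executes nothing; it assumes `submit_experiment_batch.py` accepts `--phase 2` and `--phase 3` in the same form as the Phase 1 call, which is not confirmed here. The helper name `submit_command` is hypothetical.

```python
import shlex


def submit_command(phase, seeds=3, python="python"):
    """Build the argv list for one phase; nothing is executed here."""
    if phase not in (1, 2, 3):
        raise ValueError(f"unknown phase: {phase}")
    return [
        python,
        "submit_experiment_batch.py",
        "--phase", str(phase),
        "--seeds", str(seeds),
    ]


for phase in (1, 2, 3):
    # shlex.join quotes arguments safely for copy-paste into a shell.
    print(shlex.join(submit_command(phase)))
```

Keeping the command as a list makes it straightforward to hand to `subprocess.run` later without shell quoting issues.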