- Fixed JAX/PyTorch dtype mismatch for successful training
- Added experiment plan with paper-accurate hyperparameters
- Created batch submission and monitoring scripts
- Cleaned up log files and updated gitignore
- Ready for systematic paper replication
FastTD3 HoReKa Experiment Plan
Added by Dominik - Paper Replication Study
✅ Proof of Concept Results
Initial Success: HoReKa Dev Run
- Task: T1JoystickFlatTerrain
- Duration: 7 minutes (5000 timesteps)
- Performance: Successfully training at ~29 it/s
- Key Achievement: Fixed JAX/PyTorch dtype mismatch issue (removed JAX_ENABLE_X64)
- Status: ✅ Environment working, ready for full-scale experiments
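The fix above amounts to making sure `JAX_ENABLE_X64` is not set when JAX starts. With 64-bit mode enabled, JAX produces `float64` arrays, while PyTorch defaults to `float32`, which is presumably where the mismatch came from. The snippet below is a minimal sketch of that cleanup, not the code used in this repository; the helper name `drop_jax_x64_flag` is hypothetical.

```python
import os


def drop_jax_x64_flag(environ=None):
    """Remove JAX_ENABLE_X64 from the environment.

    Returns the previous value, or None if the variable was not set.
    `environ` defaults to os.environ; a plain dict can be passed for testing.
    """
    env = os.environ if environ is None else environ
    return env.pop("JAX_ENABLE_X64", None)


# Assumption: this runs before the first `import jax`, because JAX picks up
# the variable when it is imported.
previous = drop_jax_x64_flag()
if previous is not None:
    print(f"Removed JAX_ENABLE_X64={previous!r}; JAX falls back to 32-bit dtypes.")
```

Unsetting the variable in the job script itself (before Python starts) would have the same effect; the in-process version is shown only because it is easy to test.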
Experiments to Replicate
Phase 1: MuJoCo Playground (Figure 11 from paper)
- T1JoystickFlatTerrain (3600s)
- T1JoystickRoughTerrain (3600s)
- G1JoystickFlatTerrain (3600s)
- G1JoystickRoughTerrain (3600s)
Hyperparameters (from paper):
- total_timesteps: 500000
- num_envs: 2048
- batch_size: 32768
- buffer_size: 102400 (50 × num_envs)
- eval_interval: 25000
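As a sanity check on the Phase 1 values, the sketch below stores them in a small dataclass and derives two quantities from them: the number of evaluations per run and the ratio of `buffer_size` to `num_envs`. The class `PhaseConfig` and its property names are illustrative only and are not part of FastTD3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhaseConfig:
    """Hyperparameters for one experiment phase (hypothetical container)."""
    total_timesteps: int
    num_envs: int
    batch_size: int
    buffer_size: int
    eval_interval: int

    @property
    def num_evals(self) -> int:
        # How many times eval_interval fits into the full run.
        return self.total_timesteps // self.eval_interval

    @property
    def buffer_per_env(self) -> int:
        # Plain ratio buffer_size / num_envs; no claim about how the
        # replay buffer interprets buffer_size internally.
        return self.buffer_size // self.num_envs


PHASE1 = PhaseConfig(
    total_timesteps=500_000,
    num_envs=2048,
    batch_size=32_768,
    buffer_size=102_400,
    eval_interval=25_000,
)

print(PHASE1.num_evals, PHASE1.buffer_per_env)  # prints: 20 50
```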
Phase 2: IsaacLab (Figure 10 from paper)
- Isaac-Velocity-Flat-G1-v0 (3600s)
- Isaac-Velocity-Rough-G1-v0 (3600s)
- Isaac-Repose-Cube-Allegro-Direct-v0 (3600s)
- Isaac-Repose-Cube-Shadow-Direct-v0 (3600s)
- Isaac-Velocity-Flat-H1-v0 (3600s)
- Isaac-Velocity-Rough-H1-v0 (3600s)
Hyperparameters:
- total_timesteps: 1000000
- num_envs: 1024
- batch_size: 32768
- buffer_size: 51200
- eval_interval: 50000
Phase 3: HumanoidBench (Figure 9 from paper - subset)
- h1hand-walk (10800s)
- h1hand-run (10800s)
- h1hand-hurdle (10800s)
- h1hand-stair (10800s)
- h1hand-slide (10800s)
Hyperparameters:
- total_timesteps: 2000000
- num_envs: 256
- batch_size: 16384
- buffer_size: 12800
- eval_interval: 100000
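To estimate how much compute the three phases need, the following sketch sums the per-task durations listed above. It assumes that the value in parentheses after each task is the wall-clock budget for a single run and that runs execute one after another; both are assumptions, and `PHASES` and `budget_hours` are hypothetical names.

```python
# Per-task budget in seconds and task list for each phase, copied from the plan.
PHASES = {
    1: (3600, [
        "T1JoystickFlatTerrain",
        "T1JoystickRoughTerrain",
        "G1JoystickFlatTerrain",
        "G1JoystickRoughTerrain",
    ]),
    2: (3600, [
        "Isaac-Velocity-Flat-G1-v0",
        "Isaac-Velocity-Rough-G1-v0",
        "Isaac-Repose-Cube-Allegro-Direct-v0",
        "Isaac-Repose-Cube-Shadow-Direct-v0",
        "Isaac-Velocity-Flat-H1-v0",
        "Isaac-Velocity-Rough-H1-v0",
    ]),
    3: (10800, [
        "h1hand-walk",
        "h1hand-run",
        "h1hand-hurdle",
        "h1hand-stair",
        "h1hand-slide",
    ]),
}


def budget_hours(phase: int, seeds: int = 1) -> float:
    """Total sequential run time in hours for one phase."""
    limit_s, tasks = PHASES[phase]
    return limit_s * len(tasks) * seeds / 3600


for phase in PHASES:
    print(f"Phase {phase}: {budget_hours(phase, seeds=3):.0f} h for 3 seeds")
```

Under these assumptions, three seeds cost 12, 18, and 45 hours for Phases 1 to 3; running jobs in parallel reduces the wall-clock time but not the total.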
Usage
Submit Phase 1:
python submit_experiment_batch.py --phase 1 --seeds 3
Monitor progress:
python monitor_experiments.py --watch
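For orientation, `--phase 1 --seeds 3` presumably expands into one run per task and seed. The sketch below shows that expansion under two assumptions: that seeds are numbered from 0 and that every Phase 1 task is included. It does not reproduce the internals of `submit_experiment_batch.py`; `expand_runs` is a hypothetical helper.

```python
from itertools import product

PHASE1_TASKS = [
    "T1JoystickFlatTerrain",
    "T1JoystickRoughTerrain",
    "G1JoystickFlatTerrain",
    "G1JoystickRoughTerrain",
]


def expand_runs(tasks, num_seeds):
    """Return one (task, seed) pair per run, with seeds numbered from 0."""
    return list(product(tasks, range(num_seeds)))


runs = expand_runs(PHASE1_TASKS, num_seeds=3)
print(len(runs))  # prints: 12
for task, seed in runs[:3]:
    print(task, seed)
```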