# FastTD3 HoReKa Experiment Plan
*Added by Dominik - Paper Replication Study*

## ✅ Proof of Concept Results
**Initial Success**: [HoReKa Dev Run](https://wandb.ai/rl-network-scaling/FastTD3_HoReKa_Dev?nw=nwuserdominik_roth)

- **Task**: `T1JoystickFlatTerrain`
- **Duration**: 7 minutes (5000 timesteps)
- **Performance**: Successfully training at ~29 it/s
- **Key Achievement**: Fixed JAX/PyTorch dtype mismatch issue (removed `JAX_ENABLE_X64`)
- **Status**: ✅ Environment working, ready for full-scale experiments

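The dtype fix can be illustrated with a small sketch. It assumes the mismatch arose because `JAX_ENABLE_X64` switches JAX to 64-bit defaults while PyTorch defaults to `float32`, and that removing the variable from the launch environment is enough. The helper name `without_x64` is hypothetical and not part of this repository.

```python
import os


def without_x64(env):
    """Return a copy of `env` with JAX_ENABLE_X64 removed (hypothetical helper)."""
    cleaned = dict(env)
    # pop() with a default is a no-op when the variable is not set.
    cleaned.pop("JAX_ENABLE_X64", None)
    return cleaned


# Build a clean environment that a child training process could inherit.
launch_env = without_x64(os.environ)
assert "JAX_ENABLE_X64" not in launch_env
```

The resulting mapping can be passed as the `env` argument of `subprocess.run` when starting a training job, so the parent shell's setting does not leak into the run.
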
## Experiments to Replicate

### Phase 1: MuJoCo Playground (Figure 11 from paper)
- `T1JoystickFlatTerrain` (3600s)
- `T1JoystickRoughTerrain` (3600s)
- `G1JoystickFlatTerrain` (3600s)
- `G1JoystickRoughTerrain` (3600s)

**Hyperparameters (from paper):**
- total_timesteps: 500000
- num_envs: 2048
- batch_size: 32768
- buffer_size: 102400 (50 × num_envs)
- eval_interval: 25000

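As a sanity check on the Phase 1 values, the following sketch stores them in a small dataclass and derives two quantities from them. `PhaseConfig` and its properties are illustrative names, not the FastTD3 configuration API, and the sketch does not assume how FastTD3 itself interprets `buffer_size` (total or per environment); it only does the arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhaseConfig:
    """Hypothetical container for the hyperparameters listed in this plan."""

    total_timesteps: int
    num_envs: int
    batch_size: int
    buffer_size: int
    eval_interval: int

    @property
    def num_evals(self) -> int:
        # Number of evaluation points reached over a full run.
        return self.total_timesteps // self.eval_interval

    @property
    def buffer_per_env(self) -> int:
        # buffer_size divided evenly across the parallel environments.
        return self.buffer_size // self.num_envs


PHASE1 = PhaseConfig(
    total_timesteps=500_000,
    num_envs=2048,
    batch_size=32_768,
    buffer_size=102_400,
    eval_interval=25_000,
)
print(PHASE1.num_evals, PHASE1.buffer_per_env)  # 20 50
```

With these numbers a full run reaches 20 evaluation points, and 102400 divided by 2048 environments gives 50.
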
### Phase 2: IsaacLab (Figure 10 from paper)
- `Isaac-Velocity-Flat-G1-v0` (3600s)
- `Isaac-Velocity-Rough-G1-v0` (3600s)
- `Isaac-Repose-Cube-Allegro-Direct-v0` (3600s)
- `Isaac-Repose-Cube-Shadow-Direct-v0` (3600s)
- `Isaac-Velocity-Flat-H1-v0` (3600s)
- `Isaac-Velocity-Rough-H1-v0` (3600s)

**Hyperparameters:**
- total_timesteps: 1000000
- num_envs: 1024
- batch_size: 32768
- buffer_size: 51200
- eval_interval: 50000

### Phase 3: HumanoidBench (Figure 9 from paper - subset)
- `h1hand-walk` (10800s)
- `h1hand-run` (10800s)
- `h1hand-hurdle` (10800s)
- `h1hand-stair` (10800s)
- `h1hand-slide` (10800s)

**Hyperparameters:**
- total_timesteps: 2000000
- num_envs: 256
- batch_size: 16384
- buffer_size: 12800
- eval_interval: 100000

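To estimate the compute needed for all three phases, the durations in parentheses can be summed. This is a rough sketch under two assumptions that the plan does not state explicitly: the durations are per-run wall-clock budgets, and every phase uses three seeds (the value used in the Phase 1 submission command). The names `RUNS` and `total_hours` are illustrative.

```python
# Task counts and per-run durations in seconds, copied from the phase lists.
RUNS = {
    "Phase 1": {"tasks": 4, "seconds": 3600},
    "Phase 2": {"tasks": 6, "seconds": 3600},
    "Phase 3": {"tasks": 5, "seconds": 10800},
}


def total_hours(runs, seeds):
    """Total wall-clock hours if every task runs once per seed."""
    seconds = sum(phase["tasks"] * phase["seconds"] for phase in runs.values())
    return seconds * seeds / 3600


print(total_hours(RUNS, seeds=3))  # 75.0
```

Under these assumptions one seed costs 25 hours of run time across the 15 tasks, so three seeds come to 75 hours, excluding queueing and startup overhead.
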
## Usage

Submit Phase 1:
```bash
python submit_experiment_batch.py --phase 1 --seeds 3
```

Monitor progress:
```bash
python monitor_experiments.py --watch
```
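For orientation, here is a minimal sketch of what submitting Phase 1 with three seeds expands to: one job per task and seed. It does not reproduce the internals of `submit_experiment_batch.py`, which are not shown in this plan. The function names, the job dictionary layout, and the 0-based seed numbering are all assumptions.

```python
from itertools import product

PHASE1_TASKS = [
    "T1JoystickFlatTerrain",
    "T1JoystickRoughTerrain",
    "G1JoystickFlatTerrain",
    "G1JoystickRoughTerrain",
]


def format_walltime(seconds):
    """Format a duration in seconds as HH:MM:SS."""
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def enumerate_jobs(tasks, seeds, seconds):
    """Return one job description per (task, seed) pair (hypothetical layout)."""
    return [
        {"task": task, "seed": seed, "walltime": format_walltime(seconds)}
        for task, seed in product(tasks, range(seeds))
    ]


jobs = enumerate_jobs(PHASE1_TASKS, seeds=3, seconds=3600)
print(len(jobs))  # 12
```

Four tasks with three seeds each give 12 jobs, each with a one-hour limit.
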