# FastTD3 HoReKa Experiment Plan
*Added by Dominik - Paper Replication Study*

## ✅ Proof of Concept Results
**Initial Success**: [HoReKa Dev Run](https://wandb.ai/rl-network-scaling/FastTD3_HoReKa_Dev?nw=nwuserdominik_roth)

- **Task**: T1JoystickFlatTerrain
- **Duration**: 7 minutes (5000 timesteps)
- **Performance**: Successfully training at ~29 it/s
- **Key Achievement**: Fixed the JAX/PyTorch dtype mismatch by removing `JAX_ENABLE_X64`
- **Status**: ✅ Environment working, ready for full-scale experiments

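The dtype fix above depends on JAX's 64-bit mode staying off in the job environment: with `JAX_ENABLE_X64` enabled, JAX creates `float64` arrays by default, while PyTorch defaults to `float32`. A minimal sketch of a pre-launch guard follows. The helper name `x64_requested` and the set of accepted truthy strings are assumptions for illustration and are not part of the FastTD3 code base.

```python
import os

# Strings treated as "enabled"; this set is an assumption for the sketch.
_TRUTHY = {"1", "true", "yes", "on"}


def x64_requested(env=None):
    """Return True if JAX_ENABLE_X64 is set to a truthy value in `env`."""
    env = os.environ if env is None else env
    value = env.get("JAX_ENABLE_X64", "")
    return value.strip().lower() in _TRUTHY


if __name__ == "__main__":
    if x64_requested():
        print("JAX_ENABLE_X64 is set; unset it before launching training.")
    else:
        print("JAX_ENABLE_X64 is not enabled.")
```

Running this at the top of a job script would catch a stale export in a shell profile before a one-hour job is queued.
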
## 🚧 Currently Running Jobs

### Phase 1: MuJoCo Playground - RESUBMITTED TO H100 ✅
**NEW SLURM Job IDs**: 3371681-3371692 (12 jobs total), using the `accelerated-h100` partition (94 GB GPU RAM)
- ⏳ T1JoystickFlatTerrain (seeds 1,2,3) - Jobs: 3371681, 3371682, 3371683
- ⏳ T1JoystickRoughTerrain (seeds 1,2,3) - Jobs: 3371684, 3371685, 3371686
- ⏳ G1JoystickFlatTerrain (seeds 1,2,3) - Jobs: 3371687, 3371688, 3371689
- ⏳ G1JoystickRoughTerrain (seeds 1,2,3) - Jobs: 3371690, 3371691, 3371692
- **Status**: All jobs pending in the `accelerated-h100` queue
- **Monitor**: `python monitor_experiments.py experiment_tracking_1753312228.yaml --watch`
- **Note**: Previous jobs (3367710-3367723) crashed due to insufficient GPU RAM on the standard partition

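The job list above follows a regular pattern: tasks in the listed order, seeds 1 to 3 within each task, and consecutive SLURM job IDs starting at 3371681. A small sketch that rebuilds this mapping can help when matching a log file or a `squeue` entry to its task and seed. The function name `job_table` is hypothetical, and the assumption that IDs are consecutive in submission order is read off the list above, not guaranteed by SLURM.

```python
from itertools import product

TASKS = [
    "T1JoystickFlatTerrain",
    "T1JoystickRoughTerrain",
    "G1JoystickFlatTerrain",
    "G1JoystickRoughTerrain",
]
SEEDS = [1, 2, 3]
FIRST_JOB_ID = 3371681  # first ID of the resubmitted H100 batch


def job_table(tasks=TASKS, seeds=SEEDS, first_job_id=FIRST_JOB_ID):
    """Map job ID -> (task, seed), assuming consecutive IDs in submission order."""
    pairs = product(tasks, seeds)  # task-major order: all seeds of a task first
    return {first_job_id + i: pair for i, pair in enumerate(pairs)}


if __name__ == "__main__":
    for job_id, (task, seed) in job_table().items():
        print(f"{job_id}  {task}  seed={seed}")
```

If a later resubmission interleaves with other users' jobs, the IDs will no longer be contiguous and the tracking YAML should be treated as the source of truth.
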
## 📋 TODO List

### Phase 1: MuJoCo Playground
- [x] Set up MuJoCo Playground environment
- [x] Test 5000-step run successfully
- [x] Submit full batch (4 tasks × 3 seeds)
- [ ] Wait for jobs to complete (~1 hour each)
- [ ] Verify results match paper Figure 11
- [ ] Download wandb data for analysis

### Phase 2: IsaacLab
- [ ] **INSTALL ISAACLAB ENVIRONMENT FIRST**
- [ ] Test single IsaacLab task
- [ ] Submit batch: `python submit_experiment_batch.py --phase 2 --seeds 3`
- [ ] Monitor 6 tasks × 3 seeds (18 jobs total)
- [ ] Verify results match paper Figure 10

### Phase 3: HumanoidBench
- [ ] **INSTALL HUMANOIDBENCH ENVIRONMENT FIRST**
- [ ] Test single HumanoidBench task
- [ ] Submit batch: `python submit_experiment_batch.py --phase 3 --seeds 3`
- [ ] Monitor 5 tasks × 3 seeds (15 jobs total)
- [ ] Verify results match paper Figure 9

### Analysis & Completion
- [ ] Collect all results from wandb
- [ ] Generate comparison plots vs paper
- [ ] Document findings and performance
- [ ] Create final report

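Checking results against the paper's figures will need per-task statistics across the three seeds. Below is a minimal sketch of that aggregation using only the standard library. The name `summarize_seeds` and the input layout (one final evaluation return per seed) are assumptions, and the question of how close to the paper's curves counts as a match is left open.

```python
from statistics import mean, stdev


def summarize_seeds(returns_by_task):
    """Return {task: (mean, sample_std)} from {task: [one return per seed]}."""
    summary = {}
    for task, returns in returns_by_task.items():
        # stdev needs at least two values; report 0.0 for a single seed
        spread = stdev(returns) if len(returns) > 1 else 0.0
        summary[task] = (mean(returns), spread)
    return summary
```

The input dictionary would be filled from the exported wandb data once the runs have finished.
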
## 📊 Experiment Details

### Phase 1: MuJoCo Playground (Figure 11 from paper)
- `T1JoystickFlatTerrain`, `T1JoystickRoughTerrain`, `G1JoystickFlatTerrain`, `G1JoystickRoughTerrain`
- **Duration**: 3600s each
- **Hyperparameters**: total_timesteps=500000, num_envs=2048, batch_size=32768, buffer_size=102400, eval_interval=25000

### Phase 2: IsaacLab (Figure 10 from paper)
- `Isaac-Velocity-Flat-G1-v0`, `Isaac-Velocity-Rough-G1-v0`, `Isaac-Repose-Cube-Allegro-Direct-v0`, `Isaac-Repose-Cube-Shadow-Direct-v0`, `Isaac-Velocity-Flat-H1-v0`, `Isaac-Velocity-Rough-H1-v0`
- **Duration**: 3600s each
- **Hyperparameters**: total_timesteps=1000000, num_envs=1024, batch_size=32768, buffer_size=51200, eval_interval=50000

### Phase 3: HumanoidBench (Figure 9 from paper - subset)
- `h1hand-walk`, `h1hand-run`, `h1hand-hurdle`, `h1hand-stair`, `h1hand-slide`
- **Duration**: 10800s each
- **Hyperparameters**: total_timesteps=2000000, num_envs=256, batch_size=16384, buffer_size=12800, eval_interval=100000

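The task counts and durations above determine the size of the whole study. As a rough sketch, the job counts and an upper bound on GPU time can be computed as below. It assumes three seeds per task, one GPU per job, and that each listed duration is the full wall-clock time per job; `plan_budget` is a hypothetical helper, not part of the submission scripts.

```python
PHASES = {
    "MuJoCo Playground": {"tasks": 4, "duration_s": 3600},
    "IsaacLab": {"tasks": 6, "duration_s": 3600},
    "HumanoidBench": {"tasks": 5, "duration_s": 10800},
}
SEEDS_PER_TASK = 3


def plan_budget(phases=PHASES, seeds=SEEDS_PER_TASK):
    """Return {phase: (job_count, gpu_hours)} assuming one GPU per job."""
    budget = {}
    for name, cfg in phases.items():
        jobs = cfg["tasks"] * seeds
        budget[name] = (jobs, jobs * cfg["duration_s"] / 3600)
    return budget


if __name__ == "__main__":
    budget = plan_budget()
    for name, (jobs, hours) in budget.items():
        print(f"{name}: {jobs} jobs, {hours:.0f} GPU-hours")
    print("total jobs:", sum(j for j, _ in budget.values()))
    print("total GPU-hours:", sum(h for _, h in budget.values()))
```

Under these assumptions the plan comes to 12 + 18 + 15 = 45 jobs and 12 + 18 + 45 = 75 GPU-hours, not counting queue time or reruns.
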
## 🔧 Commands

Monitor jobs:
```bash
squeue -u $USER
python monitor_experiments.py logs/experiment_tracking_1753196960.yaml --watch
```

Submit next phases:
```bash
python submit_experiment_batch.py --phase 2 --seeds 3
python submit_experiment_batch.py --phase 3 --seeds 3
```