FastTD3/experiment_plan.md
ys1087@partner.kit.edu e95c2c4e11 Update experiment plan to TODO format with running jobs
- Added currently running SLURM job IDs (3367710-3367723)
- Converted to TODO list format with checkboxes
- Added reminders to install IsaacLab and HumanoidBench before Phase 2/3
- Phase 1 (MuJoCo) batch submitted and pending in queue
2025-07-22 17:12:05 +02:00

# FastTD3 HoReKa Experiment Plan
*Added by Dominik - Paper Replication Study*
## ✅ Proof of Concept Results
**Initial Success**: [HoReKa Dev Run](https://wandb.ai/rl-network-scaling/FastTD3_HoReKa_Dev?nw=nwuserdominik_roth)
- **Task**: T1JoystickFlatTerrain
- **Duration**: 7 minutes (5000 timesteps)
- **Performance**: Successfully training at ~29 it/s
- **Key Achievement**: Fixed the JAX/PyTorch dtype mismatch by removing `JAX_ENABLE_X64`
- **Status**: ✅ Environment working, ready for full-scale experiments
## 🚧 Currently Running Jobs
### Phase 1: MuJoCo Playground - SUBMITTED ✅
**SLURM Job IDs**: 3367710-3367723 (12 jobs total; IDs 3367714 and 3367715 are not part of this batch)
- ⏳ T1JoystickFlatTerrain (seeds 1,2,3) - Jobs: 3367710, 3367711, 3367712
- ⏳ T1JoystickRoughTerrain (seeds 1,2,3) - Jobs: 3367713, 3367716, 3367717
- ⏳ G1JoystickFlatTerrain (seeds 1,2,3) - Jobs: 3367718, 3367719, 3367720
- ⏳ G1JoystickRoughTerrain (seeds 1,2,3) - Jobs: 3367721, 3367722, 3367723
- **Status**: All jobs pending in queue
- **Monitor**: `python monitor_experiments.py logs/experiment_tracking_1753196960.yaml --watch`
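
The job list above can be kept as a small lookup table, which also makes the gap in the ID range explicit. This is a minimal sketch: the names `PHASE1_JOBS`, `job_for`, and `missing_ids` are hypothetical and not part of the repository, and it assumes the IDs for each task are listed in seed order (seed 1 first), which the list suggests but does not state.

```python
# Hypothetical lookup table transcribed from the Phase 1 job list.
# Assumption: within each task, job IDs appear in seed order (1, 2, 3).
PHASE1_JOBS = {
    "T1JoystickFlatTerrain": [3367710, 3367711, 3367712],
    "T1JoystickRoughTerrain": [3367713, 3367716, 3367717],
    "G1JoystickFlatTerrain": [3367718, 3367719, 3367720],
    "G1JoystickRoughTerrain": [3367721, 3367722, 3367723],
}


def job_for(task: str, seed: int) -> int:
    """Return the SLURM job ID for a task and a 1-based seed."""
    return PHASE1_JOBS[task][seed - 1]


def missing_ids() -> list[int]:
    """Return IDs inside the submitted range that belong to no Phase 1 job."""
    used = {job for jobs in PHASE1_JOBS.values() for job in jobs}
    return [i for i in range(min(used), max(used) + 1) if i not in used]


def total_jobs() -> int:
    """Count all Phase 1 jobs (4 tasks x 3 seeds)."""
    return sum(len(jobs) for jobs in PHASE1_JOBS.values())
```

For example, `job_for("T1JoystickRoughTerrain", 2)` looks up the second seed of the rough-terrain T1 task, and `missing_ids()` lists the IDs in the range that this batch does not use.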
## 📋 TODO List
### Phase 1: MuJoCo Playground
- [x] Set up MuJoCo Playground environment
- [x] Test 5000-step run successfully
- [x] Submit full batch (4 tasks × 3 seeds)
- [ ] Wait for jobs to complete (~1 hour each)
- [ ] Verify results match paper Figure 11
- [ ] Download wandb data for analysis
### Phase 2: IsaacLab
- [ ] **INSTALL ISAACLAB ENVIRONMENT FIRST**
- [ ] Test single IsaacLab task
- [ ] Submit batch: `python submit_experiment_batch.py --phase 2 --seeds 3`
- [ ] Monitor 6 tasks × 3 seeds (18 jobs total)
- [ ] Verify results match paper Figure 10
### Phase 3: HumanoidBench
- [ ] **INSTALL HUMANOIDBENCH ENVIRONMENT FIRST**
- [ ] Test single HumanoidBench task
- [ ] Submit batch: `python submit_experiment_batch.py --phase 3 --seeds 3`
- [ ] Monitor 5 tasks × 3 seeds (15 jobs total)
- [ ] Verify results match paper Figure 9
### Analysis & Completion
- [ ] Collect all results from wandb
- [ ] Generate comparison plots vs paper
- [ ] Document findings and performance
- [ ] Create final report
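
Before plotting against the paper, the per-seed curves usually have to be combined. The sketch below shows one common choice, mean and population standard deviation across seeds at each evaluation step, using only the standard library. The function name `aggregate_seeds` and the input layout (seed, then step, then metric value) are assumptions for illustration; this plan does not specify how the paper's figures aggregate seeds, so check that before comparing.

```python
from statistics import mean, pstdev


def aggregate_seeds(
    curves: dict[int, dict[int, float]],
) -> dict[int, tuple[float, float]]:
    """Combine per-seed curves into (mean, std) per evaluation step.

    `curves` maps seed -> {step: metric value}. Only steps that every
    seed has logged are kept, so unfinished runs do not skew the mean.
    """
    if not curves:
        return {}
    # Steps present in all seeds.
    common = set.intersection(*(set(curve) for curve in curves.values()))
    result = {}
    for step in sorted(common):
        values = [curve[step] for curve in curves.values()]
        result[step] = (mean(values), pstdev(values))
    return result
```

Restricting to steps shared by all seeds is a deliberate choice here: it avoids comparing a three-seed average at early steps with a one-seed value at late steps while jobs are still running.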
## 📊 Experiment Details
### Phase 1: MuJoCo Playground (Figure 11 from paper)
- `T1JoystickFlatTerrain`, `T1JoystickRoughTerrain`, `G1JoystickFlatTerrain`, `G1JoystickRoughTerrain`
- **Duration**: 3600 s (1 hour) each
- **Hyperparameters**: total_timesteps=500000, num_envs=2048, batch_size=32768, buffer_size=102400, eval_interval=25000
### Phase 2: IsaacLab (Figure 10 from paper)
- `Isaac-Velocity-Flat-G1-v0`, `Isaac-Velocity-Rough-G1-v0`, `Isaac-Repose-Cube-Allegro-Direct-v0`, `Isaac-Repose-Cube-Shadow-Direct-v0`, `Isaac-Velocity-Flat-H1-v0`, `Isaac-Velocity-Rough-H1-v0`
- **Duration**: 3600 s (1 hour) each
- **Hyperparameters**: total_timesteps=1000000, num_envs=1024, batch_size=32768, buffer_size=51200, eval_interval=50000
### Phase 3: HumanoidBench (Figure 9 from paper - subset)
- `h1hand-walk`, `h1hand-run`, `h1hand-hurdle`, `h1hand-stair`, `h1hand-slide`
- **Duration**: 10800 s (3 hours) each
- **Hyperparameters**: total_timesteps=2000000, num_envs=256, batch_size=16384, buffer_size=12800, eval_interval=100000
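
The task lists and durations above are enough to estimate the overall job budget. The sketch below encodes the three phases and sums job counts and job-hours. The `Phase` class and `summarize` helper are hypothetical names, and the estimate assumes three seeds per task and that every job runs for its full listed duration. It counts job-hours, not GPU-hours, because the plan does not state the resources per job.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str
    tasks: tuple[str, ...]
    duration_s: int  # per-job duration as listed in the plan
    seeds: int = 3   # assumption: three seeds per task in every phase

    @property
    def num_jobs(self) -> int:
        return len(self.tasks) * self.seeds

    @property
    def job_hours(self) -> float:
        return self.num_jobs * self.duration_s / 3600


PHASES = (
    Phase("MuJoCo Playground", (
        "T1JoystickFlatTerrain", "T1JoystickRoughTerrain",
        "G1JoystickFlatTerrain", "G1JoystickRoughTerrain",
    ), 3600),
    Phase("IsaacLab", (
        "Isaac-Velocity-Flat-G1-v0", "Isaac-Velocity-Rough-G1-v0",
        "Isaac-Repose-Cube-Allegro-Direct-v0", "Isaac-Repose-Cube-Shadow-Direct-v0",
        "Isaac-Velocity-Flat-H1-v0", "Isaac-Velocity-Rough-H1-v0",
    ), 3600),
    Phase("HumanoidBench", (
        "h1hand-walk", "h1hand-run", "h1hand-hurdle",
        "h1hand-stair", "h1hand-slide",
    ), 10800),
)


def summarize(phases=PHASES) -> tuple[int, float]:
    """Return (total jobs, total job-hours) across all phases."""
    return (
        sum(p.num_jobs for p in phases),
        sum(p.job_hours for p in phases),
    )
```

Under these assumptions the three phases come to 12, 18, and 15 jobs, for 45 jobs and at most 75 job-hours in total.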
## 🔧 Commands
Monitor jobs:
```bash
squeue -u $USER
python monitor_experiments.py logs/experiment_tracking_1753196960.yaml --watch
```
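To look at job states from a script rather than reading the queue by hand, the `squeue` output can be parsed into a dictionary. This is a sketch under stated assumptions: `squeue` is on the `PATH`, the standard options `-h` (no header) and `-o "%i %T"` (job ID and state) are used, and the function names `parse_squeue` and `query_states` are hypothetical. It is independent of `monitor_experiments.py`, whose internals are not shown in this plan.

```python
import subprocess


def parse_squeue(text: str) -> dict[int, str]:
    """Parse lines of the form '<jobid> <STATE>' into {jobid: state}.

    Lines that do not match (for example array-job IDs such as
    '123_4') are skipped; the jobs in this plan use plain numeric IDs.
    """
    states = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].isdigit():
            states[int(parts[0])] = parts[1]
    return states


def query_states(user: str) -> dict[int, str]:
    """Run squeue for one user and return the parsed job states."""
    completed = subprocess.run(
        ["squeue", "-u", user, "-h", "-o", "%i %T"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_squeue(completed.stdout)
```

Calling `query_states` with your cluster username returns only jobs that are still in the queue; finished jobs no longer appear in `squeue`, so a missing ID means the job has left the queue, not necessarily that it succeeded.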
Submit next phases:
```bash
python submit_experiment_batch.py --phase 2 --seeds 3
python submit_experiment_batch.py --phase 3 --seeds 3
```