# FastTD3 HoReKa Experiment Plan

*Added by Dominik - Paper Replication Study*

## ✅ Proof of Concept Results

**Initial Success**: [HoReKa Dev Run](https://wandb.ai/rl-network-scaling/FastTD3_HoReKa_Dev?nw=nwuserdominik_roth)

- **Task**: T1JoystickFlatTerrain
- **Duration**: 7 minutes (5000 timesteps)
- **Performance**: Training successfully at ~29 it/s
- **Key Achievement**: Fixed the JAX/PyTorch dtype mismatch (removed JAX_ENABLE_X64)
- **Status**: ✅ Environment working, ready for full-scale experiments

## 🚧 Currently Running Jobs

### Phase 1: MuJoCo Playground - RESUBMITTED TO H100 ✅

**NEW SLURM Job IDs**: 3371681-3371692 (12 jobs total), using the accelerated-h100 partition (94 GB GPU RAM)

- ⏳ T1JoystickFlatTerrain (seeds 1, 2, 3) - Jobs: 3371681, 3371682, 3371683
- ⏳ T1JoystickRoughTerrain (seeds 1, 2, 3) - Jobs: 3371684, 3371685, 3371686
- ⏳ G1JoystickFlatTerrain (seeds 1, 2, 3) - Jobs: 3371687, 3371688, 3371689
- ⏳ G1JoystickRoughTerrain (seeds 1, 2, 3) - Jobs: 3371690, 3371691, 3371692
- **Status**: All jobs pending in the accelerated-h100 queue
- **Monitor**: `python monitor_experiments.py experiment_tracking_1753312228.yaml --watch`
- **Note**: The previous jobs (3367710-3367723) crashed due to insufficient GPU RAM on the standard partition

## 📋 TODO List

### Phase 1: MuJoCo Playground

- [x] Set up MuJoCo Playground environment
- [x] Test 5000-step run successfully
- [x] Submit full batch (4 tasks × 3 seeds)
- [ ] Wait for jobs to complete (~1 hour each)
- [ ] Verify results match paper Figure 11
- [ ] Download wandb data for analysis

### Phase 2: IsaacLab

- [ ] **INSTALL ISAACLAB ENVIRONMENT FIRST**
- [ ] Test a single IsaacLab task
- [ ] Submit batch: `python submit_experiment_batch.py --phase 2 --seeds 3`
- [ ] Monitor 6 tasks × 3 seeds (18 jobs total)
- [ ] Verify results match paper Figure 10

### Phase 3: HumanoidBench

- [ ] **INSTALL HUMANOIDBENCH ENVIRONMENT FIRST**
- [ ] Test a single HumanoidBench task
- [ ] Submit batch: `python submit_experiment_batch.py --phase 3 --seeds 3`
- [ ] Monitor 5 tasks × 3 seeds (15 jobs total)
- [ ] Verify results match paper Figure 9

### Analysis & Completion

- [ ] Collect all results from wandb
- [ ] Generate comparison plots vs. the paper
- [ ] Document findings and performance
- [ ] Create final report

## 📊 Experiment Details

### Phase 1: MuJoCo Playground (Figure 11 from paper)

- Tasks: `T1JoystickFlatTerrain`, `T1JoystickRoughTerrain`, `G1JoystickFlatTerrain`, `G1JoystickRoughTerrain`
- **Duration**: 3600 s each
- **Hyperparameters**: total_timesteps=500000, num_envs=2048, batch_size=32768, buffer_size=102400, eval_interval=25000

### Phase 2: IsaacLab (Figure 10 from paper)

- Tasks: `Isaac-Velocity-Flat-G1-v0`, `Isaac-Velocity-Rough-G1-v0`, `Isaac-Repose-Cube-Allegro-Direct-v0`, `Isaac-Repose-Cube-Shadow-Direct-v0`, `Isaac-Velocity-Flat-H1-v0`, `Isaac-Velocity-Rough-H1-v0`
- **Duration**: 3600 s each
- **Hyperparameters**: total_timesteps=1000000, num_envs=1024, batch_size=32768, buffer_size=51200, eval_interval=50000

### Phase 3: HumanoidBench (Figure 9 from paper - subset)

- Tasks: `h1hand-walk`, `h1hand-run`, `h1hand-hurdle`, `h1hand-stair`, `h1hand-slide`
- **Duration**: 10800 s each
- **Hyperparameters**: total_timesteps=2000000, num_envs=256, batch_size=16384, buffer_size=12800, eval_interval=100000

## 🔧 Commands

Monitor jobs:

```bash
squeue -u $USER
python monitor_experiments.py logs/experiment_tracking_1753196960.yaml --watch
```

Submit the next phases:

```bash
python submit_experiment_batch.py --phase 2 --seeds 3
python submit_experiment_batch.py --phase 3 --seeds 3
```
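For quick checks between runs of `monitor_experiments.py`, the `squeue -u $USER` output can also be tallied programmatically. The sketch below is not part of the actual monitoring script; it is a minimal stand-alone helper that assumes the default `squeue` column layout, where the fifth column (`ST`) holds the state code (`PD` = pending, `R` = running). The sample output and job names are illustrative.

```python
from collections import Counter

def count_job_states(squeue_output: str) -> Counter:
    """Tally SLURM job states from `squeue -u $USER` text output.

    Assumes the default squeue format: the 5th whitespace-separated
    column (ST) is the state code (PD = pending, R = running, etc.).
    """
    states = Counter()
    for line in squeue_output.strip().splitlines()[1:]:  # skip the header row
        cols = line.split()
        if len(cols) >= 5:
            states[cols[4]] += 1
    return states

# Illustrative sample output (made-up node names and times):
demo = """\
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
3371681 accelerated-h100 fasttd3 dominik PD 0:00 1 (Priority)
3371682 accelerated-h100 fasttd3 dominik R 12:34 1 hkn0701
"""
print(count_job_states(demo))  # one pending, one running
```

In practice the text would come from `subprocess.run(["squeue", "-u", user], capture_output=True, text=True).stdout` rather than a hard-coded string.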
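For the "Collect all results from wandb" step, each task will have one final return per seed, and comparing against the paper's figures needs a mean ± std per task. The helper below is a minimal sketch of that aggregation, not the project's actual analysis code; the task names are real, but the return values are placeholders.

```python
from statistics import mean, stdev

def summarize_across_seeds(results: dict) -> dict:
    """Aggregate final returns per task across seeds.

    `results` maps task name -> {seed: final_return}. Returns, per task,
    the mean, sample standard deviation, and number of seeds.
    """
    summary = {}
    for task, per_seed in results.items():
        returns = list(per_seed.values())
        summary[task] = {
            "mean": mean(returns),
            "std": stdev(returns) if len(returns) > 1 else 0.0,
            "n_seeds": len(returns),
        }
    return summary

# Placeholder returns for 3 seeds per task, matching the plan's layout:
demo = {
    "T1JoystickFlatTerrain": {1: 12.0, 2: 14.0, 3: 13.0},
    "G1JoystickFlatTerrain": {1: 8.0, 2: 9.0, 3: 10.0},
}
print(summarize_across_seeds(demo)["T1JoystickFlatTerrain"])
# {'mean': 13.0, 'std': 1.0, 'n_seeds': 3}
```

With real data, `results` would be filled from the wandb export rather than typed in by hand.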