FastTD3/experiment_plan.md
ys1087@partner.kit.edu · commit b7b5a598 · Updated experiment_plan.md · 2025-07-24 01:11:30 +02:00


FastTD3 HoReKa Experiment Plan

Added by Dominik - Paper Replication Study

Proof of Concept Results

Initial Success: HoReKa Dev Run

  • Task: T1JoystickFlatTerrain
  • Duration: 7 minutes (5000 timesteps)
  • Performance: trained successfully at ~29 it/s
  • Key Achievement: fixed a JAX/PyTorch dtype mismatch by removing JAX_ENABLE_X64
  • Status: Environment working, ready for full-scale experiments

🚧 Currently Running Jobs

Phase 1: MuJoCo Playground - RESUBMITTED TO H100

NEW SLURM Job IDs: 3371681-3371692 (12 jobs total), using the accelerated-h100 partition (94 GB GPU RAM)

  • T1JoystickFlatTerrain (seeds 1,2,3) - Jobs: 3371681, 3371682, 3371683
  • T1JoystickRoughTerrain (seeds 1,2,3) - Jobs: 3371684, 3371685, 3371686
  • G1JoystickFlatTerrain (seeds 1,2,3) - Jobs: 3371687, 3371688, 3371689
  • G1JoystickRoughTerrain (seeds 1,2,3) - Jobs: 3371690, 3371691, 3371692
  • Status: All jobs pending in accelerated-h100 queue
  • Monitor: python monitor_experiments.py experiment_tracking_1753312228.yaml --watch
  • Note: Previous jobs (3367710-3367723) crashed due to insufficient GPU RAM on the standard partition
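The task/seed → job-ID assignment above is consecutive, so it can be reconstructed programmatically. A small sketch (using the job IDs and task names from this plan), handy when scripting per-job log or state checks:

```python
# Reconstruct the SLURM job-ID mapping for Phase 1: 4 tasks x 3 seeds,
# job IDs assigned consecutively starting from the first submitted job.
BASE_JOB_ID = 3371681
TASKS = [
    "T1JoystickFlatTerrain",
    "T1JoystickRoughTerrain",
    "G1JoystickFlatTerrain",
    "G1JoystickRoughTerrain",
]
SEEDS = [1, 2, 3]

# (task, seed) -> SLURM job ID
job_ids = {
    (task, seed): BASE_JOB_ID + i * len(SEEDS) + j
    for i, task in enumerate(TASKS)
    for j, seed in enumerate(SEEDS)
}

print(job_ids[("T1JoystickFlatTerrain", 1)])  # first job in the batch
```

This mirrors the bullet list above: e.g. G1JoystickRoughTerrain with seed 3 maps to the last ID, 3371692.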

📋 TODO List

Phase 1: MuJoCo Playground

  • Set up MuJoCo Playground environment
  • Test 5000-step run successfully
  • Submit full batch (4 tasks × 3 seeds)
  • Wait for jobs to complete (~1 hour each)
  • Verify results match paper Figure 11
  • Download wandb data for analysis

Phase 2: IsaacLab

  • INSTALL ISAACLAB ENVIRONMENT FIRST
  • Test single IsaacLab task
  • Submit batch: python submit_experiment_batch.py --phase 2 --seeds 3
  • Monitor 6 tasks × 3 seeds (18 jobs total)
  • Verify results match paper Figure 10

Phase 3: HumanoidBench

  • INSTALL HUMANOIDBENCH ENVIRONMENT FIRST
  • Test single HumanoidBench task
  • Submit batch: python submit_experiment_batch.py --phase 3 --seeds 3
  • Monitor 5 tasks × 3 seeds (15 jobs total)
  • Verify results match paper Figure 9

Analysis & Completion

  • Collect all results from wandb
  • Generate comparison plots vs paper
  • Document findings and performance
  • Create final report
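The per-task aggregation for the comparison plots could follow this sketch. The run dicts here are mock data; the actual field names depend on how metrics are logged to wandb in this setup:

```python
from collections import defaultdict

# Mock run summaries standing in for wandb exports (field names assumed).
runs = [
    {"task": "T1JoystickFlatTerrain", "seed": 1, "final_return": 10.0},
    {"task": "T1JoystickFlatTerrain", "seed": 2, "final_return": 12.0},
    {"task": "T1JoystickFlatTerrain", "seed": 3, "final_return": 14.0},
]

# Group final returns by task, then average across seeds.
by_task = defaultdict(list)
for run in runs:
    by_task[run["task"]].append(run["final_return"])

mean_return = {task: sum(vals) / len(vals) for task, vals in by_task.items()}
```

Swapping the mock list for real wandb summaries keeps the grouping logic unchanged.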

📊 Experiment Details

Phase 1: MuJoCo Playground (Figure 11 from paper)

  • T1JoystickFlatTerrain, T1JoystickRoughTerrain, G1JoystickFlatTerrain, G1JoystickRoughTerrain
  • Duration: 3600s each
  • Hyperparameters: total_timesteps=500000, num_envs=2048, batch_size=32768, buffer_size=102400, eval_interval=25000

Phase 2: IsaacLab (Figure 10 from paper)

  • Isaac-Velocity-Flat-G1-v0, Isaac-Velocity-Rough-G1-v0, Isaac-Repose-Cube-Allegro-Direct-v0, Isaac-Repose-Cube-Shadow-Direct-v0, Isaac-Velocity-Flat-H1-v0, Isaac-Velocity-Rough-H1-v0
  • Duration: 3600s each
  • Hyperparameters: total_timesteps=1000000, num_envs=1024, batch_size=32768, buffer_size=51200, eval_interval=50000

Phase 3: HumanoidBench (Figure 9 from paper - subset)

  • h1hand-walk, h1hand-run, h1hand-hurdle, h1hand-stair, h1hand-slide
  • Duration: 10800s each
  • Hyperparameters: total_timesteps=2000000, num_envs=256, batch_size=16384, buffer_size=12800, eval_interval=100000
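The three phase configurations above share a structure worth noting: in every phase, buffer_size is exactly 50 × num_envs. A sketch collecting the listed hyperparameters in one place (values copied from this plan):

```python
# Hyperparameters per phase, as listed in the sections above.
PHASES = {
    1: dict(total_timesteps=500_000, num_envs=2048, batch_size=32_768,
            buffer_size=102_400, eval_interval=25_000),
    2: dict(total_timesteps=1_000_000, num_envs=1024, batch_size=32_768,
            buffer_size=51_200, eval_interval=50_000),
    3: dict(total_timesteps=2_000_000, num_envs=256, batch_size=16_384,
            buffer_size=12_800, eval_interval=100_000),
}

# Sanity check: each phase keeps a 50-step-per-env replay buffer.
for cfg in PHASES.values():
    assert cfg["buffer_size"] == 50 * cfg["num_envs"]
```

Keeping the configs in one dict like this makes it easy to cross-check submitted jobs against the plan.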

🔧 Commands

Monitor jobs:

squeue -u $USER
python monitor_experiments.py logs/experiment_tracking_1753196960.yaml --watch
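For a quick overview beyond raw squeue output, the state column can be tallied. A minimal sketch, assuming the default squeue column layout (state in the fourth column; the sample text is illustrative):

```python
from collections import Counter

# Sample squeue -u $USER output (illustrative, not real job data).
sample = """\
JOBID PARTITION NAME ST
3371681 accelerated-h100 ftd3 PD
3371682 accelerated-h100 ftd3 R
3371683 accelerated-h100 ftd3 PD
"""

# Skip the header row, count jobs per state (PD = pending, R = running).
states = Counter(line.split()[3] for line in sample.strip().splitlines()[1:])
print(states)
```

Piping real squeue output into a script like this gives a one-line pending/running summary per check.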

Submit next phases:

python submit_experiment_batch.py --phase 2 --seeds 3
python submit_experiment_batch.py --phase 3 --seeds 3