diff --git a/experiment_plan.md b/experiment_plan.md
index 4c09d00..b346de4 100644
--- a/experiment_plan.md
+++ b/experiment_plan.md
@@ -12,14 +12,15 @@
 
 ## 🚧 Currently Running Jobs
 
-### Phase 1: MuJoCo Playground - SUBMITTED ✅
-**SLURM Job IDs**: 3367710-3367723 (12 jobs total)
-- ⏳ T1JoystickFlatTerrain (seeds 1,2,3) - Jobs: 3367710, 3367711, 3367712
-- ⏳ T1JoystickRoughTerrain (seeds 1,2,3) - Jobs: 3367713, 3367716, 3367717
-- ⏳ G1JoystickFlatTerrain (seeds 1,2,3) - Jobs: 3367718, 3367719, 3367720
-- ⏳ G1JoystickRoughTerrain (seeds 1,2,3) - Jobs: 3367721, 3367722, 3367723
-- **Status**: All jobs pending in queue
-- **Monitor**: `python monitor_experiments.py logs/experiment_tracking_1753196960.yaml --watch`
+### Phase 1: MuJoCo Playground - RESUBMITTED TO H100 ✅
+**NEW SLURM Job IDs**: 3371681-3371692 (12 jobs total) - Using accelerated-h100 partition (94GB GPU RAM)
+- ⏳ T1JoystickFlatTerrain (seeds 1,2,3) - Jobs: 3371681, 3371682, 3371683
+- ⏳ T1JoystickRoughTerrain (seeds 1,2,3) - Jobs: 3371684, 3371685, 3371686
+- ⏳ G1JoystickFlatTerrain (seeds 1,2,3) - Jobs: 3371687, 3371688, 3371689
+- ⏳ G1JoystickRoughTerrain (seeds 1,2,3) - Jobs: 3371690, 3371691, 3371692
+- **Status**: All jobs pending in accelerated-h100 queue
+- **Monitor**: `python monitor_experiments.py experiment_tracking_1753312228.yaml --watch`
+- **Note**: Previous jobs (3367710-3367723) crashed due to insufficient GPU RAM on standard partition
 
 ## 📋 TODO List
 