# DPPO Experiment Plan

## Current Status

### Setup Complete ✅
- Installation successful on HoReKa with Python 3.10 venv
- SLURM scripts created for automated job submission
- All dependencies installed including PyTorch, d4rl, dm-control
### Initial Testing
- ❌ Job ID 3445081: development test (30 min limit)
  - Command: `./submit_job.sh dev`
  - Status: FAILED after 33 s with a Hydra config error on the `dev_accelerated` partition (see Issues Encountered below); a corrected run is pending
  - Purpose: verify DPPO can run on HoReKa with basic pre-training
## Experiments To Run

### 1. Reproduce Paper Results - Gym Tasks
Pre-training phase:
- `hopper-medium-v2`
- `walker2d-medium-v2`
- `halfcheetah-medium-v2`

Fine-tuning phase:
- `hopper-v2`
- `walker2d-v2`
- `halfcheetah-v2`

Settings: paper hyperparameters, 3 seeds each.
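With three tasks at three seeds each, the pre-training sweep is easy to script. A dry-run sketch follows; it assumes `submit_job.sh` would accept a trailing `seed=` override, which is hypothetical, so the `echo` only prints each command instead of submitting it.

```shell
# Dry-run seed sweep over the three Gym tasks: prints each submit command
# instead of executing it. The trailing seed=... argument is hypothetical.
JOBS=0
for task in hopper walker2d halfcheetah; do
  for seed in 0 1 2; do
    echo "./submit_job.sh gym ${task} pretrain seed=${seed}"
    JOBS=$((JOBS + 1))
  done
done
echo "${JOBS} jobs would be submitted"
```

Drop the `echo` wrapper around the submit command once the script's seed handling is confirmed.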
### 2. Additional Environments (Future)
Robomimic suite:
- lift, can, square, transport

D3IL suite:
- avoid_m1, avoid_m2, avoid_m3

Furniture-Bench suite:
- one_leg, lamp, round_table (low/med difficulty)
## Running Experiments

### Quick Development Test
```shell
./submit_job.sh dev
```

### Gym Pre-training
```shell
./submit_job.sh gym hopper pretrain
./submit_job.sh gym walker2d pretrain
./submit_job.sh gym halfcheetah pretrain
```

### Gym Fine-tuning (after pre-training completes)
```shell
./submit_job.sh gym hopper finetune
./submit_job.sh gym walker2d finetune
./submit_job.sh gym halfcheetah finetune
```
### Manual SLURM Submission
```shell
# With environment variables
TASK=hopper MODE=pretrain sbatch slurm/run_dppo_gym.sh
```
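For reference, a minimal sketch of how `slurm/run_dppo_gym.sh` might consume those environment variables, falling back to defaults when they are unset (hypothetical; the actual script may differ, and `RUN_NAME` is an invented variable):

```shell
# Hypothetical excerpt of slurm/run_dppo_gym.sh: fall back to defaults
# when TASK/MODE are not provided by the caller.
TASK="${TASK:-hopper}"     # which Gym task to run
MODE="${MODE:-pretrain}"   # pretrain or finetune
RUN_NAME="dppo_${TASK}_${MODE}"
echo "launching ${RUN_NAME}"
```

The `${VAR:-default}` form means a plain `sbatch slurm/run_dppo_gym.sh` with no variables still submits a sensible job.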
## Job Tracking

| Job ID  | Type     | Task   | Mode     | Status    | Duration | Results            |
|---------|----------|--------|----------|-----------|----------|--------------------|
| 3445081 | dev test | hopper | pretrain | ❌ FAILED | 33 s     | Hydra config error |
## Configuration Notes

### WandB Setup Required
```shell
export WANDB_API_KEY=<your_api_key>
export WANDB_ENTITY=<your_username>
```
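A defensive check before submitting can catch missing credentials early. This helper is a sketch and not part of the existing scripts; it uses bash indirect expansion.

```shell
# Verify the WandB variables are set before submitting any jobs
# (hypothetical helper, not part of submit_job.sh).
check_wandb_env() {
  local var
  for var in WANDB_API_KEY WANDB_ENTITY; do
    if [ -z "${!var:-}" ]; then
      echo "error: ${var} is not set" >&2
      return 1
    fi
  done
  echo "WandB environment looks good"
}
```

Usage: `check_wandb_env && ./submit_job.sh dev` refuses to submit if either variable is missing.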
### Resource Requirements
- Dev jobs: 30 min, 24 GB RAM, 8 CPUs, `dev_accelerated` partition
- Production jobs: 8 h, 32 GB RAM, 40 CPUs, `accelerated` partition
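Those figures translate into a SLURM header like the following (a sketch for the dev configuration; the job name and GPU request are assumptions not stated above):

```shell
#!/bin/bash
# Dev-job resource header matching the limits listed above (sketch).
#SBATCH --job-name=dppo-dev        # assumed name
#SBATCH --partition=dev_accelerated
#SBATCH --time=00:30:00
#SBATCH --mem=24G
#SBATCH --cpus-per-task=8
#SBATCH --gres=gpu:1               # assumption: one GPU per dev job
```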
## Issues Encountered

### Fixed Issues
- Hydra Configuration Error (Job 3445081)
  - Issue: wrong parameter name in the dev script (`train.n_iters` instead of `train.n_epochs`)
  - Fix: updated to use the correct DPPO config parameter
  - Status: fixed in commit
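For context, DPPO runs are configured through Hydra overrides, so the failing vs. corrected invocation looks roughly like this (the config directory, config name, and epoch count are illustrative, not taken from the actual dev script):

```shell
# Failing form: Hydra rejects the unknown key train.n_iters
#   python script/run.py --config-dir=cfg/gym/pretrain/hopper-medium-v2 \
#       --config-name=pre_diffusion_mlp train.n_iters=1
# Corrected form: the DPPO train config exposes n_epochs
python script/run.py --config-dir=cfg/gym/pretrain/hopper-medium-v2 \
    --config-name=pre_diffusion_mlp train.n_epochs=1
```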
## Next Steps
- Wait for dev test to complete
- Analyze dev test results
- Begin systematic pre-training experiments
- Document any issues or required fixes