# DPPO Experiment Plan

## Current Status

### Setup Complete ✅
- Installation successful on HoReKa with Python 3.10 venv
- SLURM scripts created for automated job submission
- All dependencies installed including PyTorch, d4rl, dm-control

### Initial Testing 🔄
**Job ID 3445081**: Development test (30min) - PENDING
- Command: `./submit_job.sh dev`
- Status: Waiting for resources on dev_accelerated partition
- Purpose: Verify DPPO can run on HoReKa with basic pre-training

## Experiments To Run

### 1. Reproduce Paper Results - Gym Tasks
**Pre-training Phase**:
- hopper-medium-v2
- walker2d-medium-v2
- halfcheetah-medium-v2

**Fine-tuning Phase**:
- hopper-v2
- walker2d-v2
- halfcheetah-v2

**Settings**: Paper hyperparameters, 3 seeds each

### 2. Additional Environments (Future)
**Robomimic Suite**:
- lift, can, square, transport

**D3IL Suite**:
- avoid_m1, avoid_m2, avoid_m3

**Furniture-Bench Suite**:
- one_leg, lamp, round_table (low/med difficulty)

## Running Experiments

### Quick Development Test
```bash
./submit_job.sh dev
```

### Gym Pre-training
```bash
./submit_job.sh gym hopper pretrain
./submit_job.sh gym walker2d pretrain
./submit_job.sh gym halfcheetah pretrain
```

### Gym Fine-tuning (after pre-training completes)
```bash
./submit_job.sh gym hopper finetune
./submit_job.sh gym walker2d finetune
./submit_job.sh gym halfcheetah finetune
```

### Manual SLURM Submission
```bash
# With environment variables
TASK=hopper MODE=pretrain sbatch slurm/run_dppo_gym.sh
```

## Job Tracking

| Job ID | Type | Task | Mode | Status | Duration | Results |
|--------|------|------|------|--------|----------|---------|
| 3445081 | dev test | hopper | pretrain | PENDING | 30min | - |

## Configuration Notes

### WandB Setup Required
```bash
export WANDB_API_KEY=
export WANDB_ENTITY=
```

### Resource Requirements
- **Dev jobs**: 30min, 24GB RAM, 8 CPUs, dev_accelerated
- **Production**: 8h, 32GB RAM, 40 CPUs, accelerated

## Issues Encountered

None so far - installation completed without code modifications.

## Next Steps

1. Wait for dev test to complete
2. Analyze dev test results
3. Begin systematic pre-training experiments
4. Document any issues or required fixes
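
### Optional: Chaining Fine-tuning After Pre-training (sketch)

Fine-tuning must start only after pre-training completes. One way to enforce that ordering is to let SLURM handle it with job dependencies instead of waiting and resubmitting by hand. The script below is a minimal sketch, not part of the existing tooling. It assumes that `slurm/run_dppo_gym.sh` also accepts `MODE=finetune` through the environment (this plan only shows `MODE=pretrain` in that form). The `submit` helper and the `DRY_RUN` switch are hypothetical names introduced for illustration. `--parsable` and `--dependency=afterok:<jobid>` are standard `sbatch` options.

```bash
#!/usr/bin/env bash
# Sketch: submit pre-training, then fine-tuning that starts only if
# pre-training exits successfully. DRY_RUN=1 (default) only prints commands.
set -euo pipefail

DRY_RUN="${DRY_RUN:-1}"

# submit TASK MODE [DEPENDENCY_JOB_ID]
# Prints the job ID on stdout so the caller can chain on it.
submit() {
  local task="$1" mode="$2" dep="${3:-}"
  local args=(--parsable)
  if [[ -n "$dep" ]]; then
    args+=("--dependency=afterok:${dep}")
  fi
  if [[ "$DRY_RUN" == "1" ]]; then
    # Show what would be submitted; return a placeholder ID.
    echo "TASK=${task} MODE=${mode} sbatch ${args[*]} slurm/run_dppo_gym.sh" >&2
    echo "dryrun-${task}-${mode}"
  else
    local id
    id="$(TASK="$task" MODE="$mode" sbatch "${args[@]}" slurm/run_dppo_gym.sh)"
    # --parsable may print "jobid;cluster"; keep only the job ID.
    echo "${id%%;*}"
  fi
}

for task in hopper walker2d halfcheetah; do
  pre_id="$(submit "$task" pretrain)"
  submit "$task" finetune "$pre_id" >/dev/null
done
```

Running the script as-is prints the six `sbatch` commands without submitting anything. Setting `DRY_RUN=0` on a login node would submit them for real. With `afterok`, a fine-tuning job stays pending until its pre-training job finishes with exit code 0, and SLURM will not start it if pre-training fails.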