From f88a5be4fe0c217a03de65378c503e921b3935a1 Mon Sep 17 00:00:00 2001
From: "ys1087@partner.kit.edu"
Date: Wed, 27 Aug 2025 12:06:34 +0200
Subject: [PATCH] Add experiment tracking plan

Document current status, planned experiments, and job tracking
Following REPPO experiment documentation pattern
---
 EXPERIMENT_PLAN.md | 97 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 97 insertions(+)
 create mode 100644 EXPERIMENT_PLAN.md

diff --git a/EXPERIMENT_PLAN.md b/EXPERIMENT_PLAN.md
new file mode 100644
index 0000000..f0b6f10
--- /dev/null
+++ b/EXPERIMENT_PLAN.md
@@ -0,0 +1,97 @@
+# DPPO Experiment Plan
+
+## Current Status
+
+### Setup Complete ✅
+- Installation successful on HoReKa with Python 3.10 venv
+- SLURM scripts created for automated job submission
+- All dependencies installed including PyTorch, d4rl, dm-control
+
+### Initial Testing
+🔄 **Job ID 3445081**: Development test (30min) - PENDING
+- Command: `./submit_job.sh dev`
+- Status: Waiting for resources on dev_accelerated partition
+- Purpose: Verify DPPO can run on HoReKa with basic pre-training
+
+## Experiments To Run
+
+### 1. Reproduce Paper Results - Gym Tasks
+
+**Pre-training Phase**:
+- hopper-medium-v2
+- walker2d-medium-v2
+- halfcheetah-medium-v2
+
+**Fine-tuning Phase**:
+- hopper-v2
+- walker2d-v2
+- halfcheetah-v2
+
+**Settings**: Paper hyperparameters, 3 seeds each
+
+### 2. Additional Environments (Future)
+
+**Robomimic Suite**:
+- lift, can, square, transport
+
+**D3IL Suite**:
+- avoid_m1, avoid_m2, avoid_m3
+
+**Furniture-Bench Suite**:
+- one_leg, lamp, round_table (low/med difficulty)
+
+## Running Experiments
+
+### Quick Development Test
+```bash
+./submit_job.sh dev
+```
+
+### Gym Pre-training
+```bash
+./submit_job.sh gym hopper pretrain
+./submit_job.sh gym walker2d pretrain
+./submit_job.sh gym halfcheetah pretrain
+```
+
+### Gym Fine-tuning (after pre-training completes)
+```bash
+./submit_job.sh gym hopper finetune
+./submit_job.sh gym walker2d finetune
+./submit_job.sh gym halfcheetah finetune
+```
+
+### Manual SLURM Submission
+```bash
+# With environment variables
+TASK=hopper MODE=pretrain sbatch slurm/run_dppo_gym.sh
+```
+
+## Job Tracking
+
+| Job ID | Type | Task | Mode | Status | Duration | Results |
+|--------|------|------|------|---------|----------|---------|
+| 3445081 | dev test | hopper | pretrain | PENDING | 30min | - |
+
+## Configuration Notes
+
+### WandB Setup Required
+```bash
+export WANDB_API_KEY=
+export WANDB_ENTITY=
+```
+
+### Resource Requirements
+- **Dev jobs**: 30min, 24GB RAM, 8 CPUs, dev_accelerated
+- **Production**: 8h, 32GB RAM, 40 CPUs, accelerated
+
+## Issues Encountered
+
+None so far - installation completed without code modifications.
+
+## Next Steps
+
+1. Wait for dev test to complete
+2. Analyze dev test results
+3. Begin systematic pre-training experiments
+4. Document any issues or required fixes
\ No newline at end of file
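
The plan asks for "3 seeds each" across the three Gym tasks, while the manual SLURM submission shows a single `TASK`/`MODE` pair. A minimal dry-run sketch of that sweep is given here. It only prints the `sbatch` commands it would run, so it can be checked without touching the queue. `TASK` and `MODE` come from the plan; `SEED`, the seed values `0 1 2`, and the helper name `plan_submissions` are hypothetical, and `slurm/run_dppo_gym.sh` may not read a `SEED` variable at all.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print one sbatch command per (task, seed) pair.
# TASK and MODE mirror the manual submission in EXPERIMENT_PLAN.md.
# SEED is an ASSUMPTION: the job script may need changes to honor it.
set -euo pipefail

TASKS=(hopper walker2d halfcheetah)
SEEDS=(0 1 2)   # hypothetical seed values; the plan only says "3 seeds each"

plan_submissions() {
    local mode="$1"   # "pretrain" or "finetune"
    local task seed
    for task in "${TASKS[@]}"; do
        for seed in "${SEEDS[@]}"; do
            echo "TASK=${task} MODE=${mode} SEED=${seed} sbatch slurm/run_dppo_gym.sh"
        done
    done
}

# Print the pre-training sweep; nothing is submitted.
plan_submissions pretrain
```

Printing rather than submitting keeps the sweep reviewable, and each printed line can be pasted into the Job Tracking table once a real job ID exists. Piping the output to `bash` would submit the jobs, assuming the job script accepts these variables.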