dppo/EXPERIMENT_PLAN.md
ys1087@partner.kit.edu d739fa5e5e Add robomimic transport test and update experiment plan
- Create robomimic transport pre-training test script
- Update EXPERIMENT_PLAN.md with square success
- Add WandB URLs for completed robomimic tests
- Track progress on remaining validation tests
2025-08-27 16:21:06 +02:00

DPPO Experiment Plan

What's Done

Installation & Setup:

  • Python 3.10 venv working on HoReKa
  • All dependencies installed (gym, robomimic, d3il)
  • WandB logging configured with "dppo-" project prefix
  • HoReKa Intel compiler fix for mujoco-py integrated into install script
  • Cython version pinned to 0.29.37 for mujoco-py compatibility
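
The version pins listed above can be checked before a job is submitted. The following is a minimal sketch, not part of the install script: the helper names `environment_problems` and `installed_cython_version` are hypothetical, and only the Python 3.10 and Cython 0.29.37 values come from this plan.

```python
import sys
from importlib import metadata

# Pins taken from this plan.
REQUIRED_PYTHON = (3, 10)
REQUIRED_CYTHON = "0.29.37"


def environment_problems(python_version, cython_version):
    """Return a list of human-readable mismatches (empty if all pins match)."""
    problems = []
    if tuple(python_version[:2]) != REQUIRED_PYTHON:
        found = f"{python_version[0]}.{python_version[1]}"
        problems.append(f"Python {found} found, expected 3.10")
    if cython_version != REQUIRED_CYTHON:
        problems.append(
            f"Cython {cython_version} found, expected {REQUIRED_CYTHON}"
        )
    return problems


def installed_cython_version():
    """Look up the installed Cython version, or None if it is missing."""
    try:
        return metadata.version("Cython")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for problem in environment_problems(sys.version_info, installed_cython_version()):
        print(problem)
```

Run inside the venv, the script prints nothing when both pins match.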

Validated Pre-training:

What We're Doing Right Now 🔄

Current Jobs:

  • 🔄 Job 3445594: Running updated installer with integrated MuJoCo fix
  • 🔄 Job 3445604: Testing robomimic square (new job)
  • 🔄 Job 3445606: Testing robomimic transport

Latest Success:

  • Job 3445550: Robomimic square pre-training SUCCESS with WandB logging!

Progress on MuJoCo Fix:

  • Identified root cause: Intel compiler flags incompatible with GCC for mujoco-py
  • Developed sysconfig patch to override Intel flags
  • Integrated fix into install script and README
  • 🔄 Waiting for the installer job to finish so the fix can be validated
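
As an illustration of the sysconfig approach described above, the sketch below strips compiler-specific flags from a copy of Python's build configuration and swaps the compiler for GCC. It is not the patch from the install script. The flag list, the set of keys, and the function names are assumptions made for the example; the flags that actually cause trouble depend on how the HoReKa Python was built.

```python
import sysconfig

# Assumed examples of Intel-only flags; the real list must be read from the
# failing build log on the cluster.
ASSUMED_INTEL_FLAGS = {"-xHost", "-ipo", "-qopenmp"}

# Config variables that commonly carry compiler flags (an assumption, not
# an exhaustive list).
FLAG_KEYS = ("CFLAGS", "OPT", "CCSHARED", "LDSHARED", "PY_CFLAGS")


def strip_flags(value, unwanted=ASSUMED_INTEL_FLAGS):
    """Drop every whitespace-separated token that is in `unwanted`."""
    return " ".join(tok for tok in value.split() if tok not in unwanted)


def patched_config(config_vars, cc="gcc"):
    """Return a copy of `config_vars` with unwanted flags removed and CC replaced."""
    patched = dict(config_vars)
    for key in FLAG_KEYS:
        value = patched.get(key)
        if isinstance(value, str):
            patched[key] = strip_flags(value)
    patched["CC"] = cc
    return patched


if __name__ == "__main__":
    # Inspect what the patch would change for the running interpreter.
    current = sysconfig.get_config_vars()
    print("CC before:", current.get("CC"))
    print("CC after: ", patched_config(current)["CC"])
```

The function works on a copy, so it only shows what would change. Applying the result to a live build (for example by updating the dictionary returned by `sysconfig.get_config_vars()` before mujoco-py compiles) is a separate step whose effect depends on when the build tooling reads the configuration.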

What Needs to Be Done 📋

Phase 1: Complete Installation Validation

Goal: Confirm every environment works in both pre-train and fine-tune modes

Remaining Pre-training Tests:

  • Robomimic: transport (in progress)
  • D3IL: avoid_m2, avoid_m3 (waiting for the full installer run to finish)

Fine-tuning Tests (after MuJoCo validation):

  • Gym: hopper, walker2d, halfcheetah
  • Robomimic: lift, can, square, transport
  • D3IL: avoid_m1, avoid_m2, avoid_m3
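
The Phase 1 goal (every environment in both modes) amounts to a small test matrix. A minimal sketch of enumerating it is shown here; the suite and task names come from the lists above, while the mode labels `pretrain` and `finetune` and the function name are placeholders chosen for the example.

```python
from itertools import product

# Environments named in this plan, grouped by suite.
SUITES = {
    "gym": ["hopper", "walker2d", "halfcheetah"],
    "robomimic": ["lift", "can", "square", "transport"],
    "d3il": ["avoid_m1", "avoid_m2", "avoid_m3"],
}

# Placeholder labels for the two modes to validate.
MODES = ["pretrain", "finetune"]


def validation_matrix(suites=SUITES, modes=MODES):
    """List every (suite, task, mode) combination that Phase 1 must cover."""
    tasks = [(suite, task) for suite, names in suites.items() for task in names]
    return [(suite, task, mode) for (suite, task), mode in product(tasks, modes)]


if __name__ == "__main__":
    for suite, task, mode in validation_matrix():
        print(f"{suite:10s} {task:12s} {mode}")
```

With 10 tasks and 2 modes the matrix has 20 entries, which gives a concrete checklist for tracking which validation jobs are still outstanding.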

Phase 2: Paper Results Generation

Goal: Run full experiments to replicate paper results

Gym Tasks (Core Paper Results):

  • hopper-medium-v2 → hopper-v2: Pre-train (200 epochs) + Fine-tune
  • walker2d-medium-v2 → walker2d-v2: Pre-train (200 epochs) + Fine-tune
  • halfcheetah-medium-v2 → halfcheetah-v2: Pre-train (200 epochs) + Fine-tune
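
The three Gym runs follow one naming pattern, so they can be described by a small data structure. This is a sketch only: the `GymExperiment` class and its field names are hypothetical and do not correspond to any configuration format used by the DPPO code. The dataset names, environment names, and the 200-epoch figure are taken from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GymExperiment:
    """One pre-train + fine-tune pair from the Gym list (hypothetical structure)."""
    dataset: str                # offline dataset used for pre-training
    env: str                    # environment used for fine-tuning
    pretrain_epochs: int = 200  # epoch count stated in this plan


def gym_experiments(tasks=("hopper", "walker2d", "halfcheetah")):
    """Build the dataset -> environment pairs using the naming pattern above."""
    return [GymExperiment(f"{task}-medium-v2", f"{task}-v2") for task in tasks]


if __name__ == "__main__":
    for exp in gym_experiments():
        print(f"{exp.dataset} -> {exp.env}: {exp.pretrain_epochs} pre-train epochs")
```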

Extended Results:

  • All Robomimic tasks: full pre-train + fine-tune
  • All D3IL tasks: full pre-train + fine-tune

Current Status

Blockers: None - all technical issues resolved

Waiting on: Cluster resources to run validation jobs

Next Step: Complete Phase 1 validation, then move to Phase 2 production runs

Success Criteria

  • All environments work in dev tests (Phase 1)
  • All paper results replicated and in WandB (Phase 2)
  • Complete documentation for future users