- Complete validation status table with results for all environments
- Add WandB tracking URLs for completed fine-tuning runs
- Document technical fixes and current job queue status
- Add test scripts for remaining D3IL avoid_m3 and robomimic transport validation
# DPPO Validation Status

## Environment Testing Progress

| Environment | Pre-train | Fine-tune | Result | WandB URL |
|-------------|-----------|-----------|--------|-----------|
| **Gym (MuJoCo)** | | | | |
| hopper-medium-v2 | Complete | Complete | 1415.85 | [Run](https://wandb.ai/dominik_roth/dppo-gym-hopper-medium-v2-finetune/runs/hpvpzp50) |
| walker2d-medium-v2 | Complete | Complete | 2977.97 | [Run](https://wandb.ai/dominik_roth/dppo-gym-walker2d-medium-v2-finetune/runs/70b8ioli) |
| halfcheetah-medium-v2 | Complete | Complete | 4058.34 | [Run](https://wandb.ai/dominik_roth/dppo-gym-halfcheetah-medium-v2-finetune/runs/ya612mef) |
| **Robomimic** | | | | |
| lift | Complete | Complete | 69% success | [Run](https://wandb.ai/dominik_roth/robomimic-lift-finetune/runs/aih90dlk) |
| can | Complete | Complete | 85.89% success | [Run](https://wandb.ai/dominik_roth/robomimic-can-finetune/runs/f9nl5u17) |
| square | Complete | Running (job 3446120) | Pending | - |
| transport | Complete | Running (job 3446147) | Pending | - |
| **D3IL** | | | | |
| avoid_m1 | Complete | Complete | 87.7 reward | [Run](https://wandb.ai/dominik_roth/d3il-avoiding-m5-m1-finetune/runs/ugkrcngm) |
| avoid_m2 | Complete | Complete | 82.46 reward | [Run](https://wandb.ai/dominik_roth/d3il-avoiding-m5-m2-finetune/runs/farekalr) |
| avoid_m3 | Ready | Queued (job 3446146) | Pending | - |
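The run links in the table follow the layout `https://wandb.ai/<entity>/<project>/runs/<run_id>`. As a minimal sketch, the helper below converts such a link into the `entity/project/run_id` path form that the WandB public API accepts. The function name `wandb_run_path` is hypothetical and not part of this repository.

```python
from urllib.parse import urlparse


def wandb_run_path(url: str) -> str:
    """Convert a WandB run URL into an 'entity/project/run_id' path."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    # Expected layout: <entity>/<project>/runs/<run_id>
    if len(parts) != 4 or parts[2] != "runs":
        raise ValueError(f"not a WandB run URL: {url}")
    entity, project, _, run_id = parts
    return f"{entity}/{project}/{run_id}"


if __name__ == "__main__":
    url = (
        "https://wandb.ai/dominik_roth/"
        "dppo-gym-hopper-medium-v2-finetune/runs/hpvpzp50"
    )
    print(wandb_run_path(url))
```

With the `wandb` package installed and credentials configured, the returned path can be passed to `wandb.Api().run(path)` to inspect a run's summary. Which metric keys hold the values in the Result column is not documented here, so it needs to be checked against the runs themselves.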
## Technical Issues Resolved

- MuJoCo compilation under the Intel compiler (resolved with a GCC wrapper)
- SLURM job scheduling and resource allocation
- WandB logging configuration
- Configuration parameter corrections for D3IL
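For the jobs still listed as running or queued, the state can be polled from Python through the standard `squeue` command. This is a sketch only: it assumes `squeue` is on the `PATH` of the login node, and the helper names `squeue_command` and `job_state` are hypothetical.

```python
import subprocess
from typing import List


def squeue_command(job_id: int) -> List[str]:
    """Build an squeue call that prints only the state of one job."""
    # -j selects the job, -h drops the header row,
    # -o %T prints the job state (for example RUNNING or PENDING).
    return ["squeue", "-j", str(job_id), "-h", "-o", "%T"]


def job_state(job_id: int) -> str:
    """Return the state reported by squeue, or a placeholder string."""
    try:
        result = subprocess.run(
            squeue_command(job_id),
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        # squeue is not installed on this machine.
        return "SQUEUE_UNAVAILABLE"
    state = result.stdout.strip()
    # Finished jobs eventually drop out of the queue and print nothing.
    return state if state else "NOT_IN_QUEUE"
```

Calling `job_state(3446120)` would report the state of the robomimic square fine-tuning job. A job that has already left the queue returns `NOT_IN_QUEUE`, in which case `sacct` is the usual place to look for its final status.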
## Next Steps

1. Complete remaining validation runs
2. Begin full paper replication experiments
3. Generate performance comparison tables
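One way the comparison tables from step 3 could be generated is sketched below. The function `comparison_table` is a hypothetical helper, and the reference values are deliberately left empty, because this document does not list the paper's numbers.

```python
from typing import Dict, Optional


def comparison_table(
    ours: Dict[str, float],
    reference: Dict[str, Optional[float]],
) -> str:
    """Render a Markdown table comparing our results with reference values."""
    lines = [
        "| Environment | This run | Reference | Difference |",
        "|-------------|----------|-----------|------------|",
    ]
    for env, value in ours.items():
        ref = reference.get(env)
        if ref is None:
            # No reference value available yet: leave both cells open.
            ref_cell, diff_cell = "n/a", "n/a"
        else:
            ref_cell, diff_cell = f"{ref:.2f}", f"{value - ref:+.2f}"
        lines.append(f"| {env} | {value:.2f} | {ref_cell} | {diff_cell} |")
    return "\n".join(lines)


if __name__ == "__main__":
    # Gym results taken from the table in this document.
    ours = {
        "hopper-medium-v2": 1415.85,
        "walker2d-medium-v2": 2977.97,
        "halfcheetah-medium-v2": 4058.34,
    }
    # Reference values are intentionally missing; fill in from the paper.
    print(comparison_table(ours, reference={}))
```

Keeping missing references as `n/a` rather than zero avoids reporting a misleading difference before the replication baselines are entered.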