Commit Graph

58 Commits

Author SHA1 Message Date
ys1087@partner.kit.edu
b259157a31 Launch Phase 2: Complete DPPO paper replication
- Submit all 10 full replication runs on accelerated partition
- Update experiment plan with complete validation results and full run status
- Add comprehensive full run scripts for robomimic and D3IL environments
- All validated environments now running full paper-quality experiments
- Total queue: 3 Gym + 4 Robomimic + 3 D3IL fine-tuning runs
2025-08-27 22:52:19 +02:00
ys1087@partner.kit.edu
cb9846484f Update experiment plan with validation results and WandB URLs
- Complete validation status table with results for all environments
- Add WandB tracking URLs for completed fine-tuning runs
- Document technical fixes and current job queue status
- Add test scripts for remaining D3IL avoid_m3 and robomimic transport validation
2025-08-27 22:14:10 +02:00
ys1087@partner.kit.edu
bda37869e1 Add remaining validation test scripts and D3IL installer
- Additional robomimic fine-tuning tests: can, square
- D3IL avoid_m2 and avoid_m3 validation scripts
- D3IL installation script for SLURM
- Add d3il_repo/ to gitignore
- Comprehensive test coverage for all environment types
2025-08-27 21:06:44 +02:00
ys1087@partner.kit.edu
314a3f3c06 Add comprehensive dev test scripts and update experiment plan
- Complete SLURM test scripts for all environment types
- Gym fine-tuning: walker2d, halfcheetah validation tests
- Robomimic fine-tuning: lift validation test with scheduler fix
- D3IL validation: avoid_m1 pre-training and fine-tuning tests
- Updated experiment plan with current validation status
- All major environments now have automated testing pipeline
2025-08-27 21:02:55 +02:00
ys1087@partner.kit.edu
7e800c9a33 Complete MuJoCo fix and validate hopper fine-tuning
- Add GCC wrapper script to filter Intel compiler flags
- Download missing mujoco-py generated files automatically
- Update installer with comprehensive MuJoCo fixes
- Document complete solution in README and EXPERIMENT_PLAN
- Hopper fine-tuning validated with reward 1415.8471
- All pre-training environments working
- DPPO is now production-ready on HoReKa
2025-08-27 18:27:02 +02:00
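The GCC wrapper script itself is not shown in the log; a minimal sketch of the flag-filtering idea it describes, assuming an illustrative (not authoritative) set of Intel-only flags:

```shell
# Sketch of a compiler-wrapper filter: drop Intel-only flags that GCC rejects
# before delegating to the real compiler. The filtered flag names below are
# illustrative; the actual set depends on the modules loaded on the cluster.
filter_intel_flags() {
  local out=()
  for arg in "$@"; do
    case "$arg" in
      -xHost|-fp-model*|-ipo) ;;   # Intel-specific flags: skip them
      *) out+=("$arg") ;;          # everything else passes through
    esac
  done
  printf '%s\n' "${out[@]}"
}
# A real wrapper script would end by exec'ing gcc with the filtered list.
```

The point of the wrapper is that mujoco-py's build picks up whatever `CC`/`CFLAGS` the cluster environment exports, so intercepting the compiler call is less invasive than editing the build itself.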
ys1087@partner.kit.edu
d739fa5e5e Add robomimic transport test and update experiment plan
- Create robomimic transport pre-training test script
- Update EXPERIMENT_PLAN.md with square success
- Add WandB URLs for completed robomimic tests
- Track progress on remaining validation tests
2025-08-27 16:21:06 +02:00
ys1087@partner.kit.edu
826b55a2d2 Integrate HoReKa Intel compiler fix for mujoco-py
- Add HoReKa-specific MuJoCo compilation fix to install script
- Pin compatible Cython version (0.29.37)
- Create fix_mujoco_compilation.py helper script
- Document Intel compiler override in README
- Update test script to use integrated fix
- Addresses Intel OneAPI compiler flag incompatibility with GCC
2025-08-27 16:09:13 +02:00
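The Cython pin from this commit can be expressed as a constraints file, which keeps the pin separate from the main requirements; this is a sketch of one way to apply it, not necessarily how the install script does:

```shell
# The commit pins Cython 0.29.37 (Cython 3.x is commonly reported to break
# mujoco-py builds). Write the pin as a pip constraints file.
printf '%s\n' 'cython==0.29.37' > constraints.txt
# Then install against it, e.g.: pip install -c constraints.txt mujoco-py
# (install command not run here)
```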
ys1087@partner.kit.edu
2404a34c36 Add MuJoCo compilation debugging and continue validation tests
- Add robomimic square test (continuing pre-training validation)
- Create MuJoCo environment fix scripts for debugging compilation
- Update experiment plan with latest test results
- Robomimic can pre-training validated successfully
2025-08-27 15:32:29 +02:00
ys1087@partner.kit.edu
3cf999c32e Update documentation and simplify experiment tracking
- Simplify experiment plan with clear phases and current status
- Add complete MuJoCo setup instructions for fine-tuning
- Update install script to include all dependencies
- Document current validation progress and next steps
2025-08-27 15:25:43 +02:00
ys1087@partner.kit.edu
0424a080c1 feat: HoReKa cluster adaptation and validation
- Updated all WandB project names to use dppo- prefix for organization
- Added flexible dev testing script for all environments
- Created organized dev_tests directory for test scripts
- Fixed MuJoCo compilation issues (added GCC compiler flags)
- Documented Python 3.10 compatibility and Furniture-Bench limitation
- Validated pre-training for Gym, Robomimic, D3IL environments
- Updated experiment tracking with validation results
- Enhanced README with troubleshooting and setup instructions
2025-08-27 14:01:51 +02:00
ys1087@partner.kit.edu
93ac652def Start full hopper pre-training production run
Job 3445123: 200 epochs, 8h allocated, queued on accelerated partition
2025-08-27 12:31:42 +02:00
ys1087@partner.kit.edu
a67f474fc0 Clarify pre-training vs fine-tuning phases and dev test purpose
- Pre-training: diffusion model on offline D4RL data (200 epochs)
- Fine-tuning: PPO fine-tuning with online environment interaction
- Dev test: 2 epochs only for quick verification, not full training
2025-08-27 12:29:31 +02:00

ys1087@partner.kit.edu
80339cad52 Update experiment plan with successful WandB run
Job 3445117 completed with proper WandB logging
Added WandB URL to tracking table
2025-08-27 12:28:16 +02:00
ys1087@partner.kit.edu
5a458aac67 Configure personal WandB entity and clean up docs
Set DPPO_WANDB_ENTITY to dominik_roth for personal logging
Remove irrelevant implementation details from experiment plan
2025-08-27 12:24:39 +02:00
ys1087@partner.kit.edu
d43a9e2b3c Fix WandB configuration for proper logging
- Configure DPPO_WANDB_ENTITY environment variable in dev script
- Update README with clear WandB setup instructions
- Remove wandb=null to enable logging when credentials are set
2025-08-27 12:23:43 +02:00
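A hedged sketch of the WandB setup these commits describe: `DPPO_WANDB_ENTITY` is the variable named in the log, while the entity value below is only a placeholder:

```shell
# Set the WandB entity DPPO logs under; the value here is a placeholder.
export DPPO_WANDB_ENTITY="your-entity"
# With credentials and the entity set, runs no longer need the wandb=null
# override that had been used to disable logging.
echo "logging to WandB entity: ${DPPO_WANDB_ENTITY}"
```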
ys1087@partner.kit.edu
e8e7233d98 Fix WandB config issue and achieve working DPPO setup
- Disable WandB in dev script to avoid config object vs string error
- Successfully completed development test (Job 3445106)
- Confirmed: pre-training works, loss reduces, checkpoints save
- Update experiment tracking with successful results
2025-08-27 12:19:38 +02:00
ys1087@partner.kit.edu
7fc9b17871 Clean up experiment tracking
Remove failed job tracking, only track successful/running experiments
Note: Previous failure was a setup error, not a DPPO repository issue

2025-08-27 12:08:31 +02:00
ys1087@partner.kit.edu
4adf67694a Fix Hydra config error in dev script
- Change train.n_iters to train.n_epochs (correct DPPO parameter)
- Update experiment tracking with failed job details
- Ready for corrected dev test
2025-08-27 12:07:38 +02:00
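The fix above amounts to using DPPO's actual Hydra key. A sketch of the corrected invocation: `train.n_epochs` comes from the commit and `script/run.py` from a later upstream commit in this log, while the config name is illustrative:

```shell
# Wrong:   ... train.n_iters=2    (not a key in DPPO's configs; Hydra errors out)
# Correct: ... train.n_epochs=2
cmd="python script/run.py --config-name=pre_diffusion_mlp train.n_epochs=2"
echo "$cmd"
```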
ys1087@partner.kit.edu
f88a5be4fe Add experiment tracking plan
Document current status, planned experiments, and job tracking
Following REPPO experiment documentation pattern
2025-08-27 12:06:34 +02:00
ys1087@partner.kit.edu
add21c7019 Clarify that installation must run on GPU node
Remove manual installation instructions, as PyTorch CUDA dependencies require a GPU node
2025-08-27 12:03:41 +02:00
ys1087@partner.kit.edu
2be39c4f2e Fix README: remove incorrect cluster policy reference 2025-08-27 12:01:31 +02:00
ys1087@partner.kit.edu
835441af45 Fix broken image URL: use raw GitHub URL for cross-origin compatibility 2025-08-27 12:00:59 +02:00
ys1087@partner.kit.edu
2bb63d0ed1 Fix README: remove local git configuration details 2025-08-27 12:00:08 +02:00
ys1087@partner.kit.edu
30f59aaa9b Add HoReKa cluster documentation to README
- Document installation process using Python 3.10 venv
- Add usage examples for SLURM job submission
- Document available environments and resource allocations
- Add WandB configuration instructions
- List all repository changes made for HoReKa compatibility
2025-08-27 11:59:32 +02:00
ys1087@partner.kit.edu
05dddfa10c Add HoReKa cluster setup and SLURM scripts
- Add installation script for HoReKa with Python 3.10 venv
- Add SLURM job submission scripts for dev and production runs
- Add convenient submit_job.sh wrapper for easy job submission
- Update .gitignore to allow shell scripts (removed *.sh exclusion)
- Configure git remotes: upstream (original) and origin (fork)
2025-08-27 11:57:32 +02:00
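The SLURM scripts themselves are not reproduced in the log; a minimal sketch of a dev-test batch script under stated assumptions: the `accelerated` partition and the 2-epoch dev test appear elsewhere in this log, while the GPU count, walltime, venv path, and run command are guesses:

```shell
# Write a minimal dev-test batch script; resource values are illustrative.
cat > dev_test.sbatch <<'EOF'
#!/bin/bash
#SBATCH --partition=accelerated
#SBATCH --gres=gpu:1
#SBATCH --time=00:30:00
source .venv/bin/activate
python script/run.py train.n_epochs=2
EOF
# Submit with: sbatch dev_test.sbatch
echo "wrote dev_test.sbatch"
```

A wrapper like the `submit_job.sh` mentioned in the commit would typically just template this file and call `sbatch` on it.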
allenzren
cc7234ad7f add note about ft_denoising_steps in eval in README 2025-02-04 11:51:47 -05:00
allenzren
b8086ed12e update version 2025-02-04 11:51:47 -05:00
allenzren
9032d02eae change default ft_denoising_steps in eval configs to 0 (assume evaluating pre-trained models) 2025-02-04 11:51:47 -05:00
allenzren
fc42865c77 rename DiffusionEvalFT to DiffusionEval 2025-02-04 11:51:47 -05:00
allenzren
a746220905 allow loading pre-trained weights (not fine-tuned) in DiffusionEvalFT 2025-02-04 11:51:47 -05:00
allenzren
169a16dda7 update eval configs 2025-02-04 11:51:47 -05:00
allenzren
ace2bbdab9 add separate eval model class that also initializes the pre-trained policy for early denoising steps 2025-02-04 11:51:47 -05:00
allenzren
e7f73dffc1 update batch size in D3IL so it works with the new form of gradient update 2024-12-24 02:06:17 -05:00
Allen Z. Ren
1d04211666 v0.7 (#26)
* update from scratch configs

* update gym pretraining configs - use fewer epochs

* update robomimic pretraining configs - use fewer epochs

* allow trajectory plotting in eval agent

* add simple vit unet

* update avoid pretraining configs - use fewer epochs

* update furniture pretraining configs - use same amount of epochs as before

* add robomimic diffusion unet pretraining configs

* update robomimic finetuning configs - higher lr

* add vit unet checkpoint urls

* update pretraining and finetuning instructions as configs are updated
2024-11-20 15:56:23 -05:00
Allen Z. Ren
d2929f65e1 update isaacgym download path 2024-11-12 18:05:23 -05:00
allenzren
7d1b3a236f update D3IL pre-processing, fix normalization bug in robomimic pre-processing 2024-11-08 18:40:42 -05:00
allenzren
0bdae945e9 use default epoch_start_ema=20 and update_ema_freq=10 2024-11-07 10:55:16 -05:00
allenzren
c0921a1fb5 remove update_ema_freq 2024-11-06 21:01:15 -05:00
Allen Z. Ren
e1ef4ca1cf More frequent EMA update (#20)
* move ema update within pretraining epoch

* update pretraining ema configs

* add lift and can epoch 8000 checkpoint url

* add note about EMA issue in pretraining instruction
2024-11-06 20:42:31 -05:00
Allen Z. Ren
dc8e0c9edc v0.6 (#18)
* Sampling over both env and denoising steps in DPPO updates (#13)

* sample one from each chain

* full random sampling

* Add Proficient Human (PH) Configs and Pipeline (#16)

* fix missing cfg

* add ph config

* fix how terminated flags are added to buffer in ibrl

* add ph config

* offline calql for 1M gradient updates

* bug fix: number of calql online gradient steps is the number of new transitions collected

* add sample config for DPPO with ta=1

* Sampling over both env and denoising steps in DPPO updates (#13)

* sample one from each chain

* full random sampling

* fix diffusion loss when predicting initial noise

* fix dppo inds

* fix typo

* remove print statement

---------

Co-authored-by: Justin M. Lidard <jlidard@neuronic.cs.princeton.edu>
Co-authored-by: allenzren <allen.ren@princeton.edu>

* update robomimic configs

* better calql formulation

* optimize calql and ibrl training

* optimize data transfer in ppo agents

* add kitchen configs

* re-organize config folders, rerun calql and rlpd

* add scratch gym locomotion configs

* add kitchen installation dependencies

* use truncated for termination in furniture env

* update furniture and gym configs

* update README and dependencies with kitchen

* add url for new data and checkpoints

* update demo RL configs

* update batch sizes for furniture unet configs

* raise error about dropout in residual mlp

* fix observation bug in bc loss

---------

Co-authored-by: Justin Lidard <60638575+jlidard@users.noreply.github.com>
Co-authored-by: Justin M. Lidard <jlidard@neuronic.cs.princeton.edu>
2024-10-30 19:58:06 -04:00
allenzren
7b10df690d fix diffusion loss when predicting initial noise 2024-10-13 11:19:10 -04:00
allenzren
4e14b8086d fix itr initialization in eval agents 2024-10-10 14:10:16 -04:00
Allen Z. Ren
e0842e71dc v0.5 to main (#10)
* v0.5 (#9)

* update idql configs

* update awr configs

* update dipo configs

* update qsm configs

* update dqm configs

* update project version to 0.5.0
2024-10-07 16:35:13 -04:00
allenzren
dd14c5887c set deterministic=True when sampling in diffusion evaluation 2024-09-26 01:15:10 -04:00
allenzren
4962bbce38 rename train.py to run.py 2024-09-17 16:33:53 -04:00
allenzren
c9f24ba0c3 add evaluation agents and some example configs 2024-09-17 16:32:45 -04:00
allenzren
bc52beca1e add minor docs to diffusion classes and clean up some args 2024-09-17 16:26:25 -04:00
allenzren
ef5b14f820 fix observation history indexing in the dataset 2024-09-17 12:43:15 -04:00
allenzren
1aaa6c2302 support varying img size 2024-09-16 17:55:31 -04:00
allenzren
64595baca9 more intuitive handling of done in GAE 2024-09-13 16:29:56 -04:00