Commit Graph

23 Commits

Author SHA1 Message Date
ys1087@partner.kit.edu
7e800c9a33 Complete MuJoCo fix and validate hopper fine-tuning
- Add GCC wrapper script to filter Intel compiler flags
- Download missing mujoco-py generated files automatically
- Update installer with comprehensive MuJoCo fixes
- Document complete solution in README and EXPERIMENT_PLAN
- Hopper fine-tuning validated with reward 1415.8471
- All pre-training environments working
- DPPO is now production-ready on HoReKa
2025-08-27 18:27:02 +02:00
ys1087@partner.kit.edu
826b55a2d2 Integrate HoReKa Intel compiler fix for mujoco-py
- Add HoReKa-specific MuJoCo compilation fix to install script
- Pin compatible Cython version (0.29.37)
- Create fix_mujoco_compilation.py helper script
- Document Intel compiler override in README
- Update test script to use integrated fix
- Addresses Intel OneAPI compiler flag incompatibility with GCC
2025-08-27 16:09:13 +02:00
ys1087@partner.kit.edu
3cf999c32e Update documentation and simplify experiment tracking
- Simplify experiment plan with clear phases and current status
- Add complete MuJoCo setup instructions for fine-tuning
- Update install script to include all dependencies
- Document current validation progress and next steps
2025-08-27 15:25:43 +02:00
ys1087@partner.kit.edu
0424a080c1 feat: HoReKa cluster adaptation and validation
- Updated all WandB project names to use dppo- prefix for organization
- Added flexible dev testing script for all environments
- Created organized dev_tests directory for test scripts
- Fixed MuJoCo compilation issues (added GCC compiler flags)
- Documented Python 3.10 compatibility and Furniture-Bench limitation
- Validated pre-training for Gym, Robomimic, D3IL environments
- Updated experiment tracking with validation results
- Enhanced README with troubleshooting and setup instructions
2025-08-27 14:01:51 +02:00
ys1087@partner.kit.edu
d43a9e2b3c Fix WandB configuration for proper logging
- Configure DPPO_WANDB_ENTITY environment variable in dev script
- Update README with clear WandB setup instructions
- Remove wandb=null to enable logging when credentials are set
2025-08-27 12:23:43 +02:00
ys1087@partner.kit.edu
add21c7019 Clarify that installation must run on GPU node
Remove manual installation instructions as PyTorch CUDA dependencies require GPU node
2025-08-27 12:03:41 +02:00
ys1087@partner.kit.edu
2be39c4f2e Fix README: remove incorrect cluster policy reference 2025-08-27 12:01:31 +02:00
ys1087@partner.kit.edu
835441af45 Fix broken image URL: use raw GitHub URL for cross-origin compatibility 2025-08-27 12:00:59 +02:00
ys1087@partner.kit.edu
2bb63d0ed1 Fix README: remove local git configuration details 2025-08-27 12:00:08 +02:00
ys1087@partner.kit.edu
30f59aaa9b Add HoReKa cluster documentation to README
- Document installation process using Python 3.10 venv
- Add usage examples for SLURM job submission
- Document available environments and resource allocations
- Add WandB configuration instructions
- List all repository changes made for HoReKa compatibility
2025-08-27 11:59:32 +02:00
allenzren
cc7234ad7f add note about ft_denoising_steps in eval in README 2025-02-04 11:51:47 -05:00
allenzren
ace2bbdab9 add separate eval model class that also initializes the pre-trained policy for early denoising steps 2025-02-04 11:51:47 -05:00
Allen Z. Ren
dc8e0c9edc v0.6 (#18)
* Sampling over both env and denoising steps in DPPO updates (#13)

* sample one from each chain

* full random sampling

* Add Proficient Human (PH) Configs and Pipeline (#16)

* fix missing cfg

* add ph config

* fix how terminated flags are added to buffer in ibrl

* add ph config

* offline calql for 1M gradient updates

* bug fix: number of calql online gradient steps is the number of new transitions collected

* add sample config for DPPO with ta=1

* Sampling over both env and denoising steps in DPPO updates (#13)

* sample one from each chain

* full random sampling

* fix diffusion loss when predicting initial noise

* fix dppo inds

* fix typo

* remove print statement

---------

Co-authored-by: Justin M. Lidard <jlidard@neuronic.cs.princeton.edu>
Co-authored-by: allenzren <allen.ren@princeton.edu>

* update robomimic configs

* better calql formulation

* optimize calql and ibrl training

* optimize data transfer in ppo agents

* add kitchen configs

* re-organize config folders, rerun calql and rlpd

* add scratch gym locomotion configs

* add kitchen installation dependencies

* use truncated for termination in furniture env

* update furniture and gym configs

* update README and dependencies with kitchen

* add url for new data and checkpoints

* update demo RL configs

* update batch sizes for furniture unet configs

* raise error about dropout in residual mlp

* fix observation bug in bc loss

---------

Co-authored-by: Justin Lidard <60638575+jlidard@users.noreply.github.com>
Co-authored-by: Justin M. Lidard <jlidard@neuronic.cs.princeton.edu>
2024-10-30 19:58:06 -04:00
allenzren
4962bbce38 rename train.py to run.py 2024-09-17 16:33:53 -04:00
allenzren
c9f24ba0c3 add evaluation agents and some example configs 2024-09-17 16:32:45 -04:00
allenzren
1aaa6c2302 support varying img size 2024-09-16 17:55:31 -04:00
allenzren
f5a8da5719 typo 2024-09-11 21:50:02 -04:00
allenzren
f13eb203e1 allow history observation 2024-09-11 21:44:47 -04:00
allenzren
8ce0aa1485 simplify pre-training dataset, use npz 2024-09-08 17:52:16 -04:00
allenzren
447c8dfd02 update instruction on reducing cpu threads 2024-09-08 13:43:52 -04:00
allenzren
a658353eb7 update compute instruction 2024-09-04 13:19:39 -04:00
allenzren
771240b7a6 fix repo link 2024-09-03 21:05:26 -04:00
allenzren
8293b0936b release 2024-09-03 21:03:27 -04:00