16 Commits

Author SHA1 Message Date
ys1087@partner.kit.edu
0424a080c1 feat: HoReKa cluster adaptation and validation
- Updated all WandB project names to use dppo- prefix for organization
- Added flexible dev testing script for all environments
- Created organized dev_tests directory for test scripts
- Fixed MuJoCo compilation issues (added GCC compiler flags)
- Documented Python 3.10 compatibility and Furniture-Bench limitation
- Validated pre-training for Gym, Robomimic, D3IL environments
- Updated experiment tracking with validation results
- Enhanced README with troubleshooting and setup instructions
2025-08-27 14:01:51 +02:00
allenzren
9032d02eae change default ft_denoising_steps in eval configs to 0 (assume evaluating pre-trained models) 2025-02-04 11:51:47 -05:00
allenzren
169a16dda7 update eval configs 2025-02-04 11:51:47 -05:00
allenzren
e7f73dffc1 update batch size in D3IL so it works with the new form of gradient update 2024-12-24 02:06:17 -05:00
Allen Z. Ren
1d04211666 v0.7 (#26)
* update from scratch configs

* update gym pretraining configs - use fewer epochs

* update robomimic pretraining configs - use fewer epochs

* allow trajectory plotting in eval agent

* add simple vit unet

* update avoid pretraining configs - use fewer epochs

* update furniture pretraining configs - use same amount of epochs as before

* add robomimic diffusion unet pretraining configs

* update robomimic finetuning configs - higher lr

* add vit unet checkpoint urls

* update pretraining and finetuning instructions as configs are updated
2024-11-20 15:56:23 -05:00
allenzren
0bdae945e9 use default epoch_start_ema=20 and update_ema_freq=10 2024-11-07 10:55:16 -05:00
allenzren
c0921a1fb5 remove update_ema_freq 2024-11-06 21:01:15 -05:00
Allen Z. Ren
e1ef4ca1cf More frequent EMA update (#20)
* move ema update within pretraining epoch

* update pretraining ema configs

* add lift and can epoch 8000 checkpoint url

* add note about EMA issue in pretraining instruction
2024-11-06 20:42:31 -05:00
Allen Z. Ren
dc8e0c9edc v0.6 (#18)
* Sampling over both env and denoising steps in DPPO updates (#13)

* sample one from each chain

* full random sampling

* Add Proficient Human (PH) Configs and Pipeline (#16)

* fix missing cfg

* add ph config

* fix how terminated flags are added to buffer in ibrl

* add ph config

* offline calql for 1M gradient updates

* bug fix: number of calql online gradient steps is the number of new transitions collected

* add sample config for DPPO with ta=1

* Sampling over both env and denoising steps in DPPO updates (#13)

* sample one from each chain

* full random sampling

* fix diffusion loss when predicting initial noise

* fix dppo inds

* fix typo

* remove print statement

---------

Co-authored-by: Justin M. Lidard <jlidard@neuronic.cs.princeton.edu>
Co-authored-by: allenzren <allen.ren@princeton.edu>

* update robomimic configs

* better calql formulation

* optimize calql and ibrl training

* optimize data transfer in ppo agents

* add kitchen configs

* re-organize config folders, rerun calql and rlpd

* add scratch gym locomotion configs

* add kitchen installation dependencies

* use truncated for termination in furniture env

* update furniture and gym configs

* update README and dependencies with kitchen

* add url for new data and checkpoints

* update demo RL configs

* update batch sizes for furniture unet configs

* raise error about dropout in residual mlp

* fix observation bug in bc loss

---------

Co-authored-by: Justin Lidard <60638575+jlidard@users.noreply.github.com>
Co-authored-by: Justin M. Lidard <jlidard@neuronic.cs.princeton.edu>
2024-10-30 19:58:06 -04:00
Allen Z. Ren
e0842e71dc v0.5 to main (#10)
* v0.5 (#9)

* update idql configs

* update awr configs

* update dipo configs

* update qsm configs

* update dqm configs

* update project version to 0.5.0
2024-10-07 16:35:13 -04:00
allenzren
c9f24ba0c3 add evaluation agents and some example configs 2024-09-17 16:32:45 -04:00
allenzren
1aaa6c2302 support varying img size 2024-09-16 17:55:31 -04:00
allenzren
f13eb203e1 allow history observation 2024-09-11 21:44:47 -04:00
allenzren
2ddf63b8f5 squash commits 2024-09-11 21:09:17 -04:00
allenzren
8ce0aa1485 simplify pre-training dataset, use npz 2024-09-08 17:52:16 -04:00
allenzren
8293b0936b release 2024-09-03 21:03:27 -04:00