dodox/dppo

Author	SHA1	Message	Date
ys1087@partner.kit.edu	0424a080c1	feat: HoReKa cluster adaptation and validation - Updated all WandB project names to use dppo- prefix for organization - Added flexible dev testing script for all environments - Created organized dev_tests directory for test scripts - Fixed MuJoCo compilation issues (added GCC compiler flags) - Documented Python 3.10 compatibility and Furniture-Bench limitation - Validated pre-training for Gym, Robomimic, D3IL environments - Updated experiment tracking with validation results - Enhanced README with troubleshooting and setup instructions	2025-08-27 14:01:51 +02:00
allenzren	9032d02eae	change default `ft_denoising_steps` in eval configs to 0 (assume evaluating pre-trained models)	2025-02-04 11:51:47 -05:00
allenzren	169a16dda7	update eval configs	2025-02-04 11:51:47 -05:00
allenzren	e7f73dffc1	update batch size in D3IL so it works with the new form of gradient update	2024-12-24 02:06:17 -05:00
Allen Z. Ren	1d04211666	v0.7 (#26) * update from scratch configs * update gym pretraining configs - use fewer epochs * update robomimic pretraining configs - use fewer epochs * allow trajectory plotting in eval agent * add simple vit unet * update avoid pretraining configs - use fewer epochs * update furniture pretraining configs - use same amount of epochs as before * add robomimic diffusion unet pretraining configs * update robomimic finetuning configs - higher lr * add vit unet checkpoint urls * update pretraining and finetuning instructions as configs are updated	2024-11-20 15:56:23 -05:00
allenzren	0bdae945e9	use default `epoch_start_ema=20` and `update_ema_freq=10`	2024-11-07 10:55:16 -05:00
allenzren	c0921a1fb5	remove `update_ema_freq`	2024-11-06 21:01:15 -05:00
Allen Z. Ren	e1ef4ca1cf	More frequent EMA update (#20) * move ema update within pretraining epoch * update pretraining ema configs * add lift and can epoch 8000 checkpoint url * add note about EMA issue in pretraining instruction	2024-11-06 20:42:31 -05:00
Allen Z. Ren	dc8e0c9edc	v0.6 (#18) * Sampling over both env and denoising steps in DPPO updates (#13) * sample one from each chain * full random sampling * Add Proficient Human (PH) Configs and Pipeline (#16) * fix missing cfg * add ph config * fix how terminated flags are added to buffer in ibrl * add ph config * offline calql for 1M gradient updates * bug fix: number of calql online gradient steps is the number of new transitions collected * add sample config for DPPO with ta=1 * Sampling over both env and denoising steps in DPPO updates (#13) * sample one from each chain * full random sampling * fix diffusion loss when predicting initial noise * fix dppo inds * fix typo * remove print statement --------- Co-authored-by: Justin M. Lidard <jlidard@neuronic.cs.princeton.edu> Co-authored-by: allenzren <allen.ren@princeton.edu> * update robomimic configs * better calql formulation * optimize calql and ibrl training * optimize data transfer in ppo agents * add kitchen configs * re-organize config folders, rerun calql and rlpd * add scratch gym locomotion configs * add kitchen installation dependencies * use truncated for termination in furniture env * update furniture and gym configs * update README and dependencies with kitchen * add url for new data and checkpoints * update demo RL configs * update batch sizes for furniture unet configs * raise error about dropout in residual mlp * fix observation bug in bc loss --------- Co-authored-by: Justin Lidard <60638575+jlidard@users.noreply.github.com> Co-authored-by: Justin M. Lidard <jlidard@neuronic.cs.princeton.edu>	2024-10-30 19:58:06 -04:00
Allen Z. Ren	e0842e71dc	v0.5 to main (#10) * v0.5 (#9) * update idql configs * update awr configs * update dipo configs * update qsm configs * update dqm configs * update project version to 0.5.0	2024-10-07 16:35:13 -04:00
allenzren	c9f24ba0c3	add evaluation agents and some example configs	2024-09-17 16:32:45 -04:00
allenzren	1aaa6c2302	support varying img size	2024-09-16 17:55:31 -04:00
allenzren	f13eb203e1	allow history observation	2024-09-11 21:44:47 -04:00
allenzren	2ddf63b8f5	squash commits	2024-09-11 21:09:17 -04:00
allenzren	8ce0aa1485	simplify pre-training dataset, use npz	2024-09-08 17:52:16 -04:00
allenzren	8293b0936b	release	2024-09-03 21:03:27 -04:00

16 Commits