dppo/square at dc8e0c9edce7ac2b2ff112abe460e1c21b0b3bdc - dppo

dodox/dppo


Allen Z. Ren dc8e0c9edc v0.6 (#18)	2024-10-30 19:58:06 -04:00

* Sampling over both env and denoising steps in DPPO updates (#13)
* sample one from each chain
* full random sampling
* Add Proficient Human (PH) Configs and Pipeline (#16)
* fix missing cfg
* add ph config
* fix how terminated flags are added to buffer in ibrl
* add ph config
* offline calql for 1M gradient updates
* bug fix: number of calql online gradient steps is the number of new transitions collected
* add sample config for DPPO with ta=1
* Sampling over both env and denoising steps in DPPO updates (#13)
* sample one from each chain
* full random sampling
* fix diffusion loss when predicting initial noise
* fix dppo inds
* fix typo
* remove print statement

---------

Co-authored-by: Justin M. Lidard <jlidard@neuronic.cs.princeton.edu>
Co-authored-by: allenzren <allen.ren@princeton.edu>

* update robomimic configs
* better calql formulation
* optimize calql and ibrl training
* optimize data transfer in ppo agents
* add kitchen configs
* re-organize config folders, rerun calql and rlpd
* add scratch gym locomotion configs
* add kitchen installation dependencies
* use truncated for termination in furniture env
* update furniture and gym configs
* update README and dependencies with kitchen
* add url for new data and checkpoints
* update demo RL configs
* update batch sizes for furniture unet configs
* raise error about dropout in residual mlp
* fix observation bug in bc loss

---------

Co-authored-by: Justin Lidard <60638575+jlidard@users.noreply.github.com>
Co-authored-by: Justin M. Lidard <jlidard@neuronic.cs.princeton.edu>
calql_mlp_online_ph.yaml	v0.6 (#18)	2024-10-30 19:58:06 -04:00
calql_mlp_online.yaml	v0.6 (#18)	2024-10-30 19:58:06 -04:00
ft_awr_diffusion_mlp.yaml	v0.6 (#18)	2024-10-30 19:58:06 -04:00
ft_dipo_diffusion_mlp.yaml	v0.5 to main (#10)	2024-10-07 16:35:13 -04:00
ft_dql_diffusion_mlp.yaml	v0.6 (#18)	2024-10-30 19:58:06 -04:00
ft_idql_diffusion_mlp.yaml	v0.6 (#18)	2024-10-30 19:58:06 -04:00
ft_ppo_diffusion_mlp_img.yaml	v0.6 (#18)	2024-10-30 19:58:06 -04:00
ft_ppo_diffusion_mlp_ta1_ph.yaml	v0.6 (#18)	2024-10-30 19:58:06 -04:00
ft_ppo_diffusion_mlp_ta1.yaml	v0.6 (#18)	2024-10-30 19:58:06 -04:00
ft_ppo_diffusion_mlp.yaml	v0.6 (#18)	2024-10-30 19:58:06 -04:00
ft_ppo_diffusion_unet.yaml	v0.5 to main (#10)	2024-10-07 16:35:13 -04:00
ft_ppo_gaussian_mlp_img.yaml	v0.5 to main (#10)	2024-10-07 16:35:13 -04:00
ft_ppo_gaussian_mlp.yaml	v0.5 to main (#10)	2024-10-07 16:35:13 -04:00
ft_ppo_gaussian_transformer.yaml	v0.5 to main (#10)	2024-10-07 16:35:13 -04:00
ft_ppo_gmm_mlp.yaml	v0.5 to main (#10)	2024-10-07 16:35:13 -04:00
ft_ppo_gmm_transformer.yaml	v0.5 to main (#10)	2024-10-07 16:35:13 -04:00
ft_qsm_diffusion_mlp.yaml	v0.6 (#18)	2024-10-30 19:58:06 -04:00
ft_rwr_diffusion_mlp.yaml	v0.6 (#18)	2024-10-30 19:58:06 -04:00
ibrl_mlp_ph.yaml	v0.6 (#18)	2024-10-30 19:58:06 -04:00
ibrl_mlp.yaml	v0.6 (#18)	2024-10-30 19:58:06 -04:00