dppo/cfg/robomimic/finetune/square
Allen Z. Ren dc8e0c9edc
v0.6 (#18)
* Sampling over both env and denoising steps in DPPO updates (#13)

* sample one from each chain

* full random sampling

* Add Proficient Human (PH) Configs and Pipeline (#16)

* fix missing cfg

* add ph config

* fix how terminated flags are added to buffer in ibrl

* add ph config

* offline calql for 1M gradient updates

* bug fix: number of calql online gradient steps is the number of new transitions collected

* add sample config for DPPO with ta=1

* Sampling over both env and denoising steps in DPPO updates (#13)

* sample one from each chain

* full random sampling

* fix diffusion loss when predicting initial noise

* fix dppo inds

* fix typo

* remove print statement

---------

Co-authored-by: Justin M. Lidard <jlidard@neuronic.cs.princeton.edu>
Co-authored-by: allenzren <allen.ren@princeton.edu>

* update robomimic configs

* better calql formulation

* optimize calql and ibrl training

* optimize data transfer in ppo agents

* add kitchen configs

* re-organize config folders, rerun calql and rlpd

* add scratch gym locomotion configs

* add kitchen installation dependencies

* use truncated for termination in furniture env

* update furniture and gym configs

* update README and dependencies with kitchen

* add url for new data and checkpoints

* update demo RL configs

* update batch sizes for furniture unet configs

* raise error about dropout in residual mlp

* fix observation bug in bc loss

---------

Co-authored-by: Justin Lidard <60638575+jlidard@users.noreply.github.com>
Co-authored-by: Justin M. Lidard <jlidard@neuronic.cs.princeton.edu>
2024-10-30 19:58:06 -04:00
calql_mlp_online_ph.yaml v0.6 (#18) 2024-10-30 19:58:06 -04:00
calql_mlp_online.yaml v0.6 (#18) 2024-10-30 19:58:06 -04:00
ft_awr_diffusion_mlp.yaml v0.6 (#18) 2024-10-30 19:58:06 -04:00
ft_dipo_diffusion_mlp.yaml v0.5 to main (#10) 2024-10-07 16:35:13 -04:00
ft_dql_diffusion_mlp.yaml v0.6 (#18) 2024-10-30 19:58:06 -04:00
ft_idql_diffusion_mlp.yaml v0.6 (#18) 2024-10-30 19:58:06 -04:00
ft_ppo_diffusion_mlp_img.yaml v0.6 (#18) 2024-10-30 19:58:06 -04:00
ft_ppo_diffusion_mlp_ta1_ph.yaml v0.6 (#18) 2024-10-30 19:58:06 -04:00
ft_ppo_diffusion_mlp_ta1.yaml v0.6 (#18) 2024-10-30 19:58:06 -04:00
ft_ppo_diffusion_mlp.yaml v0.6 (#18) 2024-10-30 19:58:06 -04:00
ft_ppo_diffusion_unet.yaml v0.5 to main (#10) 2024-10-07 16:35:13 -04:00
ft_ppo_gaussian_mlp_img.yaml v0.5 to main (#10) 2024-10-07 16:35:13 -04:00
ft_ppo_gaussian_mlp.yaml v0.5 to main (#10) 2024-10-07 16:35:13 -04:00
ft_ppo_gaussian_transformer.yaml v0.5 to main (#10) 2024-10-07 16:35:13 -04:00
ft_ppo_gmm_mlp.yaml v0.5 to main (#10) 2024-10-07 16:35:13 -04:00
ft_ppo_gmm_transformer.yaml v0.5 to main (#10) 2024-10-07 16:35:13 -04:00
ft_qsm_diffusion_mlp.yaml v0.6 (#18) 2024-10-30 19:58:06 -04:00
ft_rwr_diffusion_mlp.yaml v0.6 (#18) 2024-10-30 19:58:06 -04:00
ibrl_mlp_ph.yaml v0.6 (#18) 2024-10-30 19:58:06 -04:00
ibrl_mlp.yaml v0.6 (#18) 2024-10-30 19:58:06 -04:00