dodox/dppo

Author	SHA1	Message	Date
ys1087@partner.kit.edu	7e800c9a33	Complete MuJoCo fix and validate hopper fine-tuning - Add GCC wrapper script to filter Intel compiler flags - Download missing mujoco-py generated files automatically - Update installer with comprehensive MuJoCo fixes - Document complete solution in README and EXPERIMENT_PLAN - Hopper fine-tuning validated with reward 1415.8471 - All pre-training environments working - DPPO is now production-ready on HoReKa	2025-08-27 18:27:02 +02:00
ys1087@partner.kit.edu	826b55a2d2	Integrate HoReKa Intel compiler fix for mujoco-py - Add HoReKa-specific MuJoCo compilation fix to install script - Pin compatible Cython version (0.29.37) - Create fix_mujoco_compilation.py helper script - Document Intel compiler override in README - Update test script to use integrated fix - Addresses Intel OneAPI compiler flag incompatibility with GCC	2025-08-27 16:09:13 +02:00
ys1087@partner.kit.edu	3cf999c32e	Update documentation and simplify experiment tracking - Simplify experiment plan with clear phases and current status - Add complete MuJoCo setup instructions for fine-tuning - Update install script to include all dependencies - Document current validation progress and next steps	2025-08-27 15:25:43 +02:00
ys1087@partner.kit.edu	0424a080c1	feat: HoReKa cluster adaptation and validation - Updated all WandB project names to use dppo- prefix for organization - Added flexible dev testing script for all environments - Created organized dev_tests directory for test scripts - Fixed MuJoCo compilation issues (added GCC compiler flags) - Documented Python 3.10 compatibility and Furniture-Bench limitation - Validated pre-training for Gym, Robomimic, D3IL environments - Updated experiment tracking with validation results - Enhanced README with troubleshooting and setup instructions	2025-08-27 14:01:51 +02:00
ys1087@partner.kit.edu	d43a9e2b3c	Fix WandB configuration for proper logging - Configure DPPO_WANDB_ENTITY environment variable in dev script - Update README with clear WandB setup instructions - Remove wandb=null to enable logging when credentials are set	2025-08-27 12:23:43 +02:00
ys1087@partner.kit.edu	add21c7019	Clarify that installation must run on GPU node - Remove manual installation instructions as PyTorch CUDA dependencies require GPU node	2025-08-27 12:03:41 +02:00
ys1087@partner.kit.edu	2be39c4f2e	Fix README: remove incorrect cluster policy reference	2025-08-27 12:01:31 +02:00
ys1087@partner.kit.edu	835441af45	Fix broken image URL: use raw GitHub URL for cross-origin compatibility	2025-08-27 12:00:59 +02:00
ys1087@partner.kit.edu	2bb63d0ed1	Fix README: remove local git configuration details	2025-08-27 12:00:08 +02:00
ys1087@partner.kit.edu	30f59aaa9b	Add HoReKa cluster documentation to README - Document installation process using Python 3.10 venv - Add usage examples for SLURM job submission - Document available environments and resource allocations - Add WandB configuration instructions - List all repository changes made for HoReKa compatibility	2025-08-27 11:59:32 +02:00
allenzren	cc7234ad7f	add note about `ft_denoising_steps` in eval in README	2025-02-04 11:51:47 -05:00
allenzren	ace2bbdab9	add separate eval model class that also initializes the pre-trained policy for early denoising steps	2025-02-04 11:51:47 -05:00
Allen Z. Ren	dc8e0c9edc	v0.6 (#18) * Sampling over both env and denoising steps in DPPO updates (#13) * sample one from each chain * full random sampling * Add Proficient Human (PH) Configs and Pipeline (#16) * fix missing cfg * add ph config * fix how terminated flags are added to buffer in ibrl * add ph config * offline calql for 1M gradient updates * bug fix: number of calql online gradient steps is the number of new transitions collected * add sample config for DPPO with ta=1 * Sampling over both env and denoising steps in DPPO updates (#13) * sample one from each chain * full random sampling * fix diffusion loss when predicting initial noise * fix dppo inds * fix typo * remove print statement --------- Co-authored-by: Justin M. Lidard <jlidard@neuronic.cs.princeton.edu> Co-authored-by: allenzren <allen.ren@princeton.edu> * update robomimic configs * better calql formulation * optimize calql and ibrl training * optimize data transfer in ppo agents * add kitchen configs * re-organize config folders, rerun calql and rlpd * add scratch gym locomotion configs * add kitchen installation dependencies * use truncated for termination in furniture env * update furniture and gym configs * update README and dependencies with kitchen * add url for new data and checkpoints * update demo RL configs * update batch sizes for furniture unet configs * raise error about dropout in residual mlp * fix observation bug in bc loss --------- Co-authored-by: Justin Lidard <60638575+jlidard@users.noreply.github.com> Co-authored-by: Justin M. Lidard <jlidard@neuronic.cs.princeton.edu>	2024-10-30 19:58:06 -04:00
allenzren	4962bbce38	rename `train.py` to `run.py`	2024-09-17 16:33:53 -04:00
allenzren	c9f24ba0c3	add evaluation agents and some example configs	2024-09-17 16:32:45 -04:00
allenzren	1aaa6c2302	support varying img size	2024-09-16 17:55:31 -04:00
allenzren	f5a8da5719	typo	2024-09-11 21:50:02 -04:00
allenzren	f13eb203e1	allow history observation	2024-09-11 21:44:47 -04:00
allenzren	8ce0aa1485	simplify pre-training dataset, use npz	2024-09-08 17:52:16 -04:00
allenzren	447c8dfd02	update instruction on reducing cpu threads	2024-09-08 13:43:52 -04:00
allenzren	a658353eb7	update compute instruction	2024-09-04 13:19:39 -04:00
allenzren	771240b7a6	fix repo link	2024-09-03 21:05:26 -04:00
allenzren	8293b0936b	release	2024-09-03 21:03:27 -04:00

23 Commits