dodox/dppo

Author	SHA1	Message	Date
Dominik Moritz Roth	a6a805d5de	Update EXPERIMENT_PLAN.md	2025-10-08 21:08:59 +02:00
ys1087@partner.kit.edu	b259157a31	Launch Phase 2: Complete DPPO paper replication - Submit all 10 full replication runs on accelerated partition - Update experiment plan with complete validation results and full run status - Add comprehensive full run scripts for robomimic and D3IL environments - All validated environments now running full paper-quality experiments - Total queue: 3 Gym + 4 Robomimic + 3 D3IL fine-tuning runs	2025-08-27 22:52:19 +02:00
ys1087@partner.kit.edu	cb9846484f	Update experiment plan with validation results and WandB URLs - Complete validation status table with results for all environments - Add WandB tracking URLs for completed fine-tuning runs - Document technical fixes and current job queue status - Add test scripts for remaining D3IL avoid_m3 and robomimic transport validation	2025-08-27 22:14:10 +02:00
ys1087@partner.kit.edu	bda37869e1	Add remaining validation test scripts and D3IL installer - Additional robomimic fine-tuning tests: can, square - D3IL avoid_m2 and avoid_m3 validation scripts - D3IL installation script for SLURM - Add d3il_repo/ to gitignore - Comprehensive test coverage for all environment types	2025-08-27 21:06:44 +02:00
ys1087@partner.kit.edu	314a3f3c06	Add comprehensive dev test scripts and update experiment plan - Complete SLURM test scripts for all environment types - Gym fine-tuning: walker2d, halfcheetah validation tests - Robomimic fine-tuning: lift validation test with scheduler fix - D3IL validation: avoid_m1 pre-training and fine-tuning tests - Updated experiment plan with current validation status - All major environments now have automated testing pipeline	2025-08-27 21:02:55 +02:00
ys1087@partner.kit.edu	7e800c9a33	Complete MuJoCo fix and validate hopper fine-tuning - Add GCC wrapper script to filter Intel compiler flags - Download missing mujoco-py generated files automatically - Update installer with comprehensive MuJoCo fixes - Document complete solution in README and EXPERIMENT_PLAN - Hopper fine-tuning validated with reward 1415.8471 - All pre-training environments working - DPPO is now production-ready on HoReKa	2025-08-27 18:27:02 +02:00
ys1087@partner.kit.edu	d739fa5e5e	Add robomimic transport test and update experiment plan - Create robomimic transport pre-training test script - Update EXPERIMENT_PLAN.md with square success - Add WandB URLs for completed robomimic tests - Track progress on remaining validation tests	2025-08-27 16:21:06 +02:00
ys1087@partner.kit.edu	826b55a2d2	Integrate HoReKa Intel compiler fix for mujoco-py - Add HoReKa-specific MuJoCo compilation fix to install script - Pin compatible Cython version (0.29.37) - Create fix_mujoco_compilation.py helper script - Document Intel compiler override in README - Update test script to use integrated fix - Addresses Intel OneAPI compiler flag incompatibility with GCC	2025-08-27 16:09:13 +02:00
ys1087@partner.kit.edu	2404a34c36	Add MuJoCo compilation debugging and continue validation tests - Add robomimic square test (continuing pre-training validation) - Create MuJoCo environment fix scripts for debugging compilation - Update experiment plan with latest test results - Robomimic can pre-training validated successfully	2025-08-27 15:32:29 +02:00
ys1087@partner.kit.edu	3cf999c32e	Update documentation and simplify experiment tracking - Simplify experiment plan with clear phases and current status - Add complete MuJoCo setup instructions for fine-tuning - Update install script to include all dependencies - Document current validation progress and next steps	2025-08-27 15:25:43 +02:00
ys1087@partner.kit.edu	0424a080c1	feat: HoReKa cluster adaptation and validation - Updated all WandB project names to use dppo- prefix for organization - Added flexible dev testing script for all environments - Created organized dev_tests directory for test scripts - Fixed MuJoCo compilation issues (added GCC compiler flags) - Documented Python 3.10 compatibility and Furniture-Bench limitation - Validated pre-training for Gym, Robomimic, D3IL environments - Updated experiment tracking with validation results - Enhanced README with troubleshooting and setup instructions	2025-08-27 14:01:51 +02:00
ys1087@partner.kit.edu	93ac652def	Start full hopper pre-training production run Job 3445123: 200 epochs, 8h allocated, queued on accelerated partition	2025-08-27 12:31:42 +02:00
ys1087@partner.kit.edu	a67f474fc0	Clarify pre-training vs fine-tuning phases and dev test purpose - Pre-training: diffusion model on offline D4RL data (200 epochs) - Fine-tuning: PPO fine-tune with online environment interaction - Dev test: 2 epochs only for quick verification, not full training	2025-08-27 12:29:31 +02:00
ys1087@partner.kit.edu	80339cad52	Update experiment plan with successful WandB run Job 3445117 completed with proper WandB logging Added WandB URL to tracking table	2025-08-27 12:28:16 +02:00
ys1087@partner.kit.edu	5a458aac67	Configure personal WandB entity and clean up docs Set DPPO_WANDB_ENTITY to dominik_roth for personal logging Remove irrelevant implementation details from experiment plan	2025-08-27 12:24:39 +02:00
ys1087@partner.kit.edu	d43a9e2b3c	Fix WandB configuration for proper logging - Configure DPPO_WANDB_ENTITY environment variable in dev script - Update README with clear WandB setup instructions - Remove wandb=null to enable logging when credentials are set	2025-08-27 12:23:43 +02:00
ys1087@partner.kit.edu	e8e7233d98	Fix WandB config issue and achieve working DPPO setup - Disable WandB in dev script to avoid config object vs string error - Successfully completed development test (Job 3445106) - Confirmed: pre-training works, loss reduces, checkpoints save - Update experiment tracking with successful results	2025-08-27 12:19:38 +02:00
ys1087@partner.kit.edu	7fc9b17871	Clean up experiment tracking Remove failed job tracking, only track successful/running experiments Note: Previous failure was setup error, not DPPO repository issue	2025-08-27 12:08:31 +02:00
ys1087@partner.kit.edu	4adf67694a	Fix Hydra config error in dev script - Change train.n_iters to train.n_epochs (correct DPPO parameter) - Update experiment tracking with failed job details - Ready for corrected dev test	2025-08-27 12:07:38 +02:00
ys1087@partner.kit.edu	f88a5be4fe	Add experiment tracking plan Document current status, planned experiments, and job tracking Following REPPO experiment documentation pattern	2025-08-27 12:06:34 +02:00
ys1087@partner.kit.edu	add21c7019	Clarify that installation must run on GPU node Remove manual installation instructions as PyTorch CUDA dependencies require GPU node	2025-08-27 12:03:41 +02:00
ys1087@partner.kit.edu	2be39c4f2e	Fix README: remove incorrect cluster policy reference	2025-08-27 12:01:31 +02:00
ys1087@partner.kit.edu	835441af45	Fix broken image URL: use raw GitHub URL for cross-origin compatibility	2025-08-27 12:00:59 +02:00
ys1087@partner.kit.edu	2bb63d0ed1	Fix README: remove local git configuration details	2025-08-27 12:00:08 +02:00
ys1087@partner.kit.edu	30f59aaa9b	Add HoReKa cluster documentation to README - Document installation process using Python 3.10 venv - Add usage examples for SLURM job submission - Document available environments and resource allocations - Add WandB configuration instructions - List all repository changes made for HoReKa compatibility	2025-08-27 11:59:32 +02:00
ys1087@partner.kit.edu	05dddfa10c	Add HoReKa cluster setup and SLURM scripts - Add installation script for HoReKa with Python 3.10 venv - Add SLURM job submission scripts for dev and production runs - Add convenient submit_job.sh wrapper for easy job submission - Update .gitignore to allow shell scripts (removed *.sh exclusion) - Configure git remotes: upstream (original) and origin (fork)	2025-08-27 11:57:32 +02:00
allenzren	cc7234ad7f	add note about `ft_denoising_steps` in eval in README	2025-02-04 11:51:47 -05:00
allenzren	b8086ed12e	update version	2025-02-04 11:51:47 -05:00
allenzren	9032d02eae	change default `ft_denoising_steps` in eval configs to 0 (assume evaluating pre-trained models)	2025-02-04 11:51:47 -05:00
allenzren	fc42865c77	rename `DiffusionEvalFT` to `DiffusionEval`	2025-02-04 11:51:47 -05:00
allenzren	a746220905	allow loading pre-trained weights (not fine-tuned) in `DiffusionEvalFT`	2025-02-04 11:51:47 -05:00
allenzren	169a16dda7	update eval configs	2025-02-04 11:51:47 -05:00
allenzren	ace2bbdab9	add separate eval model class that also initializes the pre-trained policy for early denoising steps	2025-02-04 11:51:47 -05:00
allenzren	e7f73dffc1	update batch size in D3IL so it works with the new form of gradient update	2024-12-24 02:06:17 -05:00
Allen Z. Ren	1d04211666	v0.7 (#26) * update from scratch configs * update gym pretraining configs - use fewer epochs * update robomimic pretraining configs - use fewer epochs * allow trajectory plotting in eval agent * add simple vit unet * update avoid pretraining configs - use fewer epochs * update furniture pretraining configs - use same amount of epochs as before * add robomimic diffusion unet pretraining configs * update robomimic finetuning configs - higher lr * add vit unet checkpoint urls * update pretraining and finetuning instructions as configs are updated	2024-11-20 15:56:23 -05:00
Allen Z. Ren	d2929f65e1	update isaacgym download path	2024-11-12 18:05:23 -05:00
allenzren	7d1b3a236f	update D3IL pre-processing, fix normalization bug in robomimic pre-processing	2024-11-08 18:40:42 -05:00
allenzren	0bdae945e9	use default `epoch_start_ema=20` and `update_ema_freq=10`	2024-11-07 10:55:16 -05:00
allenzren	c0921a1fb5	remove `update_ema_freq`	2024-11-06 21:01:15 -05:00
Allen Z. Ren	e1ef4ca1cf	More frequent EMA update (#20) * move ema update within pretraining epoch * update pretraining ema configs * add lift and can epoch 8000 checkpoint url * add note about EMA issue in pretraining instruction	2024-11-06 20:42:31 -05:00
Allen Z. Ren	dc8e0c9edc	v0.6 (#18 ) * Sampling over both env and denoising steps in DPPO updates (#13) * sample one from each chain * full random sampling * Add Proficient Human (PH) Configs and Pipeline (#16) * fix missing cfg * add ph config * fix how terminated flags are added to buffer in ibrl * add ph config * offline calql for 1M gradient updates * bug fix: number of calql online gradient steps is the number of new transitions collected * add sample config for DPPO with ta=1 * Sampling over both env and denoising steps in DPPO updates (#13) * sample one from each chain * full random sampling * fix diffusion loss when predicting initial noise * fix dppo inds * fix typo * remove print statement --------- Co-authored-by: Justin M. Lidard <jlidard@neuronic.cs.princeton.edu> Co-authored-by: allenzren <allen.ren@princeton.edu> * update robomimic configs * better calql formulation * optimize calql and ibrl training * optimize data transfer in ppo agents * add kitchen configs * re-organize config folders, rerun calql and rlpd * add scratch gym locomotion configs * add kitchen installation dependencies * use truncated for termination in furniture env * update furniture and gym configs * update README and dependencies with kitchen * add url for new data and checkpoints * update demo RL configs * update batch sizes for furniture unet configs * raise error about dropout in residual mlp * fix observation bug in bc loss --------- Co-authored-by: Justin Lidard <60638575+jlidard@users.noreply.github.com> Co-authored-by: Justin M. Lidard <jlidard@neuronic.cs.princeton.edu>	2024-10-30 19:58:06 -04:00
allenzren	7b10df690d	fix diffusion loss when predicting initial noise	2024-10-13 11:19:10 -04:00
allenzren	4e14b8086d	fix itr initialization in eval agents	2024-10-10 14:10:16 -04:00
Allen Z. Ren	e0842e71dc	v0.5 to main (#10) * v0.5 (#9) * update idql configs * update awr configs * update dipo configs * update qsm configs * update dqm configs * update project version to 0.5.0	2024-10-07 16:35:13 -04:00
allenzren	dd14c5887c	set `deterministic=True` when sampling in diffusion evaluation	2024-09-26 01:15:10 -04:00
allenzren	4962bbce38	rename `train.py` to `run.py`	2024-09-17 16:33:53 -04:00
allenzren	c9f24ba0c3	add evaluation agents and some example configs	2024-09-17 16:32:45 -04:00
allenzren	bc52beca1e	add minor docs to diffusion classes and clean up some args	2024-09-17 16:26:25 -04:00
allenzren	ef5b14f820	fix observation history indexing in the dataset	2024-09-17 12:43:15 -04:00
allenzren	1aaa6c2302	support varying img size	2024-09-16 17:55:31 -04:00

59 Commits