Commit Graph

28 Commits

Author SHA1 Message Date
ys1087@partner.kit.edu
4eaec644ec fix: forgot to set path to include python3.10 2025-07-29 19:43:31 +02:00
ys1087@partner.kit.edu
13cd2e5b60 Update install instructions to support EPYC nodes like HoReKa Teal 2025-07-29 19:19:27 +02:00
ys1087@partner.kit.edu
22dfaa82dd ensure EPYC node (like Teal) compatibility 2025-07-29 19:19:10 +02:00
ys1087@partner.kit.edu
466bd2867f ignore alt venvs 2025-07-29 19:19:01 +02:00
ys1087@partner.kit.edu
b7b5a59803 Update experiment_plan.md 2025-07-24 01:11:30 +02:00
ys1087@partner.kit.edu
e3c5a229c3 require H100 due to GPU RAM usage 2025-07-24 01:10:48 +02:00
ys1087@partner.kit.edu
69502c8911 Complete HoReKa README with experiment management tools
- Added reference to experiment_plan.md for current progress
- Updated running instructions with batch experiment commands
- Added monitoring tools and paper replication workflow
- Listed all available scripts and their purposes
- Complete install and run instructions for HoReKa users
2025-07-22 17:18:27 +02:00
ys1087@partner.kit.edu
e95c2c4e11 Update experiment plan to TODO format with running jobs
- Added currently running SLURM job IDs (3367710-3367723)
- Converted to TODO list format with checkboxes
- Added reminders to install IsaacLab and HumanoidBench before Phase 2/3
- Phase 1 (MuJoCo) batch submitted and pending in queue
2025-07-22 17:12:05 +02:00
ys1087@partner.kit.edu
e7e3ae48f1 Add FastTD3 HoReKa experiment management system
- Fixed JAX/PyTorch dtype mismatch for successful training
- Added experiment plan with paper-accurate hyperparameters
- Created batch submission and monitoring scripts
- Cleaned up log files and updated gitignore
- Ready for systematic paper replication
2025-07-22 17:08:03 +02:00
ys1087@partner.kit.edu
15750f56b2 Fix JAX compatibility and CUDA module issues for HoReKa
- Update SLURM scripts to use correct CUDA modules (devel/cuda/12.4, intel compiler)
- Add JAX downgrade to 0.4.35 for CuDNN 9.5.1 compatibility
- Fix JAX_PLATFORMS environment variable (cuda vs gpu,cpu)
- Update README with cluster-specific JAX installation steps
- Tested successfully: Both PyTorch and JAX working on GPU with full training
2025-07-22 16:36:06 +02:00
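The setup this commit describes can be sketched as a shell snippet. The CUDA module name (devel/cuda/12.4), the JAX pin (0.4.35 for CuDNN 9.5.1), and JAX_PLATFORMS=cuda come from the commit message; the Intel compiler module name is a placeholder that will differ per cluster.

```shell
# Sketch of the HoReKa environment setup; module names beyond devel/cuda/12.4
# are illustrative and should be checked with `module avail` on your cluster.
module load devel/cuda/12.4        # CUDA toolkit matching CuDNN 9.5.1
module load compiler/intel         # placeholder name for the Intel compiler module

# Pin JAX for CuDNN 9.5.1 compatibility
pip install "jax[cuda12]==0.4.35"

# Use "cuda" (not "gpu,cpu") so JAX initializes only the CUDA backend
export JAX_PLATFORMS=cuda
python -c "import jax; print(jax.devices())"
```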
ys1087@partner.kit.edu
336c96bb7b Add HoReKa cluster support with SLURM and wandb integration
- Add complete HoReKa installation guide without conda dependency
- Include SLURM job script with GPU configuration and account setup
- Add helper scripts for job submission and environment testing
- Integrate wandb logging with both online and offline modes
- Support MuJoCo Playground environments for humanoid control
- Update README with clear separation of added vs original content
2025-07-22 16:15:30 +02:00
Younggyo Seo
51c55d4a8a
Support Multi-GPU Training (#22)
- Change in isaaclab_env wrapper to explicitly state GPU for each simulation
- Removing jax cache to support multi-gpu environment launch in MuJoCo Playground
- Removing .train() and .eval() in evaluation and rendering to avoid deadlock in multi-gpu training
- Supporting synchronous normalization for multi-gpu training
2025-07-07 10:24:42 -07:00
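The "synchronous normalization" point above amounts to averaging batch statistics across data-parallel ranks before any rank updates its normalizer, so all GPUs normalize identically. A minimal sketch, assuming `torch.distributed` is initialized and each rank holds a same-sized batch (the function name and shape are illustrative, not the repo's API):

```python
import torch
import torch.distributed as dist

def synced_batch_stats(x: torch.Tensor):
    """Return batch mean/variance averaged across all data-parallel ranks.

    Falls back to local statistics when torch.distributed is not initialized,
    so the same code path works for single-GPU runs.
    """
    mean = x.mean(dim=0)
    sq_mean = (x * x).mean(dim=0)
    if dist.is_available() and dist.is_initialized():
        # Sum the moments across ranks, then divide by world size
        dist.all_reduce(mean, op=dist.ReduceOp.SUM)
        dist.all_reduce(sq_mean, op=dist.ReduceOp.SUM)
        world = dist.get_world_size()
        mean /= world
        sq_mean /= world
    var = sq_mean - mean * mean  # Var[x] = E[x^2] - E[x]^2
    return mean, var
```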
Younggyo Seo
83907422a3
Improved AMP/torch.compile compatibility of SimbaV2 (#21) 2025-07-07 10:04:46 -07:00
Younggyo Seo
c354ead107
Optimized codebase to speed up training (#20)
- Modified codes to be compatible with torch.compile
- Modified empirical normalizer to use in-place operator to avoid costly __setattr__
- Parallel soft Q-update
- As a default option, we disabled gradient norm clipping as it is quite expensive
2025-07-02 19:39:02 -07:00
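The in-place normalizer change above exists because rebinding a tensor attribute on an `nn.Module` goes through `nn.Module.__setattr__`, which is costly inside a hot training loop; in-place ops on registered buffers avoid it and stay `torch.compile`-friendly. A minimal sketch of the idea, with illustrative names rather than the repo's actual class:

```python
import torch

class EmpiricalNormalizer(torch.nn.Module):
    """Running mean/variance normalizer updated with in-place buffer ops."""

    def __init__(self, shape, eps: float = 1e-8):
        super().__init__()
        # Buffers are module state visible to torch.compile, not parameters
        self.register_buffer("mean", torch.zeros(shape))
        self.register_buffer("var", torch.ones(shape))
        self.register_buffer("count", torch.zeros(()))
        self.eps = eps

    @torch.no_grad()
    def update(self, x: torch.Tensor) -> None:
        batch_mean = x.mean(dim=0)
        batch_var = x.var(dim=0, unbiased=False)
        batch_count = x.shape[0]
        delta = batch_mean - self.mean
        total = self.count + batch_count
        # add_/copy_ mutate the buffers in place; `self.mean = ...` would
        # invoke nn.Module.__setattr__ on every call
        self.mean.add_(delta * batch_count / total)
        m = self.var * self.count + batch_var * batch_count
        self.var.copy_((m + delta.pow(2) * self.count * batch_count / total) / total)
        self.count.copy_(total)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return (x - self.mean) / torch.sqrt(self.var + self.eps)
```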
Younggyo Seo
799624b202
Bug fix -- MTBench evaluation and missing code (#18)
This PR includes these changes:
- Fixing a bug in MTBench evaluation
- Add a missing `critic_cls` in `train.py` (resolving an issue https://github.com/younggyoseo/FastTD3/issues/17)
- Updating hyperparameters for MTBench
2025-06-25 09:21:04 -07:00
Younggyo Seo
cef44108d8
Support MTBench (#15)
This PR incorporates MTBench into the current codebase, as a good demonstration that shows how to use FastTD3 for multi-task setup.

- Add support for MTBench along with its wrapper
- Add support for per-task reward normalizer useful for multi-task RL, motivated by BRC paper (https://arxiv.org/abs/2505.23150v1)
2025-06-20 21:52:43 -07:00
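The per-task reward normalizer above keeps one running reward scale per task so that a high-reward task cannot dominate the multi-task loss. A minimal sketch of that idea (class and parameter names are illustrative, not the repo's implementation):

```python
import torch

class PerTaskRewardNormalizer:
    """Maintain an EMA of |reward| per task and rescale rewards by it."""

    def __init__(self, num_tasks: int, beta: float = 0.99, eps: float = 1e-8):
        self.scale = torch.ones(num_tasks)  # running reward scale per task
        self.beta = beta
        self.eps = eps

    def update(self, rewards: torch.Tensor, task_ids: torch.Tensor) -> None:
        # Each task's scale is updated only from its own transitions
        for tid in task_ids.unique():
            mask = task_ids == tid
            batch_scale = rewards[mask].abs().mean()
            self.scale[tid] = self.beta * self.scale[tid] + (1 - self.beta) * batch_scale

    def normalize(self, rewards: torch.Tensor, task_ids: torch.Tensor) -> torch.Tensor:
        return rewards / (self.scale[task_ids] + self.eps)
```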
Younggyo Seo
3facede77d Update README 2025-06-15 19:56:23 +00:00
Younggyo Seo
6e890eebd2
Support FastTD3 + SimbaV2 (#13)
- Support hyperspherical normalization
- Support loading FastTD3 + SimbaV2 for both training and inference
- Support (experimental) reward normalization that uses SimbaV2's formulation -- not working that well though
- Updated README for FastTD3 + SimbaV2
2025-06-15 12:49:59 -07:00
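The hyperspherical normalization mentioned above constrains vectors to the unit sphere, so only their direction carries information. The core operation is an L2 projection; a minimal sketch (function name is illustrative):

```python
import torch

def l2_normalize(x: torch.Tensor, dim: int = -1, eps: float = 1e-8) -> torch.Tensor:
    """Project x onto the unit hypersphere along `dim`.

    The eps term keeps the division finite for near-zero vectors.
    """
    return x / (x.norm(dim=dim, keepdim=True) + eps)
```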
Younggyo Seo
1014bf7e82 [hotfix] fix issue when using n-step==1 2025-06-10 08:26:27 +00:00
Younggyo Seo
85cb1c65c7
Fix replay buffer issues when n_steps > 1 (#7)
- Fix an issue where the n-step reward is not properly computed for end-of-episode transitions when using n_step > 1.
- Fix an issue where the observation and next_observations are sampled across different episodes when using n_step > 1 and the buffer is full
- Fix an issue where the discount is not properly computed when n_step > 1
2025-06-07 01:20:48 -04:00
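The three fixes above share one invariant: an n-step return must stop accumulating at an episode boundary, and the bootstrap discount must reflect how many steps were actually taken (zero if the episode terminated). A minimal sketch of that computation over a flat trajectory, not the repo's buffer code:

```python
def nstep_targets(rewards, dones, n_step, gamma):
    """For each index t, return the n-step return and the discount for the
    bootstrap value, truncating at episode ends so rewards and observations
    are never mixed across episodes.
    """
    T = len(rewards)
    returns, discounts = [0.0] * T, [0.0] * T
    for t in range(T):
        ret, disc = 0.0, 1.0
        for k in range(n_step):
            if t + k >= T:
                break
            ret += disc * rewards[t + k]
            disc *= gamma
            if dones[t + k]:
                disc = 0.0  # episode ended: no bootstrap beyond this point
                break
        returns[t] = ret
        discounts[t] = disc  # gamma**steps_taken, or 0.0 on termination
    return returns, discounts
```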
Younggyo Seo
fe028b578f update README and gitignore 2025-06-01 22:50:02 +00:00
Younggyo Seo
544adac2b4 fix typo 2025-05-29 17:53:35 +00:00
Younggyo Seo
3f22046fa8
Merge pull request #3 from younggyoseo/minor_updates_dev1
Update tuned_reward for T1
2025-05-29 01:30:33 -07:00
Younggyo Seo
c156ba93fb black formatting and update tuned_reward for T1 2025-05-29 08:29:44 +00:00
Younggyo Seo
65a55433fc
Merge pull request #2 from younggyoseo/memory_optimization_for_playground
memory optimization for playground
2025-05-29 00:00:57 -07:00
Younggyo Seo
5725eba3b8 memory optimization for playground 2025-05-29 06:58:28 +00:00
Younggyo Seo
5db18c2de2 update citations to include blog posts 2025-05-29 02:40:43 +00:00
Younggyo Seo
258bfe67dd Initial Public Release 2025-05-29 01:49:23 +00:00