FastTD3

11 Commits 1 Branch 0 Tags 211 KiB

Author	SHA1	Message	Date
Younggyo Seo	6e890eebd2	Support FastTD3 + SimbaV2 (#13): support hyperspherical normalization; support loading FastTD3 + SimbaV2 for both training and inference; support (experimental) reward normalization that uses SimbaV2's formulation (not working that well, though); updated README for FastTD3 + SimbaV2	2025-06-15 12:49:59 -07:00
Younggyo Seo	1014bf7e82	[hotfix] fix issue when using n-step==1	2025-06-10 08:26:27 +00:00
Younggyo Seo	85cb1c65c7	Fix replay buffer issues when n_steps > 1 (#7): fix an issue where the n-step reward is not properly computed for end-of-episode transitions when using n_step > 1; fix an issue where the observation and next_observations are sampled across different episodes when using n_step > 1 and the buffer is full; fix an issue where the discount is not properly computed when n_step > 1	2025-06-07 01:20:48 -04:00
Younggyo Seo	fe028b578f	update README and gitignore	2025-06-01 22:50:02 +00:00
Younggyo Seo	544adac2b4	fix typo	2025-05-29 17:53:35 +00:00
Younggyo Seo	3f22046fa8	Merge pull request #3 from younggyoseo/minor_updates_dev1: Update tuned_reward for T1	2025-05-29 01:30:33 -07:00
Younggyo Seo	c156ba93fb	black formatting and update tuned_reward for T1	2025-05-29 08:29:44 +00:00
Younggyo Seo	65a55433fc	Merge pull request #2 from younggyoseo/memory_optimization_for_playground: memory optimization for playground	2025-05-29 00:00:57 -07:00
Younggyo Seo	5725eba3b8	memory optimization for playground	2025-05-29 06:58:28 +00:00
Younggyo Seo	5db18c2de2	update citations to include blog posts	2025-05-29 02:40:43 +00:00
Younggyo Seo	258bfe67dd	Initial Public Release	2025-05-29 01:49:23 +00:00
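
Commit 85cb1c65c7 above concerns n-step rewards and discounts at episode boundaries. The sketch below illustrates the general technique only and is not the repository's replay buffer code: the function name `n_step_return` and its arguments are hypothetical, and it assumes per-step `rewards` and `dones` sequences for a single trajectory. It accumulates discounted rewards for up to `n_step` steps, stops at the first terminal step so that rewards from the next episode are never mixed in, and returns the effective discount to apply to the bootstrap value.

```python
from typing import Sequence, Tuple


def n_step_return(
    rewards: Sequence[float],
    dones: Sequence[bool],
    gamma: float,
    n_step: int,
) -> Tuple[float, float, bool]:
    """Hypothetical helper: n-step return for the transition at index 0.

    Returns (accumulated_reward, effective_discount, done), where
    effective_discount is gamma ** k and k is the number of steps
    actually accumulated (k <= n_step).
    """
    total = 0.0
    discount = 1.0
    done = False
    for reward, terminal in zip(rewards[:n_step], dones[:n_step]):
        total += discount * reward
        discount *= gamma
        if terminal:
            # Stop at the episode boundary: later entries belong to
            # a different episode and must not be accumulated.
            done = True
            break
    return total, discount, done


# A 3-step window whose episode ends at the second step.
total, discount, done = n_step_return(
    rewards=[1.0, 1.0, 1.0],
    dones=[False, True, False],
    gamma=0.5,
    n_step=3,
)
```

Under these assumptions, a critic target would be formed as `total + discount * (1 - done) * next_value`, so a terminal window contributes no bootstrap term. With `n_step=1` the helper reduces to the usual one-step reward and a discount of `gamma`.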