Commit Graph

28 Commits

Author SHA1 Message Date
ys1087@partner.kit.edu
4eaec644ec fix: forgot to set path to include python3.10 2025-07-29 19:43:31 +02:00
ys1087@partner.kit.edu
13cd2e5b60 Update install instructions to support EPYC nodes like HoReKa Teal 2025-07-29 19:19:27 +02:00
ys1087@partner.kit.edu
22dfaa82dd ensure EPYC node (like Teal) compatibility 2025-07-29 19:19:10 +02:00
ys1087@partner.kit.edu
466bd2867f ignore alt venvs 2025-07-29 19:19:01 +02:00
ys1087@partner.kit.edu
b7b5a59803 Update experiment_plan.md 2025-07-24 01:11:30 +02:00
ys1087@partner.kit.edu
e3c5a229c3 require H100 due to GPU RAM usage 2025-07-24 01:10:48 +02:00
ys1087@partner.kit.edu
69502c8911 Complete HoReKa README with experiment management tools
- Added reference to experiment_plan.md for current progress
- Updated running instructions with batch experiment commands
- Added monitoring tools and paper replication workflow
- Listed all available scripts and their purposes
- Complete install and run instructions for HoReKa users
2025-07-22 17:18:27 +02:00
ys1087@partner.kit.edu
e95c2c4e11 Update experiment plan to TODO format with running jobs
- Added currently running SLURM job IDs (3367710-3367723)
- Converted to TODO list format with checkboxes
- Added reminders to install IsaacLab and HumanoidBench before Phase 2/3
- Phase 1 (MuJoCo) batch submitted and pending in queue
2025-07-22 17:12:05 +02:00
ys1087@partner.kit.edu
e7e3ae48f1 Add FastTD3 HoReKa experiment management system
- Fixed JAX/PyTorch dtype mismatch for successful training
- Added experiment plan with paper-accurate hyperparameters
- Created batch submission and monitoring scripts
- Cleaned up log files and updated gitignore
- Ready for systematic paper replication
2025-07-22 17:08:03 +02:00
ys1087@partner.kit.edu
15750f56b2 Fix JAX compatibility and CUDA module issues for HoReKa
- Update SLURM scripts to use correct CUDA modules (devel/cuda/12.4, intel compiler)
- Add JAX downgrade to 0.4.35 for CuDNN 9.5.1 compatibility
- Fix JAX_PLATFORMS environment variable (cuda vs gpu,cpu)
- Update README with cluster-specific JAX installation steps
- Tested successfully: Both PyTorch and JAX working on GPU with full training
2025-07-22 16:36:06 +02:00
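The setup this commit describes can be sketched as a shell snippet. The CUDA module name (devel/cuda/12.4), the JAX pin (0.4.35 for CuDNN 9.5.1), and JAX_PLATFORMS=cuda come from the commit message; the Intel compiler module name is a placeholder that will differ per cluster.

```shell
# Sketch of the HoReKa environment setup; module names beyond devel/cuda/12.4
# are illustrative and should be checked with `module avail` on your cluster.
module load devel/cuda/12.4        # CUDA toolkit matching CuDNN 9.5.1
module load compiler/intel         # placeholder name for the Intel compiler module

# Pin JAX for CuDNN 9.5.1 compatibility
pip install "jax[cuda12]==0.4.35"

# Use "cuda" (not "gpu,cpu") so JAX initializes only the CUDA backend
export JAX_PLATFORMS=cuda
python -c "import jax; print(jax.devices())"
```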
ys1087@partner.kit.edu
336c96bb7b Add HoReKa cluster support with SLURM and wandb integration
- Add complete HoReKa installation guide without conda dependency
- Include SLURM job script with GPU configuration and account setup
- Add helper scripts for job submission and environment testing
- Integrate wandb logging with both online and offline modes
- Support MuJoCo Playground environments for humanoid control
- Update README with clear separation of added vs original content
2025-07-22 16:15:30 +02:00
Younggyo Seo
51c55d4a8a
Support Multi-GPU Training (#22)
- Change in isaaclab_env wrapper to explicitly state GPU for each simulation
- Removing jax cache to support multi-gpu environment launch in MuJoCo Playground
- Removing .train() and .eval() in evaluation and rendering to avoid deadlock in multi-gpu training
- Supporting synchronous normalization for multi-gpu training
2025-07-07 10:24:42 -07:00
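The "synchronous normalization" point above amounts to averaging batch statistics across data-parallel ranks before any rank updates its normalizer, so all GPUs normalize identically. A minimal sketch, assuming `torch.distributed` is initialized and each rank holds a same-sized batch (the function name and shape are illustrative, not the repo's API):

```python
import torch
import torch.distributed as dist

def synced_batch_stats(x: torch.Tensor):
    """Return batch mean/variance averaged across all data-parallel ranks.

    Falls back to local statistics when torch.distributed is not initialized,
    so the same code path works for single-GPU runs.
    """
    mean = x.mean(dim=0)
    sq_mean = (x * x).mean(dim=0)
    if dist.is_available() and dist.is_initialized():
        # Sum the moments across ranks, then divide by world size
        dist.all_reduce(mean, op=dist.ReduceOp.SUM)
        dist.all_reduce(sq_mean, op=dist.ReduceOp.SUM)
        world = dist.get_world_size()
        mean /= world
        sq_mean /= world
    var = sq_mean - mean * mean  # Var[x] = E[x^2] - E[x]^2
    return mean, var
```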
Younggyo Seo
83907422a3
Improved AMP/torch.compile compatibility of SimbaV2 (#21) 2025-07-07 10:04:46 -07:00
Younggyo Seo
c354ead107
Optimized codebase to speed up training (#20)
- Modified codes to be compatible with torch.compile
- Modified empirical normalizer to use in-place operator to avoid costly __setattr__
- Parallel soft Q-update
- As a default option, we disabled gradient norm clipping as it is quite expensive
2025-07-02 19:39:02 -07:00
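The in-place normalizer change above exists because rebinding a tensor attribute on an `nn.Module` goes through `nn.Module.__setattr__`, which is costly inside a hot training loop; in-place ops on registered buffers avoid it and stay `torch.compile`-friendly. A minimal sketch of the idea, with illustrative names rather than the repo's actual class:

```python
import torch

class EmpiricalNormalizer(torch.nn.Module):
    """Running mean/variance normalizer updated with in-place buffer ops."""

    def __init__(self, shape, eps: float = 1e-8):
        super().__init__()
        # Buffers are module state visible to torch.compile, not parameters
        self.register_buffer("mean", torch.zeros(shape))
        self.register_buffer("var", torch.ones(shape))
        self.register_buffer("count", torch.zeros(()))
        self.eps = eps

    @torch.no_grad()
    def update(self, x: torch.Tensor) -> None:
        batch_mean = x.mean(dim=0)
        batch_var = x.var(dim=0, unbiased=False)
        batch_count = x.shape[0]
        delta = batch_mean - self.mean
        total = self.count + batch_count
        # add_/copy_ mutate the buffers in place; `self.mean = ...` would
        # invoke nn.Module.__setattr__ on every call
        self.mean.add_(delta * batch_count / total)
        m = self.var * self.count + batch_var * batch_count
        self.var.copy_((m + delta.pow(2) * self.count * batch_count / total) / total)
        self.count.copy_(total)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return (x - self.mean) / torch.sqrt(self.var + self.eps)
```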
Younggyo Seo
799624b202
Bug fix -- MTBench evaluation and missing code (#18)
This PR includes these changes:
- Fixing a bug in MTBench evaluation
- Add a missing `critic_cls` in `train.py` (resolving an issue https://github.com/younggyoseo/FastTD3/issues/17)
- Updating hyperparameters for MTBench
2025-06-25 09:21:04 -07:00
Younggyo Seo
cef44108d8
Support MTBench (#15)
This PR incorporates MTBench into the current codebase, as a good demonstration that shows how to use FastTD3 for multi-task setup.

- Add support for MTBench along with its wrapper
- Add support for per-task reward normalizer useful for multi-task RL, motivated by BRC paper (https://arxiv.org/abs/2505.23150v1)
2025-06-20 21:52:43 -07:00
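The per-task reward normalizer above keeps one running reward scale per task so that a high-reward task cannot dominate the multi-task loss. A minimal sketch of that idea (class and parameter names are illustrative, not the repo's implementation):

```python
import torch

class PerTaskRewardNormalizer:
    """Maintain an EMA of |reward| per task and rescale rewards by it."""

    def __init__(self, num_tasks: int, beta: float = 0.99, eps: float = 1e-8):
        self.scale = torch.ones(num_tasks)  # running reward scale per task
        self.beta = beta
        self.eps = eps

    def update(self, rewards: torch.Tensor, task_ids: torch.Tensor) -> None:
        # Each task's scale is updated only from its own transitions
        for tid in task_ids.unique():
            mask = task_ids == tid
            batch_scale = rewards[mask].abs().mean()
            self.scale[tid] = self.beta * self.scale[tid] + (1 - self.beta) * batch_scale

    def normalize(self, rewards: torch.Tensor, task_ids: torch.Tensor) -> torch.Tensor:
        return rewards / (self.scale[task_ids] + self.eps)
```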
Younggyo Seo
3facede77d Update README 2025-06-15 19:56:23 +00:00
Younggyo Seo
6e890eebd2
Support FastTD3 + SimbaV2 (#13)
- Support hyperspherical normalization
- Support loading FastTD3 + SimbaV2 for both training and inference
- Support (experimental) reward normalization that uses SimbaV2's formulation -- not working that well though
- Updated README for FastTD3 + SimbaV2
2025-06-15 12:49:59 -07:00
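The hyperspherical normalization mentioned above constrains vectors to the unit sphere, so only their direction carries information. The core operation is an L2 projection; a minimal sketch (function name is illustrative):

```python
import torch

def l2_normalize(x: torch.Tensor, dim: int = -1, eps: float = 1e-8) -> torch.Tensor:
    """Project x onto the unit hypersphere along `dim`.

    The eps term keeps the division finite for near-zero vectors.
    """
    return x / (x.norm(dim=dim, keepdim=True) + eps)
```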
Younggyo Seo
1014bf7e82 [hotfix] fix issue when using n-step==1 2025-06-10 08:26:27 +00:00
Younggyo Seo
85cb1c65c7
Fix replay buffer issues when n_steps > 1 (#7)
- Fix an issue where the n-step reward is not properly computed for end-of-episode transitions when using n_step > 1.
- Fix an issue where the observation and next_observations are sampled across different episodes when using n_step > 1 and the buffer is full
- Fix an issue where the discount is not properly computed when n_step > 1
2025-06-07 01:20:48 -04:00
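The three fixes above share one invariant: an n-step return must stop accumulating at an episode boundary, and the bootstrap discount must reflect how many steps were actually taken (zero if the episode terminated). A minimal sketch of that computation over a flat trajectory, not the repo's buffer code:

```python
def nstep_targets(rewards, dones, n_step, gamma):
    """For each index t, return the n-step return and the discount for the
    bootstrap value, truncating at episode ends so rewards and observations
    are never mixed across episodes.
    """
    T = len(rewards)
    returns, discounts = [0.0] * T, [0.0] * T
    for t in range(T):
        ret, disc = 0.0, 1.0
        for k in range(n_step):
            if t + k >= T:
                break
            ret += disc * rewards[t + k]
            disc *= gamma
            if dones[t + k]:
                disc = 0.0  # episode ended: no bootstrap beyond this point
                break
        returns[t] = ret
        discounts[t] = disc  # gamma**steps_taken, or 0.0 on termination
    return returns, discounts
```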
Younggyo Seo
fe028b578f update README and gitignore 2025-06-01 22:50:02 +00:00
Younggyo Seo
544adac2b4 fix typo 2025-05-29 17:53:35 +00:00
Younggyo Seo
3f22046fa8
Merge pull request #3 from younggyoseo/minor_updates_dev1
Update tuned_reward for T1
2025-05-29 01:30:33 -07:00
Younggyo Seo
c156ba93fb black formatting and update tuned_reward for T1 2025-05-29 08:29:44 +00:00
Younggyo Seo
65a55433fc
Merge pull request #2 from younggyoseo/memory_optimization_for_playground
memory optimization for playground
2025-05-29 00:00:57 -07:00
Younggyo Seo
5725eba3b8 memory optimization for playground 2025-05-29 06:58:28 +00:00
Younggyo Seo
5db18c2de2 update citations to include blog posts 2025-05-29 02:40:43 +00:00
Younggyo Seo
258bfe67dd Initial Public Release 2025-05-29 01:49:23 +00:00