Update install instructions to support EPYC nodes like HoReKa Teal
parent 22dfaa82dd
commit 13cd2e5b60
README.md | 23
@@ -24,25 +24,38 @@ This repository includes optimized scripts for running FastTD3 on the HoReKa sup
 git clone https://github.com/younggyoseo/FastTD3.git
 cd FastTD3

-# Install Python 3.10 locally (HoReKa doesn't provide conda)
+# Install Python 3.10 locally with cross-CPU compatibility
+# IMPORTANT: Use generic x86-64 architecture for compatibility with both Intel and AMD nodes
 mkdir -p $HOME/.local/python-3.10
 cd /tmp
 curl -O https://www.python.org/ftp/python/3.10.14/Python-3.10.14.tgz
 tar -xzf Python-3.10.14.tgz
 cd Python-3.10.14
-./configure --prefix=$HOME/.local/python-3.10 --enable-optimizations --with-ensurepip=install
+
+# Configure without Intel-specific optimizations for AMD EPYC compatibility
+export EXTRA_CFLAGS="-march=x86-64 -mtune=generic"
+./configure --prefix=$HOME/.local/python-3.10 \
+    --with-ensurepip=install \
+    --enable-shared \
+    CFLAGS="$EXTRA_CFLAGS" \
+    CPPFLAGS="$EXTRA_CFLAGS"
 make -j$(nproc)
 make install

-# Add to PATH
+# Add to PATH and set library path
 echo 'export PATH="$HOME/.local/python-3.10/bin:$PATH"' >> ~/.bashrc
+echo 'export LD_LIBRARY_PATH="$HOME/.local/python-3.10/lib:$LD_LIBRARY_PATH"' >> ~/.bashrc
 echo 'export PATH="$HOME/.local/python-3.10/bin:$PATH"' >> ~/.zshrc
+echo 'export LD_LIBRARY_PATH="$HOME/.local/python-3.10/lib:$LD_LIBRARY_PATH"' >> ~/.zshrc
 export PATH="$HOME/.local/python-3.10/bin:$PATH"
+export LD_LIBRARY_PATH="$HOME/.local/python-3.10/lib:$LD_LIBRARY_PATH"

 # Go back to FastTD3 directory
 cd $HOME/path/to/FastTD3

 # Create virtual environment and install dependencies
+# NOTE: If you encounter library errors, ensure LD_LIBRARY_PATH is set correctly
+source ~/.bashrc  # Load PATH and LD_LIBRARY_PATH
 $HOME/.local/python-3.10/bin/python3.10 -m venv .venv
 source .venv/bin/activate
 pip install --upgrade pip
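A note on the `LD_LIBRARY_PATH` lines in this hunk: when the variable is unset, `"$HOME/.local/python-3.10/lib:$LD_LIBRARY_PATH"` expands with a trailing colon, and the dynamic loader treats an empty entry as the current working directory. Re-running the `echo ... >> ~/.bashrc` lines also appends duplicate entries. The following is a minimal sketch of one way to avoid both, assuming bash; the helper names `prepend_dir` and `append_once` are hypothetical and are not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not part of the FastTD3 repository.

# Print directory "$1" prepended to the colon-separated list "$2".
# Emits no trailing colon when the list is empty and does not add
# a directory that is already in the list.
prepend_dir() {
    local dir="$1" list="${2:-}"
    case ":$list:" in
        *":$dir:"*) printf '%s\n' "$list" ;;        # already present
        "::")       printf '%s\n' "$dir" ;;         # list was empty
        *)          printf '%s\n' "$dir:$list" ;;   # normal prepend
    esac
}

# Append line "$1" to file "$2" only if that exact line is not there yet.
append_once() {
    local line="$1" file="$2"
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Possible usage with the prefix from the README (left commented out):
#   export LD_LIBRARY_PATH="$(prepend_dir "$HOME/.local/python-3.10/lib" "${LD_LIBRARY_PATH:-}")"
#   append_once 'export PATH="$HOME/.local/python-3.10/bin:$PATH"' ~/.bashrc
```

`append_once` matches whole lines literally (`grep -qxF`), so it only prevents exact duplicates; a differently spelled export of the same variable would still be appended.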
@@ -100,15 +113,17 @@ sbatch run_fasttd3.slurm
 ### Configuration

 The setup includes:
+- **Cross-CPU compatible Python 3.10** with generic x86-64 architecture (works on both Intel Xeon and AMD EPYC nodes)
 - **SLURM scripts** (`run_fasttd3.slurm`, `run_fasttd3_full.slurm`) configured for accelerated partition with GPU
 - **Job helpers** (`submit_job.py`, `submit_experiment_batch.py`) for single/batch job submission
 - **Monitoring tool** (`monitor_experiments.py`) for real-time experiment tracking
 - **Test script** (`test_setup.py`) for environment verification
 - **Experiment plan** (`experiment_plan.md`) with current progress and TODO tracking
-- **MuJoCo Playground environment** (`T1JoystickFlatTerrain`) working and tested
+- **MuJoCo Playground environment** (`T1JoystickFlatTerrain`) working and tested on all node types
 - **Automatic GPU detection** and CUDA 12.4 compatibility
 - **Wandb logging** with online mode by default
 - **Paper-accurate hyperparameters** for systematic replication
+- **LD_LIBRARY_PATH configuration** for shared Python libraries

 ### Wandb Integration

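The "works on both Intel Xeon and AMD EPYC nodes" item can be spot-checked on each node type. Below is a minimal sketch, assuming bash and a Linux `/proc/cpuinfo`; the function names `vendor_label`, `read_vendor_id`, and `report_node` are hypothetical, and this is not the repository's `test_setup.py`.

```shell
#!/usr/bin/env bash
# Hypothetical node check for illustration; not the repository's test_setup.py.

# Map a /proc/cpuinfo vendor_id string to a short label.
vendor_label() {
    case "$1" in
        GenuineIntel) echo "intel" ;;
        AuthenticAMD) echo "amd" ;;
        *)            echo "other" ;;
    esac
}

# Print the first vendor_id value from a cpuinfo-style file
# (defaults to /proc/cpuinfo).
read_vendor_id() {
    awk -F': *' '/^vendor_id/ { print $2; exit }' "${1:-/proc/cpuinfo}"
}

# Print the CPU vendor, then start the locally built interpreter and
# print the CFLAGS recorded at build time. The default path is the
# prefix used in the README.
report_node() {
    local py="${1:-$HOME/.local/python-3.10/bin/python3.10}"
    echo "cpu vendor: $(vendor_label "$(read_vendor_id)")"
    "$py" -c 'import sysconfig; print("CFLAGS:", sysconfig.get_config_var("CFLAGS"))'
}
```

Running `report_node` inside a job on each node type exercises the same two things the commit changes: if the interpreter cannot start (for example because the shared `libpython` is not found, or the binary uses instructions the CPU lacks), the second line fails instead of printing the flags.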