- Disable WandB in dev script to avoid config object vs string error
- Successfully completed development test (Job 3445106)
- Confirmed: pre-training works, loss reduces, checkpoints save
- Update experiment tracking with successful results

| File |
|---|
| run_dppo_dev.sh |
| run_dppo_gym.sh |