Fix Hydra config error in dev script
- Change train.n_iters to train.n_epochs (correct DPPO parameter)
- Update experiment tracking with failed job details
- Ready for corrected dev test
Parent: f88a5be4fe
Commit: 4adf67694a
```diff
@@ -71,7 +71,7 @@ TASK=hopper MODE=pretrain sbatch slurm/run_dppo_gym.sh
 
 | Job ID | Type | Task | Mode | Status | Duration | Results |
 |--------|------|------|------|---------|----------|---------|
-| 3445081 | dev test | hopper | pretrain | PENDING | 30min | - |
+| 3445081 | dev test | hopper | pretrain | ❌ FAILED | 33sec | Hydra config error |
 
 ## Configuration Notes
 
```
```diff
@@ -87,7 +87,11 @@ export WANDB_ENTITY=<your_username>
 
 ## Issues Encountered
 
-None so far - installation completed without code modifications.
+### Fixed Issues
+1. **Hydra Configuration Error** (Job 3445081)
+   - Issue: Wrong parameter names in dev script (`train.n_iters` instead of `train.n_epochs`)
+   - Fix: Updated to use correct DPPO config parameters
+   - Status: Fixed in commit
 
 ## Next Steps
```
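The Hydra error behind Job 3445081 comes down to overriding a key the composed config does not define. A minimal, dependency-free sketch of that validation pattern — the `DEFAULTS` shape and the `apply_override` helper are illustrative assumptions, not DPPO's real config or Hydra's implementation:

```python
# Sketch of strict override validation: a dotted key that is not in the
# config is rejected instead of being silently added. DEFAULTS below is a
# hypothetical stand-in for the composed pre_diffusion_mlp config.
DEFAULTS = {"train": {"n_epochs": 200, "save_model_freq": 10}}

def apply_override(cfg: dict, dotted_key: str, value):
    """Apply an 'a.b.c=value'-style override, rejecting unknown keys."""
    node = cfg
    *parents, leaf = dotted_key.split(".")
    for part in parents:
        if part not in node:
            raise KeyError(f"could not override '{dotted_key}': no section '{part}'")
        node = node[part]
    if leaf not in node:
        raise KeyError(f"could not override '{dotted_key}': no key '{leaf}'")
    node[leaf] = value
    return cfg

cfg = apply_override(DEFAULTS, "train.n_epochs", 2)  # valid key: accepted
print(cfg["train"]["n_epochs"])

try:
    apply_override(DEFAULTS, "train.n_iters", 10)    # the bad key from the old script
except KeyError as err:
    print(err)
```

This is why the job died in 33 seconds: the failure happens while the config is composed, before any training step runs.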
```diff
@@ -41,12 +41,11 @@ echo "PyTorch version: $(python -c 'import torch; print(torch.__version__)')"
 echo "CUDA available: $(python -c 'import torch; print(torch.cuda.is_available())')"
 echo ""
 
-# Run a quick pre-training test with reduced steps
+# Run a quick pre-training test with reduced epochs
 python script/run.py --config-name=pre_diffusion_mlp \
     --config-dir=cfg/gym/pretrain/hopper-medium-v2 \
-    train.n_iters=10 \
-    train.log_interval=5 \
-    train.checkpoint_interval=10 \
+    train.n_epochs=2 \
+    train.save_model_freq=1 \
     wandb=${WANDB_MODE:-null}
 
 echo "Dev test completed!"
```
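One way to catch a bad override name without burning a Slurm submission is Hydra's `--cfg job` flag, which composes and prints the config instead of running the job; composition fails immediately on an unknown key. A usage sketch reusing the dev script's paths (assumes the same repo layout, so it is not runnable outside the DPPO checkout):

```shell
# Compose-only check: an unknown key like train.n_iters errors out here
# in seconds rather than after the job is queued.
python script/run.py --config-name=pre_diffusion_mlp \
    --config-dir=cfg/gym/pretrain/hopper-medium-v2 \
    train.n_epochs=2 \
    --cfg job
```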