| 479d73ac4b | Hotfix for exploding gradients | 2022-11-03 20:13:36 +01:00 |
| 00dbc9bdd8 | Error when calculating action_loss | 2022-09-12 22:28:57 +02:00 |
| 4532135812 | Finalized factoring out projections | 2022-09-03 11:59:16 +02:00 |
| 0aeea4e2e5 | Fixed Bug: Wrong dimensions for action_loss | 2022-09-03 11:44:01 +02:00 |
| 4bb772a251 | Factor Projections out into metastable-projections | 2022-09-03 11:37:41 +02:00 |
| e4a8cfc349 | Implemented action_loss | 2022-09-03 11:16:29 +02:00 |
| 4080ad8135 | Removed old TODOs | 2022-08-28 12:07:19 +02:00 |
| eb881559d6 | Support clip_range None | 2022-08-28 02:07:18 +02:00 |
| 1d3c2fe005 | Allow completely disabling some PPO features (for TRPL) | 2022-08-28 00:26:44 +02:00 |
| 02e4ed1510 | Added support for parallel envs | 2022-08-27 15:19:00 +02:00 |
| d35c3d8520 | Fixed all the bugs in TRPL | 2022-08-15 16:55:17 +02:00 |
| 28d0c609bc | Fixed SDE: sampling had dimension mismatches | 2022-08-14 20:09:10 +02:00 |
| 0ee65e789b | Fixing sde's bugs | 2022-08-14 16:10:22 +02:00 |
| 520dc98eb5 | Implemented SDE | 2022-08-10 11:54:52 +02:00 |
| 12e422aec7 | Why does KL double free? | 2022-08-07 18:04:40 +02:00 |
| 75d73049b4 | Fixing bugs with w2 and sqrt_induced_gaussian | 2022-08-06 21:25:49 +02:00 |
| 802094a50f | Enabled w2 (can now get sqrt from dist) | 2022-08-06 14:54:59 +02:00 |
| 508ebf51f0 | Implemented sqrt-induced-gaussian for W2-Projection | 2022-08-06 14:46:42 +02:00 |
| 3fa6de7e66 | Broader sampling of stds for logging with batched full covs | 2022-07-16 15:28:16 +02:00 |
| d2d84d3287 | Fixed bug for logging std-estimates when using batched data | 2022-07-16 15:18:24 +02:00 |
| 4854346f2d | Fixed bug with logging of std for full-cov | 2022-07-16 14:58:00 +02:00 |
| f184b88f19 | Allow std logging for full and diagonal cov policies | 2022-07-15 18:46:42 +02:00 |
| a86d19053d | Smashing bugs (dimension mismatch between Normal and Independent/MultivariateNormal) | 2022-07-15 15:46:31 +02:00 |
| ab557a8856 | Making MultivariateNormal Policies work (and porting Normal to Independent) | 2022-07-15 15:03:51 +02:00 |
| b1ed9fc2b8 | Renamed TRL_PG to PPO | 2022-07-13 19:51:33 +02:00 |