Commit Graph

24 Commits

Author SHA1 Message Date
00dbc9bdd8 Error when calculating action_loss 2022-09-12 22:28:57 +02:00
4532135812 Finalized factoring out projections 2022-09-03 11:59:16 +02:00
0aeea4e2e5 Fixed Bug: Wrong dimensions for action_loss 2022-09-03 11:44:01 +02:00
4bb772a251 Factor Projections out into metastable-projections 2022-09-03 11:37:41 +02:00
e4a8cfc349 Implemented action_loss 2022-09-03 11:16:29 +02:00
4080ad8135 Removed old TODOs 2022-08-28 12:07:19 +02:00
eb881559d6 Support clip_range None 2022-08-28 02:07:18 +02:00
1d3c2fe005 Allow completely disabling some PPO features (for TRPL) 2022-08-28 00:26:44 +02:00
02e4ed1510 Added support for parallel envs 2022-08-27 15:19:00 +02:00
d35c3d8520 Fixed all the bugs in TRPL 2022-08-15 16:55:17 +02:00
28d0c609bc Fixed SDE: sampling had dimension mismatches 2022-08-14 20:09:10 +02:00
0ee65e789b Fixing sde's bugs 2022-08-14 16:10:22 +02:00
520dc98eb5 Implemented SDE 2022-08-10 11:54:52 +02:00
12e422aec7 Why does KL double free? 2022-08-07 18:04:40 +02:00
75d73049b4 Fixing bugs with w2 and sqrt_induced_gaussian 2022-08-06 21:25:49 +02:00
802094a50f Enabled w2 (can now get sqrt from dist) 2022-08-06 14:54:59 +02:00
508ebf51f0 Implemented sqrt-induced-gaussian for W2-Projection 2022-08-06 14:46:42 +02:00
3fa6de7e66 Broader sampling of stds for logging with batched full covs 2022-07-16 15:28:16 +02:00
d2d84d3287 Fixed bug for logging std-estimates when using batched data 2022-07-16 15:18:24 +02:00
4854346f2d Fixed bug with logging of std for full-cov 2022-07-16 14:58:00 +02:00
f184b88f19 Allow std logging for full and diagonal cov policies 2022-07-15 18:46:42 +02:00
a86d19053d Smashing bugs (dimension mismatch between Normal and Independent/MultivariateNormal) 2022-07-15 15:46:31 +02:00
ab557a8856 Making MultivariateNormal Policies work (and porting Normal to Independent) 2022-07-15 15:03:51 +02:00
b1ed9fc2b8 Renamed TRL_PG to PPO 2022-07-13 19:51:33 +02:00