Added Plasability Study
64
documents/PCA_Plausability.md
Normal file
# PCA Plausibility Study

## Results of the Preliminary Tests on Prior Conditioned Annealing (PCA)

We test and compare the behavior of REX (White Noise), PCA, and Pink Noise when run in a Columbus environment.
No AI is used; we assume a Gaussian policy of $\mathcal N (0,1)$ at all times and only investigate the exploratory behavior induced by the exploration mechanisms.
### Behavior under Velocity Control

The actions describe the velocity vector at which we travel.

![Screenshot from 2023-05-03 18-00-58](PCA_Plausability.assets/Screenshot from 2023-05-03 18-00-58.png)

<video src="../../Videos/Screencasts/Versus.webm"></video>

REX (left) generates Brownian motion (as expected), which gives the worst exploration performance possible relative to the trajectories' entropy. Pink Noise (right) travels a lot further in the state space. PCA (center) also reaches a wide area of the state space, but we observe much jerkier motions. This could be problematic, since training on such samples would induce jerky motions in the policy net. It could also be a benefit, since the trajectories have much higher entropy.
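Why white noise produces Brownian motion under velocity control can be made concrete with a minimal sketch: summing i.i.d. $\mathcal N(0,1)$ velocity actions over time is a random walk. The function names, the 2D action space, and the step size `dt` are illustrative assumptions and are not taken from Columbus or the REX implementation.

```python
import random


def rollout_velocity_control(actions, dt=1.0):
    """Integrate 2D velocity actions into a list of positions, starting at the origin."""
    x, y = 0.0, 0.0
    path = [(x, y)]
    for vx, vy in actions:
        x += vx * dt
        y += vy * dt
        path.append((x, y))
    return path


def white_noise_actions(n_steps, rng):
    """i.i.d. N(0, 1) samples per action dimension (null policy with white noise)."""
    return [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n_steps)]


rng = random.Random(0)
path = rollout_velocity_control(white_noise_actions(1000, rng))
final_distance = (path[-1][0] ** 2 + path[-1][1] ** 2) ** 0.5
print(f"distance from start after 1000 steps: {final_distance:.1f}")
```

For such a random walk the expected squared distance from the start grows only linearly with the number of steps, which is one way to read the poor coverage seen for REX.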
The resulting PCA trajectories look somewhat similar to those of colored noise with a power spectral density proportional to $1/f^\beta$ with $\beta = 0.5$:

<video src="../../Videos/Screencasts/BETA.5.webm"></video>
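For reference, noise with a given spectral exponent can be sketched as a sum of sinusoids whose amplitudes are scaled by $k^{-\beta/2}$, so that the power at frequency index $k$ falls off as $k^{-\beta}$ ($\beta = 0$ is white, $\beta = 1$ is pink). This is only an illustration using the standard library; the function names are hypothetical, and it is not the generator used for the screencast.

```python
import math
import random


def colored_noise(n_samples, beta, rng):
    """Sample Gaussian noise whose power spectral density falls off as 1/f**beta.

    Built as a sum of sinusoids with Gaussian coefficients, amplitude-scaled by
    k**(-beta / 2), then standardized to zero mean and unit variance.
    """
    n_freqs = n_samples // 2
    coeffs = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n_freqs)]
    samples = []
    for t in range(n_samples):
        value = 0.0
        for k, (a, b) in enumerate(coeffs, start=1):
            angle = 2.0 * math.pi * k * t / n_samples
            value += k ** (-beta / 2.0) * (a * math.cos(angle) + b * math.sin(angle))
        samples.append(value)
    mean = sum(samples) / n_samples
    std = math.sqrt(sum((s - mean) ** 2 for s in samples) / n_samples)
    return [(s - mean) / std for s in samples]


def lag1_autocorrelation(samples):
    """Circular lag-1 autocorrelation of a standardized sequence."""
    n = len(samples)
    return sum(samples[t] * samples[(t + 1) % n] for t in range(n)) / n


# Same seed for every beta, so only the spectral tilt differs between runs.
for beta in (0.0, 0.5, 1.0):
    noise = colored_noise(256, beta, random.Random(0))
    print(f"beta={beta}: lag-1 autocorrelation {lag1_autocorrelation(noise):.2f}")
```

A larger `beta` puts more power into low frequencies, so consecutive samples become more strongly correlated and the trajectory drifts further before changing direction.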
### Behavior under Acceleration Control

The actions describe the acceleration vector with which we change our velocity.
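The difference from velocity control is one additional integration step. The sketch below assumes a frictionless point mass, a 2D action space, and semi-implicit Euler integration; the actual Columbus dynamics (damping, speed limits, walls) may differ, and the function name is hypothetical.

```python
def rollout_acceleration_control(actions, dt=1.0):
    """Integrate 2D acceleration actions into positions (semi-implicit Euler, no friction)."""
    x = y = vx = vy = 0.0
    path = [(x, y)]
    for ax, ay in actions:
        # The action first changes the velocity ...
        vx += ax * dt
        vy += ay * dt
        # ... and the updated velocity then moves the agent.
        x += vx * dt
        y += vy * dt
        path.append((x, y))
    return path


# A single unit push followed by zero actions keeps the agent drifting.
path = rollout_acceleration_control([(1.0, 0.0)] + [(0.0, 0.0)] * 3)
print(path)
```

Because every action persists in the velocity, noise that is already temporally correlated gets integrated twice under this control mode.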
![Screenshot from 2023-05-03 18-09-15](PCA_Plausability.assets/Screenshot from 2023-05-03 18-09-15.png)

<video src="../../Videos/Screencasts/Versus_AccControll.webm"></video>

Pink Noise does not seem to be a good fit for this environment; the samples are in no way representative of the null policy used.

We will restrict our further investigations to the velocity-control environment, since acceleration control seems like a hack to get smoother motions anyway.
### Pink PCA

![Screenshot from 2023-05-04 14-05-41](PCA_Plausability.assets/Screenshot from 2023-05-04 14-05-41.png)

We are not restricted to using white noise as the underlying distribution to be conditioned by PCA. PCA based on Pink Noise does not seem to behave significantly differently from regular Pink Noise. A more pronounced difference should become apparent when the exploration mechanisms are integrated into an RL setup, since PCA will react to the policy's behavior, while regular Pink Noise will essentially just be superimposed (apart from optimizing the distribution's sigma).
### Why has no one ever tried Perlin Noise?

In my B.Sc. thesis, I showed that a significant benefit of SDE lies in its ability to serve as a 'teacher of smooth motions'. Any change to the policy is made based on samples from our exploration mechanism. An exploration mechanism that generates jerky motions will therefore also teach a NN to generate jerky motions.

We already throw the i.i.d. constraint out of the window when using SDE. Pink Noise is also a stateful stochastic process. This raises the question of how weird and non-Gaussian we can make our distribution before things break.

We can normalize Perlin Noise to approximately follow a standard Gaussian. We can then use the samples from this normalized Perlin Noise as the underlying $\varepsilon$-distribution used in the reparameterization. It can also be combined with PCA.
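One way such a normalization could look is sketched below: classic 1D gradient (Perlin) noise is sampled along a line and then standardized empirically to zero mean and unit variance. This only matches the first two moments of a standard Gaussian, and standardizing a whole sequence after the fact is a simplification; the class, the function, and the `speed` and `table_size` parameters are assumptions for illustration and do not describe our actual implementation.

```python
import math
import random


class PerlinNoise1D:
    """Classic 1D gradient noise, traversed at a fixed speed."""

    def __init__(self, rng, speed=0.1, table_size=256):
        # One random gradient per integer lattice point (the table repeats periodically).
        self.gradients = [rng.uniform(-1.0, 1.0) for _ in range(table_size)]
        self.speed = speed
        self.position = rng.uniform(0.0, table_size)

    def value_at(self, x):
        i = math.floor(x)
        f = x - i
        g0 = self.gradients[i % len(self.gradients)]
        g1 = self.gradients[(i + 1) % len(self.gradients)]
        fade = f * f * f * (f * (f * 6.0 - 15.0) + 10.0)  # smooth interpolation weight
        n0 = g0 * f          # contribution of the left lattice point
        n1 = g1 * (f - 1.0)  # contribution of the right lattice point
        return n0 + fade * (n1 - n0)

    def step(self):
        value = self.value_at(self.position)
        self.position += self.speed
        return value


def normalized_perlin(n_samples, rng, speed=0.1):
    """Sample a Perlin sequence and standardize it to zero mean and unit variance."""
    noise = PerlinNoise1D(rng, speed)
    raw = [noise.step() for _ in range(n_samples)]
    mean = sum(raw) / n_samples
    std = math.sqrt(sum((r - mean) ** 2 for r in raw) / n_samples)
    return [(r - mean) / std for r in raw]


rng = random.Random(0)
epsilon = normalized_perlin(2000, rng, speed=0.1)
mean_sq_step = sum((b - a) ** 2 for a, b in zip(epsilon, epsilon[1:])) / (len(epsilon) - 1)
print(f"mean squared step: {mean_sq_step:.3f} (i.i.d. N(0,1) samples would give about 2)")
```

In the reparameterization, each action would then be formed as $\mu + \sigma \varepsilon_t$ with $\varepsilon_t$ taken from this sequence instead of from i.i.d. Gaussian draws.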
![Screenshot from 2023-05-04 14-06-11](PCA_Plausability.assets/Screenshot from 2023-05-04 14-06-11.png)

<video src="../../Videos/Screencasts/Screencast from 2023-05-04 14-05-50.webm"></video>

We observe very nice, smooth motions. Pink Noise sometimes has a tendency to 'run off', since it can generate samples that are offset from the mean for a significant amount of time. Perlin follows the zero-mean behavior seen in White Noise 'more strictly', that is, within shorter time horizons. We therefore expect that it may be better suited than Pink Noise. The observed difference between Perlin and PCA with Perlin is small; PCA with Perlin seems to 'run off' even less. Differences in behavior should again become more apparent as soon as we add an actual RL part to the setup.

Similar to the ssf parameter found in SDE, our implementation of exploration based on Perlin exposes a speed parameter. Compositions of Perlin Noise with higher octaves could be used to induce higher-entropy behavior (not yet tested).
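A composition with octaves could be sketched as follows: each additional octave doubles the frequency and scales the amplitude by a `persistence` factor, which adds fine-grained detail on top of the slow base signal. This is untested in our setup; the function names, the shared gradient table, and the `persistence` and `speed` values are assumptions for illustration.

```python
import math
import random


def perlin_1d(x, gradients):
    """Classic 1D gradient noise; zero at integer lattice points."""
    i = math.floor(x)
    f = x - i
    g0 = gradients[i % len(gradients)]
    g1 = gradients[(i + 1) % len(gradients)]
    fade = f * f * f * (f * (f * 6.0 - 15.0) + 10.0)
    return g0 * f + fade * (g1 * (f - 1.0) - g0 * f)


def fractal_perlin_1d(x, gradients, octaves=3, persistence=0.5):
    """Sum `octaves` layers; each doubles the frequency and scales amplitude by `persistence`."""
    return sum(persistence ** o * perlin_1d(x * 2 ** o, gradients) for o in range(octaves))


rng = random.Random(0)
gradients = [rng.uniform(-1.0, 1.0) for _ in range(256)]  # shared by all octaves for brevity
speed = 0.1  # hypothetical speed parameter: distance moved through the noise per time step
for octaves in (1, 3):
    series = [fractal_perlin_1d(t * speed, gradients, octaves) for t in range(1000)]
    roughness = sum((b - a) ** 2 for a, b in zip(series, series[1:])) / (len(series) - 1)
    print(f"octaves={octaves}: mean squared step {roughness:.4f}")
```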
---
### What's next

Next up are tests of PCA on different Columbus environments. We want to compare White Noise / Pink Noise / Perlin / SDE with PCA based on each of these four underlying noises. We will prioritize which ones to test on based on the initial results.

### Backup Plan

In case PCA fails to provide good results, we could still test Perlin in comparison to Pink Noise (which seems to be the current SOTA), SDE, and episodic approaches.