Making the repo somewhat understandable to other readers...

Dominik Moritz Roth 2022-09-16 11:38:21 +02:00
parent 5cedffa473
commit d2c2343d08
4 changed files with 85 additions and 4 deletions

@@ -0,0 +1,45 @@
+env_args:
+  observable:
+    - type: State
+      coordsAgent: True
+      speedAgent: True
+      coordsRelativeToAgent: False
+      coordsRewards: True
+      coordsEnemys: False
+      enemysNoBarriers: True
+      rewardsTimeouts: False
+      include_rand: True
+    - type: State
+      coordsAgent: False
+      speedAgent: False
+      coordsRelativeToAgent: True
+      coordsRewards: True
+      coordsEnemys: False
+      enemysNoBarriers: True
+      rewardsTimeouts: False
+      include_rand: True
+    - type: Compass
+    - type: RayCast
+      num_rays: 8
+      chans: [Enemy]
+  entities:
+    - type: CircleBarrier
+      num: 8
+      num_rand: 6
+      damage: 20
+      radius: 25
+      radius_rand: 75
+    - type: TeleportingReward
+      num: 1
+      reward: 100
+      radius: 20
+  default_collision_elasticity: 0.8
+  start_score: 50
+  speed_fac: 0.01
+  acc_fac: 0.1
+  die_on_zero: True
+  agent_drag: 0.07
+  controll_type: ACC
+  aux_reward_max: 1
+  aux_penalty_max: 0.1
+  void_damage: 5
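
Each entry under `observable` and `entities` in the example configuration carries a `type` tag plus parameters for that type. The following is a minimal sketch of how such entries could be separated into a type name and keyword arguments. `split_spec` is a hypothetical helper written for illustration; it is not part of Columbus, and the real ColumbusConfigDefined may resolve these entries differently. The dictionary is a hand-written excerpt of the parsed configuration.

```python
def split_spec(spec):
    """Separate the 'type' tag from the remaining keyword arguments.

    Hypothetical helper: not taken from the Columbus code base.
    """
    kwargs = dict(spec)  # copy, so the original config entry stays untouched
    type_name = kwargs.pop("type")
    return type_name, kwargs


# Hand-written excerpt of the configuration, as it would look after parsing.
env_args = {
    "observable": [
        {"type": "Compass"},
        {"type": "RayCast", "num_rays": 8, "chans": ["Enemy"]},
    ],
    "entities": [
        {"type": "TeleportingReward", "num": 1, "reward": 100, "radius": 20},
    ],
}

observable_specs = [split_spec(s) for s in env_args["observable"]]
entity_specs = [split_spec(s) for s in env_args["entities"]]
```

Keeping the `type` tag separate from the parameters means a new observable or entity type only needs a new name in the lookup, not a new config schema.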

@@ -4,23 +4,48 @@
 <img src='./icon.svg'>
 </p>
 Project Columbus is a framework for trivial 2D OpenAI Gym environments that are supposed to test an agent's ability to solve tasks that require different forms of exploration effectively and efficiently.
 ## Installation
 (If you want to install Columbus as a dependency for metastable-baselines, activate (source) the venv from metastable-baselines before running this command.)
 ```
 pip install -e .
 ```
 ### env.py
 ![Screenshot](./img_README.png)
-Contains the ColumbusEnv. New envs are implemented by subclassing ColumbusEnv and expanding _init_ and overriding _setup_.
+Contains the ColumbusEnv.
+There are two ways to implement new envs:
+- Subclassing ColumbusEnv, expanding _init_ and overriding _setup_.
+- Using ColumbusConfigDefined with the desired configuration. This makes it possible to configure ColumbusEnvs via ClusterWorks2 configs. (See ColumbusConfigDefinedExample.md for an example of what the parameters should look like (YAML format); I don't have time to write better documentation right now...)
+##### Some caveats / infos
+- If you want to render to a window (pygame-gui), call render with mode='human'.
+- If you want to visualize the covariance, you have to supply the Cholesky decomposition of the covariance matrix to render.
+- If you want to render into an mp4, you have to call render with a mode != 'human' and assemble/encode the returned frames into an mp4/webm/... yourself.
+- Even while the agent plays, some keyboard inputs are possible (to test the agent's reaction to situations it would never enter by itself; look at \_handle_user_input in env.py for the available keys).
 ### entities.py
 Contains all implemented entities (e.g. the Agent, Rewards and Enemies)
+##### Some caveats
+- Support for non-spherical entities (rectangles) is very new. There might be bugs that I have not found yet.
 ### observables.py
 Contains all 'observables'. These are attached to envs to define what kind of output is given to the agent. This way, environments can be designed independently of the observation mechanism that the agent uses to play them.
 ### humanPlayer.py
 Allows environments to be played by a human using mouse input.
+##### Some caveats
+- CnnObservable seems to be broken currently. (Fixing it is also not a priority for me.)
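
The README caveat about rendering to a video says the frames have to be collected by the caller. The following is a minimal sketch of that collection loop, under two assumptions that the README does not confirm: the env follows the classic Gym interface (`reset()` returns an observation, `step()` returns a 4-tuple), and `"rgb_array"` is an accepted non-human mode (the README only requires a mode other than 'human'). `collect_frames` and `StubEnv` are hypothetical names; the stub only stands in for a real ColumbusEnv so the loop can run on its own.

```python
def collect_frames(env, policy, max_steps=1000, mode="rgb_array"):
    """Run one episode and return the frames produced by env.render."""
    frames = []
    obs = env.reset()
    for _ in range(max_steps):
        obs, _reward, done, _info = env.step(policy(obs))
        frames.append(env.render(mode=mode))
        if done:
            break
    return frames


class StubEnv:
    """Stand-in with a classic Gym-style reset/step/render interface."""

    def __init__(self, episode_len=3):
        self.episode_len = episode_len
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0

    def step(self, action):
        self.t += 1
        return float(self.t), 0.0, self.t >= self.episode_len, {}

    def render(self, mode="human"):
        # A real env would return an image array here.
        return f"frame-{self.t}"


frames = collect_frames(StubEnv(episode_len=3), policy=lambda obs: 0)
```

Encoding the returned frames into an mp4 or webm is left to whichever video library is at hand.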

@@ -309,9 +309,13 @@ class ColumbusEnv(gym.Env):
 L, V = th.linalg.eig(cov)
 L, V = L.real, V.real
 w, h = int(abs(L[0].item()*f))+1, int(abs(L[1].item()*f))+1
-# TODO: Is this correct? We try to solve for the angle from this:
+# In theory we would have to solve:
 # R = [[cos, -sin],[sin, cos]]
-# Via only the -sin term.
+# But we only use the -sin term.
+# Because of this, our calculated angle might be off
+# by multiples of 180°.
+# But since an ellipsoid does not change under such an 'error',
+# we don't care.
 # ang1 = int(math.acos(V[0, 0])/math.pi*360)
 ang2 = int(math.asin(-V[0, 1])/math.pi*360)
 # ang3 = int(math.asin(V[1, 0])/math.pi*360)
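
The comment in this hunk relies on the fact that rotating an ellipse by 180° leaves it unchanged. The following sketch checks that claim numerically with plain floats instead of torch tensors. Note that it uses a different technique from the `asin` call in the hunk: the closed-form orientation of the major axis of a 2x2 covariance matrix via `atan2`. Both function names are hypothetical and not part of env.py.

```python
import math


def cov_from_axes(angle_deg, var_major, var_minor):
    """Return (cxx, cxy, cyy) of R @ diag(var_major, var_minor) @ R.T,
    with R = [[cos, -sin], [sin, cos]] written out for the 2x2 case."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    cxx = c * c * var_major + s * s * var_minor
    cyy = s * s * var_major + c * c * var_minor
    cxy = c * s * (var_major - var_minor)
    return cxx, cxy, cyy


def major_axis_angle_deg(cxx, cxy, cyy):
    """Orientation of the major axis in degrees, in the range (-90, 90]."""
    return math.degrees(0.5 * math.atan2(2.0 * cxy, cxx - cyy))


# The same ellipse, described once with a 30° rotation and once with 210°.
cov = cov_from_axes(30.0, 4.0, 1.0)
cov_flipped = cov_from_axes(210.0, 4.0, 1.0)
recovered = major_axis_angle_deg(*cov)
```

Both rotations produce the same covariance entries (up to floating-point error), so any angle recovered from the covariance alone can only be determined up to a multiple of 180°.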

@@ -29,6 +29,7 @@ class Observable():
 class CnnObservable(Observable):
+    # Currently broken...
     def __init__(self, in_width=256, in_height=256, out_width=32, out_height=32, draw_width=128, draw_height=128, smooth_scaling=True):
         super(CnnObservable, self).__init__()
         self.in_width = in_width
@@ -195,6 +196,7 @@ class RayObservable(Observable):
 class StateObservable(Observable):
+    # Whitelists probably don't work...
     def __init__(self, coordsAgent=False, speedAgent=False, coordsRelativeToAgent=True, coordsRewards=True, rewardsWhitelist=None, coordsEnemys=True, enemysWhitelist=None, enemysNoBarriers=True, rewardsTimeouts=True, include_rand=True):
         super(StateObservable, self).__init__()
         self._entities = None
@@ -287,6 +289,9 @@ class StateObservable(Observable):
 class CompassObservable(Observable):
+    # Useful for navigation close to a reward.
+    # Works like the StateObservable, but assigns a bigger range of possible input values to those that are close to zero.
+    # I found that agents without such an Observable often moved close to a reward and then just jiggled around; adding a CompassObservable fixes this.
     def __init__(self, coordsRewards=True, rewardsWhitelist=None, coordsEnemys=False, enemysWhitelist=None, enemysNoBarriers=True):
         super().__init__()
         self._entities = None
@@ -355,6 +360,8 @@ class CompassObservable(Observable):
 class CompositionalObservable(Observable):
+    # Used whenever you want to attach multiple Observables to an Env.
+    # We currently flatten the outputs of all attached Observables, so using a CNN through a CompositionalObservable would lead to problems.
     def __init__(self, observables):
         super().__init__()
         self.observables = observables
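
The comment on CompositionalObservable says the outputs of all attached Observables are flattened. The following sketch shows why that is a problem for image-shaped outputs. It uses plain Python lists and hypothetical helper names (`flatten`, `compose`), and is not the actual implementation in observables.py.

```python
def flatten(value):
    """Recursively flatten nested lists/tuples into a flat list of floats."""
    if isinstance(value, (list, tuple)):
        out = []
        for item in value:
            out.extend(flatten(item))
        return out
    return [float(value)]


def compose(observations):
    """Concatenate the flattened outputs of several observables."""
    flat = []
    for obs in observations:
        flat.extend(flatten(obs))
    return flat


state_obs = [0.25, -0.5]      # vector-shaped output, e.g. coordinates
image_obs = [[0, 1], [1, 0]]  # 2x2 'image', as a CNN would expect it
combined = compose([state_obs, image_obs])
```

After composition the 2x2 layout is gone: the result is one flat vector of six values, and a convolutional layer can no longer tell which values were neighbouring pixels.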