
Project vacuumDecay

Project vacuumDecay is a framework for building AIs for games.
The available architectures are (a minimal search sketch follows the list):

  • those used in Deep Blue (mini-max / expecti-max)
  • advanced expecti-max exploration based on utility heuristics
  • those used in AlphaGo Zero (knowledge distillation using neural networks)
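
For a rough idea of what the search-based architectures do, here is a minimal, generic mini-max sketch with a heuristic cut-off. The state interface it assumes (winner, heuristic, legal_moves, apply) is a placeholder for illustration, not this framework's actual API:

    # Generic mini-max with a heuristic cut-off (illustration only).
    # The state interface (winner, heuristic, legal_moves, apply) is assumed.
    def minimax(state, depth, maximizing=True):
        w = state.winner()                  # assumed to be None while the game runs
        if w is not None:
            return 1.0 if w == 0 else -1.0  # terminal score from player 0's view
        if depth == 0:
            return state.heuristic()        # cut off with a utility estimate
        values = [minimax(state.apply(m), depth - 1, not maximizing)
                  for m in state.legal_moves()]
        return max(values) if maximizing else min(values)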

A new AI is created by subclassing the State class and defining the following functionality (mycelia.py provides a template; a rough sketch follows this list):

  • initialization (generating the gameboard or similar)
  • getting the available actions for the current situation (returned as Action objects, which can be subclassed to add additional functionality)
  • applying an action (the state itself should be immutable, a new state should be returned)
  • checking for a winning-condition (should return None if game has not yet ended)
  • (optional) a getter for a string-representation of the current state
  • (optional) a heuristic for the winning-condition (greatly improves capability)
  • (optional) a getter for a tensor that describes the current game state (required for knowledge distillation)
  • (optional) interface to allow a human to select an action
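
As a rough, hypothetical sketch of what such a subclass could look like, consider a toy game (Nim: players alternately take 1-3 stones; whoever takes the last stone wins). The import path, constructors, and method names below are illustrative assumptions only; mycelia.py is the authoritative template for the real interface:

    # Hypothetical State subclass for Nim. The import path, Action signature,
    # and method names are assumptions; see mycelia.py for the actual template.
    from vacuumDecay import State, Action

    class NimState(State):
        def __init__(self, stones=15, curPlayer=0):
            self.stones = stones            # initialization (the "game board")
            self.curPlayer = curPlayer

        def getActions(self):
            # available actions for the current situation
            return [Action(self.curPlayer, take)        # Action signature assumed
                    for take in (1, 2, 3) if take <= self.stones]

        def mutate(self, action):
            # the state itself stays immutable; a new state is returned
            return NimState(self.stones - action.data, 1 - self.curPlayer)

        def checkWin(self):
            # None while the game has not ended, otherwise the winning player
            return None if self.stones > 0 else 1 - self.curPlayer

        def __str__(self):
            # optional string representation of the current state
            return f"{self.stones} stones left; player {self.curPlayer} to move"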

Current state of the project

The only thing that currently works is the AI for Ultimate TicTacToe.
It uses a trained neural heuristic (a "neuristic").
You can train it, or play against it (which also trains it), using 'python ultimatetictactoe.py'.

The performance of the trained neuristic is pretty bad. I have some ideas about what the problems could be, but no time to implement fixes
(focusing on the endings of games at the beginning of training; a more consistent exploration depth when expanding during training; ...).