Thursday, 8 June 2017

Fractal vs Pac-Man

Last week my friend Guillem adapted the fractal AI to the OpenAI Gym Atari games (OpenAI Gym is a "gym" for AIs). In particular, he focused on "Ms. Pac-Man", an environment labeled as "unsolved" as I write this.

Yesterday the work was almost done and the first videos came out of the pipeline. To be honest, the results astonished me: it worked far beyond my always-optimistic high expectations.

So here is the video that made me so happy yesterday:




Just to put it in context: your algorithm can only "see" the game screen, and it can push (or not) any of the four buttons available. Nothing more. From this information alone, build an algorithm that presses the buttons "intelligently" so the game's score gets as high as you can, as soon as you can. Pretty hard.

After adapting our algorithm to the Atari APIs, we set up a very modest fractal of only 105 walkers and allowed it to "see" only 2 seconds into the future. Not much, to be fair.

The algorithm does not understand what the purpose of the game is, how many players are on the screen, or anything else about the game itself. It gets nothing but the images of the game as a raw integer array and the corresponding score (smartly extracted from the screenshots).
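This post does not describe how the fractal's walkers actually work, so the sketch below is not Fractal AI. It is a plain random-rollout lookahead, shown only to illustrate what "a number of walkers looking a few steps into the future" can mean when all you get back is a score. The `ToyCorridor` game, the function names, and the horizon of 20 steps are hypothetical; only the default of 105 walkers is borrowed from the text above.

```python
import copy
import random


class ToyCorridor:
    """Hypothetical stand-in for a game: a 1-D corridor with pellets to eat."""

    ACTIONS = (-1, 0, +1)  # move left, stay, move right

    def __init__(self, length=15, start=7):
        self.length = length
        self.pos = start
        self.pellets = set(range(length)) - {start}
        self.score = 0

    def step(self, action):
        """Apply one action and return the reward gained on this step."""
        self.pos = min(self.length - 1, max(0, self.pos + action))
        if self.pos in self.pellets:
            self.pellets.remove(self.pos)
            self.score += 10
            return 10
        return 0


def choose_action(env, n_walkers=105, horizon=20, rng=random):
    """Return the first action whose random rollouts scored best on average.

    Assumption: this is ordinary random-rollout planning, not Fractal AI.
    """
    totals = {a: 0.0 for a in env.ACTIONS}
    counts = {a: 0 for a in env.ACTIONS}
    for i in range(n_walkers):
        first = env.ACTIONS[i % len(env.ACTIONS)]  # spread walkers over actions
        walker = copy.deepcopy(env)                # each walker gets a private copy
        gained = walker.step(first)
        for _ in range(horizon - 1):               # then it wanders at random
            gained += walker.step(rng.choice(env.ACTIONS))
        totals[first] += gained
        counts[first] += 1
    return max(env.ACTIONS, key=lambda a: totals[a] / counts[a])


def play(steps=60, seed=0):
    """Play the toy game with the lookahead planner and return the final score."""
    rng = random.Random(seed)
    env = ToyCorridor()
    for _ in range(steps):
        env.step(choose_action(env, rng=rng))
    return env.score
```

Calling `play()` runs 60 decisions on the toy corridor and returns the final score. The planner never looks at where the pellets are. It only copies the game state, tries actions, and compares the scores that come back, which is the same "blind" setting described above.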

The algorithm is able to score as much as 20,000 points and solve four screens without any previous training. Surely, with more walkers and more seconds it could beat this score by far, but this is not where the power of the idea lies.

If you use those "pretty good games" as examples to train a standard neural network, you could make it learn much faster than today's methods based on random games.

Conversely, if the fractal AI could use this trained neural network to get smarter by "learning" from past decisions, its "efficiency" could jump by some orders of magnitude (the example above is not using any neural network).

And so we have built a nice circular process: better example games mean a better neural network, which in turn means a better fractal AI, which will generate even better games, which will make the NN smarter, and so on.

The tests on this strange mix are already being run, and my expectation sub-routines are disabled until tomorrow.

Update (27/06/2017): It is going to take more than a day, so in the meantime, here are some other Atari games from OpenAI Gym. All of them were played from the RAM dump and, of course, without any training. For comparison purposes, I am including the average score from the Fractal AI games vs the second "best-so-far" algorithm on OpenAI Gym:

 
Qbert-ram-v0 (184k vs 4k)


Tennis-ram-v0 (8 vs 0.01)


VideoPinball-ram-v0 (500k vs 20k)

5 comments:

  1. If this isn't some weird joke - please share, what is fractal AI?

    1. It is NOT a joke. Fractal AI is a new algorithm based on entropy that can solve all the Atari games, using either RAM or images, with the same code and no training. So far we have tried 4 or 5 games, so maybe some of them will turn out not to be solvable, who knows!

      We are on a quest to actually solve them all, but there are 118 of them, you need 100 games of each to enter, and yesterday it took about 2 hours to play a game. That is 118 × 100 × 2 hours, about 23,600 hours, almost 3 years of CPU time. We need to cut those times first.

      Today we have cut it to about 1/10 and are now running at roughly real-time play, so we will post more examples in a few days.

      After officially solving most of them, we will publish the Python code on GitHub.
