Thursday, 27 February 2014

Video 1 - The basics.

Before this blog, I used to publish videos on a YouTube playlist showing how the AI was getting better at this or that; it gave me a way to get comments on the subject.

Now that the blog is up and working, I would like to start by reviewing all those "old videos" and explaining how the algorithm worked at the time, how good it was, and which things needed to be changed.

So today we will comment on the first video. It is the most important one to understand in order to get the whole idea of the algorithm, so read carefully and comment on any aspect you feel is not clear in this post. I will do my best to help.

So start by watching the video:

As you can see, we have a simple case: a track with a kart on it. The kart is also quite simple: it is always accelerating (it has no brakes) and you just have a joystick for left/right turning (that is, the system has only one degree of freedom). The AI must decide where it will "push" the joystick on every frame; that's our goal.

1) Options = Possible decisions

The first thing you need in order to build the AI is a list of the "options" or "decisions" you are going to weigh. In this simple case, we will consider only two possible options/decisions: push the steering wheel left by adding 5º (+5º), or push it right by subtracting 5º (-5º). In the video, each one is represented by blue or red lines, respectively.

2) Create random futures for each decision

The second thing to do is to "imagine" a bunch of, let's say, 100 futures for each possible initial decision.

For instance, take the decision "push +5º": you simulate one frame (in my case, a frame is 0.1 seconds) with this "push" applied, so the simulation code tells you where the kart will be after 0.1 s of pushing the steering wheel +5º. It will be a little ahead and to the left of its current position, and surely rotated a little counterclockwise.

From this "lefty" position, you continue simulating until you get to time + 10 s (those 10 seconds are a parameter: how far into the future you want to simulate), but on all subsequent frames, instead of "pushing +5º", you choose the "push" randomly in the range +5 to -5. This makes the kart drive randomly, and it corresponds to the blue lines in the video.

You repeat it 100 times, so you end up with 100 blue lines, representing 100 possible futures that start by turning the kart +5º to the left. Those 100 blue lines form a "blue flame" in front of the kart, as you can see in the video.

Now you repeat with the second initial decision, pushing -5º to the right. In the same way, we get 100 more futures, all of them painted in red in the video.
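As a rough illustration of step 2, here is a minimal Python sketch. The frame time, the 10 second horizon, the ±5º range and the 100 futures come from the text above; everything else is an assumption made up for this example: the toy `step` physics (constant speed, the push simply added to the heading each frame), the `SPEED` value and all the function names. It also ignores the track, so no future ever crashes here.

```python
import math
import random

FRAME_DT = 0.1   # seconds per simulated frame (from the post)
HORIZON = 10.0   # seconds simulated into the future (from the post)
MAX_PUSH = 5.0   # degrees; pushes range from -5 to +5 (from the post)
SPEED = 20.0     # ASSUMPTION: constant speed, in pixels per second


def step(state, push_deg):
    """Advance a toy kart by one frame. state is (x, y, heading_deg).

    ASSUMPTION: the push is added directly to the heading; the real
    kart physics are not described in the post.
    """
    x, y, heading = state
    heading += push_deg  # a positive push turns left (counterclockwise)
    x += SPEED * FRAME_DT * math.cos(math.radians(heading))
    y += SPEED * FRAME_DT * math.sin(math.radians(heading))
    return (x, y, heading)


def random_future(state, first_push, rng):
    """Apply first_push for one frame, then random pushes up to the horizon."""
    state = step(state, first_push)
    frames = round(HORIZON / FRAME_DT)
    for _ in range(frames - 1):
        state = step(state, rng.uniform(-MAX_PUSH, MAX_PUSH))
    return state


def futures_for(state, first_push, count=100, seed=0):
    """Return the end states of `count` random futures for one decision."""
    rng = random.Random(seed)
    return [random_future(state, first_push, rng) for _ in range(count)]


start = (0.0, 0.0, 0.0)
blue = futures_for(start, +MAX_PUSH)  # futures that begin by turning left
red = futures_for(start, -MAX_PUSH)   # futures that begin by turning right
```

Only the end state of each future is kept here; to draw the "flames" you would also store the intermediate positions.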

3) Counting different futures

With all those futures found for one of the two possible decisions, we now need to discard the similar ones in order to get the list of "different" futures that taking this decision could bring us.

To do this, you need to know which of the kart's parameters are going to be considered "important" when comparing end positions. I decided to use only the position of the kart (PosX, PosY) and not the angle. For two futures to be considered "similar", their positions, rounded to a given precision, must be equal.

So, if future #1 ended up at position (234.4, 187.0), here in plain pixel coordinates, and I am using a precision of 5, it means this future "roughly" ends at (235, 185), and this rounded position is the one to compare with other futures in order to get a list of different ones.

So, after discarding duplicate futures, maybe you end up with 35 different futures for decision 1 (turning left +5º) and 15 for the other decision (turning right -5º).
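The counting in step 3 can be sketched in a few lines of Python. This is only an illustration: the function names are made up, and it assumes that "rounded to a given precision" means rounding each coordinate to the nearest multiple of the precision.

```python
import math


def round_to(value, precision):
    """Round value to the nearest multiple of precision (halves round up)."""
    return precision * math.floor(value / precision + 0.5)


def count_different_futures(end_positions, precision=5):
    """Count end positions that are still distinct after rounding."""
    cells = {(round_to(x, precision), round_to(y, precision))
             for x, y in end_positions}
    return len(cells)


# Three futures; the first two fall in the same rounded position (100, 50).
ends = [(101.2, 48.9), (99.0, 51.3), (120.4, 48.9)]
different = count_different_futures(ends)  # 2
```

Using a set of rounded (x, y) pairs makes the duplicates disappear on their own, so no pairwise comparison between futures is needed.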

4) Deciding

Well, everything is ready. The AI only needs to decide by averaging the 2 possible decisions (+5 and -5), each weighted by its number of different futures divided by the total number of different futures found, so that the weights sum to 1.

Decision = +5 * (35/(35+15)) - 5 * (15/(35+15)) = 5*35/50 - 5*15/50 = 5*0.7 - 5*0.3 = 5*0.4 = 2

So, the "intelligent" decision, in this case, is to turn +2º to the left!
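The same weighted average, written as a small Python sketch. The function name and the fallback of returning 0 when no different future was found at all are assumptions of this example, not something the original code is known to do.

```python
def decide(different_futures):
    """Average the candidate pushes, weighted by their different futures.

    different_futures maps each candidate push (in degrees) to the number
    of different futures found for it.
    """
    total = sum(different_futures.values())
    if total == 0:
        return 0.0  # ASSUMPTION: keep the wheel centered if nothing was found
    return sum(push * count / total
               for push, count in different_futures.items())


decision = decide({+5.0: 35, -5.0: 15})  # 5*0.7 - 5*0.3 = 2.0
```

Because the weights sum to 1, the result always stays between -5 and +5, and the same function would work unchanged with more than two candidate pushes.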

5) Loop on it

Well, the work is done. Now you just simulate the kart after applying the decision, and the kart moves on screen. You are then ready to go back to 2) and start over from this new starting position.

That's all the algorithm is doing: counting the different blue and red dots and heading left if there are more blue dots than red ones.


-A future that crashes into the fences was supposed to be a "bad" future, and its final point is not even drawn; only non-crashing futures are considered valid. It was a really bad decision on my part; we will come back to this in future posts.

-Each different future found counts as 1. A longer future (one that takes the kart far away from its initial position) counts as much as another future where the kart crashes shortly after starting and only races for 1 meter. It is simple and it works, but it is far from optimal. Scoring the futures will be one of the biggest improvements in the next versions.
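Putting the five steps and the two notes above together, the whole loop might look like the following Python sketch. It is a toy, not the blog's actual code: the kart physics, the `SPEED` value, the straight corridor used as a "track" and every function name are assumptions, and the look-ahead is shortened from 10 s to 2 s to keep it quick. Futures that hit a wall are dropped, and each different end position counts as 1.

```python
import math
import random

FRAME_DT = 0.1     # seconds per frame (from the post)
HORIZON = 2.0      # look-ahead in seconds; the post uses 10, shortened here
MAX_PUSH = 5.0     # degrees (from the post)
FUTURES = 100      # futures per decision (from the post)
PRECISION = 5      # rounding precision in pixels (from the post)
SPEED = 50.0       # ASSUMPTION: constant speed, pixels per second
HALF_WIDTH = 20.0  # ASSUMPTION: the track is the corridor |y| < HALF_WIDTH


def step(state, push):
    """Toy physics: add the push to the heading, then move forward."""
    x, y, heading = state
    heading += push
    rad = math.radians(heading)
    dist = SPEED * FRAME_DT
    return (x + dist * math.cos(rad), y + dist * math.sin(rad), heading)


def crashed(state):
    return abs(state[1]) >= HALF_WIDTH


def end_cell(state, first_push, rng):
    """Rounded end position of one random future, or None if it crashes."""
    state = step(state, first_push)
    for _ in range(round(HORIZON / FRAME_DT) - 1):
        if crashed(state):
            return None
        state = step(state, rng.uniform(-MAX_PUSH, MAX_PUSH))
    if crashed(state):
        return None
    return (PRECISION * math.floor(state[0] / PRECISION + 0.5),
            PRECISION * math.floor(state[1] / PRECISION + 0.5))


def count_different(state, first_push, rng):
    """Steps 2 and 3: simulate the futures and count the different ones."""
    cells = {end_cell(state, first_push, rng) for _ in range(FUTURES)}
    cells.discard(None)  # crashed futures are not counted
    return len(cells)


def decide(state, rng):
    """Step 4: average the two pushes, weighted by their different futures."""
    counts = {push: count_different(state, push, rng)
              for push in (+MAX_PUSH, -MAX_PUSH)}
    total = sum(counts.values())
    if total == 0:
        return 0.0  # ASSUMPTION: no valid future at all, keep going straight
    return sum(push * n / total for push, n in counts.items())


def drive(state, frames, seed=0):
    """Step 5: decide, apply the decision for one frame, repeat."""
    rng = random.Random(seed)
    path = [state]
    for _ in range(frames):
        state = step(state, decide(state, rng))
        path.append(state)
        if crashed(state):
            break
    return path


path = drive((0.0, 10.0, 0.0), frames=5)
```

Printing `path` for a longer run, or changing `HALF_WIDTH` and the starting `y`, is an easy way to watch how the counts pull the kart around in this toy setup.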

Wednesday, 26 February 2014

Welcome to the Entropic AI blog

In this, my first post, I would like to introduce you to the history behind this kart simulation.

Back in April 2013, I read an article (big thanks to José Elias for the greatest blog) about a new approach to artificial intelligence based solely on entropy concepts, driven just by thermodynamic laws, that was surprisingly good at making physical systems of any kind "behave intelligently": Causal Entropic Forces by Alexander D. Wissner-Gross.

It caught my attention, so I jumped from link to link in search of more insight. Reading those links made me understand the idea even before taking a look at the original paper, and when I read the word "Monte Carlo", the algorithm popped into my mind.

So I sat down and coded a quick and dirty approach to the algorithm in a couple of days, resulting in the first version of the kart simulator. It was very, very simple, but the AI managed to drive the kart on the track quite impressively... and I didn't code anything like "run" or "drive inside the track".

After this first success, I started to really investigate the paper, trying to get down to the formulas to learn more about the correct way to do it, but I was not really prepared to understand it all, so I focused on the kart example, trying to make it better on my own.

Eventually, the AI became as good at driving this kart as anyone can be, seriously. Small "imperfections" in how the kart is driven are usually my fault, as the simulation code for the kart physics is just a bad "proof of concept" and is not that realistic, so the AI does it right for the pseudo-realistic simulation I gave it.

Actually, V0.7 of the algorithm follows almost 100% of the paper's formulas properly, as far as I can tell, with some simplifications and also some additions of my own.

So how good can this AI be? Well, basically you can make the AI manage any kind of "thing" you can simulate, as long as you let it decide on one or more "free parameters" or degrees of freedom.

It means that, if you have a drone, a lunar rover, a probe to a distant sun, an industrial process you want to optimize, a simulation of an amoeba... just anything, as long as you let the AI push the "joysticks" and provide a way to simulate how the system will evolve over a small time step, the AI will move the joysticks in such a way that the drone will fly in an "intelligent way", whatever this means. You don't need to give it a goal, nor instructions on how to drive the drone; just let it play for you.

I think this is such an important step in AI that, in the future, it may become as important as today's neural networks are, and who knows, if both approaches are mixed together the right way, we could get magic from them.

So I am opening up all my findings, thoughts, videos, code and exes so anyone can play with this and invent new ways to apply it (games, industrial optimizations...), while sharing findings here so everyone can benefit from them.

Please feel free to download and play with the files, comment on them, ask about anything, share your wildest thoughts, modify or torture the code, convert it to other languages, etc. I will appreciate it!