Tuesday, 22 April 2014

Layers and layers of intelligence.

These days I have been busy ironing out the ideas about how different "levels" of entropic intelligence could be layered, one over the other, to make up our complex and sophisticated mind.

I have come across a very simple -once you get the idea- way to arrange it as a stack of layers of "common sense", each placed on top of the previous one.

There are at least two ways of explaining it: algorithmic (for programmers) or entropy laws (for physicists), so I will focus first on the algorithmic aspect, so you can build your own "multilayered common sense intelligence machine" if you are in need of one (I am already working on it, but it is still far from done).

So let's go for it the easy way, using the good old "kart simulation" example.

To be clear about the problem we are facing, I will go straight to the point: using 100 crazy blind monkeys to randomly drive the kart in the 100 futures I have to imagine, and then taking a "common sense" decision based on the things those crazy monkeys did, maybe, just maybe, was not such a clever idea after all.

Using "crazy blind monkeys" (or totaly random decision, if you prefer not to joke about serious things) is the limiting factor in the simulations: you can not simulate more than 6 or 10 seconds using them, no monkey will survive more than 10 seconds, so it is limiting us to only "10 seconds ahead" strategies.

Watching the first videos I created, I always wondered why I couldn't just pump the length of the imagined futures up to 10 minutes (instead of 6 seconds) and get an intelligence that starts the race and makes every movement thinking only about how to win it... instead of how to take the next turn without breaking its neck.

Those crazy monkeys needed a replacement, but what can replace crazy monkeys at generating common sense and do it even better? No kind of heuristic could do it, not inside my computer!

Then I realised I needed to use a somehow lower level of this AI to get an approximate path the monkeys would follow: give the monkeys a little of this intelligence medicine and watch them evolve (yes, like in Planet of the Apes!). But was it possible to build a lower version of the "common sense" algorithm and give it to the monkeys to drink? Nope.

So what I needed to give them was the full intelligence somehow, but this hides a big problem: if you want the kart to be driven with common sense while you are imagining a future, you need to account for the other players around you to avoid crashing into them, so you would need to know which intelligent drive they will perform in your imagined future... but they need to know your intelligent drive before they can decide. You have a dead loop: you need to know the whole to be able to know a part of it.

But I had the solution in front of my eyes all the time: I have a machine that can take intelligent decisions in ANY system you can simulate, and this AI, built on top of the crazy monkeys, can safely drive through a couple of minutes of racing without crashing, so I can recursively use the same algorithm again and again.

Imagine you pack up the whole simulation we were using -the karts, the futures, the 100 crazy monkeys, the resulting AI helping the karts to survive the drive- all this is now "your system", and this new system is far more stable than the previous one (the kart with crazy monkeys that only lasted a few seconds).

We have almost all we need, but there is something missing: the goals, the metric in the new phase space.

In the last post I showed you goal combos that made up quite nice intelligences, with some params you could play with: how much you love racing fast was the main positive goal, then how strong your tendency to save energy is, and finally how strong your tendency to keep your health high is.

Those are 3 free params I played with manually, trying to find the perfect combination. But what if I ask a second layer of intelligence to take this "macro simulation", simulate it in steps of 1 second to construct futures that last 1 or 2 minutes, and let this intelligence manage these new "joysticks" -tied to the "love racing" or "hate being out of energy" strength params- adjusting them in real time?

That is the raw idea, and here is the result (not as a video, I will need some more coding time for that): you have hired a track engineer to assist the driver from the pit wall.

This engineer will simulate the race on his laptop, not at the pilot's level of deciding on a millisecond time scale, no: he will simulate the long-term evolution of the race and send messages to the driver like "we need to reduce fuel consumption, adjust the keep-energy goal from its current strength of 0.5 up to 0.7 as, in the long term, it is better than your current settings".

So this second layer works exactly like the first one, but its "system" is not the kart and the track, it is the first layer as a whole. The params this layer decides on are of a higher level, as the engineer will score things like "overtake this other kart". The engineer's goal could be something like Score = 1/race_position, so if he gets the kart to jump from position 3 to position 2 within the 2-minute time horizon he uses, that future scores 1/2, while another future in which the kart couldn't overtake scores 1/3.

The effect is that the 3 params I used to manually set before a race are now intelligently adjusted every second, using 100 simulations of a minute or two of the race as it would unfold with only layer 1.
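If you prefer it in code, here is a minimal sketch of the idea in Python (everything here is hypothetical and over-simplified: simulate_race is a toy stand-in for the whole layer-1 system, and the param names are just for illustration):

    import random

    def simulate_race(weights, seconds=120):
        # Toy stand-in for the whole layer-1 system (kart + 100 random
        # futures + common sense): it should return the final race
        # position after a couple of minutes of racing with the given
        # goal strengths. Here it is faked with a noisy preference for
        # some arbitrary "good" settings, just so the sketch runs.
        balance = 1.0 - abs(weights["race"] - 0.8) - abs(weights["energy"] - 0.6)
        return max(1, round(5 - 4 * balance + random.uniform(-1, 1)))

    def track_engineer(current, n_futures=100):
        # Layer 2: the very same entropic algorithm, but its "system" is
        # layer 1 as a whole. It imagines macro-futures where the three
        # joysticks drift at random, scores each future with the
        # engineer's goal Score = 1/race_position, and keeps the
        # settings of the best one.
        best_score, best = -1.0, current
        for _ in range(n_futures):
            w = {k: min(1.0, max(0.0, v + random.uniform(-0.2, 0.2)))
                 for k, v in current.items()}
            score = 1.0 / simulate_race(w)
            if score > best_score:
                best_score, best = score, w
        return best

    weights = {"race": 0.5, "energy": 0.5, "health": 0.5}
    weights = track_engineer(weights)  # layer 2 retunes layer 1's goals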

This can be repeated many times: a "strategy engineer" could sit on top of the "track engineer", telling him "don't try to overtake Hamilton in the next minutes; in five laps it will be pit-stop time, and it is far more convenient to try to overtake him there".

And then you could add a "team manager" that gives the "strategy engineer" higher-level orders, like "don't try to win the race if you are going to waste 2 engines, we need to keep them for the next races"... and so on.

Finally, Ecclestone (the F1 owner) could send a message to all of them like "we need to lower the team budget limits to allow new teams to come in, as we need them for the health of the business, the highest goal for me: if we don't generate revenues, we will have to close the F1".

As a bonus, all those layers "protect" your kart from being so afraid of crashing that it freezes, as the second-layer engineer will notice and adjust the fear down to a lower strength... in the long term, successive layers make negative goals less and less dangerous, even unnecessary... well, that is my current bet.

So I need to make my current algorithm recursive and then just add more layers and watch the results... that's all!

Oops! I forgot! What about the "entropic" side of all this?

Having a new layer makes you predict the future over a longer term, so you take care of producing entropy in a more efficient way, which in turn makes you much "smarter".

If you want a real-world case of this "adding a new layer" process, here is one:

Humanity was never able to predict what would happen to it over a 200-year horizon, but now it is starting to do so, and a new concept has emerged: we must act with "sustainability" or we won't survive more than 100 or 200 years from now.

By adding the goal of "being sustainable" we have a kart (or a humankind) able to produce entropy at a nice pace for the next 200 years or more. It is far more optimal than using all the available energy in the next 10 years and then ceasing to exist.

Entropy is now being created in the best way considering the next 200 years instead of 10.

Negative goals

We have seen how "common sense" works and how to bend it to our liking by adding positive and reduction goals; the video clearly showed the benefit of the mix of goals used, but are they enough to avoid danger, or do we need something more... powerful?

Negative goals are quite natural for us: if the kart lowers its health by 10%, you can think of it as a mere "reduction" applied to the positive goals -distance raced squared in this case- or as something purely negative: a -10 in the final score.

If we try to get the same results as in the previous video but using some sort of negative goals, we end up with something odd: the fear is fine in some really dangerous situations, where it helps you avoid them effectively, but too much fear -a big negative score arising at some moment- will make the "common sense" freeze. You have added a "phobia" to something.

So I would suggest not using them if you can live without them; instead, use combinations of positive and reduction goals. I suppose negative goals are always avoidable, but that is a personal intuition more than a golden rule to strictly follow.

Anyhow, here you have a video where this "fear" clearly appears; there are more similar videos, and all have in common that I was playing around with some sort of negative goal.

In this particular case, I was trying to avoid crashing after adding bouncing to the simulation. I still didn't have a way to measure the energy of the impacts, so I could not yet use health to avoid it.

My idea was to use "how long did you survive in this future" to get it: if you are simulating 4 seconds into the future and a kart crashed at the final moment, 4 seconds away from the starting point, then it scores zero, but if you crash at second 2, it scores -1, and if you crash at second 1, it scores -10, and so on. The exact formula you use is not important, just something negative that gets really large -but negative- when the crash occurs closer to the starting moment. Score = Log(time_lived/time_simulated) can do the work, as long as you avoid Log(0) somehow.
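In code it is almost a one-liner (a sketch; the epsilon is just one arbitrary way of dodging Log(0)):

    import math

    def crash_score(time_lived, time_simulated, epsilon=0.01):
        # Score = Log(time_lived / time_simulated): zero if the kart
        # survived the whole imagined future, more and more negative
        # the closer to "now" the crash happened.
        return math.log(max(time_lived, epsilon) / time_simulated)

    print(crash_score(4.0, 4.0))  #  0.00 -> survived all 4 seconds
    print(crash_score(2.0, 4.0))  # -0.69 -> crashed halfway
    print(crash_score(0.5, 4.0))  # -2.08 -> crashed almost at the start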

This trick is used in the second video below; in this first simulation I will show you a better variation: when the kart runs off-track, I simulate an engine cut-off, and the kart keeps running by inertia and then stops due to high friction. The distance raced after the engine cut-off scores negatively. It was an easy way to use the impact energy, and it worked fine.

And these are the resulting behaviours: the white kart doesn't score dying negatively at all; it is fearless, and sometimes this plays against it and it leaves the track. The yellow one has a "low fear" coefficient, and it turns out to be the most reliable of the three. Finally, the grey kart feels this fear doubled, and it makes it refuse to pass through some narrow paths, even when it can perfectly fit.


But using negative goals is not evil, it can serve a purpose: filling the holes in the intelligence and forcing it to avoid risk. This is what made the yellow one the best of the three after all.

In this second video we use five karts to fine-tune the best fear level (this time using the "time lived was too short" fear commented on earlier): a white kart with no fear competes with four other ones, each with more and more fear of crashing. Yellow is the bravest of them, followed by the orange, red and black karts. The black kart is a little too "cowardly" and tends to avoid dangerous paths at the cost of speed.


Surprisingly -or maybe not- there is a "sweet spot" around the fear we gave the orange one. Being fearless drives the white one into trouble at some tight turns, while the darker karts are so afraid of colliding that sometimes they almost stop before deciding.

So adding a little fear into the combination of goals shown in the last post could make a more reliable combo after all... I will need to run more tests on this and find some "right way" to combine the three kinds of goals (right now it only works well with positive goals plus a second kind of goal, not the three at the same time).

Anyway, finding such a way of combining the three kinds of goals may not be the right path. The intelligence can be improved to the point where it doesn't need negative goals anymore, and even the reduction goals can be forgotten.

We would then have built a true "metric" over the phase space, a more sophisticated one than those shown here, and the "common sense" would shine without the limiting fears of both kinds, negatives and reductions.

But that will be for a following post...

Monday, 21 April 2014

"Reduction" goals

In the last post we described "common sense" and how to use it with positive goals, but I also commented on how badly we need to learn to deal with negativeness: as it has always been said, a little fear is good.

In the physical layer of the algorithm we always talk about entropy, and this time is no different, so let's go down to basics to understand how to think about negative goals the right way.

A living thing is a polar structure: it follows two apparently opposite laws of entropy at the same time.

First of all, a living thing is a physical thing, so all the physical laws of the macroscopic world we live in apply, and we know that means obeying the second law of thermodynamics: the instantaneous entropy always has to grow, and in the optimum possible way.

On top of this physical law sits the second one: keep your internal entropy low, as low as possible.

This duality underlies the need for some kind of negative goals: even if entropy always grows (so positive goals are very natural to the algorithm), you need to keep your internal entropy low, so you need to add some kind of limiting factor to avoid damaging ourselves by letting our internal entropy grow too much (too much means you die; a little too high means you are ill).

Imagine you travel in time, back to when the first living cell started to use "common sense"; we will try to follow its development, as it will give us the clues to negative goals.

I imagine a living cell floating in the dirty water. Its structure is new on Earth, and it manages to keep a quite low entropy inside: the cell is ordered, with different parts for different purposes, and if you look inside a lot of cells, the number of ways they can be organised internally is really, really low compared to a similar volume of dirty water, much more chaotic than the cell's interior.

This "living" structure is much better at generating entropy: if you want to take a mountain -very low entropy, tomorrow the montain will be like today, not many changes possible- and blast it into a bunch of rocks, stones and sand, you can wait for the laws of physic to do their work (erosion will eventually do this for you) or add life. If you plant a lot of plants, trees, mouses, bears and so, in centuries instead of millions of years, you can have a totally eroded, caved and changed montain.

Now, by pure randomness mixed with natural selection, in some of those cells -in this case a primitive alga capable of photosynthesis- a new structure gradually appears: a light detector placed on the border of the cell, plus a little tail on the opposite side, both connected by a single "wire". A neuronal system so simple it only has a detector, a wire and an actuator.

With 4 or 5 of those structures placed along the cell's edge, you have a totally different thing: this living thing is now intelligent, in some primitive but genuine way.

When light hits half of the cell, the detectors on that part will fire, and the wires will pass this on to the little tails, which will start moving. As a result, the cell will gently travel toward the light; it will escape from shadows and move into sunny zones.

This simple cell contains all the ingredients of an intelligence, but at such a low level that it makes a perfect structure for our mental experiment with negative goals.

Why do you need negative goals in this example of the green cell? Because when the cell reaches a sunny spot, the detectors will still fire, all at the same time, so all the "tails" will keep moving for as long as the cell is having its sun bath. You will spend more energy than what you can get from the sunlight!

How did nature deal with that? Negative scoring, but the neuronal version of it: inhibition of signals.

Take the last example: the cell has 4 of those detector-wire-tail structures that help it move toward the light. Now add a simple neuronal network with only one neuron, connected to the four detectors as inputs and to the four tails as outputs.

If all four detectors fire at the same time, their signals reach this neuron and activate it. If only three of the detectors fire, this neuron won't fire. This neuron is detecting the case "it is sunny everywhere"; it will only fire on this event.

When "the" neuron -its brain only has this one- fires, it pass this to the four "tails" in such a way, it will "inhivite" the firing of the tail movement.

You have avoided this ugly "all sunny" case with a simple but "negative" signal coming from one of the simplest neuronal networks possible.
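The whole cell fits in a couple of lines of code, by the way (a toy sketch in Python, where the detectors are plain booleans):

    def tails_to_move(detectors):
        # detectors: four booleans, True where light hits the cell's edge.
        # The single neuron fires only when ALL detectors fire ("it is
        # sunny everywhere") and then inhibits every tail.
        inhibition = all(detectors)
        return [lit and not inhibition for lit in detectors]

    print(tails_to_move([True, True, False, False]))  # half lit: two tails push
    print(tails_to_move([True, True, True, True]))    # all sunny: every tail rests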

How does it look in the real algorithm? Quite simple too!

I added to the simulation a "health" and an "energy" level for each of the players -karts or rockets- and, when scoring each future, instead of just using the square of the distance raced, I multiplied it by the health level (from 0=dead to 1=perfect) and the energy level (from 0=depleted to 1=full). That was all.

The effect was: if in a future I crash and my 100% health drops to 10%, then the positive goals collected in that future's trace are multiplied by 0.1, so non-crashing futures automatically become 10 times more appealing than the ones where I lose 90% of my health.

So let's have a look at 10 intelligent players -5 karts and 5 rockets- moving around a circuit filled with small drops of energy. Getting a drop doesn't give any score by itself, but the energy you get will make your energy level pop up, say from 0.85 to 0.95, so in the same way that not crashing was interesting, so is getting a drop.

Just to make it a little more natural, I gave the rockets the possibility of landing gently on the black pixels and resting to slowly refill their energy. But I never told them to land or anything similar, I just said: a future scores (raced^2)*health*energy. All the behaviours you will see just emerged from this formula (plus the neutral common sense).
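The scoring itself, as a sketch in Python (raced, health and energy being whatever your simulator reports at the end of the imagined future):

    def future_score(raced, health, energy):
        # Reduction goals at work: the positive goal (distance raced
        # squared) is multiplied by health (0=dead..1=perfect) and by
        # energy (0=depleted..1=full), so damaged or drained futures
        # fade away without any explicitly negative term.
        return (raced ** 2) * health * energy

    print(future_score(100.0, 1.0, 1.0))  # 10000.0 -> clean future
    print(future_score(100.0, 0.1, 1.0))  # 1000.0  -> crashing is 10x less appealing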



There exists a second way to do it -I assume it appeared in nature much later- and it works quite differently: fear.

But this post was about reduction goals, so I will stop here!

Robotic psychology?

Entropic intelligence -the natural tendency of intelligent beings to do whatever it takes in order to maximise entropy generation (measured not at the present moment but at some point in the future)- not only generates intelligent behaviour, as the original paper's authors suggested; it is the missing part we needed to push current AI into real "human-like" intelligence.

It is now a year since my first contact with this idea, and for the first time I find myself prepared to name it correctly and give a definition of what this algorithm is really doing inside.

During all this time, the "intelligence" algorithm itself and the resulting behaviours have been named -in my thoughts, in the code and in the posts here- with many different words, basically because I didn't know what exactly was emerging in the simulations, just that it seemed deeply related to the concept of intelligence somehow.

Don't expect any kind of mathematical proof; the concepts we are dealing with are not properly defined in today's science, so I can only rely on my intuition, and on the fact that intelligent behaviour does emerge, visually, in the videos.

I started calling it "entropic intelligence", but there are levels inside this idea and I needed different names or concepts for each part. Lately I have switched to using "brute intelligence" as the name for the simplest algorithm, and "goals" for the different additions needed to push this basic intelligence up to something usable.

Common sense

In the simplest version of this algorithm, the one that scores each option with the log of the number of different possible futures found, what we get is basically a generic way to numerically represent and simulate the part of intelligence usually known as "common sense".

So is the "common sense" living it last days of being the "less common" of all the senses? Yes, I really think so.

What exactly does this mean? Well, this simple algorithm can be applied to any system you can simulate, without any previous knowledge of the system and without any additional information or goals to drive the intelligence. It is a "neutral" way to add intelligence to anything we can think of.

Imagine you are going to let this AI drive a kart, or a helicopter, or manage a whole industrial process by moving joysticks, and you just ask it to do it carefully, with common sense, trying at any cost to keep it up and running until you come back from lunch. This is what you are getting with this AI: a babysitter capable of taking care of ANY system you give it, but with no specific goals except "keep it up and running for me".

Witch "golden rule" could you think of to condense this idea? "Always move into situations with as many reachable futures as possible".

The extended version could read like this: avoid situations that only have a few possible ways of continuing; instead, choose situations with lots of different reachable futures, so that if one of those possible continuations gets "blocked" in the short term, you always have plenty of other choices to take.
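As a sketch of this brute layer in Python (assuming you supply some step(state, action) simulator that returns the next state, or None when the system "dies"; the toy walker below is just an illustration):

    import math
    import random

    def common_sense(state, actions, step, n_futures=100, depth=20):
        # For each option, let 100 "crazy monkeys" continue at random and
        # score the option with the log of the number of different
        # futures they reached alive. Then pick the best-scoring option.
        best_action, best_score = None, -math.inf
        for action in actions:
            endings = set()
            for _ in range(n_futures):
                s = step(state, action)
                for _ in range(depth - 1):
                    if s is None:
                        break
                    s = step(s, random.choice(actions))
                if s is not None:
                    endings.add(s)
            score = math.log(len(endings)) if endings else -math.inf
            if score > best_score:
                best_action, best_score = action, score
        return best_action

    # Toy usage: a walker on positions 1..9 that dies touching 0 or 10.
    def step(pos, move):
        nxt = pos + move
        return None if nxt <= 0 or nxt >= 10 else nxt

    print(common_sense(1, [-1, +1], step))  # +1: away from the wall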

Don't try to do fancy things with your RC helicopter, like low passes over our heads at high speed, as those courses of action have quite few possible ways to get through them, maybe only one, and that can lead you to disaster if the only way to survive is to be lucky and rest on the assumption that nothing unexpected will block your narrow way.

Instead, keep the helicopter high and moving, so that whatever happens next, you always have plenty of different options to take and survive.

It is simple, universal, neutral, and it works impressively well. You can watch it in action in this video, which shows a kart on a track provided only with this "common sense":


Artificial psychology

In the full version of the algorithm, what we represent and simulate is the full "psychological" part of our minds: the one that decides what exactly to do at any moment (and also what to avoid) based on the supplied simulation of reality, plus a "psychological" part that ultimately dictates what this intelligence will "like" or "dislike" doing.

In this process of "thinking", already shown in all the past posts, this kind of AI just needs to be fed two external things: a way to imagine what would happen if it does this or that (a simulator of the system, not necessarily too accurate) and a measure of how much better -or worse- each change it can simulate is.

This second part acts as the "distance" the system had to "walk" to go from point A (the initial state of the system) to the final state B (the simulated end position of the system after a given time). Both states of the system, A and B, are technically said to live in the "phase space" of the system, a way of saying they are two among all the possible states of the system over time.

By adding such a function to the brute intelligence of the entropic AI, we are technically defining a metric in the phase space of the system. This could seem like just a set of mathematical and basic physics details to be solved and forgotten, but the kind of function we finally apply will, in fact, determine how the system behaves. You are defining the intelligence's personality.

I have made extensive tests with all kinds of functions that made any sense to me, and carefully watched the resulting behaviours in the videos produced afterwards. Sometimes the karts behaved fearless, other formulae brought real fear to them, while others just made them cautious; but as a general rule, adding complexity to the formulae without much care about how we do it will end up in some kind of pathological personality.

Bipolarity, schizophrenia, suicidal tendencies (a big lot of them) will arise if you just play around with this distance formula. If you use a distance definition that allows negative distances, for instance, you will have to deal with fear: a fear that will make your intelligence panic when confronted with situations that give negative distances anywhere it decides to go, and freeze it.

So the question here is: which mathematical form should the goals I use to define the metric take in order to get a useful, trusty behaviour?

The short answer is: the ones that make up a balanced personality. My wife is a fairly good psychologist and we have spent many late hours talking about all this, and we are both quite surprised at how perfectly everything matches.

But to be a little more concrete, I can offer you a recipe that only uses three kinds of simple goals:

1) Positive goals are the best ones: they just add an amount to the distance, and if you need to mix several of them, you just sum them all. They are the only ones that will build a real distance, and they are the only ones you should use if you plan to get a perfect intelligence.

For instance, scoring each of the kart's futures with the square of the euclidean distance raced -the first goal-driven intelligence I simulated- gives you a near-optimal intelligence capable of rivalling any human. But it worked nicely only because I swept any risky ending out of the simulation phase: when the kart crashed into a wall during the simulation of a future, I just ended the future there, and so the "raced distance" was smaller.
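In code, this positive goal with the crash truncation could look like this (a sketch; the trace is just a hypothetical list of per-step distances coming from the simulator):

    def raced_score(step_distances, crashed_at=None):
        # Positive goal: square of the distance raced in the imagined
        # future. If the kart crashed during the simulation, the future
        # ends right there, so the raced distance -and the score- is
        # naturally smaller, with no negative term anywhere.
        steps = step_distances if crashed_at is None else step_distances[:crashed_at]
        return sum(steps) ** 2

    print(raced_score([1.0] * 60))                 # 3600.0 -> clean future
    print(raced_score([1.0] * 60, crashed_at=20))  # 400.0  -> early crash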

You can visually compare the previous "common sense only" kart with another one with a "psychological" tendency to run as fast as possible, and judge for yourself:


Had I been more realistic in my simulation by calculating the bounces of the kart against the track limits -as I eventually did- then the futures wouldn't stop after a crash any more, and the raced distance would also sum the part after the bounce, making it much more attractive to the intelligence. It would not fear breaking the kart at all, so eventually it would just crash and die.

It is a silly and 100% optimistic intelligence and, believe me, that is not a nice combination. The videos showing this behaviour were never recorded, but you can easily simulate it in the V1.0 application: just move the "Keep health" slider from the initial strength of 1 (rightmost position) down to 0, and watch your karts crash and break into pieces (well, the simulation won't make the kart burn in flames, but it will crash and stop moving).

Basically we have two ways to go from here: try to take away the "silly" part of the combo, or the "100%" from the optimistic part. Both are possible to simulate, but here we will just consider our basic brute layer-1 intelligence and see how far we can go before seriously thinking about simulating higher levels of intelligence, the multi-layered model I will hopefully have in production some months from now.

So if the intelligence you are using is not state of the art, if you are using "one layer" of this entropic intelligence as in the cases I am currently simulating, then you will need to add some sort of "negative goals".

I refused for a long time to even consider this option. After all, entropy is about always having positive growth, so this distance is supposed to mimic some kind of entropy-driven process; and if, aesthetically, using a distance that is not a real distance to build a metric seems to you an aberration, then maybe we should start by forbidding any kind of goal that is not strictly about adding a positive value. But we need them, so I stopped complaining and eventually found a nice way to deal with it.

In the next of these "psychological" posts I will show you a couple of ways to add negative goals to the model without going too mad (the intelligence, not you), ways that will give us some nice videos showing more complex behaviours. But be warned: they will have odd consequences.