Artificial Life and Believable Characters

“Art + Life: Life made by Man rather than by Nature”. -Langton (Langton)

The AI approach tries to choose emotions depending on certain factors. It might use fuzzy logic to decide, on the basis of fuzzy input, which emotion to display. On top of that there may be models for expressing such emotions, such as the facial animation system used in Half-Life 2 (Valve 2004). That system works well, but it maps one expression to one emotion, so it is quite simplified, and the matching expressions are pre-set by designers. This can cause problems. For example, affirming by nodding the head forwards and back is something we are used to in the West, but in some Eastern cultures affirmation is signalled by a movement from right to left. So these definitions of which expression conveys which emotion should not be generalised and fixed in advance; if they are, you have to consider the conventions of different societies. Expression should be unique to the character itself. In the best case the character would learn which expressions cause which reactions. That way, characters in a similar society would all end up using the same expressions, while in a different society they would learn others. I think this is the basis for language as well, because language is the expression of certain facts about the environment. So for example the word for “chair” is different in every language, because we all learn it from our social environment. I don’t know if Chomsky and company would agree, but I think the same concepts develop everywhere (because they exist in our shared reality), while language depends on where we are, what we are surrounded by, and whom we want to affect in what way.
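As an illustration of the kind of mapping such an AI approach uses, here is a minimal fuzzy-logic sketch. The membership ranges, input names and emotion labels are entirely hypothetical, not taken from Half-Life 2 or any shipped system:

```python
def membership(value, low, high):
    """Linear fuzzy membership: 0 below `low`, 1 above `high`, ramp in between."""
    if value <= low:
        return 0.0
    if value >= high:
        return 1.0
    return (value - low) / (high - low)

def choose_emotion(threat, affection):
    """Hypothetical fuzzy rules mapping two inputs in [0, 1] to an emotion label."""
    fear = membership(threat, 0.3, 0.8)
    joy = membership(affection, 0.4, 0.9)
    degrees = {
        "fear": fear,
        "joy": joy,
        "neutral": 1.0 - max(fear, joy),
    }
    # Defuzzify by picking the label with the highest membership degree.
    return max(degrees, key=degrees.get)
```

Note that the output is fully determined by the input: the same threat and affection values always yield the same emotion, which is exactly the determinism criticised below.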

That’s why I think these approaches of assigning certain expressions to certain causes are misguided. The Oz project produced a similar result, as described by Michael Mateas. They applied Ortony, Clore and Collins’s binding of expressions to certain emotional situations, displayed by little animated creatures called Woggles. They found that the Woggles with binding errors were deemed to have the most personality by observers: users reported that the Woggle that repeatedly banged its head against the ground was the most interesting and described it as neurotic, angry or upset. They had all these connotations about the crazy Woggle, while for the Woggles with “correct” or plausible bindings, such as “get angry when hit” or “be sad when alone”, observers simply noted that the Woggles were programmed to do so. The interesting thing is that the most believable behaviour was due to a programming error in the simulation, not to deliberate design. This shows that it is often not the expected bindings that seem the most interesting: the individuality of a reaction matters more than its correspondence to some social convention.

These were AI systems in the sense of defining specifically, symbolically or logically which input gives which output. You could use fuzzy logic, but it is still deterministic – a given input produces a given output unless you insert a random variable somewhere. I don’t like random variables: random just means “I don’t know”, just as 50:50 or 50% means “I don’t know”. So essentially what I’m looking for is a system which allows a creature to perceive certain emotional expressions in its environment and then try to emulate these expressions on the output end of its own … feeling. Basically we want to create a connection between the expression of an emotion that we see externally and something that we feel internally. Once that connection is made, if I feel that emotion, I will express something similar to what I saw before. Through this “adaptation” process – you could call it adaptation, synchronicity, copying or pattern matching (it’s all in the same vein) – we would get the individual expression that really forms the character. It is interesting to consider a character or child growing up in different cultures at the same time. Say it spends half of its life in India and the other half in the US or Europe: the expressions used would be very mixed, just as the language would be mixed.
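A toy version of that adaptive coupling can be sketched with a simple Hebbian-style rule that strengthens whatever expression co-occurs with a felt emotion. All names, the update rule and the learning rate are illustrative assumptions, not a claim about any existing system:

```python
class ExpressionLearner:
    """Learns which observed expression co-occurs with which felt emotion."""

    def __init__(self):
        self.weights = {}  # (emotion, expression) -> association strength

    def observe(self, emotion, expression, rate=0.1):
        """Strengthen the link between a felt emotion and a seen expression."""
        key = (emotion, expression)
        self.weights[key] = self.weights.get(key, 0.0) + rate

    def express(self, emotion):
        """Reproduce the expression most strongly associated with this emotion."""
        candidates = {exp: w for (emo, exp), w in self.weights.items()
                      if emo == emotion}
        return max(candidates, key=candidates.get) if candidates else None
```

Two learners exposed to different “societies” – one mostly seeing smiles paired with joy, the other head wobbles – would end up reproducing different expressions for the same internal state, which is the individuality argued for above.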

I even speak Denglish (German-English) with my old friends (we went to an international school in Germany, where the teaching language was English). For us it makes no difference whether we use German or English words to express what we mean; we often don’t even notice when we use verschiedene Sprachen (“different languages”). To somebody who is not from the same (sub)culture – from either a German or an English background, but not both – it would sound very odd.

Basically this is where the aspect of subculture comes in: mixing different known stereotypes forms a new culture. Trying to define how many cultures there are is therefore futile, because every mixture creates a new subculture. This is one of the reasons why I don’t agree with symbolic AI: you get this symbol overflow. You can take a few base symbols and enumerate an exhaustive list of combinations, but you will still find things you cannot express with this set of symbols and will have to introduce a new one. So how many symbols do you need to introduce to be able to express everything around us – to really have an exhaustive set of symbols? I don’t think you can achieve this, although the point is debated; maybe a non-symbolic approach can never fathom every possible concept either. I am basically more interested in a procedural, process-based approach to generating and correlating experience. And I want to use the actual sensory information, the perception of an emotion, to create the emotion – to act it out.

This is where Braitenberg and other Artificial Life based architectures come in, because they look at biology for inspiration. They found that the brain and nervous system do a lot of this kind of pattern matching. Some go even further (Hawkins) and claim that there is a general processing algorithm that handles temporal pattern matching in every region of the brain and that can be found in us all. Although I agree with that to some degree, I am more intrigued by Braitenberg’s views. He comes from a purely biological perspective: he had been analysing the neural structures in real brains and nervous systems for 40 years prior to writing his seminal book “Vehicles: Experiments in Synthetic Psychology”, which was based on an essay he had written 20 years before, titled “Taxis, Kinesis and Decussation”. Taxis is reacting to a stimulus, kinesis is reacting to the strength of the stimulus, and decussation is the way cross-connections exist in our brains (left to right for arms, eyes etc., except the nose). For the biology, then, I am relying on Braitenberg because of his experience.
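The core of such a vehicle can be sketched in a few lines. The wiring below follows Braitenberg’s crossed versus uncrossed excitatory connections (vehicles 2b and 2a); the sign conventions and the reduction to a single update function are my own simplifications:

```python
def vehicle_step(left_sensor, right_sensor, crossed=True):
    """One update of a Braitenberg-style vehicle: sensor activations drive
    wheel speeds.  With crossed (decussated) excitatory wiring the vehicle
    turns toward the stronger stimulus; uncrossed wiring turns it away."""
    if crossed:
        left_wheel, right_wheel = right_sensor, left_sensor
    else:
        left_wheel, right_wheel = left_sensor, right_sensor
    forward = (left_wheel + right_wheel) / 2.0  # kinesis: speed tracks intensity
    turn = right_wheel - left_wheel             # taxis: positive = turn left
    return forward, turn
```

With a stimulus on the left, the crossed wiring drives the right wheel harder, steering the vehicle toward the source; flipping `crossed` reverses the behaviour, which is all it takes to go from “aggression” to “fear” in Braitenberg’s terms.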

My idea is that an approach to real believability in these kinds of agents is possible if we build on the low-level pattern recognition that Braitenberg suggests. I want to build an AI controller that interfaces with, or in some cases replaces, systems such as Euphoria and Massive. Euphoria was based mostly on human-like bipeds (which is very useful for games, as they usually feature human characters). My system will at some point be able to handle this as well, but I am focusing on little vehicles with wheels, for which there is no “realistic” reference motion to match, as long as we respect the necessary environmental effects such as friction, inertia, gravity, slip and so on. But there is no reason why the kind of technology we are employing cannot achieve the kind of result NaturalMotion’s has, as their system was biologically inspired as well.

We haven’t really mentioned other Artificial life projects yet:

1. Planning

2. Low level behaviour

3. Cell behaviour – autopoiesis (extreme self-sustaining or self-rebuilding behaviour), by Maturana and Varela

When I was talking about Disney, I said that the visible thought process is the main thing that makes animated characters in non-interactive scenarios believable. I want to add that the thought process is something that is quite difficult to tweak into believability with normal AI systems. You could incorporate delays in the decision-making process or in the animation, but that delay would have to be determined by something – a random variable, a fixed constant or another process. You could end up with a huge hierarchy of sub-processes (deliberation) that each contribute some time value, and that would be possible, but what values do you give these things? My thinking is that if you use an Artificial Life or biologically based system, which incorporates time delays at the very low level, then we would get across the impression that a thought has to travel through the brain, and that this takes time. The result is that no action is performed the instant it is selected. In a sense it is procedural: a process might cause the character to tentatively begin an action and then become convinced that it is the correct action while performing it. This “testing” happens in real time, while the action is performed. This is the kind of thing that is very difficult to model with a classic AI system, because time usually has no intrinsic value in such systems.

Essentially I am thinking of the Disney animation criteria as a benchmark. What we are trying to do is create an artificial animator – an artificial animator with a personality that can generate personalities. The techniques of character writing considered there, and the aspects that go into the actual animation (movement), as addressed by Euphoria, flow together to form an individual, believable character that can then be used in interactive scenarios. That is an important point I wanted to tie back to: the Disney/Pixar animation techniques are so successful that they gave the robot WALL-E so much character that we were able to identify with him. So we can look at that as our benchmark and ask what goes into each frame and into the time delays displayed by the character: Are they calculated? Are they pre-decided? Are they human? Are they what the animator felt themselves? Are they recorded in some cognitive time-delay chart (e.g. 2 ms for fear, 3 ms for love, etc.)? I don’t think those criteria can be determined as constants (it depends how much you love or fear something…). All of them are incredibly hard to model with an AI system, but they would develop and emerge from an AL-based or biologically inspired system that includes time delays. If somebody is unsure about something, the thought is propelled through the brain for a longer time until a distinct action is performed. Some actions may be performed half-way, with the character flipping between them – like the bunny in front of the car’s headlights, not sure whether it should run or stay, because there are so many possible outcomes (and actions) and they are all being activated.
I don’t think even fuzzy logic can do this. The decision made would be equivalent (the bunny runs or it doesn’t), but the deliberation between the different options, while the decision is being made – that process is something that is only displayed in the Artificial Life model.
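A crude way to make this deliberation visible is a pair of leaky accumulators racing toward an action threshold, in the spirit of accumulator models of decision making. All parameters here are made up for illustration; the point is only that the decision takes time and both actions are partially active while it forms:

```python
def deliberate(evidence_run, evidence_stay, threshold=1.0, leak=0.05, steps=200):
    """Two leaky accumulators race toward a threshold.  The returned trace
    shows the partial activation of both actions while the decision forms."""
    run = stay = 0.0
    trace = []
    for t in range(steps):
        run += evidence_run - leak * run    # leaky integration of evidence
        stay += evidence_stay - leak * stay
        trace.append((run, stay))
        if run >= threshold:
            return "run", t + 1, trace
        if stay >= threshold:
            return "stay", t + 1, trace
    return "undecided", steps, trace
```

Close-matched evidence produces a long deliberation with both actions visibly half-activated (the bunny frozen in the headlights), while stronger evidence crosses the threshold in fewer steps, so the hesitation emerges from the dynamics rather than from a hand-tuned delay constant.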

Believability Notes

Chapter 1 Introduction

1.1 What is “believable”?

Something “believable” is something that we consider to be true, or a person whom we believe to be trustworthy. Believability can also be understood through “suspension of disbelief”: the ability to overcome the observer’s doubts that the subject presented is real or some form of truth. A common way to achieve this is to enable the observer to relate to the subject of the work on a personal level. In the artistic domain, examples of believable subjects are a character in a play, the plot of a novel or an animated figure.

1.2 Suspension of Disbelief

Suspending disbelief has been the primary endeavour of character-based art in every medium, from sculpture and painting, through theatre and writing, to radio, television and video games.

Throughout the evolution of media, our ability to emulate reality in our creations has become more refined. With this increasing degree of realism and fidelity, suspending the viewer’s disbelief appears to become easier to achieve. Yet when we compare expressive media by how much “work” is left to the observer’s imagination, we see that it has in fact become harder. To bring across the “Gestalt” of a character (expressions, body language, choice of actions etc.) in a modern medium requires the artist to attend to a much greater set of details and nuances than a less explicit, more abstract medium demanded. If the artist does not cater to these details, the work will be less believable.

A medium like written text can “use” the reader’s imagination to fill in the details, while an inherently explicit medium like interactive animation is exposed to closer scrutiny for lacking detail.

For example, if a writer writes, “There is a room, it is dark”, the reader will imagine a dark room as he knows it from experience. If, on the other hand, a painter paints a black square and says “This is a dark room”, the observer is unlikely to believe it unless the painter considers and includes the details of the room: its dimensions, the perspective of the eye. The painter even has to put light into the dark room to make these properties visible.

–possibly describe the difference between believability and realism–

It could be said that Leonardo da Vinci, who took around five years to create his masterpiece, the Mona Lisa, would have struggled far more to produce an animation of the same intrinsic quality – let alone an interactive animation. The medium of painting, with all its expressive limits, allowed him to focus on the characteristic nuances that make it a masterpiece. A 10-second animation at 15 images a second, retaining all that detail, would have taken a lifetime (around 750 years, in fact).

1.3 Details

A description of what those details are, based mainly on the thesis by Loyall, the Disney animation principles and Stanislavski.

While realism applies to the generalities – the physics of the world – believability rests in the nuances. Before I describe what these are, let me address a problem that arises with increased realism, especially with human characters.

1.4 The Uncanny Valley Effect

For creators of believable human-like characters the Uncanny Valley has been a problematic issue. The underlying concept of the uncanny was explored by the psychologists Ernst Jentsch in 1906 (Jentsch 1997) and Sigmund Freud in 1919; the term “Uncanny Valley” itself was coined by the roboticist Masahiro Mori in 1970. The theory states that if a realistic human-like figure comes too close to looking like a real human being, an observer will suddenly switch from an empathetic to a repulsed response. (White, McKay et al. 2007) state that this effect can be observed for static and moving images, figurines and robots, and applies not only to visual impression (looking like a human) but also to movement (moving like a human).

The movie “Final Fantasy: The Spirits Within” (Sakaguchi and Sakakibara 2001) is a good example. It featured characters that looked very realistic when static (Figure 1). The artists considered almost all the criteria for physical photo-realism, such as light reflection and refraction in the skin and eyes, texture, moisture and natural colours. Yet when seen in motion, the animated characters often elicit a feeling of discomfort – a feeling that these characters look less like living, breathing people and more like walking, talking corpses.

A way to avoid the Uncanny Valley altogether is to avoid realism. Non-realistic anthropomorphic characters such as Disney’s Donald Duck do not run the risk of seeming too human-like, yet can be used to convey believable human traits – a method also used in Aesop’s Fables.

My work aims to show a new approach to creating interactive believable characters. My focus is not on creating human-like figures.

1.6 Key Related Research Projects

1.6.1 Oz Project and HAP Architecture by Joseph Bates and Bryan Loyall at CMU

I would not say that the agents created with HAP are autonomous. They have no ability to correct their own behaviour and therefore fall under Searle’s Chinese Room argument about weak AI. HAP creates an interactive narrative with additional, implied rules and dynamics that the author may add unintentionally while writing the agent’s rules. The agent is therefore not an interactive personality.

Jentsch, E. (1997). “On the psychology of the uncanny (1906).” Angelaki: Journal of Theoretical Humanities 2(1): 7-16.

Sakaguchi, H. and M. Sakakibara (2001). Final Fantasy: The Spirits Within. Japan.

White, G., L. McKay, et al. (2007). “Motion and the uncanny valley.” Journal of Vision 7(9): 477-477.