The Human Monopoly On Conversation Is Over

Each of us knows many languages. We use body language; we make structured sounds with our vocal cords; we read and write mathematical symbols. We also read and write symbols taken from something called an alphabet. We form these symbols into collections called words, which define parts of the human experience of reality. These foundational parts, or concepts, can be combined according to certain rules that allow us to build composite concepts - ideas made out of other ideas.

A language is a tool for making ideas concrete, for storage and for transmission.

Let’s imagine for a minute that we can see the ideas held inside people’s minds. In the diagram above, we see that Alice is bored. This unarticulated feeling is represented by bored(me). She chooses to communicate this idea to Bob using the following string of informational symbols:

“omg im sooo bored”

It gets the message across. We can see that because in Bob’s mind we see the idea bored(alice) appear.

Is this exactly equivalent to the idea stored in Alice’s mind?
The only difference between the two ideas is that the me in Alice’s mind has been replaced by Alice in Bob’s. For Alice, me stands for Alice, so it does seem that she managed to transmit information to Bob with no loss of meaning.
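If we wanted to make that hand-off concrete, we might model it something like the sketch below. The bored(me) notation is the same as above, but the tuples and the articulate function are just one illustrative way of encoding it:

```python
# A minimal sketch of the idea hand-off above. The bored(me)/bored(alice)
# notation comes from the example; representing ideas as (predicate, subject)
# tuples is just one possible encoding, chosen for illustration.

def articulate(idea, speaker):
    """Turn the speaker's private idea into a shareable one: 'me' becomes their name."""
    predicate, subject = idea
    return (predicate, speaker if subject == "me" else subject)

alice_idea = ("bored", "me")                 # what Alice holds privately
bob_idea = articulate(alice_idea, "alice")   # what Bob ends up storing

print(alice_idea)  # ('bored', 'me')
print(bob_idea)    # ('bored', 'alice')
```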

But what if bored means something else to Alice than it does to Bob? Maybe Alice has a far lower boredom-threshold than Bob. She just gets bored more easily than Bob does. If Bob doesn’t know this, the idea in his mind means something different to what Alice was trying to tell him. The idea in Bob’s mind means that Alice is bored by Bob’s definition of boredom.

We can see the effects of this kind of mismatch far more clearly in the next example:

Alice thinks that Bob is cute and wants to get to know him better over coffee. In her mind, having a coffee with Bob is sort of a date. In Bob’s mind, having a coffee with Alice is something they would do as friends.

This coffee will obviously get a little awkward. This awkwardness is entirely caused by the difference in their mental models: though they agree on the practicalities of having coffee together, they differ on what that coffee entails.

We have two characters whose inner desires have been revealed to us, in conflict because they don’t fully articulate how they feel. This is a familiar situation. Most of us rarely articulate what we really feel. In reality this is frustrating, because we are always in the dark, only having insight into our own minds. In novels and films the narrator reveals the inner thoughts of all the characters, so we can watch the conflict unfold fully aware of all the forces involved.

Luckily for us, even though our mental models might never align perfectly with anyone else’s, they tend to mostly overlap. Alice and Bob will go to a cafe, order something to drink and spend a while in each other’s company, because this is what was understood by both parties as the meaning of “grabbing a coffee”. Conveying so much information in such a small phrase is impressive. What might be more impressive is how much was left unsaid.

What happens when we try to converse with something that has a radically different experience of reality to us? Are we less able to leave things unsaid?

The range of things we can talk about with a dog is incredibly limited. We might be able to get a dog to recognise the word “treat” or “walkies”, but we’ll never be able to make a dog understand the phrase “favourite colour”. It’s too complex - the idea is made up of too many components.

When two intelligences converse, they can only talk about things that are understood by both parties. A word is only a useful informational vector if both parties know what it means.

So what about when we talk to computers?

Alice asks Google where she and Bob can get a coffee.

Computers know anything we can unambiguously express to them. In this example, Google has been told about locations and restaurants, which means it can give Alice a sensible answer.

Google knows that Alice is located on Sveavägen. It also knows that there is an Espresso House cafe located on Kungsgatan, and that Kungsgatan is close to Sveavägen. Together, this means that Espresso House is a sensible answer to Alice’s question.
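We could imagine those facts being stored in something as simple as the sketch below. The street and cafe names are taken from the example; the dictionaries and the lookup are purely illustrative - a real system obviously uses far richer models than this:

```python
# A toy sketch of the kind of stored facts that make the answer possible.
# Street and cafe names come from the example; the data structures and the
# lookup function are illustrative assumptions only.

person_location = {"alice": "Sveavägen"}
cafe_location = {"Espresso House": "Kungsgatan"}
near = {("Sveavägen", "Kungsgatan"), ("Kungsgatan", "Sveavägen")}

def cafes_near(person):
    """Return cafes on the person's street or on a street known to be close by."""
    street = person_location[person]
    return [cafe for cafe, cafe_street in cafe_location.items()
            if cafe_street == street or (street, cafe_street) in near]

print(cafes_near("alice"))  # ['Espresso House']
```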

This answer is remarkably similar to how a human being might respond, if asked the same question. It is a product of knowledge stored in the computer’s mind. Even though the way knowledge is stored in a computer is probably very different to how it is stored in a human mind, the computer managed to give an intelligent response.

But is this really intelligence, or just a simulation of it? Does Google really understand what Alice is asking, or does it just behave as if it does? If I recommend Espresso House to Alice, it might be because I’ve been there before and liked it. Google might recommend it because it has been rated 4 out of 5 stars by thousands of users.

Is there a meaningful difference between these two things? What makes our type of intelligence “real” but the computer’s intelligence simulated?

Let’s look at another example, which better shows the difference between a computer’s and a human’s store of knowledge.

Alice and Bob’s coffee at Espresso House is awkward for both of them. It leaves Bob deeply troubled. In his confusion and desperation, he turns to Google, and asks “What is love?”

To Google, this phrase can only mean one thing: the 1993 dancefloor hit by Haddaway. Google helpfully responds with information about the song, entirely oblivious to Bob’s inner turmoil.

Google stores knowledge about the song because that information is simple to express: the name of the artist, its release year, even the sound of the track itself. Google can store the lyrics:

What is love

Oh baby, don’t hurt me

Don’t hurt me no more

but will Google ever be able to store the complex, tangled feelings evoked by a catchy, upbeat melody combined with the sound of a man pleading with his lover to stop hurting him?

It’s not surprising that Google chooses to ignore this aspect of Bob’s question.

Computers don’t understand those subjects that we barely understand ourselves. If someone asks one of us to explain what love is, how are we supposed to respond? Whatever we can put into words seems reductive, incomplete.

Which is why we have art. Art evokes through analogy and synecdoche those things too complex and too deeply embedded in our experience of reality to be formally and unambiguously expressed in words.

So is this the way things are?

Will we never be able to discuss love and hate and regret with a machine, for the same reason that we can’t discuss those things with a dog?

Well there’s a fundamental difference between a dog (or a human) and a computer. Unlike with biological entities, we can actually look into the mind of a computer.

And not just that, we can program it too. We can decide how a computer extracts knowledge from the things we say to it. We get to choose how this knowledge is stored. We define how this stored knowledge is combined to form a response.

In other words, we can teach a computer to do far more tricks than a dog. Can we teach it the trick of being human?

Let’s think about how we might teach a computer to participate in a knock-knock joke.

Human: Knock knock.

Computer: Who’s there?

H: Boo!

C: Boo who?

H: Now now, don’t cry.

C: Hahaha!

The human starts by saying “knock knock”. The computer needs to respond to this phrase by replying “who’s there?”

H: “Knock knock” –> C: “Who’s there?”

“Who’s there” must come after “knock knock”. We can ensure this happens in the following way:

H: “Knock knock” –> C: remember! knock-knocked

C: when knock-knocked, say “Who’s there?”

and remember! said-whos-there

Every time something is said, the computer needs to explicitly remember it, so it can keep track of where it is in the conversation. We do this without realising it.
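One way to picture that bookkeeping is a little state set, something like the sketch below. The state names mirror the rules above; everything else is just one possible encoding:

```python
# A minimal sketch of the "remember!" bookkeeping for the first exchange.
# The state names come from the rules above; the rest is illustrative.

state = set()

def respond(line):
    if line.lower().startswith("knock knock"):
        state.add("knock-knocked")      # remember! knock-knocked
        state.add("said-whos-there")    # remember! said-whos-there
        return "Who's there?"
    return "..."

print(respond("Knock knock."))  # Who's there?
print(state)                    # {'knock-knocked', 'said-whos-there'}
```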

H: Boo! –> C: Boo who?

The next part of the joke is a little harder to handle because it involves a variable - the name of whoever or whatever is at the door. We’ll call this variable X.

When this dialogue occurs, the computer needs to understand that in this joke, X = Boo. This allows it to fill in the template “X who?” with “Boo who?”

H: “Boo!” –> C: remember! whos-there(Boo)

C: when said-whos-there and whos-there(X), say “X who?”

and remember! asked-X-who
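In code, the variable-handling might look something like the fragment below. It picks up mid-conversation, assuming we have already reached the said-whos-there state, and the parsing is deliberately naive:

```python
# Handling the variable X: capture whoever is at the door and fill in the
# "X who?" template. Assumes the conversation has already reached the
# said-whos-there state; the parsing is deliberately naive.

state = {"knock-knocked", "said-whos-there"}
memory = {}

def respond(line):
    if "said-whos-there" in state and "asked-X-who" not in state:
        x = line.strip().rstrip("!.")      # "Boo!" -> "Boo"
        memory["whos-there"] = x           # remember! whos-there(Boo)
        state.add("asked-X-who")           # remember! asked-X-who
        return f"{x} who?"
    return "..."

print(respond("Boo!"))  # Boo who?
```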

Now for the punchline.

H: “Now now, don’t cry” –> C: remember! punchlined

C: when punchlined, say “Hahaha!”

We’ve just taught a computer to respond “intelligently” to any knock-knock joke!

H: Knock, knock. –> knock-knocked

C: Who’s there? –> said-whos-there

H: Noah. –> whos-there(Noah)

C: Noah who? –> asked-X-who

H: Noah good place we can get coffee? –> punchlined

C: Hahaha!
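Stitched together, the three rules might look something like the sketch below, run over that Noah exchange. Only the rule and state names come from the rules above; the rest is one possible, very literal-minded, implementation:

```python
# The three rules above, combined into one tiny joke listener.
# Only the rule/state names come from the rules; the rest is a sketch.

def make_listener():
    state, memory = set(), {}

    def respond(line):
        text = line.strip().rstrip("!.?")
        if text.lower() in ("knock knock", "knock, knock"):
            state.update({"knock-knocked", "said-whos-there"})
            return "Who's there?"
        if "said-whos-there" in state and "asked-X-who" not in state:
            memory["whos-there"] = text        # remember! whos-there(X)
            state.add("asked-X-who")           # remember! asked-X-who
            return f"{text} who?"
        if "asked-X-who" in state:
            state.add("punchlined")            # remember! punchlined
            return "Hahaha!"
        return "..."

    return respond

respond = make_listener()
for line in ["Knock, knock.", "Noah.", "Noah good place we can get coffee?"]:
    print("H:", line)
    print("C:", respond(line))
```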

So is the computer laughing because it genuinely finds the joke funny, or does it just know when it should laugh, and so it does, mechanically?

when punchlined, say “Hahaha”.

Have you ever laughed at a joke you didn’t understand, just to fit in? I know I have. Of course, our pretence at finding a joke funny is many orders of magnitude more sophisticated than the pretence of this computer.

But what if we made it more sophisticated? What if we encoded enough linguistic knowledge in a computer for it to understand that the first joke is funny because “Boo who” sounds like “Boo hoo”, and the second because “Noah” sounds like “Know a”? What if we made a computer’s humour level rise when it finds unexpected homophones (words and phrases that mean different things but sound the same)?
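Even that could be sketched, albeit very crudely. In the toy below, the sounds-like table is hand-built for exactly these two jokes; a real system would need an actual phonetic model, which is where the difficulty really lives:

```python
# A very rough sketch of "humour rises when the setup hides a homophone".
# The sounds-like table is hand-built for these two jokes only; a genuine
# system would need a phonetic model rather than a lookup table.

sounds_like = {
    "boo who": "boo hoo",    # the sound of crying
    "noah": "know a",        # "Noah good place..." ~ "Know a good place..."
}

def humour_level(x_who_question):
    """Crude score: rises when the 'X who?' question has a second reading."""
    key = x_who_question.lower().rstrip("?").strip()
    return 1 if key in sounds_like or key.split()[0] in sounds_like else 0

print(humour_level("Boo who?"))   # 1  ("boo who" sounds like "boo hoo")
print(humour_level("Noah who?"))  # 1  ("noah" sounds like "know a")
print(humour_level("Bob who?"))   # 0  (no hidden reading, so no laugh)
```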

Would a computer then be genuinely laughing at a knock-knock joke?

It seems incredibly difficult, but not impossible.

That’s a glimpse of the road toward making computers understand a very specific and narrow kind of humour. Does it represent a first step? If we do the same for love and hate and desire, could we end up with this?

Does that mean we might all be no more than stupendously sophisticated, biological machines? That our intelligence is as much an illusion as a computer’s, but an illusion so convincing that we believe it ourselves?

What do you think?

Check out the slides here

 