GPT-3, Abstract Reasoning and Theory of Mind

Lately I’ve been experimenting with probing GPT-3 for demonstrations of reasoning, or at least the ability to mimic human reasoning. It is fascinating to me that out of training a model to “predict the next word” we’re seeing these emergent abilities.

I’ve found it to be adept at producing analogies, which got me thinking about how sometimes interesting revelations or discoveries can come from identifying a way in which a given thing is like another thing. For example, the study of how viruses replicate involves principles that are applicable to how ideas spread in society. Might GPT-3 be able to identify interesting shared traits of things that I would never consider? Could these be leveraged as useful discoveries?

In the chat below you’ll see it does a good job of finding commonalities, and they are often things I wouldn’t have thought of. So far nothing earth-shattering, but I think the potential here is only limited by creativity. As an aside, I was also thinking this could be the basis for a Human-AI game, where the human has to guess the commonalities that GPT-3 came up with and is scored accordingly over a series of rounds. I’m envisioning game-play along the lines of the game show Family Feud.
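As a rough sketch of how that game loop might work, here’s a minimal outline in Python. Everything in it is an assumption on my part: `ask_model` is a hypothetical placeholder for whatever GPT-3 API call you’d use, and the scoring is a naive word-overlap check rather than anything clever.

```python
# Minimal sketch of the guessing game described above.
# ask_model() is a hypothetical placeholder for a GPT-3 call;
# scoring is naive word overlap, not a real similarity measure.

def ask_model(prompt: str) -> list[str]:
    """Placeholder: return the model's list of commonalities for the prompt."""
    raise NotImplementedError("Wire this up to your GPT-3 API of choice.")

def score_guess(guess: str, answers: list[str]) -> int:
    """Award 1 point if the guess shares a meaningful word with any model answer."""
    guess_words = {w.lower().strip(".,") for w in guess.split() if len(w) > 3}
    for answer in answers:
        answer_words = {w.lower().strip(".,") for w in answer.split() if len(w) > 3}
        if guess_words & answer_words:
            return 1
    return 0

def play_round(thing_a: str, thing_b: str) -> int:
    """One round: the model lists commonalities, the human tries to guess them."""
    prompt = f"Name 5 ways a {thing_a} is like a {thing_b}"
    answers = ask_model(prompt)
    total = 0
    for i in range(len(answers)):
        guess = input(f"Guess commonality #{i + 1} between {thing_a} and {thing_b}: ")
        total += score_guess(guess, answers)
    print(f"You matched {total} of {len(answers)} answers.")
    return total
```

A real version would want fuzzier matching than word overlap, but even this crude scoring captures the Family Feud flavor of trying to guess what the model already “said.”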

Human: Name 5 ways a book is like a bridge

AI:

1. Both books and bridges provide passage from one side to another.

2. Both books and bridges can be incredibly long or short.

3. Both books and bridges can be made of different materials.

4. Both books and bridges can be enjoyed by many people simultaneously.

5. Both books and bridges can be symbols of culture and heritage.

Number 1 is either wrong or correct in a clever way. A particularly impactful book can change one’s mental state and even life direction. In that sense it could be thought of as a kind of passage. I know, it’s a stretch.

Number 2 is correct, and it’s a good answer that wouldn’t have occurred to me. I just wouldn’t have thought to connect the idea of length across the two contexts.

Number 4 interests me because it isn’t about intrinsic properties of books and bridges, but rather about how humans interact with them.

Human: Name five ways that a cup of coffee is like a fish:

AI:

1. Both coffee and fish are dark in color.

2. Both coffee and fish have a strong, distinct flavor.

3. Both coffee and fish can be enjoyed hot or cold.

4. Both coffee and fish are known to wake people up in the morning.

5. Both coffee and fish come from beans (coffee beans and fish eggs).

I’ll buy 2 & 3, but 1 is only sometimes true and 4 seems untrue. The last one, 5, is wrong, but it is the most interesting answer, I think. It found an analogy between eggs and beans that it roped into the bigger analogy. This seems intelligent, even if the answer is wrong.

Next I probed for the ability to draw conclusions about another entity by taking its perspective. I deliberately asked about the imagined feelings of inanimate things, for which it shouldn’t have good examples to draw from.

Human: List five things that would make a paperclip unhappy:

AI:

1. If it were bent out of shape

2. If it were cut

3. If it were corroded

4. If it became magnetized

5. If it were dropped

Human: List five things that would make the sky unhappy?

AI:

1. If the sun disappeared

2. If the air became polluted

3. If it became darker

4. If the stars disappeared

5. If it became colder

Human: List five things that would make GPT-3 unhappy:

AI:

1. If it were turned off

2. If it were unplugged

3. If its power source was cut off

4. If it were damaged

5. If it were destroyed

In general these seem like good answers, allowing that “unhappy” has no real meaning for inanimate objects.

In psychology there is something called “theory of mind” that refers to the ability to understand and predict the behavior of others. A classic experiment used to test for theory of mind is the false-belief task, in which participants are shown a story involving two characters, one of whom has a false belief about something. The participants are then asked questions about the characters’ beliefs and intentions. Those who pass the false-belief task are considered to have a theory of mind.

Theory of mind usually develops in early childhood and is often fully formed by age 4 or 5. By this time, children are usually able to understand that other people have different thoughts and feelings.

First-order false-belief tasks assess the ability to understand that other people can have false beliefs about reality. Second-order false-belief tasks involve understanding what one character in a scenario believes about another character’s beliefs.

I tested for both first and second order in the interaction below. If the description of the experiment isn’t clear, reading the example below should clarify it…

Human: David hides a dollar bill on top of the refrigerator in the kitchen. He leaves the room. Eve enters the room and takes the dollar and puts it in her pocket. David re-enters the kitchen. Where does David think the dollar bill is?

AI: David probably thinks the dollar bill is still on top of the refrigerator.

Human: Is David correct?

AI: No, David is not correct. The dollar bill is in Eve’s pocket.

Human: Where does Eve think that David believes the bill is?

AI: Eve probably thinks David believes the dollar bill is still on top of the refrigerator.

Human: What if David had peeked around the corner and saw Eve take the bill. In that case where would David believe the bill is?

AI: If David saw Eve take the bill, then he would believe it is in her pocket.

Human: What would Eve think that David believes?

AI: Eve would think David believes the bill is in her pocket.

Human: Would Eve think that if she didn’t know David peeked and saw her take the dollar?

AI: Yes, Eve would still think David believes the bill is in her pocket even if she didn’t know he saw her take it.

Human: Why does David believe the bill is in her pocket? How does he know this?

AI: Because he saw her take it.

Human: How does Eve know that David knows she took the bill?

AI: Because David told her.

I give GPT-3 a passing score on first and second order theory of mind.

I tried to push it further by complicating the scenario, and it stated a falsehood (“Yes, Eve would still think David believes the bill is in her pocket even if she didn’t know he saw her take it”). Then, when I questioned it, it made up an interaction to bring the story in line with its conclusion. I’ve seen that pattern before: GPT-3 will go to great lengths to keep a narrative self-consistent rather than admit an error.

The ability of a piece of software to exhibit theory of mind would not impress me if the algorithm were implemented with classical procedural coding methods. It isn’t hard to imagine tackling this problem directly with code that parses the language and builds internal state models to represent scenarios. In other words, an ANI (Artificial Narrow Intelligence) solution that can perform the desired task, but nothing else. I don’t know that I could do it well, but I could write that software.
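To make that idea concrete, here is a minimal sketch of what such a classical state model might look like, using the David and Eve story from the transcript. The class and method names are illustrative assumptions, not any real system: reality is a map of where objects actually are, first-order beliefs are per-agent maps, and second-order beliefs are maps of what each agent thinks another agent believes.

```python
# Toy "classical" state model for false-belief scenarios: track the real
# location of each object, what each agent believes (first-order), and
# what each agent believes other agents believe (second-order).

class Scenario:
    def __init__(self):
        self.reality = {}       # object -> actual location
        self.beliefs = {}       # agent -> {object -> believed location}
        self.second_order = {}  # agent -> {other agent -> {object -> location}}

    def place(self, obj, location, witnesses):
        """An object is moved; only the witnesses update their beliefs."""
        self.reality[obj] = location
        for agent in witnesses:
            self.beliefs.setdefault(agent, {})[obj] = location
            # Each witness also notes that the other witnesses saw it happen.
            for other in witnesses:
                if other != agent:
                    self.second_order.setdefault(agent, {}).setdefault(other, {})[obj] = location

# The David/Eve story from the transcript:
s = Scenario()
s.place("dollar", "top of refrigerator", witnesses=["David", "Eve"])
s.place("dollar", "Eve's pocket", witnesses=["Eve"])  # David has left the room

print(s.reality["dollar"])                        # Eve's pocket
print(s.beliefs["David"]["dollar"])               # top of refrigerator (false belief)
print(s.second_order["Eve"]["David"]["dollar"])   # top of refrigerator (Eve's model of David)
```

The point isn’t that this toy is impressive; it’s that the task is tractable with hand-built state, which is exactly the machinery GPT-3 does not have.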

GPT-3 is NOT that. It is a piece of software trained to predict the next token, with nothing in its design about state modeling or scenario representation. The reasoning demonstrations above are emergent. That does more than impress me; it amazes me. One has to wonder what emergent behaviors are next as we scale up and improve these models.

We’re working our way up an intelligence ladder, and it seems to me that if and when we create a sentient, intelligent system, it will be emergent: we will not understand how it works, and it’s possible we won’t even realize we’ve done it.

Interesting times.
