Evolution of Cooperation[2]: Ape Vocalizations and Gestures
Reading notes on Michael Tomasello’s work. Translated from Chinese and lightly polished with Claude.
This essay focuses on Tomasello’s analysis of the communicative forms used by apes. Apes have two main communicative modalities: vocalization and gesture. Tomasello’s question is interesting: which modality is the more plausible evolutionary precursor of human language — the “intermediate step between ape communication and arbitrary linguistic conventions”?
His answer: gesture.
1. The analysis of vocalization
The central observation is this: over the course of development (ontogeny), apes learn to comprehend a wide range of vocalizations, but their vocal generation shows no comparable flexibility.
What is vocal comprehension? Hearing sound A and knowing that a predator is approaching. What is vocal generation? Intentionally transmitting information through sound (a notion we will sharpen below).
The clearest evidence for the diversity of vocal comprehension is that some apes can even use birds’ alarm calls to identify predators. But Tomasello points out that this kind of learning is associative learning — they learn that A predicts B (A = a certain sound, B = a certain predator).
The clearest evidence for the lack of diversity in vocal generation is that the vocal patterns of apes are nearly identical within a species, even for animals raised by humans, and humans cannot teach them new vocalizations. The conclusion is that ape vocalizations are direct reflections of specific emotional states — hard-wired into the genome.
Vocalizations do influence the behavior of conspecifics (“I call, they run”), but apes do not intend this effect. These calls are just direct outputs of internal emotional states, no more deliberate than a knee-jerk reflex. The direct evidence: when an ape judges itself to be in no danger of predation, it will not call out even when it sees a predator. The call tracks the caller’s own fear, not any need of others to be warned. The only reason vocalizations affect others is that they are broadcast indiscriminately, without a target and without any intention to elicit a particular response.
2. The analysis of gesture
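Unlike vocalizations, ape gestures show substantial individual variation within a group. This variation reveals that ape gestures contain an ontogenetically ritualized component, that is, one shaped by each individual’s own history of interaction, which is what allows them to differ from one individual to the next.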
Ape gestures come in two kinds: intention-movements and attention-getters. An example of each:
Intention-movement. An infant ape pats its mother’s back, then waits for her to lower her back so it can climb up.
Attention-getter. An ape presents its back to a conspecific, hoping to be groomed. Or an ape throws a stone at a conspecific, hoping to play.
You may ask: both look like “purposeful actions” — why distinguish them? The point lies in how each is acquired and in the cognitive capacity that each implies.
2.1 Intention-movements
The acquisition process makes the character of intention-movements easy to understand. Take “patting the back”:
- At first, the infant ape pulls its mother’s back downward in order to climb up.
- Over time, the mother comes to expect that a tug on her back means the infant wants to climb up, and as soon as she feels one, she lowers her back of her own accord.
- Finally, the infant ape notices that whenever it pulls, the mother lowers her back automatically. So it simplifies the gesture into a light pat and waits for the mother to lower her back.
The acquisition process here, ritualization, has one essential feature: the entire action sequence is natural. As the sequence repeats, each side anticipates what comes next and responds to that anticipation, so that what was originally a multi-step chain comes to be triggered by its first step alone. The word “ritualization” already says as much: the gesture is gradually fixed through repeated interaction, not given a priori.
Tomasello does not claim that the apes here exhibit “intention reading” or anything like cooperation. You could say that the mother “understands” the infant as an intentional agent whose goal is to climb up; that she “cooperates” by voluntarily lowering her back; and that the infant “expects” the mother to be a cooperative agent who can read its intention, and therefore communicates that intention with a pat and waits for the response. But the form of this argument — which Tomasello will use repeatedly later — does not really matter here, and he does not dwell on it. We need only note: intention-movements emerge from a natural action sequence, are essentially its compression, and need not involve any complex cognitive process (though they could).
He notes finally that intention-movements are unlikely to be acquired through imitation, since within the same group the specific intention-movements vary widely between individuals.
2.2 Attention-getters
Attention-getters show more clearly that the ape is an intentional agent, that it understands conspecifics as having perception, and that capturing the attention of another is a necessary precondition for triggering action in another. By their nature, attention-getters require the sender to grasp that the receiver will act only after it has “noticed,” and therefore to ensure that the receiver does notice the gesture. This creates a kind of attention to others’ attention.
Tomasello describes the acquisition this way: an ape “accidentally” throws a stone in a conspecific’s direction and notices that the conspecific now attends to it. The ape then begins to exploit this attention-capturing effect. Because the gesture works indirectly (it only draws attention and does not itself specify what the other should do), it generalizes into a tool usable across many situations: play, grooming, and so on. The sender has a social intention that requires another’s participation. The gesture serves a referential intention, directing the other’s attention to the right object, and once the other attends to it, the other acts accordingly.
Apes will also walk up to a conspecific, try several attention-capturing tactics, and continuously check whether attention is being captured. This “try-different-things-until-it-works” behavior reveals goal-directed cognition.
2.3 Summary
Intention-movements are compressions of natural action sequences, and need not involve complex cognition.
Attention-getters require the sender to deliberately capture another’s attention and to understand that the other must notice something before acting on it (i.e., to treat the other as a perceptual agent). Unlike intention-movements, the actions involved are not drawn from a natural action sequence: throwing a stone can be just throwing a stone, but in an attention-getter, the stone is thrown for the purpose of getting attention.
3. Interaction with humans
I find this the most interesting chapter. It may show that apes understand others as intentional, but not as cooperatively intentional.
Apes learn to point for humans, and can capture human attention by doing so. For example, in a zoo, they clap to get food, or point out to their keeper food they cannot reach. Apes do not point for one another. Tomasello’s gloss: when an ape’s environment suddenly becomes more cooperative, the ape understands that humans can be exploited to obtain food. Here, the ape essentially treats humans as a perception-bearing (and therefore attention-bearing) tool in the environment, that is, as part of the environment itself.
A more elegant experiment makes the point sharply (this is a beautiful design):
Setup. Three buckets sit upside-down on a table. One conceals food; the ape does not know which.
Condition 1. The experimenter points at the bucket containing the food.
Condition 2. The experimenter first goes through a round of food competition with the ape, so the ape learns that the experimenter is a rival for food. Then the experimenter reaches a hand toward the bucket with the food.
Result. In Condition 1, even though the ape notices the gesture and looks at the indicated bucket, it ignores the cue and chooses a bucket at random. In Condition 2, the ape opens the bucket the experimenter reached toward.
This experiment shows that the ape treats humans as just another member of its own species — and treats gestures only in their imperative function. When an ape points at food, it is simply ordering the other to hand over the food. So in Condition 1, when the human points at a bucket, the ape thinks the human is merely commanding it to do something with that bucket for the human’s own purposes. The ape does not see the human as trying to help it. The ape does not ask itself, “He pointed at a bucket — is it because this bucket is useful to me?” In Condition 2, the ape understands the human as having an intention to obtain food, understands that the human will act according to that intention, and therefore infers the food’s location from the human’s behavior.
In sum: apes understand others as intentional agents endowed with perception, but not as cooperative agents.
4. Summary
Humans speak with sounds, but the flexibility and complexity of ape communication lie in gesture, not in vocalization.
Apes understand others as intentional, perceptual agents — but not as cooperative ones.