The gift of the gab

Why did humans learn to speak languages, while other primates never got beyond grunting? It's all down to our unique filing system for words, scientists have found

Kate Ravilious
Monday 24 March 2003 01:00 GMT

Japanese, English, Swahili and Hungarian might sound completely unrelated, but it turns out that they have more in common than most of us realise. Researchers have identified a pattern in all human languages, and it now appears that this pattern may explain the very origins of how we began to talk.

How we progressed from being cavemen grunting at each other to sophisticated social creatures who can discuss anything from making a fire to Wagner's operas has always been a mystery. What is particularly strange is that there is no evidence of an intermediate stage of language; it appears that one morning we tumbled out of our trees and suddenly started talking. Our closest relatives – apes such as chimpanzees and gorillas – communicate using a limited number of signals, with about 30 different signals being the maximum. All humans speak a language with a massive vocabulary and are able to convey detailed, precise messages to each other. Why did the other primates never get beyond the grunting stage – and what made us leap forward with our language skills?

For a number of years, scientists have recognised that all human languages follow a pattern, characterised by the frequency of different words. This is known as Zipf's law, named after the Harvard linguistics professor George Kingsley Zipf, who died in 1950. Our speech is peppered with small, ambiguous words (such as "the", "be", and "to"), but only lightly scattered with longer, more specific words (such as "elephant", "staircase", "saucepan"). If you were to take all the words in a book and draw a graph of the number of times each word appeared, you would draw a steeply dropping curve all the way from the numerous common words to the "one off" appearances of obscure words. And, curiously, you would get the same-shaped curve for any text in any language.
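The counting exercise described above is easy to try. The short Python sketch below is an illustration only, not anything taken from the researchers' work: the function name `rank_frequency` and the sample sentence are invented for the example, and the way it splits text into words is a deliberately crude assumption. It tallies how often each word appears and lists the words from most to least common. In the usual statement of Zipf's law, a word's count falls roughly in inverse proportion to its rank in that list, though a text as short as the sample here can only hint at the shape.

```python
import re
from collections import Counter

def rank_frequency(text):
    """Return (rank, word, count) tuples, most frequent word first.

    Word splitting is a simplification: lower-case runs of letters
    and apostrophes are treated as words.
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    # most_common() sorts by count, highest first; ties keep first-seen order
    return [
        (rank, word, count)
        for rank, (word, count) in enumerate(counts.most_common(), start=1)
    ]

# Invented sample text, far too short to show the full curve
sample = "the cat and the dog and the bird saw the zither"

for rank, word, count in rank_frequency(sample):
    print(rank, word, count)
```

Run on a whole book instead of one sentence, and plotted with rank along one axis and count along the other, the output should trace the steeply dropping curve described above.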

But is Zipf's law just an unusual quirk of language, or does it tell us something more fundamental? Recently, two physicists decided to take a different approach to looking at language evolution. Ramon Ferrer of the Universitat Pompeu Fabra in Barcelona, and Ricard Sole from the Santa Fe Institute in New Mexico, developed a mathematical model of how language evolves, and considered how much effort is required both to speak and listen. When they ran their model, Zipf's law emerged as a natural consequence of language evolution and proved to be the most efficient way of communicating. "It shows how the speaker and listener compromise to be able to communicate," says Ferrer. Indeed, it begins to shed light on the foundations of human chatter.

The obvious way to communicate is to create individual signals and sounds that each have one meaning. But this one-to-one language structure soon runs into problems: beyond a certain point, our brains cannot remember all the different words and what they mean. This one-to-one method of communicating is where chimpanzees, gorillas and other apes have remained. But at some point in the past, humans developed a cunning way to remember more words, and this set us on the path to the language we speak today.

In the same way that it is easier to remember how to bake a cake that we have baked many times before, it is also easier to recall words that we have heard many times previously. Knowing that a "zither" is a type of stringed instrument may come in handy for crossword puzzles, but it is not common in everyday conversations. So most of us don't tend to remember the meaning of words such as "zither". On the other hand, a word such as "the" is incredibly useful for constructing sentences and crops up frequently in conversation, but it won't score you many intellectual points if you use it at a dinner party.

Words such as "the" are essential. The more we hear and use a word, the more likely we are to remember it. Unconsciously, we prioritise which words we remember. This system – remembering words by their familiarity – is the most efficient way to memorise them, and explains why human languages always have a structure that follows Zipf's law.

Ferrer and Sole's model showed that human language lies between two extreme states. At one end of the scale is a complete lack of communication where we all make noises but no one understands because there is no logic to what the sounds mean. And at the other end lies "perfect communication", where every possible object, action and feeling has a special word assigned to it. The downside to perfect communication is that it requires lots of memory and an efficient indexing system, plus the ability to make lots of different sounds.

Animal languages and artificial computer languages are both at the "perfect communication" end of the scale. For apes and computers, every signal they use has an unambiguous meaning. Computers make use of huge memories to store millions of different words and can communicate very precise messages. Meanwhile, apes have a limited but very precise language, which means that they can tell each other they are hungry, but have difficulty discussing the finer points of the harmonics in Beethoven's Third Symphony.

Human language teeters between the extremes of nonsense and perfect clarity, continually balancing the need to communicate with the need to understand. We sometimes say ambiguous things, but using words with multiple meanings allows us to construct more sentences and convey a greater variety of messages than we would otherwise be able to.
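The balancing act can be made concrete with a toy calculation. The Python sketch below is not Ferrer and Sole's model. It rests on several assumptions of its own: every meaning is equally likely to come up, the speaker's effort is measured by how varied the signals are (the entropy of signal use, in bits), and the listener's effort by how much guesswork remains once a signal is heard. The function name `efforts` and the three example vocabularies are invented for illustration.

```python
import math
from collections import Counter

def efforts(mapping):
    """Return (speaker_effort, listener_effort) in bits.

    mapping: dict of meaning -> signal. Assumes every meaning is
    equally likely to be needed (a simplification for this sketch).
    """
    n = len(mapping)
    # How many meanings share each signal
    signal_counts = Counter(mapping.values())
    # Speaker effort: entropy of signal use (0 if one signal does everything)
    speaker = sum((c / n) * math.log2(n / c) for c in signal_counts.values())
    # Listener effort: average uncertainty about the meaning after hearing a signal
    listener = sum((c / n) * math.log2(c) for c in signal_counts.values())
    return speaker, listener

# Hypothetical vocabularies covering the same four meanings
one_to_one = {"fire": "a", "water": "b", "food": "c", "danger": "d"}
one_for_all = {"fire": "a", "water": "a", "food": "a", "danger": "a"}
in_between = {"fire": "a", "water": "a", "food": "b", "danger": "c"}

for name, vocab in [("one-to-one", one_to_one),
                    ("one-for-all", one_for_all),
                    ("in-between", in_between)]:
    print(name, efforts(vocab))
```

In this toy, the one-to-one vocabulary leaves the listener nothing to guess but makes the speaker work hardest, a single all-purpose signal does the reverse, and the vocabulary with one ambiguous word splits the burden between the two.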

So what forced humans into using a Zipf's-law system of remembering words? "The change from grunting to chattering was quite abrupt," Ferrer believes. An environmental change, for example, may have caused humans suddenly to have a greater need to discuss things. "Because they didn't have any more memory space for new words they started using Zipf's law to keep the vocabulary size constant, while still managing to incorporate more meanings," Ferrer says.

And Zipf's law is not only a clever way to memorise words; it has also proved to be a useful tool in analysing language. Scientists have been using Zipf's law to identify plagiarism and to spot authors who write under pen names.

Just as we all have unique fingerprints, we also all have unique ways of speaking. Some of us have favourite words we like to use; others speak in florid sentences; and some just want to make sure that they are understood. Although everyone's speech and writing obeys Zipf's law, we all have slightly different wiggles on the graph of our word-frequency distribution. These nuances of speech are enough to differentiate one person's speech or writing from another's.

Two Indian scientists used Zipf's law to analyse the works of Shakespeare. It seems that the Bard was not as prolific a writer as his canon suggests, and that in fact a number of people wrote under the name Shakespeare. "When Shakespeare's complete works were analysed, the word distribution no longer followed Zipf's law," Ferrer explains. "This is probably due to combining the vocabularies from a number of different people, and it suggests that there were multiple authors writing under the name of Shakespeare."

So even Shakespeare can't escape scrutiny. And Zipf's law may yet uncover more secrets. Authors can no longer guarantee anonymity, and plagiarism has become easier to spot. Meanwhile, Ferrer's work confirms that nattering to your neighbour is a truly human characteristic, and that the tussle between communicating detail and ease of understanding still goes on.

The compromise is to use a Zipf's law structure that keeps language finely balanced, like a tightrope walker. At some point, many thousands of years ago, we tiptoed on to the tightrope and started to use Zipf's law. Language flourished – and we have been gossiping ever since.
