06.11.2018 Phonetics

Drawing the Voice

By decoding voices, forensic phoneticians provide important clues for the police or secret services. UZH phonetician Volker Dellwo hopes that in the future it will be possible to make identikit images based on voices.

Marita Fuchs

Listening: What does the voice give away? Forensic phoneticians draw clues about a person from their voice. (Illustration: iStock, alashi)

Around four years ago, terrorists posted a video on YouTube showing a person being beheaded. One of the murderers appeared in the video, and although he wore a mask, his voice could be heard quite clearly. The police and secret services searched frantically for the perpetrator, but in the end it was the phoneticians who gave them the crucial piece of information: from his voice it was clear that the man came from East London. This decisive clue brought the detectives a significant step closer to finding the wanted man.

Volker Dellwo, professor at the UZH Institute of Computational Linguistics, is much in demand with police and public prosecutors, as he’s an expert in forensic voice analysis. A person’s voice is actually a biometric characteristic. When we talk, we use around 200 muscles. The sound of each person’s voice is formed through different physiological characteristics, such as the size of the larynx, the volume of the mouth and the length of the vocal cords. The vibration of the vocal cords, the shape of the tongue and the size of the jaw also affect the sound of the voice.

Identikit images of the lower half of the face

Phoneticians can estimate from the voice how old the speaker is, whether they’re male or female, how tall they are, and where they come from. Characteristics such as gaps in the teeth or clicking noises from dentures can also give helpful clues. If a blackmailer puts a cloth over the mouthpiece of the phone, their voice can still be recognized. It’s even possible to recognize a voice in a recording in which snippets of other voices are mixed in.

Dellwo’s vision is that it will be possible in the future to draw facial composites (identikits) of the lower part of the face based on people’s voices. This would be possible because certain characteristics that affect the voice are very difficult for the speaker to manipulate, such as the length of the jaw or the size of the inside of the mouth. Partial composites would give the police another method of identifying criminals. However, it is not just anatomical differences that make each voice different. Learned features also play a role – for example, speaking a particular dialect or belonging to a certain social group affects the tone and modulation of the voice.

Clues but not proof

Using their expertise, phoneticians can provide crucial clues for police, prosecutors and secret services. “But a person’s voice is not as unique as a fingerprint or DNA,” cautions Dellwo. “We are identifying probabilities.” This is because voices, like faces, vary greatly – changing according to our emotions, the time of day, or with age, for example.

Fingerprints and DNA, however, remain the same throughout our lives. To be able to clearly identify someone from their voice, many different recordings are required. Dellwo works with signal-processing programs whose algorithms are designed to recognize patterns and identify voices.

Unconscious variation

The software uses spoken-language and communication characteristics to identify the voice – a very difficult task, as the following example shows: if three or four people are speaking on the radio, they have to make their voices distinguishable so that listeners can tell who is who.

“Our hypothesis is that people make their voices different from other people’s as soon as they are in a group situation, without being aware that they are doing it.” That is, your voice changes in connection with the people around you. It is this variation of the vocal characteristics that makes voices so difficult to identify with certainty. This variation has so far been underestimated, says Dellwo, and research in the area is only just getting started.

There are some voices that we think we know very well, such as those of German Chancellor Angela Merkel or former US President Barack Obama. But in fact, Dellwo points out, Angela Merkel’s voice at home might sound quite different to the one she uses in parliament. “In order to be able to reliably identify a voice, we need to know many of the voice’s features,” says Dellwo. That forms the basis for a forensic analysis.

Dellwo believes that the voice will become an increasingly important element of detective work in the future. However, in order to be able to compare voices, it is also necessary to document them. Unfortunately, says Dellwo, though some countries have started voice databases, there are not yet any such databases in Switzerland.

Marita Fuchs, Editor UZH News. English translation by Caitlin Stephens, UZH Communications.

Additional Information

Volker Dellwo, Institute of Computational Linguistics
