Trevor Cox Explores Science and Overall Evolution of Sound and Human Voice

Inventions like sound recording and the synthesized voice have changed human communication forever, and while advances in artificial intelligence hint at an even greater transformation, Trevor Cox can’t help but notice the amplified consequences of hearing more.

Cox, author and professor of acoustic engineering at Salford University, gave his lecture, “Now You’re Talking,” at 10:45 a.m. Monday, July 22 in the Amphitheater, opening Week Five, “The Life of the Spoken Word.”

Cox started with a demonstration of how the human voice works. To begin, he had the audience put their hands on their throats and make two sounds: an “E” and an “S.” The “E” caused vibration, while the “S” did not. That is because vowel sounds originate in the larynx, where the vocal folds vibrate.

“When you make the ‘E’ sound, you push air out of your lungs, and then you break that air up with the movement of the vocal folds,” Cox said. “That gives you variation and little pulses of pressure, and that’s what a sound wave is.”

Vocal folds can open and close 100 or more times a second.
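Cox’s “little pulses of pressure” can be sketched numerically. A minimal toy model — this is an illustration, not Cox’s own model — treats the voice source as a train of pressure puffs, one released each time the folds open:

```python
import numpy as np

# Toy glottal source: vocal folds vibrating at f0 Hz release f0 brief
# puffs of air pressure per second (an illustrative sketch only).
def glottal_pulse_train(f0=100.0, duration=0.05, sample_rate=44_100):
    t = np.arange(int(duration * sample_rate)) / sample_rate
    phase = (t * f0) % 1.0   # position within each vibration cycle, 0..1
    open_fraction = 0.4      # assume the folds are open ~40% of each cycle
    # one smooth half-cosine "puff" of pressure while the folds are open
    pulse = np.where(phase < open_fraction,
                     0.5 * (1.0 - np.cos(2.0 * np.pi * phase / open_fraction)),
                     0.0)
    return t, pulse

t, pulse = glottal_pulse_train(f0=100.0, duration=0.05)
# count pulse onsets: samples where the pressure rises from zero
onsets = int(np.sum((pulse[:-1] == 0.0) & (pulse[1:] > 0.0)))
print(onsets)  # → 5 puffs in 0.05 s at 100 Hz
```

A real glottal waveform is far messier than this, but the pulse-train picture is the essence of Cox’s description: regular interruptions of airflow become the pressure variations we hear as sound.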

“One of the remarkable things about the human voice is how robust it is,” he said. “Because we really hammer it, (the folds) have to be moving fast all the time.”

Vocal folds also determine pitch: stretching them longer and thinner raises the pitch of the sound. Lighter tissue vibrates at higher frequencies, so when a speaker changes the length of the vocal folds, they are really changing how much mass is in motion.
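The mass-and-frequency trade-off Cox describes is the same one that governs a vibrating string. As a rough illustration, here is the textbook string formula in code; the lengths, tensions and masses below are made-up numbers, not measurements of real vocal folds:

```python
import math

# Textbook vibrating-string model as a stand-in for vocal-fold pitch
# (real folds are far more complex; numbers here are illustrative):
# f = (1 / 2L) * sqrt(T / mu), where L is length, T is tension and
# mu is mass per unit length. Less vibrating mass -> higher frequency.
def string_frequency(length_m, tension_n, mass_per_m):
    return (1.0 / (2.0 * length_m)) * math.sqrt(tension_n / mass_per_m)

f_heavy = string_frequency(0.016, 0.5, 0.002)   # heavier "folds"
f_light = string_frequency(0.016, 0.5, 0.0005)  # a quarter of the mass
print(round(f_light / f_heavy, 2))  # → 2.0: quartering the mass doubles the pitch
```

The square root is why mass matters so much: cutting the vibrating mass to a quarter doubles the frequency, which is the relationship behind thinner, stretched folds producing higher notes.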

However, on their own, the only sound vocal folds make is an incoherent, buzzing noise. Cox showed a video of the throat of an opera singer, observed with an MRI scanner, to stress the importance of other throat and mouth muscles, like the tongue, in making sound and forming diction.

“If I was to show you an MRI scan of you talking, you would look really similar,” he said. “You have the same amazing flexibility in your system.”

But when did humans evolve to have the ability to speak? Cox believes it began with Neanderthals, though it’s an assumption that is impossible to prove. Archaeology, which heavily relies on fossils, is not a resource for his research because the soft tissue in vocal anatomy doesn’t fossilize.

In an attempt to narrow down when speech began, Cox compared a chimpanzee, an animal that can’t speak, to humans, animals that can. Over the years, researchers have tried to teach chimpanzees to speak, but in the end found that chimps could communicate only through gestures.

What is stopping them? Cox said it is the anatomy of the larynx. In humans, the larynx sits lower than it does in chimpanzees.

“There is a lot of discussion around why it’s lower, but speaking fluidly is one reason why,” Cox said. “You saw that flexibility of the tongue. By moving the larynx out of the way, the tongue has a much greater ability to change the shape of the throat and the mouth and to speak more rapidly.”

William Fitch, an evolutionary biologist and cognitive scientist at the University of Vienna, studied chimpanzees and determined that the animals actually could speak if their brains could control their vocal anatomy.

“The conclusion of this paper is that probably, chimpanzees could speak; the vocal anatomy is not what’s limiting, it’s the brain that’s limiting,” Cox said.

Unfortunately, the brain doesn’t fossilize either, so Fitch’s findings don’t transfer to Neanderthals. Cox said the only other evidence available is symbolic thought, which Neanderthals displayed through cave paintings.

“It seems that Neanderthals were making art, which means they were doing things beyond just surviving, which means they were thinking beyond just surviving, and that makes it more likely that Neanderthals talked,” he said.

Acoustics also mattered in ancient monuments, such as Stonehenge in Wiltshire, England.

“If you think of any human ceremony you’ve been involved with, it involves singing, talking, music — it involves sound,” Cox said. “Therefore, the acoustics of old spaces would have been important to how they were used.”

Studying sound at Stonehenge was difficult because so many of the original stones are missing. To replicate how it once sounded, Cox created a computer model and a scale model of its earliest design. The scale model was tested two weeks ago, and Cox played two videos to show the difference in sound quality: the first was an orchestral piece without any stones, and the second was the same piece with the stones in place. When the stones were added, the sound became “deeper and richer.”
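Acoustic scale models like Cox’s rely on a simple rule: shrink the geometry by a factor k and the test frequencies must rise by the same factor k, so that the wavelengths shrink in proportion and reflections behave as they would at full size. A minimal sketch — the 1:12 ratio below is an illustrative assumption, not necessarily Cox’s exact setup:

```python
# Scale-model rule of thumb in acoustics: a space shrunk by a factor k
# must be tested with frequencies scaled up by k, so that the ratio of
# wavelength to geometry (and hence reflections and resonances) is preserved.
def model_test_frequency(full_scale_hz: float, scale_factor: int) -> float:
    return full_scale_hz * scale_factor

# Illustrative numbers (assumed, not Cox's measurements): a 250 Hz note
# in a 1:12 scale model must be played at 3 kHz.
print(model_test_frequency(250.0, 12))  # → 3000.0
```

The recordings made inside the model are then slowed back down by the same factor, which is how scale-model tests can be played back as audible comparisons.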

Cox said the greatest developments in the voice throughout history have coincided with developments in technology.

“Inventions like the phonograph, the microphone and the telephone changed our relationship to the voice and changed the human voice,” he said.

To convey the importance of the microphone, Cox played an example of Freddie Mercury and Montserrat Caballé singing “Barcelona.” Caballé was a professional opera singer trained to amplify her own sound with vocal techniques, while Mercury had always performed with the help of a microphone, making it far easier for him to be heard in large venues.

“The reason we have such diverse, modern singing styles is because to sing to a large arena now, all you need to do is sing into a microphone (close) to your mouth, and the sounds can be amplified,” Cox said. “This means that Freddie Mercury can whisper, he can shout, or even talk-sing and all of that works. You can’t do that as an opera singer.”

Voices also change with age. Cox played an example of Queen Elizabeth II, comparing her first Christmas message, in 1957, to her Christmas message in 2017. The recordings showed that her voice has become noticeably lower.

“That’s a natural aging process,” he said. “As females get older, their voices tend to slowly go down in pitch.”

Cox said deeper tones in women’s voices can also be due to cultural changes. A study comparing recordings of women’s voices from the 1940s with recordings made today concluded that women’s voices have become deeper. Cox said this is because more women are in their “rightful leadership roles.”

“As they assume more leadership roles, their pitch is lowered and this is the sad thing: They have lowered their pitch to sound more like a man,” he said. “It’s sad because it’s basically based off the fact that your brain makes suggestions about who is likely to be a leader, and because we still have a bad gender imbalance, your brain guesses a man is more likely to be a leader.”

Cox used an example of Kim Kardashian, who speaks with vocal fry, the lowest register of the voice. Cox said Kardashian’s vocal fry annoys listeners, while the vocal fry of actor Vin Diesel does not.

“It’s an interesting and sexist way we respond to voices,” he said.

Although the voice is flexible and constantly changing, accents have remained stable through generations. Cox recalled a study in England that showed a north-south divide in the way people pronounce certain words: in the south, “bath” is said with a long “a,” as in “bah-th,” while in the north it is said with a short “a.” Cox said accents haven’t changed because they are “a part of identity.”

Cox concluded his lecture with a story of a woman named Eugenia whose husband passed away in a car accident. Eugenia uploaded their text messages into an artificial intelligence engine and made a “chat bot” so she could talk to him again. Cox finds this concept fascinating and a little “creepy,” but said it leads into his next point: Voice identity is under threat. Current artificial intelligence technology can use speech synthesis systems to mimic individual voices. According to Cox, this will be used to both “comic and ill effect.”

“We’ve all had emails pretending to be from a loved one who is lost and needs money transferred to a bank account and all that; we are going to start getting voice messages doing exactly the same,” he said. “Unfortunately, with all of these technologies, they get used for ill.”

But the developing technology has advantages, too. For people who want to speak in their natural voice but can’t due to medical reasons, speech synthesis can personalize an artificial voice to sound like their own.

“I can’t imagine anything more important than being able to say to your wife, your husband or your children that you love them, in your own voice,” Cox said.

The author Jamie Landers

Jamie Landers is entering her third season as a reporter for The Chautauquan Daily, covering all things music-related within the online platform. Previously, she recapped the Chautauqua Lecture Series in 2019 and the Interfaith Lecture Series in 2018. In addition, she is a rising senior at The Walter Cronkite School of Journalism in Phoenix, Arizona, where she most recently served as a breaking news reporter for The Arizona Republic, as well as a documentary producer for Arizona PBS.