When we speak, we engage nearly 100 muscles, continuously moving our lips, jaw, tongue, and throat to shape our breath into the fluent sequences of sounds that form our words and sentences. A new study by UC San Francisco scientists reveals how these complex articulatory movements are coordinated in the brain.

The new research reveals that the brain's speech centers are organized more according to the physical needs of the vocal tract as it produces speech than by how the speech sounds (its "phonetics"). Linguists divide speech into abstract units of sound called "phonemes" and consider the /k/ sound in "keep" the same as the /k/ in "coop." But because the two are produced with different vocal tract configurations, the brain appears to treat them differently.

The findings, which extend previous studies on how the brain interprets the sounds of spoken language, could help guide the creation of a new generation of prosthetic devices for those who are unable to speak: brain implants could monitor neural activity related to speech production and rapidly translate those signals directly into synthetic spoken language.

A Neural Code for Vocal Tract Movements

In some cases, to prepare for these operations, surgeons place high-density arrays of tiny electrodes onto the surface of patients' brains, a technique known as electrocorticography (ECoG). The arrays serve two purposes: helping to identify the location that triggers the patients' seizures, and mapping out other vital areas, such as those involved in language, to make sure the surgery avoids damaging them.

In the new study, Chartier and Anumanchipalli asked five volunteers awaiting surgery, each with ECoG electrodes placed over the ventral sensorimotor cortex, a key center of speech production, to read aloud a collection of 460 natural sentences.

The sentences were expressly constructed to cover nearly all the possible articulatory contexts in American English. This comprehensiveness was crucial to capture the complete range of "coarticulation," the blending of phonemes that is essential to natural speech.

This approach allowed the researchers to identify distinct populations of neurons responsible for the specific vocal tract movement patterns needed to produce fluent speech sounds, a level of complexity that had not been seen in previous experiments that used simpler syllable-by-syllable speech tasks.

The researchers also identified neural populations associated with specific classes of phonetic phenomena, including separate clusters for consonants and vowels of different types. Their analysis suggested, however, that these phonetic groupings were a byproduct of more fundamental groupings based on different types of muscle movement.
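The intuition behind this kind of analysis can be illustrated with a toy simulation. The sketch below is purely hypothetical and is not the study's actual method or data: it assumes made-up articulatory feature vectors for a few phonemes and random electrode weights, then shows that phonemes sharing the same primary articulator (e.g., /p/ and /b/, both labial) evoke more similar simulated activity patterns than phonemes that differ in articulator (e.g., /p/ vs. the tongue-tip consonant /t/), which is why clustering such activity groups phonemes by movement rather than by abstract phonetic identity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical articulatory feature vectors (lips, jaw, tongue tip, larynx)
# for four phonemes -- illustrative values only, not measured data.
features = {
    "p": np.array([1.0, 0.2, 0.0, 0.0]),  # labial, voiceless
    "b": np.array([1.0, 0.2, 0.0, 0.1]),  # labial, voiced
    "t": np.array([0.0, 0.3, 1.0, 0.0]),  # tongue-tip, voiceless
    "d": np.array([0.0, 0.3, 1.0, 0.1]),  # tongue-tip, voiced
}

# Simulated electrode responses: each of 32 electrodes applies random
# weights to the four articulators, so phonemes produced with the same
# articulator evoke correlated activity across the array.
weights = rng.normal(size=(32, 4))
activity = {ph: weights @ f for ph, f in features.items()}

def corr(a, b):
    """Pearson correlation between two electrode activity patterns."""
    return float(np.corrcoef(a, b)[0, 1])

same_articulator = corr(activity["p"], activity["b"])
diff_articulator = corr(activity["p"], activity["t"])

# Activity patterns group by shared articulator, not by voicing.
print(same_articulator > diff_articulator)
```

In this toy setup, any clustering run on the activity patterns would recover the labial/tongue-tip split first, mirroring the finding that apparent phonetic clusters can fall out of movement-based organization.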

The researchers found that neurons in the ventral sensorimotor cortex were highly attuned to these coarticulatory features of English, suggesting that the brain cells are tuned to produce fluid, context-dependent speech rather than to read out discrete speech segments in serial order.

"During speech production, there is another layer of neural processing that happens, which enables the speaker to merge phonemes into something the listener can understand," said Anumanchipalli.

The Path to a Speech Prosthetic

"This study highlights why we need to take into account vocal tract movements and not just linguistic features like phonemes when studying speech production," Chartier said. He thinks that this work paves the way not only for additional studies that tackle the sensorimotor aspect of speech production but could also pay practical dividends.

Ultimately, the study could open a new research avenue for Chartier and Anumanchipalli's team at UCSF. "It's made me think twice about how phonemes fit in—in a sense, these units of speech that we pin so much of our research on are just byproducts of a sensorimotor signal," Anumanchipalli said.