A noninvasive brain-computer interface capable of converting a person’s thoughts into words could one day help people who have lost the ability to speak as a result of injuries like strokes or conditions including ALS.
In a new study, published in Nature Neuroscience today, a model trained on functional magnetic resonance imaging scans of three volunteers was able to predict whole sentences they were hearing with surprising accuracy—just by looking at their brain activity. The findings demonstrate the need for future policies to protect our brain data, the team says.
Speech has been decoded from brain activity before, but the process typically requires highly invasive electrode devices to be embedded within a person’s brain. Other noninvasive systems have tended to be restricted to decoding single words or short phrases.
This is the first time whole sentences have been produced from noninvasive brain recordings collected through fMRI, according to the interface’s creators, a team of researchers from the University of Texas at Austin. While normal MRI takes pictures of the structure of the brain, functional MRI scans measure changes in blood flow in the brain, showing which parts are activated by certain activities.
First, the team trained GPT-1, a large language model developed by OpenAI, on a data set of English sentences sourced from Reddit, 240 stories from The Moth Radio Hour, and transcriptions of the New York Times’s Modern Love podcast.
The researchers wanted the narratives to be interesting and fun to listen to, because that was more likely to produce good fMRI data than something that left the participants bored.
“We all like to listen to podcasts, so why not lie in an MRI scanner listening to podcasts?” jokes Alexander Huth, assistant professor of neuroscience and computer science at the University of Texas at Austin, who led the project.
During the study, three participants each spent 16 hours in an MRI scanner listening to different episodes of the same podcasts, plus a couple of TED talks. The idea was to collect a wealth of data, which the team says is over five times larger than the language data sets typically used in language-related fMRI experiments.
The model learned to predict the brain activity that hearing certain words would trigger. To decode, it guessed sequences of words and checked how well each guess fit the recordings: it predicted how the brain would respond to the guessed words, and then compared that prediction with the actual measured brain responses.
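For readers who want to see the guess-and-check idea in concrete terms, here is a minimal sketch in Python. It is not the researchers’ code. The word features, the encoding weights, and the tiny vocabulary are all invented for illustration, the guesses are drawn from every word in that vocabulary rather than proposed by a language model, and squared error stands in for whatever comparison the team actually used.

```python
# Hypothetical two-number "meaning" features per word (made up for this sketch).
FEATURES = {
    "i": (0.1, 0.9),
    "she": (0.2, 0.8),
    "drive": (0.9, 0.1),
    "cook": (0.3, 0.3),
    "yet": (0.5, 0.5),
}

# Hypothetical encoding-model weights: three voxels, two features each.
WEIGHTS = [(1.0, 0.0), (0.0, 1.0), (0.5, -0.5)]


def predict_response(words):
    """Encoding model: predict one three-voxel pattern per word, concatenated."""
    response = []
    for word in words:
        feats = FEATURES[word]
        for row in WEIGHTS:
            response.append(sum(w * f for w, f in zip(row, feats)))
    return response


def score(words, measured):
    """Negative squared error between predicted and measured responses.

    Higher is better; only the part of the recording covered by the guess is used.
    """
    predicted = predict_response(words)
    return -sum((p - m) ** 2 for p, m in zip(predicted, measured))


def decode(measured, n_words, beam_width=2):
    """Beam search: extend the best guesses one word at a time."""
    beams = [[]]
    for _ in range(n_words):
        # Assumption: every vocabulary word is a candidate continuation.
        candidates = [beam + [word] for beam in beams for word in FEATURES]
        candidates.sort(key=lambda c: score(c, measured), reverse=True)
        beams = candidates[:beam_width]
    return beams[0]


# Simulate a noise-free "recording" and try to recover the words behind it.
recording = predict_response(["she", "drive", "yet"])
decoded = decode(recording, n_words=3)
```

In this toy setup the simulated recording is decoded back to "she", "drive", "yet", because that guess is the only one whose predicted response matches the recording exactly. Real fMRI data is noisy and slow, which is one reason the actual decoder tends to recover the gist of a sentence rather than its exact wording.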
When the researchers tested the model on new podcast episodes, it was able to recover the gist of what participants were hearing just from their brain activity, often identifying exact words and phrases. For example, one participant heard the words “I don’t have my driver’s license yet.” The decoder returned the sentence “She has not even started to learn to drive yet.”
In a separate experiment, the researchers showed the participants short Pixar videos that didn’t contain any dialogue and recorded their brain responses, to test whether the decoder could recover the general content of what a participant was watching. It turned out that it could.
Romain Brette, a theoretical neuroscientist at the Vision Institute in Paris who was not involved in the experiment, is not wholly convinced by the technology’s efficacy at this stage. “The way the algorithm works is basically that an AI model makes up sentences from vague information about the semantic field of the sentences inferred from the brain scan,” he says. “There might be some interesting use cases, like inferring what you have dreamed about, on a general level. But I’m a bit skeptical that we’re really approaching thought-reading level.”
It may not work so well yet, but the experiment raises ethical issues around the possible future use of brain decoders for surveillance and interrogation. With this in mind, the team set out to test whether you could train and run a decoder without a person’s cooperation. They did this by trying to decode perceived speech from each participant using decoder models trained on data from another person. They found that these decoders performed “barely above chance.”
This, they say, suggests that a decoder couldn’t be applied to someone’s brain activity unless that person was willing and had helped train the decoder in the first place.
“We think that mental privacy is really important, and that nobody’s brain should be decoded without their cooperation,” says Jerry Tang, a PhD student at the university who worked on the project. “We believe it’s important to keep researching the privacy implications of brain decoding, and enact policies that protect each person’s mental privacy.”