A woman made her AI voice clone say “arse.” Then she got banned.
This article first appeared in The Checkup, MIT Technology Review’s weekly biotech newsletter.
Over the past couple of weeks, I’ve been speaking to people who have lost their voices. Both Joyce Esser, who lives in the UK, and Jules Rodriguez, who lives in Miami, Florida, have forms of motor neuron disease—a class of progressive disorders that result in the gradual loss of the ability to move and control muscles.
It’s a crushing diagnosis for everyone involved. Jules’s wife, Maria, told me that once it was official, she and Jules left the doctor’s office gripping each other in floods of tears. Their lives were turned upside down. Four and a half years later, Jules cannot move his limbs, and a tracheostomy has left him unable to speak.
“To say this diagnosis has been devastating is an understatement,” says Joyce, who has bulbar MND—she can still move her limbs but struggles to speak and swallow. “Losing my voice has been a massive deal for me because it’s such a big part of who I am.”
AI is bringing back those lost voices. Both Jules and Joyce have fed recordings of their old voices into an AI tool built by ElevenLabs to re-create them. Today, they can “speak” in their old voices by typing sentences into devices, selecting letters by hand or eye gaze. It’s been a remarkable and extremely emotional experience for them; both thought they’d lost their voices for good.
But speaking through a device has limitations. It’s slow, and it doesn’t sound completely natural. And, strangely, users might be limited in what they’re allowed to say.
Joyce doesn’t use her voice clone all that often. She finds it impractical for everyday conversations. But she does like to hear her old voice and will use it on occasion. One such occasion was when she was waiting for her husband, Paul, to get ready to go out.
Joyce typed a message for her voice clone to read out: “Come on, Hunnie, get your arse in gear!!” She then added: “I’d better get my knickers on too!!!”
“The next day I got a warning from ElevenLabs that I was using inappropriate language and not to do it again!!!” Joyce told me via email (we communicated with a combination of email, speech, text-to-voice tools, and a writing board). She wasn’t sure what had been inappropriate, exactly. It’s not as though she’d used any especially vile language—just, as she puts it, “normal British banter between a couple getting ready to go out.”
Joyce assumed that one of the words she’d used had been automatically flagged up by “the prudish American computer,” and that once someone from the ElevenLabs team had assessed the warning, it would be dismissed.
“Well, apparently not, because the next day a human banned me!!!!” says Joyce. She says she felt mortified. “I’d just got my voice back and now they’d taken it away from me … and only two days after I’d done a presentation to my local MND group telling them how amazing ElevenLabs were.”
Joyce contacted ElevenLabs, which apologized and reinstated her account. But it’s still not clear why she was banned in the first place. When I first asked Sophia Noel, a company representative, about the incident, she directed me to the company’s prohibited use policy.
There are rules against threatening child safety, engaging in illegal behavior, providing medical advice, impersonating others, interfering with elections, and more. But there’s nothing specifically about inappropriate language. I asked Noel about this, and she said that Joyce’s remark was most likely interpreted as a threat.
ElevenLabs’ terms of use state that the company does not have any obligation to screen, edit, or monitor content but add that it may “terminate or suspend” access to its services when content is “reasonably likely, in our sole determination, to violate applicable law or [the user] Terms.” ElevenLabs has a moderation tool that “screens content to ensure it aligns with our Terms of Service,” says Dustin Blank, head of partnerships at the company.
The question is: Should companies be screening the language of people with motor neuron disease?
After all, that’s not how other communication devices for people with this condition work. People with MND are usually advised to “bank” their voices as soon as they can—to record set phrases that can be used to create a synthetic voice that sounds a bit like them, albeit a somewhat robotic version. (Jules recently joked that his sounded like “a Daft Punk song at quarter speed.”)
Banked voices aren’t subject to the same scrutiny, says Joyce’s husband, Paul. “Joyce was told … you can put whatever [language] you want in there,” he says. Voice banking wasn’t an option for Joyce, whose speech had already deteriorated by the time she was diagnosed with MND. Jules did bank his voice but doesn’t tend to use it, because the voice clone sounds so much better.
Joyce doesn’t hold a grudge—and her experience is far from universal. Jules uses the same technology, but he hasn’t received any warnings about his language—even though a comedy routine he performs using his voice clone contains plenty of curse words, says his wife, Maria. He opened a recent set by yelling “Fuck you guys!” at the audience—his way of ensuring they don’t give him any pity laughs, he joked. That comedy set is even promoted on the ElevenLabs website.
Blank says language like that used by Joyce is no longer restricted. “There is no specific swear ban that I know of,” says Noel. That’s just as well.
“People living with MND should be able to say whatever is on their mind, even swearing,” says Richard Cave of the MND Association in the UK, who helps people with MND set up their voice clones. “There’s plenty to swear about.”
Now read the rest of The Checkup
Read more from MIT Technology Review’s archive
You can read more about how voice clones are re-creating the voices of people with motor neuron disease in this story.
Researchers are working to create realistic, brain-implant-controlled avatars of people who have lost the ability to speak because of strokes or amyotrophic lateral sclerosis (ALS). Last year, two such individuals were able to use these devices to speak at a rate of around 60 to 70 words per minute—about half the rate of typical speech, but more than four times faster than had previously been achieved using a similar approach.
Other people with ALS who are locked in—completely paralyzed but cognitively able—have used brain implants to communicate, too. A few years ago, a man in Germany used such a device to ask for massages and beer, and to tell his son he loved him.
Several companies are working on creating hyperrealistic avatars. Don’t call them deepfakes—they prefer to think of them as “synthetic media,” writes my former colleague Melissa Heikkilä, who created her own avatar with the company Synthesia.
ElevenLabs’ tool can be used to create “humanlike speech” in 32 languages. Meta is building a model that can translate over 100 languages into 36 other languages.
From around the web
Covid-19 conspiracy theorists—some of whom believe the virus is an intentionally engineered bioweapon—will soon be heading US agencies. Some federal workers are worried they may be out for revenge against current and former employees. (Wired)
Cats might have spread bird flu to humans—and vice versa. That’s according to data from the US Centers for Disease Control and Prevention, which published the finding but then abruptly removed it. (The New York Times)
And a dairy worker is confirmed to have been infected with a second strain of bird flu that more recently spilled over from birds to cows. The person’s only symptom was conjunctivitis. (Ars Technica)
Health officials in states with abortion bans are claiming that either few or zero abortions are taking place. The claims are “ludicrous,” according to doctors in those states. (KFF Health News)
A judge in the UK has warned women against accepting sperm donations from a man who claims to have fathered more than 180 children in several countries. Robert Charles Albon, who calls himself Joe Donor, has subjected a female couple to a “nightmare” of controlling behavior, the judge said. (The Guardian)