How DeepMind thinks it can make chatbots safer
To receive The Algorithm in your inbox every Monday, sign up here.
Welcome to The Algorithm!
Some technologists hope that one day we will develop a superintelligent AI system that people will be able to have conversations with. Ask it a question, and it will offer an answer that sounds like something composed by a human expert. You could use it to ask for medical advice, or to help plan a holiday. Well, that’s the idea, at least.
In reality, we’re still a long way away from that. Even the most sophisticated systems of today are pretty dumb. I once got Meta’s AI chatbot BlenderBot to tell me that a prominent Dutch politician was a terrorist. In experiments where AI-powered chatbots were used to offer medical advice, they told pretend patients to kill themselves. Doesn’t fill you with a lot of optimism, does it?
That’s why AI labs are working hard to make their conversational AIs safer and more helpful before turning them loose in the real world. I just published a story about Alphabet-owned AI lab DeepMind’s latest effort: a new chatbot called Sparrow.
DeepMind’s new trick for making a good AI-powered chatbot was to have humans tell it how to behave, and to force it to back up its claims using Google search. Human participants were then asked to evaluate how plausible the AI system’s answers were. The idea is to keep training the AI using dialogue between humans and machines.
In reporting the story, I spoke to Sara Hooker, who leads Cohere for AI, a nonprofit AI research lab.
She told me that one of the biggest hurdles in safely deploying conversational AI systems is their brittleness, meaning they perform brilliantly until they are taken to unfamiliar territory, which makes them behave unpredictably.
“It is also a difficult problem to solve because any two people might disagree on whether a conversation is inappropriate. And even if we agree that something is appropriate right now, this may change over time, or rely on shared context that can be subjective,” Hooker says.
Despite that, DeepMind’s findings underline that AI safety cannot be achieved with a technical fix alone. You need humans in the loop.
In the long term, DeepMind hopes, having people steer the chatbot through dialogue could be a helpful tool for supervising machines.
“We might have a discussion about what a machine is doing in a way that allows us to communicate what we actually want and not miss subtle things,” says Geoffrey Irving, a safety researcher at DeepMind.
DeepMind’s model combines many different strands of safety research, with impressive results. You can read about it here.
But let’s be real. Nobody is building these systems purely because they want customer service bots to have better tools to help you rebook your canceled flight.
AI chatbots are powered by large language models, which learn to produce human-sounding text by training on vast amounts of writing scraped from the internet. They could be a powerful tool for an entirely new form of online search.
There’s a lot of money to be made in improving search, which has really lost its mojo. Google search has become overpersonalized and overcommercialized. It’s also riddled with hidden scams and malware.
Google is extremely anxious, too, about new competitors such as TikTok, which has quickly become Gen Z’s go-to source of information. That company is already offering a kind of search that the Googles of the world are trying to build: type in a question, and you’ll get tons of engaging content featuring actual humans.
But there are legitimate questions about whether AI can ever compete with this, as my colleague Will Heaven wrote last March.
Or as Emily Bender of the University of Washington, who studies computational linguistics and ethical issues in natural-language processing, put it in Will’s story: “The Star Trek fantasy—where you have this all-knowing computer that you can ask questions and it just gives you the answer—is not what we can provide and not what we need.”
“It is infantilizing to say that the way we get information is to ask an expert and have them just give it to us,” Bender said.
This startup’s AI is smart enough to drive different types of vehicles
Wayve, a driverless car startup based in London, has made a single machine-learning model that can drive two different types of vehicles, a passenger car and a delivery truck—a first for the industry.
Watch out, Tesla: The breakthrough suggests that Wayve’s approach to autonomous vehicles might just scale up faster than the technology of mainstream companies like Cruise, Waymo, and Tesla. My colleague Will Heaven visited Wayve’s offices in London to check out the company’s new vehicle. Read more here.
Bits and Bytes
How colleges use AI to monitor student protests.
Colleges in the US are using Social Sentinel, a tool pitched as a way to help save students’ lives. Quelle surprise: it was used to surveil students. Following this investigation, one college has announced it is dropping the tool. (The Dallas Morning News)
Clearview AI, used by police to find criminals, is now in public defenders’ hands.
The lawyer for a man accused of vehicular homicide used the controversial facial recognition software to prove his client’s innocence. (The New York Times)
Hated that video? YouTube’s algorithm might push you another just like it.
New research from Mozilla shows that user controls have little effect on which videos YouTube’s influential AI recommends. (MIT Technology Review)
The YouTube baker fighting back against deadly “craft hacks.”
More on moderation: Ann Reardon spends her time debunking dangerous activities that go viral on the platform—but the craze shows no signs of abating. (MIT Technology Review)
ISIS executions and nonconsensual porn are powering AI art.
The data sets behind AI art tools are full of problematic content, as I reported in last week’s edition. Getty Images has also banned AI-generated images over fears of lawsuits. (Vice)
How AI art sees Los Angeles.
A lovely piece about what AI art generator Midjourney produces when descriptions of LA from literature are used as prompts. (LA Times)
That’s it from me. Catch you next week!