Want AI that flags hateful content? Build it.
Humane Intelligence, an organization focused on evaluating AI systems, is launching a competition that challenges developers to create a computer vision model that can track hateful image-based propaganda online. Organized in partnership with the Nordic counterterrorism group Revontulet, the bounty program opens September 26. It is open to anyone, 18 or older, who wants to compete and promises $10,000 in prizes for the winners.
This is the second of a planned series of 10 “algorithmic bias bounty” programs from Humane Intelligence, a nonprofit that investigates the societal impact of AI and was launched by the prominent AI researcher Rumman Chowdhury in 2022. The series is supported by Google.org, Google’s philanthropic arm.
“The goal of our bounty programs is to, number one, teach people how to do algorithmic assessments,” says Chowdhury, “but also, number two, to actually solve a pressing problem in the field.”
Humane Intelligence’s first challenge asked participants to evaluate gaps in sample data sets that may be used to train models, specifically gaps that may produce output that is factually inaccurate, biased, or misleading.
The second challenge deals with tracking hateful imagery online, an incredibly complex problem. Generative AI has enabled an explosion in this type of content, and AI is also deployed to manipulate content so that it won’t be removed from social media. For example, extremist groups may use AI to slightly alter an image that a platform has already banned, quickly creating hundreds of different copies that can’t easily be flagged by automated detection systems. Extremist networks can also use AI to embed a pattern into an image that is undetectable to the human eye but will confuse and evade detection systems. The result is essentially a cat-and-mouse game between extremist groups and online platforms.
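To make the evasion problem concrete, here is a minimal sketch of why a slightly altered copy can slip past a filter that only matches exact files, and how a looser "perceptual" comparison can still catch it. Everything in it is an assumption made for illustration: the 4x4 grayscale grid, the function names, and the toy average-hash scheme are hypothetical and are not the method used by Humane Intelligence, Revontulet, or any platform. It also does not address the second tactic described above, adversarial patterns designed to confuse a detection model.

```python
import hashlib


def exact_hash(pixels):
    """SHA-256 of the raw pixel bytes: changing any pixel changes the digest."""
    data = bytes(v for row in pixels for v in row)
    return hashlib.sha256(data).hexdigest()


def average_hash(pixels):
    """Toy perceptual hash: one bit per pixel, set when brighter than the image mean."""
    flat = [v for row in pixels for v in row]
    mean = sum(flat) / len(flat)
    return [1 if v > mean else 0 for v in flat]


def hamming(a, b):
    """Number of positions where two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical 4x4 grayscale "banned" image (values 0-255).
banned = [
    [10, 200, 10, 200],
    [200, 10, 200, 10],
    [10, 200, 10, 200],
    [200, 10, 200, 10],
]

# A slightly altered copy: every pixel brightened by 3, invisible to a viewer.
altered = [[min(255, v + 3) for v in row] for row in banned]

# Exact matching treats the copy as a brand-new file.
exact_match = exact_hash(banned) == exact_hash(altered)

# The perceptual comparison counts how many brightness bits changed.
distance = hamming(average_hash(banned), average_hash(altered))

print("exact match:", exact_match)
print("perceptual distance:", distance)
```

In this toy case the exact digests differ while the perceptual distance stays at zero, so a filter that flags small distances would still catch the copy. Larger edits such as crops, overlays, or generated variations would push the distance up, which is part of why the article's sources describe the problem as a cat-and-mouse game.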
The challenge asks for two different models. The first, a task for those with intermediate skills, is one that identifies hateful images; the second, considered an advanced challenge, is a model that attempts to fool the first one. “That actually mimics how it works in the real world,” says Chowdhury. “The do-gooders make one approach, and then the bad guys make an approach.” The goal is to engage machine-learning researchers on the topic of mitigating extremism, which may lead to the creation of new models that can effectively screen for hateful images.
A core challenge of the project is that hate-based propaganda depends heavily on its context. And someone who doesn’t have a deep understanding of certain symbols or signifiers may not be able to tell what even qualifies as propaganda for a white nationalist group.
“If [the model] never sees an example of a hateful image from a part of the world, then it’s not going to be any good at detecting it,” says Jimmy Lin, a professor of computer science at the University of Waterloo, who is not associated with the bounty program.
This effect is amplified around the world, since many models don’t have a vast knowledge of cultural contexts. That’s why Humane Intelligence decided to partner with a non-US organization for this particular challenge. “Most of these models are often fine-tuned to US examples, which is why it’s important that we’re working with a Nordic counterterrorism group,” says Chowdhury.
Lin, though, warns that solving these problems may require more than algorithmic changes. “We have models that generate fake content. Well, can we develop other models that can detect fake generated content? Yes, that is certainly one approach to it,” he says. “But I think overall, in the long run, training, literacy, and education efforts are actually going to be more beneficial and have a longer-lasting impact. Because you’re not going to be subjected to this cat-and-mouse game.”
The challenge will run until November 7, 2024. Two winners will be selected, one for the intermediate challenge and another for the advanced; they will receive $4,000 and $6,000, respectively. Participants will also have their models reviewed by Revontulet, which may decide to add them to its current suite of tools to combat extremism.