The problems with Elon Musk’s plan to open source the Twitter algorithm
Just hours after Twitter announced it was accepting Elon Musk’s buyout offer, the SpaceX CEO made his plans for the social network clear. In a press release, Musk outlined the sweeping changes he intended to make, including opening up its algorithms, which determine what users see in their feed.
Musk’s ambition to open-source Twitter’s algorithms is driven by his longstanding concern about potential political censorship on the platform, but it’s unlikely that doing so will have the effect he desires. Instead, it may bring a number of unexpected problems in its wake, experts warn.
Musk might have a strong aversion to authority, but his desire for algorithmic transparency happens to chime with the wishes of politicians around the world. It has been a cornerstone of multiple governments’ attempts to fight back against big tech in recent years.
For example, Melanie Dawes, chief executive of Ofcom, which regulates social media in the UK, has said that social media platforms will have to explain how their code works. And the European Union’s recently passed Digital Services Act, agreed on April 23, will likewise compel platforms to offer transparency over algorithms. In the US, Democratic senators introduced proposals for an Algorithmic Accountability Act in February 2022. Their goal is to bring new transparency to, and oversight of, the algorithms that govern our timelines and newsfeeds, and much else besides.
Allowing Twitter’s algorithm to be visible to others, and adaptable by competitors, theoretically means someone could just copy Twitter’s source code and release their own, rebranded version. Large parts of the internet run on open-source software, most famously OpenSSL, a security toolkit used across much of the world wide web, in which a major security vulnerability, known as Heartbleed, was discovered in 2014.
There are even examples of open source social networks already. Mastodon, a microblogging platform that was set up after concerns about the dominant position of Twitter, is open source, allowing users to inspect the code posted on software repository GitHub.
But seeing the code behind an algorithm doesn’t necessarily tell you how it works, and certainly doesn’t give the average person much insight into the business structures and processes that go into its creation.
“It’s a bit like trying to understand ancient creatures with genetic material alone,” says Jonathan Gray, a senior lecturer in critical infrastructure studies at King’s College, London. “It tells us more than nothing, but it would be a stretch to say we know about how they live.”
There’s also not one single algorithm that controls Twitter. “Some of them will determine what people see on their timelines in terms of trends, or content, or suggested follows,” says Catherine Flick, who researches computing and social responsibility at De Montfort University, Leicester. The algorithms people will primarily be interested in are the ones controlling what content is surfaced in users’ timelines, but even that won’t be hugely useful without the training data.
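To see why published code reveals little without the learned parameters, consider a toy sketch. Everything in it is hypothetical: the `Tweet` fields, the `score` formula, and both sets of weights are invented for illustration and are not drawn from Twitter’s actual systems. The point is only that the same ranking code can produce opposite orderings depending on weights that come from training data.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    # Hypothetical features; real ranking systems use many more signals.
    text: str
    likes: int
    replies: int
    hours_old: float


def score(tweet: Tweet, weights: dict[str, float]) -> float:
    # The "open" part: a simple weighted sum anyone could read.
    return (
        weights["likes"] * tweet.likes
        + weights["replies"] * tweet.replies
        - weights["age"] * tweet.hours_old
    )


def rank(tweets: list[Tweet], weights: dict[str, float]) -> list[Tweet]:
    # Highest score first.
    return sorted(tweets, key=lambda t: score(t, weights), reverse=True)


tweets = [
    Tweet("A", likes=100, replies=2, hours_old=1.0),
    Tweet("B", likes=10, replies=50, hours_old=1.0),
]

# Two invented weight sets standing in for models trained on different data.
weights_x = {"likes": 1.0, "replies": 0.1, "age": 0.5}
weights_y = {"likes": 0.1, "replies": 1.0, "age": 0.5}

order_x = [t.text for t in rank(tweets, weights_x)]
order_y = [t.text for t in rank(tweets, weights_y)]
print(order_x, order_y)
```

Reading `score` and `rank` alone would not tell an outsider which of the two tweets ends up on top; that depends entirely on the weights, which in a real system are a product of the training data.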
“Most of the time when people talk about algorithmic accountability these days, we recognize that the algorithms themselves aren’t necessarily what we want to see—what we really want is information about how they were developed,” says Jennifer Cobbe, postdoctoral research associate at the University of Cambridge. That’s in large part because of concerns about AI algorithms perpetuating the human biases in data used to train them. Who develops algorithms, and what data they use, can make a meaningful difference to the results they spit out.
For Cobbe, the risks outweigh the potential benefits. The computer code doesn’t give us any insight into how algorithms were trained or tested, what factors or considerations went into them, or what sorts of things were prioritized in the process.
Open sourcing its algorithm may not make a meaningful difference to transparency at Twitter, then—and it could introduce some significant security risks.
Companies often publish their data protection impact assessments, which probe and test systems to highlight weaknesses and flaws. When flaws are discovered, they get fixed, but data is often redacted to prevent security risks. If Twitter’s algorithms were open sourced, the entire code base of the website would be accessible to all, potentially allowing bad actors to pore over the software and find vulnerabilities to exploit.
“I don’t believe for a moment that Elon Musk is looking at open sourcing all the infrastructure and security side of Twitter,” says Eerke Boiten, professor of cybersecurity at De Montfort University, Leicester.
Open sourcing Twitter’s algorithms could create yet another problem: it could help bad actors get better at gaming the system, which could make one of Musk’s other stated goals, “defeating all spam bots,” even harder.
“That’s not necessarily because individuals would be able to understand the intricacies of how the code of the algorithm works. But they’d be able to discern roughly how Twitter recommends posts on users’ timelines,” says Boiten. While Twitter users aren’t exactly in the dark about how the platform operates now, open sourcing its algorithms could provide bad actors with new ammunition, he says.
There are other, more troubling unintended consequences. One of the key worries is the inevitable squabbles that will ensue as people try, amateurishly, to parse the algorithm. That could lead to yet more poisonous and fruitless debates.
“I worry that it’ll be made into a mountain where it’s really just a molehill,” says Flick. “There’s a lot of hype about the mysterious algorithm, but in reality it’s likely that bad behavior has social consequences that are reflected in the weightings of the tweets of those people.”
Open sourcing the algorithm won’t fix any issues with bias, and taking action to fix biases that are raised will undoubtedly be viewed through a political, rather than technological, lens—at a time when we’re already massively politically polarized.
For example, a recent paper by Twitter researchers, which highlighted how the algorithms more readily promote right-wing content than left-wing content, has already become a lightning rod for arguments. “It’s going to be a mess,” says Flick.