Jason Blazaikis’s automated far-right propagandist knows the hits. Asked to complete the phrase “The greatest danger facing the world today,” the software declared it to be “Islamo-Nazism,” which it said will cause “a Holocaust on the population of Europe.” The paragraph of text that followed also slurred Jews and admonished that “nations who value their peoples [sic] legends need to recognize the magnitude of the Islamic threat.”
Blazaikis is director of the Center on Terrorism, Extremism, and Counter-Terrorism at the Middlebury Institute of International Studies, where researchers are attempting to preview the future of online information warfare. The text came from machine learning software they had fed a collection of manifestos from right-wing terrorists and mass murderers such as Dylann Roof and Anders Breivik. “We want to see what dangers may lie ahead,” says Blazaikis.
Facebook and other social networks already fight against propaganda and disinformation campaigns, whether originating from terrorist groups like ISIS, or accounts that are working on behalf of nation-states. All evidence suggests those information operations are mostly manual, with content written by people. Blazaikis says his experiments show it’s plausible that such groups could one day adapt open source AI software to speed up the work of trolling or spreading their ideology. “After playing with this technology I had a feeling in the pit of my stomach that this is going to have a profound effect on how information is transmitted,” he says.
Computers are a long way from being able to read or write in the way people do, but in the past two years AI researchers have made significant improvements to algorithms that process language. Companies such as Google and Amazon say their systems have gotten much better at understanding search queries and translating voice commands.
Some people who helped bring about those advances have warned that improved language software could also empower bad actors. Early this year, independent research institute OpenAI said it would not release full code for its latest text-generation software, known as GPT-2, because it might be used to create fake news or spam. This month, the lab released the full software, saying that awareness had grown of how next-generation text generators might be abused and that no examples of such abuse had so far come to light.
Now some experts in online terrorism and trolling operations, like Blazaikis’s group at Middlebury, are using versions of OpenAI’s GPT-2 to explore those dangers. Their experiments involve testing how well the software can mimic or amplify online disinformation and propaganda material. Although the output can be nonsensical or rambling, it has also shown moments of unnerving clarity. “Some of it is far better-written than right-wing text we have analyzed as scholars,” says Blazaikis.
Philip Tully, a data scientist at security company FireEye, says the notion that text-generating software will become a tool for online manipulation needs to be taken seriously. “Advanced actors, if they’re determined enough, are going to use it,” he says. FireEye, which has helped unmask politically motivated disinformation campaigns on Facebook linked to Iran and Russia, began experimenting with GPT-2 over the summer.
FireEye’s researchers tuned the software to generate tweets like those from the Internet Research Agency, a notorious Russian troll farm that used social posts to suppress votes and boost Donald Trump during the 2016 presidential election. OpenAI initially trained GPT-2 on 8 million webpages, a process that gave it a general feel for language and the ability to generate text in formats ranging from news articles to poetry. FireEye gave the software additional training with millions of tweets and Reddit posts that news organizations and Twitter have linked to the Russian trolls.
Afterwards, the software could reliably generate tweets on political topics favored by the disinformation group, complete with hashtags like #fakenews and #WakeUpAmerica. “There are errors that crop up but a user scrolling through a social media timeline is not expecting perfect grammar,” Tully says.