AI ‘pre-cog’ predicts which Twitter users will spread disinformation

Researchers from the University of Sheffield have developed an AI system that detects which social media users spread disinformation — before they actually share it.

The team found that Twitter users who share content from unreliable sources mostly tweet about politics or religion, while those who repost trustworthy sources tweet more about their personal lives.

“We also found that the correlation between the use of impolite language and the spread of unreliable content can be attributed to high online political hostility,” said study co-author Dr Nikos Aletras, a lecturer in Natural Language Processing at the University of Sheffield.

The team reported their findings after analyzing more than 1 million tweets from around 6,200 Twitter users.

They began by collecting posts from a list of news media accounts on Twitter, each of which had been classified as either trustworthy or deceptive. The deceptive accounts fell into four categories: satire, propaganda, hoax, and clickbait.

They then used the Twitter public API to retrieve the most recent 3,200 tweets for each source, and filtered out any retweets to leave only original posts.
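As a rough illustration of that filtering step, here is a minimal Python sketch. It does not call the Twitter API; it assumes the tweets have already been downloaded as dictionaries, and the field names `created_at` and `is_retweet` are hypothetical stand-ins rather than the researchers' actual data format.

```python
from typing import Dict, List

MAX_TWEETS_PER_SOURCE = 3200  # figure reported in the article


def original_posts(timeline: List[Dict]) -> List[Dict]:
    """Keep the most recent tweets from one source, then drop retweets.

    Each tweet is a hypothetical dict with a sortable 'created_at' value
    (for example an ISO timestamp string) and a boolean 'is_retweet' flag.
    """
    newest_first = sorted(timeline, key=lambda t: t["created_at"], reverse=True)
    recent = newest_first[:MAX_TWEETS_PER_SOURCE]
    return [t for t in recent if not t["is_retweet"]]


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    timeline = [
        {"created_at": "2020-01-01T10:00:00", "text": "Original story", "is_retweet": False},
        {"created_at": "2020-01-02T10:00:00", "text": "RT someone else", "is_retweet": True},
    ]
    print([t["text"] for t in original_posts(timeline)])
```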

Next, they removed satirical sites such as The Onion, which have humorous rather than deceptive purposes. This produced a list of 251 trustworthy sources, such as the BBC and Reuters, and 159 unreliable sources, which included Infowars and Disclose.tv.

They then placed the roughly 6,200 Twitter users into two separate groups: those who have shared unreliable sources at least three times, and those who have only ever reposted stories from the trustworthy sites.
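The grouping rule can be sketched in a few lines of Python. This is one reading of the description above, not the researchers' code: the function name and labels are hypothetical, the input is assumed to be the list of listed sources a user has reposted, and users with only one or two unreliable shares are assumed to fall into neither group.

```python
from typing import Iterable, Optional, Set

MIN_UNRELIABLE_SHARES = 3  # threshold reported in the article


def label_user(shared_sources: Iterable[str],
               unreliable: Set[str],
               trustworthy: Set[str]) -> Optional[str]:
    """Assign a user to a group based on the sources they reposted.

    Returns "unreliable", "trustworthy", or None when the user fits
    neither group (an assumption about how such users are handled).
    """
    shared = list(shared_sources)
    unreliable_count = sum(1 for source in shared if source in unreliable)
    if unreliable_count >= MIN_UNRELIABLE_SHARES:
        return "unreliable"
    only_trustworthy = all(source in trustworthy for source in shared)
    if shared and unreliable_count == 0 and only_trustworthy:
        return "trustworthy"
    return None
```

For example, a user who reposted an unreliable source three times would be labeled "unreliable" even if they also shared the BBC, while a user with a single unreliable repost would be left out of both groups under this reading.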

Finally, the researchers used the linguistic information in the tweets to train a series of models to forecast whether a user would likely spread disinformation.

Their most effective method used a neural model called T-BERT. The team says it can predict with 79.7% accuracy whether a user will repost unreliable sources in the future:

“This demonstrates that neural models can automatically unveil (non-linear) relationships between a user’s generated textual content (i.e., language use) in the data and the prevalence of that user retweeting from reliable or unreliable news sources in the future.”

The team also performed a linguistic feature analysis to detect differences in language use between the two groups.

They found that users who shared unreliable sources were more likely to use words such as “liberal,” “government,” and “media,” and often referred to Islam or politics in the Middle East. In contrast, the users who shared trustworthy sources frequently tweeted about their social interactions and emotions, and often used words like “mood,” “wanna,” and “birthday.”
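The article does not say which statistical method the team used for this comparison. As a hedged illustration of the general idea, the Python sketch below scores each word by a smoothed log-ratio of its relative frequency in two groups of tweets. The function name `log_ratio`, the whitespace tokenization, and the add-one smoothing are all assumptions made for the example.

```python
import math
from collections import Counter
from typing import Dict, List


def log_ratio(group_a: List[str], group_b: List[str],
              smoothing: float = 1.0) -> Dict[str, float]:
    """Score words by how much more frequent they are in group_a than group_b.

    Positive scores mark words more typical of group_a, negative scores
    words more typical of group_b. Smoothing avoids division by zero for
    words that appear in only one group.
    """
    counts_a = Counter(w for text in group_a for w in text.lower().split())
    counts_b = Counter(w for text in group_b for w in text.lower().split())
    vocab = set(counts_a) | set(counts_b)
    total_a = sum(counts_a.values()) + smoothing * len(vocab)
    total_b = sum(counts_b.values()) + smoothing * len(vocab)
    scores = {}
    for word in vocab:
        p_a = (counts_a[word] + smoothing) / total_a
        p_b = (counts_b[word] + smoothing) / total_b
        scores[word] = math.log(p_a / p_b)
    return scores


if __name__ == "__main__":
    # Toy tweets invented for illustration; only the keywords come from the article.
    unreliable_group = ["the liberal media", "government and media"]
    trustworthy_group = ["birthday mood", "wanna birthday cake"]
    scores = log_ratio(unreliable_group, trustworthy_group)
    for word in sorted(scores, key=scores.get, reverse=True):
        print(f"{word:12s} {scores[word]:+.2f}")
```

On real data, a simple frequency ratio like this would normally be paired with a significance test and a more careful tokenizer.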

The researchers hope their findings will help social media giants combat disinformation.

“Studying and analyzing the behavior of users sharing content from unreliable news sources can help social media platforms to prevent the spread of fake news at the user level, complementing existing fact-checking methods that work on the post or the news source level,” said study co-author Yida Mu, a PhD student at the University of Sheffield.

You can read the full study in the journal PeerJ.

Story by Thomas Macaulay

Senior reporter

Thomas is a senior reporter at TNW. He covers European tech, with a focus on AI, cybersecurity, and government policy.
