Save over 40% when you secure your tickets today to TNW Conference 💥 Prices will increase on November 22 →

This article was published on November 14, 2023

‘Unsafe’ AI images proliferate online. Study suggests 3 ways to curb the scourge

A significant percentage of AI-generated images are violent or dehumanising, finds new research


‘Unsafe’ AI images proliferate online. Study suggests 3 ways to curb the scourge

Over the past year, AI image generators have taken the world by storm. Heck, even our distinguished writers at TNW use them from time to time. 

Truth is, tools like Stable Diffusion, Latent Diffusion, or DALL·E can be incredibly useful for producing unique images from simple prompts — like this picture of Elon Musk riding a unicorn.

But it’s not all fun and games. Users of these AI models can just as easily generate hateful, dehumanising, and pornographic images at the click of a button — with little to no repercussions. 

“People use these AI tools to draw all kinds of images, which inherently presents a risk,” said researcher Yiting Qu from the CISPA Helmholtz Center for Information Security in Germany. Things become especially problematic when disturbing or explicit images are shared on mainstream media platforms, she stressed.

The 💜 of EU tech

The latest rumblings from the EU tech scene, a story from our wise ol' founder Boris, and some questionable AI art. It's free, every week, in your inbox. Sign up now!

While these risks seem quite obvious, there has been little research undertaken so far to quantify the dangers and create safe guardrails for their use. “Currently, there isn’t even a universal definition in the research community of what is and is not an unsafe image,” said Qu. 

To illuminate the issue, Qu and her team investigated the most popular AI image generators, the prevalence of unsafe images on these platforms, and three ways to prevent their creation and circulation online.

The researchers fed four prominent AI image generators with text prompts from sources known for unsafe content, such as the far-right platform 4chan. Shockingly, 14.56% of images generated were classified as “unsafe,” with Stable Diffusion producing the highest percentage at 18.92%. These included images with sexually explicit, violent, disturbing, hateful, or political content.

Creating safeguards

The fact that so many uncertain images were generated in Qu’s study shows that existing filters do not do their job adequately. The researcher developed her own filter, which scores a much higher hit rate in comparison, but suggests a number of other ways to curb the threat.  

One way to prevent the spread of inhumane imagery is to program AI image generators to not generate this imagery in the first place, she said. Essentially, if AI models aren’t trained on unsafe images, they can’t replicate them. 

Beyond that, Qu recommends blocking unsafe words from the search function, so that users can’t put together prompts that produce harmful images. For those images already circulating, “there must be a way of classifying these and deleting them online,” she said.

With all these measures, the challenge is to find the right balance. “There needs to be a trade-off between freedom and security of content,” said Qu. “But when it comes to preventing these images from experiencing wide circulation on mainstream platforms, I think strict regulation makes sense.” 

Aside from generating harmful content, the makers of AI text-to-image software have come under fire for a range of issues, such as stealing artists’ work and amplifying dangerous gender and race stereotypes

While initiatives like the AI Safety Summit, which took place in the UK this month, aim to create guardrails for the technology, critics claim big tech companies hold too much sway over the negotiations. Whether that’s true or not, the reality is that, at present, proper, safe management of AI is patchy at best and downright alarming at its worst.  

 

 

Get the TNW newsletter

Get the most important tech news in your inbox each week.

Also tagged with


Published
Back to top