This article was published on July 14, 2022

Scathing study exposes Google’s harmful approach to AI development

Ethics? Google has its own $et of ethic$

A study published earlier this week by Surge AI appears to lay bare one of the biggest problems plaguing the AI industry: bullshit, exploitative data-labeling practices.

Last year, Google built a dataset called “GoEmotions.” It was billed as a “fine-grained emotion dataset” — basically a ready-to-train-on dataset for building AI that can recognize emotional sentiment in text.

Per a Google blog post:

In “GoEmotions: A Dataset of Fine-Grained Emotions”, we describe GoEmotions, a human-annotated dataset of 58k Reddit comments extracted from popular English-language subreddits and labeled with 27 emotion categories. As the largest fully annotated English language fine-grained emotion dataset to date, we designed the GoEmotions taxonomy with both psychology and data applicability in mind.

Here’s another way of putting it: Google scraped 58,000 Reddit comments and then sent those files to a third-party company for labeling. More on that later.

The study

Surge AI took a look at a sample of 1,000 labeled comments from the GoEmotions dataset and found that a significant portion of them were mislabeled.

Per the study:

A whopping 30% of the dataset is severely mislabeled! (We tried training a model on the dataset ourselves, but noticed deep quality issues. So we took 1000 random comments, asked Surgers whether the original emotion was reasonably accurate, and found strong errors in 308 of them.)
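Surge's spot check is a standard sampling estimate: 308 severe errors out of 1,000 randomly drawn comments. A minimal sketch of how far that 30.8% figure generalizes to the full 58k dataset, using a normal-approximation confidence interval (the function name and the interval choice are ours, not Surge's):

```python
import math

def error_rate_ci(errors: int, sample: int, z: float = 1.96) -> tuple[float, float, float]:
    """Point estimate and normal-approximation 95% CI for a mislabel rate."""
    p = errors / sample
    se = math.sqrt(p * (1 - p) / sample)
    return p, p - z * se, p + z * se

# Surge AI's reported numbers: 308 severe errors in 1,000 sampled comments.
p, low, high = error_rate_ci(308, 1000)
print(f"mislabel rate ~ {p:.1%} (95% CI {low:.1%} to {high:.1%})")
```

Even at the bottom of that interval (roughly 28%), well over a quarter of the dataset would be severely mislabeled.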

It goes on to point out some of the major problems with the dataset, including this doozy:

Problem #1: “Reddit comments were presented with no additional metadata”

First of all, language doesn’t live in a vacuum! Why would you present a comment with no additional metadata? The subreddit and parent post it’s replying to are especially important context.

Imagine you see the comment “his traps hide the fucking sun” by itself. Would you have any idea what it means? Probably not – maybe that’s why Google mislabeled it.

But what if you were told it came from the /r/nattyorjuice subreddit dedicated to bodybuilding? Would you realize, then, that traps refers to someone’s trapezius muscles?
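The study's complaint boils down to a missing field in the labeling records. A toy illustration of the difference context makes (the field names and the parent-post text here are hypothetical, not GoEmotions' actual schema):

```python
# Illustrative only: field names are hypothetical, not GoEmotions' real schema.
stripped = {"text": "his traps hide the fucking sun"}

with_context = {
    "text": "his traps hide the fucking sun",
    "subreddit": "nattyorjuice",            # bodybuilding slang: traps = trapezius
    "parent": "(hypothetical parent post)",  # the comment it was replying to
}

def has_labeling_context(comment: dict) -> bool:
    """A labeler needs at least the subreddit and parent post to resolve slang."""
    return {"subreddit", "parent"} <= comment.keys()

print(has_labeling_context(stripped))      # False
print(has_labeling_context(with_context))  # True
```

Labelers in the GoEmotions pipeline were effectively handed the first dict, not the second.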

The problem

This kind of data can’t be properly labeled. Take the above “his traps hide the fucking sun” comment: no single person on the planet is capable of understanding every edge case of human sentiment.

It’s not that the particular labelers didn’t do a good job, it’s that they were given an impossible task.

There are no shortcuts to gleaning insight into human communications. We’re not stupid like machines are. We can incorporate our entire environment and lived history into the context of our communications and, through the tamest expression of our masterful grasp on semantic manipulation, turn nonsense into philosophy (shit happens) or turn a truly mundane statement into the punchline of an ageless joke (to get to the other side).

What these Google researchers have done is spend who knows how much time and money developing a crappy digital version of a Magic 8-Ball. Sometimes it’s right, sometimes it’s wrong, and there’s no way to be sure one way or another.

This particular kind of AI development is a grift. It’s a scam. And it’s one of the oldest in the book.

Here’s how it works: The researchers took an impossible problem, “how to determine human sentiment in text at massive scales without context,” and used the magic of bullshit to turn it into a relatively simple one that any AI can solve: “how to match keywords to labels.”

The reason it’s a grift is because you don’t need AI to match keywords to labels. Hell, you could do that in Microsoft Excel 20 years ago.
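To make the point concrete, here is the entire "simple problem" in a dozen lines of plain Python. The keyword lists are made up for illustration; "admiration" and "anger" are two of GoEmotions' 27 actual label categories:

```python
import re

# Made-up keyword lists; "admiration" and "anger" are real GoEmotions labels.
KEYWORDS = {
    "admiration": {"wow", "amazing", "impressive"},
    "anger": {"hate", "furious", "angry"},
}

def match_labels(comment: str) -> set[str]:
    """Assign every label whose keyword list overlaps the comment's words."""
    words = set(re.findall(r"[a-z']+", comment.lower()))
    return {label for label, kws in KEYWORDS.items() if words & kws}

print(match_labels("Wow, that lift was amazing"))       # {'admiration'}
print(match_labels("his traps hide the fucking sun"))   # set()
```

No model, no training, no GPUs; and, like the author says, nothing here that a spreadsheet couldn’t do decades ago.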

A bit deeper

You know the dataset the AI was trained on contains mislabeled data. Thus, the only way you can be absolutely sure that a given result it returns is accurate is to verify it yourself — you have to be the so-called human in the loop. But what about all the results it doesn’t return that it should?

We’re not trying to find all the cars that are red in a dataset of automobile images. We’re making determinations about human beings.

If the AI screws up and misses some red cars, those cars are unlikely to suffer negative outcomes. And if it accidentally labels some blue cars as red, those blue cars should be okay.

But this particular dataset is specifically built for decision-making related to human outcomes.

Per Google:

It’s been a long-term goal among the research community to enable machines to understand context and emotion, which would, in turn, enable a variety of applications, including empathetic chatbots, models to detect harmful online behavior, and improved customer support interactions.

Again, we know for a fact that any AI model trained on this dataset will produce erroneous outputs. That means every single time the AI makes a decision that either rewards or punishes any human, it causes demonstrable harm to other humans.

If the AI’s output can be used to influence human rewards — by, for example, surfacing all the resumes in a stack that have “positive sentiment” in them — we have to assume that some of the files it didn’t surface were wrongfully discriminated against.

That’s something humans-in-the-loop cannot help with. It would require a person to review every single file that wasn’t selected.

And, if the AI has the ability to influence human punishments — by, for example, taking down content it considers “hate speech” — we can be certain that sentiments that objectively don’t deserve punishment will be erroneously surfaced and, thus, humans will be harmed.

Worst of all, study after study demonstrates that these systems are inherently full of human bias and that minority groups are always disproportionately and negatively impacted.

The solution

There’s only one way to fix this kind of research: throw it in the trash.

It is our stance here at Neural that it is entirely unethical to train an AI on human-created content without the express individual consent of the humans who created it.

Whether it’s legal to do so or not is irrelevant. When I post on Reddit, I do so in the good faith that my discourse is intended for other humans. Google doesn’t compensate me for my data so it shouldn’t use it, even if the terms of service allow for it.

Furthermore, it is also our stance that it is unethical to deploy AI models trained on data that hasn’t been verified to be error-free when the output from those models has the potential to affect human outcomes.

Final thoughts

Google’s researchers aren’t stupid. They know that a generic “keyword search and comparison” algorithm can’t turn an AI model into a human-level expert in psychology, sociology, pop culture, and semantics just because they feed it a dataset full of randomly mislabeled Reddit posts.

You can draw your own conclusions as to their motivations.

But no amount of talent and technology can turn a bag full of bullshit into a useful AI model when human outcomes are at stake.
