DeepMind’s new AI taps games to enhance fundamental algorithms

DeepMind made its AI name in games. Now it’s playing with the foundations of computing

DeepMind has applied its mastery of games to a more serious business: the foundations of computer science.

The Google subsidiary today unveiled AlphaDev, an AI system that discovers new fundamental algorithms. According to DeepMind, the algorithms it’s unearthed surpass those honed by human experts over decades.

The London-based lab has grand ambitions for the project. As demand for computation grows and silicon chips approach their limits, fundamental algorithms will have to become exponentially more efficient. By enhancing these processes, DeepMind aims to transform the infrastructure of the digital world.

The first target in this mission is sorting algorithms, which are used to order data. Under the covers of our devices, they determine everything from search rankings to movie recommendations.

To enhance their performance, AlphaDev explored assembly instructions, which are used to create binary code for computers. After an exhaustive search, the system uncovered a sorting algorithm that outperformed the previous benchmarks.

To find the winning combination, DeepMind had to revisit the feats that made it famous: winning board games.

Gaming the system

DeepMind made its name in games. In 2016, the company grabbed headlines when its AI program defeated a world champion of Go, a wickedly complicated Chinese board game.

Following the victory, DeepMind built a more general-purpose system, AlphaZero. Using a process of trial and error called reinforcement learning, the program mastered not only Go, but also chess and shogi (aka “Japanese chess”).

AlphaDev — the new algorithm builder — is based on AlphaZero. But the influence of gaming extends beyond the underlying model.

DeepMind formulated AlphaDev’s task as a single-player game. To win the game, the system had to build a new and improved sorting algorithm.

The system played its moves by selecting assembly instructions to add to the algorithm. To find the optimal instructions, the system had to probe a vast quantity of instruction combinations. According to DeepMind, the number was similar to the number of particles in the universe. And just one bad choice could invalidate the entire algorithm.

After each move, AlphaDev compared the algorithm’s output with the expected results. If the output was correct and the performance was efficient, the system got a “reward” — a signal that it was playing well.

“We penalise it for making mistakes, and we reward it for finding more and more of these sequences that are sorted correctly,” Daniel Mankowitz, the lead researcher, told TNW.
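
The game described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not DeepMind's implementation: the single `cswap` instruction, the `reward` formula, the `length_penalty` value, and the greedy player are all hypothetical stand-ins. AlphaDev works on real assembly instructions and uses an AlphaZero-style learned search rather than a greedy one.

```python
from itertools import permutations

# Hypothetical one-instruction "assembly": ("cswap", i, j) leaves the smaller
# of registers i and j in register i and the larger in register j.
def run(program, values):
    regs = list(values)
    for op, i, j in program:
        if op == "cswap" and regs[i] > regs[j]:
            regs[i], regs[j] = regs[j], regs[i]
    return regs

def reward(program, n=3, length_penalty=0.01):
    """Fraction of test inputs sorted correctly, minus a small cost per
    instruction (an assumed stand-in for measured performance)."""
    tests = list(permutations(range(n)))
    correct = sum(run(program, t) == sorted(t) for t in tests)
    return correct / len(tests) - length_penalty * len(program)

def legal_moves(n=3):
    """Every instruction the player may append on its turn."""
    return [("cswap", i, j) for i in range(n) for j in range(i + 1, n)]

def greedy_play(n=3, max_len=6):
    """Single-player game: each move appends one instruction. Stop as soon
    as no move improves the reward. (Greedy search is an assumption here.)"""
    program = []
    while len(program) < max_len:
        best = max(legal_moves(n), key=lambda m: reward(program + [m], n))
        if reward(program + [best], n) <= reward(program, n):
            break
        program.append(best)
    return program

if __name__ == "__main__":
    program = greedy_play()
    print(program, reward(program))
```

For three values, the greedy player settles on three compare-and-swap instructions that sort every test input. Real instruction sets are far larger, which is why the search space DeepMind describes is so vast.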

As you’ve probably guessed, AlphaDev won the game. But the system didn’t only find a correct and faster program. It also discovered novel approaches to the task.

The new algorithms contained instruction sequences that saved a single instruction each time they were applied. Dubbed “swap and copy moves,” they served as shortcuts to further algorithmic efficiencies.
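
The article does not spell out the sequences themselves, so the following is only a hypothetical illustration of how one operation can become redundant. It assumes a three-value sort in which an earlier step already guarantees `b <= c`. In that case the three-way minimum can be replaced with a two-way one. The function names are invented for this sketch, and the code is ordinary Python, not the assembly AlphaDev produced.

```python
from itertools import product

def sort3_baseline(a, b, c):
    """Sort three numbers, taking a three-way minimum."""
    b, c = min(b, c), max(b, c)   # after this line, b <= c
    lo = min(a, b, c)             # three-way minimum
    hi = max(a, c)                # b can never be the maximum
    mid = a + b + c - lo - hi     # whatever is left over
    return lo, mid, hi

def sort3_shortcut(a, b, c):
    """Same result with one comparison fewer."""
    b, c = min(b, c), max(b, c)   # after this line, b <= c
    lo = min(a, b)                # c cannot be the minimum because b <= c
    hi = max(a, c)
    mid = a + b + c - lo - hi
    return lo, mid, hi

def agree_everywhere(values=range(4)):
    """Exhaustively compare both versions on every triple of small integers."""
    return all(
        sort3_baseline(a, b, c) == sort3_shortcut(a, b, c)
        for a, b, c in product(values, repeat=3)
    )

if __name__ == "__main__":
    print(agree_everywhere())
```

Saving one operation looks trivial in isolation, but a routine like this sits inside larger sorts and runs an enormous number of times.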

DeepMind compares the approach to another moment in games: the fabled “move 37,” which an AI system played against Go champion Lee Sedol.

The strange move shocked human experts, who thought the machine had made a mistake. But they soon discovered that the program had a plan.

“It ended up not just winning the game, but also influencing the strategies that professional Go players started using,” said Mankowitz.

The win marked the first time an AI had beaten a top-ranked Go professional, a milestone that experts had predicted was another decade away.

Three years later, Lee retired from professional Go competition. He attributed the decision to the abilities of his AI rivals.

“Even if I become the number one, there is an entity that cannot be defeated,” he said.

Sorting out computing

AlphaDev’s sorting algorithms have now been open-sourced in LLVM’s libc++, the main C++ standard library, where they’re available to millions of developers and companies. According to DeepMind, it’s the first change to this part of the sorting library in over a decade, and the first algorithm designed through reinforcement learning to join the library.

After the sorting game, AlphaDev began to play with hashing, which is used to retrieve, store, and compress data. The result was another enhanced algorithm, which has now been released in the open-source Abseil library. DeepMind estimates that it’s being used trillions of times a day.

Ultimately, the lab envisions AlphaDev as a step towards transforming the entire computing ecosystem. And it all began with playing board games.

Story by Thomas Macaulay

Senior reporter

Thomas is a senior reporter at TNW. He covers European tech, with a focus on AI, cybersecurity, and government policy.
