DeepMind’s search for AGI may not be successful, say AI researchers


David Silver, head of the reinforcement learning research group at DeepMind, is awarded an honorary “ninth dan” ranking for AlphaGo.

JUNG YEON-JE | AFP | Getty Images

Computer scientists are questioning whether DeepMind, the Alphabet-owned British company widely considered one of the world’s leading AI laboratories, will ever be able to create machines with the kind of “general” intelligence seen in humans and animals.

In its quest for artificial general intelligence (AGI), sometimes referred to as human-level AI, DeepMind focuses part of its efforts on an approach called “reinforcement learning”.

This involves programming an AI to take certain actions in order to maximize its chance of earning a reward in a given situation. In other words, the algorithm “learns” to complete a task by seeking out those preprogrammed rewards. The technique has been used successfully to train AI models to play (and excel at) games like Go and chess. But such models remain relatively dumb, or “narrow”. For example, DeepMind’s famous AlphaGo AI cannot draw a stick figure or tell the difference between a cat and a rabbit, while a seven-year-old can.
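To make the idea concrete, here is a minimal sketch of reward-driven learning in Python. It is not DeepMind's code or the method behind AlphaGo. It uses tabular Q-learning, a textbook reinforcement learning algorithm, on a hypothetical five-cell corridor invented for illustration. The agent starts in the leftmost cell and is rewarded only when it reaches the rightmost one. The function names, the environment and all parameter values are assumptions.

```python
import random


def train_corridor_agent(n_states=5, episodes=500, alpha=0.5,
                         gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor (illustrative only).

    States are cells 0..n_states-1. Actions: 0 = left, 1 = right.
    The only reward (1.0) is given for stepping into the last cell.
    """
    rng = random.Random(seed)
    # q[state][action] is the agent's estimate of future reward.
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Explore at random sometimes (and on ties); otherwise act greedily.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1

            # Moving left from cell 0 leaves the agent where it is.
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0

            # Q-learning update: nudge the estimate toward
            # reward + discounted value of the best next action.
            best_next = 0.0 if next_state == goal else max(q[next_state])
            q[state][action] += alpha * (reward + gamma * best_next
                                         - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Return the preferred action in every non-goal cell."""
    return ["right" if right > left else "left" for left, right in q[:-1]]
```

Calling `greedy_policy(train_corridor_agent())` should report “right” for every cell. Nobody tells the agent which way to walk; it discovers the route purely by chasing the reward. The sketch also hints at why such systems are called “narrow”: the learned table is useless for any task other than this corridor.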

Even so, DeepMind, which was acquired by Google in 2014 for around $600 million, believes that AI systems underpinned by reinforcement learning could in theory grow and learn so much that they break through the barrier to AGI without any new technological developments.

Researchers at the company, which has grown to around 1,000 employees under Alphabet’s ownership, argued in a paper submitted last month to the peer-reviewed journal Artificial Intelligence that “reward is enough” to achieve general AI. The paper was first reported by VentureBeat last week.

In the paper, the researchers claim that if you “reward” an algorithm every time it does something you want, which is the essence of reinforcement learning, then it will eventually show signs of general intelligence.

“Reward is enough to encourage behavior that demonstrates skills that have been studied in natural and artificial intelligence, including knowledge, learning, perception, social intelligence, language, generalization, and imitation,” the authors write.

“We propose that agents learning through trial and error to maximize reward might learn behavior exhibiting most, if not all, of these skills, and therefore powerful agents for reinforcement learning could provide a solution for artificial general intelligence.”

However, not everyone is convinced.

Samim Winiger, an AI researcher in Berlin, told CNBC that DeepMind’s “reward is enough” view is a “somewhat philosophical fringe position misleadingly portrayed as hard science”.

He said the road to general AI is complex and that the scientific community is aware of myriad challenges and known unknowns that “rightly make most researchers in the field feel humble” and prevent them from making “grand, totalitarian declarations” like “RL is the definitive answer, all you need is reward.”

DeepMind told CNBC that while reinforcement learning is behind some of its most famous research breakthroughs, the technique makes up only a fraction of the total research it does. The company said it thinks it is important to understand things at a more fundamental level, which is why it is also pursuing other areas such as “symbolic AI” and “population-based training”.

“In somewhat typical DeepMind fashion, they chose to make bold statements that grab attention at all costs, rather than a more nuanced approach,” said Winiger. “It’s more politics than science.”

Stephen Merity, an independent AI researcher, told CNBC that there was “a difference between theory and practice”. He also noted that “one stack of dynamite is probably enough to get you to the moon, but it’s not really practical.”

Ultimately, there is no evidence that reinforcement learning will ever lead to AGI.

Rodolfo Rosini, a tech investor and AI-focused entrepreneur, told CNBC, “The truth is that nobody knows and that DeepMind’s main product continues to be PR, not technical innovations or products.”

Entrepreneur William Tunstall-Pedoe, who sold his Siri-like app Evi to Amazon, told CNBC that even if the researchers are right, “that doesn’t mean we’ll get there anytime soon, nor does it mean there isn’t a better, quicker way to get there.”

DeepMind’s “Reward is enough” paper was co-authored by DeepMind heavyweights Richard Sutton and David Silver, who met DeepMind CEO Demis Hassabis at the University of Cambridge in the 1990s.

“The core problem of the ‘reward is enough’ thesis is not that it is wrong, but that it cannot be wrong, and thus fails to meet Karl Popper’s famous criterion that all scientific hypotheses be falsifiable,” said a senior AI researcher at a large U.S. technology company, who wished to remain anonymous due to the sensitivity of the discussion.

“Since Silver et al. are speaking in generalities and the notion of reward is accordingly underspecified, you can always either pick out cases where the hypothesis is satisfied, or shift the notion of reward so that it is satisfied,” the source added.

“As such, the unfortunate verdict here is not that these prominent members of our research community were in any way wrong, but that what is written is trivial. In the end, what is learned from this paper? Absent any actionable consequences from recognizing the inalienable truth of this hypothesis, was this paper enough?”

What is AGI?

Although AGI is often referred to as the holy grail of the AI community, there is no consensus on what AGI actually is. One definition is the ability of an intelligent agent to understand or learn any intellectual task that a human being can.

But not everyone agrees, and some question whether AGI will ever exist. Others are afraid of its potential impact and of whether AGI would build its own, even more powerful forms of AI, so-called superintelligence.

Ian Hogarth, an entrepreneur turned angel investor, told CNBC that he hopes reinforcement learning is not enough to achieve AGI. “The more existing techniques can be scaled to AGI, the less time we have to prepare AI safety efforts and the lower the chance that things go well for our species,” he said.

Winiger argues that we are no closer to AGI today than we were a few decades ago. “The only thing that has changed fundamentally since the 1950s and 60s is that science fiction is now a valid tool for giant corporations to confuse and mislead the public, journalists and shareholders,” he said.

Powered by hundreds of millions of dollars from Alphabet every year, DeepMind competes with Facebook and OpenAI to hire the brightest people in the field to develop AGI. “This invention could help society find answers to some of the world’s most pressing and fundamental scientific challenges,” DeepMind writes on its website.

Lila Ibrahim, COO of DeepMind, said Monday that trying to “figure out how to make the vision a reality” has been her biggest challenge since she joined the company in April 2018.