Geoffrey Hinton, the godfather of ‘deep learning’—which helped Google’s AlphaGo beat a grandmaster—on the past, present and future of AI
In the right circles, it had all the buzz of a championship boxing match, and the tense sporting atmosphere of a Super Bowl. Millions of people tuned in on live streams to watch. The stakes, too, were high, at a million dollars. In one corner was Lee Sedol, a South Korean grandmaster in the ancient and complex game of Go—and in the other, the AI program AlphaGo, designed by Google’s DeepMind team.
The result, though, was hardly dramatic. With three straight victories to start the five-game series—Sedol was only able to take the fourth game—AlphaGo trounced the human Go star.
If an elite human player losing a game to a computer feels like old hat—chess legend Garry Kasparov lost to Deep Blue two decades ago, and Jeopardy players fell to IBM’s Watson in 2011—this particular loss is especially momentous. What made the AlphaGo victory so impressive is that the number of possibilities for a single move in Go is massive, even more so than in a game like chess. So a victory in this game calls for something far more than mere supercomputers and computing strength. It calls for “deep learning,” which uses neural networks, modelled after the brain’s neurons, that allow programs to effectively learn like humans do—by seeing the world, consuming data, and learning patterns and rules.
These neural networks were popular in the 1980s, then were dismissed as bunk by the AI establishment. But scientists like 68-year-old Geoffrey Hinton, a Canadian considered the “godfather of neural networks” who now splits his time between the University of Toronto and Google, pressed forward with the work. It turned out that his way was the right way; now that the power of computing has caught up with the algorithms, deep learning has come to be seen as technology’s next big thing, sparking bidding wars among companies like Google to buy up firms researching ways to use deep learning.
So while AlphaGo cruised to victory, the triumph actually represents something of a comeback story. For decades, the idea of “deep learning” was seen as scientific lunacy. It was only through the work of scientists like Hinton that it has become the next big thing—a way of programming that’s already influencing the apps we have in our own pockets.
In an interview with Maclean’s, Hinton explains what the future of deep learning holds, why we shouldn’t be afraid of AI, and whether or not he’s vindicated by its successes so far.
Q: What did you think of AlphaGo’s wins?
A: It was quite exciting. I mean, I stayed up until 2 a.m. watching the games. We really didn’t know before the first game with Lee Sedol whether AlphaGo had serious weaknesses that we just didn’t know about. And we saw in the fourth game there were some weaknesses. In the end, it was very exciting. The people on the team thought AlphaGo would probably win, but they didn’t know. It’s probably lucky that Game 4 wasn’t Game 1; if he had won the first game, they’d have been really nervous.
Q: So, why is it important that AI triumphed in the game of Go?
A: It relies on a lot of intuition. The really skilled players just sort of see where a good place to put a stone would be. They do a lot of reasoning as well, which they call reading, but they also have very good intuition about where a good place to go would be, and that’s the kind of thing that people just thought computers couldn’t do. But with these neural networks, computers can do that too. They can think about all the possible moves and think that one particular move seems a bit better than the others, just intuitively. That’s what the feed-forward neural network is doing: it’s giving the system intuitions about what might be a good move. It then goes off and tries all sorts of alternatives. The neural network provides you with good intuitions, and that’s what the other programs were lacking, and that’s what people didn’t really understand computers could do.
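The pattern described in this answer, a fast "intuition" that proposes promising moves followed by a search that tries out the alternatives, can be sketched on a toy game. This is not AlphaGo's algorithm: the game is a simple stone-removal game rather than Go, and the hypothetical `intuition` function is a hand-written heuristic standing in for a trained neural network. All names are illustrative assumptions.

```python
from functools import lru_cache

# Toy game: players alternate removing 1, 2 or 3 stones from a pile;
# whoever takes the last stone wins.
MOVES = (1, 2, 3)

def intuition(pile: int, move: int) -> float:
    """Stand-in for a trained network: a quick guess at how promising a move is.

    Hypothetical heuristic: prefer moves that leave a multiple of 4.
    """
    return 1.0 if (pile - move) % 4 == 0 else 0.1

def candidate_moves(pile: int, top_k: int = 2) -> list[int]:
    """Keep only the top_k legal moves, ranked by the intuition score."""
    legal = [m for m in MOVES if m <= pile]
    return sorted(legal, key=lambda m: intuition(pile, m), reverse=True)[:top_k]

@lru_cache(maxsize=None)
def wins(pile: int) -> bool:
    """Search step: can the player to move force a win using only candidate moves?"""
    return any(not wins(pile - m) for m in candidate_moves(pile))

def best_move(pile: int) -> int:
    """Return a candidate move that leaves the opponent losing, if one exists.

    Assumes pile >= 1.
    """
    for m in candidate_moves(pile):
        if not wins(pile - m):
            return m
    return candidate_moves(pile)[0]  # nothing wins; fall back on intuition alone

if __name__ == "__main__":
    for pile in (5, 7, 10):
        print(f"pile={pile}: take {best_move(pile)}")
```

The intuition narrows the options so the search examines fewer branches; the search then checks whether the intuitively attractive move actually holds up.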
Q: In 2014, experts said that Go might be something AI could one day win at, but the common thinking was that it would take at least a decade. Obviously, they overshot that estimate. Would you have guessed then that this was possible?
A: I guess I would’ve believed that if you got together a really good team, really well-managed, and you pushed really hard for a year, and you used these neural networks, maybe you could do it—probably not, but maybe. But the DeepMind people really made it. So I was surprised they did it so quickly.
Q: So what now? Are there other, even more complicated games that the AI world wants to conquer next?
A: From what we think of as board games and things like that, I don’t think there is—I think this is really the pinnacle. There are of course other games, these fantasy games, where you interact with characters who say things to you. AI still can’t deal with those because they still can’t deal with natural language well enough, but it’s getting much better. And the way translation’s currently done will change because Google now has what promises to be a much better way to do machine translation. That’s part of understanding natural language properly, and that’ll influence lots of things—it’ll influence fantasy games and things like that, but it will also allow you to search much better, because you’ll have a better sense of what documents mean. It’s already influencing things—in Gmail you have Smart Reply, that figures out from an email what might be a quick reply, and it gives you alternatives when it thinks they’re appropriate. They’ve done a pretty good job. You might expect it to be a big table, of ‘If the email looks like this, this is a good reply, and if the email looks like that, then this might be a good reply.’ It actually synthesizes the reply from the email. The neural net goes through the words in the email, and gets some internal state in its neurons, and then uses that internal state to generate a reply. It’s been trained on a lot of data, where it was told what the kinds of replies are, but it’s actually generating a reply, and it’s much closer to how people do language.
Q: Beyond games, then—what might come next for AI?
A: It depends who you talk to. My belief is that we’re not going to get human-level abilities until we have systems that have the same number of parameters in them as the brain. So in the brain, you have connections between the neurons called synapses, and they can change. All your knowledge is stored in those synapses. You have about 1,000-trillion synapses—10 to the 15, it’s a very big number. So that’s quite unlike the neural networks we have right now. They’re far, far smaller, the biggest ones we have right now have about a billion synapses. That’s about a million times smaller than the brain.
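The scale comparison in this answer is simple arithmetic, and a minimal sketch makes the gap concrete. The figures are the rough estimates quoted in the interview, not measurements, and the names used here are illustrative.

```python
# Order-of-magnitude figures as quoted in the interview.
BRAIN_SYNAPSES = 10**15          # "about 1,000-trillion synapses"
LARGEST_NETWORK_PARAMS = 10**9   # "about a billion synapses"

def scale_gap(brain: int, network: int) -> int:
    """Return how many times more connections the brain has than the network."""
    return brain // network

gap = scale_gap(BRAIN_SYNAPSES, LARGEST_NETWORK_PARAMS)
print(f"The brain has roughly {gap:,} times more connections.")
```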
Q: Do you dare predict a timeline for that?
A: More than five years. I refuse to say anything beyond five years because I don’t think we can see much beyond five years. And you look at these past predictions like there’s only a market in the world for five computers [as allegedly said by IBM founder Thomas Watson] and you realize it’s not a good idea to predict too far into the future.
Q: The popular thinking on stories like AlphaGo can be one of fear—the fear that AI will become better than us, and will come to dominate humanity. Is it totally preposterous to fear the results of deep learning?
A: Well, I think people need to understand that deep learning is making a lot of things, behind-the-scenes, much better. Deep learning is already working in Google search, and in image search; it allows you to image search a term like “hug.” It’s used to get you Smart Replies in your Gmail, it’s in speech and vision, and I believe it will soon be used in machine translation. It will be applied to other major problems, like climate science, for example, and energy conservation, and genomics.
So it’s a bit like … as soon as you have good mechanical technology, you can make things like backhoes that can dig holes in the road. But of course a backhoe can knock your head off. But you don’t want to not develop a backhoe because it can knock your head off, that would be regarded as silly. Obviously, if they’re used wrong, that can happen. Any new technology, if it’s used by evil people, bad things can happen. But that’s more a question of the politics of the technology. I think we should think of AI as the intellectual equivalent of a backhoe. It will be much better than us at a lot of things. And it can be incredibly good—backhoes can save us a lot of digging. But of course, you can misuse it.
Q: There’s also the fear that AI will render humanity obsolete, that there will come an inevitable loss of labour.
A: It’s hard to predict beyond five years. I’m pretty confident it won’t happen in the next five years, and I’m fairly confident that it won’t be something I’m going to have to deal with. But it’s something people should definitely be thinking about. But the main thing shouldn’t be, how do we cripple this technology so it can’t be harmful, it should be, how do we improve our political system so people can’t use it for bad purposes?
Q: How important is the power of computing to continued work in the deep learning field?
A: In deep learning, the algorithms we use now are versions of the algorithms we were developing in the 1980s, the 1990s. People were very optimistic about them, but it turns out they didn’t work too well. Now we know the reason they didn’t work too well is that we didn’t have powerful enough computers, and we didn’t have enough data sets to train them. If we want to approach the level of the human brain, we need much more computation, we need better hardware. We are much closer than we were 20 years ago, but we’re still a long way away. We’ll see something with proper common-sense reasoning.
Q: Can the growth in computing continue, to allow applications of deep learning to keep expanding?
A: For the last 20 years, we’ve had exponential growth, and for the last 20 years, people have said it can’t continue. It just continues. But there are other considerations we haven’t thought of before. If you look at AlphaGo, I’m not sure of the fine details of the amount of power it was using, but I wouldn’t be surprised if it was using hundreds of kilowatts of power to do the computation. Lee Sedol was probably using about 30 watts, that’s about what the brain takes, it’s comparable to a light bulb. So hardware will be crucial to making much bigger neural networks, and it’s my guess we’ll need much bigger neural networks to get high-quality common sense.
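To put the power comparison in this answer in perspective, here is a rough sketch of the arithmetic. The 30-watt figure for the brain is the one quoted above; the AlphaGo figure is only a guess in the interview ("hundreds of kilowatts"), so the 100 kW used below is an assumed, illustrative lower bound, not a reported measurement.

```python
BRAIN_WATTS = 30                 # figure quoted in the interview
ASSUMED_MACHINE_WATTS = 100_000  # assumption: 100 kW, the low end of "hundreds of kilowatts"

def power_ratio(machine_watts: float, brain_watts: float) -> float:
    """Return how many times more power the machine draws than the brain."""
    return machine_watts / brain_watts

ratio = power_ratio(ASSUMED_MACHINE_WATTS, BRAIN_WATTS)
print(f"At 100 kW, the machine draws about {ratio:,.0f} times the brain's power.")
```

Any larger guess for the machine's consumption scales the ratio up proportionally.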
Q: In the ’80s, scientists in the AI field dismissed deep learning and neural networks. What changed?
A: Mainly the fact that it worked. At the time, it didn’t solve big practical AI problems, it didn’t replace the existing technology. But in 2009, in Toronto, we developed a neural network for speech recognition that was slightly better than the existing technology, and that was important, because the existing technology had 30 years of a lot of people making it work very well, and a couple grad students in my lab developed something better in a few months. It became obvious to the smart people at that point that this technology was going to wipe out the existing one.
Google was then the first to use their engineering to get it into their products and in 2012, it came out in the Android, and made the speech recognition in the Android work much better than before: It reduced the word-error rate to about 26 per cent. Then, in 2012, students in my lab took that technology that had been developed by other people, and developed it even further, and while the existing technology was getting 26 per cent errors, we got 16 per cent errors. In the years after we did that, people said, ‘Wow, this really works.’ They were very skeptical for many many years, they published papers dismissing it. Over the next years, they all switched to it.
Then in the next few years, the error rate went down from 16 per cent, which is what we got, to about four per cent. It was much, much, much better. That led the big companies and the academics to realize this really works.
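The error rates mentioned in this answer are easier to compare as relative reductions. A minimal sketch, using only the percentages quoted in the interview (26, 16 and 4 per cent); the function name is illustrative.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the fraction of errors eliminated when going from `before` to `after`."""
    return (before - after) / before

# Error rates in per cent, as quoted in the interview.
first_step = relative_reduction(26, 16)   # existing technology -> Hinton's lab
second_step = relative_reduction(16, 4)   # the following few years

print(f"26% -> 16%: {first_step:.1%} of errors removed")
print(f"16% -> 4%:  {second_step:.1%} of errors removed")
```

Measured this way, the later drop from 16 to four per cent removed a larger share of the remaining errors than the first jump did.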
Q: This kind of intellectual comeback story feels like it could only happen in science. The writer Thomas Kuhn talked about it when he described “paradigm shifts”—that these scientific revolutions don’t necessarily produce better ideas, just different ideas. Culture at large seems to have lost this concept. Is the comeback of deep learning the kind of thing that can only happen in science?
A: I think that’s what differentiates science from religion. In science, you can say things that seem crazy, but in the long run they can turn out to be right. We can get really good evidence, and in the end the community will come around. Probably the scientists you’re arguing with won’t come around, but the younger generation will defect, and that’s what’s happening with deep learning. It’s not so much that the old conventional AI guys are believing in it, it’s the young graduate students all seeing which way things are going.
I had some experience with this when I was young, in the 1950s. My father was an entomologist who believed in continental drift. In the early ’50s, that was regarded as nonsense. It was in the mid-’50s that it came back. Someone named Alfred Wegener had thought of it 30 or 40 years earlier, and he never got to see it come back. It was based on some very naive ideas, like the way Africa sort of fit into South America, and geologists just pooh-poohed it. They called it complete rubbish, sheer fantasy.
I remember a very interesting debate that my father was involved in, where there was a water beetle that can’t travel very far and can’t fly. You have these on the north coast of Australia, and in millions of years, they haven’t been able to travel from one stream to another. And it came up that on the north coast of New Guinea, you have the same water beetle, with slight variations. The only way that could have happened was if New Guinea came off Australia and turned around, that the north coast of New Guinea used to be attached to the coast of Australia. It was very interesting seeing the reaction of the geologists to this argument, which was that ‘beetles can’t move continents.’ They refused to look at the evidence.
Q: Did you ever think of quitting in the face of the establishment’s dismissal of your thinking?
A: People were very, very much against this stuff. Things were tough then. But my view was, the brain has to work somehow, and it sure as hell doesn’t work the way normal computer programs work, and in particular, the idea that everything has to be programmed into AI is crazy. You interact with the world, and you figure out how the world works. It seemed to me the only hope to getting a lot of knowledge into an AI system was to develop learning algorithms that allowed them to learn this knowledge. That approach in the long run I thought was the only one with any hope of success. And it turned out I was right.
Q: Is that particularly vindicating for you?
A: Yes. I try not to crow about it, but it does look like the approach that me and some other people have been advocating for a long time is actually now working a lot better than the conventional AI.
This interview has been condensed and edited.