PART II: How To Make Machines Learn Like We Do


Arslan Chaudhry is a Rhodes Scholar pursuing a PhD in Artificial Intelligence and Machine Learning at Oxford. Originally from Lahore, Arslan completed his undergraduate degree in Electrical Engineering at UET Lahore and worked for two years at Mentor Graphics, a popular company specializing in embedded electronics, before arriving at Oxford. As part of our series ‘Artificial Intelligence: Truths and Myths’, Spectra spoke to Arslan about his journey, Artificial Intelligence (AI) and his work in this field.

The interview consists of two parts. The first part, which can be read here, explores Arslan’s personal experience as a student at UET and now at Oxford. In this part, we have an in-depth conversation with Arslan on several themes in Artificial Intelligence, including the limitations of current AI systems, his work on continual learning, academic lethargy in Pakistan and the need for a national AI policy.

The interview has been condensed and edited for the purposes of clarity and brevity.

Arslan presenting his work on lifelong learning at the International Conference on Learning Representations (ICLR).

Spectra: What are the problem(s) you are looking to solve in machine learning?

I work on what we call continual learning. In this setting, you have a learning agent and data whose distribution may change over time, and the task is for the agent to continuously learn and adapt to new data. [This can be better understood in the context of a project.] Assume that you want to build a driverless car. To build this car, you need a system which can detect all the possible objects on the road. One possibility is that you first collect a dataset of all the possible things that you’re going to see in the world and then train your model on all these objects. But this is an impossible task, because you don’t know which objects your car is going to see on the road, and there is always a possibility that there is a new object which your car has not seen. For example, let’s say that you’ve trained your car in the UK and then, all of a sudden, you take this car to Pakistan, and it sees a rickshaw or a bail garhi (ox cart) for the first time. Your car won’t be able to detect these new objects [that it has never seen during training]. If you retrain your model on only these new objects to help your car detect them, then your car will essentially forget what it had seen in the past [in the UK], and this is a big problem. Continual learning provides a way to keep training your agent as you get new types of data without forgetting what you’ve seen in the past. This is what I’m trying to solve in machine learning, i.e. how can you learn new concepts without forgetting old ones?
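The forgetting effect described above can be reproduced in a few lines. The sketch below is only a toy illustration and not Arslan’s method: a two-weight linear model, made-up tasks standing in for "UK" and "Lahore" data, and plain rehearsal (retraining on stored old examples alongside the new ones) as one simple baseline for avoiding forgetting. All names and hyperparameters are assumptions chosen for illustration.

```python
# Toy illustration of catastrophic forgetting and rehearsal (replay).
# Tasks, names and hyperparameters are hypothetical, not from the interview.

def make_task(direction, slope):
    """Inputs are s * direction for s in [-1, 1]; the target is slope * s."""
    scalars = [k / 10 for k in range(-10, 11)]
    return [((s * direction[0], s * direction[1]), slope * s) for s in scalars]

def mse(weights, data):
    """Mean squared error of the linear model w0*x0 + w1*x1 on `data`."""
    w0, w1 = weights
    return sum((w0 * x0 + w1 * x1 - y) ** 2 for (x0, x1), y in data) / len(data)

def train(weights, data, steps=500, lr=0.5):
    """Full-batch gradient descent on the mean squared error."""
    w0, w1 = weights
    n = len(data)
    for _ in range(steps):
        g0 = g1 = 0.0
        for (x0, x1), y in data:
            err = w0 * x0 + w1 * x1 - y
            g0 += 2 * err * x0 / n
            g1 += 2 * err * x1 / n
        w0 -= lr * g0
        w1 -= lr * g1
    return (w0, w1)

task_a = make_task((1.0, 0.0), 1.0)    # old data: solved by w0 = 1
task_b = make_task((1.0, 1.0), -1.0)   # new data: needs w0 + w1 = -1

w_after_a = train((0.0, 0.0), task_a)
w_naive = train(w_after_a, task_b)              # retrain on new data only
w_replay = train(w_after_a, task_a + task_b)    # new data plus stored old data

loss_a_before = mse(w_after_a, task_a)
loss_a_naive = mse(w_naive, task_a)
loss_a_replay = mse(w_replay, task_a)
loss_b_replay = mse(w_replay, task_b)

print("old-task loss after learning it:      ", loss_a_before)
print("old-task loss after naive retraining: ", loss_a_naive)
print("old-task loss after replay retraining:", loss_a_replay)
print("new-task loss after replay retraining:", loss_b_replay)
```

Both tasks share the weight w0, so fitting the new task alone drags w0 away from the value the old task needs, even though a single setting of the weights (w0 = 1, w1 = -2) solves both. Replay finds that setting, at the cost of having to store old data, which is exactly what becomes impractical at scale.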

There is generally a lot of excitement around self-driving cars. Can you guess when they will be on the roads?

I would say a hacky version of driverless cars, in which the car is doing 95% of the work but there is still a human sitting behind the steering wheel, will be seen in the next 5-6 years. But full autonomy, where there is no human involved at all, is still way off; I would say two to three decades.

What are the challenges that you think we will need to solve to get to fully autonomous AI?

I think there are a bunch of engineering challenges as well as a plethora of theoretical ones. Engineering challenges range from building sensors which are accurate enough to building hardware which can efficiently process all the data coming from those sensors. There has been some progress on these fronts, but these challenges are still there. On the theory side, again, there are a bunch of challenges. One concerns the adaptability of AI models. For example, when you change the domain from night to day, or from London to Lahore, or from desert to an [urban] road, how quickly does your model adapt? That adaptability problem is there, and I think continual learning is going to be important for it. Second, interpretability is also very important. Right now, with neural networks as black boxes, interpreting different results, [making] theoretical guarantees around the product and then having uncertainty estimates around those [results] is going to be really important. This is because of safety certifications: I’m quite sure that unless you can certify that your car is predicting something with a given level of confidence, people are not going to certify these cars to drive on roads. Lastly, the general problem of causality is always going to be there.

Do you agree that current deep learning approaches are ultimately limited and there is a need for some alternative learning paradigm? 

Ideally, we should learn causal models which can explain why a network is predicting a certain output and which can filter out spurious correlations from the data, so that you can pinpoint exactly what is causing this output. But unfortunately, right now, we don’t have very good models which do well on causation. I think this is also the critique of Judea Pearl and others: deep learning models are glorified template-matching machines that just capture the spurious correlations in the data without explaining the data well. Many examples of this phenomenon have been demonstrated. You train a state-of-the-art classification model, and then you just take a picture of a cow and put it on a beach, and your model fails miserably. Or take an even simpler case: you train a model to detect hand-written digits [on MNIST] and you color digits zero to five green and six to nine red. Then at test time you switch the colors, painting zero to five red and six to nine green, and the performance of your model suddenly becomes very bad. So it turns out that our [deep learning] models learn all kinds of spurious correlations and not the things which can explain the data. So maybe we need a new method altogether. Empirical risk minimization is not going to cut it; the i.i.d. assumption is not going to cut it. Léon Bottou, David Lopez-Paz and others are working on invariant risk minimization, an alternative to empirical risk minimization designed to counter these problems.
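The colored-digits failure can be mimicked without any images. In the sketch below, which is a deliberately simplified stand-in and not the actual MNIST experiment, each example has a hypothetical "shape" cue that matches the label 75% of the time and a "color" cue that matches it perfectly during training but is flipped at test time. The learner is a bare-bones empirical risk minimizer that picks whichever single cue has the lowest training error. The 75% figure, the feature names and the learner are all assumptions made for illustration; the code does not implement invariant risk minimization.

```python
# Toy version of the colored-digits example: ERM latches onto a spurious cue.
# The data and the two-rule "model" are hypothetical simplifications.

SHAPE, COLOR = 0, 1

def make_split(n, color_matches_label):
    """Build n examples of ((shape_cue, color_cue), label)."""
    examples = []
    for i in range(n):
        label = i % 2                                # 0: digits 0-5, 1: digits 6-9
        shape = label if i % 8 < 6 else 1 - label    # shape cue is right 75% of the time
        color = label if color_matches_label else 1 - label
        examples.append(((shape, color), label))
    return examples

def accuracy(feature_index, data):
    """Accuracy of the rule 'predict the value of this one feature'."""
    return sum(x[feature_index] == y for x, y in data) / len(data)

def fit_erm(train_data):
    """Empirical risk minimization over two candidate rules: pick the best on train."""
    return max((SHAPE, COLOR), key=lambda j: accuracy(j, train_data))

train_data = make_split(800, color_matches_label=True)
test_data = make_split(800, color_matches_label=False)   # colors switched at test time

chosen = fit_erm(train_data)
print("feature chosen by ERM:", "color" if chosen == COLOR else "shape")
print("train accuracy:", accuracy(chosen, train_data))
print("test accuracy: ", accuracy(chosen, test_data))
print("test accuracy of the shape rule:", accuracy(SHAPE, test_data))
```

Minimizing training error alone prefers the color rule because it is perfect on the training split, yet the imperfect shape rule is the only one whose accuracy survives the change of environment.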

Right now, the backpropagation algorithm is the workhorse of deep learning. So is backprop the right algorithm or approach for this problem [of providing a reliable interpretable learning paradigm]?

So, regarding backpropagation, there has been enough evidence to suggest that if you look at the brain, updates to the synapses happen locally, so there is no global update. For example, if you want to update your memory for a concept, then this update is local and you don’t backpropagate the signals throughout your brain. However, in backpropagation, whenever you see a new concept or example, you do a forward pass, and in the backward pass, you update everything. So the updates are not local and this results in catastrophic forgetting. In that respect, it is not actually mimicking brain behaviour [appropriately]. In my humble view, these updates should be more local and you should only update a few weights. But then the problem is how you identify those weights. I agree that the backpropagation algorithm needs to be improved or replaced. However, Yann LeCun and others believe that this is the simplest possible and the best algorithm that we have, that it beats everything, at least numerically, and that we can’t do anything better than that. In fact, in one of his recent talks, Yann LeCun said that in the next 30 to 40 years, he sees a lot of things changing: new architectures, new datasets, and everything else, but he does not see backpropagation changing that much.

Another characteristic of the deep learning paradigm is that it enables you to use a model-free approach. However, again, people like Judea Pearl and Gary Marcus have criticized this aspect of deep learning. What is your take on that?

Yes, so the main pitch of deep learning was that it did away with all the hand engineering people did before the deep learning era and gave us this general-purpose algorithm [backpropagation] that could learn everything with enough data, and this has been working really well. But the problem is that it again comes at the cost of not learning models which are explainable or interpretable or which can explain causation and so on. And if you think about it, I fully agree that we need model-based approaches, because the kind of world model that we humans build is very different from the kind of world model that a bird is building and so on.

So, the reason we are so efficient in processing our sensory data is because we already have these very very good world models or environment models built in us and we utilize these models very efficiently to learn from very few examples or to adapt very quickly.

So what do you think is the holy grail of AI? Is it human intelligence?

No, I don’t think human intelligence is the holy grail. As I said before, quoting Yann, human intelligence is not general at all but rather quite specific. Among all the possible concepts that you could learn, your intelligence is on a very small manifold. One example that he likes to give is that if you have 1 million connections to your optic nerve and you consider each connection as a single zero-or-one bit, then there are 2^(2^1,000,000) possible functions that you could represent, and out of these 2^(2^1,000,000) possible functions, you only process a very small number. So, in a way, your intelligence is not general at all and it is embedded in a prior model that you have set for yourself or that our species has set for itself.
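The count in this example follows from a standard argument: n binary inputs give 2^n distinct input patterns, and a yes/no function assigns one output bit to each pattern, so there are 2^(2^n) such functions. The short sketch below, with a hypothetical helper name, checks this by brute force for a handful of tiny n. It is only a sanity check of the formula; for n = 1,000,000 the number is far too large to enumerate.

```python
# Brute-force check that n binary inputs admit 2 ** (2 ** n) boolean functions.
from itertools import product

def count_boolean_functions(n_inputs):
    """Enumerate every truth table over n_inputs bits and count them."""
    input_patterns = list(product((0, 1), repeat=n_inputs))
    # A function is fully described by one output bit per input pattern.
    truth_tables = product((0, 1), repeat=len(input_patterns))
    return sum(1 for _ in truth_tables)

for n in range(1, 5):
    print(n, "inputs:", count_boolean_functions(n), "functions; formula gives", 2 ** (2 ** n))
```

Even at four inputs there are already 65,536 functions, which gives a sense of how small a fraction of the space any one learner, human or machine, actually covers.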

On what the holy grail of AI is, I would say each company has its own mantra. For example, DeepMind has this mantra that they want to solve intelligence. When they say intelligence, I don’t know what they mean. Do they mean human intelligence or do they mean some other kind of intelligence? I am not sure.

Some people always refer to the term AGI, artificial general intelligence, whenever they’re talking about the holy grail of AI, and what they normally mean by AGI is human intelligence. But a lot of the founders of deep learning are now saying that replicating human intelligence is not their goal. They want to develop intelligent machines which will help us solve some problems more efficiently than we do today.

How would you compare the current state-of-the-art of machine intelligence and human intelligence? 

It is an interesting question. I mean, in the classical challenges posed to machines, like chess, or Go or Atari games, what people discovered in all these things was that the kind of strategies that an AI agent ended up learning were not interpretable by humans and the agent was able to discover strategies which humans were not able to develop, which means they have already surpassed human intelligence, in that sense. But at the same time, there are many tasks, which humans can do very well, and machines can’t at the moment.

Arslan [4th from left in sitting row] with his research group: Torr Vision Group.

Pakistan is currently severely lagging behind in AI. So what can we do to catch up? 

I think, first of all, realizing that AI is important for Pakistan is itself very important. I think one of the problems is that during the late 90s, during the dot-com bubble, we didn’t appreciate the importance of the internet and therefore didn’t develop systems, and we didn’t foresee the changes that this internet bubble was going to bring or the kind of investment it was going to bring.

India, on the other hand, quickly realized the importance of this and established a huge set-up in Bangalore and other cities, and now all the big companies are working there. So, as with the web, India is way ahead of us in machine learning and AI as well. They have realized the importance of these fields and already have an AI policy. In fact, I think nearly 30 to 35 countries already have their own AI policies. Even a country like the UAE (maybe we should not say that), even they have their own AI policy, but Pakistan doesn’t.

So, having this realization that AI in the next decade is going to remarkably change the way governments work, or the way our socio-economic systems work, is very important. Unfortunately, we are currently insensitive to this transformation happening around the world, but it’s high time that we become proactive and start working, first of all, on our AI policy. What do we want it to be like? What can we gain from AI? How can we catch up with the rest of the world? See, according to this paper by Mike Osborne from Oxford, which came out three to four years ago, even in a developed country like the US, almost 41-50% of the jobs are going to be replaced with automation. Now imagine what is going to happen in a country like Pakistan, where most of the jobs are already basically manual labor. Our government does not have enough resources to sustain its existing unemployed population, so what is it going to do for this new wave of people who are going to be unemployed because of automation? So yeah, we definitely need to catch up. And we definitely need to start making people more cognizant of what AI is. Actually, we should start making our whole governmental system more aware of these technologies because this is going to be the future.

Who can develop this AI policy? Is there any local talent working abroad that may be engaged?

You don’t need to write it from scratch. Thirty or so countries already have these policies; you can just borrow from them, tailor them to your own needs and start building local talent which you can then expand. I mean, it is hard to find AI talent belonging to Pakistan. Even abroad, it is hard to find any AI researcher from Pakistan in good universities. And that is a statement on how unaware our society is of this field. I think we need to do a lot on this side. I’ve been to Stanford, Berkeley, Cambridge, and Oxford. I have not come across many Pakistanis who are working on AI. In terms of professors, I know one, Yaser Sheikh, who’s working at Facebook Reality Labs. He did his PhD with Mubarak Shah at the University of Central Florida and he’s now a professor at CMU. Mubarak is a big name in computer vision. Then there are some students who have started doing master’s degrees in Canada and other places in Europe, but it’s only a very recent phenomenon. But again, in the top schools, in the top labs, I don’t find many Pakistanis.

How many Indians have you met, just to compare?

Oh, Indians, there are so many. It’s funny, I was looking at the chart of MILA [Montreal Institute for Learning Algorithms], basically Bengio’s group, and it’s actually a consortium of all the main universities working on AI in Canada. So this includes the University of Montreal, the University of Alberta, McGill University and so on. And they had this chart showing how many people from different parts of the world are working in MILA, and most of them are from India. None from Pakistan. Even Syria has one student, but none from Pakistan. So India is way ahead of us. Even in FAIR [Facebook AI Research], Berkeley, Stanford, Oxford, Cambridge [etc.], there are so many Indians that you meet.

While machine learning applications have generally improved the quality of human life, we are also seeing abuses of this technology. And this is happening at state level, for example, the targeting of Uyghur Muslims and now in Hong Kong as well. How can researchers guard against this malicious use of their research and products?

I mean, this has always been a problem with any innovation. The World Wide Web or the internet brought this wealth of knowledge, information and opportunities for everyone, and then people started using it to spread all kinds of misinformation. And the same has been true with all the technologies. Nuclear energy is another example. It’s a remarkable technology and, if used properly, it can basically solve a lot of the problems which are causing climate change. But again, you have nuclear bombs. AI is no exception to this either. Take generative modeling, for example: it is a very nice idea and it allows you to do some very interesting things in AI, but all of a sudden you have these deep fakes. So, I don’t think there is any escape when it comes to the misuse of all of these new technologies.

That said, there’s been very active work on the ethics of AI recently. And there’s already a consortium consisting of all of the top companies and big labs which are talking about what the ground rules should be for building an AI system, especially one which is going to deal directly with society. There has been a lot of work, but I still believe that what a typical AI researcher who is an engineer needs to consider is all the ways in which his or her technology or innovations can be used maliciously.

My medium-term goal is to establish myself as a good researcher in my field, be it in academia or industry. I will figure this out in the next year or so, as I’m graduating in seven to eight months. But the long-term goal is to go back to academia, set up my own lab, train more people and contribute as much to Pakistan as I can in the capacity of an open academic and a researcher, and to set up a state-of-the-art research lab in Pakistan and train the new, young people who are coming into the field of AI.
