Smart traffic cameras that wrongly issue fines, robots that get blackouts after learning a simple new task, or a self-driving car that ends up in the ditch because it does not know the concept ‘bend’. These are just a few examples of artificial intelligence ‘having a bad day’. Researchers at KU Leuven try to make AI systems more flexible and to ensure that they become genuinely intelligent (and faultless).
Over the past decade, we have made enormous progress in the development of artificial intelligence, ranging from speech or facial recognition on smartphones to autonomous flying drones or self-driving cars. KU Leuven also conducts extensive research into AI. For example, Professor Tinne Tuytelaars at the ESAT – Processing Speech and Imaging Department (PSI) has been working on computer vision for many years. About twenty years ago, during her doctoral research, she developed an algorithm so that computers could recognize a can of cola or other simple objects. It was ground-breaking at the time, but computer algorithms are now capable of doing much more.
“In the field of computer vision, we have made great advances in image recognition”, Professor Tuytelaars says. “If you have sufficient data, it is perfectly possible to identify number plates or faces in Facebook photos or security camera footage. Or think of medical applications: by analysing X-rays, computer systems can help doctors to formulate diagnoses or to discover a tumour. These are all developments made in recent years.”
According to Tuytelaars, this is only the tip of the iceberg. “The major challenge in my field is to teach computer systems to interpret images the way people can. Although we have made huge strides forward, we are still in the very early stages of this evolution. Computer systems are very good at performing specific tasks, but you have to teach them to do each one individually. There is not yet an ‘umbrella system’ like humans have.”
Tuytelaars is constantly looking for ways to improve AI systems. New technologies developed over the past few years not only offer a helping hand, but also expedite the process. “In the past, we had to programme a computer algorithm from A to Z ourselves, which is a long, laborious process for complex images. We now use ‘deep learning’, a machine-learning strategy inspired by the way the human brain operates. We basically unleash gigantic quantities of data on a computer system and let it ‘learn’ by itself the things it needs to attain the desired result.”
Deep learning makes use of ‘neural networks’: digital neurons that are connected and exchange information. These networks consist of successive layers, and each deeper layer can recognize increasingly complex patterns in the data.
Imagine that you want to teach a computer system to recognize a cat or a dog in an image. You would feed thousands of pictures into the system and ‘direct’ them through the various network layers. The first layer might focus on edges in the picture, while the next might look for a certain colour or form. Based on all these parameters, the system determines whether it can see a dog or a cat.
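To make the idea of a first layer that responds to edges more concrete, here is a minimal sketch in plain Python. The tiny image, the `apply_filter` helper and the hand-picked filter values are all assumptions made for illustration; a real network would learn its own filters from data rather than have them written by hand.

```python
def apply_filter(image, kernel):
    """Slide a 1-D filter along each row of a 2-D image (no padding)."""
    k = len(kernel)
    return [
        [sum(row[i + j] * kernel[j] for j in range(k))
         for i in range(len(row) - k + 1)]
        for row in image
    ]

# Hypothetical 4x6 grayscale image: dark (0) on the left, bright (1) on the right.
image = [[0, 0, 0, 1, 1, 1] for _ in range(4)]

# Hand-picked filter that responds where brightness increases from left to right.
edge_kernel = [-1, 1]

edges = apply_filter(image, edge_kernel)
for row in edges:
    print(row)  # each row is [0, 0, 1, 0, 0]: the filter fires only at the boundary
```

The output is zero everywhere except at the dark-to-bright boundary, which is the kind of simple feature that deeper layers can then combine into shapes and objects.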
“To attain the desired result, we still have to ‘train’ all these systems using a kind of feedback loop”, Tuytelaars says. “Each time, the computer repeats the same ‘thought exercise’ in order to tell you if it has identified a cat or a dog. If we tell the system that the answer is wrong, it starts again, and adjusts its network slightly, in the hope of achieving the desired result, for example by prioritizing certain parameters in the final decision. In the end, the system knows the conditions that a picture of a dog or cat must meet.”
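The feedback loop described here can be sketched with a perceptron, a single artificial neuron that is far simpler than the deep networks discussed in this article. The two features, the toy data and the function names below are assumptions chosen purely for illustration.

```python
# Hypothetical toy data: (body weight in tens of kg, ear pointiness from 0 to 1).
# Label +1 means "dog", -1 means "cat".
examples = [
    ((3.0, 0.2), 1),
    ((2.5, 0.3), 1),
    ((4.0, 0.1), 1),
    ((0.4, 0.9), -1),
    ((0.5, 0.8), -1),
    ((0.3, 0.7), -1),
]

def predict(weights, bias, features):
    """Return +1 (dog) or -1 (cat) from a weighted sum of the features."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else -1

def train(examples, max_epochs=20):
    """Perceptron rule: adjust the weights only after a wrong answer."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in examples:
            if predict(weights, bias, features) != label:
                # Feedback step: nudge each weight towards the correct answer.
                weights = [w + label * x for w, x in zip(weights, features)]
                bias += label
                mistakes += 1
        if mistakes == 0:  # every example answered correctly, stop training
            break
    return weights, bias

weights, bias = train(examples)
accuracy = sum(predict(weights, bias, f) == y for f, y in examples) / len(examples)
print(f"training accuracy: {accuracy:.2f}")
```

Each wrong answer shifts the weights slightly, so the features that separate the two classes gradually gain more influence on the final decision. Deep networks follow the same principle, but adjust millions of weights across many layers using gradient-based methods.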
The deep learning technique works well in the laboratory or test environment, but Tuytelaars wants to equip AI systems to work in the real world.
“An AI system can only perform what it has learned based on available data. The current process is as follows: we collect a dataset and train our model using part of the data. We then test what the model has learned on the rest of the data and if that works well, we are satisfied. But for applications that interact with the real world, this sometimes causes problems. A self-driving car that has only learned to drive on an asphalt road might get confused if it drives into a field. Another problem is ‘catastrophic forgetting’: if we teach a neural network new tasks, such as ‘driving in a field’, it sometimes ‘forgets’ previously learned tasks or starts mixing things up.”
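Catastrophic forgetting can be reproduced even with a single artificial neuron. The sketch below trains a perceptron on a hypothetical task A, then continues training on a hypothetical task B only, and measures what happens to task A. The data points and the ‘asphalt road’ and ‘field’ labels are illustrative assumptions, not data from the research described here.

```python
def predict(w, b, x):
    """Return +1 or -1 from a weighted sum of two features."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else -1

def train(w, b, examples, max_epochs=20):
    """Perceptron updates on `examples`, starting from the given weights."""
    w = list(w)
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in examples:
            if predict(w, b, x) != y:
                w = [w[0] + y * x[0], w[1] + y * x[1]]
                b += y
                mistakes += 1
        if mistakes == 0:
            break
    return w, b

def accuracy(w, b, examples):
    return sum(predict(w, b, x) == y for x, y in examples) / len(examples)

# Hypothetical tasks; each example is ((feature_1, feature_2), label).
task_a = [((2, 0), 1), ((-2, 0), -1)]    # stands in for "asphalt road"
task_b = [((-4, 4), 1), ((4, -4), -1)]   # stands in for "field"

w, b = train([0, 0], 0, task_a)
acc_a_before = accuracy(w, b, task_a)

w, b = train(w, b, task_b)               # continue training on task B only
acc_a_after = accuracy(w, b, task_a)
acc_b = accuracy(w, b, task_b)

print(acc_a_before, acc_a_after, acc_b)  # 1.0 0.0 1.0
```

In this toy setup the model answers every task A example correctly before it sees task B, and none of them afterwards, even though a single set of weights exists that would solve both tasks at once.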
Ideally, AI systems would keep learning new things without forgetting previously learned tasks. This principle is called continual learning. It is one of the major current challenges in deep learning and the key to the next generation of AI. Tuytelaars herself conducts intensive research into the subject and into its practical applications.
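One widely used family of continual-learning strategies is rehearsal, or replay: a few stored examples from earlier tasks are mixed into the training data for the new task. The sketch below applies this to a single perceptron. It is an illustration under assumed toy data and is not necessarily the approach taken in the research described in this article.

```python
def predict(w, b, x):
    """Return +1 or -1 from a weighted sum of two features."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else -1

def train(w, b, examples, max_epochs=20):
    """Perceptron updates on `examples`, starting from the given weights."""
    w = list(w)
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in examples:
            if predict(w, b, x) != y:
                w = [w[0] + y * x[0], w[1] + y * x[1]]
                b += y
                mistakes += 1
        if mistakes == 0:
            break
    return w, b

def accuracy(w, b, examples):
    return sum(predict(w, b, x) == y for x, y in examples) / len(examples)

# Hypothetical tasks; each example is ((feature_1, feature_2), label).
task_a = [((2, 0), 1), ((-2, 0), -1)]    # learned first
task_b = [((-4, 4), 1), ((4, -4), -1)]   # learned later

w, b = train([0, 0], 0, task_a)

# Replay: keep the old examples and mix them into the new training set.
replay_buffer = list(task_a)
w, b = train(w, b, task_b + replay_buffer)

acc_a = accuracy(w, b, task_a)
acc_b = accuracy(w, b, task_b)
print(acc_a, acc_b)  # 1.0 1.0
```

The trade-off is that old data has to be stored and revisited, which becomes costly as the number of tasks grows. That cost is one reason why continual learning remains an open research problem.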
“We are currently preparing an experiment in object detection”, she tells us. “A self-driving car may be trained to recognize red buses in fine weather in city A, but we want to find an easy way to teach it to recognize buses of a different colour in the rain in city B. In other words, we must develop a model that is able to ‘generalize’ very easily and switch between different settings, as well as continue to expand its neural network with new examples. This is essential because the system will frequently encounter such situations in practice.”
Tuytelaars is one of the fortunate researchers who received an ERC Advanced Grant this year, a prestigious five-year research grant. She intends to use her ERC grant to pursue her research into ‘continual learning’. “I want to refine the technology and algorithms and use them for concrete applications”, she tells us. “The goal is to develop a system that automatically generates audio descriptions for videos, specifically for the visually impaired. In other words, I also focus on the interaction between images and language, which I think is another important aspect in the development of improved AI applications.”
The computer system must be capable of analysing footage from films or television series and of providing the right ‘text’, such as: ‘Frank comes into the café’ or ‘Simonneke cries’.
“Because public broadcasters are obliged to provide such audio descriptions, we have a lot of data at our disposal to work with. But here too, there are many obstacles that must be overcome. It is important that the system continues to learn and knows how to make the right connections because there are many different contexts that each require a different description. The same actor’s character may be called ‘Frank’ in one soap opera, for example, and ‘Pol’ in another. This continual learning is also important to prevent the audio descriptions from becoming dated.”
If Tuytelaars succeeds, we will be another step closer to intelligent machines in five years. What does she think computer vision systems will look like in twenty years?
“That is difficult to predict. Things are evolving very quickly at the moment, but will that continue? Or will we get stuck somewhere? The development of continual learning is important, but we also need to explore combinations of learning from data and more ‘knowledge-based reasoning’. The latter refers to a kind of basic logic, as a result of which AI systems can generalize more easily, without having to learn everything from scratch. This would enable systems to learn more quickly and make them more reliable.”
Fostering interaction between various kinds of AI expertise is also the objective of the recently founded KU Leuven institute Leuven.AI, Tuytelaars says. “Many research groups at KU Leuven work on different AI applications, but the technology they use is often the same. It is thus not an insurmountable challenge to bring these different applications together and to ensure more cross-fertilization. This will only increase the chance of new breakthroughs or new applications.”