Class-incremental lifelong object learning for domestic robots

Abstract

Traditionally, robots have been confined to settings where they operate in isolation and in highly controlled and structured environments to execute well-defined, non-varying tasks. As a result, they usually operate without the need to perceive their surroundings or to adapt to changing stimuli. However, as robots start to move towards human-centred environments and share the physical space with people, there is an urgent need to endow them with the flexibility to learn and adapt given the changing nature of the stimuli they receive and the evolving requirements of their users. Standard machine learning is not suitable for these types of applications because it operates under the assumption that data samples are independent and identically distributed, and requires access to all the data in advance. If any of these assumptions is broken, the model fails catastrophically, i.e., either it does not learn or it forgets all that was previously learned. Therefore, different strategies are required to address this problem. The focus of this thesis is on lifelong object learning, whereby a model is able to learn from data that becomes available over time. In particular, we address the problem of class-incremental learning with an emphasis on algorithms that can enable interactive learning with a user. In class-incremental learning, models learn from sequential data batches where each batch ideally contains samples from a single class. The emphasis on interactive learning capabilities poses additional requirements in terms of the speed with which model updates are performed as well as how the interaction is handled. The work presented in this thesis can be divided into two main lines of work. First, we propose two versions of a lifelong learning algorithm composed of a feature extractor based on pre-trained residual networks, an array of growing self-organising networks and a classifier.
Self-organising networks are able to adapt their structure based on the input data distribution, and learn representative prototypes of the data. These prototypes can then be used to train a classifier. The proposed approaches are evaluated on various benchmarks under several conditions and the results show that they outperform competing approaches in each case. Second, we propose a robot architecture to address lifelong object learning through interactions with a human partner using natural language. The architecture consists of an object segmentation, tracking and preprocessing pipeline, a dialogue system, and a learning module based on the algorithm developed in the first part of the thesis. Finally, the thesis also includes an exploration of the contributions that different preprocessing operations make to performance when learning from both RGB and Depth images.

James Watt Scholarship
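To make the class-incremental setting concrete, the following is a minimal sketch and not the algorithm proposed in the thesis. It assumes that feature vectors have already been produced by some feature extractor, replaces the growing self-organising networks with a much simpler growing prototype set per class, and replaces the trained classifier with a nearest-prototype rule. The names `PrototypeLearner`, `insert_threshold` and `learning_rate` are hypothetical and chosen only for illustration.

```python
import math


class PrototypeLearner:
    """Toy class-incremental learner: one growing prototype set per class.

    A simplification for illustration only, not the thesis's method.
    """

    def __init__(self, insert_threshold=1.0, learning_rate=0.1):
        self.insert_threshold = insert_threshold  # hypothetical distance threshold
        self.learning_rate = learning_rate        # hypothetical adaptation rate
        self.prototypes = {}                      # class label -> list of prototypes

    def learn_class(self, label, features):
        """Update the model from a batch holding samples of a single class."""
        protos = self.prototypes.setdefault(label, [])
        for x in features:
            if not protos:
                protos.append(list(x))
                continue
            nearest = min(protos, key=lambda p: math.dist(p, x))
            if math.dist(nearest, x) > self.insert_threshold:
                # Sample is poorly represented: grow a new prototype.
                protos.append(list(x))
            else:
                # Sample is well represented: move the nearest prototype towards it.
                for i, xi in enumerate(x):
                    nearest[i] += self.learning_rate * (xi - nearest[i])

    def predict(self, x):
        """Return the label of the closest prototype (None if nothing learned)."""
        best_label, best_dist = None, math.inf
        for label, protos in self.prototypes.items():
            for p in protos:
                d = math.dist(p, x)
                if d < best_dist:
                    best_label, best_dist = label, d
        return best_label


# Classes arrive one batch at a time; earlier batches are never revisited.
learner = PrototypeLearner(insert_threshold=1.0)
learner.learn_class("mug", [(0.0, 0.0), (0.2, 0.1), (3.0, 3.0)])
mug_before = [list(p) for p in learner.prototypes["mug"]]
learner.learn_class("book", [(10.0, 10.0), (10.1, 9.9)])
```

Because each class owns its prototypes, learning "book" leaves the "mug" prototypes untouched, which is one simple way to avoid forgetting in this toy setting. Each update touches only a single class batch, which is the kind of fast update that interactive learning with a user calls for.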
