
    Latent Replay for Real-Time Continual Learning

    Full text link
    Training deep neural networks at the edge, on lightweight computational devices, embedded systems and robotic platforms, is nowadays very challenging. Continual learning techniques, where complex models are incrementally trained on small batches of new data, can make the learning problem tractable even for CPU-only embedded devices, enabling remarkable levels of adaptiveness and autonomy. However, a number of practical problems need to be solved: catastrophic forgetting before anything else. In this paper we introduce an original technique named "Latent Replay" where, instead of storing a portion of past data in the input space, we store activation volumes at some intermediate layer. This can significantly reduce the computation and storage required by native rehearsal. To keep the representation stable and the stored activations valid, we propose to slow down learning at all the layers below the latent replay one, leaving the layers above free to learn at full pace. In our experiments we show that Latent Replay, combined with existing continual learning techniques, achieves state-of-the-art performance on complex video benchmarks such as CORe50 NICv2 (with nearly 400 small and highly non-i.i.d. batches) and OpenLORIS. Finally, we demonstrate the feasibility of nearly real-time continual learning on the edge by deploying the proposed technique on a smartphone device. Comment: Pre-print v3: 13 pages, 9 figures, 10 tables, 1 algorithm
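
    A minimal sketch of the latent replay idea described above, assuming an illustrative layer split, buffer size, and learning rates rather than the paper's actual architecture or hyperparameters: activations are stored at an intermediate layer, mixed with new activations during training, and the layers below the replay point learn at a much lower rate.

# Illustrative latent replay loop (assumed split point, sizes, and learning rates).
import torch
import torch.nn as nn
from collections import deque

# Split the model at an intermediate "latent replay" layer.
lower = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(8), nn.Flatten())
upper = nn.Sequential(nn.Linear(16 * 8 * 8, 64), nn.ReLU(), nn.Linear(64, 10))

# The buffer stores latent activations instead of raw images.
buffer = deque(maxlen=1500)

# Layers below the replay point learn slowly so stored activations stay valid;
# layers above learn at full pace.
optimizer = torch.optim.SGD([
    {"params": lower.parameters(), "lr": 1e-4},
    {"params": upper.parameters(), "lr": 1e-2},
])
criterion = nn.CrossEntropyLoss()

def train_batch(images, labels):
    latent = lower(images)  # activations of the current batch at the replay layer
    if buffer:              # mix new activations with replayed ones
        old_latent, old_labels = zip(*buffer)
        latent_all = torch.cat([latent, torch.stack(old_latent)])
        labels_all = torch.cat([labels, torch.stack(old_labels)])
    else:
        latent_all, labels_all = latent, labels
    loss = criterion(upper(latent_all), labels_all)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # Keep a few of the new (detached) activations for future replay.
    for z, y in zip(latent.detach()[:20], labels[:20]):
        buffer.append((z, y))

train_batch(torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,)))

    Because the stored activations are detached, replayed samples only update the layers above the replay point, which is what keeps the cost of rehearsal low in this sketch.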

    Continual learning for object and scene classification

    Get PDF
    Since their invention, computers have been a great asset to mankind, primarily because of their ability to perform specific tasks at speeds humans could never match. However, there are many tasks that humans consider easy which are quite difficult for computers to perform. For instance, a human can be shown a picture of an automobile and a bicycle and then easily discriminate between future automobiles and bicycles. For a computer to perform such a task using current algorithms, it typically must first be shown a large number of images of the two classes, with varying features and positions, and then spend a great deal of time learning to extract and identify features so that it can successfully distinguish between the two. Nevertheless, it is still able to perform the task (eventually) and, after the computational training is complete, it can classify images of automobiles and bicycles faster, and sometimes better, than the human. The real out-performance by the human appears when another class is added to the mix, e.g., "aeroplane". The human can immediately add aeroplanes to their set of known objects, whereas a computer would typically have to go almost back to the start and re-learn all the classes from scratch. The reason the network needs to be retrained is a phenomenon named Catastrophic Forgetting, where the changes made to the system during the acquisition of new knowledge bring about the loss of previous knowledge. In this dissertation, we explore Continual Learning and propose a way to deal with Catastrophic Forgetting by building a framework capable of learning new information without having to start from scratch and even "improving" its knowledge of what it already knows. With the above in mind, we implemented a Modular Dynamic Neural Network (MDNN) framework, which is primarily made up of modular sub-networks and progressively grows and re-arranges itself as it learns continuously. The network is structured in such a way that its internal components function independently from one another, so that when new information is learned, only specific sub-networks are altered and most of the old information is not forgotten. The network is divided into two main blocks: the feature extraction component, which is based on a ResNet50, and the modular dynamic classification sub-networks. We have, so far, achieved results below the state of the art on ImageNet and CIFAR10; nevertheless, we demonstrate that the framework meets its initial purpose, which is learning new information without having to start from scratch.

    Since their invention, computers have been an exceptional tool for humanity, mainly because of their ability to perform specific tasks at speeds humans will never reach. Although today's computers can surpass humans in many ways, the opposite is also true. There are many tasks that humans find easy to perform but that are quite difficult for computers. A human, for example, can see an image of an automobile and one of a bicycle and instantly be able to distinguish (practically without error) between future automobiles and future bicycles they are shown. On the other hand, for a computer to perform such a task it typically has to be presented with a large number of images of the two classes, with varied characteristics and positions, and then spend considerable time learning to extract and identify attributes that are later used to distinguish between the two classes. Eventually, the computer will be able to perform the task and, once learning has finished, it can classify large numbers of images of automobiles and bicycles more quickly (and sometimes better) than the human itself. However, it is when a new class is added to the set of classes to be learned, for instance "aeroplane", that humans show a superior capability. The human can immediately add aeroplanes to their set of known objects, whereas a computer, using these learning methods, would typically have to go almost back to the beginning and relearn all the classes from scratch. These methods mainly involve Artificial Neural Networks, which have proven to be among the best-performing machine learning methods for these kinds of problems. Given the use of these networks, the reason learning methods typically require a network to be retrained when new classes appear is a phenomenon usually called Catastrophic Forgetting, in which the changes made to the system during the acquisition of new knowledge result in the loss of previous knowledge. It is in this context that this dissertation explores methods usually referred to as Continual Learning, investigating ways of dealing with this phenomenon by developing a system capable of learning new information without the need to start from scratch.
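
    A minimal sketch of the modular growth idea described above: a fixed ResNet50 feature extractor with one small classification sub-network per class, so that adding a class only trains a new module and leaves existing ones untouched. The class names, module sizes, and add_class method are hypothetical illustrations, not the MDNN's actual routing or growth rules.

# Illustrative modular classifier (hypothetical names; not the actual MDNN code).
import torch
import torch.nn as nn
from torchvision.models import resnet50

class ModularClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = resnet50(weights=None)
        backbone.fc = nn.Identity()       # keep the 2048-d feature extractor
        for p in backbone.parameters():   # shared features stay fixed
            p.requires_grad_(False)
        self.backbone = backbone
        self.heads = nn.ModuleDict()      # one small sub-network per class

    def add_class(self, name):
        # New knowledge only adds and trains a new module; existing modules are
        # untouched, so previously learned classes are not overwritten.
        self.heads[name] = nn.Sequential(nn.Linear(2048, 256), nn.ReLU(),
                                         nn.Linear(256, 1))

    def forward(self, x):
        feats = self.backbone(x)
        # One score per known class; the highest-scoring module wins.
        return {name: head(feats).squeeze(-1) for name, head in self.heads.items()}

model = ModularClassifier()
model.add_class("automobile")
model.add_class("bicycle")
model.add_class("aeroplane")              # added later without retraining the others
scores = model(torch.randn(2, 3, 224, 224))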

    Low Memory Continual Learning Classification Algorithms for Low Resource Hardware

    Get PDF
    Continual Learning (CL) is a machine learning approach that focuses on learning from a continuous stream of data rather than from a single fixed dataset. In this thesis, that focus is applied to machine learning for embedded devices, a field still in the early stages of development. Several algorithms are developed, including the use of pre-trained starting networks, weighted output schemes, and replay or reduced datasets for training, while maintaining a consistent focus on low-resource devices and acceptable performance. The experimental results show an improvement in model training time compared to training a neural network on all available data: on the Fashion MNIST dataset, accuracy drops from ~90% to 73% over 10 classes in exchange for a factor of 10 reduction in training time. The other main result is a reduction in required memory, since only one class worth of data needs to be stored at a time rather than the full dataset for the non-replay algorithms. For the replay-based algorithms, storage is still kept below two classes worth of data for 10 classes, an 80% reduction in memory overall. This was done with the goal of providing a usable model while a fully trained network is developed on backend systems, limiting overall downtime while maintaining system performance.
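
    A minimal sketch of the low-memory training pattern described above, under the assumption that classes arrive one at a time and only the current class worth of samples is ever fed to the training loop; the tiny classifier, Fashion MNIST loading, and class-by-class schedule are illustrative and omit the thesis's weighted-output and replay mechanisms.

# Illustrative class-by-class training loop touching one class of data at a time.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # tiny Fashion MNIST classifier
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

# The downloaded dataset stands in for a stream; a low-resource deployment would
# only ever fetch and keep the current class's samples on the device.
full = datasets.FashionMNIST("data", train=True, download=True,
                             transform=transforms.ToTensor())

for cls in range(10):
    # Select only the current class; earlier classes are not revisited.
    idx = [i for i, y in enumerate(full.targets.tolist()) if y == cls]
    loader = DataLoader(Subset(full, idx), batch_size=64, shuffle=True)
    for images, labels in loader:
        loss = criterion(model(images), labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()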

    Online Continual Learning on Sequences

    Full text link
    Online continual learning (OCL) refers to the ability of a system to learn over time from a continuous stream of data without having to revisit previously encountered training samples. Learning continually in a single data pass is crucial for agents and robots operating in changing environments that are required to acquire, fine-tune, and transfer increasingly complex representations from non-i.i.d. input distributions. Machine learning models that address OCL must alleviate catastrophic forgetting, in which hidden representations are disrupted or completely overwritten when learning from streams of novel input. In this chapter, we summarize and discuss recent deep learning models that address OCL on sequential input through the use (and combination) of synaptic regularization, structural plasticity, and experience replay. Different implementations of replay have been proposed that alleviate catastrophic forgetting in connectionist architectures via the re-occurrence of (latent representations of) input sequences and that functionally resemble mechanisms of hippocampal replay in the mammalian brain. Empirical evidence shows that architectures endowed with experience replay typically outperform those without it in (online) incremental learning tasks. Comment: L. Oneto et al. (eds.), Recent Trends in Learning From Data, Studies in Computational Intelligence 89
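
    A minimal sketch of experience replay in an online, single-pass setting, as discussed above: a small reservoir-sampled buffer of past examples is interleaved with each novel input. The buffer size, toy model, and sampling scheme are illustrative assumptions rather than any specific method surveyed in the chapter.

# Illustrative online experience replay with a reservoir-sampled buffer.
import random
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 5))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

CAPACITY = 500
buffer, seen = [], 0   # small episodic memory of past (x, y) pairs

def reservoir_add(x, y):
    """Keep a uniform sample of the stream in a fixed-size buffer."""
    global seen
    seen += 1
    if len(buffer) < CAPACITY:
        buffer.append((x, y))
    else:
        j = random.randrange(seen)
        if j < CAPACITY:
            buffer[j] = (x, y)

def online_step(x_new, y_new, replay_size=16):
    # Interleaving replayed samples with the novel input is what counters
    # catastrophic forgetting, loosely echoing hippocampal replay.
    xs, ys = [x_new], [y_new]
    for x_old, y_old in random.sample(buffer, min(replay_size, len(buffer))):
        xs.append(x_old)
        ys.append(y_old)
    loss = criterion(model(torch.stack(xs)), torch.stack(ys))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    reservoir_add(x_new, y_new)

# Single data pass: every example is seen exactly once and never reloaded.
for x, y in zip(torch.randn(1000, 32), torch.randint(0, 5, (1000,))):
    online_step(x, y)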