
    Automatic Curriculum Learning For Deep RL: A Short Survey

    Automatic Curriculum Learning (ACL) has become a cornerstone of recent successes in Deep Reinforcement Learning (DRL). These methods shape the learning trajectories of agents by challenging them with tasks adapted to their capacities. In recent years, they have been used to improve sample efficiency and asymptotic performance, to organize exploration, to encourage generalization, and to solve sparse-reward problems, among others. The ambition of this work is twofold: 1) to present a compact and accessible introduction to the Automatic Curriculum Learning literature and 2) to draw a bigger picture of the current state of the art in ACL to encourage the cross-breeding of existing concepts and the emergence of new ideas. Comment: Accepted at IJCAI202
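One common ACL heuristic the survey covers is selecting the next task by learning progress, so the agent trains on tasks at the frontier of its capacities. A minimal sketch, with illustrative window size and task names:

```python
# Minimal sketch of learning-progress-based task selection, one common
# ACL heuristic; the window size and task names are assumptions.

def learning_progress(rewards, window=3):
    """Absolute change between the mean of the last `window` rewards
    and the mean of the `window` before that; 0.0 if history is short."""
    if len(rewards) < 2 * window:
        return 0.0
    recent = sum(rewards[-window:]) / window
    older = sum(rewards[-2 * window:-window]) / window
    return abs(recent - older)

def choose_task(history):
    """Greedily pick the task whose reward curve is changing fastest."""
    return max(history, key=lambda t: learning_progress(history[t]))

history = {
    "easy": [0.9, 0.9, 0.9, 0.9, 0.9, 0.9],  # mastered: no progress
    "hard": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],  # too hard: no progress
    "edge": [0.2, 0.3, 0.3, 0.5, 0.6, 0.7],  # learning frontier
}
print(choose_task(history))  # -> edge
```

Mastered and impossible tasks both show flat reward curves, so the selector naturally concentrates training on tasks where performance is still moving.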

    Boosting Deep Open World Recognition by Clustering

    While convolutional neural networks have brought significant advances in robot vision, their ability is often limited to closed world scenarios, where the number of semantic concepts to be recognized is determined by the available training set. Since it is practically impossible to capture all possible semantic concepts present in the real world in a single training set, we need to break the closed world assumption, equipping our robot with the capability to act in an open world. To provide such ability, a robot vision system should be able to (i) identify whether an instance does not belong to the set of known categories (i.e. open set recognition), and (ii) extend its knowledge to learn new classes over time (i.e. incremental learning). In this work, we show how we can boost the performance of deep open world recognition algorithms by means of a new loss formulation enforcing a global to local clustering of class-specific features. In particular, a first loss term, i.e. global clustering, forces the network to map samples closer to the centroid of the class they belong to, while the second one, local clustering, shapes the representation space so that samples of the same class get closer while neighbours belonging to other classes are pushed away. Moreover, we propose a strategy to learn class-specific rejection thresholds, instead of heuristically estimating a single global threshold as in previous works. Experiments on RGB-D Object and Core50 datasets show the effectiveness of our approach. Comment: IROS/RAL 202
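The two loss terms described above can be sketched on toy 2-D features: a global term measuring the distance to the class centroid, and a triplet-style local term pulling same-class neighbours in and pushing other-class neighbours beyond a margin. The function names, margin, and toy values are illustrative assumptions, not the paper's exact formulation:

```python
import math

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def global_clustering(feat, label, centroids):
    """Distance of a feature to its own class centroid (to be minimised)."""
    return dist(feat, centroids[label])

def local_clustering(feat, label, neighbours, margin=1.0):
    """Pull same-class neighbours closer; hinge-push other-class
    neighbours beyond `margin`."""
    loss = 0.0
    for n_feat, n_label in neighbours:
        d = dist(feat, n_feat)
        if n_label == label:
            loss += d
        else:
            loss += max(0.0, margin - d)
    return loss

centroids = {"cup": (0.0, 0.0), "can": (4.0, 0.0)}
feat, label = (1.0, 0.0), "cup"
neighbours = [((0.5, 0.0), "cup"), ((3.5, 0.0), "can")]
print(global_clustering(feat, label, centroids))  # -> 1.0
print(local_clustering(feat, label, neighbours))  # -> 0.5
```

The far-away "can" neighbour already sits outside the margin, so only the same-class pull contributes to the local term in this toy example.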

    Self-Attention Meta-Learner for Continual Learning

    Continual learning aims to provide intelligent agents capable of learning multiple tasks sequentially with neural networks. One of its main challenges, catastrophic forgetting, is caused by the neural networks' suboptimal ability to learn from non-stationary distributions. In most settings of the current approaches, the agent starts from randomly initialized parameters and is optimized to master the current task regardless of the usefulness of the learned representation for future tasks. Moreover, each future task uses all the previously learned knowledge, although parts of this knowledge might not be helpful for its learning. These issues cause interference among tasks, especially when the data of previous tasks is not accessible. In this paper, we propose a new method, named Self-Attention Meta-Learner (SAM), which learns prior knowledge for continual learning that permits learning a sequence of tasks while avoiding catastrophic forgetting. SAM incorporates an attention mechanism that learns to select the particular representation relevant to each future task. Each task builds a specific representation branch on top of the selected knowledge, avoiding interference between tasks. We evaluate the proposed method on the Split CIFAR-10/100 and Split MNIST benchmarks under task-agnostic inference. We empirically show that we can achieve better performance than several state-of-the-art methods for continual learning by building on top of the selected representation learned by SAM. We also show the role of the meta-attention mechanism in boosting informative features corresponding to the input data and identifying the correct target under task-agnostic inference. Finally, we demonstrate that popular existing continual learning methods gain a performance boost when they adopt SAM as a starting point.
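The selection mechanism described above, where an attention module weights stored prior representations by their relevance to a new task, can be sketched as a plain softmax attention over task keys. The representations and query here are toy values, not SAM's learned ones:

```python
import math

def attend(query, keys):
    """Softmax attention weights of a task query over stored prior
    representations; a new task then builds on the most relevant ones."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    m = max(scores)                      # stabilise the exponentials
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

# Toy prior representations from two earlier tasks (illustrative values).
keys = [(1.0, 0.0), (0.0, 1.0)]
weights = attend((0.9, 0.1), keys)       # new task resembles task 0
print([round(w, 3) for w in weights])
```

A new task whose query aligns with the first stored representation receives most of the attention mass, so its branch builds mainly on that knowledge rather than on everything learned so far.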

    A Criminological Classification of Robots: A Risk-Based Approach

    The subject of the research is key criminal risks in robotics. The purpose of the article is to confirm or disprove the hypothesis that the key criminal risks of using robots can be identified and classified. The author describes the key aspects of applying a risk-based approach to the assessment of robotic activities, identifies the key risks of using robots, and gives a criminological classification. The methodology includes a formal logical method, a systematic approach, formal legal interpretation of legal acts and academic literature, and SWOT analysis. The main results of the study: the author applies the main provisions of criminal riskology to the assessment of encroachments involving robots. Key risks and challenges of using robots are identified. The severity of the consequences of harm caused by robots (from minor to critical risk) is assessed, and a matrix of the probability of its occurrence is provided. The author's criminological classification of robots is based on the risk-based approach and rests on two grounds: the first is the category of public danger, and the second is the potential severity of the consequences of harm caused by robots. The causal complex that can lead to criminal risks in robotics is identified; the grounds of such risks are divided into those related to the mechanical subsystem, the digital subsystem, and the power supply subsystem of robots. Conclusions: the risk-based approach is the most progressive and effective basis for regulating criminal-law relations in robotics. The author demonstrates the existence of real risks to the peace and security of mankind, the life and health of people, objects of wildlife, and non-living material objects from the use of robots. It is necessary to recognize robotics as a source of increased potential criminal danger and to adopt appropriate regulation as soon as possible.
The necessity and expediency of applying a risk-based approach to robotics is theoretically substantiated, and the characteristics of robots that are important in assessing the criminal potential of their exploitation are evaluated. The conclusions and recommendations of this paper may become a basis for the implementation of the risk-based approach in the legal regulation of robotics. The risk matrix presented in the article can be used to establish a framework for regulatory impact on robotics, to assess the consequences of potential harm, and to minimize it.
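A severity-by-probability risk matrix of the kind the article describes is, computationally, a small ordinal lookup. The band labels, ordinal scales, and thresholds below are hypothetical illustrations, not the author's actual matrix:

```python
# Hypothetical sketch of a severity-by-probability risk matrix;
# the scales, scoring rule, and band thresholds are assumptions.
SEVERITY = ["minor", "moderate", "major", "critical"]
PROBABILITY = ["rare", "possible", "likely", "almost_certain"]

def risk_level(severity, probability):
    """Combine ordinal severity and probability into a risk band."""
    score = (SEVERITY.index(severity) + 1) * (PROBABILITY.index(probability) + 1)
    if score >= 9:
        return "high"
    if score >= 4:
        return "medium"
    return "low"

print(risk_level("critical", "likely"))  # -> high (score 4 * 3 = 12)
print(risk_level("minor", "rare"))       # -> low  (score 1 * 1 = 1)
```

Such a lookup makes the regulatory consequence of a robot's classification reproducible: two assessors applying the same matrix to the same robot reach the same band.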

    LOW MEMORY CONTINUAL LEARNING CLASSIFICATION ALGORITHMS FOR LOW RESOURCE HARDWARE

    Continual Learning (CL) is a machine learning approach that focuses on learning from data continuously rather than from a single fixed dataset. In this thesis, the same focus is applied to the field of machine learning for embedded devices, which is still in the early stages of development. Several algorithms are developed, including pre-trained starting networks, weighted output schemes, and replay or reduced datasets for training, while maintaining a consistent focus on low-resource devices and acceptable performance. The experimental results show an improvement in model training time compared with training a neural network on all available data: on the Fashion MNIST dataset, accuracy dropped from ~90% to 73% on 10 classes in exchange for a factor-of-10 reduction in training time. The other main result is a reduction in required memory: for the non-replay algorithms, only one class's worth of data needs to be stored at a time rather than the full dataset, and for the replay-based algorithms, less than two classes' worth of data for 10 classes, an 80% reduction in memory overall. This was done with the goal of creating a usable model while a fully trained network is developed on backend systems, limiting overall downtime while maintaining system performance.
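The memory bound described for the replay-based algorithms, keeping under two classes' worth of data out of ten, amounts to a per-class capped buffer. A minimal sketch, with illustrative class counts and buffer sizes:

```python
from collections import deque

class TinyReplay:
    """Keep at most `per_class` stored samples per seen class, so replay
    memory stays a small fraction of the full dataset (illustrative)."""

    def __init__(self, per_class):
        self.per_class = per_class
        self.buffers = {}

    def add(self, label, sample):
        # deque(maxlen=...) silently evicts the oldest sample when full.
        buf = self.buffers.setdefault(label, deque(maxlen=self.per_class))
        buf.append(sample)

    def size(self):
        return sum(len(b) for b in self.buffers.values())

replay = TinyReplay(per_class=20)
for label in range(10):          # 10 classes, 100 samples each
    for i in range(100):
        replay.add(label, i)
print(replay.size())             # -> 200 of 1000 samples kept (80% saved)
```

Capping per class rather than globally keeps the buffer balanced: an early class cannot be crowded out of memory by the classes that arrive after it.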

    A Peer-to-peer Federated Continual Learning Network for Improving CT Imaging from Multiple Institutions

    Deep learning techniques have been widely used in computed tomography (CT) but require large datasets to train networks. Moreover, data sharing among multiple institutions is limited due to data privacy constraints, which hinders the development of high-performance DL-based CT imaging models from multi-institutional collaborations. The federated learning (FL) strategy is an alternative way to train models without centralizing data from multiple institutions. In this work, we propose a novel peer-to-peer federated continual learning strategy to improve low-dose CT imaging performance across multiple institutions. The newly proposed method is called peer-to-peer continual FL with intermediate controllers, i.e., icP2P-FL. Specifically, unlike the conventional FL model, the proposed icP2P-FL does not require a central server that coordinates training information for a global model. In the proposed icP2P-FL method, peer-to-peer federated continual learning is introduced, wherein the DL-based model is continually trained one client after another via model transferring and inter-institutional parameter sharing, exploiting the common characteristics of CT data among the clients. Furthermore, an intermediate controller is developed to make the overall training more flexible. Numerous experiments were conducted on the AAPM low-dose CT Grand Challenge dataset and local datasets, and the experimental results showed that the proposed icP2P-FL method outperforms the comparative methods both qualitatively and quantitatively, and reaches an accuracy similar to a model trained on pooled data from all the institutions.
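The serverless training loop described above, where a model is handed from client to client and fine-tuned locally at each stop, can be caricatured with a single scalar parameter. All values here are toy illustrations; the sketch also shows why naive sequential transfer drifts toward the last client, the kind of problem the paper's intermediate controller and continual-learning machinery address:

```python
# Toy sketch of peer-to-peer continual training: a single scalar "model"
# is passed from client to client with no central server; each client
# nudges it toward its own local optimum (all values are illustrative).

def local_train(w, target, steps=50, lr=0.1):
    """A few gradient steps on one client's local loss (w - target)^2."""
    for _ in range(steps):
        w -= lr * 2 * (w - target)
    return w

clients = [1.0, 2.0, 3.0]   # each client's local optimum
w = 0.0
for _ in range(3):          # three peer-to-peer rounds
    for target in clients:
        w = local_train(w, target)
print(round(w, 3))
```

With this many local steps the model essentially snaps to whichever client trained it last, forgetting the others, which is exactly the failure mode that motivates adding continual-learning constraints to the peer-to-peer transfer.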

    Is Class-Incremental Enough for Continual Learning?

    The ability of a model to learn continually can be empirically assessed in different continual learning scenarios. Each scenario defines the constraints and the opportunities of the learning environment. Here, we challenge the current trend in the continual learning literature to experiment mainly on class-incremental scenarios, where classes present in one experience are never revisited. We posit that an excessive focus on this setting may be limiting for future research on continual learning, since class-incremental scenarios artificially exacerbate catastrophic forgetting, at the expense of other important objectives like forward transfer and computational efficiency. In many real-world environments, in fact, repetition of previously encountered concepts occurs naturally and contributes to softening the disruption of previous knowledge. We advocate for a more in-depth study of alternative continual learning scenarios, in which repetition is integrated by design in the stream of incoming information. Starting from already existing proposals, we describe the advantages such class-incremental with repetition scenarios could offer for a more comprehensive assessment of continual learning models
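A class-incremental-with-repetition stream of the kind advocated above can be generated by interleaving newly introduced classes with a random sample of already-seen ones. The generator below is an illustrative sketch, not a benchmark from the paper:

```python
import random

def stream_with_repetition(classes, experiences, new_per_exp=2, seed=0):
    """Build a stream of experiences where previously seen classes can
    reappear alongside newly introduced ones (illustrative generator)."""
    rng = random.Random(seed)
    fresh = list(classes)    # classes not yet introduced
    seen, stream = [], []
    for _ in range(experiences):
        new = [fresh.pop(0) for _ in range(min(new_per_exp, len(fresh)))]
        repeats = rng.sample(seen, k=min(2, len(seen)))
        stream.append(sorted(new + repeats))
        seen.extend(new)
    return stream

stream = stream_with_repetition(list(range(10)), experiences=5)
for experience in stream:
    print(experience)
```

Unlike a pure class-incremental split, every experience after the first can revisit old classes, so forgetting is softened by design rather than artificially maximized.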

    Investigating Continual Learning Strategies in Neural Networks

    This paper explores the role of continual learning strategies when neural networks are confronted with learning tasks sequentially. We analyze the stability-plasticity dilemma with three factors in mind: the type of network architecture used, the continual learning scenario defined, and the continual learning strategy implemented. Our results show that complementary learning systems and neural volume contribute significantly to memory retrieval and consolidation in neural networks. Finally, we demonstrate that regularization strategies such as elastic weight consolidation are better suited to larger neural networks, whereas rehearsal strategies such as gradient episodic memory are better suited to smaller neural networks.
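The elastic weight consolidation strategy mentioned above adds a quadratic penalty that anchors parameters important to earlier tasks (those with large Fisher information) near their old values. A minimal sketch with toy parameter and Fisher values:

```python
def ewc_penalty(params, old_params, fisher, lam=1.0):
    """Elastic weight consolidation penalty: (lam/2) * sum_i F_i * (p_i - p*_i)^2,
    anchoring important parameters near their old-task values."""
    return lam / 2 * sum(
        f * (p - p_old) ** 2
        for p, p_old, f in zip(params, old_params, fisher)
    )

old = [1.0, -2.0]
fisher = [10.0, 0.1]   # first parameter mattered most for the old task
penalty = ewc_penalty([1.5, 0.0], old, fisher)
print(penalty)
```

Moving the unimportant second parameter a long way costs little, while a small shift of the first, high-Fisher parameter dominates the penalty: this is the mechanism that trades plasticity for stability.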