1,092 research outputs found

    Towards Real-World Data Streams for Deep Continual Learning

    Continual Learning deals with Artificial Intelligence agents striving to learn from a never-ending stream of data. Recently, Deep Continual Learning has focused on the design of new strategies to endow Artificial Neural Networks with the ability to learn continuously without forgetting previous knowledge. In fact, the learning process of any Artificial Neural Network model is well known to lack sufficient stability to preserve existing knowledge when learning new information. This phenomenon, called catastrophic forgetting or simply forgetting, is considered one of the main obstacles for the design of effective Continual Learning agents. However, existing strategies designed to mitigate forgetting have been evaluated on a restricted set of Continual Learning scenarios. The most used one is, by far, the Class-Incremental scenario applied on object detection tasks. Even though they drove interest in Continual Learning, Class-Incremental scenarios strongly constrain the properties of the data stream, thus limiting their ability to model real-world environments. The core of this thesis concerns the introduction of three Continual Learning data streams, whose design is centered around properties of specific real-world environments. First, we propose the Class-Incremental with Repetition scenario, which builds a data stream including both the introduction of new concepts and the repetition of previous ones. Repetition is naturally present in many environments and it constitutes an important source of information. Second, we formalize the Continual Pre-Training scenario, which leverages a data stream of unstructured knowledge to keep a pre-trained model updated over time. One important objective of this scenario is to study how to continuously build general, robust representations that do not strongly depend on the specific task to be solved. This is a fundamental property of real-world agents, which build cross-task knowledge and then adapt it to specific needs. 
Third, we study Continual Learning scenarios where data streams are composed of temporally-correlated data. Temporal correlation is ubiquitous and lies at the foundation of most environments we, as humans, experience during our lives. We leverage Recurrent Neural Networks as our main model, due to their intrinsic ability to model temporal correlations. We discovered that, when applied to recurrent models, Continual Learning strategies behave in an unexpected manner. This highlights the limits of the current experimental validation, mostly focused on Computer Vision tasks. Ultimately, the introduction of new data streams contributed to deepening our understanding of how Artificial Neural Networks learn continuously. We discovered that forgetting strongly depends on the properties of the data stream and we observed large changes from one data stream to another. Moreover, when forgetting is mild, we were able to effectively mitigate it with simple strategies, or even without any specific ones. Loosening the focus on forgetting allows us to turn our attention to other interesting problems, outlined in this thesis, like (i) separation between continual representation learning and quick adaptation to novel tasks, (ii) robustness to unbalanced data streams and (iii) ability to continuously learn temporal correlations. These objectives currently defy existing strategies and will likely represent the next challenge for Continual Learning research.

    Energy Efficient Neocortex-Inspired Systems with On-Device Learning

    Shifting compute workloads from the cloud toward edge devices can significantly improve the overall latency for inference and learning. However, this paradigm shift exacerbates the resource constraints on the edge devices. Neuromorphic computing architectures, inspired by neural processes, are natural substrates for edge devices. They offer co-located memory, in-situ training, energy efficiency, high memory density, and compute capacity in a small form factor. Owing to these features, in the recent past, there has been a rapid proliferation of hybrid CMOS/Memristor neuromorphic computing systems. However, most of these systems offer limited plasticity, target either spatial or temporal input streams, and are not demonstrated on large-scale heterogeneous tasks. There is a critical knowledge gap in designing scalable neuromorphic systems that can support hybrid plasticity for spatio-temporal input streams on edge devices. This research proposes Pyragrid, a low-latency and energy-efficient neuromorphic computing system for processing spatio-temporal information natively on the edge. Pyragrid is a full-scale custom hybrid CMOS/Memristor architecture with analog computational modules and an underlying digital communication scheme. Pyragrid is designed for hierarchical temporal memory, a biomimetic sequence memory algorithm inspired by the neocortex. It features a novel synthetic synapses representation that enables dynamic synaptic pathways with reduced memory usage and interconnects. The dynamic growth in the synaptic pathways is emulated in the memristor device physical behavior, while the synaptic modulation is enabled through a custom training scheme optimized for area and power. Pyragrid features data reuse, in-memory computing, and event-driven sparse local computing to reduce data movement by ~44x and improve system throughput and power efficiency by ~3x and ~161x, respectively, over a custom CMOS digital design. 
The innate sparsity in Pyragrid results in overall robustness to noise and device failure, particularly when processing visual input and predicting time series sequences. Porting the proposed system to edge devices can enhance their computational capability, response time, and battery life.

    Coordinated Voltage and Reactive Power Control for Renewable Dominant Smart Distribution Systems

    Driven by their economic and environmental advantages, smart grids promote the deployment of active components, including renewable energy sources (RESs), energy storage systems (ESSs), and electric vehicles (EVs), for sustainability and environmental benefits. As a result of smart grid technologies and the amount of data collected by smart meters, better operation and control schemes can be developed to allow for cleaner energy with high efficiency, and without breaching network operating constraints. Power distribution networks may face some operational and control challenges as the integration of intermittent energy sources (wind and PV power systems) increases. Some of these challenges include voltage rise and fluctuation, reverse power flow, and the malfunction of conventional Volt/Var control devices. Depending on their location, RESs may introduce two issues related to the Volt/Var control problem, the first of which is that the severity of loading variations will be greater than in the case without RESs. The second occurs when the RES is connected between the load center and any regulating devices. The power in-feed from the intermittent RESs may not only mislead the regulator’s control circuit, resulting in unfavorable voltage, but may also force the regulator taps to operate randomly following bus voltage variations. This thesis investigates and presents a methodology for the Volt/Var control problem in Smart Distribution Grids (SDGs) under the high penetration and fluctuation of RESs. The research involves the application of predictive control actions to optimally set Volt/Var control devices before the predicted voltage violation takes place. 
The main objective of this controller is to manage and control the operation of Volt/Var devices in an optimal way that improves the voltage profile along the feeders, reduces real power losses and minimizes the number of Volt/Var device taps and/or switching movements under all loading conditions and for high penetration RESs. This thesis first presents a very Short-Term Stacking Ensemble (STSE) forecasting model for solar PV and wind power outputs that is developed to predict the generated power for intervals of 15 minutes. The proposed model combines heterogeneous machine learning algorithms composed of three well-established models: Support Vector Regression (SVR); Radial Basis Function Neural Network (RBFNN); and Random Forest (RF) heuristically via SVR. The STSE model aims to minimize the prediction error associated with renewable resources when used in the real-time operation of power distribution networks. Secondly, a day-ahead Predictive Volt/Var Control (PVVC) model is developed to find the optimal coordination between Volt/Var control devices under the high penetration and power variations of RESs. The objective of the PVVC model is defined as simultaneous minimization of voltage deviation at each bus, power losses, operating cycle of regulation equipment, and RES curtailment. The benefit of using smart-inverter-interfaced RESs with the capability of injecting/absorbing reactive power is examined and applied as ancillary services for voltage support. Thirdly, a Sequential Predictive Control (SPC) Strategy for smart grids is developed. The model uses the past and currently available data to forecast demand and RES outputs for intervals of 15 minutes, with real-time updating mechanisms. It then schedules the settings and operations of Volt/Var control devices by solving the Volt/Var control problem in a rolling horizon optimization framework. 
Because the optimization must be solved in a short interval with a global solution, a solution methodology for linearizing the nonlinear optimization problem is adapted. The original control problem, which is a Mixed-Integer Nonlinear Programming (MINLP) optimization problem, is transformed into a Mixed-Integer Second-Order Conic Programming (MISOCP) problem that guarantees a global solution through convexity and remarkably reduces the computational burden. Case studies carried out to compare the proposed model against state-of-the-art models provide evidence for the proposed model’s effectiveness. Results indicate that the SPC is capable of accurately solving the control problem within small time slots. The proposed models aim to efficiently operate SDGs at a high penetration level of RES for a day-ahead, as well as in real-time, depending on the preference of network operators. The primary purpose is to minimize operating costs while increasing the efficiency and lifespan of Volt/Var control devices.

    Continual learning for computer vision applications

    One of the most visionary goals of Artificial Intelligence is to create a system able to mimic and eventually surpass the intelligence observed in biological systems including, ambitiously, the one observed in humans. The main distinctive strength of humans is their ability to build a deep understanding of the world by learning continuously and drawing from their experiences. This ability, which is found in various degrees in all intelligent biological beings, allows them to adapt and properly react to changes by incrementally expanding and refining their knowledge. Arguably, achieving this ability is one of the main goals of Artificial Intelligence and a cornerstone towards the creation of intelligent artificial agents. Modern Deep Learning approaches allowed researchers and industries to achieve great advancements towards the resolution of many long-standing problems in areas like Computer Vision and Natural Language Processing. However, while this current age of renewed interest in AI allowed for the creation of extremely useful applications, a concerningly limited effort is being directed towards the design of systems able to learn continuously. The biggest problem that hinders an AI system from learning incrementally is the catastrophic forgetting phenomenon. This phenomenon, which was discovered in the 90s, naturally occurs in Deep Learning architectures where classic learning paradigms are applied when learning incrementally from a stream of experiences. This dissertation revolves around the Continual Learning field, a sub-field of Machine Learning research that has recently made a comeback following the renewed interest in Deep Learning approaches. This work will focus on a comprehensive view of continual learning by considering algorithmic, benchmarking, and applicative aspects of this field. 
This dissertation will also touch on community aspects such as the design and creation of research tools aimed at supporting Continual Learning research, and the theoretical and practical aspects concerning public competitions in this field.