
    Critical Learning Periods for Multisensory Integration in Deep Networks

    Full text link
    We show that the ability of a neural network to integrate information from diverse sources hinges critically on being exposed to properly correlated signals during the early phases of training. Interfering with the learning process during this initial stage can permanently impair the development of a skill, both in artificial and biological systems, where the phenomenon is known as a critical learning period. We show that critical periods arise from the complex and unstable early transient dynamics, which are decisive for the final performance of the trained system and its learned representations. This evidence challenges the view, engendered by analysis of wide and shallow networks, that the early learning dynamics of neural networks are simple, akin to those of a linear model. Indeed, we show that even deep linear networks exhibit critical learning periods for multi-source integration, while shallow networks do not. To better understand how internal representations change in response to disturbances or sensory deficits, we introduce a new measure of source sensitivity, which allows us to track the inhibition and integration of sources during training. Our analysis of inhibition suggests cross-source reconstruction as a natural auxiliary training objective, and indeed we show that architectures trained with cross-sensor reconstruction objectives are remarkably more resilient to critical periods. Our findings suggest that the recent success of self-supervised multi-modal training over previous supervised efforts may be due in part to more robust learning dynamics, and not solely to better architectures and/or more data.
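    As a rough illustration of the cross-source reconstruction objective mentioned above, the following sketch (in PyTorch, with illustrative module names, sizes, and loss weight that are assumptions, not the authors' implementation) adds a term that reconstructs each source from the other source's representation alongside the main task loss.

```python
# Minimal sketch of a cross-source reconstruction auxiliary objective.
# Architecture and dimensions are illustrative assumptions.
import torch
import torch.nn as nn

class CrossSourceModel(nn.Module):
    def __init__(self, dim_a=32, dim_b=32, hidden=64, n_classes=10):
        super().__init__()
        self.enc_a = nn.Sequential(nn.Linear(dim_a, hidden), nn.ReLU())
        self.enc_b = nn.Sequential(nn.Linear(dim_b, hidden), nn.ReLU())
        self.dec_a2b = nn.Linear(hidden, dim_b)   # reconstruct source B from A's code
        self.dec_b2a = nn.Linear(hidden, dim_a)   # reconstruct source A from B's code
        self.head = nn.Linear(2 * hidden, n_classes)  # fused task head

    def forward(self, x_a, x_b):
        z_a, z_b = self.enc_a(x_a), self.enc_b(x_b)
        logits = self.head(torch.cat([z_a, z_b], dim=-1))
        # cross-source reconstruction encourages each branch to stay
        # informative about the other source, counteracting early inhibition
        rec = nn.functional.mse_loss(self.dec_a2b(z_a), x_b) + \
              nn.functional.mse_loss(self.dec_b2a(z_b), x_a)
        return logits, rec

model = CrossSourceModel()
x_a, x_b, y = torch.randn(8, 32), torch.randn(8, 32), torch.randint(0, 10, (8,))
logits, rec = model(x_a, x_b)
loss = nn.functional.cross_entropy(logits, y) + 0.1 * rec  # auxiliary weight is a guess
loss.backward()
```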

    PBIL for Optimizing Hyperparameters of Convolutional Neural Networks and STL Decomposition

    Get PDF
    The optimization of hyperparameters in Deep Neural Networks is a critical task for final performance, but it involves many subjective decisions based on previous researchers' expertise. This paper presents an implementation of Population-Based Incremental Learning for the automatic optimization of hyperparameters in Deep Learning architectures. Specifically, the proposed architecture combines preprocessing of the time series input with Seasonal Decomposition of Time Series by Loess, a classical method for decomposing time series, and forecasting with Convolutional Neural Networks. In the past, this combination has produced promising results, but was penalized by an increasing number of parameters. The proposed architecture is applied to the prediction of the 222Rn level at the Canfranc Underground Laboratory (Spain). By predicting the low-level periods of 222Rn, potential contamination during maintenance operations in the experiments hosted by the laboratory could be minimized. This paper shows that Population-Based Incremental Learning can be used to choose optimized hyperparameters for Deep Learning architectures at a reasonable computational cost. (Ministerio de Economía y Competitividad MDM-2015-050)
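    For readers unfamiliar with PBIL, the sketch below shows its core loop on a hypothetical binary encoding of two CNN hyperparameters; the encoding, fitness stand-in, and constants are assumptions for illustration, not the paper's configuration.

```python
# Minimal sketch of Population-Based Incremental Learning (PBIL).
# Encoding, fitness function, and constants are illustrative assumptions.
import numpy as np

def decode(bits):
    # hypothetical encoding: 3 bits -> number of filters, 2 bits -> kernel size
    n_filters = 8 * (1 + int("".join(map(str, bits[:3])), 2))
    kernel = 2 + int("".join(map(str, bits[3:5])), 2)
    return {"n_filters": n_filters, "kernel": kernel}

def fitness(hp):
    # stand-in for "train the CNN on the STL-decomposed series, return -val_error"
    return -((hp["n_filters"] - 32) ** 2 + (hp["kernel"] - 3) ** 2)

n_bits, pop_size, lr = 5, 20, 0.1
p = np.full(n_bits, 0.5)                       # probability vector over bits
rng = np.random.default_rng(0)
for _ in range(50):
    pop = (rng.random((pop_size, n_bits)) < p).astype(int)   # sample population
    scores = [fitness(decode(ind)) for ind in pop]
    best = pop[int(np.argmax(scores))]
    p = (1 - lr) * p + lr * best               # shift probabilities toward the best individual
print(decode((p > 0.5).astype(int)))
```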

    Safe Robot Planning and Control Using Uncertainty-Aware Deep Learning

    Get PDF
    In order for robots to autonomously operate in novel environments over extended periods of time, they must learn and adapt to changes in the dynamics of their motion and the environment. Neural networks have been shown to be a versatile and powerful tool for learning dynamics and semantic information. However, there is reluctance to deploy these methods in safety-critical or high-risk applications, since neural networks tend to be black-box function approximators. Therefore, there is a need to investigate how these machine learning methods can be safely leveraged for learning-based control, planning, and traversability. The aim of this thesis is to explore methods both for establishing safety guarantees and for accurately quantifying risks when using deep neural networks for robot planning, especially in high-risk environments. First, we consider uncertainty-aware Bayesian Neural Networks for adaptive control and introduce a method for guaranteeing safety under certain assumptions. Second, we investigate deep quantile regression methods for learning time- and state-varying uncertainties, which we use to perform trajectory optimization with Model Predictive Control. Third, we introduce a complete framework for risk-aware traversability and planning, which we use to enable safe exploration of extreme environments. Fourth, we again leverage deep quantile regression and establish a method for accurately learning the distribution of traversability risks in these environments, which can be used to create safety constraints for planning and control.
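    A minimal sketch of the deep quantile regression ingredient, assuming the standard pinball loss and illustrative quantile levels and model (not the thesis implementation), might look as follows.

```python
# Minimal sketch of deep quantile regression via the pinball loss.
# Quantile levels and the network are illustrative assumptions.
import torch

def pinball_loss(pred, target, quantiles):
    # pred: (batch, n_quantiles), target: (batch, 1)
    err = target - pred
    q = torch.tensor(quantiles).view(1, -1)
    return torch.max(q * err, (q - 1) * err).mean()

quantiles = [0.05, 0.5, 0.95]            # lower bound, median, upper bound
model = torch.nn.Sequential(torch.nn.Linear(4, 32), torch.nn.ReLU(),
                            torch.nn.Linear(32, len(quantiles)))
x, y = torch.randn(16, 4), torch.randn(16, 1)
loss = pinball_loss(model(x), y, quantiles)
loss.backward()
# The upper (0.95) output can then serve as a conservative risk bound
# inside an MPC cost or constraint.
```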

    Arguing Machines: Human Supervision of Black Box AI Systems That Make Life-Critical Decisions

    Full text link
    We consider the paradigm of a black box AI system that makes life-critical decisions. We propose an "arguing machines" framework that pairs the primary AI system with a secondary one that is independently trained to perform the same task. We show that disagreement between the two systems, without any knowledge of the underlying system design or operation, is sufficient to arbitrarily improve the accuracy of the overall decision pipeline given human supervision over disagreements. We demonstrate this system in two applications: (1) an illustrative example of image classification and (2) large-scale real-world semi-autonomous driving data. For the first application, we apply this framework to image classification, reducing top-5 error on ImageNet from 8.0% to 2.8%. For the second application, we apply this framework to Tesla Autopilot and demonstrate the ability to predict 90.4% of system disengagements that were labeled by human annotators as challenging and needing human supervision.
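    A toy sketch of the disagreement-based escalation logic is given below; the classifier outputs and the probability-gap threshold are assumptions for illustration, and the driving application would compare control outputs rather than class probabilities.

```python
# Minimal sketch of an "arguing machines" style disagreement check:
# two independently trained models vote, disagreement defers to a human.
# Threshold and inputs are illustrative assumptions.
import numpy as np

def decide(primary_probs, secondary_probs, gap_threshold=0.5):
    """Return (decision, escalate) for a single input."""
    a = int(np.argmax(primary_probs))
    b = int(np.argmax(secondary_probs))
    # disagreement on the label, or a large confidence gap, triggers escalation
    disagreement = (a != b) or abs(primary_probs[a] - secondary_probs[a]) > gap_threshold
    if disagreement:
        return None, True        # defer to human supervision
    return a, False              # automated decision stands

decision, escalate = decide(np.array([0.9, 0.1]), np.array([0.2, 0.8]))
print(decision, escalate)        # None True -> flagged for human review
```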

    Anomaly Detection in Multivariate Non-stationary Time Series for Automatic DBMS Diagnosis

    Full text link
    Anomaly detection in database management systems (DBMSs) is difficult because of the growing number of statistics (stat) and event metrics in big data systems. In this paper, I propose an automatic DBMS diagnosis system that detects anomaly periods with abnormal DB stat metrics and finds the causal events in those periods. Reconstruction error from a deep autoencoder and a statistical process control approach are applied to detect time periods with anomalies. Related events are found using time series similarity measures between events and abnormal stat metrics. After training the deep autoencoder with DBMS metric data, the efficacy of anomaly detection is evaluated on other DBMSs containing anomalies. Experimental results show the effectiveness of the proposed model, in particular the batch temporal normalization layer. The proposed model is used to publish automatic DBMS diagnosis reports that guide DBMS configuration and SQL tuning.
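    As a sketch of the detection step, assuming a plain autoencoder and a mean-plus-three-sigma control limit (the paper's exact architecture, normalization, and limits may differ), reconstruction-error thresholding could look like this.

```python
# Minimal sketch of anomaly scoring with autoencoder reconstruction error
# plus a statistical-process-control style limit (mean + 3 sigma).
# Data, architecture, and limit are illustrative assumptions.
import torch
import torch.nn as nn

ae = nn.Sequential(nn.Linear(20, 8), nn.ReLU(), nn.Linear(8, 20))
opt = torch.optim.Adam(ae.parameters(), lr=1e-3)

normal = torch.randn(256, 20)                 # stand-in for normal DB stat metrics
for _ in range(200):                          # train on normal periods only
    opt.zero_grad()
    loss = nn.functional.mse_loss(ae(normal), normal)
    loss.backward()
    opt.step()

with torch.no_grad():
    err = ((ae(normal) - normal) ** 2).mean(dim=1)
    limit = err.mean() + 3 * err.std()        # control limit from training residuals
    new = torch.randn(32, 20) * 3             # stand-in for a suspicious metric window
    new_err = ((ae(new) - new) ** 2).mean(dim=1)
    anomalous = new_err > limit               # periods flagged for diagnosis
print(int(anomalous.sum()), "of", len(new), "windows flagged")
```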

    Learning Scheduling Algorithms for Data Processing Clusters

    Full text link
    Efficiently scheduling data processing jobs on distributed compute clusters requires complex algorithms. Current systems, however, use simple generalized heuristics and ignore workload characteristics, since developing and tuning a scheduling policy for each workload is infeasible. In this paper, we show that modern machine learning techniques can generate highly efficient policies automatically. Our system, Decima, uses reinforcement learning (RL) and neural networks to learn workload-specific scheduling algorithms without any human instruction beyond a high-level objective such as minimizing average job completion time. Off-the-shelf RL techniques, however, cannot handle the complexity and scale of the scheduling problem. To build Decima, we had to develop new representations for jobs' dependency graphs, design scalable RL models, and invent RL training methods that cope with continuous stochastic job arrivals. Our prototype integration with Spark on a 25-node cluster shows that Decima improves average job completion time over hand-tuned scheduling heuristics by at least 21%, achieving up to a 2x improvement during periods of high cluster load.
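    To make the RL formulation concrete, here is a heavily simplified REINFORCE-style sketch on a toy single-machine queue; the environment, features, and network are assumptions and do not reflect Decima's graph-based representations or its Spark integration.

```python
# Heavily simplified sketch of policy-gradient scheduling: a network scores
# runnable jobs, and the chosen ordering is reinforced by negative average
# job completion time. Toy environment; all details are assumptions.
import torch
import torch.nn as nn

policy = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)

def episode(jobs):
    """jobs: (n, 2) tensor of per-job features; returns log-probs and reward."""
    pending, t, total, logps = list(range(len(jobs))), 0.0, 0.0, []
    while pending:
        scores = policy(jobs[pending]).squeeze(-1)        # score each runnable job
        dist = torch.distributions.Categorical(logits=scores)
        k = dist.sample()
        logps.append(dist.log_prob(k))
        j = pending.pop(int(k))
        t += float(jobs[j, 1])                            # run chosen job to completion
        total += t                                        # accumulate completion time
    return torch.stack(logps), -total / len(jobs)         # reward: -avg completion time

for _ in range(300):
    jobs = torch.rand(6, 2) * 10
    logps, reward = episode(jobs)
    loss = -(reward * logps.sum())                        # REINFORCE update
    opt.zero_grad()
    loss.backward()
    opt.step()
```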