    Beating the Perils of Non-Convexity: Guaranteed Training of Neural Networks using Tensor Methods

    Training neural networks is a challenging non-convex optimization problem, and backpropagation or gradient descent can get stuck in spurious local optima. We propose a novel algorithm based on tensor decomposition for guaranteed training of two-layer neural networks. We provide risk bounds for our proposed method, with a polynomial sample complexity in the relevant parameters, such as input dimension and number of neurons. While learning arbitrary target functions is NP-hard, we provide transparent conditions on the function and the input for learnability. Our training method is based on tensor decomposition, which provably converges to the global optimum under a set of mild non-degeneracy conditions. It consists of simple, embarrassingly parallel linear and multi-linear operations, and is competitive with standard stochastic gradient descent (SGD) in terms of computational complexity. Thus, we propose a computationally efficient method with guaranteed risk bounds for training neural networks with one hidden layer. Comment: The tensor decomposition analysis is expanded, and the analysis of ridge regression is added for recovering the parameters of last layer of neural network.

    System Identification for Nonlinear Control Using Neural Networks

    An approach to incorporating artificial neural networks in nonlinear, adaptive control systems is described. The controller contains three principal elements: a nonlinear inverse dynamic control law whose coefficients depend on a comprehensive model of the plant, a neural network that models system dynamics, and a state estimator whose outputs drive the control law and train the neural network. Attention is focused on the system identification task, which combines an extended Kalman filter with generalized spline function approximation. Continual learning is possible during normal operation, without taking the system off line for specialized training. Nonlinear inverse dynamic control requires smooth derivatives as well as function estimates, imposing stringent goals on the approximating technique.

    Penalized Estimation of Directed Acyclic Graphs From Discrete Data

    Bayesian networks, with structure given by a directed acyclic graph (DAG), are a popular class of graphical models. However, learning Bayesian networks from discrete or categorical data is particularly challenging, due to the large parameter space and the difficulty in searching for a sparse structure. In this article, we develop a maximum penalized likelihood method to tackle this problem. Instead of the commonly used multinomial distribution, we model the conditional distribution of a node given its parents by multi-logit regression, in which an edge is parameterized by a set of coefficient vectors with dummy variables encoding the levels of a node. To obtain a sparse DAG, a group norm penalty is employed, and a blockwise coordinate descent algorithm is developed to maximize the penalized likelihood subject to the acyclicity constraint of a DAG. When interventional data are available, our method constructs a causal network, in which a directed edge represents a causal relation. We apply our method to various simulated and real data sets. The results show that our method is very competitive, compared to many existing methods, in DAG estimation from both interventional and high-dimensional observational data. Comment: To appear in Statistics and Computing.
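
The group norm penalty mentioned in this abstract can be illustrated with a minimal sketch. It assumes each candidate edge owns one block of multi-logit coefficients and that the penalty is the sum of the blocks' Frobenius norms scaled by a tuning parameter; the names `group_norm_penalty`, `active_edges`, `blocks`, and `lam` are hypothetical, and the likelihood term and the acyclicity constraint are not modeled here.

```python
import numpy as np

def group_norm_penalty(blocks, lam):
    """Return lam times the sum of Frobenius norms of per-edge coefficient blocks.

    blocks maps a candidate edge (parent, child) to its coefficient array
    (an assumed layout, not taken from the paper).
    """
    return lam * sum(np.linalg.norm(b) for b in blocks.values())

def active_edges(blocks, tol=1e-8):
    """Edges whose whole coefficient block is non-zero, i.e. kept in the graph."""
    return sorted(e for e, b in blocks.items() if np.linalg.norm(b) > tol)

# Toy coefficient blocks for three candidate edges (made-up numbers).
blocks = {
    (0, 1): np.array([[3.0, 0.0], [0.0, 4.0]]),  # Frobenius norm 5
    (1, 2): np.zeros((2, 3)),                    # block shrunk to zero: edge absent
    (0, 2): np.array([[0.6, 0.8, 0.0]]),         # Frobenius norm 1
}

penalty = group_norm_penalty(blocks, lam=0.5)
edges = active_edges(blocks)
```

Because the norm is taken over a whole block, a block is either entirely zero or not, which is how a penalty of this kind removes an edge as a unit rather than one coefficient at a time.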

    Hybrid Building/Floor Classification and Location Coordinates Regression Using A Single-Input and Multi-Output Deep Neural Network for Large-Scale Indoor Localization Based on Wi-Fi Fingerprinting

    In this paper, we propose hybrid building/floor classification and floor-level two-dimensional location coordinates regression using a single-input and multi-output (SIMO) deep neural network (DNN) for large-scale indoor localization based on Wi-Fi fingerprinting. The proposed scheme exploits the different nature of the estimation of building/floor and floor-level location coordinates and uses a different estimation framework for each task with a dedicated output and hidden layers enabled by SIMO DNN architecture. We carry out preliminary evaluation of the performance of the hybrid floor classification and floor-level two-dimensional location coordinates regression using new Wi-Fi crowdsourced fingerprinting datasets provided by Tampere University of Technology (TUT), Finland, covering a single building with five floors. Experimental results demonstrate that the proposed SIMO-DNN-based hybrid classification/regression scheme outperforms existing schemes in terms of both floor detection rate and mean positioning errors. Comment: 6 pages, 4 figures, 3rd International Workshop on GPU Computing and AI (GCA'18).
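
A single-input, multi-output layout of the kind this abstract describes can be sketched as one shared trunk feeding a classification head and a regression head. The sketch below is a simplification under stated assumptions: it covers floor classification only (no building output), the layer widths, the number of access points `N_APS`, and all function names are hypothetical, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(n_in, n_out):
    """Randomly initialised (weights, bias) pair for one fully connected layer."""
    return rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out)

def relu(x):
    return np.maximum(x, 0.0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Assumed sizes: 20 access points is made up; five floors matches the dataset described.
N_APS, N_FLOORS = 20, 5

params = {
    "shared": dense(N_APS, 32),       # trunk used by both tasks
    "floor_hidden": dense(32, 16),    # hidden layer dedicated to classification
    "floor_out": dense(16, N_FLOORS),
    "coord_hidden": dense(32, 16),    # hidden layer dedicated to regression
    "coord_out": dense(16, 2),
}

def forward(rss, p):
    """Map a batch of RSS fingerprints to (floor probabilities, 2-D coordinates)."""
    w, b = p["shared"]
    h = relu(rss @ w + b)
    w, b = p["floor_hidden"]
    hf = relu(h @ w + b)
    w, b = p["floor_out"]
    floor_probs = softmax(hf @ w + b)  # classification output
    w, b = p["coord_hidden"]
    hc = relu(h @ w + b)
    w, b = p["coord_out"]
    coords = hc @ w + b                # linear regression output
    return floor_probs, coords

rss = rng.normal(size=(4, N_APS))      # four synthetic fingerprints
floor_probs, coords = forward(rss, params)
```

In training, a setup like this would typically combine a cross-entropy loss on the floor head with a squared-error loss on the coordinate head; how the paper weights or schedules the two is not stated in the abstract.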