Beating the Perils of Non-Convexity: Guaranteed Training of Neural Networks using Tensor Methods
Training neural networks is a challenging non-convex optimization problem,
and backpropagation or gradient descent can get stuck in spurious local optima.
We propose a novel algorithm based on tensor decomposition for guaranteed
training of two-layer neural networks. We provide risk bounds for our proposed
method, with a polynomial sample complexity in the relevant parameters, such as
input dimension and number of neurons. While learning arbitrary target
functions is NP-hard, we provide transparent conditions on the function and the
input for learnability. Our training method is based on tensor decomposition,
which provably converges to the global optimum, under a set of mild
non-degeneracy conditions. It consists of simple, embarrassingly parallel linear
and multi-linear operations, and is competitive with standard stochastic
gradient descent (SGD), in terms of computational complexity. Thus, we propose
a computationally efficient method with guaranteed risk bounds for training
neural networks with one hidden layer.
Comment: The tensor decomposition analysis is expanded, and the analysis of
ridge regression is added for recovering the parameters of last layer of
neural networ
System Identification for Nonlinear Control Using Neural Networks
An approach to incorporating artificial neural networks in nonlinear, adaptive control systems is described. The controller contains three principal elements: a nonlinear inverse dynamic control law whose coefficients depend on a comprehensive model of the plant, a neural network that models system dynamics, and a state estimator whose outputs drive the control law and train the neural network. Attention is focused on the system identification task, which combines an extended Kalman filter with generalized spline function approximation. Continual learning is possible during normal operation, without taking the system off line for specialized training. Nonlinear inverse dynamic control requires smooth derivatives as well as function estimates, imposing stringent goals on the approximating technique.
Penalized Estimation of Directed Acyclic Graphs From Discrete Data
Bayesian networks, with structure given by a directed acyclic graph (DAG),
are a popular class of graphical models. However, learning Bayesian networks
from discrete or categorical data is particularly challenging, due to the large
parameter space and the difficulty in searching for a sparse structure. In this
article, we develop a maximum penalized likelihood method to tackle this
problem. Instead of the commonly used multinomial distribution, we model the
conditional distribution of a node given its parents by multi-logit regression,
in which an edge is parameterized by a set of coefficient vectors with dummy
variables encoding the levels of a node. To obtain a sparse DAG, a group norm
penalty is employed, and a blockwise coordinate descent algorithm is developed
to maximize the penalized likelihood subject to the acyclicity constraint of a
DAG. When interventional data are available, our method constructs a causal
network, in which a directed edge represents a causal relation. We apply our
method to various simulated and real data sets. The results show that our
method is very competitive, compared to many existing methods, in DAG
estimation from both interventional and high-dimensional observational data.
Comment: To appear in Statistics and Computin
Hybrid Building/Floor Classification and Location Coordinates Regression Using A Single-Input and Multi-Output Deep Neural Network for Large-Scale Indoor Localization Based on Wi-Fi Fingerprinting
In this paper, we propose hybrid building/floor classification and
floor-level two-dimensional location coordinates regression using a
single-input and multi-output (SIMO) deep neural network (DNN) for large-scale
indoor localization based on Wi-Fi fingerprinting. The proposed scheme exploits
the different nature of the estimation of building/floor and floor-level
location coordinates and uses a different estimation framework for each task
with a dedicated output and hidden layers enabled by SIMO DNN architecture. We
carry out preliminary evaluation of the performance of the hybrid floor
classification and floor-level two-dimensional location coordinates regression
using new Wi-Fi crowdsourced fingerprinting datasets provided by Tampere
University of Technology (TUT), Finland, covering a single building with five
floors. Experimental results demonstrate that the proposed SIMO-DNN-based
hybrid classification/regression scheme outperforms existing schemes in terms
of both floor detection rate and mean positioning errors.
Comment: 6 pages, 4 figures, 3rd International Workshop on GPU Computing and
AI (GCA'18