Beating the Perils of Non-Convexity: Guaranteed Training of Neural Networks using Tensor Methods

Author: Anandkumar Anima; Janzamin Majid; Sedghi Hanie
Publication date: 01/01/2015

Training neural networks is a challenging non-convex optimization problem, and backpropagation or gradient descent can get stuck in spurious local optima. We propose a novel algorithm based on tensor decomposition for guaranteed training of two-layer neural networks. We provide risk bounds for our proposed method, with a polynomial sample complexity in the relevant parameters, such as input dimension and number of neurons. While learning arbitrary target functions is NP-hard, we provide transparent conditions on the function and the input for learnability. Our training method is based on tensor decomposition, which provably converges to the global optimum, under a set of mild non-degeneracy conditions. It consists of simple, embarrassingly parallel linear and multi-linear operations, and is competitive with standard stochastic gradient descent (SGD) in terms of computational complexity. Thus, we propose a computationally efficient method with guaranteed risk bounds for training neural networks with one hidden layer.

Comment: The tensor decomposition analysis is expanded, and the analysis of ridge regression is added for recovering the parameters of the last layer of the neural network.

arXiv.org e-Print Archive

CiteSeerX

eScholarship - University of California

Multiplexed gradient descent: Fast online training of modern datasets on hardware neural networks without backpropagation

Author: Buckley Sonia M.; Dienstfrey Andrew; Ganesh Natesh; McCaughan Adam N.; Nam Sae Woo; Oripov Bakhrom G.
Publication date: 05/03/2023

We present multiplexed gradient descent (MGD), a gradient descent framework designed to easily train analog or digital neural networks in hardware. MGD utilizes zero-order optimization techniques for online training of hardware neural networks. We demonstrate its ability to train neural networks on modern machine learning datasets, including CIFAR-10 and Fashion-MNIST, and compare its performance to backpropagation. Assuming realistic timescales and hardware parameters, our results indicate that these optimization techniques can train a network on emerging hardware platforms orders of magnitude faster than the wall-clock time of training via backpropagation on a standard GPU, even in the presence of imperfect weight updates or device-to-device variations in the hardware. We additionally describe how it can be applied to existing hardware as part of chip-in-the-loop training, or integrated directly at the hardware level. Crucially, the MGD framework is highly flexible, and its gradient descent process can be optimized to compensate for specific hardware limitations such as slow parameter-update speeds or limited input bandwidth.

arXiv.org e-Print Archive

Directory of Open Access Journals
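
The zero-order optimization mentioned in the abstract above can be illustrated with a generic perturbative update. The sketch below is not the authors' MGD implementation: it is a simultaneous-perturbation scheme on a hypothetical toy regression task, and the names `cost`, `perturbative_step`, `amplitude`, and `learning_rate` are assumptions made for illustration. It only shows how a gradient can be estimated from cost measurements alone, without backpropagation.

```python
import random


def cost(weights, data):
    """Mean squared error of a toy linear model y = w0 * x + w1 (hypothetical task)."""
    return sum((weights[0] * x + weights[1] - y) ** 2 for x, y in data) / len(data)


def perturbative_step(weights, data, rng, amplitude=1e-3, learning_rate=0.05):
    """One zero-order update using only two cost evaluations.

    All weights are perturbed at once by +/- amplitude. The measured change in
    cost, correlated with each weight's perturbation sign, gives a noisy
    estimate of that weight's partial derivative.
    """
    signs = [rng.choice((-1.0, 1.0)) for _ in weights]
    perturbed = [w + amplitude * s for w, s in zip(weights, signs)]
    delta_cost = cost(perturbed, data) - cost(weights, data)
    # delta_cost * s / amplitude approximates dC/dw for each weight on average.
    return [w - learning_rate * delta_cost * s / amplitude
            for w, s in zip(weights, signs)]


rng = random.Random(0)  # fixed seed so the run is repeatable
# Toy data generated from y = 2x - 1 on x in [-1, 1].
data = [(x / 10.0, 2.0 * (x / 10.0) - 1.0) for x in range(-10, 11)]

weights = [0.0, 0.0]
initial_cost = cost(weights, data)
for _ in range(2000):
    weights = perturbative_step(weights, data, rng)
final_cost = cost(weights, data)
```

After the loop, `weights` should sit close to the generating values `[2.0, -1.0]`. On real hardware the two cost evaluations would be physical measurements, which is why this family of methods tolerates imperfect or unknown device behavior.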

Comparison of Neural Networks and Least Mean Squared Algorithms for Active Noise Canceling

Author: Park Samuel Kyung Won
Publication venue: Clemson University Libraries
Publication date: 01/08/2018

Active Noise Canceling (ANC) is the idea of using superposition to achieve cancellation of unwanted noise, and it is used in many applications, such as reducing noise in a commercial airplane cabin. One of the main traditional techniques for noise cancellation is the adaptive least mean squares (LMS) algorithm, which produces the anti-noise signal, that is, the 180-degree out-of-phase signal that cancels the noise via superposition. This work compares several neural network approaches against the traditional LMS algorithms. The noise signals that are used for the training of the network are from the Signal Processing Information Base (SPIB) database. The neural network architectures utilized in this paper include the Multilayer Feedforward Neural Network, the Recurrent Neural Network, the Long Short Term Neural Network, and the Convolutional Neural Network. These neural networks are trained to predict the anti-noise signal based on an incoming noise signal. The results of the simulation demonstrate successful ANC using neural networks, and they show that neural networks can yield better noise attenuation than LMS algorithms. Results show that the Convolutional Neural Network architecture outperforms the other architectures implemented and tested in this work.

Clemson University: TigerPrints
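
For readers unfamiliar with the LMS baseline named in the abstract above, here is a minimal sketch of a plain LMS adaptive filter. It is an illustration under stated assumptions, not the thesis's code: the signals are synthetic sinusoids rather than SPIB recordings, no secondary acoustic path is modeled, and `lms_cancel`, `num_taps`, and `step_size` are hypothetical names.

```python
import math


def lms_cancel(reference, primary, num_taps=8, step_size=0.01):
    """Adapt an FIR filter with LMS so that its output tracks `primary`.

    reference: samples from a sensor that picks up the noise source
    primary:   the noise as heard at the point where it should be cancelled
    Returns (residual, weights); residual is what is left after the
    anti-noise (the negated filter output) is superposed on the primary noise.
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent reference samples, newest first
    residual = []
    for x, d in zip(reference, primary):
        history = [x] + history[:-1]
        y = sum(w * h for w, h in zip(weights, history))  # estimate of the noise
        e = d - y                                         # residual after cancellation
        # LMS update: step along the instantaneous gradient of the squared error.
        weights = [w + 2.0 * step_size * e * h for w, h in zip(weights, history)]
        residual.append(e)
    return residual, weights


# Synthetic tone: the primary noise is a scaled, phase-shifted copy of the reference.
n = 2000
reference = [math.sin(2 * math.pi * 0.05 * k) for k in range(n)]
primary = [0.8 * math.sin(2 * math.pi * 0.05 * k - 0.6) for k in range(n)]

residual, weights = lms_cancel(reference, primary)
power_before = sum(d * d for d in primary[-200:]) / 200
power_after = sum(e * e for e in residual[-200:]) / 200
```

Comparing `power_after` with `power_before` over the last samples shows how much of the tone has been attenuated once the filter has converged. A neural network approach of the kind compared in the thesis would replace the linear filter with a learned nonlinear predictor of the anti-noise signal.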

Acoustic signal processing with robust machine learning algorithm for improved monitoring of particulate solid materials in a gas flowline

Author: Andrew Cowell; Bello; Don McGlinchey; Droubi; El-Alej; El-Alej; Guido; Guo; Haugsdal; Hu; Isaacson; Kos; Kuda Tijjani Aminu; Le; Ludeña-Choez; Mackinnon; Mason; McCulloch; McKay; Mirjalili; Mirjalili; Mitrović; Mittal; Odigie; Ooi; Riedmiller; Shannon; Shuiping; Sun; Sun; Thiruvenkatanathan; Toh; Waibel; Wang; Wang; Xie; Yan
Publication venue: Elsevier BV
Publication date: 01/03/2019

Crossref

ResearchOnline@GCU

Active disturbance cancellation in nonlinear dynamical systems using neural networks

Author: Canfield John C
Publication venue: University of New Hampshire Scholars' Repository
Publication date: 01/01/2003

A proposal for the use of a time delay CMAC neural network for disturbance cancellation in nonlinear dynamical systems is presented. Appropriate modifications to the CMAC training algorithm are derived which allow convergent adaptation for a variety of secondary signal paths. Analytical bounds on the maximum learning gain are presented which guarantee convergence of the algorithm and provide insight into the necessary reduction in learning gain as a function of the system parameters. Effectiveness of the algorithm is evaluated through mathematical analysis, simulation studies, and experimental application of the technique on an acoustic duct laboratory model.

UNH Scholars' Repository

Relay Backpropagation for Effective Learning of Deep Convolutional Neural Networks

Author: J Xiao; K He; Kaiming He; O Russakovsky; R Kamimura; TM Cover; YA LeCun
Publication date: 01/01/2016

Learning deeper convolutional neural networks has become a trend in recent years. However, much empirical evidence suggests that performance improvement cannot be gained by simply stacking more layers. In this paper, we consider the issue from an information-theoretic perspective, and propose a novel method, Relay Backpropagation, that encourages the propagation of effective information through the network in the training stage. By virtue of the method, we achieved first place in the ILSVRC 2015 Scene Classification Challenge. Extensive experiments on two challenging large-scale datasets demonstrate that the effectiveness of our method is not restricted to a specific dataset or network architecture. Our models will be available to the research community later.

Comment: Technical report for our submissions to the ILSVRC 2015 Scene Classification Challenge, where we won first place.

arXiv.org e-Print Archive

Crossref

To go deep or wide in learning?

Author: Dukkipati Ambedkar; Pandey Gaurav
Publication date: 23/02/2014

To achieve acceptable performance for AI tasks, one can either use sophisticated feature extraction methods as the first layer in a two-layered supervised learning model, or learn the features directly using a deep (multi-layered) model. While the first approach is very problem-specific, the second approach has computational overheads in learning multiple layers and fine-tuning of the model. In this paper, we propose an approach called wide learning, based on arc-cosine kernels, that learns a single layer of infinite width. We propose exact and inexact learning strategies for wide learning and show that wide learning with a single layer outperforms single-layer as well as deep architectures of finite width for some benchmark datasets.

Comment: 9 pages, 1 figure. Accepted for publication in the Seventeenth International Conference on Artificial Intelligence and Statistics.

arXiv.org e-Print Archive

CiteSeerX
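
The arc-cosine kernel that the abstract above builds on has a closed form. The sketch below computes the standard degree-1 arc-cosine kernel, which corresponds to an infinitely wide layer of rectified linear units with Gaussian random weights. It does not reproduce the paper's exact or inexact learning strategies, and the function name `arc_cosine_kernel` is an assumption made for illustration.

```python
import math


def arc_cosine_kernel(x, y):
    """Degree-1 arc-cosine kernel between two vectors given as sequences of floats.

    k(x, y) = (1 / pi) * |x| * |y| * (sin(theta) + (pi - theta) * cos(theta)),
    where theta is the angle between x and y.
    """
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0  # the norm factor makes the degree-1 kernel vanish here
    cos_theta = sum(a * b for a, b in zip(x, y)) / (norm_x * norm_y)
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding error
    theta = math.acos(cos_theta)
    return norm_x * norm_y * (math.sin(theta) + (math.pi - theta) * cos_theta) / math.pi


same = arc_cosine_kernel([3.0, 4.0], [3.0, 4.0])       # equals the squared norm of x
orthogonal = arc_cosine_kernel([1.0, 0.0], [0.0, 1.0])  # equals 1 / pi
opposite = arc_cosine_kernel([1.0, 0.0], [-1.0, 0.0])   # vanishes for opposite vectors
```

A kernel matrix built from this function can be passed to any kernel method, such as a support vector machine or kernel ridge regression, which is one way to train against an infinitely wide layer without ever materializing its units.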