A multi-class classification model with parametrized target outputs for randomized-based feedforward neural networks
Randomized-based Feedforward Neural Networks approach regression and classification (binary and
multi-class) problems by solving the same optimization problem. Specifically, the model parameters are determined through the ridge regression estimator of the patterns projected in the hidden
layer space (randomly generated in its neural network version) for models without direct links and
the patterns projected in the hidden layer space along with the original input data for models with
direct links. The targets for the multi-class classification problem are encoded with the 1-of-J
scheme (J being the number of classes), which implies that the model parameters are estimated to
project all the patterns of a given class to one and the remaining patterns to zero. This
approach has several drawbacks, which motivated us to propose an alternative optimization model
for the framework. In the proposed optimization model, model parameters are estimated for each
class so that their patterns are projected to a reference point (also optimized during the process),
whereas the remaining patterns (not belonging to that class) are projected as far away as possible from
the reference point. The resulting problem is cast as a generalized eigenvalue problem. Four
models are then presented: the neural network version of the algorithm and its corresponding kernel
version, for the neural network models with and without direct links. In addition, the optimization
model has also been implemented in randomization-based multi-layer or deep neural networks. The
empirical results obtained by the proposed models were compared to those reported by state-of-the-art models in terms of the correct classification rate and a separability index (which measures,
per class, how well the projections separate the patterns of that class from those of the other classes).
The proposed methods show very competitive performance in the separability index and prediction
accuracy compared to the neural network versions of the comparison methods (with and without
direct links). Remarkably, the model provides significantly superior performance in deep models with
direct links compared to its deep model counterparts.
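For orientation, the following is a minimal sketch of the baseline the abstract builds on: a randomized feedforward network with direct links and 1-of-J targets, fit in closed form by the ridge regression estimator. All names (rvfl_fit, lam, n_hidden) are illustrative and not from the paper; the proposed reference-point objective and its generalized eigenvalue formulation are not reproduced here.

    import numpy as np

    def rvfl_fit(X, Y, n_hidden=100, lam=1e-2, seed=0):
        """Closed-form fit of a randomized network with direct links.
        X: (n, d) input patterns; Y: (n, J) 1-of-J encoded targets."""
        rng = np.random.default_rng(seed)
        W = rng.standard_normal((X.shape[1], n_hidden))  # random, frozen input weights
        b = rng.standard_normal(n_hidden)
        H = np.tanh(X @ W + b)           # patterns projected in the hidden layer space
        D = np.hstack([H, X])            # direct links: append the original input data
        # ridge regression estimator: beta = (D'D + lam*I)^(-1) D'Y
        beta = np.linalg.solve(D.T @ D + lam * np.eye(D.shape[1]), D.T @ Y)
        return W, b, beta

    def rvfl_predict(X, W, b, beta):
        D = np.hstack([np.tanh(X @ W + b), X])
        return np.argmax(D @ beta, axis=1)   # class with the largest projection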
Support vector machine for functional data classification
In many applications, input data are sampled functions taking their values in
infinite dimensional spaces rather than standard vectors. This fact has complex
consequences on data analysis algorithms that motivate modifications of them.
In fact, most of the traditional data analysis tools for regression,
classification and clustering have been adapted to functional inputs under the
general name of Functional Data Analysis (FDA). In this paper, we investigate
the use of Support Vector Machines (SVMs) for functional data analysis and we
focus on the problem of curves discrimination. SVMs are large-margin classifiers
based on implicit nonlinear mappings of the considered data into
high-dimensional spaces thanks to kernels. We show how to define simple kernels
that take into account the functional nature of the data and lead to consistent
classification. Experiments conducted on real-world data emphasize the benefit
of taking into account some functional aspects of the problems. Comment: 13 pages
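A minimal sketch of the general idea (not the paper's exact kernels): treat each sampled curve as a function, approximate the L2 inner product by trapezoidal quadrature, and pass the resulting Gram matrix to a standard SVM as a precomputed kernel. The toy data and the helper l2_gram are illustrative assumptions.

    import numpy as np
    from sklearn.svm import SVC

    def l2_gram(A, B, grid):
        """Gram matrix of L2 inner products between curves sampled on `grid`,
        approximated with trapezoidal quadrature weights."""
        w = np.empty_like(grid)
        w[1:-1] = (grid[2:] - grid[:-2]) / 2.0   # interior trapezoid weights
        w[0] = (grid[1] - grid[0]) / 2.0
        w[-1] = (grid[-1] - grid[-2]) / 2.0
        return (A * w) @ B.T

    # toy curves: noisy sines vs. noisy cosines sampled at 100 points
    rng = np.random.default_rng(0)
    grid = np.linspace(0.0, 1.0, 100)
    X = np.vstack([np.sin(2 * np.pi * grid) + 0.1 * rng.standard_normal((20, 100)),
                   np.cos(2 * np.pi * grid) + 0.1 * rng.standard_normal((20, 100))])
    y = np.repeat([0, 1], 20)

    clf = SVC(kernel="precomputed").fit(l2_gram(X, X, grid), y)
    # new curves are classified via their inner products with the training curves:
    # clf.predict(l2_gram(X_new, X, grid))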
Beating the Perils of Non-Convexity: Guaranteed Training of Neural Networks using Tensor Methods
Training neural networks is a challenging non-convex optimization problem,
and backpropagation or gradient descent can get stuck in spurious local optima.
We propose a novel algorithm based on tensor decomposition for guaranteed
training of two-layer neural networks. We provide risk bounds for our proposed
method, with a polynomial sample complexity in the relevant parameters, such as
input dimension and number of neurons. While learning arbitrary target
functions is NP-hard, we provide transparent conditions on the function and the
input for learnability. Our training method is based on tensor decomposition,
which provably converges to the global optimum, under a set of mild
non-degeneracy conditions. It consists of simple, embarrassingly parallel linear
and multi-linear operations, and is competitive with standard stochastic
gradient descent (SGD), in terms of computational complexity. Thus, we propose
a computationally efficient method with guaranteed risk bounds for training
neural networks with one hidden layer. Comment: The tensor decomposition analysis is expanded, and the analysis of
ridge regression is added for recovering the parameters of the last layer of the
neural network.
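The computational core of such guaranteed tensor-decomposition methods is typically a tensor power iteration on a (whitened) moment tensor; a minimal sketch follows. It is illustrative only: the paper's full pipeline (estimating moment tensors from data, whitening, and the ridge-regression step for the last layer) is omitted.

    import numpy as np

    def tensor_power_method(T, k, n_iter=200, seed=0):
        """Extract k (approximately orthogonal) rank-1 components of a symmetric
        third-order tensor T of shape (d, d, d) by power iteration with deflation."""
        rng = np.random.default_rng(seed)
        d = T.shape[0]
        lams, vs = [], []
        for _ in range(k):
            v = rng.standard_normal(d)
            v /= np.linalg.norm(v)
            for _ in range(n_iter):
                v = np.einsum('ijk,j,k->i', T, v, v)   # power update T(I, v, v)
                v /= np.linalg.norm(v)
            lam = np.einsum('ijk,i,j,k->', T, v, v, v)  # component weight
            T = T - lam * np.einsum('i,j,k->ijk', v, v, v)  # deflate found component
            lams.append(lam)
            vs.append(v)
        return np.array(lams), np.array(vs)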
Stacking-based Deep Neural Network: Deep Analytic Network on Convolutional Spectral Histogram Features
Stacking-based deep neural network (S-DNN), in general, denotes an architecture
that resembles a deep neural network (DNN) in its very deep, feedforward
structure. The typical S-DNN aggregates a variable number of individually
learnable modules in series to assemble a DNN-like alternative for the targeted
object recognition tasks. This work likewise devises an S-DNN instantiation,
dubbed deep analytic network (DAN), on top of the spectral histogram (SH)
features. The DAN learning principle relies on ridge regression and some key
DNN constituents, specifically the rectified linear unit, fine-tuning, and
normalization. DAN is evaluated on three repositories of varying
domains: FERET (faces), MNIST (handwritten digits), and CIFAR10
(natural objects). The empirical results show that DAN improves on the SH
baseline performance once the network is sufficiently deep. Comment: 5 pages
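A minimal sketch of the stacking-plus-ridge-regression principle the abstract describes, under simplifying assumptions: each module is learned analytically against the 1-of-J targets and its rectified output feeds the next module. The spectral histogram features, fine-tuning, and normalization of the actual DAN are omitted, and the helper names are hypothetical.

    import numpy as np

    def fit_stack(X, Y, n_modules=3, lam=1e-1):
        """Learn a stack of analytic modules: each is a closed-form ridge
        regression onto the 1-of-J targets Y, followed by a ReLU whose output
        feeds the next module."""
        layers, H = [], X
        for _ in range(n_modules):
            W = np.linalg.solve(H.T @ H + lam * np.eye(H.shape[1]), H.T @ Y)
            layers.append(W)
            H = np.maximum(H @ W, 0.0)   # rectified linear unit
        return layers

    def predict_stack(X, layers):
        H = X
        for W in layers[:-1]:
            H = np.maximum(H @ W, 0.0)
        return np.argmax(H @ layers[-1], axis=1)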