e-Distance Weighted Support Vector Regression
We propose a novel support vector regression approach called e-Distance
Weighted Support Vector Regression (e-DWSVR). e-DWSVR specifically addresses two
challenging issues in support vector regression: first, the handling of noisy
data; second, the situation in which the distribution of boundary data differs
from that of the overall data. The proposed e-DWSVR optimizes
the minimum margin and the mean of functional margin simultaneously to tackle
these two issues. In addition, we use both dual coordinate descent (CD) and
averaged stochastic gradient descent (ASGD) strategies to make e-DWSVR scalable
to large scale problems. We report promising results obtained by e-DWSVR in
comparison with existing methods on several benchmark datasets.
A Survey of Classification Methods
Classification may refer to categorization, the process in which ideas and objects are recognized, differentiated, and understood. There are many types of classification, and researchers face the problem of choosing a suitable method that gives good classification performance for their problems. In this paper, we present the basic classification techniques. We cover several major kinds of classification methods, including neural networks, decision trees, Bayesian networks, support vector machines, and the k-nearest neighbor classifier. The goal of this survey is to provide a comprehensive review of these classification techniques.
Stochastic Training of Neural Networks via Successive Convex Approximations
This paper proposes a new family of algorithms for training neural networks
(NNs). These are based on recent developments in the field of non-convex
optimization, going under the general name of successive convex approximation
(SCA) techniques. The basic idea is to iteratively replace the original
(non-convex, high-dimensional) learning problem with a sequence of (strongly
convex) approximations, which are both accurate and simple to optimize.
Unlike similar ideas (e.g., quasi-Newton algorithms), the
approximations can be constructed using only first-order information of the
neural network function, in a stochastic fashion, while exploiting the overall
structure of the learning problem for a faster convergence. We discuss several
use cases, based on different choices for the loss function (e.g., squared loss
and cross-entropy loss), and for the regularization of the NN's weights. We
experiment on several medium-sized benchmark problems, and on a large-scale
dataset involving simulated physical data. The results show how the algorithm
outperforms state-of-the-art techniques, providing faster convergence to a
better minimum. Additionally, we show how the algorithm can be easily
parallelized over multiple computational units without hindering its
performance. In particular, each computational unit can optimize a tailored
surrogate function defined on a randomly assigned subset of the input
variables, whose dimension can be selected depending entirely on the available
computational power.
Comment: Preprint submitted to IEEE Transactions on Neural Networks and
Learning Systems
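The core SCA step described in this abstract can be sketched in a few lines. The sketch below is only an illustration under simplifying assumptions, not the authors' algorithm: it is deterministic rather than stochastic, it uses the simplest possible strongly convex surrogate (a first-order linearization of the loss plus a proximal term weighted by `tau`), and the one-neuron model, the names `loss`, `grad`, `sca_train`, and the step-size schedule are all hypothetical. With this surrogate the update reduces to a damped gradient step; the paper's surrogates exploit more of the problem structure.

```python
import math


def loss(w, data):
    """Mean squared loss of a one-neuron model y ~ tanh(w[0] * x + w[1])."""
    return sum((math.tanh(w[0] * x + w[1]) - y) ** 2 for x, y in data) / len(data)


def grad(w, data):
    """Gradient of the loss above (first-order information only)."""
    g0 = g1 = 0.0
    for x, y in data:
        out = math.tanh(w[0] * x + w[1])
        common = 2.0 * (out - y) * (1.0 - out * out)
        g0 += common * x
        g1 += common
    n = len(data)
    return [g0 / n, g1 / n]


def sca_train(data, w_init, tau=10.0, iters=500):
    """Illustrative SCA loop with a linearized, proximal surrogate (assumption)."""
    w = list(w_init)
    for k in range(iters):
        g = grad(w, data)
        # Closed-form minimizer of the strongly convex surrogate
        #   loss(w_k) + g . (w - w_k) + (tau / 2) * ||w - w_k||^2
        w_hat = [wi - gi / tau for wi, gi in zip(w, g)]
        # Diminishing step size (hypothetical schedule), then a convex
        # combination of the current point and the surrogate minimizer.
        gamma = 1.0 / (1.0 + 0.01 * k)
        w = [wi + gamma * (hi - wi) for wi, hi in zip(w, w_hat)]
    return w
```

For example, fitting synthetic data generated as `y = tanh(2x - 0.5)` from the starting point `[0.5, 0.0]` should lower the loss relative to its initial value.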
Robust Block Coordinate Descent
In this paper we present a novel randomized block coordinate descent method
for the minimization of a convex composite objective function. The method uses
(approximate) partial second-order (curvature) information, so that the
algorithm performance is more robust when applied to highly nonseparable or
ill-conditioned problems. We call the method Robust Coordinate Descent (RCD). At
each iteration of RCD, a block of coordinates is sampled randomly, a quadratic
model is formed about that block and the model is minimized
approximately/inexactly to determine the search direction. An inexpensive line
search is then employed to ensure a monotonic decrease in the objective
function and acceptance of large step sizes. We prove global convergence of the
RCD algorithm, and we also present several results on the local convergence of
RCD for strongly convex functions. Finally, we present numerical results on
large-scale problems to demonstrate the practical performance of the method.
Comment: 23 pages, 6 figures
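The iteration described in this abstract (sample a block, build a quadratic model on it, minimize the model approximately, then line search) can be sketched as follows. This is a minimal illustration under stated assumptions, not the RCD method itself: the objective is a smooth strongly convex quadratic with no nonsmooth composite term, the curvature model keeps only the diagonal of the Hessian block, and the function names, the Armijo constant `1e-4`, and the halving backtracking rule are hypothetical choices.

```python
import random


def objective(A, b, x):
    """f(x) = 0.5 * x^T A x - b^T x for a symmetric positive definite A."""
    n = len(x)
    quad = sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))
    return 0.5 * quad - sum(b[i] * x[i] for i in range(n))


def block_cd(A, b, x0, block_size=2, iters=500, seed=0):
    """Randomized block coordinate descent with a backtracking line search."""
    rng = random.Random(seed)
    x = list(x0)
    n = len(x)
    history = [objective(A, b, x)]
    for _ in range(iters):
        # Sample a random block of coordinates.
        block = rng.sample(range(n), block_size)
        # Partial gradient restricted to the block.
        g = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in block]
        # Approximate curvature: diagonal of the Hessian block (assumption).
        # Minimizing this diagonal quadratic model gives the direction d.
        d = [-g_i / A[i][i] for g_i, i in zip(g, block)]
        slope = sum(g_i * d_i for g_i, d_i in zip(g, d))
        f_old, step = history[-1], 1.0
        # Backtracking: try the full step first, halve until sufficient decrease.
        while True:
            trial = list(x)
            for i, d_i in zip(block, d):
                trial[i] += step * d_i
            f_new = objective(A, b, trial)
            if f_new <= f_old + 1e-4 * step * slope or step < 1e-12:
                break
            step *= 0.5
        # Accept only non-increasing steps, so the objective is monotone.
        if f_new <= f_old:
            x = trial
            history.append(f_new)
        else:
            history.append(f_old)
    return x, history
```

Calling `block_cd(A, b, [0.0, 0.0, 0.0])` with a small positive definite matrix such as `A = [[4, 1, 0], [1, 3, 1], [0, 1, 2]]` and `b = [1, 2, 3]` returns the final iterate and the sequence of objective values, which is non-increasing by construction.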