A Family of Maximum Margin Criterion for Adaptive Learning
In recent years, pattern analysis has played an important role in data mining
and recognition, and many variants have been proposed to handle complicated
scenarios. High dimensionality of data samples is familiar in the literature,
and both high dimensionality and large data volumes have become common in
real-world applications. In this work, an improved maximum margin criterion
(MMC) method is first introduced. With the new definition of MMC, several
variants, including random MMC, layered MMC, and 2D^2 MMC, are designed to make
adaptive learning applicable. In particular, an MMC network is developed to
learn deep features of images in the manner of simple deep networks.
Experimental results on a variety of data sets demonstrate that the proposed
MMC methods have the discriminant ability to be adopted in complicated
application scenarios.
Comment: 14 page
Bandwidth selection in kernel empirical risk minimization via the gradient
In this paper, we deal with the data-driven selection of multidimensional and
possibly anisotropic bandwidths in the general framework of kernel empirical
risk minimization. We propose a universal selection rule, which leads to
optimal adaptive results in a large variety of statistical models such as
nonparametric robust regression and statistical learning with errors in
variables. These results are stated in the context of smooth loss functions,
where the gradient of the risk appears as a good criterion to measure the
performance of our estimators. The selection rule consists of a comparison of
gradient empirical risks. It can be viewed as a nontrivial improvement of the
so-called Goldenshluger-Lepski method to nonlinear estimators. Furthermore, one
main advantage of our selection rule is that it does not depend on the Hessian
matrix of the risk, which is usually involved in standard adaptive procedures.
Comment: Published at http://dx.doi.org/10.1214/15-AOS1318 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
Meta-models for structural reliability and uncertainty quantification
A meta-model (or a surrogate model) is the modern name for what was
traditionally called a response surface. It is intended to mimic the behaviour
of a computational model M (e.g. a finite element model in mechanics) while
being inexpensive to evaluate, in contrast to the original model which may take
hours or even days of computer processing time. In this paper various types of
meta-models that have been used in the last decade in the context of structural
reliability are reviewed. More specifically, classical polynomial response
surfaces, polynomial chaos expansions and kriging are addressed. It is shown
how the need for error estimates and adaptivity in their construction has
brought this type of approach to a high level of efficiency. A new technique
that solves the problem of potential bias in the estimation of a
probability of failure through the use of meta-models is finally presented.
Comment: Keynote lecture Fifth Asian-Pacific Symposium on Structural
Reliability and its Applications (5th APSSRA) May 2012, Singapor
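
As a rough illustration of the surrogate idea described in this abstract, the sketch below fits a quadratic response surface to three runs of a stand-in model and then uses the cheap surrogate in a crude Monte Carlo estimate of a failure probability. Everything here is an assumption made for illustration: `expensive_model`, the design points, the standard normal input, and the failure condition g(x) <= 0 are hypothetical, and the bias-correcting technique mentioned in the abstract is not reproduced.

```python
import math
import random


def expensive_model(x):
    """Hypothetical stand-in for a costly simulation (limit state g(x))."""
    return 2.0 - math.exp(0.3 * x)


def fit_quadratic_surface(xs, ys):
    """Return the quadratic through three design points (Lagrange form)."""
    (x0, x1, x2), (y0, y1, y2) = xs, ys

    def surrogate(x):
        l0 = (x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2))
        l1 = (x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2))
        l2 = (x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1))
        return y0 * l0 + y1 * l1 + y2 * l2

    return surrogate


def failure_probability(g, n_samples=10000, seed=0):
    """Crude Monte Carlo estimate of P(g(X) <= 0), assuming X ~ N(0, 1)."""
    rng = random.Random(seed)
    failures = sum(g(rng.gauss(0.0, 1.0)) <= 0 for _ in range(n_samples))
    return failures / n_samples


# Only three calls to the expensive model are needed to build the surrogate.
design = (-2.0, 0.0, 2.0)
surrogate = fit_quadratic_surface(
    design, tuple(expensive_model(x) for x in design)
)

# Every Monte Carlo sample is then evaluated on the cheap surrogate.
pf_estimate = failure_probability(surrogate)
```

The surrogate reproduces the model at the design points and only approximates it elsewhere, which is why error estimates and adaptive refinement matter in practice.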
Hashing for Similarity Search: A Survey
Similarity search (nearest neighbor search) is the problem of finding the data
items whose distances to a query item are the smallest in a large database.
Various methods have been developed to address this problem, and recently much
effort has been devoted to approximate search. In this paper, we present a
survey of one of the main solutions, hashing, which has been widely studied
since the pioneering work on locality sensitive hashing. We divide the hashing
algorithms into two main categories: locality sensitive hashing, which designs
hash functions without exploring the data distribution, and learning to hash,
which learns hash functions according to the data distribution. We review them
from various aspects, including hash function design, distance measure, and
search scheme in the hash coding space.
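
To make the data-independent category concrete, here is a minimal sketch of one well-known locality sensitive hashing family, random hyperplane (sign random projection) hashing for angular similarity. It is not taken from the survey; the function names, bit length, and example vectors are assumptions chosen for illustration.

```python
import random


def make_hyperplane_hash(dim, n_bits, seed=0):
    """Build a hash function from n_bits random hyperplanes.

    The hyperplanes are drawn without looking at any data, which is what
    makes this family data-independent.
    """
    rng = random.Random(seed)
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

    def hash_vector(v):
        # Each bit records on which side of a hyperplane the vector lies.
        return tuple(
            1 if sum(p_i * v_i for p_i, v_i in zip(p, v)) >= 0 else 0
            for p in planes
        )

    return hash_vector


def hamming(a, b):
    """Number of positions where two binary codes differ."""
    return sum(x != y for x, y in zip(a, b))


h = make_hyperplane_hash(dim=3, n_bits=16, seed=42)
query = [1.0, 0.5, -0.2]
scaled = [2.0, 1.0, -0.4]      # same direction as query
opposite = [-1.0, -0.5, 0.2]   # opposite direction

near = hamming(h(query), h(scaled))
far = hamming(h(query), h(opposite))
```

Vectors separated by a small angle tend to agree on most bits, so Hamming distance between codes acts as a cheap proxy for angular distance during search.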
Perceptron learning with random coordinate descent
A perceptron is a linear threshold classifier that separates examples with a hyperplane. It is perhaps the simplest learning model that is used standalone. In this paper, we propose a family of random coordinate descent algorithms for perceptron learning on binary classification problems. Unlike most perceptron learning algorithms, which require smooth cost functions, our algorithms directly minimize the training error, and usually achieve the lowest training error compared with other algorithms. The algorithms are also computationally efficient. Such advantages make them favorable for both standalone use and ensemble learning, on problems that are not linearly separable. Experiments show that our algorithms work very well with AdaBoost, and achieve the lowest test errors for half of the datasets.
Optimizing 0/1 Loss for Perceptrons by Random Coordinate Descent
The 0/1 loss is an important cost function for perceptrons. Nevertheless, it cannot be easily minimized by most existing perceptron learning algorithms. In this paper, we propose a family of random coordinate descent algorithms to directly minimize the 0/1 loss for perceptrons, and prove their convergence. Our algorithms are computationally efficient, and usually achieve the lowest 0/1 loss compared with other algorithms. Such advantages make them favorable for nonseparable real-world problems. Experiments show that our algorithms are especially useful for ensemble learning, and could achieve the lowest test error for many complex data sets when coupled with AdaBoost.
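
The general idea of minimizing the 0/1 loss one coordinate at a time can be sketched as follows. This is a simplified illustration, not the authors' algorithm: it picks a random coordinate, finds the values of that weight at which some example's prediction flips, and tries one candidate value in each interval between those breakpoints. The function names, the toy data, and the convention of counting a zero score as an error are assumptions.

```python
import random


def zero_one_loss(w, X, y):
    """Count examples whose label in {-1, +1} disagrees with sign(w . x)."""
    errors = 0
    for x_i, y_i in zip(X, y):
        score = sum(w_k * x_k for w_k, x_k in zip(w, x_i))
        if y_i * score <= 0:  # a zero score is counted as an error
            errors += 1
    return errors


def rcd_perceptron(X, y, n_iters=200, seed=0):
    """Random coordinate descent on the 0/1 loss (simplified sketch)."""
    rng = random.Random(seed)
    dim = len(X[0])
    w = [0.0] * dim
    best = zero_one_loss(w, X, y)
    for _ in range(n_iters):
        j = rng.randrange(dim)
        # Values of w[j] at which an example's score crosses zero.
        points = []
        for x_i in X:
            if x_i[j] != 0:
                rest = sum(
                    w_k * x_k
                    for k, (w_k, x_k) in enumerate(zip(w, x_i))
                    if k != j
                )
                points.append(-rest / x_i[j])
        if not points:
            continue
        points.sort()
        # The loss is constant between breakpoints, so one candidate per
        # interval (plus one beyond each end) covers the whole line.
        candidates = [points[0] - 1.0, points[-1] + 1.0]
        candidates += [(a + b) / 2.0 for a, b in zip(points, points[1:])]
        for c in candidates:
            trial = w[:j] + [c] + w[j + 1:]
            loss = zero_one_loss(trial, X, y)
            if loss < best:  # accept only strict improvements
                best, w = loss, trial
    return w, best


# Toy data; the last feature is a constant 1.0 acting as the bias input.
X = [[2, 1, 1], [1, 2, 1], [3, 3, 1], [-1, -2, 1], [-2, -1, 1], [-3, -3, 1]]
y = [1, 1, 1, -1, -1, -1]
w, best = rcd_perceptron(X, y)
```

On this small separable toy set the search reaches zero training errors. Because updates are accepted only when the loss strictly decreases, the training error never increases from one iteration to the next.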