A Family of Maximum Margin Criterion for Adaptive Learning

Author: CM Bishop
G-B Huang
H Li
H Xiong
H Yu
J Liu
J Xiao
J Yang
JM Geusebroek
K Labusch
M Cheng
M Li
MA Turk
PN Belhumeur
S Yan
T Ojala
T-H Chan
Y Lecun
Publication date: 01/01/2018

In recent years, pattern analysis has played an important role in data mining and recognition, and many variants have been proposed to handle complicated scenarios. High-dimensional data samples are familiar in the literature, and both high dimensionality and large data volumes have become commonplace in real-world applications. In this work, an improved maximum margin criterion (MMC) method is first introduced. With the new definition of MMC, several variants, including random MMC, layered MMC, and 2D^2 MMC, are designed to make adaptive learning applicable. In particular, an MMC network is developed to learn deep features of images in the manner of simple deep networks. Experimental results on a variety of data sets demonstrate that the discriminant ability of the proposed MMC methods makes them competent for adoption in complicated application scenarios. Comment: 14 pages

arXiv.org e-Print Archive

Crossref

Research Online

Bandwidth selection in kernel empirical risk minimization via the gradient

Author: Chichignoud Michaël
Loustau Sébastien
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 27/01/2014

In this paper, we deal with the data-driven selection of multidimensional and possibly anisotropic bandwidths in the general framework of kernel empirical risk minimization. We propose a universal selection rule, which leads to optimal adaptive results in a large variety of statistical models such as nonparametric robust regression and statistical learning with errors in variables. These results are stated in the context of smooth loss functions, where the gradient of the risk appears as a good criterion to measure the performance of our estimators. The selection rule consists of a comparison of gradient empirical risks. It can be viewed as a nontrivial improvement of the so-called Goldenshluger-Lepski method to nonlinear estimators. Furthermore, one main advantage of our selection rule is the nondependency on the Hessian matrix of the risk, usually involved in standard adaptive procedures. Comment: Published at http://dx.doi.org/10.1214/15-AOS1318 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

Meta-models for structural reliability and uncertainty quantification

Author: Sudret Bruno
Publication date: 01/01/2011

A meta-model (or a surrogate model) is the modern name for what was traditionally called a response surface. It is intended to mimic the behaviour of a computational model M (e.g. a finite element model in mechanics) while being inexpensive to evaluate, in contrast to the original model, which may take hours or even days of computer processing time. In this paper, various types of meta-models that have been used in the last decade in the context of structural reliability are reviewed. More specifically, classical polynomial response surfaces, polynomial chaos expansions and kriging are addressed. It is shown how the need for error estimates and adaptivity in their construction has brought this type of approach to a high level of efficiency. Finally, a new technique is presented that addresses the potential bias in estimating a probability of failure through the use of meta-models. Comment: Keynote lecture, Fifth Asian-Pacific Symposium on Structural Reliability and its Applications (5th APSSRA), May 2012, Singapore
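
To make the surrogate idea concrete, here is a minimal sketch, not taken from the paper. It assumes a one-dimensional input, a hypothetical stand-in `expensive_model`, a quadratic response surface fitted through three design points by Lagrange interpolation, and a plain Monte Carlo estimate of the failure probability P(g(X) <= 0) with a standard normal X. All function names and parameters are illustrative assumptions.

```python
import random


def expensive_model(x):
    # Hypothetical stand-in for a costly simulation (e.g. a finite element run).
    # Failure is defined here as a non-positive output.
    return 3.0 - x * x


def fit_quadratic_surrogate(model, points):
    # Build a quadratic response surface by Lagrange interpolation
    # through three distinct design points; the model is called only 3 times.
    x0, x1, x2 = points
    y0, y1, y2 = model(x0), model(x1), model(x2)

    def surrogate(x):
        l0 = (x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2))
        l1 = (x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2))
        l2 = (x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1))
        return y0 * l0 + y1 * l1 + y2 * l2

    return surrogate


def failure_probability(g, num_samples=20000, seed=0):
    # Crude Monte Carlo estimate of P(g(X) <= 0) for X ~ N(0, 1).
    rng = random.Random(seed)
    failures = sum(g(rng.gauss(0.0, 1.0)) <= 0.0 for _ in range(num_samples))
    return failures / num_samples


surrogate = fit_quadratic_surrogate(expensive_model, (-2.0, 0.0, 2.0))
p_fail = failure_probability(surrogate)
```

Because the stand-in model is itself quadratic, the surrogate reproduces it exactly in this toy case. With a real model the surrogate carries an approximation error, which is the source of the potential bias in the failure probability that the abstract mentions.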

arXiv.org e-Print Archive

CiteSeerX

HAL-Ecole des Ponts ParisTech

Hashing for Similarity Search: A Survey

Author: Ji Jianqiu
Shen Heng Tao
Song Jingkuan
Wang Jingdong
Publication date: 13/08/2014

Similarity search (nearest neighbor search) is the problem of finding, in a large database, the data items whose distances to a query item are the smallest. Various methods have been developed to address this problem, and recently much effort has been devoted to approximate search. In this paper, we present a survey of one of the main solutions, hashing, which has been widely studied since the pioneering work on locality sensitive hashing. We divide the hashing algorithms into two main categories: locality sensitive hashing, which designs hash functions without exploring the data distribution, and learning to hash, which learns hash functions according to the data distribution. We review them from various aspects, including hash function design, distance measure, and search scheme in the hash coding space.
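
As an illustration of the data-independent category, the following is a minimal sketch of random-hyperplane hashing, one well-known locality sensitive scheme for angular similarity. It is not drawn from the survey itself; the function names, the bit count, and the seed are assumptions chosen for the example.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    # Draw random Gaussian normal vectors; no data is inspected,
    # which is what makes the scheme data-independent.
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hash_code(vector, hyperplanes):
    # One bit per hyperplane: 1 if the vector lies on its non-negative side.
    return tuple(
        1 if sum(h_i * v_i for h_i, v_i in zip(h, vector)) >= 0 else 0
        for h in hyperplanes
    )


def hamming(a, b):
    # Number of differing bits; vectors at a small angle tend to differ in few bits.
    return sum(x != y for x, y in zip(a, b))


planes = make_hyperplanes(num_bits=16, dim=3)
query = [1.0, 2.0, 3.0]
code_query = hash_code(query, planes)
code_scaled = hash_code([2.0, 4.0, 6.0], planes)     # same direction as query
code_opposite = hash_code([-1.0, -2.0, -3.0], planes)  # opposite direction
```

Vectors pointing in the same direction receive identical codes, and the Hamming distance between codes grows with the angle between the vectors, so candidate neighbors can be found by comparing short binary codes instead of full vectors.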

arXiv.org e-Print Archive

CiteSeerX

Perceptron learning with random coordinate descent

Author: Li Ling
Publication venue: 'California Institute of Technology Library'
Publication date: 01/01/2005

A perceptron is a linear threshold classifier that separates examples with a hyperplane. It is perhaps the simplest learning model that is used standalone. In this paper, we propose a family of random coordinate descent algorithms for perceptron learning on binary classification problems. Unlike most perceptron learning algorithms, which require smooth cost functions, our algorithms directly minimize the training error and usually achieve the lowest training error compared with other algorithms. The algorithms are also computationally efficient. Such advantages make them favorable for both standalone use and ensemble learning on problems that are not linearly separable. Experiments show that our algorithms work very well with AdaBoost and achieve the lowest test errors for half of the datasets.

CiteSeerX

Caltech Authors

Optimizing 0/1 Loss for Perceptrons by Random Coordinate Descent

Author: Li Ling
Lin Hsuan-Tien
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2007

The 0/1 loss is an important cost function for perceptrons. Nevertheless, it cannot be easily minimized by most existing perceptron learning algorithms. In this paper, we propose a family of random coordinate descent algorithms to directly minimize the 0/1 loss for perceptrons, and prove their convergence. Our algorithms are computationally efficient, and usually achieve the lowest 0/1 loss compared with other algorithms. Such advantages make them favorable for nonseparable real-world problems. Experiments show that our algorithms are especially useful for ensemble learning, and could achieve the lowest test error for many complex data sets when coupled with AdaBoost.
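
The general idea of minimizing the 0/1 loss directly along one coordinate at a time can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' algorithm: it updates a single randomly chosen weight per step, tries one value in each interval between the points where some example's score changes sign, and keeps a change only if the training error strictly decreases. The names `zero_one_loss` and `rcd_perceptron` and the toy data are hypothetical.

```python
import random


def zero_one_loss(w, X, y):
    # Count misclassified examples; labels are -1 or +1.
    errors = 0
    for x_i, y_i in zip(X, y):
        score = sum(w_j * x_ij for w_j, x_ij in zip(w, x_i))
        pred = 1 if score >= 0 else -1
        errors += pred != y_i
    return errors


def rcd_perceptron(X, y, num_iters=200, seed=0):
    rng = random.Random(seed)
    dim = len(X[0])
    w = [0.0] * dim
    best = zero_one_loss(w, X, y)
    for _ in range(num_iters):
        j = rng.randrange(dim)  # random coordinate to update
        # Values of w[j] at which some example's score crosses zero.
        cuts = set()
        for x_i in X:
            if x_i[j] != 0:
                score = sum(w_k * x_ik for w_k, x_ik in zip(w, x_i))
                cuts.add(w[j] - score / x_i[j])
        cuts = sorted(cuts)
        if not cuts:
            continue
        # The loss is piecewise constant in w[j]: test one point per interval.
        candidates = [cuts[0] - 1.0, cuts[-1] + 1.0]
        candidates += [(a + b) / 2 for a, b in zip(cuts, cuts[1:])]
        for t in candidates:
            trial = w[:j] + [t] + w[j + 1:]
            loss = zero_one_loss(trial, X, y)
            if loss < best:  # accept only strict improvements
                w, best = trial, loss
    return w, best


# Toy data: logical AND with a constant bias feature in the first column.
X = [[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]]
y = [-1, -1, -1, 1]
initial_errors = zero_one_loss([0.0, 0.0, 0.0], X, y)
w, errors = rcd_perceptron(X, y)
```

Since a step is accepted only when it lowers the error count, the training error never increases. No smooth surrogate loss is required, which is the property the abstract emphasizes, although this sketch can still stall in a local minimum.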

Crossref

Caltech Authors