
    Learning to Resolve Natural Language Ambiguities: A Unified Approach

    We analyze a few of the commonly used statistics-based and machine learning algorithms for natural language disambiguation tasks and observe that they can be re-cast as learning linear separators in the feature space. Each method makes a priori assumptions, which it employs, given the data, when searching for its hypothesis. Nevertheless, as we show, each searches a space that is as rich as the space of all linear separators. We use this to build an argument for a data-driven approach which merely searches for a good linear separator in the feature space, without further assumptions on the domain or a specific problem. We present such an approach - a sparse network of linear separators utilizing the Winnow learning algorithm - and show how to use it in a variety of ambiguity resolution problems. The learning approach presented is attribute-efficient and, therefore, appropriate for domains with a very large number of attributes. In particular, we present an extensive experimental comparison of our approach with other methods on several well-studied lexical disambiguation tasks such as context-sensitive spelling correction, prepositional phrase attachment and part-of-speech tagging. In all cases we show that our approach either outperforms other methods tried for these tasks or performs comparably to the best.
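
    As a quick illustration of the Winnow algorithm mentioned in the abstract (not the paper's full sparse-network architecture), here is a minimal Python sketch of the multiplicative update rule; the toy data, threshold and promotion factor are illustrative assumptions.

```python
import numpy as np

def winnow_train(X, y, alpha=2.0, epochs=10):
    """Train a single Winnow linear separator on binary feature vectors.

    X: (m, n) array of 0/1 attributes; y: array of 0/1 labels.
    The multiplicative updates are what make the algorithm attribute-efficient:
    mistake bounds grow only logarithmically in the number of attributes n.
    """
    m, n = X.shape
    w = np.ones(n)        # all weights start at 1
    theta = n / 2.0       # fixed prediction threshold
    for _ in range(epochs):
        for x, label in zip(X, y):
            pred = 1 if w @ x > theta else 0
            if pred == label:
                continue
            if label == 1:            # missed a positive: promote active attributes
                w[x == 1] *= alpha
            else:                     # false positive: demote active attributes
                w[x == 1] /= alpha
    return w, theta

# Hypothetical usage on a toy task with sparse binary features.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 50))
y = (X[:, 0] | X[:, 1]).astype(int)   # target depends on only 2 of 50 attributes
w, theta = winnow_train(X, y)
print("training accuracy:", np.mean(((X @ w) > theta).astype(int) == y))
```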

    Counting aggregate classifiers

    There are many methods to design classifiers for the supervised classification problem. In this paper, we study the problem of aggregating classifiers. We construct an algorithm to count the number of distinct aggregate classifiers. This leads to a new way of finding a best aggregate classifier. When there are only two classes, we explore the link between aggregating classifiers and n-bit Boolean functions. Further, the sequence of the number of distinct aggregate classifiers appears to be new.
    Keywords: Boolean function; classification; classifiers; design; functions; methods; studies; supervised classification; weighted majority vote
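
    To illustrate the two ideas the abstract names, a weighted majority vote and the view of a two-class aggregate as a Boolean function of the base classifiers' outputs, here is a small Python sketch; it is not the counting algorithm from the paper, and the vote weights are made up for the example.

```python
from itertools import product

def weighted_majority(votes, weights):
    """Aggregate binary (+1/-1) classifier outputs by a weighted majority vote."""
    score = sum(w * v for w, v in zip(weights, votes))
    return 1 if score > 0 else -1

print(weighted_majority(votes=[1, -1, 1], weights=[0.5, 0.2, 0.3]))  # -> 1

# With two classes, any aggregate of n base classifiers is a Boolean function of
# their n outputs; enumerating all truth tables (feasible only for tiny n) gives
# a crude upper bound on the number of distinct aggregates.
n = 2
vote_patterns = list(product([-1, 1], repeat=n))          # the 2^n possible vote vectors
truth_tables = list(product([-1, 1], repeat=len(vote_patterns)))
print(len(truth_tables), "Boolean functions of", n, "classifier outputs")  # 16
```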

    On robustness properties of convex risk minimization methods for pattern recognition

    The paper brings together methods from two disciplines: machine learning theory and robust statistics. Robustness properties of machine learning methods based on convex risk minimization are investigated for the problem of pattern recognition. Assumptions are given for the existence of the influence function of the classifiers and for bounds on the influence function. Kernel logistic regression, support vector machines, least squares and the AdaBoost loss function are treated as special cases. A sensitivity analysis of the support vector machine is given.
    Keywords: AdaBoost loss function; influence function; kernel logistic regression; robustness; sensitivity curve; statistical learning; support vector machine; total variation
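
    For reference, the special cases of convex risk minimization listed in the abstract can all be written as margin-based losses. The sketch below uses their standard textbook forms, which may differ in scaling from the paper's; the regularized risk itself, minimized over a kernel function class in the paper, is not implemented here.

```python
import numpy as np

# Margin-based losses L(y, f(x)) = phi(y * f(x)), with y in {-1, +1}, corresponding
# to the special cases of convex risk minimization named in the abstract.
hinge    = lambda m: np.maximum(0.0, 1.0 - m)   # support vector machine
logistic = lambda m: np.log1p(np.exp(-m))       # (kernel) logistic regression
squared  = lambda m: (1.0 - m) ** 2             # least squares
exp_loss = lambda m: np.exp(-m)                 # AdaBoost loss function

margins = np.linspace(-2.0, 2.0, 5)
for name, phi in [("hinge", hinge), ("logistic", logistic),
                  ("squared", squared), ("AdaBoost", exp_loss)]:
    print(f"{name:9s}", np.round(phi(margins), 3))
```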

    Regression Depth and Support Vector Machine

    The regression depth method (RDM) proposed by Rousseeuw and Hubert [RH99] plays an important role in the area of robust regression for a continuous response variable. Christmann and Rousseeuw [CR01] showed that RDM is also useful for the case of binary regression. Vapnik’s convex risk minimization principle [Vap98] has a dominating role in statistical machine learning theory. Important special cases are the support vector machine (SVM), epsilon-support vector regression and kernel logistic regression. In this paper, connections between these methods from different disciplines are investigated for the case of pattern recognition. Some results concerning the robustness of the SVM and other kernel-based methods are given.
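
    As a concrete, deliberately simple instance of the convex risk minimization principle discussed here, the sketch below fits a linear SVM by subgradient descent on the regularized hinge risk. The toy data, step size and regularization constant are illustrative assumptions, and this is not the regression depth method itself.

```python
import numpy as np

def linear_svm_subgradient(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit a linear SVM by subgradient descent on the regularized hinge risk
    (1/m) * sum_i max(0, 1 - y_i * (w.x_i + b)) + lam * ||w||^2, with y in {-1, +1}."""
    m, n = X.shape
    w, b = np.zeros(n), 0.0
    for _ in range(epochs):
        viol = y * (X @ w + b) < 1                     # margin violators carry the subgradient
        gw = 2 * lam * w - (y[viol, None] * X[viol]).sum(axis=0) / m
        gb = -y[viol].sum() / m
        w, b = w - lr * gw, b - lr * gb
    return w, b

# Illustrative use on two class-shifted Gaussian blobs (labels in {-1, +1}).
rng = np.random.default_rng(1)
y = rng.choice([-1, 1], size=100)
X = rng.normal(size=(100, 2)) + 1.5 * y[:, None]
w, b = linear_svm_subgradient(X, y)
print("training accuracy:", np.mean(np.sign(X @ w + b) == y))
```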

    Kernel-convoluted Deep Neural Networks with Data Augmentation

    The Mixup method (Zhang et al. 2018), which uses linearly interpolated data, has emerged as an effective data augmentation tool to improve generalization performance and robustness to adversarial examples. The motivation is to curtail undesirable oscillations through an implicit model constraint to behave linearly in between observed data points, thereby promoting smoothness. In this work, we formally investigate this premise, propose a way to explicitly impose smoothness constraints, and extend it to incorporate implicit model constraints. First, we derive a new function class composed of kernel-convoluted models (KCM) where the smoothness constraint is directly imposed by locally averaging the original functions with a kernel function. Second, we propose to incorporate the Mixup method into KCM to expand the domains of smoothness. For both the KCM and the KCM adapted with Mixup, we provide a risk analysis under some conditions on the kernels. We show that the upper bound on the excess risk decays no more slowly than that of the original function class. The upper bound for the KCM with Mixup remains dominated by that of the KCM if the perturbation of the Mixup vanishes faster than O(n^{-1/2}), where n is the sample size. Using the CIFAR-10 and CIFAR-100 datasets, our experiments demonstrate that the KCM with Mixup outperforms the Mixup method in terms of generalization and robustness to adversarial examples.
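
    For context, here is a minimal sketch of the Mixup augmentation the paper builds on, in its standard formulation (Zhang et al. 2018); this is not the kernel-convoluted model itself, and the batch shapes and Beta parameter are illustrative.

```python
import numpy as np

def mixup_batch(X, Y, alpha=0.2, rng=None):
    """Mixup (Zhang et al. 2018): interpolate random pairs of inputs and one-hot
    labels with a Beta(alpha, alpha) mixing coefficient."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    perm = rng.permutation(len(X))
    return lam * X + (1 - lam) * X[perm], lam * Y + (1 - lam) * Y[perm]

# Hypothetical usage with one-hot labels for a 10-class problem.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 32))
Y = np.eye(10)[rng.integers(0, 10, size=8)]
X_mix, Y_mix = mixup_batch(X, Y, alpha=0.2, rng=rng)
print(X_mix.shape, Y_mix.shape)
```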

    Making Risk Minimization Tolerant to Label Noise

    In many applications, the training data from which one needs to learn a classifier is corrupted with label noise. Many standard algorithms such as the SVM perform poorly in the presence of label noise. In this paper we investigate the robustness of risk minimization to label noise. We prove a sufficient condition on a loss function for risk minimization under that loss to be tolerant to uniform label noise. We show that the 0-1 loss, sigmoid loss, ramp loss and probit loss satisfy this condition, though none of the standard convex loss functions do. We also prove that, by choosing a sufficiently large value of a parameter in the loss function, the sigmoid loss, ramp loss and probit loss can also be made tolerant to non-uniform label noise, provided the classes are separable under the noise-free data distribution. Through extensive empirical studies, we show that risk minimization under the 0-1 loss, the sigmoid loss and the ramp loss has much better robustness to label noise than the SVM algorithm.
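
    The contrast the abstract draws can be seen directly from the loss functions. The sketch below writes the 0-1, sigmoid and ramp losses next to the hinge loss as functions of the margin and checks the symmetry property (the loss values for the two labels summing to a constant) that underlies this kind of noise-tolerance condition; the parameterizations are standard choices and may differ from the paper's, and the probit loss is omitted.

```python
import numpy as np

# Losses from the abstract written as functions of the margin m = y * f(x).
# The 0-1, sigmoid and ramp losses are symmetric (phi(m) + phi(-m) is constant),
# which is the kind of condition that yields tolerance to uniform label noise;
# the unbounded convex hinge loss of the SVM is not.
zero_one = lambda m: (m <= 0).astype(float)
sigmoid  = lambda m, beta=2.0: 1.0 / (1.0 + np.exp(beta * m))  # larger beta -> closer to 0-1
ramp     = lambda m: 0.5 * np.clip(1.0 - m, 0.0, 2.0)          # bounded ("clipped") hinge
hinge    = lambda m: np.maximum(0.0, 1.0 - m)                  # standard SVM loss

margins = np.array([-2.0, -0.5, 0.5, 2.0])
for name, phi in [("0-1", zero_one), ("sigmoid", sigmoid), ("ramp", ramp), ("hinge", hinge)]:
    symmetric = np.ptp(phi(margins) + phi(-margins)) < 1e-9
    print(f"{name:8s} values: {np.round(phi(margins), 3)}  symmetric: {symmetric}")
```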

    Supervised Classification: Quite a Brief Overview

    The original problem of supervised classification considers the task of automatically assigning objects to their respective classes on the basis of numerical measurements derived from these objects. Classifiers are the tools that implement the actual functional mapping from these measurements---also called features or inputs---to the so-called class label---or output. The fields of pattern recognition and machine learning study ways of constructing such classifiers. The main idea behind supervised methods is that of learning from examples: given a number of example input-output relations, to what extent can the general mapping be learned that takes any new and unseen feature vector to its correct class? This chapter provides a basic introduction to the underlying ideas of how to come to a supervised classification problem. In addition, it provides an overview of some specific classification techniques, delves into the issues of object representation and classifier evaluation, and (very) briefly covers some variations on the basic supervised classification task that may also be of interest to the practitioner.
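
    To make the learning-from-examples setting concrete, here is a minimal end-to-end sketch: fit a trivially simple classifier on training examples and evaluate it on held-out data. The nearest-class-mean rule and the synthetic Gaussian data are illustrative choices only, not anything specific from the chapter.

```python
import numpy as np

# Learn from labelled examples, then evaluate on unseen data: the basic supervised
# classification loop.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=-1.0, size=(100, 2)), rng.normal(loc=+1.0, size=(100, 2))])
y = np.array([0] * 100 + [1] * 100)

idx = rng.permutation(len(X))                 # random train/test split
train, test = idx[:150], idx[150:]

means = np.array([X[train][y[train] == c].mean(axis=0) for c in (0, 1)])   # "training"
dists = np.linalg.norm(X[test][:, None, :] - means[None, :, :], axis=2)
y_pred = dists.argmin(axis=1)                 # assign each test point to the nearest class mean

print("test accuracy:", np.round(np.mean(y_pred == y[test]), 2))
```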