Search CORE


Label Propagation for Learning with Label Proportions

Authors: Poyiadzi Rafael, Santos-Rodriguez Raul, Twomey Niall
Publication date: 24/10/2018

Learning with Label Proportions (LLP) is the problem of recovering the underlying true labels of a dataset when the data is presented in the form of bags. This paradigm is particularly suitable in contexts where providing individual labels is expensive and label aggregates are more easily obtained. In the healthcare domain, it is a burden for a patient to keep a detailed diary of their daily routines, but they will often be amenable to providing higher-level summaries of daily behavior. We present a novel and efficient graph-based algorithm that encourages local smoothness and exploits the global structure of the data, while preserving the 'mass' of each bag.

Comment: Accepted to MLSP 201
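The idea in this abstract (smooth scores over a graph, then keep each bag's label 'mass') can be illustrated with a minimal sketch. This is not the authors' algorithm: the Gaussian affinity graph, the damping factor `alpha`, the bandwidth `sigma`, the top-k rounding step, and the function name `propagate_with_bag_proportions` are all assumptions chosen for illustration.

```python
import numpy as np

def propagate_with_bag_proportions(X, bags, proportions, sigma=1.0, alpha=0.8, n_iter=50):
    """Illustrative sketch only (assumed design, not the paper's method).

    X: (n, d) features; bags: (n,) bag id per instance;
    proportions: dict mapping bag id -> fraction of positives in that bag.
    Returns (soft scores, hard 0/1 labels).
    """
    X = np.asarray(X, dtype=float)
    bags = np.asarray(bags)

    # Gaussian affinity graph over all instances (assumes no isolated points,
    # so that every row sum is positive).
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    P = W / W.sum(axis=1, keepdims=True)  # row-normalised transition matrix

    # The only supervision is the bag proportion, so every instance starts there.
    f0 = np.array([proportions[b] for b in bags], dtype=float)
    f = f0.copy()
    for _ in range(n_iter):
        # Local smoothness: pull each score towards its neighbours' scores.
        f = alpha * (P @ f) + (1.0 - alpha) * f0

    # Preserve the mass of each bag: label exactly round(p * bag size)
    # instances as positive, taking the highest-scoring ones.
    labels = np.zeros(len(f), dtype=int)
    for b, p in proportions.items():
        idx = np.flatnonzero(bags == b)
        k = int(round(p * len(idx)))
        top = idx[np.argsort(-f[idx])[:k]]
        labels[top] = 1
    return f, labels
```

In this sketch the bag constraint is enforced only in the final rounding step; enforcing it inside the propagation loop would be a different (and arguably closer) design.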

arXiv.org e-Print Archive

Crossref

Explore Bristol Research

Classification with Asymmetric Label Noise: Consistency and Maximal Denoising

Authors: Blanchard Gilles, Flaska Marek, Handy Gregory, Pozzi Sara, Scott Clayton
Publication date: 01/01/2016

In many real-world classification problems, the labels of training examples are randomly corrupted. Most previous theoretical work on classification with label noise assumes that the two classes are separable, that the label noise is independent of the true class label, or that the noise proportions for each class are known. In this work, we give conditions that are necessary and sufficient for the true class-conditional distributions to be identifiable. These conditions are weaker than those analyzed previously, and allow for the classes to be nonseparable and the noise levels to be asymmetric and unknown. The conditions essentially state that a majority of the observed labels are correct and that the true class-conditional distributions are "mutually irreducible," a concept we introduce that limits the similarity of the two distributions. For any label noise problem, there is a unique pair of true class-conditional distributions satisfying the proposed conditions, and we argue that this pair corresponds in a certain sense to maximal denoising of the observed distributions. Our results are facilitated by a connection to "mixture proportion estimation," which is the problem of estimating the maximal proportion of one distribution that is present in another. We establish a novel rate of convergence result for mixture proportion estimation, and apply this to obtain consistency of a discrimination rule based on surrogate loss minimization. Experimental results on benchmark data and a nuclear particle classification problem demonstrate the efficacy of our approach.

arXiv.org e-Print Archive

HAL Descartes

Hal-Diderot

Combining similarity in time and space for training set formation under concept drift

Author: Zliobaite Indre
Publication venue: 'IOS Press'
Publication date: 01/01/2011

Concept drift is a challenge in supervised learning for sequential data. It describes the phenomenon in which the data distribution changes over time. In such cases, the accuracy of a classifier benefits from selective sampling of the training data. We develop a method for training set selection that is particularly relevant when the expected drift is gradual. Training set selection at each time step is based on the distance to the target instance. The distance function combines similarity in space and in time. The method determines an optimal training set size online at every time step using cross-validation. It is a wrapper approach, so different base classifiers can be plugged in. The proposed method shows the best accuracy in its peer group on real and artificial drifting data. The complexity of the method is reasonable for field applications.
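The selection step described in this abstract can be sketched as follows. This is a hedged illustration, not the published method: the additive form of the distance, the `time_weight` parameter, the leave-one-out 1-nearest-neighbour error used as the cross-validation criterion, and all function names are assumptions.

```python
import numpy as np

def combined_distance(X_hist, t_hist, x_target, t_target, time_weight=1.0):
    """Distance to the target mixing feature space and time (assumed additive form)."""
    space = np.linalg.norm(np.asarray(X_hist, dtype=float) - x_target, axis=1)
    time = np.abs(np.asarray(t_hist, dtype=float) - t_target)
    return space + time_weight * time

def loo_error_1nn(X, y):
    """Leave-one-out error of a 1-nearest-neighbour base classifier (needs >= 2 points)."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # a point may not vote for itself
    return float(np.mean(y[d.argmin(axis=1)] != y))

def select_training_set(X_hist, y_hist, t_hist, x_target, t_target, sizes, time_weight=1.0):
    """Return indices of the selected training instances for one target instance.

    Candidates are ranked by combined distance; the training set size is the
    candidate size with the lowest cross-validation error (ties go to the
    first size listed).
    """
    X_hist, y_hist = np.asarray(X_hist, dtype=float), np.asarray(y_hist)
    dist = combined_distance(X_hist, t_hist, x_target, t_target, time_weight)
    order = np.argsort(dist, kind="stable")
    best = min(sizes, key=lambda n: loo_error_1nn(X_hist[order[:n]], y_hist[order[:n]]))
    return order[:best]
```

Because the base classifier only appears inside `loo_error_1nn`, swapping in a different classifier would not change the selection logic, which is what makes this a wrapper-style sketch.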

Repository TU/e

Crossref

Pure OAI Repository

Bournemouth University Research Online

Learning to Rank based on Analogical Reasoning

Authors: Fahandar Mohsen Ahmadi, Hüllermeier Eyke
Publication date: 28/11/2017

Object ranking or "learning to rank" is an important problem in the realm of preference learning. On the basis of training data in the form of a set of rankings of objects represented as feature vectors, the goal is to learn a ranking function that predicts a linear order of any new set of objects. In this paper, we propose a new approach to object ranking based on principles of analogical reasoning. More specifically, our inference pattern is formalized in terms of so-called analogical proportions and can be summarized as follows: Given objects A, B, C, D, if object A is known to be preferred to B, and C relates to D as A relates to B, then C is (supposedly) preferred to D. Our method applies this pattern as a main building block and combines it with ideas and techniques from instance-based learning and rank aggregation. Based on first experimental results for data sets from various domains (sports, education, tourism, etc.), we conclude that our approach is highly competitive. It appears to be specifically interesting in situations in which the objects are coming from different subdomains, and which hence require a kind of knowledge transfer.

Comment: Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), 8 pages
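The inference pattern stated in this abstract (A preferred to B, and C relates to D as A relates to B, hence C preferred to D) can be illustrated with a small sketch. It assumes an arithmetic notion of analogical proportion on features scaled to [0, 1] and a simple max-based aggregation over known preferences; both choices, and the function names, are assumptions and not necessarily the formalization used by the authors.

```python
import numpy as np

def analogy_degree(a, b, c, d):
    """Degree in [0, 1] to which 'a is to b as c is to d' holds.

    Assumed arithmetic form: per feature, 1 - |(a - b) - (c - d)|, clipped at 0,
    then averaged over features. Features are assumed to lie in [0, 1].
    """
    a, b, c, d = (np.asarray(v, dtype=float) for v in (a, b, c, d))
    per_feature = 1.0 - np.abs((a - b) - (c - d))
    return float(np.clip(per_feature, 0.0, 1.0).mean())

def prefers_c_over_d(preferences, c, d):
    """Predict whether c is preferred to d.

    preferences: list of (a, b) feature-vector pairs, each meaning 'a is preferred to b'.
    The query pair is compared in both orders; the order with the stronger
    analogical support from any known preference wins.
    """
    support_cd = max(analogy_degree(a, b, c, d) for a, b in preferences)
    support_dc = max(analogy_degree(a, b, d, c) for a, b in preferences)
    return support_cd >= support_dc
```

A full ranking of a new object set would additionally need the pairwise predictions to be combined, which the abstract attributes to rank aggregation and which this sketch leaves out.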

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications