4,139 research outputs found
PAC-Bayesian Majority Vote for Late Classifier Fusion
A lot of attention has been devoted to multimedia indexing over the past few
years. In the literature, we often consider two kinds of fusion schemes: The
early fusion and the late fusion. In this paper we focus on late classifier
fusion, where one combines the scores of each modality at the decision level.
To tackle this problem, we investigate a recent and elegant well-founded
quadratic program named MinCq coming from the Machine Learning PAC-Bayes
theory. MinCq looks for the weighted combination, over a set of real-valued
functions seen as voters, leading to the lowest misclassification rate, while
making use of the voters' diversity. We provide evidence that this method is
naturally adapted to late fusion procedure. We propose an extension of MinCq by
adding an order- preserving pairwise loss for ranking, helping to improve Mean
Averaged Precision measure. We confirm the good behavior of the MinCq-based
fusion approaches with experiments on a real image benchmark.Comment: 7 pages, Research repor
Recommended from our members
Statistical Properties and Guarantees for Certain Anomaly Detection Relevant Problems
Anomaly detection aims at detecting the points that appear different than the majority of the data, such that they are suspected to be generated from a different distribution. Anomaly detectors have been applied in many different fields, such as detecting fraudulent behaviors in bank transaction, finding broken sensors in a weather station network, discovering suspicious patterns for disease from patients' medical images, and so on. The work in this thesis is divided into two parts. The first part seeks to provide a stopping rule condition for one state-of-the-art anomaly detector iForest, for the purpose of outputting the top k most anomalous points as early as possible. The second part focuses on providing PAC guarantees for one version of the open category detection problem in order to achieve a desired recall without introducing an unnecessarily large false positive rate. In addition, we provide an analysis of the optimal way of distributing samples under a fixed budget, and we combine certain mixture proportion estimators with our algorithm to demonstrate a pipeline for solving the problem when we do not have prior information about the mixture proportion
A review of domain adaptation without target labels
Domain adaptation has become a prominent problem setting in machine learning
and related fields. This review asks the question: how can a classifier learn
from a source domain and generalize to a target domain? We present a
categorization of approaches, divided into, what we refer to as, sample-based,
feature-based and inference-based methods. Sample-based methods focus on
weighting individual observations during training based on their importance to
the target domain. Feature-based methods revolve around on mapping, projecting
and representing features such that a source classifier performs well on the
target domain and inference-based methods incorporate adaptation into the
parameter estimation procedure, for instance through constraints on the
optimization procedure. Additionally, we review a number of conditions that
allow for formulating bounds on the cross-domain generalization error. Our
categorization highlights recurring ideas and raises questions important to
further research.Comment: 20 pages, 5 figure
One-Class Classification: Taxonomy of Study and Review of Techniques
One-class classification (OCC) algorithms aim to build classification models
when the negative class is either absent, poorly sampled or not well defined.
This unique situation constrains the learning of efficient classifiers by
defining class boundary just with the knowledge of positive class. The OCC
problem has been considered and applied under many research themes, such as
outlier/novelty detection and concept learning. In this paper we present a
unified view of the general problem of OCC by presenting a taxonomy of study
for OCC problems, which is based on the availability of training data,
algorithms used and the application domains applied. We further delve into each
of the categories of the proposed taxonomy and present a comprehensive
literature review of the OCC algorithms, techniques and methodologies with a
focus on their significance, limitations and applications. We conclude our
paper by discussing some open research problems in the field of OCC and present
our vision for future research.Comment: 24 pages + 11 pages of references, 8 figure
- …