17,700 research outputs found
PAC-Bayesian Majority Vote for Late Classifier Fusion
A lot of attention has been devoted to multimedia indexing over the past few
years. In the literature, we often consider two kinds of fusion schemes: The
early fusion and the late fusion. In this paper we focus on late classifier
fusion, where one combines the scores of each modality at the decision level.
To tackle this problem, we investigate a recent and elegant well-founded
quadratic program named MinCq coming from the Machine Learning PAC-Bayes
theory. MinCq looks for the weighted combination, over a set of real-valued
functions seen as voters, leading to the lowest misclassification rate, while
making use of the voters' diversity. We provide evidence that this method is
naturally adapted to late fusion procedure. We propose an extension of MinCq by
adding an order- preserving pairwise loss for ranking, helping to improve Mean
Averaged Precision measure. We confirm the good behavior of the MinCq-based
fusion approaches with experiments on a real image benchmark.Comment: 7 pages, Research repor
A Study of SVM Kernel Functions for Sensitivity Classification Ensembles with POS Sequences
Freedom of Information (FOI) laws legislate that government documents should be opened to the public. However, many government documents contain sensitive information, such as confidential information, that is exempt from release. Therefore, government documents must be sensitivity reviewed prior to release, to identify and close any sensitive information. With the adoption of born-digital documents, such as email, there is a need for automatic sensitivity classification to assist digital sensitivity review. SVM classifiers and Part-of-Speech sequences have separately been shown to be promising for sensitivity classification. However, sequence classification methodologies, and specifically SVM kernel functions, have not been fully investigated for sensitivity classification. Therefore, in this work, we present an evaluation of five SVM kernel functions for sensitivity classification using POS sequences. Moreover, we show that an ensemble classifier that combines POS sequence classification with text classification can significantly improve sensitivity classification effectiveness (+6.09% F2) compared with a text classification baseline, according to McNemar's test of significance
- …