
195 research outputs found

PAC-Bayesian Majority Vote for Late Classifier Fusion

Author: Ayache Stéphane
Habrard Amaury
Morvant Emilie
Publication date: 01/01/2012

A lot of attention has been devoted to multimedia indexing over the past few years. The literature usually considers two kinds of fusion schemes: early fusion and late fusion. In this paper we focus on late classifier fusion, where one combines the scores of each modality at the decision level. To tackle this problem, we investigate a recent, well-founded quadratic program named MinCq, which comes from the PAC-Bayesian theory in machine learning. MinCq looks for the weighted combination, over a set of real-valued functions seen as voters, that leads to the lowest misclassification rate while making use of the voters' diversity. We provide evidence that this method is naturally suited to late fusion. We propose an extension of MinCq that adds an order-preserving pairwise loss for ranking, which helps improve the Mean Average Precision measure. We confirm the good behavior of the MinCq-based fusion approaches with experiments on a real image benchmark. Comment: 7 pages, Research report

arXiv.org e-Print Archive

HAL-UJM

HAL AMU

On the Generalization of the C-Bound to Structured Output Ensemble Methods

Author: Laviolette François
Morvant Emilie
Ralaivola Liva
Roy Jean-Francis
Publication venue: HAL CCSD
Publication date: 01/01/2015

This paper generalizes an important result from the PAC-Bayesian literature for binary classification to the case of ensemble methods for structured outputs. We prove a generic version of the C-bound, an upper bound on the risk of models expressed as a weighted majority vote that is based on the first and second statistical moments of the vote's margin. This bound may advantageously (i) be applied to more complex outputs such as multiclass and multilabel predictions, and (ii) allow margin relaxations to be considered. These results open the way to developing new ensemble methods for structured output prediction with PAC-Bayesian guarantees
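As an illustration of the bound described in this abstract, here is a minimal sketch of the empirical C-bound in its classical binary-classification form, 1 - (first moment)^2 / (second moment) of the margin, which holds when the first moment is positive. This is not the generalized structured-output version proved in the paper; the function names, the voter outputs in [-1, 1], and the labels in {-1, +1} are assumptions made for the example.

```python
import numpy as np


def empirical_c_bound(votes, weights, labels):
    """Empirical C-bound for a weighted majority vote (binary case).

    votes:   (n_voters, n_samples) voter outputs in [-1, 1] (assumed layout)
    weights: (n_voters,) non-negative weights summing to 1
    labels:  (n_samples,) true labels in {-1, +1}
    """
    votes = np.asarray(votes, dtype=float)
    weights = np.asarray(weights, dtype=float)
    labels = np.asarray(labels, dtype=float)

    # Margin of the weighted vote on each example.
    margins = labels * (weights @ votes)
    mu1 = margins.mean()          # first moment of the margin
    mu2 = (margins ** 2).mean()   # second moment of the margin

    if mu1 <= 0:
        raise ValueError("the C-bound requires a positive first moment")
    return 1.0 - mu1 ** 2 / mu2


def majority_vote_risk(votes, weights, labels):
    """Empirical error of the vote; a zero margin is counted as an error."""
    votes = np.asarray(votes, dtype=float)
    weights = np.asarray(weights, dtype=float)
    labels = np.asarray(labels, dtype=float)
    margins = labels * (weights @ votes)
    return float(np.mean(margins <= 0))


if __name__ == "__main__":
    # Three hypothetical voters, each wrong on a different example.
    votes = [[1, 1, -1, 1],
             [1, -1, 1, 1],
             [-1, 1, 1, 1]]
    weights = [1 / 3, 1 / 3, 1 / 3]
    labels = [1, 1, 1, 1]
    print(empirical_c_bound(votes, weights, labels))
    print(majority_vote_risk(votes, weights, labels))
```

In this toy case the vote makes no error because the voters' mistakes fall on different examples, and the bound evaluates to about 0.25. It shrinks as the margins become larger and more concentrated around their mean.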

arXiv.org e-Print Archive

HAL-UJM

HAL AMU

The Emotional Impact of Audio - Visual Stimuli

Author: Thomas Titus Pallithottathu
Publication venue: RIT Scholar Works
Publication date: 01/07/2017

Induced affect is the emotional effect of an object on an individual. It can be quantified through two metrics: valence and arousal. Valence quantifies how positive or negative something is, while arousal quantifies the intensity from calm to exciting. These metrics enable researchers to study how people opine on various topics. Affective content analysis of visual media is a challenging problem due to differences in perceived reactions. Industry-standard machine learning classifiers such as Support Vector Machines can be used to help determine user affect. The best affect-annotated video datasets are often analyzed by feeding large amounts of visual and audio features through machine-learning algorithms. The goal is to maximize accuracy, with the hope that each feature will bring useful information to the table. We depart from this approach to quantify how different modalities such as visual, audio, and text description information can aid in the understanding of affect. To that end, we train independent models for visual, audio, and text description. Each is a convolutional neural network paired with a support vector machine to classify valence and arousal. We also train various ensemble models that combine multi-modal information, with the hope that the information from independent modalities benefits each other. We find that our visual network alone achieves state-of-the-art valence classification accuracy and that our audio network, when paired with our visual network, achieves competitive results on arousal classification. Each network is much stronger on one metric than the other. This may lead to more sophisticated multimodal approaches to accurately identifying affect in video data. This work also contributes to induced emotion classification by augmenting existing sizable media datasets and providing a robust framework for classifying the same

RIT Scholar Works

Adaptation de domaine de vote de majorité par auto-étiquetage non itératif

Author: Morvant Emilie
Publication venue: HAL CCSD
Publication date: 01/01/2014

National audience. In machine learning, we speak of domain adaptation when the test (target) data and the training (source) data are generated from different distributions. We must therefore develop classification algorithms able to adapt to a new distribution for which no label information is available. We address this problem from the angle of the PAC-Bayesian approach, which focuses on learning models defined as majority votes over a set of functions. In this context, we introduce PV-MinCq, an adaptive version of the (non-adaptive) MinCq algorithm. PV-MinCq follows this principle: we transfer the source labels to nearby target points, then apply MinCq to the "self-labeled" target sample (justified by a theoretical bound). More precisely, we define a non-iterative self-labeling that focuses on the regions where the source and target marginal distributions are most similar. We then study the influence of our self-labeling in order to derive a hyperparameter validation procedure. Finally, our approach shows promising empirical results

HAL-UJM

IST Austria: PubRep (Institute of Science and Technology)

Hal-Diderot

Learning a priori constrained weighted majority votes

Author: Amaury Habrard
Aurélien Bellet
D Haussler
D Kedem
Emilie Morvant
F Laviolette
G Lever
KQ Weinberger
L Breiman
M Marchand
Marc Sebban
PK Atrey
R Nock
R Schapire
S Floyd
S Sun
T Graepel
Publication venue: 'Springer Science and Business Media LLC'

Crossref

A Review of Artificial Intelligence in Breast Imaging

Author: Ahmed Mohammed H.
Ajam Tarek
Al-Karawi Dhurgham
Al-Zaidi Shakir
Alshalabi Bashar A.
Helael Khaled Ahmad
Mouhsen Abdulmajeed Mounzer
Obeidat Naser
Salman Mohamed
Publication date: 09/05/2024

With the increasing dominance of artificial intelligence (AI) techniques, the important prospects for their application have extended to various medical fields, including domains such as in vitro diagnosis, intelligent rehabilitation, medical imaging, and prognosis. Breast cancer is a common malignancy that critically affects women’s physical and mental health. Early breast cancer screening—through mammography, ultrasound, or magnetic resonance imaging (MRI)—can substantially improve the prognosis for breast cancer patients. AI applications have shown excellent performance in various image recognition tasks, and their use in breast cancer screening has been explored in numerous studies. This paper introduces relevant AI techniques and their applications in the field of medical imaging of the breast (mammography and ultrasound), specifically in terms of identifying, segmenting, and classifying lesions; assessing breast cancer risk; and improving image quality. Focusing on medical imaging for breast cancer, this paper also reviews related challenges and prospects for AI

Coventry University Pure Portal

Apprentissage de vote de majorité pour la classification supervisée et l'adaptation de domaine : approches PAC-Bayésiennes et combinaison de similarités

Author: Morvant Emilie
Publication venue: HAL CCSD
Publication date: 18/09/2013

Nowadays, due to the expansion of the web, plenty of data is available, and many applications need supervised machine learning methods able to take different information sources into account. For instance, in multimedia semantic indexing applications, one has to efficiently take advantage of information about the color, text, texture, or sound of the documents. Most existing methods try to combine this multimodal information, either by directly fusing the descriptors or by combining similarities or classifiers, in order to produce a classification model that is more reliable for the considered task. These multimodal facets usually raise two main issues. On the one hand, one has to be able to make the best use of all the a priori information available. On the other hand, the data on which the model will be applied does not necessarily come from the same probability distribution as the data used during the learning step. In this context, the model has to be adapted to new data, which is known as domain adaptation. In this thesis, we propose several theoretically founded contributions to tackle these issues. A first series of contributions studies the problem of learning a weighted majority vote over a set of voters in a supervised classification setting. These results fall within the context of the PAC-Bayesian theory, which allows one to derive generalization guarantees for such a vote by assuming an a priori on the relevance of the voters. Our first contribution aims at extending a recent algorithm, MinCq, which minimizes a bound on the error of the majority vote in binary classification. This extension can take into account an a priori belief on the performance of the voters, expressed as an aligned distribution. We illustrate its usefulness for combining nearest neighbor classifiers, and for classifier fusion on a multimedia semantic indexing task. Then, we propose a theoretical contribution for multiclass classification tasks. Our approach is based on an original PAC-Bayesian analysis that considers the operator norm of the confusion matrix as an error measure.
Our second series of contributions relates to domain adaptation. In this situation, we present our third result, which combines similarities in order to infer a representation space that moves the learning distribution and the testing distribution closer together. This contribution is based on the theory of learning from (epsilon,gamma,tau)-good similarity functions and is justified by the minimization of a standard bound in domain adaptation. For our fourth and last contribution, we propose the first PAC-Bayesian analysis for domain adaptation. This analysis is based on a consistent divergence measure between distributions, allowing us to derive a generalization bound for learning majority votes in binary classification. Moreover, we propose a first algorithm specialized to linear classifiers and able to directly minimize our bound

Optimal allocation and classification in multi-agent systems: with applications in precision agriculture

Author: Cobbenhagen Alfonsus Theodorus Johannes Roy
Publication venue: Technische Universiteit Eindhoven
Publication date: 16/10/2020

Pure OAI Repository

Adaptive algorithms for real-world transactional data mining.

Author: Apeh Edward Tersoo
Publication date: 01/01/2012

The accurate identification of the right customer to target with the right product at the right time, through the right channel, to satisfy the customer’s evolving needs, is a key performance driver and enhancer for businesses. Data mining is an analytic process designed to explore usually large amounts of data (typically business or market related) in search of consistent patterns and/or systematic relationships between variables, for the purpose of generating explanatory/predictive data models from the detected patterns. It provides an effective and established mechanism for accurate identification and classification of customers. Data models derived from the data mining process can aid in effectively recognizing the status and preference of customers, individually and as a group. Such data models can be incorporated into the business market segmentation, customer targeting, and channelling decisions with the goal of maximizing the total customer lifetime profit. However, due to costs, privacy, and/or data protection reasons, the customer data available for data mining is often restricted to verified and validated data (in most cases, only the business-owned transactional data is available). Transactional data is a valuable resource for generating such data models. Transactional data can be electronically collected and readily made available for data mining in large quantity at minimum extra cost. Transactional data is, however, inherently sparse and skewed. These inherent characteristics give rise to the poor performance of data models built using customer data based on transactional data. Data models for identifying, describing, and classifying customers, constructed using evolving transactional data, thus need to effectively handle the inherent sparseness and skewness of that data in order to be efficient and accurate.
Using real-world transactional data, this thesis presents the findings and results from the investigation of data mining algorithms for analysing, describing, identifying, and classifying customers with evolving needs. In particular, methods for handling the issues of scalability, uncertainty, and adaptation whilst mining evolving transactional data are analysed and presented. A novel application of a new framework for integrating transactional data binning and classification techniques is presented alongside an effective prototype selection algorithm for efficient transactional data model building. A new change mining architecture for monitoring, detecting, and visualizing the change in customer behaviour using transactional data is proposed and discussed as an effective means for analysing and understanding the change in customer buying behaviour over time. Finally, the challenging problem of discerning between the change in the customer profile (which may necessitate the effective change of the customer’s label) and the change in performance of the model(s) (which may necessitate changing or adapting the model(s)) is introduced and discussed by way of a novel flexible and efficient architecture for classifier model adaptation and customer profile class relabeling

Bournemouth University Research Online