Unsupervised ensemble classification with correlated decision agents
© 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Decision-making procedures in which a set of individual binary labels is processed to produce a single joint decision can be approached by modeling the individual labels as multivariate independent Bernoulli random variables. This probabilistic model admits an unsupervised solution via EM-based algorithms, which estimate the distribution parameters and take a joint decision using a Maximum a Posteriori criterion. These methods usually assume that the individual decision agents are conditionally independent, an assumption that might not hold in practical setups. In this work we therefore formulate and solve the decision-making problem with an EM-based approach that assumes correlated decision agents. Improved performance is obtained on synthetic and real datasets, compared to classical and state-of-the-art algorithms. (Peer reviewed; postprint, author's final draft.)
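The conditionally independent baseline that this work extends can be sketched as a small EM loop in the Dawid-Skene spirit: alternately estimate each agent's per-class accuracy and the posterior of the true label, then decide by MAP. Everything below (function name, majority-vote initialization, iteration count) is illustrative, not the authors' implementation.

```python
import numpy as np

def em_ensemble(Y, n_iter=50):
    """EM for unsupervised binary ensemble decisions (Dawid-Skene style).
    Y: (n_samples, n_agents) matrix of 0/1 votes."""
    eps = 1e-9
    # Initialize the posterior P(y=1 | votes) with the majority vote.
    q = Y.mean(axis=1)
    for _ in range(n_iter):
        # M-step: class prior and each agent's vote probability per class.
        pi = q.mean()                                                # P(y=1)
        p1 = (q[:, None] * Y).sum(0) / (q.sum() + eps)               # P(vote=1 | y=1)
        p0 = ((1 - q)[:, None] * Y).sum(0) / ((1 - q).sum() + eps)   # P(vote=1 | y=0)
        # E-step: posterior of the true label given all votes.
        log1 = np.log(pi + eps) + (Y * np.log(p1 + eps)
                                   + (1 - Y) * np.log(1 - p1 + eps)).sum(1)
        log0 = np.log(1 - pi + eps) + (Y * np.log(p0 + eps)
                                       + (1 - Y) * np.log(1 - p0 + eps)).sum(1)
        q = 1.0 / (1.0 + np.exp(log0 - log1))
    # MAP decision per sample, plus the soft posterior.
    return (q > 0.5).astype(int), q
```

With agents of unequal accuracy, this typically beats plain majority voting; the paper's contribution is to drop the conditional-independence assumption made in the M-step above.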
Learning from unequally reliable blind ensembles of classifiers
© 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

The rising interest in pattern recognition and data analytics has spurred the development of a plethora of machine learning algorithms and tools. However, since each algorithm has its strengths and weaknesses, one is motivated to judiciously fuse multiple algorithms in order to find the "best" performing one for a given dataset. Ensemble learning aims to create a high-performance meta-algorithm by combining the outputs of multiple algorithms. The present work introduces a simple blind scheme for learning from ensembles of classifiers using joint matrix factorization, where "blind" means the combiner has no knowledge of the ground-truth labels on which each classifier was trained. Performance is evaluated on synthetic and real datasets. (Peer reviewed; postprint, author's final draft.)
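The abstract does not spell out the factorization, so as a hedged illustration of blind combining, the sketch below uses a related low-rank idea: under conditional independence, the off-diagonal part of the covariance of the classifiers' ±1 outputs is rank one, and its leading eigenvector gives reliability weights without any ground-truth labels. All names and the specific decomposition are assumptions for illustration, not the paper's method.

```python
import numpy as np

def blind_weights(Y):
    """Estimate classifier reliabilities blindly from their agreements.
    Y: (n_samples, n_classifiers) with labels in {-1, +1}.
    Off-diagonal covariance entries are approximately rank one, with the
    leading eigenvector proportional to each classifier's accuracy minus 1/2."""
    C = np.cov(Y, rowvar=False)
    np.fill_diagonal(C, 0.0)          # diagonal mixes in each classifier's own noise
    w = np.linalg.eigh(C)[1][:, -1]   # eigenvector of the largest eigenvalue
    if w.sum() < 0:
        w = -w                        # fix the sign: assume most are better than chance
    return w

def blind_combine(Y):
    """Weighted vote using the blindly estimated reliabilities."""
    return np.sign(Y @ blind_weights(Y))
```

The resulting weighted vote emphasizes the more reliable classifiers, which is the behavior one wants when ensemble members are unequally reliable.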
Anomaly Detection with Variance Stabilized Density Estimation
Density-estimation-based anomaly detection schemes typically model anomalies
as examples that reside in low-density regions. We propose a modified density
estimation problem and demonstrate its effectiveness for anomaly detection.
Specifically, we assume the density function of normal samples is uniform in
some compact domain. This assumption implies the density function is more
stable (with lower variance) around normal samples than anomalies. We first
corroborate this assumption empirically using a wide range of real-world data.
Then, we design a variance stabilized density estimation problem for maximizing
the likelihood of the observed samples while minimizing the variance of the
density around normal samples. We introduce an ensemble of autoregressive
models to learn the variance stabilized distribution. Finally, we perform an
extensive benchmark with 52 datasets demonstrating that our method leads to
state-of-the-art results while alleviating the need for data-specific
hyperparameter tuning. (12 pages, 6 figures.)
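A minimal sketch of the kind of objective described: the mean negative log-likelihood of the normal samples plus a penalty on the variance of their log-density, which pushes the model toward a locally uniform (stable) density on normal data. The trade-off weight `lam`, the function names, and the plain-NumPy form are assumptions for illustration; the paper optimizes such an objective with an ensemble of autoregressive models.

```python
import numpy as np

def vs_objective(log_density, lam=1.0):
    """Variance-stabilized training objective (sketch, to be minimized).
    log_density: log p(x_i) evaluated at a batch of normal samples."""
    nll = -np.mean(log_density)      # maximize likelihood of normal samples
    var = np.var(log_density)        # penalize density fluctuation around them
    return nll + lam * var

def anomaly_score(log_density):
    """Lower density => higher anomaly score, as in the abstract's premise."""
    return -np.asarray(log_density)
```

A perfectly uniform density over the normal samples makes the variance term vanish, so the objective reduces to the plain negative log-likelihood; anomalies, sitting in low-density regions, then receive high scores.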