Search CORE

48,560 research outputs found

A Bayesian alternative to mutual information for the hierarchical clustering of dependent random variables

Author: Bellec Pierre
Marrelec Guillaume
Messé Arnaud
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2015
Field of study

The use of mutual information as a similarity measure in agglomerative hierarchical clustering (AHC) raises an important issue: some correction needs to be applied for the dimensionality of variables. In this work, we formulate the decision of merging dependent multivariate normal variables in an AHC procedure as a Bayesian model comparison. We found that the Bayesian formulation naturally shrinks the empirical covariance matrix towards a matrix set a priori (e.g., the identity), provides an automated stopping rule, and corrects for dimensionality using a term that scales up the measure as a function of the dimensionality of the variables. Also, the resulting log Bayes factor is asymptotically proportional to the plug-in estimate of mutual information, with an additive correction for dimensionality in agreement with the Bayesian information criterion. We investigated the behavior of these Bayesian alternatives (in exact and asymptotic forms) to mutual information on simulated and real data. An encouraging result was first derived on simulations: the hierarchical clustering based on the log Bayes factor outperformed off-the-shelf clustering techniques as well as raw and normalized mutual information in terms of classification accuracy. On a toy example, we found that the Bayesian approaches led to results that were similar to those of mutual information clustering techniques, with the advantage of an automated thresholding. On real functional magnetic resonance imaging (fMRI) datasets measuring brain activity, it identified clusters consistent with the established outcome of standard procedures. On this application, normalized mutual information had a highly atypical behavior, in the sense that it systematically favored very large clusters. These initial experiments suggest that the proposed Bayesian alternatives to mutual information are a useful new tool for hierarchical clustering

arXiv.org e-Print Archive

CiteSeerX

Crossref

HAL-Inserm

Directory of Open Access Journals

PubMed Central

FigShare

A Geometric Approach to Pairwise Bayesian Alignment of Functional Data Using Importance Sampling

Author: Kurtek Sebastian
Publication venue
Publication date: 01/01/2017
Field of study

We present a Bayesian model for pairwise nonlinear registration of functional data. We use the Riemannian geometry of the space of warping functions to define appropriate prior distributions and sample from the posterior using importance sampling. A simple square-root transformation is used to simplify the geometry of the space of warping functions, which allows for computation of sample statistics, such as the mean and median, and a fast implementation of a

k

-means clustering algorithm. These tools allow for efficient posterior inference, where multiple modes of the posterior distribution corresponding to multiple plausible alignments of the given functions are found. We also show pointwise

95\%

credible intervals to assess the uncertainty of the alignment in different clusters. We validate this model using simulations and present multiple examples on real data from different application domains including biometrics and medicine

arXiv.org e-Print Archive

Crossref

A Quasi-Bayesian Perspective to Online Clustering

Author: Guedj Benjamin
Li Le
Loustau Sébastien
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2018
Field of study

When faced with high frequency streams of data, clustering raises theoretical and algorithmic pitfalls. We introduce a new and adaptive online clustering algorithm relying on a quasi-Bayesian approach, with a dynamic (i.e., time-dependent) estimation of the (unknown and changing) number of clusters. We prove that our approach is supported by minimax regret bounds. We also provide an RJMCMC-flavored implementation (called PACBO, see https://cran.r-project.org/web/packages/PACBO/index.html) for which we give a convergence guarantee. Finally, numerical experiments illustrate the potential of our procedure

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

UCL Discovery

Outlier detection techniques for wireless sensor networks: A survey

Author: Havinga Paul
Meratnia Nirvana
Zhang Yang
Publication venue: IEEE
Publication date: 01/01/2010
Field of study

In the field of wireless sensor networks, those measurements that significantly deviate from the normal pattern of sensed data are considered as outliers. The potential sources of outliers include noise and errors, events, and malicious attacks on the network. Traditional outlier detection techniques are not directly applicable to wireless sensor networks due to the nature of sensor data and specific requirements and limitations of the wireless sensor networks. This survey provides a comprehensive overview of existing outlier detection techniques specifically developed for the wireless sensor networks. Additionally, it presents a technique-based taxonomy and a comparative table to be used as a guideline to select a technique suitable for the application at hand based on characteristics such as data type, outlier type, outlier identity, and outlier degree

CiteSeerX

Crossref

University of Twente Research Information