A Bayesian alternative to mutual information for the hierarchical clustering of dependent random variables
The use of mutual information as a similarity measure in agglomerative
hierarchical clustering (AHC) raises an important issue: some correction needs
to be applied for the dimensionality of variables. In this work, we formulate
the decision of merging dependent multivariate normal variables in an AHC
procedure as a Bayesian model comparison. We found that the Bayesian
formulation naturally shrinks the empirical covariance matrix towards a matrix
set a priori (e.g., the identity), provides an automated stopping rule, and
corrects for dimensionality using a term that scales up the measure as a
function of the dimensionality of the variables. Also, the resulting log Bayes
factor is asymptotically proportional to the plug-in estimate of mutual
information, with an additive correction for dimensionality in agreement with
the Bayesian information criterion. We investigated the behavior of these
Bayesian alternatives (in exact and asymptotic forms) to mutual information on
simulated and real data. An encouraging result was first derived on
simulations: the hierarchical clustering based on the log Bayes factor
outperformed off-the-shelf clustering techniques as well as raw and normalized
mutual information in terms of classification accuracy. On a toy example, we
found that the Bayesian approaches led to results that were similar to those of
mutual information clustering techniques, with the advantage of an automated
thresholding. On real functional magnetic resonance imaging (fMRI) datasets
measuring brain activity, it identified clusters consistent with the
established outcome of standard procedures. On this application, normalized
mutual information had a highly atypical behavior, in the sense that it
systematically favored very large clusters. These initial experiments suggest
that the proposed Bayesian alternatives to mutual information are a useful new
tool for hierarchical clustering.
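As a rough illustration of the asymptotic form mentioned in this abstract, the sketch below computes the plug-in estimate of mutual information between two blocks of multivariate normal variables and subtracts a BIC-style penalty. It is not the authors' implementation: the function names `gaussian_plugin_mi` and `bic_merge_score` are hypothetical, the penalty (p*q/2)*log(n) is an assumption obtained from the standard BIC parameter count (a joint covariance has p*q more free parameters than two independent blocks), and the exact Bayes factor with its prior covariance is not reproduced.

```python
import numpy as np

def gaussian_plugin_mi(x, y):
    """Plug-in mutual information (in nats) between two Gaussian blocks.

    x has shape (n, p) and y has shape (n, q); rows are samples.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    p = x.shape[1]
    # Maximum-likelihood (biased) covariance of the stacked variables.
    cov = np.cov(np.hstack([x, y]), rowvar=False, bias=True)
    logdet_x = np.linalg.slogdet(cov[:p, :p])[1]
    logdet_y = np.linalg.slogdet(cov[p:, p:])[1]
    logdet_xy = np.linalg.slogdet(cov)[1]
    return 0.5 * (logdet_x + logdet_y - logdet_xy)

def bic_merge_score(x, y):
    """Assumed BIC-style merge score: n * MI minus (p * q / 2) * log(n).

    A positive value favors treating the two blocks as dependent (merge);
    a negative value favors keeping them apart.
    """
    n, p = x.shape
    q = y.shape[1]
    return n * gaussian_plugin_mi(x, y) - 0.5 * p * q * np.log(n)

# Synthetic check: one block that depends on x, one that does not.
rng = np.random.default_rng(0)
n = 500
x = rng.standard_normal((n, 2))
y_dep = x + 0.5 * rng.standard_normal((n, 2))
y_ind = rng.standard_normal((n, 2))
score_dep = bic_merge_score(x, y_dep)
score_ind = bic_merge_score(x, y_ind)
```

Under these assumptions, the sign of the score acts as a stopping rule: merging continues only while some pair of clusters has a positive score.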
Automated extraction of mutual independence patterns using Bayesian comparison of partition models
Mutual independence is a key concept in statistics that characterizes the
structural relationships between variables. Existing methods to investigate
mutual independence rely on the definition of two competing models, one being
nested into the other and used to generate a null distribution for a statistic
of interest, usually under the asymptotic assumption of large sample size. As
such, these methods have a very restricted scope of application. In the present
manuscript, we propose to change the investigation of mutual independence from
a hypothesis-driven task that can only be applied in very specific cases to a
blind and automated search within patterns of mutual independence. To this end,
we treat the issue as one of model comparison that we solve in a Bayesian
framework. We show the relationship between such an approach and existing
methods in the case of multivariate normal distributions as well as
cross-classified multinomial distributions. We propose a general Markov chain
Monte Carlo (MCMC) algorithm to numerically approximate the posterior
distribution on the space of all patterns of mutual independence. The relevance
of the method is demonstrated on synthetic data as well as two real datasets,
showing the unique insight provided by this approach.
Comment: IEEE Transactions on Pattern Analysis and Machine Intelligence (in press)
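To make the idea of comparing patterns of mutual independence concrete, here is a minimal sketch for the multivariate normal case. It deliberately swaps the MCMC search described in the abstract for exhaustive enumeration, which is only feasible for a handful of variables. It also replaces the exact marginal likelihood with a BIC approximation and assumes a uniform prior over partitions. All function names (`set_partitions`, `bic_log_evidence`, `partition_posterior`) are hypothetical.

```python
import numpy as np

def set_partitions(items):
    """Yield every partition of a list into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        # Either `first` forms its own block...
        yield [[first]] + part
        # ...or it joins one of the existing blocks.
        for i, block in enumerate(part):
            yield part[:i] + [[first] + block] + part[i + 1:]

def bic_log_evidence(data, partition):
    """BIC approximation (up to a partition-independent constant) of the
    log evidence of a Gaussian model whose covariance is block diagonal
    according to `partition`."""
    n = data.shape[0]
    cov = np.cov(data, rowvar=False, bias=True)
    loglik, n_params = 0.0, 0
    for block in partition:
        sub = cov[np.ix_(block, block)]
        loglik -= 0.5 * n * np.linalg.slogdet(sub)[1]
        n_params += len(block) * (len(block) + 1) // 2
    return loglik - 0.5 * n_params * np.log(n)

def partition_posterior(data):
    """Approximate posterior over all partitions (uniform prior assumed)."""
    parts = list(set_partitions(list(range(data.shape[1]))))
    scores = np.array([bic_log_evidence(data, p) for p in parts])
    weights = np.exp(scores - scores.max())  # stabilized softmax
    return parts, weights / weights.sum()

# Synthetic data: variables 0 and 1 are dependent, variable 2 is independent.
rng = np.random.default_rng(1)
n = 500
x0 = rng.standard_normal(n)
x1 = x0 + 0.5 * rng.standard_normal(n)
x2 = rng.standard_normal(n)
data = np.column_stack([x0, x1, x2])
parts, probs = partition_posterior(data)
best = parts[int(np.argmax(probs))]
```

The number of partitions grows as the Bell numbers (5 for three variables, 15 for four), which is why a sampling scheme such as the MCMC algorithm proposed in the paper becomes necessary beyond small problems.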
Understanding the nature of face processing in early autism: a prospective study
Dimensional approaches to psychopathology interrogate the core neurocognitive domains interacting at the individual level to shape diagnostic symptoms. Embedding this approach in prospective longitudinal studies could transform our understanding of the mechanisms underlying neurodevelopmental disorders. Such designs require us to move beyond traditional group comparisons and determine which domain-specific alterations apply at the level of the individual, and whether they vary across distinct phenotypic subgroups. As a proof of principle, this study examines how the domain of face processing contributes to the emergence of Autism Spectrum Disorder (ASD). We used an event-related potentials (ERPs) task in a cohort of 8-month-old infants with (n=148) and without (n=68) an older sibling with ASD, and combined traditional case-control comparisons with machine-learning techniques for prediction of social traits and ASD diagnosis at 36 months, and Bayesian hierarchical clustering for stratification into subgroups. A broad profile of alterations in the time-course of neural processing of faces in infancy was predictive of later ASD, with a strong convergence in ERP features predicting social traits and diagnosis. We identified two main subgroups in ASD, defined by distinct patterns of neural responses to faces, which differed on later sensory sensitivity. Taken together, our findings suggest that individual differences between infants contribute to the diffuse pattern of alterations predictive of ASD in the first year of life. Moving from group-level comparisons to pattern recognition and stratification can help to understand and reduce heterogeneity in clinical cohorts, and improve our understanding of the mechanisms that lead to later neurodevelopmental outcomes.