Detecting synchronization clusters in multivariate time series via coarse-graining of Markov chains
Synchronization cluster analysis is an approach to the detection of
underlying structures in data sets of multivariate time series, starting from a
matrix R of bivariate synchronization indices. A previous method utilized the
eigenvectors of R for cluster identification, analogous to several recent
attempts at group identification using eigenvectors of the correlation matrix.
All of these approaches assumed a one-to-one correspondence of dominant
eigenvectors and clusters, which has however been shown to be wrong in
important cases. We clarify the usefulness of eigenvalue decomposition for
synchronization cluster analysis by translating the problem into the language
of stochastic processes, and derive an enhanced clustering method harnessing
recent insights from the coarse-graining of finite-state Markov processes. We
illustrate the operation of our method using a simulated system of coupled
Lorenz oscillators, and we demonstrate its superior performance over the
previous approach. Finally we investigate the question of robustness of the
algorithm against small sample size, which is important with regard to field
applications. Comment: Follow-up to arXiv:0706.3375. Journal submission 9 Jul 2007.
Published 19 Dec 200
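The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of reading a synchronization matrix R as a Markov chain. It assumes R is symmetric with non-negative entries, normalizes its rows into transition probabilities, and uses the largest gap in the eigenvalue spectrum as a rough cluster count. The function names and the spectral-gap heuristic are illustrative assumptions, not the method of the paper.

```python
import numpy as np

def markov_spectrum(R):
    """Eigenvalues (descending) of the row-normalized matrix P = D^-1 R.

    Assumes R is symmetric and non-negative. P is similar to the symmetric
    matrix D^-1/2 R D^-1/2, so its eigenvalues are real and can be computed
    with a symmetric solver.
    """
    R = np.asarray(R, dtype=float)
    d = R.sum(axis=1)                      # row sums (the "degrees")
    S = R / np.sqrt(np.outer(d, d))        # symmetric form with P's spectrum
    return np.linalg.eigvalsh(S)[::-1]     # eigvalsh returns ascending order

def estimate_cluster_count(R):
    """Heuristic: number of eigenvalues above the largest spectral gap."""
    eig = markov_spectrum(R)
    gaps = eig[:-1] - eig[1:]
    return int(np.argmax(gaps)) + 1

# Toy example (made-up numbers): two groups of three channels,
# strongly synchronized within a group and weakly between groups.
R = np.full((6, 6), 0.1)
R[:3, :3] = 0.9
R[3:, 3:] = 0.9
np.fill_diagonal(R, 1.0)
print(estimate_cluster_count(R))
```

The leading eigenvalue of a row-stochastic matrix is always 1, so only the eigenvalues after it carry information about slowly mixing groups of states.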
Detecting multivariate interactions in spatial point patterns with Gibbs models and variable selection
We propose a method for detecting significant interactions in very large
multivariate spatial point patterns. This methodology develops high dimensional
data understanding in the point process setting. The method is based on
modelling the patterns using a flexible Gibbs point process model to directly
characterise point-to-point interactions at different spatial scales. By using
the Gibbs framework significant interactions can also be captured at small
scales. Subsequently, the Gibbs point process is fitted using a
pseudo-likelihood approximation, and we select significant interactions
automatically using the group lasso penalty with this likelihood approximation.
Thus we estimate the multivariate interactions stably even in this setting. We
demonstrate the feasibility of the method with a simulation study and show its
power by applying it to a large and complex rainforest plant population data
set of 83 species
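To illustrate how a group lasso penalty can switch whole interactions on or off, here is a minimal sketch of the standard block soft-thresholding step used in proximal solvers for the group lasso. It assumes, hypothetically, that all coefficients describing one pairwise interaction form a single group; the function name and the example values are illustrative and are not taken from the paper.

```python
import numpy as np

def group_soft_threshold(beta_g, lam):
    """Proximal operator of lam * ||beta_g||_2 for one coefficient group.

    The whole group is set to zero when its Euclidean norm is at most lam;
    otherwise every coefficient is shrunk by the same factor.
    """
    beta_g = np.asarray(beta_g, dtype=float)
    norm = np.linalg.norm(beta_g)
    if norm <= lam:
        return np.zeros_like(beta_g)
    return (1.0 - lam / norm) * beta_g

# A strong group survives (shrunk), a weak group is removed entirely.
strong = group_soft_threshold([3.0, 4.0], lam=1.0)
weak = group_soft_threshold([0.3, 0.4], lam=1.0)
print(strong, weak)
```

Because a group is kept or dropped as a unit, selection happens at the level of interactions rather than individual coefficients.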
Networks : On the relation of bi- and multivariate measures
Date of Acceptance: 28/04/2015. Acknowledgement: The article processing charge was funded by the German Research Foundation (DFG) and the Albert Ludwigs University Freiburg in the funding programme Open Access Publishing. Peer reviewed. Publisher PD
Causal Patterns: Extraction of multiple causal relationships by Mixture of Probabilistic Partial Canonical Correlation Analysis
In this paper, we propose a mixture of probabilistic partial canonical
correlation analysis (MPPCCA) that extracts the Causal Patterns from two
multivariate time series. Causal patterns refer to the signal patterns within
interactions of two elements having multiple types of mutually causal
relationships, rather than a mixture of simultaneous correlations or the
absence or presence of a causal relationship between the elements. In
multivariate statistics, partial canonical correlation analysis (PCCA)
evaluates the correlation between two multivariates after subtracting the
effect of the third multivariate. PCCA can calculate the Granger Causality
Index (which tests whether a time-series can be predicted from another
time-series), but is not applicable to data containing multiple partial
canonical correlations. After introducing the MPPCCA, we propose an
expectation-maximization (EM) algorithm that estimates the parameters and latent
variables of the MPPCCA. The MPPCCA is expected to extract multiple partial
canonical correlations from data series without any supervised signals to split
the data as clusters. The method was then evaluated in synthetic data
experiments. In the synthetic dataset, our method estimated the multiple
partial canonical correlations more accurately than the existing method. To
determine the types of patterns detectable by the method, experiments were also
conducted on real datasets. The method estimated the communication patterns in
motion-capture data. The MPPCCA is applicable to various types of signals such
as brain signals, human communication and nonlinear complex multibody systems. Comment: DSAA2017 - The 4th IEEE International Conference on Data Science and
Advanced Analytic
Visual and interactive exploration of point data
Point data, such as Unit Postcodes (UPC), can provide very detailed information at fine
scales of resolution. For instance, socio-economic attributes are commonly assigned to
UPC. Hence, they can be represented as points and observable at the postcode level.
Using UPC as a common field allows the concatenation of variables from disparate data
sources that can potentially support sophisticated spatial analysis. However, visualising
UPC in urban areas has at least three limitations. First, at small scales UPC occurrences
can be very dense making their visualisation as points difficult. On the other hand,
patterns in the associated attribute values are often hardly recognisable at large scales.
Secondly, UPC can be used as a common field to allow the concatenation of highly
multivariate data sets with an associated postcode. Finally, socio-economic variables
assigned to UPC (such as the ones used here) can be non-Normal in their distributions
as a result of a large presence of zero values and high variances which constrain their
analysis using traditional statistics.
This paper discusses a Point Visualisation Tool (PVT), a proof-of-concept system
developed to visually explore point data. Various well-known visualisation techniques
were implemented to enable their interactive and dynamic interrogation. PVT provides
multiple representations of point data to facilitate the understanding of the relations
between attributes or variables as well as their spatial characteristics. Brushing between
alternative views is used to link several representations of a single attribute, as well as
to simultaneously explore more than one variable. PVT’s functionality shows how the
use of visual techniques embedded in an interactive environment enables the exploration
of large amounts of multivariate point data
Graph analysis of functional brain networks: practical issues in translational neuroscience
The brain can be regarded as a network: a connected system where nodes, or
units, represent different specialized regions and links, or connections,
represent communication pathways. From a functional perspective communication
is coded by temporal dependence between the activities of different brain
areas. In the last decade, the abstract representation of the brain as a graph
has made it possible to visualize functional brain networks and describe their
non-trivial topological properties in a compact and objective way. Nowadays,
the use of graph analysis in translational neuroscience has become essential to
quantify brain dysfunctions in terms of aberrant reconfiguration of functional
brain networks. Despite its evident impact, graph analysis of functional brain
networks is not a simple toolbox that can be blindly applied to brain signals.
On the one hand, it requires know-how of all the methodological steps of the
processing pipeline that manipulates the input brain signals and extracts the
functional network properties. On the other hand, knowledge of the neural
phenomenon under study is required to perform physiologically relevant analysis.
The aim of this review is to provide practical indications to make sense of
brain network analysis and contrast counterproductive attitudes
A cluster driven log-volatility factor model: a deepening on the source of the volatility clustering
We introduce a new factor model for log volatilities that performs
dimensionality reduction and considers contributions globally through the
market, and locally through cluster structure and their interactions. We do not
assume a-priori the number of clusters in the data, instead using the Directed
Bubble Hierarchical Tree (DBHT) algorithm to fix the number of factors. We use
the factor model and a new integrated non parametric proxy to study how
volatilities contribute to volatility clustering. Globally, only the market
contributes to the volatility clustering. Locally for some clusters, the
cluster itself contributes statistically to volatility clustering. This is
significantly advantageous over other factor models, since the factors can be
chosen statistically, whilst also keeping economically relevant factors.
Finally, we show that the log volatility factor model explains a similar amount
of memory to a Principal Components Analysis (PCA) factor model and an
exploratory factor model
A temporal precedence based clustering method for gene expression microarray data
Background: Time-course microarray experiments can produce useful data which can help in understanding the underlying dynamics of the system. Clustering is an important stage in microarray data analysis where the data is grouped together according to certain characteristics. The majority of clustering techniques are based on distance or visual similarity measures which may not be suitable for clustering of temporal microarray data where the sequential nature of time is important. We present a Granger causality based technique to cluster temporal microarray gene expression data, which measures the interdependence between two time-series by statistically testing if one time-series can be used for forecasting the other time-series or not.
Results: A gene-association matrix is constructed by testing temporal relationships between pairs of genes using the Granger causality test. The association matrix is further analyzed using a graph-theoretic technique to detect highly connected components representing interesting biological modules. We test our approach on synthesized datasets and real biological datasets obtained for Arabidopsis thaliana. We show the effectiveness of our approach by analyzing the results using the existing biological literature. We also report interesting structural properties of the association network commonly desired in any biological system.
Conclusions: Our experiments on synthesized and real microarray datasets show that our approach produces encouraging results. The method is simple in implementation and is statistically traceable at each step. The method can produce sets of functionally related genes which can be further used for reverse-engineering of gene circuits
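The pipeline described above (pairwise Granger tests, then a graph analysis of the association matrix) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it thresholds the raw F statistic instead of a p-value, and it uses plain connected components in place of the "highly connected components" technique named in the abstract. The function names and the `threshold` parameter are hypothetical.

```python
import numpy as np

def _rss(X, y):
    """Residual sum of squares of an ordinary least-squares fit."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def granger_f(x, y, lag=1):
    """F statistic for 'past values of x improve the forecast of y'."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(y) - lag
    target = y[lag:]
    ones = np.ones((n, 1))
    y_lags = np.column_stack([y[lag - k:len(y) - k] for k in range(1, lag + 1)])
    x_lags = np.column_stack([x[lag - k:len(x) - k] for k in range(1, lag + 1)])
    rss_restricted = _rss(np.hstack([ones, y_lags]), target)
    rss_full = _rss(np.hstack([ones, y_lags, x_lags]), target)
    dof = n - 2 * lag - 1
    return ((rss_restricted - rss_full) / lag) / (rss_full / dof)

def association_matrix(series, threshold, lag=1):
    """A[i, j] is True when series i Granger-predicts series j (F > threshold)."""
    m = len(series)
    A = np.zeros((m, m), dtype=bool)
    for i in range(m):
        for j in range(m):
            if i != j:
                A[i, j] = granger_f(series[i], series[j], lag) > threshold
    return A

def connected_components(adj):
    """Connected components of the undirected version of a boolean graph."""
    adj = np.asarray(adj, dtype=bool)
    sym = adj | adj.T
    seen, comps = set(), []
    for start in range(len(sym)):
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            node = stack.pop()
            comp.append(node)
            for nxt in map(int, np.flatnonzero(sym[node])):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        comps.append(sorted(comp))
    return comps
```

A typical call would be `connected_components(association_matrix(expression_rows, threshold=10.0))`, where `expression_rows` is a hypothetical array with one time series per gene; in practice the threshold would be replaced by a significance level with a correction for multiple testing.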