Multivariate Spatiotemporal Hawkes Processes and Network Reconstruction
There is often latent network structure in spatial and temporal data and the
tools of network analysis can yield fascinating insights into such data. In
this paper, we develop a nonparametric method for network reconstruction from
spatiotemporal data sets using multivariate Hawkes processes. In contrast to
prior work on network reconstruction with point-process models, which has often
focused on exclusively temporal information, our approach uses both temporal
and spatial information and does not assume a specific parametric form of
network dynamics. This leads to an effective way of recovering an underlying
network. We illustrate our approach using both synthetic networks and networks
constructed from real-world data sets (a location-based social media network, a
narrative of crime events, and violent gang crimes). Our results demonstrate
that, in comparison to using only temporal data, our spatiotemporal approach
yields improved network reconstruction, providing a basis for meaningful
subsequent analysis --- such as community structure and motif analysis --- of
the reconstructed networks.
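As a rough illustration of the kind of model this abstract refers to, the sketch below evaluates the conditional intensity of one node in a multivariate spatiotemporal Hawkes process. It is not the paper's method: the paper estimates the triggering structure nonparametrically, whereas this sketch assumes an exponential kernel in time and an isotropic Gaussian kernel in space purely for concreteness. The function name `hawkes_intensity` and the parameters `mu`, `K`, `omega`, and `sigma` are hypothetical.

```python
import math


def hawkes_intensity(u, t, x, y, events, mu, K, omega=1.0, sigma=1.0):
    """Conditional intensity of node u at time t and location (x, y).

    events : list of (t_i, x_i, y_i, u_i) tuples (time, location, node index)
    mu     : list of constant background rates, one per node (assumption)
    K      : K[v][u] is the expected number of events on node u that are
             triggered by one event on node v (the latent network weights)
    omega  : decay rate of the assumed exponential temporal kernel
    sigma  : bandwidth of the assumed Gaussian spatial kernel
    """
    rate = mu[u]
    for t_i, x_i, y_i, u_i in events:
        if t_i >= t:
            continue  # only strictly earlier events can trigger
        dt = t - t_i
        d2 = (x - x_i) ** 2 + (y - y_i) ** 2
        # Both kernels integrate to one, so K carries the full triggering weight.
        temporal = omega * math.exp(-omega * dt)
        spatial = math.exp(-d2 / (2.0 * sigma ** 2)) / (2.0 * math.pi * sigma ** 2)
        rate += K[u_i][u] * temporal * spatial
    return rate


# Two nodes; events on node 0 can trigger events on node 1.
mu = [0.1, 0.2]
K = [[0.0, 0.5], [0.0, 0.0]]
events = [(0.0, 0.0, 0.0, 0)]
rate_node_1 = hawkes_intensity(1, 1.0, 0.0, 0.0, events, mu, K)
```

In this reading, reconstructing the network amounts to estimating the matrix `K` from observed events; a temporal-only model would drop the spatial factor.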
Unified functional network and nonlinear time series analysis for complex systems science: The pyunicorn package
We introduce the \texttt{pyunicorn} (Pythonic unified complex network and
recurrence analysis toolbox) open source software package for applying and
combining modern methods of data analysis and modeling from complex network
theory and nonlinear time series analysis. \texttt{pyunicorn} is a fully
object-oriented and easily parallelizable package written in the language
Python. It allows for the construction of functional networks such as climate
networks in climatology or functional brain networks in neuroscience
representing the structure of statistical interrelationships in large data sets
of time series and, subsequently, investigating this structure using advanced
methods of complex network theory such as measures and models for spatial
networks, networks of interacting networks, node-weighted statistics or network
surrogates. Additionally, \texttt{pyunicorn} provides insights into the
nonlinear dynamics of complex systems as recorded in uni- and multivariate time
series from a non-traditional perspective by means of recurrence quantification
analysis (RQA), recurrence networks, visibility graphs and construction of
surrogate time series. The range of possible applications of the library is
outlined, drawing on several examples mainly from the field of climatology.
Comment: 28 pages, 17 figures
Spectral gene set enrichment (SGSE)
Motivation: Gene set testing is typically performed in a supervised context
to quantify the association between groups of genes and a clinical phenotype.
In many cases, however, a gene set-based interpretation of genomic data is
desired in the absence of a phenotype variable. Although methods exist for
unsupervised gene set testing, they predominantly compute enrichment relative
to clusters of the genomic variables with performance strongly dependent on the
clustering algorithm and number of clusters. Results: We propose a novel
method, spectral gene set enrichment (SGSE), for unsupervised competitive
testing of the association between gene sets and empirical data sources. SGSE
first computes the statistical association between gene sets and principal
components (PCs) using our principal component gene set enrichment (PCGSE)
method. The overall statistical association between each gene set and the
spectral structure of the data is then computed by combining the PC-level
p-values using the weighted Z-method with weights set to the PC variance scaled
by Tracy-Widom test p-values. Using simulated data, we show that the SGSE
algorithm can accurately recover spectral features from noisy data. To
illustrate the utility of our method on real data, we demonstrate the superior
performance of the SGSE method relative to standard cluster-based techniques
for testing the association between MSigDB gene sets and the variance structure
of microarray gene expression data. Availability:
http://cran.r-project.org/web/packages/PCGSE/index.html
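The combination step described in this abstract, the weighted Z-method, can be sketched as follows. This is a minimal standalone illustration, not the PCGSE package's code: the function name `weighted_z` is hypothetical, the p-values are assumed to be one-sided, and the weights are taken as given. The abstract says SGSE derives them from the PC variances scaled by Tracy-Widom test p-values, a step not reproduced here.

```python
import math
from statistics import NormalDist


def weighted_z(p_values, weights, eps=1e-12):
    """Combine one-sided p-values with the weighted Z-method.

    Each p-value is mapped to a standard normal score z_i = Phi^{-1}(1 - p_i);
    the combined score is sum(w_i * z_i) / sqrt(sum(w_i ** 2)).
    """
    if len(p_values) != len(weights):
        raise ValueError("p_values and weights must have the same length")
    norm = math.sqrt(sum(w * w for w in weights))
    if norm == 0.0:
        raise ValueError("at least one weight must be non-zero")
    nd = NormalDist()
    # Clip to the open interval (0, 1) so the inverse CDF stays finite.
    z = [nd.inv_cdf(1.0 - min(max(p, eps), 1.0 - eps)) for p in p_values]
    combined = sum(w * zi for w, zi in zip(weights, z)) / norm
    return 1.0 - nd.cdf(combined)


# Hypothetical PC-level p-values for one gene set, with made-up weights.
pc_p_values = [0.01, 0.20, 0.65]
pc_weights = [3.0, 1.5, 0.5]
gene_set_p = weighted_z(pc_p_values, pc_weights)
```

Giving larger weights to leading PCs lets strong associations with high-variance components dominate the combined p-value.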
netgwas: An R Package for Network-Based Genome-Wide Association Studies
Graphical models are powerful tools for modeling and making statistical
inferences regarding complex associations among variables in multivariate data.
In this paper we introduce the R package netgwas, which is designed based on
undirected graphical models to accomplish three important and interrelated
goals in genetics: constructing linkage maps, reconstructing linkage
disequilibrium (LD) networks from multi-loci genotype data, and detecting
high-dimensional genotype-phenotype networks. The netgwas package deals with
species with any chromosome copy number in a unified way, unlike other
software. It implements recent improvements in both linkage map construction
(Behrouzi and Wit, 2018) and the reconstruction of conditional independence networks
for non-Gaussian continuous data, discrete data, and mixed
discrete-and-continuous data (Behrouzi and Wit, 2017). Such datasets, for
example genotype data and genotype-phenotype data, routinely occur in genetics
and genomics. We demonstrate the value of our package functionality by applying it to
various multivariate example datasets taken from the literature. We show, in
particular, that our package allows a more realistic analysis of data, as it
adjusts for the effect of all other variables while performing pairwise
associations. This feature controls for spurious associations between variables
that can arise from the classical multiple testing approach. This paper includes a
brief overview of the statistical methods which have been implemented in the
package. The main body of the paper explains how to use the package. The
package uses a parallelization strategy on multi-core processors to speed up
computations for large datasets. In addition, it contains several functions for
simulation and visualization. The netgwas package is freely available at
https://cran.r-project.org/web/packages/netgwas
Comment: 32 pages, 9 figures; due to the limitation "The abstract field cannot
be longer than 1,920 characters", the abstract appearing here is slightly
shorter than that in the PDF file
How complex climate networks complement eigen techniques for the statistical analysis of climatological data
Eigen techniques such as empirical orthogonal function (EOF) or coupled
pattern (CP) / maximum covariance analysis have been frequently used for
detecting patterns in multivariate climatological data sets. Recently,
statistical methods originating from the theory of complex networks have been
employed for the very same purpose of spatio-temporal analysis. This climate
network (CN) analysis is usually based on the same set of similarity matrices
as is used in classical EOF or CP analysis, e.g., the correlation matrix of a
single climatological field or the cross-correlation matrix between two
distinct climatological fields. In this study, formal relationships as well as
conceptual differences between both eigen and network approaches are derived
and illustrated using exemplary global precipitation, evaporation and surface
air temperature data sets. These results show that CN analysis can
complement classical eigen techniques and provides additional information on
the higher-order structure of statistical interrelationships in climatological
data. Hence, CNs are a valuable supplement to the statistical toolbox of the
climatologist, particularly for making sense out of very large data sets such
as those generated by satellite observations and climate model intercomparison
exercises.
Comment: 18 pages, 11 figures
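The shared starting point mentioned in this abstract, a single correlation matrix feeding both an eigen analysis and a network analysis, can be sketched with NumPy. This is a toy illustration under stated assumptions rather than the study's procedure: the function name `eof_and_network`, the absolute-correlation threshold of 0.5, and the synthetic data are all hypothetical choices.

```python
import numpy as np


def eof_and_network(data, threshold=0.5):
    """Derive EOF-style eigenpairs and a climate-network-style graph.

    data      : array of shape (n_time, n_nodes), one time series per column
    threshold : links are drawn where |correlation| exceeds this value
    """
    corr = np.corrcoef(data, rowvar=False)

    # Eigen view: eigenvectors of the correlation matrix, largest variance first.
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Network view: threshold the same matrix into an unweighted adjacency matrix.
    adjacency = (np.abs(corr) > threshold).astype(int)
    np.fill_diagonal(adjacency, 0)  # no self-loops
    degree = adjacency.sum(axis=0)
    return eigvals, eigvecs, adjacency, degree


# Synthetic example: two strongly related series and one independent series.
rng = np.random.default_rng(0)
base = rng.standard_normal(500)
series = np.column_stack([
    base,
    base + 0.1 * rng.standard_normal(500),
    rng.standard_normal(500),
])
eigvals, eigvecs, adjacency, degree = eof_and_network(series)
```

The eigenpairs summarize the dominant linear patterns, while measures computed on `adjacency`, such as the node degree, describe how each location is embedded in the thresholded correlation structure.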