Characterization of complex networks: A survey of measurements
Each complex network (or class of networks) presents specific topological
features which characterize its connectivity and highly influence the dynamics
of processes executed on the network. The analysis, discrimination, and
synthesis of complex networks therefore rely on the use of measurements capable
of expressing the most relevant topological features. This article presents a
survey of such measurements. It includes general considerations about complex
network characterization, a brief review of the principal models, and the
presentation of the main existing measurements. Important related issues
covered in this work comprise the representation of the evolution of complex
networks in terms of trajectories in several measurement spaces, the analysis
of the correlations between some of the most traditional measurements,
perturbation analysis, as well as the use of multivariate statistics for
feature selection and network classification. Depending on the network and the
analysis task one has in mind, a specific set of features may be chosen. It is
hoped that the present survey will help the proper application and
interpretation of measurements.
Comment: A working manuscript with 78 pages, 32 figures. Suggestions of
measurements for inclusion are welcomed by the author.
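As a rough illustration of what a topological measurement looks like in practice, the sketch below computes two widely used ones, node degree and the local clustering coefficient, from an adjacency matrix. The abstract does not name specific measurements, so this choice and the function names `degrees` and `local_clustering` are assumptions made for illustration; an undirected, unweighted graph without self-loops is assumed.

```python
import numpy as np

def degrees(adj):
    """Degree of each node: the number of neighbours (row sums)."""
    return adj.sum(axis=1)

def local_clustering(adj):
    """Fraction of each node's neighbour pairs that are themselves linked."""
    k = adj.sum(axis=1)
    # diag(A^3)[i] counts closed walks of length 3 through node i,
    # which is twice the number of triangles containing i.
    triangles = np.diag(adj @ adj @ adj) / 2
    possible = k * (k - 1) / 2
    # Nodes with fewer than two neighbours get a coefficient of 0.
    return np.divide(triangles, possible,
                     out=np.zeros(len(k)), where=possible > 0)

# Example graph (assumed): a triangle 0-1-2 with a pendant node 3 attached to 2.
adj = np.array([[0, 1, 1, 0],
                [1, 0, 1, 0],
                [1, 1, 0, 1],
                [0, 0, 1, 0]])
print(degrees(adj))
print(local_clustering(adj))
```

Stacking such per-node or per-network values into a feature vector is one simple way to place a network in a measurement space.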
Multivariate Approaches to Classification in Extragalactic Astronomy
Clustering objects into synthetic groups is a natural activity of any
science. Astrophysics is not an exception and is now facing a deluge of data.
For galaxies, the century-old Hubble classification and the Hubble tuning
fork are still largely in use, together with numerous mono- or bivariate
classifications most often made by eye. However, a classification must be
driven by the data, and sophisticated multivariate statistical tools are used
more and more often. In this paper we review these different approaches in
order to situate them in the general context of unsupervised and supervised
learning. We insist on the astrophysical outcomes of these studies to show that
multivariate analyses provide an obvious path toward a renewal of our
classification of galaxies and are invaluable tools to investigate the physics
and evolution of galaxies.
Comment: Open Access paper.
http://www.frontiersin.org/milky_way_and_galaxies/10.3389/fspas.2015.00003/abstract
<10.3389/fspas.2015.00003>
Representing complex data using localized principal components with application to astronomical data
Often the relation between the variables constituting a multivariate data
space might be characterized by one or more of the terms: "nonlinear",
"branched", "disconnected", "bended", "curved", "heterogeneous", or,
more generally, "complex". In these cases, simple principal component analysis
(PCA) as a tool for dimension reduction can fail badly. Of the many alternative
approaches proposed so far, local approximations of PCA are among the most
promising. This paper will give a short review of localized versions of PCA,
focusing on local principal curves and local partitioning algorithms.
Furthermore we discuss projections other than the local principal components.
When performing local dimension reduction for regression or classification
problems it is important to focus not only on the manifold structure of the
covariates, but also on the response variable(s). Local principal components
only achieve the former, whereas localized regression approaches concentrate on
the latter. Local projection directions derived from the partial least squares
(PLS) algorithm offer an interesting trade-off between these two objectives. We
apply these methods to several real data sets. In particular, we consider
simulated astrophysical data from the future Galactic survey mission Gaia.
Comment: 25 pages. In "Principal Manifolds for Data Visualization and
Dimension Reduction", A. Gorban, B. Kegl, D. Wunsch, and A. Zinovyev (eds),
Lecture Notes in Computational Science and Engineering, Springer, 2007, pp.
180-204,
http://www.springer.com/dal/home/generic/search/results?SGWID=1-40109-22-173750210-
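The local partitioning idea mentioned in this abstract can be sketched as follows: split the data into regions, then fit an ordinary PCA inside each region. This is a minimal illustration, not the specific algorithms the paper reviews; the use of k-means for the partition, the farthest-point initialization, and the names `kmeans_labels` and `local_pca` are all assumptions.

```python
import numpy as np

def kmeans_labels(X, k, n_iter=50):
    """Plain Lloyd's k-means with a deterministic farthest-point start."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[int(np.argmax(d))])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster empties
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def local_pca(X, k, n_components=1):
    """Fit a separate PCA (via SVD) in each of k partitions of X."""
    labels = kmeans_labels(X, k)
    models = {}
    for j in range(k):
        pts = X[labels == j]
        if len(pts) == 0:
            continue
        mean = pts.mean(axis=0)
        # Rows of vt are the principal directions of this partition.
        _, _, vt = np.linalg.svd(pts - mean, full_matrices=False)
        models[j] = (mean, vt[:n_components])
    return labels, models
```

On data made of two elongated clouds pointing in different directions, a single global PCA returns one compromise axis, whereas each local model recovers the direction of its own cloud.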
Topics in social network analysis and network science
This chapter introduces statistical methods used in the analysis of social
networks and in the rapidly evolving parallel field of network science.
Although several instances of social network analysis in health services
research have appeared recently, the majority involve only the most basic
methods and thus scratch the surface of what might be accomplished.
Cutting-edge methods using relevant examples and illustrations in health
services research are provided.
Inferring the photometric and size evolution of galaxies from image simulations
Current constraints on models of galaxy evolution rely on morphometric
catalogs extracted from multi-band photometric surveys. However, these catalogs
are altered by selection effects that are difficult to model, that correlate in
nontrivial ways, and that can lead to contradictory predictions if not taken
into account carefully. To address this issue, we have developed a new approach
combining parametric Bayesian indirect likelihood (pBIL) techniques and
empirical modeling with realistic image simulations that reproduce a large
fraction of these selection effects. This allows us to perform a direct
comparison between observed and simulated images and to infer robust
constraints on model parameters. We use a semi-empirical forward model to
generate a distribution of mock galaxies from a set of physical parameters.
These galaxies are passed through an image simulator reproducing the
instrumental characteristics of any survey and are then extracted in the same
way as the observed data. The discrepancy between the simulated and observed
data is quantified, and minimized with a custom sampling process based on
adaptive Monte Carlo Markov Chain methods. Using synthetic data matching most
of the properties of a CFHTLS Deep field, we demonstrate the robustness and
internal consistency of our approach by inferring the parameters governing the
size and luminosity functions and their evolutions for different realistic
populations of galaxies. We also compare the results of our approach with those
obtained from the classical spectral energy distribution fitting and
photometric redshift approach. Our pipeline efficiently infers the luminosity
and size distribution and evolution parameters with a very limited number of
observables (3 photometric bands). When compared to SED fitting based on the
same set of observables, our method yields results that are more accurate and
free from systematic biases.
Comment: 24 pages, 12 figures, accepted for publication in A&
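The simulate, compare, and sample loop described in this abstract can be illustrated with a toy example. The sketch below is a simplified stand-in and not the authors' pBIL pipeline: it uses a plain (non-adaptive) Metropolis sampler, a one-parameter forward model, and a single summary statistic. The functions `simulate`, `discrepancy`, and `metropolis_fit`, and the tolerance `eps`, are hypothetical names introduced here.

```python
import numpy as np

def simulate(mu, noise):
    # Toy forward model (assumed): a mock "catalog" scattered around mu.
    return mu + noise

def discrepancy(sim, obs):
    # Squared difference of one summary statistic, the sample mean.
    return (sim.mean() - obs.mean()) ** 2

def metropolis_fit(obs, n_steps=3000, step=0.2, eps=0.05, seed=0):
    """Sample mu with a pseudo-likelihood exp(-discrepancy / (2 * eps**2))."""
    rng = np.random.default_rng(seed)
    # Reuse the same noise for every mu so the discrepancy is deterministic.
    noise = rng.normal(size=obs.size)
    mu = 0.0
    log_p = -discrepancy(simulate(mu, noise), obs) / (2 * eps**2)
    chain = np.empty(n_steps)
    for i in range(n_steps):
        cand = mu + step * rng.normal()
        cand_log_p = -discrepancy(simulate(cand, noise), obs) / (2 * eps**2)
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if rng.uniform() < np.exp(min(0.0, cand_log_p - log_p)):
            mu, log_p = cand, cand_log_p
        chain[i] = mu
    return chain

# Usage: recover the mean of synthetic "observations" centred on 2.0.
obs = np.random.default_rng(1).normal(2.0, 1.0, 500)
chain = metropolis_fit(obs)
print(chain[1500:].mean())  # second half of the chain, after burn-in
```

In a realistic setting the forward model would be an image simulation followed by source extraction, and the discrepancy would compare distributions of several observables rather than a single mean.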
Inference-based statistical network analysis uncovers star-like brain functional architectures for internalizing psychopathology in children
To improve the statistical power for imaging biomarker detection, we propose
a latent variable-based statistical network analysis (LatentSNA) that combines
brain functional connectivity with internalizing psychopathology, implementing
network science in a generative statistical process to preserve the
neurologically meaningful network topology in the adolescent and child
population. The developed inference-focused generative Bayesian framework (1)
addresses the lack of power and inflated Type II errors in current analytic
approaches when detecting imaging biomarkers, (2) allows unbiased estimation of
biomarkers' influence on behavior variants, (3) quantifies the uncertainty and
evaluates the likelihood of the estimated biomarker effects against chance and
(4) ultimately improves brain-behavior prediction in novel samples and the
clinical utilities of neuroimaging findings. We collectively model multi-state
functional networks with multivariate internalizing profiles for 5,000 to 7,000
children in the Adolescent Brain Cognitive Development (ABCD) study with
sufficiently accurate prediction of both children's internalizing traits and
functional connectivity, and substantially improve our ability to explain the
individual internalizing differences compared with current approaches. We
successfully uncover large, coherent star-like brain functional architectures
associated with children's internalizing psychopathology across multiple
functional systems and establish them as unique fingerprints for childhood
internalization.
Learning Latent Tree Graphical Models
We study the problem of learning a latent tree graphical model where samples
are available only from a subset of variables. We propose two consistent and
computationally efficient algorithms for learning minimal latent trees, that
is, trees without any redundant hidden nodes. Unlike many existing methods, the
observed nodes (or variables) are not constrained to be leaf nodes. Our first
algorithm, recursive grouping, builds the latent tree recursively by
identifying sibling groups using so-called information distances. One of the
main contributions of this work is our second algorithm, which we refer to as
CLGrouping. CLGrouping starts with a pre-processing procedure in which a tree
over the observed variables is constructed. This global step groups the
observed nodes that are likely to be close to each other in the true latent
tree, thereby guiding subsequent recursive grouping (or equivalent procedures)
on much smaller subsets of variables. This results in more accurate and
efficient learning of latent trees. We also present regularized versions of our
algorithms that learn latent tree approximations of arbitrary distributions. We
compare the proposed algorithms to other methods by performing extensive
numerical experiments on various latent tree graphical models such as hidden
Markov models and star graphs. In addition, we demonstrate the applicability of
our methods on real-world datasets by modeling the dependency structure of
monthly stock returns in the S&P index and of the words in the 20 newsgroups
dataset.
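The pre-processing step of CLGrouping, building a tree over the observed variables, can be sketched under some stated assumptions. The abstract does not define the information distance or say how the tree is built; the sketch below assumes Gaussian variables, takes the distance between two variables as the negative log of their absolute correlation, and builds a minimum spanning tree on those distances with Prim's algorithm. The function names are hypothetical, and the later recursive grouping step that introduces hidden nodes is not shown.

```python
import numpy as np

def information_distances(samples):
    """Pairwise d_ij = -log|rho_ij| from an (n_samples, n_vars) array."""
    rho = np.corrcoef(samples, rowvar=False)
    # Clip to avoid log(0) for numerically uncorrelated pairs.
    return -np.log(np.clip(np.abs(rho), 1e-12, 1.0))

def minimum_spanning_tree(dist):
    """Prim's algorithm from node 0; returns a list of (parent, child) edges."""
    n = dist.shape[0]
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best = dist[0].copy()            # cheapest known link into the tree
    parent = np.zeros(n, dtype=int)  # tree node that provides that link
    edges = []
    for _ in range(n - 1):
        j = int(np.argmin(np.where(in_tree, np.inf, best)))
        edges.append((int(parent[j]), j))
        in_tree[j] = True
        # Nodes still outside the tree may now be closer via j.
        update = (dist[j] < best) & ~in_tree
        best[update] = dist[j][update]
        parent[update] = j
    return edges
```

For samples drawn from a Markov chain x0, x1, x2, x3, the recovered edges should follow the chain, since correlations decay with the number of steps between variables.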