Laplacian Mixture Modeling for Network Analysis and Unsupervised Learning on Graphs
Laplacian mixture models identify overlapping regions of influence in
unlabeled graph and network data in a scalable and computationally efficient
way, yielding useful low-dimensional representations. By combining Laplacian
eigenspace and finite mixture modeling methods, they provide probabilistic or
fuzzy dimensionality reductions or domain decompositions for a variety of input
data types, including mixture distributions, feature vectors, and graphs or
networks. Provable optimal recovery using the algorithm is analytically shown
for a nontrivial class of cluster graphs. Heuristic approximations for scalable
high-performance implementations are described and empirically tested.
Connections to PageRank and community detection in network analysis demonstrate
the wide applicability of this approach. The origins of fuzzy spectral methods,
beginning with generalized heat or diffusion equations in physics, are reviewed
and summarized. Comparisons to other dimensionality reduction and clustering
methods for challenging unsupervised machine learning problems are also
discussed.
Comment: 13 figures, 35 references
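The combination of a Laplacian eigenspace embedding with soft, mixture-style assignments described in this abstract can be illustrated with a toy sketch. This is not the paper's algorithm: the example graph, the use of a single Fiedler coordinate, and the softmax-style memberships are all illustrative assumptions.

```python
import numpy as np

def laplacian_embedding(A, k):
    """Embed nodes via the k smallest nontrivial eigenvectors of the
    unnormalized graph Laplacian L = D - A."""
    L = np.diag(A.sum(axis=1)) - A
    _, vecs = np.linalg.eigh(L)
    return vecs[:, 1:k + 1]  # skip the constant eigenvector (eigenvalue 0)

def soft_assign(X, centers, beta=5.0):
    """Fuzzy memberships from squared distances to centers (softmax);
    beta controls how sharp the assignments are."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    w = np.exp(-beta * d2)
    return w / w.sum(axis=1, keepdims=True)

# Two triangles joined by a single bridge edge (2-3): two fuzzy regions.
A = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

X = laplacian_embedding(A, 1)   # 1-D Fiedler embedding of the 6 nodes
R = soft_assign(X, X[[0, 5]])   # centers seeded from opposite triangles
```

The bridge nodes 2 and 3 receive the most mixed memberships, which is the "overlapping regions of influence" behavior the abstract alludes to.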
Information geometric methods for complexity
Research on the use of information geometry (IG) in modern physics has
witnessed significant advances recently. In this review article, we report on
the utilization of IG methods to define measures of complexity in both
classical and, whenever available, quantum physical settings. A paradigmatic
example of a dramatic change in complexity is given by phase transitions (PTs).
Hence we review both global and local aspects of PTs described in terms of the
scalar curvature of the parameter manifold and the components of the metric
tensor, respectively. We also report on the behavior of geodesic paths on the
parameter manifold used to gain insight into the dynamics of PTs. Going
further, we survey measures of complexity arising in the geometric framework.
In particular, we quantify complexity of networks in terms of the Riemannian
volume of the parameter space of a statistical manifold associated with a given
network. We are also concerned with complexity measures that account for the
interactions of a given number of parts of a system that cannot be described in
terms of a smaller number of parts of the system. Finally, we investigate
complexity measures of entropic motion on curved statistical manifolds that
arise from a probabilistic description of physical systems in the presence of
limited information. The Kullback-Leibler divergence, the distance to an
exponential family and volumes of curved parameter manifolds, are examples of
essential IG notions exploited in our discussion of complexity. We conclude by
discussing strengths, limits, and possible future applications of IG methods to
the physics of complexity.
Comment: review article, 60 pages, no figures
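The role of the Kullback-Leibler divergence as a local quadratic form induced by the Fisher metric, one of the essential IG notions mentioned above, can be checked numerically. The Bernoulli family and the parameter values below are illustrative choices, not taken from the review.

```python
import numpy as np

def kl(p, q):
    """Kullback-Leibler divergence D(p || q) for discrete distributions."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p / q)))

# For a Bernoulli(theta) family the Fisher information is
# I(theta) = 1 / (theta * (1 - theta)), and for a small parameter shift eps
# the divergence behaves as D(theta || theta + eps) ~ 0.5 * I(theta) * eps**2.
theta, eps = 0.3, 1e-3
exact = kl([theta, 1 - theta], [theta + eps, 1 - theta - eps])
approx = 0.5 * (1.0 / (theta * (1 - theta))) * eps ** 2
```

This is the sense in which the KL divergence endows a statistical parameter manifold with a Riemannian (Fisher) metric, whose curvature the review then uses to characterize complexity.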
25 Years of Self-Organized Criticality: Solar and Astrophysics
Shortly after the seminal paper "Self-Organized Criticality: An
explanation of 1/f noise" by Bak, Tang, and Wiesenfeld (1987), the idea was
applied to solar physics in "Avalanches and the Distribution of Solar
Flares" by Lu and Hamilton (1991). In the following years, an inspiring
cross-fertilization from complexity theory to solar and astrophysics took
place, where the SOC concept was initially applied to solar flares, stellar
flares, and magnetospheric substorms, and later extended to the radiation belt,
the heliosphere, lunar craters, the asteroid belt, the Saturn ring, pulsar
glitches, soft X-ray repeaters, blazars, black-hole objects, cosmic rays, and
boson clouds. The application of SOC concepts has been performed by numerical
cellular automaton simulations, by analytical calculations of statistical
(power-law-like) distributions based on physical scaling laws, and by
observational tests of theoretically predicted size distributions and waiting
time distributions. Attempts have been undertaken to import physical models
into the numerical SOC toy models, such as the discretization of
magnetohydrodynamic (MHD) processes. These novel applications also
stimulated vigorous debates about how to discriminate between SOC models,
SOC-like processes, and non-SOC processes, such as phase transitions,
turbulence, random-walk diffusion, percolation, branching processes,
network theory, chaos theory, fractality, multi-scale behavior, and other
complexity phenomena. We review SOC studies
from the last 25 years and highlight new trends, open questions, and future
challenges, as discussed during two recent ISSI workshops on this theme.
Comment: 139 pages, 28 figures; review based on the ISSI workshops
"Self-Organized Criticality and Turbulence" (2012, 2013, Bern, Switzerland)
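The numerical cellular automaton simulations the abstract refers to trace back to the Bak-Tang-Wiesenfeld sandpile. A minimal sketch follows; the grid size, grain count, seed, and toppling threshold of 4 are the standard toy choices, not the setup of any particular study.

```python
import random
import numpy as np

def btw_sandpile(n=16, grains=3000, seed=0):
    """Drive a 2D Bak-Tang-Wiesenfeld sandpile with open boundaries and
    record the avalanche size (number of topplings) for each grain added."""
    rng = random.Random(seed)
    z = np.zeros((n, n), dtype=int)
    sizes = []
    for _ in range(grains):
        i, j = rng.randrange(n), rng.randrange(n)
        z[i, j] += 1
        topplings = 0
        stack = [(i, j)]
        while stack:
            x, y = stack.pop()
            if z[x, y] < 4:
                continue  # already relaxed by an earlier toppling
            z[x, y] -= 4
            topplings += 1
            if z[x, y] >= 4:
                stack.append((x, y))  # still unstable, topple again
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nx, ny = x + dx, y + dy
                if 0 <= nx < n and 0 <= ny < n:  # sand off the grid is lost
                    z[nx, ny] += 1
                    if z[nx, ny] >= 4:
                        stack.append((nx, ny))
        sizes.append(topplings)
    return sizes

sizes = btw_sandpile()
```

In the self-organized critical state the recorded avalanche sizes span many scales, which is what the power-law-like size distributions discussed in the review quantify.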
Single Cell Proteomics in Biomedicine: High-dimensional Data Acquisition, Visualization and Analysis
New insights into cellular heterogeneity over the last decade have provoked the development of a variety of single-cell omics tools at a lightning pace. The resultant high-dimensional single-cell data generated by these tools require new theoretical approaches and analytical algorithms for effective visualization and interpretation. In this review, we briefly survey state-of-the-art single-cell proteomic tools, with a particular focus on data acquisition and quantification, followed by an elaboration of a number of statistical and computational approaches developed to date for dissecting the high-dimensional single-cell data. The underlying assumptions, unique features, and limitations of these analytical methods, together with the biological questions they seek to answer, will be discussed. Particular attention will be given to information-theoretical approaches that are anchored in a set of first principles of physics and can yield detailed (and often surprising) predictions.
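One of the simplest dimensionality-reduction tools applied to such high-dimensional cells-by-features matrices is PCA. The sketch below runs it on a synthetic matrix; the dimensions and two-cluster structure are fabricated purely for illustration and do not come from the review.

```python
import numpy as np

def pca(X, k=2):
    """Project the rows of X onto the top-k principal components via SVD
    of the column-centered data matrix."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

rng = np.random.default_rng(0)
# Toy "single-cell" matrix: 200 cells x 50 features, two shifted clusters.
X = rng.normal(size=(200, 50))
X[:100, 0] += 5.0            # first 100 cells form a separated subpopulation
Y = pca(X, k=2)              # 2-D coordinates suitable for a scatter plot
```

The first principal component recovers the cluster split, the kind of low-dimensional visualization of cellular subpopulations the review surveys.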
A survey of statistical network models
Networks are ubiquitous in science and have become a focal point for
discussion in everyday life. Formal statistical models for the analysis of
network data have emerged as a major topic of interest in diverse areas of
study, and most of these involve a form of graphical representation.
Probability models on graphs date back to 1959. Along with empirical studies in
social psychology and sociology from the 1960s, these early works generated an
active network community and a substantial literature in the 1970s. This effort
moved into the statistical literature in the late 1970s and 1980s, and the past
decade has seen a burgeoning network literature in statistical physics and
computer science. The growth of the World Wide Web and the emergence of online
networking communities such as Facebook, MySpace, and LinkedIn, and a host of
more specialized professional network communities has intensified interest in
the study of networks and network data. Our goal in this review is to provide
the reader with an entry point to this burgeoning literature. We begin with an
overview of the historical development of statistical network modeling and then
we introduce a number of examples that have been studied in the network
literature. Our subsequent discussion focuses on a number of prominent static
and dynamic network models and their interconnections. We emphasize formal
model descriptions, and pay special attention to the interpretation of
parameters and their estimation. We end with a description of some open
problems and challenges for machine learning and statistics.
Comment: 96 pages, 14 figures, 333 references
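The probability models on graphs dating back to 1959 mentioned above are the Erdős–Rényi (and Gilbert) random graphs. A minimal G(n, p) sampler is sketched below; the parameter values are illustrative.

```python
import random

def gnp(n, p, seed=0):
    """Sample an Erdos-Renyi random graph G(n, p): each of the n*(n-1)/2
    possible undirected edges appears independently with probability p."""
    rng = random.Random(seed)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if rng.random() < p]

edges = gnp(100, 0.05, seed=1)  # expected edge count: 0.05 * 4950 = 247.5
```

This independent-edge model is the baseline against which the richer static and dynamic network models surveyed in the review (with dependence between edges and interpretable parameters) are usually contrasted.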