59 research outputs found
Statistical Mechanics of Community Detection
Starting from a general \textit{ansatz}, we show how community detection can
be interpreted as finding the ground state of an infinite range spin glass. Our
approach applies to weighted and directed networks alike. It contains the
\textit{ad hoc} introduced quality function from \cite{ReichardtPRL} and the
modularity as defined by Newman and Girvan \cite{Girvan03} as special
cases. The community structure of the network is interpreted as the spin
configuration that minimizes the energy of the spin glass with the spin states
being the community indices. We elucidate the properties of the ground state
configuration to give a concise definition of communities as cohesive subgroups
in networks that is adaptive to the specific class of network under study.
Further, we show how hierarchies and overlap in the community structure can be
detected. Computationally effective local update rules for optimization
procedures to find the ground state are given. We show how the \textit{ansatz}
may be used to discover the community around a given node without detecting all
communities in the full network and we give benchmarks for the performance of
this extension. Finally, we give expectation values for the modularity of
random graphs, which can be used in the assessment of statistical significance
of community structure.
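As a small illustration of the modularity that the abstract treats as a special case, the following sketch computes the Newman-Girvan modularity of a given partition. It assumes an undirected, unweighted graph given as an edge list; the function name `modularity` and the toy graph are hypothetical and not taken from the paper, and the spin-glass optimization itself is not shown.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman-Girvan modularity Q of a partition.

    edges:     list of (u, v) pairs of an undirected, unweighted graph
    community: dict mapping each node to its community index
    """
    m = len(edges)
    internal = defaultdict(int)    # edges with both endpoints in the community
    degree_sum = defaultdict(int)  # summed degree of the community's nodes
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    # Q = sum over communities of (fraction of internal edges)
    #     minus (expected fraction under random rewiring with fixed degrees)
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Toy example (an assumption for illustration): two triangles joined by one edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
one_group = {node: 0 for node in range(6)}

q_two = modularity(edges, two_groups)
q_one = modularity(edges, one_group)
```

Splitting the toy graph into its two triangles gives Q = 5/14, while placing all nodes in a single community gives Q = 0, which matches the intuition that modularity rewards partitions with more internal edges than expected at random.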
Partitioning and modularity of graphs with arbitrary degree distribution
We solve the graph bi-partitioning problem in dense graphs with arbitrary
degree distribution using the replica method. We find the cut-size to scale
universally with $\langle \sqrt{k} \rangle$. In contrast, earlier results studying the problem in
graphs with a Poissonian degree distribution had found a scaling with $\langle k \rangle^{1/2}$
[Fu and Anderson, J. Phys. A: Math. Gen. 19, 1986]. The new results also
generalize to the problem of q-partitioning. They can be used to find the
expected modularity Q [Newman and Girvan, Phys. Rev. E, 69, 2004] of random
graphs and allow for the assessment of statistical significance of the output
of community detection algorithms. Comment: Revised version including new plots and improved discussion of some
mathematical details
Increase in consumption of alcohol-based hand rub in German acute care hospitals over a 12 year period
Background: Hand hygiene plays a crucial role in limiting the transmission of pathogens and in preventing healthcare-associated infections. In 2007, a voluntary national electronic surveillance tool (HAND-KISS) for documenting the consumption of alcohol-based hand rub (AHC) was introduced as a surrogate for hand hygiene compliance and to provide benchmark data as feedback. The aim of the study was to determine the trend in alcohol-based hand rub consumption between 2007 and 2018.
Materials and methods: In this cohort study, AHC and patient days (PD) were documented on every ward in participating hospitals by trained local staff. Data were collected and validated in HAND-KISS. Intensive care units (ICU), intermediate care units (IMC), and regular wards (RW) that provided data during the study period from 2007 to 2018 were included in the study.
Results: In 2018, 75.2% of acute care hospitals in Germany (n=1,460) participated. On ICUs (n=1998) mean AHC increased 1.74 fold (95%CI 1.71, 1.76; p<.0001) from 79.2ml/PD to 137.4ml/PD. On IMCs (n=475) AHC increased 1.69 fold (95%CI 1.60, 1.79; p<.0001) from 41.4ml/PD to 70.6ml/PD. On RWs (n=14,857) AHC was 19.0ml/PD in 2007 and increased 1.71 fold (95%CI 1.70, 1.73; p<.0001) to 32.6ml/PD in 2018.
Conclusions: AHC in German hospitals increased on all types of wards during the past 12 years. Surveillance of AHC is widely established in German hospitals. Large differences among medical specialties exist and warrant further investigation.
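As a quick arithmetic check on the reported figures, the sketch below divides the 2018 mean AHC by the 2007 mean AHC for each ward type and compares the result with the fold increase stated in the abstract. The raw ratios agree with the reported values to within about 0.02; the small differences are consistent with the reported estimates coming from a statistical model rather than a plain ratio of means, although that is an assumption and is not stated in the abstract. The variable and function names are hypothetical.

```python
# (mean AHC 2007 in ml/PD, mean AHC 2018 in ml/PD, reported fold increase),
# all values copied from the Results paragraph of the abstract.
reported = {
    "ICU": (79.2, 137.4, 1.74),
    "IMC": (41.4, 70.6, 1.69),
    "RW": (19.0, 32.6, 1.71),
}


def fold_change(baseline, final):
    """Plain ratio of the final value to the baseline value."""
    return final / baseline


# Raw ratio of means per ward type.
raw = {ward: fold_change(start, end) for ward, (start, end, _) in reported.items()}

# Absolute gap between the raw ratio and the reported fold increase.
gap = {ward: abs(raw[ward] - reported[ward][2]) for ward in reported}

for ward in reported:
    print(f"{ward}: raw ratio {raw[ward]:.3f}, reported {reported[ward][2]:.2f}")
```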
A Statistical Performance Analysis of Graph Clustering Algorithms
Measuring graph clustering quality remains an open problem. Here, we introduce three statistical measures to address the problem. We empirically explore their behavior under a number of stress test scenarios and compare it to the commonly used modularity and conductance. Our measures are robust, immune to the resolution limit, easy to interpret intuitively, and also have a formal statistical interpretation. Our empirical stress test results confirm that our measures compare favorably to the established ones. In particular, they are shown to be more responsive to graph structure, less sensitive to sample size and breakdowns during numerical implementation, and less sensitive to uncertainty in connectivity. These features are especially important in the context of larger data sets or when the data may contain errors in the connectivity patterns.
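For readers unfamiliar with conductance, one of the two established baselines named in the abstract, the following sketch computes it for a single cluster under the common definition cut(S) / min(vol(S), vol(V \ S)). It assumes an undirected, unweighted edge list; the function name and the toy graph are hypothetical, and the three new statistical measures proposed in the paper are not reproduced here.

```python
def conductance(edges, subset):
    """Conductance of a node set: cut size over the smaller side's volume.

    edges:  list of (u, v) pairs of an undirected, unweighted graph
    subset: iterable of nodes forming the cluster S
    """
    subset = set(subset)
    cut = 0      # edges with exactly one endpoint in S
    vol_in = 0   # summed degree of nodes inside S
    vol_out = 0  # summed degree of nodes outside S
    for u, v in edges:
        u_in, v_in = u in subset, v in subset
        if u_in != v_in:
            cut += 1
        vol_in += int(u_in) + int(v_in)
        vol_out += int(not u_in) + int(not v_in)
    smaller = min(vol_in, vol_out)
    if smaller == 0:
        raise ValueError("conductance is undefined when one side has no edges")
    return cut / smaller


# Toy example (an assumption for illustration): two triangles joined by one edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
phi_triangle = conductance(edges, {0, 1, 2})  # one bridging edge, volume 7
phi_single = conductance(edges, {0})          # both edges of node 0 are cut
```

Lower values indicate a better separated cluster: the triangle has conductance 1/7, whereas the single node has conductance 1.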
eBay users form stable groups of common interest
Market segmentation of an online auction site is studied by analyzing the
users' bidding behavior. The distribution of user activity is investigated and
a network of bidders connected by common interest in individual articles is
constructed. The network's cluster structure corresponds to the main user
groups according to common interest, exhibiting hierarchy and overlap. A key
feature of the analysis is its independence of any similarity measure between
the articles offered on eBay, as such a measure would only introduce bias in
the analysis. Results are compared to null models based on random networks and
clusters are validated and interpreted using the taxonomic classifications of
eBay categories. We find clear-cut and coherent interest profiles for the
bidders in each cluster. The interest profiles of bidder groups are compared to
the classification of articles actually bought by these users during the time
span 6-9 months after the initial grouping. The interest profiles discovered
remain stable, indicating typical interest profiles in society. Our results
show how network theory can be applied successfully to problems of market
segmentation and sociological milieu studies with sparse, high dimensional
data. Comment: Major revision of the manuscript. Methodological improvements and
inclusion of analysis of temporal development of user interests. 19 pages, 12
figures, 5 tables
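The construction of a bidder network from bidding data can be sketched as a one-mode projection of the bidder-article relation. The example below is a minimal sketch under stated assumptions: two bidders are linked if they bid on the same article, and the edge weight counts the shared articles. Whether the paper weights edges this way is not stated in the abstract, and the function name and sample data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def bidder_network(bids):
    """Project (bidder, article) pairs onto a weighted bidder-bidder network.

    Returns a dict mapping a sorted bidder pair to the number of articles
    on which both bidders placed a bid.
    """
    bidders_by_article = defaultdict(set)
    for bidder, article in bids:
        bidders_by_article[article].add(bidder)

    weights = defaultdict(int)
    for bidders in bidders_by_article.values():
        # Every pair of bidders competing for the same article gets a link.
        for u, v in combinations(sorted(bidders), 2):
            weights[(u, v)] += 1
    return dict(weights)


# Hypothetical sample data, for illustration only.
bids = [
    ("ann", "lamp"), ("bob", "lamp"),
    ("ann", "vase"), ("bob", "vase"), ("cat", "vase"),
    ("cat", "coin"),
]
network = bidder_network(bids)
```

Note that the projection uses only who bid on what; no similarity measure between articles enters, in line with the independence the abstract emphasizes.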
Information Symmetries in Irreversible Processes
We study dynamical reversibility in stationary stochastic processes from an
information theoretic perspective. Extending earlier work on the reversibility
of Markov chains, we focus on finitary processes with arbitrarily long
conditional correlations. In particular, we examine stationary processes
represented or generated by edge-emitting, finite-state hidden Markov models.
Surprisingly, we find pervasive temporal asymmetries in the statistics of such
stationary processes with the consequence that the computational resources
necessary to generate a process in the forward and reverse temporal directions
are generally not the same. In fact, an exhaustive survey indicates that most
stationary processes are irreversible. We study the ensuing relations between
model topology in different representations, the process's statistical
properties, and its reversibility in detail. A process's temporal asymmetry is
efficiently captured using two canonical unifilar representations of the
generating model, the forward-time and reverse-time epsilon-machines. We
analyze example irreversible processes whose epsilon-machine presentations
change size under time reversal, including one which has a finite number of
recurrent causal states in one direction, but an infinite number in the
opposite. From the forward-time and reverse-time epsilon-machines, we are able
to construct a symmetrized, but nonunifilar, generator of a process---the
bidirectional machine. Using the bidirectional machine, we show how to directly
calculate a process's fundamental information properties, many of which are
otherwise only poorly approximated via process samples. The tools we introduce
and the insights we offer provide a better understanding of the many facets of
reversibility and irreversibility in stochastic processes. Comment: 32 pages, 17 figures, 2 tables;
http://csc.ucdavis.edu/~cmg/compmech/pubs/pratisp2.ht
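The abstract builds on the reversibility of Markov chains, which can be checked through detailed balance: a stationary chain is reversible when pi_i P_ij = pi_j P_ji for all state pairs. The sketch below illustrates only this baseline case, not the hidden Markov model or epsilon-machine analysis of the paper. It assumes an irreducible, aperiodic chain so that simple power iteration converges; the function names and example matrices are hypothetical.

```python
def stationary_distribution(P, steps=1000):
    """Approximate the stationary distribution by repeated multiplication.

    Assumes P is a row-stochastic matrix of an irreducible, aperiodic chain.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(steps):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def is_reversible(P, tol=1e-9):
    """Check detailed balance pi_i * P_ij == pi_j * P_ji for all pairs."""
    pi = stationary_distribution(P)
    n = len(P)
    return all(
        abs(pi[i] * P[i][j] - pi[j] * P[j][i]) <= tol
        for i in range(n)
        for j in range(i + 1, n)
    )


# A symmetric random walk on a path of three states: reversible.
symmetric_chain = [
    [0.5, 0.5, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 0.5, 0.5],
]

# A chain that mostly circulates 0 -> 1 -> 2 -> 0: a net probability
# current flows around the cycle, so detailed balance fails.
cyclic_chain = [
    [0.0, 0.9, 0.1],
    [0.1, 0.0, 0.9],
    [0.9, 0.1, 0.0],
]

symmetric_ok = is_reversible(symmetric_chain)
cyclic_ok = is_reversible(cyclic_chain)
```

Both example chains have a uniform stationary distribution, yet only the symmetric one satisfies detailed balance, which shows that reversibility is a property of the transition structure and not of the stationary distribution alone.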
Comparative Study for Inference of Hidden Classes in Stochastic Block Models
Inference of hidden classes in the stochastic block model is a classical problem
with important applications. Most commonly used methods for this problem
involve naïve mean field approaches or heuristic spectral methods.
Recently, belief propagation was proposed for this problem. In this
contribution we perform a comparative study between the three methods on
synthetically created networks. We show that belief propagation performs much
better than naïve mean field and spectral
approaches. This applies to accuracy, computational efficiency and the tendency
to overfit the data. Comment: 8 pages, 5 figures AIGM1
The interplay of microscopic and mesoscopic structure in complex networks
Not all nodes in a network are created equal. Differences and similarities
exist at both individual node and group levels. Disentangling single node from
group properties is crucial for network modeling and structural inference.
Based on unbiased generative probabilistic exponential random graph models and
employing distributive message passing techniques, we present an efficient
algorithm that allows one to separate the contributions of individual nodes and
groups of nodes to the network structure. This leads to improved detection
accuracy of latent class structure in real world data sets compared to models
that focus on group structure alone. Furthermore, the inclusion of hitherto
neglected group specific effects in models used to assess the statistical
significance of small subgraph (motif) distributions in networks may be
sufficient to explain most of the observed statistics. We show the predictive
power of such generative models in forecasting putative gene-disease
associations in the Online Mendelian Inheritance in Man (OMIM) database. The
approach is suitable for both directed and undirected uni-partite as well as
for bipartite networks.
Orientation bias of optically selected galaxy clusters and its impact on stacked weak-lensing analyses
Weak-lensing measurements of the averaged shear profiles of galaxy clusters binned by some proxy for cluster mass are commonly converted to cluster mass estimates under the assumption that these cluster stacks have spherical symmetry. In this paper, we test whether this assumption holds for optically selected clusters binned by estimated optical richness. Using mock catalogues created from N-body simulations populated realistically with galaxies, we ran a suite of optical cluster finders and estimated their optical richness. We binned galaxy clusters by true cluster mass and estimated optical richness and measured the ellipticity of these stacks. We find that the processes of optical cluster selection and richness estimation are biased, leading to stacked structures that are elongated along the line of sight. We show that weak-lensing alone cannot measure the size of this orientation bias. Weak-lensing masses of stacked optically selected clusters are overestimated by up to 3–6 per cent when clusters can be uniquely associated with haloes. This effect is large enough to lead to significant biases in the cosmological parameters derived from large surveys like the Dark Energy Survey, if not calibrated via simulations or fitted simultaneously. This bias probably also contributes to the observed discrepancy between the observed and predicted Sunyaev–Zel’dovich signal of optically selected clusters.