Statistical Mechanics of Community Detection
Starting from a general \textit{ansatz}, we show how community detection can
be interpreted as finding the ground state of an infinite range spin glass. Our
approach applies to weighted and directed networks alike. It contains the
\textit{ad hoc} introduced quality function from \cite{ReichardtPRL} and the
modularity as defined by Newman and Girvan \cite{Girvan03} as special
cases. The community structure of the network is interpreted as the spin
configuration that minimizes the energy of the spin glass with the spin states
being the community indices. We elucidate the properties of the ground state
configuration to give a concise definition of communities as cohesive subgroups
in networks that is adaptive to the specific class of network under study.
Further, we show how hierarchies and overlap in the community structure can be
detected. Computationally effective local update rules for optimization
procedures to find the ground state are given. We show how the \textit{ansatz}
may be used to discover the community around a given node without detecting all
communities in the full network and we give benchmarks for the performance of
this extension. Finally, we give expectation values for the modularity of
random graphs, which can be used in the assessment of statistical significance
of community structure.
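As a minimal illustration of the quality function named in the abstract above, the following sketch computes the Newman-Girvan modularity Q of a given node-to-community assignment on an undirected, unweighted graph. In the spin-glass reading, each node's community index plays the role of its spin state, and a configuration with higher Q corresponds to one with lower energy. The function name `modularity` and the edge-list input format are assumptions made for this example; the sketch does not cover the weighted or directed generalization, nor the paper's update rules.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman-Girvan modularity Q for an undirected, unweighted graph.

    edges:     list of (u, v) pairs, each undirected edge listed once
    community: dict mapping node -> community index (the "spin state")
    """
    m = len(edges)
    degree = defaultdict(int)      # degree of each node
    internal = defaultdict(int)    # number of edges inside each community
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1

    degree_sum = defaultdict(int)  # total degree of each community
    for node, k in degree.items():
        degree_sum[community[node]] += k

    # Q = sum over communities of (fraction of internal edges)
    #     minus (expected fraction under random rewiring with fixed degrees)
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Two triangles joined by a single bridge edge (hypothetical toy graph).
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
merged = {n: 0 for n in range(6)}
```

On this toy graph, `modularity(toy_edges, split)` evaluates to 5/14, while placing all nodes in one community gives 0, so the two-triangle split is the preferred configuration of the two.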
Multifractal Network Generator
We introduce a new approach to constructing networks with realistic features.
Our method, in spite of its conceptual simplicity (it has only two parameters),
is capable of generating a wide variety of network types with prescribed
statistical properties, e.g., with degree- or clustering coefficient
distributions of various, very different forms. In turn, these graphs can be
used to test hypotheses or as models of actual data. The method is based on a
mapping between suitably chosen singular measures defined on the unit square
and sparse infinite networks. Such a mapping has the great potential of
allowing for graph theoretical results for a variety of network topologies. The
main idea of our approach is to go to the infinite limit of the singular
measure and the size of the corresponding graph simultaneously. A unique
feature of this construction is that the complexity of the generated network
increases with its size. We present analytic expressions, derived from the
parameters of the initial generating measure (which is then iterated), for such
major characteristics of graphs as their degree, clustering coefficient, and
assortativity coefficient distributions. The optimal parameters of the
generating measure are determined from a simple simulated annealing process.
Thus, the present work provides a tool for researchers from a variety of fields
(such as biology, computer science, or complex systems) enabling them
to create a versatile model of their network data.
Comment: Preprint. Final version appeared in PNAS
Scaling Nonparametric Bayesian Inference via Subsample-Annealing
We describe an adaptation of the simulated annealing algorithm to
nonparametric clustering and related probabilistic models. This new algorithm
learns nonparametric latent structure over a growing and constantly churning
subsample of training data, where the portion of data subsampled can be
interpreted as the inverse temperature beta(t) in an annealing schedule. Gibbs
sampling at high temperature (i.e., with a very small subsample) can more
quickly explore sketches of the final latent state by (a) making longer jumps
around latent space (as in block Gibbs) and (b) lowering energy barriers (as in
simulated annealing). We prove subsample annealing speeds up mixing time N^2 ->
N in a simple clustering model and exp(N) -> N in another class of models,
where N is data size. Empirically, subsample annealing outperforms naive Gibbs
sampling in accuracy-per-wallclock time, and can scale to larger datasets and
deeper hierarchical models. We demonstrate improved inference on million-row
subsamples of US Census data and network log data and a 307-row hospital rating
dataset, using a Pitman-Yor generalization of the Cross Categorization model.
Comment: To appear in AISTATS 201
Particle Swarm Optimization for the Clustering of Wireless Sensors
Clustering is necessary for data aggregation, hierarchical routing, optimizing sleep patterns, election of extremal sensors, optimizing coverage and resource allocation, reuse of frequency bands and codes, and conserving energy. Optimal clustering is typically an NP-hard problem. Solutions to NP-hard problems involve searches through vast spaces of possible solutions. Evolutionary algorithms have been applied successfully to a variety of NP-hard problems. We explore one such approach, Particle Swarm Optimization (PSO), an evolutionary programming technique where a 'swarm' of test solutions, analogous to a natural swarm of bees, ants or termites, is allowed to interact and cooperate to find the best solution to the given problem. We use the PSO approach to cluster sensors in a sensor network. The energy efficiency of our clustering in a data-aggregation type sensor network deployment is tested using a modified LEACH-C code. The PSO technique with a recursive bisection algorithm is tested against random search and simulated annealing; the PSO technique is shown to be robust. We further investigate developing a distributed version of the PSO algorithm for optimally clustering a wireless sensor network.
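To make the swarm idea in the abstract above concrete, here is a minimal sketch of a generic global-best PSO that places k cluster heads among 2-D sensor positions. It is not the authors' method: it omits their recursive bisection step and the LEACH-C energy model, and it assumes a simple stand-in cost, the sum of squared distances from each sensor to its nearest head. The names `clustering_cost` and `pso_cluster`, the inertia and acceleration values, and the toy sensor layout are all assumptions chosen for illustration.

```python
import random


def clustering_cost(heads_flat, sensors):
    """Sum of squared distances from each sensor to its nearest cluster head.

    heads_flat is a flat list [x0, y0, x1, y1, ...] of head coordinates.
    """
    heads = [(heads_flat[i], heads_flat[i + 1])
             for i in range(0, len(heads_flat), 2)]
    return sum(min((sx - hx) ** 2 + (sy - hy) ** 2 for hx, hy in heads)
               for sx, sy in sensors)


def pso_cluster(sensors, k, bounds, n_particles=30, n_iters=200,
                w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO over the 2*k coordinates of k cluster heads."""
    rng = random.Random(seed)          # seeded for repeatable runs
    lo, hi = bounds
    dim = 2 * k
    pos = [[rng.uniform(lo, hi) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                        # personal bests
    pbest_cost = [clustering_cost(p, sensors) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]     # swarm-wide best

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # inertia + pull toward personal best + pull toward swarm best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # keep the particle inside the deployment area
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            cost = clustering_cost(pos[i], sensors)
            if cost < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], cost
                if cost < gbest_cost:
                    gbest, gbest_cost = pos[i][:], cost

    heads = [(gbest[i], gbest[i + 1]) for i in range(0, dim, 2)]
    return heads, gbest_cost


# Hypothetical layout: two well-separated groups of four sensors each.
toy_sensors = [(0, 0), (0, 1), (1, 0), (1, 1),
               (10, 10), (10, 11), (11, 10), (11, 11)]
```

A call such as `pso_cluster(toy_sensors, k=2, bounds=(0.0, 11.0))` returns the best head positions found and their cost. For this layout the lowest achievable cost is 4.0 (one head at the centre of each group), which gives a reference point for judging how close a run gets.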
Identifying comprehension problems and their relationship with students' mathematics background, learning style, motivation and interest in the food cost control chapter at Sekolah Menengah Teknik (ERT) Rembau: a case study.
This study examines the correlation between Mathematics background, learning style, motivation and interest and students' comprehension of the chapter. The respondents were 30 Form Five students of the Catering course at Sekolah Menengah Teknik (ERT) Rembau, Negeri Sembilan. The research instrument was a questionnaire, and all data were analysed using SPSS version 10.0 to obtain mean and correlation values to meet the stated objectives. The results show that the correlation between students' learning style and their comprehension is strong, whereas the correlations between Mathematics background, motivation and interest and students' comprehension are moderate. The mean levels for students' comprehension problems, Mathematics background, learning style, motivation and interest regarding the Food Cost Control chapter are moderate. The study proposes the development of a Self-Learning Module for the Food Cost Control chapter to help Catering students in their learning process.
Generating Robust and Efficient Networks Under Targeted Attacks
Much of our commerce and travel depends on the efficient operation of
large-scale networks. Some of those, such as electric power grids, transportation
systems, communication networks, and others, must maintain their efficiency
even after several failures or malicious attacks. We outline a procedure that
modifies any given network to enhance its robustness, defined as the size of
its largest connected component after a succession of attacks, whilst keeping a
high efficiency, described in terms of the shortest paths among nodes. We also
show that this generated set of networks is very similar to networks optimized
for robustness in several aspects such as high assortativity and the presence
of an onion-like structure.
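The robustness notion in the abstract above, the size of the largest connected component after a succession of attacks, can be traced with a short sketch. The attack strategy used here (repeatedly removing the node with the highest current degree, ties broken by smallest node id) is an assumption for illustration; the abstract does not specify the attack, and the sketch does not implement the authors' network-modification procedure or their efficiency measure. The names `largest_component_size` and `attack_profile` are hypothetical.

```python
def largest_component_size(adj):
    """Size of the largest connected component of an adjacency dict."""
    seen = set()
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:                      # iterative depth-first search
            v = stack.pop()
            size += 1
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best


def attack_profile(edges, n_nodes):
    """Largest-component size after each successive node removal.

    Assumed attack: always remove the node with the highest current
    degree (recomputed after every removal), smallest id on ties.
    """
    adj = {v: set() for v in range(n_nodes)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    sizes = []
    while adj:
        target = max(adj, key=lambda v: (len(adj[v]), -v))
        for w in adj.pop(target):         # detach the attacked node
            adj[w].discard(target)
        sizes.append(largest_component_size(adj))
    return sizes


# Hypothetical toy graphs: a 5-node path and a 5-node star.
path_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
star_edges = [(0, 1), (0, 2), (0, 3), (0, 4)]
```

On the path, the profile is `[3, 1, 1, 1, 0]`; on the star, one removal of the hub already fragments the network, giving `[1, 1, 1, 1, 0]`. A network whose profile decays more slowly is more robust in the sense used above.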
Community detection algorithms: a comparative analysis
Uncovering the community structure exhibited by real networks is a crucial
step towards an understanding of complex systems that goes beyond the local
organization of their constituents. Many algorithms have been proposed so far,
but none of them has been subjected to strict tests to evaluate their
performance. Most of the sporadic tests performed so far involved small
networks with known community structure and/or artificial graphs with a
simplified structure, which is very uncommon in real systems. Here we test
several methods against a recently introduced class of benchmark graphs, with
heterogeneous distributions of degree and community size. The methods are also
tested against the benchmark by Girvan and Newman and on random graphs. As a
result of our analysis, three recent algorithms introduced by Rosvall and
Bergstrom, Blondel et al. and Ronhovde and Nussinov, respectively, have an
excellent performance, with the additional advantage of low computational
complexity, which enables one to analyze large systems.
Comment: 12 pages, 8 figures. The software to compute the values of our
general normalized mutual information is available at
http://santo.fortunato.googlepages.com/inthepress