Dynamical evolution of clustering in complex network of earthquakes
The network approach plays a distinguished role in contemporary science of
complex systems/phenomena. Such an approach has been introduced into seismology
in a recent work [S. Abe and N. Suzuki, Europhys. Lett. 65, 581 (2004)]. Here,
we discuss the dynamical property of the earthquake network constructed in
California and report the discovery that the values of the clustering
coefficient remain stationary before main shocks, suddenly jump up at the main
shocks, and then slowly decay following a power law to become stationary again.
Thus, the network approach is found to characterize main shocks in a peculiar
manner. Comment: 10 pages, 3 figures, 1 table
Taxonomy and clustering in collaborative systems: the case of the on-line encyclopedia Wikipedia
In this paper we investigate the nature and structure of the relation between
imposed classifications and real clustering in a particular case of a
scale-free network given by the on-line encyclopedia Wikipedia. We find a
statistical similarity in the distributions of community sizes both by using
the top-down approach of the categories division present in the archive and in
the bottom-up procedure of community detection given by an algorithm based on
the spectral properties of the graph. Despite the statistically similar
behaviour the two methods provide a rather different division of the articles,
thereby signaling that the nature and presence of power laws is a general
feature for these systems and cannot be used as a benchmark to evaluate the
suitability of a clustering method. Comment: 5 pages, 3 figures, epl2 style
Deterministic hierarchical networks
It has been shown that many networks associated with complex systems are
small-world (they have both a large local clustering coefficient and a small
diameter) and they are also scale-free (the degrees are distributed according
to a power law). Moreover, these networks are very often hierarchical, as they
describe the modularity of the systems that are modeled. Most of the studies
for complex networks are based on stochastic methods. However, a deterministic
method, with an exact determination of the main relevant parameters of the
networks, has proven useful. Indeed, this approach complements and enhances the
probabilistic and simulation techniques and, therefore, it provides a better
understanding of the systems modeled. In this paper we find the radius,
diameter, clustering coefficient and degree distribution of a generic family of
deterministic hierarchical small-world scale-free networks that has been
considered for modeling real-life complex systems.
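The four quantities named in this abstract (radius, diameter, clustering coefficient and degree distribution) can be illustrated on a small example. The following is only a sketch under stated assumptions: the four-node graph `toy` and the helper names `local_clustering`, `eccentricity` and `summarize` are hypothetical and are not the deterministic hierarchical family studied in the paper.

```python
from collections import Counter, deque


def local_clustering(adj, v):
    """Fraction of pairs of v's neighbours that are themselves linked."""
    nbrs = list(adj[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))


def eccentricity(adj, source):
    """Largest shortest-path distance from source (graph assumed connected)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return max(dist.values())


def summarize(adj):
    """Collect radius, diameter, mean clustering and degree counts."""
    ecc = [eccentricity(adj, v) for v in adj]
    return {
        "radius": min(ecc),
        "diameter": max(ecc),
        "avg_clustering": sum(local_clustering(adj, v) for v in adj) / len(adj),
        "degree_counts": Counter(len(adj[v]) for v in adj),
    }


# Hypothetical toy graph: a triangle 0-1-2 with a pendant node 3 on node 2.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
stats = summarize(toy)
```

For a deterministic family, these values would presumably be derived in closed form rather than measured; a numerical check like this is mainly useful for validating such formulas on small instances.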
Power-law weighted networks from local attachments
This letter introduces a mechanism for constructing, through a process of
distributed decision-making, substrates for the study of collective dynamics on
extended power-law weighted networks with both a desired scaling exponent and a
fixed clustering coefficient. The analytical results show that the connectivity
distribution converges to the scaling behavior often found in social and
engineering systems. To illustrate the approach of the proposed framework we
generate network substrates that resemble steady state properties of the
empirical citation distributions of (i) publications indexed by the Institute
for Scientific Information from 1981 to 1997; (ii) patents granted by the U.S.
Patent and Trademark Office from 1975 to 1999; and (iii) opinions written by
the Supreme Court and the cases they cite from 1754 to 2002. Comment: 18 pages, 3 figures; Proceedings of the IEEE Conference on Decision
and Control and the European Control Conference, Orlando, FL, Dec. 2011;
Added references; We modified the model in order to take into account
extended power-law distributions which better fit to the citations data sets;
Added proofs of theorems; Shortened version; Updated plots
A new unsupervised feature selection method for text clustering based on genetic algorithms
Nowadays, a vast amount of textual information is collected and stored in various databases around the world, including the Internet as the largest database of all. This rapid growth of published text means that even the most avid reader cannot hope to keep up with all the reading in a field, and consequently nuggets of insight or new knowledge are at risk of languishing undiscovered in the literature. Text mining offers a solution to this problem by replacing or supplementing the human reader with automatic systems undeterred by the text explosion. It involves analyzing a large collection of documents to discover previously unknown information. Text clustering is one of the most important areas in text mining; it includes text preprocessing, dimension reduction by selecting some terms (features), and finally clustering using the selected terms. Feature selection appears to be the most important step in the process. Conventional unsupervised feature selection methods define a measure of the discriminating power of terms in order to select proper terms from the corpus. However, up to now the evaluation of terms in groups has not been investigated in reported work. In this paper, a new and robust unsupervised feature selection approach is proposed that evaluates terms in groups. In addition, a new Modified Term Variance measure is proposed for evaluating groups of terms. Furthermore, a genetic-based algorithm is designed and implemented for finding the most valuable groups of terms based on the new measure. These terms are then used to generate the final feature vector for the clustering process. To evaluate and justify our approach, the proposed method and a conventional term variance method are implemented and tested using the Reuters-21578 corpus collection. For a more accurate comparison, the methods have been tested on three corpora; for each corpus, the clustering task was run ten times and the results were averaged.
The results of comparing these two methods are very promising and show that our method produces better average accuracy and F1-measure than the conventional term variance method.
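The conventional term variance baseline mentioned above can be sketched in a few lines. This is an illustration only: it assumes the common definition in which a term's score is the sum over documents of the squared deviation of its frequency from its mean frequency, it uses whitespace tokenization, and the names `term_variance`, `select_terms` and the three toy documents are hypothetical. It does not implement the paper's Modified Term Variance or its genetic search, which the abstract does not specify.

```python
from collections import Counter


def term_variance(docs):
    """Return {term: sum over documents of (tf - mean tf)^2}."""
    counts = [Counter(doc.lower().split()) for doc in docs]
    vocab = set().union(*counts)
    n = len(docs)
    scores = {}
    for term in vocab:
        freqs = [c[term] for c in counts]  # Counter yields 0 for absent terms
        mean = sum(freqs) / n
        scores[term] = sum((f - mean) ** 2 for f in freqs)
    return scores


def select_terms(docs, k):
    """Keep the k highest-variance terms (ties broken alphabetically)."""
    scores = term_variance(docs)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Hypothetical toy corpus, not Reuters-21578.
docs = [
    "oil price oil market",
    "oil market report",
    "wheat harvest report wheat wheat",
]
top_terms = select_terms(docs, 2)
```

Terms whose frequency varies strongly between documents ("wheat", "oil") rank highest, while terms spread evenly across documents score low. Note that this baseline scores each term in isolation, which is the limitation the abstract sets out to address.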
FPGA-Based Processor Acceleration for Image Processing Applications
FPGA-based embedded image processing systems offer considerable computing resources but present programming challenges when compared to software systems. The paper describes an approach based on an FPGA-based soft processor called Image Processing Processor (IPPro), which can operate at up to 337 MHz on a high-end Xilinx FPGA family, and gives details of the dataflow-based programming environment. The approach is demonstrated for a k-means clustering operation and a traffic sign recognition application, both of which have been prototyped on an Avnet Zedboard that has a Xilinx Zynq-7000 system-on-chip (SoC). A number of parallel dataflow mapping options were explored, giving a speed-up of 8 times for the k-means clustering using 16 IPPro cores, and a speed-up of 9.6 times for the morphology filter operation of the traffic sign recognition using 16 IPPro cores, compared to their equivalent ARM-based software implementations. We show that for k-means clustering, the 16 IPPro cores implementation is 57, 28 and 1.7 times more power efficient (fps/W) than the ARM Cortex-A7 CPU, NVIDIA GeForce GTX980 GPU and ARM Mali-T628 embedded GPU, respectively.
Sparse Allreduce: Efficient Scalable Communication for Power-Law Data
Many large datasets exhibit power-law statistics: the web graph, social
networks, text data, click-through data, etc. Their adjacency graphs are termed
natural graphs, and are known to be difficult to partition. As a consequence
most distributed algorithms on these graphs are communication intensive. Many
algorithms on natural graphs involve an Allreduce: a sum or average of
partitioned data which is then shared back to the cluster nodes. Examples
include PageRank, spectral partitioning, and many machine learning algorithms
including regression, factor (topic) models, and clustering. In this paper we
describe an efficient and scalable Allreduce primitive for power-law data. We
point out scaling problems with existing butterfly and round-robin networks for
Sparse Allreduce, and show that a hybrid approach improves on both.
Furthermore, we show that Sparse Allreduce stages should be nested instead of
cascaded (as in the dense case), and that the optimum-throughput Allreduce
network should be a butterfly of heterogeneous degree where degree decreases
with depth into the network. Finally, a simple replication scheme is introduced
to deal with node failures. We present experiments showing significant
improvements over existing systems such as PowerGraph and Hadoop.
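To make the Allreduce operation on sparse data concrete, the sketch below simulates a plain degree-2 butterfly (recursive doubling) on sparse vectors stored as dictionaries. It is an assumption-laden illustration, not the paper's design: it runs in a single process, it does not partition index ranges, and it implements neither the nested stages nor the heterogeneous-degree network the abstract argues for. The names `merge_sparse`, `butterfly_allreduce` and the `nodes` data are hypothetical.

```python
def merge_sparse(a, b):
    """Sum two sparse vectors given as {index: value} dictionaries."""
    out = dict(a)
    for idx, val in b.items():
        out[idx] = out.get(idx, 0.0) + val
    return out


def butterfly_allreduce(vectors):
    """Simulate a degree-2 butterfly: after log2(n) exchange stages,
    every node holds the sum of all input vectors."""
    n = len(vectors)
    if n == 0 or n & (n - 1):
        raise ValueError("number of nodes must be a power of two")
    state = [dict(v) for v in vectors]
    stride = 1
    while stride < n:
        # Node i exchanges with partner i XOR stride; both keep the sum.
        state = [merge_sparse(state[i], state[i ^ stride]) for i in range(n)]
        stride *= 2
    return state


# Hypothetical sparse contributions held by four cluster nodes.
nodes = [{0: 1.0, 5: 2.0}, {5: 1.0}, {2: 4.0}, {0: 3.0, 9: 1.0}]
result = butterfly_allreduce(nodes)
```

In this naive form every node ends up exchanging the full union of indices in the last stage, which hints at why sparse, power-law data calls for a more careful network layout than the dense case.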
Abstract Phase-space Networks Describing Reactive Dynamics
An abstract network approach is proposed for the description of the dynamics
in reactive processes. The phase space of the variables (concentrations in
reactive systems) is partitioned into a finite number of segments, which
constitute the nodes of the abstract network. Transitions between the nodes are
dictated by the dynamics of the reactive process and provide the links between
the nodes. These are weighted networks, since each link weight reflects the
transition rate between the corresponding states-nodes. With this construction
the network properties mirror the dynamics of the underlying process and one
can investigate the system properties by studying the corresponding abstract
network. As a working example the Lattice Limit Cycle (LLC) model is used. Its
corresponding abstract network is constructed and the transition matrix
elements are computed via Kinetic (Dynamic) Monte Carlo simulations. For this
model it is shown that the degree distribution follows a power law with
exponent -1, while the average clustering coefficient scales with the
network size (number of nodes) as . The
computed exponents classify the LLC abstract reactive network into the
scale-free networks. This conclusion corroborates earlier investigations
demonstrating the formation of fractal spatial patterns in LLC reactive
dynamics due to stochasticity and to the clustering of homologous species. The
present construction of abstract networks (based on the partition of the phase
space) is generic and can be implemented with appropriate adjustments in many
dynamical systems and in time series analysis. Comment: 10 pages, 6 figures
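The construction described in this abstract (partition the phase space into segments, treat segments as nodes, and weight links by observed transitions) can be sketched as follows. This is a minimal illustration under stated assumptions: it uses a hypothetical deterministic circular trajectory in place of Kinetic Monte Carlo runs of the LLC model, a uniform grid over the square [-1, 1] x [-1, 1], and normalized transition counts as a stand-in for transition rates. The names `segment_of` and `transition_network` are likewise hypothetical.

```python
import math
from collections import defaultdict


def segment_of(x, y, bins, lo=-1.0, hi=1.0):
    """Map a phase-space point to the grid cell (node) that contains it."""
    def index(value):
        i = int((value - lo) / (hi - lo) * bins)
        return min(max(i, 0), bins - 1)  # clamp points on the boundary
    return (index(x), index(y))


def transition_network(trajectory, bins):
    """Build {source: {target: weight}} from consecutive trajectory points.

    Weights are the fraction of departures from `source` that go to
    `target`; steps that stay inside the same cell are ignored.
    """
    counts = defaultdict(lambda: defaultdict(int))
    nodes = [segment_of(x, y, bins) for x, y in trajectory]
    for src, dst in zip(nodes, nodes[1:]):
        if src != dst:
            counts[src][dst] += 1
    weights = {}
    for src, row in counts.items():
        total = sum(row.values())
        weights[src] = {dst: c / total for dst, c in row.items()}
    return weights


# Hypothetical trajectory: a limit-cycle-like orbit of radius 0.9.
trajectory = [
    (0.9 * math.cos(0.05 * t), 0.9 * math.sin(0.05 * t)) for t in range(2000)
]
network = transition_network(trajectory, bins=4)
```

Degree distributions and clustering coefficients would then be measured on `network` as for any weighted directed graph. With a stochastic simulation in place of the toy orbit, each node would typically have several outgoing links with different weights.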
