13,179 research outputs found
Deterministic Automata for Unordered Trees
Automata for unordered unranked trees are relevant for defining schemas and
queries for data trees in Json or Xml format. While the existing notions are
well-investigated concerning expressiveness, they all lack a proper notion of
determinism, which makes it difficult to distinguish subclasses of automata for
which problems such as inclusion, equivalence, and minimization can be solved
efficiently. In this paper, we propose and investigate different notions of
"horizontal determinism", starting from automata for unranked trees in which
the horizontal evaluation is performed by finite state automata. We show that a
restriction to confluent horizontal evaluation leads to polynomial-time
emptiness and universality, but still suffers from coNP-completeness of the
emptiness of binary intersections. Finally, efficient algorithms can be
obtained by imposing an order of horizontal evaluation globally for all
automata in the class. Depending on the choice of the order, we obtain
different classes of automata, each of which has the same expressiveness as
CMso.Comment: In Proceedings GandALF 2014, arXiv:1408.556
Hierarchical information clustering by means of topologically embedded graphs
We introduce a graph-theoretic approach to extract clusters and hierarchies
in complex data-sets in an unsupervised and deterministic manner, without the
use of any prior information. This is achieved by building topologically
embedded networks containing the subset of most significant links and analyzing
the network structure. For a planar embedding, this method provides both the
intra-cluster hierarchy, which describes the way clusters are composed, and the
inter-cluster hierarchy which describes how clusters gather together. We
discuss performance, robustness and reliability of this method by first
investigating several artificial data-sets, finding that it can outperform
significantly other established approaches. Then we show that our method can
successfully differentiate meaningful clusters and hierarchies in a variety of
real data-sets. In particular, we find that the application to gene expression
patterns of lymphoma samples uncovers biologically significant groups of genes
which play key-roles in diagnosis, prognosis and treatment of some of the most
relevant human lymphoid malignancies.Comment: 33 Pages, 18 Figures, 5 Table
Data clustering using a model granular magnet
We present a new approach to clustering, based on the physical properties of
an inhomogeneous ferromagnet. No assumption is made regarding the underlying
distribution of the data. We assign a Potts spin to each data point and
introduce an interaction between neighboring points, whose strength is a
decreasing function of the distance between the neighbors. This magnetic system
exhibits three phases. At very low temperatures it is completely ordered; all
spins are aligned. At very high temperatures the system does not exhibit any
ordering and in an intermediate regime clusters of relatively strongly coupled
spins become ordered, whereas different clusters remain uncorrelated. This
intermediate phase is identified by a jump in the order parameters. The
spin-spin correlation function is used to partition the spins and the
corresponding data points into clusters. We demonstrate on three synthetic and
three real data sets how the method works. Detailed comparison to the
performance of other techniques clearly indicates the relative success of our
method.Comment: 46 pages, postscript, 15 ps figures include
Networks and the epidemiology of infectious disease
The science of networks has revolutionised research into the dynamics of interacting elements. It could be argued that epidemiology in particular has embraced the potential of network theory more than any other discipline. Here we review the growing body of research concerning the spread of infectious diseases on networks, focusing on the interplay between network theory and epidemiology. The review is split into four main sections, which examine: the types of network relevant to epidemiology; the multitude of ways these networks can be characterised; the statistical methods that can be applied to infer the epidemiological parameters on a realised network; and finally simulation and analytical methods to determine epidemic dynamics on a given network. Given the breadth of areas covered and the ever-expanding number of publications, a comprehensive review of all work is impossible. Instead, we provide a personalised overview into the areas of network epidemiology that have seen the greatest progress in recent years or have the greatest potential to provide novel insights. As such, considerable importance is placed on analytical approaches and statistical methods which are both rapidly expanding fields. Throughout this review we restrict our attention to epidemiological issues
- …