On Quantifying Qualitative Geospatial Data: A Probabilistic Approach
Living in the era of data deluge, we have witnessed a web content explosion,
largely due to the massive availability of User-Generated Content (UGC). In
this work, we specifically consider the problem of geospatial information
extraction and representation, where one can exploit diverse sources of
information (such as image and audio data, text data, etc.), going beyond
traditional volunteered geographic information. Our ambition is to include
available narrative information in an effort to better explain geospatial
relationships: with spatial reasoning being a basic form of human cognition,
narratives expressing such experiences typically contain qualitative spatial
data, i.e., spatial objects and spatial relationships.
To this end, we formulate a quantitative approach for the representation of
qualitative spatial relations extracted from UGC in the form of texts. The
proposed method quantifies such relations based on multiple text observations.
Such observations provide distance and orientation features which are utilized
by a greedy Expectation Maximization-based (EM) algorithm to infer a
probability distribution over predefined spatial relationships; the latter
represent the quantified relationships under user-defined probabilistic
assumptions. We evaluate the applicability and quality of the proposed approach
using real UGC data originating from an actual travel blog text corpus. To
verify the quality of the result, we generate grid-based maps visualizing the
spatial extent of the various relations.
Symmetry based Structure Entropy of Complex Networks
Precisely quantifying the heterogeneity or disorder of a network system is
very important and desired in studies of behavior and function of the network
system. Although many degree-based entropies have been proposed to measure the
heterogeneity of real networks, the heterogeneity implicit in the structure of
networks cannot yet be precisely quantified. Hence, we propose a new structure
entropy based on automorphism partition to precisely quantify the structural
heterogeneity of networks. Analysis of extreme cases shows that entropy based
on automorphism partition can quantify the structural heterogeneity of networks
more precisely than degree-based entropy. We also summarize symmetry and
heterogeneity statistics of many real networks, finding that real networks are
indeed more heterogeneous from the viewpoint of automorphism partition than has
been depicted by degree-based entropies, and that
structural heterogeneity is strongly negatively correlated with the symmetry of
real networks.
Comment: 7 pages, 6 figures
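The contrast between a degree-based partition and an automorphism partition can be illustrated on a toy graph. The sketch below is not taken from the paper: it assumes the entropy is the Shannon entropy of the relative cell sizes of a vertex partition, and it finds automorphisms by brute force over all permutations, which is only feasible for very small graphs. The function names (`partition_entropy`, `degree_partition`, `automorphism_partition`) are hypothetical.

```python
from itertools import permutations
from math import log


def partition_entropy(cells, n):
    """Shannon entropy of a vertex partition (assumed form: p_i = |cell_i| / n)."""
    return -sum((len(c) / n) * log(len(c) / n) for c in cells)


def degree_partition(n, edges):
    """Group vertices that share the same degree."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    groups = {}
    for node, d in enumerate(deg):
        groups.setdefault(d, []).append(node)
    return sorted(groups.values())


def automorphism_partition(n, edges):
    """Group vertices into orbits of the automorphism group (brute force)."""
    edge_set = {frozenset(e) for e in edges}
    parent = list(range(n))  # simple union-find over vertices

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for perm in permutations(range(n)):
        # A vertex bijection that maps every edge onto an edge is an automorphism.
        if all(frozenset((perm[u], perm[v])) in edge_set for u, v in edges):
            for u in range(n):
                ru, rv = find(u), find(perm[u])
                if ru != rv:
                    parent[max(ru, rv)] = min(ru, rv)

    groups = {}
    for node in range(n):
        groups.setdefault(find(node), []).append(node)
    return sorted(groups.values())


# Path graph on 5 vertices: 0 - 1 - 2 - 3 - 4
path_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
by_degree = degree_partition(5, path_edges)
by_orbit = automorphism_partition(5, path_edges)
print(by_degree, partition_entropy(by_degree, 5))
print(by_orbit, partition_entropy(by_orbit, 5))
```

On this path graph the three inner vertices all have degree 2, yet the centre vertex is structurally distinct from its neighbours, so the orbit partition is finer than the degree partition and yields a larger entropy under the assumed definition.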
Randomizing bipartite networks: the case of the World Trade Web
Within the last fifteen years, network theory has been successfully applied
both to natural sciences and to socioeconomic disciplines. In particular,
bipartite networks have been recognized to provide a particularly insightful
representation of many systems, ranging from mutualistic networks in ecology to
trade networks in economics, hence the need for a pattern-detection-oriented
analysis to identify statistically significant structural properties.
Such an analysis rests upon the definition of suitable null models, i.e. upon
the choice of the portion of network structure to be preserved while
randomizing everything else. However, quite surprisingly, little work has been
done so far to define null models for real bipartite networks. The aim of the
present work is to fill this gap, extending a recently-proposed method to
randomize monopartite networks to bipartite networks. While the proposed
formalism is perfectly general, we apply our method to the binary, undirected,
bipartite representation of the World Trade Web, comparing the observed values
of a number of structural quantities of interest with the expected ones,
calculated via our randomization procedure. Interestingly, the behavior of the
World Trade Web in this new representation is strongly different from the
monopartite analogue, showing highly non-trivial patterns of self-organization.
Comment: 22 pages, 13 figures
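The idea of a null model that preserves part of the structure while randomizing the rest can be illustrated with a generic swap-based procedure. This is not the method of the paper, which the abstract does not detail; it is a minimal sketch, assuming that the preserved quantities are the degrees of the nodes in both layers of a binary bipartite network stored as a 0/1 biadjacency matrix. The function name `randomize_bipartite` and its parameters are hypothetical.

```python
import random


def randomize_bipartite(matrix, n_swaps, seed=0):
    """Degree-preserving randomization by repeated 'checkerboard' swaps.

    matrix: list of rows of 0/1 entries (biadjacency matrix, at least 2x2).
    Returns a new matrix with the same row sums and column sums.
    """
    rng = random.Random(seed)
    m = [row[:] for row in matrix]  # work on a copy
    rows, cols = len(m), len(m[0])
    for _ in range(n_swaps):
        r1, r2 = rng.sample(range(rows), 2)
        c1, c2 = rng.sample(range(cols), 2)
        # Look for a 2x2 submatrix of the form [[1, 0], [0, 1]] or [[0, 1], [1, 0]].
        diagonal_equal = m[r1][c1] == m[r2][c2]
        antidiagonal_equal = m[r1][c2] == m[r2][c1]
        if diagonal_equal and antidiagonal_equal and m[r1][c1] != m[r1][c2]:
            # Flipping the pattern moves two links but keeps every degree fixed.
            m[r1][c1], m[r1][c2] = m[r1][c2], m[r1][c1]
            m[r2][c1], m[r2][c2] = m[r2][c2], m[r2][c1]
    return m


observed = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]
shuffled = randomize_bipartite(observed, n_swaps=200, seed=1)
print([sum(row) for row in shuffled])        # row degrees
print([sum(col) for col in zip(*shuffled)])  # column degrees
```

Repeating the procedure with different seeds gives a sample of randomized networks against which an observed structural quantity can be compared.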
Comparing the hierarchy of author given tags and repository given tags in a large document archive
Folksonomies - large databases arising from collaborative tagging of items by
independent users - are becoming an increasingly important way of categorizing
information. In these systems users can tag items with free words, resulting in
a tripartite item-tag-user network. Although there are no prescribed relations
between tags, the way users think about the different categories presumably has
some built-in hierarchy, in which more specific concepts are descendants of some
more general categories. Several applications would benefit from the knowledge
of this hierarchy. Here we apply a recent method to check the differences and
similarities of hierarchies resulting from tags given by independent
individuals and from tags given by a centrally managed repository system. The
results from our method showed substantial differences between the lower parts
of the hierarchies and, in contrast, a relatively high similarity at the top of
the hierarchies.
Comment: 10 pages
Statistical mechanics of ontology based annotations
We present a statistical mechanical theory of the process of annotating an
object with terms selected from an ontology. The term selection process is
formulated as an ideal lattice gas model, but in a highly structured
inhomogeneous field. The model enables us to explain patterns recently observed
in real-world annotation data sets, in terms of the underlying graph structure
of the ontology. By relating the external field strengths to the information
content of each node in the ontology graph, the statistical mechanical model
also allows us to propose a number of practical metrics for assessing the
quality of both the ontology and the annotations that arise from its use.
Using the statistical mechanical formalism we also study an ensemble of
ontologies of differing size and complexity; an analysis not readily performed
using real data alone. Focusing on regular tree ontology graphs we uncover a
rich set of scaling laws describing the growth in the optimal ontology size as
the number of objects being annotated increases. In doing so we provide a
further possible measure for assessment of ontologies.
Comment: 27 pages, 5 figures
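For orientation, a non-interacting ("ideal") lattice gas in a site-dependent field has a standard closed form. The sketch below assumes a Hamiltonian in which n_i marks whether ontology term i is selected and h_i is the local field at node i. The symbols h_i, n_i, and \beta are illustrative choices, not notation taken from the paper, and the paper's sign conventions may differ.

```latex
\begin{align}
  H(\{n_i\}) &= -\sum_i h_i n_i, \qquad n_i \in \{0, 1\}, \\
  Z &= \prod_i \left(1 + e^{\beta h_i}\right), \\
  \langle n_i \rangle &= \frac{e^{\beta h_i}}{1 + e^{\beta h_i}}
                       = \frac{1}{1 + e^{-\beta h_i}}.
\end{align}
```

Because the sites do not interact, the selection probability of each term depends only on its own field, which is why tying the field strength to a per-node quantity such as information content fully determines the model's annotation statistics.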