Using text analysis to quantify the similarity and evolution of scientific disciplines
We use an information-theoretic measure of linguistic similarity to
investigate the organization and evolution of scientific fields. An analysis of
almost 20M papers from the past three decades reveals that linguistic
similarity is related to, but distinct from, expert- and citation-based
classifications, leading to an improved view of the organization of science. A
temporal analysis of the similarity of fields shows that some fields (e.g.,
computer science) are becoming increasingly central, but that on average the
similarity between pairs of fields has not changed in the last few decades. This
suggests that tendencies of convergence (e.g., multi-disciplinarity) and
divergence (e.g., specialization) of disciplines are in balance.
Comment: 9 pages, 4 figures
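The abstract only says "an information-theoretic measure" of linguistic similarity; one common choice for comparing word-frequency distributions is the Jensen-Shannon divergence, sketched below as an assumption rather than the paper's exact measure.

```python
# Sketch: Jensen-Shannon divergence between two fields' word-frequency
# distributions. ASSUMPTION: the paper does not specify its measure; JSD
# (symmetric, bounded in [0, 1] with log base 2) is used here for illustration.
from collections import Counter
from math import log2

def word_dist(texts):
    """Normalized word-frequency distribution over a list of documents."""
    counts = Counter(w for t in texts for w in t.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def jsd(p, q):
    """Jensen-Shannon divergence between two distributions (dicts)."""
    words = set(p) | set(q)
    m = {w: 0.5 * (p.get(w, 0.0) + q.get(w, 0.0)) for w in words}
    def kl(a):  # KL divergence from a to the mixture m
        return sum(pw * log2(pw / m[w]) for w, pw in a.items() if pw > 0)
    return 0.5 * kl(p) + 0.5 * kl(q)
```

Identical vocabularies give a divergence of 0 and fully disjoint ones give 1, so 1 − JSD can serve as a similarity score between fields.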
Identifying Clusters in Bayesian Disease Mapping
Disease mapping is the field of spatial epidemiology concerned with estimating
the spatial pattern in disease risk across areal units. One aim is to
identify units exhibiting elevated disease risks, so that public health
interventions can be made. Bayesian hierarchical models with a spatially smooth
conditional autoregressive prior are used for this purpose, but they cannot
identify the spatial extent of high-risk clusters. Therefore we propose a
two-stage solution to this problem, with the first stage being a spatially
adjusted hierarchical agglomerative clustering algorithm. This algorithm is
applied to data prior to the study period, and produces potential cluster
structures for the disease data. The second stage fits a separate Poisson
log-linear model to the study data for each cluster structure, which allows for
step-changes in risk where two clusters meet. The most appropriate cluster
structure is chosen by model comparison techniques, specifically by minimising
the Deviance Information Criterion. The efficacy of the methodology is
established by a simulation study, and is illustrated by a study of respiratory
disease risk in Glasgow, Scotland.
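The first stage constrains agglomerative clustering so that only spatially adjacent clusters can merge. A minimal sketch of that idea, assuming a simple merge criterion (smallest difference in mean rate); the paper's actual dissimilarity and linkage may differ:

```python
# Sketch: spatially constrained hierarchical agglomerative clustering.
# ASSUMPTION: merges the adjacent pair of clusters with the most similar
# mean disease rate; illustrative only, not the paper's exact algorithm.

def spatial_agglomerative(rates, adjacency, n_clusters):
    """rates: per-unit disease rates; adjacency: set of (i, j) neighbour pairs."""
    clusters = {i: {i} for i in range(len(rates))}  # each unit starts alone

    def mean_rate(c):
        return sum(rates[u] for u in clusters[c]) / len(clusters[c])

    def are_adjacent(a, b):  # clusters touch if any member units are neighbours
        return any((u, v) in adjacency or (v, u) in adjacency
                   for u in clusters[a] for v in clusters[b])

    while len(clusters) > n_clusters:
        best = None
        ids = list(clusters)
        for i, a in enumerate(ids):
            for b in ids[i + 1:]:
                if are_adjacent(a, b):
                    d = abs(mean_rate(a) - mean_rate(b))
                    if best is None or d < best[0]:
                        best = (d, a, b)
        _, a, b = best
        clusters[a] |= clusters.pop(b)  # merge the closest adjacent pair
    return list(clusters.values())
```

Cutting the resulting dendrogram at several heights yields the candidate cluster structures that stage two compares via the DIC.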
Local Variation as a Statistical Hypothesis Test
The goal of image oversegmentation is to divide an image into several pieces,
each of which should ideally be part of an object. One of the simplest and yet
most effective oversegmentation algorithms is known as local variation (LV)
(Felzenszwalb and Huttenlocher 2004). In this work, we study this algorithm and
show that algorithms similar to LV can be devised by applying different
statistical models and decisions, thus providing further theoretical
justification and a well-founded explanation for the unexpectedly high
performance of the LV approach. Some of these algorithms are based on
statistics of natural images and on a hypothesis testing decision; we denote
these algorithms probabilistic local variation (pLV). The best pLV algorithm,
which relies on censored estimation, presents state-of-the-art results while
keeping the same computational complexity as the LV algorithm.
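The LV algorithm of Felzenszwalb and Huttenlocher processes graph edges in increasing weight order and merges two components when the connecting edge is no heavier than the minimum internal difference of the components plus a size-dependent slack k/|C|. A minimal sketch on a 1-D signal (the pLV hypothesis-testing variants described in the abstract are not reproduced here):

```python
# Sketch: the local-variation (Felzenszwalb & Huttenlocher 2004) merge rule
# on a 1-D signal, using union-find; illustrative, not the pLV variant.

def local_variation_segment(values, k=2.0):
    n = len(values)
    parent = list(range(n))
    size = [1] * n
    internal = [0.0] * n  # Int(C): max edge weight inside each component

    def find(x):  # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Edges between neighbouring samples, weighted by intensity difference,
    # processed in increasing weight order (Kruskal-style).
    edges = sorted((abs(values[i + 1] - values[i]), i, i + 1)
                   for i in range(n - 1))
    for w, a, b in edges:
        ra, rb = find(a), find(b)
        if ra == rb:
            continue
        # Merge only if w <= MInt = min over both components of Int(C) + k/|C|.
        if w <= min(internal[ra] + k / size[ra], internal[rb] + k / size[rb]):
            parent[rb] = ra
            size[ra] += size[rb]
            internal[ra] = max(internal[ra], internal[rb], w)
    return [find(i) for i in range(n)]
```

Sorting the edges dominates the cost, giving the O(m log m) complexity that the pLV variants preserve.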
- …