Scalable Mining of Common Routes in Mobile Communication Network Traffic Data
A probabilistic method for inferring common routes from mobile communication network traffic data is presented. Besides providing mobility information, valuable in a multitude of application areas, the method has the dual purpose of enabling efficient coarse-graining as well as anonymisation by mapping individual sequences onto common routes. The approach is to represent spatial trajectories by Cell ID sequences that are grouped into routes using locality-sensitive hashing and graph clustering. The method is demonstrated to be scalable, and to accurately group sequences using an evaluation set of GPS-tagged data.
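The grouping step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: it assumes MinHash over consecutive Cell ID pairs as the locality-sensitive hash, and it swaps in plain connected components (union-find) for the graph clustering step. The function names, the banding parameters, and the toy Cell IDs are all hypothetical.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingles(seq, k=2):
    """Set of k-grams of consecutive cell IDs (assumes len(seq) >= k)."""
    return {tuple(seq[i:i + k]) for i in range(len(seq) - k + 1)}


def stable_hash(seed, shingle):
    """Deterministic 64-bit hash; Python's built-in hash() is salted per run."""
    data = f"{seed}|{'/'.join(map(str, shingle))}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(seq, num_hashes=16, k=2):
    """One minimum per seeded hash function over the sequence's shingles."""
    grams = shingles(seq, k)
    return tuple(min(stable_hash(seed, g) for g in grams)
                 for seed in range(num_hashes))


def group_routes(sequences, num_hashes=16, bands=8, k=2):
    """Group sequences that share at least one LSH band (hypothetical helper)."""
    rows = num_hashes // bands
    parent = list(range(len(sequences)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Sequences whose signatures agree on a whole band land in the same bucket.
    buckets = defaultdict(list)
    for idx, seq in enumerate(sequences):
        sig = minhash_signature(seq, num_hashes, k)
        for band in range(bands):
            buckets[(band, sig[band * rows:(band + 1) * rows])].append(idx)

    # Candidate pairs become graph edges; connected components become routes.
    for members in buckets.values():
        for i, j in combinations(members, 2):
            parent[find(i)] = find(j)

    groups = defaultdict(list)
    for idx in range(len(sequences)):
        groups[find(idx)].append(idx)
    return sorted(groups.values())


trips = [
    ["A", "B", "C", "D"],
    ["A", "B", "C", "D"],
    ["X", "Y", "Z", "W"],
]
print(group_routes(trips))
```

Identical sequences always share every band and therefore end up in the same group, while sequences with no shingle in common do not collide. Sequences that overlap only partially collide with a probability that grows with their Jaccard similarity, which is what makes the bucketing step cheap compared with all-pairs comparison.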
Viral antibody dynamics in a chiropteran host
1. Bats host many viruses that are significant for human and domestic animal health, but the dynamics of these infections in their natural reservoir hosts remain poorly elucidated.
2. In these, and other, systems, there is evidence that seasonal life-cycle events drive infection dynamics, directly impacting the risk of exposure to spillover hosts. Understanding these dynamics improves our ability to predict zoonotic spillover from the reservoir hosts.
3. To this end, we followed henipavirus antibody levels of >100 individual E. helvum in a closed, captive, breeding population over a 30-month period, using a powerful novel antibody quantitation method.
4. We demonstrate the presence of maternal antibodies in this system and accurately determine their longevity. We also present evidence of population-level persistence of viral infection and demonstrate periods of increased horizontal virus transmission associated with the pregnancy/lactation period.
5. The novel findings of infection persistence and the effect of pregnancy on viral transmission, as well as an accurate quantitation of chiropteran maternal antiviral antibody half-life, provide fundamental baseline data for the continued study of viral infections in these important reservoir hosts.
Fractal-like Distributions over the Rational Numbers in High-throughput Biological and Clinical Data
Recent developments in extracting and processing biological and clinical data are allowing quantitative approaches to studying living systems. High-throughput sequencing, expression profiles, proteomics, and electronic health records are some examples of such technologies. Extracting meaningful information from those technologies requires careful analysis of the large volumes of data they produce. In this note, we present a set of distributions that commonly appear in the analysis of such data. These distributions present some interesting features: they are discontinuous in the rational numbers, but continuous in the irrational numbers, and possess a certain self-similar (fractal-like) structure. The first set of examples which we present here are drawn from a high-throughput sequencing experiment. Here, the self-similar distributions appear as part of the evaluation of the error rate of the sequencing technology and the identification of tumorigenic genomic alterations. The other examples are obtained from risk factor evaluation and analysis of relative disease prevalence and comorbidity as these appear in electronic clinical data. The distributions are also relevant to identification of subclonal populations in tumors and the study of the evolution of infectious diseases, and more precisely the study of quasi-species and intrahost diversity of viral populations.
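A toy computation can give a feel for how spiky, self-similar structure over the rationals may arise. The sketch below assumes, purely for illustration and not as the note's own construction, that the observed quantity is a ratio k/n of two integer counts with n bounded (for example, supporting reads over read depth). It counts how often each reduced fraction occurs when every pair (k, n) is enumerated once; `ratio_counts` and `max_depth` are hypothetical names.

```python
from collections import Counter
from fractions import Fraction


def ratio_counts(max_depth):
    """Count how many integer pairs (k, n), 0 <= k <= n <= max_depth,
    reduce to each rational value k/n."""
    counts = Counter()
    for depth in range(1, max_depth + 1):
        for k in range(depth + 1):
            counts[Fraction(k, depth)] += 1  # Fraction reduces automatically
    return counts


counts = ratio_counts(12)
for value in (Fraction(1, 2), Fraction(1, 3), Fraction(5, 12)):
    print(value, counts[value])
```

Fractions with small denominators are hit by many (k, n) pairs and produce tall spikes, while their close neighbours with large denominators are hit rarely. Increasing `max_depth` adds ever finer spikes between the existing ones, which is the kind of pattern the abstract describes as fractal-like.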
You can't see what you can't see: Experimental evidence for how much relevant information may be missed due to Google's Web search personalisation
The influence of Web search personalisation on professional knowledge work is
an understudied area. Here we investigate how public sector officials
self-assess their dependency on the Google Web search engine, whether they are
aware of the potential impact of algorithmic biases on their ability to
retrieve all relevant information, and how much relevant information may
actually be missed due to Web search personalisation. We find that the majority
of participants in our experimental study are neither aware that there is a
potential problem nor do they have a strategy to mitigate the risk of missing
relevant information when performing online searches. Most significantly, we
provide empirical evidence that up to 20% of relevant information may be missed
due to Web search personalisation. This work has significant implications for
Web research by public sector professionals, who should be provided with
training about the potential algorithmic biases that may affect their judgments
and decision making, as well as clear guidelines on how to minimise the risk of
missing relevant information.
Comment: paper submitted to the 11th Intl. Conf. on Social Informatics;
revision corrects error in interpretation of parameter Psi/p in RBO resulting
from discrepancy between the documentation of the implementation in R
(https://rdrr.io/bioc/gespeR/man/rbo.html) and the original definition
(https://dl.acm.org/citation.cfm?id=1852106) as per 20/05/201
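The comment above refers to the parameter p of rank-biased overlap (RBO), a similarity measure for ranked lists. As a hedged illustration of what that parameter does, the sketch below evaluates the RBO sum truncated at a fixed depth, RBO ≈ (1 - p) Σ_d p^(d-1) A_d, where A_d is the fraction of items shared by the two top-d prefixes. It is not the gespeR implementation and does not extrapolate beyond the evaluated depth; `rbo_truncated` is a hypothetical name, and the rankings are assumed to contain no duplicate items.

```python
def rbo_truncated(ranking_a, ranking_b, p=0.9, depth=None):
    """Rank-biased overlap summed over the first `depth` ranks only.

    p is the persistence parameter (0 < p < 1): values close to 1 spread
    weight over deeper ranks, small values concentrate it on the top results.
    """
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    if depth is None:
        depth = min(len(ranking_a), len(ranking_b))

    seen_a, seen_b = set(), set()
    total = 0.0
    for d in range(1, depth + 1):
        seen_a.add(ranking_a[d - 1])
        seen_b.add(ranking_b[d - 1])
        agreement = len(seen_a & seen_b) / d  # overlap of the two top-d prefixes
        total += p ** (d - 1) * agreement
    return (1 - p) * total


same = ["r1", "r2", "r3", "r4", "r5"]
swapped = ["r2", "r1", "r3", "r4", "r5"]
print(rbo_truncated(same, same), rbo_truncated(same, swapped))
```

Because the sum is cut off, two identical lists of length k score 1 - p^k, not 1, so truncated scores are best compared with each other at the same depth and the same p.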
Complexity transitions in global algorithms for sparse linear systems over finite fields
We study the computational complexity of a very basic problem, namely that of
finding solutions to a very large set of random linear equations in a finite
Galois Field modulo q. Using tools from statistical mechanics we are able to
identify phase transitions in the structure of the solution space and to
connect them to changes in performance of a global algorithm, namely Gaussian
elimination. Crossing phase boundaries produces a dramatic increase in the memory
and CPU requirements of the algorithm. In turn, this causes the
saturation of the upper bounds for the running time. We illustrate the results
on the specific problem of integer factorization, which is of central interest
for deciphering messages encrypted with the RSA cryptosystem.
Comment: 23 pages, 8 figures
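For readers unfamiliar with the global algorithm mentioned above, here is a minimal sketch of Gaussian elimination over a prime field GF(q). It assumes q is prime (so that every nonzero entry has a modular inverse) and uses dense Python lists; the sparse, very large systems studied in the paper would need a sparse data structure, which is where the memory growth discussed in the abstract becomes relevant. The function names are hypothetical, and `pow(x, -1, q)` requires Python 3.8 or later.

```python
def rank_mod_q(matrix, q):
    """Rank of an integer matrix over GF(q), q prime, by Gauss-Jordan elimination."""
    rows = [[x % q for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, q)  # modular inverse of the pivot
        rows[rank] = [(x * inv) % q for x in rows[rank]]
        # Clear this column in every other row.
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                factor = rows[r][col]
                rows[r] = [(x - factor * y) % q
                           for x, y in zip(rows[r], rows[rank])]
        rank += 1
        if rank == len(rows):
            break
    return rank


def count_homogeneous_solutions(matrix, q):
    """Number of solutions of A x = 0 over GF(q): q ** (n_vars - rank)."""
    n_vars = len(matrix[0])
    return q ** (n_vars - rank_mod_q(matrix, q))


A = [[1, 1, 0],
     [0, 1, 1],
     [1, 0, 1]]
print(rank_mod_q(A, 2), count_homogeneous_solutions(A, 2))
print(rank_mod_q(A, 3), count_homogeneous_solutions(A, 3))
```

The same matrix has a different rank over different fields: over GF(2) the third row is the sum of the first two, whereas over GF(3) the rows are independent. The size of the solution space, q^(n - rank), is the quantity whose structure changes at the phase transitions the abstract refers to.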
Clustering and preferential attachment in growing networks
We study empirically the time evolution of scientific collaboration networks
in physics and biology. In these networks, two scientists are considered
connected if they have coauthored one or more papers together. We show that the
probability of scientists collaborating increases with the number of other
collaborators they have in common, and that the probability of a particular
scientist acquiring new collaborators increases with the number of his or her
past collaborators. These results provide experimental evidence in favor of
previously conjectured mechanisms for clustering and power-law degree
distributions in networks.
Comment: 13 pages, 2 figures
Minimizing energy below the glass thresholds
Focusing on the optimization version of the random K-satisfiability problem,
the MAX-K-SAT problem, we study the performance of the finite energy version of
the Survey Propagation (SP) algorithm. We show that a simple (linear time)
backtrack decimation strategy is sufficient to reach configurations well below
the lower bound for the dynamic threshold energy and very close to the analytic
prediction for the optimal ground states. A comparative numerical study on one
of the most efficient local search procedures is also given.
Comment: 12 pages, submitted to Phys. Rev. E, accepted for publication
Network robustness and fragility: Percolation on random graphs
Recent work on the internet, social networks, and the power grid has
addressed the resilience of these networks to either random or targeted
deletion of network nodes. Such deletions include, for example, the failure of
internet routers or power transmission lines. Percolation models on random
graphs provide a simple representation of this process, but have typically been
limited to graphs with Poisson degree distribution at their vertices. Such
graphs are quite unlike real world networks, which often possess power-law or
other highly skewed degree distributions. In this paper we study percolation on
graphs with completely general degree distribution, giving exact solutions for
a variety of cases, including site percolation, bond percolation, and models in
which occupation probabilities depend on vertex degree. We discuss the
application of our theory to the understanding of network resilience.
Comment: 4 pages, 2 figures
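As a small worked example of why the degree distribution matters, the sketch below evaluates a standard result for random graphs with a prescribed degree distribution: when nodes (or edges) are occupied uniformly at random, a giant cluster first appears at occupation probability q_c = <k> / (<k^2> - <k>). The abstract does not quote this formula, so it is offered here as background under that assumption; `critical_occupation` and the example distributions are hypothetical.

```python
def critical_occupation(degree_dist):
    """Critical uniform occupation probability q_c = <k> / (<k^2> - <k>).

    degree_dist maps degree k to its probability (or an unnormalised weight).
    A result above 1 means no giant cluster forms even at full occupation.
    """
    total = sum(degree_dist.values())
    mean_k = sum(k * w for k, w in degree_dist.items()) / total
    mean_k2 = sum(k * k * w for k, w in degree_dist.items()) / total
    excess = mean_k2 - mean_k
    if excess <= 0:
        return float("inf")  # no node has degree above 1: nothing can percolate
    return mean_k / excess


# Every node has degree 3.
print(critical_occupation({3: 1.0}))

# Skewed distribution: many low-degree nodes plus a few hubs.
print(critical_occupation({1: 0.6, 2: 0.3, 20: 0.1}))
```

The hubs inflate <k^2> and push q_c down, so a heavily skewed network keeps a giant cluster under much more random deletion than a homogeneous network with a comparable mean degree. This is the resilience-to-random-failure effect the abstract alludes to.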
Minimum spanning trees on random networks
We show that the geometry of minimum spanning trees (MST) on random graphs is
universal. Due to this geometric universality, we are able to characterise the
energy of MST using a scaling distribution found using uniform
disorder. We show that the MST energy for other disorder distributions is
simply related to this scaling distribution. We discuss the relationship to invasion
percolation (IP), to the directed polymer in a random medium (DPRM) and the
implications for the broader issue of universality in disordered systems.
Comment: 4 pages, 3 figures
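The distinction between the geometry of a minimum spanning tree and its energy (total weight) can be seen in a short sketch using Kruskal's algorithm, a standard MST method that the abstract does not itself name. Kruskal's algorithm only looks at the order of the edge weights, so any strictly increasing transformation of the weights leaves the selected tree unchanged while changing its energy. The graph and the function name `mst_energy` are hypothetical.

```python
def mst_energy(num_nodes, weighted_edges):
    """Kruskal's algorithm: return (total weight, list of tree edges).

    weighted_edges is an iterable of (u, v, weight) with nodes 0..num_nodes-1.
    """
    parent = list(range(num_nodes))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    energy = 0.0
    tree = []
    # Scan edges from lightest to heaviest, keeping those that join two components.
    for w, u, v in sorted((w, u, v) for u, v, w in weighted_edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            tree.append((u, v))
            energy += w
    return energy, tree


edges = [(0, 1, 0.1), (1, 2, 0.2), (0, 2, 0.3), (2, 3, 0.4)]
energy, tree = mst_energy(4, edges)
print(energy, tree)

# A monotone change of the disorder distribution: same tree, different energy.
squared = [(u, v, w ** 2) for u, v, w in edges]
print(mst_energy(4, squared))
```

In this toy case the tree edges are identical for both weight sets, and only the summed energy differs.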