Entropy-scaling search of massive biological data
Many datasets exhibit a well-defined structure that can be exploited to
design faster search tools, but it is not always clear when such acceleration
is possible. Here, we introduce a framework for similarity search based on
characterizing a dataset's entropy and fractal dimension. We prove that
searching scales in time with metric entropy (number of covering hyperspheres),
if the fractal dimension of the dataset is low, and scales in space with the
sum of metric entropy and information-theoretic entropy (randomness of the
data). Using these ideas, we present accelerated versions of standard tools,
with no loss in specificity and little loss in sensitivity, for use in three
domains---high-throughput drug screening (Ammolite, 150x speedup), metagenomics
(MICA, 3.5x speedup of DIAMOND [3,700x BLASTX]), and protein structure search
(esFragBag, 10x speedup of FragBag). Our framework can be used to achieve
"compressive omics," and the general theory can be readily applied to data
science problems outside of biology.
Comment: Including supplement: 41 pages, 6 figures, 4 tables, 1 bo
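The covering-hypersphere idea in this abstract can be illustrated with a small sketch. It groups points into spheres of radius `rc`, then answers a range query in two steps: a coarse comparison against the sphere centers, followed by a fine scan of only those spheres that the triangle inequality cannot rule out. The function names, the greedy covering, and the Euclidean metric are assumptions chosen for illustration; this is not the authors' implementation.

```python
import math

def dist(a, b):
    """Euclidean distance (assumed metric; any true metric would do)."""
    return math.dist(a, b)

def build_cover(points, rc):
    """Greedy covering: put each point in the first sphere whose center
    lies within rc, otherwise open a new sphere centered on the point."""
    clusters = []  # list of (center, members)
    for p in points:
        for center, members in clusters:
            if dist(p, center) <= rc:
                members.append(p)
                break
        else:
            clusters.append((p, [p]))
    return clusters

def range_search(clusters, q, r, rc):
    """Return every stored point within distance r of the query q."""
    hits = []
    for center, members in clusters:
        # Coarse step: each member m satisfies dist(m, center) <= rc, so
        # dist(q, m) <= r implies dist(q, center) <= r + rc.
        if dist(q, center) <= r + rc:
            # Fine step: scan only the spheres that passed the coarse test.
            hits.extend(m for m in members if dist(q, m) <= r)
    return hits

if __name__ == "__main__":
    pts = [(0, 0), (0.1, 0), (5, 5), (5.1, 5), (10, 0)]
    cover = build_cover(pts, rc=0.5)
    print(len(cover), range_search(cover, (0, 0.05), r=0.2, rc=0.5))
```

Because the coarse test is derived from the triangle inequality, the two-step search returns the same hits as a brute-force scan; the saving comes from skipping whole spheres.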
A universal approach for drainage basins
Drainage basins are essential to Geohydrology and Biodiversity. Defining
those regions in a simple, robust and efficient way is a constant challenge in
Earth Science. Here, we introduce a model to delineate multiple drainage basins
through an extension of the Invasion Percolation-Based Algorithm (IPBA). In
order to prove the potential of our approach, we apply it to real and
artificial datasets. We observe that the perimeter and area distributions of
basins and anti-basins display long tails extending over several orders of
magnitude and following approximately power-law behaviors. Moreover, the
exponents of these power laws depend on spatial correlations and are invariant
under the landscape orientation, not only for terrestrial but also for lunar and
Martian landscapes. The terrestrial and Martian results are statistically
identical, which suggests that a hypothetical Martian river would resemble
terrestrial rivers. Finally, we propose a theoretical value
for Hack's exponent based on the fractal dimension of watersheds. The value we
measure for Earth is close to our estimate. Our study suggests that Hack's law can
have its origin purely in the maximum and minimum lines of the landscapes.
Comment: 20 pages, 6 Figures, and 1 Tabl
Fractal dimension evolution and spatial replacement dynamics of urban growth
This paper presents a new perspective of looking at the relation between
fractals and chaos by means of cities. In particular, a principle of space filling
and spatial replacement is proposed to explain the fractal dimension of urban
form. The fractal dimension evolution of urban growth can be empirically
modeled with Boltzmann's equation. For the normalized data, Boltzmann's
equation is equivalent to the logistic function. The logistic equation can be
transformed into the well-known 1-dimensional logistic map, which is based on a
2-dimensional map suggesting spatial replacement dynamics of city development.
The 2-dimensional recurrence relations can be employed to generate
nonlinear dynamical behaviors such as bifurcation and chaos. We find
that, when fractal dimension growth follows the logistic curve, the
normalized dimension value is the ratio of space filling. If the rate of
spatial replacement (urban growth) is too high, periodic oscillations and
chaos will arise, and the city system will fall into disorder. The spatial
replacement dynamics can be extended to general replacement dynamics, and
bifurcation and chaos seem to be related to some kind of replacement process.
Comment: 17 pages, 5 figures, 2 table
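The transition from steady behavior to oscillation and chaos described in this abstract can be seen in the standard 1-dimensional logistic map, x_{n+1} = r x_n (1 - x_n). The sketch below is a minimal illustration: the parameter `r` stands in loosely for a growth or replacement rate, and the function name and default values are assumptions, not taken from the paper.

```python
def logistic_orbit(r, x0=0.2, n_transient=500, n_keep=8):
    """Iterate x -> r * x * (1 - x), discard the transient,
    and return the next n_keep values of the orbit."""
    x = x0
    for _ in range(n_transient):
        x = r * x * (1.0 - x)
    orbit = []
    for _ in range(n_keep):
        x = r * x * (1.0 - x)
        orbit.append(x)
    return orbit

if __name__ == "__main__":
    # Low rate: the orbit settles on the fixed point 1 - 1/r.
    # Higher rate: a period-2 oscillation appears.
    # Higher still: the orbit becomes irregular.
    for r in (2.8, 3.2, 3.9):
        print(r, [round(x, 4) for x in logistic_orbit(r)])
```

Comparing the three printed orbits shows the qualitative change: a single repeated value, two alternating values, then no visible repetition.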
The spatial distribution of star and cluster formation in M51
Aims. We study the connection between spatially resolved star formation and
young star clusters across the disc of M51. Methods. We combine star cluster
data based on B, V, and I-band Hubble Space Telescope ACS imaging, together
with new WFPC2 U-band photometry to derive ages, masses, and extinctions of
1580 resolved star clusters using SSP models. This data is combined with data
on the spatially resolved star formation rates and gas surface densities, as
well as Halpha and 20cm radio-continuum (RC) emission, which allows us to study
the spatial correlations between star formation and star clusters. Two-point
autocorrelation functions are used to study the clustering of star clusters as
a function of spatial scale and age. Results. We find that the clustering of
star clusters among themselves decreases both with spatial scale and age,
consistent with hierarchical star formation. The slopes of the autocorrelation
functions are consistent with projected fractal dimensions in the range of
1.2-1.6, similar to those found in other galaxies, which suggests that the
fractal dimension of hierarchical star formation is universal. Both star and
cluster formation peak at galactocentric radii of 2.5 and 5 kpc, which we
tentatively attribute to the presence of the 4:1 resonance and the co-rotation
radius. The positions of the youngest (<10 Myr) star clusters show the
strongest correlation with the spiral arms, Halpha, and the RC emission, and
these correlations decrease with age. The azimuthal distribution of clusters in
terms of kinematic age away from the spiral arms indicates that the majority of
the clusters formed 5-20 Myr before their parental gas cloud reached the centre
of the spiral arm.
Comment: 14 pages, 21 figures, accepted for publication in A&
Azimuthal Anisotropy in High Energy Nuclear Collision - An Approach based on Complex Network Analysis
Recently, the Visibility Graph, a complex-network-based method, has been applied
to confirm the scale-freeness and presence of fractal properties in the process
of multiplicity fluctuation. Analysis of data obtained from experiments on
hadron-nucleus and nucleus-nucleus interactions yields values of the
Power-of-Scale-freeness-of-Visibility-Graph (PSVG) parameter extracted from the
visibility graphs. Here, the relativistic nucleus-nucleus interaction data have
been analysed to detect azimuthal-anisotropy by extending the Visibility Graph
method and extracting the average clustering coefficient, one of the important
topological parameters, from the graph. Azimuthal-distributions corresponding
to different pseudorapidity-regions around the central-pseudorapidity value are
analysed utilising the parameter. Here we attempt to correlate the conventional
physical significance of this coefficient with respect to complex-network
systems, with some basic notions of particle production phenomenology, like
clustering and correlation. Earlier methods for detecting anisotropy in
azimuthal distributions were mostly based on the analysis of statistical
fluctuations. In this work, we attempt to find deterministic information
on the anisotropy in azimuthal distributions by precisely determining
a topological parameter from a complex network perspective.
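To make the two ingredients of this abstract concrete, the sketch below builds a visibility graph from a short numeric series and computes its average clustering coefficient. It assumes the natural visibility criterion (two samples are linked when every sample between them lies strictly below the straight line joining them); the abstract does not say which variant the authors use, and the function names are illustrative.

```python
from itertools import combinations

def visibility_graph(series):
    """Natural visibility graph: node i is the i-th sample; i and j are
    linked if all samples between them lie strictly below the line i-j."""
    n = len(series)
    adj = {i: set() for i in range(n)}
    for a in range(n):
        for b in range(a + 1, n):
            ya, yb = series[a], series[b]
            if all(series[c] < yb + (ya - yb) * (b - c) / (b - a)
                   for c in range(a + 1, b)):
                adj[a].add(b)
                adj[b].add(a)
    return adj

def average_clustering(adj):
    """Mean over nodes of (links among neighbours) / (possible links).
    Nodes with fewer than two neighbours contribute zero."""
    if not adj:
        return 0.0
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)

if __name__ == "__main__":
    # Hypothetical bin counts, e.g. particles per azimuthal-angle bin.
    counts = [3, 7, 2, 5, 9, 4, 6]
    print(average_clustering(visibility_graph(counts)))
```

The double loop with an inner scan is cubic in the series length, which is fine for short binned distributions but would need a faster construction for long series.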
Geometric Exponents of Dilute Logarithmic Minimal Models
The fractal dimensions of the hull, the external perimeter and of the red
bonds are measured through Monte Carlo simulations for dilute minimal models,
and compared with predictions from conformal field theory and SLE methods. The
dilute models used are those first introduced by Nienhuis. Their loop fugacity
is $\beta = -2\cos(\pi/\bar\kappa)$, where the parameter $\bar\kappa$ is linked to their
description through conformal loop ensembles. It is also linked to conformal
field theories through their central charges $c = 13 - 6(\bar\kappa +
\bar\kappa^{-1})$ and, for the minimal models of interest here, $\bar\kappa = p/p'$
where p and p' are two coprime integers. The geometric exponents of the hull
and external perimeter are studied for the pairs (p,p') = (1,1), (2,3), (3,4),
(4,5), (5,6), (5,7), and that of the red bonds for (p,p') = (3,4). Monte Carlo
upgrades are proposed for these models as well as several techniques to improve
their speeds. The measured fractal dimensions are obtained by extrapolation on
the lattice size $H, V \to \infty$. The extrapolating curves have large slopes;
despite these, the measured dimensions coincide with theoretical predictions up
to three or four digits. In some cases, the theoretical values lie slightly
outside the confidence intervals; explanations of these small discrepancies are
proposed.
Comment: 41 pages, 32 figures, added reference
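As a quick worked example, the two formulas quoted in this abstract, the loop fugacity beta = -2 cos(pi / kappa-bar) and the central charge c = 13 - 6 (kappa-bar + 1/kappa-bar) with kappa-bar = p/p', can be evaluated for the listed pairs (p, p'). The function names below are illustrative; exact fractions are used for c so that the rational values are visible.

```python
from fractions import Fraction
import math

def central_charge(p, p_prime):
    """c = 13 - 6 * (kappa + 1/kappa), with kappa = p / p', as an exact fraction."""
    kappa = Fraction(p, p_prime)
    return 13 - 6 * (kappa + 1 / kappa)

def loop_fugacity(p, p_prime):
    """beta = -2 * cos(pi / kappa) = -2 * cos(pi * p' / p)."""
    return -2.0 * math.cos(math.pi * p_prime / p)

if __name__ == "__main__":
    pairs = [(1, 1), (2, 3), (3, 4), (4, 5), (5, 6), (5, 7)]
    for p, pp in pairs:
        c = central_charge(p, pp)
        beta = loop_fugacity(p, pp)
        print(f"(p, p') = ({p}, {pp}): c = {c}, beta = {beta:.4f}")
```

For instance, the pair (3, 4) gives c = 1/2 and beta = 1, and the pair (2, 3) gives c = 0 and beta = 0 up to floating-point rounding.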