    Entropy-scaling search of massive biological data

    Many datasets exhibit a well-defined structure that can be exploited to design faster search tools, but it is not always clear when such acceleration is possible. Here, we introduce a framework for similarity search based on characterizing a dataset's entropy and fractal dimension. We prove that searching scales in time with metric entropy (number of covering hyperspheres), if the fractal dimension of the dataset is low, and scales in space with the sum of metric entropy and information-theoretic entropy (randomness of the data). Using these ideas, we present accelerated versions of standard tools, with no loss in specificity and little loss in sensitivity, for use in three domains: high-throughput drug screening (Ammolite, 150x speedup), metagenomics (MICA, 3.5x speedup of DIAMOND [3,700x BLASTX]), and protein structure search (esFragBag, 10x speedup of FragBag). Our framework can be used to achieve "compressive omics," and the general theory can be readily applied to data science problems outside of biology. Comment: Including supplement: 41 pages, 6 figures, 4 tables, 1 bo
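The coarse-then-fine idea behind searching over covering hyperspheres can be sketched as follows. This is a minimal illustration, not the Ammolite, MICA, or esFragBag code: it assumes Euclidean points, a greedy cover, and hypothetical function names (`build_clusters`, `search`). Unlike the tools in the abstract, it is exact for a true metric, because the triangle inequality guarantees that no cluster containing a hit is skipped.

```python
import math

def build_clusters(points, radius):
    """Greedy cover: each point joins the first center within `radius`,
    otherwise it becomes a new center. The number of centers plays the
    role of the metric entropy at this radius."""
    centers, members = [], []
    for p in points:
        for i, c in enumerate(centers):
            if math.dist(p, c) <= radius:
                members[i].append(p)
                break
        else:
            centers.append(p)
            members.append([p])
    return centers, members

def search(query, centers, members, radius, r):
    """Return every stored point within distance r of `query`.

    Coarse step: a cluster can only contain a hit if its center lies
    within r + radius of the query (triangle inequality).
    Fine step: scan the members of the surviving clusters."""
    hits = []
    for c, pts in zip(centers, members):
        if math.dist(query, c) <= r + radius:
            hits.extend(p for p in pts if math.dist(query, p) <= r)
    return hits
```

The coarse step costs one distance evaluation per center, so the saving depends on how few clusters survive it, which is where the low fractal dimension assumption in the abstract comes in.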

    A universal approach for drainage basins

    Drainage basins are essential to geohydrology and biodiversity. Defining those regions in a simple, robust and efficient way is a constant challenge in Earth science. Here, we introduce a model to delineate multiple drainage basins through an extension of the Invasion Percolation-Based Algorithm (IPBA). In order to prove the potential of our approach, we apply it to real and artificial datasets. We observe that the perimeter and area distributions of basins and anti-basins display long tails extending over several orders of magnitude and following approximately power-law behaviors. Moreover, the exponents of these power laws depend on spatial correlations and are invariant under the landscape orientation, not only for terrestrial but also for lunar and Martian landscapes. The terrestrial and Martian results are statistically identical, which suggests that a hypothetical Martian river would be similar to terrestrial rivers. Finally, we propose a theoretical value for Hack's exponent based on the fractal dimension of watersheds, γ = D/2. We measure γ = 0.54 ± 0.01 for Earth, which is close to our estimate of γ ≈ 0.55. Our study suggests that Hack's law can have its origin purely in the maximum and minimum lines of the landscapes. Comment: 20 pages, 6 Figures, and 1 Tabl
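As a small worked step, the relation γ = D/2 can be read in both directions. The block below assumes the usual statement of Hack's law, with L the main-stream length and A the basin area (symbols not used in the abstract itself), and simply inverts the quoted estimate γ ≈ 0.55 to show the watershed fractal dimension it implies.

```latex
L \propto A^{\gamma}, \qquad \gamma = \frac{D}{2}
\quad\Longrightarrow\quad
D = 2\gamma \approx 2 \times 0.55 = 1.10 .
```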

    Fractal dimension evolution and spatial replacement dynamics of urban growth

    This paper presents a new perspective on the relation between fractals and chaos by means of cities. In particular, a principle of space filling and spatial replacement is proposed to explain the fractal dimension of urban form. The fractal dimension evolution of urban growth can be empirically modeled with Boltzmann's equation. For the normalized data, Boltzmann's equation is equivalent to the logistic function. The logistic equation can be transformed into the well-known 1-dimensional logistic map, which is based on a 2-dimensional map suggesting spatial replacement dynamics of city development. The 2-dimensional recurrence relations can be employed to generate nonlinear dynamical behaviors such as bifurcation and chaos. A discovery is made that, for fractal dimension growth following the logistic curve, the normalized dimension value is the ratio of space filling. If the rate of spatial replacement (urban growth) is too high, periodic oscillations and chaos will arise, and the city system will fall into disorder. The spatial replacement dynamics can be extended to general replacement dynamics, and bifurcation and chaos seem to be related to some kind of replacement process. Comment: 17 pages, 5 figures, 2 table
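The transition from a steady state to periodic oscillation and then chaos can be seen in the standard 1-dimensional logistic map, x_{n+1} = r x_n (1 - x_n). The sketch below is only an illustration of that textbook map: reading `x` as a normalized space-filling ratio and `r` as a growth rate is an assumption made here, and the function names are hypothetical.

```python
def logistic_orbit(r, x0=0.2, burn_in=2000, keep=64):
    """Iterate x -> r * x * (1 - x), discard the transient,
    and return the next `keep` values."""
    x = x0
    for _ in range(burn_in):
        x = r * x * (1.0 - x)
    orbit = []
    for _ in range(keep):
        x = r * x * (1.0 - x)
        orbit.append(x)
    return orbit

def count_distinct(orbit, tol=1e-6):
    """Count the values visited by the orbit, merging values closer
    than `tol`. A small count indicates a periodic cycle; a count near
    len(orbit) indicates irregular (chaotic) behavior."""
    distinct = []
    for x in orbit:
        if all(abs(x - y) > tol for y in distinct):
            distinct.append(x)
    return len(distinct)
```

For example, `count_distinct(logistic_orbit(2.8))` settles on a single value, while raising `r` past 3 produces a 2-cycle, then a 4-cycle, and eventually orbits that never repeat.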

    The spatial distribution of star and cluster formation in M51

    Aims. We study the connection between spatially resolved star formation and young star clusters across the disc of M51. Methods. We combine star cluster data based on B, V, and I-band Hubble Space Telescope ACS imaging, together with new WFPC2 U-band photometry, to derive ages, masses, and extinctions of 1580 resolved star clusters using SSP models. These data are combined with data on the spatially resolved star formation rates and gas surface densities, as well as Hα and 20 cm radio-continuum (RC) emission, which allows us to study the spatial correlations between star formation and star clusters. Two-point autocorrelation functions are used to study the clustering of star clusters as a function of spatial scale and age. Results. We find that the clustering of star clusters among themselves decreases both with spatial scale and age, consistent with hierarchical star formation. The slopes of the autocorrelation functions are consistent with projected fractal dimensions in the range of 1.2-1.6, which is similar to other galaxies, therefore suggesting that the fractal dimension of hierarchical star formation is universal. Both star and cluster formation peak at galactocentric radii of 2.5 and 5 kpc, which we tentatively attribute to the presence of the 4:1 resonance and the co-rotation radius. The positions of the youngest (<10 Myr) star clusters show the strongest correlation with the spiral arms, Hα, and the RC emission, and these correlations decrease with age. The azimuthal distribution of clusters in terms of kinematic age away from the spiral arms indicates that the majority of the clusters formed 5-20 Myr before their parental gas cloud reached the centre of the spiral arm. Comment: 14 pages, 21 figures, accepted for publication in A&
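One common way to turn pair statistics into a fractal dimension is the pair-count (Grassberger-Procaccia style) estimate sketched below. It is a swapped-in illustration, not the estimator used in the paper: it assumes projected 2D positions, ignores edge effects, and fits the slope of log C(r) against log r, where C(r) is the fraction of pairs closer than r. The function names and the choice of radii are hypothetical.

```python
import math

def pair_fraction(points, r):
    """Fraction of distinct point pairs separated by less than r."""
    n = len(points)
    close = sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if math.dist(points[i], points[j]) < r
    )
    return close / (n * (n - 1) / 2)

def correlation_dimension(points, radii):
    """Least-squares slope of log C(r) versus log r.

    For a point set with correlation dimension D2, C(r) grows roughly
    like r**D2 over the scaling range, so the slope estimates D2.
    Every radius must capture at least one pair, otherwise log(0) fails."""
    xs = [math.log(r) for r in radii]
    ys = [math.log(pair_fraction(points, r)) for r in radii]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den
```

Points spread evenly along a line give a slope close to 1, and a uniformly filled area gives a slope close to 2; a hierarchically clustered distribution falls in between.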

    Azimuthal Anisotropy in High Energy Nuclear Collision - An Approach based on Complex Network Analysis

    Recently, a complex-network-based method, the Visibility Graph, has been applied to confirm the scale-freeness and presence of fractal properties in the process of multiplicity fluctuation. Analysis of data obtained from experiments on hadron-nucleus and nucleus-nucleus interactions yields values of the Power-of-Scale-freeness-of-Visibility-Graph (PSVG) parameter extracted from the visibility graphs. Here, the relativistic nucleus-nucleus interaction data have been analysed to detect azimuthal anisotropy by extending the Visibility Graph method and extracting the average clustering coefficient, one of the important topological parameters, from the graph. Azimuthal distributions corresponding to different pseudorapidity regions around the central pseudorapidity value are analysed utilising the parameter. Here we attempt to correlate the conventional physical significance of this coefficient with respect to complex-network systems with some basic notions of particle production phenomenology, like clustering and correlation. Earlier methods for detecting anisotropy in azimuthal distribution were mostly based on the analysis of statistical fluctuation. In this work, we have attempted to find deterministic information on the anisotropy in azimuthal distribution by means of precise determination of a topological parameter from a complex network perspective.
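To make the two ingredients concrete, the sketch below builds a natural visibility graph from a numeric series and computes its average clustering coefficient. It assumes the standard natural-visibility criterion (two samples are linked when every sample between them lies strictly below the straight line joining them) and equally spaced samples; how the paper maps azimuthal distributions onto such a series is not specified here, and the function names are hypothetical.

```python
from itertools import combinations

def visibility_graph(series):
    """Adjacency sets of the natural visibility graph of `series`.
    Samples a < b are linked if every sample c between them satisfies
    y_c < y_b + (y_a - y_b) * (b - c) / (b - a)."""
    n = len(series)
    adj = {i: set() for i in range(n)}
    for a in range(n):
        for b in range(a + 1, n):
            ya, yb = series[a], series[b]
            if all(
                series[c] < yb + (ya - yb) * (b - c) / (b - a)
                for c in range(a + 1, b)
            ):
                adj[a].add(b)
                adj[b].add(a)
    return adj

def average_clustering(adj):
    """Mean local clustering coefficient. Nodes with fewer than two
    neighbours contribute zero."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)
```

A series with a dip in the middle, such as `[3, 1, 2]`, gives a fully connected triangle, whereas a central peak, such as `[1, 3, 1]`, blocks the view between the end points and leaves a simple path.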

    Geometric Exponents of Dilute Logarithmic Minimal Models

    The fractal dimensions of the hull, the external perimeter and of the red bonds are measured through Monte Carlo simulations for dilute minimal models, and compared with predictions from conformal field theory and SLE methods. The dilute models used are those first introduced by Nienhuis. Their loop fugacity is β = -2cos(π/κ̄), where the parameter κ̄ is linked to their description through conformal loop ensembles. It is also linked to conformal field theories through their central charges c = 13 - 6(κ̄ + κ̄^{-1}) and, for the minimal models of interest here, κ̄ = p/p' where p and p' are two coprime integers. The geometric exponents of the hull and external perimeter are studied for the pairs (p,p') = (1,1), (2,3), (3,4), (4,5), (5,6), (5,7), and that of the red bonds for (p,p') = (3,4). Monte Carlo upgrades are proposed for these models as well as several techniques to improve their speeds. The measured fractal dimensions are obtained by extrapolation on the lattice size H, V → ∞. The extrapolating curves have large slopes; despite these, the measured dimensions coincide with theoretical predictions up to three or four digits. In some cases, the theoretical values lie slightly outside the confidence intervals; explanations of these small discrepancies are proposed. Comment: 41 pages, 32 figures, added reference
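The two formulas in the abstract are easy to evaluate for the listed pairs. The sketch below assumes the reading that the parameter is kappa-bar = p/p', so that the central charge is 13 - 6(p/p' + p'/p) and the loop fugacity is -2cos(π p'/p); the function names are hypothetical. Exact fractions are used for the central charge so the familiar values come out without rounding.

```python
from fractions import Fraction
import math

def central_charge(p, p_prime):
    """c = 13 - 6 * (kappa_bar + 1 / kappa_bar) with kappa_bar = p / p'."""
    kappa_bar = Fraction(p, p_prime)
    return 13 - 6 * (kappa_bar + 1 / kappa_bar)

def loop_fugacity(p, p_prime):
    """beta = -2 * cos(pi / kappa_bar) = -2 * cos(pi * p' / p)."""
    return -2.0 * math.cos(math.pi * p_prime / p)

# Pairs (p, p') studied in the abstract.
PAIRS = [(1, 1), (2, 3), (3, 4), (4, 5), (5, 6), (5, 7)]

if __name__ == "__main__":
    for p, q in PAIRS:
        print((p, q), central_charge(p, q), round(loop_fugacity(p, q), 6))
```

For instance, (p,p') = (3,4) gives c = 1/2 with a loop fugacity of 1, and (2,3) gives c = 0.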