
    Entropy-scaling search of massive biological data

    Many datasets exhibit a well-defined structure that can be exploited to design faster search tools, but it is not always clear when such acceleration is possible. Here, we introduce a framework for similarity search based on characterizing a dataset's entropy and fractal dimension. We prove that searching scales in time with metric entropy (number of covering hyperspheres), if the fractal dimension of the dataset is low, and scales in space with the sum of metric entropy and information-theoretic entropy (randomness of the data). Using these ideas, we present accelerated versions of standard tools, with no loss in specificity and little loss in sensitivity, for use in three domains: high-throughput drug screening (Ammolite, 150x speedup), metagenomics (MICA, 3.5x speedup of DIAMOND [3,700x BLASTX]), and protein structure search (esFragBag, 10x speedup of FragBag). Our framework can be used to achieve "compressive omics," and the general theory can be readily applied to data science problems outside of biology.
    Comment: Including supplement: 41 pages, 6 figures, 4 tables, 1 bo
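
The coarse-then-fine idea behind searching over covering hyperspheres can be sketched as follows. This is a minimal illustration, not the authors' implementation: the greedy cover, the Euclidean metric (`math.dist`), and the names `build_cover` and `range_search` are all assumptions made for the example.

```python
import math


def build_cover(points, cluster_radius):
    """Greedily cover `points` with balls of radius `cluster_radius`.

    Returns a list of (center_index, member_indices) pairs. Every point lies
    within `cluster_radius` of the center of the cluster it is assigned to.
    """
    clusters = []
    for i, p in enumerate(points):
        for center, members in clusters:
            if math.dist(p, points[center]) <= cluster_radius:
                members.append(i)
                break
        else:
            # No existing ball covers this point, so it becomes a new center.
            clusters.append((i, [i]))
    return clusters


def range_search(points, clusters, cluster_radius, query, radius):
    """Return indices of all points within `radius` of `query`.

    Coarse step: keep only clusters whose center is within
    radius + cluster_radius of the query. By the triangle inequality, a
    cluster that fails this check cannot contain a hit.
    Fine step: check the members of the surviving clusters exactly.
    """
    hits = []
    for center, members in clusters:
        if math.dist(query, points[center]) <= radius + cluster_radius:
            hits.extend(
                i for i in members if math.dist(query, points[i]) <= radius
            )
    return sorted(hits)
```

In this sketch the coarse step touches one distance per cluster, so the work grows with the number of covering balls rather than with the number of points, as long as few clusters survive the filter.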

    Wavelets and their use

    This review paper is intended as a practical guide for those who want to apply discrete wavelets in their work. The notion of wavelets and their use in practical computing and various applications are briefly described, but rigorous proofs of mathematical statements are omitted, and the reader is referred to the corresponding literature. Multiresolution analysis and the fast wavelet transform have become a standard procedure for dealing with discrete wavelets. The proper choice of a wavelet and the use of nonstandard matrix multiplication are often crucial for achieving a goal. Analysis of various functions with the help of wavelets makes it possible to reveal fractal structures, singularities, etc. The wavelet transform of operator expressions helps solve some equations. In practical applications one often deals with discretized functions, and the stability of the wavelet transform and of the corresponding numerical algorithms becomes important. After discussing all these topics we turn to practical applications of the wavelet machinery. They are so numerous that we have to limit ourselves to a few examples. The authors would be grateful for any comments that improve this review paper and move us closer to the goal proclaimed in the first phrase of the abstract.
    Comment: 63 pages with 22 ps-figures, to be published in Physics-Uspekh
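
As a small illustration of the fast wavelet transform mentioned in this abstract, here is a sketch using the orthonormal Haar wavelet. The choice of Haar and the function names `haar_fwt` and `haar_ifwt` are assumptions for the example; the review itself covers wavelets in general and is not the source of this code.

```python
import math


def haar_fwt(signal):
    """Fast Haar wavelet transform of a signal whose length is a power of two.

    Returns (approx, details): `approx` holds the single coarsest
    approximation coefficient, and `details` lists the detail coefficients
    per level, finest level first.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    scale = math.sqrt(2.0)
    approx = [float(v) for v in signal]
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Differences capture local detail; sums carry the coarser signal.
        details.append([(a - b) / scale for a, b in pairs])
        approx = [(a + b) / scale for a, b in pairs]
    return approx, details


def haar_ifwt(approx, details):
    """Invert haar_fwt by rebuilding the signal from coarse to fine."""
    scale = math.sqrt(2.0)
    signal = list(approx)
    for detail in reversed(details):
        rebuilt = []
        for a, d in zip(signal, detail):
            rebuilt.append((a + d) / scale)
            rebuilt.append((a - d) / scale)
        signal = rebuilt
    return signal
```

Each level halves the number of approximation coefficients, so the total work is linear in the signal length.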

    Entropy computing via integration over fractal measures

    We discuss the properties of invariant measures corresponding to iterated function systems (IFSs) with place-dependent probabilities and compute their Renyi entropies, generalized dimensions, and multifractal spectra. It is shown that with certain dynamical systems one can associate the corresponding IFSs in such a way that their generalized entropies are equal. This provides a new method of computing entropy for some classical and quantum dynamical systems. Numerical techniques are based on integration over the fractal measures.
    Comment: 14 pages in Latex, Revtex + 4 figures in .ps attached (revised version, new title, several changes, to appear in CHAOS

    Herding as a Learning System with Edge-of-Chaos Dynamics

    Herding defines a deterministic dynamical system at the edge of chaos. It generates a sequence of model states and parameters by alternating parameter perturbations with state maximizations, where the sequence of states can be interpreted as "samples" from an associated MRF model. Herding differs from maximum likelihood estimation in that the sequence of parameters does not converge to a fixed point and differs from an MCMC posterior sampling approach in that the sequence of states is generated deterministically. Herding may be interpreted as a "perturb and map" method where the parameter perturbations are generated using a deterministic nonlinear dynamical system rather than randomly from a Gumbel distribution. This chapter studies the distinct statistical characteristics of the herding algorithm and shows that the fast convergence rate of the controlled moments may be attributed to edge of chaos dynamics. The herding algorithm can also be generalized to models with latent variables and to a discriminative learning setting. The perceptron cycling theorem ensures that the fast moment matching property is preserved in the more general framework.
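
The alternation of state maximization and parameter update described in this abstract can be sketched for a small discrete state space. This is a minimal illustration under stated assumptions: the states are enumerated explicitly, each state has a fixed feature vector, the weights start at zero, and the name `herd` is hypothetical. It is not taken from the chapter.

```python
def herd(features, target_moments, num_steps):
    """Generate a deterministic herding sequence of state indices.

    features[s] is the feature vector of state s, and target_moments is the
    moment vector the sequence should match on average.
    Each step: pick the state maximizing <weights, features[s]>, then update
    weights <- weights + target_moments - features[s].
    """
    weights = [0.0] * len(target_moments)  # assumed starting point
    states = []
    for _ in range(num_steps):
        scores = [
            sum(w * f for w, f in zip(weights, phi)) for phi in features
        ]
        # State maximization; ties go to the lowest state index.
        state = max(range(len(features)), key=scores.__getitem__)
        # Deterministic parameter update (no random perturbation).
        weights = [
            w + m - f
            for w, m, f in zip(weights, target_moments, features[state])
        ]
        states.append(state)
    return states
```

With one-hot features and a probability vector as `target_moments`, the empirical state frequencies of the generated sequence track the target probabilities, without any random number generation.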

    A weak local irregularity property in S^\nu spaces

    Although it has been shown that, from the prevalence point of view, the elements of the S^\nu spaces are almost surely multifractal, we show here that they also almost surely satisfy a weak uniform irregularity property.

    For the Jubilee of Vladimir Mikhailovich Chernov

    On April 25, 2019, Vladimir Chernov celebrated his 70th birthday. He is a Doctor of Physics and Mathematics, Chief Researcher at the Laboratory of Mathematical Methods of Image Processing of the Image Processing Systems Institute of the Russian Academy of Sciences (IPSI RAS), a branch of the Federal Science Research Center "Crystallography and Photonics" RAS, and part-time Professor at the Department of Geoinformatics and Information Security of the Samara National Research University named after Academician S.P. Korolev (Samara University). The article briefly describes the scientific and pedagogical achievements of the honoree. © Published under licence by IOP Publishing Ltd