Entropy-scaling search of massive biological data
Many datasets exhibit a well-defined structure that can be exploited to
design faster search tools, but it is not always clear when such acceleration
is possible. Here, we introduce a framework for similarity search based on
characterizing a dataset's entropy and fractal dimension. We prove that
searching scales in time with metric entropy (number of covering hyperspheres),
if the fractal dimension of the dataset is low, and scales in space with the
sum of metric entropy and information-theoretic entropy (randomness of the
data). Using these ideas, we present accelerated versions of standard tools,
with no loss in specificity and little loss in sensitivity, for use in three
domains---high-throughput drug screening (Ammolite, 150x speedup), metagenomics
(MICA, 3.5x speedup of DIAMOND [3,700x BLASTX]), and protein structure search
(esFragBag, 10x speedup of FragBag). Our framework can be used to achieve
"compressive omics," and the general theory can be readily applied to data
science problems outside of biology.
Comment: Including supplement: 41 pages, 6 figures, 4 tables, 1 bo
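As a rough illustration of why search time can scale with the number of covering hyperspheres, the sketch below groups points into spheres of radius r_c and then answers a radius-r query in two stages: a coarse pass over the sphere centers, followed by a fine pass inside the spheres that survive. This is not the implementation used in Ammolite, MICA, or esFragBag. The function names, the greedy covering strategy, and the Euclidean metric are all assumptions made for the example.

```python
import math


def build_cover(points, r_c):
    """Greedily cover the points with spheres of radius r_c.

    Each point joins the first existing sphere whose center is within r_c,
    otherwise it becomes the center of a new sphere.
    Returns a list of (center, members) pairs.
    """
    clusters = []
    for p in points:
        for center, members in clusters:
            if math.dist(p, center) <= r_c:
                members.append(p)
                break
        else:
            clusters.append((p, [p]))
    return clusters


def search(clusters, query, r, r_c):
    """Return every stored point within distance r of the query.

    Coarse stage: by the triangle inequality, a point within r of the query
    must belong to a sphere whose center is within r + r_c of the query,
    so all other spheres are skipped without inspecting their members.
    Fine stage: check the members of the remaining spheres exactly.
    """
    hits = []
    for center, members in clusters:
        if math.dist(query, center) <= r + r_c:
            hits.extend(m for m in members if math.dist(query, m) <= r)
    return hits
```

In this sketch the coarse stage costs one distance computation per sphere, so the work depends on how many spheres are needed to cover the data rather than on the raw number of points. The result is exact for a true metric, since the triangle inequality guarantees that no matching point is skipped.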
Wavelets and their use
This review paper is intended to give a useful guide for those who want to
apply discrete wavelets in their practice. The notion of wavelets and their use
in practical computing and various applications are briefly described, but
rigorous proofs of mathematical statements are omitted, and the reader is just
referred to corresponding literature. The multiresolution analysis and fast
wavelet transform became a standard procedure for dealing with discrete
wavelets. The proper choice of a wavelet and use of nonstandard matrix
multiplication are often crucial for achievement of a goal. Analysis of various
functions with the help of wavelets allows one to reveal fractal structures,
singularities, etc. Wavelet transform of operator expressions helps solve some
equations. In practical applications one often deals with discretized
functions, and the problem of stability of wavelet transform and corresponding
numerical algorithms becomes important. After discussing all these topics we
turn to practical applications of the wavelet machinery. They are so numerous
that we have to limit ourselves to a few examples. The authors would be
grateful for any comments which improve this review paper and move us closer to
the goal proclaimed in the first phrase of the abstract.
Comment: 63 pages with 22 ps-figures, to be published in Physics-Uspekh
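To make the fast wavelet transform mentioned in this abstract concrete, here is a minimal sketch using the Haar wavelet, which is the simplest discrete wavelet. The abstract does not name a particular wavelet, so the choice of Haar, the function names, and the restriction to signals whose length is a power of two are assumptions made for the example.

```python
import math


def haar_forward(signal):
    """Multilevel orthonormal Haar transform of a power-of-two-length signal.

    At each level the current approximation is split into pairwise scaled
    sums (coarser approximation) and scaled differences (detail).
    Output layout: [coarsest approximation, details from coarse to fine].
    """
    out = [float(x) for x in signal]
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    s = math.sqrt(2.0)
    length = n
    while length > 1:
        half = length // 2
        approx = [(out[2 * i] + out[2 * i + 1]) / s for i in range(half)]
        detail = [(out[2 * i] - out[2 * i + 1]) / s for i in range(half)]
        out[:length] = approx + detail
        length = half
    return out


def haar_inverse(coeffs):
    """Invert haar_forward by rebuilding finer approximations level by level."""
    out = [float(c) for c in coeffs]
    n = len(out)
    s = math.sqrt(2.0)
    length = 1
    while length < n:
        merged = []
        for a, d in zip(out[:length], out[length:2 * length]):
            merged.append((a + d) / s)
            merged.append((a - d) / s)
        out[:2 * length] = merged
        length *= 2
    return out
```

Each level halves the amount of data still to be processed, so the total work is proportional to the signal length. Because this version of the transform is orthonormal, the sum of squared coefficients equals the sum of squared samples.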
Entropy computing via integration over fractal measures
We discuss the properties of invariant measures corresponding to iterated
function systems (IFSs) with place-dependent probabilities and compute their
Renyi entropies, generalized dimensions, and multifractal spectra. It is shown
that with certain dynamical systems one can associate the corresponding IFSs in
such a way that their generalized entropies are equal. This provides a new
method of computing entropy for some classical and quantum dynamical systems.
Numerical techniques are based on integration over the fractal measures.
Comment: 14 pages in LaTeX, RevTeX + 4 figures in .ps attached (revised
version, new title, several changes, to appear in CHAOS)
Herding as a Learning System with Edge-of-Chaos Dynamics
Herding defines a deterministic dynamical system at the edge of chaos. It
generates a sequence of model states and parameters by alternating parameter
perturbations with state maximizations, where the sequence of states can be
interpreted as "samples" from an associated MRF model. Herding differs from
maximum likelihood estimation in that the sequence of parameters does not
converge to a fixed point and differs from an MCMC posterior sampling approach
in that the sequence of states is generated deterministically. Herding may be
interpreted as a "perturb and map" method where the parameter perturbations are
generated using a deterministic nonlinear dynamical system rather than randomly
from a Gumbel distribution. This chapter studies the distinct statistical
characteristics of the herding algorithm and shows that the fast convergence
rate of the controlled moments may be attributed to edge of chaos dynamics. The
herding algorithm can also be generalized to models with latent variables and
to a discriminative learning setting. The perceptron cycling theorem ensures
that the fast moment matching property is preserved in the more general
framework.
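The alternation of state maximizations and parameter updates described in this abstract can be sketched for a small finite state space as follows. This is an illustrative toy rather than the chapter's own code. The function name `herd`, the explicit enumeration of states, and the zero initial weights are assumptions made for the example.

```python
def herd(features, target_moments, steps):
    """Run herding over a finite state space.

    features: one feature vector phi(s) per state s.
    target_moments: the moments that the generated states should match.
    Returns the deterministic state sequence and the final weight vector.
    """
    w = [0.0] * len(target_moments)
    samples = []
    for _ in range(steps):
        # State maximization: choose the state whose features align best
        # with the current weights (ties go to the lowest state index).
        s = max(
            range(len(features)),
            key=lambda i: sum(wi * fi for wi, fi in zip(w, features[i])),
        )
        samples.append(s)
        # Parameter update: add the target moments and subtract the
        # features of the chosen state. No randomness is involved.
        w = [wi + mi - fi for wi, mi, fi in zip(w, target_moments, features[s])]
    return samples, w
```

With one-hot features, the target moments are simply the desired state frequencies. The weights stay bounded in that setting, so the gap between empirical and target frequencies shrinks in proportion to 1/T after T steps, which is faster than the 1/sqrt(T) rate typical of independent random sampling.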
A weak local irregularity property in S^\nu spaces
Although it has been shown that, from the prevalence point of view, the
elements of the S^\nu spaces are almost surely multifractal, we show here that
they also almost surely satisfy a weak uniform irregularity property.
For the Jubilee of Vladimir Mikhailovich Chernov
On April 25, 2019, Vladimir Chernov celebrated his 70th birthday. He is a Doctor of Physics and Mathematics, Chief Researcher at the Laboratory of Mathematical Methods of Image Processing of the Image Processing Systems Institute of the Russian Academy of Sciences (IPSI RAS), a branch of the Federal Science Research Center "Crystallography and Photonics" RAS, and part-time Professor at the Department of Geoinformatics and Information Security of the Samara National Research University named after Academician S.P. Korolev (Samara University). The article briefly describes the scientific and pedagogical achievements of the honoree. © Published under licence by IOP Publishing Ltd