Optimal Substring-Equality Queries with Applications to Sparse Text Indexing

Author: Prezza Nicola
Publication date: 01/01/2020

We consider the problem of encoding a string of length $n$ from an integer alphabet of size $\sigma$ so that access and substring equality queries (that is, determining the equality of any two substrings) can be answered efficiently. Any uniquely-decodable encoding supporting access must take $n\log\sigma + \Theta(\log (n\log\sigma))$ bits. We describe a new data structure matching this lower bound when $\sigma\leq n^{O(1)}$ while supporting both queries in optimal $O(1)$ time. Furthermore, we show that the string can be overwritten in-place with this structure. The redundancy of $\Theta(\log n)$ bits and the constant query time break exponentially a lower bound that is known to hold in the read-only model. Using our new string representation, we obtain the first in-place subquadratic (indeed, even sublinear in some cases) algorithms for several string-processing problems in the restore model: the input string is rewritable and must be restored before the computation terminates. In particular, we describe the first in-place subquadratic Monte Carlo solutions to the sparse suffix sorting, sparse LCP array construction, and suffix selection problems. With the sole exception of suffix selection, our algorithms are also the first running in sublinear time for small enough sets of input suffixes. Combining these solutions, we obtain the first sublinear-time Monte Carlo algorithm for building the sparse suffix tree in compact space. We also show how to derandomize our algorithms using small space. This leads to the first Las Vegas in-place algorithm computing the full LCP array in $O(n\log n)$ time and to the first Las Vegas in-place algorithms solving the sparse suffix sorting and sparse LCP array construction problems in $O(n^{1.5}\sqrt{\log \sigma})$ time. Running times of these Las Vegas algorithms hold in the worst case with high probability.

Comment: Refactored according to TALG's reviews. New w.h.p. bounds and Las Vegas algorithm
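
To make the notion of a substring-equality query in the abstract above concrete, here is a minimal Python sketch. It is not the paper's data structure: it assumes a textbook Karp-Rabin scheme with prefix fingerprints, uses O(n) machine words rather than the near-optimal space claimed above, and the class name `SubstringEquality`, the modulus, and the base range are illustrative choices. Like the Monte Carlo algorithms mentioned in the abstract, it may report two different substrings as equal with small probability, but never the reverse.

```python
import random

# Assumption: a 61-bit Mersenne prime as modulus, chosen only for illustration.
MOD = (1 << 61) - 1


class SubstringEquality:
    """Monte Carlo substring-equality queries via prefix fingerprints."""

    def __init__(self, text, seed=None):
        rng = random.Random(seed)
        self.base = rng.randrange(256, MOD - 1)  # random evaluation point
        n = len(text)
        self.prefix = [0] * (n + 1)  # prefix[i] = fingerprint of text[:i]
        self.power = [1] * (n + 1)   # power[i] = base**i mod MOD
        for i, ch in enumerate(text):
            self.prefix[i + 1] = (self.prefix[i] * self.base + ord(ch) + 1) % MOD
            self.power[i + 1] = (self.power[i] * self.base) % MOD

    def fingerprint(self, i, length):
        """Fingerprint of text[i:i+length], computed in O(1) arithmetic steps."""
        return (self.prefix[i + length] - self.prefix[i] * self.power[length]) % MOD

    def equal(self, i, j, length):
        """True if text[i:i+length] == text[j:j+length] (with high probability)."""
        return self.fingerprint(i, length) == self.fingerprint(j, length)
```

For example, with `s = SubstringEquality("abracadabra")`, the call `s.equal(0, 7, 4)` compares the two occurrences of `abra` using two subtractions and two multiplications, independent of the substring length.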

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari

Fast Scalable Construction of (Minimal Perfect Hash) Functions

Author: A Goerdt
AM Frieze
AM Odlyzko
BA LaMacchia
BS Majewski
D Belazzougui
FC Botelho
M Aumüller
M Dietzfelbinger
N Fountoulakis
Publication date: 22/03/2016

Recent advances in random linear systems on finite fields have paved the way for the construction of constant-time data structures representing static functions and minimal perfect hash functions using less space than existing techniques. The main obstruction for any practical application of these results is the cubic-time Gaussian elimination required to solve these linear systems: although the systems can be made very small, the computation is still too slow to be feasible. In this paper we describe in detail a number of heuristics and programming techniques to speed up the resolution of these systems by several orders of magnitude, making the overall construction competitive with the standard and widely used MWHC technique, which is based on hypergraph peeling. In particular, we introduce broadword programming techniques for fast equation manipulation and a lazy Gaussian elimination algorithm. We also describe a number of technical improvements to the data structure which further reduce space usage and improve lookup speed. Our implementation of these techniques yields a minimal perfect hash function data structure occupying 2.24 bits per element, compared to 2.68 for MWHC-based ones, and a static function data structure which reduces the multiplicative overhead from 1.23 to 1.03.
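
As a rough illustration of why packing equations into machine words helps, the sketch below solves a linear system by Gaussian elimination with each equation stored as one integer bitmask, so that adding two equations is a single XOR. It rests on assumptions not stated in the abstract: the field is taken to be GF(2), the elimination is the plain (not lazy) variant, and the function name `solve_gf2` and its argument layout are hypothetical.

```python
def solve_gf2(rows, rhs, num_vars):
    """Solve A x = b over GF(2).

    rows[i] is a bitmask of the variables in equation i (bit c = variable c),
    rhs[i] is its right-hand side (0 or 1). Returns one solution as a list of
    0/1 values with free variables set to 0, or None if the system is
    inconsistent.
    """
    # Fold the right-hand side into bit `num_vars`, so one XOR updates both sides.
    eqs = [r | (b << num_vars) for r, b in zip(rows, rhs)]
    pivots = []  # (column, reduced equation) pairs
    for col in range(num_vars):
        bit = 1 << col
        # Find a not-yet-used equation that contains this variable.
        idx = next((k for k, e in enumerate(eqs) if e & bit), None)
        if idx is None:
            continue  # no pivot: the variable is free
        pivot = eqs.pop(idx)
        # Eliminate the variable from all other equations with one XOR each.
        eqs = [e ^ pivot if e & bit else e for e in eqs]
        pivots = [(c, p ^ pivot if p & bit else p) for c, p in pivots]
        pivots.append((col, pivot))
    if any(eqs):  # a leftover equation reads 0 = 1
        return None
    x = [0] * num_vars
    for col, p in pivots:
        x[col] = (p >> num_vars) & 1
    return x
```

Calling `solve_gf2([0b011, 0b110, 0b101], [1, 0, 1], 3)` encodes the system x0+x1=1, x1+x2=0, x0+x2=1. Python integers are arbitrary-precision, so the same code handles rows wider than a machine word, at the cost of the constant factors a real broadword implementation would aim to avoid.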

arXiv.org e-Print Archive

Crossref

New Applications of the Nearest-Neighbor Chain Algorithm

Author: Mamano Grande Nil
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

The nearest-neighbor chain algorithm was proposed in the eighties as a way to speed up certain hierarchical clustering algorithms. In the first part of the dissertation, we show that its application is not limited to clustering. We apply it to a variety of geometric and combinatorial problems. In each case, we show that the nearest-neighbor chain algorithm finds the same solution as a preexisting greedy algorithm, but often with an improved runtime. We obtain speedups over greedy algorithms for Euclidean TSP, Steiner TSP in planar graphs, straight skeletons, a geometric coverage problem, and three stable matching models. In the second part, we study the stable-matching Voronoi diagram, a type of plane partition which combines properties of stable matchings and Voronoi diagrams. We propose political redistricting as an application. We also show that it is impossible to compute this diagram in an algebraic model of computation, and give three algorithmic approaches to overcome this obstacle. One of them is based on the nearest-neighbor chain algorithm, linking the two parts together.
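
For readers unfamiliar with the technique named in the abstract above, the following Python sketch shows the nearest-neighbor chain idea in its original clustering setting, not in the dissertation's new applications. It assumes complete linkage (one of the linkages for which the chain stays valid after a merge) and a dense distance matrix; the function name `nn_chain` and the cluster-numbering scheme are illustrative.

```python
def nn_chain(dist):
    """Agglomerative clustering with complete linkage via nearest-neighbor chains.

    dist is a symmetric n x n matrix (list of lists). Initial clusters are
    numbered 0..n-1 and each merge creates the next unused number.
    Returns the merges as (cluster_a, cluster_b, distance) in the order found.
    """
    n = len(dist)
    d = {i: {j: dist[i][j] for j in range(n) if j != i} for i in range(n)}
    merges, chain, next_id = [], [], n
    while len(d) > 1:
        if not chain:
            chain.append(next(iter(d)))  # start a new chain anywhere
        while True:
            top = chain[-1]
            prev = chain[-2] if len(chain) > 1 else None
            # Nearest neighbor of the chain's top; ties go to the predecessor.
            nearest = min(d[top], key=lambda c: (d[top][c], c != prev))
            if nearest == prev:
                break  # top and prev are mutual nearest neighbors
            chain.append(nearest)
        b, a = chain.pop(), chain.pop()
        merges.append((a, b, d[a][b]))
        # Complete linkage: distance to the merged cluster is the larger one.
        row = {c: max(d[a][c], d[b][c]) for c in d if c != a and c != b}
        del d[a], d[b]
        for c, value in row.items():
            del d[c][a], d[c][b]
            d[c][next_id] = value
        d[next_id] = row
        next_id += 1
    return merges
```

The key point is that the chain is not discarded after a merge: the remaining prefix is reused, which is what lets the method avoid repeatedly searching for the globally closest pair as a naive greedy algorithm would.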

eScholarship - University of California

The resolved star-formation relation in nearby active galactic nuclei

Author: Casasola Viviana
Combes Francoise
Garcia-Burillo Santiago
Hunt Leslie
Publication venue: EDP Sciences
Publication date: 01/01/2015

We present an analysis of the relation between star formation rate (SFR) surface density ($\Sigma_{\rm SFR}$) and mass surface density of molecular gas ($\Sigma_{\rm H_2}$), commonly referred to as the Kennicutt-Schmidt (K-S) relation, at its intrinsic spatial scale, i.e. the size of giant molecular clouds (10-150 pc), in the central, high-density regions of four nearby low-luminosity active galactic nuclei (AGN). We used interferometric IRAM CO(1-0) and CO(2-1), and SMA CO(3-2) emission line maps to derive $\Sigma_{\rm H_2}$ and HST H$\alpha$ images to estimate $\Sigma_{\rm SFR}$. Each galaxy is characterized by a distinct molecular SF relation at spatial scales between 20 and 200 pc. The K-S relations can be sub-linear, but also super-linear, with slopes ranging from 0.5 to 1.3. Depletion times range from 1 to 2 Gyr, compatible with results for nearby normal galaxies. These findings are valid independently of which transition, CO(1-0), CO(2-1), or CO(3-2), is used to derive $\Sigma_{\rm H_2}$. Because of star-formation feedback, the lifetime of clouds, turbulent cascade, or magnetic fields, the K-S relation might be expected to degrade on small spatial scales (<100 pc). However, we find no clear evidence for this, even on scales as small as 20 pc, and this might be because of the higher density of GMCs in galaxy centers which have to resist higher shear forces. The proportionality between $\Sigma_{\rm H_2}$ and $\Sigma_{\rm SFR}$ found between 10 and 100 $M_\odot\,\mathrm{pc}^{-2}$ is valid even at high densities, $10^3\,M_\odot\,\mathrm{pc}^{-2}$. However, by adopting a common CO-to-H2 conversion factor ($\alpha_{\rm CO}$), the central regions of the galaxies have higher $\Sigma_{\rm SFR}$ for a given gas column than those expected from the models, with a behavior that lies between the mergers/high-redshift starburst systems and the more quiescent star-forming galaxies, assuming that the former require a lower value of $\alpha_{\rm CO}$.

Comment: 22 pages, 8 figures, Accepted for publication in Astronomy and Astrophysic
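
For orientation, the quantities in the abstract above are conventionally related as follows, writing $\Sigma_{\rm SFR}$ for the SFR surface density and $\Sigma_{\rm H_2}$ for the molecular gas surface density. The power-law form, the normalization $A$, and the worked numbers are standard conventions assumed here, not values taken from the paper.

```latex
% K-S relation as a power law with slope N (0.5 to 1.3 in the abstract above)
\Sigma_{\rm SFR} = A \, \Sigma_{\rm H_2}^{\,N}

% Depletion time: how long the gas would last at the current SFR
\tau_{\rm dep} = \frac{\Sigma_{\rm H_2}}{\Sigma_{\rm SFR}}

% Illustrative arithmetic: 100 M_sun pc^-2 of gas depleted in 1 Gyr implies
\Sigma_{\rm SFR} = \frac{100\,M_\odot\,{\rm pc}^{-2}}{10^{9}\,{\rm yr}}
                 = 10^{-7}\,M_\odot\,{\rm yr}^{-1}\,{\rm pc}^{-2}
                 = 0.1\,M_\odot\,{\rm yr}^{-1}\,{\rm kpc}^{-2}
```

A slope of $N = 1$ corresponds to a depletion time that is independent of gas density, which is the sense in which the abstract speaks of proportionality between the two surface densities.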

arXiv.org e-Print Archive

Crossref

EDP Sciences OAI-PMH repository (1.2.0)

HAL-INSU

OA@INAF - Istituto Nazionale di Astrofisica

HAL-OBSPM

Hal-Diderot