
7 research outputs found

Parity Graph-Driven Read-Once Branching Programs and An Exponential Lower Bound for Integer Multiplication

Author: A Razborov
B Bollig
D Sieling
H Brosenne
I Wegener
J Gergov
JS Thathachar
M Ajtai
M Krause
N Alon
P Beame
P Savický
P Woelfel
RE Bryant
S Ponzio
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2002

Scalable Storage for Digital Libraries

Author: Mather Paul
Publication date: 01/10/2002

I propose a storage system optimised for digital libraries. Its key features are its heterogeneous scalability; its integration and exploitation of rich semantic metadata associated with digital objects; its use of a name space; and its aggressive performance optimisation in the digital library domain.

Computer Science Technical Reports @Virginia Tech

Nearly Optimal Static Las Vegas Succinct Dictionary

Author: Faith
Grossi Roberto
Jacobson Guy
Miltersen Peter Bro
Pătraşcu Mihai
Publication date: 31/08/2020

Given a set $S$ of $n$ (distinct) keys from key space $[U]$, each associated with a value from $\Sigma$, the \emph{static dictionary} problem asks to preprocess these (key, value) pairs into a data structure, supporting value-retrieval queries: for any given $x\in [U]$, $\mathtt{valRet}(x)$ must return the value associated with $x$ if $x\in S$, or return $\bot$ if $x\notin S$. The special case where $|\Sigma|=1$ is called the \emph{membership} problem. The "textbook" solution is to use a hash table, which occupies linear space and answers each query in constant time. On the other hand, the minimum possible space to encode all (key, value) pairs is only $\mathtt{OPT}:= \lceil\lg_2\binom{U}{n}+n\lg_2|\Sigma|\rceil$ bits, which could be much less. In this paper, we design a randomized dictionary data structure using $\mathtt{OPT}+\mathrm{poly}\lg n+O(\lg\lg\lg\lg\lg U)$ bits of space, and it has \emph{expected constant} query time, assuming the query algorithm can access an external lookup table of size $n^{0.001}$. The lookup table depends only on $U$, $n$ and $|\Sigma|$, and not the input. Previously, even for membership queries and $U\leq n^{O(1)}$, the best known data structure with constant query time requires $\mathtt{OPT}+n/\mathrm{poly}\lg n$ bits of space (Pagh [Pag01] and P\v{a}tra\c{s}cu [Pat08]); the best-known using $\mathtt{OPT}+n^{0.999}$ space has query time $O(\lg n)$; the only known non-trivial data structure with $\mathtt{OPT}+n^{0.001}$ space has $O(\lg n)$ query time and requires a lookup table of size $\geq n^{2.99}$ (!). Our new data structure answers open questions by P\v{a}tra\c{s}cu and Thorup [Pat08,Tho13]. We also present a scheme that compresses a sequence $X\in\Sigma^n$ to its zeroth order (empirical) entropy up to $|\Sigma|\cdot\mathrm{poly}\lg n$ extra bits, supporting decoding each $X_i$ in $O(\lg |\Sigma|)$ expected time.

Comment: preliminary version appeared in STOC'2
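To make the space bound in the abstract above concrete, here is a minimal Python sketch that evaluates the information-theoretic minimum OPT and contrasts it with the "textbook" hash-table query. The function names `opt_bits` and `val_ret` are hypothetical, and a Python `dict` merely stands in for the linear-space hash table; this is not the paper's succinct data structure.

```python
from math import comb

def opt_bits(U: int, n: int, sigma: int) -> int:
    """Return OPT = ceil(lg C(U, n) + n * lg sigma), in bits.

    Assumption: sigma is |Sigma|, the number of possible values.
    """
    # lg C(U, n) + n * lg sigma = lg(C(U, n) * sigma**n), so count the
    # distinct (key set, value assignment) inputs exactly as an integer.
    encodings = comb(U, n) * sigma ** n
    # For an integer N >= 1, ceil(lg N) equals (N - 1).bit_length().
    return (encodings - 1).bit_length()

def val_ret(table: dict, x):
    """Textbook hash-table query; None stands in for the 'not in S' symbol."""
    return table.get(x)

# Hypothetical toy instance: n = 4 keys from [U] with U = 16, |Sigma| = 2.
table = {3: "a", 7: "b", 11: "a", 12: "b"}
print(opt_bits(16, len(table), 2))  # 15 bits for the dictionary problem
print(opt_bits(16, len(table), 1))  # 11 bits for the membership special case
print(val_ret(table, 7), val_ret(table, 8))  # b None
```

Computing the bound with exact integer arithmetic avoids floating-point rounding near the ceiling, which matters once the binomial coefficient grows large.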

arXiv.org e-Print Archive

Crossref

Marca d'água para documentos via modulação de luminância (Document watermarking via luminance modulation)

Author: Borges Paulo Vinicius Koerich
Publication venue: Florianópolis, SC
Publication date: 01/01/2008

Thesis (doctorate) - Universidade Federal de Santa Catarina, Centro Tecnológico. Graduate Program in Electrical Engineering

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositório Institucional da UFSC

RCAAP - Repositório Científico de Acesso Aberto de Portugal

Algorithms incorporating concurrency and caching

Author: Fineman Jeremy T
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2009

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. Cataloged from PDF version of thesis. Includes bibliographical references (p. 189-203).

This thesis describes provably good algorithms for modern large-scale computer systems, including today's multicores. Designing efficient algorithms for these systems involves overcoming many challenges, including concurrency (dealing with parallel accesses to the same data) and caching (achieving good memory performance). This thesis includes two parallel algorithms that focus on testing for atomicity violations in a parallel fork-join program. These algorithms augment a parallel program with a data structure that answers queries about the program's structure, on the fly. Specifically, one data structure, called SP-ordered-bags, maintains the series-parallel relationships among threads, which is vital for uncovering race conditions (bugs) in the program. Another data structure, called XConflict, aids in detecting conflicts in a transactional-memory system with nested parallel transactions. For a program with work $T_1$ and span $T_\infty$, maintaining either data structure adds an overhead of $PT_\infty$ to the running time of the parallel program when executed on $P$ processors using an efficient scheduler, yielding a total runtime of $O(T_1/P + PT_\infty)$. For each of these data structures, queries can be answered in $O(1)$ time. This thesis also introduces the compressed sparse blocks (CSB) storage format for sparse matrices, which allows both $Ax$ and $A^{T}x$ to be computed efficiently in parallel, where $A$ is an $n \times n$ sparse matrix with $nnz > n$ nonzeros and $x$ is a dense $n$-vector. The parallel multiplication algorithm uses $\Theta(nnz)$ work and ... span, yielding a parallelism of ..., which is amply high for virtually any large matrix. Also addressing concurrency, this thesis considers two scheduling problems.
The first scheduling problem, motivated by transactional memory, considers randomized backoff when jobs have different lengths. I give an analysis showing that binary exponential backoff achieves makespan $V 2^{\Theta(\sqrt{\lg n})}$ with high probability, where $V$ is the total length of all $n$ contending jobs. This bound is significantly larger than when jobs are all the same size. A variant of exponential backoff, however, achieves makespan of ... with high probability. I also present the size-hashed backoff protocol, specifically designed for jobs having different lengths, that achieves makespan ... with high probability. The second scheduling problem considers scheduling $n$ unit-length jobs on $m$ unrelated machines, where each job may fail probabilistically. Specifically, an input consists of a set of $n$ jobs, a directed acyclic graph $G$ describing the precedence constraints among jobs, and a failure probability $q_{ij}$ for each job $j$ and machine $i$. The goal is to find a schedule that minimizes the expected makespan. I give an $O(\log\log(\min\{m, n\}))$-approximation for the case of independent jobs (when there are no precedence constraints) and an $O(\log(n + m)\log\log(\min\{m, n\}))$-approximation algorithm when precedence constraints form disjoint chains. This chain algorithm can be extended into one that supports precedence constraints that are trees, which worsens the approximation by another $\log(n)$ factor.

To address caching, this thesis includes several new variants of cache-oblivious dynamic dictionaries. A cache-oblivious dictionary fills the same niche as a classic B-tree, but it does so without tuning for particular memory parameters. Thus, cache-oblivious dictionaries optimize for all levels of a multilevel hierarchy and are more portable than traditional B-trees. I describe how to add concurrency to several previously existing cache-oblivious dictionaries.
I also describe two new data structures that achieve significantly cheaper insertions with a small overhead on searches. The cache-oblivious lookahead array (COLA) supports insertions/deletions and searches in $O((1/B)\log N)$ and $O(\log N)$ memory transfers, respectively, where $B$ is the block size, $M$ is the memory size, and $N$ is the number of elements in the data structure. The xDict supports these operations in $O((1/(\varepsilon B^{1-\varepsilon}))\log_B(N/M))$ and $O((1/\varepsilon)\log_B(N/M))$ memory transfers, respectively, where $0 < \varepsilon < 1$ is a tunable parameter. Also on caching, this thesis answers the question: what is the worst possible page-replacement strategy? The goal of this whimsical chapter is to devise an online strategy that achieves the highest possible fraction of page faults / cache misses as compared to the worst offline strategy. I show that there is no deterministic strategy that is competitive with the worst offline. I also give a randomized strategy based on the most recently used heuristic and show that it is the worst possible page-replacement policy. On a more serious note, I also show that direct mapping is, in some sense, a worst possible page-replacement policy.

Finally, this thesis includes a new algorithm, following a new approach, for the problem of maintaining a topological ordering of a dag as edges are dynamically inserted. The main result included here is an $O(n^2 \log n)$ algorithm for maintaining a topological ordering in the presence of up to $m < n(n - 1)/2$ edge insertions. In contrast, the previously best algorithm has a total running time of $O(\min\{m^{3/2}, n^{5/2}\})$. Although these algorithms are not parallel and do not exhibit particularly good locality, some of the data structural techniques employed in my solution are similar to others in this thesis.

by Jeremy T. Fineman. Ph.D.
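The cheap-insertion idea behind the cache-oblivious lookahead array mentioned in the abstract can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not the thesis's structure: the class name `BasicCOLA` is hypothetical, deletions are omitted, and the lookahead pointers that give the $O(\log N)$ search bound are left out, so a search here simply binary-searches every level.

```python
from bisect import bisect_left
from heapq import merge

class BasicCOLA:
    """Levels of sorted arrays; level k is either empty or holds 2**k keys."""

    def __init__(self):
        self.levels = []

    def insert(self, key):
        # Works like incrementing a binary counter: merge full levels
        # into a sorted run that is carried down to the next level.
        carry = [key]
        k = 0
        while True:
            if k == len(self.levels):
                self.levels.append([])
            if not self.levels[k]:
                self.levels[k] = carry
                return
            carry = list(merge(self.levels[k], carry))
            self.levels[k] = []
            k += 1

    def __contains__(self, key):
        # No lookahead pointers in this sketch: binary search per level.
        for level in self.levels:
            i = bisect_left(level, key)
            if i < len(level) and level[i] == key:
                return True
        return False

cola = BasicCOLA()
for key in [5, 3, 8, 1, 9]:
    cola.insert(key)
print([len(level) for level in cola.levels])  # [1, 0, 4]
print(8 in cola, 4 in cola)                   # True False
```

Each key takes part in at most one merge per level, and the merges are sequential scans, which is the intuition for why insertions are cheap in terms of memory transfers.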

DSpace@MIT

Nonoblivious hashing

Author: Alan Siegel
Amos Fiat
Jeanette P. Schmidt
Moni Naor
AJTAI M.
BORODIN A.
CARTER J. L.
CELIS P.
FIAT A.
FREDMAN M. L.
GONNET G. H.
MAIRSON H.G.
MEHLHORN K.
MUNRO J.I.
SCHMIDT J. P.
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref