Recency Queries with Succinct Representation
In the context of the sliding-window set membership problem, and caching policies that require knowledge of item recency, we formalize the problem of Recency on a stream. Informally, the query asks, "when was the last time I saw item x?" Existing structures, such as hash tables, can support a recency query by augmenting item occurrences with timestamps. To support recency queries on a window of W items, this might require Θ(W log W) bits.
We propose a succinct data structure for Recency. By combining sliding-window dictionaries in a hierarchical structure, and careful design of the underlying hash tables, we achieve a data structure that returns a 1+ε approximation to the recency of every item in O(log(ε W)) time, in only (1+o(1))(1+ε)(ℬ + W log(ε^(-1))) bits. Here, ℬ is the information-theoretic lower bound on the number of bits for a set of size W, in a universe of cardinality N.
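The timestamp-augmented hash table mentioned in this abstract as the existing baseline can be sketched as follows. This is only a minimal illustration of that baseline, not the succinct structure the authors propose; the class name `TimestampRecency`, its method names, and the convention that recency counts arrivals since the last occurrence (0 for the newest item) are assumptions made for the example.

```python
from collections import deque
from typing import Deque, Dict, Hashable, Optional, Tuple


class TimestampRecency:
    """Baseline sketch: hash table mapping each item to its last arrival time.

    Hypothetical illustration only. It stores one full timestamp per item,
    which is the space overhead the abstract says a succinct design avoids.
    """

    def __init__(self, window: int) -> None:
        self.window = window                      # W: arrivals kept in the window
        self.clock = 0                            # number of arrivals seen so far
        self.last_seen: Dict[Hashable, int] = {}  # item -> time of latest arrival
        self.arrivals: Deque[Tuple[int, Hashable]] = deque()

    def insert(self, item: Hashable) -> None:
        self.clock += 1
        self.last_seen[item] = self.clock
        self.arrivals.append((self.clock, item))
        # Evict arrivals that have slid out of the window of the last W items.
        while self.arrivals and self.arrivals[0][0] <= self.clock - self.window:
            t, old = self.arrivals.popleft()
            if self.last_seen.get(old) == t:      # no newer occurrence in the window
                del self.last_seen[old]

    def recency(self, item: Hashable) -> Optional[int]:
        """Arrivals since `item` was last seen, or None if outside the window."""
        t = self.last_seen.get(item)
        return None if t is None else self.clock - t


tracker = TimestampRecency(window=3)
for symbol in "abca":
    tracker.insert(symbol)
print(tracker.recency("a"), tracker.recency("b"))
```

The exact answers come at the cost of a timestamp per stored item; the abstract's approach trades exactness for a 1+ε approximation to save space.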
Bloom Filters in Adversarial Environments
Many efficient data structures use randomness, allowing them to improve upon
deterministic ones. Usually, their efficiency and correctness are analyzed
using probabilistic tools under the assumption that the inputs and queries are
independent of the internal randomness of the data structure. In this work, we
consider data structures in a more robust model, which we call the adversarial
model. Roughly speaking, this model allows an adversary to choose inputs and
queries adaptively according to previous responses. Specifically, we consider a
data structure known as "Bloom filter" and prove a tight connection between
Bloom filters in this model and cryptography.
A Bloom filter represents a set S of elements approximately, by using fewer
bits than a precise representation. The price for succinctness is allowing some
errors: for any x ∈ S it should always answer 'Yes', and for any x ∉ S it should answer 'Yes' only with small probability.
In the adversarial model, we consider both efficient adversaries (that run in
polynomial time) and computationally unbounded adversaries that are only
bounded in the number of queries they can make. For computationally bounded
adversaries, we show that non-trivial (memory-wise) Bloom filters exist if and
only if one-way functions exist. For unbounded adversaries we show that there
exists a Bloom filter for sets of size n and error ε, that is
secure against t queries and uses only O(n log(1/ε) + t)
bits of memory. In comparison, n log(1/ε) is the best
possible under a non-adaptive adversary.
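The one-sided error described in this abstract (members always answer 'Yes', non-members answer 'Yes' only rarely) can be illustrated with a classical Bloom filter. The sketch below is the textbook construction, not the adversarially robust one from the paper; the class name `BloomFilter`, the parameters `num_bits` and `num_hashes`, and the use of salted SHA-256 as the hash family are assumptions chosen for the example. Because the hash functions here are public and fixed, an adaptive adversary could search for false positives, which is the kind of weakness the adversarial model is concerned with.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Textbook Bloom filter sketch (hypothetical names, not the paper's scheme)."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)  # all bits start at 0

    def _positions(self, item: str) -> Iterator[int]:
        # Derive num_hashes bit positions by salting SHA-256 with the index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item: str) -> bool:
        # 'Yes' only if every position for this item is set.
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


bf = BloomFilter(num_bits=1024, num_hashes=4)
for i in range(50):
    bf.add(f"member-{i}")
print("member-7" in bf)  # an inserted item is never reported absent
```

Setting bits only ever turns a 0 into a 1, so an inserted element can never be reported absent; false positives arise when other insertions happen to cover all of a non-member's positions.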
Probabilistic data types
Integrated master's dissertation in Informatics Engineering.
Conflict-Free Replicated Data Types (CRDTs) provide deterministic outcomes from concurrent
executions. The conflict resolution mechanism uses information on the ordering of the last
operations performed, which indicates if a given operation is known by a replica, typically
using some variant of version vectors. This thesis will explore the construction of CRDTs
that use a novel stochastic mechanism that can track with high accuracy knowledge of the
occurrence of recently performed operations and with less accuracy for older operations.
The aim is to obtain better scaling properties and avoid the use of metadata that is linear in
the number of replicas.
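For context, the version-vector mechanism that this abstract takes as its starting point can be sketched as below. This shows the conventional deterministic approach, not the stochastic mechanism the thesis explores; the type aliases `VersionVector` and `Dot` and the function names `knows` and `merge` are hypothetical names chosen for the example.

```python
from typing import Dict, Tuple

# Hypothetical types for illustration:
# a version vector maps each replica id to the number of its operations seen,
# and a dot (replica id, sequence number) identifies a single operation.
VersionVector = Dict[str, int]
Dot = Tuple[str, int]


def knows(vv: VersionVector, dot: Dot) -> bool:
    """Return True if the operation identified by `dot` is covered by `vv`."""
    replica, seq = dot
    return vv.get(replica, 0) >= seq


def merge(a: VersionVector, b: VersionVector) -> VersionVector:
    """Combine the knowledge of two replicas by taking the entry-wise maximum."""
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in a.keys() | b.keys()}


replica_a = {"r1": 3, "r2": 1}
replica_b = {"r2": 4, "r3": 2}
print(knows(replica_a, ("r1", 2)), knows(replica_a, ("r3", 1)))
print(merge(replica_a, replica_b))
```

Each vector holds one entry per replica that has ever issued an operation, which is the linear metadata cost the thesis aims to avoid.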
Encapsulated Search Index: Public-Key, Sub-linear, Distributed, and Delegatable
We build the first sub-linear (in fact, potentially constant-time) public-key searchable encryption system:
− server can publish a public key PK.
− anybody can build an encrypted index for document D under PK.
− client holding the index can obtain a token z_w from the server to check if a keyword w belongs to D.
− search using z_w is almost as fast (e.g., sub-linear) as the non-private search.
− server granting the token does not learn anything about the document D, beyond the keyword w.
− yet, the token z_w is specific to the pair (D, w): the client does not learn if other keywords w' ≠ w belong to D, or if w belongs to other, freshly indexed documents D'.
− server cannot fool the client by giving a wrong token z_w.
We call such a primitive Encapsulated Search Index (ESI). Our ESI scheme can be made (t, n)-distributed among n servers in the best possible way: non-interactive, verifiable, and resilient to any coalition of up to (t − 1) malicious servers. We also introduce the notion of delegatable ESI and show how to extend our construction to this setting.
Our solution, including public indexing, sub-linear search, delegation, and distributed token generation, is deployed as a commercial application by Atakama.