Fast evaluation of union-intersection expressions
We show how to represent sets in a linear space data structure such that
expressions involving unions and intersections of sets can be computed in a
worst-case efficient way. This problem has applications in e.g. information
retrieval and database systems. We mainly consider the RAM model of
computation, and sets of machine words, but also state our results in the I/O
model. On a RAM with word size , a special case of our result is that the
intersection of (preprocessed) sets, containing elements in total, can
be computed in expected time , where is the
number of elements in the intersection. If the first of the two terms
dominates, this is a factor faster than the standard solution of
merging sorted lists. We show a cell probe lower bound of time ,
meaning that our upper bound is nearly optimal for small . Our algorithm uses
a novel combination of approximate set representations and word-level
parallelism.
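For orientation, the "standard solution of merging sorted lists" that the abstract compares against can be sketched as follows. This is only the baseline, not the paper's data structure; the function names are hypothetical, and the inputs are assumed to be sorted lists of distinct integers.

```python
def intersect_two_sorted(a, b):
    """Intersect two sorted lists of distinct integers with a linear merge."""
    i, j, out = 0, 0, []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1  # a[i] cannot appear in b any more
        else:
            j += 1  # b[j] cannot appear in a any more
    return out


def intersect_sorted(sets):
    """Intersect several sorted lists by folding the pairwise merge."""
    if not sets:
        return []
    result = sets[0]
    for s in sets[1:]:
        result = intersect_two_sorted(result, s)
    return result
```

Each pairwise merge reads both inputs element by element, so the total work is linear in the number of elements; this is the cost the abstract's word-parallel approach aims to beat.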
Linear probing with constant independence
Hashing with linear probing dates back to the 1950s, and is among the most studied algorithms. In recent years it has become one of the most important hash table organizations since it uses the cache of modern computers very well. Unfortunately, previous analyses rely either on complicated and space-consuming hash functions, or on the unrealistic assumption of free access to a truly random hash function. Already Carter and Wegman, in their seminal paper on universal hashing, raised the question of extending their analysis to linear probing. However, we show in this paper that linear probing using a pairwise independent family may have expected logarithmic cost per operation. On the positive side, we show that 5-wise independence is enough to ensure constant expected time per operation. This resolves the question of finding a space- and time-efficient hash function that provably ensures good performance for linear probing.
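A minimal sketch of linear probing driven by a 5-wise independent hash family is given below. It assumes the textbook construction, a random polynomial of degree 4 over a prime field, which the abstract does not name. The class names, the prime `P`, and the fixed-capacity table are illustrative assumptions; keys are assumed to be non-negative integers smaller than `P`.

```python
import random

P = (1 << 61) - 1  # a Mersenne prime; keys are assumed to lie in [0, P)


class PolyHash5:
    """Random degree-4 polynomial over GF(P): 5-wise independent on [0, P)."""

    def __init__(self, table_size, rng=None):
        rng = rng or random.Random()
        self.coeffs = [rng.randrange(P) for _ in range(5)]
        self.m = table_size

    def __call__(self, key):
        acc = 0
        for c in self.coeffs:  # Horner evaluation of the polynomial
            acc = (acc * key + c) % P
        # Reducing mod m makes the output only approximately uniform.
        return acc % self.m


class LinearProbingTable:
    """Fixed-capacity open-addressing set with linear probing (no deletions)."""

    def __init__(self, capacity=16, rng=None):
        self.slots = [None] * capacity
        self.h = PolyHash5(capacity, rng)
        self.size = 0

    def insert(self, key):
        i = self.h(key)
        while self.slots[i] is not None:
            if self.slots[i] == key:
                return  # already present
            i = (i + 1) % len(self.slots)
        if self.size + 1 >= len(self.slots):
            # Keep one slot empty so that every probe sequence terminates.
            raise OverflowError("table full")
        self.slots[i] = key
        self.size += 1

    def contains(self, key):
        i = self.h(key)
        while self.slots[i] is not None:
            if self.slots[i] == key:
                return True
            i = (i + 1) % len(self.slots)
        return False
```

The hash function is stored as five coefficients, so its description takes constant space, which is the kind of space- and time-efficient family the abstract refers to.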
Simulating Uniform Hashing in Constant Time and Optimal Space
Many algorithms and data structures employing hashing have been analyzed under the uniform hashing assumption, i.e., the assumption that hash functions behave like truly random functions. In this paper it is shown how to implement hash functions that can be evaluated on a RAM in constant time, and behave like truly random functions on any set of n inputs, with high probability. The space needed to represent a function is O(n) words, which is the best possible (and a polynomial improvement compared to previous fast hash functions). As a consequence, a broad class of hashing schemes can be implemented to meet, with high probability, the performance guarantees of their uniform hashing analysis.
Continual Counting with Gradual Privacy Expiration
Differential privacy with gradual expiration models the setting where data
items arrive in a stream and at a given time the privacy loss guaranteed
for a data item seen at time is , where is a
monotonically non-decreasing function. We study the fundamental continual
counting problem where each data item consists of
a bit, and the algorithm needs to output at each time step the sum of all the
bits streamed so far. For a stream of length and privacy without
expiration, continual counting is possible with maximum (over all time steps)
additive error and the best known lower bound is
; closing this gap is a challenging open problem.
We show that the situation is very different for privacy with gradual
expiration by giving upper and lower bounds for a large set of expiration
functions . Specifically, our algorithm achieves an additive error of
for a large set of privacy expiration functions. We also
give a lower bound that shows that if is the additive error of any
-DP algorithm for this problem, then the product of and the
privacy expiration function after steps must be
. Our algorithm matches this lower bound as its
additive error is , even when .
Our empirical evaluation shows that we achieve a slowly growing privacy loss
with significantly smaller empirical privacy loss for large values of than
a natural baseline algorithm.
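As background for the continual counting problem described above, the following is a sketch of the classical binary-tree mechanism for private prefix sums, in the setting where privacy does not expire. It is not the algorithm of this paper, and the abstract does not say whether its baseline is this mechanism. The function names and the Laplace sampler are illustrative assumptions, and floating-point noise sampling like this is not suitable for a production privacy guarantee.

```python
import random


def laplace(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def binary_mechanism(bits, epsilon, rng=None):
    """Return noisy prefix sums of `bits` using dyadic-interval partial sums."""
    rng = rng or random.Random()
    levels = max(1, len(bits).bit_length())
    # Each bit falls in at most `levels` dyadic intervals, so each interval
    # sum receives Laplace noise of scale levels / epsilon.
    scale = levels / epsilon
    exact = [0] * levels    # exact sum of the open interval at each level
    noisy = [0.0] * levels  # noisy version released for that interval
    out = []
    for t, x in enumerate(bits, start=1):
        i = (t & -t).bit_length() - 1  # index of the lowest set bit of t
        # Merge all lower-level intervals plus the new bit into level i.
        exact[i] = sum(exact[:i]) + x
        for j in range(i):
            exact[j] = 0
            noisy[j] = 0.0
        noisy[i] = exact[i] + laplace(scale, rng)
        # The prefix [1, t] is the union of the intervals at the set bits of t.
        out.append(sum(noisy[j] for j in range(levels) if (t >> j) & 1))
    return out
```

Each output adds up at most `levels` noisy interval sums, which is where the polylogarithmic additive error of this classical approach comes from.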
Lectin-Dependent Enhancement of Ebola Virus Infection via Soluble and Transmembrane C-type Lectin Receptors
Mannose-binding lectin (MBL) is a key soluble effector of the innate immune system that recognizes pathogen-specific surface glycans. Surprisingly, low-producing MBL genetic variants that may predispose children and immunocompromised individuals to infectious diseases are more common than would be expected in human populations. Since certain immune defense molecules, such as immunoglobulins, can be exploited by invasive pathogens, we hypothesized that MBL might also enhance infections in some circumstances. Consequently, the low and intermediate MBL levels commonly found in human populations might be the result of balancing selection. Using model infection systems with pseudotyped and authentic glycosylated viruses, we demonstrated that MBL indeed enhances infection of Ebola, Hendra, Nipah and West Nile viruses in low complement conditions. Mechanistic studies with Ebola virus (EBOV) glycoprotein pseudotyped lentiviruses confirmed that MBL binds to N-linked glycan epitopes on viral surfaces in a specific manner via the MBL carbohydrate recognition domain, which is necessary for enhanced infection. MBL mediates lipid-raft-dependent macropinocytosis of EBOV via a pathway that appears to require less actin or early endosomal processing compared with the filovirus canonical endocytic pathway. Using a validated RNA interference screen, we identified C1QBP (gC1qR) as a candidate surface receptor that mediates MBL-dependent enhancement of EBOV infection. We also identified dectin-2 (CLEC6A) as a potentially novel candidate attachment factor for EBOV. Our findings support the concept of an innate immune haplotype that represents critical interactions between MBL and complement component C4 genes and that may modify susceptibility or resistance to certain glycosylated pathogens. 
Therefore, higher levels of native or exogenous MBL could be deleterious in the setting of relative hypocomplementemia, which can occur genetically or because of immunodepletion during active infections. Our findings confirm our hypothesis that the pressure of infectious diseases may have contributed in part to evolutionary selection of MBL mutant haplotypes.