The space complexity of inner product filters
Motivated by the problem of filtering candidate pairs in inner product
similarity joins we study the following inner product estimation problem: Given
parameters d ∈ ℕ, α > β ≥ 0 and unit vectors x, y ∈ ℝ^d, consider the task of distinguishing between the cases ⟨x, y⟩ ≤ β and ⟨x, y⟩ ≥ α, where ⟨x, y⟩ is the inner product of vectors x and y.
The goal is to distinguish these cases based on information on each vector
encoded independently in a bit string of the shortest length possible. In
contrast to much work on compressing vectors using randomized dimensionality
reduction, we seek to solve the problem deterministically, with no probability
of error. Inner product estimation can be solved in general via estimating
⟨x, y⟩ with an additive error bounded by ε = (α − β)/2. We show that d log₂(√d/ε) ± Θ(d) bits of information about each vector is necessary and
sufficient. Our upper bound is constructive and improves a known upper bound of
d log₂(1/ε) + O(d) by up to a factor of 2 when α − β is close
to 1. The lower bound holds even in a stronger model where one of the vectors
is known exactly, and an arbitrary estimation function is allowed.
Comment: To appear at ICDT 202
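To make the problem setting concrete, here is a minimal sketch of the naive deterministic baseline the paper improves on (this is not the paper's construction, and the function names are ours): round every coordinate of each unit vector to a fixed grid of b bits, independently and with no randomness, then estimate the inner product from the two encodings alone.

```python
def encode(x, bits_per_coord):
    """Deterministically quantize each coordinate of a unit vector.

    Naive baseline, not the paper's construction: round each
    coordinate in [-1, 1] to a grid with 2^bits_per_coord cells.
    """
    scale = 2 ** (bits_per_coord - 1)  # grid step is 1/scale
    return [round(xi * scale) for xi in x]

def estimate_inner_product(cx, cy, bits_per_coord):
    """Estimate <x, y> given only the two quantized encodings."""
    scale = 2 ** (bits_per_coord - 1)
    return sum(a * b for a, b in zip(cx, cy)) / (scale * scale)

# Example: two 2-d unit vectors with true inner product 0.96.
cx = encode([0.6, 0.8], 8)
cy = encode([0.8, 0.6], 8)
est = estimate_inner_product(cx, cy, 8)
```

Because the rounding error per coordinate is at most half a grid step, the additive error of the estimate shrinks as b grows; the question the paper answers is how few bits per vector suffice for a given error guarantee.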
Building Wavelet Histograms on Large Data in MapReduce
MapReduce is becoming the de facto framework for storing and processing
massive data, due to its excellent scalability, reliability, and elasticity. In
many MapReduce applications, obtaining a compact accurate summary of data is
essential. Among various data summarization tools, histograms have proven to be
particularly important and useful for summarizing data, and the wavelet
histogram is one of the most widely used histograms. In this paper, we
investigate the problem of building wavelet histograms efficiently on large
datasets in MapReduce. We measure the efficiency of the algorithms by both
end-to-end running time and communication cost. We demonstrate straightforward
adaptations of existing exact and approximate methods for building wavelet
histograms to MapReduce clusters are highly inefficient. To that end, we design
new algorithms for computing exact and approximate wavelet histograms and
discuss their implementation in MapReduce. We illustrate our techniques in
Hadoop, and compare to baseline solutions with extensive experiments performed
in a heterogeneous Hadoop cluster of 16 nodes, using large real and synthetic
datasets, up to hundreds of gigabytes. The results suggest significant (often
orders of magnitude) performance improvement achieved by our new algorithms.
Comment: VLDB201
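For background on what is being computed, a wavelet histogram keeps the k largest-magnitude Haar wavelet coefficients of the data's frequency vector. A small single-machine sketch of that idea (our illustration only; the paper's contribution is computing this efficiently in MapReduce, which this does not show):

```python
def haar_transform(freqs):
    """Haar wavelet decomposition of a frequency vector whose length
    is a power of two: overall average first, then detail coefficients
    from coarsest to finest."""
    coeffs = []
    data = list(freqs)
    while len(data) > 1:
        avgs = [(data[i] + data[i + 1]) / 2 for i in range(0, len(data), 2)]
        diffs = [(data[i] - data[i + 1]) / 2 for i in range(0, len(data), 2)]
        coeffs = diffs + coeffs  # prepend: finer levels end up later
        data = avgs
    return data + coeffs

def inverse_haar(coeffs):
    """Invert haar_transform (used here only to sanity-check it)."""
    data = coeffs[:1]
    pos = 1
    while pos < len(coeffs):
        diffs = coeffs[pos:pos + len(data)]
        data = [v for a, d in zip(data, diffs) for v in (a + d, a - d)]
        pos += len(diffs)
    return data

def wavelet_histogram(freqs, k):
    """Keep only the k largest-magnitude coefficients, by index."""
    coeffs = haar_transform(freqs)
    top = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]),
                 reverse=True)[:k]
    return {i: coeffs[i] for i in top}
```

For example, `wavelet_histogram([8, 4, 1, 3], 2)` retains two of the four coefficients; reconstructing from the retained subset (with the rest zeroed) gives the usual approximate histogram.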
Superlative Quantifiers as Modifiers of Meta-Speech Acts
The superlative quantifiers, at least and at most, are commonly assumed to have the same truth-conditions as the comparative quantifiers more than and fewer than. However, as Geurts & Nouwen (2007) have demonstrated, this is wrong, and several theories have been proposed to account for them. In this paper we propose that superlative quantifiers are illocutionary operators; specifically, they modify meta-speech acts. Meta speech-acts are operators that do not express a speech act, but a willingness to make or refrain from making a certain speech act. The classic example is speech act denegation, e.g. I don't promise to come, where the speaker is explicitly refraining from performing the speech act of promising. What denegations do is to delimit the future development of conversation, that is, they delimit future admissible speech acts. Hence we call them meta-speech acts. They are not moves in a game, but rather commitments to behave in certain ways in the future. We formalize the notion of meta speech acts as commitment development spaces, which are rooted graphs: the root of the graph describes the commitment development up to the current point in conversation; the continuations from the root describe the admissible future directions. We define and formalize the meta-speech act GRANT, which indicates that the speaker, while not necessarily subscribing to a proposition, refrains from asserting its negation. We propose that superlative quantifiers are quantifiers over GRANTs. Thus, Mary petted at least three rabbits means that the minimal number n such that the speaker GRANTs that Mary petted n rabbits is n = 3. In other words, the speaker denies that Mary petted two, one, or no rabbits, but GRANTs that she petted more. We formalize this interpretation of superlative quantifiers in terms of commitment development spaces, and show how the truth conditions that are derived from it are partly entailed and partly conversationally implicated.
We demonstrate how the theory accounts for a wide variety of phenomena regarding the interpretation of superlative quantifiers, their distribution, and the contexts in which they can be embedded.
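The "at least three rabbits" analysis above can be mimicked as a toy computation over a finite set of numeric alternatives. Everything here (the function names, the upper bound on n) is our illustrative assumption, not the paper's formalism:

```python
def at_least(k, max_alt=10):
    """Toy model of 'at least k': the speaker denies every alternative
    n < k and GRANTs (refrains from denying) every n >= k, over a
    finite set of alternatives 0..max_alt. Illustrative only."""
    return {n: (n >= k) for n in range(max_alt + 1)}

def minimal_granted(grants):
    """The smallest alternative the speaker GRANTs."""
    return min(n for n, granted in grants.items() if granted)

# 'Mary petted at least three rabbits':
grants = at_least(3)   # denies n = 0, 1, 2; GRANTs n >= 3
```

On this toy reading, `minimal_granted(grants)` is 3, matching the abstract's claim that the minimal GRANTed n is the numeral the quantifier mentions.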
The AFIT ENgineer, Volume 5, Issue 2
In this issue:
- Quantum information science (QIS) research at AFIT
- Engineers Week Returns to AFIT
- AFIT Joins U.S. Space Command's Academic Engagement Enterprise
- Digital Innovation and Integration Center of Excellence (DIICE)
- FY22 External Sponsor Funding summary
Engineering Aggregation Operators for Relational In-Memory Database Systems
In this thesis we study the design and implementation of aggregation operators in the context of relational in-memory database systems. In particular, we identify and address the following challenges: cache-efficiency, CPU-friendliness, parallelism within and across processors, robust handling of skewed data, adaptive processing, processing with constrained memory, and integration with modern database architectures. Our resulting algorithm outperforms the state-of-the-art by up to 3.7x.
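Several of the challenges named above (parallelism, skew, bounded table sizes) are commonly attacked with two-phase partitioned hash aggregation. The sketch below shows that generic pattern only; it is not the thesis's operator, and the names are ours:

```python
from collections import defaultdict

def partitioned_aggregate(rows, num_partitions=4):
    """Two-phase hash aggregation sketch (illustrative, not the
    thesis's algorithm): first route each (key, value) row to a
    partition by a hash of its group key, building one small hash
    table per partition; then merge. Because a key can land in only
    one partition, partitions can be built by independent threads
    without synchronization, and a hot (skewed) key stays confined
    to a single table."""
    partitions = [defaultdict(int) for _ in range(num_partitions)]
    for key, value in rows:
        partitions[hash(key) % num_partitions][key] += value
    merged = {}
    for table in partitions:
        merged.update(table)  # keys never span partitions
    return merged

# Example: SUM(value) GROUP BY key
result = partitioned_aggregate([("a", 1), ("b", 2), ("a", 3)])
```

Keeping each partition's table small is also what makes the approach cache-friendly, one of the design goals the thesis lists.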