
    The space complexity of inner product filters

    Motivated by the problem of filtering candidate pairs in inner product similarity joins, we study the following inner product estimation problem: given parameters $d \in \mathbf{N}$ and $\alpha > \beta \geq 0$, and unit vectors $x, y \in \mathbf{R}^d$, consider the task of distinguishing between the cases $\langle x, y \rangle \leq \beta$ and $\langle x, y \rangle \geq \alpha$, where $\langle x, y \rangle = \sum_{i=1}^{d} x_i y_i$ is the inner product of the vectors $x$ and $y$. The goal is to distinguish these cases based on information about each vector encoded independently in a bit string of the shortest possible length. In contrast to much work on compressing vectors using randomized dimensionality reduction, we seek to solve the problem deterministically, with no probability of error. Inner product estimation can be solved in general by estimating $\langle x, y \rangle$ with an additive error bounded by $\varepsilon = \alpha - \beta$. We show that $d \log_2\left(\tfrac{\sqrt{1-\beta}}{\varepsilon}\right) \pm \Theta(d)$ bits of information about each vector are necessary and sufficient. Our upper bound is constructive and improves a known upper bound of $d \log_2(1/\varepsilon) + O(d)$ by up to a factor of 2 when $\beta$ is close to 1. The lower bound holds even in a stronger model where one of the vectors is known exactly and an arbitrary estimation function is allowed. Comment: To appear at ICDT 202
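
    For intuition about the kind of encoding involved, here is a minimal sketch of the naive deterministic baseline: round every coordinate to a uniform grid fine enough that the quantization error in the inner product stays below $\varepsilon$. This is our own illustration (function names are hypothetical); it uses roughly $\log_2(4\sqrt{d}/\varepsilon)$ bits per coordinate and is therefore weaker than the paper's constructive bound, but it shows the shape of a deterministic, zero-error filter.

```python
import math

def encode(x, eps):
    """Round each coordinate of a unit vector x to the nearest multiple of
    delta = eps / (2 * sqrt(d)). Each rounded coordinate is off by at most
    delta/2, so by Cauchy-Schwarz the inner product of two encoded vectors
    deviates from <x, y> by less than eps in total."""
    d = len(x)
    delta = eps / (2 * math.sqrt(d))
    return [round(xi / delta) for xi in x]  # small integers, ~log2(4*sqrt(d)/eps) bits each

def decide(cx, cy, eps, d, alpha, beta):
    """Distinguish <x, y> >= alpha from <x, y> <= beta (with alpha - beta >= eps)
    using only the two codes: rescale, estimate, and threshold halfway."""
    delta = eps / (2 * math.sqrt(d))
    estimate = sum(a * b for a, b in zip(cx, cy)) * delta * delta
    return estimate > (alpha + beta) / 2
```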

    Proceedings of the First International VLDB Workshop on Management of Uncertain Data


    Building Wavelet Histograms on Large Data in MapReduce

    MapReduce is becoming the de facto framework for storing and processing massive data, owing to its excellent scalability, reliability, and elasticity. In many MapReduce applications, obtaining a compact, accurate summary of the data is essential. Among the various data summarization tools, histograms have proven particularly important and useful, and the wavelet histogram is one of the most widely used. In this paper, we investigate how to build wavelet histograms efficiently on large datasets in MapReduce. We measure the efficiency of the algorithms by both end-to-end running time and communication cost. We demonstrate that straightforward adaptations of existing exact and approximate methods for building wavelet histograms to MapReduce clusters are highly inefficient. To address this, we design new algorithms for computing exact and approximate wavelet histograms and discuss their implementation in MapReduce. We implement our techniques in Hadoop and compare them to the baseline solutions in extensive experiments on a heterogeneous Hadoop cluster of 16 nodes, using large real and synthetic datasets of up to hundreds of gigabytes. The results show significant (often orders-of-magnitude) performance improvements from our new algorithms. Comment: VLDB201
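
    For readers unfamiliar with the summary itself, the following single-machine sketch (ours, not the paper's MapReduce algorithms) shows what a wavelet histogram stores: the Haar wavelet transform of the data's frequency vector, truncated to the k coefficients of largest magnitude.

```python
import heapq

def haar_coefficients(freq):
    """Haar wavelet transform of a frequency vector whose length is a
    power of two: repeatedly replace adjacent pairs by their average and
    half-difference, collecting the differences as detail coefficients."""
    coeffs = []
    level = list(freq)
    while len(level) > 1:
        averages = [(level[i] + level[i + 1]) / 2 for i in range(0, len(level), 2)]
        details = [(level[i] - level[i + 1]) / 2 for i in range(0, len(level), 2)]
        coeffs = details + coeffs  # finer-resolution details go first
        level = averages
    return level + coeffs          # overall average, then all details

def wavelet_histogram(freq, k):
    """Lossy summary: keep the k coefficients of largest magnitude,
    each with its position so the vector can be approximately rebuilt."""
    return heapq.nlargest(k, enumerate(haar_coefficients(freq)),
                          key=lambda t: abs(t[1]))
```

    (For simplicity this ranks un-normalized coefficients; the standard construction ranks them after level-wise normalization, so that keeping the top k minimizes reconstruction error.)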

    Superlative Quantifiers as Modifiers of Meta-Speech Acts

    The superlative quantifiers, at least and at most, are commonly assumed to have the same truth conditions as the comparative quantifiers more than and fewer than. However, as Geurts & Nouwen (2007) have demonstrated, this is wrong, and several theories have been proposed to account for them. In this paper we propose that superlative quantifiers are illocutionary operators; specifically, they modify meta-speech acts. Meta-speech acts are operators that do not express a speech act, but a willingness to make or refrain from making a certain speech act. The classic example is speech act denegation, e.g. I don't promise to come, where the speaker is explicitly refraining from performing the speech act of promising. What denegations do is delimit the future development of the conversation; that is, they delimit future admissible speech acts. Hence we call them meta-speech acts. They are not moves in a game, but rather commitments to behave in certain ways in the future. We formalize the notion of meta-speech acts as commitment development spaces, which are rooted graphs: the root of the graph describes the commitment development up to the current point in the conversation; the continuations from the root describe the admissible future directions. We define and formalize the meta-speech act GRANT, which indicates that the speaker, while not necessarily subscribing to a proposition, refrains from asserting its negation. We propose that superlative quantifiers are quantifiers over GRANTs. Thus, Mary petted at least three rabbits means that the minimal number n such that the speaker GRANTs that Mary petted n rabbits is n = 3. In other words, the speaker denies that Mary petted two, one, or no rabbits, but GRANTs that she petted more. We formalize this interpretation of superlative quantifiers in terms of commitment development spaces and show how the truth conditions derived from it are partly entailed and partly conversationally implicated. We demonstrate how the theory accounts for a wide variety of phenomena regarding the interpretation of superlative quantifiers, their distribution, and the contexts in which they can be embedded.
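
    The "minimal GRANTed count" reading can be made concrete with a toy model; this is our own illustration, not the paper's formal apparatus of commitment development spaces.

```python
def at_least(n, domain):
    """Toy rendering of 'at least n': the speaker denies every count
    below n and GRANTs (refrains from denying) every count from n up."""
    return {k: k >= n for k in domain}

# 'Mary petted at least three rabbits': counts 0-2 are denied,
# counts 3+ are GRANTed, so the minimal GRANTed count is 3.
grants = at_least(3, range(10))
assert min(k for k, granted in grants.items() if granted) == 3
```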

    The AFIT ENgineer, Volume 5, Issue 2

    In this issue: Quantum Information Science (QIS) Research at AFIT; Engineers Week Returns to AFIT; AFIT Joins U.S. Space Command's Academic Engagement Enterprise; Digital Innovation and Integration Center of Excellence (DIICE); FY22 External Sponsor Funding Summary.

    Engineering Aggregation Operators for Relational In-Memory Database Systems

    In this thesis we study the design and implementation of aggregation operators in the context of relational in-memory database systems. In particular, we identify and address the following challenges: cache efficiency, CPU friendliness, parallelism within and across processors, robust handling of skewed data, adaptive processing, processing with constrained memory, and integration with modern database architectures. Our resulting algorithm outperforms the state of the art by up to 3.7x.
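
    For orientation, the baseline that such engineering improves upon is plain hash aggregation; the sketch below (ours, not the thesis's operator) computes a group-by sum with a single hash table. The listed challenges arise precisely where this baseline degrades: the table outgrows the CPU cache, parallel threads contend on it, and skewed keys unbalance any naive partitioning.

```python
from collections import defaultdict

def hash_aggregate(rows):
    """Minimal single-threaded hash aggregation, i.e.
    SELECT key, SUM(val) FROM rows GROUP BY key."""
    sums = defaultdict(int)
    for key, val in rows:
        sums[key] += val
    return dict(sums)
```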