9 research outputs found

    Deterministic algorithms for skewed matrix products

    Recently, Pagh presented a randomized approximation algorithm for the multiplication of real-valued matrices, building upon work for detecting the most frequent items in data streams. We continue this line of research and present new deterministic matrix multiplication algorithms. Motivated by applications in data mining, we first consider the case of real-valued, nonnegative n-by-n input matrices A and B, and show how to obtain a deterministic approximation of the weights of individual entries, as well as the entrywise p-norm, of the product AB. The algorithm is simple, space efficient and runs in one pass over the input matrices. For a user-defined b ∈ (0, n²) the algorithm runs in time O(nb + n·Sort(n)) and space O(n + b) and returns an approximation of the entries of AB within an additive factor of ‖AB‖_{E1}/b, where ‖C‖_{E1} = Σ_{i,j} |C_{ij}| is the entrywise 1-norm of a matrix C and Sort(n) is the time required to sort n real numbers in linear space. Building upon a result by Berinde et al. we show that for skewed matrix products (a common situation in many real-life applications) the algorithm is more efficient and achieves better approximation guarantees than previously known randomized algorithms. When the input matrices are not restricted to nonnegative entries, we present a new deterministic group testing algorithm that detects nonzero entries in the matrix product with large absolute value. The algorithm is clearly outperformed by randomized matrix multiplication algorithms, but as a byproduct we obtain the first O(n^{2+ε})-time deterministic algorithm for matrix products with O(√n) nonzero entries.
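    The nonnegative case can be illustrated with a frequent-items-style pass of the kind the abstract alludes to. The following is a toy Python sketch (an assumption-laden simplification, not the paper's algorithm): it streams the outer-product contributions A[i,k]·B[k,j] through a weighted Misra-Gries summary with b counters, so every reported entry undershoots the corresponding entry of AB by at most ‖AB‖_{E1}/(b+1).

```python
import numpy as np

def approx_product_entries(A, B, b):
    """Stream the contributions A[i,k] * B[k,j] of each outer product
    through a weighted Misra-Gries summary holding at most b counters.
    For nonnegative A, B every estimate underestimates the true entry
    of A @ B by at most ||AB||_{E1} / (b + 1)."""
    counters = {}
    rows, cols = A.shape[0], B.shape[1]
    for k in range(A.shape[1]):
        for i in range(rows):
            if A[i, k] == 0:
                continue
            for j in range(cols):
                w = A[i, k] * B[k, j]
                if w == 0:
                    continue
                counters[(i, j)] = counters.get((i, j), 0.0) + w
                if len(counters) > b:
                    # Evict: subtract the smallest counter from all of
                    # them and drop the ones that reach zero.
                    m = min(counters.values())
                    counters = {key: v - m
                                for key, v in counters.items() if v > m}
    return counters
```

    If b is at least the number of nonzero entries of AB, no eviction ever fires and the estimates are exact; smaller b trades space for additive error, which is the regime where skew in the product helps.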

    Simple Set Sketching

    Imagine handling collisions in a hash table by storing, in each cell, the bit-wise exclusive-or of the set of keys hashing there. This appears to be a terrible idea: for αn keys and n buckets, where α is constant, we expect that a constant fraction of the keys will be unrecoverable due to collisions. We show that if this collision resolution strategy is repeated three times independently the situation reverses: if α is below a threshold of ≈ 0.81, then we can recover the set of all inserted keys in linear time with high probability. Even though the description of our data structure is simple, its analysis is nontrivial. Our approach can be seen as a variant of the Invertible Bloom Filter (IBF) of Eppstein and Goodrich. While IBFs involve an explicit checksum per bucket to decide whether the bucket stores a single key, we exploit the idea of quotienting, namely that some bits of the key are implicit in the location where it is stored. We let those serve as an implicit checksum. These bits are not quite enough to ensure that no errors occur, and the main technical challenge is to show that decoding can recover from these errors. Comment: To be published at SIAM Symposium on Simplicity in Algorithms (SOSA 2023).
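    A rough illustration of the idea (a simplification, not the paper's construction): three tables where each nonzero integer key is XOR-ed into one cell per table, and decoding repeatedly peels any cell whose stored value hashes back to that very cell, which plays the role of the implicit checksum. The hash functions and sizes below are invented for the sketch; a cell holding the XOR of several keys can spuriously pass the check, which mirrors the decoding errors the paper has to handle.

```python
def _hashes(x, n, seeds=(0x9e3779b9, 0xc2b2ae3d, 0x16566719)):
    # Three hash functions mapping a key to one cell in each table.
    return [hash((s, x)) % n for s in seeds]

class XorSketch:
    """Toy three-table XOR sketch: insertion and deletion are both a
    XOR into one cell per table; decoding peels cells whose content
    hashes back to the cell's own index."""

    def __init__(self, n):
        self.n = n
        self.tables = [[0] * n for _ in range(3)]

    def toggle(self, x):
        # Same operation inserts or deletes a (nonzero) key.
        for t, h in enumerate(_hashes(x, self.n)):
            self.tables[t][h] ^= x

    def decode(self):
        recovered = set()
        progress = True
        while progress:
            progress = False
            for t in range(3):
                for c in range(self.n):
                    x = self.tables[t][c]
                    if x and _hashes(x, self.n)[t] == c:
                        # Looks like a lone key: peel it everywhere.
                        self.toggle(x)
                        recovered.add(x)
                        progress = True
        return recovered
```

    Inserting a key and later deleting it cancels out, so the sketch also supports set reconciliation in the IBF style.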

    Improved Algorithms for White-Box Adversarial Streams

    We study streaming algorithms in the white-box adversarial stream model, where the internal state of the streaming algorithm is revealed to an adversary who adaptively generates the stream updates, but the algorithm obtains fresh randomness unknown to the adversary at each time step. We incorporate cryptographic assumptions to construct robust algorithms against such adversaries. We propose efficient algorithms for sparse recovery of vectors, low rank recovery of matrices and tensors, as well as low rank plus sparse recovery of matrices, i.e., robust PCA. Unlike deterministic algorithms, our algorithms can report when the input is not sparse or low rank even in the presence of such an adversary. We use these recovery algorithms to improve upon and solve new problems in numerical linear algebra and combinatorial optimization on white-box adversarial streams. For example, we give the first efficient algorithm for outputting a matching in a graph with insertions and deletions to its edges provided the matching size is small, and otherwise we declare the matching size is large. We also improve the approximation versus memory tradeoff of previous work for estimating the number of non-zero elements in a vector and computing the matrix rank. Comment: ICML 2023.

    Deterministic K-set structure

    A k-set structure over data streams is a bounded-space data structure that supports stream insertion and deletion operations and returns the set of (item, frequency) pairs in the stream, provided the number of distinct items in the stream does not exceed k, and returns nil otherwise. This is a fundamental problem with applications in data streaming [24], data reconciliation in distributed systems [22], mobile computing [28], etc. In this paper, we study the problem of obtaining deterministic algorithms for the k-set problem.
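    Purely to pin down the interface described above (and not the bounded-space construction, which is the paper's actual contribution), here is a hypothetical unbounded-space reference model in Python; every name in it is invented for illustration.

```python
class KSetReference:
    """Reference model of the k-set contract: after any sequence of
    insertions and deletions, report the set of (item, frequency)
    pairs if at most k distinct items remain, and nil (None)
    otherwise.  Unlike a real k-set structure, this model's space
    grows with the number of distinct items ever seen."""

    def __init__(self, k):
        self.k = k
        self.freq = {}

    def insert(self, item):
        self.freq[item] = self.freq.get(item, 0) + 1

    def delete(self, item):
        c = self.freq.get(item, 0) - 1
        if c == 0:
            self.freq.pop(item, None)
        else:
            self.freq[item] = c

    def report(self):
        live = {x: c for x, c in self.freq.items() if c != 0}
        return set(live.items()) if len(live) <= self.k else None
```

    The deterministic algorithms studied in the paper meet the same contract while keeping space bounded in terms of k rather than the stream's support size.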