Better Streaming Algorithms for the Maximum Coverage Problem
We study the classic NP-hard problem of finding a maximum k-set coverage in the data stream model: given a set system of m sets, each a subset of a universe {1,...,n}, find the k sets that cover the largest number of distinct elements. The problem can be approximated up to a factor 1-1/e in polynomial time. In the streaming-set model, the sets and their elements are revealed online. The main goal of our work is to design algorithms, with approximation guarantees as close as possible to 1-1/e, that use sublinear space o(mn). Our main results are: 1) Two (1-1/e-epsilon)-approximation algorithms: one uses O(1/epsilon) passes and O(k/epsilon^2 polylog(m,n)) space, whereas the other uses only a single pass but O(m/epsilon^2 polylog(m,n)) space. 2) We show that achieving any approximation factor better than (1-(1-1/k)^k) in a constant number of passes requires space linear in m, even for constant k and even if the algorithm is allowed unbounded processing time. We also give a single-pass (1-epsilon)-approximation algorithm using O(m/epsilon^2 min(k,1/epsilon) polylog(m,n)) space.
We also study the maximum k-vertex coverage problem in the dynamic graph stream model. In this model, the stream consists of edge insertions and deletions of a graph on N vertices, and the goal is to find k vertices that cover the largest number of distinct edges. We show that any constant-factor approximation in a constant number of passes requires space linear in N, even for constant k, whereas O(N/epsilon^2 polylog(m,n)) space is sufficient for a (1-epsilon) approximation and arbitrary k in a single pass. For regular graphs, we show that O(k/epsilon^3 polylog(m,n)) space is sufficient for a (1-epsilon) approximation in a single pass. We generalize this to a (K-epsilon) approximation when the ratio between the minimum and maximum degree is bounded below by K.
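For context, the (1-1/e) baseline that these results are measured against is the classical offline greedy: repeatedly add the set with the largest marginal coverage. A minimal sketch with illustrative names and a toy instance, not code from the paper:

```python
# Offline greedy for Maximum k-Set Coverage: repeatedly pick the set
# covering the most still-uncovered elements.  This classical algorithm
# achieves a (1 - 1/e) approximation; the streaming algorithms above aim
# to match that guarantee in sublinear space.

def greedy_max_coverage(sets, k):
    """Return indices of k greedily chosen sets and the elements they cover."""
    covered = set()
    chosen = []
    for _ in range(k):
        best_i, best_gain = None, 0
        for i, s in enumerate(sets):
            if i in chosen:
                continue
            gain = len(s - covered)  # marginal coverage of set i
            if gain > best_gain:
                best_i, best_gain = i, gain
        if best_i is None:  # no remaining set adds new elements
            break
        chosen.append(best_i)
        covered |= sets[best_i]
    return chosen, covered

sets = [{1, 2, 3}, {3, 4}, {4, 5, 6, 7}, {1, 7}]
chosen, covered = greedy_max_coverage(sets, 2)
```

On this toy instance the greedy first takes the 4-element set, then the 3-element set, covering the whole universe.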
Maximum Coverage in Sublinear Space, Faster
Given a collection of m sets from a universe U, the Maximum Set Coverage problem consists of finding k sets whose union has the largest cardinality. This problem is NP-hard, but its solution can be approximated up to a factor 1-1/e by a polynomial-time algorithm. However, this algorithm does not scale well with the input size.
In a streaming context, practical high-quality solutions can be found, but with a space complexity that scales linearly with the size of the universe n = |U|. However, one randomized streaming algorithm has been shown to produce a (1-1/e-epsilon) approximation of the optimal solution with a space complexity that scales only poly-logarithmically with m and n. To achieve such a low space complexity, the authors used two techniques in their multi-pass approach:
- F0-sketching allows one to estimate, with great accuracy, the number of distinct elements in a set using less space than the set itself.
- Subsampling consists of solving the problem only on a subspace of the universe. It is implemented using Theta(epsilon^{-2} k log m)-independent hash functions.
This article focuses on the sublinear-space algorithm and highlights the time cost of these two techniques, especially subsampling. We present optimizations that significantly reduce the time complexity of the algorithm. First, we give optimizations that do not alter the space complexity, number of passes, or approximation quality of the original algorithm. In particular, we reanalyze the error bounds to show that the original independence factor of Theta(epsilon^{-2} k log m) can be fine-tuned to Theta(k log m); we also show how F0-sketching can be removed. Second, we derive a new lower bound on the probability of producing a (1-1/e-epsilon) approximation using only pairwise independence: 1 - 4/(c k log m), compared to 1 - 2e/m^{ck/6} with Theta(k log m)-independence.
Although the theoretical guarantees are weaker, suggesting the approximation quality could suffer on large streams, our algorithms perform well in practice. Finally, our experimental results show that even a pairwise-independent hash-function sampler does not produce worse solutions than the original algorithm, while running several orders of magnitude faster.
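The subsampling technique above can be sketched with a pairwise-independent family of the form h(x) = ((a*x + b) mod p) mod 2^j, keeping only the elements that hash to zero. All constants and names below are illustrative, not taken from the paper:

```python
import random

# Subsampling via a pairwise-independent hash family:
# h(x) = ((a*x + b) mod p) mod 2**j.  Elements with h(x) == 0 survive,
# so each set shrinks by roughly a factor 2**j while the relative
# coverage of candidate solutions is approximately preserved.

P = 2_147_483_647  # a Mersenne prime larger than the universe size

def make_pairwise_hash(j, rng):
    a = rng.randrange(1, P)
    b = rng.randrange(P)
    return lambda x: ((a * x + b) % P) % (2 ** j)

def subsample(sets, j, rng):
    """Restrict every set to the elements that hash to 0."""
    h = make_pairwise_hash(j, rng)
    return [{x for x in s if h(x) == 0} for s in sets]

rng = random.Random(7)
sets = [set(range(i, i + 64)) for i in range(0, 256, 64)]
small = subsample(sets, 2, rng)  # keep roughly 1/4 of the universe
```

Solving the coverage problem on the subsampled instance and rescaling is what lets the space depend on the sample size rather than on n.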
Final report on the evaluation of RRM/CRRM algorithms
Public deliverable of the EVEREST project. This deliverable provides a definition and a complete evaluation of the RRM/CRRM algorithms selected in D11 and D15, and evolved and refined in an iterative process. The evaluation is carried out by means of simulations using the simulators provided in D07 and D14. Preprint
Maximum Coverage in the Data Stream Model: Parameterized and Generalized
We present algorithms for the Max-Cover and Max-Unique-Cover problems in the data stream model. The input to both problems is a collection of m subsets of a universe of size n and a value k. In Max-Cover, the problem is to find a collection of at most k sets such that the number of elements covered by at least one set is maximized. In Max-Unique-Cover, the problem is to find a collection of at most k sets such that the number of elements covered by exactly one set is maximized. Our goal is to design single-pass algorithms that use space that is sublinear in the input size. Our main algorithmic results are:
If the sets have size at most d, there exist single-pass algorithms, using space that depends only on d up to polylogarithmic factors, that solve both problems exactly. This is optimal up to polylogarithmic factors for constant d.
If each element appears in at most r sets, we present single-pass algorithms, using space parameterized by r, that return a constant-factor approximation in the case of Max-Cover. We also present a single-pass algorithm, using slightly more memory, that approximates Max-Unique-Cover.
In contrast to the above results, when d and r are arbitrary, any constant-pass approximation algorithm for either problem requires space polynomial in m, but a single-pass algorithm using space roughly linear in m exists. In fact, any constant-pass algorithm with an approximation better than a fixed constant for Max-Cover and Max-Unique-Cover, respectively, requires such space when d and r are unrestricted.
En route, we also obtain an algorithm for a parameterized version of the streaming Set-Cover problem. Comment: conference version to appear at ICDT 2021
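The two objectives can be made concrete with brute-force reference solvers that enumerate all k-subsets. These are exponential in m, so they serve only as a correctness baseline for tiny instances, not as the streaming algorithms from the paper; all names are illustrative:

```python
from itertools import combinations
from collections import Counter

# Reference solvers for the two objectives: Max-Cover counts elements
# covered at least once, Max-Unique-Cover counts elements covered
# exactly once.

def max_cover(sets, k):
    return max(
        (len(set().union(*combo)) for combo in combinations(sets, k)),
        default=0,
    )

def max_unique_cover(sets, k):
    best = 0
    for combo in combinations(sets, k):
        counts = Counter(x for s in combo for x in s)
        best = max(best, sum(1 for c in counts.values() if c == 1))
    return best

sets = [{1, 2, 3}, {3, 4}, {2, 3, 4}]
```

On this instance with k = 2 the optima differ: two sets can cover 4 elements, but at most 3 of them uniquely, which is exactly the gap between the two problems.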
Real-Time Scheduling for Content Broadcasting in LTE
Broadcasting capabilities are one of the most promising features of upcoming LTE-Advanced networks. However, the task of scheduling broadcasting sessions is far from trivial, since it affects the available resources of several contiguous cells as well as the amount of resources that can be devoted to unicast traffic. In this paper, we present a compact, convenient model for broadcasting in LTE, as well as a set of efficient algorithms to define broadcasting areas and to actually perform content scheduling. We study the performance of our algorithms in a realistic scenario, deriving interesting insights into the possible trade-offs between effectiveness and computational efficiency.
The Power of Randomization: Distributed Submodular Maximization on Massive Datasets
A wide variety of problems in machine learning, including exemplar clustering, document summarization, and sensor placement, can be cast as constrained submodular maximization problems. Unfortunately, the resulting submodular optimization problems are often too large to be solved on a single machine. We develop a simple distributed algorithm that is embarrassingly parallel and achieves provable, constant-factor, worst-case approximation guarantees. In our experiments, we demonstrate its efficiency on large problems with different kinds of constraints, with objective values always close to what is achievable in the centralized setting.
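The embarrassingly parallel pattern analyzed here, randomly partitioning the data, running greedy independently on each part, then running greedy once more on the pooled candidates, can be sketched for a coverage objective. This is a minimal illustration under assumed names, not the authors' implementation:

```python
import random

# Two-round distributed greedy for a cardinality-constrained coverage
# objective: round 1 runs greedy in parallel on random parts, round 2
# runs greedy on the union of the per-machine solutions.

def greedy(items, k, covered=frozenset()):
    covered = set(covered)
    picked = []
    for _ in range(k):
        best = max(items, key=lambda s: len(s - covered), default=None)
        if best is None or not (best - covered):
            break  # nothing left with positive marginal gain
        picked.append(best)
        covered |= best
        items = [s for s in items if s is not best]
    return picked

def distributed_greedy(items, k, machines, rng):
    parts = [[] for _ in range(machines)]
    for s in items:
        parts[rng.randrange(machines)].append(s)       # random partition
    pooled = [s for part in parts for s in greedy(part, k)]  # round 1
    return greedy(pooled, k)                           # round 2, one machine

rng = random.Random(0)
items = [set(range(i, i + 10)) for i in range(0, 100, 5)]
solution = distributed_greedy(items, 3, 4, rng)
```

The random partition is the key ingredient: it is what allows the two-round scheme to retain a constant-factor guarantee in expectation, which adversarial partitions do not.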
High-Dimensional Geometric Streaming in Polynomial Space
Many existing algorithms for streaming geometric data analysis have been plagued by exponential dependencies in the space complexity, which are undesirable for processing high-dimensional data sets. In particular, once the dimension d exceeds log n, there are no known non-trivial streaming algorithms for problems such as maintaining convex hulls and L\"owner-John ellipsoids of n points, despite a long line of work in streaming computational geometry since [AHV04]. We simultaneously improve these results to poly(d, log n) bits of space by trading off with a poly(d, log n) factor distortion. We achieve these results in a unified manner, by designing the first streaming algorithm for maintaining a coreset for l_infinity subspace embeddings with poly(d, log n) space and poly(d, log n) distortion. Our algorithm also gives similar guarantees in the \emph{online coreset} model. Along the way, we sharpen results for online numerical linear algebra by replacing a log(condition number) dependence with a log n dependence, answering a question of [BDM+20]. Our techniques provide a novel connection between leverage scores, a fundamental object in numerical linear algebra, and computational geometry.
For l_p subspace embeddings, we give nearly optimal trade-offs between space and distortion for one-pass streaming algorithms. For instance, we give a deterministic coreset using poly(d, log n) space and poly(d, log n) distortion, whereas previous deterministic algorithms incurred a factor exponential in d in the space or the distortion [CDW18].
Our techniques have implications in the offline setting, where we give optimal trade-offs between the space complexity and distortion of subspace sketch data structures. To do this, we give an elementary proof of a "change of density" theorem of [LT80] and make it algorithmic. Comment: abstract shortened to meet arXiv limits; v2 fixes statements concerning the online condition number
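Leverage scores, the object connecting numerical linear algebra and geometry above, are easy to illustrate offline: the i-th score is a_i^T (A^T A)^{-1} a_i, the importance of row i for the row space of A. A tiny pure-Python sketch for a two-column matrix, using the closed-form 2x2 inverse (real implementations use QR or SVD; the function name is illustrative):

```python
# Leverage scores of an n x 2 matrix A, given as a list of row tuples.
# The scores sum to rank(A) and bound each row's influence on the
# row space; high-leverage rows are the ones a coreset must keep.

def leverage_scores_2col(A):
    # Gram matrix G = A^T A for a two-column A
    g00 = sum(r[0] * r[0] for r in A)
    g01 = sum(r[0] * r[1] for r in A)
    g11 = sum(r[1] * r[1] for r in A)
    det = g00 * g11 - g01 * g01  # assumes A has full column rank
    inv = ((g11 / det, -g01 / det), (-g01 / det, g00 / det))
    scores = []
    for x, y in A:
        scores.append(
            x * (inv[0][0] * x + inv[0][1] * y)
            + y * (inv[1][0] * x + inv[1][1] * y)
        )
    return scores

A = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
scores = leverage_scores_2col(A)
```

For this symmetric example every row has score 2/3, and the scores sum to 2, the rank of A.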