
    A Linear-time Algorithm for Sparsification of Unweighted Graphs

    Given an undirected graph $G$ and an error parameter $\epsilon > 0$, the \emph{graph sparsification} problem requires sampling edges in $G$ and giving the sampled edges appropriate weights to obtain a sparse graph $G_{\epsilon}$ with the following property: the weight of every cut in $G_{\epsilon}$ is within a factor of $(1 \pm \epsilon)$ of the weight of the corresponding cut in $G$. If $G$ is unweighted, an $O(m \log n)$-time algorithm for constructing $G_{\epsilon}$ with $O(n \log n / \epsilon^2)$ edges in expectation, and an $O(m)$-time algorithm for constructing $G_{\epsilon}$ with $O(n \log^2 n / \epsilon^2)$ edges in expectation, have recently been developed (Hariharan-Panigrahi, 2010). In this paper, we improve these results by giving an $O(m)$-time algorithm for constructing $G_{\epsilon}$ with $O(n \log n / \epsilon^2)$ edges in expectation, for unweighted graphs. Our algorithm is optimal in terms of its time complexity; further, no efficient algorithm is known for constructing a sparser $G_{\epsilon}$. Our algorithm is Monte Carlo, i.e., it produces the correct output with high probability, as are all efficient graph sparsification algorithms.
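    To make the sampling-and-reweighting template concrete, here is a minimal toy sketch in Python: each edge is kept independently with a uniform probability $p$ and reweighted by $1/p$, which preserves every cut's weight in expectation. This captures only the unbiasedness half of the story; the paper's algorithm samples non-uniformly (guided by edge connectivities) to also get concentration around the mean. Function and variable names below are illustrative, not the paper's.

```python
import random

def sparsify_uniform(edges, p):
    """Toy sparsifier: keep each edge independently with probability p,
    assigning weight 1/p to kept edges. Each edge then contributes its
    original weight in expectation, so every cut is preserved in
    expectation. (The paper instead samples non-uniformly to get high-
    probability (1 +/- eps) concentration for all cuts simultaneously.)"""
    sparse = []
    for u, v in edges:
        if random.random() < p:
            sparse.append((u, v, 1.0 / p))
    return sparse

# Example: a 4-cycle. Any cut's expected sparsified weight equals the
# original cut weight, e.g. the cut {0,1} vs {2,3} has weight 2 in both.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(sparsify_uniform(edges, p=0.5))
```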

    Almost-Smooth Histograms and Sliding-Window Graph Algorithms

    We study algorithms for the sliding-window model, an important variant of the data-stream model, in which the goal is to compute some function of a fixed-length suffix of the stream. We extend the smooth-histogram framework of Braverman and Ostrovsky (FOCS 2007) to almost-smooth functions, which include all subadditive functions. Specifically, we show that if a subadditive function can be $(1+\epsilon)$-approximated in the insertion-only streaming model, then it can be $(2+\epsilon)$-approximated in the sliding-window model as well, with space complexity larger by a factor of $O(\epsilon^{-1} \log w)$, where $w$ is the window size. We demonstrate how our framework yields, with relatively little effort, new approximation algorithms for a variety of problems that do not admit the smooth-histogram technique. For example, in the frequency-vector model, a symmetric norm is subadditive, and thus we obtain a sliding-window $(2+\epsilon)$-approximation algorithm for it. Another example is streaming matrices, where we derive a new sliding-window $(\sqrt{2}+\epsilon)$-approximation algorithm for the Schatten $4$-norm. We then consider graph streams and show that many graph problems are subadditive, including maximum submodular matching, minimum vertex cover, and maximum $k$-cover, thereby deriving sliding-window $O(1)$-approximation algorithms for them almost for free (using known insertion-only algorithms). Finally, for every $d \in (1,2]$ we design an artificial function, based on the maximum-matching size, whose almost-smoothness parameter is exactly $d$.
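    As a rough illustration of the smooth-histogram mechanism that this framework generalizes, the sketch below maintains several instances of an insertion-only algorithm, each started at a different stream position, prunes instances whose estimates agree within a $(1+\epsilon)$ factor, and answers queries from the oldest surviving instance. It uses a plain running sum as the stand-in streaming algorithm and deliberately simplified pruning/expiry rules; the class and method names are assumptions, not the paper's construction.

```python
class SlidingWindowSum:
    """Sketch of a smooth histogram over non-negative stream items: each
    'instance' is a running sum started at a different position; instances
    whose values are within a (1+eps) factor of an older kept instance are
    pruned, so only roughly O(eps^-1 log(total)) instances survive."""

    def __init__(self, w, eps):
        self.w, self.eps = w, eps
        self.instances = []  # [start_position, running_sum], oldest first
        self.t = 0           # number of items seen so far

    def update(self, x):
        self.t += 1
        for inst in self.instances:
            inst[1] += x
        self.instances.append([self.t, x])
        # Prune: keep an instance only if its value is more than a (1+eps)
        # factor below the previously kept (older, larger) instance.
        kept = [self.instances[0]]
        for inst in self.instances[1:]:
            if kept[-1][1] > (1 + self.eps) * inst[1]:
                kept.append(inst)
        self.instances = kept
        # Expire: drop instances started before the window, keeping the
        # newest such instance as a bracketing over-estimate.
        start = self.t - self.w + 1
        while len(self.instances) >= 2 and self.instances[1][0] <= start:
            self.instances.pop(0)

    def query(self):
        # Oldest surviving instance; approximates the true window sum up to
        # (1+eps) factors (constants not tuned in this sketch).
        return self.instances[0][1] if self.instances else 0
```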

    Adaptive random forests for evolving data stream classification

    Random forests are currently among the most widely used machine learning algorithms in the non-streaming (batch) setting. This preference is attributable to their high learning performance and low demands with respect to input preparation and hyper-parameter tuning. However, in the challenging context of evolving data streams, there is no random forest algorithm that can be considered state-of-the-art in comparison to bagging- and boosting-based algorithms. In this work, we present the adaptive random forest (ARF) algorithm for classification of evolving data streams. In contrast to previous attempts at replicating random forests for data stream learning, ARF includes an effective resampling method and adaptive operators that can cope with different types of concept drift without complex optimizations for different data sets. We present experiments with a parallel implementation of ARF that shows no degradation in classification performance compared to a serial implementation, since trees and adaptive operators are independent of one another. Finally, we compare ARF with state-of-the-art algorithms in a traditional test-then-train evaluation and a novel delayed-labelling evaluation, and show that ARF is accurate and uses a feasible amount of resources.
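    The sketch below illustrates the general ARF recipe under heavy simplification, as an assumption-laden toy rather than the paper's algorithm: online bagging via Poisson(6) resampling, plus a crude "reset a member when its recent error rate spikes" rule standing in for a proper drift detector. The real ARF grows Hoeffding trees over random feature subsets and uses ADWIN drift/warning detectors with background trees; all class and parameter names here are illustrative.

```python
import math
import random
from collections import Counter, deque

def poisson(lam):
    """Knuth's Poisson sampler (the stdlib has none); used for online bagging."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while p > limit:
        k += 1
        p *= random.random()
    return k - 1

class MajorityLearner:
    """Trivial stand-in base learner; ARF actually uses Hoeffding trees."""
    def __init__(self):
        self.counts = Counter()
    def predict(self, x):
        return self.counts.most_common(1)[0][0] if self.counts else None
    def learn(self, x, y, weight=1):
        self.counts[y] += weight

class ToyARF:
    def __init__(self, n_members=10, window=200, drift_threshold=0.3):
        self.members = [MajorityLearner() for _ in range(n_members)]
        self.recent_errors = [deque(maxlen=window) for _ in range(n_members)]
        self.drift_threshold = drift_threshold

    def predict(self, x):
        # Majority vote over ensemble members.
        votes = Counter(m.predict(x) for m in self.members)
        return votes.most_common(1)[0][0]

    def learn(self, x, y):
        for i, member in enumerate(self.members):
            errs = self.recent_errors[i]
            errs.append(int(member.predict(x) != y))
            # Crude drift handling: reset the member when its recent error
            # rate exceeds a threshold (ADWIN plays this role, far more
            # carefully, in the actual ARF).
            if len(errs) == errs.maxlen and sum(errs) / len(errs) > self.drift_threshold:
                member = self.members[i] = MajorityLearner()
                errs.clear()
            # Online bagging: weight this example by a Poisson(6) draw,
            # so members see differently resampled views of the stream.
            k = poisson(6.0)
            if k > 0:
                member.learn(x, y, weight=k)
```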