Search CORE

Monitoring Networked Applications With Incremental Quantile Estimation

Author: Chambers John M.
James David A.
Lambert Diane
Wiel Scott Vander
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 02/08/2007

Networked applications have software components that reside on different computers. Email, for example, has database, processing, and user interface components that can be distributed across a network and shared by users in different locations or work groups. End-to-end performance and reliability metrics describe the software quality experienced by these groups of users, taking into account all the software components in the pipeline. Each user produces only some of the data needed to understand the quality of the application for the group, so group performance metrics are obtained by combining summary statistics that each end computer periodically (and automatically) sends to a central server. The group quality metrics usually focus on medians and tail quantiles rather than on averages. Distributed quantile estimation is challenging, though, especially when passing large amounts of data around the network solely to compute quality metrics is undesirable. This paper describes an Incremental Quantile (IQ) estimation method that is designed for performance monitoring at arbitrary levels of network aggregation and time resolution when only a limited amount of data can be transferred. Applications to both real and simulated data are provided. Comment: This paper commented in: [arXiv:0708.0317], [arXiv:0708.0336], [arXiv:0708.0338]. Rejoinder in [arXiv:0708.0339]. Published at http://dx.doi.org/10.1214/088342306000000583 in the Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

Crossref

Selection from read-only memory with limited workspace

Author: A. Golynski
B. Chazelle
D.E. Knuth
G. Jacobson
G. Navarro
G.N. Frederickson
J. Pagter
J.I. Munro
M. Blum
P. Beame
R. Grossi
R. Raman
T. Asano
T.H. Cormen
T.M. Chan
V. Raman
Publication date: 01/01/2013

Given an unordered array of $N$ elements drawn from a totally ordered set and an integer $k$ in the range from $1$ to $N$, in the classic selection problem the task is to find the $k$-th smallest element in the array. We study the complexity of this problem in the space-restricted random-access model: The input array is stored on read-only memory, and the algorithm has access to a limited amount of workspace. We prove that the linear-time prune-and-search algorithm, presented in most textbooks on algorithms, can be modified to use $\Theta(N)$ bits instead of $\Theta(N)$ words of extra space. Prior to our work, the best known algorithm by Frederickson could perform the task with $\Theta(N)$ bits of extra space in $O(N \lg^{*} N)$ time. Our result separates the space-restricted random-access model and the multi-pass streaming model, since we can surpass the $\Omega(N \lg^{*} N)$ lower bound known for the latter model. We also generalize our algorithm for the case when the size of the workspace is $\Theta(S)$ bits, where $\lg^3 N \leq S \leq N$. The running time of our generalized algorithm is $O(N \lg^{*}(N/S) + N (\lg N) / \lg S)$, slightly improving over the $O(N \lg^{*}(N (\lg N)/S) + N (\lg N) / \lg S)$ bound of Frederickson's algorithm. To obtain the improvements mentioned above, we developed a new data structure, called the wavelet stack, that we use for repeated pruning. We expect the wavelet stack to be a useful tool in other applications as well. Comment: 16 pages, 1 figure, Preliminary version appeared in COCOON-201
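
For orientation, the textbook prune-and-search selection that the abstract takes as its starting point can be sketched as follows. This is the standard median-of-medians variant, which uses $\Theta(N)$ words of extra space; it is not the space-efficient algorithm of the paper and does not use the wavelet stack. The function name `select` and the 1-based rank convention are assumptions made for the illustration.

```python
def select(items, k):
    """Return the k-th smallest element (1-based) of items.

    Textbook median-of-medians prune-and-search; copies the input,
    so the extra space is Theta(N) words (NOT the paper's method).
    """
    if not 1 <= k <= len(items):
        raise ValueError("k must be in the range 1..N")
    items = list(items)  # work on a copy; the input stays untouched
    while True:
        if len(items) <= 5:
            return sorted(items)[k - 1]
        # Median of each group of five, then the median of those medians.
        groups = [sorted(items[i:i + 5]) for i in range(0, len(items), 5)]
        medians = [g[len(g) // 2] for g in groups]
        pivot = select(medians, (len(medians) + 1) // 2)
        # Prune: keep only the side that can still contain the answer.
        smaller = [x for x in items if x < pivot]
        equal_count = sum(1 for x in items if x == pivot)
        if k <= len(smaller):
            items = smaller
        elif k <= len(smaller) + equal_count:
            return pivot
        else:
            k -= len(smaller) + equal_count
            items = [x for x in items if x > pivot]
```

Calling `select([7, 2, 9, 4, 1, 8, 3], 3)` returns the third smallest element of the list. Each round discards a constant fraction of the remaining candidates, which is what gives the linear running time.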

arXiv.org e-Print Archive

Crossref

Copenhagen University Research Information System

Tight Lower Bound for Comparison-Based Quantile Summaries

Author: Cormode Graham
Veselý Pavel
Publication date: 16/01/2020

Quantiles, such as the median or percentiles, provide concise and useful information about the distribution of a collection of items, drawn from a totally ordered universe. We study data structures, called quantile summaries, which keep track of all quantiles, up to an error of at most $\varepsilon$. That is, an $\varepsilon$-approximate quantile summary first processes a stream of items and then, given any quantile query $0 \le \phi \le 1$, returns an item from the stream, which is a $\phi'$-quantile for some $\phi' = \phi \pm \varepsilon$. We focus on comparison-based quantile summaries that can only compare two items and are otherwise completely oblivious of the universe. The best such deterministic quantile summary to date, due to Greenwald and Khanna (SIGMOD '01), stores at most $O(\frac{1}{\varepsilon} \cdot \log \varepsilon N)$ items, where $N$ is the number of items in the stream. We prove that this space bound is optimal by showing a matching lower bound. Our result thus rules out the possibility of constructing a deterministic comparison-based quantile summary in space $f(\varepsilon) \cdot o(\log N)$, for any function $f$ that does not depend on $N$. As a corollary, we improve the lower bound for biased quantiles, which provide a stronger, relative-error guarantee of $(1 \pm \varepsilon) \cdot \phi$, and for other related computational tasks. Comment: 20 pages, 2 figures, major revision of the construction (Sec. 3) and some other parts of the pape
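
The $\varepsilon$-approximation guarantee stated in the abstract can be made concrete with a small brute-force checker. This is only a sketch for testing answers against a fully stored stream, not a quantile summary. The function name `is_approx_quantile` and the convention that an item of 1-based rank $r$ among $N$ items counts as an $r/N$-quantile are assumptions for the illustration.

```python
from bisect import bisect_left, bisect_right

def is_approx_quantile(stream, answer, phi, eps):
    """True if `answer` is a phi'-quantile of `stream` for some phi'
    with |phi' - phi| <= eps (assumed convention: rank r of N <-> r/N)."""
    ordered = sorted(stream)
    n = len(ordered)
    lo_rank = bisect_left(ordered, answer) + 1  # smallest 1-based rank of answer
    hi_rank = bisect_right(ordered, answer)     # largest 1-based rank of answer
    if hi_rank < lo_rank:
        return False  # answer does not occur in the stream at all
    # Accept if the rank interval of `answer` overlaps [(phi-eps)N, (phi+eps)N].
    return lo_rank <= (phi + eps) * n and hi_rank >= (phi - eps) * n
```

For a stream holding the integers 1 to 100, a median query ($\phi = 0.5$) with $\varepsilon = 0.02$ accepts the item 51 and rejects the item 55.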

arXiv.org e-Print Archive

Crossref

Warwick Research Archives Portal Repository

Comment: Monitoring Networked Applications With Incremental Quantile Estimation

Author: Lawrence Earl
Michailidis George
Nair Vijayan N.
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 02/08/2007

Our comments are in two parts. First, we make some observations regarding the methodology in Chambers et al. [arXiv:0708.0302]. Second, we briefly describe another interesting network monitoring problem that arises in the context of assessing quality of service, such as loss rates and delay distributions, in packet-switched networks. Comment: Published at http://dx.doi.org/10.1214/088342306000000600 in the Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

Crossref

A Fast Algorithm for Approximate Quantiles in High Speed Data Streams

Author: Qi Zhang
Wei Wang
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2007

We present a fast algorithm for computing approximate quantiles in high speed data streams with deterministic error bounds. For data streams of size $N$ where $N$ is unknown in advance, our algorithm partitions the stream into sub-streams of exponentially increasing size as they arrive. For each sub-stream which has a fixed size, we compute and maintain a multi-level summary structure using a novel algorithm. In order to achieve high speed performance, the algorithm uses simple block-wise merge and sample operations. Overall, our algorithms for fixed-size streams and arbitrary-size streams have a computational cost of $O(N \log(\frac{1}{\varepsilon} \log \varepsilon N))$ and an average per-element update cost of $O(\log \log N)$ if $\varepsilon$ is fixed.
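
The partitioning step described in the abstract, cutting a stream of unknown length into sub-streams of exponentially increasing size, can be sketched as below. Only the chunking is shown; the multi-level summary and the merge and sample operations are not reproduced. The name `exponential_substreams`, the starting size, and the doubling factor are assumptions for the illustration.

```python
from itertools import islice

def exponential_substreams(stream, first_size=1):
    """Yield consecutive chunks of sizes first_size, 2*first_size, 4*first_size, ...

    The last chunk may be shorter when the stream ends mid-chunk.
    """
    it = iter(stream)
    size = first_size
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return  # stream exhausted
        yield chunk
        size *= 2  # assumed growth factor of two
```

With doubling sizes, a stream of $N$ items is split into $O(\log N)$ sub-streams, and each sub-stream has a known size by the time it starts.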

CiteSeerX

Crossref

Approximate Quantile Computation over Sensor Networks

Author: Han Jiawei
Wang Wei
Yan Xifeng
Yang Jiong
Publication date: 01/02/2005

Sensor networks have been deployed in various environments, from battlefield surveillance to weather monitoring. The amount of data generated by the sensors can be large. One way to analyze such a large data set is to capture the essential statistics of the data. Thus quantile computation in a large-scale sensor network becomes an important but challenging problem. The data may be widely distributed, e.g., there may be thousands of sensors. In addition, the memory and bandwidth among sensors could be quite limited. Most previous quantile computation methods assume that the data is either stored or streaming in a centralized site, so they cannot be directly applied in the sensor environment. In this paper, we propose a novel algorithm to compute the quantile for sensor network data, which dynamically adapts to the memory limitations. Moreover, since sensors may update their values at any time, an incremental maintenance algorithm is developed to reduce the number of times that a global recomputation is needed upon updates. The performance and complexity of our algorithms are analyzed both theoretically and empirically on various large data sets, which demonstrate the high promise of our method.

Illinois Digital Environment for Access to Learning and Scholarship Repository