
3,064 research outputs found

Optimal Gossip Algorithms for Exact and Approximate Quantile Computations

Author: Haeupler Bernhard
Mohapatra Jeet
Su Hsin-Hao
Publication date: 25/11/2017

This paper gives drastically faster gossip algorithms to compute exact and approximate quantiles. Gossip algorithms, which allow each node to contact a uniformly random other node in each round, have been intensely studied and adopted in many applications due to their fast convergence and their robustness to failures. Kempe et al. [FOCS'03] gave gossip algorithms to compute important aggregate statistics if every node is given a value. In particular, they gave a beautiful $O(\log n + \log \frac{1}{\epsilon})$ round algorithm to $\epsilon$-approximate the sum of all values and an $O(\log^2 n)$ round algorithm to compute the exact $\phi$-quantile, i.e., the $\lceil \phi n \rceil$-th smallest value. We give a quadratically faster and in fact optimal gossip algorithm for the exact $\phi$-quantile problem which runs in $O(\log n)$ rounds. We furthermore show that one can achieve an exponential speedup if one allows for an $\epsilon$-approximation. We give an $O(\log \log n + \log \frac{1}{\epsilon})$ round gossip algorithm which computes a value of rank between $\phi n$ and $(\phi+\epsilon)n$ at every node, for any $0 \leq \phi \leq 1$ and $0 < \epsilon < 1$. Our algorithms are extremely simple and very robust: they can be operated with the same running times even if every transmission fails with a, potentially different, constant probability. We also give a matching $\Omega(\log \log n + \log \frac{1}{\epsilon})$ lower bound which shows that our algorithm is optimal for all values of $\epsilon$.
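The two targets named in this abstract, the exact φ-quantile and a value of rank between φn and (φ+ε)n, can be made concrete with a small centralized sketch. This is not the gossip algorithm from the paper; it only spells out the definitions. The function names `exact_quantile` and `is_eps_approximate` are hypothetical, and mapping φ = 0 to the minimum is an assumption made here.

```python
import math

def exact_quantile(values, phi):
    """Return the ceil(phi * n)-th smallest value (ranks are 1-indexed)."""
    n = len(values)
    # Assumption: phi = 0 is mapped to rank 1 (the minimum).
    rank = max(1, math.ceil(phi * n))
    return sorted(values)[rank - 1]

def is_eps_approximate(values, phi, eps, candidate):
    """Check that candidate has rank between phi*n and (phi+eps)*n."""
    n = len(values)
    # Rank of candidate = number of values less than or equal to it.
    rank = sum(1 for v in values if v <= candidate)
    return phi * n <= rank <= (phi + eps) * n

values = [7, 3, 10, 1, 9, 4, 6, 2, 8, 5]
median = exact_quantile(values, 0.5)               # 5th smallest of 10 values
ok = is_eps_approximate(values, 0.5, 0.25, 7)      # rank 7 lies in [5, 7.5]
too_far = is_eps_approximate(values, 0.5, 0.25, 9) # rank 9 lies outside
```

A distributed algorithm would have to reach the same answer at every node without ever collecting all values in one place.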

arXiv.org e-Print Archive

Crossref

Approximate Quantile Computation over Sensor Networks

Author: Han Jiawei
Wang Wei
Yan Xifeng
Yang Jiong
Publication date: 01/02/2005

Sensor networks have been deployed in various environments, from battlefield surveillance to weather monitoring. The amount of data generated by the sensors can be large. One way to analyze such a large data set is to capture the essential statistics of the data. Thus quantile computation in a large-scale sensor network becomes an important but challenging problem. The data may be widely distributed, e.g., there may be thousands of sensors. In addition, the memory and bandwidth among sensors could be quite limited. Most previous quantile computation methods assume that the data is either stored at or streamed to a centralized site, and so cannot be directly applied in the sensor environment. In this paper, we propose a novel algorithm to compute the quantile for sensor network data, which dynamically adapts to the memory limitations. Moreover, since sensors may update their values at any time, an incremental maintenance algorithm is developed to reduce the number of times that a global recomputation is needed upon updates. The performance and complexity of our algorithms are analyzed both theoretically and empirically on various large data sets, which demonstrate the high promise of our method.

Illinois Digital Environment for Access to Learning and Scholarship Repository

Optimal Exploitation of the Sentinel-2 Spectral Capabilities for Crop Leaf Area Index Mapping

Author: D'Urso Guido
Hank Tobias B.
Mauser Wolfram
Richter Katja
Vuolo Francesco
Publication venue: 'MDPI AG'
Publication date: 01/01/2012

The continuously increasing demand for accurate, quantitative, high-quality information on land surface properties will be met by a new generation of environmental Earth observation (EO) missions. One current example, associated with a high potential to contribute to those demands, is the multi-spectral ESA Sentinel-2 (S2) system. The present study focuses on the evaluation of the spectral information content needed for crop leaf area index (LAI) mapping in view of the future sensors. Data from a field campaign were used to determine the optimal spectral sampling from available S2 bands by inverting a radiative transfer model (PROSAIL) with look-up table (LUT) and artificial neural network (ANN) approaches. The overall LAI estimation performance of the proposed LUT approach (LUTN₅₀) was comparable, in terms of retrieval performance, with a tested and approved ANN method. Employing seven- and eight-band combinations, the LUTN₅₀ approach obtained an LAI RMSE of 0.53 and a normalized LAI RMSE of 0.12, which was comparable to the results of the ANN. However, the LUTN₅₀ method showed higher robustness and insensitivity to different band settings. The most frequently selected wavebands were located in the near-infrared and red-edge spectral regions. In conclusion, our results emphasize the potential benefits of the Sentinel-2 mission for agricultural applications.

Multidisciplinary Digital Publishing Institute

Archivio della ricerca - Università degli studi di Napoli Federico II

Directory of Open Access Journals

Open Access LMU

An Experimental Study of Distributed Quantile Estimation

Author: Zhuang Zixuan
Publication date: 01/01/2015

Quantiles are important statistics used to describe the distribution of a dataset. Given the quantiles of a dataset, we can easily learn its distribution, which is a fundamental problem in data analysis. Quite often, however, computing quantiles directly is infeasible due to memory limitations. Further, in many settings, such as the data streaming and sensor network models, even the data size is unpredictable. Although quantile computation has been widely studied, it has mostly been studied in the sequential setting. In this paper, we study several quantile computation algorithms in the distributed setting and compare them in terms of space usage, running time, and accuracy. Moreover, we provide detailed experimental comparisons between several popular algorithms. Our work focuses on approximate quantile algorithms that provide error bounds. Approximate quantiles have received more attention than exact ones since they are often faster and can be more easily adapted to the distributed setting while giving sufficiently good statistical information on the data sets. Comment: M.S. Thesis

arXiv.org e-Print Archive

Ezid

eScholarship - University of California

Randomized Algorithms for Tracking Distributed Count, Frequencies, and Ranks

Author: Huang Zengfeng
Yi Ke
Zhang Qin
Publication date: 02/12/2011

We show that randomization can lead to significant improvements for a few fundamental problems in distributed tracking. Our basis is the count-tracking problem, where there are $k$ players, each holding a counter $n_i$ that gets incremented over time, and the goal is to track an $\epsilon$-approximation of their sum $n=\sum_i n_i$ continuously at all times, using minimum communication. While the deterministic communication complexity of the problem is $\Theta(k/\epsilon \cdot \log N)$, where $N$ is the final value of $n$ when the tracking finishes, we show that with randomization, the communication cost can be reduced to $\Theta(\sqrt{k}/\epsilon \cdot \log N)$. Our algorithm is simple and uses only $O(1)$ space at each player, while the lower bound holds even assuming each player has infinite computing power. Then, we extend our techniques to two related distributed tracking problems: frequency-tracking and rank-tracking, and obtain similar improvements over previous deterministic algorithms. Both problems are of central importance in large data monitoring and analysis, and have been extensively studied in the literature. Comment: 19 pages, 1 figure
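To make the count-tracking setup concrete, here is a minimal simulation of a standard deterministic threshold baseline: each player reports its counter whenever it has grown by more than a factor of (1 + ε) since its last report. This is not the randomized algorithm from the paper. The function name `simulate_threshold_tracking` and the round-robin workload are assumptions chosen for illustration.

```python
def simulate_threshold_tracking(increments, k, eps):
    """Simulate k players sending thresholded updates to a coordinator.

    increments: iterable of player indices, one entry per unit increment.
    Returns (messages_sent, worst_relative_error_seen).
    """
    true_counts = [0] * k   # n_i held locally by each player
    reported = [0] * k      # last value of n_i known to the coordinator
    messages = 0
    worst = 0.0
    for i in increments:
        true_counts[i] += 1
        # Report only when the counter exceeds (1 + eps) times the last report.
        if true_counts[i] > (1 + eps) * reported[i]:
            reported[i] = true_counts[i]
            messages += 1
        n = sum(true_counts)
        estimate = sum(reported)  # coordinator's estimate, never above n
        worst = max(worst, (n - estimate) / n)
    return messages, worst

k, eps, total = 4, 0.1, 4000
workload = [t % k for t in range(total)]  # hypothetical round-robin increments
messages, worst = simulate_threshold_tracking(workload, k, eps)
```

Because every player keeps its counter within a factor (1 + ε) of its last report, the coordinator's estimate stays within relative error ε of the true sum, while each player sends on the order of (1/ε)·log N messages.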

arXiv.org e-Print Archive

Hong Kong University of Science and Technology Institutional Repository

Tight Lower Bound for Comparison-Based Quantile Summaries

Author: Cormode Graham
Veselý Pavel
Publication date: 16/01/2020

Quantiles, such as the median or percentiles, provide concise and useful information about the distribution of a collection of items, drawn from a totally ordered universe. We study data structures, called quantile summaries, which keep track of all quantiles, up to an error of at most $\varepsilon$. That is, an $\varepsilon$-approximate quantile summary first processes a stream of items and then, given any quantile query $0\le \phi\le 1$, returns an item from the stream, which is a $\phi'$-quantile for some $\phi' = \phi \pm \varepsilon$. We focus on comparison-based quantile summaries that can only compare two items and are otherwise completely oblivious of the universe. The best such deterministic quantile summary to date, due to Greenwald and Khanna (SIGMOD '01), stores at most $O(\frac{1}{\varepsilon}\cdot \log \varepsilon N)$ items, where $N$ is the number of items in the stream. We prove that this space bound is optimal by showing a matching lower bound. Our result thus rules out the possibility of constructing a deterministic comparison-based quantile summary in space $f(\varepsilon)\cdot o(\log N)$, for any function $f$ that does not depend on $N$. As a corollary, we improve the lower bound for biased quantiles, which provide a stronger, relative-error guarantee of $(1\pm \varepsilon)\cdot \phi$, and for other related computational tasks. Comment: 20 pages, 2 figures, major revision of the construction (Sec. 3) and some other parts of the paper
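The ε-approximate query guarantee described here can be illustrated with a deliberately simple offline summary, assuming all items are available at once and are distinct. It is not the Greenwald and Khanna streaming summary and says nothing about the lower bound. It only shows what "rank error at most εN" means for a comparison-based structure. The names `build_offline_summary` and `query` are hypothetical.

```python
import bisect
import math

def build_offline_summary(items, eps):
    """Keep (rank, item) pairs spaced at most floor(2*eps*N) ranks apart."""
    data = sorted(items)            # uses comparisons only
    n = len(data)
    step = max(1, math.floor(2 * eps * n))
    ranks = list(range(1, n + 1, step))
    if ranks[-1] != n:
        ranks.append(n)             # always keep the maximum
    return [(r, data[r - 1]) for r in ranks], n

def query(summary, n, phi):
    """Return the stored (rank, item) whose rank is nearest to ceil(phi*n)."""
    target = min(n, max(1, math.ceil(phi * n)))
    ranks = [r for r, _ in summary]
    j = bisect.bisect_left(ranks, target)      # first stored rank >= target
    candidates = summary[max(0, j - 1): j + 1]
    return min(candidates, key=lambda rv: abs(rv[0] - target))

items = list(range(999, -1, -1))    # 1000 distinct items in reverse order
eps = 0.05
summary, n = build_offline_summary(items, eps)
rank, item = query(summary, n, 0.5)
```

Consecutive stored ranks differ by at most ⌊2εN⌋, so the nearest stored rank is within εN of any target rank, using roughly 1/(2ε) stored items in this offline setting.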

arXiv.org e-Print Archive

Crossref

Warwick Research Archives Portal Repository