Distributed anonymous discrete function computation

Author: Hendrickx Julien M.
Olshevsky Alex
Tsitsiklis John N.
Publication date: 01/01/2011

We propose a model for deterministic distributed function computation by a network of identical and anonymous nodes. In this model, each node has bounded computation and storage capabilities that do not grow with the network size. Furthermore, each node only knows its neighbors, not the entire graph. Our goal is to characterize the class of functions that can be computed within this model. In our main result, we provide a necessary condition for computability which we show to be nearly sufficient, in the sense that every function that satisfies this condition can at least be approximated. The problem of computing suitably rounded averages in a distributed manner plays a central role in our development; we provide an algorithm that solves it in time that grows quadratically with the size of the network.
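As a rough illustration of the distributed averaging problem named in this abstract, the following is a minimal sketch of standard linear average consensus with Metropolis weights. It is not the paper's bounded-memory algorithm: it uses real-valued states and assumes each node can learn the degrees of its neighbors. The function name `metropolis_consensus` and the example path graph are hypothetical.

```python
def metropolis_consensus(values, edges, rounds):
    """Synchronous average consensus on an undirected graph.

    Node indices are used only to simulate the network; the update rule
    itself depends on neighbor values and degrees, not on node identities.
    """
    n = len(values)
    neighbors = {i: set() for i in range(n)}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    x = [float(v) for v in values]
    for _ in range(rounds):
        nxt = list(x)
        for i in range(n):
            for j in neighbors[i]:
                # Metropolis weight: symmetric in i and j, so the sum
                # (and hence the average) of all states is preserved.
                w = 1.0 / (1 + max(len(neighbors[i]), len(neighbors[j])))
                nxt[i] += w * (x[j] - x[i])
        x = nxt
    return x


# Hypothetical example: five nodes on a path, three of them holding a 1.
values = [0, 1, 1, 0, 1]
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
result = metropolis_consensus(values, edges, rounds=200)
print([round(v, 3) for v in result])
```

Every node's state approaches the network average (here 3/5) using only local exchanges. Rounding that state locally would give a quantized average, although the paper's setting, with bounded storage per node, requires a different construction.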

arXiv.org e-Print Archive

DSpace@MIT

DIAL UCLouvain

An Iterative Scheme for Leverage-based Approximate Aggregation

Author: Han Shanshan
Li Jianzhong
Wan Jialin
Wang Hongzhi
Publication date: 22/01/2019

The current data explosion poses great challenges to approximate aggregation that is both efficient and accurate. To address this problem, we propose a novel approach that computes aggregation answers with high accuracy using only a small portion of the data. We introduce leverages to reflect individual differences in the samples from a statistical perspective. Two kinds of estimators, the leverage-based estimator and the sketch estimator (a "rough picture" of the aggregation answer), constrain each other and are iteratively improved according to the actual conditions until their difference is below a threshold. Due to the iteration mechanism and the leverages, our approach achieves high accuracy. Moreover, some features, such as not needing to record the sampled data and being easy to extend to various execution modes (e.g., the online mode), make our approach well suited to big data. Experiments show that our approach performs extremely well: compared with uniform sampling, it achieves high-quality answers with only 1/3 of the sample size. Comment: 17 pages, 9 figures

arXiv.org e-Print Archive

Crossref

Algorithms for Provisioning Queries and Analytics

Author: Assadi Sepehr
Khanna Sanjeev
Li Yang
Tannen Val
Publication date: 18/12/2015

Provisioning is a technique for avoiding repeated expensive computations in what-if analysis. Given a query, an analyst formulates $k$ hypotheticals, each retaining some of the tuples of a database instance, possibly overlapping, and she wishes to answer the query under scenarios, where a scenario is defined by a subset of the hypotheticals that are "turned on". We say that a query admits compact provisioning if given any database instance and any $k$ hypotheticals, one can create a poly-size (in $k$) sketch that can then be used to answer the query under any of the $2^{k}$ possible scenarios without accessing the original instance. In this paper, we focus on provisioning complex queries that combine relational algebra (the logical component), grouping, and statistics/analytics (the numerical component). We first show that queries that compute quantiles or linear regression (as well as simpler queries that compute count and sum/average of positive values) can be compactly provisioned to provide (multiplicative) approximate answers to an arbitrary precision. In contrast, exact provisioning for each of these statistics requires the sketch size to be exponential in $k$. We then establish that for any complex query whose logical component is a positive relational algebra query, as long as the numerical component can be compactly provisioned, the complex query itself can be compactly provisioned. On the other hand, introducing negation or recursion in the logical component again requires the sketch size to be exponential in $k$. While our positive results use algorithms that do not access the original instance after a scenario is known, we prove our lower bounds even for the case when, knowing the scenario, limited access to the instance is allowed.

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Distributed anonymous function computation in information fusion and multiagent systems

Author: Hendrickx Julien M.
Olshevsky Alex
Tsitsiklis John N.
Publication date: 01/01/2009

We propose a model for deterministic distributed function computation by a network of identical and anonymous nodes, with bounded computation and storage capabilities that do not scale with the network size. Our goal is to characterize the class of functions that can be computed within this model. In our main result, we exhibit a class of non-computable functions, and prove that every function outside this class can at least be approximated. The problem of computing averages in a distributed manner plays a central role in our development.

arXiv.org e-Print Archive

Interactive querying and data visualization for abuse detection in social network sites

Author: De Turck Filip
Ordonez Ante Leandro
Van Seghbroeck Gregory
Vanhove Thomas
Wauters Tim
Publication date: 01/01/2016

Crossref

Ghent University Academic Bibliography