Search CORE

123,423 research outputs found

VerdictDB: Universalizing Approximate Query Processing

Author: Bickel P. J.
Bootstrapping Sample Survey Data Comparing Recent
Canty A. J.
Condie T.
Eykholt K.
Flajolet P.
Ganti V.
Hall P.
Kleiner A.
Mayo D. G.
Meliou A.
Mozafari B.
Mozafari B.
Mozafari B.
Mozafari B.
Olston C.
Park Y.
Politis D. N.
Sidirourgos L.
Su H.
Vrbsky S.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 08/11/2018
Field of study

Despite 25 years of research in academia, approximate query processing (AQP) has had little industrial adoption. One of the major causes of this slow adoption is the reluctance of traditional vendors to make radical changes to their legacy codebases, and the preoccupation of newer vendors (e.g., SQL-on-Hadoop products) with implementing standard features. Additionally, the few AQP engines that are available are each tied to a specific platform and require users to completely abandon their existing databases---an unrealistic expectation given the infancy of the AQP technology. Therefore, we argue that a universal solution is needed: a database-agnostic approximation engine that will widen the reach of this emerging technology across various platforms. Our proposal, called VerdictDB, uses a middleware architecture that requires no changes to the backend database, and thus, can work with all off-the-shelf engines. Operating at the driver-level, VerdictDB intercepts analytical queries issued to the database and rewrites them into another query that, if executed by any standard relational engine, will yield sufficient information for computing an approximate answer. VerdictDB uses the returned result set to compute an approximate answer and error estimates, which are then passed on to the user or application. However, lack of access to the query execution layer introduces significant challenges in terms of generality, correctness, and efficiency. This paper shows how VerdictDB overcomes these challenges and delivers up to 171

\times

speedup (18.45

\times

on average) for a variety of existing engines, such as Impala, Spark SQL, and Amazon Redshift, while incurring less than 2.6% relative error. VerdictDB is open-sourced under Apache License.Comment: Extended technical report of the paper that appeared in Proceedings of the 2018 International Conference on Management of Data, pp. 1461-1476. ACM, 201

arXiv.org e-Print Archive

Crossref

Fast and Robust Rank Aggregation against Model Misspecification

Author: Chen Weijie
Niu Gang
Pan Yuangang
Sugiyama Masashi
Tsang Ivor W.
Publication venue
Publication date: 29/05/2019
Field of study

In rank aggregation, preferences from different users are summarized into a total order under the homogeneous data assumption. Thus, model misspecification arises and rank aggregation methods take some noise models into account. However, they all rely on certain noise model assumptions and cannot handle agnostic noises in the real world. In this paper, we propose CoarsenRank, which rectifies the underlying data distribution directly and aligns it to the homogeneous data assumption without involving any noise model. To this end, we define a neighborhood of the data distribution over which Bayesian inference of CoarsenRank is performed, and therefore the resultant posterior enjoys robustness against model misspecification. Further, we derive a tractable closed-form solution for CoarsenRank making it computationally efficient. Experiments on real-world datasets show that CoarsenRank is fast and robust, achieving consistent improvement over baseline methods

arXiv.org e-Print Archive

Secure and Privacy-Preserving Data Aggregation Protocols for Wireless Sensor Networks

Author: Sen Jaydip
Publication venue
Publication date: 06/03/2012
Field of study

This chapter discusses the need of security and privacy protection mechanisms in aggregation protocols used in wireless sensor networks (WSN). It presents a comprehensive state of the art discussion on the various privacy protection mechanisms used in WSNs and particularly focuses on the CPDA protocols proposed by He et al. (INFOCOM 2007). It identifies a security vulnerability in the CPDA protocol and proposes a mechanism to plug that vulnerability. To demonstrate the need of security in aggregation process, the chapter further presents various threats in WSN aggregation mechanisms. A large number of existing protocols for secure aggregation in WSN are discussed briefly and a protocol is proposed for secure aggregation which can detect false data injected by malicious nodes in a WSN. The performance of the protocol is also presented. The chapter concludes while highlighting some future directions of research in secure data aggregation in WSNs.Comment: 32 pages, 7 figures, 3 table

arXiv.org e-Print Archive

IntechOpen

Multipath streaming: fundamental limits and efficient algorithms

Author: Combes Richard
Elayoubi Salah-Eddine
Sidi Habib
Publication venue
Publication date: 14/06/2016
Field of study

We investigate streaming over multiple links. A file is split into small units called chunks that may be requested on the various links according to some policy, and received after some random delay. After a start-up time called pre-buffering time, received chunks are played at a fixed speed. There is starvation if the chunk to be played has not yet arrived. We provide lower bounds (fundamental limits) on the starvation probability of any policy. We further propose simple, order-optimal policies that require no feedback. For general delay distributions, we provide tractable upper bounds for the starvation probability of the proposed policies, allowing to select the pre-buffering time appropriately. We specialize our results to: (i) links that employ CSMA or opportunistic scheduling at the packet level, (ii) links shared with a primary user (iii) links that use fair rate sharing at the flow level. We consider a generic model so that our results give insight into the design and performance of media streaming over (a) wired networks with several paths between the source and destination, (b) wireless networks featuring spectrum aggregation and (c) multi-homed wireless networks.Comment: 24 page

arXiv.org e-Print Archive

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Rennes 1

Resilient networking in wireless sensor networks

Author: Erdene-Ochir Ochirkhand
Kountouris Apostolos
Minier Marine
Valois Fabrice
Publication venue
Publication date: 15/03/2010
Field of study

This report deals with security in wireless sensor networks (WSNs), especially in network layer. Multiple secure routing protocols have been proposed in the literature. However, they often use the cryptography to secure routing functionalities. The cryptography alone is not enough to defend against multiple attacks due to the node compromise. Therefore, we need more algorithmic solutions. In this report, we focus on the behavior of routing protocols to determine which properties make them more resilient to attacks. Our aim is to find some answers to the following questions. Are there any existing protocols, not designed initially for security, but which already contain some inherently resilient properties against attacks under which some portion of the network nodes is compromised? If yes, which specific behaviors are making these protocols more resilient? We propose in this report an overview of security strategies for WSNs in general, including existing attacks and defensive measures. In this report we focus at the network layer in particular, and an analysis of the behavior of four particular routing protocols is provided to determine their inherent resiliency to insider attacks. The protocols considered are: Dynamic Source Routing (DSR), Gradient-Based Routing (GBR), Greedy Forwarding (GF) and Random Walk Routing (RWR)

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server