VerdictDB: Universalizing Approximate Query Processing
Despite 25 years of research in academia, approximate query processing (AQP)
has had little industrial adoption. One of the major causes of this slow
adoption is the reluctance of traditional vendors to make radical changes to
their legacy codebases, and the preoccupation of newer vendors (e.g.,
SQL-on-Hadoop products) with implementing standard features. Additionally, the
few AQP engines that are available are each tied to a specific platform and
require users to completely abandon their existing databases---an unrealistic
expectation given the infancy of the AQP technology. Therefore, we argue that a
universal solution is needed: a database-agnostic approximation engine that
will widen the reach of this emerging technology across various platforms.
Our proposal, called VerdictDB, uses a middleware architecture that requires
no changes to the backend database, and thus, can work with all off-the-shelf
engines. Operating at the driver-level, VerdictDB intercepts analytical queries
issued to the database and rewrites them into another query that, if executed
by any standard relational engine, will yield sufficient information for
computing an approximate answer. VerdictDB uses the returned result set to
compute an approximate answer and error estimates, which are then passed on to
the user or application. However, lack of access to the query execution layer
introduces significant challenges in terms of generality, correctness, and
efficiency. This paper shows how VerdictDB overcomes these challenges and
delivers up to 171× speedup (18.45× on average) for a variety of
existing engines, such as Impala, Spark SQL, and Amazon Redshift, while
incurring less than 2.6% relative error. VerdictDB is open-sourced under Apache
License.
Comment: Extended technical report of the paper that appeared in Proceedings
of the 2018 International Conference on Management of Data, pp. 1461-1476.
ACM, 201
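The abstract above describes rewriting a query so that a standard engine returns enough information to compute an approximate answer with error estimates. The following is a minimal, generic sketch of that idea in Python; it is not VerdictDB's actual method or API. It assumes a Bernoulli (per-row) sample, a scaled-up SUM, and a normal-approximation 95% error bound. The function name `approximate_sum` and its parameters are hypothetical.

```python
import math
import random


def approximate_sum(values, sample_rate, seed=0):
    """Estimate sum(values) from a Bernoulli sample.

    Returns (estimate, half_width), where half_width is an approximate
    95% error bound based on the normal approximation.
    This is an illustrative assumption, not VerdictDB's estimator.
    """
    rng = random.Random(seed)
    # Keep each row independently with probability sample_rate.
    sample = [v for v in values if rng.random() < sample_rate]
    # Scale the sampled sum by 1/sample_rate (Horvitz-Thompson estimator).
    estimate = sum(sample) / sample_rate
    # Unbiased variance estimate for the scaled sum under Bernoulli sampling.
    variance = sum(v * v for v in sample) * (1 - sample_rate) / sample_rate ** 2
    half_width = 1.96 * math.sqrt(variance)
    return estimate, half_width


if __name__ == "__main__":
    data = list(range(1, 10001))
    est, err = approximate_sum(data, sample_rate=0.1)
    print(f"exact={sum(data)} estimate={est:.0f} +/- {err:.0f}")
```

In a middleware setting, the sampling and scaling would be expressed in the rewritten SQL and executed by the backend; here both steps run in plain Python only to keep the sketch self-contained.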
Fast and Robust Rank Aggregation against Model Misspecification
In rank aggregation, preferences from different users are summarized into a
total order under the homogeneous data assumption. Thus, model misspecification
arises and rank aggregation methods take some noise models into account.
However, they all rely on certain noise model assumptions and cannot handle
agnostic noises in the real world. In this paper, we propose CoarsenRank, which
rectifies the underlying data distribution directly and aligns it to the
homogeneous data assumption without involving any noise model. To this end, we
define a neighborhood of the data distribution over which Bayesian inference of
CoarsenRank is performed, and therefore the resultant posterior enjoys
robustness against model misspecification. Further, we derive a tractable
closed-form solution for CoarsenRank making it computationally efficient.
Experiments on real-world datasets show that CoarsenRank is fast and robust,
achieving consistent improvement over baseline methods.
Secure and Privacy-Preserving Data Aggregation Protocols for Wireless Sensor Networks
This chapter discusses the need for security and privacy protection mechanisms
in aggregation protocols used in wireless sensor networks (WSN). It presents a
comprehensive state of the art discussion on the various privacy protection
mechanisms used in WSNs and particularly focuses on the CPDA protocols proposed
by He et al. (INFOCOM 2007). It identifies a security vulnerability in the CPDA
protocol and proposes a mechanism to plug that vulnerability. To demonstrate
the need for security in the aggregation process, the chapter further presents
various threats in WSN aggregation mechanisms. A large number of existing
protocols for secure aggregation in WSN are discussed briefly and a protocol is
proposed for secure aggregation which can detect false data injected by
malicious nodes in a WSN. The performance of the protocol is also presented.
The chapter concludes by highlighting some future directions of research in
secure data aggregation in WSNs.
Comment: 32 pages, 7 figures, 3 tables
Multipath streaming: fundamental limits and efficient algorithms
We investigate streaming over multiple links. A file is split into small
units called chunks that may be requested on the various links according to
some policy, and received after some random delay. After a start-up time called
pre-buffering time, received chunks are played at a fixed speed. There is
starvation if the chunk to be played has not yet arrived. We provide lower
bounds (fundamental limits) on the starvation probability of any policy. We
further propose simple, order-optimal policies that require no feedback. For
general delay distributions, we provide tractable upper bounds for the
starvation probability of the proposed policies, allowing the
pre-buffering time to be selected appropriately. We specialize our results to: (i) links that
employ CSMA or opportunistic scheduling at the packet level, (ii) links shared
with a primary user, and (iii) links that use fair rate sharing at the flow level.
We consider a generic model so that our results give insight into the design
and performance of media streaming over (a) wired networks with several paths
between the source and destination, (b) wireless networks featuring spectrum
aggregation and (c) multi-homed wireless networks.
Comment: 24 pages
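The streaming model in the abstract above (chunks requested over several links, a pre-buffering time, fixed-speed playback, starvation when a chunk is late) can be explored with a small Monte Carlo simulation. The sketch below is not the paper's analysis or one of its policies. It assumes a round-robin, no-feedback assignment of chunks to links, sequential downloads on each link, and independent exponential delays. The name `starvation_probability` and all parameters are hypothetical.

```python
import random


def starvation_probability(num_chunks, num_links, mean_delay,
                           prebuffer, play_interval,
                           trials=2000, seed=0):
    """Estimate the probability that playback starves at least once.

    Assumptions (illustrative only): chunk i is requested on link
    i % num_links, each link downloads its chunks one after another,
    and each download takes an exponential time with mean mean_delay.
    Chunk i is due for playback at prebuffer + i * play_interval.
    """
    rng = random.Random(seed)
    starved = 0
    for _ in range(trials):
        # Time at which each link finishes its most recent chunk.
        link_clock = [0.0] * num_links
        for i in range(num_chunks):
            link = i % num_links
            link_clock[link] += rng.expovariate(1.0 / mean_delay)
            # Starvation: the chunk arrives after its playback deadline.
            if link_clock[link] > prebuffer + i * play_interval:
                starved += 1
                break
    return starved / trials


if __name__ == "__main__":
    for t in (0.5, 2.0, 5.0):
        p = starvation_probability(200, 3, 1.0, t, 0.5)
        print(f"prebuffer={t}: estimated starvation probability {p:.3f}")
```

Sweeping `prebuffer` as in the usage loop gives a rough empirical view of how the pre-buffering time trades start-up delay against starvation under these assumptions.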
Resilient networking in wireless sensor networks
This report deals with security in wireless sensor networks (WSNs),
especially at the network layer. Multiple secure routing protocols have been
proposed in the literature. However, they often rely on cryptography to secure
routing functionalities. Cryptography alone is not enough to defend against
multiple attacks that result from node compromise. Therefore, we need more
algorithmic solutions. In this report, we focus on the behavior of routing
protocols to determine which properties make them more resilient to attacks.
Our aim is to find some answers to the following questions. Are there any
existing protocols, not designed initially for security, but which already
contain some inherently resilient properties against attacks under which some
portion of the network nodes is compromised? If yes, which specific behaviors
are making these protocols more resilient? We propose in this report an
overview of security strategies for WSNs in general, including existing attacks
and defensive measures. In this report we focus on the network layer in
particular, and an analysis of the behavior of four particular routing
protocols is provided to determine their inherent resiliency to insider
attacks. The protocols considered are: Dynamic Source Routing (DSR),
Gradient-Based Routing (GBR), Greedy Forwarding (GF) and Random Walk Routing
(RWR).