Medians and Beyond: New Aggregation Techniques for Sensor Networks
Wireless sensor networks offer the potential to span and monitor large
geographical areas inexpensively. Sensors, however, have significant power
constraints (battery life), making communication very expensive. Another
important issue in the context of sensor-based information systems is that
individual sensor readings are inherently unreliable. In order to address these
two aspects, sensor database systems like TinyDB and Cougar enable in-network
data aggregation to reduce the communication cost and improve reliability. The
existing data aggregation techniques, however, are limited to relatively simple
types of queries such as SUM, COUNT, AVG, and MIN/MAX. In this paper we propose
a data aggregation scheme that significantly extends the class of queries that
can be answered using sensor networks. These queries include (approximate)
quantiles, such as the median, the most frequent data values, such as the
consensus value, a histogram of the data distribution, as well as range
queries. In our scheme, each sensor aggregates the data it has received from
other sensors into a fixed (user specified) size message. We provide strict
theoretical guarantees on the approximation quality of the queries in terms of
the message size. We evaluate the performance of our aggregation scheme by
simulation and demonstrate its accuracy, scalability and low resource
utilization for highly variable input data sets.
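The idea of folding child data into a fixed-size message can be illustrated with a minimal sketch. This is not the scheme proposed in the paper: it swaps in a plain equal-width histogram, and the value range `LO`/`HI`, the bucket count `BUCKETS`, and all function names are assumptions made for illustration. Each node sends exactly `BUCKETS` counters regardless of how many readings lie below it, and the quantile estimate is off by at most about one bucket width.

```python
from typing import List

# Assumed for illustration: readings fall in [LO, HI) and the
# user-specified message size is BUCKETS counters.
LO, HI, BUCKETS = 0.0, 100.0, 10
WIDTH = (HI - LO) / BUCKETS


def bucket_of(value: float) -> int:
    """Map a reading to a bucket index, clamping out-of-range values."""
    idx = int((value - LO) / WIDTH)
    return min(max(idx, 0), BUCKETS - 1)


def summarize(readings: List[float]) -> List[int]:
    """Build the fixed-size message for a node's own readings."""
    counts = [0] * BUCKETS
    for r in readings:
        counts[bucket_of(r)] += 1
    return counts


def merge(a: List[int], b: List[int]) -> List[int]:
    """Combine two messages; the result has the same fixed size."""
    return [x + y for x, y in zip(a, b)]


def approx_quantile(counts: List[int], q: float) -> float:
    """Return the midpoint of the bucket containing the q-quantile."""
    total = sum(counts)
    if total == 0:
        raise ValueError("no readings aggregated")
    target = q * total
    running = 0
    for i, c in enumerate(counts):
        running += c
        if c > 0 and running >= target:
            return LO + (i + 0.5) * WIDTH
    return HI - WIDTH / 2


# A parent merges the messages of two children with its own reading.
leaf_a = summarize([12, 15, 18])
leaf_b = summarize([55, 61])
parent = merge(merge(leaf_a, leaf_b), summarize([33]))
median_estimate = approx_quantile(parent, 0.5)
```

The same merged counters also answer histogram and range-count queries directly, which is why a single fixed-size summary can serve several query types at once.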
Query Workload-Aware Index Structures for Range Searches in 1D, 2D, and High-Dimensional Spaces
Most current database management systems are optimized for single query execution.
Yet, often, queries come as part of a query workload. Therefore, there is a need
for index structures that can take into consideration the existence of multiple queries in a
query workload and efficiently produce accurate results for the entire query workload.
These index structures should be scalable to handle large amounts of data as well as
large query workloads.
The main objective of this dissertation is to create and design scalable index structures
that are optimized for range query workloads. Range queries are an important
type of queries with wide-ranging applications. There are no existing index structures
that are optimized for efficient execution of range query workloads. There are
also unique challenges that need to be addressed for range queries in 1D, 2D, and
high-dimensional spaces. In this work, I introduce novel cost models, index selection
algorithms, and storage mechanisms that can tackle these challenges and efficiently
process a given range query workload in 1D, 2D, and high-dimensional spaces. In particular,
I introduce the index structures, HCS (for 1D spaces), cSHB (for 2D spaces),
and PSLSH (for high-dimensional spaces) that are designed specifically to efficiently
handle range query workload and the unique challenges arising from their respective
spaces. I experimentally show the effectiveness of the above proposed index structures
by comparing with state-of-the-art techniques.
Dissertation/Thesis, Doctoral Dissertation, Computer Science 201
Dragon: Multidimensional Range Queries on Distributed Aggregation Trees
Distributed query processing is of paramount importance in next-generation distribution services, such as Internet of
Things (IoT) and cyber-physical systems. Although several forms of support for multi-attribute range queries have been proposed for
peer-to-peer systems, these solutions must be rethought to fully meet the requirements of new computational paradigms
for IoT, like fog computing. This paper proposes dragon, an efficient support for distributed multi-dimensional range
query processing targeting efficient query resolution on highly dynamic data. In dragon, nodes at the edges of the
network collect and publish multi-dimensional data. The nodes collectively manage an aggregation tree storing data
digests which are then exploited, when resolving queries, to prune the sub-trees containing few or no relevant matches.
Multi-attribute queries are managed by linearising the attribute space through space-filling curves. We extensively
analysed different aggregation and query resolution strategies in a wide spectrum of experimental set-ups. We show that
dragon efficiently manages fast-changing data values. Further, we show that dragon resolves queries by contacting a
lower number of nodes when compared to a similar approach in the state of the art.
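Linearising a multi-attribute space with a space-filling curve can be sketched as follows. The abstract does not say which curve dragon uses; this example assumes a two-attribute Z-order (Morton) curve, and the function names and the 16-bit attribute width are hypothetical. Because the Z-order key grows monotonically with each attribute, every point inside a query rectangle has a key between the keys of the rectangle's lower and upper corners, so that interval is a safe (if loose) one-dimensional range to search.

```python
from typing import Tuple


def interleave(x: int, y: int, bits: int = 16) -> int:
    """Z-order key: bit i of x goes to bit 2i, bit i of y to bit 2i+1."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key


def deinterleave(key: int, bits: int = 16) -> Tuple[int, int]:
    """Recover the two attribute values from a Z-order key."""
    x = y = 0
    for i in range(bits):
        x |= ((key >> (2 * i)) & 1) << i
        y |= ((key >> (2 * i + 1)) & 1) << i
    return x, y


def key_interval(x_lo: int, y_lo: int, x_hi: int, y_hi: int,
                 bits: int = 16) -> Tuple[int, int]:
    """Key range covering every point of the rectangle.

    The interval may also contain keys of points outside the
    rectangle, so candidates still need a final attribute check.
    """
    return interleave(x_lo, y_lo, bits), interleave(x_hi, y_hi, bits)


# Query rectangle 2 <= x <= 3, 2 <= y <= 3 mapped to one key interval.
lo, hi = key_interval(2, 2, 3, 3)
inside = [interleave(x, y) for x in (2, 3) for y in (2, 3)]
```

A one-dimensional key interval like this is what allows a tree node holding a digest of its subtree's key range to be skipped when the two ranges do not overlap.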
Integrating data warehouses with web data : a survey
This paper surveys the most relevant research on combining Data Warehouse (DW) and Web data. It studies the XML
technologies that are currently being used to integrate, store, query, and retrieve Web data and their application to DWs. The paper
reviews different DW distributed architectures and the use of XML languages as an integration tool in these systems. It also introduces
the problem of dealing with semistructured data in a DW. It studies Web data repositories, the design of multidimensional databases for
XML data sources, and the XML extensions of OnLine Analytical Processing techniques. The paper addresses the application of
information retrieval technology in a DW to exploit text-rich document collections. The authors hope that the paper will help to discover
the main limitations and opportunities offered by the combination of the DW and Web fields, as well as to identify open research
line
- …