5,205 research outputs found
Interactive Data Exploration with Smart Drill-Down
We present {\em smart drill-down}, an operator for interactively exploring a
relational table to discover and summarize "interesting" groups of tuples. Each
group of tuples is described by a {\em rule}. For instance, the rule tells us that there are a thousand tuples with value in the
first column and in the second column (and any value in the third column).
Smart drill-down presents an analyst with a list of rules that together
describe interesting aspects of the table. The analyst can tailor the
definition of interesting, and can interactively apply smart drill-down on an
existing rule to explore that part of the table. We demonstrate that the
underlying optimization problems are {\sc NP-Hard}, and describe an algorithm
for finding the approximately optimal list of rules to display when the user
uses a smart drill-down, and a dynamic sampling scheme for efficiently
interacting with large tables. Finally, we perform experiments on real datasets
on our experimental prototype to demonstrate the usefulness of smart drill-down
and study the performance of our algorithms
Towards a Holistic Integration of Spreadsheets with Databases: A Scalable Storage Engine for Presentational Data Management
Spreadsheet software is the tool of choice for interactive ad-hoc data
management, with adoption by billions of users. However, spreadsheets are not
scalable, unlike database systems. On the other hand, database systems, while
highly scalable, do not support interactivity as a first-class primitive. We
are developing DataSpread, to holistically integrate spreadsheets as a
front-end interface with databases as a back-end datastore, providing
scalability to spreadsheets, and interactivity to databases, an integration we
term presentational data management (PDM). In this paper, we make a first step
towards this vision: developing a storage engine for PDM, studying how to
flexibly represent spreadsheet data within a database and how to support and
maintain access by position. We first conduct an extensive survey of
spreadsheet use to motivate our functional requirements for a storage engine
for PDM. We develop a natural set of mechanisms for flexibly representing
spreadsheet data and demonstrate that identifying the optimal representation is
NP-Hard; however, we develop an efficient approach to identify the optimal
representation from an important and intuitive subclass of representations. We
extend our mechanisms with positional access mechanisms that don't suffer from
cascading update issues, leading to constant time access and modification
performance. We evaluate these representations on a workload of typical
spreadsheets and spreadsheet operations, providing up to 20% reduction in
storage, and up to 50% reduction in formula evaluation time
A network-aware framework for energy-efficient data acquisition in wireless sensor networks
Wireless sensor networks enable users to monitor the physical world at an extremely high fidelity. In order to collect the data generated by these tiny-scale devices, the data management community has proposed the utilization of declarative data-acquisition frameworks. While these frameworks have facilitated the energy-efficient retrieval of data from the physical environment, they were agnostic of the underlying network topology and also did not support advanced query processing semantics. In this paper we present KSpot+, a distributed network-aware framework that optimizes network efficiency by combining three components: (i) the tree balancing module, which balances the workload of each sensor node by constructing efficient network topologies; (ii) the workload balancing module, which minimizes data reception inefficiencies by synchronizing the sensor network activity intervals; and (iii) the query processing module, which supports advanced query processing semantics. In order to validate the efficiency of our approach, we have developed a prototype implementation of KSpot+ in nesC and JAVA. In our experimental evaluation, we thoroughly assess the performance of KSpot+ using real datasets and show that KSpot+ provides significant energy reductions under a variety of conditions, thus significantly prolonging the longevity of a WSN
- …