6,842 research outputs found
Distributed top-k aggregation queries at large
Top-k query processing is a fundamental building block for efficient ranking in a large number of applications. Efficiency is a central issue, especially for distributed settings, when the data is spread across different nodes in a network. This paper introduces novel optimization methods for top-k aggregation queries in such distributed environments. The optimizations can be applied to all algorithms that fall into the frameworks of the prior TPUT and KLEE methods. The optimizations address three degrees of freedom: 1) hierarchically grouping input lists into top-k operator trees and optimizing the tree structure, 2) computing data-adaptive scan depths for different input sources, and 3) data-adaptive sampling of a small subset of input sources in scenarios with hundreds or thousands of query-relevant network nodes. All optimizations are based on a statistical cost model that utilizes local synopses, e.g., in the form of histograms, efficiently computed convolutions, and estimators based on order statistics. The paper presents comprehensive experiments, with three different real-life datasets and using the ns-2 network simulator for a packet-level simulation of a large Internet-style network
Towards a Scalable Dynamic Spatial Database System
With the rise of GPS-enabled smartphones and other similar mobile devices,
massive amounts of location data are available. However, no scalable solutions
for soft real-time spatial queries on large sets of moving objects have yet
emerged. In this paper we explore and measure the limits of actual algorithms
and implementations regarding different application scenarios. And finally we
propose a novel distributed architecture to solve the scalability issues.Comment: (2012
Semi-automatic semantic enrichment of raw sensor data
One of the more recent sources of large volumes of generated data is sensor devices, where dedicated sensing equipment is used to monitor events and happenings in a wide range of domains, including monitoring human biometrics. In recent trials to examine the effects that key moments in movies have on the human body, we fitted fitted with a number of biometric sensor devices and monitored them as they watched a range of dierent movies in groups. The purpose of these experiments was to examine the correlation between humans' highlights in movies as observed from biometric sensors, and highlights in the same movies as identified by our automatic movie analysis techniques. However,the problem with this type of experiment is that both the analysis of the video stream and the sensor data readings are not directly usable
in their raw form because of the sheer volume of low-level data values generated both from the sensors and from the movie analysis. This work describes the semi-automated enrichment of both video analysis and sensor data and the mechanism used to query the data in both centralised
environments, and in a peer-to-peer architecture when the number of sensor devices grows to large numbers. We present and validate a scalable means of semi-automating the semantic enrichment of sensor data, thereby providing a means of large-scale sensor management
Scalable Peer-to-Peer Streaming for Live Entertainment Content
We present a system for streaming live entertainment content over the Internet originating from a single source to a scalable number of consumers without resorting to centralized or provider-provisioned resources. The system creates a peer-to-peer overlay network, which attempts to optimize use of existing capacity to ensure quality of service, delivering low startup delay and lag in playout of the live content. There are three main aspects of our solution: first, a swarming mechanism that constructs an overlay topology for minimizing propagation delays from the source to end consumers; second, a distributed overlay anycast system that uses a location-based search algorithm for peers to quickly find the closest peers in a given stream; and finally, a novel incentive mechanism that encourages peers to donate capacity even when the user is not actively consuming content
- âŠ