Efficiently Maintaining Distributed Model-Based Views on Real-Time Data Streams
Minimizing communication cost is a fundamental problem in large-scale federated sensor networks. Maintaining model-based views of data streams has been highlighted because it permits efficient data communication by transmitting parameter values of models, instead of original data streams. We propose a framework that employs the advantages of using model-based views for communication-efficient stream data processing over federated sensor networks, yet it significantly improves on state-of-the-art approaches. The framework is generic and any time-parameterized model can be plugged in, while accuracy guarantees for query results are ensured throughout the large-scale networks. In addition, we boost the performance of the framework by the coded model update that enables efficient model update from one node to another. It predetermines parameter values for the model, updates only identifiers of the parameter values, and compresses the identifiers by utilizing bitmaps. Moreover, we propose a correlation model, named coded inter-variable model, that merges the efficiency of the coded model update with that of correlation models. Empirical studies with real data demonstrate that our proposal achieves substantial amounts of communication reduction, outperforming state-of-the-art methods
Efficiently Maintaining Distributed Model-Based Views on Real-Time Data Streams
Minimizing communication cost is a fundamental problem in large-scale federated sensor networks. Existing solutions applicable to the problem are often ad hoc for specific query types, or they are inefficient when query results contain large volumes of data to be transferred over the networks. Maintaining model-based views of data streams has been recently highlighted because it makes data communication over networks efficient by transmitting parameter values for the models, instead of sending original data streams. This paper proposes a novel framework that employs the advantages of using model-based views for communication-efficient stream data processing over federated sensor networks, yet it significantly improves on state-of-the-art approaches. The framework is generic: any time-parameterized model can be plugged in, and accuracy guarantees for query results are ensured throughout the large-scale networks. In addition, we boost the performance of the framework by the coded model update that enables efficient model update from one node to another. It predetermines parameter values for the model, updates only identifiers of the parameter values, and compresses the identifiers by utilizing bitmaps. Moreover, we propose a novel correlation model, named coded inter-variable model, that integrates the efficiency of the coded model update into more precise predictions of correlated models. Empirical studies with real data demonstrate that our proposal achieves substantial amounts of communication reduction, outperforming a state-of-the-art method
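The coded model update described in the abstract above (predetermined parameter values, identifier-only updates, bitmap compression) can be illustrated with a minimal sketch. This is not the authors' scheme: the evenly spaced codebook, the use of the bitmap to flag which parameters changed, and every function name below are assumptions made purely for illustration.

```python
from bisect import bisect_left

def make_codebook(lo, hi, levels):
    """Predetermined, evenly spaced parameter values shared by both nodes (assumed layout)."""
    step = (hi - lo) / (levels - 1)
    return [lo + i * step for i in range(levels)]

def nearest_id(codebook, value):
    """Identifier (index) of the codebook entry closest to value."""
    i = bisect_left(codebook, value)
    if i == 0:
        return 0
    if i == len(codebook):
        return len(codebook) - 1
    return i if codebook[i] - value < value - codebook[i - 1] else i - 1

def encode_update(codebook, prev_ids, new_params):
    """Sender side: bit k of the bitmap is set when parameter k maps to a new identifier.
    Only the identifiers of changed parameters are transmitted."""
    bitmap, changed, new_ids = 0, [], list(prev_ids)
    for k, value in enumerate(new_params):
        ident = nearest_id(codebook, value)
        if ident != prev_ids[k]:
            bitmap |= 1 << k
            changed.append(ident)
            new_ids[k] = ident
    return bitmap, changed, new_ids

def decode_update(codebook, prev_ids, bitmap, changed):
    """Receiver side: apply the changed identifiers and look up the parameter values."""
    ids, it = list(prev_ids), iter(changed)
    for k in range(len(ids)):
        if (bitmap >> k) & 1:
            ids[k] = next(it)
    return ids, [codebook[i] for i in ids]

codebook = make_codebook(0.0, 10.0, 11)           # hypothetical parameter range
bitmap, changed, sender_ids = encode_update(codebook, [2, 5, 7], [2.1, 6.4, 7.0])
receiver_ids, receiver_params = decode_update(codebook, [2, 5, 7], bitmap, changed)
```

In this sketch only one of the three parameters moves to a different codebook entry, so the message consists of a three-bit bitmap and a single identifier rather than three floating-point values.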
The Motivation, Architecture and Demonstration of Ultralight Network Testbed
In this paper we describe progress in the NSF-funded Ultralight project and a recent demonstration of Ultralight technologies at SuperComputing 2005 (SC|05). The goal of the Ultralight project is to help meet the data-intensive computing challenges of the next generation of particle physics experiments with a comprehensive, network-focused approach. Ultralight adopts a new approach to networking: instead of treating it traditionally, as a static, unchanging and unmanaged set of inter-computer links, we are developing and using it as a dynamic, configurable, and closely monitored resource that is managed from end-to-end. Thus we are constructing a next-generation global system that is able to meet the data processing, distribution, access and analysis needs of the particle physics community. In this paper we present the motivation for, and an overview of, the Ultralight project. We then cover early results in the various working areas of the project. The remainder of the paper describes our experiences of the Ultralight network architecture, kernel setup, application tuning and configuration used during the bandwidth challenge event at SC|05. During this Challenge, we achieved a record-breaking aggregate data rate in excess of 150 Gbps while moving physics datasets between many sites interconnected by the Ultralight backbone network. The exercise highlighted the benefits of Ultralight's research and development efforts that are enabling new and advanced methods of distributed scientific data analysis
When Things Matter: A Data-Centric View of the Internet of Things
With the recent advances in radio-frequency identification (RFID), low-cost
wireless sensor devices, and Web technologies, the Internet of Things (IoT)
approach has gained momentum in connecting everyday objects to the Internet and
facilitating machine-to-human and machine-to-machine communication with the
physical world. While IoT offers the capability to connect and integrate both
digital and physical entities, enabling a whole new class of applications and
services, several significant challenges need to be addressed before these
applications and services can be fully realized. A fundamental challenge
centers around managing IoT data, typically produced in dynamic and volatile
environments, which is not only extremely large in scale and volume, but also
noisy, and continuous. This article surveys the main techniques and
state-of-the-art research efforts in IoT from data-centric perspectives,
including data stream processing, data storage models, complex event
processing, and searching in IoT. Open research issues for IoT data management
are also discussed
Towards Analytics Aware Ontology Based Access to Static and Streaming Data (Extended Version)
Real-time analytics that requires integration and aggregation of
heterogeneous and distributed streaming and static data is a typical task in
many industrial scenarios, such as turbine diagnostics at Siemens. The OBDA
approach has great potential to facilitate such tasks; however, it has a
number of limitations in dealing with analytics that restrict its use in
important industrial applications. Based on our experience with Siemens, we
argue that in order to overcome those limitations OBDA should be extended and
become analytics, source, and cost aware. In this work we propose such an
extension. In particular, we propose an ontology, mapping, and query language
for OBDA, where aggregate and other analytical functions are first-class
citizens. Moreover, we develop query optimisation techniques that allow
efficient processing of analytical tasks over static and streaming data. We
implement our approach in a system and evaluate our system with Siemens turbine
data
A Graph-Partition-Based Scheduling Policy for Heterogeneous Architectures
In order to improve system performance efficiently, a number of systems
are equipped with multi-core and many-core processors (such as GPUs). Due to
their discrete memory, these heterogeneous architectures constitute a distributed
system within a computer. A data-flow programming model is attractive in this
setting for its ease of expressing concurrency. Programmers only need to define
task dependencies without considering how to schedule them on the hardware.
However, mapping the resulting task graph onto hardware efficiently remains a
challenge. In this paper, we propose a graph-partition scheduling policy for
mapping data-flow workloads to heterogeneous hardware. According to our
experiments, our graph-partition-based scheduling achieves comparable
performance to conventional queue-based approaches. Comment: Presented at DATE Friday Workshop on Heterogeneous Architectures and
Design Methods for Embedded Image Systems (HIS 2015) (arXiv:1502.07241)
A Survey on Transactional Stream Processing
Transactional stream processing (TSP) strives to create a cohesive model that
merges the advantages of both transactional and stream-oriented guarantees.
Over the past decade, numerous endeavors have contributed to the evolution of
TSP solutions, uncovering similarities and distinctions among them. Despite
these advances, a universally accepted standard approach for integrating
transactional functionality with stream processing remains to be established.
Existing TSP solutions predominantly concentrate on specific application
characteristics and involve complex design trade-offs. This survey intends to
introduce TSP and present our perspective on its future progression. Our
primary goals are twofold: to provide insights into the diverse TSP
requirements and methodologies, and to inspire the design and development of
groundbreaking TSP systems
A File System Abstraction for Sense and Respond Systems
The heterogeneity and resource constraints of sense-and-respond systems pose
significant challenges to system and application development. In this paper, we
present a flexible, intuitive file system abstraction for organizing and
managing sense-and-respond systems based on the Plan 9 design principles. A key
feature of this abstraction is the ability to support multiple views of the
system via filesystem namespaces. Constructed logical views present an
application-specific representation of the network, thus enabling high-level
programming of the network. Concurrently, structural views of the network
enable resource-efficient planning and execution of tasks. We present and
motivate the design using several examples, outline research challenges and our
research plan to address them, and describe the current state of
implementation. Comment: 6 pages, 3 figures. Workshop on End-to-End, Sense-and-Respond Systems,
Applications, and Services. In conjunction with MobiSys '0