15,015 research outputs found
Towards Analytics Aware Ontology Based Access to Static and Streaming Data (Extended Version)
Real-time analytics that requires integration and aggregation of
heterogeneous and distributed streaming and static data is a typical task in
many industrial scenarios such as diagnostics of turbines in Siemens. OBDA
approach has a great potential to facilitate such tasks; however, it has a
number of limitations in dealing with analytics that restrict its use in
important industrial applications. Based on our experience with Siemens, we
argue that in order to overcome those limitations OBDA should be extended and
become analytics, source, and cost aware. In this work we propose such an
extension. In particular, we propose an ontology, mapping, and query language
for OBDA, where aggregate and other analytical functions are first class
citizens. Moreover, we develop query optimisation techniques that allow to
efficiently process analytical tasks over static and streaming data. We
implement our approach in a system and evaluate our system with Siemens turbine
data
When Things Matter: A Data-Centric View of the Internet of Things
With the recent advances in radio-frequency identification (RFID), low-cost
wireless sensor devices, and Web technologies, the Internet of Things (IoT)
approach has gained momentum in connecting everyday objects to the Internet and
facilitating machine-to-human and machine-to-machine communication with the
physical world. While IoT offers the capability to connect and integrate both
digital and physical entities, enabling a whole new class of applications and
services, several significant challenges need to be addressed before these
applications and services can be fully realized. A fundamental challenge
centers around managing IoT data, typically produced in dynamic and volatile
environments, which is not only extremely large in scale and volume, but also
noisy, and continuous. This article surveys the main techniques and
state-of-the-art research efforts in IoT from data-centric perspectives,
including data stream processing, data storage models, complex event
processing, and searching in IoT. Open research issues for IoT data management
are also discussed
SQPR: Stream Query Planning with Reuse
When users submit new queries to a distributed stream processing system (DSPS), a query planner must allocate physical resources, such as CPU cores, memory and network bandwidth, from a set of hosts to queries. Allocation decisions must provide the correct mix of resources required by queries, while achieving an efficient overall allocation to scale in the number of admitted queries. By exploiting overlap between queries and reusing partial results, a query planner can conserve resources but has to carry out more complex planning decisions. In this paper, we describe SQPR, a query planner that targets DSPSs in data centre environments with heterogeneous resources. SQPR models query admission, allocation and reuse as a single constrained optimisation problem and solves an approximate version to achieve scalability. It prevents individual resources from becoming bottlenecks by re-planning past allocation decisions and supports different allocation objectives. As our experimental evaluation in comparison with a state-of-the-art planner shows SQPR makes efficient resource allocation decisions, even with a high utilisation of resources, with acceptable overheads
Data Provenance and Management in Radio Astronomy: A Stream Computing Approach
New approaches for data provenance and data management (DPDM) are required
for mega science projects like the Square Kilometer Array, characterized by
extremely large data volume and intense data rates, therefore demanding
innovative and highly efficient computational paradigms. In this context, we
explore a stream-computing approach with the emphasis on the use of
accelerators. In particular, we make use of a new generation of high
performance stream-based parallelization middleware known as InfoSphere
Streams. Its viability for managing and ensuring interoperability and integrity
of signal processing data pipelines is demonstrated in radio astronomy. IBM
InfoSphere Streams embraces the stream-computing paradigm. It is a shift from
conventional data mining techniques (involving analysis of existing data from
databases) towards real-time analytic processing. We discuss using InfoSphere
Streams for effective DPDM in radio astronomy and propose a way in which
InfoSphere Streams can be utilized for large antennae arrays. We present a
case-study: the InfoSphere Streams implementation of an autocorrelating
spectrometer, and using this example we discuss the advantages of the
stream-computing approach and the utilization of hardware accelerators
- âŚ