Fast Search for Dynamic Multi-Relational Graphs
Acting on time-critical events by processing ever growing social media or
news streams is a major technical challenge. Many of these data sources can be
modeled as multi-relational graphs. Continuous queries or techniques to search
for rare events that typically arise in monitoring applications have been
studied extensively for relational databases. This work is dedicated to answering
the question that emerges naturally: how can we efficiently execute a
continuous query on a dynamic graph? This paper presents an exact subgraph
search algorithm that exploits the temporal characteristics of representative
queries for online news or social media monitoring. The algorithm is based on a
novel data structure called the Subgraph Join Tree (SJ-Tree) that leverages the
structural and semantic characteristics of the underlying multi-relational
graph. The paper concludes with extensive experimentation on several real-world
datasets that demonstrates the validity of this approach.
Comment: SIGMOD Workshop on Dynamic Networks Management and Mining (DyNetMM),
201
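To make the idea of a continuous query on a dynamic graph concrete, here is a minimal sketch, not the SJ-Tree itself. It assumes a hypothetical two-edge path query (x -[first]-> y -[second]-> z), keeps partial matches for each query edge indexed by the shared vertex y, and joins them whenever a new edge arrives. The class name, method names, and edge labels are all illustrative assumptions.

```python
from collections import defaultdict


class TwoEdgePathQuery:
    """Incrementally match x -[first]-> y -[second]-> z on an edge stream.

    Illustrative only: a single hash join over two stored sets of partial
    matches, loosely in the spirit of joining sub-query matches.
    """

    def __init__(self, first_label, second_label):
        self.first_label = first_label
        self.second_label = second_label
        # Partial matches, both indexed by the join vertex y.
        self.first_by_target = defaultdict(list)   # y -> [x, ...]
        self.second_by_source = defaultdict(list)  # y -> [z, ...]

    def add_edge(self, source, label, target):
        """Insert one edge and return the complete matches it creates."""
        new_matches = []
        if label == self.first_label:
            # The new edge plays the role x -> y; join with stored y -> z.
            self.first_by_target[target].append(source)
            for z in self.second_by_source[target]:
                new_matches.append((source, target, z))
        if label == self.second_label:
            # The new edge plays the role y -> z; join with stored x -> y.
            self.second_by_source[source].append(target)
            for x in self.first_by_target[source]:
                new_matches.append((x, source, target))
        return new_matches
```

Each call to add_edge does work proportional to the partial matches that share the join vertex, rather than searching the whole graph again, which is the property a continuous query needs as the graph grows.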
Control-based Scheduling in a Distributed Stream Processing System
Stream processing systems receive continuous streams
of messages with raw information and produce streams
of messages with processed information. The utility of a
stream-processing system depends, in part, on the accuracy
and timeliness of the output. Streams in complex event processing
systems are processed on distributed systems; several
steps are taken on different processors to process each
incoming message, and messages may be enqueued between
steps. This paper deals with the problems of distributed dynamic
control of streams to optimize the total utility provided
by the system. A challenge of distributed control is
that timeliness of output depends only on the total end-to-end
time and is otherwise independent of the delays at each
separate processor whereas the controller for each processor
takes action to control only the steps on that processor
and cannot directly control the entire network.
This paper identifies key problems in distributed control
and analyzes two scheduling algorithms that help in an initial
analysis of a difficult problem.
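The tension described above, where a local controller must act on an end-to-end objective, can be sketched as follows. This is not one of the two algorithms the paper analyzes (the abstract does not name them). It is a hypothetical local policy that assumes each message carries an end-to-end deadline and an estimate of the processing time still needed downstream, and serves first the message that must leave this processor soonest. All names are illustrative.

```python
import heapq


class LocalScheduler:
    """Per-processor queue ordered by latest allowable departure time.

    Assumption: deadline and remaining_estimate are supplied with each
    message; the scheduler sees nothing else about the rest of the network.
    """

    def __init__(self):
        self._heap = []
        self._counter = 0  # tie-breaker: FIFO among equal keys

    def enqueue(self, payload, deadline, remaining_estimate):
        # Latest time the message can leave this processor and still be
        # expected to meet its end-to-end deadline.
        latest_departure = deadline - remaining_estimate
        heapq.heappush(self._heap, (latest_departure, self._counter, payload))
        self._counter += 1

    def next_message(self):
        """Return the most urgent payload, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]
```

The key is time-invariant, so messages enqueued at different moments remain comparable. The policy is only as good as the downstream estimate, which is exactly the information a local controller cannot observe directly.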
Knowledge-infused and Consistent Complex Event Processing over Real-time and Persistent Streams
Emerging applications in Internet of Things (IoT) and Cyber-Physical Systems
(CPS) present novel challenges to Big Data platforms for performing online
analytics. Ubiquitous sensors from IoT deployments are able to generate data
streams at high velocity that include information from a variety of domains,
and accumulate to large volumes on disk. Complex Event Processing (CEP) is
recognized as an important real-time computing paradigm for analyzing
continuous data streams. However, existing work on CEP is largely limited to
relational query processing, exposing two distinctive gaps for query
specification and execution: (1) infusing the relational query model with
higher level knowledge semantics, and (2) seamless query evaluation across
temporal spaces that span past, present and future events. These allow
accessible analytics over data streams having properties from different
disciplines, and help span the velocity (real-time) and volume (persistent)
dimensions. In this article, we introduce a Knowledge-infused CEP (X-CEP)
framework that provides domain-aware knowledge query constructs along with
temporal operators that allow end-to-end queries to span across real-time and
persistent streams. We translate this query model to efficient query execution
over online and offline data streams, proposing several optimizations to
mitigate the overheads introduced by evaluating semantic predicates and in
accessing high-volume historic data streams. The proposed X-CEP query model and
execution approaches are implemented in our prototype semantic CEP engine,
SCEPter. We validate our query model using domain-aware CEP queries from a
real-world Smart Power Grid application, and experimentally analyze the
benefits of our optimizations for executing these queries, using event streams
from a campus-microgrid IoT deployment.
Comment: 34 pages, 16 figures, accepted in Future Generation Computer Systems,
October 27, 201
Expressions as Data in Relational Data Base Management Systems
Numerous applications, such as publish/subscribe, website personalization, and applications involving continuous queries, require that the user's interest be persistently maintained and matched with the expected data. Conditional expressions can be used to maintain user interests. This thesis focuses on support for an expression data type in a relational database system, allowing conditional expressions to be stored as "data" in columns of database tables and evaluated using an EVALUATE operator. In this context, expressions can be interpreted as descriptions, queries, and filters, which significantly broadens the use of a relational database system to support new types of applications. The thesis presents an overview of the expression data type, of storing expressions, and of evaluating the stored expressions, and shows how these applications can be easily supported with improved functionality. A sample application is also explained to show the importance of expressions in an application context, comparing the application with and without expressions.
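The idea of storing conditional expressions in a table column and matching them against a data item can be sketched with Python's built-in sqlite3 module. This is an illustration under stated assumptions, not the thesis's implementation: the evaluate function registered here is a hypothetical stand-in for an EVALUATE operator, the expression syntax is a small Python-like subset (comparisons combined with and/or), and the table, column names, and sample data are invented for the example.

```python
import ast
import json
import operator
import sqlite3

_COMPARATORS = {
    ast.Eq: operator.eq, ast.NotEq: operator.ne,
    ast.Lt: operator.lt, ast.LtE: operator.le,
    ast.Gt: operator.gt, ast.GtE: operator.ge,
}


def _eval_node(node, item):
    """Evaluate a restricted expression tree against a dict of attributes."""
    if isinstance(node, ast.BoolOp):
        values = [_eval_node(v, item) for v in node.values]
        return all(values) if isinstance(node.op, ast.And) else any(values)
    if isinstance(node, ast.Compare):
        left = _eval_node(node.left, item)
        for op, comparator in zip(node.ops, node.comparators):
            right = _eval_node(comparator, item)
            if not _COMPARATORS[type(op)](left, right):
                return False
            left = right
        return True
    if isinstance(node, ast.Name):
        return item[node.id]  # attribute of the data item
    if isinstance(node, ast.Constant):
        return node.value
    raise ValueError(f"unsupported expression element: {type(node).__name__}")


def evaluate(expression, item_json):
    """Return 1 if the stored expression holds for the data item, else 0."""
    tree = ast.parse(expression, mode="eval")
    return int(bool(_eval_node(tree.body, json.loads(item_json))))


conn = sqlite3.connect(":memory:")
conn.create_function("evaluate", 2, evaluate)  # hypothetical EVALUATE stand-in
conn.execute("CREATE TABLE subscriptions (subscriber TEXT, interest TEXT)")
conn.executemany("INSERT INTO subscriptions VALUES (?, ?)", [
    ("alice", "model == 'Taurus' and price < 20000"),
    ("bob", "price < 15000"),
    ("carol", "model == 'Mustang' or mileage < 10000"),
])


def matching_subscribers(item):
    """Find every subscriber whose stored interest matches the data item."""
    rows = conn.execute(
        "SELECT subscriber FROM subscriptions "
        "WHERE evaluate(interest, ?) = 1 ORDER BY subscriber",
        (json.dumps(item),),
    )
    return [row[0] for row in rows]
```

Because the interests live in an ordinary column, they can be inserted, updated, and joined like any other data, and a single query matches one incoming item against all of them. This sketch scans every row; it does not attempt any indexing of the stored expressions.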