gSPICE: Model-Based Event Shedding in Complex Event Processing
When resources are limited, overload situations in complex event
processing (CEP) systems are typically handled by load shedding to
maintain a given latency bound. However, load shedding might negatively impact
the quality of results (QoR). To minimize the shedding impact on QoR, CEP
researchers propose shedding approaches that drop the events or internal state
with the lowest importance (utility). In both black-box and white-box shedding
approaches, different features are used to predict these utilities. In this
work, we propose a novel black-box shedding approach that uses a new set of
features to drop events from the input event stream to maintain a given latency
bound. Our approach uses a probabilistic model to predict these event
utilities. Moreover, our approach uses Zobrist hashing and well-known machine
learning models, e.g., decision trees and random forests, to handle the
predicted event utilities. Through extensive evaluations on several synthetic
and two real-world datasets and a representative set of CEP queries, we show
that, in the majority of cases, our load shedding approach outperforms
state-of-the-art black-box load shedding approaches with respect to QoR.
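
The abstract does not say how Zobrist hashing is applied, so the following is only a minimal sketch of the general technique. It assumes a window of events is summarized by XOR-ing one random 64-bit key per (event type, position) pair. The names `make_zobrist_table`, `zobrist_hash`, and `update_hash` are hypothetical and not taken from gSPICE.

```python
import random


def make_zobrist_table(event_types, window_size, seed=0):
    """Assign one random 64-bit key to every (event type, position) pair."""
    rng = random.Random(seed)
    return {
        (etype, pos): rng.getrandbits(64)
        for etype in event_types
        for pos in range(window_size)
    }


def zobrist_hash(table, window):
    """Hash a window (a list of event types) by XOR-ing the keys of its slots."""
    h = 0
    for pos, etype in enumerate(window):
        h ^= table[(etype, pos)]
    return h


def update_hash(table, h, pos, old_type, new_type):
    """Update the hash in O(1) when the event at one position changes.

    XOR is its own inverse, so XOR-ing the old key removes it
    and XOR-ing the new key adds it.
    """
    return h ^ table[(old_type, pos)] ^ table[(new_type, pos)]


# Illustrative data only: three made-up event types, window of four events.
table = make_zobrist_table(["A", "B", "C"], window_size=4)
h = zobrist_hash(table, ["A", "B", "A", "C"])
h_updated = update_hash(table, h, pos=2, old_type="A", new_type="B")
```

The incremental update is what makes the technique attractive for streams: a hash of the current state can be maintained without rehashing the whole window, and could then serve as a key for looking up a predicted utility.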
P4CEP: Towards In-Network Complex Event Processing
In-network computing using programmable networking hardware is a strong trend
in networking that promises to reduce latency and consumption of server
resources through offloading to network elements (programmable switches and
smart NICs). In particular, the data plane programming language P4 together
with powerful P4 networking hardware has spawned projects offloading services
into the network, e.g., consensus services or caching services. In this paper,
we present a novel case for in-network computing, namely, Complex Event
Processing (CEP). CEP processes streams of basic events, e.g., stemming from
networked sensors, into meaningful complex events. Traditionally, CEP
processing has been performed on servers or overlay networks. However, we argue
in this paper that CEP is a good candidate for in-network computing along the
communication path: it avoids detouring streams to distant servers, which
minimizes communication latency, while also exploiting the processing
capabilities of novel networking hardware. We show that it is feasible to express CEP operations in
P4 and also present a tool to compile CEP operations, formulated in our P4CEP
rule specification language, to P4 code. Moreover, we identify challenges and
problems that we have encountered to show future research directions for
implementing full-fledged in-network CEP systems.
Comment: 6 pages. Author's versio
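
To make the notion of turning basic events into complex events concrete, here is a minimal sketch of one common CEP operator: a sequence pattern that reports every event of one type followed by an event of another type within a time window. It is written in Python for illustration only. It is not P4 code and does not reflect the P4CEP rule specification language, and the function name `detect_sequence` and the sample events are assumptions.

```python
from collections import deque


def detect_sequence(events, first, second, window):
    """Return (t_first, t_second) pairs where `second` follows `first`
    within `window` time units.

    `events` is an iterable of (timestamp, event_type) in time order.
    """
    pending = deque()  # timestamps of `first` events still inside the window
    matches = []
    for ts, etype in events:
        # Expire `first` events that are too old to match anything now.
        while pending and ts - pending[0] > window:
            pending.popleft()
        if etype == first:
            pending.append(ts)
        elif etype == second:
            # Every pending `first` event forms a complex event with this one.
            matches.extend((t0, ts) for t0 in pending)
    return matches


# Made-up sensor stream: only the first A/B pair is close enough to match.
stream = [(0, "A"), (1, "B"), (10, "A"), (20, "B")]
complex_events = detect_sequence(stream, first="A", second="B", window=5)
```

The operator keeps only bounded per-window state, which is the kind of property that matters when such logic has to fit the limited memory and processing model of a switch.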
Just a Second -- Scheduling Thousands of Time-Triggered Streams in Large-Scale Networks
Deterministic real-time communication with bounded delay is an essential
requirement for many safety-critical cyber-physical systems, and has received
much attention from major standardization bodies such as IEEE and IETF. In
particular, Ethernet technology has been extended by time-triggered scheduling
mechanisms in standards like TTEthernet and Time-Sensitive Networking. Although
the scheduling mechanisms have become part of standards, the traffic planning
algorithms to create time-triggered schedules are still an open and challenging
research question due to the problem's high complexity. In particular,
so-called plug-and-produce scenarios require the ability to extend schedules on
the fly within seconds. The need for scalable scheduling and routing algorithms
is further supported by large-scale distributed real-time systems like smart
energy grids with tight communication requirements. In this paper, we tackle
this challenge by proposing two novel algorithms called Hierarchical Heuristic
Scheduling (H2S) and Cost-Efficient Lazy Forwarding Scheduling (CELF) to
calculate time-triggered schedules for TTEthernet. H2S and CELF are highly
efficient and scalable, calculating schedules for more than 45,000 streams on
random networks with 1,000 bridges as well as a realistic energy grid network
within sub-seconds to seconds.
Flexible Content-based Publish/Subscribe over Programmable Data Planes
Publish/subscribe systems have to react quickly to changes in their environment while handling many events with low end-to-end latency and high throughput. Moving the broker functionality of publish/subscribe systems to the underlying network layer reduces the path length of events and, in addition, lets forwarding benefit from powerful and programmable hardware. So far, attempts at underlay publish/subscribe depend on a specific API of the network devices, e.g., the OpenFlow protocol, which is restricted in dealing with dynamic devices and the corresponding changes in the attribute names introduced for matching and filtering events. In this work, we focus on the next generation of network devices, which are envisioned to provide reconfigurable hardware components specified in the open P4 description language. We introduce two new approaches that enable a flexible and generic attribute/value encoding, understandable by P4-capable packet processors, to benefit from the performance properties of hardware. Furthermore, the proposed approaches reduce the effort of encoding and decoding event messages.
Sputum Proteomics Reveals a Shift in Vitamin D-binding Protein and Antimicrobial Protein Axis in Tuberculosis Patients
Existing understanding of the molecular composition of sputum and its role in tuberculosis patients is limited to its diagnostic potential. We sought to identify infection-induced sputum proteome alterations in active and non-tuberculosis patients (ATB and NTB) and their role in altered lung pathophysiology. Out of the study population (n = 118), sputum proteins isolated from discovery set samples (n = 20) were used for an 8-plex isobaric tag for relative and absolute concentration analysis. A minimum set of proteins with a log2(ATB/NTB) fold change beyond ±1.0 in ATB was selected as a biosignature and validated in 32 samples. Predictive accuracy was calculated from the area under the receiver operating characteristic curve (AUC of ROC) using a confirmatory set (n = 50) by Western blot analysis. Mass spectrometry analysis identified a set of 192 sputum proteins, out of which a signature of β-integrin, vitamin D binding protein (DBP), uteroglobin, profilin and cathelicidin antimicrobial peptide was sufficient to differentiate ATB from NTB. The AUC of ROC of the biosignature was calculated to be 0.75. A shift in the DBP-antimicrobial peptide (AMP) axis in the lungs of tuberculosis patients is observed. The identified sputum protein signature is a promising panel to differentiate ATB from NTB groups and suggests a deregulated DBP-AMP axis in the lungs of tuberculosis patients.
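
For readers unfamiliar with the metric, the AUC of ROC can be read as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one. The sketch below computes it from that pairwise definition. The function name `roc_auc` and the toy scores are assumptions for illustration and are not data from the study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    `labels` holds 1 for positive (e.g. ATB) and 0 for negative (e.g. NTB).
    A positive scoring above a negative counts 1, a tie counts 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: made-up scores, two positives followed by two negatives.
toy_auc = roc_auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0])
```

In the toy example three of the four positive/negative pairs are ordered correctly, so the function returns 0.75. An AUC of 0.5 corresponds to chance and 1.0 to perfect separation.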
A review of data mining applications in crime
Crime remains a severe threat to all communities and nations across the globe, alongside the sophistication of the technology and processes that are exploited to enable highly complex criminal activities. Data mining, the process of uncovering hidden information from Big Data, is now an important tool for investigating, curbing and preventing crime and is used by both private and government institutions around the world. The primary aim of this paper is to provide a concise review of data mining applications in crime. To this end, the paper reviews over 100 applications of data mining in crime, covering a substantial quantity of research to date, presented in chronological order with an overview table of many important data mining applications in the crime domain as a reference directory. The data mining techniques themselves are briefly introduced to the reader; these include entity extraction, clustering, association rule mining, decision trees, support vector machines, naive Bayes rule, neural networks and social network analysis, amongst others.