A Hierarchical Temporal Memory Sequence Classifier for Streaming Data
Real-world data streams often contain concept drift and noise. By their very nature, they also often include temporal dependencies between data. Classifying data streams with one or more of these characteristics is exceptionally challenging. Classification of data within data streams is currently the primary focus of research efforts in many fields (e.g., intrusion detection, data mining, machine learning). Hierarchical Temporal Memory (HTM) is a type of sequence memory that exhibits some of the predictive and anomaly detection properties of the neocortex. HTM algorithms conduct training through exposure to a stream of sensory data and are thus suited for continuous online learning. This research developed an HTM sequence classifier aimed at classifying streaming data that contains concept drift, noise, and temporal dependencies. The HTM sequence classifier was fed both artificial and real-world data streams and evaluated using the prequential evaluation method. Cost measures for accuracy, CPU time, and RAM usage were calculated for each data stream and compared against a variety of modern classifiers (e.g., Accuracy Weighted Ensemble, Adaptive Random Forest, Dynamic Weighted Majority, Leverage Bagging, Online Boosting ensemble, and Very Fast Decision Tree). The HTM sequence classifier performed well when the data streams contained concept drift, noise, and temporal dependencies, but it was not the most suitable of the compared classifiers when the data streams did not include temporal dependencies. Finally, this research explored the suitability of the HTM sequence classifier for detecting stalling code within evasive malware. The results were promising: they showed the HTM sequence classifier to be capable of predicting coding sequences of an executable file by learning the sequence patterns of the x86 EFLAGS register. The HTM classifier plotted these predictions in a cardiogram-like graph for quick analysis by malware reverse engineers. This research highlights the potential of HTM technology for application in online classification problems and the detection of evasive malware.
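The prequential (test-then-train) evaluation mentioned in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' code: the `MajorityClassClassifier` baseline and the `prequential_accuracy` helper are hypothetical names introduced here, and the baseline stands in for any online learner with `predict` and `learn` methods.

```python
from collections import Counter


class MajorityClassClassifier:
    """Hypothetical baseline: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        # Before any training example has arrived there is nothing to predict.
        if not self.counts:
            return None
        return self.counts.most_common(1)[0][0]

    def learn(self, x, y):
        self.counts[y] += 1


def prequential_accuracy(stream, model):
    """Test-then-train: each example is first used for testing, then for training."""
    correct = 0
    total = 0
    for x, y in stream:
        if model.predict(x) == y:  # test on the unseen example
            correct += 1
        model.learn(x, y)          # only then train on it
        total += 1
    return correct / total if total else 0.0


if __name__ == "__main__":
    toy_stream = [(0, "a"), (1, "a"), (2, "b"), (3, "a")]
    print(prequential_accuracy(toy_stream, MajorityClassClassifier()))
```

Because every example is scored before the model sees its label, the running accuracy reflects performance on unseen data without a separate hold-out set, which is why the scheme suits streams with drift.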
Detecting Irregular Patterns in IoT Streaming Data for Fall Detection
Detecting patterns in real-time streaming data has been an interesting and
challenging data analytics problem. With the proliferation of a variety of
sensor devices, real-time analytics of data from the Internet of Things (IoT)
to learn regular and irregular patterns has become an important machine
learning problem to enable predictive analytics for automated notification and
decision support. In this work, we address the problem of learning an irregular
human activity pattern, fall, from streaming IoT data from wearable sensors. We
present a deep neural network model for detecting falls from accelerometer
data that achieves 98.75 percent accuracy on an online physical activity monitoring
dataset called "MobiAct", which was published by Vavoulas et al. The initial
model was developed using IBM Watson Studio and then later transferred and
deployed on IBM Cloud with the streaming analytics service supported by IBM
Streams for monitoring real-time IoT data. We also present the systems
architecture of the real-time fall detection framework that we intend to use
with mbientlabs wearable health monitoring sensors for real-time patient
monitoring at retirement homes or rehabilitation clinics. Comment: 7 pages
Anomaly and Change Detection in Graph Streams through Constant-Curvature Manifold Embeddings
Mapping complex input data into suitable lower dimensional manifolds is a
common procedure in machine learning. This step is beneficial mainly for two
reasons: (1) it reduces the data dimensionality and (2) it provides a new data
representation possibly characterised by convenient geometric properties.
Euclidean spaces are by far the most widely used embedding spaces, thanks to
their well-understood structure and large availability of consolidated
inference methods. However, recent research demonstrated that many types of
complex data (e.g., those represented as graphs) are actually better described
by non-Euclidean geometries. Here, we investigate how embedding graphs on
constant-curvature manifolds (hyper-spherical and hyperbolic manifolds) affects
the ability to detect changes in sequences of attributed graphs. The
proposed methodology consists of embedding graphs into a geometric space and
performing change detection there by means of conventional methods for numerical
streams. The curvature of the space is a parameter that we learn to reproduce
the geometry of the original application-dependent graph space. Preliminary
experimental results show the potential of representing graphs by
means of curved manifolds, in particular for change and anomaly detection
problems. Comment: To be published in IEEE IJCNN 201
AIDPS: Adaptive Intrusion Detection and Prevention System for Underwater Acoustic Sensor Networks
Underwater Acoustic Sensor Networks (UW-ASNs) are predominantly used for
underwater environments and find applications in many areas. However, a lack of
security considerations, the unstable and challenging nature of the underwater
environment, and the resource-constrained nature of the sensor nodes used for
UW-ASNs (which makes them incapable of adopting security primitives) make the
UW-ASN prone to vulnerabilities. This paper proposes an Adaptive decentralised
Intrusion Detection and Prevention System called AIDPS for UW-ASNs. The
proposed AIDPS can improve the security of the UW-ASNs so that they can
efficiently detect underwater-related attacks (e.g., blackhole, grayhole and
flooding attacks). To determine the most effective configuration of the
proposed construction, we conduct a number of experiments using several
state-of-the-art machine learning algorithms (e.g., Adaptive Random Forest
(ARF), light gradient-boosting machine, and K-nearest neighbours) and concept
drift detection algorithms (e.g., ADWIN, kdqTree, and Page-Hinkley). Our
experimental results show that incremental ARF using ADWIN provides optimal
performance when implemented with One-class support vector machine (SVM)
anomaly-based detectors. Furthermore, our extensive evaluation results also
show that the proposed scheme outperforms state-of-the-art benchmarking
methods while providing a wider range of desirable features such as scalability
and complexity.
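Of the concept drift detectors named in the abstract above, the Page-Hinkley test is compact enough to sketch in Python. This is a textbook-style formulation for detecting an upward shift in the mean, not the AIDPS implementation; the class name, the parameter names `delta` and `threshold`, and their default values are assumptions made for illustration.

```python
class PageHinkley:
    """Minimal Page-Hinkley test for an increase in the mean of a numeric stream.

    delta: tolerated magnitude of change (assumed default).
    threshold: alarm level for the test statistic (assumed default).
    """

    def __init__(self, delta=0.005, threshold=50.0):
        self.delta = delta
        self.threshold = threshold
        self.n = 0          # number of observations seen
        self.mean = 0.0     # running mean of the stream
        self.cum = 0.0      # cumulative deviation from the running mean
        self.min_cum = 0.0  # smallest cumulative deviation observed so far

    def update(self, x):
        """Consume one observation; return True if drift is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        return (self.cum - self.min_cum) > self.threshold


if __name__ == "__main__":
    detector = PageHinkley(threshold=20.0)
    stream = [0.0] * 100 + [10.0] * 100  # abrupt mean shift at index 100
    for i, value in enumerate(stream):
        if detector.update(value):
            print("drift signalled at index", i)
            break
```

In a streaming classifier the monitored value is typically the per-example error, so an alarm indicates that the model has degraded and should be adapted or retrained.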
Data Stream Clustering: A Review
The number of connected devices is steadily increasing and these devices
continuously generate data streams. Real-time processing of data streams is
arousing interest despite many challenges. Clustering is one of the most
suitable methods for real-time data stream processing, because it can be
applied with less prior information about the data and it does not need labeled
instances. However, data stream clustering differs from traditional clustering
in many aspects and it has several challenging issues. Here, we provide
information regarding the concepts and common characteristics of data streams,
such as concept drift, data structures for data streams, time window models and
outlier detection. We comprehensively review recent data stream clustering
algorithms and analyze them in terms of the base clustering technique,
computational complexity and clustering accuracy. A comparison of these
algorithms is given. We indicate popular data stream repositories and
datasets, as well as stream processing tools and platforms. Open
problems in data stream clustering are also discussed. Comment: Has been accepted for publication in Artificial Intelligence Review
Automatically Selecting Parameters for Graph-Based Clustering
Data streams present a number of challenges caused by changes in stream concepts over time. In this thesis we present a novel method for detecting concept drift within data streams by analysing geometric features of the clustering algorithm RepStream. Further, we present novel methods for automatically adjusting critical input parameters over time and for generating self-organising nearest-neighbour graphs, improving robustness and decreasing the need for domain-specific knowledge in the face of stream evolution.