72 research outputs found
Graph Neural Networks based Log Anomaly Detection and Explanation
Event logs are widely used to record the status of high-tech systems, making
log anomaly detection important for monitoring those systems. Most existing log
anomaly detection methods take a log event count matrix or log event sequences
as input, exploiting quantitative and/or sequential relationships between log
events to detect anomalies. Unfortunately, only considering quantitative or
sequential relationships may result in low detection accuracy. To alleviate
this problem, we propose a graph-based method for unsupervised log anomaly
detection, dubbed Logs2Graphs, which first converts event logs into attributed,
directed, and weighted graphs, and then leverages graph neural networks to
perform graph-level anomaly detection. Specifically, we introduce One-Class
Digraph Inception Convolutional Networks, abbreviated as OCDiGCN, a novel graph
neural network model for detecting graph-level anomalies in a collection of
attributed, directed, and weighted graphs. By coupling the graph representation
and anomaly detection steps, OCDiGCN can learn a representation that is
especially suited for anomaly detection, resulting in a high detection
accuracy. Importantly, for each identified anomaly, we additionally provide a
small subset of nodes that play a crucial role in OCDiGCN's prediction as
explanations, which can offer valuable cues for subsequent root cause
diagnosis. Experiments on five benchmark datasets show that Logs2Graphs
performs at least on par with state-of-the-art log anomaly detection methods on
simple datasets while largely outperforming state-of-the-art log anomaly
detection methods on complicated datasets.
Comment: Preprint submitted to Engineering Applications of Artificial
Intelligence
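The abstract does not spell out how logs become graphs. As a rough sketch only, one plausible construction treats each distinct log event as a node and weights a directed edge by how often one event immediately follows another. The function name `log_sequence_to_digraph` and the use of occurrence counts as node attributes are assumptions made for illustration, not the authors' Logs2Graphs procedure.

```python
from collections import Counter


def log_sequence_to_digraph(events):
    """Turn an ordered list of log event IDs into a small directed graph.

    Assumed construction (not taken from the paper):
      - each distinct event ID is a node,
      - the node attribute is how often the event occurs,
      - edge (u, v) is weighted by how often v directly follows u.
    """
    node_attrs = dict(Counter(events))
    edge_weights = dict(Counter(zip(events, events[1:])))
    return node_attrs, edge_weights


# Hypothetical event sequence for one log session.
node_attrs, edge_weights = log_sequence_to_digraph(["E1", "E2", "E3", "E2", "E3"])
print(node_attrs)
print(edge_weights)
```

A graph-level detector would then consume one such graph per log session; that step is outside this sketch.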
Approximation vector machines for large-scale online learning
One of the most challenging problems in kernel online learning is to bound the model size and to promote model sparsity. Sparse models not only improve computation and memory usage, but also enhance the generalization capacity -- a principle that concurs with the law of parsimony. However, inappropriate sparsity modeling may also significantly degrade the performance. In this paper, we propose Approximation Vector Machine (AVM), a model that can simultaneously encourage sparsity and safeguard against the risk of compromising the performance. In an online setting, when an incoming instance arrives, we approximate this instance by one of its neighbors whose distance to it is less than a predefined threshold. Our key intuition is that since the newly seen instance is expressed by its nearby neighbor, the optimal performance can be analytically formulated and maintained. We develop theoretical foundations to support this intuition and further establish an analysis for the common loss functions including Hinge, smooth Hinge, and Logistic (i.e., for the classification task) and ℓ1, ℓ2, and ε-insensitive (i.e., for the regression task) to characterize the gap between the approximation and optimal solutions. This gap crucially depends on two key factors: the frequency of approximation (i.e., how frequently the approximation operation takes place) and the predefined threshold. We conducted extensive experiments for classification and regression tasks in batch and online modes using several benchmark datasets. The quantitative results show that our proposed AVM obtained comparable predictive performance to current state-of-the-art methods while simultaneously achieving significant computational speed-up owing to its ability to maintain the model size
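The approximation step described in this abstract can be illustrated with a minimal sketch. It assumes Euclidean distance, a plain list of stored representatives (called `cores` here), and that the nearest representative is the one chosen; all three are assumptions for illustration, and the model update that AVM performs afterwards is omitted.

```python
import math


def approximate_instance(x, cores, threshold):
    """Replace x by a stored neighbor closer than `threshold`, if one exists.

    Returns (representative, is_new). When no stored point is close enough,
    x itself is stored and returned, so the model grows only in that case.
    Choosing the *nearest* stored point is an assumption of this sketch.
    """
    best, best_dist = None, math.inf
    for c in cores:
        d = math.dist(x, c)
        if d < best_dist:
            best, best_dist = c, d
    if best is not None and best_dist < threshold:
        return best, False
    cores.append(x)
    return x, True


# Hypothetical stream of 2-D instances.
cores = []
r1, new1 = approximate_instance((0.0, 0.0), cores, threshold=0.5)
r2, new2 = approximate_instance((0.1, 0.0), cores, threshold=0.5)
r3, new3 = approximate_instance((2.0, 0.0), cores, threshold=0.5)
print(len(cores))
```

A larger threshold keeps the stored set smaller at the price of coarser approximation, which is the trade-off the abstract ties to the gap from the optimal solution.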
Data-driven solutions to enhance planning, operation and design tools in Industry 4.0 context
This thesis proposes three different data-driven solutions to be combined with state-of-the-art solvers and tools, primarily to enhance their computational performance. The problem of efficiently designing the open-sea floating platforms on which wind turbines can be mounted will be tackled, as well as the tuning of a data-driven engine monitoring tool for maritime transportation. Finally, the activities of SAT and ASP solvers will be thoroughly studied and a deep learning architecture will be proposed to enhance the heuristics-based solving approach adopted by such software. The covered domains are different and the same is true for their respective targets. Nonetheless, the proposed Artificial Intelligence and Machine Learning algorithms are shared, as is the overall picture: promote Industrial AI and meet the constraints imposed by the Industry 4.0 vision. A reduced human-in-the-loop presence, a data-driven approach to discover causalities otherwise ignored, special attention to the environmental impact of industries' emissions, and a real and efficient exploitation of the Big Data available today are just a subset of the latter. Hence, from a broader perspective, the experiments carried out within this thesis are driven towards the aforementioned targets and the resulting outcomes are satisfactory enough to potentially convince the research community and industrialists that they are not just "visions" but can actually be put into practice. However, it is still an introduction to the topic and the developed models are at what can be defined as a "pilot" stage. Nonetheless, the results are promising and they pave the way towards further improvements and the consolidation of the dictates of Industry 4.0
Leaf recognition for accurate plant classification.
Doctor of Philosophy in Computer Science, University of KwaZulu-Natal, Durban 2017.
Plants are the most important living organisms on our planet because they are
sources of energy and protect our planet against global warming. Botanists were
the first scientists to design techniques for plant species recognition using leaves. Although
many techniques for plant recognition using leaf images have been proposed
in the literature, the precision and the quality of feature descriptors for shape, texture,
and color remain the major challenges. This thesis investigates the precision
of geometric shape feature extraction and improves the determination of the Minimum
Bounding Rectangle (MBR). The comparison of the proposed improved MBR
determination method to Chaudhuri's method is performed using Mean Absolute
Error (MAE) generated by each method on each edge point of the MBR. On the
top left point of the determined MBR, Chaudhuri's method has the MAE value of
26.37 and the proposed method has the MAE value of 8.14.
This thesis also investigates the use of the Convexity Measure of Polygons for the
characterization of the degree of convexity of a given leaf shape. Promising results
are obtained when using the Convexity Measure of Polygons combined with other
geometric features to characterize leaf images, and a classification rate of 92% was
obtained with a Multilayer Perceptron Neural Network classifier. After observing
the limitations of the Convexity Measure of Polygons, a new shape feature called
Convexity Moments of Polygons is presented in this thesis. This new feature has
the invariant properties of the Convexity Measure of Polygons, but is more precise
because it uses more than one value to characterize the degree of convexity of a
given shape. Promising results are obtained when using the Convexity Moments
of Polygons combined with other geometric features to characterize the leaf images
and a classification rate of 95% was obtained with the Multilayer Perceptron Neural
Network classifier.
Leaf boundaries carry valuable information that can be used to distinguish between
plant species. In this thesis, a new boundary-based shape characterization
method called Sinuosity Coefficients is proposed. This method has been used in
many fields of science, such as Geography, to describe river meandering. The Sinuosity
Coefficients are scale and translation invariant. Promising results are obtained when
using Sinuosity Coefficients combined with other geometric features to characterize
the leaf images, and a classification rate of 80% was obtained with the Multilayer
Perceptron Neural Network classifier.
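The abstract does not give the formula for the Sinuosity Coefficients. For orientation, the classical sinuosity index used in geography is the ratio of a curve's path length to the straight-line distance between its endpoints; the sketch below computes that index for an open polyline and checks the scale and translation invariance mentioned above. The function name and the sample points are hypothetical, and the thesis may define its coefficients differently (for example, over several boundary segments).

```python
import math


def sinuosity(points):
    """Classical sinuosity index of an open polyline.

    Ratio of the path length along the points to the straight-line
    distance between the first and last point. A straight segment gives
    1.0; more winding curves give larger values.
    """
    path = sum(math.dist(p, q) for p, q in zip(points, points[1:]))
    chord = math.dist(points[0], points[-1])
    if chord == 0:
        raise ValueError("endpoints coincide; sinuosity is undefined")
    return path / chord


# Hypothetical boundary segment and a scaled, shifted copy of it.
zigzag = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
moved = [(3 * x + 5, 3 * y - 2) for x, y in zigzag]
s_zigzag = sinuosity(zigzag)
s_moved = sinuosity(moved)
s_line = sinuosity([(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)])
print(s_zigzag, s_moved, s_line)
```

Because both the path length and the chord scale by the same factor and are unaffected by shifts, the ratio stays the same under scaling and translation.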
Finally, this thesis implements a model for plant classification using leaf images,
where an input leaf image is described using the Convexity Moments, the Sinuosity
Coefficients and the geometric features to generate a feature vector for the recognition
of plant species using a Radial Basis Neural Network. With the model designed
and implemented, an overall classification rate of 97% was obtained
Unsupervised Few-shot Learning via Deep Laplacian Eigenmaps
Learning a new task from a handful of examples remains an open challenge in
machine learning. Despite the recent progress in few-shot learning, most
methods rely on supervised pretraining or meta-learning on labeled
meta-training data and cannot be applied to the case where the pretraining data
is unlabeled. In this study, we present an unsupervised few-shot learning
method via deep Laplacian eigenmaps. Our method learns representations from
unlabeled data by grouping similar samples together and can be intuitively
interpreted by random walks on augmented training data. We analytically show
how deep Laplacian eigenmaps avoid collapsed representations in unsupervised
learning without explicit comparison between positive and negative samples. The
proposed method significantly closes the performance gap between supervised and
unsupervised few-shot learning. Our method also achieves comparable performance
to current state-of-the-art self-supervised learning methods under the linear
evaluation protocol
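As background for the method named in this abstract, the sketch below implements classical (non-deep) Laplacian eigenmaps on a small similarity graph with NumPy. It is not the paper's deep variant: here the embedding comes directly from an eigendecomposition rather than from a trained network, and the toy graph of two weakly linked triangles is a made-up example. The function assumes every node has positive degree.

```python
import numpy as np


def laplacian_eigenmaps(W, dim):
    """Classical Laplacian eigenmaps for a symmetric similarity matrix W.

    Solves L y = lambda * D y through the symmetric normalized Laplacian
    and returns the `dim` eigenvectors after the trivial constant one.
    Assumes all node degrees are positive.
    """
    W = np.asarray(W, dtype=float)
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    # Map back to the generalized problem and drop the trivial eigenvector.
    return d_inv_sqrt[:, None] * eigvecs[:, 1:dim + 1]


# Toy graph: two triangles joined by one weak edge (hypothetical data).
W = np.zeros((6, 6))
for group in ([0, 1, 2], [3, 4, 5]):
    for i in group:
        for j in group:
            if i != j:
                W[i, j] = 1.0
W[2, 3] = W[3, 2] = 0.01

emb = laplacian_eigenmaps(W, dim=1)
signs = np.sign(emb[:, 0])
print(signs)
```

Nodes in the same triangle land on the same side of zero, which mirrors the abstract's description of grouping similar samples together.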