Automated Modeling of Real-Time Anomaly Detection using Non-Parametric Statistical Technique for Data Streams in Cloud Environments
The main objective of online anomaly detection is to identify abnormal or unusual behavior, such as network intrusions, malware infections, or over-utilized system resources caused by design defects, in real-time data streams. The terabytes of performance data generated in cloud data centers are a well-accepted example of such a real-time data stream. In this paper, we propose an online anomaly detection framework using a non-parametric statistical technique for cloud data centers. To determine the accuracy of the proposed approach, we apply it to data collected from the RUBiS cloud testbed and the Yahoo Cloud Serving Benchmark (YCSB). Our experimental results show high accuracy in terms of True Positive Rate (TPR), False Positive Rate (FPR), True Negative Rate (TNR), and False Negative Rate (FNR).
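The abstract does not specify which non-parametric statistic is used. As a minimal illustrative sketch only (the function name, window size, and threshold are assumptions, not the paper's method), one common distribution-free approach scores each new sample by a robust z-score built from the median and median absolute deviation (MAD) of a sliding window:

```python
import statistics
from collections import deque

def mad_anomaly_detector(stream, window=50, threshold=3.5):
    """Flag samples whose robust z-score, computed from the median and
    MAD of a sliding window, exceeds the threshold. Non-parametric:
    no distributional assumptions about the metric stream."""
    history = deque(maxlen=window)
    flags = []
    for x in stream:
        if len(history) >= 10:  # warm-up before scoring
            med = statistics.median(history)
            mad = statistics.median(abs(v - med) for v in history) or 1e-9
            score = 0.6745 * (x - med) / mad  # 0.6745: Gaussian consistency constant
            flags.append(abs(score) > threshold)
        else:
            flags.append(False)  # not enough history yet
        history.append(x)
    return flags
```

Because the median and MAD are insensitive to the very outliers being flagged, a single spike in a CPU or latency metric stands out without any parametric model of the "normal" distribution.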
The model of an anomaly detector for HiLumi LHC magnets based on Recurrent Neural Networks and adaptive quantization
This paper examines the applicability of Recurrent Neural
Network models for detecting anomalous behavior of the CERN superconducting
magnets. In order to conduct the experiments, the authors designed and
implemented an adaptive signal quantization algorithm and a custom GRU-based
detector, and developed a method for selecting the detector parameters. Three
different datasets were used for testing the detector. Two artificially
generated datasets were used to assess the raw performance of the system
whereas the 231 MB dataset composed of the signals acquired from HiLumi magnets
was intended for real-life experiments and model training. Several different
setups of the developed anomaly detection system were evaluated and compared
with state-of-the-art OC-SVM reference model operating on the same data. The
OC-SVM model was equipped with a rich set of feature extractors accounting for
a range of the input signal properties. The experiments showed that the
detector, along with its supporting design methodology, reaches an F1 score
equal or very close to 1 for almost all test sets. Due to the profile of the
data, the best_length setup of the detector performed best among all five
tested configuration schemes of the detection system. The quantization
parameters have the biggest impact on the overall performance of the detector,
with the best input/output grid sizes equal to 16 and 8, respectively. The
proposed detection solution significantly outperformed the OC-SVM-based
detector in most cases, with much more stable performance across all the
datasets.
Comment: Related to arXiv:1702.0083
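The abstract mentions an adaptive quantization step with input/output grids of 16 and 8 levels but does not describe the algorithm itself. As a hedged sketch of one standard "adaptive" scheme (quantile-based binning fitted to a reference signal; the function names and the equal-mass design are assumptions, not the paper's algorithm), the idea can be illustrated as:

```python
def fit_adaptive_quantizer(reference, levels=16):
    """Compute quantile-based bin edges so that each quantization level
    covers an equal share of the reference signal's empirical
    distribution (an adaptive alternative to uniform binning)."""
    s = sorted(reference)
    n = len(s)
    # interior edges at the k/levels empirical quantiles, k = 1..levels-1
    return [s[min(n - 1, (k * n) // levels)] for k in range(1, levels)]

def quantize(signal, edges):
    """Map each sample to the index of its bin (0 .. len(edges))."""
    out = []
    for x in signal:
        level = 0
        for e in edges:
            if x >= e:
                level += 1
            else:
                break
        out.append(level)
    return out
```

A quantized symbol sequence like this is a natural input for a GRU-based sequence model: each magnet-signal sample becomes one of 16 discrete tokens, and the detector predicts the next token over an 8-level output grid.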
In Datacenter Performance, The Only Constant Is Change
All computing infrastructure suffers from performance variability, be it
bare-metal or virtualized. This phenomenon originates from many sources: some
transient, such as noisy neighbors, and others more permanent but sudden, such
as changes or wear in hardware, changes in the underlying hypervisor stack, or
even undocumented interactions between the policies of the computing resource
provider and the active workloads. Thus, performance measurements obtained on
clouds, HPC facilities, and, more generally, datacenter environments are almost
guaranteed to exhibit performance regimes that evolve over time, which leads to
undesirable nonstationarities in application performance. In this paper, we
present our analysis of performance of the bare-metal hardware available on the
CloudLab testbed where we focus on quantifying the evolving performance regimes
using changepoint detection. We describe our findings, backed by a dataset with
nearly 6.9M benchmark results collected from over 1600 machines over a period
of 2 years and 9 months. These findings yield a comprehensive characterization
of real-world performance variability patterns in one computing facility, a
methodology for studying such patterns on other infrastructures, and contribute
to a better understanding of performance variability in general.
Comment: To be presented at the 20th IEEE/ACM International Symposium on
Cluster, Cloud and Internet Computing (CCGrid,
http://cloudbus.org/ccgrid2020/) on May 11-14, 2020 in Melbourne, Victoria,
Australia
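The abstract does not name the changepoint method applied to the CloudLab measurements. Purely as an illustrative sketch (the function and the least-squares cost are assumptions, not the paper's algorithm), the core step of binary-segmentation changepoint detection, finding the single split that best separates a series into two constant-mean regimes, looks like:

```python
def best_single_changepoint(series):
    """Return the index that best splits the series into two
    constant-mean segments (minimum total squared error) -- the
    single-split core of binary-segmentation changepoint detection
    used to delineate performance regimes."""
    def sse(seg):
        m = sum(seg) / len(seg)
        return sum((x - m) ** 2 for x in seg)
    best_idx, best_cost = None, float("inf")
    for i in range(1, len(series)):
        cost = sse(series[:i]) + sse(series[i:])
        if cost < best_cost:
            best_idx, best_cost = i, cost
    return best_idx
```

Applying this recursively to each resulting segment (with a penalty to stop splitting) yields the piecewise-stationary "performance regimes" that a benchmark time series exhibits after, e.g., a hypervisor upgrade or hardware change.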
Statistical Detection of Collective Data Fraud
Statistical divergence is widely applied in multimedia processing, largely
because of the regularity and interpretable features such data exhibits. In
broader data domains, however, these advantages may no longer hold, and a more
general approach is therefore required. In data detection, statistical
divergence can be used as a similarity measure based on collective features. In
this paper, we present a collective detection technique based on statistical
divergence. The technique extracts distribution similarities among data
collections and then uses the statistical divergence to detect collective
anomalies. Evaluation shows that the technique is applicable in real-world
settings.
Comment: 6 pages, 6 figures and tables, submitted to ICME 202
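The abstract does not say which divergence is used. As a minimal sketch of the general idea (Kullback-Leibler divergence over histograms is one common choice; these helper functions and the epsilon smoothing are illustrative assumptions), a collection can be scored against a reference collection by comparing their empirical distributions:

```python
import math

def histogram(data, bins, lo, hi):
    """Normalized histogram of `data` over [lo, hi) with `bins` bins."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for x in data:
        idx = min(bins - 1, max(0, int((x - lo) / width)))
        counts[idx] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]

def kl_divergence(p, q, eps=1e-12):
    """D_KL(p || q) for two discrete distributions given as probability
    lists; eps guards against empty bins."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)
```

A collection whose divergence from the reference distribution exceeds a calibrated threshold is flagged as a collective anomaly, even when every individual record looks unremarkable on its own.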