
    Usage of Modified Holt-Winters Method in the Anomaly Detection of Network Traffic: Case Studies


    Performance measurement with high performance computer of HW-GA anomaly detection algorithms for streaming data

    Anomaly detection is important in every sector, such as health, education, and business. Knowing what is going wrong with a data or digital system helps people in every sector make decisions. Detecting anomalies in real-time Big Data is nowadays crucial. Dealing with real-time data requires speed; for this reason, the aim of this paper is to measure the performance of our previously proposed HW-GA algorithm against other anomaly detection algorithms. Several factors that may affect the performance of HW-GA are analysed, including visualization of results, the amount of data, and the performance of the computers used. Algorithm execution time and CPU usage are the parameters measured to evaluate the performance of the HW-GA algorithm. A further aim of this paper is to test the HW-GA algorithm with large amounts of data, to verify whether it finds the possible anomalies, and to compare the results with other algorithms. The experiments are done in R with different datasets: real Covid-19 and e-dnevnik data, and three benchmarks from the Numenta datasets. The real data have no known anomalies, whereas in the benchmark data the anomalies are known; this allows evaluating how the algorithms work in both situations. The novelty of this paper is that performance is tested on three different computers, one of which is a high-performance computer
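    HW-GA pairs Holt-Winters forecasting with a genetic algorithm for parameter tuning; the GA step is beyond a short sketch, but the Holt-Winters core with a fixed residual threshold, plus the execution-time measurement the paper reports, can be illustrated as follows (a minimal sketch with synthetic data and hand-picked parameters, not the paper's implementation):

```python
import time

def holt_winters_additive(series, season_len, alpha=0.5, beta=0.1, gamma=0.1):
    """One-step-ahead forecasts via additive triple exponential smoothing."""
    level = series[0]
    trend = series[season_len] - series[0]  # crude initial trend estimate
    seasonal = [series[i] - level for i in range(season_len)]
    forecasts = []
    for t, x in enumerate(series):
        s = seasonal[t % season_len]
        forecasts.append(level + trend + s)  # forecast before seeing x
        last_level = level
        level = alpha * (x - s) + (1 - alpha) * (level + trend)
        trend = beta * (level - last_level) + (1 - beta) * trend
        seasonal[t % season_len] = gamma * (x - level) + (1 - gamma) * s
    return forecasts

def detect_anomalies(series, forecasts, threshold):
    """Flag points whose forecast residual exceeds the threshold."""
    return [t for t, (x, f) in enumerate(zip(series, forecasts))
            if abs(x - f) > threshold]

# Synthetic seasonal "traffic" with one injected spike at t=17.
data = [10, 20, 30, 20] * 5
data[17] += 50
start = time.perf_counter()
fc = holt_winters_additive(data, season_len=4)
anomalies = detect_anomalies(data, fc, threshold=30)
elapsed = time.perf_counter() - start  # execution time, as measured in the paper
```

    On this toy series the spike at t=17 is the only point whose residual exceeds the threshold; CPU usage would be measured externally (e.g. via OS tooling), as the abstract describes.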

    Managing Uncertainty: A Case for Probabilistic Grid Scheduling

    The Grid technology is evolving into a global, service-orientated architecture: a universal platform for delivering future high-demand computational services. Strong adoption of the Grid and the utility computing concept is leading to an increasing number of Grid installations running a wide range of applications of different size and complexity. In this paper we address the problem of delivering deadline/economy-based scheduling in a heterogeneous application environment, using statistical properties of historical job executions and their associated meta-data. This approach is motivated by a study of the six-month computational load generated by Grid applications in a multi-purpose Grid cluster serving a community of twenty e-Science projects. The observed job statistics, resource utilisation and user behaviour are discussed in the context of the management approaches and models most suitable for supporting a probabilistic and autonomous scheduling architecture
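    Using statistical properties of historical job executions for probabilistic deadline scheduling can be sketched minimally as the empirical probability, from past runtimes, that a job meets a deadline (hypothetical numbers; the paper's statistical models are richer):

```python
def deadline_confidence(history, deadline):
    """Empirical probability that a job finishes within `deadline` seconds,
    estimated from that job class's historical execution times."""
    if not history:
        return 0.0  # no history: no basis for a probabilistic guarantee
    return sum(1 for t in history if t <= deadline) / len(history)

# Hypothetical recorded runtimes (seconds) for one application class.
runtimes = [120, 95, 180, 110, 240, 130, 105, 150]
p = deadline_confidence(runtimes, deadline=150)
```

    A probabilistic scheduler could then place the job only on resources where this confidence exceeds the level the user paid for.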

    Rating the Significance of Detected Network Events

    Existing anomaly detection systems do not reliably produce accurate severity ratings for detected network events, so network operators waste a large amount of time and effort investigating false alarms. This project investigates the use of data fusion to combine evidence from multiple anomaly detection methods into a consistent and accurate representation of the severity of a network event. Four new detection methods were added to Netevmon, a network anomaly detection framework, and ground truth was collected from a latency training dataset to calculate the set of probabilities required by each of the five data fusion methods chosen for testing. The evaluation was performed against a second test dataset containing manually assigned severity scores for each event, and the significance ratings produced by the fusion methods were compared against the assigned severity scores to determine the accuracy of each data fusion method. The results showed that none of the data fusion methods achieved a desirable level of accuracy for practical deployment. However, Dempster-Shafer was the most promising of the fusion methods investigated, as it correctly classified more significant events than the other methods, albeit with a slightly higher false alarm rate. We conclude by suggesting some possible options for improving the accuracy of Dempster-Shafer that could be investigated in future work
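    Dempster-Shafer fuses mass assignments from independent evidence sources, normalising away conflicting mass. A minimal sketch over a two-hypothesis frame (normal vs. anomaly), with hypothetical detector masses rather than the probabilities derived from the Netevmon ground truth:

```python
from itertools import product

def dempster_combine(m1, m2):
    """Dempster's rule of combination.
    Focal elements are frozensets over the frame; each mass dict sums to 1."""
    combined = {}
    conflict = 0.0
    for (a, w1), (b, w2) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + w1 * w2
        else:
            conflict += w1 * w2  # mass assigned to contradictory hypotheses
    if conflict >= 1.0:
        raise ValueError("total conflict: sources fully disagree")
    # Renormalise so the surviving masses sum to 1.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

N, A = frozenset({"normal"}), frozenset({"anomaly"})
BOTH = N | A  # uncertainty mass assigned to the whole frame

# Hypothetical evidence from two detectors about one network event.
detector1 = {A: 0.6, N: 0.1, BOTH: 0.3}
detector2 = {A: 0.5, N: 0.2, BOTH: 0.3}
fused = dempster_combine(detector1, detector2)
```

    The fused mass on {anomaly} exceeds either detector's individual belief, reflecting the agreement between the two sources; the retained mass on the full frame quantifies remaining uncertainty, which the severity rating can take into account.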

    Data Quality Management in Large-Scale Cyber-Physical Systems

    Cyber-Physical Systems (CPSs) are cross-domain, multi-model, advanced information systems that play a significant role in many large-scale infrastructure sectors of smart-city public services, such as traffic control, smart transportation control, and environmental and noise monitoring systems. Such systems typically involve a substantial number of sensor nodes and other devices that stream and exchange data in real time and are usually deployed in uncontrolled, broad environments. Thus, unexpected measurements may occur due to several internal and external factors, including noise, communication errors, and hardware failures, which may compromise these systems' data quality and raise serious concerns related to safety, reliability, performance, and security. In all cases, these unexpected measurements need to be carefully interpreted and managed based on domain knowledge and computational models. Therefore, in this research, data quality challenges were investigated, and a comprehensive, proof-of-concept data quality management system was developed to tackle unaddressed data quality challenges in large-scale CPSs. The data quality management system was designed to address data quality challenges associated with detecting sensor node measurement errors, sensor node hardware failures, and mismatches in the spatial and temporal contextual attributes of sensor nodes. Detecting sensor node measurement errors associated with the primary data quality dimensions of accuracy, timeliness, completeness, and consistency in large-scale CPSs was investigated using predictive and anomaly analysis models, utilising statistical and machine-learning techniques. Time-series clustering techniques were investigated as a feasible means of detecting long-segmental outliers as an indicator of sensor nodes' continuous halting and incipient hardware failures.
Furthermore, the quality of the spatial and temporal contextual attributes of sensor node observations was investigated using timestamp analysis techniques. The different components of the data quality management system were tested and calibrated using benchmark time-series collected from a high-quality temperature sensor network deployed at the University of East London. Furthermore, the effectiveness of the proposed data quality management system was evaluated using a real-world, large-scale environmental monitoring network consisting of more than 200 temperature sensor nodes distributed around London. The data quality management system achieved a high detection accuracy using an LSTM predictive analysis technique and anomaly detection with DBSCAN. It successfully identified timeliness and completeness errors in sensor node measurements using periodicity analysis combined with a rule engine. It achieved up to 100% accuracy in detecting potentially failed sensor nodes using the characteristic-based time-series clustering technique when applied to time-series windows of two days or longer. Timestamp analysis was adopted effectively for evaluating the quality of the temporal and spatial contextual attributes of sensor node observations, but only within CPS applications in which using gateway modules is possible
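    The periodicity-based timeliness and completeness checks can be sketched as a simple inter-arrival audit of a node's timestamps (hypothetical reporting period and tolerance; the actual system combines such analysis with a rule engine):

```python
def audit_timestamps(ts, period, tolerance):
    """Audit a sorted list of observation timestamps (seconds) against an
    expected reporting period. Returns completeness violations (missed
    reports) and timeliness violations (off-schedule arrivals)."""
    gaps, late = [], []
    for prev, cur in zip(ts, ts[1:]):
        delta = cur - prev
        missed = round(delta / period) - 1  # whole reports absent in the gap
        if missed >= 1:
            gaps.append((prev, missed))         # completeness violation
        elif abs(delta - period) > tolerance:
            late.append((cur, delta - period))  # timeliness violation
    return gaps, late

# Hypothetical node with a 60 s period: one missed report, one late arrival.
ts = [0, 60, 120, 240, 308, 368]
gaps, late = audit_timestamps(ts, period=60, tolerance=5)
```

    Here the jump from 120 to 240 implies one missed report (a completeness error), while the 68 s inter-arrival before 308 is within one period but outside the tolerance (a timeliness error).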

    Detecting Energy Theft and Anomalous Power Usage in Smart Meter Data

    The success of renewable energy usage is fuelling the power grid's most significant transformation in decades, from a centrally controlled electricity supply towards an intelligent, decentralized infrastructure. However, as power grid components become more connected, they also become more vulnerable to cyber attacks, fraud, and software failures. Many recent developments focus on cyber-physical security, such as physical tampering detection, as well as traditional information security solutions, such as encryption, which cannot cover the entire challenge of cyber threats, as digital electricity meters can be vulnerable to software flaws and hardware malfunctions. With the digitalization of electricity meters, many previously solved security problems, such as electricity theft, are reintroduced as IT-related challenges which require modern detection schemes based on data analysis, machine learning and forecasting. Rapid advancements in statistical methods, such as machine learning techniques, have boosted interest in concepts to model, forecast or extract load information, as provided by a smart meter, and to detect tampering early on. Anomaly detection systems discover tampering by analysing statistical deviations from a defined normal behaviour and are commonly accepted as an appropriate technique to uncover yet unknown patterns of misuse. This work proposes anomaly detection approaches, using the power measurements, for the early detection of tampered electricity meters. Algorithms based on time series prediction and probabilistic models with detection rates above 90% were implemented and evaluated using various parameters. The contributions include the assessment of different dimensions of the available data, the introduction of metrics and aggregation methods to optimize the detection of specific patterns, and the examination of sophisticated threats such as mimicking behaviour.
The work contributes to the understanding of significant characteristics and normal behaviour of electric load data, as well as evidence of tampering and especially energy theft
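    A time-series-prediction approach to theft detection can be sketched with a seasonal-naive forecaster that flags sustained under-reporting relative to the expected load profile (hypothetical thresholds and data; this does not reproduce the paper's algorithms or its 90% detection rates):

```python
def flag_theft(load, season, drop_ratio, min_run):
    """Flag times where readings stay below the seasonal-naive forecast
    (the reading one season earlier) by more than drop_ratio for at least
    min_run consecutive steps; sustained under-reporting is a typical
    electricity-theft signature."""
    run = 0
    alerts = []
    for t in range(season, len(load)):
        expected = load[t - season]
        if expected > 0 and load[t] < (1 - drop_ratio) * expected:
            run += 1
            if run == min_run:
                alerts.append(t - min_run + 1)  # start of the suspicious run
        else:
            run = 0
    return alerts

# Hypothetical daily profile repeated for a week, then scaled to 50%
# (simulated meter tampering) from t=28 onward.
profile = [5, 8, 12, 7]
load = profile * 7 + [x * 0.5 for x in profile * 3]
alerts = flag_theft(load, season=4, drop_ratio=0.3, min_run=3)
```

    Requiring a minimum run length trades a little detection latency for robustness against isolated low readings; a mimicking attacker who scales consumption only slightly would need a tighter drop_ratio or the probabilistic models the abstract mentions.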

    The model of an anomaly detector for HiLumi LHC magnets based on Recurrent Neural Networks and adaptive quantization

    This paper examines the applicability of Recurrent Neural Network models for detecting anomalous behavior of the CERN superconducting magnets. In order to conduct the experiments, the authors designed and implemented an adaptive signal quantization algorithm and a custom GRU-based detector, and developed a method for selecting the detector parameters. Three different datasets were used for testing the detector. Two artificially generated datasets were used to assess the raw performance of the system, whereas the 231 MB dataset composed of the signals acquired from HiLumi magnets was intended for real-life experiments and model training. Several different setups of the developed anomaly detection system were evaluated and compared with a state-of-the-art OC-SVM reference model operating on the same data. The OC-SVM model was equipped with a rich set of feature extractors accounting for a range of the input signal properties. The experiments determined that the detector, along with its supporting design methodology, reaches an F1 score equal or very close to 1 for almost all test sets. Due to the profile of the data, the best_length setup of the detector turned out to perform the best among all five tested configuration schemes of the detection system. The quantization parameters have the biggest impact on the overall performance of the detector, with the best values of the input/output grid equal to 16 and 8, respectively.
    The proposed detection solution significantly outperformed the OC-SVM-based detector in most cases, with much more stable performance across all the datasets.
    Comment: Related to arXiv:1702.0083
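    The GRU detector itself requires a deep-learning framework, but the role of signal quantization as a preprocessing step can be illustrated with a quantile-based quantizer, in which each output symbol covers roughly equal probability mass (an assumed scheme for illustration; the paper's adaptive algorithm may differ):

```python
def adaptive_quantize(signal, levels):
    """Map a real-valued signal to integer symbols 0..levels-1 using
    quantile-based bin edges, so the symbol distribution is roughly
    balanced regardless of the signal's amplitude range."""
    ranked = sorted(signal)
    n = len(ranked)
    # Interior bin edges at the empirical quantiles.
    edges = [ranked[(i * n) // levels] for i in range(1, levels)]
    symbols = []
    for x in signal:
        sym = 0
        for e in edges:
            if x >= e:
                sym += 1
        symbols.append(sym)
    return symbols

# Hypothetical magnet-signal fragment quantized onto a 4-level grid.
sig = [0.1, 0.4, 0.35, 0.9, 0.05, 0.6, 0.75, 0.2]
symbols = adaptive_quantize(sig, levels=4)
```

    The resulting symbol stream is what a sequence model such as a GRU would consume; the paper's 16-level input and 8-level output grids correspond to the `levels` parameter on each side of the detector.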

    Detecting Anomalies Using the Holt-Winters Algorithm Based on the Confidence Level from Bayes' Theorem

    Traffic can be defined as information moving from a transmitter to a receiver. In a traffic network it cannot be accurately predicted when a condition outside the normal bounds of the traffic will occur; such a condition is called a traffic anomaly. Detecting and predicting abnormal or anomalous traffic can be achieved with the Holt-Winters algorithm, whose computation is based on exponential smoothing. The success rate of this anomaly detection varies, and many factors influence the final result. Therefore, to determine the accuracy of the Holt-Winters algorithm, a further calculation is needed, which can be performed using Bayes' theorem. In this study, anomalies that will occur in a network in the future are detected using the Holt-Winters algorithm. The algorithm uses previously observed network traffic data to compute and predict when an anomaly in that traffic will occur again. The results of the Holt-Winters algorithm are then recalculated to determine their accuracy using Bayes' theorem. The Holt-Winters algorithm collects anomaly data for traffic per service, with prediction accuracies of 91% for the ftp service, 89% for the ftp-data service, and 88% for the telnet service. Recalculating the Holt-Winters results with a probability calculation gives confidence levels under Bayes' theorem of 0.51 for ftp, 1 for ftp-data, and 0.48 for telnet
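    The Bayes' theorem step, recomputing the confidence that an alarm reflects a true anomaly, can be sketched as follows (with hypothetical rates for illustration; the paper's per-service figures of 0.51, 1, and 0.48 come from its own traffic data):

```python
def bayes_posterior(p_alarm_given_anomaly, p_anomaly, p_alarm_given_normal):
    """P(anomaly | alarm) via Bayes' theorem, given the detector's hit
    rate, the prior anomaly rate, and the false-alarm rate."""
    p_normal = 1.0 - p_anomaly
    evidence = (p_alarm_given_anomaly * p_anomaly
                + p_alarm_given_normal * p_normal)
    return p_alarm_given_anomaly * p_anomaly / evidence

# Hypothetical rates for an ftp-like service: 91% hit rate, 10% prior
# anomaly rate, 9% false-alarm rate.
posterior_ftp = bayes_posterior(0.91, 0.10, 0.09)
```

    Even with a 91% hit rate, a low prior anomaly rate keeps the posterior confidence near 0.5, which mirrors how the paper's Bayes-based confidence levels can sit well below the raw prediction accuracies.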