
    Interpretable Graph Anomaly Detection using Gradient Attention Maps

    Detecting unusual patterns in graph data is a crucial task in data mining. However, existing methods often face challenges in consistently achieving satisfactory performance and lack interpretability, which hinders our understanding of anomaly detection decisions. In this paper, we propose a novel approach to graph anomaly detection that leverages the power of interpretability to enhance performance. Specifically, our method extracts an attention map derived from gradients of graph neural networks, which serves as a basis for scoring anomalies. In addition, we conduct theoretical analysis using synthetic data to validate our method and gain insights into its decision-making process. To demonstrate the effectiveness of our method, we extensively evaluate our approach against state-of-the-art graph anomaly detection techniques. The results consistently demonstrate the superior performance of our method compared to the baselines.

    Process Aware Host-based Intrusion Detection Model

    Nowadays, many organizations use Process Aware Information Systems (PAISs) to automate their business processes. As in any other information system, security plays a major role in a PAIS: the system must reach a secure state and remain in it. To provide security in a PAIS, a Process Aware Host-based Intrusion Detection (PAHID) model is proposed in this paper. The model detects host-based intrusions in a PAIS using process mining techniques. The proposed model uses both anomaly detection and misuse detection techniques for greater efficiency, and it considers the organizational perspective of process mining (rather than the control-flow perspective) to detect more attack types. The model is automated, can deal with large logs, and is suitable for flexible application domains. The PAHID model is implemented using the ProM framework and Java. It is evaluated on a simulated log based on a real-world organization's information system. Results demonstrate that the model provides high accuracy and a low false positive rate.

    Intrusion Detection Using Self-Training Support Vector Machines

    Intrusion is broadly defined as a successful attack on a network. An Intrusion Detection System (IDS) is a software tool used to detect unauthorized access to a computer system or network. It is a dynamic monitoring entity that complements the static monitoring abilities of a firewall. Data mining techniques provide efficient methods for the development of IDSs. The idea behind using data mining techniques is that they can automate the process of creating traffic models from reference data and thereby eliminate the need for laborious manual intervention. Such systems are capable of detecting not only known attacks but also their variations. Existing IDS technologies are broadly classified, on the basis of detection methodology, as misuse (or signature-based) detection systems and anomaly detection systems. Misuse detection compares network traffic against a model describing known intrusions. Anomaly detection is based on the analysis of profiles that represent normal traffic behavior. Semi-supervised systems for anomaly detection would reduce the demands of the training process by reducing the amount of labeled training data required. A self-training Support Vector Machine (SVM) based detection algorithm is presented in this thesis. In the past, self-training of SVMs has been used successfully to reduce the size of the labeled training set in other domains. A similar method was implemented, and results of the simulation performed on the KDD Cup 99 dataset for intrusion detection show a reduction of up to 90% in the size of the labeled training set required compared to supervised learning techniques.
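
    The self-training loop described in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: to stay dependency-free it swaps the SVM for a one-dimensional nearest-centroid classifier, and the names `fit_centroids`, `predict`, `self_train`, and the `min_margin` parameter are hypothetical. Only the loop structure (train on labeled data, pseudo-label confident unlabeled points, retrain) reflects the technique named above.

```python
from statistics import mean

def fit_centroids(xs, ys):
    """Stand-in classifier (NOT an SVM): one centroid per class."""
    return {c: mean(x for x, y in zip(xs, ys) if y == c) for c in set(ys)}

def predict(centroids, x):
    """Return (label, margin); margin is the distance gap between the
    two nearest centroids and serves as a confidence score."""
    ranked = sorted(centroids, key=lambda c: abs(x - centroids[c]))
    best, second = ranked[0], ranked[1]
    margin = abs(x - centroids[second]) - abs(x - centroids[best])
    return best, margin

def self_train(labeled_x, labeled_y, unlabeled_x, min_margin, max_rounds=10):
    """Repeatedly pseudo-label confident points and retrain on them."""
    xs, ys = list(labeled_x), list(labeled_y)
    pool = list(unlabeled_x)
    for _ in range(max_rounds):
        centroids = fit_centroids(xs, ys)
        confident, remaining = [], []
        for x in pool:
            label, margin = predict(centroids, x)
            if margin >= min_margin:
                confident.append((x, label))
            else:
                remaining.append(x)
        if not confident:
            break  # nothing new to learn from
        for x, label in confident:
            xs.append(x)
            ys.append(label)
        pool = remaining
    return fit_centroids(xs, ys), pool

# Two labeled points, five unlabeled ones (toy one-dimensional features).
centroids, leftover = self_train(
    [0.0, 10.0], ["normal", "attack"],
    [1.0, 2.0, 9.0, 8.0, 5.2],
    min_margin=2.0,
)
```

    In this toy run only two points carry labels; the ambiguous point near the class boundary is never pseudo-labeled and stays in `leftover`.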

    Damage Vision Mining Opportunity for Imbalanced Anomaly Detection

    In the past decade, balanced datasets have been used to advance algorithms for classification, object detection, semantic segmentation, and anomaly detection in industrial applications. Specifically, for condition-based maintenance, automating visual inspection is crucial to ensure high quality. Deterioration prognostics attempt to optimize the fine decision process for predictive maintenance and proactive repair. In civil infrastructure and the living environment, damage data mining cannot avoid the imbalanced data issue because of rare unseen events and the high-quality status achieved by improved operations. For visual inspection, the deteriorated classes acquired from the surfaces of concrete and steel components are occasionally imbalanced. From numerous related surveys, we summarize that imbalanced data problems can be categorized into four types: 1) missing range of target and label valuables, 2) majority-minority class imbalance, 3) foreground-background of spatial imbalance, 4) long-tailed class of pixel-wise imbalance. Since 2015, there have been many studies of imbalanced data using deep learning approaches, including regression, image classification, object detection, and semantic segmentation. However, anomaly detection for imbalanced data is not yet well known. In this study, we highlight a one-class anomaly detection application that decides whether an input belongs to the anomalous class or not, and demonstrate clear examples on imbalanced vision datasets: wooden deterioration, concrete deterioration, and disaster damage. We provide key results on the advantage of damage vision mining, hypothesizing that the more effective the range of the positive ratio, the higher the accuracy gain of the anomaly detection application. Finally, the applicability of the damage learning methods, limitations, and future work are discussed. Comment: 12 pages, 14 figures, 8 tables.

    Signature-based anomaly intrusion detection using integrated data mining classifiers

    As the influence of the Internet and networking technologies as a communication medium advances and expands across the globe, cyber attacks grow accordingly. Anomaly detection systems (ADSs) are employed to scrutinize information such as the behaviour of packets coming from various locations on the network, in order to find intrusive activities as quickly and precisely as possible. Unfortunately, besides the need to minimize false alarms, performance issues caused by heavy computation have become a drawback of this kind of detection system. In this work, a novel Signature-Based Anomaly Detection Scheme (SADS) is proposed, which can be applied to scrutinize the behaviour patterns of packet headers more precisely and promptly. Integrated data mining classifiers such as Naive Bayes and Random Forest are used to decrease false alarms and to generate signatures from the detection results, which support future prediction and reduce processing time. Results from a number of experiments using the DARPA 1999 and ISCX 2012 benchmark datasets show that SADS has better detection capabilities and a lower processing time than the conventional anomaly-based detection method.

    Can process mining help in anomaly-based intrusion detection?

    In this paper, we consider the naive applications of process mining in network traffic comprehension, traffic anomaly detection, and intrusion detection. We standardise the procedure of transforming packet data into an event log. We mine multiple process models and analyse the process models mined with the inductive miner using ProM and the fuzzy miner using Disco. We compare the two types of process models extracted from event logs of differing sizes. We contrast the process models with the RFC TCP state transition diagram and the diagram by Bishop et al. We analyse the issues and challenges associated with process mining in intrusion detection and explain why naive process mining with network data is ineffective.
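
    As a rough illustration of turning packet data into an event log, the sketch below groups packets into traces. The abstract does not give the paper's actual procedure, so every modelling choice here is an assumption: the case identifier is a direction-independent pair of endpoints, the activity is the TCP flag name, and the packet field names (`ts`, `src`, `sport`, `dst`, `dport`, `flags`) are hypothetical.

```python
from collections import defaultdict

def flow_key(pkt):
    """Assumed case id: the sorted pair of (address, port) endpoints,
    so both directions of a connection map to the same trace."""
    a = (pkt["src"], pkt["sport"])
    b = (pkt["dst"], pkt["dport"])
    return tuple(sorted((a, b)))

def packets_to_event_log(packets):
    """Map each case id to its time-ordered list of activities."""
    log = defaultdict(list)
    for pkt in sorted(packets, key=lambda p: p["ts"]):
        log[flow_key(pkt)].append(pkt["flags"])
    return dict(log)

# Toy capture: one TCP handshake plus the first packet of a second flow.
packets = [
    {"ts": 0.00, "src": "10.0.0.1", "sport": 40000,
     "dst": "10.0.0.2", "dport": 80, "flags": "SYN"},
    {"ts": 0.01, "src": "10.0.0.2", "sport": 80,
     "dst": "10.0.0.1", "dport": 40000, "flags": "SYN-ACK"},
    {"ts": 0.02, "src": "10.0.0.1", "sport": 40000,
     "dst": "10.0.0.2", "dport": 80, "flags": "ACK"},
    {"ts": 0.03, "src": "10.0.0.3", "sport": 50000,
     "dst": "10.0.0.2", "dport": 22, "flags": "SYN"},
]
event_log = packets_to_event_log(packets)
```

    Each resulting trace is the kind of input a process discovery tool expects: a case identifier with an ordered sequence of activity names.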

    Detecting market manipulation in stock market data

    Anomaly detection is an extensively researched problem that has diverse applications in many domains. It is the process of finding data points or patterns that do not conform to expected behavior within a dataset. Solutions to this problem have used techniques from disciplines such as statistics, machine learning, data mining, spectral theory and information theory. In the case of stock market data, the input is a non-linear complex time series that renders statistical methods ineffective. The aim of this thesis is to detect anomalies within the Standard and Poor and Qatar Stock Exchange using the behavior of similar time series. Many works on stock market manipulation focus on supervised learning techniques, which require labeled datasets. The labeling process requires substantial effort, and anomalous behavior is dynamic in nature. For those reasons, an unsupervised market manipulation detection technique is desirable. The Contextual Anomaly Detector (CAD) is an unsupervised method that finds anomalies by looking at similarly behaving time series and using them to predict expected values. When the predicted value differs from the actual value in the time series by more than a certain threshold, the point is considered an anomaly. This thesis examines the CAD and implements a different preprocessing step to improve recall and precision.
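
    The thresholding idea behind the CAD can be sketched as follows. This is a simplified illustration, not the thesis's algorithm: it assumes the set of similarly behaving series is already known and uses their mean at each time step as the predicted value. The function name `contextual_anomalies` and the sample numbers are hypothetical.

```python
from statistics import mean

def contextual_anomalies(target, peers, threshold):
    """Return indices where `target` deviates from the peer-based
    prediction by more than `threshold`.

    target    -- list of floats (for example, daily returns of one stock)
    peers     -- equally long series assumed to behave similarly
    threshold -- absolute deviation above which a point is flagged
    """
    anomalies = []
    for t, actual in enumerate(target):
        # Assumed predictor: the mean of the peer series at time t.
        expected = mean(series[t] for series in peers)
        if abs(actual - expected) > threshold:
            anomalies.append(t)
    return anomalies

target = [0.01, 0.02, -0.01, 0.15, 0.00]
peers = [
    [0.01, 0.01, -0.02, 0.01, 0.00],
    [0.02, 0.02, -0.01, 0.00, 0.01],
    [0.00, 0.03, -0.01, 0.02, -0.01],
]
flagged = contextual_anomalies(target, peers, threshold=0.05)
```

    Here only the fourth point (index 3) is flagged, because the target jumps while its peers stay flat.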

    Task automation through email data analysis

    Currently, many companies do not use the information contained in their emails, even though email is a rich data set that could be very useful. This thesis report focuses on email data analysis and task automation, particularly in the area of email-based process mining. The state-of-the-art section reviews existing research on extracting information from email content using techniques such as lexical analysis, language detection, semantic analysis and machine learning methods. It explores different areas of process mining, including process pattern discovery, anomaly discovery, and process extraction from texts. The objectives of this research are to assess the feasibility of extracting candidate processes from emails, to develop human-understandable metrics to classify processes, to propose a system to identify automation opportunities in email templates, and to explore possibilities for automation in email interactions. To do this, we carried out several steps: data preparation, chain detection, text representation, distance matrix calculation, and grouping.

    Data mining based cyber-attack detection
