1,585 research outputs found

    An adaptive distributed Intrusion detection system architecture using multi agents

    Intrusion detection systems monitor network data, analyze it, and report any intrusions found. The major issues with these systems are the time taken for analysis, the transfer of bulk data from one part of the network to another, high false-positive rates, and limited adaptability to future threats. These issues are addressed here by devising a framework for intrusion detection in which various types of co-operating agents are distributed in the network for monitoring, analyzing, detecting and reporting. Analysis and detection agents are mobile agents that serve as the primary modules for detecting intrusions. Their mobility eliminates the transfer of bulk data for processing. An algorithm named territory is proposed to prevent one analysis agent from interfering with another. A communication layout of the analysis and detection module with other modules is depicted. The inter-agent communication reduces false positives significantly and facilitates the identification of distributed attacks. The co-ordinator agents log various events and summarize the activities in their network. They also communicate with co-ordinator agents of other networks. The system scales by increasing the number of agents of each type as needed. Centralized processing is avoided in order to eliminate a single point of failure. We created a prototype, and the experiments gave very promising results, showing the effectiveness of the system.

    Introduction to the Special Issue on Sustainable Solutions for the Intelligent Transportation Systems

    Intelligent transportation systems improve the transportation system’s operational efficiency and enhance its safety and reliability by high-tech means such as information technology, control technology, and computer technology. In recent years, sustainable development has become an important topic in the development of intelligent transportation, including new infrastructure and energy distribution, new energy vehicles and new transportation systems, and the development of low-carbon and intelligent transportation equipment. The development of new energy vehicles is a significant part of green transportation, and improving their automation performance is vital for smart transportation. Promoting intelligent transportation and green, low-carbon, intelligent transportation equipment will be a significant feature of future transportation development. For intelligent infrastructure and energy distribution facilities, the electricity for popular electric vehicles and renewable energy, such as nuclear power and hydrogen power, should be considered.

    An enhanced resampling technique for imbalanced data sets

    A data set is considered imbalanced if the instances of one class (the majority class) outnumber those of the other class (the minority class). The main problem with binary imbalanced data sets is that classifiers tend to ignore the minority class. Numerous resampling techniques such as undersampling, oversampling, and a combination of both have been widely used. However, the undersampling and oversampling techniques suffer from the elimination and addition of relevant data, which may lead to poor classification results. Hence, this study aims to improve classification metrics by enhancing the undersampling technique and combining it with an existing oversampling technique. To achieve this objective, Fuzzy Distance-based Undersampling (FDUS) is proposed. Entropy estimation is used to produce fuzzy thresholds that categorise the instances of the majority and minority classes into membership functions. FDUS is then combined with the Synthetic Minority Oversampling TEchnique (SMOTE); the combination, known as FDUS+SMOTE, is executed in sequence until a balanced data set is achieved. FDUS and FDUS+SMOTE are compared with four techniques based on classification accuracy, F-measure and G-mean. From the results, FDUS achieved better classification accuracy, F-measure and G-mean than the other techniques, with averages of 80.57%, 0.85 and 0.78, respectively. This showed that fuzzy logic, when incorporated into the Distance-based Undersampling technique, was able to reduce the elimination of relevant data. Further, the findings showed that FDUS+SMOTE performed better than the combination of SMOTE and Tomek Links, and the combination of SMOTE and Edited Nearest Neighbour, on benchmark data sets. FDUS+SMOTE minimised the removal of relevant data from the majority class and avoided overfitting. On average, FDUS and FDUS+SMOTE were able to balance categorical, integer and real data sets and enhanced the performance of binary classification. Furthermore, the techniques performed well on small data sets containing approximately 100 to 800 instances.
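The three metrics named in this abstract can be computed from confusion-matrix counts. The sketch below is illustrative only: it is not the FDUS method, the function name `binary_metrics` is hypothetical, and the minority class is assumed to be the positive class. It shows why accuracy alone is misleading on imbalanced data.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, F-measure, G-mean) from confusion-matrix counts.

    The minority class is assumed to be the positive class.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    # G-mean is the geometric mean of the per-class recalls.
    g_mean = math.sqrt(recall * specificity)
    return accuracy, f_measure, g_mean


# Hypothetical counts: a classifier that ignores a 10% minority class
# still scores 90% accuracy, while F-measure and G-mean drop to zero.
print(binary_metrics(tp=0, fp=0, tn=90, fn=10))
```

Because G-mean multiplies the recall of both classes, a classifier cannot score well on it by predicting only the majority class.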

    Developing Efficient and Effective Intrusion Detection System using Evolutionary Computation

    The internet and computer networks have become essential tools in distributed computing organisations, especially because they enable collaboration between components of heterogeneous systems. The efficiency and flexibility of online services have attracted many applications, but as they have grown in popularity so has the number of attacks on them. Thus, security teams must deal with numerous threats in a continuously evolving threat landscape. Traditional security solutions are by no means enough to create a secure environment, so intrusion detection systems (IDSs), which observe system activity and detect intrusions, are usually utilised to complement other defence techniques. However, threats are becoming more sophisticated, with attackers using new attack methods or modifying existing ones. Furthermore, building an effective and efficient IDS is a challenging research problem due to the resource restrictions of the environment and its constant evolution. To mitigate these problems, we propose to use machine learning techniques to assist with the IDS building effort. In this thesis, Evolutionary Computation (EC) algorithms are empirically investigated for synthesising intrusion detection programs. EC can construct programs for raising intrusion alerts automatically. One novel proposed approach, Cartesian Genetic Programming, has proved particularly effective. We also used an ensemble-learning paradigm, in which EC algorithms were used as a meta-learning method to produce detectors. The latter is more fully worked out than the former and has proved a significant success. An efficient IDS should always take into account the resource restrictions of the deployed systems. Memory usage and processing speed are critical requirements. We apply a multi-objective approach to find trade-offs between the intrusion detection capability and the resource consumption of programs, and optimise these objectives simultaneously. High complexity and the large size of detectors are identified as general issues with the current approaches. The multi-objective approach is used to evolve Pareto fronts for detectors that aim to maintain the simplicity of the generated patterns. We also investigate the potential application of these algorithms to detect unknown attacks.
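The idea of a Pareto front over detection capability and resource consumption can be sketched as a simple non-dominated filter. This is not the evolutionary algorithm used in the thesis. The function name `pareto_front`, the two objectives (error rate and program size) and the candidate values are all assumptions made for illustration.

```python
def pareto_front(points):
    """Return the non-dominated points, assuming every objective is minimised."""

    def dominates(a, b):
        # a dominates b if it is no worse on every objective
        # and strictly better on at least one.
        no_worse = all(x <= y for x, y in zip(a, b))
        better = any(x < y for x, y in zip(a, b))
        return no_worse and better

    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical detectors as (error rate, program size in nodes).
detectors = [(0.10, 50), (0.08, 120), (0.12, 40), (0.10, 90), (0.20, 45)]
print(pareto_front(detectors))
```

Each detector left on the front represents a different trade-off: none of the others is both more accurate and smaller.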

    IoT Data Analytics in Dynamic Environments: From An Automated Machine Learning Perspective

    With the wide spread of sensors and smart devices in recent years, the data generation speed of the Internet of Things (IoT) systems has increased dramatically. In IoT systems, massive volumes of data must be processed, transformed, and analyzed on a frequent basis to enable various IoT services and functionalities. Machine Learning (ML) approaches have shown their capacity for IoT data analytics. However, applying ML models to IoT data analytics tasks still faces many difficulties and challenges, specifically, effective model selection, design/tuning, and updating, which have brought massive demand for experienced data scientists. Additionally, the dynamic nature of IoT data may introduce concept drift issues, causing model performance degradation. To reduce human efforts, Automated Machine Learning (AutoML) has become a popular field that aims to automatically select, construct, tune, and update machine learning models to achieve the best performance on specified tasks. In this paper, we conduct a review of existing methods in the model selection, tuning, and updating procedures in the area of AutoML in order to identify and summarize the optimal solutions for every step of applying ML algorithms to IoT data analytics. To justify our findings and help industrial users and researchers better implement AutoML approaches, a case study of applying AutoML to IoT anomaly detection problems is conducted in this work. Lastly, we discuss and classify the challenges and research directions for this domain. Comment: Published in Engineering Applications of Artificial Intelligence (Elsevier, IF:7.8); Code/An AutoML tutorial is available at GitHub link: https://github.com/Western-OC2-Lab/AutoML-Implementation-for-Static-and-Dynamic-Data-Analytic

    Process-Oriented Stream Classification Pipeline: A Literature Review

    Featured Application: Nowadays, many applications and disciplines work on the basis of stream data. Common examples are the IoT sector (e.g., sensor data analysis), or video, image, and text analysis applications (e.g., in social media analytics or astronomy). With our work, we gather different approaches and terminology, and give a broad overview of the topic. Our main target groups are practitioners and newcomers to the field of data stream classification. Due to the rise of continuous data-generating applications, analyzing data streams has gained increasing attention over the past decades. A core research area in stream data is stream classification, which categorizes or detects data points within an evolving stream of observations. Areas of stream classification are diverse, ranging, e.g., from monitoring sensor data to analyzing a wide range of (social) media applications. Research in stream classification is related to developing methods that adapt to the changing and potentially volatile data stream. It focuses on individual aspects of the stream classification pipeline, e.g., designing suitable algorithm architectures, an efficient train and test procedure, or detecting so-called concept drifts. As a result of the many different research questions and strands, the field is challenging to grasp, especially for beginners. This survey explores, summarizes, and categorizes work within the domain of stream classification and identifies core research threads over the past few years. It is structured based on the stream classification process to facilitate coordination within this complex topic, including common application scenarios and benchmarking data sets. Thus, both newcomers to the field and experts who want to widen their scope can gain (additional) insight into this research area and find starting points and pointers to more in-depth literature on specific issues and research directions in the field.
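One widely used train and test procedure for streams is prequential (test-then-train) evaluation: each instance is first used to test the model and only afterwards to update it. The abstract does not name a specific procedure, so this choice is an assumption. The `MajorityClass` learner and the function `prequential_accuracy` are hypothetical names for a minimal sketch.

```python
from collections import Counter


class MajorityClass:
    """Toy incremental learner that predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        # Returns None before any label has been observed.
        return self.counts.most_common(1)[0][0] if self.counts else None

    def learn(self, x, y):
        self.counts[y] += 1


def prequential_accuracy(stream, model):
    """Test on each (x, y) pair first, then train on it, and return the accuracy."""
    correct = total = 0
    for x, y in stream:
        correct += int(model.predict(x) == y)   # test step
        model.learn(x, y)                       # train step
        total += 1
    return correct / total if total else 0.0


# Hypothetical stream of (features, label) pairs; features are unused here.
stream = [(None, 0), (None, 0), (None, 1), (None, 0), (None, 0)]
print(prequential_accuracy(stream, MajorityClass()))
```

Every instance contributes to both testing and training, so no separate hold-out set is needed and the score tracks the model as the stream evolves.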

    Towards handling temporal dependence in concept drift streams.

    Modern technological advancements have led to the production of an incomprehensible amount of data from a wide array of devices. A constant supply of new data provides an invaluable opportunity for access to qualitative and quantitative insights. Organisations recognise that data provides a means of mitigating risk and loss whilst maximising efficiency and profit. However, processing this data is not without its challenges. Much of this data is produced in an online environment. Real-time stream data is unbounded in size, variety and velocity. Data may arrive complete or with missing attributes, and data availability and persistence are limited to a small window of time. Classification methods and techniques that process offline data are not applicable to online data streams. Instead, new online classification methods have been developed. Research concerning the problematic and prevalent issue of concept drift has produced a considerable number of methods that allow online classifiers to adapt to changes in the stream distribution. However, recent research suggests that the presence of temporal dependence can cause misleading evaluation when accuracy is used as the core metric. This thesis investigates temporal dependence and its negative effects upon the classification of concept drift data. First, this thesis proposes a novel method for coping with temporal dependence during the classification of real-time data streams where concept drift is present. Results indicate that a statistics-based, selective resetting approach can reduce the impact of temporal dependence in concept drift streams without significant loss in predictive accuracy. Secondly, a new ensemble-based method, KTUE, that adopts the Kappa-Temporal statistic for vote weighting is suggested. Results show that this method is capable of outperforming some state-of-the-art ensemble methods in both temporally dependent and non-temporally dependent environments. Finally, this research proposes a novel algorithm for the simulation of temporally dependent concept drift data, which aims to help address the lack of established datasets available for evaluation. Experimental results show that temporal dependence can be injected into fabricated data streams using existing generation methods.
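The Kappa-Temporal statistic mentioned above compares a classifier against a "no-change" baseline that always predicts the previous true label. The sketch below follows the usual definition from the stream-learning literature, (p0 - pe) / (1 - pe), where p0 is the classifier's accuracy and pe is the baseline's accuracy. It is not the KTUE ensemble itself, and the function name `kappa_temporal` and the label sequences are hypothetical.

```python
def kappa_temporal(y_true, y_pred):
    """Kappa-Temporal statistic: (p0 - pe) / (1 - pe).

    p0 is the accuracy of the classifier and pe is the accuracy of a
    no-change baseline that predicts the previous true label. The first
    instance is skipped because it has no previous label.
    """
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    n = len(y_true) - 1
    p0 = sum(t == p for t, p in zip(y_true[1:], y_pred[1:])) / n
    pe = sum(prev == cur for prev, cur in zip(y_true, y_true[1:])) / n
    if pe == 1.0:
        return float("nan")   # undefined when the labels never change
    return (p0 - pe) / (1 - pe)


# Hypothetical temporally dependent labels: long runs of the same class.
y_true = [0, 0, 0, 1, 1, 1, 0, 0]
# A classifier that merely echoes the previous label looks accurate (5/7)
# but scores 0 on Kappa-Temporal because it adds nothing over the baseline.
y_echo = [0, 0, 0, 0, 1, 1, 1, 0]
print(kappa_temporal(y_true, y_echo))
```

A value near zero or below indicates that the classifier does no better than repeating the last label, which plain accuracy would not reveal on a temporally dependent stream.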