185 research outputs found

    Multilayer Cyberattacks Identification and Classification Using Machine Learning in Internet of Blockchain (IoBC)-Based Energy Networks

    Get PDF
    The world's need for energy is rising due to factors like population growth, economic expansion, and technological breakthroughs. However, there are major consequences when gas and coal are burnt to meet this surge in energy needs. Although these fossil fuels are still essential for meeting energy demands, their combustion releases a large amount of carbon dioxide and other pollutants into the atmosphere. This significantly jeopardizes community health in addition to exacerbating climate change, thus it is essential need to move swiftly to incorporate renewable energy sources by employing advanced information and communication technologies. However, this change brings up several security issues emphasizing the need for innovative cyber threats detection and prevention solutions. Consequently, this study presents bigdata sets obtained from the solar and wind powered distributed energy systems through the blockchain-based energy networks in the smart grid (SG). A hybrid machine learning (HML) model that combines both the Deep Learning (DL) and Long-Short-Term-Memory (LSTM) models characteristics is developed and applied to identify the unique patterns of Denial of Service (DoS) and Distributed Denial of Service (DDoS) cyberattacks in the power generation, transmission, and distribution processes. The presented big datasets are essential and significantly helps in identifying and classifying cyberattacks, leading to predicting the accurate energy systems behavior in the SG.© 2024 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/)fi=vertaisarvioitu|en=peerReviewed

    Feature Space Modeling for Accurate and Efficient Learning From Non-Stationary Data

    Get PDF
    A non-stationary dataset is one whose statistical properties such as the mean, variance, correlation, probability distribution, etc. change over a specific interval of time. On the contrary, a stationary dataset is one whose statistical properties remain constant over time. Apart from the volatile statistical properties, non-stationary data poses other challenges such as time and memory management due to the limitation of computational resources mostly caused by the recent advancements in data collection technologies which generate a variety of data at an alarming pace and volume. Additionally, when the collected data is complex, managing data complexity, emerging from its dimensionality and heterogeneity, can pose another challenge for effective computational learning. The problem is to enable accurate and efficient learning from non-stationary data in a continuous fashion over time while facing and managing the critical challenges of time, memory, concept change, and complexity simultaneously. Feature space modeling is one of the most effective solutions to address this problem. For non-stationary data, selecting relevant features is even more critical than stationary data due to the reduction of feature dimension which can ensure the best use a computational resource to produce higher accuracy and efficiency by data mining algorithms. In this dissertation, we investigated a variety of feature space modeling techniques to improve the overall performance of data mining algorithms. In particular, we built Relief based feature sub selection method in combination with data complexity iv analysis to improve the classification performance using ovarian cancer image data collected in a non-stationary batch mode. We also collected time series health sensor data in a streaming environment and deployed feature space transformation using Singular Value Decomposition (SVD). This led to reduced dimensionality of feature space resulting in better accuracy and efficiency produced by Density Ration Estimation Method in identifying potential change points in data over time. We have also built an unsupervised feature space modeling using matrix factorization and Lasso Regression which was successfully deployed in conjugate with Relative Density Ratio Estimation to address the botnet attacks in a non-stationary environment. Relief based feature model improved 16% accuracy of Fuzzy Forest classifier. For change detection framework, we observed 9% improvement in accuracy for PCA feature transformation. Due to the unsupervised feature selection model, for 2% and 5% malicious traffic ratio, the proposed botnet detection framework exhibited average 20% better accuracy than One Class Support Vector Machine (OSVM) and average 25% better accuracy than Autoencoder. All these results successfully demonstrate the effectives of these feature space models. The fundamental theme that repeats itself in this dissertation is about modeling efficient feature space to improve both accuracy and efficiency of selected data mining models. Every contribution in this dissertation has been subsequently and successfully employed to capitalize on those advantages to solve real-world problems. Our work bridges the concepts from multiple disciplines ineffective and surprising ways, leading to new insights, new frameworks, and ultimately to a cross-production of diverse fields like mathematics, statistics, and data mining

    TOWARDS A HOLISTIC EFFICIENT STACKING ENSEMBLE INTRUSION DETECTION SYSTEM USING NEWLY GENERATED HETEROGENEOUS DATASETS

    Get PDF
    With the exponential growth of network-based applications globally, there has been a transformation in organizations\u27 business models. Furthermore, cost reduction of both computational devices and the internet have led people to become more technology dependent. Consequently, due to inordinate use of computer networks, new risks have emerged. Therefore, the process of improving the speed and accuracy of security mechanisms has become crucial.Although abundant new security tools have been developed, the rapid-growth of malicious activities continues to be a pressing issue, as their ever-evolving attacks continue to create severe threats to network security. Classical security techniquesfor instance, firewallsare used as a first line of defense against security problems but remain unable to detect internal intrusions or adequately provide security countermeasures. Thus, network administrators tend to rely predominantly on Intrusion Detection Systems to detect such network intrusive activities. Machine Learning is one of the practical approaches to intrusion detection that learns from data to differentiate between normal and malicious traffic. Although Machine Learning approaches are used frequently, an in-depth analysis of Machine Learning algorithms in the context of intrusion detection has received less attention in the literature.Moreover, adequate datasets are necessary to train and evaluate anomaly-based network intrusion detection systems. There exist a number of such datasetsas DARPA, KDDCUP, and NSL-KDDthat have been widely adopted by researchers to train and evaluate the performance of their proposed intrusion detection approaches. Based on several studies, many such datasets are outworn and unreliable to use. Furthermore, some of these datasets suffer from a lack of traffic diversity and volumes, do not cover the variety of attacks, have anonymized packet information and payload that cannot reflect the current trends, or lack feature set and metadata.This thesis provides a comprehensive analysis of some of the existing Machine Learning approaches for identifying network intrusions. Specifically, it analyzes the algorithms along various dimensionsnamely, feature selection, sensitivity to the hyper-parameter selection, and class imbalance problemsthat are inherent to intrusion detection. It also produces a new reliable dataset labeled Game Theory and Cyber Security (GTCS) that matches real-world criteria, contains normal and different classes of attacks, and reflects the current network traffic trends. The GTCS dataset is used to evaluate the performance of the different approaches, and a detailed experimental evaluation to summarize the effectiveness of each approach is presented. Finally, the thesis proposes an ensemble classifier model composed of multiple classifiers with different learning paradigms to address the issue of detection accuracy and false alarm rate in intrusion detection systems

    Game-Theoretic and Machine-Learning Techniques for Cyber-Physical Security and Resilience in Smart Grid

    Get PDF
    The smart grid is the next-generation electrical infrastructure utilizing Information and Communication Technologies (ICTs), whose architecture is evolving from a utility-centric structure to a distributed Cyber-Physical System (CPS) integrated with a large-scale of renewable energy resources. However, meeting reliability objectives in the smart grid becomes increasingly challenging owing to the high penetration of renewable resources and changing weather conditions. Moreover, the cyber-physical attack targeted at the smart grid has become a major threat because millions of electronic devices interconnected via communication networks expose unprecedented vulnerabilities, thereby increasing the potential attack surface. This dissertation is aimed at developing novel game-theoretic and machine-learning techniques for addressing the reliability and security issues residing at multiple layers of the smart grid, including power distribution system reliability forecasting, risk assessment of cyber-physical attacks targeted at the grid, and cyber attack detection in the Advanced Metering Infrastructure (AMI) and renewable resources. This dissertation first comprehensively investigates the combined effect of various weather parameters on the reliability performance of the smart grid, and proposes a multilayer perceptron (MLP)-based framework to forecast the daily number of power interruptions in the distribution system using time series of common weather data. Regarding evaluating the risk of cyber-physical attacks faced by the smart grid, a stochastic budget allocation game is proposed to analyze the strategic interactions between a malicious attacker and the grid defender. A reinforcement learning algorithm is developed to enable the two players to reach a game equilibrium, where the optimal budget allocation strategies of the two players, in terms of attacking/protecting the critical elements of the grid, can be obtained. In addition, the risk of the cyber-physical attack can be derived based on the successful attack probability to various grid elements. Furthermore, this dissertation develops a multimodal data-driven framework for the cyber attack detection in the power distribution system integrated with renewable resources. This approach introduces the spare feature learning into an ensemble classifier for improving the detection efficiency, and implements the spatiotemporal correlation analysis for differentiating the attacked renewable energy measurements from fault scenarios. Numerical results based on the IEEE 34-bus system show that the proposed framework achieves the most accurate detection of cyber attacks reported in the literature. To address the electricity theft in the AMI, a Distributed Intelligent Framework for Electricity Theft Detection (DIFETD) is proposed, which is equipped with Benford’s analysis for initial diagnostics on large smart meter data. A Stackelberg game between utility and multiple electricity thieves is then formulated to model the electricity theft actions. Finally, a Likelihood Ratio Test (LRT) is utilized to detect potentially fraudulent meters

    Developing new techniques to analyse and classify EEG signals

    Get PDF
    A massive amount of biomedical time series data such as Electroencephalograph (EEG), electrocardiography (ECG), Electromyography (EMG) signals are recorded daily to monitor human performance and diagnose different brain diseases. Effectively and accurately analysing these biomedical records is considered a challenge for researchers. Developing new techniques to analyse and classify these signals can help manage, inspect and diagnose these signals. In this thesis novel methods are proposed for EEG signals classification and analysis based on complex networks, a statistical model and spectral graph wavelet transform. Different complex networks attributes were employed and studied in this thesis to investigate the main relationship between behaviours of EEG signals and changes in networks attributes. Three types of EEG signals were investigated and analysed; sleep stages, epileptic and anaesthesia. The obtained results demonstrated the effectiveness of the proposed methods for analysing these three EEG signals types. The methods developed were applied to score sleep stages EEG signals, and to analyse epileptic, as well as anaesthesia EEG signals. The outcomes of the project will help support experts in the relevant medical fields and decrease the cost of diagnosing brain diseases

    Applied Metaheuristic Computing

    Get PDF
    For decades, Applied Metaheuristic Computing (AMC) has been a prevailing optimization technique for tackling perplexing engineering and business problems, such as scheduling, routing, ordering, bin packing, assignment, facility layout planning, among others. This is partly because the classic exact methods are constrained with prior assumptions, and partly due to the heuristics being problem-dependent and lacking generalization. AMC, on the contrary, guides the course of low-level heuristics to search beyond the local optimality, which impairs the capability of traditional computation methods. This topic series has collected quality papers proposing cutting-edge methodology and innovative applications which drive the advances of AMC

    Statistical learning in network architecture

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008.This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.Includes bibliographical references (p. 167-[177]).The Internet has become a ubiquitous substrate for communication in all parts of society. However, many original assumptions underlying its design are changing. Amid problems of scale, complexity, trust and security, the modern Internet accommodates increasingly critical services. Operators face a security arms race while balancing policy constraints, network demands and commercial relationships. This thesis espouses learning to embrace the Internet's inherent complexity, address diverse problems and provide a component of the network's continued evolution. Malicious nodes, cooperative competition and lack of instrumentation on the Internet imply an environment with partial information. Learning is thus an attractive and principled means to ensure generality and reconcile noisy, missing or conflicting data. We use learning to capitalize on under-utilized information and infer behavior more reliably, and on faster time-scales, than humans with only local perspective. Yet the intrinsic dynamic and distributed nature of networks presents interesting challenges to learning. In pursuit of viable solutions to several real-world Internet performance and security problems, we apply statistical learning methods as well as develop new, network-specific algorithms as a step toward overcoming these challenges. Throughout, we reconcile including intelligence at different points in the network with the end-to-end arguments. We first consider learning as an end-node optimization for efficient peer-to-peer overlay neighbor selection and agent-centric latency prediction. We then turn to security and use learning to exploit fundamental weaknesses in malicious traffic streams. Our method is both adaptable and not easily subvertible. Next, we show that certain security and optimization problems require collaboration, global scope and broad views.(cont.) We employ ensembles of weak classifiers within the network core to mitigate IP source address forgery attacks, thereby removing incentive and coordination issues surrounding existing practice. Finally, we argue for learning within the routing plane as a means to directly optimize and balance provider and user objectives. This thesis thus serves first to validate the potential for using learning methods to address several distinct problems on the Internet and second to illuminate design principles in building such intelligent systems in network architecture.by Robert Edward Beverly, IV.Ph.D
    • …
    corecore