31 research outputs found

    Optimal Software Patching Plan for PMUs

    Phasor measurement units (PMUs) deployed to monitor the state of an electrical grid need to be patched from time to time to prevent attacks that exploit vulnerabilities in the software. Applying some of these patches requires a PMU reboot, which takes the PMU offline for some time. If the PMU placement provides enough redundancy, it is possible to patch a set of PMUs at a time while maintaining full system observability. The challenge is then to find a patching plan that guarantees that the patch is rolled out to all PMUs in the smallest number of rounds possible while full system observability is maintained at all times. We show that this problem can be formulated as a sensor patching problem, which we demonstrate to be NP-complete. However, if the grid forms a tree, we show that the minimum number of rounds is two and we provide a polynomial-time algorithm that finds an optimal patching plan. For the non-tree case, we formulate the problem as a binary integer linear programming (BILP) problem and solve it using an ILP solver. We also propose a heuristic algorithm to find an approximate solution to the patching problem for grids that are too large to be solved by an ILP solver. Through simulation, we compare the performance of the ILP solver and the heuristic algorithm over different bus systems.
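The patching constraint described in this abstract can be illustrated with a minimal sketch. It assumes a simplified observability rule (a PMU observes its own bus and every adjacent bus), which is an assumption made here and not necessarily the model used in the paper. The names `observed_buses`, `is_valid_plan`, and `greedy_plan` are hypothetical, and the greedy routine is a generic illustration, not the authors' heuristic or their polynomial-time tree algorithm.

```python
from typing import Dict, List, Set

# Adjacency map of the grid: bus id -> set of neighbouring bus ids.
Graph = Dict[int, Set[int]]


def observed_buses(graph: Graph, online_pmus: Set[int]) -> Set[int]:
    """Buses seen by the online PMUs under the assumed rule:
    a PMU observes its own bus and all adjacent buses."""
    seen: Set[int] = set()
    for bus in online_pmus:
        seen.add(bus)
        seen |= graph[bus]
    return seen


def is_valid_plan(graph: Graph, pmus: Set[int], rounds: List[Set[int]]) -> bool:
    """A plan is valid if every round keeps the grid fully observable
    and every PMU is patched in at least one round."""
    patched: Set[int] = set()
    for offline in rounds:
        if not offline <= pmus:
            return False
        if observed_buses(graph, pmus - offline) != set(graph):
            return False
        patched |= offline
    return patched == pmus


def greedy_plan(graph: Graph, pmus: Set[int]) -> List[Set[int]]:
    """Illustrative greedy heuristic: in each round, take unpatched PMUs
    offline one by one as long as full observability is preserved."""
    remaining = set(pmus)
    rounds: List[Set[int]] = []
    while remaining:
        offline: Set[int] = set()
        for pmu in sorted(remaining):
            if observed_buses(graph, pmus - (offline | {pmu})) == set(graph):
                offline.add(pmu)
        if not offline:
            raise ValueError("no PMU can go offline without losing observability")
        rounds.append(offline)
        remaining -= offline
    return rounds


if __name__ == "__main__":
    # Hypothetical three-bus path 1 - 2 - 3 with a PMU on every bus.
    grid = {1: {2}, 2: {1, 3}, 3: {2}}
    plan = greedy_plan(grid, {1, 2, 3})
    print(plan, is_valid_plan(grid, {1, 2, 3}, plan))
```

On the hypothetical three-bus path, the greedy routine patches the two end buses together and the middle bus afterwards, which matches the two-round figure the abstract states for trees. The greedy choice is not guaranteed to be optimal on general grids.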

    Cyber Defense Remediation in Energy Delivery Systems

    The integration of Information Technology (IT) and Operational Technology (OT) in Cyber-Physical Systems (CPS) has resulted in increased efficiency and facilitated real-time information acquisition, processing, and decision making. However, the increase in automation technology and the use of the internet for connecting, remote controlling, and supervising systems and facilities has also increased the likelihood of cybersecurity threats that can impact the safety of humans and property. There is a need to assess cybersecurity risks in the power grid, nuclear plants, chemical factories, etc. to gain insight into the likelihood of safety hazards. Quantitative cybersecurity risk assessment will lead to informed cyber defense remediation and will ensure the presence of a mitigation plan to prevent safety hazards. In this dissertation, using Energy Delivery Systems (EDS) as a use case to contextualize a CPS, we address key research challenges in managing cyber risk for cyber defense remediation. First, we developed a platform for modeling and analyzing the effect of cyber threats and random system faults on EDS's safety that could lead to catastrophic damages. We developed a data-driven attack graph and fault graph-based model to characterize the exploitability and impact of threats in EDS. We created an operational impact assessment to quantify the damages. Finally, we developed a strategic response decision capability that presents optimal mitigation actions and policies that balance the tradeoff between operational resilience (tactical risk) and strategic risk. Next, we addressed the challenge of management of tactical risk based on a prioritized cyber defense remediation plan. A prioritized cyber defense remediation plan is critical for effective risk management in EDS.
Due to EDS's complexity in terms of the heterogeneous nature of blending IT and OT and Industrial Control System (ICS), scale, and critical process tasks, prioritized remediation should be applied gradually to protect critical assets. We proposed a methodology for prioritizing cyber risk remediation plans by detecting and evaluating critical EDS nodes' paths. We evaluated critical nodes' characteristics based on the nodes' architectural positions, measures of centrality based on the nodes' connectivity and frequency of network traffic, as well as the controlled amount of electrical power. The model also examines the relationship between cost models of budget allocation for removing vulnerabilities on critical nodes and their impact on gradual readiness. The proposed cost models were empirically validated in an existing network ICS test-bed computing nodes' criticality. Two cost models were examined, and although they varied, we found no correlation between the type of cost model and either the most damageable attack path or critical nodes' readiness. Finally, we proposed a time-varying dynamical model for the cyber defense remediation in EDS. We utilize the stochastic evolutionary game model to simulate the dynamic adversarial cyber attack-defense process. We leveraged the Logit Quantal Response Dynamics (LQRD) model to quantify real-world players' cognitive differences. We proposed the optimal decision making approach by calculating the stable evolutionary equilibrium and balancing defense costs and benefits. Case studies on EDS indicate that the proposed method can help the defender predict possible attack action, select the related optimal defense strategy over time, and gain the maximum defense payoffs. We also leveraged software-defined networking (SDN) in EDS for dynamical cyber defense remediation.
We presented an approach to aid the dynamic selection of security controls in an SDN-enabled EDS and achieve tradeoffs between providing security and Quality of Service (QoS). We modeled the security costs based on end-to-end packet delay and throughput. We proposed a non-dominated sorting based multi-objective optimization framework which can be implemented within an SDN controller to address the joint problem of optimizing between security and QoS parameters while reducing the time complexity to O(MN^2). Here, M is the number of objective functions and N is the population size of each generation. We presented simulation results that illustrate how data availability and data integrity can be achieved while maintaining QoS constraints.
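To make the non-dominated sorting step and its O(MN^2) cost concrete, the following is a minimal sketch of the standard fast non-dominated sort used in NSGA-II-style optimizers. It is not taken from the dissertation. The function names and the choice to minimize every objective are assumptions for illustration; an objective such as throughput would have to be negated before use.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated_sort(points: List[Sequence[float]]) -> List[List[int]]:
    """Group point indices into Pareto fronts.

    Each of the N*(N-1) ordered pairs is compared across M objectives,
    which gives the O(M N^2) cost.
    """
    n = len(points)
    dominated_by: List[List[int]] = [[] for _ in range(n)]  # indices that i dominates
    domination_count = [0] * n  # number of points that dominate i
    fronts: List[List[int]] = [[]]

    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if dominates(points[i], points[j]):
                dominated_by[i].append(j)
            elif dominates(points[j], points[i]):
                domination_count[i] += 1
        if domination_count[i] == 0:
            fronts[0].append(i)

    k = 0
    while fronts[k]:
        next_front: List[int] = []
        for i in fronts[k]:
            for j in dominated_by[i]:
                domination_count[j] -= 1
                if domination_count[j] == 0:
                    next_front.append(j)
        fronts.append(next_front)
        k += 1
    return fronts[:-1]  # drop the trailing empty front


if __name__ == "__main__":
    # Hypothetical (security cost, packet delay) pairs for five candidates.
    candidates = [(1, 5), (2, 2), (5, 1), (3, 3), (4, 4)]
    print(non_dominated_sort(candidates))
```

The first front holds the candidates that no other candidate beats on both objectives. Later fronts are progressively dominated and would be ranked lower during selection.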

    Achieving the Dispatchability of Distribution Feeders through Prosumers Data Driven Forecasting and Model Predictive Control of Electrochemical Storage

    We propose and experimentally validate a control strategy to dispatch the operation of a distribution feeder interfacing heterogeneous prosumers by using a grid-connected battery energy storage system (BESS) as a controllable element coupled with a minimally invasive monitoring infrastructure. It consists of a two-stage procedure: day-ahead dispatch planning, where the feeder 5-minute average power consumption trajectory for the next day of operation (called the dispatch plan) is determined, and intra-day/real-time operation, where the mismatch with respect to the dispatch plan is corrected by applying receding horizon model predictive control (MPC) to decide the BESS charging/discharging profile while accounting for operational constraints. The consumption forecast necessary to compute the dispatch plan and the battery model for the MPC algorithm are built by applying adaptive data-driven methodologies. The discussed control framework currently operates on a daily basis to dispatch the operation of a 20 kV feeder of the EPFL university campus using a 750 kW/500 kWh lithium titanate BESS. Comment: Submitted for publication, 201

    Fault-tolerant wide-area control of power systems

    In this thesis, the stability and performance of closed-loop systems following the loss of sensors or feedback signals (sensor faults) are studied. The objective is to guarantee stability in the face of sensor faults while optimising performance under nominal (no sensor fault) conditions. One of the main contributions of this work is to deal effectively with the combinatorial binary nature of the problem when the number of sensors is large. Several fault-tolerant controller and observer architectures that are suitable for different applications are proposed and their effectiveness demonstrated. The problems are formulated in terms of the existence of feasible solutions to linear matrix inequalities. The formulations presented in this work are described in a general form and can be applied to a large class of systems. In particular, the use of fault-tolerant architectures for damping inter-area oscillations in power systems using wide-area signals has been demonstrated. As an extension of the proposed formulations, regional pole placement to enhance the damping of inter-area modes has been incorporated. The objective is to achieve specified damping ratios for the inter-area modes and maximise the closed-loop performance under nominal conditions while guaranteeing stability for all possible combinations of sensor faults. The performance of the proposed fault-tolerant architectures is validated through extensive nonlinear simulations using a simplified equivalent model of the Nordic power system. Open Acces

    Enhancing Cyber-Resiliency of DER-based SmartGrid: A Survey

    The rapid development of information and communications technology has enabled the use of digitally controlled and software-driven distributed energy resources (DERs) to improve the flexibility and efficiency of power supply, and support grid operations. However, this evolution also exposes geographically dispersed DERs to cyber threats, including hardware and software vulnerabilities, communication issues, and personnel errors. Therefore, enhancing the cyber-resiliency of DER-based smart grid - the ability to survive successful cyber intrusions - is becoming increasingly vital and has garnered significant attention from both industry and academia. In this survey, we aim to provide a systematic and comprehensive review regarding the cyber-resiliency enhancement (CRE) of DER-based smart grid. Firstly, an integrated threat modeling method is tailored for the hierarchical DER-based smart grid with special emphasis on vulnerability identification and impact analysis. Then, the defense-in-depth strategies encompassing prevention, detection, mitigation, and recovery are comprehensively surveyed, systematically classified, and rigorously compared. A CRE framework is subsequently proposed to incorporate the five key resiliency enablers. Finally, challenges and future directions are discussed in detail. The overall aim of this survey is to demonstrate the development trend of CRE methods and motivate further efforts to improve the cyber-resiliency of DER-based smart grid. Comment: Submitted to IEEE Transactions on Smart Grid for Publication Consideratio

    Analysis of Remote Tripping Command Injection Attacks in Industrial Control Systems Through Statistical and Machine Learning Methods

    In the past decade, cyber operations have been increasingly utilized to further policy goals of state-sponsored actors to shift the balance of politics and power on a global scale. One of the ways this has been evidenced is through the exploitation of electric grids via cyber means. A remote tripping command injection attack is one of the types of attacks that could have devastating effects on the North American power grid. To better understand these attacks and create detection axioms to both quickly identify and mitigate the effects of a remote tripping command injection attack, a dataset comprising 128 variables (primarily synchrophasor measurements) was analyzed via statistical methods and machine learning algorithms in RStudio and WEKA software, respectively. While statistical methods were not successful due to the non-linearity and complexity of the dataset, machine learning algorithms surpassed accuracy metrics established in previous research given a simplified dataset of the specified attack and normal operational data. This research allows future cybersecurity researchers to better understand remote tripping command injection attacks in comparison to normal operational conditions. Further, incorporating this analysis has the potential to increase detection and thus mitigate risk to the North American power grid in future work.

    Domain-Specific Optimization For Machine Learning System

    The machine learning (ML) system has been an indispensable part of the ML ecosystem in recent years. The rapid growth of ML brings new system challenges, such as the need to handle larger-scale data and computation and the requirements for higher execution performance and lower resource usage, stimulating the demand for improving ML systems. General-purpose system optimization is widely used but brings limited benefits because ML applications vary in execution behaviors based on their algorithms, input data, and configurations. It is difficult to perform comprehensive ML system optimizations without application-specific information. Therefore, domain-specific optimization, a method that optimizes particular types of ML applications based on their unique characteristics, is necessary for advanced ML systems. This dissertation performs domain-specific system optimizations for three important ML applications: graph-based applications, SGD-based applications, and Python-based applications. For SGD-based applications, this dissertation proposes a lossy compression scheme for application checkpoint constructions (called LC-Checkpoint). LC-Checkpoint intends to simultaneously maximize the compression rate of checkpoints and reduce the recovery cost of SGD-based training processes. Extensive experiments show that LC-Checkpoint achieves a high compression rate with a lower recovery cost than a state-of-the-art algorithm. For kernel regression applications, this dissertation designs and implements parallel software designed to handle million-scale datasets. The software is evaluated on two million-scale downstream applications (i.e., an equity return forecasting problem on the US stock dataset, and an image classification problem on the ImageNet dataset) to demonstrate its efficacy and efficiency.
For graph-based applications, this dissertation introduces ATMem, a runtime framework to optimize application data placement on heterogeneous memory systems. ATMem aims to maximize the utilization of the fast (small-capacity) memory by placing on it only the critical data regions that yield the highest performance gains. Experimental results show that ATMem achieves significant speedup over the baseline that places all data on slow (large-capacity) memory, while placing only a minority of the data on the fast memory. The future research direction is to adapt ML algorithms for software systems/architectures and deeply bind the design of ML algorithms to the implementation of ML systems, to achieve optimal solutions for ML applications.

    Reliable and Robust Cyber-Physical Systems for Real-Time Control of Electric Grids

    Real-time control of electric grids is a novel approach to handling the increasing penetration of distributed and volatile energy generation brought about by renewables. Such control occurs in cyber-physical systems (CPSs), in which software agents maintain safe and optimal grid operation by exchanging messages over a communication network. We focus on CPSs with a centralized controller that receives measurements from the various resources in the grid, performs real-time computations, and issues setpoints. Long-term deployment of such CPSs makes them susceptible to software agent faults, such as crashes and delays of controllers and unresponsiveness of resources, and to communication network faults, such as packet losses, delays, and reordering. CPS controllers must provide correct control in the presence of external non-idealities, i.e., be robust, and in the presence of controller faults, i.e., be reliable. In this thesis, we design, test, and deploy solutions that achieve these goals for real-time CPSs. We begin by abstracting a CPS for electric grids into four layers: the control layer, the network layer, the sensing and actuation layer, and the physical layer. Then, we provide a model for the components in each layer, and for the interactions among them. This enables us to formally define the properties required for reliable and robust CPSs. We propose two mechanisms, Robuster and intentionality clocks, for making a single controller robust to unresponsive resources and non-ideal network conditions. These mechanisms enable the controller to compute and issue setpoints even when some measurements are missing, rather than to have to wait for measurements from all resources. We show that our proposed mechanisms guarantee grid safety and outperform state-of-the-art alternatives. Then, we propose Axo: a framework for crash- and delay-fault tolerance via active replication of the controller. 
Axo ensures that faults in the controller replicas are masked from the resources, and it provides a mechanism for detecting and recovering faulty replicas. We prove the reliable validity and availability guarantees of Axo and derive the bounds on its detection and recovery time. We showcase the benefits of Axo via a stability analysis of an inverted pendulum system. Solutions based on active replication must guarantee that the replicas issue consistent setpoints. Traditional consensus-based schemes for achieving this are not suitable for real-time CPSs, as they incur high latency and low availability. We propose Quarts, an agreement mechanism that guarantees consistency and a low bounded latency overhead. We show, via extensive simulations, that Quarts provides an availability at least an order of magnitude higher than state-of-the-art solutions. In order to test the effect of our proposed solutions on electric grids, we developed T-RECS, a virtual commissioning tool for software-based control of electric grids. T-RECS enables us to test the proper functioning of the software agents both in ideal and faulty conditions. This provides insight into the effect of faults on the grid and helps us to evaluate the impact of our reliability solutions. We show how our proposed solutions fit together, and that they can be used to design a reliable and robust CPS for real-time control of electric grids. To this end, we study a CPS with COMMELEC, a real-time control framework for electric grids via explicit power setpoints. We analyze the reliability issues..