
    Machine Learning for the New York City Power Grid

    Power companies can benefit from the use of knowledge discovery methods and statistical machine learning for preventive maintenance. We introduce a general process for transforming historical electrical grid data into models that aim to predict the risk of failures for components and systems. These models can be used directly by power companies to assist with prioritization of maintenance and repair work. Specialized versions of this process are used to produce (1) feeder failure rankings, (2) cable, joint, terminator, and transformer rankings, (3) feeder Mean Time Between Failure (MTBF) estimates, and (4) manhole events vulnerability rankings. The process in its most general form can handle diverse, noisy sources that are historical (static), semi-real-time, or real-time, incorporates state-of-the-art machine learning algorithms for prioritization (supervised ranking or MTBF), and includes an evaluation of results via cross-validation and blind test. Above and beyond the ranked lists and MTBF estimates are business management interfaces that allow the prediction capability to be integrated directly into corporate planning and decision support; such interfaces rely on several important properties of our general modeling approach: that machine learning features are meaningful to domain experts, that the processing of data is transparent, and that prediction results are accurate enough to support sound decision making. We discuss the challenges in working with historical electrical grid data that were not designed for predictive purposes. The “rawness” of these data contrasts with the accuracy of the statistical models that can be obtained from the process; these models are sufficiently accurate to assist in maintaining New York City's electrical grid.
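
    As a rough illustration of the kind of output this process produces, the sketch below estimates a simple per-feeder MTBF from historical outage records and orders feeders by risk. Column names and the data handling are hypothetical placeholders, not the authors' actual pipeline, which uses far richer features and supervised ranking models.

```python
# Minimal sketch: per-feeder MTBF from historical outage records and a
# prioritized list. "feeder_id"/"failure_time" are illustrative names.
from collections import defaultdict
from datetime import datetime

def estimate_mtbf(outages, observation_days):
    """outages: iterable of (feeder_id, failure_time) tuples."""
    counts = defaultdict(int)
    for feeder_id, _failure_time in outages:
        counts[feeder_id] += 1
    # MTBF ~ observation window divided by the number of observed failures.
    return {f: observation_days / n for f, n in counts.items()}

def rank_by_risk(mtbf):
    # Lower MTBF -> more frequent failures -> higher maintenance priority.
    return sorted(mtbf, key=mtbf.get)

outages = [
    ("feeder_17", datetime(2008, 7, 1)),
    ("feeder_17", datetime(2008, 8, 12)),
    ("feeder_03", datetime(2009, 1, 5)),
]
mtbf = estimate_mtbf(outages, observation_days=730)
print(rank_by_risk(mtbf))   # feeders ordered most to least failure-prone
```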

    Susceptibility Ranking of Electrical Feeders: A Case Study

    Ranking problems arise in a wide range of real-world applications where an ordering on a set of examples is preferred to a classification model. These applications include collaborative filtering, information retrieval, and ranking the components of a system by susceptibility to failure. In this paper, we present an ongoing project to rank the feeder cables of a major metropolitan area's electrical grid according to their susceptibility to outages. We describe our framework and the application of machine learning ranking methods, using scores from Support Vector Machines (SVM), RankBoost, and Martingale Boosting. Finally, we present our experimental results and the lessons learned from this challenging real-world application.
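
    A minimal sketch of score-based ranking in the spirit described here, assuming synthetic feeder features and failure labels; the actual study combines scores from SVMs, RankBoost, and Martingale Boosting on real outage data.

```python
# Sketch: rank feeders by an SVM decision score and evaluate the ordering
# with ROC AUC. Features and labels are synthetic placeholders.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))            # e.g., load, age, past outages, ...
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=200) > 0).astype(int)  # 1 = failed

model = LinearSVC(C=1.0, max_iter=5000).fit(X, y)
scores = model.decision_function(X)      # higher score -> more susceptible

ranking = np.argsort(-scores)            # feeder indices, most susceptible first
print("AUC of the ranking:", roc_auc_score(y, scores))
```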

    Analytics for Power Grid Distribution Reliability in New York City

    We summarize the first major effort to use analytics for preemptive maintenance and repair of an electrical distribution network. This is a large-scale, multiyear effort between scientists and students at Columbia University and the Massachusetts Institute of Technology and engineers from the Consolidated Edison Company of New York (Con Edison), which operates the world’s oldest and largest underground electrical system. Con Edison’s preemptive maintenance programs are less than a decade old and are made more effective by the analytics developing alongside them. Some of the data we used for our projects are historical records dating as far back as the 1880s, and some of the data are free-text documents typed by Con Edison dispatchers. The operational goals of this work are to assist with Con Edison’s preemptive inspection and repair program and its vented-cover replacement program. This has a continuing impact on public safety, operating costs, and the reliability of electrical service in New York City.

    An Event System Architecture for Scaling Scale-Resistant Services

    Large organizations are deploying ever-increasing numbers of networked compute devices, from utilities installing smart controllers on electricity distribution cables, to the military giving PDAs to soldiers, to corporations putting PCs on the desks of employees. These computers are often far more capable than is needed to accomplish their primary task, whether it be guarding a circuit breaker, displaying a map, or running a word processor. These devices would be far more useful if they had some awareness of the world around them: a controller that resists tripping a switch, knowing that it would set off a cascade failure; a PDA that warns its owner of imminent danger; a PC that exchanges reports of suspicious network activity with its peers to identify stealthy computer crackers. In order to provide these higher-level services, the devices need a model of their environment. The controller needs a model of the distribution grid, the PDA needs a model of the battlespace, and the PC needs a model of the network and of normal network and user behavior. Unfortunately, not only might models such as these require substantial computational resources, but generating and updating them is even more demanding. Model-building algorithms tend to be bad in three ways: requiring large amounts of CPU and memory to run, needing large amounts of data from the outside to stay up to date, and running so slowly that they can't keep up with any fast changes in the environment that might occur. We can solve these problems by reducing the scope of the model to the immediate locale of the device, since reducing the size of the model makes the problem of model generation much more tractable. But such models are also much less useful, having no knowledge of the wider system. This thesis proposes a better solution to this problem called Level of Detail, after the computer graphics technique of the same name. Instead of simplifying the representation of distant objects, however, we simplify less-important data. Compute devices in the system receive streams of data that are a mixture of detailed data from devices that directly affect them and data summaries (aggregated data) from less directly influential devices. The degree to which the data are aggregated (i.e., how much they are reduced) is determined by calculating an influence metric between the target device and the remote device. The smart controller thus receives a continuous stream of raw data from the adjacent transformer, but only an occasional small status report summarizing all the equipment in a neighborhood in another part of the city. This thesis describes the data distribution system, the aggregation functions, and the influence metrics that can be used to implement such a system. I also describe my current progress towards establishing a test environment and validating the concepts, and describe the next steps in the research plan.
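
    A toy sketch of the level-of-detail idea as described: an influence metric between the target device and a remote device decides whether the target receives raw readings or only an occasional aggregated summary. The metric, threshold, and device names below are illustrative, not the thesis's actual definitions.

```python
# Toy sketch of influence-based level of detail: high-influence neighbors
# stream raw readings, low-influence ones contribute only periodic summaries.
from statistics import mean

INFLUENCE_THRESHOLD = 0.5   # illustrative cutoff, not from the thesis

def influence(target, remote, grid_distance):
    # Example metric: influence decays with topological/electrical distance.
    return 1.0 / (1.0 + grid_distance(target, remote))

def build_feed(target, devices, readings, grid_distance):
    raw, to_summarize = [], []
    for dev in devices:
        if influence(target, dev, grid_distance) >= INFLUENCE_THRESHOLD:
            raw.extend(readings[dev])            # full-detail stream
        else:
            to_summarize.extend(readings[dev])   # will be aggregated
    summary = {"count": len(to_summarize),
               "mean": mean(to_summarize) if to_summarize else None}
    return raw, summary

readings = {"transformer_A": [13.8, 13.7], "remote_bank_Z": [12.9, 13.1, 13.0]}
dist = lambda t, r: 0 if r == "transformer_A" else 7   # hops, say
raw, summary = build_feed("controller_1", list(readings), readings, dist)
print(raw, summary)
```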

    Failure Analysis of the New York City Power Grid

    As the U.S. power grid transforms itself into a smart grid, it has become less reliable in recent years. Power grid failures lead to huge financial costs and affect people’s lives. Using statistical analysis and a holistic approach, this paper analyzes New York City power grid failures: failure patterns and climatic effects. Our findings include: higher peak electrical load increases the likelihood of power grid failure; electrical feeders sharing the same substation show increased subsequent failures; underground feeders fail less often than overhead feeders; cables and joints installed during certain years are more likely to fail; and higher temperatures lead to more power grid failures. We further suggest preventive maintenance, intertemporal consumption, and electrical load optimization for failure prevention. We also estimate that the predictability of power grid component failures correlates with the cycles of the North Atlantic Oscillation (NAO) Index.
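
    One simple way to express the kind of relationship reported here (higher peak load and higher temperature associated with more failures) is a logistic regression on failure indicators; the sketch below uses synthetic data and invented coefficients, not the paper's dataset or analysis.

```python
# Sketch: relating failure probability to peak load and temperature via
# logistic regression. Data are synthetic; the paper's analysis is richer.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
peak_load = rng.uniform(0.5, 1.2, size=500)      # fraction of feeder rating
temperature = rng.uniform(40, 100, size=500)     # degrees F
logit = -8 + 4 * peak_load + 0.05 * temperature  # invented ground truth
y = (rng.uniform(size=500) < 1 / (1 + np.exp(-logit))).astype(int)

X = np.column_stack([peak_load, temperature])
model = LogisticRegression(max_iter=1000).fit(X, y)
print(dict(zip(["peak_load", "temperature"], model.coef_[0])))
```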

    Improving System Reliability for Cyber-Physical Systems

    Cyber-physical systems (CPS) are systems featuring a tight combination of, and coordination between, the system's computational and physical elements. Cyber-physical systems range from critical infrastructure such as power grids and transportation systems to health and biomedical devices. System reliability, i.e., the ability of a system to perform its intended function under a given set of environmental and operational conditions for a given period of time, is a fundamental requirement of cyber-physical systems. An unreliable system often leads to disruption of service, financial cost, and even loss of human life. An important and prevalent type of cyber-physical system meets the following criteria: processing large amounts of data; employing software as a system component; running online continuously; and having an operator-in-the-loop because of human judgment and accountability requirements for safety-critical systems. This thesis aims to improve system reliability for this type of cyber-physical system. To that end, I present a system evaluation approach entitled automated online evaluation (AOE), a data-centric runtime monitoring and reliability evaluation approach that works in parallel with the cyber-physical system, conducting automated evaluation along the workflow of the system continuously using computational intelligence and self-tuning techniques, and providing operator-in-the-loop feedback on reliability improvement. For example, abnormal input and output data at or between the multiple stages of the system can be detected and flagged through data quality analysis. As a result, alerts can be sent to the operator-in-the-loop. The operator can then take actions and make changes to the system based on the alerts in order to achieve minimal system downtime and increased system reliability. One technique used by the approach is data quality analysis using computational intelligence, which evaluates data quality in an automated and efficient way in order to make sure the running system performs reliably as expected. Another technique used by the approach is self-tuning, which automatically self-manages and self-configures the evaluation system to ensure that it adapts itself based on changes in the system and feedback from the operator. To implement the proposed approach, I further present a system architecture called autonomic reliability improvement system (ARIS). This thesis investigates three hypotheses. First, I claim that automated online evaluation empowered by data quality analysis using computational intelligence can effectively improve system reliability for cyber-physical systems in the domain of interest as indicated above. In order to prove this hypothesis, a prototype system needs to be developed and deployed in various cyber-physical systems, while certain reliability metrics are required to measure the system reliability improvement quantitatively. Second, I claim that self-tuning can effectively self-manage and self-configure the evaluation system based on changes in the system and feedback from the operator-in-the-loop to improve system reliability. Third, I claim that the approach is efficient: it should not have a large impact on overall system performance and should introduce only minimal extra overhead to the cyber-physical system. Some performance metrics should be used to measure the efficiency and added overhead quantitatively.
Additionally, in order to conduct efficient and cost-effective automated online evaluation for data-intensive CPS, which require large volumes of data and devote much of their processing time to I/O and data manipulation, this thesis presents COBRA, a cloud-based reliability assurance framework. COBRA provides automated multi-stage runtime reliability evaluation along the CPS workflow using data relocation services, a cloud data store, data quality analysis, and process scheduling with self-tuning to achieve scalability, elasticity, and efficiency. Finally, in order to provide a generic way to compare and benchmark system reliability for CPS and to extend the approach described above, this thesis presents FARE, a reliability benchmark framework that employs a CPS reliability model and a set of methods and metrics for evaluation environment selection, failure analysis, and reliability estimation. The main contributions of this thesis include validation of the above hypotheses and empirical studies of the ARIS automated online evaluation system, the COBRA cloud-based reliability assurance framework for data-intensive CPS, and the FARE framework for benchmarking reliability of cyber-physical systems. This work has advanced the state of the art in CPS reliability research, expanded the body of knowledge in this field, and provided useful studies for further research.
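
    A minimal sketch of the data-quality-check idea behind the automated online evaluation described above: values flowing between pipeline stages are checked against expected ranges, and violations raise an alert for the operator-in-the-loop. The fields, bounds, and alert channel are illustrative placeholders, not ARIS itself.

```python
# Sketch: runtime data-quality checks between pipeline stages with an
# operator alert. Ranges and the alert channel are illustrative placeholders.
import logging

logging.basicConfig(level=logging.WARNING)
log = logging.getLogger("aoe")

EXPECTED_RANGES = {            # hypothetical per-field sanity bounds
    "voltage_kv": (0.0, 30.0),
    "load_mw": (0.0, 100.0),
}

def check_record(stage, record):
    """Flag out-of-range fields and alert the operator-in-the-loop."""
    ok = True
    for field, (lo, hi) in EXPECTED_RANGES.items():
        value = record.get(field)
        if value is None or not (lo <= value <= hi):
            log.warning("stage=%s field=%s value=%r outside [%s, %s]",
                        stage, field, value, lo, hi)
            ok = False
    return ok

check_record("feature-extraction", {"voltage_kv": 13.8, "load_mw": 250.0})
```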

    Semiparametric Estimation of a Gaptime-Associated Hazard Function

    This dissertation proposes a suite of novel Bayesian semiparametric estimators for a proportional hazard function associated with the gaptimes, or inter-arrival times, of a counting process in survival analysis. The Cox model is applied and extended in order to identify the subsequent effect of an event on future events in a system with renewal. The estimators may also be applied, without changes, to model the effect of a point treatment on subsequent events, as well as the effect of an event on subsequent events in neighboring subjects. These Bayesian semiparametric estimators are used to analyze the survival and reliability of the New York City electric grid. In particular, the phenomenon of "infant mortality," whereby electrical supply units are prone to immediate recurrence of failure, is flexibly quantified as a period of increased risk. In this setting, the Cox model removes the significant confounding effect of seasonality. Without this correction, infant mortality would be misestimated due to the exogenously increased failure rate during summer months and times of high demand. The structural assumptions of the Bayesian estimators allow the use and interpretation of sparse event data without the rigid constraints of standard parametric models used in reliability studies.
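
    For reference, a generic Cox-type form of a gaptime-associated hazard (the notation is illustrative, not necessarily the dissertation's exact specification): with t the time since the previous event (the gaptime) and X(t) collecting covariates such as a recent-failure indicator for infant mortality and seasonal terms,

```latex
% Generic Cox proportional hazards specification on the gaptime scale
\lambda\big(t \mid X(t)\big) \;=\; \lambda_0(t)\,\exp\!\big(\beta^{\top} X(t)\big)
```

    where \lambda_0 is a baseline hazard estimated semiparametrically (here, via Bayesian methods) and \beta quantifies effects such as the elevated risk shortly after a failure and the summer-demand seasonality the abstract describes.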

    A Framework for Quality Assurance of Machine Learning Applications

    Some machine learning applications are intended to learn properties of data sets where the correct answers are not already known to human users. It is challenging to test and debug such ML software, because there is no reliable test oracle. We describe a framework and collection of tools aimed at assisting with this problem. We present our findings from using the testing framework with three implementations of an ML ranking algorithm (all of which had bugs).
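
    As an illustration of testing without a reliable oracle, two common tactics are to compare independent implementations as pseudo-oracles and to check invariants such as permutation invariance of the ranking. The sketch below is a generic example under those assumptions; the functions are hypothetical stand-ins, not the framework's actual tooling.

```python
# Sketch: oracle-free checks for a ranking implementation. `rank_items` is a
# hypothetical function under test; `reference_rank` is a trusted-but-slow
# pseudo-oracle. Neither is from the paper's framework.
import random

def rank_items(items):            # stand-in for the implementation under test
    return sorted(items, key=lambda x: -x["score"])

def reference_rank(items):        # naive pseudo-oracle implementation
    return sorted(items, key=lambda x: -x["score"])

def test_permutation_invariance(items):
    shuffled = items[:]
    random.shuffle(shuffled)
    # Reordering the input must not change the resulting ranking.
    assert rank_items(items) == rank_items(shuffled)

def test_against_pseudo_oracle(items):
    # Two independent implementations should agree on the same input.
    assert rank_items(items) == reference_rank(items)

items = [{"id": i, "score": random.random()} for i in range(50)]
test_permutation_invariance(items)
test_against_pseudo_oracle(items)
```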