4 research outputs found

    Machine Learning for the New York City Power Grid

    Get PDF
    Power companies can benefit from the use of knowledge discovery methods and statistical machine learning for preventive maintenance. We introduce a general process for transforming historical electrical grid data into models that aim to predict the risk of failures for components and systems. These models can be used directly by power companies to assist with prioritization of maintenance and repair work. Specialized versions of this process are used to produce (1) feeder failure rankings, (2) cable, joint, terminator, and transformer rankings, (3) feeder Mean Time Between Failure (MTBF) estimates, and (4) manhole events vulnerability rankings. The process in its most general form can handle diverse, noisy, sources that are historical (static), semi-real-time, or real-time, incorporates state-of-the-art machine learning algorithms for prioritization (supervised ranking or MTBF), and includes an evaluation of results via cross-validation and blind test. Above and beyond the ranked lists and MTBF estimates are business management interfaces that allow the prediction capability to be integrated directly into corporate planning and decision support; such interfaces rely on several important properties of our general modeling approach: that machine learning features are meaningful to domain experts, that the processing of data is transparent, and that prediction results are accurate enough to support sound decision making. We discuss the challenges in working with historical electrical grid data that were not designed for predictive purposes. The “rawness” of these data contrasts with the accuracy of the statistical models that can be obtained from the process; these models are sufficiently accurate to assist in maintaining New York City's electrical grid

    Integer optimization methods for machine learning

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2012.This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.Cataloged from student-submitted PDF version of thesis.Includes bibliographical references (p. 129-137).In this thesis, we propose new mixed integer optimization (MIO) methods to ad- dress problems in machine learning. The first part develops methods for supervised bipartite ranking, which arises in prioritization tasks in diverse domains such as information retrieval, recommender systems, natural language processing, bioinformatics, and preventative maintenance. The primary advantage of using MIO for ranking is that it allows for direct optimization of ranking quality measures, as opposed to current state-of-the-art algorithms that use heuristic loss functions. We demonstrate using a number of datasets that our approach can outperform other ranking methods. The second part of the thesis focuses on reverse-engineering ranking models. This is an application of a more general ranking problem than the bipartite case. Quality rankings affect business for many organizations, and knowing the ranking models would allow these organizations to better understand the standards by which their products are judged and help them to create higher quality products. We introduce an MIO method for reverse-engineering such models and demonstrate its performance in a case-study with real data from a major ratings company. We also devise an approach to find the most cost-effective way to increase the rank of a certain product. In the final part of the thesis, we develop MIO methods to first generate association rules and then use the rules to build an interpretable classifier in the form of a decision list, which is an ordered list of rules. These are both combinatorially challenging problems because even a small dataset may yield a large number of rules and a small set of rules may correspond to many different orderings. We show how to use MIO to mine useful rules, as well as to construct a classifier from them. We present results in terms of both classification accuracy and interpretability for a variety of datasets.by Allison An Chang.Ph.D

    Ranking electrical feeders of the New York power grid

    No full text
    Ranking problems arise in a wide range of real world applications where an ordering on a set of examples is preferred to a classification model. These applications include collaborative filtering, information retrieval and ranking components of a system by susceptibility to failure. In this extended abstract, we present an ongoing project to rank the underground primary feeders of Consolidated Edison Company of New York according to their susceptibility to outages. We describe our framework and the application of different machine learning ranking methods along with experiments on concept drift detection. Between the high-voltage transmission system and the household-voltage secondary system, electricity is sent through primary distribution feeders, cables which move energy around the New York city area. There are three regions of interest 1 – Manhattan, Brooklyn/Queens and Bronx – which together have over 1000 feeders. Individual feeders fail with some regularity. On average there were over five failures per day over all regions during summer 2007. Our goal was to create an ordered list of feeders in the system ranked from most susceptible to failure to least susceptible. There are a number of challenges. The number of attributes is very large, while failures are relatively rare. The system is believed to exhibit concept drift, where the causes of failures change significantly over the course of the year. Many of the attributes are in the form of time series which must be aggregated. Other attributes map to feeder sub-components (e.g. cable sections, averagin
    corecore