
4 research outputs found

Machine Learning for the New York City Power Grid

Author: Anderson Roger N.
Boulanger Albert
Chow Maggie
Dutta Haimonti
Gross Philip N.
Huang Bert
Ierome Steve
Isaac Delfina F.
Kressner Arthur
Passonneau Rebecca J.
Radeva Axinia
Rudin Cynthia
Salleb-Aouissi Ansaf
Waltz David
Wu Leon
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/05/2011

Power companies can benefit from the use of knowledge discovery methods and statistical machine learning for preventive maintenance. We introduce a general process for transforming historical electrical grid data into models that aim to predict the risk of failures for components and systems. These models can be used directly by power companies to assist with prioritization of maintenance and repair work. Specialized versions of this process are used to produce (1) feeder failure rankings, (2) cable, joint, terminator, and transformer rankings, (3) feeder Mean Time Between Failure (MTBF) estimates, and (4) manhole events vulnerability rankings. The process in its most general form can handle diverse, noisy sources that are historical (static), semi-real-time, or real-time; incorporates state-of-the-art machine learning algorithms for prioritization (supervised ranking or MTBF); and includes an evaluation of results via cross-validation and blind test. Above and beyond the ranked lists and MTBF estimates are business management interfaces that allow the prediction capability to be integrated directly into corporate planning and decision support. Such interfaces rely on several important properties of our general modeling approach: that machine learning features are meaningful to domain experts, that the processing of data is transparent, and that prediction results are accurate enough to support sound decision making. We discuss the challenges in working with historical electrical grid data that were not designed for predictive purposes. The “rawness” of these data contrasts with the accuracy of the statistical models that can be obtained from the process; these models are sufficiently accurate to assist in maintaining New York City's electrical grid.
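As a rough illustration of what a feeder MTBF estimate and a risk ranking look like, the sketch below computes an empirical MTBF (observation time divided by number of failures) per feeder and sorts feeders from most to least failure-prone. This is not the learned model described in the abstract; the feeder IDs, failure days, and function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def mtbf_by_feeder(failures, observation_days):
    """Return {feeder_id: empirical MTBF in days}.

    failures: iterable of (feeder_id, day_of_failure) pairs (hypothetical data).
    Feeders with no recorded failure do not appear in the result.
    """
    counts = defaultdict(int)
    for feeder_id, _day in failures:
        counts[feeder_id] += 1
    # Empirical MTBF = total observed time / number of failures.
    return {fid: observation_days / n for fid, n in counts.items()}


def rank_by_risk(mtbf):
    """Order feeders from lowest MTBF (most failure-prone) to highest."""
    return sorted(mtbf, key=lambda fid: (mtbf[fid], fid))


# Hypothetical failure log over one year of observation.
failures = [("F1", 10), ("F1", 40), ("F1", 200),
            ("F2", 150),
            ("F3", 30), ("F3", 300)]
mtbf = mtbf_by_feeder(failures, observation_days=365)
ranking = rank_by_risk(mtbf)
print(ranking)
```

A counting estimate like this ignores the feeder attributes and noisy data sources the abstract emphasizes; it only shows the shape of the output (a per-feeder number and an ordered list) that such a process produces.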

DSpace@MIT

Crossref


Semiparametric Estimation of a Gaptime-Associated Hazard Function

Author: Teravainen Timothy
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/2014

This dissertation proposes a suite of novel Bayesian semiparametric estimators for a proportional hazard function associated with the gaptimes, or inter-arrival times, of a counting process in survival analysis. The Cox model is applied and extended in order to identify the subsequent effect of an event on future events in a system with renewal. The estimators may also be applied, without changes, to model the effect of a point treatment on subsequent events, as well as the effect of an event on subsequent events in neighboring subjects. These Bayesian semiparametric estimators are used to analyze the survival and reliability of the New York City electric grid. In particular, the phenomenon of "infant mortality," whereby electrical supply units are prone to immediate recurrence of failure, is flexibly quantified as a period of increased risk. In this setting, the Cox model removes the significant confounding effect of seasonality. Without this correction, infant mortality would be misestimated due to the exogenously increased failure rate during summer months and times of high demand. The structural assumptions of the Bayesian estimators allow the use and interpretation of sparse event data without the rigid constraints of standard parametric models used in reliability studies.
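For orientation, the standard Cox proportional hazards model that the abstract builds on can be written as follows. This is the textbook form only, not the dissertation's gaptime-associated extension; the symbols (baseline hazard h_0, covariate vector x, coefficient vector β) are the conventional ones and are assumed here rather than taken from the dissertation.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right),
\qquad
\frac{h(t \mid x_1)}{h(t \mid x_2)} = \exp\!\left(\beta^{\top}(x_1 - x_2)\right)
```

Because the hazard ratio on the right does not depend on t, any time-varying factor shared by all units is absorbed into the unspecified baseline h_0(t). That is one plausible reading of how the Cox model handles seasonality in the setting the abstract describes.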

Columbia University Academic Commons

Integer optimization methods for machine learning

Author: Chang Allison An
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2012

Thesis (Ph.D.)--Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2012. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (p. 129-137).

In this thesis, we propose new mixed integer optimization (MIO) methods to address problems in machine learning. The first part develops methods for supervised bipartite ranking, which arises in prioritization tasks in diverse domains such as information retrieval, recommender systems, natural language processing, bioinformatics, and preventative maintenance. The primary advantage of using MIO for ranking is that it allows for direct optimization of ranking quality measures, as opposed to current state-of-the-art algorithms that use heuristic loss functions. We demonstrate using a number of datasets that our approach can outperform other ranking methods. The second part of the thesis focuses on reverse-engineering ranking models. This is an application of a more general ranking problem than the bipartite case. Quality rankings affect business for many organizations, and knowing the ranking models would allow these organizations to better understand the standards by which their products are judged and help them to create higher quality products. We introduce an MIO method for reverse-engineering such models and demonstrate its performance in a case study with real data from a major ratings company. We also devise an approach to find the most cost-effective way to increase the rank of a certain product. In the final part of the thesis, we develop MIO methods to first generate association rules and then use the rules to build an interpretable classifier in the form of a decision list, which is an ordered list of rules. These are both combinatorially challenging problems because even a small dataset may yield a large number of rules and a small set of rules may correspond to many different orderings. We show how to use MIO to mine useful rules, as well as to construct a classifier from them. We present results in terms of both classification accuracy and interpretability for a variety of datasets. by Allison An Chang. Ph.D.
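To make the notion of a decision list concrete, here is a minimal sketch of prediction with an ordered list of rules: the first rule whose conditions all hold decides the label, and a default applies when none match. It does not implement the thesis's MIO rule mining or ordering; the attributes (`age`, `load`), labels, and function name are hypothetical.

```python
def predict(decision_list, default_label, example):
    """Return the label of the first rule whose conditions all match."""
    for conditions, label in decision_list:
        if all(example.get(key) == value for key, value in conditions.items()):
            return label
    return default_label


# Hypothetical rules: (conditions, predicted label), checked in order.
rules = [
    ({"age": "old", "load": "high"}, "fail"),
    ({"age": "new"}, "ok"),
    ({"load": "high"}, "fail"),
]

example = {"age": "new", "load": "high"}
in_order = predict(rules, "ok", example)
reversed_order = predict(list(reversed(rules)), "ok", example)
no_match = predict(rules, "ok", {"age": "old", "load": "low"})
print(in_order, reversed_order, no_match)
```

The same three rules give different predictions for the same example when their order is reversed, which illustrates why choosing the ordering is itself a combinatorial problem.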

DSpace@MIT

Ranking electrical feeders of the New York power grid

Author: Albert Boulanger
Ansaf Salleb-Aouissi
Haimonti Dutta
Philip Gross
Publication date: 01/01/2009

Ranking problems arise in a wide range of real-world applications where an ordering on a set of examples is preferred to a classification model. These applications include collaborative filtering, information retrieval, and ranking components of a system by susceptibility to failure. In this extended abstract, we present an ongoing project to rank the underground primary feeders of Consolidated Edison Company of New York according to their susceptibility to outages. We describe our framework and the application of different machine learning ranking methods along with experiments on concept drift detection. Between the high-voltage transmission system and the household-voltage secondary system, electricity is sent through primary distribution feeders, cables which move energy around the New York City area. There are three regions of interest (Manhattan, Brooklyn/Queens and Bronx), which together have over 1000 feeders. Individual feeders fail with some regularity. On average there were over five failures per day over all regions during summer 2007. Our goal was to create an ordered list of feeders in the system ranked from most susceptible to failure to least susceptible. There are a number of challenges. The number of attributes is very large, while failures are relatively rare. The system is believed to exhibit concept drift, where the causes of failures change significantly over the course of the year. Many of the attributes are in the form of time series which must be aggregated. Other attributes map to feeder sub-components (e.g. cable sections, averagin

CiteSeerX

Crossref