Data Analytics and Machine Learning to Enhance the Operational Visibility and Situation Awareness of Smart Grid High Penetration Photovoltaic Systems

Abstract

Electric utilities have limited operational visibility and situation awareness over grid-tied distributed photovoltaic systems (PV). This will pose a risk to grid stability when the PV penetration into a given feeder exceeds 60% of its peak or minimum daytime load. Third-party service providers offer only real-time monitoring but not accurate insights into system performance and prediction of productions. PV systems also increase the attack surface of distribution networks since they are not under the direct supervision and control of the utility security analysts. Six key objectives were successfully achieved to enhance PV operational visibility and situation awareness: (1) conceptual cybersecurity frameworks for PV situation awareness at device, communications, applications, and cognitive levels; (2) a unique combinatorial approach using LASSO-Elastic Net regularizations and multilayer perceptron for PV generation forecasting; (3) applying a fixed-point primal dual log-barrier interior point method to expedite AC optimal power flow convergence; (4) adapting big data standards and capability maturity models to PV systems; (5) using K-nearest neighbors and random forests to impute missing values in PV big data; and (6) a hybrid data-model method that takes PV system deration factors and historical data to estimate generation and evaluate system performance using advanced metrics. These objectives were validated on three real-world case studies comprising grid-tied commercial PV systems. The results and conclusions show that the proposed imputation approach improved the accuracy by 91%, the estimation method performed better by 75% and 10% for two PV systems, and the use of the proposed forecasting model improved the generalization performance and reduced the likelihood of overfitting. The application of primal dual log-barrier interior point method improved the convergence of AC optimal power flow by 0.7 and 0.6 times that of the currently used deterministic models. Through the use of advanced performance metrics, it is shown how PV systems of different nameplate capacities installed at different geographical locations can be directly evaluated and compared over both instantaneous as well as extended periods of time. The results of this dissertation will be of particular use to multiple stakeholders of the PV domain including, but not limited to, the utility network and security operation centers, standards working groups, utility equipment, and service providers, data consultants, system integrator, regulators and public service commissions, government bodies, and end-consumers

    Similar works