
    Is Big Data Sufficient for a Reliable Detection of Non-Technical Losses?

    Non-technical losses (NTL) occur during the distribution of electricity in power grids and include, but are not limited to, electricity theft and faulty meters. In emerging countries, they may range up to 40% of the total electricity distributed. To detect NTLs, machine learning methods are used that learn irregular consumption patterns from customer data and inspection results. The Big Data paradigm followed in modern machine learning reflects the desire to derive better conclusions simply by analyzing more data, without the necessity of looking at theory and models. However, the sample of inspected customers may be biased, i.e. it does not represent the population of all customers. As a consequence, machine learning models trained on these inspection results are biased as well and therefore lead to unreliable predictions of whether customers cause NTL or not. In machine learning, this issue is called covariate shift and has not yet been addressed in the literature on NTL detection. In this work, we present a novel framework for quantifying and visualizing covariate shift. We apply it to a commercial data set from Brazil that consists of 3.6M customers and 820K inspection results. We show that some features have a stronger covariate shift than others, making predictions less reliable. In particular, previous inspections were focused on certain neighborhoods or customer classes and were not sufficiently spread among the population of customers. This framework is about to be deployed in a commercial product for NTL detection. Comment: Proceedings of the 19th International Conference on Intelligent System Applications to Power Systems (ISAP 2017)
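
    The idea of measuring how strongly a feature's distribution differs between inspected customers and the whole customer population can be illustrated with a minimal sketch. It is not the framework from the paper: it assumes a simple total variation distance per categorical feature, and the feature names and toy data are hypothetical.

```python
from collections import Counter


def distribution(values):
    """Return the empirical probability of each category in `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


def total_variation(population, inspected):
    """Total variation distance between two categorical samples.

    0.0 means the inspected sample mirrors the population for this feature;
    1.0 means the two distributions do not overlap at all.
    """
    p = distribution(population)
    q = distribution(inspected)
    categories = set(p) | set(q)
    return 0.5 * sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)


# Hypothetical toy data: one list entry per customer.
population = {
    "customer_class": ["residential"] * 80 + ["commercial"] * 15 + ["industrial"] * 5,
    "region": ["north"] * 50 + ["south"] * 50,
}
inspected = {
    "customer_class": ["residential"] * 10 + ["commercial"] * 8 + ["industrial"] * 2,
    "region": ["north"] * 10 + ["south"] * 10,
}

# Assumed shift score per feature: a higher value means inspections were
# less representative of the population along that feature.
shift_by_feature = {
    name: total_variation(population[name], inspected[name]) for name in population
}

# Features ordered from most to least shifted.
ranked_features = sorted(shift_by_feature, key=shift_by_feature.get, reverse=True)

for name in ranked_features:
    print(f"{name}: {shift_by_feature[name]:.2f}")
```

    In this toy example the inspections over-sample commercial customers, so customer_class receives a higher score than region, which is sampled proportionally.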

    The Challenge of Non-Technical Loss Detection using Artificial Intelligence: A Survey

    Detection of non-technical losses (NTL), which include electricity theft, faulty meters and billing errors, has attracted increasing attention from researchers in electrical engineering and computer science. NTLs cause significant harm to the economy, as in some countries they may range up to 40% of the total electricity distributed. The predominant research direction is employing artificial intelligence to predict whether a customer causes NTL. This paper first provides an overview of how NTLs are defined and their impact on economies, which includes loss of revenue and profit for electricity providers and decreased stability and reliability of electrical power grids. It then surveys the state-of-the-art research efforts in an up-to-date and comprehensive review of the algorithms, features and data sets used. It finally identifies the key scientific and engineering challenges in NTL detection and suggests how they could be addressed in the future.

    Artificial Intelligence for the Detection of Electricity Theft and Irregular Power Usage in Emerging Markets

    Power grids are critical infrastructure assets that face non-technical losses (NTL), which include, but are not limited to, electricity theft, broken or malfunctioning meters and arranged false meter readings. In emerging markets, NTL are a prime concern and often range up to 40% of the total electricity distributed. The annual world-wide costs for utilities due to NTL are estimated to be around USD 100 billion. Reducing NTL in order to increase revenue, profit and reliability of the grid is therefore of vital interest to utilities and authorities. At the beginning of this thesis, we provide an in-depth discussion of the causes of NTL and their economic effects. Industrial NTL detection systems are still largely based on expert knowledge when deciding whether to carry out costly on-site inspections of customers. Electric utilities are reluctant to move to large-scale deployments of automated systems that learn NTL profiles from data, because such systems tend to suggest a large number of unnecessary inspections. In this thesis, we compare expert knowledge-based decision making systems to automated statistical decision making. We then branch out our research into different directions: First, in order to allow human experts to feed their knowledge into the decision process, we propose a method for visualizing prediction results at various granularity levels in a spatial hologram. Our approach allows domain experts to put the classification results into the context of the data and to incorporate their knowledge when making the final decisions on which customers to inspect. Second, we propose a machine learning framework that classifies customers into NTL or non-NTL using a variety of features derived from the customers' consumption data as well as a selection of master data. The methodology used is specifically tailored to the level of noise in the data. Last, we discuss the issue of biases in data sets. A bias occurs whenever training sets are not representative of the test data, which results in unreliable models. We show how quantifying and reducing these biases leads to an increased accuracy of the trained NTL detectors. This thesis has produced appreciable results on real-world big data sets of millions of customers. Our systems are being deployed in commercial NTL detection software. We also provide suggestions on how to further reduce NTL by not only carrying out inspections, but also by implementing market reforms, increasing efficiency in the organization of utilities and improving communication between utilities, authorities and customers.

    Detection of Irregular Power Usage using Machine Learning

    Electricity losses are a frequently appearing problem in power grids. Non-technical losses (NTL) appear during distribution and include, but are not limited to, the following causes: meter tampering in order to record lower consumption, bypassing meters by rigging lines from the power source, arranged false meter readings by bribing meter readers, faulty or broken meters, un-metered supply, and technical and human errors in meter readings, data processing and billing. NTLs are also reported to range up to 40% of the total electricity distributed in countries such as India, Pakistan, Malaysia, Brazil or Lebanon. This is an introductory-level course that discusses how to predict whether a customer causes an NTL. In recent years, employing data analytics methods such as machine learning and data mining has evolved as the primary direction to solve this problem. This course will present and compare different approaches reported in the literature. Practical case studies on real data sets will be included. As an additional outcome, attendees will understand the open challenges of NTL detection and learn how these challenges could be solved in the coming years.

    Introduction to Detection of Non-Technical Losses using Data Analytics

    Electricity losses are a frequently appearing problem in power grids. Non-technical losses (NTL) appear during distribution and include, but are not limited to, the following causes: meter tampering in order to record lower consumption, bypassing meters by rigging lines from the power source, arranged false meter readings by bribing meter readers, faulty or broken meters, un-metered supply, and technical and human errors in meter readings, data processing and billing. NTLs are also reported to range up to 40% of the total electricity distributed in countries such as Brazil, India, Malaysia or Lebanon. This is an introductory-level course that discusses how to predict whether a customer causes an NTL. In recent years, employing data analytics methods such as data mining and machine learning has evolved as the primary direction to solve this problem. This course will compare and contrast different approaches reported in the literature. Practical case studies on real data sets will be included. Attendees will therefore not only understand but also experience the challenges of NTL detection and learn how these challenges could be solved in the coming years.

    Energy Theft Detection in Smart Grids with Genetic Algorithm-Based Feature Selection

    As big data, its technologies, and its applications continue to advance, the Smart Grid (SG) has become one of the most successful pervasive and fixed computing platforms that efficiently uses a data-driven approach and employs efficient information and communication technology (ICT) and cloud computing. As a result of the complicated architecture of cloud computing, the distinctive working of advanced metering infrastructures (AMI), and the use of sensitive data, it has become challenging to make the SG secure. Faults of the SG are categorized into two main categories, Technical Losses (TLs) and Non-Technical Losses (NTLs). Hardware failure, communication issues, ohmic losses, and energy burnout during transmission and propagation of energy are TLs. NTLs are human-induced errors for malicious purposes such as attacking sensitive data and electricity theft, along with tampering with AMI for bill reduction by fraudulent customers. This research proposes a data-driven methodology based on principles of computational intelligence as well as big data analysis to identify fraudulent customers based on their load profile. In our proposed methodology, a hybrid Genetic Algorithm and Support Vector Machine (GA-SVM) model has been used to extract the relevant subset of feature data from a large and unsupervised public smart grid project dataset in London, UK, for theft detection. A subset of 26 out of 71 features is obtained with a classification accuracy of 96.6%, compared to studies conducted on small and limited datasets.

    Security of data science and data science for security

    In this chapter, we present a brief overview of important topics regarding the connection of data science and security. In the first part, we focus on the security of data science and discuss a selection of security aspects that data scientists should consider to make their services and products more secure. In the second part, about data science for security, we switch sides and present some applications where data science plays a critical role in pushing the state of the art in securing information systems. This includes a detailed look at the potential and challenges of applying machine learning to the problem of detecting obfuscated JavaScript.