8,360 research outputs found
Large-Scale Detection of Non-Technical Losses in Imbalanced Data Sets
Non-technical losses (NTL) such as electricity theft cause significant harm
to our economies, as in some countries they may range up to 40% of the total
electricity distributed. Detecting NTLs requires costly on-site inspections.
Accurate prediction of NTLs for customers using machine learning is therefore
crucial. To date, related research largely ignore that the two classes of
regular and non-regular customers are highly imbalanced, that NTL proportions
may change and mostly consider small data sets, often not allowing to deploy
the results in production. In this paper, we present a comprehensive approach
to assess three NTL detection models for different NTL proportions in large
real world data sets of 100Ks of customers: Boolean rules, fuzzy logic and
Support Vector Machine. This work has resulted in appreciable results that are
about to be deployed in a leading industry solution. We believe that the
considerations and observations made in this contribution are necessary for
future smart meter research in order to report their effectiveness on
imbalanced and large real world data sets.Comment: Proceedings of the Seventh IEEE Conference on Innovative Smart Grid
Technologies (ISGT 2016
FSL-BM: Fuzzy Supervised Learning with Binary Meta-Feature for Classification
This paper introduces a novel real-time Fuzzy Supervised Learning with Binary
Meta-Feature (FSL-BM) for big data classification task. The study of real-time
algorithms addresses several major concerns, which are namely: accuracy, memory
consumption, and ability to stretch assumptions and time complexity. Attaining
a fast computational model providing fuzzy logic and supervised learning is one
of the main challenges in the machine learning. In this research paper, we
present FSL-BM algorithm as an efficient solution of supervised learning with
fuzzy logic processing using binary meta-feature representation using Hamming
Distance and Hash function to relax assumptions. While many studies focused on
reducing time complexity and increasing accuracy during the last decade, the
novel contribution of this proposed solution comes through integration of
Hamming Distance, Hash function, binary meta-features, binary classification to
provide real time supervised method. Hash Tables (HT) component gives a fast
access to existing indices; and therefore, the generation of new indices in a
constant time complexity, which supersedes existing fuzzy supervised algorithms
with better or comparable results. To summarize, the main contribution of this
technique for real-time Fuzzy Supervised Learning is to represent hypothesis
through binary input as meta-feature space and creating the Fuzzy Supervised
Hash table to train and validate model.Comment: FICC201
A Review of Fault Diagnosing Methods in Power Transmission Systems
Transient stability is important in power systems. Disturbances like faults need to be segregated to restore transient stability. A comprehensive review of fault diagnosing methods in the power transmission system is presented in this paper. Typically, voltage and current samples are deployed for analysis. Three tasks/topics; fault detection, classification, and location are presented separately to convey a more logical and comprehensive understanding of the concepts. Feature extractions, transformations with dimensionality reduction methods are discussed. Fault classification and location techniques largely use artificial intelligence (AI) and signal processing methods. After the discussion of overall methods and concepts, advancements and future aspects are discussed. Generalized strengths and weaknesses of different AI and machine learning-based algorithms are assessed. A comparison of different fault detection, classification, and location methods is also presented considering features, inputs, complexity, system used and results. This paper may serve as a guideline for the researchers to understand different methods and techniques in this field
Theoretical Interpretations and Applications of Radial Basis Function Networks
Medical applications usually used Radial Basis Function Networks just as Artificial Neural Networks. However, RBFNs are Knowledge-Based Networks that can be interpreted in several way: Artificial Neural Networks, Regularization Networks, Support Vector Machines, Wavelet Networks, Fuzzy Controllers, Kernel Estimators, Instanced-Based Learners. A survey of their interpretations and of their corresponding learning algorithms is provided as well as a brief survey on dynamic learning algorithms. RBFNs' interpretations can suggest applications that are particularly interesting in medical domains
The Challenge of Non-Technical Loss Detection using Artificial Intelligence: A Survey
Detection of non-technical losses (NTL) which include electricity theft,
faulty meters or billing errors has attracted increasing attention from
researchers in electrical engineering and computer science. NTLs cause
significant harm to the economy, as in some countries they may range up to 40%
of the total electricity distributed. The predominant research direction is
employing artificial intelligence to predict whether a customer causes NTL.
This paper first provides an overview of how NTLs are defined and their impact
on economies, which include loss of revenue and profit of electricity providers
and decrease of the stability and reliability of electrical power grids. It
then surveys the state-of-the-art research efforts in a up-to-date and
comprehensive review of algorithms, features and data sets used. It finally
identifies the key scientific and engineering challenges in NTL detection and
suggests how they could be addressed in the future
Coreference detection in XML metadata
Preserving data quality is an important issue in data collection management. One of the crucial issues hereby is the detection of duplicate objects (called coreferent objects) which describe the same entity, but in different ways. In this paper we present a method for detecting coreferent objects in metadata, in particular in XML schemas. Our approach consists in comparing the paths from a root element to a given element in the schema. Each path precisely defines the context and location of a specific element in the schema. Path matching is based on the comparison of the different steps of which paths are composed. The uncertainty about the matching of steps is expressed with possibilistic truth values and aggregated using the Sugeno integral. The discovered coreference of paths can help for determining the coreference of different XML schemas
QCBA: Postoptimization of Quantitative Attributes in Classifiers based on Association Rules
The need to prediscretize numeric attributes before they can be used in
association rule learning is a source of inefficiencies in the resulting
classifier. This paper describes several new rule tuning steps aiming to
recover information lost in the discretization of numeric (quantitative)
attributes, and a new rule pruning strategy, which further reduces the size of
the classification models. We demonstrate the effectiveness of the proposed
methods on postoptimization of models generated by three state-of-the-art
association rule classification algorithms: Classification based on
Associations (Liu, 1998), Interpretable Decision Sets (Lakkaraju et al, 2016),
and Scalable Bayesian Rule Lists (Yang, 2017). Benchmarks on 22 datasets from
the UCI repository show that the postoptimized models are consistently smaller
-- typically by about 50% -- and have better classification performance on most
datasets
- …