19,067 research outputs found
Large-Scale Detection of Non-Technical Losses in Imbalanced Data Sets
Non-technical losses (NTL) such as electricity theft cause significant harm
to our economies, as in some countries they may range up to 40% of the total
electricity distributed. Detecting NTLs requires costly on-site inspections.
Accurate prediction of NTLs for customers using machine learning is therefore
crucial. To date, related research largely ignore that the two classes of
regular and non-regular customers are highly imbalanced, that NTL proportions
may change and mostly consider small data sets, often not allowing to deploy
the results in production. In this paper, we present a comprehensive approach
to assess three NTL detection models for different NTL proportions in large
real world data sets of 100Ks of customers: Boolean rules, fuzzy logic and
Support Vector Machine. This work has resulted in appreciable results that are
about to be deployed in a leading industry solution. We believe that the
considerations and observations made in this contribution are necessary for
future smart meter research in order to report their effectiveness on
imbalanced and large real world data sets.Comment: Proceedings of the Seventh IEEE Conference on Innovative Smart Grid
Technologies (ISGT 2016
Security Evaluation of Support Vector Machines in Adversarial Environments
Support Vector Machines (SVMs) are among the most popular classification
techniques adopted in security applications like malware detection, intrusion
detection, and spam filtering. However, if SVMs are to be incorporated in
real-world security systems, they must be able to cope with attack patterns
that can either mislead the learning algorithm (poisoning), evade detection
(evasion), or gain information about their internal parameters (privacy
breaches). The main contributions of this chapter are twofold. First, we
introduce a formal general framework for the empirical evaluation of the
security of machine-learning systems. Second, according to our framework, we
demonstrate the feasibility of evasion, poisoning and privacy attacks against
SVMs in real-world security problems. For each attack technique, we evaluate
its impact and discuss whether (and how) it can be countered through an
adversary-aware design of SVMs. Our experiments are easily reproducible thanks
to open-source code that we have made available, together with all the employed
datasets, on a public repository.Comment: 47 pages, 9 figures; chapter accepted into book 'Support Vector
Machine Applications
Fine-Grained Product Class Recognition for Assisted Shopping
Assistive solutions for a better shopping experience can improve the quality
of life of people, in particular also of visually impaired shoppers. We present
a system that visually recognizes the fine-grained product classes of items on
a shopping list, in shelves images taken with a smartphone in a grocery store.
Our system consists of three components: (a) We automatically recognize useful
text on product packaging, e.g., product name and brand, and build a mapping of
words to product classes based on the large-scale GroceryProducts dataset. When
the user populates the shopping list, we automatically infer the product class
of each entered word. (b) We perform fine-grained product class recognition
when the user is facing a shelf. We discover discriminative patches on product
packaging to differentiate between visually similar product classes and to
increase the robustness against continuous changes in product design. (c) We
continuously improve the recognition accuracy through active learning. Our
experiments show the robustness of the proposed method against cross-domain
challenges, and the scalability to an increasing number of products with
minimal re-training.Comment: Accepted at ICCV Workshop on Assistive Computer Vision and Robotics
(ICCV-ACVR) 201
CleanNet: Transfer Learning for Scalable Image Classifier Training with Label Noise
In this paper, we study the problem of learning image classification models
with label noise. Existing approaches depending on human supervision are
generally not scalable as manually identifying correct or incorrect labels is
time-consuming, whereas approaches not relying on human supervision are
scalable but less effective. To reduce the amount of human supervision for
label noise cleaning, we introduce CleanNet, a joint neural embedding network,
which only requires a fraction of the classes being manually verified to
provide the knowledge of label noise that can be transferred to other classes.
We further integrate CleanNet and conventional convolutional neural network
classifier into one framework for image classification learning. We demonstrate
the effectiveness of the proposed algorithm on both of the label noise
detection task and the image classification on noisy data task on several
large-scale datasets. Experimental results show that CleanNet can reduce label
noise detection error rate on held-out classes where no human supervision
available by 41.5% compared to current weakly supervised methods. It also
achieves 47% of the performance gain of verifying all images with only 3.2%
images verified on an image classification task. Source code and dataset will
be available at kuanghuei.github.io/CleanNetProject.Comment: Accepted to CVPR 201
- …