2,882 research outputs found
Deep Self-Taught Learning for Handwritten Character Recognition
Recent theoretical and empirical work in statistical machine learning has
demonstrated the importance of learning algorithms for deep architectures,
i.e., function classes obtained by composing multiple non-linear
transformations. Self-taught learning (exploiting unlabeled examples or
examples from other distributions) has already been applied to deep learners,
but mostly to show the advantage of unlabeled examples. Here we explore the
advantage brought by {\em out-of-distribution examples}. For this purpose we
developed a powerful generator of stochastic variations and noise processes for
character images, including not only affine transformations but also slant,
local elastic deformations, changes in thickness, background images, grey level
changes, contrast, occlusion, and various types of noise. The
out-of-distribution examples are obtained from these highly distorted images or
by including examples of object classes different from those in the target test
set. We show that {\em deep learners benefit more from out-of-distribution
examples than a corresponding shallow learner}, at least in the area of
handwritten character recognition. In fact, we show that they beat previously
published results and reach human-level performance on both handwritten digit
classification and 62-class handwritten character recognition
Security Evaluation of Support Vector Machines in Adversarial Environments
Support Vector Machines (SVMs) are among the most popular classification
techniques adopted in security applications like malware detection, intrusion
detection, and spam filtering. However, if SVMs are to be incorporated in
real-world security systems, they must be able to cope with attack patterns
that can either mislead the learning algorithm (poisoning), evade detection
(evasion), or gain information about their internal parameters (privacy
breaches). The main contributions of this chapter are twofold. First, we
introduce a formal general framework for the empirical evaluation of the
security of machine-learning systems. Second, according to our framework, we
demonstrate the feasibility of evasion, poisoning and privacy attacks against
SVMs in real-world security problems. For each attack technique, we evaluate
its impact and discuss whether (and how) it can be countered through an
adversary-aware design of SVMs. Our experiments are easily reproducible thanks
to open-source code that we have made available, together with all the employed
datasets, on a public repository.Comment: 47 pages, 9 figures; chapter accepted into book 'Support Vector
Machine Applications
One-Class Classification: Taxonomy of Study and Review of Techniques
One-class classification (OCC) algorithms aim to build classification models
when the negative class is either absent, poorly sampled or not well defined.
This unique situation constrains the learning of efficient classifiers by
defining class boundary just with the knowledge of positive class. The OCC
problem has been considered and applied under many research themes, such as
outlier/novelty detection and concept learning. In this paper we present a
unified view of the general problem of OCC by presenting a taxonomy of study
for OCC problems, which is based on the availability of training data,
algorithms used and the application domains applied. We further delve into each
of the categories of the proposed taxonomy and present a comprehensive
literature review of the OCC algorithms, techniques and methodologies with a
focus on their significance, limitations and applications. We conclude our
paper by discussing some open research problems in the field of OCC and present
our vision for future research.Comment: 24 pages + 11 pages of references, 8 figure
Advances in Character Recognition
This book presents advances in character recognition, and it consists of 12 chapters that cover wide range of topics on different aspects of character recognition. Hopefully, this book will serve as a reference source for academic research, for professionals working in the character recognition field and for all interested in the subject
Support vector machine applications in terahertz pulsed signals feature sets
Copyright © 2007 IEEE. All Rights Reserved.In the past decade, terahertz radiation (T-rays) have been extensively applied within the fields of industrial and biomedical imaging, owing to their noninvasive property. Support vector machine (SVM) learning algorithms are sufficiently powerful to detect patterns hidden inside noisy biomedical measurements. This paper introduces a frequency orientation component method to extract T-ray feature sets for the application of two- and multiclass classification using SVMs. Effective discriminations of ribonucleic acid (RNA) samples and various powdered substances are demonstrated. The development of this method has become important in T-ray chemical sensing and image processing, which results in enhanced detectability useful for many applications, such as quality control, security detection and clinic diagnosis.Xiaoxia Yin, Brian W.-H. Ng, Bernd M. Fischer, Bradley Ferguson, and Derek Abbot
Differentially Private Empirical Risk Minimization
Privacy-preserving machine learning algorithms are crucial for the
increasingly common setting in which personal data, such as medical or
financial records, are analyzed. We provide general techniques to produce
privacy-preserving approximations of classifiers learned via (regularized)
empirical risk minimization (ERM). These algorithms are private under the
-differential privacy definition due to Dwork et al. (2006). First we
apply the output perturbation ideas of Dwork et al. (2006), to ERM
classification. Then we propose a new method, objective perturbation, for
privacy-preserving machine learning algorithm design. This method entails
perturbing the objective function before optimizing over classifiers. If the
loss and regularizer satisfy certain convexity and differentiability criteria,
we prove theoretical results showing that our algorithms preserve privacy, and
provide generalization bounds for linear and nonlinear kernels. We further
present a privacy-preserving technique for tuning the parameters in general
machine learning algorithms, thereby providing end-to-end privacy guarantees
for the training process. We apply these results to produce privacy-preserving
analogues of regularized logistic regression and support vector machines. We
obtain encouraging results from evaluating their performance on real
demographic and benchmark data sets. Our results show that both theoretically
and empirically, objective perturbation is superior to the previous
state-of-the-art, output perturbation, in managing the inherent tradeoff
between privacy and learning performance.Comment: 40 pages, 7 figures, accepted to the Journal of Machine Learning
Researc
A Review of Codebook Models in Patch-Based Visual Object Recognition
The codebook model-based approach, while ignoring any structural aspect in vision, nonetheless provides state-of-the-art performances on current datasets. The key role of a visual codebook is to provide a way to map the low-level features into a fixed-length vector in histogram space to which standard classifiers can be directly applied. The discriminative power of such a visual codebook determines the quality of the codebook model, whereas the size of the codebook controls the complexity of the model. Thus, the construction of a codebook is an important step which is usually done by cluster analysis. However, clustering is a process that retains regions of high density in a distribution and it follows that the resulting codebook need not have discriminant properties. This is also recognised as a computational bottleneck of such systems. In our recent work, we proposed a resource-allocating codebook, to constructing a discriminant codebook in a one-pass design procedure that slightly outperforms more traditional approaches at drastically reduced computing times. In this review we survey several approaches that have been proposed over the last decade with their use of feature detectors, descriptors, codebook construction schemes, choice of classifiers in recognising objects, and datasets that were used in evaluating the proposed methods
Theoretical Interpretations and Applications of Radial Basis Function Networks
Medical applications usually used Radial Basis Function Networks just as Artificial Neural Networks. However, RBFNs are Knowledge-Based Networks that can be interpreted in several way: Artificial Neural Networks, Regularization Networks, Support Vector Machines, Wavelet Networks, Fuzzy Controllers, Kernel Estimators, Instanced-Based Learners. A survey of their interpretations and of their corresponding learning algorithms is provided as well as a brief survey on dynamic learning algorithms. RBFNs' interpretations can suggest applications that are particularly interesting in medical domains
- …