On the Use of Default Parameter Settings in the Empirical Evaluation of Classification Algorithms
We demonstrate that, for a range of state-of-the-art machine learning
algorithms, the differences in generalisation performance obtained using
default parameter settings and using parameters tuned via cross-validation can
be similar in magnitude to the differences in performance observed between
state-of-the-art and uncompetitive learning systems. This means that fair and
rigorous evaluation of new learning algorithms requires performance comparison
against benchmark methods with best-practice model selection procedures, rather
than using default parameter settings. We investigate the sensitivity of three
key machine learning algorithms (support vector machine, random forest and
rotation forest) to their default parameter settings, and provide guidance on
determining sensible default parameter values for implementations of these
algorithms. We also conduct an experimental comparison of these three
algorithms on 121 classification problems and find that, perhaps surprisingly,
rotation forest is significantly more accurate on average than both random
forest and a support vector machine.
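The core claim — that tuning can matter as much as the choice of algorithm — can be illustrated with a minimal sketch: a k-NN learner (a stand-in, not one of the paper's three algorithms) whose neighbourhood size is either left at a "default" or tuned by cross-validation on synthetic data. All names, data and settings below are illustrative assumptions, not the paper's protocol.

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k):
    # Majority vote among the k nearest training points (Euclidean distance).
    d = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(-1)
    idx = np.argsort(d, axis=1)[:, :k]
    return np.array([np.bincount(row).argmax() for row in y_train[idx]])

def cv_accuracy(X, y, k, folds=5, seed=0):
    # Mean accuracy over a simple k-fold split (same folds for every k).
    order = np.random.default_rng(seed).permutation(len(X))
    scores = []
    for f in range(folds):
        test = order[f::folds]
        train = np.setdiff1d(order, test)
        scores.append((knn_predict(X[train], y[train], X[test], k) == y[test]).mean())
    return float(np.mean(scores))

rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.repeat([0, 1], 50)

default_k = 1                                               # the "default" setting
tuned_k = max(range(1, 16, 2), key=lambda k: cv_accuracy(X, y, k))
```

By construction the tuned value can only match or beat the default under the same cross-validation estimate, which is exactly why comparing a new method against untuned baselines is unfair.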
A game-theoretic framework for classifier ensembles using weighted majority voting with local accuracy estimates
In this paper, a novel approach for the optimal combination of binary
classifiers is proposed. The classifier combination problem is approached from
a Game Theory perspective. The proposed framework of adapted weighted majority
rules (WMR) is tested against common rank-based, Bayesian and simple majority
models, as well as two soft-output averaging rules. Experiments with ensembles
of Support Vector Machines (SVM), Ordinary Binary Tree Classifiers (OBTC) and
weighted k-nearest-neighbor (w/k-NN) models on benchmark datasets indicate that
this new adaptive WMR model, employing local accuracy estimators and the
analytically computed optimal weights, outperforms all the other simple
combination rules.

Comment: 21 pages, 9 tables, 1 figure, 68 references
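For independent binary voters, the analytically optimal majority weights are the log-odds of each voter's accuracy (the classical Nitzan–Paroush result); the WMR framework above replaces global accuracies with local accuracy estimates. A minimal sketch with hypothetical accuracy values:

```python
import numpy as np

def optimal_weights(acc):
    # Log-odds weights: optimal for independent voters (Nitzan-Paroush).
    acc = np.asarray(acc, dtype=float)
    return np.log(acc / (1.0 - acc))

def weighted_majority(votes, weights):
    # votes in {-1, +1}; the sign of the weighted sum decides the class.
    return 1 if np.dot(weights, votes) > 0 else -1

local_acc = [0.9, 0.6, 0.55]        # hypothetical local accuracy estimates
w = optimal_weights(local_acc)

# The single accurate voter overrules the two weak ones, unlike a simple
# (unweighted) majority, which would side with the two weak voters here.
decision = weighted_majority(np.array([+1, -1, -1]), w)
```

This is the basic mechanism; the paper's contribution is choosing the weights game-theoretically from local (per-region) accuracy estimates rather than global ones.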
Is rotation forest the best classifier for problems with continuous features?
In short, our experiments suggest that yes, on average, rotation forest is
better than the most common alternatives when all the attributes are
real-valued. Rotation forest is a tree based ensemble that performs transforms
on subsets of attributes prior to constructing each tree. We present an
empirical comparison of classifiers for problems with only real-valued
features. We evaluate classifiers from three families of algorithms: support
vector machines; tree-based ensembles; and neural networks tuned with a large
grid search. We compare classifiers on unseen data based on the quality of the
decision rule (using classification error), the ability to rank cases (area
under the receiver operating characteristic curve) and the probability
estimates (using negative log-likelihood). We conclude that, in answer to the question
posed in the title, yes, rotation forest is significantly more accurate on
average than competing techniques when compared on three distinct sets of
datasets. Further, we assess the impact of the design features of rotation
forest through an ablative study that transforms random forest into rotation
forest. We identify the major limitation of rotation forest as its scalability,
particularly in number of attributes. To overcome this problem we develop a
model to predict the train time of the algorithm and hence propose a contract
version of rotation forest where a run time cap is imposed {\em a priori}. We
demonstrate that on large problems rotation forest can be made an order of
magnitude faster without significant loss of accuracy. We also show that there
is no real benefit (on average) from tuning rotation forest. We maintain that
without any domain knowledge to indicate an algorithm preference, rotation
forest should be the default algorithm of choice for problems with continuous
attributes.
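The transform step that distinguishes rotation forest from random forest can be sketched in a few lines: attributes are split into disjoint random groups, PCA is run per group, and the principal axes are assembled into a block-structured rotation applied before each tree is grown. This is a simplification (the full algorithm runs each PCA on a bootstrap sample of a class subset), and all sizes below are illustrative.

```python
import numpy as np

def rotation_matrix(X, n_groups=3, rng=None):
    # One tree's rotation: disjoint random attribute groups, PCA per group,
    # components assembled into a block-structured orthogonal matrix.
    # (Simplified: the real algorithm fits PCA on bootstrap class subsets.)
    if rng is None:
        rng = np.random.default_rng(0)
    n_attr = X.shape[1]
    groups = np.array_split(rng.permutation(n_attr), n_groups)
    R = np.zeros((n_attr, n_attr))
    for g in groups:
        Xg = X[:, g] - X[:, g].mean(axis=0)
        _, _, Vt = np.linalg.svd(Xg, full_matrices=False)
        R[np.ix_(g, g)] = Vt.T          # all components kept: a pure rotation
    return R

rng = np.random.default_rng(1)
X = rng.normal(size=(40, 6))
R = rotation_matrix(X, rng=rng)
X_rot = X @ R      # each base tree is then trained on its own rotated copy
```

Because every group keeps all of its principal components, R is orthogonal: the data are rotated, not compressed, and diversity between trees comes from each tree drawing different groups. The per-tree SVDs are also why the method scales poorly in the number of attributes, as noted above.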
Hierarchical Invariant Feature Learning with Marginalization for Person Re-Identification
This paper addresses the problem of matching pedestrians across multiple
camera views, known as person re-identification. Variations in lighting
conditions, environment and pose changes across camera views make
re-identification a challenging problem. Previous methods address these
challenges by designing specific features or by learning a distance function.
We propose a hierarchical feature learning framework that learns invariant
representations from labeled image pairs. A mapping is learned such that the
extracted features are invariant for images belonging to the same individual across
views. To learn robust representations and to achieve better generalization to
unseen data, the system has to be trained with a large amount of data.
Critically, most of the person re-identification datasets are small. Manually
augmenting the dataset by partial corruption of input data introduces
additional computational burden as it requires several training epochs to
converge. We propose a hierarchical network which incorporates a
marginalization technique that can reap the benefits of training on large
datasets without explicit augmentation. We compare our approach with several
baseline algorithms as well as popular linear and non-linear metric learning
algorithms and demonstrate improved performance on challenging publicly
available datasets: VIPeR, CUHK01, CAVIAR4REID and iLIDS. Our approach also
achieves state-of-the-art results on these datasets.
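The abstract does not spell out the marginalisation, but one established way to reap the benefit of corrupted copies "without explicit augmentation" is the closed-form expectation over dropout noise used by marginalized denoising autoencoders (Chen et al., 2012). The sketch below shows that idea in isolation; the shapes and corruption level are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

def marginalized_denoising_map(X, p=0.3):
    # Linear map W minimising E||X - W X_corrupt||^2, where each feature is
    # dropped with probability p. The expectation over infinitely many
    # corrupted copies has a closed form, so no explicit augmentation and
    # no extra training epochs are needed.
    q = 1.0 - p
    S = X @ X.T                            # (d, d) scatter; X is (d, n)
    P = S * q                              # E[X X_corrupt^T]
    Q = S * q * q
    np.fill_diagonal(Q, np.diag(S) * q)    # E[X_corrupt X_corrupt^T]
    return P @ np.linalg.inv(Q + 1e-6 * np.eye(len(S)))

X = np.random.default_rng(0).normal(size=(5, 200))   # 5 features, 200 samples
W = marginalized_denoising_map(X)
```

Stacking such marginalized layers is one way a hierarchical network can behave as if trained on a much larger, corrupted dataset at the cost of a single pass.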
Histopathologic Image Processing: A Review
Histopathologic Images (HI) are the gold standard for evaluation of some
tumors. However, the analysis of such images is challenging even for
experienced pathologists, resulting in inter- and intra-observer variability.
Moreover, the analysis is time- and resource-consuming. One of the ways to
accelerate such an analysis is by using Computer Aided Diagnosis systems. In
this work we present a literature review about the computing techniques to
process HI, including shallow and deep methods. We cover the most common tasks
for processing HI such as segmentation, feature extraction, unsupervised
learning and supervised learning. A dedicated section presents the datasets
found during the literature review. We also present a case study of breast cancer
classification using a mix of deep and shallow machine learning methods. The
proposed method obtained an accuracy of 91% in the best case, outperforming the
baseline reported for the dataset.
Learning Bag-of-Features Pooling for Deep Convolutional Neural Networks
Convolutional Neural Networks (CNNs) are well established models capable of
achieving state-of-the-art classification accuracy for various computer vision
tasks. However, they are becoming increasingly larger, using millions of
parameters, while they are restricted to handling images of fixed size. In this
paper, a quantization-based approach, inspired from the well-known
Bag-of-Features model, is proposed to overcome these limitations. The proposed
approach, called Convolutional BoF (CBoF), uses RBF neurons to quantize the
information extracted from the convolutional layers and it is able to natively
classify images of various sizes as well as to significantly reduce the number
of parameters in the network. In contrast to other global pooling operators and
CNN compression techniques, the proposed method uses a trainable pooling
layer that is end-to-end differentiable, allowing the network to be trained
using regular back-propagation and to achieve greater distribution shift
invariance than competitive methods. The ability of the proposed method to
reduce the parameters of the network and increase the classification accuracy
over other state-of-the-art techniques is demonstrated using three image
datasets.

Comment: Accepted at ICCV 201
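The quantisation step can be sketched as follows (codebook size, feature dimension and RBF width are illustrative choices): each spatial feature vector is soft-assigned to trainable codewords via RBF similarities, and the assignments are averaged into a fixed-length histogram, so inputs of any size yield a descriptor of the same length.

```python
import numpy as np

def cbof_pool(features, centers, sigma=1.0):
    # features: (n_locations, d) activations from a conv feature map;
    # centers:  (n_codewords, d) trainable RBF codewords.
    # Soft-assign every spatial location to the codewords, then average
    # the memberships into a histogram whose length ignores input size.
    d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    sim = np.exp(-d2 / (2.0 * sigma ** 2))
    memb = sim / sim.sum(axis=1, keepdims=True)   # per-location soft assignment
    return memb.mean(axis=0)

rng = np.random.default_rng(0)
centers = rng.normal(size=(16, 8))                         # 16 codewords, d = 8
h_small = cbof_pool(rng.normal(size=(49, 8)), centers)     # 7x7 feature map
h_large = cbof_pool(rng.normal(size=(196, 8)), centers)    # 14x14 feature map
# Both images produce a 16-dimensional descriptor for the classifier.
```

Because the operation is a smooth function of the codewords, they can be trained jointly with the rest of the network by back-propagation, which is what sets this apart from fixed global pooling.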
Dense Adaptive Cascade Forest: A Self Adaptive Deep Ensemble for Classification Problems
Recent research has shown that deep forest ensembles achieve a
considerable increase in classification accuracy compared with general
ensemble learning methods, especially when the training set is small. In this
paper, we take advantage of deep forest ensemble and introduce the Dense
Adaptive Cascade Forest (daForest). Our model has a better performance than the
original Cascade Forest, owing to three major features: first, we apply the
SAMME.R boosting algorithm to improve the performance of the model, which
guarantees improvement as the number of layers increases. Second, our model connects each
layer to the subsequent ones in a feed-forward fashion, which enhances the
capability of the model to resist performance degeneration. Third, we add a
hyper-parameters optimization layer before the first classification layer,
so that our model spends less time setting up and finding the optimal
hyper-parameters. Experimental results show that daForest performs
remarkably well and, in some cases, even outperforms neural networks and
achieves state-of-the-art results.

Comment: 22 pages, 6 figures
Sufficient Conditions for Idealised Models to Have No Adversarial Examples: a Theoretical and Empirical Study with Bayesian Neural Networks
We prove, under two sufficient conditions, that idealised models can have no
adversarial examples. We discuss which idealised models satisfy our conditions,
and show that idealised Bayesian neural networks (BNNs) satisfy these. We
continue by studying near-idealised BNNs using HMC inference, demonstrating the
theoretical ideas in practice. We experiment with HMC on synthetic data derived
from MNIST for which we know the ground-truth image density, showing that
near-perfect epistemic uncertainty correlates with density under the image manifold,
and that adversarial images lie off the manifold in our setting. This suggests
why MC dropout, which can be seen as performing approximate inference, has been
observed to be an effective defence against adversarial examples in practice.
We highlight failure cases of non-idealised BNNs relying on dropout, suggesting
a new attack for dropout models and a new defence as well. Lastly, we
demonstrate the defence on a cats-vs-dogs image classification task with a
VGG13 variant.
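MC dropout, mentioned above as approximate inference, amounts to keeping dropout active at test time and averaging several stochastic forward passes; the spread across passes serves as an epistemic-uncertainty proxy. A toy two-layer sketch (weights, sizes and the number of passes are arbitrary assumptions):

```python
import numpy as np

def mc_dropout_forward(x, W1, W2, p=0.5, T=200, rng=None):
    # Keep dropout ACTIVE at test time and run T stochastic passes; the
    # per-output standard deviation across passes is the uncertainty proxy.
    if rng is None:
        rng = np.random.default_rng(0)
    outs = []
    for _ in range(T):
        h = np.maximum(x @ W1, 0.0)               # hidden ReLU layer
        mask = rng.random(h.shape) > p            # Bernoulli dropout mask
        outs.append((h * mask / (1.0 - p)) @ W2)  # inverted-dropout scaling
    outs = np.stack(outs)
    return outs.mean(axis=0), outs.std(axis=0)

rng = np.random.default_rng(1)
W1, W2 = rng.normal(size=(3, 32)), rng.normal(size=(32, 1))
mean, std = mc_dropout_forward(rng.normal(size=(1, 3)), W1, W2, rng=rng)
# std > 0: the dropout ensemble disagrees, signalling model uncertainty.
```

An adversarial input that lies off the data manifold would ideally produce a large std, which is the behaviour the paper probes for both idealised and non-idealised models.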
Demystifying Orthogonal Monte Carlo and Beyond
Orthogonal Monte Carlo (OMC) is a very effective sampling algorithm imposing
structural geometric conditions (orthogonality) on samples for variance
reduction. Due to its simplicity and superior performance as compared to its
Quasi Monte Carlo counterparts, OMC is used in a wide spectrum of challenging
machine learning applications ranging from scalable kernel methods to
predictive recurrent neural networks, generative models and reinforcement
learning. However, theoretical understanding of the method remains very limited.
In this paper we shed new light on the theoretical principles behind OMC,
applying theory of negatively dependent random variables to obtain several new
concentration results. We also propose a novel extension of the method,
called Near-Orthogonal Monte Carlo (NOMC), which leverages number-theoretic
techniques and particle algorithms. We show that NOMC is the first algorithm
consistently outperforming OMC in applications ranging from kernel methods to
approximating distances in probabilistic metric spaces.

Comment: 22 pages, 4 figures
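The basic OMC construction can be sketched in a few lines: orthogonalise a block of Gaussian vectors with a QR decomposition, then rescale each row to the norm of an independently drawn Gaussian vector, so each sample's marginal distribution is preserved while samples within the block are exactly orthogonal. Block size here is illustrative.

```python
import numpy as np

def orthogonal_gaussian_block(d, rng=None):
    # One OMC block: QR-orthogonalise d Gaussian vectors, then give each row
    # the norm of an independent Gaussian vector so marginals stay Gaussian
    # while the d samples are mutually orthogonal (variance reduction).
    if rng is None:
        rng = np.random.default_rng(0)
    Q, _ = np.linalg.qr(rng.normal(size=(d, d)))
    norms = np.linalg.norm(rng.normal(size=(d, d)), axis=1)
    return Q * norms[:, None]

S = orthogonal_gaussian_block(4)
U = S / np.linalg.norm(S, axis=1, keepdims=True)
gram = U @ U.T        # identity: the four samples are mutually orthogonal
```

The negative dependence induced by this coupling is precisely what the concentration results above analyse; NOMC relaxes exact orthogonality to scale past d samples per block.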
Learning Features for Offline Handwritten Signature Verification using Deep Convolutional Neural Networks
Verifying the identity of a person using handwritten signatures is
challenging in the presence of skilled forgeries, where a forger has access to
a person's signature and deliberately attempts to imitate it. In offline
(static) signature verification, the dynamic information of the signature
writing process is lost, and it is difficult to design good feature extractors
that can distinguish genuine signatures from skilled forgeries. This results in
relatively poor performance, with verification errors around 7% in the best
systems in the literature. To address both the difficulty of obtaining good
features and the need to improve system performance, we propose learning the
representations from signature images, in a Writer-Independent format, using
Convolutional Neural Networks. In particular, we propose a novel formulation of
the problem that includes knowledge of skilled forgeries from a subset of users
in the feature learning process, with the aim of capturing visual cues that
distinguish genuine signatures from forgeries regardless of the user. Extensive
experiments were conducted on four datasets: GPDS, MCYT, CEDAR and Brazilian
PUC-PR. On GPDS-160, we obtained a large improvement in
state-of-the-art performance, achieving 1.72% Equal Error Rate, compared to
6.97% in the literature. We also verified that the features generalize beyond
the GPDS dataset, surpassing the state-of-the-art performance in the other
datasets, without requiring the representation to be fine-tuned to each
particular dataset.
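The two-stage pipeline described above — learn a writer-independent embedding once, then fit a simple writer-dependent classifier per user on the extracted features — can be sketched with stand-ins: a fixed random projection replaces the trained CNN, and a centroid-plus-threshold rule replaces the per-user classifier (the paper trains SVMs). Everything below is illustrative, not the paper's implementation.

```python
import numpy as np

def extract_features(images, W):
    # Stand-in for the learned writer-independent CNN embedding phi(x);
    # W is a hypothetical projection replacing the trained network.
    flat = images.reshape(len(images), -1)
    return np.maximum(flat @ W, 0)

def enroll_user(genuine_feats):
    # Writer-dependent step: fit a simple per-user model (centroid plus a
    # distance threshold; the paper fits an SVM per user instead).
    c = genuine_feats.mean(axis=0)
    thr = np.linalg.norm(genuine_feats - c, axis=1).max() * 1.5
    return c, thr

def verify(feat, model):
    # Accept a questioned signature if it lies inside the user's region.
    c, thr = model
    return bool(np.linalg.norm(feat - c) <= thr)

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 16))                            # hypothetical embedding
genuine = 1.0 + rng.normal(0.0, 0.05, size=(10, 8, 8))   # one user's signatures
forgery = 5.0 + rng.normal(0.0, 0.05, size=(1, 8, 8))    # a very crude forgery
model = enroll_user(extract_features(genuine, W))
```

The key point the abstract makes is that the embedding itself is shared across writers, so enrolling a new user only requires the cheap per-user step, with no fine-tuning of the representation.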