PointMap: A real-time memory-based learning system with on-line and post-training pruning
Also published in the International Journal of Hybrid Intelligent Systems, Volume 1, January 2004.
A memory-based learning system called PointMap is a simple and computationally efficient extension of Condensed Nearest Neighbor that allows the user to limit the number of exemplars stored during incremental learning. PointMap evaluates the information value of coding nodes during training, and uses this index to prune uninformative nodes either on-line or after training. These pruning methods allow the user to control both a priori code size and sensitivity to detail in the training data, as well as to determine the code size necessary for accurate performance on a given data set. Coding and pruning computations are local in space, with only the nearest coded neighbor available for comparison with the input; and in time, with only the current input available during coding. Pruning helps solve common problems of traditional memory-based learning systems: large memory requirements, the accompanying slow on-line computations, and sensitivity to noise. PointMap copes with the curse of dimensionality by considering multiple nearest neighbors during testing without increasing the complexity of the training process or the stored code. The performance of PointMap is compared to that of a group of sixteen nearest-neighbor systems on benchmark problems.
This research was supported by grants from the Air Force Office of Scientific Research (AFOSR F49620-98-1-0108, F49620-01-1-0397, and F49620-01-1-0423) and the Office of Naval Research (ONR N00014-01-1-0624).
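The abstract describes PointMap as an extension of Condensed Nearest Neighbor in which exemplars are stored only when the nearest stored exemplar misclassifies the input, and an information-value index is later used to prune uninformative nodes. A minimal sketch of that CNN-style incremental coding with a simple hit-count pruning criterion (the class name, `partial_fit`/`prune` methods, and the hit counter are illustrative assumptions, not PointMap's actual algorithm or API):

```python
import numpy as np

class CondensedNN:
    """Sketch of CNN-style incremental coding with usage-based pruning.

    Illustrative only: a hit counter stands in for PointMap's
    information-value index.
    """

    def __init__(self):
        self.X = []     # stored exemplars
        self.y = []     # their class labels
        self.hits = []  # times each exemplar correctly coded an input

    def partial_fit(self, x, label):
        # Coding is local in time: only the current input is seen,
        # and local in space: only the nearest coded neighbor is compared.
        if not self.X:
            self.X.append(x); self.y.append(label); self.hits.append(0)
            return
        dists = [np.linalg.norm(x - e) for e in self.X]
        j = int(np.argmin(dists))
        if self.y[j] == label:
            self.hits[j] += 1  # nearest exemplar already codes x correctly
        else:
            # Misclassified input becomes a new exemplar.
            self.X.append(x); self.y.append(label); self.hits.append(0)

    def prune(self, min_hits=1):
        # Drop exemplars that rarely helped; controls code size post hoc.
        keep = [i for i, h in enumerate(self.hits) if h >= min_hits]
        self.X = [self.X[i] for i in keep]
        self.y = [self.y[i] for i in keep]
        self.hits = [self.hits[i] for i in keep]

    def predict(self, x):
        dists = [np.linalg.norm(x - e) for e in self.X]
        return self.y[int(np.argmin(dists))]
```

On-line pruning would simply call `prune` periodically during training rather than once afterwards.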
k-Nearest Neighbour Classifiers: 2nd Edition (with Python examples)
Perhaps the most straightforward classifier in the arsenal of machine
learning techniques is the Nearest Neighbour Classifier -- classification is
achieved by identifying the nearest neighbours to a query example and using
those neighbours to determine the class of the query. This approach to
classification is of particular importance because poor run-time performance
is no longer such a problem given the computational power now available.
This paper presents an overview of techniques for Nearest Neighbour
classification, focusing on: mechanisms for assessing similarity (distance),
computational issues in identifying nearest neighbours, and mechanisms for
reducing the dimension of the data.
This paper is the second edition of a paper previously published as a
technical report. Sections on similarity measures for time-series, retrieval
speed-up and intrinsic dimensionality have been added. An Appendix is included
providing access to Python code for the key methods.
Comment: 22 pages, 15 figures: An updated edition of an older tutorial on kNN
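The core procedure the abstract describes, classifying a query by majority vote among its nearest neighbours, can be sketched in a few lines. This is a minimal illustration using Euclidean distance, not the code from the paper's Appendix (the function name and signature are assumptions):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, query, k=3):
    """Classify `query` by majority vote among its k nearest
    training points, under plain Euclidean distance."""
    # Distance from the query to every training point.
    dists = np.linalg.norm(X_train - query, axis=1)
    # Indices of the k smallest distances.
    nearest = np.argsort(dists)[:k]
    # Majority vote over the neighbours' labels.
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]
```

Other similarity measures discussed in the paper (e.g. for time-series) would replace the Euclidean distance computation; the voting step is unchanged.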
Efficient Asymmetric Co-Tracking using Uncertainty Sampling
Adaptive tracking-by-detection approaches are popular for tracking arbitrary
objects. They treat the tracking problem as a classification task and use
online learning techniques to update the object model. However, these
approaches depend heavily on the efficiency and effectiveness of their
detectors. Evaluating a massive number of samples for each frame (e.g.,
obtained by a sliding window) forces the detector to trade accuracy in
favor of speed. Furthermore, misclassification of borderline samples in the
detector introduces accumulating errors in tracking. In this study, we propose
a co-tracking framework based on the efficient cooperation of two detectors: a
rapid adaptive exemplar-based detector and a more sophisticated but slower
detector with a long-term memory. The sample labeling and co-learning of the
detectors are conducted by an uncertainty sampling unit, which improves the
speed and accuracy of the system. We also introduce a budgeting mechanism that
prevents unbounded growth in the number of examples in the first detector,
maintaining its rapid response. Experiments demonstrate the efficiency and
effectiveness of the proposed tracker against its baselines and its superior
performance against state-of-the-art trackers on various benchmark videos.
Comment: Submitted to IEEE ICSIPA'201
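The budgeting idea mentioned above, capping the exemplar-based detector's model so that evaluation stays fast, can be sketched as an eviction policy: when the set is full, the least useful exemplar is discarded. The class, its methods, and the usefulness score are illustrative assumptions; the paper's actual budgeting strategy may differ:

```python
import numpy as np

class BudgetedExemplars:
    """Sketch of a budgeting mechanism for an exemplar-based detector:
    the exemplar set never exceeds `budget`, so nearest-exemplar
    evaluation stays fast. Illustrative only."""

    def __init__(self, budget=100):
        self.budget = budget
        self.exemplars = []  # (feature_vector, label) pairs
        self.scores = []     # running usefulness score per exemplar

    def add(self, feat, label):
        if len(self.exemplars) >= self.budget:
            # Evict the exemplar that has helped least so far.
            worst = int(np.argmin(self.scores))
            self.exemplars.pop(worst)
            self.scores.pop(worst)
        self.exemplars.append((feat, label))
        self.scores.append(0.0)

    def reward(self, idx):
        # Called when exemplar `idx` contributed to a correct decision.
        self.scores[idx] += 1.0
```

The fixed budget bounds both memory and per-frame evaluation cost, which is what preserves the first detector's rapid response.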
An Optimal k Nearest Neighbours Ensemble for Classification Based on an Extended Neighbourhood Rule with Feature Subspaces
To minimize the effect of outliers, kNN ensembles identify a set of the closest
observations to a new sample point and estimate its unknown class by majority
voting over the labels of the training instances in that neighbourhood.
Ordinary kNN-based procedures determine the k closest training observations in
the neighbourhood region (enclosed by a sphere) using a distance formula. The k
nearest neighbours procedure may fail when test points follow the pattern of
nearest observations that lie along a path not contained in the given sphere of
nearest neighbours. Furthermore, these methods combine hundreds of base kNN
learners, many of which may have high classification errors, resulting in poor
ensembles. To overcome these problems, an optimal extended neighbourhood rule
based ensemble is proposed in which the neighbours are determined in k steps.
The rule starts from the sample point nearest to the unseen observation; the
second data point selected is the one closest to the previously selected point.
This process continues until the required number of k observations is obtained.
Each base model in the ensemble is constructed on a bootstrap sample in
conjunction with a random subset of features. After building a sufficiently
large number of base models, the optimal models are then selected based on
their performance on out-of-bag (OOB) data.
Comment: 12 pages
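The k-step selection described above, first neighbour nearest to the query, each subsequent neighbour nearest to the previously chosen point, can be sketched directly. This is a minimal illustration of the extended neighbourhood rule as stated in the abstract, not the authors' implementation (the function name is an assumption):

```python
import numpy as np

def extended_neighbours(X_train, query, k=3):
    """Select k neighbours in a chain: the first is nearest to the
    query; each subsequent one is nearest to the point just selected.
    Returns the indices of the chosen training points, in order."""
    remaining = list(range(len(X_train)))
    anchor = query
    chain = []
    for _ in range(k):
        dists = [np.linalg.norm(X_train[i] - anchor) for i in remaining]
        j = remaining.pop(int(np.argmin(dists)))
        chain.append(j)
        anchor = X_train[j]  # next step starts from the point just chosen
    return chain
```

Unlike an ordinary kNN sphere around the query, this chain can follow data that lies along a path, which is the failure case the abstract describes.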