327 research outputs found
k-Nearest Neighbour Classifiers: 2nd Edition (with Python examples)
Perhaps the most straightforward classifier in the arsenal or machine
learning techniques is the Nearest Neighbour Classifier -- classification is
achieved by identifying the nearest neighbours to a query example and using
those neighbours to determine the class of the query. This approach to
classification is of particular importance because issues of poor run-time
performance is not such a problem these days with the computational power that
is available. This paper presents an overview of techniques for Nearest
Neighbour classification focusing on; mechanisms for assessing similarity
(distance), computational issues in identifying nearest neighbours and
mechanisms for reducing the dimension of the data.
This paper is the second edition of a paper previously published as a
technical report. Sections on similarity measures for time-series, retrieval
speed-up and intrinsic dimensionality have been added. An Appendix is included
providing access to Python code for the key methods.Comment: 22 pages, 15 figures: An updated edition of an older tutorial on kN
k-Nearest Neighbour Classifiers - A Tutorial
Perhaps the most straightforward classifier in the arsenal or Machine Learning techniques is the Nearest Neighbour Classifier – classification is achieved by identifying the nearest neighbours to a query example and using those neighbours to determine the class of the query. This approach to classification is of particular importance because issues of poor run-time performance is not such a problem these days with the computational power that is available. This paper presents an overview of techniques for Nearest Neighbour classification focusing on; mechanisms for assessing similarity (distance), computational issues in identifying nearest neighbours and mechanisms for reducing the dimension of the data.This paper is the second edition of a paper previously published as a technical report . Sections on similarity measures for time-series, retrieval speed-up and intrinsic dimensionality have been added. An Appendix is included providing access to Python code for the key methods
Estudio de métodos de construcción de ensembles de clasificadores y aplicaciones
La inteligencia artificial se dedica a la creación de sistemas informáticos con un comportamiento inteligente. Dentro de este área el aprendizaje computacional estudia la creación de sistemas que aprenden por sí mismos.
Un tipo de aprendizaje computacional es el aprendizaje supervisado, en el cual, se le proporcionan al sistema tanto las entradas como la salida esperada y el sistema aprende a partir de estos datos. Un sistema de este tipo se denomina clasificador.
En ocasiones ocurre, que en el conjunto de ejemplos que utiliza el sistema para aprender, el número de ejemplos de un tipo es mucho mayor que el número de ejemplos de otro tipo. Cuando esto ocurre se habla de conjuntos desequilibrados.
La combinación de varios clasificadores es lo que se denomina "ensemble", y a menudo ofrece mejores resultados que cualquiera de los miembros que lo forman. Una de las claves para el buen funcionamiento de los ensembles es la diversidad.
Esta tesis, se centra en el desarrollo de nuevos algoritmos de construcción de ensembles, centrados en técnicas de incremento de la diversidad y en los problemas desequilibrados. Adicionalmente, se aplican estas técnicas a la solución de varias problemas industriales.Ministerio de Economía y Competitividad, proyecto TIN-2011-2404
Colour local feature fusion for image matching and recognition
This thesis investigates the use of colour information for local image feature extraction. The work is motivated by the inherent limitation of the most widely used state of the art local feature techniques, caused by their disregard of colour information. Colour contains important information that improves the description of the world around us, and by disregarding it; chromatic edges may be lost and thus decrease the level of saliency and distinctiveness of the resulting grayscale image. This thesis addresses the question of whether colour can improve the distinctive and descriptive capabilities of local features, and if this leads to better performances in image feature matching and object recognition applications. To ensure that the developed local colour features are robust to general imaging conditions and capable for real-world applications, this work utilises the most prominent photometric colour invariant gradients from the literature. The research addresses several limitations of previous studies that used colour invariants, by implementing robust local colour features in the form of a Harris-Laplace interest region detection and a SIFT description which characterises the detected image region. Additionally, a comprehensive and rigorous evaluation is performed, that compares the largest number of colour invariants of any previous study. This research provides for the first time, conclusive findings on the capability of the chosen colour invariants for practical real-world computer vision tasks. The last major aspect of the research involves the proposal of a feature fusion extraction strategy, that uses grayscale intensity and colour information conjointly. Two separate fusion approaches are implemented and evaluated, one for local feature matching tasks and another approach for object recognition. Results from the fusion analysis strongly indicate, that the colour invariants contain unique and useful information that can enhance the performance of techniques that use grayscale only based features
Recommended from our members
Homogeneous vector capsules and their application to sufficient and complete data
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University LondonCapsules (vector-valued neurons) have recently become a more active area of research
in neural networks. However, existing formulations have several drawbacks including
the large number of trainable parameters that they require as well as the reliance on
routing mechanisms between layers of capsules.
The primary aim of this project is to demonstrate the benefits of a new formulation
of capsules called Homogeneous Vector Capsules (HVCs) that overcome these
drawbacks.
Using HVCs, new state-of-the-art accuracies for the MNIST dataset are established
for multiple individual models as well as multiple ensembles.
This work additionally presents a dataset consisting of high-resolution images of
13 micro-PCBs captured in various rotations and perspectives relative to the camera,
with each sample labeled for PCB type, rotation category, and perspective categories.
Experiments performed and elucidated in this work examine classification accuracy of
rotations and perspectives that were not trained on as well as the ability to artificially
generate missing rotations and perspectives during training. The results of these
experiments include showing that using HVCs is superior to using fully connected
layers.
This work also showed that certain training samples are more informative of class
membership than others. These samples can be identified prior to training by analyzing
their position in reduced dimensional space relative to the classes’ centroids in that
space. And a definition and calculation both for class density and dataset completeness
based on the distribution of data in the reduced dimensional space has been put forth.
Experimentation using the dataset completeness calculation shows that those datasets
that meet a certain completeness threshold can be trained on a subset of the total
dataset, based on each class’s density, while improving upon or maintaining validation
accuracy
3D exemplar-based image inpainting in electron microscopy
In electron microscopy (EM) a common problem is the non-availability of data, which causes artefacts in reconstructions. In this thesis the goal is to generate artificial data where missing in EM by using exemplar-based inpainting (EBI). We implement an accelerated 3D version tailored to applications in EM, which reduces reconstruction times from days to minutes. We develop intelligent sampling strategies to find optimal data as input for reconstruction methods. Further, we investigate approaches to reduce electron dose and acquisition time. Sparse sampling followed by inpainting is the most promising approach. As common evaluation measures may lead to misinterpretation of results in EM and falsify a subsequent analysis, we propose to use application driven metrics and demonstrate this in a segmentation task. A further application of our technique is the artificial generation of projections in tiltbased EM. EBI is used to generate missing projections, such that the full angular range is covered. Subsequent reconstructions are significantly enhanced in terms of resolution, which facilitates further analysis of samples. In conclusion, EBI proves promising when used as an additional data generation step to tackle the non-availability of data in EM, which is evaluated in selected applications. Enhancing adaptive sampling methods and refining EBI, especially considering the mutual influence, promotes higher throughput in EM using less electron dose while not lessening quality.Ein häufig vorkommendes Problem in der Elektronenmikroskopie (EM) ist die Nichtverfügbarkeit von Daten, was zu Artefakten in Rekonstruktionen führt. In dieser Arbeit ist es das Ziel fehlende Daten in der EM künstlich zu erzeugen, was durch Exemplar-basiertes Inpainting (EBI) realisiert wird. Wir implementieren eine auf EM zugeschnittene beschleunigte 3D Version, welche es ermöglicht, Rekonstruktionszeiten von Tagen auf Minuten zu reduzieren. Wir entwickeln intelligente Abtaststrategien, um optimale Datenpunkte für die Rekonstruktion zu erhalten. Ansätze zur Reduzierung von Elektronendosis und Aufnahmezeit werden untersucht. Unterabtastung gefolgt von Inpainting führt zu den besten Resultaten. Evaluationsmaße zur Beurteilung der Rekonstruktionsqualität helfen in der EM oft nicht und können zu falschen Schlüssen führen, weswegen anwendungsbasierte Metriken die bessere Wahl darstellen. Dies demonstrieren wir anhand eines Beispiels. Die künstliche Erzeugung von Projektionen in der neigungsbasierten Elektronentomographie ist eine weitere Anwendung. EBI wird verwendet um fehlende Projektionen zu generieren. Daraus resultierende Rekonstruktionen weisen eine deutlich erhöhte Auflösung auf. EBI ist ein vielversprechender Ansatz, um nicht verfügbare Daten in der EM zu generieren. Dies wird auf Basis verschiedener Anwendungen gezeigt und evaluiert. Adaptive Aufnahmestrategien und EBI können also zu einem höheren Durchsatz in der EM führen, ohne die Bildqualität merklich zu verschlechtern
- …