Approximation and Relaxation Approaches for Parallel and Distributed Machine Learning

Author: Tyree Stephen
Publication venue: Washington University Open Scholarship
Publication date: 15/12/2014

Large-scale machine learning requires tradeoffs. Commonly this tradeoff has led practitioners to choose simpler, less powerful models, e.g. linear models, in order to process more training examples in a limited time. In this work, we introduce parallelism to the training of non-linear models by leveraging a different tradeoff: approximation. We demonstrate various techniques by which non-linear models can be made amenable to larger data sets and significantly more training parallelism by strategically introducing approximation in certain optimization steps. For gradient boosted regression tree ensembles, we replace precise selection of tree splits with a coarse-grained, approximate split selection, yielding both faster sequential training and a significant increase in parallelism, in the distributed setting in particular. For metric learning with nearest neighbor classification, rather than explicitly training a neighborhood structure, we leverage the implicit neighborhood structure induced by task-specific random forest classifiers, yielding a highly parallel method for metric learning. For support vector machines, we follow existing work to learn a reduced basis set with extremely high parallelism, particularly on GPUs, via existing linear algebra libraries. We believe these optimization tradeoffs are widely applicable wherever machine learning is put into practice in large-scale settings. By carefully introducing approximation, we also introduce significantly higher parallelism and consequently can process more training examples for more iterations than competing exact methods. While seemingly learning the model with less precision, this tradeoff often yields noticeably higher accuracy under a restricted training time budget.
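
The coarse-grained split selection mentioned in this abstract can be illustrated with a histogram-based search. This is a minimal sketch of one common way to approximate tree splits, not the author's implementation: the function names, the equal-width binning, and the assumption that the feature range [lo, hi] is known to every worker are all illustrative choices. Each worker summarizes its shard into per-bin statistics, the histograms are summed, and only bin boundaries are considered as split thresholds.

```python
from typing import List, Optional, Tuple

# Per-bin sufficient statistics: [count, sum of y, sum of y^2].
Hist = List[List[float]]


def build_histogram(xs: List[float], ys: List[float],
                    lo: float, hi: float, n_bins: int) -> Hist:
    """Summarize one shard of (feature, target) pairs into equal-width bins.

    Assumes lo < hi and every x lies in [lo, hi] (illustrative assumption).
    """
    width = (hi - lo) / n_bins
    hist = [[0.0, 0.0, 0.0] for _ in range(n_bins)]
    for x, y in zip(xs, ys):
        b = min(int((x - lo) / width), n_bins - 1)  # clamp x == hi into last bin
        hist[b][0] += 1
        hist[b][1] += y
        hist[b][2] += y * y
    return hist


def merge_histograms(a: Hist, b: Hist) -> Hist:
    """Bin statistics are additive, so shards combine by element-wise sum."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def best_split(hist: Hist, lo: float, hi: float) -> Tuple[Optional[float], float]:
    """Return (threshold, squared error) of the best split at a bin boundary."""
    n_bins = len(hist)
    width = (hi - lo) / n_bins
    tot_n = sum(r[0] for r in hist)
    tot_s = sum(r[1] for r in hist)
    tot_q = sum(r[2] for r in hist)
    best_thr, best_sse = None, float("inf")
    ln = ls = lq = 0.0  # running left-side count, sum, sum of squares
    for b in range(n_bins - 1):
        ln += hist[b][0]
        ls += hist[b][1]
        lq += hist[b][2]
        rn = tot_n - ln
        if ln == 0 or rn == 0:
            continue  # a split must leave data on both sides
        # Sum of squared errors around each side's mean.
        sse = (lq - ls * ls / ln) + ((tot_q - lq) - (tot_s - ls) ** 2 / rn)
        if sse < best_sse:
            best_thr, best_sse = lo + (b + 1) * width, sse
    return best_thr, best_sse
```

Because only the small histograms need to be exchanged, the cost of communication depends on the number of bins rather than the number of training examples, which is the kind of tradeoff between precision and parallelism the abstract describes.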

Washington University St. Louis: Open Scholarship

Handwritten Digit Recognition and Classification Using Machine Learning

Author: Zhao Ke
Publication venue: Dublin Institute of Technology
Publication date: 01/01/2018

In this paper, multiple learning techniques based on optical character recognition (OCR) for handwritten digit recognition are examined, and a new accuracy level for recognition of the MNIST dataset is reported. The proposed framework involves three primary parts: image pre-processing, feature extraction and classification. This study strives to raise the recognition accuracy to more than 99% in handwritten digit recognition. As will be seen, pre-processing and feature extraction play crucial roles in this experiment to reach the highest accuracy.

Arrow@TUDublin

A visual approach to sketched symbol recognition

Author: Davis Randall
Ouyang Tom Yu
Publication venue: Morgan Kaufmann Publishers Inc.
Publication date: 01/01/2009

There is increasing interest in building systems that can automatically interpret hand-drawn sketches. However, many challenges remain in terms of recognition accuracy, robustness to different drawing styles, and ability to generalize across multiple domains. To address these challenges, we propose a new approach to sketched symbol recognition that focuses on the visual appearance of the symbols. This allows us to better handle the range of visual and stroke-level variations found in freehand drawings. We also present a new symbol classifier that is computationally efficient and invariant to rotation and local deformations. We show that our method exceeds state-of-the-art performance on all three domains we evaluated: handwritten digits, PowerPoint shapes, and electrical circuit symbols.

CiteSeerX

DSpace@MIT

Fat-Fast VG-RAM WNN: A high performance approach

Author: Alberto F. De Souza
Artur d’Avila Garcez
Avelino Forechi
Claudine Badue
Edilson de Aguiar
Jorcy de Oliveira Neto
Thiago Oliveira-Santos
Publication venue: Elsevier BV
Publication date: 01/03/2016

The Virtual Generalizing Random Access Memory Weightless Neural Network (VG-RAM WNN) is a type of WNN that only requires storage capacity proportional to the training set. As such, it is an effective machine learning technique that offers simple implementation and fast training, which can be done in one shot. However, the VG-RAM WNN test time for applications that require many training samples can be large, since it increases with the size of the memory of each neuron. In this paper, we present Fat-Fast VG-RAM WNNs. Fat-Fast VG-RAM WNNs employ multi-index chained hashing for fast neuron memory search. Our chained hashing technique increases the VG-RAM memory consumption (fat) but reduces test time substantially (fast), while keeping most of its machine learning performance. To address the memory consumption problem, we employ a data clustering technique to reduce the overall size of the neurons’ memory. This is achieved by replacing clusters of neurons’ memory with their respective centroid values. With our approach, we were able to reduce VG-RAM WNN test time and memory footprint while maintaining high machine learning performance. We performed experiments with the Fat-Fast VG-RAM WNN applied to two recognition problems: (i) handwritten digit recognition, and (ii) traffic sign recognition. Our experimental results showed that, in both recognition problems, our new VG-RAM WNN approach was able to run three orders of magnitude faster and consume two orders of magnitude less memory than standard VG-RAM, while experiencing only a small reduction in recognition performance.
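
To make the test-time cost described in this abstract concrete, here is a minimal sketch of a single VG-RAM-style neuron, assuming binary inputs packed into Python integers. The class and method names are hypothetical, and the sketch shows only the baseline linear memory scan; it does not implement the multi-index chained hashing or the clustering step that the paper proposes.

```python
from typing import Hashable, List, Tuple


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two bit patterns."""
    return bin(a ^ b).count("1")


class VGRAMNeuron:
    """Illustrative neuron: memorizes (bit pattern, label) pairs."""

    def __init__(self) -> None:
        self.memory: List[Tuple[int, Hashable]] = []

    def train(self, bits: int, label: Hashable) -> None:
        # One-shot training: simply store the pair, so memory grows
        # in proportion to the number of training samples.
        self.memory.append((bits, label))

    def predict(self, bits: int) -> Hashable:
        if not self.memory:
            raise ValueError("neuron has no stored patterns")
        # Linear scan for the stored pattern closest in Hamming distance.
        # Ties go to the earliest stored pair here, a simplification.
        _, label = min(self.memory, key=lambda pair: hamming(pair[0], bits))
        return label
```

The scan in `predict` touches every stored pair, which is why test time grows with the size of each neuron's memory and why a faster search structure is attractive.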

City Research Online

Crossref

On the Brittleness of Handwritten Digit Recognition Models

Author
Publication venue: Hindawi Limited
Publication date

Crossref

One-Class Classification: Taxonomy of Study and Review of Techniques

Author: Khan Shehroz S.
Madden Michael G.
Publication venue: Cambridge University Press (CUP)
Publication date: 29/11/2013

One-class classification (OCC) algorithms aim to build classification models when the negative class is either absent, poorly sampled or not well defined. This unique situation constrains the learning of efficient classifiers, because the class boundary must be defined using knowledge of the positive class alone. The OCC problem has been considered and applied under many research themes, such as outlier/novelty detection and concept learning. In this paper we present a unified view of the general problem of OCC by presenting a taxonomy of study for OCC problems, which is based on the availability of training data, the algorithms used and the application domains. We further delve into each of the categories of the proposed taxonomy and present a comprehensive literature review of the OCC algorithms, techniques and methodologies with a focus on their significance, limitations and applications. We conclude our paper by discussing some open research problems in the field of OCC and present our vision for future research.
Comment: 24 pages + 11 pages of references, 8 figures
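
As a toy illustration of the setting described in this abstract, the sketch below fits a boundary from positive examples only. It is a deliberately simple centroid-and-radius rule chosen for illustration, not one of the techniques surveyed in the paper; the class name, the `quantile` parameter, and its default are hypothetical.

```python
import math
from typing import List, Sequence


class CentroidOneClass:
    """Accept points within a radius of the positive-class centroid."""

    def fit(self, points: List[Sequence[float]],
            quantile: float = 0.95) -> "CentroidOneClass":
        dim = len(points[0])
        # Centroid of the positive examples; no negative data is used.
        self.center = [sum(p[i] for p in points) / len(points)
                       for i in range(dim)]
        dists = sorted(math.dist(p, self.center) for p in points)
        # Radius covering roughly the chosen fraction of training points.
        idx = min(int(quantile * len(dists)), len(dists) - 1)
        self.radius = dists[idx]
        return self

    def is_inlier(self, point: Sequence[float]) -> bool:
        return math.dist(point, self.center) <= self.radius
```

The boundary here is determined entirely by the positive class, which is the constraint the abstract highlights; anything far from the training data is flagged as an outlier without ever seeing a negative example.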

arXiv.org e-Print Archive

Access to Research at National University of Ireland, Galway

Learning with Single View Co-training and Marginalized Dropout

Author: Chen Minmin
Publication venue: Washington University Open Scholarship
Publication date: 19/03/2013

The generalization properties of most existing machine learning techniques are predicated on the assumptions that 1) a sufficiently large quantity of training data is available; 2) the training and testing data come from some common distribution. Although these assumptions are often met in practice, there are also many scenarios in which training data from the relevant distribution is insufficient. We focus on making use of additional data, which is readily available or can be obtained easily but comes from a different distribution than the testing data, to aid learning. We present five learning scenarios, depending on how the distribution used to sample the additional training data differs from the testing distribution: 1) learning with weak supervision; 2) domain adaptation; 3) learning from multiple domains; 4) learning from corrupted data; 5) learning with partial supervision. We introduce two strategies and manifest them in five ways to cope with the difference between the training and testing distributions. The first strategy, which gives rise to Pseudo Multi-view Co-training (PMC) and Co-training for Domain Adaptation (CODA), is inspired by the co-training algorithm for multi-view data. PMC generalizes co-training to the more common single-view data and allows us to learn from weakly labeled data retrieved for free from the web. CODA integrates PMC with an additional feature selection component to address the feature incompatibility between domains for domain adaptation. PMC and CODA are evaluated on a variety of real datasets, and both yield record performance. The second strategy, marginalized dropout, leads to marginalized Stacked Denoising Autoencoders (mSDA), Marginalized Corrupted Features (MCF) and FastTag. mSDA diminishes the difference between distributions associated with different domains by learning a new representation through marginalized corruption and reconstruction. MCF learns from a known distribution which is created by corrupting a small set of training data, and improves robustness of learned classifiers by training on "infinitely" many data sampled from the distribution. FastTag applies marginalized dropout to the output of partially labeled data to recover missing labels for multi-label tasks. These three algorithms not only achieve state-of-the-art performance in various tasks, but also deliver orders-of-magnitude speedups in training and testing compared to competing algorithms.
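
The idea of training on "infinitely" many corrupted samples can be made concrete with a short worked example. The special case below, a linear model with squared loss under unbiased dropout noise with rate q, is an assumption chosen for illustration and is not taken from the abstract; the symbols w, x, y, q and b_d are likewise introduced here.

```latex
% Unbiased dropout: each feature is zeroed with probability q and
% rescaled otherwise, so that E[\tilde{x}_d] = x_d.
\tilde{x}_d = \frac{b_d}{1-q}\, x_d, \qquad b_d \sim \mathrm{Bernoulli}(1-q)

% Mean and variance of each corrupted feature:
\mathbb{E}[\tilde{x}_d] = x_d, \qquad
\mathrm{Var}[\tilde{x}_d] = \frac{q}{1-q}\, x_d^{2}

% Expected squared loss over all corruptions, using independence across d:
\mathbb{E}\!\left[(w^{\top}\tilde{x} - y)^{2}\right]
  = (w^{\top}x - y)^{2} + \frac{q}{1-q} \sum_{d} w_d^{2}\, x_d^{2}
```

Under these assumptions the expectation has a closed form, so the effect of infinitely many corrupted copies is captured by the original loss plus a data-dependent penalty, with no need to sample any corrupted data explicitly.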

Washington University St. Louis: Open Scholarship