862 research outputs found

    Practical recommendations for gradient-based training of deep architectures

    Learning algorithms related to artificial neural networks, and in particular Deep Learning, may seem to involve many bells and whistles known as hyper-parameters. This chapter is meant as a practical guide with recommendations for some of the most commonly used hyper-parameters, in particular in the context of learning algorithms based on back-propagated gradients and gradient-based optimization. It also discusses how to deal with the fact that more interesting results can be obtained when one is allowed to adjust many hyper-parameters. Overall, it describes elements of the practice used to successfully and efficiently train and debug large-scale and often deep multi-layer neural networks. It closes with open questions about the training difficulties observed with deeper architectures.
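
    To make the kind of hyper-parameters discussed in the chapter concrete, the sketch below shows minibatch gradient-based training with early stopping. It is an illustration only, not material from the chapter: the data, the logistic-regression model, and every hyper-parameter value are assumptions chosen for the example.

```python
import numpy as np

# Illustrative hyper-parameters of the kind the chapter discusses
# (the specific values are assumptions, not recommendations from the text).
learning_rate = 0.01      # step size for gradient descent
batch_size = 64           # number of examples per minibatch
max_epochs = 100          # upper bound on passes over the training set
patience = 5              # early-stopping patience (epochs without improvement)

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))                  # hypothetical inputs
y = (X @ rng.normal(size=20) > 0).astype(float)  # hypothetical binary targets
w, b = np.zeros(20), 0.0

def forward(X):
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))    # logistic-regression output

best_loss, bad_epochs = np.inf, 0
for epoch in range(max_epochs):
    for start in range(0, len(X), batch_size):
        xb, yb = X[start:start + batch_size], y[start:start + batch_size]
        p = forward(xb)
        grad_w = xb.T @ (p - yb) / len(xb)       # gradient of the cross-entropy loss
        grad_b = np.mean(p - yb)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    loss = -np.mean(y * np.log(forward(X) + 1e-12)
                    + (1 - y) * np.log(1 - forward(X) + 1e-12))
    if loss < best_loss - 1e-4:
        best_loss, bad_epochs = loss, 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:               # stop when no longer improving
            break
```

    The chapter's advice concerns how to choose and tune quantities like the learning rate, minibatch size, and early-stopping criterion, normally against a held-out validation set rather than the training loss used here for brevity.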

    CANCER DETECTION FOR LOW GRADE SQUAMOUS ENTRAEPITHELIAL LESION

    The National Cancer Institute estimates that in 2012 about 577,190 Americans are expected to die of cancer, more than 1,500 people a day. Cancer is the second most common cause of death in the US, accounting for nearly 1 of every 4 deaths. Cancer diagnosis plays a very important role in the early detection and treatment of cancer, and automating the diagnosis process can significantly reduce the number of falsely identified or unidentified cases. The aim of this thesis is to demonstrate different machine learning approaches for cancer detection. Dr. Tawfik, a pathologist from the University of Kansas Medical Center (KUMC), is the inventor of a novel pathology tissue slicer. The data used in this study come from this slicer, which enables semi-automated cancer diagnosis and has the potential to improve patient care. In this study the slides are processed, visual features are computed, and the dataset is built from scratch. After feature extraction, different machine learning approaches, which have shown their capability of extracting high-level representations from high-dimensional data, are applied to the dataset. Support Vector Machines and Deep Belief Networks (DBN) are the focus of this study. In the first section, a Support Vector Machine is applied to the dataset. Next, a Deep Belief Network, which is capable of extracting features in an unsupervised manner, is implemented and fine-tuned with back-propagation. The results show that a DBN can be effective when applied to cytological cancer diagnosis by increasing the accuracy of cancer detection. In the last section a subset of the DBN features is selected, appended to the raw features, and used to train and test a Support Vector Machine, which improves on the results of the first section. In the end the study concludes that Deep Belief Networks can be successfully used over other leading classification methods for cancer detection.
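
    As a rough illustration of the final pipeline described above (unsupervised feature learning followed by an SVM trained on raw plus learned features), here is a sketch using scikit-learn. It is not the thesis's implementation: the data are synthetic stand-ins for the slide-derived visual features, a single BernoulliRBM layer stands in for the DBN (omitting the stacking and back-propagation fine-tuning), and all parameter values are assumptions.

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the slide-derived visual features described above.
rng = np.random.default_rng(0)
X = rng.random((500, 100))          # 500 slides, 100 raw visual features in [0, 1]
y = rng.integers(0, 2, size=500)    # hypothetical benign/malignant labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Unsupervised feature learning with an RBM (one greedily trained DBN layer;
# the thesis additionally fine-tunes the stacked network with back-propagation).
rbm = BernoulliRBM(n_components=64, learning_rate=0.05, n_iter=20, random_state=0)
H_tr = rbm.fit_transform(X_tr)
H_te = rbm.transform(X_te)

# Final stage described in the abstract: append the learned features to the
# raw features and train a Support Vector Machine on the combined representation.
svm = SVC(kernel="rbf", C=1.0)
svm.fit(np.hstack([X_tr, H_tr]), y_tr)
print("held-out accuracy:", svm.score(np.hstack([X_te, H_te]), y_te))
```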

    Deep Self-Taught Learning for Handwritten Character Recognition

    Recent theoretical and empirical work in statistical machine learning has demonstrated the importance of learning algorithms for deep architectures, i.e., function classes obtained by composing multiple non-linear transformations. Self-taught learning (exploiting unlabeled examples or examples from other distributions) has already been applied to deep learners, but mostly to show the advantage of unlabeled examples. Here we explore the advantage brought by out-of-distribution examples. For this purpose we developed a powerful generator of stochastic variations and noise processes for character images, including not only affine transformations but also slant, local elastic deformations, changes in thickness, background images, grey level changes, contrast, occlusion, and various types of noise. The out-of-distribution examples are obtained from these highly distorted images or by including examples of object classes different from those in the target test set. We show that deep learners benefit more from out-of-distribution examples than a corresponding shallow learner, at least in the area of handwritten character recognition. In fact, we show that they beat previously published results and reach human-level performance on both handwritten digit classification and 62-class handwritten character recognition.
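
    As an illustration of the kind of stochastic distortion pipeline the abstract describes (affine transformations, local elastic deformations, grey-level/contrast changes, and noise), here is a minimal sketch. It is not the paper's generator: the transformation ranges, the helper name distort, and the 32x32 example image are all assumptions made for the example.

```python
import numpy as np
from scipy.ndimage import affine_transform, gaussian_filter, map_coordinates

def distort(img, rng, rotation=0.2, elastic_alpha=8.0, elastic_sigma=3.0, noise_std=0.1):
    """Apply a random affine transform, elastic deformation, and noise to a 2-D
    greyscale character image with values in [0, 1]. Parameter ranges are
    illustrative assumptions, not the paper's generator settings."""
    h, w = img.shape

    # Random small rotation plus shear (the affine component of the pipeline).
    theta = rng.uniform(-rotation, rotation)
    shear = rng.uniform(-0.1, 0.1)
    mat = np.array([[np.cos(theta), -np.sin(theta) + shear],
                    [np.sin(theta),  np.cos(theta)]])
    center = np.array([h, w]) / 2.0
    offset = center - mat @ center
    out = affine_transform(img, mat, offset=offset, order=1, mode="constant")

    # Local elastic deformation: a smoothed random displacement field.
    dx = gaussian_filter(rng.uniform(-1, 1, (h, w)), elastic_sigma) * elastic_alpha
    dy = gaussian_filter(rng.uniform(-1, 1, (h, w)), elastic_sigma) * elastic_alpha
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    out = map_coordinates(out, [yy + dy, xx + dx], order=1, mode="constant")

    # Random contrast change plus additive Gaussian noise.
    out = out * rng.uniform(0.7, 1.0) + rng.normal(0, noise_std, (h, w))
    return np.clip(out, 0.0, 1.0)

rng = np.random.default_rng(0)
example = rng.random((32, 32))        # stand-in for a 32x32 character image
distorted = distort(example, rng)     # one highly distorted variant of the input
```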

    Deep Learning of Representations: Looking Forward

    Deep learning research aims at discovering learning algorithms that discover multiple levels of distributed representations, with higher levels representing more abstract concepts. Although the study of deep learning has already led to impressive theoretical results, learning algorithms and breakthrough experiments, several challenges lie ahead. This paper proposes to examine some of these challenges, centering on the questions of scaling deep learning algorithms to much larger models and datasets, reducing optimization difficulties due to ill-conditioning or local minima, designing more efficient and powerful inference and sampling procedures, and learning to disentangle the factors of variation underlying the observed data. It also proposes a few forward-looking research directions aimed at overcoming these challenges.