Cancer diagnosis using deep learning: A bibliographic review
In this paper, we first describe the basics of the field of cancer diagnosis: the steps of cancer diagnosis, followed by the classification methods typically used by doctors, giving readers a historical view of cancer classification techniques. These methods include the Asymmetry, Border, Color and Diameter (ABCD) method, the seven-point detection method, the Menzies method, and pattern analysis. They are used regularly by doctors for cancer diagnosis, although their diagnostic performance is limited. Moreover, for the benefit of a broad audience, the basic evaluation criteria are also discussed. The criteria include the receiver operating characteristic (ROC) curve, the area under the ROC curve (AUC), F1 score, accuracy, specificity, sensitivity, precision, Dice coefficient, average accuracy, and Jaccard index. Because the traditional methods are considered inefficient, better and smarter methods for cancer diagnosis are needed. Artificial intelligence applied to cancer diagnosis is gaining attention as a way to build better diagnostic tools; in particular, deep neural networks can be used successfully for intelligent image analysis. The basic framework of how such machine learning operates on medical imaging, i.e., pre-processing, image segmentation, and post-processing, is provided in this study. The second part of this manuscript describes different deep learning techniques, such as convolutional neural networks (CNNs), generative adversarial networks (GANs), deep autoencoders (DAEs), restricted Boltzmann machines (RBMs), stacked autoencoders (SAEs), convolutional autoencoders (CAEs), recurrent neural networks (RNNs), long short-term memory (LSTM), the multi-scale convolutional neural network (M-CNN), and the multi-instance learning convolutional neural network (MIL-CNN). For each technique, we provide Python code, allowing interested readers to experiment with the cited algorithms on their own diagnostic problems.
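The evaluation criteria listed above can all be derived from the four entries of a binary confusion matrix. As an illustrative sketch (not code from the review itself), the following computes the scalar metrics from 0/1 labels:

```python
def binary_metrics(y_true, y_pred):
    """Common diagnostic evaluation metrics from binary labels (0/1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision   = tp / (tp + fp)
    sensitivity = tp / (tp + fn)               # recall / true positive rate
    specificity = tn / (tn + fp)
    accuracy    = (tp + tn) / len(y_true)
    f1          = 2 * precision * sensitivity / (precision + sensitivity)
    dice        = 2 * tp / (2 * tp + fp + fn)  # equals F1 for binary labels
    jaccard     = tp / (tp + fp + fn)
    return {"precision": precision, "sensitivity": sensitivity,
            "specificity": specificity, "accuracy": accuracy,
            "f1": f1, "dice": dice, "jaccard": jaccard}
```

The ROC curve and AUC additionally require ranked prediction scores rather than hard labels, so they are not computable from the confusion matrix alone.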
The third part of this manuscript compiles the deep learning models successfully applied to different types of cancer. Given the length of the manuscript, we restrict the discussion to breast cancer, lung cancer, brain cancer, and skin cancer. The purpose of this bibliographic review is to give researchers who opt to implement deep learning and artificial neural networks for cancer diagnosis a from-scratch overview of the state-of-the-art achievements in the field.
Deep Learning for Environmentally Robust Speech Recognition: An Overview of Recent Developments
Eliminating the negative effect of non-stationary environmental noise is a
long-standing research topic for automatic speech recognition that still
remains an important challenge. Data-driven supervised approaches, including
ones based on deep neural networks, have recently emerged as potential
alternatives to traditional unsupervised approaches and, with sufficient
training, can alleviate the shortcomings of the unsupervised methods in various
real-life acoustic environments. In this light, we review recently developed,
representative deep learning approaches for tackling non-stationary additive
and convolutional degradation of speech with the aim of providing guidelines
for those involved in the development of environmentally robust speech
recognition systems. We separately discuss single- and multi-channel techniques
developed for the front-end and back-end of speech recognition systems, as well
as joint front-end and back-end training frameworks.
Deep Learning Techniques for Music Generation -- A Survey
This paper is a survey and an analysis of different ways of using deep
learning (deep artificial neural networks) to generate musical content. We
propose a methodology based on five dimensions for our analysis:
Objective - What musical content is to be generated? Examples are: melody,
polyphony, accompaniment or counterpoint. - For what destination and for what
use? To be performed by humans (in the case of a musical score), or by a
machine (in the case of an audio file).
Representation - What are the concepts to be manipulated? Examples are:
waveform, spectrogram, note, chord, meter and beat. - What format is to be
used? Examples are: MIDI, piano roll or text. - How will the representation be
encoded? Examples are: scalar, one-hot or many-hot.
Architecture - What type(s) of deep neural network is (are) to be used?
Examples are: feedforward network, recurrent network, autoencoder or generative
adversarial networks.
Challenge - What are the limitations and open challenges? Examples are:
variability, interactivity and creativity.
Strategy - How do we model and control the process of generation? Examples
are: single-step feedforward, iterative feedforward, sampling or input
manipulation.
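The encoding choices named under the Representation dimension are easy to make concrete. As a small sketch (the pitch vocabulary below is hypothetical, chosen only for illustration): a single melody note is naturally one-hot, while a chord, several simultaneous notes, is many-hot:

```python
# Hypothetical pitch vocabulary for illustration only.
PITCHES = ["C4", "D4", "E4", "F4", "G4", "A4", "B4"]

def one_hot(pitch):
    """One-hot encoding: exactly one active entry (a single melody note)."""
    return [1 if p == pitch else 0 for p in PITCHES]

def many_hot(chord):
    """Many-hot encoding: several active entries (the notes of a chord)."""
    return [1 if p in chord else 0 for p in PITCHES]
```

Stacking such vectors over successive time steps yields a piano-roll representation, one of the formats the survey discusses.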
For each dimension, we conduct a comparative analysis of various models and
techniques and we propose a tentative multidimensional typology. This
typology is bottom-up, based on the analysis of many existing deep-learning
based systems for music generation selected from the relevant literature. These
systems are described and are used to exemplify the various choices of
objective, representation, architecture, challenge and strategy. The last
section includes some discussion and some prospects.
Comment: 209 pages. This paper is a simplified version of the book: J.-P.
Briot, G. Hadjeres and F.-D. Pachet, Deep Learning Techniques for Music
Generation, Computational Synthesis and Creative Systems, Springer, 201
Connectionist multivariate density-estimation and its application to speech synthesis
Autoregressive models factorize a multivariate joint probability distribution into a
product of one-dimensional conditional distributions. The variables are assigned
an ordering, and the conditional distribution of each variable modelled using all
variables preceding it in that ordering as predictors.
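The factorization described above can be sketched in a few lines. The toy model below (an illustrative stand-in, not the thesis's NADE code) models each binary variable with a logistic regression on the variables preceding it; because every conditional is normalized, the joint is normalized by construction:

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def log_prob(x, weights, biases):
    """Log-probability of a binary vector x under a toy autoregressive model.

    p(x) = prod_d p(x_d | x_1, ..., x_{d-1}), where each conditional is a
    logistic regression on the preceding variables in the ordering.
    """
    lp = 0.0
    for d in range(len(x)):
        # activation depends only on variables earlier in the ordering
        a = biases[d] + sum(weights[d][j] * x[j] for j in range(d))
        p = sigmoid(a)
        lp += math.log(p) if x[d] == 1 else math.log(1.0 - p)
    return lp
```

Evaluating p(x) costs one pass over the D conditionals, which is the polynomial-time normalization property contrasted with intractable models such as RBMs below.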
Calculating normalized probabilities and sampling both have polynomial computational
complexity under autoregressive models. Moreover, binary autoregressive
models based on neural networks achieve statistical performance similar to that of
some intractable models, like restricted Boltzmann machines, on several datasets.
The use of autoregressive probability density estimators based on neural
networks to model real-valued data, while proposed before, has never been properly
investigated and reported. In this thesis we extend the formulation of neural
autoregressive distribution estimators (NADE) to real-valued data; a model we call
the real-valued neural autoregressive density estimator (RNADE). Its statistical
performance on several datasets, including visual and auditory data, is reported
and compared to that of other models. RNADE obtained higher test likelihoods
than other tractable models, while retaining all the attractive computational
properties of autoregressive models.
However, autoregressive models are limited by the ordering of the variables
inherent to their formulation. Marginalization and imputation tasks can only be
solved analytically if the missing variables are at the end of the ordering. We
present a new training technique that obtains a set of parameters that can be
used for any ordering of the variables. By choosing a model with a convenient
ordering of the dimensions at test time, it is possible to solve any marginalization
or imputation task analytically.
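The reason trailing variables marginalize out for free is that the product of the first d conditionals is exactly the marginal of the first d variables. A self-contained toy sketch (same illustrative logistic-conditional model, not the thesis's implementation) makes this checkable:

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def prefix_log_prob(x_prefix, weights, biases):
    """Log of the marginal p(x_1, ..., x_d) under a toy binary
    autoregressive model, for any prefix length d.

    Each conditional depends only on earlier variables, so summing the
    joint over all trailing variables leaves just the first d factors:
    no explicit summation is ever needed.
    """
    lp = 0.0
    for d, xd in enumerate(x_prefix):
        a = biases[d] + sum(weights[d][j] * x_prefix[j] for j in range(d))
        p = sigmoid(a)
        lp += math.log(p) if xd == 1 else math.log(1.0 - p)
    return lp
```

This is why a model whose ordering places the missing variables last can answer marginalization and imputation queries analytically.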
The same training procedure also makes it practical to train NADEs and
RNADEs with several hidden layers. The resulting deep and tractable models
display higher test likelihoods than the equivalent one-hidden-layer models for all
the datasets tested.
Ensembles of NADEs or RNADEs can be created inexpensively by combining
models that share their parameters but differ in the ordering of the variables. These
ensembles of autoregressive models obtain state-of-the-art statistical performance
on several datasets.
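Combining the per-ordering models amounts to a uniform mixture of their densities, which stays normalized because each member is. A minimal sketch (the members here are arbitrary placeholder log-density functions, not trained NADEs), using the standard log-sum-exp trick for numerical stability:

```python
import math

def ensemble_log_prob(x, members):
    """Log-density of the uniform mixture over ensemble members.

    `members` is a list of functions, each returning log p(x) under one
    variable ordering; the models may share parameters, so the ensemble
    costs no extra training.
    """
    lps = [m(x) for m in members]
    mx = max(lps)  # log-sum-exp: subtract the max before exponentiating
    return mx + math.log(sum(math.exp(lp - mx) for lp in lps) / len(lps))
```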
Finally, we demonstrate the application of RNADE to speech synthesis, and
confirm that capturing the phone-conditional dependencies of acoustic features
improves the quality of synthetic speech. Our model generates synthetic speech
that was judged by naive listeners as being of higher quality than that generated
by mixture density networks, which are considered a state-of-the-art synthesis
technique.
On advancing MCMC-based methods for Markovian data structures with applications to deep learning, simulation, and resampling
Markov chain Monte Carlo (MCMC) is a computational statistical approach for numerically approximating distributional quantities useful for inference that might otherwise be intractable to calculate directly. A challenge with MCMC methods is developing implementations which are both statistically rigorous and computationally scalable to large data sets. This work generally aims to bridge these aspects by exploiting conditional independence, or Markov structures, in data models. Chapter 2 investigates the model properties and Bayesian fitting of a graph model with Markovian dependence used in deep machine learning and image classification, called a restricted Boltzmann machine (RBM), and Chapter 3 presents a framework for describing inherent instability in a general class of models which includes RBMs. Chapters 4 and 5 introduce a fast method for simulating data from a Markov Random Field (MRF) by exploiting conditional independence specified in the model and a flexible `R` package that implements the approach in C++.
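The conditional independence being exploited can be illustrated with the simplest MRF, a nearest-neighbour Ising model: each spin's full conditional depends only on its grid neighbours, so a Gibbs sampler never touches the intractable global normalizing constant. This is a generic textbook sketch in Python, not the thesis's method or its `R` package:

```python
import math
import random

def gibbs_ising(n, beta, sweeps, seed=0):
    """Gibbs sampler for an n x n Ising MRF with coupling strength beta.

    Each spin is resampled from its full conditional,
    p(s_ij = +1 | rest) = 1 / (1 + exp(-2 * beta * sum_of_neighbours),
    which involves at most four neighbouring spins (free boundaries).
    """
    rng = random.Random(seed)
    s = [[rng.choice([-1, 1]) for _ in range(n)] for _ in range(n)]
    for _ in range(sweeps):
        for i in range(n):
            for j in range(n):
                nb = sum(s[a][b]
                         for a, b in ((i - 1, j), (i + 1, j),
                                      (i, j - 1), (i, j + 1))
                         if 0 <= a < n and 0 <= b < n)
                p_up = 1.0 / (1.0 + math.exp(-2.0 * beta * nb))
                s[i][j] = 1 if rng.random() < p_up else -1
    return s
```

Faster schemes, such as the one the abstract describes, update blocks of conditionally independent sites simultaneously rather than one spin at a time.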