A Novel Deep Convolutional Neural Network Architecture Based on Transfer Learning for Handwritten Urdu Character Recognition
Deep convolutional neural networks (CNNs) have made a huge impact on computer vision and set the state of the art by providing highly accurate classification results. For character recognition, where training images are usually scarce, transfer learning from pre-trained CNNs is often used. In this paper, we propose a novel deep convolutional neural network for handwritten Urdu character recognition that applies transfer learning to three pre-trained CNN models. We fine-tune the layers of these pre-trained CNNs to extract features that capture both global and local details of the Urdu character structure. The features extracted from the three CNN models are concatenated and used to train two fully connected layers for classification. Experiments are conducted on the UNHD, EMILLE, DBAHCL, and CDB/Farsi datasets, and we achieve 97.18% average recognition accuracy, which outperforms the individual CNNs and numerous conventional classification methods.
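The fusion step described in this abstract (concatenating the features of three CNNs and passing them through two fully connected layers) can be sketched in plain Python. This is only an illustration under stated assumptions: the feature sizes (8, 6, 4), the hidden width (10), the number of classes (5), the ReLU and softmax activations, and all function names are hypothetical, and the weights are random rather than trained. It is not the authors' implementation.

```python
import math
import random


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Element-wise rectified linear activation (an assumed choice)."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Numerically stable softmax that turns scores into probabilities."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def random_layer(rng, n_in, n_out):
    """Untrained placeholder weights; a real model would learn these."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def classify(feature_sets, layer1, layer2):
    """Concatenate per-model features, then apply two dense layers."""
    fused = [x for features in feature_sets for x in features]
    hidden = relu(dense(fused, *layer1))
    return softmax(dense(hidden, *layer2))


rng = random.Random(0)
# Stand-ins for the feature vectors produced by three fine-tuned CNNs.
feature_sets = [[rng.random() for _ in range(n)] for n in (8, 6, 4)]
layer1 = random_layer(rng, 8 + 6 + 4, 10)  # first fully connected layer
layer2 = random_layer(rng, 10, 5)          # second layer, one unit per class
probabilities = classify(feature_sets, layer1, layer2)
```

Concatenation keeps each model's features intact and leaves it to the fully connected layers to learn how to weigh them against each other.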
A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community
In recent years, deep learning (DL), a re-branding of neural networks (NNs),
has risen to the top in numerous areas, namely computer vision (CV), speech
recognition, natural language processing, etc. Whereas remote sensing (RS)
possesses a number of unique challenges, primarily related to sensors and
applications, inevitably RS draws from many of the same theories as CV; e.g.,
statistics, fusion, and machine learning, to name a few. This means that the RS
community should be aware of, if not at the leading edge of, advancements
like DL. Herein, we provide the most comprehensive survey of state-of-the-art
RS DL research. We also review recent new developments in the DL field that can
be used in DL for RS. Namely, we focus on theories, tools and challenges for
the RS community. Specifically, we focus on unsolved challenges and
opportunities as they relate to (i) inadequate data sets, (ii)
human-understandable solutions for modelling physical phenomena, (iii) Big
Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and
learning algorithms for spectral, spatial and temporal data, (vi) transfer
learning, (vii) an improved theoretical understanding of DL systems, (viii)
high barriers to entry, and (ix) training and optimizing the DL.
Comment: 64 pages, 411 references. To appear in Journal of Applied Remote
Sensin
Vehicle Detection of Multi-source Remote Sensing Data Using Active Fine-tuning Network
Vehicle detection in remote sensing images has attracted increasing interest
in recent years. However, its detection ability is limited due to lack of
well-annotated samples, especially in densely crowded scenes. Furthermore,
since many remotely sensed data sources are available, efficient
exploitation of useful information from multi-source data for better vehicle
detection is challenging. To solve the above issues, a multi-source active
fine-tuning vehicle detection (Ms-AFt) framework is proposed, which integrates
transfer learning, segmentation, and active classification into a unified
framework for auto-labeling and detection. The proposed Ms-AFt employs a
fine-tuning network to first generate a vehicle training set from an
unlabeled dataset. To cope with the diversity of vehicle categories, a
multi-source based segmentation branch is then designed to construct additional
candidate object sets. The separation of high quality vehicles is realized by a
designed attentive classifications network. Finally, all three branches are
combined to achieve vehicle detection. Extensive experimental results conducted
on two open ISPRS benchmark datasets, namely the Vaihingen village and Potsdam
city datasets, demonstrate the superiority and effectiveness of the proposed
Ms-AFt for vehicle detection. In addition, the generalization ability of Ms-AFt
in dense remote sensing scenes is further verified on stereo aerial imagery of
a large camping site.
Computational Optimizations for Machine Learning
The present book contains the 10 articles accepted for publication in the Special Issue “Computational Optimizations for Machine Learning” of the MDPI journal Mathematics, which cover a wide range of topics connected to the theory and applications of machine learning, neural networks and artificial intelligence. These topics include, among others, various classes of machine learning, such as supervised, unsupervised and reinforcement learning, deep neural networks, convolutional neural networks, GANs, decision trees, linear regression, SVM, K-means clustering, Q-learning, temporal difference, deep adversarial networks and more. It is hoped that the book will be interesting and useful to those developing mathematical algorithms and applications in the domain of artificial intelligence and machine learning, as well as to those with the appropriate mathematical background who wish to become familiar with recent advances in computational optimization for machine learning, which has now permeated almost all sectors of human life and activity.
Top-Down Selection in Convolutional Neural Networks
Feedforward information processing fills the role of hierarchical feature encoding, transformation, reduction, and abstraction in a bottom-up manner. This paradigm of information processing is sufficient for task requirements that are satisfied in the one-shot rapid traversal of sensory information through the visual hierarchy. However, some tasks demand higher-order information processing using short-term recurrent, long-range feedback, or other processes. The predictive, corrective, and modulatory information processing in a top-down fashion complements the feedforward pass to fulfill many complex task requirements. Convolutional neural networks have recently been successful in addressing some aspects of the feedforward processing. However, the role of top-down processing in such models has not yet been fully understood. We propose a top-down selection framework for convolutional neural networks to address the selective and modulatory nature of top-down processing in vision systems. We examine various aspects of the proposed model in different experimental settings such as object localization, object segmentation, task priming, compact neural representation, and contextual interference reduction. We test the hypothesis that the proposed approach is capable of accomplishing hierarchical feature localization according to task cuing. Additionally, feature modulation using the proposed approach is tested for demanding tasks such as segmentation and iterative parameter fine-tuning. Moreover, the top-down attentional traces are harnessed to enable a more compact neural representation. The experimental achievements support the practical complementary role of the top-down selection mechanisms to the bottom-up feature encoding routines.
Deep Learning Techniques for Medical Image Classification
A thesis submitted in partial fulfillment of the requirements for the degree of Doctor in Information Management, specialization in Information and Decision Systems.
In recent years, artificial intelligence (AI) has been applied in many fields to address complex and critical real-world tasks. Deep learning has emerged as a subfield of AI, where artificial neural networks (ANN) are used to map complicated functions, which can be challenging even for experienced users. One of the ANN variants is called convolutional neural network (CNN), which has shown great potential in image processing by providing state-of-the-art results for many significant image processing challenges. The medical field can significantly benefit from AI usage, especially in the medical image classification domain. In this doctoral dissertation, we applied different AI techniques to analyze medical images and to give the physicians a second opinion or reduce the time and effort needed for the image classification. Initially, we reviewed several studies that were published to discuss the transfer learning of CNNs. Afterward, we studied different hyperparameters that need to be optimized for CNNs to be trained accurately. Lastly, we proposed a novel CNN architecture to help in the classification of histopathology images.
Deep Learning and Spatial Statistics for Determining Road Surface Condition
Machine Learning (ML), and especially Deep Learning (DL) methods, have evolved
rapidly in recent years and shown remarkable advances in research areas such as computer
vision and natural language processing; however, there are still engineering applications
in industries such as transportation where DL methods have not yet been applied or that
could benefit from an integrated approach using DL in addition to other methods.
For countries in Northern latitudes, one such application is Monitoring Road Surface
Condition (RSC) during the Winter season for improving road safety and road maintenance
operations.
In this study, we introduce a novel approach for monitoring of RSC that integrates
DL methods and Spatial Statistics (SS) to simultaneously process data from roadside
cameras and weather stations to determine automatically the category of snow coverage
at sample locations across a region of interest. Our approach integrates the advantages of
SS for interpolating spatial variables and the strengths of DL for Computer Vision tasks,
particularly for image classification. On one hand, SS models serve to understand the spatial
autocorrelation of random variables and to determine their expected values in unsampled
locations based on a number of near observations. On the other hand, DL models extract
relevant patterns from a large number of training images and learn a mapping from input
images to a set of predefined labels.
We implement and evaluate our approach using data collected in the province of Ontario
during the 2017-2018 Winter season. Specifically, we included data from three separate
sources, Environment Canada (EC) Weather stations, Road Weather Information System
(RWIS) stations, and roadside cameras from the Ministry of Transportation of Ontario
(MTO). To the best of our knowledge, this is the first study that integrates both DL and
SS techniques for processing the three data sources with the goal of monitoring RSC.
The DL models we implement and compare are Inception, Inception-Resnet, Xception,
DenseNet, MobileNetv2, and NASNet. All of these models have achieved remarkable
results for image classification in well-known benchmarks. The SS models we evaluate are
Ordinary Kriging (OK), Radial Basis Functions (RBF), and Inverse Distance Weighted
(IDW). The first provides a comprehensive understanding of the spatial autocorrelation
for each particular variable, while the second and third allow a faster implementation. Our
integrated approach works by combining the output feature vector from the DL model with
the interpolated values from the SS model to output a more robust prediction of RSC for
the locations of interest.
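As a rough illustration of the simplest of the three interpolators named in this abstract, the sketch below implements Inverse Distance Weighted (IDW) interpolation and then joins the interpolated value to an image feature vector. The station coordinates, the readings, the power parameter of 2, the function names, and the use of plain concatenation for the combination step are all assumptions made for the example; the abstract does not specify how the two outputs are combined.

```python
import math


def idw_interpolate(stations, target, power=2.0):
    """Estimate a value at `target` from (x, y, value) station readings.

    Each station is weighted by 1 / distance**power, so nearby stations
    dominate the estimate.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            return value  # the target coincides with a station
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def fuse(image_features, interpolated_values):
    """Assumed combination step: concatenate both vectors into one."""
    return list(image_features) + list(interpolated_values)


# Hypothetical weather stations: (x, y, air temperature).
stations = [(0.0, 0.0, -5.0), (10.0, 0.0, -1.0), (0.0, 10.0, -3.0)]
estimate = idw_interpolate(stations, (5.0, 0.0))

# Hypothetical three-element feature vector from an image classifier.
fused = fuse([0.2, 0.7, 0.1], [estimate])
```

IDW needs no model fitting, which is consistent with the abstract's remark that it allows a faster implementation than Ordinary Kriging.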