Review of Face Detection Systems Based Artificial Neural Networks Algorithms
Face detection is one of the most relevant applications of image processing
and biometric systems. Artificial neural networks (ANN) have been used in the
field of image processing and pattern recognition. There is a lack of literature
surveys that give an overview of the studies and research related to the
use of ANN in face detection. Therefore, this research includes a general
review of face detection studies and systems that are based on different ANN
approaches and algorithms. The strengths and limitations of these
studies and systems are also included.
Comment: 16 pages, 12 figures, 1 table, IJMA Journa
Minimizing Computational Resources for Deep Machine Learning: A Compression and Neural Architecture Search Perspective for Image Classification and Object Detection
Computational resources represent a significant bottleneck across all current deep learning computer vision approaches. Image and video data storage requirements for training deep neural networks have led to the widespread use of image and video compression, the use of which naturally impacts the performance of neural network architectures during both training and inference. The prevalence of deep neural networks deployed on edge devices necessitates efficient network architecture design, while training neural networks requires significant time and computational resources, despite the acceleration of both hardware and software developments within the field of artificial intelligence (AI). This thesis addresses these challenges in order to minimize computational resource requirements across the entire end-to-end deep learning pipeline. We determine the extent to which data compression impacts neural network architecture performance, and by how much this performance can be recovered by retraining neural networks with compressed data. The thesis then focuses on the accessibility of the deployment of neural architecture search (NAS) to facilitate automatic network architecture generation for image classification suited to resource-constrained environments. A combined hard example mining and curriculum learning strategy is developed to minimize the image data processed during a given training epoch within the NAS search phase, without diminishing performance. We demonstrate the capability of the proposed framework across all gradient-based, reinforcement learning, and evolutionary NAS approaches, and present a simple but effective method to extend the approach to the prediction-based NAS paradigm. The hard example mining approach within the proposed NAS framework depends upon the effectiveness of an autoencoder to regulate the latent space such that similar images have similar feature embeddings.
This thesis conducts a thorough investigation to satisfy this constraint within the context of image classification. Based upon the success of the overall proposed NAS framework, we subsequently extend the approach towards object detection. Despite the resultant multi-label domain presenting a more difficult challenge for hard example mining, we propose an extension to the autoencoder to capture the additional object location information encoded within the training labels. The generation of an implicit attention layer within the autoencoder network sufficiently improves its capability to enforce similar images to have similar embeddings, thus successfully transferring the proposed NAS approach to object detection. Finally, the thesis demonstrates the resilience to compression of the general two-stage NAS approach upon which our proposed NAS framework is based
Efficient video indexing for monitoring disease activity and progression in the upper gastrointestinal tract
Endoscopy is a routine imaging technique used for both diagnosis and
minimally invasive surgical treatment. While the endoscopy video contains a
wealth of information, tools to capture this information for the purpose of
clinical reporting are rather poor. To date, endoscopists do not have any
access to tools that enable them to browse the video data in an efficient and
user-friendly manner. Fast and reliable video retrieval methods could, for
example, allow them to review data from previous exams and therefore improve
their ability to monitor disease progression. Deep learning provides new
avenues for compressing and indexing video in an extremely efficient manner. In
this study, we propose to use an autoencoder for efficient video compression
and fast retrieval of video images. To boost the accuracy of video image
retrieval and to address data variability like multi-modality and view-point
changes, we propose the integration of a Siamese network. We demonstrate that
our approach is competitive in retrieving images from 3 large-scale videos of 3
different patients, retrieved against query samples from their previous
diagnosis. Quantitative validation shows that the combined approach yields an
overall improvement of 5% and 8% over classical and variational autoencoders,
respectively.
Comment: Accepted at IEEE International Symposium on Biomedical Imaging
(ISBI), 201
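
The retrieval step this abstract describes can be illustrated with a minimal sketch. It assumes that each video frame has already been encoded into a fixed-length embedding vector (for example by an autoencoder) and that retrieval ranks frames by cosine similarity to a query embedding. The abstract does not state the similarity measure, so cosine similarity is an assumption, and the function names `cosine_similarity` and `retrieve_top_k` are hypothetical, not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_top_k(query_embedding, frame_embeddings, k=3):
    """Return the indices of the k frames most similar to the query.

    Assumption: embeddings come from some encoder applied beforehand;
    this sketch only covers the ranking step, not the encoder itself.
    """
    scored = [
        (cosine_similarity(query_embedding, emb), idx)
        for idx, emb in enumerate(frame_embeddings)
    ]
    # Highest similarity first; the sort is stable, so ties keep frame order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]


# Toy usage with made-up 2-D embeddings (illustrative values only).
frames = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
print(retrieve_top_k([1.0, 0.0], frames, k=2))
```

A linear scan like this is only practical for small collections; indexing long videos would likely need an approximate nearest-neighbour structure, which is outside what the abstract specifies.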