459 research outputs found
Empirically Analyzing the Effect of Dataset Biases on Deep Face Recognition Systems
It is unknown what kind of biases modern in-the-wild face datasets have
because of their lack of annotation. A direct consequence of this is that total
recognition rates alone only provide limited insight about the generalization
ability of Deep Convolutional Neural Networks (DCNNs). We propose to
empirically study the effect of different types of dataset biases on the
generalization ability of DCNNs. Using synthetically generated face images, we
study the face recognition rate as a function of interpretable parameters such
as face pose and light. The proposed method allows valuable details about the
generalization performance of different DCNN architectures to be observed and
compared. In our experiments, we find that: 1) Indeed, dataset bias has a
significant influence on the generalization performance of DCNNs. 2) DCNNs can
generalize surprisingly well to unseen illumination conditions and large
sampling gaps in the pose variation. 3) Using the presented methodology we
reveal that the VGG-16 architecture outperforms the AlexNet architecture at
face recognition tasks because it can much better generalize to unseen face
poses, although it has significantly more parameters. 4) We uncover a main
limitation of current DCNN architectures, which is the difficulty to generalize
when different identities do not share the same pose variation. 5) We
demonstrate that our findings on synthetic data also apply when learning from
real-world data. Our face image generator is publicly available to enable the
community to benchmark other DCNN architectures. Comment: Accepted to CVPR 2018
Workshop on Analysis and Modeling of Faces and Gestures (AMFG)
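
The abstract above describes measuring the face recognition rate as a function of interpretable parameters such as pose. A minimal sketch of that kind of analysis follows, assuming each test image comes with a known yaw angle in degrees and a flag saying whether it was recognized correctly. The function name, the bin width, and the sample data are illustrative assumptions and are not taken from the paper.

```python
import math
from collections import defaultdict


def recognition_rate_by_pose(yaw_degrees, correct, bin_width=15):
    """Return {bin_start: accuracy}; each bin covers [bin_start, bin_start + bin_width)."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for yaw, ok in zip(yaw_degrees, correct):
        # Floor division keeps negative yaw angles in the correct bin.
        bin_start = math.floor(yaw / bin_width) * bin_width
        totals[bin_start] += 1
        hits[bin_start] += int(ok)
    return {start: hits[start] / totals[start] for start in sorted(totals)}


# Hypothetical evaluation results: yaw angle per test image and whether
# the predicted identity was correct.
yaws = [-40, -35, -5, 0, 10, 50]
correct = [True, False, True, True, True, False]
rates = recognition_rate_by_pose(yaws, correct, bin_width=30)
```

Comparing such per-bin rates between a model trained on all poses and one trained with a pose range withheld is one way to make a generalization gap visible that a single overall recognition rate would hide.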
On Rendering Synthetic Images for Training an Object Detector
We propose a novel approach to synthesizing images that are effective for
training object detectors. Starting from a small set of real images, our
algorithm estimates the rendering parameters required to synthesize similar
images given a coarse 3D model of the target object. These parameters can then
be reused to generate an unlimited number of training images of the object of
interest in arbitrary 3D poses, which can then be used to increase
classification performance.
A key insight of our approach is that the synthetically generated images
should be similar to real images, not in terms of image quality, but rather in
terms of features used during the detector training. We show in the context of
drone, plane, and car detection that using such synthetically generated images
yields significantly better performance than simply perturbing real images or
even synthesizing images in such a way that they look very realistic, as is often
done when only limited amounts of training data are available.
Detecting cells and analyzing their behaviors in microscopy images using deep neural networks
Computer-aided analysis in the medical imaging field has attracted a lot of attention over the past decade. The goal of computer-vision-based medical image analysis is to provide automated tools that relieve the burden on human experts such as radiologists and physicians. More specifically, these computer-aided methods help identify, classify and quantify patterns in medical images. Recent advances in machine learning, and in deep learning in particular, have greatly boosted the performance of various medical applications. At the core of these advances is the use of hierarchical feature representations learned by deep learning models, instead of handcrafted features based on domain-specific knowledge.
In the work presented in this dissertation, we are particularly interested in exploring the power of deep neural networks for Circulating Tumor Cell detection and mitosis event detection. We introduce Convolutional Neural Networks and the training methodology designed for Circulating Tumor Cell detection, as well as a Hierarchical Convolutional Neural Network model and a Two-Stream Bidirectional Long Short-Term Memory model for mitosis event detection and stage localization in phase-contrast microscopy images. (Abstract, page iii)
Deep Learning for Head Pose Estimation: A Survey
Head pose estimation (HPE) is an active and popular area of research. Over the years, many approaches have constantly been developed, leading to a progressive improvement in accuracy; nevertheless, head pose estimation remains an open research topic, especially in unconstrained environments. In this paper, we will review the increasing amount of available datasets and the modern methodologies used to estimate orientation, with special attention to deep learning techniques. We will discuss the evolution of the field by proposing a classification of head pose estimation methods, explaining their advantages and disadvantages, and highlighting the different ways deep learning techniques have been used in the context of HPE. An in-depth performance comparison and discussion is presented at the end of the work. We also highlight the most promising research directions for future investigations on the topic.
Multimodal Adversarial Learning
Deep Convolutional Neural Networks (DCNNs) have proven to be an exceptional tool for object recognition, generative modelling, and multi-modal learning in various computer vision applications. However, recent findings have shown that such state-of-the-art models can be easily deceived by inserting slight, imperceptible perturbations into key pixels of the input. A good target detection system can accurately identify targets by localizing their coordinates on the input image of interest. This is ideally achieved by labeling each pixel in an image as a background or a potential target pixel. However, prior research confirms that such state-of-the-art target models are still susceptible to adversarial attacks. In the case of generative models, facial sketches drawn by artists, which are mostly used by law enforcement agencies, depend on the ability of the artist to clearly replicate all the key facial features that aid in capturing the true identity of a subject. Recent works have attempted to synthesize these sketches into plausible visual images to improve visual recognition and identification. However, synthesizing photo-realistic images from sketches proves to be an even more challenging task, especially for sensitive applications such as suspect identification. The incorporation of hybrid discriminators, which perform attribute classification of multiple target attributes, and of a quality-guided encoder, which minimizes the perceptual dissimilarity between the latent space embeddings of the synthesized and real images at different layers in the network, has been shown to be a powerful tool towards better multi-modal learning techniques. In general, our approach was aimed at improving target detection systems and the visual appeal of synthesized images while incorporating multiple attribute assignment into the generator without compromising the identity of the synthesized image.
We synthesized sketches using the XDoG filter for the CelebA, Multi-modal and CelebA-HQ datasets and from an auxiliary generator trained on sketches from the CUHK, IIT-D and FERET datasets. Overall, our results for the different model applications are impressive compared to the current state of the art.
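
The abstract above mentions synthesizing sketches with an XDoG (extended difference-of-Gaussians) filter but gives no settings. The sketch below shows one commonly used formulation, assuming a grayscale image stored as a list of rows with values in [0, 1]: a weighted difference of two Gaussian blurs followed by a soft tanh threshold. All parameter values (`sigma`, `k`, `gamma`, `epsilon`, `phi`) and helper names are illustrative assumptions, not the authors' configuration, and the pure-Python blur is written for self-containment rather than speed.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel with a radius of about 3 * sigma."""
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(image, sigma):
    """Separable Gaussian blur; pixels outside the image repeat the edge value."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    height, width = len(image), len(image[0])
    offsets = range(-radius, radius + 1)

    def clamp(index, size):
        return min(max(index, 0), size - 1)

    # Horizontal pass, then vertical pass.
    rows = [[sum(kernel[o + radius] * row[clamp(x + o, width)] for o in offsets)
             for x in range(width)] for row in image]
    return [[sum(kernel[o + radius] * rows[clamp(y + o, height)][x] for o in offsets)
             for x in range(width)] for y in range(height)]


def xdog(image, sigma=1.0, k=1.6, gamma=0.98, epsilon=0.01, phi=10.0):
    """Apply a weighted difference of Gaussians, then a soft threshold."""
    fine = blur(image, sigma)
    coarse = blur(image, k * sigma)
    result = []
    for fine_row, coarse_row in zip(fine, coarse):
        out_row = []
        for a, b in zip(fine_row, coarse_row):
            d = a - gamma * b
            # Responses at or above epsilon map to white; the rest fall off smoothly.
            out_row.append(1.0 if d >= epsilon else 1.0 + math.tanh(phi * (d - epsilon)))
        result.append(out_row)
    return result


# A white 5x5 image with a dark vertical stripe in the middle column.
image = [[0.0 if x == 2 else 1.0 for x in range(5)] for _ in range(5)]
sketch = xdog(image)
```

Output values lie in (0, 1], with 1.0 as white paper and lower values as darker strokes. Larger `phi` gives harder, more ink-like edges, while `epsilon` shifts how much of the image is drawn as stroke.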
DeepACSON automated segmentation of white matter in 3D electron microscopy
Tracing the entirety of ultrastructures in large three-dimensional electron microscopy (3D-EM) images of brain tissue requires automated segmentation techniques. Current segmentation techniques use deep convolutional neural networks (DCNNs) and rely on high-contrast cellular membranes and high-resolution EM volumes. In contrast, segmenting low-resolution, large EM volumes requires methods that account for the severe membrane discontinuities that are inescapable in such data. Therefore, we developed DeepACSON, which performs DCNN-based semantic segmentation and shape-decomposition-based instance segmentation. DeepACSON instance segmentation uses the tubularity of myelinated axons and decomposes under-segmented myelinated axons into their constituent axons. We applied DeepACSON to ten EM volumes of rats after sham-operation or traumatic brain injury, segmenting hundreds of thousands of long-span myelinated axons, thousands of cell nuclei, and millions of mitochondria with excellent evaluation scores. DeepACSON quantified the morphology and spatial aspects of white matter ultrastructures, capturing nanoscopic morphological alterations five months after the injury. With DeepACSON, Abdollahzadeh et al. combine existing deep learning-based methods for semantic segmentation with a novel shape decomposition technique for instance segmentation. The pipeline is used to segment low-resolution 3D-EM datasets, allowing quantification of white matter morphology in large fields of view. Peer reviewed