Efficient Active Learning for Image Classification and Segmentation using a Sample Selection and Conditional Generative Adversarial Network
Training robust deep learning (DL) systems for medical image classification
or segmentation is challenging due to limited images covering different disease
types and severity. We propose an active learning (AL) framework to select the
most informative samples and add them to the training data. We use conditional
generative adversarial networks (cGANs) to generate realistic chest X-ray
images with different disease characteristics by conditioning generation on a real
image sample. Informative samples to add to the training set are identified
using a Bayesian neural network. Experiments show our proposed AL framework
achieves state-of-the-art performance using about 35% of the full dataset,
thus saving significant time and effort over conventional methods.
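The sample-selection step can be sketched as follows. This is a minimal illustration, assuming predictive entropy over repeated stochastic (Monte Carlo dropout) forward passes as the Bayesian uncertainty measure; the function names and the toy `model` interface are hypothetical, not the paper's implementation.

```python
import math

def predictive_entropy(mc_probs):
    """Entropy of the mean class distribution over T stochastic
    forward passes -- a common Bayesian uncertainty proxy."""
    n_classes = len(mc_probs[0])
    mean = [sum(p[c] for p in mc_probs) / len(mc_probs) for c in range(n_classes)]
    return -sum(p * math.log(p) for p in mean if p > 0)

def select_informative(pool, model, k, T=20):
    """Rank unlabelled samples by predictive entropy and return
    indices of the k most uncertain (most informative) ones."""
    scores = []
    for i, x in enumerate(pool):
        mc_probs = [model(x) for _ in range(T)]  # T dropout-enabled passes
        scores.append((predictive_entropy(mc_probs), i))
    scores.sort(reverse=True)
    return [i for _, i in scores[:k]]
```

In a real AL loop the selected samples would be labelled and added to the training set before retraining.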
Multi-Label Zero-Shot Human Action Recognition via Joint Latent Ranking Embedding
Human action recognition refers to automatically recognizing human actions in a
video clip. In reality, there often exist multiple human actions in a video
stream. Such a video stream is often weakly-annotated with a set of relevant
human action labels at a global level rather than assigning each label to a
specific video episode corresponding to a single action, which leads to a
multi-label learning problem. Furthermore, there are many meaningful human
actions in reality, but it would be extremely difficult to collect and annotate
video clips covering all of them, which leads to a zero-shot
learning scenario. To the best of our knowledge, there is no work that has
addressed all the above issues together in human action recognition. In this
paper, we formulate a real-world human action recognition task as a multi-label
zero-shot learning problem and propose a framework to tackle this problem in a
holistic way. Our framework holistically tackles the issue of unknown temporal
boundaries between different actions for multi-label learning and exploits the
side information regarding the semantic relationship between different human
actions for knowledge transfer. Consequently, our framework leads to a joint
latent ranking embedding for multi-label zero-shot human action recognition. A
novel neural architecture of two component models and an alternate learning
algorithm are proposed to carry out the joint latent ranking embedding
learning. Thus, multi-label zero-shot recognition is done by measuring
relatedness scores of action labels to a test video clip in the joint latent
visual and semantic embedding spaces. We evaluate our framework with different
settings, including a novel data split scheme designed especially for
evaluating multi-label zero-shot learning, on two datasets: Breakfast and
Charades. The experimental results demonstrate the effectiveness of our
framework.
Comment: 27 pages, 10 figures, and 7 tables. Technical report submitted to a
journal. More experimental results and references were added and typos were
corrected.
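The final scoring step described above can be sketched as follows, under the assumption that relatedness between a clip and an action label in the joint latent embedding space is measured by cosine similarity; the helper names and embedding format are illustrative, not the paper's actual architecture.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def rank_labels(video_emb, label_embs):
    """Rank action labels (seen or unseen) by relatedness to a
    video clip's embedding in the shared latent space."""
    scored = sorted(label_embs.items(),
                    key=lambda kv: cosine(video_emb, kv[1]),
                    reverse=True)
    return [label for label, _ in scored]
```

Because unseen labels only need a semantic embedding, the same ranking applies in the zero-shot case.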
Quantifying Facial Age by Posterior of Age Comparisons
We introduce a novel approach for annotating a large quantity of in-the-wild
facial images with high-quality posterior age distribution as labels. Each
posterior provides a probability distribution of estimated ages for a face. Our
approach is motivated by the observation that it is easier to judge which of
two people is older than to determine a person's actual age. Given a
reference database with samples of known ages and a dataset to label, we can
transfer reliable annotations from the former to the latter via
human-in-the-loop comparisons. We show an effective way to transform such
comparisons into posteriors via fully-connected and softmax layers, so as to
permit end-to-end training in a deep network. Thanks to the efficient and
effective annotation approach, we collect a new large-scale facial age dataset,
dubbed `MegaAge', which consists of 41,941 images. Data can be downloaded from
our project page mmlab.ie.cuhk.edu.hk/projects/MegaAge and
github.com/zyx2012/Age_estimation_BMVC2017. With the dataset, we train a
network that jointly performs ordinal hyperplane classification and posterior
distribution learning. Our approach achieves state-of-the-art results on
popular benchmarks such as MORPH2, Adience, and the newly proposed MegaAge.
Comment: To appear at BMVC 2017 (oral); revised version.
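The comparison-to-posterior idea can be illustrated with a toy scoring rule: each candidate age bin is scored by how well it agrees with the observed "is the target older than reference r?" outcomes, and a softmax turns the scores into a distribution. Only the softmax normalization is taken from the abstract; the agreement score, function names, and `scale` parameter are assumptions for illustration, whereas the paper learns this transform end-to-end with fully-connected and softmax layers.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def comparison_posterior(older_votes, ref_ages, age_bins, scale=1.0):
    """Toy transform from pairwise-comparison vote fractions
    (fraction of annotators judging the target older than each
    reference) to a posterior over candidate age bins."""
    logits = []
    for a in age_bins:
        # agreement between the hypothesis "true age is a" and each outcome
        agree = sum(v if a > r else 1 - v
                    for v, r in zip(older_votes, ref_ages))
        logits.append(scale * agree)
    return softmax(logits)
```

For references aged 20, 30, and 40 with votes "older, older, younger", the posterior peaks in the 30-40 range, as expected.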
Interpretable classification of Alzheimer's disease pathologies with a convolutional neural network pipeline.
Neuropathologists assess vast brain areas to identify diverse and subtly differentiated morphologies. Standard semi-quantitative scoring approaches, however, are coarse-grained and lack precise neuroanatomic localization. We report a proof-of-concept deep learning pipeline that identifies specific neuropathologies (amyloid plaques and cerebral amyloid angiopathy) in immunohistochemically stained archival slides. Using automated segmentation of stained objects and a cloud-based interface, we annotate >70,000 plaque candidates from 43 whole slide images (WSIs) to train and evaluate convolutional neural networks. Networks achieve strong plaque classification on a 10-WSI hold-out set (0.993 and 0.743 areas under the receiver operating characteristic and precision-recall curves, respectively). Prediction confidence maps visualize morphology distributions at high resolution. Resulting network-derived amyloid beta (Aβ) burden scores correlate well with established semi-quantitative scores on a 30-WSI blinded hold-out set. Finally, saliency mapping demonstrates that the networks learn patterns that agree with accepted pathologic features. This scalable means of augmenting a neuropathologist's ability suggests a route to neuropathologic deep phenotyping.
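One simple way a prediction confidence map could be reduced to a slide-level burden score is a thresholded positive-area fraction. This is an assumption for illustration (the function name, threshold, and aggregation are hypothetical, not the authors' exact scoring):

```python
def amyloid_burden(confidence_map, threshold=0.5):
    """Fraction of pixels whose plaque-class confidence exceeds a
    threshold -- one toy reduction of a 2-D confidence map to a
    slide-level Abeta-burden score."""
    flat = [p for row in confidence_map for p in row]
    return sum(p >= threshold for p in flat) / len(flat)
```

Such a scalar per slide is what would then be correlated against semi-quantitative pathologist scores.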
Fast-AT: Fast Automatic Thumbnail Generation using Deep Neural Networks
Fast-AT is an automatic thumbnail generation system based on deep neural
networks. It is a fully-convolutional deep neural network, which learns
specific filters for thumbnails of different sizes and aspect ratios. During
inference, the appropriate filter is selected depending on the dimensions of
the target thumbnail. Unlike most previous work, Fast-AT does not utilize
saliency but addresses the problem directly. In addition, it eliminates the
need to conduct region search on the saliency map. The model generalizes to
thumbnails of different sizes including those with extreme aspect ratios and
can generate thumbnails in real time. A data set of more than 70,000 thumbnail
annotations was collected to train Fast-AT. We show competitive results in
comparison to existing techniques.
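The inference-time filter selection described above can be sketched as picking the learned head whose aspect ratio best matches the requested thumbnail; the function name and the (width, height) tuple format are hypothetical stand-ins for however Fast-AT actually indexes its filters.

```python
def select_filter(filter_specs, target_w, target_h):
    """Pick the (width, height) filter spec whose aspect ratio is
    closest to that of the target thumbnail."""
    target_ar = target_w / target_h
    return min(filter_specs, key=lambda wh: abs(wh[0] / wh[1] - target_ar))
```

For example, a 1920x1080 target selects the 16:9 head from {1:1, 16:9, 9:16}.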
Medical Image Data and Datasets in the Era of Machine Learning-Whitepaper from the 2016 C-MIMI Meeting Dataset Session.
At the first annual Conference on Machine Intelligence in Medical Imaging (C-MIMI), held in September 2016, a conference session on medical image data and datasets for machine learning identified multiple issues. The common theme from attendees was that everyone participating in medical image evaluation with machine learning is data-starved. There is an urgent need to find better ways to collect, annotate, and reuse medical imaging data. Unique domain issues with medical image datasets require further study, development, and dissemination of best practices and standards, as well as a coordinated effort among medical imaging domain experts, medical imaging informaticists, government and industry data scientists, and interested commercial, academic, and government entities. High-level attributes of reusable medical image datasets suitable to train, test, validate, verify, and regulate ML products should be better described. NIH and other government agencies should promote and, where applicable, enforce access to medical image datasets. We should improve communication among medical imaging domain experts, medical imaging informaticists, academic clinical and basic science researchers, government and industry data scientists, and interested commercial entities.