Data Curation with Deep Learning [Vision]
Data curation - the process of discovering, integrating, and cleaning data - is one of the oldest, hardest, yet unavoidable data management problems. Despite decades of effort from both researchers and practitioners, it remains one of the most time-consuming and least enjoyable tasks for data scientists. In most organizations, data curation plays an essential role in fully unlocking the value of big data. Unfortunately, current solutions are not keeping up with the ever-changing data ecosystem, because they often demand substantial human effort. Meanwhile, deep learning has achieved remarkable successes in multiple areas, such as image recognition, natural language processing, and speech recognition. In this vision paper, we explore how some of the fundamental innovations in deep learning could be leveraged to improve existing data curation solutions and to help build new ones. In particular, we provide a thorough overview of the current deep learning landscape, identify interesting research opportunities, and dispel common myths. We hope that the synthesis of these important domains will spark a series of research activities leading to significantly improved solutions for many data curation tasks.
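The paper is a vision statement rather than a system, but one concrete flavor of "deep learning for curation" is record deduplication with pretrained text embeddings. The sketch below is purely illustrative: the library, model name, and similarity threshold are assumptions, not anything the paper prescribes.

```python
# Illustrative only: deduplication via pretrained embeddings, one possible
# instance of "deep learning for data curation" (not the paper's method).
import numpy as np
from sentence_transformers import SentenceTransformer

records = [
    "Acme Corp., 42 Main St, Springfield",
    "ACME Corporation, 42 Main Street, Springfield",
    "Globex Inc., 9 Elm Ave, Shelbyville",
]

model = SentenceTransformer("all-MiniLM-L6-v2")   # model choice is assumed
emb = model.encode(records, normalize_embeddings=True)

sim = emb @ emb.T                                 # cosine similarity matrix
for i in range(len(records)):
    for j in range(i + 1, len(records)):
        if sim[i, j] > 0.8:                       # threshold is a guess
            print(f"possible duplicates: {records[i]!r} ~ {records[j]!r}")
```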
DADA: Deep Adversarial Data Augmentation for Extremely Low Data Regime Classification
Deep learning has revolutionized classification performance but demands sufficient labeled data for training. While many techniques have been developed to combat overfitting under insufficient data, training deep networks remains challenging, especially in the ill-posed extremely low data regime: only a small set of labeled data is available, and nothing else -- not even unlabeled data. Such regimes arise in practical situations where not only data labeling but data collection itself is expensive. We propose a deep adversarial data augmentation (DADA) technique to address this problem, in which we formulate data augmentation as the problem of training a class-conditional and supervised generative adversarial network (GAN). Specifically, we propose a new discriminator loss tailored to the goal of data augmentation, through which both real and augmented samples are enforced to contribute to, and be consistent in, finding the decision boundaries. Tailored training techniques are developed accordingly. To validate its effectiveness quantitatively, we first perform extensive simulations showing that DADA substantially outperforms both traditional data augmentation and several GAN-based alternatives. We then extend the experiments to three real-world small labeled datasets where existing data augmentation and/or transfer learning strategies are either less effective or infeasible. All results confirm the superior capability of DADA to enhance the generalization of deep networks trained in practical extremely low data regimes. Source code is available at https://github.com/SchafferZhang/DADA.
Comment: 15 pages, 5 figures
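One common way to make both real and augmented samples shape the same class boundaries is a discriminator with a 2K-way head over {real, fake} x {classes}. The sketch below follows that recipe in PyTorch as a plausible reading of the abstract; the architectures and loss arrangement are placeholders, not necessarily DADA's exact formulation.

```python
# Hedged sketch: a class-conditional GAN whose discriminator loss also
# drives classification, in the spirit of DADA. The 2K-way head and the
# module shapes are assumptions, not the paper's exact design.
import torch
import torch.nn as nn
import torch.nn.functional as F

K = 10    # number of classes
NZ = 100  # latent dimension

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(NZ + K, 256), nn.ReLU(),
                                 nn.Linear(256, 784), nn.Tanh())
    def forward(self, z, y_onehot):
        return self.net(torch.cat([z, y_onehot], dim=1))

class Discriminator(nn.Module):
    """2K outputs: logits for (real, class k) and (fake, class k)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2),
                                 nn.Linear(256, 2 * K))
    def forward(self, x):
        return self.net(x)

def d_loss(disc, x_real, y_real, x_fake, y_fake):
    # Real samples should land in the "real" half at their true class,
    # fakes in the "fake" half at their conditioned class; both therefore
    # shape the same class decision boundaries.
    return (F.cross_entropy(disc(x_real), y_real) +
            F.cross_entropy(disc(x_fake), y_fake + K))

def g_loss(disc, x_fake, y_fake):
    # The generator wants its fakes judged as *real* members of their class.
    return F.cross_entropy(disc(x_fake), y_fake)
```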
Population Based Augmentation: Efficient Learning of Augmentation Policy Schedules
A key challenge in leveraging data augmentation for neural network training
is choosing an effective augmentation policy from a large search space of
candidate operations. Properly chosen augmentation policies can lead to
significant generalization improvements; however, state-of-the-art approaches
such as AutoAugment are computationally infeasible to run for the ordinary
user. In this paper, we introduce a new data augmentation algorithm, Population
Based Augmentation (PBA), which generates nonstationary augmentation policy
schedules instead of a fixed augmentation policy. We show that PBA can match
the performance of AutoAugment on CIFAR-10, CIFAR-100, and SVHN, with three
orders of magnitude less overall compute. On CIFAR-10 we achieve a mean test
error of 1.46%, which is a slight improvement upon the current
state-of-the-art. The code for PBA is open source and available at https://github.com/arcelien/pba.
Comment: ICML 2019
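A nonstationary policy schedule can be as simple as an epoch-indexed table of (operation, probability, magnitude) triples. The sketch below shows the mechanics of applying such a schedule; the operations and values are invented for illustration, not PBA's learned schedule.

```python
# Sketch of a nonstationary augmentation schedule in the spirit of PBA:
# the policy (op, probability, magnitude) changes with the training epoch.
# All schedule values here are illustrative placeholders.
import random
from PIL import Image, ImageEnhance, ImageOps

def rotate(img, mag):   return img.rotate(3 * mag)  # mag in 0..10
def contrast(img, mag): return ImageEnhance.Contrast(img).enhance(1 + 0.1 * mag)
def invert(img, mag):   return ImageOps.invert(img)

OPS = {"rotate": rotate, "contrast": contrast, "invert": invert}

# epoch -> list of (op_name, probability, magnitude); later epochs augment harder
SCHEDULE = {
    0:  [("rotate", 0.1, 2), ("contrast", 0.1, 1)],
    50: [("rotate", 0.4, 6), ("contrast", 0.3, 4), ("invert", 0.2, 0)],
}

def policy_for(epoch):
    # use the most recent scheduled policy at or before this epoch
    key = max(k for k in SCHEDULE if k <= epoch)
    return SCHEDULE[key]

def augment(img, epoch):
    for name, prob, mag in policy_for(epoch):
        if random.random() < prob:
            img = OPS[name](img, mag)
    return img
```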
Data augmentation using learned transformations for one-shot medical image segmentation
Image segmentation is an important task in many medical applications. Methods
based on convolutional neural networks attain state-of-the-art accuracy;
however, they typically rely on supervised training with large labeled
datasets. Labeling medical images requires significant expertise and time, and
typical hand-tuned approaches for data augmentation fail to capture the complex
variations in such images.
We present an automated data augmentation method for synthesizing labeled
medical images. We demonstrate our method on the task of segmenting magnetic
resonance imaging (MRI) brain scans. Our method requires only a single
segmented scan, and leverages other unlabeled scans in a semi-supervised
approach. We learn a model of transformations from the images, and use the
model along with the labeled example to synthesize additional labeled examples.
Each transformation consists of a spatial deformation field and an intensity change, enabling the synthesis of complex effects such as variations
in anatomy and image acquisition procedures. We show that training a supervised
segmenter with these new examples provides significant improvements over
state-of-the-art methods for one-shot biomedical image segmentation. Our code is available at https://github.com/xamyzhao/brainstorm.
Comment: 9 pages, CVPR 2019
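The synthesis step pairs a spatial warp with an intensity change and, crucially, applies the same warp to the label map so image and labels stay aligned. Below is a hedged 2D sketch using a random smooth deformation as a stand-in for the paper's learned transformation models.

```python
# Sketch of the synthesis step: warp the one labeled scan with a (here
# random, in the paper learned) deformation plus intensity change, and
# warp the label map identically so the pair stays consistent.
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates

def synthesize(atlas, labels, sigma=8.0, alpha=4.0, rng=np.random):
    """atlas: 2D image; labels: integer label map of the same shape."""
    shape = atlas.shape
    # smooth random displacement field (stand-in for a learned one)
    dx = gaussian_filter(rng.randn(*shape), sigma) * alpha
    dy = gaussian_filter(rng.randn(*shape), sigma) * alpha
    ys, xs = np.meshgrid(np.arange(shape[0]), np.arange(shape[1]),
                         indexing="ij")
    coords = [ys + dy, xs + dx]

    warped = map_coordinates(atlas, coords, order=1)       # linear for image
    warped_lab = map_coordinates(labels, coords, order=0)  # nearest for labels

    # simple multiplicative intensity change (stand-in for the learned model)
    warped = warped * (1.0 + 0.1 * rng.randn())
    return warped, warped_lab
```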
DeceptionNet: Network-Driven Domain Randomization
We present a novel approach to tackle domain adaptation between synthetic and
real data. Instead of employing "blind" domain randomization, i.e., augmenting
synthetic renderings with random backgrounds or changing illumination and
colorization, we leverage the task network as its own adversarial guide toward
useful augmentations that maximize the uncertainty of the output. To this end,
we design a min-max optimization scheme where a given task competes against a
special deception network to minimize the task error subject to the specific
constraints enforced by the deceiver. The deception network samples from a
family of differentiable pixel-level perturbations and exploits the task
architecture to find the most destructive augmentations. Unlike GAN-based
approaches that require unlabeled data from the target domain, our method
achieves robust mappings that scale well to multiple target distributions from
source data alone. We apply our framework to the tasks of digit recognition on
enhanced MNIST variants, classification and object pose estimation on the
Cropped LineMOD dataset as well as semantic segmentation on the Cityscapes
dataset, and compare it to a number of domain adaptation approaches, demonstrating similar results with superior generalization capabilities.
Comment: ICCV 2019
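The min-max scheme alternates two updates: the deception network ascends the task loss within a constrained perturbation family, and the task network then descends on the perturbed inputs. The sketch below shows that alternation; the tiny architectures and the eps-bounded perturbation family are placeholder assumptions, not DeceptionNet's actual modules.

```python
# Sketch of the min-max scheme: a "deception" network produces bounded
# pixel-level perturbations that maximize the task loss, while the task
# network minimizes it on the perturbed inputs.
import torch
import torch.nn as nn
import torch.nn.functional as F

task_net = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))
deceiver = nn.Sequential(nn.Conv2d(1, 1, 3, padding=1), nn.Tanh())  # bounded

opt_task = torch.optim.Adam(task_net.parameters(), lr=1e-3)
opt_dec = torch.optim.Adam(deceiver.parameters(), lr=1e-3)

def step(x, y, eps=0.3):
    # 1) deceiver ascends the task loss (note the minus sign), constrained
    #    to an eps-bounded perturbation via the tanh output
    perturbed = (x + eps * deceiver(x)).clamp(0, 1)
    loss_dec = -F.cross_entropy(task_net(perturbed), y)
    opt_dec.zero_grad(); loss_dec.backward(); opt_dec.step()

    # 2) task network descends on the freshly perturbed inputs
    perturbed = (x + eps * deceiver(x)).clamp(0, 1).detach()
    loss_task = F.cross_entropy(task_net(perturbed), y)
    opt_task.zero_grad(); loss_task.backward(); opt_task.step()
```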
A Kernel Theory of Modern Data Augmentation
Data augmentation, a technique in which a training set is expanded with
class-preserving transformations, is ubiquitous in modern machine learning
pipelines. In this paper, we seek to establish a theoretical framework for
understanding data augmentation. We approach this from two directions: First,
we provide a general model of augmentation as a Markov process, and show that
kernels appear naturally with respect to this model, even when we do not employ
kernel classification. Next, we analyze more directly the effect of
augmentation on kernel classifiers, showing that data augmentation can be
approximated by first-order feature averaging and second-order variance
regularization components. These frameworks both serve to illustrate the ways
in which data augmentation affects the downstream learning model, and the
resulting analyses provide novel connections between prior work in invariant
kernels, tangent propagation, and robust optimization. Finally, we provide
several proof-of-concept applications showing that our theory can be useful for
accelerating machine learning workflows, such as reducing the amount of
computation needed to train using augmented data, and predicting the utility of
a transformation prior to training.
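The two effects named in the abstract can be made concrete with a second-order expansion. For a smooth loss, a linear model w, a feature map psi, and augmentations t drawn from some distribution, expanding around the mean feature gives, roughly (a sketch of the result's flavor, not the paper's exact statement):

```latex
% Let t \sim Q be a random augmentation and
% \bar\psi(x) = \mathbb{E}_t[\psi(t(x))] the averaged feature.
\mathbb{E}_t\!\left[\ell\big(w^\top \psi(t(x))\big)\right]
\;\approx\;
\underbrace{\ell\big(w^\top \bar\psi(x)\big)}_{\text{first-order feature averaging}}
\;+\;
\underbrace{\tfrac{1}{2}\,\ell''\big(w^\top \bar\psi(x)\big)\,
w^\top \mathrm{Cov}_t\!\big[\psi(t(x))\big]\, w}_{\text{second-order variance regularization}}
```

The first term trains on averaged features; the second penalizes feature variance under augmentation, acting as a data-dependent regularizer.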
Efficient Augmentation via Data Subsampling
Data augmentation is commonly used to encode invariances in learning methods.
However, this process is often performed in an inefficient manner, as
artificial examples are created by applying a number of transformations to all
points in the training set. The resulting explosion of the dataset size can be
an issue in terms of storage and training costs, as well as in selecting and
tuning the optimal set of transformations to apply. In this work, we
demonstrate that it is possible to significantly reduce the number of data
points included in data augmentation while realizing the same accuracy and
invariance benefits of augmenting the entire dataset. We propose a novel set of
subsampling policies, based on model influence and loss, that can achieve a 90%
reduction in augmentation set size while maintaining the accuracy gains of
standard data augmentation.
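A minimal version of such a policy scores each training point and augments only the highest-scoring fraction. The sketch below ranks by per-example loss; the influence-based variant would substitute influence scores. The `loss_per_example` helper is hypothetical, standing in for whatever per-point scoring the model exposes.

```python
# Sketch of loss-based augmentation subsampling: augment only the 10% of
# training points the current model finds hardest, rather than every point.
import numpy as np

def select_for_augmentation(model, X, y, keep_frac=0.10):
    """model.loss_per_example(X, y) -> per-point losses (assumed helper)."""
    losses = model.loss_per_example(X, y)
    k = max(1, int(keep_frac * len(X)))
    return np.argsort(losses)[-k:]          # indices of highest-loss points

def augmented_dataset(model, X, y, transforms):
    idx = select_for_augmentation(model, X, y)
    X_aug = [t(X[i]) for i in idx for t in transforms]
    y_aug = [y[i] for i in idx for _ in transforms]
    return (np.concatenate([X, np.stack(X_aug)]),
            np.concatenate([y, np.array(y_aug)]))
```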
Relay: A High-Level Compiler for Deep Learning
Frameworks for writing, compiling, and optimizing deep learning (DL) models
have recently enabled progress in areas like computer vision and natural
language processing. Extending these frameworks to accommodate the rapidly
diversifying landscape of DL models and hardware platforms presents challenging
tradeoffs between expressivity, composability, and portability. We present
Relay, a new compiler framework for DL. Relay's functional, statically typed
intermediate representation (IR) unifies and generalizes existing DL IRs to
express state-of-the-art models. The introduction of Relay's expressive IR
requires careful design of domain-specific optimizations, addressed via Relay's
extension mechanisms. Using these extension mechanisms, Relay supports a
unified compiler that can target a variety of hardware platforms. Our
evaluation demonstrates Relay's competitive performance for a broad class of
models and devices (CPUs, GPUs, and emerging accelerators). Relay's design
demonstrates how a unified IR can provide expressivity, composability, and
portability without compromising performance.
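Relay ships as part of Apache TVM. A minimal use of its IR, building and compiling a tiny function for a CPU target, might look like the following; exact API names can shift between TVM releases, so treat this as illustrative.

```python
# Illustrative use of the Relay IR from Apache TVM.
import tvm
from tvm import relay

# Build a tiny statically typed function: y = relu(dense(x, w))
x = relay.var("x", shape=(1, 64), dtype="float32")
w = relay.var("w", shape=(32, 64), dtype="float32")
y = relay.nn.relu(relay.nn.dense(x, w))
func = relay.Function([x, w], y)

mod = tvm.IRModule.from_expr(func)
print(mod)  # human-readable IR

# Compile the module for a CPU target; other targets (GPUs, emerging
# accelerators) are selected by changing the target string.
lib = relay.build(mod, target="llvm")
```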
Learning Disentangling and Fusing Networks for Face Completion Under Structured Occlusions
Face completion aims to generate semantically new pixels for missing facial components. It is a challenging generative task due to the large variations in face appearance. This paper studies generative face completion under structured occlusions. We treat face completion and corruption as disentangling and fusing processes of clean faces and occlusions, and propose a jointly disentangling and fusing Generative Adversarial Network (DF-GAN). First, three domains are constructed, corresponding to the distributions of occluded faces, clean faces, and structured occlusions. The disentangling and fusing processes are formulated as transformations between the three domains. Then the disentangling and fusing networks are built to learn the transformations from unpaired data, where an encoder-decoder structure is adopted, allowing DF-GAN to simulate structured occlusions by modifying the latent representations. Finally, the disentangling and fusing processes are unified into a dual learning framework along with an adversarial strategy. The proposed method is evaluated on the Meshface verification problem. Experimental results on four Meshface databases demonstrate the effectiveness of the proposed method for face completion under structured occlusions.
Comment: Submitted to CVPR 201
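The latent-modification idea can be sketched abstractly: an encoder splits an occluded face into a face code and an occlusion code, and a decoder fuses any pairing of the two, so dropping or swapping the occlusion code completes or re-occludes a face. Everything below (shapes, architectures) is a placeholder, not DF-GAN's actual design.

```python
# Abstract sketch of the disentangle/fuse idea: separate face and occlusion
# codes; the decoder fuses any pairing of them.
import torch
import torch.nn as nn

D = 128  # latent size (assumed)

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 2 * D))
    def forward(self, x):
        z = self.body(x)
        return z[:, :D], z[:, D:]          # (face code, occlusion code)

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.body = nn.Linear(2 * D, 64 * 64)
    def forward(self, z_face, z_occ):
        return self.body(torch.cat([z_face, z_occ], dim=1)).view(-1, 64, 64)

enc, dec = Encoder(), Decoder()
x = torch.rand(4, 64, 64)                            # batch of occluded faces
z_face, z_occ = enc(x)
completed = dec(z_face, torch.zeros_like(z_occ))     # disentangle: drop occlusion
reoccluded = dec(z_face, z_occ.roll(1, dims=0))      # fuse with another occlusion
```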
MONAI: An open-source framework for deep learning in healthcare
Artificial Intelligence (AI) is having a tremendous impact across most areas
of science. Applications of AI in healthcare have the potential to improve our
ability to detect, diagnose, prognose, and intervene on human disease. For AI
models to be used clinically, they need to be made safe, reproducible and
robust, and the underlying software framework must be aware of the
particularities (e.g. geometry, physiology, physics) of medical data being
processed. This work introduces MONAI, a freely available, community-supported,
and consortium-led PyTorch-based framework for deep learning in healthcare.
MONAI extends PyTorch to support medical data, with a particular focus on imaging, and provides purpose-specific AI model architectures, transformations, and utilities that streamline the development and deployment of medical AI models. MONAI follows best practices for software development, providing an easy-to-use, robust, well-documented, and well-tested framework. It preserves the simple, additive, and compositional approach of its underlying PyTorch libraries. MONAI is used by, and receives contributions from, research, clinical, and industrial teams around the world, pursuing applications that span nearly every aspect of healthcare.
Comment: www.monai.io
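A minimal example of the compositional, PyTorch-native style the abstract describes; transform and network argument names follow recent MONAI releases and may differ in older versions.

```python
# Minimal MONAI usage sketch: a transform pipeline plus a purpose-built
# 3D segmentation network. Argument names may vary by MONAI version.
import torch
from monai.networks.nets import UNet
from monai.transforms import Compose, RandFlip, ScaleIntensity

# Transforms compose exactly like torchvision's.
preprocess = Compose([ScaleIntensity(), RandFlip(prob=0.5, spatial_axis=0)])

net = UNet(
    spatial_dims=3,
    in_channels=1,
    out_channels=2,               # e.g. background / foreground
    channels=(16, 32, 64, 128),
    strides=(2, 2, 2),
)

volume = torch.rand(1, 64, 64, 64)   # (channel, D, H, W) stand-in scan
x = preprocess(volume).unsqueeze(0)  # add batch dim -> (1, 1, 64, 64, 64)
logits = net(x)                      # (1, 2, 64, 64, 64)
```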