Data Augmentation for Spoken Language Understanding via Joint Variational Generation
Data scarcity is one of the main obstacles to domain adaptation in spoken
language understanding (SLU) due to the high cost of creating manually tagged
SLU datasets. Recent work on neural generative text models, particularly
latent variable models such as the variational autoencoder (VAE), has shown
promising results in generating plausible and natural sentences. In
this paper, we propose a novel generative architecture which leverages the
generative power of latent variable models to jointly synthesize fully
annotated utterances. Our experiments show that existing SLU models trained on
the additional synthetic examples achieve performance gains. Our approach not
only helps alleviate the data scarcity issue in the SLU task for many datasets
but also consistently improves language understanding performance across
various SLU models, supported by extensive experiments and rigorous statistical
testing.
Comment: 8 pages, 3 figures, 4 tables. Accepted at AAAI 2019.
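To make the joint-generation idea concrete, here is a minimal sketch in PyTorch (the GRU encoder/decoder and two-head design are assumptions chosen for illustration, not the paper's exact architecture) of a VAE whose single latent code is decoded into both an utterance and its aligned slot-tag sequence:

```python
# Minimal sketch of a jointly generative VAE for SLU augmentation
# (hypothetical architecture, not the paper's exact model).
import torch
import torch.nn as nn

class JointVAE(nn.Module):
    def __init__(self, vocab_size, num_slots, emb_dim=64, hid_dim=128, z_dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.to_mu = nn.Linear(hid_dim, z_dim)
        self.to_logvar = nn.Linear(hid_dim, z_dim)
        self.z_to_h = nn.Linear(z_dim, hid_dim)
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.word_head = nn.Linear(hid_dim, vocab_size)  # reconstructs tokens
        self.slot_head = nn.Linear(hid_dim, num_slots)   # emits aligned slot tags

    def forward(self, tokens):
        emb = self.embed(tokens)                         # (batch, seq, emb_dim)
        _, h = self.encoder(emb)
        mu, logvar = self.to_mu(h[-1]), self.to_logvar(h[-1])
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
        out, _ = self.decoder(emb, self.z_to_h(z).unsqueeze(0))  # teacher forcing
        return self.word_head(out), self.slot_head(out), mu, logvar
```

Training would minimize reconstruction losses on both heads plus the KL term; sampling z from the prior then yields new (utterance, slot-tag) pairs to add to the training set.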
Equivariant Data Augmentation for Generalization in Offline Reinforcement Learning
We present a novel approach to address the challenge of generalization in
offline reinforcement learning (RL), where the agent learns from a fixed
dataset without any additional interaction with the environment. Specifically,
we aim to improve the agent's ability to generalize to out-of-distribution
goals. To achieve this, we propose to learn a dynamics model and check if it is
equivariant with respect to a fixed type of transformation, namely translations
in the state space. We then use an entropy regularizer to enlarge the
equivariant set and augment the dataset with the resulting transformed samples.
Finally, we learn a new policy offline based on the augmented dataset, with an
off-the-shelf offline RL algorithm. Our experimental results demonstrate that
our approach can greatly improve the test performance of the policy on the
considered environments.
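A rough sketch of the augmentation step, under the simplifying assumption that the dynamics model is a plain callable and the equivariance check is a hard threshold (the paper's entropy regularizer, which enlarges the equivariant set during training, is not shown):

```python
# Keep translated transitions on which the learned dynamics model behaves
# equivariantly, i.e. f(s + d, a) ~ f(s, a) + d. The tolerance test is an
# illustrative stand-in for the paper's equivariant-set construction.
import numpy as np

def augment_with_translations(dynamics_model, transitions, deltas, tol=0.1):
    """transitions: iterable of (state, action, next_state) numpy arrays."""
    augmented = []
    for s, a, s_next in transitions:
        base_pred = dynamics_model(s, a)
        for d in deltas:
            # Equivariance check: the prediction should shift by the same d.
            if np.linalg.norm(dynamics_model(s + d, a) - (base_pred + d)) < tol:
                augmented.append((s + d, a, s_next + d))
    return list(transitions) + augmented
```

The augmented dataset is then handed to any off-the-shelf offline RL algorithm, as the abstract describes.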
On the Generalization Effects of Linear Transformations in Data Augmentation
Data augmentation is a powerful technique to improve performance in
applications such as image and text classification. Yet, there is little
rigorous understanding of why and how various augmentations work. In this work,
we consider a family of linear transformations and study their effects on the
ridge estimator in an over-parametrized linear regression setting. First, we
show that transformations which preserve the labels of the data can improve
estimation by enlarging the span of the training data. Second, we show that
transformations which mix data can improve estimation through a
regularization effect. Finally, we validate our theoretical insights on MNIST.
Based on these insights, we propose an augmentation scheme that searches over
the space of transformations according to how uncertain the model is about the
transformed data. We validate the proposed scheme on image and text datasets. For example,
our method outperforms RandAugment by 1.24% on CIFAR-100 using
Wide-ResNet-28-10. Furthermore, we achieve comparable accuracy to the SoTA
Adversarial AutoAugment on CIFAR datasets.
Comment: International Conference on Machine Learning (ICML) 2020. Added experimental results on ImageNet.
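The uncertainty-driven search might look like the following sketch (the entropy criterion and all names are illustrative assumptions, not the paper's exact procedure): among candidate label-preserving transformations, train on the view of the input the current model is least certain about:

```python
# Pick the transformed view with maximum predictive entropy (a common proxy
# for model uncertainty); assumes a classifier returning logits.
import torch
import torch.nn.functional as F

def most_uncertain_view(model, x, transforms):
    best_x, best_entropy = x, float("-inf")
    model.eval()
    with torch.no_grad():
        for t in transforms:
            x_t = t(x)
            probs = F.softmax(model(x_t), dim=-1)
            entropy = -(probs * probs.clamp_min(1e-12).log()).sum(-1).mean()
            if entropy.item() > best_entropy:
                best_x, best_entropy = x_t, entropy.item()
    return best_x
```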
Kernel-convoluted Deep Neural Networks with Data Augmentation
The Mixup method (Zhang et al. 2018), which uses linearly interpolated data,
has emerged as an effective data augmentation tool to improve generalization
performance and robustness to adversarial examples. The motivation is to
curtail undesirable oscillations by implicitly constraining the model to behave
linearly between observed data points, thereby promoting smoothness. In this
work, we formally investigate this premise, propose a way to explicitly impose
smoothness constraints, and extend it to incorporate implicit model
constraints. First, we derive a new function class composed of
kernel-convoluted models (KCM) where the smoothness constraint is directly
imposed by locally averaging the original functions with a kernel function.
Second, we propose to incorporate the Mixup method into KCM to expand the
domains of smoothness. For both the KCM and the KCM combined with the Mixup,
we provide risk analyses under some conditions on the kernel. We show that the
upper bound of the excess risk converges no more slowly than that of the
original function class. The upper bound for the KCM with the Mixup remains
dominated by that of the KCM if the perturbation of the Mixup vanishes faster
than $O(n^{-1/2})$, where $n$ is the sample size. Using CIFAR-10 and CIFAR-100
datasets, our experiments demonstrate that the KCM with the Mixup outperforms
the Mixup method in terms of generalization and robustness to adversarial
examples.
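As a rough illustration of the two ingredients, the sketch below approximates the kernel convolution by Monte Carlo averaging over Gaussian kernel noise (the kernel choice, noise scale, and sample count are assumptions; the paper defines the convolution analytically) and pairs it with standard Mixup:

```python
# Monte Carlo stand-in for the kernel-convoluted model, plus standard Mixup.
import torch

def kcm_forward(model, x, kernel_std=0.05, n_samples=8):
    """Approximate f_K(x) = E_u[f(x + u)] by averaging over kernel noise u."""
    outs = [model(x + kernel_std * torch.randn_like(x)) for _ in range(n_samples)]
    return torch.stack(outs).mean(dim=0)

def mixup(x, y_onehot, alpha=1.0):
    """Standard Mixup (Zhang et al. 2018) on a batch with one-hot labels."""
    lam = torch.distributions.Beta(alpha, alpha).sample()
    perm = torch.randperm(x.size(0))
    return lam * x + (1 - lam) * x[perm], lam * y_onehot + (1 - lam) * y_onehot[perm]
```

Training the combined model would then apply `mixup` to each batch and compute the loss on `kcm_forward` outputs.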
Data Augmentation for Modeling Human Personality: The Dexter Machine
Modeling human personality is important for several AI challenges, from the
engineering of artificial psychotherapists to the design of persona bots.
However, the field of computational personality analysis heavily relies on
labeled data, which may be expensive, difficult, or impossible to obtain. This
problem is amplified when dealing with rare personality types or disorders
(e.g., the anti-social psychopathic personality disorder). In this context, we
developed a text-based data augmentation approach for human personality
(PEDANT). PEDANT does not rely on the usual kind of labeled data; instead, it
combines a generative pre-trained model (GPT) with domain expertise. Testing
the methodology on three different datasets yields results that support the
quality of the generated data.
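A toy sketch of this recipe, assuming the Hugging Face `transformers` text-generation pipeline with GPT-2 and a hypothetical keyword filter standing in for PEDANT's domain knowledge base (prompts and keywords below are invented for illustration):

```python
# Expert-written prompts seed a generative LM with cues for a rare personality
# profile; a simple filter keeps plausible continuations as pseudo-labeled data.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

expert_prompts = [  # assumed, illustrative domain-expert cues
    "I rarely feel guilt about how I treat people, because",
    "Rules are for other people, so",
]
domain_keywords = {"guilt", "remorse", "manipulate", "advantage"}  # toy filter

synthetic = []
for prompt in expert_prompts:
    for out in generator(prompt, max_new_tokens=40, num_return_sequences=3,
                         do_sample=True):
        continuation = out["generated_text"][len(prompt):]
        # Filter on the generated continuation only, not the prompt itself.
        if any(k in continuation.lower() for k in domain_keywords):
            synthetic.append((prompt + continuation, "antisocial"))
```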