71 research outputs found
Unsupervised Domain Adaptation on Reading Comprehension
Reading comprehension (RC) has been studied on a variety of datasets, with performance boosted by deep neural networks. However, the generalization capability of these models across different domains remains unclear. To address this issue, we investigate unsupervised domain adaptation for RC, in which a model is trained on a labeled source domain and applied to a target domain with only unlabeled samples. We first show that even with powerful BERT contextual representations, performance remains unsatisfactory when a model trained on one dataset is applied directly to another target dataset. To solve this, we propose a novel conditional adversarial self-training method (CASe). Specifically, our approach leverages a BERT model fine-tuned on the source dataset together with confidence filtering to generate reliable pseudo-labeled samples in the target domain for self-training. In addition, it further reduces the domain distribution discrepancy through conditional adversarial learning across domains. Extensive experiments show our approach achieves accuracy comparable to supervised models on multiple large-scale benchmark datasets.
Comment: 8 pages, 6 figures, 5 tables, accepted by AAAI 2020
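The abstract describes a two-part pipeline: confidence-filtered pseudo-labeling for self-training, plus conditional adversarial learning across domains. Below is a minimal sketch of only the pseudo-labeling step, assuming a Hugging Face-style extractive QA model that returns `start_logits`/`end_logits`; the function name and the `confidence_threshold` value are illustrative assumptions rather than the paper's actual implementation or hyperparameters, and the adversarial component is omitted.

```python
# Hedged sketch of confidence-filtered pseudo-labeling for self-training.
# Not the authors' released code; model interface and threshold are assumptions.
import torch


def generate_pseudo_labels(model, unlabeled_loader, confidence_threshold=0.9):
    """Label target-domain samples with a source-fine-tuned QA model and keep
    only answers whose joint span probability clears the threshold."""
    model.eval()
    pseudo_labeled = []
    with torch.no_grad():
        for batch in unlabeled_loader:
            outputs = model(input_ids=batch["input_ids"],
                            attention_mask=batch["attention_mask"])
            start_probs = torch.softmax(outputs.start_logits, dim=-1)
            end_probs = torch.softmax(outputs.end_logits, dim=-1)
            start_conf, start_idx = start_probs.max(dim=-1)
            end_conf, end_idx = end_probs.max(dim=-1)
            confidence = start_conf * end_conf  # joint confidence of the predicted span
            for i in range(confidence.size(0)):
                if confidence[i].item() >= confidence_threshold:
                    pseudo_labeled.append({
                        "input_ids": batch["input_ids"][i],
                        "attention_mask": batch["attention_mask"][i],
                        "start_position": start_idx[i].item(),
                        "end_position": end_idx[i].item(),
                    })
    # The retained samples are fed back into fine-tuning as pseudo-labeled target data.
    return pseudo_labeled
```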
Sequential Subset Matching for Dataset Distillation
Dataset distillation is an emerging task that synthesizes a small dataset for training deep neural networks (DNNs), reducing data storage and model training costs. The synthetic datasets are expected to capture the essence of the knowledge contained in real-world datasets, such that the former yield performance similar to the latter. Recent advancements in distillation methods have produced notable improvements in generating synthetic datasets. However, current state-of-the-art methods treat the entire synthetic dataset as a unified entity and optimize each synthetic instance equally. This static optimization approach may lead to performance degradation in dataset distillation. Specifically, we argue that static optimization can give rise to a coupling issue within the synthetic data, particularly when a larger amount of synthetic data is being optimized. This coupling issue, in turn, prevents the distilled dataset from capturing the high-level features learned by DNNs in later training epochs.
In this study, we propose a new dataset distillation strategy called Sequential Subset Matching (SeqMatch), which tackles this problem by adaptively optimizing the synthetic data to encourage sequential acquisition of knowledge during dataset distillation. Our analysis indicates that SeqMatch effectively addresses the coupling issue by generating the synthetic instances sequentially, thereby significantly enhancing performance. SeqMatch outperforms state-of-the-art methods on various datasets, including SVHN, CIFAR-10, CIFAR-100, and Tiny ImageNet. Our code is available at https://github.com/shqii1j/seqmatch
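As a rough illustration of the sequential idea described above, and not the authors' released implementation, the sketch below splits the synthetic set into subsets and optimizes them one at a time while keeping earlier subsets fixed. The names `model_fn`, `match_loss`, and the `stage` argument are placeholder assumptions standing in for whatever network initialization and matching objective (e.g. gradient or trajectory matching) a concrete method would use.

```python
# Hedged sketch of sequential subset optimization for dataset distillation.
# All function names and hyperparameters here are illustrative assumptions.
import torch


def seqmatch_distill(synthetic_data, real_loader, model_fn, match_loss,
                     num_subsets=2, steps_per_subset=1000, lr=0.1):
    # Split the synthetic set into subsets to be optimized one after another.
    subsets = [s.detach().clone().requires_grad_(True)
               for s in torch.chunk(synthetic_data, num_subsets, dim=0)]
    frozen = []  # subsets already optimized are kept fixed afterwards
    for k, subset in enumerate(subsets):
        optimizer = torch.optim.SGD([subset], lr=lr)
        for _ in range(steps_per_subset):
            model = model_fn()  # fresh network for this matching step
            # Earlier subsets supply the "already learned" knowledge;
            # only the current subset receives gradient updates.
            current = torch.cat(frozen + [subset], dim=0) if frozen else subset
            loss = match_loss(model, current, real_loader, stage=k)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        frozen.append(subset.detach())
    return torch.cat(frozen, dim=0)
```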
You Only Condense Once: Two Rules for Pruning Condensed Datasets
Dataset condensation is a crucial tool for enhancing training efficiency by
reducing the size of the training dataset, particularly in on-device scenarios.
However, these scenarios present two significant challenges: 1) the varying
computational resources available on the devices require a dataset size
different from that of the pre-defined condensed dataset, and 2) the limited
computational resources often preclude the possibility of conducting additional
condensation processes. We introduce You Only Condense Once (YOCO) to overcome
these limitations. On top of one condensed dataset, YOCO produces smaller
condensed datasets with two embarrassingly simple dataset pruning rules: Low
LBPE Score and Balanced Construction. YOCO offers two key advantages: 1) it can
flexibly resize the dataset to fit varying computational constraints, and 2) it
eliminates the need for extra condensation processes, which can be
computationally prohibitive. Experiments validate our findings on networks
including ConvNet, ResNet and DenseNet, and datasets including CIFAR-10,
CIFAR-100 and ImageNet. For example, our YOCO surpassed various dataset
condensation and dataset pruning methods on CIFAR-10 with ten Images Per Class
(IPC), achieving 6.98-8.89% and 6.31-23.92% accuracy gains, respectively. The
code is available at: https://github.com/he-y/you-only-condense-once.
Comment: Accepted by NeurIPS 2023
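The two pruning rules lend themselves to a short sketch. The following is an illustrative, unofficial implementation that assumes each condensed example already has a per-example score playing the role of the LBPE score (whose exact computation is not reproduced here): the lowest-scoring examples are kept, subject to an equal per-class budget (Balanced Construction).

```python
# Hedged sketch of the two pruning rules; scores are assumed precomputed.
import numpy as np


def prune_condensed_dataset(scores, labels, target_ipc):
    """Return indices of the retained condensed examples.

    scores:     (N,) array, lower = kept first (Low LBPE Score rule)
    labels:     (N,) integer class labels
    target_ipc: images per class to keep (Balanced Construction rule)
    """
    scores = np.asarray(scores)
    labels = np.asarray(labels)
    keep = []
    for c in np.unique(labels):
        class_idx = np.where(labels == c)[0]
        order = class_idx[np.argsort(scores[class_idx])]  # ascending score
        keep.extend(order[:target_ipc].tolist())          # equal budget per class
    return np.array(sorted(keep))
```

Pruning a 10-IPC condensed set down to, say, 5 IPC would then amount to `prune_condensed_dataset(scores, labels, target_ipc=5)`, with no additional condensation run required.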