Reinforce Data, Multiply Impact: Improved Model Accuracy and Robustness with Dataset Reinforcement
We propose Dataset Reinforcement, a strategy to improve a dataset once such
that the accuracy of any model architecture trained on the reinforced dataset
is improved at no additional training cost for users. We propose a Dataset
Reinforcement strategy based on data augmentation and knowledge distillation.
Our generic strategy is designed based on extensive analysis across CNN- and
transformer-based models and on a large-scale study of distillation with
state-of-the-art models under various data augmentations. We create a reinforced
version of the ImageNet training dataset, called ImageNet+, as well as
reinforced datasets CIFAR-100+, Flowers-102+, and Food-101+. Models trained
with ImageNet+ are more accurate, robust, and calibrated, and transfer well to
downstream tasks (e.g., segmentation and detection). As an example, the
accuracy of ResNet-50 improves by 1.7% on the ImageNet validation set, 3.5% on
ImageNetV2, and 10.0% on ImageNet-R. Expected Calibration Error (ECE) on the
ImageNet validation set is also reduced by 9.9%. Using this backbone with
Mask-RCNN for object detection on MS-COCO, the mean average precision improves
by 0.8%. We reach similar gains for MobileNets, ViTs, and Swin-Transformers.
For MobileNetV3 and Swin-Tiny, we observe significant improvements on
ImageNet-R/A/C of up to 20% improved robustness. Models pretrained on ImageNet+
and fine-tuned on CIFAR-100+, Flowers-102+, and Food-101+, reach up to 3.4%
improved accuracy. The code, datasets, and pretrained models are available at
https://github.com/apple/ml-dr.
Comment: Accepted at International Conference on Computer Vision (ICCV) 2023.
Camera-ready version with new Tables 9 and 1
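The abstract above reports a reduction in Expected Calibration Error (ECE) without saying how it is computed. As a rough illustration only, the sketch below uses the common equal-width binning definition of ECE: the average gap between confidence and accuracy per bin, weighted by the share of samples in each bin. The function name, the choice of 15 bins, and the input format are assumptions made for this example and are not taken from the paper or its code release.

```python
import numpy as np


def expected_calibration_error(confidences, correct, n_bins=15):
    """Binned ECE sketch (hypothetical helper, not the authors' implementation).

    confidences: top-class predicted probability per sample, in (0, 1].
    correct:     1 if the prediction was right, 0 otherwise.
    n_bins:      number of equal-width confidence bins (15 is an assumption).
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)

    ece = 0.0
    for lower, upper in zip(edges[:-1], edges[1:]):
        # Samples whose confidence falls in the half-open bin (lower, upper].
        in_bin = (confidences > lower) & (confidences <= upper)
        if in_bin.any():
            accuracy = correct[in_bin].mean()
            avg_confidence = confidences[in_bin].mean()
            # Weight the accuracy/confidence gap by the fraction of samples in the bin.
            ece += in_bin.mean() * abs(accuracy - avg_confidence)
    return ece
```

A model that is right half the time while always predicting with confidence 0.9 has an ECE of about 0.4 under this definition; a model whose confidence matches its accuracy in every bin has an ECE of 0.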
Towards event analysis in time-series data: Asynchronous probabilistic models and learning from partial labels
In this thesis, we contribute in two main directions: modeling asynchronous time-series data and learning from partially labelled data. First, we propose novel probabilistic frameworks that improve the flexibility and expressiveness of current approaches to modeling complex real-world asynchronous event sequence data. Second, we present a scalable approach for learning a deep multi-label classifier end-to-end from partial labels. To evaluate the effectiveness of our proposed frameworks, we focus on visual recognition applications; however, the frameworks are generic and can be used in general settings of learning from event sequences and learning multi-label classifiers from partial labels. Visual recognition is a fundamental component of machine intelligence and has a wide range of applications, such as human activity analysis, autonomous driving, surveillance and security, and health-care monitoring. Through a wide range of experiments, we show that our proposed approaches help build more powerful and effective visual recognition frameworks.
RECAP: Towards Precise Radiology Report Generation via Dynamic Disease Progression Reasoning
Automating radiology report generation can significantly alleviate
radiologists' workloads. Previous research has primarily focused on realizing
highly concise observations while neglecting the precise attributes that
determine the severity of diseases (e.g., small pleural effusion). Since
incorrect attributes will lead to imprecise radiology reports, strengthening
the generation process with precise attribute modeling becomes necessary.
Additionally, the temporal information contained in the historical records,
which is crucial in evaluating a patient's current condition (e.g., heart size
is unchanged), has also been largely disregarded. To address these issues, we
propose RECAP, which generates precise and accurate radiology reports via
dynamic disease progression reasoning. Specifically, RECAP first predicts the
observations and progressions (i.e., spatiotemporal information) given two
consecutive radiographs. It then combines the historical records,
spatiotemporal information, and radiographs for report generation, where a
disease progression graph and dynamic progression reasoning mechanism are
devised to accurately select the attributes of each observation and
progression. Extensive experiments on two publicly available datasets
demonstrate the effectiveness of our model.
Comment: Accepted by Findings of EMNLP 202