
    Reinforce Data, Multiply Impact: Improved Model Accuracy and Robustness with Dataset Reinforcement

    We propose Dataset Reinforcement, a strategy to improve a dataset once such that the accuracy of any model architecture trained on the reinforced dataset improves at no additional training cost for users. The strategy is based on data augmentation and knowledge distillation. Its generic design follows from extensive analysis across CNN- and transformer-based models and a large-scale study of distillation with state-of-the-art models and various data augmentations. We create a reinforced version of the ImageNet training dataset, called ImageNet+, as well as reinforced datasets CIFAR-100+, Flowers-102+, and Food-101+. Models trained with ImageNet+ are more accurate, robust, and calibrated, and transfer well to downstream tasks (e.g., segmentation and detection). As an example, the accuracy of ResNet-50 improves by 1.7% on the ImageNet validation set, 3.5% on ImageNetV2, and 10.0% on ImageNet-R. Expected Calibration Error (ECE) on the ImageNet validation set is also reduced by 9.9%. Using this backbone with Mask-RCNN for object detection on MS-COCO, the mean average precision improves by 0.8%. We reach similar gains for MobileNets, ViTs, and Swin-Transformers. For MobileNetV3 and Swin-Tiny, we observe robustness improvements of up to 20% on ImageNet-R/A/C. Models pretrained on ImageNet+ and fine-tuned on CIFAR-100+, Flowers-102+, and Food-101+ improve accuracy by up to 3.4%. The code, datasets, and pretrained models are available at https://github.com/apple/ml-dr.
    Comment: Accepted at International Conference on Computer Vision (ICCV) 2023. Camera-ready version with new Tables 9 and 1
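For readers unfamiliar with the calibration metric cited in this abstract, the sketch below shows one common way to compute Expected Calibration Error: predictions are grouped into equal-width confidence bins, and the gap between mean confidence and accuracy in each bin is averaged, weighted by bin size. This is an illustrative assumption, not code from the ml-dr repository; the function name `expected_calibration_error` and the default of 10 bins are hypothetical choices, and the paper's exact binning may differ.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Illustrative ECE with equal-width confidence bins (hypothetical helper).

    confidences: predicted top-class probabilities in [0, 1]
    correct:     1/0 (or True/False) flags, one per prediction
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    total = len(confidences)
    if total == 0:
        raise ValueError("at least one prediction is required")

    conf_sum = [0.0] * n_bins   # summed confidence per bin
    hit_sum = [0.0] * n_bins    # summed correctness per bin
    count = [0] * n_bins        # number of predictions per bin

    for conf, hit in zip(confidences, correct):
        # Bins are [k/n_bins, (k+1)/n_bins); a confidence of 1.0 goes in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        conf_sum[idx] += conf
        hit_sum[idx] += float(hit)
        count[idx] += 1

    ece = 0.0
    for k in range(n_bins):
        if count[k] == 0:
            continue
        mean_conf = conf_sum[k] / count[k]
        accuracy = hit_sum[k] / count[k]
        # Weight each bin's |accuracy - confidence| gap by its share of predictions.
        ece += (count[k] / total) * abs(accuracy - mean_conf)
    return ece
```

A model that predicts with confidence 0.95 but is right only three times out of four would score an ECE of about 0.2 under this sketch; lower values indicate better calibration.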

    Towards event analysis in time-series data: Asynchronous probabilistic models and learning from partial labels

    In this thesis, we contribute in two main directions: modeling asynchronous time-series data and learning from partially labelled data. First, we propose novel probabilistic frameworks that improve the flexibility and expressiveness of current approaches to modeling complex real-world asynchronous event sequence data. Second, we present a scalable approach to learning a deep multi-label classifier end-to-end from partial labels. To evaluate the effectiveness of our proposed frameworks, we focus on visual recognition applications; however, the frameworks are generic and can be used in general settings of learning event sequences and learning multi-label classifiers from partial labels. Visual recognition is a fundamental component of machine intelligence and has a wide range of applications, such as human activity analysis, autonomous driving, surveillance and security, and health-care monitoring. Through a wide range of experiments, we show that our proposed approaches help to build more powerful and effective visual recognition frameworks.

    RECAP: Towards Precise Radiology Report Generation via Dynamic Disease Progression Reasoning

    Automating radiology report generation can significantly alleviate radiologists' workloads. Previous research has primarily focused on realizing highly concise observations while neglecting the precise attributes that determine the severity of diseases (e.g., small pleural effusion). Since incorrect attributes will lead to imprecise radiology reports, strengthening the generation process with precise attribute modeling becomes necessary. Additionally, the temporal information contained in the historical records, which is crucial in evaluating a patient's current condition (e.g., heart size is unchanged), has also been largely disregarded. To address these issues, we propose RECAP, which generates precise and accurate radiology reports via dynamic disease progression reasoning. Specifically, RECAP first predicts the observations and progressions (i.e., spatiotemporal information) given two consecutive radiographs. It then combines the historical records, spatiotemporal information, and radiographs for report generation, where a disease progression graph and dynamic progression reasoning mechanism are devised to accurately select the attributes of each observation and progression. Extensive experiments on two publicly available datasets demonstrate the effectiveness of our model.
    Comment: Accepted by Findings of EMNLP 202