
    Class-Incremental Exemplar Compression for Class-Incremental Learning

    Exemplar-based class-incremental learning (CIL) fine-tunes the model with all samples of new classes but only few-shot exemplars of old classes in each incremental phase, where the "few-shot" constraint reflects the limited memory budget. In this paper, we break this "few-shot" limit based on a simple yet surprisingly effective idea: compressing exemplars by downsampling non-discriminative pixels and saving "many-shot" compressed exemplars in memory. Without needing any manual annotation, we achieve this compression by generating 0-1 masks on discriminative pixels from class activation maps (CAM). We propose an adaptive mask generation model called class-incremental masking (CIM) to explicitly resolve two difficulties of using CAM: 1) transforming the heatmaps of CAM into 0-1 masks with an arbitrary threshold leads to a trade-off between the coverage of discriminative pixels and the quantity of exemplars, as the total memory is fixed; and 2) optimal thresholds vary for different object classes, which is particularly obvious in the dynamic environment of CIL. We optimize the CIM model alternately with the conventional CIL model through a bilevel optimization problem. We conduct extensive experiments on high-resolution CIL benchmarks including Food-101, ImageNet-100, and ImageNet-1000, and show that using the exemplars compressed by CIM can achieve a new state-of-the-art CIL accuracy, e.g., 4.8 percentage points higher than FOSTER on 10-Phase ImageNet-1000. Our code is available at https://github.com/xfflzl/CIM-CIL.
    Comment: Accepted to CVPR 2023.
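
    To make the masking idea concrete, the following minimal NumPy sketch compresses an exemplar by keeping CAM-selected discriminative pixels at full resolution and box-downsampling the rest; the function name, the single fixed threshold, and the crude downsampling are illustrative assumptions, not the paper's implementation. In the paper, the threshold itself is learned per class by CIM through bilevel optimization rather than fixed as here.

    import numpy as np

    def compress_exemplar(image: np.ndarray, cam: np.ndarray,
                          threshold: float, factor: int = 4) -> np.ndarray:
        """Keep discriminative pixels at full resolution; downsample the rest.

        image: (H, W, 3) array; cam: (H, W) heatmap in [0, 1] (assumed precomputed).
        """
        # Binarize the CAM heatmap into a 0-1 mask at the (class-specific) threshold.
        mask = (cam >= threshold).astype(np.float32)[..., None]
        # Crude box downsample-then-upsample as a stand-in for real compression.
        h, w = image.shape[:2]
        small = image[::factor, ::factor]
        coarse = np.repeat(np.repeat(small, factor, axis=0), factor, axis=1)[:h, :w]
        # Composite: full-resolution foreground, low-resolution background.
        return mask * image + (1.0 - mask) * coarse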

    Distance-rank Aware Sequential Reward Learning for Inverse Reinforcement Learning with Sub-optimal Demonstrations

    Inverse reinforcement learning (IRL) aims to explicitly infer an underlying reward function from collected expert demonstrations. Since obtaining expert demonstrations can be costly, current IRL techniques focus on learning a better-than-demonstrator policy using a reward function derived from sub-optimal demonstrations. However, existing IRL algorithms primarily tackle the challenge of trajectory-ranking ambiguity when learning the reward function. They overlook the degree of difference between trajectories in terms of their returns, which is essential for further removing reward ambiguity. Moreover, the reward of a single transition is heavily influenced by the contextual information within the trajectory. To address these issues, we introduce the Distance-rank Aware Sequential Reward Learning (DRASRL) framework. Unlike existing approaches, DRASRL takes into account both the ranking of trajectories and the degrees of dissimilarity between them to collaboratively eliminate reward ambiguity when learning a sequence of contextually informed reward signals. Specifically, we leverage the distance between the policies from which the trajectories are generated as a measure of the degree of difference between trajectories. This distance-aware information is then used to infer embeddings in the representation space for reward learning via contrastive learning. Meanwhile, we integrate a pairwise ranking loss to incorporate ranking information into the latent features. Moreover, we adopt a Transformer architecture to capture the contextual dependencies within trajectories in the latent space, leading to more accurate reward estimation. Through extensive experiments, our DRASRL framework demonstrates significant performance improvements over previous state-of-the-art methods.
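
    As a rough illustration of the two loss terms described above, here is a PyTorch sketch of a Transformer-based sequential reward model trained with a pairwise ranking loss and a distance-aware embedding term; the architecture sizes, the mean pooling, and the exact form of the distance term are assumptions, not the paper's specification.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SeqRewardModel(nn.Module):
        """Transformer encoder over transition features -> per-step rewards."""
        def __init__(self, feat_dim: int, d_model: int = 64):
            super().__init__()
            self.proj = nn.Linear(feat_dim, d_model)
            layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers=2)
            self.reward_head = nn.Linear(d_model, 1)

        def forward(self, traj):                        # traj: (B, T, feat_dim)
            z = self.encoder(self.proj(traj))           # contextual per-step embeddings
            rewards = self.reward_head(z).squeeze(-1)   # (B, T) per-step rewards
            return rewards.sum(dim=1), z.mean(dim=1)    # return, pooled trajectory embedding

    def pairwise_ranking_loss(ret_lo, ret_hi):
        """Bradley-Terry-style loss: trajectory `hi` is ranked above `lo`."""
        return -F.logsigmoid(ret_hi - ret_lo).mean()

    def distance_alignment_loss(z_a, z_b, policy_dist):
        """Stand-in for the distance-aware term: embedding distances should
        track the distance between the generating policies."""
        return F.mse_loss((z_a - z_b).norm(dim=-1), policy_dist)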

    Robust tracking with discriminative ranking middle-level patches

    The appearance model has been shown to be essential for robust visual tracking, since it is the basic criterion for locating targets in video sequences. Although existing tracking-by-detection algorithms have shown great promise, they still suffer from the drift problem caused by updating appearance models. In this paper, we propose a new appearance model composed of ranking middle-level patches to capture more object distinctiveness than traditional tracking-by-detection models. Targets and backgrounds are represented by both low-level bottom-up features and high-level top-down patches, which complement each other. Bottom-up features are defined at the pixel level, and each feature receives a discrimination score through a selective feature-attention mechanism. In top-down feature extraction, rectangular patches are ranked according to their bottom-up discrimination scores, by which they are clustered into irregular patches, named ranking middle-level patches. In addition, at the classifier-training stage, the online random forests algorithm is specifically refined to reduce drift. Experiments on challenging public datasets and our test videos demonstrate that our approach effectively prevents tracker drift and obtains competitive performance in visual tracking.
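
    The patch-ranking step can be sketched in NumPy as follows: given a per-pixel discrimination-score map (produced upstream by the feature-attention mechanism), rectangular patches are scored and sorted. The grid layout and patch size are illustrative assumptions, and the subsequent clustering into irregular middle-level patches is omitted.

    import numpy as np

    def rank_patches(score_map: np.ndarray, patch: int = 8):
        """Split a per-pixel discrimination map into a grid of rectangular patches
        and return their (row, col) indices sorted from most to least discriminative."""
        h, w = score_map.shape
        gh, gw = h // patch, w // patch
        grid = score_map[:gh * patch, :gw * patch].reshape(gh, patch, gw, patch)
        patch_scores = grid.mean(axis=(1, 3))              # mean score per patch, (gh, gw)
        order = np.argsort(patch_scores, axis=None)[::-1]  # descending rank
        coords = np.column_stack(np.unravel_index(order, (gh, gw)))
        return coords, patch_scores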

    Exploring Driving Behavior for Autonomous Vehicles Based on Gramian Angular Field Vision Transformer

    Effective classification of autonomous vehicle (AV) driving behavior has emerged as critical for diagnosing AV operation faults, enhancing autonomous driving algorithms, and reducing accident rates. This paper presents the Gramian Angular Field Vision Transformer (GAF-ViT) model, designed to analyze AV driving behavior. The proposed GAF-ViT model consists of three key components: a GAF Transformer Module, a Channel Attention Module, and a Multi-Channel ViT Module. These modules collectively convert representative sequences of multivariate behavior into multi-channel images and employ image-recognition techniques for behavior classification. A channel attention mechanism is applied to the multi-channel images to discern the impact of different driving-behavior features. Experimental evaluation on trajectories from the Waymo Open Dataset demonstrates that the proposed model achieves state-of-the-art performance. Furthermore, an ablation study substantiates the efficacy of the individual modules within the model.
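
    The Gramian Angular Field transform underlying the GAF Transformer Module has a standard closed form; the NumPy sketch below computes the summation variant (GASF) for one behavior channel, with min-max rescaling as an assumed normalization choice. Stacking one such image per behavior feature yields the multi-channel input for the ViT.

    import numpy as np

    def gasf(series: np.ndarray) -> np.ndarray:
        """Gramian Angular Summation Field of a univariate time series."""
        # Rescale to [-1, 1] so that arccos is well defined.
        x = 2 * (series - series.min()) / (series.max() - series.min() + 1e-8) - 1
        x = np.clip(x, -1.0, 1.0)
        phi = np.arccos(x)                          # polar-coordinate angles
        return np.cos(phi[:, None] + phi[None, :])  # pairwise angular sums -> (T, T) image

    For example, gasf(speeds) on a length-T speed sequence yields a T x T image channel.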

    Single Satellite Imagery Simultaneous Super-resolution and Colorization using Multi-task Deep Neural Networks

    Satellite imagery is a typical kind of remote sensing data, offering large-area coverage and strong macro-level integrity. However, for many commercial applications, such as virtual display of urban traffic flow or virtual interaction with environmental resources, one drawback of satellite imagery is its low spatial resolution, which fails to provide clear image details. Moreover, in recent years, synthesizing color for grayscale satellite imagery, or recovering the original color of camouflaged sensitive regions, has become an urgent requirement for virtual-reality interaction with large spatial objects. In this work, unlike existing works that solve these two problems separately, we focus on achieving image super-resolution (SR) and image colorization synchronously. Based on multi-task learning, we provide a novel deep neural network model that performs single-satellite-imagery SR and colorization simultaneously. By feeding the color feature representations back into the SR network and jointly optimizing the two tasks, our deep model achieves mutual cooperation between imagery reconstruction and image colorization. To avoid color bias, we not only adopt non-satellite imagery to enrich the color diversity of satellite images, but also recalculate the prior color distribution and the valid color range on the mixed data. We evaluate the proposed model on satellite images from different datasets, such as RSSCN7 and AID. Both the evaluations and comparisons reveal that the proposed multi-task deep learning approach is superior to state-of-the-art methods, accomplishing image SR and colorization simultaneously and efficiently.
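
    The feedback of color features into the SR branch can be sketched schematically in PyTorch; the shared encoder, layer sizes, and fusion by channel concatenation below are illustrative assumptions rather than the paper's architecture.

    import torch
    import torch.nn as nn

    class SRColorNet(nn.Module):
        def __init__(self, scale: int = 2):
            super().__init__()
            self.encoder = nn.Sequential(nn.Conv2d(1, 64, 3, padding=1), nn.ReLU())
            self.color_head = nn.Conv2d(64, 2, 3, padding=1)   # predict ab chroma channels
            self.sr_head = nn.Sequential(                      # SR conditioned on color features
                nn.Conv2d(64 + 2, 64 * scale * scale, 3, padding=1),
                nn.PixelShuffle(scale),                        # sub-pixel upsampling by `scale`
                nn.Conv2d(64, 1, 3, padding=1),
            )

        def forward(self, gray_lr: torch.Tensor):
            feats = self.encoder(gray_lr)                      # shared features
            ab = self.color_head(feats)                        # low-res color estimate
            hr = self.sr_head(torch.cat([feats, ab], dim=1))   # color fed back into SR
            return hr, ab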

    The Prevalence and Risk Factors of Diabetic Retinopathy: Screening and Prophylaxis Project in 6 Provinces of China.

    Purpose: To investigate the prevalence and associated factors of diabetic retinopathy (DR) and advanced DR in Chinese adults with diabetes mellitus (DM). Patients and Methods: A cross-sectional study was performed on 4831 diabetic patients from 24 hospitals from April 2018 to July 2020. Non-mydriatic fundus photographs were interpreted by an artificial intelligence (AI) system; photographs unsuitable for AI interpretation were interpreted by two ophthalmologists trained by an expert ophthalmologist at Beijing Tongren Hospital. Medical history, height, weight, body mass index (BMI), glycosylated hemoglobin (HbA1c), blood pressure, and laboratory examinations were recorded. Results: A total of 4831 DM patients were included in this study. The prevalence of DR and advanced DR in the diabetic population was 31.8% and 6.6%, respectively. In multiple logistic regression analysis, male sex (odds ratio [OR], 1.39), duration of diabetes (OR, 1.05), HbA1c (OR, 1.11), occupation as a farmer (OR, 1.39), insulin treatment (OR, 1.61), region (northern: OR, 1.78; rural: OR, 6.96), and presence of other diabetic complications (OR, 2.03) were associated with increased odds of DR. The factors associated with increased odds of advanced DR included poor glycemic control (HbA1c > 7.0%) (OR, 2.58), insulin treatment (OR, 1.73), longer duration of diabetes (OR, 3.66), rural region (OR, 4.84), and presence of other diabetic complications (OR, 2.36), whereas overweight (BMI > 25 kg/m²) (OR, 0.61) was associated with reduced odds of advanced DR. Conclusion: This study shows that the prevalence of DR is very high in Chinese adults with DM, highlighting the necessity of early diabetic retinal screening.
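
    For readers unfamiliar with how such odds ratios are produced, the sketch below fits a multiple logistic regression with statsmodels and exponentiates the coefficients; the data are synthetic, and the predictors merely stand in for variables like diabetes duration, HbA1c, and BMI, not the study's dataset.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 3))                 # stand-ins: duration, HbA1c, BMI (standardized)
    logits = 0.05 * X[:, 0] + 0.10 * X[:, 1] - 0.5 * X[:, 2] - 1.0
    y = (rng.random(500) < 1 / (1 + np.exp(-logits))).astype(float)  # simulated DR outcome

    model = sm.Logit(y, sm.add_constant(X)).fit(disp=0)
    odds_ratios = np.exp(model.params)            # OR per unit increase in each predictor
    print(odds_ratios)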