
    MDCS: More Diverse Experts with Consistency Self-distillation for Long-tailed Recognition

    Full text link
    Recently, multi-expert methods have led to significant improvements in long-tailed recognition (LTR). We identify two aspects that require further enhancement to boost LTR: (1) more diverse experts; (2) lower model variance. However, previous methods did not handle these well. To this end, we propose More Diverse experts with Consistency Self-distillation (MDCS) to bridge the gap left by earlier methods. Our MDCS approach consists of two core components: Diversity Loss (DL) and Consistency Self-distillation (CS). In detail, DL promotes diversity among experts by controlling their focus on different categories. To reduce model variance, we employ KL divergence to distill the richer knowledge of weakly augmented instances for the experts' self-distillation. In particular, we design Confident Instance Sampling (CIS) to select correctly classified instances for CS, avoiding biased or noisy knowledge. In our analysis and ablation study, we demonstrate that, compared with previous work, our method effectively increases the diversity of experts, significantly reduces model variance, and improves recognition accuracy. Moreover, the roles of DL and CS are mutually reinforcing and coupled: the diversity of experts benefits from CS, and CS cannot achieve remarkable results without DL. Experiments show our MDCS outperforms the state-of-the-art by 1% to 2% on five popular long-tailed benchmarks, including CIFAR10-LT, CIFAR100-LT, ImageNet-LT, Places-LT, and iNaturalist 2018. The code is available at https://github.com/fistyee/MDCS. Comment: Accepted at ICCV 2023; 13 pages.
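
    The consistency self-distillation described here pairs a weakly and a strongly augmented view of each instance and distills the weak-view predictions into the strong-view ones with a KL term, keeping only confidently classified instances. The sketch below is a minimal illustration of that idea, not the authors' implementation; the temperature, threshold-free correctness check, and tensor interface are assumptions.

```python
import torch
import torch.nn.functional as F

def consistency_self_distillation(logits_weak, logits_strong, labels, tau=2.0):
    """Minimal sketch of KL-based consistency self-distillation with
    confident instance sampling (CIS). Assumed interface: logits of one
    expert for a weakly and a strongly augmented view of the same batch."""
    probs_weak = F.softmax(logits_weak.detach() / tau, dim=-1)      # teacher: weak view
    log_probs_strong = F.log_softmax(logits_strong / tau, dim=-1)   # student: strong view

    # CIS: keep only instances whose weak-view prediction matches the label.
    confident = probs_weak.argmax(dim=-1).eq(labels)
    if confident.sum() == 0:
        return logits_strong.new_zeros(())

    kl = F.kl_div(log_probs_strong[confident], probs_weak[confident],
                  reduction="batchmean") * tau ** 2
    return kl
```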

    CupCleaner: A Data Cleaning Approach for Comment Updating

    Full text link
    Recently, deep learning-based techniques have shown promising performance on various software engineering tasks. For these learning-based approaches to perform well, obtaining high-quality data is a fundamental and crucial issue. Comment updating is an emerging software engineering task that aims to automatically update comments based on changes in the corresponding source code. However, datasets for comment updating are usually crawled from committed versions in open-source repositories such as GitHub, where comment quality is not controlled. In this paper, we focus on cleaning existing comment updating datasets by taking into account properties of the comment updating process in software development. We propose a semantic- and overlapping-aware approach named CupCleaner (Comment UPdating's CLEANER) for this purpose. Specifically, we compute a score based on the semantics and overlapping information of the code and comments, and, based on the score distribution, filter out low-scoring data in the tail of the distribution to remove likely unclean data. We first conducted a human evaluation on the noisy data and high-quality data identified by CupCleaner; the results show that human ratings of the noisy data identified by CupCleaner are significantly lower. We then applied our data cleaning approach to the training and validation sets of three existing comment updating datasets while keeping the test sets unchanged. Our experimental results show that even after filtering out over 30% of the data with CupCleaner, all performance metrics still improve. Results on the cleaned test set further suggest that CupCleaner can help construct datasets for updating-related tasks.
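
    The general recipe is to score each sample and drop the low-score tail of the distribution. The sketch below illustrates that pattern with a hypothetical lexical-overlap score and a quantile cutoff; the paper's actual scoring combines semantic and overlap signals and is not reproduced here.

```python
import numpy as np

def overlap_score(old_comment: str, new_comment: str, code_diff: str) -> float:
    """Hypothetical scoring function: fraction of tokens in the updated
    comment that also appear in the code diff or the old comment.
    A simple lexical proxy for the semantic/overlap score used by CupCleaner."""
    comment_tokens = set(new_comment.lower().split())
    context_tokens = set(code_diff.lower().split()) | set(old_comment.lower().split())
    if not comment_tokens:
        return 0.0
    return len(comment_tokens & context_tokens) / len(comment_tokens)

def filter_tail(samples, scores, drop_quantile=0.30):
    """Keep samples whose score is above the low tail of the score distribution."""
    threshold = np.quantile(scores, drop_quantile)
    return [s for s, sc in zip(samples, scores) if sc >= threshold]
```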

    Tailoring Intermolecular Interactions Towards High‐Performance Thermoelectric Ionogels at Low Humidity

    Get PDF
    Development of ionic thermoelectric (iTE) materials is of immense interest for efficient heat-to-electricity conversion due to their giant ionic Seebeck coefficient (S_i), but challenges remain in terms of the relatively small S_i at low humidity, poor stretchability, and the ambiguous interaction mechanisms in ionogels. Herein, a novel ionogel is reported consisting of polyethylene oxide (PEO), polyethylene oxide-polypropylene oxide-polyethylene oxide (P123), and 1-ethyl-3-methylimidazolium acetate (Emim:OAc). By delicately designing the interactions between ions and polymers, the migration of anions is restricted through their strong binding with the hydroxyl groups of the polymers, while the transport of cations is facilitated through segmental motion owing to the increased amorphous regions, leading to an enlarged diffusion difference between cations and anions. Moreover, the plasticizing effect of P123 and Emim:OAc increases the elongation at break. As a consequence, the ionogel exhibits excellent properties, including a high S_i (18 mV K⁻¹ at 60% relative humidity), good ionic conductivity (1.1 mS cm⁻¹), superior stretchability (787%), and high stability (over 80% retention after 600 h). These findings point to a promising strategy for obtaining multifunctional iTE materials by engineering intermolecular interactions and demonstrate the great potential of ionogels for harvesting low-grade heat in human-comfortable humidity environments.
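
    For orientation, the ionic Seebeck coefficient quoted above is the open-circuit thermovoltage generated per unit temperature difference; the relation below is the standard textbook definition (sign conventions vary in the literature), not a result specific to this work.

```latex
% Ionic Seebeck coefficient: thermovoltage per unit temperature difference.
S_i = \frac{\Delta V_{\mathrm{oc}}}{\Delta T}
\qquad \text{e.g. } S_i = 18~\mathrm{mV\,K^{-1}},\ \Delta T = 5~\mathrm{K}
\ \Rightarrow\ \Delta V_{\mathrm{oc}} \approx 90~\mathrm{mV}.
```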

    Retrieve Anyone: A General-purpose Person Re-identification Task with Instructions

    Full text link
    Human intelligence can retrieve any person according to both visual and language descriptions. However, the current computer vision community studies specific person re-identification (ReID) tasks in different scenarios separately, which limits applications in the real world. This paper strives to resolve this problem by proposing a new instruct-ReID task that requires the model to retrieve images according to the given image or language instructions. Our instruct-ReID is a more general ReID setting, in which existing ReID tasks can be viewed as special cases by designing different instructions. We propose a large-scale OmniReID benchmark and an adaptive triplet loss as a baseline method to facilitate research in this new setting. Experimental results show that the baseline model trained on our OmniReID benchmark improves mAP by +0.5% and +3.3% on Market1501 and CUHK03 for traditional ReID; by +2.1%, +0.2%, and +15.3% on PRCC, VC-Clothes, and LTCC for clothes-changing ReID; by +12.5% on COCAS+ real2 for clothes-template based clothes-changing ReID when using only RGB images; and by +25.5% on COCAS+ real2 for our newly defined language-instructed ReID. The dataset, model, and code will be available at https://github.com/hwz-zju/Instruct-ReID.
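
    The baseline builds on a triplet loss over instruction-conditioned embeddings; the adaptive variant is specific to the paper, so the sketch below shows only a standard margin-based triplet loss over placeholder feature tensors as a point of reference.

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.3):
    """Standard margin-based triplet loss on L2-normalized embeddings.
    anchor/positive/negative: (batch, dim) tensors, e.g. features obtained by
    fusing an image embedding with an instruction embedding (placeholder setup;
    the paper's adaptive triplet loss modifies this basic objective)."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negative = F.normalize(negative, dim=-1)

    d_ap = (anchor - positive).pow(2).sum(dim=-1)  # squared distance to positive
    d_an = (anchor - negative).pow(2).sum(dim=-1)  # squared distance to negative
    return F.relu(d_ap - d_an + margin).mean()
```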

    DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

    Full text link
    The rapid development of open-source large language models (LLMs) has been truly remarkable. However, the scaling laws described in previous literature present varying conclusions, which casts a dark cloud over scaling LLMs. We delve into the study of scaling laws and present our distinctive findings that facilitate the scaling of large-scale models in two commonly used open-source configurations, 7B and 67B. Guided by the scaling laws, we introduce DeepSeek LLM, a project dedicated to advancing open-source language models with a long-term perspective. To support the pre-training phase, we have developed a dataset that currently consists of 2 trillion tokens and is continuously expanding. We further conduct supervised fine-tuning (SFT) and Direct Preference Optimization (DPO) on the DeepSeek LLM Base models, resulting in the DeepSeek Chat models. Our evaluation results demonstrate that DeepSeek LLM 67B surpasses LLaMA-2 70B on various benchmarks, particularly in the domains of code, mathematics, and reasoning. Furthermore, open-ended evaluations reveal that DeepSeek LLM 67B Chat exhibits superior performance compared to GPT-3.5.
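
    For readers unfamiliar with the DPO step mentioned above, the sketch below shows the standard DPO objective (Rafailov et al., 2023) rather than DeepSeek's exact training code; the log-probability inputs and beta coefficient are assumed to be computed elsewhere.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Standard DPO objective. Inputs are summed log-probabilities of the
    chosen/rejected responses under the trained policy and the frozen
    reference (SFT) model; beta controls deviation from the reference."""
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the reward margin between chosen and rejected responses.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```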

    Disrupted Small-World Brain Networks in Moderate Alzheimer's Disease: A Resting-State fMRI Study

    Get PDF
    The small-world organization has been hypothesized to reflect a balance between local processing and global integration in the human brain. Previous multimodal imaging studies have consistently demonstrated that the topological architecture of brain networks is disrupted in Alzheimer's disease (AD), yet they have reported inconsistent results regarding which topological properties are altered. One potential explanation lies in the varying homogeneity and distinct progressive stages of the AD cohorts involved in these studies, which are thought to be critical factors that might affect the results. We investigated the topological properties of brain functional networks derived from resting-state functional magnetic resonance imaging (fMRI) of carefully selected moderate AD patients and normal controls (NCs). Our results showed that topological properties were disrupted in AD patients, with increased local efficiency but decreased global efficiency. The altered brain regions were mainly located in the default mode network, the temporal lobe, and certain subcortical regions closely associated with the neuropathological changes of AD. Of note, our exploratory analysis revealed that the ApoE genotype modulates brain network properties, especially in AD patients.
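
    Global and local efficiency are standard graph-theoretic measures. The sketch below shows one common way to compute them from a thresholded functional connectivity matrix using NetworkX; the threshold, binarization, and input matrix are illustrative assumptions, not the study's actual pipeline.

```python
import numpy as np
import networkx as nx

def efficiency_from_connectivity(corr_matrix: np.ndarray, threshold: float = 0.3):
    """Build a binary undirected graph from a region-by-region correlation
    matrix and compute global and local efficiency. Illustrative only; the
    study's preprocessing, atlas, and thresholding scheme are not shown."""
    adjacency = (np.abs(corr_matrix) > threshold).astype(int)
    np.fill_diagonal(adjacency, 0)            # remove self-connections
    graph = nx.from_numpy_array(adjacency)
    return nx.global_efficiency(graph), nx.local_efficiency(graph)
```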