MDCS: More Diverse Experts with Consistency Self-distillation for Long-tailed Recognition
Recently, multi-expert methods have led to significant improvements in
long-tail recognition (LTR). We summarize two aspects that need further
enhancement to contribute to LTR boosting: (1) More diverse experts; (2) Lower
model variance. However, previous methods did not handle them well. To this
end, we propose More Diverse experts with Consistency Self-distillation (MDCS)
to bridge the gap left by earlier methods. Our MDCS approach consists of two
core components: Diversity Loss (DL) and Consistency Self-distillation (CS). In
detail, DL promotes diversity among experts by controlling their focus on
different categories. To reduce the model variance, we employ KL divergence to
distill the richer knowledge of weakly augmented instances for the experts'
self-distillation. In particular, we design Confident Instance Sampling (CIS)
to select the correctly classified instances for CS to avoid biased/noisy
knowledge. In the analysis and ablation study, we demonstrate that our method
compared with previous work can effectively increase the diversity of experts,
significantly reduce the variance of the model, and improve recognition
accuracy. Moreover, the roles of our DL and CS are mutually reinforcing and
coupled: the diversity of experts benefits from the CS, and the CS cannot
achieve remarkable results without the DL. Experiments show our MDCS
outperforms the state-of-the-art by 1% to 2% on five popular long-tailed
benchmarks, including CIFAR10-LT, CIFAR100-LT, ImageNet-LT, Places-LT, and
iNaturalist 2018. The code is available at https://github.com/fistyee/MDCS.
Comment: Accepted at ICCV 2023. 13 pages
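The consistency self-distillation step described in this abstract can be sketched as follows. This is only a minimal illustration, not the authors' implementation (see their repository for that). It assumes that the teacher signal is the softmax over logits from a weakly augmented view, that the student is the same expert on a second view (called `strong_logits` here, a hypothetical name), and that Confident Instance Sampling keeps only instances whose weak-view prediction matches the label. The function names and the direction of the KL term are assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def consistency_loss(weak_logits, strong_logits, labels):
    """Mean KL(teacher || student) over confidently sampled instances.

    Assumption: an instance is "confident" when the weak view is
    classified correctly; all other instances are skipped.
    """
    losses = []
    for weak, strong, label in zip(weak_logits, strong_logits, labels):
        teacher = softmax(weak)
        predicted = max(range(len(teacher)), key=teacher.__getitem__)
        if predicted != label:
            continue  # skip instances that could distill biased/noisy knowledge
        student = softmax(strong)
        losses.append(kl_divergence(teacher, student))
    return sum(losses) / len(losses) if losses else 0.0
```

With identical weak and strong logits the loss is zero, and a batch in which every weak-view prediction is wrong contributes nothing to the loss.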
CupCleaner: A Data Cleaning Approach for Comment Updating
Recently, deep learning-based techniques have shown promising performance on
various tasks related to software engineering. For these learning-based
approaches to perform well, obtaining high-quality data is one fundamental and
crucial issue. The comment updating task is an emerging software engineering
task aiming at automatically updating the corresponding comments based on
changes in source code. However, datasets for the comment updating tasks are
usually crawled from committed versions in open source software repositories
such as GitHub, where there is a lack of quality control of comments. In this
paper, we focus on cleaning existing comment updating datasets while considering
some properties of the comment updating process in software development. We
propose a semantic and overlapping-aware approach named CupCleaner (Comment
UPdating's CLEANER) to achieve this purpose. Specifically, we calculate a score
based on semantics and overlapping information of the code and comments. Based
on the distribution of the scores, we filter out the data with low scores in
the tail of the distribution to get rid of possible unclean data. We first
conducted a human evaluation on the noise data and high-quality data identified
by CupCleaner. The results show that the human ratings of the noise data
identified by CupCleaner are significantly lower. Then, we applied our data
cleaning approach to the training and validation sets of three existing comment
updating datasets while keeping the test set unchanged. Our experimental
results show that even after filtering out over 30% of the data using
CupCleaner, there is still an improvement in all performance metrics. The
experimental results on the cleaned test set also suggest that CupCleaner may
help in constructing datasets for updating-related tasks.
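The score-based filtering step described in this abstract can be sketched roughly as below. The abstract does not give the scoring formula or the cut-off rule, so this sketch takes precomputed scores as input and assumes, purely for illustration, that the low tail is everything more than `k` standard deviations below the mean. The function name `filter_low_tail` and the parameter `k` are hypothetical.

```python
import statistics


def filter_low_tail(samples, scores, k=1.0):
    """Keep samples whose score is not in the low tail of the distribution.

    Assumption: the tail is defined as score < mean - k * std. The actual
    CupCleaner threshold may be chosen differently.
    """
    mean = statistics.fmean(scores)
    std = statistics.pstdev(scores)
    threshold = mean - k * std
    return [sample for sample, score in zip(samples, scores) if score >= threshold]


# Usage: the one clearly low-scoring sample "d" is dropped.
kept = filter_low_tail(["a", "b", "c", "d", "e"], [0.9, 0.8, 0.85, 0.1, 0.95])
```

Following the abstract, such a filter would be applied to the training and validation sets only, leaving the test set unchanged.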
Tailoring Intermolecular Interactions Towards High‐Performance Thermoelectric Ionogels at Low Humidity
Development of ionic thermoelectric (iTE) materials is of immense interest for efficient heat-to-electricity conversion due to their giant ionic Seebeck coefficient (Si), but challenges remain in terms of relatively small Si at low humidity, poor stretchability, and ambiguous interaction mechanisms in ionogels. Herein, a novel ionogel is reported consisting of polyethylene oxide (PEO), polyethylene oxide-polypropylene oxide-polyethylene oxide (P123), and 1-ethyl-3-methylimidazolium acetate (Emim:OAC). By delicately designing the interactions between ions and polymers, the migration of anions is restricted due to their strong binding with the hydroxyl groups of polymers, while the transport of cations is facilitated through segmental motions due to the increased amorphous regions, thereby leading to an enlarged diffusion difference between the cations and anions. Moreover, the plasticizing effect of P123 and Emim:OAC can increase the elongation at break. As a consequence, the ionogel exhibits excellent properties including high Si (18 mV K−1 at a relative humidity of 60%), good ionic conductivity (1.1 mS cm−1), superior stretchability (787%), and high stability (over 80% retention after 600 h). These findings show a promising strategy to obtain multifunctional iTE materials by engineering the intermolecular interactions and demonstrate the great potential of ionogels for harvesting low-grade heat in human-comfortable humidity environments.
Retrieve Anyone: A General-purpose Person Re-identification Task with Instructions
Human intelligence can retrieve any person according to both visual and
language descriptions. However, the current computer vision community studies
specific person re-identification (ReID) tasks in different scenarios
separately, which limits the applications in the real world. This paper strives
to resolve this problem by proposing a new instruct-ReID task that requires the
model to retrieve images according to the given image or language
instructions. Our instruct-ReID is a more general ReID setting, where existing
ReID tasks can be viewed as special cases by designing different instructions.
We propose a large-scale OmniReID benchmark and an adaptive triplet loss as a
baseline method to facilitate research in this new setting. Experimental
results show that the baseline model trained on our OmniReID benchmark improves
mAP by +0.5% and +3.3% on Market1501 and CUHK03 for traditional ReID; by +2.1%,
+0.2%, and +15.3% on PRCC, VC-Clothes, and LTCC for clothes-changing ReID; by
+12.5% on COCAS+ real2 for clothes-template-based clothes-changing ReID when
using only RGB images; and by +25.5% on COCAS+ real2 for our newly defined
language-instructed ReID. The dataset, model, and code will be available at
https://github.com/hwz-zju/Instruct-ReID
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
The rapid development of open-source large language models (LLMs) has been
truly remarkable. However, the scaling law described in previous literature
presents varying conclusions, which casts a dark cloud over scaling LLMs. We
delve into the study of scaling laws and present our distinctive findings that
facilitate scaling of large-scale models in two commonly used open-source
configurations, 7B and 67B. Guided by the scaling laws, we introduce DeepSeek
LLM, a project dedicated to advancing open-source language models with a
long-term perspective. To support the pre-training phase, we have developed a
dataset that currently consists of 2 trillion tokens and is continuously
expanding. We further conduct supervised fine-tuning (SFT) and Direct
Preference Optimization (DPO) on DeepSeek LLM Base models, resulting in the
creation of DeepSeek Chat models. Our evaluation results demonstrate that
DeepSeek LLM 67B surpasses LLaMA-2 70B on various benchmarks, particularly in
the domains of code, mathematics, and reasoning. Furthermore, open-ended
evaluations reveal that DeepSeek LLM 67B Chat exhibits superior performance
compared to GPT-3.5.
Disrupted Small-World Brain Networks in Moderate Alzheimer's Disease: A Resting-State fMRI Study
The small-world organization has been hypothesized to reflect a balance between local processing and global integration in the human brain. Previous multimodal imaging studies have consistently demonstrated that the topological architecture of the brain network is disrupted in Alzheimer's disease (AD). However, these studies have reported inconsistent results regarding the topological properties of brain alterations in AD. One potential explanation for these inconsistent results lies with the diverse homogeneity and distinct progressive stages of the AD involved in these studies, which are thought to be critical factors that might affect the results. We investigated the topological properties of brain functional networks derived from resting-state functional magnetic resonance imaging (fMRI) of carefully selected moderate AD patients and normal controls (NCs). Our results showed that the topological properties were disrupted in AD patients, with increased local efficiency but decreased global efficiency. We found that the altered brain regions are mainly located in the default mode network, the temporal lobe, and certain subcortical regions that are closely associated with the neuropathological changes in AD. Of note, our exploratory study revealed that the ApoE genotype modulates brain network properties, especially in AD patients.
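The two network measures named in this abstract, global and local efficiency, can be illustrated with a small sketch. It assumes the commonly used definitions for an unweighted, undirected graph: global efficiency is the mean inverse shortest-path length over all node pairs, and local efficiency is the mean global efficiency of each node's neighbour subgraph. The abstract does not state which variant (for example weighted or thresholded networks) the study used, so the choice here is an assumption, and the graph is a hypothetical adjacency mapping from node to a set of neighbours.

```python
from collections import deque


def shortest_path_lengths(adj, source):
    """Breadth-first search distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def global_efficiency(adj):
    """Mean of 1/d(u, v) over ordered node pairs; unreachable pairs add 0."""
    n = len(adj)
    if n < 2:
        return 0.0
    total = 0.0
    for u in adj:
        dist = shortest_path_lengths(adj, u)
        total += sum(1.0 / d for v, d in dist.items() if v != u)
    return total / (n * (n - 1))


def local_efficiency(adj):
    """Mean global efficiency of the subgraph induced by each node's neighbours."""
    if not adj:
        return 0.0
    total = 0.0
    for u in adj:
        neighbours = adj[u]
        subgraph = {v: adj[v] & neighbours for v in neighbours}
        total += global_efficiency(subgraph)
    return total / len(adj)


# Usage: a triangle is maximally efficient both globally and locally.
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
```

On a three-node path graph the neighbours of the middle node are not connected to each other, so local efficiency drops to zero while global efficiency stays fairly high, which shows how the two measures can move independently.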
- …