Rethinking Label Smoothing on Multi-hop Question Answering
Multi-Hop Question Answering (MHQA) is a significant area in question
answering, requiring multiple reasoning components, including document
retrieval, supporting sentence prediction, and answer span extraction. In this
work, we analyze the primary factors limiting the performance of multi-hop
reasoning and introduce label smoothing into the MHQA task. This is aimed at
enhancing the generalization capabilities of MHQA systems and mitigating
overfitting of answer spans and reasoning paths in the training set. We propose a
novel label smoothing technique, F1 Smoothing, which incorporates uncertainty
into the learning process and is specifically tailored for Machine Reading
Comprehension (MRC) tasks. Inspired by the principles of curriculum learning,
we introduce the Linear Decay Label Smoothing Algorithm (LDLA), which
progressively reduces uncertainty throughout the training process. Experiments
on the HotpotQA dataset demonstrate the effectiveness of our methods in
enhancing performance and generalizability in multi-hop reasoning, achieving
new state-of-the-art results on the leaderboard.
Comment: 13 pages, 8 figures, accepted by CCL202
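The curriculum-style decay described above can be sketched as follows. This is a minimal illustration of uniform label smoothing with a linearly annealed smoothing strength; the function names, starting value, and exact schedule are assumptions for illustration, not the paper's F1 Smoothing or LDLA implementation.

```python
import numpy as np

def smoothed_targets(true_idx, num_classes, epsilon):
    """Standard label smoothing: spread epsilon mass uniformly over classes,
    keeping 1 - epsilon on the gold label."""
    t = np.full(num_classes, epsilon / num_classes)
    t[true_idx] += 1.0 - epsilon
    return t

def linear_decay_epsilon(epoch, total_epochs, eps_start=0.1, eps_end=0.0):
    """Linearly reduce the smoothing strength over training, so uncertainty
    injected into the targets shrinks as the model matures."""
    frac = epoch / max(total_epochs - 1, 1)
    return eps_start + frac * (eps_end - eps_start)
```

With this schedule, early epochs train against softened targets and the final epochs train against (nearly) one-hot labels.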
Improved Visual Fine-tuning with Natural Language Supervision
Fine-tuning a visual pre-trained model can leverage the semantic information
from large-scale pre-training data and mitigate the over-fitting problem on
downstream vision tasks with limited training examples. While the problem of
catastrophic forgetting in the pre-trained backbone has been studied extensively
for fine-tuning, its potential bias inherited from the corresponding pre-training
task and data has attracted less attention. In this work, we investigate this problem by
demonstrating that the obtained classifier after fine-tuning will be close to
that induced by the pre-trained model. To reduce the bias in the classifier
effectively, we introduce a reference distribution obtained from a fixed text
classifier, which can help regularize the learned vision classifier. The
proposed method, Text Supervised fine-tuning (TeS), is evaluated with diverse
pre-trained vision models including ResNet and ViT, and text encoders including
BERT and CLIP, on 11 downstream tasks. The consistent improvement with a clear
margin over distinct scenarios confirms the effectiveness of our proposal. Code
is available at \url{https://github.com/idstcv/TeS}.
Comment: accepted by ICCV'2
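The regularization idea above can be sketched as a cross-entropy objective plus a divergence term pulling the vision classifier's predictions toward a reference distribution produced by a frozen text classifier. The function names, the use of a KL term, and the weighting `lam` are illustrative assumptions, not the paper's exact loss.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def tes_style_loss(vision_logits, labels, text_logits, lam=0.5):
    """Cross-entropy on the vision classifier plus a KL term regularizing
    its predictions toward a fixed text-derived reference distribution."""
    p = softmax(vision_logits)   # learned vision predictions
    q = softmax(text_logits)     # reference from the frozen text classifier
    n = np.arange(len(labels))
    ce = -np.log(p[n, labels] + 1e-12).mean()
    kl = (q * (np.log(q + 1e-12) - np.log(p + 1e-12))).sum(-1).mean()
    return ce + lam * kl
```

When the vision predictions already match the text reference, the KL term vanishes and the loss reduces to plain cross-entropy; any disagreement with the reference adds a penalty scaled by `lam`.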
IntroUNET: Identifying introgressed alleles via semantic segmentation
A growing body of evidence suggests that gene flow between closely related species is a widespread phenomenon. Alleles that introgress from one species into a close relative are typically neutral or deleterious, but sometimes confer a significant fitness advantage. Given the potential relevance to speciation and adaptation, numerous methods have therefore been devised to identify regions of the genome that have experienced introgression. Recently, supervised machine learning approaches have been shown to be highly effective for detecting introgression. One especially promising approach is to treat population genetic inference as an image classification problem, and feed an image representation of a population genetic alignment as input to a deep neural network that distinguishes among evolutionary models (i.e. introgression or no introgression). However, if we wish to investigate the full extent and fitness effects of introgression, merely identifying genomic regions in a population genetic alignment that harbor introgressed loci is insufficient; ideally we would be able to infer precisely which individuals have introgressed material and at which positions in the genome. Here we adapt a deep learning algorithm for semantic segmentation, the task of correctly identifying the type of object to which each individual pixel in an image belongs, to the task of identifying introgressed alleles. Our trained neural network is thus able to infer, for each individual in a two-population alignment, which of that individual's alleles were introgressed from the other population. We use simulated data to show that this approach is highly accurate, and that it can be readily extended to identify alleles that are introgressed from an unsampled "ghost" population, performing comparably to a supervised learning method tailored specifically to that task.
Finally, we apply this method to data from Drosophila, showing that it is able to accurately recover introgressed haplotypes from real data. This analysis reveals that introgressed alleles are typically confined to lower frequencies within genic regions, suggestive of purifying selection, but are found at much higher frequencies in a region previously shown to be affected by adaptive introgression. Our method's success in recovering introgressed haplotypes in challenging real-world scenarios underscores the utility of deep learning approaches for making richer evolutionary inferences from genomic data.
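The framing above, an alignment treated as an image with a per-allele segmentation target, can be sketched as follows. The two-channel encoding, array shapes, and function names are assumptions for illustration only; the paper's actual network, preprocessing, and individual-sorting steps are not reproduced here.

```python
import numpy as np

def alignment_to_input(pop1, pop2):
    """Stack two population alignments (individuals x sites, 0/1 alleles)
    into a 2-channel image-like array for a segmentation network."""
    assert pop1.shape == pop2.shape
    return np.stack([pop1, pop2], axis=0).astype(np.float32)

def segmentation_accuracy(pred_mask, true_mask):
    """Per-allele accuracy: each entry of the mask marks whether one
    individual's allele at one site is introgressed (1) or not (0)."""
    return (pred_mask == true_mask).mean()
```

The key point is that the prediction target has the same individuals-by-sites shape as each input channel, so the network labels every allele copy rather than whole genomic windows.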