MRI-based Multi-task Decoupling Learning for Alzheimer's Disease Detection and MMSE Score Prediction: A Multi-site Validation
Accurately detecting Alzheimer's disease (AD) and predicting the mini-mental
state examination (MMSE) score from magnetic resonance imaging (MRI) are
important tasks in elderly health care. Most of the previous methods on these two
tasks are based on single-task learning and rarely consider the correlation
between them. Since the MMSE score, which is an important basis for AD
diagnosis, can also reflect the progress of cognitive impairment, some studies
have begun to apply multi-task learning methods to these two tasks. However,
how to exploit feature correlation remains a challenging problem for these
methods. To comprehensively address this challenge, we propose an MRI-based
multi-task decoupled learning method for AD detection and MMSE score
prediction. First, a multi-task learning network is proposed to implement AD
detection and MMSE score prediction, which exploits feature correlation by
adding three multi-task interaction layers between the backbones of the two
tasks. Each multi-task interaction layer contains two feature decoupling
modules and one feature interaction module. Furthermore, to improve the
cross-task generalization of the features selected by the feature decoupling
module, we propose a feature decoupling module constrained by a feature
consistency loss. Finally, in order to exploit the specific distribution information of
MMSE score in different groups, a distribution loss is proposed to further
enhance the model performance. We evaluate our proposed method on multi-site
datasets. Experimental results show that our proposed multi-task decoupled
representation learning method achieves good performance, outperforming
single-task learning and other existing state-of-the-art methods.
Comment: 15 page
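The abstract above trains one network on a classification task (AD detection) and a regression task (MMSE score prediction). As a rough illustration of how two such objectives can be combined, the sketch below sums a binary cross-entropy term and a mean squared error term. It is not the paper's loss: the feature consistency loss and distribution loss are not reproduced, and the function names and the `reg_weight` parameter are assumptions made for this example.

```python
import math

def binary_cross_entropy(prob, label):
    # Clamp the probability so log() never receives 0.
    eps = 1e-12
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))

def squared_error(pred, target):
    return (pred - target) ** 2

def joint_loss(ad_probs, ad_labels, mmse_preds, mmse_targets, reg_weight=0.1):
    """Mean classification loss plus a weighted mean regression loss.

    reg_weight is a hypothetical balancing factor, not a value from the paper.
    """
    n = len(ad_labels)
    cls = sum(binary_cross_entropy(p, y) for p, y in zip(ad_probs, ad_labels)) / n
    reg = sum(squared_error(p, t) for p, t in zip(mmse_preds, mmse_targets)) / n
    return cls + reg_weight * reg
```

A fixed weight is the simplest way to balance the two terms; in practice the weight usually has to be tuned because the MMSE error is on a much larger numeric scale than the cross-entropy.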
Task Indicating Transformer for Task-conditional Dense Predictions
The task-conditional model is a distinctive stream for efficient multi-task
learning. Existing works encounter a critical limitation in learning
task-agnostic and task-specific representations, primarily due to shortcomings
in global context modeling arising from CNN-based architectures, as well as a
deficiency in multi-scale feature interaction within the decoder. In this
paper, we introduce a novel task-conditional framework called Task Indicating
Transformer (TIT) to tackle this challenge. Our approach designs a Mix Task
Adapter module within the transformer block, which incorporates a Task
Indicating Matrix through matrix decomposition, thereby enhancing long-range
dependency modeling and parameter-efficient feature adaptation by capturing
intra- and inter-task features. Moreover, we propose a Task Gate Decoder module
that harnesses a Task Indicating Vector and gating mechanism to facilitate
adaptive multi-scale feature refinement guided by task embeddings. Experiments
on two public multi-task dense prediction benchmarks, NYUD-v2 and
PASCAL-Context, demonstrate that our approach surpasses state-of-the-art
task-conditional methods.
Comment: Accepted by ICASSP 202
Task-Aware Asynchronous Multi-Task Model with Class Incremental Contrastive Learning for Surgical Scene Understanding
Purpose: Surgery scene understanding with tool-tissue interaction recognition
and automatic report generation can play an important role in intra-operative
guidance, decision-making and postoperative analysis in robotic surgery.
However, domain shifts between different surgeries with inter- and intra-patient
variation and the appearance of novel instruments degrade the performance of model
prediction. Moreover, the task requires outputs from multiple models, which can be
computationally expensive and affect real-time performance.
Methodology: A multi-task learning (MTL) model is proposed for surgical
report generation and tool-tissue interaction prediction that deals with domain
shift problems. The model consists of a shared feature extractor, a mesh-transformer
branch for captioning, and a graph attention branch for tool-tissue interaction
prediction. The shared feature extractor employs class incremental contrastive
learning (CICL) to tackle intensity shift and novel class appearance in the
target domain. We design Laplacian of Gaussian (LoG) based curriculum learning
into both shared and task-specific branches to enhance model learning. We
incorporate a task-aware asynchronous MTL optimization technique to fine-tune
the shared weights and converge both tasks optimally.
Results: The proposed MTL model trained using task-aware optimization and
fine-tuning techniques reported a balanced performance (BLEU score of 0.4049
for scene captioning and accuracy of 0.3508 for interaction detection) for both
tasks on the target domain and performed on-par with single-task models in
domain adaptation.
Conclusion: The proposed multi-task model was able to adapt to domain shifts,
incorporate novel instruments in the target domain, and perform tool-tissue
interaction detection and report generation on par with single-task models.
Comment: Manuscript accepted in the International Journal of Computer Assisted
Radiology and Surgery. codes available:
https://github.com/lalithjets/Domain-adaptation-in-MT
DEPHN: Different Expression Parallel Heterogeneous Network using virtual gradient optimization for Multi-task Learning
Recommendation algorithms based on multi-task learning (MTL) are the
major method for Internet operators to understand users and predict their
behaviors in multi-behavior platform scenarios. Task correlation is an
important consideration in MTL; traditional models use shared-bottom
models and gating experts to realize shared representation learning and
information differentiation. However, the relationship between real-world tasks
is often more complex than existing methods can handle, so they do not share
information properly. In this paper, we propose a Different Expression Parallel
Heterogeneous Network (DEPHN) to model multiple tasks simultaneously. DEPHN
constructs the experts at the bottom of the model by using different feature
interaction methods to improve the generalization ability of the shared
information flow. In view of the model's differentiating ability for different
task information flows, DEPHN uses feature explicit mapping and virtual
gradient coefficient for expert gating during the training process, and
adaptively adjusts the learning intensity of the gated unit by considering the
difference of gating values and task correlation. Extensive experiments on
artificial and real-world datasets demonstrate that our proposed method can
capture task correlation in complex situations and achieve better performance
than baseline models. (Accepted in IJCNN2023)
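The "gating experts" design that the abstract above refers to can be sketched as follows: each task owns a gate that softmax-normalizes a set of logits and uses them to weight the outputs of shared experts. This is a generic mixture-of-experts gate, not DEPHN's feature explicit mapping or virtual gradient coefficient; the expert outputs, task names, and gate logits are made-up values for illustration.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def gated_mixture(expert_outputs, gate_logits):
    """Combine expert output vectors using softmax-normalized gate weights."""
    weights = softmax(gate_logits)
    dim = len(expert_outputs[0])
    return [
        sum(w * out[i] for w, out in zip(weights, expert_outputs))
        for i in range(dim)
    ]

# Two hypothetical experts, each producing a 3-dimensional output.
experts = [[1.0, 0.0, 2.0], [3.0, 4.0, 0.0]]

# Hypothetical per-task gate logits; in a real model a small network
# computes these from the input features.
task_gate_logits = {"click": [0.0, 0.0], "purchase": [2.0, -2.0]}

task_inputs = {
    task: gated_mixture(experts, logits)
    for task, logits in task_gate_logits.items()
}
```

With equal logits the "click" task averages the two experts, while the "purchase" gate leans almost entirely on the first expert, so the two task towers receive different views of the same shared representation.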
Syntax-Informed Interactive Model for Comprehensive Aspect-Based Sentiment Analysis
Aspect-based sentiment analysis (ABSA), a nuanced task in text analysis,
seeks to discern sentiment orientation linked to specific aspect terms in text.
Traditional approaches often overlook or inadequately model the explicit
syntactic structures of sentences, crucial for effective aspect term
identification and sentiment determination. Addressing this gap, we introduce
an innovative model: Syntactic Dependency Enhanced Multi-Task Interaction
Architecture (SDEMTIA) for comprehensive ABSA. Our approach innovatively
exploits syntactic knowledge (dependency relations and types) using a
specialized Syntactic Dependency Embedded Interactive Network (SDEIN). We also
incorporate a novel and efficient message-passing mechanism within a multi-task
learning framework to bolster learning efficacy. Our extensive experiments on
benchmark datasets showcase our model's superiority, significantly surpassing
existing methods. Additionally, incorporating BERT as an auxiliary feature
extractor further enhances our model's performance.
PEER: A Comprehensive and Multi-Task Benchmark for Protein Sequence Understanding
We are now witnessing significant progress of deep learning methods in a
variety of tasks (or datasets) of proteins. However, there is a lack of a
standard benchmark to evaluate the performance of different methods, which
hinders the progress of deep learning in this field. In this paper, we propose
such a benchmark called PEER, a comprehensive and multi-task benchmark for
Protein sEquence undERstanding. PEER provides a set of diverse protein
understanding tasks including protein function prediction, protein localization
prediction, protein structure prediction, protein-protein interaction
prediction, and protein-ligand interaction prediction. We evaluate different
types of sequence-based methods for each task including traditional feature
engineering approaches, different sequence encoding methods as well as
large-scale pre-trained protein language models. In addition, we also
investigate the performance of these methods under the multi-task learning
setting. Experimental results show that large-scale pre-trained protein
language models achieve the best performance for most individual tasks, and
jointly training multiple tasks further boosts the performance. The datasets
and source codes of this benchmark are all available at
https://github.com/DeepGraphLearning/PEER_Benchmark
Comment: Accepted by NeurIPS 2022 Dataset and Benchmark Track. arXiv v2:
source code released; arXiv v1: release all benchmark result
Semantic Segmentation Enhanced Transformer Model for Human Attention Prediction
Saliency Prediction aims to predict the attention distribution of human eyes
given an RGB image. Most of the recent state-of-the-art methods are based on
deep image feature representations from traditional CNNs. However,
traditional convolution cannot capture the global features of the image well
due to its small kernel size. Besides, the high-level factors which closely
correlate to human visual perception, e.g., objects, color, light, etc., are
not considered. Inspired by these, we propose a Transformer-based method with
semantic segmentation as another learning objective. The Transformer can
capture more global cues of the image. In addition, simultaneously learning
object segmentation simulates human visual perception, which we
verify in our investigation of human gaze control in cognitive science. We
build an extra decoder for the subtask and the multiple tasks share the same
Transformer encoder, forcing it to learn from multiple feature spaces. We find
that in practice simply adding the subtask can confuse main-task learning,
hence a Multi-task Attention Module is proposed to deal with the feature
interaction between the multiple learning targets. Our method achieves
competitive performance compared to other state-of-the-art methods.
UFIN: Universal Feature Interaction Network for Multi-Domain Click-Through Rate Prediction
Click-Through Rate (CTR) prediction, which aims to estimate the probability
of a user clicking on an item, is a key task in online advertising. Numerous
existing CTR models concentrate on modeling the feature interactions within a
solitary domain, thereby rendering them inadequate for fulfilling the
requisites of multi-domain recommendations in real industrial scenarios. Some
recent approaches propose intricate architectures to enhance knowledge sharing
and augment model training across multiple domains. However, these approaches
encounter difficulties when being transferred to new recommendation domains,
owing to their reliance on the modeling of ID features (e.g., item id). To
address the above issue, we propose the Universal Feature Interaction Network
(UFIN) approach for CTR prediction. UFIN exploits textual data to learn
universal feature interactions that can be effectively transferred across
diverse domains. For learning universal feature representations, we regard the
text and feature as two different modalities and propose an encoder-decoder
network founded on a Large Language Model (LLM) to enforce the transfer of data
from the text modality to the feature modality. Building upon the above
foundation, we further develop a mixture-of-experts (MoE) enhanced adaptive
feature interaction model to learn transferable collaborative patterns across
multiple domains. Furthermore, we propose a multi-domain knowledge distillation
framework to enhance feature interaction learning. Based on the above methods,
UFIN can effectively bridge the semantic gap to learn common knowledge across
various domains, surpassing the constraints of ID-based models. Extensive
experiments conducted on eight datasets show the effectiveness of UFIN, in both
multi-domain and cross-platform settings. Our code is available at
https://github.com/RUCAIBox/UFIN