The Devil is the Classifier: Investigating Long Tail Relation Classification with Decoupling Analysis
Long-tailed relation classification is a challenging problem as the head
classes may dominate the training phase, thereby leading to the deterioration
of the tail performance. Existing solutions usually address this issue via
class-balancing strategies, e.g., data re-sampling and loss re-weighting, but
all these methods adhere to a schema that entangles the learning of the
representation and the classifier. In this study, we conduct an in-depth
empirical investigation into the long-tailed problem and find that pre-trained
models with instance-balanced sampling already learn good representations for
all classes; moreover, it is possible to achieve better
long-tailed classification ability at low cost by only adjusting the
classifier. Inspired by this observation, we propose a robust classifier with
attentive relation routing, which assigns soft weights by automatically
aggregating the relations. Extensive experiments on two datasets demonstrate
the effectiveness of our proposed approach. Code and datasets are available at
https://github.com/zjunlp/deepke
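The decoupling idea and the attentive routing described above can be illustrated with a minimal NumPy sketch. All shapes, names, and the exact routing formula here are hypothetical stand-ins, not the paper's implementation: the encoder features are frozen, and only the relation embeddings (the classifier) would be trained.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Frozen instance representations from a pre-trained encoder (hypothetical shapes).
n_classes, dim = 5, 16
features = rng.normal(size=(8, dim))          # batch of 8 instances

# Classifier-only stage: relation embeddings are the sole trainable parameters.
relation_emb = rng.normal(size=(n_classes, dim))

def attentive_relation_routing(h, rel):
    """Assign soft weights by attending from each instance to every
    relation embedding, aggregate the relations, then score classes."""
    attn = softmax(h @ rel.T / np.sqrt(rel.shape[1]))   # (batch, n_classes)
    routed = attn @ rel                                  # soft aggregation of relations
    logits = (h + routed) @ rel.T                        # score against each class
    return logits

logits = attentive_relation_routing(features, relation_emb)
print(logits.shape)  # (8, 5)
```

The point of the sketch is only the training-cost argument: since the encoder is frozen, adjusting the classifier touches `n_classes * dim` parameters rather than the full model.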
MRI Image Segmentation System of Uterine Fibroids Based on AR-Unet Network
Uterine fibroids are the most common benign tumors of the female reproductive organs, and their accurate segmentation is crucial for treatment. This paper proposes a new uterine fibroid MRI T2W image segmentation network, AR-Unet (Attention ResNet101-Unet), which uses the deep neural network ResNet101 as the feature-extraction front end to extract image semantic information and combines it with U-Net design ideas to build the network structure. An attention gate module is added before the upsampled and downsampled feature maps are concatenated. We tested a total of 123 uterine fibroid MRI T2W images from 13 patients and verified the segmentation results against expert-defined manual segmentations. The average Dice coefficient, IoU, sensitivity, and specificity over all segmented images were 0.9044, 0.8443, 88.55%, and 94.56%, respectively, outperforming the ResNet101-Unet and Attention-Unet models. Finally, the network was encapsulated into an auxiliary diagnostic system.
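An attention gate of the kind added before the skip concatenation can be sketched in NumPy as follows. This is a minimal additive-attention gate in the style of Attention U-Net, with 1x1-convolution weights flattened to matrices; the channel counts and weight names are illustrative assumptions, not AR-Unet's actual configuration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attention_gate(skip, gating, W_x, W_g, psi):
    """Re-weight an encoder skip-connection feature map with a decoder
    gating signal before the two are concatenated."""
    # skip, gating: (channels, H, W); W_x, W_g, psi act as 1x1 convolutions.
    q = np.einsum('oc,chw->ohw', W_x, skip) + np.einsum('oc,chw->ohw', W_g, gating)
    q = np.maximum(q, 0.0)                              # ReLU
    alpha = sigmoid(np.einsum('oc,chw->ohw', psi, q))   # (1, H, W) attention map
    return skip * alpha                                 # suppress irrelevant regions

rng = np.random.default_rng(1)
C, H, W = 4, 8, 8
skip, gate = rng.normal(size=(C, H, W)), rng.normal(size=(C, H, W))
out = attention_gate(skip, gate,
                     W_x=rng.normal(size=(C, C)),
                     W_g=rng.normal(size=(C, C)),
                     psi=rng.normal(size=(1, C)))
print(out.shape)  # (4, 8, 8)
```

Because the attention map lies in (0, 1), the gated skip features are element-wise attenuated copies of the originals, which is how the gate suppresses background regions before concatenation.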
Split-NER: Named Entity Recognition via Two Question-Answering-based Classifications
In this work, we address the NER problem by splitting it into two logical
sub-tasks: (1) Span Detection which simply extracts entity mention spans
irrespective of entity type; (2) Span Classification which classifies the spans
into their entity types. Further, we formulate both sub-tasks as
question-answering (QA) problems and produce two leaner models which can be
optimized separately for each sub-task. Experiments with four cross-domain
datasets demonstrate that this two-step approach is both effective and
time-efficient. Our system, SplitNER, outperforms baselines on OntoNotes5.0, WNUT17
and a cybersecurity dataset and gives on-par performance on BioNLP13CG. In all
cases, it achieves a significant reduction in training time compared to its QA
baseline counterpart. The effectiveness of our system stems from fine-tuning
the BERT model twice, separately for span detection and classification. The
source code can be found at https://github.com/c3sr/split-ner
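The two-stage data flow can be sketched with toy stand-ins for the two fine-tuned QA models. The rule-based detectors below are purely illustrative placeholders (the real system uses two separately fine-tuned BERT models); only the pipeline shape, type-agnostic span detection followed by per-span classification, reflects the abstract.

```python
# Toy stand-ins for the two QA models; interfaces are hypothetical.
def span_detector(question, context):
    """Stage 1: extract entity mention spans irrespective of type
    (stand-in for the first fine-tuned QA model)."""
    spans = []
    for name in ("Alice", "Acme Corp"):
        i = context.find(name)
        if i != -1:
            spans.append((i, i + len(name)))
    return spans

def span_classifier(question, context, span):
    """Stage 2: classify one detected span into an entity type
    (stand-in for the second fine-tuned QA model)."""
    text = context[span[0]:span[1]]
    return "ORG" if " " in text else "PER"

context = "Alice joined Acme Corp last year."
spans = span_detector("What are the entity mentions?", context)
entities = [(context[s:e], span_classifier("What is the entity type?", context, (s, e)))
            for s, e in spans]
print(entities)  # [('Alice', 'PER'), ('Acme Corp', 'ORG')]
```

Splitting the task this way is what lets each model be leaner and optimized separately, which is the source of the reported training-time reduction.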
Infusing Hierarchical Guidance into Prompt Tuning: A Parameter-Efficient Framework for Multi-level Implicit Discourse Relation Recognition
Multi-level implicit discourse relation recognition (MIDRR) aims at
identifying hierarchical discourse relations among arguments. Previous methods
achieve improvements by fine-tuning PLMs. However, due to data
scarcity and the task gap, the pre-trained feature space cannot be accurately
tuned to the task-specific space, which even aggravates the collapse of the
vanilla space. Moreover, the need to comprehend hierarchical semantics in MIDRR
makes this conversion even harder. In this paper, we propose a prompt-based
Parameter-Efficient Multi-level IDRR (PEMI) framework to solve the above
problems. First, we leverage parameter-efficient prompt tuning to drive the
input arguments to match the pre-trained space, realizing the approximation
with few parameters. Furthermore, we propose a hierarchical label refining
(HLR) method for the prompt verbalizer to deeply integrate hierarchical
guidance into the prompt tuning. Finally, our model achieves comparable results
on PDTB 2.0 and 3.0 using about 0.1% trainable parameters compared with
baselines and the visualization demonstrates the effectiveness of our HLR
method.
Comment: accepted to ACL 202
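One plausible reading of hierarchical label refining for a verbalizer can be sketched as follows: each second-level label embedding is mixed with its parent's embedding so the hierarchy's guidance reaches the prompt. The top-down mixing rule, the mixing weight, and the two-level PDTB-style hierarchy below are all assumptions for illustration, not the paper's exact HLR formulation.

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 8

# Hypothetical two-level PDTB-style hierarchy: top-level sense -> second-level senses.
hierarchy = {"Comparison": ["Concession", "Contrast"],
             "Expansion":  ["Conjunction", "Instantiation"]}

emb = {label: rng.normal(size=dim)
       for label in list(hierarchy) + [c for cs in hierarchy.values() for c in cs]}

def hierarchical_label_refine(emb, hierarchy, alpha=0.5):
    """Refine verbalizer label embeddings top-down: each child embedding
    is mixed with its parent's, injecting hierarchical guidance."""
    refined = dict(emb)
    for parent, children in hierarchy.items():
        for child in children:
            refined[child] = alpha * emb[child] + (1 - alpha) * refined[parent]
    return refined

refined = hierarchical_label_refine(emb, hierarchy)
print(sorted(refined) == sorted(emb))  # True: same label set, refined vectors
```

Note that only the label embeddings change; the backbone stays frozen, which is consistent with the roughly 0.1% trainable-parameter budget the abstract reports.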
MOELoRA: An MOE-based Parameter Efficient Fine-Tuning Method for Multi-task Medical Applications
The recent surge of Large Language Models (LLMs) has attracted
significant attention across numerous domains. In order to tailor an LLM to a
specific domain such as a web-based healthcare system, fine-tuning with domain
knowledge is necessary. However, two issues arise during fine-tuning LLMs for
medical applications. The first is the problem of task variety, where there are
numerous distinct tasks in real-world medical scenarios. This diversity often
results in suboptimal fine-tuning due to data imbalance and seesaw problems.
Additionally, the high cost of fine-tuning can be prohibitive, impeding the
application of LLMs. The large number of parameters in LLMs results in enormous
time and computational consumption during fine-tuning, which is difficult to
justify. To address these two issues simultaneously, we propose a novel
parameter-efficient fine-tuning framework for multi-task medical applications
called MOELoRA. The framework aims to capitalize on the benefits of both MOE
for multi-task learning and LoRA for parameter-efficient fine-tuning. To unify
MOE and LoRA, we devise multiple experts as the trainable parameters, where
each expert consists of a pair of low-rank matrices to maintain a small number
of trainable parameters. Additionally, we propose a task-motivated gate
function for all MOELoRA layers that can regulate the contributions of each
expert and generate distinct parameters for various tasks. To validate the
effectiveness and practicality of the proposed method, we conducted
comprehensive experiments on a public multi-task Chinese medical dataset. The
experimental results demonstrate that MOELoRA outperforms existing
parameter-efficient fine-tuning methods. The implementation is available online
for convenient reproduction of our experiments.
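The core MOELoRA layer described above, multiple experts that are each a pair of low-rank matrices, combined by a task-motivated gate and added to a frozen weight, can be sketched in NumPy. Dimensions, the softmax gate, and initializing B to zero (as in standard LoRA) are assumptions for illustration rather than the paper's exact settings.

```python
import numpy as np

rng = np.random.default_rng(3)
d_in, d_out, rank, n_experts, n_tasks = 16, 16, 2, 4, 3

W0 = rng.normal(size=(d_out, d_in))                 # frozen pre-trained weight
A = rng.normal(size=(n_experts, rank, d_in))        # each expert: low-rank pair (A, B)
B = np.zeros((n_experts, d_out, rank))              # B starts at zero, as in LoRA
task_gate = rng.normal(size=(n_tasks, n_experts))   # task-motivated gate parameters

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def moelora_layer(x, task_id):
    """Mix the LoRA experts with task-specific gate weights, producing a
    distinct low-rank update per task, added to the frozen layer's output."""
    g = softmax(task_gate[task_id])                 # (n_experts,) expert weights
    delta = sum(g[i] * (B[i] @ (A[i] @ x)) for i in range(n_experts))
    return W0 @ x + delta

x = rng.normal(size=d_in)
y = moelora_layer(x, task_id=1)
print(y.shape)  # (16,)
```

Each expert contributes only `rank * (d_in + d_out)` trainable parameters, so the gate can produce distinct effective weights per task while the total trainable budget stays small.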