Low Resource Multi-Task Sequence Tagging -- Revisiting Dynamic Conditional Random Fields
We compare different models for low-resource multi-task sequence tagging that
leverage dependencies between label sequences for different tasks. Our analysis
is aimed at datasets where each example has labels for multiple tasks. Current
approaches use either a separate model for each task or standard multi-task
learning to learn shared feature representations. However, these approaches
ignore correlations between label sequences, which can provide important
information in settings with small training datasets. To analyze which
scenarios can profit from modeling dependencies between labels in different
tasks, we revisit dynamic conditional random fields (CRFs) and combine them
with deep neural networks. We compare single-task, multi-task and dynamic CRF
setups for three diverse datasets at both sentence and document levels in
English and German low-resource scenarios. We show that including silver labels
from pretrained part-of-speech taggers as auxiliary tasks can improve
performance on downstream tasks. We find that especially in low-resource
scenarios, the explicit modeling of inter-dependencies between task predictions
outperforms single-task as well as standard multi-task models.
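To illustrate what modeling dependencies between label sequences can mean at decoding time, here is a minimal sketch of exact joint decoding for two tasks in the style of a factorial (dynamic) CRF. It is not the authors' implementation: the function name `joint_viterbi`, the score tables (`emit1`, `emit2`, `trans1`, `trans2`, `cross`), and the toy numbers are all assumptions, and in the paper's setting the scores would come from a neural network rather than being fixed by hand.

```python
from itertools import product

def joint_viterbi(emit1, emit2, trans1, trans2, cross):
    """Decode two label sequences jointly (hypothetical helper).

    emit1[t][a], emit2[t][b]: per-position label scores for each task.
    trans1[a_prev][a], trans2[b_prev][b]: within-task transition scores.
    cross[a][b]: score for assigning labels a and b at the same position.
    """
    n1, n2 = len(emit1[0]), len(emit2[0])
    # Viterbi over the product label space, so decoding is exact.
    states = list(product(range(n1), range(n2)))

    def local(t, s):
        a, b = s
        return emit1[t][a] + emit2[t][b] + cross[a][b]

    def step(p, s):
        return trans1[p[0]][s[0]] + trans2[p[1]][s[1]]

    best = {s: local(0, s) for s in states}
    back = []
    for t in range(1, len(emit1)):
        new_best, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[p] + step(p, s))
            new_best[s] = best[prev] + step(prev, s) + local(t, s)
            pointers[s] = prev
        best = new_best
        back.append(pointers)

    # Follow back-pointers from the best final state.
    path = [max(states, key=lambda s: best[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return [a for a, _ in path], [b for _, b in path]

# Toy scores (assumed): task 1 weakly prefers label 1, task 2 strongly prefers label 0.
emit1 = [[0.0, 0.1], [0.0, 0.1]]
emit2 = [[2.0, 0.0], [2.0, 0.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]
agree = [[1.0, -1.0], [-1.0, 1.0]]  # rewards matching labels across tasks

independent = joint_viterbi(emit1, emit2, zeros, zeros, zeros)
coupled = joint_viterbi(emit1, emit2, zeros, zeros, agree)
```

In this toy setup the cross-task scores let the confident task override the weak preference of the other task, which is the kind of effect that separate or feature-sharing models cannot express directly.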
Biomedical Entity Recognition by Detection and Matching
Biomedical named entity recognition (BNER) serves as the foundation for
numerous biomedical text mining tasks. Unlike general NER, BNER requires a
comprehensive grasp of the domain, and incorporating external knowledge beyond
training data poses a significant challenge. In this study, we propose a novel
BNER framework called DMNER. By leveraging the existing entity representation
model SAPBERT, we tackle BNER as a two-step process: entity boundary detection
and biomedical entity matching. DMNER exhibits applicability across multiple
NER scenarios: 1) In supervised NER, we observe that DMNER effectively
rectifies the output of baseline NER models, thereby further enhancing
performance. 2) In distantly supervised NER, combining MRC and AutoNER as span
boundary detectors enables DMNER to achieve satisfactory results. 3) For
training NER by merging multiple datasets, we adopt a framework similar to
DS-NER but additionally leverage ChatGPT to obtain high-quality phrases in the
training. Through extensive experiments conducted on 10 benchmark datasets, we
demonstrate the versatility and effectiveness of DMNER. Comment: 9 pages content, 2 pages appendi
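The two-step idea (detect span boundaries, then match each span against known entities) can be sketched as follows. This is a toy illustration under stated assumptions, not DMNER itself: candidate spans are taken as given, a character-trigram count vector stands in for an entity representation model such as SAPBERT, and the names `embed`, `cosine`, `match_spans`, the dictionary contents, and the threshold of 0.5 are all hypothetical.

```python
import math
from collections import Counter

def embed(text):
    # Stand-in for a learned entity encoder (assumption): character-trigram counts.
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))

def cosine(a, b):
    dot = sum(count * b[gram] for gram, count in a.items() if gram in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def match_spans(spans, dictionary, threshold=0.5):
    """Assign each detected span the type of its most similar dictionary entry.

    Spans whose best similarity falls below the threshold are discarded.
    """
    results = []
    for span in spans:
        best_type, best_score = None, 0.0
        for name, entity_type in dictionary.items():
            score = cosine(embed(span), embed(name))
            if score > best_score:
                best_type, best_score = entity_type, score
        if best_type is not None and best_score >= threshold:
            results.append((span, best_type))
    return results

# Hypothetical dictionary and boundary-detector output.
dictionary = {"aspirin": "Chemical", "diabetes mellitus": "Disease"}
spans = ["Aspirin", "the patient"]
matched = match_spans(spans, dictionary)
```

Separating detection from matching means the matcher can also act as a filter: a span proposed by the boundary detector that resembles no known entity is simply dropped.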
Data-efficient Active Learning for Structured Prediction with Partial Annotation and Self-Training
In this work we propose a pragmatic method that reduces the annotation cost
for structured label spaces using active learning. Our approach leverages
partial annotation, which reduces labeling costs for structured outputs by
selecting only the most informative sub-structures for annotation. We also
utilize self-training to incorporate the current model's automatic predictions
as pseudo-labels for un-annotated sub-structures. A key challenge in
effectively combining partial annotation with self-training to reduce
annotation cost is determining which sub-structures to select to label. To
address this challenge, we adopt an error estimator to adaptively decide the
partial selection ratio according to the current model's capability. In
evaluations spanning four structured prediction tasks, we show that our
combination of partial annotation and self-training using an adaptive selection
ratio reduces annotation cost over strong full annotation baselines under a
fair comparison scheme that takes reading time into consideration. Comment: Findings of EMNLP 202
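A rough sketch of how partial annotation and self-training might be combined for one sequence is given below. It rests on assumptions that the abstract does not state: the function name `split_for_annotation` is hypothetical, per-token confidences are taken as given, and the selection ratio is simply set equal to an estimated error rate, which is one plausible reading of an adaptive ratio rather than the authors' exact rule.

```python
from typing import Dict, List, Tuple

def split_for_annotation(
    confidences: List[float],
    predictions: List[str],
    estimated_error_rate: float,
) -> Tuple[List[int], Dict[int, str]]:
    """Split sub-structures into those sent to annotators and those pseudo-labeled.

    Assumption: the fraction selected for annotation equals the estimated
    error rate of the current model, so a weaker model asks for more labels.
    """
    n = len(confidences)
    k = min(n, max(1, round(estimated_error_rate * n)))
    # Least confident sub-structures are treated as the most informative.
    order = sorted(range(n), key=lambda i: confidences[i])
    to_annotate = sorted(order[:k])
    # Self-training: keep the model's own predictions for everything else.
    pseudo_labels = {i: predictions[i] for i in order[k:]}
    return to_annotate, pseudo_labels

# Toy inputs (assumed): four tokens with model confidences and predicted tags.
to_annotate, pseudo_labels = split_for_annotation(
    confidences=[0.9, 0.4, 0.8, 0.3],
    predictions=["O", "B", "O", "I"],
    estimated_error_rate=0.5,
)
```

As the model improves and its estimated error rate falls, fewer positions are routed to annotators and more are covered by pseudo-labels.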
Extracting health information from social media
Social media platforms with large user bases, such as Twitter, Reddit, and online health forums, contain a wealth of health-related information. Despite advances in natural language processing (NLP), extracting actionable health information from social media remains challenging. This thesis proposes a set of methodologies for extracting medical concepts and health information related to drugs, symptoms, and side effects from social media. We first develop a rule-based relationship extraction system that utilises a set of dictionaries and linguistic rules to extract structured information from patients’ posts on online health forums. We then automate the concept extraction process via: i) a supervised algorithm trained on a small labelled dataset, and ii) an iterative semi-supervised algorithm capable of learning new sentences and concepts. We test our machine-learning pipeline on a COVID-19 case study involving patient-authored social media posts. We develop a novel triage and diagnostic approach to extract symptoms, severity, and prevalence of the disease rather than to provide any actionable decisions at the individual level. Finally, we extend our approach by investigating the potential benefit of incorporating dictionary information into a neural network architecture for natural language processing.