Strong Baselines for Neural Semi-Supervised Learning under Domain Shift
Novel neural models have been proposed in recent years for learning under
domain shift. Most models, however, only evaluate on a single task, on
proprietary datasets, or compare to weak baselines, which makes comparison of
models difficult. In this paper, we re-evaluate classic general-purpose
bootstrapping approaches against recent neural approaches in the context of
neural networks under domain shift, and propose a novel multi-task tri-training
method that reduces the time and space complexity of classic tri-training. Extensive
experiments on two benchmarks yield largely negative results: while our novel
method establishes a new state of the art for sentiment analysis, it does not
consistently perform best. More importantly, we arrive at the somewhat surprising conclusion
that classic tri-training, with some additions, outperforms the state of the
art. We conclude that classic approaches constitute an important and strong
baseline.
Comment: ACL 2018
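For readers unfamiliar with the classic baseline being re-evaluated, the sketch below shows the core loop of tri-training (Zhou & Li, 2005): three classifiers are initialized on bootstrap samples of the labeled data, and each is then retrained on unlabeled examples that the other two agree on. This is an illustrative simplification, not the authors' code: the base learner, the fixed round count, and the omission of the per-round error-rate check are all assumptions.

```python
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import LogisticRegression

def tri_train(X_labeled, y_labeled, X_unlabeled, base=None, rounds=10, seed=0):
    """Classic tri-training, simplified: no per-round error-rate check."""
    base = base or LogisticRegression(max_iter=1000)
    rng = np.random.default_rng(seed)

    # Initialize three diverse classifiers on bootstrap samples.
    models = []
    for _ in range(3):
        idx = rng.choice(len(X_labeled), size=len(X_labeled), replace=True)
        models.append(clone(base).fit(X_labeled[idx], y_labeled[idx]))

    for _ in range(rounds):
        preds = [m.predict(X_unlabeled) for m in models]
        updated = False
        for i in range(3):
            j, k = [m for m in range(3) if m != i]
            # Pseudo-label the unlabeled points where the other two agree.
            agree = preds[j] == preds[k]
            if not agree.any():
                continue
            X_aug = np.concatenate([X_labeled, X_unlabeled[agree]])
            y_aug = np.concatenate([y_labeled, preds[j][agree]])
            models[i] = clone(base).fit(X_aug, y_aug)
            updated = True
        if not updated:
            break
    return models

def vote(models, X):
    # Majority vote; assumes non-negative integer labels (for np.bincount).
    votes = np.stack([m.predict(X) for m in models])
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
```

The paper's multi-task variant replaces the three independent models with shared parameters and task-specific output layers, which is what reduces the time and space cost relative to this classic formulation.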
Generalizing through Forgetting -- Domain Generalization for Symptom Event Extraction in Clinical Notes
Symptom information is primarily documented in free-text clinical notes and
is not directly accessible for downstream applications. To address this
challenge, information extraction approaches that can handle clinical language
variation across different institutions and specialties are needed. In this
paper, we present domain generalization for symptom extraction using
pretraining and fine-tuning data that differs from the target domain in terms
of institution and/or specialty and patient population. We extract symptom
events using a transformer-based joint entity and relation extraction method.
To reduce reliance on domain-specific features, we propose a domain
generalization method that dynamically masks frequent symptom words in the
source domain. Additionally, we pretrain the transformer language model (LM) on
task-related unlabeled texts for better representation. Our experiments
indicate that masking and adaptive pretraining methods can significantly
improve performance when the source domain is more distant from the target
domain.
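The dynamic masking idea lends itself to a compact sketch. Below, symptom trigger words that are frequent in the source-domain training data are replaced with the tokenizer's mask token with some probability on each pass over the data, so the extractor cannot lean on memorized domain-specific lexical cues. The top-k cutoff, masking probability, and the (tokens, spans) data format are illustrative assumptions, not the paper's exact configuration.

```python
from collections import Counter
import random

def frequent_symptom_words(train_examples, top_k=50):
    """train_examples: list of (tokens, symptom_spans) pairs, where
    symptom_spans holds (start, end) token indices of symptom mentions."""
    counts = Counter()
    for tokens, spans in train_examples:
        for start, end in spans:
            counts.update(w.lower() for w in tokens[start:end])
    return {w for w, _ in counts.most_common(top_k)}

def mask_symptoms(tokens, spans, frequent, mask_token="[MASK]", p=0.8):
    # Re-applied every epoch, so the masking pattern is dynamic rather
    # than a fixed preprocessing step.
    out = list(tokens)
    for start, end in spans:
        for i in range(start, end):
            if out[i].lower() in frequent and random.random() < p:
                out[i] = mask_token
    return out
```

The adaptive pretraining step mentioned above is complementary: the same transformer LM is further pretrained on task-related unlabeled clinical text before fine-tuning on the masked source-domain data.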