Injecting Knowledge Base Information into End-to-End Joint Entity and Relation Extraction and Coreference Resolution
We consider a joint information extraction (IE) model, solving named entity
recognition, coreference resolution and relation extraction jointly over the
whole document. In particular, we study how to inject information from a
knowledge base (KB) into such an IE model, based on unsupervised entity linking. The
KB entity representations we use are learned from either (i) hyperlinked text
documents (Wikipedia), or (ii) a knowledge graph (Wikidata), and appear
complementary in raising IE performance. Representations of corresponding
entity linking (EL) candidates are added to text span representations of the
input document, and we experiment with (i) taking a weighted average of the EL
candidate representations based on their prior (in Wikipedia), and (ii) using
an attention scheme over the EL candidate list. Results demonstrate an increase
of up to 5% F1-score for the evaluated IE tasks on two datasets. Despite a
strong performance of the prior-based model, our quantitative and qualitative
analysis reveals the advantage of using the attention-based approach.
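
The two ways of combining EL candidate representations described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the dot-product attention score, and the use of concatenation to "add" the KB vector to the span representation are all assumptions made for the example.

```python
import math


def prior_weighted_average(candidates, priors):
    """Average candidate vectors, weighted by their normalized link priors."""
    total = sum(priors)
    dim = len(candidates[0])
    return [
        sum(p / total * vec[d] for vec, p in zip(candidates, priors))
        for d in range(dim)
    ]


def attention_average(span, candidates):
    """Average candidate vectors with softmax weights.

    Assumption: the score of a candidate is its dot product with the span
    vector; the abstract does not specify the scoring function.
    """
    scores = [sum(s * c for s, c in zip(span, vec)) for vec in candidates]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(score - peak) for score in scores]
    norm = sum(exps)
    weights = [e / norm for e in exps]
    dim = len(candidates[0])
    pooled = [
        sum(w * vec[d] for vec, w in zip(candidates, weights))
        for d in range(dim)
    ]
    return pooled, weights


def inject(span, kb_vector):
    """Assumption: the pooled KB vector is concatenated to the span vector."""
    return span + kb_vector


# Toy example: two EL candidates with 2-dimensional KB embeddings.
candidates = [[1.0, 0.0], [0.0, 1.0]]
priors = [0.75, 0.25]   # hypothetical link priors
span = [0.0, 2.0]       # hypothetical text span representation

prior_vec = prior_weighted_average(candidates, priors)
attn_vec, attn_weights = attention_average(span, candidates)
enriched_span = inject(span, attn_vec)
```

In this toy case the prior favors the first candidate, while the attention weights favor the second because it aligns better with the span vector, which illustrates how context can override a misleading prior.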
Extreme Multi-Label Skill Extraction Training using Large Language Models
Online job ads serve as a valuable source of information for skill
requirements, playing a crucial role in labor market analysis and e-recruitment
processes. Since such ads are typically formatted in free text, natural
language processing (NLP) technologies are required to automatically process
them. We specifically focus on the task of detecting skills (mentioned
literally, or implicitly described) and linking them to a large skill ontology,
making it a challenging case of extreme multi-label classification (XMLC).
Given that no sizable labeled (training) dataset is available for
this specific XMLC task, we propose techniques to leverage general Large
Language Models (LLMs). We describe a cost-effective approach to generate an
accurate, fully synthetic labeled dataset for skill extraction, and present a
contrastive learning strategy that proves effective in the task. Our results
across three skill extraction benchmarks show a consistent increase of between
15 and 25 percentage points in R-Precision@5 compared to previously
published results that relied solely on distant supervision through literal
matches.
Comment: Accepted to the International Workshop on AI for Human Resources and
Public Employment Services (AI4HR&PES) as part of ECML-PKDD 202
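
For readers unfamiliar with the reported metric, the sketch below shows one common way to compute R-Precision@K. It assumes the definition in which, per sample, the number of correct labels among the top K predictions is divided by min(K, number of gold labels), and the result is averaged over samples. The function name and the toy data are hypothetical, and the abstract itself does not spell out the formula.

```python
def r_precision_at_k(ranked_labels, gold_labels, k=5):
    """Mean over samples of hits-in-top-k / min(k, number of gold labels)."""
    scores = []
    for ranked, gold in zip(ranked_labels, gold_labels):
        gold = set(gold)
        if not gold:
            continue  # samples without gold labels are skipped
        hits = sum(1 for label in ranked[:k] if label in gold)
        scores.append(hits / min(k, len(gold)))
    return sum(scores) / len(scores) if scores else 0.0


# Toy example: two job-ad sentences with ranked skill predictions.
ranked = [
    ["python", "sql", "java", "excel", "git"],
    ["teamwork", "sales", "sql", "python", "java"],
]
gold = [
    ["python", "git"],   # both gold skills appear in the top 5
    ["negotiation"],     # the gold skill is missed
]
score = r_precision_at_k(ranked, gold, k=5)
```

Dividing by min(K, number of gold labels) keeps a sample with fewer than K gold labels from being penalized for the remaining top-K slots.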
Isolated anti-Ku antibody in scleroderma-myositis overlap syndrome: the histo-pathological pattern
Frozen pretrained transformers for neural sign language translation
One of the major challenges in sign language translation from a sign language to a spoken language is the lack of parallel corpora. Recent works have achieved promising results on the RWTH-PHOENIX-Weather 2014T dataset, which consists of over eight thousand parallel sentences between German sign language and German. However, from the perspective of neural machine translation, this is still a tiny dataset. To improve the performance of models trained on small datasets, transfer learning can be used. While this has previously been applied in sign language translation for feature extraction, to the best of our knowledge, pretrained language models have not yet been investigated. We use pretrained BERT-base and mBART-50 models to initialize our sign language video to spoken language text translation model. To mitigate overfitting, we apply the frozen pretrained transformer technique: we freeze the majority of parameters during training. Using a pretrained BERT model, we outperform a baseline trained from scratch by 1 to 2 BLEU-4. Our results show that pretrained language models can be used to improve sign language translation performance and that the self-attention patterns in BERT transfer zero-shot to the encoder and decoder of sign language translation models.
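
The idea of freezing most parameters can be illustrated with a small, framework-free sketch. The abstract does not say which parameters remain trainable, so the choice below (layer normalization and input embedding parameters stay trainable, everything else is frozen) is an assumption for illustration, as are the parameter names and sizes. In PyTorch, the resulting plan would typically be applied by setting `requires_grad = False` on the frozen parameters.

```python
def plan_freezing(param_sizes, trainable_markers=("layer_norm", "embed")):
    """Return {name: is_trainable} and the fraction of frozen parameters.

    Assumption: a parameter stays trainable only if its name contains one
    of the markers; all other parameters are frozen.
    """
    trainable = {
        name: any(marker in name for marker in trainable_markers)
        for name in param_sizes
    }
    total = sum(param_sizes.values())
    frozen = sum(
        size for name, size in param_sizes.items() if not trainable[name]
    )
    return trainable, frozen / total


# Hypothetical parameter names and sizes for one encoder layer.
toy_model = {
    "encoder.layer0.attention.weight": 1000,
    "encoder.layer0.ffn.weight": 2000,
    "encoder.layer0.layer_norm.weight": 10,
    "encoder.sign_embed.weight": 90,
}
flags, frozen_fraction = plan_freezing(toy_model)
```

Here the attention and feed-forward weights, which hold most of the parameters, are frozen, so only a small fraction of the model is updated on the small parallel corpus.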