A named entity recognition system for Dutch

Authors: Walter Daelemans, Fien De Meulder, Veronique Hoste
Publication venue: Rodopi
Publication date: 01/01/2002

We describe a Named Entity Recognition system for Dutch that combines gazetteers, hand-crafted rules, and machine learning on the basis of seed material. We used gazetteers and a corpus to construct training material for Ripper, a rule learner. Instead of using Ripper to train a complete system, we used many different runs of Ripper in order to derive rules which we then interpreted and implemented in our own, hand-crafted system. This sped up the building of a hand-crafted system, and allowed us to use many different rule sets in order to improve performance. We discuss the advantages of using machine learning software as a tool in knowledge acquisition, and evaluate the resulting system for Dutch.

Crossref

Ghent University Academic Bibliography

Institutional Repository Universiteit Antwerpen

Tilburg University Repository

Exploring Spoken Named Entity Recognition: A Cross-Lingual Perspective

Authors: Moncef Benaicha, David Thulke, M. A. Tuğtekin Turan
Publication date: 03/07/2023

Recent advancements in Named Entity Recognition (NER) have significantly improved the identification of entities in textual data. However, spoken NER, a specialized field of spoken document retrieval, lags behind due to its limited research and scarce datasets. Moreover, cross-lingual transfer learning in spoken NER has remained unexplored. This paper utilizes transfer learning across Dutch, English, and German using pipeline and End-to-End (E2E) schemes. We employ Wav2Vec2-XLS-R models on custom pseudo-annotated datasets and investigate several architectures for the adaptability of cross-lingual systems. Our results demonstrate that End-to-End spoken NER outperforms pipeline-based alternatives over our limited annotations. Notably, transfer learning from German to Dutch surpasses the Dutch E2E system by 7% and the Dutch pipeline system by 4%. This study not only underscores the feasibility of transfer learning in spoken NER but also sets promising outcomes for future evaluations, hinting at the need for comprehensive data collection to augment the results.

arXiv.org e-Print Archive

Knowledge-Based Named Entity Recognition of Archaeological Concepts in Dutch

Authors: D. Tudhope, A. Vlachidis, M. Wansleeben
Publication venue: Metadata and Semantic Research MTSR 2020
Publication date: 18/03/2021

The advancement of Natural Language Processing (NLP) allows the process of deriving information from large volumes of text to be automated, making text-based resources more discoverable and useful. The attention is turned to one of the most important, but traditionally difficult to access, resources in archaeology: the largely unpublished reports generated by commercial or “rescue” archaeology, commonly known as “grey literature”. The paper presents the development and evaluation of a Named Entity Recognition system for Dutch archaeological grey literature, targeted at extracting mentions of artefacts, archaeological features, materials, places and time entities. The role of domain vocabulary is discussed for the development of a KOS-driven NLP pipeline, which is evaluated against a Gold Standard, human-annotated corpus.

UCL Discovery

Introduction to the CoNLL-2002 Shared Task: Language-Independent Named Entity Recognition

Author: Erik F. Tjong Kim Sang
Publication date: 01/01/2002

We describe the CoNLL-2002 shared task: language-independent named entity recognition. We give background information on the data sets and the evaluation method, present a general overview of the systems that have taken part in the task and discuss their performance.

arXiv.org e-Print Archive

Institutional Repository Universiteit Antwerpen

Tilburg University Repository