GermEval 2014 Named Entity Recognition Shared Task: Companion Paper
This paper describes the GermEval 2014 Named Entity Recognition (NER) Shared Task workshop at KONVENS. It provides background information on the motivation of this task, the dataset, the evaluation method, and an overview of the participating systems, followed by a discussion of their results. In contrast to previous NER tasks, the GermEval 2014 edition uses an extended tagset to account for derivatives of names and tokens that contain name parts. Further, nested named entities had to be predicted, i.e., names that contain other names. The eleven participating teams employed a wide range of techniques in their systems. The most successful systems used state-of-the-art machine learning methods, combined with some knowledge-based features in hybrid systems.
Information Access in a Multilingual World: Transitioning from Research to Real-World Applications
Multilingual Information Access (MLIA) is at a turning point wherein substantial real-world applications are being introduced after fifteen years of research into cross-language information retrieval, question answering, statistical machine translation and named entity recognition. Previous workshops on this topic have focused on research and small-scale applications. The focus of this workshop was on technology transfer from research to applications and on what future research is needed to facilitate MLIA in an increasingly connected multilingual world.
A Syllable-based Technique for Word Embeddings of Korean Words
Word embeddings have become a fundamental component of many NLP tasks such as
named entity recognition and machine translation. However, popular models that
learn such embeddings are unaware of the morphology of words, so they are not
directly applicable to highly agglutinative languages such as Korean. We
propose a syllable-based learning model for Korean using a convolutional neural
network, in which each word representation is composed of trained syllable vectors.
Our model produces morphologically more meaningful representations of
Korean words than the original Skip-gram embeddings. The results also
show that it is quite robust to the Out-of-Vocabulary problem.
Comment: 5 pages, 3 figures, 1 table. Accepted for EMNLP 2017 Workshop - The
1st Workshop on Subword and Character level models in NLP (SCLeM)
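The idea of composing a word representation from syllable vectors can be illustrated with a minimal sketch. This is not the authors' model: the abstract describes a convolutional neural network over trained syllable vectors, whereas the sketch below substitutes plain mean pooling over randomly initialised (untrained) vectors. The function names, the toy vocabulary, and the embedding size are all hypothetical. It only shows why an out-of-vocabulary word can still receive a vector when its syllables were seen in other words.

```python
import random

def syllables(word):
    """Split a word into Hangul syllable blocks (U+AC00..U+D7A3).
    Each precomposed Hangul character is one syllable."""
    return [ch for ch in word if "\uac00" <= ch <= "\ud7a3"]

def build_syllable_table(training_words, dim=4, seed=0):
    """Assign one vector per syllable seen in the training words.
    Random values stand in for trained vectors (assumption)."""
    rng = random.Random(seed)
    table = {}
    for word in training_words:
        for syl in syllables(word):
            if syl not in table:
                table[syl] = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    return table

def word_vector(word, table, dim=4):
    """Compose a word vector by averaging its known syllable vectors.
    Mean pooling is a stand-in for the CNN composition in the abstract."""
    vecs = [table[s] for s in syllables(word) if s in table]
    if not vecs:
        return [0.0] * dim  # no known syllables at all
    return [sum(col) / len(vecs) for col in zip(*vecs)]

# Toy vocabulary (hypothetical): school, student, teacher.
training_words = ["학교", "학생", "교사"]
table = build_syllable_table(training_words)

# "학사" never appears as a whole word, but both syllables were seen.
oov_vector = word_vector("학사", table)
```

A word-level Skip-gram table would have no entry for "학사" here, while the syllable table covers it through "학" and "사".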
Bidirectional LSTM for Named Entity Recognition in Twitter Messages
In this paper, we present our approach for named entity recognition in Twitter messages that we used in our participation in the Named Entity Recognition in Twitter shared task at the COLING 2016 Workshop on Noisy User-generated Text (WNUT). The main challenge that we aim to tackle is the short, noisy and colloquial nature of tweets, which makes named entity recognition in Twitter messages difficult. In particular, we investigate an approach for dealing with this problem by enabling bidirectional long short-term memory (LSTM) to automatically learn orthographic features without requiring feature engineering. In comparison with other systems participating in the shared task, our system achieved the most effective performance on both the ‘segmentation and categorisation’ and the ‘segmentation only’ sub-tasks.
Experiments to Improve Named Entity Recognition on Turkish Tweets
Social media texts are significant information sources for several
application areas including trend analysis, event monitoring, and opinion
mining. Unfortunately, existing solutions for tasks such as named entity
recognition that perform well on formal texts usually perform poorly when
applied to social media texts. In this paper, we report on experiments aimed at
improving named entity recognition on Turkish tweets, using
two different annotated data sets. In these experiments, starting with a
baseline named entity recognition system, we adapt its recognition rules and
resources to better fit Twitter language by relaxing its capitalization
constraint and by diacritics-based expansion of its lexical resources, and we
employ a simplistic normalization scheme on tweets to observe the effects of
these on the overall named entity recognition performance on Turkish tweets.
The evaluation results of the system with these different settings are provided
with discussions of these results.
Comment: appears in Proceedings of the EACL Workshop on Language Analysis for
Social Media, 201
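The two adaptations named in the abstract, relaxing the capitalization constraint and diacritics-based expansion of lexical resources, can be sketched with a toy lexicon lookup. This is only an illustration under stated assumptions, not the system evaluated in the paper: the functions `tr_lower`, `build_lexicon`, and `find_names`, the two-name lexicon, and the example tweet are all hypothetical.

```python
# Map Turkish letters with diacritics to their ASCII look-alikes.
FOLD = str.maketrans("çğıöşü", "cgiosu")

def tr_lower(text):
    """Turkish-aware lowercasing: dotted capital I (U+0130) becomes 'i',
    plain capital 'I' becomes dotless i (U+0131)."""
    return text.replace("\u0130", "i").replace("I", "\u0131").lower()

def build_lexicon(names, expand=False):
    """Store names in lowercase; optionally add diacritic-free variants."""
    lexicon = {tr_lower(n) for n in names}
    if expand:
        lexicon |= {n.translate(FOLD) for n in lexicon}
    return lexicon

def find_names(tweet, lexicon, require_capital=True):
    """Return tokens found in the lexicon. With require_capital=True,
    only tokens starting with an uppercase letter are considered."""
    hits = []
    for token in tweet.split():
        word = token.strip(".,!?:;#@\"'")
        if not word:
            continue
        if require_capital and not word[0].isupper():
            continue
        if tr_lower(word) in lexicon:
            hits.append(word)
    return hits

names = ["Şişli", "Kadıköy"]  # hypothetical location lexicon
tweet = "bugün sisli ve Kadıköy arası trafik"

# Baseline: capitalization required, no expansion.
strict = find_names(tweet, build_lexicon(names))
# Adapted: capitalization relaxed, lexicon expanded with ASCII variants.
relaxed = find_names(tweet, build_lexicon(names, expand=True),
                     require_capital=False)
```

In this toy run the baseline finds only "Kadıköy", while the adapted setting also matches the lowercase, diacritic-free "sisli". The sketch expands the lexicon only and does not normalize the tweet tokens, so all-caps ASCII spellings containing "I" are a known gap.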
Named Entity Recognition in Twitter using Images and Text
Named Entity Recognition (NER) is an important subtask of information
extraction that seeks to locate and recognise named entities. Despite recent
achievements, we still face limitations with correctly detecting and
classifying entities, particularly in short and noisy text such as tweets. An
important drawback of most NER approaches is their heavy dependence on
hand-crafted features and domain-specific knowledge, which are needed to achieve
state-of-the-art results. Thus, devising models to deal with such
linguistically complex contexts is still challenging. In this paper, we propose
a novel multi-level architecture that does not rely on any specific linguistic
resource or encoded rule. Unlike traditional approaches, we use features
extracted from images and text to classify named entities. Experimental tests
against state-of-the-art NER for Twitter on the Ritter dataset present
competitive results (0.59 F-measure), indicating that this approach may lead
towards better NER models.
Comment: The 3rd International Workshop on Natural Language Processing for
Informal Text (NLPIT 2017), 8 pages