Comprehensive Overview of Named Entity Recognition: Models, Domain-Specific Applications and Challenges

Author: Pakhale Kalyani
Publication date: 25/09/2023

In the domain of Natural Language Processing (NLP), Named Entity Recognition (NER) stands out as a pivotal mechanism for extracting structured insights from unstructured text. This manuscript offers an exhaustive exploration into the evolving landscape of NER methodologies, blending foundational principles with contemporary AI advancements. Beginning with the rudimentary concepts of NER, the study spans a spectrum of techniques from traditional rule-based strategies to the contemporary marvels of transformer architectures, particularly highlighting integrations such as BERT with LSTM and CNN. The narrative accentuates domain-specific NER models, tailored for intricate areas like finance, legal, and healthcare, emphasizing their specialized adaptability. Additionally, the research delves into cutting-edge paradigms including reinforcement learning, innovative constructs like E-NER, and the interplay of Optical Character Recognition (OCR) in augmenting NER capabilities. Grounding its insights in practical realms, the paper sheds light on the indispensable role of NER in sectors like finance and biomedicine, addressing the unique challenges they present. The conclusion outlines open challenges and avenues, marking this work as a comprehensive guide for those delving into NER research and applications.

arXiv.org e-Print Archive

A Comparative Review of Machine Learning for Arabic Named Entity Recognition

Author: Qadri binti Zakaria Lailatul
Salah Ramzi Esmail
Publication venue: 'Insight Society'
Publication date: 16/04/2017

Arabic Named Entity Recognition (ANER) systems aim to identify and classify Arabic named entities (NEs) within Arabic text. Other important tasks in Arabic Natural Language Processing (NLP), such as machine translation, question answering, and information extraction, depend on ANER. In general, ANER systems can be classified into three main approaches: rule-based, machine-learning, and hybrid systems. In this paper, we focus on research progress in machine-learning (ML) ANER and compare systems by linguistic resource, entity type, domain, method, and performance. We also highlight the challenges of processing Arabic NEs with ML systems.

International Journal on Advanced Science, Engineering and Information Technology

Voted Approach for Part of Speech Tagging in Bengali

Author: Bandyopadhyay Sivaji
Ekbal Asif
Hasanuzzaman Md.
Publication venue: City University of Hong Kong
Publication date: 01/01/2009

PACLIC 23 / City University of Hong Kong / 3-5 December 200

Waseda University Repository

Improved Named Entity Recognition Through SVM-Based Combination

Author: Labatut Vincent
Publication venue: HAL CCSD
Publication date: 11/12/2013

Named Entity Recognition (NER) consists of identifying specific textual expressions that represent various types of concepts: persons, locations, organizations, etc. It is an important part of natural language processing because it is often used when building more advanced text-based tools, especially in the context of information extraction. Consequently, many NER tools are now available, designed to handle various sorts of texts, languages, and entity types. A recent study on biographical texts showed that the overall indices used to assess the performance of these tools hide the fact that they can behave rather differently depending on the textual context, and could actually be complementary. In this work, we check this assumption by proposing two methods for combining several NER tools: one relies on a voting process and the other is SVM-based. Both take advantage of a global text feature to guide the combination process. We extend an existing corpus to provide enough data for training and testing. We implement a flexible open-source platform for benchmarking NER tools. We apply our combination methods to a selection of NER tools, including state-of-the-art ones, as well as our custom tool specifically designed to process hyperlinked biographical texts. Our results show that both proposed combination approaches outperform all of the considered standalone NER tools. Of the two, the SVM-based approach reaches the highest performance.

Hal-Diderot
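
The voting-based combination mentioned in the last abstract above can be illustrated with a minimal sketch. This is not the authors' method: it assumes token-level labels, plain plurality voting, and a first-tool tie-break, and it leaves out the global text feature the abstract says guides the combination. The function name `majority_vote`, the label set, and the three example tools are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-token labels from several NER tools by plurality vote.

    predictions: list of label sequences, one per tool, all the same length.
    Ties are broken in favour of the tool listed first (an assumption).
    """
    if not predictions:
        return []
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all tools must label the same tokens")

    combined = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        best = max(counts.values())
        # The first label, in tool order, that reaches the top count wins.
        combined.append(next(label for label in labels if counts[label] == best))
    return combined


# Hypothetical outputs of three tools on the same six tokens.
tokens = ["Marie", "Curie", "was", "born", "in", "Warsaw"]
tool_a = ["PER", "PER", "O", "O", "O", "LOC"]
tool_b = ["PER", "O", "O", "O", "O", "LOC"]
tool_c = ["PER", "PER", "O", "O", "O", "ORG"]

combined = majority_vote([tool_a, tool_b, tool_c])
print(list(zip(tokens, combined)))
```

An SVM-based combiner would replace the counting step with a trained classifier that takes the tools' outputs as input features; that variant is not sketched here.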