273 research outputs found
A Survey of Biological Entity Recognition Approaches
There has been growing interest in the task of Named Entity Recognition (NER) and a lot of research has been done in this direction in last two decades. Particularly, a lot of progress has been made in the biomedical domain with emphasis on identifying domain-specific entities and often the task being known as Biological Named Entity Recognition (BER). The task of biological entity recognition (BER) has been proved to be a challenging task due to several reasons as identified by many researchers. The recognition of biological entities in text and the extraction of relationships between them have paved the way for doing more complex text-mining tasks and building further applications. This paper looks at the challenges perceived by the researchers in BER task and investigates the works done in the domain of BER by using the multiple approaches available for the task
Biomedical Named Entity Recognition: A Review
Biomedical Named Entity Recognition (BNER) is the task of identifying biomedical instances such as chemical compounds, genes, proteins, viruses, disorders, DNAs and RNAs. The key challenge behind BNER lies on the methods that would be used for extracting such entities. Most of the methods used for BNER were relying on Supervised Machine Learning (SML) techniques. In SML techniques, the features play an essential role in terms of improving the effectiveness of the recognition process. Features can be identified as a set of discriminating and distinguishing characteristics that have the ability to indicate the occurrence of an entity. In this manner, the features should be able to generalize which means to discriminate the entities correctly even on new and unseen samples. Several studies have tackled the role of feature in terms of identifying named entities. However, with the surge of biomedical researches, there is a vital demand to explore biomedical features. This paper aims to accommodate a review study on the features that could be used for BNER in which various types of features will be examined including morphological features, dictionary-based features, lexical features and distance-based features
An Integrated Web-based System for MEDLINE Analysis: A Case Study of Chronic Kidney Disease
In the era of big data, medical researchers attempt to utilize some analysis techniques like machine learning and text mining on their large-scale corpora to save valuable labor work and time. Consequently, many data analysis platforms are built to support medical professionals such as Pubtator, GeneWays, BioContext, etc. These platforms are helpful to medical entities recognition and relation extraction, but there is not an integrated platform to support researchersâ various needs, and medical projects are isolated from each other, which is hard to be shared and reused. As a result, we present an integrated system containing âname entity recognitionâ, âdocument categorizationâ and âassociation extractionâ. Besides, we add the concept of âsocializationâ making projects reusable for further analyses. A case study of chronic kidney disease was adopted to indicate the effectiveness of the proposed system
Entity recognition in the biomedical domain using a hybrid approach.
BACKGROUND: This article describes a high-recall, high-precision approach for the extraction of biomedical entities from scientific articles. METHOD: The approach uses a two-stage pipeline, combining a dictionary-based entity recognizer with a machine-learning classifier. First, the OGER entity recognizer, which has a bias towards high recall, annotates the terms that appear in selected domain ontologies. Subsequently, the Distiller framework uses this information as a feature for a machine learning algorithm to select the relevant entities only. For this step, we compare two different supervised machine-learning algorithms: Conditional Random Fields and Neural Networks. RESULTS: In an in-domain evaluation using the CRAFT corpus, we test the performance of the combined systems when recognizing chemicals, cell types, cellular components, biological processes, molecular functions, organisms, proteins, and biological sequences. Our best system combines dictionary-based candidate generation with Neural-Network-based filtering. It achieves an overall precision of 86% at a recall of 60% on the named entity recognition task, and a precision of 51% at a recall of 49% on the concept recognition task. CONCLUSION: These results are to our knowledge the best reported so far in this particular task
Arabic named entity recognition
En esta tesis doctoral se describen las investigaciones realizadas con el objetivo de determinar
las mejores tecnicas para construir un Reconocedor de Entidades Nombradas
en Arabe. Tal sistema tendria la habilidad de identificar y clasificar las entidades
nombradas que se encuentran en un texto arabe de dominio abierto.
La tarea de Reconocimiento de Entidades Nombradas (REN) ayuda a otras tareas de
Procesamiento del Lenguaje Natural (por ejemplo, la Recuperacion de Informacion, la
Busqueda de Respuestas, la Traduccion Automatica, etc.) a lograr mejores resultados
gracias al enriquecimiento que a~nade al texto. En la literatura existen diversos trabajos
que investigan la tarea de REN para un idioma especifico o desde una perspectiva
independiente del lenguaje. Sin embargo, hasta el momento, se han publicado muy
pocos trabajos que estudien dicha tarea para el arabe.
El arabe tiene una ortografia especial y una morfologia compleja, estos aspectos aportan
nuevos desafios para la investigacion en la tarea de REN. Una investigacion completa
del REN para elarabe no solo aportaria las tecnicas necesarias para conseguir
un alto rendimiento, sino que tambien proporcionara un analisis de los errores y una
discusion sobre los resultados que benefician a la comunidad de investigadores del
REN. El objetivo principal de esta tesis es satisfacer esa necesidad. Para ello hemos:
1. Elaborado un estudio de los diferentes aspectos del arabe relacionados con dicha
tarea;
2. Analizado el estado del arte del REN;
3. Llevado a cabo una comparativa de los resultados obtenidos por diferentes
tecnicas de aprendizaje automatico;
4. Desarrollado un metodo basado en la combinacion de diferentes clasificadores,
donde cada clasificador trata con una sola clase de entidades nombradas y emplea
el conjunto de caracteristicas y la tecnica de aprendizaje automatico mas
adecuados para la clase de entidades nombradas en cuestion.
Nuestros experimentos han sido evaluados sobre nueve conjuntos de test.Benajiba, Y. (2009). Arabic named entity recognition [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/8318Palanci
- âŚ