Search CORE

4 research outputs found

An Evaluation Methodology of Named Entities Recognition in Spanish Language: ECU 911 Case Study

Author: Acosta María-Inés
Lima Juan-Fernando
Orellana Marcos
Peña Mario
Trujillo Andrea
Publication venue: 'Insight Society'
Publication date: 13/06/2020

The importance of the information gathered by Integrated Security Services such as ECU911 in Ecuador is evident in its quality and availability for decision-making tasks. It is a priority to avoid losing relevant information such as event addresses, place references, names, etc. In this context, Named Entity Recognition (NER) analysis is used to discover information in informal texts. Unlike structured corpora labeled for NER analysis, such as CONLL2002 or ANCORA, informal texts generated from emergency call dialogues have very wide linguistic variety; in addition, there is a strong tendency to lose important information when they are processed. A relevant aspect to consider is the identification of text that denotes entities such as the physical address where an emergency event occurred. This study aims to extract the locations in which an emergency event has been reported. A set of experiments was performed with NER models based on a Convolutional Neural Network (CNN). Model performance was evaluated according to parameters such as training dataset size, dropout rate, location dictionary, and denoting location. An experimentation methodology was proposed, which follows these steps: i) data preprocessing, ii) dataset labeling, iii) model structuring, and iv) model evaluation. Results revealed that a model's performance improves with more training data, an adequate dropout rate to control overfitting, and a combination of a dictionary of locations and the replacement of words denoting entities.

International Journal on Advanced Science, Engineering and Information Technology

LANGUAGE MODELS FOR RARE DISEASE INFORMATION EXTRACTION: EMPIRICAL INSIGHTS AND MODEL COMPARISONS

Author: Gupta Shashank
Publication venue: UKnowledge
Publication date: 01/01/2024

End-to-end relation extraction (E2ERE) is a crucial task in natural language processing (NLP) that involves identifying and classifying semantic relationships between entities in text. This thesis compares three paradigms for E2ERE in biomedicine, focusing on rare diseases with discontinuous and nested entities. We evaluate Named Entity Recognition (NER) to Relation Extraction (RE) pipelines, sequence-to-sequence models, and generative pre-trained transformer (GPT) models using the RareDis information extraction dataset. Our findings indicate that pipeline models are the most effective, followed closely by sequence-to-sequence models. GPT models, despite having eight times as many parameters, perform worse than sequence-to-sequence models and lag significantly behind pipeline models. Our results also hold for a second E2ERE dataset for chemical-protein interactions.

University of Kentucky

An end-to-end deep learning architecture for extracting protein–protein interactions affected by genetic mutations

Author: Airola
Bengio
Bunescu
Chiu
Collobert
Frunza
Fundel
Gers
Hirschman
Kavuluru
Kim
Li
Liu
Luo
Maglott
Morgan
Ramakanth Kavuluru
Rink
Sepp
Tieleman
Tran
Tran
Tung Tran
Wei
Wei
Zheng
Özgür
Publication venue: 'Oxford University Press (OUP)'

Crossref