817 research outputs found

    Multilingual Name Entity Recognition and Intent Classification Employing Deep Learning Architectures

    Named Entity Recognition and Intent Classification are among the most important subfields of Natural Language Processing. Recent research has led to the development of faster, more sophisticated, and more efficient models for these two tasks. In this work we explore the effectiveness of two separate families of Deep Learning networks for those tasks: Bidirectional Long Short-Term Memory networks and Transformer-based networks. The models were trained and tested on the ATIS benchmark dataset for both English and Greek. The purpose of this paper is to present a comparative study of the two groups of networks for both languages and to showcase the results of our experiments. The models, being the current state of the art, achieved high performance. Comment: 24 pages, 5 figures, 11 tables, dataset availabl

    Comparing Transformer-based NER approaches for analysing textual medical diagnoses

    The automated analysis of medical documents has attracted growing research interest in recent years, owing to the social relevance of the topic and the difficulties often encountered with short and very specific documents. In particular, this active area of research has stimulated the development of several techniques for automatic document classification, question answering, and named entity recognition (NER). Nevertheless, many open issues must be addressed to obtain satisfactory results in a field where prediction errors could compromise people’s lives. To this end, we focused on the named entity recognition task for medical documents, and in this work we discuss the results obtained with our hybrid approach. To take advantage of the most relevant findings in natural language processing, we decided to focus on deep neural network models. We compared several configurations of our model by varying the transformer architecture (BERT, RoBERTa, and ELECTRA) until we obtained the configuration we considered best for our goals. The most promising model was used to participate in the SpRadIE task of the annual CLEF (Conference and Labs of the Evaluation Forum). The results are encouraging and can serve as a reference for future studies on the topic.

    UEM-UC3M: An Ontology-based named entity recognition system for biomedical texts

    Proceedings of: International Workshop on Semantic Evaluation. SemEval-2013: Semantic Evaluation Exercises. Took place on June 14-15, 2013, in Atlanta, Georgia (USA). Event website: http://www.cs.york.ac.uk/semeval-2013/ Drug named entity recognition focuses on identifying concepts in text that correspond to a chemical substance used in pharmacology for the treatment, cure, prevention, or diagnosis of diseases. This paper describes an ontology-based system for identifying chemical substances in biomedical text. The system achieves an F-1 measure of 0.529 in the task. This work has been funded by the MA2VICMR project (S2009/TIC-1542) and the MULTIMEDICA project (TIN 2010-20644-C03-01). Publicad

    Building a Semantic Virtual Museum: from Wiki to Semantic Wiki using Named Entity Recognition

    In this paper, we describe an approach for creating semantic wiki pages from regular wiki pages in the domain of scientific museums, using information extraction methods in general and named entity recognition in particular. We use a domain-specific ontology, CIDOC-CRM, as the base structure for representing and processing knowledge. We describe the major components of the proposed approach and a three-step process for generating semantic wiki pages: named entity recognition, identification of domain classes using the ontology, and establishment of the properties of the entities. Our initial evaluation of the prototype shows promising results in terms of improved efficiency and time and cost savings.

    PA163-2-CLASHISTVI : clasificación de historias provenientes de familiares de víctimas y sobrevivientes del conflicto sociopolítico de Colombia

    The present work illustrates how a software prototype was developed for the supervised classification of unstructured information in texts of stories from relatives of victims and survivors of the socio-political conflict in Colombia. Natural language processing techniques, such as named entity recognition, make it possible to express in percentage terms the categories to which a text belongs. The Design Science Research and CRISP-DM methodologies are applied together, and a user-focused evaluation is carried out following the Technology Acceptance Model (TAM). Magíster en Ingeniería de Sistemas y Computación. Maestrí

    Report of MIRACLE team for Geographical IR in CLEF 2006

    The main objective of the designed experiments is to test the effects of geographical information retrieval on documents that contain geographical tags. In these experiments we try to isolate geographical retrieval from textual retrieval by replacing all textual geo-entity references in the topics with their associated tags and splitting the retrieval process into two phases: textual retrieval from the textual part of the topic without geo-entity references, and geographical retrieval from the tagged text generated by the topic tagger. Textual and geographical results are combined using different techniques: union, intersection, difference, and external join based. Our geographic information retrieval system consists of a set of basic components organized in two categories: (i) linguistic tools oriented to textual analysis and retrieval and (ii) resources and tools oriented to geographical analysis. These tools are combined to carry out the different phases of the system: (i) analysis of documents and topics, (ii) retrieval of relevant documents, and (iii) combination of results. Compared with the results of the last campaign, mean average precision gets worse when the textual geo-entity references are replaced with geographical tags. Part of this decline occurs because our experiments return zero relevant documents if no documents satisfy the geographical sub-query. But if we analyze only the results of queries that satisfied both textual and geographical terms, we observe that the designed experiments recover relevant documents quickly, improving R-Precision values. We conclude that the developed geographical information retrieval system is very sensitive to textual georeferences and that it is therefore necessary to improve the named entity recognition module.
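The set-based combination techniques named in this abstract (union, intersection, difference) can be sketched as follows. This is only an illustration under stated assumptions, not the MIRACLE team's implementation: the function name `combine`, the document ids, and the choice to keep the textual ranking order are all hypothetical, and the "external join based" technique is not shown because the abstract does not describe it.

```python
def combine(textual, geographical, how):
    """Combine two ranked lists of document ids (hypothetical helper).

    The textual ranking order is preserved; this ordering policy is an
    assumption made for the sketch, not something stated in the abstract.
    """
    geo = set(geographical)
    if how == "union":
        seen = set(textual)
        # Textual hits first, then geographical hits not already listed.
        return list(textual) + [d for d in geographical if d not in seen]
    if how == "intersection":
        # Keep only documents returned by both retrieval phases.
        return [d for d in textual if d in geo]
    if how == "difference":
        # Keep textual hits that the geographical phase did not return.
        return [d for d in textual if d not in geo]
    raise ValueError(f"unknown combination: {how}")


# Made-up document ids, for illustration only.
textual_hits = ["d1", "d2", "d3"]
geo_hits = ["d3", "d4"]

print(combine(textual_hits, geo_hits, "union"))
print(combine(textual_hits, geo_hits, "intersection"))
# An empty geographical result empties the intersection as well.
print(combine(textual_hits, [], "intersection"))
```

The last call shows one way the failure mode described in the abstract could arise: when the geographical sub-query matches nothing, an intersection-style combination returns no documents at all, however good the textual hits are.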

    An Integrated Web-based System for MEDLINE Analysis: A Case Study of Chronic Kidney Disease

    In the era of big data, medical researchers apply analysis techniques such as machine learning and text mining to their large-scale corpora to save labor and time. Consequently, many data analysis platforms, such as Pubtator, GeneWays, and BioContext, have been built to support medical professionals. These platforms help with medical entity recognition and relation extraction, but no integrated platform supports researchers’ various needs, and medical projects are isolated from each other, which makes them hard to share and reuse. We therefore present an integrated system comprising ‘named entity recognition’, ‘document categorization’, and ‘association extraction’. In addition, we introduce the concept of ‘socialization’, which makes projects reusable for further analyses. A case study of chronic kidney disease was conducted to demonstrate the effectiveness of the proposed system.