9 research outputs found

    Sieve-based coreference resolution enhances semi-supervised learning model for chemical-induced disease relation extraction.

    Get PDF
    The BioCreative V chemical-disease relation (CDR) track was proposed to accelerate the progress of text mining in facilitating integrative understanding of chemicals, diseases and their relations. In this article, we describe an extension of our system (namely UET-CAM) that participated in the BioCreative V CDR. The original UET-CAM system's performance was ranked fourth among 18 participating systems by the BioCreative CDR track committee. In the Disease Named Entity Recognition and Normalization (DNER) phase, our system employed joint inference (decoding) with a perceptron-based named entity recognizer (NER) and a back-off model with Semantic Supervised Indexing and Skip-gram for named entity normalization. In the chemical-induced disease (CID) relation extraction phase, we proposed a pipeline that includes a coreference resolution module and a Support Vector Machine relation extraction model. The former module utilized a multi-pass sieve to extend entity recall. In this article, the UET-CAM system was improved by adding a 'silver' CID corpus to train the prediction model. This silver standard corpus of more than 50 thousand sentences was automatically built based on the Comparative Toxicogenomics Database (CTD) database. We evaluated our method on the CDR test set. Results showed that our system could reach the state of the art performance with F1 of 82.44 for the DNER task and 58.90 for the CID task. Analysis demonstrated substantial benefits of both the multi-pass sieve coreference resolution method (F1 + 4.13%) and the silver CID corpus (F1 +7.3%).Database URL: SilverCID-The silver-standard corpus for CID relation extraction is freely online available at: https://zenodo.org/record/34530 (doi:10.5281/zenodo.34530).H-Q.L. and T.H.D. gratefully acknowledge funding support from Vietnam National University, Hanoi (VNU), under Project No. QG.15.21. N.C. gratefully acknowledges funding support from the UK EPSRC (grant number EP/M005089/1). Funding for open access charge : VNUH Project No. QG.15.21

    Grounding event references in news

    Get PDF
    Events are frequently discussed in natural language, and their accurate identification is central to language understanding. Yet they are diverse and complex in ontology and reference; computational processing hence proves challenging. News provides a shared basis for communication by reporting events. We perform several studies into news event reference. One annotation study characterises each news report in terms of its update and topic events, but finds that topic is better consider through explicit references to background events. In this context, we propose the event linking task which—analogous to named entity linking or disambiguation—models the grounding of references to notable events. It defines the disambiguation of an event reference as a link to the archival article that first reports it. When two references are linked to the same article, they need not be references to the same event. Event linking hopes to provide an intuitive approximation to coreference, erring on the side of over-generation in contrast with the literature. The task is also distinguished in considering event references from multiple perspectives over time. We diagnostically evaluate the task by first linking references to past, newsworthy events in news and opinion pieces to an archive of the Sydney Morning Herald. The intensive annotation results in only a small corpus of 229 distinct links. However, we observe that a number of hyperlinks targeting online news correspond to event links. We thus acquire two large corpora of hyperlinks at very low cost. From these we learn weights for temporal and term overlap features in a retrieval system. These noisy data lead to significant performance gains over a bag-of-words baseline. While our initial system can accurately predict many event links, most will require deep linguistic processing for their disambiguation

    Grounding event references in news

    Get PDF
    Events are frequently discussed in natural language, and their accurate identification is central to language understanding. Yet they are diverse and complex in ontology and reference; computational processing hence proves challenging. News provides a shared basis for communication by reporting events. We perform several studies into news event reference. One annotation study characterises each news report in terms of its update and topic events, but finds that topic is better consider through explicit references to background events. In this context, we propose the event linking task which—analogous to named entity linking or disambiguation—models the grounding of references to notable events. It defines the disambiguation of an event reference as a link to the archival article that first reports it. When two references are linked to the same article, they need not be references to the same event. Event linking hopes to provide an intuitive approximation to coreference, erring on the side of over-generation in contrast with the literature. The task is also distinguished in considering event references from multiple perspectives over time. We diagnostically evaluate the task by first linking references to past, newsworthy events in news and opinion pieces to an archive of the Sydney Morning Herald. The intensive annotation results in only a small corpus of 229 distinct links. However, we observe that a number of hyperlinks targeting online news correspond to event links. We thus acquire two large corpora of hyperlinks at very low cost. From these we learn weights for temporal and term overlap features in a retrieval system. These noisy data lead to significant performance gains over a bag-of-words baseline. While our initial system can accurately predict many event links, most will require deep linguistic processing for their disambiguation

    Standardizing New Diagnostic Tests to Facilitate Rapid Responses to The Covid-19 Pandemic

    Get PDF
    In order to enhance the data interoperability, an expeditious and accurate standardization solution is highly desirable for naming rapidly emerging novel lab tests, and thus diminishes confusion in early responses to pandemic outbreaks. This is a preliminary study to explore the roles and implementation of medical informatics technology, especially natural language processing and ontology methods, in standardizing information about emerging lab tests during a pandemic, thereby facilitating rapid responses to the pandemic. The ultimate goal of this study is to develop an informatics framework for rapid standardization of lab testing names during a pandemic to better prepare for future public health threats. We first constructed an information model for lab tests approved during the COVID-19 pandemic and built a named entity recognition tool that can automatically extract lab test information specified in the information model from the Emergency Use Authorization(EUA)documents of the U.S. Food and Drug Administration (FDA), thus creating a catalog of approved lab tests with detailed information. To facilitate the standardization of lab testing data in electronic health records, we further developed the COVID-19 TestNorm, a tool that normalizes the names of various COVID-19 lab testing used by different healthcare facilities into standard Logical Observation Identifiers Names and Codes (LOINC). The overall accuracy of COVID-19 TestNorm on the development set was 98.9%, and on the independent test set was 97.4%. Lastly, we conducted a clinical study on COVID-19 re-positivity to demonstrate the utility of standardized lab test information in supporting clinical research. We believe that the result of my study indicates great a potential of medical informatics technologies for facilitating rapid responses to both current and future pandemics

    Preface

    Get PDF

    INTER-ENG 2020

    Get PDF
    These proceedings contain research papers that were accepted for presentation at the 14th International Conference Inter-Eng 2020 ,Interdisciplinarity in Engineering, which was held on 8–9 October 2020, in Târgu Mureș, Romania. It is a leading international professional and scientific forum for engineers and scientists to present research works, contributions, and recent developments, as well as current practices in engineering, which is falling into a tradition of important scientific events occurring at Faculty of Engineering and Information Technology in the George Emil Palade University of Medicine, Pharmacy Science, and Technology of Târgu Mures, Romania. The Inter-Eng conference started from the observation that in the 21st century, the era of high technology, without new approaches in research, we cannot speak of a harmonious society. The theme of the conference, proposing a new approach related to Industry 4.0, was the development of a new generation of smart factories based on the manufacturing and assembly process digitalization, related to advanced manufacturing technology, lean manufacturing, sustainable manufacturing, additive manufacturing, and manufacturing tools and equipment. The conference slogan was “Europe’s future is digital: a broad vision of the Industry 4.0 concept beyond direct manufacturing in the company”
    corecore