
    Systematic identification of pharmacogenomics information from clinical trials

    Recent progress in high-throughput genomic technologies has shifted pharmacogenomic research from candidate gene pharmacogenetics to clinical pharmacogenomics (PGx). Many clinically relevant questions arise, such as ‘what drug should be prescribed for a patient with mutant alleles?’ Typically, answers to such questions can be found in publications mentioning the gene–drug–disease relationships of interest. In this work, we hypothesize that ClinicalTrials.gov is a comparably rich source of PGx-related information. We therefore developed a systematic approach to automatically identify PGx relationships between genes, drugs and diseases from trial records in ClinicalTrials.gov. In our evaluation, we found that our extracted relationships overlap significantly with the literature-curated factual knowledge in a PGx database, and that most relationships appear on average five years earlier in clinical trials than in their corresponding publications, suggesting that clinical trials may be valuable both for validating known PGx information and for capturing new information in a more timely manner. Furthermore, two human reviewers judged a portion of the computer-generated relationships and found an overall accuracy of 74% for our text-mining approach. This work has practical implications for enriching our existing knowledge of PGx gene–drug–disease relationships, as well as for suggesting crosslinks between ClinicalTrials.gov and other PGx knowledge bases.
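    The trial-record mining described above can be sketched as a simple co-occurrence extractor. This is a minimal illustration, not the authors' pipeline: the gene lexicon, record field names, and tokenization are hypothetical stand-ins for the curated dictionaries and ClinicalTrials.gov fields a real system would use.

```python
import re
from itertools import product

# Hypothetical gene lexicon; a real system would use curated
# gene, drug and disease dictionaries.
GENE_LEXICON = {"CYP2D6", "TPMT", "EGFR"}

def extract_pgx_triples(trial):
    """Return (gene, drug, disease) triples co-occurring in one trial record.

    `trial` mimics a ClinicalTrials.gov record with 'conditions',
    'interventions' and free-text 'eligibility' fields (field names assumed).
    """
    # Pull uppercase/digit tokens as a crude stand-in for gene mention tagging.
    tokens = set(re.findall(r"[A-Z][A-Z0-9]+", trial["eligibility"]))
    genes = sorted(tokens & GENE_LEXICON)
    return [
        (gene, drug, disease)
        for gene, drug, disease in product(
            genes, trial["interventions"], trial["conditions"]
        )
    ]

trial = {
    "conditions": ["breast cancer"],
    "interventions": ["tamoxifen"],
    "eligibility": "Patients genotyped for CYP2D6 poor-metabolizer alleles.",
}
print(extract_pgx_triples(trial))  # [('CYP2D6', 'tamoxifen', 'breast cancer')]
```

    Each gene found in the free text is paired with every listed intervention and condition; a production system would add normalization and filtering before treating such triples as PGx candidates.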

    Knowledge Author: Facilitating user-driven, Domain content development to support clinical information extraction

    Background: Clinical Natural Language Processing (NLP) systems require a semantic schema comprised of domain-specific concepts, their lexical variants, and associated modifiers to accurately extract information from clinical texts. An NLP system leverages this schema to structure concepts and extract meaning from free text. In the clinical domain, creating a semantic schema typically requires input from both a domain expert, such as a clinician, and an NLP expert who represents the clinical concepts derived from the clinician's domain expertise in a computable format usable by an NLP system. The goal of this work is to develop a web-based tool, Knowledge Author, that bridges the gap between the clinical domain expert and NLP system development by facilitating the creation of domain content, represented in a semantic schema, for extracting information from clinical free text. Results: Knowledge Author is a web-based recommendation system that supports users in developing the domain content necessary for clinical NLP applications. Knowledge Author's schematic model leverages a set of semantic types derived from the Secondary Use Clinical Element Models and the Common Type System to allow the user to quickly create and modify domain-related concepts. Features such as collaborative development and domain content suggestions, generated by mapping concepts to the Unified Medical Language System Metathesaurus database, further support the content creation process. Two proof-of-concept studies evaluated the system's performance. The first study evaluated Knowledge Author's flexibility in creating a broad range of concepts: of a dataset of 115 concepts, 87 (76%) could be created using Knowledge Author. The second study evaluated the effectiveness of Knowledge Author's output in an NLP system by extracting concepts and associated modifiers representing a clinical element, carotid stenosis, from 34 clinical free-text radiology reports using Knowledge Author and an NLP system, pyConText. Knowledge Author's domain content produced high recall for concepts (targeted findings: 86%) and varied recall for modifiers (certainty: 91%, sidedness: 80%, neurovascular anatomy: 46%). Conclusion: Knowledge Author can support clinical domain content development for information extraction by enabling domain experts to create semantic schemas themselves.
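    The schema-plus-matcher pairing the abstract describes can be roughly sketched as follows. The concept structure, lexical variants, and modifier names here are illustrative only; Knowledge Author's actual schematic model and pyConText's algorithm (which also handles negation scope and context) are considerably richer.

```python
from dataclasses import dataclass, field

@dataclass
class Concept:
    """One entry in a semantic schema: a finding, its lexical variants,
    and the modifier types that can attach to it (names are illustrative)."""
    name: str
    variants: list = field(default_factory=list)
    modifiers: dict = field(default_factory=dict)  # modifier type -> cue phrases

schema = Concept(
    name="carotid_stenosis",
    variants=["carotid stenosis", "stenosis of the carotid artery"],
    modifiers={
        "sidedness": ["left", "right"],
        "certainty": ["no evidence of", "likely"],
    },
)

def annotate(report, concept):
    """Naively flag the concept and any modifier cues found in a report."""
    text = report.lower()
    hit = any(v in text for v in concept.variants)
    mods = {m: [c for c in cues if c in text] for m, cues in concept.modifiers.items()}
    return {"finding": hit, "modifiers": mods}

print(annotate("Severe left carotid stenosis is likely.", schema))
```

    A domain expert only edits the `Concept` entries; the matcher stays generic, which is the division of labor the tool aims to support.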

    Determining correspondences between high-frequency MedDRA concepts and SNOMED: a case study

    Background: The Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) is being advocated as the foundation for encoding clinical documentation. While the electronic medical record is likely to play a critical role in pharmacovigilance - the detection of adverse events due to medications - classification and reporting of adverse events is currently based on the Medical Dictionary for Regulatory Activities (MedDRA). Complete and high-quality MedDRA-to-SNOMED CT mappings can therefore facilitate pharmacovigilance. The existing mappings, as determined through the Unified Medical Language System (UMLS), are partial, and record only one-to-one correspondences even though SNOMED CT can be used compositionally. Efforts to map previously unmapped MedDRA concepts would be most productive if focused on concepts that occur frequently in actual adverse event data. We aimed to identify aspects of MedDRA that complicate mapping to SNOMED CT, determine patterns in unmapped high-frequency MedDRA concepts, and identify the types of integration errors in the mapping of MedDRA to UMLS. Methods: Using one year's data from the US Food and Drug Administration's Adverse Event Reporting System, we identified MedDRA preferred terms that collectively accounted for 95% of both Adverse Events and Therapeutic Indications records. After eliminating those already mapped to SNOMED CT, we attempted to map the remaining 645 Adverse-Event and 141 Therapeutic-Indications preferred terms with software assistance. Results: All but 46 Adverse-Event and 7 Therapeutic-Indications preferred terms could be composed using SNOMED CT concepts; none required more than 3 SNOMED CT concepts to compose. We describe the common composition patterns in the paper. About 30% of both Adverse-Event and Therapeutic-Indications preferred terms corresponded to single SNOMED CT concepts: the correspondence was detectable by human inspection but had been missed during the integration process, which had created duplicated concepts in UMLS. Conclusions: Identification of composite mapping patterns, and of the types of errors that occur in the MedDRA content within UMLS, can focus larger-scale efforts on improving the quality of such mappings, which may assist in the creation of an adverse-events ontology.
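    The two-stage lookup implied by the results (a one-to-one table first, a compositional table of up to three concepts second) might be sketched like this. All codes and table contents below are made up for illustration; real mappings live in the UMLS Metathesaurus.

```python
# Hypothetical lookup tables with invented identifiers; a real
# implementation would query UMLS for MedDRA-to-SNOMED CT links.
ONE_TO_ONE = {"Nausea": ["SCT_0001"]}
COMPOSITIONAL = {
    # MedDRA preferred term -> up to 3 SNOMED CT concepts composing it
    "Drug-induced liver injury": ["SCT_0107", "SCT_0042"],
}

def map_pt(preferred_term):
    """Map a MedDRA preferred term to SNOMED CT concept(s), trying the
    one-to-one table first and the compositional table second."""
    if preferred_term in ONE_TO_ONE:
        return {"kind": "single", "concepts": ONE_TO_ONE[preferred_term]}
    if preferred_term in COMPOSITIONAL:
        return {"kind": "composite", "concepts": COMPOSITIONAL[preferred_term]}
    return {"kind": "unmapped", "concepts": []}

print(map_pt("Drug-induced liver injury")["kind"])  # composite
```

    Terms falling through to "unmapped" are exactly the residue the study targets for manual, software-assisted composition.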

    Extracting biomedical relations from biomedical literature

    Master's thesis in Bioinformatics and Computational Biology, presented to the Universidade de Lisboa through the Faculdade de Ciências, 2018. Science, and the biomedical field especially, is witnessing a growth in knowledge at a rate with which clinicians and researchers struggle to keep up. Scientific evidence spread across multiple types of publications, the richness of mentions of etiology, molecular mechanisms, anatomical sites and other biomedical terminology that is not uniform across different writings, among other constraints, have encouraged the application of text mining methods to the systematic reviewing process. This work aims to test the positive impact that text mining tools, together with controlled vocabularies (as a way of organizing knowledge to aid later information collection), have on the systematic reviewing process, through a system capable of creating a classification model whose training is based on a controlled vocabulary (MeSH) and which can be applied to a variety of biomedical literature. For that purpose, the project was divided into two distinct tasks: the creation of a system, consisting of a tool that searches PubMed for scientific articles and saves them according to pre-defined labels, and another tool that classifies a set of articles; and the analysis of the results obtained by the created system when applied to two different practical cases. The system was evaluated through a series of tests, using datasets whose classifications were previously known, allowing confirmation of the obtained results. Afterwards, the system was tested on two independently created datasets, manually curated by researchers working in the field of study. This last form of evaluation achieved, for example, precision scores ranging from 68% to 81%. The results emphasize the use of text mining tools, along with controlled vocabularies such as MeSH, as a way to create more complex and comprehensive queries that improve the performance of classification problems such as those addressed in this work.
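    A label-driven classifier of the kind the thesis describes can be sketched in a few lines as a nearest-centroid model over bags of words. This is a toy stand-in, not the thesis's actual MeSH-trained system; the corpus, labels, and query are invented for illustration.

```python
import re
from collections import Counter

def tokens(text):
    """Lowercased word tokens of a document."""
    return re.findall(r"[a-z]+", text.lower())

def train(docs):
    """Build one bag-of-words centroid per label from (text, label) pairs."""
    centroids = {}
    for text, label in docs:
        centroids.setdefault(label, Counter()).update(tokens(text))
    return centroids

def classify(text, centroids):
    """Assign the label whose centroid shares the most tokens with the text."""
    t = Counter(tokens(text))
    return max(centroids, key=lambda lab: sum((t & centroids[lab]).values()))

# Toy corpus standing in for PubMed abstracts, labelled by whether a
# target MeSH descriptor was assigned to the article.
docs = [
    ("Text mining of biomedical literature for relation extraction.", "text_mining"),
    ("Named entity recognition for gene database curation.", "text_mining"),
    ("Randomized trial of a beta blocker in hypertension.", "clinical"),
    ("Cohort study of salt intake and blood pressure.", "clinical"),
]
model = train(docs)
print(classify("Mining abstracts for protein relation extraction.", model))  # text_mining
```

    In the thesis's setting, the labels would come from MeSH descriptors attached to PubMed records rather than being hand-assigned, which is what makes the training pipeline automatic.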

    An Investigation of the Public Health Informatics Research and Practice in the Past Fifteen Years from 2000 to 2014: A Scoping Review in MEDLINE

    Objective: To examine the extent and nature of existing Public Health Informatics (PHI) studies on MEDLINE over the past 15 years. Methods: This thesis adopted the scoping review methodology recommended by Arksey and O’Malley in 2005, proceeding through its five main stages: Stage I - identifying the research question; Stage II - identifying relevant studies; Stage III - study selection; Stage IV - charting the data; and Stage V - collating, summarizing, and reporting the results. Each methodological stage was carried out in collaboration with the academic supervisor, and final results and conclusions were set forth. Results: The study captured a total of 486 PHI-focused articles in MEDLINE. A majority came from the USA, followed by the UK, Australia and Canada; only about one fifth of the articles were from the rest of the world. About 60% of the articles concerned infectious disease monitoring, outbreak detection, and bio-terrorism surveillance; about 10% concerned chronic disease monitoring; and public health policy, systems and research accounted for 40% of the total articles. The most frequently used information technologies were electronic registries, websites, and GIS; in contrast, mass media and mobile phones were among the least used. Conclusion: Despite extensive research and discussion over the past 15 years (starting from 2000), PHI systems require further improvement in the application of modern public health technologies - such as wireless devices, wearable devices, remote sensors, and remote/cloud computing - across various public health domains, which were scarcely discussed or used in the available literature.

    A comparative analysis of good enterprise data management practices: insights from literature and artificial intelligence perspectives for business efficiency and effectiveness

    Abstract. This thesis presents a comparative analysis of enterprise data management practices based on literature and artificial intelligence (AI) perspectives, focusing on their impact on data quality, business efficiency, and effectiveness. It employs a systematic research methodology comprising a literature review, an AI-based examination of current practices using ChatGPT, and a comparative analysis of the findings. The study highlights the importance of robust data governance, high data quality, data integration, and security, alongside the transformative potential of AI. Its limitations revolve around the primarily qualitative nature of the study and potential restrictions on the generalizability of the findings. Nevertheless, the thesis offers valuable insights and recommendations for enterprises seeking to optimize their data management strategies, underscoring AI's potential to enhance traditional practices. The research contributes to the scientific discourse in information systems, data science, and business management.