    Extracting causal relations on HIV drug resistance from literature

    Background: In HIV treatment it is critical to have up-to-date resistance data for applicable drugs, since HIV has a very high rate of mutation. These data are made available through scientific publications and must be extracted manually by experts in order to be used by virologists and medical doctors. There is therefore an urgent need for a tool that partially automates this process and is able to retrieve relations between drugs and virus mutations from literature. Results: In this work we present a novel method to extract and combine relationships between HIV drugs and mutations in viral genomes. Our extraction method is based on natural language processing (NLP), which produces grammatical relations, and applies a set of rules to these relations. We applied our method to a relevant set of PubMed abstracts and obtained 2,434 extracted relations with an estimated F-score of 84%. We then combined the extracted relations using logistic regression to generate resistance values for each <drug, mutation> pair. The results of this relation combination show more than 85% agreement with the Stanford HIVDB for the ten most frequently occurring mutations. The system is used in 5 hospitals from the Virolab project (http://www.virolab.org) to preselect the most relevant novel resistance data from literature and present those to virologists and medical doctors for further evaluation. Conclusions: The proposed relation extraction and combination method performs well on extracting HIV drug resistance data. It can be used in large-scale relation extraction experiments. The developed methods can also be applied to extract other types of relations such as gene-protein, gene-disease, and disease-mutation.

    Integrated Bio-Entity Network: A System for Biological Knowledge Discovery

    A significant part of our biological knowledge is centered on relationships between biological entities (bio-entities) such as proteins, genes, small molecules, pathways, gene ontology (GO) terms and diseases. Accumulating at an increasing speed, the information on bio-entity relationships is archived in different forms in scattered places. Most of this information is buried in scientific literature as unstructured text. Organizing heterogeneous information in a structured form not only facilitates the study of biological systems using integrative approaches, but also allows new knowledge to be discovered in an automatic and systematic way. In this study, we performed a large-scale integration of bio-entity relationship information from both databases containing manually annotated, structured information and automatic information extraction from unstructured text in scientific literature. The relationship information we integrated in this study includes protein–protein interactions, protein/gene regulations, protein–small molecule interactions, protein–GO relationships, protein–pathway relationships, and pathway–disease relationships. The relationship information is organized in a graph data structure, named integrated bio-entity network (IBN), where the vertices are the bio-entities and the edges represent their relationships. Under this framework, graph theoretic algorithms can be designed to perform various knowledge discovery tasks. We designed breadth-first search with pruning (BFSP) and most probable path (MPP) algorithms to automatically generate hypotheses, that is, the indirect relationships with high probabilities in the network. We show that IBN can be used to generate plausible hypotheses, which not only help to better understand the complex interactions in biological systems, but also provide guidance for experimental designs.
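
The abstract above does not say how the most probable path (MPP) is computed, so the following is only a minimal sketch of one standard approach, not the authors' implementation. It assumes each edge carries a probability in (0, 1], that a path's probability is the product of its edge probabilities, and that edges are undirected; under those assumptions, maximizing the product is the same as running Dijkstra's algorithm on the weights -log(p). The entity names and probabilities in `EDGES` are hypothetical.

```python
import heapq
import math


def most_probable_path(edges, source, target):
    """Return (probability, path) for the highest-probability path.

    edges: iterable of (u, v, p) with 0 < p <= 1, treated as undirected
    (an assumption of this sketch). Returns (0.0, []) if no path exists.
    """
    graph = {}
    for u, v, p in edges:
        if not 0.0 < p <= 1.0:
            raise ValueError(f"edge probability must be in (0, 1], got {p}")
        weight = -math.log(p)  # product of probabilities -> sum of costs
        graph.setdefault(u, []).append((v, weight))
        graph.setdefault(v, []).append((u, weight))

    best = {source: 0.0}   # lowest known cost to reach each vertex
    prev = {}              # predecessor on the best known path
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > best.get(node, math.inf):
            continue  # stale heap entry
        if node == target:
            break
        for nxt, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost < best.get(nxt, math.inf):
                best[nxt] = new_cost
                prev[nxt] = node
                heapq.heappush(heap, (new_cost, nxt))

    if target not in best:
        return 0.0, []

    # Walk predecessors back from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return math.exp(-best[target]), path


# Hypothetical toy network: vertices are bio-entities, edges are relationships.
EDGES = [
    ("drugA", "proteinB", 0.9),
    ("proteinB", "pathwayC", 0.8),
    ("drugA", "pathwayC", 0.5),
    ("pathwayC", "diseaseD", 0.7),
]

probability, path = most_probable_path(EDGES, "drugA", "diseaseD")
print(path, round(probability, 3))
```

In this toy network the route through `proteinB` (0.9 × 0.8 × 0.7) beats the shorter route that uses the direct `drugA`–`pathwayC` edge (0.5 × 0.7), which illustrates why a hypothesis-generation step might rank indirect relationships by path probability rather than by path length.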

    Knowledge, attitude and perception of 4th and 5th year UKZN medical school students towards the use of HIV drug resistance interpretation algorithms.

    Thesis (M.Med.Sc.)-University of KwaZulu-Natal, Durban, 2013. HIV drug resistance (HIVDR) has emerged as a major clinical and public health challenge in many resource-poor countries, especially in Africa. HIVDR testing has become increasingly important and is of significant value in the management of HIV. The use of low-cost technologies and procedures in HIVDR testing is being recommended. HIVDR computer interpretation algorithms make use of artificial intelligence and other computer technologies to predict HIVDR, and are recommended for use in resource-poor countries. However, little is known about the knowledge, attitude and perception of HIVDR computer algorithms among doctors in developing countries, who are expected to use them. This study aimed to determine the knowledge, attitude and perception regarding computer interpretation algorithms of the 4th and 5th year medical students at the Nelson R. Mandela School of Medicine, University of KwaZulu-Natal, in South Africa. Primary data collection was done using a questionnaire administered to a convenience sample of 216 4th and 5th year medical students. The study revealed that 90% of the respondents were aware of HIV drug resistance testing in South Africa but only 4% had knowledge of the computer interpretation algorithms. The study also revealed that although the UKZN medical students are not aware of computer interpretation algorithms, the majority are willing to use them in the future.

    Enhancing automatic extraction of biomedical relations using different linguistic features extracted from text

    Unpublished thesis of the Universidad Complutense de Madrid, Facultad de Informática, Departamento de Ingeniería del Software e Inteligencia Artificial, defended on 08-06-2017. Extracting biomedical relations from texts is a relatively new but rapidly growing research field in natural language processing (NLP). Due to the increasing number of biomedical research publications and the key role of databases of biomedical relations in biological and medical research, extracting biomedical relations from scientific articles and text resources is of utmost importance. Drug-drug interactions (DDIs) are, in particular, a widespread concern in medicine, and thus extracting this kind of interaction automatically from texts is in high demand in BioNLP. A drug-drug interaction usually occurs when one drug alters the activity level of another drug. According to the reports prepared by the U.S. Food and Drug Administration (the FDA) and other acknowledged studies [1], over 2 million life-threatening DDIs occur in the United States every year. Many academic researchers and pharmaceutical companies have developed relational and structural databases where DDIs are recorded. Nevertheless, the most up-to-date and valuable information is still found only in unstructured research text documents, including scientific publications and technical reports. In this thesis, three complementary, linguistically driven feature sets are studied: negation, clause dependency, and neutral candidates. The ultimate aim of this research is to enhance the performance of the DDI extraction task by considering the combinations of the extracted features with well-established kernel methods...