95 research outputs found

    Using an ontological representation of chemotherapy toxicities for guiding information extraction and integration from EHRs

    Introduction. Chemotherapies against cancers are often interrupted due to severe drug toxicities, reducing treatment opportunities. For this reason, the detection of toxicities and their severity from EHRs is important for many downstream applications. However, toxicity information is dispersed across various sources in the EHRs, making its extraction challenging. Methods. We introduce OntoTox, an ontology designed to represent chemotherapy toxicities, their attributes and provenance. We illustrated the value of OntoTox by integrating toxicities and grading information extracted from three heterogeneous sources: EHR questionnaires, semi-structured tables, and free text. Results. We instantiated 53,510, 2,366 and 54,420 toxicities from questionnaires, tables and free text respectively, and compared the complementarity and redundancy of the three sources. Discussion. This preliminary study illustrates the potential of OntoTox to guide the integration of multiple sources, and shows that the three sources overlap only moderately, stressing the need for a common representation.

    Leveraging Terminological Resources for Mapping between Rare

    L'équipe-projet HeKA

    This article describes the Inria, Inserm, Univ. de Paris project team HeKA. HeKA is a joint research project team of Inria, Inserm and Université de Paris. More precisely, HeKA is affiliated with the Centre de Recherche des Cordeliers and the Centre Inria de Paris. In addition to two Inria and Inserm researchers, HeKA comprises hospital-university researchers from AP-HP attached to departments of the Hôpital Européen Georges Pompidou, the Hôpital Necker and the Institut Imagine. The team's research themes are medical informatics, biostatistics and applied mathematics for clinical decision support. The name HeKA is both a reference to the Egyptian deity of medicine and an acronym for Health data- and model- driven Knowledge Acquisition. HeKA succeeds Team 22 (Information Sciences to support Personalized Medicine), led by Anita Burgun at the Centre de Recherche des Cordeliers (Inserm, Université de Paris). HeKA is headed by Sarah Zohar, assisted by Adrien Coulet.

    JCO Clin Cancer Inform

    PURPOSE: Many institutions throughout the world have launched precision medicine initiatives in oncology, and a large amount of clinical and genomic data is being produced. Although there have been attempts at data sharing with the community, initiatives are still limited. In this context, a French task force composed of Integrated Cancer Research Sites (SIRICs), comprehensive cancer centers from the Unicancer network (one of Europe's largest cancer research organizations), and university hospitals launched an initiative to improve and accelerate retrospective and prospective clinical and genomic data sharing in oncology. MATERIALS AND METHODS: For 5 years, the OSIRIS group has worked on structuring data and identifying technical solutions for collecting and sharing them. The group used a multidisciplinary approach that included weekly scientific and technical meetings over several months to foster a national consensus on a minimal data set. RESULTS: The resulting OSIRIS set and event-based data model, which is able to capture the disease course, was built with 67 clinical and 65 omics items. The group made it compatible with the HL7 Fast Healthcare Interoperability Resources (FHIR) format to maximize interoperability. The OSIRIS set was reviewed, approved by a National Plan Strategic Committee, and freely released to the community. A proof-of-concept study was carried out to put the OSIRIS set and Common Data Model into practice using a cohort of 300 patients. CONCLUSION: Using a national and bottom-up approach, the OSIRIS group has defined a model including a minimal set of clinical and genomic data that can be used to accelerate the sharing of data produced in oncology. The model relies on clear and formally defined terminologies and, as such, may also benefit the larger international community.

    Recherche d'associations séquentielles et alignement d'ontologies biologiques

    The main topic of this thesis is functional annotation, which consists in associating proteins with their biological functions. We explored two aspects of functional annotation. On the one hand, we tested the hypothesis that the order of domains in a protein could play a role in the protein's biological function. We introduced the new notion of a sequential nugget of knowledge, defined as an association of a sequence of items with a predetermined target. We designed and implemented SNK, an algorithm that finds such nuggets of knowledge. The SNK algorithm was adapted to fit specific needs expressed by our biologist collaborators, and was successfully used to study a protein family. On the other hand, we were interested in the biological ontologies and functional hierarchies used by experts to perform functional annotation. Many of these structured and controlled vocabularies exist, each expressing a particular point of view on annotation. Mapping biological ontologies appeared necessary to enable the study of the whole set of annotation data for comparative genomics purposes. We chose to develop a dedicated mapping method, O'Browser, that takes the specificities of biological ontologies into account by (i) using a matcher based on homology relationships between the proteins annotated with the ontologies, and (ii) introducing the notion of adaptive weighting of matchers. This method has been used for the alignment of two functional hierarchies.

    Contributions from the 2019 Literature on Bioinformatics and Translational Informatics

    Objectives: Summarize recent research and select the best papers published in 2019 in the field of Bioinformatics and Translational Informatics (BTI) for the corresponding section of the International Medical Informatics Association Yearbook. Methods: A literature review was performed to retrieve from PubMed papers indexed with keywords and free terms related to BTI. Independent review allowed the section editors to select a list of 15 candidate best papers, which were subsequently peer-reviewed. A final consensus meeting gathering the whole Yearbook editorial committee was organized to decide on the selection of the best papers. Results: Among the 931 retrieved papers covering the various subareas of BTI, the review process selected four best papers. The first paper presents a logical modeling of cancer pathways. Using their tools, the authors are able to identify two known behaviours of tumors. The second paper describes a deep-learning approach to predicting resistance to antibiotics in Mycobacterium tuberculosis. The authors of the third paper introduce a Genomic Global Positioning System (GPS) enabling comparison of genomic data with other individuals or genomics databases while preserving privacy. The fourth paper presents a multi-omics and temporal sequence-based approach to provide a better understanding of the sequence of events leading to Alzheimer’s Disease. Conclusions: Thanks to the normalization of open data and open science practices, research in BTI continues to develop and mature. Noteworthy achievements are sophisticated applications of leading-edge machine-learning methods dedicated to personalized medicine.
