10 research outputs found

    Importance of Model Features.

    (a) Histogram of CNV lengths (on a log scale) for harmful and benign CNVs within our dataset, showing that harmful CNVs are more likely to be longer and hence likely to affect more genes and gene functions. (b-d) Precision (b), recall (c) and F-measure (d) for predicting harmful versus benign CNVs relative to the number of closest neighbors considered within the gene interaction network. Both precision (b) and F-measure (d) improve as we expand the number of neighbors considered, but stabilize or slightly decline after 10 neighbors. We also see an improvement in precision and accuracy from utilizing the patient phenotypes uniform model, as we add the ranking as a source for weighting our features.

    Prioritizing Clinically Relevant Copy Number Variation from Genetic Interactions and Gene Function Data

    It is becoming increasingly necessary to develop computerized methods for identifying the few disease-causing variants among the hundreds discovered in each individual patient. This problem is especially relevant for Copy Number Variants (CNVs), which can be cheaply interrogated via low-cost hybridization arrays commonly used in clinical practice. We present a method to predict the disease relevance of CNVs that combines functional context and clinical phenotype to discover clinically harmful CNVs (and likely causative genes) in patients with a variety of phenotypes. We compare several feature and gene weighting systems for classifying both genes and CNVs. We applied the best-performing combination of methodologies and parameters to over 2,500 Agilent CGH 180k Microarray CNVs derived from 140 patients. Our method achieved an F-score of 91.59%, with 87.08% precision and 97.00% recall. Our methods are freely available at https://github.com/compbio-UofT/cnv-prioritization. Our dataset is included with the supplementary information.
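    The F-score quoted above can be related to the precision and recall with a short sketch. It assumes the balanced F1 definition (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function names are illustrative and are not taken from the authors' repository. Computed directly from the quoted precision and recall, F1 comes out at roughly 0.918, slightly above the reported 91.59%; one possible explanation, which is an assumption here, is that the reported figures are averages over several evaluation folds.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Precision and recall from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Harmonic mean of the precision and recall quoted in the abstract.
f1_from_abstract = f_measure(0.8708, 0.9700)
```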

    Precision, recall and F-measure for CNVs when combining the following three features: length, DGV and gene.

    Length is the CNV length. DGV is a measure of the CNV’s frequency in the Database of Genomic Variants. Gene is the feature derived from the previous machine learning step in this method.

    The overall structure of the two-layer classifier, with the output of the Gene Classifier being one of the inputs to the CNV classifier.

    The overall structure of the two-layer classifier, with the output of the Gene Classifier being one of the inputs to the CNV classifier.
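    The two-layer arrangement described in this caption can be sketched as follows. This is only an illustration under stated assumptions: the `CNV` record, the use of the maximum gene score as the per-CNV gene feature, the log-scaled length, and both toy classifiers are hypothetical stand-ins, not the authors' implementation.

```python
import math
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class CNV:
    length_bp: int          # CNV length in base pairs
    dgv_frequency: float    # frequency in DGV, encoded here as 0..1 (assumed encoding)
    genes: Sequence[str]    # genes overlapped by the CNV


def cnv_features(cnv: CNV, gene_classifier: Callable[[str], float]) -> list[float]:
    """Layer 1 feeds layer 2: the gene classifier's output becomes one CNV feature."""
    # Assumption: the per-CNV gene feature is the highest score among its genes.
    gene_score = max((gene_classifier(g) for g in cnv.genes), default=0.0)
    return [math.log10(cnv.length_bp), cnv.dgv_frequency, gene_score]


def classify_cnv(
    cnv: CNV,
    gene_classifier: Callable[[str], float],
    cnv_classifier: Callable[[list[float]], bool],
) -> bool:
    """Run the CNV classifier on the length, DGV and gene features."""
    return cnv_classifier(cnv_features(cnv, gene_classifier))


# Toy stand-ins for the two trained models (hypothetical scores and thresholds).
TOY_GENE_SCORES = {"GENE_A": 0.9, "GENE_B": 0.1}


def toy_gene_classifier(gene: str) -> float:
    return TOY_GENE_SCORES.get(gene, 0.0)


def toy_cnv_classifier(features: list[float]) -> bool:
    _log_length, dgv_frequency, gene_score = features
    return gene_score > 0.5 and dgv_frequency < 0.01
```

    Passing the classifiers in as callables keeps the sketch independent of any particular learning library; any model that returns a gene score or a harmful/benign label could be substituted.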

    Databases, ontologies and known associations used to identify CNV-phenotype correlations.

    Our approach integrates 3 types of information: 1) CNVs and their non-exhaustive frequency in healthy individuals, 2) genes and gene interactions, with their respective functions (each gene within a CNV is weighted by its likelihood of contributing to the phenotypes, via semantic similarity within the GO ontology), and 3) phenotypic descriptions and the relationships between them as specified by HPO, with their non-exhaustive associations to disease genes (via OMIM). For an individual's variants and known HPO phenotypes, genes affected by these variants are highlighted within the gene interaction network, while the phenotypes are emphasized in the phenotype ontology layer.
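    To make the idea of semantic similarity within an ontology concrete, here is a minimal sketch. The caption does not say which similarity measure is used, so Jaccard overlap of ancestor sets is swapped in as a simple stand-in; the toy ontology and its term names are made up and are not real GO identifiers.

```python
def ancestors(term: str, parents: dict[str, set[str]]) -> set[str]:
    """Return the term itself plus every term reachable through parent links."""
    seen: set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def jaccard_similarity(term_a: str, term_b: str, parents: dict[str, set[str]]) -> float:
    """Shared ancestors divided by all ancestors of either term (1.0 = identical)."""
    anc_a = ancestors(term_a, parents)
    anc_b = ancestors(term_b, parents)
    return len(anc_a & anc_b) / len(anc_a | anc_b)


# Toy ontology (child -> parents); term names are placeholders.
TOY_PARENTS = {
    "A": {"root"},
    "B": {"root"},
    "A1": {"A"},
    "A2": {"A"},
}
```

    In this toy ontology, sibling terms `A1` and `A2` share two of four ancestors, so they score higher than `A1` and `B`, which share only the root.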

    Using phenotypic similarity to improve rare disease identification in PhenomeCentral

    Presentation given by Orion Buske at Genome Informatics 2014 in Cambridge, UK. Covers the current performance of patient matching and gene prioritization algorithms in PhenomeCentral.

    Data-sharing in NeurOmics: enabling effective collaboration and working with RD-Connect

    Within the NeurOmics project, 1100 samples from across 10 rare neurodegenerative and neuromuscular diseases will undergo whole exome sequencing. In addition, patients will be deep-phenotyped, RNAseq will be carried out and biomarker studies are to be developed. Together, this will lead to improved understanding of the conditions, causative and modifier gene discovery, more diagnoses and the identification of potential therapeutic targets.
    For these ambitious aims to be realised, both omics and phenotypic data should be accessible for study by NeurOmics partners across the disease groups. To enable this, the NeurOmics project has built an online clinical database in which all phenotypic data is mapped to the Human Phenotype Ontology (HPO), and has established data sharing policies and procedures in close collaboration with RD-Connect. This means that partners are committed to collaborative working within the consortium now and to wider data-sharing via the RD-Connect platform and the European Genome-phenome Archive (EGA) in future, according to agreed timelines that ensure all NeurOmics data is ultimately accessible to researchers worldwide. The policies in place ensure that investigators’ rights to publish first and to intellectual property are protected whilst information sharing is facilitated. They also ensure that NeurOmics complies with the policies of the International Rare Diseases Research Consortium (IRDiRC), which mandates timely sharing of source data for the benefit of the rare disease research community. The NeurOmics policies also recognise the importance of patient consent: where this does not permit wide sharing of anonymised data, patients are to be reconsented before inclusion in the database.
    These policies have now been agreed and approved by the NeurOmics Steering Committee. This poster outlines how this has been achieved and describes the plans for future working with RD-Connect.

    PhenoTips: Patient Phenotyping Software for Clinical and Research Use.

    We have developed PhenoTips, a deep phenotyping tool and database, specifically designed for phenotyping patients with genetic disorders. Our tool closely mirrors clinician workflows so as to facilitate the recording of observations made during the patient encounter. Phenotypic information is represented using the Human Phenotype Ontology; however, the complexity of the ontology is hidden behind a user interface, which combines simple selection of common phenotypes with error-tolerant, predictive search of the entire ontology. The software provides a series of features that help reduce the clinician's workload during the clinical examination. Together with standardized phenotypic data, PhenoTips supports entering demographic information, medical history (including prior laboratory results), family history, various measurements, relevant images depicting manifestations of the patient's disorder, genetic tests and their results, as well as additional notes for each of these categories. A pedigree drawing tool that enables the collection of advanced family histories is currently under development. The software automatically plots growth curves for a variety of measurements, selects phenotypes reflecting abnormal measurements, instantly finds Online Mendelian Inheritance in Man (OMIM) diseases that most closely match the phenotypic description, and can suggest additional clinical investigations that can improve the diagnosis.
    PhenoTips is already used both in research studies and in the clinic, including the phenotyping of patients for the FORGE (Finding Of Rare disease GEnes) Canada project (http://care4rare.ca/) and the Undiagnosed Diseases Program at the NIH. Our source code and a demo version of PhenoTips are available at http://phenotips.org.
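    The error-tolerant part of the term search mentioned above can be illustrated with Python's standard `difflib` module. This is a sketch of the general technique only, not PhenoTips' actual search; the function name, the cutoff and the short list of term labels are illustrative assumptions.

```python
import difflib

# A handful of phenotype term labels for illustration only.
TERM_LABELS = ["Seizure", "Microcephaly", "Macrocephaly", "Hypotonia"]


def fuzzy_term_search(query: str, labels=TERM_LABELS, limit: int = 3) -> list[str]:
    """Return up to `limit` labels that approximately match a possibly misspelled query."""
    # Compare case-insensitively, then map hits back to the original labels.
    lowered = {label.lower(): label for label in labels}
    hits = difflib.get_close_matches(query.lower(), list(lowered), n=limit, cutoff=0.6)
    return [lowered[hit] for hit in hits]
```

    For example, `fuzzy_term_search("microcephly")` ranks "Microcephaly" first even though the query is missing a letter.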

    PhenomeCentral: An Integrated Portal for Sharing and Searching Patient Phenotype Data for Rare Genetic Disorders.

    The availability of low-cost genome sequencing has allowed for the identification of the molecular cause of hundreds of rare genetic disorders. Solved disorders, however, only represent the “tip of the iceberg”. Because the discovery of disease-causing variants typically requires confirmation of the mutation or gene in multiple unrelated individuals, an even larger number of genetic disorders remain unsolved due to difficulty identifying second families. With many groups now tackling these remaining undiagnosed disorders, which may be present in only a handful of individuals seen at different hospitals and sequenced by different centers, it is critical to establish effective and secure data-sharing techniques that allow clinicians and scientists to identify additional families via phenotype and genotype searches.
    To address this need, we have developed PhenomeCentral (http://phenomecentral.org), a repository for secure data sharing targeted to the rare disorder community. Each patient record within PhenomeCentral consists of a thorough phenotypic description capturing observed abnormalities as well as relevant absent manifestations, expressed using Human Phenotype Ontology terms. Furthermore, each record can be labeled by the creator as private (hidden from everyone except the contributor), public (viewable and searchable by all registered users) or matchable (the record cannot be directly viewed or searched, but is reachable via an automated phenotype matching system, following Cafe Variome principles, which informs contributors of the existence of profiles similar to their cases).
    PhenomeCentral currently incorporates phenotype data for hundreds of patients with rare genetic disorders without a molecular diagnosis, including ongoing submissions from the Canadian CARE for RARE project and the NIH Undiagnosed Diseases Program (UDP). Clinical geneticists and scientists studying rare disorders can request accounts, and new patients can be added either using the PhenoTips User Interface, built into PhenomeCentral, or uploaded in bulk.
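    The three record labels described above map naturally onto a small data structure. The sketch below is an assumption-laden illustration, not PhenomeCentral's code: the class and function names are hypothetical, and treating public records as also eligible for automated matching is an assumption the abstract does not confirm.

```python
from dataclasses import dataclass
from enum import Enum


class Visibility(Enum):
    PRIVATE = "private"      # hidden from everyone except the contributor
    PUBLIC = "public"        # viewable and searchable by all registered users
    MATCHABLE = "matchable"  # not viewable or searchable, but reachable by matching


@dataclass(frozen=True)
class PatientRecord:
    record_id: str
    contributor: str
    visibility: Visibility


def can_view(record: PatientRecord, user: str) -> bool:
    """Contributors always see their own records; others see only public ones."""
    return record.contributor == user or record.visibility is Visibility.PUBLIC


def eligible_for_matching(record: PatientRecord) -> bool:
    """Matchable records take part in automated matching.

    Assumption: public records are included as well; private records never are.
    """
    return record.visibility in (Visibility.PUBLIC, Visibility.MATCHABLE)
```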

    Standardized analysis and sharing of genome-phenome data for neuromuscular and rare disease research through the RD-Connect platform

    RD-Connect (rd-connect.eu) is an EU-funded project building an integrated platform to narrow the gaps in rare disease research, where patient populations, clinical expertise and research communities are small in number and highly fragmented. Guided by the needs of rare disease researchers and with neuromuscular and neurodegenerative researchers as its original collaborators, the RD-Connect platform securely integrates multiple types of omics data (genomics, proteomics and transcriptomics) with biosample and clinical information at the level of an individual patient, a family or a whole cohort, providing not only a centralized data repository but also a sophisticated and user-friendly online analysis system. Whole-genome, exome or gene panel NGS datasets from individuals with rare diseases and family members are deposited at the European Genome-phenome Archive, a longstanding archiving system designed for long-term storage of these large datasets. The raw data is then processed by RD-Connect's standardised analysis and annotation pipeline to make data from different sequencing providers more comparable. The corresponding clinical information from each individual is recorded in a connected PhenoTips instance, a software solution that simplifies the capture of clinical data using the Human Phenotype Ontology, OMIM and Orpha codes. The results are made available to authorised users through the highly configurable online platform (platform.rd-connect.eu), which runs on a Hadoop cluster and uses ElasticSearch, technologies designed to handle big data at high speeds. The user-friendly interface enables filtering and prioritization of variants using the most common quality, genomic location, effect, pathogenicity and population frequency annotations, enabling users from clinical labs without extensive bioinformatics support to do their primary genomic analysis of their own patients online and compare them with other submitted cohorts.
    Additional tools, such as Exomiser, DiseaseCard, Alamut Functional Annotation (ALFA) and UMD Predictor (umd-predictor.eu), are integrated at several levels. The RD-Connect platform is designed to enable data sharing at various levels depending on user permissions. At the most basic level (“does this specific variant exist in any individual in this cohort?”) it has lit a Beacon within the Global Alliance for Genomics and Health’s Beacon Network (www.beacon-network.org). At the next stage of sharing (finding similarities between patients in different databases with a matching phenotype and a candidate variant in the same gene) it is actively involved in the development of Matchmaker Exchange (www.matchmakerexchange.org), allowing users of different systems to securely exchange information to find confirmatory cases. Finally, since all patients within the system have been consented for data sharing, users of the system, after validation and authorization, are able to access datasets from other centres, providing an instant means of gathering cohorts for cross-validation and further study.
    Although open to any rare disease, the platform is currently enriched for neuromuscular and neurodegenerative phenotypes and includes almost 1000 genomic datasets from the NeurOmics project (www.rd-neuromics.eu), with several other contributions in the pipeline, including 1000 limb-girdle muscular dystrophy index cases from the Myo-Seq project (www.myo-seq.org) and more. The platform is free of charge to use and is open for contributions of NGS and phenotypic data from research labs worldwide via [email protected].
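    The most basic sharing level quoted above, a yes/no answer to whether a variant exists in a cohort, can be sketched in a few lines. This is an in-memory illustration of the idea only; it does not use or reproduce the Beacon Network API, and the `Variant` fields, function name and example coordinates are hypothetical.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chromosome: str
    position: int
    reference: str
    alternate: str


def beacon_response(query: Variant, cohort: dict[str, set[Variant]]) -> bool:
    """Answer only yes or no: is the variant present in any individual of the cohort?

    No individual identifiers or genotypes are returned, which is the point of
    this lowest sharing level.
    """
    return any(query in variants for variants in cohort.values())


# Made-up cohort: individual ID -> set of observed variants.
TOY_COHORT = {
    "individual_1": {Variant("1", 12345, "A", "G")},
    "individual_2": {Variant("2", 67890, "C", "T")},
}
```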