58 research outputs found

    Methods of a national colorectal cancer cohort study: the PIPER Project

    Get PDF
    A national study looking at bowel cancer in New Zealand has previously been completed (the PIPER Project). The study included 5,610 patients and collected medical information about how each person was found to have bowel cancer and the treatment they received. This paper reports how the study was carried out. The information collected in the study will be used to look at the quality of care being provided to New Zealand patients with bowel cancer, and to find out if differences in care occur based on where people live, their ethnicity and their socioeconomic status

    The Human Phenotype Ontology project:linking molecular biology and disease through phenotype data

    Get PDF
    The Human Phenotype Ontology (HPO) project, available at http://www.human-phenotype-ontology.org, provides a structured, comprehensive and well-defined set of 10,088 classes (terms) describing human phenotypic abnormalities and 13,326 subclass relations between the HPO classes. In addition we have developed logical definitions for 46% of all HPO classes using terms from ontologies for anatomy, cell types, function, embryology, pathology and other domains. This allows interoperability with several resources, especially those containing phenotype information on model organisms such as mouse and zebrafish. Here we describe the updated HPO database, which provides annotations of 7,278 human hereditary syndromes listed in OMIM, Orphanet and DECIPHER to classes of the HPO. Various meta-attributes such as frequency, references and negations are associated with each annotation. Several large-scale projects worldwide utilize the HPO for describing phenotype information in their datasets. We have therefore generated equivalence mappings to other phenotype vocabularies such as LDDB, Orphanet, MedDRA, UMLS and phenoDB, allowing integration of existing datasets and interoperability with multiple biomedical resources. We have created various ways to access the HPO database content using flat files, a MySQL database, and Web-based tools. All data and documentation on the HPO project can be found online

    Discovery of four recessive developmental disorders using probabilistic genotype and phenotype matching among 4,125 families.

    Get PDF
    Discovery of most autosomal recessive disease-associated genes has involved analysis of large, often consanguineous multiplex families or small cohorts of unrelated individuals with a well-defined clinical condition. Discovery of new dominant causes of rare, genetically heterogeneous developmental disorders has been revolutionized by exome analysis of large cohorts of phenotypically diverse parent-offspring trios. Here we analyzed 4,125 families with diverse, rare and genetically heterogeneous developmental disorders and identified four new autosomal recessive disorders. These four disorders were identified by integrating Mendelian filtering (selecting probands with rare, biallelic and putatively damaging variants in the same gene) with statistical assessments of (i) the likelihood of sampling the observed genotypes from the general population and (ii) the phenotypic similarity of patients with recessive variants in the same candidate gene. This new paradigm promises to catalyze the discovery of novel recessive disorders, especially those with less consistent or nonspecific clinical presentations and those caused predominantly by compound heterozygous genotypes

    Finding Diagnostically Useful Patterns in Quantitative Phenotypic Data.

    Get PDF
    Trio-based whole-exome sequence (WES) data have established confident genetic diagnoses in ∼40% of previously undiagnosed individuals recruited to the Deciphering Developmental Disorders (DDD) study. Here we aim to use the breadth of phenotypic information recorded in DDD to augment diagnosis and disease variant discovery in probands. Median Euclidean distances (mEuD) were employed as a simple measure of similarity of quantitative phenotypic data within sets of ≥10 individuals with plausibly causative de novo mutations (DNM) in 28 different developmental disorder genes. 13/28 (46.4%) showed significant similarity for growth or developmental milestone metrics, 10/28 (35.7%) showed similarity in HPO term usage, and 12/28 (43%) showed no phenotypic similarity. Pairwise comparisons of individuals with high-impact inherited variants to the 32 individuals with causative DNM in ANKRD11 using only growth z-scores highlighted 5 likely causative inherited variants and two unrecognized DNM resulting in an 18% diagnostic uplift for this gene. Using an independent approach, naive Bayes classification of growth and developmental data produced reasonably discriminative models for the 24 DNM genes with sufficiently complete data. An unsupervised naive Bayes classification of 6,993 probands with WES data and sufficient phenotypic information defined 23 in silico syndromes (ISSs) and was used to test a "phenotype first" approach to the discovery of causative genotypes using WES variants strictly filtered on allele frequency, mutation consequence, and evidence of constraint in humans. This highlighted heterozygous de novo nonsynonymous variants in SPTBN2 as causative in three DDD probands

    GA4GH: International policies and standards for data sharing across genomic research and healthcare.

    Get PDF
    The Global Alliance for Genomics and Health (GA4GH) aims to accelerate biomedical advances by enabling the responsible sharing of clinical and genomic data through both harmonized data aggregation and federated approaches. The decreasing cost of genomic sequencing (along with other genome-wide molecular assays) and increasing evidence of its clinical utility will soon drive the generation of sequence data from tens of millions of humans, with increasing levels of diversity. In this perspective, we present the GA4GH strategies for addressing the major challenges of this data revolution. We describe the GA4GH organization, which is fueled by the development efforts of eight Work Streams and informed by the needs of 24 Driver Projects and other key stakeholders. We present the GA4GH suite of secure, interoperable technical standards and policy frameworks and review the current status of standards, their relevance to key domains of research and clinical care, and future plans of GA4GH. Broad international participation in building, adopting, and deploying GA4GH standards and frameworks will catalyze an unprecedented effort in data sharing that will be critical to advancing genomic medicine and ensuring that all populations can access its benefits

    The Human Phenotype Ontology in 2024: phenotypes around the world.

    Get PDF
    The Human Phenotype Ontology (HPO) is a widely used resource that comprehensively organizes and defines the phenotypic features of human disease, enabling computational inference and supporting genomic and phenotypic analyses through semantic similarity and machine learning algorithms. The HPO has widespread applications in clinical diagnostics and translational research, including genomic diagnostics, gene-disease discovery, and cohort analytics. In recent years, groups around the world have developed translations of the HPO from English to other languages, and the HPO browser has been internationalized, allowing users to view HPO term labels and in many cases synonyms and definitions in ten languages in addition to English. Since our last report, a total of 2239 new HPO terms and 49235 new HPO annotations were developed, many in collaboration with external groups in the fields of psychiatry, arthrogryposis, immunology and cardiology. The Medical Action Ontology (MAxO) is a new effort to model treatments and other measures taken for clinical management. Finally, the HPO consortium is contributing to efforts to integrate the HPO and the GA4GH Phenopacket Schema into electronic health records (EHRs) with the goal of more standardized and computable integration of rare disease data in EHRs

    Phenotypic Characterization of EIF2AK4 Mutation Carriers in a Large Cohort of Patients Diagnosed Clinically With Pulmonary Arterial Hypertension.

    Get PDF
    BACKGROUND: Pulmonary arterial hypertension (PAH) is a rare disease with an emerging genetic basis. Heterozygous mutations in the gene encoding the bone morphogenetic protein receptor type 2 (BMPR2) are the commonest genetic cause of PAH, whereas biallelic mutations in the eukaryotic translation initiation factor 2 alpha kinase 4 gene (EIF2AK4) are described in pulmonary veno-occlusive disease/pulmonary capillary hemangiomatosis. Here, we determine the frequency of these mutations and define the genotype-phenotype characteristics in a large cohort of patients diagnosed clinically with PAH. METHODS: Whole-genome sequencing was performed on DNA from patients with idiopathic and heritable PAH and with pulmonary veno-occlusive disease/pulmonary capillary hemangiomatosis recruited to the National Institute of Health Research BioResource-Rare Diseases study. Heterozygous variants in BMPR2 and biallelic EIF2AK4 variants with a minor allele frequency of <1:10 000 in control data sets and predicted to be deleterious (by combined annotation-dependent depletion, PolyPhen-2, and sorting intolerant from tolerant predictions) were identified as potentially causal. Phenotype data from the time of diagnosis were also captured. RESULTS: Eight hundred sixty-four patients with idiopathic or heritable PAH and 16 with pulmonary veno-occlusive disease/pulmonary capillary hemangiomatosis were recruited. Mutations in BMPR2 were identified in 130 patients (14.8%). Biallelic mutations in EIF2AK4 were identified in 5 patients with a clinical diagnosis of pulmonary veno-occlusive disease/pulmonary capillary hemangiomatosis. Furthermore, 9 patients with a clinical diagnosis of PAH carried biallelic EIF2AK4 mutations. These patients had a reduced transfer coefficient for carbon monoxide (Kco; 33% [interquartile range, 30%-35%] predicted) and younger age at diagnosis (29 years; interquartile range, 23-38 years) and more interlobular septal thickening and mediastinal lymphadenopathy on computed tomography of the chest compared with patients with PAH without EIF2AK4 mutations. However, radiological assessment alone could not accurately identify biallelic EIF2AK4 mutation carriers. Patients with PAH with biallelic EIF2AK4 mutations had a shorter survival. CONCLUSIONS: Biallelic EIF2AK4 mutations are found in patients classified clinically as having idiopathic and heritable PAH. These patients cannot be identified reliably by computed tomography, but a low Kco and a young age at diagnosis suggests the underlying molecular diagnosis. Genetic testing can identify these misclassified patients, allowing appropriate management and early referral for lung transplantation

    Telomerecat: A ploidy-agnostic method for estimating telomere length from whole genome sequencing data.

    Get PDF
    Telomere length is a risk factor in disease and the dynamics of telomere length are crucial to our understanding of cell replication and vitality. The proliferation of whole genome sequencing represents an unprecedented opportunity to glean new insights into telomere biology on a previously unimaginable scale. To this end, a number of approaches for estimating telomere length from whole-genome sequencing data have been proposed. Here we present Telomerecat, a novel approach to the estimation of telomere length. Previous methods have been dependent on the number of telomeres present in a cell being known, which may be problematic when analysing aneuploid cancer data and non-human samples. Telomerecat is designed to be agnostic to the number of telomeres present, making it suited for the purpose of estimating telomere length in cancer studies. Telomerecat also accounts for interstitial telomeric reads and presents a novel approach to dealing with sequencing errors. We show that Telomerecat performs well at telomere length estimation when compared to leading experimental and computational methods. Furthermore, we show that it detects expected patterns in longitudinal data, repeated measurements, and cross-species comparisons. We also apply the method to a cancer cell data, uncovering an interesting relationship with the underlying telomerase genotype
    corecore