29 research outputs found

    Big Data as a Driver for Clinical Decision Support Systems: A Learning Health Systems Perspective

    Big data technologies now provide health care with powerful instruments to gather and analyze large volumes of heterogeneous data collected for different purposes, including clinical care, administration, and research. This makes it possible to design IT infrastructures that favor the implementation of the so-called "Learning Healthcare System Cycle," where healthcare practice and research are part of a unique and synergic process. In this paper we highlight how "Big Data enabled" integrated data collections may support clinical decision-making together with biomedical research. Two effective implementations are reported, concerning decision support in Diabetes and in Inherited Arrhythmogenic Diseases

    Multiple clinical forms of dehydrated hereditary stomatocytosis arise from mutations in PIEZO1

    Autosomal dominant dehydrated hereditary stomatocytosis (DHSt) usually presents as a compensated hemolytic anemia with macrocytosis and abnormally shaped red blood cells (RBCs). DHSt is part of a pleiotropic syndrome that may also exhibit pseudohyperkalemia and perinatal edema. We identified PIEZO1 as the disease gene for pleiotropic DHSt in a large kindred by exome sequencing analysis within the previously mapped 16q23-q24 interval. In 26 affected individuals among 7 multigenerational DHSt families with the pleiotropic syndrome, 11 heterozygous PIEZO1 missense mutations cosegregated with disease. PIEZO1 is expressed in the plasma membranes of RBCs, and its messenger RNA and protein levels increase during in vitro erythroid differentiation of CD34+ cells. PIEZO1 is also expressed in liver and bone marrow during human and mouse development. We suggest for the first time a correlation between a PIEZO1 mutation and perinatal edema. DHSt patient red cells with the R2456H mutation exhibit increased ion-channel activity. Functional studies of PIEZO1 mutant R2488Q expressed in Xenopus oocytes demonstrated changes in ion-channel activity consistent with the altered cation content of DHSt patient red cells. Our findings provide direct evidence that R2456H and R2488Q mutations in PIEZO1 alter mechanosensitive channel regulation, leading to increased cation transport in erythroid cells

    PaPI: Pseudo amino acid composition to score human protein-coding variants

    Background: High-throughput sequencing technologies are able to identify the whole genomic variation of an individual. Gene-targeted and whole-exome experiments are mainly focused on coding sequence variants affecting single or multiple nucleotides. The analysis of the biological significance of this multitude of genomic variants is challenging and computationally demanding. Results: We present PaPI, a new machine-learning approach to classify and score human coding variants by estimating the probability that they damage protein function. The novelty of this approach consists in using pseudo amino acid composition, through which wild-type and mutated protein sequences are represented in a discrete model. A machine-learning classifier has been trained on a set of known deleterious and benign coding variants with the aim of scoring unobserved variants by taking into account hidden sequence patterns in the human genome potentially leading to disease. We show that the combination of amphiphilic pseudo amino acid composition, evolutionary conservation and homologous-protein-based methods outperforms several prediction algorithms and is also able to score complex variants such as deletions, insertions and indels. Conclusions: This paper describes a machine-learning approach to predict the deleteriousness of human coding variants. A freely available web application (http://papi.unipv.it) has been developed with the presented method, able to score up to thousands of variants in a single run

    Kimimila: A new model to classify ngs short reads by their allele origin

    Next generation sequencing (NGS) technologies, often referred to as massively parallel sequencing, are having a huge impact on genomics and clinical applications. These technologies generate billions of short sequences (reads) that are subsequently mapped to their corresponding reference genome to identify known and/or novel genomic variants potentially correlated with the patient's phenotype. The DNA fragment library is usually derived from a diploid genome: we refer to genotyping on NGS data as the analytical process of assigning the zygosity of identified variants. Current algorithms typically rely on data from the single genomic locus where variants have been called and assume independence between variant locus and reads. These strong assumptions may lead to inaccuracies throughout the genotyping process. We have therefore developed an efficient assumption-free algorithm based on a kinetic model approach and distance geometry (Kimimila) that assigns each read to its allele of origin using the inference provided by the measure of differences (i.e. variants) among overlapping reads

    A kinetic model-based algorithm to classify NGS short reads by their allele origin

    Genotyping Next Generation Sequencing (NGS) data of a diploid genome aims to assign the zygosity of identified variants through comparison with a reference genome. Current methods typically employ probabilistic models that rely on the pileup of bases at each locus and on a priori knowledge. We present a new algorithm, called Kimimila (KInetic Modeling based on InforMation theory to Infer Labels of Alleles), which is able to assign reads to alleles by using a distance geometry approach and to infer the variant genotypes accurately, without any kind of assumption. The performance of the model has been assessed on simulated data and on real data from the 1000 Genomes Project, and the results have been compared with several commonly used genotyping methods, i.e., GATK, Samtools, VarScan, FreeBayes and Atlas2. Although our algorithm does not make use of a priori knowledge, the percentage of correctly genotyped variants is comparable to that of these algorithms. Furthermore, our method allows the user to split the read pool according to the inferred allele origin

    A genomic data fusion framework to exploit rare and common variants for association discovery

    Collapsing methods are used in association studies to exploit the effect of rare genetic variants in diseases. In this work we model an enriched collapsing approach by including gene, protein domain, pathway and protein-protein interaction data. We applied the collapsing technique to a data set of epileptic (85 cases) and healthy (61 controls) subjects. The method retrieved 4 genes, 5 domains, 33 gene interactions and 14 pathways showing a significant association with the disease. Collapsed data have also been used as features for prediction models. We found that the use of protein-protein interactions as model features increases the area under the ROC curve (+1.5%) compared with the purely gene-based approach

    BigQ: A NoSQL based framework to handle genomic variants in i2b2

    Background: Precision medicine requires the tight integration of clinical and molecular data. To this end, it is mandatory to define proper technological solutions able to manage the overwhelming amount of high-throughput genomic data needed to test associations between genomic signatures and human phenotypes. The i2b2 Center (Informatics for Integrating Biology and the Bedside) has developed an internationally adopted framework to use existing clinical data for discovery research that can help the definition of precision medicine interventions when coupled with genetic data. i2b2 can be significantly advanced by designing efficient solutions for managing Next Generation Sequencing data. Results: We developed BigQ, an extension of the i2b2 framework, which integrates patient clinical phenotypes with genomic variant profiles generated by Next Generation Sequencing. A visual programming i2b2 plugin allows retrieving variants belonging to the patients in a cohort by applying filters on genomic variant annotations. We report an evaluation of the query performance of our system on more than 11 million variants, showing that the implemented solution scales linearly in terms of query time and disk space with the number of variants. Conclusions: In this paper we describe a new i2b2 web service composed of an efficient and scalable document-based database that manages annotations of genomic variants and of a visual programming plug-in designed to dynamically perform queries on clinical and genetic data. The system therefore allows managing the fast-growing volume of genomic variants and can be used to integrate heterogeneous genomic annotations

    A Data Fusion Approach to Enhance Association Study in Epilepsy.

    Among the scientific challenges posed by complex diseases with a strong genetic component, two stand out. One is unveiling the role of rare and common genetic variants; the other is the design of classification models to improve clinical diagnosis and predictive models for prognosis and personalized therapies. In this paper, we present a data fusion framework merging gene, domain, pathway and protein-protein interaction data related to a next generation sequencing epilepsy gene panel. Our method allows integrating association information from multiple genomic sources and aims at highlighting the set of common and rare variants that are capable of triggering the occurrence of a complex disease. When compared to other approaches, our method shows better performance in classifying patients affected by epilepsy

    Phenotypic Variation in Two Siblings Affected with Shwachman-Diamond Syndrome: The Use of Expert Variant Interpreter (eVai) Suggests Clinical Relevance of a Variant in the KMT2A Gene

    Introduction. Shwachman-Diamond Syndrome (SDS) is an autosomal-recessive disorder characterized by neutropenia, pancreatic exocrine insufficiency, skeletal dysplasia, and an increased risk for leukemic transformation. Biallelic mutations in the SBDS gene have been found in about 90% of patients. The clinical spectrum of SDS in patients is wide, and variability has been noticed between different patients, siblings, and even within the same patient over time. Herein, we present two SDS siblings (UPN42 and UPN43) carrying the same SBDS mutations and showing relevant differences in their phenotypic presentation. Study aim. We attempted to understand whether other germline variants, in addition to SBDS, could explain some of the clinical variability noticed between the siblings. Methods. Whole-exome sequencing (WES) was performed. Human Phenotype Ontology (HPO) terms were defined for each patient, and the WES data were analyzed using the eVai and DIVAs platforms. Results. In UPN43, we found and confirmed, using Sanger sequencing, a novel de novo variant (c.10663G > A, p.Gly3555Ser) in the KMT2A gene that is associated with autosomal-dominant Wiedemann–Steiner Syndrome. The variant is classified as pathogenic according to different in silico prediction tools. Interestingly, it was found to be related to some of the HPO terms that describe UPN43. Conclusions. We postulate that the KMT2A variant found in UPN43 has a concomitant and co-occurring clinical effect, in addition to the SBDS mutations. This dual molecular effect, supported by in silico prediction, could help to explain some of the clinical variation found between the siblings. In the future, these new data are likely to be useful for personalized medicine and therapy for selected cases

    CardioVAI: An automatic implementation of ACMG-AMP variant interpretation guidelines in the diagnosis of cardiovascular diseases

    Variant interpretation for the diagnosis of genetic diseases is a complex process. The American College of Medical Genetics and Genomics, together with the Association for Molecular Pathology, has proposed a set of evidence-based guidelines to support variant pathogenicity assessment and reporting in Mendelian diseases. Cardiovascular disorders are a field of application of these guidelines, but practical implementation is challenging due to the genetic heterogeneity of the diseases and the complexity of the information sources that need to be integrated. Decision support systems able to automate variant interpretation in the light of specific disease domains are in demand. We implemented CardioVAI (Cardio Variant Interpreter), an automated system for guideline-based variant classification in cardiovascular-related genes. Different omics resources were integrated to assess the pathogenicity of every genomic variant in 72 cardiovascular disease-related genes. We validated our method on benchmark datasets of high-confidence assessed variants, reaching pathogenicity and benignity concordance up to 83 and 97.08%, respectively. We compared CardioVAI to similar methods and analyzed the main differences in terms of guideline implementation. Finally, we made CardioVAI available as a web resource (http://cardiovai.engenome.com/) that allows users to further specialize guideline recommendations