    A human–AI collaboration workflow for archaeological sites detection

    This paper illustrates the results obtained by using pre-trained semantic segmentation deep learning models for the detection of archaeological sites within the Mesopotamian floodplains environment. The models were fine-tuned using openly available satellite imagery and vector shapes coming from a large corpus of annotations (i.e., surveyed sites). A randomized test showed that the best model reaches a detection accuracy in the neighborhood of 80%. Integrating domain expertise was crucial to define how to build the dataset and how to evaluate the predictions, since defining if a proposed mask counts as a prediction is very subjective. Furthermore, even an inaccurate prediction can be useful when put into context and interpreted by a trained archaeologist. Coming from these considerations we close the paper with a vision for a Human–AI collaboration workflow. Starting with an annotated dataset that is refined by the human expert we obtain a model whose predictions can either be combined to create a heatmap, to be overlaid on satellite and/or aerial imagery, or alternatively can be vectorized to make further analysis in a GIS software easier and automatic. In turn, the archaeologists can analyze the predictions, organize their onsite surveys, and refine the dataset with new, corrected, annotations

    Limitations and challenges in protein stability prediction upon genome variations: towards future applications in precision medicine

    Protein stability predictions are becoming essential in medicine to develop novel immunotherapeutic agents and for drug discovery. Despite the large number of computational approaches for predicting the protein stability upon mutation, there are still critical unsolved problems: 1) the limited number of thermodynamic measurements for proteins provided by current databases; 2) the large intrinsic variability of ΔΔG values due to different experimental conditions; 3) biases in the development of predictive methods caused by ignoring the anti-symmetry of ΔΔG values between mutant and native protein forms; 4) over-optimistic prediction performance, due to sequence similarity between proteins used in training and test datasets. Here, we review these issues, highlighting new challenges required to improve current tools and to achieve more reliable predictions. In addition, we provide a perspective of how these methods will be beneficial for designing novel precision medicine approaches for several genetic disorders caused by mutations, such as cancer and neurodegenerative diseases

    DDGun: An untrained method for the prediction of protein stability changes upon single and multiple point variations

    Background: Predicting the effect of single point variations on protein stability constitutes a crucial step toward understanding the relationship between protein structure and function. To this end, several methods have been developed to predict changes in the Gibbs free energy of unfolding (ΔΔG) between wild type and variant proteins, using sequence and structure information. Most of the available methods however do not exhibit the anti-symmetric prediction property, which guarantees that the predicted ΔΔG value for a variation is the exact opposite of that predicted for the reverse variation, i.e., ΔΔG(A → B) = -ΔΔG(B → A), where A and B are amino acids. Results: Here we introduce simple anti-symmetric features, based on evolutionary information, which are combined to define an untrained method, DDGun (DDG untrained). DDGun is a simple approach based on evolutionary information that predicts the ΔΔG for single and multiple variations from sequence and structure information (DDGun3D). Our method achieves remarkable performance without any training on the experimental datasets, reaching Pearson correlation coefficients between predicted and measured ΔΔG values of ~0.5 and ~0.4 for single and multiple site variations, respectively. Surprisingly, DDGun performances are comparable with those of state of the art methods. DDGun also naturally predicts multiple site variations, thereby defining a benchmark method for both single site and multiple site predictors. DDGun is anti-symmetric by construction, predicting the value of the ΔΔG of a reciprocal variation as almost equal (depending on the sequence profile) to -ΔΔG of the direct variation. This is a valuable property that is missing in the majority of the methods. Conclusions: Evolutionary information alone combined in an untrained method can achieve remarkably high performances in the prediction of ΔΔG upon protein mutation. Non-trained approaches like DDGun represent a valid benchmark both for scoring the predictive power of the individual features and for assessing the learning capability of supervised methods
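
    The anti-symmetry property described in this abstract can be illustrated with a minimal sketch. The score below is a hypothetical toy (a log-ratio of made-up profile frequencies), not DDGun's actual features; the names `PROFILE`, `toy_ddg` and `antisymmetry_gap` are assumptions introduced only for illustration. It shows why a score built as a difference of per-residue terms is anti-symmetric when the same profile is used in both directions.

```python
import math

# Hypothetical amino-acid frequencies at one position (illustrative values only).
PROFILE = {"A": 0.5, "V": 0.3, "G": 0.2}


def toy_ddg(wild: str, mutant: str, profile: dict = PROFILE) -> float:
    """Toy stability score: log-frequency of the mutant minus that of the wild type.

    This is an assumed stand-in for a profile-based feature, not DDGun itself.
    """
    return math.log(profile[mutant]) - math.log(profile[wild])


def antisymmetry_gap(a: str, b: str, profile: dict = PROFILE) -> float:
    """Return score(a -> b) + score(b -> a); zero means perfectly anti-symmetric."""
    return toy_ddg(a, b, profile) + toy_ddg(b, a, profile)


if __name__ == "__main__":
    print("A->V:", toy_ddg("A", "V"))
    print("V->A:", toy_ddg("V", "A"))
    print("gap :", antisymmetry_gap("A", "V"))
```

    In this toy the gap vanishes because both directions read the same profile. If the reverse variation were scored against a different profile, for example one built from the variant sequence, the two values would only be approximately opposite, which is consistent with the "almost equal (depending on the sequence profile)" wording in the abstract.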

    Genome-Wide Identification and Phenotypic Characterization of Seizure-Associated Copy Number Variations in 741,075 Individuals

    Copy number variants (CNV) are established risk factors for neurodevelopmental disorders with seizures or epilepsy. With the hypothesis that seizure disorders share genetic risk factors, we pooled CNV data from 10,590 individuals with seizure disorders, 16,109 individuals with clinically validated epilepsy, and 492,324 population controls and identified 25 genome-wide significant loci, 22 of which are novel for seizure disorders, such as deletions at 1p36.33, 1q44, 2p21-p16.3, 3q29, 8p23.3-p23.2, 9p24.3, 10q26.3, 15q11.2, 15q12-q13.1, 16p12.2, 17q21.31, duplications at 2q13, 9q34.3, 16p13.3, 17q12, 19p13.3, 20q13.33, and reciprocal CNVs at 16p11.2, and 22q11.21. Using genetic data from additional 248,751 individuals with 23 neuropsychiatric phenotypes, we explored the pleiotropy of these 25 loci. Finally, in a subset of individuals with epilepsy and detailed clinical data available, we performed phenome-wide association analyses between individual CNVs and clinical annotations categorized through the Human Phenotype Ontology (HPO). For six CNVs, we identified 19 significant associations with specific HPO terms and generated, for all CNVs, phenotype signatures across 17 clinical categories relevant for epileptologists. This is the most comprehensive investigation of CNVs in epilepsy and related seizure disorders, with potential implications for clinical practice

    A comparison of lysosomal enzymes expression levels in peripheral blood of mild- and severe-Alzheimer’s disease and MCI patients: implications for regenerative medicine approaches

    The association of lysosomal dysfunction and neurodegeneration has been documented in several neurodegenerative diseases, including Alzheimer’s Disease (AD). Herein, we investigate the association of lysosomal enzymes with AD at different stages of progression of the disease (mild and severe) or with mild cognitive impairment (MCI). We conducted a screening of two classes of lysosomal enzymes: glycohydrolases (β-Hexosaminidase, β-Galactosidase, β-Galactosylcerebrosidase, β-Glucuronidase) and proteases (Cathepsins S, D, B, L) in peripheral blood samples (blood plasma and PBMCs) from mild AD, severe AD, MCI and healthy control subjects. We confirmed the lysosomal dysfunction in severe AD patients and added new findings enhancing the association of abnormal levels of specific lysosomal enzymes with mild AD or severe AD, and highlighting the difference of AD from MCI. Herein, we showed for the first time the specific alteration of β-Galactosidase (Gal) and β-Galactosylcerebrosidase (GALC) in MCI patients. It is notable that in the above peripheral biological samples the lysosomes are more sensitive to AD cellular metabolic alteration when compared to levels of Aβ-peptide or Tau proteins, which were similar in both AD groups analyzed. Collectively, our findings support the role of lysosomal enzymes as potential peripheral molecules that vary with the progression of AD, and make them useful for monitoring regenerative medicine approaches for AD

    CNV-ClinViewer: Enhancing the clinical interpretation of large copy-number variants online

    Purpose: Large copy number variants (CNVs) can cause a heterogeneous spectrum of rare and severe disorders. However, most CNVs are benign and are part of natural variation in human genomes. CNV pathogenicity classification, genotype-phenotype analyses, and therapeutic target identification are challenging and time-consuming tasks that require the integration and analysis of information from multiple scattered sources by experts. Methods: We developed a web-application combining >250,000 patient and population CNVs together with a large set of biomedical annotations and provide tools for CNV classification based on ACMG/ClinGen guidelines and gene-set enrichment analyses. Results: Here, we introduce the CNV-ClinViewer (https://cnv-ClinViewer.broadinstitute.org), an open-source web-application for clinical evaluation and visual exploration of CNVs. The application enables real-time interactive exploration of large CNV datasets in a user-friendly interface. Conclusion: Overall, this resource facilitates semi-automated clinical CNV interpretation and genomic loci exploration and, in combination with clinical judgment, enables clinicians and researchers to formulate novel hypotheses and guide their decision-making process. In turn, the CNV-ClinViewer enhances patient care for clinical investigators and translational genomic research for basic scientists

    SLC6A1 Variant Pathogenicity, Molecular Function and Phenotype: A Genetic and Clinical Analysis

    Genetic variants in the SLC6A1 gene can cause a broad phenotypic disease spectrum by altering the protein function. Thus, systematically curated clinically relevant genotype-phenotype associations are needed to understand the disease mechanism and improve therapeutic decision-making. We aggregated genetic and clinical data from 172 individuals with likely pathogenic/pathogenic (lp/p) SLC6A1 variants and functional data for 184 variants (14.1% lp/p). Clinical and functional data were available for a subset of 126 individuals. We explored the potential associations of variant positions on the GAT1 3D structure with variant pathogenicity, altered molecular function and phenotype severity using bioinformatic approaches. The GAT1 transmembrane domains 1, 6 and extracellular loop 4 (EL4) were enriched for patient over population variants. Across functionally tested missense variants (n = 156), the spatial proximity from the ligand was associated with loss-of-function in the GAT1 transporter activity. For variants with complete loss of in vitro GABA uptake, we found a 4.6-fold enrichment in patients having severe disease versus non-severe disease (P = 2.9 × 10⁻³, 95% confidence interval: 1.5–15.3). In summary, we delineated associations between the 3D structure and variant pathogenicity, variant function and phenotype in SLC6A1-related disorders. This knowledge supports biology-informed variant interpretation and research on GAT1 function. All our data can be interactively explored in the SLC6A1 portal (https://slc6a1-portal.broadinstitute.org/)
