13 research outputs found

    PrePPI: a structure-informed database of protein–protein interactions

    PrePPI (http://bhapp.c2b2.columbia.edu/PrePPI) is a database that combines predicted and experimentally determined protein–protein interactions (PPIs) using a Bayesian framework. Predicted interactions are assigned probabilities of being correct, which are derived from calculated likelihood ratios (LRs) by combining structural, functional, evolutionary and expression information, with the most important contribution coming from structure. Experimentally determined interactions are compiled from a set of public databases that manually collect PPIs from the literature and are also assigned LRs. A final probability is then assigned to every interaction by combining the LRs for both predicted and experimentally determined interactions. The current version of PrePPI contains ∼2 million PPIs that have a probability greater than ∼0.1, of which ∼60 000 PPIs for yeast and ∼370 000 PPIs for human are considered high confidence (probability greater than 0.5). The PrePPI database constitutes an integrated resource that enables users to examine aggregate information on PPIs, including both known and potentially novel interactions, and that provides structural models for many of the PPIs.
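The step from likelihood ratios to a final probability can be illustrated with a minimal sketch. It assumes the evidence sources are treated as independent (a naive Bayes combination), so their LRs multiply, and that a prior odds of interaction is supplied by the caller. The function name `combine_likelihood_ratios` and the prior value in the usage note are hypothetical and are not taken from PrePPI.

```python
from math import prod


def combine_likelihood_ratios(lrs, prior_odds):
    """Turn per-evidence likelihood ratios into a posterior probability.

    Assumes independent evidence sources, so the LRs multiply
    (naive Bayes). prior_odds is P(interact) / P(not interact);
    its value is an assumption supplied by the caller.
    """
    if prior_odds <= 0 or any(lr <= 0 for lr in lrs):
        raise ValueError("prior odds and likelihood ratios must be positive")
    posterior_odds = prior_odds * prod(lrs)
    # Convert odds back to a probability.
    return posterior_odds / (1.0 + posterior_odds)
```

For example, with an illustrative prior odds of 0.5, calling `combine_likelihood_ratios([2.0, 3.0], prior_odds=0.5)` gives posterior odds of 3 and a probability of 0.75, which would clear a 0.5 high-confidence threshold.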

    Using Structure to Explore the Sequence Alignment Space of Remote Homologs

    Protein structure modeling by homology requires an accurate sequence alignment between the query protein and its structural template. However, sequence alignment methods based on dynamic programming (DP) are typically unable to generate accurate alignments for remote sequence homologs, thus limiting the applicability of modeling methods. A central problem is that the alignment that is “optimal” in terms of the DP score does not necessarily correspond to the alignment that produces the most accurate structural model. That is, the correct alignment based on structural superposition will generally have a lower score than the optimal alignment obtained from sequence. Variations of the DP algorithm have been developed that generate alternative alignments that are “suboptimal” in terms of the DP score, but these still encounter difficulties in detecting the correct structural alignment. We present here a new alternative sequence alignment method that relies heavily on the structure of the template. By initially aligning the query sequence to individual fragments in secondary structure elements and combining high-scoring fragments that pass basic tests for “modelability”, we can generate accurate alignments within a small ensemble. Our results suggest that the set of sequences that can currently be modeled by homology can be greatly extended.
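To make the notion of a DP score concrete, here is a minimal sketch of the textbook global alignment recurrence (Needleman-Wunsch) with a linear gap penalty. It is not the fragment-based method described in the abstract. The function name and the match, mismatch and gap values are illustrative assumptions; real protein aligners use substitution matrices and affine gap penalties.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of sequences a and b.

    Standard Needleman-Wunsch recurrence with a linear gap penalty.
    Scoring values are illustrative defaults, not from the abstract.
    """
    # prev[j] is the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                prev[j - 1] + substitution,  # align a[i-1] with b[j-1]
                prev[j] + gap,               # gap in b
                curr[j - 1] + gap,           # gap in a
            )
        prev = curr
    return prev[-1]
```

The returned value is the score that the DP optimum maximizes. As the abstract points out, the alignment that is correct by structural superposition may score lower than this optimum, which is why the score alone cannot identify it.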

    Genetic Drivers of Kidney Defects in the DiGeorge Syndrome

    Background: The DiGeorge syndrome, the most common of the microdeletion syndromes, affects multiple organs, including the heart, the nervous system, and the kidney. It is caused by deletions on chromosome 22q11.2; the genetic driver of the kidney defects is unknown. Methods: We conducted a genomewide search for structural variants in two cohorts: 2080 patients with congenital kidney and urinary tract anomalies and 22,094 controls. We performed exome and targeted resequencing in samples obtained from 586 additional patients with congenital kidney anomalies. We also carried out functional studies using zebrafish and mice. Results: We identified heterozygous deletions of 22q11.2 in 1.1% of the patients with congenital kidney anomalies and in 0.01% of population controls (odds ratio, 81.5; P=4.5×10⁻¹⁴). We localized the main drivers of renal disease in the DiGeorge syndrome to a 370-kb region containing nine genes. In zebrafish embryos, an induced loss of function in snap29, aifm3, and crkl resulted in renal defects; the loss of crkl alone was sufficient to induce defects. Five of 586 patients with congenital urinary anomalies had newly identified, heterozygous protein-altering variants, including a premature termination codon, in CRKL. The inactivation of Crkl in the mouse model induced developmental defects similar to those observed in patients with congenital urinary anomalies. Conclusions: We identified a recurrent 370-kb deletion at the 22q11.2 locus as a driver of kidney defects in the DiGeorge syndrome and in sporadic congenital kidney and urinary tract anomalies. Of the nine genes at this locus, SNAP29, AIFM3, and CRKL appear to be critical to the phenotype, with haploinsufficiency of CRKL emerging as the main genetic driver. (Funded by the National Institutes of Health and others.)

    Predicting transmembrane beta-barrels in proteomes

    Very few methods address the problem of predicting beta-barrel membrane proteins directly from sequence. One reason is that only very few high-resolution structures for transmembrane beta-barrel (TMB) proteins have been determined thus far. Here we introduced the design, statistics and results of a novel profile-based hidden Markov model for the prediction and discrimination of TMBs. The method carefully attempts to avoid over-fitting the sparse experimental data. While our model training and scoring procedures were very similar to a recently published work, the architecture and structure-based labelling were significantly different. In particular, we introduced a new definition of beta-hairpin motifs, explicit state modelling of transmembrane strands, and a log-odds whole-protein discrimination score. The resulting method reached an overall four-state (up-, down-strand, periplasmic-, outer-loop) accuracy as high as 86%. Furthermore, the method accurately discriminated TMB from non-TMB proteins (45% coverage at 100% accuracy). This high precision enabled the application to 72 entirely sequenced Gram-negative bacteria. We found over 164 previously uncharacterized TMB proteins at high confidence. Database searches did not implicate any of these proteins with membranes. We contend that the vast majority of our 164 predictions will eventually be verified experimentally. All proteome predictions and the PROFtmb prediction method are available at http://www.rostlab.org/services/PROFtmb/

    De novo missense variants in PPP2R5D are associated with intellectual disability, macrocephaly, hypotonia, and autism

    Protein phosphatase 2A (PP2A) is a heterotrimeric protein serine/threonine phosphatase and is involved in a broad range of cellular processes. PPP2R5D is a regulatory B subunit of PP2A and plays an important role in regulating key neuronal and developmental processes such as PI3K/AKT and glycogen synthase kinase 3 beta (GSK3β)-mediated cell growth, chromatin remodeling, and gene transcriptional regulation. Using whole-exome sequencing (WES), we identified four de novo variants in PPP2R5D in a total of seven unrelated individuals with intellectual disability (ID) and other shared clinical characteristics, including autism spectrum disorder, macrocephaly, hypotonia, seizures, and dysmorphic features. Among the four variants, two have been previously reported and two are novel. All four amino acids are highly conserved among the PP2A subunit family, and all change a negatively charged acidic glutamic acid (E) to a positively charged basic lysine (K) and are predicted to disrupt the PP2A subunit binding and impair the dephosphorylation capacity. Our data provide further support for PPP2R5D as a genetic cause of ID.

    Bi-allelic missense disease-causing variants in RPL3L associate neonatal dilated cardiomyopathy with muscle-specific ribosome biogenesis

    Dilated cardiomyopathy (DCM) is among the most frequent forms of cardiomyopathy and is mainly characterized by cardiac dilatation and reduced systolic function. Although most cases of DCM are classified as sporadic, 20-30% of cases show a heritable pattern. Familial forms of DCM are genetically heterogeneous, and mutations in several genes have been identified that most commonly play a role in cytoskeleton and sarcomere-associated processes. Still, a large number of familial cases remain unsolved. Here, we report five individuals from three independent families who presented with severe dilated cardiomyopathy during the neonatal period. Using whole-exome sequencing (WES), we identified causative, compound heterozygous missense variants in RPL3L (ribosomal protein L3-like) in all the affected individuals. The identified variants co-segregated with the disease in each of the three families and were absent or very rare in the human population, in line with an autosomal recessive inheritance pattern. They are located within the conserved RPL3 domain of the protein and were classified as deleterious by several in silico prediction tools. RPL3L is one of the four non-canonical riboprotein genes and it encodes the 60S ribosomal protein L3-like protein that is highly expressed only in cardiac and skeletal muscle. Three-dimensional homology modeling and in silico analysis of the affected residues in RPL3L indicate that the identified changes specifically alter the interaction of RPL3L with the RNA components of the 60S ribosomal subunit and thus destabilize its binding to the 60S subunit. In conclusion, we report that bi-allelic pathogenic variants in RPL3L are causative of an early-onset, severe neonatal form of dilated cardiomyopathy, and we show for the first time that cytoplasmic ribosomal proteins are involved in the pathogenesis of non-syndromic cardiomyopathies.