33 research outputs found

    A statistical analysis of β-sheets in proteins

    Disulfides as redox switches : from molecular mechanisms to functional significance

    The molecular mechanisms underlying thiol-based redox control are poorly defined. Disulfide bonds between Cys residues are commonly thought to confer extra rigidity and stability to their resident protein, forming a type of proteinaceous spot weld. Redox biologists have been redefining the role of disulfides over the last 30–40 years. Disulfides are now known to form in the cytosol under conditions of oxidative stress. Isomerization of extracellular disulfides is also emerging as an important regulator of protein function. The current paradigm is that the disulfide proteome consists of two subproteomes: a structural group and a redox-sensitive group. The redox-sensitive group is less stable and often associated with regions of stress in protein structures. Some characterized redox-active disulfides are the helical CXXC motif, often associated with thioredoxin-fold proteins; and forbidden disulfides, a group of metastable disulfides that disobey elucidated rules of protein stereochemistry. Here we discuss the role of redox-active disulfides as switches in proteins.

    Identifying foldable regions in protein sequence from the hydrophobic signal

    Structural genomics initiatives aim to elucidate representative 3D structures for the majority of protein families over the next decade, but many obstacles must be overcome. The correct design of constructs is extremely important since many proteins will be too large or contain unstructured regions and will not be amenable to crystallization. It is therefore essential to identify regions in protein sequences that are likely to be suitable for structural study. Scooby-Domain is a fast and simple method to identify globular domains in protein sequences. Domains are compact units of protein structure and their correct delineation will aid structural elucidation through a divide-and-conquer approach. Scooby-Domain predictions are based on the observed lengths and hydrophobicities of domains from proteins with known tertiary structure. The prediction method employs an A*-search to identify sequence regions that form a globular structure and those that are unstructured. On a test set of 173 proteins with consensus CATH and SCOP domain definitions, Scooby-Domain has a sensitivity of 50% and an accuracy of 29%, which is better than current state-of-the-art methods. The method does not rely on homology searches and, therefore, can identify previously unknown domains.

    Comparison of automated candidate gene prediction systems using genes implicated in type 2 diabetes by genome-wide association studies

    Background: Automated candidate gene prediction systems allow geneticists to hone in on disease genes more rapidly by identifying the most probable candidate genes linked to the disease phenotypes under investigation. Here we assessed the ability of eight different candidate gene prediction systems to predict disease genes in intervals previously associated with type 2 diabetes by benchmarking their performance against genes implicated by recent genome-wide association studies. Results: Using a search space of 9556 genes, all but one of the systems pruned the genome in favour of genes associated with moderate to highly significant SNPs. Of the 11 genes associated with highly significant SNPs identified by the genome-wide association studies, eight were flagged as likely candidates by at least one of the prediction systems. A list of candidates produced by a previous consensus approach did not match any of the genes implicated by 706 moderate to highly significant SNPs flagged by the genome-wide association studies. We prioritized genes associated with medium significance SNPs. Conclusion: The study appraises the relative success of several candidate gene prediction systems against independent genetic data. Even when confronted with challengingly large intervals, the candidate gene prediction systems can successfully select likely disease genes. Furthermore, they can be used to filter statistically less-well-supported genetic data to select more likely candidates. We suggest consensus approaches fail because they penalize novel predictions made from independent underlying databases. To realize their full potential further work needs to be done on prioritization and annotation of genes.

    Gentrepid V2.0: a web server for candidate disease gene prediction

    BACKGROUND: Candidate disease gene prediction is a rapidly developing area of bioinformatics research with the potential to deliver great benefits to human health. As experimental studies detecting associations between genetic intervals and disease proliferate, better bioinformatic techniques that can expand and exploit the data are required. DESCRIPTION: Gentrepid is a web resource which predicts and prioritizes candidate disease genes for both Mendelian and complex diseases. The system can take input from linkage analysis of single genetic intervals or multiple marker loci from genome-wide association studies. The underlying database of the Gentrepid tool sources data from numerous gene and protein resources, taking advantage of the wealth of biological information available. Using known disease gene information from OMIM, the system predicts and prioritizes disease gene candidates that participate in the same protein pathways or share similar protein domains. Alternatively, using an ab initio approach, the system can detect enrichment of these protein annotations without prior knowledge of the phenotype. CONCLUSIONS: The system aims to integrate the wealth of protein information currently available with known and novel phenotype/genotype information to acquire knowledge of biological mechanisms underpinning disease. We have updated the system to facilitate analysis of GWAS data and the study of complex diseases. Application of the system to GWAS data on hypertension using the ICBP data is provided as an example. An interesting prediction is a ZIP transporter additional to the one found by the ICBP analysis. The webserver URL is https://www.gentrepid.org/

    Structural and functional characterization of the oxidoreductase α-DsbA1 from Wolbachia pipientis

    The α-proteobacterium Wolbachia pipientis is a highly successful intracellular endosymbiont of invertebrates that manipulates its host's reproductive biology to facilitate its own maternal transmission. The fastidious nature of Wolbachia and the lack of genetic transformation have hampered analysis of the molecular basis of these manipulations. Structure determination of key Wolbachia proteins will enable the development of inhibitors for chemical genetics studies. Wolbachia encodes a homologue (α-DsbA1) of the Escherichia coli dithiol oxidase enzyme EcDsbA, essential for the oxidative folding of many exported proteins. We found that the active-site cysteine pair of Wolbachia α-DsbA1 has the most reducing redox potential of any characterized DsbA. In addition, Wolbachia α-DsbA1 possesses a second disulfide that is highly conserved in α-proteobacterial DsbAs but not in other DsbAs. The α-DsbA1 structure lacks the characteristic hydrophobic features of EcDsbA, and the protein neither complements EcDsbA deletion mutants in E. coli nor interacts with EcDsbB, the redox partner of EcDsbA. The surface characteristics and redox profile of α-DsbA1 indicate that it probably plays a specialized oxidative folding role with a narrow substrate specificity. This first report of a Wolbachia protein structure provides the basis for future chemical genetics studies.

    Analysis of protein sequence and interaction data for candidate disease gene prediction

    Linkage analysis is a successful procedure to associate diseases with specific genomic regions. These regions are often large, containing hundreds of genes, which make experimental methods employed to identify the disease gene arduous and expensive. We present two methods to prioritize candidates for further experimental study: Common Pathway Scanning (CPS) and Common Module Profiling (CMP). CPS is based on the assumption that common phenotypes are associated with dysfunction in proteins that participate in the same complex or pathway. CPS applies network data derived from protein–protein interaction (PPI) and pathway databases to identify relationships between genes. CMP identifies likely candidates using a domain-dependent sequence similarity approach, based on the hypothesis that disruption of genes of similar function will lead to the same phenotype. Both algorithms use two forms of input data: known disease genes or multiple disease loci. When using known disease genes as input, our combined methods have a sensitivity of 0.52 and a specificity of 0.97 and reduce the candidate list by 13-fold. Using multiple loci, our methods successfully identify disease genes for all benchmark diseases with a sensitivity of 0.84 and a specificity of 0.63. Our combined approach prioritizes good candidates and will accelerate the disease gene discovery process.

    Changes in zinc ligation promote remodeling of the active site in the zinc hydrolase superfamily

    The zinc hydrolase superfamily is a group of divergently related proteins that are predominantly enzymes with a zinc-based catalytic mechanism. The common structural scaffold of the superfamily consists of an eight-stranded β-sheet flanked by six α-helices. Previous analyses, while acknowledging the likely divergent origins of leucine aminopeptidase, carboxypeptidase A and the co-catalytic enzymes of the metallopeptidase H clan based on their structural scaffolds, have failed to find any homology between the active sites in leucine aminopeptidase and the metallopeptidase H clan enzymes. Here we show that these two groups of co-catalytic enzymes have overlapping dizinc centers where one of the two zinc atoms is conserved in each group. Carboxypeptidase A and leucine aminopeptidase, on the other hand, no longer share any homologous zinc-binding sites. At least three catalytic zinc-binding sites have existed in the structural scaffold over the period of history defined by available structures. Comparison of enzyme-inhibitor complexes show that major remodeling of the substrate-binding site has occurred in association with each change in zinc ligation in the binding site. These changes involve re-registration and re-orientation of the substrate. Some residues important to the catalytic mechanism are not conserved amongst members. We discuss how molecules acting in trans may have facilitated the mutation of catalytically important residues in the active site in this group.

    Web tools for the prioritization of candidate disease genes

    Despite increasing sequencing capacity, genetic disease investigation still frequently results in the identification of loci containing multiple candidate disease genes that need to be tested for involvement in the disease. This process can be expedited by prioritizing the candidates prior to testing. Over the last decade, a large number of computational methods and tools have been developed to assist the clinical geneticist in prioritizing candidate disease genes. In this chapter, we give an overview of computational tools that can be used for this purpose, all of which are freely available over the web.