14 research outputs found

    OpenStructure: a flexible software framework for computational structural biology

    Motivation: Developers of new methods in computational structural biology are often hampered in their research by incompatible software tools and non-standardized data formats. To address this problem, we have developed OpenStructure as a modular open source platform to provide a powerful, yet flexible general working environment for structural bioinformatics. OpenStructure consists primarily of a set of libraries written in C++ with a cleanly designed application programmer interface. All functionality can be accessed directly in C++ or in a Python layer, meeting both the requirements for high efficiency and ease of use. Powerful selection queries and the notion of entity views to represent these selections greatly facilitate the development and implementation of algorithms on structural data. The modular integration of computational core methods with powerful visualization tools makes OpenStructure an ideal working and development environment. Several applications, such as the latest versions of IPLT and QMean, have been implemented based on OpenStructure, demonstrating its value for the development of next-generation structural biology algorithms.

    p3d – Python module for structural bioinformatics

    Background: High-throughput bioinformatic analysis tools are needed to mine the large amount of structural data via knowledge-based approaches. The development of such tools requires a robust interface to access the structural data in an easy way. For this, the Python scripting language is the optimal choice, since its philosophy is to write understandable source code. Results: p3d is an object-oriented Python module that adds a simple yet powerful interface to the Python interpreter to process and analyse three-dimensional protein structure files (PDB files). p3d's strength arises from the combination of a) very fast spatial access to the structural data due to the implementation of a binary space partitioning (BSP) tree, b) set theory and c) functions that combine a and b and that use human-readable language in the search queries rather than complex computer language. All these factors combined facilitate the rapid development of bioinformatic tools that can perform quick and complex analyses of protein structures. Conclusion: p3d is the perfect tool to quickly develop tools for structural bioinformatics using the Python scripting language.

    PTools: an opensource molecular docking library

    Background: Macromolecular docking is a challenging field of bioinformatics. Developing new algorithms is a slow process generally involving routine tasks that should be found in a robust library and not programmed from scratch for every new software application. Results: We present an object-oriented Python/C++ library to help the development of new docking methods. This library contains low-level routines like PDB-format manipulation functions as well as high-level tools for docking and analyzing results. We also illustrate the ease of use of this library with the detailed implementation of a 3-body docking procedure. Conclusion: The PTools library can handle molecules at coarse-grained or atomic resolution and allows users to rapidly develop new software. The library is already in use for protein-protein and protein-DNA docking with the ATTRACT program and for simulation analysis. This library is freely available under the GNU GPL license, together with detailed documentation.

    Structure-Based Analysis of Five Novel Disease-Causing Mutations in 21-Hydroxylase-Deficient Patients

    Congenital adrenal hyperplasia (CAH) due to 21-hydroxylase deficiency is the most frequent inborn error of metabolism, and accounts for 90–95% of CAH cases. The affected enzyme, P450C21, is encoded by the CYP21A2 gene, located on chromosome 6p21.3 together with the CYP21A1P pseudogene, with which it shares 98% nucleotide sequence identity. Even though most patients carry CYP21A1P-derived mutations, an increasing number of novel and rare mutations in disease-causing alleles have been found in recent years. In the present work, we describe five novel CYP21A2 mutations, p.R132C, p.149C, p.M283V, p.E431K and a frameshift g.2511_2512delGG, in four non-classical patients and one salt-wasting patient from Argentina. All novel point mutations are located in CYP21 protein residues that are conserved throughout mammalian species, and none of them were found in control individuals. The putative pathogenic mechanisms of the novel variants were analyzed in silico. A three-dimensional CYP21 structure was generated by homology modeling, and the protein design algorithm FoldX was used to calculate changes in the stability of the CYP21A2 protein. Our analysis revealed changes in protein stability or in the surface charge of the mutant enzymes, which could be related to the clinical manifestations found in patients.

    New hyperekplexia mutations provide insight into glycine receptor assembly, trafficking, and activation mechanisms

    Background: Hyperekplexia mutations have provided much information about glycine receptor structure and function. Results: We identified and characterized nine new mutations. Dominant mutations resulted in spontaneous activation, whereas recessive mutations precluded surface expression. Conclusion: These data provide insight into glycine receptor activation mechanisms and surface expression determinants. Significance: The results enhance our understanding of hyperekplexia pathology and glycine receptor structure-function. © 2013 by The American Society for Biochemistry and Molecular Biology, Inc. Published in the U.S.A.

    Wordom update 2: A user-friendly program for the analysis of molecular structures and conformational ensembles

    We present the second update of Wordom, a user-friendly and efficient program for manipulation and analysis of conformational ensembles from molecular simulations. The current update expands some of the existing modules and adds 21 new modules to update 1, published in 2011. The new additions can be divided into three sets that: 1) analyze atomic fluctuations and structural communication; 2) explore ion-channel conformational dynamics and ionic translocation; and 3) compute geometrical indices of structural deformation. Set 1 serves to compute correlations of motions, find geometrically stable domains, identify a dynamically invariant core, find changes in domain-domain separation and mutual orientation, perform wavelet analysis of large-scale simulations, process the output of principal component analysis of atomic fluctuations, perform functional mode analysis, infer regions of mechanical rigidity, analyze overall fluctuations, and perform perturbation response scanning. Set 2 includes modules specific for ion channels, which serve to monitor the pore radius as well as water or ion fluxes, and measure functional collective motions like receptor twisting or tilting angles. Finally, set 3 includes tools to monitor structural deformations by computing angles, perimeter, area, volume, β-sheet curvature, radial distribution function, and center of mass. A ring perception module is also included, which is helpful for monitoring supramolecular self-assemblies. This update places Wordom among the most suitable, complete, user-friendly, and efficient software for the analysis of biomolecular simulations. The source code of Wordom and the related documentation are available under the GNU general public license at http://wordom.sf.net

    Pathophysiological Mechanisms of Dominant and Recessive GLRA1 Mutations in Hyperekplexia

    Hyperekplexia is a rare, but potentially fatal, neuromotor disorder characterized by exaggerated startle reflexes and hypertonia in response to sudden, unexpected auditory or tactile stimuli. This disorder is primarily caused by inherited mutations in the genes encoding the glycine receptor (GlyR) alpha 1 subunit (GLRA1) and the presynaptic glycine transporter GlyT2 (SLC6A5). In this study, systematic DNA sequencing of GLRA1 in 88 new unrelated human hyperekplexia patients revealed 19 sequence variants in 30 index cases, of which 21 cases were inherited in recessive or compound heterozygote modes. This indicates that recessive hyperekplexia is far more prevalent than previous estimates. From the 19 GLRA1 sequence variants, we have investigated the functional effects of 11 novel and 2 recurrent mutations. The expression levels and functional properties of these hyperekplexia mutants were analyzed using a high-content imaging system and patch-clamp electrophysiology. When expressed in HEK293 cells, either as homomeric alpha 1 or heteromeric alpha 1 beta GlyRs, subcellular localization defects were the major mechanism underlying recessive mutations. However, mutants without trafficking defects typically showed alterations in the glycine sensitivity suggestive of disrupted receptor function. This study also reports the first hyperekplexia mutation associated with a GlyR leak conductance, suggesting tonic channel opening as a new mechanism in neuronal ligand-gated ion channels.

    Structural and functional analysis of CD81-Claudin-1, a hepatitis C virus receptor complex

    Many viruses initiate infection through a multistep process involving host cell membrane proteins. Hepatitis C virus (HCV) is an important human pathogen that infects more than 185 million people worldwide and results in progressive liver disease. Recent advances have identified an essential role for tetraspanin CD81 and tight junction protein Claudin-1 in HCV entry into hepatocytes in the liver. CD81 associates with Claudin-1 and this complex is necessary for virus internalisation; defining the full length interface of this membrane protein interaction is therefore important for the design of future anti-viral therapies. Structural information is lacking for CD81: indeed, there is no high resolution structure for any full-length tetraspanin. This thesis describes an analysis of the protein-protein interaction interface between CD81 and Claudin-1 full-length proteins using a split-ubiquitin yeast assay. Also, using recombinant protein production of CD81, this thesis describes work towards successful crystallisation trials of a full length tetraspanin. CD81 homotypic and heterotypic interactions with Claudin-1 were analysed in a high-throughput format in yeast, showing that this interaction is specific and does not require other mammalian cell factors. This work demonstrates that the CD81 large extracellular loop and its first transmembrane domain are involved in the CD81-Claudin-1 interaction: a novel full length molecular model predicted interacting amino acid residues that were confirmed in vivo using yeast assays. Thermal stability assays used to investigate recombinant membrane protein found that both detergent and buffer components are vital for the stability of recombinant CD81, which shows increased thermostability in the presence of cholesteryl hemisuccinate. 
Using the improved protein solution environment found here and the increased understanding of the tetraspanin interaction interface, this work paves the way for CD81 structural characterisation alone or in combination with Claudin-1.

    Computational Approaches to Address the Next-Generation Sequencing Era

    In this thesis, I propose new algorithms and models to address biological problems. Computer science in fact plays a key role in proteomics and genetics research due to the advent of big datasets. In the context of protein study, I developed new methods for protein function prediction based on information retrieval principles. By using heterogeneous sources of knowledge, like graph search and sequence similarity, I designed a tool called INGA that can be used to annotate entire genomes. It has been benchmarked during the Critical Assessment of Function Annotation challenge, and it proved to be one of the most effective approaches for function inference. To better characterize proteins from the structural point of view, I proposed a protein conformer detection strategy based on residue interaction network (RIN) data. RIN graphs were extended to deal with the time-dependent protein coordinate fluctuations, and were generated by clustering algorithms. An implementation called RING MD effectively highlighted the key amino acids known to be functionally relevant in Ubiquitin. These amino acids in fact are very important to explain the protein's three-dimensional dynamics. With the same rationale, RIN graphs were also used to predict the impact of mutations within a protein structure. By combining information about a mutant node in the network and its features, an artificial neural network was trained to estimate the Gibbs free energy change of a protein. Extreme changes in the internal energy might lead to protein unfolding, and possibly to disease. A reduction in a protein's flexibility may hamper its function as well. As an example, the extreme fluctuations observed in intrinsically disordered proteins (IDPs) are fundamental for their activities. To better understand IDPs, I contributed to the collection of the largest dataset of disordered regions.
The subsequent analysis showed the typical functions of these sequences and the biological processes in which they are involved. Due to the importance of their detection, a comprehensive assessment of disorder predictors was performed to identify the state-of-the-art methods and their limitations. In the context of genetics, I focused on phenotype prediction. During the Critical Assessment of Genome Interpretation (CAGI), I proposed new approaches for the analysis of exome data to prioritize the risk of Crohn's disease and abnormal cholesterol levels. These are often defined as complex diseases, since the mechanism behind their onset is still unknown. In my study, human samples with an enrichment of mutations in critical genes were predicted to have a high genetic risk. In addition to disease-associated genes, protein interaction networks were considered to better account for variant accumulation in biological pathways. This strategy was shown to be among the best approaches by CAGI organizers. In the simpler case of Mendelian traits, with BOOGIE I designed a method for human blood group prediction based on exome data. It uses a specialized version of the nearest neighbor algorithm to match the gene variants in an unannotated exome with the ones available in a reference knowledge base. The most similar hit is used to transfer the blood group. With an accuracy above 90%, BOOGIE is a proof-of-concept that shows the potential applications of genetic prediction, and can be easily extended to any Mendelian trait. To summarize, this thesis is a partial answer to the exponential growth of available sequences that need further experiments. By integrating heterogeneous information and designing new predictive models based on machine learning, I developed novel tools for biological data analysis and classification.
All implementations are freely available to the community and might be helpful in future investigations, such as drug design and disease studies.