    Crystal structures of the human Dysferlin inner DysF domain

    Background: Mutations in dysferlin, the first protein linked with the cell membrane repair mechanism, cause a group of muscular dystrophies called dysferlinopathies. Dysferlin is a type two-anchored membrane protein, with a single C-terminal transmembrane helix and most of the protein lying in the cytoplasm. Dysferlin contains several C2 domains and two DysF domains, which are nested one inside the other. Many pathogenic point mutations fall in the DysF domain region. Results: We describe the crystal structure of the human dysferlin inner DysF domain at a resolution of 1.9 Angstroms. Most of the pathogenic mutations are part of aromatic/arginine stacks that hold the domain in a folded conformation. The high resolution of the structure shows that these interactions are a mixture of parallel ring/guanidinium stacking, perpendicular H-bond stacking and aliphatic chain packing. Conclusions: The high-resolution structure of the dysferlin DysF domain gives a template on which to interpret in detail the pathogenic mutations that lead to disease.

    Gene3D: Multi-domain annotations for protein sequence and comparative genome analysis

    Gene3D (http://gene3d.biochem.ucl.ac.uk) is a database of protein domain structure annotations for protein sequences. Domains are predicted using a library of profile HMMs from 2738 CATH superfamilies. Gene3D assigns domain annotations to Ensembl and UniProt sequence sets including >6000 cellular genomes and >20 million unique protein sequences. This represents an increase of 45% in the number of protein sequences since our last publication. Thanks to improvements in the underlying data and pipeline, we see large increases in the domain coverage of sequences. We have expanded this coverage by integrating Pfam and SUPERFAMILY domain annotations, and we now resolve domain overlaps to provide highly comprehensive composite multi-domain architectures. To make these data more accessible for comparative genome analyses, we have developed novel algorithms for searching genomes to identify related multi-domain architectures. In addition to providing domain family annotations, we have now developed a pipeline for 3D homology modelling of domains in Gene3D. This has been applied to the human genome and will be rolled out to other major organisms over the next year.

    The occupation of a box as a toy model for the seismic cycle of a fault

    We illustrate how a simple statistical model can describe the quasiperiodic occurrence of large earthquakes. The model idealizes the loading of elastic energy in a seismic fault by the stochastic filling of a box. The emptying of the box after it is full is analogous to the generation of a large earthquake in which the fault relaxes after having been loaded to its failure threshold. The duration of the filling process is analogous to the seismic cycle, the time interval between two successive large earthquakes in a particular fault. The simplicity of the model enables us to derive the statistical distribution of its seismic cycle. We use this distribution to fit the series of earthquakes with magnitude around 6 that occurred at the Parkfield segment of the San Andreas fault in California. Using this fit, we estimate the probability of the next large earthquake at Parkfield and devise a simple forecasting strategy. Comment: Final version of the published paper, with an erratum and an unpublished appendix with some proof
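
    The box-filling idea in the abstract above can be illustrated with a minimal simulation sketch. The abstract does not state the filling rule, so the details here are assumptions: the box has `n_cells` cells, each time step throws one ball at a uniformly random cell, an empty cell becomes occupied, a ball landing on an occupied cell is lost, and the box empties as soon as every cell is occupied. The function names and parameters are hypothetical and are not taken from the paper. Under these assumptions a cycle length is a coupon-collector time, whose mean is N times the N-th harmonic number.

```python
import random


def simulate_cycle_lengths(n_cells, n_cycles, seed=0):
    """Return the duration (in time steps) of each of n_cycles filling cycles.

    Assumed rule (not stated in the abstract): one ball per time step is
    thrown at a uniformly random cell; the cycle ends when all cells are full.
    """
    rng = random.Random(seed)
    durations = []
    for _ in range(n_cycles):
        occupied = [False] * n_cells
        filled = 0
        steps = 0
        while filled < n_cells:
            steps += 1
            cell = rng.randrange(n_cells)
            if not occupied[cell]:  # a ball hitting an occupied cell is lost
                occupied[cell] = True
                filled += 1
        durations.append(steps)  # the box is now full and gets emptied
    return durations


def expected_cycle_length(n_cells):
    """Mean coupon-collector time: N * (1 + 1/2 + ... + 1/N)."""
    return n_cells * sum(1.0 / k for k in range(1, n_cells + 1))


if __name__ == "__main__":
    lengths = simulate_cycle_lengths(n_cells=10, n_cycles=5000)
    print("simulated mean cycle:", sum(lengths) / len(lengths))
    print("theoretical mean cycle:", expected_cycle_length(10))
```

    Comparing the simulated mean with the closed-form mean is only a sanity check on the sketch; fitting the Parkfield series would require the full cycle-length distribution derived in the paper.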

    An integrated approach to the interpretation of Single Amino Acid Polymorphisms within the framework of CATH and Gene3D

    Background: The phenotypic effects of sequence variations in protein-coding regions come about primarily via their effects on the resulting structures, for example by disrupting active sites or affecting structural stability. To better understand the mechanisms behind known mutant phenotypes, and to predict the effects of novel variations, biologists need tools to gauge the impacts of DNA mutations in terms of their structural manifestation. Although many mutations occur within domains whose structure has been solved, many more occur within genes whose protein products have not been structurally characterized. Results: Here we present 3DSim (3D Structural Implication of Mutations), a database and web application facilitating the localization and visualization of single amino acid polymorphisms (SAAPs) mapped to protein structures even where the structure of the protein of interest is unknown. The server displays information on 6514 point mutations, 4865 of them known to be associated with disease. These polymorphisms are drawn from SAAPdb, which aggregates data from various sources including dbSNP and several pathogenic mutation databases. While the SAAPdb interface displays mutations on known structures, 3DSim projects mutations onto known sequence domains in Gene3D. This resource contains sequences annotated with domains predicted to belong to structural families in the CATH database. Mappings between domain sequences in Gene3D and known structures in CATH are obtained using a MUSCLE alignment. 1210 three-dimensional structures corresponding to CATH structural domains are currently included in 3DSim; these domains are distributed across 396 CATH superfamilies, and provide a comprehensive overview of the distribution of mutations in structural space. Conclusion: The server is publicly available at http://3DSim.bioinfo.cnio.es/. In addition, the database containing the mapping between SAAPdb, Gene3D and CATH is available on request and most of the functionality is available through programmatic web service access.

    Detecting Remote Evolutionary Relationships among Proteins by Large-Scale Semantic Embedding

    Virtually every molecular biologist has searched a protein or DNA sequence database to find sequences that are evolutionarily related to a given query. Pairwise sequence comparison methods (i.e., measures of similarity between query and target sequences) provide the engine for sequence database search and have been the subject of 30 years of computational research. For the difficult problem of detecting remote evolutionary relationships between protein sequences, the most successful pairwise comparison methods involve building local models (e.g., profile hidden Markov models) of protein sequences. However, recent work in massive data domains like web search and natural language processing demonstrates the advantage of exploiting the global structure of the data space. Motivated by this work, we present a large-scale algorithm called ProtEmbed, which learns an embedding of protein sequences into a low-dimensional “semantic space.” Evolutionarily related proteins are embedded in close proximity, and additional pieces of evidence, such as 3D structural similarity or class labels, can be incorporated into the learning process. We find that ProtEmbed achieves superior accuracy to widely used pairwise sequence methods like PSI-BLAST and HHSearch for remote homology detection; it also outperforms our previous RankProp algorithm, which incorporates global structure in the form of a protein similarity network. Finally, the ProtEmbed embedding space can be visualized, both at the global level and local to a given query, yielding intuition about the structure of protein sequence space.

    Estimation of the solubility parameters of model plant surfaces and agrochemicals: a valuable tool for understanding plant surface interactions

    Background: Most aerial plant parts are covered with a hydrophobic lipid-rich cuticle, which is the interface between the plant organs and the surrounding environment. Plant surfaces may have a high degree of hydrophobicity because of the combined effects of surface chemistry and roughness. The physical and chemical complexity of the plant cuticle limits the development of models that explain its internal structure and interactions with surface-applied agrochemicals. In this article we introduce a thermodynamic method for estimating the solubilities of model plant surface constituents and relating them to the effects of agrochemicals. Results: Following the van Krevelen and Hoftyzer method, we calculated the solubility parameters of three model plant species and eight compounds that differ in hydrophobicity and polarity. In addition, intact tissues were examined by scanning electron microscopy and the surface free energy, polarity, solubility parameter and work of adhesion of each were calculated from contact angle measurements of three liquids with different polarities. By comparing the affinities between plant surface constituents and agrochemicals derived from (a) theoretical calculations and (b) contact angle measurements we were able to distinguish the physical effect of surface roughness from the effect of the chemical nature of the epicuticular waxes. A solubility parameter model for plant surfaces is proposed on the basis of an increasing gradient from the cuticular surface towards the underlying cell wall. Conclusions: The procedure enabled us to predict the interactions among agrochemicals, plant surfaces, and cuticular and cell wall components, and promises to be a useful tool for improving our understanding of biological surface interactions.