544 research outputs found

    The Gene3D Web Services: a platform for identifying, annotating and comparing structural domains in protein sequences

    The Gene3D structural domain database provides domain annotations for 7 million proteins, based on the manually curated structural domain superfamilies in CATH. These annotations are integrated with functional, genomic and molecular information from external resources, such as GO, EC, UniProt and the NCBI Taxonomy database. We have constructed a set of web services that provide programmatic access to this integrated database, as well as the Gene3D domain recognition tool (Gene3DScan) and protein sequence annotation pipeline for analysing novel protein sequences. Example queries include retrieving all curated GO terms for a domain superfamily or all the multi-domain architectures for the human genome. The services can be accessed using simple HTTP calls and are able to return results in a range of formats for quick downloading and easy parsing, graphical rendering and data storage. Hence, they provide a simple, but flexible means of integrating domain annotations and associated data sets into locally run pipelines and analysis software. The services can be found at http://gene3d.biochem.ucl.ac.uk/WebServices/

    Gene3D: comprehensive structural and functional annotation of genomes

    Gene3D provides comprehensive structural and functional annotation of most available protein sequences, including the UniProt, RefSeq and Integr8 resources. The main structural annotation is generated through scanning these sequences against the CATH structural domain database profile-HMM library. CATH is a database of manually derived PDB-based structural domains, placed within a hierarchy reflecting topology, homology and conservation and is able to infer more ancient and divergent homology relationships than sequence-based approaches. This data is supplemented with Pfam-A, other non-domain structural predictions (i.e. coiled coils) and experimental data from UniProt. In order to enhance the investigations possible with this data, we have also incorporated a variety of protein annotation resources, including protein–protein interaction data, GO functional assignments, KEGG pathways, FUNCAT functional descriptions and links to microarray expression data. All of this data can be accessed through a newly re-designed website that has a focus on flexibility and clarity, with searches that can be restricted to a single genome or across the entire sequence database. Currently Gene3D contains over 3.5 million domain assignments for nearly 5 million proteins including 527 completed genomes. This is available at: http://gene3d.biochem.ucl.ac.uk

    Interplay between ferromagnetism, surface states, and quantum corrections in a magnetically doped topological insulator

    The breaking of time-reversal symmetry by ferromagnetism is predicted to yield profound changes to the electronic surface states of a topological insulator. Here, we report on a concerted set of structural, magnetic, electrical and spectroscopic measurements of \MBS thin films wherein photoemission and x-ray magnetic circular dichroism studies have recently shown surface ferromagnetism in the temperature range 15 K $\leq T \leq$ 100 K, accompanied by a suppressed density of surface states at the Dirac point. Secondary ion mass spectroscopy and scanning tunneling microscopy reveal an inhomogeneous distribution of Mn atoms, with a tendency to segregate towards the sample surface. Magnetometry and anisotropic magnetoresistance measurements are insensitive to the high temperature ferromagnetism seen in surface studies, revealing instead a low temperature ferromagnetic phase at $T \lesssim 5$ K. The absence of both a magneto-optical Kerr effect and anomalous Hall effect suggests that this low temperature ferromagnetism is unlikely to be a homogeneous bulk phase but likely originates in nanoscale near-surface regions of the bulk where magnetic atoms segregate during sample growth. Although the samples are not ideal, with both bulk and surface contributions to electron transport, we measure a magnetoconductance whose behavior is qualitatively consistent with predictions that the opening of a gap in the Dirac spectrum drives quantum corrections to the conductance in topological insulators from the symplectic to the orthogonal class. Comment: To appear in Phys. Rev.

    The occupation of a box as a toy model for the seismic cycle of a fault

    We illustrate how a simple statistical model can describe the quasiperiodic occurrence of large earthquakes. The model idealizes the loading of elastic energy in a seismic fault by the stochastic filling of a box. The emptying of the box after it is full is analogous to the generation of a large earthquake in which the fault relaxes after having been loaded to its failure threshold. The duration of the filling process is analogous to the seismic cycle, the time interval between two successive large earthquakes in a particular fault. The simplicity of the model enables us to derive the statistical distribution of its seismic cycle. We use this distribution to fit the series of earthquakes with magnitude around 6 that occurred at the Parkfield segment of the San Andreas fault in California. Using this fit, we estimate the probability of the next large earthquake at Parkfield and devise a simple forecasting strategy. Comment: Final version of the published paper, with an erratum and an unpublished appendix with some proof
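A minimal sketch of the box-filling idea, under assumptions the abstract does not state: the box is taken to have `n_cells` cells, one cell is chosen uniformly at random per time step, and an empty cell becomes occupied when chosen. The cycle length is then the number of steps until every cell is occupied. The names `simulate_cycle` and `expected_cycle` are hypothetical, and the filling rule may differ from the one used in the paper.

```python
import random
from statistics import mean


def simulate_cycle(n_cells, rng):
    """Return the number of time steps needed to fill a box of n_cells cells.

    Assumed rule (not taken from the abstract): each step picks one cell
    uniformly at random; the cell becomes occupied if it was empty.
    """
    occupied = [False] * n_cells
    filled = 0
    steps = 0
    while filled < n_cells:
        steps += 1
        cell = rng.randrange(n_cells)
        if not occupied[cell]:
            occupied[cell] = True
            filled += 1
    return steps  # the box is now full and would be emptied (the "earthquake")


def expected_cycle(n_cells):
    """Mean cycle length under the assumed rule: N * (1 + 1/2 + ... + 1/N)."""
    return n_cells * sum(1.0 / k for k in range(1, n_cells + 1))


if __name__ == "__main__":
    rng = random.Random(0)
    cycles = [simulate_cycle(20, rng) for _ in range(5000)]
    print("simulated mean cycle:", mean(cycles))
    print("analytic mean cycle: ", expected_cycle(20))
```

Under this assumed rule the cycle length follows the classic coupon-collector distribution, so the simulated mean can be checked against the closed-form value.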

    Estimation of the solubility parameters of model plant surfaces and agrochemicals: a valuable tool for understanding plant surface interactions

    Background: Most aerial plant parts are covered with a hydrophobic lipid-rich cuticle, which is the interface between the plant organs and the surrounding environment. Plant surfaces may have a high degree of hydrophobicity because of the combined effects of surface chemistry and roughness. The physical and chemical complexity of the plant cuticle limits the development of models that explain its internal structure and interactions with surface-applied agrochemicals. In this article we introduce a thermodynamic method for estimating the solubilities of model plant surface constituents and relating them to the effects of agrochemicals. Results: Following the van Krevelen and Hoftyzer method, we calculated the solubility parameters of three model plant species and eight compounds that differ in hydrophobicity and polarity. In addition, intact tissues were examined by scanning electron microscopy and the surface free energy, polarity, solubility parameter and work of adhesion of each were calculated from contact angle measurements of three liquids with different polarities. By comparing the affinities between plant surface constituents and agrochemicals derived from (a) theoretical calculations and (b) contact angle measurements we were able to distinguish the physical effect of surface roughness from the effect of the chemical nature of the epicuticular waxes. A solubility parameter model for plant surfaces is proposed on the basis of an increasing gradient from the cuticular surface towards the underlying cell wall. Conclusions: The procedure enabled us to predict the interactions among agrochemicals, plant surfaces, and cuticular and cell wall components, and promises to be a useful tool for improving our understanding of biological surface interactions.
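As an illustration of the kind of calculation the van Krevelen and Hoftyzer method involves, the sketch below uses the group-contribution formulas as they are commonly stated: the dispersion, polar and hydrogen-bonding components are obtained from summed group contributions and the molar volume, then combined into a total solubility parameter. The abstract does not say which variant or group values the authors used, so the function name `hvk_solubility` is hypothetical and the numbers in the usage example are placeholders, not literature values.

```python
import math


def hvk_solubility(groups, molar_volume):
    """Hoftyzer-van Krevelen style solubility parameter components.

    groups: list of (F_d, F_p, E_h) tuples, one per structural group
            occurrence, supplied by the caller from a group-contribution table.
    molar_volume: molar volume V in cm^3/mol.
    Returns (delta_d, delta_p, delta_h, delta_total) in MPa^0.5, assuming
    F_d and F_p are in J^0.5 cm^1.5 mol^-1 and E_h is in J/mol.
    """
    sum_fd = sum(f_d for f_d, _, _ in groups)
    sum_fp_sq = sum(f_p ** 2 for _, f_p, _ in groups)
    sum_eh = sum(e_h for _, _, e_h in groups)

    delta_d = sum_fd / molar_volume                 # dispersion component
    delta_p = math.sqrt(sum_fp_sq) / molar_volume   # polar component
    delta_h = math.sqrt(sum_eh / molar_volume)      # hydrogen-bonding component
    delta_total = math.sqrt(delta_d ** 2 + delta_p ** 2 + delta_h ** 2)
    return delta_d, delta_p, delta_h, delta_total


if __name__ == "__main__":
    # Placeholder contributions for two made-up groups (illustration only).
    example_groups = [(400.0, 0.0, 0.0), (200.0, 300.0, 4000.0)]
    print(hvk_solubility(example_groups, molar_volume=100.0))
```

Real use would replace the placeholder tuples with tabulated contributions for each structural group of the wax constituent or agrochemical of interest.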

    Detecting Remote Evolutionary Relationships among Proteins by Large-Scale Semantic Embedding

    Virtually every molecular biologist has searched a protein or DNA sequence database to find sequences that are evolutionarily related to a given query. Pairwise sequence comparison methods (i.e., measures of similarity between query and target sequences) provide the engine for sequence database search and have been the subject of 30 years of computational research. For the difficult problem of detecting remote evolutionary relationships between protein sequences, the most successful pairwise comparison methods involve building local models (e.g., profile hidden Markov models) of protein sequences. However, recent work in massive data domains like web search and natural language processing demonstrates the advantage of exploiting the global structure of the data space. Motivated by this work, we present a large-scale algorithm called ProtEmbed, which learns an embedding of protein sequences into a low-dimensional “semantic space.” Evolutionarily related proteins are embedded in close proximity, and additional pieces of evidence, such as 3D structural similarity or class labels, can be incorporated into the learning process. We find that ProtEmbed achieves superior accuracy to widely used pairwise sequence methods like PSI-BLAST and HHSearch for remote homology detection; it also outperforms our previous RankProp algorithm, which incorporates global structure in the form of a protein similarity network. Finally, the ProtEmbed embedding space can be visualized, both at the global level and local to a given query, yielding intuition about the structure of protein sequence space.