478 research outputs found

    Assessing functional novelty of PSI structures via structure-function analysis of large and diverse superfamilies

    One aim of the structural genomics initiatives has been to improve our understanding of protein function by providing representative structures for many structurally uncharacterised protein families. As suggested by the recent assessment of the Protein Structure Initiative (Structural Genomics Initiative, funded by the NIH), doubts have arisen as to whether Structural Genomics as initially planned was really beneficial to our understanding of biological issues, and in particular of protein function.
A few protein domain superfamilies have been shown to account for unexpectedly large numbers of proteins encoded in fully sequenced genomes. These large superfamilies are generally very diverse, spanning a wide range of functions, both in terms of molecular activities and biological processes. Some of these superfamilies, such as the Rossmann-fold P-loop nucleotide hydrolases or the TIM-barrel glycosidases, have been the subject of extensive structural studies, which in turn have shed light on how evolution of sequence and structure properties produces functional diversity amongst homologues. Recently, the Structure-Function Linkage Database (SFLD) has been set up with the aim of helping the study of structure-function correlations in such superfamilies. Since the evolutionary success of these large superfamilies suggests biological importance, several Structural Genomics Centers have focused on providing full structural coverage for representatives of all sequence families in these superfamilies.
In this work we evaluate structure/function diversity in a set of these large superfamilies and attempt to assess the quality and quantity of biological information gained from Structural Genomics.

    Potential of deep learning segmentation for the extraction of archaeological features from historical map series

    Historical maps present a unique depiction of past landscapes, providing evidence for a wide range of information such as settlement distribution, past land use, natural resources, transport networks, toponymy and other natural and cultural data within an explicitly spatial context. Maps produced before the expansion of large‐scale mechanized agriculture reflect a landscape that is lost today. Of particular interest to us is the great quantity of archaeologically relevant information that these maps recorded, both deliberately and incidentally. Despite the importance of the information they contain, researchers have only recently begun to automatically digitize and extract data from such maps as coherent information, rather than manually examine a raster image. However, these new approaches have focused on specific types of information that cannot be used directly for archaeological or heritage purposes. This paper provides a proof of concept of the application of deep learning techniques to extract archaeological information from historical maps in an automated manner. Early twentieth century colonial map series have been chosen, as they provide enough time depth to avoid many recent large‐scale landscape modifications and cover very large areas (comprising several countries). The use of common symbology and conventions enhances the applicability of the method. The results show deep learning to be an efficient tool for the recovery of georeferenced, archaeologically relevant information that is represented as conventional signs, line‐drawings and text in historical maps. The method can provide excellent results when an adequate training dataset has been gathered and is therefore at its best when applied to the large map series that can supply such information.
The deep learning approaches described here make it possible to map sites and features across entire map series much more quickly and coherently than other available methods, opening up the potential to reconstruct archaeological landscapes at continental scales.

    FLORA: a novel method to predict protein function from structure in diverse superfamilies

    Predicting protein function from structure remains an active area of interest, particularly for the structural genomics initiatives where a substantial number of structures are initially solved with little or no functional characterisation. Although global structure comparison methods can be used to transfer functional annotations, the relationship between fold and function is complex, particularly in functionally diverse superfamilies that have evolved through different secondary structure embellishments to a common structural core. The majority of prediction algorithms employ local templates built on known or predicted functional residues. Here, we present a novel method (FLORA) that automatically generates structural motifs associated with different functional sub-families (FSGs) within functionally diverse domain superfamilies. Templates are created purely on the basis of their specificity for a given FSG, and the method makes no prior prediction of functional sites, nor assumes specific physico-chemical properties of residues. FLORA is able to accurately discriminate between homologous domains with different functions and substantially outperforms (a 2–3 fold increase in coverage at low error rates) popular structure comparison methods and a leading function prediction method. We benchmark FLORA on a large data set of enzyme superfamilies from all three major protein classes (α, β, αβ) and demonstrate the functional relevance of the motifs it identifies. We also provide novel predictions of enzymatic activity for a large number of structures solved by the Protein Structure Initiative. Overall, we show that FLORA is able to effectively detect functionally similar protein domain structures by purely using patterns of structural conservation of all residues

    New functional families (FunFams) in CATH to improve the mapping of conserved functional sites to 3D structures.

    CATH version 3.5 (Class, Architecture, Topology, Homology, available at http://www.cathdb.info/) contains 173 536 domains, 2626 homologous superfamilies and 1313 fold groups. When focusing on structural genomics (SG) structures, we observe that the number of new folds for CATH v3.5 is slightly lower than for previous releases, which suggests that we may now know the majority of folds that are easily accessible to structure determination. We have improved the accuracy of our functional family (FunFam) sub-classification method, and the CATH sequence domain search facility has been extended to provide FunFam annotations for each domain. The CATH website has been redesigned. We have improved the display of functional data and of conserved sequence features associated with FunFams within each CATH superfamily.

    Computation of protein geometry and its applications: Packing and function prediction

    This chapter discusses geometric models of biomolecules and geometric constructs, including the union of balls model, the weighted Voronoi diagram, the weighted Delaunay triangulation, and alpha shapes. These geometric constructs enable fast and analytical computation of the shapes of biomolecules (including features such as voids and pockets) and of metric properties (such as area and volume). The algorithms for Delaunay triangulation, computation of voids and pockets, as well as volume/area computation are also described. In addition, applications in packing analysis of protein structures and protein function prediction are discussed. Comment: 32 pages, 9 figures
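
    As a rough illustration of what "analytical computation" of volume means for a union of balls, the sketch below handles only the simplest case: two overlapping spheres, combined by inclusion-exclusion using the closed-form lens volume. It is not the chapter's algorithm, which relies on alpha shapes to handle many atoms; the function names are hypothetical.

```python
import math


def sphere_volume(r: float) -> float:
    """Volume of a single ball of radius r."""
    return 4.0 / 3.0 * math.pi * r ** 3


def intersection_volume(r1: float, r2: float, d: float) -> float:
    """Volume shared by two balls whose centres are a distance d apart."""
    if d >= r1 + r2:
        # The balls are disjoint (or touch at a single point).
        return 0.0
    if d <= abs(r1 - r2):
        # The smaller ball lies entirely inside the larger one.
        return sphere_volume(min(r1, r2))
    # Closed-form volume of the lens formed by two partially overlapping balls.
    return (
        math.pi
        * (r1 + r2 - d) ** 2
        * (d * d + 2.0 * d * (r1 + r2) - 3.0 * (r1 - r2) ** 2)
        / (12.0 * d)
    )


def union_volume(r1: float, r2: float, d: float) -> float:
    """Inclusion-exclusion for two balls: V1 + V2 - V(intersection)."""
    return sphere_volume(r1) + sphere_volume(r2) - intersection_volume(r1, r2, d)


if __name__ == "__main__":
    # Two unit balls whose centres are one unit apart.
    print(union_volume(1.0, 1.0, 1.0))
```

    For more than two balls, higher-order overlaps have to be added and subtracted as well, which is where a combinatorial structure over the atoms becomes necessary.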

    Large-scale mapping of bioactive peptides in structural and sequence space

    The health-enhancing potential of bioactive peptides (BPs) has driven interest in food proteins as well as in the development of predictive methods. Research has been especially active on their use as components of functional foods. Apparently, BPs do not have a given biological function in the proteins that contain them, and they do not evolve under independent evolutionary constraints. In this work we performed a large-scale mapping of BPs in sequence and structural space. Using well-curated BPs deposited in the BIOPEP database, we searched for exact matches in non-redundant sequence databases. Proteins containing BPs were used in fold-recognition methods to predict the corresponding folds, and BP occurrences were mapped. We found that the fold distribution of BP occurrences possibly reflects relative sequence abundance in databases. However, we also found that proteins with five or more BPs in their sequences correspond to well-populated protein folds, called superfolds. We also found that, in well-populated superfamilies, BPs tend to adopt similar locations in the protein fold, suggesting the existence of hotspots. We think that our results could contribute to the development of new bioinformatics pipelines to improve BP detection.
    Affiliation: Nardo, Agustina Estefania. Provincia de Buenos Aires. Gobernación. Comisión de Investigaciones Científicas. Centro de Investigación y Desarrollo en Criotecnología de Alimentos. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Centro de Investigación y Desarrollo en Criotecnología de Alimentos. Universidad Nacional de la Plata. Facultad de Ciencias Exactas. Centro de Investigación y Desarrollo en Criotecnología de Alimentos; Argentina
    Affiliation: Añon, Maria Cristina. Provincia de Buenos Aires. Gobernación. Comisión de Investigaciones Científicas. Centro de Investigación y Desarrollo en Criotecnología de Alimentos. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Centro de Investigación y Desarrollo en Criotecnología de Alimentos. Universidad Nacional de la Plata. Facultad de Ciencias Exactas. Centro de Investigación y Desarrollo en Criotecnología de Alimentos; Argentina
    Affiliation: Parisi, Gustavo Daniel. Universidad Nacional de Quilmes; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
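
    The exact-match step described in the abstract above can be sketched as a plain substring search. This is only an illustration under stated assumptions: the function names are hypothetical, the peptides and sequences in the usage example are toy strings rather than BIOPEP entries, and counting every occurrence (including overlapping ones) toward the threshold is an assumption, since the abstract does not say how BPs per protein were counted.

```python
from collections import defaultdict


def map_peptides(peptides, proteins):
    """Find every exact occurrence of each peptide in each protein sequence.

    Returns {protein_id: [(peptide, start), ...]} with 0-based start positions.
    Proteins without any match are omitted from the result.
    """
    hits = defaultdict(list)
    for protein_id, sequence in proteins.items():
        for peptide in peptides:
            start = sequence.find(peptide)
            while start != -1:
                hits[protein_id].append((peptide, start))
                # Advance by one residue so overlapping matches are also found.
                start = sequence.find(peptide, start + 1)
    return dict(hits)


def rich_proteins(hits, min_hits=5):
    """Protein ids with at least min_hits occurrences (assumed counting rule)."""
    return sorted(pid for pid, found in hits.items() if len(found) >= min_hits)


if __name__ == "__main__":
    # Toy data only; these are not real bioactive peptides or proteins.
    toy_peptides = ["AK", "KK"]
    toy_proteins = {"p1": "AKKKA", "p2": "GGG"}
    found = map_peptides(toy_peptides, toy_proteins)
    print(found)
    print(rich_proteins(found, min_hits=3))
```

    Recording start positions, rather than only presence or absence, is what would later allow occurrences to be compared by location within a fold.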

    LigASite—a database of biologically relevant binding sites in proteins with known apo-structures

    Better characterization of binding sites in proteins and the ability to accurately predict their location and energetic properties are major challenges which, if addressed, would have many valuable practical applications. Unfortunately, reliable benchmark datasets of binding sites in proteins are still sorely lacking. Here, we present LigASite (‘LIGand Attachment SITE’), a gold-standard dataset of binding sites in 550 proteins of known structure. LigASite consists exclusively of biologically relevant binding sites in proteins for which at least one apo- and one holo-structure are available. In defining the binding sites for each protein, information from all holo-structures is combined, considering in each case the quaternary structure defined by the PQS server. LigASite is built using simple criteria and is automatically updated as new structures become available in the PDB, thereby guaranteeing optimal data coverage over time. Both a redundant and a culled non-redundant version of the dataset are available at http://www.scmbb.ulb.ac.be/Users/benoit/LigASite. The website interface allows users to search the dataset by PDB identifiers, ligand identifiers, protein names or sequence, and to look for structural matches as defined by the CATH homologous superfamilies. The datasets can be downloaded from the website as Schema-validated XML files or comma-separated flat files.

    The CATH domain structure database: new protocols and classification levels give a more comprehensive resource for exploring evolution

    We report the latest release (version 3.0) of the CATH protein domain database. There has been a 20% increase in the number of structural domains classified in CATH, up to 86 151 domains. Release 3.0 comprises 1110 fold groups and 2147 homologous superfamilies. To cope with the increase in diverse structural homologues being determined by the structural genomics initiatives, more sensitive methods have been developed for identifying boundaries in multi-domain proteins and for recognising homologues. The CATH classification update is now driven by an integrated pipeline that links these automated procedures with validation steps, which have been made easier by the provision of information-rich web pages summarising comparison scores and relevant links to external sites for each domain being classified. An analysis of the population of domains in the CATH hierarchy and several domain characteristics are presented for version 3.0. We also report an update of the CATH Dictionary of homologous structures (CATH-DHS), which now contains multiple structural alignments, consensus information and functional annotations for 1459 well-populated superfamilies in CATH. CATH is directly linked to the Gene3D database, which is a projection of CATH structural data onto ∼2 million sequences in completed genomes and UniProt.

    Passive Prophylactic Administration with a Single Dose of Anti-Fel d 1 Monoclonal Antibodies REGN1908-1909 in Cat Allergen-Induced Allergic Rhinitis: A Randomized, Double-blind, Placebo Controlled Trial

    RATIONALE: Sensitization to Felis domesticus allergen 1 (Fel d 1) contributes to persistent allergic rhinitis and asthma. Existing treatment options for cat allergy, including allergen immunotherapy (AIT), are only moderately effective, and AIT has limited use due to safety concerns. OBJECTIVES: To explore the relationship among the pharmacokinetic, clinical, and immunological effects of REGN1908-1909 (anti-Fel d 1 monoclonal antibodies) in patients after treatment. METHODS: Patients received REGN1908-1909 (n=36) or placebo (n=37) in a phase 1b study. Fel d 1-induced basophil and IgE-facilitated allergen binding responses were evaluated at baseline and days 8, 29 and 85. Cytokine and chemokine levels in nasal fluids were measured. REGN1908-1909 inhibition of allergen-IgE binding in patient serum was evaluated. MEASUREMENTS AND MAIN RESULTS: Peak serum drug concentrations were concordant with maximal observed clinical response. The anti-Fel d 1 IgE/cat-dander IgE ratio in pretreatment serum correlated with Total Nasal Symptom Score (TNSS) improvement. The allergen neutralizing capacity of REGN1908-1909 was observed in serum and nasal fluid, and was detected in an inhibition assay. Type-2 cytokines (IL-4, IL-5 and IL-13) and chemokines (CCL17/TARC, CCL5/RANTES) in nasal fluid were inhibited in REGN1908-1909-treated patients compared to placebo (all P < 0.05); IL-13 and IL-5 levels correlated with TNSS improvement. Ex vivo assays demonstrated that the combination of REGN1908 and REGN1909 was more potent than either antibody alone for inhibiting FcεRI- and FcεRII (CD23)-mediated allergic responses and subsequent T-cell activation. CONCLUSION: Single passive dose administration of Fel d 1-neutralizing IgG antibodies improved nasal symptoms in cat-allergic patients, and was underscored by suppression of FcεRI-, FcεRII- and Th2-mediated allergic responses. Clinical trial registration available at www.clinicaltrials.gov, ID: NCT02127801