    Protein structure search and local structure characterization

    Background: Structural similarities among proteins can provide valuable insight into their functional mechanisms and relationships. As the number of available three-dimensional (3D) protein structures increases, a greater variety of studies can be conducted with increasing efficiency, among which is the design of protein structural alphabets. Structural alphabets allow us to characterize local structures of proteins and describe the global folding structure of a protein using a one-dimensional (1D) sequence. Thus, 1D sequences can be used to identify structural similarities among proteins using standard sequence alignment tools such as BLAST or FASTA. Results: We used self-organizing maps in combination with a minimum spanning tree algorithm to determine the optimum size of a structural alphabet and applied the k-means algorithm to group protein fragments into clusters. The centroids of these clusters defined the structural alphabet. We also developed a flexible matrix training system to build a substitution matrix (TRISUM-169) for our alphabet. Based on FASTA and using TRISUM-169 as the substitution matrix, we developed the SA-FAST alignment tool. We compared the performance of SA-FAST with that of various search tools in database-scale search tasks and found that SA-FAST was highly competitive in all tests conducted. Further, we evaluated the performance of our structural alphabet in recognizing specific structural domains of EGF and EGF-like proteins. Our method recovered more EGF sub-domains with our structural alphabet than with other structural alphabets. SA-FAST can be found at http://140.113.166.178/safast/. Conclusion: The goal of this project was two-fold. First, we wanted to introduce a modular design pipeline to those who have been working with structural alphabets. Second, we wanted to open the door to researchers who have done substantial work in biological sequences but have yet to enter the field of protein structure research. Our experiments showed that by transforming the structural representations from 3D to 1D, several 1D-based tools can be applied to structural analysis, including similarity searches and structural motif finding.
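
The clustering-then-encoding idea in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the 2-D fragment descriptors, the fixed initial centroids, and the function names `kmeans` and `encode` are assumptions made for illustration, and the self-organizing map, minimum spanning tree, and TRISUM-169 steps are left out. Each cluster centroid plays the role of one alphabet letter, and a protein becomes a 1D string by assigning every fragment to its nearest centroid.

```python
import math
import string


def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def kmeans(points, init_centroids, iterations=20):
    """Plain Lloyd's k-means starting from the given centroids."""
    centroids = [tuple(c) for c in init_centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its cluster (keep it if empty).
        new_centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids


def encode(fragments, centroids):
    """Map each fragment to the letter of its nearest centroid (at most 26 letters here)."""
    return "".join(string.ascii_uppercase[nearest(f, centroids)] for f in fragments)


# Hypothetical 2-D descriptors, one per consecutive fragment of a toy protein.
fragments = [(-60.0, -45.0), (-62.0, -41.0), (-120.0, 130.0), (-118.0, 135.0), (-58.0, -47.0)]
alphabet = kmeans(fragments, init_centroids=[fragments[0], fragments[2]])
encoded = encode(fragments, alphabet)
print(encoded)
```

A string produced this way could then be passed to a sequence alignment tool, provided a substitution matrix trained for the alphabet is available.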

    Hot-spot analysis for drug discovery targeting protein-protein interactions

    Introduction: Protein-protein interactions are important for biological processes and pathological situations, and are attractive targets for drug discovery. However, rational drug design targeting protein-protein interactions is still highly challenging. Hot-spot residues are seen as the best option to target such interactions, but their identification requires detailed structural and energetic characterization, which is only available for a tiny fraction of protein interactions. Areas covered: In this review, the authors cover a variety of computational methods that have been reported for the energetic analysis of protein-protein interfaces in search of hot-spots, and the structural modeling of protein-protein complexes by docking. This can help to rationalize the discovery of small-molecule inhibitors of protein-protein interfaces of therapeutic interest. Computational analysis and docking can help to locate the interface, molecular dynamics can be used to find suitable cavities, and hot-spot predictions can focus the search for inhibitors of protein-protein interactions. Expert opinion: A major difficulty for applying rational drug design methods to protein-protein interactions is that in the majority of cases the complex structure is not available. Fortunately, computational docking can complement experimental data. An interesting aspect to explore in the future is the integration of these strategies for targeting PPIs with large-scale mutational analysis. This work has been funded by grants BIO2016-79930-R and SEV-2015-0493 from the Spanish Ministry of Economy, Industry and Competitiveness, and grant EFA086/15 from EU Interreg V POCTEFA. M Rosell is supported by an FPI fellowship from the Severo Ochoa program. The authors are grateful for the support of the Joint BSC-CRG-IRB Programme in Computational Biology. Peer Reviewed. Postprint (author's final draft).

    Many-Task Computing and Blue Waters

    This report discusses many-task computing (MTC) generically and in the context of the proposed Blue Waters system, which is planned to be the largest NSF-funded supercomputer when it begins production use in 2012. The aim of this report is to inform the BW project about MTC, including understanding aspects of MTC applications that can be used to characterize the domain and understanding the implications of these aspects for middleware and policies. Many MTC applications do not neatly fit the stereotypes of high-performance computing (HPC) or high-throughput computing (HTC) applications. Like HTC applications, by definition MTC applications are structured as graphs of discrete tasks, with explicit input and output dependencies forming the graph edges. However, MTC applications have significant features that distinguish them from typical HTC applications. In particular, different engineering constraints for hardware and software must be met in order to support these applications. HTC applications have traditionally run on platforms such as grids and clusters, through either workflow systems or parallel programming systems. MTC applications, in contrast, will often demand a short time to solution, may be communication intensive or data intensive, and may comprise very short tasks. Therefore, hardware and software for MTC must be engineered to support the additional communication and I/O and must minimize task dispatch overheads. The hardware of large-scale HPC systems, with its high degree of parallelism and support for intensive communication, is well suited for MTC applications. However, HPC systems often lack a dynamic resource-provisioning feature, are not ideal for task communication via the file system, and have an I/O system that is not optimized for MTC-style applications. Hence, additional software support is likely to be required to gain full benefit from the HPC hardware.
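
The description of an MTC application as a graph of discrete tasks whose edges are input and output dependencies can be sketched as follows. The task names and the helper `dispatch_waves` are hypothetical and are not taken from the report; the sketch only shows how a dispatcher could release tasks once all of their inputs exist, using the standard-library `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical task graph: each task maps to the set of tasks whose output it consumes.
dependencies = {
    "align": {"fetch"},
    "filter": {"fetch"},
    "merge": {"align", "filter"},
    "report": {"merge"},
}


def dispatch_waves(deps):
    """Group tasks into waves; every task in a wave has all of its inputs available."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks that can run right now
        waves.append(ready)
        sorter.done(*ready)  # mark their outputs as available
    return waves


waves = dispatch_waves(dependencies)
print(waves)
```

Tasks within one wave are independent of each other and could run in parallel. With very short tasks, the cost of each dispatch step is the overhead the report says must be minimized.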

    A Factor Graph Approach to Automated GO Annotation

    As the volume of genomic data grows, computational methods become essential for providing a first glimpse into gene annotations. Automated Gene Ontology (GO) annotation methods based on hierarchical ensemble classification techniques are particularly interesting when interpretability of annotation results is a main concern. In these methods, raw GO-term predictions computed by base binary classifiers are leveraged by checking the consistency of predefined GO relationships. Both formal leveraging strategies, with main focus on annotation precision, and heuristic alternatives, with main focus on scalability issues, have been described in the literature. In this contribution, a factor graph approach to the hierarchical ensemble formulation of the automated GO annotation problem is presented. In this formal framework, a core factor graph is first built based on the GO structure and then enriched to take into account the noisy nature of GO-term predictions. Hence, starting from raw GO-term predictions, an iterative message passing algorithm between nodes of the factor graph is used to compute marginal probabilities of target GO-terms. Evaluations on Saccharomyces cerevisiae, Arabidopsis thaliana and Drosophila melanogaster protein sequences from the GO Molecular Function domain showed significant improvements over competing approaches, even when protein sequences were naively characterized by their physicochemical and secondary structure properties or when loose noisy annotation datasets were considered. Based on these promising results and using Arabidopsis thaliana annotation data, we extend our approach to the identification of most promising molecular function annotations for a set of proteins of unknown function in Solanum lycopersicum.
    Affiliation: Spetale, Flavio Ezequiel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas. Universidad Nacional de Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas; Argentina. Affiliation: Krsticevic, Flavia Jorgelina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas. Universidad Nacional de Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas; Argentina. Affiliation: Roda, Fernando. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas. Universidad Nacional de Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas; Argentina. Affiliation: Bulacio, Pilar Estela. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas. Universidad Nacional de Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas; Argentina.
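
As a rough illustration of how message passing turns raw GO-term scores into consistent marginals, here is a minimal sketch under strong simplifying assumptions. The GO terms form a single chain from general to specific, each raw score is used directly as a unary factor, and one hard consistency factor forbids a child term being on while its parent is off. The paper's factor graph follows the full GO structure and adds a noise model, neither of which is reproduced here. The function names and scores are hypothetical.

```python
def consistent(parent, child):
    """Consistency factor: a child term cannot be on while its parent is off."""
    return 0.0 if (child == 1 and parent == 0) else 1.0


def chain_marginals(scores):
    """Sum-product on a chain of GO terms ordered from general to specific.

    scores[i] is the raw classifier probability that term i applies.
    Returns P(term i = 1) for every term under the consistency constraints.
    """
    n = len(scores)
    phi = [(1.0 - p, p) for p in scores]  # unary factors from raw predictions
    fwd = [[1.0, 1.0] for _ in range(n)]  # messages arriving from the parent side
    bwd = [[1.0, 1.0] for _ in range(n)]  # messages arriving from the child side
    for i in range(1, n):
        for x in (0, 1):
            fwd[i][x] = sum(fwd[i - 1][y] * phi[i - 1][y] * consistent(y, x) for y in (0, 1))
    for i in range(n - 2, -1, -1):
        for x in (0, 1):
            bwd[i][x] = sum(bwd[i + 1][y] * phi[i + 1][y] * consistent(x, y) for y in (0, 1))
    marginals = []
    for i in range(n):
        belief = [fwd[i][x] * phi[i][x] * bwd[i][x] for x in (0, 1)]
        marginals.append(belief[1] / (belief[0] + belief[1]))
    return marginals


# Hypothetical raw scores for a general term, an intermediate term and a specific term.
raw_scores = [0.6, 0.3, 0.8]
marginals = chain_marginals(raw_scores)
print(marginals)
```

On a chain, one forward and one backward pass give exact marginals. In this toy case the confident score for the specific term raises the belief in its ancestors, and the resulting marginals never increase from general to specific.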

    A Curve Shaped Description of Large Networks, with an Application to the Evaluation of Network Models

    BACKGROUND: Understanding the structure of complex networks is a continuing challenge, which calls for novel approaches and models to capture their structure and reveal the mechanisms that shape the networks. Although various topological measures, such as degree distributions or clustering coefficients, have been proposed to characterize network structure from many different angles, a comprehensive and intuitive representation of large networks that allows quantitative analysis is still difficult to achieve. METHODOLOGY/PRINCIPAL FINDINGS: Here we propose a mesoscopic description of large networks which associates networks of different structures with a set of particular curves, using breadth-first search. After deriving the expressions of the curves of the random graphs and a small-world-like network, we found that the curves capture several network properties at once, including the size of the giant component and the local clustering. The curve can also be used to evaluate the fit of network models to real-world networks. We describe a simple evaluation method based on the curve and apply it to the Drosophila melanogaster protein interaction network. The evaluation method effectively identifies which of the given models better reproduces the topology of the real network and helps infer the underlying growth mechanisms of the Drosophila network. CONCLUSIONS/SIGNIFICANCE: This curve-shaped description of large networks offers a wealth of possibilities to develop new approaches and applications including network characterization, comparison, classification, modeling and model evaluation, in contrast to relying on a large bag of topological measures.
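
The abstract does not define the curve itself, so the following is only one plausible reading, offered as a sketch: count how many nodes breadth-first search reaches for the first time at each depth from a chosen source. The toy graph and the function name `bfs_layer_sizes` are assumptions for illustration.

```python
from collections import deque


def bfs_layer_sizes(adjacency, source):
    """Number of nodes first reached at each BFS depth, starting from source."""
    depth = {source: 0}
    queue = deque([source])
    sizes = [1]  # depth 0 holds only the source
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in depth:
                depth[neighbour] = depth[node] + 1
                if depth[neighbour] == len(sizes):
                    sizes.append(0)  # open a new layer
                sizes[depth[neighbour]] += 1
                queue.append(neighbour)
    return sizes


# Toy undirected graph as adjacency lists; "f" is isolated.
graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
    "f": [],
}
curve = bfs_layer_sizes(graph, "a")
print(curve)
```

The layer sizes sum to the size of the connected component that contains the source, which is one way a curve of this kind can carry component-size information.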

    Folding of small disulfide-rich proteins : clarifying the puzzle

    Research Excellence Award, Experimental Sciences category, 2008. The process by which small proteins fold to their native conformations has been intensively studied over the last few decades. In this field, the particular chemistry of disulfide bond formation has facilitated the characterization of the oxidative folding of numerous small, disulfide-rich proteins with results that illustrate a high diversity of folding mechanisms, differing in the heterogeneity and disulfide pairing nativeness of their intermediates. In this review, we combine information on the folding of different protein models together with the recent structural determinations of major intermediates to provide new molecular clues in oxidative folding. We also analyze the role of disulfide bonds in misfolding and protein aggregation and their implications in amyloidosis and conformational diseases.

    Encounter complexes and dimensionality reduction in protein-protein association

    An outstanding challenge has been to understand the mechanism whereby proteins associate. We report here the results of exhaustively sampling the conformational space in protein–protein association using a physics-based energy function. The agreement between experimental intermolecular paramagnetic relaxation enhancement (PRE) data and the PRE profiles calculated from the docked structures shows that the method captures both specific and non-specific encounter complexes. To explore the energy landscape in the vicinity of the native structure, the nonlinear manifold describing the relative orientation of two solid bodies is projected onto a Euclidean space in which the shape of low energy regions is studied by principal component analysis. Results show that the energy surface is canyon-like, with a smooth funnel within a two-dimensional subspace capturing over 75% of the total motion. Thus, proteins tend to associate along preferred pathways, similar to the sliding of a protein along DNA in the process of protein-DNA recognition.

    Heterologous expression and functional characterization of a GH10 endoxylanase from Aspergillus fumigatus var. niveus with potential biotechnological application

    Xylanases decrease the xylan content in pretreated biomass by releasing it from hemicellulose, thus improving the accessibility of cellulose for cellulases. In this work, an endo-β-1,4-xylanase from Aspergillus fumigatus var. niveus (AFUMN-GH10) was successfully expressed. The structural analysis and biochemical characterization showed that AFUMN-GH10 does not contain a carbohydrate-binding module. The enzyme retained its activity in a pH range from 4.5 to 7.0, with an optimal temperature of 60°C. AFUMN-GH10 showed the highest activity on beechwood xylan. The mode of action of AFUMN-GH10 was investigated by hydrolysis of APTS-labeled xylohexaose, which resulted in xylotriose and xylobiose as the main products. AFUMN-GH10 released 27% of residual xylan from hydrothermally-pretreated corn stover and 14% of residual xylan from hydrothermally-pretreated sugarcane bagasse. The results showed that environmentally friendly pretreatment followed by enzymatic hydrolysis with AFUMN-GH10 in low concentration is a suitable method to remove part of residual and recalcitrant hemicellulose from biomass.