
    A structural study for the optimisation of functional motifs encoded in protein sequences

    BACKGROUND: A large number of PROSITE patterns select false positives and/or miss known true positives. It is possible that, at least in some cases, the weak specificity and/or sensitivity of a pattern is due to one or more functional and/or structural key residues not being represented in the pattern. Multiple sequence alignments are commonly used to build functional sequence patterns. If residues structurally conserved in proteins sharing a function cannot be aligned in a multiple sequence alignment, they are likely to be missed in a standard pattern construction procedure. RESULTS: Here we present a new procedure aimed at improving the sensitivity and/or specificity of poorly performing patterns. The procedure can be summarised as follows: (1) residues structurally conserved in different proteins that are true positives for a pattern are identified by means of a computational technique and by visual inspection; (2) the sequence positions of the structurally conserved residues falling outside the pattern are used to build extended sequence patterns; (3) the extended patterns are optimised on the SWISS-PROT database for their sensitivity and specificity. The method was applied to eight PROSITE patterns. Whenever structurally conserved residues are found in the surface region close to the pattern (seven out of eight cases), the addition of information inferred from structural analysis is shown to improve pattern selectivity and, in some cases, sensitivity as well. In some of the cases considered, the procedure allowed the identification of functionally interesting residues, whose biological role is also discussed. CONCLUSION: Our method can be applied to any type of functional motif or pattern (not only PROSITE ones) that is not able to select all and only the true positive hits and for which at least two true positive structures are available.
The computational technique for the identification of structurally conserved residues is already available on request and will soon be accessible on our web server. The procedure is intended for pattern database curators and for scientists interested in a specific protein family for which no specific or selective patterns are yet available.
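To make the evaluation step concrete, the following is a minimal sketch of how a PROSITE-style pattern might be scored against a labelled sequence set. It is not the authors' procedure: the translation covers only the basic PROSITE syntax, the sequences, family labels and both patterns are invented toy data, and "precision" (the fraction of hits that are true positives) is used here as one possible reading of selectivity.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a basic PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        element = element.replace("{", "[^").replace("}", "]")  # {P} -> [^P]
        element = element.replace("(", "{").replace(")", "}")   # x(2,4) -> x{2,4}
        element = element.replace("x", ".")                     # x -> any residue
        element = element.replace("<", "^").replace(">", "$")   # terminal anchors
        parts.append(element)
    return "".join(parts)

def evaluate(pattern, sequences, true_positives):
    """Return (sensitivity, precision) of a pattern on a labelled sequence set."""
    regex = re.compile(prosite_to_regex(pattern))
    hits = {name for name, seq in sequences.items() if regex.search(seq)}
    tp = len(hits & true_positives)
    sensitivity = tp / len(true_positives) if true_positives else 0.0
    precision = tp / len(hits) if hits else 0.0
    return sensitivity, precision

# Toy data (hypothetical): two family members and two unrelated sequences.
sequences = {
    "famA_1": "MKCAACGSLL",
    "famA_2": "TTCGGCPTAA",
    "other_1": "MCLLCASGG",
    "other_2": "MKLLVVAGG",
}
true_positives = {"famA_1", "famA_2"}

# A short pattern, and the same pattern extended by one flanking position.
original = evaluate("C-x(2)-C-{P}-[ST].", sequences, true_positives)
extended = evaluate("K-C-x(2)-C-{P}-[ST].", sequences, true_positives)
print("original (sensitivity, precision):", original)
print("extended (sensitivity, precision):", extended)
```

In this toy set the extra position removes the unrelated hit without recovering the missed family member, which illustrates why extended patterns have to be re-optimised for both measures.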

    Query3d: a new method for high-throughput analysis of functional residues in protein structures

    BACKGROUND: The identification of local similarities between two protein structures can provide clues to a common function. Many different methods exist for searching for similar subsets of residues in proteins of known structure. However, the lack of functional and structural information on single residues, together with the low level of integration of this information in comparison methods, is a limitation that prevents these methods from being fully exploited in high-throughput analyses. RESULTS: Here we describe Query3d, a program that is both a structural DBMS (Database Management System) and a local comparison method. The program keeps a copy of all the residues of the Protein Data Bank, annotated with a variety of functional and structural information. New annotations can be easily added from a variety of methods and known databases. The algorithm makes it possible to create complex queries based on the residues' function and then to compare only subsets of the selected residues. Functional information is also essential to speed up the comparison and the analysis of the results. CONCLUSION: With Query3d, users can easily obtain statistics on how many and which residues share certain properties in all proteins of known structure. At the same time, the method also finds their structural neighbours in the whole PDB. Programs and data can be accessed through the PdbFun web interface.

    The Bioinformatics Italian Society

    The Bioinformatics Italian Society (BITS) is a non-profit scientific association founded on 19 June 2003 to gather scientists with interests in the field of Bioinformatics, understood as a multidisciplinary science that studies biological problems at the molecular level by using informatics and computational methods. The Society now has about 230 members and aims to exceed 250 in 2012.

    pdbFun: mass selection and fast comparison of annotated PDB residues

    pdbFun () is a web server for structural and functional analysis of proteins at the residue level. pdbFun gives fast access to the whole Protein Data Bank (PDB) organized as a database of annotated residues. The available data (features) range from solvent exposure to ligand binding ability, location in a protein cavity, secondary structure, residue type, sequence functional pattern, protein domain and catalytic activity. Users can select any residue subset (even including any number of PDB structures) by combining the available features. Selections can be used as probe and target in multiple structure comparison searches. For example, a search could involve, as a query, all solvent-exposed, hydrophilic residues that are not in alpha-helices and are involved in nucleotide binding. Possible targets include another selection, a single structure or a dataset composed of many structures. The output is a list of aligned structural matches offered in both tabular and graphical format.
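The example query in the abstract above can be pictured as a conjunction of per-residue feature tests. The sketch below is only an illustration of that idea and does not reflect pdbFun's actual data model or interface: the `Residue` record, its field names, the hydrophilic residue set and the toy entries are all assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Residue:
    """Hypothetical annotated-residue record (field names are assumptions)."""
    structure: str             # structure identifier (toy values below)
    number: int                # residue number in the chain
    name: str                  # three-letter residue name
    exposed: bool              # solvent exposed?
    secondary_structure: str   # "H" helix, "E" strand, "C" coil
    binds_nucleotide: bool     # annotated as nucleotide binding?

# One common choice of hydrophilic residues; other classifications exist.
HYDROPHILIC = {"ARG", "LYS", "ASP", "GLU", "ASN", "GLN", "HIS", "SER", "THR"}

def select_query(residues):
    """Solvent-exposed, hydrophilic, non-helical, nucleotide-binding residues."""
    return [
        r for r in residues
        if r.exposed
        and r.name in HYDROPHILIC
        and r.secondary_structure != "H"
        and r.binds_nucleotide
    ]

# Toy annotations (invented for illustration only).
residues = [
    Residue("toy1", 15, "LYS", True, "C", True),    # passes every test
    Residue("toy1", 16, "SER", True, "H", True),    # rejected: in a helix
    Residue("toy1", 40, "LEU", True, "E", True),    # rejected: not hydrophilic
    Residue("toy2", 88, "THR", False, "C", True),   # rejected: buried
    Residue("toy2", 90, "ARG", True, "E", False),   # rejected: no nucleotide binding
]

selection = select_query(residues)
print([(r.structure, r.number, r.name) for r in selection])
```

A selection built this way could then serve as the probe or the target of a structural comparison, which is the second half of the workflow the abstract describes.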

    Variation in the co-expression profile highlights a loss of miRNA-mRNA regulation in multiple cancer types

    Recent research provides insight into the ability of miRNA to regulate various pathways in several cancer types. Despite their involvement in the regulation of the mRNA via targeting the 3'UTR, there are relatively few studies examining the changes in these regulatory mechanisms specific to single cancer types or shared between different cancer types. We analyzed samples where both miRNA and mRNA expression had been measured and performed a thorough correlation analysis on 7494 experimentally validated human miRNA-mRNA target-gene pairs in both healthy and tumoral samples. We show that more than 90% of these miRNA-mRNA interactions show a loss of regulation in the tumoral samples compared with their healthy counterparts. As expected, we found shared miRNA-mRNA dysregulated pairs among different tumors of the same tissue. However, anatomically different cancers also share multiple dysregulated interactions, suggesting that some cancer-related mechanisms are not tumor-specific. A total of 2865 unique miRNA-mRNA pairs were identified across 13 cancer types; approximately 40% of these pairs showed a loss of correlation in the tumoral samples in at least 2 out of the 13 analyzed cancers. Specifically, the miR-200 family, miR-155 and miR-1 were identified, based on the computational analysis described below, as the miRNAs that potentially lose the highest number of interactions across different samples (only literature-based interactions were used for this analysis). Moreover, the miR-34a/ALDH2 and miR-9/MTHFD2 pairs show a switch in their correlation between healthy and tumor kidney samples, suggesting a possible change in the regulation exerted by the miRNAs. Interestingly, the expression of these mRNAs is also associated with overall survival.
The disruption of miRNA regulation on its target therefore suggests the possible involvement of these pairs in malignant cell functions. The analysis reported here shows how the regulation of miRNA-mRNA interactions strongly differs between healthy and tumoral cells, based on the strong variation in correlation between a miRNA and its target that we obtained by analyzing the expression data of healthy and tumor tissue in highly reliable miRNA-target pairs. Finally, a GO term enrichment analysis shows that the critical pairs identified are involved in cellular adhesion, proliferation, and migration.
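A loss of correlation of the kind described above can be illustrated by comparing the correlation coefficient of one miRNA-mRNA pair in two sample groups. This is a minimal sketch under stated assumptions: the abstract does not say which correlation measure or threshold was used, so Pearson correlation is assumed, and the expression values are invented.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Toy expression values for one hypothetical miRNA-mRNA pair (arbitrary units).
healthy_mirna = [1, 2, 3, 4, 5]
healthy_mrna = [10, 8, 6, 4, 2]   # target falls as the miRNA rises
tumor_mirna = [1, 2, 3, 4, 5]
tumor_mrna = [5, 7, 4, 7, 5]      # no clear relationship

r_healthy = pearson(healthy_mirna, healthy_mrna)
r_tumor = pearson(tumor_mirna, tumor_mrna)

# A strongly negative correlation in healthy tissue that moves towards zero
# in tumor tissue is read here as a loss of regulation for this pair.
print(f"healthy r = {r_healthy:.2f}, tumor r = {r_tumor:.2f}")
print(f"change in correlation = {r_tumor - r_healthy:.2f}")
```

A rank-based measure such as Spearman correlation would be a reasonable alternative when expression values are not approximately linear in each other.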

    COTAN: scRNA-seq data analysis based on gene co-expression

    Estimating the co-expression of cell identity factors in single cells is crucial. Due to the low efficiency of scRNA-seq methodologies, sensitive computational approaches are critical to accurately infer transcription profiles in a cell population. We introduce COTAN, a statistical and computational method to analyze the co-expression of gene pairs at the single-cell level, providing the foundation for single-cell gene interactome analysis. The basic idea is to study the distribution of zero UMI counts instead of focusing on positive counts; this is done with a generalized contingency tables framework. COTAN can assess the correlated or anti-correlated expression of gene pairs, providing a new correlation index with an approximate p-value for the associated test of independence. COTAN can evaluate whether single genes are differentially expressed, scoring them with a newly defined global differentiation index. Similarly to correlation network analysis, it provides ways to plot and cluster genes according to their co-expression pattern with other genes, effectively helping the study of gene interactions and making it a new tool to identify cell-identity markers. We assayed COTAN on two neural development datasets with very promising results. COTAN is an R package that complements the traditional single-cell RNA-seq analysis and it is available at https://github.com/seriph78/COTAN
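The idea of working with zero counts can be sketched with an ordinary 2x2 contingency table that records, for each cell, whether each of two genes has a zero or a positive count. This is a deliberate simplification and not COTAN's method: COTAN uses a generalized contingency tables framework with its own correlation index, whereas the sketch below applies a plain Pearson chi-square test of independence to invented counts.

```python
import math

def zero_table(counts_a, counts_b):
    """2x2 table indexed by [gene A detected?][gene B detected?] across cells."""
    table = [[0, 0], [0, 0]]
    for a, b in zip(counts_a, counts_b):
        table[int(a > 0)][int(b > 0)] += 1
    return table

def chi_square_independence(table):
    """Pearson chi-square statistic and p-value (1 degree of freedom).

    Assumes every row and column total is non-zero.
    """
    n = sum(map(sum, table))
    row = [sum(r) for r in table]
    col = [table[0][j] + table[1][j] for j in range(2)]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Toy UMI counts for two hypothetical genes in eight cells.
gene_a = [0, 0, 0, 0, 3, 2, 5, 1]
gene_b = [0, 0, 0, 1, 2, 4, 0, 3]

table = zero_table(gene_a, gene_b)
stat, p_value = chi_square_independence(table)
print("table:", table)
print(f"chi-square = {stat:.3f}, p = {p_value:.3f}")
```

With so few cells the approximation is poor; the point is only that joint absence and joint presence of two genes carry co-expression information even when the positive counts themselves are noisy.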

    Relative Information Gain: Shannon entropy-based measure of the relative structural conservation in RNA alignments

    Structural characterization of RNAs is a dynamic field, offering many modelling possibilities. RNA secondary structure models are usually characterized by an encoding that depicts structural information of the molecule through string representations or graphs. In this work, we provide a generalization of the BEAR encoding (a context-aware structural encoding we previously developed) by expanding the set of alignments used for the construction of substitution matrices and then applying it to secondary structure encodings ranging from fine-grained to more coarse-grained representations. We also introduce a re-interpretation of the Shannon Information applied on RNA alignments, proposing a new scoring metric, the Relative Information Gain (RIG). The RIG score is available for any position in an alignment, showing how different levels of detail encoded in the RNA representation can contribute differently to convey structural information. The approaches presented in this study can be used alongside state-of-the-art tools to synergistically gain insights into the structural elements that RNAs and RNA families are composed of. This additional information could potentially contribute to their improvement or increase the degree of confidence in the secondary structure of families and any set of aligned RNAs.
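The abstract does not give the RIG formula, so the sketch below shows only the standard Shannon information content per alignment column, the quantity that sequence logos plot and that entropy-based conservation scores build on. The four-letter structural alphabet and the aligned strings are hypothetical and are not the BEAR encoding.

```python
import math
from collections import Counter

def column_information(alignment, alphabet_size):
    """Information content (bits) of each column: log2(alphabet size) - entropy."""
    info = []
    for column in zip(*alignment):
        counts = Counter(column)
        total = len(column)
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
        info.append(math.log2(alphabet_size) - entropy)
    return info

# Toy alignment over a hypothetical 4-symbol structural alphabet:
# S = stem, L = loop, I = internal loop, B = bulge.
alignment = [
    "SSLL",
    "SSLI",
    "SSLB",
    "SSLS",
]

info = column_information(alignment, alphabet_size=4)
for position, bits in enumerate(info, start=1):
    print(f"column {position}: {bits:.2f} bits")
```

Fully conserved columns reach the maximum of log2(4) = 2 bits, while a column in which every symbol differs carries no information; a coarser alphabet lowers the maximum, which is one way the level of detail in the encoding affects such a score.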

    3dLOGO: a web server for the identification, analysis and use of conserved protein substructures

    3dLOGO is a web server for the identification and analysis of conserved protein 3D substructures. Given a set of residues in a PDB (Protein Data Bank) chain, the server detects the matching substructure(s) in a set of user-provided protein structures, generates a multiple structure alignment centered on the input substructures and highlights other residues whose structural conservation becomes evident after the defined superposition. Conserved residues are proposed to the user for highlighting functional areas, deriving refined structural motifs or building sequence patterns. Residue structural conservation can be visualized through an expressly designed Java application, 3dProLogo, which is a 3D implementation of a sequence logo. The 3dLOGO server, with related documentation, is available at http://3dlogo.uniroma2.it

    BRIO: a web server for RNA sequence and structure motif scan

    The interaction between RNA and RNA-binding proteins (RBPs) has a key role in the regulation of gene expression, in RNA stability, and in many other biological processes. RBPs accomplish these functions by binding target RNA molecules through specific sequence and structure motifs. The identification of these binding motifs is therefore fundamental to improve our knowledge of the cellular processes and how they are regulated. Here, we present BRIO (BEAM RNA Interaction mOtifs), a new web server designed for the identification of sequence and structure RNA-binding motifs in one or more RNA molecules of interest. BRIO enables the user to scan over 2508 sequence motifs and 2296 secondary structure motifs identified in Homo sapiens and Mus musculus, in three different types of experiments (PAR-CLIP, eCLIP, HITS). The motifs are associated with the binding of 186 RBPs and 69 protein domains. The web server is freely available at http://brio.bio.uniroma2.it