    AGMIAL: implementing an annotation strategy for prokaryote genomes as a distributed system

    We have implemented a genome annotation system for prokaryotes called AGMIAL. Our approach embodies a number of key principles. First, expert manual annotators are seen as a critical component of the overall system; user interfaces were cyclically refined to satisfy their needs. Second, the overall process should be orchestrated in terms of a global annotation strategy; this facilitates coordination between a team of annotators and automatic data analysis. Third, the annotation strategy should allow progressive and incremental annotation from a time when only a few draft contigs are available, to when a final finished assembly is produced. The overall architecture employed is modular and extensible, being based on the W3 standard Web services framework. Specialized modules interact with two independent core modules that are used to annotate, respectively, genomic and protein sequences. AGMIAL is currently being used by several INRA laboratories to analyze genomes of bacteria relevant to the food-processing industry, and is distributed under an open source license.

    MoKCa database - mutations of kinases in cancer

    Members of the protein kinase family are amongst the most commonly mutated genes in human cancer, and both mutated and activated protein kinases have proved to be tractable targets for the development of new anticancer therapies. The MoKCa database (Mutations of Kinases in Cancer, http://strubiol.icr.ac.uk/extra/mokca) has been developed to structurally and functionally annotate, and where possible predict, the phenotypic consequences of mutations in protein kinases implicated in cancer. Somatic mutation data from tumours and tumour cell lines have been mapped onto the crystal structures of the affected protein domains. Positions of the mutated amino acids are highlighted on a sequence-based domain pictogram, as well as on a 3D image of the protein structure, and in a molecular graphics package integrated for interactive viewing. The data associated with each mutation are presented in the Web interface, along with expert annotation of the detailed molecular functional implications of the mutation. Proteins are linked to functional annotation resources and are annotated with structural and functional features such as domains and phosphorylation sites. MoKCa aims to provide assessments available from multiple sources and algorithms for each potential cancer-associated mutation, and to present these together in a consistent and coherent fashion to facilitate authoritative annotation by cancer biologists and structural biologists directly involved in the generation and analysis of new mutational data.

    Domain discovery method for topological profile searches in protein structures

    We describe a method for automated domain discovery for topological profile searches in protein structures. The method is used in the TOPStructure system for fast prediction of CATH classification for protein structures (given as PDB files). It is important for profile searches in multi-domain proteins, for which the profile method by itself tends to perform poorly. We also present an $O(C(n)k + nk^2)$ time algorithm for this problem, compared to the $O(C(n)k + (nk)^2)$ time used by a trivial algorithm (where $n$ is the length of the structure, $k$ is the number of profiles and $C(n)$ is the time needed to check for the presence of a given motif in a structure of length $n$). This method has been developed and is currently used for TOPS representations of protein structures and prediction of CATH classification, but may be applied to other graph-based representations of protein or RNA structures and/or other prediction problems. A protein structure prediction system incorporating the domain discovery method is available at http://bioinf.mii.lu.lv/tops/

    Recent improvements to the PROSITE database

    The PROSITE database consists of a large collection of biologically meaningful signatures that are described as patterns or profiles. Each signature is linked to documentation that provides useful biological information on the protein family, domain or functional site identified by the signature. The PROSITE web page has been redesigned and several tools have been implemented to help the user discover new conserved regions in their own proteins and to visualize domain arrangements. We also introduced the facility to search PDB with a PROSITE entry or a user's pattern and visualize matched positions on 3D structures. The latest version of PROSITE (release 18.17 of November 30, 2003) contains 1676 entries. The database is accessible at http://www.expasy.org/prosit
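To make the idea of searching with "a user's pattern" concrete, the following is a minimal sketch that translates a PROSITE-style pattern into a Python regular expression and scans a sequence with it. The function names `prosite_to_regex` and `find_pattern` are hypothetical, the example sequence is invented for illustration, and the translation covers only the common syntax elements (`x`, `[...]`, `{...}`, `(n)`, `(n,m)`, and `<`/`>` anchors at the pattern ends). It is not the implementation used by the PROSITE tools.

```python
import re
from typing import List, Tuple


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into a Python regex string.

    Assumption: only the common syntax is handled; anchors inside square
    brackets and other rarer forms are not supported in this sketch.
    """
    pattern = pattern.strip().rstrip(".")  # patterns conventionally end with '.'
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):        # N-terminal anchor
            prefix, element = "^", element[1:]
        if element.endswith(">"):          # C-terminal anchor
            suffix, element = "$", element[:-1]
        # Split the element into its core and an optional (n) or (n,m) repeat.
        core, low, high = re.fullmatch(
            r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element
        ).groups()
        if core == "x":                    # any residue
            core_re = "."
        elif core.startswith("{") and core.endswith("}"):
            core_re = "[^" + core[1:-1] + "]"  # excluded residues
        else:
            core_re = core                 # single residue or [ABC] set
        repeat = ""
        if low:
            repeat = "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + core_re + repeat + suffix)
    return "".join(parts)


def find_pattern(pattern: str, sequence: str) -> List[Tuple[int, str]]:
    """Return (0-based start, matched text) for non-overlapping matches."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(sequence)]


if __name__ == "__main__":
    # N-{P}-[ST]-{P} is the N-glycosylation site pattern (PROSITE PS00001).
    print(prosite_to_regex("N-{P}-[ST]-{P}."))
    print(find_pattern("N-{P}-[ST]-{P}.", "AANGSAANPSA"))
```

Because `re.finditer` reports non-overlapping matches only, a production scanner would need a lookahead-based search to report overlapping sites as well.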

    A Molecular Biology Database Digest

    Computational Biology or Bioinformatics has been defined as the application of mathematical and Computer Science methods to solving problems in Molecular Biology that require large scale data, computation, and analysis [18]. As expected, Molecular Biology databases play an essential role in Computational Biology research and development. This paper introduces current Molecular Biology databases, stressing data modeling, data acquisition, data retrieval, and the integration of Molecular Biology data from different sources. This paper is primarily intended for an audience of computer scientists with a limited background in Biology.

    Pattern matching and pattern discovery algorithms for protein topologies

    We describe algorithms for pattern matching and pattern learning in TOPS diagrams (formal descriptions of protein topologies). These problems can be reduced to checking for subgraph isomorphism and finding maximal common subgraphs in a restricted class of ordered graphs. We have developed a subgraph isomorphism algorithm for ordered graphs, which performs well on the given set of data. The maximal common subgraph problem is then solved by repeated subgraph extension and checking for isomorphisms. Despite its apparent inefficiency, this approach gives an algorithm with time complexity proportional to the number of graphs in the input set and is still practical on the given set of data. As a result we obtain fast methods which can be used for building a database of protein topological motifs, and for the comparison of a given protein of known secondary structure against a motif database.
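The reduction to subgraph isomorphism on ordered graphs can be illustrated with a small backtracking sketch. It assumes a simplified encoding that is not taken from the paper: each graph is a list of vertex labels in sequence order (for example `"E"` for a strand and `"H"` for a helix) plus a list of labeled edges `(a, b, kind)` with `a < b`. The function name `ordered_subgraph_match` and the edge kinds `"A"` and `"P"` are hypothetical, and this plain backtracking search is a stand-in for, not a reproduction of, the authors' algorithm.

```python
from typing import List, Optional, Tuple

Edge = Tuple[int, int, str]  # (earlier vertex, later vertex, edge kind)


def ordered_subgraph_match(
    p_labels: List[str],
    p_edges: List[Edge],
    t_labels: List[str],
    t_edges: List[Edge],
) -> Optional[List[int]]:
    """Find an order-preserving embedding of a pattern into a target graph.

    Returns a list mapping pattern vertex i to a target vertex index, or
    None if no embedding exists. Labels must match and every pattern edge
    must map onto a target edge of the same kind.
    """
    target_edges = set(t_edges)
    mapping: List[int] = []

    def consistent(i: int, j: int) -> bool:
        # Every pattern edge ending at i must exist between the mapped vertices.
        return all(
            (mapping[a], j, kind) in target_edges
            for (a, b, kind) in p_edges
            if b == i
        )

    def extend(i: int, start: int) -> bool:
        if i == len(p_labels):
            return True
        # Keep enough target vertices free for the remaining pattern vertices.
        last = len(t_labels) - (len(p_labels) - i)
        for j in range(start, last + 1):
            if t_labels[j] == p_labels[i] and consistent(i, j):
                mapping.append(j)
                if extend(i + 1, j + 1):  # vertex order is preserved
                    return True
                mapping.pop()
        return False

    return list(mapping) if extend(0, 0) else None


if __name__ == "__main__":
    # Pattern: two strands joined by an edge of (hypothetical) kind "A".
    pattern = (["E", "E"], [(0, 1, "A")])
    target = (["E", "H", "E", "E"], [(0, 2, "P"), (2, 3, "A")])
    print(ordered_subgraph_match(*pattern, *target))
```

Requiring the mapping to preserve vertex order is what restricts the search relative to general subgraph isomorphism: each pattern vertex can only be placed after the previous one, so candidate positions shrink as the search proceeds.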

    Molecular mechanisms of the non-coenzyme action of thiamin in brain. Biochemical, structural and pathway analysis

    Thiamin (vitamin B1) is a pharmacological agent boosting central metabolism through the action of the coenzyme thiamin diphosphate (ThDP). However, positive effects, including improved cognition, of high thiamin doses in neurodegeneration may be observed without increased ThDP or ThDP-dependent enzymes in brain. Here, we determine protein partners and metabolic pathways where thiamin acts beyond its coenzyme role. Malate dehydrogenase, glutamate dehydrogenase and pyridoxal kinase were identified as abundant proteins binding to thiamin- or thiazolium-modified sorbents. Kinetic studies, supported by structural analysis, revealed allosteric regulation of these proteins by thiamin and/or its derivatives. Thiamin triphosphate and adenylated thiamin triphosphate activate glutamate dehydrogenase. Thiamin and ThDP regulate malate dehydrogenase isoforms and pyridoxal kinase. Thiamin regulation of enzymes related to the malate-aspartate shuttle may impact on malate/citrate exchange, responsible for exporting acetyl residues from mitochondria. Indeed, bioinformatic analyses found an association between thiamin- and thiazolium-binding proteins and the term acetylation. Our interdisciplinary study shows that thiamin is not only a coenzyme for acetyl-CoA production, but also an allosteric regulator of acetyl-CoA metabolism including regulatory acetylation of proteins and acetylcholine biosynthesis. Moreover, thiamin action in neurodegeneration may also involve neurodegeneration-related 14-3-3, DJ-1 and β-amyloid precursor proteins identified among the thiamin- and/or thiazolium-binding proteins.

    The PROSITE database, its status in 1999

    The PROSITE database (http://www.expasy.ch/sprot/prosite.html) consists of biologically significant patterns and profiles formulated in such a way that with appropriate computational tools it can help to determine to which known family of protein (if any) a new sequence belongs, or which known domain(s) it contain