154 research outputs found

Uncovering metabolic pathways relevant to phenotypic traits of microbial genomes

Author: Gasteiger Johann
Kastenmüller Gabi
Mewes Hans-Werner
Schenk Maria Elisabeth
Publication venue: BioMed Central
Publication date: 01/01/2009

A new machine learning-based method is presented here for the identification of metabolic pathways related to specific phenotypes in multiple microbial genomes

The PEDANT genome database in 2005

Author: Frishman Dmitrij
Mewes Hans-Werner
Riley M. Louise
Schmidt Thorsten
Wagner Christian
Publication venue: Oxford University Press
Publication date: 17/12/2004

The PEDANT genome database (http://pedant.gsf.de) contains pre-computed bioinformatics analyses of publicly available genomes. Its main mission is to provide robust automatic annotation of the vast majority of amino acid sequences, which have not been subjected to in-depth manual curation by human experts in high-quality protein sequence databases. By design, PEDANT annotation is genome-oriented, making it possible to explore the genomic context of gene products and to evaluate the functional and structural content of genomes using a category-based query mechanism. At present, the PEDANT database contains exhaustive annotation of over 1 240 000 proteins from 270 eubacterial, 23 archaeal and 41 eukaryotic genomes

SIMAP: the similarity matrix of proteins

Author: Arnold Roland
Lindner Dominik
Mewes H. Werner
Rattei Thomas
Stümpflen Volker
Tischler Patrick
Publication venue: Oxford University Press
Publication date: 01/01/2005

Similarity Matrix of Proteins (SIMAP) provides a database based on a pre-computed similarity matrix covering the similarity space formed by >4 million amino acid sequences from public databases and completely sequenced genomes. The database is capable of handling very large datasets and is updated incrementally. For sequence similarity searches and pairwise alignments, we implemented a grid-enabled software system, which is based on FASTA heuristics and the Smith–Waterman algorithm. Our ProtInfo system allows querying by protein sequences covered by the SIMAP dataset as well as by fragments of these sequences, highly similar sequences and title words. Each sequence in the database is supplemented with pre-calculated features generated by detailed sequence analyses. By providing WWW interfaces as well as web-services, we offer the SIMAP resource as an efficient and comprehensive tool for sequence similarity searches

MPact: the MIPS protein interaction resource on yeast

Author: Güldener Ulrich
Mewes Hans-Werner
Münsterkötter Martin
Oesterheld Matthias
Pagel Philipp
Ruepp Andreas
Stümpflen Volker
Publication venue: Oxford University Press
Publication date: 28/12/2005

In recent years, the Munich Information Center for Protein Sequences (MIPS) yeast protein–protein interaction (PPI) dataset has been used in numerous analyses of protein networks and has been called a gold standard because of its quality and comprehensiveness [H. Yu, N. M. Luscombe, H. X. Lu, X. Zhu, Y. Xia, J. D. Han, N. Bertin, S. Chung, M. Vidal and M. Gerstein (2004) Genome Res., 14, 1107–1118]. MPact and the yeast protein localization catalog provide information related to the proximity of proteins in yeast. Besides the integration of high-throughput data, information about experimental evidence for PPIs in the literature was compiled by experts, adding up to 4300 distinct PPIs connecting 1500 proteins in yeast. As the interaction data are a complementary part of CYGD, interactive mapping of data onto other integrated data types, such as the functional classification catalog [A. Ruepp, A. Zollner, D. Maier, K. Albermann, J. Hani, M. Mokrejs, I. Tetko, U. Güldener, G. Mannhaupt, M. Münsterkötter and H. W. Mewes (2004) Nucleic Acids Res., 32, 5539–5545], is possible. A survey of signaling proteins and a comparison with pathway data from KEGG demonstrate that, based on these manually annotated data alone, an extensive overview of the complexity of this functional network can be obtained in yeast. The implementation of a web-based PPI-analysis tool allows analysis and visualization of protein interaction networks and facilitates integration of our curated data with high-throughput datasets. The complete dataset as well as user-defined sub-networks can be retrieved easily in the standardized PSI-MI format. The resource can be accessed through

SIMAP--the database of all-against-all protein sequence similarities and annotations with new interfaces and increased coverage

Author: Arnold Roland
Goldenberg Florian
Mewes Hans-Werner
Rattei Thomas
Publication venue: Oxford University Press (OUP)
Publication date: 26/10/2013

The Similarity Matrix of Proteins (SIMAP, http://mips.gsf.de/simap/) database has been designed to massively accelerate computationally expensive protein sequence analysis tasks in bioinformatics. It provides pre-calculated sequence similarities interconnecting the entire known protein sequence universe, complemented by pre-calculated protein features and domains, similarity clusters and functional annotations. SIMAP covers all major public protein databases as well as many consistently re-annotated metagenomes from different repositories. As of September 2013, SIMAP contains >163 million proteins corresponding to ∼70 million non-redundant sequences. SIMAP uses the sensitive FASTA search heuristics, the Smith–Waterman alignment algorithm, the InterPro database of protein domain models and the BLAST2GO functional annotation algorithm. SIMAP assists biologists by facilitating the interactive exploration of the protein sequence universe. Web-Service and DAS interfaces allow connecting SIMAP with any other bioinformatic tool and resource. All-against-all protein sequence similarity matrices of project-specific protein collections are generated on request. Recent improvements allow SIMAP to cover the rapidly growing sequenced protein sequence universe. New Web-Service interfaces enhance the connectivity of SIMAP. Novel tools for interactive extraction of protein similarity networks have been added. Open access to SIMAP is provided through the web portal; the portal also contains instructions and links for software access and flat file downloads

OREST: the online resource for EST analysis

Author: Andreas Ruepp
Brigitte Waegele
H. Werner Mewes
Thorsten Schmidt
Publication venue: Oxford University Press
Publication date: 01/01/2008

The generation of expressed sequence tag (EST) libraries offers an affordable approach to investigating organisms for which no genome sequence is available. OREST (http://mips.gsf.de/genre/proj/orest/index.html) is a server-based EST analysis pipeline that allows the rapid analysis of large amounts of ESTs or cDNAs from mammals and fungi. To assign the ESTs to genes or proteins, OREST maps DNA sequences to reference datasets of gene products and, in a second step, to complete genome sequences. Mapping against genome sequences recovers an additional 13% of EST data, which would otherwise escape further analysis. To enable functional analysis of the datasets, ESTs are functionally annotated using the hierarchical FunCat annotation scheme as well as GO annotation terms. OREST also allows prediction of associations between gene products and diseases by Morbid Map (OMIM) classification. A statistical analysis of the results for a dataset is possible with the included PROMPT software, which provides information about enrichment and depletion of functional and disease annotation terms. OREST was successfully applied to the identification and functional characterization of more than 3000 EST sequences of the common marmoset monkey (Callithrix jacchus) as part of an international collaboration

FGDB: a comprehensive fungal genome resource on the plant pathogen Fusarium graminearum

Author: Adam Gerhard
Güldener Ulrich
Haase Dirk
Mannhaupt Gertrud
Mewes Hans-Werner
Münsterkötter Martin
Oesterheld Matthias
Stümpflen Volker
Publication venue: Oxford University Press
Publication date: 28/12/2005

The MIPS Fusarium graminearum Genome Database (FGDB) is a comprehensive genome database on one of the most devastating fungal plant pathogens of wheat and barley. FGDB provides information on two gene sets independently derived by automated annotation of the F. graminearum genome sequence. A complete, manually revised gene set will be available in the near future. The initial results of systematic manual correction of gene calls are already part of the current gene set. The database can be accessed to retrieve information from bioinformatics analyses and functional classifications of the proteins. The data are also organized in the well-established MIPS catalogs, and novel query techniques are available to search the data. The comprehensive set of gene calls was also used for the design of an Affymetrix GeneChip. The resource is accessible on

Spatiotemporal Expression Control Correlates with Intragenic Scaffold Matrix Attachment Regions (S/MARs) in Arabidopsis thaliana

Author: Blake Meyers
Georg Haberer
Hans-Werner Mewes
Igor V Tetko
Klaus F. X Mayer
Philip E Bourne
Stephen Rudd
Publication venue: Public Library of Science
Publication date: 01/01/2006

Scaffold/matrix attachment regions (S/MARs) are essential for structural organization of the chromatin within the nucleus and serve as anchors of chromatin loop domains. A significant fraction of genes in Arabidopsis thaliana contains intragenic S/MAR elements and a significant correlation of S/MAR presence and overall expression strength has been demonstrated. In this study, we undertook a genome scale analysis of expression level and spatiotemporal expression differences in correlation with the presence or absence of genic S/MAR elements. We demonstrate that genes containing intragenic S/MARs are prone to pronounced spatiotemporal expression regulation. This characteristic is found to be even more pronounced for transcription factor genes. Our observations illustrate the importance of S/MARs in transcriptional regulation and the role of chromatin structural characteristics for gene regulation. Our findings open new perspectives for the understanding of tissue- and organ-specific regulation of gene expression

PEDANT genome database: 10 years online

Author: Artamonova Irena I.
Frishman Dmitrij
Heumann Klaus
Mewes Hans-Werner
Riley M. Louise
Schmidt Thorsten
Volz Andreas
Wagner Christian
Publication venue: Oxford University Press
Publication date: 05/12/2006

The PEDANT genome database provides exhaustive annotation of 468 genomes by a broad set of bioinformatics algorithms. We describe recent developments of the PEDANT Web server. The all-new Graphical User Interface (GUI) implemented in Java™ allows for more efficient navigation of the genome data, extended search capabilities, user customization and export facilities. The DNA and Protein viewers have been made highly dynamic and customizable. We also provide Web Services to access the entire body of PEDANT data programmatically. Finally, we report on the application of association rule mining for automatic detection of potential annotation errors. PEDANT is freely accessible to academic users at

CRONOS: the cross-reference navigation server

Author: Andreas Ruepp
Brigitte Waegele
Corinna Montrone
Gisela Fobo
H.-Werner Mewes
Irmtraud Dunger-Kaltenbach
Publication venue: Oxford University Press
Publication date: 01/01/2009

Summary: Cross-mapping of gene and protein identifiers between different databases is a tedious and time-consuming task. To overcome this, we developed CRONOS, a cross-reference server that contains entries from five mammalian organisms presented by major gene and protein information resources. Sequence similarity analysis of the mapped entries shows that the cross-references are highly accurate. In total, up to 18 different identifier types can be used for identification of cross-references. The quality of the mapping could be improved substantially by excluding ambiguous gene and protein names, which were manually validated. Organism-specific lists of ambiguous terms, which are valuable for a variety of bioinformatics applications such as text mining, are available for download
