
    The MPI bioinformatics Toolkit as an integrative platform for advanced protein sequence and structure analysis.

    The MPI Bioinformatics Toolkit (http://toolkit.tuebingen.mpg.de) is an open, interactive web service for comprehensive and collaborative protein bioinformatic analysis. It offers a wide array of interconnected, state-of-the-art bioinformatics tools to experts and non-experts alike, developed both externally (e.g. BLAST+, HMMER3, MUSCLE) and internally (e.g. HHpred, HHblits, PCOILS). While a beta version of the Toolkit was released 10 years ago, the current production-level release has been available since 2008 and has serviced more than 1.6 million external user queries. The usage of the Toolkit has continued to increase linearly over the years, reaching more than 400 000 queries in 2015. In fact, through the breadth of its tools and their tight interconnection, the Toolkit has become an excellent platform for experimental scientists as well as a useful resource for teaching bioinformatic inquiry to students in the life sciences. In this article, we report on the evolution of the Toolkit over the last ten years, focusing on the expansion of the tool repertoire (e.g. CS-BLAST, HHblits) and on infrastructural work needed to remain operative in a changing web environment

    The MIGenAS integrated bioinformatics toolkit for web-based sequence analysis

    We describe a versatile and extensible integrated bioinformatics toolkit for the analysis of biological sequences over the Internet. The web portal offers convenient interactive access to a growing pool of chainable bioinformatics software tools and databases that are centrally installed and maintained by the RZG. Currently, supported tasks comprise sequence similarity searches in public or user-supplied databases, computation and validation of multiple sequence alignments, phylogenetic analysis and protein–structure prediction. Individual tools can be seamlessly chained into pipelines allowing the user to conveniently process complex workflows without the necessity to take care of any format conversions or tedious parsing of intermediate results. The toolkit is part of the Max-Planck Integrated Gene Analysis System (MIGenAS) of the Max Planck Society available at (click ‘Start Toolkit’)

    Jwalk and MNXL web server: model validation using restraints from Crosslinking Mass Spectrometry

    Motivation: Crosslinking Mass Spectrometry generates restraints that can be used to model proteins and protein complexes. Previously, we developed two methods to help users achieve better modelling performance from their crosslinking restraints: Jwalk, to estimate solvent accessible distances between crosslinked residues, and MNXL, to assess the quality of the models based on these distances. Results: Here we present the Jwalk and MNXL webservers, which streamline the process of validating monomeric protein models using restraints from crosslinks. We demonstrate this by using the MNXL server to filter models of varying quality, selecting the most native-like. Availability: The webserver and source code are freely available from jwalk.ismb.lon.ac.uk and mnxl.ismb.lon.ac.uk

    Noncoder: a web interface for exon array-based detection of long non-coding RNAs

    Due to recent technical developments, a large number of long non-coding RNAs (lncRNAs) have been discovered in mammals. Although it has been shown that lncRNAs are regulated differently among tissues and disease statuses, the functions of these transcripts are still unknown in most cases. GeneChip Exon 1.0 ST Arrays (exon arrays) from Affymetrix, Inc. have been used widely to profile genome-wide expression changes and alternative splicing of protein-coding genes. Here, we demonstrate that re-annotation of exon array probes can be used to profile the expression of tens of thousands of lncRNAs. With this annotation, a detailed inspection of lncRNAs and their isoforms is possible. To make this approach available to the research community, we developed a user-friendly web interface called 'noncoder'. After CEL files from exon arrays are uploaded, a few mouse clicks and parameter settings suffice to normalize and analyse the exon array data and identify differentially expressed lncRNAs. Noncoder provides detailed annotation information for lncRNAs and is equipped with unique features that allow an efficient search for interesting lncRNAs to be studied further. The web interface is available at http://noncoder.mpi-bn.mpg.de

    Protein sequence analysis using the MPI Bioinformatics Toolkit

    The MPI Bioinformatics Toolkit (https://toolkit.tuebingen.mpg.de) provides interactive access to a wide range of the best‐performing bioinformatics tools and databases, including the state‐of‐the‐art protein sequence comparison methods HHblits and HHpred. The Toolkit currently includes 35 external and in‐house tools, covering functionalities such as sequence similarity searching, prediction of sequence features, and sequence classification. Due to this breadth of functionality, the tight interconnection of its constituent tools, and its ease of use, the Toolkit has become an important resource for biomedical research and for teaching protein sequence analysis to students in the life sciences. In this article, we provide detailed information on utilizing the three most widely accessed tools within the Toolkit: HHpred for the detection of homologs, HHpred in conjunction with MODELLER for structure prediction and homology modeling, and CLANS for the visualization of relationships in large sequence datasets. Basic Protocol 1: Sequence similarity searching using HHpred. Alternate Protocol: Pairwise sequence comparison using HHpred. Support Protocol: Building a custom multiple sequence alignment using PSI‐BLAST and forwarding it as input to HHpred. Basic Protocol 2: Calculation of homology models using HHpred and MODELLER. Basic Protocol 3: Cluster analysis using CLANS

    Mapping genetic variations to three-dimensional protein structures to enhance variant interpretation: a proposed framework

    The translation of personal genomics to precision medicine depends on the accurate interpretation of the multitude of genetic variants observed for each individual. However, even when genetic variants are predicted to modify a protein, their functional implications may be unclear. Many diseases are caused by genetic variants affecting important protein features, such as enzyme active sites or interaction interfaces. The scientific community has catalogued millions of genetic variants in genomic databases and thousands of protein structures in the Protein Data Bank. Mapping mutations onto three-dimensional (3D) structures enables atomic-level analyses of protein positions that may be important for the stability or formation of interactions; these may explain the effect of mutations and in some cases even open a path for targeted drug development. To accelerate progress in the integration of these data types, we held a two-day Gene Variation to 3D (GVto3D) workshop to report on the latest advances and to discuss unmet needs. The overarching goal of the workshop was to address the question: what can be done together as a community to advance the integration of genetic variants and 3D protein structures that could not be done by a single investigator or laboratory? Here we describe the workshop outcomes, review the state of the field, and propose the development of a framework with which to promote progress in this arena. The framework will include a set of standard formats, common ontologies, a common application programming interface to enable interoperation of the resources, and a Tool Registry to make it easy to find and apply the tools to specific analysis problems. Interoperability will enable integration of diverse data sources and tools and collaborative development of variant effect prediction methods

    Structural bioinformatics predicts that the Retinitis Pigmentosa-28 protein of unknown function FAM161A is a homologue of the microtubule nucleation factor Tpx2 [version 1; peer review: 2 approved]

    BACKGROUND: FAM161A is a microtubule-associated protein conserved widely across eukaryotes, which is mutated in the inherited blinding disease Retinitis Pigmentosa-28. FAM161A is also a centrosomal protein, being a core component of a complex that forms an internal skeleton of centrioles. Despite these observations about the importance of FAM161A, current techniques used to examine its sequence reveal no homologies to other proteins. METHODS: Sequence profiles derived from multiple sequence alignments of FAM161A homologues were constructed by PSI-BLAST and HHblits, and then used by the profile-profile search tool HHsearch, implemented online as HHpred, to identify homologues. These in turn were used to create profiles for reverse searches and pair-wise searches. Multiple sequence alignments were also used to identify amino acid usage in functional elements. RESULTS: FAM161A has a single homologue: the targeting protein for Xenopus kinesin-like protein-2 (Tpx2), which is a strong hit across more than 200 residues. Tpx2 is also a microtubule-associated protein, and it has been shown previously by a cryo-EM molecular structure to nucleate microtubules through two small elements: an extended loop and a short helix. The homology between FAM161A and Tpx2 includes these elements, as FAM161A has three copies of the loop, and one helix that has many, but not all, properties of the one in Tpx2. CONCLUSIONS: FAM161A and its homologues are predicted to be a previously unknown variant of Tpx2, and hence bind microtubules in the same way. This prediction allows precise, testable molecular models to be made of FAM161A-microtubule complexes

    BioExcel Building Blocks, a software library for interoperable biomolecular simulation workflows.

    In recent years, improvements in software and hardware performance have made biomolecular simulations a mature tool for the study of biological processes. Simulation length and the size and complexity of the analyzed systems make simulations both complementary and compatible with other bioinformatics disciplines. However, the characteristics of the software packages used for simulation have prevented the adoption of technologies accepted in other bioinformatics fields, such as automated deployment systems, workflow orchestration, or the use of software containers. We present here a comprehensive exercise to bring biomolecular simulations to the "bioinformatics way of working". The exercise has led to the development of the BioExcel Building Blocks (BioBB) library. BioBBs are built as Python wrappers to provide an interoperable architecture. BioBBs have been integrated into a chain of common software management tools to generate data ontologies, documentation, installation packages, software containers and ways of integration with workflow managers, which make them usable in most computational environments