The MPI bioinformatics Toolkit as an integrative platform for advanced protein sequence and structure analysis.
The MPI Bioinformatics Toolkit (http://toolkit.tuebingen.mpg.de) is an open, interactive web service for comprehensive and collaborative protein bioinformatic analysis. It offers a wide array of interconnected, state-of-the-art bioinformatics tools to experts and non-experts alike, developed both externally (e.g. BLAST+, HMMER3, MUSCLE) and internally (e.g. HHpred, HHblits, PCOILS). While a beta version of the Toolkit was released 10 years ago, the current production-level release has been available since 2008 and has serviced more than 1.6 million external user queries. The usage of the Toolkit has continued to increase linearly over the years, reaching more than 400 000 queries in 2015. In fact, through the breadth of its tools and their tight interconnection, the Toolkit has become an excellent platform for experimental scientists as well as a useful resource for teaching bioinformatic inquiry to students in the life sciences. In this article, we report on the evolution of the Toolkit over the last ten years, focusing on the expansion of the tool repertoire (e.g. CS-BLAST, HHblits) and on infrastructural work needed to remain operative in a changing web environment
The MIGenAS integrated bioinformatics toolkit for web-based sequence analysis
We describe a versatile and extensible integrated bioinformatics toolkit for the analysis of biological sequences over the Internet. The web portal offers convenient interactive access to a growing pool of chainable bioinformatics software tools and databases that are centrally installed and maintained by the RZG. Currently, supported tasks comprise sequence similarity searches in public or user-supplied databases, computation and validation of multiple sequence alignments, phylogenetic analysis and protein–structure prediction. Individual tools can be seamlessly chained into pipelines allowing the user to conveniently process complex workflows without the necessity to take care of any format conversions or tedious parsing of intermediate results. The toolkit is part of the Max-Planck Integrated Gene Analysis System (MIGenAS) of the Max Planck Society available at (click ‘Start Toolkit’)
Jwalk and MNXL web server: model validation using restraints from Crosslinking Mass Spectrometry
Motivation: Crosslinking Mass Spectrometry generates restraints that can be used to model proteins and protein complexes. Previously, we developed two methods to help users achieve better modelling performance from their crosslinking restraints: Jwalk, to estimate solvent accessible distances between crosslinked residues, and MNXL, to assess the quality of the models based on these distances.
Results: Here we present the Jwalk and MNXL webservers, which streamline the process of validating monomeric protein models using restraints from crosslinks. We demonstrate this by using the MNXL server to filter models of varying quality, selecting the most native-like.
Availability: The webserver and source code are freely available from jwalk.ismb.lon.ac.uk and mnxl.ismb.lon.ac.uk
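As a rough illustration of filtering models by crosslink restraints, the sketch below ranks candidate models by the fraction of crosslinks whose residues lie within a distance cutoff. It is not the Jwalk or MNXL method: it uses straight-line C-alpha distances rather than solvent accessible surface distances, and the 30 Å cutoff, the input layout and the function names are all assumptions made for this example.

```python
import math


def satisfied_fraction(coords, crosslinks, max_dist=30.0):
    """Fraction of crosslinks whose straight-line distance is <= max_dist.

    coords: dict mapping residue number -> (x, y, z) C-alpha position
            (hypothetical input layout, in angstroms).
    crosslinks: list of (residue_i, residue_j) pairs.
    max_dist: assumed cutoff in angstroms, not a value taken from the paper.
    """
    if not crosslinks:
        return 0.0
    satisfied = sum(
        1 for i, j in crosslinks if math.dist(coords[i], coords[j]) <= max_dist
    )
    return satisfied / len(crosslinks)


def rank_models(models, crosslinks, max_dist=30.0):
    """Return model names sorted by descending fraction of satisfied crosslinks."""
    scores = {
        name: satisfied_fraction(coords, crosslinks, max_dist)
        for name, coords in models.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy data: two models of a three-residue chain and two crosslinks.
models = {
    "model_a": {1: (0.0, 0.0, 0.0), 2: (10.0, 0.0, 0.0), 3: (50.0, 0.0, 0.0)},
    "model_b": {1: (0.0, 0.0, 0.0), 2: (40.0, 0.0, 0.0), 3: (80.0, 0.0, 0.0)},
}
crosslinks = [(1, 2), (2, 3)]
ranking = rank_models(models, crosslinks)
```

A straight-line distance can pass through the protein interior, so it tends to underestimate the path a crosslinker must actually span; that gap is the motivation given above for surface-based distances.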
Noncoder: a web interface for exon array-based detection of long non-coding RNAs
Due to recent technical developments, a large number of long non-coding RNAs (lncRNAs) have been discovered in mammals. Although it has been shown that lncRNAs are regulated differently among tissues and disease statuses, the functions of these transcripts are still unknown in most cases. GeneChip Exon 1.0 ST Arrays (exon arrays) from Affymetrix, Inc. have been used widely to profile genome-wide expression changes and alternative splicing of protein-coding genes. Here, we demonstrate that re-annotation of exon array probes can be used to profile the expression of tens of thousands of lncRNAs. With this annotation, a detailed inspection of lncRNAs and their isoforms is possible. To make this approach available to the research community, we developed a user-friendly web interface called 'noncoder'. After uploading CEL files from exon arrays, a few mouse clicks and parameter settings suffice to normalize and analyse the exon array data and identify differentially expressed lncRNAs. Noncoder provides detailed annotation information for lncRNAs and is equipped with unique features that allow an efficient search for interesting lncRNAs to be studied further. The web interface is available at http://noncoder.mpi-bn.mpg.de
Protein sequence analysis using the MPI Bioinformatics Toolkit
The MPI Bioinformatics Toolkit (https://toolkit.tuebingen.mpg.de) provides interactive access to a wide range of the best-performing bioinformatics tools and databases, including the state-of-the-art protein sequence comparison methods HHblits and HHpred. The Toolkit currently includes 35 external and in-house tools, covering functionalities such as sequence similarity searching, prediction of sequence features, and sequence classification. Due to this breadth of functionality, the tight interconnection of its constituent tools, and its ease of use, the Toolkit has become an important resource for biomedical research and for teaching protein sequence analysis to students in the life sciences. In this article, we provide detailed information on utilizing the three most widely accessed tools within the Toolkit: HHpred for the detection of homologs, HHpred in conjunction with MODELLER for structure prediction and homology modeling, and CLANS for the visualization of relationships in large sequence datasets.
Basic Protocol 1: Sequence similarity searching using HHpred
Alternate Protocol: Pairwise sequence comparison using HHpred
Support Protocol: Building a custom multiple sequence alignment using PSI-BLAST and forwarding it as input to HHpred
Basic Protocol 2: Calculation of homology models using HHpred and MODELLER
Basic Protocol 3: Cluster analysis using CLANS
Mapping genetic variations to three-dimensional protein structures to enhance variant interpretation: a proposed framework
The translation of personal genomics to precision medicine depends on the accurate interpretation of the multitude of genetic variants observed for each individual. However, even when genetic variants are predicted to modify a protein, their functional implications may be unclear. Many diseases are caused by genetic variants affecting important protein features, such as enzyme active sites or interaction interfaces. The scientific community has catalogued millions of genetic variants in genomic databases and thousands of protein structures in the Protein Data Bank. Mapping mutations onto three-dimensional (3D) structures enables atomic-level analyses of protein positions that may be important for the stability or formation of interactions; these may explain the effect of mutations and in some cases even open a path for targeted drug development. To accelerate progress in the integration of these data types, we held a two-day Gene Variation to 3D (GVto3D) workshop to report on the latest advances and to discuss unmet needs. The overarching goal of the workshop was to address the question: what can be done together as a community to advance the integration of genetic variants and 3D protein structures that could not be done by a single investigator or laboratory? Here we describe the workshop outcomes, review the state of the field, and propose the development of a framework with which to promote progress in this arena. The framework will include a set of standard formats, common ontologies, a common application programming interface to enable interoperation of the resources, and a Tool Registry to make it easy to find and apply the tools to specific analysis problems. Interoperability will enable integration of diverse data sources and tools and collaborative development of variant effect prediction methods
Anti-phage islands force their target phage to directly mediate island excision and spread.
Vibrio cholerae, the causative agent of the diarrheal disease cholera, is antagonized by the lytic phage ICP1 in the aquatic environment and in human hosts. Mobile genetic elements called PLEs (phage-inducible chromosomal island-like elements) protect V. cholerae from ICP1 infection and initiate their anti-phage response by excising from the chromosome. Here, we show that PLE 1 encodes a large serine recombinase, Int, that exploits an ICP1-specific protein as a recombination directionality factor (RDF) to excise PLE 1 in response to phage infection. We show that this phage-encoded protein is sufficient to direct Int-mediated recombination in vitro and that it is highly conserved in all sequenced ICP1 genomes. Our results uncover an aspect of the molecular specificity underlying the conflict between a single predatory phage and V. cholerae PLE and contribute to our understanding of long-term evolution between phage and their bacterial hosts
The obesity-associated gene TMEM18 has a role in the central control of appetite and body weight regulation
An intergenic region of human chromosome 2 (2p25.3) harbors genetic variants which are among those most strongly and reproducibly associated with obesity. The gene closest to these variants is TMEM18, although the molecular mechanisms mediating these effects remain entirely unknown. Tmem18 expression in the murine hypothalamic paraventricular nucleus (PVN) was altered by changes in nutritional state. Germline loss of Tmem18 in mice resulted in increased body weight, which was exacerbated by high fat diet and driven by increased food intake. Selective overexpression of Tmem18 in the PVN of wild-type mice reduced food intake and also increased energy expenditure. We provide evidence that TMEM18 has four, not three, transmembrane domains and that it physically interacts with key components of the nuclear pore complex. Our data support the hypothesis that TMEM18 itself, acting within the central nervous system, is a plausible mediator of the impact of adjacent genetic variation on human adiposity.
RL, YCLT, DR, GSHY, SOR and APC are funded by the Medical Research Council (MRC) Metabolic Disease Unit (MRC_MC_UU_12012/1) and animal work was carried out with the assistance of MRC Disease Model Core of the Wellcome Trust MRC Institute of Metabolic Sciences (MRC_MC_UU_12012/5) and Wellcome Trust Strategic Award (100574/Z/12/Z). F. Bosch is the recipient of an award from the ICREA Academia, Generalitat de Catalunya, Spain. Vector generation and production were funded by Ministerio de Economía y Competitividad (SAF 2014-54866-R), Spain. CD and DWL were supported by the Wellcome Trust (WT098051) and CD was supported by the Wellcome Trust PhD Programme for Clinicians (100679/Z/12/Z)
Structural bioinformatics predicts that the Retinitis Pigmentosa-28 protein of unknown function FAM161A is a homologue of the microtubule nucleation factor Tpx2 [version 1; peer review: 2 approved]
BACKGROUND: FAM161A is a microtubule-associated protein conserved widely across eukaryotes, which is mutated in the inherited blinding disease Retinitis Pigmentosa-28. FAM161A is also a centrosomal protein, being a core component of a complex that forms an internal skeleton of centrioles. Despite these observations about the importance of FAM161A, current techniques used to examine its sequence reveal no homologies to other proteins.
METHODS: Sequence profiles derived from multiple sequence alignments of FAM161A homologues were constructed by PSI-BLAST and HHblits, and then used by the profile-profile search tool HHsearch, implemented online as HHpred, to identify homologues. These in turn were used to create profiles for reverse searches and pair-wise searches. Multiple sequence alignments were also used to identify amino acid usage in functional elements.
RESULTS: FAM161A has a single homologue: the targeting protein for Xenopus kinesin-like protein-2 (Tpx2), which is a strong hit across more than 200 residues. Tpx2 is also a microtubule-associated protein, and it has been shown previously by a cryo-EM molecular structure to nucleate microtubules through two small elements: an extended loop and a short helix. The homology between FAM161A and Tpx2 includes these elements, as FAM161A has three copies of the loop, and one helix that has many, but not all, properties of the one in Tpx2.
CONCLUSIONS: FAM161A and its homologues are predicted to be a previously unknown variant of Tpx2, and hence bind microtubules in the same way. This prediction allows precise, testable molecular models to be made of FAM161A-microtubule complexes
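The METHODS above mention using multiple sequence alignments to identify amino acid usage in functional elements. A minimal sketch of that kind of tally follows, assuming the alignment is already available as equal-length gapped strings; the function name and the toy sequences are invented for illustration and do not come from the study.

```python
from collections import Counter


def column_frequencies(alignment, column):
    """Relative amino acid frequencies in one alignment column, ignoring gaps.

    alignment: list of equal-length aligned sequences (hypothetical input).
    column: zero-based column index.
    """
    residues = [seq[column] for seq in alignment if seq[column] not in "-."]
    counts = Counter(residues)
    total = sum(counts.values())
    if total == 0:
        return {}  # column contains only gaps
    return {aa: n / total for aa, n in counts.items()}


# Toy alignment of four sequences, three columns wide.
alignment = ["MKL", "MRL", "M-L", "AKL"]
first_column = column_frequencies(alignment, 0)
second_column = column_frequencies(alignment, 1)
```

Comparing such per-column profiles across a loop or helix is one simple way to see which positions are conserved and which tolerate substitution.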
BioExcel Building Blocks, a software library for interoperable biomolecular simulation workflows.
In recent years, improvements in software and hardware performance have made biomolecular simulations a mature tool for the study of biological processes. Simulation length and the size and complexity of the analyzed systems make simulations both complementary to and compatible with other bioinformatics disciplines. However, the characteristics of the software packages used for simulation have prevented the adoption of technologies accepted in other bioinformatics fields, such as automated deployment systems, workflow orchestration, or software containers. We present here a comprehensive exercise to bring biomolecular simulations to the "bioinformatics way of working". The exercise has led to the development of the BioExcel Building Blocks (BioBB) library. BioBBs are built as Python wrappers to provide an interoperable architecture. BioBBs have been integrated into a chain of common software management tools to generate data ontologies, documentation, installation packages, software containers and ways of integrating with workflow managers that make them usable in most computational environments