Clemson University: TigerPrints

ProtVirDB: a database of protozoan virulent proteins

Author: Dinesh Gupta
Jayashree Ramana
Publication date: 15/04/2009

Summary: ProtVirDB is a comprehensive and user-friendly web-based knowledgebase of virulent proteins belonging to protozoan species. The database will facilitate research and provide an integrated platform for comparative studies of virulent proteins in different parasitic protozoans, organizing them under a unifying classification schema with functional categories. Remarkably, one-third of the protein sequences in the database showed the presence of mono-repeats, hetero-repeats, or both, reiterating the importance of repeats in parasite virulence mechanisms. A number of useful bioinformatics tools, including BLAST and tools for phylogenetic analysis, are integrated with the database. With the rapidly growing interest in the pathogenesis mechanisms of protozoans and ongoing genome sequencing projects, we anticipate that the database will be a useful tool for the research community. Availability: http://bioinfo.icgeb.res.in/protvirdb Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.

Open Access Repository

DATA-INTENSIVE COMPUTING FOR BIOINFORMATICS USING VIRTUALIZATION TECHNOLOGIES AND HPC INFRASTRUCTURES

Author: Xuan Pengfei
Publication venue: Clemson University Libraries
Publication date: 01/12/2011

Bioinformatics applications often involve many computational components and massive data sets, which makes them very difficult to deploy on a single machine. In this thesis, we designed a data-intensive computing platform for bioinformatics applications using virtualization technologies and high performance computing (HPC) infrastructures. The platform follows a multi-tier architecture that seamlessly integrates the web user interface (presentation tier), scientific workflow (logic tier) and computing infrastructure (data/computing tier). We demonstrated our platform on two bioinformatics projects. First, we redesigned and deployed the cotton marker database (CMD) (http://www.cottonmarker.org), a centralized web portal in the cotton research community, using a Xen-based virtualization solution. To achieve high performance and scalability for the CMD web tools, we hosted CMD's large protein databases and computationally intensive applications on the Palmetto HPC of Clemson University. Biologists can easily utilize both bioinformatics applications and HPC resources through the CMD website without a background in computer science. Second, we developed a web tool, the Glycan Array QSAR Tool (http://bci.clemson.edu/tools/glycan_array), to analyze glycan array data. The user interface of this tool was built on top of the Drupal content management system (CMS), and the computational part was implemented using the MATLAB Compiler Runtime (MCR) module. Our new bioinformatics computing platform enables the rapid deployment of data-intensive bioinformatics applications on HPC and virtualization environments with a user-friendly web interface, and bridges the gap between biological scientists and cyberinfrastructure.

BμG@Sbase: a microbial gene expression and comparative genomic database

Author: Brooks LA
Butcher PD
Hinds J
Stoker NG
Tyler RH
Waldron DE
Withers M
Witney AA
Wren BW
Publication venue: OXFORD UNIV PRESS
Publication date: 01/01/2011

The falling cost of high-throughput functional genomic technologies is creating a deluge of high-volume, complex data, placing the burden on bioinformatics resources and tool development. The Bacterial Microarray Group at St George's (BμG@S) has been at the forefront of bacterial microarray design and analysis for over a decade and, while serving as a hub of a global network of microbial research groups, has developed BμG@Sbase, a microbial gene expression and comparative genomic database. BμG@Sbase (http://bugs.sgul.ac.uk/bugsbase/) is a web-browsable, expertly curated, MIAME-compliant database that stores comprehensive experimental annotation and multiple raw and analysed data formats. Consistent annotation is enabled through a structured set of web forms, which guide the user through the process following a set of best practices and a controlled vocabulary. The database currently contains 86 expertly curated, publicly available data sets (with a further 124 not yet published) and full annotation information for 59 bacterial microarray designs. The data can be browsed and queried using an explorer-like interface that integrates intuitive tree diagrams to present complex experimental details clearly and concisely. Furthermore, the modular design of the database will provide a robust platform for integrating other data types beyond microarrays, in support of a more systems-analysis-based future.

LSHTM Research Online

St George's Online Research Archive

Bioinformatics on the Cloud Computing Platform Azure

Author: Andrew P. Harrison
Anne M. Owen
BD Halligan
DA de Lima Morais
DP Wall
E Afgan
GEP Ropella
H Eriksson
H Kim
H Parkinson
Hugh P. Shanahan
J Qiu
L Zhang
LD Stein
M Abouelhoda
P Di Tommaso
RC Taylor
S Contrino
Shyamal D. Peddada
SV Angiuoli
T Barrett
VA Fusaro
WB Langdon
Z Wang
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2014

We discuss the applicability of the Microsoft cloud computing platform, Azure, for bioinformatics. We focus on the usability of the resource rather than its performance. We provide an example of how R can be used on Azure to analyse a large amount of microarray expression data deposited at the public database ArrayExpress. In Appendix S1 we provide a walkthrough that demonstrates explicitly how Azure can be used to perform these analyses, and we offer a comparison with a local computation. We note that the Platform as a Service (PaaS) offering of Azure can present a steep learning curve for bioinformatics developers, who will usually have a Linux and scripting language background. On the other hand, the presence of an additional set of libraries makes it easier to deploy software in a parallel (scalable) fashion and to manage such a production run explicitly with only a few hundred lines of code, most of which can be incorporated from a template. We propose that this environment is best suited for running stable bioinformatics software by users not involved with its development. © 2014 Shanahan et al.

University of Essex Research Repository

CiteSeerX

Public Library of Science (PLOS)

Royal Holloway - Pure

FigShare

yStreX: yeast stress expression database

Author: Nookaew Intawat
Petranovic Dina
Wanichthanarak Kwanjeera
Publication date: 01/01/2014

Over the past decade, genome-wide expression analyses have often been used to study how the expression of genes changes in response to various environmental stresses. Many of these studies (such as effects of oxygen concentration, temperature stress, low pH stress, osmotic stress, depletion or limitation of nutrients, addition of different chemical compounds, etc.) have been conducted in the unicellular eukaryal model, the yeast Saccharomyces cerevisiae. However, the lack of a unifying or integrated bioinformatics platform that would permit efficient and rapid use of all these existing data remains an important issue. To facilitate research by exploiting existing transcription data in the field of yeast physiology, we have developed the yStreX database. It is an online repository of analyzed gene expression data from curated data sets from different studies that capture genome-wide transcriptional changes in response to diverse environmental transitions. The first aim of this online database is to facilitate comparison of cross-platform and cross-laboratory gene expression data. Additionally, we performed different expression analyses, meta-analyses and gene set enrichment analyses, and the results are also deposited in this database. Lastly, we constructed a user-friendly Web interface with interactive visualization to provide intuitive access and to display the queried data for users with no background in bioinformatics. Database URL: http://www.ystrexdb.co

Chalmers Publication Library

Chalmers Research

Updates in metabolomics tools and resources: 2014-2015

Author: Misra Biswapriya B.
van der Hooft Justin
Publication venue: 'Wiley'
Publication date: 01/01/2016

Data processing and interpretation represent the most challenging and time-consuming steps in high-throughput metabolomic experiments, regardless of the analytical platforms (MS or NMR spectroscopy based) used for data acquisition. Improved machinery in metabolomics generates increasingly complex datasets that create the need for more and better processing and analysis software and in silico approaches to understand the resulting data. However, a comprehensive source of information describing the utility of the most recently developed and released metabolomics resources (tools, software, and databases) is currently lacking. Thus, here we provide an overview of freely available, open-source tools, algorithms, and frameworks to make both upcoming and established metabolomics researchers aware of recent developments, in an attempt to advance and facilitate data processing workflows in their metabolomics research. The major topics include tools and resources for data processing, data annotation, and data visualization in MS and NMR-based metabolomics. Most tools described in this review are dedicated to untargeted metabolomics workflows; however, some more specialist tools are described as well. All tools and resources described, including their analytical and computational platform dependencies, are summarized in an overview table.

Enlighten

EST-PAC a web package for EST annotation and protein sequence prediction

Author: A Bateman
A Hotz-Wagenblatt
C Iseli
C Lottaz
C Mao
Christophe Lefèvre
David Powell
E Dias Neto
G Wistow
J Parkinson
JD Wasmuth
LD Hillier
LK Matukumalli
MD Adams
P Ayoubi
S McGinnis
SR Eddy
Yvan Strahm
Publication venue: BioMed Central
Publication date: 01/01/2006

With the decreasing cost of DNA sequencing technology and the vast diversity of biological resources, researchers increasingly face the basic challenge of annotating a larger number of expressed sequence tags (ESTs) from a variety of species. This typically consists of a series of repetitive tasks, which should be automated and easy to use. The results of these annotation tasks need to be stored and organized in a consistent way. All these operations should be self-installing, platform independent, easy to customize and amenable to using distributed bioinformatics resources available on the Internet. To address these issues, we present EST-PAC, a web-oriented multi-platform software package for expressed sequence tag (EST) annotation. EST-PAC provides a solution for the administration of EST and protein sequence annotations accessible through a web interface. Three aspects of EST annotation are automated: 1) searching local or remote biological databases for sequence similarities using Blast services, 2) predicting protein coding sequence from EST data, and 3) annotating predicted protein sequences with functional domain predictions. In practice, EST-PAC integrates the BLASTALL suite, EST-Scan2 and HMMER in a relational database system accessible through a simple web interface. EST-PAC also takes advantage of the relational database to allow consistent storage, powerful queries of results, and management of the annotation process. The system allows users to customize annotation strategies and provides an open-source data-management environment for research and education in bioinformatics.

Springer - Publisher Connector

University of Melbourne Institutional Repository

AGMIAL: implementing an annotation strategy for prokaryote genomes as a distributed system

Author: Bessières P.
Bossy R.
Bryson K.
Chaillou S.
Gibrat J.-F.
Hoebeke M.
Loux V.
Maguin E.
Nicolas P.
Penaud S.
van de Guchte M.
Publication date: 01/07/2006

We have implemented a genome annotation system for prokaryotes called AGMIAL. Our approach embodies a number of key principles. First, expert manual annotators are seen as a critical component of the overall system; user interfaces were cyclically refined to satisfy their needs. Second, the overall process should be orchestrated in terms of a global annotation strategy; this facilitates coordination between a team of annotators and automatic data analysis. Third, the annotation strategy should allow progressive and incremental annotation from a time when only a few draft contigs are available, to when a final finished assembly is produced. The overall architecture employed is modular and extensible, being based on the W3 standard Web services framework. Specialized modules interact with two independent core modules that are used to annotate, respectively, genomic and protein sequences. AGMIAL is currently being used by several INRA laboratories to analyze genomes of bacteria relevant to the food-processing industry, and is distributed under an open source license

UCL Discovery

QueryOR: a comprehensive web platform for genetic variant analysis and prioritization

Author: ANGLANI FRANCA
BERTOLDI LORIS
BIROLO GIOVANNI
D'AVANZO FRANCESCA
DE PASCALE FABIO
Faulkner Georgine
FELTRIN ERIKA
FORCATO CLAUDIO
NEGRISOLO SUSANNA
SCHIAVON RICCARDO
TOMANIN ROSELLA
VALLE GIORGIO
VEZZI ALESSANDRO
VITULO NICOLA
ZANETTI ALESSANDRA
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Background: Whole genome and exome sequencing are contributing to the extraordinary progress in the study of human genetic variants. In this fast-developing field, appropriate and easily accessible tools are required to facilitate data analysis. Results: Here we describe QueryOR, a web platform suitable for searching among known candidate genes as well as for finding novel gene-disease associations. QueryOR combines several innovative features that make it comprehensive, flexible and easy to use. Instead of being designed around specific datasets, it works on a general XML schema specifying the formats and criteria of each data source. Thanks to this flexibility, new criteria can be easily added for future expansion. Currently, up to 70 user-selectable criteria are available, including a wide range of gene and variant features. Moreover, rather than progressively discarding variants by taking one criterion at a time, the prioritization is achieved by a global positive selection process that considers all transcript isoforms, thus producing reliable results. QueryOR is easy to use, and its intuitive interface allows users to handle different kinds of inheritance as well as features related to sharing variants in different patients. QueryOR is suitable for investigating single patients, families or cohorts. Conclusions: QueryOR is a comprehensive and flexible web platform suited to easy, user-driven variant prioritization. It is freely available for academic institutions at http://queryor.cribi.unipd.it/
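
The QueryOR abstract contrasts two ways of prioritizing variants: discarding them one criterion at a time, or scoring them globally across all transcript isoforms. The sketch below illustrates that contrast only. It is not QueryOR's implementation: the `Variant` class, the feature keys (`impact`, `allele_freq`, `conserved`), the three criteria, and the scoring rule (best per-isoform count of satisfied criteria) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# One transcript-isoform annotation, modelled as a dict of features.
# The keys used below are hypothetical, not QueryOR's schema.
Annotation = Dict[str, object]
Criterion = Callable[[Annotation], bool]


@dataclass
class Variant:
    name: str
    isoforms: List[Annotation] = field(default_factory=list)


def sequential_filter(variants: List[Variant], criteria: List[Criterion]) -> List[Variant]:
    """Progressive discarding: drop a variant as soon as one criterion
    is not met by any of its isoforms."""
    kept = list(variants)
    for criterion in criteria:
        kept = [v for v in kept if any(criterion(a) for a in v.isoforms)]
    return kept


def selection_score(variant: Variant, criteria: List[Criterion]) -> int:
    """Assumed scoring rule: the highest number of criteria satisfied
    by any single isoform of the variant."""
    return max((sum(1 for c in criteria if c(a)) for a in variant.isoforms), default=0)


def prioritize(variants: List[Variant], criteria: List[Criterion]) -> List[Variant]:
    """Global positive selection: rank every variant by score, discard none."""
    return sorted(variants, key=lambda v: selection_score(v, criteria), reverse=True)


# Hypothetical criteria and data.
criteria: List[Criterion] = [
    lambda a: a["impact"] == "missense",
    lambda a: a["allele_freq"] < 0.01,
    lambda a: bool(a["conserved"]),
]

variants = [
    Variant("var_C", [{"impact": "synonymous", "allele_freq": 0.2, "conserved": False}]),
    Variant("var_B", [
        {"impact": "synonymous", "allele_freq": 0.02, "conserved": True},
        {"impact": "missense", "allele_freq": 0.02, "conserved": True},
    ]),
    Variant("var_A", [{"impact": "missense", "allele_freq": 0.001, "conserved": True}]),
]

filtered_names = [v.name for v in sequential_filter(variants, criteria)]
ranked_names = [v.name for v in prioritize(variants, criteria)]
ranked_scores = [selection_score(v, criteria) for v in prioritize(variants, criteria)]
```

In this toy data, progressive filtering keeps only `var_A`, while the ranking keeps `var_B` in second place because one of its isoforms satisfies two of the three criteria. A near-miss candidate therefore stays visible to the user instead of being silently dropped.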