Search CORE

28 research outputs found

ProFAT: a web-based tool for the functional annotation of protein sequences

Author: Bradshaw Charles Richard
Habermann Bianca
Surendranath Vineeth
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: The functional annotation of proteins relies on published information concerning their close and remote homologues in sequence databases. Evidence for remote sequence similarity can be further strengthened by a similar biological background of the query sequence and identified database sequences. However, few tools exist so far that provide a means to include functional information in sequence database searches. RESULTS: We present ProFAT, a web-based tool for the functional annotation of protein sequences based on remote sequence similarity. ProFAT combines sensitive sequence database search methods and a fold recognition algorithm with a simple text-mining approach. ProFAT extracts identified hits based on their biological background by keyword-mining of annotations, features and, most importantly, literature associated with a sequence entry. A user-provided keyword list enables the user to specifically search for weak but biologically relevant homologues of an input query. The ProFAT server has been evaluated using the complete set of proteins from three different domain families, including their weak relatives, and could correctly identify between 90% and 100% of all domain family members studied in this context. ProFAT has furthermore been applied to a variety of proteins from different cellular contexts, and we provide evidence on how ProFAT can help in functional prediction of proteins based on remotely conserved proteins. CONCLUSION: By employing sensitive database search programs as well as exploiting the functional information associated with database sequences, ProFAT can detect remote but biologically relevant relationships between proteins and will assist researchers in the prediction of protein function based on remote homologies.
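The keyword-mining step described in this abstract can be pictured with a minimal sketch. It is not ProFAT's code: the function name `keyword_hits`, the `(hit_id, annotation_text)` input shape and the whole-word, case-insensitive matching rule are all assumptions made for illustration.

```python
import re


def keyword_hits(hits, keywords):
    """Keep database hits whose annotation text mentions a user keyword.

    `hits` is a list of (hit_id, annotation_text) pairs. This is a
    hypothetical input shape, not ProFAT's actual data model.
    """
    # Whole-word, case-insensitive patterns, one per user keyword.
    patterns = [
        re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
        for word in keywords
    ]
    selected = []
    for hit_id, text in hits:
        matched = [w for w, p in zip(keywords, patterns) if p.search(text)]
        if matched:
            selected.append((hit_id, matched))
    return selected


example_hits = [
    ("P1", "Serine/threonine kinase involved in mitosis"),
    ("P2", "Hypothetical protein"),
]
print(keyword_hits(example_hits, ["kinase", "mitosis"]))
```

In this toy input only the first hit survives, because the second annotation contains none of the keywords.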

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

MPG.PuRe

SeLOX—a locus of recombination site search tool for the detection and directed evolution of site-specific recombination systems

Author: Buchholz Frank
Chusainow Janet
Habermann Bianca H.
Hauber Joachim
Surendranath Vineeth
Publication venue: Oxford University Press
Publication date: 01/01/2010

Site-specific recombinases have become a resourceful tool for genome engineering, allowing sophisticated in vivo DNA modifications and rearrangements, including the precise removal of integrated retroviruses from host genomes. In a recent study, a mutant form of Cre recombinase has been used to excise the provirus of a specific HIV-1 strain from the human genome. To achieve provirus excision, the Cre recombinase had to be evolved to recombine an asymmetric locus of recombination (lox)-like sequence present in the long terminal repeat (LTR) regions of an HIV-1 strain. One prerequisite for this type of work is the identification of degenerate lox-like sites in genomic sequences. Given their nature (two inverted repeats flanking a spacer of variable length), existing search tools like BLAST or RepeatMasker perform poorly. To address this lack of available algorithms, we have developed the web-server SeLOX, which can identify degenerate lox-like sites within genomic sequences. SeLOX calculates a position weight matrix based on lox-like sequences, which is used to search genomic sequences. For computational efficiency, we transform sequences into binary space, which allows us to use a bit-wise AND Boolean operator for comparisons. In addition to finding lox-like sites for Cre type recombinases in HIV LTR sequences, we have used SeLOX to identify lox-like sites in HIV LTRs for six yeast recombinases. We finally demonstrate the general usefulness of SeLOX in identifying lox-like sequences in large genomes by searching Cre type recombination sites in the entire human genome. SeLOX is freely available at http://selox.mpi-cbg.de/cgi-bin/selox/index

PubMed Central

MPG.PuRe

BioBuilder as a database development and functional annotation platform for proteins

Author: Deshpande Nandan
Jonnalagadda Chandra Kiran
Kousthub PS
Navarro J Daniel
Padma N
Pandey Akhilesh
Peri Suraj
Rashmi BP
Shanker K
Surendranath Vineeth
Talreja Naveen
Vrushabendra BM
Publication venue: BioMed Central
Publication date: 01/01/2004

BACKGROUND: The explosion in biological information creates the need for databases that are easy to develop, easy to maintain and can be easily manipulated by annotators who are most likely to be biologists. However, deployment of scalable and extensible databases is not an easy task and generally requires substantial expertise in database development. RESULTS: BioBuilder is a Zope-based software tool that was developed to facilitate intuitive creation of protein databases. Protein data can be entered and annotated through web forms, along with the flexibility to add customized annotation features to protein entries. A built-in review system permits a global team of scientists to coordinate their annotation efforts. We have already used BioBuilder to develop the Human Protein Reference Database, a comprehensive annotated repository of the human proteome. The data can be exported in the extensible markup language (XML) format, which is rapidly becoming the standard format for data exchange. CONCLUSIONS: As the proteomic data for several organisms begins to accumulate, BioBuilder will prove to be an invaluable platform for functional annotation and development of customizable protein-centric databases. BioBuilder is open source and is available under the terms of LGPL.

Lund University Publications

Springer - Publisher Connector

PubMed Central

Academica-e

HMMerThread: Detecting Remote, Functional Conserved Domains in Entire Genomes by Combining Relaxed Sequence-Database Searches with Fold Recognition

Author: Bianca Hermine Habermann
Charles Richard Bradshaw
Matthias Stefan Mueller
Peter Csermely
Robert Henschel
Vineeth Surendranath
Publication venue: Public Library of Science
Publication date: 01/01/2011

Conserved domains in proteins are one of the major sources of functional information for experimental design and genome-level annotation. Though search tools for conserved domain databases such as Hidden Markov Models (HMMs) are sensitive in detecting conserved domains in proteins when they share sufficient sequence similarity, they tend to miss more divergent family members, as they lack a reliable statistical framework for the detection of low sequence similarity. We have developed a greatly improved HMMerThread algorithm that can detect remotely conserved domains in highly divergent sequences. HMMerThread combines relaxed conserved domain searches with fold recognition to eliminate false positive, sequence-based identifications. With an accuracy of 90%, our software is able to automatically predict highly divergent members of conserved domain families with an associated 3-dimensional structure. We give additional confidence to our predictions by validation across species. We have run HMMerThread searches on eight proteomes including human and present a rich resource of remotely conserved domains, which adds significantly to the functional annotation of entire proteomes. We find ∼4500 cross-species validated, remotely conserved domain predictions in the human proteome alone. As an example, we find a DNA-binding domain in the C-terminal part of the A-kinase anchor protein 10 (AKAP10), a PKA adaptor that has been implicated in cardiac arrhythmias and premature cardiac death, which upon stress likely translocates from mitochondria to the nucleus/nucleolus. Based on our prediction, we propose that with this HLH-domain, AKAP10 is involved in the transcriptional control of stress response. Further remotely conserved domains we discuss are examples from areas such as sporulation, chromosome segregation and signalling during immune response. 
The HMMerThread algorithm is able to automatically detect the presence of remotely conserved domains in proteins based on weak sequence similarity. Our predictions open up new avenues for biological and medical studies. Genome-wide HMMerThread domains are available at http://vm1-hmmerthread.age.mpg.de

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

MPG.PuRe

Reference Data

Author: Vineeth Surendranath (281965)
Publication date: 14/02/2013

Flybase D. melanogaster Release 5.34 mRNA and polyadenylated ncRNA reference as GTF file for use with the bowtie suite of tools.

FigShare

Prepare Flybase GTF file

Author: Vineeth Surendranath (281965)

Script using cufflinks' gffread to generate a clean GTF file for use with cufflinks and htseq.

FigShare

Preprocessors

Author: Vineeth Surendranath (281965)

Scripts to deal with reference files.

FigShare

morFeus: a web-based program to detect remotely conserved orthologs using symmetrical best hits and orthology network scoring

Author: Habermann Bianca H.
Oswald Felix
Sharan Malvika
Surendranath Vineeth
Villaveces Jose M.
Volkmer Michael
Wagner Ines
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

Background: Searching for the orthologs of a given protein or DNA sequence is one of the most important and most commonly used bioinformatics methods in biology. Programs like BLAST or the orthology search engine Inparanoid can be used to find orthologs when the similarity between two sequences is sufficiently high. However, they fail when the level of conservation is low. The detection of remotely conserved proteins often involves sophisticated manual intervention that is difficult to automate. Results: Here, we introduce morFeus, a search program to find remotely conserved orthologs. Based on relaxed sequence similarity searches, morFeus selects sequences based on the similarity of their alignments to the query, tests for orthology by iterative reciprocal BLAST searches and calculates a network score for the resulting network of orthologs that is a measure of orthology independent of the E-value. Detecting remotely conserved orthologs of a protein using morFeus thus requires no manual intervention. We demonstrate the performance of morFeus by comparing it to state-of-the-art orthology resources and methods. We provide an example of remotely conserved orthologs, which were experimentally shown to be functionally equivalent in the respective organisms and therefore meet the criteria of the orthology-function conjecture. Conclusions: Based on our results, we conclude that morFeus is a powerful and specific search method for detecting remotely conserved orthologs.
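The reciprocal (symmetrical) best-hit test mentioned in this abstract can be sketched in a few lines. This is only the generic textbook criterion, not morFeus itself: the iterative searches and the network score are omitted, and the `(query, subject, score)` tuples and function names are assumptions standing in for parsed BLAST output.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    `hits` is a list of (query, subject, score) tuples, an assumed
    stand-in for parsed BLAST results.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    backward = best_hits(b_vs_a)
    return sorted(
        (a, b) for a, b in forward.items() if backward.get(b) == a
    )


# Toy scores for two hypothetical proteomes A and B.
a_vs_b = [("a1", "b1", 250.0), ("a1", "b2", 90.0), ("a2", "b2", 60.0)]
b_vs_a = [("b1", "a1", 245.0), ("b2", "a3", 80.0), ("b2", "a2", 55.0)]
print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

Here `a1` and `b1` pick each other and are reported, while `a2` is dropped because its best hit `b2` prefers `a3`.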

Crossref

Springer - Publisher Connector

PubMed Central

Online-Publikations-Server der Universität Würzburg

MPG.PuRe

P084 Typing in the third generation: A HLA typing approach for nanopore sequencing data

Author: Alexander Schmidt
Gerhard Schöfl
Kathrin Putke
Steffen Klasberg
Vineeth Surendranath
Vinzenz Lange
Publication venue: Elsevier BV

Crossref

Rapid Validation of Protein Identifications with the Borderline Statistical Confidence via De Novo Sequencing and MS BLAST Searches

Author: Andrej Shevchenko
Ari Frank
Henrik Thomas
Natalie Wielsch
Patrice Waridel
Pavel Pevzner
Vineeth Surendranath
Publication date: 01/01/2006

Protein identifications with borderline statistical confidence are typically produced by matching a few marginal-quality MS/MS spectra to database peptide sequences and represent a significant bottleneck in the reliable and reproducible characterization of proteomes. Here, we present a method for rapid validation of borderline hits that circumvents the need for often biased manual inspection of raw MS/MS spectra. The approach takes advantage of the independent interpretation of the corresponding MS/MS spectra by the PepNovo de novo sequencing software, followed by mass spectrometry-driven BLAST (MS BLAST) sequence-similarity database searches that utilize all partially inaccurate, degenerate and redundant candidate peptide sequences. In a case study involving the identification of more than 180 Caenorhabditis elegans proteins by nanoLC-MS/MS analysis on a linear ion trap LTQ mass spectrometer, the approach enabled rapid assignment (confirmation or rejection) of more than 70% of Mascot hits of borderline statistical confidence.

Keywords: de novo sequencing • database searching • borderline hits • MS/MS • MS BLAST • PepNovo

Nanoflow liquid chromatography-tandem mass spectrometry (nanoLC-MS/MS) is employed in a variety of bottom-up proteomics projects (reviewed in refs 1-4). Individual protein

CiteSeerX

MPG.PuRe