
7,763 research outputs found

Protein subcellular localization prediction of eukaryotes using a knowledge-based approach

Author: Chen Ching-Tai
Ho Shinn-Ying
Hsu Wen-Lian
Lin Hsin-Nan
Sung Ting-Yi
Publication venue: BioMed Central
Publication date: 01/01/2009

Abstract. Background: The study of protein subcellular localization (PSL) is important for elucidating protein functions involved in various cellular processes. However, determining the localization sites of a protein through wet-lab experiments can be time-consuming and labor-intensive, so computational approaches are highly desirable. Most PSL prediction systems are designed for single-localized proteins. However, a significant number of eukaryotic proteins are known to be localized to multiple subcellular organelles. Many studies have shown that proteins may simultaneously locate to, or move between, different cellular compartments and be involved in different biological processes with different roles. Results: In this study, we propose a knowledge-based method, called KnowPredsite, to predict the localization site(s) of both single-localized and multi-localized proteins. Based on local similarity, we can identify the "related sequences" for prediction. We construct a knowledge base to record the possible sequence variations for protein sequences. When predicting the localization annotation of a query protein, we search the knowledge base and use a scoring mechanism to determine the predicted sites. We downloaded the dataset from ngLOC, which covers ten distinct subcellular organelles from 1923 species, and performed ten-fold cross-validation experiments to evaluate KnowPredsite's performance. The experimental results show that KnowPredsite achieves higher prediction accuracy than ngLOC and the Blast-hit method. For single-localized proteins, the overall accuracy of KnowPredsite is 91.7%. For multi-localized proteins, the overall accuracy of KnowPredsite is 72.1%, which is significantly higher than that of ngLOC, by 12.4%. Notably, half of the proteins in the dataset for which no Blast hit sequence above a specified threshold can be found are still correctly predicted by KnowPredsite.
Conclusion: KnowPredsite demonstrates the power of identifying related sequences in the knowledge base. The experimental results show that even when overall sequence similarity is low, local similarity is effective for prediction, and that KnowPredsite is a highly accurate prediction method for both single- and multi-localized proteins. It is worth mentioning that the prediction process of KnowPredsite is transparent and biologically interpretable, and that it shows the set of template sequences used to generate the prediction result. The KnowPredsite prediction server is available at http://bio-cluster.iis.sinica.edu.tw/kbloc/.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Protein (Multi-)Location Prediction: Using Location Inter-Dependencies in a Probabilistic Framework

Author: Shatkay Hagit
Simha Ramanuja
Publication date: 29/07/2013

Knowing the location of a protein within the cell is important for understanding its function, role in biological processes, and potential use as a drug target. Much progress has been made in developing computational methods that predict single locations for proteins, assuming that proteins localize to a single location. However, it has been shown that proteins localize to multiple locations. While a few recent systems have attempted to predict multiple locations of proteins, they typically treat locations as independent or capture inter-dependencies by treating each combination of locations present in the training set as an individual location class. We present a new method and a preliminary system we have developed that directly incorporates inter-dependencies among locations into the multiple-location-prediction process, using a collection of Bayesian network classifiers. We evaluate our system on a dataset of single- and multi-localized proteins. Our results, obtained by incorporating inter-dependencies, are significantly higher than those obtained by classifiers that do not use inter-dependencies. The performance of our system on multi-localized proteins is comparable to a top performing system (YLoc+), without restricting predictions to be based only on location combinations present in the training set. Comment: Peer-reviewed and presented as part of the 13th Workshop on Algorithms in Bioinformatics (WABI2013)

arXiv.org e-Print Archive

Springer - Publisher Connector

Signal peptides and protein localization prediction

Author: Nielsen Henrik
Publication venue: John Wiley and Sons Ltd
Publication date: 01/01/2005

Online Research Database In Technology

TESTLoc: protein subcellular localization prediction from EST data

Author: Gertraud Burger
Yao-Qing Shen
Publication venue: BioMed Central
Publication date: 01/01/2010

Abstract. Background: The eukaryotic cell has an intricate architecture with compartments and substructures dedicated to particular biological processes. Knowing the subcellular location of proteins not only indicates how bio-processes are organized in different cellular compartments, but also contributes to unravelling the function of individual proteins. Computational localization prediction is possible based on sequence information alone, and has been successfully applied to proteins from virtually all subcellular compartments and all domains of life. However, we realized that current prediction tools do not perform well on partial protein sequences such as those inferred from Expressed Sequence Tag (EST) data, limiting the exploitation of the large and taxonomically most comprehensive body of sequence information from eukaryotes. Results: We developed a new predictor, TESTLoc, suited for subcellular localization prediction of proteins based on their partial sequences conceptually translated from ESTs (EST-peptides). A Support Vector Machine (SVM) is used as the computational method, and EST-peptides are represented by different features such as amino acid composition and physicochemical properties. When TESTLoc was applied to the most challenging test case (plant data), it yielded high accuracy (~85%). Conclusions: TESTLoc is a localization prediction tool tailored for EST data. It provides a variety of models for the users to choose from, and is available for download at http://megasun.bch.umontreal.ca/~shenyq/TESTLoc/TESTLoc.html
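
As an illustration of the amino acid composition feature mentioned in the abstract above, the following is a minimal sketch, not TESTLoc's actual implementation. The function name `amino_acid_composition`, the alphabet ordering, and the choice to skip non-standard residues are all assumptions made for this example; the resulting 20-dimensional vector is the kind of input an SVM could be trained on.

```python
# Hypothetical sketch: amino acid composition of a (possibly partial) peptide.
# The 20 standard residues, in an arbitrary but fixed order (an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(peptide):
    """Return the fraction of each standard residue as a 20-element list.

    Characters outside the standard alphabet (e.g. 'X') are ignored, which
    is an illustrative choice rather than anything stated in the abstract.
    """
    counts = {aa: 0 for aa in AMINO_ACIDS}
    total = 0
    for residue in peptide.upper():
        if residue in counts:
            counts[residue] += 1
            total += 1
    if total == 0:
        # No recognizable residues: return an all-zero vector.
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]
```

Because the vector has a fixed length regardless of peptide length, it can be computed for EST-peptides of any size.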

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

cis-acting sequences and trans-acting factors in the localization of mRNA for mitochondrial ribosomal proteins

Author: Amoresano Angela
C. Cirulli
Pietropaolo Concetta
Pucci Pietro
Russo Annapina
Russo Giulia
Publication venue: Elsevier BV
Publication date: 01/01/2008

mRNA localization is a conserved post-transcriptional process crucial for a variety of systems. Although several mechanisms have been identified, emerging evidence suggests that most transcripts reach the protein functional site by moving along cytoskeleton elements. We demonstrated previously that mRNAs for mitochondrial ribosomal proteins are asymmetrically distributed in the cytoplasm, and that localization in the proximity of mitochondria is mediated by the 3′-UTR. Here we show by biochemical analysis that these mRNA transcripts are associated with the cytoskeleton through the microtubule network. Cytoskeleton association is functional for their intracellular localization near the mitochondrion, and the 3′-UTR is involved in this cytoskeleton-dependent localization. To identify the minimal elements required for localization, we generated DNA constructs containing, downstream from the GFP gene, deletion mutants of the mitochondrial ribosomal protein S12 3′-UTR, and expressed them in HeLa cells. RT-PCR analysis showed that the signals responsible for mRNA localization are located in the first 154 nucleotides. RNA pulldown assays, mass spectrometry, and RNP immunoprecipitation assays demonstrated that the mitochondrial ribosomal protein S12 3′-UTR interacts specifically with TRAP1 (tumor necrosis factor receptor-associated protein 1), hnRNPM4 (heterogeneous nuclear ribonucleoprotein M4), Hsp70 and Hsp60 (heat shock proteins 70 and 60), and α-tubulin in vitro and in vivo.

Archivio della ricerca - Università degli studi di Napoli Federico II

eSLDB: eukaryotic subcellular localization database

Author: Casadio Rita
Fariselli Piero
Martelli Pier Luigi
Pierleoni Andrea
Publication venue: Oxford University Press
Publication date: 01/01/2006

The Eukaryotic Subcellular Localization DataBase (eSLDB) collects subcellular localization annotations for eukaryotic proteomes. So far five proteomes have been processed and stored: Homo sapiens, Mus musculus, Caenorhabditis elegans, Saccharomyces cerevisiae and Arabidopsis thaliana. For each sequence, the database lists the localization obtained with three different approaches: (i) experimentally determined (when available); (ii) homology-based (when possible); and (iii) predicted. The latter is computed with a suite of machine-learning-based methods developed in-house. All the data are available at our website and can be searched by sequence, by protein code and/or by protein description. Furthermore, a more complex search can be performed by combining different search fields and keys. All the data contained in the database can be freely downloaded in flat file format. The database is available at

CiteSeerX

Crossref

PubMed Central

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Archivio istituzionale della ricerca - Università di Padova

Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space?

Author: Bastien Olivier
Birkholtz Lyn-Marie
Breton Vincent
Grando Delphine
Hofmann-Apitius Martin
Jacq Nicolas
Joubert Fourie
Kasam Vinod
Louw Abraham I
Maréchal Eric
Ortet Philippe
Roy Sylvaine
Saïdani Nadia
Wells Gordon
Zimmermann Marc
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2006

The organization and mining of malaria genomic and post-genomic data is strongly motivated by the need to predict and characterize new biological targets and new drugs. Biological targets are sought in a biological space designed from the genomic data of Plasmodium falciparum, but also using the millions of genomic data from other species. Drug candidates are sought in a chemical space containing the millions of small molecules stored in public and private chemolibraries. Data management should therefore be as reliable and versatile as possible. In this context, we examined five aspects of the organization and mining of malaria genomic and post-genomic data: 1) the comparison of protein sequences, including compositionally atypical malaria sequences, 2) the high-throughput reconstruction of molecular phylogenies, 3) the representation of biological processes, particularly metabolic pathways, 4) the versatile methods to integrate genomic data, biological representations and functional profiling obtained from X-omic experiments after drug treatments, and 5) the determination and prediction of protein structures and their molecular docking with drug candidate structures. Progress toward a grid-enabled chemogenomic knowledge space is discussed. Comment: 43 pages, 4 figures, to appear in Malaria Journal

Hal - Université Grenoble Alpes

HAL AMU

Fraunhofer-ePrints

HAL Clermont Université

HAL Descartes

HAL-CEA

ProdInra

arXiv.org e-Print Archive

HAL-IN2P3

Springer - Publisher Connector

PubMed Central

UPSpace at the University of Pretoria

Iterative orthology prediction uncovers new mitochondrial proteins and identifies C12orf62 as the human ortholog of COX14, a protein involved in the assembly of cytochrome c oxidase

Author: Cuypers Thomas D
Esseling John J
Gloerich Jolein
Huynen Martijn A
Lasonder Edwin
Nijtmans Leo G
Riemersma Moniek
Szklarczyk Radek
van den Brand Mariël AM
van den Heuvel Lambert P
Wanschers Bas FJ
Publication venue: BioMed Central
Publication date: 01/01/2012

BACKGROUND: Orthology is a central tenet of comparative genomics and ortholog identification is instrumental to protein function prediction. Major advances have been made to determine orthology relations among a set of homologous proteins. However, they depend on the comparison of individual sequences and do not take into account divergent orthologs. RESULTS: We have developed an iterative orthology prediction method, Ortho-Profile, that uses reciprocal best hits at the level of sequence profiles to infer orthology. It increases ortholog detection by 20% compared to sequence-to-sequence comparisons. Ortho-Profile predicts 598 human orthologs of mitochondrial proteins from Saccharomyces cerevisiae and Schizosaccharomyces pombe with 94% accuracy. Of these, 181 were not known to localize to mitochondria in mammals. Among the predictions of the Ortho-Profile method are 11 human cytochrome c oxidase (COX) assembly proteins that are implicated in mitochondrial function and disease. Their co-expression patterns, experimentally verified subcellular localization, and co-purification with human COX-associated proteins support these predictions. For the human gene C12orf62, the ortholog of S. cerevisiae COX14, we specifically confirm its role in negative regulation of the translation of cytochrome c oxidase. CONCLUSIONS: Divergent homologs can often only be detected by comparing sequence profiles and profile-based hidden Markov models. The Ortho-Profile method takes advantage of these techniques in the quest for orthologs.
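
The reciprocal-best-hit criterion named in the abstract above can be sketched generically as follows. This is only an illustration under stated assumptions, not the Ortho-Profile implementation: the function names `best_hits` and `reciprocal_best_hits` are hypothetical, and the similarity scores are assumed to be precomputed (in Ortho-Profile they would come from profile-level comparisons, which are not reproduced here).

```python
# Hypothetical sketch of reciprocal best hits (RBH) over precomputed scores.
# Each input maps (query_id, target_id) -> similarity score (higher is better).

def best_hits(scores):
    """Map each query to its single highest-scoring target."""
    best = {}
    for (query, target), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return {query: target for query, (target, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b)
    backward = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if backward.get(b) == a)
```

A pair is reported only when the best hit holds in both search directions, which is what distinguishes this criterion from a one-directional best-hit search.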

Crossref

Northumbria Research Link

Springer - Publisher Connector

PubMed Central

Plymouth Electronic Archive and Research Library

Radboud Repository

Analyses and web interfaces for protein subcellular localization and gene expression data

Author: Bilen Biter
Publication venue: Bilkent University
Publication date: 01/01/2007

Cataloged from PDF version of article. In order to benefit maximally from the large-scale molecular biology data generated by recent developments, it is important to proceed in an organized manner by developing databases, interfaces, data visualization and data interpretation tools. Protein subcellular localization and microarray gene expression are two such fields that require immense computational effort before being used as a roadmap for the experimental biologist. Protein subcellular localization is important for elucidating protein function. We developed an automatically updated, searchable and downloadable system called model organisms proteome subcellular localization database (MEP2SL) that hosts predicted localizations and known experimental localizations for nine eukaryotes. MEP2SL localizations correlated highly with high-throughput localization experiments in yeast and were shown to have superior accuracies when compared with four other localization prediction tools on two different datasets. Hence, the MEP2SL system may serve as a reference source for protein subcellular localization information, with an interface that provides various search and download options together with links and utilities for further annotations. Microarray gene expression technology enables monitoring of the whole genome simultaneously. We developed an online, installable, searchable, open source system called differentially expressed genes (DEG) that includes analysis and retrieval interfaces for Affymetrix HG-U133 Plus 2.0 arrays. DEG provides permanent data storage through its integration with a database and, being an installable online tool, is valuable for groups who are not willing to submit their data to public servers. Bilen, Biter, M.S.

Bilkent University Institutional Repository