26 research outputs found

An SVM-based system for predicting protein subnuclear localizations

Author: Dai Yang
Lei Zhengdeng
Publication venue: BioMed Central
Publication date: 01/01/2005

BACKGROUND: The large gap between the number of protein sequences in databases and the number of functionally characterized proteins calls for the development of a fast computational tool for the prediction of subnuclear and subcellular localizations generally applicable to protein sequences. The information on localization may reveal the molecular function of novel proteins, in addition to providing insight on the biological pathways in which they function. The bulk of past work has been focused on protein subcellular localizations. Furthermore, no specific tool has been dedicated to prediction at the subnuclear level, despite its high importance. In order to design a suitable predictive system, the extraction of subtle sequence signals that can discriminate among proteins with different subnuclear localizations is the key. RESULTS: New kernel functions used in a support vector machine (SVM) learning model are introduced for the measurement of sequence similarity. The k-peptide vectors are first mapped by a matrix of high-scored pairs of k-peptides which are measured by BLOSUM62 scores. The kernels, measuring the similarity for sequences, are then defined on the mapped vectors. By combining these new encoding methods, a multi-class classification system for the prediction of protein subnuclear localizations is established for the first time. The performance of the system is evaluated with a set of proteins collected in the Nuclear Protein Database (NPD). The overall accuracy of prediction for 6 localizations is about 50% (vs. random prediction 16.7%) for single localization proteins in the leave-one-out cross-validation; and 65% for an independent set of multi-localization proteins. This integrated system can be accessed at . CONCLUSION: The integrated system benefits from the combination of predictions from several SVMs based on selected encoding methods. 
Finally, the predictive power of the system is expected to improve as more proteins with known subnuclear localizations become available.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Abundant copy-number loss of CYCLOPS and STOP genes in gastric adenocarcinoma

Author: Chan Weng Hoong
Cutcutache Ioana
Deng Niantao
Lei Zhengdeng
McPherson John Richard
Ooi London Lucien
Rozen Steven G.
Soo Khee Chee
Suzuki Yuka
Tan Patrick
Welsch Roy E
Wong Wai Keong
Wu Alice Yingting
Zhang Shenli
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2015

Background: Gastric cancer, a leading cause of cancer death worldwide, has been little studied compared with other cancers that impose similar health burdens. Our goal is to assess genomic copy-number loss and the possible functional consequences and therapeutic implications thereof across a large series of gastric adenocarcinomas. Methods: We used high-density single-nucleotide polymorphism microarrays to determine patterns of copy-number loss and allelic imbalance in 74 gastric adenocarcinomas. We investigated whether suppressor of tumorigenesis and/or proliferation (STOP) genes are associated with genomic copy-number loss. We also analyzed the extent to which copy-number loss affects Copy-number alterations Yielding Cancer Liabilities Owing to Partial losS (CYCLOPS) genes, which may be attractive targets for therapeutic inhibition when partially deleted. Results: The proportion of the genome subject to copy-number loss varies considerably from tumor to tumor, with a median of 5.5 % and a mean of 12 % (range 0–58.5 %). On average, 91 STOP genes were subject to copy-number loss per tumor (median 35, range 0–452), and STOP genes tended to have lower copy number than the rest of the genes. Furthermore, on average, 1.6 CYCLOPS genes per tumor were both subject to copy-number loss and downregulated, and 51.4 % of the tumors had at least one such gene. Conclusions: The enrichment of STOP genes in regions of copy-number loss indicates that their deletion may contribute to gastric carcinogenesis. Furthermore, the presence of several deleted and downregulated CYCLOPS genes in some tumors suggests potential therapeutic targets in these tumors.

Singapore. Ministry of Health (Duke-NUS Signature Research Programs); Singapore. Agency for Science, Technology and Research; Singapore-MIT Allianc

DSpace@MIT

Crossref

Springer - Publisher Connector

PubMed Central

Assessing protein similarity with Gene Ontology and its use in subnuclear localization prediction

Author: Yang Dai
Zhengdeng Lei
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: The completion of various genome sequencing projects has resulted in the accumulation of a massive amount of gene sequence information. This calls for a large-scale computational method for predicting protein localization from sequence. A protein's localization can provide valuable information about its molecular function, as well as the biological pathway in which it participates. Predicting the localization of a protein at the subnuclear level is a challenging task. In our previous work we proposed an SVM-based system using protein sequence information for this prediction task. In this work, we assess protein similarity with Gene Ontology (GO) and then improve the performance of the system by adding a nearest neighbor classifier module that uses a similarity measure derived from the GO annotation terms for protein sequences. RESULTS: The performance of the new system proposed here was compared with our previous system using a set of proteins residing within 6 localizations, collected from the Nuclear Protein Database (NPD). The overall MCC (accuracy) is elevated from 0.284 (50.0%) to 0.519 (66.5%) for single-localization proteins in leave-one-out cross-validation; and from 0.420 (65.2%) to 0.541 (65.2%) for an independent set of multi-localization proteins. The new system is available at . CONCLUSION: The prediction of protein subnuclear localizations can be strongly influenced by how the similarity of a pair of proteins is defined from different similarity measures of GO terms. Using the sum of similarity scores over the matched GO term pairs for two proteins as the similarity definition produced the best predictive outcome. Substantial improvement in predicting protein subnuclear localizations has been achieved by combining Gene Ontology with sequence information.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Abundant copy-number loss of CYCLOPS and STOP genes in gastric adenocarcinoma

Author: Alice Yingting Wu
Ioana Cutcutache
John Richard McPherson
Khee Chee Soo
London Lucien Ooi
Niantao Deng
Patrick Tan
Roy Welsch
Shenli Zhang
Steven G. Rozen
Wai Keong Wong
Weng Hoong Chan
Yuka Suzuki
Zhengdeng Lei
Publication venue: Springer Science and Business Media LLC

Crossref

Genome-wide computational prediction of protein localizations

Author: Zhengdeng Lei
Publication date: 01/01/2007

University of Illinois at Chicago: UIC INDIGO (INtellectual property in DIGital form available online in an Open environment)

A New Kernel Based on High-Scored Pairs of Tri-peptides and Its Application in Prediction of Protein Subcellular Localization

Author: Yang Dai
Zhengdeng Lei

Abstract. A new kernel has been developed for vectors derived from a coding scheme of the tri-peptide composition for protein sequences. This kernel defines the sequence similarity through a mapping that transforms a tri-peptide coding vector into a new vector based on a matrix formed by the high BLOSUM scores associated with pairs of tri-peptides. In conjunction with the use of support vector machines, the effectiveness of the new kernel is evaluated against the conventional coding schemes of k-peptide (k ≤ 3) for the prediction of subcellular localizations of proteins in Gram-negative bacteria. It is demonstrated that the new method outperforms all the other methods in a 5-fold cross-validation. Keywords: protein subcellular localization, Gram-negative bacteria, BLOSUM matrix, kernel, support vector machine.
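
The mapping described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three-letter alphabet, the toy substitution scores, the threshold of 8, and all function names are assumptions chosen to keep the example small. The idea shown is only the general one: build a tri-peptide composition vector, map it through a matrix that keeps high-scoring tri-peptide pairs, and take the inner product of the mapped vectors as the kernel value.

```python
from itertools import product

import numpy as np

# Assumption: a toy three-residue alphabet with illustrative substitution
# scores, standing in for a full 20-residue BLOSUM matrix.
ALPHABET = "AGL"
SCORES = {
    ("A", "A"): 4, ("G", "G"): 6, ("L", "L"): 4,
    ("A", "G"): 0, ("A", "L"): -1, ("G", "L"): -4,
}
K = 3  # tri-peptides

PEPTIDES = ["".join(p) for p in product(ALPHABET, repeat=K)]
INDEX = {p: i for i, p in enumerate(PEPTIDES)}


def score(a, b):
    """Symmetric lookup of the toy substitution score for two residues."""
    return SCORES.get((a, b), SCORES.get((b, a)))


def composition(seq):
    """Normalized tri-peptide frequency vector (residues must be in ALPHABET)."""
    v = np.zeros(len(PEPTIDES))
    for i in range(len(seq) - K + 1):
        v[INDEX[seq[i:i + K]]] += 1
    total = v.sum()
    return v / total if total else v


def high_score_matrix(threshold):
    """Keep the summed residue score of a tri-peptide pair only if it is high."""
    n = len(PEPTIDES)
    m = np.zeros((n, n))
    for i, p in enumerate(PEPTIDES):
        for j, q in enumerate(PEPTIDES):
            s = sum(score(a, b) for a, b in zip(p, q))
            if s >= threshold:
                m[i, j] = s
    return m


# Hypothetical cutoff for what counts as a "high-scored" pair.
M = high_score_matrix(threshold=8)


def kernel(x, y):
    """Inner product of the two mapped composition vectors."""
    return float(np.dot(M @ composition(x), M @ composition(y)))
```

Because the kernel is an inner product of mapped vectors, it is symmetric and positive semidefinite, so a Gram matrix built from `kernel` could in principle be passed to an SVM that accepts precomputed kernels.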

CiteSeerX

Lipidomics identifies a requirement for peroxisomal function during influenza virus replication

Author: Chng Charmaine
Guan Xue Li
Lei Zhengdeng
Rozen Steven G.
Tanner Lukas Bahati
Wenk Markus R.
Publication venue: American Society for Biochemistry & Molecular Biology (ASBMB)
Publication date: 01/01/2014

Influenza virus acquires a host-derived lipid envelope during budding, yet a convergent view on the role of host lipid metabolism during infection is lacking. Using a mass spectrometry-based lipidomics approach, we provide a systems-scale perspective on membrane lipid dynamics of infected human lung epithelial cells and purified influenza virions. We reveal enrichment of the minor peroxisome-derived ether-linked phosphatidylcholines relative to bulk ester-linked phosphatidylcholines in virions as a unique pathogenicity-dependent signature for influenza not found in other enveloped viruses. Strikingly, pharmacological and genetic interference with peroxisomal and ether lipid metabolism impaired influenza virus production. Further integration of our lipidomics results with published genomics and proteomics data corroborated altered peroxisomal lipid metabolism as a hallmark of influenza virus infection in vitro and in vivo. Influenza virus may therefore tailor peroxisomal and particularly ether lipid metabolism for efficient replication.

Crossref

edoc

PubMed Central

Modeling and optimization for separation of ionic solutes in pressurized flow capillary electrochromatography

Author: Hanfa Zou
Hongqing Fu
Mingliang Ye
Renan Wu
Zhengdeng Lei
Publication venue: Wiley
Publication date: 01/01/2002

Crossref