Public Library of Science (PLOS)

Prediction of Protein Domain with mRMR Feature Selection and Analysis

Author: AA Schaffer
AG Murzin
AK Dunker
AM Moses
AP Elhammer
B Saffari
Bi-Qing Li
Bin Xue
BQ Li
CA Orengo
D Chivian
D Li
DE Kim
E Angov
EC Mbamala
G Pugalenthi
GP Zhou
H Ingolfsson
H Mohabatkar
H Peng
HB Shen
I Walsh
ID Campbell
IH Witten
J Chen
J Cheng
J Eickholt
J Lin
J Liu
J Wang
JD Qiu
JE Gewehr
JJ Chou
JR Schnell
K Peng
K Shameer
K Wang
Kai-Yan Feng
KC Chou
KK Kandaswamy
Kuo-Chen Chou
L Breiman
L Chen
L Holm
Le-Le Hu
Lei Chen
M Esmaeili
M Hayat
M Suyama
MJ Berardi
MK Yoon
N Nagarajan
N von Ohsen
NM Goldenberg
P Mundra
P Tompa
P Wang
PE Wright
PK Nielsen
Q Gu
R Apweiler
R Bondugula
R Guerois
R Linding
RA George
RA Poorman
S Gong
S Kawashima
S Roy
SC Jia
SF Altschul
SM Reynolds
T Ebina
T Huang
TA Holland
W Li
W Zhao
WR Atchley
WZ Lin
X Xiao
Y Zhang
YD Cai
YD Li
Yu-Dong Cai
YX Li
Z He
Z Qiu
ZC Wu
Publication venue: Public Library of Science
Publication date: 01/01/2012

Domains are the structural and functional units of proteins. With the avalanche of protein sequences generated in the postgenomic age, it is highly desirable to develop effective methods for predicting protein domains from sequence information alone, so as to facilitate protein structure prediction and speed up functional annotation. However, although many efforts have been made in this regard, predicting protein domains from sequence information remains a challenging and elusive problem. Here, a new method was developed by combining the techniques of RF (random forest), mRMR (maximum relevance minimum redundancy), and IFS (incremental feature selection), and by incorporating features of physicochemical and biochemical properties, sequence conservation, residual disorder, secondary structure, and solvent accessibility. The overall success rate achieved by the new method on an independent dataset was around 73%, about 28–40% higher than those achieved by the existing method on the same benchmark dataset. Furthermore, an in-depth analysis revealed that the features of evolution, codon diversity, electrostatic charge, and disorder played more important roles than the others in predicting protein domains, which is quite consistent with experimental observations. It is anticipated that the new method may become a high-throughput tool for annotating protein domains, or may at the very least play a complementary role to existing domain prediction methods, and that the findings about the key features with high impact on domain prediction might provide useful insights or clues for further experimental investigations in this area. Finally, it has not escaped our notice that the current approach can also be used to study protein signal peptides, B-cell epitopes, and HIV protease cleavage sites, among many other important topics in protein science and biomedicine.
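The mRMR ranking and IFS steps named in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes discrete features, uses a plain greedy "relevance minus mean redundancy" criterion, and leaves the classifier as a hypothetical `evaluate` callable (the paper used a random forest). The function names `mutual_information`, `mrmr_rank`, and `incremental_feature_selection` are illustrative only.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def mrmr_rank(columns, labels):
    """Greedy mRMR ranking over a dict {feature_name: list_of_values}.

    At each step, pick the feature with the highest relevance to the labels
    minus its mean redundancy with the features already selected.
    """
    relevance = {name: mutual_information(col, labels) for name, col in columns.items()}
    selected, remaining = [], set(columns)

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(
            mutual_information(columns[name], columns[s]) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while remaining:
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


def incremental_feature_selection(ranked, evaluate):
    """IFS: score every top-k prefix of the ranking and keep the best one.

    `evaluate` is a caller-supplied (hypothetical) function that takes a list
    of feature names and returns a quality score such as prediction accuracy.
    Ties are resolved in favor of the smaller feature set.
    """
    best_k = max(range(1, len(ranked) + 1), key=lambda k: evaluate(ranked[:k]))
    return ranked[:best_k]
```

On a toy dataset, a feature that duplicates an already selected one is pushed down the ranking because its redundancy cancels its relevance, and IFS then stops at the shortest prefix that reaches the best score.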

CiteSeerX

FigShare

STM-induced light emission from thin films of perylene derivatives on the HOPG and Au substrates

Author: A Fujiki
Akira Saito
Aya Fujiki
BQ Xu
CR Newman
CW Struijk
D Ino
E Cavar
E Lifshiz
ET Jensen
G Horowitz
H Langhals
J-J Greffet
JR Lakowicz
K Akers
K Balakrishnan
K Okamoto
L Schmidt-Mende
L Zang
M Sakurai
M-K Kwon
Megumi Akai-Kasaya
P Schouwink
PA Hobson
PP Pompa
R Berndt
RF Kubin
S Alibert-Fouet
S Demmig
T Uemura
XH Qiu
Y Uehara
Y Vertsimakha
Y Zhang
Yasushi Oshikane
YC Chen
Yuji Kuwahara
Yusuke Miyake
Z-C Dong
ZC Dong
Publication venue: Springer

We have investigated the emission properties of N,N'-diheptyl-3,4,9,10-perylenetetracarboxylic diimide thin films by the tunneling-electron-induced light emission technique. A fluorescence peak with vibronic progressions and large Stokes shifts was observed on both highly ordered pyrolytic graphite (HOPG) and Au substrates, indicating that the emission was derived from the isolated-molecule-like film condition with sufficient π-π interaction of the perylene rings of perylenetetracarboxylic diimide molecules. The upconversion emission mechanism of the tunneling-electron-induced emission was discussed in terms of inelastic tunneling including multiexcitation processes. Wavelength-selective enhanced emission due to a localized tip-induced surface plasmon on the Au substrate was also obtained.

Soluble expression, purification, and characterization of active recombinant human tissue plasminogen activator by auto-induction in E. coli

Author: A Granelli-Piperno
A Iwata
A Vindigni
Ailong Huang
AM Baca
BJ Desai
D Collen
D Pennica
Deqiang Wang
FW Studier
FX Ding
GS Waldo
H-J Lee
Hongpeng Zhang
HR Lijnen
J Jiao
J Manosroi
J Qiu
J Tang
Jianzhong Zhou
JW Dubendorff
Ke Chen
L Waxman
Lei Bai
M Ploug
MG Obukowicz
Miao Luo
OD Ekici
OG Wilhelm
PE Molloy
PH Bessette
Quan He
R Batra
R Mattes
SA Rouf
Shaocheng Zhang
Shuang Wu
T Biswas
T Yamada
TH Grossman
W Li
Xiaobin Long
Y Geng
Yeran Gou
Z Li
ZC Hua
Publication venue: 'Springer Science and Business Media LLC'

Helsingin yliopiston digitaalinen arkisto

A verified genomic reference sample for assessing performance of cancer panels detecting small variants of low allele frequency

Background: Oncopanel genomic testing, which identifies important somatic variants, is increasingly common in medical practice and especially in clinical trials. Currently, there is a paucity of reliable genomic reference samples having a suitably large number of pre-identified variants for properly assessing oncopanel assay analytical quality and performance. The FDA-led Sequencing and Quality Control Phase 2 (SEQC2) consortium analyzes ten diverse cancer cell lines individually and their pool, termed Sample A, to develop a reference sample with suitably large numbers of coding positions with known (variant) positives and negatives for properly evaluating oncopanel analytical performance. Results: In reference Sample A, we identify more than 40,000 variants down to 1% allele frequency, with more than 25,000 variants having less than 20% allele frequency and 1653 variants in COSMIC-related genes. This is 5-100x more than in existing commercially available samples. We also identify an unprecedented number of negative positions in coding regions, allowing statistical rigor in assessing limit-of-detection, sensitivity, and precision. Over 300 loci are randomly selected and independently verified via droplet digital PCR with 100% concordance. Agilent normal reference Sample B can be admixed with Sample A to create new samples with a similar number of known variants at much lower allele frequency than what exists in Sample A natively, including known variants having an allele frequency of 0.02%, a range suitable for assessing liquid biopsy panels. Conclusion: These new reference samples and their admixtures provide superior capability for performing oncopanel quality control, analytical accuracy, and validation for small to large oncopanels and liquid biopsy assays. Peer reviewed.
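The admixture idea in the abstract above (diluting Sample A with normal Sample B to reach lower allele frequencies) comes down to a weighted average. The sketch below shows that arithmetic under loud assumptions that are not stated in the source: both samples contribute genome copies in proportion to their mixing fraction, and the variant is absent from Sample B unless `af_b` is given. The function names and the example mixing fraction are hypothetical.

```python
def admixed_allele_frequency(af_a, fraction_a, af_b=0.0):
    """Expected allele frequency after mixing Sample A into Sample B.

    af_a       -- allele frequency of the variant in Sample A (0..1)
    fraction_a -- fraction of genome copies in the mix that come from A (0..1)
    af_b       -- allele frequency in Sample B (assumed 0 for a normal reference)
    """
    if not 0.0 <= fraction_a <= 1.0:
        raise ValueError("fraction_a must lie in [0, 1]")
    return fraction_a * af_a + (1.0 - fraction_a) * af_b


def fraction_of_a_for_target(af_a, target_af):
    """Fraction of Sample A needed to dilute a variant to target_af,
    assuming the variant is absent from Sample B."""
    if not 0.0 < target_af <= af_a:
        raise ValueError("target_af must be positive and no larger than af_a")
    return target_af / af_a
```

Under these assumptions, a variant present at 1% in Sample A would reach 0.02% when Sample A makes up 2% of the mix: `admixed_allele_frequency(0.01, 0.02)` evaluates to approximately `0.0002`.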

Comparison of posterior correction results between Marfan syndrome scoliosis and adolescent idiopathic scoliosis—a retrospective case-series study

Author: A Paepe De
Bin Yu
GE Lipton
GM Villeirs
Guixing Qiu
J Zenner
Jianguo Zhang
Jianxiong Shen
JP Gjolaj
JP Thompson
KB Jones
M Silvestre Di
PD Sponseller
QY Li
R Fattori
RE Pyeritz
WE Stern
Weiqiang Liang
Yipeng Wang
ZC Li
Zhengyao Li
Publication venue: 'Springer Science and Business Media LLC'

Heterogeneity of mammary lesions represent molecular differences

Author: A Castro
A Hoque
A Muller
Alexander D Borowsky
AW Griffioen
B Ren
B Shi
BS Shoker
CM Perou
Colin A Baron
Condie E Carmack
CS Schuetz
D Kalaitzidis
D Tamiolakisl
DJ Slamon
DL Myer
DM Anderson
E Forrester
E Mallon
ER Fearon
ES Hwang
EY Lin
F Eckerdt
F Moll
G Jonsson
G Liu
HJ Zeh 3rd
HK Lee
HK Reddy
J Fridlyand
J Yao
JC Bouma-ter Steege
JE Maglione
Jeannie E Maglione
Jeffrey P Gregg
JG Hodgson
JM Peters
JS Winston
K Chin
K Gunther
KH Lim
L Yuste
Lawrence JT Young
LJ van 't Veer
LM Coussens
M Aubele
M Philip
MA van Vugt
MC Cid
MJ Bissell
MS Kinch
MT Barrett
MW Landis
P Meraldi
P O'Connell
PJ Kushner
PL Schwartzberg
Q Lu
Q Yu
R Namba
RD Cardiff
RH Weiss
RL Sutherland
Robert D Cardiff
RS Muraoka
RS Muraoka-Cook
RT Phan
Ruria Namba
Ryan R Davis
S Gery
S Ramaswamy
S Troup
SJ Holland
SL Grimm
SR Ritland
Stephenie Liu
T Sorlie
TH Bugge
TH Qiu
TR Hughes
WP Lee
XJ Ma
Y Feng
Y Miyoshi
Y Tsuchiya
ZC Wang
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Human breast cancer is a heterogeneous disease, histopathologically, molecularly and phenotypically. The molecular basis of this heterogeneity is not well understood. We have used a mouse model of DCIS that consists of unique lines of mammary intraepithelial neoplasia (MIN) outgrowths, the premalignant lesion in the mouse that progresses to invasive carcinoma, to understand the molecular changes that are characteristic of certain phenotypes. Each MIN-O line has distinguishable morphologies, metastatic potentials and estrogen dependencies. METHODS: We utilized oligonucleotide expression arrays and high-resolution array comparative genomic hybridization (aCGH) to investigate whole genome expression patterns and whole genome aberrations in both the MIN-O and tumor from four different MIN-O lines that each have different phenotypes. From the whole genome analysis at 35 kb resolution, we found that chromosomes 1, 2, 10, and 11 were frequently associated with whole chromosome gains in the MIN-Os. In particular, two MIN-O lines had the majority of the chromosome gains. Although we did not find any whole chromosome loss, we identified 3 recurring chromosome losses (2F1-2, 3E4, 17E2) and two chromosome copy number gains on chromosome 11. These interstitial deletions and duplications were verified with a custom-made array designed to interrogate the specific regions at approximately 550 bp resolution. RESULTS: We demonstrated that expression and genomic changes are present in the early premalignant lesions and that these molecular profiles can be correlated to phenotype (metastasis and estrogen responsiveness). We also identified expression changes associated with genomic instability. Progression to invasive carcinoma was associated with few additional changes in gene expression and genomic organization.
Therefore, in the MIN-O mice, early premalignant lesions have the major molecular and genetic changes required, and these changes have important phenotypic significance. In contrast, the changes that occur in the transition to invasive carcinoma are subtle, with few consistent changes and no association with phenotype. CONCLUSION: We propose that the early lesions carry the important genetic changes that reflect the major phenotypic information, while additional genetic changes that accumulate in the invasive carcinoma are less associated with the overall phenotype.

Springer - Publisher Connector

eScholarship - University of California

Classification and Analysis of Regulatory Pathways Using Graph Property, Biochemical and Physicochemical Property, and Functional Property

Author: A Bairoch
A Barabasi
C Chen
C Klukas
C Krieger
Cathal Seoighe
CF Gao
D Chakrabarti
D Frishman
DN Georgiou
E Camon
F Chiti
G Pollastri
GF Cooper
GP Zhou
GY Zhang
H Ding
H Lin
H Mohabatkar
H Ogata
H Peng
I Althaus
I Dubchak
I Schomburg
IH Witten
J Andraos
J Cheng
JD Qiu
JM Dale
K Chou
KC Chou
Kuo-Chen Chou
L Chen
L Lu
L Yu
Lei Chen
M Chang
M Esmaeili
M Kanehisa
N Chazal
N Friedman
P Carmona-Saez
P Pharkya
Q Gu
R Caspi
RR Bouckaert
S Salzberg
SS Keerthi
T Denoeux
T Huang
Tao Huang
U Stelzl
W Buntine
X Xiao
XB Zhou
Y Cai
Y Qi
YH Zeng
YS Lobanova
Yu-Dong Cai
Z He
ZC Wu
Publication venue: Public Library of Science
Publication date: 01/01/2011

Given a regulatory pathway system consisting of a set of proteins, can we predict which pathway class it belongs to? Such a problem is closely related to the biological function of the pathway in cells and hence is quite fundamental and essential in systems biology and proteomics. It is also an extremely difficult and challenging problem due to its complexity. To address this problem, a novel approach was developed that can be used to predict query pathways among the following six functional categories: (i) “Metabolism”, (ii) “Genetic Information Processing”, (iii) “Environmental Information Processing”, (iv) “Cellular Processes”, (v) “Organismal Systems”, and (vi) “Human Diseases”. The prediction method was established through the following procedures: (i) according to the general form of pseudo amino acid composition (PseAAC), each of the pathways concerned is formulated as a 5570-D (dimensional) vector; (ii) each component of the 5570-D vector was derived by a series of feature extractions from the pathway system according to its graphic property, biochemical and physicochemical property, as well as functional property; (iii) the minimum redundancy maximum relevance (mRMR) method was adopted to operate the prediction. A cross-validation by the jackknife test on a benchmark dataset consisting of 146 regulatory pathways indicated that an overall success rate of 78.8% was achieved by our method in identifying query pathways among the above six classes, indicating that the outcome is quite promising and encouraging. To the best of our knowledge, the current study represents the first effort in attempting to identify the type of a pathway system or its biological function. It is anticipated that our report may stimulate a series of follow-up investigations in this new and challenging area.
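The jackknife test mentioned in the abstract above is leave-one-out cross-validation: each sample is held out in turn, the predictor is built from the rest, and the success rate is the fraction of held-out samples predicted correctly. The following is a minimal sketch of that loop. The nearest-neighbor predictor is only a stand-in chosen for brevity, not the method used in the paper, and the names `jackknife_success_rate` and `nearest_neighbor` are illustrative.

```python
def jackknife_success_rate(samples, labels, predict):
    """Leave-one-out success rate.

    `predict(train_samples, train_labels, query)` is a caller-supplied
    function that returns a predicted label for `query`.
    """
    correct = 0
    for i, (query, truth) in enumerate(zip(samples, labels)):
        # Hold out sample i and train on everything else.
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(train_x, train_y, query) == truth:
            correct += 1
    return correct / len(samples)


def nearest_neighbor(train_x, train_y, query):
    """Stand-in predictor: label of the closest training vector
    (squared Euclidean distance)."""
    distances = [sum((a - b) ** 2 for a, b in zip(x, query)) for x in train_x]
    return train_y[distances.index(min(distances))]
```

Because every sample is tested exactly once and never appears in its own training set, the resulting rate does not depend on how the data happen to be split, which is one common reason for preferring the jackknife on small benchmarks such as the 146 pathways described here.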

CiteSeerX