9 research outputs found

Predictive Power Estimation Algorithm (PPEA) - A New Algorithm to Reduce Overfitting for Genomic Biomarker Discovery

Author: A Ben-Dor
A Moreau
A Pachot
Aaron T. Smith
AC Gavin
AL Barabasi
B Ganter
C Fan
C Lu
C Sima
Craig E. Thomas
DF Ransohoff
DF Ransohoff
DH Adams
DL Mendrick
E Vittinghoff
ER Dougherty
FD Sistare
G Natsoulis
G Natsoulis
George H. Searfoss
GH John
GW Donaldson
I Guyon
I Guyon
IDJ Bross
J Liu
J Ozer
JE Peterson
JH Cai
Jiangang Liu
JW Eun
Keith Dunker
Keith M. Goldstein
L Coussens
MA Olayioye
MR Fielden
N Dessì
N Zidek
P Peduzzi
Peter Csermely
PR Bushel
R Kohavi
R Tibshirani
Robert A. Jolly
S Das
Shuyu Li
T Bo
Tao Wei
TP Ryan
TR Golub
Vladimir N. Uversky
W Luo
X Fan
X Zhang
Y Saeys
Publication venue: Public Library of Science
Publication date: 01/01/2011

Toxicogenomics promises to aid in predicting adverse effects, understanding the mechanisms of drug action or toxicity, and uncovering unexpected or secondary pharmacology. However, modeling adverse effects using high-dimensional, high-noise genomic data is prone to overfitting. Models constructed from such data sets often consist of a large number of genes with no obvious functional relevance to the biological effect the model is intended to predict, which can make the modeling results challenging to interpret. To address these issues, we developed a novel algorithm, Predictive Power Estimation Algorithm (PPEA), which estimates the predictive power of each individual transcript through an iterative two-way bootstrapping procedure. By enforcing that the sample number is larger than the transcript number in each iteration of modeling and testing, PPEA reduces the potential risk of overfitting. We show with three different case studies that: (1) PPEA can quickly derive a reliable rank order of predictive power of individual transcripts in a relatively small number of iterations, (2) the top-ranked transcripts tend to be functionally related to the phenotype they are intended to predict, (3) using only the most predictive top-ranked transcripts greatly facilitates development of multiplex assays such as qRT-PCR as biomarkers, and (4) more importantly, a small number of genes identified from the top-ranked transcripts are highly predictive of phenotype, as their expression changes distinguished adverse from nonadverse effects of compounds in completely independent tests. Thus, we believe that the PPEA model effectively addresses the overfitting problem and can be used to facilitate genomic biomarker discovery for predictive toxicology and drug responses.
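
The two-way bootstrapping idea in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' PPEA implementation: the abstract gives no details beyond resampling both samples and transcripts while keeping the transcript count below the sample count. The nearest-centroid classifier, the out-of-bag accuracy score, and the names `estimate_predictive_power` and `nearest_centroid_accuracy` are all assumptions made for illustration.

```python
import random
from statistics import mean


def nearest_centroid_accuracy(train, test, features):
    """Fit a nearest-centroid classifier on `train`; return its accuracy on `test`.

    The classifier is a stand-in chosen for brevity, not the model used in the paper.
    """
    centroids = {}
    for label in {lab for _, lab in train}:
        rows = [x for x, lab in train if lab == label]
        centroids[label] = [mean(r[f] for r in rows) for f in features]

    def predict(x):
        return min(
            centroids,
            key=lambda c: sum((x[f] - m) ** 2 for f, m in zip(features, centroids[c])),
        )

    return mean(1.0 if predict(x) == lab else 0.0 for x, lab in test)


def estimate_predictive_power(X, y, n_iter=200, seed=0):
    """Hypothetical two-way bootstrap: resample samples AND transcripts each round.

    Each transcript is scored by the mean out-of-bag accuracy of the models
    it took part in (an assumed scoring rule).
    """
    rng = random.Random(seed)
    n_samples, n_features = len(X), len(X[0])
    data = list(zip(X, y))
    scores = [[] for _ in range(n_features)]
    for _ in range(n_iter):
        # Bootstrap the samples; unseen samples form the test set.
        idx = [rng.randrange(n_samples) for _ in range(n_samples)]
        in_bag = set(idx)
        train = [data[i] for i in idx]
        test = [data[i] for i in range(n_samples) if i not in in_bag]
        if not test or len({lab for _, lab in train}) < 2:
            continue
        # Draw fewer transcripts than distinct training samples.
        k = rng.randint(1, min(n_features, len(in_bag) - 1))
        features = rng.sample(range(n_features), k)
        acc = nearest_centroid_accuracy(train, test, features)
        for f in features:
            scores[f].append(acc)
    return [mean(s) if s else 0.0 for s in scores]
```

Sorting transcripts by the returned scores gives a rank order of the kind the abstract describes; how many iterations are needed for that order to stabilize depends on the data.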

Crossref

USFSP Digital Archive

Directory of Open Access Journals

PubMed Central

Scholar Commons - University of South Florida

Recurrent Signature Patterns in HIV-1 B Clade Envelope Glycoproteins Associated with either Early or Chronic Infections

Author: A Bultmann
A Land
A Land
A Ly
A Rehm
A Trkola
AJ McMichael
AK Dhillon
Alan S. Lapedes
Allan C. DeCamp
B Chohan
B Efron
B Efron
B Gaschen
B Korber
B Korber
Barton F. Haynes
Beatrice H. Hahn
Bette Korber
BF Haynes
BF Keele
BF Keele
Brandon F. Keele
Brian Gaschen
C Rizzuto
CA Derdeyn
CD Rizzuto
Charles B. Hicks
Chunlai Jiang
CN Scanlan
Craig A. Magaret
D Boyd
DH Barouch
E Hunter
EG Cormier
EL Delwart
EL Turnbull
ES Gray
EW Fiebig
Feng Gao
FK Treurnicht
G Blot
G Pancino
G von Heijne
George M. Shaw
Georgia D. Tomaras
GH Learn
H Ellerbrok
H Li
H Xhu
Hui Li
HY Lee
J Auwerx
J Felsenstein
J Irungu
J Jiang
J Liu
J Sterjovski
JD Storey
Jeffrey A. Anderson
Jesus F. Salazar-Gonzalez
JF Salazar-Gonzalez
JF Salazar-Gonzalez
JL Kirchherr
JL Mellquist
JM Binley
JN Reitter
John A. T. Young
Joseph G. Sodroski
Joseph J. Eron
Julie M. Decker
K Katoh
K Ritola
Kelly A. Soderberg
KJ Doores
L Chen
L Kong
L Margolis
L Wu
Li-Hua Ping
LQ Zhang
M Braibant
M Coetzer
M Kearney
M Li
M Sagar
M Stone
Marcus Daniels
Martin Markowitz
Michael S. Saag
Ming Zhang
Mohammed Asmal
Mohan Krishnamoorthy
MR Abrahams
Myron S. Cohen
N Goonetilleke
N Wood
Norman L. Letvin
P Borrow
P Borrow
P Refaeilzadeh
P Yang
Paul A. Goepfert
PB Gilbert
PD Kwong
Peter B. Gilbert
Peter T. Hraber
PL Moore
PL Moore
R Kohavi
R Rong
R Rong
R Shankarappa
R Wyatt
RA McCaffrey
RD Astronomo
RE Haaland
Ronald Swanstrom
RR Bouckaert
RW Sanders
RW Sanders
S Gnanakaran
S Gnanakaran
S Rerks-Ngarm
S Salzberg
S Sato
S. Gnanakaran
SC Piller
SD Frost
Shuyi Wang
SK Wang
SM Wolinsky
SR Eddy
T Bhattacharya
T Golubchik
T Hirbod
T Murakami
T Zhou
T Zhou
T Zhu
Tanmoy Bhattacharya
TB Geijtenbeek
TF Wolfs
TG Edwards
Tongye Shen
TP Hopp
V Kalia
W Fischer
William A. Blattner
William R. Schief
X Wei
X Wu
Y Bengio
Y Furuta
Y Kliger
Y Li
Y Li
Y Li
Y Li
Y Liu
Yih-En Andrew Ban
ZL Brumme
ZL Brumme
Publication venue: Public Library of Science
Publication date: 01/01/2011

Here we have identified HIV-1 B clade Envelope (Env) amino acid signatures from early in infection that may be favored at transmission, as well as patterns of recurrent mutation in chronic infection that may reflect common pathways of immune evasion. To accomplish this, we compared thousands of sequences derived by single genome amplification from several hundred individuals who were either sampled early in infection or chronically infected. Samples were divided at the outset into hypothesis-forming and validation sets, and we used phylogenetically corrected statistical strategies to identify signatures, systematically scanning all of Env. Signatures included single amino acids, glycosylation motifs, and multi-site patterns based on functional or structural groupings of amino acids. We identified signatures near the CCR5 co-receptor-binding region, near the CD4 binding site, and in the signal peptide and cytoplasmic domain, which may influence Env expression and processing. Two signature patterns associated with transmission were particularly interesting. The first was the most statistically robust signature, located at position 12 in the signal peptide. The second was the loss of an N-linked glycosylation site at positions 413–415; the presence of this site has recently been found to be associated with escape from potent and broad neutralizing antibodies, consistent with enabling a common pathway for immune escape during chronic infection. Its recurrent loss in early infection suggests it may impact fitness at the time of transmission or during early viral expansion. The signature patterns we identified implicate Env expression levels in selection at viral transmission or in early expansion, and suggest that immune evasion patterns that recur in many individuals during chronic infection, when antibodies are present, can be selected against when the infection is being established, prior to the adaptive immune response.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Carolina Digital Repository

A predictive model for stress recognition in desk jobs

Author: A Sharma
Alicia Martinez
D Carneiro
D Carneiro
D Carneiro
DW Aha
E Garcia-Ceja
European Agency for Safety and Health at Work (EU-OSHA)
GH Dunteman
GH John
GJ Borradaile
Hugo Estrada
IH Witten
J Han
Javier Hernandez
L Breiman
M Hall
M Sohail
M Ulinskas
Miguel Gonzalez-Mendoza
NV Chawla
O Loyola
P Univaso
R Kohavi
R Quinlan
S Koldijk
SA Murphy
TA Beehr
Wendy Sanchez
Y Hernández
Yasmin Hernandez
Publication venue: Springer Science and Business Media LLC

Crossref

Discriminative histogram taxonomy features for snake species identification

Author: A Premawardhena
A Singhal
AP James
AP James
C Marquez-Vera
C Mattison
CG Atkeson
D Benbouzid
D Stevens
DA Warrell
DW Aha
E Frank
GH John
ID Longstaff
JJ Calvete
K Koonsanit
L Breiman
M Hall
M Indra Devi
M Milacic
MA Smith
MD Buhmann
MI Devi
MS Sorower
P Chanda
R Jensen
R Kohavi
R Whitaker
S Backshall
S Weidensaul
SB Kim
SB Kotsiantis
SMJWJR Firth
T Mertens
TK Ho
Y Freund
YX Meng
Publication venue: Springer Science and Business Media LLC

Crossref

Multi-layer Attribute Selection and Classification Algorithm for the Diagnosis of Cardiac Autonomic Neuropathy Based on HRV Attributes

Author: Breiman L
Dietterich TG
Dimitropoulos G, Tahrani AA, Stevens MJ
Ewing DJ, Martyn CN, Young RJ, et al.
Goldberger AL, Amaral LAN, Hausdorff JM, et al.
Huikuri HV, Linnaluoto MK, Seppä
Karmakar CK, Khandoker AH, Jelinek HF, et al.
Khandoker AH, Jelinek HF, Palaniswami M
Kohavi R, John GH
La Rovere MT, Pinna GD, Maestri R, et al.
Lake DE, Moorman JR
Oida ET, Moritani KT, Yamori Y
Peng CK, Havlin S, Stanley HE, et al.
Spallone V, Menzinger G
Stranieri A, Abawajy J, Kelarev A, et al.
Task Force of The European Society of Cardiology
Vinik AI, Erbas T, Casellini CM
Ziegler DA, Rathmann VW, Strom A, et al.
Publication venue: American Institute of Mathematical Sciences (AIMS)
Publication date: 01/01/2015

Crossref

Redocumenting APIs with crowd knowledge: a coverage analysis based on question types

Author: A van Deursen
AM Rocha
B Dagenais
C Parnin
C Treude
C Treude
D Kavaler
Damien Cassou
DW Aha
EC Campos
Fernanda Madeiral Delfim
G Petrosyan
G Sridhara
GH John
J Kim
JE Montandon
JR Landis
JR Quinlan
JR Quinlan
Klérisson V. R. Paixão
L Breiman
L Moonen
L Moreno
L Ponzanelli
L Yu
LBL Souza
LBLd Souza
LM Silva
M Hall
M Linares-Vásquez
Marcelo de Almeida Maia
MP Robillard
MP Robillard
N Friedman
N Landwehr
P Refaeilzadeh
PC Rigby
PW McBurney
Q McNemar
R Dyer
R Kohavi
RFQ Lafetá
S Henß
S le Cessie
S Subramanian
SM Nasehi
Publication venue: Springer Science and Business Media LLC

Crossref

Prediction of beta-turns at over 80% accuracy based on an ensemble of predicted secondary structures and multiple alignments

Author: A Garg
A Kirschner
AJ Shepherd
B Rost
B Rost
Ce Zheng
CM Wilmot
CM Wilmot
CT Zhang
DL Wheeler
DN Ivankov
DT Jones
EG Hutchinson
EG Hutchinson
F Birzele
G Forman
G Müller
G Pollastri
GD Rose
GH John
H Kaur
H Kaur
H Kaur
H Liu
I Witten
J Platt
J Song
JA Cuff
JS Richardson
K Chen
K Chen
K Chen
K Guruprasad
K Takano
KC Chou
KC Chou
KS Kee
L Yu
LJ McGuffin
Lukasz Kurgan
M Ouali
P Baldi
PF Fuchs
PY Chou
Q Zhang
R Kohavi
S Kim
S Montgomerie
SF Altschul
SS Keerthi
TH Pham
U Hobohm
V Vapnik
W Kabsch
X Hu
Y Wang
YD Cai
Publication venue: BMC
Publication date: 01/01/2008

Background: The β-turn is a secondary protein structure type that plays a significant role in protein folding, stability, and molecular recognition. To date, several methods for predicting β-turns from protein sequences have been developed, but they are characterized by relatively poor prediction quality. The novelty of the proposed sequence-based β-turn predictor stems from its use of window-based information extracted from four predicted three-state secondary structures, which, together with a selected set of position-specific scoring matrix (PSSM) values, serves as input to a support vector machine (SVM) predictor. Results: We show that (1) all four predicted secondary structures are useful; (2) the most useful information extracted from the predicted secondary structure includes the structure of the predicted residue, the secondary structure content in a window around the predicted residue, and features that indicate whether the predicted residue is inside a secondary structure segment; (3) the PSSM values of Asn, Asp, Gly, Ile, Leu, Met, Pro, and Val were among the top-ranked features, which corroborates recent studies. Asn, Asp, Gly, and Pro indicate potential β-turns, while the remaining four amino acids are useful for predicting non-β-turns. Empirical evaluation using three nonredundant datasets shows favorable Qtotal, Qpredicted and MCC values when compared with over a dozen modern competing methods. Our method is the first to break the 80% Qtotal barrier and achieves Qtotal = 80.9%, MCC = 0.47, and a Qpredicted higher by over 6% when compared with the second-best method. We use feature selection to reduce the dimensionality of the feature vector used as the input for the proposed prediction method. The applied feature set is smaller by 86, 62 and 37% when compared with the second and two third-best (with respect to MCC) competing methods, respectively.
Conclusion: Experiments show that the proposed method constitutes an improvement over the competing prediction methods. The proposed prediction model can better discriminate between β-turns and non-β-turns because it produces fewer false positive predictions. The prediction model and datasets are freely available at http://biomine.ece.ualberta.ca/BTNpred/BTNpred.html.
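
The three quality measures named in the abstract above can be computed from a per-residue confusion matrix. The Python sketch below assumes the conventional definitions: Qtotal as overall percent accuracy, Qpredicted as the percentage of predicted β-turn residues that are correct, and MCC as the Matthews correlation coefficient. The abstract does not spell these out, and the function name `turn_prediction_metrics` and the example counts are hypothetical.

```python
import math


def turn_prediction_metrics(tp, tn, fp, fn):
    """Return (Qtotal %, Qpredicted %, MCC) from confusion-matrix counts.

    tp/fn: beta-turn residues predicted correctly / missed
    tn/fp: non-turn residues predicted correctly / wrongly called turns
    """
    total = tp + tn + fp + fn
    q_total = 100.0 * (tp + tn) / total
    # Share of predicted turns that are real; fewer false positives raise it.
    q_predicted = 100.0 * tp / (tp + fp) if (tp + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return q_total, q_predicted, mcc


# Hypothetical counts, not taken from the paper.
q_total, q_predicted, mcc = turn_prediction_metrics(tp=50, tn=30, fp=10, fn=10)
```

Because β-turn residues are typically the minority class, Qtotal alone can look strong for a predictor that rarely predicts turns; MCC accounts for all four cells of the confusion matrix and is the more balanced of the three.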

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

Dimension reduction methods for microarray data: a review

Author: Aziz R, Verma C, Srivastava N
Brown MP, Grundy WN, Lin D, et al.
Cadenas JM, Garrido MC, Martínez R
Chandrashekar G, Sahin F
Chang TW
Cheadle C, Vawter MP, Freed WJ, et al.
Chuang LY, Yang CH, Wu KC, et al.
Dudoit S, Fridlyand J, Speed TP
Eisen MB, Brown PO
Golub TR, Slonim DK, Tamayo P, et al.
Hsu CC, Chen MC, Chen LS
Hu Q, Pan W, An S, et al.
Huang Y, Lowe HJ
Ji G, Yang Z, You W
Khan J, Wei JS, Ringner M, et al.
Kohavi R, John GH
Kong W, Vanderburg CR, Gunshin H, et al.
Leng C
Lenoir T, Giannella E
Li B, Zheng CH, Huang DS, et al.
Lin KS, Chien CF
Liu Q, Zhao Z, Li YX, et al.
Peng Y
Pinkel D, Segraves R, Sudar D, et al.
Quackenbush J
Saeys Y, Inza I, Larrañaga P
Shen Q, Mei Z, Ye BX
Somol P, Pudil P, Novovičová J, et al.
Tan Y, Shi L, Tong W, et al.
Tong DL, Mintram R
Wang L, Feng Z, Wang X, et al.
Xiang S, Nie F, Meng G, et al.
You W, Yang Z, Yuan M, et al.
Zhu S, Wang D, Yu K, et al.
Zibakhsh A, Abadeh MS
Publication venue: American Institute of Mathematical Sciences (AIMS)
Publication date: 01/01/2017

Crossref

Prediction and Prioritization of Rare Oncogenic Mutations in the Cancer Kinome Using Novel Features and Multiple Classifiers

Crossref