
    Adjusted Measures for Feature Selection Stability for Data Sets with Similar Features

    For data sets with similar features, for example highly correlated features, most existing stability measures behave in an undesirable way: they consider features that are almost identical but have different identifiers as different features. Existing adjusted stability measures, that is, stability measures that take the similarities between features into account, have major theoretical drawbacks. We introduce new adjusted stability measures that overcome these drawbacks. We compare them to each other and to existing stability measures based on both artificial and real sets of selected features. Based on the results, we suggest using one new stability measure that considers highly similar features as exchangeable.
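    As an illustration of the exchangeability idea, the sketch below (a minimal, assumption-laden example, not the measure proposed in the paper) adjusts a Jaccard-style stability score so that two selected features count as a match whenever their pairwise similarity exceeds a threshold:

```python
# Sketch of a similarity-adjusted, Jaccard-style stability score: features that are
# nearly identical but carry different identifiers are treated as exchangeable.
import numpy as np

def adjusted_jaccard(set_a, set_b, similarity, threshold=0.9):
    """set_a, set_b: feature indices selected in two runs.
    similarity: (p x p) matrix of pairwise feature similarities (e.g. |correlation|)."""
    set_a, set_b = list(set_a), list(set_b)
    matched_a = sum(any(similarity[i, j] >= threshold for j in set_b) for i in set_a)
    matched_b = sum(any(similarity[j, i] >= threshold for i in set_a) for j in set_b)
    union = len(set_a) + len(set_b)
    if union == 0:
        return 1.0  # convention: two empty selections are perfectly stable
    return (matched_a + matched_b) / union

# Features 0 and 1 are near-duplicates, so selecting either one should not count as instability.
sim = np.eye(4)
sim[0, 1] = sim[1, 0] = 0.95
print(adjusted_jaccard({0, 2}, {1, 2}, sim))  # 1.0, instead of the plain Jaccard 1/3
```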

    Validation of ZAP-70 methylation and its relative significance in predicting outcome in chronic lymphocytic leukemia

    ZAP-70 methylation 223 nucleotides downstream of the transcription start site (CpG+223) predicts outcome in chronic lymphocytic leukemia (CLL), but its impact relative to CD38 and ZAP-70 expression or immunoglobulin heavy chain variable region (IGHV) status is uncertain. Additionally, standardizing ZAP-70 expression analysis has been unsuccessful. CpG+223 methylation was quantitatively determined in 295 untreated CLL cases using MassARRAY. Impact on clinical outcome vs CD38 and ZAP-70 expression and IGHV status was evaluated. Cases with low methylation (0.90). Thus, ZAP-70 CpG+223 methylation represents a superior biomarker for time to treatment (TT) and overall survival (OS) that can be feasibly measured, supporting its use in risk-stratifying CLL.
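    One simple way to illustrate the kind of risk-group comparison described above is a log-rank test between low- and high-methylation groups; the sketch below uses simulated times and a hypothetical cut-off, not the study's data:

```python
# Minimal sketch, assuming simulated follow-up times: compare time to treatment
# between low- and high-methylation groups with a log-rank test (lifelines).
import numpy as np
from lifelines.statistics import logrank_test

rng = np.random.default_rng(0)
tt_low  = rng.exponential(scale=18, size=40)   # months, low CpG+223 methylation
tt_high = rng.exponential(scale=60, size=40)   # months, high CpG+223 methylation
event_low  = np.ones_like(tt_low)              # 1 = treatment started (event observed)
event_high = rng.integers(0, 2, size=40)       # some high-methylation cases remain untreated

res = logrank_test(tt_low, tt_high, event_observed_A=event_low, event_observed_B=event_high)
print(res.p_value)   # p-value for the separation of the two risk groups
```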

    Predictive value of DNA methylation patterns in AML patients treated with an azacytidine containing induction regimen

    BACKGROUND: Acute myeloid leukemia (AML) is a heterogeneous disease with a poor prognosis. Dysregulation of the epigenetic machinery is a significant contributor to disease development. Some AML patients benefit from treatment with hypomethylating agents (HMAs), but no predictive biomarkers for therapy response exist. Here, we investigated whether unbiased genome-wide assessment of pre-treatment DNA methylation profiles in AML bone marrow blasts can help to identify patients who will achieve a remission after an azacytidine-containing induction regimen. RESULTS: A total of n = 155 patients with newly diagnosed AML treated in the AMLSG 12-09 trial were randomly assigned to a screening cohort and a refinement and validation cohort. The cohorts were divided according to azacytidine-containing induction regimens and response status. Methylation status was assessed for 664,227 500-bp regions using methyl-CpG immunoprecipitation sequencing, resulting in 1755 differentially methylated regions (DMRs). Top regions were distilled and included genes such as WNT10A and GATA3. 80% of the regions identified as hits were represented on HumanMethylation450 BeadChips. Quantitative methylation analysis confirmed 90% of these regions (36 of 40 DMRs). A classifier containing 17 CpGs was trained using penalized logistic regression with fivefold cross-validation. Validation based on mass spectra generated by MALDI-TOF failed (AUC 0.59); however, discriminative ability was maintained by adding neighboring CpGs. A recomposed classifier with 12 CpGs resulted in an AUC of 0.77. When evaluated in the non-azacytidine-containing group, the AUC was 0.76. CONCLUSIONS: Our analysis evaluated the value of a whole-genome methyl-CpG screening assay for the identification of informative methylation changes. We also compared the informative content and discriminatory power of regions and single CpGs for predicting response to therapy. The relevance of the identified DMRs is supported by their association with key regulatory processes of oncogenic transformation and supports the idea that relevant DMRs are enriched at distinct loci rather than evenly distributed across the genome. Prediction of response to therapy could be established but lacked specificity for treatment with azacytidine. Our results suggest that a predictive epigenotype carries its methylation information at a complex, genome-wide level that is confined to regions rather than to single CpGs. With increasing application of combinatorial regimens, response prediction may become even more complicated.
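    The classifier-building step (penalized logistic regression tuned by fivefold cross-validation, evaluated by AUC) can be sketched as follows on synthetic stand-in data; the real analysis used the trial cohorts and the CpG regions described above, not the simulated matrix assumed here:

```python
# Sketch: L1-penalised logistic regression tuned by fivefold CV on CpG methylation
# values, then evaluated by AUC on a held-out set. Data are synthetic placeholders.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# 200 "patients" x 40 candidate CpGs; responders vs non-responders
X, y = make_classification(n_samples=200, n_features=40, n_informative=12, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

clf = LogisticRegressionCV(
    Cs=10, cv=5, penalty="l1", solver="liblinear", scoring="roc_auc", max_iter=1000
).fit(X_train, y_train)

n_selected = np.sum(clf.coef_ != 0)                       # CpGs retained by the L1 penalty
auc = roc_auc_score(y_val, clf.predict_proba(X_val)[:, 1])
print(f"{n_selected} CpGs in the classifier, validation AUC = {auc:.2f}")
```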

    Assessment and optimisation of normalisation methods for dual-colour antibody microarrays

    Background: Recent advances in antibody microarray technology have made it possible to measure the expression of hundreds of proteins simultaneously in a competitive dual-colour approach similar to dual-colour gene expression microarrays. Thus, the established normalisation methods for gene expression microarrays, e.g. loess regression, can in principle be applied to protein microarrays. However, the typical assumptions of such normalisation methods might be violated due to a bias in the selection of the proteins to be measured. Due to high costs and limited availability of high-quality antibodies, the current arrays usually focus on a high proportion of regulated targets. Housekeeping features could be used to circumvent this problem, but they are typically underrepresented on protein arrays. Therefore, it might be beneficial to select invariant features among the features already represented on available arrays for normalisation by a dedicated selection algorithm. Results: We compare the performance of several normalisation methods that have been established for dual-colour gene expression microarrays. The focus is on an invariant selection algorithm, for which effective improvements are proposed. In a simulation study, the performances of the different normalisation methods are compared with respect to their impact on the ability to correctly detect differentially expressed features. Furthermore, we apply the different normalisation methods to a pancreatic cancer data set to assess the impact on the classification power. Conclusions: The simulation study and the data application demonstrate the superior performance of the improved invariant selection algorithms in comparison to other normalisation methods, especially in situations where the assumptions of the usual global loess normalisation are violated.
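    A minimal sketch of the underlying idea, assuming a simulated two-channel array: MA-based loess normalisation in which the curve may be fitted on a pre-selected set of invariant features only (the paper's invariant selection algorithm itself is not reproduced here):

```python
# Sketch of MA-based loess normalisation for one dual-colour array, optionally
# fitting the intensity-dependent bias on a chosen set of invariant features.
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

def loess_normalise(red, green, invariant_idx=None, frac=0.4):
    """red, green: raw intensities of the two channels for one array."""
    M = np.log2(red) - np.log2(green)            # log-ratio
    A = 0.5 * (np.log2(red) + np.log2(green))    # average log-intensity
    idx = np.arange(M.size) if invariant_idx is None else np.asarray(invariant_idx)
    # fit the intensity-dependent bias on the chosen features only ...
    trend = lowess(M[idx], A[idx], frac=frac, return_sorted=False)
    # ... then subtract it from all features by interpolating over A
    order = np.argsort(A[idx])
    correction = np.interp(A, A[idx][order], trend[order])
    return M - correction

# Simulated intensities with a built-in dye bias of ~0.3 on the log2 scale
rng = np.random.default_rng(1)
green = rng.lognormal(8, 1, 500)
red = green * 2 ** (0.3 + rng.normal(0, 0.2, 500))
M_norm = loess_normalise(red, green, invariant_idx=np.arange(0, 500, 10))
print(M_norm.mean())   # close to zero once the bias has been removed
```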

    Effect of training-sample size and classification difficulty on the accuracy of genomic predictors

    Introduction: As part of the MicroArray Quality Control (MAQC)-II project, this analysis examines how the choice of univariate feature-selection methods and classification algorithms may influence the performance of genomic predictors under varying degrees of prediction difficulty, represented by three clinically relevant endpoints. Methods: We used gene-expression data from 230 breast cancers (grouped into training and independent validation sets), and we examined 40 predictors (five univariate feature-selection methods combined with eight different classifiers) for each of the three endpoints. Their classification performance was estimated on the training set by using two different resampling methods and compared with the accuracy observed in the independent validation set. Results: A ranking of the three classification problems was obtained, and the performance of 120 models was estimated and assessed on an independent validation set. The bootstrapping estimates were closer to the validation performance than were the cross-validation estimates. The required sample size for each endpoint was estimated, and both gene-level and pathway-level analyses were performed on the obtained models. Conclusions: We showed that genomic predictor accuracy is determined largely by an interplay between sample size and classification difficulty. Variations in univariate feature-selection methods and in the choice of classification algorithm have only a modest impact on predictor performance, and several statistically equally good predictors can be developed for any given classification problem.
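    The comparison of resampling-based performance estimates with an independent validation set can be sketched as below, using synthetic data and a single illustrative pipeline in place of the study's 40 predictors:

```python
# Sketch: estimate a predictor's accuracy by k-fold cross-validation and by
# out-of-bag bootstrap resampling, then compare both with a held-out validation set.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.utils import resample

X, y = make_classification(n_samples=230, n_features=500, n_informative=20, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.35, random_state=0)

# univariate feature selection + classifier, kept inside one pipeline so that
# selection is redone on every resample (avoiding selection bias)
model = make_pipeline(SelectKBest(f_classif, k=30), KNeighborsClassifier())

cv_est = cross_val_score(model, X_tr, y_tr, cv=StratifiedKFold(5, shuffle=True, random_state=0)).mean()

boot_scores = []
for b in range(50):
    idx = resample(np.arange(len(y_tr)), random_state=b)   # bootstrap sample (with replacement)
    oob = np.setdiff1d(np.arange(len(y_tr)), idx)           # out-of-bag cases
    model.fit(X_tr[idx], y_tr[idx])
    boot_scores.append(model.score(X_tr[oob], y_tr[oob]))
boot_est = np.mean(boot_scores)

val_acc = model.fit(X_tr, y_tr).score(X_val, y_val)
print(f"CV: {cv_est:.2f}  bootstrap (OOB): {boot_est:.2f}  validation: {val_acc:.2f}")
```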

    Effect of Size and Heterogeneity of Samples on Biomarker Discovery: Synthetic and Real Data Assessment

    MOTIVATION: The identification of robust lists of molecular biomarkers related to a disease is a fundamental step for early diagnosis and treatment. However, methodologies for the discovery of biomarkers using microarray data often provide results with limited overlap. These differences are attributable to 1) dataset size (few subjects relative to the number of features); 2) heterogeneity of the disease; and 3) heterogeneity of the experimental protocols and computational pipelines employed in the analysis. In this paper, we focus on the first two issues and assess, both on simulated (through an in silico regulation network model) and real clinical datasets, the consistency of candidate biomarkers provided by a number of different methods. METHODS: We extensively simulated the effect of the heterogeneity characteristic of complex diseases on different sets of microarray data. Heterogeneity was reproduced by simulating both the intrinsic variability of the population and the alteration of regulatory mechanisms. Population variability was simulated by modeling the evolution of a pool of subjects; a subset of them then underwent alterations in regulatory mechanisms so as to mimic the disease state. RESULTS: The simulated data allowed us to outline advantages and drawbacks of the different methods across multiple studies and varying numbers of samples, and to evaluate the precision of feature selection on a benchmark with known biomarkers. Although comparable classification accuracy was reached by the different methods, the use of external cross-validation loops is helpful in finding features with a higher degree of precision and stability. Application to real data confirmed these results.
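    The external cross-validation loop mentioned in the results can be sketched as follows on synthetic data: feature selection is repeated inside every outer fold, and the selection frequency across folds gives a simple stability readout (an illustrative setup, not the paper's exact pipeline):

```python
# Sketch of an external cross-validation loop: select features within each fold,
# track how often each feature is chosen, and record held-out fold accuracy.
import numpy as np
from collections import Counter
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=120, n_features=300, n_informative=15, random_state=0)

selection_counts, fold_acc = Counter(), []
for train_idx, test_idx in StratifiedKFold(5, shuffle=True, random_state=0).split(X, y):
    selector = SelectKBest(f_classif, k=20).fit(X[train_idx], y[train_idx])
    chosen = np.flatnonzero(selector.get_support())
    selection_counts.update(chosen.tolist())
    clf = LinearSVC(max_iter=5000).fit(selector.transform(X[train_idx]), y[train_idx])
    fold_acc.append(clf.score(selector.transform(X[test_idx]), y[test_idx]))

stable = [f for f, n in selection_counts.items() if n == 5]   # picked in every fold
print(f"accuracy {np.mean(fold_acc):.2f}, {len(stable)} features selected in all folds")
```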

    Community assessment to advance computational prediction of cancer drug combinations in a pharmacogenomic screen

    The effectiveness of most cancer targeted therapies is short-lived. Tumors often develop resistance that might be overcome with drug combinations. However, the number of possible combinations is vast, necessitating data-driven approaches to find optimal patient-specific treatments. Here we report AstraZeneca’s large drug combination dataset, consisting of 11,576 experiments from 910 combinations across 85 molecularly characterized cancer cell lines, and the results of a DREAM Challenge to evaluate computational strategies for predicting synergistic drug pairs and biomarkers. 160 teams participated, providing comprehensive methodological development and benchmarking. Winning methods incorporate prior knowledge of drug-target interactions. Synergy is predicted with an accuracy matching biological replicates for >60% of combinations. However, 20% of drug combinations are poorly predicted by all methods. Genomic rationales for synergy predictions are identified, including antagonism of ADAM17 inhibitors when combined with PIK3CB/D inhibition, in contrast to synergy when combined with other PI3K-pathway inhibitors in PIK3CA-mutant cells.
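    As a loose illustration of the prediction task (synthetic features, not the AstraZeneca-DREAM data), the sketch below regresses a synergy score on cell-line molecular features joined with binary drug-target annotations, the kind of prior knowledge the winning methods exploited:

```python
# Illustrative sketch only: predict a combination synergy score from cell-line
# molecular features plus drug-target annotations, scored by cross-validated R^2.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_experiments = 600
cell_features = rng.normal(size=(n_experiments, 50))           # e.g. mutations / expression
drug_targets = rng.integers(0, 2, size=(n_experiments, 20))    # targets of the two drugs
X = np.hstack([cell_features, drug_targets])
# synthetic synergy score driven by an interaction between a marker and a target
synergy = 2.0 * cell_features[:, 0] * drug_targets[:, 0] + rng.normal(0, 0.5, n_experiments)

model = RandomForestRegressor(n_estimators=200, random_state=0)
r2 = cross_val_score(model, X, synergy, cv=5, scoring="r2").mean()
print(f"cross-validated R^2 = {r2:.2f}")
```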