8 research outputs found

A permutation test for determining significance of clusters with applications to spatial and gene expression data

Author: Bonetti Marco
J. Manjourides
M. Pagano
P. J. Park

Hierarchical clustering is a common procedure for identifying structure in a data set, and this is frequently used for organizing genomic data. Although more advanced clustering algorithms are available, the simplicity and visual appeal of hierarchical clustering have made it ubiquitous in gene expression data analysis. Hence, even minor improvements in this framework would have significant impact. There is currently no simple and systematic way of assessing and displaying the significance of various clusters in a resulting dendrogram without making certain distributional assumptions or ignoring gene-specific variances. In this work, we introduce a permutation test based on comparing the within-cluster structure of the observed data with those of sample datasets obtained by permuting the cluster membership. We carry out this test at each node of the dendrogram using a statistic derived from the singular value decomposition of variance matrices. The p-values thus obtained provide insight into the significance of each cluster division. Given these values, one can also modify the dendrogram by combining non-significant branches. By adjusting the cut-off level of significance for branches, one can produce dendrograms with a desired level of detail for ease of interpretation. We demonstrate the usefulness of this approach by applying it to illustrative data sets.
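
The permutation idea in this abstract can be illustrated with a minimal sketch for a single split into clusters. It is not the authors' method: the abstract describes a statistic derived from the singular value decomposition of variance matrices, whereas this sketch swaps in a plain within-cluster sum of squares. The function names `within_ss` and `permutation_p_value`, the number of permutations, and the seed are all assumptions made for illustration.

```python
import random


def within_ss(points, labels):
    """Sum of squared distances from each point to its cluster centroid."""
    total = 0.0
    for lab in set(labels):
        members = [p for p, l in zip(points, labels) if l == lab]
        dim = len(members[0])
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dim))
    return total


def permutation_p_value(points, labels, n_perm=999, seed=0):
    """Estimate how often shuffled cluster membership is at least as tight.

    Stand-in statistic (assumption): within-cluster sum of squares,
    not the SVD-based statistic mentioned in the abstract.
    """
    rng = random.Random(seed)
    observed = within_ss(points, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute cluster membership, keep cluster sizes
        if within_ss(points, shuffled) <= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy data: two well-separated groups of five points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
labels = [0] * 5 + [1] * 5
p_value = permutation_p_value(points, labels)
```

A small `p_value` suggests the split is tighter than chance relabelling would produce. Applying such a test at each node of a dendrogram, as the abstract describes, would repeat this computation on the observations below that node.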

Archivio istituzionale della Ricerca - Bocconi

A permutation test for determining significance of clusters with applications to spatial and gene expression data

Author: Bonetti M.
Manjourides J.
Pagano M.
Park P.J.

Hierarchical clustering is a common procedure for identifying structure in a dataset, and this is frequently used for organizing genomic data. Although more advanced clustering algorithms are available, the simplicity and visual appeal of hierarchical clustering have made it ubiquitous in gene expression data analysis. Hence, even minor improvements in this framework would have significant impact. There is currently no simple and systematic way of assessing and displaying the significance of various clusters in a resulting dendrogram without making certain distributional assumptions or ignoring gene-specific variances. In this work, we introduce a permutation test based on comparing the within-cluster structure of the observed data with those of sample datasets obtained by permuting the cluster membership. We carry out this test at each node of the dendrogram using a statistic derived from the singular value decomposition of variance matrices. The p-values thus obtained provide insight into the significance of each cluster division. Given these values, one can also modify the dendrogram by combining non-significant branches. By adjusting the cut-off level of significance for branches, one can produce dendrograms with a desired level of detail for ease of interpretation. We demonstrate the usefulness of this approach by applying it to illustrative datasets.

Research Papers in Economics

Treatment Outcomes for Adolescents With Multidrug-Resistant Tuberculosis in Lima, Peru

Author: Furin Jennifer J.
Manjourides Justin
Milstein Meredith B.
Mitnick Carole D.
Tierney Dylan B.
Publication venue: SAGE Publications
Publication date: 01/10/2016

Treatment outcomes for adolescents with multidrug-resistant tuberculosis are rarely reported and, to date, have been poor. Among 90 adolescents from Lima, Peru, 68 (75.6%) achieved cure or completion of treatment. Unsuccessful treatment was less common in the Peru cohort than previously described in the literature.

Harvard University - DASH

Directory of Open Access Journals

Risk Adjustment for Lumbar Dysfunction: Comparison of Linear Mixed Models With and Without Inclusion of Between-Clinic Variation as a Random Effect

Author: Chui Kevin
Corkery Marie B.
Manjourides Justin
Resnik Linda J.
Wang Ying-Chih
Yen Sheng-Che
Publication venue: DigitalCommons@SHU
Publication date: 01/01/2015

Background: Valid comparison of patient outcomes of physical therapy care requires risk adjustment for patient characteristics using statistical models. Because patients are clustered within clinics, results of risk adjustment models are likely to be biased by random, unobserved between-clinic differences. Such bias could lead to inaccurate prediction and interpretation of outcomes. Purpose: The purpose of this study was to determine if including between-clinic variation as a random effect would improve the performance of a risk adjustment model for patient outcomes following physical therapy for low back dysfunction. Design: This was a secondary analysis of data from a longitudinal cohort of 147,623 patients with lumbar dysfunction receiving physical therapy in 1,470 clinics in 48 states of the United States. Methods: Three linear mixed models predicting patients' functional status (FS) at discharge, controlling for FS at intake, age, sex, number of comorbidities, surgical history, and health care payer, were developed. Models were: (1) a fixed-effect model, (2) a random-intercept model that allowed clinics to have different intercepts, and (3) a random-slope model that allowed different intercepts and slopes for each clinic. Goodness of fit, residual error, and coefficient estimates were compared across the models. Results: The random-effect model fit the data better and explained an additional 11% to 12% of the between-patient differences compared with the fixed-effect model. Effects of payer, acuity, and number of comorbidities were confounded by random clinic effects. Limitations: Models may not have included some variables associated with FS at discharge. The clinics studied may not be representative of all US physical therapy clinics. Conclusions: Risk adjustment models for functional outcome of patients with lumbar dysfunction that control for between-clinic variation performed better than a model that does not.

Crossref

Sacred Heart University: DigitalCommons@SHU

Identifying multidrug resistant tuberculosis transmission hotspots using routinely collected data

Author: Asencios L
Cohen T
Contreras C
Jave O
Jeffery Caroline
Lin H
Manjourides J
Pagano M
Santa Cruz J
Shin S
Yagui M
Publication venue: Elsevier BV
Publication date: 06/03/2012

In most countries with large drug-resistant tuberculosis epidemics, only those cases that are at highest risk of having MDRTB receive a drug sensitivity test (DST) at the time of diagnosis. Because of this prioritized testing, identification of MDRTB transmission hotspots in communities where TB cases do not receive DST is challenging, as any observed aggregation of MDRTB may reflect systematic differences in how testing is distributed in communities. We introduce a new disease mapping method, which estimates this missing information through probability-weighted locations, to identify geographic areas of increased risk of MDRTB transmission. We apply this method to routinely collected data from two districts in Lima, Peru over three consecutive years. This method identifies an area in the eastern part of Lima where previously untreated cases have increased risk of MDRTB. This may indicate an area of increased transmission of drug-resistant disease, a finding that may otherwise have been missed by routine analysis of programmatic data. The risk of MDR among retreatment cases is also highest in these probable transmission hotspots, though a high level of MDR among retreatment cases is present throughout the study area. Identifying potential multidrug-resistant tuberculosis (MDRTB) transmission hotspots may allow for targeted investigation and deployment of resources.

LSTM Online Archive

Crossref

PubMed Central

Metabolomic data presents challenges for epidemiological meta-analysis: a case study of childhood body mass index from the ECHO consortium

Author: Alshawabkeh Akram
Angel Elizabeth Esther
Busgang Stefanie A
Chu Su H
Cordero José F
Curtin Paul
Dunlop Anne L
Gilbert-Diamond Diane
Giulivi Cecilia
Hoen Anne G
Karagas Margaret R
Kelly Rachel S
Kirchner David
Lasky-Su Jessica A
Liang Donghai
Litonjua Augusto A
Manjourides Justin
McRitchie Susan
Meeker John D
Pathmasiri Wimal
Perng Wei
Prince Nicole
Schmidt Rebecca J
Tan Youran
Watkins Deborah J
Weiss Scott T
Zens Michael S
Zhu Yeyi
Publication venue: eScholarship, University of California
Publication date: 01/01/2024

Introduction: Meta-analyses across diverse independent studies provide improved confidence in results. However, within the context of metabolomic epidemiology, meta-analysis investigations are complicated by differences in study design, data acquisition, and other factors that may impact reproducibility. Objective: The objective of this study was to identify maternal blood metabolites during pregnancy (>24 gestational weeks) related to offspring body mass index (BMI) at age two years through a meta-analysis framework. Methods: We used adjusted linear regression summary statistics from three cohorts (total N = 1012 mother-child pairs) participating in the NIH Environmental influences on Child Health Outcomes (ECHO) Program. We applied a random-effects meta-analysis framework to regression results and adjusted by false discovery rate (FDR) using the Benjamini-Hochberg procedure. Results: Only 20 metabolites were detected in all three cohorts, with an additional 127 metabolites detected in two of three cohorts. Of these 147, 6 maternal metabolites were nominally associated (P < 0.05) with offspring BMI z-scores at age 2 years in a meta-analytic framework including at least two studies: arabinose (Coef_meta = 0.40 [95% CI 0.10, 0.70], P_meta = 9.7 × 10^-3), guanidinoacetate (Coef_meta = -0.28 [-0.54, -0.02], P_meta = 0.033), 3-ureidopropionate (Coef_meta = 0.22 [0.017, 0.41], P_meta = 0.033), 1-methylhistidine (Coef_meta = -0.18 [-0.33, -0.04], P_meta = 0.011), serine (Coef_meta = -0.18 [-0.36, -0.01], P_meta = 0.034), and lysine (Coef_meta = -0.16 [-0.32, -0.01], P_meta = 0.044). No associations were robust to multiple testing correction. Conclusions: Despite including three cohorts with large sample sizes (N > 100), we failed to identify significant metabolite associations after FDR correction. Our investigation demonstrates difficulties in applying epidemiological meta-analysis to clinical metabolomics, emphasizes challenges to reproducibility, and highlights the need for standardized best practices in metabolomic epidemiology.
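
The Benjamini-Hochberg adjustment named in this abstract can be sketched in a few lines. This is a generic illustration of the standard step-up procedure, not the study's code. The function name `benjamini_hochberg` and the toy p-values are assumptions, and the study's full set of 147 p-values is not given in the abstract, so it is not reproduced here.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical toy p-values, not taken from the study.
toy_p = [0.01, 0.04, 0.03, 0.20]
toy_adjusted = benjamini_hochberg(toy_p)
```

Because each raw p-value is scaled by the number of tests divided by its rank, nominally significant results can lose significance once many metabolites are tested together, which is consistent with the outcome the abstract reports.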

eScholarship - University of California

Risk Adjustment for Lumbar Dysfunction: Comparison of Linear Mixed Models With and Without Inclusion of Between-Clinic Variation as a Random Effect

Author: Kevin K. Chui
Marie B. Corkery
Justin Manjourides
Linda J. Resnik
Ying-Chih Wang
Sheng-Che Yen
Publication venue: American Physical Therapy Association (APTA)

Crossref

Methods used in the spatial analysis of tuberculosis epidemiology: a systematic review

Author: A Carter
A Crisan
A Lopez De Fede
A Nana Yakam
A Roetzer
A Verma
A Zaragoza Bastida
AC Clements
ACA Clements
AF Nassel
AG Pereira
AL Davidow
AL Rodrigues Jr
AL Rodrigues-Junior
AP Silva
AR Tuite
Archie CA Clements
Atikaimu Wubuli
B Mathema
BJ Jacob
BR Perri
BW Higgs
C Areias
C Dye
C Erazo
C Nunes
C Nunes
C Prussing
C Sasson
C Stephen
CC Leung
CM Smith
CM Smith
CM Yuen
D Acevedo-Garcia
D Gomez-Barroso
D Manley
D Nguyen
D Pfeiffer
D Roth
D Shaweno
D Shaweno
D Stucki
D Wallace
D Wartenberg
D Wartenberg
Daniel Barros de Castro
DC Wheeler
Debebe Shaweno
DL da Roza
Dorothy Yeboah-Manu
E Ge
E Musenge
E Musenge
E Nava-Aguilera
EJ Murray
ELN Maciel
EM Streicher
EM Wampande
Emma S. McBryde
ES McBryde
F Tanser
Fei Zhao
FK Ribeiro
G Aamodt
G Alvarez-Hernandez
G Chamie
G Harling
G Kolifarhood
G Theron
H Lin
H Zhou
HE Jenkins
HH Lin
I Haase
I Mokrousov
I Wanyeki
In-Chan Ng
J Kang
J Manjourides
J Obasanya
James M. Trauer
Jason T. Evans
Jennifer R. Lim
JL Zelner
JM Ross
JP Cegielski
JS Kammerer
Justin T. Denholm
K Froggatt
K Middelkoop
K Middelkoop
K Touray
Kai Cao
Kefyalew Addis Alene
Kefyalew Addis Alene
Kiyohiko Izumi
KR Fluegge
L Anselin
L Anselin
L Burgess
L Couceiro
L Egunjobi
L Shah
LA Gaudette
Lan Li
LM Jacobson
M Chan-Yeung
M Olfatifar
M Richardson
M Saavedra-Campos
M Santos Neto
M Santos Neto
M Santos-Neto
M Tipayamongkholgul
M Yamamura
MA Marlow
MA McGuigan
Malancha Karmakar
MD Lima
Mesay Hailu Dangisso
MH Dangisso
MJ Rytkönen
MK Wong
ML Feske
ML Feske
ML Pinto
MN Seraphin
MRC Tuberculosis and chest Diseases Unit
N Beyers
N Cressie
N Tiwari
N Tiwari
Neela D. Goswami
OA Uthman
OA Uthman
P Brassard
P Goovaerts
P Goovaerts
P Hino
P Schlattmann
P Sousa
Pau Dominkovics
PC Lai
PK Moonan
PM Ricks
PTT Pang
R Beiranvand
R Srinivasan
R Strauss
RM Zorzenon dos Santos
Romain Ragonnet
RP de Queiroga
RS Kirby
RV Randremanana
S Dragioevio
S Hassarangsee
S Kakchapati
S Keshavjee
SI Cadmus
Sitraka Rakotosamimanana
SK Chandrasekaran
T Burra
T Jafari-Koshki
T Kistemann
T Li
T Tadesse
TA Yates
TS Venâncio
U Gurjav
U Gurjav
V Bhatt
V Guernier
W Sun
W Wei
WHO
WR Bishai
WV Souza
WV Souza
X Yang
XX Li
XX Li
Y Liu
YP Yeh
Yunxia Liu
Z Munch
ZW Jia
Publication venue: Springer Science and Business Media LLC

Crossref