
    Six solutions for more reliable infant research

    Infant research is often underpowered, undermining the robustness and replicability of our findings. Improving the reliability of infant studies offers a solution for increasing statistical power independent of sample size. Here, we discuss two senses of the term reliability in the context of infant research: reliable (large) effects and reliable measures. We examine the circumstances under which effects are strongest and measures are most reliable and use synthetic datasets to illustrate the relationship between effect size, measurement reliability, and statistical power. We then present six concrete solutions for more reliable infant research: (a) routinely estimating and reporting the effect size and measurement reliability of infant tasks, (b) selecting the best measurement tool, (c) developing better infant paradigms, (d) collecting more data points per infant, (e) excluding unreliable data from the analysis, and (f) conducting more sophisticated data analyses. Deeper consideration of measurement in infant research will improve our ability to study infant development.
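
    The chain the abstract describes, from per-trial reliability to aggregate reliability (Spearman-Brown), from reliability to attenuated observed effect size, and from effect size to statistical power, can be sketched numerically. The snippet below is illustrative only: the reliability values, the true effect of r = .4, and the sample of 30 infants are assumptions, and power is approximated with Fisher's z rather than taken from the paper's synthetic datasets.

        import numpy as np
        from scipy import stats

        def spearman_brown(r_single_trial, n_trials):
            """Reliability of an n-trial average, given single-trial reliability."""
            return n_trials * r_single_trial / (1 + (n_trials - 1) * r_single_trial)

        def observed_r(true_r, rel_x, rel_y):
            """Attenuation: the observed correlation shrinks with unreliable measures."""
            return true_r * np.sqrt(rel_x * rel_y)

        def power_correlation(r, n, alpha=0.05):
            """Approximate power of a two-sided test of r = 0 via Fisher's z."""
            mu = np.arctanh(r) * np.sqrt(n - 3)
            z_crit = stats.norm.ppf(1 - alpha / 2)
            return stats.norm.sf(z_crit - mu) + stats.norm.cdf(-z_crit - mu)

        # Assumed numbers: true effect r = .4, single-trial reliability .2,
        # predictor reliability .8, 30 infants.
        for n_trials in (4, 8, 16, 32):
            rel_y = spearman_brown(0.2, n_trials)
            r_obs = observed_r(0.4, rel_x=0.8, rel_y=rel_y)
            print(n_trials, round(rel_y, 2), round(r_obs, 2),
                  round(power_correlation(r_obs, n=30), 2))

    Under these assumptions, collecting more trials per infant (solution d) raises reliability and, through attenuation, the observed effect size and power, without adding a single participant.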

    Null hypothesis significance testing: a short tutorial

    Although thoroughly criticized, null hypothesis significance testing (NHST) remains the statistical method of choice for providing evidence of an effect in the biological, biomedical, and social sciences. In this short tutorial, I first summarize the concepts behind the method, distinguishing tests of significance (Fisher) from tests of acceptance (Neyman-Pearson), and point to common interpretation errors regarding the p-value. I then present the related concept of confidence intervals and again point to common interpretation errors. Finally, I discuss what should be reported in which context. The goal is to clarify these concepts in order to avoid interpretation errors and to propose reporting practices.
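
    To make the distinction concrete, a minimal sketch of the two quantities discussed, the p-value from a significance test and a confidence interval for the same effect, is given below; the data are simulated and the pooled-variance t test is just one common choice, not something prescribed by the tutorial.

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(1)
        a = rng.normal(loc=0.0, scale=1.0, size=40)   # simulated control group
        b = rng.normal(loc=0.5, scale=1.0, size=40)   # simulated treatment group

        # NHST: p-value from a pooled-variance two-sample t test.
        t_stat, p_value = stats.ttest_ind(b, a, equal_var=True)

        # 95% confidence interval for the mean difference, from the same model.
        diff = b.mean() - a.mean()
        df = len(a) + len(b) - 2
        sp2 = ((len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1)) / df
        se = np.sqrt(sp2 * (1 / len(a) + 1 / len(b)))
        t_crit = stats.t.ppf(0.975, df)
        ci = (diff - t_crit * se, diff + t_crit * se)

        print(f"p = {p_value:.3f}; 95% CI for the difference: [{ci[0]:.2f}, {ci[1]:.2f}]")
        # Neither number is the probability that the null hypothesis is true;
        # both describe the long-run behaviour of the procedure, which is
        # exactly the kind of interpretation error the tutorial warns against.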

    [Comment] Redefine statistical significance

    The lack of reproducibility of scientific studies has caused growing concern over the credibility of claims of new discoveries based on “statistically significant” findings. There has been much progress toward documenting and addressing several causes of this lack of reproducibility (e.g., multiple testing, P-hacking, publication bias, and under-powered studies). However, we believe that a leading cause of non-reproducibility has not yet been adequately addressed: statistical standards of evidence for claiming discoveries in many fields of science are simply too low. Associating “statistically significant” findings with P < 0.05 results in a high rate of false positives even in the absence of other experimental, procedural, and reporting problems. For fields where the threshold for defining statistical significance is P < 0.05, we propose a change to P < 0.005. This simple step would immediately improve the reproducibility of scientific research in many fields. Results that would currently be called “significant” but do not meet the new threshold should instead be called “suggestive.” While statisticians have long known the relative weakness of using P ≈ 0.05 as a threshold for discovery, and the proposal to lower it to 0.005 is not new (1, 2), a critical mass of researchers now endorse this change. We restrict our recommendation to claims of discovery of new effects. We do not address the appropriate threshold for confirmatory or contradictory replications of existing claims. We also do not advocate changes to discovery thresholds in fields that have already adopted more stringent standards (e.g., genomics and high-energy physics research; see Potential Objections below). We also restrict our recommendation to studies that conduct null hypothesis significance tests. We have diverse views about how best to improve reproducibility, and many of us believe that other ways of summarizing the data, such as Bayes factors or other posterior summaries based on clearly articulated model assumptions, are preferable to P-values. However, changing the P-value threshold is simple and might quickly achieve broad acceptance.
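
    The quantitative core of the argument, that P < 0.05 admits a high rate of false positives, can be illustrated with the standard false-positive-risk calculation shown below; the prior probability of a true effect (10%) and the average power (80%) are assumed values chosen for illustration, not figures taken from the comment itself.

        def false_positive_rate(alpha, power, prior_true):
            """Share of 'significant' findings that are false positives, given the
            significance threshold, average power, and the prior probability that
            a tested effect is real."""
            false_pos = alpha * (1 - prior_true)
            true_pos = power * prior_true
            return false_pos / (false_pos + true_pos)

        # Assumed scenario: 1 in 10 tested effects is real, power is 0.8.
        for alpha in (0.05, 0.005):
            fpr = false_positive_rate(alpha, power=0.8, prior_true=0.1)
            print(f"alpha = {alpha}: ~{fpr:.0%} of 'significant' results are false positives")

    Under these assumptions the false-positive share drops from roughly a third at P < 0.05 to about 5% at P < 0.005, which is the kind of improvement the authors have in mind.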

    Quality-of-life assessment in dementia: the use of DEMQOL and DEMQOL-Proxy total scores

    Purpose There is a need to determine whether health-related quality-of-life (HRQL) assessments in dementia capture what is important, to form a coherent basis for guiding research and clinical and policy decisions. This study investigated the structural validity of HRQL assessments made using the DEMQOL system, with particular interest in studying domains that might be central to HRQL, and the external validity of these HRQL measurements. Methods HRQL of people with dementia was evaluated by 868 self-reports (DEMQOL) and 909 proxy reports (DEMQOL-Proxy) at a community memory service. Exploratory and confirmatory factor analyses (EFA and CFA) were conducted using bifactor models to investigate domains that might be central to general HRQL. Reliability of the general and specific factors measured by the bifactor models was examined using omega (ω) and omega hierarchical (ω_h) coefficients. Multiple-indicators multiple-causes models were used to explore the external validity of these HRQL measurements in terms of their associations with other clinical assessments. Results Bifactor models showed adequate goodness of fit, supporting HRQL in dementia as a general construct that underlies a diverse range of health indicators. At the same time, additional factors were necessary to explain residual covariation of items within specific health domains identified from the literature. Based on these models, DEMQOL and DEMQOL-Proxy overall total scores showed excellent reliability (ω_h > 0.8). After accounting for common variance due to a general factor, subscale scores were less reliable (ω_h < 0.7) for informing on individual differences in specific HRQL domains. Depression was more strongly associated with general HRQL based on DEMQOL than on DEMQOL-Proxy (−0.55 vs. −0.22). Cognitive impairment had no reliable association with general HRQL based on DEMQOL or DEMQOL-Proxy. Conclusions The tenability of a bifactor model of HRQL in dementia suggests that it is possible to retain theoretical focus on the assessment of a general phenomenon, while exploring variation in specific HRQL domains for insights into what may lie at the ‘heart’ of HRQL for people with dementia. These data suggest that DEMQOL and DEMQOL-Proxy total scores are likely to be accurate measures of individual differences in HRQL, but that subscale scores should not be used. No specific domain was solely responsible for general HRQL at dementia diagnosis. Better HRQL was moderately associated with fewer depressive symptoms, but this was less apparent based on informant reports. HRQL was not associated with severity of cognitive impairment.
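
    The reliability coefficients referred to above are simple functions of bifactor model loadings: omega total counts variance due to the general and specific factors, while omega hierarchical counts only the general factor. The sketch below uses invented loadings for a six-item toy example, not the actual DEMQOL estimates.

        import numpy as np

        def omega_coefficients(general_loadings, group_loadings, residual_vars):
            """Omega total and omega hierarchical for a bifactor model.
            group_loadings is a list of loading arrays, one per specific factor."""
            gen = np.sum(general_loadings) ** 2
            spec = sum(np.sum(lam) ** 2 for lam in group_loadings)
            total_var = gen + spec + np.sum(residual_vars)
            omega_total = (gen + spec) / total_var
            omega_h = gen / total_var      # general factor only
            return omega_total, omega_h

        # Invented standardized loadings: six items, two specific factors.
        general = np.array([0.6, 0.7, 0.5, 0.6, 0.7, 0.6])
        groups = [np.array([0.3, 0.4, 0.3]), np.array([0.2, 0.3, 0.4])]
        residuals = 1 - general**2 - np.concatenate(groups)**2

        print(omega_coefficients(general, groups, residuals))

    A high ω_h for the total score, as reported for DEMQOL and DEMQOL-Proxy, means that most of the reliable variance in the summed items reflects the general factor, which is why the subscale scores carry little unique reliable information once that factor is accounted for.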

    The performance of robust test statistics with categorical data

    This paper reports on a simulation study that evaluated the performance of five structural equation model test statistics appropriate for categorical data. Both Type I error rate and power were investigated. Different model sizes, sample sizes, numbers of categories, and threshold distributions were considered. Statistics associated with both the diagonally weighted least squares (cat-DWLS) estimator and with the unweighted least squares (cat-ULS) estimator were studied. Recent research suggests that cat-ULS parameter estimates and robust standard errors slightly outperform cat-DWLS estimates and robust standard errors (Forero, Maydeu-Olivares, & Gallardo-Pujol, 2009). The findings of the present research suggest that the mean- and variance-adjusted test statistic associated with the cat-ULS estimator performs best overall. A new version of this statistic now exists that does not require a degrees-of-freedom adjustment (Asparouhov & Muthén, 2010), and this statistic is recommended. Overall, the cat-ULS estimator is recommended over cat-DWLS, particularly in small to medium sample sizes.
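
    The two performance criteria in such simulation studies, empirical Type I error rate and power, are simply rejection rates across replications. The sketch below illustrates that bookkeeping with ordinary central and noncentral chi-square draws standing in for the fitted test statistics, since reproducing cat-ULS or cat-DWLS itself would require a full SEM estimation routine.

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(7)

        def rejection_rate(stat_values, df, alpha=0.05):
            """Fraction of replications whose statistic exceeds the chi-square
            critical value: Type I error under H0, power under misspecification."""
            crit = stats.chi2.ppf(1 - alpha, df)
            return np.mean(np.asarray(stat_values) > crit)

        df, n_reps = 10, 20_000

        # A well-calibrated test statistic is chi-square(df) under a true model ...
        null_stats = rng.chisquare(df, size=n_reps)
        print("empirical Type I error:", rejection_rate(null_stats, df))   # close to 0.05

        # ... and shifted (noncentral) under a misspecified model.
        alt_stats = rng.noncentral_chisquare(df, nonc=15.0, size=n_reps)
        print("empirical power:", rejection_rate(alt_stats, df))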

    On the Asymptotic Relative Efficiency of Planned Missingness Designs

    In planned missingness (PM) designs, certain data are set a priori to be missing. PM designs can increase validity and reduce cost; however, little is known about the loss of efficiency that accompanies these designs. The present paper compares PM designs to reduced sample (RN) designs that have the same total number of data points concentrated in fewer participants. In 4 studies, we consider models for both observed and latent variables, designs that do or do not include an "X set" of variables with complete data, and a full range of between- and within-set correlation values. All results are obtained using asymptotic relative efficiency formulas, and thus no data are generated; this novel approach allows us to examine whether PM designs have theoretical advantages over RN designs removing the impact of sampling error. Our primary findings are that (a) in manifest variable regression models, estimates of regression coefficients have much lower relative efficiency in PM designs as compared to RN designs, (b) relative efficiency of factor correlation or latent regression coefficient estimates is maximized when the indicators of each latent variable come from different sets, and (c) the addition of an X set improves efficiency in manifest variable regression models only for the parameters that directly involve the X-set variables, but it substantially improves efficiency of most parameters in latent variable models. We conclude that PM designs can be beneficial when the model of interest is a latent variable model; recommendations are made for how to optimize such a design.
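
    Relative efficiency here is a ratio of sampling variances for the same parameter under two designs. The Monte Carlo sketch below illustrates the idea for a single mean, comparing a PM design (all participants measured on an auxiliary x, half planned-missing on y, estimated with the regression-based ML estimator) against a reduced-sample design with complete data on half as many participants; the sample size, the correlation of .6, and the choice of estimator are illustrative assumptions, not the paper's closed-form asymptotic results.

        import numpy as np

        rng = np.random.default_rng(0)
        N, rho, n_reps = 200, 0.6, 5_000
        cov = [[1.0, rho], [rho, 1.0]]

        pm_est, rn_est = [], []
        for _ in range(n_reps):
            x, y = rng.multivariate_normal([0.0, 0.0], cov, size=N).T

            # PM design: y is planned-missing for a random half; x is observed for all.
            obs = rng.permutation(N)[: N // 2]
            beta = np.polyfit(x[obs], y[obs], 1)[0]          # slope from complete cases
            pm_est.append(y[obs].mean() + beta * (x.mean() - x[obs].mean()))

            # Reduced-sample design: half as many participants, complete data.
            rn_est.append(y[: N // 2].mean())

        rel_eff = np.var(rn_est, ddof=1) / np.var(pm_est, ddof=1)
        print(f"relative efficiency of the PM design for the mean of y: {rel_eff:.2f}")

    Values above 1 indicate that borrowing strength from the fully observed variable makes the PM design more precise for this parameter than simply dropping participants, mirroring the paper's point that the payoff of a PM design depends on how the observed and missing variables are related.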