Heteroskedasticity in Multiple Regression Analysis: What it is, How to Detect it and How to Solve it with Applications in R and SPSS

Authors: Oscar L. Olvera Astivia; Bruno D. Zumbo
Publication venue: ScholarWorks@UMass Amherst
Publication date: 25/11/2019

Within psychology and the social sciences, Ordinary Least Squares (OLS) regression is one of the most popular techniques for data analysis. To ensure that inferences from this method are appropriate, several assumptions must be satisfied, including that of constant error variance (i.e., homoskedasticity). Most of the training received by social scientists with respect to homoskedasticity is limited to graphical displays for detection and data transformations as a solution, giving little recourse if neither of these two approaches works. Borrowing from the econometrics literature, this tutorial aims to present a clear description of what heteroskedasticity is, how to measure it through statistical tests designed for it, and how to address it through the use of heteroskedastic-consistent standard errors and the wild bootstrap. A step-by-step solution to obtain these errors in SPSS is presented without the need to load additional macros or syntax. Emphasis is placed on the fact that non-constant error variance is a population-defined, model-dependent feature and that different types of heteroskedasticity can arise depending on what one is willing to assume about the data. Accessed 4,952 times on https://pareonline.net from January 11, 2019 to December 31, 2019.
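As an illustration of the heteroskedastic-consistent standard errors mentioned in the abstract, the following is a minimal sketch for a one-predictor OLS model, written in Python rather than the R or SPSS used in the tutorial. It is not the authors' code. The simulated data, the function name `slope_standard_errors`, and the choice of the HC0 and HC3 variants are assumptions made for this example only.

```python
import math
import random


def slope_standard_errors(x, y):
    """Return the OLS slope with classical, HC0 and HC3 standard errors."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dx)

    # OLS estimates for the model y = b0 + b1 * x
    b1 = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    b0 = y_bar - b1 * x_bar
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

    # Classical standard error: assumes one common error variance
    sigma2 = sum(e * e for e in resid) / (n - 2)
    classical = math.sqrt(sigma2 / sxx)

    # HC0: each squared residual stands in for its own error variance
    hc0 = math.sqrt(sum(d * d * e * e for d, e in zip(dx, resid)) / sxx ** 2)

    # HC3: rescale each squared residual by 1 / (1 - h_i)^2, h_i = leverage
    lev = [1.0 / n + d * d / sxx for d in dx]
    hc3 = math.sqrt(
        sum(d * d * e * e / (1.0 - h) ** 2 for d, e, h in zip(dx, resid, lev))
        / sxx ** 2
    )
    return {"slope": b1, "classical": classical, "hc0": hc0, "hc3": hc3}


# Hypothetical data: the error standard deviation grows with x
rng = random.Random(2019)
x = [rng.uniform(1.0, 10.0) for _ in range(200)]
y = [1.0 + 2.0 * xi + rng.gauss(0.0, 0.5 * xi) for xi in x]

se = slope_standard_errors(x, y)
print(se)
```

Because every leverage lies between 0 and 1, the HC3 estimate is never smaller than HC0 on the same data; how either compares with the classical estimate depends on the form of the heteroskedasticity.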

ScholarWorks@UMass Amherst

The relationship between statistical power and predictor distribution in multilevel logistic regression: a simulation-based approach

Authors: Anne Gadermann; Martin Guhn; Oscar L. Olvera Astivia
Publication venue: BioMed Central
Publication date: 01/05/2019

Background: Despite its popularity, issues concerning the estimation of power in multilevel logistic regression models are prevalent because of the complexity involved in its calculation (i.e., computer-simulation-based approaches). These issues are further compounded by the fact that the distribution of the predictors can play a role in the power to estimate these effects. To address both matters, we present a sample of cases documenting the influence that predictor distribution has on statistical power as well as a user-friendly, web-based application to conduct power analysis for multilevel logistic regression. Method: Computer simulations are implemented to estimate statistical power in multilevel logistic regression with varying numbers of clusters, varying cluster sample sizes, and non-normal and non-symmetrical distributions of the Level 1/2 predictors. Power curves were simulated to see in what ways non-normal/unbalanced distributions of a binary predictor and a continuous predictor affect the detection of population effect sizes for main effects, a cross-level interaction and the variance of the random effects. Results: Skewed continuous predictors and unbalanced binary ones require larger sample sizes at both levels than balanced binary predictors and normally-distributed continuous ones. In the most extreme case of imbalance (10% incidence) and skewness of a chi-square distribution with 1 degree of freedom, even 110 Level 2 units and 100 Level 1 units were not sufficient for all predictors to reach power of 80%, mostly hovering at around 50% with the exception of the skewed, continuous Level 2 predictor.
Conclusions: Given the complex interactive influence among sample sizes, effect sizes and predictor distribution characteristics, it seems unwarranted to make generic rule-of-thumb sample size recommendations for multilevel logistic regression, aside from the fact that larger sample sizes are required when the distributions of the predictors are not symmetric or balanced. The more skewed or imbalanced the predictor is, the larger the sample size requirements. To assist researchers in planning research studies, a user-friendly web application that conducts power analysis via computer simulations in the R programming language is provided. With this web application, users can conduct simulations, tailored to their study design, to estimate statistical power for multilevel logistic regression models.
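The idea of estimating power by simulation can be sketched in a deliberately simplified setting. The example below is not the authors' R application and is not multilevel: it assumes a single-level logistic model with one binary predictor and no clustering or random effects, where the Wald test of the slope reduces to the log odds ratio of a 2x2 table divided by its standard error. The function name `simulate_power`, the sample size, the effect size and the baseline probability are all hypothetical values chosen for illustration.

```python
import math
import random


def simulate_power(n, incidence, beta0, beta1, reps, rng, z_crit=1.96):
    """Estimate power to detect beta1 in logit(p) = beta0 + beta1 * x,
    where x is binary and a fixed share `incidence` of cases has x = 1."""
    n1 = max(1, round(n * incidence))  # cases with x = 1
    n0 = n - n1                        # cases with x = 0
    p0 = 1.0 / (1.0 + math.exp(-beta0))
    p1 = 1.0 / (1.0 + math.exp(-(beta0 + beta1)))

    rejections = 0
    for _ in range(reps):
        a = sum(rng.random() < p1 for _ in range(n1))  # events, x = 1
        b = n1 - a                                     # non-events, x = 1
        c = sum(rng.random() < p0 for _ in range(n0))  # events, x = 0
        d = n0 - c                                     # non-events, x = 0
        if min(a, b, c, d) == 0:
            continue  # estimate undefined; counted as a non-rejection
        log_or = math.log((a * d) / (b * c))           # estimate of beta1
        se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # its standard error
        if abs(log_or / se) > z_crit:                  # two-sided Wald test
            rejections += 1
    return rejections / reps


rng = random.Random(2019)
power_balanced = simulate_power(200, 0.50, -0.85, 0.8, 500, rng)
power_unbalanced = simulate_power(200, 0.10, -0.85, 0.8, 500, rng)
print(power_balanced, power_unbalanced)
```

Under these assumed values the balanced design (50% incidence) rejects the null more often than the unbalanced one (10% incidence), which mirrors the direction of the abstract's finding. The multilevel case additionally requires simulating cluster-level random effects and fitting a mixed model in each replication.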

Crossref

Directory of Open Access Journals

University of British Columbia: cIRcle - UBC's Information Repository