Using the Student's t-test with extremely small sample sizes
Researchers occasionally have to work with an extremely small sample size, defined herein as N ≤ 5. Some methodologists have cautioned against using the t-test when the sample size is extremely small, whereas others have suggested that using the t-test is feasible in such a case. The present simulation study estimated the Type I error rate and statistical power of the one- and two-sample t-tests for normally distributed populations and for various distortions such as unequal sample sizes, unequal variances, the combination of unequal sample sizes and unequal variances, and a lognormal population distribution. Ns per group were varied between 2 and 5. Results show that the t-test provides Type I error rates close to the 5% nominal value in most of the cases, and that acceptable power (i.e., 80%) is reached only if the effect size is very large. This study also investigated the behavior of the Welch test and a rank-transformation prior to conducting the t-test (t-testR). Compared to the regular t-test, the Welch test tends to reduce statistical power and the t-testR yields false positive rates that deviate from 5%. This study further shows that a paired t-test is feasible with extremely small Ns if the within-pair correlation is high. It is concluded that there are no principal objections to using a t-test with Ns as small as 2. A final cautionary note is made on the credibility of research findings when sample sizes are small. Accessed 123,336 times on https://pareonline.net from August 06, 2013 to December 31, 2019.
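The simulation approach described in this abstract can be sketched in a few lines: draw both groups from the same normal population (so the null hypothesis is true by construction), run a two-sample t-test on each draw, and count how often the null is rejected. This is a minimal stdlib-only sketch, not the study's code; the critical value is hardcoded for n = 2 per group (df = 2).

```python
import math
import random
import statistics

T_CRIT_DF2 = 4.303  # two-sided 5% critical value for Student's t, df = 2


def type1_error_rate(n_per_group=2, n_sims=20000, seed=0):
    """Estimate the two-sample t-test's Type I error rate when both
    groups come from the same N(0, 1) population, so every rejection
    is a false positive. Valid only for n_per_group = 2, because
    T_CRIT_DF2 is the critical value for df = 2*n - 2 = 2."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        # pooled-variance two-sample t statistic (equal group sizes)
        va, vb = statistics.variance(a), statistics.variance(b)
        sp = math.sqrt((va + vb) / 2)
        t = (statistics.mean(a) - statistics.mean(b)) / (sp * math.sqrt(2 / n_per_group))
        if abs(t) > T_CRIT_DF2:
            rejections += 1
    return rejections / n_sims
```

Under normality the estimated rate should land close to the 5% nominal level even with only two observations per group, consistent with the abstract's main result.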
Book review of: Modeling Human–System Interaction: Philosophical and Methodological Considerations, With Examples By Thomas B. Sheridan
Modeling Human–System Interaction: Philosophical and Methodological Considerations, With Examples, by Thomas B. Sheridan. 2017, 192 pages, $110.00. Hoboken, NJ: John Wiley & Sons, Inc. ISBN 978-1-119-27526-8
Why person models are important for human factors science
Human factors science has always been concerned with explaining and preventing human error and accidents. In the past 100 years, the field has shifted focus from a person approach to a system approach. In this opinion article, I provide five reasons why this shift is not opportune, and why person models are important for human factors science. I argue that (1) system models lack causal specificity; (2) as technology becomes more reliable, the proportion of accidents caused by human error increases; (3) technological development leads to new forms of human error; (4) scientific advances point to stable individual characteristics as predictors of human error and safety; and (5) in complex tasks, individual differences increase with task experience. Finally, some research recommendations are provided and ethical challenges of person models are brought forward
Advancing simulation-based driver training
BioMechanical Engineering, Mechanical, Maritime and Materials Engineering
Controversy in human factors constructs and the explosive use of the NASA-TLX: A measurement perspective
Situation awareness and workload are popular constructs in human factors science. It has been hotly debated whether these constructs are scientifically credible, or whether they should merely be seen as folk models. Reflecting on the works of psychophysicist Stanley Smith Stevens and of measurement theorist David Hand, we suggest a resolution to this debate, namely that human factors constructs are situated towards the operational end of a representational–operational continuum. From an operational perspective, human factors constructs do not reflect an empirical reality, but they aim to predict. For operationalism to be successful, however, it is important to have suitable measurement procedures available. To explore how human factors constructs are measured, we focused on (mental) workload and its measurement by questionnaires and applied a culturomic analysis to investigate secular trends in word use. The results reveal an explosive use of the NASA Task Load Index (TLX). Other questionnaires, such as the Cooper Harper rating scale and the Subjective Workload Assessment Technique, show a modest increase, whereas many others appear short lived. We found no indication that the TLX is improved by iterative self-correction towards optimal validity, and we argue that usage of the NASA-TLX has become dominant through a Matthew effect. Recommendations for improving the quality of human factors research are provided.
Can ChatGPT Pass High School Exams on English Language Comprehension?
Launched in late November 2022, ChatGPT, a large language model chatbot, has garnered considerable attention. However, ongoing questions remain regarding its capabilities. In this study, ChatGPT was used to complete national high school exams in the Netherlands on the topic of English reading comprehension. In late December 2022, we submitted the exam questions through the ChatGPT web interface (GPT-3.5). According to official norms, ChatGPT achieved a mean grade of 7.3 on the Dutch scale of 1 to 10—comparable to the mean grade of all students who took the exam in the Netherlands, 6.99. However, ChatGPT occasionally required re-prompting to arrive at an explicit answer; without these nudges, the overall grade was 6.5. In March 2023, API access was made available, and a new version of ChatGPT, GPT-4, was released. We submitted the same exams to the API, and GPT-4 achieved a score of 8.3 without a need for re-prompting. Additionally, employing a bootstrapping method that incorporated randomness through ChatGPT’s ‘temperature’ parameter proved effective in self-identifying potentially incorrect answers. Finally, a re-assessment conducted with the GPT-4 model updated as of June 2023 showed no substantial change in the overall score. The present findings highlight significant opportunities but also raise concerns about the impact of ChatGPT and similar large language models on educational assessment.
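The bootstrapping idea mentioned in the abstract — sampling the same question several times at nonzero temperature and treating disagreement as a warning sign — can be sketched generically. This is a hypothetical illustration, not the study's code: `ask_model` is a placeholder for whatever API call returns a single answer, and the stub below merely mimics a model that answers inconsistently.

```python
import random
from collections import Counter


def flag_uncertain(question, ask_model, n_samples=10, threshold=0.7):
    """Ask the same question n_samples times (assumed nonzero temperature)
    and flag the majority answer as potentially incorrect when the share
    of samples agreeing with it falls below the threshold.
    `ask_model` is a placeholder, not a real API binding."""
    answers = [ask_model(question) for _ in range(n_samples)]
    top_answer, top_count = Counter(answers).most_common(1)[0]
    agreement = top_count / n_samples
    return top_answer, agreement, agreement < threshold


# toy stand-in for a sampled model: answers 'B' about 80% of the time
rng = random.Random(1)
answer, agreement, flagged = flag_uncertain(
    "Which option best matches the text?",
    lambda q: rng.choices(["B", "C"], weights=[8, 2])[0],
)
```

Low agreement across resampled answers marks the question for human review, which is the self-identification behavior the abstract reports.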
Predicting self-reported violations among novice license drivers using pre-license simulator measures
Novice drivers are overrepresented in crash statistics and there is a clear need for remedial measures. Driving simulators allow for controlled and objective measurement of behavior and might therefore be a useful tool for predicting whether someone will commit deviant driving behaviors on the roads. However, little is currently known about the relationship between driving-simulator behavior and on-road driving behavior in novice drivers. In this study, 321 drivers, who on average 3.4 years earlier had completed a pre-license driver-training program in a medium-fidelity simulator, responded to a questionnaire about their on-road driving. Zero-order correlations showed that violations and speed in the simulator were predictive of self-reported on-road violations. This relationship persisted after controlling for age, gender, mileage, and education level. Respondents with a higher number of violations, faster speed, and lower number of errors in the simulator reported completing fewer hours of on-road lessons before their first on-road driving test. The results add to the literature on the predictive validity of driving simulators, and can be used to identify at-risk drivers early in a driver-training program.
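The zero-order correlation reported above is simply a Pearson correlation between a simulator measure and a self-report measure, before any covariates are partialled out. A minimal stdlib-only sketch, using synthetic stand-in data (not the study's data) in which simulator violation scores partly drive self-reported on-road violations:

```python
import math
import random


def pearson(x, y):
    """Zero-order Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# synthetic stand-ins: on-road scores are 0.5 * simulator score plus noise
rng = random.Random(42)
sim_violations = [rng.gauss(0, 1) for _ in range(300)]
on_road = [0.5 * s + rng.gauss(0, 1) for s in sim_violations]
r = pearson(sim_violations, on_road)
```

Controlling for covariates such as age and mileage, as the study did, would then amount to checking whether the simulator measure still predicts on-road violations in a regression that includes those covariates.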