54 research outputs found
Challenging the State of the Art in Post-Introductory Statistics: Preparation, Concepts, and Pedagogy
The demands for a statistically literate society are increasing, and the introductory statistics course (Stat 101) remains the primary venue for learning statistics for the majority of high school and undergraduate students. After three decades of very fruitful activity in the areas of pedagogy and assessment, but with comparatively little pressure for rethinking the content of this course, the statistics education community has recently turned its attention to the use of randomization-based methods to illustrate core concepts of statistical inference. This new focus not only presents an opportunity to address documented shortcomings in the standard Stat 101 course (for example, improving students’ reasoning about inference), but also provides an impetus for rethinking the timing of the introduction of multivariable statistical methods (for example, multiple regression and general linear models). Multivariable methods dominate modern statistical practice but are rarely seen in the introductory course. Instead, these methods have traditionally been relegated to second courses in statistics for students with a background in calculus and linear algebra. Recently, curricula have been developed to bring multivariable content to students who have only taken a Stat 101 course. However, these courses tend to focus on models and model-building as an end in itself. We have developed a preliminary version of an integrated one- to two-semester curriculum which introduces students to the core logic of statistical inference through randomization methods, and then introduces students to approaches for protecting against confounding and variability through multivariable statistical design and analysis techniques.
The course has been developed by putting primary emphasis on the development of students’ conceptual understanding in an intuitive, cyclical, active-learning pedagogy, while continuing to emphasize the overall process of statistical investigations, from asking questions and collecting data through making inferences and drawing conclusions. The curriculum successfully introduces introductory statistics students to multivariable techniques in their first or second course.
Broadening the Impact and Effectiveness of Simulation-Based Curricula for Introductory Statistics
The demands for a statistically literate society are increasing, and the introductory statistics course “Stat 101” remains the primary venue for learning statistics for the majority of high school and undergraduate students. After three decades of very fruitful activity in the areas of pedagogy and assessment, but with comparatively little pressure for rethinking the content of this course, the statistics education community has recently turned its attention to focusing on simulation-based methods, including bootstrapping and permutation tests, to illustrate core concepts of statistical inference within the context of the overall statistical investigative process. This new focus presents an opportunity to address documented shortcomings in the standard Stat 101 course (e.g., seeing the big picture; improving statistical thinking over mere knowledge of procedures).
Our group has developed and implemented one of the first cohesive curricula that (a) emphasizes the core logic of inference using simulation-based methods in an intuitive, cyclical, active-learning pedagogy, and (b) emphasizes the overall process of statistical investigations, from asking questions and collecting data through making inferences and drawing conclusions. Improved conceptual understanding and retention of inference and study design, which had been observed when using early versions of the curriculum at a single institution, are now being evaluated at dozens of institutions across the country with thousands of students using the fully integrated, stand-alone version of the curriculum. Encouraging preliminary results continue to be observed.
We are now leveraging the tremendous national momentum and excitement about the approach to greatly expand implementations of simulation-based curricula by offering workshops around the country to diverse sets of faculty and by offering numerous online support structures, including a blog, freely available applets, free instructor materials, learning objective-based instructional videos, free instructor-focused training videos, a listserv, and peer-reviewed publications covering both rationale and assessment results. Many hundreds of instructors have been directly impacted by our workshops and hundreds more through access to the free online materials. We are also in the midst of evaluating widespread transferability of the approach across diverse institutions, students, and learning environments and deepening our understanding of how students’ attitudes and conceptual understanding develop using this approach through an assessment project involving concept and attitude inventories with over 10,000 students across 200 different instructors.
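As an illustration of the kind of simulation-based inference the abstract above refers to, a two-group permutation test can be sketched as follows. This is a minimal sketch and not code from the curriculum itself: the function name `permutation_test`, its parameters, and the sample values are hypothetical and chosen only for demonstration.

```python
import random


def permutation_test(group_a, group_b, n_shuffles=10_000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Hypothetical helper for illustration; not taken from the curriculum.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = sum(group_a) / n_a - sum(group_b) / len(group_b)
    pooled = list(group_a) + list(group_b)
    n_b = len(pooled) - n_a

    extreme = 0
    for _ in range(n_shuffles):
        # Re-deal the pooled values at random, as if group labels were arbitrary
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        # Small tolerance guards against floating-point ties
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1

    # Add-one correction keeps the estimated p-value strictly above zero
    p_value = (extreme + 1) / (n_shuffles + 1)
    return observed, p_value


# Made-up example data, for demonstration only
treatment = [10, 11, 12, 13, 14]
control = [1, 2, 3, 4, 5]
diff, p = permutation_test(treatment, control)
print(f"observed difference = {diff}, estimated p-value = {p:.4f}")
```

The estimated p-value is the share of random re-assignments that produce a difference at least as large as the one observed, which is the logic of inference that such curricula aim to make visible.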
Identification of novel genetic susceptibility loci for Behçet's disease using a genome-wide association study
Introduction: Behçet's disease is a chronic systemic inflammatory disease that remains incompletely understood. Herein, we perform the first genome-wide association study in Behçet's disease.
Quantitative Evidence for the Use of Simulation and Randomization in the Introductory Statistics Course
The use of simulation and randomization in the introductory statistics course is gaining popularity, but what evidence is there that these approaches are improving students’ conceptual understanding and attitudes as we hope? In this talk I will discuss evidence from early full-length versions of such a curriculum, covering issues such as (a) items and scales showing improved conceptual performance compared to a traditional curriculum, (b) transferability of findings to different institutions, (c) retention of conceptual understanding post-course, and (d) student attitudes. Along the way I will discuss a few areas in which students in both simulation/randomization courses and the traditional course still perform poorly on standardized assessments.
QUANTITATIVE EVIDENCE FOR THE USE SIMULATION AND RANDOMIZATION IN THE INTRODUCTORY STATISTICS COURSE
Phenome-wide association study (PheWAS) in EMR-linked pediatric cohorts, genetically links PLCL1 to speech language development and IL5-IL13 to Eosinophilic Esophagitis
Objective: We report the first pediatric specific Phenome-Wide Association Study (PheWAS) using electronic medical records (EMRs). Given the early success of PheWAS in adult populations, we investigated the feasibility of this approach in pediatric cohorts in which associations between a previously known genetic variant and a wide range of clinical or physiological traits were evaluated. Although computationally intensive, this approach has potential to reveal disease mechanistic relationships between a variant and a network of phenotypes. Method: Data on 5049 samples of European ancestry were obtained from the EMRs of two large academic centers in five different genotyped cohorts. Recently, these samples have undergone whole genome imputation. After standard quality controls, removing missing data and outliers based on principal components analyses (PCA), 4268 samples were used for the PheWAS study. We scanned for associations between 2476 single-nucleotide polymorphisms (SNP) with available genotyping data from previously published GWAS studies and 539 EMR-derived phenotypes. The false discovery rate was calculated and, for any new PheWAS findings, a permutation approach (with up to 1,000,000 trials) was implemented. Results: This PheWAS found a variety of common variants (MAF > 10%) with prior GWAS associations in our pediatric cohorts including Juvenile Rheumatoid Arthritis (JRA), Asthma, Autism and Pervasive Developmental Disorder (PDD) and Type 1 Diabetes with a false discovery rate < 0.05 and power of study above 80%. 
In addition, several new PheWAS findings were identified including a cluster of association near the NDFIP1 gene for mental retardation (best SNP rs10057309, p = 4.33 × 10^-7, OR = 1.70, 95%CI = 1.38 − 2.09); association near PLCL1 gene for developmental delays and speech disorder [best SNP rs1595825, p = 1.13 × 10^-8, OR = 0.65(0.57 − 0.76)]; a cluster of associations in the IL5-IL13 region with Eosinophilic Esophagitis (EoE) [best at rs12653750, p = 3.03 × 10^-9, OR = 1.73 95%CI = (1.44 − 2.07)], previously implicated in asthma, allergy, and eosinophilia; and association of variants in GCKR and JAZF1 with allergic rhinitis in our pediatric cohorts [best SNP rs780093, p = 2.18 × 10^-5, OR = 1.39, 95%CI = (1.19 − 1.61)], previously demonstrated in metabolic disease and diabetes in adults. Conclusion: The PheWAS approach with re-mapping ICD-9 structured codes for our European-origin pediatric cohorts, as with the previous adult studies, finds many previously reported associations as well as presents the discovery of associations with potentially important clinical implications.
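The abstract above states that a false discovery rate was calculated but does not say how. One common way to control the false discovery rate across many tests is the Benjamini-Hochberg step-up procedure, sketched below purely as an assumption for illustration. The function name `benjamini_hochberg` and the example p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking which hypotheses are rejected
    under the Benjamini-Hochberg step-up procedure at level alpha.

    Hypothetical helper; the study's actual FDR method is not specified.
    """
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k whose p-value satisfies p_(k) <= (k / m) * alpha
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank

    # Reject every hypothesis ranked at or below the cutoff
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff:
            rejected[idx] = True
    return rejected


# Made-up p-values, for demonstration only
example = [0.01, 0.04, 0.03, 0.5]
print(benjamini_hochberg(example))
```

Because the procedure is step-up, a p-value that misses its own threshold can still be rejected when a larger-ranked p-value passes, which is why the cutoff is taken as the largest passing rank.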
A GWAS Study on Liver Function Test Using eMERGE Network Participants
Introduction: Liver enzyme levels and total serum bilirubin are under genetic control and in recent years genome-wide population-based association studies have identified different susceptibility loci for these traits. We conducted a genome-wide association study in European ancestry participants from the Electronic Medical Records and Genomics (eMERGE) Network dataset of patient medical records with available genotyping data in order to identify genetic contributors to variability in serum bilirubin levels and other liver function tests and to compare the effects between adult and pediatric populations. Methods: The process of whole genome imputation of eMERGE samples with standard quality control measures has been described previously. After removing missing data and outliers based on principal components (PC) analyses, 3294 samples of European ancestry were used for the GWAS study. The association between each single nucleotide polymorphism (SNP) and total serum bilirubin and other liver function tests was tested using linear regression, adjusting for age, gender, site, platform and ancestry principal components (PC). Results: Consistent with previous results, a strong association signal was detected for the UGT1A gene cluster (best SNP rs887829, beta = 0.15, p = 1.30 × 10^-118) for total serum bilirubin level. Indeed, in this region more than 176 SNPs (or indels) had p < 10^-8, spanning 150 kb on the long arm of chromosome 2 (2q37.1). In addition, we found an effect of similar magnitude in a pediatric group (p = 8.26 × 10^-47, beta = 0.17). Further imputation using sequencing data as a reference panel revealed association of other markers including known TA7 repeat indels (rs8175347) (p = 9.78 × 10^-117) and rs111741722 (p = 5.41 × 10^-119) which were in proxy (r^2 = 0.99) with rs887829. Among rare variants, two Asian subjects homozygous for coding SNP rs4148323 (G71R) were identified.
Additional known effects for total serum bilirubin were also confirmed including organic anion transporters SLCO1B1-SLCO1B3, TDRP and ZMYND8 at FDR < 0.05 with no gene-gene interaction effects. Phenome-wide association studies (PheWAS) suggest a protective effect of TA7 repeat against cerebrovascular disease in an adult cohort (OR = 0.75, p = 0.0008). Among other liver function tests, we also confirmed the previous effect of the ABO blood group locus for variation in serum alkaline phosphatase (rs579459, p = 9.44 × 10^-15). Conclusions: Taken together, our data present interesting findings with strong confirmation of previous effects by simply using the eMERGE electronic health record phenotyping. In addition, our findings indicate that, similar to the adult population, UGT1A1 is the main locus responsible for normal variation of serum bilirubin in pediatric populations.
A Longitudinal Analysis of Violence and Housing Insecurity
Violence and housing insecurity are horrible events that may be intertwined, with violence possibly forcing victims to abandon their accommodations and housing insecurity depriving people of the safety of a home or placing them in compromised circumstances. This study uses national, prospective, longitudinal data from the Journeys Home Survey to examine how violence, housing insecurity, and other characteristics in one period affect disadvantaged Australian men's and women's chances of experiencing violence and housing insecurity in subsequent periods. The study is one of the first to investigate these relationships prospectively and is unusual in considering how violence among adult men contributes to their housing insecurity. We estimate dynamic multivariate models that control for observed and time-invariant unobserved characteristics and find that men's chances of being housing secure without experiencing violence are 24-45 percent lower and women's chances are 12-20 percent lower if they experienced housing insecurity, violence or both in the previous period. Heavy drinking, marijuana use, psychological distress, and a history of childhood abuse and neglect also increase the risks of violence and housing insecurity for both genders, while the presence of children reduces these risks. Women who are bisexual or lesbian and women with homeless friends also face elevated risks of housing insecurity, while men's sexual orientation and friend networks seem less relevant.
Introduction to Statistical Investigations
This book leads students to learn about the process of conducting statistical investigations from data collection, to exploring data, to statistical inference, to drawing appropriate conclusions. The text is designed for a one-semester introductory statistics course.
It focuses on genuine research studies, active learning, and effective use of technology. Simulations and randomization tests introduce statistical inference, yielding a strong conceptual foundation that bridges students to theory-based inference approaches. Repetition allows students to see the logic and scope of inference. This implementation follows the GAISE recommendations endorsed by the American Statistical Association. This is an unbound, binder-ready version. https://digitalcollections.dordt.edu/books/1047/thumbnail.jp