15 research outputs found

    Generating univariate and multivariate nonnormal data

    No full text
    Because the assumption of normality is common in statistics, the robustness of statistical procedures to violations of the normality assumption is often of interest. When one examines the impact of violating the normality assumption, it is important to simulate data from nonnormal distributions with varying degrees of skewness and kurtosis. Fleishman (1978, Psychometrika 43: 521–532) developed a method to simulate data from a univariate distribution with specific values of skewness and kurtosis. Vale and Maurelli (1983, Psychometrika 48: 465–471) extended Fleishman's method to simulate data from a multivariate nonnormal distribution. In this article, I briefly introduce these two methods and present two new commands, rnonnormal and rmvnonnormal, for simulating data from univariate and multivariate nonnormal distributions.
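
    Fleishman's power method maps a standard normal variate Z to Y = a + bZ + cZ^2 + dZ^3, choosing the coefficients so that Y has the target skewness and excess kurtosis (with a = -c, so the mean stays 0 and, by the first moment equation, the variance stays 1). A minimal Python sketch of the univariate case, not the commands presented in the article (the function name merely echoes the article's rnonnormal):

```python
import numpy as np
from scipy.optimize import fsolve
from scipy.stats import skew, kurtosis

def fleishman_coeffs(target_skew, target_exkurt):
    """Solve Fleishman's moment equations for b, c, d (with a = -c)."""
    def equations(p):
        b, c, d = p
        eq1 = b**2 + 6*b*d + 2*c**2 + 15*d**2 - 1            # unit variance
        eq2 = 2*c*(b**2 + 24*b*d + 105*d**2 + 2) - target_skew
        eq3 = 24*(b*d + c**2*(1 + b**2 + 28*b*d)
                  + d**2*(12 + 48*b*d + 141*c**2 + 225*d**2)) - target_exkurt
        return (eq1, eq2, eq3)
    b, c, d = fsolve(equations, (0.9, 0.1, 0.05))
    return -c, b, c, d  # a = -c keeps the mean at 0

def rnonnormal(n, target_skew, target_exkurt, rng=None):
    """Draw n values with the target skewness and excess kurtosis
    via the cubic transform Y = a + bZ + cZ^2 + dZ^3 of Z ~ N(0, 1)."""
    rng = np.random.default_rng(rng)
    a, b, c, d = fleishman_coeffs(target_skew, target_exkurt)
    z = rng.standard_normal(n)
    return a + b*z + c*z**2 + d*z**3

# Vale and Maurelli's classic illustration: skewness 0.75, excess kurtosis 0.8
x = rnonnormal(200_000, 0.75, 0.8, rng=1)
print(round(float(skew(x)), 2), round(float(kurtosis(x)), 2))
```

    The multivariate extension of Vale and Maurelli applies the same transform to correlated normals whose intermediate correlations are adjusted so the transformed variables hit the target correlation matrix.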

    Implementing a Simulation Study Using Multiple Software Packages for Structural Equation Modeling

    No full text
    A Monte Carlo simulation study is an essential tool for evaluating the behavior of various quantitative methods, including structural equation modeling (SEM), under various conditions. Typically, a large number of replications is recommended for a Monte Carlo simulation study, so automating the study is important for obtaining the desired number of replications. This article provides concrete examples of automating a Monte Carlo simulation study using several standard software packages for SEM: Mplus, LISREL, SAS PROC CALIS, and the R package lavaan. The equivalence between multilevel SEM and hierarchical linear modeling (HLM) is also discussed, with relevant examples. It is hoped that the code in this article can provide building blocks for researchers to write their own code to automate simulation procedures.
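
    Whatever the SEM package, the replication loop the article automates has the same shape: generate data from a known population model, fit the analysis model, store the estimates, and repeat. The sketch below illustrates that loop in Python, with a simple regression slope standing in for an SEM fit; it is an illustration of the loop's structure, not the Mplus/LISREL/SAS/lavaan code the article provides.

```python
import numpy as np

def run_replication(n, beta, rng):
    # Step 1: generate data under the population model y = beta*x + e
    x = rng.standard_normal(n)
    y = beta * x + rng.standard_normal(n)
    # Step 2: fit the analysis model (an OLS slope stands in for an SEM fit)
    slope = np.cov(x, y, bias=True)[0, 1] / np.var(x)
    return slope

rng = np.random.default_rng(42)
# Step 3: repeat for the desired number of replications and summarize
estimates = [run_replication(n=200, beta=0.5, rng=rng) for _ in range(1000)]
bias = float(np.mean(estimates)) - 0.5
print(f"mean estimate = {np.mean(estimates):.3f}, bias = {bias:.3f}")
```

    In practice, steps 1 and 2 are replaced by writing a data file, invoking the SEM program on a generated input script, and parsing its output, which is exactly the part the article's examples automate.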

    Power of Latent Variable Interactions for Distinguishing Ordinal and Disordinal Interactions

    No full text

    The Machine Learning-Based Dropout Early Warning System for Improving the Performance of Dropout Prediction

    No full text
    A dropout early warning system enables schools to preemptively identify students who are at risk of dropping out, to react to them promptly, and ultimately to help potential dropouts continue their learning for a better future. However, the inherent class imbalance between dropout and non-dropout students can make it difficult to build an accurate predictive model for a dropout early warning system. The present study aimed to improve the performance of a dropout early warning system (a) by addressing the class imbalance issue using the synthetic minority oversampling technique (SMOTE) and ensemble methods in machine learning and (b) by evaluating the trained classifiers with both receiver operating characteristic (ROC) and precision–recall (PR) curves. To that end, we trained a random forest, a boosted decision tree, a random forest with SMOTE, and a boosted decision tree with SMOTE on a large sample of 165,715 high school students from the National Education Information System (NEIS) in South Korea. According to our ROC and PR curve analysis, the boosted decision tree showed the best performance.
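
    SMOTE balances the classes by synthesizing new minority cases on the line segments between a minority observation and one of its nearest minority neighbors. A minimal numpy sketch of that interpolation follows; the study itself would use a full SMOTE implementation and the real NEIS features, not the toy Gaussian clusters here.

```python
import numpy as np

def smote_oversample(X_min, n_new, k=5, rng=None):
    """Minimal SMOTE: synthesize points by interpolating between a
    minority sample and one of its k nearest minority neighbors."""
    rng = np.random.default_rng(rng)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbors = np.argsort(d)[1:k + 1]  # skip the point itself
        j = rng.choice(neighbors)
        lam = rng.random()                  # random position on the segment
        synthetic.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(synthetic)

rng = np.random.default_rng(0)
X_major = rng.normal(0, 1, size=(950, 2))  # non-dropout (majority class)
X_minor = rng.normal(3, 1, size=(50, 2))   # dropout (minority class)
X_new = smote_oversample(X_minor, n_new=900, rng=1)
X_minor_bal = np.vstack([X_minor, X_new])
print(len(X_major), len(X_minor_bal))  # 950 950
```

    Because the synthetic points are convex combinations of real minority points, they stay inside the minority region rather than duplicating existing cases, which is what distinguishes SMOTE from plain random oversampling.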

    Re-Designing The Structure Of Online Courses To Empower Educational Data Mining

    No full text
    The amount of information contained in any educational data set is fundamentally constrained by the instructional conditions under which the data are collected. In this study, we show that by redesigning the structure of traditional online courses, we can improve the ability of educational data mining to provide useful information for instructors. This new design, referred to as Online Learning Modules, blends frequent learning assessment, as seen in intelligent tutoring systems, into the structure of conventional online courses, allowing learning behavior data and learning outcome data to be collected from the same learning module. By applying relatively straightforward clustering analysis to data collected from a sequence of four modules, we are able to gain insight into whether students are spending enough time studying and into the effectiveness of the instructional materials, two questions most instructors ask every day.
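
    The clustering step can be sketched with a plain k-means on simulated module logs. The two features used here (minutes studied and assessment score) and the two student profiles are hypothetical stand-ins for the paper's actual variables.

```python
import numpy as np

def kmeans(X, k, iters=50, rng=None):
    """Plain Lloyd's algorithm: assign each point to the nearest
    centroid, then recompute centroids, for a fixed number of sweeps."""
    rng = np.random.default_rng(rng)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centroids[None]) ** 2).sum(-1), axis=1)
        updated = []
        for j in range(k):
            members = X[labels == j]
            updated.append(members.mean(axis=0) if len(members) else centroids[j])
        centroids = np.array(updated)
    return labels, centroids

rng = np.random.default_rng(7)
# hypothetical module logs: [minutes studied, assessment score]
engaged = np.column_stack([rng.normal(60, 8, 100), rng.normal(85, 5, 100)])
crammers = np.column_stack([rng.normal(15, 5, 100), rng.normal(55, 8, 100)])
X = np.vstack([engaged, crammers])
labels, centroids = kmeans(X, k=2, rng=3)
print(np.round(centroids))
```

    Because behavior and outcome come from the same module, each cluster centroid directly pairs a study-time profile with a performance level, which is the insight the redesign is meant to enable.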

    The Accurate Measurement of Students’ Learning in E-Learning Environments

    No full text
    The ultimate goal of e-learning environments is to improve students' learning. To achieve that goal, it is crucial to measure students' learning accurately. In the field of educational measurement, it is well known that the key issue in the measurement of learning is placing test scores on a common metric. Despite the crucial role of a common metric in the measurement of learning, however, this issue has received little attention in e-learning studies. In this study, we propose using fixed-parameter calibration (FPC) in an item response theory (IRT) framework to establish a common metric in e-learning environments. To demonstrate FPC, we used data from the MOOC "Introduction to Psychology as a Science," offered through Coursera collaboratively by the Georgia Institute of Technology (GIT) and Carnegie Mellon University (CMU) in 2013. Our analysis showed that the students' learning gains were substantially different with and without FPC.
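
    Under FPC, item parameters calibrated on earlier data are held fixed while only the abilities are estimated, so pretest and posttest scores land on one metric and their difference is an interpretable learning gain. A minimal Rasch-model sketch of that idea (grid-search maximum-likelihood ability estimation; the item difficulties and response patterns are hypothetical, not the MOOC values):

```python
import numpy as np

def rasch_theta(responses, b, grid=np.linspace(-4, 4, 801)):
    """ML ability estimate with item difficulties b held FIXED,
    so estimates from different test occasions share one metric."""
    responses = np.asarray(responses, dtype=float)
    p = 1.0 / (1.0 + np.exp(-(grid[:, None] - b[None, :])))  # P(correct)
    loglik = (responses * np.log(p) + (1 - responses) * np.log(1 - p)).sum(axis=1)
    return float(grid[np.argmax(loglik)])

# difficulties fixed from a previous calibration (hypothetical values)
b = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
theta_pre = rasch_theta([1, 0, 0, 0, 0], b)   # pretest: 1 of 5 correct
theta_post = rasch_theta([1, 1, 1, 1, 0], b)  # posttest: 4 of 5 correct
gain = theta_post - theta_pre
print(theta_pre, theta_post, round(gain, 2))
```

    Without fixing b, each occasion would be calibrated on its own arbitrary scale and the pre/post difference would confound ability change with scale shift, which is the distortion the study quantifies.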

    Factor Analysis Reveals Student Thinking Using The Mechanics Reasoning Inventory

    No full text
    The Mechanics Reasoning Inventory (MRI) [1] is an assessment instrument specifically designed to assess strategic reasoning skills involving core concepts in introductory Newtonian mechanics. Because it assesses higher-order thinking (as opposed to declarative or rule-based procedural thinking), it is necessary to check whether the mental constructs underlying actual student responses correlate with the authors' domain classification, which is the subject of this paper. The instrument consists of three types of problems: whether momentum or energy is conserved in a given situation and why (partly inspired by the paired what/why questions in Lawson's Classroom Test of Scientific Reasoning), application of Newton's second and third laws, and decomposing problems into parts (inspired by Van Domelen's [2] Problem Decomposition Diagnostic). It has been administered 183 times in two MIT courses since 2009. Exploratory factor analysis (EFA) revealed that each Lawson-style pair of questions should be treated as one item; with that adjustment, EFA identified four factors among the 21 questions that correspond reasonably well with the intended physics topics, plus a fifth factor associated with circular motion, a difficult topic for students (even though the designers did not view it as a core principle). We discuss why six of the items were classified under factors that differed from the expert assignments. There was no strong indication that students answered all problems of a given type similarly, which would be a hallmark of novice heuristics rather than reasoning based on physical principles.
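
    The factor-extraction step can be illustrated with eigenvalues of the item correlation matrix and the Kaiser (eigenvalue greater than 1) retention criterion, which recover the number of latent factors in simulated two-factor data. This is a simplified stand-in for the EFA reported in the paper, with hypothetical loadings rather than MRI responses.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
# two hypothetical latent factors, three observed items loading on each
f1, f2 = rng.standard_normal(n), rng.standard_normal(n)
X = np.column_stack([
    0.8 * f1 + 0.6 * rng.standard_normal(n),  # items for factor 1
    0.8 * f1 + 0.6 * rng.standard_normal(n),
    0.8 * f1 + 0.6 * rng.standard_normal(n),
    0.8 * f2 + 0.6 * rng.standard_normal(n),  # items for factor 2
    0.8 * f2 + 0.6 * rng.standard_normal(n),
    0.8 * f2 + 0.6 * rng.standard_normal(n),
])
R = np.corrcoef(X, rowvar=False)              # item correlation matrix
eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]
n_factors = int(np.sum(eigvals > 1.0))        # Kaiser criterion
print(np.round(eigvals, 2), n_factors)
```

    A full EFA would follow this with rotation and an inspection of which items load on which factor, which is how the paper compares the empirical factors against the experts' topic assignments.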

    Carbon-Deposited TiO<sub>2</sub> 3D Inverse Opal Photocatalysts: Visible-Light Photocatalytic Activity and Enhanced Activity in a Viscous Solution

    No full text
    We demonstrate for the first time carbon-deposited TiO<sub>2</sub> inverse opal (C-TiO<sub>2</sub> IO) structures as highly efficient visible-light photocatalysts. The carbon deposition proceeded via high-temperature pyrolysis of phloroglucinol/formaldehyde resol, which had been coated onto the TiO<sub>2</sub> IO structures. Carbon deposition formed a carbon layer and doped the TiO<sub>2</sub> interface, which synergistically enhanced visible-light absorption. We directly measured the visible-light photocatalytic activity by constructing solar cells comprising the C-TiO<sub>2</sub> IO electrode. Photocatalytic degradation of organic dyes in solution was also evaluated. Photocatalytic dye degradation under visible light was observed only in the presence of the C-TiO<sub>2</sub> IO sample and increased with the amount of deposited carbon. The IO structures could be readily decorated with TiO<sub>2</sub> nanoparticles to increase the surface area and enhance the photocatalytic activity. Notably, the photocatalytic reaction was found to proceed in a viscous polymeric solution. A comparison of the mesoporous TiO<sub>2</sub> structure and the IO TiO<sub>2</sub> structure revealed that the latter performed better as the solution viscosity increased. This result was attributed to facile diffusion through the fully connected, low-tortuosity macropore network of the IO structure.

    Are MOOC Learning Analytics Results Trustworthy? With Fake Learners, They Might Not Be!

    No full text
    The rich data that Massive Open Online Course (MOOC) platforms collect on the behavior of millions of users provide a unique opportunity to study human learning and to develop data-driven methods that can address the needs of individual learners. This type of research falls into the emerging field of learning analytics. However, learning analytics research tends to ignore the reliability of results based on MOOC data, which are typically noisy and generated by a largely anonymous crowd of learners. This paper provides evidence that learning analytics in MOOCs can be significantly biased by users who abuse the anonymity and open nature of MOOCs, for example by setting up multiple accounts; their sheer number and aberrant behavior can distort results. We identify these users, denoted fake learners, using dedicated algorithms. Our methodology for measuring the bias caused by fake learners' activity combines the ideas of replication research and sensitivity analysis. We replicate two highly cited learning analytics studies with and without the fake learners' data and compare the results. While in one study the results were relatively stable against fake learners, in the other, removing the fake learners' data significantly changed the results. These findings raise concerns regarding the reliability of learning analytics in MOOCs and highlight the need to develop more robust, generalizable, and verifiable research methods. Keywords: learning analytics; MOOCs; replication research; sensitivity analysis; fake learners.
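
    The sensitivity-analysis idea amounts to computing the same statistic with and without the flagged accounts and comparing. A toy sketch with simulated data follows; the fake-learner pattern assumed here (very high event counts paired with near-zero grades) is an illustrative assumption, not the paper's detection algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
# genuine learners: more activity tends to go with higher grades
activity = rng.uniform(10, 100, 200)
grade = 0.8 * activity + rng.normal(0, 10, 200)
# hypothetical fake accounts: huge event counts, near-zero grades
fake_activity = rng.uniform(150, 300, 40)
fake_grade = rng.uniform(0, 5, 40)

all_activity = np.concatenate([activity, fake_activity])
all_grade = np.concatenate([grade, fake_grade])
is_fake = np.concatenate([np.zeros(200, bool), np.ones(40, bool)])

# the "learning analytics result": activity-grade correlation,
# computed with and without the flagged accounts
corr_all = float(np.corrcoef(all_activity, all_grade)[0, 1])
corr_clean = float(np.corrcoef(all_activity[~is_fake], all_grade[~is_fake])[0, 1])
print(round(corr_all, 2), round(corr_clean, 2))
```

    In this toy setup a small fraction of aberrant accounts is enough to reverse the sign of the correlation, which mirrors the paper's point that conclusions drawn from raw MOOC logs can hinge on whether fake learners were filtered out.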