Using Random Forests to Describe Equity in Higher Education: A Critical Quantitative Analysis of Utah’s Postsecondary Pipelines
The following work examines the Random Forest (RF) algorithm as a tool for predicting student outcomes and interrogating the equity of postsecondary education pipelines. The RF model, created using longitudinal data of 41,303 students from Utah’s 2008 high school graduation cohort, is compared to logistic and linear models, which are commonly used to predict college access and success. Substantively, this work finds High School GPA to be the best predictor of postsecondary GPA, whereas commonly used ACT and AP test scores are not nearly as important. Each model identified several demographic disparities in higher education access, most significantly the effects of individual-level economic disadvantage. District- and school-level factors such as the proportion of Low Income students and the proportion of Underrepresented Racial Minority (URM) students were important and negatively associated with postsecondary success. Methodologically, the RF model was able to capture non-linearity in the predictive power of school- and district-level variables, a key finding which was undetectable using linear models. The RF algorithm outperforms logistic models in prediction of student enrollment, performs similarly to linear models in prediction of postsecondary GPA, and surpasses both in its description of non-linear variable relationships. RF provides novel interpretations of data, challenges conclusions from linear models, and has enormous potential to further the literature around equity in postsecondary pipelines.
Prediction with Dimension Reduction of Multiple Molecular Data Sources for Patient Survival
Predictive modeling from high-dimensional genomic data is often preceded by a
dimension reduction step, such as principal components analysis (PCA). However,
the application of PCA is not straightforward for multi-source data, wherein
multiple sources of 'omics data measure different but related biological
components. In this article we utilize recent advances in the dimension
reduction of multi-source data for predictive modeling. In particular, we apply
exploratory results from Joint and Individual Variation Explained (JIVE), an
extension of PCA for multi-source data, for prediction of differing response
types. We conduct simulations to illustrate the practical
advantages and interpretability of our approach. As an application example we
consider predicting survival for Glioblastoma Multiforme (GBM) patients from
three data sources measuring mRNA expression, miRNA expression, and DNA
methylation. We also introduce a method to estimate JIVE scores for new samples
that were not used in the initial dimension reduction, and study its
theoretical properties; this method is implemented in the R package R.JIVE on
CRAN, in the function 'jive.predict'.
Comment: 11 pages, 9 figures
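The idea of scoring new samples that were not part of the initial dimension reduction can be illustrated with the simpler single-source PCA case. The abstract does not describe how 'jive.predict' works, so the sketch below is not the JIVE method; it only shows the plain PCA analogue, assuming the training means and orthonormal loadings are already available. The function name `project_new_samples` and all numeric values are hypothetical.

```python
from math import sqrt


def project_new_samples(rows, means, loadings):
    """Score new samples against an existing PCA fit.

    rows:     new samples, each a list of p feature values
    means:    per-feature means from the ORIGINAL training data (length p)
    loadings: list of component vectors, each of length p, assumed orthonormal
    Returns one list of component scores per new sample.
    """
    scores = []
    for row in rows:
        # Center with the training means, not the new sample's own mean.
        centered = [x - m for x, m in zip(row, means)]
        # Score on each component is the dot product with its loading vector.
        scores.append(
            [sum(c * w for c, w in zip(centered, comp)) for comp in loadings]
        )
    return scores


# Hypothetical two-feature fit: training means and two orthonormal components.
train_means = [1.0, 2.0]
train_loadings = [
    [1 / sqrt(2), 1 / sqrt(2)],
    [1 / sqrt(2), -1 / sqrt(2)],
]

# A sample that was not used to compute the loadings.
new_scores = project_new_samples([[2.0, 3.0]], train_means, train_loadings)
```

The resulting scores could then feed a downstream predictive model in place of the raw features, which is the general workflow the abstract describes.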
USMLE Scores Do Not Predict the Clinical Performance of Emergency Medicine Residents
Background: Scores on “high-stakes” multiple-choice exams such as the United States Medical Licensing Examination® (USMLE) are important screening and applicant ranking criteria used by residencies. Objective: We tested the hypothesis that USMLE scores do not predict overall clinical performance of emergency medicine (EM) residents. Methods: All graduates from our University-based EM residency between the years 2008 and 2015 were included. Residents who had incomplete USMLE records, were terminated, transferred out of the program, or did not graduate within this timeframe were excluded from the analysis. Clinical performance was defined as a gestalt of the residency program’s leadership and was classified into three sets: top, average, and lowest clinical performer. Dissimilarities of the initial blind rankings were adjudicated during a consensus conference. Results: During the eight years of the study period, there were a total of 115 graduating residents: 73 men (63%) and 42 women. Nearly all of them (109; 95%) had allopathic medical degrees; the remainder had osteopathic degrees. There was not a statistically significant correlation between our ranking of clinical performance and the Step 2 Clinical Knowledge score. There was a non-significant correlation between clinical performance and the Step 1 score. Conclusion: Neither USMLE Step 1 nor Step 2 Clinical Knowledge was a good predictor of the actual clinical performance of residents during their training. We feel that these scores are overemphasized in the resident selection process.
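A correlation between a numeric exam score and a three-level performance category can be computed on ranks. The abstract does not say which statistic the authors used, so the choice of Spearman's rank correlation here is an assumption, and the data below are made-up toy values, not results from the study.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation applied to tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: exam scores and performance tiers
# (1 = lowest, 2 = average, 3 = top clinical performer).
step_scores = [210, 225, 231, 240, 248, 252]
performance = [2, 1, 3, 2, 1, 3]

rho = spearman(step_scores, performance)
```

Tie-averaged ranks matter here because a three-level outcome necessarily produces many ties; a significance test for the coefficient would be a separate step.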
- …