Dropout Model Evaluation in MOOCs
The field of learning analytics needs to adopt a more rigorous approach for
predictive model evaluation that matches the complex practice of
model-building. In this work, we present a procedure to statistically test
hypotheses about model performance which goes beyond the state-of-the-practice
in the community to analyze both algorithms and feature extraction methods from
raw data. We apply this method to a series of algorithms and feature sets
derived from a large sample of Massive Open Online Courses (MOOCs). While a
complete comparison of all potential modeling approaches is beyond the scope of
this paper, we show that this approach reveals a large gap in dropout
prediction performance between forum-, assignment-, and clickstream-based
feature extraction methods, where the latter is significantly better than the
former two, which are in turn indistinguishable from one another. This work has
methodological implications for evaluating predictive or AI-based models of
student success, and practical implications for the design and targeting of
at-risk student models and interventions.
Subgroup Robustness Grows On Trees: An Empirical Baseline Investigation
Researchers have proposed many methods for fair and robust machine learning,
but comprehensive empirical evaluation of their subgroup robustness is lacking.
In this work, we address this gap in the context of tabular data, where
sensitive subgroups are clearly-defined, real-world fairness problems abound,
and prior works often do not compare to state-of-the-art tree-based models as
baselines. We conduct an empirical comparison of several previously-proposed
methods for fair and robust learning alongside state-of-the-art tree-based
methods and other baselines. Via experiments with more than 340,000 model
configurations on eight datasets, we show that tree-based methods have strong
subgroup robustness, even when compared to robustness- and fairness-enhancing
methods. Moreover, the best tree-based models tend to show good performance
over a range of metrics, while robust or group-fair models can show
brittleness, with significant performance differences across different metrics
for a fixed model. We also demonstrate that tree-based models show less
sensitivity to hyperparameter configurations, and are less costly to train. Our
work suggests that tree-based ensemble models make an effective baseline for
tabular data, and are a sensible default when subgroup robustness is desired.
For associated code and detailed results, see
https://github.com/jpgard/subgroup-robustness-grows-on-trees.
Comment: To appear at Neural Information Processing Systems (NeurIPS) 2022.
VisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Use
We introduce VisIT-Bench (Visual InsTruction Benchmark), a benchmark for
evaluation of instruction-following vision-language models for real-world use.
Our starting point is curating 70 'instruction families' that we envision
instruction-tuned vision-language models should be able to address. Extending
beyond evaluations like VQAv2 and COCO, tasks range from basic recognition to
game playing and creative generation. Following curation, our dataset comprises
592 test queries, each with a human-authored instruction-conditioned caption.
These descriptions surface instruction-specific factors, e.g., for an
instruction asking about the accessibility of a storefront for wheelchair
users, the instruction-conditioned caption describes ramps/potential obstacles.
These descriptions enable 1) collecting human-verified reference outputs for
each instance; and 2) automatic evaluation of candidate multimodal generations
using a text-only LLM, aligning with human judgment. We quantify quality gaps
between models and references using both human and automatic evaluations; e.g.,
the top-performing instruction-following model wins against the GPT-4 reference
in just 27% of the comparisons. VisIT-Bench is dynamic: to participate,
practitioners simply submit their model's response on the project website.
Data, code, and leaderboard are available at visit-bench.github.io
Influence of diabetes on ambulation and inflammation in men and women with symptomatic peripheral artery disease
Objective: To determine whether diabetes and sex were factors associated with ambulatory function, endothelial cell inflammation, oxidative stress, and apoptosis, and with circulating biomarkers of inflammation and antioxidant capacity in patients with peripheral artery disease (PAD) and claudication. Materials/Methods: Ambulatory function of 180 symptomatic men and women with PAD was assessed during a graded maximal treadmill test, 6-minute walk test, and 4-meter walk test. Patients were further characterized on endothelial effects of circulating factors present in the sera using a cell culture-based bioassay on primary human arterial endothelial cells, and on circulating inflammatory and vascular biomarkers. Results: Men and women with diabetes had greater prevalence (p = 0.007 and p = 0.015, respectively) of coronary artery disease (CAD) than patients without diabetes. To assure that this difference did not influence planned comparisons, the data set was stratified on CAD. Diabetic men with CAD had a lower peak walking time (PWT) during the treadmill test and a slower 4-meter gait speed compared to non-diabetic men with CAD (p < 0.05). Diabetic women with CAD had a lower PWT compared to their non-diabetic counterparts (p < 0.01). Additionally, diabetic men with CAD had higher pigment epithelium-derived factor (PEDF) (p < 0.05) than their non-diabetic counterparts, and diabetic women with CAD had higher leptin (p < 0.01) and interleukin-8 levels (p < 0.05). Conclusions: In patients with PAD, diabetic men and women with CAD had more severe claudication than their non-diabetic counterparts, as measured by shorter PWT, and the men had further ambulatory impairment manifested by slower 4-meter gait speed. Furthermore, the diabetic patients with CAD had elevations in interleukin-8, leptin, and PEDF.
Cachexia index for prognostication in surgical patients with locally advanced oesophageal or gastric cancer: multicentre cohort study
Background: Features of cancer cachexia adversely influence patient outcomes, yet few currently inform clinical decision-making. This study assessed the value of the cachexia index (CXI), a novel prognostic marker, in patients for whom neoadjuvant chemotherapy and surgery for oesophagogastric cancer is planned. Methods: Consecutive patients newly diagnosed with locally advanced (T3–4 or at least N1) oesophagogastric cancer between 1 January 2010 and 31 December 2015 were identified through the West of Scotland and South-East Scotland Cancer Networks. CXI was calculated as (L3 skeletal muscle index) × (serum albumin)/(neutrophil lymphocyte ratio). Sex-stratified cut-off values were determined based on the area under the curve (AUC), and patients were divided into groups with low or normal CXI. Primary outcomes were disease progression during neoadjuvant chemotherapy and overall survival (at least 5 years of follow-up). Results: Overall, 385 patients (72% men, median age 66 years) were treated with neoadjuvant chemotherapy for oesophageal (274) or gastric (111) cancer across the study interval. Although patients with a low CXI (men: CXI below 52 (AUC 0.707); women: CXI below 41 (AUC 0.759)) were older with more co-morbidity, disease characteristics were comparable to those in patients with a normal CXI. Rates of disease progression during neoadjuvant chemotherapy, leading to inoperability, were higher in patients with a low CXI (28 versus 12%; adjusted OR 3.07, 95% c.i. 1.67 to 5.64; P < 0.001). Low CXI was associated with worsened postoperative mortality (P = 0.019) and decreased overall survival (median 14.9 versus 56.9 months; adjusted HR 1.85, 1.42 to 2.42; P < 0.001). Conclusion: CXI is associated with disease progression, worse postoperative mortality, and overall survival, and could improve prognostication and decision-making in patients with locally advanced oesophagogastric cancer.
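The CXI formula and the sex-stratified cut-offs quoted in the abstract above can be illustrated with a minimal Python sketch. The function names, the "male"/"female" keys, and the example input values are hypothetical and chosen only for illustration. The abstract does not state measurement units, so the sketch assumes inputs are expressed in whatever units the study used when deriving its cut-offs.

```python
# Sex-stratified cut-offs reported in the abstract: a CXI below the
# value counts as "low". The dictionary keys are an assumption.
LOW_CXI_CUTOFF = {"male": 52.0, "female": 41.0}


def cachexia_index(skeletal_muscle_index: float,
                   serum_albumin: float,
                   neutrophil_lymphocyte_ratio: float) -> float:
    """CXI = (L3 skeletal muscle index) x (serum albumin) / (neutrophil lymphocyte ratio).

    Units are assumed to match those used in the study; the abstract
    does not specify them.
    """
    if neutrophil_lymphocyte_ratio <= 0:
        raise ValueError("neutrophil lymphocyte ratio must be positive")
    return skeletal_muscle_index * serum_albumin / neutrophil_lymphocyte_ratio


def cxi_group(cxi: float, sex: str) -> str:
    """Return "low" if CXI is below the sex-specific cut-off, else "normal"."""
    return "low" if cxi < LOW_CXI_CUTOFF[sex] else "normal"


# Hypothetical input values, for illustration only.
example_cxi = cachexia_index(40.0, 4.0, 2.5)
example_group = cxi_group(example_cxi, "male")
```

With these hypothetical inputs the index is 40.0 × 4.0 / 2.5 = 64.0, which is above the cut-off of 52 for men and so falls in the normal group. This is a sketch of the arithmetic only, not the study's analysis code.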