131,783 research outputs found
Learning deep patient representations for the teleICU
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019. Cataloged from the student-submitted PDF version of the thesis. Includes bibliographical references (pages 89-93).
This thesis presents a method of extracting deep, robust representations of teleICU clinical data using Transformer networks, inspired by recent machine learning literature in language modeling. The utility of these representations is evaluated on various outcome prediction tasks, in which they outperform linear and neural baselines. Also examined are the probability distributions of various patient characteristics across the learned patient representation space, where corresponding high-level spatial structure suggests potential for use as a similarity metric or in combination with other patient similarity metrics. Finally, the code for the models developed is publicly provided as a starting point for further research.
By Ini Oguntola. M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science.
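The thesis above builds on Transformer networks, whose core operation is scaled dot-product self-attention. As a rough illustration only (this is not the thesis code, and all names here are illustrative), a single attention layer over a sequence of clinical-event embeddings can be sketched in a few lines of NumPy:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence X of
    shape (seq_len, d_model), with projection matrices Wq, Wk, Wv."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])       # (seq_len, seq_len)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # rows sum to 1
    return weights @ V  # each position is a weighted mix of all values
```

Each output position is a convex combination of the value vectors, with weights determined by query-key similarity; stacking such layers is what lets a Transformer build sequence-level patient representations.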
A critical assessment of imbalanced class distribution problem: the case of predicting freshmen student attrition
Predicting student attrition is an intriguing yet challenging problem for any academic institution. Class-imbalanced data is common in the field of student retention, mainly because many students enroll but comparatively few drop out. Classification techniques applied to imbalanced datasets can yield deceivingly high
prediction accuracy, where the overall predictive accuracy is driven by the majority class at the expense of very poor performance on the crucial minority class. In this study, we compared different data-balancing techniques to improve predictive accuracy on the minority class while maintaining satisfactory overall classification performance. Specifically, we tested three balancing techniques (oversampling, under-sampling, and synthetic minority over-sampling, SMOTE) along with four popular classification methods (logistic regression, decision trees, neural networks, and support vector machines). We used a large, feature-rich institutional student dataset (covering the years 2005 to 2011) to assess the efficacy of both the balancing techniques and the prediction methods. The results indicated that a support vector machine combined with the SMOTE data-balancing technique achieved the best classification performance, with 90.24% overall accuracy on the 10-fold holdout sample. All three data-balancing techniques improved prediction accuracy for the minority class. Applying sensitivity analyses to the developed models, we also identified the most important variables for accurate prediction of student attrition. Application of these models has the potential to accurately identify at-risk students and help reduce student dropout rates.
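SMOTE, used in the study above, oversamples the minority class by interpolating between each chosen minority sample and one of its k nearest minority-class neighbours. A minimal NumPy sketch of the idea (not the study's implementation; the function and parameter names are illustrative):

```python
import numpy as np

def smote(X_min, n_new, k=5, rng=None):
    """Generate n_new synthetic minority samples by interpolating
    between a randomly chosen minority sample and one of its k
    nearest minority-class neighbours (the core SMOTE idea)."""
    rng = np.random.default_rng(rng)
    n = len(X_min)
    # pairwise Euclidean distances within the minority class
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)          # exclude self-matches
    nn = np.argsort(d, axis=1)[:, :k]    # k nearest neighbours per sample
    base = rng.integers(0, n, size=n_new)
    neigh = nn[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))         # interpolation factor in [0, 1)
    return X_min[base] + gap * (X_min[neigh] - X_min[base])
```

Because each synthetic point lies on the segment between two real minority samples, the technique densifies the minority region of feature space instead of merely duplicating rows, which is why it tends to help classifiers such as SVMs on imbalanced data.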
Curriculum Guidelines for Undergraduate Programs in Data Science
The Park City Math Institute (PCMI) 2016 Summer Undergraduate Faculty Program
met for the purpose of composing guidelines for undergraduate programs in Data
Science. The group consisted of 25 undergraduate faculty from a variety of
institutions in the U.S., primarily from the disciplines of mathematics,
statistics and computer science. These guidelines are meant to provide some
structure for institutions planning for or revising a major in Data Science
Supporting teachers in collaborative student modeling: a framework and an implementation
Collaborative student modeling in adaptive learning environments allows the learners
to inspect and modify their own student models. It is often considered a
collaboration between students and the system to promote learners' reflection and
to collaboratively assess the course. When adaptive learning environments are used
in the classroom, teachers act as a guide through the learning process. Thus, they
need to monitor students' interactions in order to understand and evaluate their
activities. Although the knowledge gained through this monitoring can be extremely
useful to student modeling, collaboration between teachers and the system
to achieve this goal has not been considered in the literature. In this paper we
present a framework to support teachers in this task. In order to prove the usefulness
of this framework we have implemented and evaluated it in an adaptive
web-based educational system called PDinamet.
kLog: A Language for Logical and Relational Learning with Kernels
We introduce kLog, a novel approach to statistical relational learning.
Unlike standard approaches, kLog does not represent a probability distribution
directly. It is rather a language to perform kernel-based learning on
expressive logical and relational representations. kLog allows users to specify
learning problems declaratively. It builds on simple but powerful concepts:
learning from interpretations, entity/relationship data modeling, logic
programming, and deductive databases. Access by the kernel to the rich
representation is mediated by a technique we call graphicalization: the
relational representation is first transformed into a graph --- in particular,
a grounded entity/relationship diagram. Subsequently, a choice of graph kernel
defines the feature space. kLog supports mixed numerical and symbolic data, as
well as background knowledge in the form of Prolog or Datalog programs as in
inductive logic programming systems. The kLog framework can be applied to
tackle the same range of tasks that has made statistical relational learning so
popular, including classification, regression, multitask learning, and
collective classification. We also report about empirical comparisons, showing
that kLog can be either more accurate, or much faster at the same level of
accuracy, than Tilde and Alchemy. kLog is GPLv3 licensed and is available at
http://klog.dinfo.unifi.it along with tutorials.
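The graphicalization step described above can be illustrated with a toy sketch (all names here are mine, and kLog itself operates on Prolog/Datalog interpretations rather than Python tuples): each entity and each ground relation becomes a vertex, and edges connect a relation vertex to the entities it mentions, yielding a grounded entity/relationship diagram that a graph kernel can then consume.

```python
def graphicalize(entities, relations):
    """Build a grounded entity/relationship graph as an adjacency map:
    one vertex per entity, one vertex per ground relation tuple, and
    an edge between a relation vertex and each entity it mentions."""
    graph = {e: set() for e in entities}
    for name, args in relations:
        r = (name,) + tuple(args)   # relation vertex, e.g. ('advises', 'ann', 'bob')
        graph[r] = set()
        for e in args:
            graph[r].add(e)
            graph[e].add(r)
    return graph
```

On this bipartite-style graph, a choice of graph kernel (e.g. a subtree or neighbourhood kernel) implicitly defines the feature space for learning, which is how kLog decouples relational modeling from the learning algorithm.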
Predicting time to graduation at a large enrollment American university
The time it takes a student to graduate with a university degree is influenced
by a variety of factors, such as their background, their academic performance at
university, and their integration into the social communities of the university
they attend. Different universities have different populations, student
services, instruction styles, and degree programs; however, they all collect
institutional data. This study presents data for 160,933 students attending a
large American research university. The data includes performance, enrollment,
demographics, and preparation features. Discrete time hazard models for the
time-to-graduation are presented in the context of Tinto's Theory of Drop Out.
Additionally, a novel machine learning method, gradient boosted trees, is
applied and compared to the typical maximum likelihood method. We demonstrate
that enrollment factors (such as changing a major) lead to greater increases in
model predictive performance of when a student graduates than performance
factors (such as grades) or preparation (such as high school GPA).
Comment: 28 pages, 11 figures
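Discrete-time hazard models like those above are typically fit on data expanded into "person-period" form: one row per student per term at risk, with a binary indicator that is 1 only in the term the event (graduation) occurs. A minimal sketch of that expansion (illustrative names, not the paper's code):

```python
def person_period(records, max_t):
    """Expand (student_id, time_to_event, graduated) records into
    person-period rows for a discrete-time hazard model: one row
    per student per term at risk, with event=1 only in the final
    term for students who graduate within the observation window."""
    rows = []
    for sid, t_event, graduated in records:
        horizon = min(t_event, max_t)  # censor at the observation window
        for t in range(1, horizon + 1):
            event = int(graduated and t == t_event)
            rows.append((sid, t, event))
    return rows
```

Once the data are in this form, the hazard at each term can be estimated with an ordinary binary classifier on the event column, which is what makes it straightforward to swap the usual maximum-likelihood logistic fit for gradient boosted trees, as the study does.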