Scholarly insight Spring 2018: a Data wrangler perspective

Author: Calder Kathleen
Clow Doug
Coughlan Tim
Cross Simon
Edwards Chris
Evans Gerald
Gaved Mark
Herodotou Christothea
Hidalgo Rafael
Jones Edwina
Lay Stephanie
Lowe Sue
Mangafa Chrysoula
Rienties Bart
Ullmann Thomas
Publication venue: Open University UK
Publication date: 01/09/2018

In the movie classic Back to the Future, a young Michael J. Fox is able to explore the past by means of a time machine developed by the slightly bizarre but exquisite Dr Brown. Unexpectedly, a small intervention during Fox’s adventures changed the course of history a little. In this fourth Scholarly Insight Report we have explored two innovative approaches to learning from OU data of the past, which we hope will in future make a large difference in how we support our students and design and implement our teaching and learning practices. In Chapter 1, we provide an in-depth analysis of 50 thousand comments expressed by students through the Student Experience on a Module (SEAM) questionnaire. By analysing over 2.5 million words using big data approaches, our Scholarly insights indicate that not all student voices are heard. Furthermore, our big data analysis indicates potentially useful insights into how student voices change over time, and for which particular modules emergent themes might arise. In Chapter 2, we present our second innovative approach: a proof-of-concept of qualification pathways using graph approaches. By exploring existing data of one qualification (i.e., Psychology), we show that students make a range of pathway choices during their qualification, some of which are more successful than others. As highlighted in our previous Scholarly Insight Reports, getting data from a qualification perspective within the OU is a difficult and challenging process, and the proof-of-concept provided in Chapter 2 might provide a way forward to better understand and support the complex choices our students make. In Chapter 3, we provide a slightly more practically oriented and perhaps down-to-earth approach focussing on the lessons learned with Analytics4Action. Over the last four years nearly a hundred modules have made more active use of data and insights into module presentation to support their students.
In Chapter 3, several good practices are described by the LTI/TEL learning design team, as well as three innovative case studies which we hope will inspire you to try something new as well. Working organically in various Faculty sub-group meetings and LTI Units, and in a Google Doc with various key stakeholders in the Faculties, we hope that our Scholarly insights can help to inform our staff, but also spark some ideas on how to further improve our module designs and qualification pathways. Of course, we are keen to hear what other topics require Scholarly insight. We hope that you see some potential in the two innovative approaches, and perhaps you might want to try some new ideas in your module. While a time machine has not really been invented yet, with the increasingly rich and fine-grained data about our students and our learning practices we are getting closer to understanding what really drives our students.

Open Research Online (The Open University)

A Comparative Analysis of Ensemble Classifiers: Case Studies in Genomics

Author: Pandey Gaurav
Whalen Sean
Publication date: 19/09/2013

The combination of multiple classifiers using ensemble methods is increasingly important for making progress in a variety of difficult prediction problems. We present a comparative analysis of several ensemble methods through two case studies in genomics, namely the prediction of genetic interactions and protein functions, to demonstrate their efficacy on real-world datasets and draw useful conclusions about their behavior. These methods include simple aggregation, meta-learning, cluster-based meta-learning, and ensemble selection using heterogeneous classifiers trained on resampled data to improve the diversity of their predictions. We present a detailed analysis of these methods across 4 genomics datasets and find that the best of these methods offer statistically significant improvements over the state of the art in their respective domains. In addition, we establish a novel connection between ensemble selection and meta-learning, demonstrating how both of these disparate methods establish a balance between ensemble diversity and performance. Comment: 10 pages, 3 figures, 8 tables, to appear in Proceedings of the 2013 International Conference on Data Minin
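
As a rough illustration of two of the methods named in this abstract, the sketch below contrasts simple aggregation (averaging every base classifier's predicted probabilities) with a greedy forward form of ensemble selection. It is a minimal sketch, not the authors' procedure: the classifier names, the toy validation data, the accuracy criterion, and the helper functions `accuracy`, `aggregate`, and `greedy_selection` are all assumptions introduced here.

```python
from statistics import mean


def accuracy(probs, labels, threshold=0.5):
    """Fraction of examples whose thresholded probability matches the label."""
    hits = sum((p >= threshold) == bool(y) for p, y in zip(probs, labels))
    return hits / len(labels)


def aggregate(prob_lists):
    """Simple aggregation: average the classifiers' probabilities per example."""
    return [mean(column) for column in zip(*prob_lists)]


def greedy_selection(base_probs, labels, rounds=2):
    """Greedy forward ensemble selection with replacement (assumed variant).

    base_probs maps a hypothetical classifier name to its validation-set
    probabilities. Each round adds the classifier whose inclusion gives the
    highest validation accuracy of the averaged ensemble.
    """
    chosen = []
    for _ in range(rounds):
        best_name = max(
            sorted(base_probs),  # sorted so that ties resolve deterministically
            key=lambda name: accuracy(
                aggregate([base_probs[n] for n in chosen + [name]]), labels
            ),
        )
        chosen.append(best_name)
    return chosen


# Toy validation data, invented purely for illustration.
labels = [1, 0, 1, 1, 0, 0]
base_probs = {
    "clf_a": [0.9, 0.2, 0.4, 0.8, 0.6, 0.1],
    "clf_b": [0.6, 0.4, 0.7, 0.3, 0.2, 0.3],
    "clf_c": [0.2, 0.95, 0.3, 0.5, 0.8, 0.8],
}

averaged = aggregate(list(base_probs.values()))
selected = greedy_selection(base_probs, labels, rounds=2)
ensemble = aggregate([base_probs[n] for n in selected])

print("simple aggregation accuracy:", accuracy(averaged, labels))
print("selected:", selected, "accuracy:", accuracy(ensemble, labels))
```

On this toy data the weak classifier drags down the plain average, while the greedy procedure leaves it out. A realistic comparison would use held-out data and a metric suited to the task.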

arXiv.org e-Print Archive

Data mining techniques for complex application domains

Author: Mahoto Naeem Ahmed
Publication venue: Politecnico di Torino

The emergence of advanced communication techniques has increased the availability of large collections of data in electronic form in a number of application domains, including healthcare, e-business, and e-learning. Every day a large number of records are stored electronically. However, finding useful information in such a large data collection is a challenging issue. Data mining technology aims to automatically extract hidden knowledge from large data repositories by exploiting sophisticated algorithms. The hidden knowledge in the electronic data may potentially be utilized to facilitate the procedures, productivity, and reliability of several application domains. The PhD activity has focused on novel and effective data mining approaches to tackle the complex data coming from two main application domains: healthcare data analysis and textual data analysis. The research activity, in the context of healthcare data, addressed the application of different data mining techniques to discover valuable knowledge from real exam-log data of patients. In particular, efforts have been devoted to the extraction of medical pathways, which can be exploited to analyze the actual treatments followed by patients. The derived knowledge not only provides useful information for dealing with the treatment procedures but may also play an important role in future predictions of potential patient risks associated with medical treatments. The research effort in textual data analysis is twofold. On the one hand, a novel approach to the discovery of succinct summaries of large document collections has been proposed. On the other hand, the suitability of an established descriptive data mining technique to support domain experts in making decisions has been investigated.
Both research activities focus on adopting widely used exploratory data mining techniques for textual data analysis, which requires overcoming the intrinsic limitations of traditional algorithms in handling textual documents efficiently and effectively.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Inferring Actual Treatment Pathways from Patient Records

Author: Bandara Madhushi
Catchpoole Daniel
Kennedy Paul J.
Musial Katarzyna
Wilkins-Caruana Adrian
Publication date: 04/09/2023

Treatment pathways are step-by-step plans outlining the recommended medical care for specific diseases; they get revised when different treatments are found to improve patient outcomes. Examining health records is an important part of this revision process, but inferring patients' actual treatments from health data is challenging due to complex event-coding schemes and the absence of pathway-related annotations. This study aims to infer the actual treatment steps for a particular patient group from administrative health records (AHR) - a common form of tabular healthcare data - and address several technique- and methodology-based gaps in treatment pathway-inference research. We introduce Defrag, a method for examining AHRs to infer the real-world treatment steps for a particular patient group. Defrag learns the semantic and temporal meaning of healthcare event sequences, allowing it to reliably infer treatment steps from complex healthcare data. To our knowledge, Defrag is the first pathway-inference method to utilise a neural network (NN), an approach made possible by a novel, self-supervised learning objective. We also developed a testing and validation framework for pathway inference, which we use to characterise and evaluate Defrag's pathway inference ability and compare against baselines. We demonstrate Defrag's effectiveness by identifying best-practice pathway fragments for breast cancer, lung cancer, and melanoma in public healthcare records. Additionally, we use synthetic data experiments to demonstrate the characteristics of the Defrag method, and to compare Defrag to several baselines, where it significantly outperforms non-NN-based methods. Defrag significantly outperforms several existing pathway-inference methods and offers an innovative and effective approach for inferring treatment pathways from AHRs. Open-source code is provided to encourage further research in this area.

arXiv.org e-Print Archive

Through the Lens of the Learner: Using Learning Analytics to Predict Learner-Centered Outcomes in Massive Open Online Courses

Author: Rabin Eyal
Publication venue: Open Universiteit
Publication date: 10/09/2021

Open University of the Netherlands Research Portal