10,047 research outputs found
Recommended from our members
Scholarly insight Spring 2018: a Data wrangler perspective
In the movie classic Back to the Future a young Michael J. Fox is able to explore the past by a time machine developed by the slightly bizarre but exquisite Dr Brown. Unexpectedly by some small intervention the course of history was changed a bit along Fox’s adventures. In this fourth Scholarly Insight Report we have explored two innovative approaches to learn from OU data of the past, which hopefully in the future will make a large difference in how we support our students and design and implement our teaching and learning practices. In Chapter 1, we provide an in-depth analysis of 50 thousands comments expressed by students through the Student Experience on a Module (SEAM) questionnaire. By analysing over 2.5 million words using big data approaches, our Scholarly insights indicate that not all student voices are heard. Furthermore, our big data analysis indicate useful potential insights to explore how student voices change over time, and for which particular modules emergent themes might arise.
In Chapter 2 we provide our second innovative approach of a proof-of-concept of qualification path way using graph approaches. By exploring existing data of one qualification (i.e., Psychology), we show that students make a range of pathway choices during their qualification, some of which are more successful than others. As highlighted in our previous Scholarly Insight Reports, getting data from a qualification perspective within the OU is a difficult and challenging process, and the proof-of-concept provided in Chapter 2 might provide a way forward to better understand and support the complex choices our students make.
In Chapter 3, we provide a slightly more practically-oriented and perhaps down to earth approach focussing on the lessons-learned with Analytics4Action. Over the last four years nearly a hundred modules have worked with more active use of data and insights into module presentation to support their students. In Chapter 3 several good-practices are described by the LTI/TEL learning design team, as well as three innovative case-studies which we hope will inspire you to try something new as well.
Working organically in various Faculty sub-group meetings and LTI Units and in a google doc with various key stakeholders in the Faculties, we hope that our Scholarly insights can help to inform our staff, but also spark some ideas how to further improve our module designs and qualification pathways. Of course we are keen to hear what other topics require Scholarly insight. We hope that you see some potential in the two innovative approaches, and perhaps you might want to try some new ideas in your module. While a time machine has not really been invented yet, with the increasing rich and fine-grained data about our students and our learning practices we are getting closer to understand what really drives our students
A Comparative Analysis of Ensemble Classifiers: Case Studies in Genomics
The combination of multiple classifiers using ensemble methods is
increasingly important for making progress in a variety of difficult prediction
problems. We present a comparative analysis of several ensemble methods through
two case studies in genomics, namely the prediction of genetic interactions and
protein functions, to demonstrate their efficacy on real-world datasets and
draw useful conclusions about their behavior. These methods include simple
aggregation, meta-learning, cluster-based meta-learning, and ensemble selection
using heterogeneous classifiers trained on resampled data to improve the
diversity of their predictions. We present a detailed analysis of these methods
across 4 genomics datasets and find the best of these methods offer
statistically significant improvements over the state of the art in their
respective domains. In addition, we establish a novel connection between
ensemble selection and meta-learning, demonstrating how both of these disparate
methods establish a balance between ensemble diversity and performance.Comment: 10 pages, 3 figures, 8 tables, to appear in Proceedings of the 2013
International Conference on Data Minin
Data mining techniques for complex application domains
The emergence of advanced communication techniques has increased availability of large collection of data in electronic form in a number of application domains including healthcare, e- business, and e-learning. Everyday a large amount of records are stored electronically. However, finding useful information from such a large data collection is a challenging issue. Data mining technology aims automatically extracting hidden knowledge from large data repositories exploiting sophisticated algorithms. The hidden knowledge in the electronic data may be potentially utilized to facilitate the procedures, productivity, and reliability of several application domains.
The PhD activity has been focused on novel and effective data mining approaches to tackle the complex data coming from two main application domains: Healthcare data analysis and Textual data analysis.
The research activity, in the context of healthcare data, addressed the application of different data mining techniques to discover valuable knowledge from real exam-log data of patients. In particular, efforts have been devoted to the extraction of medical pathways, which can be exploited to analyze the actual treatments followed by patients. The derived knowledge not only provides useful information to deal with the treatment procedures but may also play an important role in future predictions of potential patient risks associated with medical treatments.
The research effort in textual data analysis is twofold. On the one hand, a novel approach to discovery of succinct summaries of large document collections has been proposed. On the other hand, the suitability of an established descriptive data mining to support domain experts in making decisions has been investigated. Both research activities are focused on adopting widely exploratory data mining techniques to textual data analysis, which require overcoming intrinsic limitations for traditional algorithms for handling textual documents efficiently and effectively
Inferring Actual Treatment Pathways from Patient Records
Treatment pathways are step-by-step plans outlining the recommended medical
care for specific diseases; they get revised when different treatments are
found to improve patient outcomes. Examining health records is an important
part of this revision process, but inferring patients' actual treatments from
health data is challenging due to complex event-coding schemes and the absence
of pathway-related annotations. This study aims to infer the actual treatment
steps for a particular patient group from administrative health records (AHR) -
a common form of tabular healthcare data - and address several technique- and
methodology-based gaps in treatment pathway-inference research. We introduce
Defrag, a method for examining AHRs to infer the real-world treatment steps for
a particular patient group. Defrag learns the semantic and temporal meaning of
healthcare event sequences, allowing it to reliably infer treatment steps from
complex healthcare data. To our knowledge, Defrag is the first
pathway-inference method to utilise a neural network (NN), an approach made
possible by a novel, self-supervised learning objective. We also developed a
testing and validation framework for pathway inference, which we use to
characterise and evaluate Defrag's pathway inference ability and compare
against baselines. We demonstrate Defrag's effectiveness by identifying
best-practice pathway fragments for breast cancer, lung cancer, and melanoma in
public healthcare records. Additionally, we use synthetic data experiments to
demonstrate the characteristics of the Defrag method, and to compare Defrag to
several baselines where it significantly outperforms non-NN-based methods.
Defrag significantly outperforms several existing pathway-inference methods and
offers an innovative and effective approach for inferring treatment pathways
from AHRs. Open-source code is provided to encourage further research in this
area
- …