6 research outputs found
Identifying the Main Causes of Medical Data Incompleteness in the Smart Healthcare Era
Incomplete data due to discrepancies between medical data sources and their storage methods represents a serious concern, as it may lead to the loss or misrepresentation of important medical information. This concern is anticipated to grow in the era of smart healthcare as the volume, variety and speed at which medical data is collected increase significantly. This paper aims to identify the main causes of data incompleteness in the medical domain, discuss some techniques currently used to build a complete medical picture and highlight how they may affect the consistency and accuracy of the collected data. It also outlines future research directions to efficiently handle data incompleteness and its consequences.
Reliability of grading using a rubric versus a traditional marking scheme in statistics
Assessment grading in statistics and mathematics has often been approached in an ad hoc manner, using marking schemes that attach marks to specific steps of a model solution and often do not explicitly reference assessment criteria. Another approach for grading is to use rubrics. Rubrics are recognised to have several advantages for assessment, but research on the reliability of grading with rubrics is equivocal and mostly conducted in less quantitative disciplines. We present a direct comparison of the reliability of marking a written statistics assignment using a rubric and using the traditional marking scheme approach. We use a Bayesian statistical analysis and find that both methods yield similar levels of inter-rater and intra-rater reliability.
Inference for auto-regulatory genetic networks using diffusion process approximations
The scope of this thesis is to propose new inferential tools, based on diffusion process approximations, for the study of the kinetic parameters in auto-regulatory networks. In the first part of this thesis, we study the applicability of the EA methodology to Stochastic Differential Equations (SDEs) which approximate biological systems. In principle, EA can be applied to any scalar-valued SDE as long as a transformation (known as the Lamperti transform) exists that sets the (new) infinitesimal variance to unity. We explore the numerical limitations of this requirement by considering a biological system that can be expressed as a scalar non-linear SDE. Next, we consider the multidimensional extension of this transformation and we show, with a counterexample, that EA can be applied to a class of SDEs which is wider than the class of reducible diffusions. In the second part of this thesis, we propose a reparametrization of the kinetic constants that leads to an approximation known as the Linear Noise Approximation (LNA). We prove that the LNA converges to a linear SDE as the size of the biological system increases. Since the LNA is a linear SDE, it has a known transition density with parameters given as the solutions of a system of Ordinary Differential Equations (ODEs), which are usually obtained numerically. Furthermore, we compare the LNA's simulation performance to the performance of other (approximate and exact) methods under different modelling scenarios and we relate the performance of the approximate methods to the system size. In addition, we consider the LNA as an inferential tool and use two methods to derive the LNA's likelihood: the Restarting (RE) method, which we propose, and the Non-Restarting (NR) method, proposed by Komorowski et al. (2009). The two methods differ in the initial conditions that they impose in order to solve the underlying ODEs. We compare the performance of the two methods by considering data generated under different scenarios.
Finally, we discuss lnar, a package for the R statistical environment that we developed to implement the LNA methodology.
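For readers unfamiliar with the Lamperti transform mentioned in the abstract above, the scalar case can be written out as follows. This is the standard textbook form, not a formula taken from the thesis; the symbols $\mu$, $\sigma$, $\eta$, $x_0$ and $W_t$ are notation assumed here for illustration, with $\sigma$ taken to be positive and differentiable.

```latex
% Scalar SDE with state-dependent diffusion coefficient
dX_t = \mu(X_t)\,dt + \sigma(X_t)\,dW_t

% Lamperti transform: integrate the reciprocal of the diffusion coefficient
Y_t = \eta(X_t), \qquad \eta(x) = \int_{x_0}^{x} \frac{du}{\sigma(u)}

% By Ito's formula, with \eta'(x) = 1/\sigma(x) and \eta''(x) = -\sigma'(x)/\sigma(x)^2,
% the transformed process has unit infinitesimal variance:
dY_t = \left( \frac{\mu(X_t)}{\sigma(X_t)} - \frac{1}{2}\,\sigma'(X_t) \right) dt + dW_t,
\qquad X_t = \eta^{-1}(Y_t)
```

The requirement discussed in the abstract corresponds to $\eta$ existing and being invertible, so that the drift of $Y_t$ can be expressed in terms of $Y_t$ alone.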
A Neighborhood-Similarity-Based Imputation Algorithm for Healthcare Data Sets: A Comparative Study
The increasing computerisation of medical services has highlighted inconsistencies in the way in which patients’ historic medical data were recorded. Differences in process and practice between medical services and facilities have led to many incomplete and inaccurate medical histories being recorded. To create a single point of truth going forward, it is necessary to correct these inconsistencies. A common way to do this has been to use imputation techniques to predict missing data values based on the known values in the data set. In this paper, we propose a neighborhood similarity measure-based imputation technique and analyze its prediction accuracy in comparison with a number of traditional imputation methods, using both an incomplete anonymized diabetes medical data set and a number of simulated data sets as the sources of our data. The aim is to determine whether any improvement could be made in the accuracy of predicting a diabetes diagnosis using the known outcomes of the diabetes patients’ data set. The obtained results demonstrate the effectiveness of our proposed approach compared to other state-of-the-art single-pass imputation techniques.
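The general idea of imputing a missing value from similar records can be illustrated with a minimal sketch. This is not the neighborhood similarity measure proposed in the paper, which the abstract does not specify; it is a plain nearest-neighbour mean imputation using a root-mean-square distance over the features two records share. The function name `knn_impute` and the toy data are hypothetical.

```python
import numpy as np

def knn_impute(data, k=3):
    """Fill NaN entries with the mean of the k most similar rows.

    Illustrative only: similarity is the RMS distance over the features
    that both rows have observed (an assumption, not the paper's measure).
    """
    data = np.asarray(data, dtype=float)
    filled = data.copy()
    for i, row in enumerate(data):
        missing = np.isnan(row)
        if not missing.any():
            continue
        candidates = []
        for j, other in enumerate(data):
            if j == i:
                continue
            # Features observed in both rows are used to measure distance.
            shared = ~missing & ~np.isnan(other)
            # A donor must share at least one feature with the row and
            # must have observed every value the row is missing.
            if not shared.any() or np.isnan(other[missing]).any():
                continue
            dist = np.sqrt(np.mean((row[shared] - other[shared]) ** 2))
            candidates.append((dist, j))
        if not candidates:
            continue  # no usable donor: leave the gaps in place
        candidates.sort()
        neighbours = [j for _, j in candidates[:k]]
        filled[i, missing] = data[neighbours][:, missing].mean(axis=0)
    return filled

# Toy records: the last one is missing its third feature.
records = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 3.2],
    [9.0, 9.0, 9.0],
    [1.0, 2.0, np.nan],
]
completed = knn_impute(records, k=2)
```

With `k=2`, the missing entry is filled from the two records closest to the incomplete one on its observed features, so the distant third record does not contribute.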
The influence of odour, taste and nutrients on feeding behaviour and food preferences in horses
While it has been established that nutrients and flavours (odour, taste) play an important role in diet selection by horses, previous studies have not always clarified what type of flavouring (e.g. non-nutritive or nutritive) was used. Therefore, the objective of this study was to determine the influence of distinct food characteristics (odour, taste, nutrients) on the preference of horses using different preference testing protocols. This experiment consisted of three phases: adaptation (P1), two-choice testing (P2) and multiple-choice testing using a checkerboard design (P3). Four pelleted diets equal in digestible energy, but contrasted in crude protein (LP; 14% and HP; 27%) and added non-caloric (natural) sweetener (i.e. LP, LP+, HP, HP+) were consecutively fed to each of sixteen adult horses. The diets were paired with four non-nutritive odours (coconut, banana, cinnamon, spearmint), with a unique odour and diet combination allocated to each group of four horses. In P1, each diet was presented solely for five days to facilitate pre- and post-ingestive associations; in P2, a two-choice test was conducted with four diet combinations (contrasts) over three days; and in P3, the four diets were presented simultaneously in a checkerboard fashion over a 5-day period. Feed intake, bucket/zone visits and time spent foraging or moving were recorded. The key findings of this study were: (1) In P1 an initially large variation in intake was recorded with only some horses showing a neophobic response to a new odour/food, but variation declined within 2 days with the majority of the horses consuming over 9~ of the diets. (2) Nutrient (HP) content appeared to be the main driver for diet intake in P2 (