106 research outputs found
Identifying candidate risk factors for prescription drug side effects using causal contrast set mining
Big longitudinal observational databases present the opportunity to extract new knowledge in a cost-effective manner. Unfortunately, the ability of these databases to be used for causal inference is limited due to the passive way in which the data are collected, resulting in various forms of bias. In this paper we investigate a method that can overcome these limitations and determine causal contrast set rules efficiently from big data. In particular, we present a new methodology for the purpose of identifying risk factors that increase a patient's likelihood of experiencing the known rare side effect of renal failure after ingesting aminosalicylates. The results show that the methodology was able to identify previously researched risk factors, such as being prescribed diuretics, and highlighted that patients with a higher than average risk of renal failure may be even more susceptible to experiencing it as a side effect after ingesting aminosalicylates.
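As a rough illustration of the general idea behind contrast set mining (not the causal variant described in the abstract, whose details are not given here), the sketch below compares how often a candidate attribute occurs in two patient groups. The function names and the toy records are invented for illustration only; they are not data or results from the paper.

```python
def support(records, itemset):
    """Fraction of records that contain every item in itemset."""
    itemset = set(itemset)
    if not records:
        return 0.0
    return sum(1 for r in records if itemset <= set(r)) / len(records)


def support_difference(group_a, group_b, itemset):
    """Difference in support between two groups for one candidate itemset."""
    return support(group_a, itemset) - support(group_b, itemset)


# Invented toy records: each set lists attributes recorded for one patient.
cases = [{"diuretic", "age_65_plus"}, {"diuretic"}, {"age_65_plus"}, {"diuretic", "nsaid"}]
controls = [{"nsaid"}, {"age_65_plus"}, {"diuretic"}, set()]

diff = support_difference(cases, controls, {"diuretic"})
print(diff)
```

A large support difference only flags a candidate contrast; on its own it says nothing about causality, which is the limitation the abstract sets out to address.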
Incorporating spontaneous reporting system data to aid causal inference in longitudinal healthcare data
Inferring causality using longitudinal observational databases is challenging due to the passive way the data are collected. The majority of associations found within longitudinal observational data are often non-causal and occur due to confounding.
The focus of this paper is to investigate incorporating information from additional databases to complement the longitudinal observational database analysis. We investigate the detection of prescription drug side effects as this is an example of a causal relationship. In previous work a framework was proposed for detecting side effects using only longitudinal data. In this paper we combine a measure of association derived from mining a spontaneous reporting system database with a previously proposed analysis that extracts domain expertise features for causal analysis of a UK general practice longitudinal database.
The results show that there is a significant improvement to the performance of detecting prescription drug side effects when the longitudinal observational data analysis is complemented by incorporating additional drug safety sources into the framework. The area under the receiver operating characteristic curve (AUC) for correctly classifying a side effect when other data were considered was 0.967, whereas without it the AUC was 0.923. However, the results of this paper may be biased by the evaluation, and future work should overcome this by developing an unbiased reference set.
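For readers unfamiliar with the AUC figures quoted above, the following minimal sketch computes the area under the ROC curve as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. The function name, labels and scores are illustrative assumptions, not values or code from the paper.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Invented toy example: 1 = known side effect, 0 = not a side effect.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

The pairwise form is quadratic in the number of examples, which is fine for a small reference set; a rank-based computation would be preferable for large ones.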
Detecting adverse drug reactions in the general practice healthcare database
The novel contribution of this research is the development of a supervised algorithm that extracts relevant attributes from The Health Improvement Network database to detect prescription side effects. Prescription drug side effects are a common cause of morbidity throughout the world. Methods that aim to detect side effects have historically been limited due to the data available, but some of these limitations may be overcome by incorporating longitudinal observational databases into pharmacovigilance. Existing side effect detecting methods using longitudinal observational databases have shown promise at becoming a fundamental component of post-marketing surveillance but unfortunately have high false positive rates. An extra step is required to further analyse and filter the potential side effects detected by existing methods due to their high false positive rates, and this reduces their efficiency. In this thesis a novel methodology, the supervised adverse drug reaction predictor (SAP) framework, is presented that learns from known side effects and identifies patterns that can be utilised to detect unknown side effects. The Bradford-Hill causality considerations are used to derive suitable attributes as inputs into a learning algorithm. Both supervised and semi-supervised techniques are investigated due to the limited number of definitively known side effects. The results showed that the SAP framework implementing a random forest classifier outperformed the existing methods on The Health Improvement Network longitudinal observational database, with AUCs ranging between 0.812 and 0.937, an overall MAP of 0.667, precision values between 0.733 and 1, and a false positive rate ≤ 0.013. When applied to the standard reference, the SAP framework implementing a support vector machine obtained a MAP score of 0.490, an average AUC of 0.703 and a false positive rate of 0.16. The false positive rate is lower than that obtained by existing methods on the standard reference.
Personalising mobile advertising based on users’ installed apps
Mobile advertising is a billion pound industry that is rapidly expanding. The success of an advert is measured based on how users interact with it. In this paper we investigate whether the application of unsupervised learning and association rule mining could be used to enable personalised targeting of mobile adverts with the aim of increasing the interaction rate. Over May and June 2014 we recorded advert interactions, such as tapping the advert or watching the whole advert video, along with the set of apps a user had installed at the time of the interaction. Based on the apps that the users have installed we applied k-means clustering to profile the users into one of ten classes. Due to the large number of apps considered we implemented dimension reduction to reduce the app feature space by mapping the apps to their iTunes category, and clustered users based on the percentage of their apps that correspond to each iTunes app category. The clustering was externally validated by investigating differences between the way the ten profiles interact with the various advert genres (lifestyle, finance and entertainment adverts). In addition, association rule mining was performed to find whether the time of day that the advert is served and the number of apps a user has installed make certain profiles more likely to interact with the advert genres. The results showed there were clear differences in the way the profiles interact with the different advert genres, and the results of this paper suggest that mobile advert targeting would improve the frequency with which users interact with an advert.
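The dimension reduction step described above, mapping each installed app to a store category and representing a user by the share of their apps in each category, might look roughly like the sketch below. The app names, the category lookup and the function name are hypothetical; the paper's actual mapping and category list are not given in the abstract. Shares are expressed as fractions rather than percentages.

```python
from collections import Counter

# Hypothetical app-to-category lookup (illustrative only).
APP_CATEGORY = {
    "chess_pro": "Games",
    "word_puzzle": "Games",
    "bank_app": "Finance",
    "run_tracker": "Health & Fitness",
}
# Fixed category order so every user gets a vector of the same length.
CATEGORIES = sorted(set(APP_CATEGORY.values()))


def category_profile(installed_apps):
    """Return the fraction of a user's recognised apps in each category."""
    counts = Counter(APP_CATEGORY[a] for a in installed_apps if a in APP_CATEGORY)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(CATEGORIES)
    return [counts[c] / total for c in CATEGORIES]


# Apps missing from the lookup are ignored in this sketch.
profile = category_profile(["chess_pro", "word_puzzle", "bank_app", "unknown_app"])
print(dict(zip(CATEGORIES, profile)))
```

Vectors of this form have one dimension per category instead of one per app, so they can be passed to a clustering routine such as k-means with far fewer features.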