Machine learning in healthcare : an investigation into model stability
Current machine learning algorithms, when directly applied to medical data, often fail to provide a good understanding of prognosis. This study provides three pathways to make predictive models stable and usable for healthcare. When tested on heart failure and diabetes patients from a local hospital, this study demonstrated a 20% improvement over existing methods.
OPENMENDEL: A Cooperative Programming Project for Statistical Genetics
Statistical methods for genomewide association studies (GWAS) continue to
improve. However, the increasing volume and variety of genetic and genomic data
make computational speed and ease of data manipulation mandatory in future
software. In our view, a collaborative effort of statistical geneticists is
required to develop open source software targeted to genetic epidemiology. Our
attempt to meet this need is called the OPENMENDEL project
(https://openmendel.github.io). It aims to (1) enable interactive and
reproducible analyses with informative intermediate results, (2) scale to big
data analytics, (3) embrace parallel and distributed computing, (4) adapt to
rapid hardware evolution, (5) allow cloud computing, (6) allow integration of
varied genetic data types, and (7) foster easy communication between
clinicians, geneticists, statisticians, and computer scientists. This article
reviews and makes recommendations to the genetic epidemiology community in the
context of the OPENMENDEL project. Comment: 16 pages, 2 figures, 2 tables
Pattern discovery in adverse event data
Identification of Randomized Trials for Inclusion in Meta-Analyses of Treatments for Childhood Acute Lymphoblastic Leukaemia, and Investigation of Factors Leading to Publication Bias
Purpose: Some randomized trials are reported widely, while others remain unpublished. It is essential to systematic reviewers and meta-analysts that factors leading to publication bias in the form of delayed or non-publication of an eligible study are identified. This thesis is an attempt to do this.
Data: The set of randomized trials identified by the Childhood Acute Lymphoblastic Leukaemia (ALL) Collaborative Group was used. This consists of 149 trials comprising 243 randomized comparisons (randomizations), starting prior to 1 January 1988, reported in 257 articles, published prior to 1 January 2000. Each mention of a randomization in an article (irrespective of whether results are given) generates a publication record, of which there are 610.
Methods: The main focus is on identifying which trial characteristics lead to a delay in publication of a randomization. Time to the first mention of a randomization in an article (irrespective of whether any results are given) and to the first reporting of its results are both modelled using ordinary linear regression (the independence model). However, when these analyses are extended to include all mentions and all reportings of results respectively, non-independence necessitates the use of techniques for dealing with repeated measures. In such cases the independence model is the starting point, the residuals from which are used to form the covariance matrix, which in turn is used to suggest plausible correlation structures for repeated measures models. Generalised estimating equation (GEE) analysis is used to select an appropriate correlation structure, and a linear mixed effects model serves to confirm this. The conclusions are then discussed in the context of other studies identified. Finally logistic regression is used to identify trial characteristics associated with a randomization remaining unpublished, and Poisson and negative binomial models to identify those affecting frequency of reporting.
Results: Evidence was found of "pipeline bias" in the reporting of first results since, although direction of effect was not found to be significant, highly statistically significant results are published faster than others. However, this is not so for first mentions. Negative results (i.e. those in favour of the standard/control arm) were submitted for first publication faster than all others, although this did not affect time to publication. In addition, geographic location is an important predictor of whether a randomization is ever mentioned in an article, of frequency of mentions and of time to first publication. Results from single-centre trials are published more frequently than those with multi-centre participation.
Conclusions: Although "pipeline bias" was identified in the analysis of time to first reporting of results, it was not present in the analysis of time to first mention, and so is not a problem for those wishing only to identify randomized trials for inclusion in meta-analyses. The importance of geographic location suggests that the practice of contacting known trialists is worthwhile in addition to the computerised literature searches and should be continued.
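The Methods paragraph above describes fitting an independence model first and using its residuals to suggest a correlation structure for repeated measures. A minimal sketch of that idea follows, assuming a single binary predictor and an exchangeable (equal within-trial) correlation. The records, variable names and helper functions are hypothetical illustrations, not data or code from the thesis.

```python
from collections import defaultdict
from itertools import combinations

# Synthetic, illustrative records (NOT from the thesis): one entry per
# publication record, grouped by a hypothetical trial identifier.
trial = ["A", "A", "A", "B", "B", "C", "C", "C"]
significant = [1, 1, 0, 0, 0, 1, 0, 1]                   # 1 = significant result
years_to_pub = [2.0, 2.5, 3.0, 5.0, 5.5, 3.0, 4.5, 3.5]  # time to publication


def fit_simple_ols(x, y):
    """Ordinary least squares with one predictor (the independence model)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def exchangeable_correlation(residuals, clusters):
    """Moment estimate of a common within-cluster residual correlation."""
    by_cluster = defaultdict(list)
    for r, c in zip(residuals, clusters):
        by_cluster[c].append(r)
    variance = sum(r * r for r in residuals) / len(residuals)
    # Average product of residual pairs that share a cluster.
    products = [a * b
                for rs in by_cluster.values()
                for a, b in combinations(rs, 2)]
    return (sum(products) / len(products)) / variance


intercept, slope = fit_simple_ols(significant, years_to_pub)
residuals = [y - (intercept + slope * x)
             for x, y in zip(significant, years_to_pub)]
rho = exchangeable_correlation(residuals, trial)
print(f"slope={slope:.2f}, within-trial residual correlation={rho:.2f}")
```

A clearly non-zero within-trial correlation in the residuals is the kind of signal that would motivate moving from the independence model to a repeated measures model such as GEE or a linear mixed effects model.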
Feature selection and personalized modeling on medical adverse outcome prediction
This thesis addresses medical adverse outcome prediction and is composed of three parts: feature selection, time-to-event prediction and personalized modeling. For feature selection, we propose a three-stage method that is an ensemble of filter, embedded and wrapper selection techniques. We combine them so as to select a set of features that is both stable and predictive while reducing the computational burden. Datasets for two adverse outcome prediction problems, 30-day hip fracture readmission and diabetic retinopathy prognosis, are derived from electronic health records and used to demonstrate the effectiveness of the proposed method. With the selected features, we investigate the application of some classical survival analysis models, namely accelerated failure time models, Cox proportional hazards regression models and mixture cure models, to adverse outcome prediction. Unlike binary classifiers, survival analysis methods consider both the status and the time-to-event information, and provide more flexibility when we are interested in the occurrence of an adverse outcome in different time windows. Lastly, we introduce the use of personalized modeling (PM) to predict adverse outcomes based on the patients most similar to each query patient. Unlike the commonly used global modeling approach, PM builds the prediction model on a smaller but more similar patient cohort, leading to a more individual-based prediction and a customized risk factor profile. Both static and metric learning distance measures are used to identify the similar patient cohort. We show that PM together with feature selection, using only similar patients, achieves better prediction performance than a one-size-fits-all model using data from all available patients.
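The personalized modeling idea described above, predicting for a query patient from only the most similar patients, can be sketched as follows. This is an illustration under stated assumptions: Euclidean distance stands in for a static distance measure, the local "model" is simply the outcome rate in the similar cohort, and the patients, features and function names are hypothetical rather than taken from the thesis.

```python
import math

# Synthetic, illustrative patients (NOT from the thesis). Each has two
# hypothetical, pre-scaled features and a binary adverse outcome.
patients = [
    {"features": (0.10, 0.20), "outcome": 0},
    {"features": (0.20, 0.10), "outcome": 0},
    {"features": (0.15, 0.25), "outcome": 0},
    {"features": (0.90, 0.80), "outcome": 1},
    {"features": (0.80, 0.90), "outcome": 1},
    {"features": (0.85, 0.75), "outcome": 1},
]


def k_most_similar(query, cohort, k):
    """Return the k patients closest to the query in Euclidean distance."""
    ranked = sorted(cohort, key=lambda p: math.dist(query, p["features"]))
    return ranked[:k]


def personalized_risk(query, cohort, k):
    """Estimate risk as the adverse outcome rate among the k nearest patients."""
    similar = k_most_similar(query, cohort, k)
    return sum(p["outcome"] for p in similar) / len(similar)


def global_risk(cohort):
    """One-size-fits-all baseline: outcome rate over all patients."""
    return sum(p["outcome"] for p in cohort) / len(cohort)


query_patient = (0.82, 0.80)
local = personalized_risk(query_patient, patients, k=3)
overall = global_risk(patients)
print(f"personalized risk={local:.2f}, global risk={overall:.2f}")
```

In this toy cohort the personalized estimate differs sharply from the global rate because the query patient sits inside one cluster. A learned metric would replace `math.dist`, and a fitted model would replace the simple outcome rate.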
Big Data Analytics and Information Science for Business and Biomedical Applications
The analysis of Big Data in biomedical as well as business and financial research has drawn much attention from researchers worldwide. This book provides a platform for in-depth discussion of state-of-the-art statistical methods developed for the analysis of Big Data in these areas. Both applied and theoretical contributions are showcased.
Anticoagulant Use, Safety and Effectiveness for Ischemic Stroke Prevention in Nursing Home Residents with Atrial Fibrillation
Background
Historically, fewer than one-third of nursing home residents with atrial fibrillation were treated with warfarin, then the only available oral anticoagulant. Management of atrial fibrillation has been transformed in recent years by the approval of four direct-acting oral anticoagulants (DOACs) since 2010.
Methods
Using the national Minimum Data Set 3.0 linked to Medicare Part A and D claims, we first described contemporary (2011-2016) warfarin and DOAC utilization in the nursing home population (Aim 1). In Aim 2, we linked residents to nursing home and county level data to study associations between resident, facility, county, and state characteristics and anticoagulant treatment. Using a new-user active comparator design, we then compared the incidence of safety (i.e., bleeding), effectiveness (i.e., ischemic stroke), and mortality outcomes between residents initiating DOACs versus warfarin (Aim 3).
Results
The proportion of residents with atrial fibrillation receiving treatment increased from 42.3% in 2011 to 47.8% as of December 31, 2016, at which time 48.2% of treated residents received DOACs. Demographic and clinical characteristics of residents using DOACs and warfarin were similar in 2016. Half of the 8,734 DOAC users received standard dosages and most were treated with apixaban (54.4%) or rivaroxaban (35.8%) in 2016.
Compared with warfarin, bleeding rates were lower and ischemic stroke rates were higher for apixaban users. Ischemic stroke and bleeding rates for dabigatran and rivaroxaban were comparable to warfarin. Mortality rates were lower versus warfarin for each DOAC.
Conclusions
In nursing homes, DOACs are commonly used, with benefit equal to or greater than that of warfarin.
Washington University Record, August 22, 1996
https://digitalcommons.wustl.edu/record/1728/thumbnail.jp