Efficient Replication of Over 180 Genetic Associations with Self-Reported Medical Data

Author: Amy K. Kiefer
Anne Wojcicki
Arnab B. Chowdry
Brian T. Naughton
Chuong B. Do
David A. Hinds
J. Michael Macpherson
Joanna L. Mountain
Joyce Y. Tung
Nicholas Eriksson
Uta Francke
Publication date: 01/01/2011

While the cost and speed of generating genomic data have come down dramatically in recent years, the slow pace of collecting medical data for large cohorts continues to hamper genetic research. Here we evaluate a novel online framework for amassing large amounts of medical information in a recontactable cohort by assessing our ability to replicate genetic associations using these data. Using web-based questionnaires, we gathered self-reported data on 50 medical phenotypes from a generally unselected cohort of over 20,000 genotyped individuals. Of a list of genetic associations curated by NHGRI, we successfully replicated about 75% of the associations that we expected to (based on the number of cases in our cohort and reported odds ratios, and excluding a set of associations with contradictory published evidence). Altogether we replicated over 180 previously reported associations, including many for type 2 diabetes, prostate cancer, cholesterol levels, and multiple sclerosis. We found significant variation across categories of conditions in the percentage of expected associations that we were able to replicate, which may reflect systematic inflation of the effects in some initial reports, or differences across diseases in the likelihood of misdiagnosis or misreport. We also demonstrated that we could improve replication success by taking advantage of our recontactable cohort, offering more in-depth questions to refine self-reported diagnoses. Our data suggest that online collection of self-reported data in a recontactable cohort may be a viable method for both broad and deep phenotyping in large populations.

CiteSeerX

Directory of Open Access Journals

Dipole: Diagnosis Prediction in Healthcare via Attention-based Bidirectional Recurrent Neural Networks

Author: Ba Jimmy
Cheng Yu
Choi Edward
Lipton Zachary C
Suo Qiuling
Wang Xiang
Xu Kelvin
Zeiler Matthew D
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 18/06/2017

Predicting the future health information of patients from the historical Electronic Health Records (EHR) is a core research task in the development of personalized healthcare. Patient EHR data consist of sequences of visits over time, where each visit contains multiple medical codes, including diagnosis, medication, and procedure codes. The most important challenges for this task are to model the temporality and high dimensionality of sequential EHR data and to interpret the prediction results. Existing work solves this problem by employing recurrent neural networks (RNNs) to model EHR data and utilizing a simple attention mechanism to interpret the results. However, RNN-based approaches suffer from the problem that the performance of RNNs drops when the length of sequences is large, and the relationships between subsequent visits are ignored by current RNN-based approaches. To address these issues, we propose Dipole, an end-to-end, simple and robust model for predicting patients' future health information. Dipole employs bidirectional recurrent neural networks to remember all the information of both the past visits and the future visits, and it introduces three attention mechanisms to measure the relationships of different visits for the prediction. With the attention mechanisms, Dipole can interpret the prediction results effectively. Dipole also allows us to interpret the learned medical code representations, which are confirmed positively by medical experts. Experimental results on two real-world EHR datasets show that the proposed Dipole can significantly improve the prediction accuracy compared with the state-of-the-art diagnosis prediction approaches and provide clinically meaningful interpretation.

arXiv.org e-Print Archive

Protocol for the 'e-Nudge trial' : a randomised controlled trial of electronic feedback to reduce the cardiovascular risk of individuals in general practice [ISRCTN64828380]

Author: 2 JBS
A Agrawal
A Filippi
AD Hingorani
AD Morris
AT Khoury
B Kralj
B Williams
C Safran
CH Fung
D Machin
D Moher
Department of Health
DF Lobach
E Hak
E Mitchell
E Toth-Pal
FM Weaver
Frances Griffiths
I Hoch
J Hippisley-Cox
K Stewart
KG Schellhase
KM Anderson
KS Yarnall
LL Dickey
M Bland
M Weiner
MA Krall
Margaret Thorogood
MZ Kleschen
N Kucher
National Service Frameworks
NC Campbell
P Brindle
PC Tang
PM Rothwell
RJ Lilford
SJ Pocock
SS Intille
Stephen Munday
TA Holt
TA Lieu
Tim A Holt
TK Gandhi
UK Diabetes
WL Galanter
WM Tierney
World Health Organisation
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2006

Background: Cardiovascular disease (including coronary heart disease and stroke) is a major cause of death and disability in the United Kingdom, and is to a large extent preventable by lifestyle modification and drug therapy. The recent standardisation of electronic codes for cardiovascular risk variables through the United Kingdom's new General Practice contract provides an opportunity for the application of risk algorithms to identify high risk individuals. This randomised controlled trial will test the benefits of an automated system of alert messages and practice searches to identify those at highest risk of cardiovascular disease in primary care databases. Design: Patients over 50 years old in practice databases will be randomised to an intervention group that will receive the alert messages and searches, or a control group that will continue to receive usual care. In addition to those at high estimated risk, potentially high risk patients will be identified who have insufficient data to allow a risk estimate to be made. Further groups identified will be those with possible undiagnosed diabetes, based either on elevated past recorded blood glucose measurements, or an absence of recent blood glucose measurement in those with established cardiovascular disease. Outcome measures: The intervention will be applied for two years, and outcome data will be collected for a further year. The primary outcome measure will be the annual rate of cardiovascular events in the intervention and control arms of the study. Secondary measures include the proportion of patients at high estimated cardiovascular risk, the proportion of patients with missing data for a risk estimate, and the proportion with undefined diabetes status at the end of the trial.

Springer - Publisher Connector

Directory of Open Access Journals

Warwick Research Archives Portal Repository

Oxford University Research Archive

Big data and data repurposing – using existing data to answer new questions in vascular dementia research

Author: A Abdul-Rahim
AH Noel-Storr
Andreas Charidimou
Charlotte J. Roberts
CN Martyn
Craig W. Ritchie
CW Ritchie
D Jang
D Kerr
D Religa
DA Levine
Deborah A. Levine
Fergus N. Doubal
G Mead
G Perera
G Rands
G. David Batty
GE Mead
Gillian Mead
HA Mucke
Hermann A. M. Mucke
HL Dunn
I Chalmers
J Danesh
J Mindell
J Sultana
JPA Ioannidis
L Gray
M Ali
M Brainin
M Jokela
M Porter
Maria Eriksdotter
Martin Hofmann-Apitius
Myzoon Ali
NL Catlett
P Langhorne
R Lees
R Patel
R Stewart
R Xu
Robert Stewart
S Garcia-Ptacek
SK McCann
T Hoffman
T Skillbäck
TC Russ
Terence J. Quinn
TJ Quinn
Tom C. Russ
William Whiteley
Yun-Hee Kim
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Introduction: Traditional approaches to clinical research have, as yet, failed to provide effective treatments for vascular dementia (VaD). Novel approaches to collation and synthesis of data may allow for time- and cost-efficient hypothesis generation and testing. These approaches may have particular utility in helping us understand and treat a complex condition such as VaD. Methods: We present an overview of new uses for existing data to progress VaD research. The overview is the result of consultation with various stakeholders, focused literature review and learning from the group’s experience of successful approaches to data repurposing. In particular, we benefitted from the expert discussion and input of delegates at the 9th International Congress on Vascular Dementia (Ljubljana, 16-18th October 2015). Results: We agreed on key areas that could be of relevance to VaD research: systematic review of existing studies; individual patient level analyses of existing trials and cohorts; and linking electronic health record data to other datasets. We illustrated each theme with a case-study of an existing project that has utilised this approach. Conclusions: There are many opportunities for the VaD research community to make better use of existing data. The volume of potentially available data is increasing and the opportunities for using these resources to progress the VaD research agenda are exciting. Of course, these approaches come with inherent limitations and biases, as bigger datasets are not necessarily better datasets, and maintaining rigour and critical analysis will be key to optimising data use.

Directory of Open Access Journals

Enlighten

ResearchOnline@GCU

Predicting diabetes-related hospitalizations based on electronic health records

Author: Brisimi Theodora S.
Dai Wuyang
Paschalidis Ioannis Ch.
Wang Taiyao
Xu Tingting
Publication venue: 'SAGE Publications'
Publication date: 01/12/2019

OBJECTIVE: To derive a predictive model to identify patients likely to be hospitalized during the following year due to complications attributed to Type II diabetes. METHODS: A variety of supervised machine learning classification methods were tested and a new method that discovers hidden patient clusters in the positive class (hospitalized) was developed while, at the same time, sparse linear support vector machine classifiers were derived to separate positive samples from the negative ones (non-hospitalized). The convergence of the new method was established and theoretical guarantees were proved on how the classifiers it produces generalize to a test set not seen during training. RESULTS: The methods were tested on a large set of patients from the Boston Medical Center, the largest safety net hospital in New England. It is found that our new joint clustering/classification method achieves an accuracy of 89% (measured in terms of area under the ROC curve) and yields informative clusters which can help interpret the classification results, thus increasing physicians' trust in the algorithmic output and providing some guidance towards preventive measures. While it is possible to increase accuracy to 92% with other methods, this comes with increased computational cost and lack of interpretability. The analysis shows that even a modest probability of preventive actions being effective (more than 19%) suffices to generate significant hospital care savings. CONCLUSIONS: Predictive models are proposed that can help avert hospitalizations, improve health outcomes and drastically reduce hospital expenditures. The scope for savings is significant as it has been estimated that in the USA alone, about $5.8 billion are spent each year on diabetes-related hospitalizations that could be prevented.
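The abstract reports accuracy as area under the ROC curve (AUC). As a reading aid, here is a minimal Python sketch of what that number means: the fraction of (hospitalized, non-hospitalized) patient pairs in which the hospitalized patient gets the higher score. The function name `roc_auc` and the toy scores are illustrative assumptions, not taken from the paper, and this pairwise computation is only one standard way to obtain AUC.

```python
def roc_auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    scores: model outputs, higher means "more likely hospitalized".
    labels: 1 for the positive class, 0 for the negative class.
    Tied scores count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): two hospitalized and two non-hospitalized patients.
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
print(roc_auc(toy_scores, toy_labels))  # 3 of 4 pairs ranked correctly -> 0.75
```

Read this way, an AUC of 89% would mean that a randomly chosen hospitalized patient outscores a randomly chosen non-hospitalized patient about 89% of the time.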

Boston University Institutional Repository (OpenBU)

Extracting information from the text of electronic medical records to improve case detection: a systematic review

Author: Afzal
Ananthakrishnan
Baus
Cano
Carroll
Castro
Chapman
Chen
Chung
Currie
de Lusignan
DeLisle
Donia Scott
Dorr
Elizabeth Ford
Ford
Friedlin
Friedman
Graiser
Greenhalgh
Gulliford
Gundlapalli
Hanauer
Harkema
Helen E Smith
Imfeld
Jackie A Cassell
John A Carroll
Jones
Kalra
Karnik
Koeling
Kushida
Li
Liao
Lin
Lindberg
Love
Lovis
Ludvigsson
Manning
Manuel
McPeek Hinz
Mehrabi
Meystre
Nielen
Pakhomov
Powsner
Rait
Resnik
Roch
Ryan
Savova
Soler
Stein
Stone
Tange
Tate
Tsui
Uzuner
Valkhoff
Walsh
Widdifield
Wilke
Wu
Xia
Xu
Yadav
Ye
Zeng
Zheng
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2016

Background: Electronic medical records (EMRs) are revolutionizing health-related research. One key issue for study quality is the accurate identification of patients with the condition of interest. Information in EMRs can be entered as structured codes or unstructured free text. The majority of research studies have used only coded parts of EMRs for case-detection, which may bias findings, miss cases, and reduce study quality. This review examines whether incorporating information from text into case-detection algorithms can improve research quality. Methods: A systematic search returned 9659 papers, 67 of which reported on the extraction of information from free text of EMRs with the stated purpose of detecting cases of a named clinical condition. Methods for extracting information from text and the technical accuracy of case-detection algorithms were reviewed. Results: Studies mainly used US hospital-based EMRs, and extracted information from text for 41 conditions using keyword searches, rule-based algorithms, and machine learning methods. There was no clear difference in case-detection algorithm accuracy between rule-based and machine learning methods of extraction. Inclusion of information from text resulted in a significant improvement in algorithm sensitivity and area under the receiver operating characteristic curve in comparison to codes alone (median sensitivity 78% (codes + text) vs 62% (codes), P = .03; median area under the receiver operating characteristic curve 95% (codes + text) vs 88% (codes), P = .025). Conclusions: Text in EMRs is accessible, especially with open source information extraction algorithms, and significantly improves case detection when combined with codes. More harmonization of reporting within EMR studies is needed, particularly standardized reporting of algorithm accuracy metrics like positive predictive value (precision) and sensitivity (recall).
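The abstract names sensitivity (recall) and positive predictive value (precision) as the metrics to report. The minimal Python sketch below shows how both follow from true-positive, false-positive and false-negative counts, and how adding a text-derived flag to a code-derived flag can change them. The function name `case_detection_metrics`, the OR rule for combining flags, and all toy data are illustrative assumptions, not the review's method or results.

```python
def case_detection_metrics(predicted, actual):
    """Return sensitivity (recall) and PPV (precision) for binary case flags."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return {"sensitivity": sensitivity, "ppv": ppv}


# Toy data (hypothetical): eight patients, the first four are true cases.
actual = [1, 1, 1, 1, 0, 0, 0, 0]
codes = [1, 1, 0, 0, 0, 0, 0, 0]  # case flagged by a structured code
text = [0, 1, 1, 0, 1, 0, 0, 0]   # case flagged by free-text extraction

# Assumed combination rule: a patient is a case if either source flags them.
codes_plus_text = [int(c or t) for c, t in zip(codes, text)]

print(case_detection_metrics(codes, actual))            # sensitivity 0.5, ppv 1.0
print(case_detection_metrics(codes_plus_text, actual))  # sensitivity 0.75, ppv 0.75
```

In this toy example, adding text raises sensitivity but lowers PPV. That trade-off is one reason to report both metrics together.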