9 research outputs found

Application of Clinical Concept Embeddings for Heart Failure Prediction in UK EHR data

Author: Denaxas Spiros
Dobson Richard
Hemingway Harry
Pikoula Maria
Riedel Sebastian
Stenetorp Pontus
Publication date: 28/11/2018

Electronic health records (EHR) are increasingly being used for constructing disease risk prediction models. Feature engineering in EHR data, however, is challenging due to their highly dimensional and heterogeneous nature. Low-dimensional representations of EHR data can potentially mitigate these challenges. In this paper, we use global vectors (GloVe) to learn word embeddings for diagnoses and procedures recorded using 13 million ontology terms across 2.7 million hospitalisations in national UK EHR. We demonstrate the utility of these embeddings by evaluating their performance in identifying patients who are at higher risk of being hospitalised for congestive heart failure. Our findings indicate that embeddings can enable the creation of robust EHR-derived disease risk prediction models and address some of the limitations associated with manual clinical feature engineering.

Comment: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018 arXiv:1811.0721

arXiv.org e-Print Archive

UCL Discovery

Identifying priorities in methodological research using ICD-9-CM and ICD-10 administrative data: report from an international consortium

Author: Alan Finlayson
BA Virnig
Carolyn De Coster
CN Bernstein
D Schrag
DC Goff Jr
DJ Magid
DS May
GE Rosenthal
Greg Webster
H Quan
Helen Johansen
Hude Quan
J Wennberg
Jack V Tu
JB Mitchell
JE Hux
Jean-Christophe Luthi
Jin Ma
K Brouch
Karin H Humphries
Leslie Roos
Lisa M Lix
LL Roos
Min Gao
MR Joffres
Patricia Halfon
Patrick S Romano
RA Deyo
S De Lusignan
SM Asch
V Sundararajan
Vijaya Sundararajan
WA Ghali
William A Ghali
WT Hamilton
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Health administrative data are frequently used for health services and population health research. Comparative research using these data has been facilitated by the use of a standard system for coding diagnoses, the International Classification of Diseases (ICD). Research using the data must deal with data quality and validity limitations which arise because the data are not created for research purposes. This paper presents a list of high-priority methodological areas for researchers using health administrative data. METHODS: A group of researchers and users of health administrative data from Canada, the United States, Switzerland, Australia, China and the United Kingdom came together in June 2005 in Banff, Canada to discuss and identify high-priority methodological research areas. The generation of ideas for research focussed not only on matters relating to the use of administrative data in health services and population health research, but also on the challenges created in transitioning from ICD-9 to ICD-10. After the brain-storming session, voting took place to rank-order the suggested projects. Participants were asked to rate the importance of each project from 1 (low priority) to 10 (high priority). Average ranks were computed to prioritise the projects. RESULTS: Thirteen potential areas of research were identified, some of which represented preparatory work rather than research per se. The three most highly ranked priorities were the documentation of data fields in each country's hospital administrative data (average score 8.4), the translation of patient safety indicators from ICD-9 to ICD-10 (average score 8.0), and the development and validation of algorithms to verify the logic and internal consistency of coding in hospital abstract data (average score 7.0). CONCLUSION: The group discussions resulted in a list of expert views on critical international priorities for future methodological research relating to health administrative data. 
The consortium's members welcome contacts from investigators involved in research using health administrative data, especially in cross-jurisdictional collaborative studies or in studies that illustrate the application of ICD-10

University of Toronto Research Repository

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

Serveur académique lausannois

PubMed Central

eScholarship - University of California

University of Melbourne Institutional Repository

Comprehensive computerised primary care records are an essential component of any national health information strategy: report from an international consensus conference

Publication venue: BCS Learning and Development Limited

Crossref

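Interobserver reliability of the International Classification of Primary Care in a primary health care unit (Confiabilidade interobservador da classificação internacional de atenção primária em uma unidade de atenção básica à saúde)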

Author: Arlinda Barbosa Moreno
Bentsen BG
Britt H
Cláudia Medina Coeli
Cohen J
de Lusignan S
Emond JG
Kenneth Rochel de Camargo Jr.
Lamberts H
Lebrão ML
Letrilliart L
Mariana Miranda Autran Sampaio
Márcia Fernandes Soares
Márcia Guimarães de Mello Alves
Reichenheim ME
Rodgers RPC
Sampaio MM
Shrout PE
Soeken KL
Publication venue: FapUNIFESP (SciELO)

Crossref

Diagnoses of Patients with Severe Subjective Health Complaints in Scandinavia: A Cross Sectional Study

Publication venue: Hindawi Limited

Crossref

Vaccine semantics : Automatic methods for recognizing, representing, and reasoning about vaccine-related information

Author: Becker B.F.H. (Benedikt)
Publication date: 08/01/2019

Post-marketing management and decision-making about vaccines builds on the early detection of safety concerns and changes in public sentiment, the accurate access to established evidence, and the ability to promptly quantify effects and verify hypotheses about the vaccine benefits and risks. A variety of resources provide relevant information but they use different representations, which makes rapid evidence generation and extraction challenging. This thesis presents automatic methods for interpreting heterogeneously represented vaccine information. Part I evaluates social media messages for monitoring vaccine adverse events and public sentiment in social media messages, using automatic methods for information recognition. Parts II and III develop and evaluate automatic methods and res

EUR Research Repository

Erasmus University Digital Repository

An experimental study and evaluation of a new architecture for clinical decision support - integrating the openEHR specifications for the Electronic Health Record with Bayesian Networks

Author: Arikan SS
Publication venue: UCL (University College London)
Publication date: 28/06/2016

Healthcare informatics still lacks wide-scale adoption of intelligent decision support methods, despite continuous increases in computing power and methodological advances in scalable computation and machine learning, over recent decades. The potential has long been recognised, as evidenced in the literature of the domain, which is extensively reviewed. The thesis identifies and explores key barriers to adoption of clinical decision support, through computational experiments encompassing a number of technical platforms. Building on previous research, it implements and tests a novel platform architecture capable of processing and reasoning with clinical data. The key components of this platform are the now widely implemented openEHR electronic health record specifications and Bayesian Belief Networks. Substantial software implementations are used to explore the integration of these components, guided and supplemented by input from clinician experts and using clinical data models derived in hospital settings at Moorfields Eye Hospital. Data quality and quantity issues are highlighted. Insights thus gained are used to design and build a novel graph-based representation and processing model for the clinical data, based on the openEHR specifications. The approach can be implemented using diverse modern database and platform technologies. Computational experiments with the platform, using data from two clinical domains – a preliminary study with published thyroid metabolism data and a substantial study of cataract surgery – explore fundamental barriers that must be overcome in intelligent healthcare systems developments for clinical settings. These have often been neglected, or misunderstood as implementation procedures of secondary importance. The results confirm that the methods developed have the potential to overcome a number of these barriers. 
The findings lead to proposals for improvements to the openEHR specifications, in the context of machine learning applications, and in particular for integrating them with Bayesian Networks. The thesis concludes with a roadmap for future research, building on progress and findings to date

UCL Discovery

Secondary use of electronic medical records for early identification of raised condition likelihoods in individuals: a machine learning approach

Author: Turner Jonathan

With many symptoms being common to multiple diseases, there is a challenge in producing an initial diagnosis or recommendation for diagnostic tests from a set of symptoms that could have been produced by a number of diseases. Often the initial choice of diagnosis or testing is based on a clinician’s impression of the likelihood of that condition in a general population; however, the opportunity may exist for modification of these likelihoods based on individuals’ recorded medical histories. This data-driven approach utilises existing data and is thus cheap and non-invasive. A method is proposed by which an individual’s likelihoods of having specified medical conditions are modified by the similarity of that individual’s medical history to the medical histories of other individuals, comparing the prevalence of conditions in the records of those individuals who are similar to the individual of interest with the prevalence of the conditions in those individuals who are dissimilar. In order to maximise the number of records available for analysis, a process was developed for the merging of data from disparate sources that used different clinical coding systems, including extensive development of a technique for semi-automatically mapping clinical events coded in ICD-9-CM to Clinical Terms Version 3 (CTV3), for which no existing mapping table was found. Semantically similar fields in the source code sets were identified and retained in the combined data set. ‘Codelists’ comprising multiple CTV3 codes for a variety of conditions were built that defined the presence of those conditions within individual records. The hierarchical structure of the CTV3 code table was utilised as a method of identifying codes that differed in structure but had clinically similar or related meaning. The optimum degree of granularity of the coded data to use in identifying similar records was investigated and used in subsequent analysis. 
Two methods were used for discovering groups of similar and dissimilar individuals: the ‘nearest neighbours’ method and the grouping of records using a clustering process. Altered likelihoods for a range of conditions were investigated and results for the nearest-neighbours approach compared to the clustering approach. Results for adjusted condition likelihoods for 18 conditions are reported, together with a discussion of possible reasons for a change, or otherwise, in the condition likelihood, and a discussion of the clinical significance and potential use of information about such a change. Logistic regressions were also performed on a selection of conditions. KNN performed better than logistic regression when judged by F-score (or by sensitivity and specificity separately); however, the situation is more nuanced when looking at likelihood ratios: logistic regression produced higher (better) positive likelihood ratios, but KNN produced lower (better) negative likelihood ratios. Logistic regression produced higher odds ratios.
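
The comparison measures named in this abstract (sensitivity, specificity, F-score, positive and negative likelihood ratios, odds ratio) can all be derived from a single 2x2 confusion table. The following is a minimal sketch of those standard definitions, not the thesis's own code: the class name, function name, and example counts are hypothetical, and it assumes every cell of the table is non-zero so that no division by zero occurs.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Cells of a 2x2 confusion table (hypothetical container)."""
    tp: int  # condition present, classifier positive
    fp: int  # condition absent, classifier positive
    fn: int  # condition present, classifier negative
    tn: int  # condition absent, classifier negative


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Standard diagnostic accuracy measures; assumes non-zero cells."""
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    precision = c.tp / (c.tp + c.fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # F-score: harmonic mean of precision and sensitivity (recall)
        "f_score": 2 * precision * sensitivity / (precision + sensitivity),
        # Higher is better: how much a positive result raises the odds
        "lr_positive": sensitivity / (1 - specificity),
        # Lower is better: how much a negative result lowers the odds
        "lr_negative": (1 - sensitivity) / specificity,
        # Diagnostic odds ratio, equal to lr_positive / lr_negative
        "odds_ratio": (c.tp * c.tn) / (c.fp * c.fn),
    }


# Made-up counts for illustration only; not results from the thesis.
example = diagnostic_metrics(ConfusionCounts(tp=80, fp=30, fn=20, tn=170))
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Because the positive likelihood ratio depends on specificity in its denominator and the negative likelihood ratio depends on sensitivity in its numerator, one classifier can do better on the first while another does better on the second, which is consistent with the mixed KNN versus logistic regression picture reported above.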

City Research Online