345 research outputs found

    Exploring the relationship between age and health conditions using electronic health records: from single diseases to multimorbidities

    Background: Two enormous challenges facing healthcare systems are ageing and multimorbidity. Clinicians, policymakers, healthcare providers and researchers need to know “who gets which diseases when” in order to effectively prevent, detect and manage multiple conditions. Identification of ageing-related diseases (ARDs) is a starting point for research into common biological pathways in ageing. Examining multimorbidity clusters can facilitate a shift from the single-disease paradigm that pervades medical research and practice to models which reflect the reality of the patient population. Aim: To examine how age influences an individual’s likelihood of developing single and multiple health conditions over the lifecourse. Methods and Outputs: I used primary care and hospital admission electronic health records (EHRs) of 3,872,451 individuals from the Clinical Practice Research Datalink (CPRD) linked to the Hospital Episode Statistics admitted patient care (HES-APC) dataset in England from 1 April 2010 to 31 March 2015. In collaboration with Professor Aroon Hingorani, Dr Osman Bhatti, Dr Shanaz Husain, Dr Shailen Sutaria, Professor Dorothea Nitsch, Mrs Melanie Hingorani, Dr Constantinos Parisinos, Dr Tom Lumbers and Dr Reecha Sofat, I derived the case definitions for 308 clinically important health conditions, by harmonising Read, ICD-10 and OPCS-4 codes across primary and secondary care records in England. I calculated the age-specific incidence rate, period prevalence and median age at first recorded diagnosis for these conditions and described the 50 most common diseases in each decade of life. I developed a protocol for identifying ARDs using machine-learning and actuarial techniques. Finally, I identified highly correlated multimorbidity clusters and created a tool to visualise comorbidity clusters using a network approach.
Conclusions: I have developed case definitions (with a panel of clinicians) and calculated disease frequency estimates for 308 clinically important health conditions in the NHS in England. I have described patterns of ageing and multimorbidity using these case definitions, and produced an online app for interrogating comorbidities for an index condition. This work facilitates future research into ageing pathways and multimorbidity.

    Biological mechanisms of aging predict age-related disease co-occurrence in patients

    Genetic, environmental, and pharmacological interventions into the aging process can confer resistance to multiple age-related diseases in laboratory animals, including rhesus monkeys. These findings imply that individual mechanisms of aging might contribute to the co-occurrence of age-related diseases in humans and could be targeted to prevent these conditions simultaneously. To address this question, we text mined 917,645 literature abstracts followed by manual curation and found strong, non-random associations between age-related diseases and aging mechanisms in humans, confirmed by gene set enrichment analysis of GWAS data. Integration of these associations with clinical data from 3.01 million patients showed that age-related diseases associated with each of five aging mechanisms were more likely than chance to be present together in patients. Genetic evidence revealed that innate and adaptive immunity, the intrinsic apoptotic signaling pathway and activity of the ERK1/2 pathway were associated with multiple aging mechanisms and diverse age-related diseases. Mechanisms of aging hence contribute both together and individually to age-related disease co-occurrence in humans and could potentially be targeted accordingly to prevent multimorbidity

    Translating and evaluating historic phenotyping algorithms using SNOMED CT

    OBJECTIVE: Patient phenotype definitions based on terminologies are required for the computational use of electronic health records. Within UK primary care research databases, such definitions have typically been represented as flat lists of Read terms, but Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT) (a widely employed international reference terminology) enables the use of relationships between concepts, which could facilitate the phenotyping process. We implemented SNOMED CT-based phenotyping approaches and investigated their performance in the CPRD Aurum primary care database. MATERIALS AND METHODS: We developed SNOMED CT phenotype definitions for 3 exemplar diseases: diabetes mellitus, asthma, and heart failure, using 3 methods: "primary" (primary concept and its descendants), "extended" (primary concept, descendants, and additional relations), and "value set" (based on text searches of term descriptions). We also derived SNOMED CT codelists in a semiautomated manner for 276 disease phenotypes used in a study of health across the lifecourse. Cohorts selected using each codelist were compared to "gold standard" manually curated Read codelists in a sample of 500 000 patients from CPRD Aurum. RESULTS: SNOMED CT codelists selected a similar set of patients to Read, with F1 scores exceeding 0.93, and age and sex distributions were similar. The "value set" and "extended" codelists had slightly greater recall but lower precision than "primary" codelists. We were able to represent 257 of the 276 phenotypes by a single concept hierarchy, and for 135 phenotypes, the F1 score was greater than 0.9. CONCLUSIONS: SNOMED CT provides an efficient way to define disease phenotypes, resulting in similar patient populations to manually curated codelists
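The comparison described in this abstract, a cohort selected by a candidate codelist scored against a cohort selected by a reference codelist, can be illustrated with a minimal sketch. The patient identifiers and the function name `precision_recall_f1` are hypothetical and chosen only for illustration; the abstract does not describe the authors' implementation.

```python
def precision_recall_f1(candidate_ids, reference_ids):
    """Score a candidate cohort against a reference ("gold standard") cohort.

    Both arguments are iterables of patient identifiers (hypothetical here).
    """
    candidate, reference = set(candidate_ids), set(reference_ids)
    true_positives = len(candidate & reference)
    precision = true_positives / len(candidate) if candidate else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy cohorts: patients selected by a Read codelist and by a SNOMED CT codelist.
read_cohort = {1, 2, 3, 4, 5, 6, 7, 8}
snomed_cohort = {2, 3, 4, 5, 6, 7, 8, 9}

precision, recall, f1 = precision_recall_f1(snomed_cohort, read_cohort)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

A codelist that adds extra concepts tends to raise recall and lower precision, which is the trade-off the abstract reports for the "value set" and "extended" methods.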

    Performance of polygenic risk scores in screening, prediction, and risk stratification: secondary analysis of data in the Polygenic Score Catalog

    OBJECTIVE: To clarify the performance of polygenic risk scores in population screening, individual risk prediction, and population risk stratification. DESIGN: Secondary analysis of data in the Polygenic Score Catalog. SETTING: Polygenic Score Catalog, April 2022. Secondary analysis of 3915 performance metric estimates for 926 polygenic risk scores for 310 diseases to generate estimates of performance in population screening, individual risk, and population risk stratification. PARTICIPANTS: Individuals contributing to the published studies in the Polygenic Score Catalog. MAIN OUTCOME MEASURES: Detection rate for a 5% false positive rate (DR5) and the population odds of becoming affected given a positive result; individual odds of becoming affected for a person with a particular polygenic score; and odds of becoming affected for groups of individuals in different portions of a polygenic risk score distribution. Coronary artery disease and breast cancer were used as illustrative examples. RESULTS: For performance in population screening, median DR5 for all polygenic risk scores and all diseases studied was 11% (interquartile range 8-18%). Median DR5 was 12% (9-19%) for polygenic risk scores for coronary artery disease and 10% (9-12%) for breast cancer. The population odds of becoming affected given a positive result were 1:8 for coronary artery disease and 1:21 for breast cancer, with background 10 year odds of 1:19 and 1:41, respectively, which are typical for these diseases at age 50. For individual risk prediction, the corresponding 10 year odds of becoming affected for individuals aged 50 with a polygenic risk score at the 2.5th, 25th, 75th, and 97.5th centiles were 1:54, 1:29, 1:15, and 1:8 for coronary artery disease and 1:91, 1:56, 1:34, and 1:21 for breast cancer.
In terms of population risk stratification, at age 50, the risk of coronary artery disease was divided into five groups, with 10 year odds of 1:41 and 1:11 for the lowest and highest quintile groups, respectively. The 10 year odds were 1:7 for the upper 2.5% of the polygenic risk score distribution for coronary artery disease, a group that contributed 7% of cases. The corresponding estimates for breast cancer were 1:72 and 1:26 for the lowest and highest quintile groups, and 1:19 for the upper 2.5% of the distribution, which contributed 6% of cases. CONCLUSION: Polygenic risk scores performed poorly in population screening, individual risk prediction, and population risk stratification. Strong claims about the effect of polygenic risk scores on healthcare seem to be disproportionate to their performance.
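The screening figures in this abstract can be related to one another with a short calculation. The sketch below assumes the standard screening relationship in which the odds of being affected given a positive result equal (detection rate × affected) : (false positive rate × unaffected); the abstract does not state the exact computation used, and the function name is hypothetical.

```python
def odds_against_given_positive(detection_rate, false_positive_rate, background_odds_against):
    """Return N such that the odds of being affected given a positive result are 1:N.

    background_odds_against is N in background odds of 1:N, i.e. for every
    1 person who becomes affected, N people do not.
    """
    true_positives = detection_rate * 1.0
    false_positives = false_positive_rate * background_odds_against
    return false_positives / true_positives


# Inputs taken from the abstract: median DR5 and background 10 year odds at age 50.
cad = odds_against_given_positive(0.12, 0.05, 19)     # coronary artery disease
breast = odds_against_given_positive(0.10, 0.05, 41)  # breast cancer

print(f"coronary artery disease: 1:{cad:.1f}")
print(f"breast cancer:           1:{breast:.1f}")
```

Under this assumption the results come out at about 1:7.9 and 1:20.5, in line with the 1:8 and 1:21 reported in the abstract.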

    Improving the odds of drug development success through human genomics: modelling study.

    Lack of efficacy in the intended disease indication is the major cause of clinical phase drug development failure. Explanations could include the poor external validity of pre-clinical (cell, tissue, and animal) models of human disease and the high false discovery rate (FDR) in preclinical science. FDR is related to the proportion of true relationships available for discovery (γ), and the type 1 (false-positive) and type 2 (false negative) error rates of the experiments designed to uncover them. We estimated the FDR in preclinical science, its effect on drug development success rates, and improvements expected from use of human genomics rather than preclinical studies as the primary source of evidence for drug target identification. Calculations were based on a sample space defined by all human diseases - the 'disease-ome' - represented as columns; and all protein coding genes - 'the protein-coding genome'- represented as rows, producing a matrix of unique gene- (or protein-) disease pairings. We parameterised the space based on 10,000 diseases, 20,000 protein-coding genes, 100 causal genes per disease and 4000 genes encoding druggable targets, examining the effect of varying the parameters and a range of underlying assumptions, on the inferences drawn. We estimated γ, defined mathematical relationships between preclinical FDR and drug development success rates, and estimated improvements in success rates based on human genomics (rather than orthodox preclinical studies). Around one in every 200 protein-disease pairings was estimated to be causal (γ = 0.005) giving an FDR in preclinical research of 92.6%, which likely makes a major contribution to the reported drug development failure rate of 96%. Observed success rate was only slightly greater than expected for a random pick from the sample space. Values for γ back-calculated from reported preclinical and clinical drug development success rates were also close to the a priori estimates. 
Substituting genome wide (or druggable genome wide) association studies for preclinical studies as the major information source for drug target identification was estimated to reverse the probability of late stage failure because of the more stringent type 1 error rate employed and the ability to interrogate every potential druggable target in the same experiment. Genetic studies conducted at much larger scale, with greater resolution of disease end-points, e.g. by connecting genomics and electronic health record data within healthcare systems, have the potential to produce radical improvement in drug development success rate.
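The relationship between γ, the error rates, and the FDR described in this abstract can be sketched numerically. The sample-space parameters are those given in the abstract; the type 1 error rate (α = 0.05) and power (1 − β = 0.8) are conventional values assumed here, since the abstract does not state them. With these assumed values the calculation reproduces the reported FDR of 92.6%. Function names are hypothetical.

```python
def proportion_causal(n_diseases, n_genes, causal_genes_per_disease):
    """Proportion of gene-disease pairings that are causal (gamma)."""
    total_pairs = n_diseases * n_genes
    causal_pairs = n_diseases * causal_genes_per_disease
    return causal_pairs / total_pairs


def false_discovery_rate(gamma, alpha, power):
    """Expected share of 'positive' findings that are false.

    alpha is the type 1 error rate; power is 1 minus the type 2 error rate.
    """
    false_positives = alpha * (1 - gamma)
    true_positives = power * gamma
    return false_positives / (false_positives + true_positives)


# Parameters from the abstract: 10,000 diseases, 20,000 genes, 100 causal genes per disease.
gamma = proportion_causal(10_000, 20_000, 100)

# ASSUMPTION: alpha = 0.05 and power = 0.8 are not stated in the abstract.
fdr = false_discovery_rate(gamma, alpha=0.05, power=0.8)

print(f"gamma = {gamma:.3f}, FDR = {fdr:.1%}")
```

Lowering `alpha` towards a genome-wide significance threshold in this sketch shrinks the false-positive term sharply, which is the mechanism the abstract cites for the improvement expected from genomic evidence.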

    UK phenomics platform for developing and validating electronic health record phenotypes: CALIBER

    Objective: Electronic health records (EHRs) are a rich source of information on human diseases, but the information is variably structured, fragmented, curated using different coding systems, and collected for purposes other than medical research. We describe an approach for developing, validating, and sharing reproducible phenotypes from national structured EHR in the United Kingdom with applications for translational research. Materials and Methods: We implemented a rule-based phenotyping framework, with up to 6 approaches of validation. We applied our framework to a sample of 15 million individuals in a national EHR data source (population-based primary care, all ages) linked to hospitalization and death records in England. Data comprised continuous measurements (for example, blood pressure; medication information; coded diagnoses, symptoms, procedures, and referrals), recorded using 5 controlled clinical terminologies: (1) Read (primary care, subset of SNOMED-CT [Systematized Nomenclature of Medicine Clinical Terms]), (2) International Classification of Diseases–Ninth Revision and Tenth Revision (secondary care diagnoses and cause of mortality), (3) Office of Population Censuses and Surveys Classification of Surgical Operations and Procedures, Fourth Revision (hospital surgical procedures), and (4) DM+D prescription codes. Results: Using the CALIBER phenotyping framework, we created algorithms for 51 diseases, syndromes, biomarkers, and lifestyle risk factors and provide up to 6 validation approaches. The EHR phenotypes are curated in the open-access CALIBER Portal (https://www.caliberresearch.org/portal) and have been used by 40 national and international research groups in 60 peer-reviewed publications. Conclusions: We describe a UK EHR phenomics approach within the CALIBER EHR data platform with initial evidence of validity and use, as an important step toward international use of UK EHR data for health research.

    Mobile health applications: awareness, attitudes, and practices among medical students in Malaysia

    Background: The popularity of mobile health (mHealth) applications (or apps) in the field of health and medical education is rapidly increasing, especially since the COVID-19 pandemic. We aimed to assess awareness, attitudes, practices, and factors associated with mHealth app usage among medical students. Methods: We conducted a cross-sectional study involving medical students at a government university in Sarawak, Malaysia, from February to April 2021. Validated questionnaires were administered to all consenting students. These questionnaires included questions on basic demographic information as well as awareness of, attitudes toward, and practices with mHealth apps concerned with medical education, health and fitness, and COVID-19 management. Results: Respondents had favorable attitudes toward mHealth apps (medical education [61.8%], health and fitness [76.3%], and COVID-19 management [82.7%]). Respondents’ mean attitude scores were four out of five for all three app categories. However, respondents used COVID-19 management apps more frequently (73.5%) than those for medical education (35.7%) and fitness (39.0%). Usage of all three app categories was significantly associated with the respondent’s awareness and attitude. Respondents in the top 20% in terms of household income and study duration were more likely to use medical education apps. The number of respondents who used COVID-19 apps was higher in the top 20% household income group than in the other income groups. The most common barrier to the use of apps was uncertainty regarding the most suitable apps to choose. Conclusion: Our study highlighted a discrepancy between awareness of mHealth apps and positive attitudes toward them and their use. Recognition of barriers to using mHealth apps by relevant authorities may be necessary to increase the usage of these apps.

    A chronological map of 308 physical and mental health conditions from 4 million individuals in the English National Health Service.

    Background: To effectively prevent, detect, and treat health conditions that affect people during their lifecourse, health-care professionals and researchers need to know which sections of the population are susceptible to which health conditions and at which ages. Hence, we aimed to map the course of human health by identifying the 50 most common health conditions in each decade of life and estimating the median age at first diagnosis. Methods: We developed phenotyping algorithms and codelists for physical and mental health conditions that involve intensive use of health-care resources. Individuals older than 1 year were included in the study if their primary-care and hospital-admission records met research standards set by the Clinical Practice Research Datalink and they had been registered in a general practice in England contributing up-to-standard data for at least 1 year during the study period. We used linked records of individuals from the CALIBER platform to calculate the sex-standardised cumulative incidence for these conditions by 10-year age groups between April 1, 2010, and March 31, 2015. We also derived the median age at diagnosis and prevalence estimates stratified by age, sex, and ethnicity (black, white, south Asian) over the study period from the primary-care and secondary-care records of patients. Findings: We developed case definitions for 308 disease phenotypes. We used records of 2 784 138 patients for the calculation of cumulative incidence and of 3 872 451 patients for the calculation of period prevalence and median age at diagnosis of these conditions. 
Conditions that first gained prominence at key stages of life were: atopic conditions and infections that led to hospital admission in children (<10 years); acne and menstrual disorders in the teenage years (10-19 years); mental health conditions, obesity, and migraine in individuals aged 20-29 years; soft-tissue disorders and gastro-oesophageal reflux disease in individuals aged 30-39 years; dyslipidaemia, hypertension, and erectile dysfunction in individuals aged 40-59 years; cancer, osteoarthritis, benign prostatic hyperplasia, cataract, diverticular disease, type 2 diabetes, and deafness in individuals aged 60-79 years; and atrial fibrillation, dementia, acute and chronic kidney disease, heart failure, ischaemic heart disease, anaemia, and osteoporosis in individuals aged 80 years or older. Black or south-Asian individuals were diagnosed earlier than white individuals for 258 (84%) of the 308 conditions. Bone fractures and atopic conditions were recorded earlier in male individuals, whereas female individuals were diagnosed at younger ages with nutritional anaemias, tubulointerstitial nephritis, and urinary disorders. Interpretation: We have produced the first chronological map of human health with cumulative-incidence and period-prevalence estimates for multiple morbidities in parallel from birth to advanced age. This can guide clinicians, policy makers, and researchers on how to formulate differential diagnoses, allocate resources, and target research priorities on the basis of the knowledge of who gets which diseases when. We have published our phenotyping algorithms on the CALIBER open-access Portal which will facilitate future research by providing a curated list of reusable case definitions. 
Funding: Wellcome Trust, National Institute for Health Research, Medical Research Council, Arthritis Research UK, British Heart Foundation, Cancer Research UK, Chief Scientist Office of the Scottish Government Health and Social Care Directorates, Department of Health and Social Care (England), Health and Social Care Research and Development Division (Welsh Government), Public Health Agency (Northern Ireland), Economic and Social Research Council, Engineering and Physical Sciences Research Council, National Institute for Social Care and Health Research, and The Alan Turing Institute

    Data-driven identification of ageing-related diseases from electronic health records.

    Reducing the burden of late-life morbidity requires an understanding of the mechanisms of ageing-related diseases (ARDs), defined as diseases that accumulate with increasing age. This has been hampered by the lack of formal criteria to identify ARDs. Here, we present a framework to identify ARDs using two complementary methods consisting of unsupervised machine learning and actuarial techniques, which we applied to electronic health records (EHRs) from 3,009,048 individuals in England using primary care data from the Clinical Practice Research Datalink (CPRD) linked to the Hospital Episode Statistics admitted patient care dataset between 1 April 2010 and 31 March 2015 (mean age 49.7 years (s.d. 18.6), 51% female, 70% white ethnicity). We grouped 278 high-burden diseases into nine main clusters according to their patterns of disease onset, using a hierarchical agglomerative clustering algorithm. Four of these clusters, encompassing 207 diseases spanning diverse organ systems and clinical specialties, had rates of disease onset that clearly increased with chronological age. However, the ages of onset for these four clusters were strikingly different, with median age of onset 82 years (IQR 82-83) for Cluster 1, 77 years (IQR 75-77) for Cluster 2, 69 years (IQR 66-71) for Cluster 3 and 57 years (IQR 54-59) for Cluster 4. Fitting to ageing-related actuarial models confirmed that the vast majority of these 207 diseases had a high probability of being ageing-related. Cardiovascular diseases and cancers were highly represented, while benign neoplastic, skin and psychiatric conditions were largely absent from the four ageing-related clusters. Our framework identifies and clusters ARDs and can form the basis for fundamental and translational research into ageing pathways
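The clustering step named in this abstract, hierarchical agglomerative clustering of diseases by their pattern of onset across ages, can be illustrated with a small self-contained sketch. Everything specific here is an assumption made for illustration: the abstract does not state the distance measure or linkage rule, so Euclidean distance and average linkage are used, and the disease labels and onset profiles are invented toy values, not study data.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length onset profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(cluster_a, cluster_b, profiles):
    """Mean pairwise distance between members of two clusters."""
    distances = [euclidean(profiles[i], profiles[j]) for i in cluster_a for j in cluster_b]
    return sum(distances) / len(distances)


def agglomerate(profiles, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[name] for name in profiles]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(clusters[pair[0]], clusters[pair[1]], profiles),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because a < b
    return clusters


# Toy profiles: share of first diagnoses in age bands 0-19, 20-39, 40-59, 60-79, 80+.
# Labels and numbers are hypothetical.
profiles = {
    "late_onset_A": [0.00, 0.00, 0.10, 0.30, 0.60],
    "late_onset_B": [0.00, 0.05, 0.10, 0.30, 0.55],
    "early_onset_A": [0.60, 0.30, 0.10, 0.00, 0.00],
    "early_onset_B": [0.55, 0.30, 0.10, 0.05, 0.00],
    "midlife_A": [0.05, 0.20, 0.50, 0.20, 0.05],
}

clusters = agglomerate(profiles, n_clusters=3)
for members in clusters:
    print(sorted(members))
```

In this toy example the two late-onset profiles and the two early-onset profiles pair up, leaving the midlife profile on its own. Clusters whose onset rises with age would then be candidates for the actuarial model fitting the abstract describes.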