237 research outputs found
Rationale for the Cytogenomics of Cardiovascular Malformations Consortium: A Phenotype Intensive Registry Based Approach
Cardiovascular malformations (CVMs) are the most common birth defect, occurring in 1%-5% of all live births. Although the genetic contribution to CVMs is well recognized, the genetic causes of human CVMs are identified infrequently. In addition, a failure of systematic deep phenotyping of CVMs, resulting from the complexity and heterogeneity of malformations, has obscured genotype-phenotype correlations and contributed to a lack of understanding of disease mechanisms. To address these knowledge gaps, we have developed the Cytogenomics of Cardiovascular Malformations (CCVM) Consortium, a multi-site alliance of geneticists and cardiologists contributing to a database registry of submicroscopic genetic copy number variants (CNVs), based on clinical chromosome microarray testing in individuals with CVMs, using detailed classification schemes. Cardiac classification is performed using a modification of the National Birth Defects Prevention Study approach, and non-cardiac diagnoses are captured through ICD-9 and ICD-10 codes. By combining a comprehensive approach to clinically relevant genetic analyses with precise phenotyping, the Consortium's goal is to identify novel genomic regions that cause or increase susceptibility to CVMs and to correlate the findings with clinical phenotype. This registry will provide critical insights into genetic architecture, facilitate genotype-phenotype correlations, and provide a valuable resource for the medical community.
Data Resource Profile: Cardiovascular disease research using linked bespoke studies and electronic health records (CALIBER)
The goal of cardiovascular disease (CVD) research using linked bespoke studies and electronic health records (CALIBER) is to provide evidence to inform health care and public health policy for CVDs across different stages of translation, from discovery, through evaluation in trials, to implementation, where linkages to electronic health records provide new scientific opportunities. The initial approach of the CALIBER programme is characterized as follows: (i) Linkages of multiple electronic health record sources: examples include linkages between the longitudinal primary care data from the Clinical Practice Research Datalink, the national registry of acute coronary syndromes (Myocardial Ischaemia National Audit Project), hospitalization and procedure data from Hospital Episode Statistics, and cause-specific mortality and social deprivation data from the Office for National Statistics. Current cohort analyses involve a million people in initially healthy populations and disease registries with ∼10⁵ patients. (ii) Linkages of bespoke investigator-led cohort studies (e.g. UK Biobank) to registry data (e.g. Myocardial Ischaemia National Audit Project), providing new means of ascertaining, validating and phenotyping disease. (iii) A common data model in which routine electronic health record data are made research-ready and sharable by defining and curating, with metadata, >300 variables (categorical, continuous, event) on risk factors, CVDs and non-cardiovascular comorbidities. (iv) Transparency: all CALIBER studies have an analytic protocol registered in the public domain, and data are available (safe haven model) for use subject to approvals. For more information, e-mail [email protected].
Artificial intelligence (AI) in rare diseases: is the future brighter?
The amount of data collected and managed in (bio)medicine is ever-increasing. Thus, there is a need to rapidly and efficiently collect, analyze, and characterize all this information. Artificial intelligence (AI), with an emphasis on deep learning, holds great promise in this area and is already being successfully applied to basic research, diagnosis, drug discovery, and clinical trials. Rare diseases (RDs), which are severely underrepresented in basic and clinical research, can particularly benefit from AI technologies. Of the more than 7000 RDs described worldwide, only 5% have a treatment. The ability of AI technologies to integrate and analyze data from different sources (e.g., multi-omics, patient registries, and so on) can be used to overcome RDs' challenges (e.g., low diagnostic rates, reduced number of patients, geographical dispersion, and so on). Ultimately, RDs' AI-mediated knowledge could significantly boost therapy development. Presently, there are AI approaches being used in RDs and this review aims to collect and summarize these advances. A section dedicated to congenital disorders of glycosylation (CDG), a particular group of orphan RDs that can serve as a potential study model for other common diseases and RDs, has also been included.
Understanding and Reducing Clinical Data Biases
The vast amount of clinical data made available by pervasive electronic health records presents a great opportunity for reusing these data to improve the efficiency and lower the costs of clinical and translational research. A risk to reuse is potential hidden biases in clinical data. While specific studies have demonstrated benefits in reusing clinical data for research, there are significant concerns about potential clinical data biases.
This dissertation research contributes original understanding of clinical data biases. Using research data carefully collected from a patient community served by our institution as the reference standard, we examined the measurement and sampling biases in the clinical data for selected clinical variables. Our results showed that the clinical data and research data had similar summary statistical profiles, but that there were detectable differences in definitions and measurements for variables such as height, diastolic blood pressure, and diabetes status. One implication of these results is that research data can complement clinical data for clinical phenotyping. We further supported this hypothesis using diabetes as an example clinical phenotype, showing that integrated clinical and research data improved the sensitivity and positive predictive value.
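The integration strategy described above can be illustrated with a minimal sketch: a phenotype is called positive if either the clinical (EHR) indicator or the research-cohort measurement flags the condition, and the combined classifier is then scored against an adjudicated reference standard. All data, the HbA1c threshold, and the OR-combination rule are illustrative assumptions, not the dissertation's actual definitions.

```python
# Hedged sketch: combining an EHR diagnosis flag with a research-cohort
# lab value for a diabetes phenotype, then measuring sensitivity and
# positive predictive value (PPV) against an adjudicated truth label.
# The cohort, threshold, and combination rule are all illustrative.

def classify(ehr_dx, research_a1c, a1c_threshold=6.5):
    """Phenotype is positive if either data source flags diabetes."""
    return ehr_dx or (research_a1c is not None and research_a1c >= a1c_threshold)

def sensitivity_ppv(pred, truth):
    tp = sum(p and t for p, t in zip(pred, truth))
    fn = sum((not p) and t for p, t in zip(pred, truth))
    fp = sum(p and (not t) for p, t in zip(pred, truth))
    sens = tp / (tp + fn) if tp + fn else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    return sens, ppv

# Toy cohort: (EHR diagnosis flag, research HbA1c or None, adjudicated truth)
cohort = [
    (True,  7.1,  True),   # flagged by both sources
    (False, 6.8,  True),   # missed by the EHR, caught by the research lab
    (False, None, True),   # missed by both -> false negative
    (True,  5.4,  False),  # EHR miscode -> false positive
    (False, 5.0,  False),
]
pred = [classify(dx, a1c) for dx, a1c, _ in cohort]
truth = [t for _, _, t in cohort]
sens, ppv = sensitivity_ppv(pred, truth)
print(f"sensitivity={sens:.2f} ppv={ppv:.2f}")
```

In this toy example the OR-combination recovers the case the EHR missed (the second row), which is the mechanism by which integrated data can raise sensitivity relative to either source alone.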
Electronic Health Record Phenotyping in Cardiovascular Epidemiology
The secondary use of EHR data for research is a cost-effective resource for a variety of research questions and domains; however, there are many challenges when using electronic health record (EHR) data for epidemiologic research. This dissertation quantified differences in prevalence for acute myocardial infarction (MI) and heart failure (HF) using phenotyping algorithms differing in the diagnosis position of ICD-10-CM codes and the inclusion of clinical components. The period of interest was January 1, 2016 to December 31, 2019 for the UNC Clinical Data Warehouse for Health data and October 1, 2015 to December 31, 2019 for Atherosclerosis Risk in Communities (ARIC) Study data, the latter used for validation analyses. During the period of interest, 13,200 acute MI cases and 53,545 HF cases were identified in the UNC data. Age-standardized prevalence of acute MI and HF were highest using the Any Diagnosis Position algorithm and lowest using 1st or 2nd Diagnosis Position with Lab or Procedure for acute MI and 1st Diagnosis Position for HF. Projected differences in healthcare expenditures by algorithm, as well as patient and clinical characteristics such as event severity and mortality, were also estimated. When compared to physician-adjudicated hospitalizations in the ARIC study, the phenotyping algorithms used for the UNC analysis performed well given their simplicity. The algorithm with the highest sensitivity was Any Diagnosis Position for both acute MI and HF, at 75.5% and 70.5%, respectively. Specificity, PPV, and NPV ranged from 80-99% for all algorithms. Requiring clinical components had little effect except for increasing PPV slightly, while restricting diagnosis position to 1st or 2nd position decreased sensitivity and increased PPV.
The impact of clinical components or diagnosis position did not differ by race, age, or sex subgroups. The results from this dissertation can be used by researchers using EHR data for a variety of reasons, from informing their own analytic decisions to validating their study findings. The continued use of EHR data for research requires transparency to facilitate reproducibility, as well as studies focused on what we are measuring.
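The diagnosis-position distinction the abstract describes can be sketched as follows: an "Any Diagnosis Position" algorithm calls a case whenever a target ICD-10-CM code appears anywhere on the encounter, while a "1st or 2nd Diagnosis Position" algorithm only looks at the first two listed diagnoses. The code list and encounters below are illustrative assumptions, not the study's actual case definitions.

```python
# Hedged sketch of diagnosis-position phenotyping algorithms. Encounter
# codes are assumed to be stored in diagnosis-position order (primary
# diagnosis first); the acute MI code list is an illustrative subset of
# ICD-10-CM I21.x, not the study's definition.

ACUTE_MI_CODES = {"I21.0", "I21.1", "I21.4"}

def any_position(encounter_codes, target=ACUTE_MI_CODES):
    """Case if a target code appears in any diagnosis position."""
    return any(c in target for c in encounter_codes)

def first_or_second_position(encounter_codes, target=ACUTE_MI_CODES):
    """Stricter: a target code must be the 1st or 2nd listed diagnosis."""
    return any(c in target for c in encounter_codes[:2])

encounters = [
    ["I21.4", "E11.9"],         # MI as the primary diagnosis
    ["J18.9", "I10", "I21.0"],  # MI only in the 3rd position
    ["I10", "E78.5"],           # no MI code at all
]
print([any_position(e) for e in encounters])              # [True, True, False]
print([first_or_second_position(e) for e in encounters])  # [True, False, False]
```

The second encounter shows the trade-off reported in the dissertation: the positional restriction drops cases where the code is listed further down (lowering sensitivity) while keeping those where it is the leading diagnosis (tending to raise PPV).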
Collaborative Cloud Computing Framework for Health Data with Open Source Technologies
The proliferation of sensor technologies and advancements in data collection
methods have enabled the accumulation of very large amounts of data.
Increasingly, these datasets are considered for scientific research. However,
the design of the system architecture to achieve high performance in terms of
parallelization, query processing time, aggregation of heterogeneous data types
(e.g., time series, images, structured data, among others), and difficulty in
reproducing scientific research remain a major challenge. This is specifically
true for health sciences research, where the systems must be i) easy to use
with the flexibility to manipulate data at the most granular level, ii)
agnostic of programming language kernel, iii) scalable, and iv) compliant with
the HIPAA privacy law. In this paper, we review the existing literature for
such big data systems for scientific research in health sciences and identify
the gaps of the current system landscape. We propose a novel architecture for
software-hardware-data ecosystem using open source technologies such as Apache
Hadoop, Kubernetes and JupyterHub in a distributed environment. We also
evaluate the system using a large clinical data set of 69M patients.Comment: This paper is accepted in ACM-BCB 202
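One plausible piece of the ecosystem this abstract describes is JupyterHub spawning per-user notebook servers as Kubernetes pods. The minimal `jupyterhub_config.py` fragment below is a sketch of that wiring, assuming the KubeSpawner plugin; the image name and resource limits are illustrative, not the paper's actual configuration.

```python
# Hedged sketch of a minimal jupyterhub_config.py for running JupyterHub
# on Kubernetes via KubeSpawner. The `c` object is injected by JupyterHub
# at startup; the image and limits below are hypothetical placeholders.

c.JupyterHub.spawner_class = "kubespawner.KubeSpawner"

# Each researcher gets an isolated, resource-capped pod, which supports
# scalability and a containment boundary relevant to HIPAA compliance.
c.KubeSpawner.image = "example.org/health-notebook:latest"  # hypothetical image
c.KubeSpawner.cpu_limit = 2
c.KubeSpawner.mem_limit = "8G"

# Kernel-agnostic access: the single-user image can bundle Python, R,
# Julia, or other kernels, matching the "agnostic of programming
# language kernel" requirement above.
```

This fragment is not runnable on its own; it is loaded by the `jupyterhub` process, which supplies the `c` configuration object.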