
    Rationale for the Cytogenomics of Cardiovascular Malformations Consortium: A Phenotype Intensive Registry Based Approach

    Cardiovascular malformations (CVMs) are the most common birth defect, occurring in 1%-5% of all live births. Although the genetic contribution to CVMs is well recognized, the genetic causes of human CVMs are identified infrequently. In addition, a failure of systematic deep phenotyping of CVMs, resulting from the complexity and heterogeneity of malformations, has obscured genotype-phenotype correlations and contributed to a lack of understanding of disease mechanisms. To address these knowledge gaps, we have developed the Cytogenomics of Cardiovascular Malformations (CCVM) Consortium, a multi-site alliance of geneticists and cardiologists contributing to a registry database of submicroscopic genetic copy number variants (CNVs), based on clinical chromosome microarray testing in individuals with CVMs and using detailed classification schemes. Cardiac classification is performed using a modification of the National Birth Defects Prevention Study approach, and non-cardiac diagnoses are captured through ICD-9 and ICD-10 codes. By combining a comprehensive approach to clinically relevant genetic analyses with precise phenotyping, the Consortium's goal is to identify novel genomic regions that cause or increase susceptibility to CVMs and to correlate the findings with clinical phenotype. This registry will provide critical insights into genetic architecture, facilitate genotype-phenotype correlations, and serve as a valuable resource for the medical community.

    Data Resource Profile: Cardiovascular disease research using linked bespoke studies and electronic health records (CALIBER)

    The goal of cardiovascular disease (CVD) research using linked bespoke studies and electronic health records (CALIBER) is to provide evidence to inform health care and public health policy for CVDs across different stages of translation, from discovery, through evaluation in trials, to implementation, where linkages to electronic health records provide new scientific opportunities. The initial approach of the CALIBER programme is characterized as follows: (i) Linkages of multiple electronic health record sources: examples include linkages between the longitudinal primary care data from the Clinical Practice Research Datalink, the national registry of acute coronary syndromes (Myocardial Ischaemia National Audit Project), hospitalization and procedure data from Hospital Episode Statistics, and cause-specific mortality and social deprivation data from the Office for National Statistics. Current cohort analyses involve a million people in initially healthy populations and disease registries with ∼10⁵ patients. (ii) Linkages of bespoke investigator-led cohort studies (e.g. UK Biobank) to registry data (e.g. Myocardial Ischaemia National Audit Project), providing new means of ascertaining, validating and phenotyping disease. (iii) A common data model in which routine electronic health record data are made research ready, and sharable, by defining and curating with meta-data >300 variables (categorical, continuous, event) on risk factors, CVDs and non-cardiovascular comorbidities. (iv) Transparency: all CALIBER studies have an analytic protocol registered in the public domain, and data are available (safe haven model) for use subject to approvals. For more information, e-mail [email protected].
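    The common data model described in point (iii) can be pictured as a set of curated variable definitions, each tagged with its type and the EHR sources it is derived from. The sketch below is a minimal, hypothetical schema (the class name, fields, and code lists are illustrative assumptions, not the actual CALIBER metadata format):

    ```python
    from dataclasses import dataclass, field

    @dataclass
    class CuratedVariable:
        """One research-ready variable in a CALIBER-style common data model (hypothetical schema)."""
        name: str
        var_type: str                 # "categorical", "continuous", or "event"
        sources: list                 # linked EHR sources the variable draws on
        codes: dict = field(default_factory=dict)  # coding system -> code prefixes

    # A continuous risk-factor variable from primary care data
    sbp = CuratedVariable(
        name="systolic_blood_pressure",
        var_type="continuous",
        sources=["CPRD"],
    )

    # An event variable ascertained across several linked sources
    mi = CuratedVariable(
        name="acute_myocardial_infarction",
        var_type="event",
        sources=["CPRD", "MINAP", "HES", "ONS"],
        codes={"ICD-10": ["I21", "I22"]},
    )
    ```

    Representing each variable with explicit source and coding metadata is what makes the curated definitions sharable and auditable across studies.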

    Artificial intelligence (AI) in rare diseases: is the future brighter?

    The amount of data collected and managed in (bio)medicine is ever-increasing. Thus, there is a need to rapidly and efficiently collect, analyze, and characterize all this information. Artificial intelligence (AI), with an emphasis on deep learning, holds great promise in this area and is already being successfully applied to basic research, diagnosis, drug discovery, and clinical trials. Rare diseases (RDs), which are severely underrepresented in basic and clinical research, can particularly benefit from AI technologies. Of the more than 7000 RDs described worldwide, only 5% have a treatment. The ability of AI technologies to integrate and analyze data from different sources (e.g., multi-omics, patient registries, and so on) can be used to overcome RDs' challenges (e.g., low diagnostic rates, reduced number of patients, geographical dispersion, and so on). Ultimately, RDs' AI-mediated knowledge could significantly boost therapy development. Presently, there are AI approaches being used in RDs and this review aims to collect and summarize these advances. A section dedicated to congenital disorders of glycosylation (CDG), a particular group of orphan RDs that can serve as a potential study model for other common diseases and RDs, has also been included.

    Electronic Health Record Phenotyping in Cardiovascular Epidemiology

    The secondary use of EHR data for research is a cost-effective resource for a variety of research questions and domains; however, there are many challenges when using electronic health record (EHR) data for epidemiologic research. This dissertation quantified differences in prevalence for acute myocardial infarction (MI) and heart failure (HF) using phenotyping algorithms differing in the diagnosis position of ICD-10-CM codes and the inclusion of clinical components. The period of interest was January 1, 2016 to December 31, 2019 for UNC Clinical Data Warehouse for Health data and October 1, 2015 to December 31, 2019 for Atherosclerosis Risk in Communities (ARIC) Study data, the latter used for validation analyses. During the period of interest, 13,200 acute MI cases and 53,545 HF cases were identified in the UNC data. Age-standardized prevalence of acute MI and HF was highest using the Any Diagnosis Position algorithm, and lowest using the 1st or 2nd Diagnosis Position with Lab or Procedure algorithm for acute MI and the 1st Diagnosis Position algorithm for HF. Projected differences in healthcare expenditures by algorithm, as well as patient and clinical characteristics such as event severity and mortality, were also estimated. When compared to physician-adjudicated hospitalizations in the ARIC study, the phenotyping algorithms used for the UNC analysis performed well given their simplicity. The algorithm with the highest sensitivity was Any Diagnosis Position for both acute MI and HF, at 75.5% and 70.5%, respectively. Specificity, PPV, and NPV ranged from 80-99% for all algorithms. Requiring clinical components had little effect except for increasing PPV slightly, while restricting diagnosis position to 1st or 2nd position decreased sensitivity and increased PPV. The impact of clinical components or diagnosis position did not differ by race, age, or sex subgroups. The results from this dissertation can be used by researchers using EHR data for a variety of reasons, from informing their own analytic decisions to validating their study findings. The continued use of EHR data for research requires transparency to facilitate reproducibility, as well as studies focused on what we are measuring.
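    The diagnosis-position algorithms compared in this dissertation reduce to a simple rule: flag an encounter as a case if a target ICD-10-CM code appears within the allowed diagnosis positions. A minimal sketch, assuming a hypothetical record layout of (position, code) pairs (the function names, example codes, and counts are illustrative, not the dissertation's actual definitions):

    ```python
    def flag_case(codes, target_prefixes, max_position=None):
        """Return True if any code matching a target prefix appears at or
        before max_position (None = Any Diagnosis Position algorithm)."""
        for pos, code in codes:
            if max_position is not None and pos > max_position:
                continue
            if any(code.startswith(p) for p in target_prefixes):
                return True
        return False

    MI_PREFIXES = ("I21", "I22")  # acute MI codes (illustrative prefixes)

    # Encounter with HF as primary diagnosis and acute MI in 3rd position
    encounter = [(1, "I50.9"), (3, "I21.4")]
    any_pos = flag_case(encounter, MI_PREFIXES)                        # flagged
    first_or_second = flag_case(encounter, MI_PREFIXES, max_position=2)  # not flagged

    def metrics(tp, fp, fn, tn):
        """Validation metrics against an adjudicated gold standard."""
        return {
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            "ppv": tp / (tp + fp),
            "npv": tn / (tn + fn),
        }
    ```

    The example makes the reported trade-off concrete: restricting to the 1st or 2nd position drops cases coded deeper in the diagnosis list (lower sensitivity) but tends to retain the more certain primary diagnoses (higher PPV).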

    Collaborative Cloud Computing Framework for Health Data with Open Source Technologies

    The proliferation of sensor technologies and advancements in data collection methods have enabled the accumulation of very large amounts of data. Increasingly, these datasets are considered for scientific research. However, designing a system architecture that achieves high performance in terms of parallelization, query processing time, and aggregation of heterogeneous data types (e.g., time series, images, and structured data, among others), while easing the difficulty of reproducing scientific research, remains a major challenge. This is especially true for health sciences research, where the systems must be i) easy to use, with the flexibility to manipulate data at the most granular level, ii) agnostic of the programming language kernel, iii) scalable, and iv) compliant with the HIPAA privacy law. In this paper, we review the existing literature on such big data systems for scientific research in health sciences and identify the gaps in the current system landscape. We propose a novel architecture for a software-hardware-data ecosystem using open source technologies such as Apache Hadoop, Kubernetes and JupyterHub in a distributed environment. We also evaluate the system using a large clinical data set of 69M patients.
    Comment: This paper is accepted in ACM-BCB 202
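    The parallel query pattern such an architecture supports can be sketched in miniature: partition the records, run a map step per partition, and merge the partial results. This toy sketch uses Python's standard library in place of the actual Hadoop stack, with a hypothetical record layout (a list of diagnosis codes per patient):

    ```python
    from concurrent.futures import ThreadPoolExecutor
    from collections import Counter

    def count_diagnoses(partition):
        # Map step: tally diagnosis codes within one data partition.
        return Counter(code for record in partition for code in record["codes"])

    # Two illustrative partitions of patient records
    partitions = [
        [{"codes": ["I21", "I50"]}, {"codes": ["I50"]}],
        [{"codes": ["I21"]}],
    ]

    # Run the map step over partitions in parallel, then reduce.
    with ThreadPoolExecutor() as pool:
        partial_counts = pool.map(count_diagnoses, partitions)

    total = Counter()
    for c in partial_counts:
        total += c
    ```

    At cluster scale the same shape holds, with partitions living on distributed storage and the map step scheduled across worker nodes rather than local threads.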