12 research outputs found

    Clinical Data: Sources and Types, Regulatory Constraints, Applications.

    Access to clinical data is critical for the advancement of translational research. However, the numerous regulations and policies that surround the use of clinical data, although critical to ensure patient privacy and protect against misuse, often present challenges to data access and sharing. In this article, we provide an overview of clinical data types and associated regulatory constraints and inferential limitations. We highlight several novel approaches that our team has developed for openly exposing clinical data

    A research agenda to support the development and implementation of genomics-based clinical informatics tools and resources.

    OBJECTIVE: The Genomic Medicine Working Group of the National Advisory Council for Human Genome Research virtually hosted its 13th genomic medicine meeting, titled "Developing a Clinical Genomic Informatics Research Agenda." The meeting's goal was to articulate a research strategy to develop Genomics-based Clinical Informatics Tools and Resources (GCIT) to improve the detection, treatment, and reporting of genetic disorders in clinical settings. MATERIALS AND METHODS: Experts from government agencies, the private sector, and academia in genomic medicine and clinical informatics were invited to address the meeting's goals. Invitees were also asked to complete a survey to assess important considerations needed to develop a genomic-based clinical informatics research strategy. RESULTS: Outcomes from the meeting included identifying short-term research needs, such as designing and implementing standards-based interfaces between laboratory information systems and electronic health records, as well as long-term projects, such as identifying and addressing barriers related to the establishment and implementation of genomic data exchange systems that, in turn, the research community could help address. DISCUSSION: Discussions centered on identifying gaps and barriers that impede the use of GCIT in genomic medicine. Emergent themes from the meeting included developing an implementation science framework, defining a value proposition for all stakeholders, fostering engagement with patients and partners to develop applications under patient control, promoting the use of relevant clinical workflows in research, and lowering related barriers to regulatory processes. Another key theme was recognizing pervasive biases in data and information systems, algorithms, access, value, and knowledge repositories and identifying ways to resolve them

    Inverting the model of genomics data sharing with the NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space

    The NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space (AnVIL; https://anvilproject.org) was developed to address a widespread community need for a unified computing environment for genomics data storage, management, and analysis. In this perspective, we present AnVIL, describe its ecosystem and interoperability with other platforms, and highlight how this platform and associated initiatives contribute to improved genomic data sharing efforts. The AnVIL is a federated cloud platform designed to manage and store genomics and related data, enable population-scale analysis, and facilitate collaboration through the sharing of data, code, and analysis results. By inverting the traditional model of data sharing, the AnVIL eliminates the need for data movement while also adding security measures for active threat detection and monitoring and provides scalable, shared computing resources for any researcher. We describe the core data management and analysis components of the AnVIL, which currently consists of Terra, Gen3, Galaxy, RStudio/Bioconductor, Dockstore, and Jupyter, and describe several flagship genomics datasets available within the AnVIL. We continue to extend and innovate the AnVIL ecosystem by implementing new capabilities, including mechanisms for interoperability and responsible data sharing, while streamlining access management. The AnVIL opens many new opportunities for analysis, collaboration, and data sharing that are needed to drive research and to make discoveries through the joint analysis of hundreds of thousands to millions of genomes along with associated clinical and molecular data types

    Willingness to Share Wearable Device Data for Research Among Mechanical Turk Workers: Web-Based Survey Study

    Background: Wearable devices that are used for observational research and clinical trials hold promise for collecting data from study participants in a convenient, scalable way that is more likely to reach a broad and diverse population than traditional research approaches. Amazon Mechanical Turk (MTurk) is a potential resource that researchers can use to recruit individuals into studies that use data from wearable devices. Objective: This study aimed to explore the characteristics of wearable device users on MTurk that are associated with a willingness to share wearable device data for research. We also aimed to determine whether compensation was a factor that influenced the willingness to share such data. Methods: This was a secondary analysis of a cross-sectional survey study of MTurk workers who use wearable devices for health monitoring. A 19-question web-based survey was administered from March 1 to April 5, 2018, to participants aged ≥18 years by using the MTurk platform. In order to identify characteristics that were associated with a willingness to share wearable device data, we performed logistic regression and decision tree analyses. Results: A total of 935 MTurk workers who use wearable devices completed the survey. The majority of respondents indicated a willingness to share their wearable device data (615/935, 65.8%), and the majority of these respondents were willing to share their data if they received compensation (518/615, 84.2%). The findings from our logistic regression analyses indicated that Indian nationality (odds ratio [OR] 2.74, 95% CI 1.48-4.01, P=.007), higher annual income (OR 2.46, 95% CI 1.26-3.67, P=.02), over 6 months of using a wearable device (OR 1.75, 95% CI 1.21-2.29, P=.006), and the use of heartbeat and pulse tracking monitoring devices (OR 1.60, 95% CI 0.14-2.07, P=.01) are significant parameters that influence the willingness to share data. The only factor associated with a willingness to share data if compensation is provided was Indian nationality (OR 0.47, 95% CI 0.24-0.9, P=.02). The findings from our decision tree analyses indicated that the three leading parameters associated with a willingness to share data were the duration of wearable device use, nationality, and income. Conclusions: Most wearable device users indicated a willingness to share their data for research use (with or without compensation; 615/935, 65.8%). The probability of having a willingness to share these data was higher among individuals who had used a wearable for more than 6 months, were of Indian nationality, or were of American (United States of America) nationality and had an annual income of more than US $20,000. Individuals of Indian nationality who were willing to share their data expected compensation significantly less often than individuals of American nationality (P=.02)
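
The proportions quoted in the abstract above can be re-derived from the counts it reports. A minimal Python sketch follows; the helper name `pct` is illustrative and does not come from the study, and only the counts stated in the abstract are used.

```python
def pct(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage, rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


# Counts taken from the abstract: 935 respondents, 615 willing to share,
# 518 of those willing to share if compensated.
willing = pct(615, 935)
willing_if_compensated = pct(518, 615)

print(f"Willing to share: {willing}%")
print(f"Of those, willing if compensated: {willing_if_compensated}%")
```

Both values agree with the percentages given in the abstract (65.8% and 84.2%).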

    Assessing Associations Between COVID-19 Symptomology and Adverse Outcomes After Piloting Crowdsourced Data Collection: Cross-sectional Survey Study

    Background: Crowdsourcing is a useful way to rapidly collect information on COVID-19 symptoms. However, there are potential biases and data quality issues given the population that chooses to participate in crowdsourcing activities and the common strategies used to screen participants based on their previous experience. Objective: The study aimed to (1) build a pipeline to enable data quality and population representation checks in a pilot setting prior to deploying a final survey to a crowdsourcing platform, (2) assess COVID-19 symptomology among survey respondents who report a previous positive COVID-19 result, and (3) assess associations of symptomology groups and underlying chronic conditions with adverse outcomes due to COVID-19. Methods: We developed a web-based survey and hosted it on the Amazon Mechanical Turk (MTurk) crowdsourcing platform. We conducted a pilot study from August 5, 2020, to August 14, 2020, to refine the filtering criteria according to our needs before finalizing the pipeline. The final survey was posted from late August to December 31, 2020. Hierarchical cluster analyses were performed to identify COVID-19 symptomology groups, and logistic regression analyses were performed for hospitalization and mechanical ventilation outcomes. Finally, we performed a validation of study outcomes by comparing our findings to those reported in previous systematic reviews. Results: The crowdsourcing pipeline facilitated piloting our survey study and revising the filtering criteria to target specific MTurk experience levels and to include a second attention check. We collected data from 1254 COVID-19–positive survey participants and identified the following 6 symptomology groups: abdominal and bladder pain (Group 1); flu-like symptoms (loss of smell/taste/appetite; Group 2); hoarseness and sputum production (Group 3); joint aches and stomach cramps (Group 4); eye or skin dryness and vomiting (Group 5); and no symptoms (Group 6). The risk factors for adverse COVID-19 outcomes differed for different symptomology groups. The only risk factor that remained significant across 4 symptomology groups was influenza vaccine in the previous year (Group 1: odds ratio [OR] 6.22, 95% CI 2.32-17.92; Group 2: OR 2.35, 95% CI 1.74-3.18; Group 3: OR 3.7, 95% CI 1.32-10.98; Group 4: OR 4.44, 95% CI 1.53-14.49). Our findings regarding the symptoms of abdominal pain, cough, fever, fatigue, shortness of breath, and vomiting as risk factors for COVID-19 adverse outcomes were concordant with the findings of other researchers. Some high-risk symptoms found in our study, including bladder pain, dry eyes or skin, and loss of appetite, were reported less frequently by other researchers and were not considered previously in relation to COVID-19 adverse outcomes. Conclusions: We demonstrated that a crowdsourced approach was effective for collecting data to assess symptomology associated with COVID-19. Such a strategy may facilitate efficient assessments in a dynamic intersection between emerging infectious diseases, and societal and environmental changes

    Feature engineering with clinical expert knowledge: A case study assessment of machine learning model complexity and performance.

    Incorporating expert knowledge at the time machine learning models are trained holds promise for producing models that are easier to interpret. The main objectives of this study were to use a feature engineering approach to incorporate clinical expert knowledge prior to applying machine learning techniques, and to assess the impact of the approach on model complexity and performance. Four machine learning models were trained to predict mortality with a severe asthma case study. Experiments to select fewer input features based on a discriminative score showed low to moderate precision for discovering clinically meaningful triplets, indicating that discriminative score alone cannot replace clinical input. When compared to baseline machine learning models, we found a decrease in model complexity with use of fewer features informed by discriminative score and filtering of laboratory features with clinical input. We also found a small difference in performance for the mortality prediction task when comparing baseline ML models to models that used filtered features. Encoding demographic and triplet information in ML models with filtered features appeared to show performance improvements from the baseline. These findings indicated that the use of filtered features may reduce model complexity, and with little impact on performance

    Factors associated with resistance to SARS-CoV-2 infection discovered using large-scale medical record data and machine learning.

    There have been over 621 million cases of COVID-19 worldwide with over 6.5 million deaths. Despite the high secondary attack rate of COVID-19 in shared households, some exposed individuals do not contract the virus. In addition, little is known about whether the occurrence of COVID-19 resistance differs among people by health characteristics as stored in the electronic health records (EHR). In this retrospective analysis, we develop a statistical model to predict COVID-19 resistance in 8,536 individuals with prior COVID-19 exposure using demographics, diagnostic codes, outpatient medication orders, and count of Elixhauser comorbidities in EHR data from the COVID-19 Precision Medicine Platform Registry. Cluster analyses identified 5 patterns of diagnostic codes that distinguished resistant from non-resistant patients in our study population. In addition, our models showed modest performance in predicting COVID-19 resistance (best performing model AUROC = 0.61). Monte Carlo simulations conducted indicated that the AUROC results are statistically significant (p < 0.001) for the testing set. We hope to validate the features found to be associated with resistance/non-resistance through more advanced association studies
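
The abstract above reports that Monte Carlo simulations showed the AUROC to be statistically significant, but it does not describe the procedure. One common way to run such a check is a label-permutation test, sketched below in Python as an assumption rather than as the authors' method. The function names, the synthetic scores, and the iteration count are all illustrative; no study data are used.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a random positive outscores a random negative.

    Ties count as half a win. Labels are 1 (positive) or 0 (negative).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_p_value(labels, scores, n_iter=500, seed=0):
    """Estimate how often shuffled labels reach an AUROC at least as high as observed."""
    rng = random.Random(seed)
    observed = auroc(labels, scores)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(shuffled)  # breaks any real link between labels and scores
        if auroc(shuffled, scores) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_iter + 1)


# Synthetic toy data (NOT from the study): positives score slightly higher on average.
data_rng = random.Random(42)
labels = [1] * 60 + [0] * 60
scores = [data_rng.gauss(0.4, 1.0) for _ in range(60)] + \
         [data_rng.gauss(0.0, 1.0) for _ in range(60)]

observed_auroc, p_value = permutation_p_value(labels, scores)
print(f"AUROC = {observed_auroc:.3f}, permutation p-value = {p_value:.4f}")
```

A test of this kind asks whether the observed AUROC exceeds what chance alone would produce. A modest AUROC, such as the 0.61 reported above, can therefore still be statistically significant in a large sample.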

    Infobuttons for Genomic Medicine: Requirements and Barriers

    Objectives: The study aimed to understand potential barriers to the adoption of health information technology projects that are released as free and open source software (FOSS). Methods: We conducted a survey of research consortia participants engaged in genomic medicine implementation to assess perceived institutional barriers to the adoption of three systems: ClinGen electronic health record (EHR) Toolkit, DocUBuild, and MyResults.org. The survey included eight barriers from the Consolidated Framework for Implementation Research (CFIR), with additional barriers identified from a qualitative analysis of open-ended responses. Results: We analyzed responses from 24 research consortia participants from 18 institutions. In total, 14 categories of perceived barriers were evaluated, which were consistent with other observed barriers to FOSS adoption. The most frequent perceived barriers included lack of adaptability of the system, lack of institutional priority to implement, lack of trialability, lack of advantage of alternative systems, and complexity. Conclusion: In addition to understanding potential barriers, we recommend some strategies to address them (where possible), including considerations for genomic medicine. Overall, FOSS developers need to ensure systems are easy to trial and implement and need to clearly articulate benefits of their systems, especially when alternatives exist. Institutional champions will remain a critical component to prioritizing genomic medicine projects

    Preferences for Updates on General Research Results: A Survey of Participants in Genomic Research from Two Institutions

    There is a need for multimodal strategies to keep research participants informed about study results. Our aim was to characterize preferences of genomic research participants from two institutions along four dimensions of general research result updates: content, timing, mechanism, and frequency. Methods: We conducted a web-based cross-sectional survey that was administered from 25 June 2018 to 5 December 2018. Results: 397 participants completed the survey, most of whom (96%) expressed a desire to receive research updates. Preferences with high endorsement included: update content (brief descriptions of major findings, descriptions of purpose and goals, and educational material); update timing (when the research is completed, when findings are reviewed, when findings are published, and when the study status changes); update mechanism (email with updates, and email newsletter); and update frequency (every three months). Hierarchical cluster analyses based on the four update preferences identified four profiles of participants with similar preference patterns. Very few participants in the largest profile were comfortable with budgeting less money for research activities so that researchers have money to set up services to send research result updates to study participants. Conclusion: Future studies may benefit from exploring preferences for research result updates, as we have in our study. In addition, this work provides evidence of a need for funders to incentivize researchers to communicate results to participants

    Genomic Information for Clinicians in the Electronic Health Record: Lessons Learned from ClinGen and eMERGE

    Genomic knowledge is being translated into clinical care. To fully realize the value, it is critical to place credible information in the hands of clinicians in time to support clinical decision-making. The electronic health record is an essential component of clinician workflow. Utilizing the electronic health record to present information to support the use of genomic medicine in clinical care to improve outcomes represents a tremendous opportunity. However, there are numerous barriers that prevent the effective use of the electronic health record for this purpose. The electronic health record working groups of the electronic MEdical Records and GEnomics network (eMERGE) and the Clinical Genome Resource (ClinGen) project, along with other groups, have been defining these barriers, to allow the development of solutions that can be tested using implementation pilots. In this paper, we present ‘lessons learned’ from these efforts to inform future efforts leading to the development of effective and sustainable solutions that will support the realization of genomic medicine