715 research outputs found

    Natural Language Processing – Finding the Missing Link for Oncologic Data, 2022

    Oncology, like most medical specialties, is undergoing a data revolution, at the center of which lie vast and growing amounts of clinical data in unstructured, semi-structured, and structured formats. Artificial intelligence approaches are widely employed in research endeavors in an attempt to harness electronic medical records data to advance patient outcomes. The use of clinical oncologic data, although collected on a large scale, particularly with the increased implementation of electronic medical records, remains limited due to missing, incorrect, or manually entered data in registries and the lack of resource allocation to data curation in real-world settings. Natural Language Processing (NLP) may provide an avenue to extract data from electronic medical records and, as a result, has grown considerably in medicine, where it is employed for documentation, outcome analysis, phenotyping, and clinical trial eligibility. Barriers to NLP persist: findings cannot be aggregated across studies because of differing methods and significant heterogeneity at all levels, and important parameters such as patient comorbidities and performance status are lacking in AI approaches. The goal of this review is to provide an updated overview of NLP and the current state of its application in oncology for clinicians and researchers who wish to implement NLP to augment registries and/or advance research projects.

    Machine learning approaches to identifying social determinants of health in electronic health record clinical notes

    Social determinants of health (SDH) represent the complex set of circumstances in which individuals are born, or with which they live, that impact health. Relatively little attention has been given to the processes needed to extract SDH data from electronic health records (EHRs). Despite their importance, SDH data in the EHR remain sparse, typically collected only in clinical notes and thus largely unavailable for clinical decision making. I focus on developing and validating more efficient information extraction approaches to identifying and classifying SDH in clinical notes. In this dissertation, I have three goals. First, I develop a word embedding model to expand SDH terminology in the context of identifying SDH in clinical text. Second, I examine the effectiveness of different machine learning algorithms and a neural network model in classifying two SDH characteristics: financial resource strain and poor social support. Third, I compare the highest-performing approaches to simpler text mining techniques and evaluate the models based on performance, cost, and generalizability in the task of classifying SDH in two distinct data sources.

    A study assessing the characteristics of big data environments that predict high research impact: application of qualitative and quantitative methods

    BACKGROUND: Big data offers new opportunities to enhance healthcare practice. While researchers have shown increasing interest in using it, little is known about what drives research impact. We explored predictors of research impact across three major sources of healthcare big data derived from the government and the private sector. METHODS: This study was based on a mixed methods approach. Using quantitative analysis, we first used social network analysis to cluster peer-reviewed original research that used data from a government source, the Veterans Health Administration (VHA), and from private sources, IBM MarketScan and Optum. We analyzed a battery of research impact measures as a function of the data sources. Other main predictors were topic clusters and authors’ social influence. Additionally, we conducted key informant interviews (KIIs) with a purposive sample of high-impact researchers who have knowledge of the data. We then compiled the findings of the KIIs into two case studies to provide a rich understanding of the drivers of research impact. RESULTS: Analysis of 1,907 peer-reviewed publications using VHA, IBM MarketScan, and Optum found that the overall research enterprise was highly dynamic and growing over time. With less than 4 years of observation, research productivity, use of machine learning (ML), natural language processing (NLP), and the Journal Impact Factor showed substantial growth. Studies that used ML and NLP, however, showed limited visibility. After adjustments, VHA studies had generally higher impact (10% and 27% higher annualized Google citation rates) compared to MarketScan and Optum (p<0.001 for both). Analysis of co-authorship networks showed that no single social actor, either a community of scientists or institutions, was dominating. Other key opportunities to achieve high impact based on KIIs include methodological innovations, under-studied populations, and predictive modeling based on rich clinical data. CONCLUSIONS: Big data for purposes of research analytics has grown within the three data sources studied between 2013 and 2016. Despite important challenges, the research community is reacting favorably to the opportunities offered both by big data and advanced analytic methods. Big data may be a logical and cost-efficient choice to emulate research initiatives where RCTs are not possible.

    Broadening horizons: the case for capturing function and the role of health informatics in its use

    Background: Human activity and the interaction between health conditions and activity are a critical part of understanding the overall function of individuals. The World Health Organization’s International Classification of Functioning, Disability and Health (ICF) models function as all aspects of an individual’s interaction with the world, including organismal concepts such as individual body structures, functions, and pathologies, as well as the outcomes of the individual’s interaction with their environment, referred to as activity and participation. Function, particularly activity and participation outcomes, is an important indicator of health at both the individual and the population level, as it is highly correlated with quality of life and is a critical component of identifying resource needs. Since it reflects the cumulative impact of health conditions on individuals and is not disease specific, its use as a health indicator helps to address major barriers to holistic, patient-centered care that result from multiple, and often competing, disease-specific interventions. While the need for better information on function has been widely endorsed, this has not translated into its routine incorporation into modern health systems. Purpose: We present the importance of capturing information on activity as a core component of modern health systems and identify specific steps and analytic methods that can be used to make it more available for improving patient care. We identify challenges in the use of activity and participation information, such as a lack of consistent documentation and diversity of data specificity and representation across providers, health systems, and national surveys. We describe how activity and participation information can be more effectively captured, and how health informatics methodologies, including natural language processing (NLP), can enable automatically locating, extracting, and organizing this information on a large scale, supporting standardization and utilization with minimal additional provider burden. We examine the analytic requirements and potential challenges of capturing this information with informatics, and describe how data-driven techniques can combine with common standards and documentation practices to make activity and participation information standardized and accessible for improving patient care. Recommendations: We recommend four specific actions to improve the capture and analysis of activity and participation information throughout the continuum of care: (1) make activity and participation annotation standards and datasets available to the broader research community; (2) define common research problems in automatically processing activity and participation information; (3) develop robust, machine-readable ontologies for function that describe the components of activity and participation information and their relationships; and (4) establish standards for how and when to document activity and participation status during clinical encounters. We further provide specific short-term goals to make significant progress in each of these areas within a reasonable time frame.

    Misconduct-Related Discharge from Active Duty Military Service: An Examination of Precipitating Factors and Post-Deployment Health Outcomes

    U.S. military service members who are discharged from service for misconduct are at high risk for mental health and substance use disorders, homelessness, mortality, and incarceration. The purpose of this dissertation was to investigate the pre- and post-discharge experiences and characteristics of this highly vulnerable population in order to inform improved prevention and intervention strategies. Administrative data from the Department of Defense and Veterans Health Administration for veterans of recent conflicts were used to conduct 3 related retrospective cohort studies. These included (1) an evaluation of the demographic and military service characteristics and service-connected disabilities associated with discharge for misconduct; (2) an examination of post-discharge health status and healthcare utilization among misconduct-discharged veterans; and (3) the development of predictive models for homelessness and mortality among misconduct-discharged veterans. Several demographic and military service characteristics were associated with increased risk for misconduct discharge, as were exposure to sexual trauma, and post-discharge designation of service-connected disabilities related to mental illness. Misconduct-discharged veterans were found to have significant and complex healthcare needs, and used clinical services at approximately double the rate of routinely discharged veterans. Several risk factors for homelessness and mortality among this population were identified. Risk stratification models showed good predictive accuracy for homelessness, and fair predictive accuracy for mortality. Targeted counter-attrition strategies and an increased focus on health-related determinants of misconduct, including rehabilitative approaches to behavioral problems, may help to reduce misconduct-related attrition. Efforts to transition post-discharge care from specialty settings to integrated primary care settings may be successful in mitigating adverse outcomes. Risk stratification techniques can facilitate the efficient targeting of resources.

    Perceptions of Homeless Individuals Regarding Public Housing Use

    Research on how homeless individuals perceive shelters, housing programs, and their agents has been limited, especially in relation to the reasons for engaging in or avoiding programs. This phenomenological study explored the perspectives of chronically homeless individuals in Wake County, North Carolina, regarding shelters and housing programs, examining their reasons for using or not using shelters or public housing. Using Giddens's structuration theory as the framework, the research questions for this study explored homeless individuals' perceptions of their use of public resources related to housing and shelters, to better understand why some use these resources and, perhaps more importantly, why some choose not to. Purposeful sampling was used to identify 12 chronically homeless men and women, and data were collected through semi-structured interviews. Data were both deductively and inductively coded and analyzed using a thematic analysis procedure. This study found that the persistence of homelessness results from a combination of homeless individuals' perceptions of housing programs' structural failures, including long waiting periods for access to housing, unnecessary bureaucratic entanglements, and what they perceived as inaction or apathy on the part of program staff in response to requests for assistance. These findings are consistent with structuration theory. The implications for positive social change include recommendations to policy makers to consider the views and perceptions of homeless people in designing programs, including ways to improve access to public resources that may ultimately lead to permanent housing for homeless individuals.

    Learning Clinical Data Representations for Machine Learning
