21,768 research outputs found
LeafAI: query generator for clinical cohort discovery rivaling a human programmer
Objective: Identifying study-eligible patients within clinical databases is a
critical step in clinical research. However, accurate query design typically
requires extensive technical and biomedical expertise. We sought to create a
system capable of generating data model-agnostic queries while also providing
novel logical reasoning capabilities for complex clinical trial eligibility
criteria.
Materials and Methods: The task of query creation from eligibility criteria
requires solving several text-processing problems, including named entity
recognition and relation extraction, sequence-to-sequence transformation,
normalization, and reasoning. We incorporated hybrid deep learning and
rule-based modules for these, as well as a knowledge base of the Unified
Medical Language System (UMLS) and linked ontologies. To enable data-model
agnostic query creation, we introduce a novel method for tagging database
schema elements using UMLS concepts. To evaluate our system, called LeafAI, we
compared the capability of LeafAI to a human database programmer to identify
patients who had been enrolled in 8 clinical trials conducted at our
institution. We measured performance by the number of actual enrolled patients
matched by generated queries.
Results: LeafAI matched a mean 43% of enrolled patients with 27,225 eligible
across 8 clinical trials, compared to 27% matched and 14,587 eligible in
queries by a human database programmer. The human programmer spent 26 total
hours crafting queries compared to several minutes by LeafAI.
Conclusions: Our work contributes a state-of-the-art data model-agnostic
query generation system capable of conditional reasoning using a knowledge
base. We demonstrate that LeafAI can rival a human programmer in finding
patients eligible for clinical trials
Recommended from our members
Artificial Intelligence in Radiotherapy Treatment Planning: Present and Future.
Treatment planning is an essential step of the radiotherapy workflow. It has become more sophisticated over the past couple of decades with the help of computer science, enabling planners to design highly complex radiotherapy plans to minimize the normal tissue damage while persevering sufficient tumor control. As a result, treatment planning has become more labor intensive, requiring hours or even days of planner effort to optimize an individual patient case in a trial-and-error fashion. More recently, artificial intelligence has been utilized to automate and improve various aspects of medical science. For radiotherapy treatment planning, many algorithms have been developed to better support planners. These algorithms focus on automating the planning process and/or optimizing dosimetric trade-offs, and they have already made great impact on improving treatment planning efficiency and plan quality consistency. In this review, the smart planning tools in current clinical use are summarized in 3 main categories: automated rule implementation and reasoning, modeling of prior knowledge in clinical practice, and multicriteria optimization. Novel artificial intelligence-based treatment planning applications, such as deep learning-based algorithms and emerging research directions, are also reviewed. Finally, the challenges of artificial intelligence-based treatment planning are discussed for future works
CohortGPT: An Enhanced GPT for Participant Recruitment in Clinical Study
Participant recruitment based on unstructured medical texts such as clinical
notes and radiology reports has been a challenging yet important task for the
cohort establishment in clinical research. Recently, Large Language Models
(LLMs) such as ChatGPT have achieved tremendous success in various downstream
tasks thanks to their promising performance in language understanding,
inference, and generation. It is then natural to test their feasibility in
solving the cohort recruitment task, which involves the classification of a
given paragraph of medical text into disease label(s). However, when applied to
knowledge-intensive problem settings such as medical text classification, where
the LLMs are expected to understand the decision made by human experts and
accurately identify the implied disease labels, the LLMs show a mediocre
performance. A possible explanation is that, by only using the medical text,
the LLMs neglect to use the rich context of additional information that
languages afford. To this end, we propose to use a knowledge graph as auxiliary
information to guide the LLMs in making predictions. Moreover, to further boost
the LLMs adapt to the problem setting, we apply a chain-of-thought (CoT) sample
selection strategy enhanced by reinforcement learning, which selects a set of
CoT samples given each individual medical report. Experimental results and
various ablation studies show that our few-shot learning method achieves
satisfactory performance compared with fine-tuning strategies and gains superb
advantages when the available data is limited. The code and sample dataset of
the proposed CohortGPT model is available at:
https://anonymous.4open.science/r/CohortGPT-4872/Comment: 16 pages, 10 figure
Desiderata for the development of next-generation electronic health record phenotype libraries
Background
High-quality phenotype definitions are desirable to enable the extraction of patient cohorts from large electronic health record repositories and are characterized by properties such as portability, reproducibility, and validity. Phenotype libraries, where definitions are stored, have the potential to contribute significantly to the quality of the definitions they host. In this work, we present a set of desiderata for the design of a next-generation phenotype library that is able to ensure the quality of hosted definitions by combining the functionality currently offered by disparate tooling.
Methods
A group of researchers examined work to date on phenotype models, implementation, and validation, as well as contemporary phenotype libraries developed as a part of their own phenomics communities. Existing phenotype frameworks were also examined. This work was translated and refined by all the authors into a set of best practices.
Results
We present 14 library desiderata that promote high-quality phenotype definitions, in the areas of modelling, logging, validation, and sharing and warehousing.
Conclusions
There are a number of choices to be made when constructing phenotype libraries. Our considerations distil the best practices in the field and include pointers towards their further development to support portable, reproducible, and clinically valid phenotype design. The provision of high-quality phenotype definitions enables electronic health record data to be more effectively used in medical domains
- …