LeafAI: query generator for clinical cohort discovery rivaling a human programmer
Objective: Identifying study-eligible patients within clinical databases is a
critical step in clinical research. However, accurate query design typically
requires extensive technical and biomedical expertise. We sought to create a
system capable of generating data model-agnostic queries while also providing
novel logical reasoning capabilities for complex clinical trial eligibility
criteria.
Materials and Methods: The task of query creation from eligibility criteria
requires solving several text-processing problems, including named entity
recognition and relation extraction, sequence-to-sequence transformation,
normalization, and reasoning. We incorporated hybrid deep learning and
rule-based modules for these, as well as a knowledge base of the Unified
Medical Language System (UMLS) and linked ontologies. To enable data-model
agnostic query creation, we introduce a novel method for tagging database
schema elements using UMLS concepts. To evaluate our system, called LeafAI, we
compared its ability to identify patients who had been enrolled in 8 clinical
trials conducted at our institution with that of a human database programmer.
We measured performance by the number of actual enrolled patients
matched by generated queries.
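The matching measure described above can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: it assumes each trial is represented by a set of enrolled patient IDs and a set of patient IDs returned by a query, and the function names and toy data are hypothetical.

```python
from statistics import mean


def fraction_matched(enrolled: set[str], matched: set[str]) -> float:
    """Share of actually enrolled patients that a query returned."""
    if not enrolled:
        raise ValueError("trial has no enrolled patients")
    return len(enrolled & matched) / len(enrolled)


def mean_fraction_matched(trials: dict[str, tuple[set[str], set[str]]]) -> float:
    """Average the per-trial share over all trials."""
    return mean(fraction_matched(e, m) for e, m in trials.values())


# Hypothetical toy data: trial name -> (enrolled IDs, IDs returned by the query).
trials = {
    "trial_a": ({"p1", "p2", "p3", "p4"}, {"p1", "p2", "p9"}),
    "trial_b": ({"p5", "p6"}, {"p5", "p6", "p7"}),
}

print(mean_fraction_matched(trials))
```

Patients returned by a query but never enrolled (such as "p9" above) do not lower this measure; they only add to the count of patients the query considers eligible.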
Results: LeafAI matched a mean of 43% of enrolled patients, with 27,225 eligible
across 8 clinical trials, compared to 27% matched and 14,587 eligible in
queries by a human database programmer. The human programmer spent 26 total
hours crafting queries compared to several minutes by LeafAI.
Conclusions: Our work contributes a state-of-the-art data model-agnostic
query generation system capable of conditional reasoning using a knowledge
base. We demonstrate that LeafAI can rival a human programmer in finding
patients eligible for clinical trials.
Closed-World Semantics for Query Answering in Temporal Description Logics
Ontology-mediated query answering is a popular paradigm for enriching answers to user queries with background knowledge. For querying the absence of information, however, there exist only a few ontology-based approaches. Moreover, these proposals conflate the closed-domain and closed-world assumptions, and are therefore not suited to dealing with the anonymous objects that are common in ontological reasoning. Many real-world applications, like processing electronic health records (EHRs), also contain a temporal dimension and require efficient reasoning algorithms. Moreover, since medical data is not recorded on a regular basis, reasoners must deal with sparse data with potentially large temporal gaps. Our contribution consists of three main parts:
Firstly, we introduce a new closed-world semantics for answering conjunctive queries with negation over ontologies formulated in the description logic ELH⊥, which is based on the minimal universal model. We propose a rewriting strategy for dealing with negated query atoms, which shows that query answering is possible in polynomial time in data complexity.
Secondly, we introduce a new temporal variant of ELH⊥ that features a convexity operator. We extend this minimal-world semantics for answering metric temporal conjunctive queries with negation over the logic and obtain similar rewritability and complexity results.
Thirdly, apart from the theoretical results, we evaluate minimal-world semantics in practice by selecting patients, based on their EHRs, that match given criteria.
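The idea of answering a query with a negated atom against a minimal model can be illustrated with a heavily simplified sketch. It is not the semantics or the rewriting strategy defined in this work: it assumes an ontology consisting only of atomic concept inclusions (A ⊑ B), so it ignores roles, existential restrictions (and hence anonymous objects), ⊥, and the temporal dimension. All names and data are hypothetical.

```python
def saturate(facts, subsumptions):
    """Close concept assertions under atomic inclusions A ⊑ B.

    With only atomic inclusions, the saturated fact set plays the role
    of a minimal model of the data and ontology (a simplifying assumption).
    """
    closed = set(facts)
    changed = True
    while changed:
        changed = False
        for concept, individual in list(closed):
            for sub, sup in subsumptions:
                if concept == sub and (sup, individual) not in closed:
                    closed.add((sup, individual))
                    changed = True
    return closed


def answer(facts, subsumptions, positive, negative):
    """Individuals satisfying every positive concept and no negated one.

    A negated atom holds when the assertion is absent from the
    saturated fact set (closed-world reading).
    """
    closed = saturate(facts, subsumptions)
    individuals = {ind for _, ind in closed}
    return {
        ind
        for ind in individuals
        if all((c, ind) in closed for c in positive)
        and not any((c, ind) in closed for c in negative)
    }


# Hypothetical toy data: (concept, individual) assertions and inclusions.
facts = {
    ("Type2Diabetes", "alice"),
    ("Type2Diabetes", "bob"),
    ("LungCancer", "bob"),
}
subsumptions = [("Type2Diabetes", "Diabetes"), ("LungCancer", "Cancer")]

# Query: Diabetes(x) AND NOT Cancer(x)
print(answer(facts, subsumptions, positive=["Diabetes"], negative=["Cancer"]))
```

Here "bob" is excluded because Cancer(bob) follows from the inclusion LungCancer ⊑ Cancer, while "alice" is returned because no cancer assertion about her is derivable.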