Biomedical information extraction for matching patients to clinical trials
Digital medical information has grown astonishingly over the last decades, driven
by an unprecedented number of medical writers, leading to a complete revolution in
what and how much information is available to health professionals.
The problem with this wave of information is that precisely selecting among the
results returned by medical information repositories is exhausting and time-consuming
for physicians. This is one of the biggest challenges physicians face in the
new digital era: how to reduce the time spent finding the best-matching document for a
patient (e.g. intervention articles, clinical trials, prescriptions).
Precision Medicine (PM) 2017 is a track of the Text REtrieval Conference (TREC)
that focuses on this type of challenge exclusively for oncology. Using a dataset with a
large number of clinical trials, this track is a good real-life example of how information
retrieval solutions can be used to solve these types of problems. It can be a very
good starting point for applying information extraction and retrieval methods in a very
complex domain.
The purpose of this thesis is to improve a system designed by the NovaSearch team
for the TREC PM 2017 Clinical Trials task, which ranked among the top 5 systems of 2017.
The NovaSearch team also participated in the 2018 track and achieved a 15% increase in
precision compared to 2017. Multiple IR techniques were used for information
extraction and data processing, including rank fusion, query expansion (e.g.
pseudo-relevance feedback, MeSH term expansion) and experiments with Learning to Rank
(LETOR) algorithms. Our goal is to retrieve the best possible set of trials for a given
patient, using precise document filters to exclude unwanted clinical trials. This work
can open doors for searching and interpreting the criteria used to include or exclude
trials, helping physicians even with the more complex and difficult information
retrieval tasks.
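The abstract names rank fusion as one of the techniques used but does not say which fusion method the system applies. As an illustration only, the following minimal sketch shows reciprocal rank fusion (RRF), a common way to merge several ranked lists. The function name, the trial identifiers and the constant k=60 are assumptions made for this example, not details taken from the thesis.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document ids into one ranking.

    Each document receives 1 / (k + rank) from every list it appears in,
    where rank starts at 1. k=60 is a conventional default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending fused score; break ties by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical output of three retrieval runs for a single patient topic.
runs = [
    ["trial_A", "trial_B", "trial_C"],
    ["trial_B", "trial_C", "trial_A"],
    ["trial_B", "trial_A", "trial_C"],
]
fused = reciprocal_rank_fusion(runs)
print(fused)
```

A trial ranked highly by several runs accumulates a larger fused score than one ranked highly by a single run, which is the property that makes fusion useful when combining different query expansion strategies.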
Utilizing Knowledge Bases In Information Retrieval For Clinical Decision Support And Precision Medicine
Accurately answering queries that describe a clinical case and aim at finding articles in a collection of medical literature requires utilizing knowledge bases to capture the many explicit and latent aspects of such queries. Properly representing these aspects requires knowledge-based query understanding methods that identify the most important query concepts, as well as knowledge-based query reformulation methods that add new concepts to a query. In the tasks of Clinical Decision Support (CDS) and Precision Medicine (PM), the query and the collection documents may have a complex structure with different components, such as diseases and genetic variants, that should be transformed to enable effective information retrieval. In this work, we propose methods for representing domain-specific queries based on weighted concepts of different types, whether they occur in the query itself or are extracted from knowledge bases and top-retrieved documents. In addition, we propose an optimization framework that unifies query analysis and expansion by jointly determining the importance weights of the query and expansion concepts depending on their type and source. We also propose a probabilistic model to reformulate the query given the genetic information in the query and the collection documents. We observe that the proposed methods achieve significant improvements in retrieval accuracy over state-of-the-art baselines for the tasks of clinical decision support and precision medicine.
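To make the idea of weighting concepts by type and source concrete, here is a minimal sketch. It is not the optimization framework described in the abstract: the weights below are hand-set placeholders, whereas the abstract says the weights are determined jointly. The class name, function names, concept types, sources and weight values are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    term: str    # surface form matched against document terms
    ctype: str   # hypothetical concept type, e.g. "disease" or "gene"
    source: str  # hypothetical origin, e.g. "query" or "knowledge_base"


def concept_weight(concept, type_weights, source_weights):
    # Assumed combination rule: product of a per-type and a per-source weight.
    return type_weights[concept.ctype] * source_weights[concept.source]


def score_document(doc_terms, concepts, type_weights, source_weights):
    # Sum the weights of all query and expansion concepts found in the document.
    return sum(
        concept_weight(c, type_weights, source_weights)
        for c in concepts
        if c.term in doc_terms
    )


# Placeholder weights; a real system would learn or optimize these.
type_weights = {"disease": 1.0, "gene": 2.0}
source_weights = {"query": 1.0, "knowledge_base": 0.5}

concepts = [
    Concept("melanoma", "disease", "query"),
    Concept("braf", "gene", "query"),
    Concept("skin cancer", "disease", "knowledge_base"),  # expansion concept
]

print(score_document({"melanoma", "skin cancer"}, concepts, type_weights, source_weights))
print(score_document({"braf"}, concepts, type_weights, source_weights))
```

Keeping type and source as separate factors means an expansion concept taken from a knowledge base can be down-weighted relative to the same type of concept stated directly in the query.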