Biomedical information extraction for matching patients to clinical trials
Digital medical information has grown astonishingly over the last decades, driven
by an unprecedented number of medical writers, which has led to a complete revolution in
what and how much information is available to health professionals.
The problem with this wave of information is that precisely selecting among the
results returned by medical information repositories is exhausting and time
consuming for physicians. This is one of the biggest challenges physicians face in the
new digital era: how to reduce the time spent finding the best-matching document for a
patient (e.g. intervention articles, clinical trials, prescriptions).
Precision Medicine (PM) 2017 is a track of the Text REtrieval Conference (TREC)
that focuses on this type of challenge, exclusively for oncology. Using a dataset with a
large number of clinical trials, this track is a good real-life example of how information
retrieval solutions can be used to solve these types of problems. It is therefore a very
good starting point for applying information extraction and retrieval methods in a very
complex domain.
The purpose of this thesis is to improve a system designed by the NovaSearch team
for the TREC PM 2017 Clinical Trials task, which ranked among the top-5 systems of 2017.
The NovaSearch team also participated in the 2018 track and obtained a 15% increase in
precision compared to 2017. Multiple IR techniques were used for information
extraction and data processing, including rank fusion, query expansion (e.g. pseudo
relevance feedback, MeSH term expansion) and experiments with Learning to Rank
(LETOR) algorithms. Our goal is to retrieve the best possible set of trials for a given
patient, using precise document filters to exclude unwanted clinical trials. This work
can open doors to what can be done in searching and understanding the criteria for
including or excluding trials, helping physicians even with the more complex and difficult
information retrieval tasks.
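
As an illustration of the rank fusion step mentioned above, the following is a minimal sketch of reciprocal rank fusion (RRF), one common fusion method. The abstract does not state which fusion method the system uses, so the choice of RRF, the constant k = 60, the function name and the trial identifiers are all assumptions made for this example only.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Each document receives 1 / (k + rank) from every list it appears in;
    documents are returned by descending fused score (ties broken by id).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical runs: one from a baseline query, one from an expanded query.
baseline_run = ["trial_a", "trial_b", "trial_c"]
expanded_run = ["trial_b", "trial_d", "trial_a"]

fused = reciprocal_rank_fusion([baseline_run, expanded_run])
print(fused)
```

In this sketch, trials that appear near the top of both runs are ranked first, which is the usual motivation for fusing a baseline run with a query-expanded run.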