5 research outputs found
Overview of the CLEF eHealth Evaluation Lab 2018
In this paper, we provide an overview of the sixth annual edition of the CLEF eHealth evaluation lab. CLEF eHealth 2018 continues
our evaluation resource building efforts around the easing and support of
patients, their next of kin, clinical staff, and health scientists in understanding, accessing, and authoring eHealth information in a multilingual
setting. This year’s lab offered three tasks: Task 1 on multilingual information extraction to extend from last year’s task on French and English
corpora to French, Hungarian, and Italian; Task 2 on technologically
assisted reviews in empirical medicine building on last year’s pilot task in English; and Task 3 on Consumer Health Search (CHS) in mono- and
multilingual settings that builds on the 2013–17 Information Retrieval
tasks. In total 28 teams took part in these tasks (14 in Task 1, 7 in Task
2 and 7 in Task 3). Herein, we describe the resources created for these
tasks, outline the evaluation methodology adopted, and provide a brief
summary of this year’s participants and the results obtained.
As in previous years, the organizers have made data and tools associated
with the lab tasks available for future research and development.
Improving understandability in consumer health information search: UEvora @ 2016 FIRE CHIS
This paper presents our work at 2016 FIRE CHIS. Given a CHIS query and a document associated with that query, the task is to classify the sentences in the document as relevant to the query or not, and to further classify the relevant sentences as supporting, neutral or opposing with respect to the claim made in the query. We present two different approaches to this classification. The first approach uses two models: an information retrieval model retrieves the sentences that are relevant to the query, and a supervised learning method then trains a classification model that labels the relevant sentences as support, oppose or neutral. The second approach uses only machine learning techniques to learn a single model that classifies the sentences into four classes (relevant & support, relevant & neutral, relevant & oppose, irrelevant & neutral). Our submission for CHIS uses the first approach. Erasmus Mundus LEADER project
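The two-model approach described in this abstract can be illustrated with a minimal sketch. It is not the authors' system: the term-overlap filter stands in for their information retrieval model, the small Naive Bayes classifier stands in for their supervised model, and the names `is_relevant`, `NaiveBayes` and `classify_document`, the 0.5 threshold and the toy training sentences are all assumptions made for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokens(text):
    """Lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def is_relevant(query, sentence, threshold=0.5):
    """Stage 1 (stand-in retrieval model): share of query terms found in the sentence."""
    q = set(tokens(query))
    if not q:
        return False
    return len(q & set(tokens(sentence))) / len(q) >= threshold


class NaiveBayes:
    """Stage 2 (stand-in supervised model): multinomial Naive Bayes, add-one smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokens(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        words = [w for w in tokens(text) if w in self.vocab]
        n_examples = sum(self.label_counts.values())

        def log_score(label):
            counts = self.word_counts[label]
            total = sum(counts.values()) + len(self.vocab)
            prior = math.log(self.label_counts[label] / n_examples)
            return prior + sum(math.log((counts[w] + 1) / total) for w in words)

        return max(self.label_counts, key=log_score)


def classify_document(query, sentences, model):
    """Label each sentence: stance if it passes stage 1, otherwise 'irrelevant'."""
    return [(s, model.predict(s) if is_relevant(query, s) else "irrelevant")
            for s in sentences]


# Hypothetical toy training data, invented for this sketch only.
TRAIN = [
    ("exposure clearly causes cancer", "support"),
    ("evidence confirms the link", "support"),
    ("exposure does not cause cancer", "oppose"),
    ("no evidence of any link", "oppose"),
    ("cancer is a common disease", "neutral"),
]
model = NaiveBayes().fit([t for t, _ in TRAIN], [l for _, l in TRAIN])
result = classify_document(
    "sun exposure skin cancer",
    ["Sun exposure does not cause skin cancer", "Wear a hat in summer"],
    model,
)
```

Splitting the work this way means the stance classifier only sees sentences that already passed the relevance filter, which is the main difference from the single four-class model of the second approach.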
Query expansion strategies for laypeople-centred health information retrieval
One of the most common activities on the web is searching for health information. This activity has been gaining popularity among users, but most of them have no training in health care, which leads to difficulties in understanding the terminology and contents of documents. In the field of health information retrieval, various investigations have been carried out, resulting in methodologies that improve the quality of the retrieved documents. One of the most widely covered techniques in this area is query expansion, which addresses one of the biggest difficulties users face when searching for health information: limited knowledge of medical terminology. This lack of knowledge influences the formulation of queries and the expectations about the retrieved documents. Query expansion complements the original query with additional terms, making it more reliable. These new terms can be obtained from a thesaurus containing several terms associated with a medical concept. Far less research has been conducted on the readability of documents; the most developed subject is relevance, but if a document is relevant and the user does not comprehend its contents, it ceases to be useful. This thesis proposes a methodology to improve the quality of the retrieved documents, using methods that improve users' queries, such as query expansion, and using readability formulas to determine the level of education required to understand a document. Several tests will be conducted to determine whether the source used for query expansion and the readability have an effect on the retrieval process. These tests will be evaluated with precision and NDCG in the case of relevance, and with uRBP in the case of readability.
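As a rough illustration of the two ingredients named in this abstract, the sketch below expands a lay query from a thesaurus and estimates the education level needed to read a text. The thesaurus entries, the function names and the vowel-group syllable counter are assumptions made for this example; the abstract does not say which thesaurus or which readability formulas the thesis uses, so the Flesch-Kincaid grade level is shown only as one well-known formula.

```python
import re

# Hypothetical toy thesaurus mapping lay phrases to medical terms.
THESAURUS = {
    "heart attack": ["myocardial infarction"],
    "high blood pressure": ["hypertension"],
}


def expand_query(query, thesaurus=THESAURUS):
    """Append the thesaurus terms of every lay phrase found in the query."""
    lowered = query.lower()
    extra = [syn for phrase, syns in thesaurus.items()
             if phrase in lowered for syn in syns]
    return query if not extra else query + " " + " ".join(extra)


def count_syllables(word):
    """Crude heuristic: one syllable per group of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade level: approximate US school grade needed to read text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


expanded = expand_query("heart attack symptoms")
simple = flesch_kincaid_grade("The cat sat. The dog ran.")
dense = flesch_kincaid_grade(
    "Myocardial infarction necessitates immediate percutaneous intervention."
)
```

Here `expanded` carries the medical term alongside the lay phrase, and `dense` scores a much higher grade level than `simple`, which is the kind of signal a readability-aware ranking could use.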
Promoting understandability in consumer health information search
Nowadays, in the area of Consumer Health Information Retrieval, techniques
and methodologies are still far from being effective in answering complex
health queries. One main challenge comes from the varying and limited
medical knowledge background of consumers; the existing language gap between
non-expert consumers and the complex medical resources confuses
them. So, returning not only topically relevant but also understandable
health information to the user is a significant and practical challenge in this
area.
In this work, the main research goal is to study ways to promote understandability
in Consumer Health Information Retrieval. To help reach this goal, two research
questions are posed: (i) how to bridge the existing language gap; (ii) how to
return more understandable documents. Two modules are designed, each answering
one research question. In the first module, a Medical Concept Model is proposed
for use in health query processing; this model integrates Natural Language
Processing techniques into state-of-the-art Information Retrieval. Moreover,
aiming to integrate syntactic and semantic information, word embedding models
are explored as query expansion resources. The second module is designed to
learn understandability from past data; a two-stage learning to rank model is
proposed with rank aggregation methods applied on single field-based ranking models.
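To make the idea of rank aggregation over field-based rankers concrete, here is a minimal sketch using a Borda count. The abstract does not name the aggregation methods used, so Borda count is an assumption, as are the function name `borda_aggregate` and the example field names and document identifiers.

```python
from collections import defaultdict


def borda_aggregate(rankings):
    """Merge several ranked lists: a document at position p in a list of
    length n earns n - p points; higher total points rank first."""
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, doc in enumerate(ranking):
            scores[doc] += n - position
    # Ties are broken by document identifier to keep the output deterministic.
    return sorted(scores, key=lambda doc: (-scores[doc], doc))


# Hypothetical rankings, one per document field.
by_title = ["d1", "d2", "d3"]
by_body = ["d2", "d1", "d3"]
by_headings = ["d2", "d3", "d1"]
merged = borda_aggregate([by_title, by_body, by_headings])
```

Position-based aggregation of this kind needs only the order produced by each ranker, so rankers whose scores are on different scales can still be combined.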
These proposed modules are assessed on the FIRE’2016 CHIS track data and
CLEF’2016-2018 eHealth IR data collections. Extensive experimental comparisons
with state-of-the-art baselines on the considered data collections
confirmed the effectiveness of the proposed approaches: regarding understandability
relevance, the improvement is 11.5%, 9.3% and 16.3% in
RBP, uRBP and uRBPgr evaluation metrics, respectively; regarding
topical relevance, the improvement is 7.8%, 16.4% and 7.6% in P@10,
NDCG@10 and MAP evaluation metrics, respectively.
Summary (translated from the Portuguese):
Promoting Understandability in Consumer Health Information Search
Currently, the techniques and methodologies used in the area of Health
Information Retrieval are still far from effective in answering the
questions posed by consumers. One of the main challenges is the varied
and limited medical knowledge of consumers; the linguistic gap between
consumers and complex medical resources confuses non-expert consumers.
Thus, providing health information that is not only relevant but also
understandable is a significant and practical challenge in this area.
In this work, the goal is to study ways of promoting understandability
in Health Information Retrieval. To that end, two research questions are
raised: (i) how to reduce the language differences between consumers and
medical resources; (ii) how to retrieve more understandable texts. Two
modules are proposed, each answering one of the questions. In the first
module, a Medical Concept Model is proposed for inclusion in the
information query process, integrating Natural Language Processing
techniques into Information Retrieval. Furthermore, with the aim of
incorporating syntactic and semantic information, word embedding models
are also explored for query expansion. The second module is designed to
learn understandability from past data; a two-stage learning to rank
model is proposed, with aggregation methods applied to the ranking
models built from information in specific document fields.
The proposed modules are evaluated on the FIRE’2016 CHIS and
CLEF’2016-2018 eHealth collections. Extensive experimental comparisons
with current models (baselines) confirm the effectiveness of the proposed
approaches: regarding understandability relevance, improvements of 11.5%,
9.3% and 16.3% were obtained in the RBP, uRBP and uRBPgr evaluation
measures, respectively; regarding the relevance of the retrieved topics,
improvements of 7.8%, 16.4% and 7.6% were obtained in the P@10, NDCG@10
and MAP evaluation measures, respectively.
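For readers unfamiliar with the evaluation measures named in these abstracts, the sketch below gives minimal versions of P@k, NDCG@k, RBP and uRBP. It makes several assumptions: the persistence value 0.8 is only a default chosen for the example, NDCG uses linear gains and computes the ideal ranking from the retrieved list alone, and uRBP follows the commonly cited definition in which each rank's relevance is multiplied by an understandability score (binary for uRBP, graded for uRBPgr). Official evaluation tools may differ in these details.

```python
import math


def precision_at_k(relevance, k):
    """Fraction of the top-k results that are relevant (relevance > 0)."""
    return sum(1 for r in relevance[:k] if r > 0) / k


def ndcg_at_k(gains, k):
    """NDCG@k with linear gains; the ideal order is taken from the
    retrieved list itself (a simplification for this sketch)."""
    def dcg(values):
        return sum(g / math.log2(i + 2) for i, g in enumerate(values[:k]))

    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0


def rbp(relevance, rho=0.8):
    """Rank-biased precision: (1 - rho) * sum over ranks of rho^(k-1) * r_k."""
    return (1 - rho) * sum(rho ** i * r for i, r in enumerate(relevance))


def urbp(relevance, understandability, rho=0.8):
    """Understandability-biased RBP: each rank contributes r_k * u_k."""
    return (1 - rho) * sum(
        rho ** i * r * u
        for i, (r, u) in enumerate(zip(relevance, understandability))
    )


# Hypothetical judgments for a three-document ranking.
rel = [1, 0, 1]
und = [1, 1, 0]
scores = (rbp(rel, rho=0.5), urbp(rel, und, rho=0.5))
```

With these judgments, the third document is relevant but not understandable, so it counts toward RBP and contributes nothing to uRBP.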