15 research outputs found
CLEF 2017 NewsREEL Overview: Offline and Online Evaluation of Stream-based News Recommender Systems
The CLEF NewsREEL challenge allows researchers to evaluate news
recommendation algorithms both online (NewsREEL Live) and offline (NewsREEL
Replay). Compared with the previous year, NewsREEL challenged participants
with a higher volume of messages and new news portals. In the 2017
edition of the CLEF NewsREEL challenge, a wide variety of new approaches were
implemented, ranging from the use of existing machine learning frameworks
to ensemble methods and deep neural networks. This paper gives an
overview of the implemented approaches and discusses the evaluation results.
In addition, the main results of the Living Lab and the Replay task are explained.
Overview of NewsREEL’16: Multi-dimensional evaluation of real-time stream-recommendation algorithms
Successful news recommendation requires facing the challenges of dynamic item sets and contextual item relevance, and fulfilling non-functional requirements such as response time. The CLEF NewsREEL challenge is a campaign-style evaluation lab allowing participants to tackle news recommendation and to optimize and evaluate their recommender algorithms both online and offline. In this paper, we summarize the objectives and challenges of NewsREEL 2016. We cover two contrasting perspectives on the challenge: that of the operator (the business providing recommendations) and that of the challenge participant (the researchers developing recommender algorithms). At the intersection of these perspectives, new insights can be gained on how to effectively evaluate real-time stream recommendation algorithms.
Overview of the CLEF 2018 Consumer Health Search Task
This paper details the collection, systems and evaluation methods used in the CLEF 2018 eHealth Evaluation Lab, Consumer Health Search (CHS) task (Task 3). This task investigates the effectiveness of search engines in providing access to medical information present on the Web for people who have little or no medical knowledge. The task aims to foster advances in the development of search technologies for Consumer Health Search by providing resources and evaluation methods to test and validate search systems. Built upon the 2013-17 series of CLEF eHealth Information Retrieval tasks, the 2018 task considers both mono- and multilingual retrieval, embracing the Text REtrieval Conference (TREC)-style evaluation process with a shared collection of documents and queries, the contribution of runs from participants, and the subsequent formation of relevance assessments and evaluation of the participants' submissions.
For this year, the CHS task uses a new Web corpus and a new set of queries compared to the previous years. The new corpus consists of Web pages acquired from the CommonCrawl, and the new set of queries consists of 50 queries issued by the general public to the Health on the Net (HON) search services. We then manually translated the 50 queries to French, German, and Czech, and obtained English query variations of the 50 original queries.
A total of 7 teams from 7 different countries participated in the 2018 CHS task: CUNI (Czech Republic), IMS Unipd (Italy), MIRACL (Tunisia), QUT (Australia), SINAI (Spain), UB-Botswana (Botswana), and UEvora (Portugal).
NailP at eRisk 2023: search for symptoms of depression
Depression is a global health concern with severe consequences for individuals, making its recognition and
understanding crucial. Recently, there has been a growing interest in utilizing social media platforms as
valuable sources of information to gain insights into individuals’ experiences with depression. Analyzing
textual data from diverse user populations enables the identification of common symptoms, triggers,
coping mechanisms, and potential warning signs. Researchers have developed algorithms and machine
learning models to automate the detection of depressive symptoms in text, facilitating more efficient
screening and early intervention. This paper describes the participation of team NailP in the CLEF
eRisk 2023 task 1, which focuses on ranking sentences from user writings based on their relevance to
symptoms of depression. The goal is to evaluate the sentences and determine their level of relevance to
each symptom outlined in the Beck Depression Questionnaire-II. Such participation contributes to the
development of effective methods and tools for identifying and predicting potential risks and dangers
associated with depression in online environments.
The authors thank CNPq, CAPES, FAPERJ, and CEFET/RJ for partially funding this research.
The authors are grateful to the Foundation for Science and Technology (FCT, Portugal) for
financial support through national funds FCT/MCTES (PIDDAC) to CeDRI (UIDB/05757/2020
and UIDP/05757/2020) and SusTEC (LA/P/0007/2021).
Platform for the assisted labeling of early-risk cases on the Internet
Since the invention of the World Wide Web, users' need to search for information on the Internet has not stopped increasing. This has led to the continuous growth of information retrieval systems and of the areas where this discipline is applied. For this reason, it is necessary to develop methodologies and tools that allow a correct evaluation of these new systems. The aim of this project is to design and build a platform that allows the efficient labeling of documents by assessors, associated with cases of psychological and mental disorders. This platform will be used to build the test collection in the 2020 CLEF eRisk competition, which is held with the aim of evaluating the effectiveness of methodologies and metrics for the early detection of risk cases on the Internet, especially those related to health, such as anorexia or depression. The platform is also intended to be flexible when adding new retrieval models or new pooling strategies. To achieve these objectives, an agile methodology was adopted, with iterative and incremental cycles that allow adaptation to the changing circumstances of the project. Following this process has resulted in a quality application that meets the established objectives.
Final degree project (UDC.FIC). Computer engineering. Academic year 2018/201
Identifying depression-related topics in smartphone-collected free-response speech recordings using an automatic speech recognition system and a deep learning topic model
Language use has been shown to correlate with depression, but large-scale
validation is needed. Traditional methods like clinic studies are expensive.
So, natural language processing has been employed on social media to predict
depression, but limitations remain: lack of validated labels, biased user
samples, and no context. Our study identified 29 topics in 3919
smartphone-collected speech recordings from 265 participants using the Whisper
tool and BERTopic model. Six topics with a median PHQ-8 greater than or equal
to 10 were regarded as risk topics for depression: No Expectations, Sleep,
Mental Therapy, Haircut, Studying, and Coursework. To elucidate the topic
emergence and associations with depression, we compared behavioral (from
wearables) and linguistic characteristics across identified topics. The
correlation between topic shifts and changes in depression severity over time
was also investigated, indicating the importance of longitudinally monitoring
language use. We also tested the BERTopic model on a similar smaller dataset
(356 speech recordings from 57 participants), obtaining some consistent
results. In summary, our findings demonstrate that specific speech topics may
indicate depression severity. The presented data-driven workflow provides a
practical approach to collecting and analyzing large-scale speech data from
real-world settings for digital health research.
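The risk-topic criterion stated in this abstract (a topic counts as a risk topic when the median PHQ-8 score of its recordings is greater than or equal to 10) can be illustrated with a minimal sketch. This is not the authors' code: the function name `risk_topics`, the `(topic, score)` input layout, and the sample data are assumptions made purely for illustration, and topic assignment by Whisper and BERTopic is taken as already done.

```python
from collections import defaultdict
from statistics import median

RISK_THRESHOLD = 10  # median PHQ-8 cutoff stated in the abstract


def risk_topics(records, threshold=RISK_THRESHOLD):
    """Return the sorted topics whose median PHQ-8 score is >= threshold.

    `records` is an iterable of (topic, phq8_score) pairs, one per recording.
    This input layout is a hypothetical choice, not the study's data format.
    """
    scores_by_topic = defaultdict(list)
    for topic, score in records:
        scores_by_topic[topic].append(score)
    return sorted(
        topic
        for topic, scores in scores_by_topic.items()
        if median(scores) >= threshold
    )


# Made-up illustrative data, not taken from the study.
example = [
    ("Sleep", 12), ("Sleep", 9), ("Sleep", 14),
    ("Weather", 3), ("Weather", 11), ("Weather", 5),
]
print(risk_topics(example))
```

Using the median rather than the mean keeps a single extreme score from moving a topic across the threshold, which matters when some topics contain only a few recordings.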