    LeMe-PT: A Medical Package Leaflet Corpus for Portuguese

    The current trend in natural language processing is the use of machine learning, which is being applied in every field from summarization to machine translation. These techniques require resources, namely quality corpora. While large quantities of corpora exist for the Portuguese language, technical and focused corpora are lacking. Therefore, in this article we present a new corpus built from drug package leaflets. We describe its structure and contents, and discuss possible directions for its exploration.

    The RareDis corpus: A corpus annotated with rare diseases, their signs and symptoms

    Rare diseases affect a small number of people compared to the general population. However, more than 6,000 different rare diseases exist and, in total, they affect more than 300 million people worldwide. Rare diseases share two main problems: the delay in diagnosis and the sparse information available to researchers, clinicians, and patients. Obtaining a diagnosis can be a very long and frustrating experience for patients and their families. The average diagnostic delay is between 6 and 8 years. Many of these diseases manifest differently among patients, which further hampers their detection and the choice of the correct treatment. Therefore, there is an urgent need to increase the scientific and medical knowledge about rare diseases. Natural Language Processing (NLP) can help to extract relevant information about rare diseases to facilitate their diagnosis and treatment, but most NLP techniques require manually annotated corpora. Therefore, our goal is to create a gold standard corpus annotated with rare diseases and their clinical manifestations. It could be used to train and test NLP approaches, and the information extracted through NLP could enrich the knowledge of rare diseases and thereby help to reduce the diagnostic delay and improve the treatment of rare diseases. The paper describes the selection of 1,041 texts to be included in the corpus, the annotation process and the annotation guidelines. The entities (disease, rare disease, symptom, sign and anaphor) and the relationships (produces, is a, is acron, is synon, increases risk of, anaphora) were annotated. The RareDis corpus contains more than 5,000 annotated rare diseases and almost 6,000 annotated clinical manifestations. Moreover, the Inter Annotator Agreement evaluation shows a relatively high agreement (F1-measure equal to 83.5% under exact match criteria for the entities and equal to 81.3% for the relations). 
Based on these results, this corpus is of high quality and represents a significant step for the field, since there is a scarcity of available corpora annotated with rare diseases. This could open the door to further NLP applications, which would facilitate the diagnosis and treatment of these rare diseases and, therefore, would dramatically improve the quality of life of these patients. This work was supported by the Madrid Government (Comunidad de Madrid) under the Multiannual Agreement with UC3M in the line of "Fostering Young Doctors Research" (NLP4RARE-CM-UC3M) and in the context of the V PRICIT (Regional Programme of Research and Technological Innovation); the Multiannual Agreement with UC3M in the line of "Excellence of University Professors (EPUC3M17)"; and a grant from the Spanish Ministry of Economy and Competitiveness (SAF2017-86810-R).
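
To make the agreement figure above concrete, the following is a minimal sketch of how an exact-match F1 between two annotators could be computed. It is not the RareDis authors' evaluation script: the `Annotation` tuple layout, the `exact_match_f1` helper, the label strings, and the toy annotations are all assumptions made for illustration.

```python
from typing import Set, Tuple

# Assumed layout of one entity annotation:
# (document id, start offset, end offset, entity type).
Annotation = Tuple[str, int, int, str]


def exact_match_f1(annotator_a: Set[Annotation], annotator_b: Set[Annotation]) -> float:
    """F1 between two annotation sets, treating annotator A as the reference.

    Under exact matching, two annotations agree only if the document,
    both offsets, and the entity type are identical.
    """
    if not annotator_a and not annotator_b:
        return 1.0  # nothing annotated by either side, so nothing to disagree on
    matches = len(annotator_a & annotator_b)
    if matches == 0:
        return 0.0
    precision = matches / len(annotator_b)
    recall = matches / len(annotator_a)
    return 2 * precision * recall / (precision + recall)


# Toy, made-up annotations (hypothetical labels, not taken from the corpus).
a = {("doc1", 0, 15, "RAREDISEASE"), ("doc1", 30, 38, "SYMPTOM"), ("doc1", 50, 60, "SIGN")}
b = {("doc1", 0, 15, "RAREDISEASE"), ("doc1", 30, 38, "SIGN"), ("doc1", 50, 60, "SIGN")}

# Two of three annotations match exactly; the second differs only in its type.
print(round(exact_match_f1(a, b), 3))
```

Because the score is symmetric in precision and recall, swapping which annotator is treated as the reference does not change the F1 value.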

    Development and evaluation of a microservice-based virtual assistant for chronic patients support

    Virtual assistants (also known as chatbots) are programs that interact with users by simulating a human conversation through text or voice messages. Virtual assistants for health care offer services, tools, advice, help, support, and management for different diseases. Users of this type of virtual assistant may include, for example, patients, caregivers, and health professionals, who have different needs and requirements. Patients with chronic diseases could benefit from virtual assistants that monitor their condition, provide specific information, encourage medication adherence, etc. To perform these functions, virtual assistants need a suitable software architecture. This doctoral thesis proposes the design of a specific architecture for developing virtual assistants intended to support chronic patients. Nowadays, people interact with each other daily using messaging platforms. To align this type of interaction with the virtual assistant's architecture, we propose using messaging platforms for the interaction between virtual assistant and patient, paying special attention to security and privacy issues (that is, using secure messaging platforms with end-to-end encryption). Virtual assistants can implement conversational systems to make the interaction with patients more natural. Conversational systems in complex health care scenarios, such as disease management, must be able to understand the complex sentences used during the interaction. Adapting new natural language processing (NLP) methods can improve the virtual assistant's architecture. 
Word embeddings have been widely used in NLP as input to neural networks. Such word embeddings can help to understand the final goal and the keywords in a sentence. Therefore, in this thesis we study the impact of different word embeddings trained on general and domain-specific corpora, using joint natural language understanding (joint NLU) in the Spanish medication domain. The data for training the joint NLU model are generated using templates. This model is used for intent detection as well as for slot filling. In this study we compare word2vec and fastText as word embeddings and ELMo and BERT as language models. To train the embeddings we use three different corpora: the training data generated for this scenario, the Spanish Wikipedia as a general domain, and the Spanish drug database as specialized data. The best result was obtained with the ELMo model trained on the Spanish Wikipedia. We provide the virtual assistant with NLP-based medication management capabilities. Consequently, we analyze the impact of slot tagging and training data length on joint NLU models for medication management scenarios using virtual assistants in Spanish. In this study we define the intents (purposes of the sentences) for scenarios focused on medication administration, and two types of slot labels. To train the model, we generate four datasets, combining long or short sentences with long or short slots. For the comparative analysis, we choose six joint NLU models (SlotRefine, stack-propagation framework, SF-ID network, capsule-NLU, slot-gated modeling, and joint SLU-LM) from the existing literature. 
After the comparative analysis, the best result was obtained using short sentences and short slots. Our results suggested that joint NLU models trained with short slots produced better results than those trained with long slots for the slot filling task. Overall, we propose a generic microservice architecture valid for any type of chronic disease management. The generic prototype offers an operational virtual assistant that manages basic information and serves as a basis for future extensions. In addition, in this thesis we present two specialized prototypes to show how this new architecture allows different parts of the virtual assistant to be changed, added, or improved in a dynamic and flexible way. The first specialized prototype aims to help manage the patient's medication. This prototype is in charge of medication intake reminders through the creation of a support community in which patients, caregivers, and health professionals interact with useful tools and services offered by the virtual assistant. The second specialized prototype is designed for a specific chronic disease, psoriasis. This prototype offers teleconsultation and photograph storage. Finally, this thesis aims to validate the effectiveness of the health care virtual assistant integrated into messaging platforms. Therefore, this thesis includes the evaluation of the two specialized prototypes. The first study aims to improve medication adherence in patients with comorbid type 2 diabetes mellitus and depressive disorder. To this end, a nine-month pilot study was designed and subsequently carried out. 
In the study we analyzed the Medication Possession Ratio (MPR), obtained the Patient Health Questionnaire (PHQ-9) score, and measured the glycosylated hemoglobin (HbA1c) level of the patients before and after the study. We also interviewed all participants. A total of thirteen patients and five nurses used and evaluated the proposed virtual assistant. The results showed that, on average, the patients' medication adherence improved. The second study aims to evaluate one year of use of the virtual assistant by patients with psoriasis and dermatologists, and its impact on their quality of life. To this end, a one-year prospective study with psoriasis patients and dermatologists was designed and carried out. To measure the improvement in quality of life, in this study we analyzed the Psoriasis Quality of Life (PSOLIFE) questionnaire and the Dermatology Life Quality Index (DLQI). In addition, we surveyed all participants and obtained the number of medical consultations carried out through the virtual assistant. A total of 34 participants (30 patients diagnosed with moderate-to-severe psoriasis and four health professionals) were included in the study. The results showed that, on average, quality of life improved.
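
The template-based generation of joint NLU training data mentioned in this abstract can be pictured with a small sketch. Everything below is hypothetical: the templates, intent names, slot names, and slot values are invented for illustration (and written in English rather than Spanish), and BIO tagging is assumed as the slot labeling scheme; the thesis's actual templates and labels are not given in the abstract.

```python
import itertools
from typing import Dict, Iterator, List, Tuple

# Hypothetical templates: intent name -> sentence with {slot} placeholders.
TEMPLATES: Dict[str, str] = {
    "add_medication": "remind me to take {drug} at {time}",
    "ask_dose": "how much {drug} should I take",
}

# Hypothetical values used to fill each slot.
SLOT_VALUES: Dict[str, List[str]] = {
    "drug": ["metformin", "vitamin D"],
    "time": ["8 am", "noon"],
}

Example = Tuple[str, List[str], List[str]]  # (intent, tokens, BIO slot tags)


def is_slot(piece: str) -> bool:
    """A template piece is a slot placeholder if it looks like {name}."""
    return piece.startswith("{") and piece.endswith("}")


def fill(template: str, values: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Fill one template and emit a BIO tag for every resulting token."""
    tokens: List[str] = []
    tags: List[str] = []
    for piece in template.split():
        if is_slot(piece):
            slot = piece[1:-1]
            words = values[slot].split()
            tokens.extend(words)
            # First word of the value gets B-, the remaining words get I-.
            tags.extend([f"B-{slot}"] + [f"I-{slot}"] * (len(words) - 1))
        else:
            tokens.append(piece)
            tags.append("O")
    return tokens, tags


def generate_examples() -> Iterator[Example]:
    """Yield one labeled example per template and slot-value combination."""
    for intent, template in TEMPLATES.items():
        slots = [p[1:-1] for p in template.split() if is_slot(p)]
        for combo in itertools.product(*(SLOT_VALUES[s] for s in slots)):
            tokens, tags = fill(template, dict(zip(slots, combo)))
            yield intent, tokens, tags


examples = list(generate_examples())
for intent, tokens, tags in examples:
    print(intent, list(zip(tokens, tags)))
```

Each generated example carries both an intent label and per-token slot tags, which is the paired supervision a joint intent-detection and slot-filling model needs. Longer slot values simply produce longer runs of `I-` tags.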

    Impact of written drug information in patient package inserts: Acceptance and impact on benefit/risk perception

    This thesis discusses the patient package insert (PPI), a folded sheet of paper in the drug package with a text which is supposed to be comprehensible for the general public. The PPI contains information on how the drug must be taken, on the risks of taking the drug, and a limited amount of information on what the drug is for. Belgium was the first country in Europe, together with Switzerland, to introduce PPIs. PPIs were first introduced in 1988 and the process was completed in 1992. In Europe, health authorities decided in 1992 that all medicinal product packages should contain a comprehensible insert. This decision is slowly but surely being implemented in all European countries. Similar developments did not take place in the US and other parts of the world, where medicines are distributed in bulk and dispensed without much information, even when dealing with powerful prescription drugs. During the introduction of PPIs in Belgium, a research programme was conducted to evaluate this change in the way drug information was provided in the drug distribution system. This thesis provides an overview of the studies carried out during that period. In addition, a number of other descriptive studies of the flow of drug information in specific patient groups are provided. Finally, a number of experimental studies are presented, which evaluate the impact of written drug information on patients' benefit/risk perception. The acceptance of a drug distribution system with mandatory PPIs in all drug packages will be evaluated on the basis of Belgium's relatively long and well-documented experience with PPIs. We address the following questions: what is the percentage of patients who read, accept and appreciate PPIs; what happens when a country changes from technical inserts with difficult jargon to comprehensible PPIs; what do we know about the impact of PPIs on patients' knowledge and feelings about their drugs? 
In this thesis, an attempt is made to understand the mental processing of drug information which precedes patients' decisions and coping strategies, necessary for successful drug treatment. The question here is whether the PPI is capable of influencing the benefit/risk perception of patients. A further step is to study the impact of the PPI on behaviour. Here, other questions are at stake. Does the PPI have an impact on patients' reporting of health problems and side-effects, on their ability to carry out a treatment correctly and safely, on their adherence to therapy at the beginning of treatment, and on their motivation to continue crucial therapy? These questions are addressed only to a limited extent in this work, as we have focused on the preceding cognitive process of benefit/risk perception.

    Simplifying drug package leaflets written in Spanish by using word embedding

    Background: Drug Package Leaflets (DPLs) provide information for patients on how to safely use medicines. Pharmaceutical companies are responsible for producing these documents. However, several studies have shown that patients usually have problems understanding the sections describing posology (dosage quantity and prescription), contraindications and adverse drug reactions. The ultimate goal of this work is to provide an automatic approach that helps these companies to write drug package leaflets in easy-to-understand language. Natural language processing has become a powerful tool for improving patient care and advancing medicine because it enables automatic processing of the large amount of unstructured information needed for patient care. However, to the best of our knowledge, no research has been done on the automatic simplification of drug package leaflets. In a previous work, we proposed to use domain terminological resources for gathering a set of synonyms for a given target term. A potential drawback of this approach is that it depends heavily on the existence of dictionaries; however, these are not available for every domain and language or, if they exist, their coverage is very limited. This work was supported by the Research Program of the Ministry of Economy and Competitiveness - Government of Spain, (eGovernAbility-Access project TIN2014-52665-C2-2-R).

    Computational Advances in Drug Safety: Systematic and Mapping Review of Knowledge Engineering Based Approaches

    Drug Safety (DS) is a domain with significant public health and social impact. Knowledge Engineering (KE) is the Computer Science discipline elaborating on methods and tools for developing “knowledge-intensive” systems, depending on a conceptual “knowledge” schema and some kind of “reasoning” process. The present systematic and mapping review aims to investigate KE-based approaches employed for DS and highlight the introduced added value as well as trends and possible gaps in the domain. Journal articles published between 2006 and 2017 were retrieved from PubMed/MEDLINE and Web of Science® (873 in total) and filtered based on a comprehensive set of inclusion/exclusion criteria. The 80 finally selected articles were reviewed on full-text, while the mapping process relied on a set of concrete criteria (concerning specific KE and DS core activities, special DS topics, employed data sources, reference ontologies/terminologies, and computational methods, etc.). The analysis results are publicly available as online interactive analytics graphs. The review clearly depicted increased use of KE approaches for DS. The collected data illustrate the use of KE for various DS aspects, such as Adverse Drug Event (ADE) information collection, detection, and assessment. Moreover, the quantified analysis of using KE for the respective DS core activities highlighted room for intensifying research on KE for ADE monitoring, prevention and reporting. Finally, the assessed use of the various data sources for DS special topics demonstrated extensive use of dominant data sources for DS surveillance, i.e., Spontaneous Reporting Systems, but also increasing interest in the use of emerging data sources, e.g., observational healthcare databases, biochemical/genetic databases, and social media. 
Various exemplar applications were identified with promising results, e.g., improvement in Adverse Drug Reaction (ADR) prediction, detection of drug interactions, and novel ADE profiles related to specific mechanisms of action. Nevertheless, since the reviewed studies mostly concerned proof-of-concept implementations, more intense research is required to increase the maturity level that is necessary for KE approaches to reach routine DS practice. In conclusion, we argue that efficiently addressing DS data analytics and management challenges requires the introduction of high-throughput KE-based methods for effective knowledge discovery and management, ultimately resulting in the establishment of a continuous learning DS system.

    Hyvinvoinnin muotoilu (Designing for Wellbeing)

    Designing for Wellbeing consists of 12 projects which represent actual services or processes in the cities of Helsinki, Espoo, Kauniainen and Lahti. The projects address different dimensions of wellbeing, focusing in particular on municipal wellbeing services and patient-centered health care solutions. Designing for Wellbeing highlights new working methods in design, such as service design and the opportunities it provides for municipal decision-makers and the general public using the services. The projects are aimed at finding ways of encouraging people to adopt healthier lifestyles and helping designers and municipal decision-makers to design more pleasant and healthier environments. Examples of the services include redesigning the Villa Breda service home for the elderly in Kauniainen to include cultural services and social events for today’s active retirees, developing the environments and practices in psychiatric care units in Helsinki, reinventing the suburban neighborhoods in Helsinki and Lahti, designing better online services for basic health care and creating smoke-free public environments.

    Abstractive multi-document summarization - paraphrasing and compressing with neural networks

    This thesis presents studies in neural text summarization for single and multiple documents. The focus is on using sentence paraphrasing and compression to generate fluent summaries, especially in multi-document summarization, where data is scarce. A novel solution is to use transfer learning from downstream tasks with an abundance of data. For this purpose, we pre-train three models, one for each of extractive summarization, paraphrase generation and sentence compression. We find that summarization datasets – CNN/DM and NEWSROOM – contain a number of noisy samples. Hence, we present a method for automatically filtering out this noise. We combine the representational power of the GRU-RNN and TRANSFORMER encoders in our paraphrase generation model. In training our sentence compression model, we investigate the impact of using different early-stopping criteria, such as embedding-based cosine similarity and F1. We utilize the pre-trained models (ours, GPT2 and T5) in different settings for single and multi-document summarization. SGS Tuition Award; Alberta Innovates Technology Futures (AITF)
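
As a rough illustration of what an embedding-based cosine similarity early-stopping criterion might look like, here is a minimal sketch. It is not the thesis's implementation: averaging word vectors to represent a sentence, the patience rule in `should_stop`, the helper names, and the toy two-dimensional embeddings are all assumptions.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; defined here as 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_vector(tokens: List[str], emb: Dict[str, List[float]], dim: int) -> List[float]:
    """Average the vectors of known tokens (zero vector if none are known)."""
    known = [emb[t] for t in tokens if t in emb]
    if not known:
        return [0.0] * dim
    return [sum(column) / len(known) for column in zip(*known)]


def should_stop(scores: List[float], patience: int) -> bool:
    """True once the best validation score is at least `patience` epochs old."""
    if not scores:
        return False
    best_epoch = max(range(len(scores)), key=scores.__getitem__)
    return len(scores) - 1 - best_epoch >= patience


# Toy, made-up embeddings; a real setup would load pre-trained vectors.
EMB = {"the": [0.1, 0.3], "drug": [0.9, 0.2], "medicine": [0.8, 0.3], "works": [0.2, 0.9]}

reference = "the drug works".split()
compressed = "the medicine works".split()
score = cosine(mean_vector(reference, EMB, 2), mean_vector(compressed, EMB, 2))
print(f"validation similarity: {score:.3f}")

# Validation similarity per epoch (made-up numbers): the best epoch is the
# second one, and two epochs have passed without improvement.
print(should_stop([0.50, 0.60, 0.59, 0.58], patience=2))
```

Swapping the similarity score for a validation F1 would leave `should_stop` unchanged, which is what makes it easy to compare stopping criteria against each other.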

    Handbook of Easy Languages in Europe

    The Handbook of Easy Languages in Europe describes what Easy Language is and how it is used in European countries. It demonstrates the great diversity of actors, instruments and outcomes related to Easy Language throughout Europe. All people, despite their limitations, have an equal right to information, inclusion, and social participation. This results in requirements for understandable language. The notion of Easy Language refers to modified forms of standard languages that aim to facilitate reading and language comprehension. This handbook describes the historical background, the principles and the practices of Easy Language in 21 European countries. Its topics include terminological definitions, legal status, stakeholders, target groups, guidelines, practical outcomes, education, research, and a reflection on future perspectives related to Easy Language in each country. Written in an academic yet interesting and understandable style, this Handbook of Easy Languages in Europe aims to find a wide audience.