3,645 research outputs found

    Adverse Drug Event Detection, Causality Inference, Patient Communication and Translational Research

    Adverse drug events (ADEs) are injuries resulting from a medical intervention related to a drug. ADEs are responsible for nearly 20% of all the adverse events that occur in hospitalized patients. ADEs have been shown to increase the cost of health care and the length of hospital stays. Therefore, detecting and preventing ADEs for pharmacovigilance is an important task that can improve the quality of health care and reduce costs in a hospital setting. In this dissertation, we focus on the development of ADEtector, a system that identifies ADEs and medication information from electronic medical records and FDA Adverse Event Reporting System reports. The ADEtector system employs novel natural language processing approaches for ADE detection and provides a user interface to display ADE information. ADEtector employs machine learning techniques to automatically process the narrative text and identify the adverse event (AE) and medication entities that appear in it. The system analyzes the recognized entities to infer the causal relation between AEs and medications by automating the elements of the Naranjo score using knowledge- and rule-based approaches. The Naranjo Adverse Drug Reaction Probability Scale is a validated tool for assessing the causality of a drug-induced adverse event, or ADE. The scale calculates the likelihood that an adverse event is related to a drug based on a list of weighted questions. ADEtector also presents the user with evidence for ADEs by extracting figures that contain ADE-related information from biomedical literature. A brief summary is generated for each extracted figure to help users comprehend it and better understand the ADE information.
ADEtector also helps patients better understand the narrative text by recognizing complex medical jargon and abbreviations that appear in the text and providing definitions and explanations for them from external knowledge resources. This system could help clinicians and researchers discover novel ADEs and drug relations and hypothesize new research questions within the ADE domain.
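The weighted-question scoring that the abstract attributes to the Naranjo scale can be sketched as follows. This is an illustrative sketch only, not ADEtector's implementation: the weights and category cut-offs are those of the Naranjo questionnaire as it is commonly tabulated, the question wording is abbreviated, and the function names and answer encoding ("yes", "no", "unknown") are assumptions made for this example.

```python
# Illustrative sketch of Naranjo scoring; not the ADEtector implementation.
# Each entry: (abbreviated question, weight for "yes", weight for "no").
# An "unknown" answer always contributes 0.
NARANJO_WEIGHTS = [
    ("previous conclusive reports on this reaction", 1, 0),
    ("event appeared after the suspected drug was given", 2, -1),
    ("reaction improved on discontinuation or antagonist", 1, 0),
    ("reaction reappeared on readministration", 2, -1),
    ("alternative causes could explain the reaction", -1, 2),
    ("reaction reappeared when placebo was given", -1, 1),
    ("drug detected at toxic concentrations", 1, 0),
    ("severity changed with dose", 1, 0),
    ("similar reaction to same or similar drug before", 1, 0),
    ("event confirmed by objective evidence", 1, 0),
]


def naranjo_score(answers):
    """Sum the weights for a list of ten answers ("yes", "no", "unknown")."""
    if len(answers) != len(NARANJO_WEIGHTS):
        raise ValueError("expected one answer per question")
    total = 0
    for (_question, yes_weight, no_weight), answer in zip(NARANJO_WEIGHTS, answers):
        if answer == "yes":
            total += yes_weight
        elif answer == "no":
            total += no_weight
        elif answer != "unknown":
            raise ValueError(f"unexpected answer: {answer!r}")
    return total


def naranjo_category(score):
    """Map a total score to the usual causality category."""
    if score >= 9:
        return "definite"
    if score >= 5:
        return "probable"
    if score >= 1:
        return "possible"
    return "doubtful"


example_answers = ["yes", "yes", "yes", "unknown", "no",
                   "unknown", "unknown", "unknown", "unknown", "yes"]
example_score = naranjo_score(example_answers)
example_category = naranjo_category(example_score)
```

In an automated setting such as the one the abstract describes, each answer would presumably be filled in from extracted entities and knowledge sources rather than by hand; that step is outside this sketch.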

    Identification and analysis of health states in Twitter messages (Identificação e análise de estados de saúde em mensagens do Twitter)

    Social media has become widely used all over the world for its ability to connect people from different countries and create global communities. One of the most prominent social media platforms is Twitter, where users can share text segments with a maximum length of 280 characters. Due to the nature of the platform, it generates very large amounts of text data about its users’ lives. This data can be used to extract health information about a segment of the population for the purpose of public health surveillance. The Social Media Mining for Health Shared Task is a challenge that encompasses many Natural Language Processing tasks related to the use of social media data for health research purposes. This dissertation describes the approach I used in my participation in task 1 of the Shared Task. This task was divided into three subtasks. Subtask 1a consisted of classifying tweets regarding the presence of Adverse Drug Events (ADEs). Subtask 1b was a Named Entity Recognition task that aimed at detecting ADE spans in tweets. Subtask 1c was a normalization task that sought to match an ADE mention to a Medical Dictionary for Regulatory Activities (MedDRA) preferred term ID. To find the best approach for each subtask, I ran many experiments with different models and techniques to identify the ones best suited to each. To solve these subtasks, I used transformer-based models as well as other techniques that address the challenges present in each of the subtasks. The best-performing approach for subtask 1a was a BERTweet large model trained with an augmented training set. As for subtask 1b, the best results were obtained through a RoBERTa large model with oversampled training data. 
Regarding subtask 1c, I used a RoBERTa base model trained with data from an additional dataset beyond the one made available by the shared task organizers. The systems used for subtasks 1a and 1b both achieved state-of-the-art performance; however, the approach for the third subtask was not able to achieve favorable results. The system used in subtask 1a achieved an F1 score of 0.698, the one used in subtask 1b achieved a relaxed F1 score of 0.661, and the one used in the final subtask achieved a relaxed F1 score of 0.116.
Social networks have become widely used throughout the world, connecting people from different countries and creating global communities. Twitter, one of the most popular social networks, lets its users share short text segments of at most 280 characters. This sharing generates an enormous amount of data about its users, which can be analyzed from multiple perspectives. For example, the data can be used to extract health information about a segment of the population for public health surveillance. The aim of this work was to investigate and develop technical solutions for participating in the “Social Media Mining for Health Shared Task” (#SMM4H), a challenge comprising several natural language processing tasks related to the use of social media data for health research. The work involved developing transformer-based models and other related techniques for participation in task 1 of this challenge, which is in turn divided into three subtasks: 1a) classification of tweets according to the presence or absence of adverse drug events (ADE); 1b) entity recognition aimed at detecting ADE mentions; 1c) a normalization task aimed at linking ADE mentions to the corresponding MedDRA (“Medical Dictionary for Regulatory Activities”) term. 
The best-performing approach for task 1a was a BERTweet large model trained with data generated through a data augmentation process. For task 1b, the best results were obtained using a RoBERTa large model with oversampled training data. For task 1c, a RoBERTa base model was used, trained with additional data from an external dataset. The approach used for the third task did not achieve relevant results (F1 of 0.12), whereas the systems developed for the first two achieved results on par with the best in the challenge (F1 of 0.69 and 0.66, respectively). Master's in Informatics Engineering
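The abstract reports strict and "relaxed" F1 scores without defining them. As a hedged illustration, the sketch below assumes that a relaxed match means a predicted span overlaps a gold span by at least one character, which is a common convention for span extraction but is not stated in the abstract. The span encoding as `(start, end)` tuples with an exclusive end, and all function names, are hypothetical.

```python
def spans_overlap(a, b):
    """True if half-open character spans a and b share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


def relaxed_f1(gold_spans, predicted_spans):
    """F1 where a prediction counts as correct if it overlaps any gold span.

    Assumption for this sketch: precision is computed over predicted spans
    and recall over gold spans, each using the overlap criterion.
    """
    if not gold_spans or not predicted_spans:
        return 0.0
    matched_predictions = sum(
        any(spans_overlap(p, g) for g in gold_spans) for p in predicted_spans
    )
    matched_gold = sum(
        any(spans_overlap(g, p) for p in predicted_spans) for g in gold_spans
    )
    precision = matched_predictions / len(predicted_spans)
    recall = matched_gold / len(gold_spans)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# One prediction partially overlaps the first gold span; the other is spurious.
gold = [(0, 5), (10, 20)]
predicted = [(3, 8), (30, 35)]
example_f1 = relaxed_f1(gold, predicted)
```

Under a strict criterion, the partially overlapping prediction `(3, 8)` would count as an error, which is why relaxed scores are never lower than strict ones for the same output.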

    Multi-task Learning for Personal Health Mention Detection on Social Media

    Detecting personal health mentions on social media is essential to complement existing health surveillance systems. However, annotating data for detecting health mentions at a large scale is a challenging task. This research employs a multitask learning framework to leverage available annotated data from a related task to improve the performance on the main task to detect personal health experiences mentioned in social media texts. Specifically, we focus on incorporating emotional information into our target task by using emotion detection as an auxiliary task. Our approach significantly improves a wide range of personal health mention detection tasks compared to a strong state-of-the-art baseline. Comment: 5 pages

    When Silver Is As Good As Gold: Using Weak Supervision to Train Machine Learning Models on Social Media Data

    Over the last decade, advances in machine learning have led to an exponential growth in artificial intelligence, i.e., machine learning models capable of learning from vast amounts of data to perform several tasks such as text classification, regression, machine translation, speech recognition, and many others. While massive volumes of data are available, due to the manual curation process involved in the generation of training datasets, only a fraction of the data is used to train machine learning models. The process of labeling data with a ground-truth value is extremely tedious and expensive, and is the major bottleneck of supervised learning. To alleviate this, the theory of noisy learning can be employed, where data labeled through heuristics, knowledge bases and weak classifiers can be utilized for training instead of data obtained through manual annotation. The assumption here is that a large volume of training data, which contains noise and is acquired through an automated process, can compensate for the lack of manual labels. In this study, we utilize heuristic-based approaches to create noisy silver standard datasets. We extensively tested the theory of noisy learning on four different applications by training several machine learning models using the silver standard dataset with several sample sizes and class imbalances, and tested the performance using a gold standard dataset. Our evaluations on the four applications indicate the success of silver standard datasets in identifying a gold standard dataset. We conclude the study with evidence that noisy social media data can be utilized for weak supervision.
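As a minimal sketch of the heuristic labeling idea described above: a simple rule assigns noisy "silver" labels to unlabeled posts, and a small manually labeled "gold" set is used only to check how well the rule agrees with human annotation. The regular expression, the example texts and all names below are hypothetical; the abstract does not specify which heuristics the study actually used.

```python
import re

# Hypothetical heuristic: a first-person pronoun followed later by a
# medication-intake verb suggests a personal medication mention.
PERSONAL_INTAKE = re.compile(
    r"\b(i|my)\b.*\b(took|taking|prescribed)\b", re.IGNORECASE
)


def silver_label(text):
    """Return 1 if the heuristic fires on the text, else 0 (noisy label)."""
    return 1 if PERSONAL_INTAKE.search(text) else 0


def build_silver_dataset(texts):
    """Pair each unlabeled text with its heuristic label."""
    return [(text, silver_label(text)) for text in texts]


def agreement(silver_labels, gold_labels):
    """Fraction of items where the heuristic label equals the manual label."""
    if len(silver_labels) != len(gold_labels):
        raise ValueError("label lists must have equal length")
    matches = sum(s == g for s, g in zip(silver_labels, gold_labels))
    return matches / len(gold_labels)


posts = [
    "I have been taking ibuprofen all week",
    "New study on ibuprofen released",
]
silver = build_silver_dataset(posts)
silver_only = [label for _text, label in silver]
gold_check = agreement(silver_only, [1, 0])
```

A model trained on the silver pairs would then be evaluated on the gold set, which is the comparison the abstract describes across its four applications.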

    QSAR model development for early stage screening of monoclonal antibody therapeutics to facilitate rapid developability

    PhD Thesis. Monoclonal antibodies (mAbs) and related therapeutics are highly desirable from a biopharmaceutical perspective as they are highly target specific and well tolerated within the human system. Nevertheless, several mAbs have been discontinued or withdrawn because they failed to demonstrate efficacy and/or caused adverse effects. With nearly 80% of drugs failing in clinical development, mainly due to lack of efficacy and safety, there is an urgent need for better understanding of biological activity, affinity, pharmacology, toxicity, immunogenicity etc., leading to early prediction of success or failure. In this study a hybrid modelling framework was developed that enabled early stage screening of mAbs. The applicability of the experimental methods was first tested on chemical compounds to assess the assay quality, following which they were used to assess potential off-target adverse effects of mAbs. Furthermore, hypersensitivity reactions were assessed using Skimune™, a non-artificial human skin explant-based assay for safety and efficacy assessment of novel compounds and drugs, developed by Alcyomics Ltd. The suitability of Skimune™ for assessing the immune-related adverse effects of aggregated mAbs was studied, where aggregation was induced using a heat stress protocol. The aggregates were characterised by protein analysis techniques such as analytical ultra-centrifugation, following which the immunogenicity was tested using the Skimune™ assay. Numerical features (descriptors) of mAbs were identified and generated using ProtDCal, EMBOSS Pepstat software as well as amino acid scales for different. Five independent and novel X block datasets consisting of these descriptors were generated based on the physicochemical, electronic, thermodynamic and topological properties of amino acids: Domain, Window, Substructure, Single Amino Acid, and Running Sum. 
This study describes the development of a hybrid QSAR-based model with a structured workflow and clear evaluation metrics, with several optimisation steps, that could be beneficial for broader and more generic PLS modelling. Based on the results and observations from this study, it was demonstrated that incremental improvement via selection of datasets and variables helps in further optimisation of these hybrid models. Furthermore, using hypersensitivity and cross reactivity as responses and physicochemical characteristics of mAbs as descriptors, the QSAR models generated for different applicability domains allow for rapid early stage screening and developability. These models were validated with an external test set comprising proprietary compounds from industrial partners, thus paving the way for enhanced developability that tackles manufacturing failures as well as attrition rates. European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie actions grant agreement

    Artificial intelligence applications in disease diagnosis and treatment: recent progress and outlook

    The use of computers and other technologies to replicate human-like intelligent behaviour and critical thinking is known as artificial intelligence (AI). The development of AI-assisted applications and big data research has accelerated as a result of the rapid advancements in computing power, sensor technology, and platform accessibility that have accompanied advances in artificial intelligence. AI models and algorithms for planning and diagnosing endodontic procedures. The search evaluated information on AI and its function in the field of endodontics, and it incorporated databases such as Google Scholar, PubMed, and Science Direct with the search criterion of original research articles published in English. Online appointment scheduling, online check-in at medical facilities, digitization of medical records, reminder calls for follow-up appointments and immunisation dates for children and pregnant women, as well as drug dosage algorithms and adverse effect warnings when prescribing multidrug combinations, are just a few of the tasks that already use artificial intelligence. Data from the review supported the conclusion that AI can play a significant role in endodontics, including the identification of apical lesions, classification and numbering of teeth, detection of dental caries, periodontitis, and periapical disease, diagnosis of various dental problems, aiding dentists in making referrals, and helping them develop more precise treatment plans for dental disorders. Although AI has the potential to drastically alter how medicine is practised in ways that were previously unthinkable, many of its practical applications are still in their infancy and need additional research and development. Over the past ten years, artificial intelligence in ophthalmology has grown significantly and will continue to do so as imaging techniques and data processing algorithms improve.

    Recognising Biomedical Names: Challenges and Solutions

    The growth rate in the number of biomedical documents is staggering. Unlocking information trapped in these documents can enable researchers and practitioners to operate confidently in the information world. Biomedical Named Entity Recognition (NER), the task of recognising biomedical names, is usually employed as the first step of the NLP pipeline. Standard NER models, based on the sequence tagging technique, are good at recognising short entity mentions in the generic domain. However, there are several open challenges in applying these models to recognise biomedical names: ● Biomedical names may contain complex inner structure (discontinuity and overlapping) which cannot be recognised using the standard sequence tagging technique; ● The training of NER models usually requires large amounts of labelled data, which are difficult to obtain in the biomedical domain; and, ● Commonly used language representation models are pre-trained on generic data; a domain shift therefore exists between these models and target biomedical data. To deal with these challenges, we explore several research directions and make the following contributions: (1) we propose a transition-based NER model which can recognise discontinuous mentions; (2) we develop a cost-effective approach that nominates suitable pre-training data; and, (3) we design several data augmentation methods for NER. Our contributions have obvious practical implications, especially when new biomedical applications are needed. Our proposed data augmentation methods can help the NER model achieve decent performance while requiring only a small amount of labelled data. Our investigation into selecting pre-training data can improve the model by incorporating language representation models that are pre-trained using in-domain data. Finally, our proposed transition-based NER model can further improve the performance by recognising discontinuous mentions.

    A Comprehensive Study of Gender Bias in Chemical Named Entity Recognition Models

    Objective. Chemical named entity recognition (NER) models have the potential to impact a wide range of downstream tasks, from identifying adverse drug reactions to general pharmacoepidemiology. However, it is unknown whether these models work the same for everyone. Performance disparities can potentially cause harm rather than the intended good. Hence, in this paper, we measure gender-related performance disparities of chemical NER systems. Materials and Methods. We develop a framework to measure gender bias in chemical NER models using synthetic data and a newly annotated dataset of over 92,405 words with self-identified gender information from Reddit. We applied and evaluated state-of-the-art biomedical NER models. Results. Our findings indicate that chemical NER models are biased. The results of the bias tests on the synthetic dataset and the real-world data reveal multiple fairness issues. For example, for synthetic data, we find that female-related names are generally classified as chemicals, particularly in datasets containing many brand names rather than standard ones. For both datasets, we find consistent fairness issues resulting in substantial performance disparities between female- and male-related data. Discussion. Our study highlights the issue of biases in chemical NER models. For example, we find that many systems cannot detect contraceptives (e.g., birth control). Conclusion. Chemical NER models are biased and can be harmful to female-related groups. Therefore, practitioners should carefully consider the potential biases of these models and take steps to mitigate them.

    Combination antiretroviral therapy-associated lipodystrophy: insights into pathogenesis and treatment

    Introduction: Combination antiretroviral therapy (cART) has decreased morbidity and mortality of individuals infected with human immunodeficiency virus type 1 (HIV-1). Its use, however, is associated with adverse effects which increase patients' risk of conditions such as diabetes and coronary heart disease. Perhaps the most stigmatizing side effect is lipodystrophy, i.e., the loss of subcutaneous adipose tissue (SAT) in the face, limbs and trunk while fat accumulates intra-abdominally and dorsocervically. The pathogenesis of cART-associated lipodystrophy is obscure. Nucleoside reverse transcriptase inhibitors (NRTI) have been implicated in causing lipoatrophy via mitochondrial toxicity. There is no known effective treatment for cART-associated lipodystrophy under an unchanged antiretroviral regimen in humans, but in vitro data have shown uridine to abrogate NRTI-induced toxicity in adipocytes. Aims: To investigate whether i) cART or lipodystrophy associated with its use affects arterial stiffness; ii) lipoatrophic SAT is inflamed compared to non-lipoatrophic SAT; iii) abdominal SAT from patients with cART-associated lipoatrophy differs from that of patients without it with respect to mitochondrial DNA (mtDNA) content, adipose tissue inflammation and gene expression, and whether the NRTIs stavudine and zidovudine are associated with different degrees of change; iv) lipoatrophic abdominal SAT differs from preserved dorsocervical SAT with respect to mtDNA content, adipose tissue inflammation and gene expression in patients with cART-associated lipodystrophy; and v) uridine can revert lipoatrophy and the associated metabolic disturbances in patients on stavudine- or zidovudine-based cART. Subjects and methods: 64 cART-treated patients with (n=45) and without lipodystrophy/-atrophy (n=19) were compared cross-sectionally. A marker of arterial stiffness, heart rate corrected augmentation index (AgIHR), was measured by pulse wave analysis. 
Body composition was measured by magnetic resonance imaging and dual-energy X-ray absorptiometry, and liver fat content by proton magnetic resonance spectroscopy. Gene expression and mtDNA content in SAT were assessed by real-time polymerase chain reaction and microarray. Adipose tissue composition and inflammation were assessed by histology and immunohistochemistry. Dorsocervical and abdominal SAT were studied. The efficacy and safety of uridine for the treatment of cART-associated lipoatrophy were evaluated in a randomized, double-blind, placebo-controlled 3-month trial in 20 lipoatrophic cART-treated patients. Results: Duration of antiretroviral treatment and cumulative exposure to NRTIs and protease inhibitors, but not the presence of cART-associated lipodystrophy, predicted AgIHR independent of age and blood pressure. Gene expression of inflammatory markers was increased in SAT of lipodystrophic as compared to non-lipodystrophic patients. Expression of genes involved in adipogenesis, triglyceride synthesis and glucose disposal was lower and of those involved in mitochondrial biogenesis, apoptosis and oxidative stress higher in SAT of patients with than without cART-associated lipoatrophy. Most changes were more pronounced in stavudine-treated than in zidovudine-treated individuals. Lipoatrophic SAT had lower mtDNA than SAT of non-lipoatrophic patients. Expression of inflammatory genes was lower in dorsocervical than in abdominal SAT. Neither depot had characteristics of brown adipose tissue. Despite being spared from lipoatrophy, dorsocervical SAT of lipodystrophic patients had lower mtDNA than the phenotypically similar corresponding depot of non-lipodystrophic patients. The greatest difference in gene expression between dorsocervical and abdominal SAT, irrespective of lipodystrophy status, was in expression of homeobox genes that regulate transcription and regionalization of organs during embryonal development. 
Uridine increased limb fat and its proportion of total fat, but had no effect on liver fat content and markers of insulin resistance. Conclusions: Long-term cART is associated with increased arterial stiffness and, thus, with higher cardiovascular risk. Lipoatrophic abdominal SAT is characterized by inflammation, apoptosis and mtDNA depletion. As mtDNA is depleted even in non-lipoatrophic dorsocervical SAT, lipoatrophy is unlikely to be caused directly by mtDNA depletion. Preserved dorsocervical SAT of patients with cART-associated lipodystrophy is less inflamed than their lipoatrophic abdominal SAT, and does not resemble brown adipose tissue. The greatest difference in gene expression between dorsocervical and abdominal SAT is in expression of transcriptional regulators, homeobox genes, which might explain the differential susceptibility of these adipose tissue depots to cART-induced toxicity. Uridine is able to increase peripheral SAT in lipoatrophic patients during unchanged cART.
Introduction: The drug combinations used to treat human immunodeficiency virus (HIV) have reduced morbidity and mortality among HIV-positive individuals. Combination therapy is, however, associated with serious side effects that increase patients' risk of developing, among other conditions, diabetes and coronary heart disease. One of the most stigmatizing side effects is lipodystrophy, i.e., the loss of subcutaneous adipose tissue (lipoatrophy) from the face, limbs and abdomen while excess fat accumulates in the abdominal cavity and the neck. The causes of the phenomenon are unclear. Several anti-HIV drugs are suspected of causing lipodystrophy by, among other mechanisms, destroying mitochondria, the cells' power plants. There is no effective treatment for lipodystrophy unless the HIV medication is changed, but uridine, for example, has shown promise in cell-model studies. 
Aims: To investigate whether combination therapy, or the lipodystrophy associated with its use, is accompanied by arterial stiffening; whether lipoatrophic adipose tissue is inflamed compared with non-lipoatrophic adipose tissue; whether lipoatrophic adipose tissue differs from non-lipoatrophic adipose tissue in, among other things, mitochondrial content and the expression of genes affecting metabolism; and whether the neck fat that is better preserved in lipodystrophy differs from the disappearing abdominal subcutaneous fat and whether it might be brown fat. In addition, we investigated whether uridine, used as a dietary supplement, can improve lipoatrophy and the associated metabolic disturbances, such as fatty liver and reduced insulin sensitivity. Methods: The studies included 64 HIV-positive patients on combination therapy, of whom 45 had and 19 had not developed medication-related adipose tissue changes. Arterial stiffness was examined by pulse wave analysis, body composition was measured by dual-energy X-ray absorptiometry and magnetic resonance imaging, and liver fat content by proton spectroscopy. Adipose tissue samples were taken from the patients' abdominal and neck subcutaneous fat, and the expression of various genes and the numbers of mitochondria and inflammatory cells were measured in them using, among other methods, DNA amplification and tissue section staining. The efficacy of uridine in treating lipoatrophy was evaluated in a 3-month randomized, placebo-controlled trial involving 20 HIV-positive lipoatrophic individuals on combination therapy. Results: The duration of HIV medication, but not lipodystrophy, predisposes to arterial stiffening independent of age and blood pressure. In lipoatrophic adipose tissue, the expression of inflammation-related genes and the number of inflammatory cells are increased, whereas mitochondrial content and the expression of genes related to adipocyte formation and function are decreased compared with non-lipoatrophic adipose tissue. 
The neck fat that is preserved or increases in lipodystrophy is less inflamed than the abdominal subcutaneous fat, which is lost more readily, and it is not brown fat. The neck fat of lipodystrophic individuals contains fewer mitochondria than the neck fat of non-lipodystrophic individuals, even though the tissues look alike. Subcutaneous fat of the neck and abdomen differs most in the expression of so-called homeobox genes, i.e., genes that determine the location and properties of tissues in early fetal development. Uridine increases the amount of subcutaneous fat in lipoatrophic patients but does not affect liver fat content or insulin sensitivity. Conclusions: Long-term use of drugs for treating HIV increases arterial stiffness and thus patients' risk of cardiovascular disease. Lipoatrophic fat is inflamed and its mitochondrial reserves are reduced. Because mitochondrial scarcity is also found in the neck fat of lipodystrophic individuals in whom that fat has been spared from atrophy, mitochondrial depletion cannot be regarded as a direct cause of lipoatrophy. The most significant difference between neck and abdominal subcutaneous fat is in genes that guide organ development, which may explain the tissues' differing susceptibility to drug side effects. Uridine is an effective treatment for lipodystrophy in HIV patients during unchanged combination therapy.