156 research outputs found

    Investigating Citation Linkage Between Research Articles

    In recent years, there has been a dramatic increase in scientific publications across the globe. To help navigate this overabundance of information, methods have been devised to find papers with related content, but they are lacking in the ability to provide specific information that a researcher may need without having to read hundreds of linked papers. The search and browsing capabilities of online domain specific scientific repositories are limited to finding a paper citing other papers, but do not point to the specific text that is being cited. Providing this capability to the research community will be beneficial in terms of the time required to acquire the amount of background information they need to undertake their research. In this thesis, we present our effort to develop a citation linkage framework for finding those sentences in a cited article that are the focus of a citation in a citing paper. This undertaking has involved the construction of datasets and corpora that are required to build models for focused information extraction, text classification and information retrieval. As the first part of this thesis, two preprocessing steps that are deemed to assist with the citation linkage task are explored: method mention extraction and rhetorical categorization of scientific discourse. In the second part of this thesis, two methodologies for achieving the citation linkage goal are investigated. Firstly, regression techniques have been used to predict the degree of similarity between citation sentences and their equivalent target sentences with medium Pearson correlation score between predicted and expected values. The resulting learning models are then used to rank sentences in the cited paper based on their predicted scores. Secondly, search engine-like retrieval techniques have been used to rank sentences in the cited paper based on the words contained in the citation sentence. 
Our experiments show that it is possible to find the set of sentences that a citation refers to in a cited paper with reasonable performance. Possible applications of this work include the creation of better navigation tools for scientific paper repositories, the development of scientific argumentation across research articles, and multi-document summarization of scientific articles.
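The retrieval-style approach mentioned in this abstract, ranking sentences of the cited paper by the words they share with the citation sentence, can be illustrated with a minimal sketch. It assumes plain TF-IDF weighting with cosine similarity; the abstract does not say which retrieval model the thesis used, and the function names `tokenize` and `rank_sentences` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_sentences(citation_sentence, cited_sentences):
    """Rank sentences of the cited paper against one citation sentence.

    Returns (sentence_index, score) pairs, best match first.
    Assumption: TF-IDF with cosine similarity stands in for whatever
    retrieval model the thesis actually used.
    """
    docs = [Counter(tokenize(s)) for s in cited_sentences]
    n_docs = len(docs)

    # Document frequency: in how many candidate sentences each term occurs.
    df = Counter()
    for doc in docs:
        df.update(doc.keys())

    def idf(term):
        # Smoothed inverse document frequency, always positive.
        return math.log((n_docs + 1) / (df[term] + 1)) + 1.0

    def weight(counts):
        return {term: count * idf(term) for term, count in counts.items()}

    def cosine(a, b):
        dot = sum(w * b.get(term, 0.0) for term, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        if norm_a == 0.0 or norm_b == 0.0:
            return 0.0
        return dot / (norm_a * norm_b)

    query = weight(Counter(tokenize(citation_sentence)))
    scored = [(cosine(query, weight(doc)), idx) for idx, doc in enumerate(docs)]
    # Highest score first; ties keep the original sentence order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(idx, score) for score, idx in scored]


cited = [
    "We trained a classifier on gene expression data.",
    "The regression model predicts sentence similarity scores.",
    "Funding was provided by the council.",
]
ranking = rank_sentences(
    "They use regression to predict similarity between sentences.", cited
)
print(ranking[0][0])  # index of the best-matching sentence in `cited`
```

No stemming or stop-word removal is applied here, so "sentence" and "sentences" count as different terms; a real system would likely normalise them.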

    Adverse Drug Event Detection, Causality Inference, Patient Communication and Translational Research

    Adverse drug events (ADEs) are injuries resulting from a medical intervention related to a drug. ADEs are responsible for nearly 20% of all the adverse events that occur in hospitalized patients. ADEs have been shown to increase the cost of health care and the length of hospital stays. Therefore, detecting and preventing ADEs for pharmacovigilance is an important task that can improve the quality of health care and reduce costs in a hospital setting. In this dissertation, we focus on the development of ADEtector, a system that identifies ADEs and medication information from electronic medical records and FDA Adverse Event Reporting System reports. The ADEtector system employs novel natural language processing approaches for ADE detection and provides a user interface to display ADE information. ADEtector employs machine learning techniques to automatically process the narrative text and identify the adverse event (AE) and medication entities that appear in it. The system analyzes the recognized entities to infer the causal relation between AEs and medications by automating the elements of the Naranjo score using knowledge-based and rule-based approaches. The Naranjo Adverse Drug Reaction Probability Scale is a validated tool for assessing the causality of a drug-induced adverse event. The scale calculates the likelihood that an adverse event is related to a drug based on a list of weighted questions. ADEtector also presents the user with evidence for ADEs by extracting figures that contain ADE-related information from biomedical literature. A brief summary is generated for each extracted figure to help users better comprehend it, which further helps users understand the ADE information.
ADEtector also helps patients better understand the narrative text by recognizing complex medical jargon and abbreviations that appear in the text and providing definitions and explanations for them from external knowledge resources. The system could help clinicians and researchers discover novel ADEs and drug relations and formulate new research questions within the ADE domain.
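The final step of a Naranjo-style assessment, turning the answers to the weighted questions into a causality category, can be sketched as follows. This is not ADEtector's implementation. It assumes the per-question scores have already been determined (for example by the rule-based components the abstract mentions) and uses the commonly published interpretation bands of the scale: 9 or more is definite, 5 to 8 probable, 1 to 4 possible, 0 or less doubtful. The function name `naranjo_category` is hypothetical.

```python
def naranjo_category(question_scores):
    """Sum ten per-question Naranjo scores and map the total to a category.

    Assumption: `question_scores` holds the already-weighted score for each
    of the ten questionnaire items; deriving those scores from clinical text
    is the hard part and is out of scope for this sketch.
    """
    if len(question_scores) != 10:
        raise ValueError("the Naranjo scale has exactly ten questions")

    total = sum(question_scores)
    if total >= 9:
        category = "definite"
    elif total >= 5:
        category = "probable"
    elif total >= 1:
        category = "possible"
    else:
        category = "doubtful"
    return total, category


# Illustrative answers only, not taken from any real case.
print(naranjo_category([1, 2, 1, 0, 0, 0, 0, 0, 1, 1]))
```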

    Text Mining Biomedical Literature for Genomic Knowledge Discovery

    The last decade has been marked by unprecedented growth in both the production of biomedical data and the amount of published literature discussing it. Almost every known or postulated piece of information pertaining to genes, proteins, and their role in biological processes is reported somewhere in the vast amount of published biomedical literature. We believe the ability to rapidly survey and analyze this literature and extract pertinent information constitutes a necessary step toward both the design and the interpretation of any large-scale experiment. Moreover, automated literature mining offers a yet untapped opportunity to integrate many fragments of information gathered by researchers from multiple fields of expertise into a complete picture exposing the interrelated roles of various genes, proteins, and chemical reactions in cells and organisms. In this thesis, we show that functional keywords in biomedical literature, particularly Medline, represent very valuable information and can be used to discover new genomic knowledge. To validate our claim, we present an investigation into text mining biomedical literature to assist microarray data analysis, yeast gene function classification, and biomedical literature categorization. We conduct the following studies: 1. We test sets of genes to discover common functional keywords among them and use these keywords to cluster them into groups; 2. We show that it is possible to link genes to diseases through expert human interpretation of the functional keywords for the genes; none of these diseases are as yet mentioned in public databases; 3. By clustering genes based on commonality of functional keywords, it is possible to group genes into meaningful clusters that reveal more information about their functions, links to diseases, and roles in metabolic pathways; 4. Using extracted functional keywords, we are able to demonstrate that, for yeast genes, we can make a better functional grouping of genes than is available in public microarray and phylogenetic databases; 5. We show an application of our approach to literature classification. Using functional keywords as features, we are able to extract epidemiological abstracts automatically from Medline with higher sensitivity and accuracy than a human expert.
Ph.D. Committee Chair: Shamkant B. Navathe; Committee Co-Chair: Brian J. Ciliax; Committee Member: Ashwin Ram; Committee Member: Edward Omiecinski; Committee Member: Ray Dingledine; Committee Member: Venu Dasig

    Smart workplaces: a system proposal for stress management

    Over the past decades, workplaces have become the primary source of many health issues, leading to mental problems such as stress, depression, and anxiety. Among other factors, environmental aspects have been shown to cause stress, illness, and lack of productivity. With the arrival of new technologies, especially in the smart workplaces field, most studies have focused on investigating building energy efficiency models and human thermal comfort. However, little has been applied to occupants’ stress recognition and overall well-being. This study therefore aims to propose a stress management solution in the form of an interactive design system that adapts environmental conditions to user preferences by measuring environmental and biological characteristics in real time, thereby helping to prevent stress and enabling users to cope with stress when it occurs. The secondary objective focuses on evaluating one part of the system: the mobile application. The proposed system follows a user-centered design approach and uses several usability methods to identify users’ needs, behavior, and expectations. The applied methods, such as User Research, Card Sorting, and Expert Review, allowed us to evaluate the design system according to Heuristic Analysis, resulting in improved usability of the interfaces and a better user experience. The study presents the research results, the design interface, and the usability tests. According to the User Research results, temperature and noise are the most common environmental stressors among users, causing stress and uncomfortable working conditions, and users prefer physical activities over digital solutions for coping with stress. Additionally, the System Usability Scale (SUS) results rated the system’s usability as “excellent” and “acceptable”, with a final score of 88 points out of 100.
It is expected that these conclusions can contribute to future investigations in the smart workplaces field of study and into the interaction between such workplaces and the people in them.
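For context on the reported SUS score of 88 out of 100, the standard scoring rule for a single respondent's ten-item questionnaire can be sketched as below. The function name `sus_score` is hypothetical, and the abstract does not say how individual responses were aggregated into the final figure; averaging per-respondent scores is the usual practice.

```python
def sus_score(responses):
    """Compute the System Usability Scale score for one respondent.

    `responses` holds ten answers on a 1-5 scale, in questionnaire order.
    Odd-numbered items contribute (answer - 1), even-numbered items
    contribute (5 - answer); the sum is scaled by 2.5 to a 0-100 range.
    """
    if len(responses) != 10:
        raise ValueError("SUS has exactly ten items")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each answer must be an integer from 1 to 5")

    total = 0
    for item_number, answer in enumerate(responses, start=1):
        if item_number % 2 == 1:
            total += answer - 1  # positively worded item
        else:
            total += 5 - answer  # negatively worded item
    return total * 2.5


# Illustrative answers only, not data from the study.
print(sus_score([5, 1, 5, 2, 4, 1, 5, 1, 4, 2]))
```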

    Individualised, interpretable and reproducible computer-aided diagnosis of dementia: towards application in clinical practice

    Neuroimaging offers an unmatched description of the brain’s structure and physiology, but the information it provides is not easy to extract and interpret. A popular way to extract meaningful information from brain images is to use computational methods based on machine learning and deep learning to predict the current or future diagnosis of a patient. A large number of these approaches have been dedicated to the computer-aided diagnosis of dementia, and more specifically of Alzheimer's disease. However, only a few have been translated to the clinic. This can be explained by several factors, such as the lack of rigorous validation of these approaches, which leads to over-optimistic performance and poor reproducibility, but also by the limited interpretability of these methods and their limited generalisability when moving from highly controlled research data to routine clinical data. This manuscript describes how we tried to address these limitations. We have proposed reproducible frameworks for the evaluation of Alzheimer's disease classification methods and developed two open-source software platforms for clinical neuroimaging studies (Clinica) and neuroimaging processing with deep learning (ClinicaDL). We have implemented and assessed the robustness of a visualisation method aiming to interpret convolutional neural networks and used it to study the stability of the network training. We concluded that, currently, combining a convolutional neural network classifier with an interpretability method may not constitute a robust tool for individual computer-aided diagnosis. As an alternative, we have proposed an approach that detects anomalies in the brain by generating what would be the healthy version of a patient's image and comparing this healthy version with the real image.
Finally, we have studied the performance of machine and deep learning algorithms for the computer-aided diagnosis of dementia from images acquired in clinical routine.

    Towards Accurate Forecasting of Epileptic Seizures: Artificial Intelligence and Effective Connectivity Findings

    Epilepsy is a chronic condition characterized by recurrent “unpredictable” seizures. While the first line of treatment consists of long-term drug therapy, about one-third of patients are said to be pharmacoresistant.
In addition, recourse to epilepsy surgery remains low in part due to persisting negative attitudes towards resective surgery, fear of complications and only moderate success rates. An important direction of research is to investigate the possibility of predicting seizures which, if achieved, can lead to novel interventional avenues. The paucity of intracranial electroencephalography (iEEG) recordings, the limited number of ictal events, and the short duration of interictal periods have been important obstacles for an adequate assessment of seizure forecasting. More recently, long-term continuous bilateral iEEG recordings acquired from dogs with naturally occurring focal epilepsy, using the implantable NeuroVista ambulatory monitoring device have been made available on line for the benefit of researchers. Still, an important limitation of these recordings for seizure-prediction studies was that the seizure onset zone was not disclosed/available. The first objective of this thesis was to develop an accurate seizure forecasting algorithm based on these canine ambulatory iEEG recordings. Main contributions include a quantitative, directed transfer function (DTF)-based, localization of the seizure onset zone (electrode selection), a new fitness function for the proposed genetic algorithm (feature selection), and a quasi-prospective assessment of seizure forecasting on long-term continuous iEEG recordings (total of 893 testing days). Results showed performance improvement compared to previous studies, achieving an average sensitivity of 84.82% and a time in warning of 10 %. The DTF has been previously validated for quantifying causal relations when quasistationarity requirements are met. Although such requirements can be fulfilled in the case of canine recordings due to the relatively low number of channels (objective 1), the identification of stationary segments would be more challenging in the case of high density iEEG recordings. 
To cope with non-stationarity issues, the spectrum weighted adaptive directed transfer function (swADTF) was recently introduced as a time-varying version of the DTF. The second objective of this thesis was to validate the feasibility of identifying sources and sinks of seizure activity based on the swADTF using high-density iEEG recordings of patients admitted for pre-surgical monitoring at the CHUM. Generators of seizure activity were within the resected volume for patients with good post-surgical outcomes, whereas different or additional seizure foci were identified in patients with poor post-surgical outcomes. Results confirmed the possibility of accurate identification of seizure origin and propagation by means of swADTF paving the way for its use in seizure prediction algorithms by allowing a more tailored electrode selection. Finally, in an attempt to explore new avenues for seizure forecasting, we proposed a new precursor of seizure activity by combining higher order spectral analysis and artificial neural networks (objective 3). Results showed statistically significant differences (p<0.05) between preictal and interictal states using all the bispectrum-extracted features. Normalized bispectral entropy, normalized squared entropy and mean of magnitude, when employed as inputs to a multi-layer perceptron classifier, achieved held-out test accuracies of 78.11%, 72.64%, and 73.26%, respectively. Results of this thesis confirm the feasibility of seizure forecasting based on iEEG recordings; the transition into the ictal state is not random and consists of a “build-up”, leading to seizures. However, additional efforts in terms of electrode selection, feature extraction, hardware and deep learning implementation, are required before the translation of current approaches into commercial devices.
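The two forecasting metrics quoted in this abstract, sensitivity and time in warning, can be illustrated with a simplified sketch. It assumes a seizure counts as predicted when its onset falls inside a warning interval and that warning intervals do not overlap; the thesis may use stricter definitions (for example a minimum prediction horizon) that the abstract does not state. The function name `forecast_metrics` and the example numbers are hypothetical.

```python
def forecast_metrics(warnings, seizure_onsets, total_duration):
    """Return (sensitivity, time_in_warning) for a seizure forecaster.

    warnings       -- non-overlapping (start, end) warning intervals
    seizure_onsets -- seizure onset times, in the same time unit
    total_duration -- length of the whole recording, in the same unit

    Simplifying assumption: a seizure is "predicted" if its onset lies
    inside any warning interval.
    """
    if total_duration <= 0:
        raise ValueError("total_duration must be positive")

    predicted = sum(
        1
        for onset in seizure_onsets
        if any(start <= onset < end for start, end in warnings)
    )
    # Sensitivity is undefined when the recording contains no seizures.
    sensitivity = (
        predicted / len(seizure_onsets) if seizure_onsets else float("nan")
    )
    time_in_warning = sum(end - start for start, end in warnings) / total_duration
    return sensitivity, time_in_warning


# Illustrative numbers only: two warnings of 10 hours in a 200-hour recording.
sens, tiw = forecast_metrics([(10, 20), (50, 60)], [15, 40, 55, 90], 200)
print(sens, tiw)
```

A forecaster that warns all the time trivially reaches full sensitivity, which is why the two numbers are reported together.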

    Sustainable Agriculture and Advances of Remote Sensing (Volume 2)

    Agriculture, as the main source of food and the most important economic activity globally, is being affected by the impacts of climate change. To maintain and increase global food production, reduce biodiversity loss, and preserve natural ecosystems, new practices and technologies are required. This book focuses on the latest advances in remote sensing technology and agricultural engineering that lead to sustainable agricultural practices. Earth observation data and in situ and proxy remote sensing data are the main sources of information for monitoring and analyzing agricultural activities. Particular attention is given to earth observation satellites and the Internet of Things for data collection, to multispectral and hyperspectral data analysis using machine learning and deep learning, and to WebGIS and the Internet of Things for sharing and publishing the results, among others.

    Smart Sensing Technologies for Personalised Coaching

    People living in both developed and developing countries face serious health challenges related to sedentary lifestyles. It is therefore essential to find new ways to improve health so that people can live longer and can age well. With an ever-growing number of smart sensing systems developed and deployed across the globe, experts are primed to help coach people toward healthier behaviors. The increasing accountability associated with app- and device-based behavior tracking not only provides timely and personalized information and support but also gives us an incentive to set goals and to do more. This book presents some of the recent efforts made towards automatic and autonomous identification and coaching of troublesome behaviors to procure lasting, beneficial behavioral changes