70 research outputs found

    Positive blood culture detection in time series data using a BiLSTM network

    The presence of bacteria or fungi in the bloodstream of patients is abnormal and can lead to life-threatening conditions. A computational model based on a bidirectional long short-term memory artificial neural network is explored to assist doctors in the intensive care unit in predicting whether examination of patients' blood cultures will return positive. As input, it uses nine monitored clinical parameters, presented as time series data, collected from 2177 ICU admissions at the Ghent University Hospital. Our main goal is to determine whether general machine learning methods and more specific temporal models can be used to create an early detection system. This preliminary research obtains an area of 71.95% under the precision-recall curve, demonstrating the potential of temporal neural networks in this context.
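The abstract above reports its result as the area under the precision-recall curve. As a minimal sketch of how such a figure can be computed, the function below uses the step-wise "average precision" estimator. This choice is an assumption, since the abstract does not say which estimator was used, and the name `average_precision` and the toy labels and scores are hypothetical.

```python
def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (average precision).

    labels: list of 0/1 ground-truth values (1 = positive blood culture).
    scores: list of model scores, higher meaning "more likely positive".
    Tied scores are not treated specially in this sketch.
    """
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("at least one positive label is required")

    # Rank the examples from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    true_positives = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_positives += 1
            # Precision at the point where recall increases by one step.
            precision_sum += true_positives / rank

    return precision_sum / total_positives


if __name__ == "__main__":
    # Toy data, not taken from the study.
    print(average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))
```

Unlike accuracy, this metric ignores true negatives, which makes it a common choice when positives are rare.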

    Smart aging : utilisation of machine learning and the Internet of Things for independent living

    Smart aging utilises innovative approaches and technology to improve older adults’ quality of life, increasing their prospects of living independently. One of the major concerns for older adults living independently is a “serious fall”, as almost a third of people aged over 65 have a fall each year. Dementia, affecting nearly 9% of the same age group, poses another significant issue that needs to be identified as early as possible. Existing fall detection systems based on wearable sensors generate many false alarms; hence, a more accurate and secure system is necessary. Furthermore, there is a considerable gap in identifying the onset of cognitive impairment using remote monitoring for self-assisted seniors living in their residences. Applying biometric security improves older adults’ confidence in using IoT and makes it easier for them to benefit from smart aging. Several publicly available datasets are pre-processed to extract distinctive features to address fall detection shortcomings, identify the onset of dementia, and enable biometric security for wearable sensors. These key features are used with novel machine learning algorithms to train models for the fall detection system, the dementia onset identification system, and the biometric authentication system. Applying a quantitative approach, these models are tested and analysed on the test dataset. The fall detection approach proposed in this work, in multimodal mode, can achieve an accuracy of 99% in detecting a fall. Additionally, using 13 selected features, a system for detecting early signs of dementia is developed. This system has achieved an accuracy of 93% in identifying cognitive decline in older adults, using only some selected aspects of their daily activities. Furthermore, the ML-based biometric authentication system uses physiological signals, such as ECG and photoplethysmogram, in a fusion mode to identify and authenticate a person, enhancing their privacy and security in a smart aging environment. The benefits offered by the fall detection system, the early detection of signs of dementia, and the biometric authentication system can improve the quality of life for seniors who prefer to live independently or by themselves.

    Biomedical Data Classification with Improvised Deep Learning Architectures

    With the rise of very powerful hardware and the evolution of deep learning architectures, healthcare data analysis and its applications have been drastically transformed. These transformations mainly aim to aid healthcare personnel with the diagnosis and prognosis of a disease or abnormality at any given point of the routine healthcare workflow. For instance, much cancer metastasis detection depends on pathological tissue procedures and pathologist reviews. Reports of severity classification vary amongst pathologists, which then leads to different treatment options for a patient. This labor-intensive work can lead to errors or mistreatments, resulting in a high cost of healthcare. With the help of machine learning and deep learning modules, some of these traditional diagnosis techniques can be improved and can aid a doctor in decision making with an unbiased view. Some such modules can help reduce the cost, the shortage of expertise, and the time needed to identify the disease. However, many other datapoints are available alongside medical images, such as omics data, biomarker calculations, and patient demographics and history. All these datapoints can enhance disease classification or the prediction of progression with the help of machine learning/deep learning modules. However, it is very difficult to find a comprehensive dataset with all the different modalities and features in a healthcare setting due to privacy regulations. Hence, in this thesis, we explore medical imaging data with clinical datapoints and genomics datasets separately for classification tasks using combinational deep learning architectures. We use deep neural networks with 3D volumetric structural magnetic resonance images from an Alzheimer's disease dataset for disease classification. A separate study is implemented to understand classification based on clinical datapoints achieved by machine learning algorithms. For bioinformatics applications, sequence classification is a crucial step in many metagenomics applications. However, it requires a lot of preprocessing, such as sequence assembly or sequence alignment, before raw whole genome sequencing data can be used, and is hence time-consuming, especially in bacterial taxonomy classification. There are only a few approaches for sequence classification tasks, and they mainly involve convolutions and deep neural networks. A novel method is developed using the intrinsic nature of recurrent neural networks for 16S rRNA sequence classification, which can be adapted to utilize read sequences directly. For this classification task, the accuracy is improved using optimization techniques with a hybrid neural network.
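The abstract above mentions feeding read sequences directly to a recurrent network. A minimal sketch of one common way to turn a raw read into a per-position numeric sequence is one-hot encoding. The abstract does not state which encoding the thesis uses, so this is an assumption, and the names `NUCLEOTIDES` and `one_hot_encode` are hypothetical.

```python
# Assumed alphabet; ambiguous bases (for example N) fall outside it.
NUCLEOTIDES = "ACGT"


def one_hot_encode(read):
    """Encode a nucleotide read as a list of 4-element one-hot vectors.

    Each position becomes one time step that a recurrent network could
    consume. Bases outside A, C, G, T are encoded as an all-zero vector.
    """
    index = {base: i for i, base in enumerate(NUCLEOTIDES)}
    encoded = []
    for base in read.upper():
        vector = [0] * len(NUCLEOTIDES)
        if base in index:
            vector[index[base]] = 1
        encoded.append(vector)
    return encoded


if __name__ == "__main__":
    # Toy read, not taken from any dataset.
    for step in one_hot_encode("ACGTN"):
        print(step)
```

Because each read is encoded position by position, no assembly or alignment step is needed before the encoding itself.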

    Transfer learning for sentiment analysis using BERT-based supervised fine-tuning

    The growth of the Internet has expanded the amount of data expressed by users across multiple platforms. The availability of these different worldviews and individuals’ emotions empowers sentiment analysis. However, sentiment analysis becomes even more challenging due to a scarcity of standardized labeled data in the Bangla NLP domain. The majority of the existing Bangla research has relied on deep learning models that focus mainly on context-independent word embeddings, such as Word2Vec, GloVe, and fastText, in which each word has a fixed representation irrespective of its context. Meanwhile, context-based pre-trained language models such as BERT have recently revolutionized the state of natural language processing. In this work, we applied BERT’s transfer learning ability to a deep integrated CNN-BiLSTM model for enhanced decision-making performance in sentiment analysis. In addition, we introduced transfer learning to classical machine learning algorithms for performance comparison with CNN-BiLSTM. Additionally, we explore various word embedding techniques, such as Word2Vec, GloVe, and fastText, and compare their performance to the BERT transfer learning strategy. As a result, we have shown a state-of-the-art binary classification performance for Bangla sentiment analysis that significantly outperforms all other embeddings and algorithms.

    Mapping (Dis-)Information Flow about the MH17 Plane Crash

    Digital media enables fast sharing not only of information, but also of disinformation. One prominent case of an event leading to the circulation of disinformation on social media is the MH17 plane crash. Studies analysing the spread of information about this event on Twitter have focused on small, manually annotated datasets, or used proxies for data annotation. In this work, we examine to what extent text classifiers can be used to label data for subsequent content analysis; in particular, we focus on predicting pro-Russian and pro-Ukrainian Twitter content related to the MH17 plane crash. Even though we find that a neural classifier improves over a hashtag-based baseline, labeling pro-Russian and pro-Ukrainian content with high precision remains a challenging problem. We provide an error analysis underlining the difficulty of the task and identify factors that might help improve classification in future work. Finally, we show how the classifier can facilitate the annotation task for human annotators.
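The abstract above compares a neural classifier against a hashtag-based baseline. A minimal sketch of what such a baseline could look like follows. The hashtag sets are hypothetical placeholders (the abstract does not list the real ones), and the fallback label "unlabeled" and the function name `hashtag_baseline` are likewise assumptions.

```python
import re

# Hypothetical placeholder hashtags; the abstract does not list the real ones.
PRO_RUSSIAN_TAGS = {"#placeholder_ru"}
PRO_UKRAINIAN_TAGS = {"#placeholder_ua"}


def hashtag_baseline(tweet):
    """Label a tweet from its hashtags alone.

    Returns "pro-Russian" or "pro-Ukrainian" when hashtags from exactly one
    set occur, and "unlabeled" when none or both occur.
    """
    # Extract hashtags case-insensitively.
    tags = {tag.lower() for tag in re.findall(r"#\w+", tweet)}
    has_ru = bool(tags & PRO_RUSSIAN_TAGS)
    has_ua = bool(tags & PRO_UKRAINIAN_TAGS)
    if has_ru and not has_ua:
        return "pro-Russian"
    if has_ua and not has_ru:
        return "pro-Ukrainian"
    return "unlabeled"


if __name__ == "__main__":
    print(hashtag_baseline("Some text #Placeholder_RU"))
```

A baseline of this kind can only label tweets that contain a listed hashtag, which is one reason a trained text classifier may cover more content.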

    Biosynthetic gene cluster identification in plasmids and characterization of plasmids from animal-associated microbiota

    Individual bacteria in complex microbial communities can acquire and accumulate new traits. These traits are reflective of their environment, being niche-specific. A major player in trait sharing is horizontal gene transfer (HGT). Plasmids, extrachromosomal DNA molecules, have a role in HGT and can change the host’s phenotype. Considering the transformative role of plasmids in bacterial lifestyle, we investigated the prevalence, distribution and products of biosynthetic gene clusters (BGCs) present in plasmids. Sequences available in the National Center for Biotechnology Information (NCBI) database (n=101 416) were run through two bioinformatic pipelines for BGC detection that apply different approaches, deepBGC and antiSMASH (antibiotics and secondary metabolites analysis shell). The highest percentage of plasmids with BGCs was detected in Actinobacteria but, apart from Chlamydiae and Tenericutes, all phyla had BGCs in their plasmids, with predictions varying according to the software used. The BGCs identified comprised a range of classes, indicating that plasmid-encoded BGCs could be leveraged for the discovery of new molecules. In order to apply that concept to real-life examples, plasmids were isolated from animal-associated microbial communities and characterized. Plasmids from Escherichia coli isolated from wild birds (n=36) were screened for phenotypes of interest in human and animal health. Seven isolates displayed plasmid-encoded antibiotic resistance. Taxonomic identification of the hosts of plasmids isolated from bovid-associated microbiomes (n=38) was performed via the 16S rRNA gene and placed the majority of the isolates in the phylum Firmicutes, apart from a single Klebsiella pneumoniae isolate. Twelve plasmids were sequenced. Three plasmids from different hosts (pRAM-12, pRAM-19-2 and pRAM-30-2) shared 100% nucleotide sequence identity and a gene cluster for the bacteriocin cloacin. Two of those hosts shared not one, but two plasmids, pRAM-19-1 and pRAM-30-1, despite being in different phyla. This highlights the intimacy of gene sharing and the importance of HGT. pRAM-28 and pRAM-21 shared a plasmid that harbors the BGC for the bacteriocin aureocin A70, the only four-peptide bacteriocin known to date. Additional analysis revealed two putative novel lanthipeptide gene clusters in pRAM-2. These results suggest that the plasmidome is a neglected source of secondary metabolites with the potential for molecule discovery. Furthermore, it can be leveraged to study genetic exchange in a community and how plasmid-encoded features can mediate interactions in a microbiome.

    Contributions to information extraction for Spanish written biomedical text

    285 p. Healthcare practice and clinical research produce vast amounts of digitised, unstructured data in multiple languages that are currently underexploited, despite their potential applications in improving healthcare experiences, supporting trainee education, or enabling biomedical research, for example. To automatically transform those contents into relevant, structured information, advanced Natural Language Processing (NLP) mechanisms are required. In NLP, this task is known as Information Extraction. Our work takes place within this growing field of clinical NLP for the Spanish language, as we tackle three distinct problems. First, we compare several supervised machine learning approaches to the problem of sensitive data detection and classification. Specifically, we study the different approaches and their transferability in two corpora, one synthetic and the other authentic. Second, we present and evaluate UMLSmapper, a knowledge-intensive system for biomedical term identification based on the UMLS Metathesaurus. This system recognises and codifies terms without relying on annotated data or external Named Entity Recognition tools. Although technically naive, it performs on par with more evolved systems, and does not exhibit a considerable deviation from other approaches that rely on oracle terms. Finally, we present and exploit a new corpus of real health records manually annotated with negation and uncertainty information: NUBes. This corpus is the basis for two sets of experiments, one on cue and scope detection, and the other on assertion classification. Throughout the thesis, we apply and compare techniques of varying levels of sophistication and novelty, which reflects the rapid advancement of the field.