1,041 research outputs found

    Challenges and opportunities beyond structured data in analysis of electronic health records

    Electronic health records (EHRs) contain a great deal of valuable information about individual patients and the whole population. Besides structured data, unstructured data in EHRs can provide additional valuable information, but the analytics processes are complex, time-consuming, and often require excessive manual effort. Among unstructured data, clinical text and images are the two most popular and important sources of information. Advanced statistical algorithms in natural language processing, machine learning, deep learning, and radiomics have increasingly been used for analyzing clinical text and images. Although many challenges that can hinder the use of unstructured data have not been fully addressed, there are clear opportunities for well-designed diagnosis and decision support tools that efficiently incorporate both structured and unstructured data to extract useful information and provide better outcomes. However, access to clinical data is still very restricted due to data sensitivity and ethical issues. Data quality is also an important challenge, for which methods for improving data completeness, conformity, and plausibility are needed. Further, generalizing and explaining the results of machine learning models are important problems for healthcare, and these remain open challenges. A possible way to improve the quality and accessibility of unstructured data is to develop machine learning methods that can generate clinically relevant synthetic data and to accelerate further research on privacy-preserving techniques such as de-identification and pseudonymization of clinical text.

    Histopathology Image Analysis and NLP for Digital Pathology

    Information technologies based on machine learning (ML) with quantitative imaging and text are playing an essential role, particularly in general medicine and oncology. Deep learning (DL) in particular has demonstrated significant breakthroughs in computer vision and natural language processing (NLP), which could enhance disease detection and the establishment of efficient treatments. Furthermore, considering the large number of people with cancer and the substantial volume of data generated during cancer treatment, there is significant interest in the use of AI to improve oncologic care. In digital pathology, high-resolution microscope images of tissue samples are stored along with written medical reports in databases that are used by pathologists. The diagnosis is made through tissue analysis of the biopsy sample and is written as a brief unstructured report, which is stored as free text in Electronic Medical Record (EMR) systems. For the digitization of medical records to achieve its maximum benefit, these reports must be accessible and usable by medical practitioners, so that they can easily understand them and precisely identify the disease. Histopathology images are the basis of the diagnosis and study of diseases of the tissues, and image analysis helps identify the disease's location and classify the type of cancer. Recently, due to the abundant accumulation of whole-slide images (WSIs), there has been an increased demand for effective and efficient gigapixel image analysis, such as computer-aided diagnosis using DL techniques. Also, due to the high diversity of shapes and structures in WSIs, it is not possible to use conventional DL techniques for classification. Though computer-aided diagnosis using DL has good prediction accuracy, in the medical domain there is a need to explain the model's predictions in order to gain an understanding beyond standard quantitative performance evaluation. This thesis presents three different findings. Firstly, I provide a comparative analysis of transformer models such as BioBERT, Clinical BioBERT, and BioMed-RoBERTa, alongside TF-IDF, and the results demonstrate the effectiveness of various word embedding techniques for pathology reports in the classification task. Secondly, using the slide-level labels of WSIs, I classify them into their disease types with an architecture that has an attention mechanism and instance-level clustering. Finally, I introduce a method to fuse the features of the pathology reports with the features of their respective images. I investigate the effect of combining these features when classifying histopathology images and their respective reports simultaneously. This proved to be better than the individual classification tasks, achieving an accuracy of 95.73%.

    Revealing effects of psychosocial factors of cancer patients

    This research presents methodologies applied on different platforms to extract both social and psychosocial factors that might be related to cancer, by applying natural language processing tools to text from platforms such as social media and other online forums. We also present the challenges associated with each platform and the corresponding tools used on it. From text mining to text analysis and then data visualisation, this research compares different analysis methods and outputs. We discuss many tools that were tested, used, or modified in order to carry out this analysis. We also obtained interesting findings for the medical fields to explore and research further. We developed a modular system that clinicians and medical experts can use to analyse similar forums.

    Enhance Representation Learning of Clinical Narrative with Neural Networks for Clinical Predictive Modeling

    Medicine is undergoing a technological revolution. Understanding human health from clinical data poses major challenges from technical and practical perspectives, prompting the need for methods that can make sense of large, complex, and noisy data. These methods are particularly necessary for natural language data from clinical narratives and notes, which contain some of the richest information on a patient. Meanwhile, deep neural networks have achieved superior performance in a wide variety of natural language processing (NLP) tasks because of their capacity to encode meaningful but abstract representations and learn the entire task end-to-end. In this thesis, I investigate representation learning of clinical narratives with deep neural networks through a number of tasks, ranging from clinical concept extraction and clinical note modeling to patient-level language representation. I present methods utilizing representation learning with neural networks to support understanding of clinical text documents. I first introduce the notion of representation learning from natural language processing and patient data modeling. Then, I investigate word-level representation learning to improve clinical concept extraction from clinical notes. I present two works on learning word representations and evaluate them on extracting important concepts from clinical notes. The first study focuses on cancer-related information, and the second study evaluates shared-task data. The aim of these two studies is to automatically extract important entities from clinical notes. Next, I present a series of deep neural networks to encode hierarchical, longitudinal, and contextual information for modeling a series of clinical notes. I also evaluate the models by predicting clinical outcomes of interest, including mortality, length of stay, and phenotype predictions. Finally, I propose a novel representation learning architecture to develop a generalized and transferable language representation at the patient level. I also identify pre-training tasks appropriate for constructing a generalizable language representation. The main focus is to improve the predictive performance of phenotyping when only limited data are available. Overall, this dissertation addresses issues in natural language processing for medicine, including clinical text classification and modeling. These studies show major barriers to understanding large-scale clinical notes. It is believed that developing deep representation learning methods for distilling enormous amounts of heterogeneous data into patient-level language representations will improve evidence-based clinical understanding. The approach of solving these issues by learning representations could be used across clinical applications despite noisy data. I conclude that considering different linguistic components in natural language and sequential information between clinical events is important. Such results have implications beyond the immediate context of predictions and suggest future directions for clinical machine learning research to improve clinical outcomes. This could be a starting point for future phenotyping methods based on natural language processing that construct patient-level language representations to improve clinical predictions. While significant progress has been made, many open questions remain, so I highlight a few works to demonstrate promising directions.

    Evaluation of a Portuguese computerized cancer registry - a qualitative research

    The aim is to map the system's "as-is" process model through observation of the system and interviews with users. The work also aims to identify the system's functional problems and the users' requirements in light of their current needs. Finally, it aims to define a "to-be" model for this health information system that takes into account the needs of the users and the organization, the requirements of European standards, and new technologies on the market.

    Front-Line Physicians' Satisfaction with Information Systems in Hospitals

    Day-to-day operations management in hospital units is difficult due to continuously varying situations, the several actors involved, and the vast number of information systems in use. The aim of this study was to describe front-line physicians' satisfaction with the existing information systems needed to support day-to-day operations management in hospitals. A cross-sectional survey was used, and data were collected in nine hospitals using stratified random sampling. Data were analyzed with descriptive and inferential statistical methods. The response rate was 65% (n = 111). The physicians reported that information systems support their decision making to some extent, but that they neither improve access to information nor are tailored for physicians. The respondents also reported that they need to use several information systems to support decision making and that they would prefer a single information system for accessing important information. Improved information access would better support physicians' decision making and has the potential to improve the quality of decisions and speed up the decision-making process. Peer reviewed.

    Preface
