
    Veterans engineering resource center: the DREAM project

    Due to technological advances, data collected from direct healthcare delivery grows by the day. This constantly growing data, collected from sources including patient visits, images, laboratory results, and physician notes, is important, yet its voluminous and heterogeneous nature has largely limited its significance to satisfying reporting and documentation requirements and to specific clinical situations. At this scale, manual extraction of information is expensive, time consuming, and subject to human error. Fortunately, the same information technologies that enabled the generation and collection of this data also enable efficient extraction of useful information. There is now a broad spectrum of secondary uses of clinical data, including clinical and translational research, public health and policy analysis, and quality measurement and improvement. The following case study examines a pilot project undertaken by the Veterans Engineering Resource Center (VERC) to design a data mining software utility called the Data Resource Engine & Analytical Model (DREAM). This software should be operable within the VA IT infrastructure and will allow providers to view aggregate patient data rapidly and accurately using electronic health records.
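    The abstract does not describe DREAM's internals, but its core idea, rapidly aggregating heterogeneous EHR records into a per-patient view, can be illustrated with a minimal sketch. The column names and data below are invented for illustration; this is not the VERC implementation:

        import pandas as pd

        # Hypothetical visit-level EHR extract; column names are illustrative only.
        visits = pd.DataFrame({
            "patient_id": [1, 1, 2, 2, 2, 3],
            "visit_date": pd.to_datetime([
                "2021-01-05", "2021-03-10", "2021-02-01",
                "2021-02-20", "2021-05-15", "2021-04-02",
            ]),
            "lab_a1c": [7.1, 6.8, 9.2, 8.9, 8.1, 5.6],
            "diagnosis": ["E11", "E11", "E11", "I10", "E11", "Z00"],
        })

        # Aggregate to one row per patient: visit count, date span, mean lab value.
        summary = visits.groupby("patient_id").agg(
            n_visits=("visit_date", "count"),
            first_visit=("visit_date", "min"),
            last_visit=("visit_date", "max"),
            mean_a1c=("lab_a1c", "mean"),
        )
        print(summary)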

    Cognition-Based Networks: A New Perspective on Network Optimization Using Learning and Distributed Intelligence

    Zorzi, M., Zanella, A., Testolin, A., De Filippo De Grazia, M., and Zorzi, M. (Department of Information Engineering, University of Padua; Department of General Psychology, University of Padua; IRCCS San Camillo Foundation, Venice-Lido, Italy). IEEE Access, vol. 3, 2015, article no. 7217798, pp. 1512-1530, open access. In response to the new challenges in the design and operation of communication networks, and taking inspiration from how living beings deal with complexity and scalability, in this paper we introduce an innovative system concept called COgnition-BAsed NETworkS (COBANETS). The proposed approach develops around the systematic application of advanced machine learning techniques and, in particular, unsupervised deep learning and probabilistic generative models for system-wide learning, modeling, optimization, and data representation. Moreover, in COBANETS, we propose to combine this learning architecture with the emerging network virtualization paradigms, which make it possible to actuate automatic optimization and reconfiguration strategies at the system level, thus fully unleashing the potential of the learning approach. Compared with past and current research efforts in this area, the technical approach outlined in this paper is deeply interdisciplinary and more comprehensive, calling for the synergic combination of expertise of computer scientists, communications and networking engineers, and cognitive scientists, with the ultimate aim of breaking new ground through a profound rethinking of how the modern understanding of cognition can be used in the management and optimization of telecommunication networks.
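    One building block named above is unsupervised probabilistic generative modelling of network observations. As a loose illustration of that ingredient only (not the authors' actual pipeline; the data is synthetic), a restricted Boltzmann machine can be trained on binary network-state vectors and its hidden units used as a compact representation for a controller to reason over:

        import numpy as np
        from sklearn.neural_network import BernoulliRBM

        rng = np.random.default_rng(0)
        # Synthetic binary "network state" observations (e.g., link-load indicators).
        X = (rng.random((500, 32)) < 0.3).astype(float)

        # Train a generative RBM; its hidden activations form a learned representation.
        rbm = BernoulliRBM(n_components=8, learning_rate=0.05, n_iter=20, random_state=0)
        rbm.fit(X)

        # Compressed representation of the observed network states.
        H = rbm.transform(X)
        print(H.shape)  # (500, 8)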

    Predictive modelling of hospital readmissions in diabetic patients clusters

    Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence. Diabetes is a global public health problem with increasing incidence over the past 10 years. The disease's social and economic impacts are widely assessed worldwide, showing a direct and gradual decrease in the individual's ability to work, a gradual loss of quality of life, and a burden on personal finances. The recurrence of hospitalisation is one of the most significant indexes for measuring the quality of care and the opportunity to optimise resources. Numerous techniques exist to identify patients who will need to be readmitted, such as LACE and HOSPITAL. The purpose of this study was to use a dataset on the risk of hospital readmission in patients with diabetes, first applying clustering to group subpopulations by similarity and then structuring a predictive analysis with the main algorithms to identify the best-performing methodology. Numerous preparation steps were applied to the dataset ahead of these two interventions. The first phase, run with K=3, produced clusters distinguished by the total number of hospital recurrences and by total administrative costs. In the second phase, the best algorithm found was Neural Network 3, with a ROC of 0.68 and a misclassification rate of 0.37. When the same algorithm was applied within the clusters, there were no gains in the confidence of the indexes, suggesting that there are no substantial gains in dividing the population into subgroups, since the disease shows the same behaviour and needs throughout its development.
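    The two-phase design described above, clustering followed by per-cluster prediction, can be sketched as follows. This is a schematic reconstruction on synthetic data, not the study's actual pipeline; only K=3 mirrors the value reported above:

        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.neural_network import MLPClassifier
        from sklearn.model_selection import train_test_split
        from sklearn.metrics import roc_auc_score

        rng = np.random.default_rng(42)
        X = rng.normal(size=(1000, 10))           # synthetic patient features
        y = (rng.random(1000) < 0.3).astype(int)  # synthetic readmission labels

        # Phase 1: cluster patients into K=3 subgroups by similarity.
        clusters = KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(X)

        # Phase 2: train and evaluate a neural network within each cluster.
        for k in range(3):
            Xk, yk = X[clusters == k], y[clusters == k]
            Xtr, Xte, ytr, yte = train_test_split(Xk, yk, test_size=0.3, random_state=42)
            clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500,
                                random_state=42).fit(Xtr, ytr)
            auc = roc_auc_score(yte, clf.predict_proba(Xte)[:, 1])
            print(f"cluster {k}: n={len(yk)}, AUC={auc:.2f}")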

    Neurocognitive Informatics Manifesto.

    Informatics studies all aspects of the structure of natural and artificial information systems. Theoretical and abstract approaches to information have made great advances, but human information processing is still unmatched in many areas, including information management, representation, and understanding. Neurocognitive informatics is a new, emerging field that should help to improve the matching of artificial and natural systems, and inspire better computational algorithms to solve problems that are still beyond the reach of machines. In this position paper, examples of neurocognitive inspirations and promising directions in this area are given.

    Predicting post-surgical length of stay using machine learning

    Integrated Master's thesis in Biomedical Engineering and Biophysics (Clinical Engineering and Medical Instrumentation), Universidade de Lisboa, Faculdade de Ciências, 2020. Being healthy is, in any culture, essentially the most important condition for a long and happy life, and a country's entire hospital network, whether a national or a private health system, contributes to it. As in other fields, healthcare must keep pace with technological evolution in order to offer advanced services in response to varied social demands, because developing new technologies and methodologies in healthcare creates improved processes and makes existing ones more efficient. Technology in medicine does not only mean anaesthetics, antibiotics, or medical techniques such as magnetic resonance imaging and radiotherapy. Since patients generate enormous amounts of information, not only medical (such as blood test results) but also hospital-related (such as the time and type of surgery), one of the most important advances of recent years has been the digitisation of this information through electronic health records. One of the greatest and most direct benefits of this digitisation is that patient care becomes easier and more efficient. The greater purpose of these records, however, emerges after the data are processed with data science techniques, when, for example, some diagnoses, such as heart disease, can be predicted. With the data in digital format, different techniques can be applied, as appropriate, to extract information that would not be visible on its own, and the results improve the better the process behind the data collection is understood, since that understanding refines data selection and preprocessing. Among the techniques for making predictions from databases, and consequently helping an organisation take better decisions, is machine learning, which gives systems the ability to learn and improve automatically from experience without being explicitly programmed, something that can be extremely relevant in healthcare. Alongside the technology, financial and management factors must also be considered, since a hospital is also an enterprise that must be managed: besides contributing to the population's well-being, one of its internal goals is to reduce costs as much as possible without harming the normal functioning of any of its activities, optimising resources. In this respect, one of the most problematic aspects of hospital logistics is bed management. A surplus of beds guarantees greater patient allocation but leads to excessive hospital costs, while a deficit can create serious situations for those who need them. In short, professional bed management aims at a high occupancy rate but a low cancellation rate, thereby achieving an optimal allocation. The ideal distribution is hindered, however, by how difficult it is to predict the length of stay of hospitalised patients accurately. To overcome this adversity, a model can be built that predicts length of stay more rigorously by working on a dataset composed, in this case, of patient information.
    Accordingly, this dissertation aims to create and evaluate, in Python, a predictive classification model for the length of stay of patients undergoing surgery, using as the baseline the model currently adopted by the hospital under study, the HBA. To achieve this purpose, following the Cross-Industry Standard Process for Data Mining methodology, the work was divided into three stages: understanding the data and preparing it, modelling it, and finally evaluating it and comparing it with the HBA model. The study aims to fill the gaps left by other studies that do not simultaneously consider general patient and hospital characteristics, such as the date and time of surgery; moreover, the literature still lacks machine learning studies devoted exclusively to surgical patients. The first stage used a dataset of 20,736 patients hospitalised at the HBA between 2017 and 2018, with 135 features covering both patient and hospital information. Once received, the data must be understood from a medical and behavioural point of view, since the way the records were filled in is subject to human error, ranging from information swapped at entry time to features that represent the same idea, with one more up to date than the other. A first contact with those responsible for filling in the dataset is therefore important, to guarantee a plausible reading of it and a proper understanding of the information provided by each feature; from this analysis an initial organisation of the data becomes possible. At this stage it is also imperative to check whether new variables can be formed from existing ones to enrich the dataset, and knowing the distribution of the variables is essential for fully understanding the data, since it reveals how the categories of each feature are apportioned. This stage thus comprises understanding, cleaning, and preparing the data so that it can subsequently be modelled. The second stage fits the data to a machine learning algorithm, in this case Random Forests. Since the goal involves two different models, pre- and post-surgical, it is indispensable to consider which variables enter each model, knowing exactly when each of them is first recorded. For a classification algorithm with 135 features, a careful feature selection is also essential: it improves accuracy and reduces overfitting compared with a model that uses all the variables, and the smaller number of attributes also shortens training time. Finally, the last stage concerns the evaluation of the results. For both the pre- and post-surgical models, the metric used was the F1-score, since the data are imbalanced.
    With these models, a clear improvement over the model currently in force was observed, varying by specialty, of 13.87 percentage points on average for the postoperative model and 12.32 for the preoperative model. Constraints such as the restricted number of patients remaining after preparing the dataset for modelling, and behavioural errors in filling in the dataset, may have limited the results of this dissertation. Even so, and although there is room for improvement, the purpose for which the project was proposed was fulfilled: improvements over the model currently employed at the hospital were demonstrated, confirming the potential of models that exploit the benefits of machine learning. In addition to the central objective, models containing only patient-related variables and models containing only procedure- or structure-related variables were also built and compared, in order to assess the influence of these two types of variables in a hospital model and to underline the importance of professionals filling in these attributes correctly. The results of this approach showed the relevance of integrating the two types of variables in a Random Forests model, which added an average improvement of 9.68 percentage points over the exclusive use of patient-related variables and 3.83 over procedure-related variables for the post-surgical model; for the pre-surgical model, incorporating both types brings an improvement of 7.67 percentage points over the model using only patient features and 5.72 over the model with only procedure-related variables. This dissertation demonstrates that, by applying Random Forests to the electronic health records of the hospital under study, a predictive model for length of stay can be created, enabling an optimised bed management process in the future and, with it, a reduction in hospital costs.
    In recent years, there has been a steady increase in the number of hospitals adopting Electronic Health Records (EHR), allowing a digitalisation of patient data. In turn, the correct manipulation of these data, using Data Mining (DM) techniques, can lead to solutions related both to patients' health and to hospital management. Among hospital management problems, one of the most severe is bed management, which is tied to the Length of Stay (LOS) in the hospital. By taking advantage of the information in the data collected from patients, whether of a personal or a hospital nature, it is possible to solve or mitigate this hitherto hardly solvable complication. In this follow-up, this dissertation focuses on the case study of Hospital Beatriz Ângelo (HBA) and applies the Cross-Industry Standard Process for Data Mining (CRISP-DM) methodology to predict the LOS of patients after surgery. Random Forests (RF) was the technique chosen to perform the classification task, and F1-score was the metric selected to evaluate the results. LOS is predicted by models developed for two different situations: the postoperative period and the preoperative period. Comparing the results of the models developed against the discharge system used in this hospital, remarkable results are obtained, with an average improvement of 13.87 percentage points for the postoperative model and 12.32 for the preoperative model in terms of F1-score. In addition, models whose inputs are merely patient-related variables were analysed and compared with models containing solely procedure- or structure-related variables, in order to understand the importance of each of these two types of features for the LOS. The results of this approach showed the importance of integrating the two types of features in a Machine Learning (ML) model, adding an average improvement, in terms of F1-score, of 9.68 percentage points over the exclusive use of patient-related variables and 3.83 over procedure-related variables for the post-surgical model. In turn, for the pre-surgical model, incorporating both types of variables brings an improvement of 7.67 percentage points over the model that uses only patient features and 5.72 over the model with only procedure-related variables. The overall results of this work demonstrate that the ML model improves on the existing one, yielding a better forecast of the day of discharge and thus allowing better bed management.
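    The modelling step described above, a Random Forests classifier over binned length of stay evaluated with F1-score, can be illustrated with a minimal sketch on synthetic data. The features, LOS values, and bin edges are invented; the dissertation's real pipeline additionally performs feature selection and keeps separate pre- and post-surgical variable sets:

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import train_test_split
        from sklearn.metrics import f1_score

        rng = np.random.default_rng(7)
        X = rng.normal(size=(2000, 20))            # synthetic patient/procedure features
        los_days = rng.integers(0, 15, size=2000)  # synthetic length of stay in days

        # Bin LOS into classes for the classification task:
        # 0: <= 3 days, 1: 4-7 days, 2: > 7 days.
        y = np.digitize(los_days, bins=[3, 7], right=True)

        Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=7)
        rf = RandomForestClassifier(n_estimators=200, random_state=7).fit(Xtr, ytr)

        # Weighted F1 is a reasonable choice given the imbalance noted above.
        print("F1:", f1_score(yte, rf.predict(Xte), average="weighted").round(3))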

    Interpretable Machine Learning Model for Clinical Decision Making

    Despite machine learning models being increasingly used in medical decision-making and meeting classification predictive accuracy standards, they remain untrusted black boxes due to decision-makers' lack of insight into their complex logic. It is therefore necessary to develop interpretable machine learning models that will engender trust in the knowledge they generate and contribute to clinical decision-makers' intention to adopt them in the field. The goal of this dissertation was to systematically investigate the applicability of interpretable model-agnostic methods to explain the predictions of black-box machine learning models for medical decision-making. As a proof of concept, this study addressed the problem of predicting the risk of emergency readmission within 30 days of discharge for heart failure patients. Using a benchmark data set, supervised classification models of differing complexity were trained to perform the prediction task. More specifically, Logistic Regression (LR), Random Forests (RF), Decision Trees (DT), and Gradient Boosting Machines (GBM) models were constructed using the Healthcare Cost and Utilization Project (HCUP) Nationwide Readmissions Database (NRD). Precision, recall, and area under the ROC curve for each model were used to measure predictive accuracy. Local Interpretable Model-Agnostic Explanations (LIME) was used to generate explanations from the underlying trained models, and these explanations were empirically evaluated using explanation stability and local fit (R²). The results demonstrated that local explanations generated by LIME created better estimates for Decision Tree (DT) classifiers.
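    As a concrete illustration of the LIME step described above (a minimal sketch with synthetic data, not the HCUP/NRD pipeline), one can explain a single prediction of a trained gradient boosting model; the explanation object's score attribute exposes the R² of the local surrogate fit, one of the evaluation criteria mentioned:

        import numpy as np
        from sklearn.ensemble import GradientBoostingClassifier
        from lime.lime_tabular import LimeTabularExplainer

        rng = np.random.default_rng(1)
        X = rng.normal(size=(500, 6))                  # synthetic patient features
        y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # synthetic readmission label

        model = GradientBoostingClassifier(random_state=1).fit(X, y)

        explainer = LimeTabularExplainer(
            X, mode="classification",
            feature_names=[f"f{i}" for i in range(6)],
            class_names=["no readmit", "readmit"],
        )
        # Explain one patient's predicted readmission risk.
        explanation = explainer.explain_instance(X[0], model.predict_proba, num_features=4)
        print(explanation.as_list())  # local feature contributions
        print(explanation.score)      # R^2 of the local surrogate fit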

    Visualisation of Integrated Patient-Centric Data as Pathways: Enhancing Electronic Medical Records in Clinical Practice

    Routinely collected data in hospital Electronic Medical Records (EMR) is rich and abundant but often not linked or analysed for purposes other than direct patient care. We have created a methodology to integrate patient-centric data from different EMR systems into clinical pathways that represent the history of all patient interactions with the hospital during the course of a disease and beyond. In this paper, the literature on data visualisation in healthcare is reviewed and a method for visualising the journeys that patients take through care is discussed. Examples of the hidden knowledge that could be discovered using this approach are explored, and the main application areas of visualisation tools are identified. The paper also highlights the challenges of collecting and analysing such data and of making these visualisations widely used in the medical domain. It starts by presenting the state of the art in the visualisation of clinical and other health-related data, then describes an example clinical problem and discusses the visualisation tools and techniques created so that clinicians and researchers can make use of these data, and finally looks at the open problems in this area of research and discusses future challenges.
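    The paper's pathway visualisations are purpose-built tools, but the underlying idea, laying each patient's interactions with the hospital out along a timeline, can be sketched with matplotlib. The event data below is invented for illustration:

        import matplotlib.pyplot as plt
        import pandas as pd

        # Hypothetical patient-centric events merged from several EMR systems.
        events = pd.DataFrame({
            "patient": ["P1", "P1", "P1", "P2", "P2"],
            "date": pd.to_datetime(
                ["2020-01-10", "2020-02-02", "2020-06-15", "2020-03-01", "2020-03-20"]),
            "event": ["GP referral", "Outpatient clinic", "Surgery",
                      "A&E attendance", "Admission"],
        })

        fig, ax = plt.subplots(figsize=(8, 3))
        for i, (patient, grp) in enumerate(events.groupby("patient")):
            ax.plot(grp["date"], [i] * len(grp), "o-")  # one pathway per row
            for d, e in zip(grp["date"], grp["event"]):
                ax.annotate(e, (d, i), textcoords="offset points", xytext=(0, 8), fontsize=8)
        ax.set_yticks(range(events["patient"].nunique()))
        ax.set_yticklabels(sorted(events["patient"].unique()))
        ax.set_xlabel("Date")
        ax.set_title("Patient pathways as timelines")
        plt.tight_layout()
        plt.show()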

    AI Solutions for MDS: Artificial Intelligence Techniques for Misuse Detection and Localisation in Telecommunication Environments

    This report considers the application of Artificial Intelligence (AI) techniques to the problem of misuse detection and misuse localisation within telecommunications environments. A broad survey of techniques is provided, covering inter alia rule-based systems, model-based systems, case-based reasoning, pattern matching, clustering and feature extraction, artificial neural networks, genetic algorithms, artificial immune systems, agent-based systems, data mining, and a variety of hybrid approaches. The report then considers the central issue of event correlation, which is at the heart of many misuse detection and localisation systems. The notion of inferring misuse by correlating individual temporally distributed events within a multiple-data-stream environment is explored, and a range of techniques is reviewed, covering model-based approaches and both `programmed' AI and machine learning paradigms. It is found that, in general, correlation is best achieved via rule-based approaches, but that these suffer from a number of drawbacks, such as the difficulty of developing and maintaining an appropriate knowledge base, and the inability to generalise from known misuses to new, unseen misuses. Two distinct approaches are evident. One attempts to encode knowledge of known misuses, typically within rules, and uses this to screen events; this approach cannot generally detect misuses for which it has not been programmed, i.e. it is prone to issuing false negatives. The other attempts to learn the features of event patterns that constitute normal behaviour and, by observing patterns that do not match expected behaviour, to detect when a misuse has occurred; this approach is prone to issuing false positives, i.e. inferring misuse from innocent patterns of behaviour that the system was not trained to recognise. Contemporary approaches are seen to favour hybridisation, often combining detection or localisation mechanisms for both abnormal and normal behaviour, the former to capture known cases of misuse, the latter to capture unknown cases. In some systems, these mechanisms even update each other to increase detection rates and lower false positive rates. It is concluded that hybridisation offers the most promising future direction, but that a rule- or state-based component is likely to remain, being the most natural approach to the correlation of complex events. The challenge, then, is to mitigate the weaknesses of canonical programmed systems such that learning, generalisation and adaptation are more readily facilitated.
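    To make the rule-based correlation idea concrete, here is a toy sketch (the event names and the rule are invented) that flags a possible misuse when a configured pattern of events occurs for the same subscriber within a time window; this is exactly the kind of hand-maintained knowledge whose upkeep burden the report discusses:

        from datetime import datetime, timedelta

        # Toy temporally distributed event stream: (timestamp, subscriber, event type).
        events = [
            (datetime(2024, 1, 1, 10, 0), "alice", "AUTH_FAIL"),
            (datetime(2024, 1, 1, 10, 2), "alice", "AUTH_FAIL"),
            (datetime(2024, 1, 1, 10, 3), "alice", "INTL_CALL"),
            (datetime(2024, 1, 1, 12, 0), "bob", "INTL_CALL"),
        ]

        # Hand-written correlation rule: repeated auth failures followed by an
        # international call from the same subscriber within 10 minutes.
        RULE = ("AUTH_FAIL", "AUTH_FAIL", "INTL_CALL")
        WINDOW = timedelta(minutes=10)

        def correlate(stream, rule, window):
            """Scan per-subscriber histories for the rule's event sequence."""
            history = {}
            for ts, who, kind in sorted(stream):
                hist = history.setdefault(who, [])
                hist.append((ts, kind))
                # Keep only events inside the sliding time window.
                hist[:] = [(t, k) for t, k in hist if ts - t <= window]
                if tuple(k for _, k in hist[-len(rule):]) == rule:
                    yield who, ts

        for who, ts in correlate(events, RULE, WINDOW):
            print(f"possible misuse by {who} at {ts}")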

    Time-Series Embedded Feature Selection Using Deep Learning: Data Mining Electronic Health Records for Novel Biomarkers

    As health information technologies continue to advance, the routine collection and digitisation of patient health records in the form of electronic health records present an ideal opportunity for data mining and exploratory analysis of biomarkers and risk factors indicative of a potentially diverse domain of patient outcomes. Patient records have become steadily more available through various initiatives enabling open access whilst maintaining critical patient privacy. In spite of such progress, health records remain under-used in current clinical statistical analysis, owing to the challenging issues that such "big data" raises. Deep-learning-based temporal modelling approaches present an ideal solution to these challenges through automated, self-optimising representation learning, able to manageably compose the high-dimensional domain of patient records into representations that model complex data associations. Such representations can condense the data and reduce dimensionality, emphasising feature sparsity and importance through novel embedded feature selection approaches. Applied to patient records, they enable complex modelling and analysis of the full domain of clinical features in order to select biomarkers of predictive relevance. Firstly, we propose a novel entropy-regularised neural network ensemble able to highlight risk factors associated with the hospitalisation risk of individuals with dementia; its application reduced a large domain of unique medical events to a small set of relevant risk factors while maintaining hospitalisation discrimination. Following on, we continue our work on ensemble architectures with a novel cascading LSTM ensemble to predict severe sepsis onset in critical patients in an ICU critical care centre, demonstrating state-of-the-art performance that outperforms the current related literature. Finally, we propose a novel embedded feature selection method dubbed 1D convolution feature selection, using sparsity regularisation. This methodology was evaluated on both the dementia and sepsis prediction objectives to highlight model capability and generalisability. We further report a selection of potential biomarkers for the aforementioned case studies, highlighting their clinical relevance and potential novelty for future clinical analysis. Accordingly, we demonstrate the effective capability of embedded feature selection approaches through the application of temporal deep learning architectures to the discovery of effective biomarkers across a variety of challenging clinical applications.
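    The thesis's "1D convolution feature selection using sparsity regularisation" is not specified in detail here, but the general mechanism, penalising first-layer 1D convolution weights with an L1 term and reading per-feature importance from their magnitudes, can be sketched in PyTorch. This is a schematic under assumed tensor shapes and synthetic data, not the author's implementation:

        import torch
        import torch.nn as nn

        torch.manual_seed(0)
        n_patients, n_features, n_steps = 256, 20, 48  # assumed EHR tensor shape
        X = torch.randn(n_patients, n_features, n_steps)
        # Synthetic outcome driven by feature 0, so it should emerge as important.
        y = (X[:, 0, :].mean(dim=1) > 0).float()

        conv = nn.Conv1d(n_features, 8, kernel_size=3, padding=1)
        head = nn.Sequential(nn.ReLU(), nn.AdaptiveAvgPool1d(1),
                             nn.Flatten(), nn.Linear(8, 1))
        model = nn.Sequential(conv, head)
        opt = torch.optim.Adam(model.parameters(), lr=1e-2)
        bce = nn.BCEWithLogitsLoss()
        l1_strength = 1e-2

        for epoch in range(200):
            opt.zero_grad()
            logits = model(X).squeeze(1)
            # Task loss plus L1 sparsity penalty on the first conv layer's weights.
            loss = bce(logits, y) + l1_strength * conv.weight.abs().sum()
            loss.backward()
            opt.step()

        # Embedded feature importance: total |weight| per input feature channel.
        importance = conv.weight.abs().sum(dim=(0, 2))
        print(importance.argsort(descending=True)[:5])  # top candidate "biomarkers"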