
    A decision support tool for health service re-design

    Many outpatient services are currently available only in hospitals; however, there are plans to provide some of these services alongside General Practitioners, who could soon be based at polyclinics. These changes have raised a number of concerns for Hounslow Primary Care Trust (PCT). For example, which outpatient services should be shifted from the hospital to the polyclinic? What are the current and expected future demands for these services? To tackle some of these concerns, the first phase of this project explores the set of specialties that are frequently visited in sequence (using sequential association rules). The second phase develops an Excel-based spreadsheet tool to compute the current and expected future demands for the selected specialties. The sequential association rule algorithm found endocrinology and ophthalmology to be highly associated (i.e. frequently visited in sequence), which means that these two specialties could easily be shifted from the hospital environment to the polyclinic. We illustrated the Excel-based spreadsheet tool for endocrinology and ophthalmology; however, the model is generic enough to cope with other specialties, provided that the data are available.
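The sequential pattern step described in this abstract can be sketched in a few lines: given each patient's ordered list of specialty visits, count how often one specialty is later followed by another, and keep ordered pairs whose support and confidence clear a threshold. This is a minimal illustration of the idea only; the visit data and thresholds below are invented, not taken from the study.

```python
from collections import Counter

def sequential_rules(sequences, min_support=0.3, min_confidence=0.6):
    """Mine ordered pairs A -> B where B appears after A in a patient's visit sequence."""
    n = len(sequences)
    pair_counts = Counter()   # patients whose sequence contains A before B
    item_counts = Counter()   # patients whose sequence contains A at all
    for seq in sequences:
        seen, pairs = set(), set()
        for i, a in enumerate(seq):
            seen.add(a)
            for b in seq[i + 1:]:
                if a != b:
                    pairs.add((a, b))
        item_counts.update(seen)
        pair_counts.update(pairs)
    rules = {}
    for (a, b), c in pair_counts.items():
        support = c / n                 # fraction of patients with A before B
        confidence = c / item_counts[a] # of patients visiting A, fraction later visiting B
        if support >= min_support and confidence >= min_confidence:
            rules[(a, b)] = (round(support, 2), round(confidence, 2))
    return rules

# Hypothetical visit histories: each list is one patient's ordered specialty visits.
visits = [
    ["endocrinology", "ophthalmology"],
    ["endocrinology", "cardiology", "ophthalmology"],
    ["cardiology"],
    ["endocrinology", "ophthalmology", "cardiology"],
]
print(sequential_rules(visits))
```

A pair such as endocrinology → ophthalmology surviving the thresholds is exactly the kind of evidence the first phase of the project uses to nominate specialties for co-location.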

    An integrated knowledge-based system for early detection of eye refractive error using data mining

    Refractive error is an optical defect of the human visual system. It is very common across all populations and age groups. Uncorrected and undetected refractive error contributes to visual impairment and blindness and places a considerable burden on people worldwide. Prolonged use of technological devices such as smartphones also poses a new burden on the human eye: the intensity and brightness of these digital devices open a new door to a high prevalence of eye refractive errors. Early medical diagnosis of the disease may help avoid complications and blindness. Data mining algorithms can be applied in ophthalmology to detect eye disease at an early stage, so mining ophthalmology data efficiently is a critical issue. This research work deals with the development of an integrated knowledge-based system that helps detect eye refractive error early and provides appropriate advice to patients. In this study, the hybrid knowledge discovery process model of data mining, developed for academic research, is used. About 9000 ophthalmology records from selected eye health centers are used to build the model. The sample data were preprocessed for missing values, outliers, and noise. The model is then built using decision tree (J48 and REPTree) and rule induction (JRip and PART) algorithms. The PART algorithm registered better predictive performance, with accuracy of 60% and 96.45% for subjective and objective model evaluation, respectively, compared to J48, REPTree, and JRip. Finally, the knowledge discovered with this algorithm is used to build the knowledge-based system. The Java programming language is used to integrate the data mining results into the knowledge-based system. The performance of the proposed system is evaluated using prepared test cases. Overall, the knowledge-based system achieved 89.2% accuracy. The study concludes that knowledge discovered using data mining techniques could serve as a functional eye refractive error detection system.
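As a rough illustration of what the rule-induction family (JRip, PART) does, the classic OneR baseline induces one rule per attribute value and keeps the attribute with the fewest training errors. This sketch is not the PART algorithm itself, and the screening-style records below are invented, not the study's data.

```python
from collections import Counter, defaultdict

def one_r(records, attributes, label):
    """OneR: for each attribute, predict the majority class per value; keep the best attribute."""
    best = None
    for attr in attributes:
        by_value = defaultdict(Counter)
        for rec in records:
            by_value[rec[attr]][rec[label]] += 1
        # One rule per attribute value: predict the majority class seen for that value.
        rule = {v: cnt.most_common(1)[0][0] for v, cnt in by_value.items()}
        errors = sum(sum(cnt.values()) - max(cnt.values()) for cnt in by_value.values())
        if best is None or errors < best[2]:
            best = (attr, rule, errors)
    return best  # (attribute, value -> class mapping, training errors)

# Hypothetical records: blurred vision and screen time vs. a refractive-error label.
data = [
    {"blur": "yes", "screen": "high", "refractive_error": "yes"},
    {"blur": "yes", "screen": "low",  "refractive_error": "yes"},
    {"blur": "no",  "screen": "high", "refractive_error": "no"},
    {"blur": "no",  "screen": "low",  "refractive_error": "no"},
    {"blur": "yes", "screen": "high", "refractive_error": "yes"},
    {"blur": "no",  "screen": "high", "refractive_error": "yes"},
]
attr, rule, errs = one_r(data, ["blur", "screen"], "refractive_error")
print(attr, rule, errs)
```

Algorithms such as PART go further by growing partial decision trees and extracting a rule from each, but the output shape, human-readable if-then rules, is the same, which is what makes them easy to embed in a knowledge-based system.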

    SynthEye: Investigating the Impact of Synthetic Data on Artificial Intelligence-assisted Gene Diagnosis of Inherited Retinal Disease

    PURPOSE: Rare disease diagnosis is challenging in medical image-based artificial intelligence due to a natural class imbalance in datasets, leading to biased prediction models. Inherited retinal diseases (IRDs) are a research domain that particularly faces this issue. This study investigates the applicability of synthetic data in improving artificial intelligence-enabled diagnosis of IRDs using generative adversarial networks (GANs). DESIGN: Diagnostic study of gene-labeled fundus autofluorescence (FAF) IRD images using deep learning. PARTICIPANTS: Moorfields Eye Hospital (MEH) dataset of 15 692 FAF images obtained from 1800 patients with confirmed genetic diagnosis of 1 of 36 IRD genes. METHODS: A StyleGAN2 model is trained on the IRD dataset to generate 512 × 512 resolution images. Convolutional neural networks are trained for classification using different synthetically augmented datasets, including real IRD images plus 1800 and 3600 synthetic images, and a fully rebalanced dataset. We also perform an experiment with only synthetic data. All models are compared against a baseline convolutional neural network trained only on real data. MAIN OUTCOME MEASURES: We evaluated synthetic data quality using a Visual Turing Test conducted with 4 ophthalmologists from MEH. Synthetic and real images were compared using feature space visualization, similarity analysis to detect memorized images, and Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE) score for no-reference-based quality evaluation. Convolutional neural network diagnostic performance was determined on a held-out test set using the area under the receiver operating characteristic curve (AUROC) and Cohen's Kappa (κ). RESULTS: An average true recognition rate of 63% and a fake recognition rate of 47% were obtained from the Visual Turing Test. Thus, a considerable proportion of the synthetic images were classified as real by clinical experts.
Similarity analysis showed that the synthetic images were not copies of the real images, meaning the GAN was able to generalize rather than memorize. However, BRISQUE score analysis indicated that synthetic images were of significantly lower quality overall than real images (P < 0.05). Comparing the rebalanced model (RB) with the baseline (R), no significant change in the average AUROC and κ was found (R-AUROC = 0.86 [0.85-0.88], RB-AUROC = 0.88 [0.86-0.89], R-κ = 0.51 [0.49-0.53], and RB-κ = 0.52 [0.50-0.54]). The model trained on synthetic data only (S) achieved performance similar to the baseline (S-AUROC = 0.86 [0.85-0.87], S-κ = 0.48 [0.46-0.50]). CONCLUSIONS: Synthetic generation of realistic IRD FAF images is feasible. Synthetic data augmentation does not deliver improvements in classification performance. However, synthetic data alone deliver performance similar to real data, and hence may be useful as a proxy for real data. Financial Disclosure(s): Proprietary or commercial disclosure may be found after the references.
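The AUROC used as the main outcome measure here can be computed without any library via its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counting half. A minimal sketch with made-up labels and scores:

```python
def auroc(labels, scores):
    """AUROC via the Mann-Whitney statistic: P(score+ > score-), ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical binary labels (1 = target gene class) and classifier scores.
y = [1, 1, 1, 0, 0, 0]
s = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
print(auroc(y, s))  # 8 of 9 positive/negative pairs are ranked correctly
```

For the multi-class setting of the paper (36 genes), per-class AUROCs computed one-vs-rest this way are typically averaged; the sketch shows only the binary building block.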

    Extending Bayesian network models for mining and classification of glaucoma

    This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Glaucoma is a degenerative disease that damages the nerve fiber layer in the retina of the eye. Its mechanisms are not fully known and there is no fully effective strategy to prevent visual impairment and blindness. However, if treatment is carried out at an early stage, it is possible to slow glaucomatous progression and improve the quality of life of sufferers. Despite the great amount of heterogeneous data that has become available for monitoring glaucoma, the performance of tests for early diagnosis is still insufficient, due to the complexity of disease progression and the difficulties in obtaining sufficient measurements. This research aims to assess and extend Bayesian Network (BN) models to investigate the nature of the disease and its progression, as well as to improve early diagnosis performance. The flexibility of BNs and their ability to integrate clinician expertise make them a suitable tool to effectively exploit the available data. After presenting the problem, a series of BN models for cross-sectional data classification and integration are assessed; novel techniques are then proposed for classification and modelling of glaucoma progression. The results are validated against the literature, direct expert knowledge, and other Artificial Intelligence techniques, indicating that BNs and their proposed extensions improve glaucoma diagnosis performance and enable new insights into the disease process.
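The core mechanics behind a diagnostic BN can be shown with a toy two-node network, Disease → TestResult, where Bayes' rule inverts the conditional distribution to obtain a posterior from an observed test outcome. The numbers below are invented purely for illustration and are not clinical estimates from the thesis.

```python
def posterior(prior, sensitivity, specificity, test_positive=True):
    """P(disease | test result) for a two-node network disease -> test, via Bayes' rule."""
    if test_positive:
        num = sensitivity * prior                      # P(T+ | D) P(D)
        den = num + (1 - specificity) * (1 - prior)    # + P(T+ | not D) P(not D)
    else:
        num = (1 - sensitivity) * prior                # P(T- | D) P(D)
        den = num + specificity * (1 - prior)          # + P(T- | not D) P(not D)
    return num / den

# Hypothetical numbers: 5% prior prevalence, test sensitivity 0.8, specificity 0.9.
p = posterior(prior=0.05, sensitivity=0.8, specificity=0.9)
print(round(p, 3))
```

Real glaucoma BNs chain many such conditionals (visual field indices, structural measurements, risk factors) and answer the same kind of posterior query by propagating evidence through the graph; the two-node case is the smallest instance of that computation.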

    Evaluating an automated machine learning model that predicts visual acuity outcomes in patients with neovascular age-related macular degeneration

    PURPOSE: Neovascular age-related macular degeneration (nAMD) is a major global cause of blindness. Whilst anti-vascular endothelial growth factor (anti-VEGF) treatment is effective, response varies considerably between individuals. Thus, patients face substantial uncertainty regarding their future ability to perform daily tasks. In this study, we evaluate the performance of an automated machine learning (AutoML) model which predicts visual acuity (VA) outcomes in patients receiving treatment for nAMD, in comparison to a manually coded model built using the same dataset. Furthermore, we evaluate model performance across ethnic groups and analyse how the models reach their predictions. METHODS: Binary classification models were trained to predict whether patients' VA would be 'Above' or 'Below' a score of 70 one year after initiating treatment, measured using the Early Treatment Diabetic Retinopathy Study (ETDRS) chart. The AutoML model was built using the Google Cloud Platform, whilst the bespoke model was trained using an XGBoost framework. Models were compared and analysed using the What-If Tool (WIT), a novel model-agnostic interpretability tool. RESULTS: Our study included 1631 eyes from patients attending Moorfields Eye Hospital. The AutoML model (area under the curve [AUC], 0.849) achieved a highly similar performance to the XGBoost model (AUC, 0.847). Using the WIT, we found that the models over-predicted negative outcomes in Asian patients and performed worse in those with an ethnic category of Other. Baseline VA, age and ethnicity were the most important determinants of model predictions. Partial dependence plot analysis revealed a sigmoidal relationship between baseline VA and the probability of an outcome of 'Above'. CONCLUSION: We have described and validated an AutoML-WIT pipeline which enables clinicians with minimal coding skills to match the performance of a state-of-the-art algorithm and obtain explainable predictions.
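The sigmoidal relationship reported between baseline VA and the probability of an 'Above' outcome is the shape a logistic link naturally produces. The toy curve below shows how such a partial dependence might look; the midpoint and steepness coefficients are invented, not fitted to the study's data.

```python
import math

def p_above(baseline_va, midpoint=70.0, steepness=0.15):
    """Toy logistic curve: probability of scoring 'Above' 70 ETDRS letters at one year."""
    return 1.0 / (1.0 + math.exp(-steepness * (baseline_va - midpoint)))

# Sweep baseline VA (ETDRS letters) to see the sigmoidal partial dependence.
for va in (40, 60, 70, 80, 100):
    print(va, round(p_above(va), 2))
```

A partial dependence plot from a trained model is produced the same way in spirit: vary one feature over a grid while averaging the model's prediction over the rest of the dataset, then inspect the resulting curve.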

    Development of Machine Learning Techniques for Diabetic Retinopathy Risk Estimation

    Diabetic retinopathy (DR) is a chronic illness. It is one of the main complications of diabetes and an essential cause of vision loss among people suffering from diabetes. Diabetic patients must be screened periodically in order to detect signs of diabetic retinopathy development at an early stage. Early and frequent screening decreases the risk of vision loss and minimizes the load on health care centres. The number of diabetic patients is huge and rapidly increasing, which makes it hard and resource-consuming to perform a yearly screening for all of them. The main goal of this Ph.D. thesis is to build a clinical decision support system (CDSS) based on electronic health record (EHR) data. This CDSS will be used to estimate the risk of developing DR. In this thesis, I focus on developing novel interpretable machine learning systems based on fuzzy linguistic rules. The output of such systems lets the physician know which combinations of features can lead to the risk of developing DR. I propose a method to reduce the uncertainty in classifying diabetic patients using fuzzy decision trees (FDTs). A Fuzzy Random Forest (FRF) approach is also proposed to estimate the risk of developing DR, and several policies are proposed for merging the classification results of the different FDT models. To improve the final decision of our models, I propose three fuzzy measures that are used with the Choquet and Sugeno integrals. The definition of these fuzzy measures is based on the confidence values of the rules. In particular, one of them is a decomposable fuzzy measure in which the hierarchical structure of the FDT is exploited to find the values of the fuzzy measure. As a final result of this research, we have built a CDSS software that can be installed in primary care centres and hospitals and used by general practitioners for preventive evaluation and screening of Diabetic Retinopathy at early stages.
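A Choquet integral aggregates the per-tree scores against a fuzzy measure μ defined on coalitions of trees, which lets the fusion reward specific combinations of trees agreeing rather than treating them independently. This is a minimal sketch with a hypothetical hand-set measure, not the confidence-based measures proposed in the thesis.

```python
def choquet(scores, mu):
    """Discrete Choquet integral of `scores` w.r.t. fuzzy measure `mu` (keys: frozensets)."""
    items = sorted(scores, key=scores.get)  # ascending by score
    total, prev = 0.0, 0.0
    remaining = set(scores)                 # coalition of items with score >= current level
    for it in items:
        x = scores[it]
        total += (x - prev) * mu[frozenset(remaining)]
        prev = x
        remaining.remove(it)
    return total

# Hypothetical: two fuzzy decision trees t1, t2 scoring the "at risk of DR" class.
scores = {"t1": 0.6, "t2": 0.9}
mu = {  # a fuzzy measure: monotone, mu(empty set) = 0, mu(all trees) = 1
    frozenset(): 0.0,
    frozenset({"t1"}): 0.4,
    frozenset({"t2"}): 0.5,
    frozenset({"t1", "t2"}): 1.0,
}
print(choquet(scores, mu))
```

Because μ({t1, t2}) = 1.0 exceeds μ({t1}) + μ({t2}) here, the measure rewards the two trees jointly supporting the class; with an additive μ the Choquet integral collapses to a plain weighted average, so the result always lies between the minimum and maximum input scores.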

    Improving operating room schedule in a portuguese hospital : a machine learning approach to predict operating room time

    Master's thesis, Engenharia Biomédica e Biofísica, 2022, Universidade de Lisboa, Faculdade de Ciências. For most hospitals, the operating room (OR) is a significant source of expenses and income. A critical point of effective OR scheduling is the prediction of OR time for a patient procedure. An inefficient schedule results in two scenarios: underestimated or overestimated OR times. A solution reported in the literature is the implementation of machine learning (ML) models that include additional variables to improve the accuracy of these predictions. This project's goal is to improve OR schedule efficiency in a hospital center by achieving precise OR time predictions. The goal was accomplished by developing two ML models (Multiple Linear Regression (MLR) and Random Forest (RF)) through two different approaches: first, a single model for all the specialties in the dataset (All Specialties Model); second, a specialty-specific model for each (Urology, General Surgery, and Orthopedics Models). This leads to eight models, whose predictive features were identified from the literature along with consultations with professionals. The All Specialties Model presented a median surgery time of 115.0 minutes, with an R-squared of around 0.7. Urology had a median time of 70.0 minutes, with an R-squared of 0.822 and 0.831 and a MAE of 21.7 and 20.9 minutes for the MLR and RF models, respectively. General Surgery had a median time of 110.0 minutes, with an R-squared of 0.826 and 0.825 and a MAE of 26.2 and 26.1 minutes for MLR and RF, respectively. For Orthopedics, RF was the only model able to fit all the data, with an R-squared of 0.683 and a MAE of 27.1 minutes. Compared with the current methods, considering a 10% threshold, the models achieved a 41% reduction in underestimated surgeries and a 19% increase in predictions within the threshold, albeit with a 22% increase in overestimated predictions.
We conclude that using ML approaches improves the accuracy of OR time predictions.

The operating room is one of the hospital units that generates the greatest expenses and revenues. It is a highly complex environment requiring the allocation of extremely costly material and human resources, and it must therefore be managed efficiently to ensure that the initial investment yields its return and is used to its full potential. In parallel, public hospitals integrated in the Serviço Nacional de Saúde face long waiting lists they must respond to. This growing demand for health services requiring OR treatment, aggravated by population ageing, drives all professionals involved to ensure that the whole population's needs are met. A central point of the problem described is, in the first instance, to guarantee efficient surgical scheduling. When a patient is selected for a programmable (elective) surgery, they are placed on a waiting list and scheduled for the procedure. At scheduling time, the OR time the patient will require must be known in order to reserve a block of room time adequate for the surgical procedure. An inefficient surgical schedule can generate two undesirable scenarios. If OR time is underestimated, i.e. the predicted time is shorter than the actual time, the surgery runs longer than estimated and delays subsequent operations; in the worst case, operations are cancelled. If it is overestimated, the surgery takes less time than estimated and OR resources are not fully used.

In most hospitals, this OR time prediction is based on the surgeon's experience, and the implementation of artificial intelligence tools for this task is still scarce. This kind of prediction leads to a high number of underestimated surgeries, since surgeons mostly do not take into account patient and anaesthetic factors that affect OR time, usually considering only the time needed for the surgery itself. Surgeons also tend to allocate as many surgeries as possible into a short time block, which leads to unrealistic predictions. A solution reported in the literature is the implementation of machine learning algorithms to develop models that incorporate patient-related, operational, anaesthetic, and staff-related variables; such approaches have been shown to improve the accuracy of OR time prediction. The methodology of this project first covered the methods practised at the hospital centre studied, Centro Hospitalar Lisboa Central (CHULC), then validated the relevance of the project, with the main objective of increasing OR efficiency by improving the precision of OR time prediction. The whole methodology was built on a database provided by this institution containing all surgeries of the Urology, General Surgery, and Orthopedics specialties performed over the last five years (January 2017 to December 2021). To achieve the central goal of improving OR time prediction, two machine learning models whose output is OR time were proposed, a multiple linear regression model and a Random Forest (RF), under two approaches: the first developed a single model for all three specialties in the database, and the second a specific model for each individual specialty. This led to a total of eight models, since both algorithms were implemented in each approach. The potentially predictive variables in the CHULC database were identified from the literature review and from meetings with the service directors of the specialties involved, hospital administrators, and anaesthesiologists.

After reviewing the methodology currently used at CHULC for OR time prediction, which is based on the surgeon's own experience, the impact of surgeon-controlled time and anaesthesia time on OR time was assessed. Surgeon-controlled time showed the highest correlation with OR time, with a Pearson coefficient of 0.966, followed by anaesthesia time with a coefficient of 0.686. The high correlation of surgeon-controlled time with OR time indicates that current prediction practice is not entirely wrong, but it is not fully realistic, since it does not consider all the factors that influence this time. Including patient, hospital, and anaesthesia variables in the eight proposed models, for a median OR time of 115.0 minutes, the all-specialties linear regression model obtained an R-squared of 0.780 with a mean absolute error of 26.9 minutes. The Urology models presented an R-squared of 0.822 and 0.831 and a mean error of 21.7 and 20.9 minutes for the linear regression and RF models, respectively, with a median surgery time of 70.0 minutes. For General Surgery, the median surgery time is 110.0 minutes, with an R-squared of 0.826 and 0.825 and a mean error of 26.2 and 26.1 minutes for the linear regression and RF models, respectively. In the Orthopedics model, the RF algorithm was the only one able to model all of this specialty's data, with an R-squared of 0.683 and a mean error of 27.1 minutes, for a median surgery time of 130.0 minutes. In this specialty, linear regression was able to model all surgeries except knee and hip procedures, with an R-squared of 0.685 and a mean error of 28.9 minutes. The possible causes were raised and described in greater detail: the high variability between procedures and the patient profile (multiple diagnoses and multiple medications) were the key points identified by the director of orthopaedic surgery at CHULC. Compared with CHULC's current methods, all models achieved a significant reduction in OR time prediction error. Considering a 10% margin, all models showed a reduction of about 41% in the percentage of underestimated surgeries and an increase of around 19% in correctly estimated surgeries; however, the models recorded a 22% increase in overestimated surgeries. Future studies translating the impact of under- and overestimated surgeries will be needed to complement these results. The variable with the greatest impact in all RF models was the surgeon's average time for the type of surgical procedure performed. Given the high degree of linearity of this variable with the model output, OR time, expressed by a Pearson coefficient of 0.865, the linear regression model was able to capture this relationship precisely and consequently achieved results similar to the RF model in the Urology and General Surgery specialties.

We conclude that implementing machine learning approaches improves the accuracy of OR time prediction, and that these models can serve as a clinical decision support tool to aid surgical scheduling. Operationalising these results at the hospital level will require future work.
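The evaluation summarised above combines regression metrics (MAE, R-squared) with a 10% tolerance band around the actual OR time; both are straightforward to reproduce. A sketch with invented predicted and actual times in minutes (not the thesis data):

```python
def evaluate(actual, predicted, margin=0.10):
    """MAE, R-squared, and under/within/over counts for a relative tolerance band."""
    n = len(actual)
    mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / n
    mean_a = sum(actual) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    r2 = 1 - ss_res / ss_tot
    band = {"under": 0, "within": 0, "over": 0}
    for a, p in zip(actual, predicted):
        if p < a * (1 - margin):
            band["under"] += 1    # predicted too short: surgery overruns the slot
        elif p > a * (1 + margin):
            band["over"] += 1     # predicted too long: OR sits idle
        else:
            band["within"] += 1
    return mae, r2, band

# Hypothetical OR times in minutes.
actual    = [70, 110, 115, 130, 90]
predicted = [65, 118, 100, 140, 92]
mae, r2, band = evaluate(actual, predicted)
print(round(mae, 1), round(r2, 3), band)
```

Counting under- and overestimates separately matters because, as the thesis notes, the two errors have asymmetric costs: underestimates cascade into delays and cancellations, while overestimates only waste room time.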

    Accelerating precision ophthalmology: recent advances

    Introduction: The future of ophthalmology is precision medicine. With a growing incidence of lifestyle-associated ophthalmic disease such as diabetic retinopathy, the use of technology has the potential to overcome the burden on clinical specialists. Advances in precision medicine will help improve diagnosis and better triage those with higher clinical need to the appropriate experts, as well as providing a more tailored approach to treatment that could help transform patient management. Areas covered: A detailed literature review was conducted using the OVID Medline and PubMed databases to explore advances in precision medicine within the areas of retinal disease, glaucoma, cornea, cataracts and uveitis. Advances over the last three years (2019–2022) are explored, particularly technological and genomic advances in screening, diagnosis, and management within these fields. Expert opinion: Artificial intelligence and its subfield deep learning provide the most substantial ways in which diagnosis and management of ocular diseases can be further developed within the advancing field of precision medicine. Future challenges include optimal training sets for algorithms and further developing pharmacogenetics in more specialized areas.