31 research outputs found

    A Machine Learning Approach for Prediction of Hospital Bed Availability

    Hospital beds are a scarce resource for healthcare facilities; data is not. In this thesis, we argue that machine learning techniques can take advantage of the abundant data available in hospital information systems to build analytics solutions that promote the efficient utilization of beds by improving management decision making. To test this hypothesis, we worked with one of the most important medical institutions in Buenos Aires. Our work focused on building a machine learning model that predicts the probability of a given patient being discharged within the next twenty-four hours, based on the patient's medical records, demographic data, and some environmental factors. To this end, data engineering and supervised learning techniques were applied in the context of a classification task. We experimented with different algorithms and feature representation approaches to make the most of the data at hand. The resulting model achieves a promising AUC-ROC score of 0.84 and has been shown to generalize well to unseen data. This model was the basis for a discharge forecasting tool. The tool returns three different predictions of the hospital's discharges for a specified date, each with an associated "confidence level", thus providing management with a risk-informed prediction of hospital bed availability. The "confidence" reported for each forecast was obtained through a simulation on historical data, in which the forecast output was compared with the scenario actually observed. The hospital's operations management team has explicitly expressed its interest in the solution, as they consider that it has enormous potential to facilitate the bed planning process and thereby improve the hospital's operational efficiency.
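    As a rough illustration of how per-patient discharge probabilities could be turned into several forecasts with different confidence levels, the sketch below runs a simple Monte Carlo simulation. It is not the authors' method: the abstract only states that the confidence levels came from a simulation on historical data. The function name `forecast_discharges`, the confidence levels, and the example probabilities are all hypothetical, and the sketch assumes discharges are independent across patients.

```python
import random


def forecast_discharges(probabilities, levels=(0.5, 0.8, 0.95), n_sims=10_000, seed=0):
    """Return, for each confidence level, a discharge count that was met or
    exceeded in at least that share of simulated days.

    `probabilities` holds one predicted discharge probability per inpatient
    (hypothetical model output; independence between patients is assumed).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    totals = sorted(
        sum(rng.random() < p for p in probabilities)  # discharges in one simulated day
        for _ in range(n_sims)
    )
    forecast = {}
    for level in levels:
        # Position of the (1 - level) quantile in the sorted simulated totals.
        idx = min(int((1 - level) * n_sims), n_sims - 1)
        forecast[level] = totals[idx]
    return forecast


if __name__ == "__main__":
    # Hypothetical probabilities for 30 current inpatients.
    probs = [0.3] * 20 + [0.7] * 10
    print(forecast_discharges(probs))
```

    A higher confidence level yields a more conservative (lower or equal) count, which is one way a planner could trade off risk when reserving beds.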

    Predicting Lapse Rate in Life Insurance: An Exploration of Machine Learning Techniques

    Dissertation presented as the partial requirement for obtaining a Master's degree in Statistics and Information Management, specialization in Risk Analysis and Management. Life insurance is an important financial safety net for many individuals and families. This study investigates the use of machine learning techniques to predict the lapse rate in life insurance. The lapse rate, which refers to the rate of policy cancellations or expirations, plays a crucial role in the viability of life insurance companies as they determine pricing strategies, manage risk, and plan for the future. Data was collected through a risk survey administered to policyholders, covering their characteristics, policy details, and historical lapse patterns. A variety of machine learning algorithms were then applied to the collected data to evaluate their performance in predicting the lapse rate. The results demonstrate the effectiveness of machine learning methods in forecasting the lapse rate in life insurance. The Extreme Gradient Boosting, C5.0, and Random Forest algorithms produced the best results on the dataset. Additionally, several key policy and customer characteristics were identified as having significant predictive power with regard to the lapse rate. However, the limitations of the study must be taken into consideration. Further research is necessary to validate the results on larger and more diverse datasets and to examine the practical applications of the models in the life insurance industry. In conclusion, this study contributes to the existing body of knowledge on the use of machine learning in the insurance industry and has the potential to inform the development of more efficient risk management practices in the life insurance sector. Keywords: Life Insurance, Insurance, Risk Management, Predictive Data Mining Methods, Classification Problem, Lapse Risk, Risk Classification

    Decision support by machine learning systems for acute management of severely injured patients: A systematic review

    Introduction: Treating severely injured patients requires numerous critical decisions within short intervals in a highly complex situation. The coordination of a trauma team in this setting has been shown to be associated with multiple procedural errors, even among experienced care teams. Machine learning (ML) is an approach that estimates outcomes based on past experiences and data patterns using a computer-generated algorithm. This systematic review aimed to summarize the existing literature on the value of ML for the initial management of severely injured patients. Methods: We conducted a systematic review of the literature with the goal of finding all articles describing the use of ML systems in the context of acute management of severely injured patients. A MeSH search of PubMed/MEDLINE and Web of Science was conducted. Studies including fewer than 10 patients were excluded. Studies were divided into the following main prediction groups: (1) injury pattern, (2) hemorrhage/need for transfusion, (3) emergency intervention, (4) ICU/length of hospital stay, and (5) mortality. Results: Thirty-six articles met the inclusion criteria; among these were two prospective and thirty-four retrospective case series. Publication dates ranged from 2000 to 2020 and included 32 different first authors. A total of 18,586,929 patients were included in the prediction models. Mortality was the most represented main prediction group (n = 19). ML models used were artificial neural network (n = 15), singular vector machine (n = 3), Bayesian network (n = 7), random forest (n = 6), natural language processing (n = 2), stacked ensemble classifier [SuperLearner (SL), n = 3], k-nearest neighbor (n = 1), belief system (n = 1), and sequential minimal optimization (n = 2) models. Thirty articles assessed results as positive, five showed moderate results, and one article described negative results from the implementation of the respective prediction model.
    Conclusions: While the majority of articles show a generally positive result with high accuracy and precision, several requirements need to be met to make the implementation of such models in daily clinical work possible. Furthermore, experience with on-site implementation and more clinical trials are necessary before the implementation of ML techniques in clinical care can become a reality.

    Hospital length of stay prediction tools for all hospital admissions and general medicine populations: systematic review and meta-analysis

    Background: Unwarranted extended length of stay (LOS) increases the risk of hospital-acquired complications, morbidity, and all-cause mortality and needs to be recognized and addressed proactively. Objective: This systematic review aimed to identify validated prediction variables and methods used in tools that predict the risk of prolonged LOS in all hospital admissions and specifically General Medicine (GenMed) admissions. Method: LOS prediction tools published since 2010 were identified in five major research databases. The main outcomes were model performance metrics, prediction variables, and level of validation. Meta-analysis was completed for validated models. The risk of bias was assessed using the PROBAST checklist. Results: Overall, 25 all admission studies and 14 GenMed studies were identified. Statistical and machine learning methods were used almost equally in both groups. Calibration metrics were reported infrequently, with only 2 of 39 studies performing external validation. Meta-analysis of all admissions validation studies revealed a 95% prediction interval for theta of 0.596 to 0.798 for the area under the curve. Important predictor categories were co-morbidity diagnoses and illness severity risk scores, demographics, and admission characteristics. Overall study quality was deemed low due to poor data processing and analysis reporting. Conclusion: To the best of our knowledge, this is the first systematic review assessing the quality of risk prediction models for hospital LOS in GenMed and all admissions groups. Notably, both machine learning and statistical modeling demonstrated good predictive performance, but models were infrequently externally validated and had poor overall study quality. Moving forward, a focus on quality methods by the adoption of existing guidelines and external validation is needed before clinical application. Systematic review registration: https://www.crd.york.ac.uk/PROSPERO/, identifier: CRD42021272198

    Predicting healthcare demand using machine learning on patient data


    Essentials of Business Analytics


    Comparative Analysis of Student Learning: Technical, Methodological and Result Assessing of PISA-OECD and INVALSI-Italian Systems

    PISA is the most extensive international survey promoted by the OECD in the field of education; every three years it measures the skills of fifteen-year-old students from more than 80 participating countries. The INVALSI tests are written tests taken every year by all Italian students at key moments of the school cycle to evaluate their levels in fundamental skills in Italian, Mathematics and English. Our comparison covers the period up to 2018, the most recent year of the PISA-OECD survey, although the latest INVALSI edition was carried out in 2022. Our analysis focuses on the part of the reference populations that the two sources share, namely 15-year-old students in the second year of upper secondary school, where both sources give a similar picture of the students.

    Analyzing Granger causality in climate data with time series classification methods

    Attribution studies in climate science aim to scientifically ascertain the influence of climatic variations on natural or anthropogenic factors. Many of those studies adopt the concept of Granger causality to infer statistical cause-effect relationships, while utilizing traditional autoregressive models. In this article, we investigate the potential of state-of-the-art time series classification techniques to enhance causal inference in climate science. We conduct a comparative experimental study of different types of algorithms on a large test suite that comprises a unique collection of datasets from the area of climate-vegetation dynamics. The results indicate that specialized time series classification methods are able to improve existing inference procedures. Substantial differences are observed among the methods that were tested.

    Emerg Infect Dis

    PMC4550154611