1,648 research outputs found

    Improving prediction of COVID-19 evolution by fusing epidemiological and mobility data

    We are witnessing the dramatic consequences of the COVID-19 pandemic which, unfortunately, go beyond the impact on the health system. Until herd immunity is achieved with vaccines, the only available mechanisms for controlling the pandemic are quarantines, perimeter closures and social distancing with the aim of reducing mobility. Governments only apply these measures for a reduced period, since they involve the closure of economic activities such as tourism, cultural activities, or nightlife. The main criterion for establishing these measures and planning socioeconomic subsidies is the evolution of infections. However, the collapse of the health system and the unpredictability of human behavior, among others, make it difficult to predict this evolution in the short to medium term. This article evaluates different models for the early prediction of the evolution of the COVID-19 pandemic to create a decision support system for policy-makers. We consider a wide range of models, including artificial neural networks such as LSTM and GRU and statistical models such as autoregressive (AR) or ARIMA. Moreover, several consensus strategies that ensemble all models into one system are proposed to obtain better results in this uncertain environment. Finally, a multivariate model that includes mobility data provided by Google is proposed to better forecast trend changes in the 14-day CI. A real case study in Spain is evaluated, providing very accurate results for the prediction of the 14-day CI in scenarios with and without trend changes, reaching an R² of 0.93, an RMSE of 4.16 and an MAE of 1.08.
    This work has been partially supported by the Spanish Ministry of Science and Innovation, under Grants RYC2018-025580-I, RTI2018-096384-B-I00, RTC-2017-6389-5 and RTC2019-007159-5, by the Fundacion Seneca del Centro de Coordinacion de la Investigacion de la Region de Murcia under Project 20813/PI/18, by the "Conselleria de Educacion, Investigacion, Cultura y Deporte, Direccio General de Ciencia i Investigacio, Proyectos AICO/2020", Spain, under Grant AICO/2020/302 and a predoctoral contract by the Generalitat Valenciana and the European Social Fund under Grant ACIF/2018/219.
    García-Cremades, S.; Morales-García, J.; Hernández-Sanjaime, R.; Martínez-España, R.; Bueno-Crespo, A.; Hernández-Orallo, E.; López-Espín, JJ.... (2021). Improving prediction of COVID-19 evolution by fusing epidemiological and mobility data. Scientific Reports. 11(1):1-16. https://doi.org/10.1038/s41598-021-94696-2

    Enhancing Prediction and Analysis of UK Road Traffic Accident Severity Using AI: Integration of Machine Learning, Econometric Techniques, and Time Series Forecasting in Public Health Research

    This research investigates road traffic accident severity in the UK, using a combination of machine learning, econometric, and statistical methods on historical data. We employed various techniques, including correlation analysis, regression models, GMM for error term issues, and time-series forecasting with VAR and ARIMA models. Our approach outperforms naive forecasting with an MASE of 0.800 and ME of -73.80. We also built a random forest classifier with 73% precision, 78% recall, and a 73% F1-score. Optimizing with H2O AutoML led to an XGBoost model with an RMSE of 0.176 and MAE of 0.087. Factor Analysis identified key variables, and we used SHAP for Explainable AI, highlighting influential factors like Driver_Home_Area_Type and Road_Type. Our study enhances understanding of accident severity and offers insights for evidence-based road safety policies.
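    As a reading aid for the MASE figure quoted in this abstract: the mean absolute scaled error divides a model's mean absolute forecast error by the mean absolute error of a naive forecast, so values below 1 indicate that the model beats the naive baseline. The sketch below assumes the common one-step naive baseline (each value predicted by the previous one) computed on the training series; the abstract does not state which variant the authors used, and the function name `mase` and the toy numbers are illustrative only.

```python
def mase(actual, forecast, training):
    """Mean absolute scaled error against a one-step naive forecast.

    Assumption: the scaling term is the in-sample mean absolute error of
    the naive forecast y[t] = y[t-1] on the training series. Seasonal
    variants would use a longer lag instead.
    """
    if not actual or len(actual) != len(forecast):
        raise ValueError("actual and forecast must be non-empty and equal length")
    if len(training) < 2:
        raise ValueError("training series needs at least two observations")

    # Mean absolute error of the model on the evaluation window.
    mae_model = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

    # Mean absolute error of the naive forecast on the training series.
    naive_errors = [abs(training[i] - training[i - 1]) for i in range(1, len(training))]
    mae_naive = sum(naive_errors) / len(naive_errors)
    if mae_naive == 0:
        raise ValueError("naive forecast error is zero; MASE is undefined")

    return mae_model / mae_naive


# Toy data (made up for illustration, not taken from the study).
training = [10, 12, 11, 13, 12]
actual = [14, 13]
forecast = [13, 13.5]
score = mase(actual, forecast, training)
print(f"MASE = {score:.3f}")  # below 1 means better than the naive baseline
```

    Under this definition, an MASE of 0.800 would mean the model's average absolute error is 80% of the naive forecast's.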

    A plan for application system verification tests: The value of improved meteorological information, volume 1

    The framework within which the Applications Systems Verification Tests (ASVTs) are performed and the economic consequences of improved meteorological information demonstrated is described. This framework considers the impact of improved information on decision processes, the data needs to demonstrate the economic impact of the improved information, the data availability, the methodology for determining and analyzing the collected data and demonstrating the economic impact of the improved information, and the possible methods of data collection. Three ASVTs are considered and program outlines and plans are developed for performing experiments to demonstrate the economic consequences of improved meteorological information. The ASVTs are concerned with the citrus crop in Florida, the cotton crop in Mississippi and a group of diverse crops in Oregon. The program outlines and plans include schedules, manpower estimates and funding requirements.

    Six papers on computational methods for the analysis of structured and unstructured data in the economic domain

    This work investigates the application of computational methods for structured and unstructured data. The domains of application are two closely connected fields with the common goal of promoting the stability of the financial system: systemic risk and bank supervision. The work explores different families of models and applies them to different tasks: graphical Gaussian network models to address bank interconnectivity, topic models to monitor bank news and deep learning for text classification. New applications and variants of these models are investigated, with particular attention to the combined use of textual and structured data. The penultimate chapter introduces a sentiment polarity classification tool for Italian, based on deep learning, to simplify future research relying on sentiment analysis. The different models have proven useful for leveraging numerical (structured) and textual (unstructured) data. Graphical Gaussian models and topic models have been adopted for inspection and descriptive tasks, while deep learning has been applied more to predictive (classification) problems. Overall, the integration of textual (unstructured) and numerical (structured) information has proven useful for analysis related to systemic risk and bank supervision. The integration of textual data with numerical data has, in fact, led either to higher predictive performance or to an enhanced capability to explain phenomena and correlate them with other events.

    Predicting Real Estate Sales Volume in Finland: Building a Predictive Model for the Sales Volume of Old Apartments

    The aim of this Master’s Thesis is to find an optimal set of explanatory variables affecting the real estate market in order to build a robust and accurate predictive model that forecasts the development of the real estate sales volume for the next 12 months. In more detail, this research examines the prior literature on the factors affecting the real estate market and on predictive models, on the basis of which the initial variable set is constructed and the model is built. Two interviews with industry experts are conducted in order to gain deeper knowledge of the field. The research aims to answer the following research questions: (1) Which factors or input variables should be included when predicting the real estate sales volume, more precisely the sales volume of old apartments, in Finland? (2) Which modelling method gives the best result when predicting real estate sales for the next 12 months, given the nature of the data? (3) How does the sales volume of old apartments differ by the apartment’s location and type? Thus, the research tries to build a robust predictive model that can predict the number of old apartments sold in Finland for the next 12 months as accurately as possible. This research is conducted as both a quantitative and a qualitative study. In order to connect the results of this study to the existing literature and theoretical framework, five hypotheses were created: (H1) the total number of old apartments sold will increase within the next 12 months; (H2) the sales volume for old apartments will increase more in the capital region (Helsinki, Espoo and Vantaa) than in other regions; (H3) the sales volume for smaller studio apartments will increase more than for other apartment types; (H4) the economic variables have the biggest impact on the number of houses sold; and (H5) search query data from Google Trends enhances the model and serves as an important predictor variable. Four models were created to predict the sales volume. Poisson regression and Negative Binomial regression were chosen as the modelling methods, given that the response variable represented count data. Based on the results, the Negative Binomial regression model using predictor variables from Lasso variable selection was the best model, as it had the best goodness of fit and thus the best prediction accuracy. Based on the forecasts, it seems that the total sales volume of old apartments will increase overall within the next 12 months regardless of location or type. The growth will be strongest in the capital region, followed by Tampere and Turku. Variables related to the economy or finance seem to be the most important ones for predicting the sales volume of apartments.

    Renewable Energy Resource Assessment and Forecasting

    In recent years, several projects and studies have been launched towards the development and use of new methodologies, in order to assess, monitor, and support clean forms of energy. Accurate estimation of the available energy potential is of primary importance, but is not always easy to achieve. The present Special Issue on ‘Renewable Energy Resource Assessment and Forecasting’ aims to provide a holistic approach to the above issues, by presenting multidisciplinary methodologies and tools that are able to support research projects and meet today’s technical, socio-economic, and decision-making needs. In particular, research papers, reviews, and case studies on the following subjects are presented: wind, wave and solar energy; biofuels; resource assessment of combined renewable energy forms; numerical models for renewable energy forecasting; integrated forecasted systems; energy for buildings; sustainable development; resource analysis tools and statistical models; extreme value analysis and forecasting for renewable energy resources.

    Machine learning based adaptive soft sensor for flash point inference in a refinery realtime process

    In industrial control processes, certain characteristics are sometimes difficult to measure with a physical sensor due to technical and/or economic limitations. This is especially true in the petrochemical industry. Some of those quantities are crucial for operators and process safety. This is the case for the automotive diesel Flash Point Temperature (FT). Traditional methods for FT estimation are based on the study of the empirical inference between flammability properties and the target magnitude. The necessary measurements are obtained indirectly, by taking samples from the process and analyzing them in the laboratory. This takes time (it can take hours from collection to flash temperature measurement) and thus makes real-time monitoring very difficult, which in turn results in safety and economic losses. This study defines a procedure based on Machine Learning modules that demonstrates the power of real-time monitoring over real data from an important international refinery. As input, easily measured values provided in real time, such as temperature, pressure, and hydraulic flow, are used, and a benchmark of different regression algorithms for FT estimation is presented. The study highlights the importance of sequencing preprocessing techniques for the correct inference of values. The implementation of adaptive learning strategies achieves considerable economic benefits in the productization of this soft sensor. The validity of the method is tested in a real refinery. In addition, real-world industrial data sets tend to be unstable and volatile, and the data is often affected by noise, outliers, irrelevant or unnecessary features, and missing data. This contribution demonstrates, through the introduction of a new concept called an adaptive soft sensor, the importance of dynamically adapting the Machine Learning schemes through their combination with feature selection, dimensional reduction, and signal processing techniques. The economic benefits of applying this soft sensor in the refinery's production plant are presented as potential semi-annual savings.
    This work has received funding support from the SPRI-Basque Government through the ELKARTEK program (OILTWIN project, ref. KK-2020/00052).

    Fairness, engagement, and discourse analysis in AI-driven social media and healthcare

    This thesis addresses the critical concerns of fairness, accountability, transparency, and ethics (FATE) within the context of artificial intelligence (AI) systems applied to social media and healthcare domains. First, a comprehensive survey examines existing research on FATE in AI, specifically focusing on the subdomains of social media and healthcare. The survey evaluates current solutions, highlights their benefits, limitations, and potential challenges, and charts out future research directions. Key findings emphasize the significance of statistical and intersectional fairness in ensuring equitable healthcare access on social media platforms and highlight the pivotal role of transparency in AI systems to foster accountability. Building upon the survey, this thesis delves into an analysis of social media usage by healthcare organizations, with a specific emphasis on engagement and sentiment forecasting during the COVID-19 pandemic. Data collection from Twitter handles of pharmaceutical companies, public health agencies, and the World Health Organization enables extensive analysis. Natural language processing (NLP)-based topic modeling techniques are applied to identify health-related topics, while sentiment forecasting models are employed to gauge public sentiment. The results uncover the impact of COVID-19-related topics on public engagement, highlighting the varying levels of engagement across diverse healthcare organizations. Notably, the World Health Organization exhibits dynamic engagement patterns over time, necessitating adaptable strategies. The thesis further presents the latest sentiment forecasting models, such as autoregressive integrated moving average (ARIMA) and seasonal autoregressive integrated moving average with exogenous factors (SARIMAX), which enable organizations to optimize their content strategies for maximum user engagement.
Furthermore, discourse analysis is conducted to unravel the factors that shape the content of tweets by healthcare organizations on Twitter. [...

    Learning from the Past, Looking to the Future: Modeling Social Unrest in Karachi, Pakistan
