Ensemble model-based method for time series sensors’ data validation and imputation applied to a real waste water treatment plant

Abstract

Intelligent Decision Support Systems (IDSSs) integrate different Artificial Intelligence (AI) techniques with the aim of taking or supporting human-like decisions. To this end, these techniques rely on the data available from the target process. This implies that invalid or missing data could trigger incorrect decisions and, therefore, undesirable situations in the supervised process. This is even more important in environmental systems, where a malfunction could jeopardise related ecosystems. In data-driven applications such as IDSSs, data quality is a fundamental problem that should be addressed for the sake of the overall system's performance. In this paper, a data validation and imputation methodology for time series is presented. This methodology is integrated in an IDSS software tool which generates suitable set-points to control the process. The data validation and imputation approach presented here focuses on the imputation step, and it is based on an ensemble of different prediction models obtained for the sensors involved in the process. A Case-Based Reasoning (CBR) approach is used for data imputation, i.e., past situations similar to the current one can propose new values for the missing ones. The CBR model is complemented with other prediction models such as Auto Regressive (AR) models or Artificial Neural Network (ANN) models. Then, the different predictions obtained are combined in an ensemble to achieve better prediction performance than that obtained by each individual prediction model separately. Furthermore, the use of a meta-prediction model, trained using the predictions of all individual models as inputs, is proposed and compared with other ensemble methods to validate its performance.
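The idea of combining several predictors, either by simple averaging or through a meta-prediction model trained on their outputs, can be illustrated with a minimal sketch. This is not the authors' implementation: the signal is synthetic, all function names are hypothetical, the CBR step is reduced to a nearest-neighbour lookup over past windows, and a persistence forecaster stands in for the ANN model to keep the example dependency-free.

```python
import numpy as np

def lag_matrix(series, n_lags):
    """Rows are windows of n_lags past values; targets are the next value."""
    X = np.array([series[i:i + n_lags] for i in range(len(series) - n_lags)])
    return X, series[n_lags:]

def fit_linear(X, y):
    """Least-squares linear model with intercept (used for AR and the meta-model)."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return np.column_stack([X, np.ones(len(X))]) @ coef

def predict_cbr(case_X, case_y, X, k=3):
    """CBR-style retrieval: average the outcomes of the k most similar past cases."""
    dists = np.linalg.norm(X[:, None, :] - case_X[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    return case_y[nearest].mean(axis=1)

def rmse(pred, target):
    return float(np.sqrt(np.mean((pred - target) ** 2)))

# Synthetic stand-in for a sensor signal (assumption: periodic plus noise).
rng = np.random.default_rng(0)
t = np.arange(600)
series = 10 + 3 * np.sin(2 * np.pi * t / 48) + 0.3 * rng.standard_normal(len(t))

X, y = lag_matrix(series, n_lags=6)
X_base, y_base = X[:300], y[:300]        # case base / base-model training
X_meta, y_meta = X[300:450], y[300:450]  # meta-model training
X_test, y_test = X[450:], y[450:]        # values treated as "missing"

ar_coef = fit_linear(X_base, y_base)

def base_predictions(X_new):
    """One column per base model: AR, CBR-style lookup, persistence."""
    return np.column_stack([
        predict_linear(ar_coef, X_new),
        predict_cbr(X_base, y_base, X_new),
        X_new[:, -1],  # persistence: repeat the last observed value
    ])

P_meta = base_predictions(X_meta)
P_test = base_predictions(X_test)

# Ensemble 1: plain average of the base predictions.
mean_ensemble = P_test.mean(axis=1)

# Ensemble 2: meta-prediction model trained on the base predictions.
meta_coef = fit_linear(P_meta, y_meta)
stacked = predict_linear(meta_coef, P_test)

for name, pred in [("AR", P_test[:, 0]), ("CBR", P_test[:, 1]),
                   ("persistence", P_test[:, 2]),
                   ("mean ensemble", mean_ensemble), ("meta-model", stacked)]:
    print(f"{name:>14}: RMSE = {rmse(pred, y_test):.3f}")
```

In an imputation setting, the ensemble output at a given time step would replace the value flagged as invalid or missing. The meta-model is fitted on a separate slice of data from the base models so that it learns how to weight predictions the base models produce on data they have not seen.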
Finally, this approach is illustrated in a real Waste Water Treatment Plant (WWTP) case study using one of the most relevant measures for the correct operation of the WWTP's IDSS, i.e., the ammonia sensor, and considering real faults. The results are promising, with the ensemble approach presented here performing better than each individual model separately.

The authors acknowledge the partial support of this work by the Industrial Doctorate Programme (2017DI-006) and the Research Consolidated Groups/Centres Grant (2017 SGR 574) from the Catalan Agency of University and Research Grants Management (AGAUR), from the Catalan Government.

Peer Reviewed

Postprint (published version)
