36 research outputs found

    Modeling Temporal Pattern and Event Detection using Hidden Markov Model with Application to a Sludge Bulking Data

    This paper discusses a method for modeling temporal patterns and detecting events in continuous time series data, based on a Hidden Markov Model (HMM). We also provide methods for checking model adequacy and predicting future events. These methods are applied to real sludge bulking data to detect sludge bulking at a water plant in Chicago.

    Detection and Analysis of Sludge Bulking Events Using Data Mining and Machine Learning Approach

    Sludge bulking is the most notable cause of activated sludge plant failure (i.e. exceeding discharge permit quality limits) worldwide. Numerous mathematical methods have been applied to detect and provide warning for the prevention of sludge bulking. However, these models often fail to reliably forecast sludge bulking events because they rely on a point-by-point curve-fitting strategy, while bulking-event data points are few relative to the large amount of data in the time series. Therefore, this study considers three machine learning approaches that focus on detecting the temporal patterns preceding sludge bulking events. The main objective of this research is to apply machine learning and statistical methods to detect the hidden temporal patterns in the sludge volume index (SVI) data and related water-quality parameters that occur before high SVI values (sludge bulking), so that these patterns can be used to forecast high SVI values in the future. Three methods are applied in this research: the improved Time Series Data Mining (TSDM) method, the Hidden Markov Models (HMMs) method, and the combined method of Hidden Markov Models and multinomial logistic regression (MLR). The results and analysis show that the improved TSDM method and the HMMs method are capable of detecting and predicting sludge bulking events. The improved TSDM method achieves a sludge bulking event prediction accuracy between 60% and 100%. The HMMs method could provide warning information to wastewater treatment plant (WWTP) operators, even if it only detects the first state of the pattern leading to sludge bulking. Once the first pattern state was detected, there was a high probability (>80% in all cases, mostly >90%) that sludge bulking would occur. However, both of these methods have limitations because they are new methods applied to the sludge bulking problem. For the combined method, although the results are not useful for the detection of sludge bulking, some wastewater quality parameters are found to have a significant impact on sludge bulking, i.e., sludge retention time (SRT) and effluent pH for all three batteries.

    A New Ensemble Learning Method for Temporal Pattern Identification

    In this paper we present a method for identifying temporal patterns predictive of significant events in a dynamic data system. A new hybrid model using Multivariate Reconstructed Phase Space (MRPS) and Hidden Markov Model (HMM) is applied to identify temporal patterns. This method constructs the phase space embedding from the individual embeddings of each variable sequence. We also apply Hidden Markov Models (HMMs) to the multivariate sequence data to categorize multi-dimensional data into three states, e.g. normal, pattern, and event. A support vector machine optimization method is used to search for an optimal classifier to identify temporal patterns that are predictive of future events. We performed two experimental applications, one using chaotic time series and one using natural gas usage series related to the natural gas usage forecasting problem. Experiments show that the new method significantly outperforms the original RPS framework and the neural network method.
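The per-variable embedding step described in this abstract can be illustrated with a minimal sketch. It assumes a standard time-delay embedding with a shared dimension `dim` and lag `tau` for every variable; the function names and parameters are hypothetical and are not taken from the paper, which may choose embeddings differently.

```python
from typing import List, Sequence


def delay_embed(series: Sequence[float], dim: int, tau: int) -> List[List[float]]:
    """Time-delay embedding of one variable.

    The row for time t is [x[t], x[t - tau], ..., x[t - (dim - 1) * tau]].
    """
    if dim < 1 or tau < 1:
        raise ValueError("dim and tau must be positive integers")
    start = (dim - 1) * tau  # first index with a full history available
    return [
        [series[t - k * tau] for k in range(dim)]
        for t in range(start, len(series))
    ]


def multivariate_embed(
    variables: Sequence[Sequence[float]], dim: int, tau: int
) -> List[List[float]]:
    """Embed each variable separately, then concatenate the time-aligned rows.

    Assumption: all variables share the same length, dim, and tau.
    """
    if len({len(v) for v in variables}) != 1:
        raise ValueError("all variables must have the same length")
    embedded = [delay_embed(v, dim, tau) for v in variables]
    # zip(*embedded) groups the rows that belong to the same time index
    return [
        [value for row in rows for value in row]
        for rows in zip(*embedded)
    ]


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    y = [10.0, 20.0, 30.0, 40.0, 50.0]
    for point in multivariate_embed([x, y], dim=2, tau=1):
        print(point)
```

Each resulting point is one vector in the combined phase space; a downstream model (an HMM or a classifier, as in the abstract) would operate on these vectors.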

    Predictive Pattern Discovery in Dynamic Data Systems

    This dissertation presents novel methods for analyzing nonlinear time series in dynamic systems. The purpose of the newly developed methods is to address the event prediction problem through modeling of predictive patterns. First, a novel categorization mechanism is introduced to characterize different underlying states in the system. A new hybrid method was developed utilizing both generative and discriminative models to address the event prediction problem through optimization in multivariate systems. Second, in addition to modeling temporal dynamics, a Bayesian approach is employed to model the first-order Markov behavior in the multivariate data sequences. Experimental evaluations demonstrated superior performance over conventional methods, especially when the underlying system is chaotic and has heterogeneous patterns during state transitions. Finally, the concept of adaptive parametric phase space is introduced. The equivalence between time-domain phase space and associated parametric space is theoretically analyzed.

    Prediction of Filamentous Sludge Bulking using a State-based Gaussian Processes Regression Model

    The activated sludge process has been widely adopted to remove pollutants in wastewater treatment plants (WWTPs). However, stable operation of the activated sludge process is often compromised by the occurrence of filamentous bulking. The aim of this study is to build a model for timely diagnosis and prediction of filamentous sludge bulking in an activated sludge process. This study developed a state-based Gaussian Process Regression (GPR) model to monitor the parameter related to filamentous sludge bulking, the sludge volume index (SVI), so that the evolution of SVI can be predicted multiple steps ahead. This methodology was validated with SVI data collected from one full-scale WWTP. Online diagnosis and prediction of filamentous sludge bulking with real-time SVI prediction was tested through a simulation study. The results showed that the proposed methodology was capable of predicting future SVIs with good accuracy, thus providing sufficient time for predicting and controlling filamentous sludge bulking.

    Generation of (synthetic) influent data for performing wastewater treatment modelling studies

    The success of many modelling studies strongly depends on the availability of sufficiently long influent time series - the main disturbance of a typical wastewater treatment plant (WWTP) - representing the inherent natural variability at the plant inlet as accurately as possible. This is an important point since most modelling projects suffer from a lack of realistic data representing the influent wastewater dynamics. The objective of this paper is to show the advantages of creating synthetic data when performing modelling studies for WWTPs. This study reviews the different principles that influent generators can be based on in order to create realistic influent time series. In addition, the paper summarizes the variables that those models can describe: influent flow rate, temperature and traditional/emerging pollution compounds, weather conditions (dry/wet) as well as their temporal resolution (from minutes to years). The importance of calibration/validation is addressed and the authors critically analyse the pros and cons of manual versus automatic and frequentist versus Bayesian methods. The presentation will focus on potential engineering applications of influent generators, illustrating the different model concepts with case studies. The authors have significant experience using these types of tools and have worked on interesting case studies that they will share with the audience. Discussion with experts at the WWTmod seminar shall facilitate identifying critical knowledge gaps in current WWTP influent disturbance models. Finally, the outcome of these discussions will be used to define specific tasks that should be tackled in the near future to achieve more general acceptance and use of WWTP influent generators.

    Biological investigation and predictive modelling of foaming in anaerobic digester

    Anaerobic digestion (AD) of waste has been identified as a leading technology for greener renewable energy generation as an alternative to fossil fuel. AD reduces waste through biochemical processes, converting it to biogas, which can be used as a source of renewable energy, while the residual bio-solids can be used to enrich the soil. A problem with AD, though, is foaming and the associated biogas loss. Tackling this problem effectively requires identifying and controlling the factors that trigger and promote foaming. In this research, laboratory experiments were initially carried out to differentiate causal factors of foaming from exacerbating ones. Then the impact of the identified causal factors (organic loading rate, OLR, and volatile fatty acid, VFA) on foaming occurrence was monitored and recorded. Further analysis of foaming and non-foaming sludge samples by metabolomics techniques confirmed that OLR and VFA are the prime causes of foaming occurrence in AD. In addition, the metagenomics analysis showed that the phyla Bacteroidetes and Proteobacteria were predominant, with higher relative abundances of 30% and 29% respectively, while the phylum Actinobacteria, which includes the most prominent filamentous foam-causing bacteria such as Nocardia amarae and Microthrix parvicella, had a very low and consistent relative abundance of 0.9%, indicating that the foaming occurrence in the AD studied was not triggered by the presence of filamentous bacteria. Consequently, data-driven models to predict foam formation were developed based on experimental data with inputs (OLR and VFA in the feed) and output (foaming occurrence). The models were extensively validated and assessed based on the mean squared error (MSE), root mean squared error (RMSE), R² and mean absolute error (MAE). The Levenberg-Marquardt neural network model proved to be the best model for foaming prediction in AD, with RMSE = 5.49, MSE = 30.19 and R² = 0.9435. The significance of this study is the development of a parsimonious and effective modelling tool that enables AD operators to proactively avert foaming occurrence, as the two model input variables (OLR and VFA) can be easily adjusted through a simple programmable logic controller.
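The four assessment metrics named in this abstract can be computed as in the sketch below, which uses their standard textbook definitions; the helper name `regression_metrics` and the sample values are hypothetical and do not come from the study. Since RMSE is the square root of MSE, the reported pair is internally consistent: the square root of 30.19 is approximately 5.49.

```python
import math
from typing import Dict, Sequence


def regression_metrics(
    y_true: Sequence[float], y_pred: Sequence[float]
) -> Dict[str, float]:
    """Return MSE, RMSE, MAE and R^2 for paired observations and predictions."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)          # residual sum of squares
    mse = ss_res / n
    mae = sum(abs(r) for r in residuals) / n

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    # R^2 is undefined when the observations are constant
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae, "r2": r2}


if __name__ == "__main__":
    # Illustrative values only, not data from the study
    observed = [1.0, 2.0, 3.0, 4.0]
    predicted = [1.0, 2.0, 3.0, 6.0]
    print(regression_metrics(observed, predicted))
```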

    Applying an Improved MRPS-GMM Method to Detect Temporal Patterns in Dynamic Data System

    The purpose of this thesis is to introduce an improved approach for temporal pattern detection, based on the Multivariate Reconstructed Phase Space (MRPS) and the Gaussian Mixture Model (GMM), that overcomes the disadvantage caused by the diversity of shapes among different temporal patterns in multiple nonlinear time series. Moreover, this thesis presents a software program developed in MATLAB that lets users apply this approach. A major goal in studying dynamic data systems is to understand the correspondence between events of interest and predictive temporal patterns in the output observations, which can be used to develop a mechanism to predict the occurrence of events. The approach introduced in this thesis employs the Expectation-Maximization (EM) algorithm to fit a more precise distribution for the data points embedded in the MRPS. Furthermore, it proposes an improved algorithm for the pattern classification process, which reduces the computational complexity. A recently developed software program, MATPAD, is presented as a deliverable application of this approach. The GUI of this program contains specific functionalities so that users can directly carry out the MRPS embedding procedure and fit the data distribution with a GMM. Moreover, it allows users to customize the related parameters for specific problems so that they can test their own data.