677 research outputs found

    Contributions to Collective Dynamical Clustering-Modeling of Discrete Time Series

    The analysis of sequential data is important in business, science, and engineering, for tasks such as signal processing, user behavior mining, and commercial transaction analysis. In this dissertation, we build upon the Collective Dynamical Modeling and Clustering (CDMC) framework for discrete time series modeling by making contributions to clustering initialization, dynamical modeling, and scaling. We first propose a modified Dynamic Time Warping (DTW) approach for clustering initialization within CDMC. The proposed approach provides DTW metrics that penalize deviations of the warping path from the path of constant slope. This reduces over-warping while retaining the efficiency advantages of global constraint approaches, and without relying on domain-dependent constraints. Second, we investigate the use of semi-Markov chains as dynamical models of temporal sequences in which state changes occur infrequently. Semi-Markov chains allow the distribution of state visit durations to be specified explicitly. This makes them superior to traditional Markov chains, which implicitly assume a geometric (in continuous time, exponential) state duration distribution. Third, we consider convergence properties of the CDMC framework. We establish convergence by viewing CDMC from an Expectation Maximization (EM) perspective. We investigate how our efficient DTW-based initialization technique and selected dynamical models affect the time to convergence. We also explore the convergence implications of various stopping criteria. Fourth, we consider scaling up CDMC to process big data, using Storm, an open-source distributed real-time computation system that supports batch and distributed data processing. We performed experimental evaluation on human sleep data and on user web navigation data. Our results demonstrate the superiority of the strategies introduced in this dissertation over state-of-the-art techniques in terms of modeling quality and efficiency.
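
    The idea of penalizing warping paths that stray from the constant-slope path can be illustrated with a minimal sketch. It is not the dissertation's formulation: the additive penalty `lam * |i/(n-1) - j/(m-1)|`, the absolute-difference local cost, and the function name `penalized_dtw` are all assumptions made here for illustration.

```python
def penalized_dtw(x, y, lam=0.0):
    """DTW distance with an assumed penalty on deviation from the
    constant-slope (diagonal) path. lam=0 gives plain DTW."""
    n, m = len(x), len(y)
    inf = float("inf")
    # D[i][j] = best cumulative cost aligning x[:i] with y[:j]
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Normalized positions in [0, 1]; equal on the constant-slope path
            pi = (i - 1) / (n - 1) if n > 1 else 0.0
            pj = (j - 1) / (m - 1) if m > 1 else 0.0
            local = abs(x[i - 1] - y[j - 1])      # pointwise mismatch
            penalty = lam * abs(pi - pj)          # hypothetical slope penalty
            D[i][j] = local + penalty + min(
                D[i - 1][j], D[i][j - 1], D[i - 1][j - 1]
            )
    return D[n][m]


a = [0, 0, 1]
b = [0, 1, 1]
plain = penalized_dtw(a, b, lam=0.0)      # warping absorbs the shift freely
penalized = penalized_dtw(a, b, lam=1.0)  # warping now has a cost
```

    With `lam = 0` the shifted step in `a` and `b` is warped away at no cost, while a positive `lam` makes that warping pay. Unlike a hard band constraint, no cell is forbidden, so the penalty weight is the only tuning parameter in this sketch.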

    Remaining useful life estimation in heterogeneous fleets working under variable operating conditions

    The availability of condition monitoring data for large fleets of similar equipment motivates the development of data-driven prognostic approaches that capitalize on the information contained in such data to estimate equipment Remaining Useful Life (RUL). A main difficulty is that the equipment in a fleet typically experiences different operating conditions, which influence both the condition monitoring data and the degradation processes that physically determine the RUL. We propose an approach for RUL estimation from heterogeneous fleet data based on three phases. First, the degradation levels (states) of a homogeneous discrete-time finite-state semi-Markov model are identified using an unsupervised ensemble clustering approach. Then, the parameters of the discrete Weibull distributions describing the transitions among the states, and their uncertainties, are inferred using Maximum Likelihood Estimation (MLE) and the Fisher Information Matrix (FIM), respectively. Finally, the inferred degradation model is used to estimate the RUL of fleet equipment by direct Monte Carlo (MC) simulation. The proposed approach is applied to two case studies regarding heterogeneous fleets of aluminium electrolytic capacitors and turbofan engines. Results show the effectiveness of the proposed approach in predicting the RUL and its superiority over a fuzzy similarity-based approach from the literature.
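
    The final phase, RUL estimation by direct Monte Carlo simulation of a semi-Markov degradation model, might look roughly like the sketch below. Several simplifications are assumed and are not taken from the paper: degradation states are visited strictly in sequence, the unit has just entered its current state, sojourn times follow the type I discrete Weibull form with survival function P(T > k) = q^(k^beta), and all parameter values and function names are invented for illustration.

```python
import math
import random


def sample_discrete_weibull(q, beta, rng):
    """Draw a sojourn time T in {1, 2, ...} with P(T > k) = q ** (k ** beta)."""
    if not (0.0 < q < 1.0) or beta <= 0.0:
        raise ValueError("need 0 < q < 1 and beta > 0")
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    # Inverse-transform sampling: smallest k with q ** (k ** beta) <= u
    t = math.ceil((math.log(u) / math.log(q)) ** (1.0 / beta))
    return max(1, t)


def simulate_rul(state, params, n_runs=10000, seed=0):
    """Monte Carlo RUL from `state`, assuming states degrade in order.

    params[s] = (q, beta) for each non-failed state s (hypothetical values).
    Returns (mean, 5th percentile, 95th percentile) in time steps.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_runs):
        # RUL = total time spent in the current and all later states
        total = sum(sample_discrete_weibull(q, b, rng) for q, b in params[state:])
        samples.append(total)
    samples.sort()
    mean = sum(samples) / n_runs
    lo = samples[int(0.05 * (n_runs - 1))]
    hi = samples[int(0.95 * (n_runs - 1))]
    return mean, lo, hi


# Two non-failed states with beta = 1, where the sojourn is geometric
mean_rul, rul_lo, rul_hi = simulate_rul(0, [(0.5, 1.0), (0.5, 1.0)])
```

    Setting `beta = 1` recovers geometric sojourn times, which is the memoryless special case; other values of `beta` let the hazard of leaving a state rise or fall with time spent in it.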

    The Dynamic Chain Event Graph

    In this paper we develop a formal dynamic version of Chain Event Graphs (CEGs), a particularly expressive family of discrete graphical models. We demonstrate how this class links to semi-Markov models and provides a convenient generalization of the Dynamic Bayesian Network (DBN). In particular, we develop a repeating time-slice Dynamic CEG, which provides a useful and simpler model in this family. We demonstrate how the Dynamic CEG’s graphical formulation exhibits asymmetric conditional independence statements, and also how each model can be estimated in closed form, enabling fast model search over the class. The expressive power of this model class, together with its estimation, is illustrated throughout by a variety of examples that include the risk of childhood hospitalization and the efficacy of a flu vaccine.

    Multidimensional prognostics for rotating machinery: A review

    Determining prognosis for rotating machinery could potentially reduce maintenance costs and improve safety and availability. Complex rotating machines are usually equipped with multiple sensors, which enable the development of multidimensional prognostic models. By considering the possible synergy among different sensor signals, multivariate models may provide more accurate prognosis than those using single-source information. Consequently, numerous research papers focusing on the theoretical considerations and practical implementations of multivariate prognostic models have been published in the last decade. However, only a limited number of review papers have been written on the subject. This article focuses on multidimensional prognostic models that have been applied to predict the failures of rotating machinery with multiple sensors. The theory and basic functioning of these techniques, their relative merits and drawbacks, and how these models have been used to predict the remnant life of a machine are discussed in detail. Furthermore, this article summarizes the rotating machines to which these models have been applied and discusses future research challenges. The authors also provide seven evaluation criteria that can be used to compare the reviewed techniques. By reviewing the models reported in the literature, this article provides a guide for researchers considering prognosis options for multi-sensor rotating equipment.

    Learning and testing stochastic discrete event

    Master's dissertation in Informatics Engineering. Discrete event systems (DES) are an important subclass of systems (in the sense of systems theory). They have been used, particularly in industry, to analyze and model a wide variety of real systems, such as production systems, computer systems, traffic control systems, and hybrid systems. Our work explores an extension of DES with an emphasis on stochastic processes, commonly called stochastic discrete event systems (SDES). This creates a need to establish a stochastic abstraction for SDES through generalized semi-Markov processes (GSMP). Thus, the aim of our work is to propose a methodology and a set of algorithms for learning GSMP, to use statistical model checking techniques for verification, and to propose two new approaches for testing DES and SDES (non-stochastically and stochastically, respectively). This work also introduces a notion of modeling, analysis, and verification of continuous systems and disturbance models in the context of verification by statistical model checking.

    Evolving Clustering Algorithms And Their Application For Condition Monitoring, Diagnostics, & Prognostics

    Applications of Condition-Based Maintenance (CBM) technology require effective yet generic data-driven methods capable of carrying out diagnostics and prognostics tasks without detailed domain knowledge or human intervention. With the widespread deployment of CBM, improved system availability, operational safety, and enhanced logistics and supply chain performance could be achieved at lower cost. This dissertation focuses on the development of a Mutual Information based Recursive Gustafson-Kessel-Like (MIRGKL) clustering algorithm, which operates recursively to identify underlying model structure and parameters from stream-type data. Inspired by the Evolving Gustafson-Kessel-like Clustering (eGKL) algorithm, we applied the notion of mutual information to the well-known Mahalanobis distance as the governing similarity measure throughout. This is also a special case of the Kullback-Leibler (KL) divergence in which between-cluster shape information (governed by the determinant and trace of the covariance matrix) is omitted, and it is applicable only to normally distributed data. In the cluster assignment and consolidation process, we proposed the use of the Chi-square statistic with the provision of having different probability thresholds. Owing to the symmetry and boundedness properties brought in by the mutual information formulation, we have shown with real-world data that the algorithm’s performance becomes less sensitive to the same range of probability thresholds, which makes system tuning a simpler task in practice. As a result, the improvement demonstrated by the proposed algorithm has implications for generic data-driven methods for diagnostics, prognostics, generic function approximation, and knowledge extraction from stream-type data. The work in this dissertation demonstrates MIRGKL’s effectiveness in clustering and knowledge representation and shows promising results in diagnostics and prognostics applications.
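
    The role of a Chi-square probability threshold in cluster assignment can be shown with a plain Mahalanobis-distance gate. This sketch is not the MIRGKL algorithm and omits its mutual information formulation. It assumes two-dimensional, normally distributed data, for which the Chi-square quantile with 2 degrees of freedom has the closed form -2 ln(1 - p). All function names are hypothetical.

```python
import math


def mahalanobis_sq_2d(x, mean, cov):
    """Squared Mahalanobis distance of a 2-D point from a cluster."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("covariance must be positive definite")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Uses the 2x2 inverse: (1/det) * [[d, -b], [-c, a]]
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det


def chi2_threshold_2dof(p):
    """Chi-square quantile for 2 degrees of freedom (closed form)."""
    if not (0.0 <= p < 1.0):
        raise ValueError("need 0 <= p < 1")
    return -2.0 * math.log(1.0 - p)


def belongs_to_cluster(x, mean, cov, p=0.95):
    """Accept x if it lies inside the cluster's p-probability ellipse."""
    return mahalanobis_sq_2d(x, mean, cov) <= chi2_threshold_2dof(p)


identity = [[1.0, 0.0], [0.0, 1.0]]
near = belongs_to_cluster((1.0, 1.0), (0.0, 0.0), identity)  # inside the gate
far = belongs_to_cluster((3.0, 0.0), (0.0, 0.0), identity)   # outside the gate
```

    Raising `p` widens the acceptance ellipse, so fewer points spawn new clusters. That sensitivity to the threshold is the tuning burden the abstract says the mutual information formulation reduces.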