16 research outputs found

    Autoregressive Asymmetric Linear Gaussian Hidden Markov Models

    In a real-life process evolving over time, the relationship between its relevant variables may change. Therefore, it is advantageous to have different inference models for each state of the process. Asymmetric hidden Markov models fulfil this dynamical requirement and provide a framework where the trend of the process can be expressed as a latent variable. In this paper, we modify these recent asymmetric hidden Markov models to have an asymmetric autoregressive component, allowing the model to choose the order of autoregression that maximizes its penalized likelihood for a given training set. Additionally, we show how inference, hidden-state decoding and parameter learning must be adapted to fit the proposed model. Finally, we run experiments with synthetic and real data to show the capabilities of this new model. Comment: 34 pages, 16 figures, intended to be published in IEEE Transactions on Pattern Analysis and Machine Intelligence.

    Asymmetric HMMs for online ball-bearing health assessments

    The degradation of critical components inside large industrial assets, such as ball-bearings, has a negative impact on production facilities, reducing the availability of assets due to an unexpectedly high failure rate. Machine learning-based monitoring systems can estimate the remaining useful life (RUL) of ball-bearings, reducing downtime through early failure detection. However, traditional approaches for predictive systems require run-to-failure (RTF) data as training data, which in real scenarios can be scarce and expensive to obtain, as the expected useful life may be measured in years. Therefore, to overcome the need for RTF data, we propose a new methodology based on online novelty detection and asymmetric hidden Markov models (As-HMM) to carry out the health assessment. This new methodology does not require previous RTF data and can adapt to the natural degradation of mechanical components over time in data-stream and online environments. As the system is designed to work online within the electrical cabinet of machines, it has to be deployed using embedded electronics. Therefore, a performance analysis of As-HMM is presented to detect the strengths and critical points of the algorithm. To validate our approach, we use real-life ball-bearing datasets, compare our methodology with other methodologies that need no RTF data, and examine the advantages in RUL prediction and health monitoring. As a result, we showcase a complete end-to-end solution, from the sensor to actionable insights regarding RUL estimation, aimed at maintenance applications in real industrial environments.
    This study was supported partially by the Spanish Ministry of Economy and Competitiveness through the PID2019-109247GB-I00 project, by the Spanish Ministry of Science and Innovation through the RTC2019-006871-7 (DSTREAMS project), and by the H2020 IoTwins project (Distributed Digital Twins for industrial SMEs: a big-data platform) funded by the EU under the call ICT-11-2018-2019, Grant Agreement No. 857191. Peer Reviewed. Postprint (author's final draft).

    Asymmetric hidden Markov models and extensions applied to industry

    The Internet of things is a paradigm whose goal is to create customizable goods and services based on user experience. Applied to industrial environments, it has given rise to the industrial Internet of things, in which industrial assets are measured continuously. The collected information is processed to extract insights regarding the assets' status or health. Depending on the asset status, maintenance policies can be planned to prevent failure or degradation. To produce such insights, artificial intelligence models are usually applied to learn and predict industrial data patterns and behaviors. Nonetheless, in many cases computational and reliability restrictions are present, and fast and explainable models are required to satisfy industry needs. Hidden Markov models (HMMs) are statistical models capable of learning data patterns and detecting non-stationary behavior in data. Compared to other models, HMMs are economic, explainable and reliable. They are economic because their learning and inference algorithms can run in a reasonable time without the need for graphics cards or other power-intensive computing devices. They are explainable since all the learned parameters are interpretable from both a probabilistic and a domain-knowledge point of view. They are reliable because, if the model makes a mistake, it is possible to detect and infer its causes from the model's parameters and structure. Accordingly, this thesis aims to extend current HMMs theoretically, to make them more relevant, general and useful for industrial applications. For all the proposed models, the expectation-maximization algorithm was used for the learning phase. The first contribution appears in Chapter 4, where context-specific Bayesian networks were used to model the emission probabilities of continuous variables. 
That model is referred to as AsLG-HMM, since linear Gaussian Bayesian networks were used. The model was compared to a mixture-of-Gaussians HMM, and improvements in log-likelihood by the proposed model were observed in both synthetic data and real data from ball-bearings. The model was further developed in Chapter 5, where autoregressive values of the observable variables were considered in the context-specific Bayesian networks. This model is referred to as AR-AsLG-HMM. In this case, the model was studied with further mathematical rigor. Also, a forward greedy algorithm was proposed to discover the structures of the Bayesian networks for the emission probabilities. The model was tested with synthetic data and with real air quality and ball-bearing data. For this model, several types of HMMs were used for comparison. The learning times were also considered in the evaluation. The proposed model showed improvements in log-likelihood with reasonable learning times, and the learned Bayesian networks provided additional data insights. AR-AsLG-HMM also served as a cornerstone model for other contributions in this thesis. In Chapter 6, AR-AsLG-HMM was endowed with feature saliencies to enable the model to perform an embedded feature selection procedure; that is, the model determined the relevant features during its learning procedure. This model is referred to as FS-AsHMM. In this case, the model was compared to other HMMs with feature saliencies. Synthetic data and real data from ball-bearings and from cameras recording facial expressions were used for validation. The model obtained better results regarding expression recognition and detection of non-relevant features. The previous contributions focused on offline analysis. Nevertheless, this thesis is concerned with industrial environments, where data streams are generated and the models are expected to adapt to changes in the data. 
To address this issue, in Chapter 7 the AR-AsLG-HMM was adapted to be used on data streams and to perform continuous learning. Novel-concept detection techniques were used to determine when new, previously unobserved patterns appeared. Based on the insights obtained by the AR-AsLG-HMM from the data stream, a health index and a regression model were proposed to determine the health status and remaining useful life of ball-bearings. Two datasets were used to validate the proposed methodology: open-access datasets with ball-bearings run to failure, and a ball-bearing testbed from the company sponsoring the thesis, Aingura IIoT. Additionally, in collaboration with the Barcelona Supercomputing Center, the methodology code was optimized to be embedded into edge devices and used in real-life applications. The methodology was compared to others in the state of the art. It obtained better results in terms of health estimation, and fair results regarding remaining useful life prediction. Next, in Chapter 8 a feature saliency model for HMMs was adapted to determine the relevant harmonics of ball-bearing data in online environments. This study was preliminary work for Chapter 9, where local feature saliencies were applied to AR-AsLG-HMMs. This model is referred to as LFS-AsHMM. It was adapted to be used on data streams with novel-concept detection techniques to keep track of the evolution of relevant features, updating the relevant features only when the data required it. Synthetic data and real open-access data from ball-bearings were used for validation. The model was compared to other strategies and methodologies that perform feature selection in data streams; these strategies, however, performed feature selection whenever a new instance arrived rather than when needed. Unfortunately, this model was not implemented on edge devices during the writing of this thesis. 
Finally, the proposed models assume linear Gaussian data, and if this assumption fails, the models are no longer valid. To address this problem, in Chapter 10 the ideas used in AR-AsLG-HMM were applied to HMMs with non-parametric emission probabilities: kernel density estimations were used to approximate the emission probabilities, and the estimations depended on context-specific Bayesian networks. The proposed model is referred to as KDE-AsHMM. It was validated using synthetic non-linear Gaussian data and open-access real data from sound recognition problems and drill milling processes. The model showed improvements in likelihood and sound recognition accuracy when compared to other HMMs. Nonetheless, its learning times and computational resource requirements were highly demanding. At the end of the thesis, in Chapter 11, the corresponding conclusions, final remarks and future research lines are presented.

    Mechanical rotor unbalance monitoring based on system identification and signal processing approaches

    Mechanical unbalance is an important source of vibrations that can cause malfunctions in rotodynamic machinery. In industrial applications, unbalance is a critical issue for mass production machines. Previous studies for detecting and monitoring unbalance are based on balancing machines, trial weights, and intrusive actuators, while other studies rely on signal processing techniques, finite element analysis and physical modeling. These methodologies have some critical drawbacks, especially when non-intrusive monitoring is required, such as having to use trial weights or determine constructive parameters such as mass and stiffness. The proposed approach is based on detecting and monitoring the unbalance condition in rotary machines using data extracted from vibration sensors and a rotation sensor fitted to the system supports. The methodology comprises two main steps: identifying the appropriate speed range for unbalance monitoring and the modal parameters of the rotor, and determining and continuously monitoring the unbalance condition. Signal processing and system identification techniques are used to estimate unbalance in the rotary machine. Experimental results for two rotodynamic systems demonstrate satisfactory performance in identifying and monitoring different unbalance conditions. Peer Reviewed. Postprint (author's final draft).

    Context-specific kernel-based hidden Markov model for time series analysis

    Traditional hidden Markov models have been a useful tool to understand and model stochastic dynamic data; in the case of non-Gaussian data, models such as mixture-of-Gaussians hidden Markov models can be used. However, these suffer from the computation of precision matrices and have many unnecessary parameters. As a consequence, such models often perform better when it is assumed that all variables are independent, a hypothesis that may be unrealistic. Hidden Markov models based on kernel density estimation are also capable of modeling non-Gaussian data, but they assume independence between variables. In this article, we introduce a new hidden Markov model based on kernel density estimation, which is capable of capturing kernel dependencies using context-specific Bayesian networks. The proposed model is described, together with a learning algorithm based on the expectation-maximization algorithm. Additionally, the model is compared to related HMMs on synthetic and real data. From the results, the benefits in likelihood and classification accuracy from the proposed model are quantified and analyzed. Comment: Keywords: Hidden Markov models, Kernel density estimation, Bayesian networks, Adaptive models, Time series
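As a rough illustration of a kernel-density emission for a hidden state, the sketch below evaluates a weighted Gaussian KDE in which each training sample is weighted by how strongly it is attributed to that state. It is a univariate simplification written for this listing, not the paper's model: the context-specific Bayesian network dependencies are omitted, and the names `kde_logpdf`, `resp` and the bandwidth value are hypothetical.

```python
import math

def kde_logpdf(x, samples, weights, bandwidth):
    """Log-density at x of a weighted Gaussian KDE, computed with log-sum-exp."""
    total_w = sum(weights)
    log_norm = -0.5 * math.log(2.0 * math.pi * bandwidth ** 2)
    terms = [
        math.log(w / total_w) + log_norm - 0.5 * ((x - s) / bandwidth) ** 2
        for s, w in zip(samples, weights)
        if w > 0.0  # samples with zero weight contribute nothing
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

# Hypothetical training samples and per-state responsibilities (assumed values).
samples = [-2.0, -1.8, 2.0, 2.2]
resp = {"A": [0.9, 0.8, 0.1, 0.0], "B": [0.1, 0.2, 0.9, 1.0]}

x_new = -1.9
log_a = kde_logpdf(x_new, samples, resp["A"], bandwidth=0.5)
log_b = kde_logpdf(x_new, samples, resp["B"], bandwidth=0.5)
print("state A" if log_a > log_b else "state B")
```

In an EM-style learner the weights would come from the posterior state probabilities of the E-step; here they are fixed by hand to keep the example short.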

    Asymmetric Hidden Markov Models with continuous variables

    Hidden Markov models have been successfully applied to model signals and dynamic data. However, when dealing with many variables, traditional hidden Markov models do not take into account asymmetric dependencies, leading to overfitted models that offer little insight into the problem. To deal with this problem, asymmetric hidden Markov models were recently proposed, whose emission probabilities are modified to follow a state-dependent graphical model. However, only discrete models have been developed. In this paper we introduce asymmetric hidden Markov models with continuous variables using state-dependent linear Gaussian Bayesian networks. We propose a parameter and structure learning algorithm for this new model. We run experiments with real data from bearing vibration. Since vibrational data is continuous, with the proposed model we can avoid any variable discretization step and perform learning and inference in an asymmetric information frame.
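To make the idea of a state-dependent linear Gaussian Bayesian network concrete, the following minimal sketch computes an emission log-density in which each hidden state has its own parent structure. The two states, the variables `x` and `y`, and all coefficients are made-up illustrative values, not taken from the paper.

```python
import math

def gaussian_logpdf(value, mean, var):
    """Log-density of a univariate Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (value - mean) ** 2 / var)

def emission_logpdf(obs, state_model):
    """Sum the linear Gaussian log-density of every variable given its parents.

    state_model maps a variable name to (intercept, {parent: coefficient}, variance).
    """
    total = 0.0
    for name, (intercept, coefs, var) in state_model.items():
        mean = intercept + sum(c * obs[parent] for parent, c in coefs.items())
        total += gaussian_logpdf(obs[name], mean, var)
    return total

# Hypothetical model: "y" depends on "x" only in the "degraded" state,
# which is what makes the emission structure asymmetric across states.
models = {
    "healthy": {"x": (0.0, {}, 1.0), "y": (0.0, {}, 1.0)},
    "degraded": {"x": (0.0, {}, 1.0), "y": (1.0, {"x": 2.0}, 0.5)},
}

obs = {"x": 1.0, "y": 3.0}
scores = {state: emission_logpdf(obs, model) for state, model in models.items()}
best_state = max(scores, key=scores.get)
print(best_state)
```

Because each state stores only the arcs it needs, a state with independent variables carries no regression coefficients at all, which is the parameter saving the abstract alludes to.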

    Autoregressive Asymmetric Linear Gaussian Hidden Markov Models

    In a real-life process evolving over time, the relationship between its relevant variables may change. Therefore, it is advantageous to have different inference models for each state of the process. Asymmetric hidden Markov models fulfil this dynamical requirement and provide a framework where the trend of the process can be expressed as a latent variable. In this paper, we modify these recent asymmetric hidden Markov models to have an asymmetric autoregressive component in the case of continuous variables, allowing the model to choose the order of autoregression that maximizes its penalized likelihood for a given training set. Additionally, we show how inference, hidden-state decoding and parameter learning must be adapted to fit the proposed model. Finally, we run experiments with synthetic and real data to show the capabilities of this new model.
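Choosing an autoregressive order by penalized likelihood can be sketched for a single regime as follows. This is only an illustration under stated assumptions: it uses a BIC-style penalty and ordinary least squares on one series, whereas the paper selects orders inside a hidden Markov model; the helper names `solve`, `fit_ar` and `select_order` and the simulated AR(2) series are hypothetical.

```python
import math
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_ar(series, p, max_p):
    """Least-squares AR(p) fit; targets start at max_p so all orders share one sample."""
    rows = [[1.0] + [series[t - k] for k in range(1, p + 1)]
            for t in range(max_p, len(series))]
    y = series[max_p:]
    d = p + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]
    xty = [sum(r[i] * yt for r, yt in zip(rows, y)) for i in range(d)]
    coef = solve(xtx, xty)
    rss = sum((yt - sum(c * v for c, v in zip(coef, r))) ** 2 for r, yt in zip(rows, y))
    n = len(y)
    loglik = -0.5 * n * (math.log(2.0 * math.pi * rss / n) + 1.0)  # Gaussian, MLE variance
    return coef, loglik, n

def select_order(series, max_p):
    """Return the order (and coefficients) with the highest penalized log-likelihood."""
    best = None
    for p in range(max_p + 1):
        coef, loglik, n = fit_ar(series, p, max_p)
        score = loglik - 0.5 * (p + 2) * math.log(n)  # p lags + intercept + variance
        if best is None or score > best[0]:
            best = (score, p, coef)
    return best[1], best[2]

# Simulated AR(2) series with assumed coefficients 0.6 and -0.3.
random.seed(0)
x = [0.0, 0.0]
for _ in range(2000):
    x.append(0.6 * x[-1] - 0.3 * x[-2] + random.gauss(0.0, 1.0))
order, coef = select_order(x, max_p=5)
print(order, [round(c, 2) for c in coef])
```

The penalty grows with the number of lags, so a higher order is kept only when it improves the fit by more than the cost of its extra parameter.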

    Feature saliencies in asymmetric hidden Markov models

    Many real-life problems are stated as unlabeled high-dimensional data. Current feature selection strategies are mainly focused on labeled data, which reduces the options for selecting relevant features in unsupervised problems, such as clustering. Recently, feature saliency models have been introduced and developed as clustering models to select and detect relevant variables/features as the model is learned. Usually, these models assume that all variables are independent, which narrows their applicability. This article introduces asymmetric hidden Markov models with feature saliencies, i.e., models capable of simultaneously determining, during their learning phase, relevant variables/features and probabilistic relationships between variables. The proposed models are compared with other state-of-the-art approaches using synthetic data and real data related to grammatical face videos and wear in ball bearings. We show that the proposed models fit as well as or better than other state-of-the-art models and provide further data insights.

    An online feature selection methodology for ball-bearing harmonic frequencies based on HMMs

    Much attention has been given to supervised feature subset selection methodologies in data streams. However, less attention has been given to data streams produced by sensors in industrial environments, where labels are difficult to obtain. Feature subset selection is critical in online analysis since it can accelerate and improve the performance of any model inference and reduce data storage issues, especially when no cloud is available. In this work we propose an online feature subset selection methodology based on hidden Markov models (HMMs) for unsupervised data streams of ball-bearings, in order to determine which fundamental and harmonic frequencies are relevant during operation. The proposed methodology is validated with synthetic data and real ball-bearing data in a controlled data stream environment.