2,917 research outputs found

    Wind turbine condition monitoring strategy through multiway PCA and multivariate inference

    Get PDF
    This article states a condition monitoring strategy for wind turbines using a statistical data-driven modeling approach by means of supervisory control and data acquisition (SCADA) data. Initially, a baseline data-based model is obtained from the healthy wind turbine by means of multiway principal component analysis (MPCA). Then, when the wind turbine is monitorized, new data is acquired and projected into the baseline MPCA model space. The acquired SCADA data are treated as a random process given the random nature of the turbulent wind. The objective is to decide if the multivariate distribution that is obtained from the wind turbine to be analyzed (healthy or not) is related to the baseline one. To achieve this goal, a test for the equality of population means is performed. Finally, the results of the test can determine that the hypothesis is rejected (and the wind turbine is faulty) or that there is no evidence to suggest that the two means are different, so the wind turbine can be considered as healthy. The methodology is evaluated on a wind turbine fault detection benchmark that uses a 5 MW high-fidelity wind turbine model and a set of eight realistic fault scenarios. It is noteworthy that the results, for the presented methodology, show that for a wide range of significance, a in [1%, 13%], the percentage of correct decisions is kept at 100%; thus it is a promising tool for real-time wind turbine condition monitoring.Peer ReviewedPostprint (published version

    Machine Learning in Wireless Sensor Networks: Algorithms, Strategies, and Applications

    Get PDF
    Wireless sensor networks monitor dynamic environments that change rapidly over time. This dynamic behavior is either caused by external factors or initiated by the system designers themselves. To adapt to such conditions, sensor networks often adopt machine learning techniques to eliminate the need for unnecessary redesign. Machine learning also inspires many practical solutions that maximize resource utilization and prolong the lifespan of the network. In this paper, we present an extensive literature review over the period 2002-2013 of machine learning methods that were used to address common issues in wireless sensor networks (WSNs). The advantages and disadvantages of each proposed algorithm are evaluated against the corresponding problem. We also provide a comparative guide to aid WSN designers in developing suitable machine learning solutions for their specific application challenges.Comment: Accepted for publication in IEEE Communications Surveys and Tutorial

    Real-time predictive maintenance for wind turbines using Big Data frameworks

    Full text link
    This work presents the evolution of a solution for predictive maintenance to a Big Data environment. The proposed adaptation aims for predicting failures on wind turbines using a data-driven solution deployed in the cloud and which is composed by three main modules. (i) A predictive model generator which generates predictive models for each monitored wind turbine by means of Random Forest algorithm. (ii) A monitoring agent that makes predictions every 10 minutes about failures in wind turbines during the next hour. Finally, (iii) a dashboard where given predictions can be visualized. To implement the solution Apache Spark, Apache Kafka, Apache Mesos and HDFS have been used. Therefore, we have improved the previous work in terms of data process speed, scalability and automation. In addition, we have provided fault-tolerant functionality with a centralized access point from where the status of all the wind turbines of a company localized all over the world can be monitored, reducing O&M costs

    Data-based fault detection in chemical processes: Managing records with operator intervention and uncertain labels

    Get PDF
    Developing data-driven fault detection systems for chemical plants requires managing uncertain data labels and dynamic attributes due to operator-process interactions. Mislabeled data is a known problem in computer science that has received scarce attention from the process systems community. This work introduces and examines the effects of operator actions in records and labels, and the consequences in the development of detection models. Using a state space model, this work proposes an iterative relabeling scheme for retraining classifiers that continuously refines dynamic attributes and labels. Three case studies are presented: a reactor as a motivating example, flooding in a simulated de-Butanizer column, as a complex case, and foaming in an absorber as an industrial challenge. For the first case, detection accuracy is shown to increase by 14% while operating costs are reduced by 20%. Moreover, regarding the de-Butanizer column, the performance of the proposed strategy is shown to be 10% higher than the filtering strategy. Promising results are finally reported in regard of efficient strategies to deal with the presented problemPeer ReviewedPostprint (author's final draft

    Accelerating Large-Scale Data Analysis by Offloading to High-Performance Computing Libraries using Alchemist

    Full text link
    Apache Spark is a popular system aimed at the analysis of large data sets, but recent studies have shown that certain computations---in particular, many linear algebra computations that are the basis for solving common machine learning problems---are significantly slower in Spark than when done using libraries written in a high-performance computing framework such as the Message-Passing Interface (MPI). To remedy this, we introduce Alchemist, a system designed to call MPI-based libraries from Apache Spark. Using Alchemist with Spark helps accelerate linear algebra, machine learning, and related computations, while still retaining the benefits of working within the Spark environment. We discuss the motivation behind the development of Alchemist, and we provide a brief overview of its design and implementation. We also compare the performances of pure Spark implementations with those of Spark implementations that leverage MPI-based codes via Alchemist. To do so, we use data science case studies: a large-scale application of the conjugate gradient method to solve very large linear systems arising in a speech classification problem, where we see an improvement of an order of magnitude; and the truncated singular value decomposition (SVD) of a 400GB three-dimensional ocean temperature data set, where we see a speedup of up to 7.9x. We also illustrate that the truncated SVD computation is easily scalable to terabyte-sized data by applying it to data sets of sizes up to 17.6TB.Comment: Accepted for publication in Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, London, UK, 201

    Data Mining Applications to Fault Diagnosis in Power Electronic Systems: A Systematic Review

    Get PDF
    • …
    corecore