
    Learning With Imbalanced Data in Smart Manufacturing: A Comparative Analysis

    The Internet of Things (IoT) paradigm is revolutionising the world of manufacturing into what is known as Smart Manufacturing or Industry 4.0. A main pillar of smart manufacturing is harnessing IoT data and leveraging machine learning (ML) to automate the prediction of faults, thus cutting maintenance time and cost and improving product quality. However, faults in real industries are overwhelmingly outweighed by instances of good performance (faultless samples); this bias is reflected in the data captured by IoT devices. Imbalanced data limits the success of ML in predicting faults and thus presents a significant hindrance to the progress of smart manufacturing. Although various techniques have been proposed to tackle this challenge in general, this work is the first to present a framework for evaluating the effectiveness of these remedies in the context of manufacturing. We present a comprehensive comparative analysis in which we apply our proposed framework to benchmark the performance of different combinations of algorithm components using a real-world manufacturing dataset. We draw key insights into the effectiveness of each component and the interrelatedness between the dataset, the application context, and the design of the ML algorithm.
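    As a rough illustration of the kind of benchmarking described above (not the authors' framework; the dataset, resamplers and classifiers are stand-ins), the sketch below compares combinations of resampling strategies and classifiers on a heavily imbalanced synthetic fault dataset.

```python
# Minimal sketch: grid over {resampler} x {classifier} on imbalanced data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import balanced_accuracy_score
from imblearn.over_sampling import SMOTE
from imblearn.under_sampling import RandomUnderSampler

# ~1% "faulty" vs ~99% "good" samples, mimicking production sensor data.
X, y = make_classification(n_samples=20_000, n_features=20, weights=[0.99],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

samplers = {"none": None, "smote": SMOTE(random_state=0),
            "undersample": RandomUnderSampler(random_state=0)}
models = {"logreg": LogisticRegression(max_iter=1000),
          "forest": RandomForestClassifier(random_state=0)}

for s_name, sampler in samplers.items():
    X_res, y_res = (X_tr, y_tr) if sampler is None else sampler.fit_resample(X_tr, y_tr)
    for m_name, model in models.items():
        model.fit(X_res, y_res)
        score = balanced_accuracy_score(y_te, model.predict(X_te))
        print(f"{s_name:>11} + {m_name:<6}  balanced accuracy = {score:.3f}")
```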

    An alternative approach to dimension reduction for pareto distributed data: a case study

    Deep learning models are tools for data analysis suited to approximating (non-linear) relationships among variables for the best prediction of an outcome. While these models can be used to answer many important questions, their utility is still harshly criticized because it is extremely challenging to identify which data descriptors are most adequate to represent a given phenomenon of interest. From a recent experience in the development of a deep learning model designed to detect failures in mechanical water meter devices, we have learnt that a noticeable deterioration in prediction accuracy can occur if one tries to train a deep learning model by adding specific device descriptors based on categorical data. This can happen because of an excessive increase in the dimensionality of the data, with a corresponding loss of statistical significance. After several unsuccessful experiments with alternative methodologies that either reduce the dimensionality of the data space or employ more traditional machine learning algorithms, we changed the training strategy, reconsidering the categorical data in the light of a Pareto analysis. In essence, we used the categorical descriptors not as an input on which to train our deep learning model, but as a tool to give a new shape to the dataset, based on the Pareto rule. With this data adjustment, we trained a better-performing deep learning model able to detect defective water meter devices with a prediction accuracy in the range of 87-90%, even in the presence of categorical descriptors.
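    The core idea can be sketched roughly as follows (a hedged reading of the abstract, not the authors' procedure; the dataframe and column names are hypothetical): the categorical descriptor is used to split the data along the Pareto rule rather than being one-hot encoded as a model input.

```python
# Hedged sketch: use a categorical descriptor to reshape the training set via
# the Pareto (80/20) rule instead of feeding it to the network as an input.
import pandas as pd

# Hypothetical records: one row per water meter, with its device model and
# a defective/non-defective label.
df = pd.DataFrame({
    "device_model": ["A", "A", "B", "A", "C", "B", "A", "D", "A", "B"],
    "defective":    [0,   1,   0,   0,   1,   0,   1,   0,   0,   1],
})

# Rank categories by frequency and keep the "vital few" covering ~80% of rows.
share = df["device_model"].value_counts(normalize=True)
vital_few = share[share.cumsum() <= 0.8].index

# Train on the dominant categories; treat the long tail separately.
head = df[df["device_model"].isin(vital_few)].drop(columns="device_model")
tail = df[~df["device_model"].isin(vital_few)].drop(columns="device_model")
print(len(head), "rows in the Pareto head,", len(tail), "rows in the tail")
```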

    Structural Health Monitoring - deep learning approach

    The development of sensors for Structural Health Monitoring (SHM) provides useful data that indicate the condition of structures. A data-driven approach is an option for achieving the goal of SHM: predicting structural conditions and obtaining useful information from numerical data. This thesis introduces deep learning methods to perform supervised learning on damage conditions in SHM. Deep learning methods such as one-dimensional Convolutional Neural Networks (1D-CNN) and Long Short-Term Memory (LSTM) are applied to predicting the crack position, crack width and deflection of concrete beams. A Linear Regression (LR) model is also investigated for comparison with the deep learning models. Given multidimensional time-series strain data simulated with finite element methods and labelled crack positions, 1D-CNN and LSTM models are proposed to handle the binary classification problem. The results show that an LSTM model is more promising than a 1D-CNN model for crack position prediction when handling multidimensional input and output and time-series classification. An LSTM model could be a potential solution for automatic monitoring of structural health using only strain data obtained from DOFS. In predicting crack width and deflection, a predictive model such as LR is also a promising method for solving the regression problem. When exploring different sets of input variables for the LR model, such as strain and geometry variables, training with strain data only results in better prediction performance. 1D-CNN and LSTM models are also implemented and evaluated for comparison with the LR model and achieve good performance.
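    A minimal sketch of the kind of LSTM classifier described above is given below; the layer sizes, number of sensor channels and synthetic data are illustrative assumptions, not the thesis' actual configuration.

```python
# Toy LSTM binary classifier over multidimensional strain time series.
import numpy as np
from tensorflow import keras

n_samples, n_timesteps, n_sensors = 200, 100, 8     # hypothetical strain channels
X = np.random.rand(n_samples, n_timesteps, n_sensors).astype("float32")
y = np.random.randint(0, 2, size=(n_samples,))       # 1 = crack at monitored position

model = keras.Sequential([
    keras.layers.Input(shape=(n_timesteps, n_sensors)),
    keras.layers.LSTM(32),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=3, batch_size=32, verbose=0)
```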

    Anomaly Detection of Smart Meter Data

    Presently, households and buildings account for almost one-third of total energy consumption among all power consumption sources. This trend continues to rise as more and more buildings install smart meter sensors and connect to the Smart Grid. The Smart Grid uses sensors and ICT technologies to achieve an uninterrupted power supply and minimize power wastage. Abnormalities in sensors and faults lead to power wastage. Moreover, studying the consumption pattern of a building can lead to a substantial reduction in power wastage, which can save millions of dollars. According to studies, 20% of the energy consumed by buildings is wasted due to these factors. In this work, we propose an anomaly detection approach for detecting anomalies in the power consumption of smart meter data from an open dataset of 10 houses from Ausgrid Corporation Australia. Since power consumption may be affected by various factors, such as weather conditions during the year, it was necessary to find a way to discover anomalies while considering seasonal periods such as weather seasons, day/night and holidays. Consequently, the first part of this thesis identifies the outliers and obtains data with labels (normal or anomalous). We use the Facebook Prophet algorithm along with power consumption domain knowledge to detect anomalies in two years of half-hourly sampled data. After generating the dataset with anomaly labels, we propose a method to classify future power consumption as anomalous or normal. We use four different machine learning approaches for classifying anomalies and also measure the run-time of the different classification algorithms. We achieve a G-mean score of 97 per cent.
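    The interval-based labelling step can be sketched as follows (a hedged illustration of the general Prophet workflow, not the thesis' full pipeline; the synthetic series stands in for the Ausgrid readings): observations falling outside the model's uncertainty interval are marked anomalous.

```python
# Sketch: label points outside Prophet's uncertainty interval as anomalies.
import numpy as np
import pandas as pd
from prophet import Prophet

# Synthetic half-hourly consumption stand-in (columns follow Prophet's schema).
ds = pd.date_range("2019-01-01", periods=48 * 90, freq="30min")
y = 1.0 + 0.5 * np.sin(2 * np.pi * ds.hour / 24) + np.random.normal(0, 0.05, len(ds))
df = pd.DataFrame({"ds": ds, "y": y})

m = Prophet(interval_width=0.99, daily_seasonality=True)
m.fit(df)

forecast = m.predict(df[["ds"]])
labelled = df.merge(forecast[["ds", "yhat_lower", "yhat_upper"]], on="ds")
labelled["anomaly"] = ((labelled["y"] < labelled["yhat_lower"]) |
                       (labelled["y"] > labelled["yhat_upper"]))
print(f"{labelled['anomaly'].mean():.2%} of points flagged as anomalous")
```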

    DEEP LEARNING-BASED VISUAL CRACK DETECTION USING GOOGLE STREET VIEW IMAGES


    Field-regularised factorization machines for mining the maintenance logs of equipment

    Failure prediction is very important for railway infrastructure. Traditionally, data from various sensors are collected for this task, while the value of maintenance logs is often neglected. Maintenance records of equipment usually indicate equipment status and can therefore be valuable for the prediction of equipment faults. In this paper, we propose Field-regularised Factorization Machines (FrFMs) to predict failures of railway points from maintenance logs. The Factorization Machine (FM) and its variants are state-of-the-art algorithms designed for sparse data and are widely used in click-through rate prediction and recommendation systems. Categorical variables are converted to binary features through one-hot encoding and then fed into these models; however, field information is ignored in this process. We propose Field-regularised Factorization Machines to incorporate such valuable information. Experiments on a dataset of railway maintenance logs and another public dataset show the effectiveness of our method.
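    For context, the score of a standard second-order Factorization Machine can be sketched as below; the field-regularisation term itself is the paper's contribution and is only gestured at in the final comment (an assumption on my part, not the authors' exact formulation).

```python
# Standard second-order FM score on a sparse/one-hot feature vector.
import numpy as np

def fm_score(x, w0, w, V):
    """x: (d,) features, w0: bias, w: (d,) linear weights, V: (d, k) latent factors."""
    linear = w0 + w @ x
    # Pairwise interactions in O(d*k): 0.5 * sum_f [ (V^T x)_f^2 - ((V^2)^T x^2)_f ]
    interactions = 0.5 * np.sum((V.T @ x) ** 2 - (V ** 2).T @ (x ** 2))
    return linear + interactions

d, k = 10, 4
rng = np.random.default_rng(0)
w0, w, V = 0.1, rng.normal(size=d), rng.normal(scale=0.1, size=(d, k))
x = np.zeros(d)
x[[1, 4, 7]] = 1.0        # one-hot encoded categorical maintenance record
print(fm_score(x, w0, w, V))

# A field-regularised variant would presumably add, per field F, a penalty such
# as lambda * sum_{i in F} ||V[i] - mean_F||^2 to the training loss, pulling
# latent vectors of features from the same field together (assumption only).
```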

    A Machine Learning Approach for Detecting Unemployment using the Smart Metering Infrastructure

    Technological advancements in the field of electrical energy distribution and utilization are revolutionizing the way consumers and utility providers interact. In addition to allowing utility companies to monitor the status of their network in an autonomous fashion, data collected by smart meters as part of the wider advanced metering infrastructure can be valuable for third parties, such as government authorities. The availability of the information, the granularity of the data, and the real-time nature of the smart meter mean that predictive analytics can be employed to profile consumers with high accuracy and to approximate, for example, the number of individuals living in a house, the type of appliances being used, or the duration of occupancy, to name but a few applications. This paper presents a machine learning model comparison for unemployment prediction of single household occupants, based on features extracted from smart meter electricity readings. A number of nonlinear classifiers are compared and benchmarked against a generalized linear model, and the results are presented. To ensure the robustness of the classifiers, we use repeated cross-validation. The results reveal that it is possible to predict employment status with an Area Under the Curve (AUC) of 74%, Sensitivity (SE) of 54% and Specificity (SP) of 83%, using a multilayer perceptron neural network with dropout, closely followed by the results produced by a distance-weighted discrimination model with a polynomial kernel. This shows the potential of using the smart metering infrastructure to provide additional autonomous services, such as unemployment detection, for governments using data collected from an advanced and distributed Internet of Things (IoT) sensor network.
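    A hedged sketch of the best-performing model family named above (an MLP with dropout, evaluated by AUC) is shown below; the features, shapes and hyper-parameters are assumptions, not those of the paper.

```python
# Toy MLP-with-dropout classifier for employment status, scored with AUC.
import numpy as np
from tensorflow import keras
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

n_households, n_features = 500, 30        # e.g. summary statistics of meter readings
X = np.random.rand(n_households, n_features).astype("float32")
y = np.random.randint(0, 2, size=(n_households,))   # 1 = unemployed occupant

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

model = keras.Sequential([
    keras.layers.Input(shape=(n_features,)),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(X_tr, y_tr, epochs=10, batch_size=32, verbose=0)
print("AUC:", roc_auc_score(y_te, model.predict(X_te, verbose=0).ravel()))
```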

    Prediction Error-based Classification for Class-Incremental Learning

    Class-incremental learning (CIL) is a particularly challenging variant of continual learning, where the goal is to learn to discriminate between all classes presented in an incremental fashion. Existing approaches often suffer from excessive forgetting and imbalance of the scores assigned to classes that have not been seen together during training. In this study, we introduce a novel approach, Prediction Error-based Classification (PEC), which differs from traditional discriminative and generative classification paradigms. PEC computes a class score by measuring the prediction error of a model trained to replicate the outputs of a frozen random neural network on data from that class. The method can be interpreted as approximating a classification rule based on Gaussian Process posterior variance. PEC offers several practical advantages, including sample efficiency, ease of tuning, and effectiveness even when data are presented one class at a time. Our empirical results show that PEC performs strongly in single-pass-through-data CIL, outperforming other rehearsal-free baselines in all cases and rehearsal-based methods with moderate replay buffer size in most cases across multiple benchmarks.
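    The core PEC mechanism, as I read the abstract (a sketch under that assumption, not the authors' implementation), can be illustrated as follows: one small "student" network per class learns to replicate a frozen random "teacher" network on that class's data, and at test time the class whose student has the smallest replication error wins.

```python
# Hedged sketch of Prediction Error-based Classification (PEC).
import torch
import torch.nn as nn

def make_net(d_in, d_out):
    return nn.Sequential(nn.Linear(d_in, 64), nn.ReLU(), nn.Linear(64, d_out))

d_in, d_out, n_classes = 20, 16, 3
teacher = make_net(d_in, d_out)
for p in teacher.parameters():
    p.requires_grad_(False)                   # frozen random network

students = [make_net(d_in, d_out) for _ in range(n_classes)]

def train_class(student, x_class, epochs=200):
    opt = torch.optim.Adam(student.parameters(), lr=1e-3)
    for _ in range(epochs):
        opt.zero_grad()
        loss = ((student(x_class) - teacher(x_class)) ** 2).mean()
        loss.backward()
        opt.step()

# Classes may arrive one at a time; each student only ever sees its own class.
for c in range(n_classes):
    x_c = torch.randn(256, d_in) + 2.0 * c    # toy class-specific data
    train_class(students[c], x_c)

def predict(x):
    # Class score = (negative) prediction error of that class's student.
    errors = torch.stack([((s(x) - teacher(x)) ** 2).mean(dim=1) for s in students])
    return errors.argmin(dim=0)

print(predict(torch.randn(5, d_in) + 2.0))    # samples near class 1's toy mean
```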