14 research outputs found

    Multivariate Real Time Series Data Using Six Unsupervised Machine Learning Algorithms

    The development of artificial intelligence (AI) algorithms for classifying undesirable events has gained prominence in the industrial world. However, training such algorithms requires labeled data that identify the normal and anomalous operating conditions of the system, and labeled data are scarce or nonexistent because labeling demands a herculean effort from specialists. This chapter therefore compares the performance of six unsupervised machine learning (ML) algorithms for pattern recognition in multivariate time series data. The algorithms can identify patterns that assist, in a semiautomatic way, the data annotation process and subsequently support the training of supervised AI models. To verify how well the unsupervised ML algorithms detect patterns of interest or anomalies in real time series data, the six algorithms were applied to two cases: (i) meteorological data from a hurricane season and (ii) monitoring data from dynamic machinery for predictive maintenance purposes. Performance was evaluated with seven indicators: accuracy, precision, recall, specificity, F1-Score, AUC-ROC and AUC-PRC. The results suggest that algorithms with a multivariate approach can be successfully applied to the detection of anomalies in multivariate time series data.
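
    As a rough illustration of the indicators named in this abstract, the sketch below computes accuracy, precision, recall, specificity and F1-Score from binary anomaly labels. The function name `detection_metrics` and the toy labels are assumptions made for this example and do not come from the chapter; AUC-ROC and AUC-PRC are left out because they need continuous anomaly scores rather than hard labels.

```python
def detection_metrics(y_true, y_pred):
    """Confusion-matrix metrics for binary labels (1 = anomaly, 0 = normal)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def safe_div(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "accuracy": safe_div(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "specificity": safe_div(tn, tn + fp),
        "f1": safe_div(2 * precision * recall, precision + recall),
    }


# Hypothetical labels: expert annotation vs. output of an unsupervised detector.
y_true = [0, 0, 0, 0, 1, 1, 0, 0, 1, 0]
y_pred = [0, 0, 1, 0, 1, 1, 0, 0, 0, 0]
metrics = detection_metrics(y_true, y_pred)
```

    With rare anomalies, accuracy alone can look high even for a weak detector, which is one reason to read it alongside recall and specificity.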

    Study of the Wind Speed Forecasting Applying Computational Intelligence

    Conventional sources of energy such as oil, natural gas, coal, and nuclear are finite and generate environmental pollution. Renewable sources such as wind, by contrast, are clean and abundantly available in nature. Wind power is an emission-free generation technology with great potential to become a major source of renewable energy. Wind energy has been growing very rapidly in Brazil and Uruguay, making it a promising industry in both countries. This expansion can bring several regional benefits and contribute to sustainable development, especially in places with low economic development. The scope of this chapter is therefore short-term wind speed forecasting with computational intelligence, specifically recurrent neural networks (RNN), using anemometer data collected by anemometric towers at a height of 100.0 m in Brazil (tropical region) and 101.8 m in Uruguay (subtropical region), both Latin American countries. The results of this study are compared with wind speed prediction results from the literature. In one of the cases investigated, the proposed model proved more appropriate according to the evaluation metrics (error and regression) of its predictions.

    Exposure and dose assessment of school children to air pollutants in a tropical coastal-urban area

    This study estimates the exposure to, and inhaled dose of, air pollutants for children residing in a tropical coastal-urban area in Southeast Brazil. Twenty-one children filled in time-activity diaries and wore passive samplers to monitor NO2. Personal exposure was also estimated using data from a combination of the WRF-Urban/GEOS-Chem/CMAQ models and from the nearby monitoring station. Indoor/outdoor ratios were used to account for the time children spent indoors at home and at school. The model's performance was assessed by comparing the modelled data with concentrations measured by urban monitoring stations. A sensitivity analysis was also performed to evaluate the impact of the model's height on the air pollutant concentrations. The results showed that the mean personal exposure of children to NO2 predicted by the model (22.3 μg/m³) was nearly twice that measured by the passive samplers (12.3 μg/m³). In contrast, the nearest urban monitoring station did not represent the personal exposure to NO2 (9.3 μg/m³), suggesting a bias in the quantification of previous epidemiological studies. The building effect parameterisation (BEP), together with the lowering of the model height, increased the air pollutant concentrations and the exposure of children to air pollutants. Exposure to O3, PM10, PM2.5, and PM1 was also estimated with the CMAQ model, giving daily personal exposures of 13.4, 38.9, 32.9, and 9.6 μg/m³, respectively. Meanwhile, the potential daily inhaled dose was 570-667 μg for PM2.5, 684-789 μg for PM10, and 163-194 μg for PM1, levels that may cause adverse health effects. The exposure of children to air pollutants estimated by the numerical model in this work was comparable to that reported in other studies in the literature. This illustrates one advantage of the modelling approach, since some air pollutants are poorly represented spatially and/or are not routinely monitored by environmental agencies in many regions.
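
    The abstract does not give its formulas, so the following is only a minimal sketch of one common way to combine time-activity data, indoor/outdoor ratios and an inhalation rate. It assumes a time-weighted mean exposure and a dose equal to concentration times inhalation rate times duration. The function names, the activity segments, the indoor/outdoor ratios and the inhalation rate are all hypothetical values chosen for illustration, not data from the study.

```python
def time_weighted_exposure(segments):
    """Mean exposure in ug/m3 over a day.

    Each segment is (hours, outdoor_concentration_ug_m3, indoor_outdoor_ratio);
    a ratio of 1.0 means the child is outdoors.
    """
    total_hours = sum(hours for hours, _, _ in segments)
    weighted = sum(hours * conc * ratio for hours, conc, ratio in segments)
    return weighted / total_hours


def inhaled_dose(exposure_ug_m3, inhalation_rate_m3_per_h, hours):
    """Potential inhaled dose in ug, assuming a constant inhalation rate."""
    return exposure_ug_m3 * inhalation_rate_m3_per_h * hours


# Hypothetical day: 14 h at home, 5 h at school, 5 h outdoors,
# with an assumed outdoor concentration of 20 ug/m3 throughout.
day = [(14, 20.0, 0.8), (5, 20.0, 0.9), (5, 20.0, 1.0)]
exposure = time_weighted_exposure(day)      # ug/m3
dose = inhaled_dose(exposure, 0.5, 24)      # ug, assuming 0.5 m3/h
```

    In practice the inhalation rate varies with age and activity level, so a fuller estimate would apply a different rate to each segment.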

    RL-SSI Model: Adapting a Supervised Learning Approach to a Semi-Supervised Approach for Human Action Recognition

    Generally, the action recognition task requires a vast amount of labeled data, which represents a time-consuming human annotation effort. To mitigate the dependency on labeled data, this study proposes Semi-Supervised and Iterative Reinforcement Learning (RL-SSI), which adapts a supervised approach that uses 100% labeled data into a semi-supervised, iterative approach using reinforcement learning for human action recognition in videos. The JIGSAWS and Breakfast datasets were used to evaluate the RL-SSI model because they are commonly used in the action segmentation task. The same applies to the performance metrics used in this work, F-Score (F1) and Edit Score, which are commonly applied to such tasks. In the JIGSAWS tests, RL-SSI outperformed previously developed state-of-the-art techniques on all quantitative measures while using only 65% of the labeled data. In the Breakfast tests, we compared the effectiveness of RL-SSI with that of the self-supervised technique SSTDA. RL-SSI outperformed SSTDA in accuracy (66.44% versus 65.8%) but was surpassed on the F1@10 segmentation measure (67.33% versus 69.3% for SSTDA). However, our experiment used only 55.8% of the labeled data, while SSTDA used 65%. We conclude that our approach outperformed equivalent supervised learning methods and is comparable to SSTDA when evaluated on multiple human action recognition datasets. It is therefore a promising method for reducing the amount of fully labeled data required, easing the work of the human specialists who label videos and their respective frames, and thus reducing the resources needed for human action recognition.

    Using discrete wavelet transform for optimizing COVID-19 new cases and deaths prediction worldwide with deep neural networks.

    This work compares deep learning models designed to predict the daily number of cases and deaths caused by COVID-19 in 183 countries, using daily time series together with a feature augmentation strategy based on the Discrete Wavelet Transform (DWT). Two deep learning architectures were compared, each with and without DWT features: (1) a homogeneous architecture containing multiple LSTM (Long Short-Term Memory) layers and (2) a hybrid architecture combining multiple CNN (Convolutional Neural Network) layers and multiple LSTM layers. Four deep learning models were therefore evaluated: (1) LSTM, (2) CNN + LSTM, (3) DWT + LSTM and (4) DWT + CNN + LSTM. Their performance was quantitatively assessed using Mean Absolute Error (MAE), Normalized Mean Squared Error (NMSE), Pearson R, and Factor of 2. The models were designed to predict the daily evolution of the two main epidemic variables up to 30 days ahead. After hyperparameter fine-tuning of each model, the results show a statistically significant difference between the models' performances for the prediction of both deaths and confirmed cases (p-value<0.001). Based on NMSE values, significant differences were observed between LSTM and CNN+LSTM, indicating that adding convolutional layers to LSTM networks made the model more accurate. Using wavelet coefficients as additional features (DWT+CNN+LSTM) achieved results equivalent to those of the CNN+LSTM model, which demonstrates the potential of wavelets for optimizing models, since this allows training with shorter time series.
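
    The four metrics named in this abstract can be sketched as follows. The abstract does not state its formulas, so two definitions here are assumptions: NMSE is normalized by the product of the observed and predicted means, and Factor of 2 is the fraction of predictions within half to twice the observed value. The function names and sample values are hypothetical.

```python
import math


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def nmse(obs, pred):
    """Mean squared error normalized by mean(obs) * mean(pred) (assumed form)."""
    mse = sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs)
    mean_obs = sum(obs) / len(obs)
    mean_pred = sum(pred) / len(pred)
    return mse / (mean_obs * mean_pred)


def pearson_r(obs, pred):
    """Pearson correlation coefficient."""
    mean_obs = sum(obs) / len(obs)
    mean_pred = sum(pred) / len(pred)
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(obs, pred))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_pred = sum((p - mean_pred) ** 2 for p in pred)
    return cov / math.sqrt(var_obs * var_pred)


def factor_of_2(obs, pred):
    """Fraction of pairs with 0.5 <= pred/obs <= 2; zero observations are skipped."""
    ratios = [p / o for o, p in zip(obs, pred) if o != 0]
    if not ratios:
        return 0.0
    return sum(1 for r in ratios if 0.5 <= r <= 2.0) / len(ratios)


# Hypothetical daily counts: observed vs. predicted.
observed = [10, 20, 30, 40]
predicted = [12, 18, 33, 90]
```

    For example, `factor_of_2(observed, predicted)` counts three of the four predictions as acceptable, because only the last one is more than twice its observed value.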

    Predicting the number of days in court cases using artificial intelligence.

    The Brazilian legal system prescribes means of ensuring the prompt processing of court cases, such as the principle of reasonable process duration, the principle of celerity, procedural economy, and due legal process, with a view to optimizing procedural progress. In this context, one of the great challenges of the Brazilian judiciary is to predict the duration of legal cases based on information such as the judge, lawyers, parties involved, subject, monetary values of the case, starting date of the case, etc. Recently, there has been great interest in estimating the duration of various types of events using artificial intelligence algorithms that predict future behavior from time series. This study presents a proof of concept of a mechanism for predicting how long a magistrate takes to issue a ruling after the case is argued in court (the time when a case is made available for the magistrate to make the decision). Cases from a Regional Labor Court were used as the database, with the data prepared in two ways (original and discretized), to test seven machine learning techniques and determine which gives the best results: (i) Multilayer Perceptron (MLP); (ii) Gradient Boosting; (iii) AdaBoost; (iv) Regressive Stacking; (v) Stacking Regressor with MLP; (vi) Regressive Stacking with Gradient Boosting; and (vii) Support Vector Regression (SVR). AdaBoost performed best among the tested techniques at estimating the time to issue a ruling. The study thus shows that machine learning techniques can be used for this type of prediction, reaching an R² of 0.819 on the test data set and, when the output is transformed into levels, an accuracy of 84%.
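
    The step of transforming a predicted duration into levels and scoring it by accuracy can be sketched as below. The abstract does not say how the levels were defined, so the bin edges in `LEVEL_EDGES`, the function names and the sample durations are all hypothetical.

```python
from bisect import bisect_right

# Hypothetical upper bounds, in days, separating four duration levels.
LEVEL_EDGES = [30, 90, 180]


def to_level(days):
    """Map a duration in days to a level index (0 = shortest, 3 = longest)."""
    return bisect_right(LEVEL_EDGES, days)


def level_accuracy(true_days, predicted_days):
    """Share of cases whose predicted level matches the true level."""
    hits = sum(
        1 for t, p in zip(true_days, predicted_days) if to_level(t) == to_level(p)
    )
    return hits / len(true_days)


# Hypothetical durations until a ruling is issued, in days.
true_days = [10, 45, 100, 200, 85]
predicted_days = [25, 60, 80, 210, 95]
accuracy = level_accuracy(true_days, predicted_days)
```

    Level accuracy is sensitive to where the edges fall: a prediction only a few days off still counts as a miss when it crosses a boundary, so it complements rather than replaces a regression score such as R².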

    Low-Cost Air Quality Sensing towards Smart Homes

    The evolution of low-cost sensors (LCSs) has made real-time spatio-temporal mapping of indoor air quality (IAQ) possible, but the availability of a diverse set of LCSs makes their selection challenging. Converting individual sensors into a sensing network requires knowledge from diverse research disciplines, which we aim to bring together by making IAQ an advanced feature of smart homes. The aim of this review is to discuss advanced home automation technologies for monitoring and controlling IAQ through networked air pollution LCSs. The key steps in transforming conventional homes into smart homes are sensor selection, deployment strategies, data processing, and development of predictive models. A detailed synthesis of air pollution LCSs allowed us to summarise their advantages and drawbacks for spatio-temporal mapping of IAQ. We concluded that evaluating LCS performance under controlled laboratory conditions prior to deployment is recommended for quality assurance/control (QA/QC). A network of sensors, however, also requires routine calibration or statistical correction techniques during operation, especially during long-term monitoring. The deployment height of sensors could be varied purposefully according to the location and the exposure height of the occupants inside home environments for spatio-temporal mapping. Appropriate data processing tools are needed to handle large amounts of multivariate data and to automate pre-/post-processing tasks, leading to more scalable, reliable and adaptable solutions. The review also showed the potential of machine learning techniques for predicting spatio-temporal IAQ in networked LCS systems.