322 research outputs found

    Air Quality Prediction in Smart Cities Using Machine Learning Technologies Based on Sensor Data: A Review

    The influence of machine learning technologies is rapidly increasing and reaching almost every field, and air pollution prediction is no exception. This paper reviews studies on air pollution prediction using machine learning algorithms based on sensor data in the context of smart cities. The most relevant papers were selected by searching the most popular databases and filtering the results. After thoroughly reviewing those papers, the main features were extracted, which served as a basis for linking and comparing them. As a result, we can conclude that: (1) instead of using simple machine learning techniques, authors currently apply advanced and sophisticated techniques, (2) China was the leading country in terms of case studies, (3) particulate matter with a diameter equal to 2.5 micrometers was the main prediction target, (4) in 41% of the publications the authors carried out the prediction for the next day, (5) 66% of the studies used data with an hourly rate, (6) 49% of the papers used open data, a share that has tended to increase since 2016, and (7) for efficient air quality prediction it is important to consider external factors such as weather conditions, spatial characteristics, and temporal features.

    Spatiotemporal and temporal forecasting of ambient air pollution levels through data-intensive hybrid artificial neural network models

    Outdoor air pollution (AP) is a serious public threat which has been linked to severe respiratory and cardiovascular illnesses, and premature deaths, especially among those residing in highly urbanised cities. As such, there is a need to develop early-warning and risk management tools to alleviate its effects. The main objective of this research is to develop AP forecasting models based on Artificial Neural Networks (ANNs) according to a model-building protocol identified from existing related works. Plain, hybrid and ensemble ANN model architectures were developed to estimate the temporal and spatiotemporal variability of hourly NO2 levels in several locations in the Greater London area. Wavelet decomposition was integrated with Multilayer Perceptron (MLP) and Long Short-Term Memory (LSTM) models to address the issue of high variability of AP data and improve the estimation of peak AP levels. Block-splitting and cross-validation procedures have been adapted to validate the models based on Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and Willmott’s index of agreement (IA). The results of the proposed models present better performance than those from the benchmark models. For instance, the proposed wavelet-based hybrid approach provided 39.15% and 28.58% reductions in RMSE and MAE, respectively, relative to the benchmark MLP model for the temporal forecasting of NO2 levels. The same approach reduced the RMSE and MAE of the benchmark LSTM model by 12.45% and 20.08%, respectively, for the spatiotemporal estimation of NO2 levels at one site in Central London. The proposed hybrid deep learning approach offers great potential to be operational in providing air pollution forecasts in areas without a reliable database.
    The model-building protocol adapted in this thesis can also be applied to studies using measurements from other sites.
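
The three validation scores named in this abstract (RMSE, MAE, and Willmott's index of agreement) and the "percentage reduction" figures can be sketched as follows. This is a minimal illustration using the standard textbook definitions; the function names and the sample NO2 values are hypothetical and are not taken from the thesis.

```python
import math


def rmse(observed, predicted):
    """Root Mean Squared Error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean Absolute Error between two equal-length sequences."""
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / len(observed)


def index_of_agreement(observed, predicted):
    """Willmott's index of agreement: 1 means perfect agreement."""
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    potential_error = sum(
        (abs(p - mean_obs) + abs(o - mean_obs)) ** 2
        for o, p in zip(observed, predicted)
    )
    return 1.0 - squared_error / potential_error


def percent_reduction(benchmark_error, model_error):
    """Relative improvement of a model over a benchmark, in percent."""
    return 100.0 * (benchmark_error - model_error) / benchmark_error


# Hypothetical hourly NO2 values, for illustration only.
observed = [40.0, 55.0, 70.0, 65.0, 50.0]
benchmark = [48.0, 47.0, 60.0, 73.0, 58.0]
proposed = [42.0, 52.0, 66.0, 67.0, 53.0]

for name, forecast in (("benchmark", benchmark), ("proposed", proposed)):
    print(name, rmse(observed, forecast), mae(observed, forecast),
          index_of_agreement(observed, forecast))
print("MAE reduction (%):",
      percent_reduction(mae(observed, benchmark), mae(observed, proposed)))
```

A reduction figure such as those quoted above is the benchmark's error minus the proposed model's error, divided by the benchmark's error.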

    Air pollution forecasts: An overview

    © 2018 by the authors. Licensee MDPI, Basel, Switzerland. Air pollution is defined as a phenomenon harmful to the ecological system and the normal conditions of human existence and development when some substances in the atmosphere exceed a certain concentration. In the face of increasingly serious environmental pollution problems, scholars have conducted a significant quantity of related research, and in those studies, the forecasting of air pollution has been of paramount importance. As a precaution, the air pollution forecast is the basis for taking effective pollution control measures, and accurate forecasting of air pollution has become an important task. Extensive research indicates that the methods of air pollution forecasting can be broadly divided into three classical categories: statistical forecasting methods, artificial intelligence methods, and numerical forecasting methods. More recently, some hybrid models have been proposed, which can improve the forecast accuracy. To provide a clear perspective on air pollution forecasting, this study reviews the theory and application of those forecasting models. In addition, based on a comparison of different forecasting methods, the advantages and disadvantages of some methods of forecasting are also provided. This study aims to provide an overview of air pollution forecasting methods for easy access and reference by researchers, which will be helpful in further studies

    Using ensembles of artificial neural networks to improve PM10 forecasts

    High concentrations of atmospheric pollutants provoke negative effects that range from respiratory problems in humans to altered growth in crops due to the reduction of solar radiation. In this context, the study of suspended particulate matter (PM) in the atmosphere is especially relevant. Several works in the literature are dedicated to evaluating PM impacts and developing models to forecast PM concentrations. Among these models, artificial neural networks (ANNs) are often employed, mainly because they are capable of learning from a set of training samples and are known to be universal function approximators. However, most ANN training algorithms are sensitive to initial conditions, so the models resulting from distinct training runs may present different accuracies for the same problem. It is known from the machine learning literature that the ensemble approach, which basically combines a set of slightly different high-accuracy predictors, tends to lead to more accurate forecasts. Therefore, in this paper an ensemble of ANNs is proposed to forecast the daily concentrations of PM10 (φ ≤ 10 μm) in the city of Piracicaba, Brazil. The ensemble was trained with daily samples collected from July 2009 to June 2013 and evaluated with one-day-ahead forecasts from July 2013 to June 2014. Experiments with distinct ANN configurations were made and an average reduction of 8.85% was obtained in the Mean Squared Error. The ensembles were compared to the individual ANNs that led to the best accuracy on the training dataset. It was also verified that, when compared to distinct single ANNs, the ensemble-based approach facilitated the generation of high-accuracy models, as it increased the robustness of the development process.
    It is important to highlight that the proposed approach can be directly applied to other scenarios related to the prediction of PM concentrations, such as different atmospheric pollutants and meteorological data.
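
The ensemble idea described in this abstract can be illustrated with a simple averaging scheme. The abstract does not say how the member networks are combined, so plain averaging is an assumption here, and the forecasts below are hypothetical numbers standing in for the outputs of separately trained ANNs. Because squared error is convex, the averaged forecast can never have a higher MSE than the average MSE of its members, although it is not guaranteed to beat the best single member.

```python
def mse(observed, predicted):
    """Mean Squared Error between two equal-length sequences."""
    return sum((p - o) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def ensemble_mean(member_predictions):
    """Average the members' forecasts, time step by time step."""
    n_members = len(member_predictions)
    return [sum(step) / n_members for step in zip(*member_predictions)]


# Hypothetical daily PM10 observations and three member forecasts
# (stand-ins for ANNs trained from different initial weights).
observed = [30.0, 42.0, 55.0, 38.0]
members = [
    [33.0, 40.0, 50.0, 41.0],
    [28.0, 45.0, 58.0, 35.0],
    [31.0, 44.0, 53.0, 40.0],
]

ensemble = ensemble_mean(members)
member_mses = [mse(observed, m) for m in members]
mean_member_mse = sum(member_mses) / len(member_mses)
ensemble_mse = mse(observed, ensemble)

print("member MSEs:", member_mses)
print("mean member MSE:", mean_member_mse)
print("ensemble MSE:", ensemble_mse)
```

In this toy case the members err in different directions, so their errors partly cancel when averaged; members that all make the same mistake would gain little.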

    Modelling and Forecasting Temporal PM2.5 Concentration Using Ensemble Machine Learning Methods

    Exposure of humans to high concentrations of PM2.5 has adverse effects on their health. Researchers estimate that exposure to particulate matter from fossil fuel emissions accounted for 18% of deaths in 2018, a challenge policymakers argue is being exacerbated by the increase in the number of extreme weather events and rapid urbanization as they tinker with strategies for reducing air pollutants. Drawing on a number of ensemble machine learning methods that have emerged as a result of advancements in data science, this study examines the effectiveness of using ensemble models for forecasting the concentrations of air pollutants, using PM2.5 as a representative case. A comprehensive evaluation of the ensemble methods was carried out by comparing their predictive performance with that of other standalone algorithms. The findings suggest that hybrid models provide useful tools for PM2.5 concentration forecasting. The developed models show that machine learning models are efficient in predicting air particulate concentrations, and can be used for air pollution forecasting. This study also provides insights into how climatic factors influence the concentrations of pollutants found in the air.

    Development of a regional feature selection-based machine learning system (RFSML v1.0) for air pollution forecasting over China

    With the explosive growth of atmospheric data, machine learning models have achieved great success in air pollution forecasting because of their higher computational efficiency than the traditional chemical transport models. However, in previous studies, new prediction algorithms have only been tested at stations or in a small region; a large-scale air quality forecasting model remains lacking to date. Huge dimensionality also means that redundant input data may lead to increased complexity and therefore the over-fitting of machine learning models. Feature selection is a key topic in machine learning development, but it has not yet been explored in atmosphere-related applications. In this work, a regional feature selection-based machine learning (RFSML) system was developed, which is capable of predicting air quality in the short term with high accuracy at the national scale. Ensemble-Shapley additive global importance analysis is combined with the RFSML system to extract significant regional features and eliminate redundant variables at an affordable computational expense. The significance of the regional features is also explained physically. Compared with a standard machine learning system fed with relative features, the RFSML system driven by the selected key features results in superior interpretability, less training time, and more accurate predictions. This study also provides insights into the difference in interpretability among machine learning models (i.e., random forest, gradient boosting, and multi-layer perceptron models).
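
To make the idea of Shapley-based feature selection concrete, the sketch below computes exact Shapley values for a toy "model skill" function and keeps the features whose importance clears a threshold. It is not the RFSML system or its Ensemble-Shapley analysis: the feature names, the skill function, and the threshold are all hypothetical, and exact enumeration is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value):
    """Exact Shapley value of each feature for a set function value(frozenset)."""
    n = len(features)
    result = {}
    for feature in features:
        others = [f for f in features if f != feature]
        total = 0.0
        for size in range(n):
            # Weight of every coalition of this size that excludes `feature`.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                total += weight * (value(coalition | {feature}) - value(coalition))
        result[feature] = total
    return result


def select_features(importances, threshold):
    """Keep features whose Shapley importance is at least `threshold`."""
    return [name for name, score in importances.items() if score >= threshold]


# Hypothetical skill contributions of each input feature (illustration only).
CONTRIBUTION = {
    "wind_speed": 0.30,
    "temperature": 0.15,
    "humidity": 0.05,
    "redundant_input": 0.0,
}
FEATURES = list(CONTRIBUTION)


def toy_skill(coalition):
    """Toy model skill: additive terms plus one wind/temperature interaction."""
    skill = sum(CONTRIBUTION[name] for name in coalition)
    if "wind_speed" in coalition and "temperature" in coalition:
        skill += 0.10
    return skill


importances = shapley_values(FEATURES, toy_skill)
selected = select_features(importances, threshold=0.04)
print(importances)
print(selected)
```

The interaction bonus is split equally between the two interacting features, and the importances sum to the skill of the full feature set, which is what lets a redundant input be identified by its near-zero share.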

    Comparative Analysis of Machine Learning Techniques for Predicting Air Pollution

    The modern, motorized way of life has fostered air pollution, which has become the biggest threat to healthy living. The situation is becoming more lethal in developing countries, including Pakistan. Hence, this study was carried out to propose an architecture design that could make real-time predictions of air pollution, and also to identify the algorithms most frequently adopted in past investigations. In addition, it was intended to describe the toxic effects of air pollution on human health. The research was carried out on a large dataset from Seoul, as an adequate dataset for Pakistan was not obtainable. The dataset covered three years (2017-2019) and comprised 647,512 instances and 11 attributes. Four algorithms were employed: Random Forest, Linear Regression, Decision Tree and XGBoost (XGB). It was inferred that XGB is the most promising and feasible for predicting the concentration levels of NO2, O3, SO2, PM10, PM2.5 and CO, with the lowest RMSE values of 0.0111, 0.0262, 0.0168, 49.64, 41.68 and 0.1856 and the lowest MAE values of 0.0067, 0.0096, 0.0017, 12.28, 7.63 and 0.0982, respectively. Furthermore, it was found that Random Forest was the algorithm most often preferred in previous studies on air pollution prediction, while many studies supported that air pollution is very detrimental to human health; long-term exposure in particular causes lung cancer and respiratory and cardiovascular diseases.

    Developing an early-warning system for air quality prediction and assessment of cities in China

    © 2017 Elsevier Ltd. Air quality has received continuous attention from both environmental managers and citizens. Accordingly, early-warning systems for air pollution are very useful tools to avoid negative health effects and develop effective prevention programs. However, developing robust early-warning systems is as challenging as it is necessary. This paper develops a reliable and effective early-warning system that consists of air quality prediction and assessment modules. In the prediction module, a hybrid forecasting method is developed for predicting pollutant concentrations that effectively estimates future air quality conditions. In developing this proposed model, we suggest the use of a back propagation neural network algorithm, combined with a probabilistic parameter model and data preprocessing techniques, to address the uncertainties involved in future air quality prediction. Meanwhile, a pre-analysis is implemented, primarily by using optimized distribution functions to examine and analyze statistical characteristics and emission behaviors of air pollutants. The second method, which is developed as part of the second module, is based on fuzzy set theory and the Analytic Hierarchy Process, and it performs air quality assessments to provide a clear and intelligible description of air quality conditions. Using data from the Ministry of Environmental Protection of China and six air quality classification levels (good, moderate, lightly polluted, moderately polluted, heavily polluted and severely polluted), two cities in China, Chengdu and Hangzhou, are used as illustrative examples to verify the effectiveness of the developed early-warning system. The results demonstrate that the proposed methods are effective and reliable for use by environmental supervisors in air pollution monitoring and management.
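
The Analytic Hierarchy Process mentioned in this abstract turns pairwise importance judgements into weights. The abstract does not say how the weights are derived, so the sketch below assumes the row geometric-mean method, a common approximation to the principal-eigenvector approach. The pollutant list and the comparison matrix are hypothetical.

```python
import math


def ahp_weights(pairwise):
    """Priority weights from a reciprocal pairwise comparison matrix.

    Uses the row geometric-mean method: take the geometric mean of each
    row, then normalise so the weights sum to 1.
    """
    n = len(pairwise)
    geometric_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geometric_means)
    return [g / total for g in geometric_means]


# Hypothetical judgements: entry [i][j] says how much more important
# pollutant i is than pollutant j (illustration only).
pollutants = ["PM2.5", "PM10", "NO2"]
comparison = [
    [1.0, 3.0, 5.0],
    [1.0 / 3.0, 1.0, 2.0],
    [1.0 / 5.0, 1.0 / 2.0, 1.0],
]

weights = ahp_weights(comparison)
for name, weight in zip(pollutants, weights):
    print(f"{name}: {weight:.3f}")
```

For a perfectly consistent matrix, where every entry equals the ratio of two underlying weights, this method recovers those weights exactly. Real judgement matrices are usually slightly inconsistent, which is why AHP applications normally also report a consistency check.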