351 research outputs found

    Spatial-temporal prediction of air quality based on recurrent neural networks

    To predict air quality (PM2.5 concentrations, etc.), many parametric regression models have been developed, while deep learning algorithms are used less often. Few of them take air pollution emissions or spatial information into consideration, or predict at an hourly scale. In this paper, we propose a spatial-temporal GRU-based prediction framework incorporating ground pollution monitoring (GPM), factory emissions (FE), and surface meteorology monitoring (SMM) variables to predict hourly PM2.5 concentrations. The dataset for empirical experiments was built from air quality monitoring in Shenyang, China. Experimental results indicate that our method yields more accurate predictions than all baseline models, and that applying convolutional processing to the GPM and FE variables notably improves prediction accuracy.

    Advanced Air Quality Management with Machine Learning

    Air pollution has been a significant health risk factor at regional and global scales. Although present methods can provide assessment indices such as exposure risks or air pollutant concentrations for air quality management, the model estimates still carry non-negligible bias, which can deviate from reality and limit the effectiveness of emission control strategies intended to reduce air pollution and deliver health benefits. Current development in air quality management is still impeded by two major obstacles: (1) biased concentrations from air quality models and (2) inaccurate exposure risk estimations. Inspired by the growing abundance of available data, machine learning techniques provide promising opportunities to overcome these obstacles and bridge the gap between model results and reality. This dissertation illustrates three machine learning applications to strengthen air quality management: (1) identifying heterogeneous exposure risk to air pollutants among diverse urbanization levels, (2) correcting modeled air pollutant concentrations and quantifying the bias of sources from model inputs, and (3) examining nonlinear air pollutant responses to local emissions. This dissertation uses Taiwan as a case study, due to its well-established hospital data, emission inventory, and air quality monitoring network. In conclusion, although ML models have become common in atmospheric and environmental health science in recent years, the modeling processes and output interpretation should rely on interdisciplinary expertise and judgment. Beyond meeting basic modeling performance, future ML applications in atmospheric and environmental health science should provide interpretability and explainability in terms of human-environment interactions and interpretable physical/chemical mechanisms. Such applications are expected to feed back into traditional methods and deepen our understanding of environmental science.

    Representing chemical history in ozone time-series predictions – a model experiment study building on the MLAir (v1.5) deep learning framework

    Tropospheric ozone is a secondary air pollutant that is harmful to living beings and crops. Predicting ozone concentrations at specific locations is thus important to initiate protection measures, i.e. emission reductions or warnings to the population. Ozone levels at specific locations result from emission and sink processes, mixing and chemical transformation along an air parcel's trajectory. Current ozone forecasting systems generally rely on computationally expensive chemistry transport models (CTMs). However, recently several studies have demonstrated the potential of deep learning for this task. While a few of these studies were trained on gridded model data, most efforts focus on forecasting time series from individual measurement locations. In this study, we present a hybrid approach which is based on time-series forecasting (up to 4 d) but uses spatially aggregated meteorological and chemical data from upstream wind sectors to represent some aspects of the chemical history of air parcels arriving at the measurement location. To demonstrate the value of this additional information, we extracted pseudo-observation data for Germany from a CTM to avoid extra complications with irregularly spaced and missing data. However, our method can be extended so that it can be applied to observational time series. Using one upstream sector alone improves the forecasts by 10 % during all 4 d, while the use of three sectors improves the mean squared error (MSE) skill score by 14 % during the first 2 d of the prediction but depends on the upstream wind direction. Our method shows its best performance in the northern half of Germany for the first 2 prediction days. Based on the data's seasonality and simulation period, we shed some light on our models' open challenges with (i) spatial structures in terms of decreasing skill scores from the northern German plain to the mountainous south and (ii) concept drifts related to an unusually cold winter season. 
Here we expect that the inclusion of explainable artificial intelligence methods could reveal additional insights in future versions of our model.
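The abstract above reports improvements in an MSE skill score without giving its formula. A minimal sketch, assuming the common definition 1 - MSE(forecast) / MSE(reference); the function names and the choice of reference forecast are illustrative assumptions, not taken from the paper:

```python
def mse(obs, pred):
    """Mean squared error between two equal-length sequences."""
    if len(obs) != len(pred) or not obs:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs)


def mse_skill_score(obs, forecast, reference):
    """Skill of `forecast` relative to `reference` (assumed definition).

    1.0 means a perfect forecast, 0.0 means no better than the reference,
    and negative values mean worse than the reference.
    """
    ref_error = mse(obs, reference)
    if ref_error == 0:
        raise ValueError("reference forecast is perfect; skill is undefined")
    return 1.0 - mse(obs, forecast) / ref_error


# Hypothetical values, for illustration only.
obs = [1.0, 2.0, 3.0, 4.0]
reference = [2.0, 3.0, 4.0, 5.0]   # e.g. a baseline without upstream sectors
forecast = [1.5, 2.5, 3.5, 4.5]    # e.g. a model with upstream sectors
print(mse_skill_score(obs, forecast, reference))
```

Under this definition, a positive score means the forecast has a lower MSE than the reference it is compared against.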

    Spatiotemporal and temporal forecasting of ambient air pollution levels through data-intensive hybrid artificial neural network models

    Outdoor air pollution (AP) is a serious public threat which has been linked to severe respiratory and cardiovascular illnesses and premature deaths, especially among those residing in highly urbanised cities. As such, there is a need to develop early-warning and risk management tools to alleviate its effects. The main objective of this research is to develop AP forecasting models based on Artificial Neural Networks (ANNs) according to a model-building protocol identified from existing related works. Plain, hybrid and ensemble ANN model architectures were developed to estimate the temporal and spatiotemporal variability of hourly NO2 levels in several locations in the Greater London area. Wavelet decomposition was integrated with Multilayer Perceptron (MLP) and Long Short-term Memory (LSTM) models to address the high variability of AP data and improve the estimation of peak AP levels. Block-splitting and cross-validation procedures have been adapted to validate the models based on Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and Willmott’s index of agreement (IA). The proposed models perform better than the benchmark models. For instance, the proposed wavelet-based hybrid approach provided 39.15% and 28.58% reductions in RMSE and MAE, respectively, relative to the benchmark MLP model for the temporal forecasting of NO2 levels. The same approach reduced the RMSE and MAE of the benchmark LSTM model by 12.45% and 20.08%, respectively, for the spatiotemporal estimation of NO2 levels at one site in Central London. The proposed hybrid deep learning approach offers great potential to be operational in providing air pollution forecasts in areas without a reliable database. 
The model-building protocol adapted in this thesis can also be applied to studies using measurements from other sites.
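The three validation metrics named in the abstract above can be sketched as follows. The index of agreement follows Willmott's original (1981) form, which is an assumption here since the abstract does not say which variant the thesis uses; the function names are illustrative:

```python
import math


def rmse(obs, pred):
    """Root mean squared error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def index_of_agreement(obs, pred):
    """Willmott's index of agreement, original (1981) form (assumed variant).

    IA = 1 - sum((O - P)^2) / sum((|P - mean(O)| + |O - mean(O)|)^2)
    Values lie in [0, 1]; 1 indicates perfect agreement.
    """
    o_mean = sum(obs) / len(obs)
    num = sum((o - p) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - o_mean) + abs(o - o_mean)) ** 2 for o, p in zip(obs, pred))
    if den == 0:
        raise ValueError("index of agreement is undefined for constant, identical series")
    return 1.0 - num / den


# Hypothetical hourly NO2 values, for illustration only.
observed = [1.0, 2.0, 3.0]
predicted = [2.0, 3.0, 4.0]
print(rmse(observed, predicted), mae(observed, predicted),
      index_of_agreement(observed, predicted))
```

RMSE penalises large errors (such as missed peaks) more heavily than MAE, which is one reason the two are usually reported together.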

    Urban air pollution modelling with machine learning using fixed and mobile sensors

    Detailed air quality (AQ) information is crucial for sustainable urban management, and many regions in the world have built static AQ monitoring networks to provide AQ information. However, these networks can only monitor region-level AQ conditions or provide sparse point-based air pollutant measurements, and cannot capture urban dynamics with high-resolution spatio-temporal variations over the region. Without pollution details, citizens will not be able to make fully informed decisions when choosing their everyday outdoor routes or activities, and policy-makers can only make macroscopic regulatory decisions on controlling pollution triggering factors and emission sources. Increasing research effort has been devoted to mobile and ubiquitous sampling campaigns, as they are deemed the more economically and operationally feasible methods to collect urban AQ data with high spatio-temporal resolution. The current research proposes a Machine Learning based AQ Inference (Deep AQ) framework from a data-driven perspective, consisting of data pre-processing, feature extraction and transformation, and pixelwise (grid-level) AQ inference. The Deep AQ framework is adaptable to integrate AQ measurements from fixed monitoring sites (temporally dense but spatially sparse) and mobile low-cost sensors (temporally sparse but spatially dense). While instantaneous pollutant concentration varies in the micro-environment, this research samples representative values in each grid-cell unit and achieves AQ inference at a 1 km × 1 km pixelwise scale. This research explores the predictive power of the Deep AQ framework based on samples from only 40 fixed monitoring sites in Chengdu, China (4,900 km², 26 April - 12 June 2019) and collaborative sampling from 28 fixed monitoring sites and 15 low-cost sensors mounted on taxis deployed in Beijing, China (3,025 km², 19 June - 16 July 2018). 
The proposed Deep AQ framework is capable of producing high-resolution (1 km × 1 km, hourly) pixelwise AQ inference based on multi-source AQ samples (fixed or mobile) and urban features (land use, population, traffic, meteorological information, etc.). This research has achieved high-resolution (1 km × 1 km, hourly) AQ inference (Chengdu: less than 1% spatio-temporal coverage; Beijing: less than 5% spatio-temporal coverage) with reasonable and satisfactory accuracy in urban cases (Chengdu: SMAPE < 20%; Beijing: SMAPE < 15%). Detailed outcomes and main conclusions are provided in this thesis on the aspects of fixed and mobile sensing, spatio-temporal coverage and density, and the relative importance of urban features. Outcomes from this research help provide a scientific and detailed health impact assessment framework for exposure analysis and inform policy-makers with data-driven evidence for sustainable urban management.
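The accuracy figures above are given as SMAPE (symmetric mean absolute percentage error). Several definitions of SMAPE are in use and the abstract does not say which one the thesis adopts; the sketch below assumes the widely used form with the mean of |actual| and |forecast| in the denominator, and the function name is illustrative:

```python
def smape(actual, forecast):
    """Symmetric mean absolute percentage error, in percent (assumed form).

    SMAPE = 100 / n * sum(|F - A| / ((|A| + |F|) / 2))
    Pairs where both values are zero contribute zero error.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for a, f in zip(actual, forecast):
        denom = (abs(a) + abs(f)) / 2.0
        if denom > 0:
            total += abs(f - a) / denom
    return 100.0 * total / len(actual)


# Hypothetical grid-cell concentrations, for illustration only.
measured = [100.0, 200.0]
inferred = [110.0, 180.0]
print(smape(measured, inferred))
```

With this form the score is bounded between 0% and 200%, so thresholds such as "SMAPE < 20%" should be read against that range.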

    Modelling atmospheric ozone concentration using machine learning algorithms

    Air quality monitoring is one of several important tasks carried out in the area of environmental science and engineering. Accordingly, air quality predictive models can be very useful, as they can provide early warnings of pollution rising to unsatisfactory levels. The literature review conducted within the research context of this thesis revealed that only a limited number of widely used machine learning algorithms have been employed for modelling the concentrations of atmospheric gases such as ozone, nitrogen oxides, etc. Despite this, the field of machine learning has recently advanced significantly with the introduction of ensemble learning techniques, convolutional and deep neural networks, etc. Given these observations, the research presented in this thesis aims to investigate the effective use of ensemble learning algorithms, with optimised algorithmic settings and an appropriate choice of base-layer algorithms, to create effective and efficient models for the prediction and forecasting of, specifically, ground-level ozone (O3). Three main research contributions have been made by this thesis in the application area of modelling O3 concentrations. As the first contribution, the performance of several ensemble learning (homogeneous and heterogeneous) algorithms was investigated and compared with popular and widely used single base learning algorithms. The results have shown an impressive improvement in prediction performance obtainable by using meta-learning (Bagging, Stacking, and Voting) algorithms. The performances of the three investigated meta-learning algorithms were similar, giving an average correlation coefficient of 0.91 in prediction accuracy. 
As a second contribution, feature selection and parameter-based optimisation were carried out in conjunction with Multilayer Perceptron, Support Vector Machines, Random Forest and Bagging based learning techniques, providing significant improvements in prediction accuracy. The third contribution of the research presented in this thesis is the univariate and multivariate forecasting of ozone concentrations based on optimised ensemble learning algorithms. The results reported surpass the accuracy levels reported in forecasting ozone concentration variations based on widely used single base learning algorithms. In summary, the research conducted within this thesis bridges an existing research gap in big data analytics related to environmental pollution modelling, prediction and forecasting, where present research is largely limited to using standard learning algorithms such as Artificial Neural Networks and Support Vector Machines, often available within popular commercial software packages.
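Bagging, one of the meta-learning schemes named in the abstract above, trains several base models on bootstrap resamples and averages their predictions. The sketch below is a toy illustration only: it assumes a one-feature least-squares line as the base learner, which is not one of the algorithms used in the thesis, and all names and data are hypothetical:

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x for a single feature."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        # Degenerate bootstrap sample (all x equal): fall back to a constant.
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def bagging_fit(xs, ys, n_models=25, seed=0):
    """Fit one base model per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def bagging_predict(models, x):
    """Average the predictions of all base models."""
    return mean(a + b * x for a, b in models)


# Hypothetical, noise-free data: y = 2 + 3x.
xs = [float(i) for i in range(10)]
ys = [2.0 + 3.0 * x for x in xs]
models = bagging_fit(xs, ys)
print(bagging_predict(models, 4.0))
```

Voting and stacking differ mainly in how the base predictions are combined: voting averages heterogeneous models directly, while stacking trains a further model on their outputs.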