3,562 research outputs found

    The Temporal and Frequent Pattern Mining Analysis and Machine Learning Forecasting on Mobile Sourced Urban Air Pollutants

    Ground-level ozone and atmospheric fine particles (PM2.5) have been recognized as critical air pollutants that contribute substantially to the toxicity of anthropogenic air pollution in urban areas. To limit their adverse impacts on public health and ecosystems, it is imperative to identify a practical and effective way to predict upcoming pollution concentration levels accurately. To meet this need, various studies have aimed to forecast ground-level ozone and PM2.5, mainly using time-series and neural network analysis. Machine learning has also been adopted for analysis and forecasting in existing research, but it is associated with some limitations that are not easily overcome. (1) The majority of existing forecasting models are highly dependent on time-series inputs and do not consider the factors that influence the air pollutants. While a relatively accurate prediction may be provided, the influencing factors of the air pollution level caused by real-world complexity are neglected. (2) The existing forecasting models are mainly focused on short-term estimation, and some of them need to use the previous prediction as part of the input, which increases system complexity and decreases computational efficiency and accuracy. (3) Accurate hourly forecasting of air pollution levels over a full year is seldom achieved. The objective of this research is to propose a systematic methodology to forecast long-term hourly air pollution concentration levels from historical data while considering the factors that influence concentration. To achieve this research goal, a series of methodologies is introduced to analyze historical air pollution concentrations using temporal characteristics and frequent pattern data mining algorithms. The association rules between air pollution concentration levels and the influencing factors are revealed. 
A systematic air pollution level forecasting approach based on supervised machine learning algorithms, with the ability to predict hourly values over a full year, is proposed and evaluated. To quantify and validate the results, a case study was conducted in the Houston region with the collection and analysis of ten years of historical environmental, meteorological, and transportation-related data. From the results of this research, (1) the complex correlations between the influencing factors and air pollution concentration levels are quantified and presented. (2) The association rules between the dependent and independent parameters are calculated. (3) A pool of supervised machine learning algorithms is created and evaluated. (4) An accurate long-term hourly air pollution level machine learning forecasting procedure is proposed. The methodology of this research compares favorably with existing models in computational complexity and accuracy, and could be readily applied to similar regions for forecasting various types of air pollution concentration levels.
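
The association rules mentioned in the abstract above are typically scored by support, confidence, and lift. The following is a minimal sketch of those three measures, not the authors' procedure: the hourly records, the item labels such as `temp=high`, and the function names are all hypothetical and chosen only for illustration.

```python
# Hypothetical hourly records: each set holds the discretized conditions
# observed in one hour. These values are invented for illustration only.
hours = [
    {"temp=high", "traffic=heavy", "ozone=high"},
    {"temp=high", "traffic=heavy", "ozone=high"},
    {"temp=high", "traffic=light", "ozone=moderate"},
    {"temp=low", "traffic=heavy", "ozone=moderate"},
    {"temp=low", "traffic=light", "ozone=low"},
    {"temp=high", "traffic=heavy", "ozone=high"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def rule_metrics(antecedent, consequent, transactions):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    both = support(set(antecedent) | set(consequent), transactions)
    ante = support(antecedent, transactions)
    cons = support(consequent, transactions)
    confidence = both / ante if ante else 0.0
    lift = confidence / cons if cons else 0.0
    return both, confidence, lift


# Rule: hot hours with heavy traffic -> high ozone.
s, c, l = rule_metrics({"temp=high", "traffic=heavy"}, {"ozone=high"}, hours)
print(f"support={s:.2f} confidence={c:.2f} lift={l:.2f}")
```

A lift above 1 suggests that the antecedent and consequent co-occur more often than they would if independent; a full frequent pattern mining algorithm would additionally enumerate candidate itemsets above a minimum support threshold.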

    Multi-Source-Data-Oriented Ensemble Learning Based PM 2.5 Concentration Prediction in Shenyang

    Shenyang, which is surrounded by smokestack industries and depends on coal heating in winter, is a typical city in northeastern China that has suffered from serious air pollution, especially PM2.5. Existing machine learning research based on historical air-monitoring data and meteorological data neither forecasts accurately nor identifies the key pollutants for PM2.5. This paper presents a multi-source-data-oriented ensemble learning framework for predicting PM2.5 concentration. The proposed framework incorporates not only air quality data and weather data, but also industrial emission data, especially those of winter heating enterprises, in Shenyang and nearby cities; the model also takes into account the location and emission frequency of pollution sources. All these data are entered into an ensemble learning model based on Extreme Gradient Boosting (XGBoost) to predict PM2.5 concentration, which not only improves prediction accuracy effectively, but also provides a contribution analysis of different pollutants. Experimental results show that the top two factors affecting PM2.5 concentration are: (1) air pollutant emission quantities and (2) distance from pollution sources to air-monitoring stations. Based on the importance of these two factors, we refine feature selection and re-train the ensemble learning model, and find that the new model performs better on 72% of evaluation indexes.

    Comparison of algorithms for road surface temperature prediction

    Purpose - The influence of road surface temperature (RST) on vehicles is becoming more and more obvious, so accurate prediction of RST is valuable. At present, however, neither physical methods nor statistical learning methods predict RST with satisfactory accuracy. To find an effective prediction method, this paper selects five representative algorithms to predict the road surface temperature separately. Design/methodology/approach - Multiple linear regression, least absolute shrinkage and selection operator, random forest, gradient boosting regression tree (GBRT) and neural network are chosen as representative predictors. Findings - The experimental results show that, for the temperature data set of this experiment, GBRT, an ensemble algorithm, gives the best prediction of the five algorithms. Originality/value - This paper compares different kinds of machine learning algorithms, observes the road surface temperature data from different angles, and finds the most suitable prediction method.

    Geo-Information Technology and Its Applications

    Geo-information technology has been playing an ever more important role in environmental monitoring, land resource quantification and mapping, geo-disaster damage and risk assessment, urban planning and smart city development. This book focuses on fundamental and applied research in these domains, aiming to promote exchange and communication, share the research outcomes of scientists worldwide and put these achievements to better social use. This Special Issue collects fourteen high-quality research papers and is expected to provide a useful reference and technical support for graduate students, scientists, civil engineers and government experts seeking to make use of scientific research.

    Air Quality Management in Macao – assessment, development of an operational forecast, and future perspectives

    A combination of assessment, operational forecast, and future perspective was thoroughly explored to provide an overview of the existing air quality problems in Macao. The levels of air pollution in Macao often exceed those recommended by the World Health Organization (WHO). In order for the population to take precautionary measures and avoid further health risks during high pollution episodes, it is important to develop a reliable air quality forecast. Statistical models based on multiple linear regression (MLR) and classification and regression trees (CART) analysis were successfully developed for Macao to predict the next-day concentrations of NO2, PM10, PM2.5, and O3. Meteorological variables were selected from an extensive list of possible variables, including geopotential height, relative humidity, atmospheric stability, and air temperature at different vertical levels. Air quality variables reflect the persistence of each pollutant's recent concentrations and are usually the maximum and/or the average of the latest 24-hour levels. The models were applied to forecast the next-day average daily concentrations of NO2 and PM and the maximum hourly O3 levels for five air quality monitoring stations. The results are expected to support an operational air quality forecast for Macao. The work involved two phases. In the first phase, the models used meteorological and air quality variables based on five years of historical data, from 2013 to 2017. Data from 2013 to 2016 were used to develop the statistical models and data from 2017 were used for validation. All the developed models were statistically significant at the 95% confidence level, with high coefficients of determination (from 0.78 to 0.93) for all pollutants. In the second phase, these models were used with 2019 validation data, while a new set of models based on a more extended historical data series, from 2013 to 2018, was also validated with 2019 data. 
There were no significant differences in the coefficients of determination (R2) and only minor improvements in root mean square errors (RMSE), mean absolute errors (MAE) and biases (BIAS) between the 2013 to 2016 and the 2013 to 2018 data models. In addition, for one air quality monitoring station (Taipa Ambient), the 2013 to 2018 model was applied to a two-days-ahead (D2) forecast; the coefficient of determination (R2) was considerably lower than for the one-day-ahead (D1) forecast, but the model was still able to provide a reliable air quality forecast for Macao. To understand whether the prediction model was robust to extreme variations in pollutant concentrations, a test was performed for a high pollution episode for PM2.5 and O3 during 2019 and a low pollution episode during 2020. For the high pollution episode, the period of the Chinese National Holiday of 2019 was selected, in which high concentration levels were identified for PM2.5 and O3, with peak daily PM2.5 concentrations exceeding 55 μg/m3 and maximum hourly O3 concentrations exceeding 400 μg/m3. For the low pollution episode, the 2020 period in which preventive measures for the COVID-19 pandemic were implemented was selected, with a record low daily PM2.5 concentration of 2 μg/m3 and a maximum hourly O3 concentration of 50 μg/m3. The 2013 to 2018 model successfully predicted the high pollution episode with high coefficients of determination (0.92 for PM2.5 and 0.82 for O3). Likewise, the low pollution episode was also correctly predicted with high coefficients of determination (0.86 and 0.84 for PM2.5 and O3, respectively). Overall, the results demonstrate that the statistical forecast model is robust and able to correctly reproduce extreme air pollution events of both high and low concentration levels. 
Machine learning methods may be adopted in combination with multiple linear regression (MLR) and classification and regression trees (CART) to further improve the accuracy of the statistical forecast. The developed air pollution forecasting model may be combined with other measures to mitigate the impact of air pollution in Macao. These may include the establishment of low emission zones (LEZ), as enforced in some European cities; license plate restrictions and lottery policies, as used in some Asian cities; tax exemptions on electric vehicles (EVs); and exclusive corridors for public transportation.
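
The four validation statistics named in the abstract above (R2, RMSE, MAE and BIAS) can be computed as in the sketch below. This is a minimal illustration under stated assumptions, not the authors' code: R2 is taken as 1 minus the ratio of residual to total sum of squares (the paper may instead use a squared correlation), BIAS is taken as mean predicted minus observed, and the function name and sample values are hypothetical.

```python
import math


def forecast_metrics(observed, predicted):
    """Return R2, RMSE, MAE and BIAS for paired observed/predicted values.

    Assumptions: R2 = 1 - SS_res / SS_tot, and BIAS = mean(predicted - observed).
    """
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mean_obs = sum(observed) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(e) for e in errors) / n,
        "BIAS": sum(errors) / n,
    }


# Invented daily concentrations (same units for both lists), for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
metrics = forecast_metrics(observed, predicted)
print(metrics)
```

In a setup like the one described, these statistics would be computed on held-out years only, so that the reported skill reflects data the model did not see during fitting.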

    Data-driven analysis on the subbase strain prediction: a deep data augmentation-based study

    The service quality of the subbase may affect the overall road performance during its service life. Thus, monitoring and prediction of subbase strain development are of great importance for civil engineers. In this paper, a method based on time-series augmentation was employed to predict the subbase strain development. The time-series generative adversarial network (TimeGAN) model was implemented to augment the time-series data based on the original monitored data. A deep learning network was trained on the augmented data to learn the feature correlation of the subbase strain. The effect of TimeGAN on prediction accuracy was evaluated with the Attention-Sequence to Sequence (Attention-Seq2seq) model and the temporal convolution network-adaptively parametric rectifier linear units (TCN-APReLU) model. Results indicated that the TimeGAN network could capture sufficient information from the monitored time-series data of subbase strain development so that the corresponding augmented data matches the original data well, which improves the prediction accuracy. It was also found that the combination of TimeGAN and TCN-APReLU appropriately predicts the subbase strain development based on the original monitored data.

    Spatial-temporal prediction of air quality based on recurrent neural networks

    To predict air quality (PM2.5 concentrations, etc.), many parametric regression models have been developed, while deep learning algorithms are used less often. Few of them take air pollution emissions or spatial information into consideration or predict at an hourly scale. In this paper, we propose a spatial-temporal GRU-based prediction framework incorporating ground pollution monitoring (GPM), factory emissions (FE), and surface meteorology monitoring (SMM) variables to predict hourly PM2.5 concentrations. The dataset for empirical experiments was built from air quality monitoring in Shenyang, China. Experimental results indicate that our method enables more accurate predictions than all baseline models, and that applying convolutional processing to the GPM and FE variables yields a notable improvement in prediction accuracy.
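
For readers unfamiliar with the GRU building block named in the abstract above, one update step of a standard gated recurrent unit can be sketched as follows. This is a generic illustration, not the authors' framework: it omits their convolutional processing, uses one common convention for the update gate, and the weight names, dimensions and input values are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def gru_step(x, h, p):
    """One GRU update: x is the input vector, h the previous hidden state,
    p a dict of weights (Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh)."""
    # Update gate: how much of the candidate state replaces the old state.
    z = [sigmoid(a + b + c)
         for a, b, c in zip(matvec(p["Wz"], x), matvec(p["Uz"], h), p["bz"])]
    # Reset gate: how much of the old state feeds the candidate state.
    r = [sigmoid(a + b + c)
         for a, b, c in zip(matvec(p["Wr"], x), matvec(p["Ur"], h), p["br"])]
    reset_h = [ri * hi for ri, hi in zip(r, h)]
    candidate = [math.tanh(a + b + c)
                 for a, b, c in zip(matvec(p["Wh"], x), matvec(p["Uh"], reset_h), p["bh"])]
    # Convention used here: h_new = (1 - z) * h + z * candidate.
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, candidate)]


def zero_params(n_in, n_hidden):
    """All-zero weights, used only to make the example easy to check by hand."""
    w = [[0.0] * n_in for _ in range(n_hidden)]
    u = [[0.0] * n_hidden for _ in range(n_hidden)]
    b = [0.0] * n_hidden
    return {"Wz": w, "Uz": u, "bz": b, "Wr": w, "Ur": u, "br": b,
            "Wh": w, "Uh": u, "bh": b}


# Hypothetical hourly input: [PM2.5 reading, factory emission, wind speed].
x = [35.0, 1.2, 3.4]
h = gru_step(x, [1.0, -2.0], zero_params(n_in=3, n_hidden=2))
print(h)
```

With all-zero weights both gates equal 0.5 and the candidate state is 0, so the step simply halves the previous hidden state; trained weights would instead let the gates decide, hour by hour, how much history to keep.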