12 research outputs found

    Particulate Matter Sampling Techniques and Data Modelling Methods

    Particulate matter with a diameter of 10 μm or less (PM10) is known to have adverse effects on human health and the environment. For countries committed to reducing PM10 emissions, it is essential to have models that accurately estimate and predict PM10 concentrations for reporting and monitoring purposes. In this chapter, a broad overview of recent empirical statistical and machine learning techniques for modelling PM10 is presented. This includes the instrumentation used to measure particulate matter, data preprocessing, the selection of explanatory variables and modelling methods. Key features of some PM10 prediction models developed in the last 10 years are described, and current work modelling and predicting PM10 trends in New Zealand, a remote country of islands in the South Pacific Ocean, is examined. In conclusion, the issues and challenges faced when modelling PM10 are discussed, and suggestions for future avenues of investigation that could improve the precision of PM10 prediction and estimation models are presented.

    Analysis of sensory data using graph signal processing

    Air pollution monitoring has been researched intensively in the past few years thanks to the massive deployment of IoT platforms; air pollution affects the lives of both children and adults and kills millions of people worldwide every year. A recently presented framework of tools called Graph Signal Processing (GSP) makes it possible, among other things, to predict data at a node that belongs to a network of sensors using both the data itself and the topology of the graph, which is encoded in the Laplacian matrix. This thesis is a comparative study of different prediction techniques for pollutant signals, namely Linear Combination, Multiple Linear Regression (MLR) and GSP. It presents the results of all three methods in different scenarios, using the RMSE and R2 indicators, and focuses on understanding how different parameters (such as the distance between nodes) affect the performance of these new tools. The results show that the pollutants O3 and NO2 are low-pass signals and that, as the number of edges between nodes increases, GSP approaches the performance of MLR. PM10, by contrast, is found not to be a low-pass signal, and the indicators degrade sharply compared with those for O3 and NO2. Linear Combination performs worst of the three, and MLR performs stably across all scenarios.
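To make the idea of predicting a node from the graph topology concrete, here is a minimal sketch. It is not the thesis's method: it assumes a symmetric weight matrix with a zero diagonal and predicts one sensor as the value that minimises the Laplacian quadratic form x^T L x while all other sensors are held fixed, which works out to the weighted average of its neighbours. The names `laplacian`, `smoothness`, `predict_node` and `rmse` are hypothetical.

```python
from math import sqrt

def laplacian(weights):
    """Combinatorial Laplacian L = D - W (assumes symmetric W with zero diagonal)."""
    n = len(weights)
    return [[(sum(weights[i]) if i == j else 0.0) - weights[i][j]
             for j in range(n)] for i in range(n)]

def smoothness(weights, signal):
    """Quadratic form x^T L x; small values indicate a smooth (low-pass) signal."""
    L = laplacian(weights)
    n = len(signal)
    return sum(signal[i] * L[i][j] * signal[j]
               for i in range(n) for j in range(n))

def predict_node(weights, signal, node):
    """Value at `node` minimising x^T L x with every other node held fixed."""
    pairs = [(w, signal[j]) for j, w in enumerate(weights[node]) if j != node]
    return sum(w * x for w, x in pairs) / sum(w for w, _ in pairs)

def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Four sensors on a line, unit weights between neighbours (illustrative data).
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
readings = [10.0, 12.0, 14.0, 16.0]
estimate = predict_node(W, readings, 1)  # uses only sensors 0 and 2
```

A signal with a small `smoothness` value varies little across connected sensors, which is the informal sense in which O3 and NO2 are described as low-pass; for a signal that is not smooth on the graph, a neighbour-based estimate like this one would be expected to do worse.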

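    Temporal prediction of air quality in Lima from low-cost station data and Machine Learning: a literature review (original title: Predicción temporal de calidad del aire en Lima a partir de datos de estaciones de bajo costo y Aprendizaje Automático: una revisión de literatura)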

    This work explores studies that use deep learning techniques for temporal prediction of air quality, in order to understand the characteristics of the deep learning models that perform best at this task and that could serve as a baseline for developing similar models in the context of the city of Lima. The literature review aims to identify the deep learning models that currently perform best at temporal air quality prediction, using a procedure that guarantees objectivity and reproducibility of results. To that end, a systematic literature review is carried out, which ensures the use of structured, well-defined procedures to set out the research questions guiding the exploration of temporal air quality prediction studies, the search engines considered for the review, and the search strings associated with both the research questions and the search engines, so that the searches can be executed and the retrieval of studies reproduced. The answers are reported in an extraction form with data on the deep learning architectures, the limitations of the models employed and the performance obtained by each model in each study. The study concludes that a model based on a suitable deep learning architecture can be developed to tackle the problem of inadequate air quality prediction in Lima, given the effectiveness reported in the literature for other locations around the world, with the caveat that such models should be taken only as a baseline and must be adjusted to Lima to obtain predictions suited to its environment. Trabajo de investigación (research work).

    A New Collaborative Multi-Agent Monte Carlo Simulation Model for Spatial Correlations Air Pollutions Global Risk Assessment

    Air pollution risk assessment is complex due to dynamic data change and pollution source distribution. Predicting the concentration level of the air quality index is an effective method of protecting public health because it provides the means for an early warning against harmful air pollution. However, air quality index-based prediction is challenging as it depends on several complicated factors resulting from dynamic nonlinear air quality time-series data, such as dynamic weather patterns and the variety and distribution of air pollution sources. Consequently, only a few models have incorporated time series-based prediction of the air quality index at a global level (for a particular city or various cities). These models require interaction between the multiple air pollution sensing sources and additional parameters such as wind direction and wind speed. The existing methods for predicting the air quality index cannot handle short-term dependencies, and they mostly neglect the spatial correlations between the different parameters. Moreover, the assumption that only the most recent part of the air quality time series needs to be selected is not valid, because pollution behaves cyclically in response to various events and conditions, and the assumption carries a high risk of falling into local minima and generalizing poorly. Therefore, this paper proposes a new air pollution global risk assessment (APGRA) model for spatially correlated air quality index prediction and risk assessment to address these issues. The APGRA model incorporates an autoregressive integrated moving average (ARIMA) model, Monte Carlo simulation, a collaborative multi-agent system and a prediction algorithm to reduce air quality index prediction error and processing time. The proposed APGRA model is evaluated on real-world air quality datasets from Malaysia and China. It improves the average root mean squared error by 41% and the mean absolute error by 47.10% compared with the conventional ARIMA and ANFIS models.
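As a rough illustration of how an autoregressive model can be combined with Monte Carlo simulation, the sketch below fits a first-order autoregression by least squares and averages many simulated future paths. It is an assumption-laden stand-in, not the APGRA model: AR(1) replaces full ARIMA, the noise is drawn by resampling the fitted residuals, and there is no multi-agent component. The names `fit_ar1` and `monte_carlo_forecast` are hypothetical.

```python
import random
from statistics import mean

def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1] + e[t].

    Returns (c, phi, residuals).
    """
    prev, nxt = series[:-1], series[1:]
    mean_prev, mean_next = mean(prev), mean(nxt)
    spread = sum((p - mean_prev) ** 2 for p in prev)
    phi = sum((p - mean_prev) * (n - mean_next)
              for p, n in zip(prev, nxt)) / spread
    c = mean_next - phi * mean_prev
    residuals = [n - (c + phi * p) for p, n in zip(prev, nxt)]
    return c, phi, residuals

def monte_carlo_forecast(series, steps, n_paths=1000, seed=0):
    """Average of n_paths simulated futures, each driven by resampled residuals."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    c, phi, residuals = fit_ar1(series)
    totals = [0.0] * steps
    for _ in range(n_paths):
        x = series[-1]
        for h in range(steps):
            x = c + phi * x + rng.choice(residuals)
            totals[h] += x
    return [t / n_paths for t in totals]

# Illustrative index values that follow x[t] = 2 + 0.5 * x[t-1] exactly.
history = [10.0, 7.0, 5.5, 4.75, 4.375]
forecast = monte_carlo_forecast(history, steps=2, n_paths=50)
```

Keeping the individual simulated paths instead of only their mean would also give a spread of outcomes, which is the kind of information a risk assessment would typically draw on.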

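    Forecasting SO2 and NO2 concentrations in Ecuador from Sentinel 5P satellite images using Machine Learning techniques (original title: Pronóstico de las concentraciones de SO2 y NO2 en Ecuador a partir de imágenes satelitales Sentinel 5P, mediante técnicas de Machine Learning)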

    Air pollution has become one of the main environmental problems worldwide due to its effects on both the environment and health in general. National and international governments have made efforts to measure and control air pollutant emissions from anthropogenic sources by installing atmospheric monitoring networks. However, not all cities and countries have these monitoring tools. For this reason, the use of satellite images has been gaining ground in recent years, as it provides information on areas that lack ground-based monitoring, and this data can be used for control, prevention and research purposes. This information supports the analysis and modeling of the emissions and behavior of atmospheric pollutants. Because society needs to be warned and preventive measures taken regarding emissions of atmospheric pollutants, the scientific community has in recent years proposed various mathematical models and unsupervised learning models for predicting those emissions. Doing so requires taking into account the external variables that affect the behavior of pollutants in the study area, since geographical location, topography and meteorological conditions influence this behavior directly or indirectly; for this reason, researchers generally design models for specific regions. There is no established method for deciding which meteorological variables should be used to predict pollutants; the available guidance is previous studies, whose results indicate how these variables influence pollutant behavior. The present work proposes two models for predicting NO2 and SO2 concentrations in the three most important cities of Ecuador, based on information from Sentinel-5P satellite images, Giovanni NASA and ERA 5. The first model uses Recurrent Neural Networks with a number of lags, or created dummy variables, that are used to find relationships between concentration and the meteorological variables, which in turn provide information to the neural network to make the prediction. The aim was to predict air pollution up to 5 days ahead, testing different structures to find the best one for the forecast. The second model uses the Random Forest method, taking into account two important characteristics: the maximum depth of each tree and the minimum number of samples required for a node to be considered a leaf node. These two characteristics give two perspectives on random forests in the search for the best prediction model. Random Forest regression showed the best performance (R2 = 0.98), and its MAPE, RMSE and PBIAS error metrics were the lowest, at 7, 3.67 and 0.68 respectively. Across the different data sets, the prediction for the city of Cuenca was the best, followed by Guayaquil, which slightly outperforms the predictions for Quito. This shows that air quality prediction is effective, giving satisfactory results and opening the door to new research aimed at anticipating concentrations of polluting gases in the air so that preventive decisions can be made for both health and the environment. Bryam Montesdeoca Jara, Dayana Ortiz Morocho. Ingeniero Ambiental. Cuenc
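The lag features and the four reported metrics can be sketched in a few lines. This is an illustrative sketch, not the authors' code: the abstract gives no formulas, so the definitions below are common textbook ones, and the sign convention for PBIAS in particular (observed minus predicted, as a percentage of the observed total) is an assumption, since the literature uses both signs. All function names are hypothetical.

```python
from math import sqrt

def make_lagged(series, n_lags):
    """Build (rows, targets): each row holds the n_lags values preceding its target."""
    rows = [series[t - n_lags:t] for t in range(n_lags, len(series))]
    return rows, series[n_lags:]

def mape(observed, predicted):
    """Mean absolute percentage error, in percent (observed values must be nonzero)."""
    return 100.0 * sum(abs((o - p) / o)
                       for o, p in zip(observed, predicted)) / len(observed)

def rmse(observed, predicted):
    """Root mean squared error."""
    return sqrt(sum((o - p) ** 2
                    for o, p in zip(observed, predicted)) / len(observed))

def pbias(observed, predicted):
    """Percent bias; sign convention assumed: positive means underprediction."""
    return 100.0 * sum(o - p for o, p in zip(observed, predicted)) / sum(observed)

def r2(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Illustrative values only, not data from the study.
rows, targets = make_lagged([1, 2, 3, 4, 5], n_lags=2)
obs, pred = [10.0, 20.0, 30.0], [11.0, 18.0, 33.0]
scores = {"MAPE": mape(obs, pred), "RMSE": rmse(obs, pred),
          "PBIAS": pbias(obs, pred), "R2": r2(obs, pred)}
```

If a library such as scikit-learn were used for the second model, the two tree settings described in the abstract would correspond to the `max_depth` and `min_samples_leaf` parameters of `RandomForestRegressor`; the abstract does not say which library was actually used.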

    Data mining methods for prediction of air pollution

    No full text
    The paper discusses data mining methods for the prediction of air pollution. Two tasks are important in this problem: generating and selecting the prognostic features, and building the final system that forecasts pollution for the next day. An advanced set of features, created on the basis of atmospheric parameters, is proposed. This set is analysed to select the features that matter most for prediction. Two methods of feature selection are compared: one applies a genetic algorithm (a global approach), and the other a linear stepwise-fit method (a locally optimized approach). On the basis of this analysis, two sets of the most predictive features are selected. These sets are used to predict the atmospheric pollutants PM10, SO2, NO2 and O3. Two approaches to prediction are compared. In the first, the selected features are applied directly to a random forest (RF), which forms an ensemble of decision trees. In the second, intermediate predictors built on neural networks (the multilayer perceptron, the radial basis function and the support vector machine) are used; they form an ensemble that is integrated into the final prognosis. The paper shows that preselecting the most important features, combined with an ensemble of predictors, significantly increases the forecasting accuracy of atmospheric pollution.
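The two-stage idea, selecting features first and then combining several predictors, can be sketched as follows. This is a generic illustration under stated assumptions, not the paper's procedure: `forward_select` is a plain greedy wrapper driven by a caller-supplied score rather than the statistical stepwise fit or the genetic algorithm the paper compares, and `ensemble_average` assumes simple averaging, whereas the abstract does not say how the ensemble is integrated. The feature names and usefulness values are invented for the example.

```python
def forward_select(features, score, max_features):
    """Greedy forward selection.

    Repeatedly adds the feature that raises score(subset) the most and
    stops when no candidate improves the score or max_features is reached.
    """
    selected, remaining = [], list(features)
    best = float("-inf")
    while remaining and len(selected) < max_features:
        top_score, top_feature = max(
            ((score(selected + [f]), f) for f in remaining),
            key=lambda pair: pair[0],
        )
        if top_score <= best:
            break  # no remaining feature helps, so stop early
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
    return selected

def ensemble_average(predictions):
    """Combine several predictors by averaging their forecasts element-wise."""
    return [sum(values) / len(values) for values in zip(*predictions)]

# Hypothetical usefulness of each candidate feature (made-up numbers).
usefulness = {"temperature": 3.0, "wind_speed": 2.0, "humidity": -1.0}

def toy_score(subset):
    """Stand-in for a validation score; higher is better."""
    return sum(usefulness[f] for f in subset)

chosen = forward_select(list(usefulness), toy_score, max_features=3)
combined = ensemble_average([[40.0, 55.0], [44.0, 51.0], [42.0, 53.0]])
```

In practice the score would come from fitting a model on the candidate subset and measuring its error on held-out data, which is what makes a greedy search locally optimized in the sense the abstract describes.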
