1,275 research outputs found

    Integrated data-driven techniques for environmental pollution monitoring

    The adverse health effects of tropospheric ozone around urban zones indicate a substantial risk for many segments of the population. This necessitates short-term forecasts so that evasive action can be taken on days conducive to ozone formation. It is therefore important to study the ozone formation mechanisms and predict the ozone levels in a geographic region. Multivariate statistical techniques provide a very effective framework for the classification and monitoring of systems with multiple variables. Cluster analysis, sequence analysis and hidden Markov models (HMMs) are statistical methods which have been used in a wide range of studies to model the data structure. In this dissertation, we propose to formulate, implement and apply a data-driven computational framework for air quality monitoring and forecasting with application to ozone formation. The proposed framework integrates, in a unique way, advanced statistical data processing and analysis tools to investigate ozone formation mechanisms and predict the ozone levels in a geographic region. This dissertation focuses on cluster analysis for identification and classification of underlying mechanisms of a system and HMMs for predicting the occurrence of an extreme event in a system. The usefulness of the proposed methodology in air quality monitoring is demonstrated by applying it to study the ozone problem in the Houston, Texas, and Baton Rouge, Louisiana, regions. Hierarchical clustering is used to visualize air flow patterns at two time scales relevant for ozone buildup. First, clustering is performed at the hourly time scale to identify surface flow patterns. Then, sequencing is performed at the daily time scale to identify groups of days sharing similar diurnal cycles for the surface flow. Selection of appropriate numbers of air flow patterns allowed inference of regional transport and dispersion patterns for understanding population exposure to ozone. 
This dissertation proposes to build HMMs for ozone prediction using air quality and meteorological measurements obtained from a network of surface monitors. The case study of the Houston, Texas, region for the 2004 and 2005 ozone seasons indicates that HMMs can serve as a simpler forecasting tool

    Data Mining Paradigm in the Study of Air Quality

    Air pollution is a serious global problem that threatens human life and health, as well as the environment. The most important aspect of a successful air quality management strategy is the system for measurement analysis, air quality forecasting, and reporting. A complete insight, an accurate prediction, and a rapid response may provide valuable information for society's decision-making. The data mining paradigm can assist in the study of air quality by providing a structured work methodology that simplifies data analysis. This study presents a systematic review of the literature from 2014 to 2018 on the use of data mining in the analysis of air pollutant measurements. For this review, a data mining approach to air quality analysis was proposed that was consistent with the 748 articles consulted. The most frequent sources of data have been the measurements of monitoring networks, along with other technologies such as remote sensing, low-cost sensors, and social networks, which have gained importance in recent years. Among the topics studied in the literature were the redundancy of the information collected in the monitoring networks, the forecasting of pollutant levels or of days on which regulatory limits are exceeded, and the identification of meteorological or land use parameters that have the most substantial impact on air quality. As methods to visualise and present the results, we identified graphic design, air quality index development, heat mapping, and geographic information systems. We hope that this study will anchor theoretical and practical development in the field and that it will provide inputs for air quality planning and management. Facultad de Ciencias Exacta

    SOME TRIGONOMETRIC SIMILARITY MEASURES OF COMPLEX FUZZY SETS WITH APPLICATION

    Similarity measures of fuzzy sets are applied to compare the closeness among fuzzy sets. These measures have numerous applications in pattern recognition, image processing, texture synthesis, medical diagnosis, etc. However, in many cases of pattern recognition, digital image processing, signal processing, and so forth, the similarity measures of the fuzzy sets are not appropriate due to the presence of dual information of an object, such as amplitude term and phase term. In these cases, similarity measures of complex fuzzy sets are the most suitable for measuring proximity between objects with two-dimensional information. In the present paper, we propose some trigonometric similarity measures of the complex fuzzy sets involving similarity measures based on the sine, tangent, cosine, and cotangent functions. Furthermore, in many situations in real life, the weight of an attribute plays an important role in making the right decisions using similarity measures. So in this paper, we also consider the weighted trigonometric similarity measures of the complex fuzzy sets, namely, the weighted similarity measures based on the sine, tangent, cosine, and cotangent functions. Some properties of the similarity measures and the weighted similarity measures are discussed. We also apply our proposed methods to the pattern recognition problem and compare them with existing methods to show the validity and effectiveness of our proposed methods

    Imputation through Clustering of Time Series Data: a case study in air pollution

    Air pollution is a global problem, and air pollution concentration assessment plays an essential role in evaluating the associated risk to human health. Unfortunately, air pollution monitoring stations often have periods of missing data. In this thesis, we investigated the missing-values problem in air quality data by looking at the hourly pollutant concentration Time Series (TS) of the four main pollutants included in air quality assessment: O3, NO2, PM2.5, and PM10. The research presented in this thesis aims to reduce the uncertainty of the air quality assessment by proposing methods for the imputation of missing values, either partially or completely. Our approach uses clustering of stations based on measured pollutants to inform the imputation. We started by testing univariate clustering and then developed a multivariate time series (MVTS) clustering method that considers all measured pollutants at a station by aggregating the similarity between those pollutants (through a fused distance), followed by imputation models for the whole TS. We developed various imputation models, including ensemble models which aggregate temporal similarity obtained from clustering and spatial similarity obtained from the geographical correlation between stations. Our experimental results show that using MVTS clustering enables imputation of unmeasured pollutants at any station and produces plausible imputed values for all pollutants. Ensemble imputation models (Models 8 and 9) gave the lowest RMSE and the highest index of agreement (IOA) between imputed and real values, and met the minimum requirement criteria using FAC2 for air quality modelling. The imputation models reproduce high pollution episodes at stations within the clusters where these episodes possibly happened but were not measured, as some of them were captured by the cluster centroids. 
We also found two important pollutants associated with those episodes, PM2.5 and O3, which may require more measurements or should be imputed in different locations for more realistic air quality monitoring
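
The idea of letting station clusters inform imputation can be illustrated with a minimal sketch. It is not one of the thesis's imputation models: the function name `impute_from_cluster`, the station labels, the cluster assignments, and the concentration values are all hypothetical, and the fill rule (the mean of same-cluster stations at the same hour) is a simple stand-in chosen for illustration.

```python
from statistics import mean


def impute_from_cluster(series, cluster_of, target):
    """Fill None gaps in series[target] with the same-hour mean of its cluster peers.

    Hypothetical helper: `series` maps station -> list of hourly values (None = missing),
    `cluster_of` maps station -> cluster label produced by some earlier clustering step.
    """
    peers = [s for s in series
             if s != target and cluster_of[s] == cluster_of[target]]
    filled = []
    for hour, value in enumerate(series[target]):
        if value is not None:
            filled.append(value)  # keep every measured value untouched
            continue
        peer_values = [series[p][hour] for p in peers
                       if series[p][hour] is not None]
        # Leave the gap open when no peer in the cluster measured this hour.
        filled.append(mean(peer_values) if peer_values else None)
    return filled


# Made-up hourly O3 values for four stations; A, B, C share a cluster, D does not.
series = {
    "A": [40.0, None, 44.0, None],
    "B": [38.0, 41.0, 43.0, None],
    "C": [42.0, 45.0, None, None],
    "D": [10.0, 12.0, 11.0, 9.0],
}
cluster_of = {"A": 0, "B": 0, "C": 0, "D": 1}

filled_a = impute_from_cluster(series, cluster_of, "A")
print(filled_a)  # [40.0, 43.0, 44.0, None]
```

Station D is ignored because it belongs to a different cluster, and the last hour stays missing because no peer of A measured it; an ensemble model of the kind the abstract describes would additionally draw on spatial similarity between stations.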

    Environmental risk assessment in the mediterranean region using artificial neural networks

    Self-organizing maps have proven to be an appropriate tool for classifying and visualizing complex data sets. Neural networks, such as self-organizing maps (SOMs) or fuzzy ARTMAP (FAM) networks, are used in this study to assess cumulative environmental impact in different media (groundwater, air, and human health). SOMs are also used to generate maps of pollutant concentrations in groundwater, emulating geostatistical interpolation techniques such as kriging and cokriging. To assess the reliability of the methodologies developed in this thesis, reference procedures are used as points of comparison: the DRASTIC methodology for the study of groundwater vulnerability, and the spatio-temporal interpolation method known as Bayesian Maximum Entropy (BME) for air quality analysis. This thesis contributes to demonstrating the capabilities of neural networks in developing new methodologies and models that explicitly allow the temporal and spatial dimensions of cumulative risks to be assessed

    A sustainability approach using fuzzy logic and data mining (Un enfoque de sustentabilidad utilizando lógica difusa y minería de datos)

    Sustainable development goals are now the agreed criteria to monitor states, and this work will demonstrate that numerical and graphical methods are valuable tools in assessing progress. Fuzzy Logic is a reliable procedure for transforming human qualitative knowledge into quantitative variables that can be used in "if, then" reasoning to obtain answers pertaining to sustainability assessment. Applications of machine learning techniques and artificial intelligence procedures span almost all fields of science. Here, for the first time, unsupervised machine learning is applied to sustainability assessment, combining numerical approaches with graphical procedures to analyze global sustainability. CD HJ-Biplots, which graphically portray the sustainability position of a large number of countries, are a useful complement to mathematical models of sustainability. Graphical information could be useful to planners because it shows directly how countries are grouped according to the most related sustainability indicators. Thus, planners can prioritize social, environmental, and economic policies and make the most effective decisions. A graphical approach also makes it possible to observe the dynamic evolution of sustainability worldwide over time and to draw relevant conclusions. In an era of climate change, species extinction, poverty, and environmental migration, such observations could aid political decision-making regarding the future of our planet. A large number of countries remain in the areas of moderate or low sustainability. Fuzzy logic has proven to be an uncontested numerical method, as is the case with SAFE. An unsupervised learning method called Variational Autoencoder interplay Graphical Analysis (VEA&GA) has been proposed to support sustainability performance with appropriate training data. 
The promising results show that this can be a sound alternative for assessing sustainability, with applications that extend to other kinds of problems at different levels of analysis (continents, regions, cities, etc.), further corroborating the effectiveness of the unsupervised training methods