4,879 research outputs found

    Air Quality Prediction in Smart Cities Using Machine Learning Technologies Based on Sensor Data: A Review

    The influence of machine learning technologies is rapidly increasing and reaching almost every field, and air pollution prediction is no exception. This paper reviews studies on air pollution prediction using machine learning algorithms based on sensor data in the context of smart cities. The most relevant papers were selected by searching the most popular databases and applying the corresponding filters. After a thorough review of those papers, their main features were extracted and used as a basis to link and compare them. As a result, we can conclude that: (1) authors now apply advanced and sophisticated techniques instead of simple machine learning techniques, (2) China was the leading country in terms of case studies, (3) particulate matter with a diameter of 2.5 micrometers or less (PM2.5) was the main prediction target, (4) in 41% of the publications the authors carried out the prediction for the next day, (5) 66% of the studies used data with an hourly rate, (6) 49% of the papers used open data, a share that has tended to increase since 2016, and (7) for efficient air quality prediction it is important to consider external factors such as weather conditions, spatial characteristics, and temporal features.

    Citywide Air Pollution Interpolation and Prediction Based on a Spatiotemporal Deep Learning Model

    Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, February 2019. Sang Kyun Cha (차상균).

    Air pollution is one of the greatest concerns of big cities. Many countries have built air quality monitoring stations around major cities to measure air pollutants and warn citizens about the air pollution around them. However, air pollution is not uniform across a city; it is a spatiotemporal problem that varies by location (spatial feature) and by time (temporal feature). Consequently, citywide air pollution interpolation and prediction is needed so that urban residents can know the air quality across time and space and eliminate health risks. Moreover, air pollution is affected by many spatiotemporal factors throughout the city. Among them, meteorology is recognized as one of the most significant influences on air pollution. In addition, traffic volume reflects the density of vehicles on roads, which is a primary cause of air pollution. Average driving speed indicates traffic congestion, which is also considered to influence air pollution over the city. Finally, external air pollution sources are claimed to contribute to a city's air pollution problem. In this thesis, we present many spatiotemporal datasets collected over Seoul, Korea, such as air pollution data, meteorological data, traffic volume, and average driving speed, together with air pollution data from three areas of China (Beijing, Shanghai, and Shandong) that are known to affect Seoul's air pollution.

    Recent research on air pollution has tried to build models that predict air pollution at specific locations and future times. Nonetheless, most of it focused on predicting air pollution at discrete locations or used hand-crafted spatial and temporal features. Deep learning models such as the Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), and Long Short-Term Memory (LSTM) are known to perform well on spatial and temporal problems. In this thesis, we propose using the Convolutional Long Short-Term Memory (ConvLSTM) model, a combination of CNN and LSTM, which efficiently handles the spatial and temporal features of the data and outperforms other recent research.

    Contents:
    1 Introduction
    1.1 Air pollution description
    1.2 Citywide air pollution interpolation and prediction
    1.3 Spatiotemporal datasets introduction
    1.4 Thesis contributions
    2 Related Work
    2.1 Spatiotemporal air pollution interpolation
    2.2 Machine learning/neural networks based air pollution prediction models
    2.3 Spatiotemporal deep learning models
    3 Spatiotemporal Deep Learning Model
    3.1 CNN and LSTM models
    3.2 ConvLSTM model
    3.3 Air pollution interpolation and prediction
    4 Experiments and Evaluations
    4.1 Baselines description
    4.2 Experiments and evaluations
    4.2.1 Air pollution interpolation: experiments and evaluations
    4.2.2 Air pollution forecasting: experiments and evaluations
    5 Conclusions and Future Work
    5.1 Conclusions
    5.2 Future work

    Intelligent Data Analytics using Deep Learning for Data Science

    Data science is attracting growing interest from academics and practitioners because it can help extract significant insights from massive amounts of data. According to the International Data Corporation, the Global Datasphere is expected to grow from 33 Zettabytes in 2018 to 175 Zettabytes by 2025. This dissertation proposes an intelligent data analytics framework that uses deep learning to tackle several difficulties in implementing a data science application. These difficulties include dealing with high inter-class similarity, the availability and quality of hand-labeled data, and designing a feasible approach for modeling significant correlations in features gathered from various data sources. The proposed framework employs a novel strategy for improving data representation learning by incorporating supplemental data from various sources and structures. First, the research presents a multi-source fusion approach that uses confident learning techniques to improve the quality of data from many noisy sources. Meta-learning methods based on advanced techniques such as the mixture of experts and differential evolution combine the predictive capacity of individual learners with a gating mechanism, ensuring that only the most trustworthy features or predictions are integrated to train the model. Then, a Multi-Level Convolutional Fusion is presented to train a model on the correspondence between local-global deep feature interactions in order to identify easily confused samples of different classes. The convolutional fusion is further enhanced with Graph Transformers, which aggregate the relevant neighboring features in graph-based input data structures and achieve state-of-the-art performance on a large-scale building damage dataset. Finally, weakly supervised strategies, noise regularization, and label propagation are proposed to train a model on sparsely labeled input data, ensuring the model's robustness to errors and supporting the automatic expansion of the training set. The suggested approaches outperformed competing strategies in effectively training a model on a large-scale dataset of 500k photos, with only about 7% of the images annotated by a human. The proposed framework's capabilities have benefited various data science applications, including fluid dynamics, geometric morphometrics, building damage classification from satellite images, disaster scene description, and storm-surge visualization.