66 research outputs found

    A Gaussian Bayesian model to identify spatio-temporal causalities for air pollution based on urban big data

    Identifying the causalities for air pollutants and answering questions such as where Beijing's air pollutants come from are crucial to inform government decision-making. In this paper, we identify the spatio-temporal (ST) causalities among air pollutants at different locations by mining urban big data. This is challenging for two reasons: 1) since air pollutants can be generated locally or dispersed from the neighborhood, we need to discover the causes in the ST space from many candidate locations with time efficiency; 2) the cause-and-effect relations between air pollutants are further affected by confounding variables such as meteorology. To tackle these problems, we propose a coupled Gaussian Bayesian model with two components: 1) a Gaussian Bayesian Network (GBN) to represent the cause-and-effect relations among air pollutants, with an entropy-based algorithm to efficiently locate the causes in the ST space; 2) a coupled model that combines cause-and-effect relations with meteorology to better learn the parameters while eliminating the impact of confounding. The proposed model is verified using air quality and meteorological data from 52 cities over the period June 1, 2013 to May 1, 2015. Results show that our model outperforms baseline causality learning methods in both time efficiency and prediction accuracy. © 2016 IEEE.
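
    The abstract above builds on a Gaussian Bayesian Network, in which each node is modeled as a linear function of its parents plus Gaussian noise. The following is a minimal sketch of that single building block only, fitted by ordinary least squares for one parent. It is not the paper's coupled model or its entropy-based search; the function name `fit_linear_gaussian` and the readings are hypothetical.

```python
from statistics import mean


def fit_linear_gaussian(parent, child):
    """Fit child ~ N(b0 + w * parent, var) by ordinary least squares.

    Illustrative only: one parent, one child, no confounders.
    """
    if len(parent) != len(child) or len(parent) < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mp, mc = mean(parent), mean(child)
    sxx = sum((p - mp) ** 2 for p in parent)
    if sxx == 0:
        raise ValueError("parent series has zero variance")
    sxy = sum((p - mp) * (c - mc) for p, c in zip(parent, child))
    w = sxy / sxx                      # edge weight parent -> child
    b0 = mc - w * mp                   # intercept
    residuals = [c - (b0 + w * p) for p, c in zip(parent, child)]
    var = sum(r * r for r in residuals) / len(residuals)  # noise variance
    return b0, w, var


# Hypothetical pollutant readings at an upwind and a downwind location.
upwind = [10.0, 20.0, 30.0, 40.0]
downwind = [7.0, 12.0, 17.0, 22.0]
b0, w, var = fit_linear_gaussian(upwind, downwind)
```

    In a full network, each location would have several candidate parents across space and time lags, which is where the paper's search for causes in the ST space comes in.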

    AQNet: Spatio-Temporal Prediction of Air Quality Using a Deep Generative Model

    Thesis (Master's), Seoul National University Graduate School, College of Engineering, Department of Electrical and Computer Engineering, August 2019. Cha, Sang Kyun. With the increase of global economic activities and high energy demand, many countries have concerns about air pollution. However, air quality prediction is a challenging issue due to the complex interaction of many factors. In this thesis, we propose a deep generative model for spatio-temporal air quality prediction, entitled AQNet. Unlike previous work, our model transforms air quality index data into 2D frames (heat-map images) to effectively capture spatial relations of air quality levels among different areas. It then combines the spatial representation with temporal features of critical factors such as meteorology and external air pollution sources. For prediction, the model first generates heat-map images of future air quality levels, then aggregates them into output values for the corresponding areas. Based on analyses of the data, we also assessed the impact of critical factors on air quality prediction. To evaluate the proposed method, we conducted experiments on two real-world air pollution datasets: the Seoul dataset and the China 1-year dataset. On the Seoul dataset, our method improved the mean absolute error for long-term predictions of PM2.5 and PM10 by 15.2% and 8.2%, respectively, compared to baselines and state-of-the-art methods. On the China dataset, our method improved the mean absolute error of PM2.5 predictions by 20% compared to the previous state-of-the-art results.
    Contents: Abstract; Contents; List of Tables; List of Figures; 1 Introduction (1.1 Air Pollution Problem; 1.2 Overview of the Proposed Method; 1.3 Contributions); 2 Related Work (2.1 Spatio-Temporal Prediction; 2.2 Air Pollution); 3 Overview (3.1 System Architecture; 3.2 User Interface); 4 Data Management (4.1 Real-time Data Collecting; 4.2 Data Collection; 4.3 Spatial Transformation Function: 4.3.1 District-based Interpolation, 4.3.2 Geo-based Interpolation); 5 Proposed Method (5.1 Data Source; 5.2 Problem Definition; 5.3 Model: 5.3.1 Encoder, 5.3.2 Decoder, 5.3.3 Training Algorithm); 6 Experiments (6.1 Baselines and State-of-the-art Methods; 6.2 Experimental Settings: 6.2.1 Implementation Details, 6.2.2 Evaluation Metric; 6.3 Experimental Results: 6.3.1 Performance on Spatial Module Selection, 6.3.2 Comparison to Baselines and State-of-the-art Methods, 6.3.3 Evaluation on China 1-year Dataset, 6.3.4 Assessing the Impact of Critical Factors); 7 Conclusion; Abstract (in Korean); Acknowledgement.
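
    The AQNet results above are reported as percentage improvements in mean absolute error (MAE). As a rough illustration of how such a figure is typically computed, the sketch below evaluates MAE for a baseline and a model and reports the relative reduction. The function names and all numbers are hypothetical and are not taken from the thesis.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over paired observations."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def relative_improvement(baseline_mae, model_mae):
    """Percentage reduction in MAE relative to the baseline."""
    if baseline_mae <= 0:
        raise ValueError("baseline MAE must be positive")
    return 100.0 * (baseline_mae - model_mae) / baseline_mae


# Hypothetical PM2.5 values (micrograms per cubic metre) for four time steps.
observed = [30.0, 40.0, 50.0, 60.0]
baseline_forecast = [35.0, 45.0, 45.0, 50.0]
model_forecast = [32.0, 44.0, 48.0, 52.0]

baseline_mae = mean_absolute_error(observed, baseline_forecast)
model_mae = mean_absolute_error(observed, model_forecast)
improvement = relative_improvement(baseline_mae, model_mae)
```

    Because lower MAE is better, a positive value means the model's error is smaller than the baseline's.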

    Statistical and Stochastic Learning Algorithms for Distributed and Intelligent Systems

    In the big data era, statistical and stochastic learning for distributed and intelligent systems focuses on improving the robustness of learning models that have become pervasive and are being deployed for decision-making in real-life applications, including general classification, prediction, and sparse sensing. The growing use of statistical learning approaches such as Linear Discriminant Analysis and distributed learning (e.g., in community sensing) has raised concerns about the robustness of algorithm design. Recent work on anomaly detection has shown that such learning models can also succumb to so-called 'edge-cases', where the real-life operational situation presents data that are not well represented in the training data set. Such cases have been the primary reason for quite a few recent misclassification bottleneck problems. Although initial research has begun to address scenarios with specific learning models, there remains a significant knowledge gap regarding the detection of 'edge-cases' and the adaptation of learning models to them and to extremely ill-posed settings in the context of distributed and intelligent systems. With this motivation, this dissertation explores the complexity of several typical applications and associated algorithms to detect and mitigate uncertainty, which will substantially reduce the risk of using statistical and stochastic learning algorithms for distributed and intelligent systems.

    Integrated human exposure to air pollution

    The book “Integrated human exposure to air pollution” aims to increase knowledge about human exposure in different micro-environments, or when citizens are performing specific tasks, to demonstrate methodologies for understanding pollution sources and their impact on indoor and ambient air quality, and, ultimately, to identify the most effective mitigation measures to decrease human exposure and protect public health. Taking advantage of the latest available tools, such as the internet of things (IoT), low-cost sensors, and citizens' wide access to online platforms and apps, new methodologies and approaches can be implemented to understand which factors influence human exposure to air pollution. This knowledge, when made available to citizens along with awareness of the impact of air pollution on human life and earth systems, can empower them to act, individually or collectively, to promote behavioral changes that reduce pollutant emissions. Overall, this book gathers fourteen innovative studies that provide new insights into these topics within the scope of human exposure to air pollution. Five main areas are discussed and explored in this book, which will hopefully contribute to the advance of knowledge in this field.

    Big data analytics for preventive medicine

    © 2019, Springer-Verlag London Ltd., part of Springer Nature. Medical data is among the most rewarding and yet most complicated data to analyze. How can healthcare providers use modern data analytics tools and technologies to analyze and create value from complex data? Data analytics promises to efficiently discover valuable patterns by analyzing large amounts of unstructured, heterogeneous, non-standard, and incomplete healthcare data. It not only forecasts but also helps in decision making, and it is increasingly regarded as a breakthrough whose goal is to improve the quality of patient care and reduce healthcare costs. The aim of this study is to provide a comprehensive and structured overview of extensive research on the advancement of data analytics methods for disease prevention. This review first introduces disease prevention and its challenges, followed by traditional prevention methodologies. We summarize state-of-the-art data analytics algorithms used for classification of disease, clustering (unusually high incidence of a particular disease), anomaly detection (detection of disease), and association, together with their respective advantages, drawbacks, and guidelines for selecting a specific model, followed by a discussion of recent developments and successful applications of disease prevention methods. The article concludes with open research challenges and recommendations.

    Characterising and modeling the co-evolution of transportation networks and territories

    The identification of structuring effects of transportation infrastructure on territorial dynamics remains an open research problem. This issue is one of the aspects of approaches on complexity of territorial dynamics, within which territories and networks would be co-evolving. The aim of this thesis is to challenge this view on interactions between networks and territories, both at the conceptual and empirical level, by integrating them in simulation models of territorial systems. Comment: Doctoral dissertation (2017), Université Paris 7 Denis Diderot. Translated from French. Several papers compose this PhD thesis; overlap with: arXiv:{1605.08888, 1608.00840, 1608.05266, 1612.08504, 1706.07467, 1706.09244, 1708.06743, 1709.08684, 1712.00805, 1803.11457, 1804.09416, 1804.09430, 1805.05195, 1808.07282, 1809.00861, 1811.04270, 1812.01473, 1812.06008, 1908.02034, 2012.13367, 2102.13501, 2106.11996

    Disaster and Pandemic Management Using Machine Learning: A Survey

    This article provides a literature review of state-of-the-art machine learning (ML) algorithms for disaster and pandemic management. Most nations are concerned about disasters and pandemics, which, in general, are highly unlikely events. To date, various technologies, such as IoT, object sensing, UAVs, 5G and cellular networks, smartphone-based systems, and satellite-based systems, have been used for disaster and pandemic management. ML algorithms can handle the multidimensional, large volumes of data that occur naturally in environments related to disaster and pandemic management and are particularly well suited to important related tasks, such as recognition and classification. ML algorithms are useful for predicting disasters and assisting in disaster management tasks, such as determining crowd evacuation routes, analyzing social media posts, and handling the post-disaster situation. ML algorithms also find wide application in pandemic management scenarios, such as predicting pandemics, monitoring pandemic spread, and disease diagnosis. This article first presents a tutorial on ML algorithms. It then presents a detailed review of several ML algorithms and how they can be combined with other technologies to address disaster and pandemic management. It also discusses various challenges, open issues, and directions for future research.