1,312 research outputs found

    Exploring Interpretable LSTM Neural Networks over Multi-Variable Data

    For recurrent neural networks trained on time series with target and exogenous variables, it is desirable to provide interpretable insights into the data in addition to accurate predictions. In this paper, we explore the structure of LSTM recurrent neural networks to learn variable-wise hidden states, with the aim of capturing the different dynamics in multi-variable time series and distinguishing the contribution of each variable to the prediction. With these variable-wise hidden states, a mixture attention mechanism is proposed to model the generative process of the target. We then develop associated training methods to jointly learn the network parameters and the variable and temporal importance with respect to the prediction of the target variable. Extensive experiments on real datasets demonstrate enhanced prediction performance from capturing the dynamics of different variables. We also evaluate the interpretation results both qualitatively and quantitatively. The approach shows promise as an end-to-end framework for both forecasting and knowledge extraction over multi-variable data. Comment: Accepted to International Conference on Machine Learning (ICML), 201
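As a rough illustration of the mixture attention idea in the abstract above, the sketch below combines per-variable predictions using softmax weights, which can then be read as variable importance. It is a simplified stand-in, not the paper's model: the function names, the use of a plain softmax over scalar scores, and the example numbers are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_prediction(variable_means, variable_scores):
    """Combine per-variable predictions with softmax mixture weights.

    variable_means:  one predicted target value per input variable
                     (hypothetical outputs of variable-wise hidden states).
    variable_scores: one unnormalized relevance score per variable.
    Returns the mixed prediction and the weights (variable importance).
    """
    weights = softmax(variable_scores)
    prediction = sum(w * mu for w, mu in zip(weights, variable_means))
    return prediction, weights


# Hypothetical example: three input variables.
means = [10.0, 12.0, 8.0]
scores = [2.0, 0.5, 0.5]
pred, importance = mixture_prediction(means, scores)
```

Because the weights are non-negative and sum to one, the mixed prediction always lies between the smallest and largest per-variable prediction, and the weights can be compared directly across variables.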

    Application and Evaluation of LSTM Architectures for Energy Time-Series Forecasting

    Accurate energy forecasting is a very active research field, as reliable information about future electricity generation allows for the safe operation of the power grid and helps to minimize excess electricity production. Because recurrent neural networks outperform most other machine learning approaches in time series forecasting, they have become widely used for energy forecasting problems. In this work, the Persistence forecast and the ARIMA model are used as baseline methods, and long short-term memory (LSTM) neural networks in various configurations are built for multi-step energy forecasting. The work investigates three LSTM-based architectures: i) standard LSTM, ii) stacked LSTM, and iii) sequence-to-sequence LSTM. Univariate and multivariate learning problems are investigated with each of these architectures. The LSTM models are applied to six different time series taken from publicly available data, and six LSTM models are trained for each time series. Their performance is measured with five different evaluation metrics. Based on the results across all metrics, the robustness of the LSTM architectures is assessed over the six time series.
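The Persistence baseline mentioned in the abstract above can be sketched in a few lines: a multi-step persistence forecast simply repeats the last observed value. The abstract does not name its five evaluation metrics, so MAE and RMSE below are assumed examples, and the function names and sample values are illustrative only.

```python
import math


def persistence_forecast(history, horizon):
    """Multi-step persistence forecast: repeat the last observed value."""
    if not history:
        raise ValueError("history must contain at least one observation")
    return [history[-1]] * horizon


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical series: three past observations, three future values.
history = [3.0, 4.0, 5.0]
actual = [5.0, 6.0, 8.0]
forecast = persistence_forecast(history, horizon=3)
baseline_mae = mae(actual, forecast)
baseline_rmse = rmse(actual, forecast)
```

A learned model is usually only worth its cost if it beats this baseline on the same horizon and metrics.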

    AQNet: Spatio-Temporal Air Quality Prediction Using a Deep Generative Model

    Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, 2019. 8. Cha, Sang Kyun.
    With the increase in global economic activity and high energy demand, many countries have concerns about air pollution. However, air quality prediction is a challenging problem because of the complex interaction of many factors. In this thesis, we propose a deep generative model for spatio-temporal air quality prediction, named AQNet. Unlike previous work, our model transforms air quality index data into 2D frames (heat-map images) to effectively capture the spatial relations of air quality levels among different areas. It then combines this spatial representation with temporal features of critical factors such as meteorology and external air pollution sources. For prediction, the model first generates heat-map images of future air quality levels, then aggregates them into output values for the corresponding areas. Based on analyses of the data, we also assess the impact of the critical factors on air quality prediction. To evaluate the proposed method, we conducted experiments on two real-world air pollution datasets: the Seoul dataset and the China 1-year dataset. On the Seoul dataset, our method improved the mean absolute error for long-term predictions of PM2.5 and PM10 by 15.2% and 8.2%, respectively, compared to baselines and state-of-the-art methods. On the China dataset, our method improved the mean absolute error of PM2.5 predictions by 20% compared to the previous state-of-the-art results.
    Contents: 1 Introduction (1.1 Air Pollution Problem; 1.2 Overview of the Proposed Method; 1.3 Contributions). 2 Related Work (2.1 Spatio-Temporal Prediction; 2.2 Air Pollution). 3 Overview (3.1 System Architecture; 3.2 User Interface). 4 Data Management (4.1 Real-time Data Collecting; 4.2 Data Collection; 4.3 Spatial Transformation Function: 4.3.1 District-based Interpolation, 4.3.2 Geo-based Interpolation). 5 Proposed Method (5.1 Data Source; 5.2 Problem Definition; 5.3 Model: 5.3.1 Encoder, 5.3.2 Decoder, 5.3.3 Training Algorithm). 6 Experiments (6.1 Baselines and State-of-the-art Methods; 6.2 Experimental Settings: 6.2.1 Implementation Details, 6.2.2 Evaluation Metric; 6.3 Experimental Results: 6.3.1 Performance on Spatial Module Selection, 6.3.2 Comparison to Baselines and State-of-the-art Methods, 6.3.3 Evaluation on China 1-year Dataset, 6.3.4 Assessing the Impact of Critical Factors). 7 Conclusion.
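The step of turning station readings into a 2D frame and then aggregating a frame back into per-area values can be sketched as below. This is not the thesis's spatial transformation function: inverse distance weighting is an assumed interpolation scheme, and the grid coordinates, area labels, and function names are hypothetical.

```python
def idw_grid(stations, rows, cols, power=2.0):
    """Rasterize station readings into a rows x cols grid.

    stations: list of (row, col, value) tuples in grid coordinates.
    Each cell gets an inverse-distance-weighted average of all stations;
    a cell that coincides with a station takes that station's value.
    """
    grid = []
    for r in range(rows):
        row_values = []
        for c in range(cols):
            weighted_sum = 0.0
            weight_total = 0.0
            exact = None
            for s_row, s_col, value in stations:
                dist_sq = (r - s_row) ** 2 + (c - s_col) ** 2
                if dist_sq == 0:
                    exact = value
                    break
                weight = 1.0 / dist_sq ** (power / 2.0)
                weighted_sum += weight * value
                weight_total += weight
            if exact is not None:
                row_values.append(exact)
            else:
                row_values.append(weighted_sum / weight_total)
        grid.append(row_values)
    return grid


def aggregate_by_area(grid, area_map):
    """Average grid cells per area; area_map has the same shape as grid."""
    sums, counts = {}, {}
    for grid_row, area_row in zip(grid, area_map):
        for value, area in zip(grid_row, area_row):
            sums[area] = sums.get(area, 0.0) + value
            counts[area] = counts.get(area, 0) + 1
    return {area: sums[area] / counts[area] for area in sums}


# Hypothetical 1 x 3 frame with stations at both ends.
frame = idw_grid([(0, 0, 10.0), (0, 2, 20.0)], rows=1, cols=3)
per_area = aggregate_by_area(frame, [["A", "A", "B"]])
```

In a pipeline like the one the abstract describes, a model would predict future frames and the aggregation step would map each predicted frame back to one value per area.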

    Citywide Air Pollution Interpolation and Prediction Based on a Spatiotemporal Deep Learning Model

    Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, 2019. 2. Cha, Sang Kyun.
    Air pollution is one of the greatest concerns of big cities. Many countries have built air quality monitoring stations around major cities to measure air pollutants and warn citizens about the air pollution around them. However, air pollution is not uniform across a city; it is a spatiotemporal problem that varies by location (spatial feature) and by time (temporal feature). Consequently, citywide air pollution interpolation and prediction is needed so that urban residents can know the air quality across time and space and reduce health risks. Moreover, air pollution is affected by many spatiotemporal factors throughout the city. Among them, meteorology is recognized as one of the most significant influences on air pollution. In addition, traffic volume reflects the density of vehicles on roads, which is a primary cause of air pollution. Average driving speed indicates traffic congestion, which is also considered to influence air pollution over the city. Finally, external air pollution sources are claimed to contribute to a city's air pollution problem. In this thesis, we present many spatiotemporal datasets collected over Seoul, Korea, such as air pollution data, meteorological data, traffic volume, and average driving speed, together with air pollution data from three areas of China (Beijing, Shanghai, and Shandong) that are known to affect Seoul's air pollution. Recent research has tried to build models that predict air pollution at specific locations and future times. Nonetheless, it has mostly focused on predicting air pollution at discrete locations or has used hand-crafted spatial and temporal features. Recently, deep learning models such as the Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), and Long Short-Term Memory (LSTM) have become known to perform well on spatial and temporal problems. In this thesis, we propose using a Convolutional Long Short-Term Memory (ConvLSTM) model, a combination of CNN and LSTM, which efficiently handles the spatial and temporal features of the data and outperforms other recent research.
    Contents: 1 Introduction (1.1 Air Pollution Description; 1.2 Citywide Air Pollution Interpolation and Prediction; 1.3 Spatiotemporal Datasets Introduction; 1.4 Thesis Contributions). 2 Related Work (2.1 Spatiotemporal Air Pollution Interpolation; 2.2 Machine Learning/Neural Networks Based Air Pollution Prediction Models; 2.3 Spatiotemporal Deep Learning Models). 3 Spatiotemporal Deep Learning Model (3.1 CNN and LSTM Models; 3.2 ConvLSTM Model; 3.3 Air Pollution Interpolation and Prediction). 4 Experiments and Evaluations (4.1 Baselines Description; 4.2 Experiments and Evaluations: 4.2.1 Air Pollution Interpolation: Experiments and Evaluations, 4.2.2 Air Pollution Forecasting: Experiments and Evaluations). 5 Conclusions and Future Work (5.1 Conclusions; 5.2 Future Work).
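For readers unfamiliar with ConvLSTM, the cell is commonly written as below, where $*$ denotes convolution and $\circ$ the element-wise (Hadamard) product. This is the widely used general formulation, given here only for orientation; the listing does not state which variant the thesis uses, so the symbols and the peephole terms ($W_{ci}$, $W_{cf}$, $W_{co}$) should be read as assumptions.

```latex
\begin{aligned}
i_t &= \sigma\left(W_{xi} * X_t + W_{hi} * H_{t-1} + W_{ci} \circ C_{t-1} + b_i\right) \\
f_t &= \sigma\left(W_{xf} * X_t + W_{hf} * H_{t-1} + W_{cf} \circ C_{t-1} + b_f\right) \\
C_t &= f_t \circ C_{t-1} + i_t \circ \tanh\left(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c\right) \\
o_t &= \sigma\left(W_{xo} * X_t + W_{ho} * H_{t-1} + W_{co} \circ C_t + b_o\right) \\
H_t &= o_t \circ \tanh\left(C_t\right)
\end{aligned}
```

The inputs $X_t$, hidden states $H_t$, and cell states $C_t$ are all 3D tensors (channels by rows by columns), which is what lets the cell keep the spatial layout of a pollution map while modeling its evolution over time.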