
    Prediction Using LSTM Networks

    Photovoltaic (PV) systems convert sunlight into electrical power. It is predicted that by 2023, 371,000 PV installations will be embedded in power networks in the UK. This may increase the risk of voltage rise, which has adverse impacts on the power network. Maintaining the supply-demand balance is important for the security of the physical electrical system and for economical operation. Therefore, predicting the output of PV systems is of great importance. The output of a PV system depends strongly on local environmental conditions, including solar radiation, temperature, and humidity. In this research, the importance of various weather factors is studied. The weather attributes are subsequently employed to predict solar panel power generation from a time-series database. Long Short-Term Memory networks are employed to capture the dependencies between the various elements of the weather conditions and the PV energy metrics. Evaluation results indicate the efficiency of the deep networks for energy generation prediction.
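A minimal sketch of the usual preprocessing step behind this kind of study: turning a time-series database into supervised (past window, next value) pairs that an LSTM can be trained on. The data values and the helper name `make_windows` are illustrative, not from the paper.

```python
def make_windows(series, window=3):
    """Split a 1-D time series into (input window, next value) pairs."""
    pairs = []
    for i in range(len(series) - window):
        pairs.append((series[i:i + window], series[i + window]))
    return pairs

# Hourly PV output readings (illustrative values, kW)
power = [0.0, 0.4, 1.1, 1.8, 2.0, 1.6]
pairs = make_windows(power, window=3)
# Each pair: three past readings and the reading to predict
print(pairs[0])  # ([0.0, 0.4, 1.1], 1.8)
```

In practice each window would also carry the weather attributes (radiation, temperature, humidity) alongside the power readings.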

    Recurrent Highway Networks

    Many sequential processing tasks require complex nonlinear transition functions from one step to the next. However, recurrent neural networks with 'deep' transition functions remain difficult to train, even when using Long Short-Term Memory (LSTM) networks. We introduce a novel theoretical analysis of recurrent networks based on Gersgorin's circle theorem that illuminates several modeling and optimization issues and improves our understanding of the LSTM cell. Based on this analysis we propose Recurrent Highway Networks, which extend the LSTM architecture to allow step-to-step transition depths larger than one. Several language modeling experiments demonstrate that the proposed architecture results in powerful and efficient models. On the Penn Treebank corpus, solely increasing the transition depth from 1 to 10 improves word-level perplexity from 90.6 to 65.4 using the same number of parameters. On the larger Wikipedia datasets for character prediction (text8 and enwik8), RHNs outperform all previous results and achieve an entropy of 1.27 bits per character. Comment: 12 pages, 6 figures, 3 tables
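A scalar toy sketch of the highway recurrence the abstract describes: inside one time step, several micro-layers each compute a candidate state and a transform gate, and the state is a gated mixture of the two. The weights here are made-up constants for illustration, not trained parameters, and real RHNs operate on vectors with matrix weights.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def rhn_step(x, s, depth=3):
    """One RHN time step on a single unit, with `depth` micro-layers."""
    w_h, r_h, b_h = 0.5, 0.3, 0.0   # candidate ("H") parameters (illustrative)
    w_t, r_t, b_t = 0.4, 0.2, -1.0  # transform gate ("T") parameters (illustrative)
    for layer in range(depth):
        inp = x if layer == 0 else 0.0   # only the first micro-layer sees the input
        h = math.tanh(w_h * inp + r_h * s + b_h)
        t = sigmoid(w_t * inp + r_t * s + b_t)
        s = h * t + s * (1.0 - t)        # highway update: carry gate is 1 - t
    return s

print(rhn_step(1.0, 0.0, depth=3))
```

Raising `depth` from 1 to 10 corresponds to the transition-depth sweep reported on Penn Treebank.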

    Analysis And Comparison Of Long Short-Term Memory Networks' Short-Term Traffic Prediction Performance

    Long short-term memory (LSTM) networks produce promising results in the prediction of traffic flows. However, LSTM needs large amounts of data to produce satisfactory results. Therefore, this study investigated the effect of training set size on LSTM performance and the optimum training set size for short-term traffic flow prediction. To achieve this, the number of samples in the training set was varied between 480 and 2800, and the prediction performance of the LSTMs trained on these adjusted training sets was measured. In addition, the LSTM prediction results were compared with nonlinear autoregressive neural networks (NAR) trained on the same training sets. It was observed that increasing the LSTM training set size improved performance up to a certain point, after which performance decreased. Three main results emerged from this study: first, using the optimum training set size significantly improves the prediction performance of the LSTM model; second, LSTM performs short-term traffic forecasting better than NAR; third, LSTM predictions fluctuate less than those of the NAR model following abrupt changes in traffic flow.
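A hypothetical harness for the experiment described above: train the same model on increasing training-set sizes and record prediction error. The `train_and_score` function uses a trivial mean predictor as a stand-in for the LSTM and NAR models in the study, and the traffic-flow values are synthetic.

```python
def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

def train_and_score(train, test):
    mean = sum(train) / len(train)          # stand-in "model": predict the mean
    return mae(test, [mean] * len(test))

flows = [100 + (i % 7) * 10 for i in range(3000)]   # synthetic traffic-flow series
test = flows[2800:]
# Sweep training-set sizes in the range used by the study (480 to 2800)
results = {n: train_and_score(flows[:n], test) for n in (480, 960, 1920, 2800)}
for n, err in results.items():
    print(n, round(err, 2))
```

With a real LSTM in place of the mean predictor, plotting `results` would reveal the performance peak at the optimum training set size.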

    Cryptocurrency price prediction using LSTM neural networks

    The interest in cryptocurrencies is increasing among individuals and investors. Bitcoin is the leading cryptocurrency, with the highest market capitalization. However, its high volatility, combined with political uncertainty, makes its value very difficult to predict. Therefore, there is a need for advanced models that use mathematical and statistical methods to reduce investment risk. This research aims to verify whether long short-term memory (LSTM) and bidirectional long short-term memory (BiLSTM) neural networks can be used with a Savitzky–Golay filter to predict next-day bitcoin closing prices. We found evidence that both networks can be used effectively to predict bitcoin prices: LSTM achieved a mean absolute percentage error (MAPE) of 4.49 and BiLSTM a MAPE of 4.44. We also found that using the Savitzky–Golay filter and dropout regularization significantly improved the models' prediction performance.
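A minimal pure-Python Savitzky–Golay smoother, the kind of pre-filter the study applies to price series before training. This sketch fixes a window of 5 and a quadratic fit, whose standard convolution coefficients are (-3, 12, 17, 12, -3)/35; the study's exact window and polynomial order are not given in the abstract.

```python
COEFFS = [-3, 12, 17, 12, -3]   # Savitzky-Golay, window 5, polynomial order 2

def savgol5(series):
    """Smooth a series with a 5-point quadratic Savitzky-Golay filter."""
    out = list(series)                       # keep the two edge points at each end
    for i in range(2, len(series) - 2):
        window = series[i - 2:i + 3]
        out[i] = sum(c * v for c, v in zip(COEFFS, window)) / 35.0
    return out

# A quadratic is reproduced exactly by a quadratic fit, which makes a
# handy sanity check before applying the filter to noisy prices.
prices = [float(i * i) for i in range(8)]
print(savgol5(prices))
```

Unlike a plain moving average, this filter preserves local curvature, which is why it suits peaked, trending series such as closing prices.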

    Activity Recognition and Prediction in Real Homes

    In this paper, we present work in progress on activity recognition and prediction in real homes using either binary sensor data or depth video data. We present our field trial and set-up for collecting and storing the data, our methods, and our current results. We compare the accuracy of predicting the next binary sensor event using probabilistic methods and Long Short-Term Memory (LSTM) networks, include the time information to improve prediction accuracy, and predict both the next sensor event and its mean time of occurrence using one LSTM model. We investigate transfer learning between apartments and show that it is possible to pre-train the model with data from other apartments and achieve good accuracy in a new apartment straight away. In addition, we present preliminary results from activity recognition using low-resolution depth video data from seven apartments, and classify four activities - no movement, standing up, sitting down, and TV interaction - using a relatively simple processing method in which we apply an Infinite Impulse Response (IIR) filter to extract movements from the frames before feeding them to a convolutional LSTM network for classification. Comment: 12 pages, Symposium of the Norwegian AI Society NAIS 201
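An illustrative sketch of one way an IIR filter can extract movement from depth frames (not necessarily the paper's exact filter): a first-order IIR background estimate per pixel, so movement shows up as the deviation of the current frame from the slowly updated background.

```python
def iir_movement(frames, alpha=0.9):
    """Per-pixel movement maps from a sequence of flat depth frames."""
    bg = list(frames[0])                     # initialize background from frame 0
    movements = []
    for frame in frames[1:]:
        move = [abs(p - b) for p, b in zip(frame, bg)]
        # First-order IIR update: background drifts slowly toward the new frame
        bg = [alpha * b + (1 - alpha) * p for p, b in zip(frame, bg)]
        movements.append(move)
    return movements

# Three tiny 4-pixel "depth frames"; pixel 2 changes (someone moves)
frames = [[5, 5, 5, 5], [5, 5, 9, 5], [5, 5, 9, 5]]
print(iir_movement(frames))
```

The resulting movement maps, rather than the raw depth frames, would then be fed to the convolutional LSTM classifier.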

    Robust training of recurrent neural networks to handle missing data for disease progression modeling

    Disease progression modeling (DPM) using longitudinal data is a challenging task in machine learning for healthcare that can provide clinicians with better tools for diagnosis and monitoring of disease. Existing DPM algorithms neglect temporal dependencies among measurements and make parametric assumptions about biomarker trajectories. In addition, they do not model multiple biomarkers jointly and need to align subjects' trajectories. In this paper, recurrent neural networks (RNNs) are utilized to address these issues. However, in many cases, longitudinal cohorts contain incomplete data, which hinders the application of standard RNNs and requires a pre-processing step such as imputation of the missing values. We, therefore, propose a generalized training rule for the most widely used RNN architecture, long short-term memory (LSTM) networks, that can handle missing values in both target and predictor variables. This algorithm is applied for modeling the progression of Alzheimer's disease (AD) using magnetic resonance imaging (MRI) biomarkers. The results show that the proposed LSTM algorithm achieves a lower mean absolute error for prediction of measurements across all considered MRI biomarkers compared to using standard LSTM networks with data imputation or using a regression-based DPM method. Moreover, applying linear discriminant analysis to the biomarkers' values predicted by the proposed algorithm results in a larger area under the receiver operating characteristic curve (AUC) for clinical diagnosis of AD compared to the same alternatives, and the AUC is comparable to state-of-the-art AUCs from a recent cross-sectional medical image classification challenge. This paper shows that built-in handling of missing values in LSTM network training paves the way for application of RNNs in disease progression modeling. Comment: 9 pages, 1 figure, MIDL conference
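A hedged sketch of the core masking idea: a loss that simply skips missing target values (`None` here), so the network is trained only on observed measurements instead of imputed ones. The paper builds this rule into LSTM training itself; this stand-alone function only illustrates the masking, with made-up biomarker values.

```python
def masked_mae(targets, predictions):
    """Mean absolute error over observed targets only; missing ones are skipped."""
    observed = [(t, p) for t, p in zip(targets, predictions) if t is not None]
    if not observed:
        return 0.0   # nothing observed at this step: contributes no error
    return sum(abs(t - p) for t, p in observed) / len(observed)

biomarkers = [1.2, None, 0.8, None, 1.0]   # longitudinal visits, two missing
predicted = [1.0, 0.9, 0.8, 0.7, 1.4]
print(masked_mae(biomarkers, predicted))   # averages over the 3 observed values
```

Because missing targets contribute no gradient, no imputation step is needed before training, which is the advantage the abstract reports over standard LSTMs with imputed data.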