Prediction Using LSTM Networks
Photovoltaic (PV) systems convert sunlight into electrical power. It is predicted that by 2023, 371,000 PV installations will be embedded in power networks in the UK.
This may increase the risk of voltage rise, which has adverse impacts on the power network. Maintaining this balance is important for the security of physical electrical systems and for economical operation. Therefore, predicting the output of PV systems is of great importance. The output of a PV system depends strongly on local environmental conditions, including solar radiation, temperature, and humidity. In this research, the importance of various weather factors is studied. The weather attributes are subsequently employed to predict solar panel power generation from a time-series database. Long Short-Term Memory (LSTM) networks are employed to capture the dependencies between the various elements of the weather conditions and the PV energy metrics. Evaluation results indicate the efficiency of deep networks for energy generation prediction.
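The LSTM cell underlying this kind of weather-driven time-series prediction can be sketched in a few lines. The following numpy implementation is illustrative only: the feature count, hidden size, and random weights are assumptions for demonstration, not the paper's configuration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    """One LSTM cell step. x: current input (e.g. a vector of weather
    features), (h, c): previous hidden and cell state. W, U, b hold the
    stacked weights for the four gates."""
    n = h.shape[0]
    z = W @ x + U @ h + b          # stacked pre-activations, shape (4n,)
    i = sigmoid(z[0:n])            # input gate
    f = sigmoid(z[n:2 * n])        # forget gate
    o = sigmoid(z[2 * n:3 * n])    # output gate
    g = np.tanh(z[3 * n:4 * n])    # candidate cell update
    c_new = f * c + i * g          # new cell state
    h_new = o * np.tanh(c_new)     # new hidden state, bounded in (-1, 1)
    return h_new, c_new

# Toy example: 3 weather features (say radiation, temperature, humidity),
# hidden size 4, random weights for illustration only.
rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
W = rng.normal(size=(4 * n_hid, n_in))
U = rng.normal(size=(4 * n_hid, n_hid))
b = np.zeros(4 * n_hid)
h = np.zeros(n_hid)
c = np.zeros(n_hid)
for x in rng.normal(size=(10, n_in)):  # a 10-step weather sequence
    h, c = lstm_step(x, h, c, W, U, b)
```

The cell state `c` carries long-range dependencies across the sequence, which is what lets the network relate weather histories to later PV output.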
Recurrent Highway Networks
Many sequential processing tasks require complex nonlinear transition
functions from one step to the next. However, recurrent neural networks with
'deep' transition functions remain difficult to train, even when using Long
Short-Term Memory (LSTM) networks. We introduce a novel theoretical analysis of
recurrent networks based on Gersgorin's circle theorem that illuminates several
modeling and optimization issues and improves our understanding of the LSTM
cell. Based on this analysis we propose Recurrent Highway Networks, which
extend the LSTM architecture to allow step-to-step transition depths larger
than one. Several language modeling experiments demonstrate that the proposed
architecture results in powerful and efficient models. On the Penn Treebank
corpus, solely increasing the transition depth from 1 to 10 improves word-level
perplexity from 90.6 to 65.4 using the same number of parameters. On the larger
Wikipedia datasets for character prediction (text8 and enwik8), RHNs outperform
all previous results and achieve an entropy of 1.27 bits per character.
Comment: 12 pages, 6 figures, 3 tables
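The step-to-step transition depth described above can be sketched as a stack of highway layers applied within a single time step. This is a minimal numpy illustration assuming the coupled-gate variant (carry gate c = 1 - t); the sizes and random parameters are placeholders, not the paper's setup.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rhn_step(x, s, Wx, Rs, bs):
    """One Recurrent Highway step. x: input at this time step; s: previous
    state; Rs/bs: per-layer recurrent weights and biases for the H
    (candidate) and T (transform gate) nonlinearities. The transition
    depth is len(Rs); only layer 0 receives the input x."""
    for l, (R_h, R_t) in enumerate(Rs):
        in_h = Wx[0] @ x if l == 0 else 0.0
        in_t = Wx[1] @ x if l == 0 else 0.0
        h = np.tanh(in_h + R_h @ s + bs[l][0])   # candidate update
        t = sigmoid(in_t + R_t @ s + bs[l][1])   # transform gate
        s = h * t + s * (1.0 - t)                # highway: carry = 1 - t
    return s

# Toy example: transition depth 3, state size 5, input size 2.
rng = np.random.default_rng(1)
depth, n = 3, 5
Wx = (rng.normal(size=(n, 2)), rng.normal(size=(n, 2)))
Rs = [(rng.normal(size=(n, n)), rng.normal(size=(n, n))) for _ in range(depth)]
bs = [(np.zeros(n), np.zeros(n)) for _ in range(depth)]
s = np.zeros(n)
for x in rng.normal(size=(6, 2)):
    s = rhn_step(x, s, Wx, Rs, bs)
```

Because each layer outputs a convex combination of a tanh candidate and the carried state, a state initialized in (-1, 1) remains bounded there regardless of depth.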
Analysis and Comparison of Long Short-Term Memory Networks' Short-Term Traffic Prediction Performance
Long short-term memory (LSTM) networks produce promising results in the prediction of traffic flows. However, LSTM needs large amounts of data to produce satisfactory results. Therefore, the effect of LSTM training set size on performance, and the optimum training set size for short-term traffic flow prediction problems, were investigated in this study. To achieve this, the number of samples in the training set was varied between 480 and 2800, and the prediction performance of the LSTMs trained on these adjusted training sets was measured. In addition, LSTM prediction results were compared with nonlinear autoregressive neural networks (NAR) trained on the same training sets. It was observed that increasing LSTM's training set size improved performance up to a certain point, after which performance decreased. Three main results emerged from this study: first, choosing the optimum training set size for LSTM significantly improves the prediction performance of the model; second, LSTM performs short-term traffic forecasting better than NAR; third, LSTM predictions fluctuate less than those of the NAR model following instantaneous changes in traffic flow.
Cryptocurrency price prediction using LSTM neural networks
The interest in cryptocurrencies is increasing among individuals and investors. Bitcoin is the leading
existing cryptocurrency with the highest market capitalization. However, its high volatility aligns with
political uncertainty making it very difficult to predict its value. Therefore, there is a need to create
advanced models that use mathematical and statistical methods to reduce investment risk. This research
aims to verify if long short-term memory (LSTM), and bidirectional long short-term memory (BiLSTM)
neural networks, can be used with Savitzky–Golay filter to predict next-day bitcoin closing prices. We
found evidence that both networks can be used effectively to predict bitcoin prices: LSTM achieved a
mean absolute percentage error (MAPE) of 4.49 and BiLSTM a MAPE of 4.44. We also found that using the
Savitzky–Golay filter and dropout regularization significantly improved the models' prediction performance.
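The Savitzky–Golay smoothing step used above fits a low-order polynomial to each sliding window by least squares and takes its value at the window center; `scipy.signal.savgol_filter` is the standard implementation. The following dependency-free numpy sketch shows the idea (window length, polynomial order, and edge handling are illustrative choices, not the paper's):

```python
import numpy as np

def savgol_smooth(x, window=11, polyorder=3):
    """Minimal Savitzky-Golay smoother: least-squares fit a polynomial of
    degree `polyorder` to each centered window of length `window` (odd) and
    evaluate it at the center. Edges are handled by reflecting the series."""
    half = window // 2
    padded = np.concatenate([x[half:0:-1], x, x[-2:-half - 2:-1]])  # reflect pad
    t = np.arange(-half, half + 1)
    out = np.empty(len(x), dtype=float)
    for i in range(len(x)):
        coeffs = np.polyfit(t, padded[i:i + window], polyorder)
        out[i] = coeffs[-1]  # polynomial value at the window center (t = 0)
    return out

# Toy example: a smooth curve plus noise, e.g. a price-like series.
t = np.linspace(0, 1, 50)
clean = t ** 3 - t
noisy = clean + np.random.default_rng(2).normal(scale=0.05, size=t.size)
smoothed = savgol_smooth(noisy, window=11, polyorder=3)
```

A useful property: any signal that is itself a polynomial of degree at most `polyorder` passes through the filter unchanged (away from the padded edges), which is why the filter smooths noise while preserving local peaks better than a plain moving average.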
Activity Recognition and Prediction in Real Homes
In this paper, we present work in progress on activity recognition and
prediction in real homes using either binary sensor data or depth video data.
We present our field trial and set-up for collecting and storing the data, our
methods, and our current results. We compare the accuracy of predicting the
next binary sensor event using probabilistic methods and Long Short-Term Memory
(LSTM) networks, incorporate time information to improve prediction accuracy,
and predict both the next sensor event and its mean time of occurrence
using a single LSTM model. We investigate transfer learning between apartments and
show that it is possible to pre-train the model with data from other apartments
and achieve good accuracy in a new apartment straight away. In addition, we
present preliminary results from activity recognition using low-resolution
depth video data from seven apartments, and classify four activities - no
movement, standing up, sitting down, and TV interaction - by using a relatively
simple processing method where we apply an Infinite Impulse Response (IIR)
filter to extract movements from the frames prior to feeding them to a
convolutional LSTM network for the classification.
Comment: 12 pages, Symposium of the Norwegian AI Society NAIS 201
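The movement-extraction pre-processing can be illustrated with a first-order IIR background model; the paper does not spell out its exact filter design, so the exponential update below (bg ← α·bg + (1−α)·frame) is one common choice and the decay constant is an assumption.

```python
import numpy as np

def extract_movement(frames, alpha=0.95):
    """Extract movement from a frame sequence (e.g. low-resolution depth
    video) with a first-order IIR background model. Each frame's absolute
    difference from the slowly-adapting background highlights motion while
    static furniture and walls fade out."""
    bg = frames[0].astype(float)
    motion = np.zeros(frames.shape, dtype=float)
    for k in range(1, len(frames)):
        bg = alpha * bg + (1.0 - alpha) * frames[k]  # IIR background update
        motion[k] = np.abs(frames[k] - bg)           # per-pixel movement map
    return motion

# Toy example: a 5-frame, 4x4 "video" where one pixel changes at frame 3.
frames = np.zeros((5, 4, 4))
frames[3, 2, 2] = 10.0
motion = extract_movement(frames)
```

A fully static sequence produces an all-zero motion map, while a sudden change yields a strong response at that pixel, which is the cue fed to the convolutional LSTM classifier.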
Robust training of recurrent neural networks to handle missing data for disease progression modeling
Disease progression modeling (DPM) using longitudinal data is a challenging
task in machine learning for healthcare that can provide clinicians with better
tools for diagnosis and monitoring of disease. Existing DPM algorithms neglect
temporal dependencies among measurements and make parametric assumptions about
biomarker trajectories. In addition, they do not model multiple biomarkers
jointly and need to align subjects' trajectories. In this paper, recurrent
neural networks (RNNs) are utilized to address these issues. However, in many
cases, longitudinal cohorts contain incomplete data, which hinders the
application of standard RNNs and requires a pre-processing step such as
imputation of the missing values. We, therefore, propose a generalized training
rule for the most widely used RNN architecture, long short-term memory (LSTM)
networks, that can handle missing values in both target and predictor
variables. This algorithm is applied for modeling the progression of
Alzheimer's disease (AD) using magnetic resonance imaging (MRI) biomarkers. The
results show that the proposed LSTM algorithm achieves a lower mean absolute
error for prediction of measurements across all considered MRI biomarkers
compared to using standard LSTM networks with data imputation or using a
regression-based DPM method. Moreover, applying linear discriminant analysis to
the biomarkers' values predicted by the proposed algorithm results in a larger
area under the receiver operating characteristic curve (AUC) for clinical
diagnosis of AD compared to the same alternatives, and the AUC is comparable to
state-of-the-art AUCs from a recent cross-sectional medical image
classification challenge. This paper shows that built-in handling of missing
values in LSTM network training paves the way for application of RNNs in
disease progression modeling.
Comment: 9 pages, 1 figure, MIDL conference
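The core of built-in missing-value handling on the target side can be sketched as a loss that is evaluated only where measurements exist. This is a simplified numpy illustration of a masked mean-squared error; the paper's generalized training rule also covers missing predictor variables, which this sketch does not attempt.

```python
import numpy as np

def masked_mse(pred, target):
    """Mean-squared error that skips missing (NaN) targets, so gradient
    signal comes only from observed biomarker measurements. Returns 0.0
    when nothing is observed."""
    mask = ~np.isnan(target)          # True where a measurement exists
    if not mask.any():
        return 0.0
    diff = pred[mask] - target[mask]  # errors on observed entries only
    return float(np.mean(diff ** 2))
```

For example, with targets `[1.0, NaN, 3.0]` and predictions `[1.5, 100.0, 2.0]`, the NaN entry is ignored entirely, so a wildly wrong prediction at an unobserved visit contributes nothing to the loss.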