Modelling of nonlinear stochastic dynamical systems using neurofuzzy networks
Although nonlinear stochastic dynamical systems can be approximated by feedforward neural networks, the dimension of the network's input space may be too large, making the approach of little practical use. The Nonlinear AutoRegressive Moving Average model with eXogenous input (NARMAX) is shown to be able to represent nonlinear stochastic dynamical systems under certain conditions. As the dimension of its input space is finite, it can readily be applied in practice. It is well known that training recurrent networks with gradient methods suffers from a slow convergence rate. In this paper, a fast training algorithm based on the Newton-Raphson method is presented for recurrent neurofuzzy networks with the NARMAX structure. The convergence and uniqueness properties of the proposed training algorithm are established. A simulation example involving a nonlinear dynamical system corrupted by correlated noise and a sinusoidal disturbance illustrates the performance of the proposed training algorithm.
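The speed advantage of Newton-Raphson over plain gradient descent can be illustrated with a minimal sketch (this is an illustrative toy, not the paper's algorithm): for a quadratic loss, one Newton-Raphson step, which divides the gradient by the second derivative, lands exactly on the minimizer, whereas gradient descent would need many small steps.

```python
# Illustrative sketch (not the paper's algorithm): a Newton-Raphson update
# for a single parameter w minimizing the quadratic loss L(w) = sum (y - w*x)^2.
# Newton-Raphson converges in one step on a quadratic, which is why such
# second-order updates can be far faster than plain gradient descent.

def newton_step(w, xs, ys):
    # gradient: dL/dw = -2 * sum x * (y - w*x)
    grad = -2.0 * sum(x * (y - w * x) for x, y in zip(xs, ys))
    # Hessian: d2L/dw2 = 2 * sum x^2 (constant for this quadratic loss)
    hess = 2.0 * sum(x * x for x in xs)
    return w - grad / hess

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]            # generated by the true weight w* = 2
w = newton_step(0.0, xs, ys)    # a single step recovers w* = 2 exactly
```

For a full recurrent network the Hessian is a matrix and the update involves solving a linear system, but the one-step behaviour on locally quadratic losses is what drives the fast convergence claimed in the abstract.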
Extended Kalman Filter In Recurrent Neural Network: USDIDR Forecasting Case Study
Artificial Neural Networks (ANNs), especially Recurrent Neural Networks (RNNs), have been widely used to predict currency exchange rates. The learning algorithm commonly used in ANNs is Stochastic Gradient Descent (SGD). One advantage of SGD is that its computation time is relatively short, but it also has weaknesses: it requires several hyperparameters, such as the regularization parameter, and it typically needs many epochs to reach convergence. The Extended Kalman Filter (EKF) is used here as a learning algorithm for the RNN in place of SGD, with the aim of better accuracy and a faster convergence rate. This study uses IDR/USD exchange rate data from 31 August 2015 to 29 August 2018, with 70% of the data used for training and 30% for testing. The results show that RNN-EKF achieves both faster convergence and better accuracy than RNN-SGD.
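The idea of using an EKF as a learning algorithm is to treat the network weights as the state to be estimated and each training target as a noisy measurement. A minimal one-parameter sketch, assuming a scalar model h(w) = w * x with measurement noise R (the paper's actual state and Jacobian are vector-valued), looks like:

```python
# Minimal sketch (assumed scalar setup, not the paper's exact formulation):
# an EKF that estimates a single model weight w. The "measurement" is the
# model output h(w) = w * x, and H is its Jacobian dh/dw = x.

def ekf_weight_update(w, P, x, y, R=0.01):
    H = x                       # Jacobian of h(w) = w * x w.r.t. w
    y_pred = w * x              # predicted output
    S = H * P * H + R           # innovation covariance
    K = P * H / S               # Kalman gain
    w_new = w + K * (y - y_pred)
    P_new = (1.0 - K * H) * P   # posterior covariance shrinks each step
    return w_new, P_new

w, P = 0.0, 1.0                 # initial weight guess and its uncertainty
for x, y in [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]:  # data from w* = 2
    w, P = ekf_weight_update(w, P, x, y)
```

Because the Kalman gain adapts per step from the covariance P, no learning rate or regularization hyperparameter is tuned by hand, which is the practical advantage over SGD highlighted in the abstract.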
Improving Language Modelling with Noise-Contrastive Estimation
Neural language models do not scale well when the vocabulary is large.
Noise-contrastive estimation (NCE) is a sampling-based method that allows for
fast learning with large vocabularies. Although NCE has shown promising
performance in neural machine translation, it was considered an unsuccessful approach for language modelling, and a thorough investigation of the hyperparameters of NCE-based neural language models was also missing. In
this paper, we showed that NCE can be a successful approach in neural language
modelling when the hyperparameters of a neural network are tuned appropriately.
We introduced the 'search-then-converge' learning rate schedule for NCE and
designed a heuristic that specifies how to use this schedule. The impact of the
other important hyperparameters, such as the dropout rate and the weight
initialisation range, was also demonstrated. We showed that appropriately tuned NCE-based neural language models outperform state-of-the-art single-model methods on a popular benchmark.
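The 'search-then-converge' idea can be sketched with the classic schedule of that name (Darken and Moody); the paper's exact variant and constants are assumptions here. The learning rate stays near its initial value while t is small relative to a switch-over time tau (the "search" phase), then decays roughly as 1/t (the "converge" phase):

```python
# Sketch of the classic 'search-then-converge' learning rate schedule:
#   lr(t) = lr0 / (1 + t / tau)
# For t << tau the rate is ~lr0 (search); for t >> tau it decays ~lr0*tau/t
# (converge). lr0 and tau here are hypothetical values for illustration.

def search_then_converge(t, lr0=1.0, tau=100.0):
    return lr0 / (1.0 + t / tau)

early = search_then_converge(1)      # close to lr0: still searching
late = search_then_converge(10000)   # ~lr0/101: converging
```

A separate heuristic, as the abstract notes, is still needed to decide when and how to apply such a schedule alongside the other hyperparameters (dropout rate, weight initialisation range).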
Learning Compact Recurrent Neural Networks with Block-Term Tensor Decomposition
Recurrent Neural Networks (RNNs) are powerful sequence modeling tools.
However, when dealing with high-dimensional inputs, the training of RNNs
becomes computationally expensive due to the large number of model parameters.
This hinders RNNs from solving many important computer vision tasks, such as
Action Recognition in Videos and Image Captioning. To overcome this problem, we
propose a compact and flexible structure, namely Block-Term tensor
decomposition, which greatly reduces the parameters of RNNs and improves their
training efficiency. Compared with alternative low-rank approximations, such as
tensor-train RNN (TT-RNN), our method, Block-Term RNN (BT-RNN), is not only
more concise (when using the same rank), but also able to attain a better
approximation to the original RNNs with far fewer parameters. On three
challenging tasks, including Action Recognition in Videos, Image Captioning and
Image Generation, BT-RNN outperforms TT-RNN and the standard RNN in terms of
both prediction accuracy and convergence rate. Specifically, BT-LSTM utilizes
17,388 times fewer parameters than the standard LSTM to achieve an accuracy
improvement of over 15.6% in the Action Recognition task on the UCF11 dataset.
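The source of the parameter savings can be sketched by counting. A dense input-to-hidden matrix mapping prod(I) inputs to prod(J) outputs is reshaped into an order-d tensor and replaced by N Tucker terms, each with a rank-R core and d factor matrices. The counting formula below follows the BT-RNN formulation as described; treat the exact expression and the example shapes as assumptions for illustration:

```python
# Hedged sketch: comparing parameter counts of a dense weight matrix versus
# an assumed Block-Term decomposition with N Tucker terms of rank R.
from functools import reduce

def dense_params(I, J):
    # a dense matrix mapping prod(I) inputs to prod(J) outputs
    prod = lambda xs: reduce(lambda a, b: a * b, xs, 1)
    return prod(I) * prod(J)

def bt_params(I, J, N=1, R=2):
    # N terms, each: one order-d core of size R^d plus, per mode k,
    # a factor of size I_k x J_k x R (formula assumed for illustration)
    d = len(I)
    cores = N * R ** d
    factors = N * sum(i * j * R for i, j in zip(I, J))
    return cores + factors

I, J = [4, 8, 8], [4, 8, 8]   # a 256 -> 256 mapping, factorized per mode
ratio = dense_params(I, J) / bt_params(I, J)   # hundreds-fold reduction
```

Even this small hypothetical example shrinks 65,536 dense parameters to a few hundred, which is the mechanism behind the dramatic reductions the abstract reports for BT-LSTM.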