Modelling of nonlinear stochastic dynamical systems using neurofuzzy networks
Although nonlinear stochastic dynamical systems can be approximated by feedforward neural networks, the dimension of the network's input space may be too large, making the approach of little practical use. The Nonlinear AutoRegressive Moving Average model with eXogenous input (NARMAX) is shown to be able to represent nonlinear stochastic dynamical systems under certain conditions. As the dimension of its input space is finite, it can readily be applied in practice. It is well known that training recurrent networks with gradient methods suffers from a slow convergence rate. In this paper, a fast training algorithm based on the Newton-Raphson method is presented for recurrent neurofuzzy networks with the NARMAX structure. The convergence and uniqueness properties of the proposed training algorithm are established. A simulation example involving a nonlinear dynamical system corrupted by correlated noise and a sinusoidal disturbance illustrates the performance of the proposed training algorithm.
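The speed advantage of Newton-Raphson over plain gradient descent can be illustrated with a minimal sketch (this is an illustrative toy, not the paper's algorithm): for a quadratic loss, one Newton-Raphson step, which divides the gradient by the second derivative, lands exactly on the minimizer, whereas gradient descent would need many small steps.

```python
# Illustrative sketch (not the paper's algorithm): a Newton-Raphson update
# for a single parameter w minimizing the quadratic loss L(w) = sum (y - w*x)^2.
# Newton-Raphson converges in one step on a quadratic, which is why such
# second-order updates can be far faster than plain gradient descent.

def newton_step(w, xs, ys):
    # gradient: dL/dw = -2 * sum x * (y - w*x)
    grad = -2.0 * sum(x * (y - w * x) for x, y in zip(xs, ys))
    # Hessian: d2L/dw2 = 2 * sum x^2 (constant for this quadratic loss)
    hess = 2.0 * sum(x * x for x in xs)
    return w - grad / hess

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]            # generated by the true weight w* = 2
w = newton_step(0.0, xs, ys)    # a single step recovers w* = 2 exactly
```

For a full recurrent network the Hessian is a matrix and the update involves solving a linear system, but the one-step behaviour on locally quadratic losses is what drives the fast convergence claimed in the abstract.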
Extended Kalman Filter In Recurrent Neural Network: USDIDR Forecasting Case Study
Artificial Neural Networks (ANNs), especially Recurrent Neural Networks (RNNs), have been widely used to predict currency exchange rates. The learning algorithm commonly used in ANNs is Stochastic Gradient Descent (SGD). One advantage of SGD is that its computation time is relatively short, but it also has weaknesses: it requires several hyperparameters, such as the regularization parameter, and it typically needs many epochs to reach convergence. The Extended Kalman Filter (EKF) is used here as a learning algorithm for the RNN in place of SGD, with the aim of better accuracy and a faster convergence rate. This study uses IDR/USD exchange rate data from 31 August 2015 to 29 August 2018, with 70% of the data used for training and 30% for testing. The results show that RNN-EKF achieves both faster convergence and better accuracy than RNN-SGD.
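The idea of using an EKF as a learning algorithm is to treat the network weights as the state to be estimated and each training target as a noisy measurement. A minimal one-parameter sketch, assuming a scalar model h(w) = w * x with measurement noise R (the paper's actual state and Jacobian are vector-valued), looks like:

```python
# Minimal sketch (assumed scalar setup, not the paper's exact formulation):
# an EKF that estimates a single model weight w. The "measurement" is the
# model output h(w) = w * x, and H is its Jacobian dh/dw = x.

def ekf_weight_update(w, P, x, y, R=0.01):
    H = x                       # Jacobian of h(w) = w * x w.r.t. w
    y_pred = w * x              # predicted output
    S = H * P * H + R           # innovation covariance
    K = P * H / S               # Kalman gain
    w_new = w + K * (y - y_pred)
    P_new = (1.0 - K * H) * P   # posterior covariance shrinks each step
    return w_new, P_new

w, P = 0.0, 1.0                 # initial weight guess and its uncertainty
for x, y in [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]:  # data from w* = 2
    w, P = ekf_weight_update(w, P, x, y)
```

Because the Kalman gain adapts per step from the covariance P, no learning rate or regularization hyperparameter is tuned by hand, which is the practical advantage over SGD highlighted in the abstract.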
Improving Language Modelling with Noise-Contrastive Estimation
Neural language models do not scale well when the vocabulary is large.
Noise-contrastive estimation (NCE) is a sampling-based method that allows for
fast learning with large vocabularies. Although NCE has shown promising
performance in neural machine translation, it was considered an unsuccessful approach for language modelling, and a thorough investigation of the hyperparameters of NCE-based neural language models was also missing. In
this paper, we showed that NCE can be a successful approach in neural language
modelling when the hyperparameters of a neural network are tuned appropriately.
We introduced the 'search-then-converge' learning rate schedule for NCE and
designed a heuristic that specifies how to use this schedule. The impact of the
other important hyperparameters, such as the dropout rate and the weight
initialisation range, was also demonstrated. We showed that appropriately tuned NCE-based neural language models outperform state-of-the-art single-model methods on a popular benchmark.
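The 'search-then-converge' idea can be sketched with the classic schedule of that name (Darken and Moody); the paper's exact variant and constants are assumptions here. The learning rate stays near its initial value while t is small relative to a switch-over time tau (the "search" phase), then decays roughly as 1/t (the "converge" phase):

```python
# Sketch of the classic 'search-then-converge' learning rate schedule:
#   lr(t) = lr0 / (1 + t / tau)
# For t << tau the rate is ~lr0 (search); for t >> tau it decays ~lr0*tau/t
# (converge). lr0 and tau here are hypothetical values for illustration.

def search_then_converge(t, lr0=1.0, tau=100.0):
    return lr0 / (1.0 + t / tau)

early = search_then_converge(1)      # close to lr0: still searching
late = search_then_converge(10000)   # ~lr0/101: converging
```

A separate heuristic, as the abstract notes, is still needed to decide when and how to apply such a schedule alongside the other hyperparameters (dropout rate, weight initialisation range).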
Learning Compact Recurrent Neural Networks with Block-Term Tensor Decomposition
Recurrent Neural Networks (RNNs) are powerful sequence modeling tools.
However, when dealing with high-dimensional inputs, the training of RNNs
becomes computationally expensive due to the large number of model parameters.
This hinders RNNs from solving many important computer vision tasks, such as
Action Recognition in Videos and Image Captioning. To overcome this problem, we
propose a compact and flexible structure, namely Block-Term tensor
decomposition, which greatly reduces the parameters of RNNs and improves their
training efficiency. Compared with alternative low-rank approximations, such as
tensor-train RNN (TT-RNN), our method, Block-Term RNN (BT-RNN), is not only
more concise (when using the same rank), but also able to attain a better
approximation to the original RNNs with far fewer parameters. On three
challenging tasks, including Action Recognition in Videos, Image Captioning and
Image Generation, BT-RNN outperforms TT-RNN and the standard RNN in terms of
both prediction accuracy and convergence rate. Specifically, BT-LSTM utilizes
17,388 times fewer parameters than the standard LSTM to achieve an accuracy
improvement of over 15.6% in the Action Recognition task on the UCF11 dataset.
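The source of the parameter savings can be sketched by counting. A dense input-to-hidden matrix mapping prod(I) inputs to prod(J) outputs is reshaped into an order-d tensor and replaced by N Tucker terms, each with a rank-R core and d factor matrices. The counting formula below follows the BT-RNN formulation as described; treat the exact expression and the example shapes as assumptions for illustration:

```python
# Hedged sketch: comparing parameter counts of a dense weight matrix versus
# an assumed Block-Term decomposition with N Tucker terms of rank R.
from functools import reduce

def dense_params(I, J):
    # a dense matrix mapping prod(I) inputs to prod(J) outputs
    prod = lambda xs: reduce(lambda a, b: a * b, xs, 1)
    return prod(I) * prod(J)

def bt_params(I, J, N=1, R=2):
    # N terms, each: one order-d core of size R^d plus, per mode k,
    # a factor of size I_k x J_k x R (formula assumed for illustration)
    d = len(I)
    cores = N * R ** d
    factors = N * sum(i * j * R for i, j in zip(I, J))
    return cores + factors

I, J = [4, 8, 8], [4, 8, 8]   # a 256 -> 256 mapping, factorized per mode
ratio = dense_params(I, J) / bt_params(I, J)   # hundreds-fold reduction
```

Even this small hypothetical example shrinks 65,536 dense parameters to a few hundred, which is the mechanism behind the dramatic reductions the abstract reports for BT-LSTM.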