
    A Long Short-Term Memory Recurrent Neural Network Framework for Network Traffic Matrix Prediction

    Network Traffic Matrix (TM) prediction is the problem of estimating future network traffic from previously observed network traffic data. It is widely used in network planning, resource management and network security. Long Short-Term Memory (LSTM) is a recurrent neural network (RNN) architecture that is well suited to learning from experience to classify, process and predict time series with time lags of unknown size. LSTMs have been shown to model temporal sequences and their long-range dependencies more accurately than conventional RNNs. In this paper, we propose an LSTM RNN framework for predicting short- and long-term Traffic Matrices (TM) in large networks. By validating our framework on real-world data from the GEANT network, we show that our LSTM models converge quickly and give state-of-the-art TM prediction performance for relatively small models. Comment: Submitted for peer review. arXiv admin note: text overlap with arXiv:1402.1128 by other author
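The prediction problem defined in this abstract can be illustrated as a supervised learning setup: each run of past traffic matrices is paired with the matrix that follows it. The sketch below is only one plausible way to prepare such data. The abstract does not describe the paper's preprocessing, so the sliding-window scheme, the function name `make_windows`, and the `window` parameter are all assumptions made for illustration.

```python
def make_windows(series, window):
    """Pair each run of `window` consecutive traffic vectors with the next one.

    `series` is a list of flattened traffic matrices (one list of floats per
    time step). The sliding-window scheme is an assumption, not taken from
    the paper.
    """
    if window < 1 or window >= len(series):
        raise ValueError("window must be at least 1 and shorter than the series")
    return [
        (series[t - window:t], series[t])  # (past observations, value to predict)
        for t in range(window, len(series))
    ]


# Toy example: four time steps of a 2-entry traffic vector (made-up numbers).
toy_series = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
pairs = make_windows(toy_series, window=2)
```

Each pair could then serve as one training example for a sequence model such as an LSTM, with the first element as the input sequence and the second as the target.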

    Online Natural Gradient as a Kalman Filter

    We cast Amari's natural gradient in statistical learning as a specific case of Kalman filtering. Namely, applying an extended Kalman filter to estimate a fixed unknown parameter of a probabilistic model from a series of observations is rigorously equivalent to estimating this parameter via an online stochastic natural gradient descent on the log-likelihood of the observations. In the i.i.d. case, this relation is a consequence of the "information filter" phrasing of the extended Kalman filter. In the recurrent (state space, non-i.i.d.) case, we prove that the joint Kalman filter over states and parameters is a natural gradient on top of real-time recurrent learning (RTRL), a classical algorithm to train recurrent models. This exact algebraic correspondence provides relevant interpretations for natural gradient hyperparameters such as learning rates or initialization and regularization of the Fisher information matrix. Comment: 3rd version: expanded intr
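The equivalence claimed in this abstract can be checked numerically in the simplest setting. The sketch below is not the paper's construction: it assumes a one-parameter linear-Gaussian model y = theta * x + noise, for which the extended Kalman filter reduces to the ordinary Kalman filter with a static state. All function names, the toy data, and the constants are hypothetical. In this special case, a natural gradient step preconditioned by the accumulated Fisher information reproduces the Kalman update exactly, and the Kalman prior variance `p0` plays the role of the inverse of the initial Fisher value.

```python
import random


def kalman_scalar(xs, ys, theta0, p0, sigma2):
    """Kalman filter for a fixed scalar parameter observed through y = theta * x + noise."""
    theta, p = theta0, p0
    for x, y in zip(xs, ys):
        gain = p * x / (sigma2 + x * p * x)     # Kalman gain
        theta = theta + gain * (y - x * theta)  # correct with the prediction error
        p = p - gain * x * p                    # posterior variance
    return theta, p


def natural_gradient_scalar(xs, ys, theta0, fisher0, sigma2):
    """Online natural gradient ascent on the Gaussian log-likelihood."""
    theta, fisher = theta0, fisher0
    for x, y in zip(xs, ys):
        fisher = fisher + x * x / sigma2        # accumulate Fisher information
        grad = x * (y - x * theta) / sigma2     # gradient of the log-likelihood
        theta = theta + grad / fisher           # step preconditioned by inverse Fisher
    return theta, fisher


def demo(n=200, seed=0):
    """Run both estimators on the same synthetic data (assumed toy values)."""
    rng = random.Random(seed)
    true_theta, sigma2, p0 = 1.5, 0.25, 10.0
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [true_theta * x + rng.gauss(0.0, sigma2 ** 0.5) for x in xs]
    kf = kalman_scalar(xs, ys, 0.0, p0, sigma2)
    ng = natural_gradient_scalar(xs, ys, 0.0, 1.0 / p0, sigma2)  # Fisher init = 1 / p0
    return kf, ng
```

Calling `demo()` returns the two (estimate, second-order statistic) pairs. The parameter estimates should agree up to floating-point error, and the Kalman variance should be the reciprocal of the accumulated Fisher information.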