
    Data Assimilation by Artificial Neural Networks for an Atmospheric General Circulation Model: Conventional Observation

    This paper presents an approach for employing artificial neural networks (NN) to emulate an ensemble Kalman filter (EnKF) as a method of data assimilation. The assimilation methods are tested with the Simplified Parameterizations PrimitivE-Equation Dynamics (SPEEDY) model, an atmospheric general circulation model (AGCM), using synthetic observational data simulating the localization of balloon soundings. A supervised NN, the multilayer perceptron (MLP-NN), is applied as the data assimilation scheme: the MLP-NN is trained to emulate the analysis produced by the local ensemble transform Kalman filter (LETKF), so that after training the network acts as the data assimilation function. The NN were trained with data from the first three months of 1982, 1983, and 1984. A hindcasting experiment for the 1985 data assimilation cycle was then performed with the MLP-NN using synthetic observations for January 1985. The numerical results demonstrate the effectiveness of the NN technique for atmospheric data assimilation: the NN analyses are very close to the LETKF analyses, with differences in the monthly average of the absolute temperature analyses of order 0.02. Since the analyses are of similar quality, the major advantage of the MLP-NN is computational performance: in this experiment, one assimilation cycle with the MLP-NN is about 90 times faster in CPU time than one cycle with the LETKF.
    Comment: 17 pages, 16 figures, Monthly Weather Review
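
    A minimal sketch of the emulation idea, under illustrative assumptions (the input/output sizes, network width, random stand-in data, and the use of scikit-learn's MLPRegressor are assumptions for the example, not the paper's configuration): a supervised MLP is fit offline on pairs of background-plus-observation inputs and LETKF analysis targets, and the trained network then stands in for the LETKF analysis step at assimilation time.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Illustrative shapes only: inputs combine background (forecast) values with
# nearby synthetic observations; targets are the corresponding LETKF analyses.
n_samples, n_inputs = 10_000, 8
rng = np.random.default_rng(0)
X = rng.normal(size=(n_samples, n_inputs))   # background + observations (stand-in data)
y = rng.normal(size=n_samples)               # LETKF analysis values (stand-in targets)

# Supervised MLP trained offline to emulate the LETKF analysis step
mlp = MLPRegressor(hidden_layer_sizes=(50, 50), max_iter=500, random_state=0)
mlp.fit(X, y)

# At assimilation time, the trained network replaces the much costlier LETKF call
analysis = mlp.predict(X[:1])
```

    The speed-up reported in the abstract comes from exactly this substitution: a forward pass through a small MLP is far cheaper than running the ensemble transform at each assimilation cycle.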

    Random orthogonal additive filters: a solution to the vanishing/exploding gradient of deep neural networks

    Since the recognition in the early nineties of the vanishing/exploding (V/E) gradient issue plaguing the training of neural networks (NNs), significant effort has been devoted to overcoming this obstacle. However, a clear solution to the V/E issue has remained elusive so far. In this manuscript a new NN architecture is proposed, designed to mathematically prevent the V/E issue from occurring. The pursuit of approximate dynamical isometry, i.e. parameter configurations where the singular values of the input-output Jacobian are tightly distributed around 1, leads to the derivation of an NN architecture that shares common traits with the popular Residual Network model. Instead of skip connections between layers, the idea is to filter the previous activations orthogonally and add them to the nonlinear activations of the next layer, realising a convex combination of the two. Remarkably, the impossibility for the gradient updates to either vanish or explode is demonstrated with analytical bounds that hold even in the infinite-depth case. The effectiveness of this method is shown empirically by training, via backpropagation, an extremely deep multilayer perceptron of 50k layers, and an Elman NN that learns long-term dependencies in inputs from 10k time steps in the past. Compared with other architectures specifically devised to deal with the V/E problem, e.g. LSTMs for recurrent NNs, the proposed model is far simpler yet more effective. Surprisingly, a single-layer vanilla RNN can be enhanced to reach state-of-the-art performance while converging very quickly; for instance, on the psMNIST task it is possible to reach test accuracy over 94% in the first epoch, and over 98% after just 10 epochs.
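
    A minimal numerical sketch of the layer described above, under illustrative assumptions (the width, depth, tanh nonlinearity, fixed mixing coefficient alpha = 0.5, and the helper names random_orthogonal and layer are assumptions for the example, not the paper's exact formulation): each layer outputs a convex combination of an orthogonally filtered copy of the previous activations and a freshly computed nonlinear activation.

```python
import numpy as np

def random_orthogonal(n, rng):
    # Random orthogonal matrix from the QR decomposition of a Gaussian matrix
    q, r = np.linalg.qr(rng.normal(size=(n, n)))
    return q * np.sign(np.diag(r))  # sign fix so the columns are uniformly distributed

def layer(h_prev, W, b, Q, alpha=0.5):
    # Convex combination of the orthogonally filtered previous activations
    # and the new nonlinear activations (no plain identity skip connection)
    return alpha * (Q @ h_prev) + (1.0 - alpha) * np.tanh(W @ h_prev + b)

rng = np.random.default_rng(0)
n, depth = 64, 1000
h = rng.normal(size=n)
for _ in range(depth):
    W = rng.normal(size=(n, n)) / np.sqrt(n)
    Q = random_orthogonal(n, rng)
    h = layer(h, W, np.zeros(n), Q)
print(np.linalg.norm(h))  # check that the forward signal stays well-scaled at depth 1000
```

    The orthogonal filter preserves the norm of the carried signal, which is what the convex combination relies on, per the abstract, to keep the singular values of the input-output Jacobian close to 1 even at extreme depth.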