
    Visualising Basins of Attraction for the Cross-Entropy and the Squared Error Neural Network Loss Functions

    Quantification of the stationary points and the associated basins of attraction of neural network loss surfaces is an important step towards a better understanding of neural network loss surfaces at large. This work proposes a novel method to visualise basins of attraction together with the associated stationary points via gradient-based random sampling. The proposed technique is used to perform an empirical study of the loss surfaces generated by two different error metrics: quadratic loss and entropic loss. The empirical observations confirm the theoretical hypothesis regarding the nature of neural network attraction basins. Entropic loss is shown to exhibit stronger gradients and fewer stationary points than quadratic loss, indicating that entropic loss has a more searchable landscape. Quadratic loss is shown to be more resilient to overfitting than entropic loss. Both losses are shown to exhibit local minima, but the number of local minima is shown to decrease with an increase in dimensionality. Thus, the proposed visualisation technique successfully captures the local minima properties exhibited by the neural network loss surfaces, and can be used for the purpose of fitness landscape analysis of neural networks. Comment: Preprint submitted to the Neural Networks journal.
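
    The gradient-based random sampling described above can be illustrated with a short sketch: launch gradient-descent walks from random starting points and record the loss and gradient norm along each trail, which can then be scatter-plotted to expose basins and stationary points. This is a minimal reconstruction on a toy two-parameter surface; the toy loss and all function names are illustrative assumptions, not the paper's code.

```python
import numpy as np

def numerical_grad(loss, w, eps=1e-6):
    """Central-difference gradient of a scalar loss function."""
    g = np.zeros_like(w)
    for i in range(len(w)):
        d = np.zeros_like(w)
        d[i] = eps
        g[i] = (loss(w + d) - loss(w - d)) / (2 * eps)
    return g

def gradient_walks(loss, dim, n_walks=50, n_steps=200, lr=0.05, seed=0):
    """Gradient descent from random starts; returns (point, loss, |grad|)
    trails, which can be plotted to visualise basins of attraction and
    the stationary points where the gradient norm approaches zero."""
    rng = np.random.default_rng(seed)
    trails = []
    for _ in range(n_walks):
        w = rng.uniform(-3, 3, size=dim)
        trail = []
        for _ in range(n_steps):
            g = numerical_grad(loss, w)
            trail.append((w.copy(), loss(w), np.linalg.norm(g)))
            w = w - lr * g
        trails.append(trail)
    return trails

# Toy non-convex surface with several basins (a stand-in for a network loss).
toy_loss = lambda w: np.sin(3 * w[0]) * np.cos(3 * w[1]) + 0.1 * (w @ w)
trails = gradient_walks(toy_loss, dim=2)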

    The Early Restart Algorithm

    Consider an algorithm whose time to convergence is unknown (because of some random element in the algorithm, such as a random initial weight choice for neural network training). Consider the following strategy. Run the algorithm for a specific time T. If it has not converged by time T, cut the run short and rerun it from the start, repeating the same strategy for every run. This so-called restart mechanism was proposed by Fahlman (1988) in the context of backpropagation training. It is advantageous in problems that are prone to local minima, or when there is large variability in convergence time from run to run, and may lead to a speed-up in such cases. In this article, we theoretically analyze the restart mechanism and obtain conditions on the probability density of the convergence time under which restart improves the expected convergence time. We also derive the optimal restart time. We apply the derived formulas to several cases, including steepest-descent algorithms.
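
    The core quantity here can be sketched directly: each attempt costs min(τ, T) and succeeds with probability P(τ ≤ T), so the expected total time under restarts at T is E[min(τ, T)] / P(τ ≤ T), and the optimal T can be searched numerically. A minimal Monte Carlo sketch, assuming convergence times can be sampled; the lognormal distribution is chosen only for illustration and is not from the paper:

```python
import numpy as np

def expected_time_with_restart(samples, T):
    """Estimate E[total time] when a run is cut off and restarted at time T.
    Failed attempts cost T each; the number of attempts is geometric with
    success probability P(tau <= T), so E[total] = E[min(tau, T)] / P(tau <= T)."""
    samples = np.asarray(samples)
    p_success = np.mean(samples <= T)
    if p_success == 0:
        return np.inf
    return np.mean(np.minimum(samples, T)) / p_success

# Heavy-tailed convergence times: the regime where restarting pays off.
rng = np.random.default_rng(0)
tau = rng.lognormal(mean=1.0, sigma=1.5, size=100_000)
candidates = np.linspace(0.5, 20.0, 100)
best_T = min(candidates, key=lambda T: expected_time_with_restart(tau, T))
print(f"optimal restart time ~ {best_T:.2f}")
print(f"expected time with restart: {expected_time_with_restart(tau, best_T):.2f}")
print(f"expected time without restart: {tau.mean():.2f}")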

    New acceleration technique for the backpropagation algorithm

    Artificial neural networks have been studied for many years in the hope of achieving human-like performance in areas such as pattern recognition, speech synthesis, and higher-level cognitive processes. In the connectionist model there are several interconnected processing elements, called neurons, that have limited processing capability. Even though the rate of information transmitted between these elements is limited, the complex interconnection and the cooperative interaction between them result in vastly increased computing power. Neural network models are specified by an organized network topology of interconnected neurons, and these networks have to be trained in order for them to be used for a specific purpose. Backpropagation is one of the most popular methods of training neural networks, and the speed of convergence of the standard backpropagation algorithm has seen much improvement in the recent past. Herein we present a new technique for accelerating the existing backpropagation algorithm without modifying it. We use a fourth-order interpolation method for the dominant eigenvalues, and use these to change the slope of the activation function, thereby increasing the speed of convergence of the backpropagation algorithm. Our experiments have shown significant improvement in convergence time for problems widely used in benchmarking: a three- to ten-fold decrease in convergence time is achieved, and convergence time decreases as the complexity of the problem increases. The technique adjusts the energy state of the system so as to escape from local minima.
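
    The abstract does not spell out the interpolation step, so the sketch below only illustrates the underlying mechanism: a sigmoid activation with an adjustable slope (gain) that an acceleration scheme could adapt between epochs. The fourth-order eigenvalue interpolation itself is not reproduced, and all names here are illustrative assumptions.

```python
import numpy as np

def sigmoid(x, slope=1.0):
    """Logistic activation with an adjustable slope (gain) parameter.
    A larger slope steepens the transition and scales the gradient."""
    return 1.0 / (1.0 + np.exp(-slope * x))

def sigmoid_grad(x, slope=1.0):
    """Derivative of the sloped sigmoid; the chain rule brings the slope
    out front, which is how the slope influences convergence speed."""
    s = sigmoid(x, slope)
    return slope * s * (1.0 - s)

def sgd_step(w, x, y, lr=0.1, slope=1.0):
    """One backprop step for a single sigmoid unit, with the slope as a
    free hyperparameter that could be retuned between epochs."""
    z = x @ w
    err = sigmoid(z, slope) - y
    return w - lr * err * sigmoid_grad(z, slope) * x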

    Multivariate time series analysis for short-term forecasting of ground level ozone (O3) in Malaysia

    Declining air quality mostly affects the elderly, children, and people with asthma, and also restricts outdoor activities. It is therefore important to provide statistical models to forecast future values of surface-layer ozone (O3) concentration. The objectives of this study are to obtain the best multivariate time series (MTS) model and to develop an online air quality forecasting system for O3 concentration in Malaysia. The MTS models improve on recent statistical models for short-term air quality prediction. Ten air quality monitoring stations situated at four different types of location were selected in this study. The first type is industrial, represented by Pasir Gudang, Perai, and Nilai; the second is urban, represented by Kuala Terengganu, Kota Bharu, and Alor Setar; the third is suburban, located in Banting, Kangar, and Tanjung Malim; and the only background station is at Jerantut. Hourly records from 2010 to 2017 were used to assess the characteristics and behaviour of O3 concentration, while monthly records of O3, particulate matter (PM10), nitrogen dioxide (NO2), sulphur dioxide (SO2), carbon monoxide (CO), temperature (T), wind speed (WS), and relative humidity (RH) were used to determine the best MTS models. Three MTS methods, namely vector autoregressive (VAR), vector moving average (VMA), and vector autoregressive moving average (VARMA), have been applied in this study. Based on the performance error, the most appropriate MTS model for Pasir Gudang, Kota Bharu, and Kangar is VAR(1); for Kuala Terengganu and Alor Setar, VAR(2); for Perai and Nilai, VAR(3); for Tanjung Malim, VAR(4); and for Banting, VAR(5). Only Jerantut obtained VMA(2) as the best model. The lowest root mean square error (RMSE) and normalized absolute error are 0.0053 and <0.0001, achieved by the MTS models in Perai and Kuala Terengganu, respectively, while the lowest mean absolute error (MAE) is 0.0013, in Banting and Jerantut. The online air quality forecasting system for O3 was successfully developed based on the best MTS model for each monitoring station.
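
    A minimal sketch of fitting one of the reported models, VAR(1) for Pasir Gudang, using statsmodels; the CSV file name and column names are assumptions based on the variables listed in the abstract, not the study's actual data files.

```python
import pandas as pd
from statsmodels.tsa.api import VAR

# Hypothetical monthly dataset with the pollutant and weather variables
# named in the abstract; file and column names are illustrative only.
df = pd.read_csv("pasir_gudang_monthly.csv", parse_dates=["date"], index_col="date")
df = df[["O3", "PM10", "NO2", "SO2", "CO", "T", "WS", "RH"]]

model = VAR(df)
results = model.fit(1)  # VAR(1), the order reported for Pasir Gudang
print(results.summary())

# Forecast the next 3 months for all series (O3 is the series of interest).
last_obs = df.values[-results.k_ar:]
print(results.forecast(last_obs, steps=3))
```

    The choice between VAR, VMA, and VARMA, and between lag orders, would then be made by comparing forecast errors (RMSE, MAE) across candidate models, as the study does per station.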

    Evolving stochastic learning algorithm based on Tsallis entropic index

    In this paper, inspired by our previous algorithm based on the theory of Tsallis statistical mechanics, we develop a new evolving stochastic learning algorithm for neural networks. The new algorithm combines deterministic and stochastic search steps by employing a different adaptive stepsize for each network weight, and applies a form of noise that is characterized by the nonextensive entropic index q, regulated by a weight decay term. The behaviour of the learning algorithm can be made more stochastic or deterministic depending on the trade-off between the temperature T and the q values. This is achieved by introducing a formula that defines a time-dependent relationship between these two important learning parameters. Our experimental study verifies that there are indeed improvements in the convergence speed of this new evolving stochastic learning algorithm, which makes learning faster than with the original Hybrid Learning Scheme (HLS). In addition, experiments are conducted to explore the influence of the entropic index q and temperature T on the convergence speed and stability of the proposed method.
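
    The abstract names the ingredients (per-weight stepsizes, q-characterized noise, weight decay, a time-dependent T-q relationship) but not the exact update rule, so the following is a heavily hedged sketch: the annealing schedule, the T-q formula, and the Gaussian noise standing in for the q-dependent noise are all assumptions, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)

def evolving_stochastic_step(w, grad, step, epoch, T0=1.0, q0=1.5, decay=0.01):
    """One hybrid update: a deterministic gradient step with per-weight
    stepsizes, plus temperature-scaled noise and weight decay. The schedule
    linking T and q over time is assumed here; the abstract only states
    that such a time-dependent relationship exists."""
    T = T0 / (1.0 + epoch)               # assumed annealing of the temperature
    q = 1.0 + (q0 - 1.0) * T / T0        # assumed q -> 1 as T -> 0 (more deterministic)
    noise = np.sqrt(T) * rng.standard_normal(w.shape)  # Gaussian stand-in for q-noise
    return w - step * grad + noise - decay * w
```

    As T falls, the noise term vanishes and the update degenerates to a plain per-weight gradient step, which is the deterministic end of the trade-off described above.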

    Comparative performance of some popular ANN algorithms on benchmark and function approximation problems

    We report an inter-comparison of some popular algorithms within the artificial neural network domain (viz., local search algorithms, global search algorithms, higher-order algorithms, and hybrid algorithms) by applying them to standard benchmarking problems such as the IRIS data, XOR/N-bit parity, and the Two Spiral problem. Apart from giving a brief description of these algorithms, the results obtained for the above benchmark problems are presented in the paper. The results suggest that while the Levenberg-Marquardt algorithm yields the lowest RMS error for the N-bit parity and Two Spiral problems, the Higher Order Neurons algorithm gives the best results for the IRIS data problem. The best results for the XOR problem are obtained with the Neuro Fuzzy algorithm. The above algorithms were also applied to several regression problems, such as cos(x) and a few special functions like the Gamma function, the complementary error function, and the upper-tail cumulative χ²-distribution function. The results of these regression problems indicate that, among all the ANN algorithms used in the present study, the Levenberg-Marquardt algorithm yields the best results. Keeping in view the highly non-linear behaviour and the wide dynamic range of these functions, it is suggested that they can also be considered standard benchmark problems for function approximation using artificial neural networks. Comment: 18 pages, 5 figures. Accepted in Pramana - Journal of Physics.
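
    One of the suggested regression benchmarks, the complementary error function, is easy to set up; the sketch below trains a small MLP on it with scipy and scikit-learn. The network size and solver are arbitrary stand-ins, not the algorithms compared in the paper (Levenberg-Marquardt, for instance, is not among scikit-learn's solvers).

```python
import numpy as np
from scipy.special import erfc
from sklearn.neural_network import MLPRegressor

# Build a regression benchmark from the complementary error function.
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=(2000, 1))
y = erfc(x).ravel()

net = MLPRegressor(hidden_layer_sizes=(20, 20), max_iter=5000, random_state=0)
net.fit(x, y)

# Evaluate RMS error on a dense test grid, the metric used in the comparison.
x_test = np.linspace(-3, 3, 200).reshape(-1, 1)
rmse = np.sqrt(np.mean((net.predict(x_test) - erfc(x_test).ravel()) ** 2))
print(f"RMSE on the erfc benchmark: {rmse:.4f}")
```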

    Improved sign-based learning algorithm derived by the composite nonlinear Jacobi process

    In this paper a globally convergent first-order training algorithm is proposed that uses sign-based information of the batch error measure in the framework of the nonlinear Jacobi process. This approach allows us to equip the recently proposed Jacobi–Rprop method with the global convergence property, i.e. convergence to a local minimizer from any initial starting point. We also propose a strategy that ensures the search direction of the globally convergent Jacobi–Rprop is a descent one. The behaviour of the algorithm is empirically investigated on eight benchmark problems. Simulation results verify that there are indeed improvements in the convergence success of the algorithm.
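
    For readers unfamiliar with sign-based updates, a generic Rprop step looks roughly as follows. This is the standard Riedmiller-Braun scheme (in its iRprop- form), not the paper's Jacobi–Rprop variant or its globally convergent modification.

```python
import numpy as np

def rprop_step(w, grad, prev_grad, delta, eta_plus=1.2, eta_minus=0.5,
               delta_min=1e-6, delta_max=50.0):
    """One sign-based (Rprop-style) update: only the sign of each partial
    derivative is used, and each weight carries its own stepsize delta that
    grows while the gradient sign is stable and shrinks when it flips."""
    sign_change = grad * prev_grad
    delta = np.where(sign_change > 0, np.minimum(delta * eta_plus, delta_max), delta)
    delta = np.where(sign_change < 0, np.maximum(delta * eta_minus, delta_min), delta)
    grad = np.where(sign_change < 0, 0.0, grad)  # skip the update after a sign flip
    w = w - np.sign(grad) * delta
    return w, grad, delta  # returned grad serves as prev_grad next iteration
```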

    Supervised learning with hybrid global optimisation methods

    …