94 research outputs found

    Chaotic Time Series Forecasting Using Higher Order Neural Networks

    This study presents a novel application and comparison of higher order neural networks (HONNs) for forecasting benchmark chaotic time series. Two HONN models were implemented: the functional link neural network (FLNN) and the pi-sigma neural network (PSNN). These models were tested on two benchmark time series: the monthly smoothed sunspot numbers and the Mackey-Glass time-delay differential equation time series. The forecasting performance of the HONNs is compared against that of models previously used in the literature, such as fuzzy and neural network models. Simulation results showed that FLNN and PSNN offer good performance compared to many previously used hybrid models.
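
For readers unfamiliar with the Mackey-Glass benchmark named in this abstract, the following is a minimal sketch of how such a series is commonly generated and turned into forecasting pairs. The parameter values (tau=17, beta=0.2, gamma=0.1, exponent 10), the unit-step Euler discretization, the constant initial history, and the helper names `mackey_glass` and `lagged_pairs` are assumptions made for illustration; the abstract does not state the settings the study used.

```python
def mackey_glass(n_steps, tau=17, beta=0.2, gamma=0.1, power=10, x0=1.2):
    """Unit-step Euler discretization of the Mackey-Glass delay equation
    dx/dt = beta * x(t - tau) / (1 + x(t - tau)**power) - gamma * x(t)."""
    history = [x0] * (tau + 1)  # constant initial history (an assumption)
    for _ in range(n_steps):
        delayed = history[-(tau + 1)]  # x(t - tau)
        current = history[-1]          # x(t)
        step = beta * delayed / (1.0 + delayed ** power) - gamma * current
        history.append(current + step)
    return history[tau + 1:]           # drop the artificial initial history


def lagged_pairs(series, n_lags, horizon=1):
    """Build (input window, target) pairs for one-step or multi-step forecasting."""
    pairs = []
    for i in range(len(series) - n_lags - horizon + 1):
        window = series[i:i + n_lags]
        target = series[i + n_lags + horizon - 1]
        pairs.append((window, target))
    return pairs


series = mackey_glass(500)
pairs = lagged_pairs(series, n_lags=4)
```

Each pair holds a window of past values and the value to be predicted, which is the form a forecasting model such as an FLNN or PSNN would be trained on.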

    A Comprehensive Survey on Pi-Sigma Neural Network for Time Series Prediction

    Time series prediction has received much attention because of its effect on a vast range of real-life applications. This paper presents a survey of time series applications using the Higher Order Neural Network (HONN) model. The basic motivation behind using HONNs is their ability to expand the input space, which makes them more efficient at solving complex problems and gives them strong learning ability in time series forecasting. The Pi-Sigma Neural Network (PSNN) indirectly includes the capabilities of higher order networks by using product cells as the output units, with fewer weights. The goal of this research is to make the reader aware of PSNN for time series prediction and to highlight some benefits and challenges of using PSNN. Possible fields of PSNN application are presented in comparison with existing methods, and future directions are explored that take advantage of the properties of error feedback and recurrent networks.
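
The "product cells as the output units" idea can be illustrated with a small forward pass. This is only a sketch of the usual pi-sigma structure (a layer of linear summing units whose outputs are multiplied together and passed through a sigmoid); the function name `psnn_forward`, the sigmoid output, and the example weights are assumptions, not details taken from the survey.

```python
import math


def psnn_forward(x, weights, biases):
    """Pi-sigma forward pass: sigmoid of the product of K linear sums.

    weights has one row per summing unit, so a network of order K on
    n inputs carries K * (n + 1) trainable parameters.
    """
    product = 1.0
    for w_row, b in zip(weights, biases):
        # Sigma stage: one weighted sum per summing unit.
        product *= sum(w * xi for w, xi in zip(w_row, x)) + b
    # Pi stage result passed through a sigmoid output activation.
    return 1.0 / (1.0 + math.exp(-product))


x = [0.5, -1.0, 2.0]                            # e.g. three lagged values
weights = [[0.1, 0.2, 0.3], [0.4, -0.5, 0.6]]   # order K = 2
biases = [0.0, 0.1]
y = psnn_forward(x, weights, biases)
n_params = sum(len(row) for row in weights) + len(biases)
```

Multiplying the K sums yields an order-K polynomial in the inputs while storing only K * (n + 1) parameters, which is the weight saving the abstract refers to.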

    Nonconvex Stochastic Bregman Proximal Gradient Method with Application to Deep Learning

    The widely used stochastic gradient methods for minimizing nonconvex composite objective functions require the Lipschitz smoothness of the differentiable part. But this requirement does not hold for problem classes including quadratic inverse problems and training neural networks. To address this issue, we investigate a family of stochastic Bregman proximal gradient (SBPG) methods, which only require smooth adaptivity of the differentiable part. SBPG replaces the upper quadratic approximation used in SGD with the Bregman proximity measure, resulting in a better approximation model that captures the non-Lipschitz gradients of the nonconvex objective. We formulate the vanilla SBPG and establish its convergence properties in the nonconvex setting without finite-sum structure. Experimental results on quadratic inverse problems demonstrate the robustness of SBPG. Moreover, we propose a momentum-based version of SBPG (MSBPG) and prove it has improved convergence properties. We apply MSBPG to the training of deep neural networks with a polynomial kernel function, which ensures the smooth adaptivity of the loss function. Experimental results on representative benchmarks demonstrate the effectiveness and robustness of MSBPG in training neural networks. Since the additional computation cost of MSBPG compared with SGD is negligible in large-scale optimization, MSBPG can potentially be employed as a universal open-source optimizer in the future. Comment: 37 pages
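
To make the replacement of the quadratic model by a Bregman proximity measure concrete, here is a hedged sketch of a single Bregman gradient step with no nonsmooth term. It assumes the quartic kernel h(x) = 0.5*||x||^2 + 0.25*||x||^4, a common choice for quadratic inverse problems; the abstract only says "polynomial kernel function", so the kernel, the function names, and the bisection solver are all illustrative assumptions rather than the authors' implementation.

```python
def grad_h(x):
    """Gradient of the assumed kernel h(x) = 0.5*||x||^2 + 0.25*||x||^4."""
    sq_norm = sum(v * v for v in x)
    return [(1.0 + sq_norm) * v for v in x]


def bregman_step(x, grad, step):
    """Return x_new solving grad_h(x_new) = grad_h(x) - step * grad."""
    p = [gh - step * g for gh, g in zip(grad_h(x), grad)]
    p_sq = sum(v * v for v in p)
    # For this kernel x_new = t * p, where t in (0, 1] is the unique
    # root of p_sq * t**3 + t - 1 = 0 (the cubic is increasing in t).
    lo, hi = 0.0, 1.0
    for _ in range(100):  # plain bisection
        mid = 0.5 * (lo + hi)
        if p_sq * mid ** 3 + mid - 1.0 > 0.0:
            hi = mid
        else:
            lo = mid
    t = 0.5 * (lo + hi)
    return [t * v for v in p]


x = [1.0, -2.0]
g = [0.5, 0.25]  # stand-in for a stochastic gradient estimate
x_new = bregman_step(x, g, step=0.1)
```

With the Euclidean kernel h(x) = 0.5*||x||^2 the same update reduces to the ordinary step x - step * g; the extra work here is one scalar root-find per step.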

    Recent Advances in Randomized Methods for Big Data Optimization

    In this thesis, we discuss and develop randomized algorithms for big data problems. In particular, we study finite-sum optimization with newly emerged variance-reduction optimization methods (Chapter 2), explore the efficiency of second-order information applied to both convex and non-convex finite-sum objectives (Chapter 3), and employ a fast first-order method in power system problems (Chapter 4). In Chapter 2, we propose two variance-reduced gradient algorithms: mS2GD and SARAH. mS2GD incorporates a mini-batching scheme to improve the theoretical complexity and practical performance of SVRG/S2GD, aiming to minimize a strongly convex function represented as the sum of an average of a large number of smooth convex functions and a simple non-smooth convex regularizer. SARAH, short for StochAstic Recursive grAdient algoritHm, uses a stochastic recursive gradient and targets minimizing the average of a large number of smooth functions in both convex and non-convex cases. Both methods fall into the category of variance-reduction optimization and obtain a total complexity of O((n+κ)log(1/ε)) to achieve an ε-accurate solution for strongly convex objectives, while SARAH also maintains sub-linear convergence for non-convex problems. Meanwhile, SARAH has a practical variant, SARAH+, due to its linear convergence of the expected stochastic gradients in inner loops. In Chapter 3, we show that randomized batches can be applied with second-order information to improve convergence in both theory and practice, using a framework of L-BFGS as a novel approach to finite-sum optimization problems. We provide theoretical analyses for both convex and non-convex objectives. Meanwhile, we propose LBFGS-F as a variant in which the Fisher information matrix is used instead of Hessian information, and prove it applicable to a distributed environment within the popular applications of least-squares and cross-entropy losses. In Chapter 4, we develop fast randomized algorithms for solving polynomial optimization problems arising in alternating-current optimal power flow (ACOPF) in the power systems field. Traditional research on power system problems focuses on solvers using second-order methods, while no randomized algorithms have been developed. First, we propose a coordinate-descent algorithm as an online solver for time-varying optimization problems in power systems. We bound from above the difference between the current approximate optimal cost generated by our algorithm and the optimal cost for a relaxation using the most recent data by a function of the properties of the instance and the rate of change of the instance over time. Second, we focus on a steady-state problem in power systems and study means of switching from solving a convex relaxation to Newton's method working on a non-convex (augmented) Lagrangian of the problem.
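
The "stochastic recursive gradient" used by SARAH can be sketched on a toy problem. The example below applies the recursion v_t = grad_i(w_t) - grad_i(w_{t-1}) + v_{t-1} to a one-dimensional least-squares objective. The objective, the step size, the loop lengths, and the choice to keep the last inner iterate between outer loops are simplifying assumptions made here, not settings taken from the thesis.

```python
import random


def sarah_least_squares(a, b, step=0.1, epochs=40, inner_steps=8, seed=0):
    """SARAH sketch for min_w (1/n) * sum_i 0.5 * (a[i]*w - b[i])**2, scalar w."""
    rng = random.Random(seed)
    n = len(a)

    def grad_i(w, i):
        # Gradient of the i-th component function.
        return a[i] * (a[i] * w - b[i])

    w = 0.0
    for _ in range(epochs):
        # Outer loop: restart the estimator from a full gradient.
        v = sum(grad_i(w, i) for i in range(n)) / n
        w_prev, w = w, w - step * v
        for _ in range(inner_steps - 1):
            i = rng.randrange(n)
            # Recursive update: correct the previous estimate using the
            # same sample evaluated at the current and previous iterates.
            v = grad_i(w, i) - grad_i(w_prev, i) + v
            w_prev, w = w, w - step * v
    return w


a = [1.0, 2.0, 0.5, 1.5]
b = [3.0 * ai for ai in a]   # consistent data, so the minimizer is w = 3
w_hat = sarah_least_squares(a, b)
```

Unlike SVRG, which always corrects against a fixed snapshot gradient, the estimator here is updated from its own previous value, which is the defining feature of the recursive scheme.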

    Deep Machine Learning with Spatio-Temporal Inference

    Deep Machine Learning (DML) refers to methods which utilize hierarchies of more than one or two layers of computational elements to achieve learning. DML may draw upon biomimetic models, or may be simply biologically-inspired. Regardless, these architectures seek to employ hierarchical processing as a means of mimicking the ability of the human brain to process a myriad of sensory data and make meaningful decisions based on this data. In this dissertation we present a novel DML architecture which is biologically-inspired in that (1) all processing is performed hierarchically; (2) all processing units are identical; and (3) processing captures both spatial and temporal dependencies in the observations to organize and extract features suitable for supervised learning. We call this architecture the Deep Spatio-Temporal Inference Network (DeSTIN). In this framework, patterns observed in pixel data at the lowest layer of the hierarchy are organized and fit to generalizations using decomposition algorithms. Subsequent spatial layers draw upon previous layers, their own temporal observations and beliefs, and the observations and beliefs of parent nodes to extract features suitable for supervised learning using standard classifiers such as feedforward neural networks. Hence, DeSTIN is viewed as an unsupervised feature extraction scheme in the sense that rather than relying on human engineering to determine features for a particular problem, DeSTIN naturally constructs features of interest by representing salient regularities in the patterns observed. A detailed discussion and analysis of the DeSTIN framework is provided, including a focus on its key components of generalization through online clustering and temporal inference. We present a variety of implementation details, including static and dynamic learning formulations, and function approximation methods. Results on standardized datasets of handwritten digits as well as face and optic nerve detection are presented, illustrating the efficacy of the proposed approach.

    International Conference on Continuous Optimization (ICCOPT) 2019 Conference Book

    The Sixth International Conference on Continuous Optimization took place on the campus of the Technical University of Berlin, August 3-8, 2019. The ICCOPT is a flagship conference of the Mathematical Optimization Society (MOS), organized every three years. ICCOPT 2019 was hosted by the Weierstrass Institute for Applied Analysis and Stochastics (WIAS) Berlin. It included a Summer School and a Conference with a series of plenary and semi-plenary talks, organized and contributed sessions, and poster sessions. This book comprises the full conference program. It contains, in particular, the scientific program both in survey style and in full detail, and information on the social program, the venue, special meetings, and more.

    Fast algorithms for smooth and monotone covariance matrix estimation

    In this thesis the problem of interest is, within the setting of financial risk management, covariance matrix estimation from a limited number of high-dimensional independent identically distributed (i.i.d.) multivariate samples when the random variables of interest have a natural spatial indexing along a low-dimensional manifold, e.g., along a line. The sample covariance matrix estimate is fraught with peril in this context. A variety of approaches to improve the covariance estimates have been developed by exploiting knowledge of structure in the data; these, however, in general impose very strict structure. We instead exploit another formulation which assumes that the covariance matrix is smooth and monotone with respect to the spatial indexing. Originally the formulation is derived from the estimation problem within a convex-optimization framework, and the resulting semidefinite-programming problem (SDP) is solved by an interior-point method (IPM). However, solving an SDP via an IPM can become unduly computationally expensive for large covariance matrices. Motivated by this observation, this thesis develops highly efficient first-order solvers for smooth and monotone covariance matrix estimation. We propose two types of solvers for covariance matrix estimation: the first based on projected gradients, and the second based on recently developed optimal first-order methods. Given such numerical algorithms, we present a comprehensive experimental analysis. We first demonstrate the benefits of imposing smoothness and monotonicity constraints in covariance matrix estimation in a number of scenarios involving limited, missing, and asynchronous data. We then demonstrate the potential computational benefits offered by first-order methods through a detailed comparison to the solution of the problem via IPMs.