209 research outputs found

    Riemann Hypothesis: a GGC factorisation

    A GGC (Generalized Gamma Convolution) representation of Riemann's Xi-function is constructed
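
    For background, a GGC is a positive random variable whose Laplace transform takes the classical Thorin form below; this is the textbook definition (Bondesson), not the paper's specific construction for the Xi-function:

    ```latex
    \mathbb{E}\!\left[e^{-sX}\right]
      = \exp\!\Big(-a s - \int_0^\infty \log\!\big(1 + \tfrac{s}{t}\big)\, U(\mathrm{d}t)\Big),
      \qquad a \ge 0,
    ```

    where $U$ is the Thorin measure. The paper constructs a representation of this kind for Riemann's Xi-function.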

    On Hilbert's 8th Problem

    A Hadamard factorization of the Riemann Xi-function is constructed to characterize the zeros of the zeta function
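
    For reference, the Hadamard factorization in question is the classical product formula for the Xi-function (standard in the literature, e.g. Titchmarsh):

    ```latex
    \xi(s) = \tfrac{1}{2}\, s(s-1)\, \pi^{-s/2}\, \Gamma\!\big(\tfrac{s}{2}\big)\, \zeta(s)
           = \xi(0) \prod_{\rho} \Big(1 - \frac{s}{\rho}\Big),
    ```

    where the product runs over the nontrivial zeros $\rho$ of $\zeta$, grouped in pairs $\rho$, $1-\rho$ so that it converges.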

    van Dantzig Pairs, Wald Couples and Hadamard Factorisation

    Some consequences of a duality between the Hadamard-Weierstrass factorisation of an entire function and van Dantzig-Wald couples of random variables are explored. We demonstrate the methodology on particular functions, including the Riemann zeta and xi-functions, Ramanujan's tau function, L-functions, and the Gamma and hyperbolic functions
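
    As background (a standard definition, not this paper's notation): a van Dantzig pair is a pair of random variables $(X, Y)$ whose characteristic functions are reciprocal under rotation to the imaginary axis,

    ```latex
    \varphi_X(t)\,\varphi_Y(it) = 1, \qquad t \in \mathbb{R}.
    ```

    A commonly cited example pairs the hyperbolic secant distribution, $\varphi_X(t) = \operatorname{sech} t$, with the Rademacher distribution, $\varphi_Y(t) = \cos t$, since $\operatorname{sech}(t)\cosh(t) = 1$; the standard normal, with $\varphi(t) = e^{-t^2/2}$, pairs with itself.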

    Posterior Concentration for Sparse Deep Learning

    Spike-and-Slab Deep Learning (SS-DL) is a fully Bayesian alternative to Dropout for improving generalizability of deep ReLU networks. This new type of regularization enables provable recovery of smooth input-output maps with unknown levels of smoothness. Indeed, we show that the posterior distribution concentrates at the near-minimax rate for α-Hölder smooth maps, performing as well as if we knew the smoothness level α ahead of time. Our result sheds light on architecture design for deep neural networks, namely the choice of depth, width and sparsity level. These network attributes typically depend on the unknown smoothness in order to be optimal. We obviate this constraint with the fully Bayes construction. As an aside, we show that SS-DL does not overfit in the sense that the posterior concentrates on smaller networks with fewer (up to the optimal number of) nodes and links. Our results provide new theoretical justifications for deep ReLU networks from a Bayesian point of view
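
    As a rough illustration of the prior at the heart of SS-DL (a minimal sketch with arbitrary hyper-parameters, not the paper's calibrated construction):

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def spike_and_slab_weights(shape, theta=0.2, slab_sd=1.0):
        # Each weight is exactly 0 with prob 1-theta (the spike) or drawn
        # from N(0, slab_sd^2) (the slab). theta and slab_sd are arbitrary
        # illustrative values, not calibrated hyper-parameters.
        gamma = rng.random(shape) < theta            # inclusion indicators
        return np.where(gamma, rng.normal(0.0, slab_sd, shape), 0.0)

    def relu(x):
        return np.maximum(x, 0.0)

    # A deep ReLU network sampled from this prior is sparse: most weights
    # are exactly zero, so the prior concentrates on small sub-networks.
    W1 = spike_and_slab_weights((16, 4))
    W2 = spike_and_slab_weights((1, 16))
    x = rng.normal(size=4)
    print("output:", W2 @ relu(W1 @ x))
    print("active weights:", int((W1 != 0).sum() + (W2 != 0).sum()))
    ```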

    Deep Learning: Computational Aspects

    In this article we review computational aspects of Deep Learning (DL). Deep learning uses network architectures consisting of hierarchical layers of latent variables to construct predictors for high-dimensional input-output models. Training a deep learning architecture is computationally intensive, and efficient linear algebra libraries are key to training and inference. Stochastic gradient descent (SGD) optimization and batch sampling are used to learn from massive data sets
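
    A minimal sketch of the minibatch SGD scheme described above, on a toy least-squares problem (all sizes and step sizes are illustrative):

    ```python
    import numpy as np

    rng = np.random.default_rng(1)

    # Synthetic regression data standing in for a massive data set.
    n, d = 10_000, 20
    X = rng.normal(size=(n, d))
    w_true = rng.normal(size=d)
    y = X @ w_true + 0.1 * rng.normal(size=n)

    # Plain minibatch SGD on squared loss: sample a batch, take a gradient
    # step. Learning rate, batch size and step count are illustrative.
    w = np.zeros(d)
    lr, batch = 0.01, 64
    for step in range(2_000):
        idx = rng.integers(0, n, size=batch)       # batch sampling
        grad = X[idx].T @ (X[idx] @ w - y[idx]) / batch
        w -= lr * grad

    print(f"parameter error: {np.linalg.norm(w - w_true):.4f}")
    ```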

    Deep Learning for Short-Term Traffic Flow Prediction

    We develop a deep learning model to predict traffic flows. The main contribution is the development of an architecture that combines a linear model fitted using ℓ1 regularization and a sequence of tanh layers. The challenge in predicting traffic flows is the sharp nonlinearities due to transitions between free flow, breakdown, recovery and congestion. We show that deep learning architectures can capture these nonlinear spatio-temporal effects. The first layer identifies spatio-temporal relations among predictors and the other layers model nonlinear relations. We illustrate our methodology on road sensor data from Interstate I-55 and predict traffic flows during two special events: a Chicago Bears football game and an extreme snowstorm. Both cases exhibit sharp traffic flow regime changes that occur very suddenly, and we show how deep learning provides precise short-term traffic flow predictions
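
    A minimal sketch of this kind of architecture in PyTorch; layer sizes, the lag structure, and the ℓ1 weight are illustrative assumptions, not the paper's configuration:

    ```python
    import torch
    import torch.nn as nn

    class TrafficNet(nn.Module):
        """Sparse linear term plus a stack of tanh layers (sketch only)."""
        def __init__(self, n_sensors=40, hidden=64, lags=8):
            super().__init__()
            d = n_sensors * lags                   # lagged readings, flattened
            self.linear = nn.Linear(d, n_sensors)  # sparse spatio-temporal term
            self.deep = nn.Sequential(
                nn.Linear(d, hidden), nn.Tanh(),
                nn.Linear(hidden, hidden), nn.Tanh(),
                nn.Linear(hidden, n_sensors),
            )

        def forward(self, x):
            return self.linear(x) + self.deep(x)

        def l1_penalty(self):
            # l1 regularization on the linear term encourages sparsity.
            return self.linear.weight.abs().sum()

    model = TrafficNet()
    x = torch.randn(32, 40 * 8)                    # batch of lagged flows
    target = torch.randn(32, 40)                   # next-period flows
    loss = nn.functional.mse_loss(model(x), target) + 1e-3 * model.l1_penalty()
    loss.backward()
    ```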

    Bayesian Particle Tracking of Traffic Flows

    We develop a Bayesian particle filter for tracking traffic flows that is capable of capturing the nonlinearities and discontinuities present in flow dynamics. Our model includes a hidden state variable that captures sudden regime shifts between traffic free flow, breakdown and recovery. We develop an efficient particle learning algorithm for real-time on-line inference of states and parameters. This requires a two-step approach: first, resampling the current particles with a mixture predictive distribution, and second, propagating states using the conditional posterior distribution. Particle learning of parameters follows from updating recursions for conditional sufficient statistics. To illustrate our methodology, we analyze measurements of daily traffic flow from the Illinois Interstate I-55 highway system. We demonstrate how our filter can be used to infer changes in the traffic flow regime on a highway road segment from freeway single-loop detector measurements. Finally, we conclude with directions for future research
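
    A toy version of the resample-then-propagate step, on a linear-Gaussian state space model with known parameters; this sketches the two-step structure only and omits the particle-learning recursions for parameter sufficient statistics:

    ```python
    import numpy as np

    rng = np.random.default_rng(2)

    # Simulate a toy state space model: x_t = phi*x_{t-1} + N(0,q),
    # y_t = x_t + N(0,r). Parameters are assumed known here.
    T, N = 100, 500
    phi, q, r = 0.9, 0.5, 1.0
    x_true = np.zeros(T)
    y = np.zeros(T)
    for t in range(1, T):
        x_true[t] = phi * x_true[t-1] + np.sqrt(q) * rng.normal()
        y[t] = x_true[t] + np.sqrt(r) * rng.normal()

    particles = rng.normal(size=N)
    est = np.zeros(T)
    for t in range(1, T):
        # 1. Resample with weights from the predictive of y[t]:
        #    y_t | x_{t-1} ~ N(phi*x_{t-1}, q + r).
        logw = -0.5 * (y[t] - phi * particles) ** 2 / (q + r)
        w = np.exp(logw - logw.max())
        w /= w.sum()
        particles = particles[rng.choice(N, size=N, p=w)]
        # 2. Propagate from the conditional posterior p(x_t | x_{t-1}, y_t).
        var = 1.0 / (1.0 / q + 1.0 / r)
        mean = var * (phi * particles / q + y[t] / r)
        particles = mean + np.sqrt(var) * rng.normal(size=N)
        est[t] = particles.mean()

    print(f"filter RMSE: {np.sqrt(np.mean((est - x_true) ** 2)):.3f}")
    ```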

    Deep Learning: A Bayesian Perspective

    Deep learning is a form of machine learning for nonlinear, high-dimensional pattern matching and prediction. By taking a Bayesian probabilistic perspective, we provide a number of insights into more efficient algorithms for optimisation and hyper-parameter tuning. Traditional high-dimensional data reduction techniques, such as principal component analysis (PCA), partial least squares (PLS), reduced rank regression (RRR) and projection pursuit regression (PPR), are all shown to be shallow learners. Their deep learning counterparts exploit multiple deep layers of data reduction, which provide predictive performance gains. Stochastic gradient descent (SGD) training optimisation and Dropout (DO) regularization provide estimation and variable selection. Bayesian regularization is central to finding weights and connections in networks to optimize the predictive bias-variance trade-off. To illustrate our methodology, we provide an analysis of international bookings on Airbnb. Finally, we conclude with directions for future research
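
    To make the "shallow learner" claim concrete: PCA amounts to a single linear encode/decode layer, whereas a deep learner stacks several such reductions with nonlinearities in between. A minimal sketch on synthetic data (illustrative only):

    ```python
    import numpy as np

    rng = np.random.default_rng(3)

    # Centered synthetic data with correlated columns.
    X = rng.normal(size=(500, 10)) @ rng.normal(size=(10, 10))
    X -= X.mean(axis=0)

    # PCA as one linear layer in and one linear layer out: no hidden
    # nonlinearity, hence a "shallow" data reduction.
    k = 3
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    Z = X @ Vt[:k].T               # encode: scores from one linear map
    X_hat = Z @ Vt[:k]             # decode: one linear map back

    print(f"rank-{k} reconstruction error: {np.linalg.norm(X - X_hat):.2f}")
    ```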

    Regularizing Bayesian Predictive Regressions

    We show that regularizing Bayesian predictive regressions provides a framework for prior sensitivity analysis. We develop a procedure that jointly regularizes expectations and variance-covariance matrices using a pair of shrinkage priors. Our methodology applies directly to vector autoregressions (VAR) and seemingly unrelated regressions (SUR). The regularization path provides a prior sensitivity diagnostic. By exploiting a duality between regularization penalties and predictive prior distributions, we reinterpret two classic Bayesian macro-finance analyses: equity premium predictability and forecasting macroeconomic growth rates. We find that there exist plausible prior specifications for predictability in excess S&P 500 index returns using book-to-market ratios, CAY (consumption, wealth, income ratio), and T-bill rates. We evaluate the forecasts using a market-timing strategy and show that the optimally regularized solution outperforms a buy-and-hold approach. A second empirical application involves forecasting industrial production, inflation, and consumption growth rates, and demonstrates the feasibility of our approach
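
    A toy illustration of the penalty-prior duality the abstract exploits: a ridge penalty λ corresponds to a Gaussian prior with precision λ (up to the noise scale), so sweeping λ traces a regularization path that doubles as a prior-sensitivity diagnostic. Synthetic data only, not the macro-finance series analyzed in the paper:

    ```python
    import numpy as np

    rng = np.random.default_rng(4)

    n, d = 200, 5
    X = rng.normal(size=(n, d))
    beta_true = np.array([1.5, 0.0, -0.7, 0.0, 0.3])
    y = X @ beta_true + rng.normal(size=n)

    # Posterior mode under beta ~ N(0, I/lam) equals the ridge estimate;
    # the path over lam shows how conclusions move with the prior.
    for lam in [0.01, 0.1, 1.0, 10.0, 100.0]:
        beta = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
        print(f"lam={lam:7.2f}  beta={np.round(beta, 2)}")
    ```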

    Bayesian ℓ0-regularized Least Squares

    Bayesian ℓ0-regularized least squares is a variable selection technique for high-dimensional predictors. The challenge is optimizing a non-convex objective function via search over a model space consisting of all possible predictor combinations. Spike-and-slab (a.k.a. Bernoulli-Gaussian) priors are the gold standard for Bayesian variable selection, with the caveat of computational speed and scalability. Single Best Replacement (SBR) provides a fast, scalable alternative. We provide a link between Bayesian regularization and proximal updating, which yields an equivalence between finding a posterior mode and a posterior mean with a different regularization prior. This allows us to use SBR to find the spike-and-slab estimator. To illustrate our methodology, we provide simulation evidence and a real data example on the statistical properties and computational efficiency of SBR versus direct posterior sampling using spike-and-slab priors. Finally, we conclude with directions for future research
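
    A simplified sketch of the Single Best Replacement idea: greedily flip the single predictor (in or out) that most decreases the ℓ0-penalized least-squares objective, refitting on the active set each time. This captures the flavor of SBR, not its exact implementation:

    ```python
    import numpy as np

    rng = np.random.default_rng(5)

    def sbr(X, y, lam, max_iter=100):
        """Greedy search for 0.5*||y - X b||^2 + lam*||b||_0 (sketch)."""
        n, d = X.shape
        active = np.zeros(d, dtype=bool)

        def objective(mask):
            if not mask.any():
                return 0.5 * y @ y
            b = np.linalg.lstsq(X[:, mask], y, rcond=None)[0]
            r = y - X[:, mask] @ b
            return 0.5 * r @ r + lam * mask.sum()

        best = objective(active)
        for _ in range(max_iter):
            # Evaluate every single add-or-remove flip of one coordinate.
            flips = [(objective(active ^ np.eye(d, dtype=bool)[j]), j)
                     for j in range(d)]
            val, j = min(flips)
            if val >= best:
                break                          # no single flip improves
            best = val
            active[j] = ~active[j]

        b = np.zeros(d)
        if active.any():
            b[active] = np.linalg.lstsq(X[:, active], y, rcond=None)[0]
        return b

    X = rng.normal(size=(100, 8))
    b_true = np.zeros(8)
    b_true[[1, 4]] = [2.0, -1.5]
    y = X @ b_true + 0.1 * rng.normal(size=100)
    print(np.round(sbr(X, y, lam=1.0), 2))     # recovers the sparse support
    ```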