209 research outputs found
Riemann Hypothesis: a GGC factorisation
A GGC (Generalized Gamma Convolution) representation of Riemann's Xi-function
is constructed.
On Hilbert's 8th Problem
A Hadamard factorization of the Riemann Xi-function is constructed to
characterize the zeros of the zeta function.
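For context, the classical genus-one Hadamard product for the Xi-function, which the constructions above refine (a standard identity, not the papers' new result):

    \xi(s) \;=\; \xi(0)\prod_{\rho}\Bigl(1-\frac{s}{\rho}\Bigr), \qquad \xi(0)=\tfrac{1}{2},

where the product runs over the nontrivial zeros \rho of \zeta(s), paired as (\rho, 1-\rho) so that it converges; the Riemann Hypothesis asserts \operatorname{Re}(\rho)=1/2 for every \rho.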
van Dantzig Pairs, Wald Couples and Hadamard Factorisation
Some consequences of a duality between the Hadamard-Weierstrass factorisation
of an entire function and van Dantzig-Wald couples of random variables are
explored. We demonstrate the methodology on particular functions including the
Riemann zeta and xi-functions, Ramanujan's tau function, L-functions and Gamma
and Hyperbolic functions.
Posterior Concentration for Sparse Deep Learning
Spike-and-Slab Deep Learning (SS-DL) is a fully Bayesian alternative to
Dropout for improving generalizability of deep ReLU networks. This new type of
regularization enables provable recovery of smooth input-output maps with
unknown levels of smoothness. Indeed, we show that the posterior distribution
concentrates at the near-minimax rate for α-Hölder smooth maps,
performing as well as if we knew the smoothness level ahead of time.
Our result sheds light on architecture design for deep neural networks, namely
the choice of depth, width and sparsity level. These network attributes
typically depend on unknown smoothness in order to be optimal. We obviate this
constraint with the fully Bayes construction. As an aside, we show that SS-DL
does not overfit in the sense that the posterior concentrates on smaller
networks with fewer (up to the optimal number of) nodes and links. Our results
provide new theoretical justifications for deep ReLU networks from a Bayesian
point of view.
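To make the prior concrete, here is a minimal sketch of a Bernoulli-Gaussian (spike-and-slab) draw over network weights; the inclusion probability theta and slab scale are illustrative placeholders, not the paper's calibrated choices.

    import numpy as np

    rng = np.random.default_rng(0)

    def spike_and_slab(n_weights, theta=0.1, slab_sd=1.0):
        # Each weight is exactly zero with probability 1 - theta (the spike)
        # and N(0, slab_sd^2) with probability theta (the slab).
        gamma = rng.random(n_weights) < theta          # inclusion indicators
        w = np.where(gamma, rng.normal(0.0, slab_sd, n_weights), 0.0)
        return w, gamma

    w, gamma = spike_and_slab(20)
    print(f"{gamma.sum()} of 20 weights active:", np.round(w, 2))

Sampling the indicators first is what lets the posterior concentrate on sparse architectures: zeroed weights delete links, and the posterior over the indicators controls the effective network size.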
Deep Learning: Computational Aspects
In this article we review computational aspects of Deep Learning (DL). Deep
learning uses network architectures consisting of hierarchical layers of latent
variables to construct predictors for high-dimensional input-output models.
Training a deep learning architecture is computationally intensive, and
efficient linear algebra libraries are key to both training and inference.
Stochastic gradient descent (SGD) optimization and batch sampling are used to
learn from massive data sets.
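As a concrete illustration of the SGD-plus-batch-sampling loop the article reviews, here is a minimal sketch on a toy least-squares problem; the data, step size, and batch size are invented for illustration.

    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.normal(size=(1000, 5))                       # toy inputs
    w_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
    y = X @ w_true + 0.1 * rng.normal(size=1000)         # toy outputs

    w = np.zeros(5)
    lr, batch = 0.05, 32
    for step in range(2000):
        idx = rng.integers(0, len(y), size=batch)        # batch sampling
        grad = X[idx].T @ (X[idx] @ w - y[idx]) / batch  # gradient on the mini-batch
        w -= lr * grad                                   # SGD update
    print(np.round(w, 2))                                # approximately w_true

Each update touches only a small batch, which is what makes the approach scale to massive data sets; the efficient linear algebra enters through the batched matrix products.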
Deep Learning for Short-Term Traffic Flow Prediction
We develop a deep learning model to predict traffic flows. The main
contribution is the development of an architecture that combines a linear
model, fitted using regularization, with a sequence of layers.
The challenge in predicting traffic flows is the sharp nonlinearity induced by
transitions between free flow, breakdown, recovery, and congestion. We show that
deep learning architectures can capture these nonlinear spatio-temporal
effects. The first layer identifies spatio-temporal relations among predictors
and other layers model nonlinear relations. We illustrate our methodology on
road sensor data from Interstate I-55 and predict traffic flows during two
special events: a Chicago Bears football game and an extreme snowstorm.
Both cases exhibit sharp, sudden traffic flow regime changes, and we show how
deep learning provides precise short-term traffic flow predictions.
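A hedged sketch of the two-part idea: the paper fuses the regularized linear layer into the network itself, whereas the stand-in below simply chains an L1-regularized linear selector into a small ReLU network (hypothetical data; sklearn used for brevity).

    import numpy as np
    from sklearn.linear_model import Lasso
    from sklearn.neural_network import MLPRegressor

    # toy stand-in for lagged sensor readings: rows = times, cols = (sensor, lag) pairs
    rng = np.random.default_rng(2)
    X = rng.normal(size=(500, 40))
    y = np.tanh(X[:, 0] - X[:, 7]) + 0.1 * rng.normal(size=500)   # nonlinear flow proxy

    # first layer: sparse linear model identifies spatio-temporal predictors
    sel = Lasso(alpha=0.1).fit(X, y)
    keep = np.flatnonzero(sel.coef_)

    # later layers: small ReLU network models the remaining nonlinearity
    mlp = MLPRegressor(hidden_layer_sizes=(16, 16), activation="relu",
                       max_iter=2000, random_state=0).fit(X[:, keep], y)
    print("selected predictors:", keep, "R^2:", round(mlp.score(X[:, keep], y), 3))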
Bayesian Particle Tracking of Traffic Flows
We develop a Bayesian particle filter for tracking traffic flows that is
capable of capturing non-linearities and discontinuities present in flow
dynamics. Our model includes a hidden state variable that captures sudden
regime shifts between traffic free flow, breakdown and recovery. We develop an
efficient particle learning algorithm for real time on-line inference of states
and parameters. This requires a two-step approach: first, resampling the
current particles with a mixture predictive distribution; and second,
propagating the states using the conditional posterior distribution. Particle
learning of parameters follows from updating recursions for conditional
sufficient statistics. To illustrate our methodology, we analyze measurements
of daily traffic flow from the Illinois interstate I-55 highway system. We
demonstrate how our filter can be used to infer changes in the traffic flow
regime on a highway segment from freeway single-loop detector measurements.
Finally, we conclude with directions for future research.
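A minimal sketch of the resample-then-propagate structure on a toy three-regime model; the regime means, transition matrix, and measurements below are invented, and the paper's particle-learning recursions for parameters are omitted.

    import numpy as np

    rng = np.random.default_rng(3)

    MU = np.array([60.0, 25.0, 45.0])     # mean flow: free flow / breakdown / recovery
    P = np.array([[0.95, 0.04, 0.01],     # regime transition matrix
                  [0.05, 0.90, 0.05],
                  [0.10, 0.05, 0.85]])
    SIG = 5.0                             # observation noise s.d.

    def particle_filter(y, n=2000):
        s = rng.integers(0, 3, n)                                  # particle regimes
        for obs in y:
            lik = np.exp(-0.5 * ((obs - MU) / SIG) ** 2)           # regime likelihoods
            # step 1: resample particles by the mixture predictive weight
            w = P[s] @ lik
            s = s[rng.choice(n, n, p=w / w.sum())]
            # step 2: propagate each regime from its conditional posterior
            post = P[s] * lik
            post /= post.sum(axis=1, keepdims=True)
            s = np.array([rng.choice(3, p=row) for row in post])
        return np.bincount(s, minlength=3) / n                     # filtered regime probs

    y = [58, 55, 30, 22, 24, 40, 48, 57]  # hypothetical flow measurements
    print(np.round(particle_filter(y), 3))

Tracking the filtered distribution through the sequence shows probability mass moving into the breakdown regime around the low readings and back out, the sudden regime shifts the abstract describes.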
Deep Learning: A Bayesian Perspective
Deep learning is a form of machine learning for nonlinear high dimensional
pattern matching and prediction. By taking a Bayesian probabilistic
perspective, we provide a number of insights into more efficient algorithms for
optimisation and hyper-parameter tuning. Traditional high-dimensional data
reduction techniques, such as principal component analysis (PCA), partial least
squares (PLS), reduced rank regression (RRR), and projection pursuit regression
(PPR), are all shown to be shallow learners. Their deep learning counterparts
exploit multiple deep layers of data reduction which provide predictive
performance gains. Stochastic gradient descent (SGD) training optimisation and
Dropout (DO) regularization provide estimation and variable selection. Bayesian
regularization is central to finding weights and connections in networks to
optimize the predictive bias-variance trade-off. To illustrate our methodology,
we provide an analysis of international bookings on Airbnb. Finally, we
conclude with directions for future research.
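Schematically (our notation, not the paper's exact formulation), a shallow learner predicts through a single data-reduction layer while its deep counterpart composes several:

    \hat{y} = f_{W_1,b_1}(x) \quad\text{(shallow)}, \qquad
    \hat{y} = \bigl(f_{W_L,b_L}\circ\cdots\circ f_{W_1,b_1}\bigr)(x) \quad\text{(deep)},

with layer maps f_{W,b}(z) = g(Wz + b) for an activation g.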
Regularizing Bayesian Predictive Regressions
We show that regularizing Bayesian predictive regressions provides a
framework for prior sensitivity analysis. We develop a procedure that jointly
regularizes expectations and variance-covariance matrices using a pair of
shrinkage priors. Our methodology applies directly to vector autoregressions
(VAR) and seemingly unrelated regressions (SUR). The regularization path
provides a prior sensitivity diagnostic. By exploiting a duality between
regularization penalties and predictive prior distributions, we reinterpret two
classic Bayesian analyses of macro-finance studies: equity premium
predictability and forecasting macroeconomic growth rates. We find there exist
plausible prior specifications for predictability in excess S&P 500 index
returns using book-to-market ratios, CAY (consumption, wealth, income ratio),
and T-bill rates. We evaluate the forecasts using a market-timing strategy, and
we show the optimally regularized solution outperforms a buy-and-hold approach.
A second empirical application involves forecasting industrial production,
inflation, and consumption growth rates, and demonstrates the feasibility of
our approach.
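The penalty-prior duality invoked here is the standard MAP correspondence: under a Gibbs-form prior, posterior-mode estimation is penalized least squares (a known identity, stated in our notation):

    \arg\max_{\beta}\, p(\beta \mid y)
    \;=\; \arg\min_{\beta}\,\Bigl\{\tfrac{1}{2\sigma^2}\lVert y - X\beta\rVert_2^2 + \lambda\,\phi(\beta)\Bigr\},
    \qquad p(\beta)\;\propto\; e^{-\lambda\,\phi(\beta)}.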
Bayesian ℓ0-regularized Least Squares
Bayesian ℓ0-regularized least squares is a variable selection technique
for high dimensional predictors. The challenge is optimizing a non-convex
objective function via search over model space consisting of all possible
predictor combinations. Spike-and-slab (a.k.a. Bernoulli-Gaussian) priors are
the gold standard for Bayesian variable selection, albeit with caveats of
computational speed and scalability. Single Best Replacement (SBR) provides a
fast, scalable alternative. We establish a link between Bayesian regularization
and proximal updating, which yields an equivalence between finding a posterior
mode and a posterior mean under a different regularization prior. This
allows us to use SBR to find the spike-and-slab estimator. To illustrate our
methodology, we provide simulation evidence and a real data example on the
statistical properties and computational efficiency of SBR versus direct
posterior sampling using spike-and-slab priors. Finally, we conclude with
directions for future research.
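A minimal sketch of the SBR idea, greedy single-coordinate support flips for the ℓ0-penalized least-squares objective; the refit per flip is deliberately naive (efficient implementations update residuals incrementally), and the data and penalty are illustrative.

    import numpy as np

    def sbr(X, y, lam, max_iter=100):
        # Minimize 0.5*||y - X b||^2 + lam*||b||_0 by repeatedly applying the
        # single support flip (add or remove one variable) that most lowers it.
        n, p = X.shape
        support = np.zeros(p, dtype=bool)

        def fit(s):
            b = np.zeros(p)
            if s.any():
                b[s], *_ = np.linalg.lstsq(X[:, s], y, rcond=None)
            return b, 0.5 * np.sum((y - X @ b) ** 2) + lam * s.sum()

        b, obj = fit(support)
        for _ in range(max_iter):
            best_j, best_obj = None, obj
            for j in range(p):                       # try flipping each coordinate
                trial = support.copy()
                trial[j] = ~trial[j]
                _, o = fit(trial)
                if o < best_obj - 1e-12:
                    best_j, best_obj = j, o
            if best_j is None:                       # no single flip helps: stop
                return b
            support[best_j] = ~support[best_j]
            b, obj = fit(support)
        return b

    rng = np.random.default_rng(4)
    X = rng.normal(size=(100, 10))
    y = 3.0 * X[:, 1] - 2.0 * X[:, 6] + 0.1 * rng.normal(size=100)
    print(np.round(sbr(X, y, lam=1.0), 2))           # nonzero only at indices 1 and 6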