104,480 research outputs found

    Why and When Can Deep -- but Not Shallow -- Networks Avoid the Curse of Dimensionality: a Review

    The paper characterizes classes of functions for which deep learning can be exponentially better than shallow learning. Deep convolutional networks fall under these conditions as a special case, though weight sharing is not the main reason for their exponential advantage.
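    To make the kind of function class concrete (an illustrative example, not notation from the paper): results of this type concern compositional functions with a hierarchical, local structure, such as

    $$ f(x_1,\dots,x_8) = h_3\big(h_{21}(h_{11}(x_1,x_2),\,h_{12}(x_3,x_4)),\; h_{22}(h_{13}(x_5,x_6),\,h_{14}(x_7,x_8))\big), $$

    where each constituent function depends on only two variables. A deep network mirroring this binary-tree structure can approximate $f$ at a cost governed by the constituent dimensionality (here 2), while a generic shallow network pays a cost exponential in the ambient dimension (here 8).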

    Approximate Bayesian Computation in State Space Models

    A new approach to inference in state space models is proposed, based on approximate Bayesian computation (ABC). ABC avoids evaluation of the likelihood function by matching observed summary statistics with statistics computed from data simulated from the true process; exact inference is feasible only if the statistics are sufficient. With finite-sample sufficiency unattainable in the state space setting, we seek asymptotic sufficiency via the maximum likelihood estimator (MLE) of the parameters of an auxiliary model. We prove that this auxiliary model-based approach achieves Bayesian consistency, and that, in a precise limiting sense, the proximity to (asymptotic) sufficiency yielded by the MLE is replicated by the score. In multiple-parameter settings, a separate treatment of scalar parameters, based on integrated likelihood techniques, is advocated as a way of avoiding the curse of dimensionality. Some attention is given to a structure in which the state variable is driven by a continuous-time process, with exact inference typically infeasible in this case as a result of intractable transitions. The ABC method is demonstrated using the unscented Kalman filter as a fast and simple way of producing an approximation in this setting, with a stochastic volatility model for financial returns used for illustration.
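    A minimal sketch of the generic ABC rejection step this approach builds on (an illustration under stated assumptions, not the paper's algorithm: `prior_sampler`, `simulate`, and `summary_stat` are hypothetical placeholders, with `summary_stat` standing in for the auxiliary-model MLE or score):

```python
import numpy as np

def abc_rejection(observed, prior_sampler, simulate, summary_stat,
                  n_draws=10_000, tolerance=0.1):
    """Keep parameter draws whose simulated summary statistics
    land within `tolerance` of the observed ones."""
    s_obs = summary_stat(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler()          # draw parameters from the prior
        y_sim = simulate(theta)          # simulate data from the model
        s_sim = summary_stat(y_sim)      # e.g. auxiliary-model MLE or score
        if np.linalg.norm(s_sim - s_obs) < tolerance:
            accepted.append(theta)       # retained draws approximate the posterior
    return np.array(accepted)
```

    The quality of the resulting approximation hinges on how close `summary_stat` comes to sufficiency, which is exactly why the paper targets the MLE (or score) of an auxiliary model.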

    Algorithms and lower bounds for de Morgan formulas of low-communication leaf gates

    The class $\mathrm{FORMULA}[s] \circ \mathcal{G}$ consists of Boolean functions computable by size-$s$ de Morgan formulas whose leaves are any Boolean functions from a class $\mathcal{G}$. We give lower bounds and (SAT, Learning, and PRG) algorithms for $\mathrm{FORMULA}[n^{1.99}] \circ \mathcal{G}$, for classes $\mathcal{G}$ of functions with low communication complexity. Let $R^{(k)}(\mathcal{G})$ be the maximum $k$-party NOF randomized communication complexity of $\mathcal{G}$. We show: (1) The Generalized Inner Product function $GIP^k_n$ cannot be computed in $\mathrm{FORMULA}[s] \circ \mathcal{G}$ on more than a $1/2+\varepsilon$ fraction of inputs for $s = o\!\left(\frac{n^2}{\left(k \cdot 4^k \cdot R^{(k)}(\mathcal{G}) \cdot \log(n/\varepsilon) \cdot \log(1/\varepsilon)\right)^{2}}\right)$. As a corollary, we get an average-case lower bound for $GIP^k_n$ against $\mathrm{FORMULA}[n^{1.99}] \circ \mathrm{PTF}^{k-1}$. (2) There is a PRG of seed length $n/2 + O\left(\sqrt{s} \cdot R^{(2)}(\mathcal{G}) \cdot \log(s/\varepsilon) \cdot \log(1/\varepsilon)\right)$ that $\varepsilon$-fools $\mathrm{FORMULA}[s] \circ \mathcal{G}$. For $\mathrm{FORMULA}[s] \circ \mathrm{LTF}$, we get the better seed length $O\left(n^{1/2} \cdot s^{1/4} \cdot \log(n) \cdot \log(n/\varepsilon)\right)$. This gives the first non-trivial PRG (with seed length $o(n)$) for intersections of $n$ half-spaces in the regime where $\varepsilon \leq 1/n$. (3) There is a randomized $2^{n-t}$-time $\#$SAT algorithm for $\mathrm{FORMULA}[s] \circ \mathcal{G}$, where $t = \Omega\left(\frac{n}{\sqrt{s} \cdot \log^2(s) \cdot R^{(2)}(\mathcal{G})}\right)^{1/2}$. In particular, this implies a nontrivial $\#$SAT algorithm for $\mathrm{FORMULA}[n^{1.99}] \circ \mathrm{LTF}$. (4) The Minimum Circuit Size Problem is not in $\mathrm{FORMULA}[n^{1.99}] \circ \mathrm{XOR}$. On the algorithmic side, we show that $\mathrm{FORMULA}[n^{1.99}] \circ \mathrm{XOR}$ can be PAC-learned in time $2^{O(n/\log n)}$.
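    For reference, the Generalized Inner Product function appearing in result (1) is the standard $k$-party parity-of-ANDs:

    $$ GIP^k_n(x^1,\dots,x^k) \;=\; \bigoplus_{i=1}^{n} \bigwedge_{j=1}^{k} x^j_i, \qquad x^1,\dots,x^k \in \{0,1\}^n, $$

    where, in the number-on-forehead (NOF) model, the $j$-th player sees every input except $x^j$.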

    Variational Principle of Bogoliubov and Generalized Mean Fields in Many-Particle Interacting Systems

    The theory of many-particle interacting systems is reviewed from a unified standpoint based on the variational principle for the free energy. A systematic discussion is given of the approximate free energies of complex statistical systems. The analysis is centered around the variational principle of N. N. Bogoliubov for the free energy in the context of its applications to various problems of statistical mechanics and condensed matter physics. The review presents a terse discussion of selected works carried out over the past few decades on the theory of many-particle interacting systems in terms of variational inequalities. It is the purpose of this paper to discuss some of the general principles which form the mathematical background to this approach, and to establish a connection between the variational technique and other methods, such as the method of the mean (or self-consistent) field in the many-body problem, in which the effect of all the other particles on any given particle is approximated by a single averaged effect, thus reducing a many-body problem to a single-body problem. The method is illustrated by applying it to various many-particle interacting systems, such as the Ising and Heisenberg models, superconducting and superfluid systems, and strongly correlated systems. It seems likely that these technical advances in the many-body problem will be useful in suggesting new methods for treating and understanding many-particle interacting systems. This work proposes a new, general and pedagogical presentation, intended both for those who are interested in basic aspects and for those who are interested in concrete applications.
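    The variational principle in question is the Bogoliubov inequality, which in its standard form (stated here for orientation, not quoted from the review) bounds the free energy $F$ of a system with Hamiltonian $H$ by means of any trial Hamiltonian $H_0$:

    $$ F \;\leq\; F_0 + \langle H - H_0 \rangle_0 , $$

    where $F_0$ is the free energy of $H_0$ and $\langle \cdot \rangle_0$ denotes the thermal average in the equilibrium ensemble of $H_0$. Minimizing the right-hand side over the parameters of $H_0$ (for example, an effective field acting on each spin of an Ising model) yields the mean-field approximations discussed above.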

    Approximating multivariate posterior distribution functions from Monte Carlo samples for sequential Bayesian inference

    An important feature of Bayesian statistics is the opportunity to do sequential inference: the posterior distribution obtained after seeing a dataset can be used as the prior for a second inference. However, when Monte Carlo sampling methods are used for inference, we only have a set of samples from the posterior distribution. To do sequential inference, we then either have to evaluate the second posterior at only these locations and reweight the samples accordingly, or we can estimate a functional description of the posterior probability distribution from the samples and use that as the prior for the second inference. Here, we investigated to what extent we can obtain an accurate joint posterior from two datasets if the inference is done sequentially rather than jointly, under the condition that each inference step is done using Monte Carlo sampling. To test this, we evaluated the accuracy of kernel density estimates, Gaussian mixtures, vine copulas and Gaussian processes in approximating posterior distributions, and then tested whether these approximations can be used in sequential inference. In low dimensions, Gaussian processes are more accurate, whereas in higher dimensions Gaussian mixtures or vine copulas perform better. In our test cases, posterior approximations are preferable over direct sample reweighting, although joint inference is still preferable over sequential inference. Since the performance is case-specific, we provide an R package, mvdens, with a unified interface to the density approximation methods.
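    A minimal sketch of the sequential scheme, using a Gaussian mixture from scikit-learn as the functional posterior approximation (an illustration of the idea only, not the mvdens interface; `log_likelihood2` is a placeholder for the second dataset's log-likelihood):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_posterior_approximation(samples, n_components=5):
    """Stage 1: fit a density approximation to Monte Carlo samples
    from the posterior given the first dataset."""
    gm = GaussianMixture(n_components=n_components)
    gm.fit(samples)                  # samples: (n_samples, dim) array
    return gm

def log_posterior_stage2(theta, log_likelihood2, posterior_approx):
    """Stage 2: the fitted approximation serves as the log-prior
    when sampling the posterior for the second dataset."""
    log_prior = posterior_approx.score_samples(theta.reshape(1, -1))[0]
    return log_prior + log_likelihood2(theta)
```

    Any MCMC sampler can then target `log_posterior_stage2`; how well the resulting joint posterior matches the one from inference on both datasets at once is precisely what the paper evaluates across approximation methods and dimensionalities.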

    On the Quantitative Hardness of CVP

    For odd integers $p \geq 1$ (and $p = \infty$), we show that the Closest Vector Problem in the $\ell_p$ norm ($\mathrm{CVP}_p$) over rank-$n$ lattices cannot be solved in $2^{(1-\varepsilon)n}$ time for any constant $\varepsilon > 0$ unless the Strong Exponential Time Hypothesis (SETH) fails. We then extend this result to "almost all" values of $p \geq 1$, not including the even integers. This comes tantalizingly close to settling the quantitative time complexity of the important special case of $\mathrm{CVP}_2$ (i.e., CVP in the Euclidean norm), for which a $2^{n+o(n)}$-time algorithm is known. In particular, our result applies for any $p = p(n) \neq 2$ that approaches $2$ as $n \to \infty$. We also show a similar SETH-hardness result for $\mathrm{SVP}_\infty$; hardness of approximating $\mathrm{CVP}_p$ to within some constant factor under the so-called Gap-ETH assumption; and other quantitative hardness results for $\mathrm{CVP}_p$ and $\mathrm{CVPP}_p$ for any $1 \leq p < \infty$ under different assumptions.
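    For orientation, the standard definition (not part of the abstract): given a basis $B \in \mathbb{R}^{d \times n}$ of a rank-$n$ lattice $\mathcal{L}(B) = \{Bz : z \in \mathbb{Z}^n\}$ and a target $t \in \mathbb{R}^d$, $\mathrm{CVP}_p$ asks for a lattice vector attaining

    $$ \mathrm{dist}_p\big(t, \mathcal{L}(B)\big) \;=\; \min_{z \in \mathbb{Z}^n} \lVert Bz - t \rVert_p , $$

    while $\mathrm{CVPP}_p$ is the preprocessing variant in which arbitrary advice depending only on the lattice may be precomputed before the target arrives.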