
    Polynomial-Time Pseudodeterministic Construction of Primes

    A randomized algorithm for a search problem is *pseudodeterministic* if it produces a fixed canonical solution to the search problem with high probability. In their seminal work on the topic, Gat and Goldwasser posed as their main open problem whether prime numbers can be pseudodeterministically constructed in polynomial time. We provide a positive solution to this question in the infinitely-often regime. In more detail, we give an *unconditional* polynomial-time randomized algorithm B such that, for infinitely many values of n, B(1^n) outputs a canonical n-bit prime p_n with high probability. More generally, we prove that for every dense property Q of strings that can be decided in polynomial time, there is an infinitely-often pseudodeterministic polynomial-time construction of strings satisfying Q. This improves upon a subexponential-time construction of Oliveira and Santhanam. Our construction uses several new ideas, including a novel bootstrapping technique for pseudodeterministic constructions, and a quantitative optimization of the uniform hardness-randomness framework of Chen and Tell, using a variant of the Shaltiel–Umans generator.
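As a toy illustration of the definition only (not the paper's construction, which handles any dense polynomial-time property), the sketch below outputs the smallest n-bit prime: the search is randomized internally via Miller–Rabin testing, yet with high probability every run returns the same canonical value. Note this naive upward scan is not known to run in polynomial time, since unconditional bounds on prime gaps are too weak; all names here are illustrative.

```python
import random

def is_probable_prime(m, rounds=20):
    """Miller-Rabin primality test (randomized; error probability <= 4**-rounds)."""
    if m < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if m % p == 0:
            return m == p
    d, s = m - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, m - 1)
        x = pow(a, d, m)          # a^d mod m
        if x in (1, m - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, m)
            if x == m - 1:
                break
        else:
            return False          # witness of compositeness found
    return True

def canonical_prime(n):
    """Scan upward from 2^(n-1): randomized internally, yet w.h.p. every
    run returns the same canonical value, the smallest n-bit prime."""
    m = 1 << (n - 1)
    while not is_probable_prime(m):
        m += 1
    return m
```

Two independent runs of `canonical_prime(8)` agree on 131 except with probability vanishing in the number of Miller–Rabin rounds, which is exactly the pseudodeterminism requirement.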

    Memory-Sample Lower Bounds for Learning Parity with Noise

    In this work, we show, for the well-studied problem of learning parity under noise, where a learner tries to learn x = (x_1, …, x_n) ∈ {0,1}^n from a stream of random linear equations over F_2 that are correct with probability 1/2 + ε and flipped with probability 1/2 − ε, that any learning algorithm requires either a memory of size Ω(n²/ε) or an exponential number of samples. In fact, we study memory-sample lower bounds for a large class of learning problems, as characterized by [GRT'18], when the samples are noisy. A matrix M: A × X → {−1,1} corresponds to the following learning problem with error parameter ε: an unknown element x ∈ X is chosen uniformly at random. A learner tries to learn x from a stream of samples (a_1, b_1), (a_2, b_2), …, where for every i, a_i ∈ A is chosen uniformly at random and b_i = M(a_i, x) with probability 1/2 + ε and b_i = −M(a_i, x) with probability 1/2 − ε (0 < ε < 1/2). Assume that k, ℓ, r are such that any submatrix of M with at least 2^{−k}·|A| rows and at least 2^{−ℓ}·|X| columns has a bias of at most 2^{−r}. We show that any learning algorithm for the learning problem corresponding to M, with error, requires either a memory of size at least Ω(k·ℓ/ε) or at least 2^{Ω(r)} samples. In particular, this shows that for a large class of learning problems, the same as those in [GRT'18], any learning algorithm requires either a memory of size at least Ω((log |X|)·(log |A|)/ε) or an exponential number of noisy samples. Our proof is based on adapting the arguments in [Raz'17, GRT'18] to the noisy case.
    Comment: 19 pages. To appear in RANDOM 2021. arXiv admin note: substantial text overlap with arXiv:1708.0263
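As a small illustration of the sample model described above (the generator is mine, not the paper's), the sketch below produces the noisy parity stream: each label equals the true parity <a, x> mod 2 with probability 1/2 + ε and is flipped otherwise.

```python
import random

def noisy_parity_stream(x, eps, num_samples, rng=random):
    """Yield samples (a, b): a uniform in {0,1}^n, and b equal to the parity
    <a, x> mod 2 with probability 1/2 + eps, flipped with probability 1/2 - eps."""
    n = len(x)
    for _ in range(num_samples):
        a = [rng.randrange(2) for _ in range(n)]
        parity = sum(ai * xi for ai, xi in zip(a, x)) % 2
        b = parity if rng.random() < 0.5 + eps else 1 - parity
        yield a, b
```

With eps = 0.4, roughly 90% of the labels agree with the true parity; as eps shrinks toward 0, the Ω(n²/ε) memory threshold in the theorem grows accordingly.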

    Extractor-Based Time-Space Lower Bounds for Learning

    A matrix M: A × X → {−1,1} corresponds to the following learning problem: an unknown element x ∈ X is chosen uniformly at random. A learner tries to learn x from a stream of samples (a_1, b_1), (a_2, b_2), …, where for every i, a_i ∈ A is chosen uniformly at random and b_i = M(a_i, x). Assume that k, ℓ, r are such that any submatrix of M with at least 2^{−k}·|A| rows and at least 2^{−ℓ}·|X| columns has a bias of at most 2^{−r}. We show that any learning algorithm for the learning problem corresponding to M requires either a memory of size at least Ω(k·ℓ) or at least 2^{Ω(r)} samples. The result holds even if the learner has an exponentially small success probability (of 2^{−Ω(r)}). In particular, this shows that for a large class of learning problems, any learning algorithm requires either a memory of size at least Ω((log |X|)·(log |A|)) or an exponential number of samples, achieving a tight Ω((log |X|)·(log |A|)) lower bound on the size of the memory, rather than the bound of Ω(min{(log |X|)², (log |A|)²}) obtained in previous works [R17, MM17b]. Moreover, our result implies all previous memory-sample lower bounds, as well as a number of new applications. Our proof builds on [R17], which gave a general technique for proving memory-sample lower bounds.
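For concreteness, here is a sketch of the bias quantity in the theorem, evaluated on the inner-product matrix M(a, x) = (−1)^{<a,x> mod 2} (the matrix underlying parity learning). For the full n = 4 matrix, only the all-zero row is unbalanced, so the bias is exactly 2^{−4}. The function names are illustrative, not from the paper.

```python
from itertools import product

def inner_product_matrix(a, x):
    """M(a, x) = (-1)^(<a,x> mod 2): the matrix of the parity-learning problem."""
    return -1 if sum(ai * xi for ai, xi in zip(a, x)) % 2 else 1

def bias(M, rows, cols):
    """|average of M(a, x)| over the submatrix rows x cols; a small bias
    means the submatrix is nearly balanced between -1 and +1 entries."""
    total = sum(M(a, x) for a in rows for x in cols)
    return abs(total) / (len(rows) * len(cols))
```

The theorem's hypothesis asks that this quantity stay below 2^{−r} for every sufficiently large submatrix, i.e. that M behaves like a two-source extractor.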

    Gradient-Based Stochastic Optimization Methods in Bayesian Experimental Design

    Optimal experimental design (OED) seeks experiments expected to yield the most useful data for some purpose. In practical circumstances where experiments are time-consuming or resource-intensive, OED can yield enormous savings. We pursue OED for nonlinear systems from a Bayesian perspective, with the goal of choosing experiments that are optimal for parameter inference. Our objective in this context is the expected information gain in model parameters, which in general can only be estimated using Monte Carlo methods. Maximizing this objective thus becomes a stochastic optimization problem. This paper develops gradient-based stochastic optimization methods for the design of experiments on a continuous parameter space. Given a Monte Carlo estimator of expected information gain, we use infinitesimal perturbation analysis to derive gradients of this estimator. We are then able to formulate two gradient-based stochastic optimization approaches: (i) Robbins-Monro stochastic approximation, and (ii) sample average approximation combined with a deterministic quasi-Newton method. A polynomial chaos approximation of the forward model accelerates objective and gradient evaluations in both cases. We discuss the implementation of these optimization methods, then conduct an empirical comparison of their performance. To demonstrate design in a nonlinear setting with partial differential equation forward models, we use the problem of sensor placement for source inversion. Numerical results yield useful guidelines on the choice of algorithm and sample sizes, assess the impact of estimator bias, and quantify tradeoffs of computational cost versus solution quality and robustness.
    United States. Air Force Office of Scientific Research (Computational Mathematics Program); National Science Foundation (U.S.) (Award ECCS-1128147)
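A minimal sketch of approach (i), Robbins-Monro stochastic approximation, on a stand-in objective: the paper's objective is a Monte Carlo estimator of expected information gain, but the quadratic used here is only illustrative, as is every name in the snippet.

```python
import random

def robbins_monro(grad_estimate, d0, steps, a=1.0, rng=random):
    """Robbins-Monro stochastic approximation: ascend along unbiased but noisy
    gradient estimates with diminishing step sizes a_k = a/k, which satisfy
    the classic conditions sum(a_k) = infinity and sum(a_k**2) < infinity."""
    d = d0
    for k in range(1, steps + 1):
        d += (a / k) * grad_estimate(d, rng)
        # a projection onto the feasible design space would go here if needed
    return d

# Stand-in objective U(d) = -(d - 2)^2 with noisy gradient -2(d - 2) + noise;
# the iterates converge to the maximizer d* = 2 despite the noise.
def noisy_grad(d, rng):
    return -2.0 * (d - 2.0) + rng.gauss(0.0, 1.0)
```

The diminishing step sizes average out the Monte Carlo noise over iterations, which is what lets the method tolerate cheap, high-variance gradient estimates; the sample-average-approximation alternative (ii) instead fixes the random samples up front and applies a deterministic optimizer.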