
    Anderson's orthogonality catastrophe

    The topic of this thesis is a mathematical treatment of Anderson's orthogonality catastrophe. Named after P.W. Anderson, who studied the phenomenon in the late 1960s, the catastrophe is an intrinsic effect in Fermi gases. In his first work on the topic, [Phys. Rev. Lett. 18:1049--1051], Anderson studied a system of $N$ noninteracting fermions in three space dimensions and found the ground state to be asymptotically orthogonal to the ground state of the same system perturbed by a finite-range scattering potential. More precisely, let $\Phi_L^N$ be the $N$-body ground state of the fermionic system in a $d$-dimensional box of length $L$, and let $\Psi_L^N$ be the ground state of the corresponding system in the presence of the additional finite-range potential. The catastrophe then manifests itself in the asymptotic vanishing $S_L^N := \langle \Phi_L^N , \Psi_L^N \rangle \sim L^{-\gamma/2}$ of the overlap $S_L^N$ of the $N$-body ground states $\Phi_L^N$ and $\Psi_L^N$. The asymptotics is taken in the thermodynamic limit $L \to \infty$ and $N \to \infty$ with fixed density $N/L^d \to \varrho > 0$. In [Commun. Math. Phys. 329:979--998], the overlap $S_L^N$ was bounded from above by an asymptotic bound of the form $\lvert S_L^N \rvert^2 \lesssim L^{-\tilde{\gamma}}$. The decay exponent $\tilde{\gamma}$ there corresponds to the one obtained by Anderson in [Phys. Rev. Lett. 18:1049--1051]. Another publication by Anderson from the same year, [Phys. Rev. 164:352--359], contains the exact asymptotics with a larger exponent $\gamma$. This thesis takes a step towards the exact asymptotics: we prove a bound with an exponent $\gamma$ that corresponds, in a certain sense, to the one in [Phys. Rev. 164:352--359] and improves upon the one in [Commun. Math. Phys. 329:979--998]. We use the methods of [Commun. Math. Phys. 329:979--998], but treat every term in a series expansion of $\ln S_L^N$ instead of only the first one. Treating the higher-order terms requires additional arguments, since the trace expressions that occur are no longer necessarily nonnegative, which complicates some of the estimates. The main contents of this thesis will also appear in a forthcoming article co-authored with Martin Gebert, Peter Müller, and Peter Otte.
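
    In display form (a reconstruction from the definitions above; writing the overlap as an inner product is the standard convention and is assumed here):

        % Overlap of the unperturbed and perturbed N-body ground states,
        % and the upper bound proved in [Commun. Math. Phys. 329:979--998]
        S_L^N := \langle \Phi_L^N , \Psi_L^N \rangle ,
        \qquad
        \lvert S_L^N \rvert^2 \lesssim L^{-\tilde{\gamma}}
        \quad \text{as } L \to \infty , \; N/L^d \to \varrho > 0 .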

    The exponent in the orthogonality catastrophe for Fermi gases

    We quantify the asymptotic vanishing of the ground-state overlap of two non-interacting Fermi gases in $d$-dimensional Euclidean space in the thermodynamic limit. Given two one-particle Schrödinger operators in finite volume which differ by a compactly supported bounded potential, we prove a power-law upper bound on the ground-state overlap of the corresponding non-interacting $N$-particle systems. We interpret the decay exponent $\gamma$ in terms of scattering theory and find $\gamma = \pi^{-2} \lVert \arcsin \lvert T_E/2 \rvert \rVert_{\mathrm{HS}}^2$, where $T_E$ is the transition matrix at the Fermi energy $E$. This exponent reduces to the one predicted by Anderson [Phys. Rev. 164:352--359] for the exact asymptotics in the special case of a repulsive point-like perturbation.
    Comment: version as it appears in J. Spectr. Theory; references updated.
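
    Written out, the decay exponent of the power-law bound reads:

        % Decay exponent in terms of the transition matrix T_E at the
        % Fermi energy E; HS denotes the Hilbert--Schmidt norm
        \gamma = \frac{1}{\pi^{2}}
                 \bigl\lVert \arcsin \lvert T_E / 2 \rvert \bigr\rVert_{\mathrm{HS}}^{2} .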

    Learning with AMIGo: Adversarially Motivated Intrinsic Goals

    A key challenge for reinforcement learning (RL) is learning in environments with sparse extrinsic rewards. In contrast to current RL methods, humans are able to learn new skills with little or no reward by using various forms of intrinsic motivation. We propose AMIGo, a novel agent incorporating -- as a form of meta-learning -- a goal-generating teacher that proposes Adversarially Motivated Intrinsic Goals to train a goal-conditioned "student" policy in the absence of (or alongside) environment reward. Specifically, through a simple but effective "constructively adversarial" objective, the teacher learns to propose increasingly challenging -- yet achievable -- goals that allow the student to learn general skills for acting in a new environment, independent of the task to be solved. We show that our method generates a natural curriculum of self-proposed goals which ultimately allows the agent to solve challenging procedurally-generated tasks where other forms of intrinsic motivation and state-of-the-art RL methods fail.
    Comment: 18 pages, 6 figures; published at The Ninth International Conference on Learning Representations (2021).
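
    The "constructively adversarial" objective can be made concrete with a small sketch: the teacher is rewarded for goals the student eventually reaches, but only if reaching them takes longer than a difficulty threshold. The constants and the threshold schedule below are illustrative assumptions, not the paper's exact values.

        # Sketch of a "constructively adversarial" teacher reward.
        # alpha, beta and the threshold t_star are illustrative assumptions.

        def teacher_reward(steps_to_goal, t_star, alpha=1.0, beta=1.0):
            """Positive reward only for achievable-but-challenging goals.

            steps_to_goal: steps the student needed to reach the proposed
                           goal, or None if it never reached it.
            t_star:        difficulty threshold, increased as the student
                           improves so proposed goals keep getting harder.
            """
            if steps_to_goal is None:      # unachievable goal: penalize
                return -beta
            if steps_to_goal > t_star:     # achievable yet challenging: reward
                return alpha
            return -beta                   # reached too quickly: too easy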

    PAQ: 65 Million Probably-Asked Questions and What You Can Do With Them

    Open-domain Question Answering models which directly leverage question-answer (QA) pairs, such as closed-book QA (CBQA) models and QA-pair retrievers, show promise in terms of speed and memory compared to conventional models which retrieve and read from text corpora. QA-pair retrievers also offer interpretable answers, a high degree of control, and are trivial to update at test time with new knowledge. However, these models lack the accuracy of retrieve-and-read systems, as substantially less knowledge is covered by the available QA-pairs relative to text corpora like Wikipedia. To facilitate improved QA-pair models, we introduce Probably Asked Questions (PAQ), a very large resource of 65M automatically-generated QA-pairs. We introduce a new QA-pair retriever, RePAQ, to complement PAQ. We find that PAQ preempts and caches test questions, enabling RePAQ to match the accuracy of recent retrieve-and-read models, whilst being significantly faster. Using PAQ, we train CBQA models which outperform comparable baselines by 5%, but trail RePAQ by over 15%, indicating the effectiveness of explicit retrieval. RePAQ can be configured for size (under 500MB) or speed (over 1K questions per second) whilst retaining high accuracy. Lastly, we demonstrate RePAQ's strength at selective QA, abstaining from answering when it is likely to be incorrect. This enables RePAQ to "back-off" to a more expensive state-of-the-art model, leading to a combined system which is both more accurate and 2x faster than the state-of-the-art model alone.
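
    The selective-QA "back-off" combination amounts to a confidence-gated cascade; a minimal sketch follows (the function names and the threshold value are hypothetical, not RePAQ's actual API):

        # Confidence-gated cascade: answer with the fast QA-pair retriever
        # when it is confident, otherwise defer to the slower
        # retrieve-and-read model. fast_retriever, slow_reader and the
        # threshold are hypothetical placeholders.

        def answer(question, fast_retriever, slow_reader, threshold=0.5):
            ans, confidence = fast_retriever(question)  # cheap QA-pair match
            if confidence >= threshold:
                return ans                    # fast path for most questions
            return slow_reader(question)      # back off to the expensive model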

    Grounding Aleatoric Uncertainty in Unsupervised Environment Design

    Adaptive curricula in reinforcement learning (RL) have proven effective for producing policies robust to discrepancies between the train and test environment. Recently, the Unsupervised Environment Design (UED) framework generalized RL curricula to generating sequences of entire environments, leading to new methods with robust minimax regret properties. Problematically, in partially-observable or stochastic settings, optimal policies may depend on the ground-truth distribution over aleatoric parameters of the environment in the intended deployment setting, while curriculum learning necessarily shifts the training distribution. We formalize this phenomenon as curriculum-induced covariate shift (CICS), and describe how its occurrence in aleatoric parameters can lead to suboptimal policies. Directly sampling these parameters from the ground-truth distribution avoids the issue, but thwarts curriculum learning. We propose SAMPLR, a minimax regret UED method that optimizes the ground-truth utility function, even when the underlying training data is biased due to CICS. We prove, and validate on challenging domains, that our approach preserves optimality under the ground-truth distribution, while promoting robustness across the full range of environment settings.
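
    A toy numerical example makes CICS concrete: if a curriculum over-samples "hard" values of an aleatoric parameter (here, a coin bias), the policy that looks best during training is suboptimal under the ground-truth distribution. The environment and the numbers below are illustrative assumptions, not taken from the paper.

        # Toy illustration of curriculum-induced covariate shift (CICS).
        import random

        def episode_return(guess, p_heads):
            """One-step env: reward 1 if the guess matches the coin flip."""
            outcome = "heads" if random.random() < p_heads else "tails"
            return 1.0 if guess == outcome else 0.0

        def expected_return(guess, p_heads_values, n=10_000):
            """Monte Carlo estimate under a distribution over the coin bias."""
            return sum(episode_return(guess, random.choice(p_heads_values))
                       for _ in range(n)) / n

        ground_truth = [0.7]     # deployment: coin lands heads 70% of the time
        curriculum = [0.3, 0.5]  # curriculum over-samples tails-leaning coins

        print(expected_return("tails", curriculum))    # ~0.6: best in training
        print(expected_return("tails", ground_truth))  # ~0.3: poor when deployed
        print(expected_return("heads", ground_truth))  # ~0.7: ground-truth optimum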

    Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks

    Large pre-trained language models have been shown to store factual knowledge in their parameters, and achieve state-of-the-art results when fine-tuned on downstream NLP tasks. However, their ability to access and precisely manipulate knowledge is still limited, and hence on knowledge-intensive tasks, their performance lags behind task-specific architectures. Additionally, providing provenance for their decisions and updating their world knowledge remain open research problems. Pre-trained models with a differentiable access mechanism to explicit non-parametric memory can overcome this issue, but have so far only been investigated for extractive downstream tasks. We explore a general-purpose fine-tuning recipe for retrieval-augmented generation (RAG) -- models which combine pre-trained parametric and non-parametric memory for language generation. We introduce RAG models where the parametric memory is a pre-trained seq2seq model and the non-parametric memory is a dense vector index of Wikipedia, accessed with a pre-trained neural retriever. We compare two RAG formulations, one which conditions on the same retrieved passages across the whole generated sequence, and one which can use different passages per token. We fine-tune and evaluate our models on a wide range of knowledge-intensive NLP tasks and set the state-of-the-art on three open domain QA tasks, outperforming parametric seq2seq models and task-specific retrieve-and-extract architectures. For language generation tasks, we find that RAG models generate more specific, diverse and factual language than a state-of-the-art parametric-only seq2seq baseline.
    Comment: accepted at NeurIPS 2020.
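
    For a quick feel of the two-memory setup, here is a minimal usage sketch based on the Hugging Face transformers implementation of RAG (an assumption, not the authors' original code; it requires a transformers version with RAG support, and the dummy retrieval index keeps the example self-contained at the cost of retrieval quality):

        # RAG-Sequence: one set of retrieved passages conditions the whole
        # generated answer. Parametric memory = the seq2seq generator;
        # non-parametric memory = the (here: dummy) dense passage index.
        from transformers import RagTokenizer, RagRetriever, RagSequenceForGeneration

        tokenizer = RagTokenizer.from_pretrained("facebook/rag-sequence-nq")
        retriever = RagRetriever.from_pretrained(
            "facebook/rag-sequence-nq", index_name="exact", use_dummy_dataset=True
        )
        model = RagSequenceForGeneration.from_pretrained(
            "facebook/rag-sequence-nq", retriever=retriever
        )

        inputs = tokenizer("who wrote the origin of species", return_tensors="pt")
        generated = model.generate(input_ids=inputs["input_ids"])
        print(tokenizer.batch_decode(generated, skip_special_tokens=True))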