
    Supersymmetry of Noncompact MQCD-like Membrane Instantons and Heat Kernel Asymptotics

    We perform a heat kernel asymptotics analysis of the nonperturbative superpotential obtained from the wrapping of an M2-brane around a supersymmetric noncompact three-fold embedded in a (noncompact) G_2-manifold, as obtained in [1], the three-fold being the one relevant to domain walls in Witten's MQCD [2], in the limit of small "zeta", a complex constant that appears in the Riemann surfaces relevant to defining the boundary conditions for the domain wall in MQCD. The MQCD-like configuration is interpretable, for small but non-zero zeta, as a noncompact/"large" open membrane instanton, and, for vanishing zeta, as the type IIA D0-brane (for vanishing M-theory circle radius). We find that the eta-function Seeley-de Witt coefficients vanish, and we obtain a perfect match of the zeta-function Seeley-de Witt coefficients (up to terms quadratic in zeta) between the Dirac-type operator and one of the two Laplace-type operators figuring in the superpotential. This is an extremely strong signature of residual supersymmetry for the nonperturbative configurations in M-theory considered in this work.
    Comment: 21 pages, LaTeX; v3: several clarifying remarks added, to appear in JHEP
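    For orientation, and stated schematically rather than for the specific operators of [1]: given a Laplace-type operator P on a d-dimensional manifold (here d = 3 for the three-fold), the Seeley-de Witt coefficients a_n(P) are the coefficients of its small-t heat kernel expansion,

        \operatorname{Tr}\bigl(e^{-tP}\bigr) \;\sim\; \sum_{n \ge 0} a_n(P)\, t^{(n-d)/2}, \qquad t \to 0^{+},

    and the associated zeta function is obtained from this trace by a Mellin transform, \zeta_P(s) = \frac{1}{\Gamma(s)} \int_0^\infty dt\, t^{s-1} \operatorname{Tr}\bigl(e^{-tP}\bigr). The match reported above is an agreement, coefficient by coefficient and up to terms quadratic in the constant zeta, between these expansions for the square of the Dirac-type operator and for one of the two Laplace-type operators entering the superpotential.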

    Large Language Models Can Be Easily Distracted by Irrelevant Context

    Large language models have achieved impressive performance on various natural language processing tasks. However, so far they have been evaluated primarily on benchmarks where all information in the input context is relevant to solving the task. In this work, we investigate the distractibility of large language models, i.e., how model problem-solving accuracy can be influenced by irrelevant context. In particular, we introduce Grade-School Math with Irrelevant Context (GSM-IC), an arithmetic reasoning dataset with irrelevant information in the problem description. We use this benchmark to measure the distractibility of cutting-edge prompting techniques for large language models, and find that model performance drops dramatically when irrelevant information is included. We also identify several approaches for mitigating this deficiency, such as decoding with self-consistency and adding to the prompt an instruction that tells the language model to ignore the irrelevant information.
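    As a rough illustration of the kind of evaluation GSM-IC supports, the sketch below (a minimal, hypothetical harness, not the paper's code) scores a model on an arithmetic question with and without an injected irrelevant sentence, prepends an "ignore irrelevant information" instruction, and aggregates sampled answers by majority vote (self-consistency). The function sample_completion and the example item are placeholders, not material from the paper.

        from collections import Counter
        import re

        def sample_completion(prompt: str) -> str:
            # Placeholder: in a real evaluation this would query the language model under test.
            return "12 + 7 = 19. So the answer is 19."

        IGNORE_INSTRUCTION = ("Solve the problem. Feel free to ignore irrelevant "
                              "information given in the question.\n\n")

        def extract_answer(completion: str) -> str:
            # Take the last number in the completion as the predicted answer.
            numbers = re.findall(r"-?\d+(?:\.\d+)?", completion)
            return numbers[-1] if numbers else ""

        def self_consistent_answer(prompt: str, n_samples: int = 5) -> str:
            # Self-consistency: sample several completions and majority-vote their answers.
            answers = [extract_answer(sample_completion(prompt)) for _ in range(n_samples)]
            answers = [a for a in answers if a]
            return Counter(answers).most_common(1)[0][0] if answers else ""

        # A GSM-IC-style item: a base problem plus an irrelevant sentence.
        base = "Liam had 12 apples and bought 7 more. How many apples does he have now?"
        distractor = "Liam's sister is 26 years old."
        distracted = base.replace("How many", distractor + " How many")

        for condition, question in [("clean", base), ("irrelevant context", distracted)]:
            prompt = IGNORE_INSTRUCTION + "Q: " + question + "\nA: Let's think step by step."
            print(condition, "->", self_consistent_answer(prompt))

    Measuring the accuracy drop between the clean and irrelevant-context conditions over many such items is, in essence, the distractibility measurement the benchmark is designed for.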

    On Semantic Cognition, Inductive Generalization, and Language Models

    My doctoral research focuses on understanding semantic knowledge in neural network models trained solely to predict natural language (referred to as language models, or LMs), by drawing on insights from the study of concepts and categories grounded in cognitive science. I propose a framework inspired by 'inductive reasoning,' a phenomenon that sheds light on how humans utilize background knowledge to make inductive leaps and generalize from new pieces of information about concepts and their properties. Drawing from experiments that study inductive reasoning, I propose to analyze semantic inductive generalization in LMs using phenomena observed in the human induction literature, investigate inductive behavior on tasks such as implicit reasoning and emergent feature recognition, and analyze and relate induction dynamics to the learned conceptual representation space.

    Exploring Lexical Sensitivities in Word Prediction Models: A Case Study on BERT

    Estimating word probabilities in context is a fundamental mechanism underlying the training of neural network-based language processing models. Models pre-trained using this mechanism tend to learn task-independent representations that exhibit a variety of semantic regularities desirable for language processing. While prediction-based tasks have become an important component of these models, much is unknown about what kinds of information the models draw from context to inform word probabilities. The present work aims to advance the understanding of word prediction models by integrating perspectives from the psycholinguistic phenomenon of semantic priming, and presents a case study analyzing the lexical properties of the pretrained BERT model. Using stimuli that cause priming in humans, this thesis relates BERT’s sensitivity towards lexical cues to predictive contextual constraints and finer-grained lexical relations. To augment the empirical methodology used to behaviorally analyze BERT, this thesis draws on the knowledge-rich paradigm of Ontological Semantics and the fuzzy inferences supported by its practical realization, the Ontological Semantics Technology, to qualitatively relate BERT’s predictive mechanisms to meaning interpretation in context. The findings establish the importance of considering the predictive constraint effects of context in studies that behaviorally analyze language processing models, and highlight possible parallels with human processing.
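    A minimal behavioral probe in the spirit described here, assuming the Hugging Face transformers and torch packages (the prime/target pairs and the sentence frame are illustrative, not the thesis's actual stimuli): compare the masked-token probability BERT assigns to a target word when the context contains a related versus an unrelated prime.

        import torch
        from transformers import BertForMaskedLM, BertTokenizer

        tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
        model = BertForMaskedLM.from_pretrained("bert-base-uncased")
        model.eval()

        def target_logprob(prime: str, target: str) -> float:
            # Log-probability BERT assigns to `target` at the masked position of a simple frame.
            text = f"{prime} and {tokenizer.mask_token}."
            inputs = tokenizer(text, return_tensors="pt")
            with torch.no_grad():
                logits = model(**inputs).logits
            mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero()[0, 0]
            log_probs = torch.log_softmax(logits[0, mask_pos], dim=-1)
            return log_probs[tokenizer.convert_tokens_to_ids(target)].item()

        # Illustrative (related prime, unrelated prime, target) triples.
        for related, unrelated, target in [("doctor", "carrot", "nurse"),
                                           ("salt", "window", "pepper")]:
            facilitation = target_logprob(related, target) - target_logprob(unrelated, target)
            print(f"{target}: related-minus-unrelated log-probability = {facilitation:.3f}")

    A positive difference is the model-side analogue of the facilitation that priming produces in human lexical decision and naming latencies.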

    On Semantic Cognition, Inductive Generalization, and Language Models

    Our ability to understand language and perform reasoning crucially relies on a robust system of semantic cognition (G. L. Murphy, 2002; Rogers & McClelland, 2004; Rips et al., 2012; Lake & Murphy, 2021): processes that allow us to learn, update, and produce inferences about everyday concepts (e.g., cat, chair), properties (e.g., has fur, can be sat on), categories (e.g., mammals, furniture), and relations (e.g., is-a, taller-than). Meanwhile, recent progress in the field of natural language processing (NLP) has led to the development of language models (LMs): sophisticated neural networks that are trained to predict words in context (Devlin et al., 2019; Radford et al., 2019; Brown et al., 2020), and as a result build representations that encode the knowledge present in the statistics of their training environment. These models have achieved impressive levels of performance on a range of tasks that require sophisticated semantic knowledge (e.g., question answering and natural language inference), often even reaching human parity.
    To what extent do LMs capture the nuances of human conceptual knowledge and reasoning? Centering on this broad question, this dissertation uses core ideas in human semantic cognition as guiding principles and lays the groundwork for effective evaluation and improvement of conceptual understanding in LMs. In particular, I build on prior work that characterizes what semantic knowledge is made available in the behavior and representations of LMs, and extend it by proposing tests that focus on the functional consequences of acquiring basic semantic knowledge.
    I primarily focus on inductive generalization (Hayes & Heit, 2018), the unique ability of humans to rely on acquired conceptual knowledge to project or generalize novel information, as a context within which to analyze LMs' encoding of conceptual knowledge. I do this because the literature on inductive generalization contains a variety of empirical regularities that map to specific conceptual abstractions and shed light on how humans store, organize, and use conceptual knowledge. Before explicitly analyzing LMs for these empirical regularities, I test them in two other contexts that also feature inductive generalization. First, I test the extent to which LMs demonstrate typicality effects, a robust finding in the human categorization literature in which certain members of a category are considered more central to the category than others. Specifically, I test the behavior of 19 different LMs in two contexts where typicality effects modulate human behavior: 1) verification of sentences expressing taxonomic category membership, and 2) projection of novel properties from individual category members to the entire category. In both tests, LMs achieved positive but modest correlations with human typicality ratings, suggesting that they can, to a non-trivial extent, capture subtle differences between category members. Next, I propose a new benchmark to test the robustness of LMs in attributing properties to everyday concepts, and in making inductive leaps to endow novel concepts with properties. On testing 31 different LMs for these capacities, I find that while they can correctly attribute properties to everyday concepts and even predict the properties of novel concepts in simple settings, they struggle to do so robustly. Combined with the analyses of typicality effects, these results suggest that the ability of LMs to demonstrate impressive conceptual knowledge and reasoning behavior can be explained by their sensitivities to shallow predictive cues.
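    As an illustration of the typicality analysis described above (a sketch under assumed tooling, not the dissertation's code, models, or stimuli), the snippet below scores taxonomic sentences such as "A robin is a bird." with a causal LM and correlates those scores with made-up human typicality ratings, assuming the transformers, torch, and scipy packages.

        import torch
        from scipy.stats import spearmanr
        from transformers import AutoModelForCausalLM, AutoTokenizer

        tokenizer = AutoTokenizer.from_pretrained("gpt2")
        model = AutoModelForCausalLM.from_pretrained("gpt2")
        model.eval()

        def sentence_logprob(sentence: str) -> float:
            # Sum of token log-probabilities under the LM (higher = more expected by the model).
            ids = tokenizer(sentence, return_tensors="pt").input_ids
            with torch.no_grad():
                logits = model(ids).logits
            log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
            token_scores = log_probs.gather(-1, ids[:, 1:].unsqueeze(-1)).squeeze(-1)
            return token_scores.sum().item()

        # Illustrative members of the category "bird" with hypothetical typicality ratings (1-7 scale).
        members = ["robin", "sparrow", "pigeon", "penguin", "chicken"]
        human_typicality = [6.9, 6.5, 5.4, 3.7, 3.5]

        lm_scores = [sentence_logprob(f"A {m} is a bird.") for m in members]
        rho, p = spearmanr(lm_scores, human_typicality)
        print(f"Spearman correlation with human typicality: rho = {rho:.2f} (p = {p:.2f})")

    The positive but modest correlations reported above correspond to running this kind of comparison, with proper stimuli and human norms, across 19 models and both the sentence-verification and property-projection tasks.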