50,401 research outputs found

    Variability, negative evidence, and the acquisition of verb argument constructions

    We present a hierarchical Bayesian framework for modeling the acquisition of verb argument constructions. It embodies a domain-general approach to learning higher-level knowledge in the form of inductive constraints (or overhypotheses), and has been used to explain other aspects of language development such as the shape bias in learning object names. Here, we demonstrate that the same model captures several phenomena in the acquisition of verb constructions. Our model, like adults in a series of artificial language learning experiments, makes inferences about the distributional statistics of verbs on several levels of abstraction simultaneously. It also produces the qualitative learning patterns displayed by children over the time course of acquisition. These results suggest that the patterns of generalization observed in both children and adults could emerge from basic assumptions about the nature of learning. They also provide an example of a broad class of computational approaches that can resolve Baker's Paradox.

    Complexity over Uncertainty in Generalized Representational Information Theory (GRIT): A Structure-Sensitive General Theory of Information

    What is information? Although researchers have used the construct of information liberally to refer to pertinent forms of domain-specific knowledge, relatively few have attempted to generalize and standardize the construct. Shannon and Weaver (1949) offered the best known attempt at a quantitative generalization in terms of the number of discriminable symbols required to communicate the state of an uncertain event. This idea, although useful, does not capture the role that structural context and complexity play in the process of understanding an event as being informative. In what follows, we discuss the limitations and futility of any generalization (and particularly, Shannon’s) that is not based on the way that agents extract patterns from their environment. More specifically, we shall argue that agent concept acquisition, and not the communication of states of uncertainty, lies at the heart of generalized information, and that the best way of characterizing information is via the relative gain or loss in concept complexity that is experienced when a set of known entities (regardless of their nature or domain of origin) changes. We show that Representational Information Theory perfectly captures this crucial aspect of information and conclude with the first generalization of Representational Information Theory (RIT) to continuous domains.

    A Theoretical Analysis of Contrastive Unsupervised Representation Learning

    Recent empirical works have successfully used unlabeled data to learn feature representations that are broadly useful in downstream classification tasks. Several of these methods are reminiscent of the well-known word2vec embedding algorithm: leveraging availability of pairs of semantically "similar" data points and "negative samples," the learner forces the inner product of representations of similar pairs with each other to be higher on average than with negative samples. The current paper uses the term contrastive learning for such algorithms and presents a theoretical framework for analyzing them by introducing latent classes and hypothesizing that semantically similar points are sampled from the same latent class. This framework allows us to show provable guarantees on the performance of the learned representations on the average classification task that is comprised of a subset of the same set of latent classes. Our generalization bound also shows that learned representations can reduce (labeled) sample complexity on downstream tasks. We conduct controlled experiments in both the text and image domains to support the theory. Comment: 19 pages, 5 figures

    Levels of discontinuity, limit-computability, and jump operators

    We develop a general theory of jump operators, which is intended to provide an abstraction of the notion of "limit-computability" on represented spaces. Jump operators also provide a framework with a strong categorical flavor for investigating degrees of discontinuity of functions and hierarchies of sets on represented spaces. We will provide a thorough investigation within this framework of a hierarchy of $\Delta^0_2$-measurable functions between arbitrary countably based $T_0$-spaces, which captures the notion of computing with ordinal mind-change bounds. Our abstract approach not only raises new questions but also sheds new light on previous results. For example, we introduce a notion of "higher order" descriptive set theoretical objects, we generalize a recent characterization of the computability theoretic notion of "lowness" in terms of adjoint functors, and we show that our framework encompasses ordinal quantifications of the non-constructiveness of Hilbert's finite basis theorem.

    Kernel-based Information Criterion

    This paper introduces Kernel-based Information Criterion (KIC) for model selection in regression analysis. The novel kernel-based complexity measure in KIC efficiently computes the interdependency between parameters of the model using a variable-wise variance and yields selection of better, more robust regressors. Experimental results show superior performance on both simulated and real data sets compared to Leave-One-Out Cross-Validation (LOOCV), kernel-based Information Complexity (ICOMP), and maximum log of marginal likelihood in Gaussian Process Regression (GPR). Comment: We modified the reference 17, and the subcaptions of Figure

    Classical Langevin Dynamics for Model Hamiltonians

    We propose a scheme for extending the model Hamiltonian method developed originally for studying the equilibrium properties of complex perovskite systems to include Langevin dynamics. The extension is based on Zwanzig's treatment of nonlinear generalized Langevin's equations. The parameters entering the equations of motion are to be determined by mapping from first-principles calculations, as in the original model Hamiltonian method. The scheme makes possible, in principle, the study of the dynamics and kinetics of structural transformations inaccessible to the original model Hamiltonian method. Moreover, we show that the equilibrium properties are governed by an effective Hamiltonian which differs from that used in previous work by a term which captures the coherent part of the previously ignored dynamical interaction with the omitted degrees of freedom. We describe how the additional information required for the Langevin equations can be obtained by a minor extension of the previous mapping. Comment: 5 pages, no figures, to appear in Physica Status Solidi; replacement acknowledges funding agenc