50,401 research outputs found
Variability, negative evidence, and the acquisition of verb argument constructions
We present a hierarchical Bayesian framework for modeling the acquisition of verb argument constructions. It embodies a domain-general approach to learning higher-level knowledge in the form of inductive constraints (or overhypotheses), and has been used to explain other aspects of language development such as the shape bias in learning object names. Here, we demonstrate that the same model captures several phenomena in the acquisition of verb constructions. Our model, like adults in a series of artificial language learning experiments, makes inferences about the distributional statistics of verbs on several levels of abstraction simultaneously. It also produces the qualitative learning patterns displayed by children over the time course of acquisition. These results suggest that the patterns of generalization observed in both children and adults could emerge from basic assumptions about the nature of learning. They also provide an example of a broad class of computational approaches that can resolve Baker's Paradox.
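The hierarchical inference described above can be sketched as a beta-binomial hierarchy in which a shared prior (the overhypothesis) is fit across verbs and then shapes predictions for a novel verb. The verb names, counts, and grid-search fitting below are illustrative assumptions, not the paper's actual model or data:

```python
from math import lgamma
import numpy as np

# Toy data: counts of (construction A, construction B) uses per verb.
# Hypothetical counts for illustration, not the paper's stimuli.
verbs = {"give": (9, 1), "send": (8, 2), "throw": (10, 0)}

def log_beta_binom(k, n, a, b):
    """Log Beta-Binomial likelihood of k successes in n trials under a Beta(a, b) prior."""
    return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
            + lgamma(k + a) + lgamma(n - k + b) - lgamma(n + a + b)
            + lgamma(a + b) - lgamma(a) - lgamma(b))

# Grid search over the shared Beta prior -- the "overhypothesis" learned
# across verbs (a crude empirical-Bayes stand-in for full hierarchical inference).
best = max(((a, b) for a in np.linspace(0.5, 20, 40)
                   for b in np.linspace(0.5, 20, 40)),
           key=lambda ab: sum(log_beta_binom(k, k + m, *ab)
                              for k, m in verbs.values()))
a, b = best
# Prediction for an unseen verb is shrunk toward the population-level distribution.
print("learned prior:", round(a, 2), round(b, 2))
print("P(construction A | new verb) =", round(a / (a + b), 3))
```

Because every observed verb favors construction A, the fitted overhypothesis pulls a novel verb's predicted usage toward A as well, which is the shrinkage behavior the framework relies on.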
Complexity over Uncertainty in Generalized Representational Information Theory (GRIT): A Structure-Sensitive General Theory of Information
What is information? Although researchers have used the construct of information liberally to refer to pertinent forms of domain-specific knowledge, relatively few have attempted to generalize and standardize the construct. Shannon and Weaver (1949) offered the best-known attempt at a quantitative generalization in terms of the number of discriminable symbols required to communicate the state of an uncertain event. This idea, although useful, does not capture the role that structural context and complexity play in the process of understanding an event as being informative. In what follows, we discuss the limitations and futility of any generalization (and particularly, Shannon’s) that is not based on the way that agents extract patterns from their environment. More specifically, we shall argue that agent concept acquisition, and not the communication of states of uncertainty, lies at the heart of generalized information, and that the best way of characterizing information is via the relative gain or loss in concept complexity that is experienced when a set of known entities (regardless of their nature or domain of origin) changes. We show that Representational Information Theory (RIT) perfectly captures this crucial aspect of information and conclude with the first generalization of RIT to continuous domains.
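The central idea, information as the relative change in concept complexity when a known set changes, can be sketched as follows. The compressed-length complexity measure here is a crude stand-in chosen only for illustration; RIT proper uses a structure-sensitive complexity measure, not compression:

```python
import zlib

def complexity(items):
    """Stand-in complexity: compressed length of the set's description.
    (RIT uses a structure-sensitive measure; this proxy is illustrative only.)"""
    return len(zlib.compress(",".join(sorted(items)).encode()))

def representational_info(before, after):
    """Relative gain or loss in complexity when the set of known entities changes."""
    c0, c1 = complexity(before), complexity(after)
    return (c0 - c1) / c0

# Hypothetical 'concept' as a set of patterned strings.
concept = {"aabb", "abab", "abba", "baab"}
# Removing an item changes how compressible ('patterned') the set is;
# the signed relative change plays the role of information carried by the change.
print(representational_info(concept, concept - {"baab"}))
```

The sign convention here (positive when the set gets simpler) is an assumption of the sketch; the point is only that information attaches to a *change* in a structured set, not to the uncertainty of a transmitted symbol.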
A Theoretical Analysis of Contrastive Unsupervised Representation Learning
Recent empirical works have successfully used unlabeled data to learn feature
representations that are broadly useful in downstream classification tasks.
Several of these methods are reminiscent of the well-known word2vec embedding
algorithm: leveraging availability of pairs of semantically "similar" data
points and "negative samples," the learner forces the inner product of
representations of similar pairs with each other to be higher on average than
with negative samples. The current paper uses the term contrastive learning for
such algorithms and presents a theoretical framework for analyzing them by
introducing latent classes and hypothesizing that semantically similar points
are sampled from the same latent class. This framework allows us to show
provable guarantees on the performance of the learned representations on the
average classification task that is comprised of a subset of the same set of
latent classes. Our generalization bound also shows that learned
representations can reduce (labeled) sample complexity on downstream tasks. We
conduct controlled experiments in both the text and image domains to support
the theory.
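The loss described above, which forces the inner product with a semantically similar point above that with a negative sample on average, can be sketched with a logistic surrogate. The synthetic data, batch shapes, and noise scale are assumptions for illustration:

```python
import numpy as np

def contrastive_loss(f_x, f_pos, f_neg):
    """Logistic contrastive loss: push sim(x, x+) above sim(x, x-) on average.
    f_x, f_pos, f_neg: (batch, dim) arrays of learned representations."""
    margin = np.sum(f_x * f_pos, axis=1) - np.sum(f_x * f_neg, axis=1)
    return float(np.mean(np.log1p(np.exp(-margin))))  # mean log(1 + e^{-margin})

rng = np.random.default_rng(0)
anchor = rng.normal(size=(4, 8))
positive = anchor + 0.1 * rng.normal(size=(4, 8))  # same latent class, perturbed
negative = rng.normal(size=(4, 8))                 # sampled from other classes
print(contrastive_loss(anchor, positive, negative))
```

Under the paper's latent-class hypothesis, the positive pair is drawn from the same latent class, so its inner product with the anchor is large and the loss is driven toward zero; swapping `positive` and `negative` makes the loss large.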
Levels of discontinuity, limit-computability, and jump operators
We develop a general theory of jump operators, which is intended to provide
an abstraction of the notion of "limit-computability" on represented spaces.
Jump operators also provide a framework with a strong categorical flavor for
investigating degrees of discontinuity of functions and hierarchies of sets on
represented spaces. We will provide a thorough investigation within this
framework of a hierarchy of Δ⁰₂-measurable functions between arbitrary
countably based T₀-spaces, which captures the notion of computing with
ordinal mind-change bounds. Our abstract approach not only raises new questions
but also sheds new light on previous results. For example, we introduce a
notion of "higher order" descriptive set theoretical objects, we generalize a
recent characterization of the computability theoretic notion of "lowness" in
terms of adjoint functors, and we show that our framework encompasses ordinal
quantifications of the non-constructiveness of Hilbert's finite basis theorem.
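The notion of limit-computability with an ordinal mind-change bound can be illustrated by a toy procedure whose guess is allowed to change at most once before stabilizing on the correct answer. The streaming formulation below is an illustrative assumption, not the paper's formal framework of jump operators on represented spaces:

```python
def limit_decide(stream):
    """Limit-computable decision with at most one mind change:
    guess 'no 1 ever appears', and revise once if a 1 is observed.
    Yields (current guess, mind changes so far) after each observation;
    the limit of the guesses is the correct answer."""
    guess, changes = False, 0
    for bit in stream:
        if bit == 1 and not guess:
            guess, changes = True, changes + 1
        yield guess, changes

# On any finite prefix the guess may be wrong, but it stabilizes,
# and the mind-change counter never exceeds the bound of 1.
obs = [0, 0, 1, 0]
print(list(limit_decide(obs))[-1])  # -> (True, 1)
```

The ordinal mind-change bound generalizes this: instead of a fixed finite budget of revisions, the procedure carries an ordinal counter that must strictly decrease at every mind change, which is the kind of quantitative non-constructiveness the abstract attaches to Hilbert's finite basis theorem.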
Kernel-based Information Criterion
This paper introduces Kernel-based Information Criterion (KIC) for model
selection in regression analysis. The novel kernel-based complexity measure in
KIC efficiently computes the interdependency between parameters of the model
using a variable-wise variance and yields selection of better, more robust
regressors. Experimental results show superior performance on both simulated
and real data sets compared to Leave-One-Out Cross-Validation (LOOCV),
kernel-based Information Complexity (ICOMP), and maximum log of marginal
likelihood in Gaussian Process Regression (GPR).
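Of the baselines named above, Leave-One-Out Cross-Validation is simple to sketch as a model-selection loop; the ridge regressor, candidate grid, and synthetic data are assumptions for illustration, and KIC itself is not reproduced here:

```python
import numpy as np

def loocv_score(X, y, fit_predict):
    """Leave-one-out CV squared error for a regressor
    (one of the baselines KIC is compared against)."""
    errs = []
    for i in range(len(y)):
        mask = np.arange(len(y)) != i            # hold out point i
        pred = fit_predict(X[mask], y[mask], X[i:i + 1])
        errs.append((pred[0] - y[i]) ** 2)
    return float(np.mean(errs))

def ridge(lam):
    """Closed-form ridge regression as the candidate regressor family."""
    def fit_predict(Xtr, ytr, Xte):
        w = np.linalg.solve(Xtr.T @ Xtr + lam * np.eye(Xtr.shape[1]), Xtr.T @ ytr)
        return Xte @ w
    return fit_predict

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=30)
# Select the regularization strength with the lowest LOOCV error.
best_lam = min([0.01, 0.1, 1.0, 10.0], key=lambda l: loocv_score(X, y, ridge(l)))
print("selected lambda:", best_lam)
```

An information criterion such as KIC replaces the n refits of this loop with a single complexity-penalized fit, which is where the claimed efficiency and robustness gains come from.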
Classical Langevin Dynamics for Model Hamiltonians
We propose a scheme for extending the model Hamiltonian method developed
originally for studying the equilibrium properties of complex perovskite
systems to include Langevin dynamics. The extension is based on Zwanzig's
treatment of nonlinear generalized Langevin's equations. The parameters
entering the equations of motion are to be determined by mapping from
first-principles calculations, as in the original model Hamiltonian method. The
scheme makes possible, in principle, the study of the dynamics and kinetics of
structural transformations inaccessible to the original model Hamiltonian
method. Moreover, we show that the equilibrium properties are governed by an
effective Hamiltonian which differs from that used in previous work by a term
which captures the coherent part of the previously ignored dynamical
interaction with the omitted degrees of freedom. We describe how the additional
information required for the Langevin equations can be obtained by a minor
extension of the previous mapping. (To appear in Physica Status Solidi.)
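The Langevin extension can be illustrated by integrating a Markovian Langevin equation for a single coordinate in a double-well potential, a toy stand-in for a structural-transformation coordinate. Zwanzig's generalized Langevin equation is non-Markovian, so this sketch shows only the memoryless special case, with all parameters assumed:

```python
import numpy as np

def langevin_step(x, v, force, gamma, kT, dt, rng, m=1.0):
    """One Euler-Maruyama step of the Markovian Langevin equation
    m dv = (F(x) - m*gamma*v) dt + sqrt(2*m*gamma*kT) dW."""
    noise = np.sqrt(2.0 * m * gamma * kT * dt) * rng.normal()
    v = v + (force(x) / m - gamma * v) * dt + noise / m
    return x + v * dt, v

# Double-well potential U(x) = (x^2 - 1)^2 with minima at x = +/-1;
# F(x) = -dU/dx = -4x(x^2 - 1). Friction gamma and temperature kT are
# the quantities the paper proposes to obtain by first-principles mapping.
force = lambda x: -4.0 * x * (x**2 - 1.0)
rng = np.random.default_rng(2)
x, v = 1.0, 0.0
for _ in range(20000):
    x, v = langevin_step(x, v, force, gamma=1.0, kT=0.2, dt=0.01, rng=rng)
print("final position:", x)
```

At low temperature the trajectory stays near one of the wells with occasional thermally activated hops, which is exactly the kind of transformation kinetics the abstract says the original equilibrium-only model Hamiltonian method could not reach.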