Constraining Implicit Space with Minimum Description Length: An Unsupervised Attention Mechanism across Neural Network Layers
Inspired by the adaptation phenomenon of neuronal firing, we propose the
regularity normalization (RN) as an unsupervised attention mechanism (UAM)
which computes the statistical regularity in the implicit space of neural
networks under the Minimum Description Length (MDL) principle. Treating the
neural network optimization process as a partially observable model selection
problem, UAM constrains the implicit space by a normalization factor, the
universal code length. We compute this universal code incrementally across
neural network layers and demonstrate its flexibility to incorporate data priors
such as top-down attention and other oracle information. Empirically, our
approach outperforms existing normalization methods in tackling limited,
imbalanced, and non-stationary input distributions in image classification,
classic control, procedurally generated reinforcement learning, generative
modeling, handwriting generation, and question answering tasks with various
neural network architectures. Lastly, UAM tracks dependency and critical
learning stages across layers and recurrent time steps of deep networks.
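As a concrete illustration of the idea, the MDL-based normalization can be sketched with a prequential (plug-in) Gaussian code standing in for the universal code; the function names and the simple per-sample scaling below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def prequential_code_length(samples):
    """Running prequential (plug-in) MDL code length, in nats, of a 1-D
    sample stream under a Gaussian model whose parameters are re-estimated
    from all previously seen samples."""
    samples = np.asarray(samples, dtype=float).ravel()
    total = 0.0
    mean, var = 0.0, 1.0  # prior predictive before any data is seen
    for t, x in enumerate(samples):
        # code length of x under the current predictive Gaussian
        total += 0.5 * np.log(2 * np.pi * var) + (x - mean) ** 2 / (2 * var)
        # update the plug-in estimates to include x
        seen = samples[: t + 1]
        mean = float(np.mean(seen))
        var = float(np.var(seen)) + 1e-6  # guard against zero variance
    return total

def regularity_normalize(activations, eps=1e-6):
    """Hypothetical normalization: scale a layer's activations by their
    per-sample code length, a crude proxy for statistical regularity."""
    per_sample = prequential_code_length(activations) / activations.size
    return activations / (abs(per_sample) + eps)
```

In the paper's framing the code would be accumulated incrementally per layer during training; here a single batch of activations is enough to show the shape of the computation.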
Null Models of Economic Networks: The Case of the World Trade Web
In all empirical-network studies, the observed properties of economic
networks are informative only if compared with a well-defined null model that
can quantitatively predict the behavior of such properties in constrained
graphs. However, predictions of the available null-model methods can be derived
analytically only under assumptions (e.g., sparseness of the network) that are
unrealistic for most economic networks like the World Trade Web (WTW). In this
paper we study the evolution of the WTW using a recently-proposed family of
null network models. The method allows one to obtain analytically the expected
value of any network statistic across the ensemble of networks that preserve, on
average, a set of local properties and are otherwise fully random. We compare
expected and observed properties of the WTW in the period 1950-2000, when
either the expected number of trade partners or total country trade is kept
fixed and equal to observed quantities. We show that, in the binary WTW,
node-degree sequences are sufficient to explain higher-order network properties
such as disassortativity and clustering-degree correlation, especially in the
last part of the sample. Conversely, in the weighted WTW, the observed sequence
of total country imports and exports is not sufficient to predict higher-order
patterns of the WTW. We discuss some important implications of these findings
for international-trade models.
Comment: 39 pages, 46 figures, 2 tables
The Narrow Conception of Computational Psychology
One particularly successful approach to modeling within cognitive science is computational psychology, which explores psychological processes by building and testing computational models against human data. In this paper, it is argued that a specific conception of computation, here called the ‘narrow conception’, has problematically limited the kinds of models, theories, and explanations offered within computational psychology. After raising two problems for the narrow conception, an alternative ‘wide approach’ to computational psychology is proposed.
Reanalyzing language expectations: Native language knowledge modulates the sensitivity to intervening cues during anticipatory processing
Issue Online: 21 September 2018
We investigated how native language experience shapes anticipatory language processing. Two groups of bilinguals (either Spanish or Basque natives) performed a word matching task (WordMT) and a picture matching task (PictureMT). They indicated whether the stimuli they visually perceived matched the noun they heard. Spanish noun endings were either diagnostic of the gender (transparent) or ambiguous (opaque). ERPs were time-locked to an intervening gender-marked determiner preceding the predicted noun. The determiner always agreed in gender with the following noun but could also introduce a mismatching noun, so it was not fully task diagnostic. Evoked brain activity time-locked to the determiner was taken to reflect updating/reanalysis of the task-relevant preactivated representation. We focused on the timing of this effect by comparing gender-congruent and gender-incongruent determiners. In the WordMT, both groups showed a late N400 effect. Crucially, only Basque natives displayed an earlier P200 effect for determiners preceding transparent nouns. In the PictureMT, both groups showed an early P200 effect for determiners preceding opaque nouns. The determiners of transparent nouns triggered a negative effect at ~430 ms in Spanish natives, but at ~550 ms in Basque natives. This pattern of results supports a "retracing hypothesis" according to which the neurocognitive system navigates through the intermediate (sublexical and lexical) linguistic representations available from previous processing to evaluate the need for an update of the linguistic expectation concerning a target lexical item.
Spanish Ministry of Economy and Competitiveness (MINECO), Agencia Estatal de Investigación (AEI), Fondo Europeo de Desarrollo Regional (FEDER) (grant PSI2015‐65694‐P to N. M.), Spanish Ministry of Economy and Competitiveness “Severo Ochoa” Programme for Centres/Units of Excellence in R&D (grant SEV‐2015‐490
Comparing Information-Theoretic Measures of Complexity in Boltzmann Machines
In the past three decades, many theoretical measures of complexity have been
proposed to help understand complex systems. In this work, for the first time,
we place these measures on a level playing field, to explore the qualitative
similarities and differences between them, and their shortcomings.
Specifically, using the Boltzmann machine architecture (a fully connected
recurrent neural network) with uniformly distributed weights as our model of
study, we numerically measure how complexity changes as a function of network
dynamics and network parameters. We apply an extension of one such
information-theoretic measure of complexity to understand incremental Hebbian
learning in Hopfield networks, a fully recurrent model of
autoassociative memory. In the course of Hebbian learning, the total
information flow reflects a natural upward trend in complexity as the network
attempts to learn more and more patterns.
Comment: 16 pages, 7 figures; Appears in Entropy, Special Issue "Information Geometry II"
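One standard information-theoretic complexity measure from this literature, the multi-information (total correlation), can be computed exactly for small Boltzmann machines by state enumeration. The sketch below, including a Hebbian weight construction for Hopfield-style storage, is a hedged illustration rather than the specific extended measure used in the paper:

```python
import itertools
import numpy as np

def hebbian_weights(patterns):
    """Hebbian (outer-product) storage of +/-1 patterns, Hopfield-style."""
    P = np.asarray(patterns, dtype=float)
    n = P.shape[1]
    W = P.T @ P / n
    np.fill_diagonal(W, 0.0)  # no self-connections
    return W

def multi_information(W, b):
    """Exact multi-information (total correlation), in bits, of a small
    Boltzmann machine with +/-1 units: I = sum_i H(s_i) - H(s).
    Brute-force enumeration, so only feasible up to n ~ 15."""
    n = len(b)
    states = np.array(list(itertools.product([-1.0, 1.0], repeat=n)))
    energy = -0.5 * np.einsum('si,ij,sj->s', states, W, states) - states @ b
    logp = -energy - np.logaddexp.reduce(-energy)  # log Boltzmann probabilities
    p = np.exp(logp)
    joint_H = -np.sum(p * logp) / np.log(2)  # joint entropy in bits
    marg_H = 0.0
    for i in range(n):
        pi = np.clip(p[states[:, i] == 1].sum(), 1e-12, 1 - 1e-12)
        marg_H -= pi * np.log2(pi) + (1 - pi) * np.log2(1 - pi)
    return marg_H - joint_H
```

For weights built by `hebbian_weights`, one would expect the multi-information to grow as more patterns are stored, mirroring the upward trend in complexity the abstract describes; independent units (zero couplings) give exactly zero.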