Spoken word recognition without a TRACE
How do we map the rapid input of spoken language onto phonological and lexical representations over time? Attempts at psychologically tractable computational models of spoken word recognition tend either to ignore time or to transform the temporal input into a spatial representation. TRACE, a connectionist model with broad and deep coverage of speech perception and spoken word recognition phenomena, takes the latter approach, using exclusively time-specific units at every level of representation. TRACE reduplicates featural, phonemic, and lexical inputs at every time step in a large memory trace, with rich interconnections (excitatory forward and backward connections between levels and inhibitory links within levels). As the length of the memory trace is increased, or as the phoneme and lexical inventory of the model is increased to a realistic size, this reduplication of time-specific (temporal-position-specific) units leads to a dramatic proliferation of units and connections, raising the question of whether a more efficient approach is possible. Our starting point is the observation that models of visual object recognition (including visual word recognition) have grappled with the problem of spatial invariance, and arrived at solutions other than a fully reduplicative strategy like that of TRACE. This inspires a new model of spoken word recognition that combines time-specific phoneme representations similar to those in TRACE with higher-level representations based on string kernels: temporally independent (time-invariant) diphone and lexical units. This reduces the number of necessary units and connections by several orders of magnitude relative to TRACE. Critically, we compare the new model to TRACE on a set of key phenomena, demonstrating that the new model inherits much of the behavior of TRACE and that the drastic computational savings do not come at the cost of explanatory power.
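The time-invariance idea behind the string-kernel representation can be illustrated with a minimal sketch: a word is encoded as a bag of its adjacent phoneme pairs (diphones), so the same word yields the same representation wherever it occurs in the input. This is an illustrative toy, not the paper's actual model; the phoneme symbols and the `diphone_vector` helper are assumptions for the example.

```python
from collections import Counter

def diphone_vector(phonemes):
    """Encode a phoneme sequence as a bag of adjacent phoneme pairs.

    Unlike TRACE's time-specific units, this representation does not
    depend on when in the input stream the word occurs, so lexical
    units need not be reduplicated at every time step.
    """
    return Counter(zip(phonemes, phonemes[1:]))

# "cat" heard early or late in the stream maps to the same vector.
cat_early = diphone_vector(["k", "ae", "t"])
cat_late = diphone_vector(["k", "ae", "t"])
assert cat_early == cat_late
```

Because the representation is a multiset rather than a position-indexed trace, the unit count grows with the diphone inventory instead of with the length of the memory trace.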
Probabilistic symmetries and invariant neural networks
Treating neural network inputs and outputs as random variables, we
characterize the structure of neural networks that can be used to model data
that are invariant or equivariant under the action of a compact group. Much
recent research has been devoted to encoding invariance under symmetry
transformations into neural network architectures, in an effort to improve the
performance of deep neural networks in data-scarce, non-i.i.d., or unsupervised
settings. By considering group invariance from the perspective of probabilistic
symmetry, we establish a link between functional and probabilistic symmetry,
and obtain generative functional representations of probability distributions
that are invariant or equivariant under the action of a compact group. Our
representations completely characterize the structure of neural networks that
can be used to model such distributions and yield a general program for
constructing invariant stochastic or deterministic neural networks. We
demonstrate that examples from the recent literature are special cases, and
develop the details of the general program for exchangeable sequences and
arrays.
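One of the special cases the abstract alludes to can be sketched concretely: a sum-pooling network of the form f(x) = rho(sum_i phi(x_i)), which is invariant under the action of the permutation group on the elements of an exchangeable sequence. The weights and architecture below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 8))  # weights of phi, shared across all elements
W2 = rng.normal(size=(8, 1))  # weights of the readout rho

def invariant_net(x):
    """Permutation-invariant network: f(x) = rho(sum_i phi(x_i)).

    Sharing phi across elements and pooling with a symmetric function
    (the sum) makes the output invariant to any reordering of the rows.
    """
    h = np.tanh(x @ W1)                 # apply phi elementwise
    pooled = h.sum(axis=0)              # symmetric pooling over the set
    return float(np.tanh(pooled @ W2))  # apply rho to the pooled summary

x = rng.normal(size=(5, 3))
perm = rng.permutation(5)
assert np.isclose(invariant_net(x), invariant_net(x[perm]))
```

Replacing the sum with per-element outputs that permute along with the input would give the equivariant, rather than invariant, case.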
Symmetries and Discriminability in Feedforward Network Architectures
This paper investigates the effects of introducing symmetries into feedforward neural networks in what are termed symmetry networks. This technique allows more efficient training for problems in which we require the output of a network to be invariant under a set of transformations of the input. The particular problem of graph recognition is considered. In this case the network is designed to deliver the same output for isomorphic graphs. This raises the question of which inputs can be distinguished by such architectures. A theorem characterizing when two inputs can be distinguished by a symmetry network is given. As a consequence, a particular network design is shown to be able to distinguish nonisomorphic graphs if and only if the graph reconstruction conjecture holds.
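The invariance requirement for graph recognition can be sketched with a toy readout that, by construction, gives identical outputs for isomorphic graphs. This is an assumed illustration of the invariance property only, not the paper's network design, and such a simple invariant cannot distinguish all nonisomorphic graphs.

```python
import numpy as np

def graph_invariant(adj):
    """Toy permutation-invariant readout over an adjacency matrix.

    Reducing the graph to its sorted degree sequence before a fixed
    nonlinearity guarantees equal outputs for isomorphic graphs, in the
    spirit of weight-shared symmetry networks. (Illustrative only: many
    nonisomorphic graphs share a degree sequence.)
    """
    degrees = np.sort(adj.sum(axis=0))  # node degrees, order removed
    return float(np.tanh(degrees).sum())

A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])               # a path on three nodes
P = np.eye(3)[[2, 0, 1]]               # a permutation matrix (relabeling)
assert np.isclose(graph_invariant(A), graph_invariant(P @ A @ P.T))
```

The theoretical question the paper addresses is exactly how much discriminative power is lost by demanding this kind of invariance, which is where the link to the graph reconstruction conjecture enters.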