Learning An Invariant Speech Representation
Recognition of speech, and in particular the ability to generalize and learn
from small sets of labelled examples like humans do, depends on an appropriate
representation of the acoustic input. We formulate the problem of finding
robust speech features for supervised learning with small sample complexity as
a problem of learning representations of the signal that are maximally
invariant to intraclass transformations and deformations. We propose an
extension of a theory for unsupervised learning of invariant visual
representations to the auditory domain and empirically evaluate its validity
for voiced speech sound classification. Our version of the theory requires the
memory-based, unsupervised storage of acoustic templates -- such as specific
phones or words -- together with all the transformations of each that normally
occur. A quasi-invariant representation for a speech segment can be obtained by
projecting it to each template orbit, i.e., the set of transformed signals, and
computing the associated one-dimensional empirical probability distributions.
The computations can be performed by modules of filtering and pooling, and
extended to hierarchical architectures. In this paper, we apply a single-layer,
multicomponent representation for phonemes and demonstrate improved accuracy
and decreased sample complexity for vowel classification compared to standard
spectral, cepstral and perceptual features. (Comment: CBMM Memo No. 022, 5 pages, 2 figures)
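The orbit-projection idea in the abstract above can be sketched numerically: store an "orbit" of transformed copies of each template, project an input signal onto every copy, and keep only the empirical distribution of the projections, which is quasi-invariant to the stored transformations. A minimal NumPy sketch, with random vectors standing in for real acoustic templates (all names, shapes, and the histogram binning are illustrative, not the paper's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: each stored template's "orbit" is the set of its
# transformed versions (e.g. pitch- or rate-shifted copies of a phone).
# Here random vectors stand in for real acoustic templates.
n_templates, n_transforms, dim = 4, 32, 64
orbits = rng.standard_normal((n_templates, n_transforms, dim))

def invariant_signature(x, orbits, bins=10):
    """Project x onto every transformed template, then summarize each
    orbit by the empirical distribution (histogram) of the projections.
    The histogram is unchanged if x is replaced by a stored transform
    of itself, which is the source of the quasi-invariance."""
    x = x / np.linalg.norm(x)
    feats = []
    for orbit in orbits:
        proj = orbit @ x  # one dot product per stored transformation
        hist, _ = np.histogram(proj, bins=bins, range=(-3.0, 3.0), density=True)
        feats.append(hist)
    return np.concatenate(feats)

x = rng.standard_normal(dim)
sig = invariant_signature(x, orbits)
print(sig.shape)  # (n_templates * bins,) = (40,)
```

The dot products play the role of the "modules of filtering" and the histogram plays the role of the pooling stage mentioned in the abstract.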
Seeing things
This paper is concerned with the problem of attaching meaningful symbols to aspects of the visible environment in machine and biological vision. It begins with a review of some of the arguments commonly used to support either the 'symbolic' or the 'behaviourist' approach to vision. Having explored these avenues without arriving at a satisfactory conclusion, we then present a novel argument, which starts from the question: given a functional description of a vision system, when could it be said to support a symbolic interpretation? We argue that to attach symbols to a system, its behaviour must exhibit certain well-defined regularities in its response to its visual input, and that these are best described in terms of invariance and equivariance to transformations that act in the world and induce corresponding changes of the vision system's state. This approach is illustrated with a brief exploration of the problem of identifying and acquiring visual representations having these symmetry properties, which also highlights the advantages of using an 'active' model of vision.
Empiricism without Magic: Transformational Abstraction in Deep Convolutional Neural Networks
In artificial intelligence, recent research has demonstrated the remarkable potential of Deep Convolutional Neural Networks (DCNNs), which seem to exceed state-of-the-art performance in new domains weekly, especially on the sorts of very difficult perceptual discrimination tasks that skeptics thought would remain beyond the reach of artificial intelligence. However, it has proven difficult to explain why DCNNs perform so well. In philosophy of mind, empiricists have long suggested that complex cognition is based on information derived from sensory experience, often appealing to a faculty of abstraction. Rationalists have frequently complained, however, that empiricists never adequately explained how this faculty of abstraction actually works. In this paper, I tie these two questions together, to the mutual benefit of both disciplines. I argue that the architectural features that distinguish DCNNs from earlier neural networks allow them to implement a form of hierarchical processing that I call “transformational abstraction”. Transformational abstraction iteratively converts sensory-based representations of category exemplars into new formats that are increasingly tolerant to “nuisance variation” in the input. Reflecting on the way that DCNNs leverage a combination of linear and non-linear processing to accomplish this feat efficiently allows us to understand how the brain is capable of bi-directional travel between exemplars and abstractions, addressing longstanding problems in empiricist philosophy of mind. I end by considering the prospects for future research on DCNNs, arguing that transformational abstraction, rather than simply implementing 1980s connectionism with more brute-force computation, counts as a qualitatively distinct form of processing with philosophical and psychological significance, because it is better suited to depict the generic mechanism responsible for this important kind of psychological processing in the brain.
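The combination of linear and non-linear processing described above can be illustrated in a few lines: a convolution (linear), a ReLU (non-linear), and max-pooling together yield features that are tolerant to a small translation of the input, a toy instance of "nuisance variation". A minimal 1-D sketch (the filter, input, and sizes are all illustrative):

```python
import numpy as np

def conv1d_valid(x, k):
    """Linear step: 1-D valid cross-correlation with filter k."""
    return np.array([x[i:i + len(k)] @ k for i in range(len(x) - len(k) + 1)])

def stage(x, k, pool=4):
    """One DCNN-style stage: convolution (linear), ReLU (non-linear),
    then non-overlapping max-pooling."""
    a = np.maximum(conv1d_valid(x, k), 0.0)
    return np.array([a[i:i + pool].max()
                     for i in range(0, len(a) - pool + 1, pool)])

k = np.array([1.0, -1.0, 1.0])   # toy edge-like filter (illustrative)
x = np.zeros(20)
x[6] = 1.0                       # an "exemplar"
x_shifted = np.roll(x, 1)        # nuisance variation: a one-sample shift
same = np.allclose(stage(x, k), stage(x_shifted, k))
print(same)  # True: the pooled features tolerate the shift
```

Stacking such stages is what makes the tolerance to nuisance variation grow with depth.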
Brain Dynamics across levels of Organization
After presenting evidence that the electrical activity recorded from the brain surface can reflect metastable state transitions of neuronal configurations at the mesoscopic level, I will suggest that their patterns may correspond to the distinctive spatio-temporal activity in the Dynamic Core (DC) and the Global Neuronal Workspace (GNW), respectively, in the models of the Edelman group on the one hand, and of Dehaene-Changeux on the other. In both cases, the recursively reentrant activity flow in intra-cortical and cortical-subcortical neuron loops plays an essential and distinct role. Reasons will be given for viewing the temporal characteristics of this activity flow as a signature of Self-Organized Criticality (SOC), notably in reference to the dynamics of neuronal avalanches. This point of view enables the use of statistical-physics approaches for exploring phase transitions, scaling and universality properties of the DC and GNW, with relevance to the macroscopic electrical activity in EEG and EMG.
Measuring changes in preferences and perception due to the entry of a new brand with choice data
Context effects can have a major influence on brand choice behavior after the introduction of a new product. Based on behavioral literature, several hypotheses about the effects of a new brand on perception, preferences and choice behavior can be derived, but studies with real choice data are still lacking. We employ an internal market structure analysis to measure context effects caused by a new product in scanner panel data, and to discriminate between alternative theoretical explanations. An empirical investigation reveals strong support for categorization effects and changes in perception, which affect customers in two out of five segments. Keywords: context effects, categorization, brand choice models, new brand introduction.
How is Gaze Influenced by Image Transformations? Dataset and Model
Data size is the bottleneck for developing deep saliency models, because
collecting eye-movement data is very time-consuming and expensive. Most
current studies on human attention and saliency modeling have used
high-quality stereotyped stimuli. In the real world, however, captured images
undergo various types of transformations. Can we use these transformations to
augment existing saliency datasets? Here, we first create a novel saliency
dataset including
fixations of 10 observers over 1900 images degraded by 19 types of
transformations. Second, by analyzing eye movements, we find that observers
look at different locations over transformed versus original images. Third, we
utilize the new data over transformed images, called data augmentation
transformation (DAT), to train deep saliency models. We find that
label-preserving DATs with negligible impact on human gaze boost saliency
prediction, whereas some other DATs that severely impact human gaze degrade
performance. These label-preserving, valid augmentation transformations provide
a solution to enlarge existing saliency datasets. Finally, we introduce a novel
saliency model based on generative adversarial network (dubbed GazeGAN). A
modified U-Net is proposed as the generator of GazeGAN, which combines
classic skip connections with a novel center-surround connection (CSC) in
order to leverage multi-level features. We also propose a histogram loss based
on Alternative Chi Square Distance (ACS HistLoss) to refine the saliency map in
terms of luminance distribution. Extensive experiments and comparisons over 3
datasets indicate that GazeGAN achieves the best performance in terms of
popular saliency evaluation metrics, and is more robust to various
perturbations. Our code and data are available at:
https://github.com/CZHQuality/Sal-CFS-GAN
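The abstract does not give the exact form of the ACS HistLoss, but a plausible reading, comparing luminance histograms of predicted and ground-truth saliency maps with an alternative chi-square distance, can be sketched as follows (the binning, the distance formula, and all names are assumptions, not the paper's definition):

```python
import numpy as np

def lum_histogram(sal_map, bins=16):
    """Normalized luminance histogram of a saliency map in [0, 1].
    (A real training loss would need a differentiable soft binning.)"""
    hist, _ = np.histogram(sal_map, bins=bins, range=(0.0, 1.0))
    return hist / max(hist.sum(), 1)

def acs_hist_loss(pred, target, bins=16, eps=1e-8):
    """Hypothetical histogram loss using an alternative chi-square
    distance between predicted and ground-truth luminance histograms."""
    p = lum_histogram(pred, bins)
    q = lum_histogram(target, bins)
    return float(np.sum(2.0 * (p - q) ** 2 / (p + q + eps)))

rng = np.random.default_rng(1)
pred = rng.random((32, 32))
print(acs_hist_loss(pred, pred))                  # identical maps -> 0.0
print(acs_hist_loss(pred, rng.random((32, 32))))  # non-negative otherwise
```

Because the loss only compares distributions, it constrains the overall luminance statistics of the predicted map without penalizing spatially local differences; the paper's actual definition is in the linked repository.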
Metastability, Criticality and Phase Transitions in brain and its Models
This essay extends the previously deposited paper "Oscillations, Metastability and Phase Transitions" to incorporate the theory of self-organized criticality. The twin concepts of scaling and universality from the theory of nonequilibrium phase transitions are applied to the role of reentrant activity in neural circuits of the cerebral cortex and subcortical neural structures.
Nonlinear brain dynamics as macroscopic manifestation of underlying many-body field dynamics
Neural activity patterns related to behavior occur at many scales in time and
space from the atomic and molecular to the whole brain. Here we explore the
feasibility of interpreting neurophysiological data in the context of many-body
physics by using tools that physicists have devised to analyze comparable
hierarchies in other fields of science. We focus on a mesoscopic level that
offers a multi-step pathway between the microscopic functions of neurons and
the macroscopic functions of brain systems revealed by hemodynamic imaging. We
use electroencephalographic (EEG) records collected from high-density electrode
arrays fixed on the epidural surfaces of primary sensory and limbic areas in
rabbits and cats trained to discriminate conditioned stimuli (CS) in the
various modalities. Analyzing the EEG signals at high temporal resolution with
the Hilbert transform gives evidence for diverse intermittent spatial patterns
of amplitude modulation (AM) and phase modulation (PM) of carrier waves that
repeatedly re-synchronize
in the beta and gamma ranges at near zero time lags over long distances. The
dominant mechanism for neural interactions by axodendritic synaptic
transmission should impose distance-dependent delays on the EEG oscillations
owing to finite propagation velocities. It does not. EEGs instead show evidence
for anomalous dispersion: the existence in neural populations of a low velocity
range of information and energy transfers, and a high velocity range of the
spread of phase transitions. This distinction labels the phenomenon but does
not explain it. In this report we explore the analysis of these phenomena using
concepts of energy dissipation, the maintenance by cortex of multiple ground
states corresponding to AM patterns, and the exclusive selection by spontaneous
breakdown of symmetry (SBS) of single states in sequences. (Comment: 31 pages)
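The Hilbert-transform step mentioned above, extracting the amplitude (AM) and phase (PM) modulations of a carrier wave, can be sketched on a synthetic signal (the sampling rate, carrier frequency, and modulation below are illustrative, not taken from the recordings):

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via the FFT (the same construction as
    scipy.signal.hilbert): zero the negative frequencies, double the
    positive ones, and inverse-transform."""
    n = len(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * h)

fs = 500.0                          # assumed sampling rate, Hz
t = np.arange(0, 2.0, 1.0 / fs)
# Toy "EEG": a 40 Hz gamma-band carrier with a slow amplitude modulation.
signal = (1.0 + 0.5 * np.sin(2 * np.pi * 2 * t)) * np.cos(2 * np.pi * 40 * t)
z = analytic_signal(signal)
am = np.abs(z)                      # amplitude modulation (envelope)
pm = np.unwrap(np.angle(z))         # phase modulation (unwrapped phase)
inst_freq = np.diff(pm) * fs / (2 * np.pi)
print(round(float(np.median(inst_freq)), 1))  # recovers the 40.0 Hz carrier
```

On multichannel recordings the same decomposition, applied per electrode, is what yields the spatial AM and PM patterns discussed in the abstract.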
Invariant template matching in systems with spatiotemporal coding: a vote for instability
We consider the design of a pattern recognition system that matches templates to
images, both of which are spatially sampled and encoded as temporal sequences.
The image is subject to a combination of various perturbations. These include
ones that can be modeled as parameterized uncertainties such as image blur,
luminance, translation, and rotation as well as unmodeled ones. Biological and
neural systems require that these perturbations be processed through a minimal
number of channels by simple adaptation mechanisms. We found that the most
suitable mathematical framework to meet this requirement is that of weakly
attracting sets. This framework provides us with a normative and unifying
solution to the pattern recognition problem. We analyze the consequences of its
explicit implementation in neural systems. Several properties inherent to the
systems designed in accordance with our normative mathematical argument
coincide with known empirical facts. This is illustrated in mental rotation,
visual search and blur/intensity adaptation. We demonstrate how our results can
be applied to a range of practical problems in template matching and pattern
recognition. (Comment: 52 pages, 12 figures)