51 research outputs found
Recommended from our members
Biologically inspired speaker verification
Speaker verification is an active research problem that has been addressed using a variety of different classification techniques. However, in general, methods inspired by the human auditory system tend to show better verification performance than other methods. In this thesis three biologically inspired speaker verification algorithms are presented.
The Evolution of Language Universals: Optimal Design and Adaptation
Inquiry into the evolution of syntactic universals is hampered by severe limitations on
the available evidence. Theories of selective function nevertheless lead to predictions of
local optimality that can be tested scientifically. This thesis refines a diagnostic,
originally proposed by Parker and Maynard Smith (1990), for identifying selective
functions on this basis and applies it to the evolution of two syntactic universals: (1) the
distinction between open and closed lexical classes, and (2) nested constituent structure.
In the case of the former, it is argued that the selective role of the closed class items is
primarily to minimise the amount of redundancy in the lexicon. In the case of the latter,
the emergence of nested phrase structure is argued to have been a by-product of
selection for the ability to perform insertion operations on sequences - a function that
plausibly pre-dated the emergence of modern language competence. The evidence for
these claims is not just that these properties perform plausibly fitness-related functions,
but that they appear to perform them in a way that is improbably optimal.
A number of interesting findings follow when examining the selective role of
the closed classes. In particular, case, agreement and the requirement that sentences
have subjects are expected consequences of an optimised lexicon, the theory thereby
relating these properties to natural selection for the first time. It also motivates the view
that language variation is confined to parameters associated with closed class items, in
turn explaining why parameter conflicts fail to arise in bilingualism.
The simplest representation of sequences that is optimised for efficient
insertions can represent both nested constituent structure and long-distance
dependencies in a unified way, thus suggesting that movement is intrinsic to the
representation of constituency rather than an 'imperfection'. The basic structure of
phrases also follows from this representation and helps to explain the interaction
between case and theta assignment. These findings bring together a surprising array of
phenomena, reinforcing its correctness as the representational basis of syntactic
structures.
The diagnostic overcomes shortcomings in the approach of Pinker and Bloom
(1990), who argued that the appearance of 'adaptive complexity' in the design of a trait
could be used as evidence of its selective function, but there is no reason to expect the
refinements of natural selection to increase complexity in any given case.
Optimality considerations are also applied in this thesis to filter theories of the
nature of unobserved linguistic representations as well as theories of their functions. In
this context, it is argued that, despite Chomsky's (1995) resistance to the idea, it is
possible to motivate the guiding principles of the Minimalist Program in terms of
evolutionary optimisation, especially if we allow the possibility that properties of
language were selected for non-communicative functions and that redundancy is
sometimes costly rather than beneficial.
Learning Attention Mechanisms and Context: An Investigation into Vision and Emotion
Attention mechanisms for context modelling are becoming ubiquitous in neural architectures in machine learning. An attention mechanism filters out information that is irrelevant to a given task and focuses on learning task-dependent fixation points or regions. Attention mechanisms also raise questions about a given task: 'what' to learn and 'where/how' to learn it for task-specific context modelling. Here, the context is the set of conditional variables instrumental in deciding the categorical distribution for the given data. A further question is why learning task-specific context is necessary. To answer these questions, this thesis explores context modelling in the vision and emotion domains using attention mechanisms with different hierarchical structures. The three main goals of this thesis are building superior classifiers using attention-based deep neural networks (DNNs), investigating the role of context modelling in the given tasks, and developing a framework for interpreting hierarchies and attention in deep attention networks. In the vision domain, gesture and posture recognition tasks in diverse environments are chosen. In the emotion domain, visual and speech emotion recognition tasks are chosen. These tasks are selected for their sequential properties, which suit the modelling of a spatiotemporal context. One of the key challenges from a machine learning standpoint is to extract patterns that bear maximum correlation with the task-relevant information encoded in a signal while being as insensitive as possible to other types of information carried by that signal. A possible way to overcome this problem is to learn task-dependent representations. To achieve that, novel spatiotemporal context modelling networks and mixture of multi-view attention (MOMA) networks are proposed using bidirectional long short-term memory (BLSTM), convolutional neural network (CNN), capsule and attention networks.
A framework has been proposed to interpret the internal attention states with respect to the given task. The results of the classifiers in the assigned tasks are compared with state-of-the-art DNNs, and the proposed classifiers achieve superior results. The context in speech emotion recognition is explored in depth with the attention interpretation framework, which shows that the proposed model can assign word importance based on acoustic context. Furthermore, it has been observed that the internal states of the attention correlate with human perception of acoustic cues for speech emotion recognition. Overall, the results demonstrate superior classifiers and context learning models with interpretable frameworks. The findings are very important for speech emotion recognition systems. In this thesis, not only are better models produced, but the interpretability of those models is also explored and their internal states are analysed. The phones and words are aligned with the attention vectors, and it is seen that vowel sounds are more important than consonants for defining the acoustic cues of emotion. How these approaches use word importance to predict emotions is also demonstrated by visualising the attention weights over the words. In a broader perspective, the findings from the thesis about gesture, posture and emotion recognition may be helpful in tasks like human-robot interaction (HRI) and conversational artificial agents (such as Siri, Alexa). Communication is grounded in symbolic and sub-symbolic cues of intent from vision, audio or haptics. The understanding of intent depends heavily on reasoning about the situational context. Emotion, i.e. speech and visual emotion, provides context to a situation, and it is a deciding factor in the response generation.
Emotional intelligence and information from vision, audio and other modalities are essential for making human-human and human-robot communication more natural and feedback-driven.
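As a rough illustration of attention as task-dependent weighting, the sketch below pools a sequence of feature vectors using dot-product attention. It is not the thesis's MOMA network: the function names, the `query` vector and the toy frames are assumptions made purely for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames, query):
    """Weight each frame by its dot-product relevance to a query vector.

    frames: list of equal-length feature vectors (e.g. one per time step).
    query:  hypothetical task vector; in a trained network it would be learned.
    Returns (weights, context), where context is the weighted sum of frames.
    """
    scores = [sum(f_i * q_i for f_i, q_i in zip(frame, query)) for frame in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    context = [sum(w * frame[d] for w, frame in zip(weights, frames))
               for d in range(dim)]
    return weights, context


# Toy example: three "time steps" with two features each (made-up values).
frames = [[1.0, 0.0], [0.0, 1.0], [3.0, 0.0]]
query = [1.0, 0.0]
weights, context = attention_pool(frames, query)
```

Inspecting `weights` is the simplest form of the interpretation the abstract mentions: the largest weight marks the time step the model treated as most relevant to the query.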
Audio processing on constrained devices
This thesis discusses the future of smart business applications on mobile phones
and the integration of voice interface across several business applications. It proposes
a framework that provides speech processing support for business applications
on mobile phones. The framework uses Gaussian Mixture Models (GMM)
for low-enrollment speaker recognition and limited vocabulary speech recognition.
Algorithms are presented for pre-processing of audio signals into different categories
and for start and end point detection. A method is proposed for speech processing
that uses Mel Frequency Cepstral Coefficients (MFCC) as primary feature for extraction.
In addition, optimization schemes are developed to improve performance,
and overcome constraints of a mobile phone. Experimental results are presented
for some prototype applications that evaluate the performance of computationally
expensive algorithms on constrained hardware. The thesis concludes by discussing
the scope for improvement for the work done in this thesis and future directions in
which this work could possibly be extended.
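The abstract does not say how start and end points are detected. One common lightweight approach, shown here only as an assumed sketch, is to threshold the short-time energy of fixed-length frames. The function names, frame length and threshold are hypothetical and are not taken from the thesis.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def detect_endpoints(samples, frame_len=4, threshold=0.01):
    """Return (start, end) sample indices of the active region, or None.

    A frame counts as speech when its energy exceeds `threshold`.
    `end` is exclusive, so samples[start:end] is the detected segment.
    """
    energies = frame_energies(samples, frame_len)
    active = [i for i, e in enumerate(energies) if e > threshold]
    if not active:
        return None
    return active[0] * frame_len, (active[-1] + 1) * frame_len


# Synthetic signal: silence, a short burst, then silence again.
signal = [0.0] * 8 + [0.5, -0.5] * 4 + [0.0] * 8
segment = detect_endpoints(signal)
silent = detect_endpoints([0.0] * 16)
```

A fixed threshold keeps the cost to one multiply-add per sample, which suits constrained hardware. A real system would probably adapt the threshold to the background noise level.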
The Multi-Dimensional Contributions of Prefrontal Circuits to Emotion Regulation during Adulthood and Critical Stages of Development
The prefrontal cortex (PFC) plays a pivotal role in regulating our emotions. The importance of ventromedial regions in emotion regulation, including the ventral sector of the medial PFC, the medial sector of the orbital cortex and subgenual cingulate cortex, has been recognized for a long time. However, it is increasingly apparent that lateral and dorsal regions of the PFC, as well as neighbouring dorsal anterior cingulate cortex, also play a role. Defining the underlying psychological mechanisms by which these functionally distinct regions modulate emotions and the nature and extent of their interactions is a critical step towards better stratification of the symptoms of mood and anxiety disorders. It is also important to extend our understanding of these prefrontal circuits in development. Specifically, it is important to determine whether they exhibit differential sensitivity to perturbations by known risk factors such as stress and inflammation at distinct developmental epochs. This Special Issue brings together the most recent research in humans and other animals that addresses these important issues, and in doing so, highlights the value of the translational approach.
Synchronising the Senses: The Impact of Embodied Cognition on Communication, Explored in the Domain of Colour
Colour terms divide the visual spectrum into categorical concepts. Since Berlin & Kay’s (1969) cross-cultural study of colour terms, there has been debate about the extent to which these concepts are constrained by innate biases from perceptual hardware and the environment. This study shows that concepts can affect perception in the domain of colour (e.g., reading the word ‘yellow’ causes us to see yellow). An experiment was run in which participants were asked to adjust the font colour of colour terms to appear grey. In fact, participants adjusted the font colour to perceptually oppose the colour the word described (e.g., the word ‘yellow’ was adjusted to be blue). This is interpreted as over-compensating for a perceptual activation caused by the comprehension of the word. These results are used to argue that cross-cultural patterns in colour term systems do not necessarily imply strong innate biases. It is argued that the most efficient way of converging on, maintaining and transferring a conceptual system is for shared categories to re-organise perception. This re-organisation will converge to optimally fit the perceptual and environmental biases. Therefore, an Embodied, Relativist explanation of cross-cultural patterns is supported. Furthermore, if the comprehension of language involves the activation of perceptual representations, then there will be a communicative pressure to reduce perceptual differences between speakers.
Learning with delayed reinforcement in an exploratory probabilistic logic neural network
Imperial Users only
The role of interneurons in sensory processing in primary visual cortex
Cortical networks are comprised of a multitude of cell types. To understand sensory processing, the function and interaction of these cell types must be investigated. Neurons can be separated into two main groups: excitatory pyramidal (Pyr) cells and inhibitory interneurons. Inhibitory interneurons make up 20% of the total cortical neuronal population and they exhibit a striking array of molecular, morphological and electrophysiological characteristics. The most numerous are the parvalbumin-expressing (PV+) interneurons, accounting for 35-40% of the interneuron population in adult mouse visual cortex. Somatostatin-expressing (SOM+) neurons are another significant group, comprising 20-25% of the interneuron population.
The visual responses of SOM+ and PV+ interneurons were measured using 2-photon targeted cell-attached recordings and compared with Pyr cells in the primary visual cortex of anaesthetized mice. These interneuron populations exhibited higher firing rates than Pyr cells in response to oriented gratings, but were less orientation selective, with PV+ interneurons exhibiting the lowest orientation selectivity.
Next, SOM+ interneurons were stimulated optogenetically using channelrhodopsin to measure their effect on Pyr cell and PV+ interneuron responses to visual stimuli. Activating small numbers of SOM+ interneurons in vivo inhibited stimulus-evoked firing in PV+ interneurons but not in Pyr cells. Stimulating a large number of SOM+ interneurons confirmed this differential effect, inhibiting PV+ interneurons twice as effectively as Pyr cells. Moreover, the remaining responses to oriented gratings in PV+ cells were more orientation-tuned and time-modulated. In short, inhibitory SOM+ cell activity does not summate with PV+ cell activity, but suppresses it, reconfiguring the inhibitory input to Pyr cells. These results suggest a new role for SOM+ cells, which are activated more slowly and provide dendritic inhibition to Pyr cells while strongly antagonizing PV+ cells, thereby shifting inhibitory input to Pyr cells from somatic to dendritic inhibition throughout the course of the network's visual response.
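The abstract does not state which measure of orientation selectivity was used. A widely used one, assumed here purely for illustration, compares the firing rate at the preferred orientation with the rate at the orthogonal orientation. The function name and the firing rates below are hypothetical.

```python
def orientation_selectivity_index(rates_by_orientation):
    """OSI = (R_pref - R_orth) / (R_pref + R_orth).

    rates_by_orientation maps grating orientation in degrees (0 <= angle < 180)
    to mean firing rate. The orientation 90 degrees away from the preferred one
    must be present in the mapping. Returns 0.0 for a silent cell.
    """
    preferred = max(rates_by_orientation, key=rates_by_orientation.get)
    orthogonal = (preferred + 90) % 180
    r_pref = rates_by_orientation[preferred]
    r_orth = rates_by_orientation[orthogonal]
    if r_pref + r_orth == 0:
        return 0.0
    return (r_pref - r_orth) / (r_pref + r_orth)


# Made-up tuning curves: a sharply tuned cell and a broadly tuned one.
sharp_cell = {0: 10.0, 45: 6.0, 90: 2.0, 135: 6.0}
broad_cell = {0: 10.0, 45: 9.0, 90: 8.0, 135: 9.0}
sharp_osi = orientation_selectivity_index(sharp_cell)
broad_osi = orientation_selectivity_index(broad_cell)
```

Under this measure a cell that responds almost equally to all orientations scores near 0, and a cell that responds only to its preferred orientation scores 1.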
Evolutionary Computation
This book presents several recent advances in evolutionary computation, especially evolution-based optimization methods and hybrid algorithms for several applications, from optimization and learning to pattern recognition and bioinformatics. It also presents new algorithms based on several analogies and metaphors, one of which is based on philosophy, specifically the philosophy of praxis and dialectics. The book also presents interesting applications in bioinformatics, especially the use of particle swarms to discover gene expression patterns in DNA microarrays. It therefore features representative work in the field of evolutionary computation and applied sciences. The intended audience is graduate and undergraduate students, researchers, and anyone who wishes to become familiar with the latest research in this field.
Understanding Semantic Implicit Learning through distributional linguistic patterns: A computational perspective
The research presented in this PhD dissertation provides a computational perspective on Semantic Implicit Learning (SIL). It puts forward the idea that SIL does not depend on semantic knowledge as classically conceived but upon semantic-like knowledge gained through distributional analysis of massive linguistic input. Using methods borrowed from the machine learning and artificial intelligence literature, we construct computational models, which can simulate the performance observed during behavioural tasks of semantic implicit learning in a human-like way. We link this methodology to the current literature on implicit learning, arguing that this behaviour is a necessary by-product of efficient language processing.
Chapter 1 introduces the computational problem posed by implicit learning in general, and semantic implicit learning in particular, as well as the computational framework used to tackle them.
Chapter 2 introduces distributional semantics models as a way to learn semantic-like representations from exposure to linguistic input.
Chapter 3 reports two studies on large datasets of semantic priming which seek to identify the computational model of semantic knowledge that best fits the data under conditions that resemble SIL tasks. We find that a model which acquires semantic-like knowledge gained through distributional analysis of massive linguistic input provides the best fit to the data.
Chapter 4 generalises the results of the previous two studies by looking at the performance of the same models in languages other than English.
Chapter 5 applies the results of the two previous chapters to eight datasets of semantic implicit learning. Crucially, these datasets use various semantic manipulations and speakers of different L1s, enabling us to test the predictions of different models of semantics.
Chapter 6 examines more closely two assumptions which we have taken for granted throughout this thesis. Firstly, we test whether a simpler model based on phonological information can explain the generalisation patterns observed in the tasks. Secondly, we examine whether our definition of the computational problem in Chapter 5 is reasonable.
Chapter 7 summarises and discusses the implications for implicit language learning and computational models of cognition. Furthermore, we offer one more study that seeks to bridge the literature on distributional models of semantics to 'deeper' models of semantics by learning semantic relations.
There are two main contributions of this dissertation to the general field of implicit learning research. Firstly, we highlight the superiority of distributional models of semantics in modelling unconscious semantic knowledge. Secondly, we question whether 'deep' semantic knowledge is needed to achieve above-chance performance in SIL tasks. We show how a simple model that learns through distributional analysis of the patterns found in the linguistic input can match the behavioural results in different languages. Furthermore, we link these models to more general problems faced in psycholinguistics such as language processing and learning of semantic relations. Alexandros Onassis Foundation
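To make the notion of semantic-like knowledge gained through distributional analysis concrete, the sketch below builds word vectors from co-occurrence counts and compares them by cosine similarity. It is a deliberately minimal stand-in, not one of the dissertation's models: the function names, the window size and the three-sentence corpus are assumptions.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=2):
    """Count, for each word, the words appearing within `window` positions."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus (made up): "cat" and "dog" share more contexts than "cat" and "car".
corpus = ["the cat drinks milk", "the dog drinks water", "the car needs fuel"]
vectors = cooccurrence_vectors(corpus)
cat_dog = cosine(vectors["cat"], vectors["dog"])
cat_car = cosine(vectors["cat"], vectors["car"])
```

Even with no labelled meanings, words used in similar contexts end up with similar vectors. This is the sense in which such knowledge is semantic-like rather than semantic as classically conceived.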