537 research outputs found
Active Audition for Robots using Parameter-Less Self-Organising Maps
How can a robot become aware of its surroundings? How does it create its own subjective, inner representation of the real world, so that relationships in the one are reflected in the other? It is well known that structures analogous to Self-Organising Maps (SOM) are involved with this task in animals, and this thesis undertakes to explore if and how a similar approach can be successfully applied in robotics. In order to study the environment-to-abstraction mapping with a minimum of guidance from directed learning and built-in design assumptions, this thesis examines the active audition task in which a system must determine the direction of a sound source and orient towards it, in both the horizontal and vertical directions. Previous explanations of directional hearing in animals, and the implementation of directional hearing algorithms in robots, have tended to focus on the two best known directional clues: the intensity and time differences. This thesis hypothesises that it is advantageous to use a synergy of a wider range of metrics, namely the phase and relative intensity difference. A solution to the active audition problem is proposed based on the Parameter-Less Self-Organising Map (PLSOM), a new algorithm also introduced in this thesis. The PLSOM is used to extract patterns from a high-dimensional input space to a low-dimensional output space. In this application the output space is mapped to the correct motor command for turning towards the source and focusing attention on the selected source by filtering unwanted noise. The dimension-reducing capability of the PLSOM enables the use of more than just two directional clues for computation of the direction. This thesis presents the new PLSOM algorithm for SOM training and quantifies its performance relative to the ordinary SOM algorithm.
The mathematical correctness of the PLSOM is demonstrated and the properties and some applications of this new algorithm are examined, notably in automatically modelling a robot's surroundings in a functional form: Inverse Kinematics (IK). The IK problem is related in principle to the active audition problem - functional rather than abstract representation of reality - but raises some new questions of how to use this internal representation in planning and execution of movements. The PLSOM is also applied to classification of high-dimensional data and model-free chaotic time series prediction. A variant of Reinforcement Learning based on Q-Learning is devised and tested. This variant solves some problems related to stochastic reward functions. A mathematical proof of correct state-action pairing is devised.
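As a rough illustration of the time-difference clue mentioned in the abstract above, the sketch below estimates the delay between two microphone channels by cross-correlation. It is not the PLSOM-based method of the thesis; the function name and the two-channel setup are assumptions made for this example only.

```python
import numpy as np


def estimate_delay_samples(left, right):
    """Estimate how many samples `right` lags behind `left`.

    A positive result means the sound reached the left channel first.
    (Hypothetical helper, not taken from the thesis.)
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    # Full cross-correlation: output index i corresponds to
    # lag k = i - (len(left) - 1), with c[k] = sum_n right[n + k] * left[n].
    corr = np.correlate(right, left, mode="full")
    return int(np.argmax(corr)) - (len(left) - 1)
```

Dividing the returned lag by the sampling rate gives a time difference in seconds; turning that into a bearing additionally requires the microphone spacing and the speed of sound, which this sketch leaves out.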
The Parameter-Less Self-Organizing Map algorithm
The Parameter-Less Self-Organizing Map (PLSOM) is a new neural network
algorithm based on the Self-Organizing Map (SOM). It eliminates the need for a
learning rate and annealing schemes for learning rate and neighbourhood size.
We discuss the relative performance of the PLSOM and the SOM and demonstrate
some tasks in which the SOM fails but the PLSOM performs satisfactorily. Finally
we discuss some example applications of the PLSOM and present a proof of
ordering under certain limited conditions.
Comment: 29 pages, 27 figures. Based on publication in IEEE Trans. on Neural
Networks
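To make the idea of training without a learning-rate or neighbourhood-size schedule more concrete, here is a minimal sketch of one update step in which both quantities are derived from the current fitting error. The exact scaling used in the published PLSOM may differ: the normalisation by the largest error seen so far, the linear neighbourhood scaling, and the names `plsom_step`, `beta` and `theta_min` are all assumptions of this example.

```python
import numpy as np


def plsom_step(weights, grid, x, rho, beta=2.0, theta_min=0.5):
    """One error-scaled SOM update (illustrative sketch, not the reference algorithm).

    weights : (n_nodes, dim) array of node weight vectors
    grid    : (n_nodes, grid_dim) array of node positions on the map
    x       : (dim,) input sample
    rho     : largest fitting error seen so far (use 0.0 before the first step)
    Returns the updated weights and the updated rho.
    """
    weights = np.asarray(weights, dtype=float)
    grid = np.asarray(grid, dtype=float)
    x = np.asarray(x, dtype=float)

    # Winning node and its fitting error.
    dists = np.linalg.norm(weights - x, axis=1)
    winner = int(np.argmin(dists))
    err = float(dists[winner])

    # Normalise the error by the largest error observed so far (assumed form).
    rho = max(err, rho)
    eps = err / rho if rho > 0.0 else 0.0

    # Neighbourhood width grows with the normalised error (assumed linear form).
    theta = max(beta * eps, theta_min)
    grid_d2 = np.sum((grid - grid[winner]) ** 2, axis=1)
    h = np.exp(-grid_d2 / theta ** 2)

    # eps also plays the role of the learning rate: no annealing schedule.
    new_weights = weights + eps * h[:, None] * (x - weights)
    return new_weights, rho
```

In this sketch a well-fitted input (small `eps`) barely changes the map, while a poorly fitted one produces a large update over a wide neighbourhood, which is the behaviour an annealing schedule otherwise has to approximate over time.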
Calibration of sound source localisation for robots using multiple adaptive filter models of the cerebellum
The aim of this research was to investigate the calibration of Sound Source Localisation (SSL) for robots using the adaptive filter model of the cerebellum and how this could be automatically adapted for multiple acoustic environments. The role of the cerebellum has mainly been identified in the context of motor control, and only in recent years has it been recognised that it has a wider role to play in the senses and cognition. The adaptive filter model of the cerebellum has been successfully applied to a number of robotics applications but so far none involving auditory sense. Multiple models frameworks such as MOdular Selection And Identification for Control (MOSAIC) have also been developed in the context of motor control, and this has been the inspiration for adaptation of audio calibration in multiple acoustic environments; again, application of this approach in the area of auditory sense is completely new. The thesis showed that it was possible to calibrate the output of an SSL algorithm using the adaptive filter model of the cerebellum, improving the performance compared to the uncalibrated SSL. Using an adaptation of the MOSAIC framework, and specifically using responsibility estimation, a system was developed that was able to select an appropriate set of cerebellar calibration models and to combine their outputs in proportion to how well each was able to calibrate, to improve the SSL estimate in multiple acoustic contexts, including novel contexts. The thesis also developed a responsibility predictor, also part of the MOSAIC framework, and this improved the robustness of the system to abrupt changes in context which could otherwise have resulted in a large performance error. Responsibility prediction also improved robustness to missing ground truth, which could occur in challenging environments where sensory feedback of ground truth may become impaired, which has not been addressed in the MOSAIC literature, adding to the novelty of the thesis. 
The utility of the so-called cerebellar chip has been further demonstrated through the development of a responsibility predictor that is based on the adaptive filter model of the cerebellum, rather than the more conventional function fitting neural network used in the literature. Lastly, it was demonstrated that the multiple cerebellar calibration architecture is capable of limited self-organising from a de-novo state, with a predetermined number of models. It was also demonstrated that the responsibility predictor could learn against its model after self-organisation, and to a limited extent, during self-organisation. The thesis addresses an important question of how a robot could improve its ability to listen in multiple, challenging acoustic environments, and recommends future work to develop this ability.
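The responsibility estimation described in the abstract above, where model outputs are combined in proportion to how well each model calibrates, can be sketched as a normalised Gaussian weighting of each model's error. This follows the general form commonly given for MOSAIC rather than the thesis's specific adaptation; the function names and the `sigma` parameter are assumptions.

```python
import numpy as np


def responsibilities(errors, sigma=1.0):
    """Turn per-model errors into weights that sum to 1.

    Smaller error gives a larger weight (Gaussian likelihood, then normalised).
    """
    errors = np.asarray(errors, dtype=float)
    log_lik = -0.5 * (errors / sigma) ** 2
    log_lik -= log_lik.max()  # shift for numerical stability
    weights = np.exp(log_lik)
    return weights / weights.sum()


def combine_outputs(outputs, errors, sigma=1.0):
    """Responsibility-weighted sum of the individual model outputs."""
    outputs = np.asarray(outputs, dtype=float)
    return float(np.dot(responsibilities(errors, sigma), outputs))
```

A small `sigma` makes the selection close to winner-take-all, while a large `sigma` blends the models more evenly.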
Towards music perception by redundancy reduction and unsupervised learning in probabilistic models
PhD
The study of music perception lies at the intersection of several disciplines: perceptual
psychology and cognitive science, musicology, psychoacoustics, and acoustical
signal processing amongst others. Developments in perceptual theory over the last
fifty years have emphasised an approach based on Shannon’s information theory and
its basis in probabilistic systems, and in particular, the idea that perceptual systems
in animals develop through a process of unsupervised learning in response to natural
sensory stimulation, whereby the emerging computational structures are well adapted
to the statistical structure of natural scenes. In turn, these ideas are being applied to
problems in music perception.
This thesis is an investigation of the principle of redundancy reduction through
unsupervised learning, as applied to representations of sound and music.
In the first part, previous work is reviewed, drawing on literature from some of the
fields mentioned above, and an argument presented in support of the idea that perception
in general and music perception in particular can indeed be accommodated within
a framework of unsupervised learning in probabilistic models.
In the second part, two related methods are applied to two different low-level representations.
Firstly, linear redundancy reduction (Independent Component Analysis)
is applied to acoustic waveforms of speech and music. Secondly, the related method of
sparse coding is applied to a spectral representation of polyphonic music, which proves
to be enough both to recognise that the individual notes are the important structural elements,
and to recover a rough transcription of the music.
Finally, the concepts of distance and similarity are considered, drawing in ideas
about noise, phase invariance, and topological maps. Some ecologically and information
theoretically motivated distance measures are suggested, and put into practice in
a novel method, using multidimensional scaling (MDS), for visualising geometrically
the dependency structure in a distributed representation.
Engineering and Physical Science Research Council
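For readers unfamiliar with multidimensional scaling, the sketch below shows classical (Torgerson) MDS, which turns a matrix of pairwise distances into low-dimensional coordinates. It only illustrates the generic technique: the thesis's own distance measures are not reproduced here, and the function name is an assumption.

```python
import numpy as np


def classical_mds(dist, n_components=2):
    """Embed points in `n_components` dimensions from a pairwise distance matrix."""
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    # Double-centre the squared distances to obtain a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale
```

When the input distances are Euclidean and the target dimension is high enough, the embedding reproduces them up to rotation and reflection; with other distance measures it gives the best low-rank approximation of the centred Gram matrix.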
Brain-inspired self-organization with cellular neuromorphic computing for multimodal unsupervised learning
Cortical plasticity is one of the main features that enable our ability to
learn and adapt in our environment. Indeed, the cerebral cortex self-organizes
through structural and synaptic plasticity mechanisms that are very
likely at the basis of an extremely interesting characteristic of human
brain development: multimodal association. In spite of the diversity of the
sensory modalities, like sight, sound and touch, the brain arrives at the same
concepts (convergence). Moreover, biological observations show that one
modality can activate the internal representation of another modality when both
are correlated (divergence). In this work, we propose the Reentrant
Self-Organizing Map (ReSOM), a brain-inspired neural system based on the
reentry theory using Self-Organizing Maps and Hebbian-like learning. We propose
and compare different computational methods for unsupervised learning and
inference, then quantify the gain of the ReSOM in a multimodal classification
task. The divergence mechanism is used to label one modality based on the
other, while the convergence mechanism is used to improve the overall accuracy
of the system. We perform our experiments on a constructed written/spoken
digits database and a DVS/EMG hand gestures database. The proposed model is
implemented on a cellular neuromorphic architecture that enables distributed
computing with local connectivity. We show the gain of the so-called hardware
plasticity induced by the ReSOM, where the system's topology is not fixed by
the user but learned along the system's experience through self-organization.
Comment: Preprint
Self-directedness, integration and higher cognition
In this paper I discuss connections between self-directedness, integration and higher cognition. I present a model of self-directedness as a basis for approaching higher cognition from a situated cognition perspective. According to this model, increases in sensorimotor complexity create pressure for integrative higher order control and learning processes for acquiring information about the context in which action occurs. This generates complex articulated abstractive information processing, which forms the major basis for higher cognition. I present evidence that indicates that the same integrative characteristics found in lower cognitive processes such as motor adaptation are present in a range of higher cognitive processes, including conceptual learning. This account helps explain situated cognition phenomena in humans because the integrative processes by which the brain adapts to control interaction are relatively agnostic concerning the source of the structure participating in the process. Thus, from the perspective of the motor control system, using a tool is not fundamentally different to simply controlling an arm.
The application of artificial neural networks to interpret acoustic emissions from submerged arc welding
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Automated fusion welding processes play a fundamental role in modern manufacturing industries. The proliferation of joint geometries together with the large permutation of associated process variable configurations has given rise to research into complex system modelling and control strategies. Many of these techniques have involved monitoring of not only the electrical characteristics of the process but visual and acoustic information. Acoustic information derived from certain welding processes is well documented as it is an established fact that skilled manual welders utilise such information as an aid to creating an optimum weld. The experimental investigation presented in this thesis is dedicated to the feasibility of monitoring airborne acoustic emissions of Submerged Arc Welding (SAW) for diagnostic and real time control purposes. The experimental method adopted for this research takes a cybernetic approach to data processing and interpretation in an attempt to replicate the robustness of human biological functions. A custom designed audio hardware system was used to analyse signals obtained from bead on mild steel plate fusion welds. Time and frequency domains were used in an attempt to establish salient characteristics or identify the signatures associated with changes of the process variables. The featured parameters were voltage / current and weld travel speed, due to their ease of validation. However, consideration has also been given to weld defect prediction due to process instabilities. As the data proved to be highly correlated and erratic when subjected to off line statistical analysis, extensive investigation was given to the application of artificial neural networks to signal processing and real time control scenarios.
As a consequence, a dedicated neural based software system was developed, utilising supervised and unsupervised neural techniques to monitor the process. The research was aimed at proving the feasibility of monitoring the electrical process parameters and stability of the welding process in real time. It was shown to be possible, by the exploitation of artificial neural networks, to generate a number of monitoring parameters indicative of the welding process state. The limitations of the present neural method and proposed developments are discussed, together with an overview of applied neural network technology and its impact on artificial intelligence and robotic control. Further developments are considered together with recommendations for future areas of research.
Redefining the audio editor.
This thesis describes new design principles for audio editing software. This kind of software, also called an audio editor, is the digital cutting table for sound and music production in which audio can be loaded or recorded, then selected and edited. First an understanding of the audio editor is established. Then a new approach to audio editing software design is developed, based on research into current software. This new approach consists of a set of design principles that aim at improving coherency, flexibility and creativity in the audio editing process. These principles are formed by carefully rethinking core elements in audio editing such as audio representation, selection and manipulation, editing flexibility, automation and personalisation. As an artefact of this research, a concept audio editor called OFFline is presented in a second section. This audio editor demonstrates a possible implementation of the new design principles.
The cartographies of place: Approaches to audio-visual composition incorporating aspects of place
Incorporating aural and visual elements of a place in a composition serves as a powerful way of exploring the intersection of time, history and geography associated with a location. The combination of these elements acts as an invitation for deeper engagement by offering multiple perspectives of place. One way of exploring these intersections is through incorporating aspects of place (in the form of field recordings, field footage and cartographical information) into audio and audio-visual work, where spatial and physical information can be situated as a way of representing an individual’s surroundings and subjective realities of place. This practice-led exegesis aims to explore how sound and visual elements can combine and resonate with each other, and how such a practice can highlight the connections between artist and place. As part of this exploration, this exegesis discusses a portfolio of works (submitted as part of the examinable thesis) highlighting the connections between artist, history and place, and how these aspects can inform the creation of new work. Methods explored include framing personal and sono-environmental reflections in terms of looking inwards (as a reflection on the self) and looking outwards (as a reflection on the history, cultural significance and geospatial features of place), composition with original and modified field recordings, sonification of maps using graphical sequencing software, and the creation of audio-visual works that additionally combine field footage and music visualisation. These methods for composition provide a powerful way of highlighting personal associations, emotional catharsis and memories of place, by centring personal experience. Through these methods, this exegesis seeks to demonstrate a number of strategies to show how the ephemerality of sound reflects the ephemerality of being, and the fragility inherent in any relationship with place.