The development of artificial neural networks for the analysis of market research and electronic nose data
This thesis details research carried out into the application of unsupervised neural
network and statistical clustering techniques to market research interview survey
analysis. The objective of the research was to develop mathematical mechanisms to
locate and quantify internal clusters within the data sets with definite commonality.
As the data sets being used were binary, this commonality was expressed in terms of
identical question answers. Unsupervised neural network paradigms are investigated,
along with statistical clustering techniques. The theory of clustering in a binary space
is also examined.
Attempts to improve the clarity of output of Self-Organising Maps (SOM) consisted
of several stages of investigation culminating in the conception of the Interrogative
Memory Structure (IMS). IMS proved easy to use, fast in operation and consistently
produced results with the highest degree of commonality when tested against SOM,
Adaptive Resonance Theory (ART1) and FASTCLUS. ART1 performed well when
clusters were measured using general metrics. During the course of the research a
supervised technique, the Vector Memory Array (VMA), was developed. VMA was
tested against Back Propagation (BP) (using data sets provided by the Warwick
electronic nose project) and consistently produced higher classification accuracies.
The main advantage of VMA is its speed of operation - in testing it produced results
in minutes compared to hours for the BP method, giving speed increases in the
region of 100:1.
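
As an illustration of commonality expressed as identical question answers, the sketch below scores a cluster of binary survey responses by the fraction of questions on which every member gave the same answer. The function name `commonality` and the sample data are hypothetical; this is one plausible reading of the measure, not the metric defined in the thesis.

```python
from typing import Sequence


def commonality(cluster: Sequence[Sequence[int]]) -> float:
    """Fraction of questions answered identically by every respondent.

    Assumption: each respondent is a sequence of 0/1 answers of equal length.
    """
    if not cluster or not cluster[0]:
        raise ValueError("cluster needs at least one respondent and one question")
    n_questions = len(cluster[0])
    # zip(*cluster) walks the data question by question (column-wise).
    agreed = sum(1 for answers in zip(*cluster) if len(set(answers)) == 1)
    return agreed / n_questions


# Hypothetical clusters: rows are respondents, columns are questions.
cluster_a = [
    [1, 0, 1, 1],
    [1, 0, 1, 0],
    [1, 0, 1, 1],
]
cluster_b = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
]

print(commonality(cluster_a))  # 0.75: members agree on 3 of 4 questions
print(commonality(cluster_b))  # 0.5: members agree on 2 of 4 questions
```

Under this reading, a clustering method that produces clusters with higher scores groups respondents who share more identical answers.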
Automatic sound synthesizer programming: techniques and applications
The aim of this thesis is to investigate techniques for, and applications of, automatic sound synthesizer programming. An automatic sound synthesizer programmer is a system which removes from the user the requirement to explicitly specify parameter settings for a sound synthesis algorithm. Two forms of these systems are discussed in this thesis:
tone matching programmers and synthesis space explorers. A tone matching programmer takes as its input a sound synthesis algorithm and a desired target sound. At its output it produces a configuration for the sound synthesis algorithm which causes it to emit a
similar sound to the target. The techniques for achieving this that are investigated are
genetic algorithms, neural networks, hill climbers and data driven approaches. A synthesis
space explorer provides a user with a representation of the space of possible sounds
that a synthesizer can produce and allows them to interactively explore this space. The
applications of automatic sound synthesizer programming that are investigated include
studio tools, an autonomous musical agent and a self-reprogramming drum machine. The
research employs several methodologies: the development of novel software frameworks
and tools, the examination of existing software at the source code and performance levels
and user trials of the tools and software. The main contributions made are: a method
for visualisation of sound synthesis space and low dimensional control of sound synthesizers; a general purpose framework for the deployment and testing of sound synthesis and optimisation algorithms in the SuperCollider language sclang; a comparison of a variety of optimisation techniques for sound synthesizer programming; an analysis of sound synthesizer error surfaces; a general purpose sound synthesizer programmer compatible with industry standard tools; an automatic improviser which passes a loose equivalent of the Turing test for Jazz musicians, i.e. being half of a man-machine duet which was rated as one of the best sessions of 2009 on the BBC's 'Jazz on 3' programme
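
To make the idea of a tone matching programmer concrete, the sketch below uses a hill climber, one of the technique families named in the abstract, to search for parameters of a toy synthesizer that reproduce a target sound. Everything here is an assumption made for illustration: the two-parameter sine `toy_synth`, the mean squared error measure, the step sizes and the starting point. The thesis itself works in SuperCollider with real synthesis algorithms.

```python
import math
import random
from typing import List, Tuple

Params = Tuple[float, float]  # (frequency in Hz, amplitude); hypothetical parameter set


def toy_synth(params: Params, n_samples: int = 256, sample_rate: int = 8000) -> List[float]:
    """Stand-in synthesis algorithm: a single sine wave."""
    freq, amp = params
    return [amp * math.sin(2 * math.pi * freq * n / sample_rate) for n in range(n_samples)]


def error(a: List[float], b: List[float]) -> float:
    """Mean squared difference between two equal-length signals."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def hill_climb(target: List[float], start: Params, steps: int = 2000, seed: int = 0):
    """Perturb the parameters at random and keep a change only if the error drops."""
    rng = random.Random(seed)
    best = start
    best_err = error(toy_synth(best), target)
    for _ in range(steps):
        candidate = (
            best[0] + rng.gauss(0, 5.0),             # nudge the frequency
            max(0.0, best[1] + rng.gauss(0, 0.05)),  # nudge the amplitude, keep it non-negative
        )
        cand_err = error(toy_synth(candidate), target)
        if cand_err < best_err:
            best, best_err = candidate, cand_err
    return best, best_err


target = toy_synth((440.0, 0.8))  # the "desired target sound"
start = (300.0, 0.3)
start_err = error(toy_synth(start), target)
best, best_err = hill_climb(target, start)
print(f"error before: {start_err:.4f}, after: {best_err:.4f}, parameters: {best}")
```

A hill climber only ever accepts improvements, so it can stall in a local minimum of the error surface; that sensitivity is one reason to compare it against population-based methods such as genetic algorithms.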
The application of neural networks to non-destructive testing techniques
The low strain test method has become the prevalent method for integrity testing of cast in situ foundation piles. The automated interpretation of the sonic echo traces resulting from this test would prove beneficial to industry through the standardisation of the test method procedure and a reduction in the time spent analysing results. Therefore, in this research the generalisation and feature extraction strengths of artificial neural networks have been exploited to aid test trace interpretation. This study involved the identification of three multilayer networks considered most suitable for the heteroassociative function approximation task described above. Multilayer Perceptron (MLP) networks, Radial Basis Neural Networks (RBNN) and Wavelet Basis Neural Networks (WBNN) have all been trained using numerically generated data and their performances compared to identify the optimum network type. While each network presented similar strengths and weaknesses in fault diagnosis, statistical analysis suggested that the MLP network was marginally more successful in quantifying changes in cross-sections along the pile length. Field data from three test sites have confirmed that the network can identify, locate and quantify significant (±13%) changes in diameter along the pile length (within known test method limitations). The network has also diagnosed changes in diameter at the pile head. This task is notoriously difficult using conventional techniques and has been facilitated through the development of a novel pre-processing technique: the wavelet mobility scalogram
Organising and structuring a visual diary using visual interest point detectors
As wearable cameras become more popular, researchers are increasingly focusing on novel applications to manage the large volume of data these devices produce. One such application is the construction of a Visual Diary from an individual’s photographs. Microsoft’s SenseCam, a
device designed to passively record a Visual Diary and cover a typical day of the user wearing the camera, is an example of one such device. The vast quantity of images generated by these devices means that the management and organisation of these collections is not a trivial matter.
We believe wearable cameras, such as SenseCam, will become more popular in the future and the management of the volume of data generated by these devices is a key issue.
Although there is a significant volume of work in the literature in the object detection and recognition
and scene classification fields, there is little work in the area of setting detection. Furthermore, few authors have examined the issues involved in analysing extremely large image collections (like a Visual Diary) gathered over a long period of time. An algorithm developed for setting
detection should be capable of clustering images captured at the same real world locations (e.g. in the dining room at home, in front of the computer in the office, in the park, etc.). This requires the selection and implementation of suitable methods to identify visually similar backgrounds in images using their visual features. We present a number of approaches to setting detection based on
the extraction of visual interest point detectors from the images. We also analyse the performance of two of the most popular descriptors - Scale Invariant Feature Transform (SIFT) and Speeded Up Robust Features (SURF). We present an implementation of a Visual Diary application and evaluate
its performance via a series of user experiments. Finally, we also outline some techniques to allow the Visual Diary to automatically detect new settings, to scale as the image collection continues to grow substantially over time, and to allow the user to generate a personalised summary of their data
Computer audition for emotional wellbeing
This thesis is focused on the application of computer audition (i.e., machine listening) methodologies for monitoring states of emotional wellbeing. Computer audition is a growing field and has been successfully applied to an array of use cases in recent years. There are several advantages to audio-based computational analysis; for example, audio can be recorded non-invasively, stored economically, and can capture rich information on happenings in a given environment, e.g., human behaviour. With this in mind, maintaining emotional wellbeing is a challenge for humans and emotion-altering conditions, including stress and anxiety, have become increasingly common in recent years. Such conditions manifest in the body, inherently changing how we express ourselves. Research shows these alterations are perceivable within vocalisation, suggesting that speech-based audio monitoring may be valuable for developing artificially intelligent systems that target improved wellbeing. Furthermore, computer audition applies machine learning and other computational techniques to audio understanding, and so by combining computer audition with applications in the domain of computational paralinguistics and emotional wellbeing, this research concerns the broader field of empathy for Artificial Intelligence (AI). To this end, speech-based audio modelling that incorporates and understands paralinguistic wellbeing-related states may be a vital cornerstone for improving the degree of empathy that an artificial intelligence has.
To summarise, this thesis investigates the extent to which speech-based computer audition methodologies can be utilised to understand human emotional wellbeing. A fundamental background on the fields in question as they pertain to emotional wellbeing is first presented, followed by an outline of the applied audio-based methodologies. Next, detail is provided for several machine learning experiments focused on emotional wellbeing applications, including analysis and recognition of under-researched phenomena in speech, e.g., anxiety, and markers of stress. Core contributions from this thesis include the collection of several related datasets, hybrid fusion strategies for an emotional gold standard, novel machine learning strategies for data interpretation, and an in-depth acoustic-based computational evaluation of several human states. All of these contributions focus on ascertaining the advantage of audio in the context of modelling emotional wellbeing. Given the sensitive nature of human wellbeing, the ethical implications involved with developing and applying such systems are discussed throughout
Intelligent image cropping and scaling
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University, 2011. Nowadays, there is a huge number of end devices with different screen properties for
watching television content, which is either broadcast or transmitted over the internet.
To allow best viewing conditions on each of these devices, different image formats have
to be provided by the broadcaster. Producing content for every single format is,
however, not feasible for the broadcaster as it is much too laborious and costly.
The most obvious solution for providing multiple image formats is to produce one high resolution format and prepare formats of lower resolution from this. One possibility to do this is to simply scale video images to the resolution of the target image format. Two significant drawbacks are the loss of image details through downscaling and possibly unused image areas due to letter- or pillarboxes. A preferable solution is to first find the contextually most important region in the high-resolution format and then crop this area to the aspect ratio of the target image format. However, defining
the contextually most important region manually is very time consuming. Trying to apply that to live productions would be nearly impossible. Therefore, some approaches exist that automatically define cropping areas. To do so, they extract visual features, like moving areas in a video, and define regions of interest
(ROIs) based on those. ROIs are finally used to define an enclosing cropping area. The
extraction of features is done without any knowledge about the type of content. Hence,
these approaches are not able to distinguish between features that might be important in
a given context and those that are not.
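
The step from ROIs to an enclosing cropping area can be sketched as follows. This is a minimal illustration under stated assumptions, not the system described in the thesis: ROIs are axis-aligned rectangles `(x, y, width, height)` in pixels, the function name `enclosing_crop` is hypothetical, and the 720x576 frame size and 4:3 target aspect ratio are example values.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (x, y, width, height) in pixels


def enclosing_crop(rois: List[Rect], frame_w: float, frame_h: float,
                   target_aspect: float) -> Rect:
    """Smallest box with the target aspect ratio that encloses all ROIs,
    kept inside the frame."""
    if not rois:
        raise ValueError("at least one ROI is required")
    # Bounding box of all ROIs.
    left = min(x for x, _, _, _ in rois)
    top = min(y for _, y, _, _ in rois)
    right = max(x + w for x, _, w, _ in rois)
    bottom = max(y + h for _, y, _, h in rois)
    w, h = right - left, bottom - top
    if w <= 0 or h <= 0:
        raise ValueError("ROIs must have positive width and height")
    # Grow the too-short dimension until the box has the target aspect ratio.
    if w / h < target_aspect:
        w = h * target_aspect
    else:
        h = w / target_aspect
    # If the grown box no longer fits the frame, shrink it uniformly.
    # In that case some ROI area may fall outside the crop.
    scale = min(1.0, frame_w / w, frame_h / h)
    w, h = w * scale, h * scale
    # Centre on the ROI bounding box, then clamp to the frame borders.
    cx, cy = (left + right) / 2, (top + bottom) / 2
    x = min(max(cx - w / 2, 0.0), frame_w - w)
    y = min(max(cy - h / 2, 0.0), frame_h - h)
    return (x, y, w, h)


# Hypothetical ROIs in an assumed 720x576 frame, cropped for a 4:3 target.
rois = [(300, 200, 100, 100), (380, 250, 60, 120)]
crop = enclosing_crop(rois, frame_w=720, frame_h=576, target_aspect=4 / 3)
print(crop)
```

A box computed this way treats every ROI as equally important, which is the limitation the abstract points out for approaches that have no knowledge of the content type.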
The work presented within this thesis tackles the problem of extracting visual features based on prior knowledge about the content. Such knowledge is fed into the system in the form of metadata that is available from TV production environments. Based on the
extracted features, ROIs are then defined and filtered depending on the analysed
content. As proof-of-concept, this application finally adapts SDTV (Standard Definition Television) sports productions automatically to image formats with lower resolution through intelligent cropping and scaling. If no content information is available, the system can still be applied on any type of content through a default mode. The presented approach is based on the principle of a plug-in system. Each plug-in
represents a method for analysing video content information, either on a low level by
extracting image features or on a higher level by processing extracted ROIs. The
combination of plug-ins is determined by the incoming descriptive production metadata
and hence can be adapted to each type of sport individually. The application has been comprehensively evaluated by comparing the results of the system against alternative cropping methods. This evaluation utilised videos which were manually cropped by a professional video editor, statically cropped videos and simply scaled, non-cropped videos. In addition to and apart from purely subjective evaluations,
the gaze positions of subjects watching sports videos have been measured and compared
to the regions of interest positions extracted by the system
Magnetoencephalographic studies of neural systems associated with higher order processes in humans
This thesis has been concerned with the neuromagnetic fields associated with the processing of faces and sentences in humans. In four, largely independent sub-projects, results were obtained using novel methods of analysis to extract neurophysiologically relevant information from magnetoencephalographic (MEG) readings. Using the MEG facility of the Helsinki University of Technology, Finland, the research has led to four main suggestions: a) there are early latency face-specific neural systems in humans that are predominantly in right inferior occipito-temporal cortex, b) MEG recordings are useful in the study of autism, in that autistic subjects exhibit different responses to normal subjects following face presentation, c) phase-locked γ-band activity has a specific role in semantic processing of sentences in normal subjects, and d) the late components of responses to face images are modified by endogenous priming, which is detectable before stimulus arrival in normal subjects.
In order to pursue these neuroscience objectives, new methods for treating MEG data were developed, implemented and used. These comprise: a) an improved parameterisation of signal power over regions of interest, b) the use of re-sampling strategies to achieve statistical assessment of spectral coefficients within subjects, and c) a prestimulus method for the study of face processing using a tailored state-space representation approach
Visualisation of multi-dimensional medical images with application to brain electrical impedance tomography
Medical imaging plays an important role in modern medicine. With the increasing complexity and information presented by medical images, visualisation is vital for medical research and clinical applications to interpret the information presented in these images. The aim of this research is to investigate improvements to medical image visualisation, particularly for multi-dimensional medical image datasets. A recently
developed medical imaging technique known as Electrical Impedance Tomography (EIT) is presented as a demonstration. To fulfil the aim, three main efforts are included in this work.
First, a novel scheme for the processing of brain EIT data with SPM (Statistical Parametric Mapping) to detect ROI (Regions of Interest) in the data is proposed based on a theoretical analysis. To evaluate the feasibility of this scheme, two types of experiments are carried out: one is implemented with simulated EIT data, and the other is performed with human brain EIT data under visual stimulation. The experimental
results demonstrate that: SPM is able to localise the expected ROI in EIT data correctly; and it is reasonable to use the balloon hemodynamic change model to simulate the
impedance change during brain function activity.
Secondly, to deal with the absence of human morphology information in EIT visualisation, an innovative landmark-based registration scheme is developed to register brain EIT image with a standard anatomical brain atlas.
Finally, a new task typology model is derived for task exploration in medical image visualisation, and a task-based system development methodology is proposed for the visualisation of multi-dimensional medical images. As a case study, a prototype visualisation system, named EIT5DVis, has been developed, following this methodology, to visualise five-dimensional brain EIT data. The EIT5DVis system is able to accept visualisation tasks through a graphical user interface; apply appropriate methods to analyse tasks, which include the ROI detection approach and registration scheme mentioned in the preceding paragraphs; and produce various visualisations