Evaluation and analysis of hybrid intelligent pattern recognition techniques for speaker identification
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. The rapid progress of technology in recent years has led to a tremendous rise in the use of biometric authentication systems. The objective of this research is to investigate the problem of identifying a speaker from their voice regardless of the content (i.e. text-independent), and to design efficient methods of combining face and voice to produce a robust authentication system.
A novel approach to speaker identification is developed using wavelet analysis and multiple neural networks, including Probabilistic Neural Network (PNN), General Regressive Neural Network (GRNN) and Radial Basis Function Neural Network (RBF NN), with the AND voting scheme. This approach is tested on the GRID and VidTIMIT corpora, and comprehensive test results have been validated against state-of-the-art approaches. The system was found to be competitive: it improved the recognition rate by 15% compared to the classical Mel-frequency Cepstral Coefficients (MFCC), and reduced the recognition time by 40% compared to Back Propagation Neural Network (BPNN), Gaussian Mixture Models (GMM) and Principal Component Analysis (PCA).
Another novel approach using vowel formant analysis is implemented using Linear Discriminant Analysis (LDA). Vowel formant based speaker identification is well suited to real-time implementation and requires only a few bytes of information to be stored for each speaker, making it both storage and time efficient. Tested on GRID and VidTIMIT, the proposed scheme was found to be 85.05% accurate when Linear Predictive Coding (LPC) is used to extract the vowel formants, which is much higher than the accuracy of BPNN and GMM. Since the proposed scheme does not require any training time other than creating a small database of vowel formants, it is faster as well. Furthermore, an increasing number of speakers makes it difficult for BPNN and GMM to sustain their accuracy, but the proposed score-based methodology stays almost linear.
Finally, a novel audio-visual fusion based identification system is implemented using GMM and MFCC for speaker identification and PCA for face recognition. The results of speaker identification and face recognition are fused at different levels, namely the feature, score and decision levels. Both the score-level and decision-level (with OR voting) fusions were shown to outperform the feature-level fusion in terms of accuracy and error resilience. The result is in line with the distinct nature of the two modalities, which lose themselves when combined at the feature level. The GRID and VidTIMIT test results validate that the proposed scheme is one of the best candidates for the fusion of face and voice due to its low computational time and high recognition accuracy.
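The difference between score-level and decision-level fusion mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the min-max normalisation, the equal default weighting and the example scores are all assumptions made for illustration.

```python
from typing import Dict


def normalise(scores: Dict[str, float]) -> Dict[str, float]:
    """Min-max normalise matcher scores to [0, 1] (assumed normalisation)."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0  # avoid division by zero when all scores are equal
    return {name: (s - lo) / span for name, s in scores.items()}


def score_level_fusion(voice: Dict[str, float], face: Dict[str, float],
                       w_voice: float = 0.5) -> str:
    """Fuse per-identity scores with a weighted sum and return the best identity."""
    v, f = normalise(voice), normalise(face)
    fused = {name: w_voice * v[name] + (1.0 - w_voice) * f[name] for name in v}
    return max(fused, key=fused.get)


def decision_level_fusion(claimed: str, voice_decision: str, face_decision: str,
                          rule: str = "OR") -> bool:
    """Combine the two matchers' final decisions with OR or AND voting."""
    votes = [voice_decision == claimed, face_decision == claimed]
    return any(votes) if rule == "OR" else all(votes)


# Hypothetical similarity scores for three enrolled identities.
voice_scores = {"alice": 0.9, "bob": 0.4, "carol": 0.1}
face_scores = {"alice": 0.2, "bob": 0.8, "carol": 0.3}
best_identity = score_level_fusion(voice_scores, face_scores)
```

Score-level fusion keeps the graded evidence from each modality, whereas decision-level fusion only sees each matcher's final answer; OR voting accepts a claim when either modality agrees, AND voting only when both do.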
Analysis and synthesis of iris images
Of all the physiological traits of the human body that help in personal identification, the iris is probably the most robust and accurate. Although numerous iris recognition algorithms have been proposed, the underlying processes that define the texture of irises have not been extensively studied. In this thesis, multiple pair-wise pixel interactions have been used to describe the textural content of the iris image, thereby resulting in a Markov Random Field (MRF) model for the iris image. This information is expected to be useful for the development of user-specific models for iris images, i.e. the matcher could be tuned to accommodate the characteristics of each user's iris image in order to improve matching performance. We also use MRF modeling to construct synthetic irises based on iris primitives extracted from real iris images. The synthesis procedure is deterministic and avoids the sampling of a probability distribution, making it computationally simple. We demonstrate that iris textures in general are significantly different from other irregular textural patterns. Clustering experiments indicate that the synthetic irises generated using the proposed technique are similar in textural content to real iris images.
Tensor Analysis and Fusion of Multimodal Brain Images
Current high-throughput data acquisition technologies probe dynamical systems
with different imaging modalities, generating massive data sets at different
spatial and temporal resolutions posing challenging problems in multimodal data
fusion. A case in point is the attempt to parse out the brain structures and
networks that underpin human cognitive processes by analysis of different
neuroimaging modalities (functional MRI, EEG, NIRS etc.). We emphasize that the
multimodal, multi-scale nature of neuroimaging data is well reflected by a
multi-way (tensor) structure where the underlying processes can be summarized
by a relatively small number of components or "atoms". We introduce
Markov-Penrose diagrams - an integration of Bayesian DAG and tensor network
notation in order to analyze these models. These diagrams not only clarify
matrix and tensor EEG and fMRI time/frequency analysis and inverse problems,
but also help understand multimodal fusion via Multiway Partial Least Squares
and Coupled Matrix-Tensor Factorization. We show here, for the first time, that
Granger causal analysis of brain networks is a tensor regression problem, thus
allowing the atomic decomposition of brain networks. Analysis of EEG and fMRI
recordings shows the potential of the methods and suggests their use in other
scientific domains.
Comment: 23 pages, 15 figures, submitted to Proceedings of the IEE
Registration of 3D Face Scans with Average Face Models
The accuracy of a 3D face recognition system depends on a correct registration that aligns the facial surfaces and makes a comparison possible. The best results obtained so far use a costly one-to-all registration approach, which requires the registration of each facial surface to all faces in the gallery. We explore the approach of registering the new facial surface to an average face model (AFM), which automatically establishes correspondence to the pre-registered gallery faces. We propose a new algorithm for constructing an AFM, and show that it works better than a recent approach. Extending the single-AFM approach, we propose to employ category-specific alternative AFMs for registration, and evaluate the effect on subsequent classification. We perform simulations with multiple AFMs that correspond to different clusters in the face shape space and compare these with gender and morphology based groupings. We show that the automatic clustering approach separates the faces into gender and morphology groups, consistent with the other-race effect reported in the psychology literature. We inspect thin-plate spline and iterative closest point based registration schemes under manual or automatic landmark detection prior to registration. Finally, we describe and analyse a regular re-sampling method that significantly increases the accuracy of registration.
Unsupervised Machine Learning Algorithms to Characterize Single-Cell Heterogeneity and Perturbation Response
Recent advances in microfluidic technologies facilitate the measurement of gene expression, DNA accessibility, protein content, or genomic mutations at unprecedented scale. The challenges imposed by the scale of these datasets are further exacerbated by non-linearity in molecular effects, complex interdependencies between features, and a lack of understanding of both data generating processes and sources of technical and biological noise. As a result, analysis of modern single-cell data requires the development of specialized computational tools. One solution to these problems is the use of manifold learning, a sub-field of unsupervised machine learning that seeks to model data geometry using a simplifying assumption that the underlying system is continuous and locally Euclidean. In this dissertation, I show how manifold learning is naturally suited for single-cell analysis and introduce three related algorithms for characterization of single-cell heterogeneity and perturbation response. I first describe Vertex Frequency Clustering, an algorithm that identifies groups of cells with similar responses to an experimental perturbation by analyzing the spectral representation of condition labels expressed as signals over a cell similarity graph. Next, I introduce MELD, an algorithm that expands on these ideas to estimate the density of each experimental sample over the graph to quantify the effect of an experimental perturbation at single-cell resolution. Finally, I describe a neural network for archetypal analysis that represents the data as continuously distributed between a set of extrema. Each of these algorithms is demonstrated on a combination of real and synthetic datasets and is benchmarked against state-of-the-art algorithms.
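The idea of treating condition labels as a signal over a cell similarity graph can be sketched with a simple neighbourhood-averaging filter. This is only an illustration of graph signal smoothing under stated assumptions, not the MELD or Vertex Frequency Clustering algorithm from the dissertation: the function name `smooth_labels`, the mixing weight `alpha` and the toy four-cell path graph are all hypothetical.

```python
from typing import Dict, List


def smooth_labels(neighbours: Dict[int, List[int]], labels: List[float],
                  alpha: float = 0.5, steps: int = 1) -> List[float]:
    """Low-pass filter a label signal over a graph.

    Each step replaces a cell's value with a mix of its own value and the
    mean value of its graph neighbours. Every node is assumed to have at
    least one neighbour.
    """
    x = list(labels)
    for _ in range(steps):
        # The comprehension reads the previous x throughout, so all nodes
        # are updated synchronously.
        x = [
            (1.0 - alpha) * x[i]
            + alpha * sum(x[j] for j in neighbours[i]) / len(neighbours[i])
            for i in range(len(x))
        ]
    return x


# Toy path graph of four cells: 0 - 1 - 2 - 3.
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
# 1.0 = cell sampled from the treated condition, 0.0 = control.
condition = [1.0, 1.0, 0.0, 0.0]
smoothed = smooth_labels(graph, condition)
```

After smoothing, cells near the boundary between conditions receive intermediate values, which is the intuition behind estimating a per-cell perturbation effect from discrete sample labels.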
Stochasticity from function -- why the Bayesian brain may need no noise
An increasing body of evidence suggests that the trial-to-trial variability
of spiking activity in the brain is not mere noise, but rather the reflection
of a sampling-based encoding scheme for probabilistic computing. Since the
precise statistical properties of neural activity are important in this
context, many models assume an ad-hoc source of well-behaved, explicit noise,
either on the input or on the output side of single neuron dynamics, most often
assuming an independent Poisson process in either case. However, these
assumptions are somewhat problematic: neighboring neurons tend to share
receptive fields, rendering both their input and their output correlated; at
the same time, neurons are known to behave largely deterministically, as a
function of their membrane potential and conductance. We suggest that spiking
neural networks may, in fact, have no need for noise to perform sampling-based
Bayesian inference. We study analytically the effect of auto- and
cross-correlations in functionally Bayesian spiking networks and demonstrate
how their effect translates to synaptic interaction strengths, rendering them
controllable through synaptic plasticity. This allows even small ensembles of
interconnected deterministic spiking networks to simultaneously and
co-dependently shape their output activity through learning, enabling them to
perform complex Bayesian computation without any need for noise, which we
demonstrate in silico, both in classical simulation and in neuromorphic
emulation. These results close a gap between the abstract models and the
biology of functionally Bayesian spiking networks, effectively reducing the
architectural constraints imposed on physical neural substrates required to
perform probabilistic computing, be they biological or artificial.
Min–Max Hyperellipsoidal Clustering for Anomaly Detection in Network Security
A novel hyperellipsoidal clustering technique is presented for an intrusion-detection system in network security. Hyperellipsoidal clusters toward maximum intracluster similarity and minimum intercluster similarity are generated from training data sets. The novelty of the technique lies in the fact that the parameters needed to construct higher order data models in general multivariate Gaussian functions are incrementally derived from the data sets using accretive processes. The technique is implemented in a feedforward neural network that uses a Gaussian radial basis function as the model generator. An evaluation based on the inclusiveness and exclusiveness of samples with respect to specific criteria is applied to accretively learn the output clusters of the neural network. One significant advantage of this technique is its ability to detect individual anomaly types that are hard to detect with other anomaly-detection schemes. Applying this technique, several feature subsets of the tcptrace network-connection records that give above 95% detection at false-positive rates below 5% were identified.
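A hyperellipsoidal cluster can be read as a multivariate Gaussian whose covariance defines the shape of the boundary, so membership reduces to a Mahalanobis distance test. The sketch below is a simplified two-dimensional illustration under stated assumptions: it estimates the mean and covariance in one batch rather than incrementally as the paper describes, and the function names, the threshold and the sample points are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Cov = Tuple[float, float, float]  # (var_x, cov_xy, var_y)


def fit_cluster(points: List[Point]) -> Tuple[Point, Cov]:
    """Estimate the mean and sample covariance of a 2-D cluster (batch, not accretive)."""
    n = len(points)  # assumes n >= 2 and a non-degenerate covariance
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_sq(x: Point, mean: Point, cov: Cov) -> float:
    """Squared Mahalanobis distance using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def rbf_activation(x: Point, mean: Point, cov: Cov) -> float:
    """Gaussian radial basis response: 1 at the centre, decaying with distance."""
    return math.exp(-0.5 * mahalanobis_sq(x, mean, cov))


def is_anomaly(x: Point, mean: Point, cov: Cov, threshold_sq: float = 9.0) -> bool:
    """Flag samples outside the ellipse at the assumed squared-distance threshold."""
    return mahalanobis_sq(x, mean, cov) > threshold_sq


# Hypothetical normal-traffic feature vectors, spread more widely along x than y.
normal = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0), (2.0, 1.0), (1.0, 0.5)]
mean, cov = fit_cluster(normal)
```

Because the cluster is elongated along the first feature, a sample two units from the centre along x stays inside the boundary, while the same offset along y is flagged. A spherical cluster model would treat both cases identically.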