5 research outputs found

    The effectiveness of features in pattern recognition


    Information Theoretic Limits on Non-cooperative Airborne Target Recognition by Means of Radar Sensors

    The main objective of this research is to demonstrate that information theory, and specifically the concept of mutual information (MI), can be used to predict the maximum target recognition performance for a given radar concept in combination with a given set of targets of interest. This approach also allows disparate approaches to designing a radar concept capable of target recognition to be compared directly, without resorting to choosing specific feature extraction and classification algorithms. The main application area of the study is the recognition of fighter-type aircraft using surface-based radar systems, although the results are also applicable to airborne radars. Information theoretic concepts are developed mathematically for the analysis of the radar target recognition problem. The various forms of MI required for this application are derived in detail and are tested rigorously against results from digital communication theory. The results are also compared to Shannon’s channel capacity bound, which is the fundamental limit on the amount of information that can be transmitted over a channel. Several sets of simulation-based experiments were conducted to demonstrate the insights achievable by applying MI concepts to quantitatively predict the maximum achievable performance of disparate approaches to the radar target recognition problem. Asymptotic computational electromagnetic code was applied to calculate the target’s response to the radar signal for freely available geometrical models of fighter aircraft. The calculated target responses were then used to quantify the amount of information that is transmitted back to the radar about the target as a function of signal-to-noise ratio (SNR). The information content of the F-14, F-15 and F-16 was evaluated for a 480 MHz bandwidth waveform at 10 GHz as a baseline.
Several ultra-wideband (UWB) waveforms, spanning 2-10 GHz, 10-18 GHz and 2-18 GHz, but which were highly range ambiguous, were evaluated and showed SNR gains of 0.5-2 dB relative to the baseline. The effect of sensing the full polarimetric response of an F-18 and F-35 was evaluated and SNR gains of 5-7 dB over a single linear polarisation were measured. A Boeing 707 scale model (1:25) was measured in the University of Pretoria’s compact range spanning 2-18 GHz and gains of 2 dB were observed between single and dual linear polarisations. This required numerical integration in 8004 dimensions, demonstrating the stability of the MI estimation algorithm in high dimensional signal spaces. The information gained by including the difference channel signal of an X-band monopulse radar for the F-14 data set was approximately 3 dB at 50 km and increased to 4.5 dB at 2 km due to the increased target extent relative to the antenna pattern. This experiment necessitated the use of target profiles which were matched to the range of the target to achieve maximum information transfer. Experiments were conducted to evaluate the loss in information due to envelope processing. For the baseline data set, SNR losses in the region of 7 dB were measured. Linear pre-processing with the fast Fourier transform (FFT) and with principal component analysis (PCA), applied before envelope processing, was compared, and the PCA algorithm outperformed the FFT by approximately 1 dB at high MI values. Finally, the expression for multi-target MI was applied in conjunction with Fano’s inequality to predict the probability of incorrectly classifying a target. Probability of error is a critical parameter for a radar user. For the baseline data set, at P(error) = 0.001, maximum losses in the region of 0.6 to 0.9 dB were measured. This result shows that these targets are easily separable in the signal space.
This study was only the proverbial “tip of the iceberg” and future research could extend the results and applications of the techniques developed. The types of targets and the configurations of the individual targets analysed could be extended. The analysis should also be extended to describe effects internal to the radar, such as phase noise, spurious signals and analogue-to-digital converters, and external effects such as clutter and multipath. The techniques could also be applied to quantify the gains in target recognition performance achievable for multistatic radar, multiple input multiple output (MIMO) radar and more exotic concepts, such as the fusion of data from multiple monostatic microwave radars with multi-receiver multi-band passive bistatic radar (PBR) data.

    Enhancing brain-computer interfacing through advanced independent component analysis techniques

    A brain-computer interface (BCI) is a direct communication system between a brain and an external device in which messages or commands sent by an individual do not pass through the brain’s normal output pathways but are detected through brain signals. Some severe conditions, such as Amyotrophic Lateral Sclerosis, head trauma and spinal injuries, may cause patients to lose muscle control and become unable to communicate with the outside environment. No effective cure or treatment has yet been found for these conditions, so using a BCI system to rebuild the communication pathway is a possible alternative solution. Among the different types of BCI, the electroencephalogram (EEG) based BCI is becoming popular owing to EEG’s fine temporal resolution, ease of use, portability and low set-up cost. However, EEG’s susceptibility to noise is a major obstacle to developing a robust BCI. Signal processing techniques such as coherent averaging, filtering, FFT and AR modelling are used to reduce the noise and extract components of interest. However, these methods process the data in the observed mixture domain, in which components of interest and noise are mixed. As a result, the extracted EEG signals may still contain residual noise or, conversely, the removed noise may contain part of the EEG signal. Independent Component Analysis (ICA), a Blind Source Separation (BSS) technique, is able to extract relevant information from noisy signals and separate the underlying sources into independent components (ICs). The most common assumption of ICA is that the source signals are unknown and statistically independent. Under this assumption, ICA is able to recover the source signals.
Since ICA concepts appeared in the fields of neural networks and signal processing in the 1980s, many ICA applications in telecommunications, biomedical data analysis, feature extraction, speech separation, time-series analysis and data mining have been reported in the literature. In this thesis several ICA techniques are proposed to address two major issues for BCI applications: reducing the recording time needed, in order to speed up the signal processing, and reducing the number of recording channels, whilst improving the final classification performance or at least maintaining it at the current level. These will make BCI a more practical prospect for everyday use. This thesis first defines BCI and the diverse BCI models based on different control patterns. After the general idea of ICA is introduced along with some modifications to ICA, several new ICA approaches are proposed. The practical work in this thesis starts with preliminary analyses of the Southampton BCI pilot datasets, using first basic and then advanced signal processing techniques. The proposed ICA techniques are then presented using a multi-channel event related potential (ERP) based BCI. Next, the ICA algorithm is applied to a multi-channel spontaneous activity based BCI. The final ICA approach examines the possibility of using ICA with just one or a few recording channels on an ERP based BCI. The novel ICA approaches for BCI systems presented in this thesis show that ICA is able to accurately and repeatedly extract the relevant information buried within noisy signals, and that the signal quality is enhanced so that even a simple classifier can achieve good classification accuracy. In the ERP based BCI application, data processed with multi-channel ICA achieve 83.9% classification accuracy using just eight averages/epochs, whilst data processed by coherent averaging reach only 32.3% accuracy.
In the spontaneous activity based BCI, the multi-channel ICA algorithm can effectively extract discriminatory information from two types of single-trial EEG data. The classification accuracy is improved by about 25%, on average, compared to the performance on the unpreprocessed data. The single-channel ICA technique on the ERP based BCI produces much better results than the lowpass filter. In addition, an appropriate number of averages improves the signal-to-noise ratio of P300 activity, which helps to achieve better classification. These advantages will lead to a reliable and practical BCI for use outside of the clinical laboratory.
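
The abstract above refers to coherent averaging and to the effect of the number of averages on the signal-to-noise ratio. As a rough illustration only, using synthetic data rather than anything from the thesis, the sketch below averages time-locked epochs and compares the residual noise before and after averaging. The Gaussian-shaped template, the noise level, the epoch count and the name `coherent_average` are all assumptions made for this example; for independent zero-mean noise, averaging N epochs is expected to reduce the noise standard deviation by roughly sqrt(N).

```python
import numpy as np


def coherent_average(epochs: np.ndarray) -> np.ndarray:
    """Average time-locked epochs (shape: n_epochs x n_samples) sample by sample."""
    return epochs.mean(axis=0)


rng = np.random.default_rng(0)
n_epochs, n_samples = 64, 1000
t = np.linspace(0.0, 1.0, n_samples)

# Hypothetical ERP-like template: a smooth bump centred at t = 0.3.
erp = np.exp(-((t - 0.3) ** 2) / (2 * 0.05 ** 2))

# Each epoch is the same template plus independent Gaussian noise.
epochs = erp + rng.normal(0.0, 2.0, size=(n_epochs, n_samples))

# Residual noise in a single epoch versus in the coherent average.
single_noise = np.std(epochs[0] - erp)
averaged_noise = np.std(coherent_average(epochs) - erp)
improvement = single_noise / averaged_noise  # expected to be near sqrt(64) = 8
```

The sketch only shows why more averages help; it says nothing about how ICA compares, which is the subject of the thesis itself.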

    Discriminant feature pursuit: from statistical learning to informative learning.

    Lin Dahua. Thesis (M.Phil.)--Chinese University of Hong Kong, 2006. Includes bibliographical references (leaves 233-250). Abstracts in English and Chinese.
    Contents:
    Abstract. Acknowledgement.
    Chapter 1: Introduction. 1.1 The Problem We are Facing; 1.2 Generative vs. Discriminative Models; 1.3 Statistical Feature Extraction: Success and Challenge; 1.4 Overview of Our Works (1.4.1 New Linear Discriminant Methods: Generalized LDA Formulation and Performance-Driven Subspace Learning; 1.4.2 Coupled Learning Models: Coupled Space Learning and Inter-Modality Recognition; 1.4.3 Informative Learning Approaches: Conditional Infomax Learning and Information Channel Model); 1.5 Organization of the Thesis.
    Part I: History and Background.
    Chapter 2: Statistical Pattern Recognition. 2.1 Patterns and Classifiers; 2.2 Bayes Theory; 2.3 Statistical Modeling (2.3.1 Maximum Likelihood Estimation; 2.3.2 Gaussian Model; 2.3.3 Expectation-Maximization; 2.3.4 Finite Mixture Model; 2.3.5 A Nonparametric Technique: Parzen Windows).
    Chapter 3: Statistical Learning Theory. 3.1 Formulation of Learning Model (3.1.1 Learning: Functional Estimation Model; 3.1.2 Representative Learning Problems; 3.1.3 Empirical Risk Minimization); 3.2 Consistency and Convergence of Learning (3.2.1 Concept of Consistency; 3.2.2 The Key Theorem of Learning Theory; 3.2.3 VC Entropy; 3.2.4 Bounds on Convergence; 3.2.5 VC Dimension).
    Chapter 4: History of Statistical Feature Extraction. 4.1 Linear Feature Extraction (4.1.1 Principal Component Analysis (PCA); 4.1.2 Linear Discriminant Analysis (LDA); 4.1.3 Other Linear Feature Extraction Methods; 4.1.4 Comparison of Different Methods); 4.2 Enhanced Models (4.2.1 Stochastic Discrimination and Random Subspace; 4.2.2 Hierarchical Feature Extraction; 4.2.3 Multilinear Analysis and Tensor-based Representation); 4.3 Nonlinear Feature Extraction (4.3.1 Kernelization; 4.3.2 Dimension reduction by Manifold Embedding).
    Chapter 5: Related Works in Feature Extraction. 5.1 Dimension Reduction (5.1.1 Feature Selection; 5.1.2 Feature Extraction); 5.2 Kernel Learning (5.2.1 Basic Concepts of Kernel; 5.2.2 The Reproducing Kernel Map; 5.2.3 The Mercer Kernel Map; 5.2.4 The Empirical Kernel Map; 5.2.5 Kernel Trick and Kernelized Feature Extraction); 5.3 Subspace Analysis (5.3.1 Basis and Subspace; 5.3.2 Orthogonal Projection; 5.3.3 Orthonormal Basis; 5.3.4 Subspace Decomposition); 5.4 Principal Component Analysis (5.4.1 PCA Formulation; 5.4.2 Solution to PCA; 5.4.3 Energy Structure of PCA; 5.4.4 Probabilistic Principal Component Analysis; 5.4.5 Kernel Principal Component Analysis); 5.5 Independent Component Analysis (5.5.1 ICA Formulation; 5.5.2 Measurement of Statistical Independence); 5.6 Linear Discriminant Analysis (5.6.1 Fisher's Linear Discriminant Analysis; 5.6.2 Improved Algorithms for Small Sample Size Problem; 5.6.3 Kernel Discriminant Analysis).
    Part II: Improvement in Linear Discriminant Analysis.
    Chapter 6: Generalized LDA. 6.1 Regularized LDA (6.1.1 Generalized LDA Implementation Procedure; 6.1.2 Optimal Nonsingular Approximation; 6.1.3 Regularized LDA algorithm); 6.2 A Statistical View: When is LDA optimal? (6.2.1 Two-class Gaussian Case; 6.2.2 Multi-class Cases); 6.3 Generalized LDA Formulation (6.3.1 Mathematical Preparation; 6.3.2 Generalized Formulation).
    Chapter 7: Dynamic Feedback Generalized LDA. 7.1 Basic Principle; 7.2 Dynamic Feedback Framework (7.2.1 Initialization: K-Nearest Construction; 7.2.2 Dynamic Procedure); 7.3 Experiments (7.3.1 Performance in Training Stage; 7.3.2 Performance on Testing set).
    Chapter 8: Performance-Driven Subspace Learning. 8.1 Motivation and Principle; 8.2 Performance-Based Criteria (8.2.1 The Verification Problem and Generalized Average Margin; 8.2.2 Performance Driven Criteria based on Generalized Average Margin); 8.3 Optimal Subspace Pursuit (8.3.1 Optimal threshold; 8.3.2 Optimal projection matrix; 8.3.3 Overall procedure; 8.3.4 Discussion of the Algorithm); 8.4 Optimal Classifier Fusion; 8.5 Experiments (8.5.1 Performance Measurement; 8.5.2 Experiment Setting; 8.5.3 Experiment Results; 8.5.4 Discussion).
    Part III: Coupled Learning of Feature Transforms.
    Chapter 9: Coupled Space Learning. 9.1 Introduction (9.1.1 What is Image Style Transform; 9.1.2 Overview of our Framework); 9.2 Coupled Space Learning (9.2.1 Framework of Coupled Modelling; 9.2.2 Correlative Component Analysis; 9.2.3 Coupled Bidirectional Transform; 9.2.4 Procedure of Coupled Space Learning); 9.3 Generalization to Mixture Model (9.3.1 Coupled Gaussian Mixture Model; 9.3.2 Optimization by EM Algorithm); 9.4 Integrated Framework for Image Style Transform; 9.5 Experiments (9.5.1 Face Super-resolution; 9.5.2 Portrait Style Transforms).
    Chapter 10: Inter-Modality Recognition. 10.1 Introduction to the Inter-Modality Recognition Problem (10.1.1 What is Inter-Modality Recognition; 10.1.2 Overview of Our Feature Extraction Framework); 10.2 Common Discriminant Feature Extraction (10.2.1 Formulation of the Learning Problem; 10.2.2 Matrix-Form of the Objective; 10.2.3 Solving the Linear Transforms); 10.3 Kernelized Common Discriminant Feature Extraction; 10.4 Multi-Mode Framework (10.4.1 Multi-Mode Formulation; 10.4.2 Optimization Scheme); 10.5 Experiments (10.5.1 Experiment Settings; 10.5.2 Experiment Results).
    Part IV: A New Perspective: Informative Learning.
    Chapter 11: Toward Information Theory. 11.1 Entropy and Mutual Information (11.1.1 Entropy; 11.1.2 Relative Entropy, i.e. Kullback Leibler Divergence); 11.2 Mutual Information (11.2.1 Definition of Mutual Information; 11.2.2 Chain rules; 11.2.3 Information in Data Processing); 11.3 Differential Entropy (11.3.1 Differential Entropy of Continuous Random Variable; 11.3.2 Mutual Information of Continuous Random Variable).
    Chapter 12: Conditional Infomax Learning. 12.1 An Overview; 12.2 Conditional Informative Feature Extraction (12.2.1 Problem Formulation and Features; 12.2.2 The Information Maximization Principle; 12.2.3 The Information Decomposition and the Conditional Objective); 12.3 The Efficient Optimization (12.3.1 Discrete Approximation Based on AEP; 12.3.2 Analysis of Terms and Their Derivatives; 12.3.3 Local Active Region Method); 12.4 Bayesian Feature Fusion with Sparse Prior; 12.5 The Integrated Framework for Feature Learning; 12.6 Experiments (12.6.1 A Toy Problem; 12.6.2 Face Recognition).
    Chapter 13: Channel-based Maximum Effective Information. 13.1 Motivation and Overview; 13.2 Maximizing Effective Information (13.2.1 Relation between Mutual Information and Classification; 13.2.2 Linear Projection and Metric; 13.2.3 Channel Model and Effective Information; 13.2.4 Parzen Window Approximation); 13.3 Parameter Optimization on Grassmann Manifold (13.3.1 Grassmann Manifold; 13.3.2 Conjugate Gradient Optimization on Grassmann Manifold; 13.3.3 Computation of Gradient); 13.4 Experiments (13.4.1 A Toy Problem; 13.4.2 Face Recognition).
    Chapter 14: Conclusion.

    A data fusion-based hybrid sensory system for older people’s daily activity recognition.

    The population aged 60 and over is growing rapidly. Ageing-related changes, such as physical or cognitive decline, can affect people’s quality of life, resulting in injuries, mental health problems or a lack of physical activity. Sensor-based human activity recognition (HAR) has become one of the most promising assistive technologies for older people’s daily life. The HAR literature suggests that each sensor modality has its strengths and limitations, and that a single sensor modality may not cope with complex situations in practice. This research aims to design and implement a hybrid sensory HAR system to provide more comprehensive, practical and accurate surveillance for older people and to assist them in living independently. This research: 1) designs and develops a hybrid HAR system which provides spatio-temporal surveillance for older people by combining wrist-worn sensors and room-mounted ambient sensors (passive infrared); the wearable data are used to recognize the defined specific daily activities, and the ambient information is used to infer the occupant’s room-level daily routine; 2) proposes a unique and effective data fusion method to hybridize the two-source sensory data, in which the room-level location information captured by the ambient sensors is also used to trigger the sub classification models pretrained on room-assigned wearable data; 3) implements augmented features extracted from the attitude angles of the wearable device and explores the contribution of the new features to HAR; 4) proposes a feature selection (FS) method based on kernel canonical correlation analysis (KCCA), named mRMJR-KCCA, which maximizes the relevance between the feature candidate and the target class labels while simultaneously minimizing the joint redundancy between the already selected features and the feature candidate; 5) demonstrates all the proposed methods above with ground-truth data collected from recruited participants in home settings.
The proposed system has three function modes: 1) the pure wearable sensing mode (the whole classification model), which can identify all the defined specific daily activities together and function alone when the ambient sensing fails; 2) the pure ambient sensing mode, which can deliver the occupant’s room-level daily routine without wearable sensing; and 3) the data fusion mode (room-based sub classification mode), which provides more comprehensive and accurate HAR surveillance when both the wearable sensing and the ambient sensing function properly. The research also applies mutual information (MI)-based FS methods for feature selection, and Support Vector Machine (SVM) and Random Forest (RF) for classification. The experimental results demonstrate that the proposed hybrid sensory system improves the recognition accuracy to 98.96% after applying data fusion using RF classification and mRMJR-KCCA feature selection. Furthermore, the improved results are achieved with a much smaller number of features than in the scenario of recognizing all the defined activities using wearable data alone. The research conducted in this thesis is not compared directly with other work, because few similar works exist in terms of the proposed data fusion method and the newly introduced feature set.
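
The abstract above describes selecting features that are highly relevant to the class labels while having low redundancy with features already chosen, and it also mentions MI-based FS methods. The sketch below illustrates that relevance-minus-redundancy idea with a plain histogram estimate of mutual information and a greedy search. It is not the thesis’s mRMJR-KCCA method (no KCCA is involved), and the function names, bin count and synthetic data are assumptions made purely for illustration.

```python
import numpy as np


def mutual_information(x, y, bins=8):
    """Histogram-based estimate of I(x; y) in bits (a coarse, biased estimator)."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))


def greedy_mrmr(features, labels, k, bins=8):
    """Greedily pick k feature indices by relevance minus mean redundancy."""
    n_features = features.shape[1]
    k = min(k, n_features)
    relevance = [mutual_information(features[:, j], labels, bins)
                 for j in range(n_features)]
    selected = [int(np.argmax(relevance))]  # start with the most relevant feature
    while len(selected) < k:
        best_j, best_score = None, -np.inf
        for j in range(n_features):
            if j in selected:
                continue
            # Mean MI between the candidate and the features already selected.
            redundancy = np.mean([
                mutual_information(features[:, j], features[:, s], bins)
                for s in selected
            ])
            score = relevance[j] - redundancy
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected


# Synthetic example: the label depends on two independent bits a and b.
rng = np.random.default_rng(0)
n = 2000
a = rng.integers(0, 2, n)
b = rng.integers(0, 2, n)
labels = 2 * a + b
f0 = a + 0.05 * rng.normal(size=n)    # informative about a
f1 = f0 + 0.01 * rng.normal(size=n)   # near-duplicate of f0 (redundant)
f2 = b + 0.05 * rng.normal(size=n)    # informative about b
f3 = rng.normal(size=n)               # pure noise
features = np.column_stack([f0, f1, f2, f3])
selected = greedy_mrmr(features, labels, k=2)
```

In this toy setting the redundancy term steers the second pick away from the near-duplicate feature, so the two selected features cover both bits of the label rather than the same bit twice.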