34 research outputs found

    Face recognition using nonparametric-weighted Fisherfaces

    Get PDF
    This study presents an appearance-based face recognition scheme called the nonparametric-weighted Fisherfaces (NW-Fisherfaces). Pixels in a facial image are considered as coordinates in a high-dimensional space and are transformed into a face subspace for analysis by using nonparametric-weighted feature extraction (NWFE). According to previous studies of hyperspectral image classification, NWFE is a powerful tool for extracting hyperspectral image features. The Fisherfaces method maximizes the ratio of between-class scatter to that of within-class scatter. In this study, the proposed NW-Fisherfaces weighted the between-class scatter to emphasize the boundary structure of the transformed face subspace and, therefore, enhances the separability for different persons' face. The proposed NW-Fisherfaces was compared with Orthogonal Laplacianfaces, Eigenfaces, Fisherfaces, direct linear discriminant analysis, and null space linear discriminant analysis methods for tests on five facial databases. Experimental results showed that the proposed approach outperforms other feature extraction methods for most databases. © 2012 Li et al

    Dimension-reduction and discrimination of neuronal multi-channel signals

    Get PDF
    Dimensionsreduktion und Trennung neuronaler Multikanal-Signale

    Multi-Objective Optimization in Metabolomics/Computational Intelligence

    Get PDF
    The development of reliable computational models for detecting non-linear patterns encased in throughput datasets and characterizing them into phenotypic classes has been of particular interest and comprises dynamic studies in metabolomics and other disciplines that are encompassed within the omics science. Some of the clinical conditions that have been associated with these studies include metabotypes in cancer, in ammatory bowel disease (IBD), asthma, diabetes, traumatic brain injury (TBI), metabolic syndrome, and Parkinson's disease, just to mention a few. The traction in this domain is attributable to the advancements in the procedures involved in 1H NMR-linked datasets acquisition, which have fuelled the generation of a wide abundance of datasets. Throughput datasets generated by modern 1H NMR spectrometers are often characterized with features that are uninformative, redundant and inherently correlated. This renders it di cult for conventional multivariate analysis techniques to e ciently capture important signals and patterns. Therefore, the work covered in this research thesis provides novel alternative techniques to address the limitations of current analytical pipelines. This work delineates 13 variants of population-based nature inspired metaheuristic optimization algorithms which were further developed in this thesis as wrapper-based feature selection optimizers. The optimizers were then evaluated and benchmarked against each other through numerical experiments. Large-scale 1H NMR-linked datasets emerging from three disease studies were employed for the evaluations. The rst is a study in patients diagnosed with Malan syndrome; an autosomal dominant inherited disorder marked by a distinctive facial appearance, learning disabilities, and gigantism culminating in tall stature and macrocephaly, also referred to as cerebral gigantism. Another study involved Niemann-Pick Type C1 (NP-C1), a rare progressive neurodegenerative condition marked by intracellular accrual of cholesterol and complex lipids including sphingolipids and phospholipids in the endosomal/lysosomal system. The third study involved sore throat investigation in human (also known as `pharyngitis'); an acute infection of the upper respiratory tract that a ects the respiratory mucosa of the throat. In all three cases, samples from pathologically-con rmed cohorts with corresponding controls were acquired, and metabolomics investigations were performed using 1H NMR technique. Thereafter, computational optimizations were conducted on all three high-dimensional datasets that were generated from the disease studies outlined, so that key biomarkers and most e cient optimizers were identi ed in each study. The clinical and biochemical signi cance of the results arising from this work were discussed and highlighted

    Discriminant feature pursuit: from statistical learning to informative learning.

    Get PDF
    Lin Dahua.Thesis (M.Phil.)--Chinese University of Hong Kong, 2006.Includes bibliographical references (leaves 233-250).Abstracts in English and Chinese.Abstract --- p.iAcknowledgement --- p.iiiChapter 1 --- Introduction --- p.1Chapter 1.1 --- The Problem We are Facing --- p.1Chapter 1.2 --- Generative vs. Discriminative Models --- p.2Chapter 1.3 --- Statistical Feature Extraction: Success and Challenge --- p.3Chapter 1.4 --- Overview of Our Works --- p.5Chapter 1.4.1 --- New Linear Discriminant Methods: Generalized LDA Formulation and Performance-Driven Sub space Learning --- p.5Chapter 1.4.2 --- Coupled Learning Models: Coupled Space Learning and Inter Modality Recognition --- p.6Chapter 1.4.3 --- Informative Learning Approaches: Conditional Infomax Learning and Information Chan- nel Model --- p.6Chapter 1.5 --- Organization of the Thesis --- p.8Chapter I --- History and Background --- p.10Chapter 2 --- Statistical Pattern Recognition --- p.11Chapter 2.1 --- Patterns and Classifiers --- p.11Chapter 2.2 --- Bayes Theory --- p.12Chapter 2.3 --- Statistical Modeling --- p.14Chapter 2.3.1 --- Maximum Likelihood Estimation --- p.14Chapter 2.3.2 --- Gaussian Model --- p.15Chapter 2.3.3 --- Expectation-Maximization --- p.17Chapter 2.3.4 --- Finite Mixture Model --- p.18Chapter 2.3.5 --- A Nonparametric Technique: Parzen Windows --- p.21Chapter 3 --- Statistical Learning Theory --- p.24Chapter 3.1 --- Formulation of Learning Model --- p.24Chapter 3.1.1 --- Learning: Functional Estimation Model --- p.24Chapter 3.1.2 --- Representative Learning Problems --- p.25Chapter 3.1.3 --- Empirical Risk Minimization --- p.26Chapter 3.2 --- Consistency and Convergence of Learning --- p.27Chapter 3.2.1 --- Concept of Consistency --- p.27Chapter 3.2.2 --- The Key Theorem of Learning Theory --- p.28Chapter 3.2.3 --- VC Entropy --- p.29Chapter 3.2.4 --- Bounds on Convergence --- p.30Chapter 3.2.5 --- VC Dimension --- p.35Chapter 4 --- History of Statistical Feature Extraction --- p.38Chapter 4.1 --- Linear Feature Extraction --- p.38Chapter 4.1.1 --- Principal Component Analysis (PCA) --- p.38Chapter 4.1.2 --- Linear Discriminant Analysis (LDA) --- p.41Chapter 4.1.3 --- Other Linear Feature Extraction Methods --- p.46Chapter 4.1.4 --- Comparison of Different Methods --- p.48Chapter 4.2 --- Enhanced Models --- p.49Chapter 4.2.1 --- Stochastic Discrimination and Random Subspace --- p.49Chapter 4.2.2 --- Hierarchical Feature Extraction --- p.51Chapter 4.2.3 --- Multilinear Analysis and Tensor-based Representation --- p.52Chapter 4.3 --- Nonlinear Feature Extraction --- p.54Chapter 4.3.1 --- Kernelization --- p.54Chapter 4.3.2 --- Dimension reduction by Manifold Embedding --- p.56Chapter 5 --- Related Works in Feature Extraction --- p.59Chapter 5.1 --- Dimension Reduction --- p.59Chapter 5.1.1 --- Feature Selection --- p.60Chapter 5.1.2 --- Feature Extraction --- p.60Chapter 5.2 --- Kernel Learning --- p.61Chapter 5.2.1 --- Basic Concepts of Kernel --- p.61Chapter 5.2.2 --- The Reproducing Kernel Map --- p.62Chapter 5.2.3 --- The Mercer Kernel Map --- p.64Chapter 5.2.4 --- The Empirical Kernel Map --- p.65Chapter 5.2.5 --- Kernel Trick and Kernelized Feature Extraction --- p.66Chapter 5.3 --- Subspace Analysis --- p.68Chapter 5.3.1 --- Basis and Subspace --- p.68Chapter 5.3.2 --- Orthogonal Projection --- p.69Chapter 5.3.3 --- Orthonormal Basis --- p.70Chapter 5.3.4 --- Subspace Decomposition --- p.70Chapter 5.4 --- Principal Component Analysis --- p.73Chapter 5.4.1 --- PCA Formulation --- p.73Chapter 5.4.2 --- Solution to PCA --- p.75Chapter 5.4.3 --- Energy Structure of PCA --- p.76Chapter 5.4.4 --- Probabilistic Principal Component Analysis --- p.78Chapter 5.4.5 --- Kernel Principal Component Analysis --- p.81Chapter 5.5 --- Independent Component Analysis --- p.83Chapter 5.5.1 --- ICA Formulation --- p.83Chapter 5.5.2 --- Measurement of Statistical Independence --- p.84Chapter 5.6 --- Linear Discriminant Analysis --- p.85Chapter 5.6.1 --- Fisher's Linear Discriminant Analysis --- p.85Chapter 5.6.2 --- Improved Algorithms for Small Sample Size Problem . --- p.89Chapter 5.6.3 --- Kernel Discriminant Analysis --- p.92Chapter II --- Improvement in Linear Discriminant Analysis --- p.100Chapter 6 --- Generalized LDA --- p.101Chapter 6.1 --- Regularized LDA --- p.101Chapter 6.1.1 --- Generalized LDA Implementation Procedure --- p.101Chapter 6.1.2 --- Optimal Nonsingular Approximation --- p.103Chapter 6.1.3 --- Regularized LDA algorithm --- p.104Chapter 6.2 --- A Statistical View: When is LDA optimal? --- p.105Chapter 6.2.1 --- Two-class Gaussian Case --- p.106Chapter 6.2.2 --- Multi-class Cases --- p.107Chapter 6.3 --- Generalized LDA Formulation --- p.108Chapter 6.3.1 --- Mathematical Preparation --- p.108Chapter 6.3.2 --- Generalized Formulation --- p.110Chapter 7 --- Dynamic Feedback Generalized LDA --- p.112Chapter 7.1 --- Basic Principle --- p.112Chapter 7.2 --- Dynamic Feedback Framework --- p.113Chapter 7.2.1 --- Initialization: K-Nearest Construction --- p.113Chapter 7.2.2 --- Dynamic Procedure --- p.115Chapter 7.3 --- Experiments --- p.115Chapter 7.3.1 --- Performance in Training Stage --- p.116Chapter 7.3.2 --- Performance on Testing set --- p.118Chapter 8 --- Performance-Driven Subspace Learning --- p.119Chapter 8.1 --- Motivation and Principle --- p.119Chapter 8.2 --- Performance-Based Criteria --- p.121Chapter 8.2.1 --- The Verification Problem and Generalized Average Margin --- p.122Chapter 8.2.2 --- Performance Driven Criteria based on Generalized Average Margin --- p.123Chapter 8.3 --- Optimal Subspace Pursuit --- p.125Chapter 8.3.1 --- Optimal threshold --- p.125Chapter 8.3.2 --- Optimal projection matrix --- p.125Chapter 8.3.3 --- Overall procedure --- p.129Chapter 8.3.4 --- Discussion of the Algorithm --- p.129Chapter 8.4 --- Optimal Classifier Fusion --- p.130Chapter 8.5 --- Experiments --- p.131Chapter 8.5.1 --- Performance Measurement --- p.131Chapter 8.5.2 --- Experiment Setting --- p.131Chapter 8.5.3 --- Experiment Results --- p.133Chapter 8.5.4 --- Discussion --- p.139Chapter III --- Coupled Learning of Feature Transforms --- p.140Chapter 9 --- Coupled Space Learning --- p.141Chapter 9.1 --- Introduction --- p.142Chapter 9.1.1 --- What is Image Style Transform --- p.142Chapter 9.1.2 --- Overview of our Framework --- p.143Chapter 9.2 --- Coupled Space Learning --- p.143Chapter 9.2.1 --- Framework of Coupled Modelling --- p.143Chapter 9.2.2 --- Correlative Component Analysis --- p.145Chapter 9.2.3 --- Coupled Bidirectional Transform --- p.148Chapter 9.2.4 --- Procedure of Coupled Space Learning --- p.151Chapter 9.3 --- Generalization to Mixture Model --- p.152Chapter 9.3.1 --- Coupled Gaussian Mixture Model --- p.152Chapter 9.3.2 --- Optimization by EM Algorithm --- p.152Chapter 9.4 --- Integrated Framework for Image Style Transform --- p.154Chapter 9.5 --- Experiments --- p.156Chapter 9.5.1 --- Face Super-resolution --- p.156Chapter 9.5.2 --- Portrait Style Transforms --- p.157Chapter 10 --- Inter-Modality Recognition --- p.162Chapter 10.1 --- Introduction to the Inter-Modality Recognition Problem . . . --- p.163Chapter 10.1.1 --- What is Inter-Modality Recognition --- p.163Chapter 10.1.2 --- Overview of Our Feature Extraction Framework . . . . --- p.163Chapter 10.2 --- Common Discriminant Feature Extraction --- p.165Chapter 10.2.1 --- Formulation of the Learning Problem --- p.165Chapter 10.2.2 --- Matrix-Form of the Objective --- p.168Chapter 10.2.3 --- Solving the Linear Transforms --- p.169Chapter 10.3 --- Kernelized Common Discriminant Feature Extraction --- p.170Chapter 10.4 --- Multi-Mode Framework --- p.172Chapter 10.4.1 --- Multi-Mode Formulation --- p.172Chapter 10.4.2 --- Optimization Scheme --- p.174Chapter 10.5 --- Experiments --- p.176Chapter 10.5.1 --- Experiment Settings --- p.176Chapter 10.5.2 --- Experiment Results --- p.177Chapter IV --- A New Perspective: Informative Learning --- p.180Chapter 11 --- Toward Information Theory --- p.181Chapter 11.1 --- Entropy and Mutual Information --- p.181Chapter 11.1.1 --- Entropy --- p.182Chapter 11.1.2 --- Relative Entropy (Kullback Leibler Divergence) --- p.184Chapter 11.2 --- Mutual Information --- p.184Chapter 11.2.1 --- Definition of Mutual Information --- p.184Chapter 11.2.2 --- Chain rules --- p.186Chapter 11.2.3 --- Information in Data Processing --- p.188Chapter 11.3 --- Differential Entropy --- p.189Chapter 11.3.1 --- Differential Entropy of Continuous Random Variable . --- p.189Chapter 11.3.2 --- Mutual Information of Continuous Random Variable . --- p.190Chapter 12 --- Conditional Infomax Learning --- p.191Chapter 12.1 --- An Overview --- p.192Chapter 12.2 --- Conditional Informative Feature Extraction --- p.193Chapter 12.2.1 --- Problem Formulation and Features --- p.193Chapter 12.2.2 --- The Information Maximization Principle --- p.194Chapter 12.2.3 --- The Information Decomposition and the Conditional Objective --- p.195Chapter 12.3 --- The Efficient Optimization --- p.197Chapter 12.3.1 --- Discrete Approximation Based on AEP --- p.197Chapter 12.3.2 --- Analysis of Terms and Their Derivatives --- p.198Chapter 12.3.3 --- Local Active Region Method --- p.200Chapter 12.4 --- Bayesian Feature Fusion with Sparse Prior --- p.201Chapter 12.5 --- The Integrated Framework for Feature Learning --- p.202Chapter 12.6 --- Experiments --- p.203Chapter 12.6.1 --- A Toy Problem --- p.203Chapter 12.6.2 --- Face Recognition --- p.204Chapter 13 --- Channel-based Maximum Effective Information --- p.209Chapter 13.1 --- Motivation and Overview --- p.209Chapter 13.2 --- Maximizing Effective Information --- p.211Chapter 13.2.1 --- Relation between Mutual Information and Classification --- p.211Chapter 13.2.2 --- Linear Projection and Metric --- p.212Chapter 13.2.3 --- Channel Model and Effective Information --- p.213Chapter 13.2.4 --- Parzen Window Approximation --- p.216Chapter 13.3 --- Parameter Optimization on Grassmann Manifold --- p.217Chapter 13.3.1 --- Grassmann Manifold --- p.217Chapter 13.3.2 --- Conjugate Gradient Optimization on Grassmann Manifold --- p.219Chapter 13.3.3 --- Computation of Gradient --- p.221Chapter 13.4 --- Experiments --- p.222Chapter 13.4.1 --- A Toy Problem --- p.222Chapter 13.4.2 --- Face Recognition --- p.223Chapter 14 --- Conclusion --- p.23

    Visual Scene Understanding by Deep Fisher Discriminant Learning

    No full text
    Modern deep learning has recently revolutionized several fields of classic machine learning and computer vision, such as, scene understanding, natural language processing and machine translation. The substitution of feature hand-crafting with automatic feature learning, provides an excellent opportunity for gaining an in-depth understanding of large-scale data statistics. Deep neural networks generally train models with huge numbers of parameters, facilitating efficient search for optimal and sub-optimal spaces of highly non-convex objective functions. On the other hand, Fisher discriminant analysis has been widely employed to impose class discrepancy, for the sake of segmentation, classification, and recognition tasks. This thesis bridges between contemporary deep learning and classic discriminant analysis, to accommodate some important challenges in visual scene understanding, i.e. semantic segmentation, texture classification, and object recognition. The aim is to accomplish specific tasks in some new high-dimensional spaces, covered by the statistical information of the datasets under study. Inspired by a new formulation of Fisher discriminant analysis, this thesis introduces some novel arrangements of well-known deep learning architectures, to achieve better performances on the targeted missions. The theoretical justifications are based upon a large body of experimental work, and consolidate the contribution of the proposed idea; Deep Fisher Discriminant Learning, to several challenges in visual scene understanding

    Recherche d'images par le contenu, analyse multirésolution et modèles de régression logistique

    Get PDF
    Cette thèse, présente l'ensemble de nos contributions relatives à la recherche d'images par le contenu à l'aide de l'analyse multirésolution ainsi qu'à la classification linéaire et nonlinéaire. Dans la première partie, nous proposons une méthode simple et rapide de recherche d'images par le contenu. Pour représenter les images couleurs, nous introduisons de nouveaux descripteurs de caractéristiques qui sont des histogrammes pondérés par le gradient multispectral. Afin de mesurer le degré de similarité entre deux images d'une façon rapide et efficace, nous utilisons une pseudo-métrique pondérée qui utilise la décomposition en ondelettes et la compression des histogrammes extraits des images. Les poids de la pseudo-métrique sont ajustés à l'aide du modèle classique de régression logistique afin d'améliorer sa capacité à discriminer et la précision de la recherche. Dans la deuxième partie, nous proposons un nouveau modèle bayésien de régression logistique fondé sur une méthode variationnelle. Une comparaison de ce nouveau modèle au modèle classique de régression logistique est effectuée dans le cadre de la recherche d'images. Nous illustrons par la suite que le modèle bayésien permet par rapport au modèle classique une amélioration notoire de la capacité à discriminer de la pseudo-métrique et de la précision de recherche. Dans la troisième partie, nous détaillons la dérivation du nouveau modèle bayésien de régression logistique fondé sur une méthode variationnelle et nous comparons ce modèle au modèle classique de régression logistique ainsi qu'à d'autres classificateurs linéaires présents dans la littérature. Nous comparons par la suite, notre méthode de recherche, utilisant le modèle bayésien de régression logistique, à d'autres méthodes de recherches déjà publiées. Dans la quatrième partie, nous introduisons la sélection des caractéristiques pour améliorer notre méthode de recherche utilisant le modèle introduit ci-dessus. En effet, la sélection des caractéristiques permet de donner automatiquement plus d'importance aux caractéristiques qui discriminent le plus et moins d'importance aux caractéristiques qui discriminent le moins. Finalement, dans la cinquième partie, nous proposons un nouveau modèle bayésien d'analyse discriminante logistique construit à l'aide de noyaux permettant ainsi une classification nonlinéaire flexible

    The Singular Value Decomposition, Applications and Beyond

    Full text link
    The singular value decomposition (SVD) is not only a classical theory in matrix computation and analysis, but also is a powerful tool in machine learning and modern data analysis. In this tutorial we first study the basic notion of SVD and then show the central role of SVD in matrices. Using majorization theory, we consider variational principles of singular values and eigenvalues. Built on SVD and a theory of symmetric gauge functions, we discuss unitarily invariant norms, which are then used to formulate general results for matrix low rank approximation. We study the subdifferentials of unitarily invariant norms. These results would be potentially useful in many machine learning problems such as matrix completion and matrix data classification. Finally, we discuss matrix low rank approximation and its recent developments such as randomized SVD, approximate matrix multiplication, CUR decomposition, and Nystrom approximation. Randomized algorithms are important approaches to large scale SVD as well as fast matrix computations

    Determining spatio-temporal metrics that distinguish play outcomes in field hockey

    Get PDF
    Tactical behaviour in field sports can be examined using spatio-temporal metrics, which are descriptions of player behaviour derived from data of player positions over time. Many metrics can be computed that describe the cooperative and adversarial interactions between players. The methods typically used by sports performance analysts cannot appropriately analyse the many possible spatio-temporal metrics and their interactions. Tantalisingly, the interactions between these descriptions of player behaviour could potentially describe tactical differences in performance. This thesis describes a programme of research that determined some spatiotemporal metrics that distinguish play outcomes in field hockey. Methods inspired by genetic analysts were used to estimate the influence of combinations of spatio-temporal metrics on the outcome of field hockey plays. The novel application of the genetic methods to sports performance data raised some practical difficulties. Adjustments to the method facilitated the selection of distinguishing metric combinations from an initially large list of over 3,600 metrics. The adjustments made to the genetic methods represent one of several contributions to knowledge made by this programme of research. These contributions will help performance analysts with the increasingly common task of analysing high-dimensional data. Other contributions to knowledge are a suite of metric combinations that distinguish play outcomes in field hockey and empirical support for some tactical preconceptions. The key finding of interest for players and coaches is that play outcomes in field hockey are distinguished by proximity to the goal and passing execution. The metrics that distinguish the several outcomes differ depending on the outcomes being compared. Coaches and athletes should therefore recognise the variety of tactics required to minimise negative outcomes and maximise positive ones

    Analysing functional genomics data using novel ensemble, consensus and data fusion techniques

    Get PDF
    Motivation: A rapid technological development in the biosciences and in computer science in the last decade has enabled the analysis of high-dimensional biological datasets on standard desktop computers. However, in spite of these technical advances, common properties of the new high-throughput experimental data, like small sample sizes in relation to the number of features, high noise levels and outliers, also pose novel challenges. Ensemble and consensus machine learning techniques and data integration methods can alleviate these issues, but often provide overly complex models which lack generalization capability and interpretability. The goal of this thesis was therefore to develop new approaches to combine algorithms and large-scale biological datasets, including novel approaches to integrate analysis types from different domains (e.g. statistics, topological network analysis, machine learning and text mining), to exploit their synergies in a manner that provides compact and interpretable models for inferring new biological knowledge. Main results: The main contributions of the doctoral project are new ensemble, consensus and cross-domain bioinformatics algorithms, and new analysis pipelines combining these techniques within a general framework. This framework is designed to enable the integrative analysis of both large- scale gene and protein expression data (including the tools ArrayMining, Top-scoring pathway pairs and RNAnalyze) and general gene and protein sets (including the tools TopoGSA , EnrichNet and PathExpand), by combining algorithms for different statistical learning tasks (feature selection, classification and clustering) in a modular fashion. Ensemble and consensus analysis techniques employed within the modules are redesigned such that the compactness and interpretability of the resulting models is optimized in addition to the predictive accuracy and robustness. The framework was applied to real-word biomedical problems, with a focus on cancer biology, providing the following main results: (1) The identification of a novel tumour marker gene in collaboration with the Nottingham Queens Medical Centre, facilitating the distinction between two clinically important breast cancer subtypes (framework tool: ArrayMining) (2) The prediction of novel candidate disease genes for Alzheimer’s disease and pancreatic cancer using an integrative analysis of cellular pathway definitions and protein interaction data (framework tool: PathExpand, collaboration with the Spanish National Cancer Centre) (3) The prioritization of associations between disease-related processes and other cellular pathways using a new rule-based classification method integrating gene expression data and pathway definitions (framework tool: Top-scoring pathway pairs) (4) The discovery of topological similarities between differentially expressed genes in cancers and cellular pathway definitions mapped to a molecular interaction network (framework tool: TopoGSA, collaboration with the Spanish National Cancer Centre) In summary, the framework combines the synergies of multiple cross-domain analysis techniques within a single easy-to-use software and has provided new biological insights in a wide variety of practical settings
    corecore