15 research outputs found

    Learning the kernel with hyperkernels

    This paper addresses the problem of choosing a kernel suitable for estimation with a support vector machine, hence further automating machine learning. This goal is achieved by defining a reproducing kernel Hilbert space on the space of kernels itself. Such a formulation leads to a statistical estimation problem similar to the problem of minimizing a regularized risk functional. We state the equivalent representer theorem for the choice of kernels and present a semidefinite programming formulation of the resulting optimization problem. Several recipes for constructing hyperkernels are provided, as well as the details of common machine learning problems. Experimental results for classification, regression and novelty detection on UCI data show the feasibility of our approach

    Classification and fusion methods for multimodal biometric authentication.

    Ouyang, Hua. Thesis (M.Phil.)--Chinese University of Hong Kong, 2007. Includes bibliographical references (leaves 81-89). Abstracts in English and Chinese.
    Contents:
    Chapter 1: Introduction
      1.1 Biometric Authentication
      1.2 Multimodal Biometric Authentication
        1.2.1 Combination of Different Biometric Traits
        1.2.2 Multimodal Fusion
      1.3 Audio-Visual Bi-modal Authentication
      1.4 Focus of This Research
      1.5 Organization of This Thesis
    Chapter 2: Audio-Visual Bi-modal Authentication
      2.1 Audio-visual Authentication System
        2.1.1 Why Audio and Mouth?
        2.1.2 System Overview
      2.2 XM2VTS Database
      2.3 Visual Feature Extraction
        2.3.1 Locating the Mouth
        2.3.2 Averaged Mouth Images
        2.3.3 Averaged Optical Flow Images
      2.4 Audio Features
      2.5 Video Stream Classification
      2.6 Audio Stream Classification
      2.7 Simple Fusion
    Chapter 3: Weighted Sum Rules for Multi-modal Fusion
      3.1 Measurement-Level Fusion
      3.2 Product Rule and Sum Rule
        3.2.1 Product Rule
        3.2.2 Naive Sum Rule (NS)
        3.2.3 Linear Weighted Sum Rule (WS)
      3.3 Optimal Weights Selection for WS
        3.3.1 Independent Case
        3.3.2 Identical Case
      3.4 Confidence Measure Based Fusion Weights
    Chapter 4: Regularized k-Nearest Neighbor Classifier
      4.1 Motivations
        4.1.1 Conventional k-NN Classifier
        4.1.2 Bayesian Formulation of kNN
        4.1.3 Pitfalls and Drawbacks of kNN Classifiers
        4.1.4 Metric Learning Methods
      4.2 Regularized k-Nearest Neighbor Classifier
        4.2.1 Metric or Not Metric?
        4.2.2 Proposed Classifier: RkNN
        4.2.3 Hyperkernels and Hyper-RKHS
        4.2.4 Convex Optimization of RkNN
        4.2.5 Hyper kernel Construction
        4.2.6 Speeding up RkNN
      4.3 Experimental Evaluation
        4.3.1 Synthetic Data Sets
        4.3.2 Benchmark Data Sets
    Chapter 5: Audio-Visual Authentication Experiments
      5.1 Effectiveness of Visual Features
      5.2 Performance of Simple Sum Rule
      5.3 Performances of Individual Modalities
      5.4 Identification Tasks Using Confidence-based Weighted Sum Rule
        5.4.1 Effectiveness of WS_M_C Rule
        5.4.2 WS_M_C v.s. WS_M
      5.5 Speaker Identification Using RkNN
    Chapter 6: Conclusions and Future Work
      6.1 Conclusions
      6.2 Important Follow-up Works
    Bibliography
    Appendix A: Proof of Proposition 3.1
    Appendix B: Proof of Proposition 3.2

    Analytical Techniques for the Improvement of Mass Spectrometry Protein Profiling

    Bioinformatics is rapidly advancing through the "post-genomic" era following the sequencing of the human genome. In preparation for studying the inner workings behind genes, proteins and even smaller biological elements, several subdivisions of bioinformatics have developed. The subdivision of proteomics, concerning the structure and function of proteins, has been aided by the mass spectrometry data source. Biofluid or tissue samples are rapidly assayed for their protein composition. The resulting mass spectra are analyzed using machine learning techniques to discover reliable patterns which discriminate samples from two populations, for example, healthy or diseased, or treatment responders versus non-responders. However, this data source is imperfect and faces several challenges: unwanted variability arising from the data collection process, obtaining a robust discriminative model that generalizes well to future data, and validating a predictive pattern statistically and biologically. This thesis presents several techniques which attempt to intelligently deal with the problems facing each stage of the analytical process. First, an automatic preprocessing method selection system is demonstrated. This system learns from data and selects a combination of preprocessing methods which is most appropriate for the task at hand. This reduces the noise affecting potential predictive patterns. Our results suggest that this method can help adapt to data from different technologies, improving downstream predictive performance. Next, the issues of feature selection and predictive modeling are revisited with respect to the unique challenges posed by proteomic profile data. Approaches to model selection through kernel learning are also investigated. Key insights are obtained for designing the feature selection and predictive modeling portion of the analytical framework. Finally, methods for interpreting the results of predictive modeling are demonstrated.
These methods are used to assure the user of various desirable properties: validation of the strength of a predictive model, validation of reproducible signal across multiple data generation sessions and generalizability of predictive models to future data. A method for labeling profile features with biological identities is also presented, which aids in the interpretation of the data. Overall, these novel techniques give the protein profiling community additional support and leverage to aid the predictive capability of the technology

    Probabilistic multiple kernel learning

    The integration of multiple and possibly heterogeneous information sources for an overall decision-making process has been an open and unresolved research direction in computing science since its very beginning. This thesis attempts to address parts of that direction by proposing probabilistic data integration algorithms for multiclass decisions where an observation of interest is assigned to one of many categories based on a plurality of information channels

    Learning with Multiple Similarities

    The notion of similarities between data points is central to many classification and clustering algorithms. We often encounter situations where there is more than one set of pairwise similarity graphs between objects, either arising from different measures of similarity between objects or from a single similarity measure defined on multiple data representations, or a combination of these. Such examples can be found in various applications in computer vision, natural language processing and computational biology. Combining information from these multiple sources is often beneficial in learning meaningful concepts from data. This dissertation proposes novel methods to effectively fuse information from these multiple similarity graphs, targeted towards two fundamental tasks in machine learning: classification and clustering. In particular, I propose two models for learning spectral embedding from multiple similarity graphs using ideas from co-training and co-regularization. Further, I propose a novel approach to the problem of multiple kernel learning (MKL), converting it to a more familiar problem of binary classification in a transformed space. The proposed MKL approach learns a "good" linear combination of base kernels by optimizing a quality criterion that is justified both empirically and theoretically. The ideas of the proposed MKL method are also extended to learning nonlinear combinations of kernels, in particular, polynomial kernel combination and more general nonlinear kernel combination using random forests

    Isometry and convexity in dimensionality reduction

    The size of data generated every year follows an exponential growth. The number of data points as well as the dimensions have increased dramatically over the past 15 years. The gap between the demand from the industry in data processing and the solutions provided by the machine learning community is increasing. Despite the growth in memory and computational power, advanced statistical processing on the order of gigabytes is beyond any possibility. Most sophisticated Machine Learning algorithms require at least quadratic complexity. With the current computer model architecture, algorithms with higher complexity than linear O(N) or O(N log N) are not considered practical. Dimensionality reduction is a challenging problem in machine learning. Often data represented as multidimensional points happen to have high dimensionality. It turns out that the information they carry can be expressed with far fewer dimensions. Moreover, the reduced dimensions of the data can have better interpretability than the original ones. There is a great variety of dimensionality reduction algorithms under the theory of Manifold Learning. Most of the methods such as Isomap, Local Linear Embedding, Local Tangent Space Alignment, Diffusion Maps etc. have been extensively studied under the framework of Kernel Principal Component Analysis (KPCA). In this dissertation we study two current state of the art dimensionality reduction methods, Maximum Variance Unfolding (MVU) and Non-Negative Matrix Factorization (NMF). These two dimensionality reduction methods do not fit under the umbrella of Kernel PCA. MVU is cast as a Semidefinite Program, a modern convex nonlinear optimization algorithm, that offers more flexibility and power compared to KPCA. Although MVU and NMF seem to be two disconnected problems, we show that there is a connection between them. Both are special cases of a general nonlinear factorization algorithm that we developed.
Two aspects of the algorithms are of particular interest: computational complexity and interpretability. In other words, computational complexity answers the question of how fast we can find the best solution of MVU/NMF for large data volumes. Since we are dealing with optimization programs, we need to find the global optimum. Global optimum is strongly connected with the convexity of the problem. Interpretability is strongly connected with local isometry that gives meaning in relationships between data points. Another aspect of interpretability is association of data with labeled information. The contributions of this thesis are the following:
    1. MVU is modified so that it can scale more efficiently. Results are shown on 1 million speech datasets. Limitations of the method are highlighted.
    2. An algorithm for fast computations for the furthest neighbors is presented for the first time in the literature.
    3. Construction of optimal kernels for Kernel Density Estimation with modern convex programming is presented. For the first time we show that the Leave-One-Out Cross Validation (LOOCV) function is quasi-concave.
    4. For the first time NMF is formulated as a convex optimization problem.
    5. An algorithm for the problem of Completely Positive Matrix Factorization is presented.
    6. A hybrid algorithm of MVU and NMF, the isoNMF, is presented, combining advantages of both methods.
    7. The Isometric Separation Maps (ISM), a variation of MVU that contains classification information, is presented.
    8. Large scale nonlinear dimensional analysis on the TIMIT speech database is performed.
    9. A general nonlinear factorization algorithm is presented based on sequential convex programming.
Despite the efforts to scale the proposed methods up to 1 million data points in reasonable time, the gap between the industrial demand and the current state of the art is still orders of magnitude wide.
    Ph.D. Committee Chair: David Anderson; Committee Co-Chair: Alexander Gray; Committee Member: Anthony Yezzi; Committee Member: Hongyuan Zha; Committee Member: Justin Romberg; Committee Member: Ronald Schafe

    Hyper-parameter learning for graph based semi-supervised learning algorithms

    Master's. MASTER OF SCIENCE

    NASA Tech Briefs, November 1997

    Topics covered include: Test and Measurement; Electronic Components and Circuits; Electronic Systems; Physical Sciences; Materials; Computer Software; Mechanics; Machinery/Automation; Books and Reports.

    NASA Tech Briefs, December 1997

    Topics: Design and Analysis Software; Electronic Components and Circuits; Electronic Systems; Physical Sciences; Materials; Computer Software; Mechanics; Manufacturing/Fabrication; Mathematics and Information Sciences; Books and Reports

    Tools and Algorithms for the Construction and Analysis of Systems

    This open access two-volume set constitutes the proceedings of the 27th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS 2021, which was held during March 27 – April 1, 2021, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2021. The conference was planned to take place in Luxembourg and changed to an online format due to the COVID-19 pandemic. The total of 41 full papers presented in the proceedings was carefully reviewed and selected from 141 submissions. The volume also contains 7 tool papers; 6 Tool Demo papers, 9 SV-Comp Competition Papers. The papers are organized in topical sections as follows: Part I: Game Theory; SMT Verification; Probabilities; Timed Systems; Neural Networks; Analysis of Network Communication. Part II: Verification Techniques (not SMT); Case Studies; Proof Generation/Validation; Tool Papers; Tool Demo Papers; SV-Comp Tool Competition Papers