251 research outputs found

    Some fundamental issues in receiver design and performance analysis for wireless communication

    Get PDF
    Ph.DDOCTOR OF PHILOSOPH

    Discriminative, generative, and imitative learning

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2002.Includes bibliographical references (leaves 201-212).I propose a common framework that combines three different paradigms in machine learning: generative, discriminative and imitative learning. A generative probabilistic distribution is a principled way to model many machine learning and machine perception problems. Therein, one provides domain specific knowledge in terms of structure and parameter priors over the joint space of variables. Bayesian networks and Bayesian statistics provide a rich and flexible language for specifying this knowledge and subsequently refining it with data and observations. The final result is a distribution that is a good generator of novel exemplars. Conversely, discriminative algorithms adjust a possibly non-distributional model to data optimizing for a specific task, such as classification or prediction. This typically leads to superior performance yet compromises the flexibility of generative modeling. I present Maximum Entropy Discrimination (MED) as a framework to combine both discriminative estimation and generative probability densities. Calculations involve distributions over parameters, margins, and priors and are provably and uniquely solvable for the exponential family. Extensions include regression, feature selection, and transduction. SVMs are also naturally subsumed and can be augmented with, for example, feature selection, to obtain substantial improvements. To extend to mixtures of exponential families, I derive a discriminative variant of the Expectation-Maximization (EM) algorithm for latent discriminative learning (or latent MED).(cont.) While EM and Jensen lower bound log-likelihood, a dual upper bound is made possible via a novel reverse-Jensen inequality. The variational upper bound on latent log-likelihood has the same form as EM bounds, is computable efficiently and is globally guaranteed. It permits powerful discriminative learning with the wide range of contemporary probabilistic mixture models (mixtures of Gaussians, mixtures of multinomials and hidden Markov models). We provide empirical results on standardized data sets that demonstrate the viability of the hybrid discriminative-generative approaches of MED and reverse-Jensen bounds over state of the art discriminative techniques or generative approaches. Subsequently, imitative learning is presented as another variation on generative modeling which also learns from exemplars from an observed data source. However, the distinction is that the generative model is an agent that is interacting in a much more complex surrounding external world. It is not efficient to model the aggregate space in a generative setting. I demonstrate that imitative learning (under appropriate conditions) can be adequately addressed as a discriminative prediction task which outperforms the usual generative approach. This discriminative-imitative learning approach is applied with a generative perceptual system to synthesize a real-time agent that learns to engage in social interactive behavior.by Tony Jebara.Ph.D

    Envelope detection of orthogonal signals with phase noise

    Get PDF
    Cover title.Includes bibliographical references (p. 33-34).Research supported by the NSF. NSF/8802991-NCR Research supported by DARPA. F19628-90-C-0002Murat Azizo\1E21lu and Pierre A. Humblet

    Convergence of Stochastic Processes

    Get PDF
    Often the best way to adumbrate a dark and dense assemblage of material is to describe the background in contrast to which the edges of the nebulosity may be clearly discerned. Hence, perhaps the most appropriate way to introduce this paper is to describe what it is not. It is not a comprehensive study of stochastic processes, nor an in-depth treatment of convergence. In fact, on the surface, the material covered in this paper is nothing more than a compendium of seemingly loosely-connected and barely-miscible theorems, methods and conclusions from the three main papers surveyed ([VC71], [Pol89] and [DL91])

    Bayesian Gaussian Process Models: PAC-Bayesian Generalisation Error Bounds and Sparse Approximations

    Get PDF
    Non-parametric models and techniques enjoy a growing popularity in the field of machine learning, and among these Bayesian inference for Gaussian process (GP) models has recently received significant attention. We feel that GP priors should be part of the standard toolbox for constructing models relevant to machine learning in the same way as parametric linear models are, and the results in this thesis help to remove some obstacles on the way towards this goal. In the first main chapter, we provide a distribution-free finite sample bound on the difference between generalisation and empirical (training) error for GP classification methods. While the general theorem (the PAC-Bayesian bound) is not new, we give a much simplified and somewhat generalised derivation and point out the underlying core technique (convex duality) explicitly. Furthermore, the application to GP models is novel (to our knowledge). A central feature of this bound is that its quality depends crucially on task knowledge being encoded faithfully in the model and prior distributions, so there is a mutual benefit between a sharp theoretical guarantee and empirically well-established statistical practices. Extensive simulations on real-world classification tasks indicate an impressive tightness of the bound, in spite of the fact that many previous bounds for related kernel machines fail to give non-trivial guarantees in this practically relevant regime. In the second main chapter, sparse approximations are developed to address the problem of the unfavourable scaling of most GP techniques with large training sets. Due to its high importance in practice, this problem has received a lot of attention recently. We demonstrate the tractability and usefulness of simple greedy forward selection with information-theoretic criteria previously used in active learning (or sequential design) and develop generic schemes for automatic model selection with many (hyper)parameters. We suggest two new generic schemes and evaluate some of their variants on large real-world classification and regression tasks. These schemes and their underlying principles (which are clearly stated and analysed) can be applied to obtain sparse approximations for a wide regime of GP models far beyond the special cases we studied here

    Fast Learning by Bounding Likelihoods in Sigmoid Type Belief Networks

    Get PDF
    Sigmoid type belief networks, a class of probabilistic neural networks, provide a natural framework for compactly representing probabilistic information in a variety of unsupervised and supervised learning problems. Often the parameters used in these networks need to be learned from examples. Unfortunately, estimating the parameters via exact probabilistic calculations (i.e, the EM-algorithm) is intractable even for networks with fairly small numbers of hidden units. We propose to avoid the infeasibility of the E step by bounding likelihoods instead of computing them exactly. We introduce extended and complementary representations for these networks and show that the estimation of the network parameters can be made fast (reduced to quadratic optimization) by performing the estimation in either of the alternative domains. The complementary networks can be used for continuous density estimation as well

    Performance of on-off modulated lightwave signals with phase

    Get PDF
    Caption title.Includes bibliographical references (leaves [8]-[9]).Research supported by the National Science Foundation. NSF/8802991-NCR Research supported by the U.S. Army Research Office. DAAL03-86-K-0171Murat AzizogÌlu, Pierre A. Humblet

    Quantum Lattice Enumeration in Limited Depth

    Get PDF
    In 2018, Aono et al. (ASIACRYPT 2018) proposed to use quantum backtracking algorithms (Montanaro, TOC 2018; Ambainis and Kokainis, STOC 2017) to speedup lattice point enumeration. Quantum lattice sieving algorithms had already been proposed (Laarhoven et al., PQCRYPTO 2013), being shown to provide an asymptotic speedup over classical counterparts, but also to lose competitivity at relevant dimensions to cryptography if practical considerations on quantum computer architecture were taken into account (Albrecht et al., ASIACRYPT 2020). Aono et al.’s work argued that quantum walk speedups can be applied to lattice enumeration, achieving at least a quadratic asymptotic speedup à la Grover search while not requiring exponential amounts of quantum accessible classical memory, as it is the case for sieving. In this work, we explore how to lower bound the cost of using Aono et al.’s techniques on lattice enumeration with extreme cylinder pruning assuming a limit to the maximum depth that a quantum computation can achieve without decohering, with the objective of better understanding the practical applicability of quantum backtracking in lattice cryptanalysis
    corecore