5 research outputs found

    Private Distribution Learning with Public Data: The View from Sample Compression

    We study the problem of private distribution learning with access to public data. In this setup, which we refer to as public-private learning, the learner is given public and private samples drawn from an unknown distribution $p$ belonging to a class $\mathcal Q$, with the goal of outputting an estimate of $p$ while adhering to privacy constraints (here, pure differential privacy) only with respect to the private samples. We show that the public-private learnability of a class $\mathcal Q$ is connected to the existence of a sample compression scheme for $\mathcal Q$, as well as to an intermediate notion we refer to as list learning. Leveraging this connection: (1) it approximately recovers previous results on Gaussians over $\mathbb R^d$; and (2) it leads to new ones, including sample complexity upper bounds for arbitrary $k$-mixtures of Gaussians over $\mathbb R^d$, results for agnostic and distribution-shift-resistant learners, as well as closure properties for public-private learnability under taking mixtures and products of distributions. Finally, via the connection to list learning, we show that for Gaussians in $\mathbb R^d$, at least $d$ public samples are necessary for private learnability, which is close to the known upper bound of $d+1$ public samples. Comment: 31 pages.
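
    For reference, "pure differential privacy" above is the standard notion of $\varepsilon$-differential privacy; stated here for context (our phrasing, not quoted from the paper): a randomized algorithm $A$ is $\varepsilon$-differentially private if for every pair of sample sets $S, S'$ differing in a single private sample and every set of outputs $T$, $\Pr[A(S) \in T] \le e^{\varepsilon} \, \Pr[A(S') \in T]$. In the public-private setting, this constraint is imposed only with respect to changes in the private samples; the public samples are unconstrained.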

    On Expressiveness, Inference, and Parameter Estimation of Discrete Sequence Models

    Huge neural autoregressive sequence models have achieved impressive performance across different applications, such as NLP, reinforcement learning, and bioinformatics. However, some lingering problems (e.g., consistency and coherency of generated texts) continue to exist, regardless of the parameter count. In the first part of this thesis, we chart a taxonomy of the expressiveness of various sequence model families (Ch 3). In particular, we put forth complexity-theoretic proofs that string latent-variable sequence models are strictly more expressive than energy-based sequence models, which in turn are more expressive than autoregressive sequence models. Based on these findings, we introduce residual energy-based sequence models, a family of energy-based sequence models (Ch 4) whose sequence weights can be evaluated efficiently and which perform competitively against autoregressive models. However, we show how unrestricted energy-based sequence models can suffer from uncomputability, and how such a problem is generally unfixable without knowledge of the true sequence distribution (Ch 5). In the second part of the thesis, we study practical sequence model families and algorithms based on the theoretical findings in the first part. We introduce neural particle smoothing (Ch 6), a family of approximate sampling methods that work with conditional latent-variable models. We also introduce neural finite-state transducers (Ch 7), which extend weighted finite-state transducers with mark strings, allowing transduction paths in a finite-state transducer to be scored with a neural network. Finally, we propose neural regular expressions (Ch 8), a family of neural sequence models that are easy to engineer, allowing a user to design flexible weighted relations using Marked FSTs and to combine these weighted relations with various operations.
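
    As a rough, hedged illustration of the residual energy-based idea described above (a toy sketch, not the thesis's models or code; the vocabulary, bigram table, and energy function below are invented stand-ins), a sequence can be given an unnormalized score that combines a base autoregressive log-probability with a residual energy term:

import math

# Toy vocabulary and a hand-specified bigram "autoregressive" base model.
BIGRAM_LOGP = {  # log P(next | prev); toy numbers, not learned
    ("<s>", "a"): math.log(0.6), ("<s>", "b"): math.log(0.4),
    ("a", "a"): math.log(0.2), ("a", "b"): math.log(0.5), ("a", "</s>"): math.log(0.3),
    ("b", "a"): math.log(0.4), ("b", "b"): math.log(0.1), ("b", "</s>"): math.log(0.5),
}

def autoregressive_logp(tokens):
    """Sum of per-step conditional log-probabilities under the base model."""
    return sum(BIGRAM_LOGP[(prev, nxt)] for prev, nxt in zip(tokens, tokens[1:]))

def residual_energy(tokens):
    """Stand-in for a learned global energy; here it simply penalizes immediate repeats."""
    return 0.7 * sum(1 for prev, nxt in zip(tokens, tokens[1:]) if prev == nxt)

def residual_ebm_score(tokens):
    # Unnormalized log-score: base autoregressive log-prob minus residual energy.
    return autoregressive_logp(tokens) - residual_energy(tokens)

if __name__ == "__main__":
    print(residual_ebm_score(["<s>", "a", "b", "</s>"]))

    The score itself is cheap to evaluate per sequence; what is expensive in general is normalizing it over all sequences, which is why sampling and exact inference in unrestricted energy-based models are hard.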

    Understanding Semantic Implicit Learning through distributional linguistic patterns: A computational perspective

    The research presented in this PhD dissertation provides a computational perspective on Semantic Implicit Learning (SIL). It puts forward the idea that SIL does not depend on semantic knowledge as classically conceived, but upon semantic-like knowledge gained through distributional analysis of massive linguistic input. Using methods borrowed from the machine learning and artificial intelligence literature, we construct computational models which can simulate the performance observed during behavioural tasks of semantic implicit learning in a human-like way. We link this methodology to the current literature on implicit learning, arguing that this behaviour is a necessary by-product of efficient language processing. Chapter 1 introduces the computational problem posed by implicit learning in general, and semantic implicit learning in particular, as well as the computational framework used to tackle them. Chapter 2 introduces distributional semantics models as a way to learn semantic-like representations from exposure to linguistic input. Chapter 3 reports two studies on large datasets of semantic priming which seek to identify the computational model of semantic knowledge that best fits the data under conditions that resemble SIL tasks. We find that a model which acquires semantic-like knowledge through distributional analysis of massive linguistic input provides the best fit to the data. Chapter 4 generalises the results of the previous two studies by looking at the performance of the same models in languages other than English. Chapter 5 applies the results of the two previous chapters to eight datasets of semantic implicit learning. Crucially, these datasets use various semantic manipulations and speakers of different L1s, enabling us to test the predictions of different models of semantics. Chapter 6 examines more closely two assumptions which we have taken for granted throughout this thesis. Firstly, we test whether a simpler model based on phonological information can explain the generalisation patterns observed in the tasks. Secondly, we examine whether our definition of the computational problem in Chapter 5 is reasonable. Chapter 7 summarises and discusses the implications for implicit language learning and computational models of cognition. Furthermore, we offer one more study that seeks to bridge the literature on distributional models of semantics to 'deeper' models of semantics by learning semantic relations. There are two main contributions of this dissertation to the general field of implicit learning research. Firstly, we highlight the superiority of distributional models of semantics in modelling unconscious semantic knowledge. Secondly, we question whether 'deep' semantic knowledge is needed to achieve above-chance performance in SIL tasks. We show how a simple model that learns through distributional analysis of the patterns found in the linguistic input can match the behavioural results in different languages. Furthermore, we link these models to more general problems faced in psycholinguistics, such as language processing and the learning of semantic relations. Funder: Alexandros Onassis Foundation.
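
    To make "semantic-like knowledge gained through distributional analysis" concrete, here is a minimal, hedged sketch (the toy corpus, window size, and word pairs are invented for illustration and are not the dissertation's models or data): word vectors are built from co-occurrence counts, and the relatedness of a pair of words, such as a prime and a target, is read off as cosine similarity.

from collections import Counter
from math import sqrt

# Toy corpus; a real distributional model would use massive linguistic input.
corpus = [
    "the doctor treated the patient in the hospital".split(),
    "the nurse helped the doctor at the hospital".split(),
    "the teacher taught the student at the school".split(),
]
WINDOW = 2  # symmetric co-occurrence window (assumed setting)

def cooccurrence_vectors(sentences, window):
    """Count how often each word co-occurs with context words within the window."""
    vectors = {}
    for sent in sentences:
        for i, word in enumerate(sent):
            ctx = vectors.setdefault(word, Counter())
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    ctx[sent[j]] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v.get(k, 0) for k in u)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

vecs = cooccurrence_vectors(corpus, WINDOW)
# Words sharing contexts (here "doctor" and "nurse") tend to score higher than
# less related pairs; this graded relatedness is what such models use to predict priming.
print(cosine(vecs["doctor"], vecs["nurse"]))
print(cosine(vecs["doctor"], vecs["school"]))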

    Machine learning applications for noisy intermediate-scale quantum computers

    Quantum machine learning (QML) has proven to be a fruitful area in which to search for applications of quantum computers. This is particularly true for those available in the near term, so-called noisy intermediate-scale quantum (NISQ) devices. In this thesis, we develop and study QML algorithms in three application areas. We focus our attention on heuristic algorithms of a variational (meaning hybrid quantum-classical) nature, using parameterised quantum circuits as the underlying quantum machine learning model. The variational nature of these models makes them especially suited to NISQ computers. We order these applications in terms of the increasing complexity of the data presented to them. Firstly, we study a variational quantum classifier in supervised machine learning, and focus on how (classical) data, in the form of feature vectors, may be encoded in such models in a way that is robust to the inherent noise on NISQ computers. We provide a framework for studying the robustness of these classification models, prove theoretical results for some common noise channels, and present extensive numerical results reinforcing these findings. Secondly, we move to a variational generative model called the Born machine, where the data becomes a (classical or quantum) probability distribution. Now the problem falls into the category of unsupervised machine learning. Here, we develop new training methods for the Born machine which outperform the previous state of the art, discuss the possibility of quantum advantage in generative modelling, and perform a systematic comparison of the Born machine against a classical competitor, the restricted Boltzmann machine. We also demonstrate the largest-scale implementation (28 qubits) of such a model on real quantum hardware to date, using the Rigetti superconducting platform. Finally, for our third QML application, the data becomes purely quantum in nature. We focus on the problem of approximately cloning quantum states, an important primitive in the foundations of quantum mechanics. For this, we develop a variational quantum algorithm which can learn to clone such states, and show how this algorithm can be used to improve quantum cloning fidelities on NISQ hardware. Interestingly, this application can be viewed as either supervised or unsupervised in nature. Furthermore, we demonstrate how this algorithm can be used to discover novel implementable attacks on quantum cryptographic protocols, focusing on quantum coin flipping and key distribution as examples. For the algorithm, we derive differentiable cost functions, prove theoretical guarantees such as faithfulness, and incorporate state-of-the-art methods such as quantum architecture search.
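
    As a minimal, hedged illustration of the kind of parameterised-quantum-circuit model referred to above (a hand-rolled single-qubit state-vector toy; the angle encoding, single trainable rotation, and parameter value are invented for illustration and are not the thesis's actual classifier), a feature x is encoded as a rotation, a trainable rotation follows, and the class score is the expectation value of Z on the resulting state:

import numpy as np

def ry(theta):
    """Single-qubit rotation about the Y axis (2x2 unitary)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

Z = np.array([[1.0, 0.0], [0.0, -1.0]])

def classifier_expectation(x, theta):
    """Toy variational classifier on one qubit.

    The feature x is angle-encoded by RY(x), a trainable layer RY(theta) follows,
    and the prediction is the expectation value <Z> of the final state.
    """
    state = np.array([1.0, 0.0])            # |0>
    state = ry(theta) @ (ry(x) @ state)      # encoding, then trainable rotation
    return float(state.conj() @ Z @ state)   # <psi| Z |psi>

# Decision rule: the sign of <Z> separates the two classes.
theta = 0.3  # an assumed, already-trained parameter value
for x in (0.1, 2.8):
    print(x, classifier_expectation(x, theta))

    On real NISQ hardware the expectation value would be estimated from repeated measurements and distorted by device noise, which is the robustness question the encoding framework above addresses.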

    Proceedings of the tenth international conference Models in developing mathematics education: September 11 - 17, 2009, Dresden, Saxony, Germany

    This volume contains the papers presented at the International Conference on “Models in Developing Mathematics Education”, held from September 11-17, 2009 at The University of Applied Sciences, Dresden, Germany. The Conference was organized jointly by The University of Applied Sciences and The Mathematics Education into the 21st Century Project, a non-commercial international educational project founded in 1986. The Mathematics Education into the 21st Century Project is dedicated to the improvement of mathematics education world-wide through the publication and dissemination of innovative ideas. Many prominent mathematics educators have supported and contributed to the project, including the late Hans Freudenthal, Andrejs Dunkels and Hilary Shuard, as well as Bruce Meserve and Marilyn Suydam, Alan Osborne and Margaret Kasten, Mogens Niss, Tibor Nemetz, Ubi D’Ambrosio, Brian Wilson, Tatsuro Miwa, Henry Pollack, Werner Blum, Roberto Baldino, Waclaw Zawadowski, and many others throughout the world. Information on our project and its future work can be found on our Project Home Page (http://math.unipa.it/~grim/21project.htm). It has been our pleasure to edit all of the papers for these Proceedings. Not all papers are about research in mathematics education; a number of them report on innovative experiences in the classroom and on new technology. We believe that “mathematics education” is fundamentally a “practicum”, and in order to be “successful” all new materials, new ideas and new research must be tested and implemented in the classroom, the real “chalk face” of our discipline and of our profession as mathematics educators. These Proceedings begin with a Plenary Paper and then the contributions of the Principal Authors in alphabetical name order. We sincerely thank all of the contributors for their time and creative effort. It is clear from the variety and quality of the papers that the conference has attracted many innovative mathematics educators from around the world. These Proceedings will therefore be useful in reviewing past work and looking ahead to the future.