14 research outputs found
Recommended from our members
Understanding Semantic Implicit Learning through distributional linguistic patterns: A computational perspective
The research presented in this PhD dissertation provides a computational perspective on Semantic Implicit Learning (SIL). It puts forward the idea that SIL does not depend on semantic knowledge as classically conceived but upon semantic-like knowledge gained through distributional analysis of massive linguistic input. Using methods borrowed from the machine learning and artificial intelligence literature, we construct computational models, which can simulate the performance observed during behavioural tasks of semantic implicit learning in a human-like way. We link this methodology to the current literature on implicit learning, arguing that this behaviour is a necessary by-product of efficient language processing.
Chapter 1 introduces the computational problem posed by implicit learning in general, and semantic implicit learning, in particular, as well as the computational framework, used to tackle them.
Chapter 2 introduces distributional semantics models as a way to learn semantic-like representations from exposure to linguistic input.
Chapter 3 reports two studies on large datasets of semantic priming which seek to identify the computational model of semantic knowledge that best fits the data under conditions that resemble SIL tasks. We find that a model which acquires semantic-like knowledge gained through distributional analysis of massive linguistic input provides the best fit to the data.
Chapter 4 generalises the results of the previous two studies by looking at the performance of the same models in languages other than English.
Chapter 5 applies the results of the two previous Chapters on eight datasets of semantic implicit learning. Crucially, these datasets use various semantic manipulations and speakers of different L1s enabling us to test the predictions of different models of semantics.
Chapter 6 examines more closely two assumptions which we have taken for granted throughout this thesis. Firstly, we test whether a simpler model based on phonological information can explain the generalisation patterns observed in the tasks. Secondly, we examine whether our definition of the computational problem in Chapter 5 is reasonable.
Chapter 7 summarises and discusses the implications for implicit language learning and computational models of cognition. Furthermore, we offer one more study that seeks to bridge the literature on distributional models of semantics to `deeper' models of semantics by learning semantic relations.
There are two main contributions of this dissertation to the general field of implicit learning research. Firstly, we highlight the superiority of distributional models of semantics in modelling unconscious semantic knowledge. Secondly, we question whether `deep' semantic knowledge is needed to achieve above chance performance in SIIL tasks. We show how a simple model that learns through distributional analysis of the patterns found in the linguistic input can match the behavioural results in different languages. Furthermore, we link these models to more general problems faced in psycholinguistics such as language processing and learning of semantic relations.Alexandros Onassis Foundatio
Quantum Mathematics in Artificial Intelligence
In the decade since 2010, successes in artificial intelligence have been at
the forefront of computer science and technology, and vector space models have
solidified a position at the forefront of artificial intelligence. At the same
time, quantum computers have become much more powerful, and announcements of
major advances are frequently in the news.
The mathematical techniques underlying both these areas have more in common
than is sometimes realized. Vector spaces took a position at the axiomatic
heart of quantum mechanics in the 1930s, and this adoption was a key motivation
for the derivation of logic and probability from the linear geometry of vector
spaces. Quantum interactions between particles are modelled using the tensor
product, which is also used to express objects and operations in artificial
neural networks.
This paper describes some of these common mathematical areas, including
examples of how they are used in artificial intelligence (AI), particularly in
automated reasoning and natural language processing (NLP). Techniques discussed
include vector spaces, scalar products, subspaces and implication, orthogonal
projection and negation, dual vectors, density matrices, positive operators,
and tensor products. Application areas include information retrieval,
categorization and implication, modelling word-senses and disambiguation,
inference in knowledge bases, and semantic composition.
Some of these approaches can potentially be implemented on quantum hardware.
Many of the practical steps in this implementation are in early stages, and
some are already realized. Explaining some of the common mathematical tools can
help researchers in both AI and quantum computing further exploit these
overlaps, recognizing and exploring new directions along the way.Comment: Adding journal reference, recommended by JAIR editors upon
publicatio
A Mathematical Framework for Causally Structured Dilations and its Relation to Quantum Self-Testing
The motivation for this thesis was to recast quantum self-testing [MY98,MY04]
in operational terms. The result is a category-theoretic framework for
discussing the following general question: How do different implementations of
the same input-output process compare to each other? In the proposed framework,
an input-output process is modelled by a causally structured channel in some
fixed theory, and its implementations are modelled by causally structured
dilations formalising hidden side-computations. These dilations compare through
a pre-order formalising relative strength of side-computations. Chapter 1
reviews a mathematical model for physical theories as semicartesian symmetric
monoidal categories. Many concrete examples are discussed, in particular
quantum and classical information theory. The key feature is that the model
facilitates the notion of dilations. Chapter 2 is devoted to the study of
dilations. It introduces a handful of simple yet potent axioms about dilations,
one of which (resembling the Purification Postulate [CDP10]) entails a duality
theorem encompassing a large number of classic no-go results for quantum
theory. Chapter 3 considers metric structure on physical theories, introducing
in particular a new metric for quantum channels, the purified diamond distance,
which generalises the purified distance [TCR10,Tom12] and relates to the Bures
distance [KSW08a]. Chapter 4 presents a category-theoretic formalism for
causality in terms of '(constructible) causal channels' and 'contractions'. It
simplifies aspects of the formalisms [CDP09,KU17] and relates to traces in
monoidal categories [JSV96]. The formalism allows for the definition of 'causal
dilations' and the establishment of a non-trivial theory of such dilations.
Chapter 5 realises quantum self-testing from the perspective of chapter 4, thus
pointing towards the first known operational foundation for self-testing.Comment: PhD thesis submitted to the University of Copenhagen (ISBN
978-87-7125-039-8). Advised by prof. Matthias Christandl, submitted 1st of
December 2020, defended 11th of February 2021. Keywords: dilations, applied
category theory, quantum foundations, causal structure, quantum self-testing.
242 pages, 1 figure. Comments are welcom
A Framework for Non-Asymptotic Quantum Information Theory
This thesis consolidates, improves and extends the smooth entropy framework
for non-asymptotic information theory and cryptography.
We investigate the conditional min- and max-entropy for quantum states,
generalizations of classical R\'enyi entropies. We introduce the purified
distance, a novel metric for unnormalized quantum states and use it to define
smooth entropies as optimizations of the min- and max-entropies over a ball of
close states. We explore various properties of these entropies, including
data-processing inequalities, chain rules and their classical limits. The most
important property is an entropic formulation of the asymptotic equipartition
property, which implies that the smooth entropies converge to the von Neumann
entropy in the limit of many independent copies. The smooth entropies also
satisfy duality and entropic uncertainty relations that provide limits on the
power of two different observers to predict the outcome of a measurement on a
quantum system.
Finally, we discuss three example applications of the smooth entropy
framework. We show a strong converse statement for source coding with quantum
side information, characterize randomness extraction against quantum side
information and prove information theoretic security of quantum key
distribution using an intuitive argument based on the entropic uncertainty
relation.Comment: PhD thesis, Department of Physics, ETH Zuric
Applications of Approximate Learning and Inference for Probabilistic Models
We develop approximate inference and learning methods for facilitating the use of probabilistic modeling techniques motivated by applications in two different areas. First, we consider the ill-posed inverse problem of recovering an image from an underdetermined system of linear measurements corrupted by noise. Second, we consider the problem of inferring user preferences for items from counts, pairwise comparisons and user activity logs, instances of implicit feedback. Plausible models for images and the noise, incurred when recording them, render posterior inference intractable, while the scale of the inference problem makes sampling based approximations ineffective. Therefore, we develop deterministic approximate inference algorithms for two different augmentations of a typical sparse linear model: first, for the rectified-linear Poisson likelihood, and second, for tree-structured super-Gaussian mixture models. The rectified-linear Poisson likelihood is an alternative noise model, applicable in astronomical and biomedical imaging applications, that operate in intensity regimes in which quantum effects lead to observations that are best described by counts of particles arriving at a sensor, as well as in general Poisson regression problems arising in various fields. In this context we show, that the model-specific computations for Expectation Propagation can be robustly solved by a simple dynamic program. Next, we develop a scalable approximate inference algorithm for structured mixture models, that uses a discrete graphical model to represent dependencies between the latent mixture components of a collection of mixture models. Specifically, we use tree-structured mixtures of super-Gaussians to model the persistence across scales of large coefficients of the Wavelet transform of an image for improved reconstruction. In the second part on models of user preference, we consider two settings: the global static and the contextual dynamic setting. In the global static setting, we represent user-item preferences by a latent low-rank matrix. Instead of using numeric ratings we develop methods to infer this latent representation for two types of implicit feedback: aggregate counts of users interacting with a service and the binary outcomes of pairwise comparisons. We model count data using a latent Gaussian bilinear model with Poisson likelihoods. For this model, we show that the Variational Gaussian approximation can be further relaxed to be available in closed-form by adding additional constraints, leading to an efficient inference algorithm. In the second implicit feedback scenario, we infer the latent preference matrix from pairwise preference statements. We combine a low-rank bilinear model with non-parameteric item- feature regression and develop a novel approximate variational Expectation Maximization algorithm that mitigates the computational challenges due to latent couplings induced by the pairwise comparisons. Finally, in the contextual dynamic setting, we model sequences of user activity at the granularity of single interaction events instead of aggregate counts. Routinely gathered in the background at a large scale in many applications, such sequences can reveal temporal and contextual aspects of user behavior through recurrent patterns. To describe such data, we propose a generic collaborative sequence model based on recurrent neural networks, that combines ideas from collaborative filtering and language modeling
Forecasting: theory and practice
Forecasting has always been at the forefront of decision making and planning. The uncertainty that surrounds the future is both exciting and challenging, with individuals and organisations seeking to minimise risks and maximise utilities. The large number of forecasting applications calls for a diverse set of forecasting methods to tackle real-life challenges. This article provides a non-systematic review of the theory and the practice of forecasting. We provide an overview of a wide range of theoretical, state-of-the-art models, methods, principles, and approaches to prepare, produce, organise, and evaluate forecasts. We then demonstrate how such theoretical concepts are applied in a variety of real-life contexts.
We do not claim that this review is an exhaustive list of methods and applications. However, we wish that our encyclopedic presentation will offer a point of reference for the rich work that has been undertaken over the last decades, with some key insights for the future of forecasting theory and practice. Given its encyclopedic nature, the intended mode of reading is non-linear. We offer cross-references to allow the readers to navigate through the various topics. We complement the theoretical concepts and applications covered by large lists of free or open-source software implementations and publicly-available databases.info:eu-repo/semantics/publishedVersio
Forecasting: theory and practice
Forecasting has always been at the forefront of decision making and planning. The uncertainty that surrounds the future is both exciting and challenging, with individuals and organisations seeking to minimise risks and maximise utilities. The large number of forecasting applications calls for a diverse set of forecasting methods to tackle real-life challenges. This article provides a non-systematic review of the theory and the practice of forecasting. We provide an overview of a wide range of theoretical, state-of-the-art models, methods, principles, and approaches to prepare, produce, organise, and evaluate forecasts. We then demonstrate how such theoretical concepts are applied in a variety of real-life contexts. We do not claim that this review is an exhaustive list of methods and applications. However, we wish that our encyclopedic presentation will offer a point of reference for the rich work that has been undertaken over the last decades, with some key insights for the future of forecasting theory and practice. Given its encyclopedic nature, the intended mode of reading is non-linear. We offer cross-references to allow the readers to navigate through the various topics. We complement the theoretical concepts and applications covered by large lists of free or open-source software implementations and publicly-available databases
Celebrated Econometricians: Katarina Juselius and Søren Johansen
This Special Issue collects contributions related to advances in the theory and practice of Econometrics induced by the research of Katarina Juselius and Søren Johansen, whom this Special Issue aims to celebrate. The papers in this Special Issue provide advances on several topics, and they are grouped in the following areas, with three to four papers per group). The first group provides a historical perspective on Katarina’s and Søren’s contributions to Econometrics. The second group of papers concentrates on representation theory, while the third focuses on estimation and inference. The fourth group explores extensions of CVARs for modelling and forecasting, and the fifth and final group is centered on empirical applications
Motif discovery in sequential data
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Chemical Engineering, 2006.This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.Includes bibliographical references (v. 2, leaves [435]-467).In this thesis, I discuss the application and development of methods for the automated discovery of motifs in sequential data. These data include DNA sequences, protein sequences, and real-valued sequential data such as protein structures and timeseries of arbitrary dimension. As more genomes are sequenced and annotated, the need for automated, computational methods for analyzing biological data is increasing rapidly. In broad terms, the goal of this thesis is to treat sequential data sets as unknown languages and to develop tools for interpreting an understanding these languages. The first chapter of this thesis is an introduction to the fundamentals of motif discovery, which establishes a common mode of thought and vocabulary for the subsequent chapters. One of the central themes of this work is the use of grammatical models, which are more commonly associated with the field of computational linguistics. In the second chapter, I use grammatical models to design novel antimicrobial peptides (AmPs). AmPs are small proteins used by the innate immune system to combat bacterial infection in multicellular eukaryotes. There is mounting evidence that these peptides are less susceptible to bacterial resistance than traditional antibiotics and may form the basis for a novel class of therapeutics.(cont.) In this thesis, I described the rational design of novel AmPs that show limited homology to naturally-occurring proteins but have strong bacteriostatic activity against several species of bacteria, including Staphylococcus aureus and Bacillus anthracis. These peptides were designed using a linguistic model of natural AmPs by treating the amino acid sequences of natural AmPs as a formal language and building a set of regular grammars to describe this language. is set of grammars was used to create novel, unnatural AmP sequences that conform to the formal syntax of natural antimicrobial peptides but populate a previously unexplored region of protein sequence space. The third chapter describes a novel, GEneric MOtif DIscovery Algorithm (Gemoda) for sequential data. Gemoda can be applied to any dataset with a sequential character, including both categorical and real-valued data. As I show, Gemoda deterministically discovers motifs that are maximal in composition and length. As well, the algorithm allows any choice of similarity metric for finding motifs. These motifs are representation-agnostic: they can be represented using regular expressions, position weight matrices, or any other model for sequential data.(cont.) I demonstrate a number of applications of the algorithm, including the discovery of motifs in amino acids and DNA sequences, and the discovery of conserved protein sub-structures. The final chapter is devoted to a series of smaller projects, employing tool methods indirectly related to motif discovery in sequential data. I describe the construction of a software tool, Biogrep that is designed to match large pattern sets against large biosequence databases in a parallel fashion. is makes biogrep well-suited to annotating sets of sequences using biologically significant patterns. In addition, I show that the BLOSUM series of amino acid substitution matrices, which are commonly used in motif discovery and sequence alignment problems, have changed drastically over time.The fidelity of amino acid sequence alignment and motif discovery tools depends strongly on the target frequencies implied by these underlying matrices. us, these results suggest that further optimization of these matrices is possible. The final chapter also contains two projects wherein I apply statistical motif discovery tools instead of grammatical tools.(cont.) In the first of these two, I develop three different physiochemical representations for a set of roughly 700 HIV-I protease substrates and use these representations for sequence classification and annotation. In the second of these two projects, I develop a simple statistical method for parsing out the phenotypic contribution of a single mutation from libraries of functional diversity that contain a multitude of mutations and varied phenotypes. I show that this new method successfully elucidates the effects of single nucleotide polymorphisms on the strength of a promoter placed upstream of a reporter gene. The central theme, present throughout this work, is the development and application of novel approaches to finding motifs in sequential data. The work on the design of AmPs is very applied and relies heavily on existing literature. In contrast, the work on Gemoda is the greatest contribution of this thesis and contains many new ideas.by Kyle L. Jensen.Ph.D