89 research outputs found

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    Get PDF
    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    Learning Possibilistic Logic Theories

    Get PDF
    We address the problem of learning interpretable machine learning models from uncertain and missing information. We first develop a novel deep learning architecture, RIDDLE (Rule InDuction with Deep LEarning), based on properties of possibility theory. Experimental results and a comparison with FURIA, an existing state-of-the-art rule induction method, show that RIDDLE is a promising rule induction algorithm for finding rules in data. We then formally investigate the task of identifying rules with confidence degrees associated with them in the exact learning model. We formally define theoretical frameworks and show conditions that must hold to guarantee that a learning algorithm will identify the rules that hold in a domain. Finally, we develop an algorithm that learns rules with associated confidence values in the exact learning model, and we propose a technique to simulate queries in the exact learning model from data. Experiments show encouraging results for learning a set of rules that approximates the rules encoded in data. (Doctoral dissertation)
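
    A minimal sketch of how membership queries in the exact learning model might be simulated from a finite dataset, in the spirit of the technique the abstract mentions; the function names and the majority-vote fallback are illustrative assumptions, not the dissertation's method:

        from collections import Counter

        def make_membership_oracle(dataset):
            """dataset: list of (example, label) pairs with hashable examples.
            Returns a function that answers membership queries from data."""
            table = {}
            for x, y in dataset:
                table.setdefault(x, []).append(y)
            # Fallback for unseen examples: the majority label in the data (an assumption).
            default = Counter(y for _, y in dataset).most_common(1)[0][0]

            def oracle(x):
                labels = table.get(x)
                if labels is None:
                    return default
                return Counter(labels).most_common(1)[0][0]  # majority vote on duplicates

            return oracle

        # Usage: oracle = make_membership_oracle([((1, 0), True), ((0, 0), False)])
        #        oracle((1, 0))  # -> True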

    ON EXPRESSIVENESS, INFERENCE, AND PARAMETER ESTIMATION OF DISCRETE SEQUENCE MODELS

    Get PDF
    Huge neural autoregressive sequence models have achieved impressive performance across applications such as NLP, reinforcement learning, and bioinformatics. However, some lingering problems (e.g., the consistency and coherence of generated text) persist regardless of parameter count. In the first part of this thesis, we chart a taxonomy of the expressiveness of various sequence model families (Ch 3). In particular, we put forth complexity-theoretic proofs that string latent-variable sequence models are strictly more expressive than energy-based sequence models, which in turn are more expressive than autoregressive sequence models. Based on these findings, we introduce residual energy-based sequence models, a family of energy-based sequence models (Ch 4) whose sequence weights can be evaluated efficiently and which perform competitively against autoregressive models. However, we show how unrestricted energy-based sequence models can suffer from uncomputability, and how this problem is generally unfixable without knowledge of the true sequence distribution (Ch 5). In the second part of the thesis, we study practical sequence model families and algorithms based on the theoretical findings of the first part. We introduce neural particle smoothing (Ch 6), a family of approximate sampling methods that work with conditional latent-variable models. We also introduce neural finite-state transducers (Ch 7), which extend weighted finite-state transducers with mark strings, allowing transduction paths in a finite-state transducer to be scored with a neural network. Finally, we propose neural regular expressions (Ch 8), a family of neural sequence models that are easy to engineer, allowing a user to design flexible weighted relations using Marked FSTs and to combine these relations with various operations.
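
    As an illustration of the residual idea (a minimal sketch under assumed interfaces, not the thesis's implementation): a residual energy-based model reweights a tractable autoregressive model by a learned energy term, so the unnormalized score log p_AR(x) - E(x) of any sequence is cheap to evaluate even though the global partition function is not.

        def log_score(tokens, ar_logprob, energy):
            """Unnormalized log-weight of a sequence under a residual
            energy-based model: log p_AR(x) - E(x).  Normalizing over all
            sequences is intractable, but this per-sequence score is cheap,
            which is what makes the residual family practical."""
            return ar_logprob(tokens) - energy(tokens)

        def rerank(candidates, ar_logprob, energy):
            """Pick the candidate sequence with the highest residual score,
            e.g. to rescore samples drawn from the base autoregressive model."""
            return max(candidates, key=lambda x: log_score(x, ar_logprob, energy))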

    Integrating Distributional, Compositional, and Relational Approaches to Neural Word Representations

    Get PDF
    When the field of natural language processing (NLP) entered the era of deep neural networks, the task of representing basic units of language, an inherently sparse and symbolic medium, using low-dimensional dense real-valued vectors, or embeddings, became crucial. The dominant technique to perform this task has for years been to segment input text sequences into space-delimited words, for which embeddings are trained over a large corpus by leveraging distributional information: a word is reducible to the set of contexts it appears in. This approach is powerful but imperfect; words not seen during the embedding learning phase, known as out-of-vocabulary words (OOVs), emerge in any plausible application where embeddings are used. One approach to combating this and other shortcomings is the incorporation of compositional information obtained from the surface form of words, enabling the representation of morphological regularities and increasing robustness to typographical errors. Another approach leverages word-sense information and relations curated in large semantic graph resources, offering a supervised signal for embedding space structure and improving representations for domain-specific rare words. In this dissertation, I offer several analyses and remedies for the OOV problem based on the utilization of character-level compositional information in multiple languages and the structure of semantic knowledge in English. In addition, I provide two novel datasets for the continued exploration of vocabulary expansion in English: one with a taxonomic emphasis on novel word formation, and the other generated by a real-world data-driven use case in the entity graph domain. Finally, recognizing the recent shift in NLP towards contextualized representations of subword tokens, I describe the form in which the OOV problem still appears in these methods, and apply an integrative compositional model to address it. (Ph.D. dissertation)
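
    One well-known compositional remedy of the kind described here is the fastText approach of representing a word by its character n-grams; a minimal sketch, assuming a pretrained n-gram embedding table (the names are illustrative, not the dissertation's code):

        import numpy as np

        def char_ngrams(word, n_min=3, n_max=6):
            """Character n-grams of a word with boundary markers, fastText-style."""
            w = f"<{word}>"
            return [w[i:i + n] for n in range(n_min, n_max + 1)
                    for i in range(len(w) - n + 1)]

        def oov_vector(word, ngram_emb, dim=300):
            """Compose an embedding for an unseen word from its n-gram vectors."""
            grams = [g for g in char_ngrams(word) if g in ngram_emb]
            if not grams:
                return np.zeros(dim)  # no known subunits: back off to a zero vector
            return np.mean([ngram_emb[g] for g in grams], axis=0)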

    A method for system of systems definition and modeling using patterns of collective behavior

    Get PDF
    The Department of Defense ship and aircraft acquisition process, with its capability-based assessments and fleet synthesis studies, relies heavily on the assumption that a functional decomposition of higher-level system-of-systems (SoS) capabilities into lower-level system and subsystem behaviors is both possible and practical. However, SoS typically exhibit "non-decomposable" behaviors (also known as emergent behaviors) for which no widely accepted representation exists. Unforeseen emergent behaviors, particularly undesirable ones, can make systems vulnerable to attacks, hacks, or other exploitation, and mitigating them can delay acquisition program schedules and cause cost overruns. The International Council on Systems Engineering has identified the development of methods for predicting and managing emergent behaviors as one of the top research priorities for the systems engineering profession. This thesis therefore develops a method for making quantifiable SoS emergent properties and behaviors traceable to the patterns of interaction of their constituent systems, so that exploitable patterns identified during the early stages of design can be accounted for. The method is designed to fill two gaps in the literature: the lack of an approach for mining data to derive a model (i.e., an equation) of a non-decomposable behavior, and the lack of an approach for qualitatively and quantitatively associating emergent behaviors with the components that cause them. A definition of emergent behavior is synthesized from the literature, along with necessary conditions for its identification. An ontology of emergence that enables studying the emergent behaviors exhibited by self-organized systems via numerical simulation is adapted in order to develop the mathematical approach needed to satisfy the research objective. Within the confines of two carefully qualified assumptions (that the model is valid and that the model is efficient), it is argued that simulated emergence is bona fide emergence, and that simulations can be used for experimentation without sacrificing rigor. The thesis then puts forward three hypotheses. The first is that self-organized structures imply the presence of a form of data compression, and that this compression can be used to explicitly calculate an upper bound on the number of emergent behaviors a system can possess. The second is that the set of numerical criteria for detecting emergent behavior derived in this research constitutes sufficient conditions for identifying weak and functional emergent behaviors. The third is that affecting the emergent properties of these systems has a bigger impact on system performance than affecting any single component. Using the method developed in this thesis, exploitable properties are identified and component behaviors are modified to attempt the exploit; changes in performance are evaluated using problem-specific measures of merit. The experiments falsify Hypothesis 2 (the numerical criteria are not sufficient conditions) by identifying instances where the criteria produce a false positive; a set of sufficient conditions for identifying emergent behavior therefore remains to be found. Hypothesis 1 was also falsified, based on a worst-case scenario in which the largest possible number of obtainable emergent behaviors was compared against the upper bound computed from the smallest possible data compression of a self-organized system. Hypothesis 3, on the other hand, was supported: new behavior rules based on component-level properties provided less improvement in performance against an adversary than rules based on system-level properties. Overall, the method is shown to be an effective, systematic approach to exploiting non-decomposable behavior, and an improvement over the current, largely ad hoc approach. (Ph.D. dissertation)
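
    A toy illustration of the compression intuition behind Hypothesis 1 (not the thesis's actual procedure): a general-purpose compressor shrinks a snapshot of a self-organized state far more than a snapshot of a disordered one, and the achievable compression is what the hypothesis uses to bound the number of emergent behaviors.

        import zlib, random

        def compression_ratio(state: bytes) -> float:
            """Compressed size over raw size; lower means more structure."""
            return len(zlib.compress(state)) / len(state)

        random.seed(0)
        disordered = bytes(random.randrange(256) for _ in range(4096))
        ordered = bytes((i // 64) % 256 for i in range(4096))  # coarse spatial pattern

        print(compression_ratio(disordered))  # near 1.0: little structure
        print(compression_ratio(ordered))     # far below 1.0: self-organized structure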

    The complexity of the first-order theory of pure equality

    Full text link
    We find a lower bound on the recognition complexity of theories that are nontrivial relative to some equivalence relation (which may be equality), namely, theories consistent with a formula asserting that there exist two non-equivalent elements. First, we obtain a lower bound on the computational complexity of the first-order theory of the two-element Boolean algebra. For this purpose, we code long deterministic Turing machine computations by relatively short quantified Boolean formulae, making substantial use of a modified Stockmeyer-Meyer method for the simulation. We then transform, in polynomial time, the modeling formulae of the theory of this Boolean algebra into simulation formulae of the first-order theory of a single equivalence relation. Since the computational complexity of these theories is not polynomial, we obtain that the class $\mathbf{P}$ is a proper subclass of $\mathbf{PSPACE}$ (polynomial time is a proper subset of polynomial space). Keywords: computational complexity, the theory of equality, coding of computations, simulation by means of formulae, polynomial time, polynomial space, lower complexity bound. Comment: 40 pages, 19 bibliography references.
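
    The nontriviality condition used above can be made explicit; a sketch of the condition as described in this abstract (the paper's formal definition may differ): a theory $T$ over a language containing an equivalence relation $\sim$ (possibly equality) is nontrivial relative to $\sim$ when

        \[
            T \cup \bigl\{\, \exists x\, \exists y\; \neg (x \sim y) \,\bigr\}
            \quad \text{is consistent,}
        \]

    i.e., when $T$ admits a model with at least two non-equivalent elements.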

    Arbitrary-precision computation of the gamma function

    Get PDF
    We discuss the best methods available for computing the gamma function $\Gamma(z)$ in arbitrary-precision arithmetic with rigorous error bounds. We address different cases: rational, algebraic, real or complex arguments; large or small arguments; low or high precision; with or without precomputation. The methods also cover the log-gamma function $\log \Gamma(z)$, the digamma function $\psi(z)$, and the derivatives $\Gamma^{(n)}(z)$ and $\psi^{(n)}(z)$. Besides attempting to summarize the existing state of the art, we present some new formulas, estimates, bounds and algorithmic improvements and discuss implementation results.
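
    A brief sketch of how such evaluations can be reproduced with the Python library mpmath, which provides arbitrary-precision versions of all of these functions (an illustration, not the paper's reference implementation):

        from mpmath import mp, gamma, loggamma, digamma

        mp.dps = 50  # 50 decimal digits of working precision

        print(gamma(mp.mpf(1) / 3))     # Gamma(1/3) to 50 digits
        print(loggamma(10**6))          # log-gamma at a large argument
        print(digamma(mp.mpf('0.5')))   # psi(1/2) = -gamma_E - 2 ln 2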

    MIP*=RE

    Full text link
    We show that the class MIP* of languages that can be decided by a classical verifier interacting with multiple all-powerful quantum provers sharing entanglement is equal to the class RE of recursively enumerable languages. Our proof builds upon the quantum low-degree test of (Natarajan and Vidick, FOCS 2018) and the classical low-individual-degree test of (Ji et al., 2020) by integrating recent developments from (Natarajan and Wright, FOCS 2019) and combining them with the recursive compression framework of (Fitzsimons et al., STOC 2019). An immediate byproduct of our result is that there is an efficient reduction from the Halting Problem to the problem of deciding whether a two-player nonlocal game has entangled value $1$ or at most $1/2$. Using a known connection, undecidability of the entangled value implies a negative answer to Tsirelson's problem: we show, by providing an explicit example, that the closure $C_{qa}$ of the set of quantum tensor product correlations is strictly included in the set $C_{qc}$ of quantum commuting correlations. Following work of (Fritz, Rev. Math. Phys. 2012) and (Junge et al., J. Math. Phys. 2011), our results provide a refutation of Connes' embedding conjecture from the theory of von Neumann algebras. Comment: 206 pages. v2: Updated to use arXiv:2009.12982. New appendix.
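
    Schematically, the chain of consequences stated in this abstract (a restatement of its claims, not additional results):

        \[
            \mathsf{MIP}^* = \mathsf{RE}
            \;\Longrightarrow\;
            \text{distinguishing } \mathrm{val}^*(G) = 1 \text{ from } \mathrm{val}^*(G) \le \tfrac{1}{2} \text{ is undecidable}
            \;\Longrightarrow\;
            C_{qa} \subsetneq C_{qc}
            \;\Longrightarrow\;
            \text{Connes' embedding conjecture fails.}
        \]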

    36th International Symposium on Theoretical Aspects of Computer Science: STACS 2019, March 13-16, 2019, Berlin, Germany

    Get PDF