13,911 research outputs found

    Multi-party Poisoning through Generalized pp-Tampering

    Get PDF
    In a poisoning attack against a learning algorithm, an adversary tampers with a fraction of the training data TT with the goal of increasing the classification error of the constructed hypothesis/model over the final test distribution. In the distributed setting, TT might be gathered gradually from mm data providers P1,,PmP_1,\dots,P_m who generate and submit their shares of TT in an online way. In this work, we initiate a formal study of (k,p)(k,p)-poisoning attacks in which an adversary controls k[n]k\in[n] of the parties, and even for each corrupted party PiP_i, the adversary submits some poisoned data TiT'_i on behalf of PiP_i that is still "(1p)(1-p)-close" to the correct data TiT_i (e.g., 1p1-p fraction of TiT'_i is still honestly generated). For k=mk=m, this model becomes the traditional notion of poisoning, and for p=1p=1 it coincides with the standard notion of corruption in multi-party computation. We prove that if there is an initial constant error for the generated hypothesis hh, there is always a (k,p)(k,p)-poisoning attacker who can decrease the confidence of hh (to have a small error), or alternatively increase the error of hh, by Ω(pk/m)\Omega(p \cdot k/m). Our attacks can be implemented in polynomial time given samples from the correct data, and they use no wrong labels if the original distributions are not noisy. At a technical level, we prove a general lemma about biasing bounded functions f(x1,,xn)[0,1]f(x_1,\dots,x_n)\in[0,1] through an attack model in which each block xix_i might be controlled by an adversary with marginal probability pp in an online way. When the probabilities are independent, this coincides with the model of pp-tampering attacks, thus we call our model generalized pp-tampering. We prove the power of such attacks by incorporating ideas from the context of coin-flipping attacks into the pp-tampering model and generalize the results in both of these areas

    Minimal Synthesis of String To String Functions From Examples

    Full text link
    We study the problem of synthesizing string to string transformations from a set of input/output examples. The transformations we consider are expressed using deterministic finite automata (DFA) that read pairs of letters, one letter from the input and one from the output. The DFA corresponding to these transformations have additional constraints, ensuring that each input string is mapped to exactly one output string. We suggest that, given a set of input/output examples, the smallest DFA consistent with the examples is a good candidate for the transformation the user was expecting. We therefore study the problem of, given a set of examples, finding a minimal DFA consistent with the examples and satisfying the functionality and totality constraints mentioned above. We prove that, in general, this problem (the corresponding decision problem) is NP-complete. This is unlike the standard DFA minimization problem which can be solved in polynomial time. We provide several NP-hardness proofs that show the hardness of multiple (independent) variants of the problem. Finally, we propose an algorithm for finding the minimal DFA consistent with input/output examples, that uses a reduction to SMT solvers. We implemented the algorithm, and used it to evaluate the likelihood that the minimal DFA indeed corresponds to the DFA expected by the user.Comment: SYNT 201

    Developments from enquiries into the learnability of the pattern languages from positive data

    Get PDF
    AbstractThe pattern languages are languages that are generated from patterns, and were first proposed by Angluin as a non-trivial class that is inferable from positive data [D. Angluin, Finding patterns common to a set of strings, Journal of Computer and System Sciences 21 (1980) 46–62; D. Angluin, Inductive inference of formal languages from positive data, Information and Control 45 (1980) 117–135]. In this paper we chronologize some results that developed from the investigations on the inferability of the pattern languages from positive data

    Epistemic virtues, metavirtues, and computational complexity

    Get PDF
    I argue that considerations about computational complexity show that all finite agents need characteristics like those that have been called epistemic virtues. The necessity of these virtues follows in part from the nonexistence of shortcuts, or efficient ways of finding shortcuts, to cognitively expensive routines. It follows that agents must possess the capacities – metavirtues –of developing in advance the cognitive virtues they will need when time and memory are at a premium

    Learning probability distributions generated by finite-state machines

    Get PDF
    We review methods for inference of probability distributions generated by probabilistic automata and related models for sequence generation. We focus on methods that can be proved to learn in the inference in the limit and PAC formal models. The methods we review are state merging and state splitting methods for probabilistic deterministic automata and the recently developed spectral method for nondeterministic probabilistic automata. In both cases, we derive them from a high-level algorithm described in terms of the Hankel matrix of the distribution to be learned, given as an oracle, and then describe how to adapt that algorithm to account for the error introduced by a finite sample.Peer ReviewedPostprint (author's final draft
    corecore