Multi-party Poisoning through Generalized $p$-Tampering
In a poisoning attack against a learning algorithm, an adversary tampers with
a fraction of the training data $T$ with the goal of increasing the
classification error of the constructed hypothesis/model over the final test
distribution. In the distributed setting, $T$ might be gathered gradually from
$m$ data providers $P_1,\dots,P_m$ who generate and submit their shares of $T$
in an online way.
In this work, we initiate a formal study of $(k,p)$-poisoning attacks in
which an adversary controls $k\in[m]$ of the parties, and even for each
corrupted party $P_i$, the adversary submits some poisoned data $T'_i$ on
behalf of $P_i$ that is still "$(1-p)$-close" to the correct data $T_i$ (e.g.,
a $1-p$ fraction of $T'_i$ is still honestly generated). For $k=m$, this model
becomes the traditional notion of poisoning, and for $p=1$ it coincides with
the standard notion of corruption in multi-party computation.
We prove that if there is an initial constant error for the generated
hypothesis $h$, there is always a $(k,p)$-poisoning attacker who can decrease
the confidence of $h$ (to have a small error), or alternatively increase the
error of $h$, by $\Omega(p \cdot k/m)$. Our attacks can be implemented in
polynomial time given samples from the correct data, and they use no wrong
labels if the original distributions are not noisy.
At a technical level, we prove a general lemma about biasing bounded
functions $f(x_1,\dots,x_n)\in[0,1]$ through an attack model in which each
block $x_i$ might be controlled by an adversary with marginal probability $p$
in an online way. When the probabilities are independent, this coincides with
the model of $p$-tampering attacks, thus we call our model generalized
$p$-tampering. We prove the power of such attacks by incorporating ideas from
the context of coin-flipping attacks into the $p$-tampering model and
generalize the results in both of these areas.
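To make the idea of biasing a bounded function concrete, here is a toy simulation. It is only a sketch under stated assumptions, not the attack from the paper: the function `f` (the fraction of blocks equal to 1), the independent tampering probability `p`, and the greedy adversary that always writes 1 are all illustrative choices. Under these assumptions each block equals 1 with probability p + (1 - p)/2, so the mean of `f` should shift upward by about p/2.

```python
import random


def f(blocks):
    """Bounded function with range [0, 1]: the fraction of blocks equal to 1."""
    return sum(blocks) / len(blocks)


def sample_blocks(n, p, rng):
    """Draw n blocks online; each block is tampered independently with probability p."""
    blocks = []
    for _ in range(n):
        if rng.random() < p:
            blocks.append(1)  # adversary's turn: greedy choice that raises f
        else:
            blocks.append(rng.randint(0, 1))  # honest turn: a fair coin
    return blocks


def estimate_mean(n, p, trials, seed=0):
    """Monte Carlo estimate of E[f] when each block is tampered with probability p."""
    rng = random.Random(seed)
    return sum(f(sample_blocks(n, p, rng)) for _ in range(trials)) / trials


honest = estimate_mean(n=20, p=0.0, trials=20000)
tampered = estimate_mean(n=20, p=0.2, trials=20000)
print(f"honest ~ {honest:.3f}, tampered ~ {tampered:.3f}")
```

The gap between the two estimates should be close to p/2 = 0.1 here. For a function that is not monotone in its blocks, the adversary would have to choose each tampered value based on the blocks seen so far, which this sketch does not attempt.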
Minimal Synthesis of String To String Functions From Examples
We study the problem of synthesizing string to string transformations from a
set of input/output examples. The transformations we consider are expressed
using deterministic finite automata (DFA) that read pairs of letters, one
letter from the input and one from the output. The DFA corresponding to these
transformations have additional constraints, ensuring that each input string is
mapped to exactly one output string.
We suggest that, given a set of input/output examples, the smallest DFA
consistent with the examples is a good candidate for the transformation the
user was expecting. We therefore study the problem of, given a set of examples,
finding a minimal DFA consistent with the examples and satisfying the
functionality and totality constraints mentioned above.
We prove that, in general, this problem (the corresponding decision problem)
is NP-complete. This is unlike the standard DFA minimization problem which can
be solved in polynomial time. We provide several NP-hardness proofs that show
the hardness of multiple (independent) variants of the problem.
Finally, we propose an algorithm for finding the minimal DFA consistent with
input/output examples that uses a reduction to SMT solvers. We implemented the
algorithm and used it to evaluate the likelihood that the minimal DFA indeed
corresponds to the DFA expected by the user.
Comment: SYNT 201
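As an illustration of a DFA that reads pairs of letters, the sketch below checks whether a given automaton is consistent with a set of input/output examples. It does not search for a minimal automaton and does not use an SMT solver. The dictionary encoding, the function names, and the one-state example automaton (which swaps `a` and `b`) are all hypothetical. The `transduce` helper assumes a simplified local condition, namely that every state has exactly one enabled output letter per input letter, which may differ from the functionality and totality constraints used in the paper.

```python
def accepts(delta, accepting, start, inp, out):
    """Run the DFA on the letter pairs (inp[i], out[i]); a missing transition rejects."""
    if len(inp) != len(out):
        return False
    state = start
    for pair in zip(inp, out):
        if (state, pair) not in delta:
            return False
        state = delta[(state, pair)]
    return state in accepting


def consistent(delta, accepting, start, examples):
    """True if every (input, output) example is accepted as a sequence of pairs."""
    return all(accepts(delta, accepting, start, i, o) for i, o in examples)


def transduce(delta, start, inp):
    """Compute the output for inp, assuming one enabled pair per (state, input letter)."""
    state, out = start, []
    for a in inp:
        choices = [(b, q) for (s, (x, b)), q in delta.items() if s == state and x == a]
        assert len(choices) == 1, "automaton is not functional/total at this step"
        b, state = choices[0]
        out.append(b)
    return "".join(out)


# Hypothetical one-state automaton over pairs: it maps 'a' to 'b' and 'b' to 'a'.
delta = {(0, ("a", "b")): 0, (0, ("b", "a")): 0}
accepting = {0}
examples = [("ab", "ba"), ("aab", "bba")]

print(consistent(delta, accepting, 0, examples))
print(transduce(delta, 0, "abba"))
```

A synthesis procedure would search over automata of increasing size and return the first one for which `consistent` holds; that search is the hard part and is not shown here.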
Developments from enquiries into the learnability of the pattern languages from positive data
The pattern languages are languages that are generated from patterns, and were first proposed by Angluin as a non-trivial class that is inferable from positive data [D. Angluin, Finding patterns common to a set of strings, Journal of Computer and System Sciences 21 (1980) 46–62; D. Angluin, Inductive inference of formal languages from positive data, Information and Control 45 (1980) 117–135]. In this paper we chronologize some results that developed from the investigations on the inferability of the pattern languages from positive data.
Epistemic virtues, metavirtues, and computational complexity
I argue that considerations about computational complexity show that all finite agents need characteristics like those that have been called epistemic virtues. The necessity of these virtues follows in part from the nonexistence of shortcuts, or efficient ways of finding shortcuts, to cognitively expensive routines. It follows that agents must possess the capacities (metavirtues) of developing in advance the cognitive virtues they will need when time and memory are at a premium.
Learning probability distributions generated by finite-state machines
We review methods for inference of probability distributions generated by probabilistic automata and related models for sequence generation. We focus on methods that can be proved to learn in the inference in the limit and PAC formal models. The methods we review are state merging and state splitting methods for probabilistic deterministic automata and the recently developed spectral method for nondeterministic probabilistic automata. In both cases, we derive them from a high-level algorithm described in terms of the Hankel matrix of the distribution to be learned, given as an oracle, and then describe how to adapt that algorithm to account for the error introduced by a finite sample.
Peer Reviewed. Postprint (author's final draft)
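The Hankel matrix mentioned above has one row per prefix u and one column per suffix v, with entry P(uv). The sketch below builds a finite block of that matrix for a small example and computes its rank exactly. It is only an illustration: the two-state automaton, its probabilities, and all function names are assumptions made for this example, and the review's algorithms are not reproduced. The exact probabilities play the role of the oracle; with a finite sample they would be replaced by empirical frequencies.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)
# Hypothetical 2-state probabilistic deterministic automaton (illustration only):
# state 0 emits 'a' and moves to state 1, state 1 emits 'b' and moves to state 0,
# and each state stops with probability 1/2.
TRANS = {(0, "a"): (1, HALF), (1, "b"): (0, HALF)}
STOP = {0: HALF, 1: HALF}


def prob(word):
    """Exact probability that the automaton generates exactly this word."""
    state, p = 0, Fraction(1)
    for letter in word:
        if (state, letter) not in TRANS:
            return Fraction(0)
        state, step = TRANS[(state, letter)]
        p *= step
    return p * STOP[state]


def strings_up_to(alphabet, max_len):
    """All strings over the alphabet with length at most max_len, shortest first."""
    return ["".join(t) for k in range(max_len + 1) for t in product(alphabet, repeat=k)]


def hankel_block(prefixes, suffixes):
    """Finite block of the Hankel matrix: entry [u][v] is P(u + v)."""
    return [[prob(u + v) for v in suffixes] for u in prefixes]


def rank(matrix):
    """Rank by Gaussian elimination over exact fractions."""
    rows = [row[:] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


basis = strings_up_to("ab", 2)
H = hankel_block(basis, basis)
print(rank(H))
```

For this example the block has rank 2, matching the number of states of the generating automaton.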