
    Blockwise p-Tampering Attacks on Cryptographic Primitives, Extractors, and Learners

    Austrin, Chung, Mahmoody, Pass and Seth (Crypto '14) studied the notion of bitwise $p$-tampering attacks over randomized algorithms, in which an efficient 'virus' gets to control each bit of the randomness with independent probability $p$ in an online way. The work of Austrin et al. showed how to break certain 'privacy primitives' (e.g., encryption, commitments, etc.) through bitwise $p$-tampering, by giving a bitwise $p$-tampering biasing attack that increases the average $E[f(U_n)]$ of any efficient function $f \colon \{0,1\}^n \to [-1,+1]$ by $\Omega(p \cdot \mathrm{Var}[f(U_n)])$, where $\mathrm{Var}[f(U_n)]$ is the variance of $f(U_n)$. In this work, we revisit and extend the bitwise tampering model of Austrin et al. to the blockwise setting, where blocks of randomness become tamperable with independent probability $p$. Our main result is an efficient blockwise $p$-tampering attack that biases the average $E[f(X)]$ of any efficient function $f$ mapping an arbitrary $X$ to $[-1,+1]$ by $\Omega(p \cdot \mathrm{Var}[f(X)])$, regardless of how $X$ is partitioned into individually tamperable blocks $X = (X_1, \dots, X_n)$. Relying on previous works, our main biasing attack immediately implies efficient attacks against the privacy primitives as well as seedless multi-source extractors, in a model where the attacker gets to tamper with each block (or source) of the randomness with independent probability $p$. Further, we show how to increase the classification error of deterministic learners in the so-called 'targeted poisoning' attack model under Valiant's adversarial noise. In this model, an attacker has a 'target' test point $d$ in mind and wishes to increase the error of classifying $d$ while she gets to tamper with each training example with independent probability $p$ in an online way.
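    The abstract does not spell out the attack itself, so the following is only a minimal, illustrative simulation of the blockwise tampering model: a greedy 'virus' that, whenever a block happens to be tamperable (with independent probability p), picks the block value that maximizes a Monte Carlo estimate of the conditional expectation of f given the prefix fixed so far. The function and parameter names (greedy_block_tamper, block_domains, n_samples) are our own illustrative choices, honest blocks are drawn uniformly for simplicity, and this sketch is not the paper's actual biasing attack or its analysis.

```python
import random

def greedy_block_tamper(f, block_domains, p, n_samples=100, rng=random):
    """Toy simulation of an online blockwise p-tampering attack.

    f             : efficient function mapping a tuple of blocks to [-1, +1]
    block_domains : candidate values for each block X_1, ..., X_n
    p             : probability that each block becomes tamperable
    When a block is tamperable, the 'virus' sees the prefix fixed so far and
    greedily picks the value maximizing a sampled estimate of E[f | prefix]."""
    prefix = []
    for i, domain in enumerate(block_domains):
        if rng.random() < p:  # this block is under the attacker's control
            best_val, best_est = None, float("-inf")
            for v in domain:
                # Monte Carlo estimate of E[f(prefix, v, random suffix)]
                est = sum(
                    f(tuple(prefix + [v] +
                            [rng.choice(d) for d in block_domains[i + 1:]]))
                    for _ in range(n_samples)
                ) / n_samples
                if est > best_est:
                    best_val, best_est = v, est
            prefix.append(best_val)
        else:
            prefix.append(rng.choice(domain))  # honest block, sampled uniformly
    return tuple(prefix)

if __name__ == "__main__":
    # f is +1 iff the majority of 15 uniform bits are 1; compare honest vs tampered mean
    f = lambda x: 1.0 if sum(x) > len(x) // 2 else -1.0
    domains = [[0, 1]] * 15
    honest = sum(f(tuple(random.choice(d) for d in domains)) for _ in range(200)) / 200
    tampered = sum(f(greedy_block_tamper(f, domains, p=0.2)) for _ in range(200)) / 200
    print(f"E[f] honest ~ {honest:.3f}, tampered ~ {tampered:.3f}")
```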

    Truth inference at scale: A Bayesian model for adjudicating highly redundant crowd annotations

    Crowd-sourcing is a cheap and popular means of creating training and evaluation datasets for machine learning; however, it poses the problem of 'truth inference', as individual workers cannot be wholly trusted to provide reliable annotations. Research into models of annotation aggregation attempts to infer a latent 'true' annotation, which has been shown to improve the utility of crowd-sourced data. However, existing techniques beat simple baselines only in low-redundancy settings, where the number of annotations per instance is low (≤ 3), or in situations where workers are unreliable and produce low-quality annotations (e.g., through spamming, random, or adversarial behaviours). As we show, datasets produced by crowd-sourcing are often not of this type: the data is highly redundantly annotated (≥ 5 annotations per instance), and the vast majority of workers produce high-quality outputs. In these settings, the majority vote heuristic performs very well, and most truth inference models underperform this simple baseline. We propose a novel technique based on a Bayesian graphical model with conjugate priors and simple iterative expectation-maximisation inference. Our technique is competitive with state-of-the-art benchmark methods, and is the only method that significantly outperforms the majority vote heuristic (significance tests at the one-sided 0.025 level). Moreover, our technique is simple, is implemented in only 50 lines of code, and trains in seconds.
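    The abstract does not give the model's equations, so as a hedged sketch of the general idea (iterative EM over worker reliabilities and label posteriors, in the spirit of Dawid and Skene rather than the paper's exact conjugate graphical model), the following compares the majority vote baseline against a simple one-accuracy-parameter-per-worker EM aggregator. The names majority_vote and em_aggregate are illustrative assumptions.

```python
from collections import defaultdict

def majority_vote(annotations):
    """Baseline: annotations is a list of (item, worker, label) triples."""
    counts = defaultdict(lambda: defaultdict(int))
    for item, _, lab in annotations:
        counts[item][lab] += 1
    return {item: max(c, key=c.get) for item, c in counts.items()}

def em_aggregate(annotations, labels, n_iters=20):
    """Simple EM truth inference with one accuracy parameter per worker
    (an illustrative simplification, not the paper's full Bayesian model)."""
    items = {a[0] for a in annotations}
    workers = {a[1] for a in annotations}
    # initialise label posteriors from (smoothed) vote proportions
    post = {i: {l: 1.0 for l in labels} for i in items}
    for item, _, lab in annotations:
        post[item][lab] += 1.0
    for i in items:
        z = sum(post[i].values())
        post[i] = {l: v / z for l, v in post[i].items()}
    for _ in range(n_iters):
        # M-step: worker accuracy = expected agreement with posteriors (Laplace-smoothed)
        num, den = defaultdict(float), defaultdict(float)
        for item, w, lab in annotations:
            num[w] += post[item][lab]
            den[w] += 1.0
        acc = {w: (num[w] + 1.0) / (den[w] + 2.0) for w in workers}
        # E-step: recompute label posteriors given worker accuracies
        for i in items:
            post[i] = {l: 1.0 for l in labels}
        for item, w, lab in annotations:
            for l in labels:
                p = acc[w] if l == lab else (1.0 - acc[w]) / (len(labels) - 1)
                post[item][l] *= p
        for i in items:
            z = sum(post[i].values())
            post[i] = {l: v / z for l, v in post[i].items()}
    return {i: max(post[i], key=post[i].get) for i in items}

anns = [("q1", "w1", "A"), ("q1", "w2", "A"), ("q1", "w3", "B"),
        ("q2", "w1", "B"), ("q2", "w2", "B"), ("q2", "w3", "B")]
print(majority_vote(anns), em_aggregate(anns, labels=["A", "B"]))
```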

    Legion: Best-first concolic testing (competition contribution)

    Legion is a grey-box, coverage-based concolic testing tool that aims to balance the complementary strengths of fuzzing and symbolic execution to achieve the best of both worlds. It proposes a variation of Monte Carlo tree search (MCTS) that formulates program exploration as sequential decision-making under uncertainty, guided by a best-first search strategy. It relies on approximate path-preserving fuzzing, a novel instance of constrained random testing, which quickly generates many diverse inputs that are likely to target program parts of interest. In Test-Comp 2020 [1], the prototype performed within 90% of the best score in 9 of 22 categories.
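    As a rough illustration of the best-first, MCTS-style selection the abstract describes (not Legion's actual implementation; the node structure, UCT scoring, and fuzzing placeholder below are assumptions made only for this sketch):

```python
import math

class Node:
    """A path prefix in the exploration tree (hypothetical, simplified structure)."""
    def __init__(self, path_constraint, parent=None):
        self.path_constraint = path_constraint   # predicate over concrete inputs
        self.parent = parent
        self.children = []
        self.visits = 0
        self.reward = 0.0                        # e.g., new coverage found below this node

    def uct_score(self, c=math.sqrt(2)):
        if self.visits == 0:
            return float("inf")                  # unvisited children are tried first
        exploit = self.reward / self.visits
        explore = c * math.sqrt(math.log(self.parent.visits) / self.visits)
        return exploit + explore

def select_best_first(root):
    """Descend the tree, always following the child with the highest UCT score."""
    node = root
    while node.children:
        node = max(node.children, key=lambda child: child.uct_score())
    return node

def simulate_and_backpropagate(node, candidate_inputs):
    """Stand-in for approximate path-preserving fuzzing: reward the node by how many
    mutated inputs still satisfy its path constraint, then update all ancestors."""
    reward = sum(1.0 for x in candidate_inputs if node.path_constraint(x))
    while node is not None:
        node.visits += 1
        node.reward += reward
        node = node.parent
    return reward
```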

    Attacking Data Transforming Learners at Training Time

    While machine learning systems are known to be vulnerable to data-manipulation attacks at both training and deployment time, little is known about how to adapt attacks when the defender transforms data prior to model estimation. We consider the setting where the defender Bob first transforms the data and then learns a model from the result; Alice, the attacker, perturbs Bob's input data before he transforms it. We develop a general-purpose "plug and play" framework for gradient-based attacks based on matrix differentials, focusing on ordinary least-squares linear regression. This allows learning algorithms and data transformations to be paired and composed arbitrarily: attacks can be adapted to compositional learning maps through the use of the chain rule, analogous to backpropagation on neural network parameters. Best-response attacks can be computed through matrix multiplications from a library of attack matrices for transformations and learners. Our treatment of linear regression extends state-of-the-art attacks at training time by permitting the attacker to affect both features and targets optimally and simultaneously. We explore several transformations broadly used across machine learning, with autoregressive modeling as a driving motivation for our work. There, Bob transforms a univariate time series into a matrix of observations and a vector of target values, which can then be fed into standard learners. Under this learning reduction, a perturbation from Alice to a single value of the time series affects the features of several data points along with their target values.
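    As a simplified numeric sketch of attacking through a composed transform-then-learn pipeline (not the paper's matrix-differential framework; ar_transform, ols_fit, attacker_loss and the finite-difference gradient below are illustrative stand-ins for its analytic chain-rule attack matrices), the following perturbs a univariate time series so that the OLS model Bob fits on the autoregressive windowing of that series does worse on a target point Alice has chosen:

```python
import numpy as np

def ar_transform(series, order=3):
    """Autoregressive windowing: turn a univariate series into (X, y) for a standard learner."""
    X = np.column_stack([series[i:len(series) - order + i] for i in range(order)])
    y = series[order:]
    return X, y

def ols_fit(X, y):
    """Ordinary least squares via the normal equations."""
    return np.linalg.solve(X.T @ X, X.T @ y)

def attacker_loss(series, target_x, target_y, order=3):
    """Bob's pipeline composed end to end: transform -> fit -> evaluate on Alice's target.
    Alice wants to *increase* the target error, so we negate it and minimize."""
    X, y = ar_transform(series, order)
    theta = ols_fit(X, y)
    return -(target_x @ theta - target_y) ** 2

def numeric_gradient(loss, series, eps=1e-5):
    """Finite-difference gradient of the composed map w.r.t. the raw series
    (standing in for the analytic chain-rule / matrix-differential computation)."""
    g = np.zeros_like(series)
    for i in range(len(series)):
        bump = np.zeros_like(series)
        bump[i] = eps
        g[i] = (loss(series + bump) - loss(series - bump)) / (2 * eps)
    return g

# Example: Alice nudges the raw series with small, bounded gradient steps
rng = np.random.default_rng(0)
s = np.cumsum(rng.normal(size=50))                    # Bob's univariate time series
target_x, target_y = np.array([0.5, -0.2, 1.0]), 3.0  # Alice's chosen target point
for _ in range(20):
    g = numeric_gradient(lambda z: attacker_loss(z, target_x, target_y), s)
    s = s - 0.05 * np.clip(g, -1.0, 1.0)
```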

    Exploring misconceptions as a trigger for enhancing student learning

    This article addresses the importance of confronting misconceptions in the teaching of the STEM disciplines. First, we review the central place of threshold concepts in many disciplines and the threat misconceptions pose to quality education. Second, we offer approaches for confronting misconceptions in the classroom in different contexts. Finally, we discuss what we can learn from these approaches and the common threads that run through successful ones. These steps have been explored in relation to four case studies across diverse disciplines. From these case studies, a set of principles about how best to address misconceptions in STEM disciplines has been distilled. As conceptual knowledge increases in importance in higher education, effective strategies for helping students develop accurate conceptual understanding will also become increasingly critical.