Backdoors in Neural Models of Source Code
Deep neural networks are vulnerable to a range of adversaries. A particularly
pernicious class of vulnerabilities is the backdoor, where model predictions
diverge in the presence of subtle triggers in inputs. An attacker can implant a
backdoor by poisoning the training data to yield a desired target prediction on
triggered inputs. We study backdoors in the context of deep learning for source
code. (1) We define a range of backdoor classes for source-code tasks and show
how to poison a dataset to install such backdoors. (2) We adapt and improve
recent algorithms from robust statistics for our setting, showing that
backdoors leave a spectral signature in the learned representation of source
code, thus enabling detection of poisoned data. (3) We conduct a thorough
evaluation on different architectures and languages, showing the ease of
injecting backdoors and our ability to eliminate them.
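The spectral-signature idea in point (2) can be illustrated with a minimal sketch. It assumes the basic, unmodified procedure from robust statistics: center the learned representations, find their top singular direction, and score each sample by its squared projection onto that direction. The function names, the power-iteration helper, and the fixed count `k` of samples to flag are hypothetical choices for illustration, not the adapted algorithm the abstract refers to.

```python
import random


def top_direction(rows, iters=100):
    """Approximate the top right singular vector of `rows` by power iteration."""
    dim = len(rows[0])
    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    v = [rng.gauss(0, 1) for _ in range(dim)]
    for _ in range(iters):
        # Compute w = (R^T R) v without forming the covariance matrix.
        proj = [sum(r[j] * v[j] for j in range(dim)) for r in rows]
        w = [sum(proj[i] * rows[i][j] for i in range(len(rows)))
             for j in range(dim)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0:
            return v
        v = [x / norm for x in w]
    return v


def spectral_scores(reps):
    """Outlier score per sample: squared projection onto the top direction."""
    n, dim = len(reps), len(reps[0])
    mean = [sum(r[j] for r in reps) / n for j in range(dim)]
    centered = [[r[j] - mean[j] for j in range(dim)] for r in reps]
    v = top_direction(centered)
    return [sum(c[j] * v[j] for j in range(dim)) ** 2 for c in centered]


def flag_outliers(reps, k):
    """Return the indices of the k highest-scoring samples (assumed poisoned)."""
    scores = spectral_scores(reps)
    ranked = sorted(range(len(reps)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])
```

In practice `reps` would be the representation vectors a trained model assigns to each training sample, and the flagged samples would be removed before retraining.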
Automatic Abstraction in SMT-Based Unbounded Software Model Checking
Software model checkers based on under-approximations and SMT solvers are
very successful at verifying safety (i.e. reachability) properties. They
combine two key ideas -- (a) "concreteness": a counterexample in an
under-approximation is a counterexample in the original program as well, and
(b) "generalization": a proof of safety of an under-approximation, produced by
an SMT solver, is generalizable to a proof of safety of the original program.
In this paper, we present a combination of "automatic abstraction" with the
under-approximation-driven framework. We explore two iterative approaches for
obtaining and refining abstractions -- "proof based" and "counterexample based"
-- and show how they can be combined into a unified algorithm. To the best of
our knowledge, this is the first application of Proof-Based Abstraction,
primarily used to verify hardware, to Software Verification. We have
implemented a prototype of the framework using Z3, and evaluate it on many
benchmarks from the Software Verification Competition. We show experimentally
that our combination is quite effective on hard instances.
Comment: Extended version of a paper in the proceedings of CAV 201
PECAN: A Deterministic Certified Defense Against Backdoor Attacks
Neural networks are vulnerable to backdoor poisoning attacks, where the
attackers maliciously poison the training set and insert triggers into the test
input to change the prediction of the victim model. Existing defenses for
backdoor attacks either provide no formal guarantees or come with
expensive-to-compute and ineffective probabilistic guarantees. We present
PECAN, an efficient and certified approach for defending against backdoor
attacks. The key insight powering PECAN is to apply off-the-shelf test-time
evasion certification techniques on a set of neural networks trained on
disjoint partitions of the data. We evaluate PECAN on image classification and
malware detection datasets. Our results demonstrate that PECAN (1)
significantly outperforms the state-of-the-art certified backdoor defense in
both defense strength and efficiency, and (2) on real backdoor attacks, reduces
the attack success rate by an order of magnitude compared to a range of
baselines from the literature.
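The "disjoint partitions" structure described above can be sketched as follows. This is only the partition-and-vote skeleton under stated assumptions: hash-based assignment of samples to partitions, and per-model outputs that are either a label or `None` when a test-time certificate could not be obtained. All function names are hypothetical, and the rule that turns the vote margin into a formal guarantee is not reproduced here.

```python
import hashlib
from collections import Counter


def partition_index(example, n_partitions):
    """Deterministically map a training example (as text) to one partition."""
    digest = hashlib.sha256(example.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_partitions


def split_disjoint(dataset, n_partitions):
    """Split (text, label) pairs into disjoint partitions, one per model."""
    parts = [[] for _ in range(n_partitions)]
    for text, label in dataset:
        parts[partition_index(text, n_partitions)].append((text, label))
    return parts


def aggregate(votes):
    """Majority vote over per-model outputs; None means the model abstained.

    Returns (winning label, margin over the runner-up label).
    """
    counts = Counter(v for v in votes if v is not None)
    if not counts:
        return None, 0
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], str(kv[0])))
    top_label, top_count = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0
    return top_label, top_count - runner_up
```

Because each training sample lands in exactly one partition, a poisoned sample can influence at most one model's vote, which is what makes a margin-based argument possible in the first place.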
Synthesis of Recursive ADT Transformations from Reusable Templates
Recent work has proposed a promising approach to improving scalability of
program synthesis by allowing the user to supply a syntactic template that
constrains the space of potential programs. Unfortunately, creating templates
often requires nontrivial effort from the user, which impedes the usability of
the synthesizer. We present a solution to this problem in the context of
recursive transformations on algebraic data-types. Our approach relies on
polymorphic synthesis constructs: a small but powerful extension to the
language of syntactic templates, which makes it possible to define a program
space in a concise and highly reusable manner, while at the same time retaining
the scalability benefits of conventional templates. This approach enables
end-users to reuse predefined templates from a library for a wide variety of
problems with little effort. The paper also describes a novel optimization that
further improves the performance and scalability of the system. We evaluated
the approach on a set of benchmarks that most notably includes desugaring
functions for lambda calculus, which force the synthesizer to discover Church
encodings for pairs and boolean operations.
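For readers unfamiliar with the Church encodings mentioned above, the standard textbook versions for booleans and pairs can be written as plain lambdas. This is a sketch of the well-known encodings only; the programs the synthesizer actually discovers may differ in form, and the helper `to_bool` is a hypothetical convenience for inspection.

```python
# Church booleans: a boolean is a function that selects one of two arguments.
TRUE = lambda t: lambda f: t
FALSE = lambda t: lambda f: f

AND = lambda p: lambda q: p(q)(p)  # if p then q else p
OR = lambda p: lambda q: p(p)(q)   # if p then p else q
NOT = lambda p: p(FALSE)(TRUE)     # if p then FALSE else TRUE

# Church pairs: a pair holds two values and hands them to a selector.
PAIR = lambda a: lambda b: lambda s: s(a)(b)
FST = lambda p: p(TRUE)            # TRUE selects the first component
SND = lambda p: p(FALSE)           # FALSE selects the second component


def to_bool(church_bool):
    """Convert a Church boolean to a native Python bool for inspection."""
    return church_bool(True)(False)
```

Note that the same selector functions serve as both the booleans and the pair projections, which is the kind of reuse a desugaring function has to find.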