Determining Sequence of Image Processing Technique (IPT) to Detect Adversarial Attacks
Securing machine learning models against adversarial examples is
challenging because new methods for generating adversarial attacks are
continually being developed. In this work, we propose an evolutionary approach to
automatically determine an Image Processing Technique Sequence (IPTS) for
detecting malicious inputs. Accordingly, we first used a diverse set of attack
methods, including adaptive attacks against our defense, to generate
adversarial samples from the clean dataset. A detection framework based on a
genetic algorithm (GA) is developed to find the optimal IPTS, where the
optimality is estimated by different fitness measures such as Euclidean
distance, entropy loss, average histogram, local binary pattern and loss
functions. The "image difference" between the original and processed images is
used to extract the features, which are then fed to a classification scheme in
order to determine whether the input sample is adversarial or clean. This paper
describes our methodology and reports experiments on multiple datasets
tested with several adversarial attacks. For each attack type and dataset, the
approach generates a unique IPTS. A set of IPTS is selected dynamically at test
time and works as a filter against the adversarial attack. Our empirical
experiments showed promising results, indicating that the approach can
efficiently be used as a processing step for any AI model.
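The pipeline outlined in the abstract (apply a sequence of image processing techniques, take the "image difference", extract features, and search over sequences with a genetic algorithm) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the three stand-in techniques (`box_blur`, `quantize`, `gamma`), the fitness definition (mean Euclidean norm of the image difference on adversarial samples minus that on clean samples), the GA settings, and the toy data, in which random sign noise stands in for a real attack.

```python
import numpy as np

def box_blur(img):
    """3x3 mean filter with edge padding (hypothetical stand-in IPT)."""
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    return sum(padded[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0

def quantize(img, levels=8):
    """Reduce the number of grey levels (hypothetical stand-in IPT)."""
    return np.round(img * (levels - 1)) / (levels - 1)

def gamma(img, g=0.8):
    """Gamma correction on [0, 1] images (hypothetical stand-in IPT)."""
    return np.clip(img, 0.0, 1.0) ** g

IPTS = {"blur": box_blur, "quantize": quantize, "gamma": gamma}
NAMES = sorted(IPTS)

def apply_sequence(img, seq):
    """Apply the named techniques in order and keep the result in [0, 1]."""
    for name in seq:
        img = IPTS[name](img)
    return np.clip(img, 0.0, 1.0)

def difference_features(img, seq):
    """Features of the image difference: Euclidean norm, mean |diff|, histogram entropy."""
    diff = img - apply_sequence(img, seq)
    hist, _ = np.histogram(diff, bins=16, range=(-1.0, 1.0))
    p = hist[hist > 0] / hist.sum()
    entropy = -np.sum(p * np.log2(p))
    return np.array([np.linalg.norm(diff), np.abs(diff).mean(), entropy])

def fitness(seq, clean, adv):
    """Assumed fitness: how much larger the difference norm is on adversarial inputs."""
    d_clean = np.mean([difference_features(x, seq)[0] for x in clean])
    d_adv = np.mean([difference_features(x, seq)[0] for x in adv])
    return d_adv - d_clean

def evolve(clean, adv, seq_len=3, pop_size=8, generations=5, seed=0):
    """Tiny elitist GA over fixed-length sequences (requires seq_len >= 2)."""
    rng = np.random.default_rng(seed)
    pop = [[str(n) for n in rng.choice(NAMES, size=seq_len)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda s: fitness(s, clean, adv), reverse=True)
        history.append(fitness(pop[0], clean, adv))
        parents = pop[: pop_size // 2]          # the best half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.choice(len(parents), size=2, replace=False)
            cut = int(rng.integers(1, seq_len))  # single-point crossover
            child = parents[a][:cut] + parents[b][cut:]
            if rng.random() < 0.3:               # point mutation
                child[int(rng.integers(seq_len))] = str(rng.choice(NAMES))
            children.append(child)
        pop = parents + children
    best = max(pop, key=lambda s: fitness(s, clean, adv))
    return best, history

def toy_data(n=4, size=16, eps=0.1, seed=1):
    """Smooth 'clean' images plus sign-noise copies standing in for adversarial samples."""
    rng = np.random.default_rng(seed)
    ramp = np.linspace(0.2, 0.8, size)
    clean = [np.outer(ramp, ramp) + 0.1 * rng.random() for _ in range(n)]
    adv = [np.clip(x + eps * rng.choice([-1.0, 1.0], size=x.shape), 0.0, 1.0) for x in clean]
    return clean, adv

clean, adv = toy_data()
best, history = evolve(clean, adv)
features = difference_features(adv[0], best)  # would be fed to a classifier
```

In a fuller version, the feature vectors for clean and adversarial samples would be passed to a trained classifier, and the fitness would combine several of the measures listed in the abstract rather than the single distance used in this sketch.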
Robust filtering schemes for machine learning systems to defend Adversarial Attack