Google's Cloud Vision API Is Not Robust To Noise
Google has recently introduced the Cloud Vision API for image analysis.
According to the demonstration website, the API "quickly classifies images into
thousands of categories, detects individual objects and faces within images,
and finds and reads printed words contained within images." It can also be used
to "detect different types of inappropriate content from adult to violent
content."
In this paper, we evaluate the robustness of the Google Cloud Vision API to input
perturbations. In particular, we show that by adding sufficient noise to an
image, the API generates completely different outputs for the noisy image,
while a human observer can still perceive its original content. We show that the
attack is consistently successful by performing extensive experiments on
different image types, including natural images, images containing faces, and
images with text. For instance, using images from the ImageNet dataset, we found
that adding an average of 14.25% impulse noise is enough to deceive the API.
Our findings indicate the vulnerability of the API in adversarial environments.
For example, an adversary can bypass an image filtering system by adding noise
to inappropriate images. We then show that when a noise filter is applied to
input images, the API generates mostly the same outputs for the restored images
as for the original images. This observation suggests that the Cloud Vision API
can readily benefit from noise filtering, without the need to update its image
analysis algorithms.
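To make the perturbation-and-restoration idea concrete, the sketch below adds impulse (salt-and-pepper) noise at a chosen density and then removes it with a median filter. This is a generic illustration of the kind of noise and filtering the abstract describes, not the paper's exact attack or defense; the default density, filter size, and function names are assumptions.

import numpy as np
from scipy.ndimage import median_filter

def add_impulse_noise(image, density=0.1425, seed=None):
    """Corrupt a fraction `density` of pixels with salt-and-pepper noise.
    The 14.25% default mirrors the average noise level quoted in the
    abstract; it is an illustrative choice, not an API parameter."""
    rng = np.random.default_rng(seed)
    noisy = image.copy()
    corrupt = rng.random(image.shape[:2]) < density   # pixels to corrupt
    salt = rng.random(image.shape[:2]) < 0.5          # half salt, half pepper
    noisy[corrupt & salt] = 255
    noisy[corrupt & ~salt] = 0
    return noisy

def restore(noisy, size=3):
    """Simple restoration: a median filter, which suppresses impulse noise.
    For RGB arrays, filter each channel independently."""
    if noisy.ndim == 3:
        return median_filter(noisy, size=(size, size, 1))
    return median_filter(noisy, size=size)

# Hypothetical usage: `image` is a uint8 array loaded elsewhere.
# noisy = add_impulse_noise(image)    # likely to change the API's labels
# restored = restore(noisy)           # labels mostly revert to the originals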
Towards Robust Classification with Image Quality Assessment
Recent studies have shown that deep convolutional neural networks (DCNNs) are
vulnerable to adversarial examples and sensitive to the perceptual quality and
acquisition conditions of images. These findings raise serious concerns about
deploying DCNN-based applications for critical tasks. In the literature,
various defense strategies have been introduced to increase the robustness of
DCNNs, including re-training an entire model with benign noise injection or
adversarial examples, or adding extra layers. In this paper, we investigate the
connection between adversarial manipulation and image quality, and subsequently
propose a protective mechanism that does not require re-training a DCNN. Our
method combines image quality assessment with knowledge distillation to detect
input images that would trigger a DCNN to produce egregiously wrong results.
Using a ResNet model trained on ImageNet as an example, we demonstrate that
the detector can effectively identify poor-quality and adversarial images.
Comment: 11 pages, 7 figures
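The abstract does not spell out the detector, but the general idea of pairing a distilled student with its teacher and flagging inputs on which they disagree can be sketched as below. The model choices (resnet50/resnet18 as stand-ins for a teacher and a distilled student), the KL-divergence score, and the threshold are assumptions for illustration, not the paper's method.

import torch
import torch.nn.functional as F
from torchvision import models

# Assumed setup: a pretrained "teacher" and a smaller "student".
# In a real distillation pipeline the student would be trained to
# mimic the teacher; here a pretrained resnet18 stands in for it.
teacher = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1).eval()
student = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1).eval()

@torch.no_grad()
def disagreement_score(x, temperature=2.0):
    """KL divergence between softened teacher and student predictions.
    Poor-quality or adversarial inputs tend to push the two apart."""
    t = F.log_softmax(teacher(x) / temperature, dim=1)
    s = F.log_softmax(student(x) / temperature, dim=1)
    return F.kl_div(s, t, log_target=True, reduction="none").sum(dim=1)

def flag_suspicious(x, threshold=0.5):
    """Return a boolean mask of inputs whose teacher/student disagreement
    exceeds an (assumed) threshold calibrated on clean validation images."""
    return disagreement_score(x) > threshold

In practice the threshold would be chosen from the disagreement distribution on known-clean images, and an image quality score would be combined with the disagreement signal as the abstract describes.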