A Black-Box Adversarial Attack via Deep Reinforcement Learning on the Feature Space

Lyue Li, Amir Rezapour, Wen Guey Tzeng

    Research output: Conference contribution, peer-reviewed

    Abstract

    In this paper, we propose a novel black-box adversarial attack that uses reinforcement learning to learn the characteristics of the target classifier C. Our method does not need a substitute classifier that resembles C in structure and parameters. Instead, it learns an optimal attack policy that guides the attacker in building an adversarial image from the original image. We work on the feature space of images rather than directly on their pixels. Our method achieves better results on many measures. It achieves a 94.5% attack success rate on a well-trained digit classifier. Our adversarial images have better imperceptibility even though their norm distances to the original images are larger than those of other methods. Since our method works on the characteristics of a classifier, it has better transferability. The transfer rate of our method can reach 52.1% for a targeted class and 65.9% for a non-targeted class, improving on previous results, which had single-digit transfer rates. We also show that our attack is harder to defend against with defense mechanisms such as MagNet, which uses a denoising technique: our method achieves a 65% attack success rate even when the target classifier employs MagNet as a defense.
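
The idea of learning an attack policy purely from a classifier's outputs, while perturbing features rather than pixels, can be illustrated with a toy sketch. This is not the authors' algorithm: the abstract does not describe the network, the reward, or how features are extracted and mapped back to images. Everything below is an assumption made for illustration, including the two-dimensional feature vector, the hypothetical `target_classifier`, the `ACTIONS` set, and the reward. A stateless epsilon-greedy value update stands in for deep reinforcement learning.

```python
import math
import random

STEP = 0.1
# Assumed action set: nudge one feature up or down by STEP.
ACTIONS = [(0, STEP), (0, -STEP), (1, STEP), (1, -STEP)]


def target_classifier(features):
    """Hypothetical black box returning P(class 1); the attacker only queries it."""
    z = 1.5 * features[0] - 2.0 * features[1] + 0.25
    return 1.0 / (1.0 + math.exp(-z))


def attack(original, episodes=50, max_steps=40, epsilon=0.2, alpha=0.5, seed=0):
    """Search for a feature vector that flips a class-1 input to class 0."""
    rng = random.Random(seed)
    q = [0.0] * len(ACTIONS)  # one value per action (stateless simplification)
    best, best_dist = None, float("inf")
    for _ in range(episodes):
        x = list(original)
        for _ in range(max_steps):
            before = target_classifier(x)
            if rng.random() < epsilon:  # explore a random action
                a = rng.randrange(len(ACTIONS))
            else:  # exploit the best-valued action so far
                a = max(range(len(ACTIONS)), key=lambda i: q[i])
            idx, delta = ACTIONS[a]
            x[idx] += delta
            after = target_classifier(x)
            # Assumed reward: drop in confidence for the true class (class 1).
            q[a] += alpha * ((before - after) - q[a])
            if after < 0.5:  # predicted label flipped: adversarial features found
                dist = math.sqrt(sum((u - v) ** 2 for u, v in zip(original, x)))
                if dist < best_dist:
                    best, best_dist = list(x), dist
                break
    return best, best_dist, q


adv, dist, q_values = attack([1.0, 0.0])
print("adversarial features:", adv)
print("L2 distance:", dist, "P(class 1):", target_classifier(adv))
```

The attacker never reads the classifier's weights; it only observes how the returned probability changes after each perturbation, which is what makes the setting black-box. The sketch tracks the smallest L2 perturbation found, whereas the abstract reports that the paper's adversarial images are less perceptible despite larger norm distances, so the two should not be read as optimizing the same objective.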

    Original language: English
    Title of host publication: 2021 IEEE Conference on Dependable and Secure Computing, DSC 2021
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    ISBN (Electronic): 9781728175348
    Publication status: Published - 30 Jan 2021
    Event: 2021 IEEE Conference on Dependable and Secure Computing, DSC 2021 - Aizuwakamatsu, Fukushima, Japan
    Duration: 30 Jan 2021 to 2 Feb 2021

    Publication series

    Name: 2021 IEEE Conference on Dependable and Secure Computing, DSC 2021

    Conference

    Conference: 2021 IEEE Conference on Dependable and Secure Computing, DSC 2021
    Country/Territory: Japan
    City: Aizuwakamatsu, Fukushima
    Period: 30/01/21 to 2/02/21
