中圖分類號: TP183 文獻(xiàn)標(biāo)識碼: A DOI: 10.19358/j.issn.2096-5133.2022.01.004 引用格式: 宗啟灼,徐茹枝,年家呈. 一種基于局部平均有限差分的黑盒對抗攻擊方法[J].信息技術(shù)與網(wǎng)絡(luò)安全,2022,41(1):23-29,36.
A black-box adversarial attack method based on local average finite difference
Zong Qizhuo,Xu Ruzhi,Nian Jiacheng
(School of Control and Computer Engineering,North China Electric Power University,Beijing 102206,China)
Abstract: In the field of black box attacks, the current main method is to use the migration of adversarial samples to achieve adversarial attacks. However, the current methods are not effective. For this reason, this paper proposes an access-based black box attack method, which uses the finite difference method to directly estimate the gradient of the loss function of the sample in the target model. In order to improve the efficiency of the attack, the algorithm is optimized in two aspects. Firstly, in the finite difference process, the average pixel value in a fixed area is used instead of each pixel value in the area, so that each area only needs to be calculated once. Secondly, when generating adversarial samples iteratively, the idea of reusing multiple generations of gradient generation to resist disturbance is proposed, which significantly reduces the number of attack iterations. After a lot of experimental verification, the iterative non-target attacks in MNIST, CIFAR-10 and ImageNet have achieved 99.8%, 99.9% and 85.8% attack success rates respectively, leading most of today′s black box attack algorithms.
Key words : image recognition;adversarial sample;local average finite difference;black box attack