Abstract
Although deep convolutional neural networks (CNNs) have demonstrated remarkable performance on multiple computer vision tasks, research on adversarial learning has shown that deep models are vulnerable to adversarial examples, which are crafted by adding visually imperceptible perturbations to input images. Most existing adversarial attack methods create only a single adversarial example per input, which gives just a glimpse of the underlying data manifold of adversarial examples. An attractive alternative is to explore the solution space of adversarial examples and generate a diverse set of them, which could improve the robustness of real-world systems and help prevent severe security threats and vulnerabilities. In this paper, we present an effective method, called Hamiltonian Monte Carlo with Accumulated Momentum (HMCAM), that generates a sequence of adversarial examples. To improve the efficiency of HMC, we propose a new regime that automatically controls the length of trajectories, allowing the algorithm to move with adaptive step sizes along the search direction at different positions. Moreover, we revisit the reason for the high computational cost of adversarial training from the viewpoint of Markov chain Monte Carlo (MCMC) and design a new generative method called Contrastive Adversarial Training (CAT), which approaches the equilibrium distribution of adversarial examples within only a few iterations by building on small modifications of standard Contrastive Divergence (CD) and achieves a trade-off between efficiency and accuracy. Both quantitative and qualitative analyses on several natural image datasets and practical systems confirm the superiority of the proposed algorithm.
Framework
Experiment
Conclusion
In this paper, we formulate the generation of adversarial examples as an MCMC process and present an efficient paradigm called Hamiltonian Monte Carlo with Accumulated Momentum (HMCAM). In contrast to traditional iterative attack methods, which aim to generate a single optimal adversarial example in one run, HMCAM efficiently explores the distribution space to find multiple solutions and generates a sequence of adversarial examples. We also develop a new generative method called Contrastive Adversarial Training (CAT), which approaches the equilibrium distribution of adversarial examples within only a few iterations by building on small modifications of standard Contrastive Divergence. Extensive comparisons on CIFAR10 show that HMCAM attains much higher success rates than other black-box models and results comparable to other white-box models in adversarial attack, and that CAT achieves a trade-off between efficiency and accuracy in adversarial training. When further evaluated against the champion solution in the defense track of the CAAD 2018 competition, HMCAM outperforms the official baseline attack and M-PGD. To demonstrate its practical applicability, we apply HMCAM to investigate the robustness of real-world celebrity recognition systems and compare it against the Geekpwn CAAD 2018 method. The results show that existing real-world celebrity recognition systems are extremely vulnerable to adversarial attacks in the black-box scenario: most examples generated by our approach mislead the system with high confidence, which raises security concerns and motivates the development of more robust celebrity recognition models. The proposed attack strategy leads to a new paradigm for generating adversarial examples, which can potentially be used to assess the robustness of networks and inspire stronger adversarial learning methods in the future.
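To make the idea of treating adversarial example generation as an MCMC process more concrete, the following is a minimal sketch of textbook Hamiltonian Monte Carlo, not the HMCAM algorithm itself: it has no accumulated momentum and no adaptive trajectory length. Everything in it is an assumption made for illustration: the toy linear classifier, the function names `loss_and_grad` and `hmc_adversarial_samples`, the choice of potential energy (negative loss divided by a `temperature`), and the handling of the perturbation budget by rejecting proposals that leave the L-infinity ball of radius `eps`.

```python
import numpy as np


def loss_and_grad(x, w, b, y):
    """Logistic loss of a toy linear classifier and its gradient w.r.t. x.

    The label y is assumed to be -1 or +1.
    """
    margin = y * (w @ x + b)
    loss = np.logaddexp(0.0, -margin)              # log(1 + exp(-margin))
    sig_neg = 0.5 * (1.0 - np.tanh(0.5 * margin))  # sigmoid(-margin), stable
    grad = -y * w * sig_neg                        # d loss / d x
    return loss, grad


def hmc_adversarial_samples(x0, w, b, y, eps=0.5, step=0.05, n_leapfrog=10,
                            n_samples=200, temperature=0.1, seed=0):
    """Sample points near x0 with density proportional to exp(loss / temperature).

    Assumption for this sketch: points outside the L-infinity ball of radius
    eps around x0 get infinite potential, so such proposals are rejected.
    """
    rng = np.random.default_rng(seed)

    def potential(x):
        if np.max(np.abs(x - x0)) > eps:
            return np.inf
        return -loss_and_grad(x, w, b, y)[0] / temperature

    def grad_potential(x):
        return -loss_and_grad(x, w, b, y)[1] / temperature

    x = x0.astype(float).copy()
    samples = []
    for _ in range(n_samples):
        p = rng.standard_normal(x.shape)           # resample momentum
        h_old = potential(x) + 0.5 * (p @ p)

        # Leapfrog integration: half momentum step, alternating full steps,
        # then a final half momentum step.
        x_new, p_new = x.copy(), p.copy()
        p_new -= 0.5 * step * grad_potential(x_new)
        for i in range(n_leapfrog):
            x_new += step * p_new
            if i < n_leapfrog - 1:
                p_new -= step * grad_potential(x_new)
        p_new -= 0.5 * step * grad_potential(x_new)

        # Metropolis correction; out-of-budget proposals have h_new = inf.
        h_new = potential(x_new) + 0.5 * (p_new @ p_new)
        if np.log(rng.uniform()) < h_old - h_new:
            x = x_new
        samples.append(x.copy())
    return np.array(samples)


# Toy usage: a 3-dimensional input that the classifier labels correctly.
w = np.array([1.0, -2.0, 0.5])
b = 0.0
x0 = np.array([1.0, -1.0, 0.5])
samples = hmc_adversarial_samples(x0, w, b, y=1)
print(samples.shape)
```

Each row of `samples` is one candidate perturbed input inside the budget, so a single chain yields a sequence of candidates rather than one optimized point. A real attack would replace the toy classifier with a network and compute the gradient by automatic differentiation.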