In the past several years, various adversarial training (AT) approaches have been invented to robustify deep learning models against adversarial attacks. However, mainstream AT methods assume that the training and testing data are drawn from the same distribution and that the training data are annotated. When these two assumptions are violated, existing AT methods fail because they either cannot transfer knowledge learned from a source domain to an unlabeled target domain or are confused by the adversarial samples in that unlabeled space. In this paper, we first point out this new and challenging problem: adversarial training in an unlabeled target domain. We then propose a novel framework named Unsupervised Cross-domain Adversarial Training (UCAT) to address this problem. UCAT leverages the knowledge of the labeled source domain to prevent adversarial samples from misleading the training process, guided by automatically selected high-quality pseudo labels of the unannotated target domain data together with discriminative and robust anchor representations of the source domain data. Experiments on four public benchmarks show that models trained with UCAT achieve both high accuracy and strong robustness. The effectiveness of the proposed components is demonstrated through a large set of ablation studies. The source code is publicly available at https://github.com/DIAL-RPI/UCAT.
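To make the idea of "automatically selected high-quality pseudo labels" concrete, the sketch below shows one common way such a selection can be done: keep only the target samples whose top softmax probability exceeds a confidence threshold. This is an illustrative assumption, not the selection rule used by UCAT; the function names `softmax` and `select_pseudo_labels`, the `threshold` value of 0.9, and the toy logits are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_pseudo_labels(batch_logits, threshold=0.9):
    """Return (sample_index, predicted_label) pairs for confident samples.

    Assumption: confidence is the top softmax probability, and a fixed
    threshold decides which unlabeled target samples are kept. This is a
    generic heuristic, not necessarily the criterion used in the paper.
    """
    selected = []
    for i, logits in enumerate(batch_logits):
        probs = softmax(logits)
        label = max(range(len(probs)), key=probs.__getitem__)
        if probs[label] >= threshold:
            selected.append((i, label))
    return selected


# Toy logits for three unlabeled target samples (hypothetical values).
target_logits = [
    [4.0, 0.5, 0.1],  # confident prediction for class 0
    [1.0, 1.1, 0.9],  # ambiguous, should be discarded
    [0.2, 0.1, 5.0],  # confident prediction for class 2
]

print(select_pseudo_labels(target_logits))  # [(0, 0), (2, 2)]
```

In a setup like this, only the retained samples would contribute pseudo-labeled targets to adversarial training, so that low-confidence predictions are less likely to mislead the model.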
Reference
IEEE Trans. Image Processing, Vol. 32, pp. 1272-1284 (2023)
Bibtex
@ARTICLE{Zhang_Chao_TIP_2023,
  author={Zhang, Jiajin and Chao, Hanqing and Yan, Pingkun},
  journal={IEEE Transactions on Image Processing},
  title={Toward Adversarial Robustness in Unlabeled Target Domains},
  year={2023},
  volume={32},
  number={},
  pages={1272-1284},
  doi={10.1109/TIP.2023.3242141}
}