S-Adapter: Generalizing Vision Transformer for Face Anti-Spoofing with Statistical Tokens

Cai, Rizhao; Yu, Zitong; Kong, Chenqi; Li, Haoliang; Chen, Changsheng; Hu, Yongjian; Kot, Alex

Abstract:Face Anti-Spoofing (FAS) aims to detect malicious attempts to invade a face recognition system by presenting spoofed faces. State-of-the-art FAS techniques predominantly rely on deep learning models but their cross-domain generalization capabilities are often hindered by the domain shift problem, which arises due to different distributions between training and testing data. In this study, we develop a generalized FAS method under the Efficient Parameter Transfer Learning (EPTL) paradigm, where we adapt the pre-trained Vision Transformer models for the FAS task. During training, the adapter modules are inserted into the pre-trained ViT model, and the adapters are updated while other pre-trained parameters remain fixed. We find the limitations of previous vanilla adapters in that they are based on linear layers, which lack a spoofing-aware inductive bias and thus restrict the cross-domain generalization. To address this limitation and achieve cross-domain generalized FAS, we propose a novel Statistical Adapter (S-Adapter) that gathers local discriminative and statistical information from localized token histograms. To further improve the generalization of the statistical tokens, we propose a novel Token Style Regularization (TSR), which aims to reduce domain style variance by regularizing Gram matrices extracted from tokens across different domains. Our experimental results demonstrate that our proposed S-Adapter and TSR provide significant benefits in both zero-shot and few-shot cross-domain testing, outperforming state-of-the-art methods on several benchmark tests. We will release the source code upon acceptance.

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2309.04038 [cs.CV]
	(or arXiv:2309.04038v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2309.04038

Computer Science > Computer Vision and Pattern Recognition

Title:S-Adapter: Generalizing Vision Transformer for Face Anti-Spoofing with Statistical Tokens

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators