Interpretable and Generalizable Person Re-Identification with Query-Adaptive Convolution and Temporal Lifting

Liao, Shengcai; Shao, Ling

doi:10.1007/978-3-030-58621-8_27

arXiv:1904.10424 (cs)

[Submitted on 23 Apr 2019 (v1), last revised 20 Jul 2020 (this version, v4)]

Abstract: For person re-identification, existing deep networks often focus on representation learning. However, without transfer learning, the learned model is fixed as is, which is not adaptable for handling various unseen scenarios. In this paper, beyond representation learning, we consider how to formulate person image matching directly in deep feature maps. We treat image matching as finding local correspondences in feature maps, and construct query-adaptive convolution kernels on the fly to achieve local matching. In this way, the matching process and results are interpretable, and this explicit matching is more generalizable than representation features to unseen scenarios, such as unknown misalignments, pose or viewpoint changes. To facilitate end-to-end training of this architecture, we further build a class memory module to cache feature maps of the most recent samples of each class, so as to compute image matching losses for metric learning. Through direct cross-dataset evaluation, the proposed Query-Adaptive Convolution (QAConv) method gains large improvements over popular learning methods (about 10%+ mAP), and achieves comparable results to many transfer learning methods. Besides, a model-free temporal co-occurrence-based score weighting method called TLift is proposed, which further improves performance, achieving state-of-the-art results in cross-dataset person re-identification. Code is available at this https URL.
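The local-matching idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 1x1 kernel size, the L2 normalization, the two-way max pooling, and the plain averaging into a single score are all assumptions made here for illustration, and `l2_normalize` and `local_match_score` are hypothetical names. Feature maps are flattened to lists of per-location vectors so that the example needs only the Python standard library.

```python
import math


def l2_normalize(vec):
    """Scale a feature vector to unit length (left unchanged if all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def local_match_score(query_map, gallery_map):
    """Match two feature maps by local correspondences.

    Each map is a list of per-location feature vectors (a flattened H x W grid).
    Every query location is used as a 1x1 kernel applied at every gallery
    location, which reduces to a dot product per location pair.
    """
    q = [l2_normalize(v) for v in query_map]
    g = [l2_normalize(v) for v in gallery_map]

    # sim[i][j]: response of the kernel built from query location i
    # at gallery location j.
    sim = [[sum(a * b for a, b in zip(qv, gv)) for gv in g] for qv in q]

    # Best correspondence for each query location, and for each gallery location.
    best_for_query = [max(row) for row in sim]
    best_for_gallery = [max(col) for col in zip(*sim)]

    # Assumption: average the best responses into one score. A trained model
    # would more plausibly learn this aggregation.
    responses = best_for_query + best_for_gallery
    return sum(responses) / len(responses), sim


# Two locations, two channels. The gallery holds the same local patterns as
# the query but at swapped positions and with different magnitudes.
query = [[1.0, 0.0], [0.0, 1.0]]
shifted_gallery = [[0.0, 2.0], [3.0, 0.0]]
unrelated_gallery = [[1.0, 1.0], [1.0, 1.0]]

score_shifted, sim_shifted = local_match_score(query, shifted_gallery)
score_unrelated, _ = local_match_score(query, unrelated_gallery)
```

In this toy case the shifted gallery scores higher than the unrelated one even though no location lines up position by position, and `sim_shifted` shows which gallery location each query location matched. That is the sense in which explicit local matching is both tolerant of misalignment and inspectable.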

Comments:	This is the ECCV 2020 version, including the appendix
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:1904.10424 [cs.CV]
	(or arXiv:1904.10424v4 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1904.10424
Journal reference:	Vedaldi A., Bischof H., Brox T., Frahm JM. (eds). European Conference on Computer Vision. ECCV 2020. Lecture Notes in Computer Science, vol 12356. Springer, Cham
Related DOI:	https://doi.org/10.1007/978-3-030-58621-8_27

Submission history

From: Shengcai Liao
[v1] Tue, 23 Apr 2019 17:03:13 UTC (2,393 KB)
[v2] Mon, 16 Dec 2019 11:43:28 UTC (2,552 KB)
[v3] Sat, 4 Jul 2020 17:25:54 UTC (2,378 KB)
[v4] Mon, 20 Jul 2020 07:36:28 UTC (2,381 KB)
