Pedestrian alignment network for large-scale person re-identification
Authors: Zhedong Zheng, Liang Zheng, Yi Yang
Published in IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2018
Recommended citation: Zhedong Zheng, Liang Zheng, Yi Yang, "Pedestrian alignment network for large-scale person re-identification." IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2018. DOI: 10.1109/TCSVT.2018.2873599
Download PDF: https://zdzheng.xyz/files/TCSVT-08481710.pdf
Chinese blog post (explanation): https://zhuanlan.zhihu.com/p/29269953
Code is available at: https://github.com/layumi/Pedestrian_Alignment
Abstract: Person re-identification (person re-ID) is mostly viewed as an image retrieval problem. This task aims to search for a query person in a large image pool. In practice, person re-ID usually adopts automatic detectors to obtain cropped pedestrian images. However, this process suffers from two types of detector errors: excessive background and missing parts. Both errors deteriorate the quality of pedestrian alignment and may compromise pedestrian matching due to variances in position and scale. To address the misalignment problem, we propose that alignment can be learned from an identification procedure. We introduce the pedestrian alignment network (PAN), which allows discriminative embedding learning and pedestrian alignment without extra annotations. Our key observation is that when the convolutional neural network (CNN) learns to discriminate between different identities, the learned feature maps usually exhibit strong activations on the human body rather than the background. The proposed network thus takes advantage of this attention mechanism to adaptively locate and align pedestrians within a bounding box. Visual examples show that pedestrians are better aligned with PAN. Experiments on three large-scale re-ID datasets confirm that PAN improves the discriminative ability of the feature embeddings and yields accuracy competitive with state-of-the-art methods.
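The sketch below illustrates the core alignment idea from the abstract in PyTorch: treat the spatial activations of a base CNN as an attention signal, regress 2x3 affine parameters from them, and re-sample the detector crop so the pedestrian is re-centred before extracting the embedding. This is a minimal, hypothetical re-implementation for illustration only; the module and function names (AffineBranch, align_pedestrian) are not from the paper, and the authors' released code (MatConvNet) at the repository above is the reference implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class AffineBranch(nn.Module):
    """Regress 2x3 affine parameters from a CNN feature map."""
    def __init__(self, in_channels):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(in_channels, 6)
        # Initialise to the identity transform so training starts from the
        # original (unaligned) detector crop.
        nn.init.zeros_(self.fc.weight)
        self.fc.bias.data.copy_(
            torch.tensor([1.0, 0.0, 0.0, 0.0, 1.0, 0.0]))

    def forward(self, feat):
        # feat: (N, C, H, W) activations from the base branch
        # (e.g. a ResNet stage); strong activations tend to lie on the body.
        theta = self.fc(self.pool(feat).flatten(1))
        return theta.view(-1, 2, 3)

def align_pedestrian(image, feat, affine_branch):
    """Warp the input crop with the affine grid predicted from its features."""
    theta = affine_branch(feat)
    grid = F.affine_grid(theta, image.size(), align_corners=False)
    return F.grid_sample(image, grid, align_corners=False)

if __name__ == "__main__":
    images = torch.randn(4, 3, 256, 128)   # detector crops
    feats = torch.randn(4, 2048, 8, 4)      # base-branch activations
    branch = AffineBranch(in_channels=2048)
    aligned = align_pedestrian(images, feats, branch)
    print(aligned.shape)                    # torch.Size([4, 3, 256, 128])

In this sketch both the aligned crop and the original crop would feed identification losses, so the affine parameters are learned from identity supervision alone, without extra alignment annotations.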
@article{zheng2018pedestrian,
author = "Zheng, Zhedong and Zheng, Liang and Yang, Yi",
doi = "10.1109/TCSVT.2018.2873599",
title = "Pedestrian alignment network for large-scale person re-identification",
journal = "IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)",
volume = "29",
number = "10",
pages = "3037--3045",
year = "2018",
code = "https://github.com/layumi/Pedestrian\_Alignment",
url = "https://zdzheng.xyz/files/TCSVT-08481710.pdf",
blog = "https://zhuanlan.zhihu.com/p/29269953",
publisher = "IEEE",
abs = "Person re-identification (person re-ID) is mostly viewed as an image retrieval problem. This task aims to search a query person in a large image pool. In practice, person re-ID usually adopts automatic detectors to obtain cropped pedestrian images. However, this process suffers from two types of detector errors: excessive background and part missing. Both errors deteriorate the quality of pedestrian alignment and may compromise pedestrian matching due to the position and scale variances. To address the misalignment problem, we propose that alignment can be learned from an identification procedure. We introduce the pedestrian alignment network (PAN) which allows discriminative embedding learning and pedestrian alignment without extra annotations. Our key observation is that when the convolutional neural network (CNN) learns to discriminate between different identities, the learned feature maps usually exhibit strong activations on the human body rather than the background. The proposed network thus takes advantage of this attention mechanism to adaptively locate and align pedestrians within a bounding box. Visual examples show that pedestrians are better aligned with PAN. Experiments on three large-scale re-ID datasets confirm that PAN improves the discriminative ability of the feature embeddings and yields competitive accuracy with the state-of-the-art methods." }