LTCC

Person re-identification (Re-ID) aims to match a target person across camera views at different locations and times. Existing Re-ID studies focus on the short-term cloth-consistent setting, under which a person re-appears in different camera views with the same outfit. A discriminative feature representation learned by existing deep Re-ID models is thus dominated by the visual appearance of clothing. In this work, we focus on a much more difficult yet practical setting where person matching is conducted over a long duration, e.g., over days and months, and therefore inevitably under the new challenge of changing clothes. This problem, termed Long-Term Cloth-Changing (LTCC) Re-ID, is much understudied due to the lack of large-scale datasets. The first contribution of this work is a new LTCC dataset containing people captured over a long period of time with frequent clothing changes. As a second contribution, we propose a novel Re-ID method specifically designed to address the cloth-changing challenge. Specifically, we consider that under cloth changes, soft biometrics such as body shape would be more reliable. We therefore introduce a shape embedding module as well as a cloth-elimination shape-distillation module, aiming to eliminate the now unreliable clothing appearance features and focus on the body shape information. Extensive experiments show that the proposed model achieves superior performance on the new LTCC dataset.

Illustration of the long-term cloth-changing Re-ID task and dataset. The task is to match the same person under cloth changes across different views, and the dataset contains the same identities in diverse clothes.

To facilitate the study of Long-Term Cloth-Changing (LTCC) Re-ID, we collect a new LTCC person Re-ID dataset. LTCC contains 17,119 person images of 152 identities, and each identity is captured by at least two cameras. To further explore the cloth-changing Re-ID scenario, we assume that different people will not wear identical outfits (however visually similar they may be), and annotate each image with a cloth label as well. Note that changes in hairstyle or carried items, e.g., hat, bag, or laptop, do not affect the cloth label. Finally, depending on whether there is a cloth change, the dataset can be divided into two subsets: a cloth-change subset in which 91 persons appear with 416 different sets of outfits in 14,783 images, and a cloth-consistent subset containing the remaining 61 identities with 2,336 images without outfit changes. On average, there are 5 different outfits for each cloth-changing person, with the number of outfit changes ranging from 2 to 14.

Examples of people wearing the same and different clothes in the LTCC dataset. The images vary in illumination, occlusion, camera view, carried items, and pose. This dataset has already been released.

Data Release Agreement

Dataset records are made available to researchers only after the receipt and acceptance of a completed and signed Database Release Agreement.

[Data Release Protocol]

Unless otherwise indicated, please submit requests for the dataset to: xuelinq92@gmail.com or xlqian@nwpu.edu.cn ~~xlqian15@fudan.edu.cn~~

Directory Structure

The directory structure is similar to that of Market-1501. Specifically, the package contains four folders:
(1) "train". There are 9,576 images with 77 identities in this folder used for training (46 cloth-change IDs + 31 cloth-consistent IDs).
(2) "test". There are 7,050 images with 75 identities in this folder used for testing (45 cloth-change IDs + 30 cloth-consistent IDs).
(3) "query". There are 493 images with 75 identities. We randomly select one query image for each camera and each outfit.
(4) "info". There are 4 TXT files describing the details of the identity split in the train and test folders, which can be used for different experiment settings (e.g., training with cloth-changing images only).

Naming Rule:
Take "103_1_c9_017008.png" as an example. The first part, "103", is the person identity (there are 152 identities in total, annotated from 000 to 151). The second part, "1", is the clothes counter, i.e., the first outfit of person "103". "c9" means the ninth camera (there are 12 cameras in total). The last part, "017008", is just the image counter; we do not use it during training or testing.
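The naming rule can be sketched as a small parser. This is only an illustration and not part of the released code: the names `LTCCName` and `parse_ltcc_name` are hypothetical, and the regular expression assumes that every file follows the four-part pattern of the example above.

```python
import re
from typing import NamedTuple


class LTCCName(NamedTuple):
    """Fields encoded in an LTCC image file name (hypothetical helper type)."""
    pid: int       # person identity, 000 to 151
    clothes: int   # clothes counter of that person
    camera: int    # camera index, taken from the "c<number>" part
    counter: str   # image counter, unused for training and testing


# Assumed pattern: <identity>_<clothes>_c<camera>_<counter>.png
_NAME_PATTERN = re.compile(r"^(\d+)_(\d+)_c(\d+)_(\d+)\.png$")


def parse_ltcc_name(filename: str) -> LTCCName:
    """Split a file name such as "103_1_c9_017008.png" into its four parts."""
    match = _NAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    pid, clothes, camera, counter = match.groups()
    return LTCCName(int(pid), int(clothes), int(camera), counter)


info = parse_ltcc_name("103_1_c9_017008.png")
print(info.pid, info.clothes, info.camera)
```

Since the clothes counter restarts for every person, an outfit is identified by the pair (identity, clothes counter) rather than by the counter alone.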

Evaluation Protocol

During inference, we load all gallery images and use a remove (junk) index to filter out the images that violate the evaluation protocol. The two settings used in the paper are explained below. Please also refer to the source code of LTCC or FIRe2.
(1) "Standard Setting": Gallery images with the same camera ID as the query person are filtered.

(2) "Cloth-changing Setting": Gallery images with the same camera ID and the same clothes ID as the query person are filtered.
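The two settings can be sketched as a boolean mask over the gallery. This is a hedged illustration rather than the released evaluation code, and it rests on two assumptions: that the filter applies only to gallery images of the query identity (as in Market-1501-style evaluation), and that the cloth-changing rule removes images sharing either the camera ID or the clothes ID with the query. The function name `keep_mask` and its arguments are hypothetical; the LTCC or FIRe2 source code remains the reference.

```python
import numpy as np


def keep_mask(q_pid, q_cam, q_cloth, g_pids, g_cams, g_cloths, setting="standard"):
    """Return a boolean array marking the gallery images kept for one query."""
    g_pids = np.asarray(g_pids)
    g_cams = np.asarray(g_cams)
    g_cloths = np.asarray(g_cloths)

    same_pid = g_pids == q_pid   # assumption: only the query identity is filtered
    same_cam = g_cams == q_cam

    if setting == "standard":
        # Drop images of the query person taken by the query camera.
        remove = same_pid & same_cam
    elif setting == "cloth-changing":
        # Assumed reading: also drop images of the query person in the same outfit.
        same_cloth = g_cloths == q_cloth
        remove = same_pid & (same_cam | same_cloth)
    else:
        raise ValueError(f"unknown setting: {setting!r}")
    return ~remove


# Toy gallery: three images of person 103 and one image of person 7.
g_pids, g_cams, g_cloths = [103, 103, 103, 7], [9, 2, 2, 9], [1, 1, 2, 1]
standard = keep_mask(103, 9, 1, g_pids, g_cams, g_cloths, "standard")
changing = keep_mask(103, 9, 1, g_pids, g_cams, g_cloths, "cloth-changing")
print(standard.tolist(), changing.tolist())
```

In the toy gallery, the standard setting drops only the first image, while the cloth-changing setting also drops the second one, so the remaining true match shows person 103 in a different outfit.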

Attention: please do not directly use the TXT files in the "info" folder to filter gallery images unless you can be sure to get the same filtering results as above. Those TXT files are provided to reproduce the experiments (marked with $\dag$) in Table 1.

With cloth changes now commonplace in LTCC Re-ID, existing Re-ID models are expected to struggle because they assume that clothing appearance is consistent and rely on clothing features to distinguish people from each other. Our key idea is to remove the cloth-appearance-related information completely and focus only on body shape information that is insensitive to view and pose changes. To this end, we introduce a Shape Embedding (SE) module to help shape feature extraction and a Cloth-Elimination Shape-Distillation (CESD) module to eliminate cloth-related information.

Illustration of our framework and the details of the Cloth-Elimination Shape-Distillation (CESD) module. Here, we introduce a Shape Embedding (SE) module to extract structural features from human keypoints, followed by learning identity-sensitive and cloth-insensitive representations using the CESD module.
