18 April 2024
10:00 Doctoral defense, held fully remotely
Title
Self-supervised learning for fully unannotated person re-identification in real-world applications
Student
Gabriel Capiteli Bertocco
Advisor / Teacher
Anderson de Rezende Rocha - Co-supervisor: Fernanda Alcântara Andaló
Brief summary
One of the most complex problems in Machine Learning is dealing with unlabeled data. Most high-performance models rely on massive amounts of labeled data to achieve their best results. However, labeling is neither easy nor reliable: it is a highly time-consuming, costly, and error-prone task. Furthermore, biases in the labeled data can propagate to the model, impairing its performance and generalization. It is therefore essential to develop methods that can find useful patterns in completely unsupervised scenarios, allowing rapid deployment while being less prone to bias. Such models can be used in various applications, such as forensic investigations, biometrics, and event understanding.
This research proposes self-supervised learning algorithms to deal with unlabeled data in challenging scenarios. A challenging scenario may exhibit high intra-class disparity (representations of the same class are far from each other in the vector space) and high inter-class similarity (samples from different classes may lie close to each other). To study settings that exhibit these challenges, our exploration focuses on two applications: Unsupervised Re-Identification (ReID) of People and Objects, due to its applicability in event understanding, and Text Authorship Attribution.
Considering these applications, in this thesis we propose four methods that deal with varying levels of complexity in unsupervised scenarios. Our first three solutions target the Unsupervised People ReID task, where we assume that we do not have identity annotations, i.e., we do not know "who" was detected in the image. The first solution uses meta-information, such as camera annotations, to help solve the task. As there are scenarios where camera information is not available, our second solution is completely unsupervised, i.e., it does not require any additional information. Thus, it can be applied to other tasks in different modalities, such as Text Authorship Attribution for posts on social networks. The third method also handles unsupervised re-identification, but on large-scale datasets. We also show that it can be extended to re-identify objects, such as vehicles. The fourth solution addresses the long-range recognition problem through supervised training. The model learns from images distorted by atmospheric turbulence and achieves state-of-the-art results in both People ReID and Face Recognition tasks.
The solutions proposed in this research can be integrated into forensic and biometric application pipelines. They can be used for event understanding, in which authorities aim to find suspects and investigate people's behavior, as well as their relationships with objects in a scene. They can also help investigators understand what occurred and propose avenues of investigation.
Examination Board
Full members:
Anderson de Rezende Rocha IC / UNICAMP
Esther Luna Colombini IC / UNICAMP
Sébastien Marcel IDIAP/Switzerland
Vitomir Štruc Uni-Lj/Slovenia
Patrick Flynn ND/USA
Alternates:
Hélio Pedrini IC / UNICAMP
Jacques Wainer IC / UNICAMP
João Paulo Papa FC / UNESP
William Robson Schwartz DCC / UFMG