TY - GEN
T1 - Integrating Vision-Language Supervision for Uniform Appearance Tracking
AU - Alansari, Mohamad Yousif Abdulkareem
AU - Abughali, Ahmed Mousa
AU - Habash, Obadah
AU - Alnuaimi, Khaled
AU - Javed, Sajid
AU - Werghi, Naoufel
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Integrating detailed Natural Language (NL) descriptions with modern tracking technologies is a significant and emerging direction in Uniform Appearance (UA) crowd-tracking research, with substantial potential for future developments. A prominent challenge in this area is the lack of NL descriptions tailored for UA crowd-tracking datasets; the existing Drone-Person Tracking in Uniform Appearance Crowd (D-PTUAC) dataset lacks essential textual annotations. Our study bridges this gap by introducing comprehensive natural language descriptions for the D-PTUAC dataset, which is specifically designed for drone-based UA crowd tracking. This enhancement provides a richer understanding of the dataset and facilitates more effective use in research and applications related to drone-based crowd tracking. The descriptions are meticulously designed to include extensive information about the target entities, significantly augmenting the dataset's depth and applicability. Evaluations using the latest state-of-the-art (SOTA) NL-based tracking algorithms demonstrated competitive tracking performance compared with SOTA visual trackers benchmarked on the D-PTUAC dataset. This outcome highlights the critical role and efficacy of integrated language descriptions in enhancing UA crowd-tracking methodologies.
AB - Integrating detailed Natural Language (NL) descriptions with modern tracking technologies is a significant and emerging direction in Uniform Appearance (UA) crowd-tracking research, with substantial potential for future developments. A prominent challenge in this area is the lack of NL descriptions tailored for UA crowd-tracking datasets; the existing Drone-Person Tracking in Uniform Appearance Crowd (D-PTUAC) dataset lacks essential textual annotations. Our study bridges this gap by introducing comprehensive natural language descriptions for the D-PTUAC dataset, which is specifically designed for drone-based UA crowd tracking. This enhancement provides a richer understanding of the dataset and facilitates more effective use in research and applications related to drone-based crowd tracking. The descriptions are meticulously designed to include extensive information about the target entities, significantly augmenting the dataset's depth and applicability. Evaluations using the latest state-of-the-art (SOTA) NL-based tracking algorithms demonstrated competitive tracking performance compared with SOTA visual trackers benchmarked on the D-PTUAC dataset. This outcome highlights the critical role and efficacy of integrated language descriptions in enhancing UA crowd-tracking methodologies.
KW - Drone-Person Tracking in Uniform Appearance Crowd
KW - Natural Language Processing (NLP)
KW - Visual Object Tracking
UR - https://www.scopus.com/pages/publications/85216861981
U2 - 10.1109/ICIP51287.2024.10648244
DO - 10.1109/ICIP51287.2024.10648244
M3 - Conference contribution
AN - SCOPUS:85216861981
T3 - Proceedings - International Conference on Image Processing, ICIP
SP - 747
EP - 752
BT - 2024 IEEE International Conference on Image Processing, ICIP 2024 - Proceedings
PB - IEEE Computer Society
T2 - 31st IEEE International Conference on Image Processing, ICIP 2024
Y2 - 27 October 2024 through 30 October 2024
ER -