Visual-Language Alignment for Background Subtraction

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Background Subtraction (BGS) is a fundamental task in video analysis, critical for many application scenarios. Despite the development of various methods for identifying moving objects, current techniques fall short when faced with the intricate challenges of real-world settings. Two persistent challenges are dynamic backgrounds, where the environmental backdrop is constantly changing, and camera jitter, which introduces erratic movements into the scene. In the field of computer vision, we introduce for the first time a vision-language model designed for BGS tasks, integrating linguistic and visual information to enhance the understanding and interpretation of complex scenes in the context of background subtraction. Our model has been tested across three categories of the CDNet-2014 dataset, and the results show an average F-measure of 0.9771. This investigation offers a new perspective and a novel solution for BGS, particularly in complex video scenarios.
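
For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how an F-measure is commonly computed for a predicted foreground mask against a ground-truth mask, using the standard definition F = 2PR / (P + R), where P is precision and R is recall. It is not taken from the paper: the function name `f_measure`, the nested-list mask format, and the choice to return 0.0 when there are no true positives are assumptions made for illustration, and the paper's exact evaluation protocol may differ.

```python
def f_measure(pred, truth):
    """F-measure between two binary masks (hypothetical helper, not from the paper).

    Both masks are equal-shaped nested lists of 0/1 values, where 1 marks
    a foreground (moving-object) pixel.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # foreground correctly detected
            elif p and not t:
                fp += 1          # background wrongly marked as foreground
            elif t and not p:
                fn += 1          # foreground pixel that was missed
    if tp == 0:
        # Assumption: treat the undefined case (no true positives) as 0.0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy masks for illustration only; these are not results from the paper.
predicted = [[1, 1, 0],
             [0, 1, 0]]
ground_truth = [[1, 0, 0],
                [0, 1, 1]]
score = f_measure(predicted, ground_truth)
```

With these toy masks there are two true positives, one false positive, and one false negative, so precision and recall are both 2/3 and the F-measure is 2/3. An average such as the one reported in the abstract would then be a mean of scores over the evaluated categories, though the exact averaging procedure is not stated here.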

Original language: British English
Title of host publication: 2024 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350379815
DOIs
State: Published - 2024
Event: 2024 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2024 - Niagara Falls, Canada
Duration: 15 Jul 2024 to 19 Jul 2024

Publication series

Name: 2024 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2024

Conference

Conference: 2024 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2024
Country/Territory: Canada
City: Niagara Falls
Period: 15/07/24 to 19/07/24

Keywords

  • Background Subtraction
  • Deep Learning
  • Vision-Language Model
