Bimodal SegNet: Fused instance segmentation using events and RGB frames

Sanket Kachole, Xiaoqian Huang, Fariborz Baghaei Naeini, Rajkumar Muthusamy, Dimitrios Makris, Yahya Zweiri

    Research output: Contribution to journal › Article › peer-review


    Abstract

    Object segmentation enhances robotic grasping by aiding object identification. Complex environments and dynamic conditions pose challenges such as occlusion, low-light conditions, motion blur and object size variance. To address these challenges, we propose Bimodal SegNet, a network that fuses two types of visual signal: event-based data and RGB frame data. The proposed architecture has two distinct encoders, one for the RGB signal input and another for the event signal input, in addition to an Atrous Pyramidal Feature Amplification module. The encoders capture and fuse rich contextual information across resolutions via a Cross-Domain Contextual Attention layer, while the decoder recovers sharp object boundaries. The proposed method is evaluated under five image-degradation challenges, occlusion, blur, brightness, trajectory and scale variance, on the Event-based Segmentation Dataset (ESD). The results show a 4%–6% improvement over state-of-the-art methods in mean intersection over union (mIoU) and pixel accuracy. The source code, dataset and model are publicly available at: https://github.com/sanket0707/Bimodal-SegNet.
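
    To illustrate the dual-encoder fusion the abstract describes, below is a minimal PyTorch-style sketch of a bimodal segmentation network: one encoder branch for RGB frames, one for event data, fused by a cross-attention layer and followed by a small decoder. All names (BimodalSegNetSketch, CrossDomainAttention, the 2-channel event representation, the class count) are illustrative assumptions, not the authors' implementation; the sketch also omits the Atrous Pyramidal Feature Amplification module and multi-resolution skip connections. See the linked repository for the actual model.

        # Hypothetical sketch of a bimodal encoder-fusion segmenter; not the
        # authors' code. Class and parameter names are illustrative only.
        import torch
        import torch.nn as nn

        class ConvBlock(nn.Module):
            """Two 3x3 convs + downsample, standing in for one encoder stage."""
            def __init__(self, in_ch, out_ch):
                super().__init__()
                self.body = nn.Sequential(
                    nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
                    nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
                    nn.MaxPool2d(2),
                )
            def forward(self, x):
                return self.body(x)

        class CrossDomainAttention(nn.Module):
            """Cross-attention where RGB features query event features; a
            stand-in for the paper's Cross-Domain Contextual Attention layer."""
            def __init__(self, ch, heads=4):
                super().__init__()
                self.attn = nn.MultiheadAttention(ch, heads, batch_first=True)
                self.norm = nn.LayerNorm(ch)
            def forward(self, rgb, evt):
                b, c, h, w = rgb.shape
                q = rgb.flatten(2).transpose(1, 2)   # (B, HW, C) queries from RGB
                kv = evt.flatten(2).transpose(1, 2)  # keys/values from events
                fused, _ = self.attn(q, kv, kv)
                fused = self.norm(fused + q)         # residual + layer norm
                return fused.transpose(1, 2).reshape(b, c, h, w)

        class BimodalSegNetSketch(nn.Module):
            def __init__(self, n_classes, ch=64, event_ch=2):
                super().__init__()
                self.rgb_enc = ConvBlock(3, ch)
                self.evt_enc = ConvBlock(event_ch, ch)  # e.g. 2 polarity channels
                self.fuse = CrossDomainAttention(ch)
                self.decoder = nn.Sequential(           # minimal upsampling decoder
                    nn.ConvTranspose2d(ch, ch, 2, stride=2), nn.ReLU(inplace=True),
                    nn.Conv2d(ch, n_classes, 1),
                )
            def forward(self, rgb, events):
                f = self.fuse(self.rgb_enc(rgb), self.evt_enc(events))
                return self.decoder(f)

        if __name__ == "__main__":
            model = BimodalSegNetSketch(n_classes=11)   # class count is hypothetical
            rgb = torch.randn(1, 3, 64, 64)
            events = torch.randn(1, 2, 64, 64)  # events rasterized to a 2-channel frame
            print(model(rgb, events).shape)     # torch.Size([1, 11, 64, 64])

    One design note: using RGB features as queries and event features as keys/values lets the frame branch pull in motion cues where RGB is degraded (blur, low light); a symmetric event-to-RGB attention pass could be added the same way.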

    Original language: British English
    Article number: 110215
    Journal: Pattern Recognition
    Volume: 149
    State: Published - May 2024

    Keywords

    • Cross attention
    • Deep learning
    • Event vision
    • Grasping
    • Robotics

