Sumin Lee, Sangmin Woo, Yeonju Park, Muhammad Adi Nugroho, and Changick Kim, "Modality Mixer for Multi-modal Action Recognition," in Proc. of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023.

In multi-modal action recognition, it is important to consider not only the complementary nature of different modalities but also global action content. In this paper, we propose a novel network, named Modality Mixer (M-Mixer) network, to leverage complementary information across modalities and temporal context of an action for multi-modal action recognition. We also introduce a simple yet effective recurrent unit, called Multi-modal Contextualization Unit (MCU), which is a core component of M-Mixer. Our MCU temporally encodes a sequence of one modality (e.g., RGB) with action content features of other modalities (e.g., depth, IR). This process encourages M-Mixer to exploit global action content and also to supplement complementary information of other modalities. As a result, our proposed method outperforms state-of-the-art methods on NTU RGB+D 60, NTU RGB+D 120, and NW-UCLA datasets. Moreover, we demonstrate the effectiveness of M-Mixer by conducting comprehensive ablation studies.
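To make the idea of encoding one modality's sequence with another modality's action content more concrete, here is a minimal sketch in plain Python. It is not the MCU formulation from the paper: the GRU-style gating, the use of temporal mean pooling as the "action content" summary, and every name in it (`mean_pool`, `ContextGatedCell`, `rgb_seq`, `depth_seq`) are assumptions chosen for illustration, with random untrained weights and toy feature sizes.

```python
import math
import random


def mean_pool(seq):
    """Average feature vectors over time (an ASSUMED stand-in for action content features)."""
    num_steps = len(seq)
    return [sum(frame[i] for frame in seq) / num_steps for i in range(len(seq[0]))]


def matvec(weights, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_matrix(rows, cols, rng):
    """Small random weights; a real model would learn these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class ContextGatedCell:
    """Hypothetical GRU-style recurrent cell conditioned on a context vector.

    This is an illustrative assumption, NOT the paper's MCU equations.
    """

    def __init__(self, feat_dim, ctx_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        in_dim = feat_dim + ctx_dim + hidden_dim
        self.hidden_dim = hidden_dim
        self.w_gate = init_matrix(hidden_dim, in_dim, rng)
        self.w_cand = init_matrix(hidden_dim, in_dim, rng)

    def step(self, x_t, context, h_prev):
        # Concatenate the current frame, the cross-modal context, and the previous state.
        joint = x_t + context + h_prev
        gate = [sigmoid(z) for z in matvec(self.w_gate, joint)]
        cand = [math.tanh(z) for z in matvec(self.w_cand, joint)]
        # Gated blend of the candidate state and the previous state.
        return [g * c + (1.0 - g) * h for g, c, h in zip(gate, cand, h_prev)]

    def encode(self, seq, context):
        """Run the cell over the whole sequence and return the final hidden state."""
        h = [0.0] * self.hidden_dim
        for x_t in seq:
            h = self.step(x_t, context, h)
        return h


# Toy per-frame features for two modalities (random placeholders, not real data).
rng = random.Random(1)
num_frames, rgb_dim, depth_dim, hidden_dim = 4, 3, 2, 5
rgb_seq = [[rng.random() for _ in range(rgb_dim)] for _ in range(num_frames)]
depth_seq = [[rng.random() for _ in range(depth_dim)] for _ in range(num_frames)]

# Summarize the depth stream, then use it to contextualize the RGB stream.
depth_context = mean_pool(depth_seq)
cell = ContextGatedCell(rgb_dim, depth_dim, hidden_dim)
rgb_encoding = cell.encode(rgb_seq, depth_context)
```

In this sketch the RGB sequence is processed frame by frame while the depth summary stays fixed across time steps, so every update sees a global view of the other modality. A symmetric pass (depth frames with an RGB summary) would follow the same pattern with a second cell.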
