Kiran Ramnath, Leda Sari, Mark Hasegawa-johnson and Chang Yoo, " Worldly Wise (WoW)-Cross-Lingual Knowledge Fusion for Fact-based Visual Spoken-Question Answering", Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2021.

Although Question-Answering has long been of research interest, its accessibility to users through a speech interface and its support to multiple languages have not been addressed in prior studies. Towards these ends, we present a new task and a synthetically-generated dataset to do Fact-based Visual Spoken-Question Answering (FVSQA). FVSQA is based on the FVQA dataset, which requires a system to retrieve an entity from Knowledge Graphs (KGs) to answer a question about an image. In FVSQA, the question is spoken rather than typed. Three sub-tasks are proposed: (1) speech-to-text based, (2) end-to-end, without speech-to-text as an intermediate component, and (3) cross-lingual, in which the question is spoken in a language different from that in which the KG is recorded. The end-to-end and cross-lingual tasks are the first to require world knowledge from a multi-relational KG as a differentiable layer in an end-to-end spoken language understanding task, hence the proposed reference implementation is called WorldlyWise (WoW). WoW is shown to perform endto-end cross-lingual FVSQA at same levels of accuracy across 3 languages – English, Hindi, and Turkish.

AI in EE

AI in Signal Division

AI in Computer Division

AI in Communication Division

AI in Signal Division

AI in Wave Division

AI in Circuit Division

AI in Device Division

Kiran Ramnath, Leda Sari, Mark Hasegawa-johnson and Chang Yoo, ” Worldly Wise (WoW)-Cross-Lingual Knowledge Fusion for Fact-based Visual Spoken-Question Answering”, Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2021.

학부 소개

연구

EE-X

AI in EE

구성원

교육

입학

소식

기부

학부 소개

연구

EE-X

AI in EE

구성원

교육

대외협력

입학

소식