Summary on the Multimodal Information Based Speech Processing (MISP) 2022 Challenge

  • Hang Chen
  • Shilong Wu
  • Yusheng Dai
  • Zhe Wang
  • Jun Du
  • Chin Hui Lee
  • Jingdong Chen
  • Shinji Watanabe
  • Sabato Marco Siniscalchi
  • Odette Scharenborg
  • Di Yuan Liu
  • Bao Cai Yin
  • Jia Pan
  • Jian Qing Gao
  • Cong Liu
  • University of Science and Technology of China
  • Northwestern Polytechnical University Xian
  • Delft University of Technology
  • Carnegie Mellon University
  • Georgia Institute of Technology
  • IFLYTEK Co., Ltd.
  • Kore University of Enna

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

2 Citations (Scopus)

Abstract

The Multimodal Information based Speech Processing (MISP) 2022 challenge aimed to enhance speech processing performance in harsh acoustic environments by leveraging additional modalities such as video or text. The challenge included two tracks: audio-visual speaker diarization (AVSD) and audio-visual diarization and recognition (AVDR). The training material was based on previous MISP 2021 recordings, but we have accurately synchronized audio and visual data. Additionally, a new evaluation set was provided. This paper gives an overview of the challenge setup, presents the results, and summarizes the effective techniques employed by the participants. We also analyze the current technical challenges and suggest directions for future research in AVSD and AVDR.

Original language: English
Title of host publication: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728163277
DOI
Publication status: Published - 2023
Externally published
Event: 48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023 - Rhodes Island, Greece
Duration: 4 Jun 2023 to 10 Jun 2023

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Conference

Conference: 48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023
Country/Territory: Greece
City: Rhodes Island
Period: 4/06/23 to 10/06/23
