TY - GEN
T1 - Inferring human interactions in meetings
T2 - 6th International Conference on Ubiquitous Intelligence and Computing, UIC 2009
AU - Yu, Zhiwen
AU - Yu, Zhiyong
AU - Ko, Yusa
AU - Zhou, Xingshe
AU - Nakamura, Yuichi
PY - 2009
Y1 - 2009
AB - Social dynamics, such as human interactions, are important for understanding how a conclusion was reached in a meeting and for determining whether the meeting was well organized. In this paper, a multimodal approach is proposed to infer human semantic interactions in meeting discussions. Human interactions, such as proposing an idea, giving comments, and expressing a positive opinion, imply a user's role, attitude, or intention toward a topic. Our approach infers human interactions from a variety of audiovisual and high-level features, e.g., gestures, attention, speech tone, speaking time, interaction occasion, and information about the previous interaction. Four different inference models, including Support Vector Machine (SVM), Bayesian Net, Naïve Bayes, and Decision Tree, are selected and compared for human interaction recognition. Our experimental results show that SVM outperforms the other inference models, that human interactions can be inferred with a recognition rate of around 80%, and that the multimodal approach achieves robust and reliable results by leveraging the characteristics of each modality.
KW - Human interaction
KW - Multimodal recognition
KW - Smart meeting
UR - http://www.scopus.com/inward/record.url?scp=70350680121&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-02830-4_3
DO - 10.1007/978-3-642-02830-4_3
M3 - Conference contribution
AN - SCOPUS:70350680121
SN - 3642028292
SN - 9783642028298
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 14
EP - 24
BT - Ubiquitous Intelligence and Computing - 6th International Conference, UIC 2009, Proceedings
Y2 - 7 July 2009 through 9 July 2009
ER -