Phonetic-enriched text representation for Chinese sentiment analysis with reinforcement learning

Haiyun Peng; Yukun Ma; Soujanya Poria; Yang Li; Erik Cambria

doi:10.1016/j.inffus.2021.01.005

School of Automation

Research output: Contribution to journal › Article › Peer-reviewed

47 citations (Scopus)

Abstract

The Chinese pronunciation system offers two characteristics that distinguish it from other languages: deep phonemic orthography and intonation variations. In this paper, we hypothesize that these two important properties can play a major role in Chinese sentiment analysis. In particular, we propose two effective features to encode phonetic information and, hence, fuse it with textual information. With this hypothesis, we propose Disambiguate Intonation for Sentiment Analysis (DISA), a network that we develop based on the principles of reinforcement learning. DISA disambiguates intonations for each Chinese character (pinyin) and, hence, learns precise phonetic representations. We also fuse phonetic features with textual and visual features to further improve performance. Experimental results on five different Chinese sentiment analysis datasets show that the inclusion of phonetic features significantly and consistently improves the performance of textual and visual representations and surpasses the state-of-the-art Chinese character-level representations.

Original language	English
Pages (from-to)	88-99
Number of pages	12
Journal	Information Fusion
Volume	70
DOI	https://doi.org/10.1016/j.inffus.2021.01.005
Publication status	Published - Jun 2021

Cite this

@article{aaeae018f4504385869b2ccf468843df,

title = "Phonetic-enriched text representation for Chinese sentiment analysis with reinforcement learning",

abstract = "The Chinese pronunciation system offers two characteristics that distinguish it from other languages: deep phonemic orthography and intonation variations. In this paper, we hypothesize that these two important properties can play a major role in Chinese sentiment analysis. In particular, we propose two effective features to encode phonetic information and, hence, fuse it with textual information. With this hypothesis, we propose Disambiguate Intonation for Sentiment Analysis (DISA), a network that we develop based on the principles of reinforcement learning. DISA disambiguates intonations for each Chinese character (pinyin) and, hence, learns precise phonetic representations. We also fuse phonetic features with textual and visual features to further improve performance. Experimental results on five different Chinese sentiment analysis datasets show that the inclusion of phonetic features significantly and consistently improves the performance of textual and visual representations and surpasses the state-of-the-art Chinese character-level representations.",

keywords = "Chinese phonetics, Deep phonemic orthography, Multilingual sentiment analysis, Sentiment analysis",

author = "Haiyun Peng and Yukun Ma and Soujanya Poria and Yang Li and Erik Cambria",

note = "Publisher Copyright: {\textcopyright} 2021",

year = "2021",

month = jun,

doi = "10.1016/j.inffus.2021.01.005",

language = "English",

volume = "70",

pages = "88--99",

journal = "Information Fusion",

issn = "1566-2535",

publisher = "Elsevier B.V.",

}

TY - JOUR

T1 - Phonetic-enriched text representation for Chinese sentiment analysis with reinforcement learning

AU - Peng, Haiyun

AU - Ma, Yukun

AU - Poria, Soujanya

AU - Li, Yang

AU - Cambria, Erik

PY - 2021/6

Y1 - 2021/6

N2 - The Chinese pronunciation system offers two characteristics that distinguish it from other languages: deep phonemic orthography and intonation variations. In this paper, we hypothesize that these two important properties can play a major role in Chinese sentiment analysis. In particular, we propose two effective features to encode phonetic information and, hence, fuse it with textual information. With this hypothesis, we propose Disambiguate Intonation for Sentiment Analysis (DISA), a network that we develop based on the principles of reinforcement learning. DISA disambiguates intonations for each Chinese character (pinyin) and, hence, learns precise phonetic representations. We also fuse phonetic features with textual and visual features to further improve performance. Experimental results on five different Chinese sentiment analysis datasets show that the inclusion of phonetic features significantly and consistently improves the performance of textual and visual representations and surpasses the state-of-the-art Chinese character-level representations.

KW - Chinese phonetics

KW - Deep phonemic orthography

KW - Multilingual sentiment analysis

KW - Sentiment analysis

UR - http://www.scopus.com/inward/record.url?scp=85099510729&partnerID=8YFLogxK

U2 - 10.1016/j.inffus.2021.01.005

DO - 10.1016/j.inffus.2021.01.005

M3 - Article

AN - SCOPUS:85099510729

SN - 1566-2535

VL - 70

SP - 88

EP - 99

JO - Information Fusion

JF - Information Fusion

ER -
