Automatic Construction and Global Optimization of a Multisentiment Lexicon

Xiaoping Yang; Zhongxia Zhang; Zhongqiu Zhang; Yuting Mo; Lianbei Li; Li Yu; Peican Zhu

doi:10.1155/2016/2093406

Institute of Optoelectronics and Intelligence

Research output: Contribution to journal › Article › peer-review

8 Citations (Scopus)

Abstract

Manual annotation of sentiment lexicons costs too much labor and time, and it is also difficult to get accurate quantification of emotional intensity. Besides, the excessive emphasis on one specific field has greatly limited the applicability of domain sentiment lexicons (Wang et al., 2010). This paper implements statistical training for large-scale Chinese corpus through neural network language model and proposes an automatic method of constructing a multidimensional sentiment lexicon based on constraints of coordinate offset. In order to distinguish the sentiment polarities of those words which may express either positive or negative meanings in different contexts, we further present a sentiment disambiguation algorithm to increase the flexibility of our lexicon. Lastly, we present a global optimization framework that provides a unified way to combine several human-annotated resources for learning our 10-dimensional sentiment lexicon SentiRuc. Experiments show the superior performance of SentiRuc lexicon in category labeling test, intensity labeling test, and sentiment classification tasks. It is worth mentioning that, in intensity label test, SentiRuc outperforms the second place by 21 percent.

Original language	English
Article number	2093406
Journal	Computational Intelligence and Neuroscience
Volume	2016
DOI	https://doi.org/10.1155/2016/2093406
Publication status	Published - 2016

Cite this

@article{55de093ac18f42ba9b6817e30b73fa87,

title = "Automatic Construction and Global Optimization of a Multisentiment Lexicon",

abstract = "Manual annotation of sentiment lexicons costs too much labor and time, and it is also difficult to get accurate quantification of emotional intensity. Besides, the excessive emphasis on one specific field has greatly limited the applicability of domain sentiment lexicons (Wang et al., 2010). This paper implements statistical training for large-scale Chinese corpus through neural network language model and proposes an automatic method of constructing a multidimensional sentiment lexicon based on constraints of coordinate offset. In order to distinguish the sentiment polarities of those words which may express either positive or negative meanings in different contexts, we further present a sentiment disambiguation algorithm to increase the flexibility of our lexicon. Lastly, we present a global optimization framework that provides a unified way to combine several human-annotated resources for learning our 10-dimensional sentiment lexicon SentiRuc. Experiments show the superior performance of SentiRuc lexicon in category labeling test, intensity labeling test, and sentiment classification tasks. It is worth mentioning that, in intensity label test, SentiRuc outperforms the second place by 21 percent.",

author = "Xiaoping Yang and Zhongxia Zhang and Zhongqiu Zhang and Yuting Mo and Lianbei Li and Li Yu and Peican Zhu",

note = "Publisher Copyright: {\textcopyright} 2016 Xiaoping Yang et al.",

year = "2016",

doi = "10.1155/2016/2093406",

language = "English",

volume = "2016",

journal = "Computational Intelligence and Neuroscience",

issn = "1687-5265",

publisher = "Hindawi Publishing Corporation",

}

TY - JOUR

T1 - Automatic Construction and Global Optimization of a Multisentiment Lexicon

AU - Yang, Xiaoping

AU - Zhang, Zhongxia

AU - Zhang, Zhongqiu

AU - Mo, Yuting

AU - Li, Lianbei

AU - Yu, Li

AU - Zhu, Peican

PY - 2016

Y1 - 2016

N2 - Manual annotation of sentiment lexicons costs too much labor and time, and it is also difficult to get accurate quantification of emotional intensity. Besides, the excessive emphasis on one specific field has greatly limited the applicability of domain sentiment lexicons (Wang et al., 2010). This paper implements statistical training for large-scale Chinese corpus through neural network language model and proposes an automatic method of constructing a multidimensional sentiment lexicon based on constraints of coordinate offset. In order to distinguish the sentiment polarities of those words which may express either positive or negative meanings in different contexts, we further present a sentiment disambiguation algorithm to increase the flexibility of our lexicon. Lastly, we present a global optimization framework that provides a unified way to combine several human-annotated resources for learning our 10-dimensional sentiment lexicon SentiRuc. Experiments show the superior performance of SentiRuc lexicon in category labeling test, intensity labeling test, and sentiment classification tasks. It is worth mentioning that, in intensity label test, SentiRuc outperforms the second place by 21 percent.

AB - Manual annotation of sentiment lexicons costs too much labor and time, and it is also difficult to get accurate quantification of emotional intensity. Besides, the excessive emphasis on one specific field has greatly limited the applicability of domain sentiment lexicons (Wang et al., 2010). This paper implements statistical training for large-scale Chinese corpus through neural network language model and proposes an automatic method of constructing a multidimensional sentiment lexicon based on constraints of coordinate offset. In order to distinguish the sentiment polarities of those words which may express either positive or negative meanings in different contexts, we further present a sentiment disambiguation algorithm to increase the flexibility of our lexicon. Lastly, we present a global optimization framework that provides a unified way to combine several human-annotated resources for learning our 10-dimensional sentiment lexicon SentiRuc. Experiments show the superior performance of SentiRuc lexicon in category labeling test, intensity labeling test, and sentiment classification tasks. It is worth mentioning that, in intensity label test, SentiRuc outperforms the second place by 21 percent.

UR - http://www.scopus.com/inward/record.url?scp=85006084802&partnerID=8YFLogxK

U2 - 10.1155/2016/2093406

DO - 10.1155/2016/2093406

M3 - Article

C2 - 28042290

AN - SCOPUS:85006084802

SN - 1687-5265

VL - 2016

JO - Computational Intelligence and Neuroscience

JF - Computational Intelligence and Neuroscience

M1 - 2093406

ER -
