
A Survey of Sentiment Analysis Research on Chinese Weibo (Part 2)

电脑杂谈 | Published: 2017-03-04 23:59:42 | Source: compiled from the web

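The main goal of sentiment analysis is to identify a user's opinions and attitudes toward things or people (attitudes: enduring, affectively colored beliefs, dispositions towards objects or persons). The elements involved are: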

Holder (source) of attitude: the opinion holder

Target (aspect) of attitude: the object being evaluated

Type of attitude: the opinion expressed

From a set of types: like, love, hate, value, desire, etc.

Or (more commonly) weighted polarity: positive, negative, neutral, together with strength

Text containing the attitude: the evaluated text, usually a sentence or an entire document

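Finer-grained, deeper analysis also covers the evaluated attributes, sentiment (polarity) words, evaluation collocations, and so on.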
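
The elements listed above can be pictured as a single record. The following is only an illustrative sketch: the class name `Attitude`, its field names, and the example values are assumptions made here, not something defined in the source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Attitude:
    """One opinion extracted from a text (hypothetical structure)."""
    holder: Optional[str]  # holder (source) of the attitude, if known
    target: str            # target (aspect) being evaluated
    polarity: str          # "positive", "negative", or "neutral"
    strength: float        # weight of the polarity; the 0..1 scale is an assumption
    text: str              # sentence or document containing the attitude


# Made-up example values, for illustration only.
example = Attitude(
    holder="reviewer",
    target="movie",
    polarity="negative",
    strength=0.8,
    text="I didn't like this movie",
)
```

The simplest task discussed next only needs the `polarity` field; the other fields matter for the more advanced tasks.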

The sentiment analysis tasks we typically face fall into the following categories:

Simplest task: Is the attitude of this text positive or negative?

More complex: Rank the attitude of this text from 1 to 5

Advanced: Detect the target, source, or complex attitude types

The following sections use the simplest task as the running example.

2）A Baseline Algorithm

This subsection uses sentiment analysis of movie reviews as an example to present a simple, practical sentiment analysis system. For details, see: Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan. 2002. Thumbs up? Sentiment Classification using Machine Learning Techniques. EMNLP-2002, 79-86.

Bo Pang and Lillian Lee. 2004. A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts. ACL, 271-278.

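The task is "Polarity detection: Is an IMDB movie review positive or negative?", and the dataset is "Polarity Data 2.0". The authors treat sentiment analysis as a classification task and break it into the following subtasks: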

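Tokenization: extract the body text; filter out times, phone numbers, and the like; preserve strings that begin with a capital letter; preserve emoticons; and split the text into tokens.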
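
A rough sketch of such a tokenizer is shown below. The regular expressions are simplified assumptions chosen for illustration (US-style phone numbers, `hh:mm` times, a handful of Western emoticons); they are not the patterns used in the cited papers, and body-text extraction is left out.

```python
import re

# Hypothetical, simplified patterns; a real system would need broader coverage.
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")
TIME_RE = re.compile(r"\b\d{1,2}:\d{2}(?::\d{2})?\b")
EMOTICON = r"[:;=][-']?[()DPp]"
WORD = r"[A-Za-z]+(?:['’][a-z]+)?"  # keeps contractions such as "didn't" whole
PUNCT = r"[.,!?;]"
TOKEN_RE = re.compile(EMOTICON + "|" + WORD + "|" + PUNCT)


def tokenize(text):
    """Drop phone numbers and times, then split into tokens.

    Case is left untouched so capitalized strings survive, and emoticons
    are matched before punctuation so ":)" is kept as one token.
    """
    text = PHONE_RE.sub(" ", text)
    text = TIME_RE.sub(" ", text)
    return TOKEN_RE.findall(text)


tokens = tokenize("Call 555-123-4567 at 10:30, I LOVED it :)")
# tokens == ['Call', 'at', ',', 'I', 'LOVED', 'it', ':)']
```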

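Feature Extraction: intuitively, we might expect adjectives to directly determine a text's sentiment, but Pang and Lee's experiments show that using all words (unigrams) as features gives better sentiment classification results.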
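
As a minimal sketch of unigram features, every token of a review can be turned into one feature. The function name `unigram_features` and the `binary` switch between presence and count features are assumptions made here; the text above does not say which variant to use.

```python
from collections import Counter


def unigram_features(tokens, binary=True):
    """Map a token list to a {unigram: value} feature dictionary.

    binary=True records only whether a word occurs (value 1);
    binary=False records how many times it occurs.
    """
    counts = Counter(tokens)
    if binary:
        return {tok: 1 for tok in counts}
    return dict(counts)


features = unigram_features(["great", "great", "movie"])
# features == {'great': 1, 'movie': 1}
```

A dictionary like this can then be handed to whatever classifier is chosen for the polarity task.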

Negation needs special handling. Take the sentences "I didn't like this movie" and "I really like this movie": their unigrams differ by a single word, yet their meanings are opposite. To handle this case, Das and Chen (2001) proposed the rule "Add NOT_ to every word between negation and following punctuation". Under this rule, "didn't like this movie , but I" becomes "didn't NOT_like NOT_this NOT_movie , but I".
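
The NOT_ rule can be sketched in a few lines. The list of negation words and the set of punctuation marks that end a negated span are assumptions made for this illustration; the source only gives the rule itself and the one example.

```python
import re

# Assumed negation cues: a few common words plus any "...n't" contraction.
NEGATION_RE = re.compile(r"^(?:not|no|never|\w+n['’]t)$", re.IGNORECASE)
# Assumed scope-ending punctuation.
PUNCT_RE = re.compile(r"^[.,:;!?]$")


def mark_negation(tokens):
    """Prefix NOT_ to every token between a negation and the next punctuation."""
    out = []
    negated = False
    for tok in tokens:
        if PUNCT_RE.match(tok):
            negated = False          # punctuation closes the negated span
            out.append(tok)
        elif negated:
            out.append("NOT_" + tok)
        else:
            out.append(tok)
            if NEGATION_RE.match(tok):
                negated = True       # start marking from the next token
    return out


marked = mark_negation("didn't like this movie , but I".split())
# marked == ["didn't", 'NOT_like', 'NOT_this', 'NOT_movie', ',', 'but', 'I']
```

After this step, "like" and "NOT_like" are distinct unigrams, so a classifier can weight them differently.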

This article is from 电脑杂谈. When reposting, please cite its URL:
http://www.pc-fly.com/a/jisuanjixue/article-35803-2.html
