
A Survey of Sentiment Analysis on Chinese Weibo (Part 6)

电脑杂谈  Published: 2017-03-04 23:59:42  Source: compiled from the web

Step 1: Extract a phrasal lexicon from reviews. Phrases are extracted by rule, as shown in the following figure:

[Figure: phrases extracted by the rules]
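The rule-based extraction in Step 1 can be sketched as follows. This is a minimal illustration, not the article's own code: it assumes the input is already tagged with Penn Treebank POS tags, and the five two-word patterns are taken from Turney's 2002 paper as an assumption about what the rules look like. The names `PATTERNS` and `extract_phrases` are hypothetical.

```python
# Assumed two-word patterns over Penn Treebank tags (after Turney 2002).
# Each entry: (tags allowed for word 1, tags allowed for word 2,
#              tags the following third word must NOT have).
NOUN = {"NN", "NNS"}
ADV = {"RB", "RBR", "RBS"}
VERB = {"VB", "VBD", "VBN", "VBG"}
PATTERNS = [
    ({"JJ"}, NOUN, set()),   # adjective + noun
    (ADV, {"JJ"}, NOUN),     # adverb + adjective, not followed by a noun
    ({"JJ"}, {"JJ"}, NOUN),  # adjective + adjective, not followed by a noun
    (NOUN, {"JJ"}, NOUN),    # noun + adjective, not followed by a noun
    (ADV, VERB, set()),      # adverb + verb
]


def extract_phrases(tagged):
    """Return two-word phrases from a list of (word, tag) pairs."""
    phrases = []
    for i in range(len(tagged) - 1):
        (w1, t1), (w2, t2) = tagged[i], tagged[i + 1]
        # Tag of the third word, or None at the end of the input.
        t3 = tagged[i + 2][1] if i + 2 < len(tagged) else None
        for first, second, banned_third in PATTERNS:
            if t1 in first and t2 in second and t3 not in banned_third:
                phrases.append(f"{w1} {w2}")
                break
    return phrases


tagged_review = [
    ("the", "DT"), ("online", "JJ"), ("service", "NN"),
    ("is", "VBZ"), ("very", "RB"), ("handy", "JJ"), (".", "."),
]
print(extract_phrases(tagged_review))
```

In practice the tags would come from a POS tagger; the hand-tagged sentence above only keeps the sketch self-contained.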

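Step 2: Learn the polarity of each phrase. How should the polarity of a phrase be judged? Intuitively, “Positive phrases co-occur more with ‘excellent’, negative phrases co-occur more with ‘poor’”. The problem then becomes how to measure co-occurrence between terms. For this, researchers introduced pointwise mutual information (PMI), which is commonly used to measure how strongly two specific events are associated:

$$\mathrm{PMI}(x, y) = \log_2 \frac{P(x, y)}{P(x)\,P(y)}$$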

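The PMI of two terms is:

$$\mathrm{PMI}(word_1, word_2) = \log_2 \frac{P(word_1, word_2)}{P(word_1)\,P(word_2)}$$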

A common way to compute PMI(word1, word2) is to issue “word1”, “word2”, and “word1 NEAR word2” as queries and use the search engine's hit counts to estimate P(word) and P(word1, word2):

P(word) = hits(word)/N

P(word1, word2) = hits(word1 NEAR word2)/N^2

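It follows that:

$$\mathrm{PMI}(word_1, word_2) = \log_2 \frac{\mathrm{hits}(word_1\ \mathrm{NEAR}\ word_2)}{\mathrm{hits}(word_1)\,\mathrm{hits}(word_2)}$$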

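The polarity of a phrase is then computed as follows (“excellent” and “poor” can be replaced by other words of known polarity):

$$\mathrm{Polarity}(phrase) = \mathrm{PMI}(phrase, \text{excellent}) - \mathrm{PMI}(phrase, \text{poor}) = \log_2 \frac{\mathrm{hits}(phrase\ \mathrm{NEAR}\ \text{excellent})\,\mathrm{hits}(\text{poor})}{\mathrm{hits}(phrase\ \mathrm{NEAR}\ \text{poor})\,\mathrm{hits}(\text{excellent})}$$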
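As a rough sketch of the computation in Step 2, the functions below turn hit counts into a PMI and a phrase polarity. The hit counts are made-up numbers for illustration, not search engine results. The function names and the `smoothing` parameter are assumptions: a small constant added to the NEAR counts is one way to avoid dividing by zero when a phrase never co-occurs with a seed word.

```python
import math


def pmi_from_hits(hits_near, hits_w1, hits_w2):
    """PMI(word1, word2) = log2(hits(word1 NEAR word2) / (hits(word1) * hits(word2)))."""
    return math.log2(hits_near / (hits_w1 * hits_w2))


def polarity(near_excellent, near_poor, hits_excellent, hits_poor, smoothing=0.01):
    """Polarity(phrase) = PMI(phrase, excellent) - PMI(phrase, poor).

    hits(phrase) appears in both PMI terms and cancels, so it is not needed.
    `smoothing` (an assumption) keeps the ratio defined for zero NEAR counts.
    """
    numerator = (near_excellent + smoothing) * hits_poor
    denominator = (near_poor + smoothing) * hits_excellent
    return math.log2(numerator / denominator)


# Hypothetical counts: the phrase appears near "excellent" 20 times
# and near "poor" 5 times; both seed words have 1000 hits overall.
print(polarity(20, 5, hits_excellent=1000, hits_poor=1000))
```

A positive value means the phrase leans toward “excellent”, a negative value toward “poor”.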

On a dataset of 410 reviews from Epinions, of which 170 (41%) are negative and 240 (59%) are positive, the Turney algorithm reached 74% accuracy (the baseline of labeling every review positive gives 59%).

Step 3: Rate a review by the average polarity of its phrases
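Step 3 amounts to averaging the phrase polarities and thresholding the result. The sketch below assumes a threshold of 0 and returns no label for a review with no extracted phrases. Both choices, and the name `rate_review`, are illustrative assumptions.

```python
from statistics import mean


def rate_review(phrase_polarities, threshold=0.0):
    """Return (average polarity, label) for one review.

    The label is "positive" when the average exceeds `threshold`,
    "negative" otherwise, and None when no phrases were extracted.
    """
    if not phrase_polarities:
        return None, None
    average = mean(phrase_polarities)
    label = "positive" if average > threshold else "negative"
    return average, label


# Hypothetical polarities of the phrases found in one review.
print(rate_review([2.0, -0.5, 1.5]))
```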

3. Using WordNet to learn polarity: see S.M. Kim and E. Hovy. 2004. Determining the sentiment of opinions. COLING 2004, and M. Hu and B. Liu. Mining and summarizing customer reviews. In Proceedings of KDD, 2004. The method proceeds as follows:

Create positive (“good”) and negative seed-words (“terrible”)

Find Synonyms and Antonyms

Positive Set: Add synonyms of positive words (“well”) and antonyms of negative words

Negative Set: Add synonyms of negative words (“awful”) and antonyms of positive words (“evil”)

Repeat, following chains of synonyms

Filter
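The steps above can be sketched as a small expansion loop. To stay self-contained, the example uses tiny hand-written synonym and antonym tables in place of WordNet. The tables, the function name `expand_lexicon`, and the filter rule (drop any word that ends up in both sets) are all assumptions made for illustration, since the text does not say how the filter works.

```python
# Toy stand-ins for WordNet lookups (hypothetical data).
SYNONYMS = {
    "good": {"well"},
    "well": {"fine"},
    "terrible": {"awful"},
    "awful": {"dreadful"},
}
ANTONYMS = {
    "good": {"evil"},
    "terrible": {"wonderful"},
}


def expand_lexicon(pos_seeds, neg_seeds, synonyms, antonyms, rounds=2):
    """Grow positive and negative word sets by following synonym/antonym links."""
    pos, neg = set(pos_seeds), set(neg_seeds)
    for _ in range(rounds):
        new_pos, new_neg = set(), set()
        for word in pos:
            new_pos |= synonyms.get(word, set())  # synonyms keep polarity
            new_neg |= antonyms.get(word, set())  # antonyms flip polarity
        for word in neg:
            new_neg |= synonyms.get(word, set())
            new_pos |= antonyms.get(word, set())
        pos |= new_pos
        neg |= new_neg
    # Assumed filter: discard words that were reached with both polarities.
    ambiguous = pos & neg
    return pos - ambiguous, neg - ambiguous


positive, negative = expand_lexicon({"good"}, {"terrible"}, SYNONYMS, ANTONYMS)
print(sorted(positive))
print(sorted(negative))
```

Each round follows the synonym chain one link further, so `rounds` bounds how far a word can be from a seed.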

The methods above all adapt well across domains and are robust. Their basic idea can be summarized as “Use seeds and semi-supervised learning to induce lexicons”, that is:

Start with a seed set of words (‘good’, ‘poor’)

Find other words that have similar polarity:

Using “and” and “but”
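One way to read the “and”/“but” cue is that words joined by “and” tend to share polarity, while words joined by “but” tend to have opposite polarity. The sketch below propagates seed labels along such conjunction links. The word pairs are invented for illustration, and `propagate_polarity` is a hypothetical name; a real system would first extract the pairs from a corpus and would usually cluster or score the links rather than copy labels directly.

```python
def propagate_polarity(seeds, conjunctions, max_rounds=10):
    """Spread +1/-1 labels from seed words along "and"/"but" links.

    seeds: dict mapping word -> +1 or -1
    conjunctions: list of (word_a, conjunction, word_b) tuples
    """
    labels = dict(seeds)
    for _ in range(max_rounds):
        changed = False
        for word_a, conj, word_b in conjunctions:
            if conj not in ("and", "but"):
                continue  # ignore other connectives
            sign = 1 if conj == "and" else -1  # "but" flips polarity
            for src, dst in ((word_a, word_b), (word_b, word_a)):
                if src in labels and dst not in labels:
                    labels[dst] = sign * labels[src]
                    changed = True
        if not changed:
            break
    return labels


# Hypothetical pairs, e.g. from "good and helpful", "helpful but slow".
pairs = [
    ("good", "and", "helpful"),
    ("helpful", "but", "slow"),
    ("poor", "and", "flimsy"),
]
print(propagate_polarity({"good": 1, "poor": -1}, pairs))
```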


This article is from 电脑杂谈. When reposting, please cite its URL:
http://www.pc-fly.com/a/jisuanjixue/article-35803-6.html
