
Adding a sentiment column to a dataset in R

  •  1
  • Josh  · 6 years ago

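    I am doing some basic sentiment analysis in R and would like to know whether there is a way to analyze the sentiment of each sentence or row and then append that sentiment as a new column. Every analysis I have done so far gives me either an overview of the sentiment or pulls out specific words, but nothing that links back to the original row of data.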

    My input will be fed in from BI software and contains a case number and some text, like this:

    "12345","I am extremely angry with my service"
    "23456","I was happy with how everything turned out"
    "34567","The rep did a great job helping me"
    

    I would like it returned as the output below:

    "12345","I am extremely angry with my service","Anger"
    "23456","I was happy with how everything turned out","Positive"
    "34567","The rep did a great job helping me","Positive"
    

    Any pointer in the right direction, whether a package or a resource, would be greatly appreciated!

    1 answer
  •  3
  •   phiver    6 years ago

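    The problem you have with sentences is that sentiment lexicons are based on words. If you look at the nrc lexicon, the word "angry" has three sentiment values: anger, disgust and negative. Which one do you choose? A sentence can also contain several words that appear in the lexicon, so it returns several values. Try testing different lexicons on your text to see what happens, for example with tidytext.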
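
    To make the ambiguity concrete, here is a minimal base-R sketch that tags each row with every category its words match. The lexicon is a hand-made toy, not the real nrc lexicon: only the three values for "angry" come from the description above, and the entries for "happy" and "great" are assumptions for illustration. The helper name `tag_sentence` is also hypothetical.

```r
# Toy word-level lexicon (hypothetical; a real lexicon such as nrc is far larger).
# "angry" maps to three categories, as described above; the other rows are made up.
lexicon <- data.frame(
  word      = c("angry", "angry", "angry", "happy", "happy", "great"),
  sentiment = c("anger", "disgust", "negative", "joy", "positive", "positive"),
  stringsAsFactors = FALSE
)

cases <- data.frame(
  id = c("12345", "23456", "34567"),
  sentence = c("I am extremely angry with my service",
               "I was happy with how everything turned out",
               "The rep did a great job helping me"),
  stringsAsFactors = FALSE
)

# Return every lexicon category matched by any word in one sentence,
# joined with ";" so the result fits in a single column.
tag_sentence <- function(sentence) {
  words <- strsplit(tolower(sentence), "[^a-z']+")[[1]]
  hits  <- unique(lexicon$sentiment[lexicon$word %in% words])
  paste(hits, collapse = ";")
}

cases$sentiment <- vapply(cases$sentence, tag_sentence, character(1),
                          USE.NAMES = FALSE)
print(cases)
```

    The first row comes back as "anger;disgust;negative" rather than a single "Anger", which is exactly the choice a word-based lexicon leaves to you.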

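    If you want a package that analyzes sentiment at the sentence level, look into sentimentr. You will not get emotion values such as anger back, only a sentiment/polarity score. More about sentimentr can be found in the package documentation and on the sentimentr GitHub page.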

    A small code example:

    library(sentimentr)

    # Example data: one case id and one free-text sentence per row
    text <- data.frame(id = c("12345","23456","34567"),
                       sentence = c("I am extremely angry with my service", "I was happy with how everything turned out", "The rep did a great job helping me"),
                       stringsAsFactors = FALSE)

    # Score the text; the result has one row per detected sentence
    sentiment(text$sentence)
    #    element_id sentence_id word_count  sentiment
    # 1:          1           1          7 -0.5102520
    # 2:          2           1          8  0.2651650
    # 3:          3           1          8  0.3535534

    # add sentiment score to data.frame
    # (lengths match here because every row holds exactly one sentence)
    text$sentiment <- sentiment(text$sentence)$sentiment

    text
    #      id                                   sentence  sentiment
    # 1 12345       I am extremely angry with my service -0.5102520
    # 2 23456 I was happy with how everything turned out  0.2651650
    # 3 34567         The rep did a great job helping me  0.3535534
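
    If a label column is wanted instead of a raw score, one option is to bin the polarity score. The sketch below is base R only and reuses the three scores printed above, so it runs without sentimentr. The helper `label_polarity`, its `neutral_band` parameter and the zero cut-off are assumptions for illustration, not part of sentimentr.

```r
# Scores copied from the sentimentr output above; in real use this column
# would be filled by the sentimentr call rather than typed in.
text <- data.frame(
  id = c("12345", "23456", "34567"),
  sentence = c("I am extremely angry with my service",
               "I was happy with how everything turned out",
               "The rep did a great job helping me"),
  sentiment = c(-0.5102520, 0.2651650, 0.3535534),
  stringsAsFactors = FALSE
)

# Hypothetical helper: map a polarity score to a coarse label.
# Scores within +/- neutral_band of zero are called "Neutral";
# the default of 0 is an assumption, so tune it for your own data.
label_polarity <- function(score, neutral_band = 0) {
  ifelse(score > neutral_band, "Positive",
         ifelse(score < -neutral_band, "Negative", "Neutral"))
}

text$label <- label_polarity(text$sentiment)

# Print id, sentence and label as quoted CSV without header or row names,
# matching the layout requested in the question.
write.table(text[, c("id", "sentence", "label")], sep = ",",
            quote = TRUE, row.names = FALSE, col.names = FALSE)
```

    This yields "Negative" rather than "Anger" for the first case, since a polarity score carries no emotion category. For rows that contain more than one sentence, sentimentr's sentiment_by() returns one averaged score per row (column ave_sentiment), which keeps the lengths aligned with the data frame.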