
Why can't I use "TermDocumentMatrix"?

  • jiji  · 7 years ago

    I used the following commands to replace plural words with their singular forms, but I got an error.

    crudeCorp <- tm_map(crudeCorp, gsub, pattern = "smells", replacement = "smell")
    crudeCorp <- tm_map(crudeCorp, gsub, pattern = "feels", replacement = "feel")
    crudeDtm <- TermDocumentMatrix(crudeCorp, control=list(removePunctuation=T))
    Error in UseMethod("meta", x) : 
      no applicable method for 'meta' applied to an object of class "character"
    

    Am I using these commands incorrectly?

    I am attaching the full code below, covering the text preprocessing and the matrix creation.

    library(tm)
    library(XML)
    
    crudeCorp<-VCorpus(VectorSource(readLines(file.choose())))
    
    #(Eliminating Extra Whitespace) 
    crudeCorp <- tm_map(crudeCorp, stripWhitespace)
    
    #(Convert to Lower Case)
    
    crudeCorp<-tm_map(crudeCorp, content_transformer(tolower))
    
    
    # remove stopwords from corpus
    
    crudeCorp<-tm_map(crudeCorp, removeWords, stopwords("english"))
    myStopwords <- c(stopwords("english"), "can", "will","got","also","goes","get","much","since","way","even")
    myStopwords <- setdiff(myStopwords, c("will","can"))
    crudeCorp <- tm_map(crudeCorp, removeWords, myStopwords)
    
    crudeCorp<-tm_map(crudeCorp,removeNumbers)
    
    crudeCorp <- tm_map(crudeCorp, gsub, pattern = "smells", replacement = "smell")
    crudeCorp <- tm_map(crudeCorp, gsub, pattern = "feels", replacement = "feel")
    
    #-(Creating Term-Document Matrices)
    crudeDtm <- TermDocumentMatrix(crudeCorp, control=list(removePunctuation=T))
    

    1. I'M HAPPY
    2. how are you?
    3. This apple is good
    (skip)
    
    1 reply
  •   Prem    7 years ago

    Why not use the code below for stemming and punctuation removal?

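    # Strip punctuation, then reduce words to their stems (stemDocument needs the SnowballC package)
    crudeCorp <- tm_map(crudeCorp, removePunctuation)
    crudeCorp <- tm_map(crudeCorp, stemDocument, language = "english")
    # Note: DocumentTermMatrix has documents as rows, the transpose of TermDocumentMatrix
    crudeDtm  <- DocumentTermMatrix(crudeCorp)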
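
To see what the stemming step does to the two words from the question, here is a minimal sketch. The two-sentence corpus is made up for illustration (it is not the asker's file), and `stemDocument` assumes the SnowballC package is installed.

```r
library(tm)  # stemDocument() relies on the SnowballC package being installed

# Hypothetical two-document corpus standing in for the asker's input file
docs <- VCorpus(VectorSource(c("this soup smells good", "it feels warm")))

# Same two steps as in the answer above
docs <- tm_map(docs, removePunctuation)
docs <- tm_map(docs, stemDocument, language = "english")

# Pull the processed text back out of each document for inspection
stemmed <- sapply(docs, content)
print(stemmed)
```

With stemming in place, the plural forms "smells" and "feels" should no longer appear, so the separate `gsub` calls are not needed for these words.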
    

    Hope this helps!
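
As for the original error, a likely cause is that `gsub` passed directly to `tm_map` returns plain character vectors, so the corpus no longer holds text documents and `TermDocumentMatrix` cannot read their metadata. If you would rather keep the explicit replacements, wrapping `gsub` in `content_transformer` (as the question already does for `tolower`) should avoid this. A minimal sketch, using a made-up two-sentence corpus rather than the asker's file:

```r
library(tm)

# Hypothetical two-document corpus standing in for the asker's input file
corp <- VCorpus(VectorSource(c("this soup smells good", "it feels warm")))

# content_transformer() applies gsub to each document's text and puts the
# result back into the document, so the corpus keeps its document objects
corp <- tm_map(corp, content_transformer(gsub),
               pattern = "smells", replacement = "smell")
corp <- tm_map(corp, content_transformer(gsub),
               pattern = "feels", replacement = "feel")

# Building the matrix now works on proper text documents
tdm <- TermDocumentMatrix(corp, control = list(removePunctuation = TRUE))
print(Terms(tdm))
```

The same wrapping applies to any base R string function used inside `tm_map` on a `VCorpus`; functions shipped with tm, such as `removeWords` or `stripWhitespace`, already handle documents and do not need it.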