We can split the string on a pattern that matches the date and time, then trim the surrounding whitespace.
text <- "2018-02-19 10:49:50 fgdfhdsgfhdsgfh 2018-02-19 10:49:50 abd abd adjskfjs 2018-02-19 10:51:21 jfhdsjfdsf"
text2 <- trimws(strsplit(text, split = "\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}")[[1]][-1])
text2
# [1] "fgdfhdsgfhdsgfh" "abd abd adjskfjs" "jfhdsjfdsf"
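The `[-1]` in the call above is there because the string begins with a timestamp, so `strsplit` returns an empty string as the first piece. A minimal sketch that makes this visible (the names `pattern` and `parts` are introduced here only for illustration):

```r
text <- "2018-02-19 10:49:50 fgdfhdsgfhdsgfh 2018-02-19 10:49:50 abd abd adjskfjs 2018-02-19 10:51:21 jfhdsjfdsf"
pattern <- "\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}"

# Split without dropping anything, to inspect the raw pieces
parts <- strsplit(text, split = pattern)[[1]]

length(parts)      # one more piece than there are timestamps
nzchar(parts[1])   # FALSE: the first piece is the empty string before the first match

# Dropping the first piece and trimming gives the same result as the one-liner
text2 <- trimws(parts[-1])
```

If the input might not start with a timestamp, dropping the first element unconditionally would discard real text, so filtering with `nzchar()` after trimming may be the safer choice in that case.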
Update
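If we are working with a column in a data frame and want the output in separate columns, we can use the str_split function from the stringr package. Note that in the example below, I replicated the original text to form a data frame with one column and two rows.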
library(stringr)
text <- "2018-02-19 10:49:50 fgdfhdsgfhdsgfh 2018-02-19 10:49:50 abd abd adjskfjs 2018-02-19 10:51:21 jfhdsjfdsf"
text_df <- data.frame(text = rep(text, 2), stringsAsFactors = FALSE)
m1 <- str_split(text_df$text, pattern = "\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}", simplify = TRUE)
m2 <- m1[, 2:ncol(m1)]
m3 <- apply(m2, 2, trimws)
m3
#      [,1]              [,2]               [,3]
# [1,] "fgdfhdsgfhdsgfh" "abd abd adjskfjs" "jfhdsjfdsf"
# [2,] "fgdfhdsgfhdsgfh" "abd abd adjskfjs" "jfhdsjfdsf"
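The result above is still a character matrix. One possible way to attach it to the data frame as separate columns is sketched below. The column names `part1`, `part2`, ... and the variable names `pattern`, `parts`, and `result` are assumptions made for illustration, not part of the original answer.

```r
library(stringr)

text <- "2018-02-19 10:49:50 fgdfhdsgfhdsgfh 2018-02-19 10:49:50 abd abd adjskfjs 2018-02-19 10:51:21 jfhdsjfdsf"
text_df <- data.frame(text = rep(text, 2), stringsAsFactors = FALSE)
pattern <- "\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}"

# Split every row; simplify = TRUE returns a character matrix
parts <- str_split(text_df$text, pattern = pattern, simplify = TRUE)

# Drop the leading empty column; drop = FALSE keeps a matrix even with one row
parts <- parts[, -1, drop = FALSE]

# Trim in place so the matrix dimensions are preserved
parts[] <- trimws(parts)

# Name the columns (hypothetical names) and bind them to the original data frame
colnames(parts) <- paste0("part", seq_len(ncol(parts)))
result <- cbind(text_df, as.data.frame(parts, stringsAsFactors = FALSE))
```

Using `drop = FALSE` and `parts[] <-` rather than `apply()` is a design choice here: it keeps the matrix shape when the data frame has only a single row.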