Extracting numeric characters of length (1 | 2) from a character list

regex r

Chabo · Tech Community · 6 years ago

# str_split() comes from the stringr package
library(stringr)

# Data comes in as a long string
Test<-("82026-424 82026-424 1 CSX10 Store Room 75.74 75.74")

# Separate data into individual pieces with str_split
Split_Test<-str_split(Test[1],"\\s+")

# We can easily unlist it with the following code (Not sure if needed)
Test_Unlisted<-unlist(Split_Test)

> Test_Unlisted
[1] "82026-424" "82026-424" "1"         "CSX10"     "Store"     "Room"
[7] "75.74"     "75.74"

The result I expect is to get "1" from the character list, and if the value were "20" instead, to recognize that as well.

Test_Final<-str_match(Test_Unlisted, "\\d|\\d\\d")

With this code I can get anything of length 1, but it does not guarantee that the value is a numeric character:

Test_Final<-which(sapply(Test_Unlisted, nchar)==1)

1 answer | up to 6 years ago

Wiktor Stribiżew · 6 years ago

You need to use

Test<-("82026-424 82026-424 1 CSX10 Store Room 75.74 75.74, 20")
regmatches(Test, gregexpr("\\b(?<!\\d\\.)\\d{1,2}\\b(?!\\.\\d)", Test, perl=TRUE))

See the regex demo.

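\b - a word boundary
(?<!\d\.) - a negative lookbehind that fails the match if there is a digit and a dot immediately to the left of the current location
\d{1,2} - 1 or 2 digits
\b - a word boundary
(?!\.\d) - a negative lookahead that fails the match if there is a dot and a digit immediately to the right of the current location.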
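To see what the lookarounds contribute, the pattern can be compared against a version that only has the word boundaries. This is a minimal base R sketch; `extract_all` is a hypothetical helper name introduced here for illustration, not part of the answer.

```r
# Same sample string as in the answer above
Test <- "82026-424 82026-424 1 CSX10 Store Room 75.74 75.74, 20"

# Hypothetical helper: return all PCRE matches of `pattern` in `x`
extract_all <- function(pattern, x) {
  regmatches(x, gregexpr(pattern, x, perl = TRUE))[[1]]
}

# Word boundaries only: both halves of each decimal number still match
boundaries_only <- extract_all("\\b\\d{1,2}\\b", Test)
print(boundaries_only)
# [1] "1"  "75" "74" "75" "74" "20"

# With both lookarounds: digits next to a decimal point are rejected
with_lookarounds <- extract_all("\\b(?<!\\d\\.)\\d{1,2}\\b(?!\\.\\d)", Test)
print(with_lookarounds)
# [1] "1"  "20"
```

The word boundaries alone already exclude "82026", "424" and the "10" in "CSX10"; the lookarounds are only needed for the decimals.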

Note that since the pattern uses lookarounds, the regex must be passed to the PCRE regex engine, hence perl=TRUE.
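As a rough illustration of why perl=TRUE matters, the sketch below runs the same pattern through both engines. It assumes that R's default engine (TRE) rejects the lookbehind syntax with an error, which is the usual behaviour; the variable names are chosen here for illustration.

```r
Test <- "82026-424 82026-424 1 CSX10 Store Room 75.74 75.74, 20"
pattern <- "\\b(?<!\\d\\.)\\d{1,2}\\b(?!\\.\\d)"

# Default engine: the lookaround syntax is expected to fail to compile.
# try() captures the error instead of stopping the script.
default_result <- suppressWarnings(try(gregexpr(pattern, Test), silent = TRUE))
default_failed <- inherits(default_result, "try-error")
print(default_failed)

# PCRE engine: the pattern compiles and the matches can be extracted
pcre_result <- gregexpr(pattern, Test, perl = TRUE)
pcre_matches <- regmatches(Test, pcre_result)[[1]]
print(pcre_matches)
```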

With stringr:

library(stringr)
str_extract_all(Test, "\\b(?<!\\d\\.)\\d{1,2}\\b(?!\\.\\d)")
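Since the question already splits the string on whitespace, another option is to test each token as a whole with an anchored pattern. This is a different technique from the lookaround regex above, sketched here in base R under the assumption that every value of interest is its own whitespace-separated token; `tokens` and `is_short_number` are names introduced for illustration.

```r
# Sample string from the question
Test <- "82026-424 82026-424 1 CSX10 Store Room 75.74 75.74"

# Split on runs of whitespace (base R equivalent of str_split + unlist)
tokens <- strsplit(Test, "\\s+")[[1]]

# TRUE only for tokens made up of exactly one or two digits
is_short_number <- grepl("^[0-9]{1,2}$", tokens)

# The matching values and their positions in the token vector
short_numbers <- tokens[is_short_number]
short_positions <- which(is_short_number)
print(short_numbers)
print(short_positions)
```

Unlike the nchar() check in the question, the anchored pattern requires the token to be numeric, so a one-letter token would not be picked up.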
