
How to match string characters that are not between instances of < and > [duplicate]

replaceall regex html java

BlahMclean · 6 years ago

I have this HTML:

"This is simple html text <span class='simple'>simple simple text text</span> text"

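I need to match only the words that are outside any HTML tag. That is, if I want to match "simple" and "text", I should only get results from "This is simple html text" and from the final "text": 1 match for "simple" and 2 matches for "text". Can anyone help me? I am using jQuery.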

var pattern = new RegExp("(\\b" + value + "\\b)", 'gi');

if (pattern.test(text)) {
    text = text.replace(pattern, "<span class='notranslate'>$1</span>");
}

value is the word I want to match (in this case, simple)
text is "This is simple html text <span class='simple'>simple simple text text</span> text"

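I need to wrap the matched words in a <span>, but I only want to wrap the words that are outside any HTML tag. The result for this example should be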

This is <span class='notranslate'>simple</span> html <span class='notranslate'>text</span> <span class='simple'>simple simple text text</span> <span class='notranslate'>text</span>

I do not want to replace any text inside

<span class='simple'>simple simple text text</span>

It should stay the same as it was before the replacement.


Jerry · 8 years ago

OK, try this regex:

(text|simple)(?![^<]*>|[^<>]*</)

Example worked on regex101.

Breakdown:

(         # Open capture group
  text    # Match 'text'
|         # Or
  simple  # Match 'simple'
)         # End capture group
(?!       # Negative lookahead start (will cause match to fail if contents match)
  [^<]*   # Any number of non-'<' characters
  >       # A > character
|         # Or
  [^<>]*  # Any number of non-'<' and non-'>' characters
  </      # The characters < and /
)         # End negative lookahead.

The negative lookahead prevents a match if text or simple is inside a tag or between HTML tags.
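
As a rough sketch, the expression can be plugged into the replacement code from the question. The helper name wrapOutsideTags and the escaping step are assumptions added for illustration; the \b word boundaries come from the question's pattern, not from this answer.

```javascript
// Hypothetical helper (not part of the original answer): wraps each listed
// word in a notranslate span unless the lookahead says it sits inside a tag
// or inside an element's content.
function wrapOutsideTags(text, words) {
  // Escape regex metacharacters so every word is matched literally.
  const escaped = words.map((w) => w.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"));
  // \b(...)\b keeps whole words only, as in the question's pattern;
  // the negative lookahead is the one from the answer.
  const pattern = new RegExp(
    "\\b(" + escaped.join("|") + ")\\b(?![^<]*>|[^<>]*</)",
    "gi"
  );
  return text.replace(pattern, "<span class='notranslate'>$1</span>");
}

const text =
  "This is simple html text <span class='simple'>simple simple text text</span> text";
console.log(wrapOutsideTags(text, ["simple", "text"]));
```

For the sample string this produces the output requested in the question. The lookahead is a heuristic, though: a literal > appearing later in plain text (for example "text a > b") also suppresses the match.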

Explosion Pills · 11 years ago

^([^<]*)<\w+.*/\w+>([^<]*)$

However, this is a very naive expression. It would be better to use a DOM parser.
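
To illustrate what this expression captures, here is a minimal sketch. It assumes one extra capture group around the tagged middle part (not in the original expression) so that part can be put back unchanged; wrapAroundElement is a hypothetical helper name.

```javascript
// Sketch only: the answer's expression with one added capture group
// (an assumption) so the tagged middle part can be restored untouched.
const splitter = /^([^<]*)(<\w+.*\/\w+>)([^<]*)$/;

// Hypothetical helper: wraps `word` in the leading and trailing text only.
function wrapAroundElement(text, word) {
  const parts = splitter.exec(text);
  if (parts === null) {
    return null; // input does not have the text, element(s), text shape
  }
  const [, before, element, after] = parts;
  // Escape regex metacharacters so the word is matched literally.
  const escaped = word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const wrap = (s) =>
    s.replace(
      new RegExp("\\b(" + escaped + ")\\b", "gi"),
      "<span class='notranslate'>$1</span>"
    );
  return wrap(before) + element + wrap(after);
}

const sample =
  "This is simple html text <span class='simple'>simple simple text text</span> text";
console.log(wrapAroundElement(sample, "text"));
```

The naivety shows up quickly: with two sibling elements, the greedy .* swallows everything from the first opening tag to the last closing tag, so text between the elements is never wrapped, and input without a closing tag does not match at all.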
