
Can we use conditionals based on lookbehinds?

  • Regular Jo · 6 years ago

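    I have a regex whose task is similar to this one. This one looks for prices (demo) where the dollar sign must come either before or after the number, but never both.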

      \b       # "word" boundary
      (?<=\$)  # Assert that current position is preceded by $
      \d+\b    # digits, boundary
      (?!\$)   # Assert that current position is not followed by $
    |          # alternation
      \b       # boundary
      (?<!\$)  # Assert that current position is not preceded by $
      \d+\b    # digits, boundary
      (?=\$)   # Assert that current position is followed by $
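
For readers who want to try the alternation outside a PCRE tester, here is a minimal sketch using Python's `re` module instead of PCRE (an assumption on my part; the lookarounds used here behave the same way in both). The names `PRICE_ALTERNATION` and `find_prices` and the sample string are illustrative only and do not come from the question.

```python
import re

# The question's alternation, kept in verbose layout so it mirrors the original.
# Python's re is used here as a stand-in for PCRE (assumption).
PRICE_ALTERNATION = re.compile(r"""
      \b (?<=\$) \d+ \b (?!\$)    # preceded by a dollar sign, not followed by one
    |
      \b (?<!\$) \d+ \b (?=\$)    # not preceded by a dollar sign, followed by one
""", re.VERBOSE)


def find_prices(text):
    """Return the digit runs that have a dollar sign on exactly one side."""
    return [m.group(0) for m in PRICE_ALTERNATION.finditer(text)]


print(find_prices("$15 16$ $17$"))  # ['15', '16']
```

The 17 is rejected because it has a dollar sign on both sides, and the word boundaries stop the engine from settling for a shorter run of digits.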
    

    Is there a way in PCRE to handle this with a single conditional, something like the following? (demo)

    (                 # capture 1, because we can't quantify lookarounds
      (?<=\$)         # Asserts that $ precedes current position
    )?                # close capture 1, make it "optional" (0 or 1 times)
    
    \b                # boundary
    \d+               # digits
    \b                # boundary
    (?(1)             # conditional that capture #1 was a success
      (?!\$)          # if successful, assert that it is not followed by a $
    |
      (?=\$)          # unsuccessful, assert that this position is followed by a $
    )
    

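    Note: the word boundaries are important so that both expressions capture the whole number. Without them, the regex would back up one step and cut a digit off a multi-digit number.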

    Both expressions above are meant to match $15 and 16$, but not $17$. Without the boundaries, they would match $15, 16$, and the 1 in $17$.
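
The effect of dropping the boundaries can be checked with a short sketch. It uses Python's `re` rather than PCRE (an assumption; these lookarounds behave the same in both), and the names `WITH_BOUNDARIES`, `WITHOUT_BOUNDARIES` and `sample` are made up for illustration.

```python
import re

# The alternation from the question, with and without the \b anchors.
WITH_BOUNDARIES = re.compile(r"\b(?<=\$)\d+\b(?!\$)|\b(?<!\$)\d+\b(?=\$)")
WITHOUT_BOUNDARIES = re.compile(r"(?<=\$)\d+(?!\$)|(?<!\$)\d+(?=\$)")

sample = "$15 16$ $17$"

strict = WITH_BOUNDARIES.findall(sample)
loose = WITHOUT_BOUNDARIES.findall(sample)

print(strict)  # ['15', '16']
print(loose)   # ['15', '16', '1', '7']
```

Without `\b`, `\d+` gives back the 7 so that the negative lookahead passes, which yields the stray 1. In a global search the leftover 7 is then picked up by the second alternative, since it is preceded by a digit and followed by a dollar sign.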

    1 answer

  • Regular Jo · 6 years ago

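    Working solution (demo). This solution takes 99 steps, which is an unnecessary sacrifice in this simple example. In other cases the sacrifice, if there is one at all, may well be worth making.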

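    The fix turned out to be simple: make capture group 1 possessive. This means that whatever it captures, it will not give it up when the regex tries to backtrack. Without that, the engine can backtrack into the optional group, drop the capture, and take the other branch of the conditional. With it, once the regex has decided that the digits are preceded by a $, and the digits are also followed by a $, it gives up and moves on, looking for the next boundary followed by digits and checking the surroundings there.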

    (                 # capture 1, because we can't quantify lookarounds
      (?<=\$)         # Asserts that $ precedes current position
    )?+               # close capture 1, make it "optional" (0 or 1 times)
                      # the possessive quantifier is the only change to the regex;
                      # + is only "possessive" when it is the second character of a quantifier
    
    \b                # boundary
    \d+               # digits
    \b                # boundary
    (?(1)             # conditional that capture #1 was a success
      (?!\$)          # if successful, assert that it is not followed by a $
    |
      (?=\$)          # unsuccessful, assert that this position is followed by a $
    )