
Complex find and replace in Notepad++

  •  0
  • gtilflm  · Tech Community  · 5 years ago

    I have hundreds of files that contain blocks like the one below, and I want to remove the entire block.

       <texttool id="468" rect="55,306,319,23">
          <toolstroke />
          <toolcolor />
          <font style="2" size="18" />
          <Text>(Be sure to state the page/problem number.)</Text>
        </texttool>
    

    The problem is that each block has a different id=XXX part... but everything else is the same.

    Is there a way to do a bulk find and replace that handles this?

        1
  •  3
  •   piet.t Charis A.    5 years ago

    Use the following regex to search the whole file and remove every <texttool> block together with its contents:

    (<texttool(?:.|\n)*?<\/texttool>)
    

    Before

     text before<texttool id="468" rect="55,306,319,23">
          <toolstroke />
          <toolcolor />
          <font style="2" size="18" />
          <Text>(Be sure to state the page/problem number.)</Text>
        </texttool> text after
    
    <texttool id="468" rect="55,306,319,23">
          <toolstroke />
          <toolcolor />
          <font style="2" size="18" />
          <Text>(Be sure to state the page/problem number.)</Text>
        </texttool>
    

    After

     text before text after
    

    You can try it yourself here: DEMO
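
    The pattern can also be checked outside the editor. Below is a minimal Python sketch, assuming Python's `re` module as a stand-in for the demo site; the `remove_blocks` helper and the shortened `sample` string are made up for illustration and are not part of the answer.

```python
import re

# Pattern from the answer: a lazy match from "<texttool" to the nearest "</texttool>".
# (?:.|\n) lets the match span line breaks without needing any flags.
BLOCK = re.compile(r"(<texttool(?:.|\n)*?<\/texttool>)")


def remove_blocks(text: str) -> str:
    """Delete every <texttool>...</texttool> block (hypothetical helper name)."""
    return BLOCK.sub("", text)


# Shortened version of the "Before" sample.
sample = (
    'text before<texttool id="468" rect="55,306,319,23">\n'
    '      <toolstroke />\n'
    '      <Text>(Be sure to state the page/problem number.)</Text>\n'
    '    </texttool> text after'
)

result = remove_blocks(sample)
print(result)
```

    Because the quantifier is lazy, each match stops at the first closing tag it meets, so separate blocks are removed one by one rather than as a single span.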

    Update 1

    As requested, the following regex removes only those <texttool> blocks that carry the attribute rect="55,306,319,23":

    (<texttool.*rect=\"55\,306\,319\,23\"(?:.|\n)*?<\/texttool>)
    

    Here is the updated regex DEMO.

    Note that it matches only blocks that contain that exact string, compared character by character.
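
    To see the filtering in action, here is a small Python sketch, again assuming Python's `re` module rather than the demo site. The `xml` sample is a trimmed, illustrative pair of blocks, one with a different rect and one with the target rect.

```python
import re

# Update 1 pattern: the opening tag must carry rect="55,306,319,23".
# "." does not match "\n" here, so ".*" cannot leave the opening tag's line.
RECT_BLOCK = re.compile(
    r'(<texttool.*rect=\"55\,306\,319\,23\"(?:.|\n)*?<\/texttool>)'
)

xml = (
    '<texttool id="884" rect="30,584,214,31">\n'
    '  <Text>Got Audio Problems?</Text>\n'
    '</texttool>\n'
    '<texttool id="885" rect="55,306,319,23">\n'
    '  <Text>Note: Audio problems can be caused</Text>\n'
    '</texttool>\n'
)

cleaned = RECT_BLOCK.sub("", xml)
print(cleaned)
```

    Only the block whose opening tag contains the exact rect string is removed; the block with id="884" is left in place.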

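    Update 2

    The regexes I provided above do not work properly in Notepad++, because it uses a custom regex system based on PCRE. Here is a tested and verified pattern that works for me: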

    <\btexttool.*\brect\=\"55\,306\,319\,23\"([\s\S]*?)<\/\btexttool>
    

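    It is very important to disable the . matches newline option, otherwise the pattern will not work, because it is not compatible with that option. With the option enabled, the greedy .* can run across line breaks, so a match may start at an earlier <texttool> and swallow blocks that should be kept.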
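
    The effect of that option can be illustrated with a short Python sketch. This assumes Python's `re.DOTALL` flag behaves like Notepad++'s . matches newline for this pattern; it is not the Notepad++ engine itself, and the `xml` sample is illustrative.

```python
import re

# Update 2 pattern, copied as given in the answer.
PATTERN = r'<\btexttool.*\brect\=\"55\,306\,319\,23\"([\s\S]*?)<\/\btexttool>'

xml = (
    '<grouptool id="881">\n'
    '<texttool id="884" rect="30,584,214,31">\n'
    '  <Text>Got Audio Problems?</Text>\n'
    '</texttool>\n'
    '<texttool id="885" rect="55,306,319,23">\n'
    '  <Text>Note: Audio problems can be caused</Text>\n'
    '</texttool>\n'
    '</grouptool>\n'
)

# Dot does NOT match newline: ".*" stays on the opening tag's line.
dot_off = re.sub(PATTERN, "", xml)

# Dot matches newline: ".*" is greedy across lines, so the match starts at
# the first <texttool (id 884) and runs through the block with id 885.
dot_on = re.sub(PATTERN, "", xml, flags=re.DOTALL)

print("884 kept with dot-newline off:", 'id="884"' in dot_off)
print("884 kept with dot-newline on: ", 'id="884"' in dot_on)
```

    With the flag off, only the block with the target rect disappears. With it on, the unrelated block before it is deleted as well, which is the failure the warning above is about.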

        2
  •  3
  •   Toto    5 years ago
    • Ctrl + H
    • Find what: <texttool\b[^>]*?\brect="55,306,319,23"(?:(?!<texttool\b).)*</texttool>
    • Replace with: LEAVE EMPTY
    • check Match case
    • check Wrap around
    • check Regular expression
    • check . matches newline
    • Replace all

    Explanation:

    <texttool\b             # open tag
    [^>]*?                  # 0 or more any character that is not >, not greedy
    \brect="55,306,319,23"  # literally
                        # Tempered greedy token
    (?:                     # start non capture group
     (?!<texttool\b)        # negative lookahead, make sure we haven't same tag
     .                      # any character
    )*                      # end group, may appear 0 or more times
    </texttool>             # end tag
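
    The same pattern can be tried in Python as a rough check of the tempered greedy token. This is a sketch under the assumption that `re.DOTALL` plays the role of . matches newline; the sample is a trimmed, illustrative group and the variable names are made up.

```python
import re

TEMPERED = re.compile(
    r'<texttool\b[^>]*?\brect="55,306,319,23"'  # opening tag with the target rect
    r'(?:(?!<texttool\b).)*'                    # any char, but never step onto a new <texttool
    r'</texttool>',                             # closing tag
    re.DOTALL,  # stand-in for ". matches newline"
)

xml = (
    '<grouptool id="881" rect="20,576,456,141">\n'
    '  <texttool id="884" rect="30,584,214,31">\n'
    '    <Text>Got Audio Problems?</Text>\n'
    '  </texttool>\n'
    '  <texttool id="885" rect="55,306,319,23">\n'
    '    <Text>Note: Audio problems can be caused</Text>\n'
    '  </texttool>\n'
    '  <imagetool id="886" rect="36,631,262,24">\n'
    '    <bordercolor />\n'
    '  </imagetool>\n'
    '  <texttool id="887" rect="55,306,319,23">\n'
    '    <Text>by a weak/spotty internet connection.</Text>\n'
    '  </texttool>\n'
    '  <rectangletool id="888" rect="249,627,30,31">\n'
    '    <toolstroke WIDTH="4.0" />\n'
    '  </rectangletool>\n'
    '</grouptool>\n'
)

out = TEMPERED.sub("", xml)
print(out)
```

    The lookahead keeps a match from running past the start of the next <texttool>, so the two blocks with the target rect are removed separately and the elements between them survive.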
    

    Given:

      <grouptool id="881" rect="20,576,456,141">
        <imagetool id="882" rect="349.15240478515625,581.5066528320312,111.22747039794922,132.8365936279297">
          <toolstroke WIDTH="1.0" CAP="2" JOIN="2" MITER="0.0" />
          <bordercolor />
          <image name="head-set-md.png" type="CLIPART" size="20419" w="252" h="300" CRC="3224584205" />
        </imagetool>
        <texttool id="884" rect="30,584,214,31">
          <toolstroke />
          <toolcolor />
          <font style="3" size="24" />
          <Text>Got Audio Problems?</Text>
        </texttool>
        <texttool id="885" rect="55,306,319,23">
          <toolstroke />
          <toolcolor />
          <font style="2" size="18" />
          <Text>Note: Audio problems can be caused</Text>
        </texttool>
        <imagetool id="886" rect="36.17853927612305,631.7913818359375,262.9012756347656,24.34532356262207">
          <toolstroke WIDTH="1.0" CAP="2" JOIN="2" MITER="0.0" />
          <bordercolor />
          <image name="unknown.png" type="CLIPART" size="1777" w="260" h="24" CRC="2321804736" />
        </imagetool>
        <texttool id="887" rect="55,306,319,23">
          <toolstroke />
          <toolcolor />
          <font style="2" size="18" />
          <Text>by a weak/spotty internet connection.</Text>
        </texttool>
        <rectangletool id="888" rect="249.5330505371093,627.7338256835938,30.33476448059082,31.446043014526367">
          <toolstroke WIDTH="4.0" />
          <toolcolor RGB="52224" />
          <fillcolor RGB="16777215" ALPHA="0" />
        </rectangletool>
      </grouptool>
    

    Result for given example:

      <grouptool id="881" rect="20,576,456,141">
        <imagetool id="882" rect="349.15240478515625,581.5066528320312,111.22747039794922,132.8365936279297">
          <toolstroke WIDTH="1.0" CAP="2" JOIN="2" MITER="0.0" />
          <bordercolor />
          <image name="head-set-md.png" type="CLIPART" size="20419" w="252" h="300" CRC="3224584205" />
        </imagetool>
        <rectangletool id="883" rect="20,576,455.0214538574219,141">
          <toolstroke />
          <toolcolor />
          <fillcolor RGB="16777215" ALPHA="0" />
        </rectangletool>
    
    
        <imagetool id="886" rect="36.17853927612305,631.7913818359375,262.9012756347656,24.34532356262207">
          <toolstroke WIDTH="1.0" CAP="2" JOIN="2" MITER="0.0" />
          <bordercolor />
          <image name="unknown.png" type="CLIPART" size="1777" w="260" h="24" CRC="2321804736" />
        </imagetool>
    
        <rectangletool id="888" rect="249.5330505371093,627.7338256835938,30.33476448059082,31.446043014526367">
          <toolstroke WIDTH="4.0" />
          <toolcolor RGB="52224" />
          <fillcolor RGB="16777215" ALPHA="0" />
        </rectangletool>
      </grouptool>
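
    Since the question mentions hundreds of files, the replacement could also be scripted instead of run from the editor. The following is only a sketch in Python, not part of either answer: the `clean_directory` name, the "*.xml" glob, and the UTF-8 encoding are all assumptions, and the demo runs on a throwaway temporary directory so that no real files are touched.

```python
import re
import tempfile
from pathlib import Path

# Same pattern as in the "Find what" field above; DOTALL stands in for ". matches newline".
TEMPERED = re.compile(
    r'<texttool\b[^>]*?\brect="55,306,319,23"(?:(?!<texttool\b).)*</texttool>',
    re.DOTALL,
)


def clean_directory(root: Path, glob: str = "*.xml", encoding: str = "utf-8") -> int:
    """Rewrite matching files under root in place; return how many files changed.

    The glob and encoding defaults are assumptions about the files in question.
    """
    changed = 0
    for path in sorted(root.rglob(glob)):
        original = path.read_text(encoding=encoding)
        cleaned = TEMPERED.sub("", original)
        if cleaned != original:
            path.write_text(cleaned, encoding=encoding)
            changed += 1
    return changed


# Demo on a temporary directory with two made-up files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.xml").write_text(
        '<grouptool>\n'
        '<texttool id="1" rect="55,306,319,23">\n'
        '<Text>x</Text>\n'
        '</texttool>\n'
        '</grouptool>\n',
        encoding="utf-8",
    )
    (root / "b.xml").write_text("<grouptool>\n</grouptool>\n", encoding="utf-8")

    changed_count = clean_directory(root)
    remaining = (root / "a.xml").read_text(encoding="utf-8")

print(changed_count)
```

    Files are rewritten in place, so it would be prudent to run something like this on a copy of the data first.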
    
