
Extract the last match from a string in C#

  •  1
  • knittl  · Tech Community  · 14 years ago

    I have strings in the form [abc].[some other string].[can.also.contain.periods].[our match]

    I now want to match the string "our match" (i.e. without the brackets), so I played around with lookarounds and whatnot. I now get the correct match, but I don't think this is a clean solution.

    (?<=\.?\[)     starts with '[' or '.['
    ([^\[]*)      our match, i couldn't find a way to not use a negated character group
                  `.*?` non-greedy did not work as expected with lookarounds,
                  it would still match from the first match
                  (matches might contain escaped brackets)
    (?=\]$)       string ends with an ]
    

    The language is .NET/C#. If there is a simple solution that does not involve regex, I'd be glad to hear about it too.

    What really irritates me is the fact that I cannot use (.*?) to capture the string, because the non-greedy quantifier does not seem to work with lookbehinds.

    I have also tried: Regex.Split(str, @"\]\.\[").Last().TrimEnd(']'); but I'm not really fond of this solution either.

    4 answers  |  latest 14 years ago
        1
  •  3
  •   Jacob Poul Richardt    14 years ago

    The following should do the trick. It assumes the string ends right after the last match.

    string input = "[abc].[some other string].[can.also.contain.periods].[our match]";
    
    var search = new Regex("\\.\\[(.*?)\\]$", RegexOptions.RightToLeft);
    
    string ourMatch = search.Match(input).Groups[1].Value;
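
To see why the direction matters, here is a minimal sketch that runs the same pattern both ways. The class and method names (DirectionDemo, Capture) are made up for illustration. Left to right, the engine commits to the first ".[" it finds, and the lazy group then has to stretch to the final "]"; right to left, the match is anchored at the end first.

```csharp
using System;
using System.Text.RegularExpressions;

static class DirectionDemo
{
    // Same pattern as in the answer, written as a verbatim string.
    const string Pattern = @"\.\[(.*?)\]$";

    // Returns capture group 1, or an empty string when nothing matches.
    public static string Capture(string input, RegexOptions options)
    {
        Match m = Regex.Match(input, Pattern, options);
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    static void Main()
    {
        string input = "[abc].[some other string].[can.also.contain.periods].[our match]";

        // Left to right: the match starts at the first ".[" in the input,
        // so the lazy group runs on until the closing "]" at the very end.
        Console.WriteLine(Capture(input, RegexOptions.None));
        // prints: some other string].[can.also.contain.periods].[our match

        // Right to left: the end anchor is satisfied first, and the lazy
        // group stops at the nearest ".[" to its left.
        Console.WriteLine(Capture(input, RegexOptions.RightToLeft));
        // prints: our match
    }
}
```

Note that the pattern requires a ".[" before the last group, so an input consisting of a single "[our match]" would not match at all.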
    
        2
  •  2
  •   David_001    14 years ago

    Assuming you can guarantee the input format, and it's just the last entry you want, LastIndexOf can be used:

    string input = "[abc].[some other string].[can.also.contain.periods].[our match]";
    
    int lastBracket = input.LastIndexOf("[");
    string result = input.Substring(lastBracket + 1, input.Length - lastBracket - 2);
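
If the input format cannot be fully guaranteed, the same idea can be wrapped in a small guard. This is only a sketch: the helper name TryExtractLast and its bool/out shape are my own choices, not part of any library.

```csharp
using System;

static class LastIndexOfDemo
{
    // Tries to return the text inside the final [...] group.
    // Fails when the input does not end with ']' or contains no '['.
    public static bool TryExtractLast(string input, out string result)
    {
        result = string.Empty;
        if (string.IsNullOrEmpty(input) || input[input.Length - 1] != ']')
            return false;

        int lastBracket = input.LastIndexOf('[');
        if (lastBracket < 0)
            return false;

        // Skip the '[' itself and leave out the trailing ']'.
        result = input.Substring(lastBracket + 1, input.Length - lastBracket - 2);
        return true;
    }

    static void Main()
    {
        string input = "[abc].[some other string].[can.also.contain.periods].[our match]";
        if (TryExtractLast(input, out string value))
            Console.WriteLine(value); // prints: our match
    }
}
```

Like the original snippet, this looks only at the last '[', so a bracket inside the final entry would cut the result short.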
    
        3
  •  0
  •   Rox    14 years ago

    Using string.Split():

    string input = "[abc].[some other string].[can.also.contain.periods].[our match]";
    char[] seps = {'[',']','\\'};
    string[] splitted = input.Split(seps,StringSplitOptions.RemoveEmptyEntries);
    

    Edit: the array will have the string inside [] and then . and so on, so if you have a variable number of groups, you can use that to get the value you want (or remove the strings that are just '.')

    Edit: added the backslash to the separators to handle cases like '\[abc\]'

    Edit 2: for nested []:

    string input = @"[abc].[some other string].[can.also.contain.periods].[our [the] match]";
    string[] seps2 = { "].["};
    string[] splitted = input.Split(seps2, StringSplitOptions.RemoveEmptyEntries);
    

    The last element (index 3) will then be "our [the] match]", so you still have to remove the extra ]
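
A possible way to finish that last step, as a rough sketch (the helper name LastSegment is made up here): take the last piece of the split and drop exactly one trailing ']', so that brackets nested inside the entry survive.

```csharp
using System;

static class SplitDemo
{
    // Splits on the "].[" separator and returns the last entry without
    // its closing bracket. Returns an empty string for empty input.
    public static string LastSegment(string input)
    {
        string[] parts = input.Split(new[] { "].[" }, StringSplitOptions.RemoveEmptyEntries);
        if (parts.Length == 0)
            return string.Empty;

        string last = parts[parts.Length - 1];

        // Remove only one trailing ']' (TrimEnd would strip all of them).
        if (last.EndsWith("]", StringComparison.Ordinal))
            last = last.Substring(0, last.Length - 1);

        // With a single group there was no separator, so the opening '['
        // is still attached to the only element.
        if (parts.Length == 1 && last.StartsWith("[", StringComparison.Ordinal))
            last = last.Substring(1);

        return last;
    }

    static void Main()
    {
        string input = @"[abc].[some other string].[can.also.contain.periods].[our [the] match]";
        Console.WriteLine(LastSegment(input)); // prints: our [the] match
    }
}
```

This still assumes that "].[" never occurs inside an entry; if it can, the split would cut that entry in two.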

        4
  •  0
  •   polygenelubricants    14 years ago

    You have several options:

    • RegexOptions.RightToLeft - yes, .NET regex can do this! Use it!
    • Match the whole thing with greedy prefix, use brackets to capture the suffix that you're interested in
      • In general, pattern becomes .*(pattern)
      • In this case, .*\[([^\]]*)\] , then extract what \1 captures ( see this on rubular.com )
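
In C#, the greedy-prefix option could look roughly like this. It is a sketch that assumes single-line input (by default . does not match a newline), and the names GreedyPrefixDemo and LastBracketed are only illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

static class GreedyPrefixDemo
{
    // The greedy .* first swallows the whole input and then backtracks
    // just far enough for \[([^\]]*)\] to match, which is the last group.
    public static string LastBracketed(string input)
    {
        Match m = Regex.Match(input, @".*\[([^\]]*)\]");
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    static void Main()
    {
        string input = "[abc].[some other string].[can.also.contain.periods].[our match]";
        Console.WriteLine(LastBracketed(input)); // prints: our match
    }
}
```

Group 1 here plays the role of \1 in the pattern above; everything matched by the .* prefix is simply ignored.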
