
Parsing an attribute/value list in PHP

  •  3
  • dreeves  · Tech Community  · 15 years ago

    Given a string of attribute/value pairs such as

    attr1="some text" attr2 = "some other text" attr3= "some weird !@'#$\"=+ text"
    

    the goal is to parse it and output an associative array, in this case:

    array('attr1' => 'some text',
          'attr2' => 'some other text',
          'attr3' => 'some weird !@\'#$\"=+ text')
    

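    Note the inconsistent spacing around the equals signs, the escaped double quote in the input, and the escaped single quote in the output.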

    2 answers  |  as of 15 years ago
        1
  •  6
  •   Bart Kiers    15 years ago

    Try something like this:

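    // Sample input; the inner double quotes and the backslash are escaped for PHP.
    $text = "attr1=\"some text\" attr2 = \"some other text\" attr3= \"some weird !@'#$\\\"=+ text\"";
    echo $text . "\n\n";
    // Group 1 is the attribute name, group 2 the quoted value. The class is written
    // [^\\\\"] so that, after PHP string unescaping, it excludes the backslash too.
    preg_match_all('/(\S+)\s*=\s*"((?:\\\\.|[^\\\\"])*)"/', $text, $matches, PREG_SET_ORDER);
    print_r($matches);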
    

    which produces:

    attr1="some text" attr2 = "some other text" attr3= "some weird !@'#$\"=+ text"
    
    Array
    (
        [0] => Array
            (
                [0] => attr1="some text"
                [1] => attr1
                [2] => some text
            )
    
        [1] => Array
            (
                [0] => attr2 = "some other text"
                [1] => attr2
                [2] => some other text
            )
    
        [2] => Array
            (
                [0] => attr3= "some weird !@'#$\"=+ text"
                [1] => attr3
                [2] => some weird !@'#$\"=+ text
            )
    
    )
    

    A brief explanation:

    (\S+)                 // match one or more non-whitespace characters
                          // > and store them in group 1
    \s*=\s*               // match a '=' surrounded by zero or more whitespace characters
    "                     // match a double quote
    (                     // open group 2
      (?:\\\\.|[^\\\\"])* //   match zero or more substrings that are either a backslash
                          //   > followed by any character, or any single character other
                          //   > than a backslash or a double quote (\\\\ is the PHP
                          //   > single-quoted spelling of the regex escape \\)
    )                     // close group 2
    "                     // match a double quote
    
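    The matches above still have to be folded into the associative array the question asks for. A minimal sketch of that step follows; `parse_attributes` and its `$unescape` flag are hypothetical names introduced here for illustration, not part of the original answer. By default the value is kept exactly as captured (backslash included); passing `true` strips the backslash from each escape.

```php
<?php
// Hypothetical helper (not from the original answer): collects name => value
// pairs from the escape-aware pattern into an associative array.
function parse_attributes(string $text, bool $unescape = false): array
{
    // The backslash is excluded from the character class, so an escaped
    // quote can only be consumed by the `\\.` branch.
    $pattern = '/(\S+)\s*=\s*"((?:\\\\.|[^\\\\"])*)"/';
    $attributes = [];
    if (preg_match_all($pattern, $text, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $value = $m[2];
            if ($unescape) {
                // Replace each backslash escape (\" or \\) with the bare character.
                $value = preg_replace('/\\\\(.)/', '$1', $value);
            }
            $attributes[$m[1]] = $value;
        }
    }
    return $attributes;
}

// Single-quoted so that only \' and \\ need escaping for PHP.
$text = 'attr1="some text" attr2 = "some other text" attr3= "some weird !@\'#$\\"=+ text"';
var_export(parse_attributes($text));
```

    With the sample input, `attr3` maps to `some weird !@'#$\"=+ text`, or to `some weird !@'#$"=+ text` when `$unescape` is `true`.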
        2
  •  2
  •   Amarghosh    15 years ago

    Edit: this regex fails if the value ends with a backslash, as in attr4="something\\"

    I don't know PHP, but since regex is basically the same in any language, here is how I did it in ActionScript:

    var text:String = "attr1=\"some text\" attr2 = \"some other text\" attr3= \"some weird !@'#$\\\"=+ text\"";
    
    var regex:RegExp = /\s*(\w+)\s*=\s*(?:"(.*?)(?<!\\)")\s*/g;
    
    var result:Object;
    while(result = regex.exec(text))
        trace(result[1] + " is " + result[2]);
    

    This traces the following:

    attr1 is some text
    attr2 is some other text
    attr3 is some weird !@'#$\"=+ text
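    To make the trailing-backslash caveat from the edit note concrete, here is a small PHP sketch that runs a lookbehind pattern and an escape-aware pattern over the same input. The input string, the variable names, and the `extract_pairs` helper are assumptions made for illustration; the lookbehind pattern is a PHP port of the ActionScript one with the surrounding `\s*` dropped.

```php
<?php
// The text being parsed is: attr4="something\\" attr5="x"
// (the value of attr4 ends with an escaped backslash).
$text = 'attr4="something\\\\" attr5="x"';

// Lookbehind approach: a quote closes the value unless a backslash precedes it.
$lookbehind = '/(\w+)\s*=\s*"(.*?)(?<!\\\\)"/';
// Escape-aware approach: backslash escapes are consumed as two-character units.
$escapeAware = '/(\w+)\s*=\s*"((?:\\\\.|[^\\\\"])*)"/';

// Hypothetical helper: returns name => value pairs for a given pattern.
function extract_pairs(string $pattern, string $text): array
{
    preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);
    $pairs = [];
    foreach ($matches as $m) {
        $pairs[$m[1]] = $m[2];
    }
    return $pairs;
}

var_export(extract_pairs($lookbehind, $text));
echo "\n";
var_export(extract_pairs($escapeAware, $text));
```

    The lookbehind version treats the real closing quote as escaped and runs on into the next attribute, so it reports a single `attr4` whose value swallows ` attr5=`. The escape-aware version returns both `attr4` and `attr5`.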