
R gsub special characters

  •  0
  • little girl  · 6 years ago

    I have a data frame. In one column I have the string

    "\t\tStatus: {\\id\\:\\d6b084be-9429-4b4b-8141-1cb5f5a84d2d\\,\\device\\:\\lge LG-H955 (z2_global_com)\\,\\result\\:\\1\\,\\script\\:[{\\timestamp\\:\\1519033801850\\,\\step\\:\\step1\\,\\answer\\:\\1\\},{\\timestamp\\:\\1519033879798\\,\\step\\:\\step2\\,\\answer\\:\\1\\}]},"
    

    I want to remove some special characters. The desired output is

    Status: {"id":"d6b084be-9429-4b4b-8141-1cb5f5a84d2d","device":"lge LG-H955 (z2_global_com)","result":"1","script":[{"timestamp":"1519033801850","step":"step1","answer":"1"},{"timestamp":"1519033879798","step":"step2","answer":"1"}]}
    

    I want to change every \\ to ", remove the \t\t from the start, and remove the first " and the last ," characters.

    I tried gsub, but it did not work correctly.

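    Update: thank you, it works! But I have one more problem. It is the same kind of problem, but more complicated, because there are many more special characters: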

    "Script": "{\t\"id\": \"hh-d6b084be-9429-4b4b-8141-1cb5f5a84d2d\",\t\t\t\t   \"version\": \"1.0.0\",\t\t\t\t\t\"start_step\": \"step0\",\t\t\t\t\t\"script\": [\t\t\t{\t\t\t\t\"id\": \"step0\",\t\t\t\t\"text\": \"hh?\",\t\t\t\t\"interaction\": null,\t\t\t\t\t\"options\": [\t\t\t\t{ \"id\": \"1\", \"text\": \"cc \", \"action\": { \"goto\": \"step1\" } },\t\t\t\t{ \"id\": \"2\", \"text\": \"hh \", \"action\": {\"goto\": \"step2\"} }\t\t\t\t]\t\t\t},\t\t\t{\t\t\t\t\"id\": \"step1\",\t\t\t\t\"text\": \"Chh?\",\t\t\t\t\"interaction\": null,\t\t\t\t\t\"options\": [\t\t\t\t{ \"id\": \"1\", \"text\": \"hhgo hh\", \"action\": { \"goto\": \"step3\" } },\t\t\t\t{ \"id\": \"2\", \"text\": \"jj\", \"action\": { \"goto\": \"step4\" } },\t\t\t\t{ \"id\": \"3\", \"text\": \"jj aa z jj jj \", \"action\": { \"goto\": \"step5\" } }\t\t\t\t]\t\t\t},\t\t\t{\t\t\t\t\"id\": \"step2\",\t\t\t\t\"text\": \"jjj\",\t\t\t\t\"interaction\": null,\t\t\t\t\t\"options\": [\t\t\t\t{ \"id\": \"1\", \"text\": \"jjjj\", \"action\": { \"deeplink\": \"pl://app/nn\" } },\t\t\t\t{ \"id\": \"2\", \"text\": \"jj\", \"action\": { \"deeplink\":\"pl://app/xx/nn/nn\" } },\t\t\t\t{ \"id\": \"3\", \"text\": \"nnn\", \"action\": { \"finish\": \"1\" } },\t\t\t\t{ \"id\": \"4\", \"text\": \"nn\", \"action\": { \"finish\": \"1\" } }\t\t\t\t]\t\t\t},\t\t\t{\t\t\t\t\"id\": \"step3\",\t\t\t\t\t\"text\": \"nnnel. 
<a href='https://www.dd.pl/dd/dd.pdf'>  </a>\",\t\t\t\"interaction\": null,\t\t\t\t\t\"options\": [\t\t\t\t{ \"id\": \"1\", \"text\": \"fff\", \"action\": { \"deeplink\": \"pl://app/nn/apply?nn=KG&nn=*\" } },\t\t\t\t{ \"id\": \"2\", \"text\": \"hhh\", \"action\": { \"goto\": \"step6\" } },\t\t\t\t{ \"id\": \"3\", \"text\": \"hh \", \"action\": { \"goto\": \"step7\" } }\t\t\t\t]\t\t\t},\t\t   {\t\t\t\t\"id\": \"step4\",\t\t\t\t\"text\": \"hh\",\t\t\t\t\"interaction\": null,\t\t\t\t\t\"options\": [\t\t\t\t{ \"id\": \"1\", \"text\": \"ff\", \"action\": { \"deeplink\": \"https://www.k.uk/hh/hh.html\" } },\t\t\t\t{ \"id\": \"2\", \"text\": \"ss\", \"action\": { \"deeplink\": \"pl://app/ddd\" } },\t\t\t\t{ \"id\": \"3\", \"text\": \"ss\", \"action\": { \"finish\": \"1\" } },\t\t\t\t{ \"id\": \"4\", \"text\": \"ss\", \"action\": { \"finish\": \"1\" } }\t\t\t\t]\t\t\t},\t\t\t{\t\t\t\t\"id\": \"step5\",\t\t\t\t\t\"text\": \"sss?\",\t\t\t\t\t\"interaction\":  {\t\t\t\t\"type\": \"poll\",\t\t\t\t\t\t\"data\": {\t\t\t\t\t\"minimum_checked\": \"1\",\t\t\t\t\t\t\t\"maximum_checked\": \"1\",\t\t\t\t\t\t\t\"fields\": [\t\t\t\t\t{ \"id\": \"1\", \"text\": \"fff\" },\t\t\t\t\t{ \"id\": \"2\", \"text\": \"ff ff\" },\t\t\t\t\t{ \"id\": \"3\", \"text\": \"ff\" }\t\t\t\t\t\t]\t\t\t\t}\t\t\t},\t\t\t\t\"options\": [\t\t\t\t{ \"id\": \"1\", \"text\": \"dd\", \"action\": { \"goto\": \"step8\" } }\t\t\t\t]\t\t\t},\t\t\t{\t\t\t\t\"id\": \"step6\",\t\t\t\t\"text\": \"fff dd ddd dd i dd ff.\",\t\t\t\t\"interaction\": null,\t\t\t\t\t\"options\": [\t\t\t\t{ \"id\": \"1\", \"text\": \"Ok, aa\", \"action\": { \"deeplink\": \"l://app/ff\"} },\t\t\t\t{ \"id\": \"2\", \"text\": \"dddd\", \"action\": { \"deeplink\": \"ff://app/contact/ff/ff\" } },\t\t\t\t{ \"id\": \"3\", \"text\": \"ddd\", \"action\": { \"finish\": \"1\" } },\t\t\t\t{ \"id\": \"4\", \"text\": \"ddd\", \"action\": { \"finish\": \"1\" } }\t\t\t\t]\t\t\t},\t\t\t{\t\t\t\t\"id\": \"step7\",\t\t\t\t\t\"text\": 
\"ddd\",\t\t\t\t\t\"interaction\": {\t\t\t\t\"type\": \"poll\",\t\t\t\t\t\t\"data\": {\t\t\t\t\t\"minimum_checked\": \"1\",\t\t\t\t\t\t\t\"maximum_checked\": \"3\",\t\t\t\t\t\t\t\"fields\": [\t\t\t\t\t{ \"id\": \"1\", \"text\": \"dddd\" },\t\t\t\t\t{ \"id\": \"2\", \"text\": \"Kssss\" },\t\t\t\t\t{ \"id\": \"3\", \"text\": \"ss ss\" }\t\t\t\t\t\t]\t\t\t\t}\t\t\t},\t\t\t\t\"options\": [\t\t\t\t{ \"id\": \"1\", \"text\": \"ss\", \"action\": { \"goto\": \"step9\" } }\t\t\t\t]\t\t\t},\t\t\t{\t\t\t\t\"id\": \"step8\",\t\t\t\t\"text\": \"sss.\",\t\t\t\t\"interaction\": null,\t\t\t\t\t\"options\": [\t\t\t\t{ \"id\": \"1\", \"text\": \"Ok, aaa\", \"action\": { \"deeplink\": \"ss://app/call\" } },\t\t\t\t{ \"id\": \"2\", \"text\": \"sss\", \"action\": { \"deeplink\": \"ss://app/ss/ss/chat\" } },\t\t\t\t{ \"id\": \"3\", \"text\": \"ssss\", \"action\": { \"finish\": \"1\" } },\t\t\t\t{ \"id\": \"4\", \"text\": \"ss\", \"action\": { \"finish\": \"1\" } }\t\t\t\t]\t\t\t},\t\t\t{\t\t\t\t\"id\": \"step9\",\t\t\t\t\"text\": \"ss.\",\t\t\t\t\"interaction\": null,\t\t\t\t\t\"options\": [\t\t\t\t{ \"id\": \"1\", \"text\": \"Ok, aa\", \"action\": { \"deeplink\": \"ss://app/ss\" } },\t\t\t\t{ \"id\": \"2\", \"text\": \"ss\", \"action\": { \"deeplink\": \"ss://app/ss/cvc/ss\" } },\t\t\t\t{ \"id\": \"3\", \"text\": \"ss\", \"action\": { \"finish\": \"1\" } },\t\t\t\t{ \"id\": \"4\", \"text\": \"aaa\", \"action\": {\"finish\": \"1\" } }\t\t\t\t]\t\t\t}\t   ]\t}"
    

    When I try to do this with the same code from DJack's answer

    text <- gsub("\\\\",'"', gsub("\t|,$","", text))
    

    the result looks like this

    "Script": "{    \"id\": \"d6b084be-9429-4b4b-8141-1cb5f5a84d2d\",    \"start_step\": \"step1\",    \"script\": [\t{            \"id\": \"step1\",            \"text\": \"ggg\",            \"interaction\": null,            \"options\": [{ \t\t\t\t\t\"id\": \"1\",\t\t\t\t\t\"text\": \"gg\",\t\t\t\t\t\"action\": {\t\t\t\t\t\t\"goto\": \"step2\"\t\t\t\t\t}\t\t\t\t}, { \t\t\t\t\t\"id\": \"2\",\t\t\t\t\t\"text\": \"gg\",\t\t\t\t\t\"action\": {\t\t\t\t\t\t\"goto\": \"step3\"\t\t\t\t\t}\t\t\t\t}, { \t\t\t\t\t\"id\": \"3\",\t\t\t\t\t\"text\": \"gg\",\t\t\t\t\t\"action\": {\t\t\t\t\t\"goto\": \"step4\"\t\t\t\t\t}\t\t\t\t}            ]       },\t\t{            \"id\": \"step2\",            \"text\": \"gg?\",            \"interaction\": null,            \"options\": [{ \t\t\t\t\t\"id\": \"1\",\t\t\t\t\t\"text\": \"gg\",\t\t\t\t\t\"action\": {\t\t\t\t\t\t\"deeplink\": \"gg://app/gg/apply?type=KG&gg=*\",\t\t\t\t\t\t\"finish\": \"1\",\t\t\t\t\t\t\"goto\": \"step10\"\t\t\t\t\t}\t\t\t\t\t},{ \t\t\t\t\t\"id\": \"2\",\t\t\t\t\t\"text\": \"gg z gg\",\t\t\t\t\t\"action\": {\t\t\t\t\t\t\"deeplink\": \"ww://app/ww\",\t\t\t\t\t\t\"finish\": \"1\",\t\t\t\t\t\t\"goto\": \"step10\"\t\t\t\t\t}\t\t\t\t\t},{ \t\t\t\t\t\"id\": \"3\",\t\t\t\t\t\"text\": \"ww gg\",\t\t\t\t\t\"action\": {\t\t\t\t\t\t\"deeplink\": \"dd://app/aa/cvc/aaa\",\t\t\t\t\t\t\"finish\": \"1\",\t\t\t\t\t\t\"goto\": \"step10\"\t\t\t\t\t}\t\t\t\t}            ]       },{            \"id\": \"step3\",            \"text\": \"ggg\",            \"interaction\": null,            \"options\": [{ \t\t\t\t\t\"id\": \"1\",\t\t\t\t\t\"text\": \"ww\",\t\t\t\t\t\"action\": {\t\t\t\t\t\t\"deeplink\": \"ww://app/ww/apply?type=KG&dd=*\",\t\t\t\t\t\t\"finish\": \"1\",\t\t\t\t\t\t\"goto\": \"step10\"\t\t\t\t\t}\t\t\t\t\t},{ \t\t\t\t\t\"id\": \"2\",\t\t\t\t\t\"text\": \"aaa\",\t\t\t\t\t\"action\": {\t\t\t\t\t\t\"deeplink\": \"ww://app/c2c\",\t\t\t\t\t\t\"finish\": \"1\",\t\t\t\t\t\t\"goto\": \"step10\"\t\t\t\t\t}\t\t\t\t\t},{ \t\t\t\t\t\"id\": 
\"3\",\t\t\t\t\t\"text\": \"aaa\",\t\t\t\t\t\"action\": {\t\t\t\t\t\t\"deeplink\": \"dd://app/aa/ss/ss\",\t\t\t\t\t\t\"finish\": \"1\",\t\t\t\t\t\t\"goto\": \"step10\"\t\t\t\t\t}\t\t\t\t}            ]       },{            \"id\": \"step4\",            \"text\": \"ddd\",            \"interaction\": null,            \"options\": [{ \t\t\t\t\t\"id\": \"1\",\t\t\t\t\t\"text\": \"ss\",\t\t\t\t\t\"action\": {\t\t\t\t\t\t\"deeplink\": \"dd://app/oneclick/apply?type=KG&profile=*\",\t\t\t\t\t\t\"finish\": \"1\",\t\t\t\t\t\t\"goto\": \"step10\"\t\t\t\t\t}\t\t\t\t\t},{ \t\t\t\t\t\"id\": \"2\",\t\t\t\t\t\"text\": \"sss\",\t\t\t\t\t\"action\": {\t\t\t\t\t\t\"deeplink\": \"dd://app/dd\",\t\t\t\t\t\t\"finish\": \"1\",\t\t\t\t\t\t\"goto\": \"step10\"\t\t\t\t\t}\t\t\t\t\t},{ \t\t\t\t\t\"id\": \"3\",\t\t\t\t\t\"text\": \"aaa\",\t\t\t\t\t\"action\": {\t\t\t\t\t\t\"deeplink\": \"dd://app/aa/cvc/aa\",\t\t\t\t\t\t\"finish\": \"1\",\t\t\t\t\t\t\"goto\": \"step10\"\t\t\t\t\t}\t\t\t\t}            ]       },\t\t{\t\t\t\"id\": \"step10\",\t\t\t\"text\": \"aaa\",\t\t\t\"interaction\": null,\t\t\t\"options\": null\t\t}]}"
    

    When I try this

    fromJSON(substr(text, 9, nchar(text)))
    

    I get an error

    Error: lexical error: invalid char in json text.
                "script": ["t{            "id": "step1",            "text"
                         (right here) ------^
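The `"t{` in the error message suggests that part of the string holds a literal backslash followed by `t` (two characters) rather than a real tab. `gsub("\t", ...)` removes only real tabs, and the blanket backslash replacement then turns the leftover `\t` into `"t`. Below is a minimal sketch of one way around this, assuming the quotes in this second string are already real `"` characters, so no backslash-to-quote replacement is needed. `strip_tabs` is a hypothetical helper name, and the short `text` value is a stand-in for the long string.

```r
# Hypothetical helper: remove both kinds of "tab" without touching the quotes.
strip_tabs <- function(x) {
  x <- gsub("\\t", "", x, fixed = TRUE)  # literal backslash + "t" (two characters)
  gsub("\t", "", x, fixed = TRUE)        # real tab characters
}

# Stand-in for the long string: real tabs plus one literal backslash-t pair.
text <- "{\t\"script\": [\\t{\t\t\"id\": \"step1\"}]}"
cleaned <- strip_tabs(text)
cat(cleaned, "\n")
```

Whether the full string then parses depends on the rest of its content, which this sketch does not check. It would also strip a legitimate backslash followed by `t` inside a value.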
    
    2 replies  |  as of 6 years ago
        1
  •  2
  •   DJack    6 years ago

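    As mentioned in the comments, I don't know what you mean by "remove the first and last " characters". Those quotes are not part of the string; they only indicate the data type (character). Here is a solution (it uses ' instead of ", which R treats as equivalent string delimiters):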

    text <- "\t\tStatus: {\\id\\:\\d6b084be-9429-4b4b-8141-1cb5f5a84d2d\\,\\device\\:\\lge LG-H955 (z2_global_com)\\,\\result\\:\\1\\,\\script\\:[{\\timestamp\\:\\1519033801850\\,\\step\\:\\step1\\,\\answer\\:\\1\\},{\\timestamp\\:\\1519033879798\\,\\step\\:\\step2\\,\\answer\\:\\1\\}]},"
    
    text <- gsub("\\\\","'", gsub("\t|,$","", text))
    
    text
    
    "Status: {'id':'d6b084be-9429-4b4b-8141-1cb5f5a84d2d','device':'lge LG-H955 (z2_global_com)','result':'1','script':[{'timestamp':'1519033801850','step':'step1','answer':'1'},{'timestamp':'1519033879798','step':'step2','answer':'1'}]}"
    

    Edit following Dienow's answer

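    If you are looking for valid JSON, as in Dienow's answer (I am not familiar with this format), note that the fromJSON function requires " rather than '. You can therefore adapt the code to: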

    text <- gsub("\\\\",'"', gsub("\t|,$","", text))
    

    This gives the same result as Dienow's answer:

    library(jsonlite)
    fromJSON(substr(text, 9, nchar(text)))
    
    $id
    [1] "d6b084be-9429-4b4b-8141-1cb5f5a84d2d"
    
    $device
    [1] "lge LG-H955 (z2_global_com)"
    
    $result
    [1] "1"
    
    $script
          timestamp  step answer
    1 1519033801850 step1      1
    2 1519033879798 step2      1
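The nested `gsub()` call in this answer can be easier to follow when split into separate steps. The sketch below performs the same substitutions one at a time, assuming the only unwanted characters are real tabs, one trailing comma, and single backslashes. `clean_status` and its `quote_char` argument are hypothetical names, and `fixed = TRUE` is used so that the backslash needs no regex escaping.

```r
# Hypothetical helper: same substitutions as the one-liner, one step at a time.
clean_status <- function(x, quote_char = "\"") {
  x <- gsub("\t", "", x, fixed = TRUE)     # drop real tab characters
  x <- sub(",$", "", x)                    # drop one trailing comma, if present
  gsub("\\", quote_char, x, fixed = TRUE)  # "\\" is a single literal backslash
}

# gsub() and sub() are vectorised, so the helper works on a whole column.
# The two short strings are stand-ins for the real data.
df <- data.frame(
  raw = c("\t\tStatus: {\\id\\:\\1\\},", "\tStatus: {\\id\\:\\2\\},"),
  stringsAsFactors = FALSE
)
df$clean <- clean_status(df$raw)
cat(df$clean, sep = "\n")
```

Passing `quote_char = "'"` reproduces the single-quote variant shown first in this answer.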
    
        2
  •  1
  •   Dienow    6 years ago
    s <- "\t\tStatus: {\\id\\:\\d6b084be-9429-4b4b-8141-1cb5f5a84d2d\\,\\device\\:\\lge LG-H955 (z2_global_com)\\,\\result\\:\\1\\,\\script\\:[{\\timestamp\\:\\1519033801850\\,\\step\\:\\step1\\,\\answer\\:\\1\\},{\\timestamp\\:\\1519033879798\\,\\step\\:\\step2\\,\\answer\\:\\1\\}]},"
    r <- gsub("\t", "", gsub("\\\\", "\"",s))
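This version differs from DJack's only in where the trailing comma is dropped: here it is left in `r` and has to be cut off later with `substr()`. A small sketch comparing the two orders on a shortened stand-in string (the value of `s` below is an assumption made for brevity, not the original data):

```r
s <- "\t\tStatus: {\\id\\:\\1\\},"   # shortened stand-in for the full string

# DJack's order: strip tabs and the trailing comma first, then swap backslashes.
a <- gsub("\\\\", "\"", gsub("\t|,$", "", s))
a <- substr(a, 9, nchar(a))

# This answer's order: swap backslashes, strip tabs, drop the comma via substr().
r <- gsub("\t", "", gsub("\\\\", "\"", s))
b <- substr(r, 9, nchar(r) - 1)

identical(a, b)  # both leave only the JSON object
```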
    

    Here is proof that the result is valid JSON:

    library(jsonlite)
    fromJSON(substr(r, 9, nchar(r) - 1))
    

    This outputs:

    $id
    [1] "d6b084be-9429-4b4b-8141-1cb5f5a84d2d"
    
    $device
    [1] "lge LG-H955 (z2_global_com)"
    
    $result
    [1] "1"
    
    $script
          timestamp  step answer
    1 1519033801850 step1      1
    2 1519033879798 step2      1
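Both answers call `substr()` starting at position 9, which works only because the prefix `Status: ` is exactly eight characters long. Below is a minimal sketch of a less position-dependent alternative, assuming each value consists of some label followed by a JSON object that begins at the first `{`. `extract_json` is a hypothetical helper name, and the short `text` value is a stand-in for the cleaned string.

```r
# Hypothetical helper: drop everything before the first "{".
# If a string contains no "{", the result is an empty string.
extract_json <- function(x) {
  sub("^[^{]*", "", x)
}

text <- "Status: {\"id\":\"1\",\"script\":[{\"step\":\"step1\"}]}"
json <- extract_json(text)
cat(json, "\n")

# If jsonlite is installed, the result can be checked and parsed.
if (requireNamespace("jsonlite", quietly = TRUE)) {
  print(jsonlite::validate(json))
  parsed <- jsonlite::fromJSON(json)
}
```

The same helper would also cope with a different label in front of the object, since it does not depend on the label's length.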