
How to simulate a Unicode JS regular expression in Lucee

  • Lance  · 6 years ago

    This is the JS:

    function charTest(k){
        var regexp = /^[\u00C0-\u00ff\s -\~]+$/;
        return regexp.test(k)
    }
    
    if(!charTest(thisKey)){
        alert("Please Use Latin Characters Only");
        return false;
    }
    

    This is what I tried in Lucee:

    regexp = '[\u00C0-\u00ff\s -\~]+/';
    writeDump(reFind(regexp,"测"));
    writeDump(reFind(regexp,"test"));
    

    I also tried:

     regexp = "[\\p{L}]";
    

    But the dump is always 0.

    1 answer

  • Shawn  · 6 years ago

    More in a sec. Your original JS regex is: "/^[\u00C0-\u00ff\s -\~]+$/"

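    Basic parts of the regex:
    "/.../"  == delimits the start and end of the regex literal.
    "^"      == anchors the match to the start of the string. (It only means
                NOT when it is the first character inside the brackets: "[^...]".)
    "[...]"  == a character class: matches any single character listed in it.
    "+"      == signifies one or more of the previous item.
    "$"      == anchors the match to the end of the string.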
    
    Identifiers in the regex:
    "\u00c0-\u00ff" == Unicode character range from character 192 (À)
                       to character 255 (ÿ). These are the letters of the
                       Latin-1 Supplement block (plus × at 215 and ÷ at 247).
    "\s" == signifies any whitespace character (space, tab, line feed, and so on).
    " -\~" == signifies a range from the space character to the (escaped)
              tilde character (~). This is ASCII 32-126, the printable
              ASCII characters (everything up to, but not including, the DEL
              character at 127). This includes alphanumerics and ASCII punctuation.
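
To make the difference between the `^` anchor and a negated class concrete, here is a small JavaScript sketch. The names `wholeMatch` and `hasOutsider` are made up for illustration; the first pattern is the one from the question and the second is its negated counterpart.

```javascript
// Pattern from the question: the whole string must consist of allowed characters.
const wholeMatch = /^[\u00C0-\u00FF\s -~]+$/;

// Negated class: matches a single character that is NOT allowed.
const hasOutsider = /[^\u00C0-\u00FF\s -~]/;

// " -~" spans code points 32 (space) through 126 (tilde).
const spaceCode = " ".charCodeAt(0);
const tildeCode = "~".charCodeAt(0);

// \s covers more than the space character: a tab matches too.
const tabIsWhitespace = /\s/.test("\t");

const results = {
  ascii: wholeMatch.test("test"),          // only ASCII letters
  latin1: wholeMatch.test("\u00C0 \u00F0"), // À, space, ð
  cjk: wholeMatch.test("\u6D4B"),           // 测 is outside every range
  cjkOutsider: hasOutsider.test("\u6D4B"),  // the negated class finds it
  tab: wholeMatch.test("a\tb"),             // accepted only because of \s
};
console.log(spaceCode, tildeCode, tabIsWhitespace, results);
```

Note that the tab case passes only because `\s` is in the class; a class built from the two numeric ranges alone would reject tabs and line breaks.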
    

    You can try this:

    <cfscript>
    //http://www.asciitable.com/
    //https://en.wikipedia.org/wiki/List_of_Unicode_characters
    //https://en.wikipedia.org/wiki/Latin_script_in_Unicode
    
    
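    function charTest(k) {
      // The negated class matches any character outside ASCII 32-126
      // and Latin-1 192-255. reFind returns 0 when no such character exists.
      return reFind("[^"
          & chr(32) & "-" & chr(126)
          & chr(192) & "-" & chr(255)
          & "]", arguments.k)
        ? "Please Use Latin Characters Only"
        : "";
    }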
    
    
    // TESTS
    writeDump(charTest("测")); // Not Latin
    writeDump(charTest("test")); // All characters between 32 & 126
    writeDump(charTest("À")); // Character 192 (in range)
    writeDump(charTest("À ")); // Character 192 and Space
    writeDump(charTest("     ")); // Space Characters
    writeDump(charTest("12345")); // Digits ( character 48-57 )
    writeDump(charTest("ð")); // Character 240 (in range) 
    writeDump(charTest("ℿ")); // Character 8511 (outside range)
    writeDump(charTest(chr(199))); // CF Character (in range)
    writeDump(charTest(chr(10))); // CF Line Feed Character (outside range)
    writeDump(charTest(chr(1000))); // CF Character (outside range)
    
    writeDump(charTest("
    ")); // CRLF (outside range)
    
    writeDump(charTest(URLDecode("%00", "utf-8"))); // CF Null character (outside range)
    
    //writeDump(asc("测"));
    //writeDump(asc("test"));
    //writeDump(asc("À"));
    //writeDump(asc("ð"));
    //writeDump(asc("ℿ"));
    </cfscript>
    

    https://trycf.com/gist/05d27baaed2b8fc269f90c7c80a1aa82/lucee5?theme=monokai

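    All the regex does is scan the input string: if it finds any character that is not between chr(32) and chr(126) or between chr(192) and chr(255), the function returns your chosen message; otherwise it returns an empty string.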
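
For comparison, the same negated-class logic can be sketched in JavaScript. The function name `latinMessage` is made up for this example, and `String.fromCharCode` stands in for CFML's `chr()`; treat it as an illustration of the approach rather than a port that has been run against Lucee.

```javascript
// Build the character class from character codes, the way the CFML builds it with chr().
const allowed =
  String.fromCharCode(32) + "-" + String.fromCharCode(126) +  // printable ASCII: space to ~
  String.fromCharCode(192) + "-" + String.fromCharCode(255);  // Latin-1 letters: À to ÿ

// Matches one character that falls outside both ranges.
const disallowed = new RegExp("[^" + allowed + "]");

// Returns the message when any character is outside the allowed ranges, else "".
function latinMessage(k) {
  return disallowed.test(k) ? "Please Use Latin Characters Only" : "";
}

console.log(JSON.stringify(latinMessage("\u6D4B"))); // 测: outside both ranges
console.log(JSON.stringify(latinMessage("test")));   // plain ASCII
console.log(JSON.stringify(latinMessage("\n")));     // line feed is code 10, below 32
```

Because this class has no `\s`, a line feed or tab counts as a disallowed character here, which is stricter than the original JS pattern.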

    I think you can reference Unicode characters up to 255 directly. I'd have to test that.
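
As a quick illustration of that point on the JavaScript side (not CFML, where it still needs testing), a code point up to 255 can be written as an escape or built from its number, and both give the same one-character string:

```javascript
// Two ways to write character 192 (À) in JavaScript.
const escaped = "\u00C0";
const fromCode = String.fromCharCode(192);

// Both forms produce the same single-character string.
const same = escaped === fromCode;
console.log(same, escaped.length, escaped.charCodeAt(0));
```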

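    Do you need this function to alert the way the JavaScript does? If so, just output 1 or 0 to indicate whether the function actually found the characters it is looking for.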
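
If a 1 or 0 result is all that's needed, a sketch along these lines might do. It is written in JavaScript to match the question's original function, and `isLatinOnly` is a hypothetical name. One thing to watch: the question's `^[...]+$` pattern rejects an empty string because `+` needs at least one character, while a negated-class search finds nothing wrong with it.

```javascript
// Positive form from the question: every character must be in the class.
const positive = /^[\u00C0-\u00FF\s -~]+$/;
// Negated form: look for one character outside the same class.
const negative = /[^\u00C0-\u00FF\s -~]/;

// Returns 1 when the input is acceptable, 0 otherwise (negated-class version).
function isLatinOnly(k) {
  return negative.test(k) ? 0 : 1;
}

// Compare both forms on a few samples, including the empty string.
for (const sample of ["test", "\u00C0", "\u6D4B", ""]) {
  console.log(JSON.stringify(sample), isLatinOnly(sample), positive.test(sample) ? 1 : 0);
}
```

The two forms agree on every non-empty input, so the empty string is the only case where the caller has to decide which behavior is wanted.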
