How to make toLowerCase() and toUpperCase() consistent across browsers

  •  3
  • Dan McGrath  · 5 years ago

    Background

    The following produces different results across browsers, and even across versions of the same browser (for example, Firefox 54 and 55):

    document.write(String.fromCodePoint(223).normalize("NFKC").toLowerCase().toUpperCase().toLowerCase())
    

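    In Firefox 55 it gives you ss; in Firefox 54 it gives you ß.

    In general this is fine, and mechanisms such as locale handle many of the cases you care about. However, it becomes a problem when you need consistent behavior across platforms, for example when talking to a BaaS system.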

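    1 answer
  •  3
  •   Mathias Bynens    5 years ago

    Note that this issue seems to affect only outdated versions of Firefox, so unless you explicitly need to support those old versions, you may not need a workaround at all. The current behavior of the major engines can be checked using jsvu + eshost: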

    $ jsvu # Update installed JavaScript engine binaries to the latest version.
    
    $ eshost -e '"\xDF".normalize("NFKC").toLowerCase().toUpperCase().toLowerCase()'
    #### Chakra
    ss
    
    #### V8 --harmony
    ss
    
    #### JavaScriptCore
    ss
    
    #### V8
    ss
    
    #### SpiderMonkey
    ss
    
    #### xs
    ss
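
    If old browser versions do matter to you, the same expression can serve as a runtime check. The following is a minimal sketch; the helper name hasSpecCompliantSharpSCasing is made up for illustration and is not part of the original answer.

```javascript
// Hypothetical helper (not from the original answer): runs the test case
// from the question and reports whether the engine yields the result that
// all engines in the eshost output above agree on.
function hasSpecCompliantSharpSCasing() {
  const result = '\xDF'
    .normalize('NFKC')
    .toLowerCase()
    .toUpperCase()
    .toLowerCase();
  return result === 'ss';
}

// Logs true in engines that apply the special casing for U+00DF.
console.log(hasSpecCompliantSharpSCasing());
```

    A workaround would then only need to be applied when this check returns false.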
    

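    https://tc39.github.io/ecma262/#sec-string.prototype.tolowercase states:

    Let cuList be a List where the elements are the result of toLowercase(cpList), according to the Unicode Default Case Conversion algorithm.

    The Unicode Default Case Conversion algorithm is specified in section 3.13 Default Case Algorithms of the Unicode standard:

    The full case mappings for Unicode characters are obtained by using the mappings from SpecialCasing.txt plus the mappings from UnicodeData.txt, excluding any of the latter mappings that would conflict. Any character that does not have a mapping in these files is considered to map to itself.

    […]

    The following rules specify the default case conversion operations for Unicode strings. These rules use the full case conversion operations, Uppercase_Mapping(C), Lowercase_Mapping(C), and Titlecase_Mapping(C) […]

    For a string X:

    • R1 toUppercase(X): Map each character C in X to Uppercase_Mapping(C).
    • R2 toLowercase(X): Map each character C in X to Lowercase_Mapping(C).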
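
    Rules R1 and R2 can be sketched in a few lines of JavaScript. This is only an illustration: applyFullMapping and the two-entry sample table are made-up names, and context-dependent mappings are not handled.

```javascript
// Sketch of rules R1/R2: map each character through a mapping table.
// Characters without an entry are considered to map to themselves.
// `applyFullMapping` is an illustrative name, not an existing API.
function applyFullMapping(string, mapping) {
  // Array.from iterates by code point, so astral characters stay intact.
  return Array.from(string, (character) =>
    mapping.has(character) ? mapping.get(character) : character
  ).join('');
}

// Tiny sample table; a real one would be built from the Unicode data files.
const sampleUppercaseMapping = new Map([
  ['\xDF', 'SS'], // full mapping, as listed in SpecialCasing.txt
  ['a', 'A'],     // simple mapping, as listed in UnicodeData.txt
]);

console.log(applyFullMapping('\xDFa', sampleUppercaseMapping)); // → 'SSA'
console.log(applyFullMapping('?', sampleUppercaseMapping));     // → '?'
```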

    Here is an example from SpecialCasing.txt, with my annotation added below:

    00DF  ; 00DF   ; 0053 0073; 0053 0053;                      # LATIN SMALL LETTER SHARP S
    <code>; <lower>; <title>  ; <upper>  ; (<condition_list>;)? # <comment>
    

    This line says that U+00DF ('ß') lowercases to U+00DF (ß) and uppercases to U+0053 U+0053 (SS).
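
    A line in this format can be split into its fields with a short function. The sketch below assumes the field layout shown in the annotation above; parseSpecialCasingLine and the property names of the returned object are chosen for illustration.

```javascript
// Illustrative parser for one line of SpecialCasing.txt.
// Layout: <code>; <lower>; <title>; <upper>; (<condition_list>;)? # <comment>
function parseSpecialCasingLine(line) {
  const data = line.split('#')[0].trim(); // drop the trailing comment
  if (data === '') return null; // blank or comment-only line
  const fields = data.split(';').map((field) => field.trim());
  // Convert a space-separated list of hex code points, e.g. '0053 0053', to 'SS'.
  const hexListToString = (hexList) =>
    hexList === ''
      ? ''
      : String.fromCodePoint(
          ...hexList.split(/\s+/).map((hex) => parseInt(hex, 16))
        );
  return {
    character: hexListToString(fields[0]),
    lower: hexListToString(fields[1]),
    title: hexListToString(fields[2]),
    upper: hexListToString(fields[3]),
    // Non-empty only for locale- or context-sensitive entries.
    conditions: fields[4] || '',
  };
}

const sharpS = parseSpecialCasingLine(
  '00DF; 00DF; 0053 0073; 0053 0053; # LATIN SMALL LETTER SHARP S'
);
console.log(sharpS.upper); // → 'SS'
```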

    Here is an example from UnicodeData.txt, with my annotation added below:

    0041  ; LATIN CAPITAL LETTER A; Lu;0;L;;;;;N;;;; 0061   ;
    <code>; <name>                ; <ignore>       ; <lower>; <upper>
    

    This line says that U+0041 ('A') lowercases to U+0061 ('a'). It has no explicit uppercase mapping, meaning it uppercases to itself.

    Here is another example from UnicodeData.txt:

    0061  ; LATIN SMALL LETTER A; Ll;0;L;;;;;N;; ;0041;        ; 0041
    <code>; <name>              ; <ignore>            ; <lower>; <upper>
    

    This line says that U+0061 ('a') uppercases to U+0041 ('A'). It has no explicit lowercase mapping, meaning it lowercases to itself.
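
    The two UnicodeData.txt lines can be read the same way. In the raw file the fields are separated by semicolons, and the last three (zero-based indices 12, 13 and 14) hold the simple uppercase, lowercase and titlecase mappings. The function name parseUnicodeDataLine is an assumption made for this sketch.

```javascript
// Illustrative parser for one line of UnicodeData.txt.
// Zero-based field indices: 0 = code point, 1 = name,
// 12 = simple uppercase, 13 = simple lowercase, 14 = simple titlecase.
function parseUnicodeDataLine(line) {
  const fields = line.split(';').map((field) => field.trim());
  // An empty field means "no explicit mapping" and is reported as null.
  const toCharacter = (hex) =>
    hex ? String.fromCodePoint(parseInt(hex, 16)) : null;
  return {
    character: toCharacter(fields[0]),
    name: fields[1],
    upper: toCharacter(fields[12]),
    lower: toCharacter(fields[13]),
    title: toCharacter(fields[14]),
  };
}

const capitalA = parseUnicodeDataLine(
  '0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;'
);
const smallA = parseUnicodeDataLine(
  '0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041'
);
console.log(capitalA.lower, capitalA.upper); // → 'a' null
console.log(smallA.upper, smallA.lower);     // → 'A' null
```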

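    You could write a script that parses these two files, reads each line as in the examples above, and builds lowercase/uppercase mappings. You could then turn those mappings into a small JavaScript library that provides spec-compliant toLowerCase / toUpperCase functionality.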
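
    Such a script could look roughly like the sketch below. It takes the text of both files as strings, so downloading or reading them is left out. buildCaseMapping is a hypothetical name, and skipping conditional SpecialCasing.txt entries is a simplifying assumption of this sketch.

```javascript
// Sketch: build a full lowercase or uppercase mapping from both files.
// `kind` is 'lower' or 'upper'. Names and structure are illustrative only.
function buildCaseMapping(specialCasingText, unicodeDataText, kind) {
  const hexListToString = (hexList) =>
    hexList.trim() === ''
      ? ''
      : String.fromCodePoint(
          ...hexList.trim().split(/\s+/).map((hex) => parseInt(hex, 16))
        );
  const mapping = new Map();

  // 1. Full mappings from SpecialCasing.txt (<code>; <lower>; <title>; <upper>).
  const specialIndex = kind === 'lower' ? 1 : 3;
  for (const line of specialCasingText.split('\n')) {
    const data = line.split('#')[0].trim();
    if (data === '') continue; // blank or comment-only line
    const fields = data.split(';').map((field) => field.trim());
    if (fields[4]) continue; // assumption: ignore locale/context-specific entries
    mapping.set(hexListToString(fields[0]), hexListToString(fields[specialIndex]));
  }

  // 2. Simple mappings from UnicodeData.txt, excluding any that would
  //    conflict with an entry already taken from SpecialCasing.txt.
  const simpleIndex = kind === 'lower' ? 13 : 12;
  for (const line of unicodeDataText.split('\n')) {
    const fields = line.split(';');
    if (fields.length < 15 || fields[simpleIndex].trim() === '') continue;
    const character = hexListToString(fields[0]);
    if (!mapping.has(character)) {
      mapping.set(character, hexListToString(fields[simpleIndex]));
    }
  }
  return mapping;
}

// Two-line excerpts standing in for the real files.
const specialCasingSample =
  '00DF; 00DF; 0053 0073; 0053 0053; # LATIN SMALL LETTER SHARP S\n';
const unicodeDataSample =
  '0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;\n' +
  '0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041\n';

const upperMapping = buildCaseMapping(specialCasingSample, unicodeDataSample, 'upper');
const lowerMapping = buildCaseMapping(specialCasingSample, unicodeDataSample, 'lower');
console.log(upperMapping.get('\xDF')); // → 'SS'
console.log(lowerMapping.get('A'));    // → 'a'
```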

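    That is a lot of work. Depending on the old Firefox behavior, you can probably limit the work to just the special mappings in SpecialCasing.txt: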

    // Instead of…
    function normalize(string) {
      const normalized = string.normalize('NFKC');
      const lowercased = normalized.toLowerCase();
      return lowercased;
    }
    
    // …one could do something like:
    function lowerCaseSpecialCases(string) {
      // TODO: replace all SpecialCasing.txt characters with their lowercase
      // mapping.
      return string.replace(/TODO/g, fn);
    }
    function normalize(string) {
      const normalized = string.normalize('NFKC');
      const fixed = lowerCaseSpecialCases(normalized); // Workaround for old Firefox 54 behavior.
      const lowercased = fixed.toLowerCase();
      return lowercased;
    }
    

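    I wrote a script that parses SpecialCasing.txt and generates a JavaScript library implementing the lowerCaseSpecialCases functionality mentioned above (as toLower) as well as toUpper: https://gist.github.com/mathiasbynens/a37e3f3138069729aa434ea90eea4a3c Depending on your exact use case, you may not need toUpper and its corresponding regular expression and map at all. Here is the full generated library: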

    const reToLower = /[\u0130\u1F88-\u1F8F\u1F98-\u1F9F\u1FA8-\u1FAF\u1FBC\u1FCC\u1FFC]/g;
    const toLowerMap = new Map([
      ['\u0130', 'i\u0307'],
      ['\u1F88', '\u1F80'],
      ['\u1F89', '\u1F81'],
      ['\u1F8A', '\u1F82'],
      ['\u1F8B', '\u1F83'],
      ['\u1F8C', '\u1F84'],
      ['\u1F8D', '\u1F85'],
      ['\u1F8E', '\u1F86'],
      ['\u1F8F', '\u1F87'],
      ['\u1F98', '\u1F90'],
      ['\u1F99', '\u1F91'],
      ['\u1F9A', '\u1F92'],
      ['\u1F9B', '\u1F93'],
      ['\u1F9C', '\u1F94'],
      ['\u1F9D', '\u1F95'],
      ['\u1F9E', '\u1F96'],
      ['\u1F9F', '\u1F97'],
      ['\u1FA8', '\u1FA0'],
      ['\u1FA9', '\u1FA1'],
      ['\u1FAA', '\u1FA2'],
      ['\u1FAB', '\u1FA3'],
      ['\u1FAC', '\u1FA4'],
      ['\u1FAD', '\u1FA5'],
      ['\u1FAE', '\u1FA6'],
      ['\u1FAF', '\u1FA7'],
      ['\u1FBC', '\u1FB3'],
      ['\u1FCC', '\u1FC3'],
      ['\u1FFC', '\u1FF3']
    ]);
    const toLower = (string) => string.replace(reToLower, (match) => toLowerMap.get(match));
    
    const reToUpper = /[\xDF\u0149\u01F0\u0390\u03B0\u0587\u1E96-\u1E9A\u1F50\u1F52\u1F54\u1F56\u1F80-\u1FAF\u1FB2-\u1FB4\u1FB6\u1FB7\u1FBC\u1FC2-\u1FC4\u1FC6\u1FC7\u1FCC\u1FD2\u1FD3\u1FD6\u1FD7\u1FE2-\u1FE4\u1FE6\u1FE7\u1FF2-\u1FF4\u1FF6\u1FF7\u1FFC\uFB00-\uFB06\uFB13-\uFB17]/g;
    const toUpperMap = new Map([
      ['\xDF', 'SS'],
      ['\uFB00', 'FF'],
      ['\uFB01', 'FI'],
      ['\uFB02', 'FL'],
      ['\uFB03', 'FFI'],
      ['\uFB04', 'FFL'],
      ['\uFB05', 'ST'],
      ['\uFB06', 'ST'],
      ['\u0587', '\u0535\u0552'],
      ['\uFB13', '\u0544\u0546'],
      ['\uFB14', '\u0544\u0535'],
      ['\uFB15', '\u0544\u053B'],
      ['\uFB16', '\u054E\u0546'],
      ['\uFB17', '\u0544\u053D'],
      ['\u0149', '\u02BCN'],
      ['\u0390', '\u0399\u0308\u0301'],
      ['\u03B0', '\u03A5\u0308\u0301'],
      ['\u01F0', 'J\u030C'],
      ['\u1E96', 'H\u0331'],
      ['\u1E97', 'T\u0308'],
      ['\u1E98', 'W\u030A'],
      ['\u1E99', 'Y\u030A'],
      ['\u1E9A', 'A\u02BE'],
      ['\u1F50', '\u03A5\u0313'],
      ['\u1F52', '\u03A5\u0313\u0300'],
      ['\u1F54', '\u03A5\u0313\u0301'],
      ['\u1F56', '\u03A5\u0313\u0342'],
      ['\u1FB6', '\u0391\u0342'],
      ['\u1FC6', '\u0397\u0342'],
      ['\u1FD2', '\u0399\u0308\u0300'],
      ['\u1FD3', '\u0399\u0308\u0301'],
      ['\u1FD6', '\u0399\u0342'],
      ['\u1FD7', '\u0399\u0308\u0342'],
      ['\u1FE2', '\u03A5\u0308\u0300'],
      ['\u1FE3', '\u03A5\u0308\u0301'],
      ['\u1FE4', '\u03A1\u0313'],
      ['\u1FE6', '\u03A5\u0342'],
      ['\u1FE7', '\u03A5\u0308\u0342'],
      ['\u1FF6', '\u03A9\u0342'],
      ['\u1F80', '\u1F08\u0399'],
      ['\u1F81', '\u1F09\u0399'],
      ['\u1F82', '\u1F0A\u0399'],
      ['\u1F83', '\u1F0B\u0399'],
      ['\u1F84', '\u1F0C\u0399'],
      ['\u1F85', '\u1F0D\u0399'],
      ['\u1F86', '\u1F0E\u0399'],
      ['\u1F87', '\u1F0F\u0399'],
      ['\u1F88', '\u1F08\u0399'],
      ['\u1F89', '\u1F09\u0399'],
      ['\u1F8A', '\u1F0A\u0399'],
      ['\u1F8B', '\u1F0B\u0399'],
      ['\u1F8C', '\u1F0C\u0399'],
      ['\u1F8D', '\u1F0D\u0399'],
      ['\u1F8E', '\u1F0E\u0399'],
      ['\u1F8F', '\u1F0F\u0399'],
      ['\u1F90', '\u1F28\u0399'],
      ['\u1F91', '\u1F29\u0399'],
      ['\u1F92', '\u1F2A\u0399'],
      ['\u1F93', '\u1F2B\u0399'],
      ['\u1F94', '\u1F2C\u0399'],
      ['\u1F95', '\u1F2D\u0399'],
      ['\u1F96', '\u1F2E\u0399'],
      ['\u1F97', '\u1F2F\u0399'],
      ['\u1F98', '\u1F28\u0399'],
      ['\u1F99', '\u1F29\u0399'],
      ['\u1F9A', '\u1F2A\u0399'],
      ['\u1F9B', '\u1F2B\u0399'],
      ['\u1F9C', '\u1F2C\u0399'],
      ['\u1F9D', '\u1F2D\u0399'],
      ['\u1F9E', '\u1F2E\u0399'],
      ['\u1F9F', '\u1F2F\u0399'],
      ['\u1FA0', '\u1F68\u0399'],
      ['\u1FA1', '\u1F69\u0399'],
      ['\u1FA2', '\u1F6A\u0399'],
      ['\u1FA3', '\u1F6B\u0399'],
      ['\u1FA4', '\u1F6C\u0399'],
      ['\u1FA5', '\u1F6D\u0399'],
      ['\u1FA6', '\u1F6E\u0399'],
      ['\u1FA7', '\u1F6F\u0399'],
      ['\u1FA8', '\u1F68\u0399'],
      ['\u1FA9', '\u1F69\u0399'],
      ['\u1FAA', '\u1F6A\u0399'],
      ['\u1FAB', '\u1F6B\u0399'],
      ['\u1FAC', '\u1F6C\u0399'],
      ['\u1FAD', '\u1F6D\u0399'],
      ['\u1FAE', '\u1F6E\u0399'],
      ['\u1FAF', '\u1F6F\u0399'],
      ['\u1FB3', '\u0391\u0399'],
      ['\u1FBC', '\u0391\u0399'],
      ['\u1FC3', '\u0397\u0399'],
      ['\u1FCC', '\u0397\u0399'],
      ['\u1FF3', '\u03A9\u0399'],
      ['\u1FFC', '\u03A9\u0399'],
      ['\u1FB2', '\u1FBA\u0399'],
      ['\u1FB4', '\u0386\u0399'],
      ['\u1FC2', '\u1FCA\u0399'],
      ['\u1FC4', '\u0389\u0399'],
      ['\u1FF2', '\u1FFA\u0399'],
      ['\u1FF4', '\u038F\u0399'],
      ['\u1FB7', '\u0391\u0342\u0399'],
      ['\u1FC7', '\u0397\u0342\u0399'],
      ['\u1FF7', '\u03A9\u0342\u0399']
    ]);
    const toUpper = (string) => string.replace(reToUpper, (match) => toUpperMap.get(match));
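
    To show how the generated helpers plug into the normalization function from earlier, here is a small usage sketch. So that it runs on its own, it uses an abbreviated two-entry stand-in for the generated lowercase table; the names ending in Sample and normalizeForComparison are made up for this example.

```javascript
// Abbreviated stand-in for the generated lowercase table (two entries only).
const reToLowerSample = /[\u0130\u1FBC]/g;
const toLowerMapSample = new Map([
  ['\u0130', 'i\u0307'],
  ['\u1FBC', '\u1FB3'],
]);
const toLowerSample = (string) =>
  string.replace(reToLowerSample, (match) => toLowerMapSample.get(match));

function normalizeForComparison(string) {
  const normalized = string.normalize('NFKC');
  // Apply the special casings explicitly before the built-in method runs,
  // so the result does not depend on the engine's handling of them.
  const fixed = toLowerSample(normalized);
  return fixed.toLowerCase();
}

console.log(normalizeForComparison('\u0130STANBUL') === 'i\u0307stanbul'); // → true
```

    In real use, the full reToLower and toLowerMap from the generated library would take the place of the abbreviated table.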
    