
Unterminated string literal when escaping HTML in a JavaScript string

  •  2
  • Bittercoder  · Tech Community  · 14 years ago

    I'm running into a problem with a JavaScript string literal when encoding this value:

    Unencoded

    <!-- Start ValueClick Media 300x250 Code for Test Tag -->
    <script language="javascript" src="http://media.fastclick.net/w/get.media?sid=38901&m=6&tp=8&d=j&t=n"></script>
    <noscript><a href="http://media.fastclick.net/w/click.here?sid=38901&m=6&c=1" target="_blank">
    <img src="http://media.fastclick.net/w/get.media?sid=38901&m=6&tp=8&d=s&c=1"width=300 height=250 border=1></a></noscript>
    <!-- End ValueClick Media 300x250 Code for Test Tag -->
    

    I get this value:

    Encoded

    "<!-- Start ValueClick Media 300x250 Code for Test Tag -->\r\n<script language=\"javascript\" src=\"http://media.fastclick.net/w/get.media?sid=38901&m=6&tp=8&d=j&t=n\"></script>\r\n<noscript><a href=\"http://media.fastclick.net/w/click.here?sid=38901&m=6&c=1\" target=\"_blank\">\r\n<img src=\"http://media.fastclick.net/w/get.media?sid=38901&m=6&tp=8&d=s&c=1\"width=300 height=250 border=1></a></noscript>\r\n<!-- End ValueClick Media 300x250 Code for Test Tag -->"
    

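    When this is used as a JavaScript literal inside some JavaScript code, Firefox complains that it is unterminated, and I can't see why myself.

    Oddly, if I remove the closing </script> tag from the HTML above, the encoded version works fine, as shown here:

    Unencoded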

    <!-- Start ValueClick Media 300x250 Code for Test Tag -->
    <script language="javascript" src="http://media.fastclick.net/w/get.media?sid=38901&m=6&tp=8&d=j&t=n">
    <noscript><a href="http://media.fastclick.net/w/click.here?sid=38901&m=6&c=1" target="_blank">
    <img src="http://media.fastclick.net/w/get.media?sid=38901&m=6&tp=8&d=s&c=1"width=300 height=250 border=1></a></noscript>
    <!-- End ValueClick Media 300x250 Code for Test Tag -->
    

    Encoded

    "<!-- Start ValueClick Media 300x250 Code for Test Tag -->\r\n<script language=\"javascript\" src=\"http://media.fastclick.net/w/get.media?sid=38901&m=6&tp=8&d=j&t=n\">\r\n<noscript><a href=\"http://media.fastclick.net/w/click.here?sid=38901&m=6&c=1\" target=\"_blank\">\r\n<img src=\"http://media.fastclick.net/w/get.media?sid=38901&m=6&tp=8&d=s&c=1\"width=300 height=250 border=1></a></noscript>\r\n<!-- End ValueClick Media 300x250 Code for Test Tag -->"
    

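    This encoded value works...

    Does anyone know what I'm missing?

    Update

    It seems fairly obvious now; I blame lack of sleep. In this case the application relies on an old version of Json.NET to encode the JavaScript, so I introduced a new JsonConverter for strings to work around the problem. It escapes closing tags in a second pass, after the JavaScript escaping has been applied.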

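    using System;
    using System.Globalization;
    using System.IO;
    using System.Text;
    using Newtonsoft.Json;

    public class EscapeTagsStringConverter : JsonConverter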
    {
        public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
        {
            if (value == null)
            {
                writer.WriteNull();
                return;
            }
    
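            // First pass: JavaScript string escaping. Second pass: turn every "</" into "<\/"
            // so the HTML parser never sees an end tag inside the literal.
            string escapedValue = ToEscapedJavaScriptString(value.ToString(), '"').Replace("</", "<\\/");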
    
            writer.WriteRawValue("\"" + escapedValue + "\"");
        }
    
        public override object ReadJson(JsonReader reader, Type objectType, JsonSerializer serializer)
        {
            // Guard against a JSON null token, which leaves reader.Value null.
            return reader.Value == null ? null : reader.Value.ToString();
        }
    
        public override bool CanConvert(Type objectType)
        {
            return (objectType == typeof (string));
        }
    
        public static char IntToHex(int n)
        {
            if (n <= 9)
            {
                return (char)(n + 48);
            }
            return (char)((n - 10) + 97);
        }
    
        public static void WriteCharAsUnicode(TextWriter writer, char c)
        {
            char h1 = IntToHex((c >> 12) & '\x000f');
            char h2 = IntToHex((c >> 8) & '\x000f');
            char h3 = IntToHex((c >> 4) & '\x000f');
            char h4 = IntToHex(c & '\x000f');
    
            writer.Write('\\');
            writer.Write('u');
            writer.Write(h1);
            writer.Write(h2);
            writer.Write(h3);
            writer.Write(h4);
        }
    
        public static void WriteEscapedJavaScriptChar(TextWriter writer, char c, char delimiter)
        {
            switch (c)
            {
                case '\t':
                    writer.Write(@"\t");
                    break;
                case '\n':
                    writer.Write(@"\n");
                    break;
                case '\r':
                    writer.Write(@"\r");
                    break;
                case '\f':
                    writer.Write(@"\f");
                    break;
                case '\b':
                    writer.Write(@"\b");
                    break;
                case '\\':
                    writer.Write(@"\\");
                    break;
                case '\'':
                    writer.Write((delimiter == '\'') ? @"\'" : @"'");
                    break;
                case '"':
                    writer.Write((delimiter == '"') ? "\\\"" : @"""");
                    break;
                default:
                    if (c > '\u001f')
                        writer.Write(c);
                    else
                        WriteCharAsUnicode(writer, c);
                    break;
            }
        }
    
        public void WriteEscapedJavaScriptString(TextWriter writer, string value, char delimiter)
        {
            if (value != null)
            {
                for (int i = 0; i < value.Length; i++)
                {
                    WriteEscapedJavaScriptChar(writer, value[i], delimiter);
                }
            }
        }
    
        public string ToEscapedJavaScriptString(string value)
        {
            return ToEscapedJavaScriptString(value, '"');
        }
    
        public string ToEscapedJavaScriptString(string value, char delimiter)
        {
            using (StringWriter w = CreateStringWriter(GetLength(value) ?? 16))
            {
                WriteEscapedJavaScriptString(w, value, delimiter);
                return w.ToString();
            }
        }
    
        public static StringWriter CreateStringWriter(int capacity)
        {
            StringBuilder sb = new StringBuilder(capacity);
            StringWriter sw = new StringWriter(sb, CultureInfo.InvariantCulture);
    
            return sw;
        }
    
        public static int? GetLength(string value)
        {
            if (value == null)
                return null;
            return value.Length;
        }
    }
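
    For comparison, the same two-pass idea can be sketched in JavaScript. This is only an illustration, not the code from the question: `toInlineScriptLiteral` and `adMarkup` are hypothetical names, and the sketch assumes the value is a string (or something else that `JSON.stringify` can serialize).

```javascript
// Hypothetical helper: produce a string literal that is safe to emit
// inside an inline <script> block.
function toInlineScriptLiteral(value) {
  // First pass: ordinary JSON/JavaScript string escaping (quotes, \r, \n, ...).
  const json = JSON.stringify(value);
  // Second pass: rewrite every "</" as "<\/" so no end tag appears in the output.
  return json.replace(/<\//g, "<\\/");
}

// Shortened stand-in for the ad tag from the question.
const adMarkup = '<script src="get.media"></script>\r\n<noscript></noscript>';
const literal = toInlineScriptLiteral(adMarkup);

console.log(literal);
```

    Because `\/` is a legal escape for `/` in both JSON and JavaScript string literals, the escaped literal still evaluates to the original markup.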
    
    2 answers  |  as of 14 years ago
        1
  •  4
  •   bobince    14 years ago

    Well, yes, if you have:

    <script>
        var s= '</script>';
    </script>
    

    how is the browser supposed to know that the first </script> isn't the real end of the script element? Every browser, not just Firefox, will read it as:

    <script>
        var s= '   // uh-oh! string literal left open!
    </script>';    // script element closed. Then some trailing text content
    </script>      // close-tag for a script that isn't open, ignore
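
    The parse above can be imitated with a deliberately simplified model. This is an assumption for illustration, not a real HTML tokenizer: it treats the script body as everything up to the first `</script` followed by whitespace, `/` or `>`, and never looks at JavaScript syntax. `scriptBodySeenByParser` is a hypothetical name.

```javascript
// Simplified model of how an HTML parser finds the end of a script element:
// it scans for the first "</script" end tag and ignores JavaScript quoting.
function scriptBodySeenByParser(textAfterOpenTag) {
  const match = /<\/script[\s\/>]/i.exec(textAfterOpenTag);
  return match ? textAfterOpenTag.slice(0, match.index) : textAfterOpenTag;
}

// Text following the opening <script> tag, unescaped and escaped.
const unescapedSource = "\n    var s= '</script>';\n</script>";
const escapedSource = "\n    var s= '<\\/script>';\n</script>";

// The unescaped body stops right after the opening quote,
// so the JavaScript engine receives an unterminated string literal.
console.log(JSON.stringify(scriptBodySeenByParser(unescapedSource)));
// The escaped body contains the whole statement.
console.log(JSON.stringify(scriptBodySeenByParser(escapedSource)));
```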
    

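    To stop a string literal that contains a </ (ETAGO) sequence from ending the script block prematurely, you have to escape that sequence somehow. You can write '<\/script>', '\x3C/script>', or even '<'+'/script>' (that last one is popular, though I find it rather inelegant).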
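
    A quick check, as a sketch with hypothetical variable names, that the three spellings produce the same string at runtime even though none of them contains the raw end-tag sequence in the source text:

```javascript
// Three ways to spell the same string without a literal end-tag sequence.
const viaBackslash = '<\/script>';    // "\/" is simply "/" in a JS string literal
const viaHexEscape = '\x3C/script>';  // "\x3C" is "<"
const viaConcat = '<' + '/script>';   // split so "<" and "/" are never adjacent

// Reference value built from character codes (60 is "<", 47 is "/").
const reference = String.fromCharCode(60, 47) + 'script>';

console.log(viaBackslash === reference); // true
console.log(viaHexEscape === reference); // true
console.log(viaConcat === reference);    // true
```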

        2
  •  0
  •   chrismarx    14 years ago

    The encoded value doesn't throw an error in Chrome or FF 3.6.10. Which version of FF are you using?