
C#: Parsing a huge OData JSON by streaming certain parts of the JSON

  •  0
  • Kishore Bandi  ·  5 years ago

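    I have an OData response as JSON (a few megabytes in size), and the requirement is to stream "certain parts of the JSON" without ever loading them into memory.

    For example: when I read the property "value[0].Body.Content" in the JSON below, I want to stream its value instead of deserializing it into a string.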

    JSON:

    {
        "@odata.context": "https://localhost:5555/api/v2.0/$metadata#Me/Messages",
        "value": [
            {
                "@odata.id": "https://localhost:5555/api/v2.0/",
                "@odata.etag": "W/\"Something\"",
                "Id": "vccvJHDSFds43hwy98fh",
                "CreatedDateTime": "2018-12-01T01:47:53Z",
                "LastModifiedDateTime": "2018-12-01T01:47:53Z",
                "ChangeKey": "SDgf43tsdf",
                "WebLink": "https://localhost:5555/?ItemID=dfsgsdfg9876ijhrf",
                "Body": {
                    "ContentType": "HTML",
                    "Content": "<html>\r\n<body>Huge Data Here\r\n</body>\r\n</html>\r\n"
                },
                "ToRecipients": [{
                        "EmailAddress": {
                            "Name": "ME",
                            "Address": "me@me.com"
                        }
                    }
                ],
                "CcRecipients": [],
                "BccRecipients": [],
                "ReplyTo": [],
                "Flag": {
                    "FlagStatus": "NotFlagged"
                }
            }
        ],
        "@odata.nextLink": "http://localhost:5555/rest/jersey/sleep?%24filter=LastDeliveredDateTime+ge+2018-12-01+and+LastDeliveredDateTime+lt+2018-12-02&%24top=50&%24skip=50"
    }
    

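    Approaches tried:
    1. Newtonsoft:

    Newtonsoft streaming internally converts the data into a string and loads it into memory. (This makes the LOH shoot up, and the memory is not released until compaction happens. Our worker process has a memory limit, so we cannot keep this in memory.)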

    **code:**
    
        using (var jsonTextReader = new JsonTextReader(sr))
        {
            var pool = new CustomArrayPool();
            // Checking if pooling will help with memory
            jsonTextReader.ArrayPool = pool;
    
            while (jsonTextReader.Read())
            {
                if (jsonTextReader.TokenType == JsonToken.PropertyName
                    && ((string)jsonTextReader.Value).Equals("value"))
                {
                    jsonTextReader.Read();
    
                    if (jsonTextReader.TokenType == JsonToken.StartArray)
                    {
                        while (jsonTextReader.Read())
                        {
                            if (jsonTextReader.TokenType == JsonToken.StartObject)
                            {
                                var Current = JToken.Load(jsonTextReader);
                                // By Now, the LOH Shoots up.
                                // Avoid below code of converting this JToken back to byte array.
                                var bytes = Encoding.ASCII.GetBytes(Current.ToString());
                                destinationStream.Write(bytes, 0, bytes.Length);
                            }
                            else if (jsonTextReader.TokenType == JsonToken.EndArray)
                            {
                                break;
                            }
                        }
                    }
                }
    
                if (jsonTextReader.TokenType == JsonToken.StartObject)
                {
                    var Current = JToken.Load(jsonTextReader);
                    // Do some processing with Current
                    var bytes = Encoding.ASCII.GetBytes(Current.ToString());
                    destinationStream.Write(bytes, 0, bytes.Length);
                }
            }
        }
    
    2. OData.Net:

      It looks like it supports streaming of string fields:

      ODataMessageReaderSettings settings = new ODataMessageReaderSettings();
      IODataResponseMessage responseMessage = new InMemoryMessage { Stream = stream };
      responseMessage.SetHeader("Content-Type", "application/json;odata.metadata=minimal;");
      // ODataMessageReader reader = new ODataMessageReader((IODataResponseMessage)message, settings, GetEdmModel());
      ODataMessageReader reader = new ODataMessageReader(responseMessage, settings, new EdmModel());
      var oDataResourceReader = reader.CreateODataResourceReader();
      var property = reader.ReadProperty();
      


    Any idea how to use OData.Net or Newtonsoft to stream the values of certain fields?
    Is manually parsing the stream the only way?

        1
  •  3
  •   dbc    5 years ago

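    If you are copying portions of JSON from one stream to another, you can do it more efficiently with JsonWriter.WriteToken(JsonReader), which avoids the intermediate Current = JToken.Load(jsonTextReader) and Encoding.ASCII.GetBytes(Current.ToString()) representations and their associated memory overhead: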

    using (var textWriter = new StreamWriter(destinationStream, new UTF8Encoding(false, true), 1024, true))
    using (var jsonWriter = new JsonTextWriter(textWriter) { Formatting = Formatting.Indented, CloseOutput = false })
    {
        // Use Formatting.Indented or Formatting.None as required.
        jsonWriter.WriteToken(jsonTextReader);
    }
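
To show where the WriteToken() call fits, here is a minimal sketch that locates the top-level "value" property and copies its array to the output without building a JToken. The names ValueArrayCopier and CopyValueArray are hypothetical, and the sketch assumes UTF-8 input with "value" as a property of the root object. DateParseHandling.None is set so that date-like strings are copied as plain text instead of being parsed and rewritten.

```csharp
using System.IO;
using System.Text;
using Newtonsoft.Json;

public static class ValueArrayCopier
{
    // Hypothetical helper: copies the root object's "value" property
    // (array and all children) from input to output. Returns false if absent.
    public static bool CopyValueArray(Stream input, Stream output)
    {
        using (var streamReader = new StreamReader(input, Encoding.UTF8, true, 1024, true))
        using (var jsonReader = new JsonTextReader(streamReader)
               { CloseInput = false, DateParseHandling = DateParseHandling.None })
        using (var textWriter = new StreamWriter(output, new UTF8Encoding(false, true), 1024, true))
        using (var jsonWriter = new JsonTextWriter(textWriter)
               { Formatting = Formatting.None, CloseOutput = false })
        {
            while (jsonReader.Read())
            {
                // Path equals "value" only for the root-level property of that name.
                if (jsonReader.TokenType == JsonToken.PropertyName && jsonReader.Path == "value")
                {
                    if (!jsonReader.Read())
                        return false;              // truncated input
                    jsonWriter.WriteToken(jsonReader); // current token plus its children
                    return true;
                }
            }
            return false;
        }
    }
}
```

This removes the JToken and byte-array copies, but every individual string value still passes through the reader as one complete string.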
    

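    However, Json.NET's JsonTextReader cannot read a single string value in "chunks" the way XmlReader.ReadValueChunk() can. It always materializes each string value in full, so if your strings are large enough to land on the large object heap, even JsonWriter.WriteToken() will not prevent them from being completely loaded into memory.

    As an alternative, you might consider the readers and writers returned by JsonReaderWriterFactory. These readers and writers are used by DataContractJsonSerializer and translate JSON to XML on the fly as it is read and written. Because their base classes are XmlReader and XmlWriter, they do support reading and writing string values in chunks. Using them appropriately will avoid allocating strings on the large object heap.

    To do this, first define the following extension methods, which copy a selected subset of JSON values from an input stream to an output stream, as specified by a path to the data to be streamed: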

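    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;
    using System.Runtime.Serialization.Json;
    using System.Text;
    using System.Xml;
    using System.Xml.Linq;

    public static class JsonExtensions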
    {
        public static void StreamNested(Stream from, Stream to, string [] path)
        {
            var reversed = path.Reverse().ToArray();
    
            using (var xr = JsonReaderWriterFactory.CreateJsonReader(from, XmlDictionaryReaderQuotas.Max))
            {
                foreach (var subReader in xr.ReadSubtrees(s => s.Select(n => n.LocalName).SequenceEqual(reversed)))
                {
                    using (var xw = JsonReaderWriterFactory.CreateJsonWriter(to, Encoding.UTF8, false))
                    {
                        subReader.MoveToContent();
    
                        xw.WriteStartElement("root");
                        xw.WriteAttributes(subReader, true);
    
                        subReader.Read();
    
                        while (!subReader.EOF)
                        {
                            if (subReader.NodeType == XmlNodeType.Element && subReader.Depth == 1)
                                xw.WriteNode(subReader, true);
                            else
                                subReader.Read();
                        }
    
                        xw.WriteEndElement();
                    }
                }
            }
        }
    }
    
    public static class XmlReaderExtensions
    {
        public static IEnumerable<XmlReader> ReadSubtrees(this XmlReader xmlReader, Predicate<Stack<XName>> filter)
        {
            Stack<XName> names = new Stack<XName>();
    
            while (xmlReader.Read())
            {
                if (xmlReader.NodeType == XmlNodeType.Element)
                {
                    names.Push(XName.Get(xmlReader.LocalName, xmlReader.NamespaceURI));
                    if (filter(names))
                    {
                        using (var subReader = xmlReader.ReadSubtree())
                        {
                            yield return subReader;
                        }
                    }
                }
    
                if ((xmlReader.NodeType == XmlNodeType.Element && xmlReader.IsEmptyElement)
                    || xmlReader.NodeType == XmlNodeType.EndElement)
                {
                    names.Pop();
                }
            }
        }
    }
    

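    Now, the string [] path argument to StreamNested() is not a JSONPath expression of any kind. Instead, it is the path through the hierarchy of XML elements produced by the XmlReader that JsonReaderWriterFactory.CreateJsonReader() returns. The mapping used for this translation is, in turn, documented by Microsoft in Mapping Between JSON and XML. To select and stream only the JSON values matching value[*], the required XML path is //root/value/item. Thus you can select and stream the desired nested objects by doing the following: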

    JsonExtensions.StreamNested(inputStream, destinationStream, new[] { "root", "value", "item" });
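
StreamNested() copies whole objects. If only one large string is needed, such as value[0].Body.Content from the question, the chunked read can be sketched directly with XmlReader.ReadValueChunk(). This is a sketch under assumptions: JsonStringStreamer and CopyStringValue are hypothetical names, the target is assumed to be a JSON string, and only the first match is copied. The path uses the same XML element names as above.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization.Json;
using System.Xml;

public static class JsonStringStreamer
{
    // Hypothetical helper: streams the first string found at the given XML
    // element path (e.g. root/value/item/Body/Content) to 'output'.
    public static bool CopyStringValue(Stream json, TextWriter output, params string[] path)
    {
        if (path == null || path.Length == 0)
            throw new ArgumentException("Path must not be empty.", nameof(path));

        var buffer = new char[4096];
        int depth = 0;   // number of currently open elements
        int matched = 0; // leading path segments matched by the open elements

        using (var reader = JsonReaderWriterFactory.CreateJsonReader(json, XmlDictionaryReaderQuotas.Max))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.EndElement)
                {
                    depth--;
                    if (matched > depth) matched = depth; // left a matched branch
                    continue;
                }
                if (reader.NodeType != XmlNodeType.Element) continue;

                bool hit = depth == matched && reader.LocalName == path[matched];
                if (hit && matched + 1 == path.Length)
                {
                    if (reader.IsEmptyElement) return true;
                    reader.Read(); // first text node, or EndElement for ""
                    while (reader.NodeType == XmlNodeType.Text
                           || reader.NodeType == XmlNodeType.Whitespace
                           || reader.NodeType == XmlNodeType.SignificantWhitespace)
                    {
                        int read;
                        while ((read = reader.ReadValueChunk(buffer, 0, buffer.Length)) > 0)
                            output.Write(buffer, 0, read); // never more than one buffer in memory
                        reader.Read();
                    }
                    return true;
                }
                if (!reader.IsEmptyElement)
                {
                    depth++;
                    if (hit) matched++;
                }
            }
        }
        return false;
    }
}
```

A call would look like JsonStringStreamer.CopyStringValue(inputStream, writer, "root", "value", "item", "Body", "Content"), where writer is any TextWriter over the destination.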
    

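    • Mapping Between JSON and XML is somewhat complex. It is often easier to load some sample JSON into an XDocument with the following helper method and work out the correct XML path by inspection: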

      static XDocument ParseJsonAsXDocument(string json)
      {
          using (var xr = JsonReaderWriterFactory.CreateJsonReader(new MemoryStream(Encoding.UTF8.GetBytes(json)), Encoding.UTF8, XmlDictionaryReaderQuotas.Max, null))
          {
              return XDocument.Load(xr);
          }
      }
      

    • Related: JObject.SelectToken Equivalent in .NET.
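
When working out the XML path for a given JSON sample, it can help to list every distinct element path the reader produces. The following is a small sketch along the lines of ParseJsonAsXDocument; JsonXmlPathDump and ListElementPaths are hypothetical names. It loads the whole sample into memory, so it is meant for small samples only and not for the production stream.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Runtime.Serialization.Json;
using System.Text;
using System.Xml;
using System.Xml.Linq;

public static class JsonXmlPathDump
{
    // Hypothetical helper: returns the distinct element paths (by local name)
    // of the XML that JsonReaderWriterFactory produces for a JSON sample.
    public static string[] ListElementPaths(string json)
    {
        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(json)))
        using (var xr = JsonReaderWriterFactory.CreateJsonReader(stream, XmlDictionaryReaderQuotas.Max))
        {
            var doc = XDocument.Load(xr);
            return doc.Descendants()
                .Select(e => "/" + string.Join("/",
                    e.AncestorsAndSelf().Reverse().Select(a => a.Name.LocalName)))
                .Distinct()
                .ToArray();
        }
    }
}
```

Each returned path, split on "/", can be passed as the string [] path argument of StreamNested(). Properties whose names are not valid XML names, such as "@odata.context", show up under the local name "item".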