代码之家  ›  专栏  ›  技术社区  ›  Sassa

使用大型WHERE子句的查询导致EF6中使用npgsql的超时异常

  •  5
  • Sassa  · 技术社区  · 7 年前

    我有一个类似以下内容的查询:

    private static IQueryable<MultiframeModule> WhereAllFramesProperties(this IQueryable<MultiframeModule> query, ICollection<Frame> frames)
    {
        return frames.Aggregate(query, (q, frame) =>
        {
            return q.Where(p => p.Frames.Any(i => i.FrameData.ShaHash == frame.FrameData.ShaHash));
        });
    }
    

    MultiframeModule Frame 具有多对多关系。

    通过这个查询,我想找到一个 多参数模块 包含 frames 作为参数发送的集合,为此我检查 ShaHash 参数

    如果 框架 包含2个帧,则生成的SQL如下所示:

    SELECT
       "Extent1"."MultiframeModuleId",
       "Extent1"."FrameIncrementPointer",
       "Extent1"."PageNumberVector" 
    FROM
       "public"."MultiframeModule" AS "Extent1" 
    WHERE
       EXISTS 
       (
          SELECT
             1 AS "C1" 
          FROM
             "public"."Frame" AS "Extent2" 
             INNER JOIN
                "public"."FrameData" AS "Extent3" 
                ON "Extent2"."FrameData_FrameDataId" = "Extent3"."FrameDataId" 
          WHERE
             "Extent1"."MultiframeModuleId" = "Extent2"."MultiframeModule_MultiframeModuleId" 
             AND "Extent3"."ShaHash" = @p__linq__0
       )
       AND EXISTS 
       (
          SELECT
             1 AS "C1" 
          FROM
             "public"."Frame" AS "Extent4" 
             INNER JOIN
                "public"."FrameData" AS "Extent5" 
                ON "Extent4"."FrameData_FrameDataId" = "Extent5"."FrameDataId" 
          WHERE
             "Extent1"."MultiframeModuleId" = "Extent4"."MultiframeModule_MultiframeModuleId" 
             AND "Extent5"."ShaHash" = @p__linq__1
       )
       LIMIT 2
    
    -- p__linq__0: '0' (Type = Int32, IsNullable = false)
    
    -- p__linq__1: '0' (Type = Int32, IsNullable = false)
    

    但是,如果我有更多帧,比如说200帧,那么调用将抛出一个异常:

    Unable to read data from the transport connection: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
    

    使用stacktrace:

       at Npgsql.ReadBuffer.<Ensure>d__27.MoveNext()
    --- End of stack trace from previous location where exception was thrown ---
       at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
       at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
       at Npgsql.NpgsqlConnector.<DoReadMessage>d__157.MoveNext()
    --- End of stack trace from previous location where exception was thrown ---
       at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
       at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
       at System.Runtime.CompilerServices.ValueTaskAwaiter`1.GetResult()
       at Npgsql.NpgsqlConnector.<ReadMessage>d__156.MoveNext()
    --- End of stack trace from previous location where exception was thrown ---
       at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
       at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
       at System.Runtime.CompilerServices.ValueTaskAwaiter`1.GetResult()
       at Npgsql.NpgsqlConnector.<ReadExpecting>d__163`1.MoveNext()
    --- End of stack trace from previous location where exception was thrown ---
       at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
       at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
       at System.Runtime.CompilerServices.ValueTaskAwaiter`1.GetResult()
       at Npgsql.NpgsqlDataReader.<NextResult>d__32.MoveNext()
    --- End of stack trace from previous location where exception was thrown ---
       at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
       at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
       at Npgsql.NpgsqlDataReader.NextResult()
       at Npgsql.NpgsqlCommand.<Execute>d__71.MoveNext()
    --- End of stack trace from previous location where exception was thrown ---
       at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
       at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
       at System.Runtime.CompilerServices.ValueTaskAwaiter`1.GetResult()
       at Npgsql.NpgsqlCommand.<ExecuteDbDataReader>d__92.MoveNext()
    --- End of stack trace from previous location where exception was thrown ---
       at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
       at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
       at System.Runtime.CompilerServices.ValueTaskAwaiter`1.GetResult()
       at Npgsql.NpgsqlCommand.ExecuteDbDataReader(CommandBehavior behavior)
       at System.Data.Entity.Infrastructure.Interception.InternalDispatcher`1.Dispatch[TTarget,TInterceptionContext,TResult](TTarget target, Func`3 operation, TInterceptionContext interceptionContext, Action`3 executing, Action`3 executed)
       at System.Data.Entity.Infrastructure.Interception.DbCommandDispatcher.Reader(DbCommand command, DbCommandInterceptionContext interceptionContext)
       at System.Data.Entity.Core.EntityClient.Internal.EntityCommandDefinition.ExecuteStoreCommands(EntityCommand entityCommand, CommandBehavior behavior)
    

    那么,我的查询失败有什么明显的原因吗?我如何改进它以成功地执行查询?

    3 回复  |  直到 7 年前
        1
  •  3
  •   Ivan Stoev    7 年前

    据我所知,问题是由生成的SQL查询中的子查询太多引起的。

    在我的测试环境中,SqlServer(LocalDB)拒绝执行生成的查询,原因如下 太复杂了 。PostgreSQL能够执行它(在设置 CommandTimeout 至0)约4分钟。

    解决方案是找到不生成许多子查询的等价构造。在这种情况下,我通常在 计算不同匹配项并将其与条件计数进行比较 方法

    它可以通过两种方式实现。

    (1) 这仅适用于类型 property == valueN 。在这种情况下,可以这样计算不同的匹配项(在伪代码中):

    obj.Collection
       .Select(elem => elem.Property)
       .Distinct()
       .Count(value => values.Contains(values))
    

    将其应用于您的样本:

    private static IQueryable<MultiframeModule> WhereAllFramesProperties(this IQueryable<MultiframeModule> query, ICollection<Frame> frames)
    {
        var values = frames.Select(e => e.FrameData.ShaHash);
        var count = frames.Count();
        return query.Where(p => p.Frames.Select(e => e.FrameData.ShaHash)
            .Distinct().Count(v => values.Contains(v)) == count);
    }
    

    (2) 这适用于任何类型的条件。在这种情况下,匹配项由其索引标识,这需要动态构建一个选择器表达式,如下所示:

    Condition0 ? 0 : Condition1 ? 1 : ... ConditionN-1 ? N - 1 : -1
    

    不同的匹配计数是

    obj.Collection
       .Select(selector)
       .Distinct()
       .Count(i => i >= 0)
    

    将其应用于您的样本:

    private static IQueryable<MultiframeModule> WhereAllFramesProperties(this IQueryable<MultiframeModule> query, ICollection<Frame> frames)
    {
        var parameter = Expression.Parameter(typeof(MultiframeModuleFrame), "e");
        var body = frames.Select((frame, index) =>
        {
            Expression<Func<Frame, bool>> predicate = e => e.FrameData.ShaHash == frame.FrameData.ShaHash;
            return new
            {
                Condition = predicate.Body.ReplaceParameter(predicate.Parameters[0], parameter),
                Value = Expression.Constant(index)
            };
        })
        .Reverse()
        .Aggregate((Expression)Expression.Constant(-1), (next, item) =>
            Expression.Condition(item.Condition, item.Value, next));
        var selector = Expression.Lambda<Func<Frame, int>>(body, parameter);
        var count = frames.Count();
        return query.Where(p => p.Frames.AsQueryable().Select(selector)
            .Distinct().Count(i => i >= 0) == count);
    }
    

    其中 ReplaceParameter 是以下自定义扩展方法:

    public static partial class ExpressionUtils
    {
        public static Expression ReplaceParameter(this Expression expression, ParameterExpression source, Expression target)
        {
            return new ParameterReplacer { Source = source, Target = target }.Visit(expression);
        }
    
        class ParameterReplacer : ExpressionVisitor
        {
            public ParameterExpression Source;
            public Expression Target;
            protected override Expression VisitParameter(ParameterExpression node)
            {
                return node == Source ? Target : base.VisitParameter(node);
            }
        }
    }
    

    生成的SQL包含一个巨大的 CASE WHEN 表达式(不幸的是,在 WHERE 条款),但a 仅有一个的 子查询,并在SqlServer和PostgreSQL中被接受并成功执行(在后一种情况下,在与原始测试相同的条件下,不到2秒的时间-两个表中都有1K条记录,1M个链接,200个条件)。

        2
  •  0
  •   Ajay Gupta    7 年前

    您可以通过传递来禁用超时 Command Timeout=0 在连接字符串中,与默认值30秒一样,您的查询可能运行的时间太长,需要进行优化。

        3
  •  0
  •   Benjamin RD    7 年前

    你有一个 any a中的子句 where ,也许你会尝试优化它

    return frames.Aggregate(query, (q, frame) =>
            {
                return q.Frames.Any(i => i.FrameData.ShaHash == frame.FrameData.ShaHash));
            });