代码之家 › 专栏 › 技术社区 › Steve Wortham

当使用数量不确定的参数时,如何避免动态SQL?

sql-match-all stored-procedures tsql sql-server sql

Steve Wortham · 技术社区 · 15 年前

我有一个类似StackOverflow的标签系统,用于我正在处理的数据库。我正在编写一个存储过程,它根据WHERE子句中不确定数量的标记查找结果。可以有0到10个标记来过滤结果。例如,用户可以搜索标有“苹果”、“橘子”和“香蕉”的物品,以及每个结果必须包括所有3个标签。我的查询变得更加复杂,因为我也在处理一个用于标记的交叉引用表,但出于这个问题的目的,我将不讨论这个问题。

我知道我可以进行一些字符串操作,并向exec()函数提供一个查询来处理这个问题,但我不希望出现与动态SQL相关的性能问题。我认为最好是SQL缓存存储过程的查询计划。

SELECT ft.[RANK], s.shader_id, s.page_name, s.name, s.description, s.download_count, s.rating, s.price FROM shader s 
INNER JOIN FREETEXTTABLE(shader, *, @search_term) AS ft ON s.shader_id = ft.[KEY]
WHERE EXISTS(SELECT tsx.shader_id FROM tag_shader_xref tsx INNER JOIN tag t ON tsx.tag_id = t.tag_id WHERE tsx.shader_id = s.shader_id AND t.tag_name = 'color')
AND EXISTS(SELECT tsx.shader_id FROM tag_shader_xref tsx INNER JOIN tag t ON tsx.tag_id = t.tag_id WHERE tsx.shader_id = s.shader_id AND t.tag_name = 'saturation')
ORDER BY ft.[RANK] DESC

这是功能性的,但是硬编码的。您将看到,我已将其设置为查找“颜色”和“饱和度”标记。

8 回复 | 直到 13 年前

Henrik Opel 15 年前

有关此问题和类似问题的详细概述,请参阅: http://www.sommarskog.se/dyn-search-2005.html

针对您的问题,这里有以下部分: http://www.sommarskog.se/dyn-search-2005.html#AND_ISNOTNULL

还要考虑到(直接的)动态解决方案并不一定比(可能是复杂的)静态解决方案慢,因为查询计划仍然可以被缓存:请参阅 http://www.sommarskog.se/dyn-search-2005.html#dynsql

因此,您必须根据实际数据量仔细测试/衡量您的选项,同时考虑实际查询(例如,使用一个或两个参数的搜索可能比使用十个参数的搜索更常见,等等)

编辑:提问者在评论中给出了一个很好的理由来优化这一点,因此将“过早”的警告移开一点:

这闻起来很像过早优化! -您确定此存储过程会经常被调用,而使用动态SQL则会经常被调用吗 明显地 较慢(即,与应用程序中的其他内容相比)?

Steve Wortham 13 年前

所以这比我想象的要容易。在实现了一个相当简单的查询来处理这个问题之后,我立即获得了比我想象的要好得多的性能。因此,我不确定是否有必要实施和测试其他解决方案。

目前我的数据库中有大约200个着色器和500个标记。我运行了一个我认为有点现实的测试,在这个测试中,我对我存储的proc执行了35个不同的搜索查询,其中包含不同数量的标记,有搜索项和没有搜索项。我将所有这些放在一个SQL语句中,然后在ASP.NET中对结果进行基准测试。它始终在200毫秒内运行这35次搜索。如果我将搜索次数减少到5次,那么搜索时间将减少到10毫秒。这种性能非常棒。我的数据库很小,这很有帮助。但我认为查询很好地利用索引也很有帮助。

我在查询中改变的一件事是我查找标记的方式。我现在按标签的id而不是名称查找标签。通过这样做,我可以少做1次连接,并且有使用索引进行搜索的好处。在了解到SQL基于每个用户缓存查询后,我还将“dbo.”添加到表名的前面。

ALTER PROCEDURE [dbo].[search] 
    @search_term    varchar(100) = NULL,
    @tag1           int = NULL,
    @tag2           int = NULL,
    @tag3           int = NULL,
    @tag4           int = NULL,
    @tag5           int = NULL,
    @tag6           int = NULL,
    @tag7           int = NULL,
    @tag8           int = NULL,
    @tag9           int = NULL,
    @tag10          int = NULL
AS
BEGIN
    SET NOCOUNT ON;

    IF LEN(@search_term) > 0
        BEGIN
            SELECT s.shader_id, s.page_name, s.name, s.description, s.download_count, s.rating, s.price FROM dbo.shader s 
            INNER JOIN FREETEXTTABLE(dbo.shader, *, @search_term) AS ft ON s.shader_id = ft.[KEY]
            WHERE (@tag1 IS NULL OR EXISTS(SELECT 1 AS num FROM dbo.tag_shader_xref tsx WHERE tsx.shader_id = s.shader_id AND tsx.tag_id = @tag1))
            AND   (@tag2 IS NULL OR EXISTS(SELECT 1 AS num FROM dbo.tag_shader_xref tsx WHERE tsx.shader_id = s.shader_id AND tsx.tag_id = @tag2))
            AND   (@tag3 IS NULL OR EXISTS(SELECT 1 AS num FROM dbo.tag_shader_xref tsx WHERE tsx.shader_id = s.shader_id AND tsx.tag_id = @tag3))
            AND   (@tag4 IS NULL OR EXISTS(SELECT 1 AS num FROM dbo.tag_shader_xref tsx WHERE tsx.shader_id = s.shader_id AND tsx.tag_id = @tag4))
            AND   (@tag5 IS NULL OR EXISTS(SELECT 1 AS num FROM dbo.tag_shader_xref tsx WHERE tsx.shader_id = s.shader_id AND tsx.tag_id = @tag5))
            AND   (@tag6 IS NULL OR EXISTS(SELECT 1 AS num FROM dbo.tag_shader_xref tsx WHERE tsx.shader_id = s.shader_id AND tsx.tag_id = @tag6))
            AND   (@tag7 IS NULL OR EXISTS(SELECT 1 AS num FROM dbo.tag_shader_xref tsx WHERE tsx.shader_id = s.shader_id AND tsx.tag_id = @tag7))
            AND   (@tag8 IS NULL OR EXISTS(SELECT 1 AS num FROM dbo.tag_shader_xref tsx WHERE tsx.shader_id = s.shader_id AND tsx.tag_id = @tag8))
            AND   (@tag9 IS NULL OR EXISTS(SELECT 1 AS num FROM dbo.tag_shader_xref tsx WHERE tsx.shader_id = s.shader_id AND tsx.tag_id = @tag9))
            AND   (@tag10 IS NULL OR EXISTS(SELECT 1 AS num FROM dbo.tag_shader_xref tsx WHERE tsx.shader_id = s.shader_id AND tsx.tag_id = @tag10))
            ORDER BY ft.[RANK] DESC
        END
    ELSE
        BEGIN
            SELECT s.shader_id, s.page_name, s.name, s.description, s.download_count, s.rating, s.price FROM dbo.shader s 
            WHERE (@tag1 IS NULL OR EXISTS(SELECT 1 AS num FROM dbo.tag_shader_xref tsx WHERE tsx.shader_id = s.shader_id AND tsx.tag_id = @tag1))
            AND   (@tag2 IS NULL OR EXISTS(SELECT 1 AS num FROM dbo.tag_shader_xref tsx WHERE tsx.shader_id = s.shader_id AND tsx.tag_id = @tag2))
            AND   (@tag3 IS NULL OR EXISTS(SELECT 1 AS num FROM dbo.tag_shader_xref tsx WHERE tsx.shader_id = s.shader_id AND tsx.tag_id = @tag3))
            AND   (@tag4 IS NULL OR EXISTS(SELECT 1 AS num FROM dbo.tag_shader_xref tsx WHERE tsx.shader_id = s.shader_id AND tsx.tag_id = @tag4))
            AND   (@tag5 IS NULL OR EXISTS(SELECT 1 AS num FROM dbo.tag_shader_xref tsx WHERE tsx.shader_id = s.shader_id AND tsx.tag_id = @tag5))
            AND   (@tag6 IS NULL OR EXISTS(SELECT 1 AS num FROM dbo.tag_shader_xref tsx WHERE tsx.shader_id = s.shader_id AND tsx.tag_id = @tag6))
            AND   (@tag7 IS NULL OR EXISTS(SELECT 1 AS num FROM dbo.tag_shader_xref tsx WHERE tsx.shader_id = s.shader_id AND tsx.tag_id = @tag7))
            AND   (@tag8 IS NULL OR EXISTS(SELECT 1 AS num FROM dbo.tag_shader_xref tsx WHERE tsx.shader_id = s.shader_id AND tsx.tag_id = @tag8))
            AND   (@tag9 IS NULL OR EXISTS(SELECT 1 AS num FROM dbo.tag_shader_xref tsx WHERE tsx.shader_id = s.shader_id AND tsx.tag_id = @tag9))
            AND   (@tag10 IS NULL OR EXISTS(SELECT 1 AS num FROM dbo.tag_shader_xref tsx WHERE tsx.shader_id = s.shader_id AND tsx.tag_id = @tag10))
        END
END

尽管我没有用尽所有的选择,但这仍然是一个很好的练习,因为我已经向自己证明,我的数据库设计对于这项任务非常有效。我也从这个问题中学到了很多。我知道exec()不好,因为它没有缓存查询计划。但我不知道sp_executesql缓存查询计划,这很酷。我也不知道常用的表表达式。而亨里克·欧宝发布的链接中充满了这类任务的好技巧。

所以我有一个在线搜索引擎的工作示例 http://www.silverlightxap.com/controls 如果有人有兴趣看到这一行动。

OMG Ponies 15 年前

您的查询非常适合使用公共表表达式(CTE),因为EXISTS子句中存在重复的相关子查询:

WITH attribute AS(
  SELECT tsx.shader_id,
         t.tag_name
    FROM TAG_SHADER_XREF tsx ON tsx.shader_id = s.shader_id
    JOIN TAG t ON t.tad_id = tsx.tag_id)
SELECT ft.[RANK], 
       s.shader_id, 
       s.page_name, 
       s.name, 
       s.description, 
       s.download_count, 
       s.rating, 
       s.price 
  FROM SHADER s 
  JOIN FREETEXTTABLE(SHADER, *, @search_term) AS ft ON s.shader_id = ft.[KEY]
  JOIN attribute a1 ON a1.shader_id = s.shader_id AND a1.tag_name = 'color'
  JOIN attribute a2 ON a2.shader_id = s.shader_id AND a2.tag_name = 'saturation'
 ORDER BY ft.[RANK] DESC

通过使用CTE,我还将EXISTS转换为JOIN。

谈到关于使用动态SQL的最初问题,唯一的替代方法是在应用它之前检查传入参数的转义条件。即:

WHERE (@param1 IS NULL OR a1.tag_name = @param1)

您认为动态SQL存在哪些性能问题?使用 sp_executesql exec )-验证将在查询计划之前进行,为什么会跳过之后的步骤?

Community leo1 7 年前

在使用参数数量待定?

您可以动态生成适当的相反

当参数第一次出现时,构建并准备语句模板,缓存准备好的语句,以便在再次出现相同数量的参数时重复使用。

这可以在应用程序或足够复杂的存储过程中完成。

我更喜欢这种方法,比如说,一个过程最多需要10个标记,并且有grody逻辑来处理其中任何一个为空的标记。

Bill Karwin's GROUP BY answer IN 谓词和更新 COUNT xref1 , xref2 ,等等)。

Bill Karwin 15 年前

我见过两种解决这个问题的方法:

shader 表到 tags (根据需要通过外部参照)对于要查找的每个标记一次。内部联接的结果仅包括与所有标记匹配的着色器。

SELECT s.*
FROM shader s
JOIN tag_shader_xref x1 ON (s.shader_id = x1.shader_id)
JOIN tag t1 ON (t1.tag_id = x1.tag_id AND t1.tag_name = 'color')
JOIN tag_shader_xref x2 ON (s.shader_id = x2.shader_id)
JOIN tag t2 ON (t2.tag_id = x2.tag_id AND t2.tag_name = 'saturation')
JOIN tag_shader_xref x3 ON (s.shader_id = x3.shader_id)
JOIN tag t3 ON (t3.tag_id = x3.tag_id AND t3.tag_name = 'transparency');

第二种解决方案是加入该标记一次,将标记限制为所需的三个,然后 GROUP BY 这个 shader_id

SELECT s.shader_id
FROM shader s
JOIN tag_shader_xref x ON (s.shader_id = x.shader_id)
JOIN tag t ON (t.tag_id = x.tag_id 
  AND t.tag_name IN ('color', 'saturation', 'transparency'))
GROUP BY s.shader_id
HAVING COUNT(DISTINCT t.tag_name) = 3;

你应该用哪一种?取决于您的数据库品牌如何优化一种方法或另一种方法。我通常使用MySQL,但它在 分组

llamaoo7 15 年前

这可能不是最快的方法,但您能为每个标记生成一个查询字符串,然后用“INTERSECT”将它们连接起来吗?

编辑:未看到存储过程标记,因此我不知道这是否可行。

Henrik Opel 15 年前

我对Henrik的答案投了赞成票,但我能想到的另一种选择是将搜索标记放入临时表或表变量中,然后对其进行连接或使用带有子选择的IN子句。因为你想要有结果 全部的

如何将值放入表中?如果将标记传递给存储过程,并且使用的是SQL Server 2008,则可以使用新的表值参数功能将表变量直接传递给存储过程。

否则,如果在单个字符串中接收标记,则可以使用返回表的存储函数,例如 SplitString function shown here

... WHERE @SearchTagCount = (SELECT COUNT(tsx.shader_id) FROM tag_shader_xref tsx
INNER JOIN tag t ON tsx.tag_id = t.tag_id
WHERE tsx.shader_id = s.shader_id AND t.tag_name IN (SELECT * FROM dbo.SplitString(@SearchTags,',')))

-1

Kevin LaBranche 15 年前

用逗号将标记串在一起,将它们分隔为“apple”、“orange”,然后将其传递给存储过程中使用in子句的一个参数。

当然,如果您有这些标记的查找表中的值(键),我会使用这些值。

编辑:

因为您需要结果中的所有标记。。。。

不幸的是,我认为无论你做什么,SP都将面临重新制定计划的危险。

您可以使用可选参数、用例和ISNULL来构建参数。