
How can I optimize this query for time performance?

performance sql

EndangeredMassa · 14 years ago

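I'm working with a large amount of data: 6 million rows. I need the query to run as fast as possible, but I'm at a loss for further optimization. I've already removed 3 subqueries and brought it down from 11+ hours to 35 minutes on a modest dataset of 100k rows. See below!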

declare @UserId uniqueidentifier;
set @UserId = '936DA01F-9ABD-4d9d-80C7-02AF85C822A8';


select
    temp.Address_Line1,
    temp.Cell_Phone_Number,
    temp.City,
    temp.CPM_delt_acd,
    temp.CPM_delt_date,
    temp.Customer_Id,
    temp.Customer_Type,
    temp.Date_Birth,
    temp.Email_Business,
    temp.Email_Home,
    temp.First_Name,
    temp.Geo,
    temp.Home_Phone_Number,
    temp.Last_Name,
    temp.Link_Customer_Id,
    temp.Middle_Name,
    temp.Naics_Code,
    temp.Office_Phone_Number,
    temp.St,
    temp.Suffix,
    temp.Tin,
    temp.TIN_Indicator,
    temp.Zip_Code,

    crm_c.contactid as CrmRecordId, 
    crm_c.ownerid as OldOwnerId, 
    crm_c.ext_profiletype as old_profileType,
    coalesce(crm_fim.ownerid, @UserId) as OwnerId,
    2 as profileType,

    case 
        when
            (temp.Tin = crm_c.ext_retail_prime_taxid collate database_default 
            and temp.Last_Name = crm_c.lastname collate database_default)
        then
            ('Tin/LastName: '+temp.Tin + '/' + temp.Last_Name)
        when
            (temp.Customer_ID = crm_c.ext_customerid collate database_default)
        then
            ('Customer_ID: '+temp.Customer_ID)
        else
            ('New Customer: '+temp.Customer_ID)
    end as FriendlyName,

    case 
        when
            (temp.Customer_ID = crm_c.ext_customerid collate database_default)
        then
            0
        else
            1
    end as ForceFieldLock

from DailyProfile_Current temp

left join crm_contact crm_c 
    on (temp.Customer_ID = crm_c.ext_customerid collate database_default 
        or (temp.Tin = crm_c.ext_retail_prime_taxid collate database_default 
        and temp.Last_Name = crm_c.lastname collate database_default))
    and 0 = crm_c.deletionstatecode and 0 = crm_c.statecode    

left outer join crm_ext_ImportMapping crm_fim 
    on temp.Geo = crm_fim.ext_geocode collate database_default 
    and 0 = crm_fim.deletionstatecode and 0 = crm_fim.statecode

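Here crm_contact is a synonym pointing to a view in another database. That view pulls data from a contact table and a contact extension table, and I need data from both. I can split this into two joins if necessary. In general, columns that start with "ext" come from the extension part of the CRM contact view.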

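When I run this against 100k rows in the DailyProfile_Current table, it takes about 35 minutes. That table is a bunch of nvarchar(200) columns that a flat file gets dumped into. It's awful, but it's what I inherited. I wonder whether using real data types would help, but I'd prefer a solution that doesn't involve that.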

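If the DailyProfile_Current table is full of rows that don't match the join conditions, the query runs very fast. If the table is full of rows that do match the join conditions, it is very slow.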

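There are indexes on Customer_ID and Geo in the temp table. There are also assorted indexes on the crm_contact tables. I don't know how much indexes help on nvarchar(200) columns, though.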

In case it matters, I'm using SQL Server 2005.

Any ideas are appreciated.

3 replies | last activity 13 years ago

Ben Hoffman · 14 years ago

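I would definitely split this into two queries, since an OR in a join condition can sometimes be slow. Also, put a nonclustered index on each of these column sets (one index per line):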

DailyProfile_Current:
Customer_ID 
Tin, Last_Name
Geo 

crm_contact:
ext_customerid,deletionstatecode,statecode
ext_retail_prime_taxid, lastname ,deletionstatecode,statecode

crm_ext_ImportMapping:
ext_geocode,deletionstatecode,statecode
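
As a rough sketch of the first group above, the indexes on DailyProfile_Current could be created like this. The index names are made up for illustration, and the question says indexes on Customer_ID and Geo already exist, so only the (Tin, Last_Name) one may actually be new. Two nvarchar(200) key columns come to at most 800 bytes, which stays under SQL Server 2005's 900-byte index key limit.

```sql
-- Hypothetical index names; skip any index that already exists.
create nonclustered index IX_DailyProfile_Current_Customer_ID
    on DailyProfile_Current (Customer_ID);

-- Supports the Tin + Last_Name half of the join condition.
create nonclustered index IX_DailyProfile_Current_Tin_Last_Name
    on DailyProfile_Current (Tin, Last_Name);

create nonclustered index IX_DailyProfile_Current_Geo
    on DailyProfile_Current (Geo);
```

Because crm_contact is a synonym for a view, the indexes listed for it would have to be created on the underlying base tables in the other database. The post does not name those tables, and if the ext_ columns and the state columns live in different tables, the composite indexes may need to be split accordingly.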

van · 14 years ago

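Why not try running it through Query Analyzer? It might give you some hints.
Or include the execution plan with the query results and take a look at it.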
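
One way to get the plan and basic cost figures alongside the results, sketched here with standard SET options available in SQL Server 2005. The count query is only a stand-in; the real query under test would go in its place.

```sql
-- Report logical/physical reads per table and CPU/elapsed time.
set statistics io on;
set statistics time on;
-- Return the actual execution plan as an extra result set.
set statistics profile on;

-- Placeholder for the query being investigated.
select count(*) from DailyProfile_Current;

set statistics profile off;
set statistics time off;
set statistics io off;
```

In the plan output, scans against the crm_contact base tables and large gaps between estimated and actual row counts are the usual places to start looking.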

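From the query's point of view, I can only suggest getting rid of the OR by moving it out of the JOIN clause and combining the results with UNION ALL. At the very least, that will tell you which of the two kinds of JOIN is slow, and you can work from there.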
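
A minimal sketch of that rewrite, assuming the goal is to return the same rows as the original left join. Only a few of the temp columns are carried through, and the two CASE expressions (FriendlyName, ForceFieldLock) are left out; they would need the relevant crm_c columns selected in each branch. The CTE name matched and the MatchKind column are invented for illustration.

```sql
declare @UserId uniqueidentifier;
set @UserId = '936DA01F-9ABD-4d9d-80C7-02AF85C822A8';

with matched as (
    -- Branch 1: rows that match on Customer_ID.
    select temp.Customer_ID, temp.Tin, temp.Last_Name, temp.Geo,
           crm_c.contactid, crm_c.ownerid, crm_c.ext_profiletype,
           1 as MatchKind
    from DailyProfile_Current temp
    inner join crm_contact crm_c
        on temp.Customer_ID = crm_c.ext_customerid collate database_default
        and 0 = crm_c.deletionstatecode and 0 = crm_c.statecode

    union all

    -- Branch 2: rows that match on Tin + Last_Name, excluding pairs
    -- already returned by branch 1 so nothing is duplicated.
    select temp.Customer_ID, temp.Tin, temp.Last_Name, temp.Geo,
           crm_c.contactid, crm_c.ownerid, crm_c.ext_profiletype,
           2 as MatchKind
    from DailyProfile_Current temp
    inner join crm_contact crm_c
        on temp.Tin = crm_c.ext_retail_prime_taxid collate database_default
        and temp.Last_Name = crm_c.lastname collate database_default
        and 0 = crm_c.deletionstatecode and 0 = crm_c.statecode
    where temp.Customer_ID is null
       or crm_c.ext_customerid is null
       or temp.Customer_ID <> crm_c.ext_customerid collate database_default

    union all

    -- Branch 3: rows with no active contact at all (the rows the
    -- original left join kept with NULL crm_c columns).
    select temp.Customer_ID, temp.Tin, temp.Last_Name, temp.Geo,
           null, null, null,
           0 as MatchKind
    from DailyProfile_Current temp
    where not exists (
            select 1 from crm_contact c1
            where temp.Customer_ID = c1.ext_customerid collate database_default
              and 0 = c1.deletionstatecode and 0 = c1.statecode)
      and not exists (
            select 1 from crm_contact c2
            where temp.Tin = c2.ext_retail_prime_taxid collate database_default
              and temp.Last_Name = c2.lastname collate database_default
              and 0 = c2.deletionstatecode and 0 = c2.statecode)
)
select m.Customer_ID, m.Tin, m.Last_Name, m.Geo,
       m.contactid as CrmRecordId,
       m.ownerid as OldOwnerId,
       m.ext_profiletype as old_profileType,
       coalesce(crm_fim.ownerid, @UserId) as OwnerId,
       2 as profileType,
       m.MatchKind
from matched m
left outer join crm_ext_ImportMapping crm_fim
    on m.Geo = crm_fim.ext_geocode collate database_default
    and 0 = crm_fim.deletionstatecode and 0 = crm_fim.statecode;
```

Running each branch on its own is a quick way to see which join dominates the run time. Each branch now has a simple equality join, so an index on the matching columns has a chance of being used.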

Eric · 14 years ago

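Run it through Query Analyzer and let it create indexes for you. I'm guessing you have at least SQL 2000. Why not break some of the functionality out into code? For example, you could do the CASE statements in code. That assumes you're issuing the query from code, though. I've found that splitting queries and taking on some of the load in code makes a significant difference at run time.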