代码之家  ›  专栏  ›  技术社区  ›  Tim Schmelter

sort()和binarysearch()与整数/字符串的性能比较

  •  3
  • Tim Schmelter  · 技术社区  · 14 年前

    最初我想问一下对整数排序是否比字符串更快。 但我自己也回答了这个问题,我对这一巨大的差异感到惊讶。 为什么排序和binarysearch整数比字符串更快?

    (vb.net)使用1.000.000 int32/字符串的测试:

    Private Function CheckIntBinarySearch() As TimeSpan
        Dim watch As New System.Diagnostics.Stopwatch()
        Dim rnd As New Random(Date.Now.Millisecond)
        Dim intCol1 As New List(Of Int32)
        Dim intCol2 As New List(Of Int32)
        Dim contains As Int32
        For i As Int32 = 1 To 1000000
            intCol1.Add(rnd.Next(1, 1000000))
        Next
        For i As Int32 = 1 To 1000000
            intCol2.Add(rnd.Next(1, 1000000))
        Next
        Me.output.WriteLine("Integers sorting ...")
        watch.Start()
        intCol1.Sort()
        watch.Stop()
        Me.output.WriteLine("Sorting finished: " & watch.Elapsed.TotalSeconds & " seconds elapsed.")
    
        Me.output.WriteLine("Integers BinarySearch ...")
        watch.Start()
        For Each Val As Int32 In intCol2
            If intCol1.BinarySearch(Val) > -1 Then contains += 1
        Next
        watch.Stop()
        Me.output.WriteLine("BinarySearch finished(contains " & contains & "): " & watch.Elapsed.TotalSeconds & " seconds elapsed.")
        Return watch.Elapsed
    End Function
    
    Private Function CheckStringBinarySearch() As TimeSpan
        Dim watch As New System.Diagnostics.Stopwatch()
        Dim rnd As New Random(Date.Now.Millisecond)
        Dim stringCol1 As New List(Of String)
        Dim stringCol2 As New List(Of String)
        Dim contains As Int32
        For i As Int32 = 1 To 1000000
            stringCol1.Add(rnd.Next(1, 1000000).ToString)
        Next
        For i As Int32 = 1 To 1000000
            stringCol2.Add(rnd.Next(1, 1000000).ToString)
        Next
        Me.output.WriteLine("Strings sorting ...")
        watch.Start()
        stringCol1.Sort()
        watch.Stop()
        Me.output.WriteLine("Sorting finished: " & watch.Elapsed.TotalSeconds & " seconds elapsed.")
        Me.output.WriteLine("Strings BinarySearch ...")
        watch.Start()
        For Each Val As String In stringCol2
            If stringCol1.BinarySearch(Val) > -1 Then contains += 1
        Next
        watch.Stop()
        Me.output.WriteLine("BinarySearch finished(contains " & contains & "): " & watch.Elapsed.TotalSeconds & " seconds elapsed.")
        Return watch.Elapsed
    End Function
    

    检查性能5次:

    For i As Int32 = 1 To 5
       intChecks.Add(CheckIntBinarySearch())
    Next
    For i As Int32 = 1 To 5
       stringChecks.Add(CheckStringBinarySearch())
    Next
    

    输出:

        1.)Integers sorting ...
        Sorting finished: 0,2292863 seconds elapsed.
        Integers BinarySearch ...
        BinarySearch finished(contains 630857): 0,9365744 seconds elapsed.
        2.)Integers sorting ...
        Sorting finished: 0,2287382 seconds elapsed.
        Integers BinarySearch ...
        BinarySearch finished(contains 632600): 0,9053134 seconds elapsed.
        3.)Integers sorting ...
        Sorting finished: 0,2318829 seconds elapsed.
        Integers BinarySearch ...
        BinarySearch finished(contains 631475): 0,9038594 seconds elapsed.
        4.)Integers sorting ...
        Sorting finished: 0,2308994 seconds elapsed.
        Integers BinarySearch ...
        BinarySearch finished(contains 632346): 0,9011047 seconds elapsed.
        5.)Integers sorting ...
        Sorting finished: 0,2266423 seconds elapsed.
        Integers BinarySearch ...
        BinarySearch finished(contains 632982): 0,893541 seconds elapsed.
        1.)Strings sorting ...
        Sorting finished: 6,5661916 seconds elapsed.
        Strings BinarySearch ...
        BinarySearch finished(contains 632579): 12,9545657 seconds elapsed.
        2.)Strings sorting ...
        Sorting finished: 6,5641975 seconds elapsed.
        Strings BinarySearch ...
        BinarySearch finished(contains 631478): 13,0184132 seconds elapsed.
        3.)Strings sorting ...
        Sorting finished: 6,4281382 seconds elapsed.
        Strings BinarySearch ...
        BinarySearch finished(contains 631775): 12,7684214 seconds elapsed.
        4.)Strings sorting ...
        Sorting finished: 6,9455087 seconds elapsed.
        Strings BinarySearch ...
        BinarySearch finished(contains 631478): 13,7057234 seconds elapsed.
        5.)Strings sorting ...
        Sorting finished: 6,6707111 seconds elapsed.
        Strings BinarySearch ...
        BinarySearch finished(contains 632346): 13,0493649 seconds elapsed.
    
    • Int32 排序平均值: 022948982
    • String 排序平均值: 六亿六千三百四十九万四千九百四十二
    • 英特32 BinarySearch平均值: 090807858
    • BinarySearch平均值: 十三亿零九百九十二万九千七百七十二

    结论:

    • 排序整数是 二十九 比排序字符串快一倍
    • BinarySearch整数是 14、4 比binarysearch字符串快一倍

    为什么? 考虑拥有大量的“字符串整数”集合(“1”、“2”、“3”、…)。在排序和搜索之前将它们解析为整数更好吗?将字符串解析为整数的成本是多少?好吧,我想这是另一个问题。

    5 回复  |  直到 14 年前
        1
  •  4
  •   Nick Fortescue    14 年前

        2
  •  3
  •   smirkingman    14 年前

        3
  •  2
  •   Daniel    14 年前

     stringCol1.Sort(StringComparer.Ordinal)
    

    Unicode Collation Algorithm

        4
  •  1
  •   Pieter van Ginkel    14 年前

        5
  •  0
  •   Mike Dunlavey    14 年前

    sub