Performance of Perl strings

  •  10
  • justkt  · Tech Community  · 14 years ago

    I've come across a lot of Perl code that breaks up long strings this way:

    my $string = "Hi, I am a very long and chatty string that just won't";
    $string .= " quit.  I'm going to keep going, and going, and going,";
    $string .= " kind of like the Energizer bunny.  What are you going to";
    $string .= " do about it?";
    

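    Coming from a Java background, building a string like this would be a performance problem. Is the same true in Perl? In my searches I've read that using join on an array of strings is the fastest way to concatenate strings, but what about when you only want to split a string up for readability? Is it better to write: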

    my $string = "Hi, I am a very long and chatty string that just won't" .
        " quit.  I'm going to keep going, and going, and going," .
        " kind of like the Energizer bunny.  What are you going to" .
        " do about it?";
    

    Or should I use join?

    7 answers  |  as of 14 years ago
        1
  •  16
  •   Justin R.    14 years ago

    Camel book, p 598 :

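    Prefer join("", ...) to a series of concatenated strings. Multiple concatenations may cause strings to be copied back and forth multiple times. The join operator avoids this.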
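    To make the quote concrete, here is a minimal sketch of the two run-time approaches it contrasts. The sub names and the sample @parts list are hypothetical, chosen only for illustration; the sketch shows that both produce the same string and makes no claim about which is faster.

```perl
use strict;
use warnings;

# Hypothetical input: pieces that are only known at run time.
my @parts = map { "chunk $_ " } 1 .. 5;

# Repeated concatenation: the target string is extended once per piece.
sub build_with_concat {
    my @pieces = @_;
    my $string = '';
    $string .= $_ for @pieces;
    return $string;
}

# join: all pieces are handed over in a single call.
sub build_with_join {
    my @pieces = @_;
    return join '', @pieces;
}

print build_with_join(@parts), "\n";
```

    Which of the two wins in practice depends on the number and size of the pieces, so it is worth measuring with the Benchmark module on your own data.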

        2
  •  11
  •   Eric Strom    14 years ago

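    One more thing to add to this thread that hasn't been mentioned yet: if you can, avoid joining or concatenating these strings at all. Many functions take a list of strings as arguments rather than a single string, so you can pass the pieces separately, for example: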

    print "this is",
        " perfectly legal",
        " because print will happily",
        " take a list and send all the",
        " strings to the output stream\n";
    
    die "this is also",
        " perfectly acceptable";
    
    use Log::Log4perl qw(:easy); use Data::Dumper;
    INFO("and this is just fine",
        " as well");
    
    INFO(sub {
        local $Data::Dumper::Maxdepth = 1;
        "also note that many libraries will",
        " accept subrefs, in which you",
        " can perform operations which",
        " return a list of strings...",
        Dumper($obj);
     });
    
        3
  •  10
  •   sebthebert    14 years ago

    I ran a benchmark! :)

    #!/usr/bin/perl
    
    use warnings;
    use strict;
    
    use Benchmark qw(cmpthese timethese);
    
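    # Note: when invoked as "./strings.pl 1000", $ARGV[1] is undefined, so
    # Benchmark falls back to its default of at least 3 CPU seconds per test.
    my $bench = timethese($ARGV[1], {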
    
      multi_concat => sub {
        my $string = "Hi, I am a very long and chatty string that just won't";
        $string .= " quit.  I'm going to keep going, and going, and going,";
        $string .= " kind of like the Energizer bunny.  What are you going to";
        $string .= " do about it?";
      },
    
      one_concat => sub {
        my $string = "Hi, I am a very long and chatty string that just won't" .
        " quit.  I'm going to keep going, and going, and going," .
        " kind of like the Energizer bunny.  What are you going to" .
        " do about it?";
      },
    
      join => sub {
        my $string = join("", "Hi, I am a very long and chatty string that just won't",
        " quit.  I'm going to keep going, and going, and going,",
        " kind of like the Energizer bunny.  What are you going to",
        " do about it?"
        );
      },
    
    } );
    
    cmpthese $bench;
    
    1;
    

    imac:Benchmarks seb$ ./strings.pl 1000
    Benchmark: running join, multi_concat, one_concat for at least 3 CPU seconds...
          join:  2 wallclock secs ( 3.13 usr +  0.01 sys =  3.14 CPU) @ 3235869.43/s (n=10160630)
    multi_concat:  3 wallclock secs ( 3.20 usr + -0.01 sys =  3.19 CPU) @ 3094491.85/s (n=9871429)
    one_concat:  2 wallclock secs ( 3.43 usr +  0.01 sys =  3.44 CPU) @ 12602343.60/s (n=43352062)
                       Rate multi_concat         join   one_concat
    multi_concat  3094492/s           --          -4%         -75%
    join          3235869/s           5%           --         -74%
    one_concat   12602344/s         307%         289%           --
    
        4
  •  3
  •   Eric Strom    14 years ago

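    The main performance difference between your two examples is that in the first, the concatenation happens every time the code is called, whereas in the second, the compiler folds the constant strings together.

    So if either example sits in a loop or a function that is called many times, the second one will be faster.

    This assumes the strings are known at compile time. If you are building the strings at run time, then, as fatcat1111 mentions, the join operator will be faster than repeated concatenation.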
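    One way to see the folding for yourself is to ask the core B::Deparse module to print back what the compiler produced. This is only a sketch: the variable names $folded, $appended and $deparse are my own, and the exact text Deparse prints can differ between Perl versions.

```perl
use strict;
use warnings;
use B::Deparse;

# Literal pieces joined with "." can be folded by the compiler.
my $folded = sub {
    my $string = "Hi, I am a very long" . " and chatty string";
    return $string;
};

# Pieces appended with ".=" stay as separate run-time operations.
my $appended = sub {
    my $string = "Hi, I am a very long";
    $string .= " and chatty string";
    return $string;
};

# Turn the compiled subs back into source text and print them.
my $deparse = B::Deparse->new;
print "one_concat:\n",   $deparse->coderef2text($folded),   "\n";
print "multi_concat:\n", $deparse->coderef2text($appended), "\n";
```

    In the first listing the two literals should show up as a single string, while the second still contains the .= statement.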

        5
  •  2
  •   Community Ramakrishna.p    7 years ago

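    In my benchmarks, join is only marginally faster than .= for a handful of strings, and noticeably slower for long lists: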

    4 strings:
              Rate   .= join    .
    .=   2538071/s   --  -4% -18%
    join 2645503/s   4%   -- -15%
    .    3105590/s  22%  17%   --
    1_000 strings:
             Rate join   .=
    join 152439/s   -- -40%
    .=   253807/s  66%   --
    

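    So in terms of your question, . beats .= for execution time, though not by enough to be worth worrying about in general. Readability is almost always more important than performance, and .= is often the more readable form.

    As sebthebert's answer demonstrates, . is so much faster than .= when concatenating constants that I would treat it as a rule in that case.

    (The benchmarks, by the way, are in basically the obvious form, and I'd rather not repeat the code here. The only surprising thing is that the initial strings are read from <DATA> so as to foil constant folding.)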
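    For readers who want to try this, below is a rough sketch of what such a benchmark could look like for the four-string case. It is not the code behind the numbers above. As an assumption, it swaps <DATA> for an in-memory filehandle (which equally keeps the strings out of the compiler's reach), and the names $source, @parts and %tests as well as the iteration count are hypothetical.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Stand-in for <DATA>: an in-memory filehandle, so the pieces are only
# known at run time and the compiler cannot fold them together.
my $source = join "\n",
    "Hi, I am a very long and chatty string that just won't",
    " quit.  I'm going to keep going, and going, and going,",
    " kind of like the Energizer bunny.  What are you going to",
    " do about it?";
open my $fh, '<', \$source or die "cannot open in-memory handle: $!";
chomp(my @parts = <$fh>);
close $fh;

my ($p1, $p2, $p3, $p4) = @parts;

my %tests = (
    # Append one piece at a time.
    '.=' => sub {
        my $string = $p1;
        $string .= $p2;
        $string .= $p3;
        $string .= $p4;
        return $string;
    },
    # Hand all pieces to join in one call.
    'join' => sub { return join '', $p1, $p2, $p3, $p4 },
    # Chain the pieces in a single expression.
    '.'    => sub { return $p1 . $p2 . $p3 . $p4 },
);

# Hypothetical iteration count; raise it for more stable numbers.
cmpthese(1_000_000, \%tests);
```

    Benchmark may warn about too few iterations on fast machines; passing a negative count such as -3 makes it run each test for at least that many CPU seconds instead.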

        6
  •  1
  •   JSBձոգչ    14 years ago

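    Use whichever one you like better; their performance is exactly the same in Perl. Perl strings are not like Java strings and can be modified in place.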

        7
  •  -1
  •   dirk    14 years ago

    my $string = "Hi, I am a very long and  chatty string that just won't
     quit.   I'm going to keep going, and going,  and going,
     kind of like the Energizer  bunny.  What are you going to
     do  about it?";