代码之家  ›  专栏  ›  技术社区  ›  Michael

更改groupby中的列值

  •  0
  • Michael  · 技术社区  · 6 年前

    这是一个数据帧:

               id         value next_value      step
    0       10764  612486.33921   800000.0       0.0
    1       10764  612486.33921  1000000.0  200000.0
    2       10822   97604.30520   100000.0       0.0
    3       10823   97805.45851   100000.0       0.0
    4       10823   97805.45851   165000.0   65000.0
    5       10823   97805.45851   250000.0   85000.0
    6       10823   97805.45851   400000.0  150000.0
    7       10823   97805.45851   600000.0  200000.0
    8       10823   97805.45851   800000.0  200000.0
    9       10823   97805.45851  1000000.0  200000.0
    10      10823   97805.45851  1200000.0  200000.0
    11      10882   33843.33701    40000.0       0.0
    12      10882   33843.33701    60000.0   20000.0
    13      10882   33843.33701    90000.0   30000.0
    14      10882   33843.33701   125000.0   35000.0
    15      10882   33843.33701   150000.0   25000.0
    16      10882   33843.33701   175000.0   25000.0
    17      10882   33843.33701   205000.0   30000.0
    18      10882   33843.33701   230000.0   25000.0
    19      10882   33843.33701   300000.0   70000.0
    20      10882   33843.33701   440000.0  140000.0
    21      10758  241890.94525   385000.0       0.0
    22      10817  138519.73560   150000.0       0.0
    23      10817  138519.73560   200000.0   50000.0
    24      10817  138519.73560   250000.0   50000.0
    25      10817  138519.73560   300000.0   50000.0
    26      10817  138519.73560   350000.0   50000.0
    27      10817  138519.73560   400000.0   50000.0
    28      10859   73140.30048    80000.0       0.0
    29      10859   73140.30048   100000.0   20000.0
    

    step 第一个区别是 next_value value 分组方式 id

    例如 id=10764 step = step + 800000.0 - 612486.33921 ,

    id=10822 step = step + 100000.0 - 97604.30520 ,

    等。

    我试过这个:

    df['step'] += df.groupby('id')[['next_value', 'value']].apply
        (lambda x: x.iloc[0, 0] - x.iloc[0, 1])
    

    但它给了我一切 值等于 NaN .

    1 回复  |  直到 6 年前
        1
  •  2
  •   Zero    6 年前

    使用 groupby id 去拿那个 first 价值观。然后,使用 s df.step

    In [717]: s = df.groupby('id').transform('first')
    
    In [718]: df.step + s.next_value - s.value
    Out[718]:
    0     187513.66079
    1     387513.66079
    2       2395.69480
    3       2194.54149
    4      67194.54149
    5      87194.54149
    6     152194.54149
    7     202194.54149
    8     202194.54149
    9     202194.54149
    10    202194.54149
    11      6156.66299
    12     26156.66299
    13     36156.66299
    14     41156.66299
    15     31156.66299
    16     31156.66299
    17     36156.66299
    18     31156.66299
    19     76156.66299
    20    146156.66299
    21    143109.05475
    22     11480.26440
    23     61480.26440
    24     61480.26440
    25     61480.26440
    26     61480.26440
    27     61480.26440
    28      6859.69952
    29     26859.69952
    dtype: float64
    

    In [719]: s
    Out[719]:
               value  next_value  step
    0   612486.33921    800000.0   0.0
    1   612486.33921    800000.0   0.0
    2    97604.30520    100000.0   0.0
    3    97805.45851    100000.0   0.0
    4    97805.45851    100000.0   0.0
    5    97805.45851    100000.0   0.0
    6    97805.45851    100000.0   0.0
    7    97805.45851    100000.0   0.0
    8    97805.45851    100000.0   0.0
    9    97805.45851    100000.0   0.0
    10   97805.45851    100000.0   0.0
    11   33843.33701     40000.0   0.0
    12   33843.33701     40000.0   0.0
    13   33843.33701     40000.0   0.0
    14   33843.33701     40000.0   0.0
    15   33843.33701     40000.0   0.0
    16   33843.33701     40000.0   0.0
    17   33843.33701     40000.0   0.0
    18   33843.33701     40000.0   0.0
    19   33843.33701     40000.0   0.0
    20   33843.33701     40000.0   0.0
    21  241890.94525    385000.0   0.0
    22  138519.73560    150000.0   0.0
    23  138519.73560    150000.0   0.0
    24  138519.73560    150000.0   0.0
    25  138519.73560    150000.0   0.0
    26  138519.73560    150000.0   0.0
    27  138519.73560    150000.0   0.0
    28   73140.30048     80000.0   0.0
    29   73140.30048     80000.0   0.0