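I have found that data.table and dplyr give different results when trying to do the same thing. I would like to use the dplyr syntax, but have it compute the way data.table does. The use case is that I want to add subtotals to a table. To do that, I need to do some aggregation on each variable but keep the same variable names (in the transformed version). data.table lets me aggregate a variable and keep the same name, then do another aggregation using that same variable, and it will continue to use the untransformed version. dplyr, however, will use the transformed version.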
The summarise() documentation says:
# Note that with data frames, newly created summaries immediately
# overwrite existing variables
mtcars %>%
  group_by(cyl) %>%
  summarise(disp = mean(disp), sd = sd(disp))
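This is basically the issue I'm having, but I'm wondering whether there is a good workaround. One thing I found is to give the transformed variable a different name and then rename() it at the end, but that isn't great for me. If there is a nice way to do subtotals, that would be good to know too. I looked around this site and didn't see this exact situation discussed. Any help would be appreciated!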
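A minimal sketch of another possible workaround for the documentation example, assuming the order of the output columns does not matter: put every summary that needs the original column first and overwrite the column last. The name `res` is only for illustration.

```r
library(dplyr)

# sd() is evaluated while `disp` still refers to the original column;
# `disp` is overwritten by its group mean only afterwards.
res <- mtcars %>%
  group_by(cyl) %>%
  summarise(sd = sd(disp), disp = mean(disp))

res
```

With the order reversed, as in the documentation example, sd() would be applied to the single mean value per group and return NA.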
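Here is a simple example, once with data.table and once with dplyr. I want to take this simple table and append a subtotal row that is the weighted average of the column of interest (Total).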
library(data.table)
library(dplyr)

dt <- data.table(Group = LETTERS[1:5],
                 Count = c(1000, 1500, 1200, 2000, 5000),
                 Total = c(50, 300, 600, 400, 1000))

# Share of the overall count for each group (added by reference)
dt[, Count_Dist := Count/sum(Count)]

# Both expressions see the original, per-row Count_Dist
dt[, .(Count_Dist = sum(Count_Dist), Weighted_Total = sum(Count_Dist*Total))]

# Append the subtotal row, reusing the existing column names
dt <- rbind(dt[, .(Group, Count_Dist, Total)],
            dt[, .(Group = "All", Count_Dist = sum(Count_Dist), Total = sum(Count_Dist*Total))])
setnames(dt, "Total", "Weighted_Avg_Total")
dt
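As a sanity check on the value the "All" row should hold, the weighted average can be computed directly in base R. This is only a sketch; the vector names `count` and `total` are made up for illustration and mirror the Count and Total columns above.

```r
count <- c(1000, 1500, 1200, 2000, 5000)
total <- c(50, 300, 600, 400, 1000)

# Same arithmetic as sum(Count_Dist * Total) with per-row shares
by_hand <- sum(count / sum(count) * total)

# Base R equivalent
builtin <- weighted.mean(total, w = count)

all.equal(by_hand, builtin)
```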
df <- data.frame(Group = LETTERS[1:5],
                 Count = c(1000, 1500, 1200, 2000, 5000),
                 Total = c(50, 300, 600, 400, 1000))

# Count_Dist is overwritten first, so Weighted_Total uses the summed
# value instead of the per-row shares
df %>%
  mutate(Count_Dist = Count/sum(Count)) %>%
  summarize(Count_Dist = sum(Count_Dist),
            Weighted_Total = sum(Count_Dist*Total))

# Same problem when building the subtotal row
df %>%
  mutate(Count_Dist = Count/sum(Count)) %>%
  select(Group, Count_Dist, Total) %>%
  rbind(df %>%
          mutate(Count_Dist = Count/sum(Count)) %>%
          summarize(Group = "All",
                    Count_Dist = sum(Count_Dist),
                    Total = sum(Count_Dist*Total))) %>%
  rename(Weighted_Avg_Total = Total)
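One way the dplyr version might be made to match the data.table result is sketched below. It assumes it is acceptable to compute Total before Count_Dist inside summarise() and to use bind_rows(), which matches columns by name, instead of rbind(). The names `with_dist`, `subtotal` and `result` are hypothetical helpers, not part of the code above.

```r
library(dplyr)

df <- data.frame(Group = LETTERS[1:5],
                 Count = c(1000, 1500, 1200, 2000, 5000),
                 Total = c(50, 300, 600, 400, 1000),
                 stringsAsFactors = FALSE)

with_dist <- df %>%
  mutate(Count_Dist = Count / sum(Count)) %>%
  select(Group, Count_Dist, Total)

# Total is computed while Count_Dist still holds the per-row shares;
# Count_Dist is overwritten by its sum only afterwards.
subtotal <- with_dist %>%
  summarise(Group = "All",
            Total = sum(Count_Dist * Total),
            Count_Dist = sum(Count_Dist))

# bind_rows() aligns columns by name, so the different column order
# in `subtotal` does not matter.
result <- bind_rows(with_dist, subtotal) %>%
  rename(Weighted_Avg_Total = Total)

result
```

This still depends on the order of the arguments to summarise(), so it is a workaround rather than a way to get data.table's evaluation semantics in general.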
Thanks again for your help!