Replace values based on the values of two other columns

Hadsga · Tech Community · 1 year ago

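I have these 3 columns. "Basiswert" contains some NA values. Where "Basiswert" is NA, I want to replace it with the Basiswert of a row that has the same number in column "D" or "E". For example: in row 2, Basiswert is NA. That value has to be replaced with USDCAD, taken from row 3, which has the same value in column E (12442).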

data <- structure(list(D = c("12449", "12448", "12447", "12446", "12442", 
"12441", "12440", "12439", "12438", "12437"), E = c("0", "12442", 
"12442", "12430", "0", "12430", "12436", "12436", "12430", "12430"
), Basiswert = c("EURJPY", NA, "USDCAD", "EURAUD", "USDCAD", 
"EURAUD", NA, "GBPJPY", NA, "EURAUD")), class = "data.frame", row.names = c(NA, 
-10L))

Edit:

library(dplyr)

data %>% mutate(Basiswert = ifelse(is.na(Basiswert) == FALSE & 
D == E, Basiswert, NA))

does not work.

2 replies | last activity 1 year ago

Hann Shaw 1 year ago

The question is not entirely clear to me. Check whether this works.

data %>%
  group_by(E) %>%
  mutate(Basiswert2 = ifelse(is.na(Basiswert), max(Basiswert, na.rm = TRUE), Basiswert))

hello_friend 1 year ago

Base R solution (in the sample data only the values of E repeat, so we can ignore D):

# Split-apply-combine by column E values: 
# res_df => data.frame
res_df <- data.frame(
  do.call(
    rbind,
    # For each E value: 
    lapply(
      with(df, split(df, E)),
      function(x){
        # If the first value in the vector is na:
        if(is.na(x$Basiswert[1])){
          # Resolve the first non na value: ir => vector length 1
          ir <- x$Basiswert[min(which(!(is.na(x$Basiswert))))]
          # Fill the first na value with first non-na value: 
          x$Basiswert[1] <- ir
        # Otherwise: 
        }else{
          # Do nothing: 
          invisible()
        } 
        # Fill the values down: x$Basiswert => vector
        x$Basiswert <- na.omit(x$Basiswert)[cumsum(!(is.na(x$Basiswert)))]
        # Return the data.frame: data.frame => env
        x
      }
    )
  ),
  row.names = NULL
)

Tidyverse solution, in case D also needs to be considered:

library(dplyr)
library(tidyr)
df %>%
  group_by(E) %>%
  fill(Basiswert, .direction = "downup") %>% 
  group_by(D) %>% 
  fill(Basiswert, .direction = "downup")

Input data:

df <- structure(
  list(
    D = c("12449", "12448", "12447", "12446", "12442", 
    "12441", "12440", "12439", "12438", "12437"), 
    E = c("0", "12442", 
    "12442", "12430", "0", "12430", "12436", "12436", "12430", "12430"
    ), 
    Basiswert = c("EURJPY", NA, "USDCAD", "EURAUD", "USDCAD", "EURAUD", NA, "GBPJPY", NA, "EURAUD")
  ), 
  class = "data.frame", 
  row.names = c(NA, -10L)
)

r2evans 1 year ago

Another base R solution:

data$Basiswert2 <- ave(data$Basiswert, data$E, FUN = function(z) replace(z, is.na(z), max(z, na.rm = TRUE)))
data
#        D     E Basiswert Basiswert2
# 1  12449     0    EURJPY     EURJPY
# 2  12448 12442      <NA>     USDCAD
# 3  12447 12442    USDCAD     USDCAD
# 4  12446 12430    EURAUD     EURAUD
# 5  12442     0    USDCAD     USDCAD
# 6  12441 12430    EURAUD     EURAUD
# 7  12440 12436      <NA>     GBPJPY
# 8  12439 12436    GBPJPY     GBPJPY
# 9  12438 12430      <NA>     EURAUD
# 10 12437 12430    EURAUD     EURAUD

Since you said you want to group within both D and E, I suggest this can be a chain:

data |>
  transform(Basiswert = ave(Basiswert, D, FUN = function(z) ifelse(is.na(z), na.omit(z)[1], z))) |>
  transform(Basiswert = ave(Basiswert, E, FUN = function(z) ifelse(is.na(z), na.omit(z)[1], z)))
#        D     E Basiswert
# 1  12449     0    EURJPY
# 2  12448 12442    USDCAD
# 3  12447 12442    USDCAD
# 4  12446 12430    EURAUD
# 5  12442     0    USDCAD
# 6  12441 12430    EURAUD
# 7  12440 12436    GBPJPY
# 8  12439 12436    GBPJPY
# 9  12438 12430    EURAUD
# 10 12437 12430    EURAUD
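
If more grouping columns are involved, the chain above could be wrapped in a small helper. This is only a sketch under my own assumptions: fill_na_by is a hypothetical name (not from any package), and it treats each grouping column independently, exactly as the two transform() calls do.

```r
# Sample data from the question
data <- data.frame(
  D = c("12449", "12448", "12447", "12446", "12442",
        "12441", "12440", "12439", "12438", "12437"),
  E = c("0", "12442", "12442", "12430", "0",
        "12430", "12436", "12436", "12430", "12430"),
  Basiswert = c("EURJPY", NA, "USDCAD", "EURAUD", "USDCAD",
                "EURAUD", NA, "GBPJPY", NA, "EURAUD"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each grouping column in turn, replace the NAs
# in `col` with the first non-NA value found in the same group.
fill_na_by <- function(data, col, groups) {
  for (g in groups) {
    data[[col]] <- ave(
      data[[col]], data[[g]],
      FUN = function(z) replace(z, is.na(z), na.omit(z)[1])
    )
  }
  data
}

filled <- fill_na_by(data, "Basiswert", c("D", "E"))
filled$Basiswert
```

A group that contains only NA values is left as NA, so the later grouping columns still get a chance to fill it.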

Both ifelse(is.na(z), .., z) and replace(z, is.na(z), ..) are effectively the same here. The only reason I would avoid ifelse is if your data is not int/num/chr. For instance, if you use this logic on dates or timestamps, then ifelse should not be used.
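
To illustrate the point about dates, here is a minimal sketch with a made-up Date vector (d and fallback are hypothetical, not part of the question's data). ifelse drops the Date class and returns plain numbers, whereas replace keeps it.

```r
# Hypothetical date column with one missing value
d <- as.Date(c("2024-01-01", NA, "2024-01-03"))
fallback <- as.Date("2024-01-02")

via_ifelse  <- ifelse(is.na(d), fallback, d)   # class attribute is lost
via_replace <- replace(d, is.na(d), fallback)  # still a Date vector

class(via_ifelse)   # "numeric"
class(via_replace)  # "Date"
```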

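Using max(..) to determine the replacement value is a bit of a hack for strings; it emits a warning in some cases (it did for me here), whereas na.omit(z)[1] will always return the first non-NA value, or NA if there is none.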
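
A small sketch of the difference, using made-up vectors rather than the question's data (first_non_na is a hypothetical helper name). For strings, max picks the alphabetically last value, while na.omit(z)[1] picks the first non-NA value in row order. The two only agree when a group has a single distinct non-NA value, as in the sample data.

```r
# Hypothetical helper wrapping the na.omit(z)[1] idiom
first_non_na <- function(z) na.omit(z)[1]

# A made-up group with two different non-NA values
z <- c(NA, "EURAUD", "USDCAD")
max(z, na.rm = TRUE)  # "USDCAD": alphabetically last
first_non_na(z)       # "EURAUD": first non-NA in row order

# A made-up group with no usable value at all
all_na <- c(NA_character_, NA_character_)
first_non_na(all_na)  # NA, and no warning
```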

TarJae 1 year ago

Same logic as @Hann Shaw's answer; we can also use replace:

library(dplyr)

df %>%
  mutate(Basiswert2 = replace(Basiswert, is.na(Basiswert), max(Basiswert, na.rm = TRUE)), .by=E)

     D     E Basiswert Basiswert2
1  12449     0    EURJPY     EURJPY
2  12448 12442      <NA>     USDCAD
3  12447 12442    USDCAD     USDCAD
4  12446 12430    EURAUD     EURAUD
5  12442     0    USDCAD     USDCAD
6  12441 12430    EURAUD     EURAUD
7  12440 12436      <NA>     GBPJPY
8  12439 12436    GBPJPY     GBPJPY
9  12438 12430      <NA>     EURAUD
10 12437 12430    EURAUD     EURAUD