代码之家  ›  专栏  ›  技术社区  ›  the_darkside

Dplyr错误地连接多个列

  •  0
  • the_darkside  · 技术社区  · 6 年前

    我试图在两个数据帧上执行多个匹配列的左联接-所有列都属于数值类:

    首先是 test_trips :

    > test_trips
      start_station_longitude start_station_latitude end_station_longitude end_station_latitude
    1               -122.4264               37.76142             -122.4312             37.78352
    2               -122.3942               37.79539             -122.3942             37.79539
    3               -121.8948               37.34876             -121.8771             37.32600
    4               -122.4212               37.77379             -122.3997             37.78095
    5               -122.4262               37.77643             -122.4035             37.78898
    6               -122.2825               37.80369             -122.2948             37.80532
    

    第二个是 test_distance :

    > test_distance
      start_station_longitude start_station_latitude end_station_longitude end_station_latitude distance
    1               -122.4264               37.76142             -122.4312             37.78352     3162
    2               -122.3942               37.79539             -122.3942             37.79539        0
    3               -121.8948               37.34876             -121.8771             37.32600     3591
    4               -122.4212               37.77379             -122.3997             37.78095     2843
    5               -122.4262               37.77643             -122.4035             37.78898     3088
    6               -122.2825               37.80369             -122.2948             37.80532     1198
    

    请注意,这两个表的4列中的每一列中的坐标值是如何完全相同的。为什么这个左键不起作用?

    output <- dplyr::left_join(test_trips, test_distance, by=c( "start_station_longitude", "start_station_latitude",  "end_station_longitude", "end_station_latitude"))
    

    输出产生 4 NA值 ,我不明白-出什么事了?我猜这和 测试距离 ,但我不确定如何将其修改为 class() 通知我这是普通数据帧

    > output
      start_station_longitude start_station_latitude end_station_longitude end_station_latitude distance
    1               -122.4264               37.76142             -122.4312             37.78352       NA
    2               -122.3942               37.79539             -122.3942             37.79539        0
    3               -121.8948               37.34876             -121.8771             37.32600       NA
    4               -122.4212               37.77379             -122.3997             37.78095       NA
    5               -122.4262               37.77643             -122.4035             37.78898       NA
    6               -122.2825               37.80369             -122.2948             37.80532     1198
    

    以下是示例数据:

    > dput(test_trips)
    structure(list(start_station_longitude = c(-122.4264353, -122.394203, 
    -121.894797831774, -122.42123901844, -122.426244020462, -122.282497
    ), start_station_latitude = c(37.7614205, 37.795392, 37.3487586867448, 
    37.7737932060887, 37.7764348192047, 37.8036865), end_station_longitude = c(-122.431157827377, 
    -122.394203, -121.87712, -122.399749159813, -122.403452, -122.2948365
    ), end_station_latitude = c(37.783520835261, 37.795392, 37.3259984, 
    37.7809545996075, 37.788975, 37.8053183)), row.names = c(NA, 
    6L), class = "data.frame")
    
    dput(test_distance)
    structure(list(start_station_longitude = c(-122.4264353, -122.394203, 
    -121.894797831774, -122.42123901844, -122.426244020462, -122.282497
    ), start_station_latitude = c(37.7614205, 37.795392, 37.3487586867448, 
    37.7737932060887, 37.7764348192047, 37.8036865), end_station_longitude = c(-122.431157827377, 
    -122.394203, -121.87712, -122.399749159813, -122.403452, -122.2948365
    ), end_station_latitude = c(37.783520835261, 37.795392, 37.3259984, 
    37.7809545996075, 37.788975, 37.8053183), distance = c(3162, 
    0, 3591, 2843, 3088, 1198)), class = "data.frame", row.names = c(NA, 
    6L))
    
    
    > sessionInfo()
    R version 3.5.2 (2018-12-20)
    Platform: x86_64-apple-darwin15.6.0 (64-bit)
    Running under: macOS Mojave 10.14.3
    
    Matrix products: default
    BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
    LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
    
    locale:
    [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
    
    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base     
    
    other attached packages:
    [1] dplyr_0.7.8 readr_1.3.1
    
    loaded via a namespace (and not attached):
     [1] Rcpp_1.0.0       crayon_1.3.4     assertthat_0.2.0 R6_2.3.0         jsonlite_1.6    
     [6] magrittr_1.5     pillar_1.3.1     rlang_0.3.1      rstudioapi_0.9.0 bindrcpp_0.2.2  
    [11] tools_3.5.2      glue_1.3.0       purrr_0.3.0      hms_0.4.2        yaml_2.2.0      
    [16] compiler_3.5.2   pkgconfig_2.0.2  tidyselect_0.2.5 bindr_0.1.1      tibble_2.0.1  
    
    0 回复  |  直到 6 年前