
A mix of histogram and bar chart in ggplot2 or plotly, similar to what is possible with hist()

  •  1
  • G. Sozu  · 6 years ago

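    I created a mix of a histogram and a bar chart with the hist() function. The plot and the code are below.

    Now I would like to do something similar with ggplot2 or plotly, because I want to have such a plot in a Shiny app as an interactive plot. After several hours I have not found a solution.

    On the x-axis of my plot I have the temperature, and on the y-axis the total number of people living within each temperature range. Above each bin I also show the actual number of people in that bin. Because some people may be listed several times in the same bin, I also compute the total number of unique people.

    This is how it looks with hist():

    As always, any help is much appreciated.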

    # create df
    mydf <- data.frame(
      City.as.ID=c("Hønefoss : Norwegen", "Hønefoss : Norwegen", "Hønefoss : Norwegen", "Hønefoss : Norwegen", "Hønefoss : Norwegen", "Hønefoss : Norwegen",
                   "Jessheim : Norwegen","Jessheim : Norwegen", "Jessheim : Norwegen", "Jessheim : Norwegen", "Jessheim : Norwegen", "Jessheim : Norwegen",
                   "Hanko : Finnland","Hanko : Finnland","Hanko : Finnland","Hanko : Finnland","Hanko : Finnland", "Hanko : Finnland", 
                   "Espoo : Finnland","Espoo : Finnland","Espoo : Finnland","Espoo : Finnland","Espoo : Finnland","Espoo : Finnland"),
      peoplefreq=c(1,1,1,1,1,1,
                   3,3,3,3,3,3,
                   18,18,18,18,18,18,
                   2,2,2,2,2,2),
    
      temperature=c(-4.93, -3.55, 0.82, 3.7, 10.18,13.41,
                    -1.92, -2.6, 2.19, 4.04, 10.75, 14.18,
                    -2.39, -2.54, 0.78, 2.39, 9.22, 13.41,
                    -2.86, -3.51, 0.12, 2.06, 9.16, 13.35),
      row_id=c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24)
    )
    mydf
    
    # sorting the temperature column
    mydf <- mydf[order(mydf$temperature),]
    mydf
    
    # from here all the work for plot
    mydata <- mydf
    mx <- mydata$temperature
    my <- mydata$peoplefreq
    mc <- mydata$City.as.ID
    
    # get the data from hist()
    h <- hist(mydata$temperature, plot = FALSE)
    
    # get the breakpoints (start and end of each bin)
    breaks <- data.frame(
      "start"=h$breaks[-length(h$breaks)], 
      "end"=h$breaks[-1]
    )
    breaks
    
    # sum up the y values within the x bins
    sums_of_y_within_x_bins <- apply(breaks, MARGIN=1, FUN=function(x) { sum(my[ mx >= x[1] & mx < x[2] ]) })
    sums_of_y_within_x_bins
    
    # sums instead of frequency
    h$counts <- sums_of_y_within_x_bins
    
    # sum up the unique values of y within the x bins
    # between -5 and 0 degrees the total is 48 people, but some of them are listed several times;
    # in reality there are only 24 unique people
    uniqvalues_of_y <- apply(breaks, MARGIN=1, FUN=function(x) {
      newdata <- unique(subset(mydata, select = c(City.as.ID, peoplefreq)))
      sum(newdata$peoplefreq[is.element(newdata$City.as.ID, as.vector(unique(mc[ mx >= x[1] & mx < x[2] ])))])
    })
    uniqvalues_of_y
    
    uniqvalues_of_y <- as.character(uniqvalues_of_y)
    
    # the final plot as a mix of histogram and bar chart
    plot(h, labels = uniqvalues_of_y , ylab="Total sum of y", col="gray")
    
    # some try
    library(ggplot2)
    
    # this counts how many rows fall within each x bin, not the sum of peoplefreq
    ggplot(mydata, aes(x=mx, fill=my)) + 
      geom_histogram(breaks=c(-5,0,5,10,15), color="black")
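
    On the attempt above: by default geom_histogram() counts rows per bin. One possible way to get sums instead, sketched here under the assumption that the `weight` aesthetic of stat_bin fits the use case, is to weight each row by peoplefreq. `demo_df` is a hypothetical reduced stand-in for mydf so that the snippet runs on its own; `p_weighted` and `bin_sums` are illustrative names.

```r
library(ggplot2)

# Hypothetical reduced stand-in for mydf: two cities, three readings each
# (values copied from the question's data).
demo_df <- data.frame(
  City.as.ID  = rep(c("Hanko : Finnland", "Espoo : Finnland"), each = 3),
  peoplefreq  = rep(c(18, 2), each = 3),
  temperature = c(-2.39, 0.78, 13.41, -2.86, 0.12, 13.35)
)

# weight = peoplefreq makes stat_bin add up peoplefreq within each bin
# instead of counting rows.
p_weighted <- ggplot(demo_df, aes(x = temperature, weight = peoplefreq)) +
  geom_histogram(breaks = c(-5, 0, 5, 10, 15), fill = "grey80", color = "black") +
  ylab("Total sum of y")

# Bar heights as computed by ggplot2, one value per bin.
bin_sums <- ggplot_build(p_weighted)$data[[1]]$count
```

    With the full data, passing mydf instead of demo_df should give the same bar heights as the hist() plot. The unique-people labels would still have to be computed separately.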
    
    1 answer  |  as of 6 years ago
  •  0
  •   Jack Brookes    6 years ago

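    I don't think the chart is very clear, and maybe you should take a different approach, but if you want to do it this way, you can bin the data manually: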

    library(dplyr)
    library(ggplot2)
    
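    mydf %>%
      # assign each reading to a 5-degree bin: (-5,0], (0,5], (5,10], (10,15]
      mutate(temperature_group = cut(temperature, seq(-5, 15, by = 5))) %>%
      group_by(temperature_group, City.as.ID) %>%
      # per bin and city: total over all rows, and the city's head count once
      summarise(sum_peoplefreq = sum(peoplefreq), unique_people = first(peoplefreq)) %>%
      # still grouped by temperature_group, so this adds up the cities per bin
      summarise_at(vars(sum_peoplefreq, unique_people), "sum") %>% 
      ggplot(aes(x = temperature_group, y = sum_peoplefreq, label = unique_people)) +
      geom_col(fill = "grey80", color = "black") + 
      geom_text(nudge_y = 2) +   # unique people printed just above each bar
      theme_classic()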
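
    Since the question asks for an interactive plot, one option is to hand the finished ggplot object to plotly::ggplotly(). The sketch below assumes the plotly package is installed. `binned` holds the per-bin totals worked out by hand from the question's data (an assumption standing in for the output of the pipeline above), and `p` and `p_interactive` are illustrative names.

```r
library(ggplot2)
library(plotly)

# Per-bin totals for the question's data, entered by hand so the example
# is self-contained (assumed to match what the dplyr pipeline returns).
bin_levels <- c("(-5,0]", "(0,5]", "(5,10]", "(10,15]")
binned <- data.frame(
  temperature_group = factor(bin_levels, levels = bin_levels),
  sum_peoplefreq    = c(48, 48, 20, 28),
  unique_people     = c(24, 24, 20, 24)
)

# Same static plot as before: bar height = summed peoplefreq,
# label = unique people per bin.
p <- ggplot(binned, aes(x = temperature_group, y = sum_peoplefreq, label = unique_people)) +
  geom_col(fill = "grey80", color = "black") +
  geom_text(nudge_y = 2) +
  theme_classic()

# Convert the ggplot object into an interactive plotly htmlwidget.
p_interactive <- ggplotly(p)
```

    In a Shiny app, such a widget would typically be rendered with plotly's renderPlotly() on the server side and plotlyOutput() in the UI.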
    
