I think you need to sort the student_id vector before the loop, so that your comparison student_id[i] != student_id[i-1] is valid. Otherwise, it will not capture consecutive flips by the same student.
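To see why the sort matters, here is a minimal sketch that measures how often neighbouring entries belong to the same student, before and after sorting. The names unsorted_id, sorted_id and same_as_prev are made up for this illustration and are not part of your code. With 100 ids drawn uniformly, unsorted neighbours should match only about 1% of the time, so the two "same student" branches of the loop would almost never run.

```r
set.seed(123)
ids <- 1:100
unsorted_id <- sample(ids, 100000, replace = TRUE)
sorted_id <- sort(unsorted_id)

# hypothetical helper: share of positions whose id equals the previous id
same_as_prev <- function(x) mean(x[-1] == x[-length(x)])

same_as_prev(unsorted_id)  # roughly 1 / 100
same_as_prev(sorted_id)    # close to 1
```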
The result seems to make sense: HH and TT together account for 60.4% of all counted pairs of consecutive flips.
library(tidyverse)

set.seed(123)
ids <- 1:100
# only the following line was changed, all other lines are the same as your code
student_id <- sort(sample(ids, 100000, replace = TRUE))
coin_result <- character(1000)
coin_result[1] <- sample(c("H", "T"), 1)

for (i in 2:length(coin_result)) {
  if (student_id[i] != student_id[i-1]) {
    # new student: fair flip
    coin_result[i] <- sample(c("H", "T"), 1)
  } else if (coin_result[i-1] == "H") {
    # same student, previous flip was H: H again with probability 0.6
    coin_result[i] <- sample(c("H", "T"), 1, prob = c(0.6, 0.4))
  } else {
    # same student, previous flip was T: T again with probability 0.6
    coin_result[i] <- sample(c("H", "T"), 1, prob = c(0.4, 0.6))
  }
}

# tidy up
# note: data.frame() recycles coin_result (length 1000) to match student_id (length 100000)
my_data <- data.frame(student_id, coin_result)
my_data <- my_data[order(my_data$student_id), ]

# pair each flip with the next flip of the same student, then count the pairs
my_data %>%
  group_by(student_id) %>%
  summarize(Sequence = str_c(coin_result, lead(coin_result)), .groups = 'drop') %>%
  filter(!is.na(Sequence)) %>%
  count(Sequence)
# A tibble: 4 × 2
  Sequence     n
  <chr>    <int>
1 HH       29763
2 HT       19782
3 TH       19775
4 TT       30580
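
The 60.4% figure can be checked directly from the counts in the output. This is only a small sketch: pair_counts and same_share are illustrative names, and the numbers are copied by hand from the table rather than taken from my_data.

```r
# counts copied from the printed output
pair_counts <- c(HH = 29763, HT = 19782, TH = 19775, TT = 30580)

# share of consecutive pairs in which the flip repeats the previous one
same_share <- sum(pair_counts[c("HH", "TT")]) / sum(pair_counts)
round(100 * same_share, 1)  # 60.4
```

That is close to the 0.6 probability of repeating the previous result that the loop uses within a student.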