
Using fitdist from fitdistrplus with a beta-binomial distribution of varying sizes

swihart · Tech Community · 6 years ago

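A related question is "using fitdist from fitdistplus with binomial distribution". fitdistrplus::fitdist is a function that takes univariate data and starting guesses for the parameters. To fit binomial and beta-binomial data, the size is needed as well, even though the data are univariate. If the size is the same for every datum, the linked question has the fix needed. However, if the sizes vary and a vector needs to be passed, I am unsure how to get a properly functioning call.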

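opt_one in the code below is the solution offered in the linked post; that is, the cluster size is known and fixed. For opt_one, I incorrectly specify fix.arg=list(size=125) (in essence treating every element of N as 125). This is close enough, and the code runs. However, the cluster sizes in N actually vary. I try to specify this in opt_two and get an error. Any thoughts would be appreciated.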

library(fitdistrplus)
library(VGAM)
set.seed(123)

N <- 100 + rbinom(1000,25,0.9)

Y <- rbetabinom.ab(rep(1,length(N)), N, 1, 2)

head(cbind(Y,N))

opt_one <-
  fitdist(data=Y,
          distr=pbetabinom.ab,
          fix.arg=list(size=125),
          start=list(shape1=1,shape2=1)
  )
opt_one

This gives:

> head(cbind(Y,N))
      Y   N
[1,] 67 123
[2,] 14 121
[3,] 15 123
[4,] 42 121
[5,] 86 120
[6,] 28 125
> opt_one <-
+   fitdist(data=Y,
+           distr=pbetabinom.ab,
+           fix.arg=list(size=125),
+           start=list(shape1=1,shape2=1)
+   )
Warning messages:
1: In fitdist(data = Y, distr = pbetabinom.ab, fix.arg = list(size = 125),  :
  The dbetabinom.ab function should return a zero-length vector when input has length zero
2: In fitdist(data = Y, distr = pbetabinom.ab, fix.arg = list(size = 125),  :
  The pbetabinom.ab function should return a zero-length vector when input has length zero
> opt_one
Fitting of the distribution ' betabinom.ab ' by maximum likelihood 
Parameters:
        estimate Std. Error
shape1 0.9694054 0.04132912
shape2 2.1337839 0.10108720
Fixed parameters:
     value
size   125

Not bad: the estimates are close to the shape1 = 1 and shape2 = 2 specified when we created Y. Here is option two:

opt_two <-
  fitdist(data=Y,
          distr=pbetabinom.ab,
          fix.arg=list(size=N),
          start=list(shape1=1,shape2=1)
  )

This produces an error:

> opt_two <-
+   fitdist(data=Y,
+           distr=pbetabinom.ab,
+           fix.arg=list(size=N),
+           start=list(shape1=1,shape2=1)
+   )
Error in checkparamlist(arg_startfix$start.arg, arg_startfix$fix.arg,  : 
  'fix.arg' must specify names which are arguments to 'distr'.

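An attempt after the initial posting (thanks, Dean Follmann)

I know I can code my own likelihood (opt_three), but I would really like to use the benefits of a fitdist object, i.e. get opt_two to work.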

library(Rfast)
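# Negative log-likelihood of the beta-binomial, using the per-observation sizes N.
# The binomial coefficient is omitted because it does not depend on A or B.
loglik <- function(parm) {
  A <- parm[1]; B <- parm[2]
  -sum(Lgamma(A+B) - Lgamma(A) - Lgamma(B) + Lgamma(Y+A) + Lgamma(N-Y+B) - Lgamma(N+A+B))
}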

opt_three <- optim(c(1,1),loglik, method = "L-BFGS-B", lower=c(0,0))
opt_three

This gives:

> opt_three
$par
[1] 0.9525161 2.0262342

$value
[1] 61805.54

$counts
function gradient 
       7        7 

$convergence
[1] 0

$message
[1] "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH"
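
The hand-coded likelihood above can also be written in base R, without Rfast or VGAM. The following is only a minimal sketch under stated assumptions: the names dbb_log and negloglik are hypothetical, the data are re-simulated with rbeta and rbinom rather than rbetabinom.ab (so the numbers will not match the output above), and the parameters are optimized on the log scale to keep them positive instead of using L-BFGS-B bounds.

```r
set.seed(123)

# Varying cluster sizes, as in the question
N <- 100 + rbinom(1000, 25, 0.9)

# Beta-binomial draws built from base R: p ~ Beta(1, 2), then Y | p ~ Binomial(N, p)
p <- rbeta(length(N), 1, 2)
Y <- rbinom(length(N), size = N, prob = p)

# Log pmf of the beta-binomial, vectorized over y and n (hypothetical helper)
dbb_log <- function(y, n, a, b) {
  lchoose(n, y) + lbeta(y + a, n - y + b) - lbeta(a, b)
}

# Negative log-likelihood; theta holds log(a) and log(b) so a, b stay positive
negloglik <- function(theta) {
  a <- exp(theta[1])
  b <- exp(theta[2])
  -sum(dbb_log(Y, N, a, b))
}

fit <- optim(c(0, 0), negloglik, method = "BFGS")
est <- exp(fit$par)   # back-transform to (a, b)
est
```

Including lchoose changes the value of the objective but not the location of its minimum, so the estimates should be comparable to those from a likelihood that drops the constant.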

Also related is Ben Bolker's answer using mle2. The fitdist solution remains at large.

1 answer | as of 6 years ago

swihart · 6 years ago

Look at example (4) on the ?fitdistrplus::fitdist() help page:

# (4) defining your own distribution functions, here for the Gumbel distribution
# for other distributions, see the CRAN task view 
# dedicated to probability distributions
#

dgumbel <- function(x, a, b) 1/b*exp((a-x)/b)*exp(-exp((a-x)/b))
pgumbel <- function(q, a, b) exp(-exp((a-q)/b))
qgumbel <- function(p, a, b) a-b*log(-log(p))

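# serving comes from the groundbeef data set shipped with fitdistrplus
data(groundbeef)
serving <- groundbeef$serving
fitgumbel <- fitdist(serving, "gumbel", start=list(a=10, b=10))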
summary(fitgumbel)
plot(fitgumbel)
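
fitdist finds a custom distribution by prepending d, p and q to the name passed as distr, so the three functions need to agree with one another. A quick base-R sanity check along those lines might look like the sketch below; the names p_grid, roundtrip, area and target are hypothetical, and the parameter values a = 10, b = 10 are just the starting values from the example.

```r
dgumbel <- function(x, a, b) 1/b*exp((a-x)/b)*exp(-exp((a-x)/b))
pgumbel <- function(q, a, b) exp(-exp((a-q)/b))
qgumbel <- function(p, a, b) a-b*log(-log(p))

# The quantile function should invert the distribution function
p_grid <- c(0.1, 0.5, 0.9)
roundtrip <- pgumbel(qgumbel(p_grid, a = 10, b = 10), a = 10, b = 10)

# The density should integrate to the difference of the distribution function
area <- integrate(dgumbel, lower = -50, upper = 20, a = 10, b = 10)$value
target <- pgumbel(20, a = 10, b = 10) - pgumbel(-50, a = 10, b = 10)

all.equal(roundtrip, p_grid)
abs(area - target)
```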

Then, feeling inspired and insightful for having actually RTM, make your own [dpq] functions with N specified:

dbbspecifiedsize <- function(x, a, b) dbetabinom.ab(x, size=N, shape1=a, shape2=b)
pbbspecifiedsize <- function(q, a, b) pbetabinom.ab(q, size=N, shape1=a, shape2=b)
qbbspecifiedsize <- function(p, a, b) qbetabinom.ab(p, size=N, shape1=a, shape2=b)

opt_four <-
  fitdist(data=Y,
          distr="bbspecifiedsize",
          start=list(a=1,b=1)
  )
opt_four

This gives:

> opt_four
Fitting of the distribution ' bbspecifiedsize ' by maximum likelihood 
Parameters:
   estimate Std. Error
a 0.9526875 0.04058396
b 2.0261339 0.09576709

This is very close to the values from opt_three, and it is a fitdist object.
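
One caveat worth spelling out: the wrapper functions pick up N from the global environment, and R pairs it with the data element by element. The fit is therefore only meaningful when N has the same length and order as Y. The sketch below illustrates that pairing with base R's dbinom as a stand-in, on the assumption that dbetabinom.ab vectorizes over size in the same way; the variable names are hypothetical.

```r
y    <- c(3, 5, 7)
size <- c(10, 20, 30)

# Vector call: the i-th count is evaluated against the i-th size
paired <- dbinom(y, size = size, prob = 0.5)

# The same three densities computed one at a time
one_by_one <- c(dbinom(3, 10, 0.5), dbinom(5, 20, 0.5), dbinom(7, 30, 0.5))

all.equal(paired, one_by_one)

# A guard of this kind could be run before fitting
stopifnot(length(y) == length(size))
```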