代码之家  ›  专栏  ›  技术社区  ›  Andrew

求解包含非参数密度和分布的积分的最佳方法

  •  1
  • Andrew  · 技术社区  · 7 年前

    假设我想解一个包含两个积分的函数,比如(这是一个例子,实际函数更难看)

    enter image description here

    我做到了:

    # Create the data
    val <- runif(300, min=1, max = 10) #use the uniform distribution
    CDF <- (val - 1)/(10 - 1)
    pdf <- 1 / (10 - 1)
    data <- data.frame(val = val, CDF = CDF, pdf = pdf)
    
    c = 2
    d = 1
    
    # Inner integral
    integrand1 <- function(x) {
      i <- which.min(abs(x - data$val))
      FF <- data$CDF[i]
      ff <- data$pdf[i]
      (1 - FF)^(c/d) * ff
    }
    
    # Vectorize the inner integral
    Integrand1 <- Vectorize(integrand1)
    
    # Outer integral
    integrand2 <- function(x){
      i <- which.min(abs(x - data$val))
      FF <- data$CDF[i]
      ff <- data$pdf[i]
      (quadgk(Integrand1, x, 10) / FF) * c * ff
    }
    
    # Vectorize the outer integral
    Integrand2 <- Vectorize(integrand2)
    
    # Solve
    require(pracma)
    quadgk(Integrand2, 1, 10)
    

    积分非常慢。有更好的方法解决这个问题吗?非常感谢。

    ---------编辑---------

    v 具体如下:

    # Create the original data
    v <- runif(300, min = 1, max = 10)
    require(np)
    
    # Compute the CDF and pdf
    v.CDF.bw <- npudistbw(dat = v, bandwidth.compute = TRUE, ckertype = "gaussian")
    v.pdf.bw <- npudensbw(dat = v, bandwidth.compute = TRUE, ckertype = "gaussian")
    
    # Extend v on a grid (I add this step because the v vector in my data
    # is not very large. In this way I approximate the estimated pdf and CDF
    # on a grid)
    val <- seq(from = min(v), to = max(v), length.out = 1000)
    data <- data.frame(val)
    CDF <- npudist(bws = v.CDF.bw, newdata = data$val, edat = data )
    pdf <- npudens(bws = v.pdf.bw, newdata = data$val, edat = data )
    data$CDF <- CDF$dist
    data$pdf <- pdf$dens
    
    1 回复  |  直到 7 年前
        1
  •  1
  •   Glen_b    7 年前

    你考虑过使用 approxfun ?

    x <- runif(1000)+runif(1000)+2*(runif(1000)^2)
    dx <- density(x)
    fa <- approxfun(dx$x,dx$y)
    curve(fa,0,2)
    fa(0.4) 
    

    您应该能够使用网格化评估来调用它。它可能比你正在做的更快(也更准确)

    (编辑:是的,正如你所说, splinefun