Generate a random binary number with a variable proportion of '1' bits

bit-manipulation random optimization java

finnw · 15 years ago

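I need a function to generate random integers. (Assume the Java long type for now, but this will be extended to BigInteger or BitSet later.)

The tricky part is that there is a parameter P that specifies the (independent) probability of any bit in the result being 1.

If P = 0.5 then we can just use the standard random number generator. Some other values of P are also easy to implement. Here's an incomplete example: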

Random random = new Random();

// ...

long nextLong(float p) {
    if      (p == 0.0f)   return 0L;
    else if (p == 1.0f)   return -1L;
    else if (p == 0.5f)   return random.nextLong();
    else if (p == 0.25f)  return nextLong(0.5f) & nextLong(0.5f);
    else if (p == 0.75f)  return nextLong(0.5f) | nextLong(0.5f);
    else if (p == 0.375f) return nextLong(0.5f) & nextLong(0.75f); // etc
    else {
      // What goes here??
      String message = String.format("P=%f not implemented yet!", p);
      throw new IllegalArgumentException(message);
    }
}

Is there a way to generalise this for any value of P between 0.0 and 1.0?

7 answers | last active 10 years ago

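finnw 15 years ago

First, a little ugly math that you're already using in your code.

Define x and y to be bits whose probabilities of being 1 are X = p(x=1) and Y = p(y=1) respectively. Then we have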

 p( x & y = 1) = X Y
 p( x | y = 1) = 1 - (1-X) (1-Y)
 p( x ^ y = 1) = X (1 - Y) + Y (1 - X)

Now if we let Y = 1/2 we get

P( x & y ) = X/2
P( x | y ) = (X+1)/2

Now set the RHS to the probability p we want, and we have two cases that we can solve for X:

X = 2 p        // if we use &
X = 2 p - 1    // if we use |

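Next we assume we can use this again to obtain X in terms of another variable Z, and then we keep iterating until we've done "enough".

That's a bit unclear, so consider p = 0.375: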

0.375 * 2 = 0.75  < 1.0 so our first operation is &
0.75 * 2 = 1.5 > 1.0 so our second operation is |
0.5 is something we know so we stop.

Thus we can get a variable with p = 0.375 as X1 & (X2 | X3), where X1, X2 and X3 are independent random words with p = 0.5.
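The p = 0.375 example can be checked empirically. The following is only a minimal sketch: the class name ThreeEighths and its two helper methods are made up for illustration, and it simply counts how many bits come out as 1 over many samples of X1 & (X2 | X3).

```java
import java.util.Random;

public class ThreeEighths {
    /** One 64-bit word built as X1 & (X2 | X3), each Xi a fair random word. */
    static long sample(Random rnd) {
        long x1 = rnd.nextLong();
        long x2 = rnd.nextLong();
        long x3 = rnd.nextLong();
        return x1 & (x2 | x3);
    }

    /** Fraction of 1 bits observed over the given number of 64-bit words. */
    static double density(Random rnd, int words) {
        long ones = 0;
        for (int i = 0; i < words; i++) {
            ones += Long.bitCount(sample(rnd));
        }
        return ones / (64.0 * words);
    }

    public static void main(String[] args) {
        // Expected to print a value close to 0.375.
        System.out.println(density(new Random(), 100_000));
    }
}
```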

The problem is that for most values of p this will not terminate. For example:

0.333 *2 = 0.666 < 1.0 so our first operation is &
0.666 *2 = 1.333 > 1.0 so our second operation is |
0.333 *2 = 0.666 < 1.0 so our third operation is &
etc...

Thus p = 0.333 can be generated by

X1 & ( X2 | (X3 & (X4 | ( ... ) ) ) )

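Now I suspect that taking enough terms in the series will give you enough accuracy, and this can be written as a recursive function. However, there might be a better way than that too. I think the order of the operations is related to the binary representation of p; I'm just not sure exactly how, and I don't have time to think about it more deeply.

Anyway, here's some untested C++ code that does this. You should be able to port it to Java easily.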

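#include <cstdint>
#include <random>

// Source of 32 uniformly random bits (each bit is 1 with probability 0.5).
static std::mt19937 rng{ std::random_device{}() };
static uint32_t randbits() { return static_cast<uint32_t>( rng() ); }

// Defined before its caller so that no forward declaration is needed.
uint32_t bitsWithProbabilityHelper( float p, float tol, int cur_depth, int max_depth )
{
   uint32_t X = randbits();
   if( cur_depth >= max_depth ) return X;
   if( p < 0.5f - tol )
   {
     // P(x & y) = X/2, so the remaining operand needs probability 2p
     return X & bitsWithProbabilityHelper( 2*p, tol, cur_depth+1, max_depth );
   }
   if( p > 0.5f + tol )
   {
     // P(x | y) = (X+1)/2, so the remaining operand needs probability 2p-1
     return X | bitsWithProbabilityHelper( 2*p-1, tol, cur_depth+1, max_depth );
   }
   return X; // p is within tol of 0.5
}

uint32_t bitsWithProbability( float p )
{
   return bitsWithProbabilityHelper( p, 0.001f, 0, 10 );
}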
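A Java port of the C++ above might look roughly like this. It is a sketch rather than a tested implementation: the class name RecursiveBits is made up, the Random is passed in explicitly, and the tolerance (0.001) and depth limit (10) are simply carried over from the C++ version.

```java
import java.util.Random;

public class RecursiveBits {
    /** ANDs or ORs one fresh fair word into the result per level, up to maxDepth levels. */
    static long bitsWithProbability(Random rnd, double p, double tol, int depth, int maxDepth) {
        long x = rnd.nextLong();             // each bit is 1 with probability 0.5
        if (depth >= maxDepth) return x;     // stop refining and accept the approximation
        if (p < 0.5 - tol) {
            // P(x & y) = X/2, so the rest of the expression needs probability 2p
            return x & bitsWithProbability(rnd, 2 * p, tol, depth + 1, maxDepth);
        }
        if (p > 0.5 + tol) {
            // P(x | y) = (X+1)/2, so the rest of the expression needs probability 2p - 1
            return x | bitsWithProbability(rnd, 2 * p - 1, tol, depth + 1, maxDepth);
        }
        return x;                            // p is within tol of 0.5
    }

    static long bitsWithProbability(Random rnd, double p) {
        return bitsWithProbability(rnd, p, 0.001, 0, 10);
    }
}
```

For p = 0.375 this evaluates exactly the X1 & (X2 | X3) expression derived earlier; for values such as 1/3 it cuts the infinite series off at the depth limit.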

Ondra Žižka David Lilljegren 15 years ago

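Distribute a proportional number of 1 bits across the number. Sketch in Java (random is the Random field from the question):

long generateNumber( double probability ){
  // Note: this always sets exactly bitCount bits.
  int bitCount = (int)( 64 * probability );
  byte[] data = new byte[64]; // 0-filled

  long indexes = random.nextLong(); // pool of ten 6-bit indexes
  int indexesLeft = 10;

  for( int i = 0; i < bitCount; i++ ){
    int index;
    do {
      // distribute this bit to some position that still holds 0
      if( indexesLeft == 0 ){ indexes = random.nextLong(); indexesLeft = 10; }
      index = (int)( indexes & 63 );
      indexes >>>= 6;
      indexesLeft--;
    } while( data[index] != 0 );
    data[index] = 1;
  }

  // pack the 64 flags into a long (the bytesToLong step)
  long result = 0L;
  for( int i = 0; i < 64; i++ ) if( data[i] != 0 ) result |= 1L << i;
  return result;
}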

I hope you get what I mean. Perhaps the byte[] could be replaced with a long and bit operations to make it faster.

President James K. Polk 15 years ago

Use a random generator that produces a uniform floating-point number r between 0 and 1. If r > p, set the bit to 0; otherwise set it to 1.
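Since the question mentions extending to BitSet, here is a minimal sketch of this per-bit rule for an arbitrary number of bits. The class and method names are made up for illustration, and the comparison is written as r < p (rather than "not r > p") so that p = 0 can never set a bit.

```java
import java.util.BitSet;
import java.util.Random;

public class PerBitSampler {
    /** Returns a BitSet of nBits bits, each independently 1 with probability p. */
    static BitSet randomBits(Random rnd, int nBits, double p) {
        BitSet bits = new BitSet(nBits);
        for (int i = 0; i < nBits; i++) {
            double r = rnd.nextDouble();   // uniform in [0.0, 1.0)
            if (r < p) {
                bits.set(i);               // otherwise the bit stays 0
            }
        }
        return bits;
    }
}
```

This costs one call to nextDouble() per output bit, which is the baseline the other answers try to improve on.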

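Mark Elliot 15 years ago

If you're looking to apply some distribution where any particular bit is 1 with probability P and 0 with probability 1-P, your best bet is simply to generate each bit independently with probability P of being 1 (that sounds like a recursive definition, I know).

Here's a solution; I'll walk through it below: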

import java.util.Random;

public class MyRandomBitGenerator
{

    Random pgen = new Random();

    // assumed p is well conditioned (0 < p < 1)
    public boolean nextBitIsOne(double p){
        return pgen.nextDouble() < p;
    }

    // assumed p is well conditioned (0 < p < 1)
    public long nextLong(double p){
        long nxt = 0;
        for(int i = 0; i < 64; i++){
           if(nextBitIsOne(p)){
               // 1L, not 1: an int shift would wrap around for i >= 32
               nxt |= 1L << i;
           }
        }
        return nxt;
    }

}

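Basically, we first determine how to generate a value of 1 with probability P: pgen.nextDouble() generates a number between 0 and 1 with uniform probability, and by asking whether it is less than p we sample that distribution so that, over a very large number of calls, a fraction p of the results are 1.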

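Community Reversed Engineer 7 years ago

This is another variant of Michael Anderson's answer.

To avoid recursion, we process the bits of P iteratively from right to left instead of recursively from left to right. This would be tricky in the floating-point representation, so we extract the exponent/mantissa fields from the binary representation instead.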

import java.util.Random;

class BitsWithProbabilityHelper {
    public BitsWithProbabilityHelper(float prob, Random rnd) {
        if (Float.isNaN(prob)) throw new IllegalArgumentException();

        this.rnd = rnd;

        if (prob <= 0f) {
            zero = true;
            return;
        }

        // Decode IEEE float
        int probBits = Float.floatToIntBits(prob);
        mantissa = probBits & 0x7FFFFF;
        exponent = probBits >>> 23;

        // Restore the implicit leading 1 (except for denormals)
        if (exponent > 0) mantissa |= 0x800000;
        exponent -= 150;

        // Force mantissa to be odd
        int ntz = Integer.numberOfTrailingZeros(mantissa);
        mantissa >>= ntz;
        exponent += ntz;
    }

    /** Determine how many random words we need from the system RNG to
     *  generate one output word with probability P.
     **/
    public int iterationCount() {
        return - exponent;
    }

    /** Generate a random number with the desired probability */
    public long nextLong() {
        if (zero) return 0L;

        long acc = -1L;
        int shiftReg = mantissa - 1;
        for (int bit = exponent; bit < 0; ++ bit) {
            if ((shiftReg & 1) == 0) {
                acc &= rnd.nextLong();
            } else {
                acc |= rnd.nextLong();
            }
            shiftReg >>= 1;
        }
        return acc;
    }

    /** Value of <code>prob</code>, represented as m * 2**e where m is always odd. */
    private int exponent;  
    private int mantissa;

    /** Random data source */
    private final Random rnd;

    /** Zero flag (special case) */
    private boolean zero;
}

finnw 11 years ago

This is how I solved it in the end.

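1. Generate an integer N between 0 and 16, following the binomial distribution. This gives the number of '1' bits in the 16-bit partial result.
2. Randomly generate an index into a lookup table of 16-bit integers that contain the desired number of '1' bits.
3. Repeat 4 times to get four 16-bit integers.
4. Splice these four 16-bit integers together to get a 64-bit integer.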

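This was partly inspired by Ondra Žižka's answer.

The benefit is that it reduces the number of calls to Random.nextLong() to 8 per 64 bits of output. For comparison, rolling each bit individually would require 64 calls, and the bitwise AND/OR approach uses between 2 and 32 calls depending on the value of P.

Of course, calculating binomial probabilities is just as expensive, so those go in another lookup table.

It's a lot of code, but it pays off in terms of performance.

Update - merged this with the bitwise AND/OR solution. It now uses that method if it guesses it will be more efficient (in terms of calls to Random.next()).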
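The code itself was not posted, so the following is only a guess at what the four steps above might look like. Every identifier (BinomialTableBits, BY_POPCOUNT, START, cdf, nextChunk) is hypothetical, and the later merge with the AND/OR method is not included. The sketch keeps all 65536 16-bit values sorted by bit count in one table, stores the cumulative binomial probabilities for the chosen p in a second table, and makes two calls on the Random object per 16-bit chunk.

```java
import java.util.Random;

public class BinomialTableBits {
    private static final int W = 16;                  // bits per chunk
    // All 16-bit values grouped by popcount; START[n] is the first index with n one-bits.
    private static final char[] BY_POPCOUNT = new char[1 << W];
    private static final int[] START = new int[W + 2];
    static {
        int[] count = new int[W + 1];
        for (int v = 0; v < (1 << W); v++) count[Integer.bitCount(v)]++;
        for (int n = 0; n <= W; n++) START[n + 1] = START[n] + count[n];
        int[] next = START.clone();
        for (int v = 0; v < (1 << W); v++) BY_POPCOUNT[next[Integer.bitCount(v)]++] = (char) v;
    }

    private final double[] cdf = new double[W + 1];   // cumulative Binomial(16, p)
    private final Random rnd;

    public BinomialTableBits(double p, Random rnd) {
        this.rnd = rnd;
        double sum = 0.0;
        for (int n = 0; n <= W; n++) {
            sum += choose(W, n) * Math.pow(p, n) * Math.pow(1 - p, W - n);
            cdf[n] = sum;
        }
    }

    /** Binomial coefficient C(n, k), computed in floating point. */
    private static double choose(int n, int k) {
        double c = 1.0;
        for (int i = 1; i <= k; i++) c = c * (n - k + i) / i;
        return c;
    }

    /** Steps 1 and 2: draw the bit count, then a uniformly random pattern with that count. */
    private int nextChunk() {
        double u = rnd.nextDouble();
        int n = 0;
        while (n < W && u >= cdf[n]) n++;      // inverse-CDF lookup of the bit count
        int size = START[n + 1] - START[n];    // number of patterns with n one-bits
        return BY_POPCOUNT[START[n] + rnd.nextInt(size)];
    }

    /** Steps 3 and 4: splice four 16-bit chunks into one 64-bit result. */
    public long nextLong() {
        long result = 0L;
        for (int i = 0; i < 4; i++) result = (result << W) | nextChunk();
        return result;
    }
}
```

Choosing the count from the binomial distribution and then a pattern uniformly among those with that count gives the same distribution as drawing the 16 bits independently, which is why the bits stay independent here even though they are produced in bulk.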

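Kartik Kale 10 years ago

Suppose the size of the bit array is L. If L = 1, the chance that the first bit is 1 is P, and the chance that it is 0 is 1-P. For L = 2, the probability of getting 00 is (1-P)^2, of getting 01 or 10 is P(1-P) each, and of getting 11 is P^2. Extending this logic, we can first determine the first bit by comparing a random number with P, then rescale the random number so that it again lies anywhere between 0 and 1. Sample JavaScript code: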

function getRandomBitArray(maxBits, probabilityOf1) {
    var randomSeed = Math.random();
    var bitArray = [];
    for (var currentBit = 0; currentBit < maxBits; currentBit++) {
        if (randomSeed < probabilityOf1) {
            //fill 1 at current bit (this branch is taken with probability probabilityOf1)
            bitArray.push(1);
            //scale the sample space of the random no from
            //[0,probabilityOf1) back to [0,1)
            randomSeed = randomSeed / probabilityOf1;
        }
        else {
            //fill 0 at current bit
            bitArray.push(0);
            //scale the sample space from [probabilityOf1,1) back to [0,1)
            randomSeed = (randomSeed - probabilityOf1) / (1 - probabilityOf1);
        }
    }
    return bitArray;
}

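EDIT: This code does generate completely random bits. I will try to explain the algorithm better.

Each bit string has a certain probability of occurring. Suppose a string has probability of occurrence p; we want to choose that string if our random number falls in some interval of length p. The starting point of the interval must be fixed, but its value does not make much difference. Suppose we have chosen up to k bits correctly. Then, for the next bit, we divide the interval corresponding to this k-length bit string into two parts with sizes in the ratio P : 1-P (here P is the probability of getting a 1). We say that the next bit is 1 if the random number is in the first part, and 0 if it is in the second part. This ensures that the probabilities of strings of length k+1 also remain correct.

Java code: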

import java.util.ArrayList;

public ArrayList<Boolean> getRandomBitArray(int maxBits, double probabilityOf1) {
    double randomSeed = Math.random();
    ArrayList<Boolean> bitArray = new ArrayList<Boolean>();
    for(int currentBit=0;currentBit<maxBits;currentBit++){
        if(randomSeed<probabilityOf1){
            //fill 1 at current bit (this branch is taken with probability probabilityOf1)
            bitArray.add(true);
            //scale the sample space of the random no from
            //[0,probabilityOf1) back to [0,1)
            randomSeed=randomSeed/probabilityOf1;
        }
        else{
            //fill 0 at current bit
            bitArray.add(false);
            //scale the sample space from [probabilityOf1,1) back to [0,1)
            randomSeed = (randomSeed-probabilityOf1)/(1-probabilityOf1);
        }
    }
    return  bitArray;
}