What would be a better way to design built-in stats for a random generator?

maintainability dynamically-generated random performance javascript

Romain Valeri · Tech community · 6 years ago

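Context: a random sentence generator

1) The function generateSentence() generates random sentences and returns them as strings (works fine).

2) The function calculateStats() outputs the number of unique strings the above function can theoretically generate (it also works fine in this mockup, so please be sure to read the disclaimer; I don't want to waste your time).

3) The function generateStructure() and the word lists in Dictionnary.lists keep growing as time passes.

Quick mockup of the main generator function: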

function generateSentence() {
  var words = [];
  var structure = generateStructure();

  structure.forEach(function(element) {
    words.push(Dictionnary.getElement(element));
  });

  var fullText = words.join(" ");
  fullText = fullText.substring(0, 1).toUpperCase() + fullText.substring(1);
  fullText += ".";
  return fullText;
}

var Dictionnary = {
  getElement: function(listCode) {
    return randomPick(Dictionnary.lists[listCode]);
  },
  lists: {
    _location: ["here,", "at my cousin's,", "in Antarctica,"],
    _subject: ["some guy", "the teacher", "Godzilla"],
    _vTransitive: ["is eating", "is scolding", "is seeing"],
    _vIntransitive: ["is working", "is sitting", "is yawning"],
    _adverb: ["slowly", "very carefully", "with a passion"],
    _object: ["this chair", "an egg", "the statue of Liberty"],
  }
}

// returns an array of strings symbolizing types of sentence elements
// example : ["_location", "_subject", "_vIntransitive"]
function generateStructure() {
  var str = [];

  if (dice(6) > 5) {// the structure can begin with a location or not
    str.push("_location");
  }

  str.push("_subject");// the subject is mandatory

  // verb can be of either types
  var verbType = randomPick(["_vTransitive", "_vIntransitive"]);
  str.push(verbType);

  if (dice(6) > 5) {// adverb is optional
    str.push("_adverb");
  }

  // the structure needs an object if the verb is transitive
  if (verbType == "_vTransitive") {
    str.push("_object");
  }

  return str;
}

// off-topic warning! don't mind the implementation here,
// just know it's a random pick in the array
function randomPick(sourceArray) {
  return sourceArray[dice(sourceArray.length) - 1];
}

// Same as above, not the point, just know it's a die roll (random integer from 1 to max)
function dice(max) {
  if (max < 1) { return 0; }
  return Math.round((Math.random() * max) + .5);
}

At some point I wanted to know how many different unique strings it could output, so I wrote something like this (again, very simplified):

function calculateStats() {// the "broken leg" function I'm trying to improve/replace
  var total = 0;
  // lines below : +1 to account for 'no location' or 'no adverb'
  var nbOfLocations = Dictionnary.lists._location.length + 1;
  var nbOfAdverbs = Dictionnary.lists._adverb.length + 1;

  var nbOfTransitiveSentences = 
    nbOfLocations *
    Dictionnary.lists._vTransitive.length *
    nbOfAdverbs *
    Dictionnary.lists._object.length;
  var nbOfIntransitiveSentences =
    nbOfLocations *
    Dictionnary.lists._vIntransitive.length *
    nbOfAdverbs;

  total = nbOfTransitiveSentences + nbOfIntransitiveSentences;
  return total;
}

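(Side note: don't worry about namespace pollution, type checks on input parameters, or that sort of thing; assume this lives in a bubble, for the sake of example clarity.)

IMPORTANT DISCLAIMER: This is not about fixing the code I posted. It is a mockup, and it works as it is. The real question is: "As the complexity of possible structures grows in the future, along with the size and diversity of the lists, what would be a better strategy for calculating stats on these kinds of random constructions than my clumsy calculateStats() function, which is hard to maintain, likely to handle astronomically big numbers*, and prone to errors?"

* In the real tool there are currently 351120 unique structures, and for sentences... the total passed 10^80 some time ago.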
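On the astronomically big numbers: a JavaScript Number stops representing integers exactly above 2^53 - 1, so a total beyond 10^80 would only be an approximation. One option is to count with BigInt. A minimal sketch, where `countCombinations` and the sample sizes are hypothetical and chosen only for illustration (they are not figures from the real tool):

```javascript
// Hypothetical helper: multiplies list sizes exactly using BigInt.
function countCombinations(listSizes) {
  return listSizes.reduce(
    (product, size) => product * BigInt(size),
    1n
  );
}

// Illustrative sizes only: 60 slots with 40 choices each.
const sizes = new Array(60).fill(40);
const exact = countCombinations(sizes);

// The same product with Number is finite but no longer an exact integer.
const approx = sizes.reduce((product, size) => product * size, 1);

console.log(exact.toString());             // every digit is exact
console.log(Number.isSafeInteger(approx)); // false
```

BigInt values cannot be mixed with Number values in arithmetic, so the whole counting path would need to stay in BigInt until the result is formatted for display.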

1 answer | as of 6 years ago

SirPeople 6 years ago

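Since the structure of your sentences varies a lot (it already does in this small example, and I can't imagine how much it varies in the real code), I would do something like this:

First, I would need some way to store all the possible sentence structures for a given Dictionary... Maybe I would create a Language object that has the dictionary as a property and to which I can add the possible sentence structures (this part could probably be optimized by finding a more procedural way to generate all possible sentence structures, such as a rule engine). What do I mean by a sentence structure? Well, following your example, I will call a sentence structure something like this: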

[ 'location', 'transitive-verb', 'adverb', 'object' ] <- Transitive sentence
[ 'location', 'intransitive-verb', 'adverb' ] <- Intransitive sentence

You could find a way to generate these structures, or hardcode them.
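One way to generate the structures rather than hardcode them is to describe a sentence as a sequence of slots, each with its alternatives, and expand the cartesian product. A minimal sketch: the `template` format and the `expandTemplate` function are assumptions made for illustration, and the list names are the ones from the question's mockup.

```javascript
// Hypothetical template: each slot is an array of alternatives,
// and each alternative is an array of list names ([] means "nothing").
const template = [
  [[], ["_location"]], // optional location
  [["_subject"]],      // mandatory subject
  [
    ["_vIntransitive"],
    ["_vIntransitive", "_adverb"],
    ["_vTransitive", "_object"],
    ["_vTransitive", "_adverb", "_object"],
  ],
];

// Expands the template into every concrete structure
// (cartesian product of the slots, in order).
function expandTemplate(slots) {
  return slots.reduce(
    (partials, alternatives) =>
      partials.flatMap((partial) =>
        alternatives.map((alternative) => partial.concat(alternative))
      ),
    [[]]
  );
}

const structures = expandTemplate(template);
console.log(structures.length); // 2 * 1 * 4 = 8 structures
console.log(structures[7]);
// [ '_location', '_subject', '_vTransitive', '_adverb', '_object' ]
```

Adding a new optional element or verb type then means editing the template, not the list of structures.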

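But... why do I think this can improve the way you calculate the stats? Because map/reduce operations let you minimize the hardcoding for each sentence and make the whole thing more extensible.

So... how?

Assume our structures are accessible in the global scope, through the object, or through the dictionary itself: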

// Somewhere in the code
// Illustrative names: they must match the keys of Dictionnary.lists
// (e.g. '_location') for the lookup below to work
const structures = [
  [ 'location', 'transitive-verb', 'adverb', 'object' ],
  [ 'location', 'intransitive-verb', 'adverb' ]
];
// ...
// In this example I just passed it as an argument
function calculateStats(structures) {
  const numberOfCombinations = structures.reduce((total, structure) => {
      // We should calculate the number of combinations a structure has:
      // the product of the sizes of its word lists
      const numberOfSentences = structure.reduce((acc, wordType) => {
          // For each word type, we access the list and get the length (I am not doing safety checks for any wordType)
          // Dictionnary is spelled as in the question
          return acc * Dictionnary.lists[wordType].length;
      }, 1); // Initial accumulator (1, the neutral element for multiplication)

      return total + numberOfSentences;
  }, 0); // Initial accumulator
  return numberOfCombinations;
}

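So instead of hardcoding every possible combination, we rely on iterating over the different structures: basically you only need to add structures, and the calculateStats function should not grow.

If you need more complex calculations, you will need to change the function used in the reducer.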
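As one example of such a change, a reducer that understands optional elements avoids listing each structure once with and once without them. This is a sketch under assumptions: the `{ list, optional }` slot shape and `countSentences` are hypothetical, and `lists` is a copy of the question's `Dictionnary.lists` so that the snippet runs on its own.

```javascript
// Copy of the question's word lists, so the sketch is self-contained.
const lists = {
  _location: ["here,", "at my cousin's,", "in Antarctica,"],
  _subject: ["some guy", "the teacher", "Godzilla"],
  _vTransitive: ["is eating", "is scolding", "is seeing"],
  _vIntransitive: ["is working", "is sitting", "is yawning"],
  _adverb: ["slowly", "very carefully", "with a passion"],
  _object: ["this chair", "an egg", "the statue of Liberty"],
};

// Hypothetical structure format: an optional slot may also be left empty.
const sentenceStructures = [
  [
    { list: "_location", optional: true },
    { list: "_subject" },
    { list: "_vTransitive" },
    { list: "_adverb", optional: true },
    { list: "_object" },
  ],
  [
    { list: "_location", optional: true },
    { list: "_subject" },
    { list: "_vIntransitive" },
    { list: "_adverb", optional: true },
  ],
];

// Product of the choices per slot, summed over all structures.
function countSentences(structures, wordLists) {
  return structures.reduce((total, structure) => {
    const perStructure = structure.reduce((product, slot) => {
      // An optional slot has one extra choice: "nothing".
      const choices = wordLists[slot.list].length + (slot.optional ? 1 : 0);
      return product * choices;
    }, 1);
    return total + perStructure;
  }, 0);
}

console.log(countSentences(sentenceStructures, lists));
// 4*3*3*4*3 + 4*3*3*4 = 576
```

This sketch counts the subject list too, which the question's mockup calculateStats() leaves out; that is why it gives 576 where the mockup gives 192.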

I know very little about grammar or syntactic analysis, so maybe you, knowing more, can find a simpler way or do "smarter calculations".
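Since the question also worries about the count being error-prone, one possible sanity check is to compare the formula against a brute-force enumeration on deliberately small lists. A sketch only: `smallLists`, `smallStructures`, `countByFormula` and `countByEnumeration` are hypothetical names, and the approach assumes the lists contain no duplicate entries.

```javascript
// Small hypothetical lists so that full enumeration stays cheap.
const smallLists = {
  _subject: ["some guy", "Godzilla"],
  _vTransitive: ["is eating", "is seeing"],
  _vIntransitive: ["is yawning"],
  _object: ["an egg", "this chair"],
};

const smallStructures = [
  ["_subject", "_vTransitive", "_object"],
  ["_subject", "_vIntransitive"],
];

// Formula: product of list lengths per structure, summed over structures.
function countByFormula(structures, wordLists) {
  return structures.reduce(
    (total, structure) =>
      total +
      structure.reduce((product, name) => product * wordLists[name].length, 1),
    0
  );
}

// Brute force: build every sentence and count the distinct strings.
function countByEnumeration(structures, wordLists) {
  const seen = new Set();
  for (const structure of structures) {
    let sentences = [""];
    for (const name of structure) {
      sentences = sentences.flatMap((prefix) =>
        wordLists[name].map((word) => (prefix ? prefix + " " + word : word))
      );
    }
    sentences.forEach((sentence) => seen.add(sentence));
  }
  return seen.size;
}

console.log(countByFormula(smallStructures, smallLists));     // 2*2*2 + 2*1 = 10
console.log(countByEnumeration(smallStructures, smallLists)); // 10
```

Enumeration is only feasible for tiny inputs, but if the two numbers disagree there, the formula (or the structure definitions) has a problem that would also affect the large totals.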

I took the liberty of writing it in ES6 style. If reduce is a strange animal to you, you can read more here, or use lodash / ramda / whatever ^^