代码之家 › 专栏 › 技术社区 › Luis Mendo

生成包含从n个向量中提取的元素的所有组合的矩阵

cartesian-product combinations matrix matlab arrays

Luis Mendo · 技术社区 · 11 年前

这个问题经常以一种或另一种形式出现(例如参见 here 或 here ). 所以我想我会以一种通用的形式来介绍它,并提供一个可能供将来参考的答案。

给定任意数字 n 可能大小不同的向量,生成 n -列矩阵,其行描述从这些向量(笛卡尔积)中提取的元素的所有组合。

例如

vectors = { [1 2], [3 6 9], [10 20] }

应该给予

combs = [ 1     3    10
          1     3    20
          1     6    10
          1     6    20
          1     9    10
          1     9    20
          2     3    10
          2     3    20
          2     6    10
          2     6    20
          2     9    10
          2     9    20 ]

4 回复 | 直到 7 年前

Luis Mendo 11 年前

这个 ndgrid 函数几乎给出了答案,但有一个警告: n 必须显式定义输出变量才能调用它。由于 n 是任意的,最好的方法是使用 comma-separated list (从具有 n 细胞)作为输出。由此产生的 n 然后将矩阵连接到所需的 n -列矩阵:

vectors = { [1 2], [3 6 9], [10 20] }; %// input data: cell array of vectors

n = numel(vectors); %// number of vectors
combs = cell(1,n); %// pre-define to generate comma-separated list
[combs{end:-1:1}] = ndgrid(vectors{end:-1:1}); %// the reverse order in these two
%// comma-separated lists is needed to produce the rows of the result matrix in
%// lexicographical order 
combs = cat(n+1, combs{:}); %// concat the n n-dim arrays along dimension n+1
combs = reshape(combs,[],n); %// reshape to obtain desired matrix

horchler 11 年前

简单一点。。。如果你有神经网络工具箱,你可以简单地使用 combvec :

vectors = {[1 2], [3 6 9], [10 20]};
combs = combvec(vectors{:}).' % Use cells as arguments

其以稍微不同的顺序返回矩阵:

combs =

     1     3    10
     2     3    10
     1     6    10
     2     6    10
     1     9    10
     2     9    10
     1     3    20
     2     3    20
     1     6    20
     2     6    20
     1     9    20
     2     9    20

如果您想要问题中的矩阵,可以使用 sortrows :

combs = sortrows(combvec(vectors{:}).')
% Or equivalently as per @LuisMendo in the comments: 
% combs = fliplr(combvec(vectors{end:-1:1}).')

这给出了

combs =

     1     3    10
     1     3    20
     1     6    10
     1     6    20
     1     9    10
     1     9    20
     2     3    10
     2     3    20
     2     6    10
     2     6    20
     2     9    10
     2     9    20

如果您查看 康姆维奇 (类型 edit combvec 在命令窗口中),您将看到它使用的代码与@LuisMendo的答案不同。我不能说总体上哪个效率更高。

如果恰好有一个矩阵的行与前面的单元格数组相似,则可以使用:

vectors = [1 2;3 6;10 20];
vectors = num2cell(vectors,2);
combs = sortrows(combvec(vectors{:}).')

Luis Mendo 10 年前

我已经对两个提议的解决方案进行了一些基准测试。基准测试代码基于 timeit function ,并包含在本文末尾。

我考虑两种情况:三个大小向量 n 和三个大小的矢量 n/10 , n 和 n*10 (两种情况给出相同数量的组合)。 n 最大变化为 240 (我选择此值是为了避免在笔记本电脑中使用虚拟内存)。

结果如下图所示 ndgrid -基于的解决方案通常比 combvec 。值得注意的是 康姆维奇 在不同大小的情况下,变化的规律性稍小。

enter image description here

基准测试代码

的函数 nd栅格 -基于解决方案:

function combs = f1(vectors)
n = numel(vectors); %// number of vectors
combs = cell(1,n); %// pre-define to generate comma-separated list
[combs{end:-1:1}] = ndgrid(vectors{end:-1:1}); %// the reverse order in these two
%// comma-separated lists is needed to produce the rows of the result matrix in
%// lexicographical order
combs = cat(n+1, combs{:}); %// concat the n n-dim arrays along dimension n+1
combs = reshape(combs,[],n);

的函数 康姆维奇 解决方案:

function combs = f2(vectors)
combs = combvec(vectors{:}).';

通过调用来测量时间的脚本 计时 关于这些功能:

nn = 20:20:240;
t1 = [];
t2 = [];
for n = nn;
    %//vectors = {1:n, 1:n, 1:n};
    vectors = {1:n/10, 1:n, 1:n*10};
    t = timeit(@() f1(vectors));
    t1 = [t1; t];
    t = timeit(@() f2(vectors));
    t2 = [t2; t];
end

Geoff 9 年前

这里有一个自己动手的方法,让我高兴得咯咯笑 nchoosek ,尽管它是不比@Luis Mendo的公认解决方案更好。

对于给定的示例,在1000次运行后,此解决方案平均花费了我的机器0.00065935秒,而接受的解决方案为0.00012877秒。对于更大的向量,遵循@Luis Mendo的基准测试帖子,此解决方法始终比接受的答案慢。尽管如此,我还是决定发布它,希望你能从中找到一些有用的东西:

代码:

tic;
v = {[1 2], [3 6 9], [10 20]};

L = [0 cumsum(cellfun(@length,v))];
V = cell2mat(v);

J = nchoosek(1:L(end),length(v));
J(any(J>repmat(L(2:end),[size(J,1) 1]),2) | ...
  any(J<=repmat(L(1:end-1),[size(J,1) 1]),2),:)  = [];

V(J)
toc

给予

ans =

 1     3    10
 1     3    20
 1     6    10
 1     6    20
 1     9    10
 1     9    20
 2     3    10
 2     3    20
 2     6    10
 2     6    20
 2     9    10
 2     9    20

Elapsed time is 0.018434 seconds.

说明:

L 使用 cellfun 虽然 手机游戏 基本上是一个循环,考虑到向量的数量必须相对较低,这个问题才可能实际。

V 连接所有向量以便于稍后访问(这假设您将所有向量作为行输入。v’将适用于列向量。)

恩丘塞克 找到所有可以选择的方法 n=length(v) 元素总数中的元素 L(end) . 这里会有比我们需要的更多的组合。

J =

 1     2     3
 1     2     4
 1     2     5
 1     2     6
 1     2     7
 1     3     4
 1     3     5
 1     3     6
 1     3     7
 1     4     5
 1     4     6
 1     4     7
 1     5     6
 1     5     7
 1     6     7
 2     3     4
 2     3     5
 2     3     6
 2     3     7
 2     4     5
 2     4     6
 2     4     7
 2     5     6
 2     5     7
 2     6     7
 3     4     5
 3     4     6
 3     4     7
 3     5     6
 3     5     7
 3     6     7
 4     5     6
 4     5     7
 4     6     7
 5     6     7

因为只有两个元素 v(1) ,我们需要在 J(:,1)>2 。类似地,其中 J(:,2)<3 , J(:,2)>5 等…使用 L 和 repmat 我们可以确定 J 在其适当范围内,然后使用 any 丢弃具有任何错误元素的行。

最后,这些不是 v ,只是索引。 V(J) 将返回所需的矩阵。