Iterative Cartesian product in Java

cartesian-product algorithm java

akappa · 15 years ago

I want to compute the Cartesian product of an arbitrary number of non-empty sets in Java.

I've written this iterative code…

public static <T> List<Set<T>> cartesianProduct(List<Set<T>> list) {
    List<Iterator<T>> iterators = new ArrayList<Iterator<T>>(list.size());
    List<T> elements = new ArrayList<T>(list.size());
    List<Set<T>> toRet = new ArrayList<Set<T>>();
    for (int i = 0; i < list.size(); i++) {
        iterators.add(list.get(i).iterator());
        elements.add(iterators.get(i).next());
    }
    for (int j = 1; j >= 0;) {
        toRet.add(Sets.newHashSet(elements));
        for (j = iterators.size()-1; j >= 0 && !iterators.get(j).hasNext(); j--) {
            iterators.set(j, list.get(j).iterator());
            elements.set(j, iterators.get(j).next());
        }
        elements.set(Math.abs(j), iterators.get(Math.abs(j)).next());
    }
    return toRet;
}

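…but I find it rather inelegant. Does anybody have a better, still iterative, solution? A solution that uses some functional-style approach? Otherwise… suggestions on how to improve it? Errors?

9 answers | most recent 8 years ago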

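Kevin Bourrillion Gergely 15 years ago

I've written a solution that doesn't require you to fill up a large collection in memory. Unfortunately, the code required is hundreds of lines long. You may have to wait until it appears in the Guava project ( http://guava-libraries.googlecode.com ), which I hope will be by the end of the year. Sorry. :(

Note that you may not need such a utility if the number of sets you're taking the Cartesian product of is a fixed number known at compile time; you could just use that number of nested for loops.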
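For the fixed-count case, a minimal sketch with exactly three lists might look like the following. The names FixedProduct and product3 are made up for illustration; nothing here comes from Guava.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical example class: product of exactly three lists.
class FixedProduct {
    /** Cartesian product of three lists, one nested loop per list. */
    static List<List<String>> product3(List<String> as, List<String> bs, List<String> cs) {
        List<List<String>> result = new ArrayList<>();
        for (String a : as) {
            for (String b : bs) {
                for (String c : cs) {
                    result.add(Arrays.asList(a, b, c));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> p = product3(
                Arrays.asList("a1", "a2"),
                Arrays.asList("b1", "b2"),
                Arrays.asList("c1"));
        System.out.println(p); // prints [[a1, b1, c1], [a1, b2, c1], [a2, b1, c1], [a2, b2, c1]]
    }
}
```

This only works because the number of loops is written into the source; for a count known only at run time you need one of the general approaches in the other answers.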

EDIT: The code is released now.

Sets.cartesianProduct()

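I think you'll be very happy with it. It only creates the individual lists as you ask for them; it doesn't fill up memory with all MxNxPxQ of them.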
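To give a feel for what creating each list only on request can look like, here is a rough sketch of a lazy view. It is not Guava's code: LazyProduct is a hypothetical class, and it assumes every axis is a random-access List and that the total number of tuples fits in an int.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical lazy view over a Cartesian product; NOT Guava's implementation.
class LazyProduct<T> extends AbstractList<List<T>> {
    private final List<List<T>> axes;
    private final int size;

    LazyProduct(List<List<T>> axes) {
        this.axes = axes;
        int n = 1;
        for (List<T> axis : axes) {
            n = Math.multiplyExact(n, axis.size()); // throws on int overflow
        }
        this.size = n;
    }

    @Override
    public int size() {
        return size;
    }

    @Override
    public List<T> get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        // Decode the index as a mixed-radix number; the last axis varies fastest.
        List<T> tuple = new ArrayList<>(Collections.<T>nCopies(axes.size(), null));
        for (int i = axes.size() - 1; i >= 0; i--) {
            List<T> axis = axes.get(i);
            tuple.set(i, axis.get(index % axis.size()));
            index /= axis.size();
        }
        return tuple;
    }

    public static void main(String[] args) {
        List<List<String>> axes = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("x", "y", "z"));
        LazyProduct<String> product = new LazyProduct<>(axes);
        System.out.println(product.size()); // prints 6
        System.out.println(product.get(4)); // prints [b, y]
    }
}
```

Only the axes are stored; each tuple is built when get is called, so memory use stays proportional to the input rather than to the size of the product.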

If you want to inspect the source, it's here at line 727.

Enjoy!

Marcello DeSales 9 years ago

With Google Guava 19 and Java 8 this is very simple:

Say you have a list of all the arrays you want to combine…

public static void main(String[] args) {
  List<String[]> elements = Arrays.asList(
    new String[]{"John", "Mary"}, 
    new String[]{"Eats", "Works", "Plays"},
    new String[]{"Food", "Computer", "Guitar"}
  );

  // Create a list of immutableLists of strings
  List<ImmutableList<String>> immutableElements = makeListofImmutable(elements);

  // Use Guava's Lists.cartesianProduct, since Guava 19
  List<List<String>> cartesianProduct = Lists.cartesianProduct(immutableElements);

  System.out.println(cartesianProduct);
}

The method that builds the immutable lists is as follows:

/**
 * @param values the list of all profiles provided by the client in matrix.json
 * @return the list of ImmutableList to compute the Cartesian product of values
 */
private static List<ImmutableList<String>> makeListofImmutable(List<String[]> values) {
  List<ImmutableList<String>> converted = new LinkedList<>();
  values.forEach(array -> {
    converted.add(ImmutableList.copyOf(array));
  });
  return converted;
}

The output is as follows:

[
  [John, Eats, Food], [John, Eats, Computer], [John, Eats, Guitar],
  [John, Works, Food], [John, Works, Computer], [John, Works, Guitar], 
  [John, Plays, Food], [John, Plays, Computer], [John, Plays, Guitar],
  [Mary, Eats, Food], [Mary, Eats, Computer], [Mary, Eats, Guitar],
  [Mary, Works, Food], [Mary, Works, Computer], [Mary, Works, Guitar],
  [Mary, Plays, Food], [Mary, Plays, Computer], [Mary, Plays, Guitar]
]
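If Guava isn't on the classpath, roughly the same list can be built with the JDK alone. The sketch below is an assumption-laden alternative, not part of the answer above: StreamProduct, product and append are hypothetical names, and unlike Guava's view it materializes every tuple eagerly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical JDK-only alternative; builds the whole product in memory.
class StreamProduct {
    static List<List<String>> product(List<String[]> arrays) {
        // Start with one empty tuple, then extend every tuple by each element
        // of the next array in turn.
        List<List<String>> result = Collections.singletonList(Collections.<String>emptyList());
        for (String[] array : arrays) {
            result = result.stream()
                    .flatMap(prefix -> Arrays.stream(array).map(s -> append(prefix, s)))
                    .collect(Collectors.toList());
        }
        return result;
    }

    /** Returns a new list holding prefix followed by s. */
    private static List<String> append(List<String> prefix, String s) {
        List<String> tuple = new ArrayList<>(prefix);
        tuple.add(s);
        return tuple;
    }

    public static void main(String[] args) {
        List<String[]> elements = Arrays.asList(
                new String[]{"John", "Mary"},
                new String[]{"Eats", "Works"});
        System.out.println(product(elements));
        // prints [[John, Eats], [John, Works], [Mary, Eats], [Mary, Works]]
    }
}
```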

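Michael Easter 15 years ago

The answer below uses iteration rather than recursion. It uses the same Tuple class from my earlier answer.

It's a separate answer because, IMHO, both are valid, different approaches.

Here is the new main class: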

public class Example {

    public static <T> List<Tuple<T>> cartesianProduct(List<Set<T>> sets) {
        List<Tuple<T>> tuples = new ArrayList<Tuple<T>>();

        for (Set<T> set : sets) {            
            if (tuples.isEmpty()) {
                for (T t : set) {
                    Tuple<T> tuple = new Tuple<T>();
                    tuple.add(t);    
                    tuples.add(tuple);
                }                
            } else {
                List<Tuple<T>> newTuples = new ArrayList<Tuple<T>>();

                for (Tuple<T> subTuple : tuples) {
                    for (T t : set) {
                        Tuple<T> tuple = new Tuple<T>();
                        tuple.addAll(subTuple);
                        tuple.add(t);
                        newTuples.add(tuple);
                    }
                }                

                tuples = newTuples;
            }
        }

        return tuples;
    }
}

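John Kristian 13 years ago

Here's an iterative, lazy implementation I wrote. The interface is very similar to Google's Sets.cartesianProduct, but it's a bit more flexible: it deals in Iterables rather than Sets. This code and its unit tests are at https://gist.github.com/1911614 .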

/* Copyright 2012 LinkedIn Corp.

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
 */

import com.google.common.base.Function;
import com.google.common.collect.Iterables;
import java.lang.reflect.Array;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/**
 * Implements the Cartesian product of ordered collections.
 * 
 * @author <a href="mailto:jmkristian@gmail.com">John Kristian</a>
 */
public class Cartesian {
  /**
   * Generate the <a href="http://en.wikipedia.org/wiki/Cartesian_product">Cartesian
   * product</a> of the given axes. For axes [[a1, a2 ...], [b1, b2 ...], [c1, c2 ...]
   * ...] the product is [[a1, b1, c1 ...] ... [a1, b1, c2 ...] ... [a1, b2, c1 ...] ...
   * [aN, bN, cN ...]]. In other words, the results are generated in same order as these
   * nested loops:
   * 
   * <pre>
   * for (T a : [a1, a2 ...])
   *   for (T b : [b1, b2 ...])
   *     for (T c : [c1, c2 ...])
   *       ...
   *         result = new T[]{ a, b, c ... };
   * </pre>
   * 
   * Each result is a new array of T, whose elements refer to the elements of the axes. If
   * you prefer a List, you can call asLists(product(axes)).
   * <p>
   * Don't change the axes while iterating over their product, as a rule. Changes to an
   * axis can affect the product or cause iteration to fail (which is usually bad). To
   * prevent this, you can pass clones of your axes to this method.
   * <p>
   * The implementation is lazy. This method iterates over the axes, and returns an
   * Iterable that contains a reference to each axis. Iterating over the product causes
   * iteration over each axis. Methods of each axis are called as late as practical.
   */
  public static <T> Iterable<T[]> product(Class<T> resultType,
                                          Iterable<? extends Iterable<? extends T>> axes) {
    return new Product<T>(resultType, newArray(Iterable.class, axes));
  }

  /** Works like product(resultType, Arrays.asList(axes)), but slightly more efficient. */
  public static <T> Iterable<T[]> product(Class<T> resultType, Iterable<? extends T>... axes) {
    return new Product<T>(resultType, axes.clone());
  }

  /**
   * Wrap the given arrays in fixed-size lists. Changes to the lists write through to the
   * arrays.
   */
  public static <T> Iterable<List<T>> asLists(Iterable<? extends T[]> arrays) {
    return Iterables.transform(arrays, new AsList<T>());
  }

  /**
   * Arrays.asList, represented as a Function (as used in Google collections).
   */
  public static class AsList<T> implements Function<T[], List<T>> {
    @Override
    public List<T> apply(T[] array) {
      return Arrays.asList(array);
    }
  }

  /** Create a generic array containing references to the given objects. */
  private static <T> T[] newArray(Class<? super T> elementType, Iterable<? extends T> from) {
    List<T> list = new ArrayList<T>();
    for (T f : from)
      list.add(f);
    return list.toArray(newArray(elementType, list.size()));
  }

  /** Create a generic array. */
  @SuppressWarnings("unchecked")
  private static <T> T[] newArray(Class<? super T> elementType, int length) {
    return (T[]) Array.newInstance(elementType, length);
  }

  private static class Product<T> implements Iterable<T[]> {
    private final Class<T> _resultType;
    private final Iterable<? extends T>[] _axes;

    /** Caution: the given array of axes is contained by reference, not cloned. */
    Product(Class<T> resultType, Iterable<? extends T>[] axes) {
      _resultType = resultType;
      _axes = axes;
    }

    @Override
    public Iterator<T[]> iterator() {
      if (_axes.length <= 0) // an edge case
        return Collections.singleton(newArray(_resultType, 0)).iterator();
      return new ProductIterator<T>(_resultType, _axes);
    }

    @Override
    public String toString() {
      return "Cartesian.product(" + Arrays.toString(_axes) + ")";
    }

    private static class ProductIterator<T> implements Iterator<T[]> {
      private final Iterable<? extends T>[] _axes;
      private final Iterator<? extends T>[] _iterators; // one per axis
      private final T[] _result; // a copy of the last result
      /**
       * The minimum index such that this.next() will return an array that contains
       * _iterators[index].next(). There are some special sentinel values: NEW means this
       * is a freshly constructed iterator, DONE means all combinations have been
       * exhausted (so this.hasNext() == false) and _iterators.length means the value is
       * unknown (to be determined by this.hasNext).
       */
      private int _nextIndex = NEW;
      private static final int NEW = -2;
      private static final int DONE = -1;

      /** Caution: the given array of axes is contained by reference, not cloned. */
      ProductIterator(Class<T> resultType, Iterable<? extends T>[] axes) {
        _axes = axes;
        _iterators = Cartesian.<Iterator<? extends T>> newArray(Iterator.class, _axes.length);
        for (int a = 0; a < _axes.length; ++a) {
          _iterators[a] = axes[a].iterator();
        }
        _result = newArray(resultType, _iterators.length);
      }

      private void close() {
        _nextIndex = DONE;
        // Release references, to encourage garbage collection:
        Arrays.fill(_iterators, null);
        Arrays.fill(_result, null);
      }

      @Override
      public boolean hasNext() {
        if (_nextIndex == NEW) { // This is the first call to hasNext().
          _nextIndex = 0; // start here
          for (Iterator<? extends T> iter : _iterators) {
            if (!iter.hasNext()) {
              close(); // no combinations
              break;
            }
          }
        } else if (_nextIndex >= _iterators.length) {
          // This is the first call to hasNext() after next() returned a result.
          // Determine the _nextIndex to be used by next():
          for (_nextIndex = _iterators.length - 1; _nextIndex >= 0; --_nextIndex) {
            Iterator<? extends T> iter = _iterators[_nextIndex];
            if (iter.hasNext()) {
              break; // start here
            }
            if (_nextIndex == 0) { // All combinations have been generated.
              close();
              break;
            }
            // Repeat this axis, with the next value from the previous axis.
            iter = _axes[_nextIndex].iterator();
            _iterators[_nextIndex] = iter;
            if (!iter.hasNext()) { // Oops; this axis can't be repeated.
              close(); // no more combinations
              break;
            }
          }
        }
        return _nextIndex >= 0;
      }

      @Override
      public T[] next() {
        if (!hasNext())
          throw new NoSuchElementException("!hasNext");
        for (; _nextIndex < _iterators.length; ++_nextIndex) {
          _result[_nextIndex] = _iterators[_nextIndex].next();
        }
        return _result.clone();
      }

      @Override
      public void remove() {
        for (Iterator<? extends T> iter : _iterators) {
          iter.remove();
        }
      }

      @Override
      public String toString() {
        return "Cartesian.product(" + Arrays.toString(_axes) + ").iterator()";
      }
    }
  }
}

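Remko Popma 13 years ago

Index-based solution

Working with indices is a simple alternative that is fast and memory-efficient and can handle any number of sets. Implementing Iterable allows easy use in a for-each loop. See the main method for a usage example.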

import java.util.Arrays;
import java.util.Iterator;
import java.util.concurrent.TimeUnit;

public class CartesianProduct implements Iterable<int[]>, Iterator<int[]> {

private final int[] _lengths;
private final int[] _indices;
private boolean _hasNext = true;

public CartesianProduct(int[] lengths) {
    _lengths = lengths;
    _indices = new int[lengths.length];
}

public boolean hasNext() {
    return _hasNext;
}

public int[] next() {
    int[] result = Arrays.copyOf(_indices, _indices.length);
    for (int i = _indices.length - 1; i >= 0; i--) {
        if (_indices[i] == _lengths[i] - 1) {
            _indices[i] = 0;
            if (i == 0) {
                _hasNext = false;
            }
        } else {
            _indices[i]++;
            break;
        }
    }
    return result;
}

public Iterator<int[]> iterator() {
    return this;
}

public void remove() {
    throw new UnsupportedOperationException();
}

/**
 * Usage example. Prints out
 * 
 * <pre>
 * [0, 0, 0] a, NANOSECONDS, 1
 * [0, 0, 1] a, NANOSECONDS, 2
 * [0, 0, 2] a, NANOSECONDS, 3
 * [0, 0, 3] a, NANOSECONDS, 4
 * [0, 1, 0] a, MICROSECONDS, 1
 * [0, 1, 1] a, MICROSECONDS, 2
 * [0, 1, 2] a, MICROSECONDS, 3
 * [0, 1, 3] a, MICROSECONDS, 4
 * [0, 2, 0] a, MILLISECONDS, 1
 * [0, 2, 1] a, MILLISECONDS, 2
 * [0, 2, 2] a, MILLISECONDS, 3
 * [0, 2, 3] a, MILLISECONDS, 4
 * [0, 3, 0] a, SECONDS, 1
 * [0, 3, 1] a, SECONDS, 2
 * [0, 3, 2] a, SECONDS, 3
 * [0, 3, 3] a, SECONDS, 4
 * [0, 4, 0] a, MINUTES, 1
 * [0, 4, 1] a, MINUTES, 2
 * ...
 * </pre>
 */
public static void main(String[] args) {
    String[] list1 = { "a", "b", "c", };
    TimeUnit[] list2 = TimeUnit.values();
    int[] list3 = new int[] { 1, 2, 3, 4 };

    int[] lengths = new int[] { list1.length, list2.length, list3.length };
    for (int[] indices : new CartesianProduct(lengths)) {
        System.out.println(Arrays.toString(indices) //
                + " " + list1[indices[0]] //
                + ", " + list2[indices[1]] //
                + ", " + list3[indices[2]]);
    }
}

}
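The core of the index approach is the carry step, which works like an odometer. Here is that step on its own as a small sketch; Odometer, advance and count are hypothetical names. It also guards two edge cases that the class above leaves to the caller: a dimension of length 0, and an empty lengths array.

```java
import java.util.Arrays;

// Hypothetical helper isolating the "increment with carry" step.
class Odometer {
    /**
     * Advances indices to the next combination, rightmost position fastest.
     * Returns false once every position has wrapped back to zero.
     */
    static boolean advance(int[] indices, int[] lengths) {
        for (int i = indices.length - 1; i >= 0; i--) {
            if (++indices[i] < lengths[i]) {
                return true;  // no carry needed
            }
            indices[i] = 0;   // wrap this position and carry to the left
        }
        return false;         // carried past the leftmost position
    }

    /** Counts the combinations by stepping through them; 0 if any dimension is empty. */
    static int count(int[] lengths) {
        for (int len : lengths) {
            if (len == 0) {
                return 0;
            }
        }
        int[] indices = new int[lengths.length];
        int n = 0;
        do {
            n++;
        } while (advance(indices, lengths));
        return n;
    }

    public static void main(String[] args) {
        int[] lengths = {2, 3};
        int[] indices = new int[lengths.length];
        do {
            System.out.println(Arrays.toString(indices));
        } while (advance(indices, lengths));
        // prints [0, 0], [0, 1], [0, 2], [1, 0], [1, 1], [1, 2], one per line
    }
}
```

With an empty lengths array, count returns 1, matching the convention that the product of zero sets contains exactly one empty tuple.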

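Michael Easter 15 years ago

I believe this is correct. It isn't aiming for efficiency, but for a clean style via recursion and abstraction.

The key abstraction is to introduce a simple Tuple class. This helps with the generics later: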

class Tuple<T> {
    private List<T> list = new ArrayList<T>();

    public void add(T t) { list.add(t); }

    public void addAll(Tuple<T> subT) {
        for (T t : subT.list) {
            list.add(t);
        }
    }

    public String toString() {
        String result = "(";

        for (T t : list) { result += t + ", "; }

        // Drop the trailing ", " (an empty tuple has nothing to trim).
        if (!list.isEmpty()) { result = result.substring(0, result.length() - 2); }
        result += " )";

        return result;
    } 
}

With this class, we can write a class like so:

public class Example {

public static <T> List<Tuple<T>> cartesianProduct(List<Set<T>> sets) {
    List<Tuple<T>> tuples = new ArrayList<Tuple<T>>();

    if (sets.size() == 1) {
        Set<T> set = sets.get(0);
        for (T t : set) {
            Tuple<T> tuple = new Tuple<T>();
            tuple.add(t);    
            tuples.add(tuple);
        }
    } else {
        // Take the first set without mutating the caller's list.
        Set<T> set = sets.get(0);
        List<Tuple<T>> subTuples = cartesianProduct(sets.subList(1, sets.size()));
        for (Tuple<T> subTuple : subTuples) {
            for (T t : set) {
                Tuple<T> tuple = new Tuple<T>();
                tuple.addAll(subTuple);
                tuple.add(t);
                tuples.add(tuple);
            }
        }
    }

    return tuples;
}

}

I have a nice example that shows this working, but it's omitted for brevity.
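Since that example is omitted, here is a small stand-in. It is not the author's example: RecursiveProduct is a hypothetical class that follows the same recursive idea but uses plain List<T> in place of Tuple, puts the element of the first list first in each tuple, and treats the product of zero lists as a single empty tuple.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in using plain lists instead of the Tuple class.
class RecursiveProduct {
    static <T> List<List<T>> product(List<List<T>> lists) {
        List<List<T>> result = new ArrayList<>();
        if (lists.isEmpty()) {
            // Base case: the product of zero lists is one empty tuple.
            result.add(Collections.<T>emptyList());
            return result;
        }
        List<T> head = lists.get(0);
        // Recurse on the remaining lists without modifying the input.
        List<List<T>> tails = product(lists.subList(1, lists.size()));
        for (T h : head) {
            for (List<T> tail : tails) {
                List<T> tuple = new ArrayList<>(tail.size() + 1);
                tuple.add(h);
                tuple.addAll(tail);
                result.add(tuple);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Integer>> input = Arrays.asList(
                Arrays.asList(1, 2),
                Arrays.asList(3, 4));
        System.out.println(product(input)); // prints [[1, 3], [1, 4], [2, 3], [2, 4]]
    }
}
```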

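Scott Ray 15 years ago

You might be interested in another question about Cartesian products (edit: removed to conserve hyperlinks; search for the tag cartesian-product). That answer has a nice recursive solution that I'd be hard pressed to improve on. Do you specifically want an iterative solution rather than a recursive one?

EDIT:

After looking at another iterative solution on Stack Overflow, written in Perl, and a clean explanation, here is another solution: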

public static <T> List<Set<T>> uglyCartesianProduct(List<Set<T>> list) {
        List<Set<T>> toRet = new ArrayList<Set<T>>();
        int total = numberOfTuples(list);

        // Pre-create one (initially empty) result set per tuple.
        for (int i = 0; i < total; i++) {
            toRet.add(new HashSet<T>());
        }

        // stride = how many consecutive tuples share the same element of the
        // current set; it grows by the size of every set already processed.
        int stride = 1;
        for (Set<T> set : list) {
            List<T> values = new ArrayList<T>(set); // indexable snapshot of the set
            for (int i = 0; i < total; i++) {
                toRet.get(i).add(values.get((i / stride) % values.size()));
            }
            stride *= values.size();
        }

        return toRet;
    }

    private static <T> int numberOfTuples(List<Set<T>> list) {
        int product = 1;
        for (Set<T> set : list) {
            product *= set.size();
        }
        return product;
    }

Mike Samuel 15 years ago

Here is a lazy iterator approach that uses a function to produce an appropriate output type.

  public static <T> Iterable<T> cartesianProduct(
      final Function<Object[], T> fn, Object[]... options) {
    final Object[][] opts = new Object[options.length][];
    for (int i = opts.length; --i >= 0;) {
      // NPE on null input collections, and handle the empty output case here
      // since the iterator code below assumes that it is not exhausted the
      // first time through fetch.
      if (options[i].length == 0) { return Collections.emptySet(); }
      opts[i] = options[i].clone();
    }
    return new Iterable<T>() {
      public Iterator<T> iterator() {
        return new Iterator<T>() {
          final int[] pos = new int[opts.length];
          boolean hasPending;
          T pending;
          boolean exhausted;

          public boolean hasNext() {
            fetch();
            return hasPending;
          }

          public T next() {
            fetch();
            if (!hasPending) { throw new NoSuchElementException(); }
            T out = pending;
            pending = null;  // release for GC
            hasPending = false;
            return out;
          }

          public void remove() { throw new UnsupportedOperationException(); }

          private void fetch() {
            if (hasPending || exhausted) { return; }
            // Produce a result.
            int n = pos.length;
            Object[] args = new Object[n];
            for (int j = n; --j >= 0;) { args[j] = opts[j][pos[j]]; }
            pending = fn.apply(args);
            hasPending = true;
            // Increment to next.
            for (int i = n; --i >= 0;) {
              if (++pos[i] < opts[i].length) {
                for (int j = n; --j > i;) { pos[j] = 0; }
                return;
              }
            }
            exhausted = true;
          }
        };
      }
    };
  }

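dbow 14 years ago

I wrote a recursive Cartesian product algorithm for arrays of Strings. You can modify it to work with sets instead. The algorithm is below. It's also in my article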

public class Main {

public static void main(String[] args) {
    String[] A = new String[]{ "a1", "a2", "a3" };
    String[] B = new String[]{ "b1", "b2", "b3" };
    String[] C = new String[]{ "c1" };

    String[] cp = CartesianProduct(0, A, B, C);

    for(String s : cp) {
         System.out.println(s);
    }
}

public static String[] CartesianProduct(int prodLevel, String[] res, String[] ...s) {
    if(prodLevel < s.length) {
        int cProdLen = res.length * s[prodLevel].length;
        String[] tmpRes = new String[cProdLen];

        for (int i = 0; i < res.length; i++) {
            for (int j = 0; j < s[prodLevel].length; j++) {
                // Row i starts at i * (length of the current array), not i * res.length.
                tmpRes[i * s[prodLevel].length + j] = res[i] + s[prodLevel][j];
            }
        }
        res = Main.CartesianProduct(prodLevel + 1, tmpRes, s);
    }
    return res;
}}