
OpenMPI -host and -hostfile options

  • mahmood  · 7 years ago

    When I specify the hosts with -host, it works: the processes span the hosts I specified. However, when I specify -hostfile, it does not work!

    mahmood@cluster:mpitest$ /share/apps/computer/openmpi-2.0.1/bin/mpirun -host compute-0-0,cluster -np 2 a.out
    ****************************************************************************
    * hwloc 1.11.2 has encountered what looks like an error from the operating system.
    *
    * Package (P#1 cpuset 0xffff0000) intersects with NUMANode (P#1 cpuset 0xff00ffff) without inclusion!
    * Error occurred in topology.c line 1048
    *
    * The following FAQ entry in the hwloc documentation may help:
    *   What should I do when hwloc reports "operating system" warnings?
    * Otherwise please report this error message to the hwloc user's mailing list,
    * along with the output+tarball generated by the hwloc-gather-topology script.
    ****************************************************************************
    Hello world from processor cluster.hpc.org, rank 1 out of 2 processors
    Hello world from processor compute-0-0.local, rank 0 out of 2 processors
    mahmood@cluster:mpitest$ cat hosts
    cluster
    compute-0-0
    
    mahmood@cluster:mpitest$ /share/apps/computer/openmpi-2.0.1/bin/mpirun -hostfile hosts -np 2 a.out      
    ****************************************************************************
    * hwloc 1.11.2 has encountered what looks like an error from the operating system.
    *
    * Package (P#1 cpuset 0xffff0000) intersects with NUMANode (P#1 cpuset 0xff00ffff) without inclusion!
    * Error occurred in topology.c line 1048
    *
    * The following FAQ entry in the hwloc documentation may help:
    *   What should I do when hwloc reports "operating system" warnings?
    * Otherwise please report this error message to the hwloc user's mailing list,
    * along with the output+tarball generated by the hwloc-gather-topology script.
    ****************************************************************************
    Hello world from processor cluster.hpc.org, rank 0 out of 2 processors
    Hello world from processor cluster.hpc.org, rank 1 out of 2 processors
    

    So what is the problem, and how can I fix it?

    1 reply
  •   Hristo Iliev    7 years ago

    Each host listed in the -host option is treated as having one slot, so -host A,B means one slot on host A and one slot on host B.

    To tell mpiexec to start N processes per node, use the following option:

    --map-by ppr:N:node
    

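    For one process per node, as in your case:

    --map-by ppr:1:node

    Alternatively, give each host a single slot in the hostfile: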

    cluster     slots=1 max_slots=1
    compute-0-0 slots=1 max_slots=1
    

    (Although slots=1 should be the default if it is not provided...)
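
    As a rough sketch, the two remedies from this answer could be applied to the command from the question as follows. The mpirun path, the host names, the hosts file and a.out are taken from the question; the file name hosts.1slot is hypothetical and chosen only for this example. The launch is skipped if mpirun is not present at that path.

```shell
# Sketch only: reuses the mpirun path and host names from the question.
MPIRUN=/share/apps/computer/openmpi-2.0.1/bin/mpirun

# Remedy 2: a hostfile that limits every node to one slot.
# "hosts.1slot" is a hypothetical file name used for illustration.
cat > hosts.1slot <<'EOF'
cluster     slots=1 max_slots=1
compute-0-0 slots=1 max_slots=1
EOF

if [ -x "$MPIRUN" ]; then
    # Remedy 1: keep the original hostfile and request one process per node.
    "$MPIRUN" -hostfile hosts --map-by ppr:1:node -np 2 a.out
    # Remedy 2: use the slot-limited hostfile instead.
    "$MPIRUN" -hostfile hosts.1slot -np 2 a.out
else
    echo "mpirun not found at $MPIRUN; skipping the launch" >&2
fi
```

    With either variant, the expectation is one "Hello world" line per host, as in the -host run from the question.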