
Why does a COW mmap fail with ENOMEM on (sparse) files larger than 4GB?

  •  4
  • Diomidis Spinellis  · Tech Community  · 14 years ago

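    This happens on a 2.6.26-2-amd64 Linux kernel when trying to mmap a 5GB file with copy-on-write semantics (PROT_READ | PROT_WRITE and MAP_PRIVATE). Mapping files smaller than 4GB, or mapping with only PROT_READ, works fine. This is not the soft resource limit problem reported in this question; the virtual memory size limit is unlimited.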

    Here is code that reproduces the problem (the actual code is part of Boost.Interprocess):

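    #include <sys/types.h>
    #include <sys/stat.h>
    #include <sys/mman.h>
    
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>
    
    int main(void)
    {
            struct stat b;
            void *base;
            int fd = open("foo.bin", O_RDWR);
    
            /* Bail out early if the file cannot be opened or examined */
            if (fd == -1 || fstat(fd, &b) == -1) {
                    perror("foo.bin");
                    return 1;
            }
            /* Private (copy-on-write) writable mapping of the whole file */
            base = mmap(0, b.st_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
            if (base == MAP_FAILED) {
                    perror("mmap");
                    return 1;
            }
            return 0;
    }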
    

    Here is how I run it:

    dd if=/dev/zero of=foo.bin bs=1M seek=5000 count=1
    ./test-mmap
    mmap: Cannot allocate memory
    

    Here is the relevant part of the strace output, showing the failed mmap call:

    open("foo.bin", O_RDWR)                 = 3
    fstat(3, {st_mode=S_IFREG|0644, st_size=5243928576, ...}) = 0
    mmap(NULL, 5243928576, PROT_READ|PROT_WRITE, MAP_PRIVATE, 3, 0) = -1 ENOMEM (Cannot allocate memory)
    dup(2)                                  = 4
    [...]
    write(4, "mmap: Cannot allocate memory\n", 29mmap: Cannot allocate memory
    ) = 29
    
    2 answers  |  last updated 7 years ago
        1
  •  5
  •   Matt Joiner    14 years ago

    Try passing MAP_NORESERVE in the flags field:

    mmap(NULL, b.st_size, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_NORESERVE, fd, 0);
    

    It is likely that your swap and physical memory combined are smaller than the 5GB requested.
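
As a fuller illustration of the suggestion above, here is a minimal sketch of a copy-on-write mapping that asks the kernel not to reserve swap. The helper name `map_cow_noreserve` and its error reporting are assumptions made for this example; they are not part of the original code or of Boost.Interprocess.

```c
#define _DEFAULT_SOURCE         /* make sure glibc exposes MAP_NORESERVE */
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original post): map a whole file
 * copy-on-write without reserving swap for it.  Returns the mapping, or
 * MAP_FAILED after reporting which call failed.  *len receives the
 * mapped length in bytes.
 */
void *map_cow_noreserve(const char *path, size_t *len)
{
        struct stat st;
        void *base;
        int fd = open(path, O_RDWR);

        if (fd == -1) {
                perror("open");
                return MAP_FAILED;
        }
        if (fstat(fd, &st) == -1) {
                perror("fstat");
                close(fd);
                return MAP_FAILED;
        }
        *len = (size_t)st.st_size;
        base = mmap(NULL, *len, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_NORESERVE, fd, 0);
        if (base == MAP_FAILED)
                perror("mmap");
        close(fd);      /* the mapping stays valid after the descriptor is closed */
        return base;
}
```

A caller would check the result against MAP_FAILED and release the mapping with munmap(base, len) when done. Writes through the mapping modify private pages only, so the file on disk is left unchanged.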

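    Alternatively, for testing purposes, you can set the kernel's overcommit mode as root. The value 0 shown here selects the default heuristic mode; per the proc(5) excerpt below, 1 makes the kernel always overcommit without checking.

    # echo 0 > /proc/sys/vm/overcommit_memory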
    

    mmap(2):

       MAP_NORESERVE
              Do  not reserve swap space for this mapping.  When swap space is
              reserved, one has the guarantee that it is  possible  to  modify
              the  mapping.   When  swap  space  is not reserved one might get
              SIGSEGV upon a write if no physical memory  is  available.   See
              also  the  discussion of the file /proc/sys/vm/overcommit_memory
              in proc(5).  In kernels before 2.6, this flag  only  had  effect
              for private writable mappings.
    

    proc(5):

       /proc/sys/vm/overcommit_memory
              This file contains the kernel virtual  memory  accounting  mode.
              Values are:
    
                     0: heuristic overcommit (this is the default)
                     1: always overcommit, never check
                     2: always check, never overcommit
    
              In  mode 0, calls of mmap(2) with MAP_NORESERVE are not checked,
              and the default check is very weak, leading to the risk of
              getting a process "OOM-killed".  Under Linux 2.4 any non-zero value
              implies mode 1.  In mode 2  (available  since  Linux  2.6),  the
              total  virtual  address  space on the system is limited to (SS +
              RAM*(r/100)), where SS is the size of the swap space, and RAM is
              the  size  of  the physical memory, and r is the contents of the
              file /proc/sys/vm/overcommit_ratio.
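
The mode 2 formula quoted above can be worked through with a small sketch. The function name `commit_limit_kb` is an assumption for this example, and the figures used in the note after the code are the ones reported elsewhere in this thread (MemTotal 4063428 kB, SwapTotal 514072 kB, overcommit_ratio 50). The limit applies only when overcommit_memory is 2.

```c
/*
 * Mode 2 limit from the proc(5) excerpt: SS + RAM * (r / 100).
 * Sizes are in kB; ratio is the content of overcommit_ratio.
 * The multiplication is done before the division to avoid
 * truncating r / 100 to zero in integer arithmetic.
 */
unsigned long long commit_limit_kb(unsigned long long swap_kb,
                                   unsigned long long ram_kb,
                                   unsigned int ratio)
{
        return swap_kb + ram_kb * ratio / 100;
}
```

With those figures, commit_limit_kb(514072, 4063428, 50) evaluates to 514072 + 2031714 = 2545786 kB, roughly 2.4 GiB, so strict accounting would be even tighter than RAM plus swap.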
    
        2
  •  2
  •   Community Ian Goodfellow    7 years ago

    MemTotal: 4063428 kB
    SwapTotal: 514072 kB
    $ cat /proc/sys/vm/overcommit_memory
    0
    $ cat /proc/sys/vm/overcommit_ratio 
    50
    

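    With overcommit_memory set to 0 (heuristic overcommit), the kernel still rejects obvious overcommits. A private writable mapping of about 5GB exceeds your RAM plus swap (4063428 kB + 514072 kB = 4577500 kB, roughly 4.4 GiB), so the mmap call fails with ENOMEM.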

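    Your options are either to use MAP_NORESERVE (as Matt Joiner suggests), if you are sure you will never dirty (write to) more pages in the mapping than you have free memory and swap, or to significantly increase the size of your swap space.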
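
The reasoning in this answer can be checked with a rough sketch that compares a requested mapping size against RAM plus swap. The function name `fits_in_ram_plus_swap` is an assumption for this example, and the model is deliberately simplified: the kernel's real heuristic is more involved and works from memory currently available rather than these totals, so treat the result as an optimistic estimate.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Simplified model, not the kernel's actual check: a private writable
 * mapping is assumed acceptable if it is no larger than total RAM plus
 * total swap.  The kB arguments correspond to the MemTotal and
 * SwapTotal lines of /proc/meminfo.
 */
bool fits_in_ram_plus_swap(uint64_t request_bytes,
                           uint64_t mem_total_kb,
                           uint64_t swap_total_kb)
{
        uint64_t available_bytes = (mem_total_kb + swap_total_kb) * 1024;

        return request_bytes <= available_bytes;
}
```

For the figures in this thread, the 5243928576-byte request is larger than 4577500 kB (4687360000 bytes), so the function returns false. A 4001 MiB file (4195352576 bytes) would pass, which is consistent with the report that files under 4GB map successfully.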