
Is it legal to re-use memory from a fundamental type array for a different (yet still fundamental) type array?

strict-aliasing language-lawyer casting c++

Serge Ballesta · 6 years ago

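This is a follow-up to another question about memory re-use. Because the original question was about a specific implementation, the answers related to that specific implementation.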

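So I wonder whether, in a conforming implementation, it is legal to re-use the memory of an array of a fundamental type for an array of a different type, provided that: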

both types are fundamental types and hence have a trivial dtor and default ctor
both types have the same size and alignment requirement
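
As a side note, the two preconditions above can be checked at compile time. This is only a minimal sketch, assuming C++17. The name same_layout_fundamentals is hypothetical and not part of the question, and it tests std::is_arithmetic rather than std::is_fundamental so that void (which has no size) is excluded.

```cpp
#include <type_traits>

// Hypothetical helper (not from the question): true when T and U are both
// arithmetic types with identical size and alignment requirements.
template <class T, class U>
constexpr bool same_layout_fundamentals =
    std::is_arithmetic_v<T> && std::is_arithmetic_v<U> &&
    sizeof(T) == sizeof(U) && alignof(T) == alignof(U);

// Signed and unsigned variants of the same integer type share size and alignment.
static_assert(same_layout_fundamentals<int, unsigned int>,
              "int and unsigned int must have the same layout");
```

Passing this check only encodes the premises of the question. It says nothing about whether the re-use itself is legal.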

I ended up with the following example code:

#include <iostream>
#include <new>

constexpr int Size = 10;

void *allocate_buffer() {
    void * buffer = operator new(Size * sizeof(int), std::align_val_t{alignof(int)});
    int *in = reinterpret_cast<int *>(buffer); // Defined behaviour because alignment is ok
    for (int i=0; i<Size; i++) in[i] = i;  // Defined behaviour because int is a fundamental type:
                                           // lifetime starts when is receives a value
    return buffer;
}
int main() {
    void *buffer = allocate_buffer();        // Ok, defined behaviour
    int *in = static_cast<int *>(buffer);    // Defined behaviour since the underlying type is int *
    for(int i=0; i<Size; i++) {
        std::cout << in[i] << " ";
    }
    std::cout << std::endl;
    static_assert(sizeof(int) == sizeof(float), "Non matching type sizes");
    static_assert(alignof(int) == alignof(float), "Non matching alignments");
    float *out = static_cast<float *>(buffer); //  (question here) Declares a dynamic float array starting at buffer
    // std::cout << out[0];      // UB! object at &out[0] is an int and not a float
    for(int i=0; i<Size; i++) {
        out[i] = static_cast<float>(in[i]) / 2;  // Defined behaviour, after execution buffer will contain floats
                                                 // because float is a fundamental type and memory is re-used.
    }
    // std::cout << in[0];       // UB! lifetime has ended because memory has been reused
    for(int i=0; i<Size; i++) {
        std::cout << out[i] << " ";         // Defined behaviour since the actual object type is float *
    }
    std::cout << std::endl;
    return 0;
}

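I have added comments explaining why I think this code should have defined behaviour. As far as I can tell everything is fine and standard conformant, but I could not find out whether the line marked (question here) is valid or not.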

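The float objects do re-use the memory of the int objects, so the lifetime of the ints ends when the lifetime of the floats starts, and the strict-aliasing rule should not be a problem. The array was dynamically allocated, so the objects (ints and floats) are in fact all created in a void type array returned by operator new. So I think everything should be ok.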

So the question is: does the above code invoke UB and, if so, where and why?

Disclaimer: I would advise against using this code in a portable code base. This really is a language-lawyer question.

3 answers | most recent 6 years ago

Passer By · 6 years ago

int *in = reinterpret_cast<int *>(buffer); // Defined behaviour because alignment is ok

Correct. But probably not in the sense you expect. [expr.static.cast]

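A prvalue of type "pointer to cv1 void" can be converted to a prvalue of type "pointer to cv2 T", where T is an object type and cv2 is the same cv-qualification as, or greater cv-qualification than, cv1. If the original pointer value represents the address A of a byte in memory and A does not satisfy the alignment requirement of T, then the resulting pointer value is unspecified. Otherwise, if the original pointer value points to an object a, and there is an object b of type T (ignoring cv-qualification) that is pointer-interconvertible with a, the result is a pointer to b. Otherwise, the pointer value is unchanged by the conversion.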

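There is no int, nor any pointer-interconvertible object, at buffer, so the pointer value is unchanged. in is a pointer of type int* that points to a region of raw memory.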

for (int i=0; i<Size; i++) in[i] = i;  // Defined behaviour because int is a fundamental type:
                                       // lifetime starts when is receives a value

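Incorrect. [intro.object]

An object is created by a definition, a new-expression, when implicitly changing the active member of a union, or when a temporary object is created.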

Notably absent is assignment. No int is created. In fact, by elimination, in is an invalid pointer, and dereferencing it is UB.

The later float* accesses are UB for the same reason.

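Even in the absence of all the UB above, that is, after properly using new (pointer) Type{i}; to create the objects, no array object exists. Pointer arithmetic, and with it subscripting, is only defined within an array. [expr.add]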

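When an expression that has integral type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the expression P points to element x[i] of an array object x with n elements, the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) element x[i+j] if 0 ≤ i+j ≤ n; otherwise, the behavior is undefined. Likewise, the expression P - J points to the (possibly-hypothetical) element x[i−j] if 0 ≤ i−j ≤ n; otherwise, the behavior is undefined.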

eerorika · 6 years ago

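Passer By's answer covers why the example program has undefined behaviour. I'll try to answer how to re-use the storage with as little UB as possible. Re-using storage for arrays is technically impossible in standard C++ given the current wording of the standard, so to achieve re-use the programmer has to rely on the implementation to "do the right thing".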

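Converting a pointer does not automatically bring objects into existence. You have to construct the float objects first. Constructing them starts their lifetime and ends the lifetime of the int objects whose storage they re-use: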

for(int i=0; i<Size; i++)
    new(in + i) float;

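You can use the pointer returned by placement new (which is discarded in my example) to access the newly constructed float objects directly, or you can std::launder the buffer pointer: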

float *out = std::launder(reinterpret_cast<float*>(buffer));

It is, however, much more typical to re-use storage of type unsigned char (or std::byte) rather than the storage of int objects.
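
To make the two options above concrete, here is a minimal sketch that re-uses unsigned char storage, assuming C++17. The names Count, storage, halve_in_place and launder_single are hypothetical. It also assumes that the array form of placement new adds no bookkeeping overhead (which is what mainstream compilers do), and the assert documents that assumption. The int values are copied out before the float array is created, because a default-initialized array of floats holds indeterminate values.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <new>

constexpr std::size_t Count = 4;  // hypothetical element count

static_assert(sizeof(int) == sizeof(float), "Non matching type sizes");
static_assert(alignof(int) == alignof(float), "Non matching alignments");

// Raw byte storage, suitably aligned for both element types.
alignas(int) alignas(float) unsigned char storage[Count * sizeof(float)];

// Option 1: keep and use the pointer returned by placement new.
std::array<float, Count> halve_in_place() {
    // Create an int array in the storage.
    int* in = new (storage) int[Count];
    for (std::size_t i = 0; i < Count; ++i) in[i] = static_cast<int>(i);

    // Copy the values out while the ints are still alive.
    std::array<int, Count> saved{};
    for (std::size_t i = 0; i < Count; ++i) saved[i] = in[i];

    // Re-use the storage: this ends the lifetime of the ints.
    float* out = new (storage) float[Count];
    // Assumption of this sketch: no array overhead, so the array starts at storage.
    assert(static_cast<void*>(out) == static_cast<void*>(storage));
    for (std::size_t i = 0; i < Count; ++i) out[i] = static_cast<float>(saved[i]) / 2;

    std::array<float, Count> result{};
    for (std::size_t i = 0; i < Count; ++i) result[i] = out[i];
    return result;
}

// Option 2: discard the returned pointer and launder the storage pointer.
float launder_single() {
    alignas(float) unsigned char one[sizeof(float)];
    new (one) float{2.5f};  // returned pointer deliberately discarded
    return *std::launder(reinterpret_cast<float*>(one));
}
```

Option 1 avoids std::launder entirely because every access goes through a pointer that placement new produced. Option 2 is shown for a single object only, which sidesteps the pointer arithmetic concern raised in Passer By's answer.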

Yuki · 6 years ago

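I just dropped in because I feel there is at least one question left unanswered and not said out loud; my apologies if that is not the case. I think the others answered the main issue of this question very well, namely where and why it is undefined behaviour, and user2079303 gave some idea of how to fix it. I will try to answer the question of how to fix the code and why the fix is valid. Before reading my post, please read the answers and comment discussions of Passer By and user2079303.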

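An object is created by a definition (6.1), a new-expression (8.3.4), when implicitly changing the active member of a union (12.3), or when a temporary object is created (7.4, 15.2).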

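The definition of the object concept is a bit complicated, but it makes sense. The problem is addressed more specifically in the proposal Implicit creation of objects for low-level object manipulation, which aims to simplify the object model. Until then, we have to create an object explicitly by one of the means listed above. The one that works in this case is the placement new-expression: placement new is a non-allocating new-expression that creates an object. For this particular case, it helps us materialize the missing array objects and the float objects. The code below shows what I came up with, including some comments and the assembly instructions associated with the lines (clang++ -g -O0 was used).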

#include <iostream>
#include <new>

constexpr int Size = 10;

void* allocate_buffer() {

  // No alignment argument is needed for `operator new` if your object does not require
  // alignment greater than alignof(std::max_align_t), which is the case here
  void* buffer = operator new(Size * sizeof(int));
  // 400fdf:    e8 8c fd ff ff          callq  400d70 <operator new(unsigned long)@plt>
  // 400fe4:    48 89 45 f8             mov    %rax,-0x8(%rbp)


  // (was missing) Create array of integers, default-initialized, no
  // initialization for array of integers
  new (buffer) int[Size];
  int* in = reinterpret_cast<int*>(buffer);
  // These two lines result in a basic pointer value copy
  // 400fe8:    48 8b 45 f8             mov    -0x8(%rbp),%rax
  // 400fec:    48 89 45 f0             mov    %rax,-0x10(%rbp)


  for (int i = 0; i < Size; i++)
    in[i] = i;
  return buffer;
}

int main() {

  void* buffer = allocate_buffer();
  // 401047:    48 89 45 d0             mov    %rax,-0x30(%rbp)


  // static_cast equivalent in this case to reinterpret_cast
  int* in = static_cast<int*>(buffer);
  // Static cast results in a pointer value copy
  // 40104b:    48 8b 45 d0             mov    -0x30(%rbp),%rax
  // 40104f:    48 89 45 c8             mov    %rax,-0x38(%rbp)


  for (int i = 0; i < Size; i++) {
    std::cout << in[i] << " ";
  }
  std::cout << std::endl;
  static_assert(sizeof(int) == sizeof(float), "Non matching type sizes");
  static_assert(alignof(int) == alignof(float), "Non matching alignments");
  for (int i = 0; i < Size; i++) {
    int t = in[i];


    // (was missing) Create float with a direct initialization
    // Technically that is reuse of the storage of the array, hence that array does
    // not exist anymore.
    new (in + i) float{t / 2.f};
    // No new is called
    // 4010e4:  48 8b 45 c8             mov    -0x38(%rbp),%rax
    // 4010e8:  48 63 4d c0             movslq -0x40(%rbp),%rcx
    // 4010ec:  f3 0f 2a 4d bc          cvtsi2ssl -0x44(%rbp),%xmm1
    // 4010f1:  f3 0f 5e c8             divss  %xmm0,%xmm1
    // 4010f5:  f3 0f 11 0c 88          movss  %xmm1,(%rax,%rcx,4)


    // (was missing) Create int array on the same storage, default-initialized, no
    // initialization for an array of integers
    new (buffer) int[Size];
    // No code for new is generated
  }


    // (was missing) Create float array, default-initialized, no initialization for an array
    // of floats
  new (buffer) float[Size];
  float* out = reinterpret_cast<float*>(buffer);
  // These two lines result in a simple pointer value copy
  // 401108:    48 8b 45 d0             mov    -0x30(%rbp),%rax
  // 40110c:    48 89 45 b0             mov    %rax,-0x50(%rbp)


  for (int i = 0; i < Size; i++) {
    std::cout << out[i] << " ";
  }
  std::cout << std::endl;
  operator delete(buffer);
  return 0;
}

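With clang++ at -O0, no machine code is generated for the placement new expressions; the only operator new call is the one in allocate_buffer. With GCC at -O0, operator new is called for placement new, but at -O1 it is omitted too.

Let's forget about the formal wording of the standard for a minute and think in practical terms. Why would we need to actually call functions that do nothing? Nothing prevents the code from working without them, right? After all, C++ is a language in which the program itself has full control of memory, rather than some runtime library or virtual machine. So one reason I can think of is that the standard once again gives compilers more freedom to optimize, at the cost of constraining the program to a few extra actions. The idea may be that a compiler can do any reordering or omission magic on the machine code while treating only definitions, new-expressions, unions and temporary objects as the providers of new objects that guide its optimization algorithms. Most likely there are, in reality, no optimizations that would break your code if you allocated memory and did not use a new-expression on it for trivial types. An interesting fact is that the non-allocating forms of operator new are reserved and may not be replaced; maybe they are meant to be exactly that, the simplest way of telling the compiler about a new object.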
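
The reason a compiler can emit nothing for these expressions is that the standard library's non-allocating operator new(std::size_t, void*) just returns its pointer argument and does nothing else. A minimal sketch of that observation follows; the helper name placement_new_returns_same_address is hypothetical and only serves as illustration.

```cpp
#include <cassert>
#include <new>

// Hypothetical helper: constructs a float in caller-provided storage and
// reports whether placement new handed back the very same address.
// The storage must be suitably sized and aligned for a float.
bool placement_new_returns_same_address(void* storage) {
    float* f = new (storage) float{1.5f};  // starts the lifetime of a float
    assert(*f == 1.5f);                    // the object is usable through f
    return static_cast<void*>(f) == storage;
}
```

Called with, for example, an alignas(float) unsigned char buffer of sizeof(float) bytes, the function returns true. In terms of machine code the new-expression is just the store of 1.5f, but in terms of the object model it is what creates the float.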