pandas 1.3.3 to_feather raises ArrowMemoryError

  • SomeDude  · Tech Community  · 3 years ago

    I have a dataset of about 270 MB, which I write to a Feather file with:

    df.reset_index().to_feather(feather_path)
    

    This gives me the following error:

      File "C:\apps\Python\lib\site-packages\pandas\util\_decorators.py", line 207, in wrapper
        return func(*args, **kwargs)
      File "C:\apps\Python\lib\site-packages\pandas\core\frame.py", line 2519, in to_feather
        to_feather(self, path, **kwargs)
      File "C:\apps\Python\lib\site-packages\pandas\io\feather_format.py", line 87, in to_feather
        feather.write_feather(df, handles.handle, **kwargs)
      File "C:\apps\Python\lib\site-packages\pyarrow\feather.py", line 152, in write_feather
        table = Table.from_pandas(df, preserve_index=False)
      File "pyarrow\table.pxi", line 1553, in pyarrow.lib.Table.from_pandas
      File "C:\apps\Python\lib\site-packages\pyarrow\pandas_compat.py", line 607, in dataframe_to_arrays
        arrays[i] = maybe_fut.result()
      File "C:\apps\Python\lib\concurrent\futures\_base.py", line 438, in result
        return self.__get_result()
      File "C:\apps\Python\lib\concurrent\futures\_base.py", line 390, in __get_result
        raise self._exception
      File "C:\apps\Python\lib\concurrent\futures\thread.py", line 52, in run
        result = self.fn(*self.args, **self.kwargs)
      File "C:\apps\Python\lib\site-packages\pyarrow\pandas_compat.py", line 575, in convert_column
        result = pa.array(col, type=type_, from_pandas=True, safe=safe)
      File "pyarrow\array.pxi", line 302, in pyarrow.lib.array
      File "pyarrow\array.pxi", line 83, in pyarrow.lib._ndarray_to_array
      File "pyarrow\error.pxi", line 114, in pyarrow.lib.check_status
    pyarrow.lib.ArrowMemoryError: realloc of size 3221225472 failed
    

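    Note: this runs fine in PyCharm, where the Feather file is written without problems. But when the Python program is called from a Windows batch file, like this: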

    call python "myprogram.py"
    

    and that batch file is scheduled as a task in Task Scheduler, it fails with the memory error above.

    The PyArrow version is 5.0.0, in case that helps.
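
    Since the same script behaves differently depending on how it is launched, one thing that may be worth comparing is the interpreter each launch method actually uses. The sketch below is only a diagnostic idea, not a confirmed cause: it assumes the scheduled task might resolve `python` to a different (possibly 32-bit) installation than PyCharm does. The helper names `describe_environment` and `bytes_to_gib` are made up for illustration.

```python
import platform
import struct
import sys
from importlib import metadata


def describe_environment(packages=("pandas", "pyarrow")):
    """Collect interpreter details worth comparing between launch methods."""
    info = {
        "executable": sys.executable,
        "python_version": platform.python_version(),
        # Pointer size in bits: 32 on a 32-bit interpreter, 64 on a 64-bit one.
        "pointer_bits": struct.calcsize("P") * 8,
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info[name] = "not installed"
    return info


def bytes_to_gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / 2**30


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
    # Allocation size taken from the ArrowMemoryError message above.
    print(f"failed realloc: {bytes_to_gib(3221225472):.1f} GiB")
```

    Running this once from PyCharm and once from the scheduled batch file (redirecting the output to a log file) would show whether both use the same executable, bitness, and package versions. The failed allocation works out to 3 GiB, far more than the roughly 270 MB dataset.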

    Any ideas?
