
TypeError: write() argument must be str, not bytes while saving .npy file

  •  8
  • Atul Balaji  · Tech Community  · 6 years ago

    I am trying to run the code from a Keras blog post.

    The code writes to a .npy file as follows:

    bottleneck_features_train = model.predict_generator(generator, nb_train_samples // batch_size)
    np.save(open('bottleneck_features_train.npy', 'w'),bottleneck_features_train)
    

    It then reads from this file:

    def train_top_model():
        train_data = np.load(open('bottleneck_features_train.npy'))
    

    Now I get an error saying:

    Found 2000 images belonging to 2 classes.
    Traceback (most recent call last):
      File "kerasbottleneck.py", line 103, in <module>
        save_bottlebeck_features()
      File "kerasbottleneck.py", line 69, in save_bottlebeck_features
        np.save(open('bottleneck_features_train.npy', 'w'),bottleneck_features_train)
      File "/opt/anaconda3/lib/python3.6/site-packages/numpy/lib/npyio.py", line 511, in save
        pickle_kwargs=pickle_kwargs)
      File "/opt/anaconda3/lib/python3.6/site-packages/numpy/lib/format.py", line 565, in write_array
    version)
      File "/opt/anaconda3/lib/python3.6/site-packages/numpy/lib/format.py", line 335, in _write_array_header
    fp.write(header_prefix)
    TypeError: write() argument must be str, not bytes
    

    After that, I tried changing the file mode from 'w' to 'wb'. Writing then succeeded, but reading the file back failed:

    Found 2000 images belonging to 2 classes.
    Found 800 images belonging to 2 classes.
    Traceback (most recent call last):
      File "kerasbottleneck.py", line 104, in <module>
        train_top_model()
      File "kerasbottleneck.py", line 82, in train_top_model
        train_data = np.load(open('bottleneck_features_train.npy'))
      File "/opt/anaconda3/lib/python3.6/site-packages/numpy/lib/npyio.py", line 404, in load
    magic = fid.read(N)
      File "/opt/anaconda3/lib/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0x93 in position 0: invalid start byte
    

    How do I fix this error?

    1 answer
  •  15
  •   Martijn Pieters    6 years ago

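    The code in the blog post targets Python 2, where writing to and reading from a file works with bytestrings. In Python 3, you need to open the file in binary mode, both for writing and for reading it back: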

    np.save(
        open('bottleneck_features_train.npy', 'wb'),
        bottleneck_features_train)
    

    And when reading:

    train_data = np.load(open('bottleneck_features_train.npy', 'rb'))
    

    Note the 'b' character in the mode arguments.
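    To see why the mode matters, the two errors from the question can be reproduced without NumPy or Keras. The sketch below is only an illustration: it writes the first bytes of the .npy format (the magic string that starts with byte 0x93) to a temporary file, and the function and file names are made up for the example.

```python
import os
import tempfile

# First bytes of a .npy file (the format's magic string).
NPY_MAGIC = b"\x93NUMPY"


def demo_text_mode_failures():
    """Return the exception names raised by text-mode write and read."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.npy")  # hypothetical file name

        # Text mode ('w') only accepts str, so writing bytes fails.
        try:
            with open(path, "w") as f:
                f.write(NPY_MAGIC)
        except TypeError as exc:
            results["write"] = type(exc).__name__

        # Binary mode ('wb') accepts the bytes unchanged.
        with open(path, "wb") as f:
            f.write(NPY_MAGIC)

        # Text mode tries to decode the bytes; 0x93 cannot start a UTF-8 sequence.
        try:
            with open(path, encoding="utf-8") as f:
                f.read()
        except UnicodeDecodeError as exc:
            results["read"] = type(exc).__name__

        # Binary mode ('rb') returns the raw bytes.
        with open(path, "rb") as f:
            results["raw"] = f.read()
    return results


print(demo_text_mode_failures())
```

    The first failure corresponds to the TypeError raised inside np.save, the second to the UnicodeDecodeError raised inside np.load.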

    I'd use the file as a context manager, to ensure it is cleanly closed:

    with open('bottleneck_features_train.npy', 'wb') as features_train_file:
        np.save(features_train_file, bottleneck_features_train)
    

    with open('bottleneck_features_train.npy', 'rb') as features_train_file:
        train_data = np.load(features_train_file)
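
    As a quick check of the write/read pair, here is a minimal round-trip sketch. It assumes NumPy is installed and uses a small stand-in array plus a temporary file; the round_trip helper and the file name are hypothetical, and the real features would come from model.predict_generator(...).

```python
import os
import tempfile

import numpy as np


def round_trip(array):
    """Save an array to a .npy file in binary mode and load it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "features.npy")  # hypothetical file name
        with open(path, "wb") as f:  # binary write
            np.save(f, array)
        with open(path, "rb") as f:  # binary read
            return np.load(f)


# Stand-in data, not the actual bottleneck features.
features = np.arange(12, dtype=np.float32).reshape(3, 4)
restored = round_trip(features)
print(np.array_equal(features, restored))
```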
    

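    The code in the blog post should use both of these changes anyway. In Python 2, files opened without the b flag in the mode are text files with platform-specific newline conventions, and on Windows certain characters in the stream have a special meaning (including making the file appear shorter than it really is if an EOF character appears). With binary data, that can be a real problem.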
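
    The newline issue can be illustrated in Python 3 as well. This is only an analogue of the Python 2 behaviour described above: passing newline="\r\n" forces Windows-style translation so the effect shows up on any platform, and the written_bytes helper is a hypothetical name.

```python
import os
import tempfile


def written_bytes(mode, data, **open_kwargs):
    """Write data with the given mode and return the bytes that reached disk."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.bin")  # hypothetical file name
        with open(path, mode, **open_kwargs) as f:
            f.write(data)
        with open(path, "rb") as f:
            return f.read()


# Text mode with Windows-style newline translation (forced for the demo).
text_result = written_bytes("w", "a\nb", newline="\r\n")

# Binary mode writes the payload byte for byte.
binary_result = written_bytes("wb", b"a\nb")

print(text_result)    # b'a\r\nb'
print(binary_result)  # b'a\nb'
```

    Text mode inserted an extra byte that was never in the payload, which is exactly the kind of silent change that corrupts array data.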