
How do I simulate ZipFile.open in Python 2.5?

python-2.5 zip python

Day · 14 years ago

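I want to extract a single file from a zip archive to a specific path, ignoring the file's path inside the archive. This is very easy in Python 2.6 (my docstring is longer than the code):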

import shutil
import zipfile

def extract_from_zip(name, dest_path, zip_file):
    """Similar to zipfile.ZipFile.extract but extracts the file given by name
    from the zip_file (instance of zipfile.ZipFile) to the given dest_path
    *ignoring* the filename path given in the archive completely
    instead of preserving it as extract does.
    """
    dest_file = open(dest_path, 'wb')
    archived_file = zip_file.open(name)
    shutil.copyfileobj(archived_file, dest_file)
    dest_file.close()


extract_from_zip('path/to/file.dat', 'output.txt', zipfile.ZipFile('test.zip', 'r'))

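In Python 2.5, however, the ZipFile.open method is not available. I couldn't find a solution on Stack Overflow, but this forum post has a nice one that uses ZipInfo.file_offset to locate the data and zlib.decompressobj to decompress the bytes from there. Unfortunately, ZipInfo.file_offset was removed in Python 2.5!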

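Given that ZipInfo.header_offset is still there, I figured I would just have to parse and skip over the header structure to find the file offset myself. Using Wikipedia as a reference (I know), I came up with this much longer and rather inelegant solution.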

import struct
import zipfile
import zlib

def extract_from_zip(name, dest_path, zip_file):
    """Python 2.5 version :("""
    dest_file = open(dest_path, 'wb')
    info = zip_file.getinfo(name)
    if info.compress_type == zipfile.ZIP_STORED:
        decoder = None
    elif info.compress_type == zipfile.ZIP_DEFLATED:
        decoder = zlib.decompressobj(-zlib.MAX_WBITS)
    else:
        raise zipfile.BadZipfile("Unrecognized compression method")

    # Seek over the fixed size fields to the "file name length" field in
    # the file header (26 bytes). Unpack this and the "extra field length"
    # field ourselves as info.extra doesn't seem to be the correct length.
    zip_file.fp.seek(info.header_offset + 26)
    file_name_len, extra_len = struct.unpack("<HH", zip_file.fp.read(4))
    zip_file.fp.seek(info.header_offset + 30 + file_name_len + extra_len)

    bytes_to_read = info.compress_size

    while True:
        buff = zip_file.fp.read(min(bytes_to_read, 102400))
        if not buff:
            break
        bytes_to_read -= len(buff)
        if decoder:
            buff = decoder.decompress(buff)
        dest_file.write(buff)

    if decoder:
        # Feed a dummy pad byte so zlib does not choke, then flush the rest
        dest_file.write(decoder.decompress('Z'))
        dest_file.write(decoder.flush())
    dest_file.close()

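Note how I read and unpack the "extra field length" field myself, because calling len on the ZipInfo.extra attribute gives a value that is 4 bytes too short, which makes the offset calculation come out wrong. Maybe I'm missing something?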
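Editor's note: one likely explanation for the 4-byte difference (an assumption, not something confirmed in this thread) is that ZipInfo.extra is filled in from the central directory record, and the ZIP format lets the extra field in the local file header have a different length. Reading the lengths from the local header, as the code above does, sidesteps that. Below is a minimal sketch of just that step; the helper name local_data_offset and the constants are hypothetical, and the signature is built with struct.pack so the snippet does not need bytes literals.

```python
import struct

# "PK\x03\x04", the local file header signature, stored little-endian
LOCAL_HEADER_SIGNATURE = struct.pack("<I", 0x04034B50)
LOCAL_HEADER_SIZE = 30  # size of the fixed part of a local file header


def local_data_offset(fp, header_offset):
    """Return the offset where a member's (compressed) data begins.

    fp is the seekable file object of the archive and header_offset is
    ZipInfo.header_offset for the member.
    """
    fp.seek(header_offset)
    header = fp.read(LOCAL_HEADER_SIZE)
    if len(header) != LOCAL_HEADER_SIZE or header[:4] != LOCAL_HEADER_SIGNATURE:
        raise ValueError("no local file header at offset %d" % header_offset)
    # Bytes 26..29 hold the file name length and the extra field length
    name_len, extra_len = struct.unpack("<HH", header[26:30])
    return header_offset + LOCAL_HEADER_SIZE + name_len + extra_len
```

Checking the signature first means a wrong header_offset is reported as an error instead of silently producing a bogus data offset.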

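Can anyone improve on this solution for Python 2.5?

Note that the obvious approach,

dest_file.write(zip_file.read(name))

fails with a MemoryError for large files, because it reads the whole archive member into memory at once.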

3 replies | last activity 13 years ago

Chris Adams · 14 years ago

I haven't tested this, but I use something very similar in Python 2.4:

import zipfile

def extract_from_zip(name, dest_path, zip_file):
    dest_file = open(dest_path, 'wb')
    dest_file.write(zip_file.read(name))
    dest_file.close()

extract_from_zip('path/to/file/in/archive.dat', 
        'output.txt', 
        zipfile.ZipFile('test.zip', 'r'))

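Martijn Pieters · 12 years ago

I know I'm a bit late to the party on this question, but I ran into exactly the same problem.

The solution I used was to put a copy of the Python 2.6.6 zipfile module at:

python_fix/zipfile.py

Then in the code: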

import python_fix.zipfile as zipfile

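After that I was able to use the 2.6.6 version of zipfile with the Python 2.5.1 interpreter (the 2.7.x versions of the module fail on "with" under this interpreter version).

Hope this helps anyone else stuck with ancient technology.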
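Editor's note: a small variation on this, sketched under the assumption that the copied module lives in an importable package named python_fix (on Python 2.5 that directory needs an `__init__.py`), is to fall back to the copy only when the standard module lacks ZipFile.open. The same code then keeps working unchanged after an interpreter upgrade.

```python
import zipfile

# Use the vendored copy only if the interpreter's own zipfile is too old
# to provide ZipFile.open (python_fix is the folder name from this answer).
if not hasattr(zipfile.ZipFile, "open"):
    import python_fix.zipfile as zipfile
```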

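Community Neeleshkumar S · 7 years ago

Given my constraints, it looks like the answer was already in my question: parse the ZipFile structure yourself and use zlib.decompressobj to decompress the bytes once you have found them.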
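Editor's note: the decompression half of that approach can be isolated into a small helper. This is only a sketch: the name inflate_stream and its parameters are hypothetical, it assumes the source is already positioned at the start of a raw deflate stream, and it relies on flush() alone rather than the dummy pad byte used in the question's code.

```python
import zlib


def inflate_stream(src, dest, compressed_size, chunk_size=64 * 1024):
    """Decompress compressed_size bytes of raw deflate data from src into dest.

    Only one chunk is held in memory at a time, so large members do not
    trigger a MemoryError.
    """
    # Negative wbits selects a raw deflate stream without a zlib header,
    # which is how deflated zip members are stored
    decoder = zlib.decompressobj(-zlib.MAX_WBITS)
    remaining = compressed_size
    while remaining > 0:
        chunk = src.read(min(remaining, chunk_size))
        if not chunk:
            break  # source ended early; stop reading
        remaining -= len(chunk)
        dest.write(decoder.decompress(chunk))
    dest.write(decoder.flush())
```

Passing the compressed size explicitly keeps the helper from reading past the end of the member into the next header in the archive.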

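If you don't have my constraints, you can find better answers here:

If you can, just upgrade from Python 2.5 to 2.6 (or later!), as Daynyth suggested in the comments.
If the zip only contains small files that can be loaded 100% into memory, use ChrisAdams' answer.
If you can introduce a dependency on an external utility, make an appropriate system call to /usr/bin/unzip or similar, as in Vlad's answer.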
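Editor's note: for the last option, here is a rough sketch of what shelling out could look like. It is not Vlad's actual code (his answer is not reproduced here); the function name extract_with_unzip and its parameters are hypothetical. It uses unzip's -p option, which writes the member's bytes to standard output, so the path stored in the archive is ignored.

```python
import subprocess


def extract_with_unzip(name, dest_path, zip_path, unzip="/usr/bin/unzip"):
    """Stream one archive member to dest_path using the external unzip tool."""
    dest_file = open(dest_path, "wb")
    try:
        # unzip writes straight into dest_file, so nothing is buffered in Python
        status = subprocess.call([unzip, "-p", zip_path, name], stdout=dest_file)
    finally:
        dest_file.close()
    if status != 0:
        raise RuntimeError("unzip exited with status %d" % status)
```

Unlike the functions above, this one takes the path to the zip file rather than an open ZipFile instance. Also keep in mind that unzip treats member names as wildcard patterns, so names containing characters such as * or [ may need escaping.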