
Python: TypeError: expected str, bytes or os.PathLike object, not PdfFileReader

  • f0rd42  · 6 years ago

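    I have the following code. It is only a starting point. Later I want to replace the static "Hello world" text with items from a CSV file, which I will read and loop over, one item at a time. I want the watermark on every page.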

    # importing the required modules
    import PyPDF2
    import io
    from reportlab.pdfgen import canvas
    from reportlab.lib.pagesizes import letter
    
    def add_watermark(wmFile, pageObj):
        # opening watermark pdf file
        wmFileObj = open(wmFile, 'rb')
    
        # creating pdf reader object of watermark pdf file
        pdfReader = PyPDF2.PdfFileReader(wmFileObj)
    
        # merging watermark pdf's first page with passed page object.
        pageObj.mergePage(pdfReader.getPage(0))
    
        # closing the watermark pdf file object
        wmFileObj.close()
    
        # returning watermarked page object
        return pageObj
    
    
    def main():
        import PyPDF2
        import io
        from reportlab.pdfgen import canvas
        from reportlab.lib.pagesizes import letter
        # watermark pdf file name
        packet = io.BytesIO()
        # Create a new PDF with Reportlab
        can = canvas.Canvas(packet, pagesize=letter)
        can.setFont('Helvetica-Bold',18)
        can.drawString(10, 100, "Hello world")
        can.showPage()
        can.save()
    
        # Move to the beginning of the StringIO buffer
        packet.seek(0)
        mywatermark = PyPDF2.PdfFileReader(packet)
    
        # original pdf file name
        origFileName = 'Module1.pdf'
    
        # new pdf file name
        newFileName = 'watermarked_example.pdf'
    
        # creating pdf File object of original pdf
        pdfFileObj = open(origFileName, 'rb')
    
        # creating a pdf Reader object
        pdfReader = PyPDF2.PdfFileReader(pdfFileObj)
    
        # creating a pdf writer object for new pdf
        pdfWriter = PyPDF2.PdfFileWriter()
    
        # adding watermark to each page
        for page in range(pdfReader.numPages):
            # creating watermarked page object
            wmpageObj = add_watermark(mywatermark, pdfReader.getPage(page))
    
            # adding watermarked page object to pdf writer
            pdfWriter.addPage(wmpageObj)
    
        # new pdf file object
        newFile = open(newFileName, 'wb')
    
        # writing watermarked pages to new file
        pdfWriter.write(newFile)
    
        # closing the original pdf file object
        pdfFileObj.close()
        # closing the new pdf file object
        newFile.close()
    
    
    if __name__ == "__main__":
        main()
    

    The error I get is:

    Traceback (most recent call last):
      File "watermark.py", line 101, in <module>
        main()
      File "watermark.py", line 83, in main
        wmpageObj = add_watermark(mywatermark, pdfReader.getPage(page))
      File "watermark.py", line 32, in add_watermark
        wmFileObj = open(wmFile, 'rb')
    TypeError: expected str, bytes or os.PathLike object, not PdfFileReader
    

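    I believe I get the point: it expects a string, bytes, or a file path, whereas I never write the watermark to disk, so it is just an "object".

    I tried a few things, but whatever I try makes things worse :-(

    Can anyone help? I am pretty sure it is just a small thing, as I am good at overlooking the obvious.

    Any help is appreciated.

    Thanks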

    1 answer  |  6 years ago
  •   Scavenger    6 years ago

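    I'll leave the guidance and the nitpicks for the end. Here is how you fix this code:

    1) Set the variable 'packet' to the name of an existing PDF file in the directory where the script lives: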

    packet = 'my_watermark.pdf'
    

    2) Delete the move to the beginning of the "StringIO" buffer (as if we ever needed it), along with the reader built from it:

    packet.seek(0)     # delete this
    mywatermark = PyPDF2.PdfFileReader(packet) #delete this too
    

    3) Pass 'packet' as the argument instead of 'mywatermark' in the for-loop block:

    wmpageObj = add_watermark(packet, pdfReader.getPage(page))
    

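    4) Remove the file opening and closing from the add_watermark function and keep only the construction of the PdfFileReader instance, but build it from the argument 'wmFile':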

    wmFileObj = open(wmFile, 'rb')                # delete this
    pdfReader = PyPDF2.PdfFileReader(wmFile)      # let this be, but change wmFileObj to wmFile
    pageObj.mergePage(pdfReader.getPage(0))       # let this be
    wmFileObj.close()                             # delete this
    return pageObj                                # let this be  
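
    For context on the traceback itself: the built-in open() wants something that names a file (str, bytes, or an os.PathLike object), so handing it an already constructed reader fails before any PDF work starts. The sketch below reproduces that with the standard library only. FakeReader and try_open are hypothetical names introduced for illustration; FakeReader merely stands in for the PdfFileReader instance from the question.

```python
import io
import os
import tempfile


class FakeReader:
    """Hypothetical stand-in for an already constructed PdfFileReader."""


def try_open(target):
    """Return 'ok' if open() accepts target, else the TypeError message."""
    try:
        with open(target, "rb"):
            return "ok"
    except TypeError as exc:
        return str(exc)


# A real path on disk is accepted.
with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as tmp:
    tmp.write(b"%PDF-1.4\n")
path_result = try_open(tmp.name)
os.remove(tmp.name)

# A reader object or an in-memory buffer is not a path, so open() refuses it.
reader_result = try_open(FakeReader())
buffer_result = try_open(io.BytesIO(b"%PDF-1.4\n"))

print(path_result)
print(reader_result)
print(buffer_result)
```

    The second line printed has the same shape as the error in the question, only with the stand-in class name in place of PdfFileReader.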
    

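    Also, your code has some imports inside the main function; move them to the top of the file, and read some documentation. The PyPDF2 documentation shows how to merge pages (which is the module's specialty, tbh), and on the other hand, although it is a bit stiff, the ReportLab User Guide is very comprehensive yet simple. Always try to see the meaning behind the code.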
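
    If the watermark should stay in memory, as with the ReportLab buffer in the question, one possible variation (an assumption, not part of the steps above) is to let add_watermark receive the watermark page itself, so nothing needs to be opened inside the function. The sketch below uses a hypothetical StubPage class so that it runs without PyPDF2; it only mimics the mergePage method that the question's page objects are called with.

```python
class StubPage:
    """Hypothetical stand-in for a PDF page; records what was merged into it."""

    def __init__(self, name):
        self.name = name
        self.merged = []

    def mergePage(self, other):
        # Mimics the method name used in the question's code.
        self.merged.append(other.name)


def add_watermark(wm_page, page_obj):
    """Merge an already loaded watermark page into page_obj and return it."""
    page_obj.mergePage(wm_page)
    return page_obj


# The watermark page is created once and reused for every page.
watermark = StubPage("watermark")
pages = [StubPage("page-%d" % i) for i in range(3)]
stamped = [add_watermark(watermark, page) for page in pages]

for page in stamped:
    print(page.name, page.merged)
```

    In the question's script this would presumably mean passing mywatermark.getPage(0) from main() instead of mywatermark, and leaving the packet.seek(0) and PdfFileReader(packet) lines in place.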