
Google Cloud Vision API: how to read text and structure it

  • Amjad Hussain Syed  · 6 years ago

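    I am using the Google Cloud Vision API with Python to scan documents and read the text from them. Each document is an invoice containing customer details and a table. The document-to-text conversion itself works perfectly. However, the extracted text is not ordered, and I can't find a way to sort it, which I need because I only want to extract a few values. On top of that, the values I want sometimes sit in different positions, which makes them hard to extract.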

    https://cloud.google.com/vision/docs/fulltext-annotations

    Here is my Python code:

    import io
    import os
    from google.cloud import vision
    from google.cloud.vision import types  # present in google-cloud-vision releases before 2.0
    import glob
    
    
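    def scan_img(image_path):
        # Read the raw image bytes; use a distinct name for the file handle
        # so it does not shadow the path argument.
        with io.open(image_path, 'rb') as image_file:
            content = image_file.read()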
    
        image = types.Image(content=content)
    
        response = client.document_text_detection(image=image)
        document = response.full_text_annotation
        img_out_array = document.text.split("\n")
        invoice_no_raw = ""
        invoice_date_raw = ""
        net_total_idx = ""
        customer_name_index = ""
    
        for index, line in enumerate(img_out_array):
            if "Invoice No" in line:
                invoice_no_raw = line
            if "Customer Name" in line:
                index += 6
                customer_name_index = index
            if "Date :" in line:
                invoice_date_raw = line
            if "Our Bank details" in line:
                index -= 1
                net_total_idx = index
    
        net_total_sales_raw = img_out_array[net_total_idx]
        customer_name_raw = img_out_array[customer_name_index]
        print("Raw data:: ", invoice_no_raw, invoice_date_raw, customer_name_raw, img_out_array[net_total_idx])
    
        invoice_no = invoice_no_raw.split(":")[1]
        invoice_date = invoice_date_raw.split(":")[1]
        customer_name = customer_name_raw.replace("..", "")
        net_total_sales = net_total_sales_raw.split(" ")[-1]
    
        return [invoice_no, invoice_date, customer_name, net_total_sales]
    
    
    if __name__ == '__main__':
        os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "path/to/imgtotext.json"
        client = vision.ImageAnnotatorClient()
        images = glob.glob("/path/Documents/invoices/*.jpg")
        for image in images:
            print("scanning the image:::::" + image)
            invoice_no, invoice_date, customer_name, net_total_sales = scan_img(image)
            print("Formatted data:: ", invoice_no, invoice_date,
                  customer_name, net_total_sales)
    

    Output for document 1:

    Customer Name
    Address
    **x customer**
    area name
    streetname
    Customer LPO
    

    Output for document 2:

    Customer LPO
    **y customer**
    area name
    streetname
    LPO Date
    Payment Terms
    Customer Name
    Address
    Delivery Location
    

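    Please advise. I want to read the X and Y customer values, but their position changes from one document to another, and I have several such documents. How can I structure and read the data?

    There are several other fields that I am able to read successfully.

    Thanks in advance.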

    1 answer  |  6 years ago
  •   Armin_SC    6 years ago

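    The Cloud Vision API has no request attribute for specifying a format in which the file's data is read or sorted. Instead, I think the available workaround is to use the BoundingPoly and Vertex response attributes, which give the coordinates of each word found in the image, and to process that vertex data in your own code to decide which text should be grouped into columns and rows. You can take a look at this link, which includes some sample responses containing these attributes.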
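    To make the workaround more concrete, here is a rough sketch of grouping words into rows by their vertex coordinates. It is not an official recipe: the helper names `words_with_boxes` and `group_into_rows`, the `tolerance` value, and the sample coordinates are all assumptions of mine for illustration. The only thing taken from the API is the response layout (pages, blocks, paragraphs, words, symbols, and `bounding_box.vertices` with `x` and `y`).

```python
def words_with_boxes(document):
    """Yield (text, x, y, height) for each word in a full_text_annotation.

    Walks pages -> blocks -> paragraphs -> words. x and y are the smallest
    vertex coordinates; height is the vertical extent of the bounding box.
    """
    for page in document.pages:
        for block in page.blocks:
            for paragraph in block.paragraphs:
                for word in paragraph.words:
                    text = "".join(symbol.text for symbol in word.symbols)
                    xs = [v.x for v in word.bounding_box.vertices]
                    ys = [v.y for v in word.bounding_box.vertices]
                    yield text, min(xs), min(ys), max(ys) - min(ys)


def group_into_rows(words, tolerance=0.5):
    """Group words into rows by vertical centre, then order each row by x.

    A word joins the current row when its centre is within
    tolerance * height of the centre of the word that started the row.
    The tolerance is a guess and will likely need tuning per layout.
    """
    rows = []
    for text, x, y, height in sorted(words, key=lambda w: w[2] + w[3] / 2):
        centre = y + height / 2
        if rows and abs(centre - rows[-1]["centre"]) <= tolerance * max(height, 1):
            rows[-1]["words"].append((x, text))
        else:
            rows.append({"centre": centre, "words": [(x, text)]})
    # Left-to-right order inside each row, rows already run top to bottom.
    return [" ".join(t for _, t in sorted(row["words"])) for row in rows]


# Hypothetical coordinates standing in for a real response.
sample = [
    ("Customer", 10, 100, 20),
    ("Name", 90, 102, 20),
    ("x", 300, 101, 20),
    ("customer", 320, 99, 20),
    ("Address", 10, 140, 20),
]
rows = group_into_rows(sample)
print(rows)  # ['Customer Name x customer', 'Address']
```

    With a real response you would call `group_into_rows(words_with_boxes(response.full_text_annotation))`. Once a label such as "Customer Name" and its value land in the same row, the value can be read from the text after the label instead of from a fixed line offset. Skewed scans or multi-column layouts would probably need extra handling, for example splitting each row on large horizontal gaps.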

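    If this does not cover your current needs, you can use the Send Feedback button on the service public documentation, or check the Issue Tracker tool to raise a Vision API feature request and let Google know about the functionality you need.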