
Google Cloud Vision API: how to read text and structure it

google-cloud-vision ocr python-2.7


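Amjad Hussain Syed · 6 years ago

I am using the Google Cloud Vision API with Python to scan documents and read the text from them. The documents are invoices that contain customer details and tables. The document-to-text conversion works perfectly. However, the resulting text is not ordered, and I cannot find a way to sort it, which I need because I want to extract a few specific values. The values I want are also sometimes in different positions, which makes them hard to extract.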

https://cloud.google.com/vision/docs/fulltext-annotations

Here is my Python code:

import io
import os
from google.cloud import vision
from google.cloud.vision import types
import glob


def scan_img(image_path):
    # Read the raw image bytes; `client` is the module-level ImageAnnotatorClient.
    with io.open(image_path, 'rb') as image_file:
        content = image_file.read()

    image = types.Image(content=content)

    response = client.document_text_detection(image=image)
    document = response.full_text_annotation
    img_out_array = document.text.split("\n")
    invoice_no_raw = ""
    invoice_date_raw = ""
    net_total_idx = ""
    customer_name_index = ""

    for index, line in enumerate(img_out_array):
        if "Invoice No" in line:
            invoice_no_raw = line
        if "Customer Name" in line:
            # Fixed offset from the label; breaks when the layout changes.
            customer_name_index = index + 6
        if "Date :" in line:
            invoice_date_raw = line
        if "Our Bank details" in line:
            # The net total is taken from the line just above this label.
            net_total_idx = index - 1

    net_total_sales_raw = img_out_array[net_total_idx]
    customer_name_raw = img_out_array[customer_name_index]
    print("Raw data:: ", invoice_no_raw, invoice_date_raw, customer_name_raw, img_out_array[net_total_idx])

    invoice_no = invoice_no_raw.split(":")[1]
    invoice_date = invoice_date_raw.split(":")[1]
    customer_name = customer_name_raw.replace("..", "")
    net_total_sales = net_total_sales_raw.split(" ")[-1]

    return [invoice_no, invoice_date, customer_name, net_total_sales]


if __name__ == '__main__':
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "path/to/imgtotext.json"
    client = vision.ImageAnnotatorClient()
    images = glob.glob("/path/Documents/invoices/*.jpg")
    for image in images:
        print("scanning the image:::::" + image)
        invoice_no, invoice_date, customer_name, net_total_sales = scan_img(image)
        print("Formatted data:: ", invoice_no, invoice_date, customer_name, net_total_sales)

Output for file 1:

Customer Name
Address
**x customer**
area name
streetname
Customer LPO

Output for file 2:

Customer LPO
**y customer**
area name
streetname
LPO Date
Payment Terms
Customer Name
Address
Delivery Location

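Please advise: I want to read the "x customer" and "y customer" values, but their position changes from one document to another, and I have several documents. How can I structure and read the data?

There are several other fields that I am able to read successfully.

Thanks in advance.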

1 answer

Armin_SC · 6 years ago

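The Cloud Vision API has no request attribute for specifying the format or order in which a document's data is read. Instead, I think the available workaround is to use the BoundingPoly and Vertex response attributes, which give the coordinates of each word found in the image, and to process that vertex data in your own code to decide which text should be grouped into columns and rows. You can take a look at this link, which includes some response examples containing these attributes.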
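As a rough illustration of that workaround, the sketch below flattens a full_text_annotation into words with their bounding-box coordinates and then groups them into rows by vertical position. The helper names words_with_boxes and group_into_rows and the tolerance parameter are hypothetical, not part of the Vision API; the code only assumes the pages / blocks / paragraphs / words / symbols hierarchy and the bounding_box.vertices fields described on the fulltext-annotations page, and that the scan is not noticeably rotated.

```python
def words_with_boxes(document):
    """Flatten a full_text_annotation into (text, x_left, y_center, height) tuples."""
    words = []
    for page in document.pages:
        for block in page.blocks:
            for paragraph in block.paragraphs:
                for word in paragraph.words:
                    text = "".join(symbol.text for symbol in word.symbols)
                    xs = [v.x for v in word.bounding_box.vertices]
                    ys = [v.y for v in word.bounding_box.vertices]
                    words.append((text, min(xs),
                                  (min(ys) + max(ys)) / 2.0,
                                  max(ys) - min(ys)))
    return words


def group_into_rows(words, tolerance=0.5):
    """Group words into text rows, top to bottom and left to right.

    `tolerance` (an assumed tuning knob) is the fraction of a word's height
    by which its vertical center may differ from the first word of a row
    and still be placed in that row.
    """
    rows = []
    for word in sorted(words, key=lambda w: w[2]):  # sort by vertical center
        _, _, y, height = word
        if rows and abs(y - rows[-1]["y"]) <= tolerance * max(height, 1):
            rows[-1]["words"].append(word)
        else:
            rows.append({"y": y, "words": [word]})
    # Within each row, order words by their left edge and join them.
    return [
        " ".join(w[0] for w in sorted(row["words"], key=lambda w: w[1]))
        for row in rows
    ]
```

In the question's code, something like group_into_rows(words_with_boxes(response.full_text_annotation)) could stand in for document.text.split("\n"). If a value is printed on the same row as its label, which is an assumption about the invoice layout, it can then be read from that row instead of from a fixed line offset.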

If this does not cover your current needs, you can use the Send Feedback button on the service public documentation, or use the Issue Tracker tool to raise a Vision API feature request and let Google know about the functionality you need.