
Searching an image for an object from a database

pattern-recognition image-recognition computer-vision image-processing

Vether · 7 years ago

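I tried using imagehash ( https://github.com/JohannesBuchner/imagehash ), which would be very useful, but to get meaningful results I think I would have to compute (almost) every possible hash of my_image, because I don't know the object's size or its position in my_image: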

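from PIL import Image
import imagehash

hash_list = []
MyImage = Image.open('my_image.jpg')
image_width, image_height = MyImage.size

# Hash every possible crop window (at least 1 x 1 pixel) of the image
for x_start in range(image_width):
    for y_start in range(image_height):
        for x_end in range(x_start + 1, image_width + 1):
            for y_end in range(y_start + 1, image_height + 1):
                # crop() expects a single (left, upper, right, lower) tuple
                box = (x_start, y_start, x_end, y_end)
                hash_list.append(imagehash.phash(MyImage.crop(box)))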

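…and then try to find similar hashes in the database. But when, for example, image_width = image_height = 500, this loop and the search take an extremely long time. Of course I can optimize it a bit, but for bigger images it still looks like seppuku: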

MIN_WIDTH = 30
MIN_HEIGHT = 30
STEP = 2

hash_list = []
MyImage = Image.open('my_image.jpg')

for x_start in range(0, image_width - MIN_WIDTH, STEP):
    for y_start in range(0, image_height - MIN_HEIGHT, STEP):
        for x_end in range(x_start + MIN_WIDTH, image_width, STEP):
            for y_end in range(y_start + MIN_HEIGHT, image_height, STEP):
                hash_list.append(...)
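
To put rough numbers on the two loops above, here is a small sketch that only counts how many crop windows they would visit, without hashing anything. `count_windows` is a hypothetical helper written for this illustration; it is not part of imagehash.

```python
def count_windows(width, height, min_w=0, min_h=0, step=1):
    """Count the crop windows enumerated by four nested start/end loops."""
    def pairs(size, min_size):
        # Number of (start, end) combinations along one axis
        return sum(
            len(range(start + min_size, size, step))
            for start in range(0, size - min_size, step)
        )
    return pairs(width, min_w) * pairs(height, min_h)

brute_force = count_windows(500, 500)
optimized = count_windows(500, 500, min_w=30, min_h=30, step=2)
print(brute_force)  # 15687562500
print(optimized)    # 768952900
```

With these settings the minimum size and the step cut the work by roughly a factor of 20, which still leaves hundreds of millions of hashes for a single 500 x 500 image.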

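I wonder whether there is a good way to decide which parts of my_image are worth hashing; for example, cutting off the edges looks like a bad idea. Maybe there is a simpler solution? It would be great if the program could give an answer in at most 20 minutes. I would be grateful for any advice.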

2 replies | latest 7 years ago

kunal18 7 years ago

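This looks like an image retrieval problem. In your case, however, you are more interested in a binary yes/no answer that tells you whether the input image (my_image.jpg) shows an object that is present in your database.

The first thing I can suggest is to resize all images (including the input) to a fixed size, say 100 x 100. However, if the object in some image is very small or sits in a specific region of the image (e.g., the top-left corner), resizing can make things worse. It is not clear from your question how likely that is in your case.

Instead of computing and using image hashes, I suggest you read about bag-of-visual-words (e.g., here) based methods for object categorization. Although your aim is not to categorize objects, it will help you come up with a different approach to your problem.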
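
As a rough illustration of the bag-of-visual-words idea: each local descriptor of an image is assigned to its nearest "visual word" in a codebook, and the image is then represented by a histogram of word counts that can be compared with the histograms of other images. The sketch below is a toy version under stated assumptions: the codebook is written by hand (in practice it is usually learned by clustering descriptors from many images, typically with k-means), and the 2-D vectors stand in for real descriptors such as 128-dimensional SIFT vectors. The function names are hypothetical.

```python
import math
from collections import Counter

def nearest_word(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor (Euclidean)."""
    return min(range(len(codebook)),
               key=lambda i: math.dist(descriptor, codebook[i]))

def bovw_histogram(descriptors, codebook):
    """Normalized histogram of visual-word counts for one image."""
    if not descriptors:
        return [0.0] * len(codebook)
    counts = Counter(nearest_word(d, codebook) for d in descriptors)
    return [counts[i] / len(descriptors) for i in range(len(codebook))]

# Toy data, made up for illustration only
codebook = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
descriptors = [(1.0, 0.5), (9.0, 11.0), (0.5, 9.0), (0.2, 0.1)]
print(bovw_histogram(descriptors, codebook))  # [0.5, 0.25, 0.25]
```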

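Vether 7 years ago

In the end, I found a solution that works very well for me; maybe it will be useful for others too:

I'm using SIFT to detect the "best candidates" in my_image: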

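from operator import itemgetter

import imutils
import numpy as np

def multiscale_template_matching(template, image):
    results = []
    # Try scales from 1.4 down to 0.2 in steps of 0.01
    for scale in np.linspace(0.2, 1.4, 121)[::-1]:
        res = imutils.resize(image, width=int(image.shape[1] * scale))
        # Ratio that maps coordinates in the resized image back to the original
        r = image.shape[1] / float(res.shape[1])
        # Stop once the resized image is smaller than the template
        if res.shape[0] < template.shape[0] or res.shape[1] < template.shape[1]:
            break

        ## bigger correlation <==> better matching
        ## template_matching (not shown) uses SIFT to return best correlation and coordinates
        correlation, (x, y) = template_matching(template, res)
        coordinates = (x * r, y * r)
        results.append((correlation, coordinates, r))

    # Keep the 10 candidates with the highest correlation
    results.sort(key=itemgetter(0), reverse=True)
    return results[:10]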

import imagehash

ACCEPTABLE = 10  ## largest hash difference still treated as a match

def find_best(image, template, candidates):
    template_hash = imagehash.phash(template)
    best_result = 50  ## initial value must be greater than ACCEPTABLE
    best_cand = None

    for cand in candidates:
        cand_hash = get_hash(...)
        hash_diff = template_hash - cand_hash
        if hash_diff < best_result:
            best_result = hash_diff
            best_cand = cand

    if best_result <= ACCEPTABLE:
        return best_cand, best_result
    else:
        return None, None

If the result is <= ACCEPTABLE, I'm almost sure the answer is "gotcha!" :) This solution lets me compare my_image with 1000 objects in 7 minutes.
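
For context on the threshold: subtracting two imagehash hashes gives the number of bits in which they differ (the Hamming distance), and with the default hash size of 8 a phash has 64 bits. The sketch below reproduces that comparison on plain integers. The 64-bit values are made up for illustration, and `hamming_distance` is a hypothetical helper, not imagehash's API.

```python
ACCEPTABLE = 10  # same threshold as in the answer above

def hamming_distance(hash_a: int, hash_b: int) -> int:
    """Number of bit positions in which two integer hashes differ."""
    return bin(hash_a ^ hash_b).count("1")

template_hash = 0xF0F0F0F0F0F0F0F0      # made-up 64-bit value
near_hash = template_hash ^ 0b111       # differs in 3 bits
far_hash = template_hash ^ 0xFFFFFFFF   # differs in 32 bits

print(hamming_distance(template_hash, near_hash) <= ACCEPTABLE)  # True
print(hamming_distance(template_hash, far_hash) <= ACCEPTABLE)   # False
```

If the bits of two unrelated hashes behaved like independent coin flips, they would differ in about 32 of 64 positions on average, so a cutoff of 10 is fairly strict.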