
Why do some images have 3 as their third dimension while others have 4?

conv-neural-network image-processing python

Gameatro · Tech Community · 6 years ago

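import glob
import numpy as np
from PIL import Image
# Raw string so the backslashes in the Windows path are not read as escape sequences
train_list = glob.glob(r'A:\Code\Machine Learning\CNN\ConvolutionalNN1\TrainImg\*.jpg')
X_train_orig = np.array([np.array(Image.open(file)) for file in train_list])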

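But it gave me an error saying it could not broadcast (420, 310) into (420, 310, 3). Then I printed the shapes of the arrays: some are (420, 310, 3) and some are (410, 320, 4). Why is that? How can I change the images so that they all fit into one array?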

1 answer | latest 6 years ago

Koustav · 6 years ago

Problem

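What is happening here is that you are working with images in three different formats (at least those that appear in your question). They are:

RGB: shape (420, 310, 3), three channels
RGB-A: shape (420, 310, 4), four channels
Grayscale: shape (420, 310), single channel

The third dimension you see is the number of channels in the image (the first two are the height and the width, respectively). A grayscale image has only one channel, so its array has no third dimension at all.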

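An example will make this clearer. I downloaded three random images from the internet, one for each of the formats above.

RGB image: dog.png

RGB-A image: fish.png

Grayscale image: lena.png

Here is a Python script that opens each image with PIL and prints its shape: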

from PIL import Image
import numpy as np

dog = Image.open('dog.png')
print('Dog shape is ' + str(np.array(dog).shape))

fish = Image.open('fish.png')
print('Fish shape is ' + str(np.array(fish).shape))

lena = Image.open('lena.png')
print('Lena shape is ' + str(np.array(lena).shape))

The output is:

Dog shape is (250, 250, 3)
Fish shape is (501, 393, 4)
Lena shape is (512, 512)

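So when you try to collect all of the images into a single array (np.array), you get a shape mismatch error, because NumPy cannot combine arrays of different shapes into one regular array.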
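The mismatch can be reproduced without any image files. The following is a minimal sketch, assuming synthetic blank images created with Image.new as stand-ins for the three formats. It uses np.stack rather than np.array, because np.stack consistently raises a ValueError on mismatched shapes, whereas the exact behaviour of np.array on ragged input differs between NumPy versions.

```python
import numpy as np
from PIL import Image

# Synthetic stand-ins for the three formats (assumption: blank images,
# not the asker's files). PIL takes size as (width, height), while the
# NumPy array shape is (height, width[, channels]).
images = [
    Image.new('RGB', (310, 420)),
    Image.new('RGBA', (310, 420)),
    Image.new('L', (310, 420)),   # 'L' is PIL's 8-bit grayscale mode
]

modes = [img.mode for img in images]
shapes = [np.array(img).shape for img in images]
print(modes)
print(shapes)

# Stacking arrays with different shapes fails.
try:
    np.stack([np.array(img) for img in images])
    stacked_ok = True
except ValueError as err:
    stacked_ok = False
    print('Stacking failed:', err)
```

Checking img.mode before converting to an array is a quick way to find out which files in a folder are not plain RGB.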

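Solution

The simplest way to solve this is to convert all of the images to one specific format before saving them into the array. Assuming you will be using a pre-trained ImageNet model, which expects three-channel input, we will convert them to RGB.

We can convert RGB-A to RGB using the following code: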

fish = Image.open('fish.png')
print('Fish RGB-A shape is ' + str(np.array(fish).shape))
rgb = fish.convert('RGB')
print('Fish RGB shape is ' + str(np.array(rgb).shape))

The output is:

Fish RGB-A shape is (501, 393, 4)
Fish RGB shape is (501, 393, 3)

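Note: the same convert('RGB') call also works for a grayscale image of shape (420, 310); the result has 3 channels, i.e. shape (420, 310, 3).

Hope this clears up your doubt.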
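Putting the conversion together for a whole list of images might look like the sketch below. The helper name load_as_rgb_array and its size parameter are hypothetical, introduced only for illustration. The optional resize is an addition that the answer does not cover: the shapes in the question also differ in height and width ((420, 310) versus (410, 320)), and converting channels alone would not make those stackable.

```python
import numpy as np
from PIL import Image

def load_as_rgb_array(images, size=None):
    """Convert PIL images to RGB and stack them into one (N, H, W, 3) array.

    `size` is a hypothetical (width, height) tuple; when given, every
    image is resized to it so that heights and widths agree as well.
    """
    arrays = []
    for img in images:
        rgb = img.convert('RGB')       # RGBA and grayscale both become 3-channel
        if size is not None:
            rgb = rgb.resize(size)     # PIL expects (width, height)
        arrays.append(np.array(rgb))
    return np.stack(arrays)

# Synthetic stand-ins (assumption: blank images, not the asker's files).
samples = [
    Image.new('RGB', (310, 420)),
    Image.new('RGBA', (320, 410)),
    Image.new('L', (310, 420)),
]
X = load_as_rgb_array(samples, size=(310, 420))
print(X.shape)  # (3, 420, 310, 3)
```

With real files, the same helper could be fed from the glob result, for example load_as_rgb_array((Image.open(f) for f in train_list), size=(310, 420)), where the target size is whatever the model expects.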
