代码之家 › 专栏 › 技术社区 › Pro Q Rich Lysakowski PhD

如何让Keras网络不输出所有1

keras neural-network machine-learning tensorflow python

Pro Q Rich Lysakowski PhD · 技术社区 · 6 年前

我有一堆像这样的人玩电子游戏(一个我在tkinter创建的简单游戏)的图片:

游戏的理念是用户控制屏幕底部的盒子以躲避落下的球(他们只能向左和向右躲避)。

我的目标是让神经网络输出播放器在屏幕底部的位置。如果它们完全在左侧,神经网络应输出A 0,如果它们在中间,A 5,and all the way right,A 1 ,and all the values in between.

我的图像是300x400像素。我存储数据非常简单。在50帧的游戏中,我把每个玩家的图像和位置记录为每一帧的一个元组。因此,我的结果是一个列表,其形式为 [(image,player position),…] with 50 elements。然后我把名单腌制了。

所以在我的代码中,我尝试创建一个非常基本的前馈网络,它接收图像并输出一个介于0和1之间的值,表示图像底部框的位置。但是我的神经网络只输出1s。

为了让它训练和输出接近我想要的值,我应该改变什么?


   
    
     
      
       
        
         当然,这是我的代码:
        
        机器学习代码主要来自https://machinelearningmastery.com/tutorial-first-neuric-network-python-keras/

从keras.models导入序列
从keras.layers导入密集
将numpy导入为np
进口泡菜

def pil_image_to_np_array(图像):
''获取图像并将其转换为numpy数组'''
#来自https://stackoverflow.com/a/45208895
#我所有的图像都是黑白的,所以我只需要一个频道
返回np.array(image)([:,:,0:1]

def data-to-training-set(数据):
#将列表按格式[第1帧图像,第1帧播放器位置],…]拆分为[所有图像],[所有播放器位置]]
输入,输出=[zip中val的列表(val)(*data)]
对于索引,枚举中的图像(输入):
#将pil图像转换为numpy数组,以便keras处理它们
inputs[索引]=pil_image_to_np_array(image)
返回(输入、输出)

如果“名称”,
#固定随机种子的再现性
随机种子(7)

#加载数据
#数据格式为[(第1帧图像,第1帧播放器位置),(第2帧图像,第2帧播放器位置),…]
以open(“position_data1.pkl”,“rb”)作为pickled_数据:
数据=pickle.load(pickled_data)
x,y=数据\到\训练\集(数据)

#获取图像的宽度
宽度=x[0].形状[1]=400
#将播放机位置(介于0和图像宽度之间的值)转换为介于0和1之间的值
对于索引,以枚举(y)输出:
Y[索引]=输出/宽度

#将图像输入展平,以便将其传递到神经网络
对于索引,在枚举(x)中输入:
x[索引]=np.ndarray.flatten(inpt)

#Keras需要一个图像数组数组(不是列表)来输入神经网络
x=np.数组(x)
y=np.数组(y)

#创建模型
模型=顺序()
#我的图像是300 x 400像素,因此每个输入将是一个由120000个灰度像素值组成的扁平数组。
#没有任何深入的学习,保持它的超级简单。
model.add(密集(1,输入_dim=120000,激活='sigmoid'))

#编译模型
model.compile(loss='mean'u squared'u error',optimizer='adam')

#符合模型
型号匹配(X,Y,epochs=15,批量大小=10)

#看看模型在做什么
预测=模型。预测(X,批次大小=10)
打印(预测)这将打印所有1!#TODO修复
        
         编辑:打印(Y)给我:




所以肯定不是所有的零。.


游戏的理念是用户控制屏幕底部的盒子以躲避落下的球(他们只能向左和向右躲避)。

我的目标是让神经网络输出播放器在屏幕底部的位置。如果它们完全在左边,神经网络应该输出一个0如果他们在中间,.5一路向前走1以及两者之间的所有值。

我的图像是300x400像素。我存储数据非常简单。在50帧的游戏中,我把每个玩家的图像和位置记录为每一帧的一个元组。因此,我的结果是表格中的一个列表[(image, player position), ...]有50种元素。然后我把那张单子弄糟了。

所以在我的代码中,我尝试创建一个非常基本的前馈网络,它接收图像并输出一个介于0和1之间的值,表示图像底部框的位置。但是我的神经网络只能输出1。

为了让它训练和输出接近我想要的值,我应该改变什么?

当然,这是我的代码:

# machine learning code mostly from https://machinelearningmastery.com/tutorial-first-neural-network-python-keras/

from keras.models import Sequential
from keras.layers import Dense
import numpy as np
import pickle

def pil_image_to_np_array(image):
    '''Takes an image and converts it to a numpy array'''
    # from https://stackoverflow.com/a/45208895
    # all my images are black and white, so I only need one channel
    return np.array(image)[:, :, 0:1]

def data_to_training_set(data):
    # split the list in the form [(frame 1 image, frame 1 player position), ...] into [[all images], [all player positions]]
    inputs, outputs = [list(val) for val in zip(*data)]
    for index, image in enumerate(inputs):
        # convert the PIL images into numpy arrays so Keras can process them
        inputs[index] = pil_image_to_np_array(image)
    return (inputs, outputs)

if __name__ == "__main__":
    # fix random seed for reproducibility
    np.random.seed(7)

    # load data
    # data will be in the form [(frame 1 image, frame 1 player position), (frame 2 image, frame 2 player position), ...]
    with open("position_data1.pkl", "rb") as pickled_data:
        data = pickle.load(pickled_data)
    X, Y = data_to_training_set(data)

    # get the width of the images
    width = X[0].shape[1] # == 400
    # convert the player position (a value between 0 and the width of the image) to values between 0 and 1
    for index, output in enumerate(Y):
        Y[index] = output / width

    # flatten the image inputs so they can be passed to a neural network
    for index, inpt in enumerate(X):
        X[index] = np.ndarray.flatten(inpt)

    # keras expects an array (not a list) of image-arrays for input to the neural network
    X = np.array(X)
    Y = np.array(Y)

    # create model
    model = Sequential()
    # my images are 300 x 400 pixels, so each input will be a flattened array of 120000 gray-scale pixel values
    # keep it super simple by not having any deep learning
    model.add(Dense(1, input_dim=120000, activation='sigmoid'))

    # Compile model
    model.compile(loss='mean_squared_error', optimizer='adam')

    # Fit the model
    model.fit(X, Y, epochs=15, batch_size=10)

    # see what the model is doing
    predictions = model.predict(X, batch_size=10)
    print(predictions) # this prints all 1s! # TODO fix


编辑:打印(Y)给我:



所以它肯定不是全部为零。

1 回复 | 直到 6 年前

today 6 年前

当然,一个更深的模型可能会给你一个更好的精度,但是考虑到你的图像很简单,一个只有一个隐藏层的非常简单(浅)的模型应该给一个中等到高的精度。因此,您需要进行以下修改才能实现这一点:

确保 X 和 Y 属于类型 float32 (目前, X 属于类型 uint8 )以下内容:
```
X = np.array(X, dtype=np.float32)
Y = np.array(Y, dtype=np.float32)
```
在训练神经网络时,最好将训练数据规范化。规范化有助于优化过程顺利进行,并加速收敛到解决方案。它进一步防止大的值导致大的梯度更新,这将是破坏。通常,输入数据中每个特征的值应该在一个小范围内,其中两个公共范围是 [-1,1] 和 [0,1] .因此,要确保所有值都在范围内 [-1,1] ,我们从每个特征中减去其平均值,然后除以其标准偏差:
```
X_mean = X.mean(axis=0)
X -= X_mean
X_std = X.std(axis=0)
X /= X_std + 1e-8  # add a very small constant to prevent division by zero
```
请注意,我们正在规范化每个特征(即本例中的每个像素),而不是每个图像。当您想要预测新数据时,即在推理或测试模式下,您需要减去 X_mean 从测试数据中除以 X_std (你应该 从来没有 从测试数据中减去自己的平均值或除以自己的标准差;相反,使用训练数据的平均值和标准差):
```
X_test -= X_mean
X_test /= X_std + 1e-8
```
如果应用点1和点2中的更改,您可能会注意到网络不再只预测1或0。相反,它显示了一些学习的微弱迹象,并预测了0和1的混合。这不坏,但远不是好的,我们有很高的期望!预测应该比只包含0和1的组合要好得多。在那里,你应该考虑到(忘记了!)学习率。由于考虑到一个相对简单的问题,网络的参数数量相对较大(训练数据样本也较少),因此您应该选择一个较小的学习速率来平滑梯度更新和学习过程:
```
from keras import optimizers
model.compile(loss='mean_squared_error', optimizer=optimizers.Adam(lr=0.0001))
```
你会注意到区别:损失价值达到大约 0.01 10个时期之后。网络不再预测0和1的混合;相反,预测更准确,更接近于它们应该是什么(即。 是 )。
别忘了!我们有高(逻辑!)期望。那么,我们如何在不向网络添加任何新层的情况下做得更好(显然,我们假设 adding more layers 可以救命!!)?

4.1.收集更多的培训数据。

4.2.添加权重正则化。常见的是L1和L2的正则化(我强烈推荐Jupyter notebooks 这本书的 Deep Learning with Python 作者 弗朗索瓦·乔利特 喀拉斯的创造者。明确地, here 是讨论正则化的。)

你应该总是以一种恰当和公正的方式评估你的模型。在训练数据上评估它(你曾经训练过它)并不能告诉你任何关于你的模型在看不见的(即新的或真实的)数据点上有多好的表现(例如,考虑一个存储或记忆所有训练数据的模型)。它将在训练数据上执行得很好,但它将是一个无用的模型,在新数据上执行得很差)。所以我们应该有测试和训练数据集:我们在训练数据上训练模型,在测试(即新的)数据上评估模型。然而,在建立一个好的模型的过程中,你要进行大量的实验:例如,你首先改变模型的类型和层数,训练模型,然后根据测试数据对其进行评估,以确保它是好的。然后你改变另一件事,比如说学习率,再次训练它,然后根据测试数据再次评估它…为了缩短测试周期,这些调整和评估周期会导致测试数据过度拟合。因此,我们需要一个名为 验证数据 (阅读更多: What is the difference between test set and validation set? )以下内容:
```
# first shuffle the data to make sure it isn't in any particular order
indices = np.arange(X.shape[0])
np.random.shuffle(indices)
X = X[indices]
Y = Y[indices]

# you have 200 images
# we select 100 images for training,
# 50 images for validation and 50 images for test data
X_train = X[:100]
X_val = X[100:150]
X_test = X[150:]
Y_train = Y[:100]
Y_val = Y[100:150]
Y_test = Y[150:]

# train and tune the model 
# you can attempt train and tune the model multiple times,
# each time with different architecture, hyper-parameters, etc.
model.fit(X_train, Y_train, epochs=15, batch_size=10, validation_data=(X_val, Y_val))

# only and only after completing the tuning of your model
# you should evaluate it on the test data for just one time
model.evaluate(X_test, Y_test)

# after you are satisfied with the model performance
# and want to deploy your model for production use (i.e. real world)
# you can train your model once more on the whole data available
# with the best configurations you have found out in your tunings
model.fit(X, Y, epochs=15, batch_size=10)
```
(实际上,当我们没有可用的培训数据时,将验证和测试数据与整个可用数据分开是浪费的。在这种情况下,如果模型的计算成本不高,而不是分离一个称为交叉验证的验证集,则可以这样做。 K-fold cross-validation 或者在数据样本很少的情况下迭代k-折叠交叉验证。)

写这个答案的时候大约是凌晨4点,我觉得很困,但我想再提一件与你的问题没有直接关系的事情:通过使用numpy库及其功能和方法,你可以编写更简洁和高效的代码,还可以节省你的资源。很多时候。因此,请确保您练习更多地使用它,因为它在机器学习社区和库中被大量使用。为了证明这一点,这里是您编写的相同代码,但更多地使用了numpy( 注意,我没有应用我在本代码中提到的所有更改 )以下内容:

# machine learning code mostly from https://machinelearningmastery.com/tutorial-first-neural-network-python-keras/

from keras.models import Sequential
from keras.layers import Dense
import numpy as np
import pickle

def pil_image_to_np_array(image):
    '''Takes an image and converts it to a numpy array'''
    # from https://stackoverflow.com/a/45208895
    # all my images are black and white, so I only need one channel
    return np.array(image)[:, :, 0]

def data_to_training_set(data):
    # split the list in the form [(frame 1 image, frame 1 player position), ...] into [[all images], [all player positions]]
    inputs, outputs = zip(*data)
    inputs = [pil_image_to_np_array(image) for image in inputs]
    inputs = np.array(inputs, dtype=np.float32)
    outputs = np.array(outputs, dtype=np.float32)
    return (inputs, outputs)

if __name__ == "__main__":
    # fix random seed for reproducibility
    np.random.seed(7)

    # load data
    # data will be in the form [(frame 1 image, frame 1 player position), (frame 2 image, frame 2 player position), ...]
    with open("position_data1.pkl", "rb") as pickled_data:
        data = pickle.load(pickled_data)
    X, Y = data_to_training_set(data)

    # get the width of the images
    width = X.shape[2] # == 400
    # convert the player position (a value between 0 and the width of the image) to values between 0 and 1
    Y /= width

    # flatten the image inputs so they can be passed to a neural network
    X = np.reshape(X, (X.shape[0], -1))

    # create model
    model = Sequential()
    # my images are 300 x 400 pixels, so each input will be a flattened array of 120000 gray-scale pixel values
    # keep it super simple by not having any deep learning
    model.add(Dense(1, input_dim=120000, activation='sigmoid'))

    # Compile model
    model.compile(loss='mean_squared_error', optimizer='adam')

    # Fit the model
    model.fit(X, Y, epochs=15, batch_size=10)

    # see what the model is doing
    predictions = model.predict(X, batch_size=10)
    print(predictions) # this prints all 1s! # TODO fix