
Dimension size error when feeding raw images to TensorFlow

  •  0
  • Adrián Arroyo Perez  · Tech Community  · 6 years ago

    I am quite new to TensorFlow, so I tried following the tutorial and feeding the layers with images of size (944, 944) and two classes, yes/no (1, 0), to see how it performs, but I have not been able to make it work. The last error I get is: "Dimension size must be evenly divisible by 57032704 but is 35645440 for 'Reshape_1' with input shapes: [10,236,236,64], [2] and with input tensors computed as partial shapes: input[1] = [?,57032704]".

    I don't know whether the error comes from one of the reshapes or because I simply cannot feed the network like this. The code is the following:

    import tensorflow as tf
    import numpy as np
    import os
    # import cv2
    from scipy import ndimage
    import PIL
    
    tf.logging.set_verbosity(tf.logging.INFO)
    
    def define_model(features, labels, mode):
        """Model function for CNN."""
        # Input Layer
        input_layer = tf.reshape(features["x"], [-1, 944, 944, 1])

        # Convolutional Layer #1
        conv1 = tf.layers.conv2d(
            inputs=input_layer,
            filters=32,
            kernel_size=[16, 16],
            padding="same",
            activation=tf.nn.relu)

        # Pooling Layer #1
        pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2)

        # Convolutional Layer #2 and Pooling Layer #2
        conv2 = tf.layers.conv2d(
            inputs=pool1,
            filters=64,
            kernel_size=[16, 16],
            padding="same",
            activation=tf.nn.relu)
        pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2], strides=2)

        # Dense Layer
        pool2_flat = tf.reshape(pool2, [-1, 944*944*64])
        dense = tf.layers.dense(inputs=pool2_flat, units=1024, activation=tf.nn.relu)
        dropout = tf.layers.dropout(
            inputs=dense, rate=0.4, training=mode == tf.estimator.ModeKeys.TRAIN)

        # Logits Layer - raw predictions
        logits = tf.layers.dense(inputs=dropout, units=10)

        predictions = {
            # Generate predictions (for PREDICT and EVAL mode)
            "classes": tf.argmax(input=logits, axis=1),
            # Add `softmax_tensor` to the graph. It is used for PREDICT and by the
            # `logging_hook`.
            "probabilities": tf.nn.softmax(logits, name="softmax_tensor")
        }

        if mode == tf.estimator.ModeKeys.PREDICT:
            return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions)

        # Calculate Loss (for both TRAIN and EVAL modes)
        loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)

        # Configure the Training Op (for TRAIN mode)
        if mode == tf.estimator.ModeKeys.TRAIN:
            optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)
            train_op = optimizer.minimize(
                loss=loss,
                global_step=tf.train.get_global_step())
            return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op)

        # Add evaluation metrics (for EVAL mode)
        eval_metric_ops = {
            "accuracy": tf.metrics.accuracy(
                labels=labels, predictions=predictions["classes"])}
        return tf.estimator.EstimatorSpec(
            mode=mode, loss=loss, eval_metric_ops=eval_metric_ops)
    
    if __name__ == '__main__':
        # Load training and eval data
        # mnist = tf.contrib.learn.datasets.load_dataset("mnist")
        # train_data = mnist.train.images  # Returns np.array
        # train_labels = np.asarray(mnist.train.labels, dtype=np.int32)
        # load_images is the author's own helper returning (images, labels); not shown
        train_data, train_labels = load_images("C:\\Users\\Heads\\Desktop\\BDManchas_Semi")

        eval_data = train_data.copy()
        eval_labels = train_labels.copy()

        # Create the Estimator
        classifier = tf.estimator.Estimator(
            model_fn=define_model, model_dir="/tmp/convnet_model")

        # Set up logging for predictions
        tensors_to_log = {"probabilities": "softmax_tensor"}
        logging_hook = tf.train.LoggingTensorHook(
            tensors=tensors_to_log, every_n_iter=50)

        # Train the model
        train_input_fn = tf.estimator.inputs.numpy_input_fn(
            x={"x": train_data},
            y=train_labels,
            batch_size=10,
            num_epochs=None,
            shuffle=True)
        classifier.train(
            input_fn=train_input_fn,
            steps=20000,
            hooks=[logging_hook])

        # Evaluate the model and print results
        eval_input_fn = tf.estimator.inputs.numpy_input_fn(
            x={"x": eval_data},
            y=eval_labels,
            num_epochs=1,
            shuffle=False)
        eval_results = classifier.evaluate(input_fn=eval_input_fn)
        print(eval_results)
    

    --------------------------------- EDIT: ----------------------------------

    OK, now that the reshape is done I am getting another error, this time with the loss in the training. I have been researching it (there is a good answer here), but with every new function I use there is a different error. I tried changing the loss from:

    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
    

    to:

    loss = tf.nn.sigmoid_cross_entropy_with_logits(labels=labels, logits=logits)
    

    But it also seems to have a problem with the reshaping: the error says that logits and labels must have the same shape ((10, 10) vs (10,)). I have tried reshaping both the logits and the labels, but I always end up with a different error (I guess there is no way to make the two arrays match).

    The labels are defined as follows:

    list_of_classes = []
    # if ... class == 1
    list_of_classes.append(1)
    #else
    list_of_classes.append(0)
    
    labels = np.array(list_of_classes).astype("int32") 
    

    Any idea of how to use the right loss here?

    2 Answers  |  6 years ago
        1
  •  1
  •   benjaminplanche    6 years ago

    Initial problem

    The output of your second pooling layer (pool2) has a shape of (1, 236, 236, 64) (the convolutions and poolings have reduced the size of the tensor), so trying to reshape it to (-1, 944*944*64) (pool2_flat) throws an error.
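
    A quick sanity check in plain Python (my addition, using the numbers from the question) shows why TensorFlow complains about divisibility:

    # Each 2x2 max-pool with stride 2 halves the spatial dimensions: 944 -> 472 -> 236.
    pooled = 944 // 2 // 2                   # 236
    available = 10 * pooled * pooled * 64    # 35,645,440 elements in a batch of 10
    requested = 944 * 944 * 64               # 57,032,704 elements per example requested
    print(available % requested)             # non-zero, hence the "evenly divisible" error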

    To avoid this, you could define pool2_flat as:

    pool2_shape = tf.shape(pool2)
    pool2_flat = tf.reshape(pool2, [-1, pool2_shape[1] * pool2_shape[2] * pool2_shape[3]])
    # or directly pool2_flat = tf.reshape(pool2, [-1, 236 * 236 * 64])
    # if your dimensions are fixed...
    
    # or more simply, as suggested by @xdurch0:
    pool2_flat = tf.layers.flatten(pool2)
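
    As a side note (my addition, assuming TF 1.x): tf.layers.flatten keeps the flattened dimension static whenever the spatial dimensions are known at graph-construction time, which the following dense layer needs in order to build its weight matrix:

    # Sketch: flatten preserves a statically-known feature dimension.
    x = tf.placeholder(tf.float32, [None, 236, 236, 64])
    flat = tf.layers.flatten(x)
    print(flat.shape)  # (?, 3564544) -- batch unknown, feature size known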
    

    Regarding your edit

    Without knowing how you are defining your labels, it is hard to tell what is going wrong. The labels must be of shape (None,) (the class ID of each image in the batch), and the logits must be of shape (None, nb_classes) (the estimated probability of each class, for each image in the batch).
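
    If you do want the sigmoid variant you tried, logits and labels first have to be brought to the same shape. A minimal sketch (my addition, assuming sparse integer class IDs and the 10-unit logits layer above):

    # Hypothetical fix for the (10, 10) vs (10,) mismatch: one-hot encode the
    # sparse labels so both tensors have shape (batch, nb_classes).
    onehot_labels = tf.one_hot(labels, depth=10)       # (None,) -> (None, 10), float32
    per_class = tf.nn.sigmoid_cross_entropy_with_logits(
        labels=onehot_labels, logits=logits)           # elementwise, shape (None, 10)
    loss = tf.reduce_mean(per_class)                   # reduce to a scalar loss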

    The following code works for me:

    def define_model(features, labels, mode):
        """Model function for CNN."""
        # Input Layer
        input_layer = tf.reshape(features["x"], [-1,944, 944, 1])
    
        # Convolutional Layer #1
        conv1 = tf.layers.conv2d(
          inputs=input_layer,
          filters=32,
          kernel_size=[16, 16],
          padding="same",
          activation=tf.nn.relu)
    
        # Pooling Layer #1
        pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2)
    
        # Convolutional Layer #2 and Pooling Layer #2
        conv2 = tf.layers.conv2d(
            inputs=pool1,
            filters=64,
            kernel_size=[16, 16],
            padding="same",
            activation=tf.nn.relu)
        pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2], strides=2)
    
        # Dense Layer
        pool2_flat = tf.layers.flatten(pool2)
        dense = tf.layers.dense(inputs=pool2_flat, units=1024, activation=tf.nn.relu)
        dropout = tf.layers.dropout(
            inputs=dense, rate=0.4, training=mode == tf.estimator.ModeKeys.TRAIN)
    
        # Logits Layer - raw predictions
        logits = tf.layers.dense(inputs=dropout, units=10)
    
        predictions = {
            # Generate predictions (for PREDICT and EVAL mode)
            "classes": tf.argmax(input=logits, axis=1),
            # Add `softmax_tensor` to the graph. It is used for PREDICT and by the
            # `logging_hook`.
            "probabilities": tf.nn.softmax(logits, name="softmax_tensor")
        }
    
        if mode == tf.estimator.ModeKeys.PREDICT:
            return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions)
    
        # Calculate Loss (for both TRAIN and EVAL modes)
        loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
    
        # Configure the Training Op (for TRAIN mode)
        if mode == tf.estimator.ModeKeys.TRAIN:
            optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)
            train_op = optimizer.minimize(
                loss=loss,
                global_step=tf.train.get_global_step())
            return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op)
    
        # Add evaluation metrics (for EVAL mode)
        eval_metric_ops = {
            "accuracy": tf.metrics.accuracy(
                labels=labels, predictions=predictions["classes"])}
        return tf.estimator.EstimatorSpec(
            mode=mode, loss=loss, eval_metric_ops=eval_metric_ops)
    
    if __name__ == '__main__':
        # Load training and eval data
        # mnist = tf.contrib.learn.datasets.load_dataset("mnist")
        # train_data = mnist.train.images  # Returns np.array
        # train_labels = np.asarray(mnist.train.labels, dtype=np.int32)
    
        def mock_load_images(path):
            nb_classes = 10
            dataset_size = 100
            train_data = np.random.rand(dataset_size, 944, 944).astype(np.float32)
            list_of_classes = [np.random.randint(nb_classes) for i in range(dataset_size)]
            train_labels = np.array(list_of_classes, dtype=np.int32)
            return train_data, train_labels
    
        train_data, train_labels = mock_load_images("C:\\Users\\Heads\\Desktop\\BDManchas_Semi")
    
        # Create the Estimator
        classifier = tf.estimator.Estimator(
            model_fn=define_model, model_dir="/tmp/convnet_model")
    
        # Set up logging for predictions
        tensors_to_log = {"probabilities": "softmax_tensor"}
        logging_hook = tf.train.LoggingTensorHook(
            tensors=tensors_to_log, every_n_iter=50)
    
        # Train the model
        train_input_fn = tf.estimator.inputs.numpy_input_fn(
            x={"x": train_data},
            y=train_labels,
            batch_size=1,
            num_epochs=None,
            shuffle=True)
        classifier.train(
            input_fn=train_input_fn,
            steps=20000,
            hooks=[logging_hook])
    
        # ...
    
        2
  •  -1
  •   Adrián Arroyo Perez    6 years ago

    So the solution was to change the line:

    pool2_flat = tf.reshape(pool2, [-1,944*944*64])
    

    for the line:

    pool2_flat = tf.layers.flatten(pool2)
    

    Also, I needed to use images of size 512x512 instead of 944x944, because the larger ones did not fit in memory...
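
    The memory pressure most likely comes from the first dense layer after flattening; a rough estimate (my addition, plain Python, assuming the architecture above with float32 weights):

    # Weight count of the 1024-unit dense layer after two 2x2 pools.
    def dense_weights_gib(side):
        feats = (side // 4) ** 2 * 64        # 944 -> 236*236*64 = 3,564,544 features
        return feats * 1024 * 4 / 1024**3    # weights * 4 bytes, in GiB

    print(dense_weights_gib(944))            # ~13.6 GiB for 944x944 inputs
    print(dense_weights_gib(512))            # ~4.0 GiB for 512x512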