
Why do sigmoid and crossentropy in Keras/TensorFlow have low precision?

  •  3
  • syeh_106  · Tech Community  · 6 years ago

    I am using sigmoid activation and the binary_crossentropy loss in Keras:

    from keras.models import Sequential
    from keras.layers import Dense

    model = Sequential()
    model.add(Dense(1, input_dim=1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    

    and then evaluate the binary cross-entropy on the data {(-a, 0), (a, 1)}, i.e.

    import numpy as np

    keras_ce = np.zeros(40)
    my_ce = np.zeros(40)
    y = np.array([0, 1])
    for a in range(40):
        x = np.array([-a, a])
        keras_ce[a] = model.evaluate(x, y)[0]  # cross-entropy computed by keras/tensorflow
        my_ce[a] = np.log(1 + np.exp(-a))      # my own computation
    

    I found that the binary cross-entropy computed by Keras ( keras_ce ) hits a floor of about 1.09e-7 when a is around 16, as shown in the figure below (blue line). It does not decrease any further as 'a' keeps growing. Why is that?

    [Figure: Keras cross-entropy (blue) flattens at about 1.09e-7 once a ≈ 16, while the hand-computed value (orange) keeps decreasing]

    In contrast, for the data {(-a,0),(a,1)}, the exact binary cross-entropy is simply log(1+exp(-a)).

    So the cross-entropy should keep decreasing as a increases, as shown by the orange ('my') curve above. Is there some Keras/Tensorflow/Python setting I can change to improve its precision? Or am I making a mistake somewhere? I'd appreciate any suggestions/comments/answers.
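    For reference, here is a quick closed-form check (not part of the original post), assuming the layer's weight is 1 and its bias 0 (as in the second answer below), so that the prediction for input x is simply sigmoid(x); the mean cross-entropy over the two samples then reduces to log(1+exp(-a)):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    a = 10.0
    p_neg, p_pos = sigmoid(-a), sigmoid(a)                   # predictions for x = -a (label 0) and x = a (label 1)
    mean_ce = -(np.log(1.0 - p_neg) + np.log(p_pos)) / 2.0   # mean binary cross-entropy over the two samples
    print(mean_ce, np.log1p(np.exp(-a)))                     # both ~4.54e-05, i.e. log(1 + exp(-a))
    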

    2 Answers  |  6 years ago
        1
  •  10
  •   today    6 years ago

    The probability values (i.e., the output of the sigmoid function) are clipped when the loss is computed, for the sake of numerical stability.


    Using binary_crossentropy as the loss results in a call to the binary_crossentropy function in losses.py :

    def binary_crossentropy(y_true, y_pred):
        return K.mean(K.binary_crossentropy(y_true, y_pred), axis=-1)
    

    As you can see, it in turn calls the equivalent backend function. With Tensorflow as the backend, that results in a call to binary_crossentropy in the tensorflow_backend.py file:

    def binary_crossentropy(target, output, from_logits=False):
        """ Docstring ..."""
    
        # Note: tf.nn.sigmoid_cross_entropy_with_logits
        # expects logits, Keras expects probabilities.
        if not from_logits:
            # transform back to logits
            _epsilon = _to_tensor(epsilon(), output.dtype.base_dtype)
            output = tf.clip_by_value(output, _epsilon, 1 - _epsilon)
            output = tf.log(output / (1 - output))
    
        return tf.nn.sigmoid_cross_entropy_with_logits(labels=target,
                                                       logits=output)
    

    As you can see, the from_logits argument is set to False by default. Therefore, the if condition evaluates to true and the values in the output are clipped to the range [epsilon, 1-epsilon] . As a result, no matter how small or large a probability is, it can never be smaller than epsilon or greater than 1-epsilon . That explains why the binary_crossentropy loss is bounded as well.
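    To see how this clipping produces the observed floor, here is a minimal NumPy sketch (not from the answer) that redoes the backend's steps with the default epsilon of 1e-7; the value Keras actually reports (about 1.09e-7 in the question) can differ slightly because its arithmetic runs in float32:

    import numpy as np

    eps = 1e-7                                      # Keras' default fuzz factor
    p = 1.0                                         # sigmoid output for a large positive input, label y = 1
    p_clipped = np.clip(p, eps, 1.0 - eps)          # the clipping done in binary_crossentropy
    logit = np.log(p_clipped / (1.0 - p_clipped))   # "transform back to logits"
    loss_floor = np.log1p(np.exp(-logit))           # sigmoid_cross_entropy_with_logits with label 1
    print(loss_floor)                               # ~1e-7: the reported loss can never drop below this
    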

    The epsilon value is defined in the common.py file:

    _EPSILON = 1e-7
    
    def epsilon():
        """Returns the value of the fuzz factor used in numeric expressions.
        # Returns
            A float.
        # Example
        ```python
            >>> keras.backend.epsilon()
            1e-07
        ```
        """
        return _EPSILON
    

    If, for whatever reason, you need higher precision, you can set the epsilon value to a smaller number using the set_epsilon function from the backend:

    def set_epsilon(e):
        """Sets the value of the fuzz factor used in numeric expressions.
        # Arguments
            e: float. New value of epsilon.
        # Example
        ```python
            >>> from keras import backend as K
            >>> K.epsilon()
            1e-07
            >>> K.set_epsilon(1e-05)
            >>> K.epsilon()
            1e-05
        ```
        """
        global _EPSILON
        _EPSILON = e
    

    However, be aware that setting epsilon to an extremely low positive value or zero may destabilize computations all over Keras.
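    As a rough illustration of why (my note, not part of the answer): the backend tensors are float32, so once epsilon drops below the float32 spacing around 1.0 (roughly 1.2e-7), 1 - epsilon rounds to exactly 1.0 and the clipping no longer keeps the subsequent log away from a singularity:

    import numpy as np

    print(np.float32(1.0) - np.float32(1e-7))   # ~0.9999999 -- still distinguishable from 1.0
    print(np.float32(1.0) - np.float32(1e-9))   # 1.0 -- epsilon has vanished, log(output / (1 - output)) can blow up
    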

        2
  •  5
  •   BugKiller    6 years ago

    I think Keras clips the output probabilities. Let's trace how Keras computes the loss.

    First, the binary_crossentropy loss in losses.py , which dispatches to the backend implementation shown right after it:

    def binary_crossentropy(y_true, y_pred):
        return K.mean(K.binary_crossentropy(y_true, y_pred), axis=-1)
    

    def binary_crossentropy(target, output, from_logits=False):
        """Binary crossentropy between an output tensor and a target tensor.
    
        # Arguments
            target: A tensor with the same shape as `output`.
            output: A tensor.
            from_logits: Whether `output` is expected to be a logits tensor.
                By default, we consider that `output`
                encodes a probability distribution.
    
        # Returns
            A tensor.
        """
        # Note: tf.nn.sigmoid_cross_entropy_with_logits
        # expects logits, Keras expects probabilities.
        if not from_logits:
            # transform back to logits
            _epsilon = _to_tensor(epsilon(), output.dtype.base_dtype)
            output = tf.clip_by_value(output, _epsilon, 1 - _epsilon)
            output = tf.log(output / (1 - output))
    
    
        return tf.nn.sigmoid_cross_entropy_with_logits(labels=target,
                                                       logits=output)
    

    Notice that tf.clip_by_value is used for numerical stability.

    We can compare the Keras binary_crossentropy loss, raw tf.nn.sigmoid_cross_entropy_with_logits from TensorFlow, and a custom loss function (with the value clipping removed):

    import numpy as np
    import matplotlib.pyplot as plt
    import tensorflow as tf
    from keras.models import Sequential
    from keras.layers import Dense
    import keras
    
    # keras
    model = Sequential()
    model.add(Dense(units=1, activation='sigmoid', input_shape=(
        1,), weights=[np.ones((1, 1)), np.zeros(1)]))
    # print(model.get_weights())
    model.compile(loss='binary_crossentropy',
                  optimizer='adam', metrics=['accuracy'])
    
    # tensorflow
    G = tf.Graph()
    with G.as_default():
        x_holder = tf.placeholder(dtype=tf.float32, shape=(2,))
        y_holder = tf.placeholder(dtype=tf.float32, shape=(2,))
        entropy = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
            logits=x_holder, labels=y_holder))
    sess = tf.Session(graph=G)
    
    
    # keras with custom loss function
    def customLoss(target, output):
        # if not from_logits:
        #     # transform back to logits
        #     _epsilon = _to_tensor(epsilon(), output.dtype.base_dtype)
        #     output = tf.clip_by_value(output, _epsilon, 1 - _epsilon)
        #     output = tf.log(output / (1 - output))
        output = tf.log(output / (1 - output))
        return tf.nn.sigmoid_cross_entropy_with_logits(labels=target,
                                                       logits=output)
    model_m = Sequential()
    model_m.add(Dense(units=1, activation='sigmoid', input_shape=(
        1,), weights=[np.ones((1, 1)), np.zeros(1)]))
    # print(model.get_weights())
    model_m.compile(loss=customLoss,
                    optimizer='adam', metrics=['accuracy'])
    
    
    N = 100
    xaxis = np.linspace(10, 20, N)
    keras_ce = np.zeros(N)
    tf_ce = np.zeros(N)
    my_ce = np.zeros(N)
    keras_custom = np.zeros(N)
    
    y = np.array([0, 1])
    for i, a in enumerate(xaxis):
        x = np.array([-a, a])
        # cross-entropy computed by keras/tensorflow
        keras_ce[i] = model.evaluate(x, y)[0]
        my_ce[i] = np.log(1+np.exp(-a))  # My own computation
        tf_ce[i] = sess.run(entropy, feed_dict={x_holder: x, y_holder: y})
        keras_custom[i] = model_m.evaluate(x, y)[0]
    # print(model.get_weights())
    
    plt.plot(xaxis, keras_ce, label='keras')
    plt.plot(xaxis, my_ce, 'b',  label='my_ce')
    plt.plot(xaxis, tf_ce, 'r:', linewidth=5, label='tensorflow')
    plt.plot(xaxis, keras_custom, '--', label='custom loss')
    plt.xlabel('a')
    plt.ylabel('xentropy')
    plt.yscale('log')
    plt.legend()
    plt.savefig('compare.jpg')
    plt.show()
    

    [Figure: compare.jpg — keras, tensorflow, custom loss, and my_ce cross-entropy curves plotted against a on a log scale]