RNN slowdown in TensorFlow

  •  0
  • sdr2002  · Stack Overflow  · 7 years ago

    I am not sure whether this is actually the case, so I am leaving this post on SO. Here is toy code that reproduces the issue:

    import tensorflow as tf
    import numpy as np
    import time
    
    def network(input_list):
        input,init_hidden_c,init_hidden_m = input_list
        cell = tf.nn.rnn_cell.BasicLSTMCell(256, state_is_tuple=True)
        init_hidden = tf.nn.rnn_cell.LSTMStateTuple(init_hidden_c, init_hidden_m)
        states, hidden_cm = tf.nn.dynamic_rnn(cell, input, dtype=tf.float32, initial_state=init_hidden)
        net = [v for v in tf.trainable_variables()]
        return states, hidden_cm, net
    
    def action(x, h_c, h_m):
        t0 = time.time()
        # NOTE: slicing rnn_states here creates a new StridedSlice op in the
        # graph on every call, so the graph grows and each run gets slower.
        outputs, output_h = sess.run([rnn_states[:,-1:,:], rnn_hidden_cm], feed_dict={
            rnn_input:x,
            rnn_init_hidden_c: h_c,
            rnn_init_hidden_m: h_m
        })
        dt = time.time() - t0
        return outputs, output_h, dt
    
    rnn_input = tf.placeholder("float", [None, None, 512])
    rnn_init_hidden_c = tf.placeholder("float", [None,256])
    rnn_init_hidden_m = tf.placeholder("float", [None,256])
    rnn_input_list = [rnn_input, rnn_init_hidden_c, rnn_init_hidden_m]
    rnn_states, rnn_hidden_cm, rnn_net = network(rnn_input_list)
    
    feed_input = np.random.uniform(low=-1.,high=1.,size=(1,1,512))
    feed_init_hidden_c = np.zeros(shape=(1,256))
    feed_init_hidden_m = np.zeros(shape=(1,256))
    
    sess = tf.Session()
    sess.run(tf.global_variables_initializer())
    for i in range(10000):
        _, output_hidden_cm, deltat = action(feed_input, feed_init_hidden_c, feed_init_hidden_m)
        if i % 10 == 0:
            print('Running time: ' + str(deltat))
        (feed_init_hidden_c, feed_init_hidden_m) = output_hidden_cm
        feed_input = np.random.uniform(low=-1.,high=1.,size=(1,1,512))
    

    [Not important] What this code does: it generates outputs from the "network()" function, which contains an LSTM. The time dimension of the input is 1, so the output's is too, and at every step the hidden state of the previous run is fetched and fed back in as the initial state of the next run.

    What I cannot figure out is why the running time grows exponentially when [:,-1:,:] is part of the fetch. Is this some undocumented property of TensorFlow that happens to be especially slow (perhaps it keeps adding more nodes to the graph by itself)? Thank you, and I hope this post keeps other users from making the same mistake.
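    The growth pattern can be reproduced without TensorFlow at all. The following toy sketch (all names are illustrative, not TensorFlow APIs) models what the question suspects: every run that slices a graph tensor appends a fresh op to a shared graph, so the graph, and the per-call bookkeeping, grows with every iteration:

    ```python
    # Toy model of the suspected bug (illustrative only; not TensorFlow API).
    # Slicing a graph tensor inside the run loop adds one new op per call,
    # just like rnn_states[:,-1:,:] inside sess.run() does.

    class ToyGraph:
        def __init__(self):
            self.ops = []

        def add_op(self, name):
            self.ops.append(name)

    graph = ToyGraph()

    def run_with_slice(step):
        # mimics sess.run([rnn_states[:,-1:,:], ...]): a fresh slice op each call
        graph.add_op("strided_slice_%d" % step)
        return len(graph.ops)

    def run_plain(step):
        # mimics fetching rnn_states directly: no new ops are created
        return len(graph.ops)

    growth = [run_with_slice(i) for i in range(5)]
    print(growth)   # graph keeps growing: [1, 2, 3, 4, 5]
    stable = [run_plain(i) for i in range(5)]
    print(stable)   # graph size stays fixed: [5, 5, 5, 5, 5]
    ```

    In real TF 1.x code, calling `sess.graph.finalize()` after graph construction makes TensorFlow raise a RuntimeError the moment anything tries to add an op inside the loop, which makes this class of bug easy to catch.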

    2 Answers  |  up to 7 years ago
        1
  •  2
  •   Björn Mattsson    6 years ago

    I ran into the same problem of TensorFlow slowing down with every sess.run() call and found this question while debugging. Here is a short description of my situation and how I solved it, for future reference. Hopefully it points people in the right direction and saves them some time.

    In my case, the issue was mainly that I did not take advantage of feed_dict to supply the network state when executing sess.run(). Instead, I redeclared outputs, final_state and prediction on every iteration. The answer at https://github.com/tensorflow/tensorflow/issues/1439#issuecomment-194405649 made me realize what was happening. My old (slow) loop looked like this:

    # defining the network
    lstm_layer = rnn.BasicLSTMCell(num_units, forget_bias=1)
    outputs, final_state = rnn.static_rnn(lstm_layer, input, initial_state=rnn_state, dtype='float32')
    prediction = tf.nn.softmax(tf.matmul(outputs[-1], out_weights)+out_bias)
    
    for input_data in data_seq:
        # redeclaring, stupid stupid...
        outputs, final_state = rnn.static_rnn(lstm_layer, input, initial_state=rnn_state, dtype='float32')
        prediction = tf.nn.softmax(tf.matmul(outputs[-1], out_weights)+out_bias)
        p, rnn_state = sess.run((prediction, final_state), feed_dict={x: input_data})
    

    The solution, of course, was to declare the nodes only once at the start and supply the new data with feed_dict. The code went from being semi-slow (> 15 ms at the start) and growing slower with every iteration, to executing each iteration in about 1 ms. My new code looks like this:

    out_weights = tf.Variable(tf.random_normal([num_units, n_classes]), name="out_weights")
    out_bias = tf.Variable(tf.random_normal([n_classes]), name="out_bias")
    
    # placeholder for the network state
    state_placeholder = tf.placeholder(tf.float32, [2, 1, num_units])
    rnn_state = tf.nn.rnn_cell.LSTMStateTuple(state_placeholder[0], state_placeholder[1])
    
    x = tf.placeholder('float', [None, 1, n_input])
    input = tf.unstack(x, 1, 1)
    
    # defining the network
    lstm_layer = rnn.BasicLSTMCell(num_units, forget_bias=1)
    outputs, final_state = rnn.static_rnn(lstm_layer, input, initial_state=rnn_state, dtype='float32')
    
    prediction = tf.nn.softmax(tf.matmul(outputs[-1], out_weights)+out_bias)
    
    # actual network state, which we input with feed_dict
    _rnn_state = tf.nn.rnn_cell.LSTMStateTuple(np.zeros((1, num_units), dtype='float32'), np.zeros((1, num_units), dtype='float32'))
    
    it = 0
    for input_data in data_seq:
        encl_input = [[input_data]]
        p, _rnn_state = sess.run((prediction, final_state), feed_dict={x: encl_input, rnn_state: _rnn_state})
        print("{} - {}".format(it, p))
        it += 1
    

    Moving the declarations out of the for loop also solves sdr2002's problem above, where the slice outputs[-1] was being executed inside sess.run() in the for loop.

        2
  •  0
  •   sdr2002    7 years ago

    Fixed version of action(): fetch the full rnn_states tensor and slice the returned NumPy array after sess.run(), so no new ops are added to the graph:

    def action(x, h_c, h_m):
        t0 = time.time()
        outputs, output_h = sess.run([rnn_states, rnn_hidden_cm], feed_dict={
            rnn_input:x,
            rnn_init_hidden_c: h_c,
            rnn_init_hidden_m: h_m
        })
        outputs = outputs[:,-1:,:]  # NumPy slice on the fetched array, not a graph op
        dt = time.time() - t0
        return outputs, output_h, dt
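    The same slicing can be checked in plain NumPy, since after sess.run() returns, outputs is an ordinary array. A minimal sketch (the time dimension of 7 here is illustrative; the question uses 1):

    ```python
    import numpy as np

    # After sess.run() the fetched value is a plain NumPy array, so slicing it
    # is cheap and touches no graph state. Shape is (batch, time, units),
    # with 256 LSTM units as in the question.
    outputs = np.random.uniform(low=-1.0, high=1.0, size=(1, 7, 256))

    last_step = outputs[:, -1:, :]   # keep only the final time step
    print(last_step.shape)           # (1, 1, 256): time axis kept with length 1
    ```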