
Using python3 in Datalab, I am unable to feed a list of strings representing files in a Google Cloud Storage bucket as a feed_dict with TensorFlow

google-cloud-platform tensorflow

netskink · 7 years ago

I am new to tf and have run into a problem processing some files. Here is an excerpt of the code.

xlabel_to_files_list_map['dog_bark'] # subset of data with two files

# result
['gs://some_bucket/some_dir/data/dog_bark/100652.mp3', 'gs://some_bucket/some_dir/dog_bark/100795.mp3']

Here, I just want to process these strings with a simple graph:

file_to_process = tf.placeholder(tf.string)

audio_binary_remote = tf.gfile.Open(file_to_process, 'rb').read()

waveform = tf.contrib.ffmpeg.decode_audio(audio_binary_remote, file_format='mp3', samples_per_second=44100, channel_count=2)


with tf.Session() as sess:
    result = sess.run(waveform, feed_dict={
        file_to_process: xlabel_to_files_list_map['dog_bark']
    })
    #print (result)

This results in:

TypeError: Expected binary or unicode string, got <tf.Tensor 'Placeholder_9:0' shape=<unknown> dtype=string>

FWIW, this works:

a_string = tf.placeholder(tf.string) 
z = a_string 
with tf.Session() as sess: 
    result = sess.run(z, feed_dict={a_string: ['one', 'two', 'three']}) 
print(result)

This results in:

['one' 'two' 'three']

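The simple example that works feeds a list of strings. The more complex example feeds the value part of a hash map, which is also a list of strings. I don't know why the first example does not work the way the second one does.

Another approach

I tried another approach. This time I tried to build a list of result tensors and then evaluate that list. This also failed. It did not raise an error; it just returned empty results.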

waveform_tensor_list = []
for a_file in dir_to_selected_files_list_map['gs://some_bucket/some_dir/dog_bark/']:
  #print (a_file)
  waveform = tf.contrib.ffmpeg.decode_audio(a_file, file_format='mp3', samples_per_second=44100, channel_count=2)
  waveform_tensor_list.append(waveform)

The output of this cell looks wrong right away, even though it is well-formed:

waveform_tensor_list

Results in:

[<tf.Tensor 'DecodeAudioV2_7:0' shape=(?, 2) dtype=float32>,
 <tf.Tensor 'DecodeAudioV2_8:0' shape=(?, 2) dtype=float32>,
 stuff deleted,
 <tf.Tensor 'DecodeAudioV2_22:0' shape=(?, 2) dtype=float32>,
 <tf.Tensor 'DecodeAudioV2_23:0' shape=(?, 2) dtype=float32>]

Then, to evaluate the graph:

with tf.Session() as sess:
  result = sess.run(waveform_tensor_list)
  print (result)

Where the output of this cell is:

[array([], shape=(0, 0), dtype=float32), array([], shape=(0, 0), dtype=float32), array([], shape=(0, 0), dtype=float32), array([], shape=(0, 0), dtype=float32), array([], shape=(0, 0), dtype=float32), array([], shape=(0, 0), dtype=float32), array([], shape=(0, 0), dtype=float32), array([], shape=(0, 0), dtype=float32), array([], shape=(0, 0), dtype=float32), array([], shape=(0, 0), dtype=float32), array([], shape=(0, 0), dtype=float32), array([], shape=(0, 0), dtype=float32), array([], shape=(0, 0), dtype=float32), array([], shape=(0, 0), dtype=float32), array([], shape=(0, 0), dtype=float32), array([], shape=(0, 0), dtype=float32), array([], shape=(0, 0), dtype=float32)]

2 answers | latest 6 years ago

netskink 6 years ago

tf.gfile.Open is not a TensorFlow operation. In other words, it does not add an operation to the graph that opens a file.

tf.gfile.Open is an alias for the class tf.gfile.GFile. So the line tf.gfile.Open(<foo>) is calling tf.gfile.GFile.__init__, which requires its first argument to be a Python string, not a tf.Tensor of strings (which is what tf.placeholder(tf.string) returns).
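To make the distinction concrete without needing TensorFlow, here is a minimal sketch of the same failure mode. `FakePlaceholder` and `open_path` are hypothetical stand-ins invented for this illustration (they are not TensorFlow APIs): one plays the role of a symbolic value that has no contents yet, the other plays the role of a constructor that runs immediately in Python and therefore needs a real string.

```python
class FakePlaceholder:
    """Hypothetical stand-in for a symbolic tensor: it has a name but no value yet."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return "<FakePlaceholder %r>" % self.name


def open_path(path):
    """Hypothetical stand-in for a constructor that runs in Python right away.

    It needs a concrete string at call time, so a symbolic value is rejected.
    """
    if not isinstance(path, (str, bytes)):
        raise TypeError("Expected binary or unicode string, got %r" % (path,))
    if isinstance(path, bytes):
        path = path.decode("utf-8")
    return "opened " + path


file_to_process = FakePlaceholder("file_to_process")

try:
    # Fails: the placeholder is only a promise of a future value.
    open_path(file_to_process)
except TypeError as err:
    message = str(err)

# Works: a concrete Python string is available when the call happens.
opened = open_path("gs://some_bucket/some_dir/dog_bark/100795.mp3")
```

The value fed through feed_dict only exists while the session is running, which is after the Python-level call has already been made.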

There are a few options here:

Feed the contents of the file

# Placeholder for the raw bytes of one audio file (not its path).
raw_data = tf.placeholder(tf.string)
waveform = tf.contrib.ffmpeg.decode_audio(raw_data, file_format='mp3', samples_per_second=44100, channel_count=2)

with tf.Session() as sess:
    for filename in xlabel_to_files_list_map['dog_bark']:
        # Read the file in Python, then feed its contents to the graph.
        result = sess.run(waveform, feed_dict={raw_data: tf.gfile.GFile(filename, 'rb').read()})
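The reading half of this option can be tried on its own. The sketch below is an illustration only: `iter_file_contents` is a hypothetical helper name, and it uses the built-in `open` on local temporary files so that it runs without TensorFlow or a bucket.

```python
import os
import tempfile


def iter_file_contents(paths, opener=open):
    """Yield (path, raw_bytes) pairs, reading each file in plain Python.

    `opener` is an assumption of this sketch: any callable that takes
    (path, mode) and returns a context-managed file-like object.
    """
    for path in paths:
        with opener(path, "rb") as handle:
            yield path, handle.read()


# Demo data: two small local files standing in for the audio files.
tmp_dir = tempfile.mkdtemp()
paths = []
for name, payload in [("a.bin", b"first"), ("b.bin", b"second")]:
    path = os.path.join(tmp_dir, name)
    with open(path, "wb") as handle:
        handle.write(payload)
    paths.append(path)

# Each element is the bytes object that would be fed for one file.
contents = [data for _, data in iter_file_contents(paths)]
```

In the TensorFlow 1.x setting of this thread, passing `opener=tf.gfile.GFile` should let the same loop read the gs:// paths, with each `data` value going into `feed_dict={raw_data: data}`.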

Open and read the file in the graph

(using tf.data classes to set up "input processing")

filenames = xlabel_to_files_list_map['dog_bark']
# tf.read_file is a graph operation, so each file is read when the graph runs.
dataset = tf.data.Dataset.from_tensor_slices(filenames).map(tf.read_file)

raw_data = dataset.make_one_shot_iterator().get_next()
waveform = tf.contrib.ffmpeg.decode_audio(raw_data, file_format='mp3', samples_per_second=44100, channel_count=2)

with tf.Session() as sess:
    # Each run pulls the next file from the one-shot iterator.
    for _ in filenames:
        result = sess.run(waveform)
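As a rough plain-Python analogy for why the loop calls sess.run once per filename: a one-shot iterator hands out one element per pull and is then exhausted. `fake_read_file` below is a hypothetical stand-in for the file-reading step, and the lazy `map` only mimics the pipeline; it is not how tf.data is implemented.

```python
def fake_read_file(path):
    """Hypothetical stand-in for the read step; returns made-up bytes for a path."""
    return ("contents of " + path).encode("utf-8")


filenames = ["dog_bark/100652.mp3", "dog_bark/100795.mp3"]

# Lazy pipeline: nothing is "read" until an element is requested.
pipeline = map(fake_read_file, filenames)

# One pull per file, mirroring one sess.run(waveform) per filename.
results = [next(pipeline) for _ in filenames]

# A further pull finds the pipeline exhausted.
try:
    next(pipeline)
    exhausted = False
except StopIteration:
    exhausted = True
```

In TensorFlow 1.x, running get_next() past the end of a one-shot iterator raises tf.errors.OutOfRangeError rather than StopIteration.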

Use eager execution

(see the Research and Experimentation section of the TensorFlow getting started guide)

This may help reduce the confusion between what happens in the graph and what happens in Python.

tf.enable_eager_execution()
for filename in xlabel_to_files_list_map['dog_bark']:
    result = tf.contrib.ffmpeg.decode_audio(tf.gfile.GFile(filename, 'rb').read(), file_format='mp3', samples_per_second=44100, channel_count=2)

Hope that helps!

MVanOrder 7 years ago

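I haven't used TensorFlow, but according to the documentation, tf.gfile.Open creates a GFile object. tf.contrib.ffmpeg.decode_audio expects a binary or unicode string. Since GFile has a read() function, I would try using the stream: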

waveform = tf.contrib.ffmpeg.decode_audio(audio_binary_remote.read(), file_format='mp3', samples_per_second=44100, channel_count=2)