TensorFlow: strange behavior with string_split

tensorflow-datasets tensorflow python

anon_swe · Tech Community · 6 years ago

I'm using the tf.data.Dataset API in TensorFlow. I created a Dataset object like this:

dataset = tf.data.Dataset.from_tensor_slices((data, labels))
dataset = dataset.map(lambda x, y: ({'review': x}, y))

So now my dataset consists of tuples in which the first element is a dictionary and the second element is an array of strings.

I'm trying to do some basic string preprocessing with the following function:

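def preprocess(x, y):
    # split on whitespace
    logger.info(type(x))
    logger.info(type(y))
    x['sequence'] = tf.string_split([x['review']])
    # log the result of the split rather than the raw input string
    logger.info(x['sequence'])
    return x, y

The last logging statement tells me that x['sequence'] is: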

SparseTensor(indices=Tensor("StringSplit:0", shape=(?, 2), dtype=int64), values=Tensor("StringSplit:1", shape=(?,), dtype=string), dense_shape=Tensor("StringSplit:2", shape=(2,), dtype=int64))

Why does indices have shape (?, 2)? Shouldn't string_split just split on whitespace and return a Tensor whose shape matches the result (or at least the maximum length needed)?

Thanks!
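
For reference, here is a minimal plain-Python sketch of the layout that the logged SparseTensor describes. It is an illustration only, not TensorFlow's implementation: split_to_sparse and to_dense are hypothetical helper names, and the two input strings are made up. The point it tries to show is that a sparse result stores one (row, column) coordinate per token, so indices has one row per token and exactly two columns, while the padded shape the question expects (number of strings by maximum token count) lives in dense_shape.

```python
def split_to_sparse(strings):
    """Mimic a 2-D sparse split result: coordinates, tokens, padded shape.

    Hypothetical helper for illustration; not a TensorFlow function.
    """
    indices, values = [], []
    for row, text in enumerate(strings):
        for col, token in enumerate(text.split()):
            indices.append([row, col])  # one (row, col) pair per token
            values.append(token)
    # The "maximum length needed" is recorded here, not in indices.
    max_len = max((len(s.split()) for s in strings), default=0)
    dense_shape = [len(strings), max_len]
    return indices, values, dense_shape


def to_dense(indices, values, dense_shape, default=""):
    """Expand the sparse triple into a padded list of lists."""
    rows, cols = dense_shape
    dense = [[default] * cols for _ in range(rows)]
    for (r, c), v in zip(indices, values):
        dense[r][c] = v
    return dense


indices, values, dense_shape = split_to_sparse(["a good movie", "bad"])
print(indices)      # [[0, 0], [0, 1], [0, 2], [1, 0]]
print(values)       # ['a', 'good', 'movie', 'bad']
print(dense_shape)  # [2, 3]
print(to_dense(indices, values, dense_shape))
# [['a', 'good', 'movie'], ['bad', '', '']]
```

Under this reading, the ? in (?, 2) is the total number of tokens, which is only known at run time, and the 2 is the rank of the result. In TensorFlow itself, a dense padded tensor can be obtained from a SparseTensor with tf.sparse_tensor_to_dense (tf.sparse.to_dense in later releases); which name applies depends on the installed version.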

0 replies | as of 6 years ago
