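This won't necessarily solve your dask issue, but as a faster alternative to munge, you can use numpy's stride_tricks to create a rolling view into your data (based on the example here).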
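def munge_strides(data, backprop_window):
    """ take a rolling view into array by manipulating strides """
    from numpy.lib.stride_tricks import as_strided
    # one window per starting row; each window holds backprop_window full rows
    new_shape = (data.shape[0] - backprop_window,
                 backprop_window,
                 data.shape[1])
    # stepping to the next window and to the next row within a window
    # both advance by one row of the original array
    new_strides = (data.strides[0], data.strides[0], data.strides[1])
    return as_strided(data, shape=new_shape, strides=new_strides)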
import numpy as np

X_train = np.arange(100).reshape(20, 5)
np.array_equal(munge(X_train, backprop_window=3),
               munge_strides(X_train, backprop_window=3))
Out[112]: True
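The speedup comes from the fact that no data is copied: the result is a view onto the original buffer. A minimal sketch to check this, assuming the same 20x5 array and window of 3 as above (the variable names here are only for illustration):

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

data = np.arange(100).reshape(20, 5)
window = 3

# Same shape/stride arithmetic as munge_strides above.
view = as_strided(
    data,
    shape=(data.shape[0] - window, window, data.shape[1]),
    strides=(data.strides[0], data.strides[0], data.strides[1]),
)

# The view reuses the memory of `data` instead of copying it.
shares = np.shares_memory(view, data)

# Window i is simply rows i .. i + window - 1 of the original array.
first_window_matches = np.array_equal(view[0], data[0:window])
```

Because the windows overlap in memory, writing to such a view modifies several windows at once, so it is best treated as read-only.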
as_strided needs to be used very carefully: it is an "advanced" feature, and incorrect arguments can easily lead to a segfault. See the docstring.
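If your NumPy is recent enough (1.20 or later), sliding_window_view from the same module may be a safer way to get the same result, since it checks the window against the array shape and returns a read-only view by default. This is only a sketch: munge_sliding is a name made up for this example, and the slicing and transpose are there just to reproduce the output shape of munge_strides above.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def munge_sliding(data, backprop_window):
    """Rolling view with the same layout as munge_strides (hypothetical helper)."""
    # Windows along axis 0; the window axis is appended last, giving
    # shape (n_rows - backprop_window + 1, n_cols, backprop_window).
    windows = sliding_window_view(data, window_shape=backprop_window, axis=0)
    # Drop the final window and move the window axis to the middle to get
    # (n_rows - backprop_window, backprop_window, n_cols).
    return windows[:-1].transpose(0, 2, 1)


X_train = np.arange(100).reshape(20, 5)
rolled = munge_sliding(X_train, backprop_window=3)
```

Like the as_strided version, this returns a view rather than a copy, so it should be comparably cheap.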