
Removing a sequence from data: pandas, python, numpy

slice numpy pandas python-3.x python

Jaffer Wilson Dilip kumar · Tech Community · 5 years ago

I tried the following:

>>> import pandas as pd
>>> import numpy as np
>>> df = pd.read_csv("training.csv")
>>> data_raw = df.values
>>> data = []
>>> seq_len = 5
>>> for index in range(len(data_raw) - seq_len):
...     data.append(data_raw[index: index + seq_len])
...
>>> len(data)
1994
>>> len(data_raw)
1999
>>> del data[0]

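The data is available here: training.csv
I can see that del removes the first element from the list and shifts the remaining values: the value that was at position 1 is now at position 0, and so on.
I want to delete the values at indices 0, 4, 5, 9, 10, 14, and so on.
That is not possible with the del statement as I am using it now, because every deletion shifts the remaining values.
Please help me find the missing piece.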

4 replies | up to 5 years ago

Chris 5 years ago

First, the indices to remove (0, 4, 5, 9, 10, 14, 15, 19, 20, 24, 25, 29, ...) can be generated like this:

indices = []
for i in range(1,401):
    indices.append(5*(i-1))
    indices.append(5*i-1)
del indices[-1] # This is to remove 1999, which is out of index for df
print(indices[:12])
[0, 4, 5, 9, 10, 14, 15, 19, 20, 24, 25, 29]

Then use np.delete:

data_raw = np.random.randint(0, 10, size=(1999, 10))
new_data = np.delete(data_raw, indices, axis=0) # Since this is not inplace op

Verification:

np.array_equal(new_data[:6],data_raw[[1,2,3,6,7,8]])
                                      # Where 0,4,5,9 is removed
# True
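
The same rows can also be dropped without building the index list at all. The sketch below assumes the pattern really is "first and last row of every block of 5" and selects the rows to keep with a boolean mask; `block`, `pos` and `keep` are names introduced here for illustration, not part of the original answer.

```python
import numpy as np

data_raw = np.random.randint(0, 10, size=(1999, 10))
block = 5  # assumed block length, matching the 0, 4, 5, 9, ... pattern

# Position of each row inside its block of 5: 0, 1, 2, 3, 4, 0, 1, ...
pos = np.arange(len(data_raw)) % block

# Keep only rows that are neither first (0) nor last (4) in their block
keep = (pos != 0) & (pos != block - 1)
new_data = data_raw[keep]

print(new_data.shape)  # (1200, 10)
```

Like np.delete, boolean indexing returns a new array and leaves data_raw unchanged.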

Nihal subbu 5 years ago

You can do it like this.

Sample code:

index = [0,4,5,9,10,14]
for i, x in enumerate(index):
    index[i] -= i

print(index)


for i in index:
    del data[i]
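
The subtraction works because every earlier deletion shifts all later elements one place to the left, so the k-th index to delete (counting from 0) has already moved left by k when its turn comes. This only holds if the indices are sorted in ascending order and contain no duplicates. A minimal sketch on a small stand-in list (the `data` and `shifted` values here are illustrative, not the question's actual data):

```python
data = list(range(15))          # stand-in list: values equal original positions
index = [0, 4, 5, 9, 10, 14]    # original positions to delete, ascending

# After k deletions, the element originally at position p sits at p - k
shifted = [p - k for k, p in enumerate(index)]
print(shifted)  # [0, 3, 3, 6, 6, 9]

for i in shifted:
    del data[i]

print(data)  # [1, 2, 3, 6, 7, 8, 11, 12, 13]
```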

yatu Sayali Sonawane 5 years ago

Here is a simple way to get around this:

a = list(range(10))
remove = [0,4,5]

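Say you want to remove the indices listed in remove from a. What you can do is sort the elements of remove in reverse order and then delete them from a in a for loop. Deleting from the highest index down means the positions still to be deleted never shift: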

for i in sorted(remove, reverse=True):
    del a[i]

Output:

[1, 2, 3, 6, 7, 8, 9]
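
If changing `a` in place is not a requirement, a new list can be built in a single pass instead. This is a sketch using the same example values; `drop` and `kept` are illustrative names.

```python
a = list(range(10))
remove = [0, 4, 5]

drop = set(remove)  # a set gives fast membership tests
kept = [x for i, x in enumerate(a) if i not in drop]
print(kept)  # [1, 2, 3, 6, 7, 8, 9]
```

Each del on a list has to shift every element after it, so for long lists with many indices the single pass avoids that repeated work.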

iamklaus 5 years ago

Another approach

a = list(range(10))

print(a)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

to_drop = [0,4,5,9] #indices to drop

values = [a[i] for i in to_drop] # values corresponding to the indices

# list.remove() mutates a in place and returns None, so loop instead of collecting results
for v in values:
    a.remove(v)

print(a)

Output:

[1, 2, 3, 6, 7, 8]
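
One caveat: `list.remove(v)` deletes the first element equal to `v`, not the element at a given index, so removing by value is only safe when the values are unique. A small sketch with made-up data showing the difference (`by_value` and `by_index` are illustrative names):

```python
a = [7, 1, 7, 3]
to_drop = [2]                       # intent: drop the second 7, at index 2

values = [a[i] for i in to_drop]    # [7]
by_value = list(a)
for v in values:
    by_value.remove(v)              # removes the FIRST 7, at index 0
print(by_value)  # [1, 7, 3]

drop = set(to_drop)
by_index = [x for i, x in enumerate(a) if i not in drop]
print(by_index)  # [7, 1, 3]
```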

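I mean remove = [0, 4, 5, 9]: that should be the pattern in the remove list, whether the array holds 10 values or more. How can it be created dynamically?

This is for an array of 100 values. It generates the indices that need to be removed for a batch size of 10. Do correct me if I have misunderstood.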

to_drop = [[j+(i*10) for j in [0,4,5,9]] for i in range(10)]

O/P

[[0, 4, 5, 9],
 [10, 14, 15, 19],
 [20, 24, 25, 29],
 [30, 34, 35, 39],
 [40, 44, 45, 49],
 [50, 54, 55, 59],
 [60, 64, 65, 69],
 [70, 74, 75, 79],
 [80, 84, 85, 89],
 [90, 94, 95, 99]]
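
To apply these batches to a sequence of any length, the nested list can be flattened and clipped to the valid range. The sketch below assumes the pattern repeats every `batch` elements and that `offsets` is ascending and smaller than `batch`; `drop_indices`, `n`, `batch` and `offsets` are hypothetical names introduced here.

```python
def drop_indices(n, batch=10, offsets=(0, 4, 5, 9)):
    """Return a flat list of indices to drop for a sequence of length n."""
    return [
        start + off
        for start in range(0, n, batch)   # first index of each batch
        for off in offsets
        if start + off < n                # skip indices past the end
    ]

a = list(range(25))
to_drop = set(drop_indices(len(a)))
print(sorted(to_drop))  # [0, 4, 5, 9, 10, 14, 15, 19, 20, 24]

kept = [x for i, x in enumerate(a) if i not in to_drop]
print(kept)  # [1, 2, 3, 6, 7, 8, 11, 12, 13, 16, 17, 18, 21, 22, 23]
```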
