A more efficient approach than iterrows

loops dataframe pandas python

machinery · Tech Community · 4 years ago

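I have a DataFrame df with the columns action and pointerID (the code also reads actionIndex). In the snippet below I iterate over every row, which is very inefficient because the DataFrame is very large. Is there a more efficient way to do this?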

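# df is the DataFrame described above (columns: action, pointerID, actionIndex).
annotated = []      # one gesture number per processed row
curr_pointers = []  # curr_pointers[pointer_id] -> current gesture number of that pointer
count = 1           # next gesture number to hand out
for index, row in df.iterrows():
    action = row["action"]
    pointer_id = row["pointerID"]  # renamed from `id`, which shadows the builtin
    if action == "ACTION_MOVE":
        annotated.append(curr_pointers[pointer_id])
    elif action in ("ACTION_POINTER_DOWN", "ACTION_DOWN"):
        # Only the pointer that actually went down starts a new gesture.
        if row["actionIndex"] != pointer_id:
            continue

        if pointer_id >= len(curr_pointers):
            curr_pointers.append(count)
        else:
            curr_pointers[pointer_id] = count
        annotated.append(count)
        count += 1
    elif action in ("ACTION_POINTER_UP", "ACTION_UP", "ACTION_CANCEL"):
        # Only the pointer that was actually lifted is annotated.
        if row["actionIndex"] != pointer_id:
            continue

        annotated.append(curr_pointers[pointer_id])
    else:
        print("{} unknown".format(action))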
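
One possible way to avoid iterrows here is sketched below. It is not taken from the original post. It rests on these assumptions: each "down" row whose actionIndex equals its pointerID starts a new gesture number, every later row for the same pointerID inherits the most recent number, and pointer IDs first appear in increasing order without gaps (otherwise the append in the loop above would store the number at a different index). The function name annotate_vectorized and the sample frame events are made up for illustration.

```python
import pandas as pd

DOWN_ACTIONS = ["ACTION_DOWN", "ACTION_POINTER_DOWN"]
UP_ACTIONS = ["ACTION_UP", "ACTION_POINTER_UP", "ACTION_CANCEL"]


def annotate_vectorized(df: pd.DataFrame) -> pd.Series:
    """Return the gesture number for every row the loop would have annotated.

    Hypothetical helper: assumes the columns action, pointerID and actionIndex.
    """
    is_move = df["action"] == "ACTION_MOVE"
    is_down = df["action"].isin(DOWN_ACTIONS)
    is_up = df["action"].isin(UP_ACTIONS)
    own_event = df["actionIndex"] == df["pointerID"]

    # A new gesture starts on a "down" row that belongs to its own pointer.
    starts = is_down & own_event

    # Number the starts 1, 2, 3, ... and leave every other row empty (NaN).
    gesture = starts.cumsum().where(starts)

    # Carry the latest gesture number forward within each pointer.
    gesture = gesture.groupby(df["pointerID"]).ffill()

    # Keep the rows the loop appends for: all moves, plus down/up rows
    # whose actionIndex matches their pointerID.
    keep = is_move | ((is_down | is_up) & own_event)
    return gesture[keep].astype("Int64")


# Small made-up sample with two overlapping pointers.
events = pd.DataFrame(
    {
        "action": [
            "ACTION_DOWN", "ACTION_MOVE",
            "ACTION_POINTER_DOWN", "ACTION_POINTER_DOWN",
            "ACTION_MOVE", "ACTION_MOVE",
            "ACTION_POINTER_UP", "ACTION_POINTER_UP",
            "ACTION_UP", "ACTION_DOWN", "ACTION_MOVE",
        ],
        "pointerID": [0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0],
        "actionIndex": [0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0],
    }
)
print(annotate_vectorized(events).tolist())
```

Two differences from the loop are worth noting. A move or up row that appears before any down row for its pointer yields a missing value here, whereas the loop would raise an IndexError. Rows with an unrecognized action are dropped silently instead of being printed. The result keeps the original row index, so it can be assigned back to a column with df.loc if needed.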