
Setting with enlargement: how to add a one-row DataFrame to another DataFrame

dataframe pandas python

nico9T · 7 years ago

I want to create an empty DataFrame and then append other one-row DataFrames to it, using "Setting With Enlargement" for efficient appending.

import numpy as np
import pandas as pd
from datetime import datetime
from pandas import DataFrame

df = DataFrame(columns=["open","high","low","close","volume","open_interest"])

row_one = DataFrame({"open":10,"high":11,"low":9,"close":10,"volume":100,"open_interest":np.nan}, index = [datetime(2017,1,1)])
row_two = DataFrame({"open":9,"high":12,"low":8,"close":10.50,"volume":500,"open_interest":np.nan}, index = [datetime(2017,1,2)])

df[row_one.index] = row_one.columns

The last line fails with this error:

"DatetimeIndex(['2017-01-01'], dtype='datetime64[ns]', freq=None) not in index"

What am I doing wrong?

2 answers | latest 3 years ago

jezrael · 7 years ago

You need loc for setting with enlargement. Select the index value with [0] to get a scalar label, and "convert" row_one to a Series with iloc:

df.loc[row_one.index[0]] = row_one.iloc[0]
print (df)
            open  high  low  close  volume  open_interest
2017-01-01  10.0  11.0  9.0   10.0   100.0            NaN
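The difference between `df[...]` and `df.loc[...]` is what caused the error in the question: `df[key]` looks for a column named `key`, while `df.loc[key]` addresses a row and creates it if the label is missing. Here is a minimal sketch of enlarging row by row, assuming the rows arrive as plain lists keyed by timestamp (the `rows` dict is illustrative, with values taken from the question):

```python
import numpy as np
import pandas as pd

columns = ["open", "high", "low", "close", "volume", "open_interest"]
df = pd.DataFrame(columns=columns)

# Illustrative input: one list of values per timestamp.
rows = {
    pd.Timestamp("2017-01-01"): [10, 11, 9, 10, 100, np.nan],
    pd.Timestamp("2017-01-02"): [9, 12, 8, 10.5, 500, np.nan],
}

for timestamp, values in rows.items():
    # .loc with a new row label appends the row (setting with enlargement).
    df.loc[timestamp] = values

print(df)
```

Each enlargement builds a new underlying table, so this is convenient for a handful of rows but tends to get slow for many.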

But it is better to use concat, especially with multiple DataFrames:

df = pd.concat([row_one, row_two])
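Since `pd.concat` copies all the data on every call, it usually pays to collect the one-row frames in a list and concatenate once. A self-contained sketch, where `make_row` is a hypothetical helper that is not part of pandas:

```python
import numpy as np
import pandas as pd

columns = ["open", "high", "low", "close", "volume", "open_interest"]

def make_row(timestamp, values):
    """Hypothetical helper: build a one-row DataFrame indexed by `timestamp`."""
    return pd.DataFrame([values], columns=columns, index=[timestamp])

# Collect the one-row frames first ...
frames = [
    make_row(pd.Timestamp("2017-01-01"), [10, 11, 9, 10, 100, np.nan]),
    make_row(pd.Timestamp("2017-01-02"), [9, 12, 8, 10.5, 500, np.nan]),
]

# ... then concatenate them in a single call.
df = pd.concat(frames)
print(df)
```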

Bill · 3 years ago

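I'm getting new data from an event, so I only need to add a one-row DataFrame each time.

Do you need the complete, updated DataFrame at every iteration?

If not, do the following: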

new_row_data = {'open': 10.0,
 'high': 11.0,
 'low': 9.0,
 'close': 10.0,
 'volume': 100.0,
 'open_interest': np.nan}
new_row_index = pd.Timestamp('2017-01-01 00:00:00')

index = []
records = []
for _ in range(500):
    index.append(new_row_index)
    records.append(new_row_data)  # add new data here

# Create dataframe at the end
df = pd.DataFrame.from_records(records, index=index)

(The code above takes about 2.4 ms.)
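In an event-driven setting, the same idea might look like the sketch below. The `on_event` callback and the sample values are assumptions made for illustration; the point is that the callback only appends to Python lists and the DataFrame is built once at the end.

```python
import numpy as np
import pandas as pd

index = []    # one timestamp per received row
records = []  # one dict per received row

def on_event(timestamp, data):
    """Hypothetical event callback: only store the row, no pandas work here."""
    index.append(timestamp)
    records.append(data)

# Simulate three events with made-up values.
for day in range(1, 4):
    on_event(
        pd.Timestamp(2017, 1, day),
        {"open": 10.0, "high": 11.0, "low": 9.0, "close": 10.0 + day,
         "volume": 100.0 * day, "open_interest": np.nan},
    )

# Build the DataFrame once, after all events have arrived.
df = pd.DataFrame.from_records(records, index=pd.DatetimeIndex(index))
print(df)
```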

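# If you do need the updated DataFrame at every iteration, preallocate a
# buffer of empty rows and fill it in place, growing it when it runs out.
buffer_size = 100  # adjust to your needs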
data_columns = ["open","high","low","close","volume","open_interest"]
all_columns = ['DateTime'] + data_columns  # Add column for datetimes
df_empty = pd.DataFrame(None, index=range(buffer_size),
                        columns=all_columns)
# Note: You might want to specify dtypes above rather than np.nans

df = df_empty.copy()
index = 0
for _ in range(500):
    df.loc[index, 'DateTime'] = new_row_index
    df.loc[index, data_columns] = new_row_data  # add new data here
    # Updated dataframe if you need it:
    #print(df.loc[:index])

    index += 1
    while index >= len(df):
        df = pd.concat([df, df_empty.reindex(range(index, index + buffer_size))])

# To remove the integer index use:
df = df.loc[:index-1].set_index('DateTime', drop=True)

(The code above takes about 540 ms.)
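The note in the code about specifying dtypes could be handled roughly as follows. This is only a sketch under assumptions: `make_buffer` is a hypothetical helper, the buffer size is kept tiny so that growth is exercised, and no timing claim is made for it.

```python
import numpy as np
import pandas as pd

buffer_size = 4  # deliberately small so the buffer has to grow once
data_columns = ["open", "high", "low", "close", "volume", "open_interest"]

def make_buffer(start, size):
    """Hypothetical helper: `size` empty rows labelled start..start+size-1."""
    idx = pd.RangeIndex(start, start + size)
    buf = pd.DataFrame(np.nan, index=idx, columns=data_columns, dtype="float64")
    # Typed datetime column instead of a generic object column.
    buf.insert(0, "DateTime", pd.Series(pd.NaT, index=idx, dtype="datetime64[ns]"))
    return buf

df = make_buffer(0, buffer_size)
n = 0  # number of rows filled so far
for day in range(1, 7):
    if n >= len(df):
        # Buffer is full: append another block of empty rows.
        df = pd.concat([df, make_buffer(len(df), buffer_size)])
    row = {"open": 10.0, "high": 11.0, "low": 9.0, "close": 10.0,
           "volume": 100.0 * day, "open_interest": np.nan}
    df.loc[n, "DateTime"] = pd.Timestamp(2017, 1, day)
    df.loc[n, data_columns] = [row[c] for c in data_columns]
    n += 1

# Keep only the filled rows and use the timestamps as the index.
result = df.iloc[:n].set_index("DateTime")
print(result)
```

With fixed dtypes the numeric columns stay `float64` rather than `object`, which keeps later arithmetic on them straightforward.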

concat or append
