Getting values from a DataFrame

pandas python-2.7 python

Subhajit Kundu · 6 years ago

I am using Python 2.7 and pandas 0.20.3.

Consider the following simple DataFrame code.

import pandas as pd

days = ["Mon", "Tue", "Thu", "Fri"]
years = [2000, 2001, 2002, 2003, 2004]

x = [ #  Mon     Tue     Thu     Fri
        [26.16,  27.16,  25.69,  22.81],    # 2000   Row 1
        [20.75,  21.32,  18.20,  16.08],    # 2001   Row 2
        [16.42,  18.32,  18.59,  18.02],    # 2002   Row 3
        [14.56,  14.32,  13.85,  13.20],    # 2003   Row 4
        [21.02,  20.32,  20.78,  19.90]     # 2004   Row 5
    ] # Col1     Col2    Col3    Col4

df = pd.DataFrame(x, columns = days, index = years)
print df

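I need help with the following:

How do I print the fourth column from the start, i.e. the one corresponding to Fri?
How do I print the second row from the end, i.e. the one corresponding to 2003?
How do I iterate over the data in every cell in row-major order?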

1 answer

jezrael · 6 years ago

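Use iloc to select by position and iterrows to loop over the rows. There may be a faster vectorized alternative for your actual problem, though, because iterrows is really slow: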

# Python counts from 0, so the 4th column is at position 3
a = df.iloc[:, 3]
print (a)
2000    22.81
2001    16.08
2002    18.02
2003    13.20
2004    19.90
Name: Fri, dtype: float64

# second row from the end
b = df.iloc[-2]
print (b)
Mon    14.56
Tue    14.32
Thu    13.85
Fri    13.20
Name: 2003, dtype: float64

for i, row in df.iterrows():
    print (i)
    print (row)
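
The loop above yields one whole row (a Series) per step rather than one cell. A minimal sketch of visiting every cell in row-major order follows; it rebuilds the question's DataFrame so it runs on its own, and the names cells and flat are illustrative, not part of the original answer. It assumes all columns share one numeric dtype, as they do here.

```python
import pandas as pd

days = ["Mon", "Tue", "Thu", "Fri"]
years = [2000, 2001, 2002, 2003, 2004]
x = [
    [26.16, 27.16, 25.69, 22.81],
    [20.75, 21.32, 18.20, 16.08],
    [16.42, 18.32, 18.59, 18.02],
    [14.56, 14.32, 13.85, 13.20],
    [21.02, 20.32, 20.78, 19.90],
]
df = pd.DataFrame(x, columns=days, index=years)

# Explicit nested loop: outer loop over rows, inner loop over the
# columns of that row, so cells are visited in row-major order.
cells = []  # illustrative name: (row label, column label, value) triples
for year, row in zip(df.index, df.values):
    for day, value in zip(df.columns, row):
        cells.append((year, day, value))

# If only the values are needed, flattening the underlying NumPy array
# avoids the Python-level loop; ravel() reads in row-major (C) order
# by default.
flat = df.values.ravel()
```

If the labels are not needed, the flattened array is usually the cheaper option, since the nested loop runs entirely in Python.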

If you want to select by column/index label instead:

a1 = df['Fri']
print (a1)
2000    22.81
2001    16.08
2002    18.02
2003    13.20
2004    19.90
Name: Fri, dtype: float64

b1 = df.loc[2003]
print (b1)
Mon    14.56
Tue    14.32
Thu    13.85
Fri    13.20
Name: 2003, dtype: float64
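
The same two styles of selection (by label with loc, by position with iloc) can also be combined into a row/column pair to fetch a single cell. The sketch below is an addition, not part of the original answer; it rebuilds the question's DataFrame so it is self-contained, and the variable names are illustrative.

```python
import pandas as pd

days = ["Mon", "Tue", "Thu", "Fri"]
years = [2000, 2001, 2002, 2003, 2004]
x = [
    [26.16, 27.16, 25.69, 22.81],
    [20.75, 21.32, 18.20, 16.08],
    [16.42, 18.32, 18.59, 18.02],
    [14.56, 14.32, 13.85, 13.20],
    [21.02, 20.32, 20.78, 19.90],
]
df = pd.DataFrame(x, columns=days, index=years)

# By label: row 2003, column "Fri"
by_label = df.loc[2003, "Fri"]

# By position: second row from the end, fourth column from the start
by_position = df.iloc[-2, 3]

# at / iat are the scalar-only counterparts of loc / iloc
scalar_by_label = df.at[2003, "Fri"]
scalar_by_position = df.iat[3, 3]
```

All four expressions refer to the same cell, the Fri value for 2003.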