Although I was able to work around this problem, I would like to understand why the error occurs.
The DataFrame:
import pandas as pd
import itertools

sl_df = pd.DataFrame(
    data=list(range(18)),
    index=pd.MultiIndex.from_tuples(
        list(itertools.product(
            ['A', 'B', 'C'],
            ['I', 'II', 'III'],
            ['x', 'y']))),
    columns=['one'])
Out:
         one
A I   x    0
      y    1
  II  x    2
      y    3
  III x    4
      y    5
B I   x    6
      y    7
  II  x    8
      y    9
  III x   10
      y   11
C I   x   12
      y   13
  II  x   14
      y   15
  III x   16
      y   17
A simple slice:
sl_df.loc[pd.IndexSlice['A',:,'x']]
Out:
         one
A I   x    0
  II  x    2
  III x    4
The part that raises the error:
sl_df.loc[pd.IndexSlice[:,'II']]
Out:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-6-4bfd2d65fd21> in <module>()
----> 1 sl_df.loc[pd.IndexSlice[:,'II']]
...\pandas\core\indexing.pyc in __getitem__(self, key)
1470 except (KeyError, IndexError):
1471 pass
-> 1472 return self._getitem_tuple(key)
1473 else:
1474 # we by definition only have the 0th axis
...\pandas\core\indexing.pyc in _getitem_tuple(self, tup)
868 def _getitem_tuple(self, tup):
869 try:
--> 870 return self._getitem_lowerdim(tup)
871 except IndexingError:
872 pass
...\pandas\core\indexing.pyc in _getitem_lowerdim(self, tup)
977 # we may have a nested tuples indexer here
978 if self._is_nested_tuple_indexer(tup):
--> 979 return self._getitem_nested_tuple(tup)
980
981 # we maybe be using a tuple to represent multiple dimensions here
...\pandas\core\indexing.pyc in _getitem_nested_tuple(self, tup)
1056
1057 current_ndim = obj.ndim
-> 1058 obj = getattr(obj, self.name)._getitem_axis(key, axis=axis)
1059 axis += 1
1060
...\pandas\core\indexing.pyc in _getitem_axis(self, key, axis)
1909
1910 # fall thru to straight lookup
-> 1911 self._validate_key(key, axis)
1912 return self._get_label(key, axis=axis)
1913
...\pandas\core\indexing.pyc in _validate_key(self, key, axis)
1796 raise
1797 except:
-> 1798 error()
1799
1800 def _is_scalar_access(self, key):
...\pandas\core\indexing.pyc in error()
1783 raise KeyError(u"the label [{key}] is not in the [{axis}]"
1784 .format(key=key,
-> 1785 axis=self.obj._get_axis_name(axis)))
1786
1787 try:
KeyError: u'the label [II] is not in the [columns]'
The workaround:
(Or perhaps this is the correct way when there is a ":" on the first level of the index.)
sl_df.loc[pd.IndexSlice[:,'II'],:]
Out:
        one
A II x    2
     y    3
B II x    8
     y    9
C II x   14
     y   15
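For reference, here is a minimal sketch of what the two keys look like before .loc receives them. It relies only on pd.IndexSlice returning whatever is placed inside its brackets. The names failing_key, working_key, rows and via_xs are illustrative and not part of the code above, and the DataFrame.xs call is an alternative selection added purely for comparison.

```python
import itertools

import pandas as pd

# Same frame as above: a three-level MultiIndex with a single column 'one'.
sl_df = pd.DataFrame(
    data=list(range(18)),
    index=pd.MultiIndex.from_tuples(
        list(itertools.product(['A', 'B', 'C'], ['I', 'II', 'III'], ['x', 'y']))),
    columns=['one'])

# pd.IndexSlice only builds the key; it hands back what is put in the brackets.
failing_key = pd.IndexSlice[:, 'II']                  # a flat 2-tuple
working_key = (pd.IndexSlice[:, 'II'], slice(None))   # first element is itself a tuple

print(failing_key)
print(working_key)

# The workaround: the row key and the column key are passed separately.
rows = sl_df.loc[pd.IndexSlice[:, 'II'], :]

# Alternative (not from the original post): select on level 1 with xs,
# keeping all index levels in the result.
via_xs = sl_df.xs('II', level=1, drop_level=False)

print(rows)
print(via_xs)
```

The failing key is a flat two-element tuple, which has the same shape as an ordinary (rows, columns) key. That would be consistent with the error message mentioning [columns], although this sketch does not settle how .loc resolves the ambiguity internally.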
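Question: Why is it necessary to specify ":" on axis 1 only when ":" is used on the first level of the MultiIndex? Wouldn't you agree that it is a bit odd that this works on the other levels but not on the first level of the MultiIndex (see the simple slice above)?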