As @user48956 commented on the accepted answer, use numpy.random.choice:
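import numpy as np
import pandas as pd

np.random.seed(42)
df = pd.DataFrame(np.random.randint(0, 100, size=(10000000, 4)), columns=list('ABCD'))
# Baseline: sample rows with pandas, then take their index
%time df.sample(100000).index
print(_)
# Sample index labels directly with NumPy (with replacement by default)
%time pd.Index(np.random.choice(df.index, 100000))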
Wall time: 710 ms
Int64Index([7141956, 9256789, 1919656, 2407372, 9181191, 2474961, 2345700,
4394530, 8864037, 6096638,
...
471501, 3616956, 9397742, 6896140, 670892, 9546169, 4146996,
3465455, 7748682, 5271367],
dtype='int64', length=100000)
Wall time: 6.05 ms
Int64Index([7141956, 9256789, 1919656, 2407372, 9181191, 2474961, 2345700,
4394530, 8864037, 6096638,
...
471501, 3616956, 9397742, 6896140, 670892, 9546169, 4146996,
3465455, 7748682, 5271367],
dtype='int64', length=100000)
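
One caveat: np.random.choice samples with replacement by default, while df.sample does not, so the two calls timed above are not strictly equivalent. Below is a minimal sketch of how randomly drawn positions could be used to pull the rows themselves, assuming duplicate rows are acceptable. The helper name fast_sample_rows is hypothetical and not part of the original answer, and it uses NumPy's Generator.choice instead of the legacy np.random.choice.

```python
import numpy as np
import pandas as pd


def fast_sample_rows(df, n, seed=None):
    """Return n rows drawn uniformly at random, WITH replacement (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Draw integer positions rather than labels so this works for any index type.
    positions = rng.choice(len(df), size=n)
    return df.iloc[positions]


df = pd.DataFrame(np.random.randint(0, 100, size=(1000, 4)), columns=list('ABCD'))
sample = fast_sample_rows(df, 50, seed=42)
print(sample.shape)  # (50, 4)
```

If duplicates are not acceptable, df.sample (or choice with replace=False) is the safer option, though probably without the same speed advantage.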