
Adding the probability of x and the conversion rate (%)

matplotlib python-2.7 python


user8834780 · 6 years ago

Here is what the data currently looks like:

id testers_time stage_1_to_2_time activated_time stage_2_to_3_time engaged_time
a  10           30                40             30                70
b  30               
c  15           30                45        
d       

import pandas as pd

# Named `data` rather than `dict` so the built-in dict type is not shadowed
data = {'id': ['a', 'b', 'c', 'd'], 'testers_time': [10, 30, 15, None], 'stage_1_to_2_time': [30, None, 30, None], 'activated_time': [40, None, 45, None], 'stage_2_to_3_time': [30, None, None, None], 'engaged_time': [70, None, None, None]}
df = pd.DataFrame(data, columns=['id', 'testers_time', 'stage_1_to_2_time', 'activated_time', 'stage_2_to_3_time', 'engaged_time'])

I have a plot of testers_time against its cumulative probability (an empirical CDF):

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

def ecdf(values):
    """Return the sorted values and their empirical cumulative probabilities."""
    n = len(values)
    x = np.sort(values)
    y = np.arange(1.0, n + 1) / n
    return x, y

# Keep the non-null times in their own Series so the original df is not overwritten
times = df['testers_time'].dropna().sort_values()
print(times)

x, y = ecdf(times)

plt.plot(x, y, marker='.', linestyle='none')

plt.axvline(x.mean(), color='gray', linestyle='dashed', linewidth=2)  # Add mean

x_m = int(x.mean())
y_m = stats.percentileofscore(times, x.mean()) / 100.0

plt.annotate('(%s,%s)' % (x_m, int(y_m * 100)), xy=(x_m, y_m), xytext=(10, -5), textcoords='offset points')

percentiles = np.array([0, 25, 50, 75, 100])
x_p = np.percentile(times, percentiles)
y_p = percentiles / 100.0

plt.plot(x_p, y_p, marker='D', color='red', linestyle='none')  # Overlay quartiles

# Separate loop names so the x and y arrays above are not shadowed
for xq, yq in zip(x_p, y_p):
    plt.annotate('%s' % int(xq), xy=(xq, yq), xytext=(10, -5), textcoords='offset points')

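What I would like to do is plot testers_time against:

1) its non-cumulative probability; plotted, this should look something like a PDF

2) the cumulative conversion rate, where an id counts as converted if its testers_time is populated (not null or empty). So id a (1 of 4 ids) converts, i.e. 25%; id b converts, i.e. 50% (since it is cumulative); id c converts, giving 75%; and id d does not convert, so the maximum conversion rate is 75%, reached at a testers_time of 30 days.

Could you help me add the above to df, or plot them? Thank you.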

1 answer | 6 years ago

user8834780 · 6 years ago

A1: df['prob'] = df['testers_time'].map(df.testers_time.value_counts(normalize=True))
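As a quick check of what A1 produces, here is a minimal sketch that applies it to the four example rows from the question. Only the id and testers_time columns are rebuilt, and the intermediate name `counts` is introduced here for readability; it is not part of the original answer.

```python
import pandas as pd

# Example rows from the question; id d has no testers_time
df = pd.DataFrame({'id': ['a', 'b', 'c', 'd'],
                   'testers_time': [10, 30, 15, None]})

# Relative frequency of each distinct testers_time.
# value_counts() drops NaN by default, so the shares are taken
# over the 3 ids that have a time, not over all 4 ids.
counts = df['testers_time'].value_counts(normalize=True)

# Look up each row's time in the frequency table; NaN stays NaN
df['prob'] = df['testers_time'].map(counts)

print(df)
```

Because every time in the sample is distinct, each converted id gets the same share, and the shares sum to 1 over the non-null rows. If the probability should instead be relative to all ids, dividing the raw counts by len(df) would be one option.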

A2: df['conv'] = df['testers_time'].rank(ascending=True)/len(df)
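The sketch below applies A2 to the example rows and shows one possible way to draw the cumulative conversion curve. The helper names `conversion_curve` and `plot_conversion` are hypothetical, and `method='max'` is an assumption added here so that ids sharing the same testers_time would all receive the cumulative count; the sample has no ties, so the result matches the default ranking.

```python
import pandas as pd

# Example rows from the question; id d never converts
df = pd.DataFrame({'id': ['a', 'b', 'c', 'd'],
                   'testers_time': [10, 30, 15, None]})

# Rank of each time divided by the total number of ids (4),
# so the curve tops out below 1 when some ids have no time.
# NaN times get a NaN rank and therefore a NaN conversion rate.
df['conv'] = df['testers_time'].rank(method='max') / len(df)


def conversion_curve(frame):
    """Return (times, rates) for converted ids, sorted by time."""
    converted = frame.dropna(subset=['testers_time']).sort_values('testers_time')
    return converted['testers_time'].tolist(), converted['conv'].tolist()


def plot_conversion(frame):
    """Draw the cumulative conversion rate as a step plot."""
    # Imported here so the column computation above does not need matplotlib
    import matplotlib.pyplot as plt

    times, rates = conversion_curve(frame)
    plt.step(times, rates, where='post', marker='.')
    plt.xlabel('testers_time')
    plt.ylabel('cumulative conversion rate')
    plt.ylim(0, 1)
    plt.show()


print(df)
```

Calling plot_conversion(df) draws the curve. Ordered by time, the rates are 25% at 10 days (id a), 50% at 15 days (id c) and 75% at 30 days (id b), which agrees with the 75% maximum at 30 days described in the question.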
