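I need help labeling different states in a new data frame column based on the conditions of two oscillating values, columns X & Y.
Column Y is used as the state interval. A state interval starts at 0 and ends at 0. Note that within an interval the values in column Y always stay in either the positive or the negative range. The intervals alternate in the order +, -, +, -, etc.
Labeling starts when the column Y value becomes positive (greater than 0) and stops at 0, before Y turns negative. That is the end of the cycle, and the next cycle, in the negative range, begins.
There are 6 patterns in total, A, B, C, D, E, F, which serve as the cycle states. I am trying to figure out the logic and how to add the label for each state to a new data frame column named State. Each cycle is labeled as a whole, and the labeling starts over with each new cycle.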
+-------+-------------+---------+
| State | X           | Y       |
+-------+-------------+---------+
| A     | from - to + | +       |
| B     | +           | +       |
| C     | -           | +       |
| D     | +           | -       |
| E     | -           | -       |
| F     | from + to - | -       |
+-------+-------------+---------+
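In states A & F, the values in column X go from + to - or vice versa, crossing over 0. The values in column Y always stay in either the positive or the negative range.
In states B, C, D, E there is no crossover in column X. Below is a sample of the data frame values and the new column with the resulting states.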
+----+---------+---------+-------+
| #  | X       | Y       | State |
+----+---------+---------+-------+
| 1  | -0.0034 |  0.0056 | A     | Cycle 1 (+)
| 2  | -0.0001 |  0.0070 | A     |
| 3  |  0.0019 |  0.0073 | A     |
| 4  |  0.0039 |  0.0075 | A     |
|    |         |         |       |
| 5  |  0.0273 | -0.0037 | D     | Cycle 2 (-)
| 6  |  0.0237 | -0.0059 | D     |
|    |         |         |       |
| 7  |  0.0047 |  0.0028 | B     | Cycle 3 (+)
| 8  |  0.0044 |  0.0020 | B     |
|    |         |         |       |
| 9  | -0.0034 | -0.0006 | E     | Cycle 4 (-)
| 10 | -0.0045 | -0.0014 | E     |
|    |         |         |       |
| 11 | -0.0021 |  0.0006 | C     | Cycle 5 (+)
| 12 | -0.0019 |  0.0007 | C     |
|    |         |         |       |
| 13 |  0.0041 | -0.0054 | F     | Cycle 6 (-)
| 14 |  0.0017 | -0.0060 | F     |
| 15 | -0.0021 | -0.0059 | F     |
| 16 | -0.0023 | -0.0057 | F     |
+----+---------+---------+-------+
Cycles will continue 7, 8, 9, 10, etc. in the time series
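To make the rules in the state table concrete, here is a minimal sketch of the per-cycle decision written as a plain Python function. The name `classify_cycle` and its parameters are hypothetical, not part of the original code. It assumes that any sign change of X inside a cycle counts as a crossover, without checking the direction, and that X values equal to 0 are treated as non-negative.

```python
def classify_cycle(xs, y_positive):
    """Return the state letter for a single cycle.

    xs         -- the column X values inside the cycle, in time order
    y_positive -- True if column Y is positive throughout the cycle
    """
    # Assumption: a crossover means X takes both negative and positive values.
    crosses_zero = min(xs) < 0 < max(xs)
    if crosses_zero:
        return "A" if y_positive else "F"
    if min(xs) >= 0:
        # X never goes below zero in this cycle.
        return "B" if y_positive else "D"
    # X never goes above zero in this cycle.
    return "C" if y_positive else "E"


# Cycles 1, 2 and 6 from the sample table above.
print(classify_cycle([-0.0034, -0.0001, 0.0019, 0.0039], True))   # A
print(classify_cycle([0.0273, 0.0237], False))                    # D
print(classify_cycle([0.0041, 0.0017, -0.0021, -0.0023], False))  # F
```

The table only lists "- to +" for A and "+ to -" for F, so a stricter version could also compare the first and last X value of the cycle.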
A data frame with 12 cycles, similar to the example above, so that the patterns A, B, C, D, E, F each appear twice in the result.
import pandas as pd

df = pd.DataFrame({
'x': [-0.0034, -0.0001, 0.0019, 0.0039, 0.0273, 0.0237, 0.0047, 0.0044, -0.0034, -0.0045, -0.0021, -0.0019, 0.0041, 0.0017, -0.0021, -0.0023, -0.0014, -0.0002, 0.0018, 0.0031, 0.0171, 0.0230, 0.0035, 0.0040, -0.0030, -0.0040, -0.0020, -0.0015, 0.0030, 0.0010, -0.0030, -0.0020, ],
'y': [0.0056, 0.007, 0.0073, 0.0075, -0.0037, -0.0059, 0.0028, 0.002, -0.0006, -0.0014, 0.0006, 0.0007, -0.0054, -0.006, -0.0059, -0.0057, 0.0040, 0.005, 0.0065, 0.0070, -0.0022, -0.0045, 0.0020, 0.001, -0.0005, -0.0010, 0.0003, 0.0005, -0.0050, -0.005, -0.0060, -0.0040, ],
})
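The next example is my start at coding an iteration over the data frame. I need help building the logic to include states A & F, to walk through each +/- cycle, and guidance on how to iterate over column Y while looking for crossover values in column X.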
State = []
for i, row in df.iterrows():  # i: dataframe index; row: each row in series format
    # Column names are lowercase 'x' and 'y', matching the data frame above
    if row['x'] > 0 and row['y'] > 0:
        State.append('B')
    elif row['x'] < 0 and row['y'] > 0:
        State.append('C')
    elif row['x'] > 0 and row['y'] < 0:
        State.append('D')
    elif row['x'] < 0 and row['y'] < 0:
        State.append('E')
    else:
        State.append('err')
df['State'] = State
print(df)
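Again, the code above does not cover states A & F.
Update
I still need help. Below is the updated code with comments, followed by an explanation of what is not working.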
import numpy as np

# Creating new column as + or - based on Column Y value
df['y_pos'] = np.where((df.y > 0), True, False)
# Creating new column to label the cycle as they are increasing order 1,2,3, etc.
df['cycle_n'] = (df.y_pos != df.y_pos.shift(1)).cumsum()
# returns dictionary whose keys and values are from DataFrames
# to be able to loop through the cycles
gb = df.groupby('cycle_n')
groups = dict(list(gb))
State = []
for name, group in gb:
    # Information to help compare our final results
    print("Group:" + str(name) )
    print("=====================")
    print("Min:" + str(group.min()) )
    print("Max:" + str(group.max()) )
    print("--- Group Data -----")
    print(group)
    print("--------------------")
    print("--- Column X Row Data -----")
    for index, row in group.iterrows():  # loop each row
        if row['y_pos'] == True:  # Column Y is (+)
            print( row['x'] )  # row data value for Column X
            # trying to use min and max in each cycle to figure out
            # if there is a crossover
            # ISSUE: min and max is holding data values for each of the
            # columns, not only Column X which maybe the reason why
            # it's not working correctly
            if [ (group.min() <= 0) & (group.max() >= 0) ]:
                State.append('A')
            elif row['x'] >= 0:
                State.append('B')
            elif row['x'] < 0:
                State.append('C')
            else:
                State.append('err')
        elif row['y_pos'] == False:  # Column Y is (-)
            print( row['x'] )
            # ISSUE: again min and max is holding data values for each of the
            # columns, maybe the reason why it's not working correctly
            if [ (group.max() >= 0) & (group.min() <= 0) ]:
                State.append('F')
            elif row['x'] >= 0:
                State.append('D')
            elif row['x'] < 0:
                State.append('E')
            else:
                State.append('err')
        else:
            print("err")
df['State'] = State
# Combining y_pos & cycle_n to be printed out.
df['Label'] = 'Cycle ' + df.cycle_n.astype(str) + ' ' + df.y_pos.map({True: '(+)', False: '(-)'})
del df['y_pos']
del df['cycle_n']
print(df)
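The problem with this code: it now labels only states A & F, and the other states are mislabeled as A or F. The if statements using min and max evaluate to true, which is not correct, because group.min() and group.max() hold the minimum and maximum of every column in the group. For example: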
print("Min:" + str(group.min()) )
Min:
x -0.0034
y 0.0056
y_pos 1.0000
cycle_n 1.0000
dtype: float64
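A small sketch of why a condition written as `if [ ... ]` can never be false, and what restricting the check to column X might look like. It uses the two rows of cycle 3 from the sample data (expected state B). The variable names are hypothetical. The key point is that a list literal with one element is truthy in Python no matter what that element is, so the branch is taken regardless of the min and max values.

```python
import pandas as pd

# Cycle 3 from the sample data: X stays positive, so no crossover is expected.
group = pd.DataFrame({"x": [0.0047, 0.0044], "y": [0.0028, 0.0020]})

# group.min() and group.max() return one value per column, so this is a
# boolean Series with an entry for 'x' and an entry for 'y'.
cond = (group.min() <= 0) & (group.max() >= 0)

# Wrapping the Series in [ ... ] builds a one-element list, which is always truthy.
always_true = bool([cond])

# Checking only column X gives a single scalar answer per cycle.
x_crosses = bool((group["x"].min() < 0) and (group["x"].max() > 0))

print(always_true, x_crosses)  # True False
```

Removing the brackets without selecting a single column would not help either, since pandas raises a ValueError when a multi-element Series is used directly as an if condition.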
I don't know whether this is the best approach. I'm just getting closer to having it work correctly.
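For reference, a minimal sketch of one possible direction that avoids the row loop: compute the per-cycle minimum and maximum of column X only, broadcast them back to every row, and pick the label with `np.select`. This is an illustration under assumptions, not a definitive solution: any sign change of X within a cycle is treated as a crossover regardless of direction, X values equal to 0 count as non-negative, and rows where Y is exactly 0 are not handled specially. The helper names (`y_pos`, `cycle_n`, `crosses`, `x_nonneg`) are hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "x": [-0.0034, -0.0001, 0.0019, 0.0039, 0.0273, 0.0237, 0.0047, 0.0044,
          -0.0034, -0.0045, -0.0021, -0.0019, 0.0041, 0.0017, -0.0021, -0.0023,
          -0.0014, -0.0002, 0.0018, 0.0031, 0.0171, 0.0230, 0.0035, 0.0040,
          -0.0030, -0.0040, -0.0020, -0.0015, 0.0030, 0.0010, -0.0030, -0.0020],
    "y": [0.0056, 0.007, 0.0073, 0.0075, -0.0037, -0.0059, 0.0028, 0.002,
          -0.0006, -0.0014, 0.0006, 0.0007, -0.0054, -0.006, -0.0059, -0.0057,
          0.0040, 0.005, 0.0065, 0.0070, -0.0022, -0.0045, 0.0020, 0.001,
          -0.0005, -0.0010, 0.0003, 0.0005, -0.0050, -0.005, -0.0060, -0.0040],
})

# Sign of Y per row, and a running cycle number that increases at every sign flip.
y_pos = df["y"] > 0
cycle_n = (y_pos != y_pos.shift()).cumsum()

# Per-cycle min and max of column X only, repeated on every row of the cycle.
x_by_cycle = df.groupby(cycle_n)["x"]
x_min = x_by_cycle.transform("min")
x_max = x_by_cycle.transform("max")

crosses = (x_min < 0) & (x_max > 0)   # X changes sign inside the cycle
x_nonneg = x_min >= 0                 # X never goes below zero inside the cycle

# Conditions are tested in order; the first match wins, everything else is E.
df["State"] = np.select(
    [crosses & y_pos, crosses & ~y_pos,
     x_nonneg & y_pos, x_nonneg & ~y_pos,
     ~x_nonneg & y_pos],
    ["A", "F", "B", "D", "C"],
    default="E",
)
df["Label"] = "Cycle " + cycle_n.astype(str) + " " + y_pos.map({True: "(+)", False: "(-)"})
print(df)
```

On the 12-cycle sample data this should reproduce the A, D, B, E, C, F sequence twice. Keeping `y_pos` and `cycle_n` as standalone Series rather than data frame columns also means nothing has to be deleted afterwards.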