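I'm new to Python and was hoping to get some help translating pseudocode into real code. I currently have a Python script that does the following:
- Reads in a log file
- Creates a 2D array of timestamps from the log file
- Opens a data file
- Compares the timestamp (first column) and, if the value equals or falls between the values of a pair in the 2D array, marks "1" in the last column of the data file. If not, it marks 0.
My goal is to configure an optional command-line argument that, if given, writes the 0 or 1 to an output file named "output.csv" instead of to the data file.
Currently, output.csv is created, but it does not contain just the 1 or 0 classification. The script actually rewrites the entire data.csv (the file read in) into output.csv (the file that should be written).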
How can I write only the isBad 0 or 1 indicator?
import re
import csv
import codecs
import argparse
#Configuring arguments and variables
################################################
parser = argparse.ArgumentParser(description='This script is used to automatically classify the workbench operations. If the operation was performed by a human, it will be marked appropriately. If it was done by a machine, it will be marked appropriately.')
parser.add_argument('-d', '--data', required=True, help='The data.csv produced by the workbench')
parser.add_argument('-r', '--log', required=True, help='The log file used to appropriately label the data')
parser.add_argument('-n', '--new', required=False, help='Flag to create output.csv of markings instead of marking data.csv', action="store_true")
args = parser.parse_args()
if args.new:
    print("You selected to create a new log")
print('data.csv:', args.data)
print('log:', args.log)
filepath = args.log
csv_filepath = args.data
tempStart = ''
tempEnd = ''
################################################
print(" ")
print("Starting Script")
print(" ")
#open the log
with open(filepath) as myFile:
    #read the log (the with block closes the file automatically)
    all_logs = myFile.read()
#Create regular expressions
starting_regex = re.compile(r'\[(\d+)\s+s\]\s+Initializing\s+Workbench')
ending_regex = re.compile(r'\[(\d+)\s+s\]\s+Log\s+File\s+Completed\.\s+Stopping!')
#Create arrays of start and end times
start_times = list(map(int, starting_regex.findall(all_logs)))
end_times = list(map(int, ending_regex.findall(all_logs)))
#Create 2d Array
timeArray = list(map(list, zip(start_times, end_times)))
#Print 2d Array
print(timeArray)
print(" ")
print("Completed timeArray construction")
print(" ")
#Open the csv file as a reader
with open(csv_filepath, 'rb') as csvfile:
    reader = csv.reader(codecs.iterdecode(csvfile, 'utf-8'))
    input_rows = [row for row in reader]
#Open the csv file as a writer
with open('output.csv', 'w', newline='') as outputfile:
    writer = csv.writer(outputfile)
    # loop through the rows, set the currVal to the value of the first column (timestamp)
    for row in input_rows:
        currVal = int(row[0])
        isBad = '0'
        for interval in timeArray:
            if interval[0] <= currVal <= interval[1]:
                isBad = '1'
                break
        # note: this writes the whole input row plus the indicator
        writer.writerow(row + [isBad])
print("Script completed")
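One possible way to get the behaviour described in the question is to decide, per row, whether to write the full row plus the indicator or only the indicator. The following is a minimal sketch, not a definitive fix: the helper names classify and write_marks, the only_marks parameter, and the sample data are all made up for illustration and do not appear in the script above.

```python
import csv
import io


def classify(timestamp, intervals):
    """Return '1' if timestamp lies inside any [start, end] pair, else '0'."""
    for start, end in intervals:
        if start <= timestamp <= end:
            return '1'
    return '0'


def write_marks(rows, intervals, out, only_marks):
    """Write one CSV line per input row to the open text file `out`.

    only_marks=True  -> each line holds just the 0/1 indicator
    only_marks=False -> each line is the original row plus the indicator
    """
    writer = csv.writer(out)
    for row in rows:
        is_bad = classify(int(row[0]), intervals)
        writer.writerow([is_bad] if only_marks else row + [is_bad])


# Made-up sample data, for illustration only.
sample_rows = [['5', 'a'], ['15', 'b'], ['25', 'c']]
sample_intervals = [[10, 20]]

# In-memory buffers stand in for real files so the sketch runs anywhere.
buffer = io.StringIO()
write_marks(sample_rows, sample_intervals, buffer, only_marks=True)
marks_only = buffer.getvalue().splitlines()

buffer = io.StringIO()
write_marks(sample_rows, sample_intervals, buffer, only_marks=False)
full_rows = buffer.getvalue().splitlines()
```

In the script itself, the same idea would presumably amount to opening 'output.csv' when args.new is set and csv_filepath otherwise (with 'w' and newline=''), then calling write_marks(input_rows, timeArray, outputfile, only_marks=args.new). Writing back to the data file should be safe at that point because input_rows has already been read fully into memory.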