
Find whether a value is equal to or between values in a 2D array

  •  0
  • artemis Roberto  · Tech Community  · 6 years ago

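    I have a Python script that captures log data and converts it into a 2D array.

    The next part of the script is meant to loop through a .csv file, evaluate the first column of each row, and determine whether that value is equal to or between the values in the 2D array. If it is, the last column should be marked TRUE; if not, it should be marked FALSE.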

    For example, if my 2D array looks like this:

    [[1542053213, 1542053300], [1542055000, 1542060105]]

    and my .csv file looks like this:

    1542053220, Foo, Foo, Foo
    1542060110, Foo, Foo, Foo
    

    In that case, the last column of the first row should be TRUE (or 1), while the last column of the second row should be FALSE (or 0).

    from os.path import expanduser
    import re
    import csv
    import codecs
    
    #Setting variables
    #Specifically, set the file path to the reveal log
    filepath = expanduser('~/LogAutomation/programlog.txt')
    csv_filepath = expanduser('~/LogAutomation/values.csv')
    tempStart = ''
    tempEnd = ''
    
    print("Starting Script")
    
    #open the log
    with open(filepath) as myFile:
        #read the log
        all_logs = myFile.read()
    myFile.close()
    
    #Create regular expressions
    starting_regex = re.compile(r'\[(\d+)\s+s\]\s+Starting\s+Program')
    ending_regex = re.compile(r'\[(\d+)\s+s\]\s+Ending\s+Program\.\s+Stopping')
    
    #Create arrays of start and end times
    start_times = list(map(int, starting_regex.findall(all_logs)))
    end_times = list(map(int, ending_regex.findall(all_logs)))
    
    #Create 2d Array
    timeArray = list(map(list, zip(start_times, end_times)))
    
    #Print 2d Array
    print(timeArray)
    
    print("Completed timeArray construction")
    
    #prints the csv file
    with open(csv_filepath, 'rb') as csvfile:
        reader = csv.reader(codecs.iterdecode(csvfile, 'utf-8'))
    
        for row in reader:
            currVal = row[0]
                #if currVal is equal to or in one of the units in timeArray, mark last column as true
                #else, mark last column as false
    
    csvfile.close()
    
    print("Script completed")
    

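    I can already iterate through my .csv and get the first-column value of each row, but I do not know how to do the comparison. I am not familiar with the 2D array data structure and cannot work out how to check whether a value falls between its entries. Also, the number of columns in my .csv file may vary, so does anyone know a non-static way to determine the "last column", so that the flag is written after however many columns the file has?

    Can anyone give me some help?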

    2 answers  |  up to 6 years ago
        1
  •  1
  •   Marco Bonelli    6 years ago

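    You just need to iterate over the list of lists and check whether the value falls within any of the intervals. Here is a simple way to do it: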

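    # Read every row first so the same file can be reopened for writing.
    with open(csv_filepath, 'rb') as csvfile:
        reader = csv.reader(codecs.iterdecode(csvfile, 'utf-8'))
        input_rows = [row for row in reader]
    
    # newline='' keeps the csv module from emitting blank lines on Windows.
    with open(csv_filepath, 'w', newline='', encoding='utf-8') as outputfile:
        writer = csv.writer(outputfile)
    
        for row in input_rows:
            currVal = int(row[0])
            ok = 'FALSE'
    
            # Flag the row as soon as one interval contains the value.
            for interval in timeArray:
                if interval[0] <= currVal <= interval[1]:
                    ok = 'TRUE'
                    break
    
            # Appending to the row puts the flag after its last column,
            # however many columns the row has.
            writer.writerow(row + [ok])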
    

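    The code above writes the result back to the same file, so be careful. I also removed csvfile.close(), because a file opened in a with statement is closed automatically.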
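
Since overwriting the input file is the risky part, one possible variation is to write to a temporary file first and swap it in only once every row has been written. This is a minimal sketch, not part of the original answer: `flag_rows_in_place` is a hypothetical helper name, and it assumes Python 3 and a UTF-8 encoded file.

```python
import csv
import os
import tempfile


def flag_rows_in_place(csv_path, intervals):
    """Append TRUE/FALSE to each row; replace csv_path only after a complete write."""
    directory = os.path.dirname(os.path.abspath(csv_path))
    # The temporary file lives next to the original so os.replace stays on one filesystem.
    with open(csv_path, newline='', encoding='utf-8') as src, \
            tempfile.NamedTemporaryFile('w', newline='', encoding='utf-8',
                                        dir=directory, delete=False) as tmp:
        writer = csv.writer(tmp)
        for row in csv.reader(src):
            if not row:
                # Pass blank lines through unchanged.
                writer.writerow(row)
                continue
            value = int(row[0])
            inside = any(start <= value <= end for start, end in intervals)
            writer.writerow(row + ['TRUE' if inside else 'FALSE'])
    # Both files are closed here; swap the finished copy into place.
    os.replace(tmp.name, csv_path)
```

It would be called as `flag_rows_in_place(csv_filepath, timeArray)`. If the loop raises (for example on a non-numeric first column), the original file is left untouched, although the temporary file stays behind and would need to be cleaned up.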

        2
  •  1
  •   Tzomas    6 years ago

    # True when t (converted to int) lies within [x, y], bounds included.
    compare = lambda x, y, t: (x <= int(t) <= y)
    with open('output.csv', 'w', newline='') as outputfile:
        writer = csv.writer(outputfile)
        with open(csv_filepath, 'rb') as csvfile:
            reader = csv.reader(codecs.iterdecode(csvfile, 'utf-8'))
    
            # Iterate while csvfile is still open; the reader is lazy.
            for row in reader:
                currVal = row[0]
                # TRUE if currVal falls in any interval of timeArray, else FALSE
                match = any(compare(x, y, currVal) for x, y in timeArray)
                writer.writerow(row + ['TRUE' if match else 'FALSE'])
    # Both with blocks close their files, so no explicit close() is needed.
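
The question also asks how to find the "last column" when the column count varies. A small in-memory sketch suggests that nothing special is needed: `row + [...]` appends after whatever columns each row already has. The sample rows below are made up for illustration, and `io.StringIO` stands in for the real files.

```python
import csv
import io

time_array = [[1542053213, 1542053300], [1542055000, 1542060105]]

# Hypothetical sample rows with different column counts.
sample = "1542053220,Foo,Foo,Foo\n1542060110,Foo\n1542055000,Foo,Foo\n"

output = io.StringIO()
writer = csv.writer(output, lineterminator="\n")
for row in csv.reader(io.StringIO(sample)):
    value = int(row[0])
    match = any(start <= value <= end for start, end in time_array)
    # The flag lands after the row's own last column, however many there are.
    writer.writerow(row + ['TRUE' if match else 'FALSE'])

print(output.getvalue())
# 1542053220,Foo,Foo,Foo,TRUE
# 1542060110,Foo,FALSE
# 1542055000,Foo,Foo,TRUE
```

The third row is TRUE because the comparison is inclusive: 1542055000 is exactly the start of the second interval.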