
Reading a log file in Python

  •  0
  • batuman  · Tech Community  · 6 years ago

    I have a log file with the following content. I would like to read the Iteration value and the detection_eval value.

    I0704 18:10:31.097334  2421 solver.cpp:433] Iteration 200, Testing net (#0)
    I0704 18:10:31.149454  2421 net.cpp:693] Ignoring source layer mbox_loss
    I0704 18:10:40.241916  2421 solver.cpp:546]     Test net output #0: detection_eval = 0.00273318
    

    This is what I tried:

    accuracy_pattern = r"Iteration (?P<iter_num>\d+), Testing net \(#0\)\n.* detection_eval = (?P<accuracy>[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?)"
    

    But nothing is matched. What is wrong?

    Edit: I then read the matches of the accuracy pattern into an array:

    for r in re.findall(accuracy_pattern, log):
        iteration = int(r[0])
        accuracy = float(r[1]) * 100
    

    log holds the entire file contents, read as follows:

    with open(log_file, 'r') as log_file2:
        log = log_file2.read()
    
    2 answers  |  as of 6 years ago
        1
  •  2
  •   DYZ    6 years ago

    Based on what I understand of your data, the following regular expression should work:

    import re

    # Raw string so that \s and \d reach the regex engine unchanged
    pattern = r"Iteration\s+(\d+)|detection_eval\s+=\s+(.+$)"
    for it,de in re.findall(pattern, log, flags=re.M):
        if it: 
            print('Iteration', int(it))
        if de:
            print('detection_eval', float(de))
    #Iteration 200
    #detection_eval 0.00273318
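
The `if it:` and `if de:` checks are needed because of how `re.findall` reports an alternation with two capture groups: each match is a 2-tuple in which the group that did not take part is an empty string. A minimal sketch of that behaviour, assuming the three sample lines from the question are joined into one string (the variable names `log` and `matches` are only illustrative):

```python
import re

# The three sample lines from the question, joined into a single string.
log = "\n".join([
    "I0704 18:10:31.097334  2421 solver.cpp:433] Iteration 200, Testing net (#0)",
    "I0704 18:10:31.149454  2421 net.cpp:693] Ignoring source layer mbox_loss",
    "I0704 18:10:40.241916  2421 solver.cpp:546]     Test net output #0: detection_eval = 0.00273318",
])

pattern = r"Iteration\s+(\d+)|detection_eval\s+=\s+(.+$)"

# re.M lets $ match at the end of every line, not only at the end of the string.
matches = re.findall(pattern, log, flags=re.M)

# Each tuple has one filled group and one empty string.
print(matches)
```

With this sample, `matches` holds one tuple per alternative that matched, so the loop has to test which of the two fields is non-empty.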
    

    However, reading the entire log file at once is usually a bad idea. Consider reading one line at a time:

    with open(log_file, 'r') as log_file2:
        for line in log_file2:
            # Search the current line, not the whole log
            for it, de in re.findall(pattern, line):
                if it:
                    print('Iteration', int(it))
                if de:
                    print('detection_eval', float(de))
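
If the goal is to keep each iteration together with its detection_eval value (as in the question's loop), the line-by-line approach can remember the last test iteration seen. The following is only a sketch under assumptions: `parse_log_lines`, `ITER_RE` and `EVAL_RE` are hypothetical names, and the patterns are modelled on the three sample lines rather than on the full log format.

```python
import re

# Hypothetical patterns modelled on the sample lines in the question.
ITER_RE = re.compile(r"Iteration\s+(\d+), Testing net")
EVAL_RE = re.compile(r"detection_eval\s*=\s*(\S+)")


def parse_log_lines(lines):
    """Return a list of (iteration, detection_eval) pairs from an iterable of lines."""
    results = []
    current_iter = None
    for line in lines:
        m = ITER_RE.search(line)
        if m:
            # Remember the iteration until its detection_eval line shows up.
            current_iter = int(m.group(1))
            continue
        m = EVAL_RE.search(line)
        if m and current_iter is not None:
            results.append((current_iter, float(m.group(1))))
            current_iter = None
    return results


sample = [
    "I0704 18:10:31.097334  2421 solver.cpp:433] Iteration 200, Testing net (#0)\n",
    "I0704 18:10:31.149454  2421 net.cpp:693] Ignoring source layer mbox_loss\n",
    "I0704 18:10:40.241916  2421 solver.cpp:546]     Test net output #0: detection_eval = 0.00273318\n",
]
pairs = parse_log_lines(sample)
print(pairs)
```

Because the function accepts any iterable of lines, an open file object can be passed directly, for example `parse_log_lines(log_file2)` inside the `with` block, so the whole file never has to be held in memory.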
    
        2
  •  0
  •   Rakesh    6 years ago

    Use re.search

    Demo:

    import re
    
    with open(log_file, "r") as infile:
        for line in infile:
            # group() returns the whole match, e.g. "Iteration 200"
            iteration = re.search(r"Iteration (\d+)", line)
            if iteration:
                print(iteration.group())
    
            detection_eval = re.search(r"detection_eval = (\d.*)", line)
            if detection_eval:
                print(detection_eval.group())
    

    Output:

    Iteration 200
    detection_eval = 0.00273318
    

    Or use re.findall

    iteration = re.findall(r"Iteration (\d+)", log )
    detection_eval = re.findall(r"detection_eval = (\d.*)", log )
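
As for why the pattern in the question matches nothing: by default `.` does not match a newline, and the pattern allows only a single `\n` between the Iteration line and the detection_eval line, while the sample has an extra "Ignoring source layer" line in between. One possible repair, sketched here as an assumption rather than a definitive fix, is to skip whole intervening lines lazily. The names `number` and `results` are illustrative, and the inner groups are made non-capturing so that `re.findall` returns just the two values.

```python
import re

# The three sample lines from the question, joined into a single string.
log = "\n".join([
    "I0704 18:10:31.097334  2421 solver.cpp:433] Iteration 200, Testing net (#0)",
    "I0704 18:10:31.149454  2421 net.cpp:693] Ignoring source layer mbox_loss",
    "I0704 18:10:40.241916  2421 solver.cpp:546]     Test net output #0: detection_eval = 0.00273318",
])

# Same float syntax as in the question, with non-capturing inner groups.
number = r"[+-]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?"

accuracy_pattern = (
    r"Iteration (?P<iter_num>\d+), Testing net \(#0\)\n"
    r"(?:.*\n)*?"  # lazily skip any number of complete lines in between
    r".* detection_eval = (?P<accuracy>" + number + r")"
)

results = [(int(it), float(acc)) for it, acc in re.findall(accuracy_pattern, log)]
print(results)
```

The lazy `(?:.*\n)*?` stops at the first following line that contains detection_eval, so this sketch assumes every "Testing net" line is eventually followed by its own detection_eval line.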