
Find the first line of text matching a value in Python

  •  1
  • logvca  · Tech Community  · 6 years ago

    Value

    37.0459
    

    37.04278,-95.58895
    37.04369,-95.58592
    37.04369,-95.58582
    37.04376,-95.58557
    37.04376,-95.58546
    37.04415,-95.58429
    37.0443,-95.5839
    37.04446,-95.58346
    37.04461,-95.58305
    37.04502,-95.58204
    37.04516,-95.58184
    37.04572,-95.58139
    37.04597,-95.58127
    37.04565,-95.58073
    37.04546,-95.58033
    37.04516,-95.57948
    37.04508,-95.57914
    37.04494,-95.57842
    37.04483,-95.5771
    37.0448,-95.57674
    37.04474,-95.57606
    37.04467,-95.57534
    37.04462,-95.57474
    37.04458,-95.57396
    37.04454,-95.57274
    37.04452,-95.57233
    37.04453,-95.5722
    37.0445,-95.57164
    37.04448,-95.57122
    37.04444,-95.57054
    37.04432,-95.56845
    37.04432,-95.56834
    37.04424,-95.5668
    37.044,-95.56251
    37.04396,-95.5618
    

    Expected result

    37.04502,-95.58204
    37.04516,-95.58184
    37.04572,-95.58139
    37.04597,-95.58127
    37.04565,-95.58073
    37.04546,-95.58033
    37.04516,-95.57948
    

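    Additional information

    On Linux I can use grep, sed, cut and other tools to find the closest matching line and do the processing I need, but I would like to do this in Python.

    Any help would be greatly appreciated! Thank you very much.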

    4 answers  |  as of 6 years ago
        1
  •  3
  •   Pedro Lobito    6 years ago

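    How do I search a "file.txt" list in Python for the first "latitude,longitude" coordinate matching a value, and get the 3 lines above and the 3 lines below it?


    You can try: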

    with open("text_filter.txt") as f:
        text = f.readlines()  # read the lines into a list

    target = "37.0459"
    # indices of every line that contains the target
    match = [i for i, x in enumerate(text) if target in x]
    if match:
        first = match[0]
        start = max(first - 3, 0)  # clamp so a match near the top cannot wrap around
        # up to 3 lines before, the matching line, and up to 3 lines after
        print("".join(text[start:first + 4]).strip())
    

    Output:

    37.04502,-95.58204
    37.04516,-95.58184
    37.04572,-95.58139
    37.04597,-95.58127
    37.04565,-95.58073
    37.04546,-95.58033
    37.04516,-95.57948
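
    The same idea can be wrapped in a reusable function. This is only a minimal sketch: the name `find_with_context` and its `before`/`after` parameters are hypothetical and not part of the answer above, and it assumes that substring matching on each line is what is wanted.

```python
def find_with_context(lines, target, before=3, after=3):
    """Return the first line containing `target` plus its neighbours.

    Hypothetical helper: `lines` is a list of strings, `target` a substring.
    Returns an empty list when nothing matches.
    """
    for i, line in enumerate(lines):
        if target in line:
            start = max(i - before, 0)  # a negative start would count from the end
            return [l.strip() for l in lines[start:i + after + 1]]
    return []


sample = [
    "37.04502,-95.58204\n",
    "37.04516,-95.58184\n",
    "37.04572,-95.58139\n",
    "37.04597,-95.58127\n",
    "37.04565,-95.58073\n",
]
print(find_with_context(sample, "37.0459", before=1, after=1))
# ['37.04572,-95.58139', '37.04597,-95.58127', '37.04565,-95.58073']
```

    Returning a list instead of printing makes it easier to reuse the result, for example to write the lines to another file.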
    
        2
  •  0
  •   Mayank R    6 years ago

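    You can use pandas to load the data into a DataFrame and then manipulate it easily. Judging by your question, the value to check is not an exact match, so I compare it as a string.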

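    import pandas as pd

    # import the text file as a DataFrame
    data = pd.read_csv("file.txt", header=None, names=["latitude", "longitude"])
    value_to_check = 37.0459  # user defined
    prefix = str(value_to_check)
    for i in range(len(data)):
        # compare as strings so that 37.0459 also matches 37.04597
        if str(data.iloc[i, 0]).startswith(prefix):
            # clamp the start so a match in the first 3 rows does not give an empty slice
            print(data.iloc[max(i - 3, 0):i + 4, :])
            break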
    

    Output

        latitude  longitude
    9   37.04502  -95.58204
    10  37.04516  -95.58184
    11  37.04572  -95.58139
    12  37.04597  -95.58127
    13  37.04565  -95.58073
    14  37.04546  -95.58033
    15  37.04516  -95.57948
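
    Why the comparison is done on strings can be shown without pandas. The sketch below is an illustration only: `first_prefix_match` is a hypothetical helper, and it assumes the latitudes are already parsed as floats.

```python
def first_prefix_match(latitudes, value):
    """Return the index of the first latitude whose text starts with `value`.

    Hypothetical helper; returns None when there is no match.
    """
    prefix = str(value)
    return next(
        (i for i, lat in enumerate(latitudes) if str(lat).startswith(prefix)),
        None,
    )


latitudes = [37.04572, 37.04597, 37.04565]
print(37.0459 == 37.04597)                     # False: as floats the values differ
print(first_prefix_match(latitudes, 37.0459))  # 1
```

    A prefix test is also a little stricter than a plain substring test, which would accept the digits anywhere in the text.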
    
        3
  •  0
  •   Thierry Lathuille    6 years ago

    A solution using iterators, which keeps only the necessary lines in memory and does not load the parts of the file that are not needed:

    from collections import deque
    from itertools import islice
    
    
    def find_in_file(file, target, before=3, after=3):
    
        queue = deque(maxlen=before)
        with open(file) as f:
            for line in f:
                if target in map(float, line.split(',')):
                    out = list(queue) + [line] + list(islice(f, after))
                    return out
                queue.append(line)
            else:
                raise ValueError('target not found')
    

    Some tests:

    print(find_in_file('test.txt', 37.04597))
    
    # ['37.04502,-95.58204\n', '37.04516,-95.58184\n', '37.04572,-95.58139\n', '37.04597,-95.58127\n',
    #  '37.04565,-95.58073\n', '37.04546,-95.58033\n', '37.04516,-95.57948\n']
    
    print(find_in_file('test.txt', 37.044))  # Only one line after the match
    
    # ['37.04432,-95.56845\n', '37.04432,-95.56834\n', '37.04424,-95.5668\n', '37.044,-95.56251\n', 
    #   '37.04396,-95.5618\n']
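
    The two building blocks used above can be tried in isolation. This is a small sketch in which a range of numbers stands in for the lines of the file: a `deque` with `maxlen` silently drops its oldest item, and `islice` keeps consuming the same iterator from the point where the loop stopped.

```python
from collections import deque
from itertools import islice

numbers = iter(range(1, 11))   # stands in for the file's line iterator
window = deque(maxlen=3)       # keeps only the 3 most recent items

for n in numbers:
    if n == 6:
        # islice continues from where the loop stopped: 7, 8, 9
        following = list(islice(numbers, 3))
        break
    window.append(n)

result = list(window) + [6] + following
print(result)  # [3, 4, 5, 6, 7, 8, 9]
```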
    

        4
  •  0
  •   nandu kk    6 years ago

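    This solution prints the lines before and after the match even when there are fewer than 3 of them. I also use strings, because the question implies that you need partial matches too, i.e. 37.0459 will match 37.04597.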

    search_term = '37.04462'
    with open('file.txt') as f:
        # strip the trailing '\n', skip blank lines, split each line into [lat, lon]
        lines = [line.strip().split(',') for line in f if line.strip()]
    index = None
    for i, (lat, lon) in enumerate(lines):
        if search_term in lat:
            index = i
            break
    if index is not None:  # nothing is printed when the value is not found
        # clamp the window so fewer than 3 neighbours are handled at either end
        start = max(index - 3, 0)
        stop = min(index + 3, len(lines) - 1)
        for i in range(start, stop + 1):  # range excludes its end, hence + 1
            print(lines[i][0], lines[i][1])
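
    The bounds handling can also be left to list slicing. The sketch below is an illustration under the assumption that the matching index is already known; the helper name `window` is hypothetical.

```python
def window(items, index, before=3, after=3):
    """Return the items around `index`, shortened at either end of the list.

    Hypothetical helper illustrating slice clamping.
    """
    start = max(index - before, 0)         # a negative start would count from the end
    return items[start:index + after + 1]  # an oversized stop is clamped by Python


rows = ["a", "b", "c", "d", "e", "f", "g", "h"]
print(window(rows, 1))  # ['a', 'b', 'c', 'd', 'e']
print(window(rows, 6))  # ['d', 'e', 'f', 'g', 'h']
```

    Only the lower bound needs an explicit check, since a slice whose end lies past the list simply stops at the last item.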