
Linux bash shell: read a log file and compare each line against the rest of the file

  •  0
  • brotherchris  · 6 years ago

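    When a user logs in, the log shows their username and IP port. When they log off, all you can see is the IP port. So I need to match these IP ports and then spit out the information from the lines where they connected.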

    Date,time,Username,Viewer,IPPort
    20180911,12:00,Chris,New,55567
    20180911,12:30,Tom,New,55577                  <<<<<-Connections
    20180911,12:45,Larry,New,55587
    20180911,14:00,,,55567
    20180911,15:30,,,55577                 <<<<<-When user logs off
    20180911,16:45,,,55587
    

    This is what my loop looks like so far.

    INPUT=firstreport.csv
    OLDIFS=$IFS
    IFS=,
    [ ! -f $INPUT ] && { echo "$INPUT file not found"; exit 99; }
    while read Date Time Username Viewer IP
    do
        echo "IP : $IP"
        IPCHECK=$IP
        while read Date Time Username Viewer IP
        do
            if [[ $IPCHECK == $IP ]]; then
                echo "Match : $IP"
            fi
        done < $INPUT
    done < $INPUT
    IFS=$OLDIFS
    

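    Any suggestions on how I could accomplish this would be greatly appreciated. My end goal is a report that I can dump into Excel to show a map of user activity.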

    Chris

    3 answers  |  most recent 4 years ago
        1
  •  2
  •   glenn jackman    6 years ago

    gawk '
        BEGIN { FS = OFS = "," }
        NR == 1 {next}
        $3 != "" { # connection
            conn[$5]["on"] = $3 FS $4 FS $1 FS $2
        }
        $3 == "" {
            if ($5 in conn) {
                conn[$5]["off"] = $1 FS $2
            }
            else {
                print "Error: found a log off with no log on, line " NR
            }
        }
        END {
            print "IPPort","User","Viewer","ON date","ON time","OFF date","OFF time"
            for (id in conn) {
                print id, conn[id]["on"], conn[id]["off"]
            }
        }
    ' file
    
    IPPort,User,Viewer,ON date,ON time,OFF date,OFF time
    55567,Chris,New,20180911,12:00,20180911,14:00
    55577,Tom,New,20180911,12:30,20180911,15:30
    55587,Larry,New,20180911,12:45,20180911,16:45
    

    For older awks, which lack gawk's arrays of arrays:

    awk '
        BEGIN { FS = OFS = "," }
        NR == 1 {next}
        $3 != "" { ids[$5]; conn[$5,"on"] = $3 FS $4 FS $1 FS $2 }
        $3 == "" {
            if ($5 in ids)
                conn[$5,"off"] = $1 FS $2
            else
                print "Error: found a log off with no log on, line " NR
        }
        END {
            print "IPPort","User","Viewer","ON date","ON time","OFF date","OFF time" 
            for (id in ids)
                print id, conn[id,"on"], conn[id,"off"]
        }
    ' file
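
    Both awk versions above keep a single entry per port, so if a port is reused later in the log, the later session overwrites the earlier one. As a hedged variation (not part of the original answer), the sketch below prints each session as soon as its log-off line is seen. The function name pair_sessions is hypothetical, and the sketch assumes the same column layout as the sample, with an empty Username on log-off rows.

```shell
#!/usr/bin/env bash
# Sketch: emit one CSV row per session at log-off time, so a port that is
# reused later in the log starts a fresh session.
# pair_sessions is a hypothetical helper name; $1 is the log file.
pair_sessions() {
    awk '
        BEGIN {
            FS = OFS = ","
            print "IPPort","User","Viewer","ON date","ON time","OFF date","OFF time"
        }
        NR == 1 { next }                                  # skip the header row
        $3 != "" {                                        # log-on: remember it
            on[$5] = $3 FS $4 FS $1 FS $2
            next
        }
        ($5 in on) {                                      # log-off with a matching log-on
            print $5, on[$5], $1, $2
            delete on[$5]
            next
        }
        { print "Error: found a log off with no log on, line " NR > "/dev/stderr" }
        END {
            # sessions that never logged off get empty OFF columns
            for (port in on) print port, on[port], "", ""
        }
    ' "$1"
}
```

    It could be called as pair_sessions firstreport.csv > sessions.csv. Rows for sessions that are still open are printed last, in no particular order.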
    
        2
  •  2
  •   Aaron    6 years ago

    You can use sort:

    sort -t, -k 5,5
    

    In this sort command, -t, sets the field separator to a comma and -k 5,5 sorts on the 5th field only.

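    (Note: in the comments I suggested using -k 5.1, which means sorting from the first character of the 5th field, but 1) the .x character offsets default to the first/last character of the start/end field and can be omitted; 2) if no end field is specified, the sort key runs to the end of the line, so fields beyond those posted in the excerpt may needlessly take part in the sort.)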
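
    To make the difference between -k 5,5 and -k 5 concrete, here is a small sketch on made-up rows that carry an extra sixth field. The data and the helper name sample_rows are invented for illustration, and -s (stable sort) is assumed to be available, as it is in GNU and BSD sort.

```shell
#!/usr/bin/env bash
# Two made-up rows with the same port but different trailing fields.
sample_rows() {
    printf '%s\n' '20180911,12:00,Zed,New,55567,zzz' \
                  '20180911,12:00,Amy,New,55567,aaa'
}

# Key is field 5 only; -s keeps the input order for rows with equal keys.
sample_rows | LC_ALL=C sort -s -t, -k 5,5

# Key runs from field 5 to the end of the line, so the sixth field decides.
sample_rows | LC_ALL=C sort -s -t, -k 5
```

    The first command leaves the Zed row ahead of the Amy row, because the keys are equal; the second puts the Amy row first, because "55567,aaa" sorts before "55567,zzz".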

    Applied to your sample input, adjusted so that the port is the 5th field in the log-off entries:

    20180911,12:00,Chris,New,55567
    20180911,12:30,Tom,New,55577
    20180911,12:45,Larry,New,55587
    20180911,14:00,,,55567
    20180911,15:30,,,55577
    20180911,16:45,,,55587
    

    it produces the following output:

    20180911,12:00,Chris,New,55567
    20180911,14:00,,,55567
    20180911,12:30,Tom,New,55577
    20180911,15:30,,,55577
    20180911,12:45,Larry,New,55587
    20180911,16:45,,,55587
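
    Since the sorted output above places each log-off directly after its log-on, the pairs can be folded into one row each. The following is only a sketch: merge_sorted_pairs is a hypothetical name, the input is assumed to have no header row and to be in chronological order, and -s is assumed to be supported (GNU and BSD sort) so that rows with the same port keep that order.

```shell
#!/usr/bin/env bash
# merge_sorted_pairs is a hypothetical helper name; $1 is a header-less log
# whose log-off rows carry the port in field 5.
merge_sorted_pairs() {
    # -s keeps rows with equal ports in their original (chronological) order.
    sort -s -t, -k 5,5 "$1" |
    awk '
        BEGIN { FS = OFS = "," }
        # log-on row: hold it until the matching log-off arrives
        $3 != "" { on = $0; port = $5; next }
        # log-off row for the held log-on: append OFF date and time
        $5 == port && on != "" { print on, $1, $2; on = ""; next }
    '
}
```

    In this sketch a log-on with no log-off, or a log-off with no preceding log-on, is silently dropped; real data may call for reporting those cases instead.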
    


        3
  •  0
  •   keithpjolley    6 years ago

    Replace the inner loop with:

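    line=0
    while read Date Time Username Viewer IP COMMENT
    do
      line=$((line + 1))
      # Pass the shell values in with -v instead of splicing them into the awk source.
      # IFS is still "," here, as set earlier in the question's script.
      awk -F "$IFS" -v ip="$IP" -v line="$line" '
        BEGIN {
          # Skip rows (such as the header) whose 5th field is not a port number.
          if (ip !~ /^[0-9]+$/) { exit }
        }
        NR<line { next }
        NR==line {
          print "CONNECT:",$0
          next
        }
        $5==ip && $4 != "New" {
          print "DISCONNECT:", $0
          exit
        }
        $5==ip {
          print "FOUND RECONNECT BEFORE DISCONNECT"
          exit
        }
      ' "$INPUT"
    done < "$INPUT"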
    

    20180911,12:00,Chris,New,55567,
    20180911,12:30,Tom,New,55577, <<<<<-Connections 
    20180911,12:45,Larry,New,55587, 
    20180911,14:00,,55567, 
    20180911,15:30,,55577, <<<<<-When user logs off 
    20180911,16:45,,55587, 
    20180911,16:45,Tom,New,55577, <<<<<-reconnect
    20180911,16:45,55577, <<<<<-redisconnect
    20180911,16:45,CURLY,New,55577, <<<<<-reconnect
    20180911,16:45,MOE,New,55577, <<<<<- foobar
    20180911,16:45,55577, <<<<<-redisconnect
    

    CONNECT: 20180911,12:00,Chris,New,55567,
    DISCONNECT: 20180911,14:00,,55567, 
    CONNECT: 20180911,12:30,Tom,New,55577, <<<<<-Connections 
    DISCONNECT: 20180911,15:30,,55577, <<<<<-When user logs off 
    CONNECT: 20180911,12:45,Larry,New,55587, 
    DISCONNECT: 20180911,16:45,,55587, 
    CONNECT: 20180911,16:45,Tom,New,55577, <<<<<-reconnect
    FOUND RECONNECT BEFORE DISCONNECT
    CONNECT: 20180911,16:45,CURLY,New,55577, <<<<<-reconnect
    FOUND RECONNECT BEFORE DISCONNECT
    CONNECT: 20180911,16:45,MOE,New,55577, <<<<<- foobar
    

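    I think this is what you want. I suspect that on your real data you will need to add more conditions to make sure the users and ports make sense.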

    **** Note that the awk script has been updated, but the input/output shown are still the original ones.
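
    The loop above starts awk once per input line and re-reads the file each time. If a single pass in plain bash is preferred, something along these lines might work. It is a sketch under assumptions: bash 4 or later for associative arrays, the column layout of the question's sample (log-off rows have an empty Username and the port in field 5), and a hypothetical function name report_sessions.

```shell
#!/usr/bin/env bash
# report_sessions is a hypothetical name; it needs bash 4+ (associative arrays).
report_sessions() {
    local input=$1 Date Time Username Viewer IP rest
    local -A logon=()                        # port -> "user,viewer,date,time"
    [ -f "$input" ] || { echo "$input file not found" >&2; return 99; }
    echo "IPPort,User,Viewer,ON date,ON time,OFF date,OFF time"
    {
        read -r _                            # discard the header row
        while IFS=, read -r Date Time Username Viewer IP rest; do
            [[ $IP =~ ^[0-9]+$ ]] || continue        # ignore rows without a numeric port
            if [[ -n $Username ]]; then
                # log-on: remember who connected on this port, and when
                logon[$IP]="$Username,$Viewer,$Date,$Time"
            elif [[ -n ${logon[$IP]+set} ]]; then
                # log-off: print the completed session and forget the port
                echo "$IP,${logon[$IP]},$Date,$Time"
                unset "logon[$IP]"
            fi
        done
    } < "$input"
}
```

    Called as report_sessions firstreport.csv, it prints one CSV row per completed session. Setting IFS only for the read command avoids having to save and restore the global IFS.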