
bash: get dirname from urls.txt

  •  -1
  • Alexandr Bortnik  · Tech Community  · 6 years ago
    $ cat urls.txt
    /var/www/example.com.com/upload/email/email-inliner.html
    /var/www/example.com.com/upload/email/email.html
    /var/www/example.com.com/upload/email/email2-inliner.html
    /var/www/example.com.com/upload/email/email2.html
    /var/www/example.com.com/upload/email/AquaTrainingBag.png
    /var/www/example.com.com/upload/email/fitex/fitex-ecr7.jpg
    /var/www/example.com.com/upload/email/fitex/fitex-ect7.jpg
    /var/www/example.com.com/upload/email/fitex/fitex-ecu7.jpg
    /var/www/example.com.com/upload/email/fitex/fitex.html
    /var/www/example.com.com/upload/email/fitex/logo.png
    /var/www/example.com.com/upload/email/fitex/form.html
    /var/www/example.com.com/upload/email/fitex/fitex.txt
    /var/www/example.com.com/upload/email/bigsale.html
    /var/www/example.com.com/upload/email/logo.png
    /var/www/example.com.com/upload/email/bigsale.png
    /var/www/example.com.com/upload/email/bigsale-shop.html
    /var/www/example.com.com/upload/email/bigsale.txt
    

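    Can anyone help me get the dirname of each of these?

    dirname /var/www/example.com.com/upload/email/sss.png works fine, but what about a whole list of URLs?

    Ideally this would be done without a loop (for or while), because the number of URLs can exceed tens of millions. The best approach would be one that redirects the output (tee) to a file.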

    1 answer  |  as of 6 years ago
  •  3
  •   kvantour    6 years ago

    As usual, when it comes down to things like this, Awk comes to the rescue:

    awk 'BEGIN{FS=OFS="/"}{NF--}1' <file>
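
To cover the no-loop and redirect-to-file parts of the question, the one-liner can be wrapped in a function and pointed at the list directly. This is only a minimal sketch: the function name `dirnames_short` and the output file `dirs.txt` are hypothetical, and it assumes an awk that rebuilds the record when `NF` is changed (gawk, mawk and busybox awk do) and input without empty lines.

```shell
# dirnames_short is a hypothetical helper name, not part of the original answer.
# It reads paths from the given files (or stdin) and prints each parent directory.
dirnames_short() {
    # FS=OFS="/" splits and rejoins on slashes; NF-- drops the last component;
    # the trailing 1 prints the rebuilt record.
    awk 'BEGIN{FS=OFS="/"}{NF--}1' "$@"
}

# Demo on two paths taken from the question.
printf '%s\n' \
    /var/www/example.com.com/upload/email/email.html \
    /var/www/example.com.com/upload/email/fitex/logo.png |
    dirnames_short
# prints:
#   /var/www/example.com.com/upload/email
#   /var/www/example.com.com/upload/email/fitex
```

For the real list, `dirnames_short urls.txt > dirs.txt` writes the result in a single pass. Swapping `>` for `| tee dirs.txt` also echoes it to the terminal, and adding `| sort -u` would collapse repeated directories.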
    

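    Note that the one-liner above is a simplified take on dirname and does not reproduce its behavior exactly. A fuller version that handles the edge cases is: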

    awk 'BEGIN{FS=OFS="/"}{gsub("/+","/")}
         {s=$0~/^\//;NF-=$NF?1:2;$0=$0?$0:(s?"/":".")};1' <file>
    

    The table below shows the differences:

    | path       | dirname | awk full | awk short |
    |------------+---------+----------+-----------|
    | .          | .       | .        |           |
    | /          | /       | /        |           |
    | foo        | .       | .        |           |
    | foo/       | .       | .        | foo       |
    | foo/bar    | foo     | foo      | foo       |
    | foo/bar/   | foo     | foo      | foo/bar   |
    | /foo       | /       | /        |           |
    | /foo/      | /       | /        | /foo      |
    | /foo/bar   | /foo    | /foo     | /foo      |
    | /foo/bar/  | /foo    | /foo     | /foo/bar  |
    | /foo///bar | /foo    | /foo     | /foo//    |
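
The `dirname` and `awk full` columns can be reproduced with a short script. This is a sketch rather than part of the original answer: the function name `dirname_awk` is made up here, and the loop exists only for the side-by-side comparison of a handful of test paths, not for processing the full list.

```shell
# dirname_awk is a hypothetical wrapper around the fuller awk version.
dirname_awk() {
    awk 'BEGIN{FS=OFS="/"}{gsub("/+","/")}
         {s=$0~/^\//;NF-=$NF?1:2;$0=$0?$0:(s?"/":".")};1' "$@"
}

# Print each test path next to the output of dirname and of the awk version.
for p in . / foo foo/ foo/bar foo/bar/ /foo /foo/ /foo/bar /foo/bar/ /foo///bar; do
    printf '%-12s %-8s %s\n' "$p" "$(dirname "$p")" \
        "$(printf '%s\n' "$p" | dirname_awk)"
done
```

One caveat worth checking on real data: `$NF?1:2` is a truthiness test, so a final component that is literally `0` is treated like an empty one.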
    

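    Various alternative solutions can be found in Extracting directory name from an absolute path using sed or awk. The solutions by Kent all work; the one by Solid Kim just needs a small tweak to handle multiple slashes (and is missing upvotes!).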