Finding duplicate files on Linux with md5sum and awk

Published: 2017-04-21 22:09:42

https://blog.csdn.net/liuzunming138/article/details/39345973

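Shell commands are remarkably powerful: a single command line can replace a tedious program.

Delete duplicate files by MD5 checksum, keeping only one copy of each: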

find . -type f -exec md5sum {} \; | sort -k 1 | awk 'a[$1]++{print substr($0, 35)}' | xargs -t -I{} rm -f {}

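The command is easy to follow once you know what find, md5sum, sort, awk, and xargs each do.

1. find walks the current directory and runs md5sum on every regular file it finds.

2. sort orders the md5sum output so that files with the same checksum end up on adjacent lines.

3. awk counts each checksum in an array. When a checksum has already been seen, the file on that line is a duplicate and its name is printed; the first file with each checksum is never printed.

4. xargs passes each name printed in step 3 to rm, which deletes it.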
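
The four steps above can be sketched as a dry run that only lists what would be deleted. This is a minimal sketch, assuming GNU coreutils and findutils; the helper name list_dupes is made up for illustration and is not part of the original article. It takes the file name from column 35 onward of each md5sum line (32 hex digits plus two separator characters come first), so names containing spaces survive. Names containing newlines or backslashes are not handled.

```shell
# list_dupes DIR (hypothetical helper): print every file under DIR whose
# MD5 checksum already appeared on an earlier line of the sorted output,
# i.e. every copy except the first one in sort order.
list_dupes() {
    find "$1" -type f -exec md5sum {} + |
        sort |
        awk 'seen[$1]++ { print substr($0, 35) }'
}

# Review the list first:
#   list_dupes .
# Then, only if the list looks right, delete (GNU xargs options):
#   list_dupes . | xargs -d '\n' -r rm -f --
```

Keeping the listing and the deletion as separate steps makes it possible to inspect the result before anything is removed.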

testdeMacBook-Air:check test$ grep --color -A 2 -nH 'server_name' ./*.conf  | awk '{$1="";sub(" ", "");print}' >1.txt

Reposted from: http://www.cnblogs.com/jw35/p/6227383.html

http://blog.csdn.net/zixiaomuwu/article/details/50878383

[test@web_pre 20170423]$ md5sum *|grep e6d8b695361d6ab180cb9fe59697a2ef|awk '{print $2}'|xargs ls -rlht

testdeMacBook-Air:check test$ md5sum *|awk '{print $1}'|sort|uniq -c > result.txt
testdeMacBook-Air:check test$ cd /app/data/
testdeMacBook-Air:check test$ ncdu
testdeMacBook-Air:check test$ cd 20170421
testdeMacBook-Air:check test$ ls|wc -l
testdeMacBook-Air:check test$ find -not -empty -type f -printf "%s\n" | sort -rn |uniq -d | xargs -I{} -n1 find -type f -size {}c -print0 | xargs -0 md5sum | sort | uniq -w32 --all-repeated=separate | cut -b 36- >result.txt
testdeMacBook-Air:check test$ cd 20170420
testdeMacBook-Air:check test$ find -not -empty -type f -printf "%s\n" | sort -rn |uniq -d | xargs -I{} -n1 find -type f -size {}c -print0 | xargs -0 md5sum | sort | uniq -w32 --all-repeated=separate | cut -b 36- >result.txt
testdeMacBook-Air:check test$ cat result.txt
testdeMacBook-Air:check test$ screen find -not -empty -type f -printf "%s\n" | sort -rn |uniq -d | xargs -I{} -n1 find -type f -size {}c -print0 | xargs -0 md5sum | sort | uniq -w32 --all-repeated=separate | cut -b 36- >result.txt
testdeMacBook-Air:check test$ cat result.txt
testdeMacBook-Air:check test$ find -not -empty -type f -printf "%s\n" | sort -rn |uniq -d | xargs -I{} -n1 find -type f -size {}c -print0 | xargs -0 md5sum | sort | uniq -w32 --all-repeated=separate | cut -b 36- >result.txt
testdeMacBook-Air:check test$ cat result.txt
testdeMacBook-Air:check test$ md5sum *
testdeMacBook-Air:check test$ md5sum *|awk '{print $1}'
testdeMacBook-Air:check test$ md5sum *|awk '{print $1}'|sort
testdeMacBook-Air:check test$ md5sum *|awk '{print $1}'|uniq -c
testdeMacBook-Air:check test$ md5sum *|awk '{print $1}'|sort
testdeMacBook-Air:check test$ find . -mtime -2
testdeMacBook-Air:check test$ find . -mtime -2|head
testdeMacBook-Air:check test$ find . -mtime -2|head -100
testdeMacBook-Air:check test$ find . -mtime -2|head -100|md5sum
testdeMacBook-Air:check test$ find . -mtime -2|head -100|xargs md5sum
testdeMacBook-Air:check test$ find . -mtime -2|head -100|xargs md5sum|awk '{print $1}'
testdeMacBook-Air:check test$ find . -mtime -2|head -100|xargs md5sum|awk '{print $1}'|sort
testdeMacBook-Air:check test$ find . -mtime -2|head -100|xargs md5sum|awk '{print $1}'|sort|uniq -c
testdeMacBook-Air:check test$ md5sum *|awk '{print $1}'|sort|uniq -c > result.txt
testdeMacBook-Air:check test$ screen md5sum *|awk '{print $1}'|sort|uniq -c > result.txt
testdeMacBook-Air:check test$  md5sum *|awk '{print $1}'|sort|uniq -c > result.txt
testdeMacBook-Air:check test$ screen md5sum *|awk '{print $1}'|sort|uniq -c > result.txt
testdeMacBook-Air:check test$ cat result.txt
testdeMacBook-Air:check test$ find . -mtime -2|head -1000|xargs md5sum|awk '{print $1}'|sort|uniq -c
testdeMacBook-Air:check test$ find . -mtime -2|head -1000|xargs md5sum|awk '{print $1}'|sort|uniq -c|sort
testdeMacBook-Air:check test$ find . -mtime -2|head -1000|xargs md5sum|grep 04f98a6e5c8236e9f77d92c3bb719066
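
The long pipeline in the history above avoids hashing every file: it first collects file sizes, keeps only the sizes that occur more than once, and runs md5sum only on files of those sizes. The same idea can be sketched as a function. This is a rough sketch, assuming GNU find, xargs, and uniq; the name find_dupes is made up for illustration. Unlike the history lines it keeps the full md5sum output instead of cutting from byte 36, since the file name starts at byte 35 and cutting one byte later drops the first character of each path.

```shell
# find_dupes [DIR] (hypothetical helper): print groups of files with
# identical content, one md5sum line per file, groups separated by a
# blank line. Only files that share a size with another file are hashed.
find_dupes() {
    dir=${1:-.}
    # 1. Sizes (in bytes) of all non-empty regular files;
    #    uniq -d keeps one copy of each size that occurs at least twice.
    find "$dir" -type f -not -empty -printf '%s\n' |
        sort -n |
        uniq -d |
        # 2. For each repeated size, list the files of exactly that size.
        while read -r size; do
            find "$dir" -type f -size "${size}c" -print0
        done |
        # 3. Hash only those candidates.
        xargs -0 -r md5sum |
        sort |
        # 4. Compare the first 32 characters (the checksum) and print
        #    every line that belongs to a repeated checksum.
        uniq -w32 --all-repeated=separate
}

# Usage: find_dupes /app/data > result.txt
```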

http://blog.csdn.net/weihongrao/article/details/12851771

http://www.cnblogs.com/linuxprobe/p/5699581.html

root@bananapi /home/pi/check # md5sum *
cd533b999ada48a12afaba0ec03e4c20  1.html
d41d8cd98f00b204e9800998ecf8427e  1.php
39061daa34ca3de20df03a88c52530ea  1.txt
cd533b999ada48a12afaba0ec03e4c20  2.html
39061daa34ca3de20df03a88c52530ea  2.txt
39061daa34ca3de20df03a88c52530ea  one.txt
8b6d588870d64c66c4e641c0ba2635e4  result.txt
root@bananapi /home/pi/check # md5sum *|awk '{print $1}'
cd533b999ada48a12afaba0ec03e4c20
d41d8cd98f00b204e9800998ecf8427e
39061daa34ca3de20df03a88c52530ea
cd533b999ada48a12afaba0ec03e4c20
39061daa34ca3de20df03a88c52530ea
39061daa34ca3de20df03a88c52530ea
8b6d588870d64c66c4e641c0ba2635e4
root@bananapi /home/pi/check # md5sum *|awk '{print $1}'|sort
39061daa34ca3de20df03a88c52530ea
39061daa34ca3de20df03a88c52530ea
39061daa34ca3de20df03a88c52530ea
8b6d588870d64c66c4e641c0ba2635e4
cd533b999ada48a12afaba0ec03e4c20
cd533b999ada48a12afaba0ec03e4c20
d41d8cd98f00b204e9800998ecf8427e
root@bananapi /home/pi/check # md5sum *|awk '{print $1}'|sort|uniq -c
      3 39061daa34ca3de20df03a88c52530ea
      1 8b6d588870d64c66c4e641c0ba2635e4
      2 cd533b999ada48a12afaba0ec03e4c20
      1 d41d8cd98f00b204e9800998ecf8427e
root@bananapi /home/pi/check # md5sum *|awk '{print $1}'|sort|uniq -c|sort
      1 8b6d588870d64c66c4e641c0ba2635e4
      1 d41d8cd98f00b204e9800998ecf8427e
      2 cd533b999ada48a12afaba0ec03e4c20
      3 39061daa34ca3de20df03a88c52530ea
root@bananapi /home/pi/check # md5sum *|awk '{print $1}'|sort|uniq -c|sort -n
      1 8b6d588870d64c66c4e641c0ba2635e4
      1 d41d8cd98f00b204e9800998ecf8427e
      2 cd533b999ada48a12afaba0ec03e4c20
      3 39061daa34ca3de20df03a88c52530ea
root@bananapi /home/pi/check # md5sum *|awk '{print $1}'|sort|uniq -c|sort -nr
      3 39061daa34ca3de20df03a88c52530ea
      2 cd533b999ada48a12afaba0ec03e4c20
      1 d41d8cd98f00b204e9800998ecf8427e
      1 8b6d588870d64c66c4e641c0ba2635e4
root@bananapi /home/pi/check # md5sum *|awk '{print $1}'|sort|uniq -c|sort -nr|awk '$1>1{print $1 ,$2}'
3 39061daa34ca3de20df03a88c52530ea
2 cd533b999ada48a12afaba0ec03e4c20
root@bananapi /home/pi/check # md5sum *|awk '{print $1}'|sort|uniq -c|sort -nr|awk '$1>1{print }'
      3 39061daa34ca3de20df03a88c52530ea
      2 cd533b999ada48a12afaba0ec03e4c20
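
The session above ends with the checksums that occur more than once, but it does not show which files share them. A minimal sketch of that last step follows, assuming md5sum's usual output layout (checksum, two separator characters, then the file name from column 35); the name group_dupes is made up for illustration. The order of the groups is not fixed, because awk does not define the iteration order of `for (key in array)`.

```shell
# group_dupes (hypothetical helper): read md5sum output on stdin and,
# for every checksum that occurs more than once, print the count and
# checksum followed by the indented names of the files that share it.
group_dupes() {
    awk '
        {
            sum = $1
            name = substr($0, 35)          # file name starts at column 35
            count[sum]++
            files[sum] = files[sum] "  " name "\n"
        }
        END {
            for (sum in count)
                if (count[sum] > 1)
                    printf "%d %s\n%s", count[sum], sum, files[sum]
        }
    '
}

# Usage: md5sum * | group_dupes
```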


Please credit this post to 老鄢博客 when reposting.