Text-processing commands: xargs, sort, uniq, tr, cut, paste, wc, and others

Published: 2024/1/18

1. Counting command: wc

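wc -l [file]: print the number of lines in [file]
wc -c [file]: print the number of bytes in [file]
wc -m [file]: print the number of characters in [file]; if the text contains only single-byte characters, the result equals that of wc -c [file]
wc -w [file]: print the number of words in [file]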

    ➜ linux_commands cat test1
    hello world!


    oh my god!
    你是
    ttt
    fff
    gagds
    ➜ linux_commands wc -c test1
          47 test1
    ➜ linux_commands wc -l test1
           8 test1
    ➜ linux_commands wc -m test1
          43 test1
    ➜ linux_commands wc -w test1
          10 test1

wc -l [file1] [file2] ... [file n]: print the line count of each file in turn, followed by the total

    ➜ linux_commands wc -l test1 hello.txt
           8 test1
           7 hello.txt
          15 total
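Scripts usually want the counts as bare numbers rather than padded columns with a file name. The following is a minimal sketch under stated assumptions: the file sample.txt and its contents are made up for illustration. Redirecting the file to standard input keeps wc from printing the name, and tr -d ' ' removes the left padding that some wc implementations add.

```shell
# Hypothetical sample file created in a temporary directory.
tmp=$(mktemp -d)
printf 'hello world\nfoo bar baz\n' > "$tmp/sample.txt"

# Reading from stdin suppresses the file name in the output;
# tr -d ' ' strips the padding that some wc builds emit.
lines=$(wc -l < "$tmp/sample.txt" | tr -d ' ')
words=$(wc -w < "$tmp/sample.txt" | tr -d ' ')
bytes=$(wc -c < "$tmp/sample.txt" | tr -d ' ')

echo "lines=$lines words=$words bytes=$bytes"   # lines=2 words=5 bytes=24
rm -r "$tmp"
```

The sample is pure ASCII on purpose: wc -m depends on the active locale, so its result for multi-byte text can differ between systems.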

2. Merging lines and files: paste

paste -s (-d [delim]) [file]: join all lines of the file into a single line. The default separator is a tab; -d sets a different one.

    ➜ linux_commands cat hello.txt
    hi world
    hi boys

    she is saying hi
    hi hello

    HELLO everyone
    ➜ linux_commands paste -s hello.txt
    hi world    hi boys        she is saying hi    hi hello        HELLO everyone
    ➜ linux_commands paste -s -d "#" hello.txt
    hi world#hi boys##she is saying hi#hi hello##HELLO everyone
    ➜ linux_commands paste -s -d "\n" hello.txt    (same as cat hello.txt)
    hi world
    hi boys

    she is saying hi
    hi hello

    HELLO everyone

paste (-d [delim]) [file1] [file2]: merge two files side by side, line by line. The default separator is a tab; -d sets a different one.

    ➜ linux_commands cat test1
    hello world!


    oh my god!
    你是
    ttt
    fff
    gagds
    ➜ linux_commands cat test4
    this is test4

    oh my god
    hey man
    ➜ linux_commands paste test1 test4
    hello world!    this is test4

        oh my god
    oh my god!    hey man
    你是
    ttt
    fff
    gagds
    ➜ linux_commands paste -d "#" test1 test4
    hello world!#this is test4
    #
    #oh my god
    oh my god!#hey man
    你是#
    ttt#
    fff#
    gagds#

ls | paste - - -: list the files in the current directory in three columns

    ➜ linux_commands ls | paste - - -
    diff.txt    hello.txt    input.txt
    ls.cmd      regex.txt    test1
    test3       test4        test5
    test6       test7        tt
    ut

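sed = [file] | paste -s -d '\t\n' - -: prefix each line of [file] with its line number. sed = prints every line number on its own line before the line itself, and paste then joins the stream using a tab and a newline alternately as separators. sed is a stream editor and is not covered in detail here.

    ➜ linux_commands sed = test1 | paste -s -d '\t\n' - -
    1    hello world!
    2
    3
    4    oh my god!
    5    你是
    6    ttt
    7    fff
    8    gagds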
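The two paste modes described in this section can be tried without any files by feeding standard input through "-". This is a small sketch with made-up input values, not a recipe from the original article.

```shell
# -s (serial) joins every input line into one line, here separated by commas.
joined=$(printf 'a\nb\nc\n' | paste -s -d ',' -)
echo "$joined"    # a,b,c

# Without -s, each "-" takes the next line of stdin in turn,
# so two of them pair up consecutive lines.
pairs=$(printf 'k1\nv1\nk2\nv2\n' | paste -d '=' - -)
echo "$pairs"     # prints k1=v1 and k2=v2 on separate lines
```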

3. Cutting text out of lines: cut

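cut -c 3-5: for each line of standard input, print characters 3 through 5. The numbers 3 and 5 are only example values.

    ➜ linux_commands cut -c 3-5
    123456      (first input)
    345         (characters 3 to 5)
    qw          (second input)
                (shorter than 3 characters, so the output is an empty line)
    ^C
    ➜ linux_commands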

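cut -c 3-5 [file]: for each line of [file], print characters 3 through 5. Omitting the 5 (3-) cuts through the end of the line; omitting the 3 (-5) cuts from the start of the line.

    ➜ linux_commands cut -c 3-5 test1
    llo


     my

    t
    f
    gds
    ➜ linux_commands cut -c 3- test1
    llo world!


     my god!

    t
    f
    gds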

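cut -d':' -f5: split each line of standard input on ':' and print the 5th field. The ':' is only an example delimiter; a space or a semicolon works as well, and the default is a tab. A line that does not contain the delimiter is printed unchanged. The session below uses -f2.

    ➜ linux_commands cut -d':' -f2
    aa:bb:cc
    bb
    aa
    aa
    aa:

    ^C
    ➜ linux_commands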

cut -s -d':' -f2: split each line of standard input on ':' and print the 2nd field; with -s, lines that contain no delimiter are not printed.

    ➜ linux_commands cut -s -d':' -f2
    aa
    aa:bb
    bb
    ^C
    ➜ linux_commands

cut -d';' -f2,3: split each line of standard input on ';' and print fields 2 and 3. Changing -f2,3 to -f2- prints everything from field 2 to the end of the line.

    ➜ linux_commands cut -d';' -f2,3
    aa;bb;cc;dd
    bb;cc
    aa
    aa
    ^C
    ➜ linux_commands

cut (-n) -b 2-4: for each line of standard input, print bytes 2 through 4. With -n, multi-byte characters are not split.

    ➜ linux_commands cut -b 2-4
    asdf
    sdf
    ^C
    ➜ linux_commands cut -n -b 2-4
    晚上去吃飯
    晚
    ^C
    ➜ linux_commands
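The field and character options can be compared side by side on one record. The sketch below assumes a made-up passwd-style line; the variable names are illustrative only.

```shell
# Hypothetical colon-separated record used only for illustration.
line='root:x:0:0:root:/root:/bin/sh'

user=$(printf '%s\n' "$line" | cut -d ':' -f 1)      # first field
ids=$(printf '%s\n' "$line" | cut -d ':' -f 3,4)     # fields 3 and 4, delimiter kept
tail=$(printf '%s\n' "$line" | cut -d ':' -f 6-)     # field 6 through the end

# A line without the delimiter is passed through unless -s is given.
kept=$(printf 'plain\n' | cut -d ':' -f 2)
dropped=$(printf 'plain\n' | cut -s -d ':' -f 2)

# Character positions instead of fields.
chars=$(printf 'abcdef\n' | cut -c 3-5)

echo "$user | $ids | $tail | $kept | [$dropped] | $chars"
# root | 0:0 | /root:/bin/sh | plain | [] | cde
```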

4. Translating characters: tr

tr [ch1] [ch2] < [file]: replace every occurrence of the character [ch1] in [file] with [ch2] and print the result. Only the output changes; the file itself is not modified.

    ➜ linux_commands cat hello.txt
    hi world
    hi boys

    she is saying hi
    hi hello

    HELLO everyone
    ➜ linux_commands tr h o < hello.txt
    oi world
    oi boys

    soe is saying oi
    oi oello

    HELLO everyone

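tr [str1] [str2] < [file]: replace each character of [str1] found in [file] with the character at the same position in [str2]. The mapping is per character, not per string. If the lengths differ, surplus characters in [str2] are ignored, and a [str2] that is too short is padded by repeating its last character.

    ➜ linux_commands tr 'hi' 'oh' < hello.txt    (here equivalent to: tr 'h' 'o' < hello.txt | tr 'i' 'h')
    oh world
    oh boys

    soe hs sayhng oh
    oh oello            (h and i are replaced independently)

    HELLO everyone
    ➜ linux_commands tr hi hello < hello.txt
    he world            (only h->h and i->e are applied)
    he boys

    she es sayeng he
    he hello

    HELLO everyone
    ➜ linux_commands tr boys men < hello.txt
    hi werld
    hi menn             (boys is longer than men, so the extra characters map to the last letter, n)

    nhe in naning hi
    hi helle

    HELLO evernene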

tr -d [str] < [file]: delete every character of [file] that appears in [str]

    ➜ linux_commands tr -d hi < hello.txt
     world
     boys

    se s sayng 
     ello

    HELLO everyone

tr "[:lower:]" "[:upper:]" < [file]: convert the lowercase letters in the file to uppercase

    ➜ linux_commands tr "[:lower:]" "[:upper:]" < hello.txt
    HI WORLD
    HI BOYS

    SHE IS SAYING HI
    HI HELLO

    HELLO EVERYONE

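tr -s [ch1] [ch2]: translate [ch1] to [ch2] in standard input and squeeze each run of repeated [ch2] into a single character, so a run of consecutive [ch1] becomes one [ch2]

    ➜ linux_commands echo "Hello   my   friend" | tr -s ' ' '\n'    (runs of spaces become one newline)
    Hello
    my
    friend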

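tr -d -c '[characters]': print only the parts of standard input that belong to the set [characters], that is, delete the complement of the set

    ➜ linux_commands echo "22aa" | tr -d '[0-9]'
    aa
    ➜ linux_commands echo "22aa" | tr -d -c '[0-9]'
    22%         (the newline is deleted too; the % is how zsh marks output without a trailing newline)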
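The main tr modes (translate, squeeze, delete, delete the complement) fit in one short script. This is a sketch with made-up input strings; note that the complement example lists \n in the set so that the trailing newline survives.

```shell
# Translate: each character of the first set maps to the same position in the second.
swapped=$(printf 'hi\n' | tr 'hi' 'oh')

# Character classes handle case conversion.
upper=$(printf 'hello\n' | tr '[:lower:]' '[:upper:]')

# Squeeze: with a single set, -s collapses runs of those characters.
squeezed=$(printf 'a   b  c\n' | tr -s ' ')

# Delete the complement: keep only digits and the newline.
digits=$(printf 'tel: 555-0100\n' | tr -d -c '0-9\n')

echo "$swapped $upper [$squeezed] $digits"   # oh HELLO [a b c] 5550100
```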

5. Sorting text: sort

sort (-r) [file]: sort all lines of [file] in ascending order; -r sorts in descending order and -R in random order

    ➜ linux_commands cat hello.txt
    hi world
    hi boys
    hello boys
    HELLO everyone
    she is saying hi
    hi boys
    ➜ linux_commands sort hello.txt
    HELLO everyone
    hello boys
    hi boys
    hi boys
    hi world
    she is saying hi
    ➜ linux_commands sort -r hello.txt
    she is saying hi
    hi world
    hi boys
    hi boys
    hello boys
    HELLO everyone
    ➜ linux_commands sort -R hello.txt
    she is saying hi
    hi boys
    hi boys
    HELLO everyone
    hi world
    hello boys

sort --ignore-case [file]: sort all lines of [file], ignoring case

    ➜ linux_commands sort --ignore-case hello.txt
    hello boys
    HELLO everyone
    hi boys
    hi boys
    hi world
    she is saying hi

sort -u [file]: sort all lines of [file] and keep only unique lines (duplicates are removed)

    ➜ linux_commands sort -u hello.txt
    HELLO everyone
    hello boys
    hi boys
    hi world
    she is saying hi

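sort -t[ch] -k [num] [file]: split each line of [file] on the character [ch] (-t) and sort by the key that starts at field [num] (-k). With a single number the key runs from that field to the end of the line.

    ➜ linux_commands sort -t' ' -k 2 hello.txt    (sort by the text after the first space)
    hello boys
    hi boys
    hi boys
    HELLO everyone
    she is saying hi
    hi world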
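A key given as -k 2 extends from field 2 to the end of the line, while -k 2,2 restricts it to field 2 alone. The sketch below uses two made-up lines to show that the difference changes the order; LC_ALL=C is set only to make the byte-wise comparison predictable across systems.

```shell
# Two hypothetical records whose second field is identical.
data='x b 2
y b 1'

# Key = "b 2" vs "b 1": the text after field 2 decides the order.
whole=$(printf '%s\n' "$data" | LC_ALL=C sort -t ' ' -k 2)

# Key = "b" vs "b": the keys tie, so sort falls back to comparing whole lines.
field=$(printf '%s\n' "$data" | LC_ALL=C sort -t ' ' -k 2,2)

printf 'with -k 2:\n%s\nwith -k 2,2:\n%s\n' "$whole" "$field"
```

With -k 2 the line "y b 1" comes first; with -k 2,2 the line "x b 2" comes first.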

ls -lh | sort -h(/-n) -k 5: sort the files and directories in the current directory by size. -h compares human-readable sizes (10B, 168K) by their real magnitude; -n compares only the leading number.

    ➜ linux_commands ls -lh | sort -n -k 5
    total 376
    -rw-r--r--  1 qiushye  staff    10B Apr  9 12:13 input.txt
    -rw-r--r--  1 qiushye  staff    23B Apr 30 12:21 regex.txt
    -rw-r--r--  1 qiushye  staff    68B May 14 12:34 hello.txt
    drwxr-xr-x  4 qiushye  staff   128B Apr  4 22:17 ut
    -rw-r--r--  1 qiushye  staff   161B Apr  4 22:43 diff.txt
    -rw-r--r--  1 qiushye  staff   168K May 15 12:11 commodity.txt    (the largest file, but not the largest number)
    dr-xr-xrwx  6 eric     staff   192B Mar 22 21:24 tt
    drwxr-xr-x  8 qiushye  staff   256B May 15 12:12 temp
    ➜ linux_commands ls -lh | sort -h -k 5
    total 376
    -rw-r--r--  1 qiushye  staff    10B Apr  9 12:13 input.txt
    -rw-r--r--  1 qiushye  staff    23B Apr 30 12:21 regex.txt
    -rw-r--r--  1 qiushye  staff    68B May 14 12:34 hello.txt
    drwxr-xr-x  4 qiushye  staff   128B Apr  4 22:17 ut
    -rw-r--r--  1 qiushye  staff   161B Apr  4 22:43 diff.txt
    dr-xr-xrwx  6 eric     staff   192B Mar 22 21:24 tt
    drwxr-xr-x  8 qiushye  staff   256B May 15 12:12 temp
    -rw-r--r--  1 qiushye  staff   168K May 15 12:11 commodity.txt

sort [input_file] -o [output_file]: sort [input_file] and write the result to [output_file]

    ➜ linux_commands sort -u hello.txt -o hello_sorted.txt
    ➜ linux_commands cat hello_sorted.txt
    HELLO everyone
    hello boys
    hi boys
    hi world
    she is saying hi

sort -c [file]: check whether [file] is already sorted in ascending order. For a file sorted in descending order, add -r to the check.

    ➜ linux_commands sort -c hello_sorted.txt
    ➜ linux_commands sort -c hello.txt
    sort: hello.txt:2: disorder: hi boys
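sort -c reports its verdict through the exit status, which makes it usable in a script condition, and the check has to use the same ordering options as the sort it verifies. A minimal sketch with made-up numbers:

```shell
# The sequence 1, 2, 10 is in numeric order ...
if printf '1\n2\n10\n' | sort -c -n 2>/dev/null; then
  numeric=sorted
else
  numeric=unsorted
fi

# ... but not in plain byte order, where "10" sorts before "2".
if printf '1\n2\n10\n' | LC_ALL=C sort -c 2>/dev/null; then
  lexical=sorted
else
  lexical=unsorted
fi

echo "numeric check: $numeric, lexical check: $lexical"
# numeric check: sorted, lexical check: unsorted
```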

6. Filtering repeated lines: uniq

sort [file] | uniq: print the lines of [file], showing each repeated line only once

    ➜ linux_commands cat hello.txt
    hi world
    hi boys
    hi world
    hello boys
    HELLO everyone
    hi world
    hi boys
    ➜ linux_commands sort hello.txt| uniq
    HELLO everyone
    hello boys
    hi boys
    hi world

sort [file] | uniq -u (/-d): print only the lines of [file] that are not repeated; with -d instead, print only the lines that are repeated

    ➜ linux_commands sort hello.txt| uniq -u
    HELLO everyone
    hello boys
    ➜ linux_commands sort hello.txt| uniq -d
    hi boys
    hi world

sort [file] | uniq -c: print each distinct line of [file] prefixed with the number of times it occurs

    ➜ linux_commands sort hello.txt| uniq -c
       1 HELLO everyone
       1 hello boys
       2 hi boys
       3 hi world

sort [file] | uniq -c | sort -nr: count the occurrences of each line of [file] and list them from most to least frequent

    ➜ linux_commands sort hello.txt| uniq -c | sort -nr
       3 hi world
       2 hi boys
       1 hello boys
       1 HELLO everyone

sort [file] | uniq -i: print the distinct lines of [file], comparing them case-insensitively

    ➜ linux_commands cat hello.txt    (the last two lines are newly added)
    hi world
    hi boys
    hi world
    hello boys
    HELLO everyone
    hi world
    hi boys
    oh boys
    Hello everyone
    ➜ linux_commands sort hello.txt| uniq -i
    HELLO everyone
    hello boys
    hi boys
    hi world
    oh boys

sort [file] | uniq -f [num]: skip the first [num] fields of each line when comparing, then print the distinct lines

    ➜ linux_commands sort hello.txt| uniq -f 1
    HELLO everyone
    hello boys
    hi world
    oh boys
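Every example in this section sorts first because uniq only compares adjacent lines. The sketch below, with made-up input, shows what happens without the sort and then builds the usual frequency pipeline; the awk step at the end is an addition for reshaping the output and is not part of the original article.

```shell
input='b
a
b
c
b
a'

# Without sorting, no two equal lines are adjacent, so nothing is collapsed.
unsorted=$(printf '%s\n' "$input" | uniq | wc -l | tr -d ' ')

# After sorting, equal lines sit next to each other and uniq merges them.
distinct=$(printf '%s\n' "$input" | LC_ALL=C sort | uniq | wc -l | tr -d ' ')

# Most frequent line: count, sort by count descending, keep the first row.
top=$(printf '%s\n' "$input" | LC_ALL=C sort | uniq -c | sort -nr | head -n 1 |
  awk '{print $2 ":" $1}')

echo "unsorted=$unsorted distinct=$distinct top=$top"
# unsorted=6 distinct=3 top=b:3
```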

7. Turning standard input into command-line arguments: xargs

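A shell command can receive its data from standard input or from command-line arguments. Some commands read standard input, such as cat and grep; others do not (echo, for example) and only accept command-line arguments. xargs converts standard input into the arguments a command needs.

    ➜ temp ls | cat
    one
    test1
    test3
    test4
    test5
    test6
    test7
    three
    two
    ➜ temp ls | echo
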

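xargs (echo): read strings from standard input until Ctrl+D ends the input, then print them. echo is the default command, so adding it explicitly has the same effect.

    ➜ temp xargs
    a vb bbb
    a vb bbb
    ➜ temp xargs echo
    a vb bbb
    a vb bbb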

echo "one two three" | xargs mkdir: use the contents of standard input as the arguments for creating directories. By default, spaces and newlines act as separators.

    ➜ temp echo "one two three" | xargs mkdir
    ➜ temp ls
    one   three two
    ➜ temp echo "aa\nbb" | xargs mkdir
    ➜ temp ls
    aa    bb    one   three two

xargs -p (/ -t): -p prints the command that is about to run and asks for confirmation, where y means run it; -t prints the command and runs it immediately.

    ➜ temp ls | xargs -p echo
    echo aa bb one three two?...y
    aa bb one three two
    ➜ temp ls | xargs -t echo
    echo aa bb one three two
    aa bb one three two

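find [path] -type f -print0 | xargs -0 rm: find all files under the directory [path] and delete them. xargs splits its input on whitespace by default, so a file name containing a space would be broken into several arguments. The -print0 option of find separates the file names with null characters instead, and the -0 option of xargs uses the null character as the separator, so files whose names contain spaces are deleted correctly. (When testing this command, make sure the files under [path] are safe to delete.)

    ➜ ut ls    (ut is a directory name)
    test2 test3
    ➜ ut find . -type f -print0 | xargs -0 rm
    ➜ ut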
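The null-separated pipeline can be tried safely in a scratch directory. This is a minimal sketch: the directory comes from mktemp and the two file names are made up, one of them containing a space on purpose.

```shell
# Scratch directory so nothing outside it is touched.
tmp=$(mktemp -d)
touch "$tmp/plain.txt" "$tmp/with space.txt"

before=$(find "$tmp" -type f | wc -l | tr -d ' ')

# -print0 / -0 pass each name as one argument, spaces included.
find "$tmp" -type f -print0 | xargs -0 rm

after=$(find "$tmp" -type f | wc -l | tr -d ' ')
echo "files before: $before, after: $after"   # files before: 2, after: 0
rmdir "$tmp"
```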

find [path] -name [pattern] | xargs grep [str]: find the files under [path] whose names match [pattern], then print the lines in them that contain the string [str]

    ➜ linux_commands find . -name "hello*" | xargs grep hello
    ./hello_sorted.txt:hello boys
    ./hello.txt:hello boys

xargs -L(-n) [num]: -L passes [num] input lines to each invocation of the command; -n passes [num] items to each invocation

    ➜ linux_commands xargs -L 1 find . -name    (each input line becomes the argument of one find . -name)
    "*.txt"     (first input)
    ./hh.txt
    ./regex.txt
    ./diff.txt
    ./input.txt
    ./hello_sorted.txt
    ./hello.txt
    ./commodity.txt
    "hello*"    (second input)
    ./hello_sorted.txt
    ./hello.txt
    ➜ linux_commands echo {0..9} | xargs -n 2 echo
    0 1
    2 3
    4 5
    6 7
    8 9
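Batching with -n also handles an input whose length is not a multiple of the batch size: the last invocation simply receives the leftover items. A short sketch with made-up input:

```shell
# Five items, two per invocation: echo runs three times.
batches=$(printf '%s\n' 1 2 3 4 5 | xargs -n 2 echo)
echo "$batches"

# Count the invocations by counting output lines.
runs=$(printf '%s\n' "$batches" | wc -l | tr -d ' ')
echo "echo was run $runs times"   # echo was run 3 times
```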

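xargs -I [str]: declare [str] as a placeholder. Each input line replaces every occurrence of [str] in the command, much like a variable name, so one input item can be handed to several commands.

    ➜ temp ls
    input.txt three     two
    ➜ temp cat input.txt| xargs -I name sh -c 'echo name; mkdir name'    (name is the placeholder)
    aa
    bb
    cc
    ➜ temp ls
    aa        bb        cc        input.txt three     two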
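When the placeholder is substituted directly into an sh -c script, the input text becomes part of the script and is parsed by the shell. One commonly used alternative, sketched here with made-up input and the conventional {} placeholder, passes the item as a positional parameter so the script reads it as "$1".

```shell
# Plain substitution: the placeholder is replaced inside the argument.
labels=$(printf 'aa\nbb\n' | xargs -I {} echo "dir-{}")
echo "$labels"

# Passing the item as $1 keeps it out of the script text;
# "_" fills the $0 slot of the inner shell.
quoted=$(printf 'a b\n' | xargs -I {} sh -c 'printf "[%s]\n" "$1"' _ {})
echo "$quoted"   # [a b]
```

With -I each input line is one item, so the line "a b" reaches the inner shell as a single argument.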
