Pipes are only useful if the tools on each end do something. grep
and sort appeared already. A handful of others become indispensable
once you have pipes — not because they are complex, but because they
each do one small thing precisely and compose cleanly with everything
else.
sort
sort names.txt # alphabetical, ascending
sort -r names.txt # reverse
sort -n numbers.txt # numeric sort (not lexicographic)
sort -rn numbers.txt # numeric, descending
sort -u names.txt # sort and remove duplicates
sort reads lines and emits them in order. The distinction between
-n and the default matters: without -n, 10 sorts before 9
because 1 < 9 lexicographically.
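A quick way to see the difference is to sort the same numbers both ways. The three input values below are made up for illustration:

```shell
# Made-up input: three numbers, one per line.
printf '10\n9\n2\n' | sort       # lexicographic order: 10, 2, 9
printf '10\n9\n2\n' | sort -n    # numeric order: 2, 9, 10
```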
uniq
sort names.txt | uniq # remove consecutive duplicates
sort names.txt | uniq -c # count occurrences
sort names.txt | uniq -d # print only lines that appear more than once
uniq only removes consecutive duplicates. Always sort first.
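To see why the sort matters, here is a small sketch with made-up input in which the repeated line is not adjacent to its twin:

```shell
# Made-up input where "b" appears twice, but never on adjacent lines.
printf 'b\na\nb\n' | uniq           # prints b, a, b: no neighbours match
printf 'b\na\nb\n' | sort | uniq    # prints a, b
```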
The combination sort | uniq -c | sort -rn is a frequency counter
for anything line-based. Use it to find the most common errors in a
log, the most-called functions in a trace, the most repeated words in
a file.
cat build.log | grep 'error' | sort | uniq -c | sort -rn | head -10
Ten most frequent error lines in a build log. You will use this exact pipeline[1].
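If you have no build log at hand, the same pipeline can be tried on a few invented lines. The messages below are hypothetical stand-ins for build.log. The exact padding that uniq -c puts before each count varies between implementations, so only the counts and order are noted:

```shell
# Made-up log lines standing in for build.log.
printf '%s\n' \
    'error: missing semicolon' \
    'warning: unused variable' \
    'error: undeclared identifier' \
    'error: missing semicolon' \
    'error: missing semicolon' \
    'error: undeclared identifier' |
    grep 'error' | sort | uniq -c | sort -rn | head -10
# First output line:  count 3, "error: missing semicolon"
# Second output line: count 2, "error: undeclared identifier"
```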
wc
wc -l main.c # count lines
wc -w main.c # count words
wc -c main.c # count bytes
wc counts. Combined with pipes:
find . -name '*.c' | wc -l # how many C files
grep -r 'malloc' src/ | wc -l # how many lines mention malloc
cut
cut -d: -f1 /etc/passwd # first field, colon-delimited
cut -d, -f2,4 data.csv # fields 2 and 4, comma-delimited
cut -c1-10 file.txt # characters 1 through 10 of each line
cut extracts columns from structured text. -d sets the delimiter,
-f selects fields (1-indexed). Essential for parsing /etc/passwd,
CSV files, or any fixed-format output.
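The three selection modes can be tried without touching any real file. The records below are made up; the first only mimics the /etc/passwd layout:

```shell
# Made-up record in /etc/passwd layout: field 1 is the login, field 7 the shell.
echo 'alice:x:1000:1000::/home/alice:/bin/bash' | cut -d: -f1,7   # alice:/bin/bash
# Made-up CSV row: field 2 is the age.
echo 'alice,30,Lisbon' | cut -d, -f2                              # 30
# Character positions 1 through 10.
echo 'abcdefghijklmnop' | cut -c1-10                              # abcdefghij
```

When several fields are selected, cut joins them with the same delimiter it split on.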
tr
echo 'Hello World' | tr 'a-z' 'A-Z' # to uppercase
echo 'hello world' | tr ' ' '_' # replace spaces with underscores
echo 'hello' | tr -d 'l' # delete all l characters
cat file.txt | tr -s ' ' # squeeze repeated spaces to one
tr translates or deletes characters. It reads stdin and writes
stdout and does not take a filename argument, so feed it input through a pipe or with < redirection.
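Since tr reads only stdin, a file can also reach it through input redirection. A minimal sketch, using a throwaway file created with mktemp just for the demonstration:

```shell
# Throwaway file created only for this demonstration.
tmp=$(mktemp)
echo 'hello   world' > "$tmp"

# Input redirection connects the file to tr's stdin.
tr -s ' ' < "$tmp"          # hello world
tr 'a-z' 'A-Z' < "$tmp"     # HELLO   WORLD

rm -f "$tmp"
```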
sed
sed is the stream editor — part of POSIX and available on every
Unix-like system, just like vi. It reads lines, applies a
transformation, and writes the result to stdout.
echo 'hello world' | sed 's/world/terminal/' # hello terminal
cat app.log | sed 's/\[INFO\]//' # strip INFO tags
The substitution expression s/pattern/replacement/ replaces the
first match on each line. The g flag replaces all matches:
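echo 'aaa' | sed 's/a/b/' # baa: first match only
echo 'aaa' | sed 's/a/b/g' # bbb: all matches (global)
-i edits the file in place without producing output. It is an extension found in GNU and BSD sed rather than part of POSIX, and BSD/macOS sed expects a backup suffix after it (sed -i '' for none):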
sed -i 's/DEBUG/INFO/g' config.txt
The original file is modified directly. You do not have to manage a temporary file or a redirect yourself; sed handles that behind the scenes. Useful in scripts and in situations where an interactive editor such as vim is not what you want.
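Because in-place editing overwrites the original, a cautious variant keeps a backup. The sketch below assumes a sed that accepts a suffix attached to -i, which both GNU and BSD sed do. The directory and file contents are made up for the demonstration:

```shell
# Throwaway directory and file, created only for this demonstration.
dir=$(mktemp -d)
printf 'level=DEBUG\nmode=DEBUG\n' > "$dir/config.txt"

# -i.bak edits config.txt in place and keeps the original as config.txt.bak.
sed -i.bak 's/DEBUG/INFO/g' "$dir/config.txt"

cat "$dir/config.txt"        # level=INFO, mode=INFO
cat "$dir/config.txt.bak"    # level=DEBUG, mode=DEBUG

rm -rf "$dir"
```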
Putting it together
A build system outputs a file with one compilation unit per line, some repeated. You want a sorted list of unique filenames with counts:
cat build.log | grep '\.c$' | sort | uniq -c | sort -rn
None of these tools is impressive alone. Together, in a pipeline, they handle most of the text-processing tasks you will encounter in a C development workflow.
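As one more sketch of that composition, the pipeline below chains cut, tr, sort, uniq, and sed to count log levels. The log format ("date LEVEL message", space-delimited) and its contents are invented for illustration:

```shell
# Made-up log: "<date> <LEVEL> <message>", space-delimited.
printf '%s\n' \
    '2024-05-01 ERROR disk full' \
    '2024-05-01 INFO started' \
    '2024-05-02 ERROR disk full' \
    '2024-05-02 WARN slow response' \
    '2024-05-03 ERROR timeout' \
    '2024-05-03 WARN slow response' |
    cut -d' ' -f2 |        # keep only the level column
    tr 'A-Z' 'a-z' |       # normalise to lowercase
    sort | uniq -c | sort -rn |
    sed 's/^ *//'          # strip the padding uniq -c puts before each count
# 3 error
# 2 warn
# 1 info
```

Each stage does one small job, and any of them can be swapped out without disturbing the rest.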