You will spend a significant portion of your time looking for things.
A function defined somewhere in a large codebase. A file you named
badly three days ago. Every line in a log file that mentions a specific
error. Two tools handle all of this: grep and find.
grep — searching inside files
grep 'malloc' main.c
grep searches for a pattern inside a file and prints every line that
matches. The output includes the matching line; use -n to also show
line numbers.
grep -n 'malloc' main.c # show line numbers
grep -r 'malloc' src/ # search recursively in a directory
grep -r -l 'malloc' src/ # -l: print only filenames, not lines
grep -i 'error' syslog # -i: case-insensitive
grep -v 'DEBUG' app.log # -v: invert, print lines that do NOT match
grep -r is your first tool when you join a codebase you did not
write. Find where a function is called, where a variable is defined,
where an error string originates.
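To see these flags side by side, here is a minimal sketch you can paste into a terminal. It builds a throwaway directory first; the file names and contents (src/main.c, src/util.c, app.log) are invented for illustration and are not part of any real project.

```shell
# Build a throwaway sandbox so the commands have something to search.
# Every file name and line of content below is made up for this sketch.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/src"
printf 'int main(void) {\n    char *p = malloc(16);\n    return 0;\n}\n' > "$sandbox/src/main.c"
printf 'void helper(void) {}\n' > "$sandbox/src/util.c"
printf 'DEBUG start\nError: disk full\nDEBUG stop\n' > "$sandbox/app.log"

cd "$sandbox" || exit 1

# -n: the matching line, prefixed by its line number
numbered=$(grep -n 'malloc' src/main.c)
echo "$numbered"

# -r -l: only the names of files under src/ that contain a match
files=$(grep -r -l 'malloc' src/)
echo "$files"

# -i: the lowercase pattern 'error' also matches "Error"
insensitive=$(grep -i 'error' app.log)
echo "$insensitive"

# -v: every line that does NOT contain DEBUG
inverted=$(grep -v 'DEBUG' app.log)
echo "$inverted"

# Clean up the sandbox.
cd / && rm -rf "$sandbox"
```

Only src/main.c is listed by -r -l, because src/util.c never mentions malloc. The -i and -v searches happen to print the same single line here, since the log has exactly one line that is not a DEBUG line.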
Basic regular expressions
grep patterns are regular expressions[1]. You do not need to know all
of regex to use grep effectively:
grep '^int' main.c # lines starting with "int"
grep 'main$' main.c # lines ending with "main"
grep 'ma.n' main.c # . matches any single character
grep 'mai[ln]' main.c # character class: "mail" or "main"
grep 'erro*r' log.txt # * means zero or more of the previous char
Start with ^ (start of line), $ (end of line), and . (any
character). You will learn the rest as you need it.
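A quick way to build intuition is to run each pattern against a small word list and count what matches. The sketch below assumes a made-up input file, and it uses -c (a standard grep flag that prints the number of matching lines), which this section has not introduced.

```shell
# Sample input invented for this sketch: one candidate string per line.
words=$(mktemp)
printf 'int main\nmain loop\nerror\nerrr\nerrooor\nmoon\n' > "$words"

# ^ anchors the match to the start of the line.
starts=$(grep '^main' "$words")
echo "$starts"        # main loop

# $ anchors the match to the end of the line.
ends=$(grep 'main$' "$words")
echo "$ends"          # int main

# Each . stands for exactly one character: "main" and "moon" both fit m..n.
dot=$(grep -c 'm..n' "$words")
echo "$dot"           # 3

# o* allows zero or more "o": errr, error, and errooor all match.
star=$(grep -c 'erro*r' "$words")
echo "$star"          # 3

rm -f "$words"
```

Note that 'erro*r' matches errr: zero occurrences of "o" still count. That is the most common surprise with *.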
find — searching for files
find . -name 'main.c'
find searches the filesystem for files matching criteria. The first
argument is where to start; . means the current directory.
find . -name '*.c' # all .c files
find . -name '*.c' -type f # only regular files (not dirs named x.c)
find . -type d # only directories
find /var/log -name '*.log' -newer /tmp/reference # modified more recently than /tmp/reference
find . -name '*.o' -delete # find and delete
Use -type f for regular files, -type d for directories, and -type l for
symlinks.
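The difference -type f makes is easiest to see when a directory happens to have a file-like name. This is a minimal sketch in a scratch directory; the layout (a directory called build.c, plus src/main.c and src/main.o) is hypothetical and chosen only to trigger that case.

```shell
# Scratch tree invented for this sketch. build.c is a DIRECTORY on purpose.
tree=$(mktemp -d)
mkdir -p "$tree/src" "$tree/build.c"
touch "$tree/src/main.c" "$tree/src/main.o" "$tree/notes.txt"

cd "$tree" || exit 1

# -name alone matches on the name only, so the directory shows up too.
all_c=$(find . -name '*.c' | LC_ALL=C sort)
echo "$all_c"

# Adding -type f keeps only regular files.
files_c=$(find . -name '*.c' -type f)
echo "$files_c"

# -type d lists directories, including the starting point itself.
dirs=$(find . -type d | LC_ALL=C sort)
echo "$dirs"

# -delete removes what was matched; here that is ./src/main.o.
find . -name '*.o' -type f -delete
left_o=$(find . -name '*.o')
echo "object files left: ${left_o:-none}"

cd / && rm -rf "$tree"
```

A reasonable habit is to run the find without -delete first, read the list, and only then add -delete, because the removal cannot be undone.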
Combining find and grep
The real power appears when you pipe find into xargs grep:
find . -name '*.c' | xargs grep 'malloc'
This finds every .c file and searches each one for malloc. Use it
when you need explicit control over which files are searched.
find . -name '*.c' | xargs grep -n 'tci_strlen'
This is how you will search your own codebase in the chapters ahead. Learn the combination now.
When you build libtci, you will have a project with a dozen .c
files across two directories. You want every call to tci_strlen: not just whether it
exists, but which file and which line. The combination gives you that:
find . -name '*.c' | xargs grep -n 'tci_strlen'
The output looks like this:
./src/string.c:14: len = tci_strlen(buf);
./src/format.c:31: if (tci_strlen(str) > MAX)
./tests/test_string.c:8: assert(tci_strlen("hello") == 5);
Every call, file and line. You did not open a single file. This is how you navigate codebases you did not write.
For filenames that might contain spaces, the robust form uses
-print0 and xargs -0:
find . -name '*.c' -print0 | xargs -0 grep -n 'tci_strlen'
The plain pipe form works for typical project filenames. Adopt the
-0 form if your directory tree has spaces in paths.
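To see why the two forms differ, the sketch below runs both against a path that contains spaces. The directory and file names ("my lib", "string util.c", plain.c) are hypothetical. By default xargs splits its input on whitespace, so the plain pipe hands grep fragments that are not real paths.

```shell
# Scratch project invented for this sketch; note the spaces in the path.
proj=$(mktemp -d)
mkdir -p "$proj/my lib"
printf 'len = tci_strlen(buf);\n' > "$proj/my lib/string util.c"
printf 'int x;\n' > "$proj/plain.c"

cd "$proj" || exit 1

# Plain pipe: "./my lib/string util.c" is split into "./my", "lib/string",
# and "util.c". None of those exist, so grep reports errors (silenced here)
# and the real match is never found.
plain=$(find . -name '*.c' | xargs grep -n 'tci_strlen' 2>/dev/null) || true
echo "plain pipe found: ${plain:-nothing}"

# NUL-delimited: -print0 and -0 pass each path through intact.
safe=$(find . -name '*.c' -print0 | xargs -0 grep -n 'tci_strlen')
echo "$safe"

cd / && rm -rf "$proj"
```

The plain pipe does not just print errors; it silently misses the match, which is the more dangerous failure when you are trying to confirm that a function is unused.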