thecodingidiot.com

Finding Things

You will spend a significant portion of your time looking for things. A function defined somewhere in a large codebase. A file you named badly three days ago. Every line in a log file that mentions a specific error. Two tools handle all of this: grep and find.

grep — searching inside files

grep 'malloc' main.c

grep searches a file for a pattern and prints every line that matches. By default it prints only the text of each matching line; add -n to prefix each one with its line number.

grep -n 'malloc' main.c        # show line numbers
grep -r 'malloc' src/          # search recursively in a directory
grep -r -l 'malloc' src/       # -l: print only filenames, not lines
grep -i 'error' syslog         # -i: case-insensitive
grep -v 'DEBUG' app.log        # -v: invert — lines that do NOT match

grep -r is your first tool when you join a codebase you did not write. Find where a function is called, where a variable is defined, where an error string originates.
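To try these flags without touching a real project, here is a minimal sketch that builds a throwaway directory and runs each one. The file names and contents are invented for illustration, and it assumes a grep that supports -r (GNU and BSD grep both do).

```shell
#!/bin/sh
# Build a tiny throwaway tree. Every name and line below is made up.
demo=$(mktemp -d)
mkdir -p "$demo/src"
printf 'int main(void) {\n    char *p = malloc(16);\n    return 0;\n}\n' > "$demo/src/main.c"
printf 'void helper(void) {}\n' > "$demo/src/util.c"
printf 'INFO start\nDEBUG tick\nError: disk full\n' > "$demo/app.log"

# -n: the matching line, prefixed with its line number
with_numbers=$(grep -n 'malloc' "$demo/src/main.c")

# -r -l: only the names of files under src/ that contain a match
files_only=$(cd "$demo" && grep -r -l 'malloc' src/)

# -i: 'error' also matches "Error"
any_case=$(grep -i 'error' "$demo/app.log")

# -v: every line that does NOT contain DEBUG
not_debug=$(grep -v 'DEBUG' "$demo/app.log")

printf '%s\n' "$with_numbers" "$files_only" "$any_case" "$not_debug"
rm -rf "$demo"
```

Only main.c mentions malloc, so -l reports that one file and stays silent about util.c.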

Basic regular expressions

grep patterns are regular expressions[1]. You do not need to know all of regex to use grep effectively:

grep '^int' main.c         # lines starting with "int"
grep 'main$' main.c        # lines ending with "main"
grep 'ma.n' main.c         # . matches any single character
grep 'ma[il]n' main.c      # character class: "main" or "maln" (i or l in third position)
grep 'erro*r' log.txt      # * means zero or more of the previous char

Start with ^ (start of line), $ (end of line), and . (any character). You will learn the rest as you need it.
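A quick way to check your reading of a pattern is to run it against a file whose contents you control. The sketch below does that for the five patterns above; the sample words are invented for the test and are not from any real source file.

```shell
#!/bin/sh
# One candidate string per line, so each match is easy to see.
sample=$(mktemp)
printf '%s\n' 'int main' 'maln' 'man' 'mail' 'errr' 'error' 'erroor' > "$sample"

starts_int=$(grep '^int' "$sample")      # line begins with "int"
ends_main=$(grep 'main$' "$sample")      # line ends with "main"
any_char=$(grep 'ma.n' "$sample")        # "man" fails: . must consume one character
in_class=$(grep 'ma[il]n' "$sample")     # "mail" fails: the pattern still ends in n
star_count=$(grep -c 'erro*r' "$sample") # -c counts matching lines: errr, error, erroor

printf 'ma.n matched:\n%s\n' "$any_char"
printf 'ma[il]n matched:\n%s\n' "$in_class"
printf 'erro*r matched %s lines\n' "$star_count"
rm -f "$sample"
```

Note that 'errr' matches 'erro*r' because * allows zero copies of the o.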

find — searching for files

find . -name 'main.c'

find searches the filesystem for files matching criteria. The first argument is where to start; . means the current directory.

find . -name '*.c'                                  # all .c files
find . -name '*.c' -type f                          # only regular files (not dirs named x.c)
find . -type d                                      # only directories
find /var/log -name '*.log' -newer /tmp/reference   # modified more recently than /tmp/reference
find . -name '*.o' -delete                          # find and delete (irreversible; run without -delete first)

Use -type f for regular files, -type d for directories, and -type l for symlinks.
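The difference between these tests is easiest to see on a tree built to trip them up. This is a sketch under stated assumptions: the layout is invented (including a directory deliberately named odd.c), and -delete is assumed to be available, which holds for GNU and BSD find but is not guaranteed everywhere.

```shell
#!/bin/sh
# Throwaway tree: one real .c file, one object file,
# a directory named like a .c file, and a symlink named like one.
tree=$(mktemp -d)
mkdir -p "$tree/src" "$tree/odd.c"
touch "$tree/src/main.c" "$tree/src/main.o"
ln -s src/main.c "$tree/link.c"
cd "$tree" || exit 1

all_c=$(find . -name '*.c' | sort)            # matches by name only: file, dir, and symlink
files_c=$(find . -name '*.c' -type f | sort)  # regular files only
links=$(find . -type l)                       # symlinks only

# Preview first, then delete for real.
find . -name '*.o' -type f
find . -name '*.o' -type f -delete
left_o=$(find . -name '*.o')                  # empty once the delete has run

printf 'by name:\n%s\nregular files:\n%s\nsymlinks:\n%s\n' "$all_c" "$files_c" "$links"
cd - >/dev/null && rm -rf "$tree"
```

The name-only search returns three entries, while adding -type f narrows it to ./src/main.c, because find does not follow symlinks by default.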

Combining find and grep

The real power appears when you pipe find into xargs grep:

find . -name '*.c' | xargs grep 'malloc'

find lists every .c file, xargs turns that list into arguments for grep, and grep searches each file for malloc. Use it when you need explicit control over which files are searched.

This is how you will search your own codebase in the chapters ahead. Learn the combination now.

When you build libtci you will have a project with a dozen .c files across two directories. You want every call to tci_strlen — not just whether it exists, but which file, which line. The combination gives you that:

find . -name '*.c' | xargs grep -n 'tci_strlen'

The output looks like this:

./src/string.c:14:    len = tci_strlen(buf);
./src/format.c:31:    if (tci_strlen(str) > MAX)
./tests/test_string.c:8:    assert(tci_strlen("hello") == 5);

Every call, file and line. You did not open a single file. This is how you navigate codebases you did not write.

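xargs splits its input on whitespace, so a path containing a space reaches grep as two broken arguments. The robust form uses -print0 and xargs -0, which separate filenames with a NUL byte instead: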

find . -name '*.c' -print0 | xargs -0 grep -n 'tci_strlen'

The plain pipe form works for typical project filenames. Adopt the -0 form if your directory tree has spaces in paths.
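To see the failure rather than take it on faith, the sketch below puts two files in a directory whose name contains a space and runs both forms. The directory "my src" and the file contents are invented for the demonstration; -print0 and -0 are assumed to be supported, as they are in GNU and BSD tools.

```shell
#!/bin/sh
work=$(mktemp -d)
mkdir -p "$work/my src"
printf 'len = tci_strlen(buf);\n' > "$work/my src/string.c"
printf 'if (tci_strlen(str) > MAX)\n' > "$work/my src/format.c"
cd "$work" || exit 1

# Plain pipe: xargs cuts "./my src/string.c" into "./my" and "src/string.c".
# Neither path exists, so grep reports errors and prints no matches.
plain=$(find . -name '*.c' | xargs grep -n 'tci_strlen' 2>/dev/null || true)

# NUL-separated: each path arrives at grep intact.
safe=$(find . -name '*.c' -print0 | xargs -0 grep -n 'tci_strlen' | sort)

printf 'plain form found: [%s]\n' "$plain"
printf 'safe form found:\n%s\n' "$safe"
cd - >/dev/null && rm -rf "$work"
```

The plain form comes back empty even though both files contain a call, which is exactly the kind of silent miss the -0 form prevents.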

Footnotes

  1. Regular expression - Wikipedia