thecodingidiot.com

Chains

The two-command pipeline from page 06 is a special case. This page generalises it to chains of N commands using the linked-list API introduced in pages 01 and 02.

The structure

A chain of N commands needs N−1 pipes. Each pipe connects one command's stdout to the next command's stdin. The first command reads from infd; the last command writes to outfd.

For N commands:

  • Allocate an array of N−1 int[2] pipe pairs before forking.
  • Fork N children.
  • In each child, set stdin to either infd (first) or the read end of the previous pipe (others), and stdout to either outfd (last) or the write end of the current pipe (others).
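  • Close every pipe end a process no longer needs: each child closes all of them after its dup2 calls, and the parent closes all of them once the fork loop has finished.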
  • Wait for all N children.

Allocate the pipes

int     **pipes;
int     n;
int     i;
 
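n = tciu_lstsize(cmds);
/* malloc(0) may legally return NULL, so request at least one slot */
pipes = malloc((n > 1 ? n - 1 : 1) * sizeof(int *));
if (!pipes) {
    perror("malloc");
    exit(1);
}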
i = 0;
while (i < n - 1) {
    pipes[i] = malloc(2 * sizeof(int));
    if (!pipes[i] || pipe(pipes[i]) < 0) {
        perror("pipe");
        exit(1);
    }
    i++;
}

All pipes are created before any fork. If a pipe() call fails after some pipes already exist, exit rather than carrying on: process exit closes the descriptors that were already created, so a half-built set of pipes never leaks into forked children.
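run_pipeline below calls alloc_pipes and free_pipes without showing them. The following is one possible shape for those helpers, not the author's code. It assumes a variant that undoes its own work and returns NULL on failure instead of exiting, so the caller would need a NULL check. It also assumes free_pipes only frees memory, because run_pipeline closes the descriptors itself before calling it.

```c
#include <stdlib.h>
#include <unistd.h>

/* Free the array only. The caller is assumed to have closed every
 * descriptor already, as run_pipeline does after its fork loop. */
void    free_pipes(int **pipes, int count)
{
    int     i;

    if (!pipes)
        return;
    i = 0;
    while (i < count) {
        free(pipes[i]);
        i++;
    }
    free(pipes);
}

/* Create `count` pipes. On failure, close and free whatever was
 * already created and return NULL, so nothing leaks. */
int     **alloc_pipes(int count)
{
    int     **pipes;
    int     i;

    /* calloc(0, ...) may return NULL, so always ask for one slot */
    pipes = calloc(count > 0 ? count : 1, sizeof(int *));
    if (!pipes)
        return (NULL);
    i = 0;
    while (i < count) {
        pipes[i] = malloc(2 * sizeof(int));
        if (!pipes[i] || pipe(pipes[i]) < 0) {
            free(pipes[i]);             /* free(NULL) is a no-op */
            while (--i >= 0) {
                close(pipes[i][0]);
                close(pipes[i][1]);
                free(pipes[i]);
            }
            free(pipes);
            return (NULL);
        }
        i++;
    }
    return (pipes);
}
```

With a single command, alloc_pipes(0) returns an empty but valid array, so the caller can treat NULL as a genuine error.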

Fork and route each child

Walk the command list. For each command at index k (0-based):

  • stdin: infd if k == 0; else pipes[k-1][0]
  • stdout: outfd if k == n-1; else pipes[k][1]
  • Close all pipe ends after the dup2 calls and before exec.

void    run_pipeline(t_list *cmds, int infd, int outfd)
{
    int     n;
    int     **pipes;
    pid_t   *pids;
    t_list  *node;
    int     k;
    int     j;
    int     stdin_fd;
    int     stdout_fd;
    int     status;
 
    n = tciu_lstsize(cmds);
    pipes = alloc_pipes(n - 1);   /* allocate and pipe() each pair */
    pids = malloc(n * sizeof(pid_t));
    if (!pids)
        exit(1);
 
    node = cmds;
    k = 0;
    while (node) {
        stdin_fd  = (k == 0)     ? infd         : pipes[k - 1][0];
        stdout_fd = (k == n - 1) ? outfd        : pipes[k][1];
 
        pids[k] = fork();
        if (pids[k] < 0) { perror("fork"); exit(1); }
        if (pids[k] == 0) {
            dup2(stdin_fd, STDIN_FILENO);
            dup2(stdout_fd, STDOUT_FILENO);
            j = 0;
            while (j < n - 1) {
                close(pipes[j][0]);
                close(pipes[j][1]);
                j++;
            }
            if (infd != STDIN_FILENO)
                close(infd);
            if (outfd != STDOUT_FILENO)
                close(outfd);
            exec_cmd((char **)node->content);
            exit(1);    /* reached only if exec_cmd returns; keeps the child out of the fork loop */
        }
 
        node = node->next;
        k++;
    }
 
    j = 0;
    while (j < n - 1) {
        close(pipes[j][0]);
        close(pipes[j][1]);
        j++;
    }
 
    status = 0;
    k = 0;
    while (k < n) {
        waitpid(pids[k], &status, 0);
        k++;
    }
 
    free_pipes(pipes, n - 1);
    free(pids);
}

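The child closes all pipe ends, including the ones it uses via dup2. After dup2(stdin_fd, STDIN_FILENO), the original stdin_fd descriptor is redundant, so closing every pipes[j][0] and pipes[j][1] (including the ones that were just dup2'd) leaves the exec'd program with only its standard descriptors.

The parent closes all pipe ends after the fork loop because it uses none of them. This is not just tidiness: a reader sees EOF only once every copy of the write end is closed, so a write end left open in the parent or in an unrelated child would keep the downstream command waiting forever.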
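The reason every process must close the pipe ends it does not use can be checked in isolation. The sketch below is illustrative only; sees_eof and stray_writer_blocks_eof are hypothetical names. It keeps a duplicate of a pipe's write end open and uses poll() with a short timeout to show that the read end reports no EOF until that duplicate is closed.

```c
#include <poll.h>
#include <unistd.h>

/* Return 1 if read_fd reports end-of-file within timeout_ms. */
int     sees_eof(int read_fd, int timeout_ms)
{
    struct pollfd   pfd;
    char            c;

    pfd.fd = read_fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    if (poll(&pfd, 1, timeout_ms) <= 0)
        return (0);     /* timed out: some write end is still open */
    return (read(read_fd, &c, 1) == 0);
}

/* Return 1 if a leftover copy of the write end hides EOF and closing
 * that copy reveals it; return -1 if the setup itself fails. */
int     stray_writer_blocks_eof(void)
{
    int     fd[2];
    int     stray;
    int     before;
    int     after;

    if (pipe(fd) < 0)
        return (-1);
    stray = dup(fd[1]);         /* stands in for an unclosed pipe end */
    if (stray < 0) {
        close(fd[0]);
        close(fd[1]);
        return (-1);
    }
    close(fd[1]);               /* the "real" writer is done */
    before = sees_eof(fd[0], 100);
    close(stray);               /* now the last write end is gone */
    after = sees_eof(fd[0], 100);
    close(fd[0]);
    return (before == 0 && after == 1);
}
```

In the pipeline, the stray descriptor would be a pipes[j][1] that the parent or a sibling child forgot to close.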

Collect exit statuses

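The wait loop above overwrites status on every iteration, so only the last child's status survives. That is intentional: a pipeline exits with cmdN's exit code, which is the shell's convention. To report it, change run_pipeline to return int, declare int last_status, and replace the wait loop with the version below. The free_pipes and free(pids) calls stay between the loop and the return.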

last_status = 0;
k = 0;
while (k < n) {
    waitpid(pids[k], &status, 0);
    if (k == n - 1 && WIFEXITED(status))
        last_status = WEXITSTATUS(status);
    k++;
}
return (last_status);
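The snippet above records a code only when the last command exits normally. A child killed by a signal has no WEXITSTATUS, and many shells (bash among them) report 128 plus the signal number in that case. A minimal sketch of that decoding follows. The names exit_code_from_status and run_child are hypothetical, and the fallback value of 1 is an arbitrary choice.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Turn a waitpid() status into a shell-style exit code. */
int     exit_code_from_status(int status)
{
    if (WIFEXITED(status))
        return (WEXITSTATUS(status));
    if (WIFSIGNALED(status))
        return (128 + WTERMSIG(status));
    return (1);         /* neither exited nor signalled: generic failure */
}

/* Demo helper: fork a child that either exits with `code` or, when
 * sig > 0, kills itself with that signal. Returns the decoded code,
 * or -1 if fork or waitpid fails. */
int     run_child(int code, int sig)
{
    pid_t   pid;
    int     status;

    pid = fork();
    if (pid < 0)
        return (-1);
    if (pid == 0) {
        if (sig > 0)
            raise(sig);
        _exit(code);
    }
    if (waitpid(pid, &status, 0) < 0)
        return (-1);
    return (exit_code_from_status(status));
}
```

In the wait loop, last_status = exit_code_from_status(status) for k == n - 1 would replace the WIFEXITED test.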

Heredoc

The heredoc form replaces infile with lines read from stdin until a LIMITER line appears. Detect it in main.c before opening the first argument as a file:

static int is_heredoc(char *arg)
{
    /*
     * The first argument is a heredoc limiter if it does not name
     * an existing file. A simple heuristic: probe the path with
     * access(); if the probe fails, treat the argument as a limiter.
     */
    return (access(arg, F_OK) != 0);
}

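A more explicit approach is to require a distinct flag, but the invocation format stated in the project uses the first argument's existence to disambiguate. The cost of the heuristic is that a mistyped infile name is treated as a limiter, and the program then waits for input on stdin.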

Write the heredoc lines to a pipe. The parent writes; the child (cmd1) reads:

static int heredoc(char *limiter)
{
    int     fd[2];
    char    *line;
 
    if (pipe(fd) < 0) { perror("pipe"); exit(1); }
 
    line = tci_getline(STDIN_FILENO);
    while (line) {
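        if (tci_strcmp(line, limiter) == 0) {
            free(line);
            break;
        }
        write(fd[1], line, tci_strlen(line));
        write(fd[1], "\n", 1);
        free(line);
        line = tci_getline(STDIN_FILENO);
    }
    close(fd[1]);
    return (fd[0]);   /* caller uses this as infd */
}

tci_getline returns one line per call without the trailing \n, so the line is compared to the limiter with tci_strcmp and the newline is written back separately. When the limiter is matched, or stdin reaches EOF, close(fd[1]) signals EOF to cmd1. The read end fd[0] is returned as infd.

The match must be exact. A prefix test such as tci_strncmp(line, limiter, tci_strlen(limiter)) would also end the heredoc on any line that merely starts with the limiter, for example STOPPING when the limiter is STOP. If a line reader can leave the trailing \n in place, strip it before comparing instead.

Because the parent writes the whole heredoc before any child exists to read it, the text has to fit in the pipe's buffer (typically 64 KiB on Linux). Beyond that, write blocks.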
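As a sketch of a limiter test that tolerates a trailing newline without accepting longer lines, the function below drops one trailing \n if present and then requires an exact match. is_limiter is a hypothetical name, and the standard strlen and strncmp stand in for the tci_ equivalents.

```c
#include <string.h>

/* Return 1 if `line` equals `limiter`, ignoring at most one trailing
 * newline on `line`; return 0 otherwise. */
int     is_limiter(const char *line, const char *limiter)
{
    size_t  len;
    size_t  lim;

    len = strlen(line);
    lim = strlen(limiter);
    if (len > 0 && line[len - 1] == '\n')
        len--;
    /* equal lengths rule out prefix matches such as "STOPPING" */
    return (len == lim && strncmp(line, limiter, lim) == 0);
}
```

With the limiter STOP, both "STOP" and "STOP\n" match, while "STOPPING" and an empty line do not.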

Update main.c to detect and handle the heredoc form:

int main(int argc, char **argv)
{
    int     infd;
    int     outfd;
    int     status;
    t_list  *cmds;
 
    if (argc < 4) {
        tci_printf("usage: ./pipeline infile|LIMITER cmd... outfile\n");
        return (1);
    }
 
    if (is_heredoc(argv[1]))
        infd = heredoc(argv[1]);
    else {
        infd = open(argv[1], O_RDONLY);
        if (infd < 0) { perror(argv[1]); return (1); }
    }
 
    outfd = open(argv[argc - 1], O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (outfd < 0) { perror(argv[argc - 1]); close(infd); return (1); }
 
    cmds = build_cmd_list(argv + 2, argc - 3);
    if (!cmds) { close(infd); close(outfd); return (1); }
 
    /* uses the int-returning run_pipeline from "Collect exit statuses" */
    status = run_pipeline(cmds, infd, outfd);
 
    close(infd);
    close(outfd);
    tciu_lstclear(&cmds, free_cmd);
    return (status);
}

build_cmd_list walks argv + 2 for argc - 3 entries (everything between the first argument and the last argument), splits each on spaces, and builds the t_list of char ** argv arrays.
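The splitting step can be sketched on its own. The version below is an assumption about how one command string might become an argv array, not the project's build_cmd_list. split_spaces and free_argv are hypothetical names, and it uses only the standard library. It treats runs of spaces as one separator and does not understand quotes, so "grep 'a b'" would yield three words.

```c
#include <stdlib.h>
#include <string.h>

/* Count space-separated words in s. */
static size_t   count_words(const char *s)
{
    size_t  n;

    n = 0;
    while (*s) {
        while (*s == ' ')
            s++;
        if (*s)
            n++;
        while (*s && *s != ' ')
            s++;
    }
    return (n);
}

/* Free a NULL-terminated argv array built by split_spaces. */
void    free_argv(char **argv)
{
    size_t  i;

    if (!argv)
        return;
    i = 0;
    while (argv[i])
        free(argv[i++]);
    free(argv);
}

/* Split s on spaces into a NULL-terminated array of fresh strings.
 * Returns NULL if an allocation fails. */
char    **split_spaces(const char *s)
{
    char    **argv;
    size_t  i;
    size_t  len;

    argv = calloc(count_words(s) + 1, sizeof(char *));
    if (!argv)
        return (NULL);
    i = 0;
    while (*s) {
        while (*s == ' ')
            s++;
        if (!*s)
            break;
        len = strcspn(s, " ");          /* length of the next word */
        argv[i] = malloc(len + 1);
        if (!argv[i]) {
            free_argv(argv);            /* later slots are still NULL */
            return (NULL);
        }
        memcpy(argv[i], s, len);
        argv[i][len] = '\0';
        i++;
        s += len;
    }
    return (argv);
}
```

Each resulting array is what a list node's content would hold, ready to be passed to exec_cmd as a char **.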

Verify chain behaviour

make re

Three-command chain:

printf 'delta\nalpha\nbeta\ngamma\nalpha\ndelta\n' > words.txt
./pipeline words.txt "sort" "uniq" "wc -l" counts.txt
cat counts.txt

Expected: 4 — four unique words. Shell equivalent:

< words.txt sort | uniq | wc -l > counts.txt

Heredoc:

./pipeline "STOP" "cat" "wc -l" out.txt

Type a few lines, then type STOP and press Enter. out.txt should contain the line count of what was typed.

Four-command chain:

./pipeline words.txt "cat" "sort" "uniq -c" "sort -rn" ranked.txt
cat ranked.txt

Expected: words sorted by frequency, most frequent first.

The chain of N commands works. Page 08 builds, runs the tester, and closes the chapter.