tci_getline takes a file descriptor — a plain int. The POSIX system
call that reads bytes from a file descriptor into a buffer is read().
This page covers what read() does and what its return values mean
before putting it to use inside tci_getline.
File descriptors in C
In f01/06 you saw file descriptors as shell numbers: 2>/dev/null, 2>&1,
fd 0 for stdin and fd 1 for stdout. In c02/02 you used them in C for the
first time: write(fd, buf, n) writes to a file descriptor.
read(fd, buf, n) is the other direction.
Three file descriptors exist at program start:
| fd | Name | Direction |
|---|---|---|
| 0 | stdin | read |
| 1 | stdout | write |
| 2 | stderr | write |
Other file descriptors are created by calls such as open(). The kernel
assigns the lowest unused integer, which is 3 as long as 0, 1, and 2
are still open. How to open a file and what to do when you are done
with it is next.
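The lowest-unused rule can be observed directly. The sketch below is an illustration only, not part of libtci: it assumes /dev/null exists and is readable, the helper name reopen_gets_same_fd is made up for this example, and it borrows open() and close() from the next section purely to show the numbering.

```c
#include <fcntl.h>   /* open, O_RDONLY */
#include <stdio.h>   /* printf */
#include <unistd.h>  /* close */

/* Hypothetical helper: returns 1 if a descriptor number freed by close()
   is handed out again by the next open(), 0 if not, -1 if open() failed. */
int reopen_gets_same_fd(void)
{
    int first;
    int second;
    int again;

    first = open("/dev/null", O_RDONLY);
    second = open("/dev/null", O_RDONLY);
    if (first < 0 || second < 0)
    {
        if (first >= 0)
            close(first);
        if (second >= 0)
            close(second);
        return (-1);
    }
    close(first);                         /* frees the lower number */
    again = open("/dev/null", O_RDONLY);  /* lowest unused is first again */
    printf("first=%d second=%d again=%d\n", first, second, again);
    if (again >= 0)
        close(again);
    close(second);
    return (again == first);
}
```

Calling reopen_gets_same_fd() from a single-threaded main should return 1. The exact numbers printed depend on which descriptors the process already holds.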
open() and close()
<fcntl.h> stands for file control. It is a POSIX header that
defines the functions and constants used to open, create, and control
file descriptors at the system call level. The flags like O_RDONLY,
O_WRONLY, and O_CREAT are defined here, as is open itself. It
does not deal with buffered I/O — that is the <stdio.h> domain.
open takes a path, a set of flags, and returns a file descriptor:

    int open(const char *path, /* file path */
             int flags,        /* access mode */
             ...);             /* optional mode for O_CREAT */

For reading, the flag is O_RDONLY. open returns the new file
descriptor on success, or −1 on failure:

    int fd = open("questions.txt", O_RDONLY);
    if (fd < 0) {
        /* file not found, permission denied, etc. */
    }

Every open must be paired with a close. A file descriptor is a
kernel resource — not closing it leaks it. The process has a limit on
how many file descriptors it can hold open at once. A loop that opens
files without closing them will eventually fail with EMFILE. Run
ulimit -n in your terminal to check the limit on your system.
When you are done reading, close the descriptor in C:

    close(fd);

The full signatures, all flags, and every error code are in the manual:
man 2 open and man 2 close.
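The failure branch becomes more useful once it reports why open failed. A minimal sketch, assuming a hypothetical helper named try_open (not part of libtci) that copies errno right after the failing call and prints the message from strerror:

```c
#include <errno.h>   /* errno */
#include <fcntl.h>   /* open, O_RDONLY */
#include <stdio.h>   /* fprintf */
#include <string.h>  /* strerror */
#include <unistd.h>  /* close */

/* Hypothetical helper: opens path read-only, then closes it again.
   Returns 0 on success, or the errno value set by open() on failure. */
int try_open(const char *path)
{
    int fd;
    int saved;

    fd = open(path, O_RDONLY);
    if (fd < 0)
    {
        saved = errno;  /* copy first: later calls may overwrite errno */
        fprintf(stderr, "open %s: %s\n", path, strerror(saved));
        return (saved);
    }
    close(fd);          /* pair every successful open with a close */
    return (0);
}
```

For a path whose directory does not exist, the return value is ENOENT. A process that has run out of descriptors would get EMFILE from the same code path.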
read()
read is defined in <unistd.h>:

    ssize_t read(int fd,        /* file descriptor to read from */
                 void *buf,     /* buffer to write bytes into */
                 size_t count); /* maximum bytes to read */

Three return values matter:
- Positive: the number of bytes placed in buf. May be less than count; a short read is not an error.
- 0: end of file. No more bytes available on this descriptor.
- −1: error. errno holds the reason.
read does not add a null terminator. The buffer is raw bytes. To treat
it as a C string, reserve one extra byte in the buffer and write '\0'
at index bytes_read before using any string function on it.
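All three outcomes can be provoked in a few lines. The sketch below is an assumption-laden demo, not tci_getline code: it uses pipe(), which this page does not otherwise cover, because a pipe makes a short read and an end of file easy to trigger on demand. The helper name read_outcomes is hypothetical.

```c
#include <errno.h>   /* errno, set by the failing read() */
#include <unistd.h>  /* pipe, read, write, close */

/* Hypothetical demo: stores the return values of three read() calls in
   results[0..2]: a short read, end of file, and an error.
   Returns 0 on success, -1 if the pipe could not be set up. */
int read_outcomes(ssize_t results[3])
{
    int  fds[2];       /* fds[0] is the read end, fds[1] the write end */
    char buf[16 + 1];  /* +1 leaves room for a null terminator */

    if (pipe(fds) < 0)
        return (-1);
    if (write(fds[1], "hello", 5) != 5)
    {
        close(fds[0]);
        close(fds[1]);
        return (-1);
    }
    results[0] = read(fds[0], buf, 16);  /* asks for 16, only 5 are there */
    if (results[0] > 0)
        buf[results[0]] = '\0';          /* now usable as a C string */
    close(fds[1]);                       /* no writer left */
    results[1] = read(fds[0], buf, 16);  /* 0: end of file */
    close(fds[0]);
    results[2] = read(fds[0], buf, 16);  /* -1: descriptor is closed */
    return (0);
}
```

After a successful call the array holds 5, 0, and -1, and errno is EBADF from the last read.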
Why BUFFER_SIZE matters
A running program lives in user space — the region of memory the OS gives to each process. The kernel lives in a separate, privileged region. Code in user space cannot touch the kernel directly; it requests services through system calls.[1]
Every open, close, read, and write is a system call. Each one
forces a switch from user mode into the kernel (often loosely called a
context switch): the CPU saves the current state, raises its
privilege level, executes kernel code, copies data across the boundary,
then switches back. You have been paying this cost since c02/02
with every write(). With read() you pay it on the input side too.
The overhead of entering and leaving the kernel is roughly fixed: you pay it whether you transfer 1 byte or 65536. That makes the transfer size critical. Reading 1024 bytes of a file:
| BUFFER_SIZE | read() calls |
|---|---|
| 1 | 1024 |
| 8 | 128 |
| 128 | 8 |
| 1024 | 1 |
The number is set at compile time with -D BUFFER_SIZE=128. The caller
controls the trade-off between memory and call frequency. tci_getline
adapts to whatever value is chosen — including 1, which is the hardest
case and the one the tester uses to stress-test the implementation.
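The table can be reproduced by counting calls. This is a rough sketch under stated assumptions: count_reads is a hypothetical helper, /tmp is assumed to be writable, and mkstemp() is used to get a scratch file. It counts only the read() calls that deliver data; a loop that runs to end of file makes one more call, the one that returns 0.

```c
#define _POSIX_C_SOURCE 200809L  /* so mkstemp is declared under -std=c99 */
#include <stdlib.h>  /* mkstemp, malloc, free */
#include <string.h>  /* memset */
#include <unistd.h>  /* read, write, lseek, close, unlink */

/* Hypothetical helper: writes file_size bytes to a temporary file, reads
   them back in chunks of buffer_size, and returns how many read() calls
   delivered data. Returns -1 on any failure. */
long count_reads(size_t file_size, size_t buffer_size)
{
    char    path[] = "/tmp/readcountXXXXXX";  /* template for mkstemp */
    char    *buf;
    int     fd;
    long    calls;
    ssize_t n;

    if (file_size == 0 || buffer_size == 0)
        return (-1);
    buf = malloc(file_size > buffer_size ? file_size : buffer_size);
    if (buf == NULL)
        return (-1);
    fd = mkstemp(path);
    if (fd < 0)
    {
        free(buf);
        return (-1);
    }
    unlink(path);                /* the file vanishes once fd is closed */
    memset(buf, 'x', file_size);
    calls = -1;
    if (write(fd, buf, file_size) == (ssize_t)file_size
        && lseek(fd, 0, SEEK_SET) == 0)
    {
        calls = 0;
        while ((n = read(fd, buf, buffer_size)) > 0)
            calls++;             /* one system call per chunk */
        if (n < 0)
            calls = -1;
    }
    close(fd);
    free(buf);
    return (calls);
}
```

count_reads(1024, 8) gives 128 and count_reads(1024, 1024) gives 1, matching the rows of the table.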
A concrete exercise
Before adding any logic to tci_getline, write a standalone program that
opens a file, reads it in chunks, and prints each chunk in brackets. Save
it as readfile.c; this is scratch code, not part of libtci:

    #include <fcntl.h>  /* open, O_RDONLY */
    #include <unistd.h> /* read, close */
    #include <stdio.h>  /* printf */

    #ifndef BUFFER_SIZE
    # define BUFFER_SIZE 8
    #endif

    int main(int argc, char **argv)
    {
        char    buf[BUFFER_SIZE + 1]; /* +1 for the null terminator we add */
        int     fd;
        ssize_t bytes;

        if (argc != 2)
            return (1);
        fd = open(argv[1], O_RDONLY);
        if (fd < 0)
            return (1);
        while ((bytes = read(fd, buf, BUFFER_SIZE)) > 0) {
            buf[bytes] = '\0';   /* read() does not null-terminate */
            printf("[%s]", buf); /* brackets show each chunk boundary */
        }
        printf("\n");
        close(fd);
        return (0);
    }

Create a small test file:

    printf "one\ntwo\nthree\n" > test.txt

Compile and run with BUFFER_SIZE=3:

    gcc -Wall -Wextra -g -std=c99 -D BUFFER_SIZE=3 -o readfile readfile.c
    ./readfile test.txt

Output:

    [one][
    tw][o
    t][hre][e
    ]

The brackets show exactly when each read() call returned. Chunk
boundaries fall in the middle of "two" and twice inside "three".
read() knows nothing about '\n'; it returns raw bytes. tci_getline
must find the '\n', return everything up to it, and keep whatever
came after it for the next call. The next page covers how that state
survives between calls.
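Finding the '\n' inside one chunk is the small first step of that job. The sketch below is one possible approach, not necessarily the one the next page takes: find_newline is a hypothetical helper, and it relies on memchr from <string.h>, which libtci may or may not allow. A hand-written loop over the bytes does the same thing.

```c
#include <string.h>     /* memchr */
#include <sys/types.h>  /* ssize_t */

/* Hypothetical helper: returns the index of the first '\n' among the
   len bytes of chunk, or -1 if the chunk contains none. */
ssize_t find_newline(const char *chunk, size_t len)
{
    const char *nl;

    /* memchr takes an explicit length, so the chunk needs no '\0' */
    nl = memchr(chunk, '\n', len);
    if (nl == NULL)
        return (-1);
    return ((ssize_t)(nl - chunk));
}
```

With an index i that is not -1, the bytes 0 through i end the current line and the remaining len - i - 1 bytes belong to the next one. A result of -1 means the line continues in the next chunk.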