How to read from standard input

Standard input, standard output, and pipes are such fundamental Unix concepts that you certainly think you know them well. Nevertheless, I bet you can learn something from this blog.

The task we pick today is to write a program that reads from standard input and processes that input, with two requirements:
  1) When the input comes in big chunks, the processing is fast: High throughput.
  2) When the input is typed by a user directly, the processing reacts quickly: When the user has finished typing a line, the input is processed immediately.

For simplicity, "processing" the input will mean echoing it on standard output, like the 'cat' program does. Also for simplicity, we will treat read errors like EOF, and ignore write errors on standard output.

Easy, you think? ISO C has everything you need, because standard I/O in ISO C was created for precisely this purpose, right? Let's see. Here's the first try:

================================== mycat1.c ==================================
#include <stdio.h>

int
main ()
{
  for (;;)
    {
      /* Read and write one byte per library call.  */
      int c = fgetc (stdin);
      if (c == EOF)
        break;
      fputc (c, stdout);
    }
  return 0;
}
==============================================================================

Let's compile this program. We use the option '-O' to enable the usual compiler optimizations, and '-Wall' so that gcc reports dumb programming blunders that we made.

$ gcc -O -Wall mycat1.c -o mycat1

Let's check the interactive behaviour, by typing two lines, "Hello" and "World", and terminating the input by pressing Ctrl-D.

$ ./mycat1
Hello
Hello
World
World
[Ctrl-D]

The input was echoed immediately after each line, which is fine.

Now let's check the throughput. We want to benchmark the stdin processing, so let's minimize the output processing by redirecting the output to /dev/null.

$ dd if=/dev/zero bs=1M count=1000 | time ./mycat1 > /dev/null
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 30.1262 s, 34.8 MB/s
Clock: 30.13 User: 27.33 System: 1.17

This is on a 1 GHz CPU. 35 MB per second means ca. 30 ns of processing time for each byte, or ca. 20-50 instructions. This does not sound optimal, really.

Next try. We've heard that fgetc and fputc are functions, and that there are C macros, getc and putc, that do the same thing and should be faster. Let's try these instead:

================================== mycat2.c ==================================
#include <stdio.h>

int
main ()
{
  for (;;)
    {
      /* getc and putc may be implemented as macros.  */
      int c = getc (stdin);
      if (c == EOF)
        break;
      putc (c, stdout);
    }
  return 0;
}
==============================================================================

The interactive behaviour is the same:

$ ./mycat2
Hello
Hello
World
World
[Ctrl-D]

And the throughput?

$ dd if=/dev/zero bs=1M count=1000 | time ./mycat2 > /dev/null
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 30.4019 s, 34.5 MB/s
Clock: 30.39 User: 27.56 System: 1.19

It's the same. No significant change.

So, next try. We heard that throughput can always be increased by introducing buffers. Every operation that used to be performed on a single byte will now be performed on an entire buffer at once. This reduces the function call overhead, and increases the locality of references during the processing.
Here's what it looks like:

================================== mycat3.c ==================================
#include <stdio.h>

int
main ()
{
  for (;;)
    {
      char buf[4096];
      /* Ask for a full buffer; fread returns the number of bytes read.  */
      size_t count = fread (buf, 1, sizeof (buf), stdin);
      if (count == 0)
        break;
      fwrite (buf, 1, count, stdout);
    }
  return 0;
}
==============================================================================

Indeed, the throughput is increased:

$ dd if=/dev/zero bs=1M count=1000 | time ./mycat3 > /dev/null
Clock: 3.11 User: 0.15 System: 0.82
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 3.11303 s, 337 MB/s

3 seconds instead of 30 seconds: that was well worth the effort! And 3 ns, or 2 to 5 instructions, of processing time for each byte looks reasonable.

What about the interactive behaviour?

$ ./mycat3
Hello
World
Hello
World
[Ctrl-D][Ctrl-D]

There are two problems: the lines are no longer processed immediately, and we had to type Ctrl-D twice to terminate the input.

The latter problem is a programming error. When fread() returns the 12 bytes "Hello\nWorld\n", it has already consumed one of the Ctrl-Ds. Then our loop calls fread() again, and it doesn't return until the user presses Ctrl-D once again. So, the fix is to exploit the information that EOF was also reached when fread()'s return value is smaller than sizeof (buf).

================================== mycat4.c ==================================
#include <stdio.h>

int
main ()
{
  for (;;)
    {
      char buf[4096];
      size_t count = fread (buf, 1, sizeof (buf), stdin);
      if (count > 0)
        fwrite (buf, 1, count, stdout);
      /* A short count means EOF (or an error): don't call fread again.  */
      if (count < sizeof (buf))
        break;
    }
  return 0;
}
==============================================================================

The throughput of this program is unchanged:

$ dd if=/dev/zero bs=1M count=1000 | time ./mycat4 > /dev/null
Clock: 3.15 User: 0.15 System: 0.81
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 3.15642 s, 332 MB/s

and the interactive behaviour is improved: It does not require two Ctrl-D keystrokes any more.

$ ./mycat4
Hello
World
Hello
World
[Ctrl-D]

But it is not interactive: The first line is processed only after the second line and the Ctrl-D were entered.

If we didn't have the second requirement about interactive input, for example if our program expected well-formed XML documents that users never type by hand, we could stop here.

Next try. We heard that the standard I/O can also be set to an unbuffered mode or a line-buffered mode. Let's try the unbuffered mode first:

================================== mycat5.c ==================================
#include <stdio.h>

int
main ()
{
  /* Unbuffered mode; must be set before the first read from stdin.  */
  setvbuf (stdin, NULL, _IONBF, 0);
  for (;;)
    {
      char buf[4096];
      size_t count = fread (buf, 1, sizeof (buf), stdin);
      if (count > 0)
        fwrite (buf, 1, count, stdout);
      if (count < sizeof (buf))
        break;
    }
  return 0;
}
==============================================================================

$ ./mycat5
Hello
World
Hello
World
[Ctrl-D]

The interactive behaviour is the same, and the throughput as well.

And the line-buffered mode:

================================== mycat6.c ==================================
#include <stdio.h>

int
main ()
{
  /* Line-buffered mode; must be set before the first read from stdin.  */
  setvbuf (stdin, NULL, _IOLBF, 0);
  for (;;)
    {
      char buf[4096];
      size_t count = fread (buf, 1, sizeof (buf), stdin);
      if (count > 0)
        fwrite (buf, 1, count, stdout);
      if (count < sizeof (buf))
        break;
    }
  return 0;
}
==============================================================================

$ ./mycat6
Hello
World
Hello
World
[Ctrl-D]

It has no effect either. Why? Because the fread() call still insists on filling our 4096-byte buffer, whatever the buffering mode of the stream.

The setvbuf call would have made a difference for the programs that read bytes one by one: it reduces the throughput of 'mycat2' from 35 MB/sec to 2.0 MB/sec. setvbuf(...,_IONBF,...) has the effect that stdio fetches single bytes from the operating system, rather than trying to fill a buffer with as many bytes as are immediately available.

But setvbuf cannot improve the interactive behaviour. So, what else can we do?

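As an aside, the setvbuf experiment on the byte-by-byte reader is not shown as a program above. It could be sketched roughly as follows. This is only an illustration: the helper name copy_bytewise and its parameters are made up here, and the 2.0 MB/sec figure is the one quoted above, not something this sketch measures.

```c
#include <stdio.h>

/* Hypothetical helper, not one of the mycat programs: copies IN to OUT
   one byte at a time, like mycat2 does.  If UNBUFFERED is nonzero, IN is
   first switched to unbuffered mode; setvbuf has to be called before any
   other operation on the stream.  Returns the number of bytes copied.  */
long
copy_bytewise (FILE *in, FILE *out, int unbuffered)
{
  long count = 0;
  int c;

  if (unbuffered)
    setvbuf (in, NULL, _IONBF, 0);
  while ((c = getc (in)) != EOF)
    {
      putc (c, out);
      count++;
    }
  return count;
}
```

Calling copy_bytewise (stdin, stdout, 1) from main() and rerunning the dd pipeline should show the slowdown; with 0 as the last argument it should behave like mycat2.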
We can switch between the fgetc() approach and the fread() approach, depending on a command-line option. fgetc() is essentially the same as fread() with a 1-byte buffer. So the easiest way to unify both approaches is like this:

================================== mycat7.c ==================================
#include <stdio.h>
#include <string.h>

int
main (int argc, char *argv[])
{
  int unbuffered = (argc > 1 && strcmp (argv[1], "--unbuffered") == 0);
  for (;;)
    {
      char buf[4096];
      /* Request a single byte per fread call in unbuffered mode.  */
      size_t bufsize = (unbuffered ? 1 : sizeof (buf));
      size_t count = fread (buf, 1, bufsize, stdin);
      if (count > 0)
        fwrite (buf, 1, count, stdout);
      if (count < bufsize)
        break;
    }
  return 0;
}
==============================================================================

The program now exhibits good interactive behaviour

$ ./mycat7 --unbuffered
Hello
Hello
World
World
[Ctrl-D]

and good throughput

$ dd if=/dev/zero bs=1M count=1000 | time ./mycat7 > /dev/null
Clock: 2.70 User: 0.16 System: 0.75
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 2.70661 s, 387 MB/s

But the complexity has only been pushed from the program to the user. This is bad: Users shouldn't have to learn about all command-line options in order to run a program. If there's a best choice, that choice should be built-in.

So let's try to make the program guess the best choice. If stdin is a tty (a terminal or terminal emulator), we prefer interactive input; otherwise we prefer high throughput.

================================== mycat8.c ==================================
#include <stdio.h>
#include <unistd.h>

int
main ()
{
  /* Guess: a terminal on stdin means that a user is typing.  */
  int unbuffered = isatty (STDIN_FILENO);
  for (;;)
    {
      char buf[4096];
      size_t bufsize = (unbuffered ? 1 : sizeof (buf));
      size_t count = fread (buf, 1, bufsize, stdin);
      if (count > 0)
        fwrite (buf, 1, count, stdout);
      if (count < bufsize)
        break;
    }
  return 0;
}
==============================================================================

This program has good interactive behaviour

$ ./mycat8
Hello
Hello
World
World
[Ctrl-D]

and good throughput

$ dd if=/dev/zero bs=1M count=1000 | time ./mycat8 > /dev/null
Clock: 2.91 User: 0.19 System: 0.77
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 2.91414 s, 360 MB/s

So, is that the solution now? No. The detection of whether unbuffered processing is desired was only a guess. A heuristic. A buggy guess, in other words. In this case:

$ cat | ./mycat8
Hello
World
Hello
World
[Ctrl-D]

the input was processed too late. Whereas this was right:

$ cat | ./mycat7 --unbuffered
Hello
Hello
World
World

How could the program guess that in the case of

$ cat | ./mycat8

unbuffered input is better, whereas in

$ dd if=/dev/zero bs=1M count=1000 | ./mycat8

or even

$ dd if=/dev/zero bs=1M count=1000 | cat | ./mycat8

high throughput is better? Can the program get information about the origin of the data? No, there are no such APIs in Unix. The program can determine that stdin comes from a pipe; this is possible via fstat(). But there is no info available beyond that. So this heuristic was a dead end.

All the trouble is caused by fread(), which insists on pulling the specified number of bytes, even if that means waiting. In Unix speak, the fread() call "blocks". But there is a Unix system call for reading data that returns just what is available, as soon as something is available. It's the read() system call. But this means that we drop the ISO C standard I/O and use the Unix system calls instead.
Here is the program:

================================== mycat9.c ==================================
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int
main ()
{
  for (;;)
    {
      char buf[4096];
      /* read returns as soon as some bytes are available.  */
      ssize_t count = read (STDIN_FILENO, buf, sizeof (buf));
      if (count <= 0)
        break;
      fwrite (buf, 1, count, stdout);
    }
  return 0;
}
==============================================================================

This program now features the good interactive behaviour:

$ ./mycat9
Hello
Hello
World
World
[Ctrl-D]

and the good throughput as well:

$ dd if=/dev/zero bs=1M count=1000 | time ./mycat9 > /dev/null
Clock: 2.73 User: 0.17 System: 0.74
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 2.74331 s, 382 MB/s

That's the lesson learned. If you want good interactive behaviour and high throughput on standard input, use Unix read() instead of <stdio.h>.

We are still not done, though. While Unix read(), compared to ISO C fread(), adds the ability to read just as many bytes as are readily available, it also reports errors for events that would have gone unnoticed with fread(). Namely, it returns -1, setting errno to EINTR ("Interrupted system call"), when a signal had to be handled. On most platforms this happens only when the program has installed a signal handler in a particular way; refer to the manual pages for sigaction(), signal(), and siginterrupt() for details. But on some platforms it occurs even if the program has not installed any signal handler. To observe this, stop the program through Ctrl-Z and restart it. On MacOS X, you get this:

$ ./mycat9
Hello
Hello
[Ctrl-Z]^Z
[1]+  Stopped                 ./mycat9
$ fg
./mycat9
$

The fix is to check for the errno value EINTR explicitly. Of course, we do this only when the read() call failed, that is, when it returned -1. And we need an #ifdef because EINTR exists only on Unix platforms. Native Windows platforms don't have it.
The program thus looks like this:

================================== mycat10.c ==================================
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int
main ()
{
  for (;;)
    {
      char buf[4096];
      ssize_t count = read (STDIN_FILENO, buf, sizeof (buf));
      if (count == 0)
        break;
      if (count < 0)
        {
          /* Retry after an interrupted call; stop on any other error.  */
#ifdef EINTR
          if (errno != EINTR)
#endif
            break;
        }
      if (count > 0)
        fwrite (buf, 1, count, stdout);
    }
  return 0;
}
===============================================================================

And it continues to read from standard input after being suspended and restarted:

$ ./mycat10
Hello
Hello
[Ctrl-Z]^Z
[1]+  Stopped                 ./mycat10
$ fg
./mycat10
World
World
[Ctrl-D]

Now, this code is a bit ugly, and it is easy to forget to handle EINTR each time you call read(). For this reason, Gnulib has a module 'safe-read' that provides a safe_read() function, similar to read(), but that handles EINTR by restarting the call.

================================== mycat11.c ==================================
#include <config.h>
#include <stdio.h>
#include <unistd.h>
#include "safe-read.h"

int
main ()
{
  for (;;)
    {
      char buf[4096];
      /* safe_read restarts the call when it is interrupted.  */
      size_t count = safe_read (STDIN_FILENO, buf, sizeof (buf));
      if (count == 0 || count == SAFE_READ_ERROR)
        break;
      fwrite (buf, 1, count, stdout);
    }
  return 0;
}
===============================================================================

$ gcc -O -Wall mycat11.c libgnu.a -o mycat11

So that's finally the way to read from standard input with high throughput, with good interactive behaviour, without bugs on MacOS X, and without #ifdefs in the middle of the code.

I'll end the lesson with a few remarks about what you can do if you don't want the user to have to press Return/Enter first. That is, if processing should begin as soon as the user presses a key. Normally, you are using the basic line editing behaviour (echoing of characters, erase behaviour of the Backspace key) built into the tty device.
(Don't confuse this with the advanced line editing, which supports arrow keys for movement, forward-erase behaviour of the Delete key, and so on. That line editing comes from the GNU readline library: either the program is linked against GNU readline, or a wrapper such as 'rlwrap' or 'rlfe' (earlier called 'fep') is used.) The basic line editing logic sits in the kernel, but can be turned off via system calls. But when you turn it off, you also turn off keystrokes that users are accustomed to rely on: Ctrl-D for terminating the input, Ctrl-C for interrupting and terminating the program, Ctrl-Z to regain control, and so on. Users won't like to miss these features.

================================== mycat12.c ==================================
#include <errno.h>
#include <stdio.h>
#include <termios.h>
#include <unistd.h>

static struct termios oldtermio; /* original tty mode */
static int oldtermio_initialized;

/* Sets the terminal in cbreak, noecho mode. */
static int
term_raw (void)
{
  if (!oldtermio_initialized)
    {
      if (tcgetattr (STDOUT_FILENO, &oldtermio) && errno != ENOTTY)
        return -1;
      oldtermio_initialized = 1;
    }
  {
    struct termios newtermio;
    size_t i;

    newtermio = oldtermio;
    newtermio.c_iflag &= ISTRIP | IGNBRK;
    newtermio.c_lflag &= ISIG;
    /* Disable all special characters, then read byte by byte.  */
    for (i = 0; i < NCCS; i++)
      newtermio.c_cc[i] = 0;
    newtermio.c_cc[VMIN] = 1;
    newtermio.c_cc[VTIME] = 0;
    if (tcsetattr (STDOUT_FILENO, TCSAFLUSH, &newtermio) && errno != ENOTTY)
      return -1;
  }
  return 0;
}

/* Sets the terminal in nocbreak, echo mode. */
static int
term_unraw (void)
{
  if (oldtermio_initialized)
    {
      if (tcsetattr (STDOUT_FILENO, TCSAFLUSH, &oldtermio) && errno != ENOTTY)
        return -1;
    }
  return 0;
}

int
main ()
{
  term_raw ();
  for (;;)
    {
      int c = fgetc (stdin);
      if (c == EOF)
        break;
      fputc (c, stdout);
    }
  term_unraw ();
  return 0;
}
===============================================================================

With this program, user keystrokes are processed immediately:

$ ./mycat12
World

But now the user is caught: No control characters are interpreted.
Newlines are echoed as Ctrl-M (= Carriage Return), not Carriage Return + Line Feed, and so on. Let's kill the process from another terminal:

$ kill `ps aux | fgrep ./mycat12 | awk '{ print $2 }'`

You may also have to restore the tty to its normal modes:

$ stty sane

In summary, this facility exists but is better avoided if you don't want to program actions for various control characters and if you don't want hate mail from your users.
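
If you do want to program such actions yourself, a gentler variant than mycat12 is conceivable. What follows is only a minimal sketch, not one of the programs above, and it assumes a POSIX termios system. The names read_retry, term_cbreak, term_restore and echo_keys are invented for this illustration. The idea is to turn off only canonical mode and echo, to leave ISIG and the special characters alone so that Ctrl-C and Ctrl-Z keep working, to handle the EOF character by hand, and to restart read() on EINTR as discussed earlier.

```c
#include <errno.h>
#include <termios.h>
#include <unistd.h>

/* Hypothetical helper: like read(), but restarts after EINTR.  */
ssize_t
read_retry (int fd, void *buf, size_t size)
{
  for (;;)
    {
      ssize_t n = read (fd, buf, size);
      if (n >= 0 || errno != EINTR)
        return n;
    }
}

/* Hypothetical helper: turns off line editing and echo on FD, but leaves
   ISIG and the special characters untouched.  Stores the previous mode in
   *SAVED.  Returns 0 on success, -1 on failure, for example when FD is
   not a terminal.  */
int
term_cbreak (int fd, struct termios *saved)
{
  struct termios t;

  if (tcgetattr (fd, saved) < 0)
    return -1;
  t = *saved;
  t.c_lflag &= ~(ICANON | ECHO);
  t.c_cc[VMIN] = 1;   /* read() returns after one byte ...  */
  t.c_cc[VTIME] = 0;  /* ... without any timeout.  */
  return tcsetattr (fd, TCSAFLUSH, &t);
}

/* Hypothetical helper: puts FD back into the mode stored in *SAVED.  */
int
term_restore (int fd, const struct termios *saved)
{
  return tcsetattr (fd, TCSAFLUSH, saved);
}

/* Echoes bytes from IN_FD to OUT_FD, key by key if IN_FD is a terminal,
   until EOF, a read or write error, or the user's EOF key.  */
void
echo_keys (int in_fd, int out_fd)
{
  struct termios saved;
  int is_tty = (term_cbreak (in_fd, &saved) == 0);

  for (;;)
    {
      char c;

      if (read_retry (in_fd, &c, 1) <= 0)
        break;
      /* Outside canonical mode the tty no longer turns Ctrl-D into EOF,
         so compare against the user's EOF character ourselves.  */
      if (is_tty && (unsigned char) c == saved.c_cc[VEOF])
        break;
      if (write (out_fd, &c, 1) < 0)
        break;
    }
  if (is_tty)
    term_restore (in_fd, &saved);
}
```

A main() that calls echo_keys (STDIN_FILENO, STDOUT_FILENO) would be enough to try it. Note the limitation: when Ctrl-C terminates the process, the terminal mode is not restored by this sketch, so a fuller version would also restore it from a signal handler, and 'stty sane' may still be needed.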