An example of a "buffer overflow" vulnerability

What is buffer overflow?

Hackers typically break into a remote system by exploiting some vulnerability of its software: a programming or configuration bug that makes it possible to subvert the software and have it execute unplanned instructions.

A common and often exploited kind of vulnerability is the buffer overflow bug, where the program either fails to allocate enough memory for an input string, or fails to test whether the length of the string lies within its valid range. A hacker can exploit such a weakness by submitting an extra-long input to the program, designed to overflow its allocated input buffer (temporary storage area) and modify the values of nearby variables, cause the program to jump to unintended places, or even replace the program's instructions by arbitrary code.
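As a rough illustration of the missing length test, here is a minimal sketch in C. The struct session, its is_admin field, the NAME_CAPACITY limit, and the function session_set_name are all hypothetical names invented for this example; they are not taken from any real program. The point is only that the length of the input is compared with the size of the buffer before anything is copied.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NAME_CAPACITY 15   /* longest name accepted, not counting the final '\0' */

/* Hypothetical record: an input buffer followed by another field.
   An unchecked copy of a longer string into `name` is undefined
   behavior and, in practice, may overwrite `is_admin`. */
struct session {
    char name[NAME_CAPACITY + 1];
    int  is_admin;
};

/* Stores `input` in s->name only if it fits. Returns false, leaving
   the session untouched, when the input is too long. */
bool session_set_name(struct session *s, const char *input)
{
    size_t len = strlen(input);
    if (len > NAME_CAPACITY) {
        return false;                 /* reject over-long input */
    }
    memcpy(s->name, input, len + 1);  /* copy the string and its '\0' */
    return true;
}
```

Calling session_set_name with a 5-character name succeeds, while a 25-character name is refused and the neighboring is_admin field keeps its value.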

If the buffer overflow bug lies in a network service daemon, such as ftpd or httpd, the attack can be done by directly feeding the "poisonous" input string to the daemon. If the bug lies in an ordinary system tool or application, with no direct network access, the hacker may still launch a passive attack by enclosing the poisonous string inside some innocent-looking data file (e.g. an email message, a document, a spreadsheet), and arranging for some legitimate user to feed that file to the flawed program. In either case, a successful buffer overflow attack is essentially equivalent to letting the hacker log into the system with the same user ID and privileges as the compromised program.

Buffer overflow bugs are especially common in C programs, since that language does not provide built-in array bounds checking, and uses a final null byte to mark the end of a string, instead of keeping its length in a separate field. To make things worse, C provides many library functions, such as strcat and gets, which copy strings without any bounds checking.
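One common way to cope with this is to wrap the unchecked library call in a function that is told the capacity of the destination. The sketch below shows a bounds-checked replacement for strcat; the name append_checked and its interface are assumptions made for this example, not part of the C library.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Appends `src` to the string already stored in `dst`, whose total
   capacity is `cap` bytes. Returns false, leaving `dst` unchanged, if
   the result plus its terminating '\0' would not fit.
   Assumes `dst` already holds a valid null-terminated string. */
bool append_checked(char *dst, size_t cap, const char *src)
{
    size_t used = strlen(dst);
    size_t add  = strlen(src);

    /* We need used + add + 1 <= cap; written this way to avoid overflow. */
    if (used >= cap || add >= cap - used) {
        return false;
    }
    memcpy(dst + used, src, add + 1);  /* copy `src` and its '\0' */
    return true;
}
```

Because C strings do not carry their length, the caller has to supply the capacity, typically as sizeof buf when the destination is a local array.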

A simple example

The following C program is meant to illustrate the insidious nature of buffer overflow bugs. If you have access to a Linux/GCC machine, you may want to test your debugging skills on it.

To perform this small exercise, download the program's tar file, and execute:

    tar -xvf demo.tar

This will create a directory called demo, containing a makefile, the program's C source file (demo.c), and four data files (ciro.img.z, lula.img.z, serra.img.z, and garotinho.img.z). Then perform:

    cd demo
    make

The make command will compile demo.c and apply it successively to the four data files.

The demo program is supposed to read a text banner from a file with extension .img or .img.z, and flash it on the terminal a specified number of times. The extension .img.z means that the file is compressed, using a simple run-length encoding scheme.
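The exact compressed format used by demo.c is not described here. Purely as an illustration of run-length decoding done with bounds checks, the sketch below assumes a hypothetical encoding made of (count, value) byte pairs; the function name rle_decode and its interface are likewise invented for this example.

```c
#include <stddef.h>

/* Decodes `in_len` bytes of (count, value) pairs from `in` into `out`,
   which can hold at most `out_cap` bytes. Returns the number of bytes
   written, or (size_t)-1 if the input is malformed or the decoded
   data would not fit in `out`. */
size_t rle_decode(const unsigned char *in, size_t in_len,
                  unsigned char *out, size_t out_cap)
{
    size_t written = 0;

    if (in_len % 2 != 0) {
        return (size_t)-1;            /* a pair is incomplete */
    }
    for (size_t i = 0; i < in_len; i += 2) {
        size_t count = in[i];
        unsigned char value = in[i + 1];

        /* written <= out_cap always holds, so this cannot underflow. */
        if (count > out_cap - written) {
            return (size_t)-1;        /* run would overflow `out` */
        }
        for (size_t k = 0; k < count; k++) {
            out[written++] = value;
        }
    }
    return written;
}
```

The check happens before each run is expanded, so a file that claims a very long run is rejected instead of being written past the end of the output buffer.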

The command line arguments are the names of one or more banner files, without any extensions. Each name may be preceded by switches: -r selects uncompressed (raw) format, -z selects compressed format (default), and -n NUM specifies the number of times that the banner is to be flashed (default 1).
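The switch handling described above could be sketched roughly as follows. This is not the code of demo.c: the struct banner_opts and the functions parse_count and parse_switches are hypothetical, and whether the switches reset to their defaults for each new name is left to the caller.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical settings that apply to the next banner file. */
struct banner_opts {
    bool compressed;   /* true for -z, false for -r */
    long repeat;       /* value given with -n */
};

/* Parses the argument of -n. Accepts only a whole decimal number >= 1. */
bool parse_count(const char *text, long *out)
{
    char *end;
    errno = 0;
    long v = strtol(text, &end, 10);
    if (errno != 0 || end == text || *end != '\0' || v < 1) {
        return false;
    }
    *out = v;
    return true;
}

/* Consumes the switches starting at argv[*pos]. On success, returns
   true with *pos indexing the banner name that follows them. Returns
   false on an unknown switch, a bad -n value, or a missing name. */
bool parse_switches(int argc, char **argv, int *pos, struct banner_opts *opts)
{
    while (*pos < argc && argv[*pos][0] == '-') {
        const char *sw = argv[*pos];
        if (strcmp(sw, "-r") == 0) {
            opts->compressed = false;
        } else if (strcmp(sw, "-z") == 0) {
            opts->compressed = true;
        } else if (strcmp(sw, "-n") == 0) {
            if (*pos + 1 >= argc ||
                !parse_count(argv[*pos + 1], &opts->repeat)) {
                return false;
            }
            (*pos)++;                 /* skip the NUM argument */
        } else {
            return false;
        }
        (*pos)++;
    }
    return *pos < argc;
}
```

A caller would initialize the options to the defaults (compressed format, one repetition), call parse_switches, and then treat argv[*pos] as the banner name.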

The program demo.c has a buffer overflow bug, which, on my machine, causes the last test (demo -n 3 garotinho) to abort with a segmentation fault. Your task is to find this bug and fix it, so that all four files are flashed correctly.

To avoid wasting too much of your time, I will reveal that the buffer overflow bug lies in the main procedure itself. The procedures show_file (which implements the decompression algorithm) and show_date are 100% safe.

Caveat emptor

This recipe and these symptoms should be valid when the program is compiled with the GNU C compiler for an Intel/Linux platform. If they don't work for you, please let me know. (I use GCC version 2.96 on Red Hat Linux 2.2.16-22). If you compile this program on a different machine or with some other compiler, the flaw will still be there, but its symptoms may vary.


Too lazy to try? Can't find the bug? Click here for the answer.

Last edited on 2010-10-17 17:02:40 by stolfi