To even begin to understand how this attack works, you will need at least a basic understanding of how a CPU works, how memory works, what the “heap” and “stack” of a process are, what pointers are, what libc is, what linked lists are, how function calls are implemented at the machine level (including calls through function pointers), what the malloc and free functions from the C library do, and so on. Hopefully you have at least some basic knowledge of C programming? (If not, you will probably not be able to complete this assignment in time.)
If you have a few “gaps” in your knowledge of the basic topics mentioned above, hit the books and fill them in as quickly as you can. Talk to others if you need to, to make sure you understand them. Then read the following very carefully. This answer will not explain everything in the article you linked to, but it will give you a good start. OK, ready? Let’s start…
C strings are “null-terminated”. That means the end of a string is marked by a zero byte. So for example, the string "abc" is represented in memory as (hex): 0x61 0x62 0x63 0x00. Notice that this 3-character string actually takes 4 bytes, due to the terminating null.
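If you want to convince yourself of the "3 characters, 4 bytes" point, here is a minimal sketch. The helper name bytes_needed is made up for this example; strlen and sizeof are standard C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of bytes a C string occupies in memory, counting the
 * terminating zero byte that strlen() leaves out.
 * (bytes_needed is a hypothetical helper, not a libc function.) */
size_t bytes_needed(const char *s)
{
    assert(s != NULL);
    return strlen(s) + 1;   /* +1 for the terminating null */
}
```

Calling bytes_needed("abc") gives 4, while strlen("abc") gives 3. The expression sizeof "abc" is also 4, because the compiler counts the terminator in a string literal.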
Now if you do something like this:
char *buffer = malloc(3); // not checking for error, this is just an example
strcpy(buffer, "abc");
…then that terminating null (zero byte) will go past the end of the buffer and overwrite something. We allocated a 3-byte buffer, but copied 4 bytes into it. So whatever was stored in the byte right after the end of the buffer will be replaced by a zero byte.
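For contrast, this is roughly what the non-buggy version of that snippet looks like. It is only a sketch, and copy_string is a name invented here; the point is that the allocation size is strlen + 1, so the terminating null has somewhere to go.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a heap copy of s, or NULL if malloc fails.
 * The caller is responsible for calling free() on the result.
 * (copy_string is a hypothetical helper for illustration.) */
char *copy_string(const char *s)
{
    assert(s != NULL);
    size_t n = strlen(s) + 1;   /* characters plus the terminating null */
    char *buffer = malloc(n);
    if (buffer == NULL)
        return NULL;
    memcpy(buffer, s, n);       /* copies the terminator too, inside the buffer */
    return buffer;
}
```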
That is what happened in __gconv_translit_find. The code had a buffer which had been allocated with enough space to append ".so", including the terminating null byte, onto the end of a string. But it copied ".so" in starting from the wrong position. The copy operation started one byte too far to the “right”, so the terminating null byte went past the end of the buffer and overwrote something.
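Here is a small simulation of that kind of mistake. This is not the real glibc code: the name "abcd", the 8-byte buffer size and the function demo_append are all assumptions made up for the demo. To keep it well-defined C, the "buffer" and the byte right after it both live inside one slightly larger array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { BUF_SIZE = 8 };   /* room for "abcd" + ".so" + terminating null */

/* Simulate appending ".so" at the given offset.
 * arena[0 .. BUF_SIZE-1] plays the role of the allocated buffer;
 * arena[BUF_SIZE] plays the role of the byte that follows it in memory.
 * Returns the value of that neighboring byte after the copy. */
unsigned char demo_append(size_t offset)
{
    unsigned char arena[BUF_SIZE + 1];

    assert(offset + 4 <= sizeof arena);   /* stay inside the simulation */
    memset(arena, 0, sizeof arena);
    memcpy(arena, "abcd", 4);             /* the existing name */
    arena[BUF_SIZE] = 0x01;               /* the neighbor's data */

    memcpy(arena + offset, ".so", 4);     /* 3 characters + null byte */
    return arena[BUF_SIZE];
}
```

With the correct offset, demo_append(4), the neighbor byte is still 0x01. With the off-by-one offset, demo_append(5), the terminating null lands on the neighbor and it comes back as 0x00.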
Now, when you call malloc to get back a dynamically allocated buffer, most implementations of malloc actually store some housekeeping data right before the buffer. For example, they might store the size of the buffer. Later, when you pass that buffer to free to release the memory, so it can be reused for something else, free will find that “hidden” data right before the beginning of the buffer, and will know how many bytes of memory you are actually freeing. malloc may also “hide” other housekeeping data in the same location. (In the 2014 article you referred to, the implementation of malloc in use also stored some “flag” bits there.)
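To make the "hidden data right before the buffer" idea concrete, here is a toy allocator built on top of the real malloc. It is not how GNU libc lays things out; toy_alloc, toy_size and toy_free are invented names, and the header here is just a single size_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate size bytes, with the size stored in a header just before
 * the pointer handed back to the caller. Toy code: the returned
 * pointer is only guaranteed to be aligned for size_t. */
void *toy_alloc(size_t size)
{
    if (size > SIZE_MAX - sizeof(size_t))
        return NULL;                      /* avoid overflow in the sum */
    size_t *header = malloc(sizeof(size_t) + size);
    if (header == NULL)
        return NULL;
    *header = size;                       /* the "housekeeping" data */
    return header + 1;                    /* caller's buffer starts after it */
}

/* Read the size back by stepping one header before the buffer. */
size_t toy_size(const void *buffer)
{
    assert(buffer != NULL);
    return ((const size_t *)buffer)[-1];
}

/* Release the whole block, header included. */
void toy_free(void *buffer)
{
    if (buffer != NULL)
        free((size_t *)buffer - 1);
}
```

The caller only ever sees the pointer returned by toy_alloc, yet toy_size and toy_free can recover the size by looking just before it. That is also why writing past the end of one buffer can land in the header of whatever block comes next in memory.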
The attack described in the article passed carefully crafted arguments to a command-line program, designed to trigger the buffer overflow error in __gconv_translit_find in such a way that the terminating null byte would wipe out the “flag” bits stored by malloc. These were not the flag bits for the buffer which overflowed, but those for another buffer which was allocated right after it. (Remember, malloc stores that extra housekeeping data right before the beginning of an allocated buffer, so overrunning one buffer hits the housekeeping data of the next one. You follow?)
The article shows a diagram, where 0x00000201 is stored right after the buffer which overflows. The overflowing null byte wipes out the bottom 01 byte and changes the value into 0x00000200. That might not make sense at first, until you remember that x86 CPUs are little-endian: the least significant byte of a multi-byte value is stored first in memory, so the 0x01 is the byte sitting right after the end of the overflowing buffer. If you don’t understand what “little-endian” and “big-endian” CPUs are, look it up.
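The byte-order point can be checked with a few lines of C. This sketch spells out the little-endian layout by hand, so it behaves the same on any machine; the function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store v into b[0..3] in little-endian order (low byte first). */
void store_le32(unsigned char b[4], uint32_t v)
{
    assert(b != NULL);
    b[0] = (unsigned char)(v & 0xFF);
    b[1] = (unsigned char)((v >> 8) & 0xFF);
    b[2] = (unsigned char)((v >> 16) & 0xFF);
    b[3] = (unsigned char)((v >> 24) & 0xFF);
}

/* Read a little-endian 32-bit value back from b[0..3]. */
uint32_t load_le32(const unsigned char b[4])
{
    assert(b != NULL);
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}

/* What a 32-bit field looks like after a stray null byte lands on
 * its first (lowest-addressed) byte. */
uint32_t after_null_overflow(uint32_t field)
{
    unsigned char b[4];
    store_le32(b, field);   /* 0x00000201 is laid out as 01 02 00 00 */
    b[0] = 0x00;            /* the overflowing terminator */
    return load_le32(b);    /* now 00 02 00 00 */
}
```

Calling after_null_overflow(0x00000201) returns 0x00000200, which matches the before and after values in the article's diagram.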
Later, the buffer whose flag bit was wiped out is passed to free. As it turns out, wiping out that one flag bit “confuses” free and makes it, in turn, also overwrite some other memory. (You will have to understand the implementation of malloc and free used by GNU libc in order to understand why this is so.)
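As a starting point for that reading: in GNU libc's malloc, the low three bits of a chunk's size field are flags, and the lowest one (called PREV_INUSE in the glibc source) says whether the chunk just before this one in memory is still in use. When it is clear, free assumes the previous chunk is free and tries to merge with it. The sketch below only decodes the field; the helper names are mine, and the real merging logic in free is a lot more involved.

```c
#include <assert.h>
#include <stdint.h>

#define PREV_INUSE 0x1u   /* lowest flag bit, as named in glibc's malloc */
#define FLAG_MASK  0x7u   /* the three low bits are flags, not size */

/* Build a size field from a chunk size and the PREV_INUSE flag.
 * Chunk sizes are multiples of 8, which is what leaves the low
 * bits free to carry flags. */
uint32_t make_size_field(uint32_t size, int prev_in_use)
{
    assert(size % 8 == 0);
    return size | (prev_in_use ? PREV_INUSE : 0u);
}

/* The chunk size with the flag bits masked off. */
uint32_t chunk_size(uint32_t field)
{
    return field & ~FLAG_MASK;
}

/* Nonzero if the previous chunk is marked as still in use. */
int prev_in_use(uint32_t field)
{
    return (field & PREV_INUSE) != 0;
}
```

So 0x00000201 reads as "size 0x200, previous chunk in use", and 0x00000200 reads as "size 0x200, previous chunk free". The size did not change at all; only the flag did, and that is enough to send free down a different code path.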
By carefully choosing the input arguments to the original program, you can set things up so that the memory overwritten by the “confused” free is the memory used for something called tls_dtor_list. This is a linked list maintained by GNU libc, which holds pointers to certain functions which it must call when the main program is exiting.
So tls_dtor_list is overwritten. The attacker has set things up just right, so that the function pointers in the overwritten tls_dtor_list will point to some code which they want to run. When the main program is exiting, some code in libc iterates over that list and calls each of the function pointers. Result: the attacker’s code is executed!
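If "a linked list of function pointers that gets called at exit" sounds abstract, here is a harmless simulation. The struct and function names are hypothetical and do not match glibc's real tls_dtor_list layout; the only point is that whoever controls the pointers in the list controls what gets called.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*dtor_fn)(void *arg);

/* One entry in a made-up destructor list. */
struct dtor_node {
    dtor_fn func;              /* function to call at exit */
    void *arg;                 /* argument passed to it */
    struct dtor_node *next;    /* next entry, or NULL at the end */
};

/* Walk the list and call every function pointer in it.
 * Returns how many calls were made. */
size_t run_dtors(struct dtor_node *head)
{
    size_t calls = 0;
    for (struct dtor_node *n = head; n != NULL; n = n->next) {
        assert(n->func != NULL);
        n->func(n->arg);       /* an indirect call through the pointer */
        calls++;
    }
    return calls;
}

/* Stand-in for a legitimate cleanup routine. */
void legit_cleanup(void *arg)
{
    *(int *)arg += 1;
}

/* Stand-in for code the attacker wants to run. */
void attacker_payload(void *arg)
{
    *(int *)arg = 1337;
}
```

If a node's func field holds legit_cleanup, run_dtors calls the cleanup routine. Overwrite that one field with the address of attacker_payload and the very same loop calls the attacker's function instead, without the loop itself being changed in any way.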
Now, in this case, the attacker already has access to the target system. If all they can do is run some code with the privilege level of their own account, that doesn’t get them anywhere. They want to run code with root (administrator) privileges. How is that possible? It is possible because the buggy program is a setuid program, owned by root. If you don’t know what “setuid” programs in Unix are, look it up and make sure you understand it, because that is also a key to the whole exploit.
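While you are reading up on setuid, it helps to know that a Unix process has both a real user ID (who started it) and an effective user ID (whose privileges it runs with). The POSIX calls getuid and geteuid return them. The wrapper functions below are just names I made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Fetch the real and effective user IDs of the current process. */
void get_ids(uid_t *real, uid_t *effective)
{
    assert(real != NULL && effective != NULL);
    *real = getuid();        /* the user who launched the program */
    *effective = geteuid();  /* the user whose privileges apply */
}

/* Nonzero if the process runs with someone else's privileges,
 * which is the situation inside a setuid program. */
int is_running_setuid(void)
{
    uid_t real, effective;
    get_ids(&real, &effective);
    return real != effective;
}
```

In an ordinary program the two IDs are equal. In a setuid program owned by root and started by a normal user, the real ID is that user's and the effective ID is 0, so any code that ends up running inside the process runs as root.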
This is all about the 2014 article — I didn’t look at the one from 1998. Good luck!