Breaking the Affine Cipher: Chosen Plaintext Attack Walkthrough

Challenge Info:

  • Platform: picoCTF 2025 - Guess My Cheese (Part 1)
  • Category: Cryptography
  • Difficulty: Medium
  • Key Concepts: Chosen Plaintext Attack (CPA), Affine Cipher, Modular Arithmetic

Understanding the Challenge

In this cryptography challenge, we are tasked with retrieving a flag hidden behind a "secret cheese" password. The challenge presents us with a server that holds a secret string and a system that can encrypt user input. Because the server allows us to see how our own input is transformed into ciphertext, this setup mimics a classic Encryption Oracle.

Our goal is to interact with this oracle to reverse-engineer the encryption scheme via a Chosen Plaintext Attack (CPA). By feeding the "black box" specific inputs and analyzing the outputs, we can deduce the mathematical relationship between the plaintext and the ciphertext, allowing us to decrypt the password and authenticate with the server.

The Cryptanalyst Mindset

To solve this challenge, we need to adopt the mindset of a cryptanalyst facing a "black box" oracle. We don't know the internal state (the keys), but we can probe the system.

  1. Probe the Oracle: We can't guess the random password, but we can ask the server to encrypt our data. This is a classic Chosen Plaintext Attack (CPA).
  2. Analyze the Pattern: By encrypting a known word like "cheddar", we can compare the input to the output to deduce the encryption logic (Affine Cipher).
  3. Automate the Solution: Since the keys change every session, manual solving is too slow. We must write a script to perform the attack, solve the math, and submit the answer instantly.

Attack Model Analysis

In cryptography, the power of an adversary depends on what information they have access to.

Attack Classification

  • Ciphertext-Only Attack: The weakest scenario. We only see the encrypted message (QMZEXWOILRE) and must guess the key based on statistics (frequency analysis).
  • Known-Plaintext Attack (KPA): We have a piece of plaintext and its corresponding ciphertext, but we didn't choose it.
  • Chosen-Plaintext Attack (CPA): This is our scenario and it is incredibly powerful. As an active adversary, we can ask the oracle to encrypt any text we want. This allows us to surgically probe the system's behavior with specific inputs designed to break the logic, rather than relying on luck or complex statistics.

Initial Reconnaissance

We begin the engagement by connecting to the challenge server using netcat (nc), a versatile networking utility used to read from and write to network connections. Upon connection, the server initiates a session and generates a unique "secret cheese" string for us to guess.

nc verbal-sleep.picoctf.net 64607

Note: The port number and the specific secret string are dynamic; they change with every new connection session to prevent players from simply copying a static answer.

In this specific session, the server provided the following target ciphertext: Secret String: QMZEXWOILRE


Executing the Chosen Plaintext Attack

To break the encryption, we first need to identify the algorithm being used. Since we can choose what to encrypt, we execute our CPA strategy. But what should we send?

Strategic Test Input Selection

Sending a random string or just "a" is inefficient. We choose "cheddar" strategically for three reasons:

  1. Dictionary Validation: The server likely only accepts valid cheese names. "Cheddar" is the "Hello World" of cheeses.
  2. Wide Index Range: It contains letters from early in the alphabet ('a', 'c') to later ones ('r'), giving us data points across the spectrum.
  3. The Double Letter Test: Crucially, "cheddar" contains dd.
    • If dd maps to two different characters (e.g., dd -> XK), we are facing a Polyalphabetic Cipher (like Vigenère or Enigma).
    • If dd maps to the same character twice (e.g., dd -> NN), we are facing a Monoalphabetic Substitution Cipher (like Caesar or Affine). This simple check saves us hours of analysis.

We select (e)ncrypt and input: cheddar.

  • Input (Plaintext): cheddar
  • Output (Ciphertext): EXWNNMJ

Cryptanalysis: Deriving the Keys

The output EXWNNMJ confirms our hypothesis: the double d became double N. This is a deterministic, monoalphabetic substitution.

However, it's not a simple Caesar Shift ($y = x + k$). If it were, the distance between 'c' and 'E' would be the same as 'd' and 'N'.

  • c (2) -> E (4): Shift of +2
  • d (3) -> N (13): Shift of +10

Since the shift isn't constant, but the transformation is linear, we suspect an Affine Cipher.

Mathematical Foundation: The Affine Cipher

The Affine Cipher is a generalization of the Caesar Cipher. Instead of just adding a shift, it multiplies the index and then adds a shift. It is a linear transformation on the alphabet ring $\mathbb{Z}_{26}$:

$$y = (Ax + B) \pmod{26}$$

  • Linearity: The relationship is a straight line equation ($y=mx+c$). This preserves the order and structure, which is why simple algebraic attacks work instantly.
  • The Modulo 26: We operate in modulo 26 because there are 26 letters in the alphabet. It wraps around like a clock.

To break this, we need to solve for the two unknowns: the slope ($A$) and the y-intercept ($B$).

Finding B (The Shift)

We need a point where the multiplication term ($Ax$) disappears. Since $x=0$ eliminates $A$, we look for the letter a (index 0).

  • In cheddar, a is at index 5.
  • In EXWNNMJ, the character at index 5 is M (index 12).

Substituting $x=0, y=12$: $$12 = (A \cdot 0) + B \Rightarrow \mathbf{B = 12}$$

Strategy Note: We target 'a' first specifically because it mathematically isolates B, solving 50% of the equation instantly without needing system of equations.

Finding A (The Multiplier)

Now we solve for the slope. We pick c (index 2) which mapped to E (index 4).

$$4 = (A \cdot 2 + 12) \pmod{26}$$ $$4 - 12 = 2A \pmod{26}$$ $$-8 \equiv 18 \pmod{26}$$ $$2A = 18$$

Dividing by 2 gives: $\mathbf{A = 9}$

The Coprimality Constraint

For an Affine Cipher to work, $A$ must be coprime to 26 (i.e., $\gcd(A, 26) = 1$). If $A$ shares a factor with 26 (like $A=2$ or $A=13$), the mapping creates collisions.

  • Example: If $A=2$, then both 'a' ($0$) and 'n' ($13$) would map to the same ciphertext: $2(0)=0$ and $2(13)=26 \equiv 0$.

Since the function must be a bijection (one-to-one mapping) to be reversible (decryptable), valid values for $A$ can only be: 1, 3, 5, 7, 9, 11, 15, 17, 19, 21, 23, 25. Our found key $A=9$ is valid.


Decryption and Exploitation

We have the encryption keys ($A=9, B=12$), but we need to go backward. We need to solve for $x$:

$$y = Ax + B \Rightarrow y - B = Ax \Rightarrow x = A^{-1}(y - B)$$

Modular Multiplicative Inverse

In modular arithmetic, $A^{-1}$ is the Modular Multiplicative Inverse. It is the number that, when multiplied by $A$, equals 1 modulo 26.

We need: $9 \times ? \equiv 1 \pmod{26}$.

  • Trial: $9 \times 3 = 27 \equiv 1$.
    So, $A^{-1} = 3$.

Decryption Formula: $$x = 3(y - 12) \pmod{26}$$

Applying Decryption

Applying this to QMZEXWOILRE:

  • Q (16) $\rightarrow 3(16-12) = 12 \rightarrow$ M
  • M (12) $\rightarrow 3(12-12) = 0 \rightarrow$ A
  • Z (25) $\rightarrow 3(25-12) = 39 \equiv 13 \rightarrow$ N
    ...

Result: MANCHEGOXPC.

We return to the server prompt, select (g)uess, and enter our decrypted password.


Troubleshooting: Input Sanitization Issues

Ideally, cryptography is precise. However, CTF challenges (and real-world implementations) often have parsing errors. In a previous run, we decrypted the output DEVONTBLUEQUB.

  • Mathematically correct decryption: "DEVON BLUE" (plus salt).
  • Server Response: BLEHHHHH (Failure).

Why This Happens

This is likely an input sanitization issue on the server side. The original plaintext "Devon Blue" contains a space. The encryption oracle might handle the space differently (or discard it) compared to the verification function. Since we are dealing with a strict "exact string match" check, a missing space or an extra character causes failure.

Lesson: In black-box testing, if the crypto is solid but the auth fails, suspect the parsing logic. Restarting for a simpler, single-word cheese (like Cheddar or Manchego) bypasses this variable.


Automated Solution Script

Below is a Python script using pwntools. It automates the CPA, math solving, and error checking.

from pwn import *
import re

def solve():
    # --- Configuration ---
    HOST = 'verbal-sleep.picoctf.net'
    PORT = 53660  # Update with your active instance port!

    print(f"[*] Connecting to {HOST}:{PORT}...")
    try:
        io = remote(HOST, PORT)
    except:
        print(f"[!] Could not connect. Check the PORT number.")
        return

    # 1. Capture the Ciphertext
    io.recvuntil(b'guess it:')
    secret_line = io.recvline().decode().strip()
    target_ciphertext = secret_line.strip()
    print(f"[+] Target Secret Cheese (Ciphertext): {target_ciphertext}")

    # 2. Chosen Plaintext Attack
    known_plaintext = "cheddar"
    io.recvuntil(b'do?')
    io.sendline(b'e')
    io.recvuntil(b'encrypt?')
    io.sendline(known_plaintext.encode())
    
    io.recvuntil(b'cheese:')
    encrypted_cheddar = io.recvline().decode().strip()
    print(f"[+] 'cheddar' encrypts to: {encrypted_cheddar}")

    # 3. Calculate Keys (A and B)
    def to_num(char):
        if 'a' <= char <= 'z': return ord(char) - ord('a')
        if 'A' <= char <= 'Z': return ord(char) - ord('A')
        return 0

    def to_char(num):
        return chr((num % 26) + ord('A'))

    # Calculate Key B using 'a' (index 5 of cheddar -> value 0)
    cipher_char_for_a = encrypted_cheddar[5] 
    B = to_num(cipher_char_for_a)
    print(f"[*] Found Key B: {B}")

    # Calculate Key A using brute force
    cipher_char_for_c = encrypted_cheddar[0]
    target_y_for_c = to_num(cipher_char_for_c)
    
    A = None
    for candidate_a in range(1, 26):
        if candidate_a % 2 == 0 or candidate_a == 13: continue # Coprime check
        
        # Verification with two points prevents false positives
        if (candidate_a * 2 + B) % 26 == target_y_for_c:
             cipher_char_for_h = encrypted_cheddar[1]
             if (candidate_a * 7 + B) % 26 == to_num(cipher_char_for_h):
                A = candidate_a
                break
            
    print(f"[*] Found Key A: {A}")

    # 4. Decrypt and Submit
    try:
        A_inv = pow(A, -1, 26) # Python 3.8+ native modular inverse
    except ValueError:
        print("[!] Key Error: A is not invertible (not coprime).")
        return

    decrypted_secret = ""
    for char in target_ciphertext:
        if not char.isalpha(): continue
        y = to_num(char)
        x = (A_inv * (y - B)) % 26
        decrypted_secret += to_char(x)

    print(f"[+] Decrypted Secret: {decrypted_secret}")

    # Send Answer
    io.recvuntil(b'do?')
    io.sendline(b'g')
    io.recvuntil(b'cheese?')
    io.sendline(decrypted_secret.encode())

    # Receive Flag
    response = io.recvall(timeout=3).decode(errors='ignore')
    if "picoCTF{" in response:
        flag = re.search(r'picoCTF\{.*?\}', response).group(0)
        print(f"\nSUCCESS! FLAG: {flag}")
    else:
        print("\n[!] Check output for errors.")
        io.close()

if __name__ == "__main__":
    solve()

Code Design Decisions

  • Verification Strategy: Notice the nested if inside the loop. Just finding an A that works for 'c' isn't enough; we verify it against 'h' (the second letter of cheddar). This double-check prevents false positives in key calculation.
  • Brute Force vs Algebra: For calculating A, we brute-force the loop range(1, 26). While we could use algebra ($A = (y - B) \cdot x^{-1}$), solving for the inverse of $x$ is messy if $x$ isn't coprime to 26. Since the keyspace is trivial (25 possibilities), brute force is actually more robust here.
  • Native Modular Inverse: We use pow(A, -1, 26) for the final decryption. This is Python's highly optimized implementation of the Extended Euclidean Algorithm, which is far safer than writing our own loop.

Flag Capture

With the correct single-word cheese identified, the server validates our input, displays a victory animation, and reveals the flag.

Flag: picoCTF{ChEeSy7df82c21}


Key Takeaways

Cryptanalysis Techniques

  • Chosen Plaintext Attacks provide surgical precision for breaking ciphers
  • Strategic test input selection accelerates cryptanalysis
  • Monoalphabetic vs polyalphabetic cipher identification using repeated letters
  • Affine cipher vulnerability to algebraic attacks

Attack Methodology

  1. Identify cipher type through pattern analysis
  2. Select strategic known plaintexts
  3. Extract keys using mathematical relationships
  4. Verify keys with multiple data points
  5. Decrypt target ciphertext
  6. Automate for real-time exploitation