ZZKAoK::Kalmar CTF 2025

Time: 2025.03.09-10 (online)

I played with r3kapig last weekend, and we managed to solve 5 out of 8 crypto challs.

ZZKAoK (6 solves)

I’ll explain this challenge following my path during the contest -> logical confusion and nonsense.

Zk Proof System

Here comes a proof system for integer factorization. Given $N$, the Prover $\mathcal{P}$ can show the knowledge of two factors $(p, q)$ satisfying $N = p \cdot q$ to the Verifier $\mathcal{V}$, though the following constraints:

1
2
3
4
5
6
7
8
9
10
11
12
13
# relation to prove:
# p \notin {-1, 1}
# q \notin {-1, 1}
# p * q = N

# via:
# a^2 - 4 >= 0 <=> a \notin {-1, 1}
# b^2 - 4 >= 0 <=> b \notin {-1, 1}

# arithmetic circuit:
# p * q = N
# a1^2 + a2^2 + a3^2 + a4^2 = p^2 - 4
# b1^2 + b2^2 + b3^2 + b4^2 = q^2 - 4

Apparently $\mathcal{P}$ is able to compute a valid proof if and only if he knows the factors.

But we don’t want to reveal any other information about the factors (thus zeor-knowledge), so $\mathcal{P}$ CANNOT just send the values to $\mathcal{V}$. Instead, we randomly choose small prime $p_i$ and check the constraints on $GF_{p_i}$. Our proof is more reliable with more primes chosen, as analyzed in the challenge:

1
2
3
4
5
6
7
8
9
10
TOTAL_PRIME_BITS = 0
for p in PRIMES:
TOTAL_PRIME_BITS += p.bit_length() - 1

BITS = 50_000 # maximum norm, bits
SECP = 256 # \secpar
RATE = (TOTAL_PRIME_BITS + BITS - 1)//BITS # rate
SECB = RATE.bit_length() - 1 # security bits per query
QUERIES = (SECP + SECB - 1) // SECB # total number of queries
assert len(PRIMES) == 1 << NUML

In fact $\mathcal{P}$ leaks $secret \pmod{p_i}$ in this process, so this is technically not that “zero-knowledge”.

For a secure proof system, we must make sure $\mathcal{P}$ computed the proof correctly, rather than giving a random answer on $GF_{p_i}$. Here is how we do:

  1. $\mathcal{P}$ makes commitments on secret values.
  2. $\mathcal{V}$ chooses some primes $p_i \in PRIMES$.
  3. $\mathcal{P}$ computes constraints inputs and outputs values on $GF_{p_i}$.
  4. $\mathcal{V}$ check the validity on $GF_{p_i}$.

Merkle Tree

With Commitment Scheme, we ensure that $\mathcal{P}$ cannot modify input values in the subsequnt proof steps. Here the challenge uses Merkle Tree Commitment, which is suitable for set commitment and membership open/proof.

  1. $\mathcal{P}$ computes $secret \pmod{p_i}$ on all $p_i \in PRIMES$, storing in a list.
  2. $\mathcal{P}$ generates a Merkle Tree on the list, using Tree Root as the commitment.
  3. For a chosen prime $p_i$, $\mathcal{P}$ opens the commitment by providing the Merkle Path of $secret \pmod{p_i}$.

Then we can make sure $\mathcal{P}$ is using $secret \pmod{p_i}$ with the committed $secret$.

NIZK

What we have is a interactive proof system. For a non-interactive one, we need $\mathcal{P}$ to generate challenge data on his own while ensuring security. We introduce a transcript structure:

  • The internal state of the transcript gets changes every time $\mathcal{P}$ performs operations such as commit and evaluate.
  • $\mathcal{P}$ generates random values based on the internal state of the transcript as challenge data provided from $\mathcal{V}$.

The transcript binds its internal state to the proof operations, while its subsequent output is determined but unpredictable. This is an important structure in non-interactive zero-knowledge proofs. Check Fiat-Shamir Heuristic.

And here comes our NIZK:

$\mathcal{P}$:

  1. $\mathcal{P}$ makes commitments on secret values.
  2. $\mathcal{P}$ gets some primes $p_i \in PRIMES$ from transcripts.
  3. $\mathcal{P}$ opens the commitments on the chosen $p_i$.
  4. $\mathcal{P}$ computes constraints inputs and outputs values on $GF_{p_i}$.

$\mathcal{V}$:

  1. $\mathcal{V}$ repeats the whole process to get the status of the transcript.
  2. $\mathcal{V}$ checks the validity of the opens.
  3. $\mathcal{V}$ verify the validity of the proof.

In addition to the target constraints, we need to add a new constraint binding all the commitments (a linear one is ok).

Vuln1 (Unintended)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
# Merkle Tree Path Verification
def verify(root: str, proof: list, pos: int, size: int) -> bytes:
assert len(proof) > 0
assert len(proof) < 32

lvls = len(proof) - 1
assert 1 << lvls == size

leaf = proof[0]
node = sha256(leaf)

for i in range(1, lvls+1):
(dirc, sibl) = proof[i]
assert isinstance(sibl, str)
assert isinstance(dirc, int)
assert len(sibl) == 64
assert dirc in [0, 1]
if dirc == 0:
node = sha256(node + sibl)
else:
node = sha256(sibl + node)

assert node == root
return leaf

We can find that the verification for Merkle Tree Path does not check whether the path corresponds to the pos, so we can choose any valid path from the tree when opening. My strategy is as follows:

  1. $p = N, q = 1$
  2. $four(a_1, a_2, a_3, a_4) = p^2 - 4$
  3. $(b_1, b_2, b_3, b_4) = (rand, rand, rand, rand)$

For each $pos$, we should find three positions $(i, j, k)$ where $(b_2, b_3, b_4)$ can be opened to satisfy the constraints. I just iterate through $i$ and solve the following equations:

  1. $b_3^2 + b_4^2 = r \pmod{p_i}$
  2. $f_0 \cdot b_3 + f_1 \cdot b_4 = v \pmod{p_i}$

After finding possible solutions, we can then check if there are positions $(j, k)$ that satisfy the solutions.

Vuln2 (Intended)

The second vuln lies in the final constraint over all the commitments. This time our strategy is more direct:

  1. $p = N, q = 1$
  2. $four(a_1, a_2, a_3, a_4) = p^2 - 4$
  3. find $(b_1, b_2, b_3, b_4)$ such that $\forall p_i \in PRIMES \rightarrow (b_1, b_2, b_3, b_4) = (-1, -1, -1, 0) \pmod{p_i}$

Or something else, satisfying $b_1^2 + b_2^2 + b_3^2 + b_4^2 = -3 \pmod{p_i}$. It turns out that one can easily find a solution with CRT, which will be very large. The value of the last constraints (linear combined commitments) is too large to pass the following code in Verifier.

1
2
3
4
5
6
class Verifier:
def value(self):
value = next(self.vals)
assert - 2**BITS < value < 2**BITS
self.tx.value(value)
return value

Actually $\mathcal{V}$ does NOT check how many commitments are used, and the verification for the last constraint just sum all of them up. So we can add an extra “unauthorized” commitment to control the value to zero. Since transcript is unchanged, we can first get both the constraint and all the $p_i$, and then construct out commitment.

During the contest we did notice some strange implementations in the challenge.

image image

But nobody gave it a further analysis XD