Plaid CTF - innov8 challenge


NKCyber recently participated in PlaidCTF 2025.

More Info about PlaidCTF

PlaidCTF is a Capture The Flag hosted by the Plaid Parliament of Pwning (PPP), a CTF team from Carnegie Mellon University. It has multiple sponsors, including the trading firm Jane Street.

Challenge

Challenge Description: Remember the past. (Solving this gives the password for part 2's server and handout.) Reward: 42 points Solves: 100 solves — First solved by valgrind (in 9 minutes), MM_ (in 12 minutes), and Kalmaroos (in 14 minutes)

I’m going to focus on the innov8 > excav8 challenge.

The name is a pun on the V8 engine, which powers everything from Chrome to Node.js.

Terminal window
innov8_excav8$ ls -l
total 39076
-rw-r--r--. 1 sarge sarge 425 Mar 12 15:37 chall.py
-rwxr-xr-x. 1 sarge sarge 39692296 Mar 12 15:40 d8
-rw-r--r--. 1 sarge sarge 664 Mar 12 15:37 Dockerfile
-rw-r--r--. 1 sarge sarge 65 Mar 12 15:37 gen.js
-rw-r--r--. 1 sarge sarge 303530 Mar 12 15:37 output.txt

The d8 file is for running JavaScript directly with the V8 Developer Shell.

We can see the files that are relevant to the challenge:

chall.py
import subprocess
secret = open('secret.txt').read().strip()
secretbits = ''.join(f'{ord(i):08b}' for i in secret)
output = []
for bit in secretbits:
    if bit == '0':
        output += [float(i) for i in subprocess.check_output('./d8 gen.js', shell=True).decode().split()]
    else:
        output += [float(i) for i in subprocess.check_output('node gen.js', shell=True).decode().split()]
for i in output:
    print(i)

output.txt contains the output from chall.py.

gen.js
for (let i = 0; i < 24; i++) {
  console.log(Math.random());
}

Terminal window
$ ./d8 -version
V8 version 13.6.1
d8>

At time of writing, the most recent version of V8 is 13.7.9, so I appreciate that the challenge authors provided their specific version to ensure compatibility.

Dockerfile
FROM node:23.9.0-slim AS base
RUN apt-get update && apt-get upgrade -y && apt-get install -y \
    curl \
    g++ \
    git \
    pkg-config \
    python3
FROM base AS build
RUN git clone https://chromium.googlesource.com/chromium/tools/depot_tools.git
ENV PATH="/depot_tools:${PATH}"
RUN fetch v8
WORKDIR /v8
RUN git checkout 13.6.1
RUN gn gen out/build --args='is_debug=false v8_use_external_startup_data=false target_cpu="x64" use_goma=false v8_enable_i18n_support=false symbol_level=0'
RUN ninja -C out/build d8
FROM base AS chall
WORKDIR /chall
COPY --from=build /v8/out/build/d8 .
COPY secret.txt gen.js chall.py .
ENTRYPOINT ["python3", "chall.py"]

Summary

So, we know that the secret was encoded by converting each character to its 8-bit binary representation and joining the results into one long bit string.

Then, for each 0 bit, chall.py ran gen.js with d8, and for each 1 bit, it ran gen.js with Node.js.

Our goal is to loop through each group of 24 numbers produced by gen.js, and determine if the random numbers came from Node.js or V8.
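To make the encoding concrete, here is a minimal sketch of the bit expansion that chall.py performs, together with its inverse. The sample secret "PCTF" and the helper names are illustrative stand-ins, not part of the challenge files.

```python
def secret_to_bits(secret: str) -> str:
    """Expand each character into 8 bits, the same way chall.py does."""
    return "".join(f"{ord(ch):08b}" for ch in secret)


def bits_to_secret(bits: str) -> str:
    """Inverse step: regroup the bits into bytes and decode each one."""
    return "".join(chr(int(bits[i:i + 8], 2)) for i in range(0, len(bits), 8))


FLOATS_PER_BIT = 24  # gen.js prints 24 values per run

sample = "PCTF"  # hypothetical stand-in for the contents of secret.txt
bits = secret_to_bits(sample)
print(bits)                        # 01010000010000110101010001000110
print(len(bits) * FLOATS_PER_BIT)  # 768 lines of output for this sample
assert bits_to_secret(bits) == sample
```

Every recovered bit therefore costs one classification of a 24-number group, and eight classifications yield one character of the secret.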

Approach

How do we know if the numbers came from V8 or Node.js?

In my research, I found the article “[V8 Deep Dives] Random Thoughts on Math.random()” by Andrey Pechkurov.

This breaks down the internals of the V8 engine, and leads us to the following code:

math.tq
// ES6 #sec-math.random
extern macro RefillMathRandom(NativeContext): Smi;
transitioning javascript builtin MathRandom(
    js-implicit context: NativeContext, receiver: JSAny)(): Number {
  let smiIndex: Smi = *NativeContextSlot(ContextSlot::MATH_RANDOM_INDEX_INDEX);
  if (smiIndex == 0) {
    // refill math random.
    smiIndex = RefillMathRandom(context);
  }
  const newSmiIndex: Smi = smiIndex - 1;
  *NativeContextSlot(ContextSlot::MATH_RANDOM_INDEX_INDEX) = newSmiIndex;
  const array: FixedDoubleArray =
      *NativeContextSlot(ContextSlot::MATH_RANDOM_CACHE_INDEX);
  const random: float64 =
      array.values[Convert<intptr>(newSmiIndex)].ValueUnsafeAssumeNotHole();
  return AllocateHeapNumberWithValue(random);
}
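Read as ordinary code, the builtin is a countdown over a pre-filled cache. The Python model below is a rough sketch of that control flow only. The class name, the tiny cache size of 4, and the counter-based refill are stand-ins chosen for illustration, not values taken from V8.

```python
from typing import Callable


class MathRandomModel:
    """Toy model of the MathRandom builtin's cache handling."""

    def __init__(self, refill: Callable[[int], list[float]], cache_size: int = 4):
        self._refill = refill
        self._cache_size = cache_size  # arbitrary illustrative size
        self._cache: list[float] = []
        self._index = 0  # plays the role of MATH_RANDOM_INDEX_INDEX

    def random(self) -> float:
        if self._index == 0:
            # Cache exhausted: regenerate it and reset the index to its size.
            self._cache = self._refill(self._cache_size)
            self._index = self._cache_size
        self._index -= 1
        return self._cache[self._index]


def counting_refill(size: int) -> list[float]:
    """Hypothetical refill returning 0.0, 1.0, ... so the read order is visible."""
    return [float(i) for i in range(size)]


model = MathRandomModel(counting_refill)
print([model.random() for _ in range(6)])  # [3.0, 2.0, 1.0, 0.0, 3.0, 2.0]
```

Values leave the cache in the opposite order from the one in which the refill wrote them, which is why the solver later in this post reverses each group of floats before modeling it.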

As someone initially unfamiliar with the Torque language, I’ll break down a few of the Torque-specific concepts that I didn’t get at first:

| Term | Source | Definition |
| --- | --- | --- |
| extern | docs | Signifies that the implementation is provided in C++, rather than defined only in Torque. |
| macro | docs | Macros can either be fully defined in Torque, in which case the CSA code is generated by Torque, or marked extern, in which case the implementation must be provided as hand-written CSA code in a CodeStubAssembler class. Conceptually, it’s useful to think of macros as chunks of inlinable CSA code that are inlined at callsites. |
| RefillMathRandom(NativeContext) | nodejs, v8 | This macro refills the cache of random numbers, and is implementation dependent. |
| Smi | blog, source | Small integers, stored directly in a tagged word instead of on the heap (a 31-bit signed payload when pointer compression is enabled). |
| transitioning | docs | Marks a callable that may change the layout (map) of objects at runtime. |
| builtin | docs | Builtins are similar to macros in that they can either be fully defined in Torque or marked extern. |
| js-implicit | docs | For builtins with JavaScript linkage defined in Torque, you should use the keyword js-implicit instead of implicit. The arguments are limited to four components of the calling convention; MathRandom uses two of them, context and receiver. |

While Torque specifies the high-level API, the current V8 engine implements the refill as follows:

math-random.cc
Address MathRandom::RefillCache(Isolate* isolate, Address raw_native_context) {
  Tagged<Context> native_context =
      Cast<Context>(Tagged<Object>(raw_native_context));
  DisallowGarbageCollection no_gc;
  Tagged<PodArray<State>> pod =
      Cast<PodArray<State>>(native_context->math_random_state());
  State state = pod->get(0);
  // Initialize state if not yet initialized. If a fixed random seed was
  // requested, use it to reset our state the first time a script asks for
  // random numbers in this context. This ensures the script sees a consistent
  // sequence.
  if (state.s0 == 0 && state.s1 == 0) {
    uint64_t seed;
    if (v8_flags.random_seed != 0) {
      seed = v8_flags.random_seed;
    } else {
      isolate->random_number_generator()->NextBytes(&seed, sizeof(seed));
    }
    state.s0 = base::RandomNumberGenerator::MurmurHash3(seed);
    state.s1 = base::RandomNumberGenerator::MurmurHash3(~seed);
    CHECK(state.s0 != 0 || state.s1 != 0);
  }
  Tagged<FixedDoubleArray> cache =
      Cast<FixedDoubleArray>(native_context->math_random_cache());
  // Create random numbers.
  for (int i = 0; i < kCacheSize; i++) {
    // Generate random numbers using xorshift128+.
    base::RandomNumberGenerator::XorShift128(&state.s0, &state.s1);
    cache->set(i, base::RandomNumberGenerator::ToDouble(state.s0));
  }
  pod->set(0, state);
  Tagged<Smi> new_index = Smi::FromInt(kCacheSize);
  native_context->set_math_random_index(new_index);
  return new_index.ptr();
}
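For experimenting outside of V8, the refill loop can be approximated in plain Python. This is a sketch under assumptions: the shift constants (23, 17, 26) are the ones used by the solver later in this post, to_double is the mantissa-based conversion that solver assumes (not necessarily the one in every V8 version), the seed states are arbitrary nonzero example values, and the cache size is passed in rather than taken from V8.

```python
import struct

MASK64 = (1 << 64) - 1


def xorshift128(state0: int, state1: int) -> tuple[int, int]:
    """One xorshift128+ step on two 64-bit words; returns the new (state0, state1)."""
    s1, s0 = state0, state1
    s1 ^= (s1 << 23) & MASK64  # Python ints are unbounded, so mask the left shift
    s1 ^= s1 >> 17
    s1 ^= s0
    s1 ^= s0 >> 26
    return s0, s1


def to_double(state0: int) -> float:
    """Assumed conversion: the top 52 bits become the mantissa of a double in [1, 2)."""
    bits = (state0 >> 12) | 0x3FF0000000000000
    return struct.unpack("<d", struct.pack("<Q", bits))[0] - 1.0


def refill_cache(state0: int, state1: int, size: int) -> tuple[list[float], int, int]:
    """Fill a cache of `size` doubles and return it with the updated state."""
    cache = []
    for _ in range(size):
        state0, state1 = xorshift128(state0, state1)
        cache.append(to_double(state0))
    return cache, state0, state1


# Arbitrary example seed words, not values V8 would derive from MurmurHash3.
cache, s0, s1 = refill_cache(0x0123456789ABCDEF, 0xFEDCBA9876543210, 8)
print(cache)
```

Because each step is a fixed sequence of shifts and XORs, the whole sequence is determined by the two initial state words, which is what makes recovering them from the outputs feasible.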

When we’re just running V8 by itself, its RandomNumberGenerator takes entropy from /dev/urandom (if available) and falls back to the system time otherwise.

// Gather entropy from /dev/urandom if available.
FILE* fp = fopen("/dev/urandom", "rb");
if (fp != NULL) {
  int64_t seed;
  size_t n = fread(&seed, sizeof(seed), 1, fp);
  fclose(fp);
  if (n == 1) {
    SetSeed(seed);
    return;
  }
}
// We cannot assume that random() or rand() were seeded
// properly, so instead of relying on random() or rand(),
// we just seed our PRNG using timing data as fallback.
// This is weak entropy, but it's sufficient, because
// it is the responsibility of the embedder to install
// an entropy source using v8::V8::SetEntropySource(),
// which provides reasonable entropy, see:
// https://code.google.com/p/v8/issues/detail?id=2905
int64_t seed = Time::NowFromSystemTime().ToInternalValue() << 24;
seed ^= TimeTicks::HighResolutionNow().ToInternalValue() << 16;
seed ^= TimeTicks::Now().ToInternalValue() << 8;
SetSeed(seed);

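So, now we know that V8 only gets a strong seed by default when /dev/urandom is available; otherwise it falls back to weak timing data and leaves it to the embedder to install a better entropy source.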

If we look at Node.js, we will see:

V8::SetEntropySource([](unsigned char* buffer, size_t length) {
// V8 falls back to very weak entropy when this function fails
// and /dev/urandom isn't available. That wouldn't be so bad if
// the entropy was only used for Math.random() but it's also used for
// hash table and address space layout randomization. Better to abort.
CHECK(ncrypto::CSPRNG(buffer, length));
return true;
});

which uses ncrypto::CSPRNG as its entropy source.

The source code comments make it really clear that V8’s fallback entropy source is weak.

Now we need to figure out how to exploit this fact.

Attack

Our goal is to be able to prove whether a set of numbers was produced by a known algorithm.

Luckily, the excellent creator PwnFunction wrote a program that uses Microsoft’s Z3 theorem prover to reverse engineer the pseudorandom engine state using only a few outputs.

Z3 is a state-of-the-art theorem prover from Microsoft Research. It can be used to check the satisfiability of logical formulas over one or more theories. Z3 offers a compelling match for software analysis and verification tools, since several common software constructs map directly into supported theories.

https://microsoft.github.io/z3guide/docs/logic/intro

The chall.py output consists of chunks of 24 random numbers, so we need to apply this technique to each chunk to detect whether its 24 floats were generated with Node.js or V8.

deepseek.py
"""
This file contains functions primarily generated by DeepSeek-V3 (then edited by me).
I gave DeepSeek the files:
- https://github.com/PwnFunction/v8-randomness-predictor/blob/main/main.py
- chall.py
"""
import struct
import z3
def is_node_sequence(floats: list[float]) -> bool:
    assert len(floats) == 24
    # Math.random() hands out cached values last-to-first, so undo that order.
    reversed_floats = floats[::-1]
    solver = z3.Solver()
    state0, state1 = z3.BitVecs("se_state0 se_state1", 64)
    current_state0 = state0
    current_state1 = state1
    for num in reversed_floats:
        # Compute the mantissa of the number (as per V8's ToDouble method)
        try:
            float_plus_one = num + 1.0
            # Pack and unpack with the same explicit (little-endian) byte order.
            packed = struct.pack("<d", float_plus_one)
            uint64 = struct.unpack("<Q", packed)[0]
            mantissa = uint64 & ((1 << 52) - 1)
        except struct.error:
            return False
        # Perform one step of XorShift128+
        s1 = current_state0
        s0 = current_state1
        new_state0 = s0
        s1 ^= s1 << 23
        s1 ^= z3.LShR(s1, 17)
        s1 ^= s0
        s1 ^= z3.LShR(s0, 26)
        new_state1 = s1
        # Add constraint: the upper 52 bits of new_state0 (after shift) must equal the mantissa
        solver.add(z3.LShR(new_state0, 12) == mantissa)
        # Update the current states for next iteration
        current_state0, current_state1 = new_state0, new_state1
    # The group matches the modeled generator only if some initial state satisfies every constraint.
    return solver.check() == z3.sat
def reverse_secretbits(secretbits: str) -> str:
    # Split the binary string into chunks of 8 bits
    chunks = [secretbits[i:i+8] for i in range(0, len(secretbits), 8)]
    # Convert each 8-bit binary string to its corresponding character
    return ''.join(chr(int(chunk, 2)) for chunk in chunks)
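The mantissa step in is_node_sequence relies on the fact that, under the conversion the solver assumes, each printed float exposes the top 52 bits of the generator state exactly. The round trip below is a small sketch of that arithmetic. The value of state0 is an arbitrary example, and to_double mirrors the conversion assumed by the solver rather than a specific V8 release.

```python
import struct


def to_double(state0: int) -> float:
    """Assumed conversion: place the top 52 state bits in the mantissa of a double in [1, 2)."""
    bits = (state0 >> 12) | 0x3FF0000000000000
    return struct.unpack("<d", struct.pack("<Q", bits))[0] - 1.0


def recover_top_bits(value: float) -> int:
    """Undo to_double: add 1.0 back and read the 52 mantissa bits."""
    raw = struct.unpack("<Q", struct.pack("<d", value + 1.0))[0]
    return raw & ((1 << 52) - 1)


state0 = 0xDEADBEEFCAFEF00D  # arbitrary 64-bit example state
value = to_double(state0)
print(value)
assert recover_top_bits(value) == state0 >> 12
```

Only the low 12 bits of each state word are hidden, so 24 consecutive outputs put far more constraints on the 128-bit state than it has unknowns. A group that did not come from this generator and conversion is therefore very unlikely to be satisfiable.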

Then, once I was sure I was on the right track, I wrote some glue logic to apply to the challenge:

Python 3
"""
this script takes "output.txt" from plaidctf.com for the "innov8" challenge.
- https://plaidctf.com/challenge/5
- https://plaidctf.com/files/innov8_excav8.d35e5bf36e3e6438dd960aa7adeeb1dcbb25479bddd96509ba72968e1238b488.tgz
It was adapted from PwnFunction's v8-randomness-predictor.
- https://github.com/PwnFunction/v8-randomness-predictor
"""
from deepseek import is_node_sequence, reverse_secretbits
from itertools import batched
from tqdm import tqdm
def is_node_wrapper(f: list[float]) -> int:
    assert len(f) == 24
    if is_node_sequence(f):
        return 1
    else:
        return 0
def main():
    print('starting...')
    with open("output.txt") as f:
        numbers = [float(line.strip()) for line in f]
    windows = batched(numbers, 24)
    decoded_bits = ""
    for group in tqdm(windows):
        bit = is_node_wrapper(group)
        print(bit)
        decoded_bits += str(bit)
    print(reverse_secretbits(decoded_bits))
if __name__ == "__main__":
    main()

Running this script on output.txt produces:

Terminal window
$ python3 attack.py
# ... processing elided ...
flag: PCTF{BuilD1nG_v8_i5_SuCh_4_pa1N...}
password to part 2: oaq1MD92evRsDZvH

And with that, we get the flag.

What did we learn?

  • The Chromium codebase can be intimidating, but there’s a lot of good documentation and resources to help you approach it.
  • It’s important to have a cryptographically secure entropy source for pseudorandom number generation.
  • The Z3 theorem prover is useful for modeling known algorithms and solving for their hidden state.
  • Node.js depends on V8, but is different from V8.
  • Large Language Models are good at reformatting similar solutions, but not great at large logical leaps.

I’m not sure we’ll do well in the event overall, given that we’ve only spent a few hours on it, and it’s open for 3 days.

However, I think I learned something new, and I hope you did too!