Plaid CTF - innov8 challenge

Published on Mar 5, 2025 • 9 min read

NKCyber recently participated in PlaidCTF 2025.

More Info about PlaidCTF
PlaidCTF is a Capture the Flag (CTF) competition hosted by the Plaid Parliament of Pwning (PPP), a CTF team from Carnegie Mellon University. It has multiple sponsors, including the trading firm Jane Street.

Website

CTFTime

Challenge

I’ll focus on the innov8 > excav8 challenge.

The name is a pun on the V8 engine, which powers everything from Chrome to Node.js.

innov8_excav8$ ls -l
total 39076
-rw-r--r--. 1 sarge sarge      425 Mar 12 15:37 chall.py
-rwxr-xr-x. 1 sarge sarge 39692296 Mar 12 15:40 d8
-rw-r--r--. 1 sarge sarge      664 Mar 12 15:37 Dockerfile
-rw-r--r--. 1 sarge sarge       65 Mar 12 15:37 gen.js
-rw-r--r--. 1 sarge sarge   303530 Mar 12 15:37 output.txt

The d8 binary is the V8 developer shell, which runs JavaScript directly on the V8 engine.

chall.py shows how the challenge output was generated:

import subprocess

secret = open('secret.txt').read().strip()
secretbits = ''.join(f'{ord(i):08b}' for i in secret)

output = []

for bit in secretbits:
    if bit == '0':
        output += [float(i) for i in subprocess.check_output('./d8 gen.js', shell=True).decode().split()]
    else:
        output += [float(i) for i in subprocess.check_output('node gen.js', shell=True).decode().split()]

for i in output:
    print(i)

output.txt contains the output from chall.py, and gen.js is the script that chall.py runs once per bit:

for (let i = 0; i < 24; i++) {
  console.log(Math.random());
}

$ ./d8 -version
V8 version 13.6.1
d8>

At time of writing, the most recent version of V8 is 13.7.9, so I appreciate that the challenge authors provided their specific version to ensure compatibility.

Dockerfile

FROM node:23.9.0-slim AS base

RUN apt-get update && apt-get upgrade -y && apt-get install -y \
    curl \
    g++ \
    git \
    pkg-config \
    python3

FROM base AS build

RUN git clone https://chromium.googlesource.com/chromium/tools/depot_tools.git
ENV PATH="/depot_tools:${PATH}"
RUN fetch v8

WORKDIR /v8

RUN git checkout 13.6.1
RUN gn gen out/build --args='is_debug=false v8_use_external_startup_data=false target_cpu="x64" use_goma=false v8_enable_i18n_support=false symbol_level=0'
RUN ninja -C out/build d8

FROM base AS chall

WORKDIR /chall
COPY --from=build /v8/out/build/d8 .
COPY secret.txt gen.js chall.py .

ENTRYPOINT ["python3", "chall.py"]

Summary

So, we know that the secret was encoded by converting each character to its 8-bit binary representation and concatenating the results into one long bit string.

From there, each 0 bit ran gen.js with d8, and each 1 bit ran gen.js with Node.js.

Our goal is to loop through each group of 24 numbers produced by gen.js, and determine whether that group came from Node.js or from d8.
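
To make the encoding concrete, here is a minimal sketch of the same bit expansion that chall.py performs. The secret "Hi" is made up purely for illustration; the real secret is not known at this point. Each bit triggers one run of gen.js, so a secret of n characters should yield n * 8 * 24 lines of output.

```python
# Hypothetical secret, used only to illustrate the encoding in chall.py.
secret = "Hi"

# Same expansion as chall.py: each character becomes 8 bits, most significant bit first.
secretbits = "".join(f"{ord(c):08b}" for c in secret)

NUMBERS_PER_BIT = 24  # gen.js prints 24 values per run

# Every bit triggers one run of gen.js, so the output length follows from the secret length.
expected_lines = len(secretbits) * NUMBERS_PER_BIT

print(secretbits)      # 0100100001101001
print(expected_lines)  # 384
```

Going the other way, dividing the number of lines in output.txt by 24 gives the number of bits, and dividing again by 8 gives the number of characters in the secret.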

Approach

How do we know if the numbers came from d8 or Node.js?

In my research, I found the article “[V8 Deep Dives] Random Thoughts on Math.random()” by Andrey Pechkurov.

It breaks down how V8 implements Math.random(), and leads us to the following Torque code:

// ES6 #sec-math.random
extern macro RefillMathRandom(NativeContext): Smi;

transitioning javascript builtin MathRandom(
    js-implicit context: NativeContext, receiver: JSAny)(): Number {
  let smiIndex: Smi = *NativeContextSlot(ContextSlot::MATH_RANDOM_INDEX_INDEX);
  if (smiIndex == 0) {
    // refill math random.
    smiIndex = RefillMathRandom(context);
  }
  const newSmiIndex: Smi = smiIndex - 1;
  *NativeContextSlot(ContextSlot::MATH_RANDOM_INDEX_INDEX) = newSmiIndex;

  const array: FixedDoubleArray =
      *NativeContextSlot(ContextSlot::MATH_RANDOM_CACHE_INDEX);
  const random: float64 =
      array.values[Convert<intptr>(newSmiIndex)].ValueUnsafeAssumeNotHole();
  return AllocateHeapNumberWithValue(random);
}

As someone initially unfamiliar with the Torque language, I’ll break down a few of the Torque-specific concepts that I didn’t get at first:

Term	Source	Definition
`extern`	docs	extern signifies that the implementation is provided in C++, rather than defined only in Torque.
`macro`	docs	macros can either be fully defined in Torque, in which case the CSA code is generated by Torque, or marked extern, in which case the implementation must be provided as hand-written CSA code in a CodeStubAssembler class. Conceptually, it’s useful to think of macros as chunks of inlinable CSA code that are inlined at callsites.
`RefillMathRandom(NativeContext)`	nodejs v8	This macro refills the cache of random numbers, and is implementation dependent.
`Smi`	blog source	Small integers stored directly in a tagged value rather than on the heap (31-bit signed integers on builds with pointer compression).
`transitioning`	docs	Marks a callable that can cause objects to transition, that is, change their map (layout) at runtime.
`builtin`	docs	builtins are similar to macros in that they can either be fully defined in Torque or marked extern.
`js-implicit`	docs	For builtins with JavaScript linkage defined in Torque, you should use the keyword js-implicit instead of implicit. The arguments are limited to four components of the calling convention: context, receiver, target, and newTarget.
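
The index handling in MathRandom has a consequence that matters later: values are handed out from the end of the cache toward the start. Below is a toy Python model of that logic, not V8 code. The class name and the counter-based generator are hypothetical, and the cache size of 64 is my assumption for V8's kCacheSize.

```python
import itertools

CACHE_SIZE = 64  # assumed value of V8's kCacheSize


class MathRandomCache:
    """Toy model of the index/cache logic in the MathRandom builtin."""

    def __init__(self, generator):
        self.generator = generator       # iterator producing raw values
        self.cache = [0] * CACHE_SIZE
        self.index = 0                   # 0 means "cache is empty, refill first"

    def refill(self):
        # Fill slots 0..CACHE_SIZE-1 in generation order, like the refill macro.
        for i in range(CACHE_SIZE):
            self.cache[i] = next(self.generator)
        self.index = CACHE_SIZE

    def random(self):
        if self.index == 0:
            self.refill()
        # Decrement first, then read: the last generated value comes out first.
        self.index -= 1
        return self.cache[self.index]


# A counter stands in for the real generator so the order is easy to see.
rng = MathRandomCache(itertools.count())
first_three = [rng.random() for _ in range(3)]
print(first_three)  # [63, 62, 61]
```

Because the most recently generated value is returned first, any attempt to replay the generator against observed outputs has to walk the outputs in reverse, which is why the solver script later in this post reverses each chunk.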

While the Torque code defines the builtin, the cache refill itself is implemented in C++. The current V8 engine has the following implementation:

Address MathRandom::RefillCache(Isolate* isolate, Address raw_native_context) {
  Tagged<Context> native_context =
      Cast<Context>(Tagged<Object>(raw_native_context));
  DisallowGarbageCollection no_gc;
  Tagged<PodArray<State>> pod =
      Cast<PodArray<State>>(native_context->math_random_state());
  State state = pod->get(0);
  // Initialize state if not yet initialized. If a fixed random seed was
  // requested, use it to reset our state the first time a script asks for
  // random numbers in this context. This ensures the script sees a consistent
  // sequence.

  if (state.s0 == 0 && state.s1 == 0) {
    uint64_t seed;
    if (v8_flags.random_seed != 0) {
      seed = v8_flags.random_seed;
    } else {

      isolate->random_number_generator()->NextBytes(&seed, sizeof(seed));
    }
    state.s0 = base::RandomNumberGenerator::MurmurHash3(seed);
    state.s1 = base::RandomNumberGenerator::MurmurHash3(~seed);
    CHECK(state.s0 != 0 || state.s1 != 0);
  }

  Tagged<FixedDoubleArray> cache =
      Cast<FixedDoubleArray>(native_context->math_random_cache());
  // Create random numbers.
  for (int i = 0; i < kCacheSize; i++) {
    // Generate random numbers using xorshift128+.
    base::RandomNumberGenerator::XorShift128(&state.s0, &state.s1);
    cache->set(i, base::RandomNumberGenerator::ToDouble(state.s0));
  }
  pod->set(0, state);

  Tagged<Smi> new_index = Smi::FromInt(kCacheSize);
  native_context->set_math_random_index(new_index);
  return new_index.ptr();
}
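
The refill loop can be mirrored in Python to get a feel for it. This is a sketch under stated assumptions rather than V8's exact code: xorshift128 follows the shift constants (23, 17, 26) used by the solver script later in this post, and to_double assumes the conversion that script models, where the top 52 bits of state0 become the mantissa of a double in [1, 2) before 1 is subtracted. The exact conversion may differ between V8 versions, and the starting state values are arbitrary.

```python
import struct

MASK64 = (1 << 64) - 1  # emulate 64-bit unsigned wraparound


def xorshift128(state0: int, state1: int) -> tuple[int, int]:
    """One state update on 64-bit words, returning (new_state0, new_state1)."""
    s1 = state0
    s0 = state1
    s1 ^= (s1 << 23) & MASK64
    s1 ^= s1 >> 17
    s1 ^= s0
    s1 ^= s0 >> 26
    return s0, s1


def to_double(state0: int) -> float:
    """Assumed conversion: top 52 bits of state0 form the mantissa of a double in [1, 2)."""
    bits = (state0 >> 12) | 0x3FF0000000000000
    return struct.unpack("<d", struct.pack("<Q", bits))[0] - 1.0


def refill(state0: int, state1: int, size: int = 64):
    """Generate one cache worth of values, returning (cache, state0, state1)."""
    cache = []
    for _ in range(size):
        state0, state1 = xorshift128(state0, state1)
        cache.append(to_double(state0))
    return cache, state0, state1


# Arbitrary non-zero starting state, chosen only for the demonstration.
cache, s0, s1 = refill(0x0123456789ABCDEF, 0xFEDCBA9876543210)
print(cache[:3])
```

The point to notice is that each output exposes 52 bits of state0 directly, so a handful of consecutive outputs pins down the 128-bit state.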

When V8 runs by itself, RandomNumberGenerator seeds itself from /dev/urandom if it is available, and otherwise falls back to timing data:

  // Gather entropy from /dev/urandom if available.
  FILE* fp = fopen("/dev/urandom", "rb");
  if (fp != NULL) {
    int64_t seed;
    size_t n = fread(&seed, sizeof(seed), 1, fp);
    fclose(fp);
    if (n == 1) {
      SetSeed(seed);
      return;
    }
  }

  // We cannot assume that random() or rand() were seeded
  // properly, so instead of relying on random() or rand(),
  // we just seed our PRNG using timing data as fallback.
  // This is weak entropy, but it's sufficient, because
  // it is the responsibility of the embedder to install
  // an entropy source using v8::V8::SetEntropySource(),
  // which provides reasonable entropy, see:
  // https://code.google.com/p/v8/issues/detail?id=2905
  int64_t seed = Time::NowFromSystemTime().ToInternalValue() << 24;
  seed ^= TimeTicks::HighResolutionNow().ToInternalValue() << 16;
  seed ^= TimeTicks::Now().ToInternalValue() << 8;
  SetSeed(seed);

So, left to itself, V8 does not guarantee a high-entropy seed: it reads /dev/urandom when available and otherwise falls back to weak timing data.

If we look at Node.js, we will see:

    V8::SetEntropySource([](unsigned char* buffer, size_t length) {
      // V8 falls back to very weak entropy when this function fails
      // and /dev/urandom isn't available. That wouldn't be so bad if
      // the entropy was only used for Math.random() but it's also used for
      // hash table and address space layout randomization. Better to abort.
      CHECK(ncrypto::CSPRNG(buffer, length));
      return true;
    });

which installs ncrypto::CSPRNG as V8's entropy source.

The source code comments make it clear that V8's fallback entropy is weak, and that the embedder is expected to install a stronger source, which Node.js does.

Now we need to figure out how to exploit this fact.

Attack

Our goal is to decide whether a sequence of outputs was produced by a known algorithm.

Luckily, the excellent creator PwnFunction wrote a program that uses Microsoft’s Z3 theorem prover to recover the internal state of the pseudorandom number generator from only a few of its outputs.

Z3 is a state-of-the art theorem prover from Microsoft Research. It can be used to check the satisfiability of logical formulas over one or more theories. Z3 offers a compelling match for software analysis and verification tools, since several common software constructs map directly into supported theories.

— https://microsoft.github.io/z3guide/docs/logic/intro

The chall.py output is a series of 24-number chunks, so we need to apply the same technique to each chunk to decide whether it was generated with Node.js or with d8.

"""
This file contains functions primarily generated by DeepSeek-V3 (then edited by me).

I gave Deepseek the files:

- https://github.com/PwnFunction/v8-randomness-predictor/blob/main/main.py
- chall.py
"""

import struct

import z3

def is_node_sequence(floats: list[float]) -> bool:
    assert len(floats) == 24

    reversed_floats = floats[::-1]

    solver = z3.Solver()
    state0, state1 = z3.BitVecs("se_state0 se_state1", 64)

    current_state0 = state0
    current_state1 = state1

    for num in reversed_floats:
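        # Recover the 52-bit mantissa of num + 1.0 (mirrors the ToDouble conversion being modeled)
        try:
            float_plus_one = num + 1.0
            # Pack and unpack with the same explicit little-endian byte order
            packed = struct.pack("<d", float_plus_one)
            uint64 = struct.unpack("<Q", packed)[0]
            mantissa = uint64 & ((1 << 52) - 1)
        except (struct.error, TypeError, OverflowError):
            # Malformed input cannot match the generator
            return False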

        # Perform one step of XorShift128+
        s1 = current_state0
        s0 = current_state1
        new_state0 = s0
        s1 ^= s1 << 23
        s1 ^= z3.LShR(s1, 17)
        s1 ^= s0
        s1 ^= z3.LShR(s0, 26)
        new_state1 = s1

        # Add constraint: the upper 52 bits of new_state0 (after shift) must equal the mantissa
        solver.add(z3.LShR(new_state0, 12) == mantissa)

        # Update the current states for next iteration
        current_state0, current_state1 = new_state0, new_state1

    return solver.check() == z3.sat

def reverse_secretbits(secretbits: str) -> str:
    # Split the binary string into chunks of 8 bits
    chunks = [secretbits[i:i+8] for i in range(0, len(secretbits), 8)]
    # Convert each 8-bit binary string to its corresponding character
    return ''.join(chr(int(chunk, 2)) for chunk in chunks)
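
The mantissa step is the least obvious part of the solver, so it is worth checking in isolation. The helper below is a standalone sketch (mantissa_of is a name I made up for illustration) that repeats the same pack/unpack trick: adding 1.0 moves a value from [0, 1) into [1, 2), where the exponent bits are fixed and the low 52 bits hold the fraction.

```python
import struct


def mantissa_of(x: float) -> int:
    """Return the 52-bit mantissa of x + 1.0, for x in [0, 1)."""
    packed = struct.pack("<d", x + 1.0)
    as_uint64 = struct.unpack("<Q", packed)[0]
    return as_uint64 & ((1 << 52) - 1)


# 1.5 is 0x3FF8000000000000, so only the top mantissa bit is set.
print(hex(mantissa_of(0.5)))   # 0x8000000000000
print(hex(mantissa_of(0.25)))  # 0x4000000000000
print(hex(mantissa_of(0.0)))   # 0x0
```

In the solver, this integer is what gets compared against the generator state shifted right by 12 bits.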

Then, once I was sure I was on the right track, I wrote some glue logic to apply to the challenge:

"""
this script takes "output.txt" from plaidctf.com for the "innov8" challenge.

- https://plaidctf.com/challenge/5
- https://plaidctf.com/files/innov8_excav8.d35e5bf36e3e6438dd960aa7adeeb1dcbb25479bddd96509ba72968e1238b488.tgz

It was adapted from PwnFunction's v8-randomness-predictor.

- https://github.com/PwnFunction/v8-randomness-predictor
"""

from deepseek import is_node_sequence, reverse_secretbits
from itertools import batched
from tqdm import tqdm


def is_node_wrapper(f: list[float]) -> int:
    assert len(f) == 24
    if is_node_sequence(f):
        return 1
    else:
        return 0

def main():
    print('starting...')

    with open("output.txt") as f:
        numbers = [float(line.strip()) for line in f]
        windows = batched(numbers, 24)

    decoded_bits = ""

    for group in tqdm(windows):
        bit = is_node_wrapper(group)
        print(bit)
        decoded_bits += str(bit)

    print(reverse_secretbits(decoded_bits))

if __name__ == "__main__":
    main()
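
One portability note: itertools.batched only exists in Python 3.12 and later. On an older interpreter, a small stand-in like the following should work for this use case. The name batched_compat is hypothetical, and this is a minimal sketch rather than a full reimplementation.

```python
from itertools import islice
from typing import Iterable, Iterator, Tuple, TypeVar

T = TypeVar("T")


def batched_compat(items: Iterable[T], n: int) -> Iterator[Tuple[T, ...]]:
    """Yield consecutive tuples of up to n items; the last one may be shorter."""
    if n < 1:
        raise ValueError("n must be at least one")
    it = iter(items)
    # islice pulls the next n items; an empty tuple means the input is exhausted.
    while chunk := tuple(islice(it, n)):
        yield chunk


print(list(batched_compat(range(5), 2)))  # [(0, 1), (2, 3), (4,)]
```

Note that a trailing chunk shorter than 24 would trip the length assertion in is_node_wrapper, which is a useful signal that the input file is truncated.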

Running attack.py on output.txt produces:

$ python3 attack.py
# ... processing elided ...
flag: PCTF{BuilD1nG_v8_i5_SuCh_4_pa1N...}
password to part 2: oaq1MD92evRsDZvH

And with that, we get the flag.

What did we learn?

The Chromium codebase can be intimidating, but there’s a lot of good documentation and resources to help you approach it.
It’s important to have a cryptographically secure entropy source for pseudorandom number generation.
The Z3 theorem prover is useful for modeling known algorithms and solving for their hidden state.
Node.js depends on V8, but is different from V8.
Large Language Models are good at reformatting similar solutions, but not great at large logical leaps.

I’m not sure we’ll do well in the event overall, given that we’ve only spent a few hours on it, and it’s open for 3 days.

However, I think I learned something new, and I hope you did too!

Feel free to contact me, if you want!