
Random Numbers From an Old VHS Cassette Player 📼

or how to use a VHS player as a (crappy) True Random Number Generator

#electronics
#programming
#physics

TRNG what?

Unlike pseudo-random number generators (PRNGs), which use algorithms to generate numbers that appear random but are actually deterministic, True Random Number Generators (TRNGs) are devices that use an intrinsically stochastic physical process, such as electronic noise or radioactive decay, as an entropy source for generating random numbers.
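To make the "deterministic" part concrete, here is a minimal sketch using only Python's standard library. The seed value 42 and the variable names are arbitrary choices for illustration. Note that `os.urandom` is the operating system's random source, not a VHS player; it is used here only as a contrast to a seeded PRNG.

```python
import os
import random

# Two PRNGs seeded with the same value produce exactly the same sequence.
gen_a = random.Random(42)
gen_b = random.Random(42)
seq_a = [gen_a.randrange(256) for _ in range(8)]
seq_b = [gen_b.randrange(256) for _ in range(8)]
same_sequence = seq_a == seq_b  # the output is fully determined by the seed

# os.urandom asks the operating system for random bytes; there is no
# user-visible seed that would let us replay the same output.
os_bytes = os.urandom(8)

print("identical PRNG sequences:", same_sequence)
print("bytes from the OS:", os_bytes.hex())
```

Anyone who knows the seed can replay a PRNG's output, which is exactly what a physical entropy source is meant to prevent.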

Devices able to produce truly random numbers are essential in many fields, such as cryptography, statistical sampling, simulation, and gaming, where randomness is required to ensure fairness, security, or accuracy.

The idea of using a VHS player came to me during the Christmas holidays, while I was digitizing some old VHS cassettes for my mum with a cheap VHS video grabber. There is a variety of grabbers on Amazon, with prices ranging from 10 to over 100 €. Since I was not after a professional result, I purchased one that included a SCART-to-RCA adapter (about 25 €). Even though the cassettes I attempted to digitize were not in good condition, I believe the result is more than acceptable.

Unfortunately, there are very few technical specs about the product; in particular, the manufacturer gives no information about the ADC. The device can be used as an external video source in OBS [1]. It is simple to use and includes a manual with instructions for setting it up with OBS. I tested it on both macOS and Ubuntu 22.04 LTS.


Video Grabber from DIGITNOW!



The thing that triggered this post was noticing that a noise pattern appears when ejecting the VHS while keeping the grabber on. I think this noise comes from the tape heads, although I am not 100% sure about this.



Noise coming from the VHS player heads.



Generating random numbers from noisy frames

So the question is:

Can we use these kinds of noisy frames in order to produce random numbers?

First of all, to answer this question, we need some data to work with, so I recorded 10 minutes of this noise using OBS and saved it into an MP4 file for a total of 380 MB.

In order to understand what we are dealing with, I’ve written a Python script that reads our video one frame at a time using OpenCV [2]:

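import cv2

cap = cv2.VideoCapture('vhs_noise_long.mp4')

while cap.isOpened():
    # ok becomes False once the video is over (or a frame cannot be read)
    ok, frame = cap.read()
    if not ok:
        break

cap.release()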

Each frame contains 3 color channels (note that OpenCV returns them in blue, green, red order). In turn, each channel has a resolution of 480x720 pixels or, in other words, frame.shape = (480,720,3), and each pixel is represented by an 8-bit number (i.e., each pixel can have values ranging from 0 up to 255). We can try different approaches for generating random bits out of these frames. The easiest one is to sum up all the values in each channel and then take the remainder of the division by two; however, in this way, our bitrate is relatively low, since we generate only 3 random bits per frame and thus 72 bits per second (bps), assuming 24 frames per second.
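As a rough illustration of this "one bit per channel" idea, here is a small sketch in plain Python. The nested list `toy_frame` is a made-up stand-in for a real OpenCV frame (indexed as frame[row][col][channel]), and the function name `channel_parity_bits` is hypothetical.

```python
def channel_parity_bits(frame):
    """Return one bit per channel: the parity of the sum of all its pixel values.

    `frame` is a nested list indexed as frame[row][col][channel], a plain-Python
    stand-in for the (480, 720, 3) array returned by OpenCV.
    """
    n_channels = len(frame[0][0])
    sums = [0] * n_channels
    for row in frame:
        for pixel in row:
            for c, value in enumerate(pixel):
                sums[c] += value
    return [s % 2 for s in sums]

# Tiny 2x2 "frame" with 3 channels per pixel (made-up values).
toy_frame = [
    [[10, 20, 30], [11, 21, 31]],
    [[12, 22, 32], [13, 23, 34]],
]

# Channel sums are 46, 86 and 127, so the parities are 0, 0 and 1.
bits = channel_parity_bits(toy_frame)
print(bits)
```

However large the frame is, this approach squeezes it down to just three bits, which is why the bitrate stays so low.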

This approach has another drawback: if we look at two adjacent frames, we can see that there are only a few spots where the two actually differ. This could be because the ADC in our VHS grabber is cutting the high-frequency component of our noise, making two adjacent frames nearly equal.



Difference between two successive frames.
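One possible way to put a number on "nearly equal" is to count the fraction of pixel values that change between two frames. The sketch below is only an illustration with made-up values and a hypothetical helper name, `changed_fraction`; it is not how the figure above was produced. Plain lists stand in for flattened frames.

```python
def changed_fraction(frame_a, frame_b, threshold=0):
    """Fraction of positions whose values differ by more than `threshold`."""
    if not frame_a or len(frame_a) != len(frame_b):
        raise ValueError("frames must be non-empty and of equal length")
    changed = sum(1 for a, b in zip(frame_a, frame_b) if abs(a - b) > threshold)
    return changed / len(frame_a)

# Two made-up "flattened frames" that differ in 2 positions out of 8.
prev_frame = [12, 200, 37, 90, 90, 14, 255, 0]
next_frame = [12, 200, 41, 90, 90, 14, 250, 0]

fraction = changed_fraction(prev_frame, next_frame)
print(fraction)
```

A fraction close to 0 between neighboring frames would suggest strong correlation, which is the reason for skipping frames in what follows.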



In order to avoid correlated bit generation, we have to take only frames that are well separated in time, which results in an even lower bitrate. However, by using this naive approach, we are throwing away a lot of information. A better approach, which enhances our bitrate considerably, is to take each frame and flatten it into a one-dimensional array; then, we can split the resulting array into chunks of 120 elements. For each chunk, we can take the sum and check if the remainder of the division by two is 1 or 0.
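Before looking at the full script, here is a toy worked example of the chunking idea on a handful of made-up values. The chunk size of 3 is chosen only to keep the example readable, and `chunk_parity_bits` is a hypothetical name.

```python
def chunk_parity_bits(values, chunk_size):
    """Split `values` into consecutive chunks and return the parity of each chunk sum."""
    bits = []
    for start in range(0, len(values), chunk_size):
        chunk = values[start:start + chunk_size]
        bits.append(sum(chunk) % 2)
    return bits

# Nine made-up pixel values split into three chunks of three:
# [1, 2, 3] sums to 6 (even), [4, 5, 6] to 15 (odd), [7, 8, 10] to 25 (odd).
pixels = [1, 2, 3, 4, 5, 6, 7, 8, 10]
toy_bits = chunk_parity_bits(pixels, 3)
print(toy_bits)  # [0, 1, 1]
```

Each chunk yields one bit, so a smaller chunk size gives more bits per frame, at the price of summing over fewer noisy pixels for each bit.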

By doing this, we can achieve a bitrate of about 2.5 kbps (only one frame in every 100 is used, in order to avoid possible correlation, as described before).
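The bitrate can be checked with a little arithmetic. The frame size, the chunk length of 120 and the "one frame in 100" rule come from the script; the frame rate is not stated in this post, so the two values tried below (24 and 30 fps) are assumptions.

```python
HEIGHT, WIDTH, CHANNELS = 480, 720, 3  # frame.shape
CHUNK_SIZE = 120                       # chunk length used in the script
FRAME_STRIDE = 100                     # only one frame in every 100 is used

# One bit per chunk, per channel: 2880 chunks x 3 channels.
bits_per_used_frame = (HEIGHT * WIDTH // CHUNK_SIZE) * CHANNELS

def bitrate_bps(fps):
    """Bits per second when one frame in every FRAME_STRIDE is used."""
    return bits_per_used_frame * fps / FRAME_STRIDE

for fps in (24, 30):  # assumed frame rates, not measured
    print(fps, "fps ->", bitrate_bps(fps), "bps")

# Cross-check against the 211680-byte file analyzed with ENT further down.
frames_used = 211680 * 8 // bits_per_used_frame
print("frames used:", frames_used)
```

Under these assumptions, each used frame gives 8640 bits, so 30 fps gives roughly 2.6 kbps and 24 fps roughly 2.1 kbps, and the 211680-byte output file corresponds to 196 used frames.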


import cv2
import numpy as np

def divide_chunks(l, n):
    """Yield successive chunks of length n from l."""
    for i in range(0, len(l), n):
        yield l[i:i + n]

def main():

    # Open .mp4 video
    cap = cv2.VideoCapture('vhs_noise_long.mp4')

    # Initialize frame count
    count = 0

    # List for storing all the generated random bits
    random_bits = list()

    # Loop over all frames
    while cap.isOpened():
        # Read the next frame; ok is False once the video is over
        ok, frame = cap.read()
        if not ok:
            break

        # Update frame count
        count += 1

        # Use only one frame in every 100, skip all the others
        if count % 100 != 0:
            continue

        # Flatten each frame channel (OpenCV order is blue, green, red)
        # and split it into chunks of length 120
        blue = divide_chunks(frame[:,:,0].flatten(),120)
        green = divide_chunks(frame[:,:,1].flatten(),120)
        red = divide_chunks(frame[:,:,2].flatten(),120)

        # Generate one random bit per chunk: the parity of the chunk sum
        for b,g,r in zip(blue,green,red):

            random_bits.append(int(np.sum(b))%2)
            random_bits.append(int(np.sum(g))%2)
            random_bits.append(int(np.sum(r))%2)

    # Release the video file
    cap.release()
        


Now we have at our disposal a sequence of random bits that we can easily pack in order to build our random numbers. However, we first have to verify that our bitstream is indeed random by using some statistical tests, in particular the ones contained in ENT [3].

Since ENT accepts binary input files containing 8-bit numbers, we have to save our random numbers in this form. In Python, this can be achieved as follows:

    # (still inside main) Let's define a list for storing our 8-bit numbers
    random_8bit_numbers = list()

    # Pack 8 bits into a byte (the first bit is the least significant one)
    for i in range(len(random_bits)//8):
        rn = 0
        for j in range(8):
            rn += random_bits[i*8+j]*(2**j)

        random_8bit_numbers.append(rn)

    # Open the output file (closed automatically at the end of the with block)
    with open("randombytes_vhs.bin","wb") as bf:
        # Write the packed bytes, not the raw list of bits, into a binary file
        bf.write(bytes(random_8bit_numbers))

if __name__ == "__main__":
    main()
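To see what the packing does, here is a small standalone sketch on made-up bits. The function name `pack_bits_lsb_first` is hypothetical; it follows the same convention as the loop above (the first bit of each group is the least significant) and drops any leftover bits that do not fill a whole byte.

```python
def pack_bits_lsb_first(bits):
    """Pack a list of 0/1 values into bytes; the first bit of each group is the LSB."""
    out = bytearray()
    for i in range(len(bits) // 8):  # leftover bits (fewer than 8) are dropped
        value = 0
        for j in range(8):
            value |= bits[i * 8 + j] << j
        out.append(value)
    return bytes(out)

example_bits = [
    1, 0, 0, 0, 0, 0, 0, 0,  # only bit 0 set -> byte value 1
    0, 0, 0, 0, 0, 0, 0, 1,  # only bit 7 set -> byte value 128
    1, 1,                    # incomplete group, dropped
]

packed = pack_bits_lsb_first(example_bits)
print(list(packed))  # [1, 128]

# Calling bytes() directly on the bit list would instead store one byte per bit.
one_byte_per_bit = bytes(example_bits)
print(len(packed), len(one_byte_per_bit))  # 2 18
```

The last two lines show why the packed list, and not the raw list of bits, has to be written to the file: otherwise the file would contain only the byte values 0 and 1, and ENT would report very low entropy.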


Now that we have this file, we can provide it to ENT so that it can execute all the statistical tests:

$ ent randombytes_vhs.bin
Entropy = 7.999070 bits per byte.

Optimum compression would reduce the size
of this 211680 byte file by 0 percent.

Chi square distribution for 211680 samples is 272.49, and randomly
would exceed this value 21.57 percent of the times.

Arithmetic mean value of data bytes is 127.4212 (127.5 = random).
Monte Carlo value for Pi is 3.144217687 (error 0.08 percent).
Serial correlation coefficient is -0.001024 (totally uncorrelated = 0.0).


The results given by ENT are consistent with our numbers being random!

Let’s briefly comment on each point:

  • Optimum compression should be 0 percent for a TRNG, since a truly random stream has no redundancy to exploit.
  • The chi-square divided by the number of degrees of freedom (i.e., 255 in our case) should be as close as possible to 1 for a TRNG, so our 1.0686 is acceptable.
  • Arithmetic mean value is self-explanatory: it should be close to 127.5, and the obtained 127.4212 appears random.
  • Monte Carlo value will be the topic of another blog post, but the idea is that a TRNG should reproduce the value of Pi.
  • Serial correlation measures how much a byte depends on the previous one, so the value should be close to 0 for a TRNG.
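For readers who want to see what these quantities look like in code, here is a simplified sketch of the statistics discussed above, using only the standard library. It is not ENT's source code: the function names are hypothetical, the serial correlation is a plain lag-1 coefficient with the last byte paired with the first, and the Monte Carlo part follows the description in ENT's documentation (successive 6-byte groups used as 24-bit X and Y coordinates). The demo data comes from `os.urandom`, not from the VHS file.

```python
import math
import os
from collections import Counter

def ent_like_stats(data):
    """Simplified entropy, chi-square, mean and serial correlation of a bytes object."""
    n = len(data)
    counts = Counter(data)
    # Shannon entropy in bits per byte (8.0 is the maximum).
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    # Chi-square against a uniform distribution over the 256 byte values.
    expected = n / 256
    chi_square = sum((counts.get(v, 0) - expected) ** 2 / expected for v in range(256))
    # Arithmetic mean of the byte values (127.5 for uniform bytes).
    mean = sum(data) / n
    # Lag-1 serial correlation, pairing the last byte with the first one.
    sum_x = sum(data)
    sum_xx = sum(x * x for x in data)
    sum_xy = sum(data[i] * data[(i + 1) % n] for i in range(n))
    denom = n * sum_xx - sum_x ** 2
    serial = (n * sum_xy - sum_x ** 2) / denom if denom else float("nan")
    return {"entropy": entropy, "chi_square": chi_square,
            "mean": mean, "serial_correlation": serial}

def monte_carlo_pi(data):
    """Estimate Pi from 6-byte groups: 3 bytes for x, 3 bytes for y."""
    radius = 256 ** 3 - 1
    inside = total = 0
    for i in range(0, len(data) - 5, 6):
        x = int.from_bytes(data[i:i + 3], "big")
        y = int.from_bytes(data[i + 3:i + 6], "big")
        total += 1
        if x * x + y * y <= radius * radius:  # point falls inside the quarter circle
            inside += 1
    return 4 * inside / total if total else float("nan")

# Demo on bytes from the operating system (a stand-in for randombytes_vhs.bin).
sample = os.urandom(60000)
stats = ent_like_stats(sample)
pi_estimate = monte_carlo_pi(sample)
print(stats)
print("chi-square / 255:", stats["chi_square"] / 255)
print("Pi estimate:", pi_estimate)
```

To run the same checks on the VHS data, `sample` could be replaced with the contents of randombytes_vhs.bin; small differences with respect to ENT's own numbers are to be expected, since this is only an approximation of what ENT does.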

In conclusion, using noise from a VHS tape player as a starting point, we were able to produce random integers that pass ENT's statistical tests!


References


  1. OBS, free and open source software for video recording and live streaming.

  2. OpenCV, a real-time optimized computer vision library.

  3. ENT, A Pseudorandom Number Sequence Test Program.