Thursday 20 March 2014

Using the RPi Camera module to generate random numbers


I do know about /dev/hwrng (sudo modprobe bcm2708-rng) on the Raspberry Pi, but it has a maximum output of 125KiB/second. So just for fun I thought I'd see how much cryptographically secure random numbers could be generated by the quantization errors on the camera module. Anyhow I also do not fully trust broadcoms random number generator because it is not public knowledge how the random numbers are being generated. And yes I do see some irony in using a blob infested RPi to generate high quality true random numbers. But with binary blobs everywhere, even in hard disks, it is hard to stay away from them. No matter how much you personally feel that they are bad, for any kind of good security/privacy.

The raspistill command has a --raw argument which appends at the end of a jpeg image file the raw uncompressed monochrome data from the image sensor before it is processed to add colour via the BGGR Bayer filter. In fact it is before quite a lot of processing: lens shading, black level correction, debayering, denoise, sharpening, defective pixel correction, colour correction.

Taking two raw images of the same subject should generate some least significant bit differences as the 10-bit ADC within the image sensor converted the voltages at each pixel into digital values. At the simplest level this could be due to random changes in light levels between images (number of photons hitting each sensor element) or thermal variations within the system between images or vibration of the image sensor or at a macro level a bird flying by the image sensor (if it was pointing at the sky).

Anyhow the idea is simple enough.
  • take two images using rpistill --raw, 
  • extract and XOR the raw images together to remove everything unchanged between the images so that only the differences remain.
  • There will be randomness here, but it will be low grade (mostly zero bits), so it needs to be concentrated or distilled somehow.  Or it is biased towards zero, so it needs to be unbiased. One solution is to pass the data through a Von Neumann's bias correcting algorithm. Which at its simplest is to throw away bits. i.e. 00-> /dev/null, 01->0, 10->1, 11->/dev/null. If the data was random then on average 8 bits input would become at most 2 bits output.
  • To help reduce any systemic biases every other output generated will be flipped.

Prerequisites:

sudo apt-get install rng-tools

Hardware:

Raspberry Pi model B
  • 512MiB RAM, 128MB of RAM allocated to GPU, no overclocking
  • Inbuilt 10/100 NIC is not connected.
  • Top USB port is empty
  • Bottom  USB port is empty
Raspberry Pi camera board

Code:

$ cat makeraw.sh

#!/bin/sh -- 
bytes=6404096
header=32768
sensordata=6345216
footer=26112
trunk1=`expr $bytes - $header`
#trunk1=`expr $sensordata + $footer`
# 6404096 bytes of RAW data is appended after a jpeg files data
# RAW image is always the maximum size x=2592 and y=1944
# 32768 byte header block before bit packed sensor data
# 10 bits, 10 bits, 10 bits, 10 bits = first 5 bytes of sensor data
# each line of image has 24 extra bytes at the end (unknown)
# ((2592*10/8)+24)=3264 bytes per horizontal line
# 1944 vertical lines of 3264 bytes each (3264x1944=6345216 bytes of sensor data)
# 26112 byte footer block after sensor data
raspistill --raw -q 1 -w 200 -h 200 --verbose --timeout 30 --thumb none --exif none --nopreview -o - | tail -c $trunk1 | head -c $sensordata > a.raw
raspistill --raw -q 1 -w 200 -h 200 --verbose --timeout 30 --thumb none --exif none --nopreview -o - | tail -c $trunk1 | head -c $sensordata > b.raw


if [ ! -r xor.c ]; then
cat << EOF > xor.c
#include <stdio.h>
#include <stdlib.h>

// exclusive or two files together and produce a third file of the difference
int main(int argc, char **argv)
{
 FILE *filea, *fileb, *filec;
 int a, b;

 filea=fopen(argv[1],"r");
 fileb=fopen(argv[2],"r");
 filec=fopen(argv[3],"w");

 a=getc(filea);
 b=getc(fileb);
 while (!feof(filea) && !feof(fileb)) {
        putc(a ^ b, filec);
        a=getc(filea);
        b=getc(fileb);
 }
        fclose(filea);
 fclose(fileb);
 fclose(filec);
}
EOF
fi

if [ ! -r xor ]; then
gcc -o xor xor.c
fi

./xor a.raw b.raw c.raw

if [ ! -r von-Neumann.c ]; then
cat << EOF > von-Neumann.c
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
        FILE *filea, *filec;
        int count, a, c=0, bits=0, flipbit=0;

        filea=fopen(argv[1],"r");
        filec=fopen(argv[2],"w");

        a=getc(filea);
        while (!feof(filea)) {
// 00-> /dev/null
// 01-> 0
// 10-> 1
// 11-> /dev/null
// on average 8 bits are reduced to 2, but if there is no entropy it will shrink smaller
// to remove any systemic bias, flip every other bit on the output
                for (count=0;count<3;count++) {
                        switch (a&3) {
                                case 0: break;
                                case 1: c=(c<<1)|(0^flipbit); bits++; break;
                                case 2: c=(c<<1)|(1^flipbit); bits++ ; break;
                                case 3: break;
                                default: break; //should never reach here
                        }
                        a=(a>>2);
                        flipbit^=1;
                        if(bits==8) {
                                bits=0;
                                putc(c,filec);
                                c=0;
                        }
                }
                a=getc(filea);
        }
        fclose(filea);
        fclose(filec);
}
EOF
fi


if [ ! -r von-Neumann ]; then
gcc -o von-Neumann von-Neumann.c
fi

./von-Neumann c.raw d.raw
./von-Neumann d.raw e.raw
./von-Neumann e.raw f.raw
./von-Neumann f.raw g.raw


Quick tests:

$ ./makeraw.sh
$ cat c.raw | rngtest
rngtest 2-unofficial-mt.14
Copyright (c) 2004 by Henrique de Moraes Holschuh
This is free software; see the source for copying conditions.  There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

rngtest: starting FIPS tests...
rngtest: entropy source exhausted!
rngtest: bits received from input: 50761728
rngtest: FIPS 140-2 successes: 0
rngtest: FIPS 140-2 failures: 2538
rngtest: FIPS 140-2(2001-10-10) Monobit: 2538
rngtest: FIPS 140-2(2001-10-10) Poker: 2538
rngtest: FIPS 140-2(2001-10-10) Runs: 2538
rngtest: FIPS 140-2(2001-10-10) Long run: 2538
rngtest: FIPS 140-2(2001-10-10) Continuous run: 2069
rngtest: input channel speed: (min=8.826; avg=282.277; max=1733.953)Mibits/s
rngtest: FIPS tests speed: (min=2.334; avg=9.600; max=10.260)Mibits/s
rngtest: Program run time: 5239237 microseconds


This result was to be expected, so lets look at more distilled randomness:

$ cat d.raw | rngtest
rngtest 2-unofficial-mt.14
Copyright (c) 2004 by Henrique de Moraes Holschuh
This is free software; see the source for copying conditions.  There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

rngtest: starting FIPS tests...
rngtest: entropy source exhausted!
rngtest: bits received from input: 3771400
rngtest: FIPS 140-2 successes: 139
rngtest: FIPS 140-2 failures: 49
rngtest: FIPS 140-2(2001-10-10) Monobit: 0
rngtest: FIPS 140-2(2001-10-10) Poker: 14
rngtest: FIPS 140-2(2001-10-10) Runs: 47
rngtest: FIPS 140-2(2001-10-10) Long run: 0
rngtest: FIPS 140-2(2001-10-10) Continuous run: 0
rngtest: input channel speed: (min=10.590; avg=250.494; max=1733.953)Mibits/s
rngtest: FIPS tests speed: (min=4.299; avg=6.761; max=8.473)Mibits/s
rngtest: Program run time: 559268 microseconds



Better but still not great, so lets look at even more distilled randomness:

$ cat e.raw | rngtest
rngtest 2-unofficial-mt.14
Copyright (c) 2004 by Henrique de Moraes Holschuh
This is free software; see the source for copying conditions.  There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

rngtest: starting FIPS tests...
rngtest: entropy source exhausted!
rngtest: bits received from input: 723504
rngtest: FIPS 140-2 successes: 36
rngtest: FIPS 140-2 failures: 0

rngtest: FIPS 140-2(2001-10-10) Monobit: 0
rngtest: FIPS 140-2(2001-10-10) Poker: 0
rngtest: FIPS 140-2(2001-10-10) Runs: 0
rngtest: FIPS 140-2(2001-10-10) Long run: 0
rngtest: FIPS 140-2(2001-10-10) Continuous run: 0
rngtest: input channel speed: (min=9.201; avg=209.407; max=1362.392)Mibits/s
rngtest: FIPS tests speed: (min=5.208; avg=6.055; max=6.223)Mibits/s
rngtest: Program run time: 126666 microseconds


This looks good but as you can see there is a lot less random numbers, but they are getting cryptographically more secure.

$ ls -l
-rw-r--r-- 1 pi pi 6345216
Mar 20 13:16 a.raw
-rw-r--r-- 1 pi pi 6345216
Mar 20 13:16 b.raw
-rw-r--r-- 1 pi pi 6345216 Mar 20 13:16 c.raw
-rw-r--r-- 1 pi pi 471425  Mar 20 13:16 d.raw
-rw-r--r-- 1 pi pi 90438   Mar 20 13:16 e.raw
-rw-r--r-- 1 pi pi 17204   Mar 20 13:16 f.raw
-rwxr-xr-x 1 pi pi 2603    Mar 20 13:16 makeraw.sh
-rwxr-xr-x 1 pi pi 5930    Mar 20 13:16 von-Neumann
-rw-r--r-- 1 pi pi 1259    Mar 20 13:16 von-Neumann.c
-rwxr-xr-x 1 pi pi 5766    Mar 20 13:16 xor
-rw-r--r-- 1 pi pi 482     Mar 20 13:16 xor.c


Conclusion

So for 2 still images at resolution (2592×1944) we have generated about 90438 bytes of cryptographically secure randomness. In theory the sensor could capture 15 frames a second at this resolution, but unfortunately this does not include the 6MiB of raw sensor data. So we can only capture about 1 frame a second which includes the RAW sensor data. So approximately 88 KiB/second. To fill a 3TB hard disk with random numbers would take about 384 days. If the random number generation rate was higher I would probably implement everything in software (no SD writes, most of the calculations done in RAM) but it is way too low a random number generation rate for me to even think about doing that.

P.S. In case you did not notice all the code is quick and dirty with no error checking at all. In all quick tests done by me, error checking is always skipped - makes the code more readable and it is a lot quicker to write. And also efficiency was was minimised to maximise readability.

2 comments:

  1. I wonder if you combined your idea with the one in the link below if you could get more trusted random numbers out ?
    http://jamesdotcom.com/?p=417

    ReplyDelete
    Replies
    1. It is an interesting idea. But the problem with it is that the source of the random numbers is mostly echo's from the big bang and this can be seen and recorded planet wide (or maybe from 6 recording locations to cover all sides of the planet). So it is not random if multiple people who have never met can have the same numbers. Of course some testing should be done. Analogue TV is in the VHF(30-300MHz) and UHF(300-3000MHz) frequency ranges. TV tuners probably only go up to like 1GHz.

      A SDR could be used for the testing like any of the following:
      HackRF ( http://www.greatscottgadgets.com/hackrf/ ) 20MHz bandwidth 8 bits 10->6000MHz
      Airspy ( http://airspy.com/ ) 10MHz bandwidth 12 bits 24–>1750 MHz
      RTLSDR ( http://sdr.osmocom.org/trac/wiki/rtl-sdr ) 2MHz bandwidth 8 bits 54->1100MHz
      FunCube( http://www.funcubedongle.com/ ) 192KHz bandwidth 16 bits 64->1700MHz

      Recording the empty RF spectrum (if there is any these days) using a SDR like the above would provide the same data that a TV does and use less electricity. Maybe get two people to record spectrum in the northern and southern hemisphere at the same time and check if there is high/some/little/none cross-correlation between the signals.

      Delete