The first file is called random_01.aes and holds 8080 bytes of encrypted random data. The random numbers were made with three variable oscillators, and the encryption key is all zeros. The file was encrypted with AES by a program called Perfect File Encryption. That software failed validation tests, but the file is presented as an artifact of software written in the year 2000.

The second file holds random numbers from keyboard input. It is named ran-2.aes and has 8080 bytes. The key and IV are all zeros, and the cipher is AES-128 in CBC (Cipher Block Chaining) mode. The encryption software is OpenSSL.

Random numbers can come from nature or from complicated computer calculations. Several simple random-seeming functions have been developed in software. The quality of random numbers can be checked with a set of statistical tests programmed by George Marsaglia. The suite of tests is called Diehard. The Diehard tests will be run on the three files of random numbers linked above. Two are encrypted random numbers and one is a file of computer-generated random numbers (pseudo-random numbers) that is not encrypted. They should all get high grades from the Diehard statistical tests.

**July 17, 2010 : Beginning Diehard tests**. The test needs a binary file with 10 megabytes to 11 megabytes of random bits. The three files given above only have 8 kilobytes, so new 10 megabyte files will be prepared.

ran_04.dat is a text file with copies of "Alien Art For Sale".

ran_05.dat is that text file encrypted with OpenSSL AES-128-cbc.

ran_06.dat contains unencrypted random numbers from OpenSSL.

_____________________________________________________

**The results from Diehard:**

Three result files came from Diehard:

ran_04.txt

ran_05.txt

ran_06.txt

Those correspond to the three .dat files listed above.

The text file ran_04.dat was tested by Diehard and fails to look random. The results from Diehard are in ran_04.txt at Toyon Jungle Technology.

ran_04.dat has 10,346,041 bytes. That is 646,627 full 16-byte AES blocks plus 9 bytes, so the file ends with a fractional AES block of 0.5625 of a block.
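
The fractional block figure can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function name `block_split` is made up for illustration, and the one outside fact it relies on is that AES uses 16-byte blocks.

```python
AES_BLOCK_BYTES = 16  # AES always works on 128-bit (16-byte) blocks


def block_split(size_bytes, block=AES_BLOCK_BYTES):
    """Return (whole blocks, leftover bytes, leftover as a fraction of a block)."""
    whole, leftover = divmod(size_bytes, block)
    return whole, leftover, leftover / block


whole, leftover, fraction = block_split(10_346_041)  # size of ran_04.dat
print(whole, leftover, fraction)  # 646627 9 0.5625
```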

**Excerpts from Diehard results for ran_04.dat** (renamed from g.txt)

Birthday test

For a sample of size 500: mean

g.txt using bits 6 to 29 224.988

duplicate number number

spacings observed expected

0 500. 67.668

1 0. 135.335

2 0. 135.335

3 0. 90.224

4 0. 45.112

5 0. 18.045

6 to INF 0. 8.282

Chisquare with 6 d.o.f. = 3194.53 p-value= 1.000000

OPERM5 test for file g.txt

For a sample of 1,000,000 consecutive 5-tuples,

chisquare for 99 degrees of freedom=*******; p-value=1.000000

Binary rank test for g.txt

Rank test for 31x31 binary matrices:

rows from leftmost 31 bits of each 32-bit integer

rank observed expected (o-e)^2/e sum

28 35228 211.4 ********** *********

29 3627 5134.0 442.359800 *********

30 1030 23103.0 ********** *********

31 115 11551.5 ********** *********

chisquare=****** for 3 d. of f.; p-value=1.000000

________________________________________________

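For a random file, Diehard's p-values should be spread uniformly between 0 and 1. A p-value of 1.000000 (or 0.000000) to six decimal places means the file fails the test and seems non-random. The documentation that comes with Diehard explains this.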

__________________________________________

**The diehard results from ran_05.txt:**

$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$

:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

:: THE BITSTREAM TEST ::

:: The file under test is viewed as a stream of bits. Call them ::

:: b1,b2,... . Consider an alphabet with two "letters", 0 and 1 ::

:: and think of the stream of bits as a succession of 20-letter ::

:: "words", overlapping. Thus the first word is b1b2...b20, the ::

:: second is b2b3...b21, and so on. The bitstream test counts ::

:: the number of missing 20-letter (20-bit) words in a string of ::

:: 2^21 overlapping 20-letter words. There are 2^20 possible 20 ::

:: letter words. For a truly random string of 2^21+19 bits, the ::

:: number of missing words j should be (very close to) normally ::

:: distributed with mean 141,909 and sigma 428. Thus ::

:: (j-141909)/428 should be a standard normal variate (z score) ::

:: that leads to a uniform [0,1) p value. The test is repeated ::

:: twenty times. ::

:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

THE OVERLAPPING 20-tuples BITSTREAM TEST, 20 BITS PER WORD, N words

This test uses N=2^21 and samples the bitstream 20 times.

No. missing words should average 141909. with sigma=428.

---------------------------------------------------------

tst no 1: 141454 missing words, -1.06 sigmas from mean, p-value= .14370

tst no 2: 141816 missing words, -.22 sigmas from mean, p-value= .41369

tst no 3: 142364 missing words, 1.06 sigmas from mean, p-value= .85595

tst no 4: 141669 missing words, -.56 sigmas from mean, p-value= .28722

tst no 5: 142302 missing words, .92 sigmas from mean, p-value= .82055

tst no 6: 141718 missing words, -.45 sigmas from mean, p-value= .32743

tst no 7: 141826 missing words, -.19 sigmas from mean, p-value= .42282

tst no 8: 141701 missing words, -.49 sigmas from mean, p-value= .31322

tst no 9: 142109 missing words, .47 sigmas from mean, p-value= .67958

tst no 10: 142054 missing words, .34 sigmas from mean, p-value= .63233

tst no 11: 141461 missing words, -1.05 sigmas from mean, p-value= .14744

tst no 12: 141885 missing words, -.06 sigmas from mean, p-value= .47734

tst no 13: 141640 missing words, -.63 sigmas from mean, p-value= .26459

tst no 14: 142022 missing words, .26 sigmas from mean, p-value= .60382

tst no 15: 142098 missing words, .44 sigmas from mean, p-value= .67033

tst no 16: 142193 missing words, .66 sigmas from mean, p-value= .74627

tst no 17: 141872 missing words, -.09 sigmas from mean, p-value= .46525

tst no 18: 141940 missing words, .07 sigmas from mean, p-value= .52857

tst no 19: 141417 missing words, -1.15 sigmas from mean, p-value= .12501

tst no 20: 142281 missing words, .87 sigmas from mean, p-value= .80741
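
The "sigmas from mean" and p-value columns can be roughly reproduced from the missing-word counts. The sketch below uses the mean of 141909 and sigma of 428 quoted in the test description, and assumes the p-value is the standard normal cumulative probability of the z score. The function names are invented for illustration. Diehard may carry a few more digits internally, so agreement with the report is only expected to within about 0.001.

```python
import math

MEAN_MISSING = 141909.0  # mean number of missing words, from the test description
SIGMA_MISSING = 428.0    # sigma, from the test description


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bitstream_p_value(missing_words):
    """Return (z score, p-value) for one bitstream run."""
    z = (missing_words - MEAN_MISSING) / SIGMA_MISSING
    return z, normal_cdf(z)


# Missing-word counts from tst no 1, 3 and 12 of the ran_05 report.
for count in (141454, 142364, 141885):
    z, p = bitstream_p_value(count)
    print(f"{count} missing words: {z:+.2f} sigmas, p-value about {p:.4f}")
```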

__________________________________________________

**Here is an excerpt from ran_06.txt :**

:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

:: This is the COUNT-THE-1's TEST for specific bytes. ::

:: Consider the file under test as a stream of 32-bit integers. ::

:: From each integer, a specific byte is chosen , say the left- ::

:: most:: bits 1 to 8. Each byte can contain from 0 to 8 1's, ::

:: with probabilitie 1,8,28,56,70,56,28,8,1 over 256. Now let ::

:: the specified bytes from successive integers provide a string ::

:: of (overlapping) 5-letter words, each "letter" taking values ::

:: A,B,C,D,E. The letters are determined by the number of 1's, ::

:: in that byte:: 0,1,or 2 ---> A, 3 ---> B, 4 ---> C, 5 ---> D,::

:: and 6,7 or 8 ---> E. Thus we have a monkey at a typewriter ::

:: hitting five keys with with various probabilities:: 37,56,70,::

:: 56,37 over 256. There are 5^5 possible 5-letter words, and ::

:: from a string of 256,000 (overlapping) 5-letter words, counts ::

:: are made on the frequencies for each word. The quadratic form ::

:: in the weak inverse of the covariance matrix of the cell ::

:: counts provides a chisquare test:: Q5-Q4, the difference of ::

:: the naive Pearson sums of (OBS-EXP)^2/EXP on counts for 5- ::

:: and 4-letter cell counts. ::

:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

Chi-square with 5^5-5^4=2500 d.of f. for sample size: 256000

chisquare equiv normal p value

Results for COUNT-THE-1's in specified bytes:

bits 1 to 8 2571.60 1.013 .844362

bits 2 to 9 2531.62 .447 .672648

bits 3 to 10 2473.61 -.373 .354510

bits 4 to 11 2516.43 .232 .591855

bits 5 to 12 2471.80 -.399 .345004

bits 6 to 13 2389.82 -1.558 .059594

bits 7 to 14 2491.06 -.126 .449688

bits 8 to 15 2411.62 -1.250 .105660

bits 9 to 16 2460.97 -.552 .290495

bits 10 to 17 2459.58 -.572 .283796

bits 11 to 18 2486.62 -.189 .424977

bits 12 to 19 2440.71 -.839 .200872

bits 13 to 20 2506.74 .095 .537967

bits 14 to 21 2383.18 -1.652 .049261

bits 15 to 22 2561.02 .863 .805936

bits 16 to 23 2576.98 1.089 .861840

bits 17 to 24 2496.84 -.045 .482175

bits 18 to 25 2449.56 -.713 .237812

bits 19 to 26 2372.72 -1.800 .035927

bits 20 to 27 2507.49 .106 .542157

bits 21 to 28 2582.16 1.162 .877359

bits 22 to 29 2422.61 -1.095 .136866

bits 23 to 30 2646.38 2.070 .980776

bits 24 to 31 2493.85 -.087 .465361

bits 25 to 32 2509.92 .140 .555770
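
Two pieces of arithmetic in this test can be sketched in Python. The first rebuilds the letter weights 37, 56, 70, 56, 37 from the binomial counts 1, 8, 28, 56, 70, 56, 28, 8, 1. The second converts a chi-square value into the "equiv normal" column, assuming that column is (chisquare - 2500) / sqrt(2 * 2500) and that the p-value is the normal cumulative probability of that z. This assumption matches the printed rows, but it is inferred from the numbers, not taken from the Diehard source code. The names are illustrative.

```python
import math

# Number of 8-bit bytes containing k ones, for k = 0..8.
ones_counts = [math.comb(8, k) for k in range(9)]  # 1, 8, 28, 56, 70, 56, 28, 8, 1

# Group the counts into the five letters described in the test text.
letter_weights = {
    "A": sum(ones_counts[0:3]),  # 0, 1 or 2 ones
    "B": ones_counts[3],         # 3 ones
    "C": ones_counts[4],         # 4 ones
    "D": ones_counts[5],         # 5 ones
    "E": sum(ones_counts[6:9]),  # 6, 7 or 8 ones
}

DOF = 5**5 - 5**4  # 2500 degrees of freedom


def equivalent_normal(chisq, dof=DOF):
    """Return (z, p) using the normal approximation to a chi-square variate."""
    z = (chisq - dof) / math.sqrt(2.0 * dof)
    p = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return z, p


print(letter_weights)
z, p = equivalent_normal(2571.60)  # the "bits 1 to 8" row
print(f"z = {z:.3f}, p-value about {p:.4f}")
```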

____________________________________________________

Discussion of the Diehard results for files numbered 4, 5, and 6.

The text file of a book (ran_04.dat) had p values of 1.000 and 0.000, so the randomness tests easily identify the non-randomness.

The encrypted text file (ran_05.dat) has p-values typically like 0.28331, so it looks random to Diehard.

The unencrypted random numbers file (ran_06.dat) generated by OpenSSL has p-values like .10640, so it may be less random than the encrypted file.

It is interesting to compare the results of Diehard for files 5 and 6, the encrypted file and the random number file. For the birthday test, here are two excerpts:

**ran_05.dat** using bits 1 to 24 1.944

duplicate number number

spacings observed expected

0 69. 67.668

1 139. 135.335

2 130. 135.335

3 96. 90.224

4 49. 45.112

5 13. 18.045

6 to INF 4. 8.282

Chisquare with 6 d.o.f. = 4.66 p-value= .412542

**ran_06.dat** using bits 1 to 24 2.032

duplicate number number

spacings observed expected

0 71. 67.668

1 132. 135.335

2 122. 135.335

3 92. 90.224

4 59. 45.112

5 17. 18.045

6 to INF 7. 8.282

Chisquare with 6 d.o.f. = 6.13 p-value= .591194
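
The chi-square values and p-values in the birthday excerpts can be approximately recomputed from the observed and expected columns. The sketch below assumes Diehard reports the chi-square cumulative probability as its p-value, which is consistent with the printed numbers. It uses the closed form of the chi-square distribution that holds for an even number of degrees of freedom, so no statistics library is needed. The expected counts are the rounded values printed in the report, so small differences in the last digits are to be expected. The function names are illustrative.

```python
import math

# Expected counts printed in the birthday spacings tables (rounded by Diehard).
EXPECTED = [67.668, 135.335, 135.335, 90.224, 45.112, 18.045, 8.282]


def chi_square(observed, expected=EXPECTED):
    """Pearson chi-square statistic: sum of (o - e)^2 / e over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi2_cdf_even_dof(x, dof):
    """Chi-square CDF for even dof: 1 - exp(-x/2) * sum_{i < dof/2} (x/2)^i / i!"""
    if dof % 2 != 0:
        raise ValueError("this closed form only holds for even degrees of freedom")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, dof // 2):
        term *= half / i
        total += term
    return 1.0 - math.exp(-half) * total


observed = {
    "ran_04.dat": [500, 0, 0, 0, 0, 0, 0],
    "ran_05.dat": [69, 139, 130, 96, 49, 13, 4],
    "ran_06.dat": [71, 132, 122, 92, 59, 17, 7],
}

for name, counts in observed.items():
    x = chi_square(counts)
    print(f"{name}: chisquare = {x:.2f}, p-value about {chi2_cdf_even_dof(x, 6):.4f}")
```

For the text file, every one of the 500 samples falls in the first cell, so the statistic is enormous and the p-value comes out as 1 to the precision available.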

_________________________________________

July 18, 2010

A significant difference seems to exist for the Diehard DNA test when comparing the encrypted file and the random file (ran_05.dat versus ran_06.dat). The p-values were tabulated into the following histogram:

p ran05 ran06

.0 1 5

.1 1 2

.2 1 2

.3 2 2

.4 3 2

.5 8 4

.6 3 1

.7 3 0

.8 5 6

.9 1 7

The first column is the approximate p-value. The second and third columns give the number of times that p-value occurs in the Diehard DNA table of results for ran_05.dat and ran_06.dat. Notice how the right column has many occurrences near p=0.9 and near p=0.0. That trend toward p-values of 0.0 and 1.0 makes the OpenSSL random numbers seem to have worse randomness qualities than the AES-128-cbc ciphertext.

________________________________________________________

July 19, 2010

Diehard summaries 7/19/2010

BIRTHDAY SPACINGS TEST

p-values-- 0.0 .1 .2 .3 .4 .5 .6 .7 .8 .9 approx. p-value

ran_05.dat 2 1 2 0 1 0 0 1 1 1 occurrences

ran_06.dat 1 1 0 1 0 2 0 2 1 1 occurrences

OVERLAPPING 5-PERMUTATION TEST

p-values-- 0.0 .1 .2 .3 .4 .5 .6 .7 .8 .9

ran_05.dat 0 0 1 0 0 0 0 1 0 0

ran_06.dat 0 1 1 0 0 0 0 0 0 0

BINARY RANK TEST for 31x31 matrices

p-values-- 0.0 .1 .2 .3 .4 .5 .6 .7 .8 .9

ran_05.dat 0 0 0 0 0 0 0 1 0 0

ran_06.dat 0 0 0 0 1 0 0 0 0 0

BINARY RANK TEST for 6x8 matrices.

p-values-- 0.0 .1 .2 .3 .4 .5 .6 .7 .8 .9

ran_05.dat 2 2 1 7 2 4 1 2 2 1

ran_06.dat 3 3 6 1 4 0 3 2 3 0

OVERLAPPING 20-tuples BITSTREAM TEST

p-values-- 0.0 .1 .2 .3 .4 .5 .6 .7 .8 .9

ran_05.dat 0 3 2 2 4 1 4 1 3 0

ran_06.dat 1 2 0 0 3 3 2 1 3 2

OPSO test

p-values-- 0.0 .1 .2 .3 .4 .5 .6 .7 .8 .9

ran_05.dat 1 1 1 2 2 4 3 5 2 1

ran_06.dat 2 4 0 2 3 2 4 1 3 1

OQSO test

p-values-- 0.0 .1 .2 .3 .4 .5 .6 .7 .8 .9

ran_05.dat 2 2 6 2 2 1 3 3 3 2

ran_06.dat 4 3 4 3 6 1 2 1 0 3

COUNT-THE-1's in specified bytes

p-values-- 0.0 .1 .2 .3 .4 .5 .6 .7 .8 .9

ran_05.dat 2 4 4 3 1 2 0 3 3 3

ran_06.dat 3 2 4 2 4 4 1 0 4 1

CDPARK

p-values-- 0.0 .1 .2 .3 .4 .5 .6 .7 .8 .9

ran_05.dat 3 2 1 0 0 0 0 1 2 1

ran_06.dat 0 2 1 2 1 0 2 0 1 1

3DSPHERES test

p-values-- 0.0 .1 .2 .3 .4 .5 .6 .7 .8 .9

ran_05.dat 1 2 2 3 1 3 1 2 1 4

ran_06.dat 2 0 2 4 1 4 1 1 1 4

OVERLAPPING SUMS test

p-values-- 0.0 .1 .2 .3 .4 .5 .6 .7 .8 .9

ran_05.dat 1 0 1 0 1 1 3 1 2 0

ran_06.dat 3 1 1 0 0 1 0 1 1 1

DNA test

p-value--- .0 .1 .2 .3 .4 .5 .6 .7 .8 .9 p-value

ran_05.txt 1 1 1 3 3 8 3 5 5 1 occurrences

ran_06.txt 5 2 2 2 2 4 1 0 6 7 occurrences

Summary of Summaries of p-values

ran_05.dat less than 0.1 has 15 occurrences

ran_05.dat more than 0.9 has 14 occurrences

ran_06.dat less than 0.1 has 24 occurrences

ran_06.dat more than 0.9 has 21 occurrences

It seems that the OpenSSL random number generator is more biased than the OpenSSL AES-128-cbc ciphertext. The OpenSSL version is 0.9.8o.
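
The four tallies in the "Summary of Summaries" can be re-derived from the per-test histograms listed above. The sketch below copies each row as printed and sums the first bin (p below 0.1) and the last bin (p of 0.9 or more). It also prints how many p-values each file has in total and what one tenth of that is, since uniformly spread p-values would put about a tenth of them in each bin. The dictionary layout and names are illustrative only.

```python
# Histogram rows copied from the summaries: test -> (ran_05 counts, ran_06 counts).
HISTOGRAMS = {
    "BIRTHDAY":     ([2, 1, 2, 0, 1, 0, 0, 1, 1, 1], [1, 1, 0, 1, 0, 2, 0, 2, 1, 1]),
    "OPERM5":       ([0, 0, 1, 0, 0, 0, 0, 1, 0, 0], [0, 1, 1, 0, 0, 0, 0, 0, 0, 0]),
    "RANK 31x31":   ([0, 0, 0, 0, 0, 0, 0, 1, 0, 0], [0, 0, 0, 0, 1, 0, 0, 0, 0, 0]),
    "RANK 6x8":     ([2, 2, 1, 7, 2, 4, 1, 2, 2, 1], [3, 3, 6, 1, 4, 0, 3, 2, 3, 0]),
    "BITSTREAM":    ([0, 3, 2, 2, 4, 1, 4, 1, 3, 0], [1, 2, 0, 0, 3, 3, 2, 1, 3, 2]),
    "OPSO":         ([1, 1, 1, 2, 2, 4, 3, 5, 2, 1], [2, 4, 0, 2, 3, 2, 4, 1, 3, 1]),
    "OQSO":         ([2, 2, 6, 2, 2, 1, 3, 3, 3, 2], [4, 3, 4, 3, 6, 1, 2, 1, 0, 3]),
    "COUNT-THE-1s": ([2, 4, 4, 3, 1, 2, 0, 3, 3, 3], [3, 2, 4, 2, 4, 4, 1, 0, 4, 1]),
    "CDPARK":       ([3, 2, 1, 0, 0, 0, 0, 1, 2, 1], [0, 2, 1, 2, 1, 0, 2, 0, 1, 1]),
    "3DSPHERES":    ([1, 2, 2, 3, 1, 3, 1, 2, 1, 4], [2, 0, 2, 4, 1, 4, 1, 1, 1, 4]),
    "OSUM":         ([1, 0, 1, 0, 1, 1, 3, 1, 2, 0], [3, 1, 1, 0, 0, 1, 0, 1, 1, 1]),
    "DNA":          ([1, 1, 1, 3, 3, 8, 3, 5, 5, 1], [5, 2, 2, 2, 2, 4, 1, 0, 6, 7]),
}


def tail_counts(file_index):
    """Return (low-bin total, high-bin total, grand total) for one file."""
    rows = [pair[file_index] for pair in HISTOGRAMS.values()]
    low = sum(row[0] for row in rows)    # p-values below 0.1
    high = sum(row[-1] for row in rows)  # p-values of 0.9 or more
    total = sum(sum(row) for row in rows)
    return low, high, total


for index, name in enumerate(("ran_05.dat", "ran_06.dat")):
    low, high, total = tail_counts(index)
    print(f"{name}: below 0.1 = {low}, above 0.9 = {high}, "
          f"all p-values = {total}, one tenth = {total / 10:.1f}")
```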

_________________________________________

**Conclusion, August 8, 2010**

Mistakes were made when I (the author and publisher) added up the numbers. I was wrong about seeing a difference in the quality of the random numbers from the two sources: ciphertext and a random number generator. This second look at the random numbers was more careful than the first look. The result is **a retraction** of the guess that one file was less random than another file.

A Perl program was used to make summaries of Diehard reports. It is called merit_47.pl and it is free to download. Merit_47.pl produces two output files. One is a bare table of p-values for 17 Diehard statistical tests, suitable for import into a spreadsheet. The other is a commented table of p-values together with histograms of the number of occurrences of p-values for each test. A batch file called eight.bat is also online so that sixteen files of Diehard results can be processed automatically by merit_47.pl.
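
The kind of summary described above can be sketched in Python. This is not merit_47.pl and does not reproduce its output format. It is a minimal illustration, with made-up function names, of pulling labelled p-values out of Diehard report text and counting them in ten bins. It only recognises p-values written with a "p-value=" label, so tables that list p-values in an unlabelled column (such as the COUNT-THE-1's rows) would need extra handling.

```python
import re

# Matches labelled p-values such as "p-value= .14370" or "p-value=1.000000".
P_VALUE = re.compile(r"p-value\s*=\s*([01]?\.\d+)")


def extract_p_values(report_text):
    """Return every labelled p-value found in the report text, in order."""
    return [float(match) for match in P_VALUE.findall(report_text)]


def histogram(p_values, bins=10):
    """Count p-values per bin; a p-value of exactly 1.0 goes in the last bin."""
    counts = [0] * bins
    for p in p_values:
        counts[min(int(p * bins), bins - 1)] += 1
    return counts


# Four lines copied from the bitstream section of the ran_05 report.
SAMPLE = """\
tst no 1: 141454 missing words, -1.06 sigmas from mean, p-value= .14370
tst no 2: 141816 missing words, -.22 sigmas from mean, p-value= .41369
tst no 3: 142364 missing words, 1.06 sigmas from mean, p-value= .85595
tst no 4: 141669 missing words, -.56 sigmas from mean, p-value= .28722
"""

p_values = extract_p_values(SAMPLE)
print(p_values)
print(histogram(p_values))
```

Run over all twenty bitstream lines for ran_05, the same binning gives the row shown in the July 19 summary for that test.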