Tuesday, November 23, 2010

Back At It: D'Agapeyeff Cipher

To recap my assumptions before we dive in:

1.  The D'Agapeyeff Cipher consists of the digits 1-5, derived from a Polybius square; the digits 6-9, which are nulls that disguise the ciphertext; and the digit zero, used as a termination point and as padding.
 "The cipher is of course easily made out, but if every third, fourth, or fifth letter, as may be previously arranged, is a dummy inserted after a message has been put into cipher, it is then extremely difficult to decipher unless you are in the secret."

2.  The D'Agapeyeff Cipher can be written into a 14 x 14 square matrix.

3.  Phi tests on the ciphertext, taken as pairs of digits running column-wise, row-wise, or pairing the digits before and after the zero near the middle, all fail for monoalphabetic substitution.  I therefore assume that columnar transposition is in play.

4.  Once the nulls are removed from the ciphertext, the remaining digits should be placed into columns of 14 digits:



5.  What remains is a substitution followed by fractionated transposition.
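
As a rough sketch of assumptions 1 through 4, the snippet below strips the presumed nulls and zeros, lays the remaining digits out in fixed-height columns, and computes a phi value on digit pairs.  The function names and the toy digit string are made up for illustration (it is not the real ciphertext), and the plaintext constant 0.0667 is the commonly quoted figure for English rather than anything derived here.

```python
from collections import Counter

def strip_nulls(ciphertext: str) -> str:
    """Keep only the digits 1-5; drop 6-9 (assumed nulls), 0, and spaces."""
    return "".join(ch for ch in ciphertext if ch in "12345")

def to_columns(digits: str, height: int = 14) -> list[str]:
    """Cut the digit stream into consecutive columns of `height` digits."""
    return [digits[i:i + height] for i in range(0, len(digits), height)]

def pairs(digits: str) -> list[str]:
    """Group a digit stream into two-digit Polybius coordinates."""
    return [digits[i:i + 2] for i in range(0, len(digits) - 1, 2)]

def phi(units: list[str], alphabet_size: int = 25) -> tuple[int, float, float]:
    """Return (observed phi, expected for random text, expected for English)."""
    n = len(units)
    observed = sum(f * (f - 1) for f in Counter(units).values())
    expected_random = n * (n - 1) / alphabet_size
    expected_plain = 0.0667 * n * (n - 1)  # assumed English constant
    return observed, expected_random, expected_plain

# Toy input for illustration only, NOT the D'Agapeyeff ciphertext.
toy = "1623945571283400"
digits = strip_nulls(toy)
columns = to_columns(digits, height=5)
observed, expected_random, expected_plain = phi(pairs(digits))
print(digits, columns, observed, expected_random, expected_plain)
```

An observed phi close to the random expectation argues against simple substitution on that pairing; one close to the plaintext expectation argues for it.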


Those are the assumptions, and now for a little preliminary analysis of the digits to set the table:

Count of 1s     33
Count of 2s     46
Count of 3s     29
Count of 4s     43
Count of 5s     45
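
A quick arithmetic check on these counts, sketched in Python: the five tallies should add up to a full 14 x 14 square if assumption 2 holds, and comparing each count to a flat average shows how far 3 lags behind.  The variable names are my own; the counts are the ones listed above.

```python
# Digit counts as tallied above.
counts = {"1": 33, "2": 46, "3": 29, "4": 43, "5": 45}

total = sum(counts.values())          # all digits left after removing nulls
side = 14                             # assumed side of the square matrix
fits_square = total == side * side    # True if the digits fill the square
pair_count = total // 2               # two digits per Polybius coordinate

# Difference between each count and a perfectly flat distribution.
uniform = total / len(counts)
deviation = {d: round(c - uniform, 1) for d, c in counts.items()}

least = min(counts, key=counts.get)   # the rarest digit

print(total, fits_square, pair_count, least)
print(deviation)
```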

One more assumption we'll need to make is that the Polybius square was keyed with a keyword and that the most infrequently used letters fell into a row together (example below):


In this case, "VWXYZ" fall into a row together, and their low frequencies in the English language mean two things for our impending analysis.  First, 5 would be the least represented number in the ciphertext in this example.  Second, before the columnar transposition occurred, 5 would have been more likely to appear in an even-numbered column than in an odd-numbered column.
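
To make the row-versus-column argument concrete, here is a small sketch that builds a keyed Polybius square and adds up approximate English letter frequencies by row and by column.  The keyword "CIPHER" is purely hypothetical (chosen only because it leaves VWXYZ in the last row), the I/J merge is an assumption, and the frequency table is a set of rough textbook percentages, so the results are indicative at best and will shift with a different keyword.

```python
import string

# Approximate English letter frequencies in percent (assumed values).
FREQ = {
    "A": 8.2, "B": 1.5, "C": 2.8, "D": 4.3, "E": 12.7, "F": 2.2, "G": 2.0,
    "H": 6.1, "I": 7.0, "J": 0.15, "K": 0.8, "L": 4.0, "M": 2.4, "N": 6.7,
    "O": 7.5, "P": 1.9, "Q": 0.1, "R": 6.0, "S": 6.3, "T": 9.1, "U": 2.8,
    "V": 1.0, "W": 2.4, "X": 0.15, "Y": 2.0, "Z": 0.07,
}

def keyed_square(keyword: str) -> list[list[str]]:
    """Build a 5 x 5 square: keyword letters first, then the rest (J -> I)."""
    seen: list[str] = []
    for ch in (keyword + string.ascii_uppercase).upper():
        ch = "I" if ch == "J" else ch
        if ch.isalpha() and ch not in seen:
            seen.append(ch)
    return [seen[i:i + 5] for i in range(0, 25, 5)]

def coordinate_weights(square: list[list[str]]):
    """Sum letter frequencies per row digit and per column digit (1-5)."""
    row_w = {r: 0.0 for r in range(1, 6)}
    col_w = {c: 0.0 for c in range(1, 6)}
    for r, row in enumerate(square, start=1):
        for c, letter in enumerate(row, start=1):
            weight = FREQ[letter] + (FREQ["J"] if letter == "I" else 0.0)
            row_w[r] += weight
            col_w[c] += weight
    return row_w, col_w

square = keyed_square("CIPHER")       # hypothetical keyword
row_w, col_w = coordinate_weights(square)
for row in square:
    print(" ".join(row))
print("as row digit:   ", {k: round(v, 1) for k, v in row_w.items()})
print("as column digit:", {k: round(v, 1) for k, v in col_w.items()})
```

With this keyword, the digit for the VWXYZ row carries far less weight as a first (row) coordinate than as a second (column) coordinate, which is the imbalance the column analysis relies on.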

Applying this information to our ciphertext, it would appear that the digit 3 represents the row coordinate of these infrequently used letters, as it appears only 29 times.  Based on the frequency count for each digit, we may be looking at a Polybius square that looks something like this:


It's helpful to set aside the usual numerical order of these digits and treat them simply as labels.

Let's take a look again at our ciphertext.  I've added column headers for referencing the columns and shaded all instances of 3.



Since it's likely that 3 was used more often as the second number in a pair than as the first, we should be able to identify which columns were likely odd-numbered columns in the original matrix (the first number in a pair) and which were likely even-numbered columns (the second number in a pair).


There are four columns that contain more instances of 3 than most:  C7, C12, C3, and C13.  Interestingly enough, the column containing the digit 4 that was previously attached to zero (04) seems to slot in as an even-numbered column from the original matrix.  My hypothesis is that "04" was used to signify the last column in the original matrix.

There are five columns that contain zero or one instances of 3.  These columns are the most likely to be odd-numbered columns from the original matrix.
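
The column sorting described above can be sketched in a few lines.  The function names, the threshold of one, and the four toy columns are assumptions for illustration only; the real input would be the 14 columns of the null-free ciphertext.

```python
def count_digit_by_column(columns: list[str], digit: str = "3") -> dict[str, int]:
    """Count how often `digit` occurs in each column, labelled C1, C2, ..."""
    return {f"C{i}": col.count(digit) for i, col in enumerate(columns, start=1)}

def split_candidates(counts: dict[str, int], low: int = 1):
    """Return columns with at most `low` hits, plus all columns ranked by hits."""
    likely_first = [name for name, n in counts.items() if n <= low]
    ranked = sorted(counts, key=counts.get, reverse=True)
    return likely_first, ranked

# Toy columns for illustration only, NOT the real matrix.
toy_columns = ["1333", "1245", "3521", "2353"]
hits = count_digit_by_column(toy_columns)
likely_first, ranked = split_candidates(hits)

print(hits)           # occurrences of 3 per column
print(likely_first)   # candidates for odd-numbered (first-digit) columns
print(ranked)         # columns ordered from most 3s to fewest
```

Columns at the top of the ranking are the candidates for even-numbered (second-digit) positions, while those with zero or one hit are the candidates for odd-numbered positions.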

To be continued . . . .
