Monday, November 15, 2010

Is The D'Agapeyeff Cipher The ADFGX Cipher?

As a continuation of my previous post, I decided to break the D'Agapeyeff cipher into two smaller 7x7 squares.

To achieve this meant pairing up the digits in some fashion.  In one test, I used the "04" spot in the original ciphertext to mark the break point between the first 49 pairs and the second 49 pairs.

In the second test, I used the "04" as the stopping point for the first digit in the pairs and all the numbers after the "04" as the second digit.

As mentioned in that post, I did remove all the 6-9 digits and the zeros before pairing up the remaining 196 digits.

My assumption was that if the message was enciphered in this manner, the same Polybius square would have been used for both halves, as there would be no need to construct two squares.  When comparing the frequencies of the pairs of numbers, I didn't see enough in common between the two sets of 49 pairs to continue pursuing this method.

One example would be that in one group of 49 pairs the most frequent number was 22, but 22 wasn't represented at all in the other 49 pairs.
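A comparison like this can be sketched in a few lines of Python. The helper names and the two short digit strings below are made up for illustration only; they are not the actual halves of the D'Agapeyeff ciphertext.

```python
from collections import Counter

def pair_counts(digits):
    """Count non-overlapping two-digit pairs; a trailing odd digit is ignored."""
    return Counter(digits[i:i + 2] for i in range(0, len(digits) - 1, 2))

def shared_top_pairs(first, second, n=5):
    """Return the pairs that rank among the n most common in both halves."""
    top_first = {pair for pair, _ in pair_counts(first).most_common(n)}
    top_second = {pair for pair, _ in pair_counts(second).most_common(n)}
    return top_first & top_second

# Hypothetical toy halves, NOT the real ciphertext.
first_half = "2222223344"
second_half = "1133113355"

print(pair_counts(first_half).most_common(1))   # most frequent pair in half one
print(pair_counts(second_half)["22"])           # how often "22" appears in half two
print(shared_top_pairs(first_half, second_half))
```

If both halves had been enciphered with the same square, one would expect the sets of most common pairs to overlap heavily; a nearly empty intersection is the kind of mismatch described above.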

This brought me back to the drawing board, and I went back to review D'Agapeyeff's section on substitution plus transposition and my corresponding notes, which read:

I wonder if he understood how complex this concept is when introducing fractionated letters during the transposition step?

Look again at his example from the book:

This brings me back to one of my original hypotheses, although I was hoping it would not be this complex: a completely filled 14x14 square with fractionated substitution.  To put it very simply, we'd be looking at the ADFGX cipher or, as it became after the Polybius square was expanded to accommodate numeric characters, the ADFGVX cipher.  This cipher was employed by the German military during World War I.

If you've read Friedman's Military Cryptanalysis, Part IV, Transposition and Fractionating Systems, then you may recall there are a couple of different ways to solve the ADFGX cipher.  During the war they relied on high volumes of traffic and special circumstances such as messages with the same beginning, the same ending, or completely filled rectangles.  One thing remains constant across all those special-circumstance solutions: high volume.

We have only one cryptogram, but we do have a couple things working for us.

  1. There is a completely filled rectangle.  196 factors into 4x49, 7x28, or 14x14
  2. If we assume the 14x14 square was used, we have an even number of columns, which also reduces the cryptanalytic work.

Let's consider the steps.  The 98-character plaintext was converted to digits via a Polybius square, resulting in a pair of numbers for each letter.  Consider the plaintext:

The bridge is out on the southeast road.  Use the bridge on the road north of the city for attack.  Hold position two day more.

The intermediate text was then inscribed into a 14x14 square horizontally as shown in D'Agapeyeff's example above.  A keyword/phrase is inscribed in the column headers for transposition.


Transposition is then completed via alphabetizing the columns:


The final ciphertext can be written out by reading column-wise (vertically), starting with the first column.  Nulls may be inserted every other character to re-create the appearance of the D'Agapeyeff cipher.
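The steps above can be sketched in Python. The square layout (A-Z without J, filled row by row, so VWXYZ share the last row) and the 14-letter key CRYPTOGRAPHERS are assumptions chosen for illustration; they are not necessarily the ones used in the worked example, and the null-insertion step is left out.

```python
# Assumed 5x5 Polybius square: A-Z without J, filled row by row.
ALPHABET = "ABCDEFGHIKLMNOPQRSTUVWXYZ"
KEY = "CRYPTOGRAPHERS"  # hypothetical 14-letter transposition key

PLAINTEXT = ("The bridge is out on the southeast road. "
             "Use the bridge on the road north of the city for attack. "
             "Hold position two day more.")

def fractionate(text):
    """Replace each letter with its row digit and column digit (1-5)."""
    digits = []
    for ch in text.upper().replace("J", "I"):
        if ch in ALPHABET:  # spaces and punctuation are dropped
            row, col = divmod(ALPHABET.index(ch), 5)
            digits.append(str(row + 1))
            digits.append(str(col + 1))
    return "".join(digits)

def column_order(key):
    """Alphabetical column order; repeated letters keep left-to-right order."""
    return sorted(range(len(key)), key=lambda i: key[i])

def transpose(digits, key):
    """Inscribe row-wise under the key, then read columns alphabetically."""
    width = len(key)
    rows = [digits[i:i + width] for i in range(0, len(digits), width)]
    return "".join("".join(row[c] for row in rows) for c in column_order(key))

def untranspose(cipher, key):
    """Invert transpose(); assumes a completely filled rectangle."""
    width = len(key)
    height = len(cipher) // width
    columns = {}
    for n, c in enumerate(column_order(key)):
        columns[c] = cipher[n * height:(n + 1) * height]
    return "".join(columns[c][r] for r in range(height) for c in range(width))

intermediate = fractionate(PLAINTEXT)   # 98 letters become 196 digits
ciphertext = transpose(intermediate, KEY)
print(len(intermediate), len(ciphertext))
print(ciphertext)
```

The `untranspose` function is only there as a sanity check that the transposition is reversible when the key is known; without the key, the column order is exactly what an attacker has to recover.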

What does this mean in terms of attack strategies for the D'Agapeyeff cipher?  If the cipher was written on a 14x14 square, then 7 columns hold the row coordinates from the Polybius square and the other 7 hold the column coordinates.
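The reason an even width matters can be checked with a short sketch. It assumes the 196 digits were inscribed row by row, with every even position (0, 2, 4, ...) holding a row coordinate and every odd position a column coordinate; the function name `column_roles` is my own.

```python
def column_roles(width, total=196):
    """Label each column of a row-wise inscription as 'row', 'col', or 'mixed'."""
    roles = []
    for c in range(width):
        # Column c holds stream positions c, c + width, c + 2*width, ...
        parities = {p % 2 for p in range(c, total, width)}
        if parities == {0}:
            roles.append("row")      # only row coordinates
        elif parities == {1}:
            roles.append("col")      # only column coordinates
        else:
            roles.append("mixed")    # both kinds of coordinate
    return roles

for width in (4, 7, 14, 28, 49):
    roles = column_roles(width)
    print(width, roles.count("row"), roles.count("col"), roles.count("mixed"))
```

With an even width such as 14, every column is purely one kind of coordinate, split evenly between the two. With an odd width such as 7 or 49, every column mixes both kinds, which removes that foothold.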

Unfortunately, we don't have additional ciphers to draw upon, so if this is the method by which the cipher was constructed, it will be tough to crack.  Furthermore, if D'Agapeyeff's Polybius square didn't have the low-frequency letters "VWXYZ" together in one row, as in my example, the cipher quickly becomes far harder to attack.

I've done some initial calculations assuming a 14x14 transposition rectangle (square), and there is evidence to suggest that VWXYZ or some combination of low-frequency letters may have ended up in the third row, or that D'Agapeyeff may have changed the order of the row/column coordinates from the traditional 12345 to something like 34512.  I'll go into more detail after further testing.
