Wednesday, January 26, 2011

Since I've Got The Time

Genetic search algorithms and hill-climbers are all very serious business, so I thought I'd try out a couple of educated guesses on the transposition key for the D'Agapeyeff Cipher for fun.  Why?  Why not?  I've definitely got the time.  Sure, it's a 1-in-87-billion guess (there are 14! possible column orders), but it's EDUCATED! :)

I started with a few of my usual assumptions:

1.  Strip out the odd digits from the ciphertext (the first digit of each pair: all the 6, 7, 8, 9, and 0 digits)
2.  Write the ciphertext vertically into 14 columns (14x14 grid)
3.  Assume that "04" near the middle of the ciphertext marks the last character in the cryptogram
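
Steps 1 and 2 can be sketched roughly as below.  This is only an illustration: it assumes the "odd digits" are the first digit of each two-digit pair, the name `strip_and_grid` is made up, and the input is a short invented digit string rather than the real ciphertext.

```python
def strip_and_grid(digits, height):
    """Drop the first digit of every pair, then write the rest down columns."""
    kept = digits[1::2]  # second digit of each pair (positions 2, 4, 6, ...)
    if len(kept) % height:
        raise ValueError("digit count does not fill the grid evenly")
    # Each consecutive chunk of `height` digits becomes one column, top to bottom.
    columns = [kept[i:i + height] for i in range(0, len(kept), height)]
    # Transpose the columns into rows so the grid can be printed line by line.
    return ["".join(col[r] for col in columns) for r in range(height)]


# Toy input (NOT the real ciphertext): six pairs written into a 3-row grid.
toy = "758291627485"
for row in strip_and_grid(toy, 3):
    print(" ".join(row))
```

With the real ciphertext the same call would use `height=14` to get the 14x14 grid.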

That leaves us looking for a 14 character transposition key, where the last column of the original matrix is what I like to call C7 (Column 7 after transposition).

What do we know about Alexander D'Agapeyeff?  We know he was a cartographer, born in Russia, and lived (and died) in England.  We also know that he showed us an example of substitution + fractionated transposition in his Codes and Ciphers book:

A cartographer living in the UK you say?  I believe it.  Mr. D'Agapeyeff used MANCHEST(E)R as the keyword in his example.  As you can see, he removed the 2nd instance of "E" to give a resulting 9 columns for the transposition key width.

Let's assume for a moment that he's playing by the same rules with his challenge cipher.

We're looking for a key phrase with 14 unique letters.  Not exactly a simple task.

NOT EXACTLY A SIMPLE TASK.  There are 21 characters in that sentence, but only 14 unique.  But let's revisit the mind of a cartographer.  A man who made maps for a living.  No doubt his mind is littered with names of places.  Towns, villages, cities, countries, etc.

Let's also revisit one of the first assumptions:

"That leaves us looking for a 14 character transposition key, where the last column of the original matrix is what I like to call C7 (Column 7 after transposition)."

Okay, so we need a key phrase where the last unique character is somewhere near the middle of the alphabet.  Why is that?

Remember that once a key has been alphabetized for the transposition, essentially what you have is:

MANCHESTR to ACEHMNRST

If we "know" (and I use that term loosely, remember this is just for fun) that the last letter of the keyword alphabetizes to the 7th position, there need to be 6 key letters preceding it and 7 following it in the alphabet.

ABCDEF     GHIJKLMNOPQRS     TUVWXYZ

We can rule out A-F being the last unique letter in the key phrase, as well as T-Z.  That leaves us with G-S as the possibilities.  But let's be realistic.  I highly doubt that all of ABCDEF or all of TUVWXYZ are used in the key phrase.  We are likely looking for a key phrase where the last unique letter is one of IJKLMNOPQ.
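
Here is a small sketch of that bookkeeping, using the MANCHESTER example from the book.  The function names `unique_key` and `last_letter_rank` are made up for illustration, and the code assumes repeated letters are simply dropped after their first appearance.

```python
def unique_key(phrase):
    """Keep the first occurrence of each letter: MANCHEST(E)R -> MANCHESTR."""
    seen = []
    for ch in phrase.upper():
        if ch.isalpha() and ch not in seen:
            seen.append(ch)
    return "".join(seen)


def last_letter_rank(phrase):
    """1-based position of the key's last unique letter once the key is alphabetized."""
    key = unique_key(phrase)
    return sorted(key).index(key[-1]) + 1


key = unique_key("MANCHESTER")
print(key, "to", "".join(sorted(key)))   # MANCHESTR to ACEHMNRST
print(last_letter_rank("MANCHESTER"))    # 7 (R is the 7th of the 9 sorted letters)
```

For the challenge cipher we would want a phrase where `len(unique_key(phrase))` is 14 and `last_letter_rank(phrase)` is 7.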

So back to D'Agapeyeff's keyword of choice in his book.  MANCHEST(E)R.

MANCHESTER UNITED KINGDOM.  15 unique characters, one too many.  Probably not it (I tried).

But hey, there are a ton of cities in the UK for a map maker to draw from, and there are 10 unique letters just in "UNITED K(IN)G(D)OM".  As an added bonus, O and M are the last unique letters, provided they don't already show up in the city name you choose.  Take for instance Liverpool.

LIVERPOOL, UNITED KINGDOM

LIVERPO(OL)UN(I)T(E)DK(IN)G(DO)M.  14 unique characters.  When alphabetized:

DEGIKLMNOPRTUV.  M is the last unique character and falls into the 7th spot when reordered for transposition.  Beautiful.  Cipher solved!  Re-arrange the columns, pair up the digits to form the 2-digit numbers from the Polybius square used for substitution, plug them into my mono-alphabetic substitution solver, and . . . . nope, garbage.  Coventry fit the bill as well, but no dice there either.
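
The screening I did by hand can be sketched as a quick filter.  This is a rough sketch, not my actual spreadsheet: `unique_key` and `fits` are made-up names, and BRISTOL is thrown in purely as an extra example of a city that fails the test.

```python
def unique_key(phrase):
    """Keep the first occurrence of each letter, ignoring spaces and punctuation."""
    seen = []
    for ch in phrase.upper():
        if ch.isalpha() and ch not in seen:
            seen.append(ch)
    return "".join(seen)


def fits(phrase, width=14, target_rank=7):
    """True if the phrase has `width` unique letters and the last one sorts to `target_rank`."""
    key = unique_key(phrase)
    rank = sorted(key).index(key[-1]) + 1
    return len(key) == width and rank == target_rank


SUFFIX = "UNITED KINGDOM"
for city in ["MANCHESTER", "LIVERPOOL", "COVENTRY", "BRISTOL"]:
    phrase = f"{city} {SUFFIX}"
    key = unique_key(phrase)
    print(f"{phrase:26} {key:16} {len(key):2} {fits(phrase)}")
```

LIVERPOOL and COVENTRY pass; MANCHESTER has 15 unique letters, and BRISTOL has 14 but its M sorts to the 8th spot.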

But hey, it wasn't hard to try with a spreadsheet set up to do all the legwork.  Realistically, I checked the phi-test value (380 something) and it told me it wasn't English.

Tuesday, January 11, 2011

D'Agapeyeff Cipher: Searching For A Solution

It's been almost two months since my last post on the D'Agapeyeff Cipher, and I guess you could say that I've moved from "interested in this cipher" to "actively looking for a solution". As far as I can tell, there aren't many people in this category, but I would like to discuss this cipher with other interested parties, so please feel free to contact me.

In my last post, from November 24, 2010, I attempted to identify the "row-columns" and "column-columns" that exist from using a Polybius square for the substitution phase of the encryption.

I did do some due diligence and checked the frequency counts for each digit of a 7x28 matrix, but saw no compelling evidence to suggest this shape was used.

My focus then remains on the 14x14 transposition matrix. Once the columns are paired (un-fractionated?) there will be a 7x14 (98 character) simple substitution cipher remaining. The trick is first solving the transposition encryption independently of the substitution.

Tiago Rodrigues does a great job summarizing this strategy on his D'Agapeyeff.com website.


Back to the 14 columns, and the transposition space:

C1  C2  C3  C4  C5  C6  C7  C8  C9  C10 C11 C12 C13 C14
5   4   4   3   5   4   3   4   2   5   5   3   2   2
2   4   5   5   4   5   5   1   2   3   2   3   3   4
2   4   3   5   3   5   2   1   5   2   4   5   2   2
5   2   2   1   2   3   1   1   1   4   4   2   2   2
1   4   4   3   5   2   4   4   5   4   4   3   2   3
2   3   1   1   4   2   3   3   1   1   1   4   2   5
1   1   5   5   3   2   3   5   4   2   3   2   3   1
4   3   4   5   1   3   2   4   1   4   5   2   2   4
1   1   5   5   5   2   5   5   2   2   4   2   5   5
4   1   3   5   1   1   1   4   5   2   1   5   4   4
1   4   5   4   4   1   3   5   4   4   1   2   5   5
4   4   5   2   5   2   3   5   5   1   5   3   2   1
5   2   5   2   4   1   3   2   5   1   2   5   1   2
4   2   3   2   5   4   4   4   1   3   4   2   3   2


I surmised that columns C3, C7, C12, and C13 were likely to be "row-columns" while C1, C8, C11, and C14 were likely to be "column-columns" based on differing frequency counts for each of the 5 ciphertext characters.  The remaining six columns were sorted into row-columns or column-columns based on their frequency distribution.
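
For anyone who wants to reproduce the per-column counts, here is a minimal sketch that tallies how often each digit 1-5 appears in each column of the grid above.  It only produces the counts; deciding what makes a column look like a "row-column" or a "column-column" is still a judgment call, and the helper name `digit_counts` is made up.

```python
from collections import Counter

GRID = """\
5 4 4 3 5 4 3 4 2 5 5 3 2 2
2 4 5 5 4 5 5 1 2 3 2 3 3 4
2 4 3 5 3 5 2 1 5 2 4 5 2 2
5 2 2 1 2 3 1 1 1 4 4 2 2 2
1 4 4 3 5 2 4 4 5 4 4 3 2 3
2 3 1 1 4 2 3 3 1 1 1 4 2 5
1 1 5 5 3 2 3 5 4 2 3 2 3 1
4 3 4 5 1 3 2 4 1 4 5 2 2 4
1 1 5 5 5 2 5 5 2 2 4 2 5 5
4 1 3 5 1 1 1 4 5 2 1 5 4 4
1 4 5 4 4 1 3 5 4 4 1 2 5 5
4 4 5 2 5 2 3 5 5 1 5 3 2 1
5 2 5 2 4 1 3 2 5 1 2 5 1 2
4 2 3 2 5 4 4 4 1 3 4 2 3 2
"""

rows = [line.split() for line in GRID.strip().splitlines()]
columns = list(zip(*rows))  # columns[0] is C1, columns[13] is C14


def digit_counts(column):
    """How many times each of the digits 1..5 occurs in one column."""
    counts = Counter(column)
    return [counts.get(d, 0) for d in "12345"]


print("     1  2  3  4  5")
for i, col in enumerate(columns, start=1):
    print(f"C{i:<3}", digit_counts(col))
```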


Another member of the D'Agapeyeff cipher Yahoo! forum continued with some analysis of his own based on my assumptions to date.  That analysis is located here.  The outcome is similar to my findings with the caveat that C4 and C10 don't seem to favor being a row or a column.

Additionally, this analysis output the largest Phi values possible.  The maximum value is 584, and it is attained when the pairs are:

C1 - C6
C2 - C12
C5 - C7
C8 - C14
C9 - C13
C10 - C4
C11 - C3

If you recall, 634 is the expected Phi value for English text where N=98.  584, along with the values for several other sets of column pairings, falls within an acceptable range; however, no value of Phi(o) exceeds 634.
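
The Phi figures can be checked with a short sketch like the one below.  It pairs the columns in the order listed above (first column of each pair supplies the first digit), counts the resulting 98 two-digit symbols, and sums f(f-1) over the counts.  The constants 0.0667 and 0.0385 are the usual textbook values for English and for random text over a 26-letter alphabet; they are assumed here rather than taken from the analysis itself.

```python
from collections import Counter

GRID = """\
5 4 4 3 5 4 3 4 2 5 5 3 2 2
2 4 5 5 4 5 5 1 2 3 2 3 3 4
2 4 3 5 3 5 2 1 5 2 4 5 2 2
5 2 2 1 2 3 1 1 1 4 4 2 2 2
1 4 4 3 5 2 4 4 5 4 4 3 2 3
2 3 1 1 4 2 3 3 1 1 1 4 2 5
1 1 5 5 3 2 3 5 4 2 3 2 3 1
4 3 4 5 1 3 2 4 1 4 5 2 2 4
1 1 5 5 5 2 5 5 2 2 4 2 5 5
4 1 3 5 1 1 1 4 5 2 1 5 4 4
1 4 5 4 4 1 3 5 4 4 1 2 5 5
4 4 5 2 5 2 3 5 5 1 5 3 2 1
5 2 5 2 4 1 3 2 5 1 2 5 1 2
4 2 3 2 5 4 4 4 1 3 4 2 3 2
"""

# Column pairings from the list above, 1-based, in the order given.
PAIRS = [(1, 6), (2, 12), (5, 7), (8, 14), (9, 13), (10, 4), (11, 3)]

rows = [line.split() for line in GRID.strip().splitlines()]
columns = list(zip(*rows))  # columns[0] is C1

# One two-digit symbol per (pair, row): 7 pairs x 14 rows = 98 symbols.
symbols = [columns[a - 1][r] + columns[b - 1][r]
           for a, b in PAIRS for r in range(len(rows))]


def phi(seq):
    """Observed Phi: sum of f * (f - 1) over the symbol frequencies."""
    return sum(f * (f - 1) for f in Counter(seq).values())


n = len(symbols)
phi_observed = phi(symbols)
phi_english = 0.0667 * n * (n - 1)  # expected for English plaintext
phi_random = 0.0385 * n * (n - 1)   # expected for random text
print(n, phi_observed, round(phi_english), round(phi_random))  # 98 584 634 366
```

Swapping the two columns inside a single pair changes which symbols that pair produces, so the order within each pair matters when comparing against 584.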

Assume for a moment that these are the correct column pairings.  What remains is a 7! transposition which, when in the correct order, will leave a very simple substitution cipher.
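
To put numbers on how much the pairing step would shrink the search, here is a tiny sketch comparing the orderings of all 14 single columns with the orderings of 7 already-paired columns.  The pair labels are just the ones listed above; nothing here tests whether any ordering is actually correct.

```python
import math
from itertools import permutations

full_space = math.factorial(14)   # orderings of 14 separate columns
paired_space = math.factorial(7)  # orderings of 7 column pairs
print(full_space)    # 87178291200
print(paired_space)  # 5040

# Enumerating every ordering of the seven pairs is trivial at this size.
pairs = ["C1-C6", "C2-C12", "C5-C7", "C8-C14", "C9-C13", "C10-C4", "C11-C3"]
orderings = list(permutations(pairs))
print(len(orderings))  # 5040
```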

Perhaps we're making headway against this as yet unsolved cipher.