Cryptography: The Kryptos Sculpture Puzzle and Vigenère Ciphers

I was doing some reading on Kryptos, the famous sculpture outside CIA headquarters in Langley, Virginia. Erected in 1990, it takes the form of a scroll split into four sections, each of which is a separate puzzle. The first three puzzles have been solved and their solutions are public. The fourth has not, though my suspicion is that it has been solved internally. I mean really, you mean to tell me that professional CIA cryptographers walk past one of the most famous puzzles in the world every morning and no one has taken a crack at it?

Anyway, the first two of the first three puzzles are solved with a Vigenère cipher, one of the oldest and most widely used ciphers in the world. You may have heard of the classic “Caesar cipher”, where messages are encrypted by taking each letter in the message and moving it a set number of letters up the alphabet based on a key letter. For example, say our key is the letter “D”. D is now the start of our alphabet, and all of the letters in our message move down 3 letters compared to our old alphabet. In other words, A = D, B = E, C = F, and so on. In the word “cat”, “c” would become “f”, “a” would become “d”, and “t” would become “w”. In the message the word would then be “fdw”. It’s called a Caesar cipher because it was famously used by Julius Caesar and his generals. The “key” letter would change periodically to keep the code elusive.
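Here’s a minimal sketch of that Caesar shift in R. The function name `caesar` and its arguments are my own for illustration, and it assumes a plain A-Z alphabet with everything converted to upper case.

```r
# Caesar shift: key letter "D" means A -> D, i.e. a shift of 3 places.
caesar <- function(text, key_letter, decrypt = FALSE) {
  shift <- match(toupper(key_letter), LETTERS) - 1    # "D" -> 3
  if (decrypt) shift <- -shift
  chars <- strsplit(toupper(text), "")[[1]]
  pos <- match(chars, LETTERS)                         # NA for non-letters
  # Shift letters with wrap-around; leave anything else untouched
  out <- ifelse(is.na(pos), chars, LETTERS[(pos - 1 + shift) %% 26 + 1])
  paste(out, collapse = "")
}

caesar("cat", "D")                   # "FDW"
caesar("FDW", "D", decrypt = TRUE)   # "CAT"
```

The `%% 26` is what wraps the end of the alphabet back around to the start, so “x” with key “D” lands on “a”.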

A Vigenère cipher is basically a Caesar cipher on steroids. A Vigenère cipher takes a word, or really any number of words, as a key, and for each successive letter in the text it uses a Caesar cipher based on the letter in the corresponding position in the key. So let’s say our key is the word “dog”, and our secret message is “Be sure to drink your Ovaltine”. To encrypt “B”, we’d use an alphabet starting at “d” (the first letter in dog), for “e”, we’d use an alphabet starting at “o”, for “s”, we’d use an alphabet starting at “g”, and then back to “d” for “u”, and we’d repeat the process until we encrypt the whole thing: “ES YXFK WC JUWTN MUXF UYORWWTH.” Vigenère ciphers are useful because they resist simple frequency analysis. As a first step at cracking a code, cryptographers will often check to see which letters appear most frequently in a text. They can then compare those frequencies to the frequency profiles of various languages and see if there’s a match. Vigenères add a layer of complexity because any one letter can be substituted with several different letters, masking its true frequency. You can also add an extra layer of difficulty by changing the initial alphabet you’re using to encode.
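The “dog” example can be reproduced in a few lines. This is a sketch using modular arithmetic instead of lookup alphabets; the name `vigenere` is an illustrative choice, and it assumes a standard A-Z alphabet and drops everything that isn’t a letter.

```r
vigenere <- function(text, key, decrypt = FALSE) {
  # Keep letters only and work with 0-based alphabet positions (A = 0)
  to_pos <- function(s) utf8ToInt(gsub("[^A-Z]", "", toupper(s))) - utf8ToInt("A")
  p <- to_pos(text)
  k <- rep_len(to_pos(key), length(p))   # repeat the key along the message
  if (decrypt) k <- -k
  intToUtf8((p + k) %% 26 + utf8ToInt("A"))
}

cipher <- vigenere("Be sure to drink your Ovaltine", "dog")
cipher                                     # "ESYXFKWCJUWTNMUXFUYORWWTH"
vigenere(cipher, "dog", decrypt = TRUE)    # "BESURETODRINKYOUROVALTINE"

# The same plaintext letter lands on different ciphertext letters:
plain <- strsplit("BESURETODRINKYOUROVALTINE", "")[[1]]
strsplit(cipher, "")[[1]][plain == "E"]    # "S" "K" "H"
```

The last line is the frequency-masking effect in miniature: the three E’s in the message come out as three different letters, depending on which key letter they line up with.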

The creator of Kryptos, Jim Sanborn, has said that the solution to the fourth part has something to do with a type of clock in Berlin, that letters 64-69, “NYPVTT”, should resolve to “BERLIN”, and that letters 70-74, “MZFPK”, should decrypt to the word “CLOCK”. Further, Sanborn stated, “You’d better delve into that particular clock. There are several really interesting clocks in Berlin.” So what are we to do here? Two of the codes have been Vigenère ciphers, one has been a transposition cipher, and we need to use some aspect of German clocks to solve the fourth. I started to think about the Vigenère cipher option. There are any number of clock related words in English and German someone could feed into a Vigenère cipher as keys. I’d recently put together a project in R where I put together a list of words that fed into a function to locate similar words in a data set, and I thought a similar principle could apply. Why not build a function that solves Vigenère ciphers, and then feed it a big ass list of clock related vocabulary words to test?
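A rough sketch of what that word-list test could look like, using only the fragment and crib quoted above. The big assumptions: the fourth section is a plain Vigenère with a standard A-Z alphabet, and the key starts at the section’s first letter (so letter 64 uses key letter ((64 - 1) mod key length) + 1). Neither is confirmed anywhere, and the function name and candidate list are made up for illustration.

```r
# Ciphertext fragment and crib from Sanborn's clues (letters 64-74)
fragment  <- "NYPVTTMZFPK"
crib      <- "BERLINCLOCK"
start_pos <- 64

# Decrypt a fragment with a repeating key, given where the fragment
# starts in the full text (1-based)
decrypt_fragment <- function(fragment, key, start) {
  c_pos <- utf8ToInt(fragment) - utf8ToInt("A")
  k_pos <- utf8ToInt(toupper(key)) - utf8ToInt("A")
  idx   <- (start - 1 + seq_along(c_pos) - 1) %% length(k_pos) + 1
  intToUtf8((c_pos - k_pos[idx]) %% 26 + utf8ToInt("A"))
}

# Hypothetical candidate keys; a real run would use a much longer list
candidates <- c("CLOCK", "UHR", "ZEIT", "BERLIN", "WELTZEITUHR")
is_hit <- vapply(candidates,
                 function(k) decrypt_fragment(fragment, k, start_pos) == crib,
                 logical(1))
candidates[is_hit]   # none of these five match

# "Decrypting" with the crib itself gives the key letters a plain
# Vigenère would need at letters 64-74
decrypt_fragment(fragment, crib, 1)   # "MUYKLGKORNA"
```

That last line is a handy shortcut: instead of guessing keys blindly, any candidate has to reproduce those eleven key letters at the right offset, which rules out short repeating keys under these assumptions.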

This sounded fun, since it’s something so very different from what I usually do. Now before you ask, yes, I’m sure someone has already made something for R that does this, and there are definitely websites that have little widgets for it. This is really about the journey though, so I decided not to look for pre-made R libraries or read up on common heuristics for solving Vigenère ciphers, and to make my own from scratch instead.

So here’s the general idea: The function will take a given key in the form of a string, and a given text in the form of a string as arguments. It’ll remove punctuation from the text, and break both text and key into lists of constituent characters. For the key, it’ll make a [26 x length(key)] data frame with each column being an alphabet that starts with each letter of the key word as df[1,n] in turn. This data frame is designated gamma. For the text, it’ll lay it out into a separate [length(text) x 2] data frame with each letter of the text in turn in one column, and a repeating string of ID numbers equal in length to the key in the other. (Ex: For our dog example df[,1] would be BESURETODRINKYOUROVALTINE vertically and df[,2] would be 123 (the number of letters in dog) repeated until it fills as many places as length(text)). This data frame will be designated newKey. To make the transformation, it’ll note the numeric position in the standard alphabet held by each letter in the text, look up that letter’s ID number in newKey, and then check what letter holds the same position in the gamma column with that number. Decrypting runs the lookup in reverse: find the letter’s position in its gamma column, then take the letter at that position in the standard alphabet. I know this sounds a little confusing, but it’s actually fairly intuitive.
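To make the two tables concrete, here’s a small sketch that builds them for the “dog” example and encrypts a single letter by hand. It keeps the names gamma and newKey from the description; the column names `letter` and `id` are my own labels, and gamma is built as a character matrix for brevity.

```r
key  <- c("D", "O", "G")
text <- strsplit("BESURETODRINKYOUROVALTINE", "")[[1]]

# gamma: 26 rows, one column per key letter; each column is the
# alphabet rotated to start at that key letter
gamma <- sapply(key, function(k) {
  n <- match(k, LETTERS)
  c(LETTERS[n:26], LETTERS[-(n:26)])
})

# newKey: each text letter with the ID of the gamma column it uses
newKey <- data.frame(letter = text,
                     id = rep_len(seq_along(key), length(text)),
                     stringsAsFactors = FALSE)

head(newKey, 4)   # B/1, E/2, S/3, U/1
gamma[1:3, ]      # first rows: D O G, E P H, F Q I

# Encrypt the second letter: "E" is letter 5 of the alphabet and its
# ID is 2, so take row 5 of column 2 (the alphabet starting at "O")
gamma[match(newKey$letter[2], LETTERS), newKey$id[2]]   # "S"
```

Every letter of the message goes through that same row-and-column lookup, which is all the loop in the full function does.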

The structure is somewhat reminiscent of a relational database in the way we’re using ID numbers and keys, though it’s not quite the same. Check it out for yourself below! I’ll use it to crack the Kryptos code and become a super spy. Jk, maybe I’ll make a Shiny app doe. Enjoy!

Update: I made a Shiny app for this. One day when I stop being cheap and buy a pro WordPress I’ll be able to embed things. Today is not that day.

https://grantoliveira.shinyapps.io/Cipher-Machine/

encrypt <- function(key, text) {
       # Standard alphabet, used as the row index for every lookup
       alpha <- LETTERS
       l <- length(alpha)

       # Keep letters only (drops punctuation, spaces and digits), then
       # split into single upper-case characters
       text2 <- unlist(strsplit(toupper(gsub("[^A-Za-z]", "", text)), ""))
       key2 <- unlist(strsplit(toupper(gsub("[^A-Za-z]", "", key)), ""))
       stopifnot(length(key2) > 0)

       # gamma: one column per key letter, each column the alphabet
       # rotated so that it starts at that key letter
       gamma <- sapply(key2, function(k) {
               n <- match(k, alpha)
               c(alpha[n:l], alpha[-(n:l)])
       })

       # newKey: each text letter paired with the gamma column to use,
       # cycling 1..length(key) along the text
       newKey <- data.frame(text2,
                            nums = rep_len(seq_along(key2), length(text2)),
                            stringsAsFactors = FALSE)
       f <- rep(NA_character_, length(text2))

       for (i in seq_along(text2)) {
               # Position of the plaintext letter in the standard alphabet...
               convert <- match(newKey[i, 1], alpha)
               # ...and the letter at that position in the shifted alphabet
               f[i] <- gamma[convert, newKey[i, 2]]
       }
       print(paste(f, collapse = ""))
}
decrypt <- function(key, text) {
       # Standard alphabet, used to read off the plaintext letters
       alpha <- LETTERS
       l <- length(alpha)

       # Keep letters only (drops punctuation, spaces and digits), then
       # split into single upper-case characters
       text2 <- unlist(strsplit(toupper(gsub("[^A-Za-z]", "", text)), ""))
       key2 <- unlist(strsplit(toupper(gsub("[^A-Za-z]", "", key)), ""))
       stopifnot(length(key2) > 0)

       # gamma: one column per key letter, each column the alphabet
       # rotated so that it starts at that key letter
       gamma <- sapply(key2, function(k) {
               n <- match(k, alpha)
               c(alpha[n:l], alpha[-(n:l)])
       })

       # newKey: each ciphertext letter paired with the gamma column to
       # use, cycling 1..length(key) along the text
       newKey <- data.frame(text2,
                            nums = rep_len(seq_along(key2), length(text2)),
                            stringsAsFactors = FALSE)
       f <- rep(NA_character_, length(text2))

       for (i in seq_along(text2)) {
               # Position of the ciphertext letter in its shifted alphabet...
               convert <- match(newKey[i, 1], gamma[, newKey[i, 2]])
               # ...and the letter at that position in the standard alphabet
               f[i] <- alpha[convert]
       }
       print(paste(f, collapse = ""))
}