Introduction to How Code Breakers Work

Information is an important commodity. Nations, corporations and individuals protect secret information with encryption, using a variety of methods ranging from substituting one letter for another to using a complex algorithm to encrypt a message. On the other side of the information equation are people who use a combination of logic and intuition to uncover secret information. These people are cryptanalysts, also known as code breakers.

Image: Binary code is the basis for many modern ciphers. (Photo: Carston Müller, SXC)

A person who communicates through secret writing is called a cryptographer. Cryptographers might use codes, ciphers or a combination of both to keep messages safe from others. What cryptographers create, cryptanalysts attempt to unravel.

Throughout the history of cryptography, people who created codes or ciphers were often convinced their systems were unbreakable. Cryptanalysts have proven these people wrong by relying on everything from the scientific method to a lucky guess. Today, even the amazingly complex encryption schemes common in Internet transactions may have a limited useful lifetime -- quantum computing might make solving such difficult equations a snap.

You Say Cryptology, I Say Cryptography
In English, the words cryptology and cryptography are often interchangeable -- both refer to the science of secret writing. Some people prefer to differentiate the words, using cryptology to refer to the science and cryptography to refer to the practice of secret writing.

In this article, we'll look at some of the most popular codes and cipher systems used throughout history. We'll learn about the techniques cryptanalysts use to break codes and ciphers, and what steps cryptographers can take to make their messages more difficult to figure out. At the end, you'll get the chance to take a crack at an enciphered message.

To learn how code breakers crack secret messages, you need to know how people create codes. In the next section, we'll learn about some of the earliest attempts at hiding messages.

Polybius Squares and Caesar Shifts

Although historical findings show that several ancient civilizations used elements of ciphers and codes in their writing, code experts say that these examples were meant to give the message a sense of importance and formality. The person writing the message intended for his audience to be able to read it.

The Greeks were one of the first civilizations to use ciphers to communicate in secrecy. A Greek scholar named Polybius proposed a system for enciphering a message in which a cryptographer represented each letter with a pair of numbers ranging from one to five using a 5-by-5 square (the letters I and J shared a square). The Polybius Square (sometimes called the checkerboard) looks like this:

     1    2    3    4    5
1    A    B    C    D    E
2    F    G    H    I/J  K
3    L    M    N    O    P
4    Q    R    S    T    U
5    V    W    X    Y    Z

A cryptographer would write each letter as its row number followed by its column number, so the letter "B" becomes "12" and the letter "O" becomes "34." To encipher the phrase "How Stuff Works," the cryptographer would write "233452 4344452121 5234422543." Because he replaces each letter with two numbers, it's difficult for someone unfamiliar with the code to determine what this message means. The cryptographer could make it even more difficult by mixing up the order of the letters in the square instead of writing them out alphabetically.
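
The lookup described above can be sketched in a few lines of Python. This is only an illustration: the function name polybius_encipher is made up for this example, and the sketch assumes the alphabetical square shown above, with "J" sharing the "I" cell.

```python
# Rows of the 5-by-5 square from the article; I and J share a cell.
SQUARE = ["ABCDE", "FGHIK", "LMNOP", "QRSTU", "VWXYZ"]


def polybius_encipher(text):
    """Replace each letter with its row digit followed by its column digit."""
    words = []
    for word in text.upper().split():
        digits = []
        for letter in word:
            if letter == "J":  # J shares the I cell
                letter = "I"
            for row, letters in enumerate(SQUARE, start=1):
                col = letters.find(letter)
                if col != -1:
                    digits.append(f"{row}{col + 1}")
                    break  # characters not in the square are skipped
        words.append("".join(digits))
    return " ".join(words)


print(polybius_encipher("How Stuff Works"))  # 233452 4344452121 5234422543
```

Scrambling the letters inside SQUARE would give the harder variant mentioned above, as long as both parties use the same square.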

Julius Caesar invented another early cipher -- one that was very simple and yet confounded his enemies. He created enciphered messages by shifting the order of the alphabet by a certain number of letters. For example, if you were to shift the English alphabet down three places, the letter "D" would represent the letter "A," while the letter "E" would mean "B" and so forth. You can visualize this code by writing the two alphabets on top of one another with the corresponding plaintext and cipher matching up like this:

Plaintext   a  b  c  d  e  f  g  h  i  j  k  l  m
Cipher      D  E  F  G  H  I  J  K  L  M  N  O  P

Plaintext   n  o  p  q  r  s  t  u  v  w  x  y  z
Cipher      Q  R  S  T  U  V  W  X  Y  Z  A  B  C


Notice that the cipher alphabet wraps around to "A" after reaching "Z." Using this cipher system, you could encipher the phrase "How Stuff Works" as "KRZ VWXII ZRUNV."
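
As a rough sketch, the shift and the wrap-around from "Z" back to "A" come down to modular arithmetic on letter positions. The function name caesar_shift is invented for this example, and the default shift of three matches the table above.

```python
import string

ALPHABET = string.ascii_uppercase


def caesar_shift(text, shift=3):
    """Shift every letter by `shift` places, wrapping around after Z."""
    result = []
    for ch in text.upper():
        if ch in ALPHABET:
            result.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            result.append(ch)  # keep spaces and punctuation unchanged
    return "".join(result)


print(caesar_shift("How Stuff Works"))      # KRZ VWXII ZRUNV
print(caesar_shift("KRZ VWXII ZRUNV", -3))  # HOW STUFF WORKS
```

Deciphering is the same operation with the shift reversed, which is part of why the system is easy to break: there are only 25 useful shifts to try.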

Both of these systems, the Polybius Square and the Caesar Shift, formed the basis of many future cipher systems.

In the next section, we'll look at a few of these more advanced methods of encryption.

Deciphering the Language
To encipher a message means to replace the letters in the text with the replacement alphabet. The readable message is called the plaintext. The cryptographer converts the plaintext into a cipher and sends it on. The recipient of the message uses the proper technique, called the key, to decipher the message, changing it from a cipher back into a plaintext.

The Trithemius Tableau

After the fall of the Roman Empire, the Western world entered what we now call the Dark Ages. During this time, scholarship declined and cryptography suffered the same fate. It wasn't until the Renaissance that cryptography again became popular. The Renaissance was not only a period of intense creativity and learning, but also of intrigue, politics, warfare and deception.

Cryptographers began to search for new ways to encipher messages. The Caesar Shift was too easy to crack -- given enough time and patience, almost anyone could uncover the plaintext behind the ciphered text. Kings and priests hired scholars to come up with new ways to send secret messages.

One such scholar was Johannes Trithemius, who proposed laying out the alphabet in a matrix, or tableau. The matrix was 26 rows long and 26 columns wide. The first row contained the alphabet as it is normally written. The next row used a Caesar Shift to move the alphabet over one space. Each row shifted the alphabet another spot so that the final row began with "Z" and ended in "Y." You could read the alphabet normally by looking across the first row or down the first column. The first three rows and the last row look like this:

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
B C D E F G H I J K L M N O P Q R S T U V W X Y Z A
C D E F G H I J K L M N O P Q R S T U V W X Y Z A B
...
Z A B C D E F G H I J K L M N O P Q R S T U V W X Y

Each row is a Caesar Shift. To encipher a letter, the cryptographer picks a row and uses the top row as the plaintext guide. A cryptographer using the 10th row, for example, would encipher the plaintext letter "A" as "J."
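
The full 26-by-26 tableau is easy to generate rather than write out by hand. The sketch below is one way to do it; build_tableau and encipher_with_row are names chosen for this example, and rows are numbered from 1 to match the text.

```python
import string

ALPHABET = string.ascii_uppercase


def build_tableau():
    """Return 26 rows, each the alphabet shifted one more place to the left."""
    return [ALPHABET[i:] + ALPHABET[:i] for i in range(26)]


def encipher_with_row(letter, row_number):
    """Encipher one letter using the given row (1-based) of the tableau."""
    column = ALPHABET.index(letter.upper())  # top row is the plaintext guide
    return build_tableau()[row_number - 1][column]


for row in build_tableau():
    print(" ".join(row))

print(encipher_with_row("A", 10))  # J
```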

Trithemius didn't stop there -- he suggested that cryptographers encipher messages by using the first row for the first letter, the second row for the second letter, and so on down the tableau. After 26 consecutive letters, the cryptographer would start back at the first row and work down again until he had enciphered the entire message. Using this method, he could encipher the phrase "How Stuff Works" as "HPY VXZLM EXBVE."
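
Working down the tableau one row per letter is the same as shifting the first letter by zero places, the second by one, and so on. A minimal sketch follows, assuming spaces are carried through without advancing the row; the function name is invented for this example.

```python
import string

ALPHABET = string.ascii_uppercase


def progressive_encipher(text):
    """Shift the n-th letter of the message by n - 1 places (mod 26)."""
    result = []
    position = 0  # counts letters only, so spaces do not advance the row
    for ch in text.upper():
        if ch in ALPHABET:
            shift = position % 26  # back to the first row after 26 letters
            result.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
            position += 1
        else:
            result.append(ch)
    return "".join(result)


print(progressive_encipher("How Stuff Works"))  # HPY VXZLM EXBVE
```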

Trithemius' tableau is a good example of a polyalphabetic cipher. Most early ciphers were monoalphabetic, meaning that one cipher alphabet replaced the plaintext alphabet. A polyalphabetic cipher uses multiple alphabets to replace the plaintext. Although the same letters are used in each row, the letters of that row have a different meaning. A cryptographer enciphers a plaintext "A" in row three as a "C," but an "A" in row 23 is a "W." Trithemius' system therefore uses 26 alphabets -- one for each letter in the normal alphabet.

In the next section, we'll learn how a scholar named Vigenère created a complex polyalphabetic cipher.

The Vigenère Cipher

In the late 1500s, Blaise de Vigenère proposed a polyalphabetic system that is particularly difficult to decipher. His method used a combination of the Trithemius tableau and a key. The key determined which of the alphabets in the table the decipherer should use, but wasn't necessarily part of the actual message.

Let's assume you are encrypting a message using the key word "CIPHER." You would encipher the first letter using the "C" row as a guide, using the letter found at the intersection of the "C" row and the corresponding plaintext letter's column. For the second letter, you'd use the "I" row, and so on. Once you use the "R" row to encipher a letter, you'd start back at "C". Using this key word and method, you could encipher "How Stuff Works" this way:

Key      C  I  P  H  E  R  C  I  P  H  E  R  C
Plain    H  O  W  S  T  U  F  F  W  O  R  K  S
Cipher   J  W  L  Z  X  L  H  N  L  V  V  B  U


Your enciphered message would read, "JWL ZXLHN LVVBU." If you wanted to write a longer message, you'd keep repeating the key over and over to encipher your plaintext. The recipient of your message would need to know the key beforehand in order to decipher the text.
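
Picking the row named by each key letter amounts to shifting the plaintext letter by that key letter's position in the alphabet. The following is a sketch under that reading, with an invented function name; it assumes spaces pass through without using up a key letter.

```python
import string

ALPHABET = string.ascii_uppercase


def vigenere(text, key_word, decipher=False):
    """Shift each letter by the matching key letter; the key repeats as needed."""
    shifts = [ALPHABET.index(k) for k in key_word.upper()]
    sign = -1 if decipher else 1
    result = []
    position = 0  # counts letters only, so spaces do not consume key letters
    for ch in text.upper():
        if ch in ALPHABET:
            shift = sign * shifts[position % len(shifts)]
            result.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
            position += 1
        else:
            result.append(ch)
    return "".join(result)


print(vigenere("How Stuff Works", "CIPHER"))                 # JWL ZXLHN LVVBU
print(vigenere("JWL ZXLHN LVVBU", "CIPHER", decipher=True))  # HOW STUFF WORKS
```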

Vigenère suggested an even more complex scheme that used a priming letter followed by the message itself as the key. The priming letter designated the row the cryptographer first used to begin the message. Both the cryptographer and the recipient knew which priming letter to use beforehand. This method made cracking ciphers extremely difficult, but it was also time-consuming, and one error early in the message could garble everything that followed. While the system was secure, most people found it too complex to use effectively. Here is an example of Vigenère's system -- in this case the priming letter is "D":

Key      D  H  O  W  S  T  U  F  F  W  O  R  K
Plain    H  O  W  S  T  U  F  F  W  O  R  K  S
Cipher   K  V  K  O  L  N  Z  K  B  K  F  B  C


To decipher, the recipient would first look at the first letter of the encrypted message, a "K" in this case, and use the Trithemius table to find where the "K" fell in the "D" row -- remember, both the cryptographer and recipient know beforehand that the first letter of the key will always be "D," no matter what the rest of the message says. The letter at the top of that column is "H." The "H" becomes the next letter in the cipher's key, so the recipient would look at the "H" row next and find the next letter in the cipher -- a "V" in this case. That would give the recipient an "O." Following this method, the recipient can decipher the entire message, though it takes some time.
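
The priming-letter scheme can be sketched as follows. The function names are invented for this example. The key starts with the priming letter, and each recovered plaintext letter becomes the key for the next one, which is also why a single early mistake garbles everything after it.

```python
import string

ALPHABET = string.ascii_uppercase


def autokey_encipher(text, priming="D"):
    """Key = priming letter followed by the plaintext itself."""
    key_letter = priming
    result = []
    for ch in text.upper():
        if ch in ALPHABET:
            shift = ALPHABET.index(key_letter)
            result.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
            key_letter = ch  # this plaintext letter keys the next one
        else:
            result.append(ch)
    return "".join(result)


def autokey_decipher(cipher_text, priming="D"):
    """Recover each letter, then use it as the key for the next letter."""
    key_letter = priming
    result = []
    for ch in cipher_text.upper():
        if ch in ALPHABET:
            shift = ALPHABET.index(key_letter)
            plain = ALPHABET[(ALPHABET.index(ch) - shift) % 26]
            result.append(plain)
            key_letter = plain
        else:
            result.append(ch)
    return "".join(result)


print(autokey_encipher("HOWSTUFFWORKS"))  # KVKOLNZKBKFBC
print(autokey_decipher("KVKOLNZKBKFBC"))  # HOWSTUFFWORKS
```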

The more complex Vigenère system didn't catch on until the 1800s, but it's still used in modern cipher machines [source: Kahn].

In the next section, we'll learn about the ADFGX code created by Germany during World War I.

ADFGX Cipher

After the invention of the telegraph, individuals could communicate across entire countries almost instantaneously using Morse code. Unfortunately, it was also possible for anyone with the right equipment to wiretap a line and listen in on exchanges. Moreover, most people had to rely on clerks to encode and decode messages, making it impossible to send plaintext clandestinely. Once again, ciphers became important.

Germany created a new cipher based on a combination of the Polybius checkerboard and ciphers using key words. It was known as the ADFGX cipher, because those were the only letters used in the cipher. The Germans chose these letters because their Morse code equivalents are difficult to confuse, reducing the chance of errors.

The first step was to create a matrix that looked a lot like the Polybius checkerboard:

     A    D    F    G    X
A    A    B    C    D    E
D    F    G    H    I/J  K
F    L    M    N    O    P
G    Q    R    S    T    U
X    V    W    X    Y    Z

Cryptographers would use pairs of cipher letters to represent plaintext letters. The letter's row becomes the first cipher in the pair, and the column becomes the second cipher. In this example, the enciphered letter "B" becomes "AD," while "O" becomes "FG." Not all ADFGX matrices had the alphabet plotted in alphabetical order.

Next, the cryptographer would encipher his message. Let's stick with "How Stuff Works." Using this matrix, we'd get "DFFGXD GFGGGXDADA XDFGGDDXGF."
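
This first step is a Polybius-style lookup with letters instead of digits. Here is a sketch, assuming the alphabetical matrix shown above; the function name is made up for this example.

```python
LABELS = "ADFGX"
# Alphabetical matrix from the article; I and J share a cell.
MATRIX = ["ABCDE", "FGHIK", "LMNOP", "QRSTU", "VWXYZ"]


def adfgx_substitute(text):
    """Replace each letter with its row label followed by its column label."""
    words = []
    for word in text.upper().split():
        pairs = []
        for letter in word.replace("J", "I"):
            for row, letters in enumerate(MATRIX):
                col = letters.find(letter)
                if col != -1:
                    pairs.append(LABELS[row] + LABELS[col])
                    break  # characters not in the matrix are skipped
        words.append("".join(pairs))
    return " ".join(words)


print(adfgx_substitute("How Stuff Works"))  # DFFGXD GFGGGXDADA XDFGGDDXGF
```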

The next step was to determine a key word, which could be any length but couldn't include any repeated letters. For this example, we'll use the word DEUTSCH. The cryptographer would create a grid with the key word spelled across the top. The cryptographer would then write the enciphered message into the grid, splitting the cipher pairs into individual letters and wrapping around from one row to the next.

D  E  U  T  S  C  H
-------------------
D  F  F  G  X  D  G
F  G  G  G  X  D  A
D  A  X  D  F  G  G
D  D  X  G  F

Next, the cryptographer would rearrange the grid so that the letters of the key word were in alphabetical order, shifting the letters' corresponding columns accordingly:

C  D  E  H  S  T  U
-------------------
D  D  F  G  X  G  F
D  F  G  A  X  G  G
G  D  A  G  F  D  X
   D  D     F  G  X

He would then write out the message by following down each column (disregarding the letters of the key word on the top row). This message would come out as "DDG DFDD FGAD GAG XXFF GGDG FGXX." It's probably clear why this code was so challenging -- cryptographers enciphered and transposed every plaintext character. To decode, you would need to know the key word (DEUTSCH), then you'd work backward from there. You'd start with a grid with the columns arranged alphabetically. Once you filled it out, you could rearrange the columns properly and use your matrix to decipher the message.
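
The transposition step and its reversal can be sketched like this. The function names are invented for this example, and the sketch assumes the ciphered message keeps one space-separated group per column, as in the example above, and that the key word has no repeated letters.

```python
def adfgx_transpose(letters, key_word):
    """Write letters under the key word row by row, then read the
    columns in alphabetical order of the key word's letters."""
    width = len(key_word)
    # Column i holds every width-th letter starting at position i.
    columns = {key_word[i]: letters[i::width] for i in range(width)}
    return " ".join(columns[k] for k in sorted(key_word))


def adfgx_untranspose(cipher_text, key_word):
    """Put the columns back in key-word order and read across the rows."""
    columns = dict(zip(sorted(key_word), cipher_text.split()))
    ordered = [columns[k] for k in key_word]
    result = []
    for row in range(max(len(col) for col in ordered)):
        for col in ordered:
            if row < len(col):  # the last row may be only partly filled
                result.append(col[row])
    return "".join(result)


pairs = "DFFGXDGFGGGXDADAXDFGGDDXGF"
cipher = adfgx_transpose(pairs, "DEUTSCH")
print(cipher)  # DDG DFDD FGAD GAG XXFF GGDG FGXX
print(adfgx_untranspose(cipher, "DEUTSCH") == pairs)  # True
```

After reversing the transposition, the recipient splits the letters back into pairs and looks each pair up in the matrix.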

Words Count
One of the ways you can guess at a key word in an ADFGX cipher is to count the number of words in the ciphered message. The number of ciphered words will tell you how long the key word is -- each ciphered word represents a column of text, and each column corresponds to a letter in the key word. In our example, there are seven words in the ciphered message, meaning there are seven columns with a seven-letter key word. Sure enough, DEUTSCH has seven letters. Because the ciphered words and the original message can have different word counts -- seven ciphered words versus three plaintext words in our example -- deciphering the message becomes more challenging.

In the next section, we'll look at some of the devices cryptographers have invented to create puzzling ciphers.

Cipher Machines

One of the earliest cipher devices known is the Alberti Disc, invented by Leon Battista Alberti in the 15th century. The device consisted of two discs: a fixed outer disc carrying a truncated alphabet and the numbers 1 to 4, and a rotating inner disc carrying a scrambled alphabet. Turning the inner disc matched its letters against different letters on the outer disc. The cryptographer read plaintext letters off the outer disc and wrote down the matching inner-disc letters as the cipher text.

Image: Dan Brown's novel "The Da Vinci Code" follows the adventures of a symbology professor as he solves codes and ciphers, some of which he breaks using a Cardano Grille. (Photo: William West/AFP/Getty Images)


Because the inner disc's alphabet was scrambled, the recipient would need an identical copy of the disc the cryptographer used to decipher the message. To make the system more secure, the cryptographer could change the disc's alignment in the middle of a message, perhaps after three or four words. The cryptographer and recipient would know to change the disc settings after a prescribed number of words, perhaps first setting the disc so that the inner circle "A" matched with the outer circle "W" for the first four words, then with "N" for the next four, and so on. This made cracking the cipher much more difficult.

Cardano Grilles and Steganography
A clever way to hide a secret message is in plain sight. One way to do this is to use a Cardano Grille -- a piece of paper or cardboard with holes cut out of it. To cipher a message, you lay a grille on a blank sheet of paper and write out your message through the grille's holes. You fill the rest of the paper with innocent text. When your recipient receives the message, he lays an identical grille over it to see the secret text. This is a form of steganography, hiding a message within something else.

In the late 18th century, Thomas Jefferson proposed a new ciphering machine. It was a cylinder of discs mounted on a spindle. On the edge of each disc were the letters of the alphabet, arranged in random sequence. A cryptographer could align the discs to spell out a short message across the cylinder. He would then look at another row across the cylinder, which would appear to be gibberish, and send that to the recipient. The recipient would use an identical cylinder to spell out the series of nonsense letters, then scan the rest of the cylinder, looking for a message spelled out in English. In 1922, the United States Army adopted a device very similar to Jefferson's; other branches of the military soon followed suit [source: Kahn].

Perhaps the most famous ciphering device was Germany's Enigma Machine from the early 20th century. The Enigma Machine resembled a typewriter, but instead of printing letters on paper it had a panel of lights with a letter stamped on each. Pressing a key caused an electric current to run through a complex system of wires and gears, resulting in a ciphered letter illuminating. For instance, you might press the key for the letter "A" and see "T" light up.

Image: German soldiers using an Enigma Machine in the field. (Photo courtesy U.S. Army)

What made the Enigma Machine such a formidable ciphering device was that once you pressed a letter, a rotor in the machine would turn, changing the electrode contact points inside the machine. This means if you pressed "A" a second time, a different letter would light up instead of "T." Each time you typed a letter, the rotor turned, and after a certain number of letters, a second rotor engaged, then a third. The machine allowed the operator to switch how letters fed into the machine, so that when you pressed one letter, the machine would interpret it as if you had pressed a different letter.
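
To see why a stepping rotor matters, consider a deliberately simplified toy with a single rotor and nothing else. It is not a model of the real Enigma, which chained several rotors with other components. The wiring string is just a scrambled alphabet chosen for illustration, and the function name is invented.

```python
import string

ALPHABET = string.ascii_uppercase
# Illustrative scrambled alphabet; any permutation of A-Z would do here.
WIRING = "EKMFLGDQVZNTOWYHXUSPAIBRCJ"


def press_keys(keys, wiring=WIRING):
    """Encipher each key press through one rotor that steps after every letter."""
    position = 0
    lit = []
    for key in keys.upper():
        entry = (ALPHABET.index(key) + position) % 26  # contact the current enters
        contact = ALPHABET.index(wiring[entry])        # contact it leaves through
        lit.append(ALPHABET[(contact - position) % 26])
        position = (position + 1) % 26                 # the rotor turns one step
    return "".join(lit)


print(press_keys("AAA"))  # EJK
```

Pressing "A" three times lights three different letters, so simple letter counting no longer lines up with the plaintext.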

How does a cryptanalyst crack such a difficult code? In the next section, we'll learn how codes and ciphers are broken.

Cryptanalysis

While there are hundreds of different codes and cipher systems in the world, there are some universal traits and techniques cryptanalysts use to solve them. Patience and perseverance are two of the most important qualities in a cryptanalyst. Solving a cipher can take a lot of time, sometimes requiring you to retrace your steps or start over. It is tempting to give up when you are faced with a particularly challenging cipher.

Another important skill to have is a strong familiarity with the language in which the plaintext is written. Trying to solve a coded message written in an unfamiliar language is almost impossible.

Navajo Code Talkers
During World War II, the United States employed Navajo Native Americans to encode messages. The Navajos used a code system based on how their language translated into English. They assigned terms like "airplane" to code words such as "Da-he-tih-hi," which means "Hummingbird." To encipher words that didn't have a corresponding code word, they used an encoded alphabet. This encoded alphabet used Navajo translations of English words to represent letters; for instance, the Navajo word "wol-la-chee" meant "ant," so "wol-la-chee" could stand for the letter "a." Some letters were represented by multiple Navajo words. The Navajo language was so foreign to the Japanese, they never broke the code [source: Kahn].

A strong familiarity with a language includes a grasp of the language's redundancy.

Redundancy means that every language contains more characters or words than are actually needed to convey information. The rules of the English language create redundancy -- for example, no English word will begin with the letters "ng." English also relies heavily on a small number of words. Words like "the," "of," "and," "to," "a," "in," "that," "it," "is," and "I" account for more than one quarter of the text of an average message written in English [source: Kahn].

Knowing the redundant qualities of a language makes a cryptanalyst's task much easier. No matter how convoluted the cipher is, it follows some language's rules in order for the recipient to understand the message. Cryptanalysts look for patterns within ciphers to find common words and letter pairings.

One basic technique in cryptanalysis is frequency analysis. Every language uses certain letters more often than others. In English, the letter "e" is the most common letter. By counting up the characters in a text, a cryptanalyst can see very quickly what sort of cipher he has. If the distribution of cipher frequency is similar to the distribution of the frequency of a normal alphabet, the cryptanalyst may conclude that he's dealing with a monoalphabetic cipher.
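
Counting letters is simple to automate. The sketch below tallies letter frequencies and then, as an assumption that only holds for a plain Caesar Shift on reasonably long English text, guesses the shift by treating the most common cipher letter as "e." The function names and the sample message are made up for this example.

```python
import string
from collections import Counter

ALPHABET = string.ascii_uppercase


def letter_frequencies(text):
    """Return (letter, share of all letters) pairs, most common first."""
    letters = [ch for ch in text.upper() if ch in ALPHABET]
    counts = Counter(letters)
    return [(letter, count / len(letters)) for letter, count in counts.most_common()]


def guess_caesar_shift(cipher_text):
    """Assume the most frequent cipher letter stands for plaintext E."""
    most_common = letter_frequencies(cipher_text)[0][0]
    return (ALPHABET.index(most_common) - ALPHABET.index("E")) % 26


# "MEET ME BEHIND THE GREEN TREE" shifted three places
sample = "PHHW PH EHKLQG WKH JUHHQ WUHH"
print(letter_frequencies(sample)[0])  # ('H', 0.375)
print(guess_caesar_shift(sample))     # 3
```

On short messages the most common letter is often not "e," so a guess like this is a starting point rather than an answer.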

Image: This chart shows the frequency with which each letter in the English language is used. (©HowStuffWorks 2007)


In the next section, we'll look at more complex cryptanalysis and the role luck plays in breaking a cipher.

Tricks of the Trade
Cryptographers use many methods to confuse cryptanalysts. Acrophony is a method that encodes a letter by using a word that starts with that letter's sound. "Bat" might stand for "b," while "cunning" could stand for "k." A polyphone is a symbol that represents more than one letter of plaintext -- a "%" might represent both an "r" and a "j" for example, whereas homophonic substitution uses different ciphers to represent the same plaintext letter -- "%" and "&" could both represent the letter "c." Some cryptographers even throw in null symbols that don't mean anything at all.

Breaking the Code

More complicated ciphers require a combination of experience, experimentation and the occasional shot-in-the-dark guess. The most difficult ciphers are short, continuous blocks of characters. If the cryptographer's message includes word breaks -- spaces between each enciphered word -- deciphering becomes much easier. The cryptanalyst looks for groups of repeated ciphers, analyzes where those groups of letters fall within the context of words and makes guesses at what those letters might mean. If the cryptanalyst has a clue about the message's content, he might look for certain words. A cryptanalyst intercepting a message from a Navy captain to command might look for terms referring to weather patterns or sea conditions. If he guesses that "hyuwna" means "stormy," he might be able to crack the rest of the cipher.
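
A guessed word like this can be turned into a partial key and tested against the rest of the message. The sketch below assumes a simple monoalphabetic substitution, where each cipher letter always stands for the same plaintext letter. The function names and the second cipher word "yua" are invented for illustration.

```python
def crib_mapping(cipher_word, guessed_plain):
    """Build a cipher-to-plain letter map from a guess, or None if the
    guess contradicts itself under a one-to-one substitution."""
    if len(cipher_word) != len(guessed_plain):
        return None
    mapping, used = {}, {}
    for c, p in zip(cipher_word.lower(), guessed_plain.lower()):
        if mapping.setdefault(c, p) != p:  # one cipher letter, two meanings
            return None
        if used.setdefault(p, c) != c:     # two cipher letters, one meaning
            return None
    return mapping


def apply_mapping(cipher_text, mapping):
    """Fill in known letters and mark unknown ones with an underscore."""
    return "".join(
        mapping.get(ch, "_") if ch.isalpha() else ch
        for ch in cipher_text.lower()
    )


mapping = crib_mapping("hyuwna", "stormy")
print(apply_mapping("hyuwna yua", mapping))  # stormy toy
```

If the partly filled-in text starts to look like real words, the guess was probably right; if it produces nonsense, the cryptanalyst tries another word.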

Image: Breaking the code carved into the ceiling of the Rosslyn Chapel in Scotland reveals a series of musical passages. (Photo: Christopher Furlong/Getty Images)

Many polyalphabetic ciphers rely on key words, which makes the message vulnerable. If the cryptanalyst guesses the key word, he can quickly decipher the entire message. It's important for cryptographers to change key words frequently and to use uncommon or nonsense key words. Remembering a nonsense key word can be challenging, however, and if you make your cipher system so difficult that your recipient can't decipher the message quickly, your communication system fails.
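
Guessing key words is easy to automate, which is why common words make poor keys. The following sketch tries a list of candidate key words against a repeating-key cipher of the kind described earlier and stops when the result contains a word the cryptanalyst expects to see. The candidate list, the expected word and the function names are all hypothetical.

```python
import string

ALPHABET = string.ascii_uppercase


def vigenere_decipher(cipher_text, key_word):
    """Undo a repeating-key shift; spaces do not consume key letters."""
    shifts = [ALPHABET.index(k) for k in key_word.upper()]
    result = []
    position = 0
    for ch in cipher_text.upper():
        if ch in ALPHABET:
            shift = shifts[position % len(shifts)]
            result.append(ALPHABET[(ALPHABET.index(ch) - shift) % 26])
            position += 1
        else:
            result.append(ch)
    return "".join(result)


def try_key_words(cipher_text, candidates, expected_word):
    """Return the first (key word, plaintext) whose plaintext contains the expected word."""
    for key_word in candidates:
        plain = vigenere_decipher(cipher_text, key_word)
        if expected_word.upper() in plain:
            return key_word, plain
    return None


guesses = ["SECRET", "ENIGMA", "CIPHER"]  # hypothetical word list
print(try_key_words("JWL ZXLHN LVVBU", guesses, "works"))
# ('CIPHER', 'HOW STUFF WORKS')
```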

Cryptanalysts take advantage of any opportunity to solve a cipher. If the cryptographer used a ciphering device, a savvy cryptanalyst will try to get the same device or make one based on his theories of the cryptographer's methodology. In the years before World War II, Polish cryptanalysts worked out the wiring of the Enigma Machine and built their own replicas. In 1939, with war approaching, the Poles shared their information and technology with the Allies, who went on to decipher many of Germany's coded messages.

Modern high-level encryption methods rely on mathematical processes that are relatively simple to create, but extremely difficult to decipher. Public-key encryption is a good example. It uses two keys -- one for encoding a message and another for decoding. The encoding key is the public key, available to whoever wants to communicate with the holder of the secret key. The secret key decodes messages encrypted by the public key and vice versa. For more information on public-key encryption, see How Encryption Works.
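
The article doesn't name a particular scheme, so the sketch below uses an RSA-style toy as an assumed example of the two-key idea. The primes are tiny and chosen only for illustration; real systems use numbers hundreds of digits long. The sketch needs Python 3.8 or later for the modular inverse.

```python
# Toy RSA-style key pair built from two small primes (illustration only).
p, q = 61, 53
n = p * q                # 3233, shared by both keys
phi = (p - 1) * (q - 1)  # 3120
e = 17                   # public exponent, chosen to share no factor with phi
d = pow(e, -1, phi)      # secret exponent: the inverse of e modulo phi

public_key = (e, n)
secret_key = (d, n)


def encode(number, key):
    """Apply one key of the pair; the other key undoes it."""
    exponent, modulus = key
    return pow(number, exponent, modulus)


message = 65  # a message already converted to a number smaller than n
cipher = encode(message, public_key)
print(encode(cipher, secret_key) == message)  # True
```

Anyone who could factor n back into p and q could recompute the secret exponent, which is why the scheme's safety depends on factoring being hard.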

The complex algorithms cryptographers use ensure secrecy for now. That will change if quantum computing becomes a reality. Quantum computers could find the factors of a large number much faster than a classical computer. If engineers build a reliable quantum computer, practically every encrypted message on the Internet will be vulnerable. To learn more about how cryptographers plan to deal with this problem, read How Quantum Encryption Works.

In the next section, we'll look at some codes and ciphers that remain unsolved, much to cryptanalysts' chagrin.

Famous Unsolved Codes

While most cryptanalysts will tell you that, theoretically, there's no such thing as an unbreakable code, a few cryptographers have created codes and ciphers that no one has managed to crack. In most cases, there's just not enough text in the message for cryptanalysts to analyze. Sometimes, the cryptographer's system is too complex, or there may be no message at all -- the codes and ciphers could be hoaxes.

In the 1800s, a pamphlet with three encrypted messages began to show up in a small community in Virginia. The pamphlet described the adventures of a man named Beale who'd struck it rich panning for gold. Reportedly, Beale had hidden most of his wealth in a secret location and left a coded message leading to the treasure's location with an innkeeper. Twenty years passed with no word from Beale, and the innkeeper sought out help solving the coded messages. Eventually, someone determined that one of the messages used the Declaration of Independence as a code book, but the deciphered message only gave vague hints at the location of the treasure and claimed that the other messages would lead directly to it. No one has solved either of the other messages, and many believe the whole thing to be a hoax.

Image: The Zodiac killer sent ciphered messages like this one to San Francisco newspapers in the 1960s.

In the late 1960s, residents of San Francisco and surrounding counties were terrified of a vicious killer who taunted police with coded messages. The killer called himself the Zodiac and sent most of his letters to San Francisco newspapers, occasionally dividing up one long ciphered message between three papers. Allegedly, the ciphers perplexed law enforcement and intelligence agencies, though amateur cryptanalysts managed to crack most of them. There are a few messages that have never been solved, some supposedly a clue to the killer's identity.

Richard Feynman, physicist and pioneer in the field of nanotechnology, received three encoded messages from a scientist at Los Alamos and shared them with his graduate students when he couldn't decipher them himself. Currently, they are posted on a puzzle site. Cryptanalysts have only managed to decipher the first message, which turned out to be the opening lines of Chaucer's "Canterbury Tales" written in Middle English.

In 1990, Jim Sanborn created a sculpture called Kryptos for the CIA headquarters in Langley, Va. Kryptos contains four enciphered messages, but cryptanalysts have solved only three. The final message has very few characters (either 97 or 98, depending on whether one character truly belongs to the fourth message), making it very difficult to analyze. Several people and organizations have boasted about solving the other three messages, including the CIA and the NSA.

While these messages along with many others are unsolved today, there's no reason to believe they will remain unsolved forever. For more than 100 years, a ciphered message written by Edgar Allan Poe went unsolved, puzzling professional and amateur cryptanalysts. But in 2000, a man named Gil Broza cracked the cipher. He found that the cipher used multiple homophonic substitutions -- Poe had used 14 ciphers to represent the letter "e" -- as well as several mistakes. Broza's work proves that just because a code hasn't been solved doesn't mean it's not solvable [source: Elonka.com].

You're the Cryptanalyst
The following message is enciphered text using a method similar to one discussed in this article. There are clues in the article that can help you solve the cipher. It might take you a while to find a method that works, but with a little patience you'll figure it out. Good luck!
KWKWKKRWRKKKKKWRSRWWO
SWWSWORSSRWOROSROKSKWK
OKOKWSOWRSSORWRKWOWKR
KSRKRWKWRWSWRROWRSOKS
KSRSWRKKOOWOOOKSOKKRS
RWRWSWROSKKWRWKKSWKSS
RWOORWRWWSWSSKWSWOWRK
SWSWKWKOKKORKROWSKRRK
WSWWWKWOOROWSKRKSKOWW

Answer:

You have deciphered a code based on the ADFGX cipher used by Germany in World War I. The key word was Discovery.


Sources

  • Elonka.com. http://www.elonka.com
  • Kahn, David. "The Code-Breakers." Macmillan Publishing Co., Inc.
    New York. 1967.
  • Kozaczuk, Wladyslaw. "Enigma." University Publications of
    America, Inc. 1985.
  • Pincock, Stephen. "Codebreaker." Walker & Company.
    New York. 2006.
  • Sutherland, Scott. "An Introduction to Cryptography."
    October 14, 2005. http://www.math.sunysb.edu/~scott/papers/MSTP/crypto/crypto.html
  • The Enigma Cipher Machine
    http://www.codesandciphers.org.uk/enigma/index.htm