Kasiski test

Also known as: Kasiski examination · Kasiski method

The Kasiski test is the method that brought down the Vigenère cipher, long reputed to be indecipherable. Published in 1863 by the Prussian major Friedrich Kasiski, it lets you deduce the key length of a Vigenère-style polyalphabetic cipher, which reduces the cracking task to a series of frequency analyses on monoalphabetic sub-messages. Once the length is known, the cipher falls within hours.

A double discovery

The history is rich with irony. Charles Babbage, designer of the Analytical Engine (the conceptual ancestor of the computer), discovered the principle as early as 1854 — nine years before Kasiski. But Babbage didn’t publish. Three hypotheses circulate:

The British government, in the middle of the Crimean War, asked Babbage to keep to himself a discovery that gave its intelligence services a strategic edge.
Babbage was notoriously a procrastinator on publication (his Analytical Engine was never even built in his lifetime).
Babbage was preparing a more ambitious book of which cryptanalysis was only one chapter.

Whatever the reason, in 1863 Kasiski published a slim volume, Die Geheimschriften und die Dechiffrir-Kunst, which became the reference for a century. The method kept his name. Babbage’s written trace was rediscovered only in the 1980s, in Royal Society archives.

Principle

The Kasiski test rests on a simple observation: if the same plaintext sequence happens to be encrypted twice with the same key letters, it produces exactly the same ciphertext. That’s inevitable and that’s the leak Kasiski exploits.

Procedure:

Find repetitions: scan the ciphertext and identify every sequence of 3 or more characters that appears at least twice. The longer the sequence, the more significant the find (accidental repetitions on 3 letters happen; on 5+ they are rare).
Measure distances: for each pair of repetitions, note the distance between the two occurrences, i.e. the difference between their starting positions (not the number of characters in between).
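GCD: compute the greatest common divisor (GCD) of all distances. With high probability, that GCD is the key length or a multiple of it. If a single accidental repetition drags the GCD down to 1 or 2, set that distance aside and look at the factor shared by most of the others.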
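Confirm with IC: for a candidate length k, split the ciphertext into sub-messages by taking every k-th character and compute the index of coincidence of each. For the right k, the IC rises back to the language’s expected value (about 0.067 for English, against about 0.038 for uniformly random letters).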
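The first three steps can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names are invented for this example, the ciphertext is assumed to contain only uppercase A-Z, and the type hints need Python 3.9 or later. Because a single accidental repetition can ruin a strict GCD, the sketch also counts how many distances each candidate length divides.

```python
from collections import Counter, defaultdict
from functools import reduce
from math import gcd


def repeated_sequences(ciphertext: str, length: int = 3) -> dict[str, list[int]]:
    """Map each substring of `length` letters seen 2+ times to its 0-based start positions."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - length + 1):
        positions[ciphertext[i:i + length]].append(i)
    return {seq: pos for seq, pos in positions.items() if len(pos) > 1}


def kasiski_distances(ciphertext: str, length: int = 3) -> list[int]:
    """Distances between consecutive occurrences of every repeated sequence."""
    distances = []
    for pos in repeated_sequences(ciphertext, length).values():
        distances.extend(b - a for a, b in zip(pos, pos[1:]))
    return distances


def distance_gcd(distances: list[int]) -> int:
    """GCD of all distances (0 if there are none)."""
    return reduce(gcd, distances, 0)


def factor_votes(distances: list[int], max_key_length: int = 20) -> Counter:
    """Count how many distances each candidate key length divides."""
    votes = Counter()
    for d in distances:
        for k in range(2, max_key_length + 1):
            if d % k == 0:
                votes[k] += 1
    return votes
```

In practice you would call `factor_votes(kasiski_distances(ciphertext))` and keep the few best-voted lengths as candidates for the IC check.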

Why it works

If the same plaintext sequence (for instance THE) lands twice on the same position of the key cycle, it will be encrypted exactly the same way both times. The distance between two such occurrences is therefore a multiple of the key length. Each distance is the key length times some whole number; taking the GCD across several distances strips out those extra factors and leaves the key length (or a small multiple of it).
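A small experiment makes the argument concrete. The sketch below uses a textbook Vigenère encryption on uppercase A-Z; the key KEY and the two plaintexts are hypothetical values chosen only for illustration.

```python
from string import ascii_uppercase as ALPHABET


def vigenere_encrypt(plaintext: str, key: str) -> str:
    """Encrypt uppercase A-Z text with a repeating key (standard Vigenère)."""
    out = []
    for i, ch in enumerate(plaintext):
        shift = ALPHABET.index(key[i % len(key)])
        out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
    return "".join(out)


key = "KEY"  # hypothetical 3-letter key
aligned = vigenere_encrypt("THEXXXTHE", key)   # THE at 0 and 6: distance 6 = 2 * 3
shifted = vigenere_encrypt("THEXXXXTHE", key)  # THE at 0 and 7: 7 is not a multiple of 3

print(aligned[0:3] == aligned[6:9])   # True: same plaintext under the same key letters
print(shifted[0:3] == shifted[7:10])  # False: same plaintext under different key letters
```

Only the aligned case produces a repetition in the ciphertext, which is why the distances between genuine repetitions are multiples of the key length.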

Toy example. Take the ciphertext OWHRWNEKAIYWHRTGZAOWHRDP, enciphered with a key of unknown length. Counting positions from 0, the sequence WHR appears three times, at positions 1, 11 and 19, and the longer OWHR appears twice, at positions 0 and 18. The distances are 10, 8 and 18, and GCD(10, 8, 18) = 2. That points to a very short key, but on 24 characters a single accidental repetition is enough to distort the GCD. Distances alone aren’t enough when you have few repetitions; you then combine them with the IC to confirm.
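Positions in a toy example like this are easy to miscount by hand, so here is a minimal sketch that recomputes them with 0-based indexing. The helper `find_all` is a name made up for this example.

```python
from functools import reduce
from math import gcd

ciphertext = "OWHRWNEKAIYWHRTGZAOWHRDP"


def find_all(text: str, pattern: str) -> list[int]:
    """Return every 0-based start position of `pattern` in `text` (overlaps allowed)."""
    return [i for i in range(len(text) - len(pattern) + 1) if text.startswith(pattern, i)]


whr = find_all(ciphertext, "WHR")    # [1, 11, 19]
owhr = find_all(ciphertext, "OWHR")  # [0, 18]
distances = [b - a for a, b in zip(whr, whr[1:])] + [owhr[1] - owhr[0]]
print(distances, reduce(gcd, distances))  # [10, 8, 18] 2
```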

On a longer text (200+ characters), repetitions accumulate, the common factor stabilizes, and the key length usually stands out clearly.

Once the length is known

Here’s where Vigenère collapses. If the key length is L, you slice the ciphertext into L sub-messages:

Sub-message 1: characters at positions 1, 1+L, 1+2L, 1+3L…
Sub-message 2: characters at positions 2, 2+L, 2+2L, 2+3L…
… and so on up to L.

Each sub-message has been encrypted with a single key letter — i.e. with a Caesar cipher. Frequency analysis on each sub-message yields its key letter in minutes. You reconstruct the full key, then decipher the message.
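The slicing and the per-column frequency analysis can be sketched as below. Several things here are assumptions rather than part of the method as described: the text is uppercase A-Z English, each sub-message is non-empty, the frequency table holds approximate values (tables vary slightly by corpus), and a chi-squared score stands in for the by-eye comparison a human analyst would make. Note that the code numbers positions from 0, whereas the list above counts from 1.

```python
from string import ascii_uppercase as ALPHABET

# Approximate English letter frequencies in percent, A..Z (assumed values).
ENGLISH_FREQ = [8.17, 1.49, 2.78, 4.25, 12.70, 2.23, 2.02, 6.09, 6.97, 0.15, 0.77,
                4.03, 2.41, 6.75, 7.51, 1.93, 0.10, 5.99, 6.33, 9.06, 2.76, 0.98,
                2.36, 0.15, 1.97, 0.07]


def split_columns(ciphertext: str, key_length: int) -> list[str]:
    """Sub-message i holds the characters at positions i, i+L, i+2L, ... (0-based)."""
    return [ciphertext[i::key_length] for i in range(key_length)]


def best_shift(column: str) -> int:
    """Caesar shift (0-25) whose decryption looks most like English (lowest chi-squared)."""
    n = len(column)
    best, best_score = 0, float("inf")
    for shift in range(26):
        counts = [0] * 26
        for ch in column:
            counts[(ALPHABET.index(ch) - shift) % 26] += 1
        score = sum((counts[i] - n * ENGLISH_FREQ[i] / 100) ** 2 / (n * ENGLISH_FREQ[i] / 100)
                    for i in range(26))
        if score < best_score:
            best, best_score = shift, score
    return best


def recover_key(ciphertext: str, key_length: int) -> str:
    """One key letter per sub-message, each found by an independent Caesar analysis."""
    return "".join(ALPHABET[best_shift(col)] for col in split_columns(ciphertext, key_length))


def vigenere_decrypt(ciphertext: str, key: str) -> str:
    """Undo a standard Vigenère encryption with the given key."""
    return "".join(ALPHABET[(ALPHABET.index(c) - ALPHABET.index(key[i % len(key)])) % 26]
                   for i, c in enumerate(ciphertext))
```

On short sub-messages the lowest score is not always the right letter, so it is worth checking that the decrypted text actually reads as language before trusting the key.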

Combined with Friedman’s index of coincidence (1922), which statistically confirms the length, the Kasiski method makes Vigenère crackable by hand in a matter of hours. Today, a 50-line Python script automates the whole thing.
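As an illustration of the confirmation step, here is a minimal sketch of the index of coincidence and of its average over the sub-messages for a candidate key length. The function names are chosen for this example, and the input is assumed to be letters only.

```python
from collections import Counter


def index_of_coincidence(text: str) -> float:
    """Probability that two letters drawn at random (without replacement) from `text` are equal."""
    n = len(text)
    if n < 2:
        return 0.0
    counts = Counter(text)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def average_column_ic(ciphertext: str, key_length: int) -> float:
    """Mean IC of the sub-messages obtained by taking every `key_length`-th letter."""
    columns = [ciphertext[i::key_length] for i in range(key_length)]
    return sum(index_of_coincidence(col) for col in columns) / key_length
```

For English, a candidate length whose average sits near 0.067 is a good sign, while values close to 0.038 suggest the sub-messages still mix several alphabets.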

Historical defenses

How do you resist Kasiski?

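Key as long as the message → one-time pad. With a truly random key that is never reused, there is no cycle, so any repetition in the ciphertext is pure coincidence and reveals nothing. But key sharing becomes impractical at scale.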
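Autokey (Vigenère’s own scheme, 1586) → after a short primer, the plaintext itself extends the key, so the key evolves with each character and there is no cycle. Running-key ciphers, which take a long text as the key, avoid the cycle in a similar way. Harder than standard Vigenère, but still crackable by other methods (probable words).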
Modern ciphers (AES, ChaCha20) → they no longer rely on a simple periodic substitution but combine confusion and diffusion over many rounds. The Kasiski test loses all meaning.
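To see why an autokey scheme starves the Kasiski test of material, the sketch below encrypts the same plaintext twice: once with a repeating key and once with an autokey keystream (primer followed by the plaintext). The primer KEY and the plaintext are hypothetical, and this is a textbook rendering of autokey rather than a historical reconstruction.

```python
from string import ascii_uppercase as ALPHABET


def add_letters(p: str, k: str) -> str:
    """Shift plaintext letter p by key letter k (A=0 ... Z=25)."""
    return ALPHABET[(ALPHABET.index(p) + ALPHABET.index(k)) % 26]


def vigenere_encrypt(plaintext: str, key: str) -> str:
    """Repeating-key Vigenère."""
    return "".join(add_letters(p, key[i % len(key)]) for i, p in enumerate(plaintext))


def autokey_encrypt(plaintext: str, primer: str) -> str:
    """Autokey: the keystream is the primer followed by the plaintext itself."""
    keystream = primer + plaintext
    return "".join(add_letters(p, k) for p, k in zip(plaintext, keystream))


plaintext = "THEQUICKSTHE"  # THE at positions 0 and 9
periodic = vigenere_encrypt(plaintext, "KEY")
autokey = autokey_encrypt(plaintext, "KEY")

print(periodic[0:3] == periodic[9:12])  # True: distance 9 is a multiple of the key length 3
print(autokey[0:3] == autokey[9:12])    # False: the second THE is keyed by CKS, not KEY
```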

Key takeaways:

The Kasiski test looks for repetitions in the ciphertext; the GCD of distances reveals the key length.
Discovered by Babbage in 1854 (unpublished), published by Kasiski in 1863.
Once the length is known, you reduce a Vigenère to L independent frequency analyses.
Combined with the index of coincidence, Vigenère is hand-crackable in hours on a sufficiently long ciphertext.
Defense: OTP (random key as long as the message, used once) or modern ciphers (AES).
