Monday 17 October 2016

Difference between Confusion and Diffusion in Cryptography

A ciphertext can be broken by statistical analysis that reveals the frequency of its characters, which can then be compared with the most common characters of a known language. For example, 'e' is the most frequently used letter in English, so a cryptanalyst may match the most frequent character in the ciphertext to the letter 'e' and start attacking the ciphertext from there. Similarly, a digram like 'th' or a trigram like 'the' can also be used, as they are the most common in English. The same method can then be tried with other letters until enough characters have been revealed to break the ciphertext.
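The attack described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the cipher is a Caesar shift (chosen here because it is easy to attack, not because the post discusses it), the helper names `caesar_encrypt` and `guess_shift` are made up for this example, and the message is long enough for 'e' to be its most frequent letter.

```python
from collections import Counter
import string


def caesar_encrypt(plaintext, shift):
    """Shift every letter by `shift` positions; leave other characters alone."""
    out = []
    for ch in plaintext.lower():
        if ch in string.ascii_lowercase:
            out.append(chr((ord(ch) - ord('a') + shift) % 26 + ord('a')))
        else:
            out.append(ch)
    return "".join(out)


def guess_shift(ciphertext):
    """Assume the most frequent ciphertext letter stands for 'e'."""
    counts = Counter(ch for ch in ciphertext if ch in string.ascii_lowercase)
    most_common_letter = counts.most_common(1)[0][0]
    return (ord(most_common_letter) - ord('e')) % 26


message = "meet me here, we need the secret letter before eleven"
ciphertext = caesar_encrypt(message, 7)
print(guess_shift(ciphertext))  # prints 7, the shift used above
```

The guess works here because the substitution leaves the letter frequencies of the plaintext intact; it would fail on a short message whose most frequent letter is not 'e'.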

For a ciphertext to be secure, statistical or frequency analysis of it must not yield enough information to break it. This can be achieved by providing "confusion" and "diffusion" in the encryption process.

The purpose of confusion is to make the relationship between the ciphertext and the key as complex as possible. The encryption operation should keep that relationship obscured, so that even if the cryptanalyst has some knowledge of the statistics of the ciphertext, it is still difficult to deduce the key.

Claude Shannon proposed that, to make statistical attacks hard, the cryptographer could dissipate the statistical structure of the plaintext into the long-range statistics of the ciphertext. This process is called diffusion. It is achieved when each ciphertext character is affected by many plaintext characters. When that happens, the character frequencies of the ciphertext no longer mirror those of the plaintext.
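A commonly cited way to illustrate this idea is an averaging operation in which each output letter is the sum of k successive plaintext letters modulo 26. The sketch below assumes that scheme, with wrap-around at the end of the message; the function names are hypothetical, and the operation is a demonstration of diffusion only, not a usable cipher.

```python
def letters_to_numbers(text):
    """Map a..z to 0..25, ignoring anything that is not a letter."""
    return [ord(ch) - ord('a') for ch in text.lower() if 'a' <= ch <= 'z']


def diffuse(numbers, k):
    """Each output value is the sum of k successive inputs, modulo 26."""
    n = len(numbers)
    return [sum(numbers[(i + j) % n] for j in range(k)) % 26 for i in range(n)]


def count_changes(a, b):
    """Number of positions at which two equal-length lists differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


original = letters_to_numbers("attackatdawn")
modified = letters_to_numbers("attaczatdawn")  # one letter changed: k -> z

print(count_changes(original, modified))                          # prints 1
print(count_changes(diffuse(original, 4), diffuse(modified, 4)))  # prints 4
```

With k = 4, a single changed plaintext letter alters four output letters, so every output letter depends on several plaintext letters and no longer carries the frequency of any single one.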

In binary block ciphers such as the Data Encryption Standard (DES) and the Advanced Encryption Standard (AES), diffusion can be provided by applying permutations to the plaintext data. The output of the permutations can then be fed to a function that produces the ciphertext. This complicates the statistics of the ciphertext.

In DES and AES, confusion is provided by substitution, while diffusion is achieved by permutation. These will be discussed further in other posts.
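To see how substitution and permutation work together, here is a toy 16-bit round function in Python. It is a minimal sketch, not DES or AES: the 4-bit substitution table, the bit permutation (a transpose of the block viewed as a 4x4 grid of bits), the round keys, and all function names are arbitrary choices made for this illustration.

```python
# Arbitrary 4-bit substitution table (a bijection on 0..15), for illustration.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def substitute(block):
    """Confusion: replace each of the four 4-bit groups using SBOX."""
    out = 0
    for i in range(4):
        nibble = (block >> (4 * i)) & 0xF
        out |= SBOX[nibble] << (4 * i)
    return out


def permute(block):
    """Diffusion: move bit 4*r + c to position 4*c + r."""
    out = 0
    for r in range(4):
        for c in range(4):
            bit = (block >> (4 * r + c)) & 1
            out |= bit << (4 * c + r)
    return out


def encrypt(block, round_keys):
    """Each round: mix in a key, substitute, then permute."""
    for key in round_keys:
        block = permute(substitute(block ^ key))
    return block


def bit_difference(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


keys = [0x1A2B, 0x3C4D, 0x5E6F]  # hypothetical round keys
for rounds in (1, 2, 3):
    c1 = encrypt(0x0000, keys[:rounds])
    c2 = encrypt(0x0001, keys[:rounds])
    print(rounds, bit_difference(c1, c2))
```

The two plaintexts differ in a single bit. After one round the difference is confined to the output of one substitution (at most four bits), but the permutation hands those bits to different groups, so later rounds can spread the change across the whole block.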
