UOW SIM BSSP/DSS/BIT: CSCI361 Assignment 1

This course is taught by A.Prof Willy Susilo. It covers a lot of cryptography and many students are afraid of the subject, but you need to understand the material in order to pass the exam. Assignment 1 has 2 tasks:

Task 1: Apply cryptanalysis to two ciphertexts, one generated using a monoalphabetic substitution cipher and the other using a polyalphabetic cipher (Vigenere cipher)

To do task 1, you will be given a program by Willy - Krypto.EXE. It is a very good program and you will need it for the analysis. A monoalphabetic cipher is a very simple cipher - each letter is replaced by another letter. You are given a ciphertext file, and if you open it, it does not make any sense. Your task is to put back the original letters. Luckily, the text is in English, which means there are only 26 letters in total.

Monoalphabetic Cipher
================

Construct a table from the letter "a" to "z"; we will update it along the way. Since we are trying to crack the cipher without knowing its secret arrangement (the key), we have to guess. But you cannot guess blindly - you need some solid evidence to back up each guess.

=Guess=
In English, the most frequent letter in almost any book is the letter "e". Using this fact, we use the Krypto.EXE program to find the most frequent letter in the ciphertext file, which most likely stands for "e". Update the table so that this letter maps to "e". There are many ways to continue. For example, if a 3-letter word ends with the most frequent letter (originally "e"), then it is likely "the", "are", etc. By replacing one letter at a time, you will manage to fill in the table from "a" to "z". If you are not sure of a word, don't worry - go to the website http://www.oneacross.com to help you guess.
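The counting step can also be sketched in a few lines of Python, in case you want to check what the program reports. This is only an illustration, not part of Krypto.EXE; the function name and the toy ciphertext are made up for the example.

```python
from collections import Counter

def letter_frequencies(text):
    """Return (letter, count) pairs for the letters a-z, most frequent first."""
    letters = [ch for ch in text.lower() if "a" <= ch <= "z"]
    return Counter(letters).most_common()

# Hypothetical toy ciphertext: "meet me here" with e->x, m->q, t->j, h->k, r->z.
sample = "qxxj qx kxzx"
print(letter_frequencies(sample)[0])  # ('x', 5), so "x" is the best guess for "e"
```

On a ciphertext this short the statistics are unreliable; the guess only becomes trustworthy on a file of reasonable length.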

Note: You need to learn how to use the program. The commands that I used in the program:
r = read file
f 1 = frequency for one letter
g = display it as a graph
i = find the index of coincidence (for the English language, IC = 0.065)
l = display the file content
s a b = substitute the letter "a" with "b"
w = write file
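
To see how the table grows guess by guess, here is a small Python sketch that applies a partly filled table and shows the letters that are still unknown as "_". The function name and the toy ciphertext are my own assumptions for illustration; the program's own substitute command does the real work.

```python
def apply_table(ciphertext, table):
    """Replace the ciphertext letters we have guessed; show the rest as '_'."""
    out = []
    for ch in ciphertext.lower():
        if "a" <= ch <= "z":
            out.append(table.get(ch, "_"))
        else:
            out.append(ch)  # keep spaces and punctuation as they are
    return "".join(out)

# Hypothetical toy ciphertext and guesses, built up one at a time.
cipher = "qxxj qx kxzx"
table = {"x": "e"}                    # most frequent letter is probably "e"
print(apply_table(cipher, table))     # _ee_ _e _e_e
table.update({"q": "m", "j": "t"})    # "_ee_ _e" looks like "meet me"
print(apply_table(cipher, table))     # meet me _e_e
```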

Polyalphabetic Cipher
===============

The difference between monoalphabetic and polyalphabetic ciphers is the construction of the key. In a monoalphabetic cipher, each letter is replaced by one other letter and the mapping is unique. Its disadvantage is that it reveals the characteristics of the language: the most frequent letter stays the most frequent letter no matter which letter replaces it. A polyalphabetic cipher solves this problem by flattening the statistics. It uses a user-selected word as the key. The word can be of any length and is repeated again and again to encrypt the message. It sounds promising, doesn't it?

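To break this cipher, we first need to find the period (the length of the keyword). There are two methods - the Kasiski method and the Index of Coincidence. For the latter, the formula below estimates the keyword length K from the IC of the ciphertext and its number of letters n -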

K = 0.027 n / { IC (n-1) - 0.038 n + 0.065 }
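
As a rough check of the estimate, here is a Python sketch that computes the IC of a text and plugs it into the formula. I am using 0.027 as the numerator constant, which is the difference 0.065 - 0.038 between the IC of English and the IC of random text; the function names are my own.

```python
from collections import Counter

def index_of_coincidence(text):
    """Probability that two letters picked at random from text are equal."""
    letters = [ch for ch in text.lower() if "a" <= ch <= "z"]
    n = len(letters)
    if n < 2:
        return 0.0
    counts = Counter(letters)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))

def estimate_key_length(ic, n):
    """Estimate K from the IC of a ciphertext with n letters.

    The estimate is only meaningful when ic lies between about 0.038 and 0.065;
    close to 0.038 the denominator gets tiny and the result blows up.
    """
    return 0.027 * n / ((n - 1) * ic - 0.038 * n + 0.065)

# An IC of 0.065 (plain English statistics) gives a key length of about 1,
# which is what a monoalphabetic cipher looks like.
print(round(estimate_key_length(0.065, 1000), 3))  # 1.0
```

The estimate is a starting point only; treat the result as a hint for which lengths to try, not as the answer.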

=Guess=
You can let the program do the work. Enter the command i 1 in the program and observe the Average value to check whether it is close to 0.065. If it is not, increase the length: enter i 2. Keep increasing the length until the value comes close to 0.065. Once it is close, that length is the length of the keyword.
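
My understanding of what happens behind this command (an assumption, I have not seen the source of Krypto.EXE) is that the ciphertext is split into as many columns as the trial length, the IC of each column is computed, and the average is reported. A minimal Python sketch of that idea, with names of my own choosing:

```python
from collections import Counter

def index_of_coincidence(letters):
    """IC of a list of letters; 0.0 if there are fewer than two letters."""
    n = len(letters)
    if n < 2:
        return 0.0
    counts = Counter(letters)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))

def average_ic(ciphertext, period):
    """Average IC over the columns obtained by taking every period-th letter."""
    letters = [ch for ch in ciphertext.lower() if "a" <= ch <= "z"]
    columns = [letters[i::period] for i in range(period)]
    return sum(index_of_coincidence(col) for col in columns) / period

# Hypothetical toy text that repeats with period 2, so the jump is obvious.
toy = "abababab"
for period in range(1, 4):
    print(period, round(average_ic(toy, period), 3))
```

On real English ciphertext the value to look for is about 0.065; the toy string only shows that the average peaks at the true period.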

Construct a table with as many entries as the length of the keyword. Now we need to guess by observing the ciphertext file. Look for possible 3-letter words. Since the text is in English, such a word is likely "the" or "are". Below is an example - our keyword length is 6 and a portion of the ciphertext is "hvadbuc khm xucciaa...". The word "khm" has 3 letters and we guess it is the word "the". We then modify our keyword so that "khm" becomes "the". We count where "khm" appears - in this case, it starts at the 8th letter in the file (spaces are not counted). Since the keyword length is 6, we divide 8 by 6 and the remainder is 2. This associates the letter "k" with the 2nd letter of our keyword, "h" with the 3rd and "m" with the 4th.
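
The same bookkeeping can be sketched in Python. I am assuming the standard Vigenere convention (a = 0, ciphertext = plaintext + key mod 26) and positions counted in letters only, starting from 1; the function name is hypothetical.

```python
def key_letters_from_guess(cipher_word, plain_word, start_pos, key_len):
    """Return {position in keyword (1-based): key letter} for a guessed word.

    start_pos is the 1-based position of the word's first letter in the
    ciphertext, counting letters only.
    """
    found = {}
    for offset, (c, p) in enumerate(zip(cipher_word, plain_word)):
        # A remainder of 0 means the last letter of the keyword.
        pos = (start_pos + offset - 1) % key_len + 1
        shift = (ord(c) - ord(p)) % 26
        found[pos] = chr(ord("a") + shift)
    return found

# The example from the text: "khm" at the 8th letter, keyword length 6.
print(key_letters_from_guess("khm", "the", 8, 6))  # {2: 'r', 3: 'a', 4: 'i'}
```

So if the guess "the" is right, the 2nd, 3rd and 4th letters of the keyword would be "r", "a" and "i". If the resulting keyword fragment looks like nonsense, the guess was probably wrong.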

Refer to the Vigenere cipher table. Find the associated letter in the table and replace it. Keep updating the keyword table until it is complete. The commands that I used:
r = read file
i 1 = find the index of coincidence with the keyword length set to 1
p = display the ciphertext
S -v xxxx = substitute with the keyword "xxxx"
w = write file
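
If you want to double-check a keyword outside the program, decryption is easy to sketch in Python. The assumptions here are mine: the standard Vigenere table (a = 0), lowercase output, and spaces or punctuation that do not use up a keyword letter.

```python
def vigenere_decrypt(ciphertext, keyword):
    """Decrypt with a repeating lowercase keyword; non-letters pass through."""
    out = []
    k = 0  # number of letters processed so far
    for ch in ciphertext.lower():
        if "a" <= ch <= "z":
            shift = ord(keyword[k % len(keyword)]) - ord("a")
            out.append(chr((ord(ch) - ord("a") - shift) % 26 + ord("a")))
            k += 1
        else:
            out.append(ch)
    return "".join(out)

# Well-known textbook example, not taken from the assignment files.
print(vigenere_decrypt("lxfopv ef rnhr", "lemon"))  # attack at dawn
```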

Task 2: Code the Kama Sutra Substitution Cipher

Your task is to write a C/C++ program for the Kama Sutra substitution cipher. In the 4th century AD, the Indian text "Kama Sutra" proposed a method of encrypting text. The Kama Sutra is written in Sanskrit (an ancient language of India). Kama stands for "desire/sex" and Sutra stands for "manual/method". I guess sex by nature requires a pair, and it relates to the cryptography in the sense of the pairing mechanism.

In the Kama Sutra cipher, each letter of the alphabet is paired with another, unique letter. The ciphertext is produced by replacing each plaintext letter with its pair. For example, if the letters "a" and "b" are a pair, then the plaintext "a" becomes the ciphertext "b" (and "b" becomes "a"). It is very simple, yet the number of possible keys for the English alphabet is surprisingly high, around 7.9 x 10^12. An attack such as exhaustive search would have been infeasible at the time the scheme was suggested. This cipher was described by the Brahmin scholar Vatsyayana in the 4th century AD.
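
The figure of about 7.9 x 10^12 is the number of ways to split 26 letters into 13 unordered pairs, i.e. 26! / (2^13 x 13!). A quick Python check, with a function name of my own:

```python
from math import factorial

def pairing_count(n_letters=26):
    """Number of ways to split n_letters (even) into unordered pairs."""
    pairs = n_letters // 2
    return factorial(n_letters) // (2 ** pairs * factorial(pairs))

print(pairing_count())  # 7905853580625, roughly 7.9 x 10^12
```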

=Coding=

This is very easy. It just requires randomizing the key generation and replacing letters based on the key.
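
Here is a minimal sketch of the idea in Python (the assignment itself asks for C/C++, so treat this as pseudocode for the logic rather than something to submit). The function names and the way case is handled are my own choices, not requirements from the assignment.

```python
import random
import string

def generate_key(rng=None):
    """Shuffle the alphabet and pair the first 13 letters with the last 13."""
    rng = rng or random.Random()
    letters = list(string.ascii_lowercase)
    rng.shuffle(letters)
    key = {}
    for a, b in zip(letters[:13], letters[13:]):
        key[a] = b  # each letter maps to its partner...
        key[b] = a  # ...and the partner maps back
    return key

def kama_sutra(text, key):
    """Swap every letter with its partner; the same call encrypts and decrypts."""
    out = []
    for ch in text:
        low = ch.lower()
        if low in key:
            swapped = key[low]
            out.append(swapped.upper() if ch.isupper() else swapped)
        else:
            out.append(ch)  # digits, spaces and punctuation are left alone
    return "".join(out)

key = generate_key()
secret = kama_sutra("Hello, World!", key)
print(secret)
print(kama_sutra(secret, key))  # applying the key again restores the text
```

Because every pair works in both directions, there is no separate decryption routine: running the ciphertext through the same function with the same key gives back the plaintext.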

=Cryptanalysis=
Similar to the monoalphabetic cipher in task 1.
