CS50 PSet 2: Substitution

A guide to the ‘Substitution’ problem in CS50 Week 2.

Goal: To write a program in C that implements a substitution cipher, as shown below.

$ ./substitution JTREKYAVOGDXPSNCUIZLFBMWHQ
plaintext: HELLO
ciphertext: VKXXN

The program must encrypt only the letters, irrespective of whether they are upper or lower case.

$ ./substitution VCHPRZGJNTLSKFBDQWAXEUYMOI
plaintext: hello, world
ciphertext: jrssb, ybwsp

It must accept exactly one command line argument: the key to be used for the substitution. This key must be exactly 26 characters long, contain only letters and be treated as case insensitive.

It must then ask for the plain text and return the encrypted cipher text.
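To make the mapping concrete, here is a minimal sketch of how a key lines up against the alphabet, using only upper case letters. The names ALPHABET, map_upper and map_word are hypothetical and are not part of the program built in this guide; the sketch also assumes the key has already been validated.

```c
#include <string.h>

// Hypothetical names for illustration only.
static const char ALPHABET[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

// Map one upper case letter through the key: find its position in the
// alphabet, then read the key at that same position.
// Assumes key holds 26 upper case letters and c is an upper case letter.
char map_upper(const char *key, char c)
{
    const char *pos = strchr(ALPHABET, c);
    return key[pos - ALPHABET];
}

// Encrypt an all upper case word into out, which must have room for
// strlen(word) + 1 characters.
void map_word(const char *key, const char *word, char *out)
{
    size_t n = strlen(word);
    for (size_t i = 0; i < n; i++)
    {
        out[i] = map_upper(key, word[i]);
    }
    out[n] = '\0';
}
```

With the first key above, H sits at index 7 of the alphabet and index 7 of the key is V, which is why HELLO begins with V in the cipher text.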

First we import the required libraries and define a constant N, which is the required length of the key, and a string of characters in the alphabet.

#include <stdio.h>
#include <cs50.h>
#include <ctype.h>
#include <string.h>

#define N 26 // Required length of the key
const char alphabet[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

Next, declare the main() function and ensure it accepts exactly one command line argument.

int main(int argc, string argv[])
{
    // Check for correct number of args
    if (argc != 2)
    {
        printf("Please provide one command line argument only.\n");
        return 1;
    }
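The comparison is against 2 rather than 1 because argc also counts the program name, which is stored in argv[0]. As a small sketch of that rule in isolation, the hypothetical helper key_from_args below returns the key only when exactly one argument follows the program name. It uses plain char * in place of CS50's string so that it compiles without cs50.h.

```c
#include <stddef.h>

// Hypothetical helper: returns the key when exactly one argument follows
// the program name, otherwise NULL.
const char *key_from_args(int argc, char *argv[])
{
    // argv[0] is the program name, so one user argument means argc == 2
    return (argc == 2) ? argv[1] : NULL;
}
```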

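If this check passes, the program can continue and validate the command line argument itself. The key must meet the following conditions:

  1. It must be exactly N (26) characters long.
  2. Only letters can be used.
  3. There must be no repeated letters.

The length is checked first using strlen(), which also guarantees that the loop below never visits more than N characters. I then declared an integer array of length N called letters, which will record the letters seen so far in the upcoming loop.

The key can then be looped through character by character, first checking that only letters are used with a few comparison operators and an if statement.

    // Check length of key
    if (strlen(argv[1]) != N)
    {
        printf("Key must contain %i characters.\n", N);
        return 1;
    }

    // Check validity of key content
    int letters[N];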
    for (int i = 0, n = strlen(argv[1]); i < n; i++)
    {
        // Check only letters are used
        if (!((argv[1][i] >= 'a' && argv[1][i] <= 'z') ||
              (argv[1][i] >= 'A' && argv[1][i] <= 'Z')))
        {
            printf("Key must contain only letters.\n");
            return 2;
        }

If this check passes, I then converted each lowercase letter to uppercase using toupper(), so that the whole key ends up in uppercase, for reasons which will become clear later.

        // Convert to uppercase
        else if (argv[1][i] >= 'a' && argv[1][i] <= 'z')
        {
            argv[1][i] = toupper(argv[1][i]);
        }
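The range comparisons above can also be expressed with the ctype.h functions that are already included. The sketch below is an alternative to the approach in this guide, not a copy of it: normalize_key is a hypothetical helper, and it takes a plain char * instead of CS50's string.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

// Hypothetical helper: upper-cases key in place and reports whether every
// character was a letter. Stops at the first non-letter it finds.
bool normalize_key(char *key)
{
    for (size_t i = 0, n = strlen(key); i < n; i++)
    {
        // The ctype functions expect a value representable as unsigned char,
        // so cast before calling them.
        unsigned char c = (unsigned char) key[i];
        if (!isalpha(c))
        {
            return false;
        }
        key[i] = (char) toupper(c);
    }
    return true;
}
```

Doing both jobs in one pass keeps the validation and the case conversion together, at the cost of modifying the key even when a later character turns out to be invalid.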

Repeated letters can now be checked for, which I did using a nested for loop and the letters array declared earlier. The inner loop compares the current key letter against the letters stored by earlier iterations, which occupy indices 0 to i - 1. On the first pass of the outer loop nothing has been stored yet, so the inner loop does not run and the check passes. Storing the current letter in the letters array at the end of each outer iteration ensures that any later repeat is caught.

        // Check for repeated letters
        for (int j = 0; j < i; j++)
        {
            if (argv[1][i] == letters[j])
            {
                printf("Key must not contain repeated letters.\n");
                return 3;
            }
        }

        letters[i] = argv[1][i];
    }
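The nested loop compares every letter with all the letters before it. An alternative, sketched here under the assumption that the key is already all upper case letters in an ASCII-style character set, is to keep one flag per letter and mark each letter as it is seen. The helper name has_repeats is hypothetical.

```c
#include <stdbool.h>
#include <string.h>

// Hypothetical helper. Assumes key holds only upper case letters and that
// 'A' to 'Z' are contiguous, as they are in ASCII.
bool has_repeats(const char *key)
{
    bool seen[26] = {false};
    for (size_t i = 0, n = strlen(key); i < n; i++)
    {
        int index = key[i] - 'A'; // 0 for 'A' up to 25 for 'Z'
        if (seen[index])
        {
            return true; // this letter appeared earlier in the key
        }
        seen[index] = true;
    }
    return false;
}
```

This needs only a single pass over the key, because each letter is checked against its own flag rather than against every earlier letter.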

With these checks passed on the command line argument, the program can now ask the user for input to be encrypted. The blank cipher text array is also defined here, with a length of l + 1 to allow space for the null terminator.

    // Ask for plaintext
    string plaintext = get_string("plaintext: ");
    int l = strlen(plaintext);
    char ciphertext[l + 1];

This plain text can now be converted using the encryption key. This is done by looping through each character in the plain text and converting them in sequence. If the plain text character is upper case, its replacement is taken directly from the key, which, if you recall, has been converted to all upper case.

The inner loop searches the alphabet for the plain text character. When it finds a match at index j, the cipher text character is the character at that same index in the encryption key.

    // Convert to ciphertext
    // Loop through each char in plaintext
    for (int i = 0; i < l; i++)
    {
        // Check if uppercase and if so use standard alphabet/key
        if (isupper(plaintext[i]) != 0)
        {
            for (int j = 0; j < N; j++)
            {
                if (plaintext[i] == alphabet[j])
                {
                    ciphertext[i] = argv[1][j];
                    break;
                }
            }
        }

A similar process can be followed if the plain text character is lower case, which is checked using the islower() function. The same search is repeated, this time applying tolower() to both the alphabet and the key, so that the cipher text character is also lower case.

        // If lowercase use lowercase alphabet and key
        else if (islower(plaintext[i]) != 0)
        {
            for (int j = 0; j < N; j++)
            {
                if (plaintext[i] == tolower(alphabet[j]))
                {
                    ciphertext[i] = tolower(argv[1][j]);
                    break;
                }
            }
        }

If neither condition matches, the character is not a letter, so it is copied to the cipher text unchanged.

        // Finally replace non letters
        else
        {
            ciphertext[i] = plaintext[i];
        }
    }
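Because the letters of the alphabet are contiguous in ASCII, the search through the alphabet can be replaced by subtracting 'A' or 'a' to get the index directly. The following is a sketch of that alternative, not the approach used in this guide; encrypt_char and encrypt are hypothetical names, and the key is assumed to be 26 upper case letters that have already been validated.

```c
#include <ctype.h>
#include <string.h>

// Hypothetical helper. Assumes key is 26 upper case letters and an
// ASCII-style character set where 'A'..'Z' and 'a'..'z' are contiguous.
char encrypt_char(const char *key, char c)
{
    unsigned char u = (unsigned char) c;
    if (isupper(u))
    {
        return key[u - 'A'];
    }
    if (islower(u))
    {
        return (char) tolower((unsigned char) key[u - 'a']);
    }
    return c; // not a letter: leave unchanged
}

// Encrypts plain into out, which must have room for strlen(plain) + 1 chars.
void encrypt(const char *key, const char *plain, char *out)
{
    size_t n = strlen(plain);
    for (size_t i = 0; i < n; i++)
    {
        out[i] = encrypt_char(key, plain[i]);
    }
    out[n] = '\0';
}
```

The trade-off is that this version relies on the character encoding, whereas the alphabet search makes no such assumption.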

Finally, add the null character to the end of the cipher text to make it a string, and print the result.

    // Add null char to make it a string, then print the result
    ciphertext[l] = '\0';
    printf("ciphertext: %s\n", ciphertext);
}

And that’s substitution, our first but definitely not last exposure to using command line arguments and all that they entail.