But if you want to do it yourself…
Awesome! Okay, let’s talk a little bit about how to do that. First, let’s break the problem down into steps, based on the book. Cat says she:
- converts the base pairs into binary…
- then ASCII…
- and finally into a string of alphanumeric letters.
Let’s talk through these steps one at a time. Depending on what language you’re coding in, you might combine or reorder some of these steps. But let’s step through them in this order anyway.
Converting the base pairs to binary
Cat says she converts the “base pairs into binary”. For information on binary numbers, check out this cool site. But how does Cat convert to binary, when there are four bases in DNA (G, T, C, and A) and only two digits in binary (0 and 1)? Is there any reason you can think of to put the letters into two groups? (Hint: they’re called base pairs for a reason…)
Once you’ve figured out which letters go together, you’ll have to build a mapping from each letter to either 1 or 0. I’m going to tell you now that A maps to 0, and C maps to 1. So if we have a string of letters: ACACACAC, and A -> 0, and C -> 1, then this would be converted to the binary number 01010101.
Your first job in decoding the message is figuring out the rest of this mapping, and then thinking about what commands you would use in your preferred language to change the letters in a long string of text. At the end of this step, you should have a long string of binary digits, e.g. 010101010101010101010101 and so on.
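If you happen to be working in Python, here is a minimal sketch of the letter swap. It only fills in the two mappings given above (A -> 0 and C -> 1); the names `BASE_TO_BIT` and `bases_to_bits` are made up for this example, and the G and T entries are deliberately left for you to work out.

```python
# Partial mapping taken from the text above.
# The entries for G and T are left for you to figure out and add.
BASE_TO_BIT = {"A": "0", "C": "1"}


def bases_to_bits(dna):
    """Replace each base letter with its binary digit, in order."""
    return "".join(BASE_TO_BIT[base] for base in dna)


print(bases_to_bits("ACACACAC"))  # 01010101
```

Until you add G and T to the dictionary, any string containing those letters will raise a `KeyError`, which is a handy reminder that the mapping is not finished yet.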
Converting binary to ASCII
Again, for information on binary numbers, check out this cool site. For information on ASCII, you can check out this table here.
For this step in the decoding process, you’ll need to split up your long string of binary digits into smaller sections. One thing you’ll notice about ASCII is that there are 128 characters in the ASCII system. This is because each ASCII code is a 7-bit number, and 7 bits give 2^7 = 128 possible values. In practice each character is stored in 8 “bits” (one byte), in the form of an 8-digit binary number whose first digit is 0. So once you have a long string of binary, you’ll need to chop it up into sections of 8 digits, and then turn those 8-digit numbers into decimals. This step requires you to understand binary numbers, so check out the link above!
For example, the string: 010101010101110001011111 would be split into three numbers: 01010101, 01011100, and 01011111. Depending on what language you’re coding in, you might split this text up in a loop over all characters, a loop over every eight characters, or you may be able to use a command to split the long string into shorter ones in one step.
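As one possible way to do the split in Python, the sketch below steps through the string eight characters at a time using slicing. The function name `split_into_bytes` is hypothetical, and the sketch assumes the length of the string is a multiple of 8.

```python
def split_into_bytes(bits):
    """Cut a string of binary digits into consecutive 8-character pieces."""
    # range(0, len(bits), 8) gives the start index of each piece: 0, 8, 16, ...
    return [bits[start:start + 8] for start in range(0, len(bits), 8)]


chunks = split_into_bytes("010101010101110001011111")
print(chunks)  # ['01010101', '01011100', '01011111']
```

If the string is not a multiple of 8 characters long, the last piece will simply be shorter, which usually means a letter was dropped somewhere in the earlier step.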
Once you are dealing with 8-digit binary numbers, they can be re-written as decimal numbers: 01010101, 01011100, and 01011111 become 85, 92, and 95. When you’re done with this step you should be dealing with numbers between 0 and 127. If you take a look at the ASCII table linked above, you can see we’re almost done.
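In Python, the built-in `int` function accepts a base as its second argument, so the conversion can be sketched like this. The variable names are just for illustration, and the second half spells out the place-value arithmetic by hand so you can see what `int` is doing.

```python
chunks = ["01010101", "01011100", "01011111"]

# int(text, 2) reads the text as a base-2 number.
codes = [int(chunk, 2) for chunk in chunks]
print(codes)  # [85, 92, 95]

# The same thing by hand for the first chunk: the rightmost digit is worth 1,
# the next 2, then 4, 8, 16, and so on. Add up the values where the digit is 1.
by_hand = sum(
    int(digit) * 2 ** power
    for power, digit in enumerate(reversed("01010101"))
)
print(by_hand)  # 85  (64 + 16 + 4 + 1)
```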
Converting ASCII to Alphanumeric Letters
This is the last step! By now you should be working with numbers like 85, 92, 95, etc. For this last step, you’ll need to use a built-in function in your preferred language to convert these numbers to letters. For instance, 85 -> “U”, 92 -> “\”, and 95 -> “_”. To find the command for your preferred language, Google is your friend! At the end of this step, you should have the full text!
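In Python, for example, that built-in function is `chr`. A minimal sketch using the three example numbers from above (the variable names are only for illustration):

```python
codes = [85, 92, 95]

# chr turns a character code into the matching one-character string.
letters = [chr(code) for code in codes]

# Glue the individual characters together into the final text.
text = "".join(letters)
print(text)  # U\_
```

Other languages have their own equivalent, so it is still worth searching for the one that matches yours.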