In 1985 I got a book out from the school library. I can’t remember the title, but it was something to do with code making. I remember it talked about Pigpen cipher which interested me. Ever since reading that book I have been fascinated with all things cryptographic. In 2001 I read a book which is my favourite of all time “Applied Cryptography Second Edition” by Bruce Schneier. Prior to reading this I had done what many people thing is an easy task, invented my own cipher system. I later discovered how rubbish this system was. In fact there is something like 10 standard terrible systems that almost everyone who thinks cryptography is easy ends up “inventing”. This book explained a very important truth: cryptography is hard.
Its very easy to assume that its easy to make a good crypt system, after all it appears your algorithm completely scrambles up the original message, so how could anyone ever hope to recover it? Well Arthur Scherbius obviously thought this when he created the Enigma machine, but history tells us how that was broken and Winston Churchill told King George VI that it was due to the success of breaking Enigma that World War 2 was won. The lesson to be learnt is that there are some incredibly smart people who study these things and can find weaknesses. Therefore unless you are equally as smart, you probably shouldn’t try inventing your own. Of course it depends what you want the cryptography for. If its just for fun, then of course feel free to make up your own system. Just make sure it doesn’t end up being used to “secure” some important system in the future.
A lot of cipher systems have quite complex code to look at. In fact, not only is it recommended not to invent your own cipher, it is usually recommended against creating your own implementation of an existing cipher, because it can be easy to make a mistake. However one cipher stands out as an exception to this: RC4.
RC4 was created by Ron Rivest in 1987. It was originally a trade secret, back in the days when people thought obscurity made things more secure. It was leaked in 1994, and has become common place in systems throughout the world. The beauty of RC4 is its simplicity. The code is incredibly small, and yet it produces a very good pseudo random number stream. Because it is some small, its fairly hard to mess up when implementing, and its not hard to test the output anyway against some known values.
RC4 is known as a stream-cipher, or otherwise a pseudo-random generation algorithm (PRGA). stream ciphers have a simple mode of operation. You initialise a generator with a key, then you can output as many bytes of the stream as you wish. A stream cipher does not actually encrypt anything on its own, it simply outputs a stream of bytes. Typically this stream of bytes is XORed onto the plain text to produce the cipher text. As many bytes as there are in the plain text is produced by the stream. Basically you are using the stream to create a one-time pad, which is applied to the plain text.
RC4 has always been my favourite cipher because of its simplicity. It is also very fast. Unfortunately it tends to have a bad reputation as being a broken cipher. This is quite unfair. RC4 is still a good system. It has a few weaknesses that are easily overcome. Other than that it just has to be used correctly. The inventors of WEP encryption for WiFi have caused the most damage to RC4′s name. The significant thing about RC4 is that it is a stream cipher, which is essentially a tool for generating one-time pads. The critical thing about one-time pads is that you only ever use them once. In the case of WEP they effectively use the same pads over and over again.
The poor use of RC4 in WEP is not a weakness in RC4, and would have applied with any stream cipher being used in the same manner. RC4 does have some intrinsic faults of its own though. In particular the first few bytes of the stream are slightly biased. This is easily overcome by discarding the first few bytes of the stream after initialising it. This system is known as RC4-dropN. Where N is the number of bytes to discard. RC4-drop256 is a commonly recommended value. Also RC4 output can be distinguished from random after about 1 GByte. If using RC4 to encrypt long streams, it would be good to rekey the system after about 1 Gbyte of output.
The key schedule of RC4 is not as strong as it could be, in particular related keys are not great for use. So implementing a system where you have a fixed portion of key, and then append on an index value for the complete key is not a good idea. The solution to this is don’t do that. If you need a system with has related keys, then perform a hash of the key first then feed it into the RC4 generator.
So in summary, if you follow the following rules, then RC4 can be good for you.
- Never use the same stream (ie generated from same key) more than once. EVER.
- Discard at least the first 256 bytes after generating the stream.
- Do not use the key for a stream more than 1Gbyte of output.
- Do not use related keys. Preferably hash the key first.
I have written a module in C that initialises an RC4 stream and outputs the stream bytes. It consists of two functions: Rc4Initialise and Rc4Output, and it uses a context type calledRc4Context.
//////////////////////////////////////////////////////////////////////////////// // Rc4Initialise // // Initialises an RC4 cipher and discards the specified number of first bytes. //////////////////////////////////////////////////////////////////////////////// void Rc4Initialise ( Rc4Context* Context, void* Key, uint32_t KeySize, uint32_t DropN ); //////////////////////////////////////////////////////////////////////////////// // Rc4Output // // Outputs the requested number of bytes from the RC4 stream //////////////////////////////////////////////////////////////////////////////// void Rc4Output ( Rc4Context* Context, void* Buffer, uint32_t Size );
I have a sample program that uses it called Rc4Output which I have written for Windows and OSX. This is a simple command line tool that outputs (in hex) a specified number of bytes of an RC4-dropN stream that has been initialised with the provided ascii key. This can be used to verify the output against known values.
The following example outputs the first 16 bytes of an RC4 stream (with no drop) keyed with the ascii value “waterjuice.org”
>Rc4Output.exe waterjuice.org 16 0 3a32db5844a2bc37491a6108aec488e8
C source code and binaries for RC4
Rc4Output.zip contains the source code for the above program, including the RC4 module. This contains workspace for Visual Studio 10 for Windows, and Xcode 4.6 for OSX.
This is free and unencumbered software released into the public domain.