Daily Archives: December 2, 2017

Setting up a virtual Big Endian environment

I mentioned previously that it is a good idea to be able to actually test your code works on a Big-Endian architecture. I recently updated CryptLib and made it compatible for Big-Endian. The way I tested it was to use QEMU to emulate a Big-Endian MIPS processor running Linux. QEMU creates Virtual Machines, but unlike VMWare (which virtualises the computer but leaves instructions running on the hardware) it emulates the CPU. This is great because we can now run an operating system for a different chip. It is however extremely slow.

There is pretty much a Linux distribution for every CPU ever made, so you should be able to install a version of Linux for any chosen chip emulation in QEMU. I chose to use MIPS because I know that is big-endian and is also the chips used in a bunch of OPEN-WRT routers, so I figured it would be supported.

Installing the OS from an installation .iso takes a long time, however I was lucky enough to find that someone had made a QEMU image of a Debian MIPS installation available on the Internet: https://people.debian.org/~aurel32/qemu/mips/

From that site download the two files:

Install QEMU if you haven’t already got it. Then run the following command

qemu-system-mips -M malta -kernel vmlinux-3.2.0-4-4kc-malta -hda debian\_wheezy\_mips\_standard.qcow2 -append "root=/dev/sda1 console=tty0" -nographic -redir tcp:2222::22 -m 256

This tells QEMU to boot up the image with no graphics terminal, and instead output the console to stdout. This means the output will just appear in your terminal rather than coming up as a separate window.

QEMU will by default give this a NAT network interface. The -redir flag says to redirect the host port 2222 to the virtual machine’s internal 22 (SSH) port. This makes it easy to SSH into the guest from another computer on your network by SSHing to port 2222 of your host machine.

When you run this you will see Linux slowly boot up. After a while you will get a login prompt. This image has been setup with the user accounts:

user (password user) as a standard user account.
root (password root) as root account.

It is a fairly minimal install, but it does have SSH on, which for my purposes was sufficient. as that is enough to copy files onto it (using scp) and then to execute them.

You can add to this image by installing gcc etc, however actually using the VM to build is extremely slow. I did attempt to do it, but it is not worth the effort. For example to install cmake required building it from source. That took 24 hours!

Instead of trying to build on the VM, the best approach is to build on your actual computer using a cross-compiler.

So if you are using Linux then it is easy, simply run

sudo apt-get install gcc-mips-linux-gnu g++-mips-linux-gnu

to install a MIPS cross compiler.

If you use CMake then it is very easy to use a ToolChain file that is setup to use the mips compiler. Here is my toolchain file. I call it mips.cmake

# toolchain for compiling Linux for MIPS

set(CMAKE_C_COMPILER mips-linux-gnu-gcc)
set(CMAKE_CXX_COMPILER mips-linux-gnu-g++)

# Compiler settings
set( CMAKE_C_FLAGS "-Wall -Werror -Wno-deprecated-declarations" CACHE STRING "" )

 message(STATUS "No build type selected, default to Debug")

# Set install location

Most of this is just my other settings I use. The important part for cross compiling is the first block where it sets up the C compiler.

The following cmake command generates a build system for a project and then builds it:

cmake -H. -Bbuild/mips -DCMAKE_TOOLCHAIN_FILE=mips.cmake
cmake --build build/mips --target install

This will generate the build system in the directory build/mips.

One your program is built you can scp it over to the mips VM and run it. I used this to test CryptoLib. The MD5 and SHA1 algorithms were broken and the test vectors failed in Big-Endian originally until I fixed them up.



Why care about Big Endian?

Since the demise of PowerPC and Sparc, there are not a lot of Big Endian machines around for most people to access. These days the apart from some large mainframes, the only Big Endian machines left tend to be network routers using MIPS (and even then a lot of MIPS devices are run in Little Endian mode).

Why should you care about Big Endian? Probably for the most part you don’t ever need to. It seems likely that Little Endian has won. However you may not wish to limit yourself. You never know when your project needs to be ported to a different system that you didn’t anticipate.

When I started my programming job 15 years ago the only thing my company cared about was Windows 32 bit. Windows 2000 was the OS of choice. Mac and Linux were irrelevantly small and mobile platforms did not exist. Even 64 bit x64 had not shown up yet. Coding was easy, and fully centred around Windows APIs. Unfortunately this led to some poor assumptions made and bad programming practices. A few years later we needed to support the new x64 (Back when it was called AMD64). That took about 6 months of updating our code base. There were various places where assumptions had been made that a void* (PVOID in Windows.h terms) could be happily exchanged with a DWORD or ULONG (Windows terms for 32 bit unsigned ints).

After we had cleared the 64 bit hurdle we were then set for many years. Cross platform compatibility to us meant Windows with x86 or x64. Then along came the mobile platforms running on ARM chips. The big players were iOS and Android. Both of which have a unix ancestry. Before these were relevant platforms for us to code on we had started new projects and had anticipated that we would need to be cross platform in the future. We wrote a large compatibility library rather than directly using Windows.h and its APIs. We did however get this quite wrong. Our compatibility library was all based around the way things work in Windows and when we finally came to start porting it to other platforms we discovered that again we had assumptions that didn’t hold. Now currently, in 2017, we build for Windows, MacOS, Linux, iOS, and Android with x86, x64, Arm, and Arm64 architectures. Build time has certainly increased! We still have to live with some of the early poor assumptions we made (such that assuming that using Windows wide chars (UTF16) would be future proofing ourselves!) but overall now have a pretty good handle of working over multiple platforms and architectures.

My point after this long ramble, is that you never know where your project will go and what kind of things will be important in the future. Maybe in 10 years time, the new popular computers are all Big Endian. So you might just want to make sure your code works on it to save yourself headaches down the line.

Some people say you should never worry about Endianess and always write in an Endian-netural manner. This maybe good advice, but it is incredibly easy to slip up. It just takes you writing a number to disc straight out of memory to have made  the code no longer Endian-neutral. So without actually being able to test it, how will you know for sure?

There are two ways of dealing with Endianness that I can see

  1. Never copy memory bytes into words. eg never do the following: memcpy( &my32BitValue, myByteArray, sizeof(uint32_t). Instead always deal byte at a time and assemble your 32bit Value with bit shifts and ORs.
  2. Just wrap everything in htons and ntohs style wrappers before using them.

You will still want to test that it actually worked though. This is the rather tricky part as you most likely won’t have access to a Big Endian machine. However it is possible to emulate one using QEMU. In the next article I will give some concise instructions to get you up and going with an emulated MIPS Linux setup.


CryptLib 2.0.0

After a four year gap, I have updated CryptLib at GitHub. I have now released version 2.0.0.


  • Added AES and AES-CTR modules. AES-CTR conforms to the same counter mode used with AES in OpenSSL.
  • All algorithms now work on Big-Endian architectures.
  • Now uses CMake for building rather than make files and Visual Studio projects. CMake will generate whatever system is required.
  • Input function parameters are now marked const
  • File names have been changed to have the prefix CryptLib_ rather than Lib
  • Various formatting changes to the files.

This is free and unencumbered software released into the public domain.