C library for JSON/JSON5 parsing and writing

I have recently been using JSON in various projects and quite like the format. It is very easy to use in Python and is one of the standard libraries that ship with Python. I decided I wanted to use JSON as configuration files in some of my C projects. Previously I have used XML for configuration and have some convenient libraries for marshalling and unmarshalling the data in an XML file to C structures. I wanted to write a library that would let me swap over to using JSON files without adding complexity to those projects.

In particular I want to be able to have a C structure that contains configuration information and be able to save it to JSON and read it back out without much work. In Python this is very easy as a lot of Python objects can just be written to JSON with a single function. As C doesn’t have any way of inspecting its own structures, some mapping needs to be provided.
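For comparison, this is roughly what the "single function" experience looks like in Python. It is only a sketch: the `Config` structure and its fields are made up for illustration and have nothing to do with JsonLib itself.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical configuration structure, invented for this example.
@dataclass
class Config:
    name: str
    port: int
    verbose: bool

def save_config(cfg):
    # asdict() inspects the dataclass fields at run time, which is
    # exactly the kind of introspection that C lacks.
    return json.dumps(asdict(cfg), indent=2)

def load_config(text):
    # The JSON keys map straight back onto the field names.
    return Config(**json.loads(text))

original = Config(name="server", port=8080, verbose=True)
text = save_config(original)
restored = load_config(text)
print(text)
```

In C the equivalent of `asdict()` has to be written out by hand as a table that maps each JSON key to a field in the structure.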

So over the past few months I’ve worked on creating “JsonLib”. This is a C library that can marshall and unmarshall data to and from JSON. Originally I was just going to write a JSON parser with an extension to allow comments (because having comments is very convenient in configuration files). While looking up what the most standard extensions were for handling comments I came across JSON5. I decided to make my library JSON5 compliant rather than simply JSON with non-official extensions.

By default JsonLib will accept and parse JSON5, however it has a mode which will only accept strict JSON if desired. In general, unless you are writing something to validate JSON, it is better to always accept JSON5 as input. For output the library defaults to strict JSON unless JSON5 is explicitly enabled. This provides the greatest compatibility: by default it can parse anything, and its output is always strict JSON, which anything else will be able to read. It should be noted that JSON5 is a strict superset of JSON, so not enabling JSON5 simply means rejecting input that could otherwise be parsed unambiguously.
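To illustrate what "rejecting input that could otherwise be parsed" means, here is a small sketch using Python's standard `json` module as a stand-in for a strict reader. It does not use JsonLib, and the sample texts are invented.

```python
import json

STRICT_JSON = '{"name": "server", "port": 8080}'

# JSON5 additionally allows comments, unquoted keys and trailing commas.
JSON5_TEXT = """{
    // listening port
    port: 8080,
}"""

def accepts_strict(text):
    """Return True if Python's json module can parse the text."""
    try:
        json.loads(text)
        return True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False

print(accepts_strict(STRICT_JSON))
print(accepts_strict(JSON5_TEXT))
```

The second text is perfectly unambiguous to a human, yet a strict reader refuses it. That is the argument for accepting JSON5 on input while still writing strict JSON on output.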

The library is now available on GitHub

This is free and unencumbered software released into the public domain.

Documentation is rather sparse at the moment, but I wanted to release it anyway. Hopefully there are enough examples in the unit tests to demonstrate its use.

This is a link to the current user guide


I previously talked about OIDs and how they are used as globally unique identifiers through coordination. GUIDs on the other hand require no coordination; anyone can generate a GUID at any time they need. However there are situations where you don’t get a choice and you must use an OID. Getting an OID can be difficult and time consuming if you don’t already have a tree (although I set up a service to get a free instant one here).

The creators of the OID system came up with a solution for creating an instant OID that did not require coordination: create a GUID and turn it into an OID. They set up a very nice spot right up high in the tree for it. 2.25 is the root OID for the tree containing GUIDs. A GUID is a 128 bit number, and this 128 bit number can be written in decimal and then simply appended to 2.25.

For example the GUID {53c08bb6-b2eb-5038-bf28-ad41a08c50ef} can be made into the following OID: 2.25.111325678376819997685911819737516232943

This seems initially like a nice scheme, however it has a big problem. Converting a 128 bit number into decimal is not trivial in all computer languages. In Python it is very simple and can be done in one line:

myOid = '2.25.%u' % int( myGuid.hex, 16 )

However this is not very easy in C as most compilers don’t have native 128 bit math. Additionally some implementations handling OIDs will store them internally using an array of integers, which will also not be able to handle a 128 bit number. Apart from the 2.25 branch, pretty much no other OID subcomponent will be an enormous number. So you can easily get away with just using 32 bit numbers for each OID part and you’ll be able to handle almost any OID presented. Except of course the enormous 2.25 OID.

Microsoft came up with a solution to this problem and reserved a branch in their OID space of 1.2.840.113556.1.8000.2554. Appended onto this is the GUID but broken down into several smaller sub parts. Again with my example GUID of {53c08bb6-b2eb-5038-bf28-ad41a08c50ef} the MS OID is: 1.2.840.113556.1.8000.2554.21440.35766.45803.20536.48936.11354528.9195759

MS provided a VBScript to convert a GUID into an OID of this type. Each component of the OID fits within 32 bits so is easy to handle in any language.

In Python you can create the MS OID with the following code

oidParts = [None] * 7
oidParts[0] = str( int( myGuid.hex[0:4], 16 ) )
oidParts[1] = str( int( myGuid.hex[4:8], 16 ) )
oidParts[2] = str( int( myGuid.hex[8:12], 16 ) )
oidParts[3] = str( int( myGuid.hex[12:16], 16 ) )
oidParts[4] = str( int( myGuid.hex[16:20], 16 ) )
oidParts[5] = str( int( myGuid.hex[20:26], 16 ) )
oidParts[6] = str( int( myGuid.hex[26:32], 16 ) )
myOid = '1.2.840.113556.1.8000.2554.%s' % '.'.join(oidParts)

The MS solution has one big problem as well. Because they gave it such a large prefix the entire OID string is pretty long. And some implementations handling OIDs are known to have a 64 character limit. This is not part of the standard, but an implementation limit. However you won’t be able to use the OID in these systems.
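The trade-off between the two schemes can be checked with a few lines of Python. This is just a sketch; the helper names are my own, and the 64 character figure is the implementation limit mentioned above rather than anything from the standard.

```python
import uuid

MS_PREFIX = "1.2.840.113556.1.8000.2554"

def guid_to_ms_oid(guid):
    """Microsoft form: five 16 bit parts followed by two 24 bit parts."""
    h = guid.hex
    slices = [(0, 4), (4, 8), (8, 12), (12, 16), (16, 20), (20, 26), (26, 32)]
    parts = [str(int(h[a:b], 16)) for a, b in slices]
    return MS_PREFIX + "." + ".".join(parts)

def guid_to_2_25_oid(guid):
    """ISO form: the whole GUID as one decimal number under 2.25."""
    return "2.25.%d" % guid.int

example = uuid.UUID("{53c08bb6-b2eb-5038-bf28-ad41a08c50ef}")
iso_oid = guid_to_2_25_oid(example)
ms_oid = guid_to_ms_oid(example)

for oid in (iso_oid, ms_oid):
    largest = max(int(arc) for arc in oid.split("."))
    print(oid)
    print("  length:", len(oid), " bits in largest part:", largest.bit_length())
```

The 2.25 form is short but contains one part far wider than 32 bits. The MS form keeps every part small but pushes the string past 64 characters.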

What would have been good is if the prefix could have been a tiny one like 2.25. Only the ISO OID committee can allocate that, however, so I decided to do what I could to help the situation.

I have reserved the following OID for the use of converting a GUID to an OID:

My scheme is simpler than the MS one. The GUID is broken into two 64 bit numbers and then appended. So again my example GUID {53c08bb6-b2eb-5038-bf28-ad41a08c50ef} becomes the OID:

This OID is less than 64 characters long and it also can be created easily with any language that can handle 64 bit numbers.

Python code for this OID is:

oidParts = [None] * 2
oidParts[0] = str( int( myGuid.hex[0:16], 16 ) )
oidParts[1] = str( int( myGuid.hex[16:32], 16 ) )
myOid = '' % '.'.join(oidParts)

This OID might still have a problem on systems that store each part of an OID as a 32 bit number. For that case I have also reserved the OID for breaking the GUID into four 32 bit numbers. So again my example GUID {53c08bb6-b2eb-5038-bf28-ad41a08c50ef} becomes the OID:

This is also under 64 characters in length and each sub part can be stored in 32 bits.

Python code for making the 32 bit compatible OID:

oidParts = [None] * 4
oidParts[0] = str( int( myGuid.hex[0:8], 16 ) )
oidParts[1] = str( int( myGuid.hex[8:16], 16 ) )
oidParts[2] = str( int( myGuid.hex[16:24], 16 ) )
oidParts[3] = str( int( myGuid.hex[24:32], 16 ) )
myOid = '' % '.'.join(oidParts)

Finally for completeness I decided to reserve the OID for representing a GUID using eight 16 bit parts. This time my example GUID {53c08bb6-b2eb-5038-bf28-ad41a08c50ef} becomes the OID:

Note however this final form is also over 64 characters in length so offers no advantage over the Microsoft version.

I have created a script that creates all 5 forms of OIDs from a GUID.

This script is public domain and you are free to use it how you wish. I’m hopeful that this will mean people can create OIDs from GUIDs that are fully useable in any implementation.
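The three equal-width forms differ only in how many pieces the GUID is cut into, so they can share one function. The following is a sketch and not the script itself: the function names are mine, and `<prefix>` is a placeholder rather than a real OID, to be replaced with the branch reserved for the chosen width.

```python
import uuid

def split_guid(guid, num_parts):
    """Split a 128 bit GUID into num_parts equal-width decimal strings."""
    if num_parts not in (1, 2, 4, 8):
        raise ValueError("num_parts must be 1, 2, 4 or 8")
    width = 32 // num_parts                 # hex digits per part
    h = guid.hex
    return [str(int(h[i:i + width], 16)) for i in range(0, 32, width)]

def guid_to_oid(prefix, guid, num_parts):
    """Append the split GUID to a caller-supplied OID prefix."""
    return prefix + "." + ".".join(split_guid(guid, num_parts))

example = uuid.UUID("53c08bb6-b2eb-5038-bf28-ad41a08c50ef")
PLACEHOLDER = "<prefix>"   # not a real OID; substitute the reserved branch

for n in (2, 4, 8):        # 64 bit, 32 bit and 16 bit parts
    print(guid_to_oid(PLACEHOLDER, example, n))
```

With `num_parts` set to 1 and a prefix of 2.25 the same function produces the ISO form.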


Quantum GUIDs!

I recently discovered the following web site operated by the Australian National University (ANU): https://qrng.anu.edu.au

This is a random number generator using quantum physics. This should produce the purest of random numbers. From their web site you can get random numbers in various formats and they also provide a simple API that returns the values in JSON format.

I wrote a Python script to collect random values from the site and produce Type 4 GUIDs from them. These GUIDs will have 122 bits of pure randomness. Quantum random!

The python file will generate between 1 and 1000 GUIDs specified on the command line.

python3 QuantumGuid.py 10
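Turning 16 random bytes into a type 4 GUID only requires forcing the version and variant bits. The sketch below shows that step using Python's `uuid` module. It is not the QuantumGuid script: `os.urandom` is a stand-in for bytes fetched from the ANU service, and the function name is my own.

```python
import os
import uuid

def guid_from_random_bytes(raw):
    """Build a type 4 GUID from exactly 16 random bytes."""
    if len(raw) != 16:
        raise ValueError("need exactly 16 bytes")
    # version=4 overwrites the 4 version bits and the 2 variant bits,
    # leaving the other 122 bits exactly as supplied.
    return uuid.UUID(bytes=raw, version=4)

# Stand-in random source for the example.
guid = guid_from_random_bytes(os.urandom(16))
print(guid)
```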

These GUIDs feel so much more random than a regular type 4 version that I really feel they should be in their own type space so as to not be contaminated with regular non-quantum GUIDs ;-). Perhaps they could be type 8 GUIDs? (type 6 is already in ad hoc use, and I already have an idea for type 7!)

If you want to get your own, then download the script

This is free and unencumbered software released into the public domain.

Personal GUIDs – Fixed

Six years ago I wrote the article Personal GUIDs. This introduced a technique for assigning a unique unchanging type 5 GUID to each person on the planet. It made a string of the format:


which in theory should be unique for everyone, and turned it into a type 5 GUID using the Namespace GUID:


(Read https://waterjuiceweb.wordpress.com/2013/06/16/type-3-and-5-guids/ for a description of how a type 5 GUID is made). I wrote a C program to generate the GUIDs.

Unfortunately, I made a mistake!

The C code did not use the DateOfBirth value in creating the string to be hashed. This meant only the first 5 fields were used. As a result the GUIDs produced did not match the description. Also it would mean two people born in the same place with same name but at different times could generate the same GUID.
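The effect of the bug is easy to reproduce with Python's `uuid.uuid5`. This is a sketch only: the namespace is the value given in this article, but the "/" separator and the field order are assumptions made for the example, not necessarily the layout PersonalGuid uses.

```python
import uuid

# Namespace GUID as given in this article.
NAMESPACE = uuid.UUID("5b390b3f-9a62-508a-b235-6e6e8d270720")

def personal_guid(fields):
    # Hypothetical layout: the fields joined with "/".
    name = "/".join(fields)
    return uuid.uuid5(NAMESPACE, name)

person_a = ["Doe", "John", "m", "Australia", "Sydney", "19700101"]
person_b = ["Doe", "John", "m", "Australia", "Sydney", "19851231"]

# With all six fields hashed, the two people get different GUIDs.
print(personal_guid(person_a) == personal_guid(person_b))
# Dropping the sixth field (the bug) makes them collide.
print(personal_guid(person_a[:5]) == personal_guid(person_b[:5]))
```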

I only just noticed the error while I was including the code into a new library I’m producing (Stay tuned for WjGuid coming soon!) and I was testing the results against an online type 5 GUID generator.

I have fixed the code and released PersonalGuid 1.0.1. The latest version is available here:

 >PersonalGuid.exe Doe John m Australia Sydney 19700101

To manually produce the personal guid without this tool you can use a site such as https://www.toolswow.com/generator/uuid-v5

For NameSpace mode use UUID and use the value “{5b390b3f-9a62-508a-b235-6e6e8d270720}”.

Screenshot from https://www.toolswow.com/generator/uuid-v5

I additionally made a Python version that will work in Python 2 or 3. The python file is inside the .zip, but I have included it here as well.




Making a basic web site look better

TL;DR: Here is a repo https://github.com/WaterJuice/BottleMaterializeTemplate which contains all you need to use Materialize and Bottle in a self-contained host using only Python.

Recently I wrote about OIDs and made a basic website that would issue out a new OID to anyone who wanted one without any registration required. (Here)

I don’t pretend to be a web designer but I think it is safe to say the original site was pretty ugly:

Screen Shot 2019-08-23 at 6.08.10 pm.png

I wanted something simple to “make it look good” without requiring me to go to any particular great effort or learn a bunch of CSS. Materialize was recommended to me, so I had a look. Using one of the samples provided I was quickly able to make my site look considerably better. The end result:

Screen Shot 2019-08-24 at 2.51.30 pm.png

I think this was considerably better. Additionally it works well on mobile devices.

While making the site I decided to make a template that I could easily use to start any future projects. In particular I wanted to have all the files local so it could be self hosted without Internet access (for example on an internal intranet). I used Bottle, which is a web server framework written in Python. This can be installed with pip, or you can simply copy the single file (bottle.py). Materialize just requires two files (materialize.min.css and materialize.min.js), plus the additional Material fonts which are normally hosted on Google Fonts. If you want to host them yourself you just need a few files: MaterialIcons-Regular.ttf, MaterialIcons-Regular.woff, and MaterialIcons-Regular.woff2, plus an additional CSS file that maps them.

I put all the necessary files together and made a very simple starting example called “bottle_app.py”

All of the files are in a github repo: https://github.com/WaterJuice/BottleMaterializeTemplate

All that is required is python (2 or 3). Simply run

python bottle_app.py

This will start a simple web server running on port 8080. Then browse to http://localhost:8080 (or the IP address of your computer) to see the sample page. This is what it looks like on an iPhone.


The template also contains error pages for common HTTP errors. Eg try browsing to an invalid page and you’ll get a 404 error.

Perhaps this will be useful to you.

Materialize and Bottle are both released using the MIT License, so I have retained the same for my template.

Get the template at https://github.com/WaterJuice/BottleMaterializeTemplate




Oh my OID!

TL;DR: Get a free OID online instantly at https://freeoid.pythonanywhere.com

I have long been interested in GUIDs (UUIDs), which provide a mechanism for unique IDs across multiple domains without any central authority required. The version 1 GUID partitioned a huge 128 bit number space up into unique computer, and also time, domains. This meant GUIDs could be generated anywhere, anytime, at extremely fast rates without any worry of collision. The downside to version 1 GUIDs is that the computer network address (MAC address) is generally encoded into the GUID, which leads to privacy issues. Version 4 GUIDs take the approach that 128 bits (122 actually) is so huge that if everyone just picks random numbers the chance of collision is close to 0 anyway. Version 4 GUIDs are now the most common form and they work great as unique IDs. It is not, however, the only approach to providing globally unique IDs. I recently stumbled across a different scheme called the OID.
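To put a number on "close to 0" for version 4 GUIDs, the usual birthday-bound approximation can be evaluated directly. This is a back-of-the-envelope sketch that assumes all 122 bits are uniformly random.

```python
import math

RANDOM_BITS = 122  # a version 4 GUID has 122 random bits

def collision_probability(count):
    """Birthday-bound approximation for `count` random GUIDs."""
    space = 2 ** RANDOM_BITS
    # p ~= 1 - exp(-n(n-1) / (2 * space)); expm1 keeps precision
    # when the result is extremely small.
    return -math.expm1(-count * (count - 1) / (2 * space))

for count in (10**6, 10**9, 10**12):
    print("%.0e GUIDs -> collision probability about %.3g"
          % (count, collision_probability(count)))
```

Even a billion GUIDs gives a collision probability many orders of magnitude below anything worth worrying about.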

I had been revisiting information on version 3 and 5 GUIDs and looking to see what pre-made namespaces were provided. Disappointingly there are only 4 default namespaces: NameSpace_DNS, NameSpace_URL, NameSpace_OID, and NameSpace_X500. I was curious as to what the NameSpace_OID was for and that led me to discover the Object Identifier (OID).

What is an Object Identifier (OID)

The OID is a globally unique identifier that is guaranteed to be unique by using a tree structure where each part of the tree is responsible for assigning the elements directly under it. This provides a controlled, yet distributed mechanism for assigning OIDs so that they will never collide.

The most common way to display an OID is in “dotted decimal” form. Such as: 

This OID is registered to this website. I can subdelegate this any way I wish by appending a further dot and number. I am responsible for assigning any further OIDs from this one. I can also give one of my sub OIDs or a range of sub OIDs to someone else to manage.

So how did this ID get formed?

The first level is controlled by ITU-T and ISO organisations. They have assigned only 3 values. 0 for ITU-T, 1 for ISO, and 2 for joint ITU-T and ISO things. The number follows a path down a tree, which in the case of our example is

  • 1 – ISO
  • 1.3 – Identified organization
  • 1.3.6 – DoD (Department of Defence)
  • – Internet
  • – Private
  • – IANA enterprise numbers
  • – ViaThinkSoft (IANA number 37476)
  • – FreeOid
  • – WaterJuice

Because any OIDs I create will be appended on as* there is no chance that someone else will create the same IDs as they will be starting theirs from somewhere else in the tree.
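The tree structure means ownership is just a prefix test on the list of numbers. The sketch below uses my own function names, with the IANA enterprise branch 1.3.6.1.4.1 and the ViaThinkSoft number from the list above as sample data.

```python
def parse_oid(text):
    """Turn a dotted decimal OID into a tuple of integer parts."""
    arcs = tuple(int(part) for part in text.split("."))
    if arcs[0] not in (0, 1, 2) or any(a < 0 for a in arcs):
        raise ValueError("not a valid OID: %r" % text)
    return arcs

def is_under(oid, branch):
    """True if `oid` is `branch` itself or somewhere below it."""
    o, b = parse_oid(oid), parse_oid(branch)
    return o[:len(b)] == b

IANA_ENTERPRISE = "1.3.6.1.4.1"
VIATHINKSOFT = IANA_ENTERPRISE + ".37476"

print(is_under(VIATHINKSOFT + ".1.2", VIATHINKSOFT))
# Compare numbers, not strings: 374760 is a different enterprise.
print(is_under(IANA_ENTERPRISE + ".374760.1", VIATHINKSOFT))
```

Comparing the parsed numbers rather than the raw strings matters, because a plain string prefix test would wrongly treat 1.3.6.1.4.1.374760 as being under 1.3.6.1.4.1.37476.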

The majority of all OIDs are hanging off the branch. Anyone can (with a bit of effort) get an IANA “Enterprise Number” for free. You have to register for it and provide details and it takes about a week. Once you have your number you can assign sub OIDs however you want. Obviously no one is supposed to just arbitrarily use someone else’s branch and assume they can just add some branches that they don’t think are being used. The whole scheme breaks down if people do that.

There are other OID branches further up the tree but it is much harder to get attached to one of those. They tend to be used for ISO things, or reserved for countries etc. There are some places that will charge you a fee to get an OID. So generally most organisations get themselves a free IANA number. It is disappointing that the IANA numbers can’t start much higher up, ideally even at the top level. It could have been 3.* for the IANA enterprise numbers. However the biggest problem with OIDs is how precious the people who control the early ones tend to be. ViaThinkSoft provides free ones attached to a sub OID from their IANA number. You just need an email address to get one. One will be assigned and sent in a reasonably short amount of time.

There are also two methods to get an OID instantly by converting a GUID into an OID. This requires no registration or server interaction and can be done at any time by anyone. However neither of the techniques is particularly satisfactory (I’ll discuss why in a later article) and you have a rather ugly looking OID such as




(These incidentally both represent the GUID {f3f88f7f-5bd4-40f9-9b9e-4664bb1845df} )

These might be okay for automatically generated OIDs that stand alone, but they don’t make a particularly good branch to add a whole sub tree to. I could sub-delegate out that tree and give, for example, 1.2.840.113556.1.8000.2554.62456.36735.23508.16633.39838.4613307.1590751.2 to someone to start their own tree, but it’s not a great starting point; they would generally prefer to be much higher up the tree.

So the methods available to get an OID seemed to be either generate an obnoxiously large one of your own instantly from a GUID, or wait around for someone to give you a nicer one from their tree. The IANA branch is the best one you can realistically get, but it can be an annoying process getting one from them (and you will start getting spam immediately after you get one as they publish your email address!).

Given that OIDs are just numbers and not precious gems, it seems ridiculous that it’s not possible just to easily get one without any hassle. The only point of the scheme is to avoid collisions; the whole business of registering for one and providing identification etc is unnecessary. So I decided I would provide a free service to give a decent OID to anyone and everyone who wants one!

Get a free OID


This is a very simple website I set up to issue unique OIDs at the press of a button. There is no registration, email address, or anything required. Simply press the button and a new OID will be generated. I was originally going to attach it to a branch off my OID, but I felt that it would be too far down the tree and there is no reason why anyone shouldn’t be able to easily have a better one. So I registered an IANA number for the purpose and have assigned the following OID to be the branch for these free OIDs

If you go to the site you can get an instant OID that will be a sub OID from this one. You might be wondering what the .5 part is for. I have already assigned the earlier ones for a different use (which I will detail in a future article).

There are no limits on generating OIDs with this site. It will simply increment an internal counter and give you the next one available. There is no practical limit as to how large the number can grow. I don’t imagine the service will be particularly busy so the numbers aren’t likely to get too enormous.
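The issuing logic described above amounts to a counter appended to a fixed branch. Here is a minimal sketch of that idea, not the site's actual code: the class name is invented, `<branch>` is a placeholder rather than a real OID, and a real service would also need to persist the counter between restarts.

```python
import itertools
import threading

class OidIssuer:
    """Hands out branch.1, branch.2, ... with no upper limit."""

    def __init__(self, branch, start=1):
        self._branch = branch
        self._counter = itertools.count(start)
        self._lock = threading.Lock()   # two requests must never share a number

    def next_oid(self):
        with self._lock:
            return "%s.%d" % (self._branch, next(self._counter))

issuer = OidIssuer("<branch>")          # placeholder, not a real OID
first = issuer.next_oid()
second = issuer.next_oid()
print(first, second)
```

Python integers are unbounded, so the counter can grow as large as it needs to.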

So feel free to get yourself your very own OID for free right now, or get several if you want. I don’t mind how many you want or what you want to do with them!



The previous two cipher modes of AES I wrote into WjCryptLib were AES-CTR and AES-OFB. Both of these turn AES into stream ciphers. In both cases only the AES block encrypt function is used. So today I add AES-CBC (Cipher Block Chaining) mode to the library. I don’t particularly like CBC as a mode personally, however it is one of the most common modes used so I wanted to include it in the library.

Cipher Block Chaining mode works by XORing the previous cipher block onto the plaintext before performing the block encrypt. An IV is used as the first “previous cipher block”. A change in a byte of plaintext will cause all the following cipher text to be different. A disadvantage of the mode is that it has to work with a whole number of blocks (16 bytes in the case of AES). This limitation is usually overcome by padding the last block and keeping a count of the actual data length. There is also a fancier technique called cipher text stealing which reduces the limitation to only requiring a minimum of a single block. I have not included this technique in my implementation.
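The chaining itself is only a few lines. The sketch below shows it in Python with a toy block function in place of AES, since the standard library has no AES. The toy function is an assumption for illustration only and offers no security. It is not the WjCryptLib code.

```python
BLOCK = 16

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

# Toy stand-in for the AES block functions. NOT secure: it only XORs
# with the key and rotates the bytes, but it is invertible.
def toy_encrypt_block(key, block):
    mixed = xor_bytes(block, key)
    return mixed[1:] + mixed[:1]

def toy_decrypt_block(key, block):
    mixed = block[-1:] + block[:-1]
    return xor_bytes(mixed, key)

def cbc_encrypt(key, iv, plaintext):
    if len(plaintext) % BLOCK:
        raise ValueError("CBC needs a whole number of blocks")
    previous, out = iv, []
    for i in range(0, len(plaintext), BLOCK):
        # XOR the previous cipher block onto the plaintext, then encrypt.
        previous = toy_encrypt_block(key, xor_bytes(plaintext[i:i + BLOCK], previous))
        out.append(previous)
    return b"".join(out)

def cbc_decrypt(key, iv, ciphertext):
    if len(ciphertext) % BLOCK:
        raise ValueError("CBC needs a whole number of blocks")
    previous, out = iv, []
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        # Decrypt, then XOR with the previous cipher block.
        out.append(xor_bytes(toy_decrypt_block(key, block), previous))
        previous = block
    return b"".join(out)

key = bytes(range(16))
iv = bytes([0xA5] * 16)
message = bytes(range(48))              # exactly three blocks
ciphertext = cbc_encrypt(key, iv, message)
recovered = cbc_decrypt(key, iv, ciphertext)
print(recovered == message)
```

Note that decryption needs the block decrypt function, which is what separates CBC from the stream modes.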

CBC is not a stream cipher mode; it does not generate a parallel stream of bytes that is then applied (usually with XOR) onto the input stream. CBC uses the block encrypt and block decrypt functions directly on the input data.

I have released WjCryptLib 2.3.0 which contains AES-CBC.

The relevant source files needed are:

This is free and unencumbered software released into the public domain.


Pushing old data off a disc

I often like to clean out deleted data from discs. Especially ones that are going to be recycled and used by other people. The problem I find is that secure wiping programs are just too slow. I don’t need that level of protection, I just want to quickly write over every block on the disc.

Recently I was trying to delete a large (1TB) drive. It had been formatted and I just wanted to fill up all the blocks on it with a huge file and then delete the file. This way I would be fairly confident that every block on the disc had been overwritten. The fastest way is to copy /dev/zero onto a file on disc. However I never feel confident that writing zeros actually overwrites anything. It would be very easy for the underlying device to simply mark the block as all zero rather than physically writing it. I believe the old ZIPDRIVE discs did something like this.

Instead of using /dev/zero the obvious solution is to use /dev/random or /dev/urandom. However these are far slower due to generating cryptographically secure random numbers. This is overkill for what I was trying to do. I just wanted to ensure that something was written. In the end I opted for making some big files with /dev/random and then appending the file over and over until the disc was full.

It seemed unlikely that the device would be able to detect repeated blocks being written but it still niggled at me. Also it was not a particularly convenient method. I just wanted to run something and leave it until the disc was full. So I wrote a small tool called PushFill. This will keep writing data to a file until it runs out of space.

This uses RC4 to create a random 10Mbyte block of data which it writes to the file. It then writes the same 10Mbytes another 255 times, each time with every byte incremented by 1. After 256 writes (2.5G) it starts again with another 10Mbyte block from the RC4 stream. This way the RC4 generator is only used for a small percentage of the time and therefore does not slow down the writing. The step of incrementing each byte in the block by 1 is barely noticeable.
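The block schedule described above can be sketched in a few lines of Python. This is an illustration of the idea rather than the PushFill source: `os.urandom` stands in for the RC4 stream, and the block size is a parameter so the example can run with small blocks.

```python
import itertools
import os

# Translation table mapping every byte value to (value + 1) mod 256.
INCREMENT = bytes((i + 1) & 0xFF for i in range(256))

def fill_blocks(block_size, random_source=os.urandom):
    """Yield an endless sequence of blocks: one random block, then
    255 copies of it with every byte incremented by 1 each time."""
    while True:
        block = random_source(block_size)     # the expensive step
        yield block
        for _ in range(255):                  # the cheap variations
            block = block.translate(INCREMENT)
            yield block

# Take the first 257 blocks, using 64 byte blocks to keep the demo small.
blocks = list(itertools.islice(fill_blocks(64), 257))
print(len(set(blocks[:256])), "distinct blocks in the first cycle")
```

Because each of the 256 blocks in a cycle has a different offset added to every byte, no two blocks in a cycle can be identical.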

The advantage of this method is that it is very fast, while still making every single block written different. Therefore the underlying system cannot do any smart cheats such as noticing repeated blocks (think of how DropBox works, where each unique block is hashed and only physically stored once ever). Additionally the output of RC4 prevents any disc compression from using fewer physical blocks to store the data.

The syntax is simple:

PushFill <filename>

This will create or append to the specified filename. It will keep on writing until the disc is full, or program is aborted (ctrl-c).

Every two seconds the program will display how much it wrote in that time along with its rate. It will also display the total amount written so far and the average rate.

A sample output:

Block:    1.9 GB  Rate:  948.1 MBps  |  Total:    1.9 GB  AvgRate:  948.1 MBps
Block:    2.1 GB  Rate:    1.1 GBps  |  Total:    4.0 GB  AvgRate: 1017.2 MBps
Block:    2.3 GB  Rate:    1.1 GBps  |  Total:    6.3 GB  AvgRate:    1.0 GBps


Some systems will cache writes so the first few seconds will show a much higher rate than it’s actually writing to the disc.

Compiled binaries for the program are available here. The package contains binaries for Windows x64, MacOS x64, Linux x64, and Linux Arm (eg a Raspberry Pi).

Full source code available on GitHub here.

This is free and unencumbered software released into the public domain.



I previously added AES-CTR to my library WjCryptLib. AES-CTR is by far the best way to use AES as a stream cipher. However it was not the first mode of operation devised for using a block cipher as a stream cipher. Output-Feedback-Mode (OFB) was one of the original modes of operation specified for the original block ciphers like DES. The way OFB works is to start with an IV the size of a block and repeatedly encrypt it. Each encryption produces another block’s worth of stream bytes.
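The OFB feedback loop can be sketched as follows. Python's standard library has no AES, so a truncated SHA-256 of the key and block is used as a stand-in for the AES block encrypt. That substitution is an assumption for illustration only. It is neither secure nor the WjCryptLib implementation.

```python
import hashlib

BLOCK = 16

def stand_in_block_encrypt(key, block):
    # Stand-in for the AES block encrypt (illustration only).
    return hashlib.sha256(key + block).digest()[:BLOCK]

def ofb_keystream(key, iv, length):
    """Generate `length` stream bytes by repeatedly encrypting the IV."""
    stream, state = b"", iv
    while len(stream) < length:
        state = stand_in_block_encrypt(key, state)   # feed the output back in
        stream += state
    return stream[:length]

def ofb_xor(key, iv, data):
    """Encrypt or decrypt: the same XOR is used in both directions."""
    keystream = ofb_keystream(key, iv, len(data))
    return bytes(d ^ k for d, k in zip(data, keystream))

key = b"0123456789abcdef"
iv = bytes(16)
secret = ofb_xor(key, iv, b"stream ciphers work on any length")
print(ofb_xor(key, iv, secret))
```

Only the block encrypt direction is ever called, and block N of the stream cannot be produced without first producing blocks 1 to N-1.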

When running as a single thread AES-OFB is exactly the same speed as AES-CTR. However it cannot be parallelised, nor can the stream be synced to an arbitrary location. So in pretty much every situation AES-CTR is a better choice than AES-OFB. However a pre-specified protocol may require AES-OFB, so I have added it to WjCryptLib.

Public domain C source code for AES-OFB:

These depend on the AES block cipher files:

Parallelising AES-CTR with OpenMP

In order for AES (or any block cipher) to be particularly useful you generally need to use it in a “mode of operation” which allows it to work with much larger data than a single 128 bit block. Some modes of operation use the underlying block cipher and create a stream cipher. Both AES-OFB and AES-CTR are stream ciphers that use the outputs of the block encryption to produce 16 bytes of stream output at a time. They operate very similarly and when single threaded are the same speed.

AES-OFB keeps on applying the AES encryption to the same block to generate a new block each time, whereas AES-CTR applies AES encryption to a “counter” block. Cryptanalysis has shown both methods to be equally secure. AES-CTR has two huge advantages over AES-OFB, which is why in my opinion it should always be used in preference.

The first advantage is that you can jump to any point in the output stream without having to generate preceding bytes. This would be useful if you were appending to a file for example.

The second advantage is that because each block is calculated independently of other blocks, the entire process can be parallelised. If 1 MB of data is to be encrypted with AES-CTR then on a quad core processor the task can be split into four pieces of 256 kB, each encrypted in a separate thread. It is not possible to parallelise AES-OFB as each block requires the previous block to have been calculated first.
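Both advantages follow from the fact that a CTR stream block depends only on the key, the nonce and the block index. Here is a sketch in Python, again with a truncated SHA-256 standing in for the AES block encrypt. The 8 byte nonce plus 64 bit counter layout is an assumption for the example and not necessarily what WjCryptLib uses.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

BLOCK = 16

def stand_in_block_encrypt(key, block):
    # Stand-in for the AES block encrypt (illustration only, not secure).
    return hashlib.sha256(key + block).digest()[:BLOCK]

def ctr_block(key, nonce, index):
    """Stream block number `index`, computed without any other block."""
    counter_block = nonce + index.to_bytes(8, "big")
    return stand_in_block_encrypt(key, counter_block)

def ctr_keystream_serial(key, nonce, num_blocks):
    return b"".join(ctr_block(key, nonce, i) for i in range(num_blocks))

def ctr_keystream_parallel(key, nonce, num_blocks, workers=4):
    # map() hands block indexes to the worker threads and returns the
    # results in index order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        blocks = pool.map(lambda i: ctr_block(key, nonce, i), range(num_blocks))
        return b"".join(blocks)

key = b"0123456789abcdef"
nonce = b"\x00" * 8
serial = ctr_keystream_serial(key, nonce, 1001)
parallel = ctr_keystream_parallel(key, nonce, 1001)
print(serial == parallel)
# Jump straight to block 1000 without generating the first 1000 blocks.
print(ctr_block(key, nonce, 1000) == serial[-BLOCK:])
```

The thread pool here only demonstrates that the blocks are independent. Python threads will not give the speed-up that native threads do in C.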

Parallelising an algorithm such as AES-CTR is a bit different from regular multithreading. If the AES-CTR library had to be responsible for the threading it would take on quite a burden of management, and the overhead of bringing up threads and closing them each time it’s called could be more than the improvements in speed. Alternatively, providing a set of interfaces to allow the caller to provide the multithreaded environment could be quite cumbersome and tricky to use. Fortunately there is a standard that is perfect for this job: OpenMP.

OpenMP is a standard that is implemented in many C compilers (including, surprisingly, Microsoft Visual Studio). This allows functions to be marked-up using special pragmas that will allow them to be parallelised when built with OpenMP support, and also run correctly in a single thread without OpenMP.

The following markup will cause the for loop to be parallelised with OpenMP. The for loop will run on as many threads as there are processing cores (by default; this can be changed). Each thread will run a smaller subset of the range 0 to numIterations. Without OpenMP this will just run as normal over the entire range.

#ifdef _OPENMP
    #pragma omp parallel for
#endif
for( i=0; i<numIterations; i++ )

There are several extra markups that can be added that control how the threads share data etc. The #ifdef is not technically needed as the #pragma will be ignored by compilers that don’t understand it. However some compilers will warn about unknown pragmas so it can be quieter to #ifdef it.

My AES-CTR implementation without OpenMP gave the following results on a quad core MacBook Pro running Linux.

AES-CTR (128) 232.33 Mbps
AES-CTR (192) 203.29 Mbps
AES-CTR (256) 177.69 Mbps
RC4 368.89 Mbps

As expected RC4 is the fastest as it is much simpler.

I reworked my AES-CTR implementation to work with OpenMP and the performance increase is considerable.

AES-CTR (128) 730.00 Mbps
AES-CTR (192) 518.44 Mbps
AES-CTR (256) 489.51 Mbps
RC4 363.46 Mbps

For AES-128 this is over 3 times as fast (about 3.1 times); the 192 and 256 bit key sizes gain about 2.6 and 2.8 times respectively. Interestingly it is not 4 times as fast, despite now running at 100% CPU usage over 4 cores instead of just 1. I assume the reason is that the 4 cores still need to access the same memory controller and that becomes the bottleneck. The memory controller can’t serve all four cores simultaneously at the same speed it can service just one.

As a side note, when I first wrote this using my original AES implementation, it was so slow that even with the parallelisation it was still outperformed by RC4! This motivated me to change the AES block cipher implementation to a faster one.

The nice thing about OpenMP is that it can just be enabled at build time without a lot of fuss (-fopenmp for gcc and /openmp for MSVC). If it is not enabled the code works fine in a single thread.

There is one disappointing thing about Microsoft’s implementation. It requires the use of a DLL which is not a system DLL. Visual Studio 2017 dynamically links OpenMP executables to VCOMP140.DLL. There is no option to statically link it. Also Apple’s version of clang that comes with Xcode does not support OpenMP.

Public Domain C source code for AES-CTR with optional OpenMP support

This is free and unencumbered software released into the public domain.