I recently decided to collect coins. In particular I want to collect one of every type of circulated $1 Australian coin. According to the Royal Australian Mint, there are 17 commemorative $1 coin designs in circulation, and 15 different years for the standard “kangaroos” design. So a complete set of one of each would make 32 coins. I later discovered there is an error on the RAM website and there is in fact no 1992 standard $1 coin in circulation. I am not counting all the special non circulated coins as they make dozens of designs each year, I wanted ones that could be collected just from regular change.

Getting the first few coins is easy as there is a high chance that you’ll get a coin that you haven’t already got. Getting that last coin of course will be the hardest as there is much more chance that any subsequent coin you get will be one of the other 30 already acquired. I was curious about the probability of getting the coins. Unfortunately it has been too many years since I studied maths and I couldn’t come up with the appropriate equations. Instead I decided to work it out by brute force, using the computer as a handy calculator.

For my initial trial I assumed that there was an equal distribution of coins of each design, and that there were 32. I wrote a program that just randomly picked numbers between 0 and 31 and placed them in “buckets”. Each time it added a coin to a previously empty bucket it would print out the number of coins it had selected up until that point.

I ran this program a few times and gave some good results, but of course it was random trials. I wanted to get the statistical probabilities of coins required to complete the set, so I adapted the program to run multiple trials. Half a million trials in fact. It calculated the mean and standard deviation for the number of coins required to fill n buckets. The random number generator I used was an RC4 stream which produces good statistical random. The results it printed out are:

1 after 1.0 coins (sd: 0.0) 2 after 2.0 coins (sd: 0.2) 3 after 3.1 coins (sd: 0.3) 4 after 4.2 coins (sd: 0.5) 5 after 5.3 coins (sd: 0.6) 6 after 6.5 coins (sd: 0.8) 7 after 7.8 coins (sd: 0.9) 8 after 9.0 coins (sd: 1.1) 9 after 10.4 coins (sd: 1.3) 10 after 11.8 coins (sd: 1.5) 11 after 13.2 coins (sd: 1.7) 12 after 14.7 coins (sd: 1.9) 13 after 16.3 coins (sd: 2.2) 14 after 18.0 coins (sd: 2.4) 15 after 19.8 coins (sd: 2.7) 16 after 21.7 coins (sd: 3.0) 17 after 23.7 coins (sd: 3.3) 18 after 25.8 coins (sd: 3.6) 19 after 28.1 coins (sd: 4.0) 20 after 30.6 coins (sd: 4.4) 21 after 33.2 coins (sd: 4.9) 22 after 36.1 coins (sd: 5.5) 23 after 39.3 coins (sd: 6.1) 24 after 42.9 coins (sd: 6.8) 25 after 46.9 coins (sd: 7.6) 26 after 51.5 coins (sd: 8.6) 27 after 56.8 coins (sd: 9.9) 28 after 63.2 coins (sd: 11.5) 29 after 71.2 coins (sd: 13.7) 30 after 81.9 coins (sd: 17.0) 31 after 97.9 coins (sd: 23.0) 32 after 129.9 coins (sd: 39.0)

This table shows that if you have a even distribution of 32 types of coins, then the mean number of coins required to collect is about 130. This means there is a 50% chance of getting a complete set from 130 coins. Using the standard deviation we can make some more predictions. Using the old 65-95-99.7 rule from statistics we can say that there is a 95% chance that a complete set will be found between 52 and 208 coins. Or another way of looking at it is that if you collect 208 coins there is only 2.5% chance of not getting a complete set.

I got a bag of 20 $1 coins from the bank, it had 15 unique coins in it which was bang on for the stats as shown in the table. However the coins are not uniformly distributed. The first $1 coin minted was in 1984 and 186.3 million were produced. This actually represents slightly over 1 in 5 of all $1 coins put into circulation. I also learnt that there are in fact only 31 coins. The number of coins produced with each design varies greatly. The rarest is a 2003 coin where only 4.1 million were made. I’m going to update the program to use the proportions of coins in circulation for a more accurate estimation. Although this reminded me of an interesting thing about distribution of random numbers which I want to talk about in the next article.

As of now I have 24 of the 31 coins, so 7 more to go. It also reminds me of the proof of work concept. Just as the number of first bits being zero increases the number of trials required to find one by chance, the same applies with the coins. Getting the first fire coins are easy, but getting the 31st one will take the most work. Like the proof of work scheme, it is easy for someone to verify that you have a complete set compared to the work required to acquire the set in the first place.

I haven’t made the modifications to do the correct proportions yet, but here is the source code of the current version. I’m not going to bother putting up binaries for this, as it basically prints an almost static table. But in case anyone is interested here it is. Note it links in with LibMd5 and LibRc4 from previous projects.

////////////////////////////////////////////////////////////////////////////////// // CoinProb // // Program to attempt to show odds of getting a complete set of 32 unique coins. // This is free and unencumbered software released into the public domain // June 2013 waterjuice.org ////////////////////////////////////////////////////////////////////////////////// ////////////////////////////////////////////////////////////////////////////////// // IMPORTS ////////////////////////////////////////////////////////////////////////////////// #include <stdio.h> #include <stdlib.h> #include <stdint.h> #include <sys/timeb.h> #include <math.h> #include <memory.h> #include "LibRc4.h" #include "LibMd5.h" ////////////////////////////////////////////////////////////////////////////////// // CONSTANTS ////////////////////////////////////////////////////////////////////////////////// #define COIN_TYPES 32 #define NUM_ITERATIONS 500000 ////////////////////////////////////////////////////////////////////////////////// // TYPES ////////////////////////////////////////////////////////////////////////////////// typedef struct { uint64_t ReachedAfterThisManyCoins [COIN_TYPES]; } StatSet; ////////////////////////////////////////////////////////////////////////////////// // FUNCTIONS ////////////////////////////////////////////////////////////////////////////////// ////////////////////////////////////////////////////////////////////////////////// // SeedStreamWithRandom // // Initialises an RC4 stream with a random key ////////////////////////////////////////////////////////////////////////////////// static void SeedStreamWithRandom ( Rc4Context* Context, uint8_t ThreadNum ) { struct { uint8_t threadNum; struct timeb time; void* address; } seedValues; MD5_HASH hash; Md5Context md5Context; seedValues.threadNum = ThreadNum; ftime( &seedValues.time ); seedValues.address = Context; Md5Initialise( &md5Context ); Md5Update( &md5Context, &seedValues, sizeof(seedValues) ); Md5Finalise( &md5Context, &hash ); Rc4Initialise( Context, &hash, sizeof(hash), 0 ); } ////////////////////////////////////////////////////////////////////////////////// // AcquireEntireSet // // Gets an entire set of coins. ////////////////////////////////////////////////////////////////////////////////// static void AcquireEntireSet ( Rc4Context* RandContext, StatSet* Statistics ) { int totalCoins = 0; int uniqueCoins = 0; uint32_t value; int sets [COIN_TYPES] = {0}; memset( Statistics, 0, sizeof(StatSet) ); while( uniqueCoins < COIN_TYPES ) { // Get next random coin Rc4Output( RandContext, &value, sizeof(value) ); value %= COIN_TYPES; totalCoins += 1; sets[value] += 1; if( 1 == sets[value] ) { // First time we have seen this coin Statistics->ReachedAfterThisManyCoins[uniqueCoins] = totalCoins; uniqueCoins += 1; } } } ////////////////////////////////////////////////////////////////////////////////// // main // // Entry point ////////////////////////////////////////////////////////////////////////////////// int main ( void ) { Rc4Context randContext; int iteration; StatSet stats; StatSet statsTotals = {{0}}; StatSet statsSquares = {{0}}; int i; SeedStreamWithRandom( &randContext, 1 ); for( iteration=0; iteration<NUM_ITERATIONS; iteration++ ) { AcquireEntireSet( &randContext, &stats ); for( i=0; i<COIN_TYPES; i++ ) { statsTotals.ReachedAfterThisManyCoins[i] += stats.ReachedAfterThisManyCoins[i]; statsSquares.ReachedAfterThisManyCoins[i] += ( stats.ReachedAfterThisManyCoins[i] * stats.ReachedAfterThisManyCoins[i] ); } } // Display results printf( "%u Iterations\n", NUM_ITERATIONS ); for( i=0; i<COIN_TYPES; i++ ) { double mean; double sigma; double sd; mean = (double)statsTotals.ReachedAfterThisManyCoins[i] / (double)NUM_ITERATIONS; sigma = ( (double)statsSquares.ReachedAfterThisManyCoins[i] / (double)NUM_ITERATIONS ) - ( mean * mean ); sd = sqrt( sigma ); printf( "%u after %.1f coins (sd: %.1f)\n", i+1, mean, sd ); } return 0; }

This is one of the things I love the best about programming. While most of the time I am writing programs for specific things at work, its nice to be able to write a program to solve some problem as a hobby. After all the computer is a fantastic calculator and being able to control it to do what you wish is a very handy skill.