Taxman Programming Project: Model Solution

Description of the Algorithm

At each stage look at the largest number p that has exactly two factors left. If no number has exactly two factors left, look at the largest number that has exactly three factors left, and so on. If there is no number with at least two factors left, then we are done.

Now p is usually the best number to take, but there is one situation when it is not. This is when there is a proper multiple q of p where the only proper factors left for q are factors of p AND there are no proper multiples of q left. In that case, removing p would strand q, leaving it for the taxman at the end, so the best thing to do is to take q yourself. Naturally, if there are several such proper multiples q available, then always take the biggest one!

Implementation Details

Firstly, to make it efficient to find the factors of a number, we precompute an array factor[i] containing the highest prime factor of i.

We also maintain an array number[i] containing the number of factors left for i.

At each stage we have a total order defined on the set of numbers from 1 to n, like this:

p is more desirable than q if:

p and q have the same number of factors left, and p is bigger than q.
The number of factors left for p comes before the number of factors left for q in the following list: 2, 3, 4, ..., 1, 0.

Note that the array number[i] makes this order very easy to calculate.

Now we arrange our numbers 1 up to n in a binary heap, where the topmost element is the most desirable number. (Note that the definition of desirability means that the topmost number is always available for us to take right until the end.)

The algorithm now goes as follows:

Take the topmost element p from the heap, if p is not available to take then we are done.
Check that p doesn't have any proper multiples that are better (see Description of the Algorithm), if so set p to be the largest one of these.
Calculate the factors left for p.
For each factor f left for p, reduce by one number[i] for all the multiples i of f. Every time we change number[i] for any i, we may also need to move it up or down the heap a few places.

Complexity Analysis

Setting up the arrays factor[i] and number[i] can be done in time O(n log n), using the following sieving technique:

  for (i = 1; i <= n; i = i + 1) {
    factor[i] = 1;
  }

  for (i = 2; i <= n; i = i + 1) {
    if (factor[i] == 1) {
      for (j = i; j <= n; j = j + i) {
        factor[j] = i;
      }
    }
  }

This takes time n + n/2 + n/3 + n/5 + ... + n/n < n(1 + 1/2 + 1/3 + 1/4 + ... + 1/n) = O(n log n).

Heapifying takes O(n).

Now whenever we remove a number p (for whatever reason) we must reduce number[q] by 1 for every multiple q of p. By a similar analysis as for the seiving, the number of reductions we must perform is O(n log n). In addition, every time this is done we may need to move i up or down the heap some places. In the worst case, O(log n) swaps may be required. Therefore, the total time required by the main body of the algorithm is O(n (log n)²).

The complexity of the whole algorithm is O(n (log n)²).

Performance Information

All times are measured in seconds, running on a Pentium II 200MHz and using the java virtual machine that comes with version 4.76 of the Netscape browser.

n	score	time
1	0	0.009
2	2	0.038
5	9	0.155
10	40	0.766
20	121	0.870
50	800	1.054
100	3148	2.157
200	12614	4.262
500	78284	0.267
1000	312350	0.451
2000	1242601	0.951
5000	7764257	2.693
10000	31018936	5.539
20000	124182891	12.106
50000	774677696	32.596
100000	3101575417	70.231
200000	12397258724	147.388
500000	77385834248	404.655
1000000	309391446240	859.213