2014년 8월 7일 목요일

HackerRank as a better platform for solving Euler Project problems

Project Euler has a great archive of algorithm-oriented programming problems, but Project Euler is started to look a bit dated as MOOCs featuring online autograders for coding assignments become commonplace and general programming challenge sites like CodeEval, HackerRank, checkIO, etc. have online judges built-in.

Let's look at Problem 1 from PE:

If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23.
Find the sum of all the multiples of 3 or 5 below 1000.

The input domain is sufficiently small for brute-force solutions to work quickly. In Pseudocode:

1. Go through all numbers between 1 and 999 inclusive one at a time.
2. If a number is divisible by 3 or 5, put it in a separate list A.
3. Once you have finished going through all the numbers [1, 2, 3 ..., 999], return the sum of A.

In Python 3:



The above solution works fine for such a small domain. Here's the same problem on HackerRank:

If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23.
Find the sum of all the multiples of 3 or 5 below N.
Input Format 
First line contains T that denotes the number of test cases. This is followed by T lines, each containing an integer, N.
Output Format 
For each test case, print an integer that denotes the sum of all the multiples of 3 or 5 below N.
Constraints
$$1\le T \le 10^5$$
$$1 \le N \le 10^9$$


Unlike the original problem from the Euler Project, the upper limit for N in the Hacker Rank version is much larger than 1000; it goes up to one billion! For really large N the naive approach we used before (checking divisibility by 3 or 5 for each number) will no longer work quickly enough as it has \(O(n)\) time complexity (linear time). Although the naive solution passes a few test cases from HackerRank's autograder, it fails cases with really large N.

Having larger input ranges is really great as it forces you to think of a better solution. Fortunately I had encountered a similar problem, Bullseye, from Google Code Jam 2013 Round 1A that required knowledge of how to analytically find the sum of an arithmetic progression.

You might recall from high school math class that if we have a finite sequence with a common difference between terms, i.e. $$\{3, 6, 9, 12, 15, 18\}$$ we can find \(S_n\), the sum of arithmetic progression, from the following formula:
$$S_n = \frac{(a_1 + a_n)n}{2}$$
where \(a_1\) is the first term and \(a_n\) is the n-th term in the progression. If we plug in \(a_1=3\), \(a_n=18\) from the 6-element sequence above, we get the sum \(\frac{21 \cdot 6}{2}=63\), which we can easily verify by manually adding up all six terms.

Also, the formula for calculating the n-th term in an arithmetic sequence or progression is given by:
$$a_n = a_1 + (n-1)d$$
where \(a_1\) is the first term in the sequence and \(d\) is the constant difference between terms.

Let's say we want to find the number of terms less than 1,000,000 in the arithmetic progression where the constant difference between terms is 5. We can find the number of terms as follows:
$$\lfloor\frac{(1000000 - 1)}{5}\rfloor = 199999$$
The bracket-like figures \(\lfloor\) and \(\rfloor\) denote rounding down, so the expression above is doing integer floor division.

We subtract 1 from 1000000 because one million is a multiple of 5, but the terms of the arithmetic progression must be less than one million. Therefore the number of terms in the arithmetic sequence \(\{5, 10, 15, 20, \cdots, 999990, 999995\}\) is 199999.

Using the formulas above, we can now solve the original problem in \(O(1)\) constant time. Here is the new version stated in Pseudocode:

1. Given an upper limit lim, find the sum of all numbers less than lim that are divisible by 3 and the sum of all numbers less than lim divisible by 5.

2. Subtract the sum of all numbers less than lim that are divisible by 15 (to account for double counting; i.e. multiples of 15 will be included in the arithmetic progression {3, 6, 9, 12, 15, ... 30, ..., lim} as well as in {5, 10, 15, 20, 25, 30, ..., lim} so we need to subtract the arithmetic progression {15, 30, 45, 60, 75, ..., lim} to remedy this)

3. Return the overall sum

Here is the implementation in Python 3:



Warning: It would not be a good idea to simply copy-and-paste the code above into the HackerRank autograder! The autograder looks for duplicate submissions and red flags non-unique solutions (i.e. answers that have already been submitted)!