--- title: "Project Euler #2: Even Fibonacci Numbers" author: nbloomf date: 2017-03-06 tags: project-euler, literate-haskell --- First some boilerplate. > module ProjectEuler002 where > > import Data.Ratio > import System.Exit [Problem 2](https://projecteuler.net/problem=2) from Project Euler:
Each new term in the Fibonacci sequence is generated by adding the previous two terms. By starting with 1 and 2, the first 10 terms will be: $$1, 2, 3, 5, 8, 13, 21, 34, 55, 89, \ldots$$ By considering the terms in the Fibonacci sequence whose values do not exceed four million, find the sum of the even-valued terms.
Let's start with the obvious thing: take the Fibonacci numbers under four million, filter for the evens, and sum. We'll use the definition of the sequence from the problem statement, indexed from 1 like so: $$F(1) = 1, F(2) = 1, F(n+2) = F(n) + F(n+1).$$ We'll translate this to an unfold like so. > fibs :: [Integer] > fibs = 1 : 1 : fibs' 1 1 > where > fibs' a b = let c = (a+) $! b in c : fibs' b c > > pe2' :: Integer -> Integer > pe2' n = sum $ filter even $ takeWhile (< n) fibs Note that strictness annotation (``$!``) in the definition of ``fibs``; it's needed to force the evaluation of each successive number, so we don't end up with a bunch of unevaluated additions. We can verify that ``pe2' 89 == 44`` as expected. Then our answer is ```haskell $> pe2' 4000000 4613732 ``` And done. But wait! Anything worth doing is worth overdoing, so let’s try something less obvious. I think it will be pretty hard to squeeze much more performance out of ``pe2'``. The Fibonacci numbers grow reasonably quickly, getting a new digit every 5 terms or so. Why is that? The first Fibonacci number with a given number of digits (always?) has leading digit 1, and the one after that has leading digit 1 or 2, so at that point (after moving the decimal point) it looks like we're starting a new Fibonacci sequence with initial terms $1 + \varepsilon$ and $1 + \delta$, which reaches 10 after at most six terms. This means that computing ``pe2' (10*n)`` requires looking at only 5 or 6 more terms in ``fibs`` than ``pe2' n`` does. And computing the next term in ``fibs`` is cheap. Let's look at the first several $F(n)$ again, this time highlighting the even terms: $$1, 1, \fbox{2}, 3, 5, \fbox{8}, 13, 21, \fbox{34}, 55, 89, \fbox{134}, \ldots$$ And what's this then -- there's a pattern! It appears that every third $F(n)$ is even, and no others. But does this pattern always hold? Indeed it does; to see why, unpack the recursive definition of $F$ on $F(n+3)$. This gives the congruence $$F(n+3) = F(n) + 2F(n+1) \equiv F(n) \pmod{2}.$$ So the parity pattern of $F(n)$ repeats every third term, and because $F(1)$ and $F(2)$ are odd while $F(3)$ is even, this explains the pattern. That means we don't really need to check the parity of our Fibonacci numbers; it's enough to simply throw out all but every third term. > pe2'' :: Integer -> Integer > pe2'' n = sum $ takeWhile (< n) $ take3rd fibs > where > take3rd (_:_:a:as) = a : take3rd as But the observation that $F(n)$ is even precisely when $n = 3k$ leads to another idea; what we really want is $$\sum_{k=1}^t F(3k)$$ where $t$ is the *index* of the largest Fibonacci number less than $N = 4,000,000$. This is interesting because there is a closed form formula for $F(n)$ known as [Binet's formula](https://en.wikipedia.org/wiki/Fibonacci_number#Closed-form_expression): $$F(n) = \frac{1}{\sqrt{5}}(\varphi^n - \overline{\varphi}^n),$$ where $\varphi = (1+\sqrt{5})/2$ is the largest real solution of $x^2 - x - 1 = 0$ and $\overline{\varphi} = (1-\sqrt{5})/2$ is its quadratic conjugate. But now: $$\begin{eqnarray*} \sum_{k=1}^t F(3k) & = & \sum_{k=1}^t \frac{1}{\sqrt{5}}\left( \varphi^{3k} - \overline{\varphi}^{3k} \right) \\ & = & \frac{1}{\sqrt{5}} \left( \sum_{k=0}^{t-1} (\varphi^3)^{k+1} - \sum_{k=0}^{t-1} (\overline{\varphi}^3)^{k+1} \right) \\ & = & \frac{1}{\sqrt{5}} \left( \varphi^3 \sum_{k=0}^{t-1} (\varphi^3)^k - \overline{\varphi}^3 \sum_{k=0}^{t-1} (\overline{\varphi}^3)^k \right) \\ & = & \frac{1}{\sqrt{5}} \left( \varphi^3 \frac{\varphi^{3t} - 1}{\varphi^3 - 1} - \overline{\varphi}^3 \frac{\overline{\varphi}^{3t} - 1}{\overline{\varphi}^3 - 1} \right) \\ & = & \frac{1}{2\sqrt{5}} \left( \varphi^{3t+2} - \overline{\varphi}^{3t+2} - \varphi^2 + \overline{\varphi}^2 \right) \\ & = & \frac{1}{2}\left( \frac{1}{\sqrt{5}}\left( \varphi^{3t+2} - \overline{\varphi}^{3t+2} \right) - \frac{1}{\sqrt{5}}\left( \varphi^2 - \overline{\varphi}^2 \right) \right) \\ & = & \frac{F(3t+2) - 1}{2} \end{eqnarray*}$$ That is, the sum of the first $t$ even Fibonacci numbers is essentially the $(3t+2)$th Fibonacci number. The only obstacle to applying this to our current problem is that it's not obvious how to take an upper bound $N$ and find the *index* of the largest even Fibonacci number less than $N$. Here's a quick and dirty way to find the largest index of a Fibonacci number below $N$. > maxfibidx' :: Integer -> Integer > maxfibidx' n = > fst $ > head $ > dropWhile (\(_,m) -> m < n ) $ > zip [1..] (tail fibs) But this doesn't really improve on our naive implementation of ``pe2'``, since it still requires generating all the Fibonacci numbers up to $n$. Here's a better idea. $F$ is a monotone increasing function on the natural numbers, and we wish to find the largest $m$ such that $F(m) < N$. We can find this $m$ using a binary search. First, we find integers $a$ and $b$ such that $F(a) < N \leq F(b)$; this can be done by looking at $m = 2^i$ for increasing $i$. Then, once such $a$ and $b$ are found, we tighten the interval $(a,b)$ bounding a root of $F(x) - N$ by bisection. The ``binarysearch`` function does this, taking a function $G$ as a parameter. > -- g is nondecreasing on positive integers > -- g(1) < n > -- returns the largest t such that g(t) < n > binarysearch :: (Integer -> Integer) -> Integer -> Integer > binarysearch g n = refine init > where > -- find powers of 2 that bound a root of g(x) - n > k = > fst $ > head $ > dropWhile ((< n) . snd) $ > map (\k -> (k, g $ 2^k)) [1..] > > -- this interval contains a root of g(x) - n > init = (2^(k-1), 2^k) > > -- bisect a root-containing interval > refine (a,b) > | b-a <= 1 = a > | otherwise = let m = (a+b)`quot`2 in > case compare (g m) n of > EQ -> m-1 > LT -> refine (m,b) > GT -> refine (a,m) With an implementation of Binet's formula, we'd be nearly there. But in order to do exact arithmetic on quadratic numbers of the form $p + q\sqrt{5}$ -- as required by Binet -- we need the following type and instances. > data Root5 = Root5 Rational Rational > deriving (Eq, Show) > > phi = Root5 (1%2) (1%2) > sqrt5 = Root5 0 1 > > -- quadratic conjugate > conj :: Root5 -> Root5 > conj (Root5 a b) = Root5 a (negate b) > > -- rational part > ratpart :: Root5 -> Rational > ratpart (Root5 p _) = p > > instance Num Root5 where > fromInteger k = Root5 (k%1) 0 > > (Root5 a1 b1) + (Root5 a2 b2) = > Root5 (a1 + a2) (b1 + b2) > > (Root5 a1 b1) * (Root5 a2 b2) = > Root5 (a1*a2 + 5*b1*b2) (a1*b2 + a2*b1) > > negate (Root5 a b) = > Root5 (negate a) (negate b) > > abs = undefined; signum = undefined > > instance Fractional Root5 where > fromRational q = Root5 q 0 > > recip (Root5 0 0) = undefined > recip (Root5 a b) = Root5 (a/d) (-b/d) > where d = a*a - 5*b*b Then, first of all, we can implement and test Binet's formula. > -- directly compute F(n) > binet :: Integer -> Integer > binet n = numerator $ ratpart $ > (phi^n - (conj phi)^n) / sqrt5 > > -- binet k == fibs !! (k-1) > test_binet :: Integer -> Bool > test_binet n = and $ zipWith (==) fibs $ map binet [1..n] We can now implement ``maxfibidx`` using binary search. > maxfibidx :: Integer -> Integer > maxfibidx n = binarysearch binet n And finally, an alternative implementation of ``pe2'``. > pe2''' :: Integer -> Integer > pe2''' n = ((binet $ 3*t+2) - 1) `quot` 2 > where > t = (maxfibidx n) `quot` 3 As a sanity check: ```haskell $> pe2' 4000000 4613732 $> pe2'' 4000000 4613732 $> pe2''' 4000000 4613732 ``` And more generally: > test :: Integer -> Bool > test n = (pe2' n == pe2'' n) && (pe2' n == pe2''' n) With a check: ```haskell $> all test [1..1000] True ``` For comparison, here is a table of timings on my machine. I should note that the times for ``pe2''`` are misleading; they do not include time spent thrashing in memory. ``pe2'''`` did not suffer from this. The problem was so bad I couldn't complete this table, but I think a trend is already evident. ------------------------------------------------------------------ n ``pe2'' n`` Time (s) ``pe2''' n`` Time (s) -- --------------------- ---------------------- $10^{1 \cdot 10^4}$ 0.59 0.15 $10^{2 \cdot 10^4}$ 1.07 0.28 $10^{3 \cdot 10^4}$ 1.50 0.44 $10^{4 \cdot 10^4}$ 2.03 0.58 $10^{5 \cdot 10^4}$ 3.02 0.74 $10^{6 \cdot 10^4}$ 6.84 0.98 $10^{7 \cdot 10^4}$ ??? 1.08 $10^{8 \cdot 10^4}$ ??? 1.25 $10^{9 \cdot 10^4}$ ??? 1.40 ------------------------------------------------------------------ Just for fun, the sum of all even Fibonacci numbers less than $10^{1000000}$ is even. Shocking, I know! But my machine determined this by brute force in 14 seconds. :) So the final answer is: > pe2 :: Integer > pe2 = pe2''' 4000000 > > main :: IO () > main = do > let success = all test [1..1000] > if success > then putStrLn $ show pe2 > else exitFailure