--- layout: post title: "Purr yourself into a math genius" tags: [rstats, purrr, combinatorics, math puzzle] # bibliography: ~/Literature/Bibtex/jabref.bib header-includes: - \usepackage{bm} comments: true editor_options: chunk_output_type: console --- ```{r,include=FALSE,echo=FALSE,message=FALSE} ##If default fig.path, then set it. if (knitr::opts_chunk$get("fig.path") == "figure/") { knitr::opts_knit$set( base.dir = '/Users/hoehle/Sandbox/Blog/') knitr::opts_chunk$set(fig.path="figure/source/2019-01-04-mathgenius/") } fullFigPath <- paste0(knitr::opts_knit$get("base.dir"),knitr::opts_chunk$get("fig.path")) filePath <- file.path("","Users","hoehle","Sandbox", "Blog", "figure", "source", "2019-01-04-mathgenius") knitr::opts_chunk$set(echo = TRUE,fig.width=8,fig.height=4,fig.cap='',fig.align='center',echo=FALSE,dpi=72*2)#, global.par = TRUE) options(width=150, scipen=1e3) suppressPackageStartupMessages(library(tidyverse)) suppressPackageStartupMessages(library(magrittr)) suppressPackageStartupMessages(library(knitr)) ##Configuration options(knitr.table.format = "html") theme_set(theme_minimal()) #if there are more than n rows in the tibble, print only the first m rows. options(tibble.print_max = 10, tibble.print_min = 5) ``` ## Abstract: We use the purrr package to solve a popular math puzzle via a combinatorial functional programming approach. A small shiny app is provided to allow the user to solve their own variations of the puzzle. ```{r,results='asis',echo=FALSE,fig.cap=""} cat(paste0("

\n")) ``` {% include license.html %} ## Introduction [No. 4](http://www.briddles.com/2011/12/maths-puzzle.html) of the [Top 5 hard math puzzles](http://www.briddles.com/riddles/top-5-hard-math) at [briddles.com](www.briddles.com) goes like this:

*How can I get the answer 24 by only using the numbers 8,8,3,3. You can use the main signs add, subtract multiply and divide.*

Note: a solution has to use each of the specified 4 numbers exactly ONCE, but they can be used in any order. In other words the standard scheme is to solve expressions of the kind:

`a op1 b op2 c op3 d`

where `a`, `b`, `c` and `d` denote a permutation of the numbers 8, 8, 3, 3 and each of `op1`, `op2` and `op3` denotes the use of one binary operator selected from +, -, * or /. An example is the expression `8 + 3 + 8 * 3`. Parentheses are used to control the order in which the operators are applied, i.e. `(8 + 3 + 8) * 3` yields a different result than `8 + 3 + (8 * 3)`. After a few unsuccessful attempts to solve the above puzzle with pen and paper it felt more *efficient* and computationally *challenging* to solve this puzzle via a combinatorial approach: Simply try out all permutations of the 4 numbers, the 4 binary operators and all possible sets of parentheses to combine the operators. One can show that there are at most $$ \begin{align*} && \text{# permutations of the $k=4$ base numbers} \\ \times && \text{# ways to select with replacement $(k-1)$ binary operators from the set $\{+,-,*,/\}$ }\\ \times && \text{# ways to parenthesize the $(k-1)$ binary operators} \\ &=&k! \times 4^{(k-1)} \times \frac{1}{k} \binom{2k-2}p{k-1} \end{align*} $$ ```{r, echo=FALSE} catalan <- function(n) { choose(2*n, n) / (n+1) } nCombinations <- function(k) { factorial(k) * 4^(k-1) * catalan(k) } ``` different combinations to choose from [^1]. As an example: for $k=4$ the maximal number of unique combinations is `r nCombinations(k=4)`. #### Strategy We will use a functional approach to solve the above combinatorial problem. Why? * because it seems like a good use-case for [functional programming](https://en.wikipedia.org/wiki/Functional_programming), * because it is important to extend your programming horizon every once in a while, and * because the [`purrr`](https://cran.r-project.org/web/packages/purrr/index.html) functional programming toolkit for R allows you to experiment with this without having to leave the R universe [^2]. For those not familar with `purrr` can find a wonderful didactic introduction in the [useR! 2017 tutorial](https://github.com/cwickham/purrr-tutorial) by [Charlotte Wickham](https://twitter.com/CVWickham). Furthermore, learning `purrr` was the 7th most frequent mentioned package in the [#rstats users' 2019 R goals](https://masalmon.eu/2019/01/01/r-goals/). In other words: Attention #rstats new years resolution makers: reading this post is as **obligatory** as going to the gym on 01 Jan! ## Solving the Math Puzzle We will divide-and-conquer the solution along the lines of the number of combinations formula: Firstly, we will store all permutations of the $(k-1)$ base numbers in a list `perm`. Secondly, we will store all possible combinations of the $(k-1)$ operators in a list `operators` and, thirdly, we generate all possible ways of putting parentheses around the operators into a list `brackets`. Subsequently, we form the Cartesian product of these three lists and build the corresponding expression for each triple of permutation, operators and parentheses. Finally, each generated expression is evaluated. The entire result is a data frame containing all possible expressions and their associated value obtained when evaluating the expression. ### Permutations of the base numbers We let the variable `base_numbers` contain the specification of the numbers to permute for the expression. The code should be written general enough so it is possible to use a different base, e.g., $k=3$ or $k=5$. ```{r, echo=TRUE} base_numbers <- c(8,8,3,3) k <- length(base_numbers) number_perm <- combinat::permn(base_numbers) %>% map(setNames, nm=letters[seq_len(k)]) ##Slim in case permutations of the base numbers contain duplicates. perm <- number_perm[!duplicated(map(number_perm, paste0, collapse=""))] ``` For $k=4$ the first step yields a total `r nCombinations(k=4)` combinations. However, since the numbers 8 and 3 both appear more than once in the base numbers, we can slim the number of permutations from `r length(combinat::permn(base_numbers))` to `r length(combinat::permn(base_numbers))/2/2`. Hence, there are altogether only `r nCombinations(k=4)/4` combinations to investigate. ### Combinations of the operators The next step is to make all combinations of the $k-1$ binary operators needed to combine the $k$ numbers. We use the string format to represent the operators [^3] and thus just need the $k-1$'th Cartesian product of the set $\{+, -, *, /\}$ represented as strings. ```{r, echo=TRUE} opList <- list("+", "-", "*", "/") ##Repeat the opList k-1 times opsList <- map( seq_len(k-1), ~ opList) ##Form the Cartesian product operators <- cross(opsList) %>% map( setNames, nm=paste0("op",seq_len(k-1))) ``` ### Arrangements of the parentheses As all the involved operators are binary it becomes clear that finding all possible ways to parenthesize the expression corresponds to finding all [binary trees](https://en.wikipedia.org/wiki/Binary_tree) with $k-1$ leaves. Beautiful recursive code inspiration for how to solve this can be found on [leetcode.com](https://leetcode.com/problems/all-possible-full-binary-trees/solution/). Some adaptation to R and our problem at hand was necessary - the idea is to use recursion in $k$ and use a hash-map to cache results of previous computations. ```{r, echo=TRUE} ##Initialize hashmap to save the results of all binary trees up to n=1 leaves trees <- list() trees[["0"]] <- NULL trees[["1"]] <- list(list(val="node", left=NULL, right=NULL)) ``` The rather elegant **recursive solution** for generating all binary trees with $n$ leaves works by combining all possible ways to generate subbranches containing $x$ and $n-x$ leaves, respectively: ```{r, echo=TRUE} allBinTrees <- function(n) { ##Character version of n, which is used as hash key n_char <- as.character(n) ##Only compute something if n is not already in the hashlist. if (is.null(pluck(trees, n_char))) { trees[[n_char]] <<- list() ##Combine all possible ways to generate bintrees with $i$ and $n-i$ leaves for (i in 1:(n-1)) { j = n - i for (left_tree in allBinTrees(i)) { for (right_tree in allBinTrees(j)) { trees[[n_char]][[length(trees[[n_char]]) + 1]] <<- list(val=NULL, left=left_tree, right=right_tree) } } } } #end if not already in tree list ##Return result from our hashmap return(pluck(trees, n_char)) } ``` We can test the function for $n=2$, which yields exactly one tree: ```{r TREE2STRING} ##Helper function to print a binary tree (recursive edition) tree2String <- function(tree) { if (is.character(tree$val)) return(tree$val) paste0("(", tree2String(tree$left), " op " , tree2String(tree$right), ")") } ``` ```{r OPNUMHELP} ##Convention: Number the operators from left to right. We do the search ##and replace recursively. Any clever way to do this as a regexp? addOpNumbers <- function(str, i=1) { if (!grepl(" op ", str)) return(str) ##Replace one "op" addOpNumbers( str=sub(" op ", paste0(" op",i," "), str), i=i+1) } ##Convert the "node" placeholders into the variables a, b, c, ... ##Convention: Name the numbers from left to right by "a", "b", "c", ... replaceNodes <- function(str, i=1) { if (!grepl("node", str)) return(str) ##Replace one "node" replaceNodes( str=sub("node", letters[i], str), i=i+1) } ``` ```{r TESTALLBINTREESN2, echo=TRUE} ##Manual construction trees2 <- list(list(val=NULL, left=trees[["1"]][[1]], right=trees[["1"]][[1]])) all.equal(allBinTrees(n=2), trees2) ``` The result is: ```{r TESTRESULT, echo=TRUE} tree2String(allBinTrees(n=2)[[1]]) %>% replaceNodes() %>% addOpNumbers ``` In the above code segments the function `tree2String` is a small helper function to convert the nested list structure to a string - in this case: `` `r tree2String(allBinTrees(n=2)[[1]])` ``. Furthermore, the function `replaceNodes` renames the terms `node` into the variables `` `r tree2String(allBinTrees(n=2)[[1]]) %>% replaceNodes()` ``. The `op`-strings are converted into numbered `op`-strings using `addOpNumbers`, i.e. the result becomes `` `r tree2String(allBinTrees(n=2)[[1]]) %>% replaceNodes() %>% addOpNumbers` ``. Details about the helper functions can be found in the [code](`r paste0("https://raw.githubusercontent.com/hoehleatsu/hoehleatsu.github.io/master/_source/",current_input())`) on github. With all preparations in place we can now generate all `r catalan(3)` possible ways to parenthesize the 3 binary operations using the following code: ```{r, echo=TRUE} ##Make all possible brackets bracketing <- map_chr( allBinTrees(n=k), ~ tree2String(.x) %>% addOpNumbers %>% replaceNodes) ``` ```{r} bracketing ``` ### Putting it all together We can now generate all combinations of numbers, operators and bracketing by the Cartesian of the three lists: ```{r, echo=TRUE} combos <- cross3( perm, map( operators, unlist), bracketing) %>% map(setNames, c("numbers", "operators", "bracket")) ``` ```{r} ##Even more general helper function for numbers as well as operators replace <- function(str, what) { if (length(what) == 0) return(str) replace( str=gsub(names(what)[1], what[1], str), what=what[-1]) } ``` We can now finally evaluate each of the `r length(combos)` combinations. Note: Because this might take a while it's a good idea to add a [progress bar](https://github.com/tidyverse/purrr/issues/149#issuecomment-359236625) for this `purrr` computation. ```{r, echo=TRUE} ##Set up a progress bar for use with the map function pb <- progress_estimated(length(combos)) ##Compute res <- map(combos, .f=function(l) { pb$tick()$print() l[["expr"]] <- l[["bracket"]] %>% replace(l[["numbers"]]) %>% replace(l[["operators"]]) l[["value"]] <- eval(parse(text=l[["expr"]])) return(l) }) ``` Again, `replace(v)` is a small helper function to replace the strings in `names(v)` with `v`'s content. The actual evaluation of each possible solution string is done by parsing the string with `parse` and then evaluate the resulting expression. We extract the relevant results into a `data.frame` ```{r CONVERTTODF, warning=FALSE, echo=TRUE} df <- map_df(res, ~ data.frame(expr=.x$expr, value=.x$value)) ``` ```{r DATATABLE} DT::datatable(df) ``` We can now easily extract the solution: ```{r EXTRACT24, echo=TRUE} ##First element to give the value 24 detect(res, ~ isTRUE(all.equal(.x$value, 24))) ``` Voila! QED! ```{r} solveMathPuzzle <- function(base_numbers, expr_result, operatorList) { ##Variables k <- length(base_numbers) ##Permuations of the base numbers perm <- combinat::permn(base_numbers) %>% map(setNames, nm=letters[seq_len(k)]) ##Slim it? perm <- perm[!duplicated(map(perm, paste0, collapse=""))] ##Make all combinations of the operators opsList <- map( seq_len(k-1), function(.x) operatorList) operators <- cross(opsList) %>% map( setNames, nm=paste0("op",seq_len(k-1))) ##Make all possible brackets bracketing <- map_chr( allBinTrees(n=k), ~ tree2String(.x) %>% addOpNumbers %>% replaceNodes) ##All combinations of the numbers, the order and the bracketing. ##Depending on the combinations this might take a while... combos <- cross3( perm, map( operators, unlist), bracketing) %>% map(setNames, c("numbers", "operators", "bracket")) ##Compute value of all combinations (with progress bar) ##Set up a progress bar for use with the map function pb <- progress_estimated(length(combos)) ##Compute res <- map(combos, .f=function(l) { pb$tick()$print() l[["expr"]] <- l[["bracket"]] %>% replace(l[["numbers"]]) %>% replace(l[["operators"]]) l[["value"]] <- eval(parse(text=l[["expr"]])) return(l) }) ##Convert results to a data.frame pb <- progress_estimated(length(res)) df <- map_df(res, ~ { pb$tick()$print() ; data.frame(expr=.x$expr, value=.x$value)}) ##Match 24 is_zero <- function(x) isTRUE(all.equal(x, 0)) ##Only those with nice integers results df_int <- df %>% mutate(rounded = round(value, digits=0), diff=value - rounded) %>% rowwise %>% filter(is_zero(diff)) res <- list(combos=df_int %>% select(expr, value), expr=df_int %>% filter(rounded==expr_result) %$% expr) return(res) } ``` #### Extended New Years Fun For user experimentation we wrapped all the above steps into one function `solveMathPuzzle` (see [github code](`r paste0("https://raw.githubusercontent.com/hoehleatsu/hoehleatsu.github.io/master/_source/",current_input())`) for details). To underline the generalizability of the approach we solve a classical 2019 new-year's puzzle: ```{r COMPUTE2019, cache=TRUE, echo=TRUE} res <- suppressWarnings(solveMathPuzzle( base_numbers=c(7,7,11,11,43,43), expr_result=2019, operatorList=c("+","*"))) res$expr[[1]] ``` ## Shiny App To make the above solution accessible to a wider audience we wrote a small Shiny app to play with the code for $k=4$:

http://michaelhoehle.eu/shiny/mathgenius/

Here one can alter the input numbers in case variants of the puzzle are in need of a solution or, if you occasionally need to generate math puzzles for your nephew...

```{r,results='asis',echo=FALSE,fig.cap=""} cat(paste0("

\n")) ``` Besides possible solutions one can view the result of all possible combinations yielding integer results in the "Details" tab. We invite you to experiment with the app or download the [source code of the Shiny app](`r paste0("https://raw.githubusercontent.com/hoehleatsu/hoehleatsu.github.io/master/figure/source/",gsub("\\.Rmd","",current_input()),"/mathgenius/app.R")`) from github for the full math experience. `r emo::ji("smiley")` As always: it's amazing how easy you can wrap a interactive web based UI around your running R code with Shiny! ## Discussion We used a brute force solution approach by trying out all possible combinations to solve the math puzzle. The code of our solution approach is flexible enough to handle more or less base numbers, however, the number of combinations to try quickly exceeds reasonable memory and timing constraints. We stress that **a mathematical purr does not need speed, it lives from the beauty of recursion and mappings**! Clever mathematicians might be able to achieve considerable speed gains by exploiting for example commutative properties of the operators whereas skilled computer scientists would parallelise the computations. [^1]: Note: The term $\frac{1}{k} \binom{2k-2}{k-1}$ is the so called [Catalan number](https://en.wikipedia.org/wiki/Catalan_number#Applications_in_combinatorics), which - among other applications - also denotes the number of ways to parenthesize $k-1$ binary operations. [^2]: Actually, the package more or less adds a lot of convenience wrapping for functional programming in R, the functional programming approach is rather deeply rooted in R due to the [S language being inspired by Scheme](https://web.archive.org/web/20140414081950/https://daspringate.github.io/posts/2013/05/Functional_programming_in_R.html). [^3]: A purer functional approach would have been to use the function definition of the operators directly, i.e. to define `operatorList` with elements such as `` `+`(e1, e2) `` and then use these functions to build the parse tree as an expression. The disadvantage of such an approach is that the expressions become more cumbersome to write. For example `(5 + 3 + 2) * 4` as `` `*`( `+`( `+`(5, 3), 2), 4) ``. ## Literature