---
layout: post
title: "Tupper's Self Referential Formula in R"
tags: [rstats, stats, obesity, nonparametrics]
comments: yes
editor_options:
  chunk_output_type: console
bibliography: /Users/hoehle/Literature/Bibtex/jabref.bib
---

```{r,include=FALSE,echo=FALSE,message=FALSE}
##If default fig.path, then set it.
if (knitr::opts_chunk$get("fig.path") == "figure/") {
  knitr::opts_knit$set( base.dir = '/Users/hoehle/Sandbox/Blog/')
  knitr::opts_chunk$set(fig.path="figure/source/2022-10-15-tupper/")
}
fullFigPath <- paste0(knitr::opts_knit$get("base.dir"),knitr::opts_chunk$get("fig.path"))
filePath <- file.path("","Users","hoehle","Sandbox", "Blog", "figure", "source", "2022-10-15-tupper")

knitr::opts_chunk$set(echo = FALSE, fig.width=8, fig.height=4, fig.cap='', fig.align='center', dpi=72*2)#, global.par = TRUE)
options(width=150, scipen=1e3)

suppressPackageStartupMessages(library(tidyverse))
suppressPackageStartupMessages(library(knitr))
suppressPackageStartupMessages(library(stringi))

# Non CRAN packages
# devtools::install_github("hadley/emo")

##Configuration
options(knitr.table.format = "html")
theme_set(theme_minimal())
#if there are more than n rows in the tibble, print only the first m rows.
options(tibble.print_max = 10, tibble.print_min = 5)

# Fix seed value for the post
set.seed(20220610)
```

## Abstract:

We implement Tupper's self-referencing formula in R. This has been done before by others, but the joy was to be able to learn how to do it yourself using the tidyverse.

```{r,results='asis',echo=FALSE}
cat(paste0("![]({{ site.baseurl }}/",knitr::opts_chunk$get("fig.path"),"plot_tupper-1.png"),")")
```

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. The [R-markdown source code](`r paste0("https://raw.githubusercontent.com/hoehleatsu/hoehleatsu.github.io/master/_source/",current_input())`) of this blog is available under a [GNU General Public License (GPL v3)](https://www.gnu.org/licenses/gpl-3.0.html) license from GitHub.

## Introduction

[Tupper's self-referential formula](https://en.wikipedia.org/wiki/Tupper%27s_self-referential_formula) [@tupper2001] is an equation which maps a 2D $(x,y)$ coordinate to a value in $\{\text{FALSE},\text{TRUE}\}$. If $(x,y)$ represent pixel locations, the output over a grid of values can be thought of as a black & white image where the true/false values are mapped to $\{0,1\}$ in the usual way. Tupper's formula is $f(x,y) =$

$$
\frac{1}{2} < \left\lfloor \operatorname{mod}\left( \left\lfloor\frac{y}{17}\right\rfloor\cdot 2^{-17\lfloor x \rfloor - \operatorname{mod}(\lfloor y \rfloor, 17)}, 2\right)\right\rfloor.
$$

We note that if one evaluates the function for all integers $(x,y)$ with $0 \leq x \leq 105$ and $k\leq y\leq k+16$, where $k$ is a fixed constant, then one gets a binary image with 106x17 pixels. The entire magic of Tupper's formula is that it just unpacks an encoding of that 106x17 grid: the grid is represented as a `r 106*17`-digit binary number, which is then converted from base-2 into base-10. For some (not so obvious) reason this number is then multiplied by 17 to yield the final value of $k$ (in base-10)[^1]. We also note that since the right-hand side of Tupper's equation will always be 0 or 1, the comparison with $\frac{1}{2}$ appears superfluous and seems to be just a way to get a Boolean instead of a 0/1. Furthermore, since we will be using only integer values for $x$ and $y$, the floor operators around $x$ and $y$ are not really needed either.

### More Background

Initially, I learned about the formula from a Twitter post:

> Tupper's self-referential formula is a formula that visually represents itself when graphed at a specific location in the (x, y) plane. pic.twitter.com/wAUVahJ9Dq
>
> — Fermat's Library (@fermatslibrary) February 4, 2018

A nice Numberphile video has also been dedicated to the formula.

## R Implementation

```{r, message=FALSE}
<>
```

One challenge of implementing Tupper's formula in R is that $k$ will be a very large integer (~`r round(str_length(as.character(k)), digits=-2)` digits). Hence, one needs a special-purpose library to handle these large numbers. StaTEAstics in their 2013 [R blog post on Tupper's formula](https://www.r-bloggers.com/2013/03/tuppers-self-referential-formula/) use the GNU Multiple Precision Arithmetic Library for this purpose, which is interfaced in the [`gmp` R package](https://cran.r-project.org/web/packages/gmp/index.html). We follow their implementation:

```{r gmp_and_k, echo=TRUE, results="hide", message=FALSE}
## GNU Multiple Precision Arithmetic Library for handling the long integers
library(gmp)
## Define the constant k
k <- as.bigz("960939379918958884971672962127852754715004339660129306651505519271702802395266424689642842174350718121267153782770623355993237280874144307891325963941337723487857735749823926629715517173716995165232890538221612403238855866184013235585136048828693337902491454229288667081096184496091705183454067827731551705405381627380967602565625016981482083418783163849115590225610003652351370343874461848378737238198224849863465033159410054974700593138339226497249461751545728366702369745461014655997933798537483143786841806593422227898388722980000748404719")
```

Thus an R function implementing Tupper's formula is:

```{r, echo=TRUE}
tupper <- function(x, y, k) {
  ## Shift y by k so that callers can pass y = 0, ..., 16
  z1 <- as.bigz(y + k)
  ## floor(y/17) as a rational number
  z2 <- as.bigq(floor(z1/17))
  ## 2^(-17*floor(x) - mod(floor(y), 17))
  z3 <- 2^(-17 * floor(x) - as.bigz(floor(z1) %% 17))
  ## Compare with 1/2 to get a Boolean
  return(0.5 < floor(as.bigz(z2 * z3) %% 2))
}
```

Here we have used $k$ explicitly in order to have $y$ run from 0 to 16, which is easier for the subsequent plotting.
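
As an aside, for integer $x$ and $y$ the right-hand side of the formula is nothing but the bit at position $17x + \operatorname{mod}(y, 17)$ of the integer $\lfloor y/17 \rfloor$. The following is a minimal sketch of this reading, not the implementation used for the plots in this post. The helper name `tupper_bit` is made up for illustration, and the sketch assumes non-negative integer inputs and an installed `gmp` package.

```r
library(gmp)

## Hypothetical helper (not part of the original post): integer-only
## evaluation of Tupper's formula for integers x >= 0 and y >= 0.
tupper_bit <- function(x, y, k) {
  n <- as.bigz(k) + y   # absolute y coordinate, i.e. y + k
  q <- n %/% 17         # floor(n / 17): the encoded bitmap
  r <- n %% 17          # mod(n, 17): row within the 17 pixel strip
  ## Move the wanted bit to the lowest position and test it
  ((q %/% as.bigz(2)^(17 * x + r)) %% 2) == 1
}

## Toy check with k = 17 * 5, where 5 is 101 in base-2
sapply(0:2, function(y) tupper_bit(0, y, k = 85))
```

For the toy value only the pixels at $y=0$ and $y=2$ should come out as set, matching the two 1-bits of 101. Staying with integers avoids the rational arithmetic altogether, but the rational version above is closer to how the formula is written.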

Applying the `tupper` function to an appropriate grid of values (and reversing the index directions to account for the horizontal plotting direction):

```{r compute_tupper, echo=TRUE}
im <- expand_grid(x=0:105L, y=0:16L) %>%
  rowwise() %>%
  mutate(z=tupper(105-x, 16-y, k=k))
```

The result can then easily be plotted:

```{r plot_tupper, echo=TRUE, fig.width=10.6, fig.height=1.7}
plot_tupper <- function(im, palette=c("darkblue", "lightgray")) {
  ggplot(data=im, aes(x=x,y=y,fill=as.factor(z))) +
    geom_tile() +
    scale_fill_manual(values=palette) +
    theme_void() +
    theme(legend.position="none") +
    coord_equal()
}
plot_tupper(im)
```

## Behind the Scenes

To check the underlying binary representation we convert $k/17$ to base-2 notation. Note that the multiplication by 17 (`r as.character(as.bigz(17), b=2)` in base-2) not only ensures that taking $k$ modulo the height of the image starts at zero, but it is also helpful to keep possible trailing zeroes in the encoding of the image. Since we know the image size has to be $17\times 106=1802$, we simply pad the base-2 representation of $k/17$ with leading zeroes if it is shorter than 1802 digits.

Convert to a base-2 number of length 1802 and visualize the number:

```{r, echo=TRUE,comment=NA }
char <- as.character(k/17, b=2)
## Pad with leading zeroes, which are missing because the first two pixels are 0.
char <- str_c(str_c(rep("0",17*106 - str_length(char)), collapse=""), char)
cat(char)
str_length(char)
```

Split into chunks of 17, with 0 replaced by " " and 1 by "█" for better plotting:

```{r, echo=TRUE,comment=NA }
char_split <- stri_sub(char, seq(1, stringi::stri_length(char),by=17), length=17)
plot_string <- str_c(char_split, collapse="\n") %>%
  str_replace_all("0", " ") %>%
  str_replace_all("1", "█")
cat(plot_string)
```

or, somewhat more visibly, as a plot:

```{r, echo=TRUE,comment=NA }
## Convert to image data.frame
im2 <- expand_grid(x=0:105, y=0:16)
im2 <- im2 %>%
  mutate(idx = y + (x*17) + 1) %>%
  rowwise() %>%
  mutate(value_str = str_sub(char, start=idx, end=idx),
         value = as.numeric(value_str))
```

```{r show_pixels, fig.width=10.6, fig.height=3, fig.align="center"}
ggplot(im2, aes(x=x, y=y, fill=as.factor(value))) +
  geom_tile() +
  theme_minimal() +
  scale_fill_manual(values=c("gray90", "gray50")) +
  scale_x_continuous(expand=c(0,0)) +
  geom_text(aes(label=value), size=2.5) +
  theme(legend.position="none") +
  coord_equal()
```

We find the above binary number by starting in the (0,0) cell and reading upwards in the $x=0$ column.

## Discussion

Since the entire "magic" of Tupper's formula lies in $k$, this [site](https://keelyhill.github.io/tuppers-formula) is useful: it lets you upload a raw 106x17 image and returns the corresponding $k$ for use in the formula. With [Gimp](https://www.gimp.org/)'s PBM export functionality it's thus easy to make your own plot, e.g.
with $k$ equal to

```{r, comment=NA}
#Created with gimp and https://keelyhill.github.io/tuppers-formula
k <- as.bigz("186884211780601757089521467754254266534847988959618908270134320886923032590936706609566110951773945064529540811157829398942842590351995031478543240582993263095682288889081666401727057238884719133521833705371096422637085577259001963761107220646739852199923964701689237214047197937015515747842387117086366819859986916183575585602891273928856883765838042528273754853751383296206633974324557163987001300322007312244691824532706662875082651525203923748809153375012301876787226286483554151163460581654755346590825663755194466304")
cat(as.character(k))
```

we get

```{r discussion_plot, fig.width=10.6, fig.height=1.7}
im <- expand_grid(x=0:105L, y=0:16L) %>%
  rowwise() %>%
  mutate(z=tupper(105-x, 16-y, k=k))
plot_tupper(im, palette=c("white", "black"))
```

Happy self-referential plotting!

## Literature

[^1]: For details about why the multiplication with the height is done, see Arvind Narayanan's post [Tupper’s Self-Referential Formula Debunked](https://web.archive.org/web/20150424181239/http://arvindn.livejournal.com/132943.html).