`when`

, `how_long`

, `diff`

, and `weight`

correspond to the date, the time between measurements, the difference in kg, and the mean kg, respectively. The numbers in the upper-right panels are the Pearson correlations."}
psych::pairs.panels(poo_df)
```
Nothing too weird, none of the correlations are really that striking.
### Pizzles vs. deuces
So let's see how the number ones compare to the number twos. Before running this experiment, I expected that the average doodoo would outweigh the average piddle. Although I'm pretty sure more liquid weight makes its way through our bodies each day, I generally micturate more than I defecate. And since the way I measured a number two _included_ the accompanying number one, I thought this hypothesis would be a slam dunk.
```{r, fig.cap="Boxplot of weight of my recorded BMs by type. The thick dark lines represent the medians and the red lines represent the means."}
combo_df <- bind_rows("number one" = pee_df,
"number two" = poo_df,
.id = "Type") %>%
filter(diff >= 0) %>%
group_by(Type)
stat_combo_df <- combo_df %>%
summarise(median=median(diff),
mean = mean(diff)) %>%
tidyr::gather(stat,value,median:mean)
combo_df %>%
ggplot(aes(y = diff, x = Type)) +
# geom_violin() +
geom_boxplot( outlier.alpha=0) +
geom_point(position = position_jitter(0.1)) +
ggtitle("How heavy are my toilet trips?") +
stat_summary(fun.data = function(x) data.frame(y = mean(x), ymax = mean(x), ymin = mean(x)),
geom="errorbar", color="red",
width=0.75) +
kg_lb_scale("weight change") +
xlab("bowel movement type") +
ggrepel::geom_label_repel(aes(label = paste(round(value, 3), "kg"),
y = value,
color = stat),
data = stat_combo_df,
min.segment.length=0,
force = 500,
box.padding = 4,
max.iter = 10000,
seed=2,
size = geom_text_size) +
scale_color_manual(guide=FALSE,
values=c("red","black"))
```
Surprisingly, my recorded tinkles outweigh my recorded dookies in both mean _and_ median! This difference isn't statistically significant, but it is numerically intriguing.
Thinking about it, I have a few potential explanations.
First, it might be true. Water (essentially what urine is) by volume is pretty heavy. Try picking up a bucket of water, it's heavier than you might think. Maybe a full bladder is heavier than a full lower intestine. What makes this theory pretty weak is that, as we have all seen, feces most often sink to the bottom of the toilet bowl---they must be denser than water/urine.
More likely, it could be a problem with how I _collected_ the data: I have the feeling that I was more likely to remember to weigh myself when I had to pee really badly, but had an easier time remembering to weigh myself before any poop. With a poop on its way, I would know that I might have to 'settle' in for a longer toilet visit, so I would take the time to measure beforehand.
To solve this last issue, I could be *incredibly* diligent about weighing myself every time I use the restroom. I'd have to stay home for a while, but it could be possible to pull off. If _every_ bowel movement is measured, this bias should decrease. However, this line of reasoning led me to a different thought.
### Maybe the wrong question?
The frequency with which we poop is not (fully) determined by how much we eat: many studies say that what is healthy ranges from [three times a day to three times a week](https://www.vice.com/en_us/article/j54xv7/how-often-should-you-poop).[^6] In my layman's opinion, this frequency is much more stable than the frequency of urination, which I feel is more determined by things such as how busy we are, etc. And given that the weight per widdle is a function of consumption by frequency, does talking about "individual" urinations make that much sense?
Sure, the frequency of these bowel movements could be interesting in its own right, but I would have to be much more diligent about recording all of them. This just comes back to the second issue I brought up. I approached this question with the lazy understanding that I could just record whichever bowel movements I happened to remember about, and that these ~~stool~~ samples would be representative of an "average" bowel movement. But in reality, thinking about an "average" BM isn't that useful, especially for urination. Overall "output" rates and frequency seem much more applicable.
## Question 3: Lots of questions
Time for a lightning round!
### Question 3.1: Does a shower make me lighter?
I'm pretty sure scientists say that you lose dead skin cells, hair, grime, etc. in the process of showering. But is that change noticeable? I recently started weighing myself before and after a shower to see.
Below are the changes in weight, before-and-after, in pounds:
```{r}
shower_df %>%
arrange(diff) %>%
pluck("diff") %>% {.*KG2LB} %>% round(2)
```
**Answer: Inconclusive.** While the mean is slightly positive, there seems to be enough noise in the data that I can't tell as of now!
### Question 3.2: Longer bathroom breaks = more poo/pee?
Do I stay in the bathroom longer for bigger BMs?
```{r q32, fig.height = base_fig_height * 0.8 }
diff_dfs <- bind_rows("poo" = poo_df,
"pee" = pee_df %>% filter(diff > 0),
"sleep" = sleep_df,
"shower" = shower_df,
.id = "df_type") %>%
tidyr::nest(-df_type) %>%
mutate(p.value = data %>%
map_dbl(~cor.test(.$how_long, .$diff)$p.value),
cor = data %>%
map_dbl(~cor.test(.$how_long, .$diff)$estimate),
R2 = cor^2,
lm = data %>%
map(~lm(diff ~ how_long, data=.x) %>%
broom::tidy() %>% select(term, estimate)),
slope = map_dbl(lm, ~.[.$term=="how_long",]$estimate),
int = map_dbl(lm, ~.[.$term=="(Intercept)",]$estimate)) %>%
tidyr::unnest(data)
diff_dfs %>%
filter(df_type %in% c("poo","pee")) %>%
ggplot(aes(x = how_long, y = diff)) +
geom_point() +
# geom_abline(aes(intercept=int, slope=slope),
# color="red") +
geom_smooth(method="lm") +
facet_wrap(~df_type, scales="free") +
zplyr::geom_abs_label(ypos=0.75, xpos=0.7,
aes(label = paste0("p=", format.pval(p.value, digits = 2))),
size=geom_text_size) +
zplyr::geom_abs_label(ypos=0.85, xpos=0.7,
aes(label = paste0("R=", round(cor,2))),
size=geom_text_size) +
kg_lb_scale("BM weight") +
xlab("minutes spent on toilet")
```
**Answer: No.** In fact, it looks like I stay longer for `fx`

is set to `TRUE`

."}
all_dfs %>%
filter(coef != "shower") %>%
ggplot(aes(x=fake_k, y=abs(1-est/mean), color=coef, group=coef)) +
geom_point() +
geom_rect(ymin = -Inf, ymax=Inf,
xmin=-Inf, xmax=-15,
fill="grey90",
data = data.frame(type = "gamm()", g = 0),
inherit.aes = FALSE) +
geom_point() +
geom_line() +
facet_wrap(~type,scales="free_x") +
scale_size_continuous(guide=FALSE) +
# theme(axis.text.x = element_text(angle = 45,hjust=1)) +
scale_color_discrete("effect") +
scale_y_continuous("percent error\n(1-predicted/gt)",
labels = scales::percent_format(accuracy = 1)) +
ggtitle("How well do GAMs capture the effects?") +
scale_x_continuous(
breaks = c( -20, -10, -5, 0, 10,20,30,40,50),
labels = c("lm", "0","c","1","10","20","30","40","50")) +
xlab("'k' of the time smooth")
```
First, I should say that "ground truth" estimates for each effect are all within the 95% CI for each GAM. So in some sense, the answer seems to be "yes." However, as we see, we can get estimates that are ~40% off in some models, so it's worth being cautious.
Perhaps unsurprisingly, as the number of basis functions increases, the accuracy of the estimations seem to get better and better, up to a point.[^8]
However, in the figure above, I fixed the `k` term for each smooth, instead of letting `mgcv()` automatically determine the best number of basis functions with `k`-1 as the upper bound.
When I repeat what I did above, but let `mgcv` decide the optimal number of basis functions (represented below via the estimated degrees of freedom), we immediately see that there is *much* less 'thrashing' around of the estimated effect sizes.
```{r, fig.cap="How well the GAMs capture the effects versus the number of basis functions of the time smooth (represented via the estimated degrees of freedom for the smooth) as `k`

increases. Note how the `gamm()`

function tops out around 25."}
all_free_dfs %>%
filter(coef != "shower") %>%
filter(k %in% as.character(3:50)) %>%
mutate(`function`=type) %>%
ggplot(aes(x=edf, y=abs(1-est/mean), color=coef,
linetype=`function`)) +
geom_point() +
geom_line() +
scale_size_continuous(guide=FALSE) +
scale_color_discrete("effect") +
scale_y_continuous("percent error",
labels = scales::percent_format(accuracy = 1)) +
ggtitle("How well do GAMs capture the effects?",
"smooth parameters estimated automatically") +
xlab("e.d.f. of time smooth")
```
**Lesson learned: don't fix `k` unless you really know what you're doing!**
Interestingly though, it seems that `gamm()` is much more conservative with the amount of wiggliness it bestows the time spline---even when `k` is set to `50`, the edf barely goes above 25, while the edf of `gam()` go much higher. If anybody has any clue _why_, let me know!
## Source Code: > [`2020-01-20-weight_analysis_pt_3.Rmd`](https://raw.githubusercontent.com/burchill/burchill.github.io/master/_source/2020-01-20-weight_analysis_pt_3.Rmd) If you want to see how I did what I did in this post, check out the source code of the R Markdown file! ### Footnotes: [^1]: Eh, bad phrasing. [^2]: If you want access to the git repo for this project, drop me a line on Twitter and I'll add you as a collaborator. Otherwise, I'm keeping it private for the time being. ๐ [^3]: I want to brag about it so much. I've set up a subdomain to redirect me to the landing page of my local site: `scale.zachburchill.ml` (only available when you're connected to my home WiFi network). I also have a webhook system that will send push notifications directly to my phone. I feel so 133t. [^4]: See the issues in [my previous post]({{ site.baseurl }}{% post_url 2019-09-07-bluetooth_scale_intro %}). If not for these issues, I could get a _really_ sexy system going easy-peasy with some real-time dashboards and whatnot. [^5]: The important functions are defined in the `gathering functions` chunk and there's a commented example in the `group tags` chunk. [^6]: I'll go on record that I think both of those frequencies are _bizarre_, by the way. [^7]: I'm making a lot of assumptions here. [^8]: I originally had a 'time-of-day' spline in the GAMs, but it ended up eating up the variance of the sleep effect (since I have a somewhat regular bedtime), and led to very unstable estimates.