---
title: "MNIST - MLP"
author: "Dr. Hua Zhou"
date: "3/15/2018"
output: 
  html_document:
    toc: true
    toc_depth: 4
---

```{r setup, include=FALSE}
#options(width = 1000)
knitr::opts_chunk$set(echo = TRUE)
```

```{r}
sessionInfo()
```

Source: <https://tensorflow.rstudio.com/keras/articles/examples/mnist_mlp.html>

In this example, we train an MLP (multi-layer perceptron) on the [MNIST](https://en.wikipedia.org/wiki/MNIST_database) data set. Achieve testing accuracy 98.11% after 30 epochs.

- The **MNIST** database (Modified National Institute of Standards and Technology database) is a large database of handwritten digits ($28 \times 28$) that is commonly used for training and testing machine learning algorithms.

- 60,000 training images, 10,000 testing images. 

### Prepare data

Aquire data:
```{r}
library(keras)
mnist <- dataset_mnist()
x_train <- mnist$train$x
y_train <- mnist$train$y
x_test <- mnist$test$x
y_test <- mnist$test$y
```
Training set:
```{r}
dim(x_train)
dim(y_train)
x_train[1, , ]
y_train[1]
```
Testing set:
```{r}
dim(x_test)
dim(y_test)
```
Vectorize $28 \times 28$ images into $784$-vectors and scale entries to [0, 1]:
```{r}
# reshape
x_train <- array_reshape(x_train, c(nrow(x_train), 784))
x_test <- array_reshape(x_test, c(nrow(x_test), 784))
# rescale
x_train <- x_train / 255
x_test <- x_test / 255
dim(x_train)
dim(x_test)
```
Encode $y$ as binary class matrix:
```{r}
y_train <- to_categorical(y_train, 10)
y_test <- to_categorical(y_test, 10)
dim(y_train)
dim(y_test)
```

### Define the model

Define a **sequential model** (a linear stack of layers) with 2 fully-connected hidden layers (256 and 128 neurons):
```{r}
model <- keras_model_sequential() 
model %>% 
  layer_dense(units = 256, activation = 'relu', input_shape = c(784)) %>% 
  layer_dropout(rate = 0.4) %>% 
  layer_dense(units = 128, activation = 'relu') %>%
  layer_dropout(rate = 0.3) %>%
  layer_dense(units = 10, activation = 'softmax')
summary(model)
```

Compile the model with appropriate loss function, optimizer, and metrics:
```{r}
model %>% compile(
  loss = 'categorical_crossentropy',
  optimizer = optimizer_rmsprop(),
  metrics = c('accuracy')
)
```

### Training and validation

```{r}
system.time({
history <- model %>% fit(
  x_train, y_train, 
  epochs = 30, batch_size = 128, 
  validation_split = 0.2
)
})
plot(history)
```

### Testing

Evaluate model performance on the test data:
```{r}
model %>% evaluate(x_test, y_test)
```
Generate predictions on new data:
```{r}
model %>% predict_classes(x_test) %>% head()
```