#### R

```{r}
model <- keras_model_sequential()
model %>%
  layer_dense(units = 256, activation = 'relu', input_shape = c(784)) %>%
  layer_dropout(rate = 0.4) %>%
  layer_dense(units = 128, activation = 'relu') %>%
  layer_dropout(rate = 0.3) %>%
  layer_dense(units = 10, activation = 'softmax')
summary(model)
```

:::

Compile the model with an appropriate loss function, optimizer, and metrics:

::: {.panel-tabset}

#### Python

```{python}
model.compile(
  loss = "categorical_crossentropy",
  optimizer = "rmsprop",
  metrics = ["accuracy"]
)
```

#### R

```{r}
model %>% compile(
  loss = 'categorical_crossentropy',
  optimizer = optimizer_rmsprop(),
  metrics = c('accuracy')
)
```

:::

## Training and validation

Train the model with an 80%/20% split of the training data into training and validation sets (`validation_split = 0.2`).

::: {.panel-tabset}

#### Python

```{python}
#| output: false
batch_size = 128
epochs = 30

history = model.fit(
  x_train,
  y_train,
  batch_size = batch_size,
  epochs = epochs,
  validation_split = 0.2
)
```

Plot training history:

```{python}
#| code-fold: true
hist = pd.DataFrame(history.history)
hist['epoch'] = np.arange(1, epochs + 1)
hist = hist.melt(
  id_vars = ['epoch'],
  value_vars = ['loss', 'accuracy', 'val_loss', 'val_accuracy'],
  var_name = 'type',
  value_name = 'value'
)
hist['split'] = np.where(['val' in s for s in hist['type']], 'validation', 'train')
hist['metric'] = np.where(['loss' in s for s in hist['type']], 'loss', 'accuracy')

# Accuracy trace plot
plt.figure()
sns.relplot(
  data = hist[hist['metric'] == 'accuracy'],
  kind = 'scatter',
  x = 'epoch',
  y = 'value',
  hue = 'split'
).set(
  xlabel = 'Epoch',
  ylabel = 'Accuracy'
);
plt.show()

# Loss trace plot
plt.figure()
sns.relplot(
  data = hist[hist['metric'] == 'loss'],
  kind = 'scatter',
  x = 'epoch',
  y = 'value',
  hue = 'split'
).set(
  xlabel = 'Epoch',
  ylabel = 'Loss'
);
plt.show()
```

#### R

```{r}
system.time({
  history <- model %>% fit(
    x_train, y_train,
    epochs = 30, batch_size = 128,
    validation_split = 0.2
  )
})
plot(history)
```

:::

## Testing

Evaluate model performance on the test data:

::: {.panel-tabset}

#### Python

```{python}
score = model.evaluate(x_test, y_test, verbose = 0)
print("Test loss:", score[0])
print("Test accuracy:", score[1])
```

#### R

```{r}
model %>% evaluate(x_test, y_test)
```

Generate predictions on new data:

```{r}
model %>% predict(x_test) %>% k_argmax()
```

:::

## Exercise

Suppose we want to fit a multinomial-logit model and use it as a baseline for the neural network. How can we do that? We could use `mlogit` or another package. Alternatively, we can fit the same model with keras, since multinomial logit is just an MLP with (1) one input layer with linear activation and (2) one output layer with a softmax link function.

::: {.panel-tabset}

#### Python

#### R

```{r}
# set up model
library(keras)
mlogit <- keras_model_sequential()
mlogit %>%
#  layer_dense(units = 256, activation = 'linear', input_shape = c(784)) %>%
#  layer_dropout(rate = 0.4) %>%
  layer_dense(units = 10, activation = 'softmax', input_shape = c(784))
summary(mlogit)
```

```{r}
# compile model
mlogit %>% compile(
  loss = 'categorical_crossentropy',
  optimizer = optimizer_rmsprop(),
  metrics = c('accuracy')
)
mlogit
```

```{r}
# fit model
mlogit_history <- mlogit %>% fit(
  x_train, y_train,
  epochs = 20, batch_size = 128,
  validation_split = 0.2
)
```

```{r}
# Evaluate model performance on the test data:
mlogit %>% evaluate(x_test, y_test)
```

Generate predictions on new data:

```{r}
mlogit %>% predict(x_test) %>% k_argmax()
```

:::

Experiment: Change the `linear` activation to `relu` in the multinomial-logit model (the commented-out `layer_dense()` line in the R code) and see how the classification accuracy changes.
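To make the link between a single softmax layer and multinomial logit concrete, here is a minimal NumPy sketch that fits the same kind of model by full-batch gradient descent on the categorical cross-entropy. It is not the tutorial's keras code: the synthetic three-class data, the helper names `softmax` and `fit_softmax_regression`, and the learning rate and epoch count are all illustrative assumptions.

```python
import numpy as np

def softmax(z):
    # Subtract the row-wise maximum for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_softmax_regression(x, y_onehot, lr=0.1, epochs=200):
    """Hypothetical helper: gradient descent on mean categorical cross-entropy."""
    n, p = x.shape
    k = y_onehot.shape[1]
    w = np.zeros((p, k))  # one weight column per class
    b = np.zeros(k)       # one intercept per class
    for _ in range(epochs):
        prob = softmax(x @ w + b)
        grad_logits = (prob - y_onehot) / n  # gradient w.r.t. the linear predictor
        w -= lr * (x.T @ grad_logits)
        b -= lr * grad_logits.sum(axis=0)
    return w, b

# Synthetic, well-separated three-class data (an assumed stand-in for MNIST).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 3.0], [3.0, -3.0], [-3.0, -3.0]])
labels = rng.integers(0, 3, size=600)
x = centers[labels] + rng.normal(size=(600, 2))
y_onehot = np.eye(3)[labels]

w, b = fit_softmax_regression(x, y_onehot)
pred = softmax(x @ w + b).argmax(axis=1)
accuracy = (pred == labels).mean()

print("number of parameters:", w.size + b.size)
print("training accuracy:", accuracy)
```

The model has one weight per (input, class) pair plus one intercept per class, which is the same parameter layout as a single dense softmax layer. With 784 inputs and 10 classes that count is 784 × 10 + 10 = 7850.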