--- title: "Classification" author: "Your name here!" date: "02/24/2025" --- **Abstract:** This is a technical blog post of **both** an HTML file *and* [.qmd file](https://raw.githubusercontent.com/cd-public/D505/refs/heads/master/hws/src/classify.qmd) hosted on GitHub pages. # 0. Quarto Type-setting - This document is rendered with Quarto, and configured to embed an images using the `embed-resources` option in the header. - If you wish to use a similar header, here's is the format specification for this document: ```email format: html: embed-resources: true ``` # 1. Setup **Step Up Code:** ```{.r} sh <- suppressPackageStartupMessages sh(library(tidyverse)) sh(library(caret)) sh(library(naivebayes)) wine <- readRDS(gzcon(url("https://github.com/cd-public/D505/raw/master/dat/pinot.rds"))) ``` # 2. Logistic Concepts Why do we call it Logistic Regression even though we are using the technique for classification? > TODO: *Explain.* # 3. Modeling We train a logistic regression algorithm to classify a whether a wine comes from Marlborough using: 1. An 80-20 train-test split. 2. Three features engineered from the description 3. 5-fold cross validation. We report Kappa after using the model to predict provinces in the holdout sample. ```{.r} # TODO ``` # 4. Binary vs Other Classification What is the difference between determining some form of classification through logistic regression versus methods like $K$-NN and Naive Bayes which performed classifications. > TODO: *Explain.* # 5. ROC Curves We can display an ROC for the model to explain your model's quality. ```{.r} # You can find a tutorial on ROC curves here: https://towardsdatascience.com/understanding-the-roc-curve-and-auc-dd4f9a192ecb/ ``` > TODO: *Explain.*