class: center, middle, inverse, title-slide .title[ # Topic 14: Classification ] .subtitle[ ## Part 2: Examples ] .author[ ### Nick Hagerty
ECNS 460/560 Fall 2023
Montana State University ]

---

<style type="text/css">
.scroll-output-full {
  height: 90%;
  overflow-y: scroll;
}
.scroll-output-75 {
  height: 75%;
  overflow-y: scroll;
}
</style>

# Table of contents

1. [Setup: Credit card default data](#setup)

1. [Logistic regression and KNN](#simple)

1. [Cross-validation](#cv)

1. [Decision trees](#trees)

1. [Teach your laptop how to read](#read)

---
layout: true

# Setup: Credit card default data

---
name: setup
class: inverse, middle

---

**1.** Load some libraries:

```r
library(pacman)
p_load(tidyverse, skimr, janitor, tidymodels, magrittr, tune, glmnet, readxl, kknn, rpart.plot)
```

And this data on credit card defaults:

```r
default =
  # skip the first row to load the correct variable names
  read_excel("data/default of credit card clients.xls", skip=1) %>%
  # clean variable names with janitor::clean_names()
  clean_names()
```

---

.scroll-output-full[

```r
skim(default)
```

<table style='width: auto;' class='table table-condensed'> <caption>Data summary</caption> <tbody> <tr> <td style="text-align:left;"> Name </td> <td style="text-align:left;"> default </td> </tr> <tr> <td style="text-align:left;"> Number of rows </td> <td style="text-align:left;"> 30000 </td> </tr> <tr> <td style="text-align:left;"> Number of columns </td> <td style="text-align:left;"> 25 </td> </tr> <tr> <td style="text-align:left;"> _______________________ </td> <td style="text-align:left;"> </td> </tr> <tr> <td style="text-align:left;"> Column type frequency: </td> <td style="text-align:left;"> </td> </tr> <tr> <td style="text-align:left;"> numeric </td> <td style="text-align:left;"> 25 </td> </tr> <tr> <td style="text-align:left;"> ________________________ </td> <td style="text-align:left;"> </td> </tr> <tr> <td style="text-align:left;"> Group variables </td> <td style="text-align:left;"> None </td> </tr> </tbody> </table>

**Variable type: numeric**

<table> <thead> <tr> <th style="text-align:left;"> skim_variable </th> <th style="text-align:right;"> n_missing 
</th> <th style="text-align:right;"> complete_rate </th> <th style="text-align:right;"> mean </th> <th style="text-align:right;"> sd </th> <th style="text-align:right;"> p0 </th> <th style="text-align:right;"> p25 </th> <th style="text-align:right;"> p50 </th> <th style="text-align:right;"> p75 </th> <th style="text-align:right;"> p100 </th> <th style="text-align:left;"> hist </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> id </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 15000.50 </td> <td style="text-align:right;"> 8660.40 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 7500.75 </td> <td style="text-align:right;"> 15000.5 </td> <td style="text-align:right;"> 22500.25 </td> <td style="text-align:right;"> 30000 </td> <td style="text-align:left;"> ▇▇▇▇▇ </td> </tr> <tr> <td style="text-align:left;"> limit_bal </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 167484.32 </td> <td style="text-align:right;"> 129747.66 </td> <td style="text-align:right;"> 10000 </td> <td style="text-align:right;"> 50000.00 </td> <td style="text-align:right;"> 140000.0 </td> <td style="text-align:right;"> 240000.00 </td> <td style="text-align:right;"> 1000000 </td> <td style="text-align:left;"> ▇▃▁▁▁ </td> </tr> <tr> <td style="text-align:left;"> sex </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1.60 </td> <td style="text-align:right;"> 0.49 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1.00 </td> <td style="text-align:right;"> 2.0 </td> <td style="text-align:right;"> 2.00 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> ▅▁▁▁▇ </td> </tr> <tr> <td style="text-align:left;"> education </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td 
style="text-align:right;"> 1.85 </td> <td style="text-align:right;"> 0.79 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1.00 </td> <td style="text-align:right;"> 2.0 </td> <td style="text-align:right;"> 2.00 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:left;"> ▆▇▃▁▁ </td> </tr> <tr> <td style="text-align:left;"> marriage </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1.55 </td> <td style="text-align:right;"> 0.52 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1.00 </td> <td style="text-align:right;"> 2.0 </td> <td style="text-align:right;"> 2.00 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:left;"> ▁▇▁▇▁ </td> </tr> <tr> <td style="text-align:left;"> age </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 35.49 </td> <td style="text-align:right;"> 9.22 </td> <td style="text-align:right;"> 21 </td> <td style="text-align:right;"> 28.00 </td> <td style="text-align:right;"> 34.0 </td> <td style="text-align:right;"> 41.00 </td> <td style="text-align:right;"> 79 </td> <td style="text-align:left;"> ▇▇▂▁▁ </td> </tr> <tr> <td style="text-align:left;"> pay_0 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> -0.02 </td> <td style="text-align:right;"> 1.12 </td> <td style="text-align:right;"> -2 </td> <td style="text-align:right;"> -1.00 </td> <td style="text-align:right;"> 0.0 </td> <td style="text-align:right;"> 0.00 </td> <td style="text-align:right;"> 8 </td> <td style="text-align:left;"> ▇▂▁▁▁ </td> </tr> <tr> <td style="text-align:left;"> pay_2 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> -0.13 </td> <td style="text-align:right;"> 1.20 </td> <td style="text-align:right;"> -2 </td> <td 
style="text-align:right;"> -1.00 </td> <td style="text-align:right;"> 0.0 </td> <td style="text-align:right;"> 0.00 </td> <td style="text-align:right;"> 8 </td> <td style="text-align:left;"> ▇▁▁▁▁ </td> </tr> <tr> <td style="text-align:left;"> pay_3 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> -0.17 </td> <td style="text-align:right;"> 1.20 </td> <td style="text-align:right;"> -2 </td> <td style="text-align:right;"> -1.00 </td> <td style="text-align:right;"> 0.0 </td> <td style="text-align:right;"> 0.00 </td> <td style="text-align:right;"> 8 </td> <td style="text-align:left;"> ▇▁▁▁▁ </td> </tr> <tr> <td style="text-align:left;"> pay_4 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> -0.22 </td> <td style="text-align:right;"> 1.17 </td> <td style="text-align:right;"> -2 </td> <td style="text-align:right;"> -1.00 </td> <td style="text-align:right;"> 0.0 </td> <td style="text-align:right;"> 0.00 </td> <td style="text-align:right;"> 8 </td> <td style="text-align:left;"> ▇▁▁▁▁ </td> </tr> <tr> <td style="text-align:left;"> pay_5 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> -0.27 </td> <td style="text-align:right;"> 1.13 </td> <td style="text-align:right;"> -2 </td> <td style="text-align:right;"> -1.00 </td> <td style="text-align:right;"> 0.0 </td> <td style="text-align:right;"> 0.00 </td> <td style="text-align:right;"> 8 </td> <td style="text-align:left;"> ▇▁▁▁▁ </td> </tr> <tr> <td style="text-align:left;"> pay_6 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> -0.29 </td> <td style="text-align:right;"> 1.15 </td> <td style="text-align:right;"> -2 </td> <td style="text-align:right;"> -1.00 </td> <td style="text-align:right;"> 0.0 </td> <td style="text-align:right;"> 0.00 </td> <td 
style="text-align:right;"> 8 </td> <td style="text-align:left;"> ▇▁▁▁▁ </td> </tr> <tr> <td style="text-align:left;"> bill_amt1 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 51223.33 </td> <td style="text-align:right;"> 73635.86 </td> <td style="text-align:right;"> -165580 </td> <td style="text-align:right;"> 3558.75 </td> <td style="text-align:right;"> 22381.5 </td> <td style="text-align:right;"> 67091.00 </td> <td style="text-align:right;"> 964511 </td> <td style="text-align:left;"> ▇▃▁▁▁ </td> </tr> <tr> <td style="text-align:left;"> bill_amt2 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 49179.08 </td> <td style="text-align:right;"> 71173.77 </td> <td style="text-align:right;"> -69777 </td> <td style="text-align:right;"> 2984.75 </td> <td style="text-align:right;"> 21200.0 </td> <td style="text-align:right;"> 64006.25 </td> <td style="text-align:right;"> 983931 </td> <td style="text-align:left;"> ▇▁▁▁▁ </td> </tr> <tr> <td style="text-align:left;"> bill_amt3 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 47013.15 </td> <td style="text-align:right;"> 69349.39 </td> <td style="text-align:right;"> -157264 </td> <td style="text-align:right;"> 2666.25 </td> <td style="text-align:right;"> 20088.5 </td> <td style="text-align:right;"> 60164.75 </td> <td style="text-align:right;"> 1664089 </td> <td style="text-align:left;"> ▇▁▁▁▁ </td> </tr> <tr> <td style="text-align:left;"> bill_amt4 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 43262.95 </td> <td style="text-align:right;"> 64332.86 </td> <td style="text-align:right;"> -170000 </td> <td style="text-align:right;"> 2326.75 </td> <td style="text-align:right;"> 19052.0 </td> <td style="text-align:right;"> 54506.00 </td> <td style="text-align:right;"> 
891586 </td> <td style="text-align:left;"> ▇▃▁▁▁ </td> </tr> <tr> <td style="text-align:left;"> bill_amt5 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 40311.40 </td> <td style="text-align:right;"> 60797.16 </td> <td style="text-align:right;"> -81334 </td> <td style="text-align:right;"> 1763.00 </td> <td style="text-align:right;"> 18104.5 </td> <td style="text-align:right;"> 50190.50 </td> <td style="text-align:right;"> 927171 </td> <td style="text-align:left;"> ▇▁▁▁▁ </td> </tr> <tr> <td style="text-align:left;"> bill_amt6 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 38871.76 </td> <td style="text-align:right;"> 59554.11 </td> <td style="text-align:right;"> -339603 </td> <td style="text-align:right;"> 1256.00 </td> <td style="text-align:right;"> 17071.0 </td> <td style="text-align:right;"> 49198.25 </td> <td style="text-align:right;"> 961664 </td> <td style="text-align:left;"> ▁▇▁▁▁ </td> </tr> <tr> <td style="text-align:left;"> pay_amt1 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 5663.58 </td> <td style="text-align:right;"> 16563.28 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1000.00 </td> <td style="text-align:right;"> 2100.0 </td> <td style="text-align:right;"> 5006.00 </td> <td style="text-align:right;"> 873552 </td> <td style="text-align:left;"> ▇▁▁▁▁ </td> </tr> <tr> <td style="text-align:left;"> pay_amt2 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 5921.16 </td> <td style="text-align:right;"> 23040.87 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 833.00 </td> <td style="text-align:right;"> 2009.0 </td> <td style="text-align:right;"> 5000.00 </td> <td style="text-align:right;"> 1684259 </td> <td style="text-align:left;"> 
▇▁▁▁▁ </td> </tr> <tr> <td style="text-align:left;"> pay_amt3 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 5225.68 </td> <td style="text-align:right;"> 17606.96 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 390.00 </td> <td style="text-align:right;"> 1800.0 </td> <td style="text-align:right;"> 4505.00 </td> <td style="text-align:right;"> 896040 </td> <td style="text-align:left;"> ▇▁▁▁▁ </td> </tr> <tr> <td style="text-align:left;"> pay_amt4 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 4826.08 </td> <td style="text-align:right;"> 15666.16 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 296.00 </td> <td style="text-align:right;"> 1500.0 </td> <td style="text-align:right;"> 4013.25 </td> <td style="text-align:right;"> 621000 </td> <td style="text-align:left;"> ▇▁▁▁▁ </td> </tr> <tr> <td style="text-align:left;"> pay_amt5 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 4799.39 </td> <td style="text-align:right;"> 15278.31 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 252.50 </td> <td style="text-align:right;"> 1500.0 </td> <td style="text-align:right;"> 4031.50 </td> <td style="text-align:right;"> 426529 </td> <td style="text-align:left;"> ▇▁▁▁▁ </td> </tr> <tr> <td style="text-align:left;"> pay_amt6 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 5215.50 </td> <td style="text-align:right;"> 17777.47 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 117.75 </td> <td style="text-align:right;"> 1500.0 </td> <td style="text-align:right;"> 4000.00 </td> <td style="text-align:right;"> 528666 </td> <td style="text-align:left;"> ▇▁▁▁▁ </td> </tr> <tr> <td style="text-align:left;"> 
default_payment_next_month </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 0.22 </td> <td style="text-align:right;"> 0.42 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 0.00 </td> <td style="text-align:right;"> 0.0 </td> <td style="text-align:right;"> 0.00 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> ▇▁▁▁▂ </td> </tr> </tbody> </table>

]

---

**2.** Convert categorical variables to factors (because `tidymodels` isn't so great for this step)

.scroll-output-75[

```r
default2 = default %>%
  mutate(across(sex:marriage | pay_0:pay_6 | default_payment_next_month, as_factor))

skim(default2)
```

<table style='width: auto;' class='table table-condensed'> <caption>Data summary</caption> <tbody> <tr> <td style="text-align:left;"> Name </td> <td style="text-align:left;"> default2 </td> </tr> <tr> <td style="text-align:left;"> Number of rows </td> <td style="text-align:left;"> 30000 </td> </tr> <tr> <td style="text-align:left;"> Number of columns </td> <td style="text-align:left;"> 25 </td> </tr> <tr> <td style="text-align:left;"> _______________________ </td> <td style="text-align:left;"> </td> </tr> <tr> <td style="text-align:left;"> Column type frequency: </td> <td style="text-align:left;"> </td> </tr> <tr> <td style="text-align:left;"> factor </td> <td style="text-align:left;"> 10 </td> </tr> <tr> <td style="text-align:left;"> numeric </td> <td style="text-align:left;"> 15 </td> </tr> <tr> <td style="text-align:left;"> ________________________ </td> <td style="text-align:left;"> </td> </tr> <tr> <td style="text-align:left;"> Group variables </td> <td style="text-align:left;"> None </td> </tr> </tbody> </table>

**Variable type: factor**

<table> <thead> <tr> <th style="text-align:left;"> skim_variable </th> <th style="text-align:right;"> n_missing </th> <th style="text-align:right;"> complete_rate </th> <th style="text-align:left;"> ordered </th> <th 
style="text-align:right;"> n_unique </th> <th style="text-align:left;"> top_counts </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> sex </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> FALSE </td> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> 2: 18112, 1: 11888 </td> </tr> <tr> <td style="text-align:left;"> education </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> FALSE </td> <td style="text-align:right;"> 7 </td> <td style="text-align:left;"> 2: 14030, 1: 10585, 3: 4917, 5: 280 </td> </tr> <tr> <td style="text-align:left;"> marriage </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> FALSE </td> <td style="text-align:right;"> 4 </td> <td style="text-align:left;"> 2: 15964, 1: 13659, 3: 323, 0: 54 </td> </tr> <tr> <td style="text-align:left;"> pay_0 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> FALSE </td> <td style="text-align:right;"> 11 </td> <td style="text-align:left;"> 0: 14737, -1: 5686, 1: 3688, -2: 2759 </td> </tr> <tr> <td style="text-align:left;"> pay_2 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> FALSE </td> <td style="text-align:right;"> 11 </td> <td style="text-align:left;"> 0: 15730, -1: 6050, 2: 3927, -2: 3782 </td> </tr> <tr> <td style="text-align:left;"> pay_3 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> FALSE </td> <td style="text-align:right;"> 11 </td> <td style="text-align:left;"> 0: 15764, -1: 5938, -2: 4085, 2: 3819 </td> </tr> <tr> <td style="text-align:left;"> pay_4 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> FALSE </td> <td 
style="text-align:right;"> 11 </td> <td style="text-align:left;"> 0: 16455, -1: 5687, -2: 4348, 2: 3159 </td> </tr> <tr> <td style="text-align:left;"> pay_5 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> FALSE </td> <td style="text-align:right;"> 10 </td> <td style="text-align:left;"> 0: 16947, -1: 5539, -2: 4546, 2: 2626 </td> </tr> <tr> <td style="text-align:left;"> pay_6 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> FALSE </td> <td style="text-align:right;"> 10 </td> <td style="text-align:left;"> 0: 16286, -1: 5740, -2: 4895, 2: 2766 </td> </tr> <tr> <td style="text-align:left;"> default_payment_next_month </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> FALSE </td> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> 0: 23364, 1: 6636 </td> </tr> </tbody> </table> **Variable type: numeric** <table> <thead> <tr> <th style="text-align:left;"> skim_variable </th> <th style="text-align:right;"> n_missing </th> <th style="text-align:right;"> complete_rate </th> <th style="text-align:right;"> mean </th> <th style="text-align:right;"> sd </th> <th style="text-align:right;"> p0 </th> <th style="text-align:right;"> p25 </th> <th style="text-align:right;"> p50 </th> <th style="text-align:right;"> p75 </th> <th style="text-align:right;"> p100 </th> <th style="text-align:left;"> hist </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> id </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 15000.50 </td> <td style="text-align:right;"> 8660.40 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 7500.75 </td> <td style="text-align:right;"> 15000.5 </td> <td style="text-align:right;"> 22500.25 </td> <td style="text-align:right;"> 30000 </td> <td 
style="text-align:left;"> ▇▇▇▇▇ </td> </tr> <tr> <td style="text-align:left;"> limit_bal </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 167484.32 </td> <td style="text-align:right;"> 129747.66 </td> <td style="text-align:right;"> 10000 </td> <td style="text-align:right;"> 50000.00 </td> <td style="text-align:right;"> 140000.0 </td> <td style="text-align:right;"> 240000.00 </td> <td style="text-align:right;"> 1000000 </td> <td style="text-align:left;"> ▇▃▁▁▁ </td> </tr> <tr> <td style="text-align:left;"> age </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 35.49 </td> <td style="text-align:right;"> 9.22 </td> <td style="text-align:right;"> 21 </td> <td style="text-align:right;"> 28.00 </td> <td style="text-align:right;"> 34.0 </td> <td style="text-align:right;"> 41.00 </td> <td style="text-align:right;"> 79 </td> <td style="text-align:left;"> ▇▇▂▁▁ </td> </tr> <tr> <td style="text-align:left;"> bill_amt1 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 51223.33 </td> <td style="text-align:right;"> 73635.86 </td> <td style="text-align:right;"> -165580 </td> <td style="text-align:right;"> 3558.75 </td> <td style="text-align:right;"> 22381.5 </td> <td style="text-align:right;"> 67091.00 </td> <td style="text-align:right;"> 964511 </td> <td style="text-align:left;"> ▇▃▁▁▁ </td> </tr> <tr> <td style="text-align:left;"> bill_amt2 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 49179.08 </td> <td style="text-align:right;"> 71173.77 </td> <td style="text-align:right;"> -69777 </td> <td style="text-align:right;"> 2984.75 </td> <td style="text-align:right;"> 21200.0 </td> <td style="text-align:right;"> 64006.25 </td> <td style="text-align:right;"> 983931 </td> <td style="text-align:left;"> ▇▁▁▁▁ </td> </tr> <tr> 
<td style="text-align:left;"> bill_amt3 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 47013.15 </td> <td style="text-align:right;"> 69349.39 </td> <td style="text-align:right;"> -157264 </td> <td style="text-align:right;"> 2666.25 </td> <td style="text-align:right;"> 20088.5 </td> <td style="text-align:right;"> 60164.75 </td> <td style="text-align:right;"> 1664089 </td> <td style="text-align:left;"> ▇▁▁▁▁ </td> </tr> <tr> <td style="text-align:left;"> bill_amt4 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 43262.95 </td> <td style="text-align:right;"> 64332.86 </td> <td style="text-align:right;"> -170000 </td> <td style="text-align:right;"> 2326.75 </td> <td style="text-align:right;"> 19052.0 </td> <td style="text-align:right;"> 54506.00 </td> <td style="text-align:right;"> 891586 </td> <td style="text-align:left;"> ▇▃▁▁▁ </td> </tr> <tr> <td style="text-align:left;"> bill_amt5 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 40311.40 </td> <td style="text-align:right;"> 60797.16 </td> <td style="text-align:right;"> -81334 </td> <td style="text-align:right;"> 1763.00 </td> <td style="text-align:right;"> 18104.5 </td> <td style="text-align:right;"> 50190.50 </td> <td style="text-align:right;"> 927171 </td> <td style="text-align:left;"> ▇▁▁▁▁ </td> </tr> <tr> <td style="text-align:left;"> bill_amt6 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 38871.76 </td> <td style="text-align:right;"> 59554.11 </td> <td style="text-align:right;"> -339603 </td> <td style="text-align:right;"> 1256.00 </td> <td style="text-align:right;"> 17071.0 </td> <td style="text-align:right;"> 49198.25 </td> <td style="text-align:right;"> 961664 </td> <td style="text-align:left;"> ▁▇▁▁▁ </td> </tr> <tr> <td 
style="text-align:left;"> pay_amt1 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 5663.58 </td> <td style="text-align:right;"> 16563.28 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1000.00 </td> <td style="text-align:right;"> 2100.0 </td> <td style="text-align:right;"> 5006.00 </td> <td style="text-align:right;"> 873552 </td> <td style="text-align:left;"> ▇▁▁▁▁ </td> </tr> <tr> <td style="text-align:left;"> pay_amt2 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 5921.16 </td> <td style="text-align:right;"> 23040.87 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 833.00 </td> <td style="text-align:right;"> 2009.0 </td> <td style="text-align:right;"> 5000.00 </td> <td style="text-align:right;"> 1684259 </td> <td style="text-align:left;"> ▇▁▁▁▁ </td> </tr> <tr> <td style="text-align:left;"> pay_amt3 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 5225.68 </td> <td style="text-align:right;"> 17606.96 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 390.00 </td> <td style="text-align:right;"> 1800.0 </td> <td style="text-align:right;"> 4505.00 </td> <td style="text-align:right;"> 896040 </td> <td style="text-align:left;"> ▇▁▁▁▁ </td> </tr> <tr> <td style="text-align:left;"> pay_amt4 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 4826.08 </td> <td style="text-align:right;"> 15666.16 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 296.00 </td> <td style="text-align:right;"> 1500.0 </td> <td style="text-align:right;"> 4013.25 </td> <td style="text-align:right;"> 621000 </td> <td style="text-align:left;"> ▇▁▁▁▁ </td> </tr> <tr> <td style="text-align:left;"> pay_amt5 </td> <td 
style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 4799.39 </td> <td style="text-align:right;"> 15278.31 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 252.50 </td> <td style="text-align:right;"> 1500.0 </td> <td style="text-align:right;"> 4031.50 </td> <td style="text-align:right;"> 426529 </td> <td style="text-align:left;"> ▇▁▁▁▁ </td> </tr> <tr> <td style="text-align:left;"> pay_amt6 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 5215.50 </td> <td style="text-align:right;"> 17777.47 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 117.75 </td> <td style="text-align:right;"> 1500.0 </td> <td style="text-align:right;"> 4000.00 </td> <td style="text-align:right;"> 528666 </td> <td style="text-align:left;"> ▇▁▁▁▁ </td> </tr> </tbody> </table>

]

---

**3.** Split the sample into training (80%) and test (20%).

```r
set.seed(101)
default_split = default2 %>% initial_split(prop = 0.8)
default_train = default_split %>% training()
default_test = default_split %>% testing()
```

**4.** Define a recipe. We'll be predicting whether each customer defaults on their credit card payment in the following month.

```r
default_recipe = default_train %>%
  recipe(default_payment_next_month ~ .) %>%
  update_role(id, new_role = "ID") %>%
  step_dummy(all_nominal_predictors()) %>%
  step_normalize(all_numeric_predictors()) %>%
  step_nzv(all_predictors())

default_clean = default_recipe %>% prep() %>% juice()
```

---
layout: false
name: simple
class: inverse, middle

# Logistic regression and KNN

---
layout: true

# Logistic regression example

---

Fit a simple logistic regression:

```r
# Define a simple logistic model
model_logistic = logistic_reg() %>%
  set_engine("glm")

# Define the workflow
workflow_logistic = workflow() %>%
  add_recipe(default_recipe) %>%
  add_model(model_logistic)

# Fit the model in the training set, make predictions in the test set
predict_logistic = workflow_logistic %>%
  fit(data = default_train) %>%
  augment(new_data = default_test)
```

---

View the results:

```r
head(predict_logistic %>% select(default_payment_next_month | starts_with(".")), 10)
```
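
| default_payment_next_month | .pred_class | .pred_0 | .pred_1 |
|:---|:---|---:|---:|
| 0 | 0 | 0.779 | 0.221 |
| 0 | 0 | 0.889 | 0.111 |
| 0 | 0 | 0.77 | 0.23 |
| 0 | 0 | 0.884 | 0.116 |
| 1 | 1 | 0.225 | 0.775 |
| 0 | 0 | 0.805 | 0.195 |
| 1 | 0 | 0.664 | 0.336 |
| 0 | 0 | 0.897 | 0.103 |
| 0 | 0 | 0.861 | 0.139 |
| 1 | 1 | 0.409 | 0.591 |
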
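The ten rows above are consistent with `.pred_class` being the class with the larger predicted probability, which in the two-class case amounts to a 0.5 cutoff on `.pred_1`. A minimal base R sketch of that rule, re-typing the displayed values by hand (the vector names `pred_1`, `truth`, and `pred_class_manual` are made up for this illustration):

```r
# Predicted probabilities of default (.pred_1) from the ten rows shown above
pred_1 = c(0.221, 0.111, 0.23, 0.116, 0.775, 0.195, 0.336, 0.103, 0.139, 0.591)

# Observed outcomes for the same ten rows
truth = c(0, 0, 0, 0, 1, 0, 1, 0, 0, 1)

# Assumed rule: predict default when the probability exceeds 0.5
pred_class_manual = as.integer(pred_1 > 0.5)
pred_class_manual

# Share of these ten rows classified correctly
mean(pred_class_manual == truth)
```

The seventh row is the one miss: the customer defaulted, but the predicted probability of default (0.336) falls below the cutoff. A lower cutoff would catch that case at the cost of more false alarms.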
---

Calculate accuracy and the confusion matrix:

```r
accuracy(predict_logistic, truth = default_payment_next_month, estimate = .pred_class)
```

| .metric | .estimator | .estimate |
|:---|:---|---:|
| accuracy | binary | 0.815 |

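An accuracy figure is easier to judge against a naive baseline. As a rough check, the `skim()` output earlier reports 23364 non-defaults and 6636 defaults in the full sample, so always predicting "no default" would already be right about 78% of the time. The sketch below uses those full-sample counts; the share in the test set may differ slightly, and the variable names are made up for this illustration.

```r
# Class counts from the skim(default2) output (full sample, not just the test set)
n_no_default = 23364
n_default = 6636

# Accuracy of a classifier that always predicts "no default"
baseline_acc = n_no_default / (n_no_default + n_default)
baseline_acc

# Gain of the logistic regression (0.815) over that baseline
0.815 - baseline_acc
```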
```r
conf_mat(predict_logistic, truth = default_payment_next_month, estimate = .pred_class)
```

```
#>           Truth
#> Prediction    0    1
#>          0 4457  893
#>          1  215  435
```

---
layout: false

# KNN example

Fit a k-nearest neighbors model instead (choosing `\(k=10\)`):

```r
# Define a simple KNN model
model_knn = nearest_neighbor(neighbors = 10, mode = "classification") %>%
  set_engine("kknn", scale = TRUE)

# Define the workflow
workflow_knn = workflow() %>%
  add_recipe(default_recipe) %>%
  add_model(model_knn)

# Fit the model in the training set, make predictions in the test set
predict_knn = workflow_knn %>%
  fit(data = default_train) %>%
  augment(new_data = default_test)

# Calculate accuracy
accuracy(predict_knn, truth = default_payment_next_month, estimate = .pred_class)
```

| .metric | .estimator | .estimate |
|:---|:---|---:|
| accuracy | binary | 0.786 |

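The choice of k = 10 above is fixed by hand. As a rough sketch of how `neighbors` could instead be chosen by cross-validation, the block below runs `tune_grid()` on simulated data so that it stands alone. The `sim` tibble, its columns, the candidate grid, and all object names are hypothetical and not part of the credit card example.

```r
library(tidymodels)
library(kknn)

set.seed(1)

# Simulated two-class data (hypothetical stand-in for default_train)
n = 400
sim = tibble(
  x1 = rnorm(n),
  x2 = rnorm(n),
  y  = factor(if_else(x1 + x2 + rnorm(n, sd = 0.5) > 0, "1", "0"))
)

# 5-fold cross-validation split
sim_cv = vfold_cv(sim, v = 5)

# KNN model with the number of neighbors left to be tuned
model_knn_tune = nearest_neighbor(neighbors = tune(), mode = "classification") %>%
  set_engine("kknn")

# Workflow: a formula preprocessor plus the tunable model
workflow_knn_tune = workflow() %>%
  add_formula(y ~ x1 + x2) %>%
  add_model(model_knn_tune)

# Candidate values of k (an arbitrary grid for illustration)
grid_k = data.frame(neighbors = c(1, 5, 10, 25, 50))

# Cross-validate each candidate and rank by accuracy
cv_knn = workflow_knn_tune %>%
  tune_grid(sim_cv, grid = grid_k, metrics = metric_set(accuracy))

cv_knn %>% show_best(metric = "accuracy", n = 3)
```

With the real data, the same pattern would presumably use `default_train` for the folds and `add_recipe(default_recipe)` in place of the formula.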
---

# Tuning K in KNN

We can optimize *k*-nearest neighbors by tuning the value of `\(k=\)` `neighbors` through cross-validation.

Note you can use KNN for regression too - just switch to `mode = "regression"`.

---
layout: false
name: cv
class: inverse, middle

# Cross-validation

---
layout: true

# Logistic regression + shrinkage

---

We can add regularization to a logistic regression in almost exactly the same way as a linear regression.

```r
# Create a 5-fold cross-validation split
default_cv = default_train %>% vfold_cv(v = 5)

# Define a logistic model with ridge regularization
model_logridge = logistic_reg(penalty = tune(), mixture = 0) %>%
  set_engine("glmnet")

# Define the workflow
workflow_logridge = workflow() %>%
  add_recipe(default_recipe) %>%
  add_model(model_logridge)

# Define a sequence of lambdas to try
lambdas = 10 ^ seq(from = 4, to = -2, length = 50)
```

---

The difference is that we have to specify a metric appropriate to classification (i.e., `accuracy` instead of `rmse`).

- You could also specify `sensitivity`, `specificity`, or `precision`.

```r
# Cross-validate
cv_logridge = workflow_logridge %>%
  tune_grid(
    default_cv,
    grid = data.frame(penalty = lambdas),
    metrics = metric_set(accuracy)
  )

# Show the best model
cv_logridge %>% show_best(n=3)
```

| penalty | .metric | .estimator | mean | n | std_err | .config |
|---:|:---|:---|---:|---:|---:|:---|
| 0.01 | accuracy | binary | 0.814 | 5 | 0.00379 | Preprocessor1_Model01 |
| 0.0133 | accuracy | binary | 0.814 | 5 | 0.00379 | Preprocessor1_Model02 |
| 0.0176 | accuracy | binary | 0.813 | 5 | 0.00367 | Preprocessor1_Model03 |

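To make the alternative metrics concrete, here is a small base R sketch that computes them by hand from the test-set confusion matrix reported earlier for the plain logistic regression (4457, 893, 215, 435). Treating default (class 1) as the positive class is an assumption made for this illustration, and the variable names are made up. `yardstick` treats the first factor level as the event by default, so with levels 0 and 1 its functions would need `event_level = "second"` to match these numbers.

```r
# Cells of the logistic regression confusion matrix (rows = prediction, cols = truth)
tn = 4457  # predicted 0, truth 0
fn = 893   # predicted 0, truth 1
fp = 215   # predicted 1, truth 0
tp = 435   # predicted 1, truth 1

# Accuracy: share of all predictions that are correct
acc = (tp + tn) / (tp + tn + fp + fn)

# Sensitivity: share of actual defaults that were predicted to default
sens = tp / (tp + fn)

# Specificity: share of actual non-defaults that were predicted not to default
spec = tn / (tn + fp)

# Precision: share of predicted defaults that actually defaulted
prec = tp / (tp + fp)

round(c(accuracy = acc, sensitivity = sens, specificity = spec, precision = prec), 3)
```

Under this convention the model finds only about a third of the actual defaults even though its accuracy is above 80%, which is one reason accuracy alone can be a misleading tuning target when classes are imbalanced.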
---

.small[

```r
# Fit the best model in the training set, make predictions in the test set
predict_logridge = workflow_logridge %>%
  # name the metric explicitly; newer versions of tune require metric = to be named
  finalize_workflow(select_best(cv_logridge, metric = "accuracy")) %>%
  last_fit(default_split) %>%
  collect_predictions()

# Calculate accuracy and the confusion matrix
accuracy(predict_logridge, truth = default_payment_next_month, estimate = .pred_class)
```

| .metric | .estimator | .estimate |
|:---|:---|---:|
| accuracy | binary | 0.814 |

```r
conf_mat(predict_logridge, truth = default_payment_next_month, estimate = .pred_class)
```

```
#>           Truth
#> Prediction    0    1
#>          0 4467  910
#>          1  205  418
```

]

---
layout: true

# Decision trees

---
name: trees
class: inverse, middle

---

Fit a decision tree (give it the raw data, not the normalized or dummy variables):

```r
# Define a decision tree model
model_tree = decision_tree(mode = "classification", cost_complexity = 0.003) %>%
  set_engine("rpart")

# Define the workflow & fit the model
fit_tree = workflow() %>%
  add_formula(default_payment_next_month ~ .) %>%
  add_model(model_tree) %>%
  fit(data = default_train)
```

---

.small[

Output:

```
#> ══ Workflow [trained] ══════════════════════════════════════════════════════════
#> Preprocessor: Formula
#> Model: decision_tree()
#>
#> ── Preprocessor ────────────────────────────────────────────────────────────────
#> default_payment_next_month ~ .
#>
#> ── Model ───────────────────────────────────────────────────────────────────────
#> n= 24000
#>
#> node), split, n, loss, yval, (yprob)
#>       * denotes terminal node
#>
#>  1) root 24000 5308 0 (0.7788333 0.2211667)
#>    2) pay_0=-2,-1,0,1 21472 3549 0 (0.8347150 0.1652850)
#>      4) pay_2=-2,-1,0,1,8 19713 2821 0 (0.8568965 0.1431035) *
#>      5) pay_2=2,3,4,5,6,7 1759 728 0 (0.5861285 0.4138715)
#>       10) pay_6=-2,-1,0,4 1253 459 0 (0.6336792 0.3663208) *
#>       11) pay_6=2,3,5,8 506 237 1 (0.4683794 0.5316206) *
#>    3) pay_0=2,3,4,5,6,7,8 2528 769 1 (0.3041930 0.6958070) *
```

]

---

Visualize the tree:

```r
rpart.plot(extract_fit_engine(fit_tree))
```

<img src="14b-Examples_files/figure-html/unnamed-chunk-17-1.svg" width="90%" style="display: block; margin: auto;" />

---
layout: false

# Tuning (and pruning) the tree

Decision trees have several possible tuning parameters:

- `cost_complexity`
- `tree_depth`
- `min_n`

You can read more about them in the R documentation, [ISLR](https://web.stanford.edu/~hastie/ISLR2/ISLRv2_website.pdf) chapter 8, or by Googling (try [here](https://www.tidymodels.org/start/tuning/), [here](https://emilhvitfeldt.github.io/ISLR-tidymodels-labs/tree-based-methods.html), or [here](https://juliasilge.com/blog/wind-turbine/)).

---
layout: true

# Teach your laptop how to read

---
name: read
class: inverse, middle

This section is adapted from [*Introduction to Data Science*](http://rafalab.dfci.harvard.edu/dsbook/machine-learning-in-practice.html) by Rafael A. Irizarry, used under [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0).

---

.small[
Believe it or not, we've already gotten far enough with machine learning that you can now teach your computer how to read.
]

--

.small[
The **MNIST database** is a set of images of handwritten numerical digits. They look like this:

<img src="images/digit-images-example-1.png" width="40%" style="display: block; margin: auto;" />

- The images are scanned, centered, and represented in `\(28 \times 28=784\)` pixels.
- Each pixel's value is a grayscale intensity between 0 (white) and 255 (black).
- Each image has also been read by a human and identified as a digit (0-9).
]

---

**1. Load the data** and take a subset (for quicker computation):

```r
p_load(dslabs)
mnist = read_mnist()
set.seed(1990)
index = sample(nrow(mnist$train$images), 10000)
x = mnist$train$images[index,]
y = factor(mnist$train$labels[index])
numbers = bind_cols("y"=y, as_tibble(x))
```

---

**2. Split the sample** and take a look at the data:

```r
numbers_split = numbers %>% initial_split(prop = 0.8)
numbers_train = numbers_split %>% training()
numbers_test = numbers_split %>% testing()
head(numbers_test, 8)
```

Output: 8 rows × 785 columns (the label `y` plus pixel columns `V1` to `V784`). Only the first and last columns are shown; the omitted pixel columns hold grayscale values from 0 to 255, most of them 0.

| y | V1 | V2 | … | V784 |
|:--|---:|---:|:-:|-----:|
| 0 | 0 | 0 | … | 0 |
| 5 | 0 | 0 | … | 0 |
| 2 | 0 | 0 | … | 0 |
| 7 | 0 | 0 | … | 0 |
| 0 | 0 | 0 | … | 0 |
| 0 | 0 | 0 | … | 0 |
| 0 | 0 | 0 | … | 0 |
| 4 | 0 | 0 | … | 0 |

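
To see how one row of this table maps back to a picture, the 784 pixel columns can be folded into a 28 × 28 matrix. The following is a minimal sketch on a synthetic row rather than real MNIST data; `pixels`, `stroke_rows`, and `img` are names introduced here for illustration, and it assumes the pixels are stored row by row.

```r
# Synthetic stand-in for one row of pixel columns V1..V784 (not real MNIST data)
pixels = rep(0, 784)

# Draw a vertical stroke: rows 5 to 24 of column 14, assuming row-by-row storage
stroke_rows = 5:24
pixels[(stroke_rows - 1) * 28 + 14] = 255

# Fold the 784 values into a 28 x 28 matrix, one image row per matrix row
img = matrix(pixels, nrow = 28, ncol = 28, byrow = TRUE)

dim(img)      # 28 28
sum(img > 0)  # 20 nonzero pixels, all in column 14

# Plot it: image() draws matrix rows along the x-axis, so transpose and flip.
# The palette maps 0 to white and 255 to black, as described earlier.
image(t(img[28:1, ]), col = gray(seq(1, 0, length.out = 256)), axes = FALSE)
```

Replacing `pixels` with the pixel columns of a real row (for example `unlist(numbers_test[1, -1])`) should draw the corresponding digit, provided the storage-order assumption holds.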
---

**3. Set up a recipe** & remove near-zero-variance predictors:

```r
numbers_recipe = numbers_train %>%
  recipe(y ~ .) %>%
  step_nzv(all_predictors())

numbers_clean = numbers_recipe %>% prep() %>% juice()

head(numbers_clean, 8)
```

Output: 8 rows × 251 columns (250 retained pixel columns, `V153` to `V687`, followed by the label `y`). Only the first and last columns are shown.

| V153 | V154 | … | V687 | y |
|-----:|-----:|:-:|-----:|:--|
| 154 | 154 | … | 0 | 2 |
| 179 | 253 | … | 0 | 2 |
| 43 | 96 | … | 77 | 8 |
| 0 | 0 | … | 0 | 2 |
| 0 | 0 | … | 0 | 2 |
| 0 | 0 | … | 253 | 9 |
| 0 | 0 | … | 85 | 4 |
| 222 | 252 | … | 0 | 5 |

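
Only 250 of the 784 pixel columns survive this step, because pixels near the edge of the image are almost always 0. As a rough illustration of the kind of rule `step_nzv()` applies, here is a base-R sketch. The helper `is_nzv()` and the `toy` data are hypothetical, and the function imitates the two documented criteria (frequency ratio and percentage of unique values, with defaults `freq_cut = 95/5` and `unique_cut = 10`) rather than reproducing the `recipes` source code.

```r
# Hypothetical helper: flags a column as near-zero-variance using a frequency
# ratio and a percent-unique cutoff (an imitation, not the recipes source code)
is_nzv = function(x, freq_cut = 95 / 5, unique_cut = 10) {
  counts = sort(table(x), decreasing = TRUE)
  if (length(counts) == 1) return(TRUE)           # constant column
  freq_ratio = counts[[1]] / counts[[2]]          # most common vs. runner-up
  pct_unique = 100 * length(counts) / length(x)   # share of distinct values
  freq_ratio > freq_cut && pct_unique < unique_cut
}

set.seed(1)
toy = data.frame(
  corner = c(rep(0, 99), 3),                   # almost always 0, like a border pixel
  center = sample(0:255, 100, replace = TRUE)  # varies a lot, like a central pixel
)

sapply(toy, is_nzv)  # corner: TRUE, center: FALSE
```

Dropping such columns loses little information, and it shrinks the distance computations that KNN has to do.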
---

**4. Fit a KNN model,** k=10 (this will take a minute or so):

```r
# Define the model
numbers_model = nearest_neighbor(neighbors = 10, mode = "classification") %>%
  set_engine("kknn", scale = TRUE)

# Fit the model
numbers_workflow = workflow() %>%
  add_model(numbers_model) %>%
  add_recipe(numbers_recipe) %>%
  fit(numbers_train)

# Predict in the test set
test_predictions = numbers_workflow %>%
  last_fit(numbers_split) %>%
  collect_predictions()
```

---

**Take a look at the results!**

```r
head(test_predictions)
```
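
For intuition about what the KNN model in step 4 does with each test image, here is a minimal base-R sketch of plain, unweighted KNN on toy data. The helper `knn_predict()` and the two-cluster data are made up for illustration; the `kknn` engine used above is a weighted variant and also rescales the predictors, neither of which this sketch attempts.

```r
# Hypothetical helper: classify one new point by majority vote among its k
# nearest training points (plain unweighted KNN with Euclidean distance)
knn_predict = function(train_x, train_y, new_x, k = 3) {
  # Distance from new_x to every training row
  dists = sqrt(rowSums(sweep(train_x, 2, new_x)^2))
  nearest = order(dists)[1:k]
  votes = table(train_y[nearest])
  names(votes)[which.max(votes)]
}

# Toy data: two well-separated clusters standing in for two digit classes
train_x = rbind(c(0, 0), c(0, 1), c(1, 0), c(5, 5), c(5, 6), c(6, 5))
train_y = factor(c("a", "a", "a", "b", "b", "b"))

knn_predict(train_x, train_y, new_x = c(0.5, 0.5))  # "a"
knn_predict(train_x, train_y, new_x = c(5.5, 5.5))  # "b"
```

With MNIST, each row of `train_x` would be the retained pixel values of one image, so "near" means "similar pixel intensities in the same positions".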
---

**Confusion matrix:**

```r
conf_mat(test_predictions, truth=y, estimate=.pred_class)
```

```
#>           Truth
#> Prediction   0   1   2   3   4   5   6   7   8   9
#>          0 200   0   0   0   0   0   0   0   0   1
#>          1   0 230   2   1   2   1   0   0   6   0
#>          2   1   0 179   1   0   0   1   0   3   2
#>          3   0   1   0 207   0   3   0   0   5   5
#>          4   0   1   0   0 185   1   0   0   0   0
#>          5   1   0   0   2   0 168   1   1   4   0
#>          6   1   0   0   1   2   1 221   0   1   0
#>          7   0   0   2   1   2   2   0 200   0   2
#>          8   0   0   0   1   0   2   0   0 142   0
#>          9   0   0   0   2  11   2   0   3   2 184
```

---

**Overall accuracy:**

```r
accuracy(test_predictions, truth=y, estimate=.pred_class)
```

| .metric | .estimator | .estimate |
|:--------|:-----------|----------:|
| accuracy | multiclass | 0.958 |

---

Let's calculate accuracy metrics by digit...

```r
pred_numbers = test_predictions %>%
  dplyr::select(y_hat = .pred_class, y) %>%
  mutate(correct = 1*(y==y_hat))

sens = pred_numbers %>%
  group_by(y) %>%
  summarize(sensitivity = mean(correct))

prec = pred_numbers %>%
  group_by(y_hat) %>%
  summarize(precision = mean(correct))

by_class = bind_cols(sens, prec) %>%
  dplyr::select(-y_hat)
```

---

**Sensitivity and precision by digit:**

```r
by_class
```
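
| y | sensitivity | precision |
|:--|------------:|----------:|
| 0 | 0.985 | 0.995 |
| 1 | 0.991 | 0.95 |
| 2 | 0.978 | 0.957 |
| 3 | 0.958 | 0.937 |
| 4 | 0.916 | 0.989 |
| 5 | 0.933 | 0.949 |
| 6 | 0.991 | 0.974 |
| 7 | 0.98 | 0.957 |
| 8 | 0.871 | 0.979 |
| 9 | 0.948 | 0.902 |
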
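
The same per-digit numbers can be read straight off the confusion matrix: sensitivity divides each diagonal entry by its column total (true digits), and precision divides it by its row total (predicted digits). The sketch below re-enters the matrix shown earlier by hand; the object names (`cm`, `correct`, and so on) are chosen here for illustration.

```r
# Confusion matrix from the KNN results: rows = predictions, columns = truth
cm = matrix(c(
  200,   0,   0,   0,   0,   0,   0,   0,   0,   1,
    0, 230,   2,   1,   2,   1,   0,   0,   6,   0,
    1,   0, 179,   1,   0,   0,   1,   0,   3,   2,
    0,   1,   0, 207,   0,   3,   0,   0,   5,   5,
    0,   1,   0,   0, 185,   1,   0,   0,   0,   0,
    1,   0,   0,   2,   0, 168,   1,   1,   4,   0,
    1,   0,   0,   1,   2,   1, 221,   0,   1,   0,
    0,   0,   2,   1,   2,   2,   0, 200,   0,   2,
    0,   0,   0,   1,   0,   2,   0,   0, 142,   0,
    0,   0,   0,   2,  11,   2,   0,   3,   2, 184
), nrow = 10, byrow = TRUE,
dimnames = list(Prediction = as.character(0:9), Truth = as.character(0:9)))

correct = diag(cm)

accuracy    = sum(correct) / sum(cm)  # 1916 / 2000 = 0.958
sensitivity = correct / colSums(cm)   # share of each true digit that was found
precision   = correct / rowSums(cm)   # share of each predicted digit that was right

round(cbind(sensitivity, precision), 3)
```

Read this way, the weak spots are easy to trace: the low sensitivity for 8 comes from true 8s being predicted as 1, 3, or 5, and the low precision for 9 comes mostly from eleven true 4s being predicted as 9.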
---

Our error rate was **4.2%.**

* [Others](https://en.wikipedia.org/wiki/MNIST_database) have gotten as low as **0.52%** with KNN.
* And as low as **0.17%** with convolutional neural networks.

--

<br>

Now you basically know how to make a self-driving car!