{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## 5. 모델 평가와 성능 향상" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Test set score: 0.88\n" ] } ], "source": [ "from sklearn.datasets import make_blobs\n", "from sklearn.linear_model import LogisticRegression\n", "from sklearn.model_selection import train_test_split\n", "\n", "# create a synthetic dataset\n", "X, y = make_blobs(random_state=0)\n", "\n", "# split data and labels into a training and a test set\n", "X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)\n", "\n", "# instantiate a model and fit it to the training set\n", "logreg = LogisticRegression().fit(X_train, y_train)\n", "\n", "# evaluate the model on the test set\n", "print(\"Test set score: {:.2f}\".format(logreg.score(X_test, y_test)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 5.1 교차 검증\n", "- 교차 검증은 일반화 성능을 재기 위해 훈련 세트와 테스트 세트로 한번 나누는 것 보다 더 안정적이고, 뛰어난 통계적 방법\n", "- 데이터를 여러번 반복해서 나누고 여러 모델을 학습" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%matplotlib inline\n", "import sys \n", "sys.path.append('..')\n", "from preamble import *\n", "from sklearn.model_selection import train_test_split" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "교차 검증 점수: [0.961 0.922 0.958]\n" ] } ], "source": [ "from sklearn.model_selection import cross_val_score\n", "from sklearn.datasets import load_iris\n", "from sklearn.linear_model import LogisticRegression\n", "\n", "iris = load_iris()\n", "logreg = LogisticRegression()\n", "\n", "scores = cross_val_score(logreg, iris.data, iris.target)\n", "print(\"교차 검증 점수: {}\".format(scores))" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": false }, 
"outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "교차 검증 점수: [1. 0.967 0.933 0.9 1. ]\n" ] } ], "source": [ "scores = cross_val_score(logreg, iris.data, iris.target, cv=5)\n", "print(\"교차 검증 점수: {}\".format(scores))" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "교차 검증 평균 점수: 0.96\n" ] } ], "source": [ "print(\"교차 검증 평균 점수: {:.2f}\".format(scores.mean()))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 5.1.2 교차 검증의 장점\n", "- 테스트 세트에 각 샘플이 정확하게 한 번씩 들어감\n", "- 각 샘플은 폴드 중 하나에 속하며 각 폴드는 한 번씩 테스트 세트가 됨" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 5.1.3 계층별 k-겹 교차 검증과 그외 전략들" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Iris labels:\n", "[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n", " 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\n", " 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2\n", " 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2\n", " 2 2]\n" ] } ], "source": [ "from sklearn.datasets import load_iris\n", "iris = load_iris()\n", "print(\"Iris labels:\\n{}\".format(iris.target))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 교차 검증 상세 옵션" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from sklearn.model_selection import KFold\n", "kfold = KFold(n_splits=5)" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Cross-validation scores:\n", "[1. 
0.933 0.433 0.967 0.433]\n" ] } ], "source": [ "print(\"Cross-validation scores:\\n{}\".format(\n", "    cross_val_score(logreg, iris.data, iris.target, cv=kfold)))" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Cross-validation scores:\n", "[0. 0. 0.]\n" ] } ], "source": [ "# the iris labels are sorted by class, so without shuffling each of the\n", "# three folds holds a single class and the accuracy drops to 0\n", "kfold = KFold(n_splits=3)\n", "print(\"Cross-validation scores:\\n{}\".format(\n", "    cross_val_score(logreg, iris.data, iris.target, cv=kfold)))" ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Cross-validation scores:\n", "[0.9 0.96 0.96]\n" ] } ], "source": [ "# shuffle the data before splitting so that every fold mixes the classes\n", "kfold = KFold(n_splits=3, shuffle=True, random_state=0)\n", "print(\"Cross-validation scores:\\n{}\".format(\n", "    cross_val_score(logreg, iris.data, iris.target, cv=kfold)))" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Number of CV iterations: 150\n", "Mean accuracy: 0.95\n" ] } ], "source": [ "from sklearn.model_selection import LeaveOneOut\n", "# leave-one-out: every sample is the test set exactly once\n", "loo = LeaveOneOut()\n", "scores = cross_val_score(logreg, iris.data, iris.target, cv=loo)\n", "print(\"Number of CV iterations: \", len(scores))\n", "print(\"Mean accuracy: {:.2f}\".format(scores.mean()))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Shuffle-Split Cross-Validation" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Cross-validation scores:\n", "[0.907 0.973 0.893 0.973 0.933 0.893 0.893 0.907 0.893 0.96 ]\n" ] } ], "source": [ "from sklearn.model_selection import ShuffleSplit\n", "# ten random splits, each using 50% of the data for training and 50% for testing\n", "shuffle_split = ShuffleSplit(test_size=.5, train_size=.5, n_splits=10)\n", "scores = cross_val_score(logreg, iris.data, iris.target, cv=shuffle_split)\n", "print(\"Cross-validation scores:\\n{}\".format(scores))" ] }, { "cell_type": 
"markdown", "metadata": {}, "source": [ "### 그룹별 교차 검증" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "교차 검증 점수:\n", "[0.75 0.8 0.667]\n" ] } ], "source": [ "from sklearn.model_selection import GroupKFold\n", "\n", "X, y = make_blobs(n_samples=12, random_state=0)\n", "\n", "groups = [0, 0, 0, 1, 1, 1, 1, 2, 2, 3, 3, 3]\n", "scores = cross_val_score(logreg, X, y, groups, cv=GroupKFold(n_splits=3))\n", "print(\"교차 검증 점수:\\n{}\".format(scores))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 5.2 그리드 서치\n", "### 5.2.1 간단한 그리드 서치" ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Size of training set: 112 size of test set: 38\n", "최고 점수: 0.97\n", "최고 매개변수: {'C': 100, 'gamma': 0.001}\n" ] } ], "source": [ "from sklearn.svm import SVC\n", "X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, random_state=0)\n", "print(\"Size of training set: {} size of test set: {}\".format(X_train.shape[0], X_test.shape[0]))\n", "\n", "best_score = 0\n", "\n", "for gamma in [0.001, 0.01, 0.1, 1, 10, 100]:\n", " for C in [0.001, 0.01, 0.1, 1, 10, 100]:\n", " svm = SVC(gamma=gamma, C=C)\n", " svm.fit(X_train, y_train)\n", " score = svm.score(X_test, y_test)\n", " if score > best_score:\n", " best_score = score\n", " best_parameters = {'C': C, 'gamma': gamma}\n", "\n", "print(\"최고 점수: {:.2f}\".format(best_score))\n", "print(\"최고 매개변수: {}\".format(best_parameters))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 5.2.2 매개변수 과대적합과 검증 세트" ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Size of training set: 84, size of validation set: 28,size of test set: 38\n", "\n", "검증 세트에서 최고 점수: 0.96\n", "최고 파라미터: {'C': 10, 'gamma': 
0.001}\n", "최적 매개변수에서 테스트 세트 점수: 0.92\n" ] } ], "source": [ "from sklearn.svm import SVC\n", "\n", "X_trainval, X_test, y_trainval, y_test = train_test_split(iris.data, iris.target, random_state=0)\n", "\n", "X_train, X_valid, y_train, y_valid = train_test_split(X_trainval, y_trainval, random_state=1)\n", "\n", "print(\"Size of training set: {}, size of validation set: {},size of test set: {}\\n\".format(\n", " X_train.shape[0], \n", " X_valid.shape[0], \n", " X_test.shape[0]))\n", "\n", "best_score = 0\n", "\n", "for gamma in [0.001, 0.01, 0.1, 1, 10, 100]:\n", " for C in [0.001, 0.01, 0.1, 1, 10, 100]:\n", " svm = SVC(gamma=gamma, C=C)\n", " svm.fit(X_train, y_train)\n", " score = svm.score(X_valid, y_valid)\n", " if score > best_score:\n", " best_score = score\n", " best_parameters = {'C': C, 'gamma': gamma}\n", "\n", "svm = SVC(**best_parameters)\n", "svm.fit(X_trainval, y_trainval)\n", "\n", "test_score = svm.score(X_test, y_test)\n", "print(\"검증 세트에서 최고 점수: {:.2f}\".format(best_score))\n", "print(\"최고 파라미터: \", best_parameters)\n", "print(\"최적 매개변수에서 테스트 세트 점수: {:.2f}\".format(test_score))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 5.2.3 교차 검증을 사용한 그리드 서치" ] }, { "cell_type": "code", "execution_count": 43, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "SVC(C=100, cache_size=200, class_weight=None, coef0=0.0,\n", " decision_function_shape=None, degree=3, gamma=0.01, kernel='rbf',\n", " max_iter=-1, probability=False, random_state=None, shrinking=True,\n", " tol=0.001, verbose=False)" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "for gamma in [0.001, 0.01, 0.1, 1, 10, 100]:\n", " for C in [0.001, 0.01, 0.1, 1, 10, 100]:\n", " svm = SVC(gamma=gamma, C=C)\n", " scores = cross_val_score(svm, X_trainval, y_trainval, cv=5)\n", " score = np.mean(scores)\n", " if score > best_score:\n", " best_score = score\n", " best_parameters = {'C': C, 'gamma': gamma}\n", "\n", 
"svm = SVC(**best_parameters)\n", "svm.fit(X_trainval, y_trainval)" ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Parameter grid:\n", "{'C': [0.001, 0.01, 0.1, 1, 10, 100], 'gamma': [0.001, 0.01, 0.1, 1, 10, 100]}\n" ] } ], "source": [ "param_grid = { 'C': [0.001, 0.01, 0.1, 1, 10, 100],\n", " 'gamma': [0.001, 0.01, 0.1, 1, 10, 100]}\n", "print(\"Parameter grid:\\n{}\".format(param_grid))" ] }, { "cell_type": "code", "execution_count": 45, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from sklearn.model_selection import GridSearchCV\n", "from sklearn.svm import SVC\n", "grid_search = GridSearchCV(SVC(), param_grid, cv=5, return_train_score=True)" ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "collapsed": true }, "outputs": [], "source": [ "X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, random_state=0)" ] }, { "cell_type": "code", "execution_count": 47, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "GridSearchCV(cv=5, error_score='raise',\n", " estimator=SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,\n", " decision_function_shape=None, degree=3, gamma='auto', kernel='rbf',\n", " max_iter=-1, probability=False, random_state=None, shrinking=True,\n", " tol=0.001, verbose=False),\n", " fit_params={}, iid=True, n_jobs=1,\n", " param_grid={'C': [0.001, 0.01, 0.1, 1, 10, 100], 'gamma': [0.001, 0.01, 0.1, 1, 10, 100]},\n", " pre_dispatch='2*n_jobs', refit=True, return_train_score=True,\n", " scoring=None, verbose=0)" ] }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" } ], "source": [ "grid_search.fit(X_train, y_train)" ] }, { "cell_type": "code", "execution_count": 48, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "테스트 세트 점수: 0.97\n" ] } ], "source": [ "print(\"테스트 세트 점수: 
{:.2f}\".format(grid_search.score(X_test, y_test)))" ] }, { "cell_type": "code", "execution_count": 49, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Best parameters: {'C': 100, 'gamma': 0.01}\n", "Best cross-validation score: 0.97\n" ] } ], "source": [ "print(\"Best parameters: {}\".format(grid_search.best_params_))\n", "# best_score_ is the mean cross-validation accuracy on the training data, not a test set score\n", "print(\"Best cross-validation score: {:.2f}\".format(grid_search.best_score_))" ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Best estimator:\n", "SVC(C=100, cache_size=200, class_weight=None, coef0=0.0,\n", " decision_function_shape=None, degree=3, gamma=0.01, kernel='rbf',\n", " max_iter=-1, probability=False, random_state=None, shrinking=True,\n", " tol=0.001, verbose=False)\n" ] } ], "source": [ "print(\"Best estimator:\\n{}\".format(grid_search.best_estimator_))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Analyzing the Results of Cross-Validation" ] }, { "cell_type": "code", "execution_count": 52, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Index(['mean_fit_time', 'mean_score_time', 'mean_test_score',\n", " 'mean_train_score', 'param_C', 'param_gamma', 'params',\n", " 'rank_test_score', 'split0_test_score', 'split0_train_score',\n", " 'split1_test_score', 'split1_train_score', 'split2_test_score',\n", " 'split2_train_score', 'split3_test_score', 'split3_train_score',\n", " 'split4_test_score', 'split4_train_score', 'std_fit_time',\n", " 'std_score_time', 'std_test_score', 'std_train_score'],\n", " dtype='object')\n" ] }, { "data": { "text/html": [ "
\n", 
"|  | mean_fit_time | mean_score_time | mean_test_score | mean_train_score | ... | std_fit_time | std_score_time | std_test_score | std_train_score |\n", 
"|---|---|---|---|---|---|---|---|---|---|\n", 
"| 0 | 6.86e-04 | 2.68e-04 | 0.37 | 0.37 | ... | 1.76e-04 | 4.39e-05 | 0.01 | 2.85e-03 |\n", 
"| 1 | 6.11e-04 | 2.53e-04 | 0.37 | 0.37 | ... | 2.05e-05 | 6.22e-06 | 0.01 | 2.85e-03 |\n", 
"| 2 | 6.77e-04 | 2.93e-04 | 0.37 | 0.37 | ... | 8.31e-05 | 5.52e-05 | 0.01 | 2.85e-03 |\n", 
"| 3 | 5.89e-04 | 2.60e-04 | 0.37 | 0.37 | ... | 1.95e-05 | 3.59e-05 | 0.01 | 2.85e-03 |\n", 
"| 4 | 6.35e-04 | 2.46e-04 | 0.37 | 0.37 | ... | 1.99e-05 | 6.66e-06 | 0.01 | 2.85e-03 |\n", 
"\n", 
"5 rows × 22 columns
\n", "\n", 
"|  | 0 | 1 | 2 | 3 | ... | 38 | 39 | 40 | 41 |\n", 
"|---|---|---|---|---|---|---|---|---|---|\n", 
"| mean_fit_time | 0.00065 | 0.0006 | 0.00059 | 0.00062 | ... | 0.00035 | 0.00034 | 0.00034 | 0.00035 |\n", 
"| mean_score_time | 0.00029 | 0.00025 | 0.00025 | 0.00025 | ... | 0.00021 | 0.0002 | 0.0002 | 0.0002 |\n", 
"| mean_test_score | 0.37 | 0.37 | 0.37 | 0.37 | ... | 0.95 | 0.97 | 0.96 | 0.96 |\n", 
"| mean_train_score | 0.37 | 0.37 | 0.37 | 0.37 | ... | 0.97 | 0.98 | 0.99 | 0.99 |\n", 
"| param_C | 0.001 | 0.001 | 0.001 | 0.001 | ... | 0.1 | 1 | 10 | 100 |\n", 
"| param_gamma | 0.001 | 0.01 | 0.1 | 1 | ... | NaN | NaN | NaN | NaN |\n", 
"| param_kernel | rbf | rbf | rbf | rbf | ... | linear | linear | linear | linear |\n", 
"| params | {'C': 0.001, 'gamma': 0.001, 'kernel': 'rbf'} | {'C': 0.001, 'gamma': 0.01, 'kernel': 'rbf'} | {'C': 0.001, 'gamma': 0.1, 'kernel': 'rbf'} | {'C': 0.001, 'gamma': 1, 'kernel': 'rbf'} | ... | {'C': 0.1, 'kernel': 'linear'} | {'C': 1, 'kernel': 'linear'} | {'C': 10, 'kernel': 'linear'} | {'C': 100, 'kernel': 'linear'} |\n", 
"| rank_test_score | 27 | 27 | 27 | 27 | ... | 11 | 1 | 3 | 3 |\n", 
"| split0_test_score | 0.38 | 0.38 | 0.38 | 0.38 | ... | 0.96 | 1 | 0.96 | 0.96 |\n", 
"| split0_train_score | 0.36 | 0.36 | 0.36 | 0.36 | ... | 0.97 | 0.99 | 0.99 | 0.99 |\n", 
"| split1_test_score | 0.35 | 0.35 | 0.35 | 0.35 | ... | 0.91 | 0.96 | 1 | 1 |\n", 
"| split1_train_score | 0.37 | 0.37 | 0.37 | 0.37 | ... | 0.98 | 0.98 | 0.99 | 0.99 |\n", 
"| split2_test_score | 0.36 | 0.36 | 0.36 | 0.36 | ... | 1 | 1 | 1 | 1 |\n", 
"| split2_train_score | 0.37 | 0.37 | 0.37 | 0.37 | ... | 0.94 | 0.98 | 0.98 | 0.99 |\n", 
"| split3_test_score | 0.36 | 0.36 | 0.36 | 0.36 | ... | 0.91 | 0.95 | 0.91 | 0.91 |\n", 
"| split3_train_score | 0.37 | 0.37 | 0.37 | 0.37 | ... | 0.98 | 0.99 | 0.99 | 1 |\n", 
"| split4_test_score | 0.38 | 0.38 | 0.38 | 0.38 | ... | 0.95 | 0.95 | 0.95 | 0.95 |\n", 
"| split4_train_score | 0.36 | 0.36 | 0.36 | 0.36 | ... | 0.97 | 0.99 | 1 | 1 |\n", 
"| std_fit_time | 8.8e-05 | 1.6e-05 | 1.8e-05 | 1.7e-05 | ... | 1.1e-05 | 1e-05 | 1.3e-05 | 2.9e-05 |\n", 
"| std_score_time | 6e-05 | 3.2e-06 | 1.5e-05 | 6.7e-06 | ... | 3.6e-06 | 1.1e-06 | 1.9e-06 | 1.8e-06 |\n", 
"| std_test_score | 0.011 | 0.011 | 0.011 | 0.011 | ... | 0.033 | 0.022 | 0.034 | 0.034 |\n", 
"| std_train_score | 0.0029 | 0.0029 | 0.0029 | 0.0029 | ... | 0.012 | 0.0055 | 0.007 | 0.0055 |\n", 
"\n", 
"23 rows × 42 columns\n", 