{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "Ss6WwQxZcyB3" }, "source": [ "# Machine Learning Textbook, 3rd Edition" ] }, { "cell_type": "markdown", "metadata": { "id": "lTv_uMjxcyB4" }, "source": [ "# Chapter 8 - Applying Machine Learning to Sentiment Analysis\n" ] }, { "cell_type": "markdown", "metadata": { "id": "qeuUzWpZcyB5" }, "source": [ "**You can view this notebook in the Jupyter Notebook Viewer (nbviewer.jupyter.org) or run it in Google Colab (colab.research.google.com) using the links below.**\n", "\n", "
\n",
" View in Jupyter Notebook Viewer\n",
" | \n",
" \n",
" Run in Google Colab\n",
" | \n",
"
|  | review | sentiment |
|---|---|---|
| 0 | In 1974, the teenager Martha Moxley (Maggie Gr... | 1 |
| 1 | OK... so... I really like Kris Kristofferson a... | 0 |
| 2 | ***SPOILER*** Do not read this, if you think a... | 0 |
GridSearchCV(cv=5,\n",
" estimator=Pipeline(steps=[('vect',\n",
" TfidfVectorizer(lowercase=False)),\n",
" ('clf',\n",
" LogisticRegression(random_state=0,\n",
" solver='liblinear'))]),\n",
" n_jobs=-1,\n",
" param_grid=[{'clf__C': [1.0, 10.0, 100.0],\n",
" 'clf__penalty': ['l1', 'l2'],\n",
" 'vect__ngram_range': [(1, 1)],\n",
" 'vect__stop_words': [['a', 'about', 'above', 'after',\n",
" 'again', 'against', 'ain',\n",
" 'all', 'am', 'an', 'and', '...\n",
" 'vect__stop_words': [['a', 'about', 'above', 'after',\n",
" 'again', 'against', 'ain',\n",
" 'all', 'am', 'an', 'and', 'any',\n",
" 'are', 'aren', "aren't", 'as',\n",
" 'at', 'be', 'because', 'been',\n",
" 'before', 'being', 'below',\n",
" 'between', 'both', 'but', 'by',\n",
" 'can', 'couldn', "couldn't", ...],\n",
" None],\n",
" 'vect__tokenizer': [<function tokenizer at 0x7d90d71d4400>,\n",
" <function tokenizer_porter at 0x7d90d515af20>],\n",
" 'vect__use_idf': [False]}],\n",
" scoring='accuracy')\n",
"Pipeline(steps=[('vect',\n",
" TfidfVectorizer(lowercase=False,\n",
" tokenizer=<function tokenizer at 0x7d90d71d4400>)),\n",
" ('clf',\n",
" LogisticRegression(C=10.0, random_state=0,\n",
" solver='liblinear'))])
|  | review | sentiment |
|---|---|---|
| 0 | In 1974, the teenager Martha Moxley (Maggie Gr... | 1 |
| 1 | OK... so... I really like Kris Kristofferson a... | 0 |
| 2 | ***SPOILER*** Do not read this, if you think a... | 0 |