{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "kernelspec": { "display_name": "TensorFlow 2.3 on Python 3.6 (CUDA 10.1)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.9" }, "colab": { "name": "11-2.dense_sentiment_classifier.ipynb", "provenance": [] }, "accelerator": "GPU" }, "cells": [ { "cell_type": "markdown", "metadata": { "id": "idtwLdsHd7jS" }, "source": [ "# Dense Sentiment Classifier" ] }, { "cell_type": "markdown", "metadata": { "id": "-oRWHp5Wd7jW" }, "source": [ "In this notebook, we build a dense neural network that classifies IMDB movie reviews by their sentiment." ] }, { "cell_type": "markdown", "metadata": { "id": "3JzWCOsvd7jW" }, "source": [ "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/rickiepark/dl-illustrated/blob/master/notebooks/11-2.dense_sentiment_classifier.ipynb)" ] }, { "cell_type": "markdown", "metadata": { "id": "yTPyx0Zmd7jX" }, "source": [ "#### Load dependencies" ] }, { "cell_type": "code", "metadata": { "id": "PpYTAf5od7jX" }, "source": [ "from tensorflow import keras\n", "from tensorflow.keras.datasets import imdb # new! \n", "from tensorflow.keras.preprocessing.sequence import pad_sequences #new!\n", "from tensorflow.keras.models import Sequential\n", "from tensorflow.keras.layers import Dense, Flatten, Dropout\n", "from tensorflow.keras.layers import Embedding # new!\n", "from tensorflow.keras.callbacks import ModelCheckpoint # new! \n", "import os # new! 
\n", "from sklearn.metrics import roc_auc_score, roc_curve # new!\n", "import pandas as pd\n", "import matplotlib.pyplot as plt # new!\n", "%matplotlib inline" ], "execution_count": 1, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "UFV-K9X4d7jX" }, "source": [ "#### Set hyperparameters" ] }, { "cell_type": "code", "metadata": { "id": "2f-Bg31Wd7jY" }, "source": [ "# output directory\n", "output_dir = 'model_output/dense'\n", "\n", "# training\n", "epochs = 4\n", "batch_size = 128\n", "\n", "# vector-space embedding\n", "n_dim = 64\n", "n_unique_words = 5000 # as per Maas et al. (2011); may not be optimal\n", "n_words_to_skip = 50 # ditto\n", "max_review_length = 100\n", "pad_type = trunc_type = 'pre'\n", "\n", "# neural network architecture\n", "n_dense = 64\n", "dropout = 0.5" ], "execution_count": 2, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "ketzJRXEd7jY" }, "source": [ "#### Load data" ] }, { "cell_type": "markdown", "metadata": { "id": "TFoTzUWId7jY" }, "source": [ "For a given dataset:\n", "\n", "* The [Keras text utilities](http://keras-ko.kr/api/preprocessing/text/) quickly preprocess natural language and convert it into indices.\n", "* The `keras.preprocessing.text.Tokenizer` class can handle everything you need in one line:\n", " * tokenize into words or characters\n", " * `num_words`: maximum number of unique tokens\n", " * strip punctuation\n", " * convert to lowercase\n", " * convert words to integer indices" ] }, { "cell_type": "code", "metadata": { "id": "IG6RsgI_d7jY", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "d63d5b50-2671-4029-d2a9-0f462d8b5494" }, "source": [ "(x_train, y_train), (x_valid, y_valid) = imdb.load_data(num_words=n_unique_words, \n", " skip_top=n_words_to_skip) " ], "execution_count": 3, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb.npz\n", "17464789/17464789 [==============================] - 0s 0us/step\n" ] } ] }, { "cell_type": "code", "metadata": { "id": "Q4N9bbD5d7jY", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "4965ef43-478f-4c4e-f73c-e1cce3dc77eb" }, "source": [ 
"x_train[0:6] # 0 is reserved for padding; 1 is the start token; 2 is the unknown token; actual words start at index 4 (3 is unused)" ], "execution_count": 4, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "array([list([2, 2, 2, 2, 2, 530, 973, 1622, 1385, 65, 458, 4468, 66, 3941, 2, 173, 2, 256, 2, 2, 100, 2, 838, 112, 50, 670, 2, 2, 2, 480, 284, 2, 150, 2, 172, 112, 167, 2, 336, 385, 2, 2, 172, 4536, 1111, 2, 546, 2, 2, 447, 2, 192, 50, 2, 2, 147, 2025, 2, 2, 2, 2, 1920, 4613, 469, 2, 2, 71, 87, 2, 2, 2, 530, 2, 76, 2, 2, 1247, 2, 2, 2, 515, 2, 2, 2, 626, 2, 2, 2, 62, 386, 2, 2, 316, 2, 106, 2, 2, 2223, 2, 2, 480, 66, 3785, 2, 2, 130, 2, 2, 2, 619, 2, 2, 124, 51, 2, 135, 2, 2, 1415, 2, 2, 2, 2, 215, 2, 77, 52, 2, 2, 407, 2, 82, 2, 2, 2, 107, 117, 2, 2, 256, 2, 2, 2, 3766, 2, 723, 2, 71, 2, 530, 476, 2, 400, 317, 2, 2, 2, 2, 1029, 2, 104, 88, 2, 381, 2, 297, 98, 2, 2071, 56, 2, 141, 2, 194, 2, 2, 2, 226, 2, 2, 134, 476, 2, 480, 2, 144, 2, 2, 2, 51, 2, 2, 224, 92, 2, 104, 2, 226, 65, 2, 2, 1334, 88, 2, 2, 283, 2, 2, 4472, 113, 103, 2, 2, 2, 2, 2, 178, 2]),\n", " list([2, 194, 1153, 194, 2, 78, 228, 2, 2, 1463, 4369, 2, 134, 2, 2, 715, 2, 118, 1634, 2, 394, 2, 2, 119, 954, 189, 102, 2, 207, 110, 3103, 2, 2, 69, 188, 2, 2, 2, 2, 2, 249, 126, 93, 2, 114, 2, 2300, 1523, 2, 647, 2, 116, 2, 2, 2, 2, 229, 2, 340, 1322, 2, 118, 2, 2, 130, 4901, 2, 2, 1002, 2, 89, 2, 952, 2, 2, 2, 455, 2, 2, 2, 2, 1543, 1905, 398, 2, 1649, 2, 2, 2, 163, 2, 3215, 2, 2, 1153, 2, 194, 775, 2, 2, 2, 349, 2637, 148, 605, 2, 2, 2, 123, 125, 68, 2, 2, 2, 349, 165, 4362, 98, 2, 2, 228, 2, 2, 2, 1157, 2, 299, 120, 2, 120, 174, 2, 220, 175, 136, 50, 2, 4373, 228, 2, 2, 2, 656, 245, 2350, 2, 2, 2, 131, 152, 491, 2, 2, 2, 2, 1212, 2, 2, 2, 371, 78, 2, 625, 64, 1382, 2, 2, 168, 145, 2, 2, 1690, 2, 2, 2, 1355, 2, 2, 2, 52, 154, 462, 2, 89, 78, 285, 2, 145, 95]),\n", " list([2, 2, 2, 2, 2, 2, 2, 2, 249, 108, 2, 2, 2, 54, 61, 369, 2, 71, 149, 2, 2, 112, 2, 2401, 311, 2, 2, 3711, 2, 75, 2, 1829, 296, 2, 86, 320, 2, 534, 2, 263, 4821, 1301, 2, 1873, 
2, 89, 78, 2, 66, 2, 2, 360, 2, 2, 58, 316, 334, 2, 2, 1716, 2, 645, 662, 2, 257, 85, 1200, 2, 1228, 2578, 83, 68, 3912, 2, 2, 165, 1539, 278, 2, 69, 2, 780, 2, 106, 2, 2, 1338, 2, 2, 2, 2, 215, 2, 610, 2, 2, 87, 326, 2, 2300, 2, 2, 2, 2, 272, 2, 57, 2, 2, 2, 2, 2, 2, 2307, 51, 2, 170, 2, 595, 116, 595, 1352, 2, 191, 79, 638, 89, 2, 2, 2, 2, 106, 607, 624, 2, 534, 2, 227, 2, 129, 113]),\n", " list([2, 2, 2, 2, 2, 2804, 2, 2040, 432, 111, 153, 103, 2, 1494, 2, 70, 131, 67, 2, 61, 2, 744, 2, 3715, 761, 61, 2, 452, 2, 2, 985, 2, 2, 59, 166, 2, 105, 216, 1239, 2, 1797, 2, 2, 2, 2, 744, 2413, 2, 2, 2, 687, 2, 2, 2, 2, 2, 3693, 2, 2, 2, 121, 59, 456, 2, 2, 2, 265, 2, 575, 111, 153, 159, 59, 2, 1447, 2, 2, 586, 482, 2, 2, 96, 59, 716, 2, 2, 172, 65, 2, 579, 2, 2, 2, 1615, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 464, 2, 314, 2, 2, 2, 719, 605, 2, 2, 202, 2, 310, 2, 3772, 3501, 2, 2722, 58, 2, 2, 537, 2116, 180, 2, 2, 413, 173, 2, 263, 112, 2, 152, 377, 2, 537, 263, 846, 579, 178, 54, 75, 71, 476, 2, 413, 263, 2504, 182, 2, 2, 75, 2306, 922, 2, 279, 131, 2895, 2, 2867, 2, 2, 2, 921, 2, 192, 2, 1219, 3890, 2, 2, 217, 4122, 1710, 537, 2, 1236, 2, 736, 2, 2, 61, 403, 2, 2, 2, 61, 4494, 2, 2, 4494, 159, 90, 263, 2311, 4319, 309, 2, 178, 2, 82, 4319, 2, 65, 2, 2, 145, 143, 2, 2, 2, 537, 746, 537, 537, 2, 2, 2, 2, 594, 2, 2, 94, 2, 3987, 2, 2, 2, 2, 538, 2, 1795, 246, 2, 2, 2, 2, 635, 2, 2, 51, 408, 2, 94, 318, 1382, 2, 2, 2, 2683, 936, 2, 2, 2, 2, 2, 2, 2, 1885, 2, 1118, 2, 80, 126, 842, 2, 2, 2, 2, 4726, 2, 4494, 2, 1550, 3633, 159, 2, 341, 2, 2733, 2, 4185, 173, 2, 90, 2, 2, 2, 2, 2, 1784, 86, 1117, 2, 3261, 2, 2, 2, 2, 2, 2, 2841, 2, 2, 1010, 2, 793, 2, 2, 1386, 1830, 2, 2, 246, 50, 2, 2, 2750, 1944, 746, 90, 2, 2, 2, 124, 2, 882, 2, 882, 496, 2, 2, 2213, 537, 121, 127, 1219, 130, 2, 2, 494, 2, 124, 2, 882, 496, 2, 341, 2, 2, 846, 2, 2, 2, 2, 1906, 2, 97, 2, 236, 2, 1311, 2, 2, 2, 2, 2, 2, 2, 91, 2, 3987, 70, 2, 882, 2, 579, 2, 2, 2, 2, 2, 537, 2, 2, 2, 2, 65, 2, 537, 75, 2, 1775, 
3353, 2, 1846, 2, 2, 2, 154, 2, 2, 518, 53, 2, 2, 2, 3211, 882, 2, 399, 2, 75, 257, 3807, 2, 2, 2, 2, 456, 2, 65, 2, 2, 205, 113, 2, 2, 2, 2, 2, 2, 2, 242, 2, 91, 1202, 2, 2, 2070, 307, 2, 2, 2, 126, 93, 2, 2, 2, 188, 1076, 3222, 2, 2, 2, 2, 2348, 537, 2, 53, 537, 2, 82, 2, 2, 2, 2, 2, 280, 2, 219, 2, 2, 431, 758, 859, 2, 953, 1052, 2, 2, 2, 2, 94, 2, 2, 238, 60, 2, 2, 2, 804, 2, 2, 2, 2, 132, 2, 67, 2, 2, 2, 2, 283, 2, 2, 2, 2, 2, 242, 955, 2, 2, 279, 2, 2, 2, 1685, 195, 2, 238, 60, 796, 2, 2, 671, 2, 2804, 2, 2, 559, 154, 888, 2, 726, 50, 2, 2, 2, 2, 566, 2, 579, 2, 64, 2574]),\n", " list([2, 249, 1323, 2, 61, 113, 2, 2, 2, 1637, 2, 2, 56, 2, 2401, 2, 457, 88, 2, 2626, 1400, 2, 3171, 2, 70, 79, 2, 706, 919, 2, 2, 355, 340, 355, 1696, 96, 143, 2, 2, 2, 289, 2, 61, 369, 71, 2359, 2, 2, 2, 131, 2073, 249, 114, 249, 229, 249, 2, 2, 2, 126, 110, 2, 473, 2, 569, 61, 419, 56, 429, 2, 1513, 2, 2, 534, 95, 474, 570, 2, 2, 124, 138, 88, 2, 421, 1543, 52, 725, 2, 61, 419, 2, 2, 1571, 2, 1543, 2, 2, 2, 2, 2, 296, 2, 3524, 2, 2, 421, 128, 74, 233, 334, 207, 126, 224, 2, 562, 298, 2167, 1272, 2, 2601, 2, 516, 988, 2, 2, 79, 120, 2, 595, 2, 784, 2, 3171, 2, 165, 170, 143, 2, 2, 2, 2, 2, 226, 251, 2, 61, 113]),\n", " list([2, 778, 128, 74, 2, 630, 163, 2, 2, 1766, 2, 1051, 2, 2, 85, 156, 2, 2, 148, 139, 121, 664, 665, 2, 2, 1361, 173, 2, 749, 2, 2, 3804, 2, 2, 226, 65, 2, 2, 127, 2, 2, 2, 2])],\n", " dtype=object)" ] }, "metadata": {}, "execution_count": 4 } ] }, { "cell_type": "code", "metadata": { "id": "dE-B4rtWd7jZ", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "f99c60d8-2659-46c4-db77-783b6c037b86" }, "source": [ "for x in x_train[0:6]:\n", " print(len(x))" ], "execution_count": 5, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "218\n", "189\n", "141\n", "550\n", "147\n", "43\n" ] } ] }, { "cell_type": "code", "metadata": { "id": "v49ped-Sd7ja", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": 
"6b07f706-d652-4340-a466-4dee8bf86d94" }, "source": [ "y_train[0:6]" ], "execution_count": 6, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "array([1, 0, 0, 1, 0, 0])" ] }, "metadata": {}, "execution_count": 6 } ] }, { "cell_type": "code", "metadata": { "id": "z85LoqVId7ja", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "dec54a78-5c3e-4d94-d2b2-1c2eee58cfef" }, "source": [ "len(x_train), len(x_valid)" ], "execution_count": 7, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "(25000, 25000)" ] }, "metadata": {}, "execution_count": 7 } ] }, { "cell_type": "markdown", "metadata": { "id": "fvzvJVGYd7ja" }, "source": [ "#### Restoring words from indices" ] }, { "cell_type": "code", "metadata": { "id": "nuzdBUi-d7ja", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "3916fb05-4ead-4ee0-a8ef-a6fef4cb3f95" }, "source": [ "word_index = keras.datasets.imdb.get_word_index()\n", "# shift every index by 3 to match the default index_from=3 offset applied by imdb.load_data\n", "word_index = {k:(v+3) for k,v in word_index.items()}\n", "# indices 0, 1 and 2 are reserved for the padding, start and unknown tokens\n", "word_index[\"PAD\"] = 0\n", "word_index[\"START\"] = 1\n", "word_index[\"UNK\"] = 2" ], "execution_count": 8, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb_word_index.json\n", "1641221/1641221 [==============================] - 0s 0us/step\n" ] } ] }, { "cell_type": "code", "metadata": { "id": "5AYoOAK7d7ja", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "36cb8dbb-7aac-4713-c1ba-423cf6a79099" }, "source": [ "word_index" ], "execution_count": 9, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "{'fawn': 34704,\n", " 'tsukino': 52009,\n", " 'nunnery': 52010,\n", " 'sonja': 16819,\n", " 'vani': 63954,\n", " 'woods': 1411,\n", " 'spiders': 16118,\n", " 'hanging': 2348,\n", " 'woody': 2292,\n", " 'trawling': 52011,\n", " \"hold's\": 52012,\n", " 'comically': 11310,\n", " 'localized': 40833,\n", " 'disobeying': 30571,\n", " 
\"'royale\": 52013,\n", " \"harpo's\": 40834,\n", " 'canet': 52014,\n", " 'aileen': 19316,\n", " 'acurately': 52015,\n", " \"diplomat's\": 52016,\n", " 'rickman': 25245,\n", " 'arranged': 6749,\n", " 'rumbustious': 52017,\n", " 'familiarness': 52018,\n", " \"spider'\": 52019,\n", " 'hahahah': 68807,\n", " \"wood'\": 52020,\n", " 'transvestism': 40836,\n", " \"hangin'\": 34705,\n", " 'bringing': 2341,\n", " 'seamier': 40837,\n", " 'wooded': 34706,\n", " 'bravora': 52021,\n", " 'grueling': 16820,\n", " 'wooden': 1639,\n", " 'wednesday': 16821,\n", " \"'prix\": 52022,\n", " 'altagracia': 34707,\n", " 'circuitry': 52023,\n", " 'crotch': 11588,\n", " 'busybody': 57769,\n", " \"tart'n'tangy\": 52024,\n", " 'burgade': 14132,\n", " 'thrace': 52026,\n", " \"tom's\": 11041,\n", " 'snuggles': 52028,\n", " 'francesco': 29117,\n", " 'complainers': 52030,\n", " 'templarios': 52128,\n", " '272': 40838,\n", " '273': 52031,\n", " 'zaniacs': 52133,\n", " '275': 34709,\n", " 'consenting': 27634,\n", " 'snuggled': 40839,\n", " 'inanimate': 15495,\n", " 'uality': 52033,\n", " 'bronte': 11929,\n", " 'errors': 4013,\n", " 'dialogs': 3233,\n", " \"yomada's\": 52034,\n", " \"madman's\": 34710,\n", " 'dialoge': 30588,\n", " 'usenet': 52036,\n", " 'videodrome': 40840,\n", " \"kid'\": 26341,\n", " 'pawed': 52037,\n", " \"'girlfriend'\": 30572,\n", " \"'pleasure\": 52038,\n", " \"'reloaded'\": 52039,\n", " \"kazakos'\": 40842,\n", " 'rocque': 52040,\n", " 'mailings': 52041,\n", " 'brainwashed': 11930,\n", " 'mcanally': 16822,\n", " \"tom''\": 52042,\n", " 'kurupt': 25246,\n", " 'affiliated': 21908,\n", " 'babaganoosh': 52043,\n", " \"noe's\": 40843,\n", " 'quart': 40844,\n", " 'kids': 362,\n", " 'uplifting': 5037,\n", " 'controversy': 7096,\n", " 'kida': 21909,\n", " 'kidd': 23382,\n", " \"error'\": 52044,\n", " 'neurologist': 52045,\n", " 'spotty': 18513,\n", " 'cobblers': 30573,\n", " 'projection': 9881,\n", " 'fastforwarding': 40845,\n", " 'sters': 52046,\n", " \"eggar's\": 52047,\n", " 
'etherything': 52048,\n", " 'gateshead': 40846,\n", " 'airball': 34711,\n", " 'unsinkable': 25247,\n", " 'stern': 7183,\n", " \"cervi's\": 52049,\n", " 'dnd': 40847,\n", " 'dna': 11589,\n", " 'insecurity': 20601,\n", " \"'reboot'\": 52050,\n", " 'trelkovsky': 11040,\n", " 'jaekel': 52051,\n", " 'sidebars': 52052,\n", " \"sforza's\": 52053,\n", " 'distortions': 17636,\n", " 'mutinies': 52054,\n", " 'sermons': 30605,\n", " '7ft': 40849,\n", " 'boobage': 52055,\n", " \"o'bannon's\": 52056,\n", " 'populations': 23383,\n", " 'chulak': 52057,\n", " 'mesmerize': 27636,\n", " 'quinnell': 52058,\n", " 'yahoo': 10310,\n", " 'meteorologist': 52060,\n", " 'beswick': 42580,\n", " 'boorman': 15496,\n", " 'voicework': 40850,\n", " \"ster'\": 52061,\n", " 'blustering': 22925,\n", " 'hj': 52062,\n", " 'intake': 27637,\n", " 'morally': 5624,\n", " 'jumbling': 40852,\n", " 'bowersock': 52063,\n", " \"'porky's'\": 52064,\n", " 'gershon': 16824,\n", " 'ludicrosity': 40853,\n", " 'coprophilia': 52065,\n", " 'expressively': 40854,\n", " \"india's\": 19503,\n", " \"post's\": 34713,\n", " 'wana': 52066,\n", " 'wang': 5286,\n", " 'wand': 30574,\n", " 'wane': 25248,\n", " 'edgeways': 52324,\n", " 'titanium': 34714,\n", " 'pinta': 40855,\n", " 'want': 181,\n", " 'pinto': 30575,\n", " 'whoopdedoodles': 52068,\n", " 'tchaikovsky': 21911,\n", " 'travel': 2106,\n", " \"'victory'\": 52069,\n", " 'copious': 11931,\n", " 'gouge': 22436,\n", " \"chapters'\": 52070,\n", " 'barbra': 6705,\n", " 'uselessness': 30576,\n", " \"wan'\": 52071,\n", " 'assimilated': 27638,\n", " 'petiot': 16119,\n", " 'most\\x85and': 52072,\n", " 'dinosaurs': 3933,\n", " 'wrong': 355,\n", " 'seda': 52073,\n", " 'stollen': 52074,\n", " 'sentencing': 34715,\n", " 'ouroboros': 40856,\n", " 'assimilates': 40857,\n", " 'colorfully': 40858,\n", " 'glenne': 27639,\n", " 'dongen': 52075,\n", " 'subplots': 4763,\n", " 'kiloton': 52076,\n", " 'chandon': 23384,\n", " \"effect'\": 34716,\n", " 'snugly': 27640,\n", " 'kuei': 40859,\n", " 
'welcomed': 9095,\n", " 'dishonor': 30074,\n", " 'concurrence': 52078,\n", " 'stoicism': 23385,\n", " \"guys'\": 14899,\n", " \"beroemd'\": 52080,\n", " 'butcher': 6706,\n", " \"melfi's\": 40860,\n", " 'aargh': 30626,\n", " 'playhouse': 20602,\n", " 'wickedly': 11311,\n", " 'fit': 1183,\n", " 'labratory': 52081,\n", " 'lifeline': 40862,\n", " 'screaming': 1930,\n", " 'fix': 4290,\n", " 'cineliterate': 52082,\n", " 'fic': 52083,\n", " 'fia': 52084,\n", " 'fig': 34717,\n", " 'fmvs': 52085,\n", " 'fie': 52086,\n", " 'reentered': 52087,\n", " 'fin': 30577,\n", " 'doctresses': 52088,\n", " 'fil': 52089,\n", " 'zucker': 12609,\n", " 'ached': 31934,\n", " 'counsil': 52091,\n", " 'paterfamilias': 52092,\n", " 'songwriter': 13888,\n", " 'shivam': 34718,\n", " 'hurting': 9657,\n", " 'effects': 302,\n", " 'slauther': 52093,\n", " \"'flame'\": 52094,\n", " 'sommerset': 52095,\n", " 'interwhined': 52096,\n", " 'whacking': 27641,\n", " 'bartok': 52097,\n", " 'barton': 8778,\n", " 'frewer': 21912,\n", " \"fi'\": 52098,\n", " 'ingrid': 6195,\n", " 'stribor': 30578,\n", " 'approporiately': 52099,\n", " 'wobblyhand': 52100,\n", " 'tantalisingly': 52101,\n", " 'ankylosaurus': 52102,\n", " 'parasites': 17637,\n", " 'childen': 52103,\n", " \"jenkins'\": 52104,\n", " 'metafiction': 52105,\n", " 'golem': 17638,\n", " 'indiscretion': 40863,\n", " \"reeves'\": 23386,\n", " \"inamorata's\": 57784,\n", " 'brittannica': 52107,\n", " 'adapt': 7919,\n", " \"russo's\": 30579,\n", " 'guitarists': 48249,\n", " 'abbott': 10556,\n", " 'abbots': 40864,\n", " 'lanisha': 17652,\n", " 'magickal': 40866,\n", " 'mattter': 52108,\n", " \"'willy\": 52109,\n", " 'pumpkins': 34719,\n", " 'stuntpeople': 52110,\n", " 'estimate': 30580,\n", " 'ugghhh': 40867,\n", " 'gameplay': 11312,\n", " \"wern't\": 52111,\n", " \"n'sync\": 40868,\n", " 'sickeningly': 16120,\n", " 'chiara': 40869,\n", " 'disturbed': 4014,\n", " 'portmanteau': 40870,\n", " 'ineffectively': 52112,\n", " \"duchonvey's\": 82146,\n", " \"nasty'\": 
37522,\n", " 'purpose': 1288,\n", " 'lazers': 52115,\n", " 'lightened': 28108,\n", " 'kaliganj': 52116,\n", " 'popularism': 52117,\n", " \"damme's\": 18514,\n", " 'stylistics': 30581,\n", " 'mindgaming': 52118,\n", " 'spoilerish': 46452,\n", " \"'corny'\": 52120,\n", " 'boerner': 34721,\n", " 'olds': 6795,\n", " 'bakelite': 52121,\n", " 'renovated': 27642,\n", " 'forrester': 27643,\n", " \"lumiere's\": 52122,\n", " 'gaskets': 52027,\n", " 'needed': 887,\n", " 'smight': 34722,\n", " 'master': 1300,\n", " \"edie's\": 25908,\n", " 'seeber': 40871,\n", " 'hiya': 52123,\n", " 'fuzziness': 52124,\n", " 'genesis': 14900,\n", " 'rewards': 12610,\n", " 'enthrall': 30582,\n", " \"'about\": 40872,\n", " \"recollection's\": 52125,\n", " 'mutilated': 11042,\n", " 'fatherlands': 52126,\n", " \"fischer's\": 52127,\n", " 'positively': 5402,\n", " '270': 34708,\n", " 'ahmed': 34723,\n", " 'zatoichi': 9839,\n", " 'bannister': 13889,\n", " 'anniversaries': 52130,\n", " \"helm's\": 30583,\n", " \"'work'\": 52131,\n", " 'exclaimed': 34724,\n", " \"'unfunny'\": 52132,\n", " '274': 52032,\n", " 'feeling': 547,\n", " \"wanda's\": 52134,\n", " 'dolan': 33269,\n", " '278': 52136,\n", " 'peacoat': 52137,\n", " 'brawny': 40873,\n", " 'mishra': 40874,\n", " 'worlders': 40875,\n", " 'protags': 52138,\n", " 'skullcap': 52139,\n", " 'dastagir': 57599,\n", " 'affairs': 5625,\n", " 'wholesome': 7802,\n", " 'hymen': 52140,\n", " 'paramedics': 25249,\n", " 'unpersons': 52141,\n", " 'heavyarms': 52142,\n", " 'affaire': 52143,\n", " 'coulisses': 52144,\n", " 'hymer': 40876,\n", " 'kremlin': 52145,\n", " 'shipments': 30584,\n", " 'pixilated': 52146,\n", " \"'00s\": 30585,\n", " 'diminishing': 18515,\n", " 'cinematic': 1360,\n", " 'resonates': 14901,\n", " 'simplify': 40877,\n", " \"nature'\": 40878,\n", " 'temptresses': 40879,\n", " 'reverence': 16825,\n", " 'resonated': 19505,\n", " 'dailey': 34725,\n", " '2\\x85': 52147,\n", " 'treize': 27644,\n", " 'majo': 52148,\n", " 'kiya': 21913,\n", " 
'woolnough': 52149,\n", " 'thanatos': 39800,\n", " 'sandoval': 35734,\n", " 'dorama': 40882,\n", " \"o'shaughnessy\": 52150,\n", " 'tech': 4991,\n", " 'fugitives': 32021,\n", " 'teck': 30586,\n", " \"'e'\": 76128,\n", " 'doesn’t': 40884,\n", " 'purged': 52152,\n", " 'saying': 660,\n", " \"martians'\": 41098,\n", " 'norliss': 23421,\n", " 'dickey': 27645,\n", " 'dicker': 52155,\n", " \"'sependipity\": 52156,\n", " 'padded': 8425,\n", " 'ordell': 57795,\n", " \"sturges'\": 40885,\n", " 'independentcritics': 52157,\n", " 'tempted': 5748,\n", " \"atkinson's\": 34727,\n", " 'hounded': 25250,\n", " 'apace': 52158,\n", " 'clicked': 15497,\n", " \"'humor'\": 30587,\n", " \"martino's\": 17180,\n", " \"'supporting\": 52159,\n", " 'warmongering': 52035,\n", " \"zemeckis's\": 34728,\n", " 'lube': 21914,\n", " 'shocky': 52160,\n", " 'plate': 7479,\n", " 'plata': 40886,\n", " 'sturgess': 40887,\n", " \"nerds'\": 40888,\n", " 'plato': 20603,\n", " 'plath': 34729,\n", " 'platt': 40889,\n", " 'mcnab': 52162,\n", " 'clumsiness': 27646,\n", " 'altogether': 3902,\n", " 'massacring': 42587,\n", " 'bicenntinial': 52163,\n", " 'skaal': 40890,\n", " 'droning': 14363,\n", " 'lds': 8779,\n", " 'jaguar': 21915,\n", " \"cale's\": 34730,\n", " 'nicely': 1780,\n", " 'mummy': 4591,\n", " \"lot's\": 18516,\n", " 'patch': 10089,\n", " 'kerkhof': 50205,\n", " \"leader's\": 52164,\n", " \"'movie\": 27647,\n", " 'uncomfirmed': 52165,\n", " 'heirloom': 40891,\n", " 'wrangle': 47363,\n", " 'emotion\\x85': 52166,\n", " \"'stargate'\": 52167,\n", " 'pinoy': 40892,\n", " 'conchatta': 40893,\n", " 'broeke': 41131,\n", " 'advisedly': 40894,\n", " \"barker's\": 17639,\n", " 'descours': 52169,\n", " 'lots': 775,\n", " 'lotr': 9262,\n", " 'irs': 9882,\n", " 'lott': 52170,\n", " 'xvi': 40895,\n", " 'irk': 34731,\n", " 'irl': 52171,\n", " 'ira': 6890,\n", " 'belzer': 21916,\n", " 'irc': 52172,\n", " 'ire': 27648,\n", " 'requisites': 40896,\n", " 'discipline': 7696,\n", " 'lyoko': 52964,\n", " 'extend': 
11313,\n", " 'nature': 876,\n", " \"'dickie'\": 52173,\n", " 'optimist': 40897,\n", " 'lapping': 30589,\n", " 'superficial': 3903,\n", " 'vestment': 52174,\n", " 'extent': 2826,\n", " 'tendons': 52175,\n", " \"heller's\": 52176,\n", " 'quagmires': 52177,\n", " 'miyako': 52178,\n", " 'moocow': 20604,\n", " \"coles'\": 52179,\n", " 'lookit': 40898,\n", " 'ravenously': 52180,\n", " 'levitating': 40899,\n", " 'perfunctorily': 52181,\n", " 'lookin': 30590,\n", " \"lot'\": 40901,\n", " 'lookie': 52182,\n", " 'fearlessly': 34873,\n", " 'libyan': 52184,\n", " 'fondles': 40902,\n", " 'gopher': 35717,\n", " 'wearying': 40904,\n", " \"nz's\": 52185,\n", " 'minuses': 27649,\n", " 'puposelessly': 52186,\n", " 'shandling': 52187,\n", " 'decapitates': 31271,\n", " 'humming': 11932,\n", " \"'nother\": 40905,\n", " 'smackdown': 21917,\n", " 'underdone': 30591,\n", " 'frf': 40906,\n", " 'triviality': 52188,\n", " 'fro': 25251,\n", " 'bothers': 8780,\n", " \"'kensington\": 52189,\n", " 'much': 76,\n", " 'muco': 34733,\n", " 'wiseguy': 22618,\n", " \"richie's\": 27651,\n", " 'tonino': 40907,\n", " 'unleavened': 52190,\n", " 'fry': 11590,\n", " \"'tv'\": 40908,\n", " 'toning': 40909,\n", " 'obese': 14364,\n", " 'sensationalized': 30592,\n", " 'spiv': 40910,\n", " 'spit': 6262,\n", " 'arkin': 7367,\n", " 'charleton': 21918,\n", " 'jeon': 16826,\n", " 'boardroom': 21919,\n", " 'doubts': 4992,\n", " 'spin': 3087,\n", " 'hepo': 53086,\n", " 'wildcat': 27652,\n", " 'venoms': 10587,\n", " 'misconstrues': 52194,\n", " 'mesmerising': 18517,\n", " 'misconstrued': 40911,\n", " 'rescinds': 52195,\n", " 'prostrate': 52196,\n", " 'majid': 40912,\n", " 'climbed': 16482,\n", " 'canoeing': 34734,\n", " 'majin': 52198,\n", " 'animie': 57807,\n", " 'sylke': 40913,\n", " 'conditioned': 14902,\n", " 'waddell': 40914,\n", " '3\\x85': 52199,\n", " 'hyperdrive': 41191,\n", " 'conditioner': 34735,\n", " 'bricklayer': 53156,\n", " 'hong': 2579,\n", " 'memoriam': 52201,\n", " 'inventively': 30595,\n", " 
\"levant's\": 25252,\n", " 'portobello': 20641,\n", " 'remand': 52203,\n", " 'mummified': 19507,\n", " 'honk': 27653,\n", " 'spews': 19508,\n", " 'visitations': 40915,\n", " 'mummifies': 52204,\n", " 'cavanaugh': 25253,\n", " 'zeon': 23388,\n", " \"jungle's\": 40916,\n", " 'viertel': 34736,\n", " 'frenchmen': 27654,\n", " 'torpedoes': 52205,\n", " 'schlessinger': 52206,\n", " 'torpedoed': 34737,\n", " 'blister': 69879,\n", " 'cinefest': 52207,\n", " 'furlough': 34738,\n", " 'mainsequence': 52208,\n", " 'mentors': 40917,\n", " 'academic': 9097,\n", " 'stillness': 20605,\n", " 'academia': 40918,\n", " 'lonelier': 52209,\n", " 'nibby': 52210,\n", " \"losers'\": 52211,\n", " 'cineastes': 40919,\n", " 'corporate': 4452,\n", " 'massaging': 40920,\n", " 'bellow': 30596,\n", " 'absurdities': 19509,\n", " 'expetations': 53244,\n", " 'nyfiken': 40921,\n", " 'mehras': 75641,\n", " 'lasse': 52212,\n", " 'visability': 52213,\n", " 'militarily': 33949,\n", " \"elder'\": 52214,\n", " 'gainsbourg': 19026,\n", " 'hah': 20606,\n", " 'hai': 13423,\n", " 'haj': 34739,\n", " 'hak': 25254,\n", " 'hal': 4314,\n", " 'ham': 4895,\n", " 'duffer': 53262,\n", " 'haa': 52216,\n", " 'had': 69,\n", " 'advancement': 11933,\n", " 'hag': 16828,\n", " \"hand'\": 25255,\n", " 'hay': 13424,\n", " 'mcnamara': 20607,\n", " \"mozart's\": 52217,\n", " 'duffel': 30734,\n", " 'haq': 30597,\n", " 'har': 13890,\n", " 'has': 47,\n", " 'hat': 2404,\n", " 'hav': 40922,\n", " 'haw': 30598,\n", " 'figtings': 52218,\n", " 'elders': 15498,\n", " 'underpanted': 52219,\n", " 'pninson': 52220,\n", " 'unequivocally': 27655,\n", " \"barbara's\": 23676,\n", " \"bello'\": 52222,\n", " 'indicative': 13000,\n", " 'yawnfest': 40923,\n", " 'hexploitation': 52223,\n", " \"loder's\": 52224,\n", " 'sleuthing': 27656,\n", " \"justin's\": 32625,\n", " \"'ball\": 52225,\n", " \"'summer\": 52226,\n", " \"'demons'\": 34938,\n", " \"mormon's\": 52228,\n", " \"laughton's\": 34740,\n", " 'debell': 52229,\n", " 'shipyard': 39727,\n", " 
'unabashedly': 30600,\n", " 'disks': 40404,\n", " 'crowd': 2293,\n", " 'crowe': 10090,\n", " \"vancouver's\": 56437,\n", " 'mosques': 34741,\n", " 'crown': 6630,\n", " 'culpas': 52230,\n", " 'crows': 27657,\n", " 'surrell': 53347,\n", " 'flowless': 52232,\n", " 'sheirk': 52233,\n", " \"'three\": 40926,\n", " \"peterson'\": 52234,\n", " 'ooverall': 52235,\n", " 'perchance': 40927,\n", " 'bottom': 1324,\n", " 'chabert': 53366,\n", " 'sneha': 52236,\n", " 'inhuman': 13891,\n", " 'ichii': 52237,\n", " 'ursla': 52238,\n", " 'completly': 30601,\n", " 'moviedom': 40928,\n", " 'raddick': 52239,\n", " 'brundage': 51998,\n", " 'brigades': 40929,\n", " 'starring': 1184,\n", " \"'goal'\": 52240,\n", " 'caskets': 52241,\n", " 'willcock': 52242,\n", " \"threesome's\": 52243,\n", " \"mosque'\": 52244,\n", " \"cover's\": 52245,\n", " 'spaceships': 17640,\n", " 'anomalous': 40930,\n", " 'ptsd': 27658,\n", " 'shirdan': 52246,\n", " 'obscenity': 21965,\n", " 'lemmings': 30602,\n", " 'duccio': 30603,\n", " \"levene's\": 52247,\n", " \"'gorby'\": 52248,\n", " \"teenager's\": 25258,\n", " 'marshall': 5343,\n", " 'honeymoon': 9098,\n", " 'shoots': 3234,\n", " 'despised': 12261,\n", " 'okabasho': 52249,\n", " 'fabric': 8292,\n", " 'cannavale': 18518,\n", " 'raped': 3540,\n", " \"tutt's\": 52250,\n", " 'grasping': 17641,\n", " 'despises': 18519,\n", " \"thief's\": 40931,\n", " 'rapes': 8929,\n", " 'raper': 52251,\n", " \"eyre'\": 27659,\n", " 'walchek': 52252,\n", " \"elmo's\": 23389,\n", " 'perfumes': 40932,\n", " 'spurting': 21921,\n", " \"exposition'\\x85\": 52253,\n", " 'denoting': 52254,\n", " 'thesaurus': 34743,\n", " \"shoot'\": 40933,\n", " 'bonejack': 49762,\n", " 'simpsonian': 52256,\n", " 'hebetude': 30604,\n", " \"hallow's\": 34744,\n", " 'desperation\\x85': 52257,\n", " 'incinerator': 34745,\n", " 'congratulations': 10311,\n", " 'humbled': 52258,\n", " \"else's\": 5927,\n", " 'trelkovski': 40848,\n", " \"rape'\": 52259,\n", " \"'chapters'\": 59389,\n", " '1600s': 52260,\n", " 
'martian': 7256,\n", " 'nicest': 25259,\n", " 'eyred': 52262,\n", " 'passenger': 9460,\n", " 'disgrace': 6044,\n", " 'moderne': 52263,\n", " 'barrymore': 5123,\n", " 'yankovich': 52264,\n", " 'moderns': 40934,\n", " 'studliest': 52265,\n", " 'bedsheet': 52266,\n", " 'decapitation': 14903,\n", " 'slurring': 52267,\n", " \"'nunsploitation'\": 52268,\n", " \"'character'\": 34746,\n", " 'cambodia': 9883,\n", " 'rebelious': 52269,\n", " 'pasadena': 27660,\n", " 'crowne': 40935,\n", " \"'bedchamber\": 52270,\n", " 'conjectural': 52271,\n", " 'appologize': 52272,\n", " 'halfassing': 52273,\n", " 'paycheque': 57819,\n", " 'palms': 20609,\n", " \"'islands\": 52274,\n", " 'hawked': 40936,\n", " 'palme': 21922,\n", " 'conservatively': 40937,\n", " 'larp': 64010,\n", " 'palma': 5561,\n", " 'smelling': 21923,\n", " 'aragorn': 13001,\n", " 'hawker': 52275,\n", " 'hawkes': 52276,\n", " 'explosions': 3978,\n", " 'loren': 8062,\n", " \"pyle's\": 52277,\n", " 'shootout': 6707,\n", " \"mike's\": 18520,\n", " \"driscoll's\": 52278,\n", " 'cogsworth': 40938,\n", " \"britian's\": 52279,\n", " 'childs': 34747,\n", " \"portrait's\": 52280,\n", " 'chain': 3629,\n", " 'whoever': 2500,\n", " 'puttered': 52281,\n", " 'childe': 52282,\n", " 'maywether': 52283,\n", " 'chair': 3039,\n", " \"rance's\": 52284,\n", " 'machu': 34748,\n", " 'ballet': 4520,\n", " 'grapples': 34749,\n", " 'summerize': 76155,\n", " 'freelance': 30606,\n", " \"andrea's\": 52286,\n", " '\\x91very': 52287,\n", " 'coolidge': 45882,\n", " 'mache': 18521,\n", " 'balled': 52288,\n", " 'grappled': 40940,\n", " 'macha': 18522,\n", " 'underlining': 21924,\n", " 'macho': 5626,\n", " 'oversight': 19510,\n", " 'machi': 25260,\n", " 'verbally': 11314,\n", " 'tenacious': 21925,\n", " 'windshields': 40941,\n", " 'paychecks': 18560,\n", " 'jerk': 3399,\n", " \"good'\": 11934,\n", " 'prancer': 34751,\n", " 'prances': 21926,\n", " 'olympus': 52289,\n", " 'lark': 21927,\n", " 'embark': 10788,\n", " 'gloomy': 7368,\n", " 'jehaan': 
52290,\n", " 'turaqui': 52291,\n", " \"child'\": 20610,\n", " 'locked': 2897,\n", " 'pranced': 52292,\n", " 'exact': 2591,\n", " 'unattuned': 52293,\n", " 'minute': 786,\n", " 'skewed': 16121,\n", " 'hodgins': 40943,\n", " 'skewer': 34752,\n", " 'think\\x85': 52294,\n", " 'rosenstein': 38768,\n", " 'helmit': 52295,\n", " 'wrestlemanias': 34753,\n", " 'hindered': 16829,\n", " \"martha's\": 30607,\n", " 'cheree': 52296,\n", " \"pluckin'\": 52297,\n", " 'ogles': 40944,\n", " 'heavyweight': 11935,\n", " 'aada': 82193,\n", " 'chopping': 11315,\n", " 'strongboy': 61537,\n", " 'hegemonic': 41345,\n", " 'adorns': 40945,\n", " 'xxth': 41349,\n", " 'nobuhiro': 34754,\n", " 'capitães': 52301,\n", " 'kavogianni': 52302,\n", " 'antwerp': 13425,\n", " 'celebrated': 6541,\n", " 'roarke': 52303,\n", " 'baggins': 40946,\n", " 'cheeseburgers': 31273,\n", " 'matras': 52304,\n", " \"nineties'\": 52305,\n", " \"'craig'\": 52306,\n", " 'celebrates': 13002,\n", " 'unintentionally': 3386,\n", " 'drafted': 14365,\n", " 'climby': 52307,\n", " '303': 52308,\n", " 'oldies': 18523,\n", " 'climbs': 9099,\n", " 'honour': 9658,\n", " 'plucking': 34755,\n", " '305': 30077,\n", " 'address': 5517,\n", " 'menjou': 40947,\n", " \"'freak'\": 42595,\n", " 'dwindling': 19511,\n", " 'benson': 9461,\n", " 'white’s': 52310,\n", " 'shamelessness': 40948,\n", " 'impacted': 21928,\n", " 'upatz': 52311,\n", " 'cusack': 3843,\n", " \"flavia's\": 37570,\n", " 'effette': 52312,\n", " 'influx': 34756,\n", " 'boooooooo': 52313,\n", " 'dimitrova': 52314,\n", " 'houseman': 13426,\n", " 'bigas': 25262,\n", " 'boylen': 52315,\n", " 'phillipenes': 52316,\n", " 'fakery': 40949,\n", " \"grandpa's\": 27661,\n", " 'darnell': 27662,\n", " 'undergone': 19512,\n", " 'handbags': 52318,\n", " 'perished': 21929,\n", " 'pooped': 37781,\n", " 'vigour': 27663,\n", " 'opposed': 3630,\n", " 'etude': 52319,\n", " \"caine's\": 11802,\n", " 'doozers': 52320,\n", " 'photojournals': 34757,\n", " 'perishes': 52321,\n", " 'constrains': 
34758,\n", " 'migenes': 40951,\n", " 'consoled': 30608,\n", " 'alastair': 16830,\n", " 'wvs': 52322,\n", " 'ooooooh': 52323,\n", " 'approving': 34759,\n", " 'consoles': 40952,\n", " 'disparagement': 52067,\n", " 'futureistic': 52325,\n", " 'rebounding': 52326,\n", " \"'date\": 52327,\n", " 'gregoire': 52328,\n", " 'rutherford': 21930,\n", " 'americanised': 34760,\n", " 'novikov': 82199,\n", " 'following': 1045,\n", " 'munroe': 34761,\n", " \"morita'\": 52329,\n", " 'christenssen': 52330,\n", " 'oatmeal': 23109,\n", " 'fossey': 25263,\n", " 'livered': 40953,\n", " 'listens': 13003,\n", " \"'marci\": 76167,\n", " \"otis's\": 52333,\n", " 'thanking': 23390,\n", " 'maude': 16022,\n", " 'extensions': 34762,\n", " 'ameteurish': 52335,\n", " \"commender's\": 52336,\n", " 'agricultural': 27664,\n", " 'convincingly': 4521,\n", " 'fueled': 17642,\n", " 'mahattan': 54017,\n", " \"paris's\": 40955,\n", " 'vulkan': 52339,\n", " 'stapes': 52340,\n", " 'odysessy': 52341,\n", " 'harmon': 12262,\n", " 'surfing': 4255,\n", " 'halloran': 23497,\n", " 'unbelieveably': 49583,\n", " \"'offed'\": 52342,\n", " 'quadrant': 30610,\n", " 'inhabiting': 19513,\n", " 'nebbish': 34763,\n", " 'forebears': 40956,\n", " 'skirmish': 34764,\n", " 'ocassionally': 52343,\n", " \"'resist\": 52344,\n", " 'impactful': 21931,\n", " 'spicier': 52345,\n", " 'touristy': 40957,\n", " \"'football'\": 52346,\n", " 'webpage': 40958,\n", " 'exurbia': 52348,\n", " 'jucier': 52349,\n", " 'professors': 14904,\n", " 'structuring': 34765,\n", " 'jig': 30611,\n", " 'overlord': 40959,\n", " 'disconnect': 25264,\n", " 'sniffle': 82204,\n", " 'slimeball': 40960,\n", " 'jia': 40961,\n", " 'milked': 16831,\n", " 'banjoes': 40962,\n", " 'jim': 1240,\n", " 'workforces': 52351,\n", " 'jip': 52352,\n", " 'rotweiller': 52353,\n", " 'mundaneness': 34766,\n", " \"'ninja'\": 52354,\n", " \"dead'\": 11043,\n", " \"cipriani's\": 40963,\n", " 'modestly': 20611,\n", " \"professor'\": 52355,\n", " 'shacked': 40964,\n", " 'bashful': 
34767,\n", " 'sorter': 23391,\n", " 'overpowering': 16123,\n", " 'workmanlike': 18524,\n", " 'henpecked': 27665,\n", " 'sorted': 18525,\n", " \"jōb's\": 52357,\n", " \"'always\": 52358,\n", " \"'baptists\": 34768,\n", " 'dreamcatchers': 52359,\n", " \"'silence'\": 52360,\n", " 'hickory': 21932,\n", " 'fun\\x97yet': 52361,\n", " 'breakumentary': 52362,\n", " 'didn': 15499,\n", " 'didi': 52363,\n", " 'pealing': 52364,\n", " 'dispite': 40965,\n", " \"italy's\": 25265,\n", " 'instability': 21933,\n", " 'quarter': 6542,\n", " 'quartet': 12611,\n", " 'padmé': 52365,\n", " \"'bleedmedry\": 52366,\n", " 'pahalniuk': 52367,\n", " 'honduras': 52368,\n", " 'bursting': 10789,\n", " \"pablo's\": 41468,\n", " 'irremediably': 52370,\n", " 'presages': 40966,\n", " 'bowlegged': 57835,\n", " 'dalip': 65186,\n", " 'entering': 6263,\n", " 'newsradio': 76175,\n", " 'presaged': 54153,\n", " \"giallo's\": 27666,\n", " 'bouyant': 40967,\n", " 'amerterish': 52371,\n", " 'rajni': 18526,\n", " 'leeves': 30613,\n", " 'macauley': 34770,\n", " 'seriously': 615,\n", " 'sugercoma': 52372,\n", " 'grimstead': 52373,\n", " \"'fairy'\": 52374,\n", " 'zenda': 30614,\n", " \"'twins'\": 52375,\n", " 'realisation': 17643,\n", " 'highsmith': 27667,\n", " 'raunchy': 7820,\n", " 'incentives': 40968,\n", " 'flatson': 52377,\n", " 'snooker': 35100,\n", " 'crazies': 16832,\n", " 'crazier': 14905,\n", " 'grandma': 7097,\n", " 'napunsaktha': 52378,\n", " 'workmanship': 30615,\n", " 'reisner': 52379,\n", " \"sanford's\": 61309,\n", " '\\x91doña': 52380,\n", " 'modest': 6111,\n", " \"everything's\": 19156,\n", " 'hamer': 40969,\n", " \"couldn't'\": 52382,\n", " 'quibble': 13004,\n", " 'socking': 52383,\n", " 'tingler': 21934,\n", " 'gutman': 52384,\n", " 'lachlan': 40970,\n", " 'tableaus': 52385,\n", " 'headbanger': 52386,\n", " 'spoken': 2850,\n", " 'cerebrally': 34771,\n", " \"'road\": 23493,\n", " 'tableaux': 21935,\n", " \"proust's\": 40971,\n", " 'periodical': 40972,\n", " \"shoveller's\": 52388,\n", " 
'tamara': 25266,\n", " 'affords': 17644,\n", " 'concert': 3252,\n", " \"yara's\": 87958,\n", " 'someome': 52389,\n", " 'lingering': 8427,\n", " \"abraham's\": 41514,\n", " 'beesley': 34772,\n", " 'cherbourg': 34773,\n", " 'kagan': 28627,\n", " 'snatch': 9100,\n", " \"miyazaki's\": 9263,\n", " 'absorbs': 25267,\n", " \"koltai's\": 40973,\n", " 'tingled': 64030,\n", " 'crossroads': 19514,\n", " 'rehab': 16124,\n", " 'falworth': 52392,\n", " 'sequals': 52393,\n", " ...}" ] }, "metadata": {}, "execution_count": 9 } ] }, { "cell_type": "code", "metadata": { "id": "X9V6cai7d7jb" }, "source": [ "index_word = {v:k for k,v in word_index.items()}" ], "execution_count": 10, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "_h9PfNSyd7jb", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "1ea388ed-a042-487e-abc7-882968b5fd45" }, "source": [ "x_train[0]" ], "execution_count": 11, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "[2,\n", " 2,\n", " 2,\n", " 2,\n", " 2,\n", " 530,\n", " 973,\n", " 1622,\n", " 1385,\n", " 65,\n", " 458,\n", " 4468,\n", " 66,\n", " 3941,\n", " 2,\n", " 173,\n", " 2,\n", " 256,\n", " 2,\n", " 2,\n", " 100,\n", " 2,\n", " 838,\n", " 112,\n", " 50,\n", " 670,\n", " 2,\n", " 2,\n", " 2,\n", " 480,\n", " 284,\n", " 2,\n", " 150,\n", " 2,\n", " 172,\n", " 112,\n", " 167,\n", " 2,\n", " 336,\n", " 385,\n", " 2,\n", " 2,\n", " 172,\n", " 4536,\n", " 1111,\n", " 2,\n", " 546,\n", " 2,\n", " 2,\n", " 447,\n", " 2,\n", " 192,\n", " 50,\n", " 2,\n", " 2,\n", " 147,\n", " 2025,\n", " 2,\n", " 2,\n", " 2,\n", " 2,\n", " 1920,\n", " 4613,\n", " 469,\n", " 2,\n", " 2,\n", " 71,\n", " 87,\n", " 2,\n", " 2,\n", " 2,\n", " 530,\n", " 2,\n", " 76,\n", " 2,\n", " 2,\n", " 1247,\n", " 2,\n", " 2,\n", " 2,\n", " 515,\n", " 2,\n", " 2,\n", " 2,\n", " 626,\n", " 2,\n", " 2,\n", " 2,\n", " 62,\n", " 386,\n", " 2,\n", " 2,\n", " 316,\n", " 2,\n", " 106,\n", " 2,\n", " 2,\n", " 2223,\n", " 2,\n", " 2,\n", " 480,\n", " 66,\n", 
" 3785,\n", " 2,\n", " 2,\n", " 130,\n", " 2,\n", " 2,\n", " 2,\n", " 619,\n", " 2,\n", " 2,\n", " 124,\n", " 51,\n", " 2,\n", " 135,\n", " 2,\n", " 2,\n", " 1415,\n", " 2,\n", " 2,\n", " 2,\n", " 2,\n", " 215,\n", " 2,\n", " 77,\n", " 52,\n", " 2,\n", " 2,\n", " 407,\n", " 2,\n", " 82,\n", " 2,\n", " 2,\n", " 2,\n", " 107,\n", " 117,\n", " 2,\n", " 2,\n", " 256,\n", " 2,\n", " 2,\n", " 2,\n", " 3766,\n", " 2,\n", " 723,\n", " 2,\n", " 71,\n", " 2,\n", " 530,\n", " 476,\n", " 2,\n", " 400,\n", " 317,\n", " 2,\n", " 2,\n", " 2,\n", " 2,\n", " 1029,\n", " 2,\n", " 104,\n", " 88,\n", " 2,\n", " 381,\n", " 2,\n", " 297,\n", " 98,\n", " 2,\n", " 2071,\n", " 56,\n", " 2,\n", " 141,\n", " 2,\n", " 194,\n", " 2,\n", " 2,\n", " 2,\n", " 226,\n", " 2,\n", " 2,\n", " 134,\n", " 476,\n", " 2,\n", " 480,\n", " 2,\n", " 144,\n", " 2,\n", " 2,\n", " 2,\n", " 51,\n", " 2,\n", " 2,\n", " 224,\n", " 92,\n", " 2,\n", " 104,\n", " 2,\n", " 226,\n", " 65,\n", " 2,\n", " 2,\n", " 1334,\n", " 88,\n", " 2,\n", " 2,\n", " 283,\n", " 2,\n", " 2,\n", " 4472,\n", " 113,\n", " 103,\n", " 2,\n", " 2,\n", " 2,\n", " 2,\n", " 2,\n", " 178,\n", " 2]" ] }, "metadata": {}, "execution_count": 11 } ] }, { "cell_type": "code", "metadata": { "id": "UjdPZwuDd7jb", "colab": { "base_uri": "https://localhost:8080/", "height": 161 }, "outputId": "7230e18f-6649-4aff-8595-9ea830fb1a2d" }, "source": [ "' '.join(index_word[id] for id in x_train[0])" ], "execution_count": 12, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "\"UNK UNK UNK UNK UNK brilliant casting location scenery story direction everyone's really suited UNK part UNK played UNK UNK could UNK imagine being there robert UNK UNK UNK amazing actor UNK now UNK same being director UNK father came UNK UNK same scottish island UNK myself UNK UNK loved UNK fact there UNK UNK real connection UNK UNK UNK UNK witty remarks throughout UNK UNK were great UNK UNK UNK brilliant UNK much UNK UNK bought UNK UNK UNK soon UNK UNK UNK 
released UNK UNK UNK would recommend UNK UNK everyone UNK watch UNK UNK fly UNK UNK amazing really cried UNK UNK end UNK UNK UNK sad UNK UNK know what UNK say UNK UNK cry UNK UNK UNK UNK must UNK been good UNK UNK definitely UNK also UNK UNK UNK two little UNK UNK played UNK UNK UNK norman UNK paul UNK were UNK brilliant children UNK often left UNK UNK UNK UNK list UNK think because UNK stars UNK play them UNK grown up UNK such UNK big UNK UNK UNK whole UNK UNK these children UNK amazing UNK should UNK UNK UNK what UNK UNK done don't UNK think UNK whole story UNK UNK lovely because UNK UNK true UNK UNK someone's life after UNK UNK UNK UNK UNK us UNK\"" ], "application/vnd.google.colaboratory.intrinsic+json": { "type": "string" } }, "metadata": {}, "execution_count": 12 } ] }, { "cell_type": "code", "metadata": { "id": "x2wC7hDfd7jb" }, "source": [ "(all_x_train,_),(all_x_valid,_) = imdb.load_data() " ], "execution_count": 13, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "UZ4MjKWSd7jb", "colab": { "base_uri": "https://localhost:8080/", "height": 161 }, "outputId": "0fa1ec56-c008-41a7-e07a-1be0823ee1a5" }, "source": [ "' '.join(index_word[id] for id in all_x_train[0])" ], "execution_count": 14, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "\"START this film was just brilliant casting location scenery story direction everyone's really suited the part they played and you could just imagine being there robert redford's is an amazing actor and now the same being director norman's father came from the same scottish island as myself so i loved the fact there was a real connection with this film the witty remarks throughout the film were great it was just brilliant so much that i bought the film as soon as it was released for retail and would recommend it to everyone to watch and the fly fishing was amazing really cried at the end it was so sad and you know what they say if you cry at a film it must have been good and this 
definitely was also congratulations to the two little boy's that played the part's of norman and paul they were just brilliant children are often left out of the praising list i think because the stars that play them all grown up are such a big profile for the whole film but these children are amazing and should be praised for what they have done don't you think the whole story was so lovely because it was true and was someone's life after all that was shared with us all\"" ], "application/vnd.google.colaboratory.intrinsic+json": { "type": "string" } }, "metadata": {}, "execution_count": 14 } ] }, { "cell_type": "markdown", "metadata": { "id": "XQPNZbj8d7jc" }, "source": [ "#### 데이터 전처리" ] }, { "cell_type": "code", "metadata": { "id": "JUr9JbNnd7jc" }, "source": [ "x_train = pad_sequences(x_train, maxlen=max_review_length, \n", " padding=pad_type, truncating=trunc_type, value=0)\n", "x_valid = pad_sequences(x_valid, maxlen=max_review_length, \n", " padding=pad_type, truncating=trunc_type, value=0)" ], "execution_count": 15, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "Sj2oyZZId7jc", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "97c446cb-5cbb-4863-adfb-57e04e784488" }, "source": [ "x_train[0:6]" ], "execution_count": 16, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "array([[1415, 2, 2, 2, 2, 215, 2, 77, 52, 2, 2,\n", " 407, 2, 82, 2, 2, 2, 107, 117, 2, 2, 256,\n", " 2, 2, 2, 3766, 2, 723, 2, 71, 2, 530, 476,\n", " 2, 400, 317, 2, 2, 2, 2, 1029, 2, 104, 88,\n", " 2, 381, 2, 297, 98, 2, 2071, 56, 2, 141, 2,\n", " 194, 2, 2, 2, 226, 2, 2, 134, 476, 2, 480,\n", " 2, 144, 2, 2, 2, 51, 2, 2, 224, 92, 2,\n", " 104, 2, 226, 65, 2, 2, 1334, 88, 2, 2, 283,\n", " 2, 2, 4472, 113, 103, 2, 2, 2, 2, 2, 178,\n", " 2],\n", " [ 163, 2, 3215, 2, 2, 1153, 2, 194, 775, 2, 2,\n", " 2, 349, 2637, 148, 605, 2, 2, 2, 123, 125, 68,\n", " 2, 2, 2, 349, 165, 4362, 98, 2, 2, 228, 2,\n", " 2, 2, 1157, 2, 299, 120, 2, 120, 174, 2, 
220,\n", " 175, 136, 50, 2, 4373, 228, 2, 2, 2, 656, 245,\n", " 2350, 2, 2, 2, 131, 152, 491, 2, 2, 2, 2,\n", " 1212, 2, 2, 2, 371, 78, 2, 625, 64, 1382, 2,\n", " 2, 168, 145, 2, 2, 1690, 2, 2, 2, 1355, 2,\n", " 2, 2, 52, 154, 462, 2, 89, 78, 285, 2, 145,\n", " 95],\n", " [1301, 2, 1873, 2, 89, 78, 2, 66, 2, 2, 360,\n", " 2, 2, 58, 316, 334, 2, 2, 1716, 2, 645, 662,\n", " 2, 257, 85, 1200, 2, 1228, 2578, 83, 68, 3912, 2,\n", " 2, 165, 1539, 278, 2, 69, 2, 780, 2, 106, 2,\n", " 2, 1338, 2, 2, 2, 2, 215, 2, 610, 2, 2,\n", " 87, 326, 2, 2300, 2, 2, 2, 2, 272, 2, 57,\n", " 2, 2, 2, 2, 2, 2, 2307, 51, 2, 170, 2,\n", " 595, 116, 595, 1352, 2, 191, 79, 638, 89, 2, 2,\n", " 2, 2, 106, 607, 624, 2, 534, 2, 227, 2, 129,\n", " 113],\n", " [ 2, 2, 2, 188, 1076, 3222, 2, 2, 2, 2, 2348,\n", " 537, 2, 53, 537, 2, 82, 2, 2, 2, 2, 2,\n", " 280, 2, 219, 2, 2, 431, 758, 859, 2, 953, 1052,\n", " 2, 2, 2, 2, 94, 2, 2, 238, 60, 2, 2,\n", " 2, 804, 2, 2, 2, 2, 132, 2, 67, 2, 2,\n", " 2, 2, 283, 2, 2, 2, 2, 2, 242, 955, 2,\n", " 2, 279, 2, 2, 2, 1685, 195, 2, 238, 60, 796,\n", " 2, 2, 671, 2, 2804, 2, 2, 559, 154, 888, 2,\n", " 726, 50, 2, 2, 2, 2, 566, 2, 579, 2, 64,\n", " 2574],\n", " [ 2, 2, 131, 2073, 249, 114, 249, 229, 249, 2, 2,\n", " 2, 126, 110, 2, 473, 2, 569, 61, 419, 56, 429,\n", " 2, 1513, 2, 2, 534, 95, 474, 570, 2, 2, 124,\n", " 138, 88, 2, 421, 1543, 52, 725, 2, 61, 419, 2,\n", " 2, 1571, 2, 1543, 2, 2, 2, 2, 2, 296, 2,\n", " 3524, 2, 2, 421, 128, 74, 233, 334, 207, 126, 224,\n", " 2, 562, 298, 2167, 1272, 2, 2601, 2, 516, 988, 2,\n", " 2, 79, 120, 2, 595, 2, 784, 2, 3171, 2, 165,\n", " 170, 143, 2, 2, 2, 2, 2, 226, 251, 2, 61,\n", " 113],\n", " [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,\n", " 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,\n", " 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,\n", " 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,\n", " 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,\n", " 0, 0, 2, 778, 128, 74, 2, 630, 163, 2, 2,\n", " 1766, 2, 1051, 2, 2, 85, 156, 2, 2, 148, 139,\n", " 121, 664, 665, 2, 2, 1361, 173, 2, 
749, 2, 2,\n", " 3804, 2, 2, 226, 65, 2, 2, 127, 2, 2, 2,\n", " 2]], dtype=int32)" ] }, "metadata": {}, "execution_count": 16 } ] }, { "cell_type": "code", "metadata": { "id": "kUei28rvd7jc", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "1dd58dde-392a-4bfd-8f07-28c0031e3710" }, "source": [ "for x in x_train[0:6]:\n", " print(len(x))" ], "execution_count": 17, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "100\n", "100\n", "100\n", "100\n", "100\n", "100\n" ] } ] }, { "cell_type": "code", "metadata": { "id": "X1lGNbqud7jc", "colab": { "base_uri": "https://localhost:8080/", "height": 89 }, "outputId": "ed4e08b2-1810-4c56-ef6b-86c7c211087c" }, "source": [ "' '.join(index_word[id] for id in x_train[0])" ], "execution_count": 18, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "\"cry UNK UNK UNK UNK must UNK been good UNK UNK definitely UNK also UNK UNK UNK two little UNK UNK played UNK UNK UNK norman UNK paul UNK were UNK brilliant children UNK often left UNK UNK UNK UNK list UNK think because UNK stars UNK play them UNK grown up UNK such UNK big UNK UNK UNK whole UNK UNK these children UNK amazing UNK should UNK UNK UNK what UNK UNK done don't UNK think UNK whole story UNK UNK lovely because UNK UNK true UNK UNK someone's life after UNK UNK UNK UNK UNK us UNK\"" ], "application/vnd.google.colaboratory.intrinsic+json": { "type": "string" } }, "metadata": {}, "execution_count": 18 } ] }, { "cell_type": "code", "metadata": { "id": "7rWPdd_hd7jd", "colab": { "base_uri": "https://localhost:8080/", "height": 89 }, "outputId": "53449b03-6c0b-42f2-cabd-5ee0c3d06546" }, "source": [ "' '.join(index_word[id] for id in x_train[5])" ], "execution_count": 19, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD 
PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD PAD UNK begins better than UNK ends funny UNK UNK russian UNK crew UNK UNK other actors UNK UNK those scenes where documentary shots UNK UNK spoiler part UNK message UNK UNK contrary UNK UNK whole story UNK UNK does UNK UNK UNK UNK'" ], "application/vnd.google.colaboratory.intrinsic+json": { "type": "string" } }, "metadata": {}, "execution_count": 19 } ] }, { "cell_type": "markdown", "metadata": { "collapsed": true, "id": "fXhtAkjhd7jd" }, "source": [ "#### Build the neural network" ] }, { "cell_type": "code", "metadata": { "id": "ML2tiKTbd7jd" }, "source": [ "model = Sequential()\n", "model.add(Embedding(n_unique_words, n_dim, input_length=max_review_length))\n", "model.add(Flatten())\n", "model.add(Dense(n_dense, activation='relu'))\n", "model.add(Dropout(dropout))\n", "# model.add(Dense(n_dense, activation='relu'))\n", "# model.add(Dropout(dropout))\n", "model.add(Dense(1, activation='sigmoid')) # mathematically equivalent to a softmax over two classes" ], "execution_count": 20, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "zcXXtRwGd7jd", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "5500cb18-6d34-4b49-e59f-8a967c3478e5" }, "source": [ "model.summary() # so many parameters!"
], "execution_count": 21, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Model: \"sequential\"\n", "_________________________________________________________________\n", " Layer (type)                Output Shape              Param #   \n", "=================================================================\n", " embedding (Embedding)       (None, 100, 64)           320000    \n", " \n", " flatten (Flatten)           (None, 6400)              0         \n", " \n", " dense (Dense)               (None, 64)                409664    \n", " \n", " dropout (Dropout)           (None, 64)                0         \n", " \n", " dense_1 (Dense)             (None, 1)                 65        \n", " \n", "=================================================================\n", "Total params: 729,729\n", "Trainable params: 729,729\n", "Non-trainable params: 0\n", "_________________________________________________________________\n" ] } ] }, { "cell_type": "code", "metadata": { "id": "2XihCV2ud7jd", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "49831c01-c1f7-494b-85fa-e96f243cabb0" }, "source": [ "# Embedding layer: dimensions and parameter count\n", "n_dim, n_unique_words, n_dim*n_unique_words" ], "execution_count": 22, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "(64, 5000, 320000)" ] }, "metadata": {}, "execution_count": 22 } ] }, { "cell_type": "code", "metadata": { "id": "yTRz9XGPd7jd", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "33aa5adb-0ab0-4f80-df83-90e2f2d18a12" }, "source": [ "# ...Flatten(): no parameters, it only reshapes (100, 64) into 6400 values\n", "max_review_length, n_dim, n_dim*max_review_length" ], "execution_count": 23, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "(100, 64, 6400)" ] }, "metadata": {}, "execution_count": 23 } ] }, { "cell_type": "code", "metadata": { "id": "6g__lQ1Md7jd", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "afe27069-40cb-42e5-fbb3-e260729d1e67" }, "source": [ "# ...Dense\n", "n_dense, n_dim*max_review_length*n_dense + n_dense # weights + biases" ], "execution_count": 24, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "(64, 409664)" ] }, 
"metadata": {}, "execution_count": 24 } ] }, { "cell_type": "code", "metadata": { "id": "Ua9vvYaid7je", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "2b0eb8bc-d09d-4274-9b5f-4c5cb7b7a6c4" }, "source": [ "# ...and the output layer: n_dense weights + 1 bias\n", "n_dense + 1" ], "execution_count": 25, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "65" ] }, "metadata": {}, "execution_count": 25 } ] }, { "cell_type": "markdown", "metadata": { "id": "67JYmScud7je" }, "source": [ "#### Configure the model." ] }, { "cell_type": "code", "metadata": { "id": "U_Cj2Cfsd7je" }, "source": [ "model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])" ], "execution_count": 26, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "gJCw9JsFd7je" }, "source": [ "# Save the weights after every epoch; Keras fills in {epoch:02d}\n", "modelcheckpoint = ModelCheckpoint(filepath=output_dir + \"/weights.{epoch:02d}.hdf5\")" ], "execution_count": 27, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "-0V5wJ99d7je" }, "source": [ "# Create the checkpoint directory before training starts\n", "if not os.path.exists(output_dir):\n", "    os.makedirs(output_dir)" ], "execution_count": 28, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "MacaKG4Od7je" }, "source": [ "#### Train!"
] }, { "cell_type": "code", "metadata": { "id": "J1H6gsBCd7je", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "35114c50-4c4b-4ea8-9245-2c1f4af08265" }, "source": [ "model.fit(x_train, y_train,\n", "          batch_size=batch_size, epochs=epochs, verbose=1,\n", "          validation_data=(x_valid, y_valid),\n", "          callbacks=[modelcheckpoint])" ], "execution_count": 29, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Epoch 1/4\n", "196/196 [==============================] - 4s 8ms/step - loss: 0.5365 - accuracy: 0.7164 - val_loss: 0.3461 - val_accuracy: 0.8458\n", "Epoch 2/4\n", "196/196 [==============================] - 1s 7ms/step - loss: 0.2715 - accuracy: 0.8913 - val_loss: 0.3484 - val_accuracy: 0.8463\n", "Epoch 3/4\n", "196/196 [==============================] - 1s 6ms/step - loss: 0.1185 - accuracy: 0.9634 - val_loss: 0.4260 - val_accuracy: 0.8348\n", "Epoch 4/4\n", "196/196 [==============================] - 1s 7ms/step - loss: 0.0244 - accuracy: 0.9964 - val_loss: 0.5395 - val_accuracy: 0.8332\n" ] }, { "output_type": "execute_result", "data": { "text/plain": [ "" ] }, "metadata": {}, "execution_count": 29 } ] }, { "cell_type": "markdown", "metadata": { "id": "9cjF7zdPd7jf" }, "source": [ "#### Evaluate" ] }, { "cell_type": "code", "metadata": { "id": "aExy5ssRd7jf" }, "source": [ "model.load_weights(output_dir+\"/weights.02.hdf5\") # epoch 2 had the highest val_accuracy; file names are NOT zero-indexed" ], "execution_count": 30, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "q2Zpf3Edd7jf", "outputId": "860c4419-30b2-4d93-a59b-cbab8102f701", "colab": { "base_uri": "https://localhost:8080/" } }, "source": [ "y_hat = model.predict(x_valid)" ], "execution_count": 31, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "782/782 [==============================] - 1s 1ms/step\n" ] } ] }, { "cell_type": "code", "metadata": { "id": "6Pku3Zfkd7jf", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "1631a039-5870-4c0c-a485-b857681e11d3" }, "source": [ 
"len(y_hat)" ], "execution_count": 32, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "25000" ] }, "metadata": {}, "execution_count": 32 } ] }, { "cell_type": "code", "metadata": { "id": "gGck2JuZd7jf", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "a2fc1d6b-c6fe-4db2-dadc-185c404795fa" }, "source": [ "y_hat[0]" ], "execution_count": 33, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "array([0.24574783], dtype=float32)" ] }, "metadata": {}, "execution_count": 33 } ] }, { "cell_type": "code", "metadata": { "id": "ONWOj1tad7jf", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "b8166509-1b48-4f0b-a085-61b2de94aacb" }, "source": [ "y_valid[0]" ], "execution_count": 34, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "0" ] }, "metadata": {}, "execution_count": 34 } ] }, { "cell_type": "code", "metadata": { "id": "kA4XP_Vid7jf", "colab": { "base_uri": "https://localhost:8080/", "height": 265 }, "outputId": "8885f097-1b4e-4ec3-a29e-b03db880c942" }, "source": [ "plt.hist(y_hat)\n", "_ = plt.axvline(x=0.5, color='orange')" ], "execution_count": 35, "outputs": [ { "output_type": "display_data", "data": { "text/plain": [ "
" ], "image/png": "iVBORw0KGgoAAAANSUhEUgAAAX0AAAD4CAYAAAAAczaOAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAATRUlEQVR4nO3dcayd9X3f8fcnOCRbmsQmuBaynZmqbjraKYRdAVGnLo1bY5wKIy1FROtwkTVPHavardpGtj+8QSMRTWtWpJbOK15N1IZQtgyrYWWeQxRtmgmXQmmAMt8QKPYA32LjrEVJR/rdH+dnegL3cs/NPffcXH7vl3R0nuf7/M7z/H5c8znP/T3POTdVhSSpD29Z6Q5IkibH0Jekjhj6ktQRQ1+SOmLoS1JH1qx0B97I+eefX1u2bFnpbkiv9/UnB8/vet/K9kOaw0MPPfQnVbV+rm3f1aG/ZcsWpqenV7ob0uv99w8Nnn/8iyvZC2lOSZ6Zb5vTO5LUEUNfkjpi6EtSRwx9SeqIoS9JHTH0Jakjhr4kdcTQl6SOGPqS1JHv6k/kLtWWGz+/Isd9+paPrMhxJWkhnulLUkcMfUnqiKEvSR0x9CWpI4a+JHXE0Jekjhj6ktQRQ1+SOvKm/nCWJC3FSn3AE5bvQ56e6UtSRxYM/STvS/LI0OPrSX4hyXlJDic51p7XtfZJcmuSmSSPJrlkaF+7W/tjSXYv58AkSa+3YOhX1ZNVdXFVXQz8TeBl4HPAjcCRqtoKHGnrAFcCW9tjL3AbQJLzgH3AZcClwL6zbxSSpMlY7PTONuCrVfUMsAs42OoHgavb8i7gjho4CqxNcgFwBXC4qk5V1WngMLBjySOQJI1ssaF/LfCZtryhqp5ry88DG9ryRuDZodccb7X56t8myd4k00mmZ2dnF9k9SdIbGTn0k5wLXAX8zmu3VVUBNY4OVdX+qpqqqqn169ePY5eSpGYxZ/pXAr9fVS+09RfatA3t+WSrnwA2D71uU6vNV5ckTchiQv9j/OXUDsAh4OwdOLuBe4bq17W7eC4HzrRpoPuA7UnWtQu421tNkjQhI304K8k7gJ8A/sFQ+RbgriR7gGeAa1r9XmAnMMPgTp/rAarqVJKbgQdbu5uq6tSSRyBJGtlIoV9Vfwa85zW1FxnczfPatgXcMM9+DgAHFt9NSdI4+IlcSeqIoS9JHTH0Jakjhr4kdcTQl6SOGPqS1BFDX5I6YuhLUkcMfUnqiKEvSR0x9CWpI4a+JHXE0Jekjhj6ktQRQ1+SOmLoS1JHDH1J6oihL0kdMfQlqSMjhX6StUnuTvJHSZ5I8sEk5yU5nORYe17X2ibJrUlmkjya5JKh/exu7Y8l2b1cg5IkzW3UM/1fAX6vqn4QeD/wBHAjcKSqtgJH2jrAlcDW9tgL3AaQ5DxgH3AZcCmw7+wbhSRpMhYM/STvBn4UuB2gqv68ql4CdgEHW7ODwNVteRdwRw0cBdYmuQC4AjhcVaeq6jRwGNgx1tFIkt7QKGf6FwKzwH9M8nCS30jyDmBDVT3X2jwPbGjLG4Fnh15/vNXmq3+bJHuTTCeZnp2dXdxoJElvaJTQXwNcAtxWVR8A/oy/nMoBoKoKqHF0qKr2V9VUVU2tX79+HLuUJDWjhP5x4HhVPdDW72bwJvBCm7ahPZ9s208Am4dev6nV5qtLkiZkwdCvqueBZ5O8r5W2AY8Dh4Czd+DsBu5py4eA69pdPJcDZ9o00H3A9iTr2gXc7a0mSZqQNSO2+zngt5KcCzwFXM/gDeOuJHuAZ4BrWtt7gZ3ADPBya0tVnUpyM/Bga3dTVZ0ayygkSSMZKfSr6hFgao5N2+ZoW8AN8+znAHBgMR2UJI2Pn8iVpI4Y+pLUEUNfkjpi6EtSRwx9SeqIoS9JHTH0Jakjhr4kdcTQl6SOGPqS1BFDX5I6YuhLUkcMfUnqiKEvSR0x9CWpI4a+J
HXE0Jekjhj6ktSRkUI/ydNJ/jDJI0mmW+28JIeTHGvP61o9SW5NMpPk0SSXDO1nd2t/LMnu+Y4nSVoeiznT/7Gquriqzv6t3BuBI1W1FTjS1gGuBLa2x17gNhi8SQD7gMuAS4F9Z98oJEmTsZTpnV3AwbZ8ELh6qH5HDRwF1ia5ALgCOFxVp6rqNHAY2LGE40uSFmnU0C/gvyV5KMneVttQVc+15eeBDW15I/Ds0GuPt9p89W+TZG+S6STTs7OzI3ZPkjSKNSO2+1tVdSLJ9wKHk/zR8MaqqiQ1jg5V1X5gP8DU1NRY9ilJGhjpTL+qTrTnk8DnGMzJv9CmbWjPJ1vzE8DmoZdvarX56pKkCVkw9JO8I8k7zy4D24GvAIeAs3fg7AbuacuHgOvaXTyXA2faNNB9wPYk69oF3O2tJkmakFGmdzYAn0tytv1vV9XvJXkQuCvJHuAZ4JrW/l5gJzADvAxcD1BVp5LcDDzY2t1UVafGNhJJ0oIWDP2qegp4/xz1F4Ftc9QLuGGefR0ADiy+m5KkcfATuZLUEUNfkjpi6EtSRwx9SeqIoS9JHTH0Jakjhr4kdcTQl6SOGPqS1BFDX5I6YuhLUkcMfUnqiKEvSR0x9CWpI4a+JHXE0Jekjhj6ktQRQ1+SOmLoS1JHRg79JOckeTjJ77b1C5M8kGQmyWeTnNvqb2vrM237lqF9fLzVn0xyxbgHI0l6Y4s50/954Imh9U8Cn6qq7wdOA3tafQ9wutU/1dqR5CLgWuCHgB3AryU5Z2ndlyQtxkihn2QT8BHgN9p6gA8Dd7cmB4Gr2/Kutk7bvq213wXcWVXfrKqvATPApeMYhCRpNKOe6f874J8Bf9HW3wO8VFWvtPXjwMa2vBF4FqBtP9Pav1qf4zWvSrI3yXSS6dnZ2UUMRZK0kAVDP8lPAier6qEJ9Ieq2l9VU1U1tX79+kkcUpK6sWaENj8CXJVkJ/B24F3ArwBrk6xpZ/ObgBOt/QlgM3A8yRrg3cCLQ/Wzhl8jSZqABc/0q+rjVbWpqrYwuBD7har6u8D9wEdbs93APW35UFunbf9CVVWrX9vu7rkQ2Ap8eWwjkSQtaJQz/fn8c+DOJL8EPAzc3uq3A59OMgOcYvBGQVU9luQu4HHgFeCGqvrWEo4vSVqkRYV+VX0R+GJbfoo57r6pqm8APzXP6z8BfGKxnZQkjYefyJWkjhj6ktQRQ1+SOmLoS1JHDH1J6oihL0kdMfQlqSOGviR1xNCXpI4Y+pLUEUNfkjpi6EtSRwx9SeqIoS9JHTH0Jakjhr4kdcTQl6SOGPqS1JEFQz/J25N8OckfJHksyb9u9QuTPJBkJslnk5zb6m9r6zNt+5ahfX281Z9McsVyDUqSNLdRzvS/CXy4qt4PXAzsSHI58EngU1X1/cBpYE9rvwc43eqfau1IchGDP5L+Q8AO4NeSnDPOwUiS3tiCoV8Df9pW39oeBXwYuLvVDwJXt+VdbZ22fVuStPqdVfXNqvoaMMMcf1hdkrR8RprTT3JOkkeAk8Bh4KvAS1X1SmtyHNjYljcCzwK07WeA9wzX53jN8LH2JplOMj07O7v4EUmS5jVS6FfVt6rqYmATg7PzH1yuDlXV/qqaqqqp9evXL9dhJKlLi7p7p6peAu4HPgisTbKmbdoEnGjLJ4DNAG37u4EXh+tzvEaSNAGj3L2zPsnatvxXgJ8AnmAQ/h9tzXYD97TlQ22dtv0LVVWtfm27u+dCYCvw5XENRJK0sDULN+EC4GC70+YtwF1V9btJHgfuTPJLwMPA7a397cCnk8wApxjcsUNVPZbkLuBx4BXghqr61niHI0l6IwuGflU9CnxgjvpTzHH3TVV9A/ipefb1CeATi++mJGkc/ESuJHXE0Jekjhj6ktQRQ1+SOmLoS1JHDH1J6oihL0kdMfQlqSOGviR1xNCXpI4Y+pLUEUNfkjpi6EtSR0b5amVJWlFbbvz8SnfhTcMzfUnqiKEvSR0x9CWpI
4a+JHXE0JekjiwY+kk2J7k/yeNJHkvy861+XpLDSY6153WtniS3JplJ8miSS4b2tbu1P5Zk9/INS5I0l1HO9F8BfrGqLgIuB25IchFwI3CkqrYCR9o6wJXA1vbYC9wGgzcJYB9wGYM/qL7v7BuFJGkyFgz9qnquqn6/Lf9f4AlgI7ALONiaHQSubsu7gDtq4CiwNskFwBXA4ao6VVWngcPAjrGORpL0hhY1p59kC/AB4AFgQ1U91zY9D2xoyxuBZ4dedrzV5qu/9hh7k0wnmZ6dnV1M9yRJCxj5E7lJvgf4T8AvVNXXk7y6raoqSY2jQ1W1H9gPMDU1NZZ9TtpKfXrw6Vs+siLHlbR6jHSmn+StDAL/t6rqP7fyC23ahvZ8stVPAJuHXr6p1earS5ImZJS7dwLcDjxRVb88tOkQcPYOnN3APUP169pdPJcDZ9o00H3A9iTr2gXc7a0mSZqQUaZ3fgT4e8AfJnmk1f4FcAtwV5I9wDPANW3bvcBOYAZ4GbgeoKpOJbkZeLC1u6mqTo1lFJKkkSwY+lX1P4DMs3nbHO0LuGGefR0ADiymg5Kk8fETuZLUEUNfkjpi6EtSRwx9SeqIoS9JHTH0Jakjhr4kdcTQl6SOGPqS1BFDX5I6YuhLUkcMfUnqiKEvSR0x9CWpIyP/uURJWqk/Barx8Uxfkjpi6EtSR5zeeRNZyV+9n77lIyt2bEmjG+UPox9IcjLJV4Zq5yU5nORYe17X6klya5KZJI8muWToNbtb+2NJds91LEnS8hpleuc3gR2vqd0IHKmqrcCRtg5wJbC1PfYCt8HgTQLYB1wGXArsO/tGIUmanAVDv6q+BJx6TXkXcLAtHwSuHqrfUQNHgbVJLgCuAA5X1amqOg0c5vVvJJKkZfadzulvqKrn2vLzwIa2vBF4dqjd8Vabr/46SfYy+C2B9773vd9h96Q3L2+b1FIs+e6dqiqgxtCXs/vbX1VTVTW1fv36ce1WksR3HvovtGkb2vPJVj8BbB5qt6nV5qtLkiboOw39Q8DZO3B2A/cM1a9rd/FcDpxp00D3AduTrGsXcLe3miRpghac00/yGeBDwPlJjjO4C+cW4K4ke4BngGta83uBncAM8DJwPUBVnUpyM/Bga3dTVb324rBWsZWaZ17JzwccfepFrnV+XavMgqFfVR+bZ9O2OdoWcMM8+zkAHFhU76QFrNSbzZ3f9+KKHFdaKr+GQZI6YuhLUkcMfUnqiKEvSR0x9CWpI4a+JHXE0Jekjhj6ktQRQ1+SOmLoS1JHDH1J6oihL0kdMfQlqSOGviR1xNCXpI4Y+pLUEUNfkjpi6EtSRwx9SerIxEM/yY4kTyaZSXLjpI8vST2baOgnOQf4VeBK4CLgY0kummQfJKlnkz7TvxSYqaqnqurPgTuBXRPugyR1a82Ej7cReHZo/Thw2XCDJHuBvW31T5M8uYTjnQ/8yRJev9r0Nl5YoTF/8NWln5z0ocGfcxfyySWN+a/Nt2HSob+gqtoP7B/HvpJMV9XUOPa1GvQ2XnDMvXDM4zPp6Z0TwOah9U2tJkmagEmH/oPA1iQXJjkXuBY4NOE+SFK3Jjq9U1WvJPlHwH3AOcCBqnpsGQ85lmmiVaS38YJj7oVjHpNU1XLsV5L0XchP5EpSRwx9SerIqg/9hb7WIcnbkny2bX8gyZbJ93K8RhjzP0nyeJJHkxxJMu89u6vFqF/fkeTvJKkkq/72vlHGnOSa9rN+LMlvT7qP4zbCv+33Jrk/ycPt3/fOlejnuCQ5kORkkq/Msz1Jbm3/PR5NcsmSD1pVq/bB4GLwV4HvA84F/gC46DVt/iHw6235WuCzK93vCYz5x4C/2pZ/tocxt3bvBL4EHAWmVrrfE/g5bwUeBta19e9d6X5PYMz7gZ9tyxcBT690v5c45h8FLgG+Ms/2ncB/BQJcDjyw1GOu9jP9Ub7WYRdwsC3fDWxLkgn2cdwWHHNV3V9VL7fVoww+D7Gajfr1HTcDn
wS+McnOLZNRxvz3gV+tqtMAVXVywn0ct1HGXMC72vK7gf8zwf6NXVV9CTj1Bk12AXfUwFFgbZILlnLM1R76c32tw8b52lTVK8AZ4D0T6d3yGGXMw/YwOFNYzRYcc/u1d3NVfX6SHVtGo/ycfwD4gST/M8nRJDsm1rvlMcqY/xXw00mOA/cCPzeZrq2Yxf7/vqDvuq9h0Pgk+WlgCvjbK92X5ZTkLcAvAz+zwl2ZtDUMpng+xOC3uS8l+RtV9dKK9mp5fQz4zar6t0k+CHw6yQ9X1V+sdMdWi9V+pj/K1zq82ibJGga/Er44kd4tj5G+yiLJjwP/Eriqqr45ob4tl4XG/E7gh4EvJnmawdznoVV+MXeUn/Nx4FBV/b+q+hrwvxm8CaxWo4x5D3AXQFX9L+DtDL6M7c1q7F9ds9pDf5SvdTgE7G7LHwW+UO0KySq14JiTfAD49wwCf7XP88ICY66qM1V1flVtqaotDK5jXFVV0yvT3bEY5d/2f2Fwlk+S8xlM9zw1yU6O2Shj/mNgG0CSv84g9Gcn2svJOgRc1+7iuRw4U1XPLWWHq3p6p+b5WockNwHTVXUIuJ3Br4AzDC6YXLtyPV66Ecf8b4DvAX6nXbP+46q6asU6vUQjjvlNZcQx3wdsT/I48C3gn1bVqv0tdsQx/yLwH5L8YwYXdX9mNZ/EJfkMgzfu89t1in3AWwGq6tcZXLfYCcwALwPXL/mYq/i/lyRpkVb79I4kaREMfUnqiKEvSR0x9CWpI4a+JHXE0Jekjhj6ktSR/w+jGZ8uKW1v4QAAAABJRU5ErkJggg==\n" }, "metadata": { "needs_background": "light" } } ] }, { "cell_type": "code", "metadata": { "id": "ltzJ7NU2d7jg" }, "source": [ "pct_auc = roc_auc_score(y_valid, y_hat)*100.0" ], "execution_count": 36, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "mO-L4gPdd7jg", "colab": { "base_uri": "https://localhost:8080/", "height": 36 }, "outputId": "9eb790d8-2620-4372-cec1-7e2753c25068" }, "source": [ "\"{:0.2f}\".format(pct_auc)" ], "execution_count": 37, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "'92.70'" ], "application/vnd.google.colaboratory.intrinsic+json": { "type": "string" } }, "metadata": {}, "execution_count": 37 } ] }, { "cell_type": "code", "metadata": { "id": "17vCxOZdd7jg" }, "source": [ "float_y_hat = []\n", "for y in y_hat:\n", " float_y_hat.append(y[0])" ], "execution_count": 38, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "ySvKumMLd7jg" }, "source": [ "ydf = pd.DataFrame(list(zip(float_y_hat, y_valid)), columns=['y_hat', 'y'])" ], "execution_count": 39, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "kvsP_5_Dd7jg", "colab": { "base_uri": "https://localhost:8080/", "height": 363 
}, "outputId": "e969d407-208a-4df2-8bb7-96c9624955b7" }, "source": [ "ydf.head(10)" ], "execution_count": 40, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ " y_hat y\n", "0 0.245748 0\n", "1 0.980824 1\n", "2 0.886931 1\n", "3 0.637679 0\n", "4 0.996760 1\n", "5 0.921040 1\n", "6 0.892002 1\n", "7 0.004962 0\n", "8 0.822434 0\n", "9 0.711035 1" ], "text/html": [ "\n", "
\n", " " ] }, "metadata": {}, "execution_count": 40 } ] }, { "cell_type": "code", "metadata": { "id": "SNMkR9DWd7jg", "colab": { "base_uri": "https://localhost:8080/", "height": 71 }, "outputId": "f3c3e1d4-6525-4e6b-e7ab-560fb54ca4f6" }, "source": [ "' '.join(index_word[id] for id in all_x_valid[0])" ], "execution_count": 41, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "\"START please give this one a miss br br kristy swanson and the rest of the cast rendered terrible performances the show is flat flat flat br br i don't know how michael madison could have allowed this one on his plate he almost seemed to know this wasn't going to work out and his performance was quite lacklustre so all you madison fans give this a miss\"" ], "application/vnd.google.colaboratory.intrinsic+json": { "type": "string" } }, "metadata": {}, "execution_count": 41 } ] }, { "cell_type": "code", "metadata": { "id": "LkvTfvFrd7jg", "colab": { "base_uri": "https://localhost:8080/", "height": 161 }, "outputId": "e69f6f1f-13d6-4f1d-8152-d51d0f2de813" }, "source": [ "' '.join(index_word[id] for id in all_x_valid[6]) " ], "execution_count": 42, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "\"START originally supposed to be just a part of a huge epic the year 1905 depicting the revolution of 1905 potemkin is the story of the mutiny of the crew of the potemkin in odessa harbor the film opens with the crew protesting meat and the captain ordering the execution of the an uprising takes place during which the revolutionary leader is killed this crewman is taken to the shore to lie in state when the townspeople gather on a huge flight of steps overlooking the harbor czarist troops appear and march down the steps breaking up the crowd a naval squadron is sent to retake the potemkin but at the moment when the ships come into range their crews allow the to pass through eisenstein's non historically accurate ending is open ended thus indicating that 
this was the seed of the later bolshevik revolution that would bloom in russia the film is broken into five parts men and maggots drama on the an appeal from the dead the odessa steps and meeting the squadron br br eisenstein was a revolutionary artist but at the genius level not wanting to make a historical drama eisenstein used visual texture to give the film a newsreel look so that the viewer feels he is eavesdropping on a thrilling and politically revolutionary story this technique is used by the battle of algiers br br unlike eisenstein relied on or the casting of non professionals who had striking physical appearances the extraordinary faces of the cast are what one remembers from potemkin this technique is later used by frank capra in mr deeds goes to town and meet john doe but in potemkin no one individual is cast as a hero or heroine the story is told through a series of scenes that are combined in a special effect known as montage the editing and selection of short segments to produce a desired effect on the viewer d w griffith also used the montage but no one mastered it so well as eisenstein br br the artistic filming of the crew sleeping in their is complemented by the graceful swinging of tables suspended from chains in the galley in contrast the confrontation between the crew and their officers is charged with electricity and the clenched fists of the masses demonstrate their rage with injustice br br eisenstein introduced the technique of showing an action and repeating it again but from a slightly different angle to demonstrate intensity the breaking of a plate bearing the words give us this day our daily bread signifies the beginning of the end this technique is used in last year at marienbad also when the ship's surgeon is tossed over the side his nez dangles from the rigging it was these glasses that the officer used to inspect and pass the maggot infested meat this sequence ties the punishment to the corruption of the czarist era br br the most 
noted sequence in the film and perhaps in all of film history is the odessa steps the broad expanse of the steps are filled with hundreds of extras rapid and dramatic violence is always suggested and not explicit yet the visual images of the deaths of a few will last in the minds of the viewer forever br br the angular shots of marching boots and legs descending the steps are cleverly accentuated with long menacing shadows from a sun at the top of the steps the pace of the sequence is deliberately varied between the marching soldiers and a few civilians who summon up courage to beg them to stop a close up of a woman's face frozen in horror after being struck by a soldier's sword is the direct antecedent of the bank teller in bonnie in clyde and gives a lasting impression of the horror of the czarist regime br br the death of a young mother leads to a baby carriage careening down the steps in a sequence that has been copied by hitchcock in foreign correspondent by terry gilliam in brazil and brian depalma in the untouchables this sequence is shown repeatedly from various angles thus drawing out what probably was only a five second event br br potemkin is a film that the revolutionary spirit celebrates it for those already committed and it for the unconverted it seethes of fire and roars with the senseless injustices of the decadent czarist regime its greatest impact has been on film students who have borrowed and only slightly improved on techniques invented in russia several generations ago\"" ], "application/vnd.google.colaboratory.intrinsic+json": { "type": "string" } }, "metadata": {}, "execution_count": 42 } ] }, { "cell_type": "code", "metadata": { "id": "eY_j9oHTd7jh", "colab": { "base_uri": "https://localhost:8080/", "height": 363 }, "outputId": "15a9236c-9409-4611-d763-fe4bd02f04e7" }, "source": [ "# False positives: negative reviews (y == 0) that the model scored above 0.9\n", "ydf[(ydf.y == 0) & (ydf.y_hat > 0.9)].head(10)" ], "execution_count": 43, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ " y_hat y\n", 
"75 0.964311 0\n", "112 0.923961 0\n", "152 0.931623 0\n", "256 0.962183 0\n", "386 0.953319 0\n", "447 0.902034 0\n", "455 0.915358 0\n", "495 0.900015 0\n", "555 0.901919 0\n", "693 0.962619 0" ], "text/html": [ "\n", "
\n", " " ] }, "metadata": {}, "execution_count": 43 } ] }, { "cell_type": "code", "metadata": { "id": "79SWpkGsd7jh", "colab": { "base_uri": "https://localhost:8080/", "height": 143 }, "outputId": "c1181f58-0fda-4754-d66d-83011aa1bea0" }, "source": [ "# Review 386 is one of the false positives listed above (y = 0, y_hat = 0.953)\n", "' '.join(index_word[idx] for idx in all_x_valid[386])" ], "execution_count": 44, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "\"START wow another kevin costner hero movie postman tin cup waterworld bodyguard wyatt earp robin hood even that baseball movie seems like he makes movies specifically to be the center of attention the characters are almost always the same the heroics the flaws the greatness the fall the redemption yup within the 1st 5 minutes of the movie we're all supposed to be in awe of his character and it builds up more and more from there br br and this time the story story is just a collage of different movies you don't need a spoiler you've seen this movie several times though it had different titles you'll know what will happen way before it happens this is like mixing an officer and a gentleman with but both are easily better movies watch to see how this kind of movie should be made and also to see how an good but slightly underrated actor russell plays the hero\"" ], "application/vnd.google.colaboratory.intrinsic+json": { "type": "string" } }, "metadata": {}, "execution_count": 44 } ] }, { "cell_type": "code", "metadata": { "id": "MymI2j5Nd7jh", "colab": { "base_uri": "https://localhost:8080/", "height": 363 }, "outputId": "0b5f92c5-ee95-4e2d-c040-309361995228" }, "source": [ "# False negatives: positive reviews (y == 1) that the model scored below 0.1\n", "ydf[(ydf.y == 1) & (ydf.y_hat < 0.1)].head(10)" ], "execution_count": 45, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ " y_hat y\n", "101 0.068137 1\n", "167 0.098123 1\n", "248 0.043480 1\n", "300 0.053444 1\n", "325 0.073808 1\n", "333 0.063191 1\n", "345 0.090264 1\n", "349 0.076665 1\n", "355 0.065370 1\n", "384 0.098886 1" ], "text/html": [ "\n", "
\n", " " ] }, "metadata": {}, "execution_count": 45 } ] }, { "cell_type": "code", "metadata": { "id": "ULdDC13hd7jh", "colab": { "base_uri": "https://localhost:8080/", "height": 107 }, "outputId": "eca2355e-9210-4c23-b1ad-3eca86412dcb" }, "source": [ "# Decode validation review 224 from word indices back into text\n", "' '.join(index_word[idx] for idx in all_x_valid[224])" ], "execution_count": 46, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "\"START finally a true horror movie this is the first time in years that i had to cover my eyes i am a horror buff and i recommend this movie but it is quite gory i am not a big wrestling fan but kane really pulled the whole monster thing off i have to admit that i didn't want to see this movie my 17 year old dragged me to it but am very glad i did during and after the movie i was looking over my shoulder i have to agree with others about the whole remake horror movies enough is enough i think that is why this movie is getting some good reviews it is a refreshing change and takes you back to the texas chainsaw first one michael myers and jason and no cgi crap\"" ], "application/vnd.google.colaboratory.intrinsic+json": { "type": "string" } }, "metadata": {}, "execution_count": 46 } ] } ] }