{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Deep Learning Models -- A collection of various deep learning architectures, models, and tips for TensorFlow and PyTorch in Jupyter Notebooks.\n", "- Author: Sebastian Raschka\n", "- GitHub Repository: https://github.com/rasbt/deeplearning-models" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "vY4SK0xKAJgm" }, "source": [ "# Bidirectional Multi-layer RNN with LSTM with Own Dataset in CSV Format (AG News)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Dataset Description\n", "\n", "```\n", "AG's News Topic Classification Dataset\n", "\n", "Version 3, Updated 09/09/2015\n", "\n", "\n", "ORIGIN\n", "\n", "AG is a collection of more than 1 million news articles. News articles have been gathered from more than 2000 news sources by ComeToMyHead in more than 1 year of activity. ComeToMyHead is an academic news search engine which has been running since July, 2004. The dataset is provided by the academic community for research purposes in data mining (clustering, classification, etc.), information retrieval (ranking, search, etc.), XML, data compression, data streaming, and any other non-commercial activity. For more information, please refer to the link http://www.di.unipi.it/~gulli/AG_corpus_of_news_articles.html .\n", "\n", "The AG's news topic classification dataset is constructed by Xiang Zhang (xiang.zhang@nyu.edu) from the dataset above. It is used as a text classification benchmark in the following paper: Xiang Zhang, Junbo Zhao, Yann LeCun. Character-level Convolutional Networks for Text Classification. Advances in Neural Information Processing Systems 28 (NIPS 2015).\n", "\n", "\n", "DESCRIPTION\n", "\n", "The AG's news topic classification dataset is constructed by choosing the 4 largest classes from the original corpus. Each class contains 30,000 training samples and 1,900 testing samples. The total number of training samples is 120,000, and the total number of testing samples is 7,600.\n", "\n", "The file classes.txt contains a list of classes corresponding to each label.\n", "\n", "The files train.csv and test.csv contain all the training samples as comma-separated values. There are 3 columns in them, corresponding to class index (1 to 4), title, and description. The title and description are escaped using double quotes (\"), and any internal double quote is escaped by 2 double quotes (\"\"). New lines are escaped by a backslash followed by an \"n\" character, that is \"\\n\".\n", "```" ] },
{ "cell_type": "code", "execution_count": 1, "metadata": { "colab": {}, "colab_type": "code", "id": "moNmVfuvnImW" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Sebastian Raschka \n", "\n", "CPython 3.7.3\n", "IPython 7.9.0\n", "\n", "torch 1.4.0\n" ] } ], "source": [ "%load_ext watermark\n", "%watermark -a 'Sebastian Raschka' -v -p torch\n", "\n", "\n", "import torch\n", "import torch.nn.functional as F\n", "from torchtext import data\n", "from torchtext import datasets\n", "import time\n", "import random\n", "import pandas as pd\n", "import numpy as np\n", "\n", "torch.backends.cudnn.deterministic = True" ] },
{ "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "GSRL42Qgy8I8" }, "source": [ "## General Settings" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "colab": {}, "colab_type": "code", "id": "OvW1RgfepCBq" }, "outputs": [], "source": [ "RANDOM_SEED = 123\n", "torch.manual_seed(RANDOM_SEED)\n", "\n", "VOCABULARY_SIZE = 5000\n", "LEARNING_RATE = 1e-3\n", "BATCH_SIZE = 128\n", "NUM_EPOCHS = 50\n", "DROPOUT = 0.5\n", "DEVICE = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')\n", "\n", "EMBEDDING_DIM = 128\n", "BIDIRECTIONAL = True\n", "HIDDEN_DIM = 256\n", "NUM_LAYERS = 2\n", "OUTPUT_DIM = 4" ] },
{ "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "mQMmKUEisW4W" }, "source": [ "## Dataset" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The AG News dataset is available from Xiang Zhang's Google Drive folder at\n", "\n", "https://drive.google.com/drive/u/0/folders/0Bz8a_Dbh9Qhbfll6bVpmNUtUcFdjYmF2SEpmZUZUcVNiMUw1TWN6RDV3a0JHT3kxLVhVR2M\n", "\n", "From the Google Drive folder, download the file\n", "\n", "- `ag_news_csv.tar.gz`" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# !tar xvzf ag_news_csv.tar.gz" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "World\n", "Sports\n", "Business\n", "Sci/Tech\n" ] } ], "source": [ "!cat ag_news_csv/classes.txt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Check that the dataset looks okay:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " classlabel title \\\n", "0 2 Wall St. Bears Claw Back Into the Black (Reuters) \n", "1 2 Carlyle Looks Toward Commercial Aerospace (Reu... \n", "2 2 Oil and Economy Cloud Stocks' Outlook (Reuters) \n", "3 2 Iraq Halts Oil Exports from Main Southern Pipe... \n", "4 2 Oil prices soar to all-time record, posing new... \n", "\n", " content \n", "0 Reuters - Short-sellers, Wall Street's dwindli... \n", "1 Reuters - Private investment firm Carlyle Grou... \n", "2 Reuters - Soaring crude prices plus worries\\ab... \n", "3 Reuters - Authorities have halted oil export\\f... \n", "4 AFP - Tearaway world oil prices, toppling reco... " ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_csv('ag_news_csv/train.csv', header=None, index_col=None)\n", "df.columns = ['classlabel', 'title', 'content']\n", "df['classlabel'] = df['classlabel']-1\n", "df.head()" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0, 1, 2, 3])" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.unique(df['classlabel'].values)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([30000, 30000, 30000, 30000])" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.bincount(df['classlabel'])" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "df[['classlabel', 'content']].to_csv('ag_news_csv/train_prepocessed.csv', index=None)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " classlabel title \\\n", "0 2 Fears for T N pension after talks \n", "1 3 The Race is On: Second Private Team Sets Launc... \n", "2 3 Ky. Company Wins Grant to Study Peptides (AP) \n", "3 3 Prediction Unit Helps Forecast Wildfires (AP) \n", "4 3 Calif. Aims to Limit Farm-Related Smog (AP) \n", "\n", " content \n", "0 Unions representing workers at Turner Newall... \n", "1 SPACE.com - TORONTO, Canada -- A second\\team o... \n", "2 AP - A company founded by a chemistry research... \n", "3 AP - It's barely dawn when Mike Fitzpatrick st... \n", "4 AP - Southern California's smog-fighting agenc... " ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_csv('ag_news_csv/test.csv', header=None, index_col=None)\n", "df.columns = ['classlabel', 'title', 'content']\n", "df['classlabel'] = df['classlabel']-1\n", "df.head()" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0, 1, 2, 3])" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.unique(df['classlabel'].values)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([1900, 1900, 1900, 1900])" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.bincount(df['classlabel'])" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "df[['classlabel', 'content']].to_csv('ag_news_csv/test_prepocessed.csv', index=None)" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "del df" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "4GnH64XvsV8n" }, "source": [ "Define the Label and Text field formatters:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "TEXT = data.Field(sequential=True,\n", " tokenize='spacy',\n", " include_lengths=True) # necessary for packed_padded_sequence\n", "\n", "LABEL = data.LabelField(dtype=torch.float)\n", "\n", "\n", "# If you get an error [E050] Can't find model 'en'\n", "# you need to run the following on your command line:\n", "# python -m spacy download en" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Process the dataset:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "fields = [('classlabel', LABEL), ('content', TEXT)]\n", "\n", "train_dataset = data.TabularDataset(\n", " path=\"ag_news_csv/train_prepocessed.csv\", format='csv',\n", " skip_header=True, fields=fields)\n", "\n", "test_dataset = data.TabularDataset(\n", " path=\"ag_news_csv/test_prepocessed.csv\", format='csv',\n", " skip_header=True, fields=fields)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Split the training dataset into training and validation:" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 68 }, "colab_type": "code", "id": "WZ_4jiHVnMxN", "outputId": "dfa51c04-4845-44c3-f50b-d36d41f132b8" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Num Train: 114000\n", "Num Valid: 6000\n" ] } ], "source": [ "train_data, valid_data = train_dataset.split(\n", " split_ratio=[0.95, 0.05],\n", " random_state=random.seed(RANDOM_SEED))\n", "\n", "print(f'Num Train: {len(train_data)}')\n", "print(f'Num Valid: {len(valid_data)}')" ] }, { "cell_type": 
"markdown", "metadata": { "colab_type": "text", "id": "L-TBwKWPslPa" }, "source": [ "Build the vocabulary based on the top \"VOCABULARY_SIZE\" words:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 51 }, "colab_type": "code", "id": "e8uNrjdtn4A8", "outputId": "6cf499d7-7722-4da0-8576-ee0f218cc6e3" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Vocabulary size: 5002\n", "Number of classes: 4\n" ] } ], "source": [ "TEXT.build_vocab(train_data,\n", " max_size=VOCABULARY_SIZE,\n", " vectors='glove.6B.100d',\n", " unk_init=torch.Tensor.normal_)\n", "\n", "LABEL.build_vocab(train_data)\n", "\n", "print(f'Vocabulary size: {len(TEXT.vocab)}')\n", "print(f'Number of classes: {len(LABEL.vocab)}')" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['1', '3', '0', '2']" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(LABEL.vocab.freqs)[-10:]" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "JpEMNInXtZsb" }, "source": [ "The TEXT.vocab dictionary will contain the word counts and indices. The reason why the number of words is VOCABULARY_SIZE + 2 is that it contains to special tokens for padding and unknown words: `` and ``." ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "eIQ_zfKLwjKm" }, "source": [ "Make dataset iterators:" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "colab": {}, "colab_type": "code", "id": "i7JiHR1stHNF" }, "outputs": [], "source": [ "train_loader, valid_loader, test_loader = data.BucketIterator.splits(\n", " (train_data, valid_data, test_dataset), \n", " batch_size=BATCH_SIZE,\n", " sort_within_batch=True, # necessary for packed_padded_sequence\n", " sort_key=lambda x: len(x.content),\n", " device=DEVICE)" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "R0pT_dMRvicQ" }, "source": [ "Testing the iterators (note that the number of rows depends on the longest document in the respective batch):" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 204 }, "colab_type": "code", "id": "y8SP_FccutT0", "outputId": "fe33763a-4560-4dee-adee-31cc6c48b0b2" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Train\n", "Text matrix size: torch.Size([35, 128])\n", "Target vector size: torch.Size([128])\n", "\n", "Valid:\n", "Text matrix size: torch.Size([17, 128])\n", "Target vector size: torch.Size([128])\n", "\n", "Test:\n", "Text matrix size: torch.Size([16, 128])\n", "Target vector size: torch.Size([128])\n" ] } ], "source": [ "print('Train')\n", "for batch in train_loader:\n", " print(f'Text matrix size: {batch.content[0].size()}')\n", " print(f'Target vector size: {batch.classlabel.size()}')\n", " break\n", " \n", "print('\\nValid:')\n", "for batch in valid_loader:\n", " print(f'Text matrix size: {batch.content[0].size()}')\n", " print(f'Target vector size: {batch.classlabel.size()}')\n", " break\n", " \n", "print('\\nTest:')\n", "for batch in test_loader:\n", " print(f'Text matrix size: {batch.content[0].size()}')\n", " print(f'Target vector size: {batch.classlabel.size()}')\n", " break" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "G_grdW3pxCzz" }, "source": [ "## Model" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "colab": {}, "colab_type": 
"code", "id": "nQIUm5EjxFNa" }, "outputs": [], "source": [ "import torch.nn as nn\n", "\n", "\n", "class RNN(nn.Module):\n", " def __init__(self, input_dim, embedding_dim, bidirectional, hidden_dim, num_layers, output_dim, dropout, pad_idx):\n", " \n", " super().__init__()\n", " \n", " self.embedding = nn.Embedding(input_dim, embedding_dim, padding_idx=pad_idx)\n", " self.rnn = nn.LSTM(embedding_dim, \n", " hidden_dim,\n", " num_layers=num_layers,\n", " bidirectional=bidirectional, \n", " dropout=dropout)\n", " self.fc1 = nn.Linear(hidden_dim * num_layers, 64)\n", " self.fc2 = nn.Linear(64, output_dim)\n", " self.dropout = nn.Dropout(dropout)\n", " \n", " def forward(self, text, text_length):\n", "\n", " embedded = self.dropout(self.embedding(text))\n", " packed_embedded = nn.utils.rnn.pack_padded_sequence(embedded, text_length)\n", " packed_output, (hidden, cell) = self.rnn(packed_embedded)\n", " # output, output_lengths = nn.utils.rnn.pad_packed_sequence(packed_output)\n", " hidden = self.dropout(torch.cat((hidden[-2,:,:], hidden[-1,:,:]), dim=1))\n", " hidden = self.fc1(hidden)\n", " hidden = self.dropout(hidden)\n", " hidden = self.fc2(hidden)\n", " return hidden" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "colab": {}, "colab_type": "code", "id": "Ik3NF3faxFmZ" }, "outputs": [], "source": [ "INPUT_DIM = len(TEXT.vocab)\n", "\n", "PAD_IDX = TEXT.vocab.stoi[TEXT.pad_token]\n", "\n", "torch.manual_seed(RANDOM_SEED)\n", "model = RNN(INPUT_DIM, EMBEDDING_DIM, BIDIRECTIONAL, HIDDEN_DIM, NUM_LAYERS, OUTPUT_DIM, DROPOUT, PAD_IDX)\n", "model = model.to(DEVICE)\n", "optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "Lv9Ny9di6VcI" }, "source": [ "## Training" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "colab": {}, "colab_type": "code", "id": "T5t1Afn4xO11" }, "outputs": [], "source": [ "def compute_accuracy(model, data_loader, device):\n", " model.eval()\n", " correct_pred, num_examples = 0, 0\n", " with torch.no_grad():\n", " for batch_idx, batch_data in enumerate(data_loader):\n", " text, text_lengths = batch_data.content\n", " logits = model(text, text_lengths)\n", " _, predicted_labels = torch.max(logits, 1)\n", " num_examples += batch_data.classlabel.size(0)\n", " correct_pred += (predicted_labels.long() == batch_data.classlabel.long()).sum()\n", " return correct_pred.float()/num_examples * 100" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 1836 }, "colab_type": "code", "id": "EABZM8Vo0ilB", "outputId": "5d45e293-9909-4588-e793-8dfaf72e5c67" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch: 001/050 | Batch 000/891 | Cost: 1.3877\n", "Epoch: 001/050 | Batch 050/891 | Cost: 1.2299\n", "Epoch: 001/050 | Batch 100/891 | Cost: 1.0337\n", "Epoch: 001/050 | Batch 150/891 | Cost: 0.8675\n", "Epoch: 001/050 | Batch 200/891 | Cost: 0.8217\n", "Epoch: 001/050 | Batch 250/891 | Cost: 0.6656\n", "Epoch: 001/050 | Batch 300/891 | Cost: 0.6976\n", "Epoch: 001/050 | Batch 350/891 | Cost: 0.7211\n", "Epoch: 001/050 | Batch 400/891 | Cost: 0.5315\n", "Epoch: 001/050 | Batch 450/891 | Cost: 0.5550\n", "Epoch: 001/050 | Batch 500/891 | Cost: 0.5794\n", "Epoch: 001/050 | Batch 550/891 | Cost: 0.5368\n", "Epoch: 001/050 | Batch 600/891 | Cost: 0.4791\n", "Epoch: 001/050 | Batch 650/891 | Cost: 0.6736\n", "Epoch: 001/050 | Batch 700/891 | Cost: 0.4740\n", "Epoch: 001/050 | 
Batch 750/891 | Cost: 0.9449\n", "Epoch: 001/050 | Batch 800/891 | Cost: 0.5111\n", "Epoch: 001/050 | Batch 850/891 | Cost: 0.4126\n", "training accuracy: 84.93%\n", "valid accuracy: 84.10%\n", "Time elapsed: 0.50 min\n", "Epoch: 002/050 | Batch 000/891 | Cost: 0.4338\n", "Epoch: 002/050 | Batch 050/891 | Cost: 0.4728\n", "Epoch: 002/050 | Batch 100/891 | Cost: 0.4738\n", "Epoch: 002/050 | Batch 150/891 | Cost: 0.5698\n", "Epoch: 002/050 | Batch 200/891 | Cost: 0.5169\n", "Epoch: 002/050 | Batch 250/891 | Cost: 0.3508\n", "Epoch: 002/050 | Batch 300/891 | Cost: 0.5877\n", "Epoch: 002/050 | Batch 350/891 | Cost: 0.5818\n", "Epoch: 002/050 | Batch 400/891 | Cost: 0.4324\n", "Epoch: 002/050 | Batch 450/891 | Cost: 0.3107\n", "Epoch: 002/050 | Batch 500/891 | Cost: 0.4036\n", "Epoch: 002/050 | Batch 550/891 | Cost: 0.4403\n", "Epoch: 002/050 | Batch 600/891 | Cost: 0.6188\n", "Epoch: 002/050 | Batch 650/891 | Cost: 0.3749\n", "Epoch: 002/050 | Batch 700/891 | Cost: 0.4714\n", "Epoch: 002/050 | Batch 750/891 | Cost: 0.4166\n", "Epoch: 002/050 | Batch 800/891 | Cost: 0.4081\n", "Epoch: 002/050 | Batch 850/891 | Cost: 0.5608\n", "training accuracy: 89.13%\n", "valid accuracy: 88.40%\n", "Time elapsed: 1.00 min\n", "Epoch: 003/050 | Batch 000/891 | Cost: 0.3957\n", "Epoch: 003/050 | Batch 050/891 | Cost: 0.4003\n", "Epoch: 003/050 | Batch 100/891 | Cost: 0.3422\n", "Epoch: 003/050 | Batch 150/891 | Cost: 0.3350\n", "Epoch: 003/050 | Batch 200/891 | Cost: 0.2862\n", "Epoch: 003/050 | Batch 250/891 | Cost: 0.7255\n", "Epoch: 003/050 | Batch 300/891 | Cost: 0.3194\n", "Epoch: 003/050 | Batch 350/891 | Cost: 0.4845\n", "Epoch: 003/050 | Batch 400/891 | Cost: 0.3754\n", "Epoch: 003/050 | Batch 450/891 | Cost: 0.4159\n", "Epoch: 003/050 | Batch 500/891 | Cost: 0.3210\n", "Epoch: 003/050 | Batch 550/891 | Cost: 0.3639\n", "Epoch: 003/050 | Batch 600/891 | Cost: 0.2480\n", "Epoch: 003/050 | Batch 650/891 | Cost: 0.3586\n", "Epoch: 003/050 | Batch 700/891 | Cost: 0.8477\n", "Epoch: 003/050 | Batch 750/891 | Cost: 0.2967\n", "Epoch: 003/050 | Batch 800/891 | Cost: 0.3125\n", "Epoch: 003/050 | Batch 850/891 | Cost: 0.2451\n", "training accuracy: 90.50%\n", "valid accuracy: 89.47%\n", "Time elapsed: 1.51 min\n", "Epoch: 004/050 | Batch 000/891 | Cost: 0.2751\n", "Epoch: 004/050 | Batch 050/891 | Cost: 0.3306\n", "Epoch: 004/050 | Batch 100/891 | Cost: 0.8538\n", "Epoch: 004/050 | Batch 150/891 | Cost: 0.5015\n", "Epoch: 004/050 | Batch 200/891 | Cost: 0.3141\n", "Epoch: 004/050 | Batch 250/891 | Cost: 0.2756\n", "Epoch: 004/050 | Batch 300/891 | Cost: 0.2920\n", "Epoch: 004/050 | Batch 350/891 | Cost: 0.4124\n", "Epoch: 004/050 | Batch 400/891 | Cost: 0.4118\n", "Epoch: 004/050 | Batch 450/891 | Cost: 0.3355\n", "Epoch: 004/050 | Batch 500/891 | Cost: 0.2594\n", "Epoch: 004/050 | Batch 550/891 | Cost: 0.2008\n", "Epoch: 004/050 | Batch 600/891 | Cost: 0.2917\n", "Epoch: 004/050 | Batch 650/891 | Cost: 0.1437\n", "Epoch: 004/050 | Batch 700/891 | Cost: 0.2682\n", "Epoch: 004/050 | Batch 750/891 | Cost: 0.2572\n", "Epoch: 004/050 | Batch 800/891 | Cost: 0.2653\n", "Epoch: 004/050 | Batch 850/891 | Cost: 0.1637\n", "training accuracy: 91.44%\n", "valid accuracy: 90.28%\n", "Time elapsed: 2.02 min\n", "Epoch: 005/050 | Batch 000/891 | Cost: 0.3751\n", "Epoch: 005/050 | Batch 050/891 | Cost: 0.3224\n", "Epoch: 005/050 | Batch 100/891 | Cost: 0.4595\n", "Epoch: 005/050 | Batch 150/891 | Cost: 0.4083\n", "Epoch: 005/050 | Batch 200/891 | Cost: 0.3154\n", "Epoch: 005/050 | Batch 250/891 | Cost: 0.2272\n", "Epoch: 
005/050 | Batch 300/891 | Cost: 0.2790\n", "Epoch: 005/050 | Batch 350/891 | Cost: 0.3233\n", "Epoch: 005/050 | Batch 400/891 | Cost: 0.3187\n", "Epoch: 005/050 | Batch 450/891 | Cost: 0.2227\n", "Epoch: 005/050 | Batch 500/891 | Cost: 0.3384\n", "Epoch: 005/050 | Batch 550/891 | Cost: 0.3132\n", "Epoch: 005/050 | Batch 600/891 | Cost: 0.3325\n", "Epoch: 005/050 | Batch 650/891 | Cost: 0.2679\n", "Epoch: 005/050 | Batch 700/891 | Cost: 0.4807\n", "Epoch: 005/050 | Batch 750/891 | Cost: 0.2496\n", "Epoch: 005/050 | Batch 800/891 | Cost: 0.2778\n", "Epoch: 005/050 | Batch 850/891 | Cost: 0.2846\n", "training accuracy: 92.07%\n", "valid accuracy: 90.63%\n", "Time elapsed: 2.55 min\n", "Epoch: 006/050 | Batch 000/891 | Cost: 0.6337\n", "Epoch: 006/050 | Batch 050/891 | Cost: 0.1976\n", "Epoch: 006/050 | Batch 100/891 | Cost: 0.3846\n", "Epoch: 006/050 | Batch 150/891 | Cost: 0.2781\n", "Epoch: 006/050 | Batch 200/891 | Cost: 0.2588\n", "Epoch: 006/050 | Batch 250/891 | Cost: 0.3977\n", "Epoch: 006/050 | Batch 300/891 | Cost: 0.3890\n", "Epoch: 006/050 | Batch 350/891 | Cost: 0.3945\n", "Epoch: 006/050 | Batch 400/891 | Cost: 0.3439\n", "Epoch: 006/050 | Batch 450/891 | Cost: 0.2981\n", "Epoch: 006/050 | Batch 500/891 | Cost: 0.3398\n", "Epoch: 006/050 | Batch 550/891 | Cost: 0.3683\n", "Epoch: 006/050 | Batch 600/891 | Cost: 0.2633\n", "Epoch: 006/050 | Batch 650/891 | Cost: 0.2803\n", "Epoch: 006/050 | Batch 700/891 | Cost: 0.3132\n", "Epoch: 006/050 | Batch 750/891 | Cost: 0.1624\n", "Epoch: 006/050 | Batch 800/891 | Cost: 0.2108\n", "Epoch: 006/050 | Batch 850/891 | Cost: 0.3767\n", "training accuracy: 92.38%\n", "valid accuracy: 91.03%\n", "Time elapsed: 3.08 min\n", "Epoch: 007/050 | Batch 000/891 | Cost: 0.2237\n", "Epoch: 007/050 | Batch 050/891 | Cost: 0.2486\n", "Epoch: 007/050 | Batch 100/891 | Cost: 0.4113\n", "Epoch: 007/050 | Batch 150/891 | Cost: 0.1935\n", "Epoch: 007/050 | Batch 200/891 | Cost: 0.1993\n", "Epoch: 007/050 | Batch 250/891 | Cost: 0.2414\n", "Epoch: 007/050 | Batch 300/891 | Cost: 0.3459\n", "Epoch: 007/050 | Batch 350/891 | Cost: 0.2281\n", "Epoch: 007/050 | Batch 400/891 | Cost: 0.3353\n", "Epoch: 007/050 | Batch 450/891 | Cost: 0.2310\n", "Epoch: 007/050 | Batch 500/891 | Cost: 0.2138\n", "Epoch: 007/050 | Batch 550/891 | Cost: 0.2781\n", "Epoch: 007/050 | Batch 600/891 | Cost: 0.1706\n", "Epoch: 007/050 | Batch 650/891 | Cost: 0.3315\n", "Epoch: 007/050 | Batch 700/891 | Cost: 0.4015\n", "Epoch: 007/050 | Batch 750/891 | Cost: 0.6616\n", "Epoch: 007/050 | Batch 800/891 | Cost: 0.1962\n", "Epoch: 007/050 | Batch 850/891 | Cost: 0.3632\n", "training accuracy: 92.55%\n", "valid accuracy: 90.55%\n", "Time elapsed: 3.61 min\n", "Epoch: 008/050 | Batch 000/891 | Cost: 0.1526\n", "Epoch: 008/050 | Batch 050/891 | Cost: 0.2569\n", "Epoch: 008/050 | Batch 100/891 | Cost: 0.2024\n", "Epoch: 008/050 | Batch 150/891 | Cost: 0.4151\n", "Epoch: 008/050 | Batch 200/891 | Cost: 0.3168\n", "Epoch: 008/050 | Batch 250/891 | Cost: 0.2224\n", "Epoch: 008/050 | Batch 300/891 | Cost: 0.2139\n", "Epoch: 008/050 | Batch 350/891 | Cost: 0.1567\n", "Epoch: 008/050 | Batch 400/891 | Cost: 0.1942\n", "Epoch: 008/050 | Batch 450/891 | Cost: 0.3651\n", "Epoch: 008/050 | Batch 500/891 | Cost: 0.4346\n", "Epoch: 008/050 | Batch 550/891 | Cost: 0.2333\n", "Epoch: 008/050 | Batch 600/891 | Cost: 0.4014\n", "Epoch: 008/050 | Batch 650/891 | Cost: 0.2443\n", "Epoch: 008/050 | Batch 700/891 | Cost: 0.2304\n", "Epoch: 008/050 | Batch 750/891 | Cost: 0.3688\n", "Epoch: 008/050 | Batch 800/891 | 
Cost: 0.2376\n", "Epoch: 008/050 | Batch 850/891 | Cost: 0.2561\n", "training accuracy: 93.36%\n", "valid accuracy: 91.45%\n", "Time elapsed: 4.13 min\n", "Epoch: 009/050 | Batch 000/891 | Cost: 0.2353\n", "Epoch: 009/050 | Batch 050/891 | Cost: 0.3266\n", "Epoch: 009/050 | Batch 100/891 | Cost: 0.2345\n", "Epoch: 009/050 | Batch 150/891 | Cost: 0.3621\n", "Epoch: 009/050 | Batch 200/891 | Cost: 0.2544\n", "Epoch: 009/050 | Batch 250/891 | Cost: 0.3480\n", "Epoch: 009/050 | Batch 300/891 | Cost: 0.4042\n", "Epoch: 009/050 | Batch 350/891 | Cost: 0.2110\n", "Epoch: 009/050 | Batch 400/891 | Cost: 0.1583\n", "Epoch: 009/050 | Batch 450/891 | Cost: 0.2829\n", "Epoch: 009/050 | Batch 500/891 | Cost: 0.2130\n", "Epoch: 009/050 | Batch 550/891 | Cost: 0.2088\n", "Epoch: 009/050 | Batch 600/891 | Cost: 0.3278\n", "Epoch: 009/050 | Batch 650/891 | Cost: 0.3618\n", "Epoch: 009/050 | Batch 700/891 | Cost: 0.2778\n", "Epoch: 009/050 | Batch 750/891 | Cost: 0.4374\n", "Epoch: 009/050 | Batch 800/891 | Cost: 0.2463\n", "Epoch: 009/050 | Batch 850/891 | Cost: 0.2187\n", "training accuracy: 93.74%\n", "valid accuracy: 91.37%\n", "Time elapsed: 4.65 min\n", "Epoch: 010/050 | Batch 000/891 | Cost: 0.1810\n", "Epoch: 010/050 | Batch 050/891 | Cost: 0.2491\n", "Epoch: 010/050 | Batch 100/891 | Cost: 0.1872\n", "Epoch: 010/050 | Batch 150/891 | Cost: 0.5379\n", "Epoch: 010/050 | Batch 200/891 | Cost: 0.3171\n", "Epoch: 010/050 | Batch 250/891 | Cost: 0.1732\n", "Epoch: 010/050 | Batch 300/891 | Cost: 0.2367\n", "Epoch: 010/050 | Batch 350/891 | Cost: 0.2784\n", "Epoch: 010/050 | Batch 400/891 | Cost: 0.4789\n", "Epoch: 010/050 | Batch 450/891 | Cost: 0.2235\n", "Epoch: 010/050 | Batch 500/891 | Cost: 0.2694\n", "Epoch: 010/050 | Batch 550/891 | Cost: 0.2759\n", "Epoch: 010/050 | Batch 600/891 | Cost: 0.2000\n", "Epoch: 010/050 | Batch 650/891 | Cost: 0.2420\n", "Epoch: 010/050 | Batch 700/891 | Cost: 0.2196\n", "Epoch: 010/050 | Batch 750/891 | Cost: 0.3454\n", "Epoch: 010/050 | Batch 800/891 | Cost: 0.2498\n", "Epoch: 010/050 | Batch 850/891 | Cost: 0.2910\n", "training accuracy: 93.97%\n", "valid accuracy: 91.20%\n", "Time elapsed: 5.18 min\n", "Epoch: 011/050 | Batch 000/891 | Cost: 0.1767\n", "Epoch: 011/050 | Batch 050/891 | Cost: 0.1857\n", "Epoch: 011/050 | Batch 100/891 | Cost: 0.1880\n", "Epoch: 011/050 | Batch 150/891 | Cost: 0.3116\n", "Epoch: 011/050 | Batch 200/891 | Cost: 0.1706\n", "Epoch: 011/050 | Batch 250/891 | Cost: 0.2218\n", "Epoch: 011/050 | Batch 300/891 | Cost: 0.1673\n", "Epoch: 011/050 | Batch 350/891 | Cost: 0.4530\n", "Epoch: 011/050 | Batch 400/891 | Cost: 0.2309\n", "Epoch: 011/050 | Batch 450/891 | Cost: 0.1871\n", "Epoch: 011/050 | Batch 500/891 | Cost: 0.1490\n", "Epoch: 011/050 | Batch 550/891 | Cost: 0.2857\n", "Epoch: 011/050 | Batch 600/891 | Cost: 0.2446\n", "Epoch: 011/050 | Batch 650/891 | Cost: 0.1511\n", "Epoch: 011/050 | Batch 700/891 | Cost: 0.1921\n", "Epoch: 011/050 | Batch 750/891 | Cost: 0.3078\n", "Epoch: 011/050 | Batch 800/891 | Cost: 0.1326\n", "Epoch: 011/050 | Batch 850/891 | Cost: 0.1922\n", "training accuracy: 94.24%\n", "valid accuracy: 91.45%\n", "Time elapsed: 5.71 min\n", "Epoch: 012/050 | Batch 000/891 | Cost: 0.1514\n", "Epoch: 012/050 | Batch 050/891 | Cost: 0.2781\n", "Epoch: 012/050 | Batch 100/891 | Cost: 0.1480\n", "Epoch: 012/050 | Batch 150/891 | Cost: 0.1900\n", "Epoch: 012/050 | Batch 200/891 | Cost: 0.2881\n", "Epoch: 012/050 | Batch 250/891 | Cost: 0.3169\n", "Epoch: 012/050 | Batch 300/891 | Cost: 0.1878\n", "Epoch: 012/050 | Batch 
350/891 | Cost: 0.1954\n", "Epoch: 012/050 | Batch 400/891 | Cost: 0.2762\n", "Epoch: 012/050 | Batch 450/891 | Cost: 0.2427\n", "Epoch: 012/050 | Batch 500/891 | Cost: 0.1896\n", "Epoch: 012/050 | Batch 550/891 | Cost: 0.2148\n", "Epoch: 012/050 | Batch 600/891 | Cost: 0.1580\n", "Epoch: 012/050 | Batch 650/891 | Cost: 0.2480\n", "Epoch: 012/050 | Batch 700/891 | Cost: 0.2760\n", "Epoch: 012/050 | Batch 750/891 | Cost: 0.3120\n", "Epoch: 012/050 | Batch 800/891 | Cost: 0.1030\n", "Epoch: 012/050 | Batch 850/891 | Cost: 0.2267\n", "training accuracy: 94.46%\n", "valid accuracy: 91.68%\n", "Time elapsed: 6.23 min\n", "Epoch: 013/050 | Batch 000/891 | Cost: 0.2121\n", "Epoch: 013/050 | Batch 050/891 | Cost: 0.2147\n", "Epoch: 013/050 | Batch 100/891 | Cost: 0.2820\n", "Epoch: 013/050 | Batch 150/891 | Cost: 0.3664\n", "Epoch: 013/050 | Batch 200/891 | Cost: 0.1671\n", "Epoch: 013/050 | Batch 250/891 | Cost: 0.2233\n", "Epoch: 013/050 | Batch 300/891 | Cost: 0.2576\n", "Epoch: 013/050 | Batch 350/891 | Cost: 0.4332\n", "Epoch: 013/050 | Batch 400/891 | Cost: 0.2597\n", "Epoch: 013/050 | Batch 450/891 | Cost: 0.1760\n", "Epoch: 013/050 | Batch 500/891 | Cost: 0.3148\n", "Epoch: 013/050 | Batch 550/891 | Cost: 0.2573\n", "Epoch: 013/050 | Batch 600/891 | Cost: 0.2120\n", "Epoch: 013/050 | Batch 650/891 | Cost: 0.2722\n", "Epoch: 013/050 | Batch 700/891 | Cost: 0.0958\n", "Epoch: 013/050 | Batch 750/891 | Cost: 0.1930\n", "Epoch: 013/050 | Batch 800/891 | Cost: 0.0750\n", "Epoch: 013/050 | Batch 850/891 | Cost: 0.2382\n", "training accuracy: 94.74%\n", "valid accuracy: 91.47%\n", "Time elapsed: 6.76 min\n", "Epoch: 014/050 | Batch 000/891 | Cost: 0.1844\n", "Epoch: 014/050 | Batch 050/891 | Cost: 0.1602\n", "Epoch: 014/050 | Batch 100/891 | Cost: 0.2462\n", "Epoch: 014/050 | Batch 150/891 | Cost: 0.1282\n", "Epoch: 014/050 | Batch 200/891 | Cost: 0.1453\n", "Epoch: 014/050 | Batch 250/891 | Cost: 0.2589\n", "Epoch: 014/050 | Batch 300/891 | Cost: 0.2492\n", "Epoch: 014/050 | Batch 350/891 | Cost: 0.0958\n", "Epoch: 014/050 | Batch 400/891 | Cost: 0.6354\n", "Epoch: 014/050 | Batch 450/891 | Cost: 0.1346\n", "Epoch: 014/050 | Batch 500/891 | Cost: 0.3579\n", "Epoch: 014/050 | Batch 550/891 | Cost: 0.1079\n", "Epoch: 014/050 | Batch 600/891 | Cost: 0.1896\n", "Epoch: 014/050 | Batch 650/891 | Cost: 0.2278\n", "Epoch: 014/050 | Batch 700/891 | Cost: 0.4933\n", "Epoch: 014/050 | Batch 750/891 | Cost: 0.3213\n", "Epoch: 014/050 | Batch 800/891 | Cost: 0.2413\n", "Epoch: 014/050 | Batch 850/891 | Cost: 0.2485\n", "training accuracy: 94.84%\n", "valid accuracy: 91.70%\n", "Time elapsed: 7.28 min\n", "Epoch: 015/050 | Batch 000/891 | Cost: 0.2655\n", "Epoch: 015/050 | Batch 050/891 | Cost: 0.0850\n", "Epoch: 015/050 | Batch 100/891 | Cost: 0.2339\n", "Epoch: 015/050 | Batch 150/891 | Cost: 0.1445\n", "Epoch: 015/050 | Batch 200/891 | Cost: 0.1013\n", "Epoch: 015/050 | Batch 250/891 | Cost: 0.2296\n", "Epoch: 015/050 | Batch 300/891 | Cost: 0.1205\n", "Epoch: 015/050 | Batch 350/891 | Cost: 0.1492\n", "Epoch: 015/050 | Batch 400/891 | Cost: 0.3134\n", "Epoch: 015/050 | Batch 450/891 | Cost: 0.2489\n", "Epoch: 015/050 | Batch 500/891 | Cost: 0.1313\n", "Epoch: 015/050 | Batch 550/891 | Cost: 0.2463\n", "Epoch: 015/050 | Batch 600/891 | Cost: 0.1853\n", "Epoch: 015/050 | Batch 650/891 | Cost: 0.1878\n", "Epoch: 015/050 | Batch 700/891 | Cost: 0.2329\n", "Epoch: 015/050 | Batch 750/891 | Cost: 0.1648\n", "Epoch: 015/050 | Batch 800/891 | Cost: 0.1891\n", "Epoch: 015/050 | Batch 850/891 | Cost: 0.1200\n", 
"training accuracy: 95.22%\n", "valid accuracy: 91.83%\n", "Time elapsed: 7.80 min\n", "Epoch: 016/050 | Batch 000/891 | Cost: 0.2548\n", "Epoch: 016/050 | Batch 050/891 | Cost: 0.3054\n", "Epoch: 016/050 | Batch 100/891 | Cost: 0.1123\n", "Epoch: 016/050 | Batch 150/891 | Cost: 0.1788\n", "Epoch: 016/050 | Batch 200/891 | Cost: 0.0968\n", "Epoch: 016/050 | Batch 250/891 | Cost: 0.2611\n", "Epoch: 016/050 | Batch 300/891 | Cost: 0.1720\n", "Epoch: 016/050 | Batch 350/891 | Cost: 0.1352\n", "Epoch: 016/050 | Batch 400/891 | Cost: 0.2122\n", "Epoch: 016/050 | Batch 450/891 | Cost: 0.3495\n", "Epoch: 016/050 | Batch 500/891 | Cost: 0.2742\n", "Epoch: 016/050 | Batch 550/891 | Cost: 0.3351\n", "Epoch: 016/050 | Batch 600/891 | Cost: 0.0711\n", "Epoch: 016/050 | Batch 650/891 | Cost: 0.1606\n", "Epoch: 016/050 | Batch 700/891 | Cost: 0.1502\n", "Epoch: 016/050 | Batch 750/891 | Cost: 0.1500\n", "Epoch: 016/050 | Batch 800/891 | Cost: 0.1290\n", "Epoch: 016/050 | Batch 850/891 | Cost: 0.1974\n", "training accuracy: 95.38%\n", "valid accuracy: 91.82%\n", "Time elapsed: 8.33 min\n", "Epoch: 017/050 | Batch 000/891 | Cost: 0.3753\n", "Epoch: 017/050 | Batch 050/891 | Cost: 0.2603\n", "Epoch: 017/050 | Batch 100/891 | Cost: 0.0900\n", "Epoch: 017/050 | Batch 150/891 | Cost: 0.1902\n", "Epoch: 017/050 | Batch 200/891 | Cost: 0.2403\n", "Epoch: 017/050 | Batch 250/891 | Cost: 0.1488\n", "Epoch: 017/050 | Batch 300/891 | Cost: 0.1474\n", "Epoch: 017/050 | Batch 350/891 | Cost: 0.2314\n", "Epoch: 017/050 | Batch 400/891 | Cost: 0.1752\n", "Epoch: 017/050 | Batch 450/891 | Cost: 0.1610\n", "Epoch: 017/050 | Batch 500/891 | Cost: 0.2189\n", "Epoch: 017/050 | Batch 550/891 | Cost: 0.2283\n", "Epoch: 017/050 | Batch 600/891 | Cost: 0.2098\n", "Epoch: 017/050 | Batch 650/891 | Cost: 0.2482\n", "Epoch: 017/050 | Batch 700/891 | Cost: 0.1573\n", "Epoch: 017/050 | Batch 750/891 | Cost: 0.1941\n", "Epoch: 017/050 | Batch 800/891 | Cost: 0.1842\n", "Epoch: 017/050 | Batch 850/891 | Cost: 0.1926\n", "training accuracy: 95.67%\n", "valid accuracy: 92.17%\n", "Time elapsed: 8.85 min\n", "Epoch: 018/050 | Batch 000/891 | Cost: 0.2376\n", "Epoch: 018/050 | Batch 050/891 | Cost: 0.1245\n", "Epoch: 018/050 | Batch 100/891 | Cost: 0.1663\n", "Epoch: 018/050 | Batch 150/891 | Cost: 0.1179\n", "Epoch: 018/050 | Batch 200/891 | Cost: 0.2016\n", "Epoch: 018/050 | Batch 250/891 | Cost: 0.1451\n", "Epoch: 018/050 | Batch 300/891 | Cost: 0.1310\n", "Epoch: 018/050 | Batch 350/891 | Cost: 0.2826\n", "Epoch: 018/050 | Batch 400/891 | Cost: 0.1151\n", "Epoch: 018/050 | Batch 450/891 | Cost: 0.0847\n", "Epoch: 018/050 | Batch 500/891 | Cost: 0.3294\n", "Epoch: 018/050 | Batch 550/891 | Cost: 0.2216\n", "Epoch: 018/050 | Batch 600/891 | Cost: 0.3044\n", "Epoch: 018/050 | Batch 650/891 | Cost: 0.2693\n", "Epoch: 018/050 | Batch 700/891 | Cost: 0.1898\n", "Epoch: 018/050 | Batch 750/891 | Cost: 0.1130\n", "Epoch: 018/050 | Batch 800/891 | Cost: 0.4650\n", "Epoch: 018/050 | Batch 850/891 | Cost: 0.1036\n", "training accuracy: 95.91%\n", "valid accuracy: 91.83%\n", "Time elapsed: 9.37 min\n", "Epoch: 019/050 | Batch 000/891 | Cost: 0.1979\n", "Epoch: 019/050 | Batch 050/891 | Cost: 0.3851\n", "Epoch: 019/050 | Batch 100/891 | Cost: 0.1857\n", "Epoch: 019/050 | Batch 150/891 | Cost: 0.1524\n", "Epoch: 019/050 | Batch 200/891 | Cost: 0.2859\n", "Epoch: 019/050 | Batch 250/891 | Cost: 0.1978\n", "Epoch: 019/050 | Batch 300/891 | Cost: 0.1859\n", "Epoch: 019/050 | Batch 350/891 | Cost: 0.3971\n", "Epoch: 019/050 | Batch 400/891 | Cost: 
0.1393\n", "Epoch: 019/050 | Batch 450/891 | Cost: 0.4079\n", "Epoch: 019/050 | Batch 500/891 | Cost: 0.3164\n", "Epoch: 019/050 | Batch 550/891 | Cost: 0.2275\n", "Epoch: 019/050 | Batch 600/891 | Cost: 0.0996\n", "Epoch: 019/050 | Batch 650/891 | Cost: 0.1961\n", "Epoch: 019/050 | Batch 700/891 | Cost: 0.1276\n", "Epoch: 019/050 | Batch 750/891 | Cost: 0.2926\n", "Epoch: 019/050 | Batch 800/891 | Cost: 0.1130\n", "Epoch: 019/050 | Batch 850/891 | Cost: 0.1441\n", "training accuracy: 95.94%\n", "valid accuracy: 91.77%\n", "Time elapsed: 9.89 min\n", "Epoch: 020/050 | Batch 000/891 | Cost: 0.1467\n", "Epoch: 020/050 | Batch 050/891 | Cost: 0.1647\n", "Epoch: 020/050 | Batch 100/891 | Cost: 0.2161\n", "Epoch: 020/050 | Batch 150/891 | Cost: 0.1547\n", "Epoch: 020/050 | Batch 200/891 | Cost: 0.4020\n", "Epoch: 020/050 | Batch 250/891 | Cost: 0.2447\n", "Epoch: 020/050 | Batch 300/891 | Cost: 0.2506\n", "Epoch: 020/050 | Batch 350/891 | Cost: 0.1571\n", "Epoch: 020/050 | Batch 400/891 | Cost: 0.1199\n", "Epoch: 020/050 | Batch 450/891 | Cost: 0.0830\n", "Epoch: 020/050 | Batch 500/891 | Cost: 0.1957\n", "Epoch: 020/050 | Batch 550/891 | Cost: 0.1530\n", "Epoch: 020/050 | Batch 600/891 | Cost: 0.4454\n", "Epoch: 020/050 | Batch 650/891 | Cost: 0.1160\n", "Epoch: 020/050 | Batch 700/891 | Cost: 0.1137\n", "Epoch: 020/050 | Batch 750/891 | Cost: 0.1297\n", "Epoch: 020/050 | Batch 800/891 | Cost: 0.3041\n", "Epoch: 020/050 | Batch 850/891 | Cost: 0.0867\n", "training accuracy: 96.05%\n", "valid accuracy: 91.98%\n", "Time elapsed: 10.41 min\n", "Epoch: 021/050 | Batch 000/891 | Cost: 0.1451\n", "Epoch: 021/050 | Batch 050/891 | Cost: 0.1487\n", "Epoch: 021/050 | Batch 100/891 | Cost: 0.0903\n", "Epoch: 021/050 | Batch 150/891 | Cost: 0.1391\n", "Epoch: 021/050 | Batch 200/891 | Cost: 0.1830\n", "Epoch: 021/050 | Batch 250/891 | Cost: 0.1728\n", "Epoch: 021/050 | Batch 300/891 | Cost: 0.2294\n", "Epoch: 021/050 | Batch 350/891 | Cost: 0.0633\n", "Epoch: 021/050 | Batch 400/891 | Cost: 0.1786\n", "Epoch: 021/050 | Batch 450/891 | Cost: 0.0823\n", "Epoch: 021/050 | Batch 500/891 | Cost: 0.0895\n", "Epoch: 021/050 | Batch 550/891 | Cost: 0.1229\n", "Epoch: 021/050 | Batch 600/891 | Cost: 0.2732\n", "Epoch: 021/050 | Batch 650/891 | Cost: 0.1437\n", "Epoch: 021/050 | Batch 700/891 | Cost: 0.0879\n", "Epoch: 021/050 | Batch 750/891 | Cost: 0.1119\n", "Epoch: 021/050 | Batch 800/891 | Cost: 0.1358\n", "Epoch: 021/050 | Batch 850/891 | Cost: 0.1967\n", "training accuracy: 96.15%\n", "valid accuracy: 91.85%\n", "Time elapsed: 10.94 min\n", "Epoch: 022/050 | Batch 000/891 | Cost: 0.2011\n", "Epoch: 022/050 | Batch 050/891 | Cost: 0.1103\n", "Epoch: 022/050 | Batch 100/891 | Cost: 0.2082\n", "Epoch: 022/050 | Batch 150/891 | Cost: 0.1258\n", "Epoch: 022/050 | Batch 200/891 | Cost: 0.3730\n", "Epoch: 022/050 | Batch 250/891 | Cost: 0.4325\n", "Epoch: 022/050 | Batch 300/891 | Cost: 0.2348\n", "Epoch: 022/050 | Batch 350/891 | Cost: 0.1401\n", "Epoch: 022/050 | Batch 400/891 | Cost: 0.3020\n", "Epoch: 022/050 | Batch 450/891 | Cost: 0.1173\n", "Epoch: 022/050 | Batch 500/891 | Cost: 0.1262\n", "Epoch: 022/050 | Batch 550/891 | Cost: 0.2594\n", "Epoch: 022/050 | Batch 600/891 | Cost: 0.1213\n", "Epoch: 022/050 | Batch 650/891 | Cost: 0.0961\n", "Epoch: 022/050 | Batch 700/891 | Cost: 0.1579\n", "Epoch: 022/050 | Batch 750/891 | Cost: 0.1669\n", "Epoch: 022/050 | Batch 800/891 | Cost: 0.1836\n", "Epoch: 022/050 | Batch 850/891 | Cost: 0.1857\n", "training accuracy: 96.38%\n", "valid accuracy: 91.85%\n", "Time 
elapsed: 11.46 min\n", "Epoch: 023/050 | Batch 000/891 | Cost: 0.1616\n", "Epoch: 023/050 | Batch 050/891 | Cost: 0.0922\n", "Epoch: 023/050 | Batch 100/891 | Cost: 0.2086\n", "Epoch: 023/050 | Batch 150/891 | Cost: 0.4053\n", "Epoch: 023/050 | Batch 200/891 | Cost: 0.2502\n", "Epoch: 023/050 | Batch 250/891 | Cost: 0.1509\n", "Epoch: 023/050 | Batch 300/891 | Cost: 0.3012\n", "Epoch: 023/050 | Batch 350/891 | Cost: 0.1202\n", "Epoch: 023/050 | Batch 400/891 | Cost: 0.4270\n", "Epoch: 023/050 | Batch 450/891 | Cost: 0.2277\n", "Epoch: 023/050 | Batch 500/891 | Cost: 0.1788\n", "Epoch: 023/050 | Batch 550/891 | Cost: 0.1663\n", "Epoch: 023/050 | Batch 600/891 | Cost: 0.1667\n", "Epoch: 023/050 | Batch 650/891 | Cost: 0.1578\n", "Epoch: 023/050 | Batch 700/891 | Cost: 0.1768\n", "Epoch: 023/050 | Batch 750/891 | Cost: 0.1270\n", "Epoch: 023/050 | Batch 800/891 | Cost: 0.1632\n", "Epoch: 023/050 | Batch 850/891 | Cost: 0.2621\n", "training accuracy: 96.47%\n", "valid accuracy: 92.28%\n", "Time elapsed: 11.99 min\n", "Epoch: 024/050 | Batch 000/891 | Cost: 0.0953\n", "Epoch: 024/050 | Batch 050/891 | Cost: 0.0845\n", "Epoch: 024/050 | Batch 100/891 | Cost: 0.1797\n", "Epoch: 024/050 | Batch 150/891 | Cost: 0.1241\n", "Epoch: 024/050 | Batch 200/891 | Cost: 0.1801\n", "Epoch: 024/050 | Batch 250/891 | Cost: 0.2227\n", "Epoch: 024/050 | Batch 300/891 | Cost: 0.4965\n", "Epoch: 024/050 | Batch 350/891 | Cost: 0.1874\n", "Epoch: 024/050 | Batch 400/891 | Cost: 0.1172\n", "Epoch: 024/050 | Batch 450/891 | Cost: 0.2244\n", "Epoch: 024/050 | Batch 500/891 | Cost: 0.1262\n", "Epoch: 024/050 | Batch 550/891 | Cost: 0.2427\n", "Epoch: 024/050 | Batch 600/891 | Cost: 0.1131\n", "Epoch: 024/050 | Batch 650/891 | Cost: 0.2320\n", "Epoch: 024/050 | Batch 700/891 | Cost: 0.1078\n", "Epoch: 024/050 | Batch 750/891 | Cost: 0.0839\n", "Epoch: 024/050 | Batch 800/891 | Cost: 0.2036\n", "Epoch: 024/050 | Batch 850/891 | Cost: 0.1953\n", "training accuracy: 96.49%\n", "valid accuracy: 91.95%\n", "Time elapsed: 12.51 min\n", "Epoch: 025/050 | Batch 000/891 | Cost: 0.2558\n", "Epoch: 025/050 | Batch 050/891 | Cost: 0.1072\n", "Epoch: 025/050 | Batch 100/891 | Cost: 0.2158\n", "Epoch: 025/050 | Batch 150/891 | Cost: 0.1381\n", "Epoch: 025/050 | Batch 200/891 | Cost: 0.0871\n", "Epoch: 025/050 | Batch 250/891 | Cost: 0.3461\n", "Epoch: 025/050 | Batch 300/891 | Cost: 0.0968\n", "Epoch: 025/050 | Batch 350/891 | Cost: 0.3009\n", "Epoch: 025/050 | Batch 400/891 | Cost: 0.1789\n", "Epoch: 025/050 | Batch 450/891 | Cost: 0.1351\n", "Epoch: 025/050 | Batch 500/891 | Cost: 0.4432\n", "Epoch: 025/050 | Batch 550/891 | Cost: 0.1543\n", "Epoch: 025/050 | Batch 600/891 | Cost: 0.1859\n", "Epoch: 025/050 | Batch 650/891 | Cost: 0.2304\n", "Epoch: 025/050 | Batch 700/891 | Cost: 0.1545\n", "Epoch: 025/050 | Batch 750/891 | Cost: 0.2133\n", "Epoch: 025/050 | Batch 800/891 | Cost: 0.1626\n", "Epoch: 025/050 | Batch 850/891 | Cost: 0.1345\n", "training accuracy: 96.52%\n", "valid accuracy: 91.67%\n", "Time elapsed: 13.04 min\n", "Epoch: 026/050 | Batch 000/891 | Cost: 0.1949\n", "Epoch: 026/050 | Batch 050/891 | Cost: 0.3824\n", "Epoch: 026/050 | Batch 100/891 | Cost: 0.1669\n", "Epoch: 026/050 | Batch 150/891 | Cost: 0.0921\n", "Epoch: 026/050 | Batch 200/891 | Cost: 0.1204\n", "Epoch: 026/050 | Batch 250/891 | Cost: 0.2094\n", "Epoch: 026/050 | Batch 300/891 | Cost: 0.3778\n", "Epoch: 026/050 | Batch 350/891 | Cost: 0.1472\n", "Epoch: 026/050 | Batch 400/891 | Cost: 0.2276\n", "Epoch: 026/050 | Batch 450/891 | Cost: 0.3556\n", 
"Epoch: 026/050 | Batch 500/891 | Cost: 0.2241\n", "Epoch: 026/050 | Batch 550/891 | Cost: 0.4314\n", "Epoch: 026/050 | Batch 600/891 | Cost: 0.2155\n", "Epoch: 026/050 | Batch 650/891 | Cost: 0.1677\n", "Epoch: 026/050 | Batch 700/891 | Cost: 0.1383\n", "Epoch: 026/050 | Batch 750/891 | Cost: 0.1661\n", "Epoch: 026/050 | Batch 800/891 | Cost: 0.3100\n", "Epoch: 026/050 | Batch 850/891 | Cost: 0.1083\n", "training accuracy: 96.67%\n", "valid accuracy: 91.70%\n", "Time elapsed: 13.56 min\n", "Epoch: 027/050 | Batch 000/891 | Cost: 0.0772\n", "Epoch: 027/050 | Batch 050/891 | Cost: 0.0812\n", "Epoch: 027/050 | Batch 100/891 | Cost: 0.1793\n", "Epoch: 027/050 | Batch 150/891 | Cost: 0.1480\n", "Epoch: 027/050 | Batch 200/891 | Cost: 0.1768\n", "Epoch: 027/050 | Batch 250/891 | Cost: 0.3068\n", "Epoch: 027/050 | Batch 300/891 | Cost: 0.1652\n", "Epoch: 027/050 | Batch 350/891 | Cost: 0.1633\n", "Epoch: 027/050 | Batch 400/891 | Cost: 0.2064\n", "Epoch: 027/050 | Batch 450/891 | Cost: 0.1655\n", "Epoch: 027/050 | Batch 500/891 | Cost: 0.1756\n", "Epoch: 027/050 | Batch 550/891 | Cost: 0.1434\n", "Epoch: 027/050 | Batch 600/891 | Cost: 0.2568\n", "Epoch: 027/050 | Batch 650/891 | Cost: 0.0844\n", "Epoch: 027/050 | Batch 700/891 | Cost: 0.0799\n", "Epoch: 027/050 | Batch 750/891 | Cost: 0.1349\n", "Epoch: 027/050 | Batch 800/891 | Cost: 0.2556\n", "Epoch: 027/050 | Batch 850/891 | Cost: 0.1254\n", "training accuracy: 96.94%\n", "valid accuracy: 92.20%\n", "Time elapsed: 14.09 min\n", "Epoch: 028/050 | Batch 000/891 | Cost: 0.1303\n", "Epoch: 028/050 | Batch 050/891 | Cost: 0.2345\n", "Epoch: 028/050 | Batch 100/891 | Cost: 0.1625\n", "Epoch: 028/050 | Batch 150/891 | Cost: 0.1978\n", "Epoch: 028/050 | Batch 200/891 | Cost: 0.1598\n", "Epoch: 028/050 | Batch 250/891 | Cost: 0.1072\n", "Epoch: 028/050 | Batch 300/891 | Cost: 0.1831\n", "Epoch: 028/050 | Batch 350/891 | Cost: 0.0910\n", "Epoch: 028/050 | Batch 400/891 | Cost: 0.0870\n", "Epoch: 028/050 | Batch 450/891 | Cost: 0.1054\n", "Epoch: 028/050 | Batch 500/891 | Cost: 0.1814\n", "Epoch: 028/050 | Batch 550/891 | Cost: 0.1450\n", "Epoch: 028/050 | Batch 600/891 | Cost: 0.1180\n", "Epoch: 028/050 | Batch 650/891 | Cost: 0.1368\n", "Epoch: 028/050 | Batch 700/891 | Cost: 0.1233\n", "Epoch: 028/050 | Batch 750/891 | Cost: 0.0832\n", "Epoch: 028/050 | Batch 800/891 | Cost: 0.1648\n", "Epoch: 028/050 | Batch 850/891 | Cost: 0.1635\n", "training accuracy: 97.06%\n", "valid accuracy: 92.13%\n", "Time elapsed: 14.61 min\n", "Epoch: 029/050 | Batch 000/891 | Cost: 0.0895\n", "Epoch: 029/050 | Batch 050/891 | Cost: 0.0893\n", "Epoch: 029/050 | Batch 100/891 | Cost: 0.2326\n", "Epoch: 029/050 | Batch 150/891 | Cost: 0.2078\n", "Epoch: 029/050 | Batch 200/891 | Cost: 0.0743\n", "Epoch: 029/050 | Batch 250/891 | Cost: 0.1169\n", "Epoch: 029/050 | Batch 300/891 | Cost: 0.2828\n", "Epoch: 029/050 | Batch 350/891 | Cost: 0.1916\n", "Epoch: 029/050 | Batch 400/891 | Cost: 0.1416\n", "Epoch: 029/050 | Batch 450/891 | Cost: 0.1501\n", "Epoch: 029/050 | Batch 500/891 | Cost: 0.2920\n", "Epoch: 029/050 | Batch 550/891 | Cost: 0.1433\n", "Epoch: 029/050 | Batch 600/891 | Cost: 0.1443\n", "Epoch: 029/050 | Batch 650/891 | Cost: 0.4024\n", "Epoch: 029/050 | Batch 700/891 | Cost: 0.1745\n", "Epoch: 029/050 | Batch 750/891 | Cost: 0.1506\n", "Epoch: 029/050 | Batch 800/891 | Cost: 0.1827\n", "Epoch: 029/050 | Batch 850/891 | Cost: 0.1941\n", "training accuracy: 97.06%\n", "valid accuracy: 91.80%\n", "Time elapsed: 15.13 min\n", "Epoch: 030/050 | Batch 000/891 | Cost: 
0.0538\n", "Epoch: 030/050 | Batch 050/891 | Cost: 0.1574\n", "Epoch: 030/050 | Batch 100/891 | Cost: 0.1078\n", "Epoch: 030/050 | Batch 150/891 | Cost: 0.0910\n", "Epoch: 030/050 | Batch 200/891 | Cost: 0.4213\n", "Epoch: 030/050 | Batch 250/891 | Cost: 0.4354\n", "Epoch: 030/050 | Batch 300/891 | Cost: 0.1978\n", "Epoch: 030/050 | Batch 350/891 | Cost: 0.3105\n", "Epoch: 030/050 | Batch 400/891 | Cost: 0.0855\n", "Epoch: 030/050 | Batch 450/891 | Cost: 0.0950\n", "Epoch: 030/050 | Batch 500/891 | Cost: 0.1578\n", "Epoch: 030/050 | Batch 550/891 | Cost: 0.1812\n", "Epoch: 030/050 | Batch 600/891 | Cost: 0.1503\n", "Epoch: 030/050 | Batch 650/891 | Cost: 0.2524\n", "Epoch: 030/050 | Batch 700/891 | Cost: 0.2850\n", "Epoch: 030/050 | Batch 750/891 | Cost: 0.2929\n", "Epoch: 030/050 | Batch 800/891 | Cost: 0.1662\n", "Epoch: 030/050 | Batch 850/891 | Cost: 0.1461\n", "training accuracy: 96.99%\n", "valid accuracy: 91.63%\n", "Time elapsed: 15.65 min\n", "Epoch: 031/050 | Batch 000/891 | Cost: 0.2477\n", "Epoch: 031/050 | Batch 050/891 | Cost: 0.0818\n", "Epoch: 031/050 | Batch 100/891 | Cost: 0.3006\n", "Epoch: 031/050 | Batch 150/891 | Cost: 0.1007\n", "Epoch: 031/050 | Batch 200/891 | Cost: 0.1521\n", "Epoch: 031/050 | Batch 250/891 | Cost: 0.2553\n", "Epoch: 031/050 | Batch 300/891 | Cost: 0.1161\n", "Epoch: 031/050 | Batch 350/891 | Cost: 0.1272\n", "Epoch: 031/050 | Batch 400/891 | Cost: 0.1417\n", "Epoch: 031/050 | Batch 450/891 | Cost: 0.2192\n", "Epoch: 031/050 | Batch 500/891 | Cost: 0.1461\n", "Epoch: 031/050 | Batch 550/891 | Cost: 0.0548\n", "Epoch: 031/050 | Batch 600/891 | Cost: 0.0588\n", "Epoch: 031/050 | Batch 650/891 | Cost: 0.1124\n", "Epoch: 031/050 | Batch 700/891 | Cost: 0.1215\n", "Epoch: 031/050 | Batch 750/891 | Cost: 0.1673\n", "Epoch: 031/050 | Batch 800/891 | Cost: 0.3652\n", "Epoch: 031/050 | Batch 850/891 | Cost: 0.1577\n", "training accuracy: 97.28%\n", "valid accuracy: 91.93%\n", "Time elapsed: 16.18 min\n", "Epoch: 032/050 | Batch 000/891 | Cost: 0.2157\n", "Epoch: 032/050 | Batch 050/891 | Cost: 0.1044\n", "Epoch: 032/050 | Batch 100/891 | Cost: 0.1418\n", "Epoch: 032/050 | Batch 150/891 | Cost: 0.1295\n", "Epoch: 032/050 | Batch 200/891 | Cost: 0.1992\n", "Epoch: 032/050 | Batch 250/891 | Cost: 0.1287\n", "Epoch: 032/050 | Batch 300/891 | Cost: 0.1237\n", "Epoch: 032/050 | Batch 350/891 | Cost: 0.1700\n", "Epoch: 032/050 | Batch 400/891 | Cost: 0.0834\n", "Epoch: 032/050 | Batch 450/891 | Cost: 0.1187\n", "Epoch: 032/050 | Batch 500/891 | Cost: 0.1210\n", "Epoch: 032/050 | Batch 550/891 | Cost: 0.1013\n", "Epoch: 032/050 | Batch 600/891 | Cost: 0.1093\n", "Epoch: 032/050 | Batch 650/891 | Cost: 0.1273\n", "Epoch: 032/050 | Batch 700/891 | Cost: 0.0825\n", "Epoch: 032/050 | Batch 750/891 | Cost: 0.0576\n", "Epoch: 032/050 | Batch 800/891 | Cost: 0.3141\n", "Epoch: 032/050 | Batch 850/891 | Cost: 0.1311\n", "training accuracy: 97.36%\n", "valid accuracy: 91.75%\n", "Time elapsed: 16.70 min\n", "Epoch: 033/050 | Batch 000/891 | Cost: 0.1296\n", "Epoch: 033/050 | Batch 050/891 | Cost: 0.1740\n", "Epoch: 033/050 | Batch 100/891 | Cost: 0.1495\n", "Epoch: 033/050 | Batch 150/891 | Cost: 0.1684\n", "Epoch: 033/050 | Batch 200/891 | Cost: 0.1388\n", "Epoch: 033/050 | Batch 250/891 | Cost: 0.0879\n", "Epoch: 033/050 | Batch 300/891 | Cost: 0.1247\n", "Epoch: 033/050 | Batch 350/891 | Cost: 0.0976\n", "Epoch: 033/050 | Batch 400/891 | Cost: 0.1558\n", "Epoch: 033/050 | Batch 450/891 | Cost: 0.1188\n", "Epoch: 033/050 | Batch 500/891 | Cost: 0.2809\n", "Epoch: 033/050 
| Batch 550/891 | Cost: 0.1375\n", "Epoch: 033/050 | Batch 600/891 | Cost: 0.1907\n", "Epoch: 033/050 | Batch 650/891 | Cost: 0.2093\n", "Epoch: 033/050 | Batch 700/891 | Cost: 0.1889\n", "Epoch: 033/050 | Batch 750/891 | Cost: 0.1331\n", "Epoch: 033/050 | Batch 800/891 | Cost: 0.1284\n", "Epoch: 033/050 | Batch 850/891 | Cost: 0.1431\n", "training accuracy: 97.42%\n", "valid accuracy: 91.95%\n", "Time elapsed: 17.22 min\n", "Epoch: 034/050 | Batch 000/891 | Cost: 0.1271\n", "Epoch: 034/050 | Batch 050/891 | Cost: 0.1313\n", "Epoch: 034/050 | Batch 100/891 | Cost: 0.1259\n", "Epoch: 034/050 | Batch 150/891 | Cost: 0.1604\n", "Epoch: 034/050 | Batch 200/891 | Cost: 0.1298\n", "Epoch: 034/050 | Batch 250/891 | Cost: 0.2076\n", "Epoch: 034/050 | Batch 300/891 | Cost: 0.1235\n", "Epoch: 034/050 | Batch 350/891 | Cost: 0.1878\n", "Epoch: 034/050 | Batch 400/891 | Cost: 0.1428\n", "Epoch: 034/050 | Batch 450/891 | Cost: 0.1437\n", "Epoch: 034/050 | Batch 500/891 | Cost: 0.2830\n", "Epoch: 034/050 | Batch 550/891 | Cost: 0.1939\n", "Epoch: 034/050 | Batch 600/891 | Cost: 0.2164\n", "Epoch: 034/050 | Batch 650/891 | Cost: 0.1532\n", "Epoch: 034/050 | Batch 700/891 | Cost: 0.0598\n", "Epoch: 034/050 | Batch 750/891 | Cost: 0.2219\n", "Epoch: 034/050 | Batch 800/891 | Cost: 0.0449\n", "Epoch: 034/050 | Batch 850/891 | Cost: 0.1881\n", "training accuracy: 97.50%\n", "valid accuracy: 91.83%\n", "Time elapsed: 17.75 min\n", "Epoch: 035/050 | Batch 000/891 | Cost: 0.2045\n", "Epoch: 035/050 | Batch 050/891 | Cost: 0.0852\n", "Epoch: 035/050 | Batch 100/891 | Cost: 0.1590\n", "Epoch: 035/050 | Batch 150/891 | Cost: 0.1173\n", "Epoch: 035/050 | Batch 200/891 | Cost: 0.0929\n", "Epoch: 035/050 | Batch 250/891 | Cost: 0.1028\n", "Epoch: 035/050 | Batch 300/891 | Cost: 0.1426\n", "Epoch: 035/050 | Batch 350/891 | Cost: 0.1643\n", "Epoch: 035/050 | Batch 400/891 | Cost: 0.1684\n", "Epoch: 035/050 | Batch 450/891 | Cost: 0.1423\n", "Epoch: 035/050 | Batch 500/891 | Cost: 0.0537\n", "Epoch: 035/050 | Batch 550/891 | Cost: 0.1361\n", "Epoch: 035/050 | Batch 600/891 | Cost: 0.1196\n", "Epoch: 035/050 | Batch 650/891 | Cost: 0.2022\n", "Epoch: 035/050 | Batch 700/891 | Cost: 0.1325\n", "Epoch: 035/050 | Batch 750/891 | Cost: 0.1634\n", "Epoch: 035/050 | Batch 800/891 | Cost: 0.0780\n", "Epoch: 035/050 | Batch 850/891 | Cost: 0.0622\n", "training accuracy: 97.54%\n", "valid accuracy: 92.08%\n", "Time elapsed: 18.27 min\n", "Epoch: 036/050 | Batch 000/891 | Cost: 0.2047\n", "Epoch: 036/050 | Batch 050/891 | Cost: 0.1147\n", "Epoch: 036/050 | Batch 100/891 | Cost: 0.1562\n", "Epoch: 036/050 | Batch 150/891 | Cost: 0.1287\n", "Epoch: 036/050 | Batch 200/891 | Cost: 0.1003\n", "Epoch: 036/050 | Batch 250/891 | Cost: 0.0321\n", "Epoch: 036/050 | Batch 300/891 | Cost: 0.0996\n", "Epoch: 036/050 | Batch 350/891 | Cost: 0.3548\n", "Epoch: 036/050 | Batch 400/891 | Cost: 0.3519\n", "Epoch: 036/050 | Batch 450/891 | Cost: 0.1706\n", "Epoch: 036/050 | Batch 500/891 | Cost: 0.0928\n", "Epoch: 036/050 | Batch 550/891 | Cost: 0.2362\n", "Epoch: 036/050 | Batch 600/891 | Cost: 0.0272\n", "Epoch: 036/050 | Batch 650/891 | Cost: 0.1204\n", "Epoch: 036/050 | Batch 700/891 | Cost: 0.1232\n", "Epoch: 036/050 | Batch 750/891 | Cost: 0.0554\n", "Epoch: 036/050 | Batch 800/891 | Cost: 0.1261\n", "Epoch: 036/050 | Batch 850/891 | Cost: 0.1711\n", "training accuracy: 97.61%\n", "valid accuracy: 91.78%\n", "Time elapsed: 18.79 min\n", "Epoch: 037/050 | Batch 000/891 | Cost: 0.2958\n", "Epoch: 037/050 | Batch 050/891 | Cost: 0.1589\n", 
"Epoch: 037/050 | Batch 100/891 | Cost: 0.1260\n", "Epoch: 037/050 | Batch 150/891 | Cost: 0.1790\n", "Epoch: 037/050 | Batch 200/891 | Cost: 0.1086\n", "Epoch: 037/050 | Batch 250/891 | Cost: 0.1195\n", "Epoch: 037/050 | Batch 300/891 | Cost: 0.0967\n", "Epoch: 037/050 | Batch 350/891 | Cost: 0.1505\n", "Epoch: 037/050 | Batch 400/891 | Cost: 0.1043\n", "Epoch: 037/050 | Batch 450/891 | Cost: 0.0591\n", "Epoch: 037/050 | Batch 500/891 | Cost: 0.1217\n", "Epoch: 037/050 | Batch 550/891 | Cost: 0.1842\n", "Epoch: 037/050 | Batch 600/891 | Cost: 0.1192\n", "Epoch: 037/050 | Batch 650/891 | Cost: 0.1334\n", "Epoch: 037/050 | Batch 700/891 | Cost: 0.1788\n", "Epoch: 037/050 | Batch 750/891 | Cost: 0.0667\n", "Epoch: 037/050 | Batch 800/891 | Cost: 0.3219\n", "Epoch: 037/050 | Batch 850/891 | Cost: 0.1975\n", "training accuracy: 97.71%\n", "valid accuracy: 91.72%\n", "Time elapsed: 19.32 min\n", "Epoch: 038/050 | Batch 000/891 | Cost: 0.1844\n", "Epoch: 038/050 | Batch 050/891 | Cost: 0.1545\n", "Epoch: 038/050 | Batch 100/891 | Cost: 0.1334\n", "Epoch: 038/050 | Batch 150/891 | Cost: 0.1063\n", "Epoch: 038/050 | Batch 200/891 | Cost: 0.2812\n", "Epoch: 038/050 | Batch 250/891 | Cost: 0.0981\n", "Epoch: 038/050 | Batch 300/891 | Cost: 0.1523\n", "Epoch: 038/050 | Batch 350/891 | Cost: 0.2879\n", "Epoch: 038/050 | Batch 400/891 | Cost: 0.2729\n", "Epoch: 038/050 | Batch 450/891 | Cost: 0.0612\n", "Epoch: 038/050 | Batch 500/891 | Cost: 0.1598\n", "Epoch: 038/050 | Batch 550/891 | Cost: 0.0723\n", "Epoch: 038/050 | Batch 600/891 | Cost: 0.2697\n", "Epoch: 038/050 | Batch 650/891 | Cost: 0.1282\n", "Epoch: 038/050 | Batch 700/891 | Cost: 0.1593\n", "Epoch: 038/050 | Batch 750/891 | Cost: 0.0659\n", "Epoch: 038/050 | Batch 800/891 | Cost: 0.1573\n", "Epoch: 038/050 | Batch 850/891 | Cost: 0.1656\n", "training accuracy: 97.69%\n", "valid accuracy: 91.58%\n", "Time elapsed: 19.84 min\n", "Epoch: 039/050 | Batch 000/891 | Cost: 0.1314\n", "Epoch: 039/050 | Batch 050/891 | Cost: 0.1625\n", "Epoch: 039/050 | Batch 100/891 | Cost: 0.0831\n", "Epoch: 039/050 | Batch 150/891 | Cost: 0.1587\n", "Epoch: 039/050 | Batch 200/891 | Cost: 0.1787\n", "Epoch: 039/050 | Batch 250/891 | Cost: 0.1757\n", "Epoch: 039/050 | Batch 300/891 | Cost: 0.1766\n", "Epoch: 039/050 | Batch 350/891 | Cost: 0.0869\n", "Epoch: 039/050 | Batch 400/891 | Cost: 0.1955\n", "Epoch: 039/050 | Batch 450/891 | Cost: 0.1461\n", "Epoch: 039/050 | Batch 500/891 | Cost: 0.1332\n", "Epoch: 039/050 | Batch 550/891 | Cost: 0.1721\n", "Epoch: 039/050 | Batch 600/891 | Cost: 0.1060\n", "Epoch: 039/050 | Batch 650/891 | Cost: 0.1121\n", "Epoch: 039/050 | Batch 700/891 | Cost: 0.0702\n", "Epoch: 039/050 | Batch 750/891 | Cost: 0.1067\n", "Epoch: 039/050 | Batch 800/891 | Cost: 0.1447\n", "Epoch: 039/050 | Batch 850/891 | Cost: 0.4161\n", "training accuracy: 97.93%\n", "valid accuracy: 91.78%\n", "Time elapsed: 20.36 min\n", "Epoch: 040/050 | Batch 000/891 | Cost: 0.2074\n", "Epoch: 040/050 | Batch 050/891 | Cost: 0.1328\n", "Epoch: 040/050 | Batch 100/891 | Cost: 0.4158\n", "Epoch: 040/050 | Batch 150/891 | Cost: 0.1248\n", "Epoch: 040/050 | Batch 200/891 | Cost: 0.1959\n", "Epoch: 040/050 | Batch 250/891 | Cost: 0.0962\n", "Epoch: 040/050 | Batch 300/891 | Cost: 0.1825\n", "Epoch: 040/050 | Batch 350/891 | Cost: 0.1554\n", "Epoch: 040/050 | Batch 400/891 | Cost: 0.1273\n", "Epoch: 040/050 | Batch 450/891 | Cost: 0.1137\n", "Epoch: 040/050 | Batch 500/891 | Cost: 0.1901\n", "Epoch: 040/050 | Batch 550/891 | Cost: 0.0814\n", "Epoch: 040/050 | Batch 
600/891 | Cost: 0.1345\n", "Epoch: 040/050 | Batch 650/891 | Cost: 0.2639\n", "Epoch: 040/050 | Batch 700/891 | Cost: 0.1025\n", "Epoch: 040/050 | Batch 750/891 | Cost: 0.1327\n", "Epoch: 040/050 | Batch 800/891 | Cost: 0.1714\n", "Epoch: 040/050 | Batch 850/891 | Cost: 0.1343\n", "training accuracy: 97.81%\n", "valid accuracy: 92.17%\n", "Time elapsed: 20.89 min\n", "Epoch: 041/050 | Batch 000/891 | Cost: 0.1353\n", "Epoch: 041/050 | Batch 050/891 | Cost: 0.1946\n", "Epoch: 041/050 | Batch 100/891 | Cost: 0.0811\n", "Epoch: 041/050 | Batch 150/891 | Cost: 0.1745\n", "Epoch: 041/050 | Batch 200/891 | Cost: 0.1002\n", "Epoch: 041/050 | Batch 250/891 | Cost: 0.1357\n", "Epoch: 041/050 | Batch 300/891 | Cost: 0.1622\n", "Epoch: 041/050 | Batch 350/891 | Cost: 0.2214\n", "Epoch: 041/050 | Batch 400/891 | Cost: 0.1607\n", "Epoch: 041/050 | Batch 450/891 | Cost: 0.1431\n", "Epoch: 041/050 | Batch 500/891 | Cost: 0.2578\n", "Epoch: 041/050 | Batch 550/891 | Cost: 0.1356\n", "Epoch: 041/050 | Batch 600/891 | Cost: 0.1696\n", "Epoch: 041/050 | Batch 650/891 | Cost: 0.1122\n", "Epoch: 041/050 | Batch 700/891 | Cost: 0.0957\n", "Epoch: 041/050 | Batch 750/891 | Cost: 0.0836\n", "Epoch: 041/050 | Batch 800/891 | Cost: 0.1506\n", "Epoch: 041/050 | Batch 850/891 | Cost: 0.0962\n", "training accuracy: 97.78%\n", "valid accuracy: 91.58%\n", "Time elapsed: 21.41 min\n", "Epoch: 042/050 | Batch 000/891 | Cost: 0.2229\n", "Epoch: 042/050 | Batch 050/891 | Cost: 0.1423\n", "Epoch: 042/050 | Batch 100/891 | Cost: 0.1003\n", "Epoch: 042/050 | Batch 150/891 | Cost: 0.0959\n", "Epoch: 042/050 | Batch 200/891 | Cost: 0.1080\n", "Epoch: 042/050 | Batch 250/891 | Cost: 0.1520\n", "Epoch: 042/050 | Batch 300/891 | Cost: 0.0732\n", "Epoch: 042/050 | Batch 350/891 | Cost: 0.1583\n", "Epoch: 042/050 | Batch 400/891 | Cost: 0.1231\n", "Epoch: 042/050 | Batch 450/891 | Cost: 0.2447\n", "Epoch: 042/050 | Batch 500/891 | Cost: 0.0683\n", "Epoch: 042/050 | Batch 550/891 | Cost: 0.1204\n", "Epoch: 042/050 | Batch 600/891 | Cost: 0.1543\n", "Epoch: 042/050 | Batch 650/891 | Cost: 0.1600\n", "Epoch: 042/050 | Batch 700/891 | Cost: 0.0901\n", "Epoch: 042/050 | Batch 750/891 | Cost: 0.1604\n", "Epoch: 042/050 | Batch 800/891 | Cost: 0.1715\n", "Epoch: 042/050 | Batch 850/891 | Cost: 0.2226\n", "training accuracy: 97.77%\n", "valid accuracy: 91.15%\n", "Time elapsed: 21.94 min\n", "Epoch: 043/050 | Batch 000/891 | Cost: 0.1232\n", "Epoch: 043/050 | Batch 050/891 | Cost: 0.1437\n", "Epoch: 043/050 | Batch 100/891 | Cost: 0.0858\n", "Epoch: 043/050 | Batch 150/891 | Cost: 0.1087\n", "Epoch: 043/050 | Batch 200/891 | Cost: 0.0706\n", "Epoch: 043/050 | Batch 250/891 | Cost: 0.1048\n", "Epoch: 043/050 | Batch 300/891 | Cost: 0.1699\n", "Epoch: 043/050 | Batch 350/891 | Cost: 0.1475\n", "Epoch: 043/050 | Batch 400/891 | Cost: 0.2350\n", "Epoch: 043/050 | Batch 450/891 | Cost: 0.1415\n", "Epoch: 043/050 | Batch 500/891 | Cost: 0.1563\n", "Epoch: 043/050 | Batch 550/891 | Cost: 0.2188\n", "Epoch: 043/050 | Batch 600/891 | Cost: 0.1957\n", "Epoch: 043/050 | Batch 650/891 | Cost: 0.1960\n", "Epoch: 043/050 | Batch 700/891 | Cost: 0.2074\n", "Epoch: 043/050 | Batch 750/891 | Cost: 0.2902\n", "Epoch: 043/050 | Batch 800/891 | Cost: 0.1978\n", "Epoch: 043/050 | Batch 850/891 | Cost: 0.0669\n", "training accuracy: 97.89%\n", "valid accuracy: 91.38%\n", "Time elapsed: 22.46 min\n", "Epoch: 044/050 | Batch 000/891 | Cost: 0.2068\n", "Epoch: 044/050 | Batch 050/891 | Cost: 0.1964\n", "Epoch: 044/050 | Batch 100/891 | Cost: 0.1017\n", "Epoch: 
044/050 | Batch 150/891 | Cost: 0.0945\n", "Epoch: 044/050 | Batch 200/891 | Cost: 0.1398\n", "Epoch: 044/050 | Batch 250/891 | Cost: 0.1392\n", "Epoch: 044/050 | Batch 300/891 | Cost: 0.1261\n", "Epoch: 044/050 | Batch 350/891 | Cost: 0.2008\n", "Epoch: 044/050 | Batch 400/891 | Cost: 0.2173\n", "Epoch: 044/050 | Batch 450/891 | Cost: 0.0855\n", "Epoch: 044/050 | Batch 500/891 | Cost: 0.0770\n", "Epoch: 044/050 | Batch 550/891 | Cost: 0.1380\n", "Epoch: 044/050 | Batch 600/891 | Cost: 0.3052\n", "Epoch: 044/050 | Batch 650/891 | Cost: 0.0486\n", "Epoch: 044/050 | Batch 700/891 | Cost: 0.1263\n", "Epoch: 044/050 | Batch 750/891 | Cost: 0.1256\n", "Epoch: 044/050 | Batch 800/891 | Cost: 0.1150\n", "Epoch: 044/050 | Batch 850/891 | Cost: 0.0973\n", "training accuracy: 97.76%\n", "valid accuracy: 91.35%\n", "Time elapsed: 22.98 min\n", "Epoch: 045/050 | Batch 000/891 | Cost: 0.1594\n", "Epoch: 045/050 | Batch 050/891 | Cost: 0.1549\n", "Epoch: 045/050 | Batch 100/891 | Cost: 0.0711\n", "Epoch: 045/050 | Batch 150/891 | Cost: 0.1032\n", "Epoch: 045/050 | Batch 200/891 | Cost: 0.0720\n", "Epoch: 045/050 | Batch 250/891 | Cost: 0.1090\n", "Epoch: 045/050 | Batch 300/891 | Cost: 0.0773\n", "Epoch: 045/050 | Batch 350/891 | Cost: 0.0606\n", "Epoch: 045/050 | Batch 400/891 | Cost: 0.0950\n", "Epoch: 045/050 | Batch 450/891 | Cost: 0.1379\n", "Epoch: 045/050 | Batch 500/891 | Cost: 0.0536\n", "Epoch: 045/050 | Batch 550/891 | Cost: 0.1675\n", "Epoch: 045/050 | Batch 600/891 | Cost: 0.0619\n", "Epoch: 045/050 | Batch 650/891 | Cost: 0.1666\n", "Epoch: 045/050 | Batch 700/891 | Cost: 0.1070\n", "Epoch: 045/050 | Batch 750/891 | Cost: 0.1447\n", "Epoch: 045/050 | Batch 800/891 | Cost: 0.1363\n", "Epoch: 045/050 | Batch 850/891 | Cost: 0.1717\n", "training accuracy: 98.00%\n", "valid accuracy: 91.67%\n", "Time elapsed: 23.50 min\n", "Epoch: 046/050 | Batch 000/891 | Cost: 0.0955\n", "Epoch: 046/050 | Batch 050/891 | Cost: 0.0806\n", "Epoch: 046/050 | Batch 100/891 | Cost: 0.0657\n", "Epoch: 046/050 | Batch 150/891 | Cost: 0.2222\n", "Epoch: 046/050 | Batch 200/891 | Cost: 0.0978\n", "Epoch: 046/050 | Batch 250/891 | Cost: 0.0767\n", "Epoch: 046/050 | Batch 300/891 | Cost: 0.1464\n", "Epoch: 046/050 | Batch 350/891 | Cost: 0.1771\n", "Epoch: 046/050 | Batch 400/891 | Cost: 0.2743\n", "Epoch: 046/050 | Batch 450/891 | Cost: 0.1303\n", "Epoch: 046/050 | Batch 500/891 | Cost: 0.2106\n", "Epoch: 046/050 | Batch 550/891 | Cost: 0.0764\n", "Epoch: 046/050 | Batch 600/891 | Cost: 0.0796\n", "Epoch: 046/050 | Batch 650/891 | Cost: 0.0901\n", "Epoch: 046/050 | Batch 700/891 | Cost: 0.2567\n", "Epoch: 046/050 | Batch 750/891 | Cost: 0.1266\n", "Epoch: 046/050 | Batch 800/891 | Cost: 0.0914\n", "Epoch: 046/050 | Batch 850/891 | Cost: 0.1228\n", "training accuracy: 97.94%\n", "valid accuracy: 91.57%\n", "Time elapsed: 24.02 min\n", "Epoch: 047/050 | Batch 000/891 | Cost: 0.0675\n", "Epoch: 047/050 | Batch 050/891 | Cost: 0.1272\n", "Epoch: 047/050 | Batch 100/891 | Cost: 0.1254\n", "Epoch: 047/050 | Batch 150/891 | Cost: 0.1105\n", "Epoch: 047/050 | Batch 200/891 | Cost: 0.1292\n", "Epoch: 047/050 | Batch 250/891 | Cost: 0.1707\n", "Epoch: 047/050 | Batch 300/891 | Cost: 0.2328\n", "Epoch: 047/050 | Batch 350/891 | Cost: 0.2123\n", "Epoch: 047/050 | Batch 400/891 | Cost: 0.0974\n", "Epoch: 047/050 | Batch 450/891 | Cost: 0.1456\n", "Epoch: 047/050 | Batch 500/891 | Cost: 0.1195\n", "Epoch: 047/050 | Batch 550/891 | Cost: 0.1078\n", "Epoch: 047/050 | Batch 600/891 | Cost: 0.1064\n", "Epoch: 047/050 | Batch 650/891 | 
Cost: 0.0680\n", "Epoch: 047/050 | Batch 700/891 | Cost: 0.0793\n", "Epoch: 047/050 | Batch 750/891 | Cost: 0.1284\n", "Epoch: 047/050 | Batch 800/891 | Cost: 0.1557\n", "Epoch: 047/050 | Batch 850/891 | Cost: 0.1397\n", "training accuracy: 97.87%\n", "valid accuracy: 91.40%\n", "Time elapsed: 24.55 min\n", "Epoch: 048/050 | Batch 000/891 | Cost: 0.0918\n", "Epoch: 048/050 | Batch 050/891 | Cost: 0.1379\n", "Epoch: 048/050 | Batch 100/891 | Cost: 0.2946\n", "Epoch: 048/050 | Batch 150/891 | Cost: 0.1350\n", "Epoch: 048/050 | Batch 200/891 | Cost: 0.1663\n", "Epoch: 048/050 | Batch 250/891 | Cost: 0.0810\n", "Epoch: 048/050 | Batch 300/891 | Cost: 0.1619\n", "Epoch: 048/050 | Batch 350/891 | Cost: 0.0793\n", "Epoch: 048/050 | Batch 400/891 | Cost: 0.0792\n", "Epoch: 048/050 | Batch 450/891 | Cost: 0.1441\n", "Epoch: 048/050 | Batch 500/891 | Cost: 0.3115\n", "Epoch: 048/050 | Batch 550/891 | Cost: 0.0545\n", "Epoch: 048/050 | Batch 600/891 | Cost: 0.0591\n", "Epoch: 048/050 | Batch 650/891 | Cost: 0.0831\n", "Epoch: 048/050 | Batch 700/891 | Cost: 0.1871\n", "Epoch: 048/050 | Batch 750/891 | Cost: 0.0829\n", "Epoch: 048/050 | Batch 800/891 | Cost: 0.2762\n", "Epoch: 048/050 | Batch 850/891 | Cost: 0.1183\n", "training accuracy: 98.02%\n", "valid accuracy: 91.68%\n", "Time elapsed: 25.07 min\n", "Epoch: 049/050 | Batch 000/891 | Cost: 0.0937\n", "Epoch: 049/050 | Batch 050/891 | Cost: 0.0760\n", "Epoch: 049/050 | Batch 100/891 | Cost: 0.1527\n", "Epoch: 049/050 | Batch 150/891 | Cost: 0.2894\n", "Epoch: 049/050 | Batch 200/891 | Cost: 0.0581\n", "Epoch: 049/050 | Batch 250/891 | Cost: 0.1349\n", "Epoch: 049/050 | Batch 300/891 | Cost: 0.0351\n", "Epoch: 049/050 | Batch 350/891 | Cost: 0.2301\n", "Epoch: 049/050 | Batch 400/891 | Cost: 0.0575\n", "Epoch: 049/050 | Batch 450/891 | Cost: 0.1455\n", "Epoch: 049/050 | Batch 500/891 | Cost: 0.1668\n", "Epoch: 049/050 | Batch 550/891 | Cost: 0.2178\n", "Epoch: 049/050 | Batch 600/891 | Cost: 0.1040\n", "Epoch: 049/050 | Batch 650/891 | Cost: 0.0888\n", "Epoch: 049/050 | Batch 700/891 | Cost: 0.0934\n", "Epoch: 049/050 | Batch 750/891 | Cost: 0.2147\n", "Epoch: 049/050 | Batch 800/891 | Cost: 0.0826\n", "Epoch: 049/050 | Batch 850/891 | Cost: 0.0803\n", "training accuracy: 98.09%\n", "valid accuracy: 91.72%\n", "Time elapsed: 25.59 min\n", "Epoch: 050/050 | Batch 000/891 | Cost: 0.0851\n", "Epoch: 050/050 | Batch 050/891 | Cost: 0.0672\n", "Epoch: 050/050 | Batch 100/891 | Cost: 0.1876\n", "Epoch: 050/050 | Batch 150/891 | Cost: 0.1164\n", "Epoch: 050/050 | Batch 200/891 | Cost: 0.0853\n", "Epoch: 050/050 | Batch 250/891 | Cost: 0.1113\n", "Epoch: 050/050 | Batch 300/891 | Cost: 0.1476\n", "Epoch: 050/050 | Batch 350/891 | Cost: 0.2833\n", "Epoch: 050/050 | Batch 400/891 | Cost: 0.0722\n", "Epoch: 050/050 | Batch 450/891 | Cost: 0.1272\n", "Epoch: 050/050 | Batch 500/891 | Cost: 0.0763\n", "Epoch: 050/050 | Batch 550/891 | Cost: 0.1446\n", "Epoch: 050/050 | Batch 600/891 | Cost: 0.1152\n", "Epoch: 050/050 | Batch 650/891 | Cost: 0.2281\n", "Epoch: 050/050 | Batch 700/891 | Cost: 0.2060\n", "Epoch: 050/050 | Batch 750/891 | Cost: 0.1476\n", "Epoch: 050/050 | Batch 800/891 | Cost: 0.0931\n", "Epoch: 050/050 | Batch 850/891 | Cost: 0.0703\n", "training accuracy: 98.20%\n", "valid accuracy: 91.88%\n", "Time elapsed: 26.11 min\n", "Total Training Time: 26.11 min\n", "Test accuracy: 90.87%\n" ] } ], "source": [
    "start_time = time.time()\n",
    "\n",
    "for epoch in range(NUM_EPOCHS):\n",
    "    model.train()\n",
    "    for batch_idx, batch_data in enumerate(train_loader):\n",
    "\n",
    "        text, text_lengths = batch_data.content\n",
    "\n",
    "        ### FORWARD AND BACK PROP\n",
    "        logits = model(text, text_lengths)\n",
    "        cost = F.cross_entropy(logits, batch_data.classlabel.long())\n",
    "        optimizer.zero_grad()\n",
    "\n",
    "        cost.backward()\n",
    "\n",
    "        ### UPDATE MODEL PARAMETERS\n",
    "        optimizer.step()\n",
    "\n",
    "        ### LOGGING\n",
    "        if not batch_idx % 50:\n",
    "            print(f'Epoch: {epoch+1:03d}/{NUM_EPOCHS:03d} | '\n",
    "                  f'Batch {batch_idx:03d}/{len(train_loader):03d} | '\n",
    "                  f'Cost: {cost:.4f}')\n",
    "\n",
    "    with torch.set_grad_enabled(False):\n",
    "        print(f'training accuracy: '\n",
    "              f'{compute_accuracy(model, train_loader, DEVICE):.2f}%'\n",
    "              f'\\nvalid accuracy: '\n",
    "              f'{compute_accuracy(model, valid_loader, DEVICE):.2f}%')\n",
    "\n",
    "    print(f'Time elapsed: {(time.time() - start_time)/60:.2f} min')\n",
    "\n",
    "print(f'Total Training Time: {(time.time() - start_time)/60:.2f} min')\n",
    "print(f'Test accuracy: {compute_accuracy(model, test_loader, DEVICE):.2f}%')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Evaluation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Evaluating the model on some new text collected from recent news articles; these examples are not part of the training or test sets." ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "colab": {}, "colab_type": "code", "id": "jt55pscgFdKZ" }, "outputs": [], "source": [
    "import spacy\n",
    "nlp = spacy.load('en')\n",
    "\n",
    "\n",
    "map_dictionary = {\n",
    "    0: \"World\",\n",
    "    1: \"Sports\",\n",
    "    2: \"Business\",\n",
    "    3: \"Sci/Tech\",\n",
    "}\n",
    "\n",
    "\n",
    "def predict_class(model, sentence, min_len=4):\n",
    "    # Somewhat based on\n",
    "    # https://github.com/bentrevett/pytorch-sentiment-analysis/\n",
    "    # blob/master/5%20-%20Multi-class%20Sentiment%20Analysis.ipynb\n",
    "    model.eval()\n",
    "    tokenized = [tok.text for tok in nlp.tokenizer(sentence)]\n",
    "    if len(tokenized) < min_len:\n",
    "        tokenized += [''] * (min_len - len(tokenized))\n",
    "    indexed = [TEXT.vocab.stoi[t] for t in tokenized]\n",
    "    length = [len(indexed)]\n",
    "    tensor = torch.LongTensor(indexed).to(DEVICE)\n",
    "    tensor = tensor.unsqueeze(1)\n",
    "    length_tensor = torch.LongTensor(length)\n",
    "    preds = model(tensor, length_tensor)\n",
    "    preds = torch.softmax(preds, dim=1)\n",
    "\n",
    "    proba, class_label = preds.max(dim=1)\n",
    "    return proba.item(), class_label.item()" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Class Label: 2 -> Business\n", "Probability: 0.6041601896286011\n" ] } ], "source": [
    "text = \"\"\"\n",
    "The windfall follows a tender offer by Z Holdings, which is controlled by SoftBank’s domestic wireless unit, \n",
    "for half of Zozo’s shares this month.\n",
    "\"\"\"\n",
    "\n",
    "proba, pred_label = predict_class(model, text)\n",
    "\n",
    "print(f'Class Label: {pred_label} -> {map_dictionary[pred_label]}')\n",
    "print(f'Probability: {proba}')" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Class Label: 0 -> World\n", "Probability: 0.932104229927063\n" ] } ], "source": [
    "text = \"\"\"\n",
    "EU data regulator issues first-ever sanction of an EU institution, \n",
    "against the European parliament over its use of US-based NationBuilder to process voter data \n",
    "\"\"\"\n",
    "\n",
    "proba, pred_label = predict_class(model, text)\n",
    "\n",
    "print(f'Class Label: {pred_label} -> {map_dictionary[pred_label]}')\n",
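    "print(f'Probability: {proba}')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`predict_class` above returns only the most likely class and its probability. Below is a minimal sketch of a hypothetical helper, `predict_all_probas` (not part of the original notebook), that returns the full softmax distribution over all four classes, which can be handy for inspecting borderline predictions. It assumes the `model`, `TEXT`, `nlp`, `DEVICE`, and `map_dictionary` objects defined above." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [
    "def predict_all_probas(model, sentence, min_len=4):\n",
    "    # Hypothetical helper (a sketch, not part of the original notebook):\n",
    "    # same preprocessing as predict_class, but returns the softmax\n",
    "    # probability of every class instead of only the most likely one.\n",
    "    model.eval()\n",
    "    tokenized = [tok.text for tok in nlp.tokenizer(sentence)]\n",
    "    if len(tokenized) < min_len:\n",
    "        tokenized += [''] * (min_len - len(tokenized))\n",
    "    indexed = [TEXT.vocab.stoi[t] for t in tokenized]\n",
    "    tensor = torch.LongTensor(indexed).unsqueeze(1).to(DEVICE)\n",
    "    length_tensor = torch.LongTensor([len(indexed)])\n",
    "    with torch.no_grad():\n",
    "        probas = torch.softmax(model(tensor, length_tensor), dim=1)\n",
    "    return {map_dictionary[i]: p.item()\n",
    "            for i, p in enumerate(probas.squeeze(0))}\n",
    "\n",
    "\n",
    "# Example usage (after training):\n",
    "# predict_all_probas(model, text)" ] },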
"print(f'Probability: {proba}')" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Class Label: 3 -> Sci/Tech\n", "Probability: 0.5513855814933777\n" ] } ], "source": [ "text = \"\"\"\n", "LG announces CEO Jo Seong-jin will be replaced by Brian Kwon Dec. 1, amid 2020 \n", "leadership shakeup and LG smartphone division's 18th straight quarterly loss\n", "\"\"\"\n", "\n", "proba, pred_label = predict_class(model, text)\n", "\n", "print(f'Class Label: {pred_label} -> {map_dictionary[pred_label]}')\n", "print(f'Probability: {proba}')" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "colab": {}, "colab_type": "code", "id": "7lRusB3dF80X" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "numpy 1.17.4\n", "torchtext 0.4.0\n", "torch 1.4.0\n", "pandas 0.24.2\n", "spacy 2.2.3\n", "\n" ] } ], "source": [ "%watermark -iv" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "colab": { "collapsed_sections": [], "name": "rnn_lstm_packed_imdb.ipynb", "provenance": [], "version": "0.3.2" }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 4 }