{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "QU13Xg9n0bSM" }, "source": [ "# Retail Product Recommendations using word2vec\n", "> Building a system that automatically recommends a set number of products to consumers on an e-commerce website, based on their past purchase behavior.\n", "\n", "- toc: true\n", "- badges: true\n", "- comments: true\n", "- categories: [sequence, retail]\n", "- image: " ] }, { "cell_type": "markdown", "metadata": { "id": "VwgbuHwVtshy" }, "source": [ "A person involved in sports-related activities might have an online buying pattern similar to this:" ] }, { "cell_type": "markdown", "metadata": { "id": "NQLbbKfmtn_O" }, "source": [ "" ] }, { "cell_type": "markdown", "metadata": { "id": "2tYh4SrWtynJ" }, "source": [ "If we can represent each of these products as a vector, we can find similar products by comparing their vectors. So, when a user is looking at a product online, we can recommend similar products by ranking the other products by their vector similarity score." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": { "executionInfo": { "elapsed": 1427, "status": "ok", "timestamp": 1619252390068, "user": { "displayName": "sparsh agarwal", "photoUrl": "", "userId": "00322518567794762549" }, "user_tz": -330 }, "id": "W0wI5j74nc9W" }, "outputs": [], "source": [ "#hide\n", "import pandas as pd\n", "import numpy as np\n", "import random\n", "from tqdm import tqdm\n", "from gensim.models import Word2Vec \n", "import matplotlib.pyplot as plt\n", "%matplotlib inline\n", "\n", "import warnings;\n", "warnings.filterwarnings('ignore')" ] }, { "cell_type": "markdown", "metadata": { "id": "cDi5Gu8Ou7bb" }, "source": [ "## Data gathering and understanding" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "executionInfo": { "elapsed": 3274, "status": "ok", "timestamp": 1619252391927, "user": { "displayName": "sparsh agarwal", "photoUrl": "", "userId": "00322518567794762549" }, "user_tz": -330 }, "id": "TlW0pTzGniGo", "outputId": "0392f779-4c42-4e9e-b096-51ee6487b53d" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "--2021-04-24 08:19:50-- https://archive.ics.uci.edu/ml/machine-learning-databases/00352/Online%20Retail.xlsx\n", "Resolving archive.ics.uci.edu (archive.ics.uci.edu)... 128.195.10.252\n", "Connecting to archive.ics.uci.edu (archive.ics.uci.edu)|128.195.10.252|:443... connected.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: 23715344 (23M) [application/x-httpd-php]\n", "Saving to: ‘Online Retail.xlsx’\n", "\n", "Online Retail.xlsx 100%[===================>] 22.62M 22.7MB/s in 1.0s \n", "\n", "2021-04-24 08:19:51 (22.7 MB/s) - ‘Online Retail.xlsx’ saved [23715344/23715344]\n", "\n" ] } ], "source": [ "#hide-output\n", "!wget https://archive.ics.uci.edu/ml/machine-learning-databases/00352/Online%20Retail.xlsx" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 204 }, "executionInfo": { "elapsed": 45709, "status": "ok", "timestamp": 1619252434373, "user": { "displayName": "sparsh agarwal", "photoUrl": "", "userId": "00322518567794762549" }, "user_tz": -330 }, "id": "z_Kt8wtZnjRm", "outputId": "e30cdfd6-fd16-48ab-fa22-283fdb3d2578" }, "outputs": [ { "data": { "text/html": [ "
|   | InvoiceNo | StockCode | Description | Quantity | InvoiceDate | UnitPrice | CustomerID | Country |
|---|---|---|---|---|---|---|---|---|
| 0 | 536365 | 85123A | WHITE HANGING HEART T-LIGHT HOLDER | 6 | 2010-12-01 08:26:00 | 2.55 | 17850.0 | United Kingdom |
| 1 | 536365 | 71053 | WHITE METAL LANTERN | 6 | 2010-12-01 08:26:00 | 3.39 | 17850.0 | United Kingdom |
| 2 | 536365 | 84406B | CREAM CUPID HEARTS COAT HANGER | 8 | 2010-12-01 08:26:00 | 2.75 | 17850.0 | United Kingdom |
| 3 | 536365 | 84029G | KNITTED UNION FLAG HOT WATER BOTTLE | 6 | 2010-12-01 08:26:00 | 3.39 | 17850.0 | United Kingdom |
| 4 | 536365 | 84029E | RED WOOLLY HOTTIE WHITE HEART. | 6 | 2010-12-01 08:26:00 | 3.39 | 17850.0 | United Kingdom |