{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Product Range Analysis " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Introduction" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In 2019, there were about 1.92 billion digital buyers worldwide, and e-commerce is growing rapidly. As it continues to thrive, traditional brick-and-mortar establishments are digitalizing their business models to keep up with or beat the competition, which has intensified competition across the e-commerce industry. Enhancing marketing, optimizing prices, and understanding customer expectations all demand analytics. As a junior analyst at an online store that sells unique all-occasion giftware, my task is to analyze the store's product range between 29/11/2018 and 07/12/2019. " ] }, { "cell_type": "markdown", "metadata": { "toc": true }, "source": [ "

Table of Contents

\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 1: Data Preprocessing" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this step I import the libraries and the data, then preprocess the data:\n", "- automate the data import and preprocessing in a single function\n", "- drop duplicates, convert the date column to the right data type, and create new columns\n", "- drop rows whose description is a charge rather than an actual product\n", "- keep only rows with a positive unit price" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Data size: 531240\n", "\n", "First 5 rows of data:\n" ] }, { "data": { "text/html": [ "
<table>\n",
"<thead>\n",
"<tr><th></th><th>invoiceno</th><th>stockcode</th><th>description</th><th>quantity</th><th>invoicedate</th><th>unitprice</th><th>customerid</th><th>date</th><th>revenue</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>536365</td><td>85123A</td><td>white hanging heart t-light holder</td><td>6</td><td>2018-11-29 08:26:00</td><td>2.55</td><td>17850.0</td><td>2018-11-29</td><td>15.30</td></tr>\n",
"<tr><th>1</th><td>536365</td><td>71053</td><td>white metal lantern</td><td>6</td><td>2018-11-29 08:26:00</td><td>3.39</td><td>17850.0</td><td>2018-11-29</td><td>20.34</td></tr>\n",
"<tr><th>2</th><td>536365</td><td>84406B</td><td>cream cupid hearts coat hanger</td><td>8</td><td>2018-11-29 08:26:00</td><td>2.75</td><td>17850.0</td><td>2018-11-29</td><td>22.00</td></tr>\n",
"<tr><th>3</th><td>536365</td><td>84029G</td><td>knitted union flag hot water bottle</td><td>6</td><td>2018-11-29 08:26:00</td><td>3.39</td><td>17850.0</td><td>2018-11-29</td><td>20.34</td></tr>\n",
"<tr><th>4</th><td>536365</td><td>84029E</td><td>red woolly hottie white heart.</td><td>6</td><td>2018-11-29 08:26:00</td><td>3.39</td><td>17850.0</td><td>2018-11-29</td><td>20.34</td></tr>\n",
"</tbody>\n",
"</table>\n", "
" ], "text/plain": [ " invoiceno stockcode description quantity \\\n", "0 536365 85123A white hanging heart t-light holder 6 \n", "1 536365 71053 white metal lantern 6 \n", "2 536365 84406B cream cupid hearts coat hanger 8 \n", "3 536365 84029G knitted union flag hot water bottle 6 \n", "4 536365 84029E red woolly hottie white heart. 6 \n", "\n", " invoicedate unitprice customerid date revenue \n", "0 2018-11-29 08:26:00 2.55 17850.0 2018-11-29 15.30 \n", "1 2018-11-29 08:26:00 3.39 17850.0 2018-11-29 20.34 \n", "2 2018-11-29 08:26:00 2.75 17850.0 2018-11-29 22.00 \n", "3 2018-11-29 08:26:00 3.39 17850.0 2018-11-29 20.34 \n", "4 2018-11-29 08:26:00 3.39 17850.0 2018-11-29 20.34 " ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "Data Description\n" ] }, { "data": { "text/html": [ "
<table>\n",
"<thead>\n",
"<tr><th></th><th>count</th><th>unique</th><th>top</th><th>freq</th><th>first</th><th>last</th><th>mean</th><th>std</th><th>min</th><th>25%</th><th>50%</th><th>75%</th><th>max</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>invoiceno</th><td>531240</td><td>23196</td><td>573585</td><td>1113</td><td>NaT</td><td>NaT</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>stockcode</th><td>531240</td><td>3928</td><td>85123A</td><td>2295</td><td>NaT</td><td>NaT</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>description</th><td>531240</td><td>4033</td><td>white hanging heart t-light holder</td><td>2353</td><td>NaT</td><td>NaT</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>quantity</th><td>531240.0</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaT</td><td>NaT</td><td>9.960477</td><td>217.000645</td><td>-80995.0</td><td>1.0</td><td>3.0</td><td>10.0</td><td>80995.0</td></tr>\n",
"<tr><th>invoicedate</th><td>531240</td><td>21308</td><td>2019-10-29 14:41:00</td><td>1113</td><td>2018-11-29 08:26:00</td><td>2019-12-07 12:50:00</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>unitprice</th><td>531240.0</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaT</td><td>NaT</td><td>3.314484</td><td>15.830368</td><td>0.001</td><td>1.25</td><td>2.08</td><td>4.13</td><td>11062.06</td></tr>\n",
"<tr><th>customerid</th><td>399659.0</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaT</td><td>NaT</td><td>15288.781719</td><td>1710.786594</td><td>12346.0</td><td>13959.0</td><td>15152.0</td><td>16791.0</td><td>18287.0</td></tr>\n",
"<tr><th>date</th><td>531240</td><td>305</td><td>2019-12-03 00:00:00</td><td>5265</td><td>2018-11-29 00:00:00</td><td>2019-12-07 00:00:00</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>revenue</th><td>531240.0</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaT</td><td>NaT</td><td>18.41553</td><td>370.054268</td><td>-168469.6</td><td>3.75</td><td>9.9</td><td>17.4</td><td>168469.6</td></tr>\n",
"</tbody>\n",
"</table>\n", "
" ], "text/plain": [ " count unique top freq \\\n", "invoiceno 531240 23196 573585 1113 \n", "stockcode 531240 3928 85123A 2295 \n", "description 531240 4033 white hanging heart t-light holder 2353 \n", "quantity 531240.0 NaN NaN NaN \n", "invoicedate 531240 21308 2019-10-29 14:41:00 1113 \n", "unitprice 531240.0 NaN NaN NaN \n", "customerid 399659.0 NaN NaN NaN \n", "date 531240 305 2019-12-03 00:00:00 5265 \n", "revenue 531240.0 NaN NaN NaN \n", "\n", " first last mean \\\n", "invoiceno NaT NaT NaN \n", "stockcode NaT NaT NaN \n", "description NaT NaT NaN \n", "quantity NaT NaT 9.960477 \n", "invoicedate 2018-11-29 08:26:00 2019-12-07 12:50:00 NaN \n", "unitprice NaT NaT 3.314484 \n", "customerid NaT NaT 15288.781719 \n", "date 2018-11-29 00:00:00 2019-12-07 00:00:00 NaN \n", "revenue NaT NaT 18.41553 \n", "\n", " std min 25% 50% 75% max \n", "invoiceno NaN NaN NaN NaN NaN NaN \n", "stockcode NaN NaN NaN NaN NaN NaN \n", "description NaN NaN NaN NaN NaN NaN \n", "quantity 217.000645 -80995.0 1.0 3.0 10.0 80995.0 \n", "invoicedate NaN NaN NaN NaN NaN NaN \n", "unitprice 15.830368 0.001 1.25 2.08 4.13 11062.06 \n", "customerid 1710.786594 12346.0 13959.0 15152.0 16791.0 18287.0 \n", "date NaN NaN NaN NaN NaN NaN \n", "revenue 370.054268 -168469.6 3.75 9.9 17.4 168469.6 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import plotly.express as px\n", "import numpy as np\n", "import missingno as msno\n", "import seaborn as sns\n", "from scipy import stats as st\n", "from itertools import combinations\n", "from collections import Counter\n", "from sklearn.metrics import classification_report, confusion_matrix, accuracy_score, precision_score, recall_score\n", "import scikitplot as skplt\n", "from sklearn.model_selection import train_test_split\n", "from sklearn.feature_extraction.text import TfidfVectorizer\n", "from sklearn.naive_bayes import MultinomialNB\n", "from sklearn.pipeline 
import make_pipeline\n", "import warnings\n", "warnings.filterwarnings('ignore')\n", "pd.set_option('max_colwidth', 400)\n", "%matplotlib inline\n", "\n",
"def data_preprocessing(dataset_path):\n",
"    \"\"\"Load the tab-separated dataset and return a cleaned DataFrame.\"\"\"\n",
"    try:\n",
"        df = pd.read_csv(dataset_path, sep=\"\\t\")\n",
"        df.columns = df.columns.str.lower()\n",
"        df[\"description\"] = df[\"description\"].str.lower()\n",
"        df = df.drop_duplicates()\n",
"        df[\"invoicedate\"] = pd.to_datetime(df[\"invoicedate\"], format=\"%m/%d/%Y %H:%M\")\n",
"        # Calendar date of the invoice (timestamp normalized to midnight)\n",
"        df[\"date\"] = df[\"invoicedate\"].dt.normalize()\n",
"        df = df.dropna(subset=[\"description\"])\n",
"        df = df.query(\"unitprice > 0\").copy()\n",
"        df[\"revenue\"] = df[\"quantity\"] * df[\"unitprice\"]\n",
"        # Descriptions that are charges or adjustments rather than products\n",
"        drop_desc = [\"amazon fee\", \"postage\", \"manual\", \"samples\", \"carriage\", \"cruk commission\", \"discount\", \"bank charges\", \"dotcom postage\"]\n",
"        df = df[~df[\"description\"].isin(drop_desc)]\n",
"    except Exception as error:\n",
"        # Report the failure and re-raise, so an undefined df is never returned\n",
"        print(\"data does not fit automation:\", error)\n",
"        raise\n",
"    return df\n",
"data = data_preprocessing(\"ecommerce_dataset_us.csv\") \n", "print(\"Data size:\", data.shape[0])\n", "print()\n", "print(\"First 5 rows of data:\")\n", "display(data.head()) \n", "print()\n", "print (\"Data Description\")\n", "display(data.describe(include=\"all\").T)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`white hanging heart t-light holder` was the most frequently ordered product, appearing on 2353 order lines. Invoice number 573585 was the largest, with 1113 line items. There were 4033 unique product descriptions. The maximum ordered quantity was 80995 and the minimum was -80995; why would an ordered quantity be negative? Most likely these are returned or cancelled products. The data ranges from 29/11/2018 to 07/12/2019." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To conduct the `Product Range Analysis`, I will sort the products into five main categories plus an additional category called `others` for products that do not fit any of them. 
As the data now has 531240 rows, and more than 4000 unique products, I will manually create categories, and train a machine learning model to complete the categorisation." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "kitchenware =[\"regency cakestand 3 tier\",\"set of 4 knick knack tins doiley\", \"set of 3 cake tins pantry design\", \"pack of 72 retrospot cake cases\",\n", " \"jam making set with jars\", \"natural slate heart chalkboard\", \"jam making set printed\",\n", " \"recipe box pantry yellow design\", \"roses regency teacup and saucer\", \"set of 4 pantry jelly moulds\", \n", " \"set/20 red retrospot paper napkins\", \"retrospot tea set ceramic 11 pc\", \"6 ribbons rustic charm\",\n", " \"baking set 9 piece retrospot\", \"set/5 red retrospot lid glass bowls\", \"spaceboy lunch box\", \n", " \"set of 3 regency cake tins\", \"ivory kitchen scales\", \"hand warmer scotty dog design\", \n", " \"red retrospot cake stand\", \"red kitchen scales\",\"hand warmer bird design\", \"childrens apron spaceboy design\", \n", " \"set of 12 fairy cake baking cases\", \"small dolly mix design orange bowl\", \"pack of 60 spaceboy cake cases\", \n", " \"set of 20 kids cookie cutters\", \"set/6 red spotty paper plates\", \"hand warmer union jack\", \n", " \"natural slate chalkboard large\", \"set of tea coffee sugar tins pantry\", \"pack of 60 pink paisley cake cases\", \n", " \"dolly girl lunch box\", \"60 teatime fairy cake cases\", \"set of 6 spice tins pantry design\", \"popcorn holder\", \n", " \"pink regency teacup and saucer\", \"round snack boxes set of4 woodland\",\"pack of 20 napkins pantry design\",\n", " \"set of 3 butterfly cookie cutters\", \"lunch bag dolly girl design\", \"set of 3 heart cookie cutters\",\n", " \"set of 3 butterfly cookie cutters\", \"small heart measuring spoons\", \"red retrospot bowl\", \n", " \"set of 12 mini loaf baking cases\", \"memo board retrospot design\",\"60 cake cases dolly girl 
design\", \n", " \"regency tea plate roses\",\"red retrospot oven glove\", \"small marshmallows pink bowl\", \"enamel bread bin cream\", \n", " \"mint kitchen scales\", \"black kitchen scales\", \"poppy's playhouse kitchen\", \"kitchen metal sign\", \n", " \"french kitchen sign blue metal\", \"vintage kitchen print fruits\", \"vintage kitchen print seafood\", \n", " \"childrens cutlery circus parade\", \"pack of 20 spaceboy napkins\", \"baking set 9 piece retrospot\",\n", " \"childrens cutlery circus parade\", \"childrens cutlery dolly girl\", \"children's apron dolly girl\"] \n", "\n", "home_decor= [\"wire flower t-light holder\",\"white hanging heart t-light holder\", \"victorian glass hanging t-light\", \n", " \"rabbit night light\", \"pink boudoir t-light holder\", \"hanging heart jar t-light holder\", \n", " \"antique silver t-light glass\", \"chilli lights\", \"colour glass t-light holder hanging\",\n", " \"christmas lights 10 vintage baubles\",\"single heart zinc t-light holder\", \"red toadstool led night light\", \n", " \"glass star frosted t-light holder\",\"fairy tale cottage night light\",\"multi colour silver t-light holder\", \n", " \"red toadstool led night light\",\"set of 6 t-lights snowmen\", \"chilli lights\", \"christmas lights 10 santas\", \n", " \"star portable table light\",\"wooden frame antique white\",\"black candelabra t-light holder\", \"snowflake portable table light\", \n", " \"set of 6 t-lights santa\", \"set 10 lights night owl\", \"babushka lights string of 10\",\"hyacinth bulb t-light candles\",\n", " \"wooden picture frame white finish\", \"red hanging heart t-light holder\", \"rabbit night light\", \n", " \"rotating silver angels t-light hldr\", \"white metal lantern\", \"photo frame cornice\", \"no singing metal sign\",\n", " \"pottering in the shed metal sign\",\"please one person metal sign\", \"gin + tonic diet metal sign\",\"cook with wine metal sign\", \n", " \"ladies & gentlemen metal sign\", \"beware of the cat metal 
sign\", \"you're confusing me metal sign\",\"toilet sign occupied or vacant\", \n", " \"alarm clock bakelike pink\",\"doormat welcome to our home\",\"doormat red retrospot\",\"doormat keep calm and come in\",\n", " \"alarm clock bakelike red\", \"doormat fancy font home sweet home\",\"doormat red retrospot\",\"alarm clock bakelike green\"]\n", "\n", "event_and_party = [\"party bunting\", \"assorted colour bird ornament\", \"heart of wicker small\", \n", " \"paper chain kit 50's christmas\", \"spotty bunting\",\"heart of wicker large\", \"set/10 red polkadot party candles\", \n", " \"party invites jazz hearts\", \"party invites football\",\"party cones carnival assorted\", \n", " \"retrospot party bag + sticker set\", \"blue party bags\", \"party cone christmas decoration\", \n", " \"tea party birthday card\", \"card party games\", \"birthday party cordon barrier tape\",\n", " \"12 coloured party balloons\", \"dinosaur party bag + sticker set\", \"feltcraft 6 flower friends\", \n", " \"christmas craft little friends\", \"lovebird hanging decoration white\", \"wooden heart christmas scandinavian\", \n", " \"zinc metal heart decoration\", \"sweetheart ceramic trinket box\", \"scandinavian reds ribbons\",\n", " \"world war 2 gliders asstd designs\", \"pink blue felt craft trinket box\", \"spaceboy birthday card\",\n", " \"enamel flower jug cream\", \"assorted colours silk fan\", \"set of 72 pink heart paper doilies\",\n", " \"metal 4 hook hanger french chateau\", \"feltcraft butterfly hearts\", \"paper chain kit vintage christmas\",\n", " \"paper bunting retrospot\", \"clothes pegs retrospot pack 24\", \"strawberry ceramic trinket box\", \n", " \"jumbo bag 50's christmas\", \"pink fairy cake childrens apron\", \"jumbo bag vintage christmas\", \n", " \"christmas craft tree top angel\", \"vintage union jack bunting\", \"christmas decoupage candle\", \n", " \"christmas metal tags assorted\", \"rocking horse red christmas\", \"christmas gingham tree\", \n", " \"turquoise 
christmas tree\", \"pack of 12 london tissues\"]\n", "\n", "plant_and_accessories =[\"zinc plant pot holder\", \"white wood garden plant ladder\", \"gardeners kneeling pad keep calm\", \n", " \"gardeners kneeling pad cup of tea\", \"white anemone artificial flower\", \"grow your own plant in a can\", \n", " \"classic metal birdcage plant holder\", \"zinc finish 15cm planter pots\", \"white wood garden plant ladder\",\n", " \"enchanted bird plant cage\", \"classic metal birdcage plant holder\", \"cream wall planter heart shaped\",\n", " \"grow your own plant in a can\", \"blue pot plant candle\",\"decorative plant pot with frieze\", \n", " \"pink pot plant candle\", \"set/3 pot plant candles\",\"blue pot plant candle\",\"s/2 zinc heart design planters\",\n", " \"zinc heart lattice 2 wall planter\",\"s/3 pink square planters roses\", \"white hearts wire plant pot holder\",\n", " \"zinc hearts plant pot holder\", \"yellow pot plant candle\"]\n", "\n", "bags_and_toys= [\"lunch bag red retrospot\", \"jumbo bag red retrospot\", \"lunch bag black skull\", \n", " \"jumbo bag pink polkadot\",\"jumbo storage bag suki\", \"jumbo shopper vintage red paisley\", \"lunch bag cars blue\",\n", " \"lunch bag spaceboy design\", \"lunch bag suki design\", \"lunch bag pink polkadot\", \"jumbo bag apples\",\n", " \"red retrospot charlotte bag\", \"lunch bag woodland\", \"rex cash+carry jumbo shopper\", \"jumbo bag alphabet\", \n", " \"gumball coat rack\",\"red retrospot picnic bag\", \"suki shoulder bag\",\"jumbo bag toys\",\"jumbo bag doiley patterns\", \n", " \"red retrospot peg bag\",\"lunch bag doiley pattern\", \"charlotte bag suki design\", \"jumbo bag vintage leaf\",\n", " \"jumbo bag pink vintage paisley\",\"jumbo bag woodland animals\", \"woodland charlotte bag\", \"jumbo bag strawberry\", \n", " \"lunch bag alphabet design\",\"charlotte bag pink polkadot\", \"jumbo bag toys\",\"recycling bag retrospot\", \n", " \"lunch box i love london\",\"jumbo bag vintage doily\", \"jumbo 
storage bag skulls\", \"jumbo bag spaceboy design\",\n", " \"scandinavian paisley picnic bag\", \"charlotte bag vintage alphabet\",\"mr robot soft toy\", \"toy tidy pink polkadot\", \n", " \"jumbo bag charlie and lola toys\"]\n", "\n", "others = [\"travel card wallet keep calm\", \"cream sweetheart mini chest\",\"plasters in tin strongman\", \n", " \"white skull hot water bottle\", \"chocolate hot water bottle\", \"fawn blue hot water bottle\", \"gumball coat rack\",\n", " \"pantry magnetic shopping list\",\"love hot water bottle\", \"hot water bottle keep calm\", \"edwardian parasol natural\", \n", " \"scottie dog hot water bottle\", \"home building block word\", \"scottie dog hot water bottle\",\n", " \"plasters in tin woodland animals\", \"dotcom postage\", \"wood black board ant white finish\", \n", " \"travel sewing kit\",\"knitted union flag hot water bottle\", \"4 traditional spinning tops\", \n", " \"set of 6 soldier skittles\", \"clear drawer knob acrylic edwardian\", \"vintage paisley stationery set\"]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Defining a function to assign categories" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"<thead>\n",
"<tr><th></th><th>invoiceno</th><th>stockcode</th><th>description</th><th>quantity</th><th>invoicedate</th><th>unitprice</th><th>customerid</th><th>date</th><th>revenue</th><th>sample_categories</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>536365</td><td>85123A</td><td>white hanging heart t-light holder</td><td>6</td><td>2018-11-29 08:26:00</td><td>2.55</td><td>17850.0</td><td>2018-11-29</td><td>15.30</td><td>home_decor</td></tr>\n",
"<tr><th>1</th><td>536365</td><td>71053</td><td>white metal lantern</td><td>6</td><td>2018-11-29 08:26:00</td><td>3.39</td><td>17850.0</td><td>2018-11-29</td><td>20.34</td><td>home_decor</td></tr>\n",
"<tr><th>2</th><td>536365</td><td>84406B</td><td>cream cupid hearts coat hanger</td><td>8</td><td>2018-11-29 08:26:00</td><td>2.75</td><td>17850.0</td><td>2018-11-29</td><td>22.00</td><td>undefined</td></tr>\n",
"<tr><th>3</th><td>536365</td><td>84029G</td><td>knitted union flag hot water bottle</td><td>6</td><td>2018-11-29 08:26:00</td><td>3.39</td><td>17850.0</td><td>2018-11-29</td><td>20.34</td><td>others</td></tr>\n",
"<tr><th>4</th><td>536365</td><td>84029E</td><td>red woolly hottie white heart.</td><td>6</td><td>2018-11-29 08:26:00</td><td>3.39</td><td>17850.0</td><td>2018-11-29</td><td>20.34</td><td>undefined</td></tr>\n",
"<tr><th>5</th><td>536365</td><td>22752</td><td>set 7 babushka nesting boxes</td><td>2</td><td>2018-11-29 08:26:00</td><td>7.65</td><td>17850.0</td><td>2018-11-29</td><td>15.30</td><td>undefined</td></tr>\n",
"<tr><th>6</th><td>536365</td><td>21730</td><td>glass star frosted t-light holder</td><td>6</td><td>2018-11-29 08:26:00</td><td>4.25</td><td>17850.0</td><td>2018-11-29</td><td>25.50</td><td>home_decor</td></tr>\n",
"<tr><th>7</th><td>536366</td><td>22633</td><td>hand warmer union jack</td><td>6</td><td>2018-11-29 08:28:00</td><td>1.85</td><td>17850.0</td><td>2018-11-29</td><td>11.10</td><td>kitchenware</td></tr>\n",
"<tr><th>8</th><td>536366</td><td>22632</td><td>hand warmer red polka dot</td><td>6</td><td>2018-11-29 08:28:00</td><td>1.85</td><td>17850.0</td><td>2018-11-29</td><td>11.10</td><td>undefined</td></tr>\n",
"<tr><th>9</th><td>536367</td><td>84879</td><td>assorted colour bird ornament</td><td>32</td><td>2018-11-29 08:34:00</td><td>1.69</td><td>13047.0</td><td>2018-11-29</td><td>54.08</td><td>event_and_party</td></tr>\n",
"</tbody>\n",
"</table>
\n", "
" ], "text/plain": [ " invoiceno stockcode description quantity \\\n", "0 536365 85123A white hanging heart t-light holder 6 \n", "1 536365 71053 white metal lantern 6 \n", "2 536365 84406B cream cupid hearts coat hanger 8 \n", "3 536365 84029G knitted union flag hot water bottle 6 \n", "4 536365 84029E red woolly hottie white heart. 6 \n", "5 536365 22752 set 7 babushka nesting boxes 2 \n", "6 536365 21730 glass star frosted t-light holder 6 \n", "7 536366 22633 hand warmer union jack 6 \n", "8 536366 22632 hand warmer red polka dot 6 \n", "9 536367 84879 assorted colour bird ornament 32 \n", "\n", " invoicedate unitprice customerid date revenue \\\n", "0 2018-11-29 08:26:00 2.55 17850.0 2018-11-29 15.30 \n", "1 2018-11-29 08:26:00 3.39 17850.0 2018-11-29 20.34 \n", "2 2018-11-29 08:26:00 2.75 17850.0 2018-11-29 22.00 \n", "3 2018-11-29 08:26:00 3.39 17850.0 2018-11-29 20.34 \n", "4 2018-11-29 08:26:00 3.39 17850.0 2018-11-29 20.34 \n", "5 2018-11-29 08:26:00 7.65 17850.0 2018-11-29 15.30 \n", "6 2018-11-29 08:26:00 4.25 17850.0 2018-11-29 25.50 \n", "7 2018-11-29 08:28:00 1.85 17850.0 2018-11-29 11.10 \n", "8 2018-11-29 08:28:00 1.85 17850.0 2018-11-29 11.10 \n", "9 2018-11-29 08:34:00 1.69 13047.0 2018-11-29 54.08 \n", "\n", " sample_categories \n", "0 home_decor \n", "1 home_decor \n", "2 undefined \n", "3 others \n", "4 undefined \n", "5 undefined \n", "6 home_decor \n", "7 kitchenware \n", "8 undefined \n", "9 event_and_party " ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [
"def categories(classification):\n",
"    \"\"\"This function assigns categories to the product descriptions\"\"\"\n",
"    description = classification[\"description\"]\n",
"    # The category lists are checked in this order; the first match wins\n",
"    labelled_lists = {\"kitchenware\": kitchenware, \"home_decor\": home_decor,\n",
"                      \"event_and_party\": event_and_party,\n",
"                      \"plant_and_accessories\": plant_and_accessories,\n",
"                      \"bags_and_toys\": bags_and_toys, \"others\": others}\n",
"    for label, products in labelled_lists.items():\n",
"        if description in products:\n",
"            return label\n",
"    # Descriptions outside every manual list are left for the model to label\n",
"    return \"undefined\"\n",
"# Applying the function to the dataframe.\n",
"data[\"sample_categories\"] = data.apply(categories, axis =1)\n",
"data.head(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Calculating the size and proportion of the data that has been categorized manually for training and validation." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "112453" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "0.212" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "display(data.query(\"sample_categories!='undefined'\")[\"sample_categories\"].value_counts().sum())\n", "display(round(data.query(\"sample_categories!='undefined'\")[\"sample_categories\"].value_counts().sum()/data.shape[0],3))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The manually categorized data used for training and validation is about 21% of the full dataset. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Filtering the manually categorized data." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"<thead>\n",
"<tr><th></th><th>invoiceno</th><th>stockcode</th><th>description</th><th>quantity</th><th>invoicedate</th><th>unitprice</th><th>customerid</th><th>date</th><th>revenue</th><th>sample_categories</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>536365</td><td>85123A</td><td>white hanging heart t-light holder</td><td>6</td><td>2018-11-29 08:26:00</td><td>2.55</td><td>17850.0</td><td>2018-11-29</td><td>15.30</td><td>home_decor</td></tr>\n",
"<tr><th>1</th><td>536365</td><td>71053</td><td>white metal lantern</td><td>6</td><td>2018-11-29 08:26:00</td><td>3.39</td><td>17850.0</td><td>2018-11-29</td><td>20.34</td><td>home_decor</td></tr>\n",
"<tr><th>3</th><td>536365</td><td>84029G</td><td>knitted union flag hot water bottle</td><td>6</td><td>2018-11-29 08:26:00</td><td>3.39</td><td>17850.0</td><td>2018-11-29</td><td>20.34</td><td>others</td></tr>\n",
"<tr><th>6</th><td>536365</td><td>21730</td><td>glass star frosted t-light holder</td><td>6</td><td>2018-11-29 08:26:00</td><td>4.25</td><td>17850.0</td><td>2018-11-29</td><td>25.50</td><td>home_decor</td></tr>\n",
"<tr><th>7</th><td>536366</td><td>22633</td><td>hand warmer union jack</td><td>6</td><td>2018-11-29 08:28:00</td><td>1.85</td><td>17850.0</td><td>2018-11-29</td><td>11.10</td><td>kitchenware</td></tr>\n",
"</tbody>\n",
"</table>
\n", "
" ], "text/plain": [ " invoiceno stockcode description quantity \\\n", "0 536365 85123A white hanging heart t-light holder 6 \n", "1 536365 71053 white metal lantern 6 \n", "3 536365 84029G knitted union flag hot water bottle 6 \n", "6 536365 21730 glass star frosted t-light holder 6 \n", "7 536366 22633 hand warmer union jack 6 \n", "\n", " invoicedate unitprice customerid date revenue \\\n", "0 2018-11-29 08:26:00 2.55 17850.0 2018-11-29 15.30 \n", "1 2018-11-29 08:26:00 3.39 17850.0 2018-11-29 20.34 \n", "3 2018-11-29 08:26:00 3.39 17850.0 2018-11-29 20.34 \n", "6 2018-11-29 08:26:00 4.25 17850.0 2018-11-29 25.50 \n", "7 2018-11-29 08:28:00 1.85 17850.0 2018-11-29 11.10 \n", "\n", " sample_categories \n", "0 home_decor \n", "1 home_decor \n", "3 others \n", "6 home_decor \n", "7 kitchenware " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "array(['home_decor', 'others', 'kitchenware', 'event_and_party',\n", " 'bags_and_toys', 'plant_and_accessories'], dtype=object)" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data_selected= data.query(\"sample_categories!='undefined'\")\n", "display(data_selected.head())\n", "data_selected.sample_categories.unique()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Training a Naive Bayes classifier on the product descriptions, validating it, and reporting the model's performance." 
] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "' precision recall f1-score support\\n\\n bags_and_toys 1.00 1.00 1.00 5675\\n event_and_party 1.00 0.98 0.99 3961\\n home_decor 1.00 1.00 1.00 4198\\n kitchenware 0.98 1.00 0.99 6199\\n others 1.00 1.00 1.00 2197\\nplant_and_accessories 1.00 1.00 1.00 261\\n\\n accuracy 1.00 22491\\n macro avg 1.00 1.00 1.00 22491\\n weighted avg 1.00 1.00 1.00 22491\\n'" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\tAccuracy: 1.00\n", "\tPrecision: 1.00\n", "\tRecall: 1.00\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [
"def training_the_model(X,y):\n",
"    # Hold out 20% of the manually labelled rows for validation\n",
"    X_train, X_test, y_train, y_test = train_test_split(X,y, test_size=.20, random_state=42)\n",
"    # nb_model is kept global because a later cell reuses it for prediction\n",
"    global nb_model\n",
"    nb_model = make_pipeline(TfidfVectorizer(), MultinomialNB()) \n",
"    nb_model.fit(X_train, y_train) \n",
"\n",
"    label=nb_model.predict(X_test)\n",
"\n",
"    mtrx= confusion_matrix(y_test,label)\n",
"\n",
"    display(classification_report(y_test, label))\n",
"    print('\\tAccuracy: {:.2f}'.format(accuracy_score(y_test,label)))\n",
"    print('\\tPrecision: {:.2f}'.format(precision_score(y_test,label, average='weighted')))\n",
"    print('\\tRecall: {:.2f}'.format(recall_score(y_test,label, average='weighted')))\n",
"    skplt.metrics.plot_confusion_matrix(y_test,label,cmap='tab20_r',figsize= (20,10), \\\n",
"                                        title= \"Confusion Matrix for Naive Bayes\")\n",
"    return plt.show()\n",
"training_the_model(data_selected.description, data_selected.sample_categories)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The model performs extremely well on the validation split and rarely misclassifies a description; its accuracy rounds to 1.00. As the confusion matrix shows, the only errors were 97 `event_and_party` products that were categorized as `kitchenware`. Every other category was predicted exactly as its true label. Because the rows are order lines, the same product descriptions occur in both the training and the validation split, so these scores mainly show how well the model reproduces the manual labels. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using the trained model to categorize." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"<thead>\n",
"<tr><th></th><th>invoiceno</th><th>stockcode</th><th>description</th><th>quantity</th><th>invoicedate</th><th>unitprice</th><th>customerid</th><th>date</th><th>revenue</th><th>sample_categories</th><th>categories</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>536365</td><td>85123A</td><td>white hanging heart t-light holder</td><td>6</td><td>2018-11-29 08:26:00</td><td>2.55</td><td>17850.0</td><td>2018-11-29</td><td>15.30</td><td>home_decor</td><td>home_decor</td></tr>\n",
"<tr><th>1</th><td>536365</td><td>71053</td><td>white metal lantern</td><td>6</td><td>2018-11-29 08:26:00</td><td>3.39</td><td>17850.0</td><td>2018-11-29</td><td>20.34</td><td>home_decor</td><td>home_decor</td></tr>\n",
"<tr><th>2</th><td>536365</td><td>84406B</td><td>cream cupid hearts coat hanger</td><td>8</td><td>2018-11-29 08:26:00</td><td>2.75</td><td>17850.0</td><td>2018-11-29</td><td>22.00</td><td>undefined</td><td>event_and_party</td></tr>\n",
"<tr><th>3</th><td>536365</td><td>84029G</td><td>knitted union flag hot water bottle</td><td>6</td><td>2018-11-29 08:26:00</td><td>3.39</td><td>17850.0</td><td>2018-11-29</td><td>20.34</td><td>others</td><td>others</td></tr>\n",
"<tr><th>4</th><td>536365</td><td>84029E</td><td>red woolly hottie white heart.</td><td>6</td><td>2018-11-29 08:26:00</td><td>3.39</td><td>17850.0</td><td>2018-11-29</td><td>20.34</td><td>undefined</td><td>home_decor</td></tr>\n",
"<tr><th>5</th><td>536365</td><td>22752</td><td>set 7 babushka nesting boxes</td><td>2</td><td>2018-11-29 08:26:00</td><td>7.65</td><td>17850.0</td><td>2018-11-29</td><td>15.30</td><td>undefined</td><td>home_decor</td></tr>\n",
"<tr><th>6</th><td>536365</td><td>21730</td><td>glass star frosted t-light holder</td><td>6</td><td>2018-11-29 08:26:00</td><td>4.25</td><td>17850.0</td><td>2018-11-29</td><td>25.50</td><td>home_decor</td><td>home_decor</td></tr>\n",
"<tr><th>7</th><td>536366</td><td>22633</td><td>hand warmer union jack</td><td>6</td><td>2018-11-29 08:28:00</td><td>1.85</td><td>17850.0</td><td>2018-11-29</td><td>11.10</td><td>kitchenware</td><td>kitchenware</td></tr>\n",
"<tr><th>8</th><td>536366</td><td>22632</td><td>hand warmer red polka dot</td><td>6</td><td>2018-11-29 08:28:00</td><td>1.85</td><td>17850.0</td><td>2018-11-29</td><td>11.10</td><td>undefined</td><td>kitchenware</td></tr>\n",
"<tr><th>9</th><td>536367</td><td>84879</td><td>assorted colour bird ornament</td><td>32</td><td>2018-11-29 08:34:00</td><td>1.69</td><td>13047.0</td><td>2018-11-29</td><td>54.08</td><td>event_and_party</td><td>event_and_party</td></tr>\n",
"</tbody>\n",
"</table>
\n", "
" ], "text/plain": [ " invoiceno stockcode description quantity \\\n", "0 536365 85123A white hanging heart t-light holder 6 \n", "1 536365 71053 white metal lantern 6 \n", "2 536365 84406B cream cupid hearts coat hanger 8 \n", "3 536365 84029G knitted union flag hot water bottle 6 \n", "4 536365 84029E red woolly hottie white heart. 6 \n", "5 536365 22752 set 7 babushka nesting boxes 2 \n", "6 536365 21730 glass star frosted t-light holder 6 \n", "7 536366 22633 hand warmer union jack 6 \n", "8 536366 22632 hand warmer red polka dot 6 \n", "9 536367 84879 assorted colour bird ornament 32 \n", "\n", " invoicedate unitprice customerid date revenue \\\n", "0 2018-11-29 08:26:00 2.55 17850.0 2018-11-29 15.30 \n", "1 2018-11-29 08:26:00 3.39 17850.0 2018-11-29 20.34 \n", "2 2018-11-29 08:26:00 2.75 17850.0 2018-11-29 22.00 \n", "3 2018-11-29 08:26:00 3.39 17850.0 2018-11-29 20.34 \n", "4 2018-11-29 08:26:00 3.39 17850.0 2018-11-29 20.34 \n", "5 2018-11-29 08:26:00 7.65 17850.0 2018-11-29 15.30 \n", "6 2018-11-29 08:26:00 4.25 17850.0 2018-11-29 25.50 \n", "7 2018-11-29 08:28:00 1.85 17850.0 2018-11-29 11.10 \n", "8 2018-11-29 08:28:00 1.85 17850.0 2018-11-29 11.10 \n", "9 2018-11-29 08:34:00 1.69 13047.0 2018-11-29 54.08 \n", "\n", " sample_categories categories \n", "0 home_decor home_decor \n", "1 home_decor home_decor \n", "2 undefined event_and_party \n", "3 others others \n", "4 undefined home_decor \n", "5 undefined home_decor \n", "6 home_decor home_decor \n", "7 kitchenware kitchenware \n", "8 undefined kitchenware \n", "9 event_and_party event_and_party " ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data[\"categories\"]= nb_model.predict(data[\"description\"])\n", "data.head(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Dropping the manually created categories. 
" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "data.drop(['sample_categories'], axis=1, inplace=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Dividing the data into cancelled orders and non-cancelled orders. " ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
" ], "text/plain": [ " invoiceno stockcode description quantity \\\n", "154 C536383 35004C set of 3 coloured flying ducks -1 \n", "235 C536391 22556 plasters in tin circus parade -12 \n", "236 C536391 21984 pack of 12 pink paisley tissues -24 \n", "237 C536391 21983 pack of 12 blue paisley tissues -24 \n", "238 C536391 21980 pack of 12 red retrospot tissues -24 \n", "\n", " invoicedate unitprice customerid date revenue \\\n", "154 2018-11-29 09:49:00 4.65 15311.0 2018-11-29 -4.65 \n", "235 2018-11-29 10:24:00 1.65 17548.0 2018-11-29 -19.80 \n", "236 2018-11-29 10:24:00 0.29 17548.0 2018-11-29 -6.96 \n", "237 2018-11-29 10:24:00 0.29 17548.0 2018-11-29 -6.96 \n", "238 2018-11-29 10:24:00 0.29 17548.0 2018-11-29 -6.96 \n", "\n", " categories \n", "154 event_and_party \n", "235 others \n", "236 kitchenware \n", "237 kitchenware \n", "238 kitchenware " ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Cancelled line items: invoice number contains \"C\" and the quantity is negative.\n", "# Both conditions are combined in one mask to avoid chained boolean indexing.\n", "cancelled_orders = data[data.invoiceno.str.contains(\"C\", na=False) & (data[\"quantity\"] < 0)]\n", "# Keep every row that belongs to an invoice with at least one cancelled item.\n", "cancelled_orders = data.query(\"invoiceno in @cancelled_orders.invoiceno\")\n", "cancelled_orders.head()" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(522572, 10)" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "orders_data = data.query(\"invoiceno not in @cancelled_orders.invoiceno\")\n", "orders_data.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Interim Conclusion**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "I have ensured that the data has the correct data types, checked for duplicates, and dealt with missing values. I have investigated outliers in the data and categorized the products. Two new columns, `revenue` and `categories`, have been created. The data is now ready for analysis." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 2: Carry out exploratory data analysis" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Leading orders by invoice**" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
" ], "text/plain": [ " 573585 581219 581492 580729 558475 579777 581217 537434 \\\n", "invoice_count 1113 748 730 720 704 686 675 674 \n", "\n", " 580730 538071 \n", "invoice_count 661 650 " ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "top_invoices= orders_data.invoiceno.value_counts().to_frame()\n", "top_invoices.rename(columns={\"invoiceno\":\"invoice_count\"}).head(10).T" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Invoice 573585 contains the most product lines (1,113). The size of the ten largest invoices suggests that the store's customers are most likely wholesalers. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**What are the most frequently purchased categories?**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "I will count the unique invoices in each category and visualize the result. " ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/html": [ " \n", " " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "alignmentgroup": "True", "hovertemplate": "invoice_count=%{text}
categories=%{y}", "legendgroup": "", "marker": { "color": "#636efa", "pattern": { "shape": "" } }, "name": "", "offsetgroup": "", "orientation": "h", "showlegend": false, "text": [ 6721, 12736, 12775, 14805, 15669, 16351 ], "textposition": "outside", "texttemplate": "%{text:.3s}", "type": "bar", "x": [ 6721, 12736, 12775, 14805, 15669, 16351 ], "xaxis": "x", "y": [ "plant_and_accessories", "bags_and_toys", "others", "home_decor", "event_and_party", "kitchenware" ], "yaxis": "y" } ], "layout": { "barmode": "relative", "legend": { "tracegroupgap": 0 }, "margin": { "b": 8, "l": 8, "r": 10, "t": 30 }, "showlegend": false, "title": { "text": "Most Frequently Purchased Categories" }, "xaxis": { "anchor": "y", "domain": [ 0, 1 ], "tickfont": { "family": "Arial Black" }, "title": { "text": "Invoice Count" } }, "yaxis": { "anchor": "x", "domain": [ 0, 1 ], "tickfont": { "family": "Arial Black" }, "title": { "text": "Categories" } } } }, "text/html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "frequent_categories= orders_data.groupby(\"categories\")[\"invoiceno\"].nunique().sort_values(ascending=True).to_frame().reset_index()\\\n", ".rename(columns={\"invoiceno\":\"invoice_count\"})\n", "\n", "fig = px.bar(frequent_categories, y='categories', x='invoice_count',orientation = \"h\",title= \"Most Frequently Purchased Categories\", text='invoice_count')\n", "fig.update_layout(\n", " showlegend=False,\n", " margin=dict(t=30,l=8,b=8,r=10))\n", "fig.update_traces(texttemplate='%{text:.3s}', textposition='outside')\n", "fig.update_yaxes(tickfont_family=\"Arial Black\", title_text=\"Categories\")\n", "fig.update_xaxes(tickfont_family=\"Arial Black\", title_text=\"Invoice Count\")\n", "\n", "fig.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`kitchenware` is the most frequently purchased category (16,351 invoices), while `plant_and_accessories` is the least frequently purchased (6,721 invoices)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**How many orders do customers make during a given period of time?**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "I will visualize the number of daily orders for the entire period and the number of monthly orders." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "hovertemplate": "date=%{x}
invoiceno=%{y}", "legendgroup": "", "line": { "color": "#636efa", "dash": "solid" }, "marker": { "symbol": "circle" }, "mode": "lines", "name": "", "orientation": "v", "showlegend": false, "type": "scatter", "x": [ "2018-11-29T00:00:00", "2018-11-30T00:00:00", "2018-12-01T00:00:00", "2018-12-03T00:00:00", "2018-12-04T00:00:00", "2018-12-05T00:00:00", "2018-12-06T00:00:00", "2018-12-07T00:00:00", "2018-12-08T00:00:00", "2018-12-10T00:00:00", "2018-12-11T00:00:00", "2018-12-12T00:00:00", "2018-12-13T00:00:00", "2018-12-14T00:00:00", "2018-12-15T00:00:00", "2018-12-17T00:00:00", "2018-12-18T00:00:00", "2018-12-19T00:00:00", "2018-12-20T00:00:00", "2018-12-21T00:00:00", "2019-01-02T00:00:00", "2019-01-03T00:00:00", "2019-01-04T00:00:00", "2019-01-05T00:00:00", "2019-01-07T00:00:00", "2019-01-08T00:00:00", "2019-01-09T00:00:00", "2019-01-10T00:00:00", "2019-01-11T00:00:00", "2019-01-12T00:00:00", "2019-01-14T00:00:00", "2019-01-15T00:00:00", "2019-01-16T00:00:00", "2019-01-17T00:00:00", "2019-01-18T00:00:00", "2019-01-19T00:00:00", "2019-01-21T00:00:00", "2019-01-22T00:00:00", "2019-01-23T00:00:00", "2019-01-24T00:00:00", "2019-01-25T00:00:00", "2019-01-26T00:00:00", "2019-01-28T00:00:00", "2019-01-29T00:00:00", "2019-01-30T00:00:00", "2019-01-31T00:00:00", "2019-02-01T00:00:00", "2019-02-02T00:00:00", "2019-02-04T00:00:00", "2019-02-05T00:00:00", "2019-02-06T00:00:00", "2019-02-07T00:00:00", "2019-02-08T00:00:00", "2019-02-09T00:00:00", "2019-02-11T00:00:00", "2019-02-12T00:00:00", "2019-02-13T00:00:00", "2019-02-14T00:00:00", "2019-02-15T00:00:00", "2019-02-16T00:00:00", "2019-02-18T00:00:00", "2019-02-19T00:00:00", "2019-02-20T00:00:00", "2019-02-21T00:00:00", "2019-02-22T00:00:00", "2019-02-23T00:00:00", "2019-02-25T00:00:00", "2019-02-26T00:00:00", "2019-02-27T00:00:00", "2019-02-28T00:00:00", "2019-03-01T00:00:00", "2019-03-02T00:00:00", "2019-03-04T00:00:00", "2019-03-05T00:00:00", "2019-03-06T00:00:00", "2019-03-07T00:00:00", "2019-03-08T00:00:00", 
"2019-03-09T00:00:00", "2019-03-11T00:00:00", "2019-03-12T00:00:00", "2019-03-13T00:00:00", "2019-03-14T00:00:00", "2019-03-15T00:00:00", "2019-03-16T00:00:00", "2019-03-18T00:00:00", "2019-03-19T00:00:00", "2019-03-20T00:00:00", "2019-03-21T00:00:00", "2019-03-22T00:00:00", "2019-03-23T00:00:00", "2019-03-25T00:00:00", "2019-03-26T00:00:00", "2019-03-27T00:00:00", "2019-03-28T00:00:00", "2019-03-29T00:00:00", "2019-03-30T00:00:00", "2019-04-01T00:00:00", "2019-04-02T00:00:00", "2019-04-03T00:00:00", "2019-04-04T00:00:00", "2019-04-05T00:00:00", "2019-04-06T00:00:00", "2019-04-08T00:00:00", "2019-04-09T00:00:00", "2019-04-10T00:00:00", "2019-04-11T00:00:00", "2019-04-12T00:00:00", "2019-04-13T00:00:00", "2019-04-15T00:00:00", "2019-04-16T00:00:00", "2019-04-17T00:00:00", "2019-04-18T00:00:00", "2019-04-19T00:00:00", "2019-04-24T00:00:00", "2019-04-25T00:00:00", "2019-04-26T00:00:00", "2019-04-29T00:00:00", "2019-05-01T00:00:00", "2019-05-02T00:00:00", "2019-05-03T00:00:00", "2019-05-04T00:00:00", "2019-05-06T00:00:00", "2019-05-07T00:00:00", "2019-05-08T00:00:00", "2019-05-09T00:00:00", "2019-05-10T00:00:00", "2019-05-11T00:00:00", "2019-05-13T00:00:00", "2019-05-14T00:00:00", "2019-05-15T00:00:00", "2019-05-16T00:00:00", "2019-05-17T00:00:00", "2019-05-18T00:00:00", "2019-05-20T00:00:00", "2019-05-21T00:00:00", "2019-05-22T00:00:00", "2019-05-23T00:00:00", "2019-05-24T00:00:00", "2019-05-25T00:00:00", "2019-05-27T00:00:00", "2019-05-29T00:00:00", "2019-05-30T00:00:00", "2019-05-31T00:00:00", "2019-06-01T00:00:00", "2019-06-03T00:00:00", "2019-06-04T00:00:00", "2019-06-05T00:00:00", "2019-06-06T00:00:00", "2019-06-07T00:00:00", "2019-06-08T00:00:00", "2019-06-10T00:00:00", "2019-06-11T00:00:00", "2019-06-12T00:00:00", "2019-06-13T00:00:00", "2019-06-14T00:00:00", "2019-06-15T00:00:00", "2019-06-17T00:00:00", "2019-06-18T00:00:00", "2019-06-19T00:00:00", "2019-06-20T00:00:00", "2019-06-21T00:00:00", "2019-06-22T00:00:00", "2019-06-24T00:00:00", 
"2019-06-25T00:00:00", "2019-06-26T00:00:00", "2019-06-27T00:00:00", "2019-06-28T00:00:00", "2019-06-29T00:00:00", "2019-07-01T00:00:00", "2019-07-02T00:00:00", "2019-07-03T00:00:00", "2019-07-04T00:00:00", "2019-07-05T00:00:00", "2019-07-06T00:00:00", "2019-07-08T00:00:00", "2019-07-09T00:00:00", "2019-07-10T00:00:00", "2019-07-11T00:00:00", "2019-07-12T00:00:00", "2019-07-13T00:00:00", "2019-07-15T00:00:00", "2019-07-16T00:00:00", "2019-07-17T00:00:00", "2019-07-18T00:00:00", "2019-07-19T00:00:00", "2019-07-20T00:00:00", "2019-07-22T00:00:00", "2019-07-23T00:00:00", "2019-07-24T00:00:00", "2019-07-25T00:00:00", "2019-07-26T00:00:00", "2019-07-27T00:00:00", "2019-07-29T00:00:00", "2019-07-30T00:00:00", "2019-07-31T00:00:00", "2019-08-01T00:00:00", "2019-08-02T00:00:00", "2019-08-03T00:00:00", "2019-08-05T00:00:00", "2019-08-06T00:00:00", "2019-08-07T00:00:00", "2019-08-08T00:00:00", "2019-08-09T00:00:00", "2019-08-10T00:00:00", "2019-08-12T00:00:00", "2019-08-13T00:00:00", "2019-08-14T00:00:00", "2019-08-15T00:00:00", "2019-08-16T00:00:00", "2019-08-17T00:00:00", "2019-08-19T00:00:00", "2019-08-20T00:00:00", "2019-08-21T00:00:00", "2019-08-22T00:00:00", "2019-08-23T00:00:00", "2019-08-24T00:00:00", "2019-08-26T00:00:00", "2019-08-28T00:00:00", "2019-08-29T00:00:00", "2019-08-30T00:00:00", "2019-08-31T00:00:00", "2019-09-02T00:00:00", "2019-09-03T00:00:00", "2019-09-04T00:00:00", "2019-09-05T00:00:00", "2019-09-06T00:00:00", "2019-09-07T00:00:00", "2019-09-09T00:00:00", "2019-09-10T00:00:00", "2019-09-11T00:00:00", "2019-09-12T00:00:00", "2019-09-13T00:00:00", "2019-09-14T00:00:00", "2019-09-16T00:00:00", "2019-09-17T00:00:00", "2019-09-18T00:00:00", "2019-09-19T00:00:00", "2019-09-20T00:00:00", "2019-09-21T00:00:00", "2019-09-23T00:00:00", "2019-09-24T00:00:00", "2019-09-25T00:00:00", "2019-09-26T00:00:00", "2019-09-27T00:00:00", "2019-09-28T00:00:00", "2019-09-30T00:00:00", "2019-10-01T00:00:00", "2019-10-02T00:00:00", "2019-10-03T00:00:00", 
"2019-10-04T00:00:00", "2019-10-05T00:00:00", "2019-10-07T00:00:00", "2019-10-08T00:00:00", "2019-10-09T00:00:00", "2019-10-10T00:00:00", "2019-10-11T00:00:00", "2019-10-12T00:00:00", "2019-10-14T00:00:00", "2019-10-15T00:00:00", "2019-10-16T00:00:00", "2019-10-17T00:00:00", "2019-10-18T00:00:00", "2019-10-19T00:00:00", "2019-10-21T00:00:00", "2019-10-22T00:00:00", "2019-10-23T00:00:00", "2019-10-24T00:00:00", "2019-10-25T00:00:00", "2019-10-26T00:00:00", "2019-10-28T00:00:00", "2019-10-29T00:00:00", "2019-10-30T00:00:00", "2019-10-31T00:00:00", "2019-11-01T00:00:00", "2019-11-02T00:00:00", "2019-11-04T00:00:00", "2019-11-05T00:00:00", "2019-11-06T00:00:00", "2019-11-07T00:00:00", "2019-11-08T00:00:00", "2019-11-09T00:00:00", "2019-11-11T00:00:00", "2019-11-12T00:00:00", "2019-11-13T00:00:00", "2019-11-14T00:00:00", "2019-11-15T00:00:00", "2019-11-16T00:00:00", "2019-11-18T00:00:00", "2019-11-19T00:00:00", "2019-11-20T00:00:00", "2019-11-21T00:00:00", "2019-11-22T00:00:00", "2019-11-23T00:00:00", "2019-11-25T00:00:00", "2019-11-26T00:00:00", "2019-11-27T00:00:00", "2019-11-28T00:00:00", "2019-11-29T00:00:00", "2019-11-30T00:00:00", "2019-12-02T00:00:00", "2019-12-03T00:00:00", "2019-12-04T00:00:00", "2019-12-05T00:00:00", "2019-12-06T00:00:00", "2019-12-07T00:00:00" ], "xaxis": "x", "y": [ 127, 141, 68, 88, 102, 82, 116, 106, 76, 44, 70, 89, 73, 121, 66, 23, 61, 54, 16, 27, 36, 55, 50, 53, 48, 39, 58, 45, 44, 47, 25, 51, 40, 35, 38, 38, 27, 52, 66, 56, 52, 40, 24, 62, 63, 62, 45, 52, 11, 42, 45, 29, 41, 45, 20, 42, 58, 63, 67, 38, 26, 37, 52, 63, 55, 49, 33, 55, 60, 43, 49, 49, 27, 67, 48, 53, 56, 50, 16, 50, 51, 52, 62, 63, 60, 53, 52, 63, 67, 56, 32, 59, 62, 75, 65, 74, 20, 56, 51, 40, 64, 70, 32, 62, 61, 60, 86, 56, 42, 68, 59, 63, 77, 71, 65, 58, 18, 64, 65, 88, 82, 63, 71, 95, 80, 84, 73, 31, 67, 79, 73, 99, 71, 61, 65, 68, 65, 64, 64, 24, 54, 42, 52, 44, 67, 56, 77, 99, 85, 50, 39, 59, 57, 59, 83, 52, 60, 73, 55, 58, 72, 50, 26, 45, 53, 42, 70, 48, 25, 50, 
73, 70, 76, 64, 24, 52, 53, 58, 70, 40, 51, 57, 65, 56, 76, 43, 58, 58, 51, 48, 80, 64, 42, 39, 44, 64, 82, 55, 30, 39, 40, 49, 67, 51, 25, 48, 51, 54, 69, 54, 39, 65, 60, 76, 73, 45, 37, 40, 44, 75, 73, 50, 67, 62, 51, 75, 61, 75, 68, 67, 68, 80, 50, 27, 69, 70, 75, 110, 64, 75, 65, 79, 85, 104, 73, 35, 66, 81, 95, 120, 87, 39, 99, 81, 86, 76, 71, 35, 87, 84, 83, 87, 64, 41, 76, 75, 92, 103, 77, 96, 69, 76, 85, 102, 93, 102, 92, 107, 119, 127, 102, 87, 106, 107, 130, 136, 108, 100, 100, 134, 130, 112, 83, 57, 114, 135, 107, 121, 120, 66, 126, 114, 105, 120, 44 ], "yaxis": "y" } ], "layout": { "legend": { "tracegroupgap": 0 }, "margin": { "b": 8, "l": 8, "r": 10, "t": 30 }, "showlegend": true, "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 
0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { 
"colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "fillpattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "autotypenumbers": "strict", "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 
0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", 
"zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "text": "Daily Orders" }, "xaxis": { "anchor": "y", "domain": [ 0, 1 ], "tickfont": { "family": "Arial Black" }, "title": { "text": "Invoice Date" } }, "yaxis": { "anchor": "x", "domain": [ 0, 1 ], "tickfont": { "family": "Arial Black" }, "title": { "text": "Invoice Count" } } } }, "text/html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "fig = px.line(orders_data.groupby('date')['invoiceno'].nunique().reset_index(),x=\"date\", y=\"invoiceno\", \n", " title=\"Daily Orders\")\n", "\n", "fig.update_layout(\n", " showlegend=True,\n", " margin=dict(t=30,l=8,b=8,r=10))\n", "fig.update_xaxes(tickfont_family=\"Arial Black\", title_text=\"Invoice Date\")\n", "fig.update_yaxes(tickfont_family=\"Arial Black\", title_text=\"Invoice Count\")\n", "\n", "fig.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The highest number of daily orders was recorded on November 30th, 2018 (141 orders), followed by November 15th, 2019 (136 orders). The lowest was on February 4th, 2019 (just 11 orders)." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "hovertemplate": "invoicedate=%{x}
invoiceno=%{y}", "legendgroup": "", "line": { "color": "#636efa", "dash": "solid", "shape": "spline" }, "marker": { "symbol": "circle" }, "mode": "markers+lines", "name": "", "orientation": "v", "showlegend": false, "type": "scatter", "x": [ "2018-11-01T00:00:00", "2018-12-01T00:00:00", "2019-01-01T00:00:00", "2019-02-01T00:00:00", "2019-03-01T00:00:00", "2019-04-01T00:00:00", "2019-05-01T00:00:00", "2019-06-01T00:00:00", "2019-07-01T00:00:00", "2019-08-01T00:00:00", "2019-09-01T00:00:00", "2019-10-01T00:00:00", "2019-11-01T00:00:00", "2019-12-01T00:00:00" ], "xaxis": "x", "y": [ 268, 1282, 1206, 1071, 1411, 1179, 1744, 1479, 1487, 1405, 1705, 2131, 2831, 575 ], "yaxis": "y" } ], "layout": { "legend": { "tracegroupgap": 0 }, "margin": { "b": 8, "l": 8, "r": 10, "t": 30 }, "showlegend": true, "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, 
"#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], 
"mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "fillpattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, 
"autotypenumbers": "strict", "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { 
"backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "text": "Monthly Orders" }, "xaxis": { "anchor": "y", "domain": [ 0, 1 ], "tickfont": { "family": "Arial Black" }, "title": { "text": "Invoice Date" } }, "yaxis": { "anchor": "x", "domain": [ 0, 1 ], "tickfont": { "family": "Arial Black" }, "title": { "text": "Invoice Count" } } } }, "text/html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import plotly.express as px\n", "\n", "# Truncate each invoice date to the first day of its month, then count unique invoices per month\n", "monthly_orders = orders_data.groupby(orders_data['invoicedate'].dt.to_period(\"M\").dt.to_timestamp())['invoiceno'].nunique().reset_index()\n", "fig = px.line(monthly_orders, x=\"invoicedate\", y=\"invoiceno\", title=\"Monthly Orders\", line_shape=\"spline\", markers=True)\n", "\n", "fig.update_layout(\n", " showlegend=True,\n", " margin=dict(t=30,l=8,b=8,r=10))\n", "fig.update_xaxes(tickfont_family=\"Arial Black\", title_text=\"Invoice Date\")\n", "fig.update_yaxes(tickfont_family=\"Arial Black\", title_text=\"Invoice Count\")\n", "\n", "fig.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Excluding November 2018 and December 2019 (each covers fewer than 8 days of orders), total monthly orders grew substantially from December 2018 to November 2019: from 1,282 to 2,831 orders, an increase of about 121%." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**What are the top ten products by revenue?**" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "alignmentgroup": "True", "hovertemplate": "revenue=%{text}
description=%{y}", "legendgroup": "", "marker": { "color": "#636efa", "pattern": { "shape": "" } }, "name": "", "offsetgroup": "", "orientation": "h", "showlegend": false, "text": [ 0.003, 0.42, 0.65, 0.84, 0.85, 0.95, 0.95, 1.1, 1.25, 1.25 ], "textposition": "outside", "texttemplate": "%{text:.3s}", "type": "bar", "x": [ 0.003, 0.42, 0.65, 0.84, 0.85, 0.95, 0.95, 1.1, 1.25, 1.25 ], "xaxis": "x", "y": [ "pads to match all cushions", "hen house w chick in nest", "set 12 colouring pencils doiley", "vintage blue tinsel reel", "pink crystal guitar phone charm", "happy birthday card teddy/cake", "cat with sunglasses blank card", "60 gold and silver fairy cake cases", "pack 4 flower/butterfly patches", "set 36 colouring pencils doiley" ], "yaxis": "y" } ], "layout": { "barmode": "relative", "legend": { "tracegroupgap": 0 }, "margin": { "b": 8, "l": 8, "r": 10, "t": 30 }, "showlegend": false, "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 
0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 
0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "fillpattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { 
"annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "autotypenumbers": "strict", "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": 
"white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "text": "Top 10 Products by Revenue" }, "xaxis": { "anchor": "y", "domain": [ 0, 1 ], "tickfont": { "family": "Arial Black" }, "tickformat": ",.2f", "tickprefix": "£", "title": { "text": "Revenue (Million)" } }, "yaxis": { "anchor": "x", "domain": [ 0, 1 ], "tickfont": { "family": "Arial Black" }, "title": { "text": "Products" } } } }, "text/html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Sort ascending and keep the last 10 rows, so the highest-revenue products are plotted with the largest bar on top\n", "top_products= orders_data.groupby(\"description\")[\"revenue\"].sum().to_frame().sort_values(by =\"revenue\", ascending=True)\\\n", ".reset_index().tail(10)\n", "\n", "fig = px.bar(top_products, y='description', x='revenue',title= \"Top 10 Products by Revenue\", text='revenue')\n", "fig.update_layout(\n", " showlegend=False,\n", " margin=dict(t=30,l=8,b=8,r=10))\n", "fig.update_traces(texttemplate='%{text:.3s}', textposition='outside')\n", "fig.update_layout(xaxis_tickprefix = '£', xaxis_tickformat = ',.2f')\n", "fig.update_yaxes(tickfont_family=\"Arial Black\", title_text=\"Products\")\n", "fig.update_xaxes(tickfont_family=\"Arial Black\", title_text=\"Revenue\")\n", "\n", "fig.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`regency cakestand 3 tier` and `paper craft, little birdie` are the top two products in terms of revenue generated. `regency cakestand 3 tier` generated about £174,200, the highest of any product." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Which products get cancelled the most?**" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "alignmentgroup": "True", "hovertemplate": "invoice_count=%{text}
description=%{y}", "legendgroup": "", "marker": { "color": "#636efa", "pattern": { "shape": "" } }, "name": "", "offsetgroup": "", "orientation": "h", "showlegend": false, "text": [ 42, 42, 43, 43, 47, 52, 54, 73, 87, 180 ], "textposition": "outside", "texttemplate": "%{text:.2s}", "type": "bar", "x": [ 42, 42, 43, 43, 47, 52, 54, 73, 87, 180 ], "xaxis": "x", "y": [ "white hanging heart t-light holder", "green regency teacup and saucer", "jumbo bag red retrospot", "lunch bag red retrospot", "recipe box pantry yellow design", "strawberry ceramic trinket box", "roses regency teacup and saucer ", "set of 3 cake tins pantry design ", "jam making set with jars", "regency cakestand 3 tier" ], "yaxis": "y" } ], "layout": { "barmode": "relative", "legend": { "tracegroupgap": 0 }, "margin": { "b": 8, "l": 8, "r": 10, "t": 30 }, "showlegend": false, "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 
0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, 
"#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "fillpattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", 
"arrowhead": 0, "arrowwidth": 1 }, "autotypenumbers": "strict", "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", 
"zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "text": "Most Frequently Cancelled Products" }, "xaxis": { "anchor": "y", "domain": [ 0, 1 ], "tickfont": { "family": "Arial Black" }, "title": { "text": "Invoice Count" } }, "yaxis": { "anchor": "x", "domain": [ 0, 1 ], "tickfont": { "family": "Arial Black" }, "title": { "text": "Products" } } } }, "text/html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Count unique cancelled invoices per product; sort ascending and keep the last 10 (the most frequently cancelled)\n", "frequently_cancelled_products= cancelled_orders.groupby(\"description\")[\"invoiceno\"].nunique().sort_values(ascending=True).to_frame().reset_index()\\\n", ".rename(columns={\"invoiceno\":\"invoice_count\"}).tail(10)\n", "\n", "fig = px.bar(frequently_cancelled_products, y='description', x='invoice_count',title= \"Most Frequently Cancelled Products\", text='invoice_count')\n", "fig.update_layout(\n", " showlegend=False,\n", " margin=dict(t=30,l=8,b=8,r=10))\n", "fig.update_traces(texttemplate='%{text:.2s}', textposition='outside')\n", "fig.update_yaxes(tickfont_family=\"Arial Black\", title_text=\"Products\")\n", "fig.update_xaxes(tickfont_family=\"Arial Black\", title_text=\"Invoice Count\")\n", "fig.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Interestingly, `regency cakestand 3 tier` generates the highest revenue, yet it is also the most frequently cancelled product (180 cancelled invoices)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Is there seasonality in revenue?**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "I will visualize daily revenue." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "hovertemplate": "invoicedate=%{x}
revenue=%{y}", "legendgroup": "", "line": { "color": "#636efa", "dash": "solid" }, "marker": { "symbol": "circle" }, "mode": "lines", "name": "", "orientation": "v", "showlegend": false, "type": "scatter", "x": [ "2018-11-29T00:00:00", "2018-11-30T00:00:00", "2018-12-01T00:00:00", "2018-12-03T00:00:00", "2018-12-04T00:00:00", "2018-12-05T00:00:00", "2018-12-06T00:00:00", "2018-12-07T00:00:00", "2018-12-08T00:00:00", "2018-12-10T00:00:00", "2018-12-11T00:00:00", "2018-12-12T00:00:00", "2018-12-13T00:00:00", "2018-12-14T00:00:00", "2018-12-15T00:00:00", "2018-12-17T00:00:00", "2018-12-18T00:00:00", "2018-12-19T00:00:00", "2018-12-20T00:00:00", "2018-12-21T00:00:00", "2019-01-02T00:00:00", "2019-01-03T00:00:00", "2019-01-04T00:00:00", "2019-01-05T00:00:00", "2019-01-07T00:00:00", "2019-01-08T00:00:00", "2019-01-09T00:00:00", "2019-01-10T00:00:00", "2019-01-11T00:00:00", "2019-01-12T00:00:00", "2019-01-14T00:00:00", "2019-01-15T00:00:00", "2019-01-16T00:00:00", "2019-01-17T00:00:00", "2019-01-18T00:00:00", "2019-01-19T00:00:00", "2019-01-21T00:00:00", "2019-01-22T00:00:00", "2019-01-23T00:00:00", "2019-01-24T00:00:00", "2019-01-25T00:00:00", "2019-01-26T00:00:00", "2019-01-28T00:00:00", "2019-01-29T00:00:00", "2019-01-30T00:00:00", "2019-01-31T00:00:00", "2019-02-01T00:00:00", "2019-02-02T00:00:00", "2019-02-04T00:00:00", "2019-02-05T00:00:00", "2019-02-06T00:00:00", "2019-02-07T00:00:00", "2019-02-08T00:00:00", "2019-02-09T00:00:00", "2019-02-11T00:00:00", "2019-02-12T00:00:00", "2019-02-13T00:00:00", "2019-02-14T00:00:00", "2019-02-15T00:00:00", "2019-02-16T00:00:00", "2019-02-18T00:00:00", "2019-02-19T00:00:00", "2019-02-20T00:00:00", "2019-02-21T00:00:00", "2019-02-22T00:00:00", "2019-02-23T00:00:00", "2019-02-25T00:00:00", "2019-02-26T00:00:00", "2019-02-27T00:00:00", "2019-02-28T00:00:00", "2019-03-01T00:00:00", "2019-03-02T00:00:00", "2019-03-04T00:00:00", "2019-03-05T00:00:00", "2019-03-06T00:00:00", "2019-03-07T00:00:00", "2019-03-08T00:00:00", 
"2019-03-09T00:00:00", "2019-03-11T00:00:00", "2019-03-12T00:00:00", "2019-03-13T00:00:00", "2019-03-14T00:00:00", "2019-03-15T00:00:00", "2019-03-16T00:00:00", "2019-03-18T00:00:00", "2019-03-19T00:00:00", "2019-03-20T00:00:00", "2019-03-21T00:00:00", "2019-03-22T00:00:00", "2019-03-23T00:00:00", "2019-03-25T00:00:00", "2019-03-26T00:00:00", "2019-03-27T00:00:00", "2019-03-28T00:00:00", "2019-03-29T00:00:00", "2019-03-30T00:00:00", "2019-04-01T00:00:00", "2019-04-02T00:00:00", "2019-04-03T00:00:00", "2019-04-04T00:00:00", "2019-04-05T00:00:00", "2019-04-06T00:00:00", "2019-04-08T00:00:00", "2019-04-09T00:00:00", "2019-04-10T00:00:00", "2019-04-11T00:00:00", "2019-04-12T00:00:00", "2019-04-13T00:00:00", "2019-04-15T00:00:00", "2019-04-16T00:00:00", "2019-04-17T00:00:00", "2019-04-18T00:00:00", "2019-04-19T00:00:00", "2019-04-24T00:00:00", "2019-04-25T00:00:00", "2019-04-26T00:00:00", "2019-04-29T00:00:00", "2019-05-01T00:00:00", "2019-05-02T00:00:00", "2019-05-03T00:00:00", "2019-05-04T00:00:00", "2019-05-06T00:00:00", "2019-05-07T00:00:00", "2019-05-08T00:00:00", "2019-05-09T00:00:00", "2019-05-10T00:00:00", "2019-05-11T00:00:00", "2019-05-13T00:00:00", "2019-05-14T00:00:00", "2019-05-15T00:00:00", "2019-05-16T00:00:00", "2019-05-17T00:00:00", "2019-05-18T00:00:00", "2019-05-20T00:00:00", "2019-05-21T00:00:00", "2019-05-22T00:00:00", "2019-05-23T00:00:00", "2019-05-24T00:00:00", "2019-05-25T00:00:00", "2019-05-27T00:00:00", "2019-05-29T00:00:00", "2019-05-30T00:00:00", "2019-05-31T00:00:00", "2019-06-01T00:00:00", "2019-06-03T00:00:00", "2019-06-04T00:00:00", "2019-06-05T00:00:00", "2019-06-06T00:00:00", "2019-06-07T00:00:00", "2019-06-08T00:00:00", "2019-06-10T00:00:00", "2019-06-11T00:00:00", "2019-06-12T00:00:00", "2019-06-13T00:00:00", "2019-06-14T00:00:00", "2019-06-15T00:00:00", "2019-06-17T00:00:00", "2019-06-18T00:00:00", "2019-06-19T00:00:00", "2019-06-20T00:00:00", "2019-06-21T00:00:00", "2019-06-22T00:00:00", "2019-06-24T00:00:00", 
"2019-06-25T00:00:00", "2019-06-26T00:00:00", "2019-06-27T00:00:00", "2019-06-28T00:00:00", "2019-06-29T00:00:00", "2019-07-01T00:00:00", "2019-07-02T00:00:00", "2019-07-03T00:00:00", "2019-07-04T00:00:00", "2019-07-05T00:00:00", "2019-07-06T00:00:00", "2019-07-08T00:00:00", "2019-07-09T00:00:00", "2019-07-10T00:00:00", "2019-07-11T00:00:00", "2019-07-12T00:00:00", "2019-07-13T00:00:00", "2019-07-15T00:00:00", "2019-07-16T00:00:00", "2019-07-17T00:00:00", "2019-07-18T00:00:00", "2019-07-19T00:00:00", "2019-07-20T00:00:00", "2019-07-22T00:00:00", "2019-07-23T00:00:00", "2019-07-24T00:00:00", "2019-07-25T00:00:00", "2019-07-26T00:00:00", "2019-07-27T00:00:00", "2019-07-29T00:00:00", "2019-07-30T00:00:00", "2019-07-31T00:00:00", "2019-08-01T00:00:00", "2019-08-02T00:00:00", "2019-08-03T00:00:00", "2019-08-05T00:00:00", "2019-08-06T00:00:00", "2019-08-07T00:00:00", "2019-08-08T00:00:00", "2019-08-09T00:00:00", "2019-08-10T00:00:00", "2019-08-12T00:00:00", "2019-08-13T00:00:00", "2019-08-14T00:00:00", "2019-08-15T00:00:00", "2019-08-16T00:00:00", "2019-08-17T00:00:00", "2019-08-19T00:00:00", "2019-08-20T00:00:00", "2019-08-21T00:00:00", "2019-08-22T00:00:00", "2019-08-23T00:00:00", "2019-08-24T00:00:00", "2019-08-26T00:00:00", "2019-08-28T00:00:00", "2019-08-29T00:00:00", "2019-08-30T00:00:00", "2019-08-31T00:00:00", "2019-09-02T00:00:00", "2019-09-03T00:00:00", "2019-09-04T00:00:00", "2019-09-05T00:00:00", "2019-09-06T00:00:00", "2019-09-07T00:00:00", "2019-09-09T00:00:00", "2019-09-10T00:00:00", "2019-09-11T00:00:00", "2019-09-12T00:00:00", "2019-09-13T00:00:00", "2019-09-14T00:00:00", "2019-09-16T00:00:00", "2019-09-17T00:00:00", "2019-09-18T00:00:00", "2019-09-19T00:00:00", "2019-09-20T00:00:00", "2019-09-21T00:00:00", "2019-09-23T00:00:00", "2019-09-24T00:00:00", "2019-09-25T00:00:00", "2019-09-26T00:00:00", "2019-09-27T00:00:00", "2019-09-28T00:00:00", "2019-09-30T00:00:00", "2019-10-01T00:00:00", "2019-10-02T00:00:00", "2019-10-03T00:00:00", 
"2019-10-04T00:00:00", "2019-10-05T00:00:00", "2019-10-07T00:00:00", "2019-10-08T00:00:00", "2019-10-09T00:00:00", "2019-10-10T00:00:00", "2019-10-11T00:00:00", "2019-10-12T00:00:00", "2019-10-14T00:00:00", "2019-10-15T00:00:00", "2019-10-16T00:00:00", "2019-10-17T00:00:00", "2019-10-18T00:00:00", "2019-10-19T00:00:00", "2019-10-21T00:00:00", "2019-10-22T00:00:00", "2019-10-23T00:00:00", "2019-10-24T00:00:00", "2019-10-25T00:00:00", "2019-10-26T00:00:00", "2019-10-28T00:00:00", "2019-10-29T00:00:00", "2019-10-30T00:00:00", "2019-10-31T00:00:00", "2019-11-01T00:00:00", "2019-11-02T00:00:00", "2019-11-04T00:00:00", "2019-11-05T00:00:00", "2019-11-06T00:00:00", "2019-11-07T00:00:00", "2019-11-08T00:00:00", "2019-11-09T00:00:00", "2019-11-11T00:00:00", "2019-11-12T00:00:00", "2019-11-13T00:00:00", "2019-11-14T00:00:00", "2019-11-15T00:00:00", "2019-11-16T00:00:00", "2019-11-18T00:00:00", "2019-11-19T00:00:00", "2019-11-20T00:00:00", "2019-11-21T00:00:00", "2019-11-22T00:00:00", "2019-11-23T00:00:00", "2019-11-25T00:00:00", "2019-11-26T00:00:00", "2019-11-27T00:00:00", "2019-11-28T00:00:00", "2019-11-29T00:00:00", "2019-11-30T00:00:00", "2019-12-02T00:00:00", "2019-12-03T00:00:00", "2019-12-04T00:00:00", "2019-12-05T00:00:00", "2019-12-06T00:00:00", "2019-12-07T00:00:00" ], "xaxis": "x", "y": [ 57442.33, 47596.42, 44788.9, 30908.670000000002, 51667.12, 81454.99, 44153.98, 49992.52, 56688.13, 17125.65, 36574.67, 44327.11, 30367.56, 48868.14, 41190.659999999996, 7288.47, 25514.81, 43643.41, 4856.57, 11264.84, 15324.88, 31961.52, 39968.45, 27661.67, 15400.68, 23305.29, 67856.96, 23418.55, 19598.32, 46154.96, 6973.89, 27746.68, 94364.53, 25090.4, 20387.35, 31544.2, 10198.1, 24296.87, 28106.65, 19141.17, 23279.97, 19634.46, 6542.61, 22481.3, 28508.93, 20155.79, 22929.2, 23836.22, 3439.67, 25521.62, 19899.76, 15521.33, 14590.39, 22391.91, 5645.48, 26202.48, 39550.76, 24588.36, 24743.38, 15852.4, 9434.4, 35266.79, 32130.82, 25775.21, 22570.98, 20128.56, 9364.71, 19817.39, 
24329.54, 18669.94, 35728.31, 19083.55, 9959.62, 31066.64, 24548.53, 21683.92, 25912.62, 21762.22, 4069.14, 25634.78, 19713.87, 21497.170000000002, 38419.84, 26661.86, 21412.84, 16649.47, 29734.78, 23028.93, 36030.79, 30241.72, 9155.77, 21948.08, 69118.51, 30920.35, 32859.05, 24807.58, 6897.18, 26216.2, 28691.239999999998, 17151.15, 18041.170000000002, 22882.87, 9834.75, 22032.48, 25190.12, 24395.38, 35712.1, 26138.691, 12544.34, 55029.01, 23145.89, 28559.32, 31622.26, 29500.28, 25385.39, 21692.26, 6814.11, 26539.02, 27189.96, 28772.28, 35708.29, 18622.98, 26954.96, 44013.7, 33292.159999999996, 60214.52, 30401.65, 9840.08, 39204.24, 52716.7, 33923.46, 34511.97, 26163.68, 23832, 30588.5, 37015.31, 23927.13, 33083.78, 27582.16, 7266.9400000000005, 21856.75, 19880.24, 31843.6, 16652.65, 24856.760000000002, 17332.55, 37182.86, 40730.44, 45254.29, 61626.229999999996, 12351.94, 21491.63, 40015.69, 48005.93, 33865, 20924.47, 22269.48, 32608.54, 24109.12, 20864.31, 24564.79, 18412.61, 6884.36, 16180.29, 34199.14, 21219.21, 44357.86, 12990.93, 5972.7300000000005, 42684.65, 38769.75, 25662.41, 31605.510000000002, 25448.98, 5953.24, 22043.73, 25177.86, 21840.19, 32826.01, 13640.67, 17100.26, 28167.14, 49686.97, 31639.98, 31728.24, 19571.27, 26987.56, 25372.68, 21254.571, 25132.82, 55710.82, 18476.88, 32806.82, 20857.7, 26111.5, 27800.14, 64709.89, 20887.920000000002, 7543.86, 22798.53, 27338.91, 28056.26, 64974.62, 31516.32, 5701.87, 17077.95, 19141.08, 51866.56, 53928.7, 16901.07, 14230.27, 29443.54, 25573.59, 47190.62, 22707.18, 25320.19, 10686.56, 29219.37, 23786.02, 36740.83, 40007.08, 16878.74, 36276.35, 27998.06, 31781.4, 26609.17, 29592.43, 34904.92, 29297.96, 54071.49, 23347.59, 74526.49, 26982.27, 15637.9, 46030.92, 108364.34, 45895.62, 58442.17, 39013.6, 30966.781, 28451.76, 35107.1, 42444.95, 45837.74, 43137.72, 11531.56, 61786.28, 47547.68, 74686.4, 61799.31, 52855.37, 12273.1, 46182.7, 51804.46, 30762.88, 35688.090000000004, 35062.42, 21767.55, 49296.75, 
44360.78, 34932.57, 60683.53, 62978.39, 12106.19, 44898.77, 39887.45, 35978.840000000004, 48140.77, 39845.95, 33931.41, 52541.72, 28809.9, 44868.41, 62590.96, 61799.62, 42153.85, 84150.01, 54617.93, 64865.13, 68593.52, 52813.44, 32666.69, 111348.61, 58178.16, 62258.29, 58721.66, 49613.18, 33058.46, 48538.76, 61948.8, 75984.45, 48737.6, 46909.52, 20273.09, 54429.43, 68098.41, 56088.1, 50605.15, 55917.17, 24243.47, 80011.23, 55257.31, 72799.09, 77571.19, 198094.63999999998 ], "yaxis": "y" } ], "layout": { "legend": { "tracegroupgap": 0 }, "margin": { "b": 8, "l": 8, "r": 10, "t": 30 }, "showlegend": true, "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 
0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { 
"automargin": true, "type": "pie" } ], "scatter": [ { "fillpattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "autotypenumbers": "strict", "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, 
"#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, 
"ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "text": "Daily Revenues" }, "xaxis": { "anchor": "y", "domain": [ 0, 1 ], "tickfont": { "family": "Arial Black" }, "title": { "text": "Invoice Date" } }, "yaxis": { "anchor": "x", "domain": [ 0, 1 ], "tickfont": { "family": "Arial Black" }, "tickprefix": "£", "title": { "text": "Revenue" } } } }, "text/html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "fig = px.line(orders_data.groupby(orders_data['invoicedate'].astype(\"datetime64[D]\"))[\"revenue\"].sum()\\\n", " .reset_index(),x=\"invoicedate\", y=\"revenue\",title= \"Daily Revenues\")\n", " \n", "fig.update_layout(\n", " showlegend=True,\n", " margin=dict(t=30,l=8,b=8,r=10))\n", "fig.update_layout(yaxis_tickprefix = '£')\n", "fig.update_yaxes(tickfont_family=\"Arial Black\", title_text=\"Revenue\")\n", "fig.update_xaxes(tickfont_family=\"Arial Black\", title_text=\"Invoice Date\")\n", "\n", "fig.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Revenue from December 7, 2019 is extreme, so the plot is not fitting well. Also, November 2018 and December 2019 have just fews days of data, I will filter these months out and plot again; this time, I will plot per month." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "hovertemplate": "invoicedate=%{x}
revenue=%{y}", "legendgroup": "", "line": { "color": "#636efa", "dash": "solid", "shape": "spline" }, "marker": { "symbol": "circle" }, "mode": "markers+lines", "name": "", "orientation": "v", "showlegend": false, "type": "scatter", "x": [ "2018-12-01T00:00:00", "2019-01-01T00:00:00", "2019-02-01T00:00:00", "2019-03-01T00:00:00", "2019-04-01T00:00:00", "2019-05-01T00:00:00", "2019-06-01T00:00:00", "2019-07-01T00:00:00", "2019-08-01T00:00:00", "2019-09-01T00:00:00", "2019-10-01T00:00:00", "2019-11-01T00:00:00" ], "xaxis": "x", "y": [ 670676.2, 719104.18, 502201.3, 671649.94, 497476.191, 784946.06, 698951.08, 722230.941, 765148.93, 963129.031, 1165477.67, 1484959.99 ], "yaxis": "y" } ], "layout": { "legend": { "tracegroupgap": 0 }, "margin": { "b": 8, "l": 8, "r": 10, "t": 30 }, "showlegend": true, "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, 
"#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], 
"mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "fillpattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, 
"autotypenumbers": "strict", "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { 
"backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "text": "Total Monthly Revenues" }, "xaxis": { "anchor": "y", "domain": [ 0, 1 ], "tickfont": { "family": "Arial Black" }, "title": { "text": "Invoice Date" } }, "yaxis": { "anchor": "x", "domain": [ 0, 1 ], "tickfont": { "family": "Arial Black" }, "tickprefix": "£", "title": { "text": "Total Revenue" } } } }, "text/html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "fig = px.line(orders_data.groupby(orders_data[(orders_data[\"date\"]>= \"2018-12-01\") & (orders_data[\"date\"]<= \"2019-11-30\")]['invoicedate']\\\n", " .astype(\"datetime64[M]\"))[\"revenue\"].sum().reset_index(),x=\"invoicedate\", y=\"revenue\",\\\n", " title= \"Total Monthly Revenues\", line_shape= \"spline\", markers= True)\n", " \n", "fig.update_layout(\n", " showlegend=True,\n", " margin=dict(t=30,l=8,b=8,r=10))\n", "fig.update_layout(yaxis_tickprefix = '£')\n", "fig.update_yaxes(tickfont_family=\"Arial Black\", title_text=\"Total Revenue\")\n", "fig.update_xaxes(tickfont_family=\"Arial Black\", title_text=\"Invoice Date\")\n", "\n", "fig.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Even though revenues were low in December 2018 and infact, 20th December 2018 accounted for the lowest revenue, the 2018 data only has two days in November and then December, so arguements could not be raised as probably, the previous months had better or lower revenue. Looking at 2019, it is evident there exit seasonality in revenues as total monthly revenues from January to July 2019 are below £800,000 but revenues were comparatively higher from August to November 2019. Looking at the First graph on daily reveues, December 7, 2019 had the highest revenue - about £198,000." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Interim Conclusion**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Invoice number `573585` had the highest number of products ordered (1113 products). The top ten invoices shows the customers of the store are mostly wholesalers.\n", "- `Kitchen ware` is the most frequently purchased category, `plant and accessories` are the least frequently purchased category.\n", "- The highest daily orders was on November 30th, 2018, followed by November 15th, 2019 (141 and 136 orders respectively). 
The lowest number of daily orders was on 4th February 2019 - just 11 orders.\n", "- The number of total monthly orders increased by about 121% from December 2018 to November 2019.\n", "- `Regency cakestand 3 tier` and `paper craft, little birdie` are the top two products in terms of revenue generated. `Regency cakestand 3 tier` generated about £174,200 in revenue - the highest.\n", "- The most frequently cancelled product is `Regency cakestand 3 tier` - cancelled 180 times.\n", "- Revenues are comparatively lower from January to July and higher from August to November.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 3: Analyze the product range" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Average Revenue (top individual products, and categories)**" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "alignmentgroup": "True", "hovertemplate": "revenue=%{text}
description=%{y}", "legendgroup": "", "marker": { "color": "#636efa", "pattern": { "shape": "" } }, "name": "", "offsetgroup": "", "orientation": "h", "showlegend": false, "text": [ 0.001, 0.42, 0.55, 0.65, 0.7599999999999999, 0.76, 0.834, 0.84, 0.85, 0.8925000000000001 ], "textposition": "outside", "texttemplate": "%{text:.2s}", "type": "bar", "x": [ 0.001, 0.42, 0.55, 0.65, 0.7599999999999999, 0.76, 0.834, 0.84, 0.85, 0.8925000000000001 ], "xaxis": "x", "y": [ "pads to match all cushions", "hen house w chick in nest", "60 gold and silver fairy cake cases", "set 12 colouring pencils doiley", "wine bottle dressing lt.blue", "champagne tray blank card", "mug , dotcomgiftshop.com ", "vintage blue tinsel reel", "pink crystal guitar phone charm", "asstd design bubble gum ring" ], "yaxis": "y" } ], "layout": { "barmode": "relative", "legend": { "tracegroupgap": 0 }, "margin": { "b": 8, "l": 8, "r": 10, "t": 30 }, "showlegend": false, "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, 
"#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, 
"#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "fillpattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } 
}, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "autotypenumbers": "strict", "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": 
"white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "text": "Top 10 Average Revenue by Products" }, "xaxis": { "anchor": "y", "domain": [ 0, 1 ], "tickfont": { "family": "Arial Black" }, "tickprefix": "£", "title": { "text": "Revenue (Million)" } }, "yaxis": { "anchor": "x", "domain": [ 0, 1 ], "tickfont": { "family": "Arial Black" }, "title": { "text": "Products" } } } }, "text/html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "leading_avg_revenues= orders_data.groupby(\"description\")[\"revenue\"].mean().sort_values(ascending=True).to_frame().reset_index()\\\n", ".head(10)\n", "\n", "fig = px.bar(leading_avg_revenues, y='description', x='revenue',title= \"Top 10 Average Revenue by Products\", text='revenue')\n", "fig.update_layout(\n", " showlegend=False,\n", " margin=dict(t=30,l=8,b=8,r=10))\n", "fig.update_traces(texttemplate='%{text:.2s}', textposition='outside')\n", "fig.update_layout(xaxis_tickprefix = '£')\n", "fig.update_yaxes(tickfont_family=\"Arial Black\", title_text=\"Products\")\n", "fig.update_xaxes(tickfont_family=\"Arial Black\", title_text=\"Revenue (Million)\")\n", "\n", "fig.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`Regency cakestand 3 tier` had the highest sum of revenues but on average, `paper craft, little birdie` generated the highest revenue ~ £168,469. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "I will proceed with average revenue by category." ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "alignmentgroup": "True", "hovertemplate": "revenue=%{text}
categories=%{y}", "legendgroup": "", "marker": { "color": "#636efa", "pattern": { "shape": "" } }, "name": "", "offsetgroup": "", "orientation": "h", "showlegend": false, "text": [ 17.6410013445662, 19.09718180371418, 19.58055580006287, 19.7178921673692, 20.318720696697103, 23.120815507616285 ], "textposition": "outside", "texttemplate": "%{text:.3s}", "type": "bar", "x": [ 17.6410013445662, 19.09718180371418, 19.58055580006287, 19.7178921673692, 20.318720696697103, 23.120815507616285 ], "xaxis": "x", "y": [ "kitchenware", "bags_and_toys", "plant_and_accessories", "event_and_party", "others", "home_decor" ], "yaxis": "y" } ], "layout": { "barmode": "relative", "legend": { "tracegroupgap": 0 }, "margin": { "b": 8, "l": 8, "r": 1, "t": 30 }, "showlegend": false, "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], 
"type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { 
"outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "fillpattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "autotypenumbers": "strict", "coloraxis": { 
"colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": 
"white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "text": "Average Revenue by Category" }, "xaxis": { "anchor": "y", "domain": [ 0, 1 ], "tickfont": { "family": "Arial Black" }, "tickformat": ",.2f", "tickprefix": "£", "title": { "text": "Revenue" } }, "yaxis": { "anchor": "x", "domain": [ 0, 1 ], "tickfont": { "family": "Arial Black" }, "title": { "text": "Categories" } } } }, "text/html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "leading_avg_revenues= orders_data.groupby(\"categories\")[\"revenue\"].mean().sort_values(ascending=True).to_frame().reset_index()\\\n", ".head(10)\n", "\n", "fig = px.bar(leading_avg_revenues, y='categories', x='revenue', orientation = \"h\", title= \"Average Revenue by Category\", text='revenue')\n", "fig.update_layout(\n", " showlegend=False,\n", " margin=dict(t=30,l=8,b=8,r=1))\n", "fig.update_traces(texttemplate='%{text:.3s}', textposition='outside')\n", "fig.update_layout(xaxis_tickprefix = '£', xaxis_tickformat = ',.2f')\n", "fig.update_yaxes(tickfont_family=\"Arial Black\", title_text=\"Categories\")\n", "fig.update_xaxes(tickfont_family=\"Arial Black\", title_text=\"Revenue\")\n", "\n", "fig.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Even though `Kitchen ware`is the most frequently ordered category, on average, it generated the least revenue - about £18; `home decorations` generated the highest average revenue ~ £23. Are these averages statistically and significantly different?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Which products are most often sold together**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "I will combine pairs of products (bundles) and count the number of times they were sold together." ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
combinationscount
0(jumbo bag pink polkadot, jumbo bag red retrospot)541
1(green regency teacup and saucer, roses regency teacup and saucer )485
2(jumbo shopper vintage red paisley, jumbo bag red retrospot)465
3(jumbo storage bag suki, jumbo bag red retrospot)453
4(lunch bag red retrospot, lunch bag black skull.)427
5(green regency teacup and saucer, pink regency teacup and saucer)396
6(jumbo bag red retrospot, jumbo bag baroque black white)393
7(jumbo bag apples, jumbo bag red retrospot)389
8(woodland charlotte bag, red retrospot charlotte bag)379
9(alarm clock bakelike green, alarm clock bakelike red )375
\n", "
" ], "text/plain": [ " combinations count\n", "0 (jumbo bag pink polkadot, jumbo bag red retrospot) 541\n", "1 (green regency teacup and saucer, roses regency teacup and saucer ) 485\n", "2 (jumbo shopper vintage red paisley, jumbo bag red retrospot) 465\n", "3 (jumbo storage bag suki, jumbo bag red retrospot) 453\n", "4 (lunch bag red retrospot, lunch bag black skull.) 427\n", "5 (green regency teacup and saucer, pink regency teacup and saucer) 396\n", "6 (jumbo bag red retrospot, jumbo bag baroque black white) 393\n", "7 (jumbo bag apples, jumbo bag red retrospot) 389\n", "8 (woodland charlotte bag, red retrospot charlotte bag) 379\n", "9 (alarm clock bakelike green, alarm clock bakelike red ) 375" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "orders_data[\"product_bundle\"]=orders_data.groupby('invoiceno')['description'].transform(lambda x: ';'.join(x))\n", "product_bundle= orders_data[['invoiceno', \"product_bundle\"]].drop_duplicates()\n", "\n", "count = Counter()\n", "for row in product_bundle[\"product_bundle\"]:\n", " row_list = row.split(';')\n", " count.update(Counter(combinations(row_list,2)))\n", "paired=count.most_common(10)\n", "df = pd.DataFrame(paired,columns=['combinations', \"count\"])\n", "df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There were so many products that were sold together, but `Jumbo bag and pink polkadot` and `jumbo bag red retrospot` were the products most often sold together." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Which products are more often sold by themselves, and which ones are more often combined with others?**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "I will group by invoice number and uniquely count products that were sold alone. " ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
invoicenoitem_count
45363691
65363711
95363741
145363801
255363931
\n", "
" ], "text/plain": [ " invoiceno item_count\n", "4 536369 1\n", "6 536371 1\n", "9 536374 1\n", "14 536380 1\n", "25 536393 1" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "1501" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "by_themselves = orders_data.groupby(\"invoiceno\")[\"description\"].nunique().reset_index()\\\n", ".rename(columns={\"description\": \"item_count\"}).query(\"item_count <2\")\n", "display(by_themselves.head())\n", "display(by_themselves.shape[0])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's visualize: individual products." ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "alignmentgroup": "True", "hovertemplate": "count=%{text}
description=%{y}", "legendgroup": "", "marker": { "color": "#636efa", "pattern": { "shape": "" } }, "name": "", "offsetgroup": "", "orientation": "h", "showlegend": false, "text": [ 10, 10, 10, 11, 11, 12, 13, 14, 15, 15, 16, 19, 23, 30, 32 ], "textposition": "outside", "texttemplate": "%{text:.2s}", "type": "bar", "x": [ 10, 10, 10, 11, 11, 12, 13, 14, 15, 15, 16, 19, 23, 30, 32 ], "xaxis": "x", "y": [ "please one person metal sign", "antique silver tea glass engraved", "jam making set printed", "small popcorn holder", "vintage union jack bunting", "popcorn holder", "rex cash+carry jumbo shopper", "doormat keep calm and come in", "party bunting", "black record cover frame", "jumbo bag red retrospot", "white hanging heart t-light holder", "regency cakestand 3 tier", "chilli lights", "rabbit night light" ], "yaxis": "y" } ], "layout": { "barmode": "relative", "legend": { "tracegroupgap": 0 }, "margin": { "b": 8, "l": 8, "r": 10, "t": 30 }, "showlegend": false, "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" 
], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 
0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "fillpattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { 
"color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "autotypenumbers": "strict", "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": 
"#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "text": "Top 15 Products Sold by Themselves" }, "xaxis": { "anchor": "y", "domain": [ 0, 1 ], "tickfont": { "family": "Arial Black" }, "title": { "text": "Count" } }, "yaxis": { "anchor": "x", "domain": [ 0, 1 ], "tickfont": { "family": "Arial Black" }, "title": { "text": "Products" } } } }, "text/html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "main_products= orders_data.query(\"invoiceno in @by_themselves.invoiceno\")\n", "main_products= main_products[\"description\"].value_counts().to_frame().reset_index().rename(columns={\"index\": \"description\", \"description\": \"count\"})\n", "fig = px.bar(main_products.head(15).sort_values(by= 'count',ascending=True), y='description', x='count',orientation = \"h\",title= \"Top 15 Products Sold by Themselves\", text='count')\n", "fig.update_layout(\n", " showlegend=False,\n", " margin=dict(t=30,l=8,b=8,r=10))\n", "fig.update_traces(texttemplate='%{text:.2s}', textposition='outside')\n", "fig.update_yaxes(tickfont_family=\"Arial Black\", title_text=\"Products\")\n", "fig.update_xaxes(tickfont_family=\"Arial Black\", title_text=\"Count\")\n", "\n", "fig.show()\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There were 1501 products that were sold by themselves. `Rabbit night light` was sold alone 32 time ~ the most sold alone product. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's visualize: additional assortment products." ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "alignmentgroup": "True", "hovertemplate": "count=%{text}
description=%{y}", "legendgroup": "", "marker": { "color": "#636efa", "pattern": { "shape": "" } }, "name": "", "offsetgroup": "", "orientation": "h", "showlegend": false, "text": [ 1177, 1179, 1194, 1215, 1230, 1254, 1299, 1352, 1383, 1474, 1580, 1684, 1984, 2093, 2292 ], "textposition": "outside", "texttemplate": "%{text:.2s}", "type": "bar", "x": [ 1177, 1179, 1194, 1215, 1230, 1254, 1299, 1352, 1383, 1474, 1580, 1684, 1984, 2093, 2292 ], "xaxis": "x", "y": [ "paper chain kit 50's christmas ", "jumbo shopper vintage red paisley", "jumbo storage bag suki", "heart of wicker small", "jumbo bag pink polkadot", "natural slate heart chalkboard ", "lunch bag black skull.", "pack of 72 retrospot cake cases", "set of 3 cake tins pantry design ", "assorted colour bird ornament", "lunch bag red retrospot", "party bunting", "regency cakestand 3 tier", "jumbo bag red retrospot", "white hanging heart t-light holder" ], "yaxis": "y" } ], "layout": { "barmode": "relative", "legend": { "tracegroupgap": 0 }, "margin": { "b": 8, "l": 8, "r": 10, "t": 30 }, "showlegend": false, "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" 
], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 
0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "fillpattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { 
"color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "autotypenumbers": "strict", "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": 
"white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "text": "Top 15 Additional Assortment Products" }, "xaxis": { "anchor": "y", "domain": [ 0, 1 ], "tickfont": { "family": "Arial Black" }, "title": { "text": "Count" } }, "yaxis": { "anchor": "x", "domain": [ 0, 1 ], "tickfont": { "family": "Arial Black" }, "title": { "text": "Products" } } } }, "text/html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "additional_products= orders_data.query(\"invoiceno not in @by_themselves.invoiceno\")\n", "additional_products= additional_products[\"description\"].value_counts().to_frame().reset_index().rename(columns={\"index\": \"description\", \"description\": \"count\"})\n", "fig = px.bar(additional_products.head(15).sort_values(by= 'count',ascending=True), y='description', x='count',orientation = \"h\",title= \"Top 15 Additional Assortment Products\", text='count')\n", "fig.update_layout(\n", " showlegend=False,\n", " margin=dict(t=30,l=8,b=8,r=10))\n", "fig.update_traces(texttemplate='%{text:.2s}', textposition='outside')\n", "fig.update_yaxes(tickfont_family=\"Arial Black\", title_text=\"Products\")\n", "fig.update_xaxes(tickfont_family=\"Arial Black\", title_text=\"Count\")\n", "\n", "fig.show()" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "99.71%\n" ] } ], "source": [ "print(str(round(((orders_data.shape[0] - by_themselves.shape[0])/orders_data.shape[0])*100,2))+\"%\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "About 99.7% of the products were sold together with others - additional assortment. `White hanging heart t-light holder` was the product sold the most with others - about 2300 times. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**What product groups are more often included in the additional assortment?**" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "alignmentgroup": "True", "hovertemplate": "count=%{text}
categories=%{y}", "legendgroup": "", "marker": { "color": "#636efa", "pattern": { "shape": "" } }, "name": "", "offsetgroup": "", "orientation": "h", "showlegend": false, "text": [ 15840, 58766, 80974, 84813, 125204, 152610 ], "textposition": "outside", "texttemplate": "%{text:.3s}", "type": "bar", "x": [ 15840, 58766, 80974, 84813, 125204, 152610 ], "xaxis": "x", "y": [ "plant_and_accessories", "others", "bags_and_toys", "home_decor", "event_and_party", "kitchenware" ], "yaxis": "y" } ], "layout": { "barmode": "relative", "legend": { "tracegroupgap": 0 }, "margin": { "b": 5, "l": 2, "r": 1, "t": 30 }, "showlegend": false, "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 }, "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { 
"colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "pattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": 
"parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "fillpattern": { "fillmode": "overlay", "size": 10, "solidity": 0.2 }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "autotypenumbers": "strict", "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 
0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { 
"color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } }, "title": { "text": "Top Additional Assortment Categories" }, "xaxis": { "anchor": "y", "domain": [ 0, 1 ], "tickfont": { "family": "Arial Black" }, "title": { "text": "Count" } }, "yaxis": { "anchor": "x", "domain": [ 0, 1 ], "tickfont": { "family": "Arial Black" }, "title": { "text": "Categories" } } } }, "text/html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "additional_group_assortment = orders_data.groupby(\"invoiceno\")[\"categories\"].nunique().reset_index()\\\n", ".rename(columns={\"categories\": \"item_count\"}).query(\"item_count > 1\")\n", "\n", "additional_product_group= orders_data.query(\"invoiceno in @additional_group_assortment.invoiceno\")\n", "additional_product_group= additional_product_group[\"categories\"].value_counts().to_frame().reset_index().rename(columns={\"index\": \"categories\", \"categories\": \"count\"})\n", "fig = px.bar(additional_product_group.sort_values(by= 'count',ascending=True), y='categories', x='count', orientation =\"h\", title= \"Top Additional Assortment Categories\",text='count')\n", "fig.update_layout(\n", " showlegend=False,\n", " margin=dict(t=30,l=2,b=5,r=1), )\n", "fig.update_traces(texttemplate='%{text:.3s}', textposition='outside')\n", "fig.update_yaxes(tickfont_family=\"Arial Black\", title_text=\"Categories\")\n", "fig.update_xaxes(tickfont_family=\"Arial Black\", title_text=\"Count\")\n", "\n", "fig.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`Kitchen ware` was most often included in additioanl assortment (about 152,610 times) and `plant and accessories` was the least - about 15,840 times. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**What bundles of product groups are often present in shopping carts?** " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "I will combine pairs of product categories (bundles) and count the number of times they were sold together." ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr><th></th><th>combinations</th><th>count</th></tr>\n", "  </thead>\n", "  <tbody>\n",
"    <tr><th>0</th><td>(kitchenware, kitchenware)</td><td>2694220</td></tr>\n",
"    <tr><th>1</th><td>(event_and_party, kitchenware)</td><td>2272974</td></tr>\n",
"    <tr><th>2</th><td>(kitchenware, event_and_party)</td><td>2025698</td></tr>\n",
"    <tr><th>3</th><td>(event_and_party, event_and_party)</td><td>1900017</td></tr>\n",
"    <tr><th>4</th><td>(kitchenware, home_decor)</td><td>1374740</td></tr>\n",
"    <tr><th>5</th><td>(home_decor, kitchenware)</td><td>1351066</td></tr>\n",
"    <tr><th>6</th><td>(bags_and_toys, kitchenware)</td><td>1324704</td></tr>\n",
"    <tr><th>7</th><td>(event_and_party, home_decor)</td><td>1231966</td></tr>\n",
"    <tr><th>8</th><td>(kitchenware, bags_and_toys)</td><td>1173235</td></tr>\n",
"    <tr><th>9</th><td>(home_decor, event_and_party)</td><td>1091238</td></tr>\n",
"  </tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " combinations count\n", "0 (kitchenware, kitchenware) 2694220\n", "1 (event_and_party, kitchenware) 2272974\n", "2 (kitchenware, event_and_party) 2025698\n", "3 (event_and_party, event_and_party) 1900017\n", "4 (kitchenware, home_decor) 1374740\n", "5 (home_decor, kitchenware) 1351066\n", "6 (bags_and_toys, kitchenware) 1324704\n", "7 (event_and_party, home_decor) 1231966\n", "8 (kitchenware, bags_and_toys) 1173235\n", "9 (home_decor, event_and_party) 1091238" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "orders_data[\"categories_bundle\"]=orders_data.groupby('invoiceno')['categories'].transform(lambda x: ';'.join(x))\n", "categories_bundle= orders_data[['invoiceno', \"categories_bundle\"]].drop_duplicates()\n", "count = Counter()\n", "for row in categories_bundle[\"categories_bundle\"]:\n", " row_list = row.split(';')\n", " count.update(Counter(combinations(row_list,2)))\n", "paired1=count.most_common(10)\n", "df1 = pd.DataFrame(paired1,columns=['combinations', \"count\"])\n", "df1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It can be observed that range of products under `Kitchen ware` were mostly present in shopping carts. Considering the groups, `event and party`category was mostly present in shopping carts with `Kitchen ware`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Interim Conclusion**\n", "\n", "- on average, `paper craft, little birdie` generated the highest revenue - about £168,469.\n", "- on average, `Kitchen ware` generated the least revenue - about £18 while `home decorations` generated the highest- about £23.\n", "- There were 1501 products that were sold by themselves. `Rabbit night light` was sold alone 32 times- the most sold alone product. \n", "- About 99.7% of the products were sold together with others - additional assortment.\n", "- `Jumbo bag and pink polkadot` and `jumbo bag red retrospot` were the products most often sold together. 
\n", "- `White hanging heart t-light holder` was the product most often sold with others - about 2300 times. \n", "- `Kitchen ware` was most often included in the additional assortment (about 152,610 times) and `plant and accessories` least often (about 15,840 times). \n", "- The `event and party` category was the one most often present in shopping carts with `Kitchen ware`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 4: Formulate and test statistical hypotheses\n", "\n", "I will formulate and test two hypotheses. Since both compare averages, I will use a two-sample t-test. Before each t-test, I will run Levene's test for equality of variances so that I can pass the right option to the t-test, i.e. equal or unequal variances (`equal_var`)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Hypothesis 1:**\n", "\n", "H0: There is no statistically significant difference in average revenue between the categories `home_decor` and `kitchenware`. \n", "\n", "H1: There is a statistically significant difference in average revenue between the categories `home_decor` and `kitchenware`. 
\n", "\n", "Alpha = 0.05" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "p-value (levene test): 0.0000\n", "We reject the null hypothesis\n" ] } ], "source": [ "home_decor = orders_data[orders_data[\"categories\"]==\"home_decor\"]\n", "kitchenware = orders_data[orders_data[\"categories\"]==\"kitchenware\"]\n", "\n", "alpha = .05 \n", "# Levene's test: the null hypothesis is that both groups have equal variances\n", "result = st.levene(home_decor[\"revenue\"], kitchenware[\"revenue\"])\n", "\n", "print('p-value (levene test): {:.4f}'.format(result.pvalue))\n", "\n", "\n", "if (result.pvalue < alpha):\n", " print(\"We reject the null hypothesis\")\n", "else:\n", " print(\"We can't reject the null hypothesis\")" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "p-value: 0.0000\n", "We reject H0: there is statistically significant difference between average revenue from categories home decor and kitchen ware.\n" ] } ], "source": [ "alpha = .05 \n", "home_decor = orders_data[orders_data[\"categories\"]==\"home_decor\"]\n", "kitchenware = orders_data[orders_data[\"categories\"]==\"kitchenware\"]\n", "# Levene's test rejected equal variances, so use Welch's t-test (equal_var=False)\n", "result = st.ttest_ind(home_decor[\"revenue\"], kitchenware[\"revenue\"],equal_var=False)\n", "\n", "print('p-value: {:.4f}'.format(result.pvalue))\n", "\n", "if (result.pvalue < alpha):\n", " print(\"We reject H0: there is statistically significant difference between average revenue from categories home decor and kitchen ware.\")\n", "else:\n", " print(\"We can't reject H0: there is statistically no significant difference between average revenue from categories home decor and kitchen ware.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Hypothesis 2:**\n", "\n", "H0: There is no statistically significant difference in average revenue from the product `paper craft , little birdie` and all other products.\n", "\n", "H1: There is a statistically significant difference in 
average revenue from the product `paper craft , little birdie` and all other products. \n", "\n", "Alpha = 0.05" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "p-value (levene test): 0.9123\n", "We can't reject the null hypothesis\n" ] } ], "source": [ "alpha = .05 \n", "\n", "paper_craft = orders_data[orders_data[\"description\"]==\"paper craft , little birdie\"]\n", "non_paper_craft = orders_data[orders_data[\"description\"]!=\"paper craft , little birdie\"]\n", " \n", "# Levene's test: the null hypothesis is that both groups have equal variances\n", "results = st.levene(paper_craft[\"revenue\"], non_paper_craft[\"revenue\"])\n", "\n", "print('p-value (levene test): {:.4f}'.format(results.pvalue))\n", "\n", "\n", "if (results.pvalue < alpha):\n", " print(\"We reject the null hypothesis\")\n", "else:\n", " print(\"We can't reject the null hypothesis\")" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "p-value: 0.9123\n", "We can't reject H0: there is statistically no significant difference between average revenue from the product 'paper craft, little birdie' and all other products.\n" ] } ], "source": [ "alpha = .05 \n", "\n", "result = st.ttest_ind(paper_craft[\"revenue\"], non_paper_craft[\"revenue\"],equal_var=True)\n", "\n", "# Report the t-test result (result), not the Levene result (results) from the previous cell\n", "print('p-value: {:.4f}'.format(result.pvalue))\n", "\n", "if (result.pvalue < alpha):\n", " print(\"We reject H0: there is statistically significant difference between average revenue from the product 'paper craft, little birdie' and all other products.\")\n", "else:\n", " print(\"We can't reject H0: there is statistically no significant difference between average revenue from the product 'paper craft, little birdie' and all other products.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 5: Conclusion and Recommendations" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Conclusion**\n", "\n", "- There were 
4206 unique products. The maximum ordered quantity was 80,995.\n", "- The quantity of products ordered had a mean of about 10 and a standard deviation of about 219.\n", "- The highest unit price of a product was £38,970.00.\n", "- The mean unit price was about £4.6.\n", "- Invoice number `573585` had the highest number of products ordered (1113 products). The top ten invoices suggest that the store's customers are mostly wholesalers.\n", "- `Kitchen ware` is the most frequently purchased category and `plant and accessories` is the least frequently purchased.\n", "- Daily orders peaked on 30 November 2018 (141 orders), followed by 15 November 2019 (136 orders). The lowest daily count was on 4 February 2019, with just 11 orders.\n", "- Total monthly orders increased by about 121% from December 2018 to November 2019.\n", "- Revenue is comparatively lower from January to July and higher from August to November.\n", "- `Regency cakestand 3 tier` and `paper craft little birdie` are the top two products by total revenue. `Regency cakestand 3 tier` generated the most, about £174,200.00.\n", "- The most cancelled product is `Regency cakestand 3 tier`, which was cancelled 180 times.\n", "- On average, `paper craft, little birdie` generated the highest revenue, about £168,469.\n", "- On average, `Kitchen ware` generated the least revenue (about £18), while `home decorations` generated the most (about £23).\n", "- There were 1501 products that were sold by themselves. `Rabbit night light` was sold alone most often, 32 times.\n", "- About 99.7% of the products were sold together with others (additional assortment).\n", "- `Jumbo bag and pink polkadot` and `jumbo bag red retrospot` were the products most often sold together.\n", "- `White hanging heart t-light holder` was the product most often sold with others, about 2300 times. 
\n", "- `Kitchen ware` was the category most often included in the additional assortment (about 152,610 times) and `plant and accessories` the least (about 15,840 times).\n", "- The `event and party` category was mostly present in shopping carts with `Kitchen ware`.\n", "\n", "- The difference between the average revenue from `home decorations` and `Kitchen ware` is statistically significant.\n", "- The average revenue generated by `paper craft little birdie` is not statistically significantly different from the average revenue from all other products." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Recommendations**\n", "\n", "- Since about 99.7% of products are included in additional assortment, there is a need to create a product recommendation system.\n", "- `Home decorations` is the third most purchased category but has the highest average revenue. Hence, invest more in advertising `home decorations` to boost its purchase rate and revenue.\n", "- As `plant and accessories` is the least frequently purchased category, increase advertising investment to raise its orders.\n", "- `Regency cakestand 3 tier` is the leading revenue generator in aggregate but also the most cancelled product. Pay close attention to this product: for instance, why is it cancelled so often? Reducing its cancellation rate would increase revenue. 
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**References**\n", "\n", "- [6 Benefits of Big Data Analytics for E-Commerce](https://www.octoparse.com/blog/benefits-of-big-data-analytics-for-e-commerce)\n", "- [Guide to Text Classification with Machine Learning & NLP](https://monkeylearn.com/text-classification/) \n", "- [Text Classification Using Naive Bayes](https://youtu.be/60pqgfT5tZM)\n", "- [Product Sales Analysis](https://medium.com/swlh/product-sales-analysis-using-python-863b29026957)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.11" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": false, "sideBar": true, "skip_h1_title": true, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": true, "toc_position": { "height": "calc(100% - 180px)", "left": "10px", "top": "150px", "width": "165px" }, "toc_section_display": true, "toc_window_display": false }, "varInspector": { "cols": { "lenName": 16, "lenType": 16, "lenVar": 40 }, "kernels_config": { "python": { "delete_cmd_postfix": "", "delete_cmd_prefix": "del ", "library": "var_list.py", "varRefreshCmd": "print(var_dic_list())" }, "r": { "delete_cmd_postfix": ") ", "delete_cmd_prefix": "rm(", "library": "var_list.r", "varRefreshCmd": "cat(var_dic_list()) " } }, "types_to_exclude": [ "module", "function", "builtin_function_or_method", "instance", "_Feature" ], "window_display": false } }, "nbformat": 4, "nbformat_minor": 4 }