{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "### Heroes Of Pymoli Data Analysis\n", "* Of the 576 active players, the vast majority are male (84%). There also exists, a smaller, but notable proportion of female players (14%).\n", "\n", "* Our peak age demographic falls between 20-24 (44.8%) with secondary groups falling between 15-19 (18.60%) and 25-29 (13.4%). \n", "\n", "\n", "
    Conclusion:\n", "
  1. The total revenue is \\$ 2379.77 from 780 purchases out of 179 unique items, in which the average item price is \\$ 3.05.
  2. \n", "
  3. Males have spent almost three times more on purchases than females (\\$1967.64-male vs \\$361.94-female). This would be an interesting topic as to why females prefer playing computer games less than males, in general.
  4. \n", "
  5. The teenages aged less than 19 years old made up 25.35% of the group, which is concerning on the total time consumption these teenagers would spend on playing computer games. Their mental and physical health related to the time consumption on playing games should raise public awareness.
  6. \n", "
  7. The \"Final Critic\" game is the most profitable item and the most popular item, which has been sold 13 times and made $60 as the total purchase value.
  8. \n", "
\n", " \n", "\n", "-----" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "# Dependencies and Setup\n", "import pandas as pd\n", "\n", "# File to Load \n", "# file_to_load = \"Resources/purchase_data.csv\"\n", "file_to_load = \"purchase_data.csv\"\n", "# Read Purchasing File and store into Pandas data frame\n", "purchase_data = pd.read_csv(file_to_load)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As final considerations:\n", "\n", "You must use the Pandas Library and the Jupyter Notebook.\n", "You must submit a link to your Jupyter Notebook with the viewable Data Frames.\n", "You must include a written description of three observable trends based on the data.\n", "See Example Solution for a reference on expected format." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Purchase IDAgeItem IDPrice
count780.000000780.000000780.000000780.000000
mean389.50000022.71410391.7551283.050987
std225.3108966.65944452.6977021.169549
min0.0000007.0000000.0000001.000000
25%194.75000020.00000047.7500001.980000
50%389.50000022.00000092.0000003.150000
75%584.25000025.000000138.0000004.080000
max779.00000045.000000183.0000004.990000
\n", "
" ], "text/plain": [ " Purchase ID Age Item ID Price\n", "count 780.000000 780.000000 780.000000 780.000000\n", "mean 389.500000 22.714103 91.755128 3.050987\n", "std 225.310896 6.659444 52.697702 1.169549\n", "min 0.000000 7.000000 0.000000 1.000000\n", "25% 194.750000 20.000000 47.750000 1.980000\n", "50% 389.500000 22.000000 92.000000 3.150000\n", "75% 584.250000 25.000000 138.000000 4.080000\n", "max 779.000000 45.000000 183.000000 4.990000" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "purchase_data.describe()" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Purchase IDSNAgeGenderItem IDItem NamePrice
00Lisim7820Male108Extraction, Quickblade Of Trembling Hands3.53
11Lisovynya3840Male143Frenzied Scimitar1.56
22Ithergue4824Male92Final Critic4.88
33Chamassasya8624Male100Blindscythe3.27
44Iskosia9023Male131Fury1.44
\n", "
" ], "text/plain": [ " Purchase ID SN Age Gender Item ID \\\n", "0 0 Lisim78 20 Male 108 \n", "1 1 Lisovynya38 40 Male 143 \n", "2 2 Ithergue48 24 Male 92 \n", "3 3 Chamassasya86 24 Male 100 \n", "4 4 Iskosia90 23 Male 131 \n", "\n", " Item Name Price \n", "0 Extraction, Quickblade Of Trembling Hands 3.53 \n", "1 Frenzied Scimitar 1.56 \n", "2 Final Critic 4.88 \n", "3 Blindscythe 3.27 \n", "4 Fury 1.44 " ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "purchase_data.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Player Count" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Display the total number of players\n" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Total Players
0576
\n", "
" ], "text/plain": [ " Total Players\n", "0 576" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "player_number = len(purchase_data['SN'].unique())\n", "\n", "total_players = pd.DataFrame([\n", " {\"Total Players\": player_number}\n", "])\n", "\n", "total_players" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Purchasing Analysis (Total)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Run basic calculations to obtain number of unique items, average price, etc.\n", "\n", "\n", "* Create a summary data frame to hold the results\n", "\n", "\n", "* Optional: give the displayed data cleaner formatting\n", "\n", "\n", "* Display the summary data frame\n" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Number of Unique ItemsAverage PriceNumber of PurchasesTotal Revenue
0179$3.05780$2379.77
\n", "
" ], "text/plain": [ " Number of Unique Items Average Price Number of Purchases Total Revenue\n", "0 179 $3.05 780 $2379.77" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "unique_items=len(purchase_data[\"Item ID\"].unique())\n", "avg_price=purchase_data[\"Price\"].mean()\n", "purchase_number = sum(purchase_data[\"Item ID\"].value_counts())\n", "total_revenue = purchase_data[\"Price\"].sum()\n", "\n", "p_analysis = pd.DataFrame([\n", " {\"Number of Unique Items\": unique_items,\n", " \"Average Price\": avg_price,\n", " \"Number of Purchases\": purchase_number,\n", " \"Total Revenue\": total_revenue}])\n", "\n", "p_analysis[\"Average Price\"]=p_analysis[\"Average Price\"].map(\"${:.2f}\".format)\n", "p_analysis[\"Total Revenue\"]=p_analysis[\"Total Revenue\"].map(\"${:.2f}\".format)\n", "p_analysis" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Gender Demographics" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Percentage and Count of Male Players\n", "\n", "\n", "* Percentage and Count of Female Players\n", "\n", "\n", "* Percentage and Count of Other / Non-Disclosed\n", "\n", "\n" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Total CountPercentage of Players
Gender
Male48484.03%
Female8114.06%
Other / Non-Disclosed111.91%
\n", "
" ], "text/plain": [ " Total Count Percentage of Players\n", "Gender \n", "Male 484 84.03%\n", "Female 81 14.06%\n", "Other / Non-Disclosed 11 1.91%" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "demo=pd.DataFrame(purchase_data.groupby('SN')[\"Gender\"].max())\n", "\n", "gender_demo=pd.DataFrame(demo.groupby('Gender')['Gender'].count())\n", "total=len(demo['Gender'])\n", "gender_demo[\"Percentage of Players\"]= (gender_demo['Gender']/total)*100\n", "gender_demo[\"Percentage of Players\"]=gender_demo[\"Percentage of Players\"].map(\"{0:.2f}%\".format)\n", "\n", "gender_demo=gender_demo.rename(columns={'Gender': 'Total Count'})\n", "gender_final=gender_demo.sort_values('Total Count', ascending=False)\n", "gender_final" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## Purchasing Analysis (Gender)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender\n", "\n", "\n", "\n", "\n", "* Create a summary data frame to hold the results\n", "\n", "\n", "* Optional: give the displayed data cleaner formatting\n", "\n", "\n", "* Display the summary data frame" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
GenderPurchase CountAverage Purchase PriceTotal Purchase ValueAvg Total Purchase per Person
0Female113$3.20$361.94$4.47
1Male652$3.02$1967.64$4.07
2Other / Non-Disclosed15$3.35$50.19$4.56
\n", "
" ], "text/plain": [ " Gender Purchase Count Average Purchase Price \\\n", "0 Female 113 $3.20 \n", "1 Male 652 $3.02 \n", "2 Other / Non-Disclosed 15 $3.35 \n", "\n", " Total Purchase Value Avg Total Purchase per Person \n", "0 $361.94 $4.47 \n", "1 $1967.64 $4.07 \n", "2 $50.19 $4.56 " ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "\n", "count= purchase_data.groupby('Gender')['Item Name'].count().values.tolist()\n", "avgprice=purchase_data.groupby('Gender')['Price'].mean().values.tolist()\n", "totalprice=purchase_data.groupby('Gender')['Price'].sum().values.tolist()\n", "\n", "\n", "sumpricepp=purchase_data.groupby(['Gender','SN'])['Price'].sum().to_frame(name='Sum').reset_index()\n", "ppcount=purchase_data.groupby(['Gender','SN'])['SN'].count().to_frame(name='Count').reset_index()\n", "merge_pricepp=pd.merge(sumpricepp, ppcount, on=\"SN\", how=\"outer\")\n", "merge_pricepp[\"Average\"]=(merge_pricepp['Sum']/merge_pricepp['Count']).astype(\"float\")\n", "avgpricepp=merge_pricepp.groupby(['Gender_y'])['Sum'].mean()\n", "avgpricepp_list = avgpricepp.values.tolist()\n", "\n", "purchase_gender_df=pd.DataFrame([\n", " {\"Gender\": \"Female\", \"Purchase Count\": count[0], \"Average Purchase Price\": avgprice[0], \n", " \"Total Purchase Value\": totalprice[0], \"Avg Total Purchase per Person\": avgpricepp_list[0]},\n", " {\"Gender\": \"Male\", \"Purchase Count\": count[1], \"Average Purchase Price\": avgprice[1], \n", " \"Total Purchase Value\": totalprice[1], \"Avg Total Purchase per Person\": avgpricepp_list[1]},\n", " {\"Gender\": \"Other / Non-Disclosed\", \"Purchase Count\": count[2], \"Average Purchase Price\": avgprice[2], \n", " \"Total Purchase Value\": totalprice[2], \"Avg Total Purchase per Person\": avgpricepp_list[2]}\n", "])\n", "\n", "purchase_gender_df[\"Average Purchase Price\"]=purchase_gender_df[\"Average Purchase Price\"].map(\"${:.2f}\".format)\n", "purchase_gender_df[\"Total Purchase Value\"]=purchase_gender_df[\"Total Purchase Value\"].map(\"${:.2f}\".format)\n", "purchase_gender_df[\"Avg Total Purchase per Person\"]=purchase_gender_df[\"Avg Total Purchase per Person\"].map(\"${:.2f}\".format)\n", "purchase_gender_df=purchase_gender_df.reset_index(drop=True)\n", "purchase_gender_df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Age Demographics" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Establish bins for ages\n", "\n", "\n", "* Categorize the existing players using the age bins. Hint: use pd.cut()\n", "\n", "\n", "* Calculate the numbers and percentages by age group\n", "\n", "\n", "* Create a summary data frame to hold the results\n", "\n", "\n", "* Optional: round the percentage column to two decimal points\n", "\n", "\n", "* Display Age Demographics Table\n" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Total CountsPercentage of Players
Age Range
<10172.95%
10-14223.82%
15-1910718.58%
20-2425844.79%
25-297713.37%
30-34529.03%
35-39315.38%
40+122.08%
\n", "
" ], "text/plain": [ " Total Counts Percentage of Players\n", "Age Range \n", "<10 17 2.95%\n", "10-14 22 3.82%\n", "15-19 107 18.58%\n", "20-24 258 44.79%\n", "25-29 77 13.37%\n", "30-34 52 9.03%\n", "35-39 31 5.38%\n", "40+ 12 2.08%" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "age_bins=[0,9,14,19,24,29,34,39,50]\n", "age_labels=[\"<10\",\"10-14\",\"15-19\",\"20-24\",\"25-29\",\"30-34\",\"35-39\",\"40+\"]\n", "\n", "uniquesn=pd.DataFrame(purchase_data.groupby('SN')['Age'].mean())\n", "uniquesn[\"Age Range\"]=pd.cut(uniquesn['Age'],age_bins,labels=age_labels)\n", "\n", "new_age_df=pd.DataFrame(uniquesn.groupby(\"Age Range\")[\"Age\"].count())\n", "new_age_df=new_age_df.rename(columns={\"Age\":\"Total Counts\"})\n", "\n", "new_age_df[\"Percentage of Players\"]=(new_age_df[\"Total Counts\"]/(new_age_df[\"Total Counts\"].sum()))*100\n", "new_age_df[\"Percentage of Players\"] = new_age_df[\"Percentage of Players\"].map(\"{0:,.2f}%\".format)\n", "new_age_df\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Purchasing Analysis (Age)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Bin the purchase_data data frame by age\n", "\n", "\n", "* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below\n", "\n", "\n", "* Create a summary data frame to hold the results\n", "\n", "\n", "* Optional: give the displayed data cleaner formatting\n", "\n", "\n", "* Display the summary data frame" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Purchase CountAverage Purchase PriceTotal Purchase ValueAvg Total Purchase per Person
Age Range
<1023$3.35$77.13$4.54
10-1428$2.96$82.78$3.76
15-19136$3.04$412.89$3.86
20-24365$3.05$1114.06$4.32
25-29101$2.90$293.00$3.81
30-3473$2.93$214.00$4.12
35-3941$3.60$147.67$4.76
40+13$2.94$38.24$3.19
\n", "
" ], "text/plain": [ " Purchase Count Average Purchase Price Total Purchase Value \\\n", "Age Range \n", "<10 23 $3.35 $77.13 \n", "10-14 28 $2.96 $82.78 \n", "15-19 136 $3.04 $412.89 \n", "20-24 365 $3.05 $1114.06 \n", "25-29 101 $2.90 $293.00 \n", "30-34 73 $2.93 $214.00 \n", "35-39 41 $3.60 $147.67 \n", "40+ 13 $2.94 $38.24 \n", "\n", " Avg Total Purchase per Person \n", "Age Range \n", "<10 $4.54 \n", "10-14 $3.76 \n", "15-19 $3.86 \n", "20-24 $4.32 \n", "25-29 $3.81 \n", "30-34 $4.12 \n", "35-39 $4.76 \n", "40+ $3.19 " ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "purchase_data[\"Age Range\"]=pd.cut(purchase_data['Age'],age_bins,labels=age_labels)\n", "\n", "\n", "age_purchase=purchase_data.groupby(\"Age Range\")[\"Item ID\"].count()\n", "avg_price=purchase_data.groupby(\"Age Range\")['Price'].mean()\n", "total_price=purchase_data.groupby(\"Age Range\")['Price'].sum()\n", "\n", "# calculate average total price per person\n", "ppcount=pd.DataFrame(purchase_data.groupby('SN')['Item ID'].count()) \n", "pptotal=pd.DataFrame(purchase_data.groupby('SN')['Price'].sum())\n", "ppmerge_df=pd.merge(ppcount, pptotal, on=\"SN\", how=\"outer\")\n", "aronly_df=pd.DataFrame(purchase_data.groupby('SN')['Age Range'].max())\n", "pparmerge=pd.merge(aronly_df, ppmerge_df, on=\"SN\", how=\"outer\")\n", "avgpp_df=pparmerge.groupby(\"Age Range\")[\"Price\"].mean()\n", "\n", "\n", "# final merge all dataframes to store all info\n", "finalmerge1=pd.merge(age_purchase, avg_price, on=\"Age Range\", how=\"outer\")\n", "finalmerge2=pd.merge(finalmerge1, total_price, on=\"Age Range\", how=\"outer\")\n", "finalmerge3=pd.merge(finalmerge2, avgpp_df, on=\"Age Range\", how=\"outer\")\n", "\n", "\n", "# format\n", "finalmerge3=finalmerge3.rename(columns={\"Item ID\":\"Purchase Count\",\"Price_x\": \"Average Purchase Price\",\n", " \"Price_y\":\"Total Purchase Value\",\"Price\":\"Avg Total Purchase per Person\"})\n", "\n", "finalmerge3[\"Purchase Count\"] = finalmerge3[\"Purchase Count\"]\n", "finalmerge3[\"Average Purchase Price\"] = finalmerge3[\"Average Purchase Price\"].map(\"${:.2f}\".format)\n", "finalmerge3[\"Total Purchase Value\"] = finalmerge3[\"Total Purchase Value\"].map(\"${:.2f}\".format)\n", "finalmerge3[\"Avg Total Purchase per Person\"] = finalmerge3[\"Avg Total Purchase per Person\"].map(\"${:.2f}\".format)\n", "\n", "finalmerge3" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Top Spenders" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Run basic calculations to obtain the results in the table below\n", "\n", "\n", "* Create a summary data frame to hold the results\n", "\n", "\n", "* Sort the total purchase value column in descending order\n", "\n", "\n", "* Optional: give the displayed data cleaner formatting\n", "\n", "\n", "* Display a preview of the summary data frame\n", "\n" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Purchase CountAverage Purchase ValueTotal Purchase Value
SN
Lisosia935$3.79$18.96
Idastidru524$3.86$15.45
Chamjask733$4.61$13.83
Iral744$3.40$13.62
Iskadarya953$4.37$13.10
\n", "
" ], "text/plain": [ " Purchase Count Average Purchase Value Total Purchase Value\n", "SN \n", "Lisosia93 5 $3.79 $18.96\n", "Idastidru52 4 $3.86 $15.45\n", "Chamjask73 3 $4.61 $13.83\n", "Iral74 4 $3.40 $13.62\n", "Iskadarya95 3 $4.37 $13.10" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "\n", "ppavg=pd.DataFrame(purchase_data.groupby('SN')['Price'].mean())\n", "topspender=pd.merge(pparmerge, ppavg, on=\"SN\", how='outer')\n", "\n", "topspender=topspender.rename(columns={\"Item ID\": \"Purchase Count\", \"Price_y\":\"Average Purchase Value\", \n", " \"Price_x\":\"Total Purchase Value\"})\n", "topspender=topspender.sort_values(\"Total Purchase Value\", ascending=False)\n", "del topspender['Age Range']\n", "\n", "# formatting\n", "topspender=topspender[[\"Purchase Count\", \"Average Purchase Value\", \"Total Purchase Value\"]]\n", "topspender[\"Total Purchase Value\"] = topspender[\"Total Purchase Value\"].map(\"${:.2f}\".format)\n", "topspender[\"Average Purchase Value\"] = topspender[\"Average Purchase Value\"].map(\"${:.2f}\".format)\n", "\n", "topspender[:5]\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Most Popular Items" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Retrieve the Item ID, Item Name, and Item Price columns\n", "\n", "\n", "* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value\n", "\n", "\n", "* Create a summary data frame to hold the results\n", "\n", "\n", "* Sort the purchase count column in descending order\n", "\n", "\n", "* Optional: give the displayed data cleaner formatting\n", "\n", "\n", "* Display a preview of the summary data frame\n", "\n" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Purchase CountItem PriceTotal Purchase Value
Item IDItem Name
92Final Critic13$4.61$59.99
178Oathbreaker, Last Hope of the Breaking Storm12$4.23$50.76
145Fiery Glass Crusader9$4.58$41.22
132Persuasion9$3.22$28.99
108Extraction, Quickblade Of Trembling Hands9$3.53$31.77
\n", "
" ], "text/plain": [ " Purchase Count \\\n", "Item ID Item Name \n", "92 Final Critic 13 \n", "178 Oathbreaker, Last Hope of the Breaking Storm 12 \n", "145 Fiery Glass Crusader 9 \n", "132 Persuasion 9 \n", "108 Extraction, Quickblade Of Trembling Hands 9 \n", "\n", " Item Price \\\n", "Item ID Item Name \n", "92 Final Critic $4.61 \n", "178 Oathbreaker, Last Hope of the Breaking Storm $4.23 \n", "145 Fiery Glass Crusader $4.58 \n", "132 Persuasion $3.22 \n", "108 Extraction, Quickblade Of Trembling Hands $3.53 \n", "\n", " Total Purchase Value \n", "Item ID Item Name \n", "92 Final Critic $59.99 \n", "178 Oathbreaker, Last Hope of the Breaking Storm $50.76 \n", "145 Fiery Glass Crusader $41.22 \n", "132 Persuasion $28.99 \n", "108 Extraction, Quickblade Of Trembling Hands $31.77 " ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "purchase_data.columns\n", "\n", "popularity= purchase_data[[\"Item ID\", \"Item Name\", \"Price\"]]\n", "\n", "total_price= popularity.groupby([\"Item ID\",\"Item Name\"])['Price'].sum()\n", "item_price=popularity.groupby([\"Item ID\",\"Item Name\"])['Price'].mean()\n", "count=popularity.groupby([\"Item ID\",\"Item Name\"])['Price'].count()\n", "\n", "merge1_df=pd.merge(count, item_price, on=[\"Item ID\",\"Item Name\"], how=\"outer\")\n", "merge2_df=pd.merge(merge1_df, total_price, on=[\"Item ID\",\"Item Name\"], how=\"outer\")\n", "merge2_df=merge2_df.rename(columns={\"Price_x\":\"Purchase Count\",\n", " \"Price_y\": \"Item Price\",\n", " \"Price\": \"Total Purchase Value\"})\n", "newmerge_df=merge2_df.sort_values(\"Purchase Count\", ascending=False)\n", "\n", "\n", "\n", "newmerge_df[\"Item Price\"] = newmerge_df[\"Item Price\"].map(\"${:.2f}\".format)\n", "newmerge_df[\"Total Purchase Value\"] = newmerge_df[\"Total Purchase Value\"].map(\"${:.2f}\".format)\n", "newmerge_df[:5] \n", "\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Most Profitable Items" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Sort the above table by total purchase value in descending order\n", "\n", "\n", "* Optional: give the displayed data cleaner formatting\n", "\n", "\n", "* Display a preview of the data frame\n", "\n" ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Purchase CountItem PriceTotal Purchase Value
Item IDItem Name
92Final Critic13$4.61$59.99
178Oathbreaker, Last Hope of the Breaking Storm12$4.23$50.76
82Nirvana9$4.90$44.10
145Fiery Glass Crusader9$4.58$41.22
103Singed Scalpel8$4.35$34.80
\n", "
" ], "text/plain": [ " Purchase Count \\\n", "Item ID Item Name \n", "92 Final Critic 13 \n", "178 Oathbreaker, Last Hope of the Breaking Storm 12 \n", "82 Nirvana 9 \n", "145 Fiery Glass Crusader 9 \n", "103 Singed Scalpel 8 \n", "\n", " Item Price \\\n", "Item ID Item Name \n", "92 Final Critic $4.61 \n", "178 Oathbreaker, Last Hope of the Breaking Storm $4.23 \n", "82 Nirvana $4.90 \n", "145 Fiery Glass Crusader $4.58 \n", "103 Singed Scalpel $4.35 \n", "\n", " Total Purchase Value \n", "Item ID Item Name \n", "92 Final Critic $59.99 \n", "178 Oathbreaker, Last Hope of the Breaking Storm $50.76 \n", "82 Nirvana $44.10 \n", "145 Fiery Glass Crusader $41.22 \n", "103 Singed Scalpel $34.80 " ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "newmerge_df=merge2_df.sort_values(\"Total Purchase Value\", ascending=False)\n", "\n", "\n", "\n", "newmerge_df[\"Item Price\"] = newmerge_df[\"Item Price\"].map(\"${:.2f}\".format)\n", "newmerge_df[\"Total Purchase Value\"] = newmerge_df[\"Total Purchase Value\"].map(\"${:.2f}\".format)\n", "newmerge_df[:5] \n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "anaconda-cloud": {}, "kernel_info": { "name": "python3" }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" }, "latex_envs": { "LaTeX_envs_menu_present": true, "autoclose": false, "autocomplete": true, "bibliofile": "biblio.bib", "cite_by": "apalike", "current_citInitial": 1, "eqLabelWithNumbers": true, "eqNumInitial": 1, "hotkeys": { "equation": "Ctrl-E", "itemize": "Ctrl-I" }, "labels_anchors": false, "latex_user_defs": false, "report_style_numbering": false, "user_envs_cfg": false }, "nteract": { "version": "0.2.0" } }, "nbformat": 4, "nbformat_minor": 2 }