{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Project: analyse customer and track sales in chinook digital store\n",
"The objective for this project is to analyse data from the chinook database, which has information about a digital music shop.\n",
"\n",
"The chinook database contains details about the artists, songs, and albums from the music shop, as well as details on the shop's employees, customers, and the customers purchases. This information is contained in eleven tables, click [here](https://github.com/lerocha/chinook-database) to visit the github repository.\n",
"\n",
"The data analysis will explore below business questions:\n",
"- artist used in the most playlists?\n",
"- most popular track?\n",
"- how many tracks have been purchased vs not purchased?\n",
"- music genre which has the least purchased tracks?\n",
"\n",
"- performance of sales support agent\n",
"- breakdown of customer and sales data by country\n",
"- most popular music genre in the US\n",
"- recommendation for new albums to be added to the store for US customers\n",
"\n",
"### Results summary\n",
"\n",
"\n",
"\n",
"For more details, please refer to the the full analysis below."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"%%capture\n",
"%load_ext sql\n",
"%sql sqlite:///chinook.db"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Preview the data stored in the chinook database."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * sqlite:///chinook.db\n",
"Done.\n"
]
},
{
"data": {
"text/html": [
"
\n",
" \n",
" name | \n",
" type | \n",
"
\n",
" \n",
" album | \n",
" table | \n",
"
\n",
" \n",
" artist | \n",
" table | \n",
"
\n",
" \n",
" customer | \n",
" table | \n",
"
\n",
" \n",
" employee | \n",
" table | \n",
"
\n",
" \n",
" genre | \n",
" table | \n",
"
\n",
" \n",
" invoice | \n",
" table | \n",
"
\n",
" \n",
" invoice_line | \n",
" table | \n",
"
\n",
" \n",
" media_type | \n",
" table | \n",
"
\n",
" \n",
" playlist | \n",
" table | \n",
"
\n",
" \n",
" playlist_track | \n",
" table | \n",
"
\n",
" \n",
" track | \n",
" table | \n",
"
\n",
"
"
],
"text/plain": [
"[('album', 'table'),\n",
" ('artist', 'table'),\n",
" ('customer', 'table'),\n",
" ('employee', 'table'),\n",
" ('genre', 'table'),\n",
" ('invoice', 'table'),\n",
" ('invoice_line', 'table'),\n",
" ('media_type', 'table'),\n",
" ('playlist', 'table'),\n",
" ('playlist_track', 'table'),\n",
" ('track', 'table')]"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT\n",
" name,\n",
" type\n",
"FROM sqlite_master\n",
"WHERE type IN (\"table\",\"view\");"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Popular artists used in the most playlists\n",
"The most popular artist with tracks used in the most playlists is Iron Maiden."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * sqlite:///chinook.db\n",
"Done.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" artist_id | \n",
" artist_name | \n",
" tracks_in_playlist | \n",
"
\n",
" \n",
" 90 | \n",
" Iron Maiden | \n",
" 516 | \n",
"
\n",
" \n",
" 150 | \n",
" U2 | \n",
" 333 | \n",
"
\n",
" \n",
" 50 | \n",
" Metallica | \n",
" 296 | \n",
"
\n",
" \n",
" 22 | \n",
" Led Zeppelin | \n",
" 252 | \n",
"
\n",
" \n",
" 58 | \n",
" Deep Purple | \n",
" 226 | \n",
"
\n",
" \n",
" 149 | \n",
" Lost | \n",
" 184 | \n",
"
\n",
" \n",
" 118 | \n",
" Pearl Jam | \n",
" 177 | \n",
"
\n",
" \n",
" 82 | \n",
" Faith No More | \n",
" 145 | \n",
"
\n",
" \n",
" 81 | \n",
" Eric Clapton | \n",
" 145 | \n",
"
\n",
" \n",
" 100 | \n",
" Lenny Kravitz | \n",
" 143 | \n",
"
\n",
"
"
],
"text/plain": [
"[(90, 'Iron Maiden', 516),\n",
" (150, 'U2', 333),\n",
" (50, 'Metallica', 296),\n",
" (22, 'Led Zeppelin', 252),\n",
" (58, 'Deep Purple', 226),\n",
" (149, 'Lost', 184),\n",
" (118, 'Pearl Jam', 177),\n",
" (82, 'Faith No More', 145),\n",
" (81, 'Eric Clapton', 145),\n",
" (100, 'Lenny Kravitz', 143)]"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT \n",
" a.artist_id, \n",
" a.name artist_name, \n",
" count(a.name) tracks_in_playlist\n",
"FROM playlist_track pt\n",
"INNER JOIN track t on t.track_id = pt.track_id\n",
"INNER JOIN album ab on ab.album_id = t.album_id\n",
"INNER JOIN artist a on a.artist_id = ab.artist_id\n",
"GROUP BY a.name\n",
"ORDER BY tracks_in_playlist DESC\n",
"LIMIT 10;"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Below we try to find out if there is a particular track which is more popular amongst Iron Maiden music."
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * sqlite:///chinook.db\n",
"Done.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" artist_name | \n",
" track_id | \n",
" track_name | \n",
" album | \n",
" tracks_in_playlist | \n",
"
\n",
" \n",
" Iron Maiden | \n",
" 1221 | \n",
" 2 Minutes To Midnight | \n",
" A Real Dead One | \n",
" 13 | \n",
"
\n",
" \n",
" Iron Maiden | \n",
" 1213 | \n",
" The Trooper | \n",
" A Real Dead One | \n",
" 12 | \n",
"
\n",
" \n",
" Iron Maiden | \n",
" 1212 | \n",
" The Number Of The Beast | \n",
" A Real Dead One | \n",
" 12 | \n",
"
\n",
" \n",
" Iron Maiden | \n",
" 1223 | \n",
" Hallowed Be Thy Name | \n",
" A Real Dead One | \n",
" 12 | \n",
"
\n",
" \n",
" Iron Maiden | \n",
" 1234 | \n",
" Fear Of The Dark | \n",
" A Real Live One | \n",
" 11 | \n",
"
\n",
" \n",
" Iron Maiden | \n",
" 1278 | \n",
" Wrathchild | \n",
" Killers | \n",
" 10 | \n",
"
\n",
" \n",
" Iron Maiden | \n",
" 1229 | \n",
" The Evil That Men Do | \n",
" A Real Live One | \n",
" 10 | \n",
"
\n",
" \n",
" Iron Maiden | \n",
" 1222 | \n",
" Iron Maiden | \n",
" A Real Dead One | \n",
" 10 | \n",
"
\n",
" \n",
" Iron Maiden | \n",
" 1227 | \n",
" Wasting Love | \n",
" A Real Live One | \n",
" 9 | \n",
"
\n",
" \n",
" Iron Maiden | \n",
" 1228 | \n",
" Tailgunner | \n",
" A Real Live One | \n",
" 9 | \n",
"
\n",
"
"
],
"text/plain": [
"[('Iron Maiden', 1221, '2 Minutes To Midnight', 'A Real Dead One', 13),\n",
" ('Iron Maiden', 1213, 'The Trooper', 'A Real Dead One', 12),\n",
" ('Iron Maiden', 1212, 'The Number Of The Beast', 'A Real Dead One', 12),\n",
" ('Iron Maiden', 1223, 'Hallowed Be Thy Name', 'A Real Dead One', 12),\n",
" ('Iron Maiden', 1234, 'Fear Of The Dark', 'A Real Live One', 11),\n",
" ('Iron Maiden', 1278, 'Wrathchild', 'Killers', 10),\n",
" ('Iron Maiden', 1229, 'The Evil That Men Do', 'A Real Live One', 10),\n",
" ('Iron Maiden', 1222, 'Iron Maiden', 'A Real Dead One', 10),\n",
" ('Iron Maiden', 1227, 'Wasting Love', 'A Real Live One', 9),\n",
" ('Iron Maiden', 1228, 'Tailgunner', 'A Real Live One', 9)]"
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT \n",
" a.name artist_name, \n",
" pt.track_id,\n",
" t.name track_name,\n",
" ab.title album,\n",
" count(t.name) tracks_in_playlist\n",
"FROM playlist_track pt\n",
"INNER JOIN track t on t.track_id = pt.track_id\n",
"INNER JOIN album ab on ab.album_id = t.album_id\n",
"INNER JOIN artist a on a.artist_id = ab.artist_id\n",
"WHERE a.artist_id = 90\n",
"GROUP BY track_name\n",
"ORDER BY tracks_in_playlist DESC\n",
"LIMIT 10;"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The album 'A Real Dead One' has the **Top 4** tracks for Iron Maiden and the most popular Iron Maiden digital track is '2 Minutes to Midnight'. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Total purchased and non-purchased tracks \n",
"\n",
"There are 3503 digital tracks in the chinook digital music store, of this, customers have purchased 1806, there are still 1697 tracks which have not yet been purchased by any customer. "
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * sqlite:///chinook.db\n",
"Done.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" total_tracks | \n",
" purchased | \n",
" non_purchased | \n",
"
\n",
" \n",
" 3503 | \n",
" 1806 | \n",
" 1697 | \n",
"
\n",
"
"
],
"text/plain": [
"[(3503, 1806, 1697)]"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"WITH total_tracks AS \n",
" (\n",
" SELECT \n",
" count(distinct track_id) total_tracks, \n",
" (\n",
" SELECT \n",
" count(distinct track_id) \n",
" FROM invoice_line) purchased\n",
" FROM track\n",
" )\n",
"SELECT tt.*, \n",
" (tt.total_tracks - tt.purchased) non_purchased \n",
"FROM total_tracks tt;"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This list has some of the tracks which have not yet been purchased from the store by a customer."
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * sqlite:///chinook.db\n",
"Done.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" track_id | \n",
" name | \n",
" album_id | \n",
" media_type_id | \n",
" genre_id | \n",
" composer | \n",
" milliseconds | \n",
" bytes | \n",
" unit_price | \n",
"
\n",
" \n",
" 99 | \n",
" Your Time Has Come | \n",
" 11 | \n",
" 1 | \n",
" 4 | \n",
" Cornell, Commerford, Morello, Wilk | \n",
" 255529 | \n",
" 8273592 | \n",
" 0.99 | \n",
"
\n",
" \n",
" 101 | \n",
" Be Yourself | \n",
" 11 | \n",
" 1 | \n",
" 4 | \n",
" Cornell, Commerford, Morello, Wilk | \n",
" 279484 | \n",
" 9106160 | \n",
" 0.99 | \n",
"
\n",
" \n",
" 104 | \n",
" Heaven's Dead | \n",
" 11 | \n",
" 1 | \n",
" 4 | \n",
" Cornell, Commerford, Morello, Wilk | \n",
" 276688 | \n",
" 9006158 | \n",
" 0.99 | \n",
"
\n",
" \n",
" 106 | \n",
" Man Or Animal | \n",
" 11 | \n",
" 1 | \n",
" 4 | \n",
" Cornell, Commerford, Morello, Wilk | \n",
" 233195 | \n",
" 7542942 | \n",
" 0.99 | \n",
"
\n",
" \n",
" 107 | \n",
" Yesterday To Tomorrow | \n",
" 11 | \n",
" 1 | \n",
" 4 | \n",
" Cornell, Commerford, Morello, Wilk | \n",
" 273763 | \n",
" 8944205 | \n",
" 0.99 | \n",
"
\n",
"
"
],
"text/plain": [
"[(99, 'Your Time Has Come', 11, 1, 4, 'Cornell, Commerford, Morello, Wilk', 255529, 8273592, 0.99),\n",
" (101, 'Be Yourself', 11, 1, 4, 'Cornell, Commerford, Morello, Wilk', 279484, 9106160, 0.99),\n",
" (104, \"Heaven's Dead\", 11, 1, 4, 'Cornell, Commerford, Morello, Wilk', 276688, 9006158, 0.99),\n",
" (106, 'Man Or Animal', 11, 1, 4, 'Cornell, Commerford, Morello, Wilk', 233195, 7542942, 0.99),\n",
" (107, 'Yesterday To Tomorrow', 11, 1, 4, 'Cornell, Commerford, Morello, Wilk', 273763, 8944205, 0.99)]"
]
},
"execution_count": 53,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT \n",
" t.*\n",
"FROM track t \n",
"WHERE t.track_id not in (\n",
" SELECT \n",
" il.track_id \n",
" FROM invoice_line il\n",
" )\n",
"LIMIT 5;"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"From the tracks which have not been purchased, 460 belong to the 'Latin' genre, hmmm... how many tracks are available for purchase for this genre? "
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * sqlite:///chinook.db\n",
"Done.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" genre | \n",
" genre_id | \n",
" total_tracks | \n",
"
\n",
" \n",
" Latin | \n",
" 7 | \n",
" 460 | \n",
"
\n",
" \n",
" Rock | \n",
" 1 | \n",
" 382 | \n",
"
\n",
" \n",
" Alternative & Punk | \n",
" 4 | \n",
" 156 | \n",
"
\n",
" \n",
" Metal | \n",
" 3 | \n",
" 136 | \n",
"
\n",
" \n",
" TV Shows | \n",
" 19 | \n",
" 91 | \n",
"
\n",
" \n",
" Jazz | \n",
" 2 | \n",
" 69 | \n",
"
\n",
" \n",
" Drama | \n",
" 21 | \n",
" 63 | \n",
"
\n",
" \n",
" Classical | \n",
" 24 | \n",
" 58 | \n",
"
\n",
" \n",
" Soundtrack | \n",
" 10 | \n",
" 38 | \n",
"
\n",
" \n",
" Reggae | \n",
" 8 | \n",
" 36 | \n",
"
\n",
"
"
],
"text/plain": [
"[('Latin', 7, 460),\n",
" ('Rock', 1, 382),\n",
" ('Alternative & Punk', 4, 156),\n",
" ('Metal', 3, 136),\n",
" ('TV Shows', 19, 91),\n",
" ('Jazz', 2, 69),\n",
" ('Drama', 21, 63),\n",
" ('Classical', 24, 58),\n",
" ('Soundtrack', 10, 38),\n",
" ('Reggae', 8, 36)]"
]
},
"execution_count": 51,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT \n",
" g.name genre,\n",
" g.genre_id,\n",
" count(t.track_id) total_tracks\n",
"FROM track t\n",
"INNER JOIN genre g on g.genre_id = t.genre_id\n",
"WHERE t.track_id not in (\n",
" SELECT \n",
" il.track_id \n",
" FROM invoice_line il\n",
" )\n",
"GROUP BY genre\n",
"ORDER BY total_tracks DESC\n",
"LIMIT 10;"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The data suggests that the 'Latin' genre has only had 20% of all available tracks (579) purchased by customers. This insight will be helpful when recommending new albums to be added to the digital music shop."
]
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * sqlite:///chinook.db\n",
"Done.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" total_tracks | \n",
" genre | \n",
"
\n",
" \n",
" 579 | \n",
" Latin | \n",
"
\n",
"
"
],
"text/plain": [
"[(579, 'Latin')]"
]
},
"execution_count": 52,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT \n",
" count(t.track_id) total_tracks, \n",
" g.name genre \n",
"FROM track t\n",
"INNER JOIN genre g on g.genre_id = t.genre_id\n",
"WHERE t.genre_id = 7\n",
"GROUP BY genre;"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Performance by customer sales for chinook sales support agents\n",
"\n",
"The below data displays total sales amount (dollars) for each sales support agent within the organisation."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * sqlite:///chinook.db\n",
"Done.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" employee_name | \n",
" rep_sales_total | \n",
" percentage_sales | \n",
" hire_date | \n",
"
\n",
" \n",
" Jane Peacock | \n",
" 1731.51 | \n",
" 36.77 | \n",
" 2017-04-01 00:00:00 | \n",
"
\n",
" \n",
" Margaret Park | \n",
" 1584.0000000000002 | \n",
" 33.63 | \n",
" 2017-05-03 00:00:00 | \n",
"
\n",
" \n",
" Steve Johnson | \n",
" 1393.92 | \n",
" 29.6 | \n",
" 2017-10-17 00:00:00 | \n",
"
\n",
"
"
],
"text/plain": [
"[('Jane Peacock', 1731.51, 36.77, '2017-04-01 00:00:00'),\n",
" ('Margaret Park', 1584.0000000000002, 33.63, '2017-05-03 00:00:00'),\n",
" ('Steve Johnson', 1393.92, 29.6, '2017-10-17 00:00:00')]"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"WITH rep_sales AS\n",
" (\n",
" SELECT\n",
" c.customer_id,\n",
" round(sum(i.total), 2) rep_sales_total,\n",
" c.support_rep_id\n",
" FROM customer c\n",
" INNER JOIN invoice i on i.customer_id = c.customer_id\n",
" GROUP BY i.customer_id\n",
" )\n",
"SELECT \n",
" e.first_name || ' ' || e.last_name employee_name,\n",
" SUM(rs.rep_sales_total) rep_sales_total,\n",
" round(CAST(SUM(rs.rep_sales_total) AS FLOAT) /\n",
" (\n",
" SELECT SUM(rep_sales_total) \n",
" FROM rep_sales\n",
" ) * 100, 2) percentage_sales,\n",
" e.hire_date\n",
"FROM rep_sales rs\n",
"INNER JOIN employee e on e.employee_id = rs.support_rep_id\n",
"WHERE e.title in ('Sales Support Agent')\n",
"GROUP BY employee_name;\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The highest performing sales agent is Jane Peacock with 36.77% of all sales while Margaret Park has 33.63%. Steve Johnson has the least percent of sales recorded with 29.6%, approximately 7% from the top performing employee. \n",
"\n",
"Based on the hire date for these employees, the data suggests some correlation with sales performance. Jane was the first sales support agent to be hired by the organisation amongst the three. This suggests that Steve's sales amount could be as a result of joining the organisation much later (at least five months) when compared with the other two employees in same role."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Customer and sales data categorised by country\n",
"\n",
"The below shows sales data, for each country by calculating:\n",
"\n",
"- total number of customers\n",
"- total value of sales\n",
"- average value of sales per customer\n",
"- average order value\n",
"\n",
"Because there are a number of countries with only one customer, these customers will be grouped in a new category which will be labelled 'other'."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * sqlite:///chinook.db\n",
"Done.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" country | \n",
" total_customers | \n",
" total_sales | \n",
" avg_sales_per_customer | \n",
" avg_order_value | \n",
"
\n",
" \n",
" Other | \n",
" 15 | \n",
" 1094.94 | \n",
" 73.0 | \n",
" 7.45 | \n",
"
\n",
" \n",
" USA | \n",
" 13 | \n",
" 1040.49 | \n",
" 80.04 | \n",
" 7.94 | \n",
"
\n",
" \n",
" Canada | \n",
" 8 | \n",
" 535.59 | \n",
" 66.95 | \n",
" 7.05 | \n",
"
\n",
" \n",
" Brazil | \n",
" 5 | \n",
" 427.68 | \n",
" 85.54 | \n",
" 7.01 | \n",
"
\n",
" \n",
" France | \n",
" 5 | \n",
" 389.07 | \n",
" 77.81 | \n",
" 7.78 | \n",
"
\n",
" \n",
" Germany | \n",
" 4 | \n",
" 334.62 | \n",
" 83.66 | \n",
" 8.16 | \n",
"
\n",
" \n",
" Czech Republic | \n",
" 2 | \n",
" 273.24 | \n",
" 136.62 | \n",
" 9.11 | \n",
"
\n",
" \n",
" United Kingdom | \n",
" 3 | \n",
" 245.52 | \n",
" 81.84 | \n",
" 8.77 | \n",
"
\n",
" \n",
" Portugal | \n",
" 2 | \n",
" 185.13 | \n",
" 92.57 | \n",
" 6.38 | \n",
"
\n",
" \n",
" India | \n",
" 2 | \n",
" 183.15 | \n",
" 91.58 | \n",
" 8.72 | \n",
"
\n",
"
"
],
"text/plain": [
"[('Other', 15, 1094.94, 73.0, 7.45),\n",
" ('USA', 13, 1040.49, 80.04, 7.94),\n",
" ('Canada', 8, 535.59, 66.95, 7.05),\n",
" ('Brazil', 5, 427.68, 85.54, 7.01),\n",
" ('France', 5, 389.07, 77.81, 7.78),\n",
" ('Germany', 4, 334.62, 83.66, 8.16),\n",
" ('Czech Republic', 2, 273.24, 136.62, 9.11),\n",
" ('United Kingdom', 3, 245.52, 81.84, 8.77),\n",
" ('Portugal', 2, 185.13, 92.57, 6.38),\n",
" ('India', 2, 183.15, 91.58, 8.72)]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"WITH customer_sales AS \n",
" (\n",
" SELECT il.*,\n",
" CASE\n",
" WHEN (\n",
" SELECT count(*)\n",
" FROM customer\n",
" where country = c.country\n",
" ) = 1 THEN \"Other\"\n",
" ELSE c.country\n",
" END AS country, \n",
" i.total,\n",
" c.customer_id\n",
" FROM invoice_line il\n",
" INNER JOIN invoice i on il.invoice_id = i.invoice_id\n",
" INNER JOIN customer c on i.customer_id = c.customer_id\n",
" )\n",
"SELECT \n",
" country, \n",
" count(distinct(customer_id)) total_customers, \n",
" round(SUM(unit_price), 2) total_sales,\n",
" round(SUM(unit_price) / count(distinct customer_id), 2) avg_sales_per_customer,\n",
" round(SUM(unit_price) / count(distinct invoice_id), 2) avg_order_value\n",
"FROM customer_sales cs\n",
"GROUP BY country\n",
"ORDER BY total_sales DESC;"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The US might have the most customer sales overall, but the average order value per customer is only `$80` (the second lowest total amount across the countries excluding those countries with only one customer which have been grouped in 'other'). The data suggests that there might be opportunity for higher sales in Czech Republic and Portugal which have `$136.62` and `$92.57` average order values per customer, though this data should be used with caution due to the limited sample size being used for analysis."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Popular music genres in the US\n",
"\n",
"The chinook record store has just signed a deal with a new record label, and would like to select three albums that will be added to the store, from a list of four. The four albums are by artists that don't have any tracks in the store right now - below we have the artist names, and the genre of music they produce:\n",
"\n",
"| Artist | Genre |\n",
"| ------------------- | ----------- |\n",
"| Regal | Hip-Hop |\n",
"| Red Tone | Punk |\n",
"| Meteor and the Girls | Pop |\n",
"| Slim Jim Bites | Blues |\n",
"\n",
"\n",
"The record label specializes in artists from USA, and they have given chinook some money to advertise the new albums in the US, this means we're interested in finding out which genres sell best in USA."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * sqlite:///chinook.db\n",
"Done.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" genre | \n",
" tracks_purchased | \n",
" percentage_purchased | \n",
"
\n",
" \n",
" Rock | \n",
" 561 | \n",
" 53.38 | \n",
"
\n",
" \n",
" Alternative & Punk | \n",
" 130 | \n",
" 12.37 | \n",
"
\n",
" \n",
" Metal | \n",
" 124 | \n",
" 11.8 | \n",
"
\n",
" \n",
" R&B/Soul | \n",
" 53 | \n",
" 5.04 | \n",
"
\n",
" \n",
" Blues | \n",
" 36 | \n",
" 3.43 | \n",
"
\n",
" \n",
" Alternative | \n",
" 35 | \n",
" 3.33 | \n",
"
\n",
" \n",
" Pop | \n",
" 22 | \n",
" 2.09 | \n",
"
\n",
" \n",
" Latin | \n",
" 22 | \n",
" 2.09 | \n",
"
\n",
" \n",
" Hip Hop/Rap | \n",
" 20 | \n",
" 1.9 | \n",
"
\n",
" \n",
" Jazz | \n",
" 14 | \n",
" 1.33 | \n",
"
\n",
" \n",
" Easy Listening | \n",
" 13 | \n",
" 1.24 | \n",
"
\n",
" \n",
" Reggae | \n",
" 6 | \n",
" 0.57 | \n",
"
\n",
" \n",
" Electronica/Dance | \n",
" 5 | \n",
" 0.48 | \n",
"
\n",
" \n",
" Classical | \n",
" 4 | \n",
" 0.38 | \n",
"
\n",
" \n",
" Heavy Metal | \n",
" 3 | \n",
" 0.29 | \n",
"
\n",
" \n",
" Soundtrack | \n",
" 2 | \n",
" 0.19 | \n",
"
\n",
" \n",
" TV Shows | \n",
" 1 | \n",
" 0.1 | \n",
"
\n",
"
"
],
"text/plain": [
"[('Rock', 561, 53.38),\n",
" ('Alternative & Punk', 130, 12.37),\n",
" ('Metal', 124, 11.8),\n",
" ('R&B/Soul', 53, 5.04),\n",
" ('Blues', 36, 3.43),\n",
" ('Alternative', 35, 3.33),\n",
" ('Pop', 22, 2.09),\n",
" ('Latin', 22, 2.09),\n",
" ('Hip Hop/Rap', 20, 1.9),\n",
" ('Jazz', 14, 1.33),\n",
" ('Easy Listening', 13, 1.24),\n",
" ('Reggae', 6, 0.57),\n",
" ('Electronica/Dance', 5, 0.48),\n",
" ('Classical', 4, 0.38),\n",
" ('Heavy Metal', 3, 0.29),\n",
" ('Soundtrack', 2, 0.19),\n",
" ('TV Shows', 1, 0.1)]"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"WITH popular_usa_genres AS \n",
" (\n",
" SELECT il.*,\n",
" c.country \n",
" FROM invoice_line il\n",
" INNER JOIN invoice i on il.invoice_id = i.invoice_id\n",
" INNER JOIN customer c on i.customer_id = c.customer_id\n",
" WHERE c.country = \"USA\"\n",
" )\n",
"SELECT g.name genre, \n",
" COUNT(pug.invoice_line_id) tracks_purchased, \n",
" round(CAST(COUNT(pug.invoice_line_id) AS FLOAT) / (\n",
" SELECT COUNT(*)\n",
" FROM popular_usa_genres\n",
" ) * 100, 2) percentage_purchased\n",
"FROM popular_usa_genres pug\n",
"INNER JOIN track t on t.track_id = pug.track_id\n",
"INNER JOIN genre g on g.genre_id = t.genre_id\n",
"GROUP BY g.genre_id\n",
"ORDER BY tracks_purchased desc;"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Recommendation for new albums \n",
"\n",
"The most popular music genre for customers purchasing music from the US is 'Rock' which has 53% of sales while 'Alternative & Punk' genre has 12%. The 'Metal' genre has 11.7% sales, and much further from the **Top 3** genres is 'R&B/Soul' which has 5% of total sales.\n",
"\n",
"Based on global sales of tracks across the different genres in the US, chinook digital record store should purchase albums by the following 3 artists:\n",
"\n",
"| Artist | Genre |\n",
"| ------------------- | ----------- |\n",
"| Red Tone | Punk |\n",
"| Meteor and the Girls | Pop |\n",
"| Slim Jim Bites | Blues |\n",
"\n",
"Punk genre is ranked 2nd overall in sales, hence we expect 'Red Tone' to have better sales. \n",
"\n",
"The below artist narrowly missed the recommended shortlist, despite hip-hop/rap being ranked 9th overall in the US, this music genre was only 0.19 per-cent off the 7th position in popularity presently held by 'Pop'.\n",
"- Regal: Hip Hop/Rap genre is ranked 9th overall"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 4
}