{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Soda profits\n", "\n", "The Coca-Cola Co spreadsheet[1] is missing some calculations. \n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loading data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# install the necessary dependencies\n", "import sys\n", "!{sys.executable} -m pip install --quiet ipython-sql\n", "\n", "%load_ext sql" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# download data file\n", "!curl https://raw.githubusercontent.com/ocademy-ai/machine-learning/main/open-machine-learning-jupyter-book/assets/data/coca_cola_co.sqlite -o ./coca_cola_co.sqlite\n", "\n", "# loading from disk\n", "%sql sqlite:///coca_cola_co.sqlite\n", "\n", "# clear tmp data file\n", "!rm -rf coca_cola_co.sqlite\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's have a look at some sample data." ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " * sqlite:///coca_cola_co.sqlite\n", "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
yearNet operating revenuesCost of goods soldGross Profit
0930990.0011088.0019902.00
1035119.0012693.0022426.00
1146542.0018215.0028327.00
1248017.0019053.0028964.00
1346854.0018421.0028433.00
1445998.0017889.0028109.00
1544294.0017482.00
1641863.0016465.00
1735410.0013255.00
1831856.0011770.00
" ], "text/plain": [ "[('09', '30990.00', '11088.00', '19902.00'),\n", " ('10', '35119.00', '12693.00', '22426.00'),\n", " ('11', '46542.00', '18215.00', '28327.00'),\n", " ('12', '48017.00', '19053.00', '28964.00'),\n", " ('13', '46854.00', '18421.00', '28433.00'),\n", " ('14', '45998.00', '17889.00', '28109.00'),\n", " ('15', '44294.00', '17482.00', ''),\n", " ('16', '41863.00', '16465.00', ''),\n", " ('17', '35410.00', '13255.00', ''),\n", " ('18', '31856.00', '11770.00', '')]" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql\n", "\n", "SELECT * FROM \"coca-cola-co\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Assignment\n", "\n", "Create queries to return the following information:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "1. Calculate the Gross profits of FY '15, '16, '17, and '18.\n", " - Gross Profit = Net Operating revenues - Cost of goods sold.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%sql\n", "\n", "SELECT year, \"Net Operating Revenues\" ____ \"Cost of Goods Sold\" ______ \"Gross Profit\"\n", "FROM \"coca-cola-co\"\n", "WHERE ______ IN ('15', '16', '17', '18');" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "2. Calculate the average of all the gross profits. Try to do this with a function.\n", " - Average = Sum of gross profits divided by the number of fiscal years (10).\n", " - Documentation of the [AVERAGE function](https://support.microsoft.com/en-us/office/average-function-047bac88-d466-426c-a32b-8f33eb960cf6).\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%sql\n", "\n", "SELECT ____(\"Gross Profit\") AS \"Average Gross Profit\"\n", "______ (\n", " SELECT ____ - ____ AS \"Gross Profit\"\n", " FROM \"coca-cola-co\"\n", ") AS \"All Gross Profits\";" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Acknowledgments\n", "\n", "Thanks to Microsoft for creating the open-source course [Data Science for Beginners](https://github.com/microsoft/Data-Science-For-Beginners) and the data author [Jasmine Greenaway](https://github.com/paladique). They inspires the majority of the content in this chapter." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Bibliography\n", "\n", "[1]. CocaCola excel. (n.d.). Retrieved 29 May 2023, from https://www.kaggle.com/datasets/yiyiwang0826/cocacola-excel" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.5" } }, "nbformat": 4, "nbformat_minor": 4 }