{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "45e5a76a-107c-4684-9b0f-26599138e5ab",
   "metadata": {},
   "source": [
    "# Portfolio Panic: A Wealth Manager's Conundrum  \n",
    "\n",
    "### Description:  \n",
    "Imagine you're a wealth manager at a prestigious financial firm, responsible for managing a vast portfolio of stocks, bonds, and assets for high-net-worth clients. Your team relies on data analysis to make informed investment decisions, but your current dataset is a mess! You need to wrangle the data to identify trends, improve performance, and mitigate risk. Can you use your Pandas skills to tame the data beast and save the day?  \n",
    "\n",
    "### Tasks:  \n",
    "- **Asset Allocation Analysis:** Identify the top 5 asset classes by total value and calculate their respective weights in the portfolio.\n",
    "- **Risk Management:** Find the stocks with the highest volatility (highest values in the \"Volatility\" column) and calculate their average returns.\n",
    "- **Performance Optimization:** Group the data by sector and calculate the average returns for each sector. Identify the top 3 sectors with the highest returns."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "e832a7fc-164e-4d93-80b9-5f275eb8b3c8",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Python version 3.11.7 | packaged by Anaconda, Inc. | (main, Dec 15 2023, 18:05:47) [MSC v.1916 64 bit (AMD64)]\n",
      "Pandas version 2.2.1\n",
      "Numpy version 1.26.4\n"
     ]
    }
   ],
   "source": [
    "# import libraries\n",
    "import pandas as pd\n",
    "import numpy as np\n",
    "import sys\n",
    "\n",
    "print('Python version ' + sys.version)\n",
    "print('Pandas version ' + pd.__version__)\n",
    "print('Numpy version ' + np.__version__)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "48c842c1-4ae1-4bc7-bda0-d53820d7b8e4",
   "metadata": {},
   "source": [
    "# The Data  \n",
    "\n",
    "The dataset represents a portfolio of assets, including stocks, bonds, and other investment vehicles with information on their sector, value, returns, and volatility. It consists of 1000 rows, with each row representing a single asset and its corresponding attributes.\n",
    "\n",
    "### Columns:  \n",
    "- **Asset Class:** The type of asset (Stocks, Bonds, Real Estate, Commodities, Currencies)  \n",
    "- **Sector:** The industry sector (Technology, Finance, Healthcare, Energy, Consumer Goods)   \n",
    "- **Stock Symbol:** The stock symbol (AAPL, MSFT, JPM, GOOG, AMZN)  \n",
    "- **Value:** The current value of the asset  \n",
    "- **Returns:** The historical returns of the asset  \n",
    "- **Volatility:** The historical volatility of the asset  \n",
    "\n",
    "Can you tame the data and help the wealth manager make informed investment decisions?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "afa72db0-119c-4272-8dee-0c08b81af6b4",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Asset Class</th>\n",
       "      <th>Sector</th>\n",
       "      <th>Stock Symbol</th>\n",
       "      <th>Value</th>\n",
       "      <th>Returns</th>\n",
       "      <th>Volatility</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Currencies</td>\n",
       "      <td>Energy</td>\n",
       "      <td>AAPL</td>\n",
       "      <td>27148.447388</td>\n",
       "      <td>0.080532</td>\n",
       "      <td>0.206389</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Stocks</td>\n",
       "      <td>Technology</td>\n",
       "      <td>AMZN</td>\n",
       "      <td>47770.455686</td>\n",
       "      <td>0.152416</td>\n",
       "      <td>0.225218</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Commodities</td>\n",
       "      <td>Healthcare</td>\n",
       "      <td>MSFT</td>\n",
       "      <td>81326.560992</td>\n",
       "      <td>0.074461</td>\n",
       "      <td>0.469665</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Commodities</td>\n",
       "      <td>Consumer Goods</td>\n",
       "      <td>AAPL</td>\n",
       "      <td>81746.141057</td>\n",
       "      <td>-0.027992</td>\n",
       "      <td>0.483869</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Commodities</td>\n",
       "      <td>Finance</td>\n",
       "      <td>MSFT</td>\n",
       "      <td>75283.975516</td>\n",
       "      <td>0.058908</td>\n",
       "      <td>0.139532</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   Asset Class          Sector Stock Symbol         Value   Returns  \\\n",
       "0   Currencies          Energy         AAPL  27148.447388  0.080532   \n",
       "1       Stocks      Technology         AMZN  47770.455686  0.152416   \n",
       "2  Commodities      Healthcare         MSFT  81326.560992  0.074461   \n",
       "3  Commodities  Consumer Goods         AAPL  81746.141057 -0.027992   \n",
       "4  Commodities         Finance         MSFT  75283.975516  0.058908   \n",
       "\n",
       "   Volatility  \n",
       "0    0.206389  \n",
       "1    0.225218  \n",
       "2    0.469665  \n",
       "3    0.483869  \n",
       "4    0.139532  "
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# set the seed\n",
    "np.random.seed(0)\n",
    "\n",
    "# generate the data\n",
    "data = {\n",
    "    'Asset Class': np.random.choice(['Stocks', 'Bonds', 'Real Estate', 'Commodities', 'Currencies'], size=1000),\n",
    "    'Sector': np.random.choice(['Technology', 'Finance', 'Healthcare', 'Energy', 'Consumer Goods'], size=1000),\n",
    "    'Stock Symbol': np.random.choice(['AAPL', 'MSFT', 'JPM', 'GOOG', 'AMZN'], size=1000),\n",
    "    'Value': np.random.uniform(1000, 100000, size=1000),\n",
    "    'Returns': np.random.normal(0.05, 0.1, size=1000),\n",
    "    'Volatility': np.random.uniform(0.1, 0.5, size=1000)\n",
    "}\n",
    "\n",
    "df = pd.DataFrame(data)\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "37e6eb18-426f-47b1-956e-cb32568d3cb0",
   "metadata": {},
   "source": [
    "Let us start by looking at the datatypes and ensure the columns are of the correct type."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "8b43da1a-fbd4-4405-9491-1c29944c339c",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'pandas.core.frame.DataFrame'>\n",
      "RangeIndex: 1000 entries, 0 to 999\n",
      "Data columns (total 6 columns):\n",
      " #   Column        Non-Null Count  Dtype  \n",
      "---  ------        --------------  -----  \n",
      " 0   Asset Class   1000 non-null   object \n",
      " 1   Sector        1000 non-null   object \n",
      " 2   Stock Symbol  1000 non-null   object \n",
      " 3   Value         1000 non-null   float64\n",
      " 4   Returns       1000 non-null   float64\n",
      " 5   Volatility    1000 non-null   float64\n",
      "dtypes: float64(3), object(3)\n",
      "memory usage: 47.0+ KB\n"
     ]
    }
   ],
   "source": [
    "df.info()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "daaa2982-3485-4d3e-934c-104b2102c8b0",
   "metadata": {},
   "source": [
    "# Asset Allocation Analysis:  \n",
    "\n",
    "Identify the top 5 asset classes by total value and calculate their respective weights in the portfolio.  \n",
    "\n",
    "In the first task, \"weights\" refer to the percentage of the total portfolio value that each asset class represents.  \n",
    "\n",
    "For example, if the total portfolio value is \\\\$1 million, and the Stocks have a total value of \\\\$400,000, then the weight of the \"Stocks\" asset class would be 40% (\\\\$400,000 / \\\\$1,000,000).  \n",
    "\n",
    "So, in this task, you need to calculate the weights for each of the top 5 asset classes by total value, to see how the portfolio is allocated across different asset classes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "5b0d1aff-70ce-42b9-9f64-03067ad9482a",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Asset Class\n",
       "Bonds          19.516817\n",
       "Commodities    22.715469\n",
       "Currencies     18.970236\n",
       "Real Estate    19.340309\n",
       "Stocks         19.457170\n",
       "Name: Value, dtype: float64"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# create group object\n",
    "group = df.groupby('Asset Class')\n",
    "\n",
    "# add the total value of each asset class and divide it by the total value of the portfolio\n",
    "group.sum(numeric_only=True)['Value'].div(df['Value'].sum()) * 100"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5c6be945-55d3-41be-8d94-1b99c2b82043",
   "metadata": {},
   "source": [
    "# Risk Management:  \n",
    "\n",
    "Find the stocks with the highest volatility (highest values in the \"Volatility\" column) and calculate their average returns.  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "97f79e93-d812-43a6-b56d-7ff26ba13ce1",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>avg_volatility</th>\n",
       "      <th>avg_returns</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Stock Symbol</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>AAPL</th>\n",
       "      <td>0.292839</td>\n",
       "      <td>0.041167</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>AMZN</th>\n",
       "      <td>0.290529</td>\n",
       "      <td>0.044325</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>GOOG</th>\n",
       "      <td>0.306020</td>\n",
       "      <td>0.041320</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>JPM</th>\n",
       "      <td>0.302877</td>\n",
       "      <td>0.051169</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>MSFT</th>\n",
       "      <td>0.289748</td>\n",
       "      <td>0.054646</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "              avg_volatility  avg_returns\n",
       "Stock Symbol                             \n",
       "AAPL                0.292839     0.041167\n",
       "AMZN                0.290529     0.044325\n",
       "GOOG                0.306020     0.041320\n",
       "JPM                 0.302877     0.051169\n",
       "MSFT                0.289748     0.054646"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# create group object\n",
    "group = df.groupby('Stock Symbol')\n",
    "\n",
    "group.agg(\n",
    "     avg_volatility=pd.NamedAgg(column=\"Volatility\", aggfunc=\"mean\"),\n",
    "     avg_returns=pd.NamedAgg(column=\"Returns\", aggfunc=\"mean\")\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "39458e96-cd1b-43cc-bc86-27fe83a3d52d",
   "metadata": {},
   "source": [
    "I decided to calculate the average volatility for all of the stocks since there are only 5 in this dataset.  \n",
    "\n",
    "JPMorgan and Google seem to have the most volitility, but in general, all 5 have similar figures."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "94d4582a-6fce-420e-bb53-348bfc97ff5f",
   "metadata": {},
   "source": [
    "# Performance Optimization:  \n",
    "\n",
    "Group the data by sector and calculate the average returns for each sector. Identify the top 3 sectors with the highest returns."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "e0db0bb2-4f61-424a-ae5e-1b1b7d1ed669",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Sector\n",
       "Healthcare        0.065089\n",
       "Consumer Goods    0.054472\n",
       "Technology        0.043236\n",
       "Name: Returns, dtype: float64"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# create group object\n",
    "group = df.groupby('Sector')\n",
    "\n",
    "# get the average returns and select the top 3\n",
    "group.mean(numeric_only=True).sort_values(by='Returns', ascending=False)['Returns'].head(3)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "149dc185-9089-4ec9-839f-09be4608ba73",
   "metadata": {},
   "source": [
    "# Summary  \n",
    "\n",
    "The analysis revealed the 5 asset classes, with weights ranging from 19% to 23%. JPMorgan and Google were identified as the most volatile stocks, with average returns between 4% and 5%. Healthcare, Consumer Goods, and Technology emerged as the top-performing sectors, with returns between 4% and 7%. \n",
    "\n",
    "### Key Takeaways:  \n",
    "- Grouping data by asset class, stock symbol, and sector\n",
    "- Calculating weights, average volatility, and average returns\n",
    "- Identifying top asset classes, stocks with high volatility, and top-performing sectors\n",
    "- Using Pandas functions: groupby, sum, div, agg, mean, sort_values, and head"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9030ad27-80ce-44ac-ad5f-466357b61963",
   "metadata": {},
   "source": [
    "<p class=\"text-muted\">This tutorial was created by <a href=\"https://www.hedaro.com\" target=\"_blank\"><strong>HEDARO</strong></a></p>"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}