{ "cells": [ { "cell_type": "markdown", "id": "fd5c3384-3b1f-4240-b344-06bc5484ed57", "metadata": {}, "source": [ "# Challenge: Merging Financial Datasets\n", "\n", "### Description:\n", "You are a fintech professional working on a project to analyze the relationship between stock prices and financial statements. You have two datasets:\n", "- stock_prices: contains daily stock prices for various companies \n", "- financials: contains quarterly financial statement data (revenue, net income, etc.) for the same companies\n", "\n", "Your task is to merge these two datasets on the company name and date, so that you can analyze the relationship between stock prices and financial performance.\n", "\n", "### Pain Point:\n", "The datasets have different structures and formats, making it difficult to combine them. You need to use Pandas to:\n", "- Handle mismatched column names and data types\n", "- Merge the datasets on multiple columns (company name and date)\n", "- Handle missing values and duplicates" ] }, { "cell_type": "code", "execution_count": 1, "id": "8efc43a6-867a-47b2-bd32-cd9e545aaab2", "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "import sys" ] }, { "cell_type": "code", "execution_count": 2, "id": "2cd63d74-4e00-4a4e-86e2-5f24beada2dc", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Python version 3.11.7 | packaged by Anaconda, Inc. | (main, Dec 15 2023, 18:05:47) [MSC v.1916 64 bit (AMD64)]\n", "Pandas version 2.2.1\n", "Numpy version 1.26.4\n" ] } ], "source": [ "print('Python version ' + sys.version)\n", "print('Pandas version ' + pd.__version__)\n", "print('Numpy version ' + np.__version__)" ] }, { "cell_type": "markdown", "id": "d94c3bf7-306d-4fb3-826e-b504431f43e7", "metadata": {}, "source": [ "# Create Data" ] }, { "cell_type": "code", "execution_count": 3, "id": "3a2bab5a-1d59-4d39-bd59-a1a38087a695", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
SymbolDateRevenueNet Income
0AAPL2024-03-311000200
1GOOG2024-03-312000400
2MSFT2024-03-313000500
3AAPL2024-03-311200250
4GOOG2024-03-312500500
\n", "
" ], "text/plain": [ " Symbol Date Revenue Net Income\n", "0 AAPL 2024-03-31 1000 200\n", "1 GOOG 2024-03-31 2000 400\n", "2 MSFT 2024-03-31 3000 500\n", "3 AAPL 2024-03-31 1200 250\n", "4 GOOG 2024-03-31 2500 500" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "financials = {\n", " 'Symbol': ['AAPL', 'GOOG', 'MSFT', 'AAPL', 'GOOG'],\n", " 'Date': ['2024-03-31', '2024-03-31', '2024-03-31', '2024-03-31', '2024-03-31'],\n", " 'Revenue': [1000, 2000, 3000, 1200, 2500],\n", " 'Net Income': [200, 400, 500, 250, 500]\n", "}\n", "\n", "# create DataFrames\n", "financials_df = pd.DataFrame(financials)\n", "financials_df" ] }, { "cell_type": "code", "execution_count": 4, "id": "93a90628-30b0-41f4-91ee-1e754353d256", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "RangeIndex: 5 entries, 0 to 4\n", "Data columns (total 4 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 Symbol 5 non-null object\n", " 1 Date 5 non-null object\n", " 2 Revenue 5 non-null int64 \n", " 3 Net Income 5 non-null int64 \n", "dtypes: int64(2), object(2)\n", "memory usage: 292.0+ bytes\n" ] } ], "source": [ "financials_df.info()" ] }, { "cell_type": "code", "execution_count": 5, "id": "12285ad8-91d3-4c0e-bbf9-32af0c4b4149", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "RangeIndex: 5 entries, 0 to 4\n", "Data columns (total 4 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 Symbol 5 non-null object \n", " 1 Date 5 non-null datetime64[ns]\n", " 2 Revenue 5 non-null int64 \n", " 3 Net Income 5 non-null int64 \n", "dtypes: datetime64[ns](1), int64(2), object(1)\n", "memory usage: 292.0+ bytes\n" ] } ], "source": [ "# we need to convert the dates from strings to date objects\n", "financials_df['Date'] = pd.to_datetime(financials_df['Date'])\n", "financials_df.info()" ] }, { "cell_type": "code", "execution_count": 6, "id": "d8fb85b3-786f-482b-a494-a18558fc64cc", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
RevenueNet Income
SymbolDate
AAPL2024-03-312200450
GOOG2024-03-314500900
MSFT2024-03-313000500
\n", "
" ], "text/plain": [ " Revenue Net Income\n", "Symbol Date \n", "AAPL 2024-03-31 2200 450\n", "GOOG 2024-03-31 4500 900\n", "MSFT 2024-03-31 3000 500" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# compress data\n", "financials_df = financials_df.groupby(['Symbol','Date']).sum()\n", "financials_df" ] }, { "cell_type": "code", "execution_count": 7, "id": "31ae9763-ee5e-4789-997c-f595c1fc7679", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
AAPLGOOGMSFTDate
0359.674643342.778210280.8436182024-03-12
1177.374228431.584652498.3846912024-03-16
2204.603302495.466261316.0688592024-01-22
3417.608290460.682614139.3004992024-03-28
4121.619000246.874560440.3825022024-02-18
...............
177301.64402288.299780226.5885762024-01-02
178105.304424327.655990106.9911482024-02-28
179271.719726238.858198167.8686032024-02-27
180NaNNaNNaN2024-03-21
181339.245228122.208132349.7952242024-02-08
\n", "

182 rows × 4 columns

\n", "
" ], "text/plain": [ " AAPL GOOG MSFT Date\n", "0 359.674643 342.778210 280.843618 2024-03-12\n", "1 177.374228 431.584652 498.384691 2024-03-16\n", "2 204.603302 495.466261 316.068859 2024-01-22\n", "3 417.608290 460.682614 139.300499 2024-03-28\n", "4 121.619000 246.874560 440.382502 2024-02-18\n", ".. ... ... ... ...\n", "177 301.644022 88.299780 226.588576 2024-01-02\n", "178 105.304424 327.655990 106.991148 2024-02-28\n", "179 271.719726 238.858198 167.868603 2024-02-27\n", "180 NaN NaN NaN 2024-03-21\n", "181 339.245228 122.208132 349.795224 2024-02-08\n", "\n", "[182 rows x 4 columns]" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# set the date range\n", "start_date = '2024-01-01'\n", "end_date = '2024-03-31'\n", "\n", "# generate a date range (daily)\n", "dates = pd.date_range(start_date, end_date, freq='D')\n", "\n", "# companies\n", "companies = ['AAPL', 'GOOG', 'MSFT']\n", "\n", "# generate random prices with missing values and duplicates\n", "np.random.seed(500) # keep sequence the same\n", "prices = np.random.uniform(low=50, high=500, size=(len(dates), len(companies)))\n", "prices[np.random.choice(prices.shape[0], size=10, replace=False), :] = np.nan # introduce missing values\n", "prices = np.repeat(prices, 2, axis=0) # introduce duplicates\n", "\n", "# create a DataFrame\n", "stock_prices_df = pd.DataFrame(prices, columns=companies)\n", "stock_prices_df['Date'] = np.repeat(dates, 2) # repeat dates for duplicates\n", "\n", "# shuffle the data to mix up the duplicates\n", "stock_prices_df = stock_prices_df.sample(frac=1).reset_index(drop=True)\n", "stock_prices_df" ] }, { "cell_type": "code", "execution_count": 8, "id": "2490e5ee-15da-4f8f-8025-ece5d129baf7", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "RangeIndex: 182 entries, 0 to 181\n", "Data columns (total 4 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 AAPL 162 non-null float64 \n", " 1 GOOG 162 non-null float64 \n", " 2 MSFT 162 non-null float64 \n", " 3 Date 182 non-null datetime64[ns]\n", "dtypes: datetime64[ns](1), float64(3)\n", "memory usage: 5.8 KB\n" ] } ], "source": [ "stock_prices_df.info()" ] }, { "cell_type": "markdown", "id": "c6366a41-00bb-4367-8906-72a9e457ac46", "metadata": {}, "source": [ "### Reshape df\n", "\n", "The data came in with the Symbols as the columns. Since the quarterly data has the Symbols as a columns, we need to reshape the stock_prices_df." ] }, { "cell_type": "code", "execution_count": 9, "id": "2a30a07f-fe98-4578-aada-76c3c0921313", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Datelevel_1Price
2102024-01-01AAPL362.155787
2162024-01-01AAPL362.155787
2112024-01-01GOOG77.772647
2172024-01-01GOOG77.772647
2122024-01-01MSFT349.975219
............
4502024-03-31AAPL420.118719
2142024-03-31GOOG316.125091
4512024-03-31GOOG316.125091
2152024-03-31MSFT320.870742
4522024-03-31MSFT320.870742
\n", "

486 rows × 3 columns

\n", "
" ], "text/plain": [ " Date level_1 Price\n", "210 2024-01-01 AAPL 362.155787\n", "216 2024-01-01 AAPL 362.155787\n", "211 2024-01-01 GOOG 77.772647\n", "217 2024-01-01 GOOG 77.772647\n", "212 2024-01-01 MSFT 349.975219\n", ".. ... ... ...\n", "450 2024-03-31 AAPL 420.118719\n", "214 2024-03-31 GOOG 316.125091\n", "451 2024-03-31 GOOG 316.125091\n", "215 2024-03-31 MSFT 320.870742\n", "452 2024-03-31 MSFT 320.870742\n", "\n", "[486 rows x 3 columns]" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# reshape and save in new variable\n", "new = pd.DataFrame(stock_prices_df.set_index('Date').stack(), columns=['Price']).reset_index()\n", "\n", "# sort by date and symbol\n", "new = new.sort_values(by=['Date','level_1'])\n", "new" ] }, { "cell_type": "markdown", "id": "05e266ec-8613-4e2c-9d3b-8fd87c59205d", "metadata": {}, "source": [ "Let's see some of these duplicates" ] }, { "cell_type": "code", "execution_count": 10, "id": "82a85ce8-cfb8-43c1-835c-6ba8393ce001", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Datelevel_1Price
2102024-01-01AAPL362.155787
2162024-01-01AAPL362.155787
\n", "
" ], "text/plain": [ " Date level_1 Price\n", "210 2024-01-01 AAPL 362.155787\n", "216 2024-01-01 AAPL 362.155787" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mask1 = new.loc[:,'Date'] == pd.to_datetime('1/1/2024')\n", "mask2 = new.loc[:,'level_1'] == 'AAPL'\n", "mask3 = new.loc[:,'Price'].apply(lambda x: np.isclose(x,362.155787))\n", "new[mask1 & mask2 & mask3]" ] }, { "cell_type": "markdown", "id": "4e9ea6a0-1715-40ea-b604-1483a22eae2b", "metadata": {}, "source": [ "### Drop Duplicates" ] }, { "cell_type": "code", "execution_count": 11, "id": "f9f11610-c99e-4f69-a9d3-370a3790162a", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Datelevel_1Price
2162024-01-01AAPL362.155787
2172024-01-01GOOG77.772647
2182024-01-01MSFT349.975219
4742024-01-02AAPL301.644022
4752024-01-02GOOG88.299780
............
2652024-03-30GOOG437.673426
2662024-03-30MSFT199.850571
4502024-03-31AAPL420.118719
4512024-03-31GOOG316.125091
4522024-03-31MSFT320.870742
\n", "

243 rows × 3 columns

\n", "
" ], "text/plain": [ " Date level_1 Price\n", "216 2024-01-01 AAPL 362.155787\n", "217 2024-01-01 GOOG 77.772647\n", "218 2024-01-01 MSFT 349.975219\n", "474 2024-01-02 AAPL 301.644022\n", "475 2024-01-02 GOOG 88.299780\n", ".. ... ... ...\n", "265 2024-03-30 GOOG 437.673426\n", "266 2024-03-30 MSFT 199.850571\n", "450 2024-03-31 AAPL 420.118719\n", "451 2024-03-31 GOOG 316.125091\n", "452 2024-03-31 MSFT 320.870742\n", "\n", "[243 rows x 3 columns]" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# identify dupes\n", "dupes = new.duplicated()\n", "\n", "# we are going to keep the first occurrence of any duplicate found\n", "clean = new[dupes]\n", "clean" ] }, { "cell_type": "code", "execution_count": 12, "id": "88f0dc39-1b1e-4844-9800-394b28c4eca1", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Datelevel_1Price
\n", "
" ], "text/plain": [ "Empty DataFrame\n", "Columns: [Date, level_1, Price]\n", "Index: []" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# verify we have no duplicates\n", "clean[clean.duplicated()]" ] }, { "attachments": { "6cb0372e-0ac8-48b9-8f6b-50a6741b2e4f.png": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAo4AAACYCAIAAADlfP8YAAAgAElEQVR4Ae2dz44VS3L/+z1m5XdgM2y88yN4xUhe9dr6+QF+uhsvvGILFyHENGIY3MbgVl+EQBhp7kijkUYj7t2x79fAysjMyPiTmZVZUYeuw4mj0ZyqzMhvRnwiquLUoW/32Y2/nIATcAJOwAk4gR0TONuxb+6aE3ACTsAJOAEncOOt2ovACTgBJ+AEnMCuCXir3nV63Dkn4AScgBNwAt6qvQacgBNwAk7ACeyagLfqXafHnXMCTsAJOAEnsEGr/uWXX5yjkYAzNAK8ublxhs7QTsCu4HXoDA9BYINW/fe//93u2YkrOEN7AThDZ2gnYFfwOnSGhyCwQav+29/+ZvfsxBWcob0AnKEztBOwK3gdOsNDENigVf/1r3+1e3biCs7QXgDO0BnaCdgVvA6d4SEIrGzV9N9j/vKXv9g9O3EFZ2gvAGfoDO0E7Apeh87wEATWtOpffvnl6uoKvfnzn/+Mx36wjoAzXMeNrnKGlMa6Y2e4jhtd5QwpjXXHzlATmG7VsU/TVv2nP/1pXT6+0aqLe2dnZ3d++Pkbbbdqm/0yBHpn5HXvYlWEh1+0X4YQ+88/3CEUz87O9ghynwwVup3SizW+T4bBN3Ut77MIb25ubpGhLrZbuVA1gblWjX2atupPnz4d/j68eodA/s6dOzvv1ftlGC7vUqtQxzv92LNfhvHq59gu7vHz1QW+6cJ9MgR+pQhvbm4iUVKYm1Kwie2TYYiJX8u2KA+7+hYZhtLawaWpCUy0atqnaav++PHjYfNmUQ/g713sBH87kP0ylJd3ON9BJVdQ7pXhfolpiPtkCI2ZtergOYzusBT3yTAQk9eyzv9eRm6R4U56hSYw2qpFn6at+v3793vJsPIjcIdr/OLeThtMdHm/DOXlLc8V8lsb2CfDY2rUNzf7ZFhv1bH17K9X75NhuCz3e+3Km8YtMtxJq9YEhlq17tO0Vb97906S3ss5qU1yuBfviB/7Zci5hbP93RwjyF0y3DMwUn/5cJcM4/OzeqpOvWd35bhPhiHD/FrOOd/j+y0y3Emr1gSWW3W1T9NW/dNPP+0x27I0d12n+2ZIfxyqdsfcR/r3yLD1PLgPYtqLPTJMX3XXCm+XH4T2yTDkOuDirxpUXRXffuQWGcIlSyndDiNNYKFVt/o0bdX0+NsntbOjaM7itLPw20/tlqH8JA6FvLsHGUjYHhkeW6veI8Nja9X7ZBgukT3fAfk99xYZhkt2Bzc4TaDXqjt9mgq9efOGc97Hmfp0BJ+UdpCEGp6dMqxd3rvtPrtkGO6OO62546nDVsm1xmuRfbuxXdYhhH88rfoWGYai2sEVqwk0W3W/T9NW/erVq293HQzvVLuM93vf3CfDAFtf3nuluEuGUIY7uPIHr5tdMmz9W3XtEh+M85Bm+2RYv5YPycGifYsMd9KqNYF6q17s07RVX15eWrJymLX1y3ivXeZmlwwhM6pV18keJotTqjtlGADu4WP6EMt9MqyX3F4v5n0yDOlX1/JQTdyG0S0y3Emr1gQqrXqkT9NW/cc//vE2stnds35xpx+suJ2fE+j6u0eG0WFxee+48eyWIRQj/UUygew+fwXKPhnqqxnKcJ+/AeVmnwxjzckq7N6UbnHyFhnupFVrArJVD/Zp2qr/8Ic/3GJSq1vDlVztyPqqrwp868EdMkwI4k2R/ERkFeu35lXbb78Mg7fHwXGfDNNnHVKEO+3SUJb7ZFitwd3+gtZbZLiTVq0JsFY93qdpq3727FntzuljEwSc4QSshqkzbICZGHaGE7Aaps6wAWZi2BlqAqVVT/Vp2qp///vfTyTBTWsEnGGNytyYM5zjVbN2hjUqc2POcI5XzdoZagKlVdeIDY09efJkyM6N2gScYZvN6IwzHCXVtnOGbTajM85wlFTbzhlqAhu06sePH7eZ+8wQAWc4hKlr5Ay7eIYmneEQpq6RM+ziGZp0hprABq36xx9/HMLvRm0CzrDNZnTGGY6Sats5wzab0RlnOEqqbecMNYGzT58+ffz48cOHD+/fv3/37t3bt2+vr6+vrq7evHnz+vXrV69eXV5evnz58sWLF8+fP3/27NnFxcXTp0+fPHny+PHjR48ePXz48IG/nIATcAJOwAk4gYMR2OCp+sGDB1/9ZSPw66+/2gR89VdnaC8CZ+gM7QTsCl6HmoC3antdbaCgE7OB6IlJOEN7wp2hM7QTsCt4HWoC3qrtdbWBgk7MBqInJuEM7Ql3hs7QTsCu4HWoCXirttfVBgo6MRuInpiEM7Qn3Bk6QzsBu4LXoSbgrdpeVxso6MRsIHpiEs7QnnBn6AztBOwKXoeagLdqe11toKATs4HoiUk4Q3vCnaEztBOwK3gdagKqVYc/KzD3Fxn8J8C9NO0E7Aq6uO2ap6bgDO0Zd4bO8BAESqsmf/pnk1b95f7d8Kdwzq+rbtdn0yj+CR26+Pr87Ozu/S9VtaMfrF/ePRwh5KX5UZshod0zFgwznGrNhGI6Y/WUzfkwHY1Veff+dSzseMr+n5br7nFVHRQMs03C0IivPivR0cV+Lav7Yg+XSAPWHEWabb7raxkj/347Qc6jvhJzq8Y//bXVU3W4Gs/Pw//VenVjNtRrubXC/RRPT+zyhuAZPA4nXtpI5+vXr/UVrB9VbCpDfKNcO7t/F8UdAfF+nGJQUzCALL/cv5uPF1B8dzUpGCZeIUy/lkcvAM2wfmXmGssfucu5X8uCIVyfpJHEC5gC6yTnOC9SQeDr16+5VePveNuoVcPVfQ19gCDOQFuz4s4IGckJOU7iOeKFd5EYFnhtad0ARrG9r7ep7bj/sRrD8/vhEVhUYABz9/598i1Ns7aiafOrnOa6/dOqeygYRqPW1dqfFehYNX533ChNwZAFTu3ycd0ARrFy19vkXY7rvcaQX8V1IrUoj7PYBIHDtep0dcePh7nZIsfmbOBPrSlleoxK38sBTwyUIa9MHmjTIEBK17fJhm93HGecYfwi8PxaY0iFROuJYOOxhtW0IPksfBLtTAvrIzgVDMHjAAeKkRLDWJqzEh1dTY9R6Xs54Ax1/Yk4mwYBkl/LQKvKiPAJRnCavyHPN082SL9gpBO7vIJ5FYUAD/NUHUAkWuFQsGjPissbMpQXV4RE0R/xKUsMC7saVJMFWTpgQ6yr2xzXIGOYvlU8v84tG2PJ1cf4AInKl+VhPBcgKpQDplGGj/dIMAyBZF7pUNBozwp0rNa+O24044whC5ta4XGTBVk6YEOsUfp4DxjDci2zgGjE1+fkMoWJ3KwrT4t8nsow/ds9EQQO1Ko5Cllmvdkwh/cCMMwdv0L8dlFuuztLDCcUMaSPixEODCEn6kmZKUd0PhyXGTjCmoaz/LG0qi6V9nXOGNLLm1ZgCZmOQiAl/hJ8GWMJyIErjTxxrO+CIaUYQpLxFpp6NswhyMgRS03qHCuuqt+MIScUMbBSgiHkRAXLTDmi8+G4zMARAoaz7/FapgCaVcRZSLNwjqCCnjSge9zaMasi8OIAT9UclCTRnWUVJp7Gdwl0q0yyxAAFffUSAOSQe0DoDtgsb8TV933GGLImU1CEiBPZMsjCAiZ4LRN7ZpVOGho10+MYEwyrX0mUygRW5K7HcCSQ1X7BDI+DzLiXjCFQKMSyCgFADvNsfCd0B2yWN+Lq+z5jDNm1XPzmEcNZLrbwjnUp4ClDWKRTVDa6lSNB4CBP1XUUmdzybAuaIH4r/A62KU9MCDUDK1sSAEARa7GY0A+IIzaLGxHp3R9yhqzL5DgJQopKhJat01NLqyD3+nlcRDN1WmNI73/xOFWeX8tVtpwhKSZiTQpx5DodsVnciGy/+0POkF3L6DsJGPiU65TjIqzDWj6Jans7EAQO0aprJArU/ixgLMQ5PUGcTx77mUgMYJIgKAAgKg1iFeLoiM3iRkcEtsYwf5yJdcf+20GKk0dJijQcIlBuFc7aGtr2KEY4QwICvYeqAqz9Wb+WEzLAJGuIFg4QlQZ+LWPF1dsrxSorkZ9T1kFUntON9nPMr8Tg19ZfgEPd5RskBp5H8ztOwEEZDYxV0SbjYNWa43pHeKYSA9XGHq1hpAAAaPTRO64oBqkotQanuLjR0dAUDCEwrMQUJgFG6kn/UEq2C8sYUk6DaPCJYz1jDKHEkGAOKY/m9zwe38toD12w6mDlksd2xhgG51PtEZIwUgAAtFxyZUUx8GuZ/yeXkWjhU6oOSRWcYEzgi09B0LuL1F6qTVXR1q2aMSNRx/HwVFMQlmlcFai2qIER/TKuZVh0j+dIJyb4LkJWAS/Nj2gAoxGh3cMUDMUVGq9ucsWGmJFonE3VhaPxLktOJQOmISeP8ZwyhKIgwHI8cdyv5cxDvlOGZQ6olduXKqql+aA0YjNmVPza55FgyC5PgCjrklqcXwdQxAK54Rg1L3eBPbEQBGpP1fi7UIYP/HeA21OsE2PXPDUFZ2jPuDN0hnYCdgWvQ01AfQE+3KHR0Fu1l6adgF1BF7dd89QUnKE9487QGR6CgLdqO9UNFPzytkN0hs7QTsCu4HXoDA9BwFu1neoGCn552yE6Q2doJ2BX8Dp0hocg4K3aTnUDBb+87RCdoTO0E7AreB06w0MQOPv06dPHjx8/fPjw/v37d+/evX379vr6+urq6s2bN69fv3716tXl5eXLly9fvHjx/PnzZ8+eXVxcPH369MmTJ48fP3706NHDhw8fPHjwq7+cgBNwAk7ACTiBwxDwp2r7B6ANFPyTuB2iM3SGdgJ2Ba9DZ3gIAt6q7VQ3UPDL2w7RGTpDOwG7gtehMzwEAW/VdqobKPjlbYfoDJ2hnYBdwevQGR6CgLdqO9UNFPzytkN0hs7QTsCu4HXoDA9BwFu1neoGCn552yE6Q2doJ2BX8Dp0hocg4K3aTnUDBb+87RCdoTO0E7AreB3ulmH4ZeDqt693vJ2170jNTukqwlZ9ca/8Lvmzexf4a0OXD/wXi86mQdvrxGgbH+kTcIZ9PiOzznCEUt/GGfb5jMwehqH4Kx6LjszaLwpOGGgCqVVf3Lvzw8+5K0PXHu/WA606xIwv/PMm0XE6pz/x0Fn6x1JE0NRsSoQupPrsT6+gx2wUA6Lr+B+2wYXir+IoF3ViRIBCoAgPuJR8KmuUNgwwFMxDNkNl2ATwILNskozn3eN8ZSIYpLCYG4wtnwlL1jMMq03e8pBSYVQjq8XV3JpNSLywZ4MhXchB0RlWtzEEG0MGke7Ldq0Gkgmm96m4Ri6BekUxv3i+1k0F940MI4FaneSLItYXd1fwWyzoqn5n65Ygw6QyS2dLPQzka4ChjjhfxA0ysC2fi/7xMdTl9jQUeeGwgJhab1WftiaAT9W5T4f3n3+4czb+ZL3UqkMg6L+AA6d5Ek5KRtXfFUWI4mCtCBAm+6Esn2DyaBMPwDL733aYS8g4Ry7v63PBsOY3+FRxaeDa7njY2TqsqjsSvEAuIF5O8bYZ/EIjSjasvnt+zv4iKvMQTsTOuripZLzX4G5cwOZt2qYvgrmpxNXwCjqfCLKEFLZLL1wPs21Qyx4uMewptPftB1JCQkaTcQmFeB1yKMHxcfILYaI0r6LkhoVhkqh5G8bKxcZgy/jbN6KOfneKb072C340SpS5WCWVdUAdoYbRJYZ5ZXkHjVg3TAktxCZT9sEYVSGUcgpCiIAG3VuV3AomsizjlCZQbdU34bl6+LF6qVUjLDigkdFjmAyB5qhhEvlwEXq2VqStT70IO/UtMWltMzkjz2dLE9ZnTpRF/uybsWVDvaFYJQ3kOZpnxTggSaGZOOCr8llrkzB+9/6X9Bal8hoU1lvr4kbjyoESLDZ8Kp/BewZbjFtHeRmZD0MyLjIdDvkqHSOaZ0N4p17lCbRsiijL6VskVaDHsDfdlx6jY9WDLAPv6+KKz/Z0LWCdIs98yz6xwXhSmzLXYRBV3kqEClDxrjMFRlX9Eo/aunMDbH8IAyfoPUoGUPwNM3gXjcNzDMt10w6dz2Tv+GhxqXfT5xepDKupmLckm4QhTbtFoNqqw1M1+UKcPnBXjte3apUjEmcKgwTWOFwp0tYnTqQt1RZxHAyxGtuC8YsZNKxV/lxpag8zG+5SHq3tiHPxgFfcwgZ4G+SrhCQ5rbtV3wVHwwFCUylAM9xmK4b8UkR5vSFO1Q7AHN3PXTiQY3GJpXyTAbx8gfyoFtSVRd4SJqiHplbdTdBAINmr9K687urTxTos1AoHImBciUY4ggdakU/h5RCHV9QhdQodYd7iKG6tgOQZti4PlndU0naNKW1Y1JqZVe6heFkMRzBOAYTROYZFsbVJ66KbtVceawEVePJOxYlLw4EEUCFQadXw9fd4p76Za9U0EnQWWZfMw1H4DhRfKpq4aqVIW7/4kN3SW4QZMdoWDMZhFr4xiAfi6q4kJu+t32HfOgzhEq5tjaPBsodgKreOwSxlKEUvY5YAyw7RMmyGYeoIgPf9LyWEmctbBlJUMokhb9kyfgJoigjxn8XFFkmvBvAS3SilBpoPPsJDWD7DkKe1u+9AIAyEurgqA424lB9kIBxiRbENwag+xcMcWLWeIb+pMG8DQe4fiYs5Fbm07pxkFdPvbN0VbGaWbJTc0yGECW0XRucYlvjrajGFHF9cM2sPqyDmdGnroHqS5LsDYhYOlXOagGzV0KfHv/wOD9kzrRqiLG6Bv+I0eV2zJJGW7MRkT4u09Qfpg0C5F0NBE+IQGnMYFkBbI6tSHDoxJMB4GAVjW9QCYCNdQg1Y21iERumeVPFwYOt0zREARRfcKgnCiYpXwTT7GabLKjAWp2R67PIeCWTCW4xEHUiRdlz5bpU+7+TglWJ0vQBIBjDMFi2DgqXSw6g3UIfomVAY3DderjxzKFkO1sYFTlEeZvIizOV8GRjCTSR7HxBgvoMXeAKYNKFETzgMdvmi6tBoT/UES8bkHQD2LS5HN8o5dTZHXMRmGJZVrcYfP2zQj/V5TQNiiFi5Sj3GSW1ZlxQM24nOnmkCtFXHNj3xPB2/DR9v1dLf4BaMpfvU2fk53qoVAzDUSYXQVoi09dVM7ZOfKMQUCGYQI4sOxzqlx/naSZnRiUkTtbeoxvYKZtolXAxTDXbRaNFDaqa2ThtpcjnB1RXKq7C+eBmm2bowjS9SKmn3LRjGcmS7JvVaEeQp+Q6OEpGFuMrymAWyskzVKiwnvTCL9gugOkkZZyhjDDsv71v8awWZIla1MaQPq6iymXwtzORips+LdOaJUIp3vA1TNLBmV6o8QcI2oUYW9EsZ8UtP7Y2ChUQ64qZgl69YfcF2blnjdchdqJYNxM3h5VWz9ur+wAMOsjVJAEE8COct2sk1TQBbNfwXWtNteuKpWrqbadF3UiKKQQ0BXYvHYyJtfb2Rso0ZK7DD3soIdeCAJCrd0+hynRgMp3YAiiTZeX8xhEvRFRzhBwMe5gW1rfNcQMBdgBEWeratFDUY5yubvFfXBz/4hJ3hlLckEHaoRWbimsNbvzEwd+DWwUFpD3HFIMOOAkrpBOEULKfljzN4ABy6JhV9pQsDpJLyIecRN62QHwizsmo1w563mkewrsXRvhF19DtTvTsbpisfgE49bSP5yjLbfgEenKr7VO2rALBqD9Fx5gN50atgJNcieafSuopSqw6NelWfHvwCPDpXjR/TE+872V1gkE+CTZCg52QdO6Ql0RHpTMmdwJT5DuGwkconRJTR1kpRJ4bFJE+U85XuT9eo/ehkfhxi8TRXVLbOYnIK4m5dJJVWnXXSe5BrJrwyaWRo9DY6vSQSrCqul8glwzITbysSCNizzJEVtd36Ho4w7Cvk7TtRdmLMq+v3UJytUwTHZlhQPYV3PExRpVsx5ASDOzT57cQrvHJpCpvrKxZls1HByj2wqFZ26+RrhGHRLkfgKi+BytAq+0Y9SLhiu8aq4kG9lGvfzcRWPfUfZ8kfAl/8Ahy879yyo99gVepDNB+WVjihphj5hAj0fvSqra8k1UWdd9caqWikv8wQVi+VJv1vmxMZFr92MjsF7zDNK5j7xM8Y+/bWX+7fJU7AHsgzQiqnzJ10UvGKmoVpok+mYKGasjAUzpO98LDiraA2IBLUgllxfiXe7FbFqzyVtiJ7LSdlieGyAuwOXmGM3ToRDLPzc3GFVXzPrMPegwl6Fa5/cklwP0CMTqNOb1U02oihqJN4u8re82C57+zqFScYhqpDOiO3bt4qu5klitzdOFEbwyWLDNGSH4AoyWoqCzZCV4zag121Hnp56aySTuTE4rgmAK06/iM1eRKHw9Gn7KVWDYUkxaNrMZY8p4CyaTLLS5NZaZxsmojkq7u+O3VaYIQpLpQBt/eievJb4pEfiWLK6vuclktiFYSaPIclNLKmh0KERN6eiQWc0eb3tJ1YRr3KIMN7MCsO8kXECVyiixun4gGXoJUiYo/+LnvLGfZEqCc8rthiMiGW2ba/6Q5UVhGGfBUDtezhEsOOQmdfPkXBq4+dwnQsrkAXPGPRUualAEpFdch3wuysShsaGDKXAwribY4xJZ3OgLt0gDvZoKL1cXs9xfJSBNkwK17uAs952KefryWG6Gk6EH6UsoGJ4m9eN2nfq4cci87LwqrsC5Bi2QszmgD+W7V8Vh4/X2rV6JIfNAnoxDRNfaJBwBk2wEwMO8MJWA1TZ9gAMzG8FcNGp256MmvfFDJPaALeqs1QtxDQidlC9bQ0nKE9387QGdoJ2BU2qsPwXKueVzvezdp3pKxTmoC3aivTTdbrxGwie1IiztCebmfoDO0E7Apeh5qAt2p7XW2goBOzgeiJSThDe8KdoTO0E7AreB1qAt6q7XW1gYJOzAaiJybhDO0Jd4bO0E7AruB1qAmcffr06ePHjx8+fHj//v27d+/evn17fX19dXX15s2b169fv3r16vLy8uXLly9evHj+/PmzZ88uLi6ePn365MmTx48fP3r06OHDhw8ePPjVX07ACTgBJ+AEnMBhCPhTtf0j4AYK+jPUBqInJuEM7Ql3hs7QTsCu4HWoCXirttfVBgo6MRuInpiEM7Qn3Bk6QzsBu4LXoSbgrdpeVxso6MRsIHpiEs7QnnBn6AztBOwKXoeagLdqe11toKATs4HoiUk4Q3vCnaEztBOwK3gdagLequ11tYGCTswGoicm4QztCXeGztBOwK7gdagJeKu219UGCjoxG4iemIQztCfcGTpDOwG7woHqcPa3kc3a2wNHBU0gt2r2FztG/1BH/D3h/jvAke/qA52Y1VInu9AZ2lPvDJ2hnYBd4TB1GDqv/tMdbW9n7dtK8zOaQGrVP/9w74ef01/ogK490a0HWnWIGV8CFp3Tv6+VznYwU7MpEbqQ6rM/vYIes1EMiK7jf1IKF6a/JJOXKBd1YlRyG54OuJR8Is4o8TDANmAeshkqwyYgODLLJsl43jzOVyaCQQqLucE85DNhyXqGKvqKU11veUgpyxWRelxNUGxC4oU9G17RhRwUnWF1G0OwMWwmiO1aDSQTTO9TcY1cApPkxdXAU8mi4VPBfSPDSKBa/yxQvTFn2Peyqt/ZWt4f8u5sF5VZOlvqkIWRLhbx1wYHGPJo41ncL/smLGBbPjdhT0ORFw4LiO3QW9WnrQnkp2r2h7Tm/nz1UqsOgaD/Ag6c5kk4KRmNt2tyLtDj6VoRIFzT5xNMHjcloLP/bYe5hIxz5PKmfya3sp64Bc5zl+LlgGPEGA87Hna2DqtqAOONETcE8VIDeNsMfqERupL//uX5OfsjgMxDOBE76+KmkvFeg7txgV6JLnubtumLgFEwuavjangFnU8EWUKCPNcy2wa17OESw55Ce99+ICUkZJTu38gFJjr6NQm+eI78QpgozasoeWFhmCRq3oaxcrExGCL8VLHNyonzsg47W3cEgx+NjZiLVVLZbwgNoYbRJYZ5ZXkHjdrlkE3EJlP2wRj9g1DKKQghAhp0b1XyKpjUE6EJ1Fo1PFbfu2Ddu3Oy1Kozq/hOI6PHMBsCzVHDJPLhIvRsrUhbn3oRdupbYtLaZnJGns+WJqzPnCiL/Nk3Y8uGekOxShrIczTPinFAkkIzccBX5bPWJmH87v0v6S1K5TUorLfWxY3GlQMlWGz4VD6D9wy2GLeO8jIyH4ZkXGQ6HPJVOkY0z4bwTr3KE2jZFFGW07dIqkCPYW+6Lz1Gx6oHWQbe18UVn+3pWsA6RZ75ln1ig/GkNmWuwyCqvJUIFaDiXWcKjKr6JR61decG2P4QBk7Qe5QMoPgbZvAuGofnGJbrph06n8ne8dHiUu+mzy9SGVZTMW9JNglDmnaLgG7Vk4365mZ9q1Y5InGmMEhgjcOVIm194kTaUm0Rx8EQq7EtGL9NQ8Na5c+VpvYws+Eu5dHajjgXD3jFLWyAt0G+SkiS07pb9V1wNBwgNJUCNMNttmLIL0WU1xviVO0AzNH93IUDORaXWMo3GcDLF8iPakFdWeQtYYJ6aGrV3QQNBJK9Su/K664+XazDQq1wIALGlWiEI3igFfkUXg5xeEUdUqfQEeYtjuLWCkieYevyYHlHJW3XmNKGRa2ZWeUeipfFcATjFEAYnWNYFFubtC66WXvlsRZQgSfvVJy4NBxIABUC2KrLz5WNP0/HR+25Vk0jQWeRdck8HIXvCvGloomrVoq09YsP2S29RZgRo23BYBxm4RuDeCCu7kpi8t76HfatwxAu4drWOBosewimcusYzFKGUvQyZgmw7BAtw2YYpo4AeN//UkKYubxlIEUlkxjyli3jJ4CmiBD/WVxskfRqAC/RjVJqoPngIzyE5TMMeVq7+w4EwkCoi6sy0IhL+UEGwiFWFNsQjOpTPMyBVesZ8psK8zYQ5P6RuJhTkUvrzklWMf3O1l3BZmbJRsk9HUKY0HZhdI5hib+uFlPI8cU1s/awCmJOl7YOqidJvjsgZuFQOacJYKsu33CHf6k+2/bHypAlRFncAn/FafK6ZkkiRcWU7F5E6dUAABCLSURBVGmRtv4gfRAo92IoaEIcQmMOwwJoa2RVikMnhgQYD6NgbItaAGykS6gBaxuL0CjdkyoeDmyd0kAAFF1wqyQIJypeBdPsZ5guq8BYnJLpsct7JJAJbzESdSBF2nHlu1X6vJODV4rR9QIgGcAwW7QMCpZKD6PeQB2iZ0JhcN94b+aZQ8lysDYucIryMJMXYS7ny8AQbiLZ+4AA8x28wBPApAklesJhsMsXVYdGe6onWDIm7wCwb3E5ulHOqbM54iI2w7CsajX++GGDfqzPaxoQQ8TKVeoxTmrLuqRg2E509kwTqLTqm5ubqZ8rG3+qlv4Gt2As3afOzs/xVq0YgKFOKoS2QqStr2Zqn/xEIaZAMIMYWXQ41ik9ztdOyoxOTJqovUU1tlcw0y7hYphqsItGix5SM7V12kiTywmurlBehfXFyzDN1oVpfJFSSbtvwTCWI9s1qdeKIE/Jd3CUiCzEVZbHLJCVZapWYTnphVm0XwDVSco4Qxlj2Hl53+JfK8gUsaqNIX1YRZXN5GthJhczfV6kM0+EUrzjbZiigTW7UuUJErYJNbKgX8qIX3pqbxQsJNIRNwW7fMXqC7ZzyxqvQ+5CtWwgbg4vr5q1V/cHHnCQrUkCCOJBOG/RTq5pAvVWHb4NH36uHmzV0t1Mi76TElEMagjoWjweE2nr642UbcxYgR32VkaoAwckUemeRpfrxGA4tQNQJMnO+4shXIqu4Ag/GPAwL6htnecCAu4CjLDQs22lqME4X9nkvbo++MEn7AynvCWBsEMtMhPXHN76jYG5A7cODkp7iCsGGXYUUEonCKdgOS1/nMED4NA1qegrXRgglZQPOY+4aYX8QJiVVasZ9rzVPIJ1LY72jaij35nq3dkwXfkAdOppG8lXltn2C/DgVN2nal8FgFV7iI4zH8iLXgUjuRbJO5XWVVRv1Zs/VUfnqvFjeuJ9J7sLDPJJsAkS9JysY4e0JDoinSm5E5gy3yEcNlL5hIgy2lop6sSwmOSJcr7S/ekatR+dzI9DLJ7misrWWUxOQdyti6TSqrNOeg9yzYRXJo0Mjd5Gp5dEglXF9RK5ZFhm4m1FAgF7ljmyorZb38MRhn2FvH0nyk6MeXX9HoqzdYrg2AwLqqfwjocpqnQrhpxgcIcmv514hVcuTWFzfcWibDYqWLkHFtXKbp18jTAs2uUIXOUlUBlaZd+oBwlXbNdYVTyol3Ltu5nYqi/ukZ8lgx8wG36mHvgJcPC+c8uOfoNVqQ/RfFha4YSaYuQTItD70au2vpJUF3XeXWukopH+MkNYvVSa9L9tTmRY/NrJ7BS8wzSvYO4TP2Ps21t/uX+XOAF7IM8IqZwyd9JJxStqFqaJPpmChWrKwlA4T/bCw4q3gtqASFALZsX5lXizWxWv8lTaiuy1nJQlhssKsDt4hTF260QwzM7PxRVW8T2zDnsPJuhVuP7JJcH9ADE6jTq9VdFoI4aiTuLtKnvPg+W+s6tXnGAYqg7pjNy6eavsZpYocnfjRG0MlywyREt+AKIkq6ks2AhdMWoPdtV66OWls0o6kROL45pAbNXlx7/hcZz07fLTZs2jpS/AoZDIYz4cRtdiLHlOAWXTZJaXJrPSONk0EclXd3136rTACFNcKANu70X15LfEIz8SxZTV9zktl8QqCDV5DktoZE0PhQiJvD0TCzijze9pO7GMepVBhvdgVhzki4gTuEQXN07FAy5BK0XEHv1d9pYz7IlQT3hcscVkQiyzbX/THaisIgz5KgZq2cMlhh2Fzr58ioJXHzuF6VhcgS54xqKlzEsBlIrqkO+E2VmVNjQwZC4HFMTbHGNKOp0Bd+kAd7JBRevj9nqK5aUIsmFWvNwFnvOwTz9fSwzR03Qg/ChlAxPF37xu0r5XDzkWnZeFVdkXIMWyF2Y0gfoX4M22XJtYatXokh80CejENE19okHAGTbATAw7wwlYDVNn2AAzMbwVw0anbnoya98UMk9oAt6qzVC3ENCJ2UL1tDScoT3fztAZ2gnYFTaqw/Bcq55XO97N2nekrFOagLdqK9NN1uvEbCJ7UiLO0J5uZ+gM7QTsCl6HmoC3antdbaCgE7OB6IlJOEN7wp2hM7QTsCt4HWoC3qrtdbWBgk7MBqInJuEM7Ql3hs7QTsCu4HWoCZx9+vTp48ePHz58eP/+/bt3796+fXt9fX11dfXmzZvXr1+/evXq8vLy5cuXL168eP78+bNnzy4uLp4+ffrkyZPHjx8/evTo4cOHDx48+NVfTsAJOAEn4AScwGEI+FO1/SPgBgr6M9QGoicm4QztCXeGztBOwK7gdagJeKu219UGCjoxG4iemIQztCfcGTpDOwG7gtehJuCt2l5XGyjoxGwgemISztCecGfoDO0E7Apeh5qAt2p7XW2goBOzgeiJSThDe8KdoTO0E7AreB1qAt6q7XW1gYJOzAaiJybhDO0Jd4bO0E7AruB1qAl4q7bX1QYKOjEbiJ6YhDO0J9wZOkM7AbvCgepw9reRzdrbA0cFTaDSqsNfwDyb+Isd/jvAke/qA52Y1VInu9AZ2lPvDJ2hnYBd4TB1GDqv/tMdbW9n7dtK8zOagGrV6Y9seauep2tYoRNjEDvRpc7Qnnhn6AztBOwKK+sw9NZmN+Z/iiOapj+HVe3f4/ZgmaT4Z4HBXSq/qVwTEK06NOp79+75U7W92qYUdGKmlrtx9c/GOZZZAl6Hs8S0vTPUTGZH5hmSdlltvPGPcqapYIxWjf5OO3XPHuzwb4KA2PguYfHd83P5106rdzPWqkOjvvPDz+EbcH+qnq0uk/18aZq2+y4XO0N7Wp2hM7QTsCvM1mHul/COTZj60Z6JTRybbVo0ah96M13bXAcT1DL+0fm797+EGT6x1KpTo7658VZNc/xNjmdL85s4dWSbOEN7wpyhM7QTsCusrcNuo1TtMPlZaaLQvkfs9Yb0uZqCULvg0nCg9tIE8KkaG7W3agr4Gx3rxHyjjb+jbZyhPZnO0BnaCdgV1tYhtj/hgnz2ZdOV5jpsrw17LpRv3en38XOtOjxI3/nh5xt4+VM1y+S3OFlbmt/Ct2PZwxnaM+UMnaGdgF1hbR02+qRuqMVFaNTioXbcXlvWXZC7hPP8Rf1Mqxa9WZzG/t3+f/+PtUre1x6tLc21+32P65yhPavO0BnaCdgV1tZhu0+KZpxdlC0UxnX/zeZfpb02rblQW5UbNf6TNe4BB5rA2c1N+Oq7+sLH7HabDjPeqjnkNWc6MWtUTnuNM7Tn3xk6QzsBu8LaOqz1SWivpTES52QHTVNheNReb6iat94FRiotl36c0ATw36pJO/anapLOb3OoE/Nt9v2ednGG9mw6Q2doJ2BXWFuHunPSfxRmfsV+qVtyRQLWNezDMG2xYnljFfMkLKESMKkJeKtm1G7rRCfmtjw53n2doT13ztAZ2gnYFdbWoWiUX+O3y/oZGez08Ap7/pU46GLbbe/CCAUzXJNnNAFv1ZnNrb7rxNyqO0e5uTO0p80ZOkM7AbvCbB3Gpsi/U4YHZphQj87xWZebx245aw+hUjnSc+kw7kXmMyZDqybfhY8c+r9VZ+br32dLc/1O3+9KZ2jPrTN0hnYCdoWt6rDeedv+zdq3lawzmkDtqXqkPxMbb9XWtHz9qhNj1zw1BWdoz7gzdIZ2AnaFjeowPNdWHmOb/s3aN4XsE5qAt2o71Q0UdGI2ED0xCWdoT7gzdIZ2AnYFr0NNwFu1va42UNCJ2UD0xCScoT3hztAZ2gnYFbwONQFv1fa62kBBJ2YD0ROTcIb2hDtDZ2gnYFfwOtQEzq6urv4nv97A6zW8/ju/XsHrv+B1mV//Ca+XL1/+w7/90z/82z/96i8n4AScgBNwAk7gMATOPttesVXbP0aduIL+DHXiQFaE7wxXQBNLnKEAsuLUGa6AJpY4Q03AW7Uokts51Ym5HT+OeVdnaM+eM3SGdgJ2Ba9DTcBbtb2uNlDQidlA9MQknKE94c7QGdoJ2BW8DjUBb9X2utpAQSdmA9ETk3CG9oQ7Q2doJ2BX8DrUBLxV2+tqAwWdmA1ET0zCGdoT7gydoZ2AXcHrUBPwVm2vqw0UdGI2ED0xCWdoT7gzdIZ2AnaFA9Xh7G8jm7W3B44KmgC26t//y+9++xvyv398+O7zwMt/AhzhWg50Yixqp7nWGdrz7gydoZ2AXeEwdRg6r/rTHR1nZ+07UtNTmgBt1f/8///38+zLW/V0EmoLdGJqVj7WI+AMe3TG5pzhGKeelTPs0RmbW8kw9NZmN+Z/iiOapj94Ve3f4/Zgmf92FtMa3KXym8o1gdyq//c//vF33qrH6ugAVjoxB9jkO5d0hvYEO0NnaCdgV5ivQ9IuWbNEX2jnDcdo1ejvo/Zgh38TBMSS9MAuweTu+fnY36v+HF/eqjGnt3EwX5q34eW+93SG9vw4Q2doJ2BXmK3D3C/hHZsw9aM98zUvpuZxsKrE7UNvxkb99Wt7XWWXMHT3/pf0xrav/K3F/FR9+a+/+d2//vh5+uVfgHPCK89mS3PlNt/1MmdoT68zdIZ2AnaFtXUI/bDWYKvtMPkJi2i7TR1XDGFU1F5vSJ+rcUnu4VQSl4YDOgGrNAHaqsuPlf3L5efBl7dqmo7Vxzoxq6VOdqEztKfeGTpDOwG7wto6xPYnXJDPvmy60lyH7bVhz4XyrTt9+p5s1Z/JKzxh/9Z/Apyl88Ana0vzwG4dlbwztKfLGTpDOwG7wto6bPRJ3VCLi9CoxUPtuL22rLsgdwnn+enf0Ko/f/7p4T//5v/9x0+fl1/+VF2SbjhaW5qGLb+7pc7QnlJn6AztBOwKa+uw3SdFM84uyhYK47r/ZvOv0l6b1lyorcqNOv7ztnJPE8hfgH9mr9Cqx/7p2ls1ZtFyoBNjUTvNtc7Qnndn6AztBOwKa+uw1iehvZbGSJyTHTRNheFRe72hat56FxjJ/3EXeaf9WhOot+of//23/lRNcnrwQ52Yg2/53W3gDO0pdYbO0E7ArrC2DnXnpP8ozPyK/VK35IoErGvYh2HaYsXyxirmSVhCJWBSE0it+sd/Lz/+DY/Uvx38yTJ/qmbU157oxKxVOt11ztCee2foDO0E7Apr61A0yvyD16ohg1310VkrhGja9vwrcbDDtttZRREFM1yTJzQBbNXlx78Hv/r+DC9v1Zmt6V0nxiR3koudoT3tztAZ2gnYFWbrMDZF8l1y/qVlMKE6dXzW5eaxW87aQ6hUjvRcOox7kfmMaa5Vf1778ladgZveZ0vTtNl3utgZ2hPrDJ2hnYBdYas6rHfetn+z9m0l64wmUP+36s/DL2/V1pzAep2YTWRPSsQZ2tPtDJ2hnYBdYaM6DM+1lcfYpn+z9k0h+4Qm4K3aTnUDBZ2YDURPTMIZ2hPuDJ2hnYBdwetQE/BWba+rDRR0YjYQPTEJZ2hPuDN0hnYCdgWvQ03AW7W9rjZQ0InZQPTEJJyhPeHO0BnaCdgVvA41gbOrq6v/ya838HoNr//Or1fw+i94XebXf8Lr5cuX8d+qf/WXE3ACTsAJOAEncBgCZzfm14MHD+wfo05cQX+GOnEgK8J3hiugiSXOUABZceoMV0ATS5yhJuCtWhTJ7ZzqxNyOH8e8qzO0Z88ZOkM7AbuC16Em8H9prptPxwqz+QAAAABJRU5ErkJggg==" } }, "cell_type": "markdown", "id": "15f08c99-0026-40d8-9855-c9e69e19c6fe", "metadata": {}, "source": [ "### Missing Values\n", "\n", "Now I will be honest with you, I do not find Pandas and the Jupyter environment great for data gazing. I actually use pd.to_clipboard() and paste the data into Excel. There I have an easier time looking thriyugh the data to verify it looks good.\n", "\n", "It looks like all the missing values were taken care of while working on the duplicates. The picture below came from the original data set.\n", "\n", "![image.png](attachment:6cb0372e-0ac8-48b9-8f6b-50a6741b2e4f.png)" ] }, { "cell_type": "code", "execution_count": 13, "id": "4d2688bc-3c20-4e04-bd20-dff9fee3f2c0", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Index: 243 entries, 216 to 452\n", "Data columns (total 3 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 Date 243 non-null datetime64[ns]\n", " 1 level_1 243 non-null object \n", " 2 Price 243 non-null float64 \n", "dtypes: datetime64[ns](1), float64(1), object(1)\n", "memory usage: 7.6+ KB\n" ] } ], "source": [ "clean.info()" ] }, { "cell_type": "code", "execution_count": 14, "id": "ca1b4371-fe60-4803-83de-605bc2e5821e", "metadata": {}, "outputs": [], "source": [ "#clean.to_clipboard()" ] }, { "cell_type": "markdown", "id": "52f1c1f2-274a-4aed-a9c2-576f7f6483ad", "metadata": {}, "source": [ "### Merge data\n", "\n", "For this tutorial I am simply going to group my daily data into quarterly so that both dataframes represent quarterly data." ] }, { "cell_type": "code", "execution_count": 15, "id": "f72e3ae1-a859-475f-9133-50bfd8dba7da", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
DateSymbolPrice
2162024-01-01AAPL362.155787
2172024-01-01GOOG77.772647
2182024-01-01MSFT349.975219
4742024-01-02AAPL301.644022
4752024-01-02GOOG88.299780
............
2652024-03-30GOOG437.673426
2662024-03-30MSFT199.850571
4502024-03-31AAPL420.118719
4512024-03-31GOOG316.125091
4522024-03-31MSFT320.870742
\n", "

243 rows × 3 columns

\n", "
" ], "text/plain": [ " Date Symbol Price\n", "216 2024-01-01 AAPL 362.155787\n", "217 2024-01-01 GOOG 77.772647\n", "218 2024-01-01 MSFT 349.975219\n", "474 2024-01-02 AAPL 301.644022\n", "475 2024-01-02 GOOG 88.299780\n", ".. ... ... ...\n", "265 2024-03-30 GOOG 437.673426\n", "266 2024-03-30 MSFT 199.850571\n", "450 2024-03-31 AAPL 420.118719\n", "451 2024-03-31 GOOG 316.125091\n", "452 2024-03-31 MSFT 320.870742\n", "\n", "[243 rows x 3 columns]" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# clean up column names\n", "clean = clean.rename(columns={'level_1':'Symbol'})\n", "clean" ] }, { "cell_type": "code", "execution_count": 16, "id": "4a1b44f9-799f-49d0-bb8c-920e5e512194", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Price
SymbolDate
AAPL2024-03-31291.007603
GOOG2024-03-31299.556233
MSFT2024-03-31283.193836
\n", "
" ], "text/plain": [ " Price\n", "Symbol Date \n", "AAPL 2024-03-31 291.007603\n", "GOOG 2024-03-31 299.556233\n", "MSFT 2024-03-31 283.193836" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# resample to quarterly frequency\n", "clean_qtr = clean.groupby('Symbol').resample('QE', on='Date').mean()\n", "clean_qtr" ] }, { "cell_type": "code", "execution_count": 17, "id": "b5d939a7-9969-4212-88a8-c934eb316316", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
PriceRevenueNet Income
SymbolDate
AAPL2024-03-31291.0076032200450
GOOG2024-03-31299.5562334500900
MSFT2024-03-31283.1938363000500
\n", "
" ], "text/plain": [ " Price Revenue Net Income\n", "Symbol Date \n", "AAPL 2024-03-31 291.007603 2200 450\n", "GOOG 2024-03-31 299.556233 4500 900\n", "MSFT 2024-03-31 283.193836 3000 500" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# combine both dataframes\n", "pd.concat([clean_qtr,financials_df], axis=1)" ] }, { "cell_type": "markdown", "id": "b7fe1ccf-4418-4ec5-b1b9-d618002b68b9", "metadata": {}, "source": [ "# How would you analyze the data above?" ] }, { "cell_type": "markdown", "id": "d024cb00-dc86-4143-8a31-12cb6e74f142", "metadata": {}, "source": [ "

This tutorial was created by HEDARO

" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.7" } }, "nbformat": 4, "nbformat_minor": 5 }