{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Import Packages"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"# preparation for notebook\n",
"import datetime\n",
"import pandas as pd"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Read and Clean Dataset"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"# read source dataset and convert it into a pandas dataframe\n",
"source_df = pd.read_excel(io=\"Report.xlsx\",sheet_name=\"Assignment\")\n",
"\n",
"# clean the dataset: if there is any row that the first column of it is null - we drop them\n",
"cleaned_df = source_df[source_df[\"Type\"]\\\n",
" .isnull()\\\n",
" .apply(lambda x: True if x == False else False)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### The Plan\n",
"To transform each bill into IC we can have a core function. We build that first.\n",
"\n",
"For a long list of bills, we have to do the transform in a neat way.\n",
"- we build up dictionaries according to groups\n",
"- divide the DataFrame into peaces according to the groups\n",
"- run the core function at the lowest level\n",
"- append them together than we have our result"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Preparations and References:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In order to record the time of the IC transaction properly, we have to know the month and the year of the bill. We usually code last month's transaction one month later so we create below function:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The last month is: November 2018\n"
]
}
],
"source": [
"# prepare the function to show last month - will use it in the memo column in the core function below\n",
"def IC_month(today):\n",
" if datetime.date.today().month == 1:\n",
" month = 12\n",
" year = today.year - 1\n",
" else:\n",
" month = today.month - 1\n",
" year = today.year\n",
" return datetime.date(year, month, today.day).strftime(\"%B %Y\")\n",
"\n",
"# test\n",
"testmonth = IC_month(datetime.date.today())\n",
"print('The last month is: {}'.format(testmonth))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We need to build up a mapping reference so that in the core function below, we can transform our dataset into the desired format. You can see a sample of the first two lines of the mapping in below table.\n",
"1. Full Sub: structured name for the subsidiary (system accepted name)\n",
"2. Short Name: short name for subsidiaries. Use it in memo and file name\n",
"3. IC from: the account name for the \"from line\" of the IC\n",
"4. IC to: the account name for the \"to line\" of the IC"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" Full Sub | \n",
" Short Name | \n",
" IC From | \n",
" IC To | \n",
"
\n",
" \n",
" \n",
" \n",
" Your Company, Inc. | \n",
" Your Company, Inc. | \n",
" US | \n",
" 1501-01 Intercompany Receivable - Other : Inte... | \n",
" 2120-05 Intercompany Payable - Other : Interco... | \n",
"
\n",
" \n",
" Your Company Public Sector, Inc. | \n",
" Your Company, Inc. : Your Company Public Secto... | \n",
" PS | \n",
" 1501-11 Intercompany Receivable - Other : Inte... | \n",
" 2120-13 Intercompany Payable - Other : Interco... | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Full Sub \\\n",
"Your Company, Inc. Your Company, Inc. \n",
"Your Company Public Sector, Inc. Your Company, Inc. : Your Company Public Secto... \n",
"\n",
" Short Name \\\n",
"Your Company, Inc. US \n",
"Your Company Public Sector, Inc. PS \n",
"\n",
" IC From \\\n",
"Your Company, Inc. 1501-01 Intercompany Receivable - Other : Inte... \n",
"Your Company Public Sector, Inc. 1501-11 Intercompany Receivable - Other : Inte... \n",
"\n",
" IC To \n",
"Your Company, Inc. 2120-05 Intercompany Payable - Other : Interco... \n",
"Your Company Public Sector, Inc. 2120-13 Intercompany Payable - Other : Interco... "
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# building up mapping reference\n",
"sub_mapping_ref = pd.DataFrame(data = {'Your Company, Inc.': ['Your Company, Inc.', 'US', '1501-01 Intercompany Receivable - Other : Intercompany Receivable - Other - from US', '2120-05 Intercompany Payable - Other : Intercompany Payable - Other - to US'], 'Your Company Public Sector, Inc.': ['Your Company, Inc. : Your Company Public Sector, Inc.', 'PS', '1501-11 Intercompany Receivable - Other : Intercompany Receivable - Other - from Public Sector', '2120-13 Intercompany Payable - Other : Intercompany Payable - Other - to Public Sector'], 'Your Company Singapore Pte. Ltd.': ['Your Company, Inc. : Your Company Singapore Pte. Ltd.', 'SG', '1501-06 Intercompany Receivable - Other : Intercompany Receivable - Other - from SG', '2120-09 Intercompany Payable - Other : Intercompany Payable - Other - to SG'], 'Your Company AU Pty Ltd.': ['Your Company, Inc. : Your Company HOLDING LIMITED : Your Company AU Pty Ltd.', 'AU', '1501-05 Intercompany Receivable - Other : Intercompany Receivable - Other - from AU', '2120-08 Intercompany Payable - Other : Intercompany Payable - Other - to AU'], 'Your Company Japan K.K.': ['Your Company, Inc. : Your Company Japan K.K.', 'JP', '1501-04 Intercompany Receivable - Other : Intercompany Receivable - Other - from JP', '2120-07 Intercompany Payable - Other : Intercompany Payable - Other - to JP'], 'Your Company Deutschland GmbH': ['Your Company, Inc. : Your Company Deutschland GmbH', 'DE', '1501-03 Intercompany Receivable - Other : Intercompany Receivable - Other - from DE', '2120-06 Intercompany Payable - Other : Intercompany Payable - Other - to DE'], 'Your Company Canada Ltd.': ['Your Company, Inc. : Your Company HOLDING LIMITED : Your Company Canada Ltd.', 'CA', '1501-08 Intercompany Receivable - Other : Intercompany Receivable - Other - from CA', '2120-10 Intercompany Payable - Other : Intercompany Payable - Other - to CA'], 'Your Company France': ['Your Company, Inc. : Your Company Deutschland GmbH : Your Company France', 'FR', '1501-10 Intercompany Receivable - Other : Intercompany Receivable - Other - from FR', '2120-12 Intercompany Payable - Other : Intercompany Payable - Other - to FR'], 'Your Company UK, Ltd.': ['Your Company, Inc. : Your Company HOLDING LIMITED : Your Company UK, Ltd.', 'UK', '1501-02 Intercompany Receivable - Other : Intercompany Receivable - Other - from UK', '2120-03 Intercompany Payable - Other : Intercompany Payable - Other - to UK'], 'Your Company Switzerland': ['Your Company, Inc. : Your Company Deutschland GmbH : Your Company Switzerland', 'SZ', '1501-16 Intercompany Receivable - Other : Intercompany Receivable - Other - from SZ', '2120-17 Intercompany Payable - Other : Intercompany Payable - Other - to SZ'], 'Your Company Benelux': ['Your Company, Inc. : Your Company Deutschland GmbH : Your Company Benelux', 'NL', '1501-14 Intercompany Receivable - Other : Intercompany Receivable - Other - from NL', '2120-15 Intercompany Payable - Other : Intercompany Payable - Other - to NL'], 'Your Company Sweden': ['Your Company, Inc. : Your Company Deutschland GmbH : Your Company Sweden', 'SW', '1501-12 Intercompany Receivable - Other : Intercompany Receivable - Other - from SW', '2120-14 Intercompany Payable - Other : Intercompany Payable - Other - to SW'], 'Your Company NZ Limited': ['Your Company, Inc. : Your Company HOLDING LIMITED : Your Company AU Pty Ltd. : Your Company NZ Limited', 'NZ', '1501-15 Intercompany Receivable - Other : Intercompany Receivable - Other - from NZ', '2120-16 Intercompany Payable - Other : Intercompany Payable - Other - to NZ'], 'Your Company South Africa': ['Your Company, Inc. : Your Company HOLDING LIMITED : Your Company UK, Ltd. : Your Company South Africa', 'SA', '1501-09 Intercompany Receivable - Other : Intercompany Receivable - Other - from SA', '2120-11 Intercompany Payable - Other : Intercompany Payable - Other - to SA']\n",
"})\n",
"sub_mapping_df = sub_mapping_ref.transpose()\n",
"sub_mapping_df.columns = ['Full Sub','Short Name', 'IC From', 'IC To']\n",
"sub_mapping_df.head(2)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### The Core Function"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The core function have four steps:\n",
"1. Before creating the function, we prepare the desired columns for it. That's because the 'append' method in pandas DataFrame seems ignore the sorting order of the columns. We use the csv_columns to correct it in the end.\n",
"2. Create four empty DataFrame as demonstrated in the scenario in the post.\n",
"3. We transform the dataset for each of the four lines\n",
" 1. Expense lines for 'from subsidiary'.\n",
" 2. Expense lines for 'to subsidiary'.\n",
" 3. Receivable line for 'from subsidiary'.\n",
" 4. Payable line for 'to subsidiary'.\n",
"4. We append these dataframe together"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"# 1. Prepare the desired columns\n",
"csv_columns = ['Line Subsidiary','Account','Debit','Credit','Memo']\n",
"\n",
"# THE! greatest function to translate transactions into IC format\n",
"def translate_IC(test_df):\n",
" df_from = pd.DataFrame()\n",
" df_to = pd.DataFrame()\n",
" df_from_total = pd.DataFrame()\n",
" df_to_total = pd.DataFrame()\n",
" \n",
" # 3.1 lines for from subsidiary\n",
" df_from['Line Subsidiary'] = test_df['Subsidiary'].apply(lambda x: sub_mapping_df.loc[x,'Full Sub'])\n",
" df_from['Account'] = test_df['Account (Line): Name (GL-style)']\n",
" df_from['Debit'] = None\n",
" df_from['Credit'] = test_df['Amount (Foreign Currency)']\n",
" df_from['Memo'] = test_df['Memo']\n",
"\n",
" # 3.2 lines for to subsidiary\n",
" df_to['Line Subsidiary'] = test_df['IC Subsidiary'].apply(lambda x: sub_mapping_df.loc[x,'Full Sub'])\n",
" df_to['Account'] = test_df['Account (Line): Name (GL-style)']\n",
" df_to['Debit'] = test_df['Amount (Foreign Currency)']\n",
" df_to['Credit'] = None\n",
" df_to['Memo'] = test_df['Memo']\n",
"\n",
" # 3.3 total line for from subsidiary\n",
" df_from_total['Line Subsidiary'] = test_df['Subsidiary'].iloc[0:1].apply(lambda x: sub_mapping_df.loc[x,'Full Sub'])\n",
" df_from_total['Account'] = test_df['IC Subsidiary'].iloc[0:1].apply(lambda x: sub_mapping_df.loc[x,'IC From'])\n",
" df_from_total['Debit'] = round(df_from['Credit'].sum(),2)\n",
" df_from_total['Credit'] = None\n",
" df_from_total['Memo'] = 'Intercompany from {} to {} for {} in {}'\\\n",
" .format(test_df['Subsidiary'].iloc[0:1].apply(lambda x: sub_mapping_df.loc[x,'Short Name']).iloc[0]\\\n",
" ,test_df['IC Subsidiary'].iloc[0:1].apply(lambda x: sub_mapping_df.loc[x,'Short Name']).iloc[0]\\\n",
" ,test_df['Num'].iloc[0]\\\n",
" ,IC_month(datetime.date.today()) )\n",
"\n",
" # 3.4 total line for to subsidiary\n",
" df_to_total['Line Subsidiary'] = test_df['IC Subsidiary'].iloc[0:1].apply(lambda x: sub_mapping_df.loc[x,'Full Sub'])\n",
" df_to_total['Account'] = test_df['Subsidiary'].iloc[0:1].apply(lambda x: sub_mapping_df.loc[x,'IC To'])\n",
" df_to_total['Debit'] = None\n",
" df_to_total['Credit'] = round(df_from['Credit'].sum(),2)\n",
" df_to_total['Memo'] = 'Intercompany from {} to {} for {} in {}'\\\n",
" .format(test_df['Subsidiary'].iloc[0:1].apply(lambda x: sub_mapping_df.loc[x,'Short Name']).iloc[0]\\\n",
" ,test_df['IC Subsidiary'].iloc[0:1].apply(lambda x: sub_mapping_df.loc[x,'Short Name']).iloc[0]\\\n",
" ,test_df['Num'].iloc[0]\\\n",
" ,IC_month(datetime.date.today()) )\n",
"\n",
" # 4. append everything and return the greatest IC dataframe ever!\n",
" test = df_from_total.append(df_from, sort=True).append(df_to_total, sort=True).append(df_to, sort=True)\n",
" return test.loc[:,csv_columns]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Building up dictionaries and run the core function\n",
"Remember that the first requirement is to:\n",
"\n",
"*Given a list of bills, classify them into different groups according to:*\n",
"1. *IC (intercompany) from subsidiary, the employee payroll sub -- it's DE in the above case*\n",
"2. *Bill reference number -- unique key for bills*\n",
"3. *IC to subsidiary -- it's US in the above case*\n",
"\n",
"In order to achieve this, we have **THREE STEPS**:\n",
"1. Find all 'IC sub' in the DataFrame, divide the dataframe according to it, put the DataFrame into the dictionary. Now we are ready for a loop inside the dictionary\n",
"2. We:\n",
" 1. Loop through the above dictionary and do exactly the same for 'Reference Number' (a dict inside a dict)\n",
" 2. Loop through the above dictionary and do exactly the same for 'to sub' (a dict inside a dict inside a dict...)\n",
"3. Append them together in the right order and save them in a csv"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### STEP 1:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"We have subsidiaries:\n",
" 1. Your Company, Inc.\n",
" 2. Your Company Public Sector, Inc.\n",
" 3. Your Company Benelux\n",
" 4. Your Company France\n",
" 5. Your Company Deutschland GmbH\n",
"--------------------------------------------------\n",
"Below is a glance of the DataFrame of the first item in the dictionary: \n",
"--------------------------------------------------\n"
]
},
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" Type | \n",
" Num | \n",
" Memo | \n",
" Account (Line): Name (GL-style) | \n",
" Amount | \n",
" Subsidiary | \n",
" Currency: Currency Symbol | \n",
" Amount (Foreign Currency) | \n",
" IC Subsidiary | \n",
"
\n",
" \n",
" \n",
" \n",
" 19 | \n",
" Bill | \n",
" 1B0611D056A043A5B543 | \n",
" T | \n",
" 6000-06 | \n",
" 0.36 | \n",
" Your Company, Inc. | \n",
" USD | \n",
" 0.36 | \n",
" Your Company AU Pty Ltd. | \n",
"
\n",
" \n",
" 20 | \n",
" Bill | \n",
" 1B0611D056A043A5B543 | \n",
" U | \n",
" 6000-06 | \n",
" 13.30 | \n",
" Your Company, Inc. | \n",
" USD | \n",
" 13.30 | \n",
" Your Company AU Pty Ltd. | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Type Num Memo Account (Line): Name (GL-style) Amount \\\n",
"19 Bill 1B0611D056A043A5B543 T 6000-06 0.36 \n",
"20 Bill 1B0611D056A043A5B543 U 6000-06 13.30 \n",
"\n",
" Subsidiary Currency: Currency Symbol Amount (Foreign Currency) \\\n",
"19 Your Company, Inc. USD 0.36 \n",
"20 Your Company, Inc. USD 13.30 \n",
"\n",
" IC Subsidiary \n",
"19 Your Company AU Pty Ltd. \n",
"20 Your Company AU Pty Ltd. "
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# We use value_counts().index to extract subsidiary info\n",
"from_subsidiary = cleaned_df[\"Subsidiary\"].value_counts().index\n",
"\n",
"# create a dictionary and \n",
"from_sub_dic = {}\n",
"for sub in from_subsidiary:\n",
" sub_df = cleaned_df[cleaned_df[\"Subsidiary\"].apply(lambda x: True if x == sub else False)]\n",
" from_sub_dic[sub] = sub_df\n",
"\n",
"# TEST CODE:\n",
"# test the result of the above codes\n",
"print(\"We have subsidiaries:\")\n",
"count = 1\n",
"for sub in from_sub_dic.keys():\n",
" print(\" \"*5 + str(count) + \". \"+ sub)\n",
" count += 1\n",
"print(\"-\"*50)\n",
"print(\"Below is a glance of the DataFrame of the first item in the dictionary: \")\n",
"print(\"-\"*50)\n",
"test_key = list(from_sub_dic.keys())\n",
"from_sub_dic[test_key[0]].head(2)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### STEP 2:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"for from_sub_key in from_sub_dic:\n",
" # this is the dataframe for each \"from sub\"\n",
" from_sub_df = from_sub_dic[from_sub_key]\n",
"\n",
" # create report id dictionary\n",
" report_id_dic = {}\n",
" report_id = from_sub_df['Num'].value_counts().index\n",
" \n",
" # report id loop inside from sub loop\n",
" for each_id in report_id:\n",
" report_id_df = from_sub_df[from_sub_df[\"Num\"]\n",
" .apply(lambda x: True if x == each_id else False)]\n",
" report_id_dic[each_id] = report_id_df\n",
" \n",
" # create another dic inside for \"to sub\" (same steps as divide sub)\n",
" to_sub_dic = {}\n",
" to_subsidiary = report_id_df[\"IC Subsidiary\"].value_counts().index\n",
" \n",
" # to sub loop inside report id loop inside from sub loop\n",
" for to_sub in to_subsidiary:\n",
" to_sub_df = report_id_df[report_id_df[\"IC Subsidiary\"]\n",
" .apply(lambda x: True if x == to_sub else False)]\n",
" lowest_level_df = translate_IC(to_sub_df)\n",
" to_sub_dic[to_sub] = lowest_level_df\n",
"\n",
" # overwrite report id dataframe by to sub dictionary, so we have a dic inside a dic inside a dic\n",
" report_id_dic[each_id] = to_sub_dic\n",
"\n",
" # overwrite from sub dataframe by report id dictionary, so we have a dic inside a dic\n",
" from_sub_dic[from_sub_key] = report_id_dic"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---------\n",
"#### STEP 3: \n",
"We concate these DataFrames together and create csv files in the directory for each sub. \n",
"\n",
"At the same time, we generate a overview for the loops and the result.\n",
"\n",
"Well done!"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"From sub: Your Company, Inc.\n",
" 1B0611D056A043A5B543\n",
" To Sub: Your Company AU Pty Ltd.\n",
" 3DDA0AA9D2E84B16AF24\n",
" To Sub: Your Company Public Sector, Inc.\n",
" 501F4146E3B24FB3809F\n",
" To Sub: Your Company Public Sector, Inc.\n",
"Advanced Intercompany from US to AU PS.csv\n",
"----------------------------------------\n",
"From sub: Your Company Public Sector, Inc.\n",
" BEDC9C0992EE4A69AA93\n",
" To Sub: Your Company Sweden\n",
" To Sub: Your Company, Inc.\n",
" AF736C0BD1DF47C0B38B\n",
" To Sub: Your Company, Inc.\n",
"Advanced Intercompany from PS to SW US.csv\n",
"----------------------------------------\n",
"From sub: Your Company Benelux\n",
" 42DBABDCCB9A4EBC9D96\n",
" To Sub: Your Company, Inc.\n",
"Advanced Intercompany from NL to US.csv\n",
"----------------------------------------\n",
"From sub: Your Company France\n",
" 6FEC2F91CAC64CC6B953\n",
" To Sub: Your Company Benelux\n",
" C8DA4AC02C67400C96DC\n",
" To Sub: Your Company Benelux\n",
"Advanced Intercompany from FR to NL.csv\n",
"----------------------------------------\n",
"From sub: Your Company Deutschland GmbH\n",
" 60D9CEFA361640CDBE01\n",
" To Sub: Your Company Switzerland\n",
"Advanced Intercompany from DE to SZ.csv\n",
"----------------------------------------\n"
]
}
],
"source": [
"for i in from_sub_dic:\n",
" print('From sub: ' + i)\n",
" # create df for each from sub\n",
" sub_df = pd.DataFrame(columns = csv_columns)\n",
" to_sub_str = \"\"\n",
" to_sub_list = []\n",
" \n",
" for j in from_sub_dic[i]:\n",
" print(' '*5 + j)\n",
" \n",
" for k in from_sub_dic[i][j]:\n",
" if k not in to_sub_list:\n",
" to_sub_list.append(k) \n",
" to_sub_str += \" \" + sub_mapping_df.loc[k,'Short Name']\n",
" \n",
" print(' '*10 + 'To Sub: ' + k)\n",
" # append df\n",
" append_this_one = from_sub_dic[i][j][k]\n",
" sub_df = sub_df.append(append_this_one)\n",
"# print('test'+i,j,k)\n",
"\n",
" # save csv for every sub\n",
" \n",
" sub_df.to_csv('Advanced Intercompany from ' + sub_mapping_df.loc[i,'Short Name'] +' to' + to_sub_str + '.csv' ,index = False)\n",
" print('Advanced Intercompany from ' + sub_mapping_df.loc[i,'Short Name'] +' to' + to_sub_str + '.csv')\n",
" print('-'*40)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.6"
}
},
"nbformat": 4,
"nbformat_minor": 2
}