{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"_cell_guid": "b1076dfc-b9ad-4769-8c92-a6c4dae69d19",
"_uuid": "8f2839f25d086af736a60e9eeb907d3b93b6e0e5",
"execution": {
"iopub.execute_input": "2022-12-08T13:51:01.372378Z",
"iopub.status.busy": "2022-12-08T13:51:01.371165Z",
"iopub.status.idle": "2022-12-08T13:51:01.40242Z",
"shell.execute_reply": "2022-12-08T13:51:01.399538Z",
"shell.execute_reply.started": "2022-12-08T13:51:01.372248Z"
},
"id": "XuycFYxcK5ue"
},
"source": [
"##
Predicting Listing Gains in the Indian IPO Market Using TensorFlow\n",
"\n",
"I develop a deep learning classification model to predict listing gains for Initial Public Offerings (IPO) in the Indian market. This model can be useful to make investment decisions in the IPO market."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Preliminary data exploration\n",
"The dataset is available under the file name `data.csv` (https://github.com/magorshunov/predicting_ipo_gains/blob/main/data.csv). Listing gains are the percentage increase in the share price of a company from its IPO issue price on the day of listing.\n",
"The data consists of following columns:\n",
"- `Date`: date when the IPO was listed\n",
"- `IPOName`: name of the IPO\n",
"- `Issue_Size`: size of the IPO issue, in INR Crores\n",
"- `Subscription_QIB`: number of times the IPO was subscribed by the QIB (Qualified Institutional Buyer) investor category\n",
"- `Subscription_HNI`: number of times the IPO was subscribed by the HNI (High Networth Individual) investor category\n",
"- `Subscription_RII`: number of times the IPO was subscribed by the RII (Retail Individual Investors) investor category\n",
"- `Subscription_Total`: total number of times the IPO was subscribed overall\n",
"- `Issue_Price`: the price in INR at which the IPO was issued\n",
"- `Listing_Gains_Percent`: is the percentage gain in the listing price over the issue price"
]
},
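{
"cell_type": "markdown",
"metadata": {},
"source": [
"The listing-gain definition above can be illustrated with a small worked example. This is a minimal sketch, not code from the dataset's source: the function name `listing_gain_percent` and the prices are hypothetical, and I assume the listing price is a single listing-day price.\n",
"\n",
"```python\n",
"def listing_gain_percent(issue_price, listing_price):\n",
"    # Percentage change of the listing-day price relative to the issue price\n",
"    if issue_price <= 0:\n",
"        raise ValueError('issue_price must be positive')\n",
"    return (listing_price - issue_price) / issue_price * 100.0\n",
"\n",
"# Hypothetical prices, not taken from data.csv\n",
"print(listing_gain_percent(100.0, 125.0))\n",
"print(listing_gain_percent(200.0, 150.0))\n",
"```\n",
"\n",
"A positive result corresponds to an IPO that listed at a profit, and a negative result to one that listed at a loss."
]
},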
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"execution": {
"iopub.execute_input": "2022-12-27T18:32:02.368665Z",
"iopub.status.busy": "2022-12-27T18:32:02.368168Z",
"iopub.status.idle": "2022-12-27T18:32:09.766150Z",
"shell.execute_reply": "2022-12-27T18:32:09.764954Z",
"shell.execute_reply.started": "2022-12-27T18:32:02.368525Z"
},
"id": "f022OaJIK5ul"
},
"outputs": [],
"source": [
"import numpy as np \n",
"import pandas as pd \n",
"import seaborn as sns\n",
"import matplotlib.pyplot as plt\n",
"import tensorflow as tf\n",
"from tensorflow import keras\n",
"from tensorflow.keras import layers\n",
"from sklearn.model_selection import train_test_split\n",
"from sklearn.metrics import mean_squared_error\n",
"from math import sqrt"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 224
},
"execution": {
"iopub.execute_input": "2022-12-27T18:32:09.769839Z",
"iopub.status.busy": "2022-12-27T18:32:09.768871Z",
"iopub.status.idle": "2022-12-27T18:32:09.824190Z",
"shell.execute_reply": "2022-12-27T18:32:09.823214Z",
"shell.execute_reply.started": "2022-12-27T18:32:09.769801Z"
},
"id": "ZAw9a566K5um",
"outputId": "cfab95d0-5609-4362-e535-d6a1615f066b"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(319, 9)\n"
]
},
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" \n",
" Date \n",
" IPOName \n",
" Issue_Size \n",
" Subscription_QIB \n",
" Subscription_HNI \n",
" Subscription_RII \n",
" Subscription_Total \n",
" Issue_Price \n",
" Listing_Gains_Percent \n",
" \n",
" \n",
" \n",
" \n",
" 0 \n",
" 03/02/10 \n",
" Infinite Comp \n",
" 189.80 \n",
" 48.44 \n",
" 106.02 \n",
" 11.08 \n",
" 43.22 \n",
" 165 \n",
" 11.82 \n",
" \n",
" \n",
" 1 \n",
" 08/02/10 \n",
" Jubilant Food \n",
" 328.70 \n",
" 59.39 \n",
" 51.95 \n",
" 3.79 \n",
" 31.11 \n",
" 145 \n",
" -84.21 \n",
" \n",
" \n",
" 2 \n",
" 15/02/10 \n",
" Syncom Health \n",
" 56.25 \n",
" 0.99 \n",
" 16.60 \n",
" 6.25 \n",
" 5.17 \n",
" 75 \n",
" 17.13 \n",
" \n",
" \n",
" 3 \n",
" 15/02/10 \n",
" Vascon Engineer \n",
" 199.80 \n",
" 1.12 \n",
" 3.65 \n",
" 0.62 \n",
" 1.22 \n",
" 165 \n",
" -11.28 \n",
" \n",
" \n",
" 4 \n",
" 19/02/10 \n",
" Thangamayil \n",
" 0.00 \n",
" 0.52 \n",
" 1.52 \n",
" 2.26 \n",
" 1.12 \n",
" 75 \n",
" -5.20 \n",
" \n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Date IPOName Issue_Size Subscription_QIB Subscription_HNI \n",
"0 03/02/10 Infinite Comp 189.80 48.44 106.02 \\\n",
"1 08/02/10 Jubilant Food 328.70 59.39 51.95 \n",
"2 15/02/10 Syncom Health 56.25 0.99 16.60 \n",
"3 15/02/10 Vascon Engineer 199.80 1.12 3.65 \n",
"4 19/02/10 Thangamayil 0.00 0.52 1.52 \n",
"\n",
" Subscription_RII Subscription_Total Issue_Price Listing_Gains_Percent \n",
"0 11.08 43.22 165 11.82 \n",
"1 3.79 31.11 145 -84.21 \n",
"2 6.25 5.17 75 17.13 \n",
"3 0.62 1.22 165 -11.28 \n",
"4 2.26 1.12 75 -5.20 "
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = pd.read_csv('data.csv')\n",
"print(df.shape)\n",
"df.head()"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 394
},
"execution": {
"iopub.execute_input": "2022-12-27T18:32:09.835454Z",
"iopub.status.busy": "2022-12-27T18:32:09.834527Z",
"iopub.status.idle": "2022-12-27T18:32:09.888708Z",
"shell.execute_reply": "2022-12-27T18:32:09.887449Z",
"shell.execute_reply.started": "2022-12-27T18:32:09.835412Z"
},
"id": "d8VNWtLDK5un",
"outputId": "31bede43-e79f-4bad-ab8c-223a91296f8b"
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" \n",
" Date \n",
" IPOName \n",
" Issue_Size \n",
" Subscription_QIB \n",
" Subscription_HNI \n",
" Subscription_RII \n",
" Subscription_Total \n",
" Issue_Price \n",
" Listing_Gains_Percent \n",
" \n",
" \n",
" \n",
" \n",
" count \n",
" 319 \n",
" 319 \n",
" 319.000000 \n",
" 319.000000 \n",
" 319.000000 \n",
" 319.000000 \n",
" 319.000000 \n",
" 319.000000 \n",
" 319.000000 \n",
" \n",
" \n",
" unique \n",
" 287 \n",
" 319 \n",
" NaN \n",
" NaN \n",
" NaN \n",
" NaN \n",
" NaN \n",
" NaN \n",
" NaN \n",
" \n",
" \n",
" top \n",
" 16/08/21 \n",
" Infinite Comp \n",
" NaN \n",
" NaN \n",
" NaN \n",
" NaN \n",
" NaN \n",
" NaN \n",
" NaN \n",
" \n",
" \n",
" freq \n",
" 4 \n",
" 1 \n",
" NaN \n",
" NaN \n",
" NaN \n",
" NaN \n",
" NaN \n",
" NaN \n",
" NaN \n",
" \n",
" \n",
" mean \n",
" NaN \n",
" NaN \n",
" 1192.859969 \n",
" 25.684138 \n",
" 70.091379 \n",
" 8.561599 \n",
" 27.447147 \n",
" 375.128527 \n",
" 4.742696 \n",
" \n",
" \n",
" std \n",
" NaN \n",
" NaN \n",
" 2384.643786 \n",
" 40.716782 \n",
" 142.454416 \n",
" 14.508670 \n",
" 48.772203 \n",
" 353.897614 \n",
" 47.650946 \n",
" \n",
" \n",
" min \n",
" NaN \n",
" NaN \n",
" 0.000000 \n",
" 0.000000 \n",
" 0.000000 \n",
" 0.000000 \n",
" 0.000000 \n",
" 0.000000 \n",
" -97.150000 \n",
" \n",
" \n",
" 25% \n",
" NaN \n",
" NaN \n",
" 169.005000 \n",
" 1.150000 \n",
" 1.255000 \n",
" 1.275000 \n",
" 1.645000 \n",
" 119.000000 \n",
" -11.555000 \n",
" \n",
" \n",
" 50% \n",
" NaN \n",
" NaN \n",
" 496.250000 \n",
" 4.940000 \n",
" 5.070000 \n",
" 3.420000 \n",
" 4.930000 \n",
" 250.000000 \n",
" 1.810000 \n",
" \n",
" \n",
" 75% \n",
" NaN \n",
" NaN \n",
" 1100.000000 \n",
" 34.635000 \n",
" 62.095000 \n",
" 8.605000 \n",
" 33.395000 \n",
" 536.000000 \n",
" 25.310000 \n",
" \n",
" \n",
" max \n",
" NaN \n",
" NaN \n",
" 21000.000000 \n",
" 215.450000 \n",
" 958.070000 \n",
" 119.440000 \n",
" 326.490000 \n",
" 2150.000000 \n",
" 270.400000 \n",
" \n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Date IPOName Issue_Size Subscription_QIB \n",
"count 319 319 319.000000 319.000000 \\\n",
"unique 287 319 NaN NaN \n",
"top 16/08/21 Infinite Comp NaN NaN \n",
"freq 4 1 NaN NaN \n",
"mean NaN NaN 1192.859969 25.684138 \n",
"std NaN NaN 2384.643786 40.716782 \n",
"min NaN NaN 0.000000 0.000000 \n",
"25% NaN NaN 169.005000 1.150000 \n",
"50% NaN NaN 496.250000 4.940000 \n",
"75% NaN NaN 1100.000000 34.635000 \n",
"max NaN NaN 21000.000000 215.450000 \n",
"\n",
" Subscription_HNI Subscription_RII Subscription_Total Issue_Price \n",
"count 319.000000 319.000000 319.000000 319.000000 \\\n",
"unique NaN NaN NaN NaN \n",
"top NaN NaN NaN NaN \n",
"freq NaN NaN NaN NaN \n",
"mean 70.091379 8.561599 27.447147 375.128527 \n",
"std 142.454416 14.508670 48.772203 353.897614 \n",
"min 0.000000 0.000000 0.000000 0.000000 \n",
"25% 1.255000 1.275000 1.645000 119.000000 \n",
"50% 5.070000 3.420000 4.930000 250.000000 \n",
"75% 62.095000 8.605000 33.395000 536.000000 \n",
"max 958.070000 119.440000 326.490000 2150.000000 \n",
"\n",
" Listing_Gains_Percent \n",
"count 319.000000 \n",
"unique NaN \n",
"top NaN \n",
"freq NaN \n",
"mean 4.742696 \n",
"std 47.650946 \n",
"min -97.150000 \n",
"25% -11.555000 \n",
"50% 1.810000 \n",
"75% 25.310000 \n",
"max 270.400000 "
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.describe(include='all')"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"execution": {
"iopub.execute_input": "2022-12-27T18:32:09.893105Z",
"iopub.status.busy": "2022-12-27T18:32:09.892679Z",
"iopub.status.idle": "2022-12-27T18:32:09.903968Z",
"shell.execute_reply": "2022-12-27T18:32:09.902838Z",
"shell.execute_reply.started": "2022-12-27T18:32:09.893072Z"
},
"id": "quX0LdyMK5uo",
"outputId": "1262653e-07aa-40cf-c0ee-bce542137be3"
},
"outputs": [
{
"data": {
"text/plain": [
"count 319.000000\n",
"mean 4.742696\n",
"std 47.650946\n",
"min -97.150000\n",
"25% -11.555000\n",
"50% 1.810000\n",
"75% 25.310000\n",
"max 270.400000\n",
"Name: Listing_Gains_Percent, dtype: float64"
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df['Listing_Gains_Percent'].describe()"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Date 0\n",
"IPOName 0\n",
"Issue_Size 0\n",
"Subscription_QIB 0\n",
"Subscription_HNI 0\n",
"Subscription_RII 0\n",
"Subscription_Total 0\n",
"Issue_Price 0\n",
"Listing_Gains_Percent 0\n",
"dtype: int64"
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.isnull().sum()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "KwApVwu4K5uo"
},
"source": [
"## Exploring the Data\n",
"The `Listing_Gains_Percent` target variable is continous. Therfore, I will need to convert it into a categorical variable before I proceed. Approximately 55% of the IPOs listed in profit, and the data is fairly balanced. I have also dropped some of the variables that might not have predictive power. "
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {
"execution": {
"iopub.execute_input": "2022-12-27T18:32:09.922352Z",
"iopub.status.busy": "2022-12-27T18:32:09.921602Z",
"iopub.status.idle": "2022-12-27T18:32:09.930727Z",
"shell.execute_reply": "2022-12-27T18:32:09.929560Z",
"shell.execute_reply.started": "2022-12-27T18:32:09.922304Z"
},
"id": "QqkYBOUNK5up"
},
"outputs": [],
"source": [
"df['Listing_Gains_Profit'] = df['Listing_Gains_Percent'].apply(lambda x: 1 if x > 0 else 0)"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"execution": {
"iopub.execute_input": "2022-12-27T18:32:09.932506Z",
"iopub.status.busy": "2022-12-27T18:32:09.932169Z",
"iopub.status.idle": "2022-12-27T18:32:09.946096Z",
"shell.execute_reply": "2022-12-27T18:32:09.944849Z",
"shell.execute_reply.started": "2022-12-27T18:32:09.932469Z"
},
"id": "FyUnzaeYK5up",
"outputId": "46fab12a-dbfe-4130-b484-60062cc9d161"
},
"outputs": [
{
"data": {
"text/plain": [
"count 319.000000\n",
"mean 0.545455\n",
"std 0.498712\n",
"min 0.000000\n",
"25% 0.000000\n",
"50% 1.000000\n",
"75% 1.000000\n",
"max 1.000000\n",
"Name: Listing_Gains_Profit, dtype: float64"
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df['Listing_Gains_Profit'].describe()"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"execution": {
"iopub.execute_input": "2022-12-27T18:32:09.948100Z",
"iopub.status.busy": "2022-12-27T18:32:09.947749Z",
"iopub.status.idle": "2022-12-27T18:32:09.957636Z",
"shell.execute_reply": "2022-12-27T18:32:09.956803Z",
"shell.execute_reply.started": "2022-12-27T18:32:09.948070Z"
},
"id": "tusoDA3PK5up",
"outputId": "c50fed56-638e-42b4-f28e-700dfa1caaae"
},
"outputs": [
{
"data": {
"text/plain": [
"Listing_Gains_Profit\n",
"1 0.545455\n",
"0 0.454545\n",
"Name: proportion, dtype: float64"
]
},
"execution_count": 39,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df['Listing_Gains_Profit'].value_counts(normalize=True)"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"execution": {
"iopub.execute_input": "2022-12-27T18:32:09.959345Z",
"iopub.status.busy": "2022-12-27T18:32:09.958997Z",
"iopub.status.idle": "2022-12-27T18:32:09.982144Z",
"shell.execute_reply": "2022-12-27T18:32:09.980694Z",
"shell.execute_reply.started": "2022-12-27T18:32:09.959314Z"
},
"id": "M8LYT3OdK5uq",
"outputId": "3ac1f945-2e95-4a17-f5e9-d24b4d6d6a5e"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"RangeIndex: 319 entries, 0 to 318\n",
"Data columns (total 7 columns):\n",
" # Column Non-Null Count Dtype \n",
"--- ------ -------------- ----- \n",
" 0 Issue_Size 319 non-null float64\n",
" 1 Subscription_QIB 319 non-null float64\n",
" 2 Subscription_HNI 319 non-null float64\n",
" 3 Subscription_RII 319 non-null float64\n",
" 4 Subscription_Total 319 non-null float64\n",
" 5 Issue_Price 319 non-null int64 \n",
" 6 Listing_Gains_Profit 319 non-null int64 \n",
"dtypes: float64(5), int64(2)\n",
"memory usage: 17.6 KB\n"
]
}
],
"source": [
"df.drop(['Date ', 'IPOName', 'Listing_Gains_Percent'], axis=1, inplace=True)\n",
"df.info()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "Rqly_22nK5uq"
},
"source": [
"## Data Visualization \n",
"I will check for the distribution of predictors with respect to the target variable, since they could be informative for modeling. To do that, I will: \n",
"- Created a countplot to visualize the distribution of the target variable, and give the plot a proper title.\n",
"- Used plots to check for the presence of outliers in each of the continuous variables of the dataset.\n",
"- Used visualizations to check the relationship between your selected predictor variables and the target variable. Check if segmenting the plots with the distribution of the outcome classes provides any meaningful insight.\n",
"- Used visualizations to check if there are correlations between predictor variables.\n",
"\n",
"Here are some of the findings:\n",
"\n",
"1. The histogram and the boxplots show that outliers are present in the data and might need outlier treatment. \n",
"\n",
"2. The boxplot of `Issue_Price`, with respect to `Listing_Gains_Profit`, shows that there are more outliers for IPOs that listed a loss than there are outliers for IPOs that listed a profit. \n",
"\n",
"3. The scatterplot shows a correlation between Retail and Total IPO Subscription via a scatterplot."
]
},
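{
"cell_type": "markdown",
"metadata": {},
"source": [
"The correlation check described above can also be done numerically across all pairs of predictors. The following is a minimal sketch under stated assumptions: the data is synthetic (not `data.csv`), the 0.8 threshold is a hypothetical cut-off, and `DataFrame.corr` computes Pearson correlation by default.\n",
"\n",
"```python\n",
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"# Synthetic stand-in for some predictor columns (values are made up)\n",
"rng = np.random.default_rng(0)\n",
"rii = rng.uniform(0, 20, size=200)\n",
"demo = pd.DataFrame({\n",
"    'Subscription_RII': rii,\n",
"    'Subscription_Total': 3 * rii + rng.normal(0, 5, size=200),\n",
"    'Issue_Price': rng.uniform(50, 500, size=200),\n",
"})\n",
"\n",
"# Pairwise Pearson correlations between the columns\n",
"corr = demo.corr()\n",
"print(corr.round(2))\n",
"\n",
"# Collect column pairs whose absolute correlation exceeds the threshold\n",
"threshold = 0.8  # hypothetical cut-off\n",
"high = [(a, b) for i, a in enumerate(corr.columns)\n",
"        for b in corr.columns[i + 1:]\n",
"        if abs(corr.loc[a, b]) > threshold]\n",
"print(high)\n",
"```\n",
"\n",
"Strongly correlated predictors carry overlapping information, which is worth keeping in mind when choosing model inputs."
]
},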
{
"cell_type": "code",
"execution_count": 41,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 295
},
"execution": {
"iopub.execute_input": "2022-12-27T18:32:09.984476Z",
"iopub.status.busy": "2022-12-27T18:32:09.983979Z",
"iopub.status.idle": "2022-12-27T18:32:10.215080Z",
"shell.execute_reply": "2022-12-27T18:32:10.214284Z",
"shell.execute_reply.started": "2022-12-27T18:32:09.984433Z"
},
"id": "MGFC-ZSNK5ur",
"outputId": "3811b1fe-a766-4131-ae9c-2540fd0187ce"
},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# visualizing the target variable\n",
"sns.countplot(x='Listing_Gains_Profit', data=df)\n",
"plt.title('Distribution of IPO Listing Profit Category')\n",
"plt.xlabel('Listing Profit (No=0, Yes=1)')\n",
"plt.ylabel('Frequency')\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 351
},
"execution": {
"iopub.execute_input": "2022-12-27T18:32:10.216760Z",
"iopub.status.busy": "2022-12-27T18:32:10.216291Z",
"iopub.status.idle": "2022-12-27T18:32:10.518091Z",
"shell.execute_reply": "2022-12-27T18:32:10.516968Z",
"shell.execute_reply.started": "2022-12-27T18:32:10.216731Z"
},
"id": "u8zD7oyCK5ur",
"outputId": "e5a8a747-b65c-4021-bd87-bc00276d06d5"
},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"plt.figure(figsize=[8,5])\n",
"sns.histplot(data=df, x='Issue_Price', bins=50).set(title='Distribution of Issue_Price', ylabel='Count')\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 351
},
"execution": {
"iopub.execute_input": "2022-12-27T18:32:10.520447Z",
"iopub.status.busy": "2022-12-27T18:32:10.520140Z",
"iopub.status.idle": "2022-12-27T18:32:10.822114Z",
"shell.execute_reply": "2022-12-27T18:32:10.820900Z",
"shell.execute_reply.started": "2022-12-27T18:32:10.520419Z"
},
"id": "62lZ4aQTK5ur",
"outputId": "2099c778-fce5-4837-8819-9749ddf89d36"
},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"plt.figure(figsize=[8,5])\n",
"sns.histplot(data=df, x='Issue_Size', bins=50).set(title='Distribution of Issue_Size', ylabel='Count')\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 268
},
"execution": {
"iopub.execute_input": "2022-12-27T18:32:10.828097Z",
"iopub.status.busy": "2022-12-27T18:32:10.827747Z",
"iopub.status.idle": "2022-12-27T18:32:10.998452Z",
"shell.execute_reply": "2022-12-27T18:32:10.997110Z",
"shell.execute_reply.started": "2022-12-27T18:32:10.828067Z"
},
"id": "WrkZRIztK5ur",
"outputId": "5e348e9e-9a2f-470d-b685-cb152052f7bf"
},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"sns.boxplot(data=df, y='Issue_Size')\n",
"plt.title('Boxplot of Issue_Size')\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 295
},
"execution": {
"iopub.execute_input": "2022-12-27T18:32:11.000271Z",
"iopub.status.busy": "2022-12-27T18:32:10.999879Z",
"iopub.status.idle": "2022-12-27T18:32:11.198774Z",
"shell.execute_reply": "2022-12-27T18:32:11.197900Z",
"shell.execute_reply.started": "2022-12-27T18:32:11.000237Z"
},
"id": "UEK5vEBYK5us",
"outputId": "e7938243-822e-4beb-fd1d-ca27ee676b74"
},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"sns.boxplot(data=df, x='Listing_Gains_Profit', y='Issue_Price')\n",
"plt.title('Boxplot of Issue_Price with respect to Listing Gains Type')\n",
"plt.xlabel('Listing Profit (No=0, Yes=1)')\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"execution": {
"iopub.execute_input": "2022-12-27T18:32:11.200947Z",
"iopub.status.busy": "2022-12-27T18:32:11.200319Z",
"iopub.status.idle": "2022-12-27T18:32:11.207924Z",
"shell.execute_reply": "2022-12-27T18:32:11.207051Z",
"shell.execute_reply.started": "2022-12-27T18:32:11.200912Z"
},
"id": "5Tas_DpEK5us",
"outputId": "ab488bb0-5fad-452c-8ea0-84e480a780f1"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Issue_Size 4.853402\n",
"Subscription_QIB 2.143705\n",
"Subscription_HNI 3.078445\n",
"Subscription_RII 3.708274\n",
"Subscription_Total 2.911907\n",
"Issue_Price 1.696881\n",
"Listing_Gains_Profit -0.183438\n",
"dtype: float64\n"
]
}
],
"source": [
"print(df.skew())"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 268
},
"execution": {
"iopub.execute_input": "2022-12-27T18:32:11.209644Z",
"iopub.status.busy": "2022-12-27T18:32:11.209282Z",
"iopub.status.idle": "2022-12-27T18:32:11.386354Z",
"shell.execute_reply": "2022-12-27T18:32:11.385216Z",
"shell.execute_reply.started": "2022-12-27T18:32:11.209599Z"
},
"id": "9OclMZRSK5us",
"outputId": "baa862bb-349c-40f2-d9dc-b11c948f6bd9"
},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"sns.boxplot(data=df, y='Subscription_QIB')\n",
"plt.title('Boxplot of Subscription_QIB')\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 296
},
"execution": {
"iopub.execute_input": "2022-12-27T18:32:11.388794Z",
"iopub.status.busy": "2022-12-27T18:32:11.388041Z",
"iopub.status.idle": "2022-12-27T18:32:11.623059Z",
"shell.execute_reply": "2022-12-27T18:32:11.621912Z",
"shell.execute_reply.started": "2022-12-27T18:32:11.388733Z"
},
"id": "xFoR7F-lK5us",
"outputId": "9bc27b47-0862-4735-e301-21330ef1b8e1"
},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"sns.scatterplot(data=df, x='Subscription_RII', y='Subscription_Total')\n",
"plt.title('Scatterplot between Retail and Total IPO Subscription')\n",
"plt.show()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "T-PN0gKiK5ut"
},
"source": [
"## Outlier Treatment\n",
"Apart from performing a visual inspection, outliers can also be identified with the skewness or the interquartile range (IQR) value. There are different approaches to outlier treatment, but the one I've used here is outlier identification using the interquartile menthod. Once I identified the outliers, I clipped the variable values between the upper and lower bounds."
]
},
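{
"cell_type": "markdown",
"metadata": {},
"source": [
"The IQR rule described above can be wrapped in a small reusable helper. This is a sketch rather than the code used in this notebook: the function name `clip_iqr`, its parameter `k`, and the sample values are hypothetical.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"def clip_iqr(series, k=1.5):\n",
"    # Clip values to the interval [Q1 - k*IQR, Q3 + k*IQR]\n",
"    q1 = series.quantile(0.25)\n",
"    q3 = series.quantile(0.75)\n",
"    iqr = q3 - q1\n",
"    return series.clip(lower=q1 - k * iqr, upper=q3 + k * iqr)\n",
"\n",
"# Made-up values with one obvious outlier\n",
"demo = pd.Series([1.0, 2.0, 3.0, 4.0, 100.0])\n",
"print(clip_iqr(demo).tolist())\n",
"```\n",
"\n",
"Clipping keeps every row and only caps extreme values, which matters for a dataset of 319 rows. A helper like this could be applied in a loop over the columns treated in this section instead of repeating the same steps per column."
]
},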
{
"cell_type": "code",
"execution_count": 49,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"execution": {
"iopub.execute_input": "2022-12-27T18:32:11.797659Z",
"iopub.status.busy": "2022-12-27T18:32:11.797316Z",
"iopub.status.idle": "2022-12-27T18:32:11.808432Z",
"shell.execute_reply": "2022-12-27T18:32:11.807238Z",
"shell.execute_reply.started": "2022-12-27T18:32:11.797626Z"
},
"id": "VXcqCGqCK5ut",
"outputId": "3347b3ee-6a34-4f2d-c8f4-39af7d394616"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"IQR = 930.995\n",
"lower = -1227.4875000000002\n",
"upper = 2496.4925000000003\n"
]
}
],
"source": [
"q1 = df['Issue_Size'].quantile(q=0.25)\n",
"q3 = df['Issue_Size'].quantile(q=0.75) \n",
"iqr = q3 - q1 \n",
"lower = (q1 - 1.5 * iqr) \n",
"upper = (q3 + 1.5 * iqr) \n",
"print('IQR = ', iqr, '\\nlower = ', lower, '\\nupper = ', upper, sep='')"
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"execution": {
"iopub.execute_input": "2022-12-27T18:32:11.809845Z",
"iopub.status.busy": "2022-12-27T18:32:11.809529Z",
"iopub.status.idle": "2022-12-27T18:32:11.829979Z",
"shell.execute_reply": "2022-12-27T18:32:11.828506Z",
"shell.execute_reply.started": "2022-12-27T18:32:11.809817Z"
},
"id": "MyxUXuWCK5ut",
"outputId": "ec2114dc-1139-4fc4-a173-7b75c18ea6ec"
},
"outputs": [
{
"data": {
"text/plain": [
"count 319.000000\n",
"mean 763.561238\n",
"std 769.689122\n",
"min 0.000000\n",
"25% 169.005000\n",
"50% 496.250000\n",
"75% 1100.000000\n",
"max 2496.492500\n",
"Name: Issue_Size, dtype: float64"
]
},
"execution_count": 50,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df['Issue_Size'] = df['Issue_Size'].clip(lower, upper)\n",
"df['Issue_Size'].describe()"
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"execution": {
"iopub.execute_input": "2022-12-27T18:32:11.831412Z",
"iopub.status.busy": "2022-12-27T18:32:11.831071Z",
"iopub.status.idle": "2022-12-27T18:32:11.842221Z",
"shell.execute_reply": "2022-12-27T18:32:11.840980Z",
"shell.execute_reply.started": "2022-12-27T18:32:11.831377Z"
},
"id": "2no3iuXXK5ut",
"outputId": "d7c63024-5830-4c17-b6f8-1eb659dcc223"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"IQR = 33.48500000000001\n",
"lower = -49.07750000000001\n",
"upper = 84.86250000000001\n"
]
}
],
"source": [
"q1 = df['Subscription_QIB'].quantile(q=0.25)\n",
"q3 = df['Subscription_QIB'].quantile(q=0.75) \n",
"iqr = q3 - q1 \n",
"lower = (q1 - 1.5 * iqr) \n",
"upper = (q3 + 1.5 * iqr) \n",
"print('IQR = ', iqr, '\\nlower = ', lower, '\\nupper = ', upper, sep='')"
]
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"execution": {
"iopub.execute_input": "2022-12-27T18:32:11.844442Z",
"iopub.status.busy": "2022-12-27T18:32:11.843981Z",
"iopub.status.idle": "2022-12-27T18:32:11.860042Z",
"shell.execute_reply": "2022-12-27T18:32:11.858761Z",
"shell.execute_reply.started": "2022-12-27T18:32:11.844398Z"
},
"id": "KTRUl5bdK5ut",
"outputId": "a6d98c3b-4113-4244-fd0b-a697ca16e139"
},
"outputs": [
{
"data": {
"text/plain": [
"count 319.000000\n",
"mean 21.521183\n",
"std 29.104549\n",
"min 0.000000\n",
"25% 1.150000\n",
"50% 4.940000\n",
"75% 34.635000\n",
"max 84.862500\n",
"Name: Subscription_QIB, dtype: float64"
]
},
"execution_count": 52,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df['Subscription_QIB'] = df['Subscription_QIB'].clip(lower, upper)\n",
"df['Subscription_QIB'].describe()"
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"execution": {
"iopub.execute_input": "2022-12-27T18:32:11.862921Z",
"iopub.status.busy": "2022-12-27T18:32:11.861528Z",
"iopub.status.idle": "2022-12-27T18:32:11.871911Z",
"shell.execute_reply": "2022-12-27T18:32:11.870729Z",
"shell.execute_reply.started": "2022-12-27T18:32:11.862883Z"
},
"id": "RCU2kxX-K5uu",
"outputId": "b0b6bcd6-de91-42d3-d45d-0912f6f3e198"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"IQR = 60.839999999999996\n",
"lower = -90.005\n",
"upper = 153.355\n"
]
}
],
"source": [
"q1 = df['Subscription_HNI'].quantile(q=0.25)\n",
"q3 = df['Subscription_HNI'].quantile(q=0.75) \n",
"iqr = q3 - q1 \n",
"lower = (q1 - 1.5 * iqr) \n",
"upper = (q3 + 1.5 * iqr) \n",
"print('IQR = ', iqr, '\\nlower = ', lower, '\\nupper = ', upper, sep='')"
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"execution": {
"iopub.execute_input": "2022-12-27T18:32:11.874563Z",
"iopub.status.busy": "2022-12-27T18:32:11.873563Z",
"iopub.status.idle": "2022-12-27T18:32:11.889349Z",
"shell.execute_reply": "2022-12-27T18:32:11.888085Z",
"shell.execute_reply.started": "2022-12-27T18:32:11.874517Z"
},
"id": "7Bi4KyUOK5uu",
"outputId": "189f296a-a9b6-4c07-cdf3-d5c7cd94e147"
},
"outputs": [
{
"data": {
"text/plain": [
"count 319.000000\n",
"mean 40.356426\n",
"std 57.427921\n",
"min 0.000000\n",
"25% 1.255000\n",
"50% 5.070000\n",
"75% 62.095000\n",
"max 153.355000\n",
"Name: Subscription_HNI, dtype: float64"
]
},
"execution_count": 54,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df['Subscription_HNI'] = df['Subscription_HNI'].clip(lower, upper)\n",
"df['Subscription_HNI'].describe()"
]
},
{
"cell_type": "code",
"execution_count": 55,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"execution": {
"iopub.execute_input": "2022-12-27T18:32:11.891309Z",
"iopub.status.busy": "2022-12-27T18:32:11.890841Z",
"iopub.status.idle": "2022-12-27T18:32:11.903508Z",
"shell.execute_reply": "2022-12-27T18:32:11.902206Z",
"shell.execute_reply.started": "2022-12-27T18:32:11.891254Z"
},
"id": "ynAHmlw4K5uu",
"outputId": "32547e3f-effb-4be6-b31f-d94e5ca8fc15"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"IQR = 7.33\n",
"lower = -9.72\n",
"upper = 19.6\n"
]
}
],
"source": [
"q1 = df['Subscription_RII'].quantile(q=0.25)\n",
"q3 = df['Subscription_RII'].quantile(q=0.75) \n",
"iqr = q3 - q1 \n",
"lower = (q1 - 1.5 * iqr) \n",
"upper = (q3 + 1.5 * iqr) \n",
"print('IQR = ', iqr, '\\nlower = ', lower, '\\nupper = ', upper, sep='')"
]
},
{
"cell_type": "code",
"execution_count": 56,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"execution": {
"iopub.execute_input": "2022-12-27T18:32:11.905291Z",
"iopub.status.busy": "2022-12-27T18:32:11.904829Z",
"iopub.status.idle": "2022-12-27T18:32:11.920193Z",
"shell.execute_reply": "2022-12-27T18:32:11.919398Z",
"shell.execute_reply.started": "2022-12-27T18:32:11.905250Z"
},
"id": "IjQZoL6KK5uu",
"outputId": "30c6be06-c313-4ea4-a481-03e0c34b3452"
},
"outputs": [
{
"data": {
"text/plain": [
"count 319.000000\n",
"mean 6.060940\n",
"std 6.176882\n",
"min 0.000000\n",
"25% 1.275000\n",
"50% 3.420000\n",
"75% 8.605000\n",
"max 19.600000\n",
"Name: Subscription_RII, dtype: float64"
]
},
"execution_count": 56,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df['Subscription_RII'] = df['Subscription_RII'].clip(lower, upper)\n",
"df['Subscription_RII'].describe()"
]
},
{
"cell_type": "code",
"execution_count": 57,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"execution": {
"iopub.execute_input": "2022-12-27T18:32:11.921919Z",
"iopub.status.busy": "2022-12-27T18:32:11.921276Z",
"iopub.status.idle": "2022-12-27T18:32:11.935310Z",
"shell.execute_reply": "2022-12-27T18:32:11.934484Z",
"shell.execute_reply.started": "2022-12-27T18:32:11.921885Z"
},
"id": "jDf0sDipK5uu",
"outputId": "a2f3cd7c-2456-4217-df5c-1b4f019af4fc"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"IQR = 31.749999999999996\n",
"lower = -45.97999999999999\n",
"upper = 81.01999999999998\n"
]
}
],
"source": [
"q1 = df['Subscription_Total'].quantile(q=0.25)\n",
"q3 = df['Subscription_Total'].quantile(q=0.75) \n",
"iqr = q3 - q1 \n",
"lower = (q1 - 1.5 * iqr) \n",
"upper = (q3 + 1.5 * iqr) \n",
"print('IQR = ', iqr, '\\nlower = ', lower, '\\nupper = ', upper, sep='')"
]
},
{
"cell_type": "code",
"execution_count": 58,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"execution": {
"iopub.execute_input": "2022-12-27T18:32:11.937247Z",
"iopub.status.busy": "2022-12-27T18:32:11.936404Z",
"iopub.status.idle": "2022-12-27T18:32:11.952110Z",
"shell.execute_reply": "2022-12-27T18:32:11.951342Z",
"shell.execute_reply.started": "2022-12-27T18:32:11.937211Z"
},
"id": "nrK_ia3IK5uu",
"outputId": "329220ea-eb96-4a7e-c206-479a8a735e36"
},
"outputs": [
{
"data": {
"text/plain": [
"count 319.000000\n",
"mean 20.456646\n",
"std 27.217740\n",
"min 0.000000\n",
"25% 1.645000\n",
"50% 4.930000\n",
"75% 33.395000\n",
"max 81.020000\n",
"Name: Subscription_Total, dtype: float64"
]
},
"execution_count": 58,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df['Subscription_Total'] = df['Subscription_Total'].clip(lower, upper)\n",
"df['Subscription_Total'].describe()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "0FdzVKR3K5uv"
},
"source": [
"## Setting the Target and Predictor Variables\n",
"Before moving on to modelling, I will:\n",
"- Create an array of the target variable (dependent variable).\n",
"- Create an array of the predictor variables (independent variables).\n",
"- Perform normalization on the predictor variables to scale their values to between 0 and 1."
]
},
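{
"cell_type": "markdown",
"metadata": {},
"source": [
"One common way to scale predictors to the range 0 to 1 is min-max normalization, `(x - min) / (max - min)`. The sketch below shows the idea on made-up values. The helper name `min_max_scale` is hypothetical, and this is not necessarily the exact scaling used in this notebook.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"def min_max_scale(frame):\n",
"    # Rescale each column to [0, 1] using (x - min) / (max - min)\n",
"    col_min = frame.min()\n",
"    col_range = frame.max() - col_min\n",
"    # Assumption: a constant column maps to 0 instead of dividing by zero\n",
"    col_range = col_range.replace(0, 1)\n",
"    return (frame - col_min) / col_range\n",
"\n",
"# Made-up values, not taken from data.csv\n",
"demo = pd.DataFrame({'Issue_Size': [0.0, 500.0, 2000.0],\n",
"                     'Issue_Price': [75.0, 165.0, 255.0]})\n",
"scaled = min_max_scale(demo)\n",
"print(scaled)\n",
"```\n",
"\n",
"Putting all predictors on the same scale keeps features with large raw values, such as `Issue_Size`, from dominating the gradient updates of a neural network."
]
},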
{
"cell_type": "code",
"execution_count": 59,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 300
},
"execution": {
"iopub.execute_input": "2022-12-27T18:32:11.953327Z",
"iopub.status.busy": "2022-12-27T18:32:11.953000Z",
"iopub.status.idle": "2022-12-27T18:32:11.996296Z",
"shell.execute_reply": "2022-12-27T18:32:11.995087Z",
"shell.execute_reply.started": "2022-12-27T18:32:11.953297Z"
},
"id": "lnCJqqTzK5uv",
"outputId": "d8a74d8b-72c1-4c15-8231-9c793f9733bf"
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>Issue_Size</th><th>Subscription_QIB</th><th>Subscription_HNI</th><th>Subscription_RII</th><th>Subscription_Total</th><th>Issue_Price</th><th>Listing_Gains_Profit</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>count</th><td>319.000000</td><td>319.000000</td><td>319.000000</td><td>319.000000</td><td>319.000000</td><td>319.000000</td><td>319.000000</td></tr>\n",
"    <tr><th>mean</th><td>0.305854</td><td>0.253601</td><td>0.263157</td><td>0.309232</td><td>0.252489</td><td>0.174478</td><td>0.545455</td></tr>\n",
"    <tr><th>std</th><td>0.308308</td><td>0.342961</td><td>0.374477</td><td>0.315147</td><td>0.335939</td><td>0.164604</td><td>0.498712</td></tr>\n",
"    <tr><th>min</th><td>0.000000</td><td>0.000000</td><td>0.000000</td><td>0.000000</td><td>0.000000</td><td>0.000000</td><td>0.000000</td></tr>\n",
"    <tr><th>25%</th><td>0.067697</td><td>0.013551</td><td>0.008184</td><td>0.065051</td><td>0.020304</td><td>0.055349</td><td>0.000000</td></tr>\n",
"    <tr><th>50%</th><td>0.198779</td><td>0.058212</td><td>0.033061</td><td>0.174490</td><td>0.060849</td><td>0.116279</td><td>1.000000</td></tr>\n",
"    <tr><th>75%</th><td>0.440618</td><td>0.408131</td><td>0.404910</td><td>0.439031</td><td>0.412182</td><td>0.249302</td><td>1.000000</td></tr>\n",
"    <tr><th>max</th><td>1.000000</td><td>1.000000</td><td>1.000000</td><td>1.000000</td><td>1.000000</td><td>1.000000</td><td>1.000000</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Issue_Size Subscription_QIB Subscription_HNI Subscription_RII \n",
"count 319.000000 319.000000 319.000000 319.000000 \\\n",
"mean 0.305854 0.253601 0.263157 0.309232 \n",
"std 0.308308 0.342961 0.374477 0.315147 \n",
"min 0.000000 0.000000 0.000000 0.000000 \n",
"25% 0.067697 0.013551 0.008184 0.065051 \n",
"50% 0.198779 0.058212 0.033061 0.174490 \n",
"75% 0.440618 0.408131 0.404910 0.439031 \n",
"max 1.000000 1.000000 1.000000 1.000000 \n",
"\n",
" Subscription_Total Issue_Price Listing_Gains_Profit \n",
"count 319.000000 319.000000 319.000000 \n",
"mean 0.252489 0.174478 0.545455 \n",
"std 0.335939 0.164604 0.498712 \n",
"min 0.000000 0.000000 0.000000 \n",
"25% 0.020304 0.055349 0.000000 \n",
"50% 0.060849 0.116279 1.000000 \n",
"75% 0.412182 0.249302 1.000000 \n",
"max 1.000000 1.000000 1.000000 "
]
},
"execution_count": 59,
"metadata": {},
"output_type": "execute_result"
}
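],
"source": [
"target_variable = ['Listing_Gains_Profit']\n",
"# Keep the DataFrame's column order; a set difference would make the predictor order arbitrary\n",
"predictors = [col for col in df.columns if col not in target_variable]\n",
"# Scale each predictor to [0, 1] by dividing by its column maximum\n",
"df[predictors] = df[predictors] / df[predictors].max()\n",
"df.describe()"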
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "Re0qt2A7K5uv"
},
"source": [
"## Creating the Holdout Validation Approach\n",
"I will use the holdout validation approach to evaluate the model. I split the data in a 70:30 ratio: 70% of the rows are used to train the model, and the remaining 30% are held out to test it."
]
},
{
"cell_type": "code",
"execution_count": 60,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"execution": {
"iopub.execute_input": "2022-12-27T18:32:11.997760Z",
"iopub.status.busy": "2022-12-27T18:32:11.997459Z",
"iopub.status.idle": "2022-12-27T18:32:12.007201Z",
"shell.execute_reply": "2022-12-27T18:32:12.006116Z",
"shell.execute_reply.started": "2022-12-27T18:32:11.997732Z"
},
"id": "79ia8fAdK5uv",
"outputId": "831e6c4e-fd28-4311-f1c1-db69f0d196e9"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(223, 6)\n",
"(96, 6)\n"
]
}
],
"source": [
"X = df[predictors].values\n",
"y = df[target_variable].values\n",
"\n",
"# Hold out 30% of the rows for testing; random_state makes the split reproducible\n",
"X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=100)\n",
"print(X_train.shape)\n",
"print(X_test.shape)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "hbdwpVyOK5uv"
},
"source": [
"## Define the Deep Learning Classification Model\n",
"In this step, I define the model by instantiating the `Sequential` class from TensorFlow's Keras API. The architecture consists of four hidden layers (32, 16, 8, and 4 units), each with `relu` as the activation function. The output layer is a single unit with a `sigmoid` activation, which suits binary classification because it maps the output to a probability between 0 and 1."
]
},
{
"cell_type": "code",
"execution_count": 61,
"metadata": {
"execution": {
"iopub.execute_input": "2022-12-27T18:32:12.009002Z",
"iopub.status.busy": "2022-12-27T18:32:12.008652Z",
"iopub.status.idle": "2022-12-27T18:32:12.162585Z",
"shell.execute_reply": "2022-12-27T18:32:12.161353Z",
"shell.execute_reply.started": "2022-12-27T18:32:12.008972Z"
},
"id": "5fE0bP4rK5uv"
},
"outputs": [],
"source": [
"# Define the model: four ReLU hidden layers followed by a sigmoid output unit\n",
"tf.random.set_seed(100)\n",
"model = tf.keras.Sequential()\n",
"model.add(tf.keras.layers.Dense(32, input_shape=(X_train.shape[1],), activation='relu'))\n",
"model.add(tf.keras.layers.Dense(16, activation='relu'))\n",
"model.add(tf.keras.layers.Dense(8, activation='relu'))\n",
"model.add(tf.keras.layers.Dense(4, activation='relu'))\n",
"model.add(tf.keras.layers.Dense(1, activation='sigmoid'))"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "_8zUg4DyK5uw"
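},
"source": [
"## Compile and Train the Model\n",
"Once the model is defined, the next steps are to compile and train it. Compiling a model requires specifying the following:\n",
"- An optimizer\n",
"- A loss function\n",
"- An evaluation metric\n",
"\n",
"Here I use the Adam optimizer with a learning rate of 0.001, binary cross-entropy as the loss, and accuracy as the metric. After compiling the model, I fit it on the training set. Training accuracy rises from about 0.56 in the first few epochs to roughly 0.73 in the later epochs of the 250-epoch run."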
]
},
{
"cell_type": "code",
"execution_count": 62,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"execution": {
"iopub.execute_input": "2022-12-27T18:35:30.988666Z",
"iopub.status.busy": "2022-12-27T18:35:30.987862Z",
"iopub.status.idle": "2022-12-27T18:35:30.995402Z",
"shell.execute_reply": "2022-12-27T18:35:30.994354Z",
"shell.execute_reply.started": "2022-12-27T18:35:30.988624Z"
},
"id": "IPz2XmGoK5uw",
"outputId": "3c333a63-86a1-43d7-cfc1-b6787f6ee1a0"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Model: \"sequential\"\n",
"_________________________________________________________________\n",
" Layer (type) Output Shape Param # \n",
"=================================================================\n",
" dense (Dense) (None, 32) 224 \n",
" \n",
" dense_1 (Dense) (None, 16) 528 \n",
" \n",
" dense_2 (Dense) (None, 8) 136 \n",
" \n",
" dense_3 (Dense) (None, 4) 36 \n",
" \n",
" dense_4 (Dense) (None, 1) 5 \n",
" \n",
"=================================================================\n",
"Total params: 929\n",
"Trainable params: 929\n",
"Non-trainable params: 0\n",
"_________________________________________________________________\n"
]
}
],
"source": [
"model.compile(optimizer=tf.keras.optimizers.Adam(0.001), loss=tf.keras.losses.BinaryCrossentropy(), metrics=['accuracy'])\n",
"# summary() prints the table itself and returns None, so it needs no print()\n",
"model.summary()"
]
},
{
"cell_type": "code",
"execution_count": 63,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"execution": {
"iopub.execute_input": "2022-12-27T18:47:17.359902Z",
"iopub.status.busy": "2022-12-27T18:47:17.359424Z",
"iopub.status.idle": "2022-12-27T18:47:23.418354Z",
"shell.execute_reply": "2022-12-27T18:47:23.417193Z",
"shell.execute_reply.started": "2022-12-27T18:47:17.359866Z"
},
"id": "2hFSDV0oK5uw",
"outputId": "15f1297b-d73d-4764-d75a-4b873e8a424c"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Epoch 1/250\n",
"7/7 [==============================] - 1s 3ms/step - loss: 0.6899 - accuracy: 0.5874\n",
"Epoch 2/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.6859 - accuracy: 0.5650\n",
"Epoch 3/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.6826 - accuracy: 0.5516\n",
"Epoch 4/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.6771 - accuracy: 0.5561\n",
"Epoch 5/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.6729 - accuracy: 0.5561\n",
"Epoch 6/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.6680 - accuracy: 0.5561\n",
"Epoch 7/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.6629 - accuracy: 0.5561\n",
"Epoch 8/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.6586 - accuracy: 0.5561\n",
"Epoch 9/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.6536 - accuracy: 0.5561\n",
"Epoch 10/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.6493 - accuracy: 0.5561\n",
"Epoch 11/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.6454 - accuracy: 0.5561\n",
"Epoch 12/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.6416 - accuracy: 0.5561\n",
"Epoch 13/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.6383 - accuracy: 0.5516\n",
"Epoch 14/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.6351 - accuracy: 0.5919\n",
"Epoch 15/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.6330 - accuracy: 0.6368\n",
"Epoch 16/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.6303 - accuracy: 0.6502\n",
"Epoch 17/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.6275 - accuracy: 0.6771\n",
"Epoch 18/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.6262 - accuracy: 0.6726\n",
"Epoch 19/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.6249 - accuracy: 0.6592\n",
"Epoch 20/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.6232 - accuracy: 0.6816\n",
"Epoch 21/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.6214 - accuracy: 0.6771\n",
"Epoch 22/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.6206 - accuracy: 0.6682\n",
"Epoch 23/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.6190 - accuracy: 0.6726\n",
"Epoch 24/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.6182 - accuracy: 0.6726\n",
"Epoch 25/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.6167 - accuracy: 0.6682\n",
"Epoch 26/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.6164 - accuracy: 0.6771\n",
"Epoch 27/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.6159 - accuracy: 0.6592\n",
"Epoch 28/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.6138 - accuracy: 0.6637\n",
"Epoch 29/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.6127 - accuracy: 0.6592\n",
"Epoch 30/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.6119 - accuracy: 0.6682\n",
"Epoch 31/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.6103 - accuracy: 0.6726\n",
"Epoch 32/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.6101 - accuracy: 0.6816\n",
"Epoch 33/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.6082 - accuracy: 0.6726\n",
"Epoch 34/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.6069 - accuracy: 0.6771\n",
"Epoch 35/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.6055 - accuracy: 0.6816\n",
"Epoch 36/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.6051 - accuracy: 0.6861\n",
"Epoch 37/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.6036 - accuracy: 0.6861\n",
"Epoch 38/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.6027 - accuracy: 0.6816\n",
"Epoch 39/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.6022 - accuracy: 0.6816\n",
"Epoch 40/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.6014 - accuracy: 0.6816\n",
"Epoch 41/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5998 - accuracy: 0.6861\n",
"Epoch 42/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5992 - accuracy: 0.6816\n",
"Epoch 43/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5980 - accuracy: 0.6861\n",
"Epoch 44/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5977 - accuracy: 0.6861\n",
"Epoch 45/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5966 - accuracy: 0.6906\n",
"Epoch 46/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5967 - accuracy: 0.6861\n",
"Epoch 47/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5951 - accuracy: 0.6816\n",
"Epoch 48/250\n",
"7/7 [==============================] - 0s 6ms/step - loss: 0.5938 - accuracy: 0.6771\n",
"Epoch 49/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5927 - accuracy: 0.6771\n",
"Epoch 50/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5919 - accuracy: 0.6861\n",
"Epoch 51/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5924 - accuracy: 0.6816\n",
"Epoch 52/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5903 - accuracy: 0.6771\n",
"Epoch 53/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5900 - accuracy: 0.6816\n",
"Epoch 54/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5895 - accuracy: 0.6816\n",
"Epoch 55/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5878 - accuracy: 0.6906\n",
"Epoch 56/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5878 - accuracy: 0.6816\n",
"Epoch 57/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5881 - accuracy: 0.6906\n",
"Epoch 58/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5854 - accuracy: 0.6816\n",
"Epoch 59/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5848 - accuracy: 0.6771\n",
"Epoch 60/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5844 - accuracy: 0.6861\n",
"Epoch 61/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5825 - accuracy: 0.6861\n",
"Epoch 62/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5820 - accuracy: 0.6816\n",
"Epoch 63/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5837 - accuracy: 0.6816\n",
"Epoch 64/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5818 - accuracy: 0.6861\n",
"Epoch 65/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5806 - accuracy: 0.6906\n",
"Epoch 66/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5800 - accuracy: 0.6861\n",
"Epoch 67/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5794 - accuracy: 0.6906\n",
"Epoch 68/250\n",
"7/7 [==============================] - 0s 3ms/step - loss: 0.5784 - accuracy: 0.6861\n",
"Epoch 69/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5789 - accuracy: 0.6771\n",
"Epoch 70/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5769 - accuracy: 0.6816\n",
"Epoch 71/250\n",
"7/7 [==============================] - 0s 3ms/step - loss: 0.5761 - accuracy: 0.6906\n",
"Epoch 72/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5760 - accuracy: 0.6906\n",
"Epoch 73/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5773 - accuracy: 0.6906\n",
"Epoch 74/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5762 - accuracy: 0.6726\n",
"Epoch 75/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5742 - accuracy: 0.6951\n",
"Epoch 76/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5727 - accuracy: 0.6951\n",
"Epoch 77/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5726 - accuracy: 0.6861\n",
"Epoch 78/250\n",
"7/7 [==============================] - 0s 3ms/step - loss: 0.5727 - accuracy: 0.6726\n",
"Epoch 79/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5728 - accuracy: 0.6951\n",
"Epoch 80/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5730 - accuracy: 0.6906\n",
"Epoch 81/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5706 - accuracy: 0.6771\n",
"Epoch 82/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5703 - accuracy: 0.6951\n",
"Epoch 83/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5696 - accuracy: 0.6906\n",
"Epoch 84/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5693 - accuracy: 0.6816\n",
"Epoch 85/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5677 - accuracy: 0.6906\n",
"Epoch 86/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5702 - accuracy: 0.6996\n",
"Epoch 87/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5673 - accuracy: 0.7040\n",
"Epoch 88/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5664 - accuracy: 0.6951\n",
"Epoch 89/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5679 - accuracy: 0.6951\n",
"Epoch 90/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5656 - accuracy: 0.6951\n",
"Epoch 91/250\n",
"7/7 [==============================] - 0s 5ms/step - loss: 0.5664 - accuracy: 0.6951\n",
"Epoch 92/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5664 - accuracy: 0.6816\n",
"Epoch 93/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5645 - accuracy: 0.6996\n",
"Epoch 94/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5646 - accuracy: 0.6996\n",
"Epoch 95/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5644 - accuracy: 0.6906\n",
"Epoch 96/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5642 - accuracy: 0.6906\n",
"Epoch 97/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5621 - accuracy: 0.6906\n",
"Epoch 98/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5634 - accuracy: 0.6951\n",
"Epoch 99/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5623 - accuracy: 0.6996\n",
"Epoch 100/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5616 - accuracy: 0.6996\n",
"Epoch 101/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5607 - accuracy: 0.6951\n",
"Epoch 102/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5606 - accuracy: 0.6951\n",
"Epoch 103/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5603 - accuracy: 0.6951\n",
"Epoch 104/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5594 - accuracy: 0.6996\n",
"Epoch 105/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5590 - accuracy: 0.6951\n",
"Epoch 106/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5603 - accuracy: 0.6906\n",
"Epoch 107/250\n",
"7/7 [==============================] - 0s 3ms/step - loss: 0.5582 - accuracy: 0.6951\n",
"Epoch 108/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5595 - accuracy: 0.6951\n",
"Epoch 109/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5593 - accuracy: 0.6906\n",
"Epoch 110/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5569 - accuracy: 0.6996\n",
"Epoch 111/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5602 - accuracy: 0.6996\n",
"Epoch 112/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5588 - accuracy: 0.6906\n",
"Epoch 113/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5573 - accuracy: 0.6951\n",
"Epoch 114/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5553 - accuracy: 0.6996\n",
"Epoch 115/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5563 - accuracy: 0.6906\n",
"Epoch 116/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5556 - accuracy: 0.6951\n",
"Epoch 117/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5558 - accuracy: 0.6996\n",
"Epoch 118/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5546 - accuracy: 0.6951\n",
"Epoch 119/250\n",
"7/7 [==============================] - 0s 7ms/step - loss: 0.5539 - accuracy: 0.6996\n",
"Epoch 120/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5586 - accuracy: 0.6951\n",
"Epoch 121/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5563 - accuracy: 0.6996\n",
"Epoch 122/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5536 - accuracy: 0.6996\n",
"Epoch 123/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5539 - accuracy: 0.6951\n",
"Epoch 124/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5539 - accuracy: 0.6951\n",
"Epoch 125/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5524 - accuracy: 0.6996\n",
"Epoch 126/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5523 - accuracy: 0.7040\n",
"Epoch 127/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5521 - accuracy: 0.6996\n",
"Epoch 128/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5560 - accuracy: 0.6906\n",
"Epoch 129/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5521 - accuracy: 0.6951\n",
"Epoch 130/250\n",
"7/7 [==============================] - 0s 3ms/step - loss: 0.5582 - accuracy: 0.6951\n",
"Epoch 131/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5535 - accuracy: 0.7085\n",
"Epoch 132/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5535 - accuracy: 0.6906\n",
"Epoch 133/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5521 - accuracy: 0.7040\n",
"Epoch 134/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5500 - accuracy: 0.7085\n",
"Epoch 135/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5489 - accuracy: 0.6996\n",
"Epoch 136/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5491 - accuracy: 0.6996\n",
"Epoch 137/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5497 - accuracy: 0.7040\n",
"Epoch 138/250\n",
"7/7 [==============================] - 0s 3ms/step - loss: 0.5479 - accuracy: 0.6951\n",
"Epoch 139/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5522 - accuracy: 0.7040\n",
"Epoch 140/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5480 - accuracy: 0.6951\n",
"Epoch 141/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5484 - accuracy: 0.6906\n",
"Epoch 142/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5467 - accuracy: 0.7040\n",
"Epoch 143/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5458 - accuracy: 0.6996\n",
"Epoch 144/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5461 - accuracy: 0.7085\n",
"Epoch 145/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5451 - accuracy: 0.7040\n",
"Epoch 146/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5463 - accuracy: 0.6951\n",
"Epoch 147/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5465 - accuracy: 0.6996\n",
"Epoch 148/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5471 - accuracy: 0.6951\n",
"Epoch 149/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5441 - accuracy: 0.7040\n",
"Epoch 150/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5447 - accuracy: 0.6996\n",
"Epoch 151/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5437 - accuracy: 0.6951\n",
"Epoch 152/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5430 - accuracy: 0.7040\n",
"Epoch 153/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5430 - accuracy: 0.6996\n",
"Epoch 154/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5433 - accuracy: 0.7040\n",
"Epoch 155/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5457 - accuracy: 0.6951\n",
"Epoch 156/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5419 - accuracy: 0.6996\n",
"Epoch 157/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5410 - accuracy: 0.6996\n",
"Epoch 158/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5424 - accuracy: 0.6996\n",
"Epoch 159/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5405 - accuracy: 0.6996\n",
"Epoch 160/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5420 - accuracy: 0.6996\n",
"Epoch 161/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5403 - accuracy: 0.7040\n",
"Epoch 162/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5420 - accuracy: 0.7040\n",
"Epoch 163/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5400 - accuracy: 0.6996\n",
"Epoch 164/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5402 - accuracy: 0.6996\n",
"Epoch 165/250\n",
"7/7 [==============================] - 0s 7ms/step - loss: 0.5392 - accuracy: 0.7040\n",
"Epoch 166/250\n",
"7/7 [==============================] - 0s 3ms/step - loss: 0.5383 - accuracy: 0.7040\n",
"Epoch 167/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5407 - accuracy: 0.6996\n",
"Epoch 168/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5399 - accuracy: 0.6996\n",
"Epoch 169/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5378 - accuracy: 0.6996\n",
"Epoch 170/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5429 - accuracy: 0.6951\n",
"Epoch 171/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5371 - accuracy: 0.6996\n",
"Epoch 172/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5412 - accuracy: 0.7040\n",
"Epoch 173/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5403 - accuracy: 0.6996\n",
"Epoch 174/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5384 - accuracy: 0.7085\n",
"Epoch 175/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5375 - accuracy: 0.6996\n",
"Epoch 176/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5390 - accuracy: 0.6996\n",
"Epoch 177/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5371 - accuracy: 0.7040\n",
"Epoch 178/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5365 - accuracy: 0.7085\n",
"Epoch 179/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5341 - accuracy: 0.6996\n",
"Epoch 180/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5339 - accuracy: 0.7085\n",
"Epoch 181/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5345 - accuracy: 0.7085\n",
"Epoch 182/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5340 - accuracy: 0.6996\n",
"Epoch 183/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5338 - accuracy: 0.7085\n",
"Epoch 184/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5344 - accuracy: 0.7130\n",
"Epoch 185/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5323 - accuracy: 0.7040\n",
"Epoch 186/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5335 - accuracy: 0.7040\n",
"Epoch 187/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5329 - accuracy: 0.7040\n",
"Epoch 188/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5317 - accuracy: 0.7040\n",
"Epoch 189/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5309 - accuracy: 0.7040\n",
"Epoch 190/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5312 - accuracy: 0.7130\n",
"Epoch 191/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5297 - accuracy: 0.7085\n",
"Epoch 192/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5434 - accuracy: 0.6906\n",
"Epoch 193/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5351 - accuracy: 0.7220\n",
"Epoch 194/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5352 - accuracy: 0.7130\n",
"Epoch 195/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5300 - accuracy: 0.7040\n",
"Epoch 196/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5301 - accuracy: 0.7085\n",
"Epoch 197/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5288 - accuracy: 0.7130\n",
"Epoch 198/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5309 - accuracy: 0.7085\n",
"Epoch 199/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5329 - accuracy: 0.7040\n",
"Epoch 200/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5289 - accuracy: 0.7175\n",
"Epoch 201/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5306 - accuracy: 0.7130\n",
"Epoch 202/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5285 - accuracy: 0.7130\n",
"Epoch 203/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5285 - accuracy: 0.7175\n",
"Epoch 204/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5302 - accuracy: 0.7175\n",
"Epoch 205/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5253 - accuracy: 0.7085\n",
"Epoch 206/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5283 - accuracy: 0.7175\n",
"Epoch 207/250\n",
"7/7 [==============================] - 0s 3ms/step - loss: 0.5265 - accuracy: 0.7220\n",
"Epoch 208/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5279 - accuracy: 0.7175\n",
"Epoch 209/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5293 - accuracy: 0.7130\n",
"Epoch 210/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5273 - accuracy: 0.7175\n",
"Epoch 211/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5259 - accuracy: 0.7130\n",
"Epoch 212/250\n",
"7/7 [==============================] - 0s 8ms/step - loss: 0.5260 - accuracy: 0.7175\n",
"Epoch 213/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5232 - accuracy: 0.7220\n",
"Epoch 214/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5235 - accuracy: 0.7130\n",
"Epoch 215/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5220 - accuracy: 0.7130\n",
"Epoch 216/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5216 - accuracy: 0.7085\n",
"Epoch 217/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5231 - accuracy: 0.7130\n",
"Epoch 218/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5244 - accuracy: 0.7130\n",
"Epoch 219/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5222 - accuracy: 0.7085\n",
"Epoch 220/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5197 - accuracy: 0.7130\n",
"Epoch 221/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5198 - accuracy: 0.7175\n",
"Epoch 222/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5199 - accuracy: 0.7130\n",
"Epoch 223/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5191 - accuracy: 0.7130\n",
"Epoch 224/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5181 - accuracy: 0.7175\n",
"Epoch 225/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5191 - accuracy: 0.7220\n",
"Epoch 226/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5186 - accuracy: 0.7309\n",
"Epoch 227/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5196 - accuracy: 0.7309\n",
"Epoch 228/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5208 - accuracy: 0.7085\n",
"Epoch 229/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5169 - accuracy: 0.7220\n",
"Epoch 230/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5179 - accuracy: 0.7220\n",
"Epoch 231/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5168 - accuracy: 0.7220\n",
"Epoch 232/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5157 - accuracy: 0.7354\n",
"Epoch 233/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5149 - accuracy: 0.7309\n",
"Epoch 234/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5158 - accuracy: 0.7309\n",
"Epoch 235/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5143 - accuracy: 0.7444\n",
"Epoch 236/250\n",
"7/7 [==============================] - 0s 7ms/step - loss: 0.5188 - accuracy: 0.7265\n",
"Epoch 237/250\n",
"7/7 [==============================] - 0s 3ms/step - loss: 0.5144 - accuracy: 0.7399\n",
"Epoch 238/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5142 - accuracy: 0.7354\n",
"Epoch 239/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5154 - accuracy: 0.7399\n",
"Epoch 240/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5136 - accuracy: 0.7220\n",
"Epoch 241/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5148 - accuracy: 0.7354\n",
"Epoch 242/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5123 - accuracy: 0.7354\n",
"Epoch 243/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5125 - accuracy: 0.7309\n",
"Epoch 244/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5116 - accuracy: 0.7265\n",
"Epoch 245/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5112 - accuracy: 0.7354\n",
"Epoch 246/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5116 - accuracy: 0.7220\n",
"Epoch 247/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5117 - accuracy: 0.7220\n",
"Epoch 248/250\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.5118 - accuracy: 0.7399\n",
"Epoch 249/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5132 - accuracy: 0.7444\n",
"Epoch 250/250\n",
"7/7 [==============================] - 0s 1ms/step - loss: 0.5092 - accuracy: 0.7354\n"
]
},
{
"data": {
"text/plain": [
""
]
},
"execution_count": 63,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"model.fit(X_train, y_train, epochs=250)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "w6VAo-ZNK5uw"
},
"source": [
"## Model Evaluation\n",
"The model evaluation output shows the performance of the model on both training and test data. The accuracy was approximately 75% on the training data and 74% on the test data. It's noteworthy that the training and test set accuracies are close to each other, which shows that there is consistency, and that the accuracy doesn't drop too much when I test the model on unseen data. "
]
},
{
"cell_type": "code",
"execution_count": 64,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"execution": {
"iopub.execute_input": "2022-12-27T18:49:59.562231Z",
"iopub.status.busy": "2022-12-27T18:49:59.561787Z",
"iopub.status.idle": "2022-12-27T18:49:59.828505Z",
"shell.execute_reply": "2022-12-27T18:49:59.827266Z",
"shell.execute_reply.started": "2022-12-27T18:49:59.562197Z"
},
"id": "k2GEaWbtK5ux",
"outputId": "90ff685a-f1bb-467f-97fe-9f03d1c0b6ea"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"7/7 [==============================] - 0s 2ms/step - loss: 0.5101 - accuracy: 0.7354\n"
]
},
{
"data": {
"text/plain": [
"[0.5101242661476135, 0.7354260087013245]"
]
},
"execution_count": 64,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"model.evaluate(X_train, y_train)"
]
},
{
"cell_type": "code",
"execution_count": 65,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"execution": {
"iopub.execute_input": "2022-12-27T18:50:03.037461Z",
"iopub.status.busy": "2022-12-27T18:50:03.036767Z",
"iopub.status.idle": "2022-12-27T18:50:03.252629Z",
"shell.execute_reply": "2022-12-27T18:50:03.251550Z",
"shell.execute_reply.started": "2022-12-27T18:50:03.037425Z"
},
"id": "Rv1vr_EdK5ux",
"outputId": "ffc7f898-ff5d-4cf6-96e2-36f27d5ae678"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"3/3 [==============================] - 0s 1ms/step - loss: 0.6855 - accuracy: 0.7083\n"
]
},
{
"data": {
"text/plain": [
"[0.6855059266090393, 0.7083333134651184]"
]
},
"execution_count": 65,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"model.evaluate(X_test, y_test)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "M_sIj5bVK5ux"
},
"source": [
"## Conclusion\n",
"\n",
"I have built a deep learning classification model using the deep learning framework, Keras, in TensorFlow. I used a IPO dataset and built a classifier algorithm to predict whether an IPO will list at profit or not. I used the Sequential API to build the model, which is achieving a decent accuracy of 75% and 74% on training and test data, respectively. I see that the accuracy is consistent across the training and test datasets, which is a promising sign. This model can be useful to make investment decisions in the IPO market.\n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# PS: Alternative Approach via Functional API"
]
},
{
"cell_type": "code",
"execution_count": 68,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Model: \"model_2\"\n",
"_________________________________________________________________\n",
" Layer (type) Output Shape Param # \n",
"=================================================================\n",
" input_3 (InputLayer) [(None, 6)] 0 \n",
" \n",
" dense_17 (Dense) (None, 128) 896 \n",
" \n",
" dropout_4 (Dropout) (None, 128) 0 \n",
" \n",
" dense_18 (Dense) (None, 64) 8256 \n",
" \n",
" dropout_5 (Dropout) (None, 64) 0 \n",
" \n",
" dense_19 (Dense) (None, 16) 1040 \n",
" \n",
" dense_20 (Dense) (None, 8) 136 \n",
" \n",
" dense_21 (Dense) (None, 4) 36 \n",
" \n",
" dense_22 (Dense) (None, 1) 5 \n",
" \n",
"=================================================================\n",
"Total params: 10,369\n",
"Trainable params: 10,369\n",
"Non-trainable params: 0\n",
"_________________________________________________________________\n",
"None\n",
"7/7 [==============================] - 0s 2ms/step - loss: 0.4684 - accuracy: 0.7668\n",
"[0.4683583080768585, 0.7668161392211914]\n",
"3/3 [==============================] - 0s 3ms/step - loss: 0.6766 - accuracy: 0.6875\n",
"[0.6766266822814941, 0.6875]\n"
]
}
],
"source": [
"input_layer = tf.keras.Input(shape=(X_train.shape[1],))\n",
"hidden_layer1 = tf.keras.layers.Dense(128, activation='relu')(input_layer)\n",
"drop1 = tf.keras.layers.Dropout(rate=0.40)(hidden_layer1)\n",
"hidden_layer2 = tf.keras.layers.Dense(64, activation='relu')(drop1)\n",
"drop2 =tf.keras.layers.Dropout(rate=0.20)(hidden_layer2)\n",
"hidden_layer3 = tf.keras.layers.Dense(16, activation='relu')(drop2)\n",
"hidden_layer4 = tf.keras.layers.Dense(8, activation='relu')(hidden_layer3)\n",
"hidden_layer5 = tf.keras.layers.Dense(4, activation='relu')(hidden_layer4)\n",
"output_layer = tf.keras.layers.Dense(1, activation='sigmoid')(hidden_layer5)\n",
"model = tf.keras.Model(inputs=input_layer, outputs=output_layer)\n",
"print(model.summary())\n",
"\n",
"optimizer = tf.keras.optimizers.Adam(0.001)\n",
"\n",
"loss = tf.keras.losses.BinaryCrossentropy()\n",
"\n",
"metrics = ['accuracy']\n",
"\n",
"model.compile(optimizer=optimizer, loss=loss, metrics=metrics)\n",
"model.fit(X_train, y_train, epochs=250, verbose=0)\n",
"print(model.evaluate(X_train, y_train))\n",
"print(model.evaluate(X_test, y_test))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"colab": {
"provenance": []
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.2"
}
},
"nbformat": 4,
"nbformat_minor": 4
}