{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "dated-worst",
   "metadata": {},
   "source": [
    "# <center> 👩‍💻 Welcome to PyExplainer Quickstart Guide 👨‍💻 </center>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "legendary-military",
   "metadata": {},
   "source": [
    "### <center><a href=\"https://github.com/awsm-research/pyExplainer\">pyexplainer - GitHub Repository</a></center>\n",
    "### <center><a href=\"https://pypi.org/project/pyexplainer/\">pyexplainer - PyPI</a></center>\n",
    "### <center><a href=\"https://pyexplainer.readthedocs.io/en/latest/\">pyexplainer - Official Documentation</a></center>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "pregnant-appraisal",
   "metadata": {},
   "source": [
    "# 🛠 Installation \n",
    "## - Please ignore this part if you cloned the whole package from GitHub\n",
    "\n",
    "### 🤖 Run the cell below to install pyexplainer 0.1.5"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "aboriginal-subject",
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install pyexplainer"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ideal-bedroom",
   "metadata": {},
   "source": [
    "### 🤖 If the code above did not work, try the cell below, otherwise, you are good to go!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "alert-spirituality",
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip3 install pyexplainer"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "regional-promise",
   "metadata": {},
   "source": [
    "# Let's get started !"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "forced-olympus",
   "metadata": {},
   "source": [
    "## 👩🏻‍🔧 1. Prepare data and model\n",
    "#### 📝Note. We use the default data and model here for an example"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "crude-promise",
   "metadata": {},
   "source": [
    "### 1.1 Import Libraries Needed"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "stunning-summer",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pyexplainer import pyexplainer_pyexplainer\n",
    "from pyexplainer.pyexplainer_pyexplainer import PyExplainer "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "turned-ethics",
   "metadata": {},
   "source": [
    "### 1.2 Use default datasets and model (Random Forest)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "hearing-gambling",
   "metadata": {},
   "outputs": [],
   "source": [
    "default_data_and_model = pyexplainer_pyexplainer.get_default_data_and_model()\n",
    "py_explainer = PyExplainer(X_train = default_data_and_model['X_train'],\n",
    "                           y_train = default_data_and_model['y_train'],\n",
    "                           indep = default_data_and_model['indep'],\n",
    "                           dep = default_data_and_model['dep'],\n",
    "                           blackbox_model = default_data_and_model['blackbox_model'])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "appropriate-meaning",
   "metadata": {},
   "source": [
    "## 🔧2. Create a Rule Object Manually\n",
    "#### 📝Note. Rule Object is the core backend concept of PyExplainer ! "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "relevant-subcommittee",
   "metadata": {},
   "source": [
    "### 2.1 Prepare X_explain and y_explain data\n",
    "#### 📝Note. We use the default data here for an example"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "changing-empty",
   "metadata": {},
   "outputs": [],
   "source": [
    "X_explain = default_data_and_model['X_explain']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "pleasant-avenue",
   "metadata": {},
   "outputs": [],
   "source": [
    "y_explain = default_data_and_model['y_explain']"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "mechanical-motion",
   "metadata": {},
   "source": [
    "### 2.2 Create the rule object"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "awful-blank",
   "metadata": {},
   "outputs": [],
   "source": [
    "created_rule_obj = py_explainer.explain(X_explain=X_explain,\n",
    "                                        y_explain=y_explain,\n",
    "                                        search_function='crossoverinterpolation')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "architectural-broad",
   "metadata": {},
   "source": [
    "## 👩🏽‍🎨 3. Pass Rule Object to .visualise(rule_obj) to Generate the Bullet Chart and Interactive Slider\n",
    "#### 📝Note. simply move the gray slider to modify the value so you can get a new risk score."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "returning-vertical",
   "metadata": {},
   "source": [
    "#### 🔧 Visualise the Rule Object we created manually using .explain(...) method"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "cheap-minister",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "1c988c320be64b47898cc057a226721b",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(Label(value='Risk Score: '), FloatProgress(value=0.0, bar_style='info', layout=Layout(width='40…"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "52e3c160dfca46dc89fe2a9850403a0d",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "FloatSlider(value=0.0, continuous_update=False, description='#1 Decrease the values of CountDeclMethodDefault …"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "65659f31f84d4b82ba22601163b93661",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "FloatSlider(value=1.54, continuous_update=False, description='#2 Increase the values of RatioCommentToCode to …"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "2babbcca8dd54c6995af457978ddfd2d",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "FloatSlider(value=1.0, continuous_update=False, description='#3 Decrease the values of AvgCyclomaticModified t…"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "9bd17a5b379e42f48e715dfff9f7e878",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Output(layout=Layout(border='3px solid black'))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "py_explainer.visualise(created_rule_obj)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "infectious-calgary",
   "metadata": {},
   "source": [
    "# 🤡 Important - Bug Report Channel 🤡\n",
    "#### Please report <a href=\"https://github.com/awsm-research/pyExplainer/issues\">here</a>\n",
    "#### 📧 or email your report to michaelfu1998@gmail.com\n",
    "# "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "married-malpractice",
   "metadata": {},
   "source": [
    "# <center> 🙏Thanks for playing around with PyExplainer, I really appreciate your time! 🙏 </center>\n",
    "#### <center> 🔥 More Features will be Released Soon 🔥 </center>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "difficult-algeria",
   "metadata": {},
   "source": [
    "# 📜 Appendex 📜"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "numeric-efficiency",
   "metadata": {},
   "source": [
    "## A. 🕵🏻 What's in the Rule Object (rule_obj) ?  Let's unbox it ! 📦"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "increased-indicator",
   "metadata": {},
   "source": [
    "### Basic Data Check"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "respective-program",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"Type of Rule Object: \", type(load_pyExp_rule_obj))\n",
    "print()\n",
    "print(\"All of the keys in Rule Object\")\n",
    "i = 1\n",
    "for k in load_pyExp_rule_obj.keys():\n",
    "    print(\"Key \", i, \" - \",k)\n",
    "    i += 1"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "forward-freeze",
   "metadata": {},
   "source": [
    "### 🔑 Key 1 - synthetic_data\n",
    "#### As can be seen below, the synthetic data are data coming from feature columns\n",
    "#### This synthetic data was generated internally by the PyExplainer when the .explain(...) method is triggered\n",
    "#### Currently we have 2 approaches to generate synthetic_data\n",
    "#### Approach (1) Crossover and Interpolation\n",
    "#### Approach (2) Random Pertubation\n",
    "#### After the process of C&I. or RP., synthetic_data will be generated as a DataFrame below"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "laden-doubt",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"Type of pyExp_rule_obj['synthetic_data'] - \", type(load_pyExp_rule_obj['synthetic_data']), \"\\n\")\n",
    "print(\"Example\", \"\\n\\n\", load_pyExp_rule_obj['synthetic_data'].head(2))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "blind-mustang",
   "metadata": {},
   "source": [
    "### 🔑 Key 2 - synthetic_predictions\n",
    "#### As can be seen below, the synthetic prediction are data coming from the prediction column\n",
    "#### This synthetic prediction was generated internally by the PyExplainer when the .explain(...) method is triggered\n",
    "#### This synthetic prediction is created based on the black box model we passed to the PyExplainer when initialising (section 1.5 & 2.3)\n",
    "#### This synthetic prediction is generated based on the synthetic data above therefore it's called synthetic_predictions\n",
    "#### >>> e.g. synthetic_predictions = blackbox_model.predict(synthetic_data)  Note. we only need feature cols in synthetic_data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "municipal-operation",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"Type of pyExp_rule_obj['synthetic_predictions'] - \", type(load_pyExp_rule_obj['synthetic_predictions']), \"\\n\")\n",
    "print(\"Example\", \"\\n\\n\", load_pyExp_rule_obj['synthetic_predictions'])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "adjacent-begin",
   "metadata": {},
   "source": [
    "### 🔑 Key 3 - X_explain\n",
    "#### This X_explain is exactly the same as the one we passed to .explain(...) method (section 3.3 & section 3.4)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "according-profession",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"Type of pyExp_rule_obj['X_explain'] - \", type(load_pyExp_rule_obj['X_explain']), \"\\n\")\n",
    "print(\"Example\", \"\\n\\n\", load_pyExp_rule_obj['X_explain'])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "spectacular-feeling",
   "metadata": {},
   "source": [
    "### 🔑 Key 4 - y_explain\n",
    "#### This y_explain is exactly the same as the one we passed to .explain(...) method (section 3.3 & section 3.4)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "present-breakdown",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"Type of pyExp_rule_obj['y_explain'] - \", type(load_pyExp_rule_obj['y_explain']), \"\\n\")\n",
    "print(\"Example\", \"\\n\\n\", load_pyExp_rule_obj['y_explain'])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "serious-territory",
   "metadata": {},
   "source": [
    "### 🔑 Key 5 - indep\n",
    "#### Names of the Selected Feature Cols"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "sublime-ethernet",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"Type of pyExp_rule_obj['indep'] - \", type(load_pyExp_rule_obj['indep']), \"\\n\")\n",
    "print(\"Example\", \"\\n\\n\", load_pyExp_rule_obj['indep'])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "designed-barbados",
   "metadata": {},
   "source": [
    "### 🔑 Key 6 - dep\n",
    "#### Names of the Label Col (Prediction Col)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "agreed-health",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"Type of pyExp_rule_obj['dep'] - \", type(load_pyExp_rule_obj['dep']), \"\\n\")\n",
    "print(\"Example\", \"\\n\\n\", load_pyExp_rule_obj['dep'])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "noted-wiring",
   "metadata": {},
   "source": [
    "### 🔑 Key 7 - top_k_positive_rules\n",
    "#### This shows the top k positive rules generated by the RuleFit model inside the .explain(...) function\n",
    "#### The value of 'top_k' can be tuned in when we create a Rule Object manually (section 3.4), the default value is 3"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "northern-registration",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"Type of pyExp_rule_obj['top_k_positive_rules'] - \", type(load_pyExp_rule_obj['top_k_positive_rules']), \"\\n\")\n",
    "print(\"Example\", \"\\n\\n\", load_pyExp_rule_obj['top_k_positive_rules'])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "contemporary-honolulu",
   "metadata": {},
   "source": [
    "### 🔑 Key 8 - top_k_negative_rules\n",
    "#### This shows the top k negative rules generated by the RuleFit model inside the .explain(...) function\n",
    "#### The value of 'top_k' can be tuned in when we create a Rule Object manually (section 3.4), the default value is 3\n",
    "#### However, in the current version, the top_k value is always the same for both negative and positive rules which can be improved in the future version"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "swiss-costa",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"Type of pyExp_rule_obj['top_k_negative_rules'] - \", type(load_pyExp_rule_obj['top_k_negative_rules']), \"\\n\")\n",
    "print(\"Example\", \"\\n\\n\", load_pyExp_rule_obj['top_k_negative_rules'])"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}