{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Programming Language Classification with OpenVINO\n",
    "## Overview\n",
    "This tutorial will be divided in 2 parts:\n",
    "1. Create a simple inference pipeline with a pre-trained model using the OpenVINO™ IR format.\n",
    "2. Conduct [post-training quantization](https://docs.openvino.ai/2023.3/ptq_introduction.html) on a pre-trained model using Hugging Face Optimum and benchmark performance.\n",
    "\n",
    "Feel free to use the notebook outline in Jupyter or your IDE for easy navigation. \n",
    "\n",
    "#### Table of contents:\n",
    "- [Introduction](#Introduction)\n",
    "    - [Task](#Task)\n",
    "    - [Model](#Model)\n",
    "- [Part 1: Inference pipeline with OpenVINO](#Part-1:-Inference-pipeline-with-OpenVINO)\n",
    "    - [Install prerequisites](#Install-prerequisites)\n",
    "    - [Imports](#Imports)\n",
    "    - [Setting up HuggingFace cache](#Setting-up-HuggingFace-cache)\n",
    "    - [Select inference device](#Select-inference-device)\n",
    "    - [Download resources](#Download-resources)\n",
    "    - [Create inference pipeline](#Create-inference-pipeline)\n",
    "    - [Inference on new input](#Inference-on-new-input)\n",
    "- [Part 2: OpenVINO post-training quantization with HuggingFace Optimum](#Part-2:-OpenVINO-post-training-quantization-with-HuggingFace-Optimum)\n",
    "    - [Define constants and functions](#Define-constants-and-functions)\n",
    "    - [Load resources](#Load-resources)\n",
    "    - [Load calibration dataset](#Load-calibration-dataset)\n",
    "    - [Quantize model](#Quantize-model)\n",
    "    - [Load quantized model](#Load-quantized-model)\n",
    "    - [Inference on new input using quantized model](#Inference-on-new-input-using-quantized-model)\n",
    "    - [Load evaluation set](#Load-evaluation-set)\n",
    "    - [Evaluate model](#Evaluate-model)\n",
    "- [Additional resources](#Additional-resources)\n",
    "- [Clean up](#Clean-up)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Introduction\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "### Task\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "**Programming language classification** is the task of identifying which programming language is used in an arbitrary code snippet. This can be useful to label new data to include in a dataset, and potentially serve as an intermediary step when input snippets need to be process based on their programming language.\n",
    "\n",
    "It is a relatively easy machine learning task given that each programming language has its own formal symbols, syntax, and grammar. However, there are some potential edge cases:\n",
    "- **Ambiguous short snippets**: For example, TypeScript is a superset of JavaScript, meaning it does everything JavaScript can and more. For a short input snippet, it might be impossible to distinguish between the two. Given we know TypeScript is a superset, and the model doesn't, we should default to classifying the input as JavaScript in a post-processing step.\n",
    "- **Nested programming languages**: Some languages are typically used in tandem. For example, most HTML contains CSS and JavaScript, and it is not uncommon to see SQL nested in other scripting languages. For such input, it is unclear what the expected output class should be. \n",
    "- **Evolving programming language**: Even though programming languages are formal, their symbols, syntax, and grammar can be revised and updated. For example, the walrus operator (`:=`) was a symbol distinctively used in Golang, but was later introduced in Python 3.8."
   ]
  },
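  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a concrete illustration of the first edge case, a post-processing step could remap a TypeScript prediction to JavaScript. The sketch below assumes a classifier whose output follows the HuggingFace `pipeline` format (a dict with `label` and `score` keys); note that the CodeBERTa model used in this notebook does not emit a TypeScript label, so the rule would only apply to a classifier that does.\n",
    "\n",
    "```python\n",
    "# Hypothetical post-processing rule: since TypeScript is a superset of\n",
    "# JavaScript, remap any \"typescript\" prediction to \"javascript\".\n",
    "def postprocess(prediction: dict) -> dict:\n",
    "    if prediction[\"label\"] == \"typescript\":\n",
    "        prediction = {**prediction, \"label\": \"javascript\"}\n",
    "    return prediction\n",
    "\n",
    "print(postprocess({\"label\": \"typescript\", \"score\": 0.55}))\n",
    "# {'label': 'javascript', 'score': 0.55}\n",
    "```"
   ]
  },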
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Model\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "The classification model that will be used in this notebook is [`CodeBERTa-language-id`](https://huggingface.co/huggingface/CodeBERTa-language-id) by HuggingFace. This model was fine-tuned from the masked language modeling model [`CodeBERTa-small-v1`](https://huggingface.co/huggingface/CodeBERTa-small-v1) trained on the [`CodeSearchNet`](https://huggingface.co/huggingface/CodeBERTa-small-v1) dataset (Husain, 2019).\n",
    "\n",
    "It supports 6 programming languages:\n",
    "- Go\n",
    "- Java\n",
    "- JavaScript\n",
    "- PHP\n",
    "- Python\n",
    "- Ruby"
   ]
  },
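  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One quick way to confirm the labels a model supports is to inspect its configuration with the standard `transformers` API. A minimal sketch follows; the exact mapping shown in the comment is an assumption, consistent with the label mapping used later in this notebook.\n",
    "\n",
    "```python\n",
    "from transformers import AutoConfig\n",
    "\n",
    "# Download only the model config and print its id-to-label mapping\n",
    "config = AutoConfig.from_pretrained(\"huggingface/CodeBERTa-language-id\")\n",
    "print(config.id2label)\n",
    "# e.g. {0: 'go', 1: 'java', 2: 'javascript', 3: 'php', 4: 'python', 5: 'ruby'}\n",
    "```"
   ]
  },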
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Part 1: Inference pipeline with OpenVINO\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "For this section, we will use the [**HuggingFace Optimum**](https://huggingface.co/docs/optimum/index) library, which aims to optimize inference on specific hardware and integrates with the OpenVINO toolkit. The code will be very similar to the [**HuggingFace Transformers**](https://huggingface.co/docs/transformers/index), but will allow to automatically convert models to the OpenVINO™ IR format."
   ]
  },
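  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The parallel between the two libraries looks roughly like this (a sketch of the pattern only; the actual download, caching, and device selection are handled in the cells below):\n",
    "\n",
    "```python\n",
    "from transformers import AutoModelForSequenceClassification\n",
    "from optimum.intel import OVModelForSequenceClassification\n",
    "\n",
    "model_id = \"huggingface/CodeBERTa-language-id\"\n",
    "\n",
    "# Plain Transformers (PyTorch backend)\n",
    "pt_model = AutoModelForSequenceClassification.from_pretrained(model_id)\n",
    "\n",
    "# Optimum Intel drop-in replacement (OpenVINO backend); export=True\n",
    "# converts the PyTorch checkpoint to OpenVINO IR on the fly\n",
    "ov_model = OVModelForSequenceClassification.from_pretrained(model_id, export=True)\n",
    "```"
   ]
  },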
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Install prerequisites\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "First, complete the [repository installation steps](../../README.md).\n",
    "\n",
    "Then, the following cell will install:\n",
    "- HuggingFace Optimum with OpenVINO support\n",
    "- HuggingFace Evaluate to benchmark results"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install -q \"diffusers>=0.17.1\" \"openvino>=2023.1.0\" \"nncf>=2.5.0\" \"gradio\" \"onnx>=1.11.0\" \"transformers>=4.33.0\" \"evaluate\" --extra-index-url https://download.pytorch.org/whl/cpu\n",
    "%pip install -q \"git+https://github.com/huggingface/optimum-intel.git\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Imports\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "The import `OVModelForSequenceClassification` from Optimum is equivalent to `AutoModelForSequenceClassification` from Transformers"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "2023-08-07 09:42:12.312320: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\n",
      "2023-08-07 09:42:12.350853: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.\n",
      "To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
      "2023-08-07 09:42:13.079480: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "INFO:nncf:NNCF initialized successfully. Supported frameworks detected: torch, tensorflow, onnx, openvino\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'\n"
     ]
    }
   ],
   "source": [
    "from functools import partial\n",
    "from pathlib import Path\n",
    "\n",
    "import pandas as pd\n",
    "from datasets import load_dataset, Dataset\n",
    "import evaluate\n",
    "from transformers import pipeline, AutoTokenizer, AutoModelForSequenceClassification\n",
    "from optimum.intel import OVModelForSequenceClassification  \n",
    "from optimum.intel.openvino import OVConfig, OVQuantizer\n",
    "from huggingface_hub.utils import RepositoryNotFoundError"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Setting up HuggingFace cache\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "Resources from HuggingFace will be downloaded in the local folder `./model` (next to this notebook) instead of the device global cache for easy cleanup. Learn more [here](https://huggingface.co/docs/transformers/installation?highlight=transformers_cache#cache-setup)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "MODEL_NAME = \"CodeBERTa-language-id\"\n",
    "MODEL_ID = f\"huggingface/{MODEL_NAME}\"\n",
    "MODEL_LOCAL_PATH = Path(\"./model\").joinpath(MODEL_NAME)"
   ]
  },
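  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As an alternative to passing local paths explicitly, the global cache location can be redirected through an environment variable (a sketch based on the cache-setup documentation linked above; it must run before the HuggingFace libraries are imported, and this notebook keeps the explicit-path approach instead):\n",
    "\n",
    "```python\n",
    "import os\n",
    "\n",
    "# Redirect the HuggingFace cache to a local folder (variable name assumed\n",
    "# per the cache-setup docs; set before importing transformers/datasets)\n",
    "os.environ[\"HF_HOME\"] = \"./hf_cache\"\n",
    "```"
   ]
  },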
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Select inference device\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "select device from dropdown list for running inference using OpenVINO"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "4fd116b33e5c48af9c61e4d4b5770dac",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Dropdown(description='Device:', index=2, options=('CPU', 'GPU', 'AUTO'), value='AUTO')"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import ipywidgets as widgets\n",
    "import openvino as ov\n",
    "\n",
    "core = ov.Core()\n",
    "\n",
    "device = widgets.Dropdown(\n",
    "    options=core.available_devices + [\"AUTO\"],\n",
    "    value='AUTO',\n",
    "    description='Device:',\n",
    "    disabled=False,\n",
    ")\n",
    "\n",
    "device"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Download resources\n",
    "[back to top ⬆️](#Table-of-contents:)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Downloading resources from HuggingFace Hub\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "a241021c39fa4f2c8cdcbb41b92748ec",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Downloading (…)okenizer_config.json:   0%|          | 0.00/19.0 [00:00<?, ?B/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "b7a69548e74b4eb2bcd918435be3d883",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Downloading (…)lve/main/config.json:   0%|          | 0.00/756 [00:00<?, ?B/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "792d786863e44ddd83e1d3e4fca77722",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Downloading (…)olve/main/vocab.json:   0%|          | 0.00/994k [00:00<?, ?B/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "58576ec710404098b0b97fb6746ddd7b",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Downloading (…)olve/main/merges.txt:   0%|          | 0.00/483k [00:00<?, ?B/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Framework not specified. Using pt to export to ONNX.\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "7644c7b2674d4a0a9d4d59d63c608d33",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Downloading pytorch_model.bin:   0%|          | 0.00/336M [00:00<?, ?B/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Some weights of the model checkpoint at huggingface/CodeBERTa-language-id were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.weight', 'roberta.pooler.dense.bias']\n",
      "- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).\n",
      "- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).\n",
      "Using framework PyTorch: 2.0.1+cpu\n",
      "Overriding 1 configuration item(s)\n",
      "\t- use_cache -> False\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "============== Diagnostic Run torch.onnx.export version 2.0.1+cpu ==============\n",
      "verbose: False, log level: Level.ERROR\n",
      "======================= 0 NONE 0 NOTE 0 WARNING 0 ERROR ========================\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Compiling the model...\n",
      "Set CACHE_DIR to /tmp/tmp9yj8fta_/model_cache\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Ressources cached locally at: /home/ea/work/openvino_notebooks/notebooks/247-code-language-id/model/CodeBERTa-language-id\n"
     ]
    }
   ],
   "source": [
    "# try to load resources locally\n",
    "try:\n",
    "    model = OVModelForSequenceClassification.from_pretrained(MODEL_LOCAL_PATH, device=device.value)\n",
    "    tokenizer = AutoTokenizer.from_pretrained(MODEL_LOCAL_PATH)\n",
    "    print(f\"Loaded resources from local path: {MODEL_LOCAL_PATH.absolute()}\")\n",
    "\n",
    "# if not found, download from HuggingFace Hub then save locally\n",
    "except (RepositoryNotFoundError, OSError):\n",
    "    print(\"Downloading resources from HuggingFace Hub\")\n",
    "    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)\n",
    "    tokenizer.save_pretrained(MODEL_LOCAL_PATH)\n",
    "\n",
    "    # export=True is needed to convert the PyTorch model to OpenVINO\n",
    "    model = OVModelForSequenceClassification.from_pretrained(MODEL_ID, export=True, device=device.value)\n",
    "    model.save_pretrained(MODEL_LOCAL_PATH)\n",
    "    print(f\"Ressources cached locally at: {MODEL_LOCAL_PATH.absolute()}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Create inference pipeline\n",
    "[back to top ⬆️](#Table-of-contents:)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Xformers is not installed correctly. If you want to use memory_efficient_attention to accelerate training use the following command to install Xformers\n",
      "pip install xformers.\n"
     ]
    }
   ],
   "source": [
    "code_classification_pipe = pipeline(\"text-classification\", model=model, tokenizer=tokenizer)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Inference on new input\n",
    "[back to top ⬆️](#Table-of-contents:)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Input snippet:\n",
      "  df['speed'] = df.distance / df.time\n",
      "\n",
      "Predicted label: python\n",
      "Predicted score: 0.81\n"
     ]
    }
   ],
   "source": [
    "# change input snippet to test model\n",
    "input_snippet = \"df['speed'] = df.distance / df.time\"\n",
    "output = code_classification_pipe(input_snippet)\n",
    "\n",
    "print(f\"Input snippet:\\n  {input_snippet}\\n\")\n",
    "print(f\"Predicted label: {output[0]['label']}\")\n",
    "print(f\"Predicted score: {output[0]['score']:.2}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Part 2: OpenVINO post-training quantization with HuggingFace Optimum\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "In this section, we will quantize a trained model. At a high-level, this process consists of using lower precision numbers in the model, which results in a smaller model size and faster inference at the cost of a potential marginal performance degradation. [Learn more](https://docs.openvino.ai/2023.3/ptq_introduction.html).\n",
    "\n",
    "The HuggingFace Optimum library supports post-training quantization for OpenVINO. [Learn more](https://huggingface.co/docs/optimum/main/en/intel/index)."
   ]
  },
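  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A back-of-the-envelope calculation shows why quantization shrinks the model (a sketch only; the ~84M parameter count is inferred from the ~336 MB FP32 checkpoint downloaded earlier, and the actual on-disk size depends on the serialization format and on which layers get quantized):\n",
    "\n",
    "```python\n",
    "num_params = 84_000_000  # assumed: ~336 MB checkpoint / 4 bytes per FP32 weight\n",
    "\n",
    "fp32_mb = num_params * 4 / 1e6  # 4 bytes per FP32 weight\n",
    "int8_mb = num_params * 1 / 1e6  # 1 byte per INT8 weight\n",
    "\n",
    "print(f\"FP32: ~{fp32_mb:.0f} MB, INT8: ~{int8_mb:.0f} MB (~4x smaller)\")\n",
    "```"
   ]
  },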
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Define constants and functions\n",
    "[back to top ⬆️](#Table-of-contents:)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "QUANTIZED_MODEL_LOCAL_PATH = MODEL_LOCAL_PATH.with_name(f\"{MODEL_NAME}-quantized\")\n",
    "DATASET_NAME = \"code_search_net\"\n",
    "LABEL_MAPPING = {\"go\": 0, \"java\": 1, \"javascript\": 2, \"php\": 3, \"python\": 4, \"ruby\": 5}\n",
    "\n",
    "\n",
    "def preprocess_function(examples: dict, tokenizer):\n",
    "    \"\"\"Preprocess inputs by tokenizing the `func_code_string` column\"\"\"\n",
    "    return tokenizer(\n",
    "        examples[\"func_code_string\"],\n",
    "        padding=\"max_length\",\n",
    "        max_length=tokenizer.model_max_length,\n",
    "        truncation=True,\n",
    "    )\n",
    "\n",
    "\n",
    "def map_labels(example: dict) -> dict:\n",
    "    \"\"\"Convert string labels to integers\"\"\"\n",
    "    label_mapping = {\"go\": 0, \"java\": 1, \"javascript\": 2, \"php\": 3, \"python\": 4, \"ruby\": 5}\n",
    "    example[\"language\"] = label_mapping[example[\"language\"]]\n",
    "    return example \n",
    "\n",
    "\n",
    "def get_dataset_sample(dataset_split: str, num_samples: int) -> Dataset:\n",
    "    \"\"\"Create a sample with equal representation of each class without downloading the entire data\"\"\"\n",
    "    labels = [\"go\", \"java\", \"javascript\", \"php\", \"python\", \"ruby\"]\n",
    "    example_per_label = num_samples // len(labels)\n",
    "\n",
    "    examples = []\n",
    "    for label in labels:\n",
    "        subset = load_dataset(\"code_search_net\", split=dataset_split, name=label, streaming=True)\n",
    "        subset = subset.map(map_labels)\n",
    "        examples.extend([example for example in subset.shuffle().take(example_per_label)])\n",
    "    \n",
    "    return Dataset.from_list(examples)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Load resources\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "NOTE: the base model is loaded using `AutoModelForSequenceClassification` from `Transformers`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Some weights of the model checkpoint at huggingface/CodeBERTa-language-id were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.weight', 'roberta.pooler.dense.bias']\n",
      "- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).\n",
      "- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).\n"
     ]
    }
   ],
   "source": [
    "tokenizer = AutoTokenizer.from_pretrained(MODEL_LOCAL_PATH)\n",
    "base_model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)\n",
    "\n",
    "quantizer = OVQuantizer.from_pretrained(base_model)\n",
    "quantization_config = OVConfig()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Load calibration dataset\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "The `get_dataset_sample()` function will sample up to `num_samples`, with an equal number of examples across the 6 programming languages.\n",
    "\n",
    "NOTE: Uncomment the method below to download and use the full dataset (5+ Gb). "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "b29aeeb574c0444fb4c224a5ef1491e9",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Downloading builder script:   0%|          | 0.00/8.44k [00:00<?, ?B/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "d68f696b2d224b49bb9b6634d629f4d9",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Downloading metadata:   0%|          | 0.00/18.5k [00:00<?, ?B/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "ef3ead0d1e4f4e19a1bb5ca99395e34c",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Downloading readme:   0%|          | 0.00/12.9k [00:00<?, ?B/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "50764a49c6f04b21a30ef298e8c37e7f",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Map:   0%|          | 0/120 [00:00<?, ? examples/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "calibration_sample = get_dataset_sample(dataset_split=\"train\", num_samples=120)\n",
    "calibration_sample = calibration_sample.map(partial(preprocess_function, tokenizer=tokenizer))\n",
    "\n",
    "# calibration_sample = quantizer.get_calibration_dataset(\n",
    "#     DATASET_NAME,\n",
    "#     preprocess_function=partial(preprocess_function, tokenizer=tokenizer),\n",
    "#     num_samples=120,\n",
    "#     dataset_split=\"train\",\n",
    "#     preprocess_batch=True,\n",
    "# )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Quantize model\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "Calling `quantizer.quantize(...)` will iterate through the calibration dataset to quantize and save the model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "INFO:nncf:Not adding activation input quantizer for operation: 12 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEmbeddings[embeddings]/NNCFEmbedding[token_type_embeddings]/embedding_0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 11 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEmbeddings[embeddings]/NNCFEmbedding[word_embeddings]/embedding_0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 3 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEmbeddings[embeddings]/ne_0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 4 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEmbeddings[embeddings]/int_0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 5 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEmbeddings[embeddings]/cumsum_0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 13 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEmbeddings[embeddings]/__add___2\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 6 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEmbeddings[embeddings]/type_as_0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 7 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEmbeddings[embeddings]/__add___0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 8 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEmbeddings[embeddings]/__mul___0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 9 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEmbeddings[embeddings]/long_0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 10 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEmbeddings[embeddings]/__add___1\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 14 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEmbeddings[embeddings]/NNCFEmbedding[position_embeddings]/embedding_0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 15 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEmbeddings[embeddings]/__iadd___0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 16 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEmbeddings[embeddings]/NNCFLayerNorm[LayerNorm]/layer_norm_0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 17 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEmbeddings[embeddings]/Dropout[dropout]/dropout_0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 30 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[0]/RobertaAttention[attention]/RobertaSelfAttention[self]/__add___0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 33 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[0]/RobertaAttention[attention]/RobertaSelfAttention[self]/matmul_1\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 39 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[0]/RobertaAttention[attention]/RobertaSelfOutput[output]/__add___0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 40 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[0]/RobertaAttention[attention]/RobertaSelfOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 45 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[0]/RobertaOutput[output]/__add___0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 46 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[0]/RobertaOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 59 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[1]/RobertaAttention[attention]/RobertaSelfAttention[self]/__add___0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 62 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[1]/RobertaAttention[attention]/RobertaSelfAttention[self]/matmul_1\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 68 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[1]/RobertaAttention[attention]/RobertaSelfOutput[output]/__add___0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 69 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[1]/RobertaAttention[attention]/RobertaSelfOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 74 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[1]/RobertaOutput[output]/__add___0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 75 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[1]/RobertaOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 88 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[2]/RobertaAttention[attention]/RobertaSelfAttention[self]/__add___0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 91 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[2]/RobertaAttention[attention]/RobertaSelfAttention[self]/matmul_1\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 97 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[2]/RobertaAttention[attention]/RobertaSelfOutput[output]/__add___0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 98 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[2]/RobertaAttention[attention]/RobertaSelfOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 103 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[2]/RobertaOutput[output]/__add___0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 104 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[2]/RobertaOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 117 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[3]/RobertaAttention[attention]/RobertaSelfAttention[self]/__add___0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 120 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[3]/RobertaAttention[attention]/RobertaSelfAttention[self]/matmul_1\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 126 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[3]/RobertaAttention[attention]/RobertaSelfOutput[output]/__add___0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 127 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[3]/RobertaAttention[attention]/RobertaSelfOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 132 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[3]/RobertaOutput[output]/__add___0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 133 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[3]/RobertaOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 146 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[4]/RobertaAttention[attention]/RobertaSelfAttention[self]/__add___0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 149 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[4]/RobertaAttention[attention]/RobertaSelfAttention[self]/matmul_1\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 155 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[4]/RobertaAttention[attention]/RobertaSelfOutput[output]/__add___0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 156 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[4]/RobertaAttention[attention]/RobertaSelfOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 161 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[4]/RobertaOutput[output]/__add___0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 162 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[4]/RobertaOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 175 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[5]/RobertaAttention[attention]/RobertaSelfAttention[self]/__add___0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 178 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[5]/RobertaAttention[attention]/RobertaSelfAttention[self]/matmul_1\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 184 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[5]/RobertaAttention[attention]/RobertaSelfOutput[output]/__add___0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 185 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[5]/RobertaAttention[attention]/RobertaSelfOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 190 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[5]/RobertaOutput[output]/__add___0\n",
      "INFO:nncf:Not adding activation input quantizer for operation: 191 RobertaForSequenceClassification/RobertaModel[roberta]/RobertaEncoder[encoder]/ModuleList[layer]/RobertaLayer[5]/RobertaOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0\n",
      "INFO:nncf:Collecting tensor statistics |█               | 33 / 300\n",
      "INFO:nncf:Collecting tensor statistics |███             | 66 / 300\n",
      "INFO:nncf:Collecting tensor statistics |█████           | 99 / 300\n",
      "INFO:nncf:Compiling and loading torch extension: quantized_functions_cpu...\n",
      "huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...\n",
      "To disable this warning, you can either:\n",
      "\t- Avoid using `tokenizers` before the fork if possible\n",
      "\t- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)\n",
      "huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...\n",
      "To disable this warning, you can either:\n",
      "\t- Avoid using `tokenizers` before the fork if possible\n",
      "\t- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)\n",
      "huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...\n",
      "To disable this warning, you can either:\n",
      "\t- Avoid using `tokenizers` before the fork if possible\n",
      "\t- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)\n",
      "huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...\n",
      "To disable this warning, you can either:\n",
      "\t- Avoid using `tokenizers` before the fork if possible\n",
      "\t- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)\n",
      "INFO:nncf:Finished loading torch extension: quantized_functions_cpu\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/home/ea/work/ov_notebooks_env/lib/python3.8/site-packages/nncf/torch/nncf_network.py:938: FutureWarning: Old style of accessing NNCF-specific attributes and methods on NNCFNetwork objects is deprecated. Access the NNCF-specific attrs through the NNCFInterface, which is set up as an `nncf` attribute on the compressed model object.\n",
      "For instance, instead of `compressed_model.get_graph()` you should now write `compressed_model.nncf.get_graph()`.\n",
      "The old style will be removed after NNCF v2.5.0\n",
      "  warning_deprecated(\n",
      "Using framework PyTorch: 2.0.1+cpu\n",
      "Overriding 1 configuration item(s)\n",
      "\t- use_cache -> False\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "WARNING:nncf:You are setting `forward` on an NNCF-processed model object.\n",
      "NNCF relies on custom-wrapping the `forward` call in order to function properly.\n",
      "Arbitrary adjustments to the forward function on an NNCFNetwork object have undefined behaviour.\n",
      "If you need to replace the underlying forward function of the original model so that NNCF should be using that instead of the original forward function that NNCF saved during the compressed model creation, you can do this by calling:\n",
      "model.nncf.set_original_unbound_forward(fn)\n",
      "if `fn` has an unbound 0-th `self` argument, or\n",
      "with model.nncf.temporary_bound_original_forward(fn): ...\n",
      "if `fn` already had 0-th `self` argument bound or never had it in the first place.\n",
      "============== Diagnostic Run torch.onnx.export version 2.0.1+cpu ==============\n",
      "verbose: False, log level: Level.ERROR\n",
      "======================= 0 NONE 0 NOTE 0 WARNING 0 ERROR ========================\n",
      "\n",
      "WARNING:nncf:You are setting `forward` on an NNCF-processed model object.\n",
      "NNCF relies on custom-wrapping the `forward` call in order to function properly.\n",
      "Arbitrary adjustments to the forward function on an NNCFNetwork object have undefined behaviour.\n",
      "If you need to replace the underlying forward function of the original model so that NNCF should be using that instead of the original forward function that NNCF saved during the compressed model creation, you can do this by calling:\n",
      "model.nncf.set_original_unbound_forward(fn)\n",
      "if `fn` has an unbound 0-th `self` argument, or\n",
      "with model.nncf.temporary_bound_original_forward(fn): ...\n",
      "if `fn` already had 0-th `self` argument bound or never had it in the first place.\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Configuration saved in model/CodeBERTa-language-id-quantized/openvino_config.json\n"
     ]
    }
   ],
   "source": [
    "quantizer.quantize(\n",
    "    quantization_config=quantization_config,\n",
    "    calibration_dataset=calibration_sample,\n",
    "    save_directory=QUANTIZED_MODEL_LOCAL_PATH,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Load quantized model\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "NOTE: the argument `export=True` is not required since the quantized model is already in the OpenVINO format."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Compiling the model...\n",
      "Set CACHE_DIR to model/CodeBERTa-language-id-quantized/model_cache\n"
     ]
    }
   ],
   "source": [
    "quantized_model = OVModelForSequenceClassification.from_pretrained(QUANTIZED_MODEL_LOCAL_PATH, device=device.value)\n",
    "quantized_code_classification_pipe = pipeline(\"text-classification\", model=quantized_model, tokenizer=tokenizer)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Inference on new input using quantized model\n",
    "[back to top ⬆️](#Table-of-contents:)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Input snippet:\n",
      "  df['speed'] = df.distance / df.time\n",
      "\n",
      "Predicted label: python\n",
      "Predicted score: 0.81\n"
     ]
    }
   ],
   "source": [
    "input_snippet = \"df['speed'] = df.distance / df.time\"\n",
    "output = quantized_code_classification_pipe(input_snippet)\n",
    "\n",
    "print(f\"Input snippet:\\n  {input_snippet}\\n\")\n",
    "print(f\"Predicted label: {output[0]['label']}\")\n",
    "print(f\"Predicted score: {output[0]['score']:.2}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Load evaluation set\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "NOTE: Uncomment the method below to download and use the full dataset (5+ Gb). "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "validation_sample = get_dataset_sample(dataset_split=\"validation\", num_samples=120)\n",
    "\n",
    "# validation_sample = load_dataset(DATASET_NAME, split=\"validation\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Evaluate model\n",
    "[back to top ⬆️](#Table-of-contents:)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "# This class is needed due to a current limitation of the Evaluate library with multiclass metrics\n",
    "# ref: https://discuss.huggingface.co/t/combining-metrics-for-multiclass-predictions-evaluations/21792/16\n",
    "class ConfiguredMetric:\n",
    "    def __init__(self, metric, *metric_args, **metric_kwargs):\n",
    "        self.metric = metric\n",
    "        self.metric_args = metric_args\n",
    "        self.metric_kwargs = metric_kwargs\n",
    "    \n",
    "    def add(self, *args, **kwargs):\n",
    "        return self.metric.add(*args, **kwargs)\n",
    "    \n",
    "    def add_batch(self, *args, **kwargs):\n",
    "        return self.metric.add_batch(*args, **kwargs)\n",
    "\n",
    "    def compute(self, *args, **kwargs):\n",
    "        return self.metric.compute(*args, *self.metric_args, **kwargs, **self.metric_kwargs)\n",
    "\n",
    "    @property\n",
    "    def name(self):\n",
    "        return self.metric.name\n",
    "\n",
    "    def _feature_names(self):\n",
    "        return self.metric._feature_names()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First, an `Evaluator` object for `text-classification` and a set of `EvaluationModule` are instantiated. Then, the evaluator `.compute()` method is called on both the base `code_classification_pipe` and the quantized `quantized_code_classification_pipeline`. Finally, results are displayed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "71e5ed383d224cacbf3817512eb79a8e",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Downloading builder script:   0%|          | 0.00/6.77k [00:00<?, ?B/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>f1</th>\n",
       "      <th>total_time_in_seconds</th>\n",
       "      <th>samples_per_second</th>\n",
       "      <th>latency_in_seconds</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>base</th>\n",
       "      <td>1.0</td>\n",
       "      <td>2.334411</td>\n",
       "      <td>51.404821</td>\n",
       "      <td>0.019453</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>quantized</th>\n",
       "      <td>1.0</td>\n",
       "      <td>2.234008</td>\n",
       "      <td>53.715113</td>\n",
       "      <td>0.018617</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "            f1  total_time_in_seconds  samples_per_second  latency_in_seconds\n",
       "base       1.0               2.334411           51.404821            0.019453\n",
       "quantized  1.0               2.234008           53.715113            0.018617"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "code_classification_evaluator = evaluate.evaluator(\"text-classification\")\n",
    "# instantiate an object that can contain multiple `evaluate` metrics\n",
    "metrics = evaluate.combine([\n",
    "    ConfiguredMetric(evaluate.load('f1'), average='macro'),\n",
    "])\n",
    "\n",
    "base_results = code_classification_evaluator.compute(\n",
    "    model_or_pipeline=code_classification_pipe,\n",
    "    data=validation_sample,\n",
    "    input_column=\"func_code_string\",\n",
    "    label_column=\"language\",\n",
    "    label_mapping=LABEL_MAPPING,\n",
    "    metric=metrics,\n",
    ")\n",
    "\n",
    "quantized_results = code_classification_evaluator.compute(\n",
    "    model_or_pipeline=quantized_code_classification_pipe,\n",
    "    data=validation_sample,\n",
    "    input_column=\"func_code_string\",\n",
    "    label_column=\"language\",\n",
    "    label_mapping=LABEL_MAPPING,\n",
    "    metric=metrics,\n",
    ")\n",
    "\n",
    "results_df = pd.DataFrame.from_records([base_results, quantized_results], index=[\"base\", \"quantized\"])\n",
    "results_df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Additional resources\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "- [Grammatical Error Correction with OpenVINO](https://github.com/openvinotoolkit/openvino_notebooks/blob/main/notebooks/214-grammar-correction/214-grammar-correction.ipynb)\n",
    "- [Quantize a Hugging Face Question-Answering Model with OpenVINO](https://github.com/huggingface/optimum-intel/blob/main/notebooks/openvino/question_answering_quantization.ipynb)**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Clean up\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "Uncomment and run cell below to delete all resources cached locally in ./model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "# import os\n",
    "# import shutil\n",
    "\n",
    "# try:\n",
    "#     shutil.rmtree(path=QUANTIZED_MODEL_LOCAL_PATH)\n",
    "#     shutil.rmtree(path=MODEL_LOCAL_PATH)\n",
    "#     os.remove(path=\"./compressed_graph.dot\")\n",
    "#     os.remove(path=\"./original_graph.dot\")\n",
    "# except FileNotFoundError:\n",
    "#     print(\"Directory was already deleted\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.10"
  },
  "widgets": {
   "application/vnd.jupyter.widget-state+json": {
    "state": {},
    "version_major": 2,
    "version_minor": 0
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}