{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Copyright (c) Microsoft Corporation. All rights reserved.\n", "\n", "Licensed under the MIT License.\n", "\n", "\n", "# Deployment of a model to Azure Kubernetes Service (AKS)\n", "\n", "## Table of contents\n", "1. [Introduction](#intro)\n", "1. [Model deployment on AKS](#deploy)\n", " 1. [Workspace retrieval](#workspace)\n", " 1. [Docker image retrieval](#docker_image)\n", " 1. [AKS compute target creation](#compute)\n", " 1. [Monitoring activation](#monitor)\n", " 1. [Service deployment](#svc_deploy)\n", "1. [Clean up](#clean)\n", "1. [Next steps](#next)\n", "\n", "\n", "## 1. Introduction \n", "\n", "In many real life scenarios, trained machine learning models need to be deployed to production. As we saw in the [prior](21_deployment_on_azure_container_instances.ipynb) deployment notebook, this can be done by deploying on Azure Container Instances. In this tutorial, we will get familiar with another way of implementing a model into a production environment, this time using [Azure Kubernetes Service](https://docs.microsoft.com/en-us/azure/aks/concepts-clusters-workloads) (AKS).\n", "\n", "AKS manages hosted Kubernetes environments. It makes it easy to deploy and manage containerized applications without container orchestration expertise. It also supports deployments with CPU clusters and deployments with GPU clusters.\n", "\n", "At the end of this tutorial, we will have learned how to:\n", "\n", "- Deploy a model as a web service using AKS\n", "- Monitor our new service." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/ComputerVision/classification/notebooks/22_deployment_on_azure_kubernetes_service.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Pre-requisites \n", "\n", "This notebook relies on resources we created in [21_deployment_on_azure_container_instances.ipynb](21_deployment_on_azure_container_instances.ipynb):\n", "- Our Azure Machine Learning workspace\n", "- The Docker image that contains the model and scoring script needed for the web service to work.\n", "\n", "If we are missing any of these, we should go back and run the steps from the sections \"Pre-requisites\" to \"3.D Environment setup\" to generate them.\n", "\n", "### Library import \n", "\n", "Now that our prior resources are available, let's first import a few libraries we will need for the deployment on AKS." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# For automatic reloading of modified libraries\n", "%reload_ext autoreload\n", "%autoreload 2\n", "\n", "import sys\n", "sys.path.extend([\"..\", \"../..\"]) # to access the utils_cv library\n", "\n", "# Azure\n", "from azureml.core import Workspace\n", "from azureml.core.compute import AksCompute, ComputeTarget\n", "from azureml.core.webservice import AksWebservice, Webservice" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Model deployment on AKS \n", "\n", "### 2.A Workspace retrieval \n", "\n", "Let's now load the workspace we used in the [prior notebook](21_deployment_on_azure_container_instances.ipynb).\n", "\n", "Note: The Docker image we will use below is attached to that workspace. It is then important to use the same workspace here. 
If, for any reason, we needed to use another workspace, we would have to repeat here the steps of the prior notebook that create a Docker image containing our image classifier model." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To create or access an Azure ML workspace, we need the following information. If we are coming from the previous notebook, we can retrieve the existing workspace. If we are starting with this notebook, a new workspace gets created.\n", "\n", "- subscription ID: the ID of the Azure subscription we are using\n", "- resource group: the name of the resource group in which our workspace resides\n", "- workspace region: the geographical area in which our workspace resides (e.g. \"eastus2\"; note the lack of spaces)\n", "- workspace name: the name of the workspace we want to create or retrieve.\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "tags": [ "parameters" ] }, "outputs": [], "source": [ "subscription_id = \"YOUR_SUBSCRIPTION_ID\"\n", "resource_group = \"YOUR_RESOURCE_GROUP_NAME\" \n", "workspace_name = \"YOUR_WORKSPACE_NAME\" \n", "workspace_region = \"YOUR_WORKSPACE_REGION\" # Possible values: eastus, eastus2, and so on.\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In a [prior notebook](20_azure_workspace_setup.ipynb), we created a workspace. This is a critical object from which we will build all the pieces we need to deploy our model as a web service. Let's start by retrieving it." 
] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "WARNING - Warning: Falling back to use azure cli login credentials.\n", "If you run your code in unattended mode, i.e., where you can't give a user input, then we recommend to use ServicePrincipalAuthentication or MsiAuthentication.\n", "Please refer to aka.ms/aml-notebook-auth for different authentication mechanisms in azureml-sdk.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Workspace name: amlnotebookws\n", "Workspace region: eastus\n", "Resource group: amlnotebookrg\n" ] } ], "source": [ "# A util method that creates a workspace or retrieves one if it exists, also takes care of Azure Authentication\n", "from utils_cv.common.azureml import get_or_create_workspace\n", "\n", "ws = get_or_create_workspace(\n", " subscription_id,\n", " resource_group,\n", " workspace_name,\n", " workspace_region)\n", "\n", "\n", "# Print the workspace attributes\n", "print('Workspace name: ' + ws.name, \n", " 'Workspace region: ' + ws.location, \n", " 'Subscription id: ' + ws.subscription_id, \n", " 'Resource group: ' + ws.resource_group, sep = '\\n')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2.B Docker image retrieval \n", "\n", "We can reuse the Docker image we created in section 3. of the [previous tutorial](21_deployment_on_azure_container_instances.ipynb). Let's make sure that it is still available." 
] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Docker images:\n", " --> Name: image-classif-resnet18-f48\n", " --> ID: image-classif-resnet18-f48:2\n", " --> Tags: {'training set': 'ImageNet', 'architecture': 'CNN ResNet18', 'type': 'Pretrained'}\n", " --> Creation time: 2019-07-18 17:51:26.927240+00:00\n", "\n" ] } ], "source": [ "print(\"Docker images:\")\n", "for docker_im in ws.images: \n", " print(f\" --> Name: {ws.images[docker_im].name}\\n \\\n", " --> ID: {ws.images[docker_im].id}\\n \\\n", " --> Tags: {ws.images[docker_im].tags}\\n \\\n", " --> Creation time: {ws.images[docker_im].created_time}\\n\"\n", " )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As we did not delete it in the prior notebook, our Docker image is still present in our workspace. Let's retrieve it." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "docker_image = ws.images[\"image-classif-resnet18-f48\"]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also check that the model it contains is the one we registered and used during our deployment on ACI. In our case, the Docker image contains only 1 model, so taking the 0th element of the `docker_image.models` list returns our model.\n", "\n", "Note: We will not use the `registered_model` object anywhere here. We are running the next 2 cells just for verification purposes." 
] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Existing model:\n", " --> Name: im_classif_resnet18\n", " --> Version: 8\n", " --> ID: im_classif_resnet18:8 \n", " --> Creation time: 2019-07-18 17:51:17.521804+00:00\n", " --> URL: aml://asset/5c63dec5ea424557838d109d3294b611\n" ] } ], "source": [ "registered_model = docker_image.models[0]\n", "\n", "print(f\"Existing model:\\n --> Name: {registered_model.name}\\n \\\n", "--> Version: {registered_model.version}\\n --> ID: {registered_model.id} \\n \\\n", "--> Creation time: {registered_model.created_time}\\n \\\n", "--> URL: {registered_model.url}\"\n", " )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2.C AKS compute target creation\n", "\n", "In the case of deployment on AKS, in addition to the Docker image, we need to define computational resources. This is typically a cluster of CPUs or a cluster of GPUs. If we already have a Kubernetes-managed cluster in our workspace, we can use it; otherwise, we can create a new one.\n", "\n", "Note: The name we give to our compute target must be between 2 and 16 characters long.\n", "\n", "Let's first check what types of compute resources we have, if any." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "List of compute resources associated with our workspace:\n" ] } ], "source": [ "print(\"List of compute resources associated with our workspace:\")\n", "for cp in ws.compute_targets:\n", "    print(f\" --> {cp}: {ws.compute_targets[cp]}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the case where we have no compute resource available, we can create a new one. For this, we can choose between a CPU-based or a GPU-based cluster of virtual machines. The latter is typically better suited for web services with high traffic (i.e. 
> 100 requests per second) and high GPU utilization. There is a [wide variety](https://docs.microsoft.com/en-us/azure/virtual-machines/windows/sizes-general) of machine types that can be used. In the present example, however, we will not need the fastest machines that exist or the most memory-optimized ones. We will use typical default machines:\n", "- [Standard D3 V2](https://docs.microsoft.com/en-us/azure/virtual-machines/windows/sizes-general#dv2-series):\n", " - 4 vCPUs\n", " - 14 GB of memory\n", "- [Standard NC6](https://docs.microsoft.com/en-us/azure/virtual-machines/windows/sizes-gpu):\n", " - 1 GPU\n", " - 12 GB of GPU memory\n", " - These machines also have 6 vCPUs and 56 GB of memory.\n", "\n", "Notes:\n", "- These are Azure-specific denominations\n", "- Information on optimized machines can be found [here](https://docs.microsoft.com/en-us/azure/virtual-machines/windows/sizes-general#other-sizes)\n", "- When configuring the provisioning of an AKS cluster, we need to choose a type of machine, as exemplified above. The number of virtual machines we require (also called `agent nodes`) multiplied by the number of vCPUs on each machine must be greater than or equal to 12 vCPUs. This is the [minimum needed](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-deploy-and-where#create-a-new-aks-cluster) for such a cluster. By default, a pool of 3 virtual machines gets provisioned on a new AKS cluster to allow for redundancy. So, if the type of virtual machine we choose (`vm_size`) has fewer than 4 vCPUs, we need to increase the number of machines (`agent_count`) such that `agent_count` x (number of vCPUs per machine) ≥ `12` virtual CPUs. For instance, with 2-vCPU machines, we would need at least 6 of them. 
`agent_count` and `vm_size` are both parameters we can pass to the `provisioning_configuration()` method below.\n", "- [This document](https://docs.microsoft.com/en-us/azure/templates/Microsoft.ContainerService/2019-02-01/managedClusters?toc=%2Fen-us%2Fazure%2Fazure-resource-manager%2Ftoc.json&bc=%2Fen-us%2Fazure%2Fbread%2Ftoc.json#managedclusteragentpoolprofile-object) provides the full list of virtual machine types that can be deployed in an AKS cluster\n", "- Additional considerations on deployments using GPUs are available [here](https://docs.microsoft.com/en-us/azure/virtual-machines/windows/sizes-gpu#deployment-considerations)\n", "- If the Azure subscription we are using is shared with other users, we may encounter [quota restrictions](https://docs.microsoft.com/en-us/azure/azure-subscription-service-limits) when trying to create a new cluster. To ensure that we have enough machines left, we can go to the Portal, click on our workspace name, and navigate to the `Usage + quotas` section. If we need more machines than are currently available, we can request a [quota increase](https://docs.microsoft.com/en-us/azure/azure-subscription-service-limits#request-quota-increases).\n", "\n", "Here, we will use a cluster of CPUs. The creation of such resource typically takes several minutes to complete." 
] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Creating..................................................................................................................................................................\n", "SucceededProvisioning operation finished, operation \"Succeeded\"\n", "We created the imgclass-aks-cpu AKS compute target\n" ] } ], "source": [ "# Declare the name of the cluster\n", "virtual_machine_type = 'cpu'\n", "aks_name = f'imgclass-aks-{virtual_machine_type}'\n", "\n", "if aks_name not in ws.compute_targets:\n", "    # Define the type of virtual machines to use\n", "    if virtual_machine_type == 'gpu':\n", "        vm_size_name = \"Standard_NC6\"\n", "    else:\n", "        vm_size_name = \"Standard_D3_v2\"\n", "\n", "    # Configure the cluster using the default configuration (i.e. with 3 virtual machines)\n", "    prov_config = AksCompute.provisioning_configuration(vm_size = vm_size_name, agent_count=3)\n", "\n", "    # Create the cluster\n", "    aks_target = ComputeTarget.create(workspace = ws, \n", "                                      name = aks_name, \n", "                                      provisioning_configuration = prov_config)\n", "    aks_target.wait_for_completion(show_output = True)\n", "    print(f\"We created the {aks_target.name} AKS compute target\")\n", "else:\n", "    # Retrieve the already existing cluster\n", "    aks_target = ws.compute_targets[aks_name]\n", "    print(f\"We retrieved the {aks_target.name} AKS compute target\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we need a more customized AKS cluster, we can provide more parameters to the `provisioning_configuration()` method, the full list of which is available 
[here](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.compute.akscompute?view=azure-ml-py#provisioning-configuration-agent-count-none--vm-size-none--ssl-cname-none--ssl-cert-pem-file-none--ssl-key-pem-file-none--location-none--vnet-resourcegroup-name-none--vnet-name-none--subnet-name-none--service-cidr-none--dns-service-ip-none--docker-bridge-cidr-none-).\n", "\n", "When the cluster deploys successfully, we typically see the following:\n", "\n", "```\n", "Creating ...\n", "SucceededProvisioning operation finished, operation \"Succeeded\"\n", "```\n", "\n", "When our cluster already exists, we get the following message instead:\n", "\n", "```\n", "We retrieved the imgclass-aks-cpu AKS compute target\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This compute target can be seen on the Azure portal, under the `Compute` tab." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The AKS compute target provisioning succeeded -- There were 'None' errors\n" ] } ], "source": [ "# Check provisioning status\n", "print(f\"The AKS compute target provisioning {aks_target.provisioning_state.lower()} -- There were '{aks_target.provisioning_errors}' errors\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The set of resources we will use to deploy our web service on AKS is now provisioned and available.\n", "\n", "### 2.D Monitoring activation \n", "\n", "Once our web app is up and running, it is important to monitor it and to measure the amount of traffic it gets, how long it takes to respond, the types of exceptions that get raised, etc. We will do so through [Application Insights](https://docs.microsoft.com/en-us/azure/azure-monitor/app/app-insights-overview), which is an application performance management service. 
To enable it on our soon-to-be-deployed web service, we first need to turn it on in our AKS deployment configuration:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "# Set the AKS web service configuration and add monitoring to it\n", "aks_config = AksWebservice.deploy_configuration(enable_app_insights=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2.E Service deployment \n", "\n", "We are now ready to deploy our web service. As in the [first](21_deployment_on_azure_container_instances.ipynb) notebook, we will deploy from the Docker image, which contains our image classifier model and the conda environment needed for the scoring script to work properly. The parameters to pass to the `Webservice.deploy_from_image()` command are similar to those used for the deployment on ACI. The only major difference is the compute target (`aks_target`), i.e. the CPU cluster we just spun up.\n", "\n", "Note: This deployment takes a few minutes to complete." 
] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Creating service\n", "Running................................\n", "SucceededAKS service creation operation finished, operation \"Succeeded\"\n", "The web service is Healthy\n" ] } ], "source": [ "# Deploy only if the AKS compute target was provisioned successfully\n", "if aks_target.provisioning_state == \"Succeeded\":\n", "    aks_service_name = 'aks-cpu-image-classif-web-svc'\n", "    aks_service = Webservice.deploy_from_image(\n", "        workspace = ws, \n", "        name = aks_service_name,\n", "        image = docker_image,\n", "        deployment_config = aks_config,\n", "        deployment_target = aks_target\n", "    )\n", "    aks_service.wait_for_deployment(show_output = True)\n", "    print(f\"The web service is {aks_service.state}\")\n", "else:\n", "    raise ValueError(\"The AKS compute target is not provisioned, so the web service cannot be deployed.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When successful, we should see the following:\n", "\n", "```\n", "Creating service\n", "Running ...\n", "SucceededAKS service creation operation finished, operation \"Succeeded\"\n", "The web service is Healthy\n", "```\n", "\n", "In the case where the deployment is not successful, we can look at the service logs to debug. [These instructions](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-troubleshoot-deployment) can also be helpful." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "# Access to the service logs\n", "# print(aks_service.get_logs())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The new deployment can be seen on the portal, under the Deployments tab." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Our web service is up, and is running on AKS." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Clean up \n", " \n", "In a real-life scenario, it is likely that the service we created would need to be up and running at all times. 
However, in this demonstration, and once we have verified that our service works (cf. \"Next steps\" section below), we can delete it as well as all the resources we used.\n", "\n", "In this notebook, the only resource we added to our subscription, in comparison to what we had at the end of the notebook on ACI deployment, is the AKS cluster. There is no fee for cluster management. The only components we are paying for are:\n", "- the cluster nodes\n", "- the managed OS disks.\n", "\n", "Here, we used Standard D3 V2 machines, which come with a temporary storage of 200 GB. Over the course of this tutorial (assuming ~ 1 hour), this added almost nothing to our bill. It is important to understand, however, that each hour during which the cluster is up gets billed, whether the web service is called or not. The same is true for the ACI and workspace we have been using until now.\n", "\n", "To get a better sense of pricing, we can refer to [this calculator](https://azure.microsoft.com/en-us/pricing/calculator/?service=kubernetes-service#kubernetes-service). We can also navigate to the [Cost Management + Billing pane](https://ms.portal.azure.com/#blade/Microsoft_Azure_Billing/ModernBillingMenuBlade/Overview) on the portal, click on our subscription ID, and click on the Cost Analysis tab to check our credit usage.\n", "\n", "If we no longer plan to use this web service, we can turn monitoring off, and delete the compute target, the service itself as well as the associated Docker image." 
] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "# Application Insights deactivation\n", "# aks_service.update(enable_app_insights=False)\n", "\n", "# Service termination\n", "# aks_service.delete()\n", "\n", "# Compute target deletion\n", "# aks_target.delete()\n", "# This command executes fast but the actual deletion of the AKS cluster takes several minutes\n", "\n", "# Docker image deletion\n", "# docker_image.delete()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once the commands above have been uncommented and run, all the service resources we used in this notebook are deleted. We are now only paying for our workspace.\n", "\n", "If our goal is to continue using our workspace, we should keep it available. If, on the contrary, we no longer plan to use it and its associated resources, we can delete it.\n", "\n", "Note: Deleting the workspace will delete all the experiments, outputs, models, Docker images, deployments, etc. that we created in that workspace." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "# ws.delete(delete_dependent_resources=True)\n", "# This deletes our workspace, the container registry, the storage account, Application Insights and the key vault" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4. Next steps \n", "In the [next notebook](23_aci_aks_web_service_testing.ipynb), we will test the web services we deployed on ACI and on AKS." ] } ], "metadata": { "celltoolbar": "Tags", "jupytext": { "formats": "ipynb,py:light" }, "kernelspec": { "display_name": "cv", "language": "python", "name": "cv" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.8" } }, "nbformat": 4, "nbformat_minor": 2 }