{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Layerwise learning for quantum neural networks\n",
    "\n",
    "Notebook created by Felipe Oyarce, felipe.oyarce94@gmail.com\n",
    "\n",
    "In this project we’ve implemented a strategy presented by [Skolik et al., 2020](https://arxiv.org/abs/2006.14904) (check the [implementation](https://github.com/tensorflow/quantum/blob/research/layerwise_learning/layerwise_learning.ipynb) in Tensorflow Quantum) for effectively quantum neural networks. In layerwise learning the strategy is to gradually increase the number of parameters by adding a few layers and training them while freezing the parameters of previous layers already trained.\n",
    "An easy way for understanding this technique is to think that we’re dividing the problem into smaller circuits to successfully avoid to fall into [Barren Plateaus](https://arxiv.org/abs/1803.11173). Here, we provide a proof-of-concept for the implementation of this technique in Pennylane’s Pytorch interface.\n",
    "\n",
    "The task selected for this _proof-of-concept_ is the same used in the original paper for the binary classification between the handwritten digits _3_ and _6_ in the MNIST dataset."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Pennylane-Pytorch implementation in MNIST dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import random\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "# Pennylane\n",
    "import pennylane as qml\n",
    "from pennylane import numpy as np\n",
    "\n",
    "# Pytorch\n",
    "import torch\n",
    "import torch.nn as nn\n",
    "import torch.optim as optim\n",
    "import torchvision.datasets as datasets\n",
    "import torchvision.transforms as transforms\n",
    "from torch.utils.data.sampler import SubsetRandomSampler"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Parameters"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "n_qubits = 9\n",
    "n_layer_steps = 3\n",
    "n_layers_to_add = 2\n",
    "batch_size = 128\n",
    "epochs = 5"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We configure PyTorch to use CUDA only if available. Otherwise the CPU is used."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/home/felipeoyarce/Desktop/layerwise-learning/.venv/lib/python3.7/site-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx (Triggered internally at  /pytorch/c10/cuda/CUDAFunctions.cpp:100.)\n",
      "  return torch._C._cuda_getDeviceCount() > 0\n"
     ]
    }
   ],
   "source": [
    "device = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We initialize a PennyLane device with a `lightning.qubit` [backend](https://pennylane-lightning.readthedocs.io/en/latest/devices.html). "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "dev = qml.device(\"lightning.qubit\", wires=n_qubits)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Data pre-processing\n",
    "\n",
    "In `data_transforms`, we compose several transformations to the images in order to reduce their sizes and construct a flatten vector while keeping meaningful information to being able to \"learn\" the difference between digits in a quantum neural network. Feel free to explore and try different representation of the data such as learned embeddings or dimensionality reduction approaches.\n",
    "\n",
    "#### Transformations\n",
    "- [CenterCrop](https://pytorch.org/docs/stable/torchvision/transforms.html#torchvision.transforms.CenterCrop): Crops the given image at the center.\n",
    "- [Resize](https://pytorch.org/docs/stable/torchvision/transforms.html#torchvision.transforms.Resize): Resize the input image to the given size.\n",
    "- [ToTensor](https://pytorch.org/docs/stable/torchvision/transforms.html#torchvision.transforms.ToTensor): Converts the images with values in the range [0, 255] to tensors with values in the range [0,1].\n",
    "- Flatten: [Lambda](https://pytorch.org/docs/stable/torchvision/transforms.html#torchvision.transforms.Lambda) that applies a lambda function to flatten the image into a vector."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "data_transforms = transforms.Compose([transforms.CenterCrop(18), #crop the image to a 18x18 image\n",
    "                                      transforms.Resize(3), #resize to a 3x3 image\n",
    "                                      transforms.ToTensor(), #convert to tensor\n",
    "                                      transforms.Lambda(lambda x: torch.flatten(x)) #obtain a vector by flatten the image\n",
    "                                     ])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1968\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/home/felipeoyarce/Desktop/layerwise-learning/.venv/lib/python3.7/site-packages/ipykernel_launcher.py:18: UserWarning: This overload of nonzero is deprecated:\n",
      "\tnonzero()\n",
      "Consider using one of the following signatures instead:\n",
      "\tnonzero(*, bool as_tuple) (Triggered internally at  /pytorch/torch/csrc/utils/python_arg_parser.cpp:882.)\n"
     ]
    }
   ],
   "source": [
    "# Download the MNIST dataset and apply the composition of transformations.\n",
    "train_set = datasets.MNIST(root='./data', train=True, download=True, transform=data_transforms)\n",
    "test_set = datasets.MNIST(root='./data', train=False, download=True, transform=data_transforms)\n",
    "\n",
    "# Change labels of digits '3' and '6' to be 0 and 1, respectively.\n",
    "# Note that first we must change the labels of the digits '0' and '1'\n",
    "train_set.targets[train_set.targets == 1] = 10\n",
    "train_set.targets[train_set.targets == 0] = 10\n",
    "train_set.targets[train_set.targets == 3] = 0\n",
    "train_set.targets[train_set.targets == 6] = 1\n",
    "\n",
    "test_set.targets[test_set.targets == 1] = 10\n",
    "test_set.targets[test_set.targets == 0] = 10\n",
    "test_set.targets[test_set.targets == 3] = 0\n",
    "test_set.targets[test_set.targets == 6] = 1\n",
    "\n",
    "# Filter to just images of '3's and '6's\n",
    "subset_indices_train = ((train_set.targets == 0) + (train_set.targets == 1)).nonzero().view(-1)\n",
    "subset_indices_test = ((test_set.targets == 0) + (test_set.targets == 1)).nonzero().view(-1)\n",
    "\n",
    "print(len(subset_indices_test))\n",
    "\n",
    "# Select just a subset of the training set. \n",
    "# Increase the number of examples for more accurate results\n",
    "NUM_EXAMPLES = 1000\n",
    "subset_indices_train = subset_indices_train[:NUM_EXAMPLES]\n",
    "\n",
    "# DataLoaders\n",
    "train_loader = torch.utils.data.DataLoader(train_set, batch_size=batch_size, shuffle=False,\n",
    "                                          sampler=SubsetRandomSampler(subset_indices_train))\n",
    "test_loader = torch.utils.data.DataLoader(test_set, batch_size=batch_size, shuffle=False,\n",
    "                                         sampler=SubsetRandomSampler(subset_indices_test))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Data distribution"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "495 images of digit '3'.\n",
      "505 images of digit '6'.\n"
     ]
    }
   ],
   "source": [
    "k = 0\n",
    "for x, y in train_loader:\n",
    "    for i in range(y.shape[0]):\n",
    "        if y[i].item() == 0:\n",
    "            k += 1\n",
    "print(f\"{k} images of digit '3'.\")\n",
    "print(f\"{NUM_EXAMPLES - k} images of digit '6'.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Utility functions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "def set_random_gates(n_qubits):\n",
    "    \"\"\"Utility function for creating a list\n",
    "    of random gates chosen from gate_set.\n",
    "    \n",
    "    The returned list has a length of n_qubits.\n",
    "    \n",
    "    Arguments:\n",
    "        n_qubits (int): Integer number indicating\n",
    "            the number of qubits of the quantum\n",
    "            circuit.\n",
    "            \n",
    "    Returns:\n",
    "        chosen_gates (list): List of length equal\n",
    "            to n_qubits containing RX, RY and RZ\n",
    "            rotations randomly chosen.\n",
    "    \"\"\"\n",
    "    \n",
    "    gate_set = [qml.RX, qml.RY, qml.RZ]\n",
    "    chosen_gates = []\n",
    "    for i in range(n_qubits):\n",
    "        chosen_gate = random.choice(gate_set)\n",
    "        chosen_gates.append(chosen_gate)\n",
    "    return chosen_gates\n",
    "\n",
    "def total_elements(array_list):\n",
    "    \"\"\"Utility function that returns the total number\n",
    "    of elements in a list of lists.\n",
    "    \n",
    "    Arguments:\n",
    "        array_list (list[list]): List of lists.\n",
    "    \n",
    "    Returns:\n",
    "        (int): Total number of elements in array_list.\n",
    "    \"\"\"\n",
    "\n",
    "    flattened = [val for sublist in array_list for val in sublist]\n",
    "    return len(flattened)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Define lists to update new gates and trained weights"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Lists to update the new gates and trained weights.\n",
    "layer_gates = []\n",
    "layer_weights = []"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Phase I: Increasing the circuit depth"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "def apply_layer(gates, weights):\n",
    "    \"\"\"Function to apply the layer composed of\n",
    "    of RX, RY and RZ to each qubit in the circuit\n",
    "    (just one gate per qubit, randomly chosen) with\n",
    "    their respective parameters. Then, apply CZ gates\n",
    "    in a ladder structure.\n",
    "    \n",
    "    Arguments:\n",
    "        gates: List of single qubit gates to apply in\n",
    "            the circuit. Length equal to the number\n",
    "            of qubits of the circuit.\n",
    "        \n",
    "        weights: List of parameters to apply in each\n",
    "            gate from gates. Length equal to the \n",
    "            number of qubits of the circuit.\n",
    "        \n",
    "    Returns:\n",
    "        None\n",
    "    \"\"\"\n",
    "    \n",
    "    # Apply single qubit gates with their weights.\n",
    "    for i in range(n_qubits): \n",
    "        gates[i](weights[i], wires = i)\n",
    "\n",
    "    # Apply CZ gates to each pair of qubits in ladder structure.\n",
    "    for i in range(n_qubits-1):\n",
    "        qml.CZ(wires=[i, i+1])\n",
    "        \n",
    "#Function for non-trainable part of the quantum circuit\n",
    "def apply_frozen_layers(frozen_layer_gates, frozen_layer_weights):\n",
    "    \"\"\"Function that applies multiple layers to the quantum\n",
    "    circuit. The main purpose of this function is to use it\n",
    "    for applying the layers already trained during Phase I of\n",
    "    layerwise learning.\n",
    "    \n",
    "    Arguments:\n",
    "        frozen_layer_gates: List of lists containing the qubit\n",
    "            rotations per layer to apply to the circuit.\n",
    "            List of \"shape\" (number layers, number qubits).\n",
    "        \n",
    "        frozen_layer_weights: List of lists containing the\n",
    "            parameters (angles) to each rotation in\n",
    "            frozen_layer_gates. List of \"shape\" (number layers, number qubits).\n",
    "    \n",
    "    Returns:\n",
    "        None\n",
    "    \"\"\"\n",
    "\n",
    "    for i in range(len(frozen_layer_gates)):\n",
    "        apply_layer(frozen_layer_gates[i], frozen_layer_weights[i])\n",
    "\n",
    "@qml.qnode(dev, interface=\"torch\")\n",
    "def quantum_net(inputs, new_weights):\n",
    "    \"\"\"Quantum network to train during Phase I of\n",
    "    layerwise learning. The data inputs are encoded\n",
    "    using an Angle Embedding with X rotations. Then, \n",
    "    we apply the non-trainable layers or frozen layers\n",
    "    using the two lists called layer_gates and layer_weights\n",
    "    that store the randomly selected single qubit rotations\n",
    "    and their trained weights in previous steps of layerwise\n",
    "    learning. Finally, n_layers_to_add is an integer number that\n",
    "    indicates the number of trainable layers to add in\n",
    "    each step of Phase I.\n",
    "    \n",
    "    Arguments:\n",
    "        inputs: Tensor data.\n",
    "        new_weights: New paramters to be train of shape\n",
    "            (n_layers_to_add, n_qubits).\n",
    "            \n",
    "    Returns:\n",
    "        (float): Expectation value of an Z measurement in the\n",
    "            last qubit of the circuit.\n",
    "    \"\"\"\n",
    "\n",
    "    # Encode the data with Angle Embedding\n",
    "    qml.templates.AngleEmbedding(inputs, wires=range(n_qubits))\n",
    "    \n",
    "    # Apply frozen layers\n",
    "    apply_frozen_layers(layer_gates, layer_weights)\n",
    "    \n",
    "    # Apply layers with trainable parameters\n",
    "    for i in range(n_layers_to_add):\n",
    "        apply_layer(new_gates[i], new_weights[i])\n",
    "        \n",
    "    # Expectation value of the last qubit\n",
    "    return qml.expval(qml.PauliZ(n_qubits-1))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Phase I step: 1\n",
      "Average loss over epoch 1: 0.9071\n",
      "Average loss over epoch 2: 0.9058\n",
      "Average loss over epoch 3: 0.9077\n",
      "Average loss over epoch 4: 0.9057\n",
      "Average loss over epoch 5: 0.9094\n",
      "Trained parameters: 18\n",
      "Layer weights: 18\n",
      "Number of layers: 2\n",
      "\n",
      "Phase I step: 2\n",
      "Average loss over epoch 1: 0.9037\n",
      "Average loss over epoch 2: 0.8973\n",
      "Average loss over epoch 3: 0.8875\n",
      "Average loss over epoch 4: 0.8786\n",
      "Average loss over epoch 5: 0.8689\n",
      "Trained parameters: 18\n",
      "Layer weights: 36\n",
      "Number of layers: 4\n",
      "\n",
      "Phase I step: 3\n",
      "Average loss over epoch 1: 0.8596\n",
      "Average loss over epoch 2: 0.8481\n",
      "Average loss over epoch 3: 0.8380\n",
      "Average loss over epoch 4: 0.8293\n",
      "Average loss over epoch 5: 0.8183\n",
      "Trained parameters: 18\n",
      "Layer weights: 54\n",
      "Number of layers: 6\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# Sigmoid function and Binary Cross Entropy loss\n",
    "sigmoid = nn.Sigmoid()\n",
    "loss = nn.BCELoss()\n",
    "\n",
    "for step in range(n_layer_steps):\n",
    "    \n",
    "    print(f\"Phase I step: {step+1}\")\n",
    "    \n",
    "    # Obtain random gates for each new layer.\n",
    "    new_gates = [set_random_gates(n_qubits) for i in range(n_layers_to_add)]\n",
    "    \n",
    "    # Define shape of the weights\n",
    "    weight_shapes = {\"new_weights\": (n_layers_to_add, n_qubits)}\n",
    "    \n",
    "    # Quantum net as a TorchLayer\n",
    "    qlayer = qml.qnn.TorchLayer(quantum_net, weight_shapes, init_method = nn.init.zeros_)\n",
    "    \n",
    "    # Create Sequential Model\n",
    "    model = torch.nn.Sequential(qlayer, sigmoid)\n",
    "    \n",
    "    # Optimizer\n",
    "    opt = optim.Adam(model.parameters(), lr=0.01)\n",
    "    \n",
    "    batches = NUM_EXAMPLES // batch_size\n",
    "    for epoch in range(epochs):\n",
    "        running_loss = 0\n",
    "        for x, y in train_loader:\n",
    "            opt.zero_grad()\n",
    "            y = y.to(torch.float32)\n",
    "            loss_evaluated = loss(model(x), y)\n",
    "            loss_evaluated.backward()\n",
    "            running_loss += loss_evaluated\n",
    "\n",
    "            opt.step()\n",
    "        avg_loss = running_loss / batches\n",
    "        print(\"Average loss over epoch {}: {:.4f}\".format(epoch + 1, avg_loss))\n",
    "    \n",
    "    # Extract weights after optimization to be save in layer_weights\n",
    "    for param in model.parameters():\n",
    "        new_weights = param.data\n",
    "    new_weights = new_weights.tolist()\n",
    "    print(f\"Trained parameters: {total_elements(new_weights)}\")\n",
    "\n",
    "    layer_gates += new_gates\n",
    "    layer_weights += new_weights\n",
    "    print(f\"Layer weights: {total_elements(layer_weights)}\")\n",
    "    print(f\"Number of layers: {len(layer_gates)}\")\n",
    "    print(\"\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Phase II: Split to circuit into pieces"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Define partition of the circuit to train in each step.\n",
    "# Here we train the circuit by halves.\n",
    "partition_percentage = 0.5\n",
    "partition_size = int(n_layer_steps*n_layers_to_add*partition_percentage)\n",
    "n_partition_weights = partition_size*n_qubits\n",
    "n_sweeps = 2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def edit_model_parameters(model, new_parameters):\n",
    "    \"\"\"Function for editing the initial parameters\n",
    "    of a Sequential model in Pytorch to be a given\n",
    "    tensor as the initial parameters of the model.\n",
    "    This function is useful for Phase II because the\n",
    "    initial parameters in this phase are the trained\n",
    "    weights from Phase I.\n",
    "    \n",
    "    Arguments:\n",
    "        model (torch.nn.Sequential): In this case the Sequential\n",
    "            model in Pytorch with a TorchLayer from Pennylane.\n",
    "            Our quantum neural network.\n",
    "        \n",
    "        new_parameters (torch.nn.Parameter): The new parameters\n",
    "            that we want in the model as initial weights.\n",
    "        \n",
    "    Returns:\n",
    "        model (torch.nn.Sequential): The model with the new\n",
    "            model.parameters().\n",
    "    \"\"\"\n",
    "    \n",
    "    old_params = {}\n",
    "    for name, params in model.named_parameters():\n",
    "        old_params[name] = params.clone()\n",
    "    \n",
    "    old_params[\"0.partition_weights\"] = new_parameters\n",
    "    \n",
    "    for name, params in model.named_parameters():\n",
    "        params.data.copy_(old_params[name])\n",
    "        \n",
    "    return model\n",
    "\n",
    "def get_partition(layer_weights, partition, partition_size):\n",
    "    \"\"\"Function to get the first or second partition of an\n",
    "    array given a partition size. This function is useful\n",
    "    to avoid repeating our code in Phase II.\n",
    "    \n",
    "    Arguments:\n",
    "        layer_weights: List of lists containing the\n",
    "            parameters (angles) to each rotation in\n",
    "            layer_gates. List of \"shape\" (number layers, number qubits).\n",
    "            \n",
    "        partition (int): In this example it can be 1 or 2 to indicate\n",
    "            the partition.\n",
    "            \n",
    "        partition_size (int): Integer that tells you the layer in which\n",
    "            the partition is made.\n",
    "            \n",
    "    Returns:\n",
    "        Partition of layer_weights, first or second partition.\n",
    "    \"\"\"\n",
    "    \n",
    "    if partition == 1:\n",
    "        return layer_weights[:partition_size]\n",
    "    if partition == 2:\n",
    "        return layer_weights[partition_size:]\n",
    "    \n",
    "def save_trained_partition(layer_weights, trained_weights, partition, partition_size):\n",
    "    \"\"\"Function to update layer weights after training a partition.\n",
    "    \n",
    "    Arguments:\n",
    "        layer_weights: List of lists containing the\n",
    "            parameters (angles) to each rotation in\n",
    "            layer_gates. List of \"shape\" (number layers, number qubits).\n",
    "        \n",
    "        trained_weights: Trained weights after training a partition\n",
    "            of the circuit, could be first or second partition.\n",
    "        \n",
    "        partition (int): In this example it can be 1 or 2 to indicate\n",
    "            the partition.\n",
    "            \n",
    "        partition_size (int): Integer that tells you the layer in which\n",
    "            the partition is made.\n",
    "            \n",
    "    Returns:\n",
    "        None\n",
    "    \"\"\"\n",
    "    \n",
    "    if partition == 1:\n",
    "        layer_weights[:partition_size] = trained_weights\n",
    "    if partition == 2:\n",
    "        layer_weights[partition_size:] = trained_weights"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "@qml.qnode(dev, interface=\"torch\")\n",
    "def train_partition(inputs, partition_weights):\n",
    "    \"\"\"Qnode defined to train just a partition of \n",
    "    the quantum circuit after Phase I. This function\n",
    "    supports just a partition in two pieces of the\n",
    "    circuit. If partition == 1 is going to treat as\n",
    "    trainable the first portion of the circuit and if\n",
    "    partition == 2, the second portion is going to be\n",
    "    trainable.\n",
    "    \n",
    "    Arguments:\n",
    "        inputs: Tensor data.\n",
    "        partition_weights: Partition of the weights to be\n",
    "            trained. Shape (len(partition_weights, n_qubits).\n",
    "    \n",
    "    Returns:\n",
    "        (float): Expectation value of an Z measurement in the\n",
    "            last qubit of the circuit.\n",
    "    \"\"\"\n",
    "\n",
    "    #Encode the data with Angle Embedding\n",
    "    qml.templates.AngleEmbedding(inputs, wires=range(n_qubits))\n",
    "    \n",
    "    if partition == 1:\n",
    "        # Apply trainable partition first\n",
    "        for i in range(len(layer_gates[:partition_size])):\n",
    "            apply_layer(layer_gates[:partition_size][i], partition_weights[i])\n",
    "        \n",
    "        #Apply non-trainable partition\n",
    "        for i in range(len(layer_gates[partition_size:])):\n",
    "            apply_layer(layer_gates[partition_size:][i], layer_weights[partition_size:][i])\n",
    "    \n",
    "    elif partition == 2:\n",
    "        # Apply non-trainable partition first\n",
    "        for i in range(len(layer_gates[:partition_size])):\n",
    "            apply_layer(layer_gates[:partition_size][i], layer_weights[:partition_size][i])\n",
    "        \n",
    "        # Apply trainable partition\n",
    "        for i in range(len(layer_gates[partition_size:])):\n",
    "            apply_layer(layer_gates[partition_size:][i], partition_weights[i])\n",
    "    \n",
    "    # Expectation value of the last qubit\n",
    "    return qml.expval(qml.PauliZ(n_qubits-1))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Sweep: 1, partition: 1\n",
      "Average loss over epoch 1: 0.8106\n",
      "Average loss over epoch 2: 0.8023\n",
      "Average loss over epoch 3: 0.7974\n",
      "Average loss over epoch 4: 0.7922\n",
      "Average loss over epoch 5: 0.7887\n",
      "Trained parameters: 27\n",
      "Sweep: 1, partition: 2\n",
      "Average loss over epoch 1: 0.7857\n",
      "Average loss over epoch 2: 0.7806\n",
      "Average loss over epoch 3: 0.7768\n",
      "Average loss over epoch 4: 0.7734\n",
      "Average loss over epoch 5: 0.7697\n",
      "Trained parameters: 27\n",
      "Sweep: 2, partition: 1\n",
      "Average loss over epoch 1: 0.7668\n",
      "Average loss over epoch 2: 0.7653\n",
      "Average loss over epoch 3: 0.7639\n",
      "Average loss over epoch 4: 0.7626\n",
      "Average loss over epoch 5: 0.7618\n",
      "Trained parameters: 27\n",
      "Sweep: 2, partition: 2\n",
      "Average loss over epoch 1: 0.7607\n",
      "Average loss over epoch 2: 0.7572\n",
      "Average loss over epoch 3: 0.7538\n",
      "Average loss over epoch 4: 0.7509\n",
      "Average loss over epoch 5: 0.7483\n",
      "Trained parameters: 27\n"
     ]
    }
   ],
   "source": [
    "for sweep in range(n_sweeps):\n",
    "    \n",
    "    for partition in [1,2]:\n",
    "        print(f\"Sweep: {sweep+1}, partition: {partition}\")\n",
    "        # Get partition\n",
    "        trainable_weights = get_partition(layer_weights, partition, partition_size)\n",
    "\n",
    "        # Define shape of the weights\n",
    "        weight_shapes = {\"partition_weights\": (len(trainable_weights), n_qubits)}\n",
    "\n",
    "        # Quantum net as a TorchLayer\n",
    "        qlayer = qml.qnn.TorchLayer(train_partition, weight_shapes, init_method = nn.init.zeros_)\n",
    "\n",
    "        init_weights = nn.Parameter(torch.tensor(trainable_weights))\n",
    "\n",
    "        # Create Sequential Model\n",
    "        model = torch.nn.Sequential(qlayer, sigmoid)\n",
    "\n",
    "        # Edit model initial parameters to be init_weights\n",
    "        model = edit_model_parameters(model, init_weights)\n",
    "\n",
    "        # Optimizer\n",
    "        opt = optim.Adam(model.parameters(), lr=0.01)\n",
    "\n",
    "        batches = NUM_EXAMPLES // batch_size\n",
    "        for epoch in range(epochs):\n",
    "            running_loss = 0\n",
    "            for x, y in train_loader:\n",
    "                opt.zero_grad()\n",
    "                y = y.to(torch.float32)\n",
    "                loss_evaluated = loss(model(x), y)\n",
    "                loss_evaluated.backward()\n",
    "                running_loss += loss_evaluated\n",
    "\n",
    "                opt.step()\n",
    "            avg_loss = running_loss / batches\n",
    "            print(\"Average loss over epoch {}: {:.4f}\".format(epoch + 1, avg_loss))\n",
    "\n",
    "        for param in model.parameters():\n",
    "            trained_weights = param.data\n",
    "        trained_weights = trained_weights.tolist()\n",
    "        print(f\"Trained parameters: {total_elements(trained_weights)}\")\n",
    "\n",
    "        save_trained_partition(layer_weights, trained_weights, partition, partition_size)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Results"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Train accuracy: 0.7672776442307693\n"
     ]
    }
   ],
   "source": [
    "train_accuracy = 0\n",
    "for x, y in train_loader:\n",
    "    probs = model(x)\n",
    "    preds = (probs>0.5).float()\n",
    "    train_accuracy += torch.sum(preds == y).item()/preds.shape[0]\n",
    "print(f\"Train accuracy: {train_accuracy/len(train_loader)}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Test accuracy: 0.7674153645833334\n"
     ]
    }
   ],
   "source": [
    "test_accuracy = 0\n",
    "for x, y in test_loader:\n",
    "    probs = model(x)\n",
    "    preds = (probs>0.5).float()\n",
    "    test_accuracy += torch.sum(preds == y).item()/preds.shape[0]\n",
    "print(f\"Test accuracy: {test_accuracy/len(test_loader)}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## References\n",
    "\n",
    "[[1]](https://arxiv.org/abs/1803.11173) McClean et al., 2018. Barren plateaus in quantum neural network training landscapes.\n",
    "\n",
    "[[2]](https://arxiv.org/abs/2006.14904) Skolik et al., 2020. \n",
    "Layerwise learning for quantum neural networks.\n",
    "\n",
    "[[3]](https://github.com/tensorflow/quantum/blob/research/layerwise_learning/layerwise_learning.ipynb) Notebook with the implementation in Tensorflow Quantum by the paper's author.\n",
    "\n",
    "[[4]](https://blog.tensorflow.org/2020/08/layerwise-learning-for-quantum-neural-networks.html) Tensorflow quantum Blog post about layerwise learning.\n",
    "\n",
    "[[5]](https://www.youtube.com/watch?v=lz8BOz5KPZg) Tensorflow quantum YouTube's video about layerwise learning."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}