{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "%config InlineBackend.figure_format = 'retina'\n", "import matplotlib.pyplot as plt\n", "import torch\n", "import torch.nn as nn\n", "from torchvision import datasets, transforms\n", "from torch.utils.data import Subset\n", "import torch.nn.functional as F\n", "import numpy as np\n", "\n", "from cpmpy import *" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Saving a classifier\n", "\n", "In this notebook, we will use the classifier that you built in p1.\n", "\n", "Hence, first go to that notebook and _export_ the classifier you built there, by adding the following code in that notebook:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#torch.save(model.state_dict(), \"myNet\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loading a pre-trained classifier\n", "\n", "Now, we can load that pre-trained classifier in this notebook as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def MLP():\n", " return nn.Sequential(nn.Flatten(), # Flatten MNIST images into a 784 long vector\n", " nn.Linear(28*28, 120),\n", " nn.ReLU(),\n", " nn.Linear(120, 84), \n", " nn.ReLU(),\n", " nn.Linear(84, 10),\n", " nn.LogSoftmax(dim=1))\n", "\n", "\n", "\n", "class LeNet(nn.Module):\n", " def __init__(self, calibrated=False):\n", " super(LeNet, self).__init__()\n", " self.conv1 = nn.Conv2d(1, 6, 5, 1, padding=2)\n", " self.conv2 = nn.Conv2d(6, 16, 5, 1)\n", " self.fc1 = nn.Linear(5*5*16, 120)\n", " self.fc2 = nn.Linear(120, 84)\n", " self.fc3 = nn.Linear(84,10)\n", "\n", " def forward(self, x):\n", " x = F.relu(self.conv1(x))\n", " x = F.max_pool2d(x, 2, 2)\n", " x = F.relu(self.conv2(x))\n", " x = F.max_pool2d(x, 2, 2)\n", " x = x.view(-1, 5*5*16) \n", " x = F.relu(self.fc1(x))\n", " x = self.fc2(x)\n", " x = self.fc3(x)\n", " return F.log_softmax(x, dim=1)\n", "\n", "def load_clf(clf_classname, path):\n", " net = clf_classname()\n", " state_dict = torch.load(path, map_location=lambda storage, loc: storage)\n", " net.load_state_dict(state_dict)\n", " return net\n", "\n", "model = load_clf(LeNet, 'myNet') # change arguments accordingly\n", "model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We also have to load in the data again:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Define a transform to normalize the data\n", "transform = transforms.Compose([transforms.ToTensor(),\n", " transforms.Normalize((0.5,), (0.5,)),\n", " ])\n", "\n", "# Download and load the training data\n", "trainset = datasets.MNIST('.', download=False, train=True, transform=transform)\n", "testset = datasets.MNIST('.', download=False, train=False, transform=transform)\n", "trainloader = torch.utils.data.DataLoader(trainset, batch_size=128, shuffle=True)\n", "testloader = torch.utils.data.DataLoader(testset, batch_size=128, shuffle=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Recap: solving a sudoku based on the predictions\n", "\n", "In the following, we repeat the code of the previous notebook for sampling a sudoku and getting predictions.\n", "\n", "We also included example _ortools_ code that solves the sudoku problem _(requires to install ortools, e.g. conda install ortools)_" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# sudoku's, from http://hakank.org/minizinc/sudoku_problems2/index.html\n", "\n", "sudoku_p0 = torch.IntTensor([[0,0,0, 2,0,5, 0,0,0],\n", " [0,9,0, 0,0,0, 7,3,0],\n", " [0,0,2, 0,0,9, 0,6,0],\n", " [2,0,0, 0,0,0, 4,0,9],\n", " [0,0,0, 0,7,0, 0,0,0],\n", " [6,0,9, 0,0,0, 0,0,1],\n", " [0,8,0, 4,0,0, 1,0,0],\n", " [0,6,3, 0,0,0, 0,8,0],\n", " [0,0,0, 6,0,8, 0,0,0]])\n", "\n", "# sample a dataset index with that value/label\n", "def sample_by_label(labels, value):\n", " # primitive but it works...\n", " idxs = torch.randperm(len(labels))\n", " for idx in idxs:\n", " if labels[idx] == value:\n", " return idx\n", "# sample a dataset index for each non-zero number\n", "def sample_visual_sudoku(sudoku_p, loader):\n", " for (images, labels) in loader: # sample one batch\n", " nonzero = sudoku_p > 0\n", " vizsudoku = torch.zeros((9,9,1,28,28), dtype=images.dtype)\n", " idxs = torch.LongTensor([sample_by_label(labels, value) for value in sudoku_p[nonzero]])\n", " vizsudoku[nonzero] = images[idxs]\n", " return vizsudoku\n", "def show_grid_img(images):\n", " dim = 9\n", " figure = plt.figure()\n", " num_of_images = dim*dim\n", " for index in range(num_of_images):\n", " plt.subplot(dim, dim, index+1)\n", " plt.axis('off')\n", " plt.imshow(images[index].numpy().squeeze(), cmap='gray_r')\n", "def predict_sudoku(model, vizsudoku):\n", " nonzero = (vizsudoku.reshape([-1,28,28]).sum(dim=(1,2)) != 0).reshape(9,9)\n", " predsudoku = torch.zeros((9,9), dtype=torch.long)\n", " with torch.no_grad():\n", " images = vizsudoku[nonzero]\n", " logprobs = model(images)\n", " preds = torch.argmax(logprobs, dim=1)\n", " predsudoku[nonzero] = preds\n", " return predsudoku\n", " \n", "vizsudoku = sample_visual_sudoku(sudoku_p0, testloader)\n", "show_grid_img(vizsudoku.reshape(-1,28,28))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Preparing the solving code\n", "In the subsequent, we can no longer assume that some values are given. Instead of constraining the non-empty values, we will add an objective function over the predicted probabilities.\n", "\n", "\n", "__Task: Prepare your CPMpy model for this by splitting your `solve_sudoku()` of the previous notebook, into:__\n", " - `model_sudoku()` returns a CPMpy `Model` with all constraints except the 'value' constraints\n", " - `solve_sudoku()` that adds to that model the constraints on the values and solves it\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def model_sudoku(shape=(9,9)):\n", " maxv = max(shape)\n", " puzzle = IntVar(1,maxv, shape=shape)\n", " \n", " # ...\n", " \n", " m = Model()\n", " return (m, puzzle)\n", "\n", "def solve_sudoku(grid, empty_val=0):\n", " \"\"\"\n", " Solve sudoku with given grid 'grid', where 'empty_val' is used to denote empty cells.\n", " returns (Bool, Array) with:\n", " - Bool: whether a satisfying solution was found\n", " - Array: if Bool = True, the filled-in grid.\n", " \"\"\"\n", " (m,puzzle) = model_sudoku()\n", " \n", " # Constraints on values (cells that are not empty)\n", " # ...\n", " \n", " if m.solve():\n", " return (True, puzzle.value())\n", " else:\n", " return (False, None)\n", "\n", "# try solving the 'true' sudoku, which has a unique solution\n", "solve_sudoku(sudoku_p0.numpy())" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Finding the maximum likelihood solution\n", "\n", "As errors in the output may lead to infeasible sudoku's, we are going to want to find the _maximum likelihood_ solution.\n", "\n", "First, we read and store the prediction probabilities instead of the predictions. We obtain a 9x9x10 tensor (last dimension = probabilities of digit 0..9)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# get probabilities of predictions\n", "def predict_proba_sudoku(model, vizsudoku):\n", " # reshape from 9x9x1x28x28 to 81x1x28x28\n", " pred = model(vizsudoku.flatten(0,1))\n", " # our NN return 81 probabilistic vector: an 81x10 matrix\n", " return pred.reshape(9,9,10).detach() # reshape as 9x9x10 tensor for easier visualisation\n", "\n", "logprobs = predict_proba_sudoku(model, vizsudoku)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Maximum likelihood estimation with standard CP solver\n", "\n", "We need to turn the _satisfaction_ problem of sudoku into an _optimisation_ problem, where we optimize for maximum log likelihood.\n", "\n", "This means adding the objective function: a weighted sum of the decision variables, with as weight the log-probability of that decision variable being equal to the corresponding predicted value.\n", "\n", "E.g. $\\sum_i \\sum_j \\sum_c -log(prob[i,j,c])*[V[i,j] == c]$ over the given digits\n", "\n", "__Task: adapt the sudoku code to find the maximum likelihood visual sudoku solution!__\n", "\n", "Note that the only thing that changes is adding the objective, so you can reuse `sudoku_model()` for the constraints part!\n", "\n", "We assume that cells containing given clues are available through a $is\\_given$ boolean matrix." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def solve_vizsudoku(logprobs, is_given):\n", " # the constraint model\n", " csp, puzzle = model_sudoku(shape=(9,9))\n", " \n", " # 1) convert logprobs to positive integers (with scaling factor)\n", " logprobs = (-logprobs*10000).type(torch.int).numpy()\n", "\n", " # 2) build the list of terms to sum in cost function.\n", " # for every cell that is given, for v in 0..9: sum(logprobs[i,j,v]*(puzzle[i,j] == v))\n", " raise Exception(\"TODO!\")\n", " #obj = sum( ... )\n", " \n", " # 3) minimize the cost\n", " m = Model(csp.constraints, minimize=obj)\n", "\n", " if m.solve():\n", " return (True, puzzle.value())\n", " else:\n", " return (False, None)\n", "\n", "\n", "vsudoku = sample_visual_sudoku(sudoku_p0, testloader)\n", "logprobs = predict_proba_sudoku(model, vsudoku)\n", "is_given = (sudoku_p0 > 0)\n", "(sat, psol) = solve_vizsudoku(logprobs, is_given)\n", "print(\"Actual:\\n\", solve_sudoku(sudoku_p0.numpy())[1])\n", "\n", "# let's show the other ones too...\n", "preds = predict_sudoku(model, vsudoku).numpy()\n", "print(f\"Predictions solvable: {solve_sudoku(preds)[0]}, preds:\\n\",preds)\n", "print(\"Max likelihood:\\n\", psol)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Visualizing prediction error\n", "\n", "Our trained Neural network classifies an image correctly if it assigns the highest score to the true label. Thus, we can assess the accuracy of the model by comparing maximum likelihood labels against labels in the numerical instance. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "##helper function to plot and compare solution found with hybrid approach\n", "def plot_vs(visualsudoku, output, is_given, ml_digits, solution):\n", " n = 9\n", " fig, axes = plt.subplots(n, n, figsize=(1.5*n,2*n))\n", "\n", " for i in range(n*n):\n", " ax = axes[i//n, i%n]\n", " # sample image wrt solver output\n", " img = torch.zeros(28,28).float()\n", " c = 'gray'\n", " if is_given.reshape(-1)[i]:\n", " img = visualsudoku.view(-1, 28,28)[i].squeeze()\n", " # wrong given -> red\n", " # given fixed by cp -> green\n", " c = 'gray' if output.reshape(-1)[i] == ml_digits.reshape(-1)[i] else 'summer'\n", "\n", " c = 'autumn' if is_given.reshape(-1)[i] and output.reshape(-1)[i] != solution.reshape(-1)[i] else c\n", "\n", " if c == 'summer':\n", " ax.set_title('ML label: {}\\nsolver label: {}'.format(ml_digits.reshape(-1)[i], output.reshape(-1)[i]))\n", " elif c == 'autumn':\n", " ax.set_title('solver label: {}\\nTrue label: {}'.format(output.reshape(-1)[i], solution.reshape(-1)[i]))\n", "\n", " ax.imshow(img, cmap=c)\n", " ax.set_axis_off()\n", "plot_vs(vsudoku, psol, is_given, ml_digits, sudoku_p0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Higher Order knowledge exploitation\n", "\n", "As an optimisation problem, a visual sudoku instance may have many feasible solutions. \n", "\n", "However, any valid sudoku puzzle only admits a unique solution for a set of givens. To improve the efficiency of our hybrid approach, we can actually exploit this uniqueness property in the following manner:\n", "\n", "- When solver finds optimal solution, add the resulting assignment as a no-good and try to solve again\n", "- if any feasible solution is found this way, previous assignment for given cells does not lead to unique solution\n", "\n", "Iterate these steps until the uniqueness property is satisfied. In practice, we may need to loop several times depending on the accuracy of the classifier.\n", "\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "## Helpers \n", "def has_unique_solution(solution, is_given):\n", " \"\"\"\n", " Check whether values in given cells from provided solution lead to a unique sudoku solution. \n", " \"\"\"\n", " sol = solution.astype(np.int)\n", " csp, board = model_sudoku((9,9))\n", " # Enforce the solved predictions of the givens\n", " csp += [board[is_given] == sol[is_given]]\n", " \n", " # Forbid the solved predictions of the others\n", " csp += all(board != sol)\n", " \n", " return not csp.solve()\n", "\n", "\n", "def add_nogood(model, board, solution, is_given):\n", " \"\"\"\n", " Forbid current assignement of value in given cells, from provided solution\n", " \"\"\"\n", " sol = solution.astype(np.int)\n", " # nogood: built-in forbidden assignement constraint\n", " model.constraints += [board[is_given] != solution[is_given]]\n", " \n", " return model\n", "\n", "\n", "## The actual function\n", "def solve_vizsudoku_hybrid2(probs, is_given):\n", " (sat, solution) = solve_vizsudoku(probs, is_given)\n", " \n", " # Write a loop repeating following steps:\n", " # while ´solution´ is not unique:\n", " # add nogood to the vizsudoku model\n", " # solve again\n", " \n", " # Tip: you want access to the model created in `solve_sudoku` for this\n", " \n", " return False, None" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we visualize the output " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "(sat, psol) = solve_vizsudoku_hybrid2(logprobs, is_given)\n", "\n", "if sat:\n", " plot_vs(vsudoku, psol, is_given, ml_digits, sudoku_p0)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 2 }