{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# List ranking from incomplete pairwise preferences\n",
    "** *\n",
    "*Note: if you are visualizing this notebook directly from GitHub, some mathematical symbols might display incorrectly or not display at all. This same notebook can be rendered from nbviewer by following [this link.](http://nbviewer.jupyter.org/github/david-cortes/datascienceprojects/blob/master/optimization/list_optimization.ipynb)*\n",
    "\n",
    "This project consists of ranking a list of items (from best to worst) based on pairwise preferences (i.e. data in the form of “Item A is preferred to Item B”) aggregated from many people, which are incomplete (not available for each pair of items) and biased (more likely to be observed for some items than for others). The methods shown here are nevertheless applicable to ranking items from complete or partial orderings from different people, as any ranked list can be decomposed into pairwise preferences between items (i.e. each item is preferred over all items that are below in the ranking).\n",
    "\n",
    "**Examples of this kind of data:**\n",
    "* Ordered list of preferences for political candidates from different voters.\n",
    "* Lists of top-N best artists/games/books/etc. from different users.\n",
    "* Surveys asking people to compare a series of two items at a time (pick the better one).\n",
    "* Clicks on a vertical ranked list of items shown in random orders to users, if we assume that a clicked item is preferred to items that were ranked above and seen by the user but not clicked  - overall idea is explained in [Joachims, T., Granka, L., Pan, B., Hembrooke, H., Radlinski, F., & Gay, G. (2007). Evaluating the accuracy of implicit feedback from clicks and query reformulations in web search. ACM Transactions on Information Systems (TOIS), 25(2), 7.](https://www.researchgate.net/profile/Bing_Pan/publication/220515686_Evaluating_the_accuracy_of_implicit_feedback_from_clicks_and_query_reformulations_in_Web_search/links/0deec52051abbe27d6000000.pdf)\n",
    "\n",
    "The original idea was to rank an electronic catalog of items based on pairwise preferences deduced from clicks and is explained in [my thesis](https://www.hse.ru/edu/vkr/182626307). Since the data originally used is not public, I’ll simulate data (random numbers following statistical distributions) that resembles the original one.\n",
    "\n",
    "The rankings here are evaluated in terms of the pairwise preferences that they satisfy minus the pairwise preferences that they violate – from here on defined as their _score_. Thus, if we observe the following preferences: $ A\\succ B, A\\succ B, A\\succ B, A\\prec B $, then an ordering that puts A above B would get +2, whereas an ordering that puts B above A would get -2.\n",
    "\n",
    "The goal is to reconstruct what the overall ranking of items is based on the observed preferences, and thus it is reasonable to assume that orderings with a higher such score are more representative of the overall real ranking than those with a lower score.\n",
    "** *\n",
    "## Sections\n",
    "\n",
    "[1. What the data looks like](#p1)\n",
    "\n",
    "[2. Simulating the data](#p2)\n",
    "\n",
    "[3. Algorithms](#p3)\n",
    "* [3.1 Implementing the algorithms](#p31)\n",
    "* [3.2 Evaluating and timing the algorithms](#p32)\n",
    "\n",
    "[4. Comparing these orderings to the simulation's underlying order](#p4)\n",
    "* [4.1 Defining the metrics](#p41)\n",
    "* [4.2 Evaluating the orderings](#p42)\n",
    "\n",
    "[5. Ranking a larger list](#p5)\n",
    "** *\n",
    "\n",
    "<a id=\"p1\"></a>\n",
    "## 1. What the data looks like\n",
    "\n",
    "This example will simulate data in the following way:\n",
    "* There is a list of items, which have an overall attractiveness for users - this will be simulated as a random number for each item, with a bigger number meaning that the item is better, thus each item is comparable to each other.\n",
    "* There are many users, each of which has different preferences for each item, but these tend to be similar across users – this will be simulated as a random number, from a distribution with less spread than the one for items, multiplied by the number reflecting the overall goodness of each item.\n",
    "* Some items are easier to sample than others – that is, it’s easier or more common to get to know the preference for a user for some items than for others- and their probability is proportional to some randomly-generated numbers (~Beta(1,1)).\n",
    "* The data available comes in the form of stated preferences between two products from different users – this will be simulated as picking two items and comparing their ‘goodness’ or ‘attractiveness’ as defined above, then adding which one was preferred to the data pool to work with.\n",
    "* Overall, there is information only for some pairs of items, but not for all.\n",
    "* Items and users are referred to by their numerical ID - this doesn't mean anything and is used only for identification purposes.\n",
    "\n",
    "** *\n",
    "## A look at some small simulated data\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>User</th>\n",
       "      <th>Prefers This Item</th>\n",
       "      <th>Over This Item</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>3</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   User  Prefers This Item  Over This Item\n",
       "0     2                  0               5\n",
       "1     1                  5               3\n",
       "2     2                  0               2\n",
       "3     2                  0               3\n",
       "4     3                  0               1"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import numpy as np, pandas as pd\n",
    "from random import shuffle\n",
    "\n",
    "nusers=4\n",
    "nitems=6\n",
    "\n",
    "np.random.seed(10)\n",
    "items=[i for i in range(nitems)]\n",
    "item_probs=np.random.beta(a=1,b=1,size=nitems)\n",
    "item_probs=item_probs/np.sum(item_probs)\n",
    "item_goodness=np.random.beta(a=2,b=2,size=nitems)\n",
    "user_preferences=[np.random.beta(a=3,b=3,size=nitems)*item_goodness for u in range(nusers)]\n",
    "agg_item_goodness=np.zeros((nitems))\n",
    "for u in range(nusers):\n",
    "    agg_item_goodness+=user_preferences[u]\n",
    "\n",
    "preferences=list()\n",
    "for iteration in range(100):\n",
    "    chosen_user=np.random.randint(low=0,high=nusers)\n",
    "    for sample in range(3):\n",
    "        chosen_items=np.random.choice(items,size=2,replace=False,p=item_probs)\n",
    "        if chosen_items[0]==chosen_items[1]:\n",
    "            continue\n",
    "        goodness_A=user_preferences[chosen_user][chosen_items[0]]\n",
    "        goodness_B=user_preferences[chosen_user][chosen_items[1]]\n",
    "        if goodness_A>goodness_B:\n",
    "            preferences.append((chosen_user,chosen_items[0],chosen_items[1]))\n",
    "        else:\n",
    "            preferences.append((chosen_user,chosen_items[1],chosen_items[0]))\n",
    "            \n",
    "shuffle(preferences)\n",
    "pd.DataFrame(preferences, columns=['User','Prefers This Item','Over This Item']).head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "These lists of preferences are aggregated by summing the number of times that preference “Item A is preferred to Item B” is observed and subtracting from it the number of times that preference “Item B is preferred to Item A” is observed.\n",
    "\n",
    "The aggregated preferences are stored in a table reflecting the lower-triangular matrix of a full item x item matrix, thus the entry for “ItemA>ItemB” is in the cell [A,B] if the ID of item A is less than the ID of item B, and is -1 x cell [B,A] otherwise."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "defaultdict(<function __main__.<lambda>>,\n",
       "            {(0, 1): 54,\n",
       "             (0, 2): 13,\n",
       "             (0, 3): 27,\n",
       "             (0, 5): 57,\n",
       "             (1, 2): 1,\n",
       "             (1, 3): -27,\n",
       "             (1, 5): -35,\n",
       "             (2, 3): 0,\n",
       "             (2, 5): -12})"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from collections import defaultdict\n",
    "\n",
    "aggregated_preferences=defaultdict(lambda: 0)\n",
    "for pref in preferences:\n",
    "    if pref[1]<pref[2]:\n",
    "        aggregated_preferences[(pref[1],pref[2])]+=1\n",
    "    else:\n",
    "        aggregated_preferences[(pref[2],pref[1])]-=1\n",
    "\n",
    "# some preferences are hidden, thus some will be deleted at random here\n",
    "for iteration in range(3):\n",
    "    chosen_items=np.random.randint(low=0,high=nitems,size=2)\n",
    "    if chosen_items[0]==chosen_items[1]:\n",
    "        continue\n",
    "    if chosen_items[0]<chosen_items[1]:\n",
    "        del aggregated_preferences[(chosen_items[0],chosen_items[1])]\n",
    "    else:\n",
    "        aggregated_preferences[(chosen_items[1],chosen_items[0])]\n",
    "aggregated_preferences"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Ranking with these example preferences:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Best order: (0, 5, 2, 3, 1, 4)\n",
      "Best score: 96\n",
      "Theoretical maximum (perhaps not feasible): 226\n",
      "\n",
      "Item goodness defined in simulation: [(0, 0.88), (5, 0.64), (4, 0.62), (3, 0.56), (2, 0.3), (1, 0.22)]\n",
      "Item goodness across all users in simulation: [(0, 2.1), (4, 1.51), (5, 1.05), (3, 0.96), (2, 0.62), (1, 0.52)]\n"
     ]
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXoAAAEXCAYAAACjyo8UAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAG51JREFUeJzt3Xu4XHV97/H3hwRQERFOdmMM6A41XFJELluEChQJj4aL\ngFppqNagaOo5SEFEDAeLsUefBm+n3jXlllbkclBLCk+FEErVyi2EEC4BApJIIJeNVrnZQOB7/vj9\nNgzDzL7Mmr1n+OXzep797Jnfun3Xb9b67DVrzaytiMDMzMq1RacLMDOz0eWgNzMrnIPezKxwDnoz\ns8I56M3MCuegNzMrnIO+CUm9kkJSX6dreTmQ9DpJ10h6UlLRn9mVtIWk70v6Td5GDhnj5R+Slzuh\n0XNrbHPup80u6CVdKOnKBu19eSPozU0PAZOAZcOc7ypJp7et0Jef04HXA3uR+q1kRwAfBt5NWtdf\ndraccuV98s9bmK7R/vhL0uv1m7YU9zIyvtMFdKuIeBZY1+k6GpG0VUQ83ek66rwJuDUiVna6kAGj\n2E9vAtZGRKWAl7RlRDzTpprGnKQtAOV9pevlbaEr9+lRFxGb1Q9wIXBlg/Y+IIDe/Lw3P+/Lz7cE\nvgE8AmwkHfHPy8Ouz+M+/1Mz3/cCd9RMcxZp5xgYPhFYCPwBWAV8CLgTmFszTgAnAT8GngS+AowD\nzgMezNOuBM4AtqhfV+AzpA3898A80ju5ucCG3P6ZYfTbXwP3A0/n3x+rGbaqbv0vbDKPnYArgN8C\nTwH3ADNrhr8euIh0xPUU6d3UO4ZTQ7N+yu3TgKuAx/M6Xwy8rma6NwOLgceAJ4Dba5fbYPupXddV\nuX1r4B+A9cB/AzcCB9ZMd0ge/wjg5rwORzVZxmnA8rwODwPnAq9tMK8JjZ43meeq/Jr/IK/jOuD0\nunG2A+bnPnoc+A/y9p+Hn5CnPYK0jW4C9qDF7SzX/OcN6jy9yXY10Nd/TNqO1uU+WlrblzTZHxv1\nE0Pvn6uAzwLfz9vHGuDTDfaN+/Lr/ihwNTC+01n3oho7XcCYr3DrQf+pvCEcDLwB+FPgw3nYDnnY\n54HXkUME2Bd4NrfvAnwg7ygn1yz3p6RgOYB02mNx3snm1u0QG4CPAjsDU0h/eP4OeGuu9Tjgd8CJ\ndev6GPA9YDfgeOC5vMy/zzV9PM9/30H67D3AM8An8jQn5+fvzsN7gEXApXn9t2syn3/N470lr8MM\nYEYetg3pj9V/Agfl9TyGHLhD1TBIP03KO985wO7AnrmOm8h/FEk7+g9yH70pL+uAJuuwXX49H8rr\n2pPbvw6sBY7My/nH/FpPysMPyfXdAbwz19fTZBmnAofm1/XPSKH/zzXDB+Y10qB/jBRku5DC6Wng\nvXm4gF+Q/iDul/vh/+RpBtbhBFK43wC8Pc9nW1rczhg66HvyOB+t6+u35Pm9Odd5Vl6X3YbYH+v7\nbTj75yrSgccn8rJOzvM4oCY3NuVp35hr+yQO+q4I+k35Ba39eYrBg/4bpBBWk/k+v4HWtF0EXFfX\nNhdYkx/vmpexf83wnfLGN7duh/jmMNZtHnBt3bo+BIyraVsC3D5U7XXD/xM4v0E//qLm+ZU0OZKv\nGWc58Lkmwz5G+gPXMKyGWcNL+on0x3BxXdv2edz98vPHgFkj2IZOJx9d5ufbkILmQzVt44AHgC/k\n54fkZb6vhW12BumIc4u6eY006BfVtZ070H+kPyxPAK+sG2cZcEZ+fAINDgpa3c4YIuibjdNk/W4E\nPjvYNt2g3wbdP2vmc3HdOCsHlkV6R/B7YNuRvq5j+bPZXYzNfkY6eq79+cshprkwj3efpG9LOjKf\noxzM7qSAqvULYLKk15COfp4j7RQARMRDpNND9ZbUN0j6uKQlkvolPUE6knhD3Wh3x4vPoa4nve2m\nru2PWliPaYNM08jXgc9KukHSFyTtWzNsb2B5RDxasYb6ftoXOFjSEwM/pFCCdAoA4GvAuZKuk3SW\npN1GslJ5PlvW1pf7/IZh1PcSkg6VtEjSGkmPk05FbUU6Oq3ihgbPB+rbF3gV0F/XV3vwQj9BOkhq\n9AGFdmxnwyJpG0lfknS3pP/Kdfbx0m1/KEPtnwOW143zCC+sxyJgNfCgpIskzZK07QjrGHWba9A/\nFRH31/6Qzr01FRFLSUf5Z5L6bQGwaBhh33SWIxz/ydonkv6CdE74QuBdpD9C3yEFQq36i33RpK2V\n9RjROkTEeaTTKReQ3ir/UtLcFpY7WA1P1j3fgnQ6ov4P+1TSuxAiYi4p8P6FdEpuuaSPVKxruPW9\niKQ35npXAO8nBfBALfWvbTttQQri+n7aDfjbmvE2RuOLr61sZ0E6ZVRry2HU+hVS3/wt6dTWXqTr\nHu3sn9rXrel6RMTjwD6kU6e/JuXDPZJe38ZaKttcg74lEfF4RFweEf+TdC72UNJ5O0hv3cfVTbKC\ndC6z1oGkt4aPky5GbkHamQGQtCPpouRQDgRuiohvRcTS/Mfqj4eaqEXN1uPukc4oItZExPyIOA44\nG5idB90G7DnIZ5xbrWEp8CfA6vo/7vk1GKhrZUR8IyKOJF3k/ugIVusB0uv/fH2SxpGuu4y0j/pI\ngfXJiLghIu5jeNvDcOzf4PmK/Hgp6YMBzzXopw1tWn69fmo+iitpIi/9aO4zvHS/OhD4p4j4UUQs\nJx2k1W/7jfbHekPtn8MSEZsi4rqIOJN0DWgb4KjhTj8WHPTDJOk0ScdL2l3Sm0inegauwkM6l3eQ\npMk1YfVV4M8kzZW0i6QPkC7qfgkgIu4lXaH/nqT9Je1FOtr9A0MfLd8H7CPpcElTJQ0c3YyGLwN/\nJemkvKyTSRefvjSSmUj6uqQZknbO6zqDF4Lwh6QLqVdIOiiPc7Skd1Ss4dukC6iXSnpbnu9hkuZL\n2lbSK/OpuEPyl+Texgj/iEXEk8B3gXMkHSFp9/x8Iuld1kisJO2Xp0qaIul40sXZdthf0pm5/z5G\n+oTX/83DriWdxrgib1NTJB0g6fOSDmrT8utdB5yUv8OyN+nd6X/XjbMKmK70hbztc9t9wHsk7SPp\nzaQL6a9oMF39/lhv0P1zOCQdJekUSXvnd2N/SbpAvWKISceUg374Hgc+TXqLuJT0dvHwiHgqDz+b\ndCH1AdKRysDpnvcD7yOdr5yXf75VM98TSH8srid9zPKHvPARvcF8H7gsj38L6bTSV1teu0FExL+Q\nPm3wSVIAngL8r4j41xHOagvgm3kei0jrOSsv40nSH6o1pE/F3En6NERUqSEiHiEdtQ18CuQuUvhv\nzD/Pki7OXgjcC/yEdO76tBGu22dInzq6gHQOe0/SJ4rWjmQm+Qj1lLz8u0nvLNr1Rbyv5bpuA74A\nnB0Rl+flBuljk9eRPjF0L2n72pXG14za4VPAr0jb/uWki8P17x4+BbyDdF3lttx2Wh7v58C/kS7E\n/rxuupfsj/WGuX8O5XfAsaQ/lPeQXquPRkR9PR2lfOXYukQ++ngEOD4iftTpeqwMklYB34qIr3S6\nFht7/mZsh0k6lPRW7w7Slfwvkj73/dNO1mVm5XDQd96WpLfRO5M+y38jcHA+lWFmVplP3ZiZFc4X\nY83MCtcVp24mTJgQvb29nS7DzOxl5dZbb300InqGGq8rgr63t5clS4b8ZriZmdWQtHo44/nUjZlZ\n4Rz0ZmaFGzLoJZ0vaYOkO2vavizpHknLJf1E0mtrhp0p6X5J90p612gVbmZmwzOcI/oLSfckqbUI\n2CMi9iTdd+JMAEnTgJmkm0jNAL6Tb+5kZmYdMmTQR8TPSP/6rbbtmojYlJ/eCOyYHx8DXBIRGyPi\nQdK/e9uvjfWamdkIteMc/UdINxYCmMwL/9QB0g2qJjeaSNLs/E8zlvT3N7znkJmZtUGloJd0Fuk/\nzlw00mnzPcn7IqKvp2fIj4GamVmLWv4cvaQTSDfXnx4v3EfhYdKtQQfsmNvMzKxDWjqilzQDOAM4\nuuZ+7JDupz5T0taSppD+XdvN1cs0M7NWDXlEL+li0n9PnyBpDfA50qdstib9z1SAGyPi4xFxl6TL\nSP8wYRNwUpP/L2k2Ir1zrup0CWNu1bwjO12CFWLIoI+I4xs0nzfI+F8k3VPdzMy6gL8Za2ZWOAe9\nmVnhHPRmZoVz0JuZFc5Bb2ZWOAe9mVnhHPRmZoVz0JuZFc5Bb2ZWOAe9mVnhHPRmZoVz0JuZFc5B\nb2ZWOAe9mVnhHPRmZoVz0JuZFc5Bb2ZWOAe9mVnhHPRmZoVz0JuZFc5Bb2ZWOAe9mVnhHPRmZoVz\n0JuZFc5Bb2ZWOAe9mVnhxg81gqTzgaOADRGxR27bAbgU6AVWAcdFxH/lYWcCJwLPAn8TEVePSuWb\nsd45V3Vs2avmHdmxZdvY6dQ25u1rdAzniP5CYEZd2xxgcURMBRbn50iaBswE/iRP8x1J49pWrZmZ\njdiQQR8RPwN+W9d8DLAgP14AHFvTfklEbIyIB4H7gf3aVKuZmbWg1XP0EyNibX68DpiYH08GHqoZ\nb01uewlJsyUtkbSkv7+/xTLMzGwolS/GRkQA0cJ08yOiLyL6enp6qpZhZmZNtBr06yVNAsi/N+T2\nh4GdasbbMbeZmVmHtBr0C4FZ+fEs4Iqa9pmStpY0BZgK3FytRDMzq2I4H6+8GDgEmCBpDfA5YB5w\nmaQTgdXAcQARcZeky4C7gU3ASRHx7CjVbmZmwzBk0EfE8U0GTW8y/heBL1YpyszM2sffjDUzK5yD\n3syscA56M7PCOejNzArnoDczK5yD3syscA56M7PCOejNzArnoDczK5yD3syscA56M7PCOejNzArn\noDczK5yD3syscA56M7PCOejNzArnoDczK5yD3syscA56M7PCOejNzArnoDczK5yD3syscA56M7PC\nOejNzArnoDczK5yD3syscJWCXtInJd0l6U5JF0t6haQdJC2StDL/3r5dxZqZ2ci1HPSSJgN/A/RF\nxB7AOGAmMAdYHBFTgcX5uZmZdUjVUzfjgVdKGg+8CngEOAZYkIcvAI6tuAwzM6ug5aCPiIeBrwC/\nBtYCv4+Ia4CJEbE2j7YOmNhoekmzJS2RtKS/v7/VMszMbAhVTt1sTzp6nwK8HthG0gdrx4mIAKLR\n9BExPyL6IqKvp6en1TLMzGwIVU7dHAY8GBH9EfEM8GPgT4H1kiYB5N8bqpdpZmatqhL0vwb2l/Qq\nSQKmAyuAhcCsPM4s4IpqJZqZWRXjW50wIm6SdDmwFNgE3AbMB14NXCbpRGA1cFw7CrXu0Dvnqk6X\nsNlwX1u7tBz0ABHxOeBzdc0bSUf3ZmbWBfzNWDOzwjnozcwK56A3Myucg97MrHAOejOzwjnozcwK\n56A3Myucg97MrHAOejOzwjnozcwK56A3Myucg97MrHAOejOzwjnozcwK56A3Myucg97MrHAOejOz\nwjnozcwK56A3Myucg97MrHAOejOzwjnozcwK56A3Myucg97MrHAOejOzwjnozcwKVynoJb1W0uWS\n7pG0QtIBknaQtEjSyvx7+3YVa2ZmI1f1iP7rwE8jYjfgLcAKYA6wOCKmAovzczMz65CWg17SdsDB\nwHkAEfF0RPwOOAZYkEdbABxbtUgzM2tdlSP6KUA/cIGk2ySdK2kbYGJErM3jrAMmVi3SzMxaVyXo\nxwP7AN+NiL2BJ6k7TRMRAUSjiSXNlrRE0pL+/v4KZZiZ2WCqBP0aYE1E3JSfX04K/vWSJgHk3xsa\nTRwR8yOiLyL6enp6KpRhZmaDaTnoI2Id8JCkXXPTdOBuYCEwK7fNAq6oVKGZmVUyvuL0JwMXSdoK\n+BXwYdIfj8sknQisBo6ruAwzM6ugUtBHxDKgr8Gg6VXma2Zm7eNvxpqZFc5Bb2ZWOAe9mVnhHPRm\nZoVz0JuZFc5Bb2ZWOAe9mVnhHPRmZoVz0JuZFc5Bb2ZWOAe9mVnhHPRmZoVz0JuZFc5Bb2ZWOAe9\nmVnhHPRmZoVz0JuZFc5Bb2ZWOAe9mVnhHPRmZoVz0JuZFc5Bb2ZWOAe9mVnhHPRmZoVz0JuZFc5B\nb2ZWuMpBL2mcpNskXZmf7yBpkaSV+ff21cs0M7NWteOI/hRgRc3zOcDiiJgKLM7PzcysQyoFvaQd\ngSOBc2uajwEW5McLgGOrLMPMzKqpekT/D8AZwHM1bRMjYm1+vA6YWHEZZmZWQctBL+koYENE3Nps\nnIgIIJpMP1vSEklL+vv7Wy3DzMyGUOWI/u3A0ZJWAZcAh0r6AbBe0iSA/HtDo4kjYn5E9EVEX09P\nT4UyzMxsMC0HfUScGRE7RkQvMBO4LiI+CCwEZuXRZgFXVK7SzMxaNn4U5jkPuEzSicBq4LhRWEZX\n6J1zVadLMCtKp/apVfOO7Mhyx0pbgj4irgeuz49/A0xvx3zNzKw6fzPWzKxwDnozs8I56M3MCueg\nNzMrnIPezKxwDnozs8I56M3MCuegNzMrnIPezKxwDnozs8I56M3MCuegNzMrnIPezKxwDnozs8I5\n6M3MCuegNzMrnIPezKxwDnozs8I56M3MCuegNzMrnIPezKxwDnozs8I56M3MCje+0wW0Q++cqzpd\ngpm9jHUyQ1bNO3LUl+EjejOzwjnozcwK56A3Mytcy0EvaSdJ/y7pbkl3STolt+8gaZGklfn39u0r\n18zMRqrKEf0m4FMRMQ3YHzhJ0jRgDrA4IqYCi/NzMzPrkJaDPiLWRsTS/PhxYAUwGTgGWJBHWwAc\nW7VIMzNrXVvO0UvqBfYGbgImRsTaPGgdMLHJNLMlLZG0pL+/vx1lmJlZA5WDXtKrgR8Bp0bEY7XD\nIiKAaDRdRMyPiL6I6Ovp6alahpmZNVEp6CVtSQr5iyLix7l5vaRJefgkYEO1Es3MrIoqn7oRcB6w\nIiK+VjNoITArP54FXNF6eWZmVlWVWyC8Hfgr4A5Jy3Lb/wbmAZdJOhFYDRxXrUQzM6ui5aCPiF8A\najJ4eqvzNTOz9vI3Y83MCuegNzMrnIPezKxwDnozs8I56M3MCuegNzMrnIPezKxwDnozs8I56M3M\nCuegNzMrnIPezKxwDnozs8I56M3MCuegNzMrnIPezKxwDnozs8I56M3MCuegNzMrnIPezKxwDnoz\ns8I56M3MCuegNzMrnIPezKxwDnozs8I56M3MCuegNzMr3KgFvaQZku6VdL+kOaO1HDMzG9yoBL2k\nccC3gcOBacDxkqaNxrLMzGxwo3VEvx9wf0T8KiKeBi4BjhmlZZmZ2SDGj9J8JwMP1TxfA7ytdgRJ\ns4HZ+ekTku4dYp4TgEfbVmH7dXN93VwbuL4qurk2cH1D0jlNBw2ntjcOZxmjFfRDioj5wPzhji9p\nSUT0jWJJlXRzfd1cG7i+Krq5NnB9VbSzttE6dfMwsFPN8x1zm5mZjbHRCvpbgKmSpkjaCpgJLByl\nZZmZ2SBG5dRNRGyS9AngamAccH5E3FVxtsM+zdMh3VxfN9cGrq+Kbq4NXF8VbatNEdGueZmZWRfy\nN2PNzArnoDczK1xXBr2k90u6S9Jzkvrqhp2Zb6twr6R31bTvK+mOPOwbkjQGdV4qaVn+WSVpWW7v\nlfSHmmHfG+1amtQ3V9LDNXUcUTOsYT+OcX1flnSPpOWSfiLptbm9W/qvq27jIWknSf8u6e68f5yS\n25u+zmNc36q8Dy6TtCS37SBpkaSV+ff2Hapt15r+WSbpMUmndrLvJJ0vaYOkO2vamvZXpX02Irru\nB9gd2BW4HuiraZ8G3A5sDUwBHgDG5WE3A/sDAv4NOHyMa/4qcHZ+3Avc2QX9OBc4vUF7034c4/re\nCYzPj88BzumW/iN9iOABYGdgq9xf0zpc0yRgn/x4W+C+/Fo2fJ07UN8qYEJd25eAOfnxnIHXuAte\n23WkLxt1rO+Ag4F9arf1Zv1VdZ/tyiP6iFgREY2+KXsMcElEbIyIB4H7gf0kTQJeExE3RuqVfwKO\nHat687uH44CLx2qZFTXsx7EuIiKuiYhN+emNpO9bdIuuu41HRKyNiKX58ePACtK30LvZMcCC/HgB\nY7hfDmI68EBErO5kERHxM+C3dc3N+qvSPtuVQT+IRrdWmJx/1jRoHysHAesjYmVN25T8VvA/JB00\nhrXUOzmfGjm/5m1gs37spI+Q3okN6HT/dWMfPU9SL7A3cFNuavQ6j7UArpV0a77FCcDEiFibH68D\nJnamtBeZyYsPyrqh7wY0669K22PHgl7StZLubPDTVTc/G2adx/PiDWct8IaI2As4DfihpNd0oL7v\nkk497JVr+upo1FChvoFxzgI2ARflpjHrv5cjSa8GfgScGhGP0QWvc3Zgfs0OB06SdHDtwPxuu6Of\n51b6AufRwP/LTd3Sdy/Rzv7q5L1uDmthsma3VniYF7/tb9stF4aqU9J44L3AvjXTbAQ25se3SnoA\n2AVY0o6aRlJfTZ3/CFyZn47ZLSqG0X8nAEcB0/OGPab9N4iuvI2HpC1JIX9RRPwYICLW1wyvfZ3H\nVEQ8nH9vkPQT0qmF9ZImRcTafIp1Qydqq3E4sHSgz7ql72o0669K2+PL7dTNQmCmpK0lTQGmAjfn\ntzqPSdo/ny//EHDFGNV0GHBPRDx/6khSj9I9+ZG0c67zV2NUz/PyhjLgPcDA1f2G/diB+mYAZwBH\nR8RTNe3d0H9ddxuPvG2fB6yIiK/VtDd7nceytm0kbTvwmHSh/U5Sn83Ko81i7PbLZl707rsb+q5O\ns/6qts92+gp4k6vR7yGdg9oIrAeurhl2FumK873UfLIG6CO9SA8A3yJ/63cMar0Q+Hhd2/uAu4Bl\nwFLg3R3qx38G7gCW5w1l0lD9OMb13U8677gs/3yvy/rvCNInWx4AzupEDXX1HEh6K7+8ps+OGOx1\nHsPadiZ9KuT2/Nqdldv/B7AYWAlcC+zQwf7bBvgNsF1NW8f6jvQHZy3wTM67Ewfrryr7rG+BYGZW\nuJfbqRszMxshB72ZWeEc9GZmhXPQm5kVzkFvZlY4B71tViSdpXTnx+X5Fgtv63RNZqOtY9+MNRtr\nkg4gfQt3n4jYKGkC6c6Urc5vfLxwUzazruUjetucTAIejXSLBSLi0Yh4RNJbJf1S0u2Sbpa0raRX\nSLpA6f7qt0l6B6RbNkhaKOk60hdbkPRpSbfkdwmf79zqmTXmI3rbnFwDnC3pPtK3Di8Fbsi//yIi\nbsk3T/sDcArpvlJvlrQbcI2kXfJ89gH2jIjfSnon6evo+5H+F8JCSQdHugWtWVfwEb1tNiLiCdLN\n52YD/aSA/2tgbUTcksd5LJ+OORD4QW67B1hNurEawKKIGLiP+Dvzz22k2zXsRgp+s67hI3rbrETE\ns6T/XHa9pDuAk1qYzZM1jwX8fUR8vw3lmY0KH9HbZkPp/4bWHm3vRfovTZMkvTWPs22+9fTPgQ/k\ntl2AN5BuJlXvauAj+R7xSJos6Y9GcTXMRsxH9LY5eTXwTaV/Qr6JdPfM2cAFuf2VpPPzhwHfAb6b\nj/o3ASfkT+q8aIYRcY2k3YEb8rAngA/S+fuumz3Pd680MyucT92YmRXOQW9mVjgHvZlZ4Rz0ZmaF\nc9CbmRXOQW9mVjgHvZlZ4f4/yWVWXw/axCAAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<matplotlib.figure.Figure at 0x42f9780>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from itertools import permutations\n",
    "from copy import deepcopy\n",
    "from matplotlib import pyplot as plt\n",
    "%matplotlib inline\n",
    "\n",
    "def eval_ranking(order,aggregated_preferences):\n",
    "    score=0\n",
    "    for ind in range(len(order)-1):\n",
    "        el1=order[ind]\n",
    "        el2=order[ind+1]\n",
    "        if el1<el2:\n",
    "            score+=aggregated_preferences[(el1,el2)]\n",
    "        else:\n",
    "            score-=aggregated_preferences[(el2,el1)]\n",
    "    return score\n",
    "\n",
    "best_score=-100\n",
    "worst_score=100\n",
    "scores=list()\n",
    "for order in permutations([i for i in range(nitems)]):\n",
    "    score=eval_ranking(order,aggregated_preferences)\n",
    "    scores.append(score)\n",
    "    if score>best_score:\n",
    "        best_score=deepcopy(score)\n",
    "        best_order=order\n",
    "        \n",
    "print('Best order:',best_order)\n",
    "print(\"Best score:\",best_score)\n",
    "print('Theoretical maximum (perhaps not feasible):',np.sum([np.abs(v) for v in aggregated_preferences.values()]))\n",
    "print()\n",
    "print('Item goodness defined in simulation:',[(el,eval(\"{0:.2f}\".format(item_goodness[el]))) for el in np.argsort(-item_goodness)])\n",
    "print('Item goodness across all users in simulation:',[(el,eval(\"{0:.2f}\".format(agg_item_goodness[el]))) for el in np.argsort(-agg_item_goodness)])\n",
    "_=plt.hist(np.array(scores))\n",
    "_=plt.title('Histogram of scores for all permutations',size=14)\n",
    "_=plt.xlabel('Score')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"p2\"></a>\n",
    "## 2. Simulating the data\n",
    "\n",
    "Let's now simulate a more realistic scenario:\n",
    "\n",
    "* List of 30 items.\n",
    "* Sample with replacement of random 300 users from a population of 500, with 4 preferences stated from each.\n",
    "* Higher sampling bias (P~Beta(2,2)).\n",
    "* Slightly less spread item goodness (~Beta(3,3)) and less spread user variation in preference (~Beta(4,4))."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Theoretical maximum score: 812\n",
      "Number of pairwise preferences observed: 324\n",
      "Number of pairs in the list: 435\n"
     ]
    }
   ],
   "source": [
    "nusers=500\n",
    "nitems=30\n",
    "aggregated_preferences=defaultdict(lambda: 0)\n",
    "\n",
    "def add_preferences(el1,el2):\n",
    "    global aggregated_preferences\n",
    "    \n",
    "    if el1<el2:\n",
    "        aggregated_preferences[(el1,el2)]+=1\n",
    "    else:\n",
    "        aggregated_preferences[(el2,el1)]-=1\n",
    "\n",
    "np.random.seed(100)\n",
    "items=[i for i in range(nitems)]\n",
    "item_probs=np.random.beta(a=2,b=2,size=nitems)\n",
    "item_probs=item_probs/np.sum(item_probs)\n",
    "item_goodness=np.random.beta(a=3,b=3,size=nitems)\n",
    "user_preferences=[np.random.beta(a=4,b=4,size=nitems)*item_goodness for u in range(nusers)]\n",
    "agg_item_goodness=np.zeros((nitems))\n",
    "for u in range(nusers):\n",
    "    agg_item_goodness+=user_preferences[u]\n",
    "\n",
    "for iteration in range(300):\n",
    "    chosen_user=np.random.randint(low=0,high=nusers)\n",
    "    for sample in range(4):\n",
    "        chosen_items=np.random.choice(items,size=2,replace=False,p=item_probs)\n",
    "        if chosen_items[0]==chosen_items[1]:\n",
    "            continue\n",
    "        goodness_A=user_preferences[chosen_user][chosen_items[0]]\n",
    "        goodness_B=user_preferences[chosen_user][chosen_items[1]]\n",
    "        if goodness_A>goodness_B:\n",
    "            add_preferences(chosen_items[0],chosen_items[1])\n",
    "        else:\n",
    "            add_preferences(chosen_items[1],chosen_items[0])\n",
    "            \n",
    "print('Theoretical maximum score:',np.sum([np.abs(v) for v in aggregated_preferences.values()]))\n",
    "print('Number of pairwise preferences observed:',len([v for v in aggregated_preferences.values() if v!=0]))\n",
    "print('Number of pairs in the list:',int(nitems*(nitems-1)/2))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"p3\"></a>\n",
    "## 3. Algorithms\n",
    "\n",
    "The example above with only 6 items found the best ordering by evaluating all the possible permutations of those 6 items, but such a brute-force search of all the orderings becomes infeasible when there are more than a dozen items (there are items! (factorial) possible orderings, so for example for 50 items there are 50!=3x10^64 different orderings).\n",
    "\n",
    "As such, this larger 30-element list will be ranked with discrete optimization algorithms based on local search, by swapping two items at a time within a list, starting with a random order. In addition, I also implemented two algorithms from some papers: [KwikSort]( http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.296.8422&rep=rep1&type=pdf) and [Greedy-Order]( http://arxiv.org/pdf/1105.5464) – the second one being perhaps the most straightforward heuristic; and two other common techniques: PageRank (the original algorithm used by Google for helping to rank web pages) and a convex relaxation of the problem with l1 penalty using a black-box SDP solver, by taking the difference between scores between two items plus a margin.\n",
    "\n",
    "While evaluating the score of an ordering requires checking the table of aggregated preferences for every pair in the list (of which there are items x (items-1)/2), it’s also possible to look for only the net effect that one particular pair of items has when they occupy their current position in the list, which implies checking only the preferences that involve this pair of items when they are at their current positions (which only implies (items -1) + (items -2) relationships).\n",
    "\n",
    "** *\n",
    "#### Min-conflict (complexity O(items^3)):\n",
    "\n",
    "    While it’s possible to improve the score\n",
    "        For each pair of items in the list:\n",
    "            Check the effect on score of having it as it is vs. reversed\n",
    "        Swap the pair that brought the highest score improvement\n",
    "        Repeat\n",
    "\n",
    "** *\n",
    "#### Random swaps (complexity O(items x iterations x repetitions + repetitions x items^2)):\n",
    "\n",
    "    Repeat r times:\n",
    "        Repeat n times:\n",
    "            Pick a random pair of items\n",
    "                Check the effect on score of having it as it is vs. reversed        \n",
    "                Swap them if it improves the score\n",
    "        Calculate the score of the obtained ordering\n",
    "        If it's the highest score seen so far, save this ordering\n",
    "    Output the ordering with the highest score seen\n",
    "\n",
    "** *\n",
    "#### Metropolis-Hastings swaping (complexity O(items x iterations)):\n",
    "\n",
    "    Repeat n times:\n",
    "        Pick a random pair of items   \n",
    "            Check the effect on score of having it as it is vs. reversed       \n",
    "            If swapping them improves the score:        \n",
    "                Swap them          \n",
    "                Update the score estimate           \n",
    "                If it's the highest score ever seen, save this ordering            \n",
    "            Else:        \n",
    "                Swap them with a small probability inversely proportional to the score decrease            \n",
    "                Update score estimate if they were swaped           \n",
    "    Output the ordering with the highest score seen\n",
    "\n",
    "In addition, I also implemented two algorithms from some papers: [Kwik-Sort]( http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.296.8422&rep=rep1&type=pdf) and [Greedy-Order]( http://arxiv.org/pdf/1105.5464)\n",
    "** *\n",
    "<a id=\"p31\"></a>\n",
    "### 3.1 Implementing the algorithms"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "#Helpful functions\n",
    "import random, cvxpy as cvx\n",
    "\n",
    "#Calculating score for a given order\n",
    "def list_score(ordering,prefs):\n",
    "    score=0\n",
    "    cnt=len(ordering)\n",
    "    for i in range(cnt-1):\n",
    "        for j in range(i+1,cnt):\n",
    "            e1,e2=ordering[i],ordering[j]\n",
    "            if e1<e2:\n",
    "                score+=prefs[(e1,e2)]\n",
    "            else:\n",
    "                score-=prefs[(e2,e1)]\n",
    "    return score\n",
    "\n",
    "#Generate all the possible pairs, will be used at each step\n",
    "def get_pairs_indices(n):\n",
    "    for i in range(n-1):\n",
    "        for j in range(i+1,n):\n",
    "            yield i,j\n",
    "\n",
    "#Give a preference and see how many constrains will it satisy minus how many will it violate           \n",
    "def score_pref(e1_name,e2_name,prefs_dict):\n",
    "    if e1_name<e2_name:\n",
    "        return prefs_dict[e1_name,e2_name]\n",
    "    else:\n",
    "        return -prefs_dict[e2_name,e1_name]\n",
    "\n",
    "#examine the net effect of having a pair as it is vs. the other way around\n",
    "def pair_net_effect(lst,pair_ind,prefs_dict):\n",
    "    lst2=deepcopy(lst)\n",
    "    e1_ind,e2_ind=pair_ind\n",
    "    if e1_ind>e2_ind:\n",
    "        e1_ind,e2_ind=e2_ind,e1_ind\n",
    "    lst2[e1_ind],lst2[e2_ind]=lst2[e2_ind],lst2[e1_ind]\n",
    "    score=0\n",
    "    rev_score=0\n",
    "    for p1 in range(e1_ind):\n",
    "        score+=score_pref(lst[p1],lst[e1_ind],prefs_dict)\n",
    "        rev_score+=score_pref(lst2[p1],lst2[e1_ind],prefs_dict)        \n",
    "        score+=score_pref(lst[p1],lst[e2_ind],prefs_dict)\n",
    "        rev_score+=score_pref(lst2[p1],lst2[e2_ind],prefs_dict)\n",
    "    for p2 in range(e1_ind+1,e2_ind):\n",
    "        score+=score_pref(lst[e1_ind],lst[p2],prefs_dict)\n",
    "        rev_score+=score_pref(lst2[e1_ind],lst2[p2],prefs_dict)\n",
    "        score+=score_pref(lst[p2],lst[e2_ind],prefs_dict)\n",
    "        rev_score+=score_pref(lst2[p2],lst2[e2_ind],prefs_dict)\n",
    "    for p3 in range(e2_ind+1,len(lst)):\n",
    "        score+=score_pref(lst[e1_ind],lst[p3],prefs_dict)\n",
    "        rev_score+=score_pref(lst2[e1_ind],lst2[p3],prefs_dict)\n",
    "        score+=score_pref(lst[e2_ind],lst[p3],prefs_dict)\n",
    "        rev_score+=score_pref(lst2[e2_ind],lst2[p3],prefs_dict)        \n",
    "    score+=score_pref(lst[e1_ind],lst[e2_ind],prefs_dict)\n",
    "    rev_score+=score_pref(lst2[e1_ind],lst2[e2_ind],prefs_dict)\n",
    "    return (score,rev_score)\n",
    "\n",
    "#Swap a pair of items in a list\n",
    "def swap(lst,pair_tuple):\n",
    "    lst[pair_tuple[0]],lst[pair_tuple[1]]=lst[pair_tuple[1]],lst[pair_tuple[0]]\n",
    "\n",
    "######################\n",
    "\n",
    "#Implementing the algorithms\n",
    "\n",
    "def greedy_ranking(list_els,dict_prefs):\n",
    "    ordering=list()\n",
    "    els=deepcopy(list_els)\n",
    "    while els!=[]:\n",
    "        best_score=float(\"-infinity\")\n",
    "        for e1 in els:\n",
    "            score_el=0\n",
    "            for e2 in els:\n",
    "                if e1==e2:\n",
    "                    continue\n",
    "                score_el+=score_pref(e1,e2,dict_prefs)\n",
    "            if score_el>best_score:\n",
    "                best_score=score_el\n",
    "                best_el=e1\n",
    "        ordering.append(best_el)\n",
    "        els.remove(best_el)\n",
    "    return ordering\n",
    "\n",
    "def kwiksort(list_els,dict_prefs):\n",
    "    if list_els==[]:\n",
    "        return []\n",
    "    pivot=np.random.choice(list_els)\n",
    "    left=[]\n",
    "    right=[]\n",
    "    for el in list_els:\n",
    "        if el==pivot:\n",
    "            continue\n",
    "        else:\n",
    "            if score_pref(el,pivot,dict_prefs)<0:\n",
    "                right.append(el)\n",
    "            else:\n",
    "                left.append(el)\n",
    "    left=kwiksort(left,dict_prefs)\n",
    "    right=kwiksort(right,dict_prefs)\n",
    "    return left+[pivot]+right\n",
    "\n",
    "def kwiksort_multiple_runs(list_els,dict_prefs,runs=10):\n",
    "    best_score=float(\"-infinity\")\n",
    "    for run in range(runs):\n",
    "        ordering=kwiksort(list_els,dict_prefs)\n",
    "        score=list_score(ordering,dict_prefs)\n",
    "        if score>best_score:\n",
    "            best_score=score\n",
    "            best_order=ordering\n",
    "    return best_order\n",
    "\n",
    "def pagerank(list_els,dict_prefs,eps_search=20):\n",
    "    prefs_mat=np.zeros((nitems,nitems))\n",
    "    for k,v in aggregated_preferences.items():\n",
    "        if v==0:\n",
    "            continue\n",
    "        elif v>0:\n",
    "            prefs_mat[k[1],k[0]]+=v\n",
    "        else:\n",
    "            prefs_mat[k[0],k[1]]-=v\n",
    "    prefs_mat_orig=prefs_mat.copy()\n",
    "    eps_grid=list(.5**np.logspace(0,1,eps_search))\n",
    "    best=-10^5\n",
    "    best_order=None\n",
    "    \n",
    "    for eps in eps_grid:\n",
    "        prefs_mat=prefs_mat_orig.copy()\n",
    "        for i in range(nitems):\n",
    "            prefs_mat[:,i]+=eps\n",
    "            tot=np.sum(prefs_mat[:,i])\n",
    "            prefs_mat[:,i]=prefs_mat[:,i]/tot\n",
    "\n",
    "        \n",
    "        pr=np.ones((nitems,1))/nitems\n",
    "        for i in range(30):\n",
    "            pr=prefs_mat.dot(pr)\n",
    "        lst_pagerank=list(np.argsort(pr.reshape(-1)))\n",
    "        score_this_order=list_score(lst_pagerank,aggregated_preferences)\n",
    "        if score_this_order>best:\n",
    "            best=score_this_order\n",
    "            best_order=deepcopy(lst_pagerank)\n",
    "    return best_order\n",
    "\n",
    "def cvx_relaxation(list_els,dict_prefs):\n",
    "    r=cvx.Variable(nitems)    \n",
    "    obj=0\n",
    "    for k,v in aggregated_preferences.items():\n",
    "        if v>0:\n",
    "            obj+=cvx.pos(v*r[k[0]]-v*r[k[1]]+1)\n",
    "        else:\n",
    "            obj+=cvx.pos(-v*r[k[1]]+v*r[k[0]]+1)\n",
    "    prob=cvx.Problem(cvx.Minimize(obj))\n",
    "    prob.solve()\n",
    "    return list(np.argsort(np.array(r.value).reshape(-1)))\n",
    "\n",
    "def minconflict(initial_guess,dict_prefs):\n",
    "    ordering=deepcopy(initial_guess)\n",
    "    while True:\n",
    "        best=0\n",
    "        best_swap=None\n",
    "        pairs_indices=get_pairs_indices(len(ordering))\n",
    "        for pair_ind in pairs_indices:\n",
    "            score_as_is,score_rev=pair_net_effect(ordering,pair_ind,dict_prefs)\n",
    "            improvement=(score_rev-score_as_is)\n",
    "            if improvement>best:\n",
    "                best=improvement\n",
    "                best_swap=pair_ind\n",
    "        if best_swap==None:\n",
    "            break\n",
    "        else:\n",
    "            swap(ordering,best_swap)\n",
    "    return ordering\n",
    "\n",
    "def random_swaps(initial_guess,dict_prefs,iterations=10000,repetitions=3):\n",
    "    best=0\n",
    "    for rep in range(repetitions):\n",
    "        ordering=deepcopy(initial_guess)\n",
    "        for it in range(iterations):\n",
    "            candidates_ind=random.sample(range(len(ordering)), 2)\n",
    "            score_as_is,score_rev=pair_net_effect(ordering,candidates_ind,dict_prefs)\n",
    "            if score_rev>score_as_is:\n",
    "                swap(ordering,candidates_ind)\n",
    "        score=list_score(ordering,dict_prefs)\n",
    "        if score>best:\n",
    "            best=score\n",
    "            best_ordering=deepcopy(ordering)\n",
    "    return best_ordering\n",
    "\n",
    "def metropolis_hastings(initial_guess,dict_prefs,iterations=10000,explore_fact=1):\n",
    "    ordering=deepcopy(initial_guess)\n",
    "    best=0\n",
    "    current_score=0\n",
    "    for it in range(iterations):\n",
    "        candidates_ind=random.sample(range(len(ordering)), 2)\n",
    "        score_as_is,score_rev=pair_net_effect(ordering,candidates_ind,dict_prefs)\n",
    "        if score_rev>score_as_is:\n",
    "            swap(ordering,candidates_ind)\n",
    "            current_score+=(score_rev-score_as_is)\n",
    "            if current_score>best:\n",
    "                best=current_score\n",
    "                best_ordering=deepcopy(ordering)\n",
    "        else:\n",
    "            criterion=(2*explore_fact)**(score_rev-score_as_is)\n",
    "            if np.random.random()<=criterion:\n",
    "                swap(ordering,candidates_ind)\n",
    "                current_score+=(score_rev-score_as_is)\n",
    "    return best_ordering"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"p32\"></a>\n",
    "### 3.2 Evaluating and timing the algorithms"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Score</th>\n",
       "      <th>Time (seconds)</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Metropolis-Hastings Swapping</th>\n",
       "      <td>748</td>\n",
       "      <td>7.144304</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Min-Conflict</th>\n",
       "      <td>734</td>\n",
       "      <td>1.773097</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Random Swaps</th>\n",
       "      <td>728</td>\n",
       "      <td>20.773540</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Kwik-Sort (10,000 trials)</th>\n",
       "      <td>722</td>\n",
       "      <td>7.611491</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Convex relaxation</th>\n",
       "      <td>714</td>\n",
       "      <td>1.228070</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>PageRank (tuned epsilon)</th>\n",
       "      <td>660</td>\n",
       "      <td>0.081005</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Greedy Ranking</th>\n",
       "      <td>650</td>\n",
       "      <td>0.006000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                              Score  Time (seconds)\n",
       "Metropolis-Hastings Swapping    748        7.144304\n",
       "Min-Conflict                    734        1.773097\n",
       "Random Swaps                    728       20.773540\n",
       "Kwik-Sort (10,000 trials)       722        7.611491\n",
       "Convex relaxation               714        1.228070\n",
       "PageRank (tuned epsilon)        660        0.081005\n",
       "Greedy Ranking                  650        0.006000"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import time\n",
    "\n",
    "#Generating a random ordering\n",
    "np.random.seed(1)\n",
    "random_ordering=deepcopy(items)\n",
    "np.random.shuffle(random_ordering)\n",
    "\n",
    "start_time = time.time()\n",
    "greedy_rank=greedy_ranking(random_ordering,aggregated_preferences)\n",
    "time_greedy_rank=time.time() - start_time\n",
    "\n",
    "np.random.seed(1)\n",
    "random.seed(1)\n",
    "start_time = time.time()\n",
    "ks_rank=kwiksort_multiple_runs(random_ordering,aggregated_preferences,runs=10000)\n",
    "time_kwiksort=time.time() - start_time\n",
    "\n",
    "np.random.seed(1)\n",
    "random.seed(1)\n",
    "start_time = time.time()\n",
    "pr_rank=pagerank(random_ordering,aggregated_preferences)\n",
    "time_pagerank=time.time() - start_time\n",
    "\n",
    "np.random.seed(1)\n",
    "random.seed(1)\n",
    "start_time = time.time()\n",
    "cvxrelax_rank=cvx_relaxation(random_ordering,aggregated_preferences)\n",
    "time_cvxrelax=time.time() - start_time\n",
    "\n",
    "start_time = time.time()\n",
    "mc_rank=minconflict(random_ordering,aggregated_preferences)\n",
    "time_minconflict=time.time() - start_time\n",
    "\n",
    "np.random.seed(1)\n",
    "random.seed(1)\n",
    "start_time = time.time()\n",
    "rs_rank=random_swaps(random_ordering,aggregated_preferences,iterations=50000)\n",
    "time_random_swaps=time.time() - start_time\n",
    "\n",
    "np.random.seed(1)\n",
    "random.seed(1)\n",
    "start_time = time.time()\n",
    "mh_rank=metropolis_hastings(random_ordering,aggregated_preferences,iterations=50000,explore_fact=8)\n",
    "time_metropolis=time.time() - start_time\n",
    "\n",
    "lst_scores={'Greedy Ranking':list_score(greedy_rank,aggregated_preferences),\n",
    "            'Kwik-Sort (10,000 trials)':list_score(ks_rank,aggregated_preferences),\n",
    "            'PageRank (tuned epsilon)':list_score(pr_rank,aggregated_preferences),\n",
    "            'Convex relaxation':list_score(cvxrelax_rank,aggregated_preferences),\n",
    "            'Min-Conflict':list_score(mc_rank,aggregated_preferences),\n",
    "            'Random Swaps':list_score(rs_rank,aggregated_preferences),\n",
    "            'Metropolis-Hastings Swapping':list_score(mh_rank,aggregated_preferences)\n",
    "           }\n",
    "lst_times={'Greedy Ranking':time_greedy_rank,\n",
    "            'Kwik-Sort (10,000 trials)':time_kwiksort,\n",
    "            'PageRank (tuned epsilon)':time_pagerank,\n",
    "            'Convex relaxation':time_cvxrelax,\n",
    "            'Min-Conflict':time_minconflict,\n",
    "            'Random Swaps':time_random_swaps,\n",
    "            'Metropolis-Hastings Swapping':time_metropolis\n",
    "           }\n",
    "\n",
    "eval_df=pd.DataFrame.from_dict(lst_scores,orient='index').rename(columns={0:'Score'}).sort_values('Score',ascending=False)\n",
    "runtimes=pd.DataFrame.from_dict(lst_times,orient='index').rename(columns={0:'Time (seconds)'})\n",
    "eval_df.join(runtimes)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"p4\"></a>\n",
    "## 4. Comparing these orderings to the simulation's underlying order\n",
    "\n",
    "\n",
    "Since these incomplete pairwise preferences were generated from a complete ranked list, it’s possible to evaluate them also in terms of how similar the order that they output is to the real ranking (in this case, we know it because we generated these samples from it plus a small disturbance to account for user preferences, but in a real situation, this ranking wouldn’t be available).\n",
    "\n",
    "\n",
    "**Some good criteria for comparing one ranked list to another are:**\n",
    "\n",
    "* The fraction of pairs ranked in same order relative to each other (e.g. if A is ranked above B in both lists, that pair adds to their similarity, whereas if A is ranked above B in one while B is ranked above A in another, it doesn’t add to their similarity).\n",
    "* Symmetric AP correlation, described in [A new rank correlation coefficient for information retrieval](http://net.pku.edu.cn/~webg/src/paradise/reference/click-through%20evaluation/2008-SIGIR-A%20new%20rank%20correlation%20coefficient%20for%20information%20retrieval.pdf).\n",
    "\n",
    "** *\n",
    "<a id=\"p41\"></a>\n",
    "### 4.1 Defining the metrics"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "#implementing useful metrics\n",
    "def fraction_conc_pairs(list1,list2):\n",
    "    pairs_list1=set()\n",
    "    for i in range(len(list1)-1):\n",
    "        for j in range (i+1,len(list1)):\n",
    "            pairs_list1.add((list1[i],list1[j]))\n",
    "            \n",
    "    p=0\n",
    "    q=0\n",
    "    for i in range(len(list2)-1):\n",
    "        for j in range (i+1,len(list2)):\n",
    "            if (list2[i],list2[j]) in pairs_list1:\n",
    "                p+=1\n",
    "            else:\n",
    "                q+=1\n",
    "    return p/(p+q)\n",
    "\n",
    "def ap_cor(list1,list2):\n",
    "    pairs_list2=set()\n",
    "    p_prime=0\n",
    "    for i in range(len(list2)-1):\n",
    "        for j in range (i+1,len(list2)):\n",
    "            pairs_list2.add((list2[i],list2[j]))\n",
    "    \n",
    "    for i in range(1,len(list1)):\n",
    "        for j in range(i):\n",
    "            if (list1[j],list1[i]) in pairs_list2:\n",
    "                c=1\n",
    "            else:\n",
    "                c=0\n",
    "            p_prime+=c/i\n",
    "    p_prime=p_prime/(len(list1)-1)\n",
    "    return p_prime-(1-p_prime)\n",
    "\n",
    "def sym_ap_cor(list1,list2):\n",
    "    return (ap_cor(list1,list2)+ap_cor(list2,list1))/2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"p42\"></a>\n",
    "### 4.2 Evaluating the orderings"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Score</th>\n",
       "      <th>Time (seconds)</th>\n",
       "      <th>% concordant pairs w/generator</th>\n",
       "      <th>Symmetrical AP correlation w/generator</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Metropolis-Hastings Swapping</th>\n",
       "      <td>748</td>\n",
       "      <td>7.144304</td>\n",
       "      <td>86.67%</td>\n",
       "      <td>62.38%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Min-Conflict</th>\n",
       "      <td>734</td>\n",
       "      <td>1.773097</td>\n",
       "      <td>86.67%</td>\n",
       "      <td>66.71%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Random Swaps</th>\n",
       "      <td>728</td>\n",
       "      <td>20.773540</td>\n",
       "      <td>81.84%</td>\n",
       "      <td>54.92%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Kwik-Sort (10,000 trials)</th>\n",
       "      <td>722</td>\n",
       "      <td>7.611491</td>\n",
       "      <td>86.90%</td>\n",
       "      <td>68.55%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Convex relaxation</th>\n",
       "      <td>714</td>\n",
       "      <td>1.228070</td>\n",
       "      <td>86.44%</td>\n",
       "      <td>64.14%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>PageRank (tuned epsilon)</th>\n",
       "      <td>660</td>\n",
       "      <td>0.081005</td>\n",
       "      <td>83.45%</td>\n",
       "      <td>49.50%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Greedy Ranking</th>\n",
       "      <td>650</td>\n",
       "      <td>0.006000</td>\n",
       "      <td>79.31%</td>\n",
       "      <td>52.37%</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                              Score  Time (seconds)  \\\n",
       "Metropolis-Hastings Swapping    748        7.144304   \n",
       "Min-Conflict                    734        1.773097   \n",
       "Random Swaps                    728       20.773540   \n",
       "Kwik-Sort (10,000 trials)       722        7.611491   \n",
       "Convex relaxation               714        1.228070   \n",
       "PageRank (tuned epsilon)        660        0.081005   \n",
       "Greedy Ranking                  650        0.006000   \n",
       "\n",
       "                             % concordant pairs w/generator  \\\n",
       "Metropolis-Hastings Swapping                         86.67%   \n",
       "Min-Conflict                                         86.67%   \n",
       "Random Swaps                                         81.84%   \n",
       "Kwik-Sort (10,000 trials)                            86.90%   \n",
       "Convex relaxation                                    86.44%   \n",
       "PageRank (tuned epsilon)                             83.45%   \n",
       "Greedy Ranking                                       79.31%   \n",
       "\n",
       "                             Symmetrical AP correlation w/generator  \n",
       "Metropolis-Hastings Swapping                                 62.38%  \n",
       "Min-Conflict                                                 66.71%  \n",
       "Random Swaps                                                 54.92%  \n",
       "Kwik-Sort (10,000 trials)                                    68.55%  \n",
       "Convex relaxation                                            64.14%  \n",
       "PageRank (tuned epsilon)                                     49.50%  \n",
       "Greedy Ranking                                               52.37%  "
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "best_theoretical_order=list(np.argsort(-item_goodness))\n",
    "\n",
    "lst_conc_pairs={'Greedy Ranking':'{:.2%}'.format(fraction_conc_pairs(greedy_rank,best_theoretical_order)),\n",
    "            'Kwik-Sort (10,000 trials)':'{:.2%}'.format(fraction_conc_pairs(ks_rank,best_theoretical_order)),\n",
    "            'PageRank (tuned epsilon)':'{:.2%}'.format(fraction_conc_pairs(pr_rank,best_theoretical_order)),\n",
    "            'Convex relaxation':'{:.2%}'.format(fraction_conc_pairs(cvxrelax_rank,best_theoretical_order)),\n",
    "            'Min-Conflict':'{:.2%}'.format(fraction_conc_pairs(mc_rank,best_theoretical_order)),\n",
    "            'Random Swaps':'{:.2%}'.format(fraction_conc_pairs(rs_rank,best_theoretical_order)),\n",
    "            'Metropolis-Hastings Swapping':'{:.2%}'.format(fraction_conc_pairs(mh_rank,best_theoretical_order))\n",
    "           }\n",
    "lst_sym_ap_cor={'Greedy Ranking':'{:.2%}'.format(sym_ap_cor(greedy_rank,best_theoretical_order)),\n",
    "            'Kwik-Sort (10,000 trials)':'{:.2%}'.format(sym_ap_cor(ks_rank,best_theoretical_order)),\n",
    "            'PageRank (tuned epsilon)':'{:.2%}'.format(sym_ap_cor(pr_rank,best_theoretical_order)),\n",
    "            'Convex relaxation':'{:.2%}'.format(sym_ap_cor(cvxrelax_rank,best_theoretical_order)),\n",
    "            'Min-Conflict':'{:.2%}'.format(sym_ap_cor(mc_rank,best_theoretical_order)),\n",
    "            'Random Swaps':'{:.2%}'.format(sym_ap_cor(rs_rank,best_theoretical_order)),\n",
    "            'Metropolis-Hastings Swapping':'{:.2%}'.format(sym_ap_cor(mh_rank,best_theoretical_order))\n",
    "           }\n",
    "\n",
    "eval_df=pd.DataFrame.from_dict(lst_scores,orient='index').rename(columns={0:'Score'}).sort_values('Score',ascending=False)\n",
    "runtimes=pd.DataFrame.from_dict(lst_times,orient='index').rename(columns={0:'Time (seconds)'})\n",
    "fcp=pd.DataFrame.from_dict(lst_conc_pairs,orient='index').rename(columns={0:'% concordant pairs w/generator'})\n",
    "sapc=pd.DataFrame.from_dict(lst_sym_ap_cor,orient='index').rename(columns={0:'Symmetrical AP correlation w/generator'})\n",
    "eval_df.join(runtimes).join(fcp).join(sapc)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Despite achieving lower scores, here Kwik-Sort managed to produce an ordering that is more similar to the underlying order form which the pairwise preferences were simulated. Metropolis-Hastings and Min-Conflict were very close though, and the relative outcomes of algorithms would vary with a different simulation."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Taking a look at which elements do each put at the top and at the bottom – the numbers in each column reflect the item IDs and are ordered according to their rank under each ordering:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Convex relaxation</th>\n",
       "      <th>Greedy Ranking</th>\n",
       "      <th>Kwik-Sort (10,000 trials)</th>\n",
       "      <th>Metropolis-Hastings Swapping</th>\n",
       "      <th>Min-Conflict</th>\n",
       "      <th>PageRank</th>\n",
       "      <th>Random Swaps</th>\n",
       "      <th>Simulation's order</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>16</td>\n",
       "      <td>12</td>\n",
       "      <td>16</td>\n",
       "      <td>25</td>\n",
       "      <td>16</td>\n",
       "      <td>12</td>\n",
       "      <td>16</td>\n",
       "      <td>16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>13</td>\n",
       "      <td>10</td>\n",
       "      <td>25</td>\n",
       "      <td>13</td>\n",
       "      <td>18</td>\n",
       "      <td>18</td>\n",
       "      <td>15</td>\n",
       "      <td>18</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>12</td>\n",
       "      <td>16</td>\n",
       "      <td>18</td>\n",
       "      <td>16</td>\n",
       "      <td>12</td>\n",
       "      <td>19</td>\n",
       "      <td>25</td>\n",
       "      <td>12</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>18</td>\n",
       "      <td>25</td>\n",
       "      <td>12</td>\n",
       "      <td>18</td>\n",
       "      <td>19</td>\n",
       "      <td>10</td>\n",
       "      <td>18</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>25</td>\n",
       "      <td>18</td>\n",
       "      <td>10</td>\n",
       "      <td>12</td>\n",
       "      <td>10</td>\n",
       "      <td>0</td>\n",
       "      <td>12</td>\n",
       "      <td>25</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>0</td>\n",
       "      <td>24</td>\n",
       "      <td>13</td>\n",
       "      <td>10</td>\n",
       "      <td>13</td>\n",
       "      <td>25</td>\n",
       "      <td>19</td>\n",
       "      <td>14</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   Convex relaxation  Greedy Ranking  Kwik-Sort (10,000 trials)  \\\n",
       "0                 16              12                         16   \n",
       "1                 13              10                         25   \n",
       "2                 12              16                         18   \n",
       "3                 18              25                         12   \n",
       "4                 25              18                         10   \n",
       "5                  0              24                         13   \n",
       "\n",
       "   Metropolis-Hastings Swapping  Min-Conflict  PageRank  Random Swaps  \\\n",
       "0                            25            16        12            16   \n",
       "1                            13            18        18            15   \n",
       "2                            16            12        19            25   \n",
       "3                            18            19        10            18   \n",
       "4                            12            10         0            12   \n",
       "5                            10            13        25            19   \n",
       "\n",
       "   Simulation's order  \n",
       "0                  16  \n",
       "1                  18  \n",
       "2                  12  \n",
       "3                   4  \n",
       "4                  25  \n",
       "5                  14  "
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.DataFrame({\"Simulation's order\":best_theoretical_order,'Metropolis-Hastings Swapping':mh_rank,\n",
    "              'PageRank':pr_rank, 'Convex relaxation':cvxrelax_rank, 'Min-Conflict':mc_rank,\n",
    "              'Random Swaps':rs_rank,'Kwik-Sort (10,000 trials)':ks_rank,'Greedy Ranking':greedy_rank}).head(6)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Convex relaxation</th>\n",
       "      <th>Greedy Ranking</th>\n",
       "      <th>Kwik-Sort (10,000 trials)</th>\n",
       "      <th>Metropolis-Hastings Swapping</th>\n",
       "      <th>Min-Conflict</th>\n",
       "      <th>PageRank</th>\n",
       "      <th>Random Swaps</th>\n",
       "      <th>Simulation's order</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>24</th>\n",
       "      <td>22</td>\n",
       "      <td>23</td>\n",
       "      <td>11</td>\n",
       "      <td>29</td>\n",
       "      <td>22</td>\n",
       "      <td>23</td>\n",
       "      <td>22</td>\n",
       "      <td>11</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25</th>\n",
       "      <td>29</td>\n",
       "      <td>28</td>\n",
       "      <td>7</td>\n",
       "      <td>22</td>\n",
       "      <td>27</td>\n",
       "      <td>8</td>\n",
       "      <td>23</td>\n",
       "      <td>27</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>26</th>\n",
       "      <td>23</td>\n",
       "      <td>8</td>\n",
       "      <td>29</td>\n",
       "      <td>23</td>\n",
       "      <td>8</td>\n",
       "      <td>21</td>\n",
       "      <td>20</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>27</th>\n",
       "      <td>28</td>\n",
       "      <td>19</td>\n",
       "      <td>23</td>\n",
       "      <td>28</td>\n",
       "      <td>23</td>\n",
       "      <td>20</td>\n",
       "      <td>28</td>\n",
       "      <td>23</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>28</th>\n",
       "      <td>8</td>\n",
       "      <td>3</td>\n",
       "      <td>28</td>\n",
       "      <td>8</td>\n",
       "      <td>28</td>\n",
       "      <td>3</td>\n",
       "      <td>8</td>\n",
       "      <td>28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>29</th>\n",
       "      <td>3</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>28</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    Convex relaxation  Greedy Ranking  Kwik-Sort (10,000 trials)  \\\n",
       "24                 22              23                         11   \n",
       "25                 29              28                          7   \n",
       "26                 23               8                         29   \n",
       "27                 28              19                         23   \n",
       "28                  8               3                         28   \n",
       "29                  3               0                          3   \n",
       "\n",
       "    Metropolis-Hastings Swapping  Min-Conflict  PageRank  Random Swaps  \\\n",
       "24                            29            22        23            22   \n",
       "25                            22            27         8            23   \n",
       "26                            23             8        21            20   \n",
       "27                            28            23        20            28   \n",
       "28                             8            28         3             8   \n",
       "29                             3             3        28             3   \n",
       "\n",
       "    Simulation's order  \n",
       "24                  11  \n",
       "25                  27  \n",
       "26                   8  \n",
       "27                  23  \n",
       "28                  28  \n",
       "29                   3  "
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.DataFrame({\"Simulation's order\":best_theoretical_order,'Metropolis-Hastings Swapping':mh_rank,\n",
    "              'PageRank':pr_rank, 'Convex relaxation':cvxrelax_rank, 'Min-Conflict':mc_rank,\n",
    "              'Random Swaps':rs_rank,'Kwik-Sort (10,000 trials)':ks_rank,'Greedy Ranking':greedy_rank}).tail(6)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"p5\"></a>\n",
    "## 5. Ranking a larger list\n",
    "\n",
    "The previous list only had 30 elements, and while this makes it infeasible to order by brute-force search, the problem is still smallish. Not all algorithms scale well with a bigger sample size.\n",
    "\n",
    "Now doing the same but with a bigger simulation:\n",
    "* 250 items.\n",
    "* 3,000 sampled users (from an infinite population) with 5 preferences each.\n",
    "* Same distribution for sampling bias, item goodness and variation in user preference.\n",
    "\n",
    "This time there will only be information about pairwise preferences for 1/3 of the pairs of items. Since Min-Conflict scales poorly with the number of items, it won't be tried this time."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Theoretical maximum score: 12716\n",
      "Number of pairwise preferences observed: 10067\n",
      "Number of pairs in the list: 31125\n"
     ]
    }
   ],
   "source": [
    "nitems=250\n",
    "aggregated_preferences=defaultdict(lambda: 0)\n",
    "\n",
    "np.random.seed(123)\n",
    "items=[i for i in range(nitems)]\n",
    "item_probs=np.random.beta(a=2,b=2,size=nitems)\n",
    "item_probs=item_probs/np.sum(item_probs)\n",
    "item_goodness=np.random.beta(a=3,b=3,size=nitems)\n",
    "user_preferences=[np.random.beta(a=4,b=4,size=nitems)*item_goodness for u in range(nusers)]\n",
    "agg_item_goodness=np.zeros((nitems))\n",
    "for u in range(nusers):\n",
    "    agg_item_goodness+=user_preferences[u]\n",
    "\n",
    "for iteration in range(3000):\n",
    "    prefs_user=np.random.beta(a=4,b=4,size=nitems)*item_goodness\n",
    "    for sample in range(5):\n",
    "        chosen_items=np.random.choice(items,size=2,replace=False,p=item_probs)\n",
    "        if chosen_items[0]==chosen_items[1]:\n",
    "            continue\n",
    "        goodness_A=prefs_user[chosen_items[0]]\n",
    "        goodness_B=prefs_user[chosen_items[1]]\n",
    "        if goodness_A>goodness_B:\n",
    "            add_preferences(chosen_items[0],chosen_items[1])\n",
    "        else:\n",
    "            add_preferences(chosen_items[1],chosen_items[0])\n",
    "            \n",
    "print('Theoretical maximum score:',np.sum([np.abs(v) for v in aggregated_preferences.values()]))\n",
    "print('Number of pairwise preferences observed:',len([v for v in aggregated_preferences.values() if v!=0]))\n",
    "print('Number of pairs in the list:',int(nitems*(nitems-1)/2))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Score</th>\n",
       "      <th>Time (seconds)</th>\n",
       "      <th>% concordant pairs w/generator</th>\n",
       "      <th>Symmetrical AP correlation w/generator</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Metropolis-Hastings Swapping</th>\n",
       "      <td>9102</td>\n",
       "      <td>97.669168</td>\n",
       "      <td>89.52%</td>\n",
       "      <td>65.42%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Random Swaps</th>\n",
       "      <td>9006</td>\n",
       "      <td>146.839168</td>\n",
       "      <td>88.46%</td>\n",
       "      <td>62.47%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Convex relaxation</th>\n",
       "      <td>8396</td>\n",
       "      <td>159.802713</td>\n",
       "      <td>92.52%</td>\n",
       "      <td>73.78%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>PageRank (tuned epsilon)</th>\n",
       "      <td>7568</td>\n",
       "      <td>2.920900</td>\n",
       "      <td>82.59%</td>\n",
       "      <td>49.81%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Greedy Ranking</th>\n",
       "      <td>7292</td>\n",
       "      <td>2.895627</td>\n",
       "      <td>77.77%</td>\n",
       "      <td>45.54%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Kwik-Sort (1,000 trials)</th>\n",
       "      <td>6120</td>\n",
       "      <td>28.923745</td>\n",
       "      <td>70.54%</td>\n",
       "      <td>28.62%</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                              Score  Time (seconds)  \\\n",
       "Metropolis-Hastings Swapping   9102       97.669168   \n",
       "Random Swaps                   9006      146.839168   \n",
       "Convex relaxation              8396      159.802713   \n",
       "PageRank (tuned epsilon)       7568        2.920900   \n",
       "Greedy Ranking                 7292        2.895627   \n",
       "Kwik-Sort (1,000 trials)       6120       28.923745   \n",
       "\n",
       "                             % concordant pairs w/generator  \\\n",
       "Metropolis-Hastings Swapping                         89.52%   \n",
       "Random Swaps                                         88.46%   \n",
       "Convex relaxation                                    92.52%   \n",
       "PageRank (tuned epsilon)                             82.59%   \n",
       "Greedy Ranking                                       77.77%   \n",
       "Kwik-Sort (1,000 trials)                             70.54%   \n",
       "\n",
       "                             Symmetrical AP correlation w/generator  \n",
       "Metropolis-Hastings Swapping                                 65.42%  \n",
       "Random Swaps                                                 62.47%  \n",
       "Convex relaxation                                            73.78%  \n",
       "PageRank (tuned epsilon)                                     49.81%  \n",
       "Greedy Ranking                                               45.54%  \n",
       "Kwik-Sort (1,000 trials)                                     28.62%  "
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#Generating a random ordering\n",
    "np.random.seed(1)\n",
    "random_ordering=deepcopy(items)\n",
    "np.random.shuffle(random_ordering)\n",
    "\n",
    "start_time = time.time()\n",
    "greedy_rank=greedy_ranking(random_ordering,aggregated_preferences)\n",
    "time_greedy_rank=time.time() - start_time\n",
    "\n",
    "np.random.seed(1)\n",
    "random.seed(1)\n",
    "start_time = time.time()\n",
    "ks_rank=kwiksort_multiple_runs(random_ordering,aggregated_preferences,runs=1000)\n",
    "time_kwiksort=time.time() - start_time\n",
    "\n",
    "np.random.seed(1)\n",
    "random.seed(1)\n",
    "start_time = time.time()\n",
    "pr_rank=pagerank(random_ordering,aggregated_preferences)\n",
    "time_pagerank=time.time() - start_time\n",
    "\n",
    "np.random.seed(1)\n",
    "random.seed(1)\n",
    "start_time = time.time()\n",
    "cvxrelax_rank=cvx_relaxation(random_ordering,aggregated_preferences)\n",
    "time_cvxrelax=time.time() - start_time\n",
    "\n",
    "np.random.seed(1)\n",
    "random.seed(1)\n",
    "start_time = time.time()\n",
    "rs_rank=random_swaps(random_ordering,aggregated_preferences,iterations=50000)\n",
    "time_random_swaps=time.time() - start_time\n",
    "\n",
    "np.random.seed(1)\n",
    "random.seed(1)\n",
    "start_time = time.time()\n",
    "mh_rank=metropolis_hastings(random_ordering,aggregated_preferences,iterations=100000,explore_fact=8)\n",
    "time_metropolis=time.time() - start_time\n",
    "\n",
    "best_theoretical_order=list(np.argsort(-item_goodness))\n",
    "\n",
    "lst_scores={'Greedy Ranking':list_score(greedy_rank,aggregated_preferences),\n",
    "            'Kwik-Sort (1,000 trials)':list_score(ks_rank,aggregated_preferences),\n",
    "            'PageRank (tuned epsilon)':list_score(pr_rank,aggregated_preferences),\n",
    "            'Convex relaxation':list_score(cvxrelax_rank,aggregated_preferences),\n",
    "            'Random Swaps':list_score(rs_rank,aggregated_preferences),\n",
    "            'Metropolis-Hastings Swapping':list_score(mh_rank,aggregated_preferences)\n",
    "           }\n",
    "lst_times={'Greedy Ranking':time_greedy_rank,\n",
    "            'Kwik-Sort (1,000 trials)':time_kwiksort,\n",
    "            'PageRank (tuned epsilon)':time_pagerank,\n",
    "            'Convex relaxation':time_cvxrelax,\n",
    "            'Random Swaps':time_random_swaps,\n",
    "            'Metropolis-Hastings Swapping':time_metropolis\n",
    "           }\n",
    "\n",
    "lst_conc_pairs={'Greedy Ranking':'{:.2%}'.format(fraction_conc_pairs(greedy_rank,best_theoretical_order)),\n",
    "            'Kwik-Sort (1,000 trials)':'{:.2%}'.format(fraction_conc_pairs(ks_rank,best_theoretical_order)),\n",
    "            'PageRank (tuned epsilon)':'{:.2%}'.format(fraction_conc_pairs(pr_rank,best_theoretical_order)),\n",
    "            'Convex relaxation':'{:.2%}'.format(fraction_conc_pairs(cvxrelax_rank,best_theoretical_order)),\n",
    "            'Random Swaps':'{:.2%}'.format(fraction_conc_pairs(rs_rank,best_theoretical_order)),\n",
    "            'Metropolis-Hastings Swapping':'{:.2%}'.format(fraction_conc_pairs(mh_rank,best_theoretical_order))\n",
    "           }\n",
    "lst_sym_ap_cor={'Greedy Ranking':'{:.2%}'.format(sym_ap_cor(greedy_rank,best_theoretical_order)),\n",
    "            'Kwik-Sort (1,000 trials)':'{:.2%}'.format(sym_ap_cor(ks_rank,best_theoretical_order)),\n",
    "            'PageRank (tuned epsilon)':'{:.2%}'.format(sym_ap_cor(pr_rank,best_theoretical_order)),\n",
    "            'Convex relaxation':'{:.2%}'.format(sym_ap_cor(cvxrelax_rank,best_theoretical_order)),\n",
    "            'Random Swaps':'{:.2%}'.format(sym_ap_cor(rs_rank,best_theoretical_order)),\n",
    "            'Metropolis-Hastings Swapping':'{:.2%}'.format(sym_ap_cor(mh_rank,best_theoretical_order))\n",
    "           }\n",
    "\n",
    "eval_df=pd.DataFrame.from_dict(lst_scores,orient='index').rename(columns={0:'Score'}).sort_values('Score',ascending=False)\n",
    "runtimes=pd.DataFrame.from_dict(lst_times,orient='index').rename(columns={0:'Time (seconds)'})\n",
    "fcp=pd.DataFrame.from_dict(lst_conc_pairs,orient='index').rename(columns={0:'% concordant pairs w/generator'})\n",
    "sapc=pd.DataFrame.from_dict(lst_sym_ap_cor,orient='index').rename(columns={0:'Symmetrical AP correlation w/generator'})\n",
    "eval_df.join(runtimes).join(fcp).join(sapc)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This time, Kwik-Sort didn't fare so well with this larger and more sparse input data, and the convex relaxation of the problem with l1 penalty seemed to reconstruct an order closest to the original. Metropolis-Hastings again did well both in terms of score and correlation with the real order behind the simulation."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Again taking a look at which elements do each put at the top and at the bottom – the numbers in each column reflect the item IDs and are ordered according to their rank under each ordering:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Convex relaxation</th>\n",
       "      <th>Greedy Ranking</th>\n",
       "      <th>Kwik-Sort (10,000 trials)</th>\n",
       "      <th>Metropolis-Hastings Swapping</th>\n",
       "      <th>PageRank</th>\n",
       "      <th>Random Swaps</th>\n",
       "      <th>Simulation's order</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>222</td>\n",
       "      <td>8</td>\n",
       "      <td>193</td>\n",
       "      <td>105</td>\n",
       "      <td>52</td>\n",
       "      <td>222</td>\n",
       "      <td>50</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>77</td>\n",
       "      <td>7</td>\n",
       "      <td>177</td>\n",
       "      <td>198</td>\n",
       "      <td>152</td>\n",
       "      <td>124</td>\n",
       "      <td>232</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>207</td>\n",
       "      <td>203</td>\n",
       "      <td>1</td>\n",
       "      <td>207</td>\n",
       "      <td>206</td>\n",
       "      <td>77</td>\n",
       "      <td>152</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>107</td>\n",
       "      <td>127</td>\n",
       "      <td>198</td>\n",
       "      <td>124</td>\n",
       "      <td>233</td>\n",
       "      <td>161</td>\n",
       "      <td>77</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>233</td>\n",
       "      <td>232</td>\n",
       "      <td>232</td>\n",
       "      <td>226</td>\n",
       "      <td>97</td>\n",
       "      <td>226</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>50</td>\n",
       "      <td>39</td>\n",
       "      <td>71</td>\n",
       "      <td>16</td>\n",
       "      <td>50</td>\n",
       "      <td>16</td>\n",
       "      <td>36</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>8</td>\n",
       "      <td>92</td>\n",
       "      <td>168</td>\n",
       "      <td>107</td>\n",
       "      <td>77</td>\n",
       "      <td>236</td>\n",
       "      <td>86</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>170</td>\n",
       "      <td>226</td>\n",
       "      <td>52</td>\n",
       "      <td>77</td>\n",
       "      <td>186</td>\n",
       "      <td>207</td>\n",
       "      <td>16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>31</td>\n",
       "      <td>231</td>\n",
       "      <td>55</td>\n",
       "      <td>1</td>\n",
       "      <td>222</td>\n",
       "      <td>233</td>\n",
       "      <td>39</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>16</td>\n",
       "      <td>242</td>\n",
       "      <td>124</td>\n",
       "      <td>233</td>\n",
       "      <td>6</td>\n",
       "      <td>227</td>\n",
       "      <td>233</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>97</td>\n",
       "      <td>154</td>\n",
       "      <td>145</td>\n",
       "      <td>39</td>\n",
       "      <td>212</td>\n",
       "      <td>107</td>\n",
       "      <td>222</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>25</td>\n",
       "      <td>165</td>\n",
       "      <td>41</td>\n",
       "      <td>232</td>\n",
       "      <td>0</td>\n",
       "      <td>79</td>\n",
       "      <td>203</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>232</td>\n",
       "      <td>34</td>\n",
       "      <td>79</td>\n",
       "      <td>50</td>\n",
       "      <td>170</td>\n",
       "      <td>97</td>\n",
       "      <td>207</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>36</td>\n",
       "      <td>161</td>\n",
       "      <td>80</td>\n",
       "      <td>8</td>\n",
       "      <td>1</td>\n",
       "      <td>170</td>\n",
       "      <td>167</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>39</td>\n",
       "      <td>107</td>\n",
       "      <td>0</td>\n",
       "      <td>31</td>\n",
       "      <td>16</td>\n",
       "      <td>60</td>\n",
       "      <td>236</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    Convex relaxation  Greedy Ranking  Kwik-Sort (10,000 trials)  \\\n",
       "0                 222               8                        193   \n",
       "1                  77               7                        177   \n",
       "2                 207             203                          1   \n",
       "3                 107             127                        198   \n",
       "4                 233             232                        232   \n",
       "5                  50              39                         71   \n",
       "6                   8              92                        168   \n",
       "7                 170             226                         52   \n",
       "8                  31             231                         55   \n",
       "9                  16             242                        124   \n",
       "10                 97             154                        145   \n",
       "11                 25             165                         41   \n",
       "12                232              34                         79   \n",
       "13                 36             161                         80   \n",
       "14                 39             107                          0   \n",
       "\n",
       "    Metropolis-Hastings Swapping  PageRank  Random Swaps  Simulation's order  \n",
       "0                            105        52           222                  50  \n",
       "1                            198       152           124                 232  \n",
       "2                            207       206            77                 152  \n",
       "3                            124       233           161                  77  \n",
       "4                            226        97           226                   8  \n",
       "5                             16        50            16                  36  \n",
       "6                            107        77           236                  86  \n",
       "7                             77       186           207                  16  \n",
       "8                              1       222           233                  39  \n",
       "9                            233         6           227                 233  \n",
       "10                            39       212           107                 222  \n",
       "11                           232         0            79                 203  \n",
       "12                            50       170            97                 207  \n",
       "13                             8         1           170                 167  \n",
       "14                            31        16            60                 236  "
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.DataFrame({\"Simulation's order\":best_theoretical_order,'Metropolis-Hastings Swapping':mh_rank,\n",
    "              'PageRank':pr_rank, 'Convex relaxation':cvxrelax_rank,\n",
    "              'Random Swaps':rs_rank,'Kwik-Sort (10,000 trials)':ks_rank,'Greedy Ranking':greedy_rank}).head(15)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Convex relaxation</th>\n",
       "      <th>Greedy Ranking</th>\n",
       "      <th>Kwik-Sort (10,000 trials)</th>\n",
       "      <th>Metropolis-Hastings Swapping</th>\n",
       "      <th>PageRank</th>\n",
       "      <th>Random Swaps</th>\n",
       "      <th>Simulation's order</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>235</th>\n",
       "      <td>229</td>\n",
       "      <td>32</td>\n",
       "      <td>121</td>\n",
       "      <td>239</td>\n",
       "      <td>239</td>\n",
       "      <td>125</td>\n",
       "      <td>64</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>236</th>\n",
       "      <td>157</td>\n",
       "      <td>78</td>\n",
       "      <td>239</td>\n",
       "      <td>213</td>\n",
       "      <td>12</td>\n",
       "      <td>239</td>\n",
       "      <td>32</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>237</th>\n",
       "      <td>47</td>\n",
       "      <td>51</td>\n",
       "      <td>84</td>\n",
       "      <td>35</td>\n",
       "      <td>197</td>\n",
       "      <td>35</td>\n",
       "      <td>125</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>238</th>\n",
       "      <td>64</td>\n",
       "      <td>220</td>\n",
       "      <td>175</td>\n",
       "      <td>159</td>\n",
       "      <td>173</td>\n",
       "      <td>157</td>\n",
       "      <td>178</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>239</th>\n",
       "      <td>32</td>\n",
       "      <td>97</td>\n",
       "      <td>125</td>\n",
       "      <td>64</td>\n",
       "      <td>171</td>\n",
       "      <td>184</td>\n",
       "      <td>15</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>240</th>\n",
       "      <td>239</td>\n",
       "      <td>2</td>\n",
       "      <td>139</td>\n",
       "      <td>118</td>\n",
       "      <td>184</td>\n",
       "      <td>118</td>\n",
       "      <td>157</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>241</th>\n",
       "      <td>178</td>\n",
       "      <td>248</td>\n",
       "      <td>85</td>\n",
       "      <td>157</td>\n",
       "      <td>229</td>\n",
       "      <td>159</td>\n",
       "      <td>159</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>242</th>\n",
       "      <td>125</td>\n",
       "      <td>15</td>\n",
       "      <td>137</td>\n",
       "      <td>184</td>\n",
       "      <td>157</td>\n",
       "      <td>178</td>\n",
       "      <td>199</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>243</th>\n",
       "      <td>199</td>\n",
       "      <td>206</td>\n",
       "      <td>230</td>\n",
       "      <td>178</td>\n",
       "      <td>104</td>\n",
       "      <td>69</td>\n",
       "      <td>230</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>244</th>\n",
       "      <td>15</td>\n",
       "      <td>80</td>\n",
       "      <td>185</td>\n",
       "      <td>32</td>\n",
       "      <td>47</td>\n",
       "      <td>32</td>\n",
       "      <td>118</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>245</th>\n",
       "      <td>213</td>\n",
       "      <td>52</td>\n",
       "      <td>181</td>\n",
       "      <td>15</td>\n",
       "      <td>118</td>\n",
       "      <td>51</td>\n",
       "      <td>229</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>246</th>\n",
       "      <td>118</td>\n",
       "      <td>213</td>\n",
       "      <td>210</td>\n",
       "      <td>173</td>\n",
       "      <td>51</td>\n",
       "      <td>64</td>\n",
       "      <td>213</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>247</th>\n",
       "      <td>69</td>\n",
       "      <td>209</td>\n",
       "      <td>178</td>\n",
       "      <td>69</td>\n",
       "      <td>178</td>\n",
       "      <td>213</td>\n",
       "      <td>51</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>248</th>\n",
       "      <td>209</td>\n",
       "      <td>79</td>\n",
       "      <td>95</td>\n",
       "      <td>51</td>\n",
       "      <td>69</td>\n",
       "      <td>140</td>\n",
       "      <td>69</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>249</th>\n",
       "      <td>51</td>\n",
       "      <td>140</td>\n",
       "      <td>209</td>\n",
       "      <td>209</td>\n",
       "      <td>209</td>\n",
       "      <td>209</td>\n",
       "      <td>209</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     Convex relaxation  Greedy Ranking  Kwik-Sort (10,000 trials)  \\\n",
       "235                229              32                        121   \n",
       "236                157              78                        239   \n",
       "237                 47              51                         84   \n",
       "238                 64             220                        175   \n",
       "239                 32              97                        125   \n",
       "240                239               2                        139   \n",
       "241                178             248                         85   \n",
       "242                125              15                        137   \n",
       "243                199             206                        230   \n",
       "244                 15              80                        185   \n",
       "245                213              52                        181   \n",
       "246                118             213                        210   \n",
       "247                 69             209                        178   \n",
       "248                209              79                         95   \n",
       "249                 51             140                        209   \n",
       "\n",
       "     Metropolis-Hastings Swapping  PageRank  Random Swaps  Simulation's order  \n",
       "235                           239       239           125                  64  \n",
       "236                           213        12           239                  32  \n",
       "237                            35       197            35                 125  \n",
       "238                           159       173           157                 178  \n",
       "239                            64       171           184                  15  \n",
       "240                           118       184           118                 157  \n",
       "241                           157       229           159                 159  \n",
       "242                           184       157           178                 199  \n",
       "243                           178       104            69                 230  \n",
       "244                            32        47            32                 118  \n",
       "245                            15       118            51                 229  \n",
       "246                           173        51            64                 213  \n",
       "247                            69       178           213                  51  \n",
       "248                            51        69           140                  69  \n",
       "249                           209       209           209                 209  "
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.DataFrame({\"Simulation's order\":best_theoretical_order,'Metropolis-Hastings Swapping':mh_rank,\n",
    "              'PageRank':pr_rank, 'Convex relaxation':cvxrelax_rank,\n",
    "              'Random Swaps':rs_rank,'Kwik-Sort (10,000 trials)':ks_rank,'Greedy Ranking':greedy_rank}).tail(15)"
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python [Root]",
   "language": "python",
   "name": "Python [Root]"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}