{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Recommendations with IBM\n", "\n", "In this notebook, you will be putting your recommendation skills to use on real data from the IBM Watson Studio platform. \n", "\n", "\n", "You may either submit your notebook through the workspace here, or you may work from your local machine and submit through the next page. Either way, make sure that your code passes the project [RUBRIC](https://review.udacity.com/#!/rubrics/2322/view). **Please save regularly.**\n", "\n", "By following the table of contents, you will build out a number of different methods for making recommendations that can be used for different situations. \n", "\n", "\n", "## Table of Contents\n", "\n", "I. [Exploratory Data Analysis](#Exploratory-Data-Analysis)
\n", "II. [Rank Based Recommendations](#Rank)
\n", "III. [User-User Based Collaborative Filtering](#User-User)
\n", "IV. [Content Based Recommendations (EXTRA - NOT REQUIRED)](#Content-Recs)
\n", "V. [Matrix Factorization](#Matrix-Fact)
\n", "VI. [Extras & Concluding](#conclusions)\n", "\n", "At the end of the notebook, you will find directions for how to submit your work. Let's get started by importing the necessary libraries and reading in the data." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
article_idtitleemail
01430.0using pixiedust for fast, flexible, and easier...ef5f11f77ba020cd36e1105a00ab868bbdbf7fe7
11314.0healthcare python streaming application demo083cbdfa93c8444beaa4c5f5e0f5f9198e4f9e0b
21429.0use deep learning for image classificationb96a4f2e92d8572034b1e9b28f9ac673765cd074
31338.0ml optimization using cognitive assistant06485706b34a5c9bf2a0ecdac41daf7e7654ceb7
41276.0deploy your python model as a restful apif01220c46fc92c6e6b161b1849de11faacd7ccb2
\n", "
" ], "text/plain": [ " article_id title \\\n", "0 1430.0 using pixiedust for fast, flexible, and easier... \n", "1 1314.0 healthcare python streaming application demo \n", "2 1429.0 use deep learning for image classification \n", "3 1338.0 ml optimization using cognitive assistant \n", "4 1276.0 deploy your python model as a restful api \n", "\n", " email \n", "0 ef5f11f77ba020cd36e1105a00ab868bbdbf7fe7 \n", "1 083cbdfa93c8444beaa4c5f5e0f5f9198e4f9e0b \n", "2 b96a4f2e92d8572034b1e9b28f9ac673765cd074 \n", "3 06485706b34a5c9bf2a0ecdac41daf7e7654ceb7 \n", "4 f01220c46fc92c6e6b161b1849de11faacd7ccb2 " ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import project_tests as t\n", "import pickle\n", "import itertools\n", "\n", "%matplotlib inline\n", "\n", "user_interacts = pd.read_csv('data/user-item-interactions.csv').iloc[:, 1:]\n", "articles = pd.read_csv('data/articles_community.csv').iloc[:, 1:]\n", "\n", "# Show df to get an idea of the data\n", "user_interacts.head()" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
article_idtitleemail
01430.0using pixiedust for fast, flexible, and easier...ef5f11f77ba020cd36e1105a00ab868bbdbf7fe7
11314.0healthcare python streaming application demo083cbdfa93c8444beaa4c5f5e0f5f9198e4f9e0b
21429.0use deep learning for image classificationb96a4f2e92d8572034b1e9b28f9ac673765cd074
31338.0ml optimization using cognitive assistant06485706b34a5c9bf2a0ecdac41daf7e7654ceb7
41276.0deploy your python model as a restful apif01220c46fc92c6e6b161b1849de11faacd7ccb2
\n", "
" ], "text/plain": [ " article_id title \\\n", "0 1430.0 using pixiedust for fast, flexible, and easier... \n", "1 1314.0 healthcare python streaming application demo \n", "2 1429.0 use deep learning for image classification \n", "3 1338.0 ml optimization using cognitive assistant \n", "4 1276.0 deploy your python model as a restful api \n", "\n", " email \n", "0 ef5f11f77ba020cd36e1105a00ab868bbdbf7fe7 \n", "1 083cbdfa93c8444beaa4c5f5e0f5f9198e4f9e0b \n", "2 b96a4f2e92d8572034b1e9b28f9ac673765cd074 \n", "3 06485706b34a5c9bf2a0ecdac41daf7e7654ceb7 \n", "4 f01220c46fc92c6e6b161b1849de11faacd7ccb2 " ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Show df_content to get an idea of the data\n", "articles.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Part I : Exploratory Data Analysis\n", "\n", "Use the dictionary and cells below to provide some insight into the descriptive statistics of the data.\n", "\n", "`1.` What is the distribution of how many articles a user interacts with in the dataset? Provide a visual and descriptive statistics to assist with giving a look at the number of times each user interacts with an article. " ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "Each user is represented as a unique email, we could use describe to find out the descriptive statistics" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
email
count5148.000000
mean8.930847
std16.802267
min1.000000
25%1.000000
50%3.000000
75%9.000000
max364.000000
\n", "
" ], "text/plain": [ " email\n", "count 5148.000000\n", "mean 8.930847\n", "std 16.802267\n", "min 1.000000\n", "25% 1.000000\n", "50% 3.000000\n", "75% 9.000000\n", "max 364.000000" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "num_interactions = user_interacts['email'].value_counts()\n", "pd.DataFrame(num_interactions.describe())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Use histogram to visualize the distributions of number of interactions a user have with contents in this data set" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Text(0.5,1,'distributions of number of interactions a user have with contents')" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAmAAAAGrCAYAAABnrCs6AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4wLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvpW3flQAAIABJREFUeJzt3Xu4ZFV95vHvy51RRkBawi00KlEx\niUhaJNFJiBhuJgF9ZILJKBoSTIITndExaDIRjUwwz6iJ83gZjATwhsRLJEqiREVHE4HGINgg0gJK\n0wjNHSUSwd/8sdeB4nSdW/fpdfr0+X6e5zxn19qr9l571aqqt/alKlWFJEmS+tlqoRsgSZK01BjA\nJEmSOjOASZIkdWYAkyRJ6swAJkmS1JkBTJIkqTMD2BKW5Kwkb27T/ynJNfO47H9IckKbfmmSL8/j\nsn8ryWfna3kbK8mOSf4+yd1J/naB23Jqkg8s4PrfnOS2JN8bM29ex1hPm9uY21wkOTTJmoVux2wl\nWZXk0GnmX5Tkdzo2SUuYAUwAVNX/q6onzVRvtm/wVXVUVZ29se1KsjxJJdlmZNkfrKrDN3bZ8+iF\nwO7AY6vquIVuzEJJsg/wauCAqvqJyfNnO8bashbsjX2RjDltgKp6alVdBAv/YWUqSW5I8tx5WtZ6\nY1mbDwOY5lUGS21c7Qt8q6oeWOiGzKcNeNHeF7i9qm7dFO2ZC99w5p99Ks2vpfZGuaQleXqSryW5\nN8lHgB1G5j1ij0OSP0pyU6t7TZLDkhwJvB74jSTfT/L1VveiJKcl+QpwH/D4Mbvyk+T/tMN030xy\n2MiMR3zim/TJ9Evt/11tnT8/+ZBmkl9Icmlb9qVJfmFk3kVJ/izJV9q2fDbJbm3eDkk+kOT2JHe1\n++4+Rd89pS3rrnYY49db+RuBPx3pkxPH3PfUJOclOae1YVWSFSPzK8kTR26PHho+NMmaJK9NcmuS\nm5Mcm+ToJN9KckeS109a5Q5JPtLW9bUkTxtZ9p5JPpZkXZLrk/zhpHZ+tPXJPcBLx2zLY9p2rEvy\nnSR/kmSr9vhdCOzZ+uGsMfedPMZuSPKaJFe0x+4j7TF5FPAPI8v6fmv3V
klOSfLt9pidl2TXtqyJ\nT/onJvku8PlW/rdJvteW/6UkTx1Z/45J3tq24+4kX06yI5vPmJvY1nuTXJXk+ePqtboPjZkp+nq9\n53Mrn3OfTrH+V4+Mz5eNlD8vyb8muSfJjUlOHZn3j0leMWk5X0/ygjb95CQXtjF+TZL/PMW6fznJ\nlSO3/ynJJSO3v5zk2DZ9Q5LnZorXsmbfcY/dFOs+Jsnlbfu+3ZY78Tw7v7V9dZLfHbnPlK8HSd4P\n/CTw961dr23lhyT55zZmvp6Rw6jTjTfGj+UnJvliG7u3ZXgv0EKoKv+WwB+wHfAd4L8B2zIcNvsR\n8OY2/1BgTZt+EnAjsGe7vRx4Qps+FfjApGVfBHwXeCqwTVv+RcDvtPkvBR4YWfdvAHcDu7b5NwDP\nHVneQ+to6y5gm5H5LwW+3KZ3Be4EXtzW/aJ2+7Ejbfs28FPAju326W3ey4G/B/4DsDXwc8B/HNN3\n2wKrGV6wtwOeA9wLPGmqPpl0/1OBHwJHt/X8OfDVkfkFPHHk9lmTHpcHGELetsDvAuuADwE7tT7/\nIfD4kXX9qD2+2wKvAa5v01sBl7VlbQc8HrgOOGLSfY9tdXccsy3nAJ9s614OfAs4cfIYmqIfHjG/\nPe6XAHu2x/Fq4PemWhbwKuCrwN7A9sD/BT48aZycAzxqou3Ab7e2bg/8JXD5yPLe2cbDXu1x+YVW\nb2JZCzbmWt3jWt9sxfCc+QGwxxR1z6KNmTk+n+fcp2Me0weANzGMsaMZPoTtMjL/Z9o2/CxwC3Bs\nm/cS4CsjyzoAuKu141GtzS9rfXwQcBvw1DFt2AH4N2C3Vvd7wNr2uO/Y5k08NjfQXmuY+rVs7GM3\nZr0HM7yO/Urbvr2AJ7d5XwTe1dp2IMNz9rBZvh481MZ2ey/g9lZ/q7a+24FlsxhvE4/h6Fj+MPDH\nbVk7AM+ez/ca/2b/5x6wpeMQhhfIv6yqH1XVR4FLp6j7IMOL4AFJtq2qG6rq2zMs/6yqWlVVD1TV\nj8bMv3Vk3R8BrgGet4HbMup5wLVV9f627g8D3wR+baTO31TVt6rq34DzGF4QYQgbj2UIPw9W1WVV\ndc+YdRwCPJrhRe3fq+rzwKcY3nhn68tVdUFVPQi8H3jaTHcY8SPgtNav5zK80fxVVd1bVauAVQxv\nbhMuq6qPtvpvY3iRPQR4BsOL9pvadlwHvBc4fuS+/1JVf1dVP2799ZAkWzMEgde1dd8AvJUhiGyo\nd1TV2qq6gyGYHDhN3ZcDf1xVa6rqfoY3shfmkYfGTq2qH0y0varObG2dqP+0DHvxtmIIZ6+sqpva\n4//Prd5Meow5qupvW9/8uD1nrmV405+r6Z7Pc+7TMX4EvKk9ty8Avs8Q+qiqi6rqyrYNVzC8+f9S\nu98ngAOT7Ntu/xbw8daOXwVuqKq/aX38NeBjDB8sHqGqfgisBH4RWAFcAXwZeBbDuL+2qm6fQ39N\n9dhNdiJwZlVd2Lbvpqr6ZoZzIZ8N/FFV/bCqLgf+mkc+T+byevBfgAta/R9X1YVte4/egDbD8Hjt\nyxDIf1hV83aBlObGALZ07AncVFWjv77+nXEVq2o1wyfjU4Fbk5ybZM8Zln/jDPPHrXumZc7Gnqy/\nHd9h+NQ4YfSKvPsYwhQML3yfAc5NsjbJXyTZdop13FhVP55mHTOZ3IYdMvtzam5vL9QwfJqHYU8C\nI2WPHrn90GPR2ryGYRv2ZTisd9fEH8Nevd3H3XeM3Xh4T+qEufbDZFM9NuPsC3xipO1XM4SLse1P\nsnWS09uhoXsY9izAsB27MQTTmT5YjNNjzJHkJe3w1sT2/nRr95zM8HyeU59O4fZ65PmPD21vkmcm\n+UKGQ9Z3A783sQ1VdS/waR7+AHA88MGRdj1z0lj9LWC9izuaLzLsbfvFNn0RQ9D7pXZ7LmY7Jvdh\n/PjZE7ijbd+EmcbHdK8H+wLHTeqLZ
wN7bECbAV4LBLikHf787WnqahMygC0dNwN7JclI2U9OVbmq\nPlRVz2Z48hfwlolZU91lhvWPW/faNv0DhkMyE0ZfZGda7trWxlE/Cdw0w/1on9jfWFUHMBx++lWG\nwyLj1rFPHnlxwazWMUv3MfX2b4h9JiZam/dm2IYbgeuraueRv52qavST9HT9fRsPf3qeMJ/9MGpc\nO24EjprU/h2q6qYp7vebwDHAc4HHMByOgeHN5zaGw0BPmOW6R23yMdf2Cr0XeAXD4bOdgW+0to8z\n3XNouufzXPt0rj4EnA/sU1WPAd4zaRs+DLwoyc8zHD77wki7vjipXY+uqt+fYj2TA9gXmTmAbcx2\nTbRx3PhZC+yaZKeRsrk8Tya360bg/ZP64lFVdfoGLIuq+l5V/W5V7cmwB/RdGTkHVf0YwJaOf2E4\nV+MPk2zTTnQdezgjyZOSPCfJ9gxvUv/G8KkYhj0vyzP3Kx0f19a9bZLjgKcAF7R5lwPHt3kreORh\nhnXAjxnOVxrnAuCnkvxm267fYDiX5FMzNaidvPsz7dDaPQzh4sExVS9meIN7bWvjoQyHm86daR2z\ndDnwm22PzZE8fIhmQ/1ckhe0T9SvAu5nOM/nEuCeDCdk79jW99NJnjGbhba9cOcBpyXZqYWE/w5s\nikv5bwEem+QxI2XvaeveFyDJsiTHTLOMnRi2/XaGcPK/Jma0PYNnAm9rJ0xv3U5Q3p7NY8w9iuHN\nc12738sY9oBN5XLg6CS7JvkJhsd9Yp3TPZ/n2qdztRPD3qAfJjmYIRSPuoAhFL4J+MjIXuZPMfTx\ni9tzbtskz0jylCnW888Mhz0PBi6p4dD8vsAzefhE9Mk29LVswvuAl2W4QGmrJHsleXJV3dja8+cZ\nLrr4WYbDlR+cdmmPbNfo2PsA8GtJjmjjdIcMF1nsPYtlrTeWkxw3ct87GcbZuDGoTcwAtkRU1b8D\nL2A4mfhOhnN5Pj5F9e2B0xn2EnyPITxNXGk38UWjtyf52hyacDGwf1vmacALR87L+J8MnyTvBN7I\n8Kl5ot33tfpfabvfD5m0Xbcz7EV4NcMb7WuBX62q22bRpp8APsrwRng1wyfl9cJE67tfB45q7X8X\n8JKq+uastnxmr2QIdBOHWf5uI5f3SYbHd+JE8Re0PS8PtvUcyHBi/m0M56Y8ZqoFjfFfGcLodQzn\n2XyIIcjMq9a3Hwaua4/7nsBfMexN+WySexlC5TOnWcw5DId+bgKuavVHvQa4kuFcyDsY9gpttZmM\nuasYzq/7F4Y35J8BvjLNct8PfJ3hMOtngdEr26Z7Ps+1T+fqD4A3tWX/KUOAf0g73+vjDHspR5/3\n9wKHMxyWXNva/Za2Leupqh8AXwNWtecrDH33nZr6a1E29LVsYp2XMFwk8HaGk/G/yMN7Rl/EsMd1\nLcO5bm9o527Nxp8Df9LG3mtaoDuG4TFbx7BH7H8wi/fvKcbyM4CLk3yf4bF/ZVVdP8u2aR7lkafl\nSJIkaVNzD5gkSVJnBjBJkqTODGCSJEmdGcAkSZI626x/XHW33Xar5cuXL3QzJEmSZnTZZZfdVlXL\nZlN3sw5gy5cvZ+XKlQvdDEmSpBklGfsLM+N4CFKSJKkzA5gkSVJnBjBJkqTODGCSJEmdGcAkSZI6\nM4BJkiR1ZgCTJEnqbMYAlmSHJJck+XqSVUne2Mr3S3JxkmuTfCTJdq18+3Z7dZu/fGRZr2vl1yQ5\nYlNtlCRJ0uZsNnvA7geeU1VPAw4EjkxyCPAW4O1VtT9wJ3Biq38icGdVPRF4e6tHkgOA44GnAkcC\n70qy9XxujCRJ0mIwYwCrwffbzW3bXwHPAT7ays8Gjm3Tx7TbtPmHJUkrP7eq7q+q64HVwMHzshWS\nJEmLyKzOAUuydZLLgVuBC4FvA3dV1QOtyhpgrza9F3AjQJt/N/DY0fIx9xld10lJViZZuW7durlv\nk
SRJ0mZuVgGsqh6sqgOBvRn2Wj1lXLX2P1PMm6p88rrOqKoVVbVi2bJZ/Z6lJEnSojKnqyCr6i7g\nIuAQYOckEz/mvTewtk2vAfYBaPMfA9wxWj7mPpIkSUvGbK6CXJZk5za9I/Bc4GrgC8ALW7UTgE+2\n6fPbbdr8z1dVtfLj21WS+wH7A5fM14ZIkiQtFtvMXIU9gLPbFYtbAedV1aeSXAWcm+TNwL8C72v1\n3we8P8lqhj1fxwNU1aok5wFXAQ8AJ1fVg/O7OZIkSZu/DDunNk8rVqyolStXLnQzJEmSZpTksqpa\nMZu6s9kDtsVbfsqn1yu74fTnLUBLJEnSUuBPEUmSJHVmAJMkSerMACZJktSZAUySJKkzA5gkSVJn\nBjBJkqTODGCSJEmdGcAkSZI6M4BJkiR1ZgCTJEnqzAAmSZLUmQFMkiSpMwOYJElSZwYwSZKkzgxg\nkiRJnRnAJEmSOjOASZIkdWYAkyRJ6swAJkmS1JkBTJIkqTMDmCRJUmcGMEmSpM4MYJIkSZ0ZwCRJ\nkjozgEmSJHVmAJMkSerMACZJktSZAUySJKkzA5gkSVJnBjBJkqTODGCSJEmdGcAkSZI6M4BJkiR1\nZgCTJEnqzAAmSZLUmQFMkiSpMwOYJElSZwYwSZKkzgxgkiRJnRnAJEmSOjOASZIkdWYAkyRJ6swA\nJkmS1JkBTJIkqTMDmCRJUmcGMEmSpM4MYJIkSZ0ZwCRJkjozgEmSJHVmAJMkSerMACZJktTZjAEs\nyT5JvpDk6iSrkryylZ+a5KYkl7e/o0fu87okq5Nck+SIkfIjW9nqJKdsmk2SJEnavG0zizoPAK+u\nqq8l2Qm4LMmFbd7bq+p/j1ZOcgBwPPBUYE/gn5L8VJv9TuBXgDXApUnOr6qr5mNDJEmSFosZA1hV\n3Qzc3KbvTXI1sNc0dzkGOLeq7geuT7IaOLjNW11V1wEkObfVNYBJkqQlZU7ngCVZDjwduLgVvSLJ\nFUnOTLJLK9sLuHHkbmta2VTlk9dxUpKVSVauW7duLs2TJElaFGYdwJI8GvgY8Kqqugd4N/AE4ECG\nPWRvnag65u41TfkjC6rOqKoVVbVi2bJls22eJEnSojGbc8BIsi1D+PpgVX0coKpuGZn/XuBT7eYa\nYJ+Ru+8NrG3TU5VLkiQtGbO5CjLA+4Crq+ptI+V7jFR7PvCNNn0+cHyS7ZPsB+wPXAJcCuyfZL8k\n2zGcqH/+/GyGJEnS4jGbPWDPAl4MXJnk8lb2euBFSQ5kOIx4A/BygKpaleQ8hpPrHwBOrqoHAZK8\nAvgMsDVwZlWtmsdtkSRJWhRmcxXklxl//tYF09znNOC0MeUXTHc/SZKkpcBvwpckSerMACZJktSZ\nAUySJKkzA5gkSVJnBjBJkqTODGCSJEmdGcAkSZI6M4BJkiR1ZgCTJEnqzAAmSZLUmQFMkiSpMwOY\nJElSZwYwSZKkzgxgkiRJnRnAJEmSOjOASZIkdWYAkyRJ6swAJkmS1JkBTJIkqTMDmCRJUmcGMEmS\npM4MYJIkSZ0ZwCRJkjozgEmSJHVmAJMkSerMACZJktSZAUySJKkzA5gkSVJnBjBJkqTODGCSJEmd\nGcAkSZI6M4BJkiR1ZgCTJEnqzAAmSZLUmQFMkiSpMwOYJElSZwYwSZKkzgxgkiRJnRnAJEmSOjOA\nSZIkdWYAkyRJ6swAJkmS1JkBTJIkqTMDmCRJUmcGMEmSpM4MYJIkSZ0ZwCRJkjozgEmSJHVmAJMk\nSerMACZJktSZAUySJKkzA5gkSVJnBjBJkqTOZgxgSfZJ8oUkVydZleSVrXzXJBcmubb936WVJ8k7\nkqxOckWSg0aWdUKrf22SEzbdZkmSJG2+ZrMH7AHg1VX1FOAQ4OQkBwCnAJ+rqv2Bz7XbAEcB+7e/\nk4B3wxDYgDcAzwQOBt4wEdokSZKWkhkDWFXdXFVfa9P3AlcDewH
HAGe3amcDx7bpY4BzavBVYOck\newBHABdW1R1VdSdwIXDkvG6NJEnSIjCnc8CSLAeeDlwM7F5VN8MQ0oDHtWp7ATeO3G1NK5uqfPI6\nTkqyMsnKdevWzaV5kiRJi8KsA1iSRwMfA15VVfdMV3VMWU1T/siCqjOqakVVrVi2bNlsmydJkrRo\nzCqAJdmWIXx9sKo+3opvaYcWaf9vbeVrgH1G7r43sHaackmSpCVlNldBBngfcHVVvW1k1vnAxJWM\nJwCfHCl/Sbsa8hDg7naI8jPA4Ul2aSffH97KJEmSlpRtZlHnWcCLgSuTXN7KXg+cDpyX5ETgu8Bx\nbd4FwNHAauA+4GUAVXVHkj8DLm313lRVd8zLVkiSJC0iMwawqvoy48/fAjhsTP0CTp5iWWcCZ86l\ngZIkSVsavwlfkiSpMwOYJElSZwYwSZKkzgxgkiRJnRnAJEmSOjOASZIkdWYAkyRJ6swAJkmS1JkB\nTJIkqTMDmCRJUmcGMEmSpM4MYJIkSZ0ZwCRJkjozgEmSJHVmAJMkSerMACZJktSZAUySJKkzA5gk\nSVJnBjBJkqTODGCSJEmdGcAkSZI6M4BJkiR1ZgCTJEnqzAAmSZLUmQFMkiSpMwOYJElSZwYwSZKk\nzgxgkiRJnRnAJEmSOjOASZIkdWYAkyRJ6swAJkmS1JkBTJIkqTMDmCRJUmcGMEmSpM4MYJIkSZ0Z\nwCRJkjozgEmSJHVmAJMkSerMACZJktSZAUySJKkzA5gkSVJnBjBJkqTODGCSJEmdGcAkSZI6M4BJ\nkiR1ZgCTJEnqzAAmSZLUmQFMkiSpMwOYJElSZwYwSZKkzgxgkiRJnc0YwJKcmeTWJN8YKTs1yU1J\nLm9/R4/Me12S1UmuSXLESPmRrWx1klPmf1MkSZIWh9nsATsLOHJM+dur6sD2dwFAkgOA44Gntvu8\nK8nWSbYG3gkcBRwAvKjVlSRJWnK2malCVX0pyfJZLu8Y4Nyquh+4Pslq4OA2b3VVXQeQ5NxW96o5\nt1iSJGmR25hzwF6R5Ip2iHKXVrYXcONInTWtbKry9SQ5KcnKJCvXrVu3Ec2TJEnaPG1oAHs38ATg\nQOBm4K2tPGPq1jTl6xdWnVFVK6pqxbJlyzaweZIkSZuvGQ9BjlNVt0xMJ3kv8Kl2cw2wz0jVvYG1\nbXqqckmSpCVlg/aAJdlj5ObzgYkrJM8Hjk+yfZL9gP2BS4BLgf2T7JdkO4YT9c/f8GZLkiQtXjPu\nAUvyYeBQYLcka4A3AIcmOZDhMOINwMsBqmpVkvMYTq5/ADi5qh5sy3kF8Blga+DMqlo171sjSZK0\nCMzmKsgXjSl+3zT1TwNOG1N+AXDBnFonSZK0BfKb8CVJkjozgEmSJHVmAJMkSerMACZJktSZAUyS\nJKkzA5gkSVJnBjBJkqTODGCSJEmdGcAkSZI6M4BJkiR1ZgCTJEnqzAAmSZLUmQFMkiSpMwOYJElS\nZwYwSZKkzgxgkiRJnRnAJEmSOjOASZIkdWYAkyRJ6swAJkmS1JkBTJIkqTMDmCRJUmcGMEmSpM4M\nYJIkSZ0ZwCRJkjozgEmSJHVmAJMkSerMACZJktSZAUySJKkzA5gkSVJnBjBJkqTODGCSJEmdGcAk\nSZI6M4BJkiR1ZgCTJEnqzAAmSZLUmQFMkiSpMwOYJElSZwYwSZKkzgxgkiRJnRnAJEmSOjOASZIk\ndWYAkyRJ6swAJkmS1JkBTJIkqTMDmCRJUmcGMEmSpM4MYJIkSZ0ZwCRJkjozgEmSJHVmAJMkSerM\nACZJktSZAUySJKmzGQNYkjOT3JrkGyNluya5MMm17f8urTxJ3pFkdZIrkhw0cp8TWv1rk5ywaTZH\nkiRp8zebPWBnAUdOKjsF+FxV7Q98rt0GOArYv/2dBLwbhsAGvAF4JnAw8IaJ0CZJkrTUzBjAqupL\nwB2Tio8Bzm7TZwPHjpSfU4O
vAjsn2QM4Ariwqu6oqjuBC1k/1EmSJC0JG3oO2O5VdTNA+/+4Vr4X\ncONIvTWtbKry9SQ5KcnKJCvXrVu3gc2TJEnafM33SfgZU1bTlK9fWHVGVa2oqhXLli2b18ZJkiRt\nDjY0gN3SDi3S/t/aytcA+4zU2xtYO025JEnSkrOhAex8YOJKxhOAT46Uv6RdDXkIcHc7RPkZ4PAk\nu7ST7w9vZZIkSUvONjNVSPJh4FBgtyRrGK5mPB04L8mJwHeB41r1C4CjgdXAfcDLAKrqjiR/Blza\n6r2pqiaf2C9JkrQkzBjAqupFU8w6bEzdAk6eYjlnAmfOqXWSJElbIL8JX5IkqTMDmCRJUmcGMEmS\npM4MYJIkSZ0ZwCRJkjozgEmSJHVmAJMkSerMACZJktSZAUySJKkzA5gkSVJnBjBJkqTODGCSJEmd\nGcAkSZI622ahG7C5Wn7Kp9cru+H05y1ASyRJ0pbGPWCSJEmdGcAkSZI6M4BJkiR1ZgCTJEnqzAAm\nSZLUmQFMkiSpMwOYJElSZwYwSZKkzgxgkiRJnRnAJEmSOjOASZIkdWYAkyRJ6swAJkmS1JkBTJIk\nqTMDmCRJUmcGMEmSpM4MYJIkSZ0ZwCRJkjozgEmSJHVmAJMkSerMACZJktSZAUySJKkzA5gkSVJn\nBjBJkqTODGCSJEmdGcAkSZI6M4BJkiR1ZgCTJEnqzAAmSZLUmQFMkiSpMwOYJElSZwYwSZKkzgxg\nkiRJnRnAJEmSOjOASZIkdWYAkyRJ6swAJkmS1JkBTJIkqbONCmBJbkhyZZLLk6xsZbsmuTDJte3/\nLq08Sd6RZHWSK5IcNB8bIEmStNjMxx6wX66qA6tqRbt9CvC5qtof+Fy7DXAUsH/7Owl49zysW5Ik\nadHZFIcgjwHObtNnA8eOlJ9Tg68COyfZYxOsX5IkabO2sQGsgM8muSzJSa1s96q6GaD9f1wr3wu4\nceS+a1rZIyQ5KcnKJCvXrVu3kc2TJEna/Gyzkfd/VlWtTfI44MIk35ymbsaU1XoFVWcAZwCsWLFi\nvfmSJEmL3UbtAauqte3/rcAngIOBWyYOLbb/t7bqa4B9Ru6+N7B2Y9YvSZK0GG1wAEvyqCQ7TUwD\nhwPfAM4HTmjVTgA+2abPB17SroY8BLh74lClJEnSUrIxhyB3Bz6RZGI5H6qqf0xyKXBekhOB7wLH\ntfoXAEcDq4H7gJdtxLolSZIWrQ0OYFV1HfC0MeW3A4eNKS/g5A1dnyRJ0pbCb8KXJEnqzAAmSZLU\nmQFMkiSpMwOYJElSZwYwSZKkzgxgkiRJnRnAJEmSOjOASZIkdWYAkyRJ6swAJkmS1JkBTJIkqTMD\nmCRJUmcGMEmSpM62WegGLCbLT/n0emU3nP68BWiJJElazNwDJkmS1JkBTJIkqTMDmCRJUmcGMEmS\npM4MYJIkSZ0ZwCRJkjozgEmSJHVmAJMkSerMACZJktSZAUySJKkzA5gkSVJn/hbkRvL3ISVJ0ly5\nB0ySJKkzA5gkSVJnBjBJkqTODGCSJEmdGcAkSZI6M4BJkiR1ZgCTJEnqzAAmSZLUmQFMkiSpMwOY\nJElSZ/4U0SbgzxNJkqTpuAdMkiSpMwOYJElSZwYwSZKkzgxgkiRJnRnAJEmSOjOASZIkdWYAkyRJ\n6szvAetk3HeDjeP3hUmStOVzD5gkSVJn7gHbzPgt+pIkbfncAyZJktSZAUySJKkzA5gkSVJnngO2\nCMz2CkrwfDFJkhYDA9gWxpP4JUna/HkIUpIkqbPue8CSHAn8FbA18NdVdXrvNiw1czmEOZl7zyRJ\nmn9dA1iSrYF3Ar8CrAEuTXJ+VV3Vsx2avc39G/w95CpJWox67wE7GFhdVdcBJDkXOAYwgC1yG7OX\nrZf5buOWHvQ29/AtSYtZ7wC2F3DjyO01wDNHKyQ5CTip3fx+kms2YXt2A27bhMtfrBZ1v+Qtm
2zR\nj+iXTbieRSVvWdzjZROyX8azX8azX8ZbbP2y72wr9g5gGVNWj7hRdQZwRpfGJCurakWPdS0m9st4\n9st49st49st49st49st4W3K/9L4Kcg2wz8jtvYG1ndsgSZK0oHoHsEuB/ZPsl2Q74Hjg/M5tkCRJ\nWlBdD0FW1QNJXgF8huFrKM6sqlU92zBJl0Odi5D9Mp79Mp79Mp79Mp79Mp79Mt4W2y+pqplrSZIk\nad74TfiSJEmdGcAkSZI6W7JvUDwbAAAEiElEQVQBLMmRSa5JsjrJKQvdnoWU5IYkVya5PMnKVrZr\nkguTXNv+77LQ7dzUkpyZ5NYk3xgpG9sPGbyjjZ8rkhy0cC3ftKbol1OT3NTGzOVJjh6Z97rWL9ck\nOWJhWr1pJdknyReSXJ1kVZJXtvIlPV6m6ZelPl52SHJJkq+3fnljK98vycVtvHykXZxGku3b7dVt\n/vKFbP+mMk2/nJXk+pHxcmAr37KeR1W15P4YLgD4NvB4YDvg68ABC92uBeyPG4DdJpX9BXBKmz4F\neMtCt7NDP/wicBDwjZn6ATga+AeG77Y7BLh4odvfuV9OBV4zpu4B7fm0PbBfe55tvdDbsAn6ZA/g\noDa9E/Cttu1LerxM0y9LfbwEeHSb3ha4uI2D84DjW/l7gN9v038AvKdNHw98ZKG3oXO/nAW8cEz9\nLep5tFT3gD30k0hV9e/AxE8i6WHHAGe36bOBYxewLV1U1ZeAOyYVT9UPxwDn1OCrwM5J9ujT0r6m\n6JepHAOcW1X3V9X1wGqG59sWpapurqqvtel7gasZfuljSY+XafplKktlvFRVfb/d3Lb9FfAc4KOt\nfPJ4mRhHHwUOSzLui8wXtWn6ZSpb1PNoqQawcT+JNN2LxJaugM8muaz9FBTA7lV1MwwvqsDjFqx1\nC2uqfnAMwSvaYYAzRw5RL7l+aYeHns7w6d3x0kzqF1ji4yXJ1kkuB24FLmTY23dXVT3Qqoxu+0P9\n0ubfDTy2b4v7mNwvVTUxXk5r4+XtSbZvZVvUeFmqAWzGn0RaYp5VVQcBRwEnJ/nFhW7QIrDUx9C7\ngScABwI3A29t5UuqX5I8GvgY8Kqqume6qmPKllK/LPnxUlUPVtWBDL8AczDwlHHV2v8l2y9Jfhp4\nHfBk4BnArsAftepbVL8s1QDmTyKNqKq17f+twCcYXhxumdi12/7funAtXFBT9cOSHkNVdUt74fwx\n8F4ePmy0ZPolybYMIeODVfXxVrzkx8u4fnG8PKyq7gIuYjiHaeckE1+IPrrtD/VLm/8YZn8awKI0\n0i9HtkPZVVX3A3/DFjpelmoA8yeRmiSPSrLTxDRwOPANhv44oVU7AfjkwrRwwU3VD+cDL2lX5RwC\n3D1x6GkpmHTexfMZxgwM/XJ8u4prP2B/4JLe7dvU2vk47wOurqq3jcxa0uNlqn5xvGRZkp3b9I7A\ncxnOj/sC8MJWbfJ4mRhHLwQ+X+0s9C3JFP3yzZEPMWE4L250vGwxz6OuP0W0uajN7yeRFtLuwCfa\n+Z3bAB+qqn9McilwXpITge8Cxy1gG7tI8mHgUGC3JGuANwCnM74fLmC4Imc1cB/wsu4N7mSKfjm0\nXRpeDFfRvhygqlYlOQ+4CngAOLmqHlyIdm9izwJeDFzZzl8BeD2Ol6n65UVLfLzsAZydZGuGHR/n\nVdWnklwFnJvkzcC/MoRX2v/3J1nNsOfr+IVodAdT9cvnkyxjOOR4OfB7rf4W9Tzyp4gkSZI6W6qH\nICVJkhaMAUySJKkzA5gkSVJnBjBJkqTODGCSJEmdGcAkSZI6M4BJkiR19v8BQX1ERlGrdjAAAAAA\nSUVORK5CYII=\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ 
"num_interactions.hist(bins = 100, grid = False, figsize = (10, 7))\n", "plt.title('distributions of number of interactions a user have with contents')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The majority of users have very few interactions with the articles. It is quite possible that most of these users are new." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Fill in the median and maximum number of user_article interactions below\n", "median_val = 3 # 50% of individuals interact with ____ number of articles or fewer.\n", "max_views_by_user = 364 # The maximum number of user-article interactions by any 1 user is ______." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`2.` Explore and remove duplicate articles from the **df_content** dataframe. " ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1429. 1314. 1314. ... 233. 1160. 16.]\n" ] } ], "source": [ "# Print the duplicated article ids\n", "print(articles[articles['article_id'].duplicated()].article_id.values)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Remove any rows that have the same article_id - only keep the first\n", "articles = articles[~articles['article_id'].duplicated(keep='first')]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`3.` Use the cells below to find:\n", "\n", "**a.** The number of unique articles that have an interaction with a user. \n", "**b.** The number of unique articles in the dataset (whether they have any interactions or not).
\n", "**c.** The number of unique users in the dataset. \n", "**d.** The number of user-article interactions in the dataset." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": true }, "outputs": [], "source": [ "unique_articles = len(user_interacts['article_id'].unique())\n", "total_articles = len(articles)\n", "unique_users = len(user_interacts['email'].dropna().unique())\n", "user_article_interactions = len(user_interacts)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`4.` Use the cells below to find the most viewed **article_id**, as well as how often it was viewed." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": true }, "outputs": [], "source": [ "most_viewed_article_id = str(user_interacts['article_id'].value_counts().idxmax())# The most viewed article id\n", "max_views = user_interacts['article_id'].value_counts().max() # The most number of views" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
article_idtitleuser_id
01430.0using pixiedust for fast, flexible, and easier...1
11314.0healthcare python streaming application demo2
21429.0use deep learning for image classification3
31338.0ml optimization using cognitive assistant4
41276.0deploy your python model as a restful api5
\n", "
" ], "text/plain": [ " article_id title user_id\n", "0 1430.0 using pixiedust for fast, flexible, and easier... 1\n", "1 1314.0 healthcare python streaming application demo 2\n", "2 1429.0 use deep learning for image classification 3\n", "3 1338.0 ml optimization using cognitive assistant 4\n", "4 1276.0 deploy your python model as a restful api 5" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "## No need to change the code here - this will be helpful for later parts of the notebook\n", "# Run this cell to map the user email to a user_id column and remove the email column\n", "\n", "def email_mapper():\n", "    coded_dict = dict()\n", "    cter = 1\n", "    email_encoded = []\n", "    \n", "    for val in user_interacts['email']:\n", "        if val not in coded_dict:\n", "            coded_dict[val] = cter\n", "            cter+=1\n", "        \n", "        email_encoded.append(coded_dict[val])\n", "    return email_encoded\n", "\n", "email_encoded = email_mapper()\n", "del user_interacts['email']\n", "user_interacts['user_id'] = email_encoded\n", "\n", "# show header\n", "user_interacts.head()" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Oops! It looks like the value associated with: `The number of unique articles on the IBM platform` wasn't right. Try again. It might just be the datatype. All of the values should be ints except the article_id should be a string. Let each row be considered a separate user-article interaction. 
If a user interacts with an article 3 times, these are considered 3 separate interactions.\n" ] } ], "source": [ "## If you stored all your results in the variable names above, \n", "## you shouldn't need to change anything in this cell\n", "\n", "sol_1_dict = {\n", " '`50% of individuals have _____ or fewer interactions.`': median_val,\n", " '`The total number of user-article interactions in the dataset is ______.`': user_article_interactions,\n", " '`The maximum number of user-article interactions by any 1 user is ______.`': max_views_by_user,\n", " '`The most viewed article in the dataset was viewed _____ times.`': max_views,\n", " '`The article_id of the most viewed article is ______.`': most_viewed_article_id,\n", " '`The number of unique articles that have at least 1 rating ______.`': unique_articles,\n", " '`The number of unique users in the dataset is ______`': unique_users,\n", " '`The number of unique articles on the IBM platform`': total_articles\n", "}\n", "\n", "# Test your dictionary against the solution\n", "t.sol_1_test(sol_1_dict)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Part II: Rank-Based Recommendations\n", "\n", "Unlike in the earlier lessons, we don't actually have ratings for whether a user liked an article or not. We only know that a user has interacted with an article. In these cases, the popularity of an article can really only be based on how often an article was interacted with.\n", "\n", "`1.` Fill in the function below to return the **n** top articles ordered with most interactions as the top. Test your function using the tests below." 
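, "\n", "\n", "
A minimal sketch of the popularity ranking described above, assuming a small toy interaction log rather than the IBM data. The names `toy_interactions` and `top_n_titles` are hypothetical and only illustrate the idea that, when no ratings exist, counting interactions per title is one reasonable popularity measure.

```python
import pandas as pd

# Toy interaction log (hypothetical data, not the IBM dataset):
# every row is one user-article interaction.
toy_interactions = pd.DataFrame({
    'user_id': [1, 1, 2, 2, 3, 3, 3],
    'title': ['spark lab', 'churn model', 'spark lab', 'car data',
              'spark lab', 'churn model', 'spark lab'],
})


def top_n_titles(n, df):
    # value_counts() sorts by frequency in descending order,
    # so the first n index labels are the n most viewed titles.
    return df['title'].value_counts().head(n).index.tolist()


top_two = top_n_titles(2, toy_interactions)
print(top_two)
```

The sketch does not define an order among equally popular titles, so a fuller implementation may want an explicit tie-break.
"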
] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def get_top_articles(n, df=user_interacts):\n", "    '''\n", "    INPUT:\n", "    n - (int) the number of top articles to return\n", "    df - (pandas dataframe) df as defined at the top of the notebook, user interactions\n", "    \n", "    OUTPUT:\n", "    top_articles - (list) A list of the top 'n' article titles \n", "    \n", "    '''\n", "    # value_counts() sorts by frequency, so its index lists titles from most to least viewed\n", "    top_articles = list(df['title'].value_counts().head(n).index)\n", "    \n", "    return top_articles # Return the top article titles from user interacts (not df_content)" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def get_top_article_ids(n, df=user_interacts):\n", "    '''\n", "    INPUT:\n", "    n - (int) the number of top articles to return\n", "    df - (pandas dataframe) df as defined at the top of the notebook \n", "    \n", "    OUTPUT:\n", "    top_articles - (list) A list of the top 'n' article ids \n", "    \n", "    '''\n", "    # Same approach as get_top_articles, but ranking article ids instead of titles\n", "    top_articles = list(df['article_id'].value_counts().head(n).index)\n", "    \n", "    return top_articles # Return the top article ids" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['use deep learning for image classification', 'insights from new york car accident reports', 'visualize car data with brunel', 'use xgboost, scikit-learn & ibm watson machine learning apis', 'predicting churn with the spss random tree algorithm', 'healthcare python streaming application demo', 'finding optimal locations of new store using decision optimization', 'apache spark lab, part 1: basic concepts', 'analyze energy consumption in buildings', 'gosales transactions for logistic regression model']\n", "[1429.0, 1330.0, 1431.0, 1427.0, 1364.0, 1314.0, 1293.0, 1170.0, 1162.0, 1304.0]\n" ] } ], "source": [ 
"print(get_top_articles(10))\n", "print(get_top_article_ids(10))" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Your top_5 looks like the solution list! Nice job.\n", "Your top_10 looks like the solution list! Nice job.\n", "Your top_20 looks like the solution list! Nice job.\n" ] } ], "source": [ "# Test your function by returning the top 5, 10, and 20 articles\n", "top_5 = get_top_articles(5)\n", "top_10 = get_top_articles(10)\n", "top_20 = get_top_articles(20)\n", "\n", "# Test each of your three lists from above\n", "t.sol_2_test(get_top_articles)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Part III: User-User Based Collaborative Filtering\n", "\n", "\n", "`1.` Use the function below to reformat the **df** dataframe to be shaped with users as the rows and articles as the columns. \n", "\n", "* Each **user** should only appear in each **row** once.\n", "\n", "\n", "* Each **article** should only show up in one **column**. \n", "\n", "\n", "* **If a user has interacted with an article, then place a 1 where the user-row meets for that article-column**. It does not matter how many times a user has interacted with the article, all entries where a user has interacted with an article should be a 1. \n", "\n", "\n", "* **If a user has not interacted with an item, then place a zero where the user-row meets for that article-column**. \n", "\n", "Use the tests to make sure the basic structure of your matrix matches what is expected by the solution." 
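, "\n", "\n", "
The reshaping rules above can be sketched on a toy dataframe. This is only an illustration under stated assumptions: `toy_interactions` and `toy_user_item_matrix` are hypothetical names, and the data is invented so that one user views the same article twice.

```python
import pandas as pd

# Toy interaction log (hypothetical data): user 1 views article '10.0' twice.
toy_interactions = pd.DataFrame({
    'user_id': [1, 1, 1, 2, 3],
    'article_id': ['10.0', '10.0', '20.0', '20.0', '30.0'],
})


def toy_user_item_matrix(df):
    # Count interactions per (user, article) pair, then pivot the
    # article ids into columns; pairs that never occur are filled with 0.
    counts = df.groupby(['user_id', 'article_id']).size().unstack(fill_value=0)
    # Collapse repeat views: any positive count becomes 1, everything else 0.
    return (counts > 0).astype(int)


toy_matrix = toy_user_item_matrix(toy_interactions)
print(toy_matrix)
```

Each user appears in exactly one row and each article in exactly one column, and the repeated view by user 1 still yields a single 1 rather than a 2.
"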
] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def create_user_item_matrix(df):\n", "    '''\n", "    INPUT:\n", "    df - pandas dataframe with article_id, title, user_id columns\n", "    \n", "    OUTPUT:\n", "    user_item - user item matrix \n", "    \n", "    Description:\n", "    Return a matrix with user ids as rows and article ids on the columns with 1 values where a user interacted with \n", "    an article and a 0 otherwise\n", "    '''\n", "    # Article ids are used as column labels, so store them as strings (this modifies df in place)\n", "    df['article_id'] = df['article_id'].astype(str)\n", "\n", "    # count() gives the interactions per (user, article) pair; unstack() leaves NaN where a pair never occurs\n", "    user_item_pivot = df.groupby(['user_id', 'article_id'])['title'].count().unstack()\n", "    # 1 where the pair exists, 0 otherwise; the builtin int replaces np.int, which NumPy 1.24 removed\n", "    user_item = user_item_pivot.notnull().astype(int)\n", "    \n", "    return user_item # return the user_item matrix \n", "\n", "user_item = create_user_item_matrix(user_interacts)" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "You have passed our quick tests! Please proceed!\n" ] } ], "source": [ "## Tests: You should just need to run this cell. Don't change the code.\n", "assert user_item.shape[0] == 5149, \"Oops! The number of users in the user-article matrix doesn't look right.\"\n", "assert user_item.shape[1] == 714, \"Oops! The number of articles in the user-article matrix doesn't look right.\"\n", "assert user_item.sum(axis=1)[1] == 36, \"Oops! The number of articles seen by user 1 doesn't look right.\"\n", "print(\"You have passed our quick tests! Please proceed!\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`2.` Complete the function below which should take a user_id and provide an ordered list of the most similar users to that user (from most similar to least similar). The returned result should not contain the provided user_id, as we know that each user is similar to him/herself. 
Because the results for each user here are binary, it (perhaps) makes sense to compute similarity as the dot product of two users. \n", "\n", "Use the tests to test your function." ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def find_similar_users(user_id, user_item=user_item):\n", " '''\n", " INPUT:\n", " user_id - (int) a user_id\n", " user_item - (pandas dataframe) matrix of users by articles: \n", " 1's when a user has interacted with an article, 0 otherwise\n", " \n", " OUTPUT:\n", " similar_users - (list) an ordered list where the closest users (largest dot product users)\n", " are listed first\n", " \n", " Description:\n", " Computes the similarity of the given user to every other user based on the dot product\n", " Returns an ordered list of user ids, most similar first\n", " \n", " '''\n", " # repeat the input user vector n times to match dimensions\n", " num_users = user_item.shape[0]\n", " user_vector = user_item.loc[user_id] \n", " user_vector_tile = np.tile(user_vector.values, (num_users, 1)) \n", " \n", " # compute similarity of each user to the provided user (row-wise dot product)\n", " similarities = np.multiply(user_vector_tile, user_item).sum(axis = 1) \n", " \n", " # sort by similarity\n", " similarities.sort_values(ascending=False, inplace = True)\n", "\n", " # create list of just the ids\n", " most_similar_users = list(similarities.index)\n", " \n", " # remove the input user from the list of most similar users\n", " most_similar_users.remove(user_id)\n", " \n", " return most_similar_users # return a list of the users in order from most to least similar" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The 10 most similar users to user 1 are: [3933, 23, 3782, 203, 4459, 131, 3870, 46, 4201, 5041]\n", "The 5 most similar users to user 3933 are: [1, 23, 3782, 4459, 203]\n", "The 3 most similar users to user 46 are: [4201, 23, 3782]\n" ] } ], "source": [ "# Do a spot 
check of your function\n", "print(\"The 10 most similar users to user 1 are: {}\".format(find_similar_users(1)[:10]))\n", "print(\"The 5 most similar users to user 3933 are: {}\".format(find_similar_users(3933)[:5]))\n", "print(\"The 3 most similar users to user 46 are: {}\".format(find_similar_users(46)[:3]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`3.` Now that you have a function that provides the most similar users to each user, you will want to use these users to find articles you can recommend. Complete the functions below to return the articles you would recommend to each user. " ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def get_article_names(article_ids, df=user_interacts):\n", " '''\n", " INPUT:\n", " article_ids - (list) a list of article ids\n", " df - (pandas dataframe) df as defined at the top of the notebook, user interacts\n", " \n", " OUTPUT:\n", " article_names - (list) a list of article names associated with the list of article ids \n", " (this is identified by the title column)\n", " '''\n", " # find the article names associated with list of article ids\n", " # (uses the df argument rather than the global; names come back in df order, not article_ids order)\n", " article_names = (df[df['article_id'].isin(article_ids)]['title']\n", " .drop_duplicates().values.tolist())\n", " \n", " return article_names " ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def get_user_articles(user_id, user_item=user_item):\n", " '''\n", " INPUT:\n", " user_id - (int) a user id\n", " user_item - (pandas dataframe) matrix of users by articles: \n", " 1's when a user has interacted with an article, 0 otherwise\n", " \n", " OUTPUT:\n", " article_ids - (list) a list of the article ids seen by the user\n", " article_names - (list) a list of article names associated with the list of article ids \n", " (this is identified by the title column, via get_article_names)\n", " \n", " Description:\n", " 
Provides a list of the article_ids and article titles that have been seen by a user\n", " '''\n", " # Find the ids of all articles the user has interacted with\n", " user_vector = user_item.loc[user_id]\n", " user_vector_seen = user_vector.where(user_vector == 1).dropna()\n", " article_ids = list(user_vector_seen.index)\n", " \n", " # Find all the article names based on the article_ids returned in the last step\n", " article_names = get_article_names(article_ids)\n", " \n", " return article_ids, article_names" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def user_user_recs(user_id, m=10):\n", " '''\n", " INPUT:\n", " user_id - (int) a user id\n", " m - (int) the number of recommendations you want for the user\n", " \n", " OUTPUT:\n", " recs - (list) a list of recommendations for the user\n", " \n", " Description:\n", " Loops through the users based on closeness to the input user_id\n", " For each user - finds articles the user hasn't seen before and provides them as recs\n", " Does this until m recommendations are found\n", " \n", " Notes:\n", " Users who are the same closeness are chosen arbitrarily as the 'next' user\n", " \n", " For the user where the number of recommended articles starts below m \n", " and ends exceeding m, the last items are chosen arbitrarily\n", " \n", " '''\n", " # Find all articles that the user has read before\n", " user_read_articles = get_user_articles(user_id)[0]\n", " similar_users = find_similar_users(user_id)\n", "\n", " # Find all articles that have been read by the similar users\n", " read_articles = [get_user_articles(user)[0] for user in similar_users]\n", " read_articles_list = list(itertools.chain.from_iterable(read_articles))\n", "\n", " # remove duplicate articles, keeping the first occurrence of each\n", " read_articles_unique = pd.Series(read_articles_list).drop_duplicates().tolist()\n", "\n", " # remove articles that have been seen by the given user, use the remaining articles as 
recommendations\n", " recs = [i for i in read_articles_unique if i not in user_read_articles][:m]\n", " \n", " return recs " ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['analyze energy consumption in buildings',\n", " 'analyze accident reports on amazon emr spark',\n", " '520 using notebooks with pixiedust for fast, flexi...\\nName: title, dtype: object',\n", " '1448 i ranked every intro to data science course on...\\nName: title, dtype: object',\n", " 'data tidying in data science experience',\n", " 'airbnb data for analytics: vancouver listings',\n", " 'recommender systems: approaches & algorithms',\n", " 'airbnb data for analytics: mallorca reviews',\n", " 'analyze facebook data using ibm watson and watson studio',\n", " 'a tensorflow regression model to predict house values']" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Check Results\n", "get_article_names(user_user_recs(1, 10)) # Return 10 recommendations for user 1" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "If this is all you see, you passed all of our tests! Nice job!\n" ] } ], "source": [ "# Test your functions here - No need to change this code - just run this cell\n", "assert set(get_article_names(['1024.0', '1176.0', '1305.0', '1314.0', '1422.0', '1427.0'])) == set(['using deep learning to reconstruct high-resolution audio', 'build a python app on the streaming analytics service', 'gosales transactions for naive bayes model', 'healthcare python streaming application demo', 'use r dataframes & ibm watson natural language understanding', 'use xgboost, scikit-learn & ibm watson machine learning apis']), \"Oops! 
Your get_article_names function doesn't work quite how we expect.\"\n", "assert set(get_article_names(['1320.0', '232.0', '844.0'])) == set(['housing (2015): united states demographic measures','self-service data preparation with ibm data refinery','use the cloudant-spark connector in python notebook']), \"Oops! Your get_article_names function doesn't work quite how we expect.\"\n", "assert set(get_user_articles(20)[0]) == set(['1320.0', '232.0', '844.0'])\n", "assert set(get_user_articles(20)[1]) == set(['housing (2015): united states demographic measures', 'self-service data preparation with ibm data refinery','use the cloudant-spark connector in python notebook'])\n", "assert set(get_user_articles(2)[0]) == set(['1024.0', '1176.0', '1305.0', '1314.0', '1422.0', '1427.0'])\n", "assert set(get_user_articles(2)[1]) == set(['using deep learning to reconstruct high-resolution audio', 'build a python app on the streaming analytics service', 'gosales transactions for naive bayes model', 'healthcare python streaming application demo', 'use r dataframes & ibm watson natural language understanding', 'use xgboost, scikit-learn & ibm watson machine learning apis'])\n", "print(\"If this is all you see, you passed all of our tests! Nice job!\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`4.` Now we are going to improve the consistency of the **user_user_recs** function from above. \n", "\n", "* Instead of choosing arbitrarily among users who are all equally close to a given user, choose the users that have the most total article interactions before choosing those with fewer article interactions.\n", "\n", "\n", "* Instead of arbitrarily choosing articles from the user where the number of recommended articles starts below m and ends exceeding m, choose the articles with the most total interactions before choosing those with fewer total interactions. 
This ranking should be what would be obtained from the **top_articles** function you wrote earlier." ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def get_top_sorted_users(user_id, df=user_interacts, user_item=user_item):\n", " '''\n", " INPUT:\n", " user_id - (int)\n", " df - (pandas dataframe) df as defined at the top of the notebook, user interacts\n", " user_item - (pandas dataframe) matrix of users by articles: \n", " 1's when a user has interacted with an article, 0 otherwise\n", " \n", " \n", " OUTPUT:\n", " neighbors_df - (pandas dataframe) a dataframe with:\n", " neighbor_id - is a neighbor user_id\n", " similarity - measure of the similarity of each user to the provided user_id\n", " num_interactions - the number of articles viewed by the user - if a u\n", " \n", " Other Details - sort the neighbors_df by the similarity and then by number of interactions where \n", " highest of each is higher in the dataframe\n", " \n", " '''\n", " # repeat the input user vector n times to match dimensions\n", " num_users = user_item.shape[0]\n", " user_vector = user_item.loc[user_id] \n", " user_vector_tile = np.tile(user_vector.values, (num_users, 1)) \n", "\n", " # compute similarity of each user to the provided user\n", " similarities = pd.DataFrame(np.multiply(user_vector_tile, user_item).sum(axis = 1), \n", " columns = ['similarity'])\n", "\n", " # Add a column with the total number of interactions each user has\n", " # (assigning the Series aligns on user_id instead of relying on row order)\n", " similarities['num_interactions'] = df.groupby('user_id')['title'].count()\n", "\n", " # double sort on similarities and total interactions\n", " similarities.sort_values(by = ['similarity','num_interactions'], \n", " ascending=False, inplace = True)\n", "\n", " # remove the input user from the neighbors\n", " similarities.drop(user_id, inplace = True)\n", " \n", " # build neighbors_df\n", " neighbors_df = similarities.reset_index()\n", " neighbors_df.columns = ['neighbor_id', 
'similarity','num_interactions']\n", " \n", " return neighbors_df " ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def user_user_recs_part2(user_id, m=10):\n", " '''\n", " INPUT:\n", " user_id - (int) a user id\n", " m - (int) the number of recommendations you want for the user\n", " \n", " OUTPUT:\n", " recs - (list) a list of recommendations for the user by article id\n", " rec_names - (list) a list of recommendations for the user by article title\n", " \n", " Description:\n", " Loops through the users based on closeness to the input user_id\n", " For each user - finds articles the user hasn't seen before and provides them as recs\n", " Does this until m recommendations are found\n", " \n", " Notes:\n", " * Choose the users that have the most total article interactions \n", " before choosing those with fewer article interactions.\n", "\n", " * Choose the articles with the most total interactions \n", " before choosing those with fewer total interactions. 
\n", " \n", " '''\n", " user_read_articles = get_user_articles(user_id)[0]\n", " similar_users = get_top_sorted_users(user_id)['neighbor_id'].values.tolist()\n", " \n", " # Find all articles that have been read by the similar users\n", " read_articles = [get_user_articles(user)[0] for user in similar_users]\n", " read_articles_list = list(itertools.chain.from_iterable(read_articles))\n", " \n", " # remove duplicate articles, keeping the first occurrence of each\n", " read_articles_unique = pd.Series(read_articles_list).drop_duplicates().tolist()\n", "\n", " # remove articles that have been seen by the given user, use the remaining articles as recommendations\n", " # (articles are taken in neighbor order; they are not re-ranked by total interactions)\n", " recs = [i for i in read_articles_unique if i not in user_read_articles][:m]\n", " rec_names = get_article_names(recs)\n", " \n", " return recs, rec_names" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The top 10 recommendations for user 20 are the following article ids:\n" ] }, { "data": { "text/plain": [ "['1024.0',\n", " '1085.0',\n", " '109.0',\n", " '1150.0',\n", " '1151.0',\n", " '1152.0',\n", " '1153.0',\n", " '1154.0',\n", " '1157.0',\n", " '1160.0']" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Quick spot check - don't change this code - just use it to test your functions\n", "rec_ids, rec_names = user_user_recs_part2(20, 10)\n", "print(\"The top 10 recommendations for user 20 are the following article ids:\")\n", "rec_ids" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The top 10 recommendations for user 20 are the following article names:\n" ] }, { "data": { "text/plain": [ "['airbnb data for analytics: washington d.c. 
listings',\n", " 'analyze accident reports on amazon emr spark',\n", " 'tensorflow quick tips',\n", " 'airbnb data for analytics: venice listings',\n", " 'airbnb data for analytics: venice calendar',\n", " 'airbnb data for analytics: venice reviews',\n", " 'using deep learning to reconstruct high-resolution audio',\n", " 'airbnb data for analytics: vienna listings',\n", " 'airbnb data for analytics: vienna calendar',\n", " 'airbnb data for analytics: chicago listings']" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "print(\"The top 10 recommendations for user 20 are the following article names:\")\n", "rec_names" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`5.` Use your functions from above to correctly fill in the solutions to the dictionary below. Then test your dictionary against the solution. Provide the code you need to answer each following the comments below." ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "collapsed": true }, "outputs": [], "source": [ "### Tests with a dictionary of results\n", "\n", "# Find the user that is most similar to user 1 \n", "user1_most_sim = get_top_sorted_users(1)['neighbor_id'].head(1).values[0]\n", "# Find the 10th most similar user to user 131\n", "user131_10th_sim = get_top_sorted_users(131)['neighbor_id'].head(10).values[-1]" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "This all looks good! 
Nice job!\n" ] } ], "source": [ "## Dictionary Test Here\n", "sol_5_dict = {\n", " 'The user that is most similar to user 1.': user1_most_sim, \n", " 'The user that is the 10th most similar to user 131': user131_10th_sim,\n", "# \"The top 10 recommendations for user 20 are the following article ids:\": rec_ids,\n", "# \"The top 10 recommendations for user 20 are the following article names:\": rec_names\n", "}\n", "\n", "t.sol_5_test(sol_5_dict)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`6.` If we were given a new user, which of the above functions would you be able to use to make recommendations? Explain. Can you think of a better way we might make recommendations? Use the cell below to explain a better method for new users." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A new user has no interaction history, so the collaborative filtering functions above (**find_similar_users**, **user_user_recs** and **user_user_recs_part2**) cannot be used. The user would have no row in *user_item*, and even an all-zero row would give a dot product of 0 with every other user, so no neighbor would be more similar than any other. The rank-based functions (**get_top_articles** and **get_top_article_ids**) do not depend on the user's history, so they can still be used: recommending the most popular articles is a reasonable default until the new user has some interactions. A content-based recommender could refine this once something is known about the user's interests." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`7.` Using your existing functions, provide the top 10 recommended articles you would provide for a new user below. You can test your function against our thoughts to make sure we are all on the same page with how we might make a recommendation." ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "collapsed": true }, "outputs": [], "source": [ "new_user = '0.0'\n", "\n", "# What would your recommendations be for this new user '0.0'? As a new user, they have no observed articles.\n", "# Provide a list of the top 10 article ids you would give to \n", "new_user_recs = get_top_article_ids(10)" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "That's right! 
Nice job!\n" ] } ], "source": [ "assert set(new_user_recs) == set(['1314.0','1429.0','1293.0','1427.0','1162.0','1364.0','1304.0','1170.0','1431.0','1330.0']), \"Oops! It makes sense that in this case we would want to recommend the most popular articles, because we don't know anything about these users.\"\n", "\n", "print(\"That's right! Nice job!\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Part IV: Content Based Recommendations (EXTRA - NOT REQUIRED)\n", "\n", "Another method we might use to make recommendations is to perform a ranking of the highest ranked articles associated with some term. You might consider content to be the **doc_body**, **doc_description**, or **doc_full_name**. There isn't one way to create a content based recommendation, especially considering that each of these columns hold content related information. \n", "\n", "`1.` Use the function body below to create a content based recommender. Since there isn't one right answer for this recommendation tactic, no test functions are provided. Feel free to change the function inputs if you decide you want to try a method that requires more input values. The input values are currently set with one idea in mind that you may use to make content based recommendations. One additional idea is that you might want to choose the most popular recommendations that meet your 'content criteria', but again, there is a lot of flexibility in how you might make these recommendations.\n", "\n", "### This part is NOT REQUIRED to pass this project. However, you may choose to take this on as an extra way to show off your skills." 
] }, { "cell_type": "code", "execution_count": 33, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def make_content_recs():\n", " '''\n", " INPUT:\n", " \n", " OUTPUT:\n", " \n", " '''" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`2.` Now that you have put together your content-based recommendation system, use the cell below to write a summary explaining how your content based recommender works. Do you see any possible improvements that could be made to your function? Is there anything novel about your content based recommender?\n", "\n", "### This part is NOT REQUIRED to pass this project. However, you may choose to take this on as an extra way to show off your skills." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Write an explanation of your content based recommendation system here.**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`3.` Use your content-recommendation system to make recommendations for the below scenarios based on the comments. Again no tests are provided here, because there isn't one right answer that could be used to find these content based recommendations.\n", "\n", "### This part is NOT REQUIRED to pass this project. However, you may choose to take this on as an extra way to show off your skills." ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# make recommendations for a brand new user\n", "\n", "\n", "# make recommendations for a user who has only interacted with article id '1427.0'\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Part V: Matrix Factorization\n", "\n", "In this part of the notebook, you will use matrix factorization to make article recommendations to the users on the IBM Watson Studio platform.\n", "\n", "`1.` You should have already created a **user_item** matrix in **question 1** of **Part III** above. 
This first question here will just require that you run the cells to get things set up for the rest of **Part V** of the notebook. " ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# Load the matrix here\n", "user_item_matrix = pd.read_pickle('user_item_matrix.p')" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n",
 " <tr><th>article_id</th><th>0.0</th><th>100.0</th><th>1000.0</th><th>1004.0</th><th>1006.0</th><th>1008.0</th><th>101.0</th><th>1014.0</th><th>1015.0</th><th>1016.0</th><th>...</th><th>977.0</th><th>98.0</th><th>981.0</th><th>984.0</th><th>985.0</th><th>986.0</th><th>990.0</th><th>993.0</th><th>996.0</th><th>997.0</th></tr>\n",
 " <tr><th>user_id</th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th></tr>\n",
 " </thead>\n", " <tbody>\n",
 " <tr><th>1</th><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>...</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td></tr>\n",
 " <tr><th>2</th><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>...</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td></tr>\n",
 " <tr><th>3</th><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>...</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td></tr>\n",
 " <tr><th>4</th><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>...</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td></tr>\n",
 " <tr><th>5</th><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>...</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td></tr>\n",
 " </tbody>\n", "</table>\n", "<p>5 rows × 714 columns</p>\n", "</div>
" ], "text/plain": [ "article_id 0.0 100.0 1000.0 1004.0 1006.0 1008.0 101.0 1014.0 1015.0 \\\n", "user_id \n", "1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "3 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "5 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "\n", "article_id 1016.0 ... 977.0 98.0 981.0 984.0 985.0 986.0 990.0 \\\n", "user_id ... \n", "1 0.0 ... 0.0 0.0 1.0 0.0 0.0 0.0 0.0 \n", "2 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "3 0.0 ... 1.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "4 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "5 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "\n", "article_id 993.0 996.0 997.0 \n", "user_id \n", "1 0.0 0.0 0.0 \n", "2 0.0 0.0 0.0 \n", "3 0.0 0.0 0.0 \n", "4 0.0 0.0 0.0 \n", "5 0.0 0.0 0.0 \n", "\n", "[5 rows x 714 columns]" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# quick look at the matrix\n", "user_item_matrix.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`2.` In this situation, you can use Singular Value Decomposition from [numpy](https://docs.scipy.org/doc/numpy-1.14.0/reference/generated/numpy.linalg.svd.html) on the user-item matrix. Use the cell to perform SVD, and explain why this is different than in the lesson." ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Perform SVD on the User-Item Matrix \n", "u, s, vt = np.linalg.svd(user_item_matrix)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the lesson, the user-item matrix contained missing values, and singular value decomposition cannot be computed directly on a matrix with missing entries, so applying it there raised errors. In this project every entry of the user-item matrix is defined: 1 if the user interacted with the article and 0 otherwise. With no missing values, we can apply `np.linalg.svd` directly and obtain an exact decomposition."
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`3.` Now for the tricky part, how do we choose the number of latent features to use? Running the below cell, you can see that as the number of latent features increases, we obtain a lower error rate on making predictions for the 1 and 0 values in the user-item matrix. Run the cell below to get an idea of how the accuracy improves as we increase the number of latent features." ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYUAAAEWCAYAAACJ0YulAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4wLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvpW3flQAAIABJREFUeJzt3Xl8HXW9//HXO1vTfd8XUkoptIUu\n1LKKVRYBkaKAUlHBi6A/RdwVXLiK3ut61XsVVEQEZZNFsCCKiBQVFdrSjW50pUnXdEnbtE2zfX9/\nzCQM6Wlz0vbknCTv5+NxHpnlOzOfmXMyn5nvzHxHIQTMzMwA8rIdgJmZ5Q4nBTMza+SkYGZmjZwU\nzMyskZOCmZk1clIwM7NGTgpmByHpbknfzNKyJelXknZIeikbMVjH5KSQBZJmxf/snbIdS1siaa2k\nzZK6JoZ9WNKsLIaVKWcB5wHDQghTm46UdI2kf7R0ppKmSSo7GgHG8/uapHubKbNW0j5JlYnPkCNc\n7lFdD3udk0Irk1QCvBkIwCWtvOyC1lxehhQAn8x2EC0lKb+FkxwDrA0h7MlEPFnwzhBCt8RnQzaD\naSf/CxnhpND6Pgj8G7gbuDo5QlJnSf8j6TVJOyX9Q1LneNxZkv4pqUJSqaRr4uGzJH04MY83HEFK\nCpI+LmkFsCIe9r/xPHZJmivpzYny+ZK+JGmVpN3x+OGSbpP0P03ifULSp5quoKSfSfp+k2G/l/SZ\nuPuLktbH818u6ZwWbL/vAZ+T1CvFckvi9S1IDGvcPvG2eUHSD+PtuFrSGfHwUklbJF3dZLb9JD0T\nx/q8pGMS8z4hHrc9Xo/3JMbdLemnkp6StAd4a4p4h0iaGU+/UtJ18fBrgTuB0+Oj6q+3YPsg6UOS\nlsYxr5b0kXh4V+CPwJDkEbukPEk3xd/5NkkPSerTZJteLWmdpK2SvhyPuwD4EvDeeF4LWhJnPI/T\nEr/rBZKmHcF6vKG6T03OJhSdsXxR0kJgj6SCeLpHJZVLWiPpxkT5qZLmxP8nmyX9oKXr1yaFEPxp\nxQ+wEvgYcApQAwxMjLsNmAUMBfKBM4BOwAhgNzADKAT6AhPjaWYBH07M4xrgH4n+ADwD9AE6x8Pe\nH8+jAPgssAkojsd9HlgEjAEETIjLTgU2AHlxuX7A3mT8iWWeDZQCivt7A/uAIfF8S4Eh8bgSYFSa\n224tcC7wO+Cb8bAPA7MS8wpAQWKaxu0Tb5ta4EPx9v0msC7e7p2A8+Pt3C0uf3fcf3Y8/n8bti3Q\nNV6PD8XbcTKwFRiXmHYncCbRwVdxivV5HrgdKAYmAuXAOam+xxTTHnQ88A5gVPz9vSX+nibH46YB\nZU3Kf4roQGVYvJ4/Bx5osk1/AXSOfw/7gRPj8V8D7k3ne0sxfC
iwDbgo3kbnxf39D3M97m74XaQq\nE8cxHxger0seMBe4BSgCjgVWA2+Py/8L+EDc3Q04Ldv7j9b4ZD2AjvQhqieuAfrF/cuAT8fdeUQ7\nzgkpprsZeOwg85xF80nhbc3EtaNhucByYPpByi0Fzou7bwCeOkg5Ee1sz477rwP+GncfB2wh2rkX\ntnD7rY2nG0+0w+1Py5PCisS4k+LyycS8jdcT7t3Ag4lx3YC6eKfyXuDvTeL7OfCfiWl/fYh1GR7P\nq3ti2LeAu1N9jymmP+T4JmUfBz4Zd0/jwJ3pUuJkFPcPjn+nBYltOiwx/iXgyrj7a6SXFCqBivjz\neDz8i8BvmpR9Grj6MNfjbppPCv+R6D8VWJfif+1XcfffgK8T/792lI+rj1rX1cCfQwhb4/77eb0K\nqR/REeOqFNMNP8jwdJUmeyR9Nj4t3ympAugZL7+5Zd1DdJZB/Pc3qQqF6D/qQaIzG4D3AffF41YS\nHZl+Ddgi6UG18KJjCOEV4EngppZMF9uc6N4Xz6/psG6J/sZtF0KoBLYTnfEcA5waV3tUxNvxKmBQ\nqmlTGAJsDyHsTgx7jejo+YhIulDSv+NqqQqiI/F+h5jkGOCxxHosJUpYAxNlNiW69/LGbZSOS0MI\nveLPpYnlXtFkG55FlJQOZz3SkfxOjiGqgkou/0u8vt7XAscDyyTNlnTxES67TfDFllai6NrAe4B8\nSQ3/YJ2AXpImEFXZVBGdLjetmy0lqr5JZQ/QJdE/KEWZxqZwFV0/+CJwDrA4hFAvaQfR0X3DskYB\nr6SYz73AK3G8JxIduR3MA8CfJX2b6IjsXY3BhHA/cL+kHkRH198BPnCIeaXyn8DLQPI6R8NF2S7A\nrrg71fZoieENHZK6EVXDbSDaTs+HEM47xLSHaoJ4A9BHUvdEYhgBrD+SYBXd0fYo0bWr34cQaiQ9\nzuvfb6qYSomOoF9IMb+SZhZ5JM0slxKdKVyXYrmHsx4t+l+Il78mhDA6VXAhhBXADEl5wLuBRyT1\nDe3n4n9KPlNoPZcSHX2NJao/nki0Y/078MEQQj1wF/CD+OJXvqTT43+O+4BzJb0nvjjWV9LEeL7z\ngXdL6iLpOKKjm0PpTlSvXg4USLoF6JEYfyfwDUmjFTlZUl+AEEIZMJvoDOHREMK+gy0khDAvXsad\nwNMhhAoASWMkvS1eryqiI/O65jffAfNfCfwWuDExrJxop/r+ePv9B1GCOxIXKbrIXwR8A3gxhFBK\ndKZyvKQPSCqMP2+SdGKa8ZcC/wS+JalY0slE3919LYhN8bSNH6K68U5E275W0oVE10oabAb6SuqZ\nGPYz4L8UX0SX1F/S9DRj2AyUxDvOlroXeKekt8ffV3F8cXjYYa7HfKLvq4+kQURnpIfyErArvvjc\nOY5hvKQ3AUh6v6T+8f9mRTxNi3+rbY2TQuu5mqiucl0IYVPDB/gJcJWiO2Y+R3TGMJuomuI7RBd2\n1xGdOn82Hj6f6IIfwA+BaqJ/kntofqfyNNGdG68SVVdU8cZT6h8ADwF/Jjra/iXRRbkG9xDVxaes\nOmriAaJrAPcnhnUCvk10UXYTMIDolB1JV0lanMZ8G9xKdME36Tqii+XbgHFEO94jcT/RWcl2opsD\nrgKIj+7PB64kOurfRPR9teTZkxlEdfYbgMeIrkc804LpzyBKqk0/NxJ9hzuIqu5mNkwQQlhG9L2s\njqtMhhBdQJ9JdGa3m+ii86lpxvBw/HebpJdbEHtDYpxO9P2XE/0OP0/0m999GOvxG6Kz7LVEv9/f\nNrP8OuCdRAdoa4h+k3cSVacCXAAsllRJtI2uDCFUtWQd26KGu0PM0iLpbKIjvJL4CMrM2hGfKVja\nJBUSPTh2pxOCWfvkpGBpievKK4juDPlRlsMxswxx9ZGZmTXymYKZmTVqc88p9OvXL5SUlGQ7DDOz\nNmXu3LlbQwj9myvX5pJCSU
kJc+bMyXYYZmZtiqTX0inn6iMzM2vkpGBmZo2cFMzMrJGTgpmZNXJS\nMDOzRhlLCpLuUvR6w1RNMBO3wPl/il5DuFDS5EzFYmZm6cnkmcLdRK0MHsyFwOj4cz3w0wzGYmZm\nacjYcwohhL8184KO6USvKwzAvyX1kjQ4hLAxUzGZWftUW1dPTV2guq6emrp6ausCtfUNfwN19VF/\nXX2gpu6N/bX1gboU5WrrG15RGb2ZJ/obNQvUMIwQDamrD9THrzdu6K4PgfqG4UTzadDY2cJmhs45\ncSAThvc68g12CNl8eG0ob2zHvywedkBSkHQ90dkEI0aMaJXgzOzw1NTVU1lVS+X+xKeqlr3VddTU\n1VNdW8/++G9Df0P3/tp69lXXUVVbR1VNHftq6qmqqUt86tlfW0dNXaCmtr4xCdS34SbcpObLNBjQ\no7hdJ4VUmyLlVxtCuAO4A2DKlClt+Os3y3319YHdVbVs31vN9j37qdhbw66qGnbtq2XXvkR31evd\nu6tqqNxfy+6qWvbXtrxVdQmK8vPoVJBH56J8igvzKS7Ip7gon+KCPPp0LYr6C/PoVJBPUUEehfl5\nFBaIovy4Oz+Pwnw1jsvPEwV5Ij9PB/QX5OVRkP/G/vw8UZCvxnJ5irob4hOK/zYEHQ3LE+RJ5OW9\n3p2fF5XNk+JPw3q2IANkSTaTQhmJ998Cw4jeQGVmR9G+6jq27dnPjj01bN9bzY491WzfU82OvdHf\n7U36d+ytoe4Qh96dC/Pp0bmAHsWF9OhcSL9uRZT060r34gK6dyqga6cCunUqoFvc36046u9clE9R\nfh5FBdGnU35+4069IN83QuaKbCaFmcANkh4kevXfTl9PMEtfbV095ZX72VBRxcad+9i0s6qxe+PO\nKrbsqmL73mqqalIfuecJencpok/XInp3LeLYft2YUlJEny5Rf5+uhfTuUkTvLkX07FwY7fSLCykq\n8A68PctYUpD0ADAN6CepjOg9t4UAIYSfAU8RvXd4JbAX+FCmYjFrS0II7NpXy+bdVWzeVcXmXfvj\nv2/s37J7/wFH9J0L8xncq5ghPTtz7Ki+9OvWKd7xRzv4vt2KGhNBj+JC8vJyvzrDWlcm7z6a0cz4\nAHw8U8s3y3W1dfW8tn0vKzZXsnLLblZsqeTVzZWs2VqZ8ui+R3EBA3sUM6hnMcf278uQnp0bE8Cg\nntHfHp0L2kS9teWuNtd0tllbU1cfeG3bHl7dvJvlmypZsWU3K7dUsrp8D9V1r+/8h/bqzOiB3Tj9\n2L4M6VXMgB7FDOpRzMAenRjQvZjORflZXAvrKJwUzI6SEALrK/Y17vyjv7tZWV5JdXxHjgTDe3fh\n+IHdmDZmAKMHdGP0wG6M6t+Nrp3872jZ51+h2WHaWrmf+esqmFe6g3nrKlhYtpPK/bWN4wf3LOb4\ngd05a3Q/jh/YnTEDu3PcgG4+4rec5qRglobq2nqWbNzFvHVRAphXuoPS7fsAKMgTJw7uwaWThnDi\n4B6MGdid0QO707NzYZajNms5JwWzJhqqgeatq2B+aQXz1u3glQ27GquABvcsZtKIXnzwtBImjujF\n+CE9ffRv7YaTgnV4e6trWVi2MzoDWLeDeaUVlO/eD0CngjxOHtaTq08/hskjejNxRC8G9+yc5YjN\nMsdJwTqcEALLNu1m1vJyZi3fwtzXdlAb3+9f0rcLZx3Xj0kjejFpeG9OGNydQj9tax2Ik4J1CLur\nanhh5dY4EZSzaVcVACcM6s61bx7JqSP7MHF4b/p0LcpypGbZ5aRg7daq8kr+smQzzy3fwpy10dlA\n904FnDW6H9PG9Octxw9gUM/ibIdpllOcFKzdqK8PzC+r4M+LN/PMkk2sKt8DRGcDH37zsUwb059T\njunt6iCzQ3BSsDZtf20d/1y1jT8v3sxflm6mfPd+CvLEacf25YOnl3Du2IEM7eULw2bpclKw
NieE\nwD9XbeOBl9bx3LIt7Kmuo2tRPtPGDOD8cQOZNmaAnxEwO0xOCtZm7NxbwyMvl3Hfv19j9dY99O5S\nyCUTh3L+uIGcMaovnQr8rIDZkXJSsJy3sKyC3/zrNZ5YuIGqmnomj+jFD987gQvHD6a40InA7Ghy\nUrCctK+6jicWbODeF19jYdlOuhTl865Jw3j/aSMYN6RntsMza7ecFCynbNpZxT3/Wsv9L65j574a\nRg/oxq3Tx3HppKH0KPZ1ArNMc1KwnPDK+p388h9reGLBBupD4O3jBnH1GSWcOrKPXxpj1oqcFCxr\n6usDzy7bwp1/X82La7bTtSifD55ewofOLGF4ny7ZDs+sQ3JSsFa3t7qWR+eWcdcLa1mzdQ9Dehbz\n5YtO5L1Th7uKyCzLnBSs1dTVB347u5Tv/3k52/dUM2F4L348YxIXjh9EgZ8yNssJTgrWKl5cvY2v\nP7GEJRt3MXVkH77w9jGcckxvXy8wyzEZTQqSLgD+F8gH7gwhfLvJ+GOAu4D+wHbg/SGEskzGZK2r\nbMdevvXHZfxh4UaG9CzmJ++bxDtOGuxkYJajMpYUJOUDtwHnAWXAbEkzQwhLEsW+D/w6hHCPpLcB\n3wI+kKmYrPXsq67jp8+v4ufPr0KCT597PNeffazfUGaW4zJ5pjAVWBlCWA0g6UFgOpBMCmOBT8fd\nzwGPZzAeawUhBJ5YuJFvP7WUDTuruPjkwdx80YlulM6sjchkUhgKlCb6y4BTm5RZAFxGVMX0LqC7\npL4hhG3JQpKuB64HGDFiRMYCtiOzurySm363iJfWbGfs4B786MpJTB3ZJ9thmVkLZDIppKo0Dk36\nPwf8RNI1wN+A9UDtAROFcAdwB8CUKVOazsOyrLaunjv/sYYfPPMqxQV5/Pe7TuK9bxpOfp6vG5i1\nNZlMCmXA8ET/MGBDskAIYQPwbgBJ3YDLQgg7MxiTHWXLNu3iC48sZGHZTt4+biDfmD6eAT38NjOz\ntiqTSWE2MFrSSKIzgCuB9yULSOoHbA8h1AM3E92JZG1AdW09tz23kttnraRHcSG3vW8yF500yHcV\nmbVxGUsKIYRaSTcATxPdknpXCGGxpFuBOSGEmcA04FuSAlH10cczFY8dPQvLKvjCIwtZtmk3l04c\nwi3vHOcX3pu1EwqhbVXRT5kyJcyZMyfbYXRIVTV1/PAvr/KLv61mQPdi/utd4znnxIHZDsvM0iBp\nbghhSnPl/ESzpWVBaQWffmg+q8v3MGPqcG6+6ES3U2TWDjkp2CHV1EXXDn7815UM7N6J+z58Kmce\n1y/bYZlZhjgp2EGtLq/k0w8tYEFpBe+eNJT/vGQcPTv77MCsPXNSsAOEELj3xXX81x+WUFyYz23v\nm8w7Th6c7bDMrBU4KdgbbN5VxRceWcjzr5Zz9vH9+d7lJzPQzx2YdRhOCtboqUUb+dJji6iqqeMb\n08fx/tOO8XMHZh2Mk4JRXVvPVx5fxENzypgwrCc/eO9ERvXvlu2wzCwLnBQ6uN1VNXz03rm8sHIb\nn3jbcdx4zmgK/RY0sw7LSaED27yrimt+NZsVm3fz/SsmcPkpw7IdkpllmZNCB7Vyy26uvms2O/ZW\n88tr3sRbju+f7ZDMLAc4KXRAc9Zu59p75lCYn8dvrz+dk4b1zHZIZpYjnBQ6mD+9solPPjiPIb06\nc8+HpjKib5dsh2RmOcRJoQP5zb/WcsvMxUwY1ou7rnmTWzY1swM4KXQAIQS+9/Rybp+1inNPHMCP\nZ0ymc1F+tsMysxzkpNDOhRD40mOLeOClUmZMHc43po+nwLecmtlBOCm0c3f+fQ0PvFTKR98yii9e\nMMZPKJvZIfmQsR17bvkWvvXHpVx00iC+8HYnBDNrnpNCO7WqvJIbH5jHmEE9+P4VE8jLc0Iws+Y5\nKbRDO/fVcN09cyjKz+MXHzyFLkWuJTSz9Hhv0c7U1Qc+
8cA81m3fy/3Xncaw3n4OwczS56TQznz7\nj0v526vlfOvdJzF1ZJ9sh2NmbUxGq48kXSBpuaSVkm5KMX6EpOckzZO0UNJFmYynvXt0bhm/+Psa\nrj79GGZMHZHtcMysDcpYUpCUD9wGXAiMBWZIGtuk2FeAh0IIk4ArgdszFU979/K6Hdz8u0WcMaov\nX7m46WY2M0tPJs8UpgIrQwirQwjVwIPA9CZlAtAj7u4JbMhgPO3Wpp1VfOQ3cxnUs5jb3jfZ70Mw\ns8OWyb3HUKA00V8WD0v6GvB+SWXAU8AnUs1I0vWS5kiaU15enolY26yqmjo+8ps57N1fyy8+OIXe\nbs/IzI5AJpNCqhvjQ5P+GcDdIYRhwEXAbyQdEFMI4Y4QwpQQwpT+/d3uf9KXH3uFhet38qMrJzFm\nUPdsh2NmbVwmk0IZMDzRP4wDq4euBR4CCCH8CygG+mUwpnblsXllPPpyGTe+bTTnjR2Y7XDMrB3I\nZFKYDYyWNFJSEdGF5JlNyqwDzgGQdCJRUnD9UBpe27aHrzz2ClNL+nDjOaOzHY6ZtRMZSwohhFrg\nBuBpYCnRXUaLJd0q6ZK42GeB6yQtAB4ArgkhNK1isiZq6uq58cH55OeJH145kXw3YWFmR0lGH14L\nITxFdAE5OeyWRPcS4MxMxtAe/eCZV1lQWsFPr5rM0F6dsx2OmbUjvnexjfnnyq387PlVzJg6nAtP\nGpztcMysnXFSaEO276nmU7+dz7H9uvJVP6BmZhngto/aiBACX3hkARV7a7j7Q1Pd8qmZZYTPFNqI\nX//rNf6ydAs3XXgCY4f0aH4CM7PD4KTQBizduIv/emopbx3Tnw+dWZLtcMysHXNSyHH7quu48YF5\n9Cgu5HtXTPArNc0so1wxneO++YclrNhSya//Yyr9unXKdjhm1s75TCGHPb14E/e9uI7rzz6Ws493\nm09mlnlOCjlqx55qvvS7RYwb0oPPnT8m2+GYWQfh6qMc9Y0nl7BzXw33fvhUigqcu82sdXhvk4Oe\nW76F381bz8emjeLEwb791Mxaj5NCjqncX8uXf7eI4wZ04+NvOy7b4ZhZB+PqoxzzvT8tY+OuKh75\n6Ol0KsjPdjhm1sE0e6Yg6QZJvVsjmI5u9trt/Prfr3H16SWcckyfbIdjZh1QOtVHg4DZkh6SdIH8\n9FRGVNXU8cVHFzKkZ2c+/3bfbWRm2dFsUgghfAUYDfwSuAZYIem/JY3KcGwdyo//uoLV5Xv41rtP\nomsn1+qZWXakdaE5fhvapvhTC/QGHpH03QzG1mEs3rCTnz+/mssmD/NDamaWVc0ekkq6Ebga2Arc\nCXw+hFAjKQ9YAXwhsyG2b7V19Xzx0YX06lLIVy8+MdvhmFkHl049RT/g3SGE15IDQwj1ki7OTFgd\nx53/WMMr63dx+1WT6dWlKNvhmFkHl0710VPA9oYeSd0lnQoQQliaqcA6gjVb9/DDZ17l7eMGcuH4\nQdkOx8wsraTwU6Ay0b8nHtas+G6l5ZJWSropxfgfSpoff16VVJFe2G1ffX3gpkcXUlSQx63Tx7tJ\nbDPLCelUHym+0Aw0Vhulcy0iH7gNOA8oI7qtdWYIYUliXp9OlP8EMKklwbdlD84u5cU12/nOZScx\nsEdxtsMxMwPSO1NYLelGSYXx55PA6jSmmwqsDCGsDiFUAw8C0w9RfgbwQBrzbfP219bxv8++yptK\nevOeKcOzHY6ZWaN0ksJHgTOA9URH/KcC16cx3VCgNNFfFg87gKRjgJHAXw8y/npJcyTNKS8vT2PR\nue3RuevZvGs/nzzneFcbmVlOabYaKISwBbjyMOadam8XUgwjnv8jIYS6g8RwB3AHwJQpUw42jzah\ntq6enz2/ignDenLmcX2zHY6Z2Rukc22gGLgWGAc0Vn6HEP6jmUnLgGTdyDBgw0HKXgl8vLlY2oMn\nF25k3fa9fOUdp/gs
wcxyTjrVR78hav/o7cDzRDv33WlMNxsYLWmkpCKiHf/MpoUkjSF6Qvpf6Qbd\nVtXXB26ftZLjB3bj3BMHZjscM7MDpJMUjgshfBXYE0K4B3gHcFJzE4UQaoEbgKeBpcBDIYTFkm6V\ndEmi6AzgweQdTu3VM0s38+rmSj427Tjy8nyWYGa5J51bUmvivxWSxhO1f1SSzsxDCE8RPfyWHHZL\nk/6vpTOvti6EwO3PrWREny5cfPLgbIdjZpZSOmcKd8TvU/gKUfXPEuA7GY2qHXph5TYWlO3ko28Z\nRUG+X3hnZrnpkGcKcaN3u0IIO4C/Ace2SlTt0E+eW8HAHp247JSUd+WameWEQx6yhhDqia4L2BGY\n+9p2/r16O9e9+Vi/YtPMclo69RjPSPqcpOGS+jR8Mh5ZO3L7c6vo3aWQGVNHZDsUM7NDSudCc8Pz\nCMnnCAKuSkrLkg27eHbZFj5z3vF+o5qZ5bx0nmge2RqBtFe3z1pJt04FXH16SbZDMTNrVjpPNH8w\n1fAQwq+Pfjjty+rySv6waCMfOXsUPbsUZjscM7NmpVOf8aZEdzFwDvAy4KTQjJ89v4qi/DyuPcsn\nW2bWNqRTffSJZL+knkRNX9ghrK/Yx+9eXs9Vp46gf/dO2Q7HzCwth/MU1V5g9NEOpL35xd+iV05c\n/5ZRWY7EzCx96VxTeILXm7zOA8YCD2UyqLZua+V+HnhpHe+aNJShvTpnOxwzs7Slc03h+4nuWuC1\nEEJZhuJpF+5+YS3VdfV8dJrPEsysbUknKawDNoYQqgAkdZZUEkJYm9HI2qjaunoemlPKW8cMYFT/\nbtkOx8ysRdK5pvAwUJ/or4uHWQp/X7GVLbv3854pw7IdiplZi6WTFApCCNUNPXF3UeZCatsenltK\nn65FvO0Ev0THzNqedJJCefKlOJKmA1szF1LbtX1PNc8s2cz0iUMoKnDz2GbW9qRzTeGjwH2SfhL3\nlwEpn3Lu6H4/fz01dYErThnefGEzsxyUzsNrq4DTJHUDFEJI5/3MHdLDc8oYP7QHY4f0yHYoZmaH\npdk6Dkn/LalXCKEyhLBbUm9J32yN4NqSxRt2smTjLp8lmFmblk7F94UhhIqGnvgtbBdlLqS26eE5\nZRTl5zF94pBsh2JmdtjSSQr5khob75HUGXBjPgn7a+v4/fz1nDduIL26+MYsM2u70kkK9wLPSrpW\n0rXAM8A96cxc0gWSlktaKemmg5R5j6QlkhZLuj/90HPHs0u3sGNvDVec4mcTzKxtS+dC83clLQTO\nBQT8CTimuekk5QO3AecR3bE0W9LMEMKSRJnRwM3AmSGEHZIGHN5qZNfDc0oZ1KOYN4/un+1QzMyO\nSLo3028ieqr5MqL3KSxNY5qpwMoQwur4gbcHgelNylwH3BZfpyCEsCXNeHLG5l1VPP9qOe+ePJT8\nPGU7HDOzI3LQMwVJxwNXAjOAbcBviW5JfWua8x4KlCb6y4BTm5Q5Pl7WC0A+8LUQwp9SxHI9cD3A\niBEj0lx86/jdy+upD3C5q47MrB041JnCMqKzgneGEM4KIfyYqN2jdKU6bA5N+guI3s0wjSj53Cmp\n1wEThXBHCGFKCGFK//65U0UTQuDhuaW8qaQ3x7rxOzNrBw6VFC4jqjZ6TtIvJJ1D6h39wZQByZv2\nhwEbUpT5fQihJoSwBlhOG3qBz8vrdrC6fI+fTTCzduOgSSGE8FgI4b3ACcAs4NPAQEk/lXR+GvOe\nDYyWNFJSEVFV1MwmZR4H3gogqR9RddLqFq9Fljw8p4zOhflcdPLgbIdiZnZUNHuhOYSwJ4RwXwjh\nYqKj/flAyttLm0xXC9wAPE10YfqhEMJiSbcmGth7GtgmaQnwHPD5EMK2w1yXVrW3upYnF27kopMG\n061TOk1ImZnlvhbtzUII24EyJZo7AAARQUlEQVSfx590yj8FPNVk2C2J7gB8Jv60KX
96ZROV+2u5\nwu9NMLN2xO07H6aH55RxTN8unDqyT7ZDMTM7apwUDsO6bXv51+ptXD55GJKfTTCz9sNJ4TA88nIZ\nElzmZxPMrJ1xUmih+vrAo3PLOOu4fgzp1Tnb4ZiZHVVOCi30r9XbWF+xjyum+NkEM2t/nBRa6NGX\ny+heXMD5YwdmOxQzs6POSaEFqmrq+PPizVw4fhDFhfnZDsfM7KhzUmiB55ZtoXJ/LZdMGJrtUMzM\nMsJJoQVmLthAv26dOH1U32yHYmaWEU4KadpdVcOzy7bwjpMG+b0JZtZuOSmk6Zklm6mureeSiUOy\nHYqZWcY4KaRp5oINDO3Vmckjemc7FDOzjHFSSMP2PdX8Y8VW3jlhiJu1MLN2zUkhDU8t2khtfeCd\nE/zeBDNr35wU0vDEgg2M6t+VsYN7ZDsUM7OMclJoxqadVby0djuXTBjqqiMza/ecFJrx5MINhIDv\nOjKzDsFJoRkzF2zgpKE9Gdmva7ZDMTPLOCeFQ1izdQ8Ly3b6ArOZdRhOCofw5IINAFx8squOzKxj\ncFI4iBACMxdsYGpJH79Mx8w6jIwmBUkXSFouaaWkm1KMv0ZSuaT58efDmYynJZZt2s2KLZW80xeY\nzawDKcjUjCXlA7cB5wFlwGxJM0MIS5oU/W0I4YZMxXG4Zi7YQH6euGj8oGyHYmbWajJ5pjAVWBlC\nWB1CqAYeBKZncHlHTQiBJxZs4Mzj+tG3W6dsh2Nm1moymRSGAqWJ/rJ4WFOXSVoo6RFJKV98LOl6\nSXMkzSkvL89ErG8wr7SCsh37uGSCq47MrGPJZFJI9fhvaNL/BFASQjgZ+AtwT6oZhRDuCCFMCSFM\n6d+//1EO80Az52+gqCCP88f5Pcxm1rFkMimUAckj/2HAhmSBEMK2EML+uPcXwCkZjCctdfWBPyza\nyFvH9KdHcWG2wzEza1WZTAqzgdGSRkoqAq4EZiYLSEo+FXYJsDSD8aTl36u3Ub57v9/DbGYdUsbu\nPgoh1Eq6AXgayAfuCiEslnQrMCeEMBO4UdIlQC2wHbgmU/Gk64kFG+halM85Jw7IdihmZq0uY0kB\nIITwFPBUk2G3JLpvBm7OZAwtUV1bzx9f2cT54wZRXJif7XDMzFqdn2hO+Nur5ezcV+O7jsysw3JS\nSJj16ha6dSrgzOP6ZTsUM7OscFJImF9awcnDelJU4M1iZh2T936xqpo6lm3czcThvbIdiplZ1jgp\nxF5Zv5Pa+uCkYGYdmpNCbH5pBQATRzgpmFnH5aQQm19awdBenRnQvTjboZiZZY2TQmx+aYWrjsys\nw3NSALZW7qdsxz4nBTPr8JwUgPnrfD3BzAycFICo6ig/T4wf0jPboZiZZZWTAlFSOGFQdzoXub0j\nM+vYOnxSqK8PLPBFZjMzwEmB1Vsr2b2/1knBzAwnBebFF5kn+SKzmZmTwvzSCroXF3Bsv27ZDsXM\nLOucFEormDCsF3l5ynYoZmZZ16GTwr7qOpZtcsuoZmYNOnRSeGXDTurcMqqZWaMOnRQanmSe4KRg\nZgZkOClIukDSckkrJd10iHKXSwqSpmQynqYaWkbt371Tay7WzCxnZSwpSMoHbgMuBMYCMySNTVGu\nO3Aj8GKmYjmY+aUVbu/IzCwhk2cKU4GVIYTVIYRq4EFgeopy3wC+C1RlMJYDbNldxfqKfUxy1ZGZ\nWaNMJoWhQGmivywe1kjSJGB4COHJDMaRUmPLqE4KZmaNMpkUUt34HxpHSnnAD4HPNjsj6XpJcyTN\nKS8vPyrBzS+toCBPjB/qllHNzBpkMimUAcMT/cOADYn+7sB4YJaktcBpwMxUF5tDCHeEEKaEEKb0\n79//qAQ3v7SCEwZ3p7jQLaOamTXIZFKYDYyWNFJSEXAlMLNhZAhhZwihXwihJIRQAvwbuCSEMCeD\nMQFQVx9YWLbTVUdmZk1kLCmEEGqBG4CngaXAQy
GExZJulXRJppabjlXllVTur2Xi8N7ZDMPMLOcU\nZHLmIYSngKeaDLvlIGWnZTKWJF9kNjNLrUM+0TyvsWXUrtkOxcwsp3TIpDA/ftOaW0Y1M3ujDpcU\n9lbXsnzTLlcdmZml0OGSwqKyndQHX08wM0ulwyWF+aW+yGxmdjAdMikM79OZvt3cMqqZWVMdMin4\n+QQzs9Q6VFLYvKuKjTurXHVkZnYQHSopzPNDa2Zmh9ShksL80goK88W4IT2yHYqZWU7qYElhBycO\n7uGWUc3MDqLDJIW6+sAit4xqZnZIHSYprNxSyZ7qOicFM7ND6DBJYX7pDsAXmc3MDqXDJIXeXYo4\nb+xARrplVDOzg8ro+xRyyfnjBnH+uEHZDsPMLKd1mDMFMzNrnpOCmZk1clIwM7NGTgpmZtbIScHM\nzBplNClIukDSckkrJd2UYvxHJS2SNF/SPySNzWQ8ZmZ2aBlLCpLygduAC4GxwIwUO/37QwgnhRAm\nAt8FfpCpeMzMrHmZPFOYCqwMIawOIVQDDwLTkwVCCLsSvV2BkMF4zMysGZl8eG0oUJroLwNObVpI\n0seBzwBFwNtSzUjS9cD1cW+lpOVpxtAP2JpuwDmircXc1uIFx9xa2lrMbS1eaFnMx6RTKJNJQSmG\nHXAmEEK4DbhN0vuArwBXpyhzB3BHiwOQ5oQQprR0umxqazG3tXjBMbeWthZzW4sXMhNzJquPyoDh\nif5hwIZDlH8QuDSD8ZiZWTMymRRmA6MljZRUBFwJzEwWkDQ60fsOYEUG4zEzs2ZkrPoohFAr6Qbg\naSAfuCuEsFjSrcCcEMJM4AZJ5wI1wA5SVB0doRZXOeWAthZzW4sXHHNraWsxt7V4IQMxKwTf8GNm\nZhE/0WxmZo2cFMzMrFG7TArNNa+RLZLukrRF0iuJYX0kPSNpRfy3dzxckv4vXoeFkiZnKebhkp6T\ntFTSYkmfzOW4JRVLeknSgjjer8fDR0p6MY73t/HND0jqFPevjMeXtGa8TWLPlzRP0pNtIWZJaxPN\n1MyJh+Xk7yIRcy9Jj0haFv+mT8/lmCWNibdvw2eXpE9lNOYQQrv6EF3UXgUcS/RA3AJgbLbjimM7\nG5gMvJIY9l3gprj7JuA7cfdFwB+Jnvc4DXgxSzEPBibH3d2BV4maLcnJuOPldou7C4EX4zgeAq6M\nh/8M+H9x98eAn8XdVwK/zeLv4zPA/cCTcX9OxwysBfo1GZaTv4tEfPcAH467i4BeuR5zIvZ8YBPR\nQ2gZizlrK5jBDXc68HSi/2bg5mzHlYinpElSWA4MjrsHA8vj7p8DM1KVy3L8vwfOawtxA12Al4me\npN8KFDT9jRDdHXd63F0Ql1MWYh0GPEv0VP+T8T91rsecKink7O8C6AGsabqtcjnmJnGeD7yQ6Zjb\nY/VRquY1hmYplnQMDCFsBIj/DoiH59x6xNUUk4iOvnM27rgaZj6wBXiG6MyxIoRQmyKmxnjj8TuB\nvq0Zb+xHwBeA+ri/L7kfcwD+LGmuoqZoIId/F0S1B+XAr+JqujsldSW3Y066Engg7s5YzO0xKaTV\nvEYbkFPrIakb8CjwqfDGhgwPKJpiWKvGHUKoC1HLu8OIGmY88RAxZT1eSRcDW0IIc5ODUxTNmZhj\nZ4YQJhO1hPxxSWcfomwuxFxAVH370xDCJGAPUdXLweRCzADE15MuAR5urmiKYS2KuT0mhZY2r5Ft\nmyUNBoj/bomH58x6SCokSgj3hRB+Fw/O+bhDCBXALKK61V6SGh7WTMbUGG88viewvXUj5UzgEklr\niZp7eRvRmUMux0wIYUP8dwvwGFECzuXfRRlQFkJ4Me5/hChJ5HLMDS4EXg4hbI77MxZze0wKzTav\nkWNm8vqT3FcT1dk3DP9gfDfBacDOhtPF1iRJwC+BpSGE5PsucjJuSf0l9Yq7OwPnAkuB54DLDxJv\nw3pcDvw1xJ
WxrSWEcHMIYVgIoYTo9/rXEMJV5HDMkrpK6t7QTVTf/Qo5+rsACCFsAkoljYkHnQMs\nyeWYE2bwetURZDLmbF00yfAFmYuI7pJZBXw52/Ek4noA2EjUrEcZcC1RXfCzRO0+PQv0icuK6CVF\nq4BFwJQsxXwW0ennQmB+/LkoV+MGTgbmxfG+AtwSDz8WeAlYSXQK3ikeXhz3r4zHH5vl38g0Xr/7\nKGdjjmNbEH8WN/yf5ervIhH3RGBO/Pt4HOjdBmLuAmwDeiaGZSxmN3NhZmaN2mP1kZmZHSYnBTMz\na+SkYGZmjZwUzMyskZOCmZk1clKwIyYpSPqfRP/nJH3tKM37bkmXN1/yiJdzRdxq5nNNhpco0apt\nGvO5VNLYI4ijRNL7DjFuX5NWM4uO5jLMnBTsaNgPvFtSv2wHkiQpvwXFrwU+FkJ46xEu9lKiVmQP\nVwlwqB32qhDCxMSnOgPLSKmF29PaKCcFOxpqid4V++mmI5oe6UuqjP9Ok/S8pIckvSrp25KuUvQu\nhEWSRiVmc66kv8flLo6nz5f0PUmz43bjP5KY73OS7id6eKdpPDPi+b8i6TvxsFuIHtL7maTvpbPC\nkq6Ll71A0qOSukg6g6h9mu/FR/Gj4s+f4kbj/i7phMR2+T9J/5S0OrGNvg28OZ7+gO15kFi6KnpX\nx2xFDb1Nj4eXxMt8Of6ckWoZkq6R9JPE/J6UNC3urpR0q6QXgdMlnRJ/b3MlPa3Xm1q4UdKS+Lt4\nMJ24LUdl4wk9f9rXB6gkapZ4LVE7PJ8DvhaPuxu4PFk2/jsNqCBq9rcTsB74ejzuk8CPEtP/iegA\nZjTRk+DFwPXAV+IynYieUh0Zz3cPMDJFnEOAdUB/osbR/gpcGo+bRYqnP2nS1HlieN9E9zeBTxxk\nfZ8FRsfdpxI1SdFQ7uF4vcYCKxPb5cmDbOcSYB+vP1l+Wzz8v4H3x929iJ7m70r0JGxxPHw0MCfV\nMoBrgJ8k+p8EpsXdAXhP3F0I/BPoH/e/F7gr7t7A609c98r2b9Kfw/80NLZldkRCCLsk/Rq4kWjH\nlY7ZIW6XRdIq4M/x8EVAshrnoRBCPbBC0mrgBKK2dk5OHGH3JNrxVQMvhRDWpFjem4BZIYTyeJn3\nEb346PE0400aL+mbRDvhbkTvOHgDRS3LngE8LDU2XtkpUeTxeL2WSBqY5nJXhagF2KTziRrU+1zc\nXwyMINpR/0TSRKAOOD7NZSTVETWGCDAGGA88E69PPlGzLRA1G3GfpMc5vO1pOcJJwY6mHxG91OZX\niWG1xNWUivYkyQuj+xPd9Yn+et7422zaFksgauPlEyGEN+yM42qPPQeJL1WzwofrbqKzjAWSriE6\n+m4qj+idCE134g2S638ksQm4LISw/A0Do4v9m4EJcSxVB5m+8TuKFSe6q0IIdYnlLA4hnJ5iHu8g\nSrCXAF+VNC68/i4Ia0N8TcGOmhDCdqJXSF6bGLwWOCXunk5UBdFSV0jKi68zHEv0Nqmngf+nqFlv\nJB2vqLXOQ3kReIukfvFF0xnA84cRD0SvJt0YL/+qxPDd8ThC9N6JNZKuiGOUpAnNzLdx+hZ4GvhE\nnHSRNCke3hPYGJ+NfIDoyD7VMtYCE+NtPJyoCexUlgP9JZ0eL6dQ0jhJecDwEMJzRC8Kajh7sjbI\nScGOtv8Bknch/YJoR/wSUZ36wY7iD2U50c77j8BHQwhVwJ1EzR6/rOiW0Z/TzJlvXFV1M1GT1AuI\n2qf//aGmiY2RVJb4XAF8lSjJPAMsS5R9EPh8fMF3FFHCuFZSQ2ui05tZ1kKgNr6AndaFZuAbRMl2\nYbwtvhEPvx24WtK/iaqOGrZ902W8QPSaykXA94nO9g4QojudLge+E6/PfKLqsXzgXkmLiFqo/WGI\n3mVhbZBbSTUzs0Y+UzAzs0ZOCmZm1shJwczMGjkpmJlZIycFMzNr5KRgZmaN
nBTMzKzR/wc1kNaI\nzW6PxQAAAABJRU5ErkJggg==\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "num_latent_feats = np.arange(10,700+10,20)\n", "sum_errs = []\n", "\n", "for k in num_latent_feats:\n", " # restructure with k latent features\n", " s_new, u_new, vt_new = np.diag(s[:k]), u[:, :k], vt[:k, :]\n", " \n", " # take dot product\n", " user_item_est = np.around(np.dot(np.dot(u_new, s_new), vt_new))\n", " \n", " # compute error for each prediction to actual value\n", " diffs = np.subtract(user_item_matrix, user_item_est)\n", " \n", " # total errors and keep track of them\n", " err = np.sum(np.sum(np.abs(diffs)))\n", " sum_errs.append(err)\n", " \n", " \n", "plt.plot(num_latent_feats, 1 - np.array(sum_errs)/user_interacts.shape[0]);\n", "plt.xlabel('Number of Latent Features');\n", "plt.ylabel('Accuracy');\n", "plt.title('Accuracy vs. Number of Latent Features');" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`4.` From the above, we can't really be sure how many features to use, because predicting the 1's and 0's of the matrix more accurately doesn't tell us whether we can make good recommendations. Instead, we might split our dataset into a training and test set of data, as shown in the cell below. \n", "\n", "Use the code from question 3 to understand the impact on accuracy of the training and test sets of data with different numbers of latent features. Using the split below: \n", "\n", "* How many users can we make predictions for in the test set? \n", "* How many users are we not able to make predictions for because of the cold start problem?\n", "* How many movies can we make predictions for in the test set? \n", "* How many movies are we not able to make predictions for because of the cold start problem?" 
] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "C:\\Users\\DELL\\Anaconda3\\lib\\site-packages\\ipykernel_launcher.py:14: SettingWithCopyWarning: \n", "A value is trying to be set on a copy of a slice from a DataFrame.\n", "Try using .loc[row_indexer,col_indexer] = value instead\n", "\n", "See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy\n", " \n" ] } ], "source": [ "df_train = user_interacts.head(40000)\n", "df_test = user_interacts.tail(5993)\n", "\n", "def create_test_and_train_user_item(df_train, df_test):\n", " '''\n", " INPUT:\n", " df_train - training dataframe\n", " df_test - test dataframe\n", " \n", " OUTPUT:\n", " user_item_train - a user-item matrix of the training dataframe \n", " (unique users for each row and unique articles for each column)\n", " user_item_test - a user-item matrix of the testing dataframe \n", " (unique users for each row and unique articles for each column)\n", " test_idx - all of the test user ids\n", " test_arts - all of the test article ids\n", " \n", " '''\n", " user_item_train = create_user_item_matrix(df_train)\n", " user_item_test = create_user_item_matrix(df_test)\n", " \n", " test_idx = list(user_item_test.index)\n", " test_arts = list(user_item_test.columns)\n", " \n", " return user_item_train, user_item_test, test_idx, test_arts\n", "\n", "user_item_train, user_item_test, test_idx, test_arts = create_test_and_train_user_item(df_train, df_test)" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Awesome job! That's right! All of the test movies are in the training data, but there are only 20 test users that were also in the training set. All of the other users that are in the test set we have no data on. 
Therefore, we cannot make predictions for these users using SVD.\n" ] } ], "source": [ "# Replace the values in the dictionary below\n", "a = 662 # test users with no training data (cold start)\n", "b = 574 # test articles that also appear in the training data\n", "c = 20 # test users that also appear in the training data\n", "d = 0 # test articles with no training data (cold start)\n", "\n", "\n", "sol_4_dict = {\n", " 'How many users can we make predictions for in the test set?': c,\n", " 'How many users in the test set are we not able to make predictions for because of the cold start problem?': a, \n", " 'How many movies can we make predictions for in the test set?': b,\n", " 'How many movies in the test set are we not able to make predictions for because of the cold start problem?': d\n", "}\n", "\n", "t.sol_4_test(sol_4_dict)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`5.` Now use the **user_item_train** dataset from above to find **U**, **S**, and **V** transpose using SVD. Then find the subset of rows in the **user_item_test** dataset that you can predict using this matrix decomposition, and try different numbers of latent features to see how many it makes sense to keep based on the accuracy on the test data. 
This will require combining what was done in questions `2` - `4`.\n", "\n", "Build functions that plot training and test accuracy against the number of latent features." ] }, { "cell_type": "code", "execution_count": 41, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# fit SVD on the user_item_train matrix\n", "u_train, s_train, vt_train = np.linalg.svd(user_item_train)\n", "\n", "# keep only the rows of U and columns of V transpose for users and articles that also appear in the test set\n", "test_rows_idx = user_item_train.index.isin(test_idx)\n", "test_col_idxs = user_item_train.columns.isin(test_arts)\n", "u_test = u_train[test_rows_idx, :]\n", "vt_test = vt_train[:, test_col_idxs]" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [], "source": [ "# find the users that exist in both the training and test datasets\n", "user_present_both = np.intersect1d(user_item_test.index, user_item_train.index)\n", "user_item_test_predictable = user_item_test[user_item_test.index.isin(user_present_both)]\n", "\n", "# initialize testing parameters\n", "num_latent_feats = np.arange(10,700+10,20)\n", "sum_errs_train = []\n", "sum_errs_test = []\n", "\n", "for k in num_latent_feats:\n", " # restructure with k latent features for both training and test sets\n", " s_train_lat, u_train_lat, vt_train_lat = np.diag(s_train[:k]), u_train[:, :k], vt_train[:k, :]\n", " u_test_lat, vt_test_lat = u_test[:, :k], vt_test[:k, :]\n", " \n", " # take dot product for both training and test sets\n", " user_item_train_est = np.around(np.dot(np.dot(u_train_lat, s_train_lat), vt_train_lat))\n", " user_item_test_est = np.around(np.dot(np.dot(u_test_lat, s_train_lat), vt_test_lat))\n", " \n", " # compute error for each prediction to actual value\n", " diffs_train = np.subtract(user_item_train, user_item_train_est)\n", " diffs_test = np.subtract(user_item_test_predictable, user_item_test_est)\n", " \n", " # total errors and keep track of them for both training and test sets\n", " err_train = np.sum(np.sum(np.abs(diffs_train)))\n", " err_test = np.sum(np.sum(np.abs(diffs_test)))\n", " 
sum_errs_train.append(err_train)\n", " sum_errs_test.append(err_test) " ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAnEAAAG5CAYAAADh3mJ8AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4wLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvpW3flQAAIABJREFUeJzs3Xd8VfX9x/HXJzuQhDBlT1FEZEbc\n2yruLaK2Fge1FWvrz1Y6XW3VVrvU1lrFWVfd1ipu3JWgCAgyRJAAQhhhk/n5/XFO4BIyLuNycpP3\n8/E4j3vG95z7OefeJJ98xznm7oiIiIhIckmJOgARERER2X5K4kRERESSkJI4ERERkSSkJE5EREQk\nCSmJExEREUlCSuJEREREkpCSOGn2zCzVzNaZWfddWVYkEcxsrJm9EuH7/5+ZFYc/B5lRxSEiSuIk\nCYV/PKqnKjPbGLN8wfYez90r3T3H3b/elWXjZWbjzey6mHPYZGaVMcuf7cSxTzazL+Ise5uZuZnt\nu6Pv1xyZ2VPh5zUgZt1gM1sXZVyJYGa5wC3AgeHPQWmN7QPMbNMOHDcn/O513EVxNvi9Dz+30hq/\nT07eyffdpech0hAlcZJ0wj8eOe6eA3wNnBKz7l81y5tZ2u6PMj5mZsDxwD0x5zQWeDfmnAbthjhS\ngQuAlcB3Ev1+Nd47JbwOyawEuDHqILbXDvxsdAEq3f3LRMQTgetif5+4+3+iDMYCqVHGIMlFSZw0\nOWb2GzN7wsweM7O1wIVmdpCZfWRmJWa2xMz+ambpYfm08L/nnuHyI+H2l81srZl9aGa9trdsuP0E\nM5ttZqvN7A4ze9/MvhsT7hBgqbsvieO8BprZW2a2ysxmmNkpMdvONLNZYQwLzewKM+sA/BvYK6am\noVUdhz8eaAH8JLxeKTHHNjO7Mub4U82sf7itt5m9aGbLwya234frbzOzu2OOsVUNjZkVmtn1ZvYx\nsAHYw8x+EPMec8xsq2TSzEaa2bRw+2wzO9LMRpvZxBrlrjOzR2q5fpea2ds11v3KzB6t6xrW83HU\n9E/gSDMrqG1jeH0OjFnefH2qr42ZjTGzxWHZi8zsUDP7PPzO/r7GIVPN7F4zWxOWOSTm2G3D7+U3\nZva1mf2yOkm2oCn2NTO728xWAdfUEmsLM/t7uH+Rmd1qZulmNgT4BMgMv0svbsf1wcwON7OPw5+F\nxWZ2u21JWN4JX7+0mBoxMzsr/MxLzGyimfWrcU2vCn8WVpvZQ2Gc2/O9ryvWHjHf6y/N7LIdPQ+r\n0fxtNWrrLKgR/LOZvQ6sB/YPP4M7wuu/xMz+YmYZYfnOZjYhvCYrzOzV7Tk3aWLcXZOmpJ2A+cCx\nNdb9BigDTiH4RyUb2B84AEgDegOzgbFh+TTAgZ7h8iPAcqAASAeeAB7ZgbIdgLXAaeG2q4Fy4Lsx\nsf4SuKlG/JcCb9dYlw98A5wHpAIHEtSa9QKMoCZoWFi2HTA4nD8Z+CKO6/gEMB5oCawDjovZNhqY\nBwwK36sfQY1MRngdf0OQALYADg73uQ24O+YYA4BNMcuFwFygb3ic1PA69Qzf43hgI9AvLH8UsAI4\nIvxMe4T75oTXuHvMsWcDx9dyjvnhMbvErJsRXqM6r2Ec1+4pYBzwc+CVcN1gYF1MmeUETZDUvD7h\ntakC/hheizPDz+ApoE14TVYDBWH5sUAFMCb8Xl0cHj8n3P4a8CeC731nYCpwQY19Lw6veXYt5/NH\n4G2gLdCJIHG7trbPsZZ969xO8J0t
CN+3b/idujTclkPwc9UxpvyhwGJgaLjPD4CZQGrMNX0HaE/w\ns/YVcGG83/vqz62W9Wnh9+L/wuvbDygCDt3B8xhb/b2orUwYx3KC31EpQCZwL8HPZCuC7+3rwC/C\n8ncAt4dxZgCH787fuZoa16SaOGmq3nP3F929yt03uvskd/+fu1e4+zzgHoKEoC5PuXuhu5cD/yL4\no7y9ZU8Gprj78+G2PxH8so51EvDfOM7nLOBTd3/cg355HwEvE/zBh+AP875mluPuy919ShzHBMDM\n8oFTgUfdfT3wPHBRTJFLgd+6+2ce+MLdFxFcv0zg1+6+IZw+iPd9CZqQ57h7WXhOz7v7/PA9JgDv\nA9U1TJcCf3P3ieFnuiDcdx3wHEFTMGFtVy7BH72tuHsJMAEYGZYdQvDHf0JYZIevYegvwBAzO2w7\n94MgibwhvBbPECQP97v7SnefD3zE1t/B+e5+j7uXu/t4YCnwLTPrQ/DPyk/C7/1igj/658XsO9vd\nx4fXfGMtsVxA8Jmu8KCG+LfAt3fgnLbi7h+FPyeV7j6H4J+G+n4Gvwf8xd0/Cff5G5DH1tfhj+5e\n7O7LCH4e6vs5rc11YY1WiZktCNcdAZi73x5e3y+ABwm/NztwHvF4MvwdVUWQ4F0EXOXuq8Pv7a1s\n+QzLCf6J6hZ+X96p/ZDSHCiJk6ZqYeyCmfUzs5fCJqI1BP2X2tWz/zcx8xsI/nve3rKdY+Nwdyf4\nj746prYEtYL/q+fY1XoAR8f8wSkhqLnqFB73NII/MgvN7A0zGxrHMauNBFYBb4bL/wLOsKATO0A3\noLY+UN2Ar8I/PDui5md0hplNMrOV4fkdzpbPqK4YIPgDe2E4fyFBMlpZR9lHgVHh/PnAv8M/1Dt7\nDQkT4FsIaia3V6m7r45Z3kiQmMUux34Ht7p2BH1DOxN8T1oCy2O+J7cDe9Sz72YW9JHrACyIWb2A\nIGnYKWGz8StmtjT8Gfw59f8M9gB+XeM737pGLNvzc1qbG9w9P5x6xLzvnjXe94dAdfPn9p5HPGI/\nk64EtWxfxLz/UwSfC8BNQDEw0YJuBT/ayfeWJKYkTpoqr7H8D2A6sKe75wG/Jqj9SKQlBL+Qgc2D\nGGL/AI0AXoszCVoI/DfmD06+Bx2xrwFw9/fd/SSCP9ZvEiRisO11qM1FBE1ni83sG+ABgqa4c2Le\nu08dMfWq7m9Vw3qC5tVqtY3W2xybmeURNB/9Gujg7vkETWXVx64rBgjON8/MhhMkYQ/XUQ7gRWBv\nM9srLPvo5mDqvobb4+9hnMfVWB/P9dgeXWssdydoelxI0PTaOuZ7kufuw2PK1vmdcPcKYBlBIhN7\n7EU7GS/AfcDHQO/wZ/B3bPl8a4tpIfDzGt/5Fu7+QhzvFc/3vi4Lgek13jfX3at/Hrb3PLbrZ4Hg\nc6wk6LJR/f6t3H0PAHdf5e5Xunt3gu/w9WZ2wI6erCQ3JXHSXOQS/HFbb2b7EDTVJNp/gKFmdkpY\nw3EVQf+davE2pQI8DQw3s7MtGFyRYcFgjT3NLNfMzg1rzsoJ+lNV10QtJRg00LK2g5pZX+Ag4BiC\npqjBwEDgTrY0qd4L/NyCgRUW1mp2ASYCpcANZpYddsY+ONxnCnCMmXUyszbATxs4v2yC2odioMrM\nzgBimyXvBb5vQWd/M7PuYeyESfAjYZlv6msGDZsPnwfuIvjD+V54Heq7hnFz900ENSXX1tg0BRgV\nfnYHE/TX3Bm9LBiokWZmFxEknq+HzXufAL8LO9CnmNleFjPwIQ6PESQGbcLO9z8nuL5xM7OsGpMR\n/AyWuPt6M9uPoIkc2FyLuY6gZrraP4AfmdnQ8DPPNbPTzSwrjhDq/d43YCLB4I2xZpYZXuNBZlbd\nVLu95zEFKAh/bloQ/KNSp/A79CDwFwsGqVR/348FMLPTzKz6n6fVBP0pt/u7Kk2DkjhpLv6PIClZ\n
S/DH4YlEv6G7LyX4T/mPBJ3y+wCfAqUWjP48hi39sRo61kqCzv6XETQhLSZoEk4Pi4whqEEoIeg7\nMzpcPxl4Bfg6bJqpOUrvIuAdd3/P3b+pnoA/A4dYMNL2AYJ+VU8DawiuXZ67lwEnEHTyXkQwyOTU\n8LjPh+/7BfAB8Gwc1+pagn5NKwgS3Fditr8FXEnw2a0h6LzfOeYQDwL7AQ/V9z6hR4FjgcfDZtRq\ntV5DM9vHgpGGbeM4NgR9pEpqrBsHDAvX/wR4PM5j1eUttgxuuRY4093XhttGEiR1s8Ltj7H1Pw8N\n+QUwh2AQweTwvW7fjv0zCZp/Y6cDgB8RJOLrCPoP1rwGvwaeDb+nJ4V9vX5MkJyXhOczkvhq2Rr6\n3tcp5nt9JEEz9TKCpL+6qXZ7z+NTgv6wHxAMmHgjjjCuJPiHZjJBovZftiSG+xIkmmsJPpvfuXth\nvOcnTYtt/TtMRBLFgtsQLAbOJvjP+TZ3P7j+vSQeFgzOWELQXL4rmv5ERBo91cSJJJCZjTCzVhY8\nnuhXBCMgPyZoArkh0uCaiLBZ6UqC/oVK4ESk2Wi0d7IXaSIOJeggnwF8DpzuwaOKPoo0qqalmKC5\nbWf7mYmIJBU1p4qIiIgkITWnioiIiCShZtGc2q5dO+/Zs2fUYYiIiIg0aPLkycvdvcFR5c0iievZ\nsyeFhRqBLSIiIo2fbXkMXL3UnCoiIiKShJTEiYiIiCQhJXEiIiIiSahZ9ImrTXl5OUVFRWzatCnq\nUHaLrKwsunbtSnp6esOFRUREpNFrtklcUVERubm59OzZk+CG702Xu7NixQqKioro1atX1OGIiIjI\nLtBsm1M3bdpE27Ztm3wCB2BmtG3bttnUOoqIiDQHzTaJA5pFAletOZ2riIhIc9CskzgRERGRZKUk\nLiIrVqxg8ODBDB48mI4dO9KlS5fNy2VlZXEdY/To0cyaNSvBkYqIiEhj1GwHNkStbdu2TJkyBYDr\nr7+enJwcrrnmmq3KuDvuTkpK7bn2/fffn/A4RUREpHFSTVwjM3fuXAYMGMDll1/O0KFDWbJkCWPG\njKGgoIB9992XG2+8cXPZQw89lClTplBRUUF+fj7jxo1j0KBBHHTQQSxbtizCsxAREZFEU00ccMOL\nnzNj8Zpdesz+nfO47pR9d2jfGTNmcP/993P33XcDcMstt9CmTRsqKio46qijOPvss+nfv/9W+6xe\nvZojjjiCW265hauvvprx48czbty4nT4PERERaZwSWhNnZuPNbJmZTa9ju5nZX81srplNNbOhMdsu\nMrM54XRRzPphZjYt3Oev1gSHXfbp04f9999/8/Jjjz3G0KFDGTp0KDNnzmTGjBnb7JOdnc0JJ5wA\nwLBhw5g/f/7uCldEREQikOiauAeAO4GH6th+AtA3nA4A/g4cYGZtgOuAAsCByWb2gruvCsuMAT4C\n/guMAF7emSB3tMYsUVq2bLl5fs6cOfzlL3/h448/Jj8/nwsvvLDW+71lZGRsnk9NTaWiomK3xCoi\nIiLRSGgS5+7vmFnPeoqcBjzk7g58ZGb5ZtYJOBJ4zd1XApjZa8AIM3sbyHP3D8P1DwGns5NJXGO2\nZs0acnNzycvLY8mSJUyYMIERI0ZEHZaIiDQywWC4oObD3XGgKlwXbAfHqfIt292BcH11GYJVWx13\n23XV5bbewevYXh3XVseLOaD7tmVj3zeY3fZYdV+LBgrshO5tWpCdkZq4N9gOUfeJ6wIsjFkuCtfV\nt76olvVN1tChQ+nfvz8DBgygd+/eHHLIIVGHJCKScFVVTnlVFRWVHkxVVVRUeTBVBvOVVU55ZVX4\nGizHbgteq6isgkp3qsL1m+fD1+ryVe5UVhG+btm/osqprNxSrtJjl2u+X/Ba13tVVr/HNuuCZKvK\nPZyCJKaqet1W27ckZ7HlZfd47opDGNwtP+owgOiTuNr6s/kOrN
/2wGZjCJpd6d69+47Gt1tcf/31\nm+f33HPPzbcegeBJCw8//HCt+7333nub50tKSjbPn3feeZx33nm7PlARaRIqKqsoq6yirCKYSiu2\nXq5rW3llVbhvkDyVh+s2L4dTWUWQdFXPx24rjylbUemU1TJfnZA1BmkpRmqKkZZipISvqSkpW9an\nbtlevT4lxUg1SE0xUszISEvZPL/lla3WpZphZqSE+1XPp4SvwXK4LsWwmG0pZhhBGTMwqrdveVpP\n9fpg3ZYywOb3qlY9G9vlfHPZ2ItTfextylit+1S/b+zK2Pey2H1qxBhbZptj1SFRPeZ7tm2RmAPv\ngKiTuCKgW8xyV2BxuP7IGuvfDtd3raX8Ntz9HuAegIKCgsbx20BEpB7uzvqyStZtqmDtpnLWhK9r\nN1WwrjSYX7epgk0VVZSWV1JWWUVpeZBoBVPllsQrXC6NTcbCdbs6P8pISyEjNYW0VCM9NZhPD+fT\na8xnZ6SQnhIup4XbUlJIT6u9fFqKkRauS00JylYnT2nh/OZtqSmbE6q01BRSqxOklCBp2jqJ2jK/\nOfEKk6mUFDbv2wTHzkkTEnUS9wIw1sweJxjYsNrdl5jZBOB3ZtY6LHcc8DN3X2lma83sQOB/wHeA\nOyKJXEQkVF5ZxfrSCtZuqmB9WcWW+dLKYL40WBc7vzYmQaueX1da0WCCZQaZaSlkpqWSmZZCRlrK\n5uXq+bzsdDJSU8hMr962ZXtGarBPzfnMBralh8vVyVVGTGKlREckGglN4szsMYIatXZmVkQw4jQd\nwN3vJhhdeiIwF9gAjA63rTSzm4BJ4aFurB7kAHyfYNRrNsGAhiY7qEFEEqeqyllbGtRwrdtUwbrS\nctaVBrVg1cnWujApWxuu21I2mKrLlVVUxfWeWekp5GSm0TIzjdysNHIy0+jWpgW5WWnkZaWTE67P\nzUoPtmelkRe7nJlGy4w0UlKUNIlI4kenjmpguwNX1LFtPDC+lvWFwIBdEqCIJLXKKqdkQxklG8tZ\nvbGcNTGvazZVsHpjOas3lLNmU7g+fF29oZy1pRVxjWDLTk/dKulqmZlK5/xscjJTyckKErKcjLQt\n85lpmxO1nMxgfU5GsF9aqh6SIyK7TtTNqSIiQJCQrd5Yzsr1ZazaUMbK9WWUbChj5fryGstlrNoQ\nlFuzqbzeRCwrPYW8rHRaZQdTh9ws+nbIJS8rjVbZ6eRlp5OXlR4kXFlbErCczbVeSrxEpPFSEici\nCeMeNFkWry2leG0pyza/btq8rnpauaGszoQsMy2FNi0zaN0igzYtM+icn73VcnWSlrf5NUjSMtMa\nx72cREQSQUlcRFasWMExxxwDwDfffENqairt27cH4OOPP97qCQz1GT9+PCeeeCIdO3ZMWKwitamo\nrGLp2lIWl2xkcclGFpVs5JvVm1i2ppTidVsStU3l2/YXy0hNoX1uJu1yM+nWpgVDe7SmXcsgIWsd\nk5wF8+lkp6eq87yISA1K4iLStm3bzfeDu/7668nJyeGaa67Z7uOMHz+eoUOHKomTXW5dacXm5GzR\nqo1bJWuLSzbxzZpN29zLKy8rjQ55WXTIzWRo99Z0yM2kfW4mHXKzaL95PpNW2elKykREdpKSuEbo\nwQcf5K677qKsrIyDDz6YO++8k6qqKkaPHs2UKVNwd8aMGcMee+zBlClTGDlyJNnZ2dtVgycCULKh\njHnL1zOveD3zitcxr3g9C1ZuYHHJRlZvLN+qbFqK0bFVFl3yszmgVxs652fTpXV28JqfRadW2bTM\n1K8UEZHdRb9xAV4eB99M27XH7LgfnHDLdu82ffp0nn32WT744APS0tIYM2YMjz/+OH369GH58uVM\nmxbEWVJSQn5+PnfccQd33nkngwcP3rXxS5NRVlHF1yvX82VxkKx9tTxI1uYtX8/K9WWby6WlGN3b\ntqBX25bs37M1nfOrE7Rgap
+bSapubSEi0mgoiWtkXn/9dSZNmkRBQQEAGzdupFu3bhx//PHMmjWL\nq666ihNPPJHjjjsu4kilsVm7qZzZS9cxZ+la5ixbF9SsLV/PwpUbtrqBbPvcTHq1a8nx++5B73Y5\n9G7fkt7tc+jaOpt0jcQUEUkaSuJgh2rMEsXdufjii7npppu22TZ16lRefvll/vrXv/L0009zzz33\nRBChRG19aQVzl61j1tK1zFm6dnPitnj1ps1lstJT6NUuhwFdWnHqoM5BotYuh17tW5KXlR5h9CIi\nsqsoiWtkjj32WM4++2yuuuoq2rVrx4oVK1i/fj3Z2dlkZWVxzjnn0KtXLy6//HIAcnNzWbt2bcRR\nSyJsKq9k7rJ1zA4TteB1LUWrNm4uk5GWwp7tcxjeqw17dcxlrw657LVHLl1bZ+uu/iIiTZySuEZm\nv/3247rrruPYY4+lqqqK9PR07r77blJTU7nkkktwd8yMW2+9FYDRo0dz6aWXamBDkluxrpQZS9bw\n+eJgmrF4NV8tX7+5GTQ91ejdLoch3VszsqAbfffIZe+OuXRv00L91EREminzeJ47k+QKCgq8sLBw\nq3UzZ85kn332iSiiaDTHc25s3J2iVRv5fPHqMFkLkrZv1mxpCu2Sn03/znns0ymPfh1z2WuPHHq0\nban+aiIizYSZTXb3gobKqSZOJEEqq5w5y9YyfVF1sraaGUvWsHZTBQCpKUaf9i05qE9b+nfKY9/O\nefTvnEd+C9WmiohIw5TEiewi60ormPJ1CYULVjJ5wSo+/bqEdaVBwpadnkq/TrmcNrgz/Tu1Yt/O\neezdMZesdD0WSkREdkyzTuKq+5c1B82h2Xx3W7J6I5Pmr2Ly/JUULljFzCVrqHIwg34d8zhjSBeG\n9WjNgC6t6NWupfquiYjILtVsk7isrCxWrFhB27Ztm3wi5+6sWLGCrKysqENJWpVVzhffrGHyglUU\nzl/F5AWrWFQSjBJtkZHK4G75jD26LwU9WjOkez65uo2HiIgkWLNN4rp27UpRURHFxcVRh7JbZGVl\n0bVr16jDSBpVVc6MJWt4b+5y3p+7fKum0Y55WQzr2ZpLD+tFQY827NMplzQNOhARkd2s2SZx6enp\n9OrVK+owpBFZuHID789dzrtzl/PB3OWs2hA8O3TvPXI5Y0gXCnq2ZliP1nTJz27ytbciItL4Ndsk\nTmT1hnI+nLecd+cEtW3zV2wAYI+8TI7utweH9W3HwXu2pUOumqFFRKTxURInzUZpRSWfLCjhvbnF\nvDd3BdOKSqhyyMlM48DebfjuwT05tG87+rTPUU2biIg0ekripEkr2VDGq58v5eXpS/ho3ko2lleS\nmmIM6ZbPlUf35bC+7RjULV830hURkaSjJE6anNUby3ltxlJemrqYd+csp6LK6d6mBSP378ahe7bj\ngN5tNHpURESSnpI4aRLWbirn9ZlLeWnqEt6ZvZyyyiq6ts7mksN6cfJ+nRnQJU9NpCIi0qQoiZOk\ntb60gje+WMZ/PlvM27OLKauoolOrLL5zUA9OHtSZQV1bKXETEZEmS0mcJJWNZZW8+cUyXpq2mDe/\nWMam8ir2yMvkggO6c/LATgzp1poUPRlBRESaASVx0ui5O5Pmr+KRjxbw2oylbCyvpF1OJucWdOPk\ngZ0p6KHETUREmh8lcdJobSyr5IXPFvHABwuYuWQNeVlpnDm0CycN7MQBvdrqWaQiItKsKYmTRmfh\nyg088tECnihcSMmGcvp1zOXmM/fj9MFdyM5IjTo8ERGRRkFJnDQK7s77c1fwwAfzeeOLpaSYcfy+\ne3DRQT0Z3quNBiiIiIjUoCROIrW+tIJnPiniwQ8XMHfZOtq0zOAHR/bhggN60Dk/O+rwREREGi0l\ncRKJecXreOjDBTw9uYi1pRUM7NqK288ZxEkDO5GVriZTERGRhiiJk93G3Xl7djEPvD+fibOL
SU81\nTtyvExcd3JMh3fLVZCoiIrIdlMRJwlVWOa9M/4a73prLjCVr6JCbyY+P3YtRB3SjQ25W1OGJiIgk\nJSVxkjDllVU8P2Uxf3t7LvOK19O7XUv+cPZAThvchYw0PXBeRERkZyiJk11uU3kl/55cxD8mfknR\nqo3065jLnecP4YQBnXRvNxERkV1ESZzsMutLK3j0f1/zz3fnsWxtKUO653PDqftydL8O6u8mIiKy\niymJk522emM5D34wn/vf/4pVG8o5uE9b/jxyMAf1aavkTUREJEGUxMkOW76ulPve+4qHP1zAutIK\njunXgSuO3pOh3VtHHZqIiEiTpyROttuS1Rv5x8R5PD7pa0orqjhxv05cceSe9O+cF3VoIiIizYaS\nOIlbRWUV978/nz++NpvyyipOH9KF7x/Zhz7tc6IOTUREpNlREidxmb5oNeOemcr0RWs4pl8Hrj91\nX7q1aRF1WCIiIs1WQm/WZWYjzGyWmc01s3G1bO9hZm+Y2VQze9vMusZsu9XMpofTyJj1D5jZV2Y2\nJZwGJ/IcmrsNZRX87r8zOe2u91m6ppS/XTCUey8qUAInIiISsYTVxJlZKnAX8C2gCJhkZi+4+4yY\nYrcBD7n7g2Z2NHAz8G0zOwkYCgwGMoGJZvayu68J9/uJuz+VqNglMHF2Mb94dhpFqzYyang3xo3Y\nh1Yt0qMOS0REREhsc+pwYK67zwMws8eB04DYJK4/8ONw/i3guZj1E929Aqgws8+AEcCTCYxXQivW\nlXLTf2bw3JTF9G7fkifGHMgBvdtGHZaIiIjESGRzahdgYcxyUbgu1mfAWeH8GUCumbUN159gZi3M\nrB1wFNAtZr/fhk2wfzKzzNre3MzGmFmhmRUWFxfvivNp8tydpyYXccwfJ/LStCX88Ji+/PeHhymB\nExERaYQSWRNX211evcbyNcCdZvZd4B1gEVDh7q+a2f7AB0Ax8CFQEe7zM+AbIAO4B7gWuHGbN3K/\nJ9xOQUFBzfeVGhasWM/Pn53G+3NXMKxHa24+cz/22iM36rBERESkDolM4orYuvasK7A4toC7LwbO\nBDCzHOAsd18dbvst8Ntw26PAnHD9knD3UjO7nyARlB1UXlnFve9+xZ9fn016ago3nT6AC4Z3J0XP\nOBUREWnUEpnETQL6mlkvghq284DzYwuETaUr3b2KoIZtfLg+Fch39xVmNhAYCLwabuvk7ksseJ7T\n6cD0BJ5Dk/bZwhLGPTONmUvWcFz/PbjxtAF0bJUVdVgiIiISh4Qlce5eYWZjgQlAKjDe3T83sxuB\nQnd/ATgSuNnMnKA59Ypw93Tg3fC5m2uAC8NBDgD/MrP2BM21U4DLE3UOTVVFZRV/mDCLf747j3Y5\nmdx94TBGDOgYdVgiIiKyHcy96XcXKygo8MLCwqjDaBRWrCtl7KOf8uG8FYwa3o2fnbgPeVm6bYiI\niEhjYWaT3b2goXJ6YkMzMn3Rar738GSK15Vy2zmDOHtY14Z3EhERkUZJSVwz8dyni7j26am0aZnB\nU5cfxMCu+VGHJCIiIjtBSVyz+285AAAgAElEQVQTV1FZxc0vf8F9733F8F5tuOv8obTPrfXWeiIi\nIpJElMQ1YSvXlzH20U/44MsVXHRQD355cn/SUxP6uFwRERHZTZTENVGfL17NmIeC/m9/OHsg5xR0\na3gnERERSRpK4pqg56cE/d/yszP49/cOYlA39X8TERFpapTENSEVlVXc+soX/PPdrxjesw13XaD+\nbyIiIk2VkrgmYtX6MsY+9gnvz13Bdw7qwS9P6k9Gmvq/iYiINFVK4pqAGYvXMObhQpatKeX3Zw3k\n3P3V/01ERKSpUxKX5F74bDE/feozWmWn88T3DmRI99ZRhyQiIiK7gZK4JOXu/GHCLP729pcU9GjN\n3y4cSodcPbxeRESkuVASl4TKK6u49umpPPPJIkYN78YN
pw5Q/zcREZFmRklcktlQVsEV//qEt2YV\nc/W39uLKo/fEzKIOS0RERHYzJXFJZNX6MkY/MImpRSX87oz9OP+A7lGHJCIiIhFREpckFpVs5Dv3\n/Y+FqzbytwuGMWJAx6hDEhERkQgpiUsCs75Zy3fG/48NZZU8fPFwDujdNuqQREREJGJK4hq5SfNX\ncskDk8hKT+Xflx9Ev455UYckIiIijYCSuEbstRlLGfvoJ3TJz+bBi4fTrU2LqEMSERGRRkJJXCP1\n+Mdf8/Nnp7Ff13zu/+7+tGmZEXVIIiIi0ogoiWtk3J0735zL7a/N5vC92vP3C4bSMlMfk4iIiGxN\n2UEjUlnl3PDi5zz04QLOGNKF3589kPRU3cRXREREtqUkrpEorajk6ic+46VpSxhzeG/GjehHSopu\n4isiIiK1UxLXCKzdVM73Hp7MB1+u4Bcn7sNlh/eOOiQRERFp5JTERWzZ2k18d/wkZi9dy59GDuKM\nIV2jDklERESSgJK4CJVXVnHZg4V8tXw9915UwJF7d4g6JBEREUkSSuIidMebc/msaDV3nT9UCZyI\niIhsFw19jMgnX6/irrfmcuaQLpw0sFPU4YiIiEiSURIXgfWlFVz9xBQ65mVx/Wn7Rh2OiIiIJCE1\np0bgNy/NZMHKDTx22YHkZaVHHY6IiIgkIdXE7WZvzFzKYx9/zZjDenNg77ZRhyMiIiJJSkncbrR8\nXSnXPj2Vfh1zufq4vaIOR0RERJKYmlN3E3dn3NPTWLOxgkcuPYDMtNSoQxIREZEkppq43eTJwoW8\nPnMpPx2xN/065kUdjoiIiCQ5JXG7wYIV67nhxRkc1LstFx/SK+pwREREpAlQEpdgFZVV/PiJKaSm\nGLefO0gPtRcREZFdQn3iEuzuiV/yydcl/OW8wXTOz446HBEREWkiVBOXQFOLSvjz63M4ZVBnThvc\nJepwREREpAlREpcgG8sq+dETU2iXk8lvThsQdTgiIiLSxKg5NUFueXkm84rX869LD6BVCz2VQURE\nRHYt1cQlwMTZxTz44QIuPqQXh+zZLupwREREpAlKaBJnZiPMbJaZzTWzcbVs72Fmb5jZVDN728y6\nxmy71cymh9PImPW9zOx/ZjbHzJ4ws4xEnsP2WrW+jJ/8+zP6dsjhpyP2jjocERERaaISlsSZWSpw\nF3AC0B8YZWb9axS7DXjI3QcCNwI3h/ueBAwFBgMHAD8xs+o75N4K/Mnd+wKrgEsSdQ7by935+bPT\nWLWhjD+fN5isdD2VQURERBIjkTVxw4G57j7P3cuAx4HTapTpD7wRzr8Vs70/MNHdK9x9PfAZMMLM\nDDgaeCos9yBwegLPYbs888kiXp7+DVd/a2/27dwq6nBERESkCUtkEtcFWBizXBSui/UZcFY4fwaQ\na2Ztw/UnmFkLM2sHHAV0A9oCJe5eUc8xATCzMWZWaGaFxcXFu+SE6rNw5Qaue+Fzhvdsw5jDeyf8\n/URERKR5S2QSV9ujCbzG8jXAEWb2KXAEsAiocPdXgf8CHwCPAR8CFXEeM1jpfo+7F7h7Qfv27Xfw\nFOJTWeX835OfAXD7uYNI1VMZREREJMESmcQVEdSeVesKLI4t4O6L3f1Mdx8C/CJctzp8/a27D3b3\nbxEkb3OA5UC+maXVdcwo/PPdeXw8fyXXn7ov3dq0iDocERERaQYSmcRNAvqGo0kzgPOAF2ILmFk7\nM6uO4WfA+HB9atisipkNBAYCr7q7E/SdOzvc5yLg+QSeQ1x6tWvJqOHdOGuonsogIiIiu0fCbvbr\n7hVmNhaYAKQC4939czO7ESh09xeAI4GbzcyBd4Arwt3TgXeDcQysAS6M6Qd3LfC4mf0G+BS4L1Hn\nEK/j9+3I8ft2jDoMERERaUYsqNxq2goKCrywsDDqMEREREQaZGaT3b2goXJ6YoOIiIhIElISJyIi\nIpKElMSJiIiIJCEl
cSIiIiJJSEmciIiISBJSEiciIiKShJTEiYiIiCQhJXEiIiIiSUhJnIiIiEgS\nUhInIiIikoSUxImIiIgkISVxIiIiIklISZyIiIhIElISJyIiIpKElMSJiIiIJCElcSIiIiJJSEmc\niIiISBJSEiciIiKShJTEiYiIiCQhJXEiIiIiSUhJnIiIiEgSUhInIiIikoQaTOLM7GkzO8nMlPCJ\niIiINBLxJGZ/B84H5pjZLWbWL8ExiYiIiEgDGkzi3P11d78AGArMB14zsw/MbLSZpSc6QBERERHZ\nVlxNpGbWFvgucCnwKfAXgqTutYRFJiIiIiJ1SmuogJk9A/QDHgZOcfcl4aYnzKwwkcGJiIiISO0a\nTOKAO939zdo2uHvBLo5HREREROIQT3PqPmaWX71gZq3N7AcJjElEREREGhBPEneZu5dUL7j7KuCy\nxIUkIiIiIg2JJ4lLMTOrXjCzVCAjcSGJiIiISEPi6RM3AXjSzO4GHLgceCWhUYmIiIhIveJJ4q4F\nvgd8HzDgVeDeRAYlIiIiIvVrMIlz9yqCpzb8PfHhiIiIiEg84rlPXF/gZqA/kFW93t17JzAuERER\nEalHPAMb7ieohasAjgIeIrjxr4iIiIhEJJ4kLtvd3wDM3Re4+/XA0YkNS0RERETqE8/Ahk1mlgLM\nMbOxwCKgQ2LDEhEREZH6xFMT9yOgBfBDYBhwIXBRIoMSERERkfrVm8SFN/Y9193XuXuRu49297Pc\n/aN4Dm5mI8xslpnNNbNxtWzvYWZvmNlUM3vbzLrGbPu9mX1uZjPN7K/VNxwOy80ysynhpFpBERER\naXbqTeLcvRIYFvvEhniFCeBdwAkEI1tHmVn/GsVuAx5y94HAjQSjYDGzg4FDgIHAAGB/4IiY/S5w\n98HhtGx7YxMRERFJdvH0ifsUeN7M/g2sr17p7s80sN9wYK67zwMws8eB04AZMWX6Az8O598Cnqs+\nPMHtTDIIbjCcDiyNI1YRERGRZiGePnFtgBUEI1JPCaeT49ivC7AwZrkoXBfrM+CscP4MINfM2rr7\nhwRJ3ZJwmuDuM2P2uz9sSv3VjtQSioiIiCS7eJ7YMHoHj11bcuU1lq8B7jSz7wLvEIx8rTCzPYF9\ngOo+cq+Z2eHu/g5BU+oiM8sFnga+TXDvuq3f3GwMMAage/fuO3gKIiIiIo1TPE9suJ9tky/c/eIG\ndi0CusUsdwUW1zjGYuDM8H1ygLPcfXWYgH3k7uvCbS8DBwLvuPuicN+1ZvYoQbPtNkmcu98D3ANQ\nUFCwTfwiIiIiySye5tT/AC+F0xtAHrAujv0mAX3NrJeZZQDnAS/EFjCzduE96AB+BowP578GjjCz\nNDNLJxjUMDNcbhfum07QrDs9jlhEREREmpR4mlOfjl02s8eA1+PYryK8OfAEIBUY7+6fm9mNQKG7\nvwAcCdxsZk7QnHpFuPtTBH3wphHUAr7i7i+aWUtgQpjApYZx/DOuMxURERFpQsx9+1oazWxv4CV3\n3zMxIe16BQUFXlhYGHUYIiIiIg0ys8nuXtBQuXj6xK1l6z5x3wDX7kRsIiIiIrKT4mlOzd0dgYiI\niIhI/Boc2GBmZ5hZq5jlfDM7PbFhiYiIiEh94hmdep27r65ecPcS4LrEhSQiIiIiDYkniautTDyP\n6xIRERGRBIkniSs0sz+aWR8z621mfwImJzowEREREalbPEnclUAZ8ATwJLCRLfdzExEREZEIxDM6\ndT0wbjfEIiIiIiJximd06mtmlh+z3NrMJiQ2LBERERGpTzzNqe3CEakAuPsqoEPiQhIRERGRhsST\nxFWZWffqBTPrwdZPcBARERGR3SyeW4X8AnjPzCaGy4cD30tcSCIiIiLSkHgGNrxiZkOBAwEDfuzu\nyxMemYiIiIjUKZ7mVNx9ubv/B5gBXG5m0xMbloiIiIjUJ57RqZ3M7Edm9jHwOZAKjE
p4ZCIiIiJS\npzqTODO7zMzeBCYC7YBLgSXufoO7T9tdAYqIiIjIturrE3cX8CFwvrsXApiZRqWKiIiINAL1JXGd\ngXOAP5rZHgSP3ErfLVGJiIiISL3qbE4NBzP83d0PB44BVgPLzGymmf1ut0UoIiIiItuId3Rqkbvf\n5u7DgNOB0sSGJSIiIiL1iedmv1tx91nADQmIRURERETiFFdNnIiIiIg0LkriRERERJJQXM2pZtYF\n6BFb3t3fSVRQIiIiIlK/BpM4M7sVGEnwyK3KcLUDSuJEREREIhJPTdzpwN7urhGpIiIiIo1EPH3i\n5qGb/IqIiIg0KvHUxG0AppjZG8TcH87df5iwqERERESkXvEkcS+Ek4iIiIg0Eg0mce7+oJllAHuF\nq2a5e3liwxIRERGR+sQzOvVI4EFgPmBANzO7SLcYEREREYlOPM2ptwPHhY/bwsz2Ah4DhiUyMBER\nERGpWzyjU9OrEzgAd5+NRquKiIiIRCqemrhCM7sPeDhcvgCYnLiQRERERKQh8SRx3weuAH5I0Cfu\nHeBviQxKREREROoXz+jUUuCP4SQiIiIijUCdSZyZPenu55rZNIJnpW7F3QcmNDIRERERqVN9NXFX\nha8n745ARERERCR+dY5Odfcl4ewP3H1B7AT8YPeEJyIiIiK1iecWI9+qZd0JuzoQEREREYlfnUmc\nmX0/7A/Xz8ymxkxfAdPiObiZjTCzWWY218zG1bK9h5m9ER73bTPrGrPt92b2uZnNNLO/mpmF64eZ\n2bTwmJvXi4iIiDQn9dXEPQqcAjwfvlZPw9z9goYObGapwF0EtXb9gVFm1r9GsduAh8JBEjcCN4f7\nHgwcAgwEBgD7A0eE+/wdGAP0DacRDZ6liIiISBNTX5+41e4+H/gLsDKmP1y5mR0Qx7GHA3PdfZ67\nlwGPA6fVKNMfeCOcfytmuwNZQAaQSfCEiKVm1gnIc/cP3d2Bh4DT44hFREREpEmJp0/c34F1Mcvr\nw3UN6QIsjFkuCtfF+gw4K5w/A8g1s7bu/iFBUrcknCa4+8xw/6IGjgmAmY0xs0IzKywuLo4jXBER\nEZHkEU8SZ2GtFwDuXkV8T3qora9azfvNXQMcYWafEjSXLgIqzGxPYB+gK0GSdrSZHR7nMavjvMfd\nC9y9oH379nGEKyIiIpI84kni5pnZD80sPZyuAubFsV8R0C1muSuwOLaAuy929zPdfQjwi3DdaoJa\nuY/cfZ27rwNeBg4Mj9m1vmOKiIiINAfxJHGXAwcT1JIVAQcQDCxoyCSgr5n1MrMM4DzghdgCZtbO\nzKpj+BkwPpz/mqCGLs3M0glq6WaG965ba2YHhqNSv0Mw8EJERESkWYnn2anLCBKw7eLuFWY2FpgA\npALj3f1zM7sRKHT3F4AjgZvNzIF3gCvC3Z8Cjia4lYkDr7j7i+G27wMPANkENXQvb29sIiIiIsnO\nYrq7bb3B7Kfu/nszu4Pan536w0QHt6sUFBR4YWFh1GGIiIiINMjMJrt7QUPl6quJmxm+KvsRERER\naWTqTOKqmy/d/cHdF46IiIiIxKPOJM7MXqSO23cAuPupCYlIRERERBpUX3PqbeHrmUBH4JFweRQw\nP4ExiYiIiEgD6mtOnQhgZje5++Exm140s3cSHpmIiIiI1Cme+8S1N7Pe1Qtm1gvQIxBEREREIhTP\n47N+DLxtZtVPaegJfC9hEYmIiIhIg+K52e8rZtYX6Beu+sLdSxMbloiIiIjUp8HmVDNrAfwEGOvu\nnwHdzezkhEeWbErXRR2BiIiINCPx9Im7HygDDgqXi4DfJCyiZPTmb+G+b8GmNVFHIiIiIs1EPElc\nH3f/PVAO4O4bAUtoVMmmx8FQPAuevhSqKqOORkRERJqBeJK4MjPLJrzxr5n1AdQnLlafo+DEP8Cc\nCfDar6OORkRERJqBeEanXge8AnQzs38BhwDfTW
RQSWn/S4LauA/vhHZ7wbCLoo5IREREmrB6kzgz\nM+ALgqc2HEjQjHqVuy/fDbEln+N/ByvmwktXQ5ve0OuwqCMSERGRJqre5lR3d+A5d1/h7i+5+3+U\nwNUjNQ3OuR/a9IEnvw0rvow6IhEREWmi4ukT95GZ7Z/wSJqKrFZw/uOAwaMjYWNJ1BGJiIhIExRP\nEncUQSL3pZlNNbNpZjY10YEltTa9YeQjsGo+/Pu7UFkRdUQiIiLSxMQzsOGEhEfRFPU8BE75Mzx/\nBbwyDk66LeqIREREpAmpM4kzsyzgcmBPYBpwn7urSml7DLkwGLH6wV+h/d4w/LKoIxIREZEmor7m\n1AeBAoIE7gTg9t0SUVNz7PWw94nw8rUw942ooxEREZEmor4krr+7X+ju/wDOBnS/jB2Rkgpn3gMd\n9oF/j4bi2VFHJCIiIk1AfUlcefWMmlF3UmYujHoM0jLg0XNhw8qoIxIREZEkV18SN8jM1oTTWmBg\n9byZ6Unv2yu/O5z3KKxZDE98GyrKoo5IREREklidSZy7p7p7XjjluntazHze7gyyyeg2HE67Exa8\nFzzVwT3qiERERCRJxXOLEdmVBp4Ly2fDO3+A9v3g4LFRRyQiIiJJSElcFI78eZDIvfpLaLsn7D0i\n6ohEREQkycTzxAbZ1VJS4PS7odMgePoSWPp51BGJiIhIklESF5WMFsGI1cxcePgM+Pp/UUckIiIi\nSURJXJTyOsO3n4P0FvDASTDpPg12EBERkbgoiYtah34w5i3ofWQwYvWFsVC+KeqoREREpJFTEtcY\nZLeG85+Aw66BTx+B+0+A1YuijkpEREQaMSVxjUVKKhzzKxj5SDBy9Z4jYP77UUclIiIijZSSuMZm\nn1PgsjchKx8eOhU+ulv95ERERGQbSuIao/Z7w2VvQN/j4JVr4dnLoXxj1FGJiIhII6IkrrHKagUj\n/xXcGHjq43DfcbBqQdRRiYiISCOhJK4xS0mBI6+FUU/Aqvlwz5Ew7+2IgxIREZHGQElcMth7BFz2\nFuR0CG4M/MEd6icnIiLSzCmJSxbt9oRLX4d+JwfPXH36EihbH3VUIiIiEhElcckkMxfOfQiOuQ6m\nPwP3fgtWzos6KhEREYmAkrhkYwaHXQ0XPAVrFsH9J8LqoqijEhERkd0soUmcmY0ws1lmNtfMxtWy\nvYeZvWFmU83sbTPrGq4/ysymxEybzOz0cNsDZvZVzLbBiTyHRqvvsTD6v0GT6iNnw8aSqCMSERGR\n3ShhSZyZpQJ3AScA/YFRZta/RrHbgIfcfSBwI3AzgLu/5e6D3X0wcDSwAXg1Zr+fVG939ymJOodG\nb499gyc8rJgLT1wIFaVRRyQiIiK7SSJr4oYDc919nruXAY8Dp9Uo0x94I5x/q5btAGcDL7v7hoRF\nmsx6HwGn3QXz34Xnx0JVVdQRiYiIyG6QyCSuC7AwZrkoXBfrM+CscP4MINfM2tYocx7wWI11vw2b\nYP9kZpm1vbmZjTGzQjMrLC4u3rEzSBaDRsIxv4ZpT8KbN0YdjYiIiOwGiUzirJZ1NW9udg1whJl9\nChwBLAIqNh/ArBOwHzAhZp+fAf2A/YE2wLW1vbm73+PuBe5e0L59+x0+iaRx6NUwbDS89yeYdF/U\n0YiIiEiCpSXw2EVAt5jlrsDi2ALuvhg4E8DMcoCz3H11TJFzgWfdvTxmnyXhbKmZ3U+QCIoZnHgb\nrF0C/70G8jrD3idEHZWIiIgkSCJr4iYBfc2sl5llEDSLvhBbwMzamVl1DD8Dxtc4xihqNKWGtXOY\nmQGnA9MTEHtySk2Ds8dDp0Hw79FQNDnqiERERCRBEpbEuXsFMJagKXQm8KS7f25mN5rZqWGxI4FZ\nZjYb2AP4bfX+ZtaToCZvYo1D/8vMpgHTgHbAbxJ1DkkpoyWc/2TwiK5Hz9XNgEVERJoo82bwDM6C\nggIvLCyMOo
zda/kcuO9bkN0GLnkNWtYcLyIiIiKNkZlNdveChsrpiQ1NVbu+MOqJ4KkOj50H5Ruj\njkhERER2ISVxTVn3A+DMf0LRJHj6UqiqjDoiERER2UWUxDV1/U+FETfDF/+BCT+HZtB8LiIi0hwk\n8hYj0lgc+H0oWQgf3QWtusHBY6OOSERERHaSkrjm4rjfBP3jXv1FcA+5AWdGHZGIiIjsBCVxzUVK\nCpzxD1i3FJ79HuR2hB4HRx2ViIiI7CD1iWtO0rPgvEchvwc8NgqKZ0UdkYiIiOwgJXHNTYs2cOFT\nkJoB958ICz+OOiIRERHZAUrimqPWPWH0fyEzFx44GaY/HXVEIiIisp2UxDVX7frCpW9A5yHw1MUw\n8Q+6/YiIiEgSURLXnLVsCxe9APudC2/9Bp77PlSURh2ViIiIxEGjU5u7tEw48x5ouye8/Tso+RpG\nPhL0nRMREZFGSzVxAmZw5LVw5r3BI7ruPQaWz406KhEREamHkjjZYuA5cNGLsGk13HcszH8v6ohE\nRESkDkriZGvdD4RLX4eW7eGh02HKo1FHJCIiIrVQEifbatMbLnkVehwUDHZ44yaoqoo6KhEREYmh\nJE5ql90aLnwGhnwb3r0Nnr4EyjdGHZWIiIiENDpV6paaDqfeEYxcff06WL0QznsMctpHHZmIiEiz\np5o4qZ8ZHPojOPdh+GY63Hs0LJsZdVQiIiLNnpI4iU//U2H0S8HNgO87Dr56N+qIREREmjUlcRK/\nLsOCR3XldYbHz4fiWVFHJCIi0mwpiZPtk98NLngK0rLg0XNh/YqoIxIREWmWlMTJ9svvBuc9CmuW\nwJPfhoqyqCMSERFpdpTEyY7ptj+cdhcseB9e+jG4Rx2RiIhIs6JbjMiOG3gOLJ8N7/we2veDg6+M\nOiIREZFmQ0mc7JwjfwbLZ8GrvwruJ7f3CVFHJCIi0iyoOVV2TkoKnH43dBoET18a3EtOREREEk5J\nnOy8jBYw6jHIzIXHzoN1y6KOSEREpMlTEie7Rl7nIJFbvxwevwDKN0UdkYiISJOmJE52nc5D4Iy7\noehjeOFKjVgVERFJICVxsmvtezoc/UuY9iS8e3vU0YiIiDRZGp0qu95h10DxbHjzJmjXF/qfFnVE\nIiIiTY5q4mTXM4NT74Cu+8Mz34PFU6KOSEREpMlREieJkZ4VPJqrZbtgxOqaJVFHJCIi0qQoiZPE\nyekAox6HTWvg8VFQtiHqiERERJoMJXGSWB0HwNn3BU2qz30fqqqijkhERKRJUBInibf3CfCtG2HG\nczDxlqijERERaRI0OlV2j4OvhOJZMPFWSG8BB42FVH39REREdpRq4mT3MIOT/wT9TobXr4N7j4bF\nn0YdlYiISNJKaBJnZiPMbJaZzTWzcbVs72Fmb5jZVDN728y6huuPMrMpMdMmMzs93NbLzP5nZnPM\n7Akzy0jkOcgulJYBIx+Bcx6EtUvhn0fDy+OgdG3UkYmIiCSdhCVx/9/evcdZWdV7HP/8mOEm97vI\nIKig3OQ6CmgnPJiGmR5UKm+lR8rTSdPq6Cmq00mLF1kkWqlHUlNfmVbkLXspIKBpKjDERRBBMI0B\nBFQuCigM8zt/rDXMZmYzs4HZ8+w9832/Xs9r72c9az/P2r/XzPBjrWc9y8wKgDuAc4ABwCVmNqBK\ntanAg+4+GLgZmALg7vPcfai7DwXGAruAWfEztwDT3L0vsBWYmK3vIFlgFlZ1uHYBFF8F8/8P7hgJ\nK59KumUiIiJ5JZs9cacCa9z9TXffAzwCVH10/wBgTnw/L81xgAnA0+6+y8yMkNTNiMceAMbXecsl\n+1q0g3N/Dl9+Flp2gN9fBg9fCttLk26ZiIhIXshmEtcDWJeyXxrLUi0FLorvLwDamFmnKnUuBh6O\n7zsB29y9rIZzSj4pKoarnwuzV9fODb1yL98J+8pq+6SIiEijls0kztKUeZX9
G4AxZrYYGAOsB/b/\n621m3YGTgZmHcM6Kz15tZiVmVrJly5ZDbbvUp4KmcPr1cM186HUazJykiQ8iIiK1yGYSVwr0TNkv\nAjakVnD3De5+obsPA74Xy7anVPk88Ji774377wLtzazi2RTVzply7unuXuzuxV26dDnybyPZ16EX\nXPoH+Nz98ME7mvggIiJSg2wmcQuBvnE2aTPCsOiTqRXMrLOZVbRhEnBflXNcQuVQKu7uhHvnJsSi\nK4AnstB2SYoZDLwArl2oiQ8iIiI1yFoSF+9bu5YwFLoS+IO7rzCzm83s/FjtDGCVma0GugGTKz5v\nZr0JPXnPVzn1t4Fvmdkawj1y92brO0iCKiY+TJx94MSH9/+RdMtERERygoXOrYatuLjYS0pKkm6G\nHK59e+GVO2HeFCjfC0Mvg0/eCO171v5ZERGRPGNmi9y9uLZ6WrFBcl/FxIfrFsOIf4elD8Mvh8Nf\nboAdG5NunYiISCKUxEn+aNsdzp0KX/87DL0UFv0GfjEUnvkufKgZyCIi0rgoiZP8074nnHc7XFsC\ngy6C+XfB7YNh9v/CrveTbp2IiEi9UBIn+avjcTD+TrhmIfQ7F/52O9x2MsydDLu3Jd06ERGRrFIS\nJ/mvcx+46B742svQ50z460/htsHw/E/hox1Jt05ERCQrlMRJw9G1P3z+QfiPF6D36TBvchhmfXEa\n7NmZdOtERETqlJI4aXi6D4ZLHoavzIUexfDsD+H2IfDCz2H31qRbJyIiUieUxEnD1WMEXD4DrpoF\nR58Mc26GWweGpby2vll321sAABD2SURBVJ1060RERI6Ikjhp+I4dCV98DL76IvQ/Dxb+Gn4xDGZc\nBRsWJ906ERGRw6IkThqPo0+GC++G65fB6K/B6lkw/Qy4/7PhfSNYvURERBoOJXHS+LTrAWf/GL61\nAs76Eby3Fn73ObhzNCz+LZR9nHQLRUREaqUkThqvFu3g9Ovg+qVwwXRoUghPXBMeT/LCrXrWnIiI\n5DQlcSKFzWDIF+CrL4R757oNgDk3wbSB8Mwk2PbPpFsoIiJSTWHSDRDJGWZwwtiwvfMqvPQrWDAd\n5t8NAy+A074OxwxNupUiIiKAeuJE0qs2CWImTB8DD5wHb8zWJAgREUmckjiRmlSdBPHuGnhoQpwE\n8ZAmQYiISGKUxIlkotokiAJ44muaBCEiIolREidyKPZPgngxTILo2l+TIEREJBGa2CByOGqcBDEe\nTrtOkyBERCSr1BMncqT2T4JYmrISxJjKlSDKy5NuoYiINEBK4kTqSrui9CtB3KWVIEREpO4piROp\nawdMgrg7zUoQW5NuoYiINABK4kSypbAZDLk4TIK4/NHKSRC3ahKEiIgcOU1sEMk2M+hzZtg2LoOX\ntRKEiIgcOfXEidSn7oPhwukpkyC0EoSIiBweJXEiSag6CUIrQYiIyCFSEieSpGqTIFJWgnj+Z1C6\nCMr2JN1KERHJQeaNYPimuLjYS0pKkm6GSO3cYe1ceOmX8Oa8UFbYAnqMgJ6nQs+RUHQqtOqUbDtF\nRCRrzGyRuxfXVk8TG0RySeokiB0bYd18WLcA1r0SErvyaaFepz7Qc1RlYtf5RGiijnURkcZESZxI\nrmrbPSzhNXB82N+7GzYsrkzsVj8NS34bjrVoF3rojh0ZkroeI6BZq+TaLiIiWackTiRfNG0JvU4L\nG4Sh1/fWxqQuJnZzZ4djBc2gz1kw6EI4cRw0b51cu0VEJCuUxInkKzPo3Cdswy4LZbu3QmlJuK9u\nxWOw6i9Q2BJOPBsGXQR9zw7JoIiI5D1NbBBpqMrLw710y/8Erz0BO7dAs9Zw0jkw8MJw311h86Rb\nKSIiVWQ6sUFJnEhjsK8M3n4xJHQr/xx67Jq3g/6fDQnd8WOgoGnSrRQREZTEHUBJnEiKfXvhzedg\n+aPw+lPw8Q5o2REGnB8Sut6fCM+rExGR
RCiJS6EkTuQg9n4Ea+eEhG7V07B3Z0joeo6sfHzJMcOg\n2VFJt1REpNHQc+JEpHZNW0C/c8O2Zxe8MRPeeDbMdl39dKjTpBC6DzkwsWt7TLLtFhER9cSJyEHs\nfA9KF1Y+wmT9Iij7KBxr1zMmdTGx6zYICvR/QhGRuqCeOBE5Mq06wUnjwgZhDddNr8YVJObD2y/B\n8hnhWNNW0GN4eMhwt4HQtT906ht6+kREJCvUEycih8cdtpceuDTYphVQXhaOWxPoeEJI6PZvA6Dj\n8ZoJKyJSg5zoiTOzccDtQAFwj7v/pMrxXsB9QBfgfeBydy+Nx44F7gF6Ag58xt3fMrP7gTHA9nia\nK919STa/h4ikYQbte4bt5AmhrGwPvL8WNr8Gm18Pr5tWhFmwXh7qNGka1nrt2q8ysevSDzr01qxY\nEZFDkLUkzswKgDuAs4BSYKGZPenur6VUmwo86O4PmNlYYArwxXjsQWCyu882s9ZAecrnbnT3Gdlq\nu4gcpsJmlb1uqfbuhndXw+aVldu6heG5dRWat4Wi4sr77HoUQ4u29dt+EZE8ks2euFOBNe7+JoCZ\nPQL8G5CaxA0AvhnfzwMej3UHAIXuPhvA3T/MYjtFJNuatgwzXLsPObD84w9gy6rQY7dhcRiWfe4n\nhM53C/fXpU6g6NA79ACKiEhWk7gewLqU/VJgZJU6S4GLCEOuFwBtzKwTcCKwzcweBY4DngW+4+77\n4ucmm9kPgDmx/OOqFzezq4GrAY499tg6+1IiUoeatwm9b0XFMPxLoeyjHbC+pHICxbI/QMm94Vir\nriGZO3ZUSOy6D9HSYSLSaGUziUv33+WqsyhuAH5lZlcCfwXWA2WxXf8CDAP+CfweuBK4F5gEvAM0\nA6YD3wZurnYh9+nxOMXFxQ1/9oZIQ9GiLZwwNmwA5fvi8Ov8ysTu9afCsYJm4WHE7Xsdfg9d87bQ\nrge0LYqvPcJz8DT5QkRyXDaTuFLCpIQKRcCG1AruvgG4ECDe93aRu283s1JgccpQ7OPAKOBed98Y\nP/6xmf2GkAiKSEPVpACOHhS2UyaGsg82QemCysSudMHhndsdPtoGH22vcsCgzdEhoaua4LUrCq+t\nu0GTJkf01UREjkQ2k7iFQF8zO47Qw3YxcGlqBTPrDLzv7uWEHrb7Uj7bwcy6uPsWYCxQEj/T3d03\nmpkB44HlWfwOIpKL2nSD/ueFrS58/AFsXw87SuPr+sr9Ta/BG7Nh764DP9OkEJoewXJkR3WskhxW\nSRZbdtD9fyJSo6wlce5eZmbXAjMJjxi5z91XmNnNQIm7PwmcAUwxMycMp14TP7vPzG4A5sRkbRHw\n63jqh8ysC2G4dgnw1Wx9BxFpJJq3iY886Zf+uDvs3hqei7djfeVrWbXbcTPj5bDz3XCOt18Or/tv\n+Y2atjpIgncMFOohyiKJOXpwzsyc18N+RUSSVr4PPtyUpjewtLJX8MNNVL+tWETq3ZfnQtGIrF4i\nJx72KyIiGWhSEHrY2h4DnJK+Ttke+GAD7NgI5XvrtXkikqJz36RbsJ+SOBGRfFDYLDwnr0PvpFsi\nIjlCU6tERERE8pCSOBEREZE8pCROREREJA8piRMRERHJQ0riRERERPKQkjgRERGRPKQkTkRERCQP\nKYkTERERyUNK4kRERETykJI4ERERkTykJE5EREQkDymJExEREclDSuJERERE8pCSOBEREZE8pCRO\nREREJA+Zuyfdhqwzsy3A23Vwqs7Au3VwnoZOccqcYpU5xSozilPmFKvMKE6Zq6tY9XL3LrVVahRJ\nXF0xsxJ3L066HblOccqcYpU5xSozilPmFKvMKE6Zq+9YaThVREREJA8piRMRERHJQ0riDs30pBuQ\nJxSnzClWmVOsMqM4ZU6xyozilLl6jZXuiRMRERHJQ+qJExEREclDSuJERERE8pCSuAyY2TgzW2Vm\na8zs
O0m3J2lmdp+ZbTaz5SllHc1stpm9EV87xHIzs1/E2C0zs+HJtbx+mVlPM5tnZivNbIWZXR/L\nFasqzKyFmS0ws6UxVjfF8uPMbH6M1e/NrFksbx7318TjvZNsf30zswIzW2xmT8V9xSkNM3vLzF41\nsyVmVhLL9PuXhpm1N7MZZvZ6/Js1WrE6kJmdFH+WKrYdZvaNJOOkJK4WZlYA3AGcAwwALjGzAcm2\nKnH3A+OqlH0HmOPufYE5cR9C3PrG7WrgrnpqYy4oA/7L3fsDo4Br4s+OYlXdx8BYdx8CDAXGmdko\n4BZgWozVVmBirD8R2OrufYBpsV5jcj2wMmVfcTq4f3X3oSnP7tLvX3q3A8+4ez9gCOHnS7FK4e6r\n4s/SUGAEsAt4jCTj5O7aatiA0cDMlP1JwKSk25X0BvQGlqfsrwK6x/fdgVXx/d3AJenqNbYNeAI4\nS7GqNU5HAX8HRhKefF4Yy/f/LgIzgdHxfWGsZ0m3vZ7iU0T4h2Is8BRgitNBY/UW0LlKmX7/qsep\nLfCPqj8bilWNMTsb+FvScVJPXO16AOtS9ktjmRyom7tvBIivXWO54gfEYaxhwHwUq7TiEOESYDMw\nG1gLbHP3slglNR77YxWPbwc61W+LE3Mb8N9AedzvhOJ0MA7MMrNFZnZ1LNPvX3XHA1uA38Rh+nvM\nrBWKVU0uBh6O7xOLk5K42lmaMj2XJXONPn5m1hr4E/ANd99RU9U0ZY0mVu6+z8MwRRFwKtA/XbX4\n2ihjZWafBTa7+6LU4jRVG3WcUpzu7sMJw1rXmNkna6jbmGNVCAwH7nL3YcBOKocE02nMsSLec3o+\n8MfaqqYpq9M4KYmrXSnQM2W/CNiQUFty2SYz6w4QXzfH8kYdPzNrSkjgHnL3R2OxYlUDd98GPEe4\nj7C9mRXGQ6nx2B+reLwd8H79tjQRpwPnm9lbwCOEIdXbUJzScvcN8XUz4d6lU9HvXzqlQKm7z4/7\nMwhJnWKV3jnA3919U9xPLE5K4mq3EOgbZ381I3ShPplwm3LRk8AV8f0VhPu/Ksq/FGfpjAK2V3Q7\nN3RmZsC9wEp3vzXlkGJVhZl1MbP28X1L4FOEG6vnARNitaqxqojhBGCux5tOGjJ3n+TuRe7em/C3\naK67X4biVI2ZtTKzNhXvCfcwLUe/f9W4+zvAOjM7KRadCbyGYnUwl1A5lApJxinpmwPzYQM+A6wm\n3KPzvaTbk/QWf3g3AnsJ/9OYSLjPZg7wRnztGOsaYXbvWuBVoDjp9tdjnD5B6DpfBiyJ22cUq7Sx\nGgwsjrFaDvwglh8PLADWEIYumsfyFnF/TTx+fNLfIYGYnQE8pTgdND7HA0vjtqLib7d+/w4ar6FA\nSfwdfBzooFiljdNRwHtAu5SyxOKkZbdERERE8pCGU0VERETykJI4ERERkTykJE5EREQkDymJExER\nEclDSuJERERE8pCSOBHJOWY2xczOMLPxZpb2yfFm9kMzu+EQzvndI2zTeDMbUENb1pvZkrj9pK6v\nISJSlZI4EclFIwnrzI4BXqijcx5REgeMB2pKsKa5+9C41bRk0ZFco5qUlRpEpJFREiciOcPMfmZm\ny4BTgJeBLwN3mdkPDuEcj8cFz1dULHoee8Zaxl6yh2LZ5Wa2IJbdbWYFsfxDM5tsZkvN7BUz62Zm\npxHWSvxZrH9Chm0ZYWbPx/bMTFma5ytmtjBe409mdlS6a5jZc2ZWHD/TOS63hZldaWZ/NLM/A7Ni\n2Y3xnMvM7KZY1srM/hKvs9zMvpBpHEUk9ymJE5Gc4e43EhK3+wmJ3DJ3H+zuNx/Caa5y9xFAMXCd\nmXWKPWO7Yy/ZZWbWH/gCYYH0ocA+4LL4+VbAK+4+BPgr8BV3f4mwhM6N8Rxr01z3mynDqZ+O6+b+\nEpgQ23MfMDnWfdTdT4nXWAlMzPAaqUYDV7j7WDM7G+hLWBt0KDDCwm
Lv44AN7j7E3QcBzxxCHEUk\nx6kbXkRyzTDCEmX9COs3HqrrzOyC+L4nIbl5r0qdM4ERwMKwxC0tqVy0eg/wVHy/CDgrw+tOc/ep\nFTtmNggYBMyO1yggLFcHMMjMfgy0B1oDMzO8RqrZ7l6xmP3ZcVsc91sTvvcLwFQzu4WwRFddDU2L\nSA5QEiciOcHMhhJ64IqAdwlrFJqZLQFGu/vuDM5xBvCpWH+XmT1HWD+0WlXgAXeflObYXq9cj3Af\nh/930oAV7j46zbH7gfHuvtTMriSsg5pOGZUjJlW/x84q15ri7ndXa4TZCMKavVPMbNYh9mqKSA7T\ncKqI5AR3XxKHNlcTbu6fC3w6Di3WmsBF7YCtMYHrB4xKObY3DnFCWKR6gpl1BTCzjmbWq5ZzfwC0\nyfT7AKuALmY2Ol6jqZkNjMfaABtjey5L+UzVa7xF6DEEmFDDtWYCV5lZ63itHmbW1cyOAXa5+2+B\nqcDwQ2i/iOQ4JXEikjPMrAshCSsH+rl7bcOp3zez0oqNcM9XYZwc8SPglZS604FlZvZQPO/3gVmx\n7mygey3XegS40cwWZzKxwd33EBKvW8xsKWGI+LR4+H8Is29nA6/XcI2pwH+a2UtA5xquNQv4HfCy\nmb0KzCAkgycDC2Jv5veAH9fWbhHJH1Y5aiAiIiIi+UI9cSIiIiJ5SEmciIiISB5SEiciIiKSh5TE\niYiIiOQhJXEiIiIieUhJnIiIiEgeUhInIiIikof+H2sPE7FibvYPAAAAAElFTkSuQmCC\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plt.figure(figsize=(10, 7))\n", "plt.plot(num_latent_feats, 1 - np.array(sum_errs_train)/(user_item_train.shape[0] \n", " * user_item_test_predictable.shape[1]), label='Train')\n", "plt.plot(num_latent_feats, 1 - np.array(sum_errs_test)/(user_item_test_predictable.shape[0] \n", " * user_item_test_predictable.shape[1]), label='Test')\n", "plt.xlabel('# Latent Features')\n", "plt.ylabel('Prediction Accuracy')\n", "plt.legend()\n", "plt.title('Training/Test Accuracy vs. Number of Latent Features')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "`6.` Use the cell below to comment on the results you found in the previous question. Given the circumstances of your results, discuss what you might do to determine if the recommendations you make with any of the above recommendation systems are an improvement to how users currently find articles? 
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**1.** From the investigation from the dataset above, we learnt that we only have 20 users in the test set are present in the training set, which makes the users that could be recommended using the collaborative filtering techniques. There are a large amount of users will suffer from the cold start problem, which forced us to use the rank based reccommendation - a method that is far less ideal. \n", "\n", "**2.** From the training graph above, significant overfitting issues has been observed when we choose a larger number of latent features. It is better for us to choose a smaller number of latent features." ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "\n", "### Extras\n", "Using your workbook, you could now save your recommendations for each user, develop a class to make new predictions and update your results, and make a flask app to deploy your results. These tasks are beyond what is required for this project. However, from what you learned in the lessons, you certainly capable of taking these tasks on to improve upon your work here!\n", "\n", "\n", "## Conclusion\n", "\n", "> Congratulations! You have reached the end of the Recommendations with IBM project! \n", "\n", "> **Tip**: Once you are satisfied with your work here, check over your report to make sure that it is satisfies all the areas of the [rubric](https://review.udacity.com/#!/rubrics/2322/view). You should also probably remove all of the \"Tips\" like this one so that the presentation is as polished as possible.\n", "\n", "\n", "## Directions to Submit\n", "\n", "> Before you submit your project, you need to create a .html or .pdf version of this notebook in the workspace here. To do that, run the code cell below. 
If it worked correctly, you should get a return code of 0, and you should see the generated .html file in the workspace directory (click on the orange Jupyter icon in the upper left).\n", "\n", "> Alternatively, you can download this report as .html via the **File** > **Download as** submenu, and then manually upload it into the workspace directory by clicking on the orange Jupyter icon in the upper left, then using the Upload button.\n", "\n", "> Once you've done this, you can submit your project by clicking on the \"Submit Project\" button in the lower right here. This will create and submit a zip file with this .ipynb doc and the .html or .pdf version you created. Congratulations! " ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from subprocess import call\n", "call(['python', '-m', 'nbconvert', 'Recommendations_with_IBM.ipynb'])" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.7" } }, "nbformat": 4, "nbformat_minor": 2 }