{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### 计算传播应用\n",
    "***\n",
    "***\n",
    "# 推荐系统简介\n",
    "***\n",
    "***\n",
    "\n",
    "王成军\n",
    "\n",
    "wangchengjun@nju.edu.cn\n",
    "\n",
    "计算传播网 http://computational-communication.com"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# 集体智慧编程\n",
    "\n",
    "> 集体智慧是指为了创造新想法，将一群人的行为、偏好或思想组合在一起。一般基于聪明的算法（Netflix, Google）或者提供内容的用户(Wikipedia)。\n",
    "\n",
    "集体智慧编程所强调的是前者，即通过编写计算机程序、构造具有智能的算法收集并分析用户的数据，发现新的信息甚至是知识。\n",
    "\n",
    "- Netflix\n",
    "- Google\n",
    "- Wikipedia\n",
    "\n",
    "Toby Segaran, 2007, Programming Collective Intelligence. O'Reilly. \n",
    "\n",
    "https://github.com/computational-class/programming-collective-intelligence-code/blob/master/chapter2/recommendations.py"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# 推荐系统\n",
    "\n",
    "- 目前互联网世界最常见的智能产品形式。\n",
    "- 从信息时代过渡到注意力时代：\n",
    "    - 信息过载（information overload）\n",
    "    - 注意力稀缺\n",
    "- 推荐系统的基本任务是联系用户和物品，帮助用户快速发现有用信息，解决信息过载的问题。\n",
    "    - 针对长尾分布问题，找到个性化需求，优化资源配置\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# 推荐系统的类型\n",
    "- 社会化推荐（Social Recommendation）\n",
    "    - 让朋友帮助推荐物品\n",
    "- 基于内容的推荐 （Content-based filtering）\n",
    "    - 基于用户已经消费的物品内容，推荐新的物品。例如根据看过的电影的导演和演员，推荐新影片。\n",
    "- 基于协同过滤的推荐（collaborative filtering）\n",
    "    - 找到和某用户的历史兴趣一致的用户，根据这些用户之间的相似性或者他们所消费物品的相似性，为该用户推荐物品"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# 协同过滤算法\n",
    "\n",
    "- 基于邻域的方法（neighborhood-based method）\n",
    "    - 基于用户的协同过滤（user-based filtering）\n",
    "    - 基于物品的协同过滤 （item-based filtering）\n",
    "- 隐语义模型（latent factor model）\n",
    "- 基于图的随机游走算法（random walk on graphs）"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# UserCF和ItemCF的比较\n",
    "\n",
    "- UserCF较为古老， 1992年应用于电子邮件个性化推荐系统Tapestry, 1994年应用于Grouplens新闻个性化推荐， 后来被Digg采用\n",
    "    - 推荐那些与个体有共同兴趣爱好的用户所喜欢的物品（群体热点，社会化）\n",
    "        - 反映用户所在小型群体中物品的热门程度\n",
    "- ItemCF相对较新，应用于电子商务网站Amazon和DVD租赁网站Netflix\n",
    "    - 推荐那些和用户之前喜欢的物品相似的物品 （历史兴趣，个性化）\n",
    "        - 反映了用户自己的兴趣传承\n",
    "- 新闻更新快，物品数量庞大，相似度变化很快，不利于维护一张物品相似度的表格，电影、音乐、图书则可以。\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# 推荐系统评测\n",
    "- 用户满意度\n",
    "- 预测准确度\n",
    "\n",
    "    $r_{ui}$用户实际打分， $\\hat{r_{ui}}$推荐算法预测打分, T为测量次数\n",
    "\n",
    "    - 均方根误差RMSE\n",
    "    \n",
    "        $RMSE = \\sqrt{\\frac{\\sum_{u, i \\in T} (r_{ui} - \\hat{r_{ui}})}{ T }^2} $\n",
    "        \n",
    "    - 平均绝对误差MAE\n",
    "    \n",
    "        $  MAE = \\frac{\\sum_{u, i \\in T} \\left | r_{ui} - \\hat{r_{ui}} \\right|}{ T}$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T03:37:50.871325Z",
     "start_time": "2019-06-15T03:37:50.858051Z"
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [],
   "source": [
    "# A dictionary of movie critics and their ratings of a small\n",
    "# set of movies\n",
    "critics={'Lisa Rose': {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.5,\n",
    "      'Just My Luck': 3.0, 'Superman Returns': 3.5, 'You, Me and Dupree': 2.5,\n",
    "      'The Night Listener': 3.0},\n",
    "     'Gene Seymour': {'Lady in the Water': 3.0, 'Snakes on a Plane': 3.5,\n",
    "      'Just My Luck': 1.5, 'Superman Returns': 5.0, 'The Night Listener': 3.0,\n",
    "      'You, Me and Dupree': 3.5},\n",
    "     'Michael Phillips': {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.0,\n",
    "      'Superman Returns': 3.5, 'The Night Listener': 4.0},\n",
    "     'Claudia Puig': {'Snakes on a Plane': 3.5, 'Just My Luck': 3.0,\n",
    "      'The Night Listener': 4.5, 'Superman Returns': 4.0,\n",
    "      'You, Me and Dupree': 2.5},\n",
    "     'Mick LaSalle': {'Lady in the Water': 3.0, 'Snakes on a Plane': 4.0,\n",
    "      'Just My Luck': 2.0, 'Superman Returns': 3.0, 'The Night Listener': 3.0,\n",
    "      'You, Me and Dupree': 2.0},\n",
    "     'Jack Matthews': {'Lady in the Water': 3.0, 'Snakes on a Plane': 4.0,\n",
    "      'The Night Listener': 3.0, 'Superman Returns': 5.0, 'You, Me and Dupree': 3.5},\n",
    "     'Toby': {'Snakes on a Plane':4.5,'You, Me and Dupree':1.0,'Superman Returns':4.0}}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T03:37:53.188222Z",
     "start_time": "2019-06-15T03:37:53.176776Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2.5"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "critics['Lisa Rose']['Lady in the Water']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T03:38:02.652358Z",
     "start_time": "2019-06-15T03:38:02.648001Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4.5"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "critics['Toby']['Snakes on a Plane']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T03:38:06.772052Z",
     "start_time": "2019-06-15T03:38:06.767392Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'Snakes on a Plane': 4.5, 'Superman Returns': 4.0, 'You, Me and Dupree': 1.0}"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "critics['Toby']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "***\n",
    "# 1. User-based filtering\n",
    "***\n",
    "## 1.0 Finding similar users"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T03:38:20.782918Z",
     "start_time": "2019-06-15T03:38:20.679782Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3.1622776601683795"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 欧几里得距离\n",
    "import numpy as np\n",
    "np.sqrt(np.power(5-4, 2) + np.power(4-1, 2))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "- This formula calculates the distance, which will be smaller for people who are more similar. \n",
    "- However, you need a function that gives higher values for people who are similar. \n",
    "- This can be done by adding 1 to the function (so you don’t get a division-by-zero error) and inverting it:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.2402530733520421"
      ]
     },
     "execution_count": 49,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "1.0 /(1 + np.sqrt(np.power(5-4, 2) + np.power(4-1, 2)) )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T03:39:43.671528Z",
     "start_time": "2019-06-15T03:39:43.659485Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "# Returns a distance-based similarity score for person1 and person2\n",
    "def sim_distance(prefs,person1,person2):\n",
    "    # Get the list of shared_items\n",
    "    si={}\n",
    "    for item in prefs[person1]:\n",
    "        if item in prefs[person2]:\n",
    "            si[item]=1\n",
    "    # if they have no ratings in common, return 0\n",
    "    if len(si)==0: return 0\n",
    "    # Add up the squares of all the differences\n",
    "    sum_of_squares=np.sum([np.power(prefs[person1][item]-prefs[person2][item],2)\n",
    "                           for item in prefs[person1] if item in prefs[person2]])\n",
    "                           #for item in si.keys()])# \n",
    "    return 1/(1+np.sqrt(sum_of_squares) )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T03:39:50.817110Z",
     "start_time": "2019-06-15T03:39:50.812396Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.3483314773547883"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sim_distance(critics, 'Lisa Rose','Toby')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Pearson correlation coefficient"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T03:39:57.893913Z",
     "start_time": "2019-06-15T03:39:57.860142Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "# Returns the Pearson correlation coefficient for p1 and p2\n",
    "def sim_pearson(prefs,p1,p2):\n",
    "    # Get the list of mutually rated items\n",
    "    si={}\n",
    "    for item in prefs[p1]:\n",
    "        if item in prefs[p2]: si[item]=1\n",
    "    # Find the number of elements\n",
    "    n=len(si)\n",
    "    # if they are no ratings in common, return 0\n",
    "    if n==0: return 0\n",
    "    # Add up all the preferences\n",
    "    sum1=np.sum([prefs[p1][it] for it in si])\n",
    "    sum2=np.sum([prefs[p2][it] for it in si])\n",
    "    # Sum up the squares\n",
    "    sum1Sq=np.sum([np.power(prefs[p1][it],2) for it in si])\n",
    "    sum2Sq=np.sum([np.power(prefs[p2][it],2) for it in si])\n",
    "    # Sum up the products\n",
    "    pSum=np.sum([prefs[p1][it]*prefs[p2][it] for it in si])\n",
    "    # Calculate Pearson score\n",
    "    num=pSum-(sum1*sum2/n)\n",
    "    den=np.sqrt((sum1Sq-np.power(sum1,2)/n)*(sum2Sq-np.power(sum2,2)/n))\n",
    "    if den==0: return 0\n",
    "    return num/den"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T03:40:05.801716Z",
     "start_time": "2019-06-15T03:40:05.797214Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.9912407071619299"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sim_pearson(critics, 'Lisa Rose','Toby')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T03:40:31.416479Z",
     "start_time": "2019-06-15T03:40:31.409446Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "# Returns the best matches for person from the prefs dictionary.\n",
    "# Number of results and similarity function are optional params.\n",
    "def topMatches(prefs,person,n=5,similarity=sim_pearson):\n",
    "    scores=[(similarity(prefs,person,other),other)\n",
    "        for other in prefs if other!=person]\n",
    "    # Sort the list so the highest scores appear at the top \n",
    "    scores.sort( )\n",
    "    scores.reverse( )\n",
    "    return scores[0:n]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T03:40:37.864357Z",
     "start_time": "2019-06-15T03:40:37.859476Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[(0.9912407071619299, 'Lisa Rose'),\n",
       " (0.9244734516419049, 'Mick LaSalle'),\n",
       " (0.8934051474415647, 'Claudia Puig')]"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "topMatches(critics,'Toby',n=3) # topN"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## 1.1 Recommending Items"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "<img src='./img/usercf.png' width = 700px>\n",
    "\n",
    "- Toby相似的五个用户（Rose, Reymour, Puig, LaSalle, Matthews）及相似度（依次为0.99， 0.38， 0.89， 0.92, 0.66）\n",
    "- 这五个用户看过的三个电影（Night,Lady, Luck）及其评分\n",
    "    - 例如，Rose对Night评分是3.0\n",
    "- S.xNight是用户相似度与电影评分的乘积\n",
    "    - 例如，Toby与Rose相似度(0.99) * Rose对Night评分是3.0 = 2.97\n",
    "- 可以得到每部电影的得分\n",
    "    - 例如，Night的得分是12.89 = 2.97+1.14+4.02+2.77+1.99\n",
    "- 电影得分需要使用用户相似度之和进行加权\n",
    "    - 例如，Night电影的预测得分是3.35 = 12.89/3.84\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T03:49:21.867787Z",
     "start_time": "2019-06-15T03:49:21.838827Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "# Gets recommendations for a person by using a weighted average\n",
    "# of every other user's rankings\n",
    "def getRecommendations(prefs,person,similarity=sim_pearson):\n",
    "    totals={}\n",
    "    simSums={}\n",
    "    for other in prefs:\n",
    "        # don't compare me to myself\n",
    "        if other==person: continue\n",
    "        sim=similarity(prefs,person,other)\n",
    "        # ignore scores of zero or lower\n",
    "        if sim<=0: continue\n",
    "        for item in prefs[other]:   \n",
    "            # only score movies I haven't seen yet\n",
    "            if item not in prefs[person]:# or prefs[person][item]==0:\n",
    "                # Similarity * Score\n",
    "                totals.setdefault(item,0)\n",
    "                totals[item]+=prefs[other][item]*sim\n",
    "                # Sum of similarities\n",
    "                simSums.setdefault(item,0)\n",
    "                simSums[item]+=sim\n",
    "    # Create the normalized list\n",
    "    rankings=[(total/simSums[item],item) for item,total in totals.items()]\n",
    "    # Return the sorted list\n",
    "    rankings.sort()\n",
    "    rankings.reverse()\n",
    "    return rankings"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T03:49:26.596347Z",
     "start_time": "2019-06-15T03:49:26.590788Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[(3.3477895267131013, 'The Night Listener'),\n",
       " (2.8325499182641614, 'Lady in the Water'),\n",
       " (2.530980703765565, 'Just My Luck')]"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Now you can find out what movies I should watch next:\n",
    "getRecommendations(critics,'Toby')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T03:50:05.387449Z",
     "start_time": "2019-06-15T03:50:05.382039Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[(3.457128694491423, 'The Night Listener'),\n",
       " (2.778584003814923, 'Lady in the Water'),\n",
       " (2.422482042361917, 'Just My Luck')]"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# You’ll find that the results are only affected very slightly by the choice of similarity metric.\n",
    "getRecommendations(critics,'Toby',similarity=sim_distance)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "***\n",
    "# 2. Item-based filtering\n",
    "***\n",
    "\n",
    "Now you know how to find similar people and recommend products for a given person\n",
    "\n",
    "### But what if you want to see which products are similar to each other? \n",
    "This is actually the same method we used earlier to determine similarity between people—"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "#### 将item-user字典的键值翻转"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T03:50:31.798138Z",
     "start_time": "2019-06-15T03:50:31.789728Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "# you just need to swap the people and the items. \n",
    "def transformPrefs(prefs):\n",
    "    result={}\n",
    "    for person in prefs:\n",
    "        for item in prefs[person]:\n",
    "            result.setdefault(item,{})\n",
    "            # Flip item and person\n",
    "            result[item][person]=prefs[person][item]\n",
    "    return result\n",
    "\n",
    "movies = transformPrefs(critics)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "#### 计算item的相似性"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T03:50:47.138142Z",
     "start_time": "2019-06-15T03:50:47.132763Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[(0.6579516949597695, 'You, Me and Dupree'),\n",
       " (0.4879500364742689, 'Lady in the Water'),\n",
       " (0.11180339887498941, 'Snakes on a Plane'),\n",
       " (-0.1798471947990544, 'The Night Listener'),\n",
       " (-0.42289003161103106, 'Just My Luck')]"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "topMatches(movies,'Superman Returns')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "#### 给item推荐user"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T03:51:37.311313Z",
     "start_time": "2019-06-15T03:51:37.291502Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[(0.3090169943749474, 'Snakes on a Plane'),\n",
       " (0.252650308587072, 'The Night Listener'),\n",
       " (0.2402530733520421, 'Lady in the Water'),\n",
       " (0.20799159651347807, 'Just My Luck'),\n",
       " (0.1918253663634734, 'You, Me and Dupree')]"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def calculateSimilarItems(prefs,n=10):\n",
    "    # Create a dictionary of items showing which other items they\n",
    "    # are most similar to.\n",
    "    result={}\n",
    "    # Invert the preference matrix to be item-centric\n",
    "    itemPrefs=transformPrefs(prefs)\n",
    "    c=0\n",
    "    for item in itemPrefs:\n",
    "        # Status updates for large datasets\n",
    "        c+=1\n",
    "        if c%100==0: \n",
    "            print(\"%d / %d\" % (c,len(itemPrefs)))\n",
    "        # Find the most similar items to this one\n",
    "        scores=topMatches(itemPrefs,item,n=n,similarity=sim_distance)\n",
    "        result[item]=scores\n",
    "    return result\n",
    "\n",
    "itemsim=calculateSimilarItems(critics) \n",
    "itemsim['Superman Returns']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "<img src = './img/itemcf1.png' width=800px>\n",
    "\n",
    "- Toby看过三个电影（snakes、Superman、dupree）和评分（依次是4.5、4.0、1.0）\n",
    "- 表格2-3给出这三部电影与另外三部电影的相似度\n",
    "    - 例如superman与night的相似度是0.103\n",
    "- R.xNight表示Toby对自己看过的三部定影的评分与Night这部电影相似度的乘积\n",
    "    - 例如，0.818 = 4.5*0.182\n",
    "    \n",
    "    \n",
    "- 那么Toby对于Night的评分可以表达为0.818+0.412+0.148 = 1.378\n",
    "    - 已经知道Night相似度之和是0.182+0.103+0.148 = 0.433\n",
    "        - 那么Toby对Night的最终评分可以表达为：1.378/0.433 = 3.183\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T03:59:45.629649Z",
     "start_time": "2019-06-15T03:59:45.598310Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[(3.1667425234070894, 'The Night Listener'),\n",
       " (2.9366294028444346, 'Just My Luck'),\n",
       " (2.868767392626467, 'Lady in the Water')]"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def getRecommendedItems(prefs,itemMatch,user):\n",
    "    userRatings=prefs[user]\n",
    "    scores={}\n",
    "    totalSim={}\n",
    "    # Loop over items rated by this user\n",
    "    for (item,rating) in userRatings.items( ):\n",
    "        # Loop over items similar to this one\n",
    "        for (similarity,item2) in itemMatch[item]:\n",
    "            # Ignore if this user has already rated this item\n",
    "            if item2 in userRatings: continue\n",
    "            # Weighted sum of rating times similarity\n",
    "            scores.setdefault(item2,0)\n",
    "            scores[item2]+=similarity*rating\n",
    "            # Sum of all the similarities\n",
    "            totalSim.setdefault(item2,0)\n",
    "            totalSim[item2]+=similarity\n",
    "    # Divide each total score by total weighting to get an average\n",
    "    rankings=[(score/totalSim[item],item) for item,score in scores.items( )]\n",
    "    # Return the rankings from highest to lowest\n",
    "    rankings.sort( )\n",
    "    rankings.reverse( )\n",
    "    return rankings\n",
    "\n",
    "getRecommendedItems(critics,itemsim,'Toby')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T03:59:58.539274Z",
     "start_time": "2019-06-15T03:59:58.534208Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[(4.0, 'Michael Phillips'), (3.0, 'Jack Matthews')]"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "getRecommendations(movies,'Just My Luck')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 84,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[(3.1637361366111816, 'Michael Phillips')]"
      ]
     },
     "execution_count": 84,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "getRecommendations(movies, 'You, Me and Dupree')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "<img src = './img/itemcfNetwork.png' width = 700px>\n",
    "\n",
    "# 基于物品的协同过滤算法的网络表示方法"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# 基于图的模型\n",
    "\n",
    "使用二分图表示用户行为，因此基于图的算法可以应用到推荐系统当中。\n",
    "\n",
    "<img src = './img/graphrec.png' width = 500px>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T06:26:46.988001Z",
     "start_time": "2019-06-15T06:26:46.974911Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "# https://github.com/ParticleWave/RecommendationSystemStudy/blob/d1960056b96cfaad62afbfe39225ff680240d37e/PersonalRank.py\n",
    "import os\n",
    "import random\n",
    "\n",
    "class Graph:\n",
    "    def __init__(self):\n",
    "        self.G = dict()\n",
    "    \n",
    "    def addEdge(self, p, q):\n",
    "        if p not in self.G: self.G[p] = dict()\n",
    "        if q not in self.G: self.G[q] = dict()\n",
    "        self.G[p][q] = 1\n",
    "        self.G[q][p] = 1\n",
    "\n",
    "    def getGraphMatrix(self):\n",
    "        return self.G"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T06:26:57.557206Z",
     "start_time": "2019-06-15T06:26:57.546827Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "dict_keys(['A', 'd', 'b', 'C', 'a', 'c', 'B'])\n"
     ]
    }
   ],
   "source": [
    "graph = Graph()\n",
    "graph.addEdge('A', 'a')\n",
    "graph.addEdge('A', 'c')\n",
    "graph.addEdge('B', 'a')\n",
    "graph.addEdge('B', 'b')\n",
    "graph.addEdge('B', 'c')\n",
    "graph.addEdge('B', 'd')\n",
    "graph.addEdge('C', 'c')\n",
    "graph.addEdge('C', 'd')\n",
    "G = graph.getGraphMatrix()\n",
    "print(G.keys())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T06:27:06.285484Z",
     "start_time": "2019-06-15T06:27:06.280821Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'A': {'a': 1, 'c': 1},\n",
       " 'B': {'a': 1, 'b': 1, 'c': 1, 'd': 1},\n",
       " 'C': {'c': 1, 'd': 1},\n",
       " 'a': {'A': 1, 'B': 1},\n",
       " 'b': {'B': 1},\n",
       " 'c': {'A': 1, 'B': 1, 'C': 1},\n",
       " 'd': {'B': 1, 'C': 1}}"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "G"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T06:32:02.195675Z",
     "start_time": "2019-06-15T06:32:02.189744Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "A a 1\n",
      "A c 1\n",
      "d C 1\n",
      "d B 1\n",
      "b B 1\n",
      "C d 1\n",
      "C c 1\n",
      "a B 1\n",
      "a A 1\n",
      "c C 1\n",
      "c B 1\n",
      "c A 1\n",
      "B a 1\n",
      "B d 1\n",
      "B c 1\n",
      "B b 1\n"
     ]
    }
   ],
   "source": [
    "for i, ri in G.items():\n",
    "    for j, wij in ri.items():\n",
    "        print(i, j, wij)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T06:34:47.321257Z",
     "start_time": "2019-06-15T06:34:47.301108Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "def PersonalRank(G, alpha, root, max_step):\n",
    "    # G is the biparitite graph of users' ratings on items\n",
    "    # alpha is the probability of random walk forward\n",
    "    # root is the studied User\n",
    "    # max_step if the steps of iterations.\n",
    "    rank = dict()\n",
    "    rank = {x:0.0 for x in G.keys()}\n",
    "    rank[root] = 1.0\n",
    "    for k in range(max_step):\n",
    "        tmp = {x:0.0 for x in G.keys()}\n",
    "        for i,ri in G.items():\n",
    "            for j,wij in ri.items():\n",
    "                if j not in tmp: tmp[j] = 0.0 #\n",
    "                tmp[j] += alpha * rank[i] / (len(ri)*1.0)\n",
    "                if j == root: tmp[j] += 1.0 - alpha\n",
    "        rank = tmp\n",
    "        print(k, rank)\n",
    "    return rank"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T06:35:06.747759Z",
     "start_time": "2019-06-15T06:35:06.740862Z"
    },
    "scrolled": true,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0 {'c': 0.4, 'd': 0.0, 'b': 0.0, 'C': 0.0, 'a': 0.4, 'A': 0.3999999999999999, 'B': 0.0}\n",
      "1 {'c': 0.15999999999999998, 'd': 0.0, 'b': 0.0, 'C': 0.10666666666666669, 'a': 0.15999999999999998, 'A': 0.6666666666666666, 'B': 0.2666666666666667}\n",
      "2 {'c': 0.3626666666666667, 'd': 0.09600000000000003, 'b': 0.053333333333333344, 'C': 0.04266666666666666, 'a': 0.32, 'A': 0.5066666666666666, 'B': 0.10666666666666665}\n",
      "3 {'c': 0.24106666666666665, 'd': 0.03839999999999999, 'b': 0.02133333333333333, 'C': 0.13511111111111113, 'a': 0.22399999999999998, 'A': 0.624711111111111, 'B': 0.3057777777777778}\n",
      "4 {'c': 0.36508444444444443, 'd': 0.11520000000000002, 'b': 0.06115555555555557, 'C': 0.07964444444444443, 'a': 0.31104, 'A': 0.5538844444444444, 'B': 0.1863111111111111}\n",
      "5 {'c': 0.29067377777777775, 'd': 0.06911999999999999, 'b': 0.03726222222222222, 'C': 0.14343585185185187, 'a': 0.258816, 'A': 0.6217718518518518, 'B': 0.31677629629629633}\n",
      "6 {'c': 0.3694383407407408, 'd': 0.12072960000000002, 'b': 0.06335525925925926, 'C': 0.1051610074074074, 'a': 0.312064, 'A': 0.5810394074074073, 'B': 0.2384971851851852}\n",
      "7 {'c': 0.322179602962963, 'd': 0.08976384000000001, 'b': 0.047699437037037044, 'C': 0.14680873086419757, 'a': 0.2801152, 'A': 0.6233424908641975, 'B': 0.322318538271605}\n",
      "8 {'c': 0.37252419634567907, 'd': 0.12318720000000004, 'b': 0.06446370765432101, 'C': 0.12182009679012348, 'a': 0.313800704, 'A': 0.5979606407901235, 'B': 0.27202572641975314}\n",
      "9 {'c': 0.34231744031604944, 'd': 0.10313318400000002, 'b': 0.05440514528395063, 'C': 0.14861466569218112, 'a': 0.2935894016, 'A': 0.624860067292181, 'B': 0.3257059134156379}\n",
      "10 {'c': 0.37453107587687245, 'd': 0.12458704896000004, 'b': 0.06514118268312759, 'C': 0.13253792435094652, 'a': 0.3150852096, 'A': 0.6087204113909463, 'B': 0.29349780121810704}\n",
      "11 {'c': 0.35520289454037857, 'd': 0.11171472998400003, 'b': 0.05869956024362141, 'C': 0.14970977315116601, 'a': 0.3021877248, 'A': 0.6259090374071659, 'B': 0.3278568031376681}\n",
      "12 {'c': 0.3758188848508664, 'd': 0.12545526988800004, 'b': 0.06557136062753362, 'C': 0.1394066638710343, 'a': 0.31593497559039996, 'A': 0.6155958617974342, 'B': 0.3072414019859314}\n",
      "13 {'c': 0.3634492906645737, 'd': 0.1172109459456, 'b': 0.061448280397186285, 'C': 0.1504004772487644, 'a': 0.30768662511615996, 'A': 0.6265923595297243, 'B': 0.3292315559869513}\n",
      "14 {'c': 0.37664344590878573, 'd': 0.126006502096896, 'b': 0.06584631119739026, 'C': 0.14380418922212634, 'a': 0.31648325500928, 'A': 0.6199944608903503, 'B': 0.3160374635863394}\n",
      "15 {'c': 0.36872695276225853, 'd': 0.12072916840611841, 'b': 0.06320749271726787, 'C': 0.1508408530811013, 'a': 0.311205277073408, 'A': 0.6270315542460547, 'B': 0.3301112040427255}\n",
      "16 {'c': 0.3771712037394075, 'd': 0.12635858204098563, 'b': 0.0660222408085451, 'C': 0.14661885476571632, 'a': 0.316834862506967, 'A': 0.6228092982326321, 'B': 0.3216669597688938}\n",
      "17 {'c': 0.37210465315311814, 'd': 0.1229809338600653, 'b': 0.06433339195377877, 'C': 0.15112242048023627, 'a': 0.3134571112468316, 'A': 0.6273129326666287, 'B': 0.3306741581298592}\n",
      "18 {'c': 0.3775089728847178, 'd': 0.12658379981806633, 'b': 0.06613483162597183, 'C': 0.1484202810515243, 'a': 0.3170600046926233, 'A': 0.6246107520062307, 'B': 0.32526983911327995}\n",
      "19 {'c': 0.37426638104575805, 'd': 0.1244220802432657, 'b': 0.06505396782265599, 'C': 0.15130257936315128, 'a': 0.3148982686251483, 'A': 0.627493061312974, 'B': 0.3310344465409781}\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "{'A': 0.627493061312974,\n",
       " 'B': 0.3310344465409781,\n",
       " 'C': 0.15130257936315128,\n",
       " 'a': 0.3148982686251483,\n",
       " 'b': 0.06505396782265599,\n",
       " 'c': 0.37426638104575805,\n",
       " 'd': 0.1244220802432657}"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "PersonalRank(G, 0.8, 'A', 20)\n",
    "#    print(PersonalRank(G, 0.8, 'B', 20))\n",
    "#    print(PersonalRank(G, 0.8, 'C', 20))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "***\n",
    "# 3. MovieLens Recommender\n",
    "***\n",
    "MovieLens是一个电影评价的真实数据，由明尼苏达州立大学的GroupLens项目组开发。\n",
    "\n",
    "### 数据下载\n",
    "http://grouplens.org/datasets/movielens/1m/\n",
    "\n",
    "> These files contain 1,000,209 anonymous ratings of approximately 3,900 movies \n",
    "made by 6,040 MovieLens users who joined MovieLens in 2000.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "# 数据格式\n",
    "All ratings are contained in the file \"ratings.dat\" and are in the following format:\n",
    "\n",
    "UserID::MovieID::Rating::Timestamp\n",
    "\n",
    "1::1193::5::978300760\n",
    "\n",
    "1::661::3::978302109\n",
    "\n",
    "1::914::3::978301968\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T06:38:51.401130Z",
     "start_time": "2019-06-15T06:38:51.387300Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "def loadMovieLens(path='/Users/datalab/bigdata/cjc/ml-1m/'):\n",
    "    # Get movie titles\n",
    "    movies={}\n",
    "    for line in open(path+'movies.dat', encoding = 'iso-8859-15'):\n",
    "        (id,title)=line.split('::')[0:2]\n",
    "        movies[id]=title\n",
    "  \n",
    "    # Load data\n",
    "    prefs={}\n",
    "    for line in open(path+'/ratings.dat'):\n",
    "        (user,movieid,rating,ts)=line.split('::')\n",
    "        prefs.setdefault(user,{})\n",
    "        prefs[user][movies[movieid]]=float(rating)\n",
    "    return prefs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T06:38:57.759710Z",
     "start_time": "2019-06-15T06:38:56.267827Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'Alice in Wonderland (1951)': 1.0,\n",
       " 'Army of Darkness (1993)': 3.0,\n",
       " 'Bad Boys (1995)': 5.0,\n",
       " 'Benji (1974)': 1.0,\n",
       " 'Brady Bunch Movie, The (1995)': 1.0,\n",
       " 'Braveheart (1995)': 5.0,\n",
       " 'Buffalo 66 (1998)': 1.0,\n",
       " 'Chambermaid on the Titanic, The (1998)': 1.0,\n",
       " 'Cowboy Way, The (1994)': 1.0,\n",
       " 'Cyrano de Bergerac (1990)': 4.0,\n",
       " 'Dear Diary (Caro Diario) (1994)': 1.0,\n",
       " 'Die Hard (1988)': 3.0,\n",
       " 'Diebinnen (1995)': 1.0,\n",
       " 'Dr. No (1962)': 1.0,\n",
       " 'Escape from the Planet of the Apes (1971)': 1.0,\n",
       " 'Fast, Cheap & Out of Control (1997)': 1.0,\n",
       " 'Faster Pussycat! Kill! Kill! (1965)': 1.0,\n",
       " 'From Russia with Love (1963)': 1.0,\n",
       " 'Fugitive, The (1993)': 5.0,\n",
       " 'Get Shorty (1995)': 1.0,\n",
       " 'Gladiator (2000)': 5.0,\n",
       " 'Goldfinger (1964)': 5.0,\n",
       " 'Good, The Bad and The Ugly, The (1966)': 4.0,\n",
       " 'Hunt for Red October, The (1990)': 5.0,\n",
       " 'Hurricane, The (1999)': 5.0,\n",
       " 'Indiana Jones and the Last Crusade (1989)': 4.0,\n",
       " 'Jaws (1975)': 5.0,\n",
       " 'Jurassic Park (1993)': 5.0,\n",
       " 'King Kong (1933)': 1.0,\n",
       " 'King of New York (1990)': 1.0,\n",
       " 'Last of the Mohicans, The (1992)': 1.0,\n",
       " 'Lethal Weapon (1987)': 5.0,\n",
       " 'Longest Day, The (1962)': 1.0,\n",
       " 'Man with the Golden Gun, The (1974)': 5.0,\n",
       " 'Mask of Zorro, The (1998)': 5.0,\n",
       " 'Matrix, The (1999)': 5.0,\n",
       " \"On Her Majesty's Secret Service (1969)\": 1.0,\n",
       " 'Out of Sight (1998)': 1.0,\n",
       " 'Palookaville (1996)': 1.0,\n",
       " 'Planet of the Apes (1968)': 1.0,\n",
       " 'Pope of Greenwich Village, The (1984)': 1.0,\n",
       " 'Princess Bride, The (1987)': 3.0,\n",
       " 'Raiders of the Lost Ark (1981)': 4.0,\n",
       " 'Rock, The (1996)': 5.0,\n",
       " 'Rocky (1976)': 5.0,\n",
       " 'Saving Private Ryan (1998)': 4.0,\n",
       " 'Shanghai Noon (2000)': 1.0,\n",
       " 'Speed (1994)': 1.0,\n",
       " 'Star Wars: Episode IV - A New Hope (1977)': 5.0,\n",
       " 'Star Wars: Episode V - The Empire Strikes Back (1980)': 5.0,\n",
       " 'Taking of Pelham One Two Three, The (1974)': 1.0,\n",
       " 'Terminator 2: Judgment Day (1991)': 5.0,\n",
       " 'Terminator, The (1984)': 4.0,\n",
       " 'Thelma & Louise (1991)': 1.0,\n",
       " 'True Romance (1993)': 1.0,\n",
       " 'U-571 (2000)': 5.0,\n",
       " 'Untouchables, The (1987)': 5.0,\n",
       " 'Westworld (1973)': 1.0,\n",
       " 'X-Men (2000)': 4.0}"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "prefs=loadMovieLens()\n",
    "prefs['87']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### user-based filtering"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2018-05-05T03:13:11.081668Z",
     "start_time": "2018-05-05T03:13:09.528867Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[(5.0, 'Time of the Gypsies (Dom za vesanje) (1989)'),\n",
       " (5.0, 'Tigrero: A Film That Was Never Made (1994)'),\n",
       " (5.0, 'Schlafes Bruder (Brother of Sleep) (1995)'),\n",
       " (5.0, 'Return with Honor (1998)'),\n",
       " (5.0, 'Lured (1947)'),\n",
       " (5.0, 'Identification of a Woman (Identificazione di una donna) (1982)'),\n",
       " (5.0, 'I Am Cuba (Soy Cuba/Ya Kuba) (1964)'),\n",
       " (5.0, 'Hour of the Pig, The (1993)'),\n",
       " (5.0, 'Gay Deceivers, The (1969)'),\n",
       " (5.0, 'Gate of Heavenly Peace, The (1995)'),\n",
       " (5.0, 'Foreign Student (1994)'),\n",
       " (5.0, 'Dingo (1992)'),\n",
       " (5.0, 'Dangerous Game (1993)'),\n",
       " (5.0, 'Callejón de los milagros, El (1995)'),\n",
       " (5.0, 'Bittersweet Motel (2000)'),\n",
       " (4.820460101722989, 'Apple, The (Sib) (1998)'),\n",
       " (4.738956184936386, 'Lamerica (1994)'),\n",
       " (4.681816541467396, 'Bells, The (1926)'),\n",
       " (4.664958072522234, 'Hurricane Streets (1998)'),\n",
       " (4.650741840804562, 'Sanjuro (1962)'),\n",
       " (4.649974172600346, 'On the Ropes (1999)'),\n",
       " (4.636825408739504, 'Shawshank Redemption, The (1994)'),\n",
       " (4.627888709544556, 'For All Mankind (1989)'),\n",
       " (4.582048349280509, 'Midaq Alley (Callejón de los milagros, El) (1995)'),\n",
       " (4.579778646871153, \"Schindler's List (1993)\"),\n",
       " (4.57519994103739,\n",
       "  'Seven Samurai (The Magnificent Seven) (Shichinin no samurai) (1954)'),\n",
       " (4.574904988403455, 'Godfather, The (1972)'),\n",
       " (4.5746840191882345, \"Ed's Next Move (1996)\"),\n",
       " (4.558519037147828, 'Hanging Garden, The (1997)'),\n",
       " (4.527760042775589, 'Close Shave, A (1995)')]"
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "getRecommendations(prefs,'87')[0:30]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Item-based filtering"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "ExecuteTime": {
     "start_time": "2018-05-05T03:14:13.060Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "100 / 3706\n",
      "200 / 3706\n",
      "300 / 3706\n",
      "400 / 3706\n",
      "500 / 3706\n",
      "600 / 3706\n",
      "700 / 3706\n",
      "800 / 3706\n",
      "900 / 3706\n",
      "1000 / 3706\n"
     ]
    }
   ],
   "source": [
    "itemsim=calculateSimilarItems(prefs,n=50)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 95,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[(5.0, 'Uninvited Guest, An (2000)'),\n",
       " (5.0, 'Two Much (1996)'),\n",
       " (5.0, 'Two Family House (2000)'),\n",
       " (5.0, 'Trial by Jury (1994)'),\n",
       " (5.0, 'Tom & Viv (1994)'),\n",
       " (5.0, 'This Is My Father (1998)'),\n",
       " (5.0, 'Something to Sing About (1937)'),\n",
       " (5.0, 'Slappy and the Stinkers (1998)'),\n",
       " (5.0, 'Running Free (2000)'),\n",
       " (5.0, 'Roula (1995)'),\n",
       " (5.0, 'Prom Night IV: Deliver Us From Evil (1992)'),\n",
       " (5.0, 'Project Moon Base (1953)'),\n",
       " (5.0, 'Price Above Rubies, A (1998)'),\n",
       " (5.0, 'Open Season (1996)'),\n",
       " (5.0, 'Only Angels Have Wings (1939)'),\n",
       " (5.0, 'Onegin (1999)'),\n",
       " (5.0, 'Once Upon a Time... When We Were Colored (1995)'),\n",
       " (5.0, 'Office Killer (1997)'),\n",
       " (5.0, 'N\\xe9nette et Boni (1996)'),\n",
       " (5.0, 'No Looking Back (1998)'),\n",
       " (5.0, 'Never Met Picasso (1996)'),\n",
       " (5.0, 'Music From Another Room (1998)'),\n",
       " (5.0, \"Mummy's Tomb, The (1942)\"),\n",
       " (5.0, 'Modern Affair, A (1995)'),\n",
       " (5.0, 'Machine, The (1994)'),\n",
       " (5.0, 'Lured (1947)'),\n",
       " (5.0, 'Low Life, The (1994)'),\n",
       " (5.0, 'Lodger, The (1926)'),\n",
       " (5.0, 'Loaded (1994)'),\n",
       " (5.0, 'Line King: Al Hirschfeld, The (1996)')]"
      ]
     },
     "execution_count": 95,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "getRecommendedItems(prefs,itemsim,'87')[0:30]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Buiding Recommendation System with Turicreate"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "In this notebook we will import Turicreate and use it to\n",
    "\n",
    "- train two models that can be used for recommending new songs to users \n",
    "- compare the performance of the two models\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T06:40:29.380065Z",
     "start_time": "2019-06-15T06:40:27.941331Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "%matplotlib inline\n",
    "import turicreate as tc\n",
    "import matplotlib.pyplot as plt\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T06:40:42.258487Z",
     "start_time": "2019-06-15T06:40:42.238423Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\"><table frame=\"box\" rules=\"cols\">\n",
       "    <tr>\n",
       "        <th style=\"padding-left: 1em; padding-right: 1em; text-align: center\">item_id</th>\n",
       "        <th style=\"padding-left: 1em; padding-right: 1em; text-align: center\">rating</th>\n",
       "        <th style=\"padding-left: 1em; padding-right: 1em; text-align: center\">user_id</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">a</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">1</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">b</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">3</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">c</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">2</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">a</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">5</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">b</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">4</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">b</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">1</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">c</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">4</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">d</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">3</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">2</td>\n",
       "    </tr>\n",
       "</table>\n",
       "[8 rows x 3 columns]<br/>\n",
       "</div>"
      ],
      "text/plain": [
       "Columns:\n",
       "\titem_id\tstr\n",
       "\trating\tint\n",
       "\tuser_id\tstr\n",
       "\n",
       "Rows: 8\n",
       "\n",
       "Data:\n",
       "+---------+--------+---------+\n",
       "| item_id | rating | user_id |\n",
       "+---------+--------+---------+\n",
       "|    a    |   1    |    0    |\n",
       "|    b    |   3    |    0    |\n",
       "|    c    |   2    |    0    |\n",
       "|    a    |   5    |    1    |\n",
       "|    b    |   4    |    1    |\n",
       "|    b    |   1    |    2    |\n",
       "|    c    |   4    |    2    |\n",
       "|    d    |   3    |    2    |\n",
       "+---------+--------+---------+\n",
       "[8 rows x 3 columns]"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sf = tc.SFrame({'user_id': [\"0\", \"0\", \"0\", \"1\", \"1\", \"2\", \"2\", \"2\"],\n",
    "                       'item_id': [\"a\", \"b\", \"c\", \"a\", \"b\", \"b\", \"c\", \"d\"],\n",
    "                       'rating': [1, 3, 2, 5, 4, 1, 4, 3]})\n",
    "sf"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T06:40:49.395062Z",
     "start_time": "2019-06-15T06:40:48.313559Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre>Preparing data set.</pre>"
      ],
      "text/plain": [
       "Preparing data set."
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>    Data has 8 observations with 3 users and 4 items.</pre>"
      ],
      "text/plain": [
       "    Data has 8 observations with 3 users and 4 items."
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>    Data prepared in: 0.003196s</pre>"
      ],
      "text/plain": [
       "    Data prepared in: 0.003196s"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>Training ranking_factorization_recommender for recommendations.</pre>"
      ],
      "text/plain": [
       "Training ranking_factorization_recommender for recommendations."
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>+--------------------------------+--------------------------------------------------+----------+</pre>"
      ],
      "text/plain": [
       "+--------------------------------+--------------------------------------------------+----------+"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| Parameter                      | Description                                      | Value    |</pre>"
      ],
      "text/plain": [
       "| Parameter                      | Description                                      | Value    |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>+--------------------------------+--------------------------------------------------+----------+</pre>"
      ],
      "text/plain": [
       "+--------------------------------+--------------------------------------------------+----------+"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| num_factors                    | Factor Dimension                                 | 32       |</pre>"
      ],
      "text/plain": [
       "| num_factors                    | Factor Dimension                                 | 32       |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| regularization                 | L2 Regularization on Factors                     | 1e-09    |</pre>"
      ],
      "text/plain": [
       "| regularization                 | L2 Regularization on Factors                     | 1e-09    |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| solver                         | Solver used for training                         | sgd      |</pre>"
      ],
      "text/plain": [
       "| solver                         | Solver used for training                         | sgd      |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| linear_regularization          | L2 Regularization on Linear Coefficients         | 1e-09    |</pre>"
      ],
      "text/plain": [
       "| linear_regularization          | L2 Regularization on Linear Coefficients         | 1e-09    |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| ranking_regularization         | Rank-based Regularization Weight                 | 0.25     |</pre>"
      ],
      "text/plain": [
       "| ranking_regularization         | Rank-based Regularization Weight                 | 0.25     |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| max_iterations                 | Maximum Number of Iterations                     | 25       |</pre>"
      ],
      "text/plain": [
       "| max_iterations                 | Maximum Number of Iterations                     | 25       |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>+--------------------------------+--------------------------------------------------+----------+</pre>"
      ],
      "text/plain": [
       "+--------------------------------+--------------------------------------------------+----------+"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>  Optimizing model using SGD; tuning step size.</pre>"
      ],
      "text/plain": [
       "  Optimizing model using SGD; tuning step size."
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>  Using 8 / 8 points for tuning the step size.</pre>"
      ],
      "text/plain": [
       "  Using 8 / 8 points for tuning the step size."
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>+---------+-------------------+------------------------------------------+</pre>"
      ],
      "text/plain": [
       "+---------+-------------------+------------------------------------------+"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| Attempt | Initial Step Size | Estimated Objective Value                |</pre>"
      ],
      "text/plain": [
       "| Attempt | Initial Step Size | Estimated Objective Value                |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>+---------+-------------------+------------------------------------------+</pre>"
      ],
      "text/plain": [
       "+---------+-------------------+------------------------------------------+"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| 0       | 25                | Not Viable                               |</pre>"
      ],
      "text/plain": [
       "| 0       | 25                | Not Viable                               |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| 1       | 6.25              | Not Viable                               |</pre>"
      ],
      "text/plain": [
       "| 1       | 6.25              | Not Viable                               |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| 2       | 1.5625            | Not Viable                               |</pre>"
      ],
      "text/plain": [
       "| 2       | 1.5625            | Not Viable                               |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| 3       | 0.390625          | 2.79772                                  |</pre>"
      ],
      "text/plain": [
       "| 3       | 0.390625          | 2.79772                                  |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| 4       | 0.195312          | 2.74043                                  |</pre>"
      ],
      "text/plain": [
       "| 4       | 0.195312          | 2.74043                                  |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| 5       | 0.0976562         | 2.79932                                  |</pre>"
      ],
      "text/plain": [
       "| 5       | 0.0976562         | 2.79932                                  |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| 6       | 0.0488281         | 3.00656                                  |</pre>"
      ],
      "text/plain": [
       "| 6       | 0.0488281         | 3.00656                                  |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| 7       | 0.0244141         | 3.28278                                  |</pre>"
      ],
      "text/plain": [
       "| 7       | 0.0244141         | 3.28278                                  |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>+---------+-------------------+------------------------------------------+</pre>"
      ],
      "text/plain": [
       "+---------+-------------------+------------------------------------------+"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| Final   | 0.195312          | 2.74043                                  |</pre>"
      ],
      "text/plain": [
       "| Final   | 0.195312          | 2.74043                                  |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>+---------+-------------------+------------------------------------------+</pre>"
      ],
      "text/plain": [
       "+---------+-------------------+------------------------------------------+"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>Starting Optimization.</pre>"
      ],
      "text/plain": [
       "Starting Optimization."
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>+---------+--------------+-------------------+-----------------------+-------------+</pre>"
      ],
      "text/plain": [
       "+---------+--------------+-------------------+-----------------------+-------------+"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| Iter.   | Elapsed Time | Approx. Objective | Approx. Training RMSE | Step Size   |</pre>"
      ],
      "text/plain": [
       "| Iter.   | Elapsed Time | Approx. Objective | Approx. Training RMSE | Step Size   |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>+---------+--------------+-------------------+-----------------------+-------------+</pre>"
      ],
      "text/plain": [
       "+---------+--------------+-------------------+-----------------------+-------------+"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| Initial | 141us        | 3.89999           | 1.3637                |             |</pre>"
      ],
      "text/plain": [
       "| Initial | 141us        | 3.89999           | 1.3637                |             |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>+---------+--------------+-------------------+-----------------------+-------------+</pre>"
      ],
      "text/plain": [
       "+---------+--------------+-------------------+-----------------------+-------------+"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| 1       | 762us        | 4.11274           | 1.60028               | 0.195312    |</pre>"
      ],
      "text/plain": [
       "| 1       | 762us        | 4.11274           | 1.60028               | 0.195312    |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| 2       | 1.298ms      | 3.00481           | 1.39367               | 0.116134    |</pre>"
      ],
      "text/plain": [
       "| 2       | 1.298ms      | 3.00481           | 1.39367               | 0.116134    |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| 3       | 1.73ms       | 2.6827            | 1.23558               | 0.0856819   |</pre>"
      ],
      "text/plain": [
       "| 3       | 1.73ms       | 2.6827            | 1.23558               | 0.0856819   |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| 4       | 2.257ms      | 2.49202           | 1.17326               | 0.0580668   |</pre>"
      ],
      "text/plain": [
       "| 4       | 2.257ms      | 2.49202           | 1.17326               | 0.0580668   |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| 5       | 2.757ms      | 2.4385            | 1.16425               | 0.0491185   |</pre>"
      ],
      "text/plain": [
       "| 5       | 2.757ms      | 2.4385            | 1.16425               | 0.0491185   |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| 10      | 5.116ms      | 2.28964           | 1.08345               | 0.029206    |</pre>"
      ],
      "text/plain": [
       "| 10      | 5.116ms      | 2.28964           | 1.08345               | 0.029206    |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| 25      | 12.067ms     | 2.20318           | 1.06674               | 0.0146899   |</pre>"
      ],
      "text/plain": [
       "| 25      | 12.067ms     | 2.20318           | 1.06674               | 0.0146899   |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\"><table frame=\"box\" rules=\"cols\">\n",
       "    <tr>\n",
       "        <th style=\"padding-left: 1em; padding-right: 1em; text-align: center\">user_id</th>\n",
       "        <th style=\"padding-left: 1em; padding-right: 1em; text-align: center\">item_id</th>\n",
       "        <th style=\"padding-left: 1em; padding-right: 1em; text-align: center\">score</th>\n",
       "        <th style=\"padding-left: 1em; padding-right: 1em; text-align: center\">rank</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">0</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">d</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">1.2718666195869446</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">1</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">c</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">4.015937566757202</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">1</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">d</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">3.1318363547325134</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">2</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">a</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">2.478732317686081</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">1</td>\n",
       "    </tr>\n",
       "</table>\n",
       "[4 rows x 4 columns]<br/>\n",
       "</div>"
      ],
      "text/plain": [
       "Columns:\n",
       "\tuser_id\tstr\n",
       "\titem_id\tstr\n",
       "\tscore\tfloat\n",
       "\trank\tint\n",
       "\n",
       "Rows: 4\n",
       "\n",
       "Data:\n",
       "+---------+---------+--------------------+------+\n",
       "| user_id | item_id |       score        | rank |\n",
       "+---------+---------+--------------------+------+\n",
       "|    0    |    d    | 1.2718666195869446 |  1   |\n",
       "|    1    |    c    | 4.015937566757202  |  1   |\n",
       "|    1    |    d    | 3.1318363547325134 |  2   |\n",
       "|    2    |    a    | 2.478732317686081  |  1   |\n",
       "+---------+---------+--------------------+------+\n",
       "[4 rows x 4 columns]"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/html": [
       "<pre>+---------+--------------+-------------------+-----------------------+-------------+</pre>"
      ],
      "text/plain": [
       "+---------+--------------+-------------------+-----------------------+-------------+"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>Optimization Complete: Maximum number of passes through the data reached.</pre>"
      ],
      "text/plain": [
       "Optimization Complete: Maximum number of passes through the data reached."
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>Computing final objective value and training RMSE.</pre>"
      ],
      "text/plain": [
       "Computing final objective value and training RMSE."
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>       Final objective value: 2.79567</pre>"
      ],
      "text/plain": [
       "       Final objective value: 2.79567"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>       Final training RMSE: 1.03992</pre>"
      ],
      "text/plain": [
       "       Final training RMSE: 1.03992"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "m = tc.recommender.create(sf, target='rating')\n",
    "recs = m.recommend()\n",
    "recs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## The CourseTalk dataset: loading and first look\n",
    "\n",
    "Loading of the CourseTalk database."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T06:43:24.922224Z",
     "start_time": "2019-06-15T06:43:24.866432Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre>Materializing SFrame</pre>"
      ],
      "text/plain": [
       "Materializing SFrame"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<html>                 <body>                     <iframe style=\"border:0;margin:0\" width=\"1000\" height=\"1200\" srcdoc='<html lang=\"en\">                         <head>                             <script src=\"https://cdnjs.cloudflare.com/ajax/libs/vega/3.0.8/vega.js\"></script>                             <script src=\"https://cdnjs.cloudflare.com/ajax/libs/vega-embed/3.0.0-rc7/vega-embed.js\"></script>                             <script src=\"https://cdnjs.cloudflare.com/ajax/libs/vega-tooltip/0.5.1/vega-tooltip.min.js\"></script>                             <link rel=\"stylesheet\" type=\"text/css\" href=\"https://cdnjs.cloudflare.com/ajax/libs/vega-tooltip/0.5.1/vega-tooltip.min.css\">                             <style>                             .vega-actions > a{                                 color:white;                                 text-decoration: none;                                 font-family: \"Arial\";                                 cursor:pointer;                                 padding:5px;                                 background:#AAAAAA;                                 border-radius:4px;                                 padding-left:10px;                                 padding-right:10px;                                 margin-right:5px;                             }                             .vega-actions{                                 margin-top:20px;                                 text-align:center                             }                            .vega-actions > a{                                 background:#999999;                            }                             </style>                         </head>                         <body>                             <div id=\"vis\">                             </div>                             <script>                                 var vega_json = \"{\\\"metadata\\\": {\\\"bubbleOpts\\\": {\\\"showAllFields\\\": false, \\\"fields\\\": [{\\\"field\\\": \\\"left\\\"}, {\\\"field\\\": \\\"right\\\"}, {\\\"field\\\": \\\"count\\\"}, {\\\"field\\\": \\\"label\\\"}]}}, \\\"config\\\": {\\\"axis\\\": {\\\"labelPadding\\\": 10, \\\"labelFont\\\": \\\"HelveticaNeue-Light, Arial\\\", \\\"titleColor\\\": \\\"#595959\\\", \\\"titleFont\\\": \\\"HelveticaNeue-Light, Arial\\\", \\\"titleFontWeight\\\": \\\"normal\\\", \\\"labelFontSize\\\": 7, \\\"titleFontSize\\\": 12, \\\"titlePadding\\\": 9, \\\"labelColor\\\": \\\"#595959\\\"}, \\\"axisY\\\": {\\\"minExtent\\\": 30}, \\\"style\\\": {\\\"rect\\\": {\\\"stroke\\\": \\\"rgba(200, 200, 200, 0.5)\\\"}, \\\"group-title\\\": {\\\"fontWeight\\\": \\\"normal\\\", \\\"fontSize\\\": 20, \\\"fill\\\": \\\"#595959\\\", \\\"font\\\": \\\"HelveticaNeue-Light, Arial\\\"}}}, \\\"width\\\": 800, \\\"padding\\\": 8, \\\"data\\\": [{\\\"name\\\": \\\"pts_store\\\"}, {\\\"name\\\": \\\"source_2\\\", \\\"values\\\": [{\\\"num_unique\\\": 2017, \\\"num_row\\\": 2773, \\\"max\\\": 2017.0, \\\"stdev\\\": 555.490133, \\\"median\\\": 1103.0, \\\"a\\\": 0, \\\"min\\\": 1.0, \\\"title\\\": \\\"user_id\\\", \\\"type\\\": \\\"integer\\\", \\\"numeric\\\": [{\\\"left\\\": -30, \\\"count\\\": 152, \\\"right\\\": 74}, {\\\"left\\\": 74, \\\"count\\\": 105, \\\"right\\\": 178}, {\\\"left\\\": 178, \\\"count\\\": 107, \\\"right\\\": 282}, {\\\"left\\\": 282, \\\"count\\\": 107, \\\"right\\\": 386}, {\\\"left\\\": 386, \\\"count\\\": 108, \\\"right\\\": 490}, {\\\"left\\\": 490, \\\"count\\\": 128, \\\"right\\\": 594}, {\\\"left\\\": 594, \\\"count\\\": 146, \\\"right\\\": 698}, {\\\"left\\\": 698, \\\"count\\\": 155, \\\"right\\\": 802}, {\\\"left\\\": 802, \\\"count\\\": 151, \\\"right\\\": 906}, {\\\"left\\\": 906, \\\"count\\\": 107, \\\"right\\\": 1010}, {\\\"left\\\": 1010, \\\"count\\\": 136, \\\"right\\\": 1114}, {\\\"left\\\": 1114, \\\"count\\\": 272, \\\"right\\\": 1218}, {\\\"left\\\": 1218, \\\"count\\\": 192, \\\"right\\\": 1322}, {\\\"left\\\": 1322, \\\"count\\\": 149, \\\"right\\\": 1426}, {\\\"left\\\": 1426, \\\"count\\\": 154, \\\"right\\\": 1530}, {\\\"left\\\": 1530, \\\"count\\\": 186, \\\"right\\\": 1634}, {\\\"left\\\": 1634, \\\"count\\\": 118, \\\"right\\\": 1738}, {\\\"left\\\": 1738, \\\"count\\\": 122, \\\"right\\\": 1842}, {\\\"left\\\": 1842, \\\"count\\\": 106, \\\"right\\\": 1946}, {\\\"left\\\": 1946, \\\"count\\\": 72, \\\"right\\\": 2050}, {\\\"stop\\\": 2050, \\\"step\\\": 104, \\\"start\\\": -30}], \\\"num_missing\\\": 0, \\\"mean\\\": 1020.098089, \\\"categorical\\\": []}, {\\\"num_unique\\\": 214, \\\"num_row\\\": 2773, \\\"max\\\": 214.0, \\\"stdev\\\": 60.457737, \\\"median\\\": 14.0, \\\"a\\\": 1, \\\"min\\\": 1.0, \\\"title\\\": \\\"course_id\\\", \\\"type\\\": \\\"integer\\\", \\\"numeric\\\": [{\\\"left\\\": -2, \\\"count\\\": 1223, \\\"right\\\": 9}, {\\\"left\\\": 9, \\\"count\\\": 316, \\\"right\\\": 20}, {\\\"left\\\": 20, \\\"count\\\": 134, \\\"right\\\": 31}, {\\\"left\\\": 31, \\\"count\\\": 154, \\\"right\\\": 42}, {\\\"left\\\": 42, \\\"count\\\": 93, \\\"right\\\": 53}, {\\\"left\\\": 53, \\\"count\\\": 87, \\\"right\\\": 64}, {\\\"left\\\": 64, \\\"count\\\": 37, \\\"right\\\": 75}, {\\\"left\\\": 75, \\\"count\\\": 11, \\\"right\\\": 86}, {\\\"left\\\": 86, \\\"count\\\": 83, \\\"right\\\": 97}, {\\\"left\\\": 97, \\\"count\\\": 32, \\\"right\\\": 108}, {\\\"left\\\": 108, \\\"count\\\": 134, \\\"right\\\": 119}, {\\\"left\\\": 119, \\\"count\\\": 38, \\\"right\\\": 130}, {\\\"left\\\": 130, \\\"count\\\": 91, \\\"right\\\": 141}, {\\\"left\\\": 141, \\\"count\\\": 87, \\\"right\\\": 152}, {\\\"left\\\": 152, \\\"count\\\": 14, \\\"right\\\": 163}, {\\\"left\\\": 163, \\\"count\\\": 69, \\\"right\\\": 174}, {\\\"left\\\": 174, \\\"count\\\": 50, \\\"right\\\": 185}, {\\\"left\\\": 185, \\\"count\\\": 68, \\\"right\\\": 196}, {\\\"left\\\": 196, \\\"count\\\": 33, \\\"right\\\": 207}, {\\\"left\\\": 207, \\\"count\\\": 19, \\\"right\\\": 218}, {\\\"stop\\\": 218, \\\"step\\\": 11, \\\"start\\\": -2}], \\\"num_missing\\\": 0, \\\"mean\\\": 46.83736, \\\"categorical\\\": []}, {\\\"num_unique\\\": 10, \\\"num_row\\\": 2773, \\\"max\\\": 5.0, \\\"stdev\\\": 0.946545, \\\"median\\\": 5.0, \\\"a\\\": 2, \\\"min\\\": 0.5, \\\"title\\\": \\\"rating\\\", \\\"type\\\": \\\"float\\\", \\\"numeric\\\": [{\\\"left\\\": 0.4554, \\\"count\\\": 30, \\\"right\\\": 0.6858}, {\\\"left\\\": 0.6858, \\\"count\\\": 0, \\\"right\\\": 0.9162}, {\\\"left\\\": 0.9162, \\\"count\\\": 43, \\\"right\\\": 1.1466}, {\\\"left\\\": 1.1466, \\\"count\\\": 0, \\\"right\\\": 1.377}, {\\\"left\\\": 1.377, \\\"count\\\": 23, \\\"right\\\": 1.6074}, {\\\"left\\\": 1.6074, \\\"count\\\": 0, \\\"right\\\": 1.8378}, {\\\"left\\\": 1.8378, \\\"count\\\": 53, \\\"right\\\": 2.0682}, {\\\"left\\\": 2.0682, \\\"count\\\": 0, \\\"right\\\": 2.2986}, {\\\"left\\\": 2.2986, \\\"count\\\": 35, \\\"right\\\": 2.529}, {\\\"left\\\": 2.529, \\\"count\\\": 0, \\\"right\\\": 2.7594}, {\\\"left\\\": 2.7594, \\\"count\\\": 0, \\\"right\\\": 2.9898}, {\\\"left\\\": 2.9898, \\\"count\\\": 80, \\\"right\\\": 3.2202}, {\\\"left\\\": 3.2202, \\\"count\\\": 0, \\\"right\\\": 3.4506}, {\\\"left\\\": 3.4506, \\\"count\\\": 94, \\\"right\\\": 3.681}, {\\\"left\\\": 3.681, \\\"count\\\": 0, \\\"right\\\": 3.9114}, {\\\"left\\\": 3.9114, \\\"count\\\": 285, \\\"right\\\": 4.1418}, {\\\"left\\\": 4.1418, \\\"count\\\": 0, \\\"right\\\": 4.3722}, {\\\"left\\\": 4.3722, \\\"count\\\": 313, \\\"right\\\": 4.6026}, {\\\"left\\\": 4.6026, \\\"count\\\": 0, \\\"right\\\": 4.833}, {\\\"left\\\": 4.833, \\\"count\\\": 1817, \\\"right\\\": 5.0634}, {\\\"stop\\\": 5.0634, \\\"step\\\": 0.2304, \\\"start\\\": 0.4554}], \\\"num_missing\\\": 0, \\\"mean\\\": 4.503606, \\\"categorical\\\": []}]}, {\\\"transform\\\": [{\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"20\\\", \\\"as\\\": \\\"c_x_axis_back\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+66\\\", \\\"as\\\": \\\"c_main_background\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+43\\\", \\\"as\\\": \\\"c_top_bar\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+59\\\", \\\"as\\\": \\\"c_top_title\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+58\\\", \\\"as\\\": \\\"c_top_type\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+178\\\", \\\"as\\\": \\\"c_rule\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+106\\\", \\\"as\\\": \\\"c_num_rows\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+130\\\", \\\"as\\\": \\\"c_num_unique\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+154\\\", \\\"as\\\": \\\"c_missing\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+105\\\", \\\"as\\\": \\\"c_num_rows_val\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+130\\\", \\\"as\\\": \\\"c_num_unique_val\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+154\\\", \\\"as\\\": \\\"c_missing_val\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+195\\\", \\\"as\\\": \\\"c_frequent_items\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+218\\\", \\\"as\\\": \\\"c_first_item\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+235\\\", \\\"as\\\": \\\"c_second_item\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+252\\\", \\\"as\\\": \\\"c_third_item\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+269\\\", \\\"as\\\": \\\"c_fourth_item\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+286\\\", \\\"as\\\": \\\"c_fifth_item\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+200\\\", \\\"as\\\": \\\"c_mean\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+220\\\", \\\"as\\\": \\\"c_min\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+240\\\", \\\"as\\\": \\\"c_max\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+260\\\", \\\"as\\\": \\\"c_median\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+280\\\", \\\"as\\\": \\\"c_stdev\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+198\\\", \\\"as\\\": \\\"c_mean_val\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+218\\\", \\\"as\\\": \\\"c_min_val\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+238\\\", \\\"as\\\": \\\"c_max_val\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+258\\\", \\\"as\\\": \\\"c_median_val\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+278\\\", \\\"as\\\": \\\"c_stdev_val\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+106\\\", \\\"as\\\": \\\"graph_offset\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+132\\\", \\\"as\\\": \\\"graph_offset_categorical\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"(toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"integer\\\\\\\" || toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"float\\\\\\\")?false:true\\\", \\\"as\\\": \\\"c_clip_val\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"(toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"integer\\\\\\\" || toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"float\\\\\\\")?250:0\\\", \\\"as\\\": \\\"c_width_numeric_val\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"(toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"str\\\\\\\")?false:true\\\", \\\"as\\\": \\\"c_clip_val_cat\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"(toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"str\\\\\\\")?250:0\\\", \\\"as\\\": \\\"c_width_numeric_val_cat\\\"}], \\\"name\\\": \\\"data_2\\\", \\\"source\\\": \\\"source_2\\\"}], \\\"height\\\": 980, \\\"$schema\\\": \\\"https://vega.github.io/schema/vega/v4.json\\\", \\\"marks\\\": [{\\\"type\\\": \\\"group\\\", \\\"encode\\\": {\\\"enter\\\": {\\\"strokeWidth\\\": {\\\"value\\\": 0}, \\\"clip\\\": {\\\"value\\\": 0}, \\\"fill\\\": {\\\"value\\\": \\\"#ffffff\\\"}, \\\"width\\\": {\\\"value\\\": 734}, \\\"y\\\": {\\\"value\\\": 0}, \\\"fillOpacity\\\": {\\\"value\\\": 0}, \\\"stroke\\\": {\\\"value\\\": \\\"#000000\\\"}, \\\"height\\\": {\\\"value\\\": 366}, \\\"x\\\": {\\\"value\\\": 0}}}, \\\"marks\\\": [{\\\"type\\\": \\\"group\\\", \\\"encode\\\": {\\\"enter\\\": {\\\"strokeWidth\\\": {\\\"value\\\": 0}, \\\"clip\\\": {\\\"value\\\": 0}, \\\"fill\\\": {\\\"value\\\": \\\"#ffffff\\\"}, \\\"width\\\": {\\\"value\\\": 734}, \\\"y\\\": {\\\"value\\\": 0}, \\\"fillOpacity\\\": {\\\"value\\\": 0}, \\\"stroke\\\": {\\\"value\\\": \\\"#000000\\\"}, \\\"height\\\": {\\\"value\\\": 366}, \\\"x\\\": {\\\"value\\\": 0}}}, \\\"axes\\\": [], \\\"scales\\\": [], \\\"marks\\\": [{\\\"type\\\": \\\"rect\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_main_background\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]\\\"}}, \\\"enter\\\": {\\\"strokeWidth\\\": {\\\"value\\\": 0.5}, \\\"fill\\\": {\\\"value\\\": \\\"#FEFEFE\\\"}, \\\"width\\\": {\\\"value\\\": 700}, \\\"y\\\": {\\\"value\\\": 66}, \\\"fillOpacity\\\": {\\\"value\\\": 1}, \\\"stroke\\\": {\\\"value\\\": \\\"#DEDEDE\\\"}, \\\"height\\\": {\\\"value\\\": 250}, \\\"x\\\": {\\\"value\\\": 33}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"rect\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_top_bar\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]\\\"}}, \\\"enter\\\": {\\\"strokeWidth\\\": {\\\"value\\\": 0.5}, \\\"fill\\\": {\\\"value\\\": \\\"#F5F5F5\\\"}, \\\"width\\\": {\\\"value\\\": 700}, \\\"y\\\": {\\\"value\\\": 43}, \\\"fillOpacity\\\": {\\\"value\\\": 1}, \\\"stroke\\\": {\\\"value\\\": \\\"#DEDEDE\\\"}, \\\"height\\\": {\\\"value\\\": 30}, \\\"x\\\": {\\\"value\\\": 33}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_top_type\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+687\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#595859\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 58}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"&apos;&apos;+datum[\\\\\\\"type\\\\\\\"]\\\"}, \\\"x\\\": {\\\"value\\\": 720}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"right\\\"}, \\\"fontSize\\\": {\\\"value\\\": 12}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_top_title\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+11\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#9B9B9B\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 59}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"&apos;&apos;+datum[\\\\\\\"title\\\\\\\"]\\\"}, \\\"x\\\": {\\\"value\\\": 44}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"left\\\"}, \\\"fontSize\\\": {\\\"value\\\": 15}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"rule\\\", \\\"encode\\\": {\\\"update\\\": {\\\"x2\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+687\\\"}, \\\"y\\\": {\\\"field\\\": \\\"c_rule\\\"}, \\\"y2\\\": {\\\"field\\\": \\\"c_rule\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+467\\\"}}, \\\"enter\\\": {\\\"strokeWidth\\\": {\\\"value\\\": 1}, \\\"y2\\\": {\\\"value\\\": 178}, \\\"x2\\\": {\\\"value\\\": 720}, \\\"y\\\": {\\\"value\\\": 178}, \\\"strokeCap\\\": {\\\"value\\\": \\\"butt\\\"}, \\\"stroke\\\": {\\\"value\\\": \\\"#EDEDEB\\\"}, \\\"x\\\": {\\\"value\\\": 500}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_num_rows\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+467\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#4A4A4A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 106}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"value\\\": \\\"Num. Rows:\\\"}, \\\"x\\\": {\\\"value\\\": 500}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"left\\\"}, \\\"fontSize\\\": {\\\"value\\\": 12}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_num_unique\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+467\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#4A4A4A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 130}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"value\\\": \\\"Num. Unique:\\\"}, \\\"x\\\": {\\\"value\\\": 500}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"left\\\"}, \\\"fontSize\\\": {\\\"value\\\": 12}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_missing\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+467\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#4A4A4A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 154}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"value\\\": \\\"Missing:\\\"}, \\\"x\\\": {\\\"value\\\": 500}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"left\\\"}, \\\"fontSize\\\": {\\\"value\\\": 12}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_num_rows_val\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+667\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#5A5A5A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 105}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"toString(format(datum[\\\\\\\"num_row\\\\\\\"], \\\\\\\",\\\\\\\"))\\\"}, \\\"x\\\": {\\\"value\\\": 700}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"right\\\"}, \\\"fontSize\\\": {\\\"value\\\": 12}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_num_unique_val\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+667\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#5A5A5A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 130}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"toString(format(datum[\\\\\\\"num_unique\\\\\\\"], \\\\\\\",\\\\\\\"))\\\"}, \\\"x\\\": {\\\"value\\\": 700}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"right\\\"}, \\\"fontSize\\\": {\\\"value\\\": 12}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_missing_val\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+667\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#5A5A5A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 154}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"toString(format(datum[\\\\\\\"num_missing\\\\\\\"], \\\\\\\",\\\\\\\"))\\\"}, \\\"x\\\": {\\\"value\\\": 700}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"right\\\"}, \\\"fontSize\\\": {\\\"value\\\": 12}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_frequent_items\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+467\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#4A4A4A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 200}, \\\"fontWeight\\\": {\\\"value\\\": \\\"bold\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"(toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"str\\\\\\\")? \\\\\\\"Frequent Items\\\\\\\":\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 500}, \\\"clip\\\": {\\\"value\\\": true}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"left\\\"}, \\\"fontSize\\\": {\\\"value\\\": 11}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_first_item\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+487\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#4A4A4A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 200}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"((datum[\\\\\\\"categorical\\\\\\\"].length >= 1) &amp;&amp; (toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"str\\\\\\\"))? toString(datum[\\\\\\\"categorical\\\\\\\"][0][\\\\\\\"label\\\\\\\"]):\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 520}, \\\"clip\\\": {\\\"value\\\": true}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"left\\\"}, \\\"fontSize\\\": {\\\"value\\\": 11}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_second_item\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+487\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#4A4A4A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 200}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"((datum[\\\\\\\"categorical\\\\\\\"].length >= 2) &amp;&amp; (toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"str\\\\\\\"))? toString(datum[\\\\\\\"categorical\\\\\\\"][1][\\\\\\\"label\\\\\\\"]):\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 520}, \\\"clip\\\": {\\\"value\\\": true}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"left\\\"}, \\\"fontSize\\\": {\\\"value\\\": 11}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_third_item\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+487\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#4A4A4A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 200}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"((datum[\\\\\\\"categorical\\\\\\\"].length >= 3) &amp;&amp; (toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"str\\\\\\\"))? toString(datum[\\\\\\\"categorical\\\\\\\"][2][\\\\\\\"label\\\\\\\"]):\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 520}, \\\"clip\\\": {\\\"value\\\": true}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"left\\\"}, \\\"fontSize\\\": {\\\"value\\\": 11}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_fourth_item\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+487\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#4A4A4A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 200}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"((datum[\\\\\\\"categorical\\\\\\\"].length >= 4) &amp;&amp; (toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"str\\\\\\\"))? toString(datum[\\\\\\\"categorical\\\\\\\"][3][\\\\\\\"label\\\\\\\"]):\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 520}, \\\"clip\\\": {\\\"value\\\": true}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"left\\\"}, \\\"fontSize\\\": {\\\"value\\\": 11}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_fifth_item\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+487\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#4A4A4A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 200}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"((datum[\\\\\\\"categorical\\\\\\\"].length >= 5) &amp;&amp; (toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"str\\\\\\\"))? toString(datum[\\\\\\\"categorical\\\\\\\"][4][\\\\\\\"label\\\\\\\"]):\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 520}, \\\"clip\\\": {\\\"value\\\": true}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"left\\\"}, \\\"fontSize\\\": {\\\"value\\\": 11}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_first_item\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+667\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#7A7A7A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 200}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"((datum[\\\\\\\"categorical\\\\\\\"].length >= 1) &amp;&amp; (toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"str\\\\\\\"))? toString(datum[\\\\\\\"categorical\\\\\\\"][0][\\\\\\\"count\\\\\\\"]):\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 700}, \\\"clip\\\": {\\\"value\\\": true}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"right\\\"}, \\\"fontSize\\\": {\\\"value\\\": 11}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_second_item\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+667\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#7A7A7A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 200}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"((datum[\\\\\\\"categorical\\\\\\\"].length >= 2) &amp;&amp; (toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"str\\\\\\\"))? toString(datum[\\\\\\\"categorical\\\\\\\"][1][\\\\\\\"count\\\\\\\"]):\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 700}, \\\"clip\\\": {\\\"value\\\": true}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"right\\\"}, \\\"fontSize\\\": {\\\"value\\\": 10}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_third_item\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+667\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#7A7A7A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 200}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"((datum[\\\\\\\"categorical\\\\\\\"].length >= 3) &amp;&amp; (toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"str\\\\\\\"))? toString(datum[\\\\\\\"categorical\\\\\\\"][2][\\\\\\\"count\\\\\\\"]):\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 700}, \\\"clip\\\": {\\\"value\\\": true}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"right\\\"}, \\\"fontSize\\\": {\\\"value\\\": 10}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_fourth_item\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+667\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#7A7A7A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 200}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"((datum[\\\\\\\"categorical\\\\\\\"].length >= 4) &amp;&amp; (toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"str\\\\\\\"))? toString(datum[\\\\\\\"categorical\\\\\\\"][3][\\\\\\\"count\\\\\\\"]):\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 700}, \\\"clip\\\": {\\\"value\\\": true}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"right\\\"}, \\\"fontSize\\\": {\\\"value\\\": 10}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_fifth_item\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+667\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#7A7A7A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 200}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"((datum[\\\\\\\"categorical\\\\\\\"].length >= 5) &amp;&amp; (toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"str\\\\\\\"))? toString(datum[\\\\\\\"categorical\\\\\\\"][4][\\\\\\\"count\\\\\\\"]):\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 700}, \\\"clip\\\": {\\\"value\\\": true}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"right\\\"}, \\\"fontSize\\\": {\\\"value\\\": 10}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_mean\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+467\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#4A4A4A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 200}, \\\"fontWeight\\\": {\\\"value\\\": \\\"bold\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"(toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"integer\\\\\\\" || toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"float\\\\\\\")? \\\\\\\"Mean:\\\\\\\":\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 500}, \\\"clip\\\": {\\\"value\\\": true}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"left\\\"}, \\\"fontSize\\\": {\\\"value\\\": 11}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_min\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+467\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#4A4A4A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 220}, \\\"fontWeight\\\": {\\\"value\\\": \\\"bold\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"(toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"integer\\\\\\\" || toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"float\\\\\\\")? \\\\\\\"Min:\\\\\\\":\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 500}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"left\\\"}, \\\"fontSize\\\": {\\\"value\\\": 11}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_max\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+467\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#4A4A4A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 240}, \\\"fontWeight\\\": {\\\"value\\\": \\\"bold\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"(toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"integer\\\\\\\" || toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"float\\\\\\\")? \\\\\\\"Max:\\\\\\\":\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 500}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"left\\\"}, \\\"fontSize\\\": {\\\"value\\\": 11}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_median\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+467\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#4A4A4A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 260}, \\\"fontWeight\\\": {\\\"value\\\": \\\"bold\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"(toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"integer\\\\\\\" || toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"float\\\\\\\")? \\\\\\\"Median:\\\\\\\":\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 500}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"left\\\"}, \\\"fontSize\\\": {\\\"value\\\": 11}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_stdev\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+467\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#4A4A4A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 280}, \\\"fontWeight\\\": {\\\"value\\\": \\\"bold\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"(toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"integer\\\\\\\" || toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"float\\\\\\\")? \\\\\\\"St. Dev:\\\\\\\":\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 500}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"left\\\"}, \\\"fontSize\\\": {\\\"value\\\": 11}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_mean_val\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+667\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#6A6A6A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 198}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"(toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"integer\\\\\\\" || toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"float\\\\\\\")?toString(format(datum[\\\\\\\"mean\\\\\\\"], \\\\\\\",\\\\\\\")):\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 700}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"right\\\"}, \\\"fontSize\\\": {\\\"value\\\": 10}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_min_val\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+667\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#6A6A6A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 218}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"(toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"integer\\\\\\\" || toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"float\\\\\\\")?toString(format(datum[\\\\\\\"min\\\\\\\"], \\\\\\\",\\\\\\\")):\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 700}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"right\\\"}, \\\"fontSize\\\": {\\\"value\\\": 10}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_max_val\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+667\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#6A6A6A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 238}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"(toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"integer\\\\\\\" || toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"float\\\\\\\")?toString(format(datum[\\\\\\\"max\\\\\\\"], \\\\\\\",\\\\\\\")):\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 700}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"right\\\"}, \\\"fontSize\\\": {\\\"value\\\": 10}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_median_val\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+667\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#6A6A6A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 258}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"(toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"integer\\\\\\\" || toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"float\\\\\\\")?toString(format(datum[\\\\\\\"median\\\\\\\"], \\\\\\\",\\\\\\\")):\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 700}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"right\\\"}, \\\"fontSize\\\": {\\\"value\\\": 10}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_stdev_val\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+667\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#6A6A6A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 278}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"(toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"integer\\\\\\\" || toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"float\\\\\\\")?toString(format(datum[\\\\\\\"stdev\\\\\\\"], \\\\\\\",\\\\\\\")):\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 700}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"right\\\"}, \\\"fontSize\\\": {\\\"value\\\": 10}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"marks\\\": [{\\\"name\\\": \\\"marks\\\", \\\"encode\\\": {\\\"hover\\\": {\\\"fill\\\": {\\\"value\\\": \\\"#7EC2F3\\\"}}, \\\"update\\\": {\\\"x2\\\": {\\\"scale\\\": \\\"x\\\", \\\"field\\\": \\\"right\\\"}, \\\"y\\\": {\\\"scale\\\": \\\"y\\\", \\\"field\\\": \\\"count\\\"}, \\\"y2\\\": {\\\"scale\\\": \\\"y\\\", \\\"value\\\": 0}, \\\"x\\\": {\\\"scale\\\": \\\"x\\\", \\\"field\\\": \\\"left\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#108EE9\\\"}}}, \\\"style\\\": [\\\"rect\\\"], \\\"from\\\": {\\\"data\\\": \\\"new_data\\\"}, \\\"type\\\": \\\"rect\\\"}], \\\"type\\\": \\\"group\\\", \\\"axes\\\": [{\\\"zindex\\\": 1, \\\"scale\\\": \\\"x\\\", \\\"orient\\\": \\\"bottom\\\", \\\"labelOverlap\\\": true, \\\"tickCount\\\": {\\\"signal\\\": \\\"ceil(width/40)\\\"}, \\\"title\\\": \\\"Values\\\"}, {\\\"labels\\\": false, \\\"scale\\\": \\\"x\\\", \\\"orient\\\": \\\"bottom\\\", \\\"grid\\\": true, \\\"gridScale\\\": \\\"y\\\", \\\"zindex\\\": 0, \\\"maxExtent\\\": 0, \\\"tickCount\\\": {\\\"signal\\\": \\\"ceil(width/40)\\\"}, \\\"ticks\\\": false, \\\"minExtent\\\": 0, \\\"domain\\\": false}, {\\\"zindex\\\": 1, \\\"scale\\\": \\\"y\\\", \\\"orient\\\": \\\"left\\\", \\\"labelOverlap\\\": true, \\\"tickCount\\\": {\\\"signal\\\": \\\"ceil(height/40)\\\"}, \\\"title\\\": \\\"Count\\\"}, {\\\"labels\\\": false, \\\"scale\\\": \\\"y\\\", \\\"orient\\\": \\\"left\\\", \\\"grid\\\": true, \\\"gridScale\\\": \\\"x\\\", \\\"zindex\\\": 0, \\\"maxExtent\\\": 0, \\\"tickCount\\\": {\\\"signal\\\": \\\"ceil(height/40)\\\"}, \\\"ticks\\\": false, \\\"minExtent\\\": 0, \\\"domain\\\": false}], \\\"style\\\": \\\"cell\\\", \\\"from\\\": {\\\"facet\\\": {\\\"name\\\": \\\"new_data\\\", \\\"data\\\": \\\"data_2\\\", \\\"field\\\": \\\"numeric\\\"}}, \\\"encode\\\": {\\\"update\\\": {\\\"clip\\\": {\\\"field\\\": \\\"c_clip_val\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+87\\\"}, \\\"width\\\": {\\\"field\\\": \\\"c_width_numeric_val\\\"}}, \\\"enter\\\": {\\\"strokeWidth\\\": {\\\"value\\\": 0}, \\\"fill\\\": {\\\"value\\\": \\\"#ffffff\\\"}, \\\"width\\\": {\\\"value\\\": 250}, \\\"y\\\": {\\\"field\\\": \\\"graph_offset\\\"}, \\\"fillOpacity\\\": {\\\"value\\\": 0}, \\\"stroke\\\": {\\\"value\\\": \\\"#000000\\\"}, \\\"height\\\": {\\\"value\\\": 150}, \\\"x\\\": {\\\"value\\\": 120}}}, \\\"scales\\\": [{\\\"type\\\": \\\"linear\\\", \\\"range\\\": [0, {\\\"signal\\\": \\\"width\\\"}], \\\"name\\\": \\\"x\\\", \\\"domain\\\": {\\\"fields\\\": [\\\"left\\\", \\\"right\\\"], \\\"data\\\": \\\"new_data\\\", \\\"sort\\\": true}, \\\"zero\\\": true, \\\"nice\\\": true}, {\\\"type\\\": \\\"linear\\\", \\\"range\\\": [{\\\"signal\\\": \\\"height\\\"}, 0], \\\"name\\\": \\\"y\\\", \\\"domain\\\": {\\\"data\\\": \\\"new_data\\\", \\\"field\\\": \\\"count\\\"}, \\\"zero\\\": true, \\\"nice\\\": true}], \\\"signals\\\": [{\\\"name\\\": \\\"width\\\", \\\"update\\\": \\\"250\\\"}, {\\\"name\\\": \\\"height\\\", \\\"update\\\": \\\"150\\\"}]}, {\\\"type\\\": \\\"group\\\", \\\"axes\\\": [{\\\"zindex\\\": 1, \\\"scale\\\": \\\"x\\\", \\\"orient\\\": \\\"top\\\", \\\"labelOverlap\\\": true, \\\"tickCount\\\": {\\\"signal\\\": \\\"ceil(width/40)\\\"}, \\\"title\\\": \\\"Count\\\"}, {\\\"labels\\\": false, \\\"scale\\\": \\\"x\\\", \\\"orient\\\": \\\"top\\\", \\\"grid\\\": true, \\\"gridScale\\\": \\\"y\\\", \\\"zindex\\\": 0, \\\"maxExtent\\\": 0, \\\"tickCount\\\": {\\\"signal\\\": \\\"ceil(width/40)\\\"}, \\\"ticks\\\": false, \\\"minExtent\\\": 0, \\\"domain\\\": false}, {\\\"scale\\\": \\\"y\\\", \\\"orient\\\": \\\"left\\\", \\\"title\\\": \\\"Label\\\", \\\"labelOverlap\\\": true, \\\"zindex\\\": 1}], \\\"style\\\": \\\"cell\\\", \\\"from\\\": {\\\"facet\\\": {\\\"name\\\": \\\"data_5\\\", \\\"data\\\": \\\"data_2\\\", \\\"field\\\": \\\"categorical\\\"}}, \\\"scales\\\": [{\\\"type\\\": \\\"linear\\\", \\\"range\\\": [0, 250], \\\"name\\\": \\\"x\\\", \\\"domain\\\": {\\\"data\\\": \\\"data_5\\\", \\\"field\\\": \\\"count\\\"}, \\\"zero\\\": true, \\\"nice\\\": true}, {\\\"paddingInner\\\": 0.1, \\\"type\\\": \\\"band\\\", \\\"paddingOuter\\\": 0.05, \\\"range\\\": [150, 0], \\\"name\\\": \\\"y\\\", \\\"domain\\\": {\\\"data\\\": \\\"data_5\\\", \\\"field\\\": \\\"label\\\", \\\"sort\\\": {\\\"field\\\": \\\"label_idx\\\", \\\"order\\\": \\\"descending\\\", \\\"op\\\": \\\"mean\\\"}}}], \\\"encode\\\": {\\\"update\\\": {\\\"clip\\\": {\\\"field\\\": \\\"c_clip_val_cat\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+137\\\"}, \\\"width\\\": {\\\"field\\\": \\\"c_width_numeric_val_cat\\\"}}, \\\"enter\\\": {\\\"strokeWidth\\\": {\\\"value\\\": 0}, \\\"fill\\\": {\\\"value\\\": \\\"#ffffff\\\"}, \\\"width\\\": {\\\"value\\\": 250}, \\\"y\\\": {\\\"field\\\": \\\"graph_offset_categorical\\\"}, \\\"fillOpacity\\\": {\\\"value\\\": 0}, \\\"stroke\\\": {\\\"value\\\": \\\"#000000\\\"}, \\\"height\\\": {\\\"value\\\": 150}, \\\"x\\\": {\\\"value\\\": 170}}}, \\\"marks\\\": [{\\\"name\\\": \\\"marks\\\", \\\"encode\\\": {\\\"hover\\\": {\\\"fill\\\": {\\\"value\\\": \\\"#7EC2F3\\\"}}, \\\"update\\\": {\\\"x2\\\": {\\\"scale\\\": \\\"x\\\", \\\"value\\\": 0}, \\\"y\\\": {\\\"scale\\\": \\\"y\\\", \\\"field\\\": \\\"label\\\"}, \\\"height\\\": {\\\"scale\\\": \\\"y\\\", \\\"band\\\": true}, \\\"x\\\": {\\\"scale\\\": \\\"x\\\", \\\"field\\\": \\\"count\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#108EE9\\\"}}}, \\\"style\\\": [\\\"bar\\\"], \\\"from\\\": {\\\"data\\\": \\\"data_5\\\"}, \\\"type\\\": \\\"rect\\\"}], \\\"signals\\\": [{\\\"name\\\": \\\"unit\\\", \\\"on\\\": [{\\\"events\\\": \\\"mousemove\\\", \\\"update\\\": \\\"isTuple(group()) ? group() : unit\\\"}], \\\"value\\\": {}}, {\\\"name\\\": \\\"pts\\\", \\\"update\\\": \\\"data(\\\\\\\"pts_store\\\\\\\").length &amp;&amp; {count: data(\\\\\\\"pts_store\\\\\\\")[0].values[0]}\\\"}, {\\\"name\\\": \\\"pts_tuple\\\", \\\"on\\\": [{\\\"events\\\": [{\\\"type\\\": \\\"click\\\", \\\"source\\\": \\\"scope\\\"}], \\\"force\\\": true, \\\"update\\\": \\\"datum &amp;&amp; item().mark.marktype !== &apos;group&apos; ? {unit: \\\\\\\"\\\\\\\", encodings: [\\\\\\\"x\\\\\\\"], fields: [\\\\\\\"count\\\\\\\"], values: [datum[\\\\\\\"count\\\\\\\"]]} : null\\\"}], \\\"value\\\": {}}, {\\\"name\\\": \\\"pts_modify\\\", \\\"on\\\": [{\\\"events\\\": {\\\"signal\\\": \\\"pts_tuple\\\"}, \\\"update\\\": \\\"modify(\\\\\\\"pts_store\\\\\\\", pts_tuple, true)\\\"}]}]}]}]}]}\";                                 var vega_json_parsed = JSON.parse(vega_json);                                 var toolTipOpts = {                                     showAllFields: true                                 };                                 if(vega_json_parsed[\"metadata\"] != null){                                     if(vega_json_parsed[\"metadata\"][\"bubbleOpts\"] != null){                                         toolTipOpts = vega_json_parsed[\"metadata\"][\"bubbleOpts\"];                                     };                                 };                                 vegaEmbed(\"#vis\", vega_json_parsed).then(function (result) {                                     vegaTooltip.vega(result.view, toolTipOpts);                                  });                             </script>                         </body>                     </html>' src=\"demo_iframe_srcdoc.htm\">                         <p>Your browser does not support iframes.</p>                     </iframe>                 </body>             </html>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "#train_file = 'http://s3.amazonaws.com/dato-datasets/millionsong/10000.txt'\n",
    "train_file = '../data/ratings.dat'\n",
    "sf = tc.SFrame.read_csv(train_file, header=False, \n",
    "                        delimiter='|', verbose=False)\n",
    "sf = sf.rename({'X1':'user_id', 'X2':'course_id', 'X3':'rating'})\n",
    "sf.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "In order to evaluate the performance of our model, we randomly split the observations in our data set into two partitions: we will use `train_set` when creating our model and `test_set` for evaluating its performance."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T06:44:10.686617Z",
     "start_time": "2019-06-15T06:44:10.673786Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\"><table frame=\"box\" rules=\"cols\">\n",
       "    <tr>\n",
       "        <th style=\"padding-left: 1em; padding-right: 1em; text-align: center\">user_id</th>\n",
       "        <th style=\"padding-left: 1em; padding-right: 1em; text-align: center\">course_id</th>\n",
       "        <th style=\"padding-left: 1em; padding-right: 1em; text-align: center\">rating</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">1</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">1</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">5.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">2</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">1</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">5.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">3</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">1</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">5.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">4</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">1</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">5.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">5</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">1</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">5.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">6</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">1</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">5.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">7</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">1</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">5.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">8</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">1</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">5.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">9</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">1</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">5.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">10</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">1</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">5.0</td>\n",
       "    </tr>\n",
       "</table>\n",
       "[2773 rows x 3 columns]<br/>Note: Only the head of the SFrame is printed.<br/>You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.\n",
       "</div>"
      ],
      "text/plain": [
       "Columns:\n",
       "\tuser_id\tint\n",
       "\tcourse_id\tint\n",
       "\trating\tfloat\n",
       "\n",
       "Rows: 2773\n",
       "\n",
       "Data:\n",
       "+---------+-----------+--------+\n",
       "| user_id | course_id | rating |\n",
       "+---------+-----------+--------+\n",
       "|    1    |     1     |  5.0   |\n",
       "|    2    |     1     |  5.0   |\n",
       "|    3    |     1     |  5.0   |\n",
       "|    4    |     1     |  5.0   |\n",
       "|    5    |     1     |  5.0   |\n",
       "|    6    |     1     |  5.0   |\n",
       "|    7    |     1     |  5.0   |\n",
       "|    8    |     1     |  5.0   |\n",
       "|    9    |     1     |  5.0   |\n",
       "|    10   |     1     |  5.0   |\n",
       "+---------+-----------+--------+\n",
       "[2773 rows x 3 columns]\n",
       "Note: Only the head of the SFrame is printed.\n",
       "You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns."
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sf"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T06:44:23.226685Z",
     "start_time": "2019-06-15T06:44:23.223365Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "train_set, test_set = sf.random_split(0.8, seed=1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "# Popularity model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Create a model that makes recommendations using item popularity. When no target column is provided, the popularity is determined by the number of observations involving each item. When a target is provided, popularity is computed using the item’s mean target value. When the target column contains ratings, for example, the model computes the mean rating for each item and uses this to rank items for recommendations.\n",
    "\n",
    "One typically wants to initially create a simple recommendation system that can be used as a baseline and to verify that the rest of the pipeline works as expected. The `recommender` package has several models available for this purpose. For example, we can create a model that predicts songs based on their overall popularity across all users.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T06:44:57.355433Z",
     "start_time": "2019-06-15T06:44:57.334939Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre>Preparing data set.</pre>"
      ],
      "text/plain": [
       "Preparing data set."
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>    Data has 2202 observations with 1651 users and 201 items.</pre>"
      ],
      "text/plain": [
       "    Data has 2202 observations with 1651 users and 201 items."
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>    Data prepared in: 0.006547s</pre>"
      ],
      "text/plain": [
       "    Data prepared in: 0.006547s"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>2202 observations to process; with 201 unique items.</pre>"
      ],
      "text/plain": [
       "2202 observations to process; with 201 unique items."
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "popularity_model = tc.popularity_recommender.create(train_set, 'user_id', 'course_id', target = 'rating')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "# Item similarity Model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "* [Collaborative filtering](http://en.wikipedia.org/wiki/Collaborative_filtering) methods make predictions for a given user based on the patterns of other users' activities. One common technique is to compare items based on their [Jaccard](http://en.wikipedia.org/wiki/Jaccard_index) similarity.This measurement is a ratio: the number of items they have in common, over the total number of distinct items in both sets.\n",
    "* We could also have used another slightly more complicated similarity measurement, called [Cosine Similarity](http://en.wikipedia.org/wiki/Cosine_similarity). \n",
    "\n",
    "If your data is implicit, i.e., you only observe interactions between users and items, without a rating, then use ItemSimilarityModel with Jaccard similarity.  \n",
    "\n",
    "If your data is explicit, i.e., the observations include an actual rating given by the user, then you have a wide array of options.  ItemSimilarityModel with cosine or Pearson similarity can incorporate ratings.  In addition, MatrixFactorizationModel, FactorizationModel, as well as LinearRegressionModel all support rating prediction.  \n",
    "\n",
    "#### Now data contains three columns: ‘user_id’, ‘item_id’, and ‘rating’.\n",
    "\n",
    "itemsim_cosine_model = graphlab.recommender.create(data, \n",
    "       target=’rating’, \n",
    "       method=’item_similarity’, \n",
    "       similarity_type=’cosine’)\n",
    "       \n",
    "factorization_machine_model = graphlab.recommender.create(data, \n",
    "       target=’rating’, \n",
    "       method=’factorization_model’)\n",
    "\n",
    "\n",
    "In the following code block, we compute all the item-item similarities and create an object that can be used for recommendations."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T06:45:29.852827Z",
     "start_time": "2019-06-15T06:45:29.832716Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre>Preparing data set.</pre>"
      ],
      "text/plain": [
       "Preparing data set."
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>    Data has 2202 observations with 1651 users and 201 items.</pre>"
      ],
      "text/plain": [
       "    Data has 2202 observations with 1651 users and 201 items."
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>    Data prepared in: 0.006935s</pre>"
      ],
      "text/plain": [
       "    Data prepared in: 0.006935s"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>Training model from provided data.</pre>"
      ],
      "text/plain": [
       "Training model from provided data."
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>Gathering per-item and per-user statistics.</pre>"
      ],
      "text/plain": [
       "Gathering per-item and per-user statistics."
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>+--------------------------------+------------+</pre>"
      ],
      "text/plain": [
       "+--------------------------------+------------+"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| Elapsed Time (Item Statistics) | % Complete |</pre>"
      ],
      "text/plain": [
       "| Elapsed Time (Item Statistics) | % Complete |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>+--------------------------------+------------+</pre>"
      ],
      "text/plain": [
       "+--------------------------------+------------+"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| 2.388ms                        | 60.5       |</pre>"
      ],
      "text/plain": [
       "| 2.388ms                        | 60.5       |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| 2.607ms                        | 100        |</pre>"
      ],
      "text/plain": [
       "| 2.607ms                        | 100        |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>+--------------------------------+------------+</pre>"
      ],
      "text/plain": [
       "+--------------------------------+------------+"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>Setting up lookup tables.</pre>"
      ],
      "text/plain": [
       "Setting up lookup tables."
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>Processing data in one pass using dense lookup tables.</pre>"
      ],
      "text/plain": [
       "Processing data in one pass using dense lookup tables."
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>+-------------------------------------+------------------+-----------------+</pre>"
      ],
      "text/plain": [
       "+-------------------------------------+------------------+-----------------+"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| Elapsed Time (Constructing Lookups) | Total % Complete | Items Processed |</pre>"
      ],
      "text/plain": [
       "| Elapsed Time (Constructing Lookups) | Total % Complete | Items Processed |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>+-------------------------------------+------------------+-----------------+</pre>"
      ],
      "text/plain": [
       "+-------------------------------------+------------------+-----------------+"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| 3.247ms                             | 0                | 0               |</pre>"
      ],
      "text/plain": [
       "| 3.247ms                             | 0                | 0               |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| 6.225ms                             | 100              | 201             |</pre>"
      ],
      "text/plain": [
       "| 6.225ms                             | 100              | 201             |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>+-------------------------------------+------------------+-----------------+</pre>"
      ],
      "text/plain": [
       "+-------------------------------------+------------------+-----------------+"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>Finalizing lookup tables.</pre>"
      ],
      "text/plain": [
       "Finalizing lookup tables."
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>Generating candidate set for working with new users.</pre>"
      ],
      "text/plain": [
       "Generating candidate set for working with new users."
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>Finished training in 0.008353s</pre>"
      ],
      "text/plain": [
       "Finished training in 0.008353s"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "item_sim_model = tc.item_similarity_recommender.create(\n",
    "    train_set, 'user_id', 'course_id', target = 'rating', \n",
    "    similarity_type='cosine')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "# Factorization Recommender Model\n",
    "Create a FactorizationRecommender that learns latent factors for each user and item and uses them to make rating predictions. This includes both standard matrix factorization as well as factorization machines models (in the situation where side data is available for users and/or items). [link](https://dato.com/products/create/docs/generated/graphlab.recommender.factorization_recommender.create.html#graphlab.recommender.factorization_recommender.create)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T06:45:56.902159Z",
     "start_time": "2019-06-15T06:45:54.601338Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre>Preparing data set.</pre>"
      ],
      "text/plain": [
       "Preparing data set."
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>    Data has 2202 observations with 1651 users and 201 items.</pre>"
      ],
      "text/plain": [
       "    Data has 2202 observations with 1651 users and 201 items."
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>    Data prepared in: 0.006938s</pre>"
      ],
      "text/plain": [
       "    Data prepared in: 0.006938s"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>Training factorization_recommender for recommendations.</pre>"
      ],
      "text/plain": [
       "Training factorization_recommender for recommendations."
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>+--------------------------------+--------------------------------------------------+----------+</pre>"
      ],
      "text/plain": [
       "+--------------------------------+--------------------------------------------------+----------+"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| Parameter                      | Description                                      | Value    |</pre>"
      ],
      "text/plain": [
       "| Parameter                      | Description                                      | Value    |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>+--------------------------------+--------------------------------------------------+----------+</pre>"
      ],
      "text/plain": [
       "+--------------------------------+--------------------------------------------------+----------+"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| num_factors                    | Factor Dimension                                 | 8        |</pre>"
      ],
      "text/plain": [
       "| num_factors                    | Factor Dimension                                 | 8        |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| regularization                 | L2 Regularization on Factors                     | 1e-08    |</pre>"
      ],
      "text/plain": [
       "| regularization                 | L2 Regularization on Factors                     | 1e-08    |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| solver                         | Solver used for training                         | sgd      |</pre>"
      ],
      "text/plain": [
       "| solver                         | Solver used for training                         | sgd      |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| linear_regularization          | L2 Regularization on Linear Coefficients         | 1e-10    |</pre>"
      ],
      "text/plain": [
       "| linear_regularization          | L2 Regularization on Linear Coefficients         | 1e-10    |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| max_iterations                 | Maximum Number of Iterations                     | 50       |</pre>"
      ],
      "text/plain": [
       "| max_iterations                 | Maximum Number of Iterations                     | 50       |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>+--------------------------------+--------------------------------------------------+----------+</pre>"
      ],
      "text/plain": [
       "+--------------------------------+--------------------------------------------------+----------+"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>  Optimizing model using SGD; tuning step size.</pre>"
      ],
      "text/plain": [
       "  Optimizing model using SGD; tuning step size."
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>  Using 2202 / 2202 points for tuning the step size.</pre>"
      ],
      "text/plain": [
       "  Using 2202 / 2202 points for tuning the step size."
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>+---------+-------------------+------------------------------------------+</pre>"
      ],
      "text/plain": [
       "+---------+-------------------+------------------------------------------+"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| Attempt | Initial Step Size | Estimated Objective Value                |</pre>"
      ],
      "text/plain": [
       "| Attempt | Initial Step Size | Estimated Objective Value                |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>+---------+-------------------+------------------------------------------+</pre>"
      ],
      "text/plain": [
       "+---------+-------------------+------------------------------------------+"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| 0       | 25                | Not Viable                               |</pre>"
      ],
      "text/plain": [
       "| 0       | 25                | Not Viable                               |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| 1       | 6.25              | Not Viable                               |</pre>"
      ],
      "text/plain": [
       "| 1       | 6.25              | Not Viable                               |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| 2       | 1.5625            | Not Viable                               |</pre>"
      ],
      "text/plain": [
       "| 2       | 1.5625            | Not Viable                               |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| 3       | 0.390625          | 0.131214                                 |</pre>"
      ],
      "text/plain": [
       "| 3       | 0.390625          | 0.131214                                 |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| 4       | 0.195312          | 0.168634                                 |</pre>"
      ],
      "text/plain": [
       "| 4       | 0.195312          | 0.168634                                 |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| 5       | 0.0976562         | 0.237488                                 |</pre>"
      ],
      "text/plain": [
       "| 5       | 0.0976562         | 0.237488                                 |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| 6       | 0.0488281         | 0.338842                                 |</pre>"
      ],
      "text/plain": [
       "| 6       | 0.0488281         | 0.338842                                 |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>+---------+-------------------+------------------------------------------+</pre>"
      ],
      "text/plain": [
       "+---------+-------------------+------------------------------------------+"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| Final   | 0.390625          | 0.131214                                 |</pre>"
      ],
      "text/plain": [
       "| Final   | 0.390625          | 0.131214                                 |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>+---------+-------------------+------------------------------------------+</pre>"
      ],
      "text/plain": [
       "+---------+-------------------+------------------------------------------+"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>Starting Optimization.</pre>"
      ],
      "text/plain": [
       "Starting Optimization."
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>+---------+--------------+-------------------+-----------------------+-------------+</pre>"
      ],
      "text/plain": [
       "+---------+--------------+-------------------+-----------------------+-------------+"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| Iter.   | Elapsed Time | Approx. Objective | Approx. Training RMSE | Step Size   |</pre>"
      ],
      "text/plain": [
       "| Iter.   | Elapsed Time | Approx. Objective | Approx. Training RMSE | Step Size   |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>+---------+--------------+-------------------+-----------------------+-------------+</pre>"
      ],
      "text/plain": [
       "+---------+--------------+-------------------+-----------------------+-------------+"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| Initial | 61us         | 0.891401          | 0.94414               |             |</pre>"
      ],
      "text/plain": [
       "| Initial | 61us         | 0.891401          | 0.94414               |             |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>+---------+--------------+-------------------+-----------------------+-------------+</pre>"
      ],
      "text/plain": [
       "+---------+--------------+-------------------+-----------------------+-------------+"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| 1       | 31.222ms     | 0.904366          | 0.950979              | 0.390625    |</pre>"
      ],
      "text/plain": [
       "| 1       | 31.222ms     | 0.904366          | 0.950979              | 0.390625    |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| 2       | 63.351ms     | 0.493281          | 0.702338              | 0.232267    |</pre>"
      ],
      "text/plain": [
       "| 2       | 63.351ms     | 0.493281          | 0.702338              | 0.232267    |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| 3       | 100.4ms      | 0.273931          | 0.523383              | 0.171364    |</pre>"
      ],
      "text/plain": [
       "| 3       | 100.4ms      | 0.273931          | 0.523383              | 0.171364    |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| 4       | 143.322ms    | 0.20113           | 0.448475              | 0.116134    |</pre>"
      ],
      "text/plain": [
       "| 4       | 143.322ms    | 0.20113           | 0.448475              | 0.116134    |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| 5       | 174.795ms    | 0.150903          | 0.388462              | 0.098237    |</pre>"
      ],
      "text/plain": [
       "| 5       | 174.795ms    | 0.150903          | 0.388462              | 0.098237    |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| 10      | 318.538ms    | 0.0530298         | 0.230279              | 0.0584121   |</pre>"
      ],
      "text/plain": [
       "| 10      | 318.538ms    | 0.0530298         | 0.230279              | 0.0584121   |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>| 50      | 1.50s        | 0.00183493        | 0.04281               | 0.0174693   |</pre>"
      ],
      "text/plain": [
       "| 50      | 1.50s        | 0.00183493        | 0.04281               | 0.0174693   |"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>+---------+--------------+-------------------+-----------------------+-------------+</pre>"
      ],
      "text/plain": [
       "+---------+--------------+-------------------+-----------------------+-------------+"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>Optimization Complete: Maximum number of passes through the data reached.</pre>"
      ],
      "text/plain": [
       "Optimization Complete: Maximum number of passes through the data reached."
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>Computing final objective value and training RMSE.</pre>"
      ],
      "text/plain": [
       "Computing final objective value and training RMSE."
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>       Final objective value: 0.00162513</pre>"
      ],
      "text/plain": [
       "       Final objective value: 0.00162513"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre>       Final training RMSE: 0.0402852</pre>"
      ],
      "text/plain": [
       "       Final training RMSE: 0.0402852"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "factorization_machine_model = tc.recommender.factorization_recommender.create(\n",
    "    train_set, 'user_id', 'course_id',                                                                    \n",
    "    target='rating')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "# Model Evaluation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "It's straightforward to use GraphLab to compare models on a small subset of users in the `test_set`. The [precision-recall](http://en.wikipedia.org/wiki/Precision_and_recall) plot that is computed shows the benefits of using the similarity-based model instead of the baseline `popularity_model`: better curves tend toward the upper-right hand corner of the plot. \n",
    "\n",
    "The following command finds the top-ranked items for all users in the first 500 rows of `test_set`. The observations in `train_set` are not included in the predicted items."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T06:46:49.448455Z",
     "start_time": "2019-06-15T06:46:45.968582Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "compare_models: using 246 users to estimate model performance\n",
      "PROGRESS: Evaluate model M0\n",
      "\n",
      "Precision and recall summary statistics by cutoff\n",
      "+--------+-----------------------+-----------------------+\n",
      "| cutoff |     mean_precision    |      mean_recall      |\n",
      "+--------+-----------------------+-----------------------+\n",
      "|   1    |          0.0          |          0.0          |\n",
      "|   2    |          0.0          |          0.0          |\n",
      "|   3    |          0.0          |          0.0          |\n",
      "|   4    |          0.0          |          0.0          |\n",
      "|   5    |          0.0          |          0.0          |\n",
      "|   6    | 0.0006775067750677507 | 0.0020325203252032527 |\n",
      "|   7    | 0.0005807200929152149 | 0.0020325203252032527 |\n",
      "|   8    | 0.0010162601626016261 |  0.006097560975609755 |\n",
      "|   9    | 0.0009033423667570006 |  0.006097560975609755 |\n",
      "|   10   | 0.0008130081300813011 |  0.006097560975609755 |\n",
      "+--------+-----------------------+-----------------------+\n",
      "[10 rows x 3 columns]\n",
      "\n",
      "\n",
      "Overall RMSE: 0.7309854474588884\n",
      "\n",
      "Per User RMSE (best)\n",
      "+---------+------+-------+\n",
      "| user_id | rmse | count |\n",
      "+---------+------+-------+\n",
      "|   1621  | 0.0  |   1   |\n",
      "+---------+------+-------+\n",
      "[1 rows x 3 columns]\n",
      "\n",
      "\n",
      "Per User RMSE (worst)\n",
      "+---------+-------------------+-------+\n",
      "| user_id |        rmse       | count |\n",
      "+---------+-------------------+-------+\n",
      "|   2009  | 3.006811989100817 |   1   |\n",
      "+---------+-------------------+-------+\n",
      "[1 rows x 3 columns]\n",
      "\n",
      "\n",
      "Per Item RMSE (best)\n",
      "+-----------+------+-------+\n",
      "| course_id | rmse | count |\n",
      "+-----------+------+-------+\n",
      "|     8     | 0.0  |   1   |\n",
      "+-----------+------+-------+\n",
      "[1 rows x 3 columns]\n",
      "\n",
      "\n",
      "Per Item RMSE (worst)\n",
      "+-----------+--------------------+-------+\n",
      "| course_id |        rmse        | count |\n",
      "+-----------+--------------------+-------+\n",
      "|    144    | 3.0954212400029215 |   3   |\n",
      "+-----------+--------------------+-------+\n",
      "[1 rows x 3 columns]\n",
      "\n",
      "PROGRESS: Evaluate model M1\n",
      "\n",
      "Precision and recall summary statistics by cutoff\n",
      "+--------+----------------------+----------------------+\n",
      "| cutoff |    mean_precision    |     mean_recall      |\n",
      "+--------+----------------------+----------------------+\n",
      "|   1    | 0.004065040650406504 | 0.004065040650406504 |\n",
      "|   2    | 0.006097560975609755 | 0.010162601626016263 |\n",
      "|   3    | 0.005420054200542005 | 0.011178861788617888 |\n",
      "|   4    | 0.007113821138211382 | 0.018292682926829285 |\n",
      "|   5    | 0.008130081300813007 | 0.026422764227642268 |\n",
      "|   6    | 0.007452574525745257 | 0.03048780487804879  |\n",
      "|   7    | 0.00696864111498258  | 0.03455284552845529  |\n",
      "|   8    | 0.007113821138211383 |  0.0426829268292683  |\n",
      "|   9    | 0.006775067750677503 | 0.04471544715447155  |\n",
      "|   10   | 0.006097560975609756 | 0.04471544715447156  |\n",
      "+--------+----------------------+----------------------+\n",
      "[10 rows x 3 columns]\n",
      "\n",
      "\n",
      "Overall RMSE: 4.586134473539705\n",
      "\n",
      "Per User RMSE (best)\n",
      "+---------+------+-------+\n",
      "| user_id | rmse | count |\n",
      "+---------+------+-------+\n",
      "|   1300  | 0.5  |   1   |\n",
      "+---------+------+-------+\n",
      "[1 rows x 3 columns]\n",
      "\n",
      "\n",
      "Per User RMSE (worst)\n",
      "+---------+------+-------+\n",
      "| user_id | rmse | count |\n",
      "+---------+------+-------+\n",
      "|   1704  | 5.0  |   1   |\n",
      "+---------+------+-------+\n",
      "[1 rows x 3 columns]\n",
      "\n",
      "\n",
      "Per Item RMSE (best)\n",
      "+-----------+------+-------+\n",
      "| course_id | rmse | count |\n",
      "+-----------+------+-------+\n",
      "|    200    | 0.5  |   1   |\n",
      "+-----------+------+-------+\n",
      "[1 rows x 3 columns]\n",
      "\n",
      "\n",
      "Per Item RMSE (worst)\n",
      "+-----------+------+-------+\n",
      "| course_id | rmse | count |\n",
      "+-----------+------+-------+\n",
      "|     43    | 5.0  |   1   |\n",
      "+-----------+------+-------+\n",
      "[1 rows x 3 columns]\n",
      "\n",
      "PROGRESS: Evaluate model M2\n",
      "\n",
      "Precision and recall summary statistics by cutoff\n",
      "+--------+-----------------------+-----------------------+\n",
      "| cutoff |     mean_precision    |      mean_recall      |\n",
      "+--------+-----------------------+-----------------------+\n",
      "|   1    |          0.0          |          0.0          |\n",
      "|   2    | 0.0020325203252032522 | 0.0010162601626016261 |\n",
      "|   3    |  0.002710027100271004 |  0.005081300813008129 |\n",
      "|   4    |  0.003048780487804878 |  0.009146341463414632 |\n",
      "|   5    | 0.0024390243902439033 |  0.009146341463414632 |\n",
      "|   6    | 0.0020325203252032527 |  0.009146341463414632 |\n",
      "|   7    |  0.001742160278745646 |  0.009146341463414632 |\n",
      "|   8    |  0.001524390243902439 |  0.009146341463414632 |\n",
      "|   9    | 0.0013550135501355005 |  0.009146341463414632 |\n",
      "|   10   | 0.0016260162601626018 |  0.013211382113821135 |\n",
      "+--------+-----------------------+-----------------------+\n",
      "[10 rows x 3 columns]\n",
      "\n",
      "\n",
      "Overall RMSE: 0.8346975060578289\n",
      "\n",
      "Per User RMSE (best)\n",
      "+---------+-----------------------+-------+\n",
      "| user_id |          rmse         | count |\n",
      "+---------+-----------------------+-------+\n",
      "|   1156  | 0.0025112800279867287 |   1   |\n",
      "+---------+-----------------------+-------+\n",
      "[1 rows x 3 columns]\n",
      "\n",
      "\n",
      "Per User RMSE (worst)\n",
      "+---------+--------------------+-------+\n",
      "| user_id |        rmse        | count |\n",
      "+---------+--------------------+-------+\n",
      "|   1646  | 3.3541036468019474 |   1   |\n",
      "+---------+--------------------+-------+\n",
      "[1 rows x 3 columns]\n",
      "\n",
      "\n",
      "Per Item RMSE (best)\n",
      "+-----------+---------------------+-------+\n",
      "| course_id |         rmse        | count |\n",
      "+-----------+---------------------+-------+\n",
      "|    129    | 0.00681198910081271 |   1   |\n",
      "+-----------+---------------------+-------+\n",
      "[1 rows x 3 columns]\n",
      "\n",
      "\n",
      "Per Item RMSE (worst)\n",
      "+-----------+--------------------+-------+\n",
      "| course_id |        rmse        | count |\n",
      "+-----------+--------------------+-------+\n",
      "|    144    | 3.5404041983447554 |   3   |\n",
      "+-----------+--------------------+-------+\n",
      "[1 rows x 3 columns]\n",
      "\n"
     ]
    }
   ],
   "source": [
    "result = tc.recommender.util.compare_models(\n",
    "    test_set, [popularity_model, item_sim_model, factorization_machine_model],\n",
    "    user_sample=.5, skip_set=train_set)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "Now let's ask the item similarity model for song recommendations on several users. We first create a list of users and create a subset of observations, `users_ratings`, that pertain to these users."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T06:49:39.479200Z",
     "start_time": "2019-06-15T06:49:39.464634Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "dtype: int\n",
       "Rows: 100\n",
       "[232, 363, 431, 738, 1860, 732, 187, 1368, 1753, 764, 926, 1180, 1323, 1742, 1685, 1876, 1232, 614, 1573, 786, 1158, 1072, 863, 695, 454, 1211, 1404, 1242, 696, 444, 349, 1883, 499, 354, 573, 1531, 71, 1889, 312, 578, 1556, 1572, 79, 416, 848, 1805, 550, 1644, 1017, 521, 566, 703, 196, 1401, 533, 1054, 398, 1077, 341, 982, 516, 1473, 1982, 1670, 529, 1544, 1920, 1414, 223, 465, 376, 1231, 193, 202, 1128, 1853, 1907, 1331, 966, 810, 1895, 1704, 267, 1615, 350, 979, 69, 138, 1645, 837, 1841, 1801, 853, 803, 405, 1172, 112, 1653, 937, 1429]"
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "K = 10\n",
    "users = tc.SArray(sf['user_id'].unique().head(100))\n",
    "users"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "Next we use the `recommend()` function to query the model we created for recommendations. The returned object has four columns: `user_id`, `song_id`, the `score` that the algorithm gave this user for this song, and the song's rank (an integer from 0 to K-1). To see this we can grab the top few rows of `recs`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-14T16:46:10.176376Z",
     "start_time": "2019-06-14T16:46:10.152361Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\"><table frame=\"box\" rules=\"cols\">\n",
       "    <tr>\n",
       "        <th style=\"padding-left: 1em; padding-right: 1em; text-align: center\">user_id</th>\n",
       "        <th style=\"padding-left: 1em; padding-right: 1em; text-align: center\">course_id</th>\n",
       "        <th style=\"padding-left: 1em; padding-right: 1em; text-align: center\">score</th>\n",
       "        <th style=\"padding-left: 1em; padding-right: 1em; text-align: center\">rank</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">232</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">93</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">0.21959692239761353</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">232</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">180</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">0.2127474546432495</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">232</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">188</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">0.20187050104141235</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">232</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">108</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">0.17809301614761353</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">232</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">55</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">0.1753571629524231</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">232</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">168</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">0.16715019941329956</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">232</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">133</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">0.16673386096954346</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">7</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">232</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">186</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">0.16153007745742798</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">232</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">164</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">0.1601088047027588</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">232</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">187</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">0.15785843133926392</td>\n",
       "        <td style=\"padding-left: 1em; padding-right: 1em; text-align: center; vertical-align: top\">10</td>\n",
       "    </tr>\n",
       "</table>\n",
       "[10 rows x 4 columns]<br/>\n",
       "</div>"
      ],
      "text/plain": [
       "Columns:\n",
       "\tuser_id\tint\n",
       "\tcourse_id\tint\n",
       "\tscore\tfloat\n",
       "\trank\tint\n",
       "\n",
       "Rows: 10\n",
       "\n",
       "Data:\n",
       "+---------+-----------+---------------------+------+\n",
       "| user_id | course_id |        score        | rank |\n",
       "+---------+-----------+---------------------+------+\n",
       "|   232   |     93    | 0.21959692239761353 |  1   |\n",
       "|   232   |    180    |  0.2127474546432495 |  2   |\n",
       "|   232   |    188    | 0.20187050104141235 |  3   |\n",
       "|   232   |    108    | 0.17809301614761353 |  4   |\n",
       "|   232   |     55    |  0.1753571629524231 |  5   |\n",
       "|   232   |    168    | 0.16715019941329956 |  6   |\n",
       "|   232   |    133    | 0.16673386096954346 |  7   |\n",
       "|   232   |    186    | 0.16153007745742798 |  8   |\n",
       "|   232   |    164    |  0.1601088047027588 |  9   |\n",
       "|   232   |    187    | 0.15785843133926392 |  10  |\n",
       "+---------+-----------+---------------------+------+\n",
       "[10 rows x 4 columns]"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "recs = item_sim_model.recommend(users=users, k=K)\n",
    "recs.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "To learn what songs these ids pertain to, we can merge in metadata about each song."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-15T06:51:05.644919Z",
     "start_time": "2019-06-15T06:51:05.291900Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre>Materializing SFrame</pre>"
      ],
      "text/plain": [
       "Materializing SFrame"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<html>                 <body>                     <iframe style=\"border:0;margin:0\" width=\"1000\" height=\"2400\" srcdoc='<html lang=\"en\">                         <head>                             <script src=\"https://cdnjs.cloudflare.com/ajax/libs/vega/3.0.8/vega.js\"></script>                             <script src=\"https://cdnjs.cloudflare.com/ajax/libs/vega-embed/3.0.0-rc7/vega-embed.js\"></script>                             <script src=\"https://cdnjs.cloudflare.com/ajax/libs/vega-tooltip/0.5.1/vega-tooltip.min.js\"></script>                             <link rel=\"stylesheet\" type=\"text/css\" href=\"https://cdnjs.cloudflare.com/ajax/libs/vega-tooltip/0.5.1/vega-tooltip.min.css\">                             <style>                             .vega-actions > a{                                 color:white;                                 text-decoration: none;                                 font-family: \"Arial\";                                 cursor:pointer;                                 padding:5px;                                 background:#AAAAAA;                                 border-radius:4px;                                 padding-left:10px;                                 padding-right:10px;                                 margin-right:5px;                             }                             .vega-actions{                                 margin-top:20px;                                 text-align:center                             }                            .vega-actions > a{                                 background:#999999;                            }                             </style>                         </head>                         <body>                             <div id=\"vis\">                             </div>                             <script>                                 var vega_json = \"{\\\"metadata\\\": {\\\"bubbleOpts\\\": {\\\"showAllFields\\\": false, \\\"fields\\\": [{\\\"field\\\": \\\"left\\\"}, {\\\"field\\\": \\\"right\\\"}, {\\\"field\\\": \\\"count\\\"}, {\\\"field\\\": \\\"label\\\"}]}}, \\\"config\\\": {\\\"axis\\\": {\\\"labelPadding\\\": 10, \\\"labelFont\\\": \\\"HelveticaNeue-Light, Arial\\\", \\\"titleColor\\\": \\\"#595959\\\", \\\"titleFont\\\": \\\"HelveticaNeue-Light, Arial\\\", \\\"titleFontWeight\\\": \\\"normal\\\", \\\"labelFontSize\\\": 7, \\\"titleFontSize\\\": 12, \\\"titlePadding\\\": 9, \\\"labelColor\\\": \\\"#595959\\\"}, \\\"axisY\\\": {\\\"minExtent\\\": 30}, \\\"style\\\": {\\\"rect\\\": {\\\"stroke\\\": \\\"rgba(200, 200, 200, 0.5)\\\"}, \\\"group-title\\\": {\\\"fontWeight\\\": \\\"normal\\\", \\\"fontSize\\\": 20, \\\"fill\\\": \\\"#595959\\\", \\\"font\\\": \\\"HelveticaNeue-Light, Arial\\\"}}}, \\\"width\\\": 800, \\\"padding\\\": 8, \\\"data\\\": [{\\\"name\\\": \\\"pts_store\\\"}, {\\\"name\\\": \\\"source_2\\\", \\\"values\\\": [{\\\"num_unique\\\": 5597, \\\"num_row\\\": 5597, \\\"max\\\": 5597.0, \\\"stdev\\\": 1615.714703, \\\"median\\\": 2801.0, \\\"a\\\": 0, \\\"min\\\": 1.0, \\\"title\\\": \\\"course_id\\\", \\\"type\\\": \\\"integer\\\", \\\"numeric\\\": [{\\\"left\\\": -78, \\\"count\\\": 209, \\\"right\\\": 210}, {\\\"left\\\": 210, \\\"count\\\": 288, \\\"right\\\": 498}, {\\\"left\\\": 498, \\\"count\\\": 288, \\\"right\\\": 786}, {\\\"left\\\": 786, \\\"count\\\": 288, \\\"right\\\": 1074}, {\\\"left\\\": 1074, \\\"count\\\": 288, \\\"right\\\": 1362}, {\\\"left\\\": 1362, \\\"count\\\": 288, \\\"right\\\": 1650}, {\\\"left\\\": 1650, \\\"count\\\": 288, \\\"right\\\": 1938}, {\\\"left\\\": 1938, \\\"count\\\": 288, \\\"right\\\": 2226}, {\\\"left\\\": 2226, \\\"count\\\": 288, \\\"right\\\": 2514}, {\\\"left\\\": 2514, \\\"count\\\": 288, \\\"right\\\": 2802}, {\\\"left\\\": 2802, \\\"count\\\": 288, \\\"right\\\": 3090}, {\\\"left\\\": 3090, \\\"count\\\": 288, \\\"right\\\": 3378}, {\\\"left\\\": 3378, \\\"count\\\": 288, \\\"right\\\": 3666}, {\\\"left\\\": 3666, \\\"count\\\": 288, \\\"right\\\": 3954}, {\\\"left\\\": 3954, \\\"count\\\": 288, \\\"right\\\": 4242}, {\\\"left\\\": 4242, \\\"count\\\": 288, \\\"right\\\": 4530}, {\\\"left\\\": 4530, \\\"count\\\": 288, \\\"right\\\": 4818}, {\\\"left\\\": 4818, \\\"count\\\": 288, \\\"right\\\": 5106}, {\\\"left\\\": 5106, \\\"count\\\": 288, \\\"right\\\": 5394}, {\\\"left\\\": 5394, \\\"count\\\": 204, \\\"right\\\": 5682}, {\\\"stop\\\": 5682, \\\"step\\\": 288, \\\"start\\\": -78}], \\\"num_missing\\\": 0, \\\"mean\\\": 2799.0, \\\"categorical\\\": []}, {\\\"num_unique\\\": 5574, \\\"type\\\": \\\"str\\\", \\\"num_row\\\": 5597, \\\"numeric\\\": [], \\\"num_missing\\\": 0, \\\"a\\\": 1, \\\"categorical\\\": [{\\\"label_idx\\\": 0, \\\"percentage\\\": \\\"0.0357334%\\\", \\\"label\\\": \\\"CS-169.1x: Software as a Service\\\", \\\"count\\\": 2}, {\\\"label_idx\\\": 1, \\\"percentage\\\": \\\"0.0357334%\\\", \\\"label\\\": \\\"CSS Fundamentals\\\", \\\"count\\\": 2}, {\\\"label_idx\\\": 2, \\\"percentage\\\": \\\"0.0357334%\\\", \\\"label\\\": \\\"Creating a Responsive HTML Email\\\", \\\"count\\\": 2}, {\\\"label_idx\\\": 3, \\\"percentage\\\": \\\"0.0357334%\\\", \\\"label\\\": \\\"Differential Equations\\\", \\\"count\\\": 2}, {\\\"label_idx\\\": 4, \\\"percentage\\\": \\\"0.0357334%\\\", \\\"label\\\": \\\"Geometry\\\", \\\"count\\\": 2}, {\\\"label_idx\\\": 5, \\\"percentage\\\": \\\"0.0357334%\\\", \\\"label\\\": \\\"HTML\\\", \\\"count\\\": 2}, {\\\"label_idx\\\": 6, \\\"percentage\\\": \\\"0.0357334%\\\", \\\"label\\\": \\\"HTML5 for Beginners\\\", \\\"count\\\": 2}, {\\\"label_idx\\\": 7, \\\"percentage\\\": \\\"0.0357334%\\\", \\\"label\\\": \\\"How to Market Your Business\\\", \\\"count\\\": 2}, {\\\"label_idx\\\": 8, \\\"percentage\\\": \\\"0.0357334%\\\", \\\"label\\\": \\\"How to Start a Business\\\", \\\"count\\\": 2}, {\\\"label_idx\\\": 9, \\\"percentage\\\": \\\"0.0357334%\\\", \\\"label\\\": \\\"Introduction to Databases\\\", \\\"count\\\": 2}, {\\\"label_idx\\\": 10, \\\"percentage\\\": \\\"99.6427%\\\", \\\"label\\\": \\\"Other (5564 labels)\\\", \\\"count\\\": 5577}], \\\"title\\\": \\\"title\\\"}, {\\\"num_unique\\\": 32, \\\"num_row\\\": 5597, \\\"max\\\": 4.9, \\\"stdev\\\": 0.694036, \\\"median\\\": 4.1, \\\"a\\\": 2, \\\"min\\\": 1.3, \\\"title\\\": \\\"avg_rating\\\", \\\"type\\\": \\\"float\\\", \\\"numeric\\\": [{\\\"left\\\": 1.22402, \\\"count\\\": 2, \\\"right\\\": 1.41218}, {\\\"left\\\": 1.41218, \\\"count\\\": 0, \\\"right\\\": 1.60034}, {\\\"left\\\": 1.60034, \\\"count\\\": 0, \\\"right\\\": 1.7885}, {\\\"left\\\": 1.7885, \\\"count\\\": 3, \\\"right\\\": 1.97666}, {\\\"left\\\": 1.97666, \\\"count\\\": 2, \\\"right\\\": 2.16482}, {\\\"left\\\": 2.16482, \\\"count\\\": 1, \\\"right\\\": 2.35298}, {\\\"left\\\": 2.35298, \\\"count\\\": 2, \\\"right\\\": 2.54114}, {\\\"left\\\": 2.54114, \\\"count\\\": 4, \\\"right\\\": 2.7293}, {\\\"left\\\": 2.7293, \\\"count\\\": 4, \\\"right\\\": 2.91746}, {\\\"left\\\": 2.91746, \\\"count\\\": 8, \\\"right\\\": 3.10562}, {\\\"left\\\": 3.10562, \\\"count\\\": 4, \\\"right\\\": 3.29378}, {\\\"left\\\": 3.29378, \\\"count\\\": 13, \\\"right\\\": 3.48194}, {\\\"left\\\": 3.48194, \\\"count\\\": 21, \\\"right\\\": 3.6701}, {\\\"left\\\": 3.6701, \\\"count\\\": 12, \\\"right\\\": 3.85826}, {\\\"left\\\": 3.85826, \\\"count\\\": 29, \\\"right\\\": 4.04642}, {\\\"left\\\": 4.04642, \\\"count\\\": 23, \\\"right\\\": 4.23458}, {\\\"left\\\": 4.23458, \\\"count\\\": 31, \\\"right\\\": 4.42274}, {\\\"left\\\": 4.42274, \\\"count\\\": 28, \\\"right\\\": 4.6109}, {\\\"left\\\": 4.6109, \\\"count\\\": 13, \\\"right\\\": 4.79906}, {\\\"left\\\": 4.79906, \\\"count\\\": 14, \\\"right\\\": 4.98722}, {\\\"missing\\\": true, \\\"count\\\": 5383}, {\\\"stop\\\": 4.98722, \\\"step\\\": 0.18816, \\\"start\\\": 1.22402}], \\\"num_missing\\\": 5383, \\\"mean\\\": 3.937383, \\\"categorical\\\": []}, {\\\"num_unique\\\": 90, \\\"type\\\": \\\"str\\\", \\\"num_row\\\": 5597, \\\"numeric\\\": [], \\\"num_missing\\\": 0, \\\"a\\\": 3, \\\"categorical\\\": [{\\\"label_idx\\\": 0, \\\"percentage\\\": \\\"94.3541%\\\", \\\"label\\\": \\\"Self-paced\\\", \\\"count\\\": 5281}, {\\\"label_idx\\\": 1, \\\"percentage\\\": \\\"1.08987%\\\", \\\"label\\\": \\\"TBA\\\", \\\"count\\\": 61}, {\\\"label_idx\\\": 2, \\\"percentage\\\": \\\"4.55601%\\\", \\\"label\\\": \\\"Other (88 labels)\\\", \\\"count\\\": 255}], \\\"title\\\": \\\"workload\\\"}, {\\\"num_unique\\\": 83, \\\"type\\\": \\\"str\\\", \\\"num_row\\\": 5597, \\\"numeric\\\": [], \\\"num_missing\\\": 0, \\\"a\\\": 4, \\\"categorical\\\": [{\\\"label_idx\\\": 0, \\\"percentage\\\": \\\"94.5685%\\\", \\\"label\\\": \\\"\\\", \\\"count\\\": 5293}, {\\\"label_idx\\\": 1, \\\"percentage\\\": \\\"0.464535%\\\", \\\"label\\\": \\\"Stanford University\\\", \\\"count\\\": 26}, {\\\"label_idx\\\": 2, \\\"percentage\\\": \\\"4.96695%\\\", \\\"label\\\": \\\"Other (81 labels)\\\", \\\"count\\\": 278}], \\\"title\\\": \\\"university\\\"}, {\\\"num_unique\\\": 11, \\\"type\\\": \\\"str\\\", \\\"num_row\\\": 5597, \\\"numeric\\\": [], \\\"num_missing\\\": 0, \\\"a\\\": 5, \\\"categorical\\\": [{\\\"label_idx\\\": 0, \\\"percentage\\\": \\\"96.1408%\\\", \\\"label\\\": \\\"-\\\", \\\"count\\\": 5381}, {\\\"label_idx\\\": 1, \\\"percentage\\\": \\\"3.85921%\\\", \\\"label\\\": \\\"Other (10 labels)\\\", \\\"count\\\": 216}], \\\"title\\\": \\\"difficulty\\\"}, {\\\"num_unique\\\": 14, \\\"type\\\": \\\"str\\\", \\\"num_row\\\": 5597, \\\"numeric\\\": [], \\\"num_missing\\\": 0, \\\"a\\\": 6, \\\"categorical\\\": [{\\\"label_idx\\\": 0, \\\"percentage\\\": \\\"54.4756%\\\", \\\"label\\\": \\\"udemy\\\", \\\"count\\\": 3049}, {\\\"label_idx\\\": 1, \\\"percentage\\\": \\\"37.3593%\\\", \\\"label\\\": \\\"lynda\\\", \\\"count\\\": 2091}, {\\\"label_idx\\\": 2, \\\"percentage\\\": \\\"3.09094%\\\", \\\"label\\\": \\\"coursera\\\", \\\"count\\\": 173}, {\\\"label_idx\\\": 3, \\\"percentage\\\": \\\"1.30427%\\\", \\\"label\\\": \\\"edx\\\", \\\"count\\\": 73}, {\\\"label_idx\\\": 4, \\\"percentage\\\": \\\"3.76988%\\\", \\\"label\\\": \\\"Other (10 labels)\\\", \\\"count\\\": 211}], \\\"title\\\": \\\"provider\\\"}]}, {\\\"transform\\\": [{\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"20\\\", \\\"as\\\": \\\"c_x_axis_back\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+66\\\", \\\"as\\\": \\\"c_main_background\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+43\\\", \\\"as\\\": \\\"c_top_bar\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+59\\\", \\\"as\\\": \\\"c_top_title\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+58\\\", \\\"as\\\": \\\"c_top_type\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+178\\\", \\\"as\\\": \\\"c_rule\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+106\\\", \\\"as\\\": \\\"c_num_rows\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+130\\\", \\\"as\\\": \\\"c_num_unique\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+154\\\", \\\"as\\\": \\\"c_missing\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+105\\\", \\\"as\\\": \\\"c_num_rows_val\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+130\\\", \\\"as\\\": \\\"c_num_unique_val\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+154\\\", \\\"as\\\": \\\"c_missing_val\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+195\\\", \\\"as\\\": \\\"c_frequent_items\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+218\\\", \\\"as\\\": \\\"c_first_item\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+235\\\", \\\"as\\\": \\\"c_second_item\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+252\\\", \\\"as\\\": \\\"c_third_item\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+269\\\", \\\"as\\\": \\\"c_fourth_item\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+286\\\", \\\"as\\\": \\\"c_fifth_item\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+200\\\", \\\"as\\\": \\\"c_mean\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+220\\\", \\\"as\\\": \\\"c_min\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+240\\\", \\\"as\\\": \\\"c_max\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+260\\\", \\\"as\\\": \\\"c_median\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+280\\\", \\\"as\\\": \\\"c_stdev\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+198\\\", \\\"as\\\": \\\"c_mean_val\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+218\\\", \\\"as\\\": \\\"c_min_val\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+238\\\", \\\"as\\\": \\\"c_max_val\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+258\\\", \\\"as\\\": \\\"c_median_val\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+278\\\", \\\"as\\\": \\\"c_stdev_val\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+106\\\", \\\"as\\\": \\\"graph_offset\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"toNumber(datum[\\\\\\\"a\\\\\\\"])*300+132\\\", \\\"as\\\": \\\"graph_offset_categorical\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"(toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"integer\\\\\\\" || toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"float\\\\\\\")?false:true\\\", \\\"as\\\": \\\"c_clip_val\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"(toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"integer\\\\\\\" || toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"float\\\\\\\")?250:0\\\", \\\"as\\\": \\\"c_width_numeric_val\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"(toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"str\\\\\\\")?false:true\\\", \\\"as\\\": \\\"c_clip_val_cat\\\"}, {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"(toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"str\\\\\\\")?250:0\\\", \\\"as\\\": \\\"c_width_numeric_val_cat\\\"}], \\\"name\\\": \\\"data_2\\\", \\\"source\\\": \\\"source_2\\\"}], \\\"height\\\": 2180, \\\"$schema\\\": \\\"https://vega.github.io/schema/vega/v4.json\\\", \\\"marks\\\": [{\\\"type\\\": \\\"group\\\", \\\"encode\\\": {\\\"enter\\\": {\\\"strokeWidth\\\": {\\\"value\\\": 0}, \\\"clip\\\": {\\\"value\\\": 0}, \\\"fill\\\": {\\\"value\\\": \\\"#ffffff\\\"}, \\\"width\\\": {\\\"value\\\": 734}, \\\"y\\\": {\\\"value\\\": 0}, \\\"fillOpacity\\\": {\\\"value\\\": 0}, \\\"stroke\\\": {\\\"value\\\": \\\"#000000\\\"}, \\\"height\\\": {\\\"value\\\": 366}, \\\"x\\\": {\\\"value\\\": 0}}}, \\\"marks\\\": [{\\\"type\\\": \\\"group\\\", \\\"encode\\\": {\\\"enter\\\": {\\\"strokeWidth\\\": {\\\"value\\\": 0}, \\\"clip\\\": {\\\"value\\\": 0}, \\\"fill\\\": {\\\"value\\\": \\\"#ffffff\\\"}, \\\"width\\\": {\\\"value\\\": 734}, \\\"y\\\": {\\\"value\\\": 0}, \\\"fillOpacity\\\": {\\\"value\\\": 0}, \\\"stroke\\\": {\\\"value\\\": \\\"#000000\\\"}, \\\"height\\\": {\\\"value\\\": 366}, \\\"x\\\": {\\\"value\\\": 0}}}, \\\"axes\\\": [], \\\"scales\\\": [], \\\"marks\\\": [{\\\"type\\\": \\\"rect\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_main_background\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]\\\"}}, \\\"enter\\\": {\\\"strokeWidth\\\": {\\\"value\\\": 0.5}, \\\"fill\\\": {\\\"value\\\": \\\"#FEFEFE\\\"}, \\\"width\\\": {\\\"value\\\": 700}, \\\"y\\\": {\\\"value\\\": 66}, \\\"fillOpacity\\\": {\\\"value\\\": 1}, \\\"stroke\\\": {\\\"value\\\": \\\"#DEDEDE\\\"}, \\\"height\\\": {\\\"value\\\": 250}, \\\"x\\\": {\\\"value\\\": 33}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"rect\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_top_bar\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]\\\"}}, \\\"enter\\\": {\\\"strokeWidth\\\": {\\\"value\\\": 0.5}, \\\"fill\\\": {\\\"value\\\": \\\"#F5F5F5\\\"}, \\\"width\\\": {\\\"value\\\": 700}, \\\"y\\\": {\\\"value\\\": 43}, \\\"fillOpacity\\\": {\\\"value\\\": 1}, \\\"stroke\\\": {\\\"value\\\": \\\"#DEDEDE\\\"}, \\\"height\\\": {\\\"value\\\": 30}, \\\"x\\\": {\\\"value\\\": 33}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_top_type\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+687\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#595859\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 58}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"&apos;&apos;+datum[\\\\\\\"type\\\\\\\"]\\\"}, \\\"x\\\": {\\\"value\\\": 720}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"right\\\"}, \\\"fontSize\\\": {\\\"value\\\": 12}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_top_title\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+11\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#9B9B9B\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 59}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"&apos;&apos;+datum[\\\\\\\"title\\\\\\\"]\\\"}, \\\"x\\\": {\\\"value\\\": 44}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"left\\\"}, \\\"fontSize\\\": {\\\"value\\\": 15}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"rule\\\", \\\"encode\\\": {\\\"update\\\": {\\\"x2\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+687\\\"}, \\\"y\\\": {\\\"field\\\": \\\"c_rule\\\"}, \\\"y2\\\": {\\\"field\\\": \\\"c_rule\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+467\\\"}}, \\\"enter\\\": {\\\"strokeWidth\\\": {\\\"value\\\": 1}, \\\"y2\\\": {\\\"value\\\": 178}, \\\"x2\\\": {\\\"value\\\": 720}, \\\"y\\\": {\\\"value\\\": 178}, \\\"strokeCap\\\": {\\\"value\\\": \\\"butt\\\"}, \\\"stroke\\\": {\\\"value\\\": \\\"#EDEDEB\\\"}, \\\"x\\\": {\\\"value\\\": 500}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_num_rows\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+467\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#4A4A4A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 106}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"value\\\": \\\"Num. Rows:\\\"}, \\\"x\\\": {\\\"value\\\": 500}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"left\\\"}, \\\"fontSize\\\": {\\\"value\\\": 12}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_num_unique\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+467\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#4A4A4A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 130}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"value\\\": \\\"Num. Unique:\\\"}, \\\"x\\\": {\\\"value\\\": 500}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"left\\\"}, \\\"fontSize\\\": {\\\"value\\\": 12}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_missing\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+467\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#4A4A4A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 154}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"value\\\": \\\"Missing:\\\"}, \\\"x\\\": {\\\"value\\\": 500}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"left\\\"}, \\\"fontSize\\\": {\\\"value\\\": 12}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_num_rows_val\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+667\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#5A5A5A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 105}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"toString(format(datum[\\\\\\\"num_row\\\\\\\"], \\\\\\\",\\\\\\\"))\\\"}, \\\"x\\\": {\\\"value\\\": 700}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"right\\\"}, \\\"fontSize\\\": {\\\"value\\\": 12}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_num_unique_val\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+667\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#5A5A5A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 130}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"toString(format(datum[\\\\\\\"num_unique\\\\\\\"], \\\\\\\",\\\\\\\"))\\\"}, \\\"x\\\": {\\\"value\\\": 700}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"right\\\"}, \\\"fontSize\\\": {\\\"value\\\": 12}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_missing_val\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+667\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#5A5A5A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 154}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"toString(format(datum[\\\\\\\"num_missing\\\\\\\"], \\\\\\\",\\\\\\\"))\\\"}, \\\"x\\\": {\\\"value\\\": 700}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"right\\\"}, \\\"fontSize\\\": {\\\"value\\\": 12}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_frequent_items\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+467\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#4A4A4A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 200}, \\\"fontWeight\\\": {\\\"value\\\": \\\"bold\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"(toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"str\\\\\\\")? \\\\\\\"Frequent Items\\\\\\\":\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 500}, \\\"clip\\\": {\\\"value\\\": true}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"left\\\"}, \\\"fontSize\\\": {\\\"value\\\": 11}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_first_item\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+487\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#4A4A4A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 200}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"((datum[\\\\\\\"categorical\\\\\\\"].length >= 1) &amp;&amp; (toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"str\\\\\\\"))? toString(datum[\\\\\\\"categorical\\\\\\\"][0][\\\\\\\"label\\\\\\\"]):\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 520}, \\\"clip\\\": {\\\"value\\\": true}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"left\\\"}, \\\"fontSize\\\": {\\\"value\\\": 11}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_second_item\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+487\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#4A4A4A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 200}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"((datum[\\\\\\\"categorical\\\\\\\"].length >= 2) &amp;&amp; (toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"str\\\\\\\"))? toString(datum[\\\\\\\"categorical\\\\\\\"][1][\\\\\\\"label\\\\\\\"]):\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 520}, \\\"clip\\\": {\\\"value\\\": true}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"left\\\"}, \\\"fontSize\\\": {\\\"value\\\": 11}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_third_item\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+487\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#4A4A4A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 200}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"((datum[\\\\\\\"categorical\\\\\\\"].length >= 3) &amp;&amp; (toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"str\\\\\\\"))? toString(datum[\\\\\\\"categorical\\\\\\\"][2][\\\\\\\"label\\\\\\\"]):\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 520}, \\\"clip\\\": {\\\"value\\\": true}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"left\\\"}, \\\"fontSize\\\": {\\\"value\\\": 11}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_fourth_item\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+487\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#4A4A4A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 200}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"((datum[\\\\\\\"categorical\\\\\\\"].length >= 4) &amp;&amp; (toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"str\\\\\\\"))? toString(datum[\\\\\\\"categorical\\\\\\\"][3][\\\\\\\"label\\\\\\\"]):\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 520}, \\\"clip\\\": {\\\"value\\\": true}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"left\\\"}, \\\"fontSize\\\": {\\\"value\\\": 11}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_fifth_item\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+487\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#4A4A4A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 200}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"((datum[\\\\\\\"categorical\\\\\\\"].length >= 5) &amp;&amp; (toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"str\\\\\\\"))? toString(datum[\\\\\\\"categorical\\\\\\\"][4][\\\\\\\"label\\\\\\\"]):\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 520}, \\\"clip\\\": {\\\"value\\\": true}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"left\\\"}, \\\"fontSize\\\": {\\\"value\\\": 11}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_first_item\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+667\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#7A7A7A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 200}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"((datum[\\\\\\\"categorical\\\\\\\"].length >= 1) &amp;&amp; (toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"str\\\\\\\"))? toString(datum[\\\\\\\"categorical\\\\\\\"][0][\\\\\\\"count\\\\\\\"]):\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 700}, \\\"clip\\\": {\\\"value\\\": true}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"right\\\"}, \\\"fontSize\\\": {\\\"value\\\": 11}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_second_item\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+667\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#7A7A7A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 200}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"((datum[\\\\\\\"categorical\\\\\\\"].length >= 2) &amp;&amp; (toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"str\\\\\\\"))? toString(datum[\\\\\\\"categorical\\\\\\\"][1][\\\\\\\"count\\\\\\\"]):\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 700}, \\\"clip\\\": {\\\"value\\\": true}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"right\\\"}, \\\"fontSize\\\": {\\\"value\\\": 10}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_third_item\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+667\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#7A7A7A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 200}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"((datum[\\\\\\\"categorical\\\\\\\"].length >= 3) &amp;&amp; (toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"str\\\\\\\"))? toString(datum[\\\\\\\"categorical\\\\\\\"][2][\\\\\\\"count\\\\\\\"]):\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 700}, \\\"clip\\\": {\\\"value\\\": true}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"right\\\"}, \\\"fontSize\\\": {\\\"value\\\": 10}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_fourth_item\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+667\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#7A7A7A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 200}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"((datum[\\\\\\\"categorical\\\\\\\"].length >= 4) &amp;&amp; (toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"str\\\\\\\"))? toString(datum[\\\\\\\"categorical\\\\\\\"][3][\\\\\\\"count\\\\\\\"]):\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 700}, \\\"clip\\\": {\\\"value\\\": true}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"right\\\"}, \\\"fontSize\\\": {\\\"value\\\": 10}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_fifth_item\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+667\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#7A7A7A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 200}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"((datum[\\\\\\\"categorical\\\\\\\"].length >= 5) &amp;&amp; (toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"str\\\\\\\"))? toString(datum[\\\\\\\"categorical\\\\\\\"][4][\\\\\\\"count\\\\\\\"]):\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 700}, \\\"clip\\\": {\\\"value\\\": true}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"right\\\"}, \\\"fontSize\\\": {\\\"value\\\": 10}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_mean\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+467\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#4A4A4A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 200}, \\\"fontWeight\\\": {\\\"value\\\": \\\"bold\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"(toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"integer\\\\\\\" || toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"float\\\\\\\")? \\\\\\\"Mean:\\\\\\\":\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 500}, \\\"clip\\\": {\\\"value\\\": true}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"left\\\"}, \\\"fontSize\\\": {\\\"value\\\": 11}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_min\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+467\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#4A4A4A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 220}, \\\"fontWeight\\\": {\\\"value\\\": \\\"bold\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"(toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"integer\\\\\\\" || toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"float\\\\\\\")? \\\\\\\"Min:\\\\\\\":\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 500}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"left\\\"}, \\\"fontSize\\\": {\\\"value\\\": 11}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_max\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+467\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#4A4A4A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 240}, \\\"fontWeight\\\": {\\\"value\\\": \\\"bold\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"(toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"integer\\\\\\\" || toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"float\\\\\\\")? \\\\\\\"Max:\\\\\\\":\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 500}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"left\\\"}, \\\"fontSize\\\": {\\\"value\\\": 11}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_median\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+467\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#4A4A4A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 260}, \\\"fontWeight\\\": {\\\"value\\\": \\\"bold\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"(toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"integer\\\\\\\" || toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"float\\\\\\\")? \\\\\\\"Median:\\\\\\\":\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 500}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"left\\\"}, \\\"fontSize\\\": {\\\"value\\\": 11}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_stdev\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+467\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#4A4A4A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 280}, \\\"fontWeight\\\": {\\\"value\\\": \\\"bold\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"(toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"integer\\\\\\\" || toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"float\\\\\\\")? \\\\\\\"St. Dev:\\\\\\\":\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 500}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"left\\\"}, \\\"fontSize\\\": {\\\"value\\\": 11}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_mean_val\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+667\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#6A6A6A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 198}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"(toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"integer\\\\\\\" || toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"float\\\\\\\")?toString(format(datum[\\\\\\\"mean\\\\\\\"], \\\\\\\",\\\\\\\")):\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 700}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"right\\\"}, \\\"fontSize\\\": {\\\"value\\\": 10}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_min_val\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+667\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#6A6A6A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 218}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"(toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"integer\\\\\\\" || toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"float\\\\\\\")?toString(format(datum[\\\\\\\"min\\\\\\\"], \\\\\\\",\\\\\\\")):\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 700}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"right\\\"}, \\\"fontSize\\\": {\\\"value\\\": 10}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_max_val\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+667\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#6A6A6A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 238}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"(toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"integer\\\\\\\" || toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"float\\\\\\\")?toString(format(datum[\\\\\\\"max\\\\\\\"], \\\\\\\",\\\\\\\")):\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 700}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"right\\\"}, \\\"fontSize\\\": {\\\"value\\\": 10}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_median_val\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+667\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#6A6A6A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 258}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"(toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"integer\\\\\\\" || toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"float\\\\\\\")?toString(format(datum[\\\\\\\"median\\\\\\\"], \\\\\\\",\\\\\\\")):\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 700}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"right\\\"}, \\\"fontSize\\\": {\\\"value\\\": 10}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"type\\\": \\\"text\\\", \\\"encode\\\": {\\\"update\\\": {\\\"y\\\": {\\\"field\\\": \\\"c_stdev_val\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+667\\\"}}, \\\"enter\\\": {\\\"fontStyle\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#6A6A6A\\\"}, \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"}, \\\"dx\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"y\\\": {\\\"value\\\": 278}, \\\"fontWeight\\\": {\\\"value\\\": \\\"normal\\\"}, \\\"text\\\": {\\\"signal\\\": \\\"(toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"integer\\\\\\\" || toString(datum[\\\\\\\"type\\\\\\\"]) == \\\\\\\"float\\\\\\\")?toString(format(datum[\\\\\\\"stdev\\\\\\\"], \\\\\\\",\\\\\\\")):\\\\\\\"\\\\\\\"\\\"}, \\\"x\\\": {\\\"value\\\": 700}, \\\"font\\\": {\\\"value\\\": \\\"AvenirNext-Medium\\\"}, \\\"align\\\": {\\\"value\\\": \\\"right\\\"}, \\\"fontSize\\\": {\\\"value\\\": 10}, \\\"dy\\\": {\\\"offset\\\": 0, \\\"value\\\": 0}, \\\"angle\\\": {\\\"value\\\": 0}}}, \\\"from\\\": {\\\"data\\\": \\\"data_2\\\"}}, {\\\"marks\\\": [{\\\"name\\\": \\\"marks\\\", \\\"encode\\\": {\\\"hover\\\": {\\\"fill\\\": {\\\"value\\\": \\\"#7EC2F3\\\"}}, \\\"update\\\": {\\\"x2\\\": {\\\"scale\\\": \\\"x\\\", \\\"field\\\": \\\"right\\\"}, \\\"y\\\": {\\\"scale\\\": \\\"y\\\", \\\"field\\\": \\\"count\\\"}, \\\"y2\\\": {\\\"scale\\\": \\\"y\\\", \\\"value\\\": 0}, \\\"x\\\": {\\\"scale\\\": \\\"x\\\", \\\"field\\\": \\\"left\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#108EE9\\\"}}}, \\\"style\\\": [\\\"rect\\\"], \\\"from\\\": {\\\"data\\\": \\\"new_data\\\"}, \\\"type\\\": \\\"rect\\\"}], \\\"type\\\": \\\"group\\\", \\\"axes\\\": [{\\\"zindex\\\": 1, \\\"scale\\\": \\\"x\\\", \\\"orient\\\": \\\"bottom\\\", \\\"labelOverlap\\\": true, \\\"tickCount\\\": {\\\"signal\\\": \\\"ceil(width/40)\\\"}, \\\"title\\\": \\\"Values\\\"}, {\\\"labels\\\": false, \\\"scale\\\": \\\"x\\\", \\\"orient\\\": \\\"bottom\\\", \\\"grid\\\": true, \\\"gridScale\\\": \\\"y\\\", \\\"zindex\\\": 0, \\\"maxExtent\\\": 0, \\\"tickCount\\\": {\\\"signal\\\": \\\"ceil(width/40)\\\"}, \\\"ticks\\\": false, \\\"minExtent\\\": 0, \\\"domain\\\": false}, {\\\"zindex\\\": 1, \\\"scale\\\": \\\"y\\\", \\\"orient\\\": \\\"left\\\", \\\"labelOverlap\\\": true, \\\"tickCount\\\": {\\\"signal\\\": \\\"ceil(height/40)\\\"}, \\\"title\\\": \\\"Count\\\"}, {\\\"labels\\\": false, \\\"scale\\\": \\\"y\\\", \\\"orient\\\": \\\"left\\\", \\\"grid\\\": true, \\\"gridScale\\\": \\\"x\\\", \\\"zindex\\\": 0, \\\"maxExtent\\\": 0, \\\"tickCount\\\": {\\\"signal\\\": \\\"ceil(height/40)\\\"}, \\\"ticks\\\": false, \\\"minExtent\\\": 0, \\\"domain\\\": false}], \\\"style\\\": \\\"cell\\\", \\\"from\\\": {\\\"facet\\\": {\\\"name\\\": \\\"new_data\\\", \\\"data\\\": \\\"data_2\\\", \\\"field\\\": \\\"numeric\\\"}}, \\\"encode\\\": {\\\"update\\\": {\\\"clip\\\": {\\\"field\\\": \\\"c_clip_val\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+87\\\"}, \\\"width\\\": {\\\"field\\\": \\\"c_width_numeric_val\\\"}}, \\\"enter\\\": {\\\"strokeWidth\\\": {\\\"value\\\": 0}, \\\"fill\\\": {\\\"value\\\": \\\"#ffffff\\\"}, \\\"width\\\": {\\\"value\\\": 250}, \\\"y\\\": {\\\"field\\\": \\\"graph_offset\\\"}, \\\"fillOpacity\\\": {\\\"value\\\": 0}, \\\"stroke\\\": {\\\"value\\\": \\\"#000000\\\"}, \\\"height\\\": {\\\"value\\\": 150}, \\\"x\\\": {\\\"value\\\": 120}}}, \\\"scales\\\": [{\\\"type\\\": \\\"linear\\\", \\\"range\\\": [0, {\\\"signal\\\": \\\"width\\\"}], \\\"name\\\": \\\"x\\\", \\\"domain\\\": {\\\"fields\\\": [\\\"left\\\", \\\"right\\\"], \\\"data\\\": \\\"new_data\\\", \\\"sort\\\": true}, \\\"zero\\\": true, \\\"nice\\\": true}, {\\\"type\\\": \\\"linear\\\", \\\"range\\\": [{\\\"signal\\\": \\\"height\\\"}, 0], \\\"name\\\": \\\"y\\\", \\\"domain\\\": {\\\"data\\\": \\\"new_data\\\", \\\"field\\\": \\\"count\\\"}, \\\"zero\\\": true, \\\"nice\\\": true}], \\\"signals\\\": [{\\\"name\\\": \\\"width\\\", \\\"update\\\": \\\"250\\\"}, {\\\"name\\\": \\\"height\\\", \\\"update\\\": \\\"150\\\"}]}, {\\\"type\\\": \\\"group\\\", \\\"axes\\\": [{\\\"zindex\\\": 1, \\\"scale\\\": \\\"x\\\", \\\"orient\\\": \\\"top\\\", \\\"labelOverlap\\\": true, \\\"tickCount\\\": {\\\"signal\\\": \\\"ceil(width/40)\\\"}, \\\"title\\\": \\\"Count\\\"}, {\\\"labels\\\": false, \\\"scale\\\": \\\"x\\\", \\\"orient\\\": \\\"top\\\", \\\"grid\\\": true, \\\"gridScale\\\": \\\"y\\\", \\\"zindex\\\": 0, \\\"maxExtent\\\": 0, \\\"tickCount\\\": {\\\"signal\\\": \\\"ceil(width/40)\\\"}, \\\"ticks\\\": false, \\\"minExtent\\\": 0, \\\"domain\\\": false}, {\\\"scale\\\": \\\"y\\\", \\\"orient\\\": \\\"left\\\", \\\"title\\\": \\\"Label\\\", \\\"labelOverlap\\\": true, \\\"zindex\\\": 1}], \\\"style\\\": \\\"cell\\\", \\\"from\\\": {\\\"facet\\\": {\\\"name\\\": \\\"data_5\\\", \\\"data\\\": \\\"data_2\\\", \\\"field\\\": \\\"categorical\\\"}}, \\\"scales\\\": [{\\\"type\\\": \\\"linear\\\", \\\"range\\\": [0, 250], \\\"name\\\": \\\"x\\\", \\\"domain\\\": {\\\"data\\\": \\\"data_5\\\", \\\"field\\\": \\\"count\\\"}, \\\"zero\\\": true, \\\"nice\\\": true}, {\\\"paddingInner\\\": 0.1, \\\"type\\\": \\\"band\\\", \\\"paddingOuter\\\": 0.05, \\\"range\\\": [150, 0], \\\"name\\\": \\\"y\\\", \\\"domain\\\": {\\\"data\\\": \\\"data_5\\\", \\\"field\\\": \\\"label\\\", \\\"sort\\\": {\\\"field\\\": \\\"label_idx\\\", \\\"order\\\": \\\"descending\\\", \\\"op\\\": \\\"mean\\\"}}}], \\\"encode\\\": {\\\"update\\\": {\\\"clip\\\": {\\\"field\\\": \\\"c_clip_val_cat\\\"}, \\\"x\\\": {\\\"signal\\\": \\\"datum[\\\\\\\"c_x_axis_back\\\\\\\"]+137\\\"}, \\\"width\\\": {\\\"field\\\": \\\"c_width_numeric_val_cat\\\"}}, \\\"enter\\\": {\\\"strokeWidth\\\": {\\\"value\\\": 0}, \\\"fill\\\": {\\\"value\\\": \\\"#ffffff\\\"}, \\\"width\\\": {\\\"value\\\": 250}, \\\"y\\\": {\\\"field\\\": \\\"graph_offset_categorical\\\"}, \\\"fillOpacity\\\": {\\\"value\\\": 0}, \\\"stroke\\\": {\\\"value\\\": \\\"#000000\\\"}, \\\"height\\\": {\\\"value\\\": 150}, \\\"x\\\": {\\\"value\\\": 170}}}, \\\"marks\\\": [{\\\"name\\\": \\\"marks\\\", \\\"encode\\\": {\\\"hover\\\": {\\\"fill\\\": {\\\"value\\\": \\\"#7EC2F3\\\"}}, \\\"update\\\": {\\\"x2\\\": {\\\"scale\\\": \\\"x\\\", \\\"value\\\": 0}, \\\"y\\\": {\\\"scale\\\": \\\"y\\\", \\\"field\\\": \\\"label\\\"}, \\\"height\\\": {\\\"scale\\\": \\\"y\\\", \\\"band\\\": true}, \\\"x\\\": {\\\"scale\\\": \\\"x\\\", \\\"field\\\": \\\"count\\\"}, \\\"fill\\\": {\\\"value\\\": \\\"#108EE9\\\"}}}, \\\"style\\\": [\\\"bar\\\"], \\\"from\\\": {\\\"data\\\": \\\"data_5\\\"}, \\\"type\\\": \\\"rect\\\"}], \\\"signals\\\": [{\\\"name\\\": \\\"unit\\\", \\\"on\\\": [{\\\"events\\\": \\\"mousemove\\\", \\\"update\\\": \\\"isTuple(group()) ? group() : unit\\\"}], \\\"value\\\": {}}, {\\\"name\\\": \\\"pts\\\", \\\"update\\\": \\\"data(\\\\\\\"pts_store\\\\\\\").length &amp;&amp; {count: data(\\\\\\\"pts_store\\\\\\\")[0].values[0]}\\\"}, {\\\"name\\\": \\\"pts_tuple\\\", \\\"on\\\": [{\\\"events\\\": [{\\\"type\\\": \\\"click\\\", \\\"source\\\": \\\"scope\\\"}], \\\"force\\\": true, \\\"update\\\": \\\"datum &amp;&amp; item().mark.marktype !== &apos;group&apos; ? {unit: \\\\\\\"\\\\\\\", encodings: [\\\\\\\"x\\\\\\\"], fields: [\\\\\\\"count\\\\\\\"], values: [datum[\\\\\\\"count\\\\\\\"]]} : null\\\"}], \\\"value\\\": {}}, {\\\"name\\\": \\\"pts_modify\\\", \\\"on\\\": [{\\\"events\\\": {\\\"signal\\\": \\\"pts_tuple\\\"}, \\\"update\\\": \\\"modify(\\\\\\\"pts_store\\\\\\\", pts_tuple, true)\\\"}]}]}]}]}]}\";                                 var vega_json_parsed = JSON.parse(vega_json);                                 var toolTipOpts = {                                     showAllFields: true                                 };                                 if(vega_json_parsed[\"metadata\"] != null){                                     if(vega_json_parsed[\"metadata\"][\"bubbleOpts\"] != null){                                         toolTipOpts = vega_json_parsed[\"metadata\"][\"bubbleOpts\"];                                     };                                 };                                 vegaEmbed(\"#vis\", vega_json_parsed).then(function (result) {                                     vegaTooltip.vega(result.view, toolTipOpts);                                  });                             </script>                         </body>                     </html>' src=\"demo_iframe_srcdoc.htm\">                         <p>Your browser does not support iframes.</p>                     </iframe>                 </body>             </html>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "ename": "RuntimeError",
     "evalue": "Column name course_id does not exist.",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mRuntimeError\u001b[0m                              Traceback (most recent call last)",
      "\u001b[0;32m~/Applications/anaconda/lib/python3.5/site-packages/turicreate/data_structures/sframe.py\u001b[0m in \u001b[0;36mjoin\u001b[0;34m(self, right, on, how)\u001b[0m\n\u001b[1;32m   4351\u001b[0m         \u001b[0;32mwith\u001b[0m \u001b[0mcython_context\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 4352\u001b[0;31m             \u001b[0;32mreturn\u001b[0m \u001b[0mSFrame\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0m_proxy\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m__proxy__\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mjoin\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mright\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m__proxy__\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mhow\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mjoin_keys\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m   4353\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32mturicreate/_cython/cy_sframe.pyx\u001b[0m in \u001b[0;36mturicreate._cython.cy_sframe.UnitySFrameProxy.join\u001b[0;34m()\u001b[0m\n",
      "\u001b[0;32mturicreate/_cython/cy_sframe.pyx\u001b[0m in \u001b[0;36mturicreate._cython.cy_sframe.UnitySFrameProxy.join\u001b[0;34m()\u001b[0m\n",
      "\u001b[0;31mRuntimeError\u001b[0m: Column name course_id does not exist.",
      "\nDuring handling of the above exception, another exception occurred:\n",
      "\u001b[0;31mRuntimeError\u001b[0m                              Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-42-09739df6b78f>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m      6\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m      7\u001b[0m \u001b[0mcourses\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mcourses\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m'course_id'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m'title'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m'provider'\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 8\u001b[0;31m \u001b[0mresults\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mrecs\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mjoin\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mcourses\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mon\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m'course_id'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mhow\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m'inner'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m      9\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m     10\u001b[0m \u001b[0;31m#Populate observed user-course data with course info\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m~/Applications/anaconda/lib/python3.5/site-packages/turicreate/data_structures/sframe.py\u001b[0m in \u001b[0;36mjoin\u001b[0;34m(self, right, on, how)\u001b[0m\n\u001b[1;32m   4350\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m   4351\u001b[0m         \u001b[0;32mwith\u001b[0m \u001b[0mcython_context\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 4352\u001b[0;31m             \u001b[0;32mreturn\u001b[0m \u001b[0mSFrame\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0m_proxy\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m__proxy__\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mjoin\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mright\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m__proxy__\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mhow\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mjoin_keys\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m   4353\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m   4354\u001b[0m     \u001b[0;32mdef\u001b[0m \u001b[0mfilter_by\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mvalues\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mcolumn_name\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mexclude\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mFalse\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m~/Applications/anaconda/lib/python3.5/site-packages/turicreate/_cython/context.py\u001b[0m in \u001b[0;36m__exit__\u001b[0;34m(self, exc_type, exc_value, traceback)\u001b[0m\n\u001b[1;32m     47\u001b[0m             \u001b[0;32mif\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mshow_cython_trace\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m     48\u001b[0m                 \u001b[0;31m# To hide cython trace, we re-raise from here\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 49\u001b[0;31m                 \u001b[0;32mraise\u001b[0m \u001b[0mexc_type\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mexc_value\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m     50\u001b[0m             \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m     51\u001b[0m                 \u001b[0;31m# To show the full trace, we do nothing and let exception propagate\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;31mRuntimeError\u001b[0m: Column name course_id does not exist."
     ]
    }
   ],
   "source": [
    "# Get the meta data of the courses\n",
    "courses = tc.SFrame.read_csv('../data/cursos.dat', header=False, delimiter='|', verbose=False)\n",
    "courses =courses.rename({'X1':'course_id', 'X2':'title', 'X3':'avg_rating', \n",
    "              'X4':'workload', 'X5':'university', 'X6':'difficulty', 'X7':'provider'})\n",
    "courses.show()\n",
    "\n",
    "courses = courses[['course_id', 'title', 'provider']]\n",
    "results = recs.join(courses, on='course_id', how='inner')\n",
    "\n",
    "#Populate observed user-course data with course info\n",
    "userset = frozenset(users)\n",
    "ix = sf['user_id'].apply(lambda x: x in userset, int)  \n",
    "user_data = sf[ix]\n",
    "user_data = user_data.join(courses, on='course_id')[['user_id', 'title', 'provider']]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-06-14T16:47:43.829359Z",
     "start_time": "2019-06-14T16:47:43.778230Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "User: 1\n",
      "We were told that the user liked these courses: \n",
      "+-------------------------------+----------+\n",
      "|             title             | provider |\n",
      "+-------------------------------+----------+\n",
      "| An Introduction to Interac... | coursera |\n",
      "+-------------------------------+----------+\n",
      "[1 rows x 2 columns]\n",
      "\n",
      "We recommend these other courses:\n",
      "+-------+----------+\n",
      "| title | provider |\n",
      "+-------+----------+\n",
      "+-------+----------+\n",
      "[0 rows x 2 columns]\n",
      "\n",
      "\n",
      "User: 2\n",
      "We were told that the user liked these courses: \n",
      "+-------------------------------+----------+\n",
      "|             title             | provider |\n",
      "+-------------------------------+----------+\n",
      "| An Introduction to Interac... | coursera |\n",
      "+-------------------------------+----------+\n",
      "[1 rows x 2 columns]\n",
      "\n",
      "We recommend these other courses:\n",
      "+-------+----------+\n",
      "| title | provider |\n",
      "+-------+----------+\n",
      "+-------+----------+\n",
      "[0 rows x 2 columns]\n",
      "\n",
      "\n",
      "User: 3\n",
      "We were told that the user liked these courses: \n",
      "+-------------------------------+----------+\n",
      "|             title             | provider |\n",
      "+-------------------------------+----------+\n",
      "| An Introduction to Interac... | coursera |\n",
      "+-------------------------------+----------+\n",
      "[1 rows x 2 columns]\n",
      "\n",
      "We recommend these other courses:\n",
      "+-------+----------+\n",
      "| title | provider |\n",
      "+-------+----------+\n",
      "+-------+----------+\n",
      "[0 rows x 2 columns]\n",
      "\n",
      "\n",
      "User: 4\n",
      "We were told that the user liked these courses: \n",
      "+-------------------------------+----------+\n",
      "|             title             | provider |\n",
      "+-------------------------------+----------+\n",
      "| A Beginner&#39;s Guide to ... | coursera |\n",
      "|          Gamification         | coursera |\n",
      "+-------------------------------+----------+\n",
      "[2 rows x 2 columns]\n",
      "\n",
      "We recommend these other courses:\n",
      "+-------+----------+\n",
      "| title | provider |\n",
      "+-------+----------+\n",
      "+-------+----------+\n",
      "[0 rows x 2 columns]\n",
      "\n",
      "\n",
      "User: 5\n",
      "We were told that the user liked these courses: \n",
      "+-------------------------------+----------+\n",
      "|             title             | provider |\n",
      "+-------------------------------+----------+\n",
      "| Web Intelligence and Big Data | coursera |\n",
      "+-------------------------------+----------+\n",
      "[1 rows x 2 columns]\n",
      "\n",
      "We recommend these other courses:\n",
      "+-------+----------+\n",
      "| title | provider |\n",
      "+-------+----------+\n",
      "+-------+----------+\n",
      "[0 rows x 2 columns]\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# Print out some recommendations \n",
    "for i in range(5):\n",
    "    user = list(users)[i]\n",
    "    print(\"User: \" + str(i + 1))\n",
    "    user_obs = user_data[user_data['user_id'] == user].head(K)\n",
    "    del user_obs['user_id']\n",
    "    user_recs = results[results['user_id'] == str(user)][['title', 'provider']]\n",
    "\n",
    "    print(\"We were told that the user liked these courses: \")\n",
    "    print (user_obs.head(K))\n",
    "\n",
    "    print (\"We recommend these other courses:\")\n",
    "    print (user_recs.head(K))\n",
    "\n",
    "    print (\"\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Readings\n",
    "- (Looking for more details about the modules and functions? Check out the <a href=\"https://dato.com/products/create/docs/\">API docs</a>.)\n",
    "- Toby Segaran, 2007, Programming Collective Intelligence. O'Reilly. Chapter 2 Making Recommendations\n",
    "    - programming-collective-intelligence-code/blob/master/chapter2/recommendations.py\n",
    "- 项亮 2012 推荐系统实践 人民邮电出版社"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "celltoolbar": "Slideshow",
  "kernelspec": {
   "display_name": "Python [conda env:anaconda]",
   "language": "python",
   "name": "conda-env-anaconda-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.4"
  },
  "latex_envs": {
   "LaTeX_envs_menu_present": true,
   "autoclose": false,
   "autocomplete": true,
   "bibliofile": "biblio.bib",
   "cite_by": "apalike",
   "current_citInitial": 1,
   "eqLabelWithNumbers": true,
   "eqNumInitial": 0,
   "hotkeys": {
    "equation": "Ctrl-E",
    "itemize": "Ctrl-I"
   },
   "labels_anchors": false,
   "latex_user_defs": false,
   "report_style_numbering": false,
   "user_envs_cfg": false
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": false,
   "sideBar": false,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {
    "height": "48px",
    "left": "1382.98px",
    "top": "61.5313px",
    "width": "164px"
   },
   "toc_section_display": false,
   "toc_window_display": true
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}