{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Bandits" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this part, we will investigate and compare the properties of the action selection schemes seen in the lecture:\n", "\n", "1. greedy action selection\n", "2. $\\epsilon$-greedy action selection\n", "3. softmax action selection\n", "\n", "Let's re-use the definitions of the last exercise:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import matplotlib.pyplot as plt\n", "\n", "rng = np.random.default_rng()" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "class Bandit:\n", "    \"\"\"\n", "    n-armed bandit.\n", "    \"\"\"\n", "    def __init__(self, nb_actions, mean=0.0, std_Q=1.0, std_r=1.0):\n", "        \"\"\"\n", "        :param nb_actions: number of arms.\n", "        :param mean: mean of the normal distribution for $Q^*$.\n", "        :param std_Q: standard deviation of the normal distribution for $Q^*$.\n", "        :param std_r: standard deviation of the normal distribution for the sampled rewards.\n", "        \"\"\"\n", "        # Store parameters\n", "        self.nb_actions = nb_actions\n", "        self.mean = mean\n", "        self.std_Q = std_Q\n", "        self.std_r = std_r\n", "\n", "        # Initialize the true Q-values\n", "        self.Q_star = rng.normal(self.mean, self.std_Q, self.nb_actions)\n", "\n", "        # Optimal action\n", "        self.a_star = self.Q_star.argmax()\n", "\n", "    def step(self, action):\n", "        \"\"\"\n", "        Sample a single reward from the bandit.\n", "\n", "        :param action: the selected action.\n", "        :return: a reward.\n", "        \"\"\"\n", "        return float(rng.normal(self.Q_star[action], self.std_r, 1)[0])" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAABOsAAAHACAYAAADzz+wWAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjkuMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy80BEi2AAAACXBIWXMAAA9hAAAPYQGoP6dpAABcdElEQVR4nO3de1xVZfr//zeggJqgiIIoipk/D6NCYhBqoyUjJF+LckwdHY0UGyfKpDzQx0NqhVkZpnwkD5hOOphljqkxIUVOI2lCTGlK1mhqsjFzBMUREfbvDz/s3AEekM1ewOv5eKyH7Hvd697Xvcbxal3cay0Hs9lsFgAAAAAAAAC7c7R3AAAAAAAAAACuoFgHAAAAAAAAGATFOgAAAAAAAMAgKNYBAAAAAAAABkGxDgAAAAAAADAIinUAAAAAAACAQVCsAwAAAAAAAAyCYh0AAAAAAABgEI3sHUB9VVZWppMnT6p58+ZycHCwdzgAUOeZzWadO3dOPj4+cnTkd03kGQCoWeSZisg1AFCzbjTXUKyzkZMnT8rX19feYQBAvXP8+HG1b9/e3mHYHXkGAGyDPPMLcg0A2Mb1cg3FOhtp3ry5pCv/A7i5udk5GgCo+woLC+Xr62v597WhI88AQM0iz1RErgGAmnWjuYZinY2ULxN3c3MjsQFADeI2nCvIMwBgG+SZX5BrAMA2rpdreBgDAAAAAAAAYBAU6wAAAAAAAACDoFgHAAAAAAAAGATFOgAAAAAAAMAgKNYBAAAAAAAABkGxDgAAAAAAADAIinUAAAAAAACAQVCsAwAAAAAAAAyCYh0AAAAAAABgEBTrAAAAAAAAAIOgWAcAAAAAAAAYBMU6AAAAAAAAwCAo1gEAAAAAAAAGQbEOAAAAAAAAMIhG9g4AAHBtfjO32zuEGnN0YYS9QwAA/Ep9yjMSuQYAUPdRrAMAAAAA1DgKwQBQPdwGCwAAAAAAABgExToAAAAAAADAICjWAQAAAAAAAAZBsQ4AAAAAAAAwCIp1AAAAAAAAgEFQrAMAAAAAAAAMgmIdAAAAAAAAYBAU6wAAAAAAAACDoFgHAAAAAAAAGATFOgAAAAAAAMAgKNYBAAAAAAAABkGxDgAAAAAAADAIinUAAAAAAACAQVCsAwAAAAAAAAyCYh0AAAAAAABgEBTrAAAAAAAAAIOgWAcAAAAAAAAYBMU6AAAAAAAAwCAo1gEAAAAAAAAGQbEOAIDr2LVrl4YNGyYfHx85ODhoy5Yt1z0mIyNDffr0kYuLi+644w699dZbFfokJibKz89Prq6uCg4O1t69e2s+eAAAAAB1Sp0v1nEBBQCwtaKiIvn7+ysxMfGG+h85ckQRERG69957lZOTo6effloTJ07U3//+d0ufjRs3KjY2VnPnzlV2drb8/f0VFhamU6dO2WoaAAAAAOqAOl+s4wIKAGBr999/v1544QU99NBDN9Q/KSlJnTp10muvvabu3bsrJiZGv//97/X6669b+ixevFjR0dGKiopSjx49lJSUpKZNmyo5OdlW0wAAAABQB9T5Yh0XUAAAo8nMzFRoaKhVW1hYmDIzMyVJly5dUlZWllUfR0dHhYaGWvr8WnFxsQoLC602AAAAAPVPnS/W3SxbXEBJXEQBAH5hMpnk5eVl1ebl5aXCwkL997//1enTp1VaWlppH5PJVOmY8fHxcnd3t2y+vr42ix8AAACA/TS4Yp0tLqAkLqIAALYVFxengoICy3b8+HF7hwQAAADABhpcsc5WuIgCAJTz9vZWfn6+VVt+fr7c3NzUpEkTeXp6ysnJqdI+3t7elY7p4uIiNzc3qw0AAABA/dPginW2uICSuIgCAPwiJCRE6enpVm1paWkKCQmRJDk7OyswMNCqT1lZmdLT0y19AAAAADRMDa5YxwUUAOBmnT9/Xjk5OcrJyZF05c3iOTk5OnbsmKQrq6vHjRtn6f+nP/1J//73vzV9+nQdOnRI//u//6t33nlHU6dOtfSJjY3
VypUrtXbtWh08eFCTJ09WUVGRoqKianVuAAAAAIylkb0DuFXnz5/Xd999Z/lcfgHl4eGhDh06KC4uTj/++KPWrVsn6coF1LJlyzR9+nQ99thj+vjjj/XOO+9o+/btljFiY2M1fvx49e3bV0FBQUpISOACCgAasH379unee++1fI6NjZUkjR8/Xm+99Zby8vIshTtJ6tSpk7Zv366pU6dqyZIlat++vVatWqWwsDBLn5EjR+qnn37SnDlzZDKZFBAQoNTU1ArPTAUAAADQsNT5Yh0XUAAAWxs0aJDMZnOV+996661Kj/nyyy+vOW5MTIxiYmJuNTwAAAAA9UidL9ZxAQUAAAAAAID6osE9sw4AAAAAAAAwKop1AAAAAGBDiYmJ8vPzk6urq4KDg7V3795r9t+0aZO6desmV1dX9erVSzt27LDsKykp0YwZM9SrVy81a9ZMPj4+GjdunE6ePGk1hp+fnxwcHKy2hQsX2mR+AICaRbEOAAAAAGxk48aNio2N1dy5c5WdnS1/f3+FhYXp1KlTlfbfvXu3Ro8erQkTJujLL79UZGSkIiMjtX//fknShQsXlJ2drdmzZys7O1ubN29Wbm6uHnjggQpjzZ8/X3l5eZbtySeftOlcAQA1g2IdAAAAANjI4sWLFR0draioKPXo0UNJSUlq2rSpkpOTK+2/ZMkShYeHa9q0aerevbsWLFigPn36aNmyZZIkd3d3paWl6ZFHHlHXrl119913a9myZcrKyrJ6sZ4kNW/eXN7e3patWbNmNp8vAODWUawDAAAAABu4dOmSsrKyFBoaamlzdHRUaGioMjMzKz0mMzPTqr8khYWFVdlfkgoKCuTg4KAWLVpYtS9cuFCtWrXSnXfeqVdeeUWXL1++ZrzFxcUqLCy02gAAta/Ovw0WAAAAAIzo9OnTKi0tlZeXl1W7l5eXDh06VOkxJpOp0v4mk6nS/hcvXtSMGTM0evRoubm5Wdqfeuop9enTRx4eHtq9e7fi4uKUl5enxYsXVxlvfHy85s2bd6PTAwDYCMU6AAAAAKiDSkpK9Mgjj8hsNmv58uVW+2JjYy0/9+7dW87Oznr88ccVHx8vFxeXSseLi4uzOq6wsFC+vr62CR4AUCWKdQAAAABgA56ennJyclJ+fr5Ve35+vry9vSs9xtvb+4b6lxfqfvjhB3388cdWq+oqExwcrMuXL+vo0aPq2rVrpX1cXFyqLOQBAGoPz6wDAAAAABtwdnZWYGCg0tPTLW1lZWVKT09XSEhIpceEhIRY9ZektLQ0q/7lhbrDhw9r586datWq1XVjycnJkaOjo9q0aVPN2QAAagsr6wAAAADARmJjYzV+/Hj17dtXQUFBSkhIUFFRkaKioiRJ48aNU7t27RQfHy9JmjJligYOHKjXXntNERERSklJ0b59+7RixQpJVwp1v//975Wdna1t27aptLTU8jw7Dw8POTs7KzMzU3v27NG9996r5s2bKzMzU1OnTtXYsWPVsmVL+5wIAMANo1gHAAAAADYycuRI/fTTT5ozZ45MJpMCAgKUmppqeYnEsWPH5Oj4yw1P/fr104YNGzRr1iw999xz6tKli7Zs2aKePXtKkn788Udt3bpVkhQQEGD1XZ988okGDRokFxcXpaSk6Pnnn1dxcbE6deqkqVOnWj2PDgBgXBTrAAAAAMCGYmJiFBMTU+m+jIyMCm0jRozQiBEjKu3v5+cns9l8ze/r06ePPv/885uOEwBgDDyzDgAAAAAAADAIinUAAAAAAACAQVCsAwAAAAAAAAyCYh0AAAAAAABgEBTrAAAAAAAAAIOgWAcAAAAAAAAYBMU6AAAAAAAAwCAo1gEAAAAAAAAGQbEOAAAAAAAAMAiKdQAAAAAAAIBBUKwDAAAAAAAADIJiHQAAAAAAAGAQFOsAAAAAAAAAg6BYBwAAAAAAABgExToAAAAAAADAICjWAQAAAAAAAAZBsQ4AAAAAAAAwCIp1AAAAAAAAgEFQrAM
AAAAAAAAMgmIdAAAAAAAAYBAU6wAAAAAAAACDoFgHAAAAAAAAGATFOgAAAAAAAMAgKNYBAAAAAAAABkGxDgCAG5CYmCg/Pz+5uroqODhYe/furbLvoEGD5ODgUGGLiIiw9Hn00Ucr7A8PD6+NqQAAAAAwsEb2DgAAAKPbuHGjYmNjlZSUpODgYCUkJCgsLEy5ublq06ZNhf6bN2/WpUuXLJ9//vln+fv7a8SIEVb9wsPDtWbNGstnFxcX200CAAAAQJ1Qb1bWseIBAGArixcvVnR0tKKiotSjRw8lJSWpadOmSk5OrrS/h4eHvL29LVtaWpqaNm1aoVjn4uJi1a9ly5a1MR0AAAAABlYvinXlKx7mzp2r7Oxs+fv7KywsTKdOnaq0/+bNm5WXl2fZ9u/fLycnp0pXPFzd769//WttTAcAYCCXLl1SVlaWQkNDLW2Ojo4KDQ1VZmbmDY2xevVqjRo1Ss2aNbNqz8jIUJs2bdS1a1dNnjxZP//8c43GDgAAAKDuqRe3wV694kGSkpKStH37diUnJ2vmzJkV+nt4eFh9TklJueaKBwBAw3X69GmVlpbKy8vLqt3Ly0uHDh267vF79+7V/v37tXr1aqv28PBwPfzww+rUqZO+//57Pffcc7r//vuVmZkpJyenCuMUFxeruLjY8rmwsLCaMwIAAABgZHW+WFe+4iEuLs7SVtMrHlq2bKn77rtPL7zwglq1alXpGFxEAQAqs3r1avXq1UtBQUFW7aNGjbL83KtXL/Xu3VudO3dWRkaGBg8eXGGc+Ph4zZs3z+bxAgAAALCvOn8b7LVWPJhMpuseX77iYeLEiVbt4eHhWrdundLT0/Xyyy/r008/1f3336/S0tJKx4mPj5e7u7tl8/X1rf6kAACG4enpKScnJ+Xn51u15+fnX3f1dVFRkVJSUjRhwoTrfs/tt98uT09Pfffdd5Xuj4uLU0FBgWU7fvz4jU8CAAAAQJ1R54t1t+paKx4eeOAB9erVS5GRkdq2bZu++OILZWRkVDoOF1EAUD85OzsrMDBQ6enplraysjKlp6crJCTkmsdu2rRJxcXFGjt27HW/58SJE/r555/Vtm3bSve7uLjIzc3NagMAAABQ/9T5Yp1RVjxwEQUA9VdsbKxWrlyptWvX6uDBg5o8ebKKioosz0odN26c1eMYyq1evVqRkZEVHqFw/vx5TZs2TZ9//rmOHj2q9PR0Pfjgg7rjjjsUFhZWK3MCAAAAYEx1/pl1V694iIyMlPTLioeYmJhrHluTKx4AAPXXyJEj9dNPP2nOnDkymUwKCAhQamqq5REMx44dk6Oj9e+/cnNz9dlnn+mjjz6qMJ6Tk5O++uorrV27VmfPnpWPj4+GDBmiBQsWyMXFpVbmBAAAAMCY6nyxTrqy4mH8+PHq27evgoKClJCQUGHFQ7t27RQfH2913LVWPMybN0/Dhw+Xt7e3vv/+e02fPp0VDwDQgMXExFT5S6DKHpHQtWtXmc3mSvs3adJEf//732syPAAAAAD1RL0o1rHiAQAAAAAAAPVBvSjWSax4AAAAAAAAQN1X518wAQAAAAAAANQXFOsAAAAAAAAAg6BYBwAAAAAAABgExToAAAAAAADAICjWAQAAAAAAAAZBsQ4AAAAAAAAwCIp1AAAAAAAAgEFQrAMAAAAAAAAMopG9AwCq4jdzu71DqFFHF0bYOwQAAAAAAGBwrKwDAAAAAAAADIKVdQAAoMFiFTcAAACMhpV1AAAAAAAAgEFQrAMAAAAAAAAMgmIdAAAAAAAAYBAU6wAAAAAAAACDoFgHAAAAAAAAGATFOgAAAAAAAMAgKNYBAAAAAAAABkGxDgAAAABsKDExUX5+fnJ1dVVwcLD27t17zf6bNm1St27d5Orqql69emnHjh2WfSUlJZoxY4Z69eqlZs2aycfHR+PGjdPJkyetxjhz5ozGjBkjNzc3tWjRQhMmTND58+dtMj8AQM2iWAcAAAAANrJx40bFxsZ
q7ty5ys7Olr+/v8LCwnTq1KlK++/evVujR4/WhAkT9OWXXyoyMlKRkZHav3+/JOnChQvKzs7W7NmzlZ2drc2bNys3N1cPPPCA1ThjxozRgQMHlJaWpm3btmnXrl2aNGmSzecLALh1FOsAAAAAwEYWL16s6OhoRUVFqUePHkpKSlLTpk2VnJxcaf8lS5YoPDxc06ZNU/fu3bVgwQL16dNHy5YtkyS5u7srLS1NjzzyiLp27aq7775by5YtU1ZWlo4dOyZJOnjwoFJTU7Vq1SoFBwdrwIABWrp0qVJSUiqswAMAGA/FOgAAAACwgUuXLikrK0uhoaGWNkdHR4WGhiozM7PSYzIzM636S1JYWFiV/SWpoKBADg4OatGihWWMFi1aqG/fvpY+oaGhcnR01J49e6ocp7i4WIWFhVYbAKD2UawDAAAAABs4ffq0SktL5eXlZdXu5eUlk8lU6TEmk+mm+l+8eFEzZszQ6NGj5ebmZhmjTZs2Vv0aNWokDw+PKseRpPj4eLm7u1s2X1/f684RAFDzKNYBAAAAQB1UUlKiRx55RGazWcuXL7/l8eLi4lRQUGDZjh8/XgNRAgBuViN7BwAAAAAA9ZGnp6ecnJyUn59v1Z6fny9vb+9Kj/H29r6h/uWFuh9++EEff/yxZVVd+Ri/foHF5cuXdebMmSq/V5JcXFzk4uJyQ3MDANgOK+sAAAAAwAacnZ0VGBio9PR0S1tZWZnS09MVEhJS6TEhISFW/SUpLS3Nqn95oe7w4cPauXOnWrVqVWGMs2fPKisry9L28ccfq6ysTMHBwTUxNQCADbGyDgAAAABsJDY2VuPHj1ffvn0VFBSkhIQEFRUVKSoqSpI0btw4tWvXTvHx8ZKkKVOmaODAgXrttdcUERGhlJQU7du3TytWrJB0pVD3+9//XtnZ2dq2bZtKS0stz6Hz8PCQs7OzunfvrvDwcEVHRyspKUklJSWKiYnRqFGj5OPjY58TAQC4YRTrAAAAAMBGRo4cqZ9++klz5syRyWRSQECAUlNTLS+ROHbsmBwdf7nhqV+/ftqwYYNmzZql5557Tl26dNGWLVvUs2dPSdKPP/6orVu3SpICAgKsvuuTTz7RoEGDJEnr169XTEyMBg8eLEdHRw0fPlxvvPGG7ScMALhlFOsAAAAAwIZiYmIUExNT6b6MjIwKbSNGjNCIESMq7e/n5yez2Xzd7/Tw8NCGDRtuKk4AgDHwzDoAAAAAAADAICjWAQAAAAAAAAZBsQ4AAAAAAAAwCIp1AAAAAAAAgEFQrAMAAAAAAAAMgmIdAAAAAAAAYBAU6wAAAAAAAACDoFgHAMANSExMlJ+fn1xdXRUcHKy9e/dW2fett96Sg4OD1ebq6mrVx2w2a86cOWrbtq2aNGmi0NBQHT582NbTAAAAAGBw9aZYx0UUAMBWNm7cqNjYWM2dO1fZ2dny9/dXWFiYTp06VeUxbm5uysvLs2w//PCD1f5FixbpjTfeUFJSkvbs2aNmzZopLCxMFy9etPV0AAAAABhYvSjWcREFALClxYsXKzo6WlFRUerRo4eSkpLUtGlTJScnV3mMg4ODvL29LZuXl5dln9lsVkJCgmbNmqUHH3xQvXv31rp163Ty5Elt2bKlFmYEAAAAwKjqRbGOiygAgK1cunRJWVlZCg0NtbQ5OjoqNDRUmZmZVR53/vx5dezYUb6+vnrwwQd14MABy74jR47IZDJZjenu7q7g4OAqxywuLlZhYaHVBgAAAKD+qfPFOi6iAAC2dPr0aZWWllr9UkeSvLy8ZDKZKj2ma9euSk5O1t/+9je9/fbbKisrU79+/XTixAlJshx3M2PGx8fL3d3dsvn6+t7q1AAAAAAYUJ0v1nERBQAwmpCQEI0bN04BAQEaOHCgNm/erNatW+vNN9+s9phxcXEqKCiwbMePH6/BiAEAAAAYRZ0v1lUHF1EAgBvl6ekpJycn5efnW7Xn5+fL29v7hsZo3Lix7rzzTn333Xe
SZDnuZsZ0cXGRm5ub1QYAAACg/qnzxTouogAAtuTs7KzAwEClp6db2srKypSenq6QkJAbGqO0tFRff/212rZtK0nq1KmTvL29rcYsLCzUnj17bnhMAAAAAPVTnS/WcREFALC12NhYrVy5UmvXrtXBgwc1efJkFRUVKSoqSpI0btw4xcXFWfrPnz9fH330kf79738rOztbY8eO1Q8//KCJEydKuvKSo6efflovvPCCtm7dqq+//lrjxo2Tj4+PIiMj7TFFAAAAAAbRyN4B1ITY2FiNHz9effv2VVBQkBISEipcRLVr107x8fGSrlxE3X333brjjjt09uxZvfLKK1VeRHXp0kWdOnXS7NmzuYgCgAZq5MiR+umnnzRnzhyZTCYFBAQoNTXV8mzTY8eOydHxl99//ec//1F0dLRMJpNatmypwMBA7d69Wz169LD0mT59uoqKijRp0iSdPXtWAwYMUGpqqlxdXWt9fgCAX5SUlGjw4MFKSkpSly5d7B0OAKABqhfFOi6iAAC2FhMTo5iYmEr3ZWRkWH1+/fXX9frrr19zPAcHB82fP1/z58+vqRABADWgcePG+uqrr+wdBgCgAasXxTqJiygAAAAANWPs2LFavXq1Fi5caO9QAAANUL0p1gEAAABATbh8+bKSk5O1c+dOBQYGqlmzZlb7Fy9ebKfIAAANAcU6AAAAALjK/v371adPH0nSt99+a7XPwcHBHiGhjvKbud3eIdSoowsj7B0C0CBQrAMAAACAq3zyySf2DgEA0IA5Xr8LAAAAADRMJ06c0IkTJ+wdBgCgAaFYBwAAAABXKSsr0/z58+Xu7q6OHTuqY8eOatGihRYsWKCysjJ7hwcAqOe4DRYAAAAArvI///M/lrfB9u/fX5L02Wef6fnnn9fFixf14osv2jlCAEB9RrEOAAAAAK6ydu1arVq1Sg888IClrXfv3mrXrp3+/Oc/U6wDANgUt8ECAAAAwFXOnDmjbt26VWjv1q2bzpw5Y4eIAAANCcU6AAAAALiKv7+/li1bVqF92bJl8vf3t0NEAICGhNtgAQAAAOAqixYtUkREhHbu3KmQkBBJUmZmpo4fP64dO3bYOToAQH3HyjoAAAAAuMrAgQP17bff6qGHHtLZs2d19uxZPfzww8rNzdU999xj7/AAAPUcK+sAAAAA4P+UlJRo8ODBSkpK4kUSAAC7YGUdAAAAAPyfxo0b66uvvrJ3GACABoxiHQAAAABcZezYsVq9erW9wwAANFDcBgsAAAAAV7l8+bKSk5O1c+dOBQYGqlmzZlb7Fy9ebKfIAAANAcU6AAAAALjK/v371adPH0nSt99+a7XPwcHBHiEBABoQinUAAAAA8H9KS0s1b9489erVSy1btrR3OACABohn1gEAAADA/3FyctKQIUN09uxZe4cCAGigKNYBAAAAwFV69uypf//73/YOAwDQQFGsAwAAAICrvPDCC3r22We1bds25eXlqbCw0GoDAMCWeGYdAAAAAFxl6NChkqQHHnjA6oUSZrNZDg4OKi0ttVdoAIAGgGIdAAAAAFzlk08+sXcIAIAGjGIdAAAAAFxl4MCB9g4BANCA8cw6AAAAAPiVf/zjHxo7dqz69eunH3/8UZL0l7/8RZ999pmdIwMA1HcU6wAAAADgKu+9957CwsLUpEkTZWdnq7i4WJJUUFCgl156yc7RAQDqO4p1AAAAAHCVF154QUlJSVq5cqUaN25sae/fv7+ys7PtGBkAoCGgWAcAAAAAV8nNzdVvf/vbCu3u7u46e/Zs7QcEAGhQaqRYV1JSouPHjys3N1dnzpypiSEBAAAAwC68vb313XffVWj/7LPPdPvtt9shIgBAQ1LtYt25c+e0fPlyDRw4UG5ubvLz81P37t3VunVrdezYUdHR0friiy9qMlYAAAAAsLno6GhNmTJFe/bskYODg06ePKn169fr2Wef1eTJk+0dHgCgnmtUnYMWL16sF198UZ07d9awYcP03HP
PycfHR02aNNGZM2e0f/9+/eMf/9CQIUMUHByspUuXqkuXLjUdOwCggSgpKdGJEyckSWfOnJGbm5udIwIA1GczZ85UWVmZBg8erAsXLui3v/2tXFxc9Oyzz+rJJ5+0d3gAgHquWsW6L774Qrt27dJvfvObSvcHBQXpscceU1JSktasWaN//OMfFOsAADfl3Llzevvtt5WSkqK9e/fq0qVLkqTOnTurffv2GjJkiCZNmqS77rrLzpECAOobBwcH/c///I+mTZum7777TufPn1ePHj1022232Ts0AEADUK1i3V//+tcb6ufi4qI//elP1fkKAEADVtkKbjc3N/Xr1087d+7UkSNHWMFdA/xmbrd3CDXq6MIIe4cAoJ5xdnZWjx497B0GAKCBqVax7mqlpaVatWqVcnNz1b59e/n7+ysgIECtWrWqifgAAA1QZSu4CwsLJUmBgYG69957WcENAAAAoF665WLdk08+qffee0+hoaFatmyZHBwcdPnyZbVr104BAQHaunVrTcQJAGhAWMENAAAAoKGq9ttgy23evFnr1q3T+vXr5eLion379mnJkiW6ePGiOnbsWBMxAgAAAAAAAA3CLa+sK3/YqiQ1btxYjRo1UkxMjEpKSnTy5MlbDhAA0LCVP27h66+/liRlZGSof//+PG4BAAAAQL10yyvrbr/9dktRrl27dvrxxx8lScOGDdPbb799q8MDABq4J598UnPmzNGpU6ckSSNGjFCbNm3UoUMHPfDAA7UWR2Jiovz8/OTq6qrg4GDt3bu3yr4rV67UPffco5YtW6ply5YKDQ2t0P/RRx+Vg4OD1RYeHm7raQAAbtBf/vIX9e/fXz4+Pvrhhx8kSQkJCfrb3/5202PdTA6RpE2bNqlbt25ydXVVr169tGPHDqv9mzdv1pAhQ9SqVSs5ODgoJyenwhiDBg2qkGd4dAQA1A23XKx7+OGH9eGHH0qSBg4cqOTkZEnSN998o//+97+3OvwN4yIKAOqn8sctrFq1StKVlXW1/biFjRs3KjY2VnPnzlV2drb8/f0VFhZmKSD+WkZGhkaPHq1PPvlEmZmZ8vX11ZAhQyy/0CoXHh6uvLw8y3ajz+oDANjW8uXLFRsbq6FDh+rs2bMqLS2VJLVo0UIJCQk3NdbN5pDdu3dr9OjRmjBhgr788ktFRkYqMjJS+/fvt/QpKirSgAED9PLLL1/zu6Ojo63yzKJFi24qdgCAfdzybbDPP/+85efp06frrrvuUuvWrVVYWKgJEybc6vA3pDwBJiUlKTg4WAkJCQoLC1Nubq7atGlToX/5RVS/fv3k6uqql19+WUOGDNGBAwfUrl07S7/w8HCtWbPG8tnFxaVW5lPOb+b2Wv0+Wzq6MMLeIQCoo65+3IIkuzxuYfHixYqOjlZUVJQkKSkpSdu3b1dycrJmzpxZof/69eutPq9atUrvvfee0tPTNW7cOEu7i4uLvL29bRs8AOCmLV26VCtXrlRkZKQWLlxoae/bt6+effbZmxrrZnPIkiVLFB4ermnTpkmSFixYoLS0NC1btkxJSUmSpD/+8Y+SpKNHj17zu5s2bUqeAYA66JZX1l2tQ4cOOnDggBYtWqRNmzYpMTGxJoev0tUJsEePHkpKSlLTpk0tq/x+bf369frzn/+sgIAAdevWTatWrVJZWZnS09Ot+pVfRJVvLVu2rI3pAACucvXjFiRZfq6txy1cunRJWVlZCg0NtbQ5OjoqNDRUmZmZNzTGhQsXVFJSIg8PD6v2jIwMtWnTRl27dtXkyZP1888/VzlGcXGxCgsLrTYAgG0cOXJEd955Z4V2FxcXFRUV3fA41ckhmZmZVv0lKSws7IZzztXWr18vT09P9ezZU3Fxcbpw4cI1+5NrAMAYarRYJ0menp6KiorSAw88IAcHh5oevgKjXEQBAGzj6sctSLIU6GrrcQunT59WaWmpvLy8rNq9vLxkMpluaIwZM2bIx8fHKleFh4dr3bp1Sk9P18s
vv6xPP/1U999/v+VWq1+Lj4+Xu7u7ZfP19a3+pAAA19SpU6dKnwOXmpqq7t273/A41ckhJpPplnJOuT/84Q96++239cknnyguLk5/+ctfNHbs2GseQ64BAGOo1m2wx44dU4cOHW64/48//mh1e2lNulYCPHTo0A2NUdVF1MMPP6xOnTrp+++/13PPPaf7779fmZmZcnJyqjBGcXGxiouLLZ/5LRQA1Izyxy2U/7v66aef1vrjFm7FwoULlZKSooyMDLm6ulraR40aZfm5V69e6t27tzp37qyMjAwNHjy4wjhxcXGKjY21fC4sLOQiCgBsJDY2Vk888YQuXrwos9msvXv36q9//avi4+Mtz1A1ukmTJll+7tWrl9q2bavBgwfr+++/V+fOnSs9hlwDAMZQrWLdXXfdpcjISE2cOFF33XVXpX0KCgr0zjvvaMmSJZo0aZKeeuqpWwrUVmrqIio+Pl7z5s2rlZgBoCHbs2ePMjIy1KpVKw0bNszm3+fp6SknJyfl5+dbtefn51/3OUCvvvqqFi5cqJ07d6p3797X7Hv77bfL09NT3333XaV5xsXFpdafnQoADdXEiRPVpEkTzZo1SxcuXNAf/vAH+fj4aMmSJVbXCddTnRzi7e1drZxzPcHBwZKk7777rspiHbkGAIyhWrfBfvPNN2rWrJl+97vfydvbWxEREYqOjtaTTz6psWPHqk+fPmrTpo2Sk5O1aNEimxbqauIi6qOPPrqpi6jKxMXFqaCgwLIdP3785iYCALA4duxYlftatWpV4XELv37Lak1ydnZWYGCg1XNNy59zGhISUuVxixYt0oIFC5Samqq+ffte93tOnDihn3/+WW3btq2RuAEAt2bMmDE6fPiwzp8/L5PJpBMnTtz0iu7q5JCQkJAKz9JOS0u7Zs65EeW39ZJnAMD4qlWsa9WqlRYvXqy8vDwtW7ZMXbp00enTp3X48GFJVxJbVlaWMjMzNXTo0BoN+NeMchHl4uIiNzc3qw0AUD133XWXHn/8cX3xxRdV9ikoKNDKlSvVs2dPvffeezaNJzY2VitXrtTatWt18OBBTZ48WUVFRZY3+40bN05xcXGW/i+//LJmz56t5ORk+fn5yWQyyWQy6fz585KuvOF22rRp+vzzz3X06FGlp6frwQcf1B133KGwsDCbzgUAcH333Xefzp49K+nKG1XbtGkj6cptoffdd99NjXWzOWTKlClKTU3Va6+9pkOHDun555/Xvn37FBMTY+lz5swZ5eTk6JtvvpEk5ebmKicnx/Jcu++//14LFixQVlaWjh49qq1bt2rcuHH67W9/e91FCgAA+6vWbbDlmjRpot///vf6/e9/X1PxVEtsbKzGjx+vvn37KigoSAkJCRUSYLt27RQfHy/pykXUnDlztGHDBstFlCTddtttuu2223T+/HnNmzdPw4cPl7e3t77//ntNnz6diygAqCXffPONXnzxRf3ud7+Tq6urAgMD5enpKUmKjo7W4cOHdeDAAfXp00eLFi2y+S+GRo4cqZ9++klz5syRyWRSQECAUlNTLc9LPXbsmBwdf/n91/Lly3Xp0qUK+XHu3Ll6/vnn5eTkpK+++kpr167V2bNn5ePjoyFDhmjBggXcfgQABpCRkaFLly5VaL948aL+8Y9/3NRYN5tD+vXrpw0bNmjWrFl67rnn1KVLF23ZskU9e/a09Nm6davlWkf65RE+5XnG2dlZO3futFwX+fr6avjw4Zo1a9ZNxQ4AsI9bKtYZBRdRAFC/lK/gfvHFF7V9+3Z99tln+v777y37x4wZo7CwMKsLF1uLiYmxWtVwtYyMDKvPR48eveZYTZo00d///vcaigwAUJP2798v6covjq5+A2tpaalSU1Or9eK8m8khkjRixAiNGDGiyvEeffRRPfroo1Xu9/X11aeffnqzYQIADKJeFOskLqIAoD66egV3YWGh3N3dtXLlSh41AACwmXvuuUcODg6V3u7apEkTLV261A5
RAQAakpsu1v3jH//QPffco3/+85/q37+/LWICAKBK//3vfynWAQBsJicnR/7+/tq7d69at25taXd2dlabNm3k5ORkx+gAAA3BTRfrPvzwQzVq1Ejbt2+nWAcAqHVhYWGWN9qVO3TokLp162afgAAA9UrHjh1VVlZm7zAAAA3YTRXr5s2bp8uXL+u+++7TU089pfnz52vOnDm2ig0AAIsPP/xQklRUVKTjx4/L19fXsm/kyJH617/+Za/QAAD1zLp16665f9y4cbUUCQCgIbqpYt3cuXO1cuVKLViwQC1atNDEiRNtFRcAAFa6d+8uSfr55581btw4HTt2TO3atVPbtm3VuHFjO0cHAKhPpkyZYvW5pKREFy5ckLOzs5o2bUqxDgBgUzd9G+zly5f17LPP6s0337RFPAAAWPn3v/+tf/3rX7p06ZIkaf369br//vslST/++KN++OGHWn0rLACg/vvPf/5Toe3w4cOaPHmypk2bZoeIAAANyU0X6yZPnixJevzxx2s8GAAAypWWlmrixIlat26dzGazpT0+Pl7t27dXr1691K5dO7Vr186OUQIAGoouXbpo4cKFGjt2rA4dOmTvcAAA9ZijvQMAAKAyL730krZu3ao333xTubm5+uyzzyRJ58+fV//+/fXxxx/bOUIAQEPTqFEjnTx50t5hAADquZteWfdr5bcnNWrUSL1791bHjh1rIi4AQAO3du1avf7665bnAnl5eUmSMjIytGLFCkVGRurw4cNq0qSJvvzySw0cONCe4QIA6pGtW7dafTabzcrLy9OyZcvUv39/O0UFAGgoql2sq+z2JAcHBw0cOFBLlixRr169aixIAEDDc/z4cd1zzz2V7nv22WeVm5urxx57TN9++60ee+wxinUAgBoTGRlp9dnBwUGtW7fWfffdp9dee80+QQEAGoxq3wb769uTcnJytGrVKp07d47bkwAAt8zDw6PSB3yXmzhxoj788EPde++9euaZZ2oxMgBAfVdWVma1lZaWymQyacOGDWrbtq29wwMA1HPVXln369uTJKl3796KiorSq6++yu1JAIBbMmjQIL399tvq06dPpfu9vLzUqFEjrVixopYjAwAAAADbqXaxjtuTAAC2NGPGDAUHByswMFBjxoypsH/fvn1q3769HSIDANRHzz33nOVPZ2fna/ZdvHhxbYQEAGigql2sK789qVOnTpXunzhxokJCQjRx4kRuTwIA3LSAgAAtX75c48eP1zvvvKNHH31UklRQUKCMjAxNnTpVY8eOtW+QAIB646uvvrL86eTkVGU/BweH2goJANBAVbtYx+1JAABbe+yxx3T77bfr6aef1vDhwyVJfn5+MpvNCgsL09y5c+0cIQCgvti2bZvc3d21bds2ubm52TscAEADVu0XTMyYMUOJiYlav359pfu5PQkAUBMGDRqknJwc7dq1S9KVW48yMzP14YcfytXV1c7RAQDquxMnTujEiRP2DgMA0IBUu1h39e1JDz74oD766CPl5+eroKBAW7du1dSpUzVy5MiajBUA0ID17t1bkhQVFaXg4GA7RwMAqM/Kyso0f/58ubu7q2PHjurYsaNatGihBQsWqKyszN7hAQDquWrfBitZ354UHh5ueX4DtycBAAAAqKv+53/+R6tXr9bChQvVv39/SdJnn32m559/XhcvXtSLL75o5wgBAPXZLRXrpF9uTyrfLl26JH9/f1Y9AAAAAKiT1q5dq1WrVumBBx6wtPXu3Vvt2rXTn//8Z4p1AACbuuViXbmAgAAFBATU1HAAAAAAYBdnzpxRt27dKrR369ZNZ86csUNEAICGpNrPrAMAAACA+sjf31/Lli2r0L5s2TL5+/vbISIAQENSYyvrAAAAAKA+WLRokSIiIrRz506FhIRIkjIzM3X8+HHt2LHDztEBAOo7VtYBAAAAwFUGDhyob7/9Vg899JDOnj2rs2fP6uGHH1Zubq7uuecee4cHAKjnWFkHAAAAAL/
i4+PDiyQAAHZBsQ4AAAAArpKamqrbbrtNAwYMkCQlJiZq5cqV6tGjhxITE9WyZUs7RwigLvGbud3eIdSYowsj7B1Cg8BtsAAAAABwlWnTpqmwsFCS9PXXXys2NlZDhw7VkSNHFBsba+foAAD1HSvrAAAAAOAqR44cUY8ePSRJ7733noYNG6aXXnpJ2dnZGjp0qJ2jAwDUd6ysAwAAAICrODs768KFC5KknTt3asiQIZIkDw8Py4o7AABshZV1AAAAAHCVAQMGKDY2Vv3799fevXu1ceNGSdK3336r9u3b2zk6AEB9x8o6AAAAALjKsmXL1KhRI7377rtavny52rVrJ0n68MMPFR4ebufoAAD1HSvrAAAAAOAqHTp00LZt2yq0v/7663aIBqjb6tObUCXehoraQbEOAAAAAH6ltLRU77//vg4ePChJ6t69uyIjI9WoEZdQAADbItMAAAAAwFUOHDigYcOGKT8/X127dpUkvfzyy2rdurU++OAD9ezZ084RAgDqM55ZBwAAAABXmThxonr27KkTJ04oOztb2dnZOn78uHr37q1JkybZOzwAQD3HyjoAAAAAuEpOTo727dunli1bWtpatmypF198UXfddZcdIwMANASsrAMAAACAq/x//9//p/z8/Artp06d0h133GGHiAAADQnFOgAAAAANXmFhoeXP+Ph4PfXUU3r33Xd14sQJnThxQu+++66efvppvfzyy3aOFABQ31GsAwDgBiQmJsrPz0+urq4KDg7W3r17r9l/06ZN6tatm1xdXdWrVy/t2LHDar/ZbNacOXPUtm1bNWnSRKGhoTp8+LAtpwAAuIYOHTpIkjp27Khhw4bpm2++0SOPPKKOHTuqY8eOeuSRR7R//34NGzbMzpECAOo7nlkHAMB1bNy4UbGxsUpKSlJwcLASEhIUFham3NxctWnTpkL/3bt3a/To0YqPj9f/+3//Txs2bFBkZKSys7MtbxBctGiR3njjDa1du1adOnXS7NmzFRYWpm+++Uaurq61PUUAaPC2bdumiIgIffDBB2rWrJm9wwEANGD1ZmUdKx4AALayePFiRUdHKyoqSj169FBSUpKaNm2q5OTkSvsvWbJE4eHhmjZtmrp3764FCxaoT58+WrZsmaQrOSYhIUGzZs3Sgw8+qN69e2vdunU6efKktmzZUoszAwCUGzBggOXPgQMHVrm1atXKzpECAOq7elGsK1/xMHfuXGVnZ8vf319hYWE6depUpf3LVzxMmDBBX375pSIjIxUZGan9+/db+pSveEhKStKePXvUrFkzhYWF6eLFi7U1LQCAAVy6dElZWVkKDQ21tDk6Oio0NFSZmZmVHpOZmWnVX5LCwsIs/Y8cOSKTyWTVx93dXcHBwVWOWVxcrMLCQqsNAFA7zp07pxUrVigoKEj+/v72DgcAUM/Vi9tgr17xIElJSUnavn27kpOTNXPmzAr9r17xIEkLFixQWlqali1bpqSkpAorHiRp3bp18vLy0pYtWzRq1KjamxwAwK5Onz6t0tJSeXl5WbV7eXnp0KFDlR5jMpkq7W8ymSz7y9uq6vNr8fHxmjdvXrXmUJWjCyNqdLy6qKGfA7+Z2+0dQo2qzv+enAP+f3Atu3bt0urVq/Xee+/Jx8dHDz/8sBITE+0dFgCgnqvzK+tY8QAAaAji4uJUUFBg2Y4fP27vkACgXsrPz9fChQvVpUsXjRgxQm5ubiouLtaWLVu0cOFC3XXXXfYOEQBQz9X5lXWseKi/Gvr8JX7bL3EOqnsMao6np6ecnJyUn59v1Z6fny9vb+9Kj/H29r5m//I/8/Pz1bZtW6s+AQEBlY7p4uIiFxeX6k4DAHCD+vbtq4iICCUkJCg8PFxOTk5KSkqyd1gAgAakzq+sMwpWPABA/eTs7KzAwEClp6db2srKypSenq6QkJBKjwkJCbHqL0lpaWmW/p06dZK3t7dVn8LCQu3Zs6fKMQEAteOPf/yj5s2bp4iICDk5Odk
7HABAA1TnV9ax4gEAYGuxsbEaP368+vbtq6CgICUkJKioqMjyrNRx48apXbt2io+PlyRNmTJFAwcO1GuvvaaIiAilpKRo3759WrFihSTJwcFBTz/9tF544QV16dJFnTp10uzZs+Xj46PIyEh7TRMAoCsvkwgMDFT37t31xz/+kedVAwBqXZ1fWceKBwCArY0cOVKvvvqq5syZo4CAAOXk5Cg1NdXyuIRjx44pLy/P0r9fv37asGGDVqxYIX9/f7377rvasmWLevbsaekzffp0Pfnkk5o0aZLuuusunT9/XqmpqXJ1da31+QEAfrF06VLl5eXp8ccfV0pKinx8fFRWVqa0tDSdO3fO3uEBABqAOr+yTmLFAwDA9mJiYhQTE1PpvoyMjAptI0aM0IgRI6ocz8HBQfPnz9f8+fNrKkQAQA1p1qyZHnvsMT322GPKzc3V6tWrtXDhQs2cOVO/+93vtHXrVnuHCACox+r8yjqJFQ8AAAAAbKNr165atGiRTpw4ob/+9a/2DgcA0ADUi5V1EiseAAAAANiOk5OTIiMjudMGAGBz9WJlHQAAAAAAAFAfUKwDAAAAAAAADIJiHQAAAAAAAGAQFOsAAAAAAAAAg6BYBwAAAAAAABgExToAAAAAsKHExET5+fnJ1dVVwcHB2rt37zX7b9q0Sd26dZOrq6t69eqlHTt2WO3fvHmzhgwZolatWsnBwUE5OTkVxrh48aKeeOIJtWrVSrfddpuGDx+u/Pz8mpwWAMBGKNYBAAAAgI1s3LhRsbGxmjt3rrKzs+Xv76+wsDCdOnWq0v67d+/W6NGjNWHCBH355ZeKjIxUZGSk9u/fb+lTVFSkAQMG6OWXX67ye6dOnaoPPvhAmzZt0qeffqqTJ0/q4YcfrvH5AQBqHsU6AAAAALCRxYsXKzo6WlFRUerRo4eSkpLUtGlTJScnV9p/yZIlCg8P17Rp09S9e3ctWLBAffr00bJlyyx9/vjHP2rOnDkKDQ2tdIyCggKtXr1aixcv1n333afAwECtWbNGu3fv1ueff26TeQIAag7FOgAAAACwgUuXLikrK8uqqObo6KjQ0FBlZmZWekxmZmaFIlxYWFiV/SuTlZWlkpISq3G6deumDh06XHOc4uJiFRYWWm0AgNpHsQ4AAAAAbOD06dMqLS2Vl5eXVbuXl5dMJlOlx5hMppvqX9UYzs7OatGixU2NEx8fL3d3d8vm6+t7w98JAKg5FOsAAAAAAIqLi1NBQYFlO378uL1DAoAGqZG9AwAAAACA+sjT01NOTk4V3sKan58vb2/vSo/x9va+qf5VjXHp0iWdPXvWanXd9cZxcXGRi4vLDX8PAMA2WFkHAAAAADbg7OyswMBApaenW9rKysqUnp6ukJCQSo8JCQmx6i9JaWlpVfavTGBgoBo3bmw1Tm5uro4dO3ZT4wAA7IOVdQAAAABgI7GxsRo/frz69u2roKAgJSQkqKioSFFRUZKkcePGqV27doqPj5ckTZkyRQMHDtRrr72miIgIpaSkaN++fVqxYoVlzDNnzujYsWM6efKkpCuFOOnKijpvb2+5u7trwoQJio2NlYeHh9zc3PTkk08qJCREd999dy2fAQDAzaJYBwAAAAA2MnLkSP3000+aM2eOTCaTAgIClJqaanmJxLFjx+To+MsNT/369dOGDRs0a9YsPffcc+rSpYu2bNminj17Wvps3brVUuyTpFGjRkmS5s6dq+eff16S9Prrr8vR0VHDhw9XcXGxwsLC9L//+7+1MGMAwK2iWAcAAAAANhQTE6OYmJhK92VkZFRoGzFihEaMGFHleI8++qgeffTRa36nq6urEhMTlZiYeDOhAgAMgGfWAQAAAAAAAAZBsQ4AAAAAAAAwCIp1AAAAAAAAgEFQrAMAAAAAAAAMgmIdAAAAAAAAYBAU6wAAAAAAAACDoFgHAAAAAAAAGATFOgAAAAAAAMAgKNYBAAAAAAAABkGxDgAAAAAAADAIinUAAAA
AAACAQVCsAwAAAAAAAAyCYh0AAAAAAABgEBTrAAAAAAAAAIOgWAcAAAAAAAAYBMU6AAAAAAAAwCAo1gEAAAAAAAAGQbEOAAAAAAAAMAiKdQAAAAAAAIBBUKwDAOAazpw5ozFjxsjNzU0tWrTQhAkTdP78+Wv2f/LJJ9W1a1c1adJEHTp00FNPPaWCggKrfg4ODhW2lJQUW08HAAAAgME1sncAAAAY2ZgxY5SXl6e0tDSVlJQoKipKkyZN0oYNGyrtf/LkSZ08eVKvvvqqevTooR9++EF/+tOfdPLkSb377rtWfdesWaPw8HDL5xYtWthyKgAAAADqgDq/so4VDwAAWzl48KBSU1O1atUqBQcHa8CAAVq6dKlSUlJ08uTJSo/p2bOn3nvvPQ0bNkydO3fWfffdpxdffFEffPCBLl++bNW3RYsW8vb2tmyurq61MS0AAAAABlbni3VjxozRgQMHlJaWpm3btmnXrl2aNGlSlf2vXvGwf/9+vfXWW0pNTdWECRMq9F2zZo3y8vIsW2RkpA1nAgAwmszMTLVo0UJ9+/a1tIWGhsrR0VF79uy54XEKCgrk5uamRo2sF7Q/8cQT8vT0VFBQkJKTk2U2m2ssdgAAAAB1U52+DbZ8xcMXX3xhuZBaunSphg4dqldffVU+Pj4Vjilf8VCuc+fOevHFFzV27FhdvnzZ6kKqfMUDAKBhMplMatOmjVVbo0aN5OHhIZPJdENjnD59WgsWLKjwi6T58+frvvvuU9OmTfXRRx/pz3/+s86fP6+nnnqq0nGKi4tVXFxs+VxYWHiTswEAAABQF9TplXWseAAAVMfMmTMrfdzB1duhQ4du+XsKCwsVERGhHj166Pnnn7faN3v2bPXv31933nmnZsyYoenTp+uVV16pcqz4+Hi5u7tbNl9f31uODwAAAIDx1OmVdax4AABUxzPPPKNHH330mn1uv/12eXt769SpU1btly9f1pkzZ6678vrcuXMKDw9X8+bN9f7776tx48bX7B8cHKwFCxaouLhYLi4uFfbHxcUpNjbW8rmwsJCCHQAAAFAPGbJYN3PmTL388svX7HPw4MFb/p7rrXgod+edd6qoqEivvPJKlcW6+Ph4zZs375ZjAgDYXuvWrdW6devr9gsJCdHZs2eVlZWlwMBASdLHH3+ssrIyBQcHV3lcYWGhwsLC5OLioq1bt97QiyNycnLUsmXLSgt1kuTi4lLlPgAAAAD1hyGLdax4AAAYQffu3RUeHq7o6GglJSWppKREMTExGjVqlOW5qD/++KMGDx6sdevWKSgoSIWFhRoyZIguXLigt99+W4WFhZbV1q1bt5aTk5M++OAD5efn6+6775arq6vS0tL00ksv6dlnn7XndAEAAAAYgCGLdax4AAAYxfr16xUTE6PBgwfL0dFRw4cP1xtvvGHZX1JSotzcXF24cEGSlJ2dbXlu6h133GE11pEjR+Tn56fGjRsrMTFRU6dOldls1h133KHFixcrOjq69iYGAAAAwJAMWay7Uax4AADYmoeHhzZs2FDlfj8/P6sXEA0aNOi6LyQKDw9XeHh4jcUIAAAAoP6o08U6iRUPAAAAAAAAqD/qfLGOFQ8AAAAAAACoLxztHQAAAAAAAACAKyjWAQAAAAAAAAZBsQ4AAAAAAAAwCIp1AAAAAAAAgEFQrAMAAAAAAAAMgmIdAAAAAAAAYBAU6wAAAAAAAACDoFgHAAAAAAAAGATFOgAAAAAAAMAgKNYBAAAAAAAABkGxDgAAAAAAADAIinUAAAAAAACAQVCsAwAAAAAAAAyCYh0AAAAAAABgEBTrAAAAAAAAAIOgWAcAAAAAAAAYBMU6AAAAAAAAwCAo1gEAAAAAAAAGQbEOAAAAAAAAMAiKdQAAAAAAAIBBUKwDAAAAAAAADKKRvQMAULWjCyPsHQIAoB4jzwAAABgPK+sAAAAAAAAAg6BYBwAAAAA2lJiYKD8/P7m6uio4OFh79+69Zv9NmzapW7d
ucnV1Va9evbRjxw6r/WazWXPmzFHbtm3VpEkThYaG6vDhw1Z9/Pz85ODgYLUtXLiwxucGAKh5FOsAAAAAwEY2btyo2NhYzZ07V9nZ2fL391dYWJhOnTpVaf/du3dr9OjRmjBhgr788ktFRkYqMjJS+/fvt/RZtGiR3njjDSUlJWnPnj1q1qyZwsLCdPHiRaux5s+fr7y8PMv25JNP2nSuAICaQbEOAAAAAGxk8eLFio6OVlRUlHr06KGkpCQ1bdpUycnJlfZfsmSJwsPDNW3aNHXv3l0LFixQnz59tGzZMklXVtUlJCRo1qxZevDBB9W7d2+tW7dOJ0+e1JYtW6zGat68uby9vS1bs2bNbD1dAEANoFgHAAAAADZw6dIlZWVlKTQ01NLm6Oio0NBQZWZmVnpMZmamVX9JCgsLs/Q/cuSITCaTVR93d3cFBwdXGHPhwoVq1aqV7rzzTr3yyiu6fPnyNeMtLi5WYWGh1QYAqH28DRYAAAAAbOD06dMqLS2Vl5eXVbuXl5cOHTpU6TEmk6nS/iaTybK/vK2qPpL01FNPqU+fPvLw8NDu3bsVFxenvLw8LV68uMp44+PjNW/evBufIADAJijWAQAAAEA9Exsba/m5d+/ecnZ21uOPP674+Hi5uLhUekxcXJzVcYWFhfL19bV5rAAAaxTrAAAAAMAGPD095eTkpPz8fKv2/Px8eXt7V3qMt7f3NfuX/5mfn6+2bdta9QkICKgyluDgYF2+fFlHjx5V165dK+3j4uJSZSEPAKrLb+Z2e4dQo44ujLD5d/DMOgAAAACwAWdnZwUGBio9Pd3SVlZWpvT0dIWEhFR6TEhIiFV/SUpLS7P079Spk7y9va36FBYWas+ePVWOKUk5OTlydHRUmzZtbmVKAIBawMo6AAAAALCR2NhYjR8/Xn379lVQUJASEhJUVFSkqKgoSdK4cePUrl07xcfHS5KmTJmigQMH6rXXXlNERIRSUlK0b98+rVixQpLk4OCgp59+Wi+88IK6dOmiTp06afbs2fLx8VFkZKSkKy+p2LNnj+699141b95cmZmZmjp1qsaOHauWLVva5TwAAG4cxToAAAAAsJGRI0fqp59+0pw5c2QymRQQEKDU1FTLCyKOHTsmR8dfbnjq16+fNmzYoFmzZum5555Tly5dtGXLFvXs2dPSZ/r06SoqKtKkSZN09uxZDRgwQKmpqXJ1dZV05XbWlJQUPf/88youLlanTp00depUq+fRAQCMi2IdAAAAANhQTEyMYmJiKt2XkZFRoW3EiBEaMWJEleM5ODho/vz5mj9/fqX7+/Tpo88//7xasQIA7I9n1gEAAAAAAAAGQbEOAAAAAAAAMAiKdQAAXMOZM2c0ZswYubm5qUWLFpowYYLOnz9/zWMGDRokBwcHq+1Pf/qTVZ9jx44pIiJCTZs2VZs2bTRt2jRdvnzZllMBAAAAUAfwzDoAAK5hzJgxysvLU1pamkpKShQVFaVJkyZpw4YN1zwuOjra6llCTZs2tfxcWlqqiIgIeXt7a/fu3crLy9O4cePUuHFjvfTSSzabCwAAAADjq/Mr61jxAACwlYMHDyo1NVWrVq1ScHCwBgwYoKVLlyolJUUnT5685rFNmzaVt7e3ZXNzc7Ps++ijj/TNN9/o7bffVkBAgO6//34tWLBAiYmJunTpkq2nBQAAAMDA6nyxbsyYMTpw4IDS0tK0bds27dq1S5MmTbrucdHR0crLy7NsixYtsuwrX/Fw6dIl7d69W2vXrtVbb72lOXPm2HIqAACDyczMVIsWLdS3b19LW2hoqBwdHbVnz55rHrt+/Xp5enqqZ8+eiouL04ULF6zG7dWrl7y8vCxtYWFhKiws1IEDByodr7i4WIWFhVYbAAAAgPqnTt8GW77i4YsvvrBcSC1dulRDhw7Vq6++Kh8fnyqPLV/xUJnyFQ87d+6Ul5eXAgICtGDBAs2YMUPPP/+8nJ2dbTIfAICxmEwmtWnTxqqtUaN
G8vDwkMlkqvK4P/zhD+rYsaN8fHz01VdfacaMGcrNzdXmzZst415dqJNk+VzVuPHx8Zo3b96tTAcAAABAHVCnV9ax4gEAUB0zZ86s8DiEX2+HDh2q9viTJk1SWFiYevXqpTFjxmjdunV6//339f3331d7zLi4OBUUFFi248ePV3ssAAAAAMZVp1fWseIBAFAdzzzzjB599NFr9rn99tvl7e2tU6dOWbVfvnxZZ86cqXJ1dmWCg4MlSd999506d+4sb29v7d2716pPfn6+JFU5rouLi1xcXG74OwEAAADUTYYs1s2cOVMvv/zyNfscPHiw2uNf/Uy7Xr16qW3btho8eLC+//57de7cuVpjxsXFKTY21vK5sLBQvr6+1Y4RAGA7rVu3VuvWra/bLyQkRGfPnlVWVpYCAwMlSR9//LHKysosBbgbkZOTI0lq27atZdwXX3xRp06dsvzSKS0tTW5uburRo8dNzgYAAABAfWLIYh0rHgAARtC9e3eFh4crOjpaSUlJKikpUUxMjEaNGmV5LuqPP/6owYMHa926dQoKCtL333+vDRs2aOjQoWrVqpW++uorTZ06Vb/97W/Vu3dvSdKQIUPUo0cP/fGPf9SiRYtkMpk0a9YsPfHEE+QSAAAAoIEzZLGOFQ8AAKNYv369YmJiNHjwYDk6Omr48OF64403LPtLSkqUm5trefaps7Ozdu7cqYSEBBUVFcnX11fDhw/XrFmzLMc4OTlp27Ztmjx5skJCQtSsWTONHz9e8+fPr/X5AQAAADAWQxbrbhQrHgAAtubh4aENGzZUud/Pz09ms9ny2dfXV59++ul1x+3YsaN27NhRIzECAAAAqD/q9NtgpSsrHrp166bBgwdr6NChGjBggFasWGHZX9WKhyFDhqhbt2565plnNHz4cH3wwQeWY8pXPDg5OSkkJERjx47VuHHjWPEAAAAAAAAAm6rTK+skVjwAAAAAAACg/qjzK+sAAAAAAACA+oJiHQAAAAAAAGAQFOsAAAAAAAAAg6BYBwAAAAAAABgExToAAAAAAADAICjWAQAAAAAAAAZBsQ4AAAAAAAAwCIp1AAAAAAAAgEFQrAMAAAAAAAAMgmIdAAAAAAAAYBAU6wAAAAAAAACDoFgHAAAAAAAAGATFOgAAAAAAAMAgKNYBAAAAAAAABkGxDgAAAAAAADAIinUAAAAAAACAQVCsAwAAAAAAAAyCYh0AAAAAAABgEBTrAAAAAAAAAIOgWAcAAAAAAAAYBMU6AAAAAAAAwCAo1gEAAAAAAAAGQbEOAAAAAAAAMAiKdQAAAAAAAIBBUKwDAAAAAAAADIJiHQAAAAAAAGAQjewdAABcy9GFEfYOAQBQj5FnAACA0bCyDgAAAAAAADAIinUAAAAAAACAQVCsAwAAAAAAAAyCYh0AAAAAAABgEBTrAAAAAAAAAIOgWAcAAAAAAAAYBMU6AAAAAAAAwCAo1gEAAAAAAAAGQbEOAIBrOHPmjMaMGSM3Nze1aNFCEyZM0Pnz56vsf/ToUTk4OFS6bdq0ydKvsv0pKSm1MSUAAAAABtbI3gEAAGBkY8aMUV5entLS0lRSUqKoqChNmjRJGzZsqLS/r6+v8vLyrNpWrFihV155Rffff79V+5o1axQeHm753KJFixqPHwAAAEDdUudX1rHiAQBgKwcPHlRqaqpWrVql4OBgDRgwQEuXLlVKSopOnjxZ6TFOTk7y9va22t5//3098sgjuu2226z6tmjRwqqfq6trbUwLAAAAgIHV+WLdmDFjdODAAaWlpWnbtm3atWuXJk2aVGX/8hUPV2/z5s3TbbfdVumKh6v7RUZG2ng2AAAjyczMVIsWLdS3b19LW2hoqBwdHbVnz54bGiMrK0s5OTmaMGFChX1PPPGEPD09FRQUpOTkZJnN5irHKS4uVmFhodUGAAAAoP6p08U6VjwAAGzJZDKpTZs2Vm2NGjWSh4eHTCbTDY2xevVqde/eXf369bNqnz9/vt5
55x2lpaVp+PDh+vOf/6ylS5dWOU58fLzc3d0tm6+v781PCABgF4mJifLz85Orq6uCg4O1d+/ea/bftGmTunXrJldXV/Xq1Us7duyw2m82mzVnzhy1bdtWTZo0UWhoqA4fPmzV52bvQAIAGEedLtax4gEAUB0zZ86s8pEI5duhQ4du+Xv++9//asOGDZXmmNmzZ6t///668847NWPGDE2fPl2vvPJKlWPFxcWpoKDAsh0/fvyW4wMA2N7GjRsVGxuruXPnKjs7W/7+/goLC9OpU6cq7b97926NHj1aEyZM0JdffqnIyEhFRkZq//79lj6LFi3SG2+8oaSkJO3Zs0fNmjVTWFiYLl68aOlzs3cgAQCMo04X61jxAACojmeeeUYHDx685nb77bfL29u7wsXU5cuXdebMGXl7e1/3e959911duHBB48aNu27f4OBgnThxQsXFxZXud3FxkZubm9UGADC+xYsXKzo6WlFRUerRo4eSkpLUtGlTJScnV9p/yZIlCg8P17Rp09S9e3ctWLBAffr00bJlyyRdWVWXkJCgWbNm6cEHH1Tv3r21bt06nTx5Ulu2bJFUvTuQAADGYchiHSseAAC21Lp1a3Xr1u2am7Ozs0JCQnT27FllZWVZjv34449VVlam4ODg637P6tWr9cADD6h169bX7ZuTk6OWLVvKxcXlluYGADCOS5cuKSsrS6GhoZY2R0dHhYaGKjMzs9JjMjMzrfpLUlhYmKX/kSNHZDKZrPq4u7srODjY0qcm7kACANhPI3sHUJlnnnlGjz766DX72GPFw4IFC1RcXFzphZSLi4tVe/kts9wOCwA1o/zf02s9kqCmde/eXeHh4YqOjlZSUpJKSkoUExOjUaNGycfHR5L0448/avDgwVq3bp2CgoIsx3733XfatWtXhecMSdIHH3yg/Px83X333XJ1dVVaWppeeuklPfvsszccG3kGAGqWLfLM6dOnVVpaKi8vL6t2Ly+vKhcfmEymSvuX3zlU/uf1+lTnDqTi4mKrFd4FBQWSqp9ryoovVOs4o6rOeeAccA6k+nUOGvr8pVv77+8bzTWGLNa1bt36hlYhXL3iITAwUJJxVjycO3dOkrgdFgBq2Llz5+Tu7l5r37d+/XrFxMRo8ODBcnR01PDhw/XGG29Y9peUlCg3N1cXLlj/R0hycrLat2+vIUOGVBizcePGSkxM1NSpU2U2m3XHHXdYbpO6UeQZALCN2s4zRhIfH6958+ZVaCfXXOGeYO8I7I9zwDlo6POXauYcXC/XGLJYd6OMvOLBx8dHx48fV/PmzeXg4HDrk7WRwsJC+fr66vjx4w3y+UcNff4S50DiHEh14xyYzWadO3fO8u97bfHw8NCGDRuq3O/n51fpb8ZeeuklvfTSS5UeEx4ervDw8FuKizxTdzT0c9DQ5y9xDqS6cQ5skWc8PT3l5OSk/Px8q/b8/Pwq7wTy9va+Zv/yP/Pz89W2bVurPgEBAZY+1bkDKS4uTrGxsZbPZWVlOnPmjFq1amXYXFMX/m7ZGueAc9DQ5y/VnXNwo7mmThfrJOOueHB0dFT79u2rP7Fa1tAfVt7Q5y9xDiTOgWT8c9BQVzpUhjxT9zT0c9DQ5y9xDiTjn4OazjPOzs4KDAxUenq6IiMjJV0pgKWnpysmJqbSY0JCQpSenq6nn37a0paWlqaQkBBJUqdOneTt7a309HRLca6wsFB79uzR5MmTLWNU5w6kXz/aR5JatGhRjZnXPqP/3aoNnAPOQUOfv1Q3zsGN5Jo6X6wz6ooHAAAAAIiNjdX48ePVt29fBQUFKSEhQUVFRYqKipIkjRs3Tu3atVN8fLwkacqUKRo4cKBee+01RUREKCUlRfv27dOKFSskSQ4ODnr66af1wgsvqEuXLurUqZNmz54tHx8fS0HwRu5AAgAYV50v1gEAAACAUY0cOVI//fST5syZI5PJpICAAKWmplpeEHHs2DE5Ojpa+vfr108bNmzQrFmz9Nx
zz6lLly7asmWLevbsaekzffp0FRUVadKkSTp79qwGDBig1NRUubq6Wvpc7w4kAIBxUaxr4FxcXDR37twbfnFGfdPQ5y9xDiTOgcQ5gO3wd4tz0NDnL3EOJM5BTExMlbe9ZmRkVGgbMWKERowYUeV4Dg4Omj9/vubPn19ln+vdgVRfNPS/WxLnQOIcNPT5S/XvHDiYa/Ld5AAAAAAAAACqzfH6XQAAAAAAAADUBop1AAAAAAAAgEFQrAMAAAAAAAAMgmIdAAAAAAAAYBAU6xqwxMRE+fn5ydXVVcHBwdq7d6+9Q6o1u3bt0rBhw+Tj4yMHBwdt2bLF3iHVuvj4eN11111q3ry52rRpo8jISOXm5to7rFq1fPly9e7dW25ubnJzc1NISIg+/PBDe4dlNwsXLpSDg4Oefvppe4eCeqIh5xmJXEOeIc/8GnkGttCQcw15hjxDnrFWn/IMxboGauPGjYqNjdXcuXOVnZ0tf39/hYWF6dSpU/YOrVYUFRXJ399fiYmJ9g7Fbj799FM98cQT+vzzz5WWlqaSkhINGTJERUVF9g6t1rRv314LFy5UVlaW9u3bp/vuu08PPvigDhw4YO/Qat0XX3yhN998U71797Z3KKgnGnqekcg15BnyzNXIM7CFhp5ryDPkGfLML+pdnjGjQQoKCjI/8cQTls+lpaVmHx8fc3x8vB2jsg9J5vfff9/eYdjdqVOnzJLMn376qb1DsauWLVuaV61aZe8watW5c+fMXbp0MaelpZkHDhxonjJlir1DQj1AnrFGriHPlCPPkGdQc8g1vyDPkGfKkWfqR55hZV0DdOnSJWVlZSk0NNTS5ujoqNDQUGVmZtoxMthTQUGBJMnDw8POkdhHaWmpUlJSVFRUpJCQEHuHU6ueeOIJRUREWP2bANwK8gwqQ54hz5BnUJPINfg18gx5pj7lmUb2DgC17/Tp0yotLZWXl5dVu5eXlw4dOmSnqGBPZWVlevrpp9W/f3/17NnT3uHUqq+//lohISG6ePGibrvtNr3//vvq0aOHvcOqNSkpKcrOztYXX3xh71BQj5Bn8GvkGfIMeQY1jVyDq5FnyDP1Lc9QrAOgJ554Qvv379dnn31m71BqXdeuXZWTk6OCggK9++67Gj9+vD799NMGkeCOHz+uKVOmKC0tTa6urvYOB0A9Rp4hz5BnANgSeYY8U9/yDMW6BsjT01NOTk7Kz8+3as/Pz5e3t7edooK9xMTEaNu2bdq1a5fat29v73BqnbOzs+644w5JUmBgoL744gstWbJEb775pp0js72srCydOnVKffr0sbSVlpZq165dWrZsmYqLi+Xk5GTHCFFXkWdwNfIMeYY8A1sg16AceYY8Ux/zDM+sa4CcnZ0VGBio9PR0S1tZWZnS09Mb3L3tDZnZbFZMTIzef/99ffzxx+rUqZO9QzKEsrIyFRcX2zuMWjF48GB9/fXXysnJsWx9+/bVmDFjlJOTU2cTG+yPPAOJPFMV8gx5BjWDXAPyTOXIM/Ujz7CyroGKjY3V+PHj1bdvXwUFBSkhIUFFRUWKioqyd2i14vz58/ruu+8sn48cOaKcnBx5eHioQ4cOdoys9jzxxBPasGGD/va3v6l58+YymUySJHd3dzVp0sTO0dWOuLg43X///erQoYPOnTunDRs2KCMjQ3//+9/tHVqtaN68eYVnejRr1kytWrVqcM/6QM1r6HlGIteQZ8gz5BnYWkPPNeQZ8gx5ph7nGTu/jRZ2tHTpUnOHDh3Mzs7O5qCgIPPnn39u75BqzSeffGKWVGEbP368vUOrNZXNX5J5zZo19g6t1jz22GPmjh07mp2dnc2tW7c2Dx482PzRRx/ZOyy7qi+vOocxNOQ8YzaTa8gz5JnKkGdQ0xpyriHPkGfIMxXVlzzjYDabzTavCAIAAAAAAAC4Lp5ZBwAAAAAAABgExToAAAAAAADAICjWAQAAAAAAAAZBsQ4AAAAAAAAwCIp1AAAAAAAAgEFQrAM
AAAAAAAAMgmIdAAAAAAAAYBAU64AG6q233lKLFi3sHQYAoJ4izwAAbIk8g/qMYh1Qh2RmZsrJyUkRERE3dZyfn58SEhKs2kaOHKlvv/22BqMDANR15BkAgC2RZ4AbQ7EOqENWr16tJ598Urt27dLJkydvaawmTZqoTZs2NRQZAKA+IM8AAGyJPAPcGIp1QB1x/vx5bdy4UZMnT1ZERITeeustq/0ffPCB7rrrLrm6usrT01MPPfSQJGnQoEH64YcfNHXqVDk4OMjBwUFS5cvGly9frs6dO8vZ2Vldu3bVX/7yF6v9Dg4OWrVqlR566CE1bdpUXbp00datWy37//Of/2jMmDFq3bq1mjRpoi5dumjNmjU1fzIAADWOPAMAsCXyDHDjKNYBdcQ777yjbt26qWvXrho7dqySk5NlNpslSdu3b9dDDz2koUOH6ssvv1R6erqCgoIkSZs3b1b79u01f/585eXlKS8vr9Lx33//fU2ZMkXPPPOM9u/fr8cff1xRUVH65JNPrPrNmzdPjzzyiL766isNHTpUY8aM0ZkzZyRJs2fP1jfffKMPP/xQBw8e1PLly+Xp6WnDswIAqCnkGQCALZFngJtgBlAn9OvXz5yQkGA2m83mkpISs6enp/mTTz4xm81mc0hIiHnMmDFVHtuxY0fz66+/btW2Zs0as7u7u9X40dHRVn1GjBhhHjp0qOWzJPOsWbMsn8+fP2+WZP7www/NZrPZPGzYMHNUVFR1pgcAsDPyDADAlsgzwI1jZR1QB+Tm5mrv3r0aPXq0JKlRo0YaOXKkVq9eLUnKycnR4MGDb+k7Dh48qP79+1u19e/fXwcPHrRq6927t+XnZs2ayc3NTadOnZIkTZ48WSkpKQoICND06dO1e/fuW4oJAFA7yDMAAFsizwA3h2IdUAesXr1aly9flo+Pjxo1aqRGjRpp+fLleu+991RQUKAmTZrUWiyNGze2+uzg4KCysjJJ0v333295nsTJkyc1ePBgPfvss7UWGwCgesgzAABbIs8AN4diHWBwly9f1rp16/Taa68pJyfHsv3rX/+Sj4+P/vrXv6p3795KT0+vcgxnZ2eVlpZe83u6d++uf/7zn1Zt//znP9WjR4+bird169YaP3683n77bSUkJGjFihU3dTwAoHaRZwAAtkSeAW5eI3sHAODatm3bpv/85z+aMGGC3N3drfYNHz5cq1ev1iuvvKLBgwerc+fOGjVqlC5fvqwdO3ZoxowZkiQ/Pz/t2rVLo0aNkouLS6UPSZ02bZoeeeQR3XnnnQoNDdUHH3ygzZs3a+fOnTcc65w5cxQYGKjf/OY3Ki4u1rZt29S9e/dbOwEAAJsizwAAbIk8A9w8VtYBBrd69WqFhoZWSGzSleS2b98+eXh4aNOmTdq6dasCAgJ03333ae/evZZ+8+fP19GjR9W5c2e1bt260u+JjIzUkiVL9Oqrr+o3v/mN3nzzTa1Zs0aDBg264VidnZ0VFxen3r1767e//a2cnJyUkpJy03MGANQe8gwAwJbIM8DNczCb/+9dyQAAAAAAAADsipV1AAAAAAAAgEFQrAMAAAAAAAAMgmIdAAAAAAAAYBAU6wAAAAAAAACDoFgHAAAAAAAAGATFOgAAAAAAAMAgKNYBAAAAAAAABkGxDgAAAAAAADAIinUAAAAAAACAQVCsAwAAAAAAAAyCYh0AAAAAAABgEBTrAAAAAAAAAIP4/wEpL2SOhaqNGgAAAABJRU5ErkJggg==", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "nb_actions = 5\n", "bandit = Bandit(nb_actions)\n", "\n", "all_rewards = []\n", "for t in range(1000):\n", "    rewards = []\n", "    for a in range(nb_actions):\n", "        rewards.append(bandit.step(a))\n", "    all_rewards.append(rewards)\n", "\n", "mean_reward = np.mean(all_rewards, axis=0)\n", "\n", "plt.figure(figsize=(15, 5))\n", "plt.subplot(131)\n", "plt.bar(range(nb_actions), bandit.Q_star)\n", "plt.xlabel(\"Actions\")\n", "plt.ylabel(\"$Q^*(a)$\")\n", "plt.subplot(132)\n", "plt.bar(range(nb_actions), mean_reward)\n", "plt.xlabel(\"Actions\")\n", "plt.ylabel(\"$Q_t(a)$\")\n", "plt.subplot(133)\n", "plt.bar(range(nb_actions), np.abs(bandit.Q_star - mean_reward))\n", "plt.xlabel(\"Actions\")\n", "plt.ylabel(\"Absolute error\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Greedy action selection\n", "\n", "In **greedy action selection**, we systematically choose, at each play, the action with the highest estimated Q-value (picking one of them randomly when there are ties):\n", "\n", "$$a_t = \\text{argmax}_a Q_t(a)$$\n", "\n", "We maintain estimates $Q_t$ of the action values (initialized to 0) using the online formula:\n", "\n", "$$Q_{t+1}(a_t) = Q_t(a_t) + \\alpha \\, (r_{t} - Q_t(a_t))$$\n", "\n", "when receiving the sampled reward $r_t$ after taking the action $a_t$. The learning rate $\\alpha$ can be set to 0.1 at first.\n", "\n", "The algorithm simply alternates between these two steps for 1000 plays (or steps): take an action, update its Q-value. 
\n", "\n", "**Q:** Implement the greedy algorithm on the 5-armed bandit.\n", "\n", "Your algorithm will look like this:\n", "\n", "* Create a 5-armed bandit (mean of zero, variance of 1).\n", "* Initialize the estimated Q-values to 0 with an array of the same size as the bandit.\n", "* **for** 1000 plays:\n", "    * Select the greedy action $a_t^*$ using the current estimates.\n", "    * Sample a reward from $\\mathcal{N}(Q^*(a_t^*), 1)$.\n", "    * Update the estimated Q-value of the action taken.\n", "\n", "Additionally, you will store the reward received at each step in an initially empty list or a numpy array of the correct size, and plot these rewards at the end. You will also plot the true Q-values and the estimated Q-values at the end of the 1000 plays. \n", "\n", "*Tip:* to implement the argmax, do not rely on `np.argmax()`. If there are ties in the array, for example at the beginning:\n", "\n", "```python\n", "x = np.array([0, 0, 0, 0, 0])\n", "```\n", "\n", "`x.argmax()` returns the **first occurrence** of the maximum of the array (here 0). In this case it will be the index 0, so you will always select the action 0 first. \n", "\n", "It is preferable to retrieve the indices of **all** maxima and randomly select one of them:\n", "\n", "```python\n", "a = rng.choice(np.where(x == x.max())[0])\n", "```\n", "\n", "`np.where(x == x.max())[0]` is the array of indices where `x` is maximal. `rng.choice()` randomly selects one of them." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q:** Re-run your algorithm multiple times with different values of $Q^*$ (simply recreate the `Bandit`) and observe:\n", "\n", "1. How much reward you get.\n", "2. How your estimated Q-values at the end differ from the true Q-values.\n", "3. Whether greedy action selection finds the optimal action or not."
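, "\n",
"\n",
"A minimal sketch of one way to make these observations over a few runs. It is not the reference solution: the helper `run_greedy` is hypothetical, and it draws the true Q-values directly with NumPy (mean 0, standard deviation 1) instead of using the `Bandit` class, so that the block is self-contained:\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"rng = np.random.default_rng()\n",
"\n",
"def run_greedy(nb_actions=5, nb_steps=1000, alpha=0.1):\n",
"    # Hypothetical helper: one greedy run on a freshly drawn bandit.\n",
"    q_star = rng.normal(0.0, 1.0, nb_actions)  # true Q-values\n",
"    q_t = np.zeros(nb_actions)                 # estimates, initialized to 0\n",
"    rewards = np.zeros(nb_steps)\n",
"    for t in range(nb_steps):\n",
"        # Greedy action, ties broken uniformly at random\n",
"        a = rng.choice(np.flatnonzero(q_t == q_t.max()))\n",
"        r = rng.normal(q_star[a], 1.0)\n",
"        q_t[a] += alpha * (r - q_t[a])\n",
"        rewards[t] = r\n",
"    return q_star, q_t, rewards\n",
"\n",
"for run in range(5):\n",
"    q_star, q_t, rewards = run_greedy()\n",
"    found = q_t.argmax() == q_star.argmax()\n",
"    print(f'run {run}: mean reward {rewards.mean():.2f}, best Q* {q_star.max():.2f}, optimal found: {found}')\n",
"```\n",
"\n",
"Printing the mean reward next to `q_star.max()` gives a quick sense of how far a run ends up from the best achievable average reward."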
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before going further, let's turn the agent into a class for better reusability. \n", "\n", "**Q:** Create a `GreedyAgent` class taking the bandit as an argument as well as the learning rate `alpha=0.1`:\n", "\n", "```python\n", "bandit = Bandit(nb_actions)\n", "\n", "agent = GreedyAgent(bandit, alpha=0.1)\n", "```\n", "\n", "The constructor should initialize the array of estimated Q-values `Q_t` and store it as an attribute.\n", "\n", "Define a method `act(self)` that returns the index of the greedy action based on the current estimates, as well as a method `update(self, action, reward)` that updates the estimated Q-value of the action given the obtained reward. Also define a `train(self, nb_steps)` method that implements the complete training process for `nb_steps=1000` plays and returns the list of obtained rewards.\n", "\n", "```python\n", "class GreedyAgent:\n", "    def __init__(self, bandit, alpha):\n", "        # TODO\n", "\n", "    def act(self):\n", "        action = # TODO\n", "        return action\n", "\n", "    def update(self, action, reward):\n", "        # TODO\n", "\n", "    def train(self, nb_steps):\n", "        # TODO\n", "```\n", "\n", "Re-run the experiment using this greedy agent." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q:** Modify the `train()` method so that it also returns a list of binary values (0 and 1) indicating for each play whether the agent chose the optimal action. Plot this list and observe the lack of exploration.\n", "\n", "*Hint:* the index of the optimal action is already stored in the bandit: `bandit.a_star`."
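, "\n",
"\n",
"As a small illustration of the expected output format (a sketch on a made-up action history, with the hypothetical names `actions` and `a_star` standing in for the actions recorded during training and for `bandit.a_star`), the binary list can be obtained by comparing each selected action to the optimal one:\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"actions = np.array([2, 0, 3, 3, 1, 3])  # made-up action history, one entry per play\n",
"a_star = 3                              # stands in for bandit.a_star\n",
"\n",
"optimal = (actions == a_star).astype(int)\n",
"print(optimal)         # [0 0 1 1 0 1]\n",
"print(optimal.mean())  # 0.5\n",
"```\n",
"\n",
"Averaging such a list over plays (or, later, over runs) gives the fraction of optimal choices."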
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The evolution of the received rewards and optimal actions in a single run does not clearly indicate whether learning was successful, as it depends strongly on the true Q-values. To truly estimate the performance of the algorithm, we have to average these results over many runs, e.g. 200.\n", "\n", "**Q:** Run the learning procedure 200 times (new bandit and agent every time) and average the results. Give a unique name to these arrays (e.g. `rewards_greedy` and `optimal_greedy`) as we will do comparisons later. Compare the results with the lecture, where a 10-armed bandit was used." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## $\\epsilon$-greedy action selection\n", "\n", "The main drawback of greedy action selection is that it does not explore: as soon as it finds an action better than the others (with a sufficiently positive true Q-value, i.e. where the sampled rewards are mostly positive), it will keep selecting that action and avoid exploring the other options. \n", "\n", "The estimated Q-value of the selected action will end up being fairly accurate, but those of the other actions will stay at 0.\n", "\n", "In $\\epsilon$-greedy action selection, the greedy action $a_t^*$ (with the highest estimated Q-value) is selected with probability $1-\\epsilon$, and the other actions share the remaining probability $\\epsilon$. 
\n", "\n", "$$\n", " \\pi(a) = \\begin{cases} 1 - \\epsilon \\; \\text{if} \\; a = a_t^* \\\\ \\frac{\\epsilon}{|\\mathcal{A}| - 1} \\; \\text{otherwise.} \\end{cases}\n", "$$\n", "\n", "If you have $|\\mathcal{A}| = 5$ actions, each of the four non-greedy actions will be selected with a probability of $\\frac{\\epsilon}{4}$.\n", "\n", "**Q:** Create an `EpsilonGreedyAgent` (possibly inheriting from `GreedyAgent` to reuse code) to implement $\\epsilon$-greedy action selection (with $\\epsilon=0.1$ at first). Do not overwrite the arrays previously calculated (mean reward and optimal actions), as you will want to compare the two methods in a single plot.\n", "\n", "To implement $\\epsilon$-greedy, you need to:\n", "\n", "1. Select the greedy action $a = a^*_t$.\n", "2. Draw a random number between 0 and 1 (`rng.random()`).\n", "3. If this number is smaller than $\\epsilon$, select another action at random among the remaining ones (`rng.choice()`).\n", "4. Otherwise, keep the greedy action." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q:** Compare the properties of greedy and $\\epsilon$-greedy (speed, optimality, etc.). Vary the value of the parameter $\\epsilon$ (from 0.0001 to 0.5) and conclude." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Softmax action selection\n", "\n", "To avoid exploring actions which are clearly not optimal, another useful algorithm is **softmax action selection**. In this scheme, the estimated Q-values are transformed into a probability distribution using the softmax operation:\n", "\n", "$$\n", " \\pi(a) = \\frac{\\exp \\frac{Q_t(a)}{\\tau}}{ \\sum_b \\exp \\frac{Q_t(b)}{\\tau}}\n", "$$ \n", "\n", "For each action, the term $\\exp \\frac{Q_t(a)}{\\tau}$ increases with $Q_t(a)$ and is always positive. 
These terms are then normalized by the denominator in order to obtain a sum of 1, i.e. they are the parameters of a discrete probability distribution. The temperature $\\tau$ controls the level of exploration, just as $\\epsilon$ does for $\\epsilon$-greedy.\n", "\n", "In practice, $\\exp \\frac{Q_t(a)}{\\tau}$ can become very large if the Q-values are high or the temperature is small, creating numerical instability (NaN). It is much more stable to subtract the maximal Q-value from all Q-values before applying the softmax:\n", "\n", "$$\n", " \\pi(a) = \\frac{\\exp \\displaystyle\\frac{Q_t(a) - \\max_{a'} Q_t(a')}{\\tau}}{ \\sum_b \\exp \\displaystyle\\frac{Q_t(b) - \\max_{a'} Q_t(a')}{\\tau}}\n", "$$ \n", "\n", "This way, $Q_t(a) - \\max_{a'} Q_t(a')$ is never positive, so its exponential is between 0 and 1. The resulting probabilities are unchanged, because the same factor cancels out between the numerator and the denominator.\n", "\n", "**Q:** Implement the softmax action selection (with $\\tau=0.5$ at first) and compare its performance to greedy and $\\epsilon$-greedy. Vary the temperature $\\tau$ and find the best possible value. Conclude.\n", "\n", "*Hint:* To select actions with different probabilities, check the documentation of `rng.choice()`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exploration scheduling\n", "\n", "The problem with this version of softmax (with a constant temperature) is that even after it has found the optimal action, it will still explore the other ones (although more rarely than at the beginning). 
The solution is to **schedule** the exploration parameter so that the agent explores a lot at the beginning (high temperature) and gradually switches to more exploitation (low temperature).\n", "\n", "Many schemes are possible for that, the simplest one (**exponential decay**) being to multiply the value of $\\tau$ by a number slightly smaller than 1 after **each** play:\n", "\n", "$$\\tau \\leftarrow \\tau \\times (1 - \\tau_\\text{decay})$$\n", "\n", "**Q:** Implement in a class `SoftmaxScheduledAgent` temperature scheduling for the softmax algorithm ($\\epsilon$-greedy would be similar) with $\\tau=1$ initially and $\\tau_\\text{decay} = 0.01$ (feel free to change these values). Plot the evolution of `tau` and of the standard deviation of the choices of the optimal action. Conclude." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q:** Experiment with different schedules (initial values, decay rate) and try to find the best setting." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3.9.12 ('base')", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.6" }, "vscode": { "interpreter": { "hash": "3d24234067c217f49dc985cbc60012ce72928059d528f330ba9cb23ce737906d" } } }, "nbformat": 4, "nbformat_minor": 4 }