{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# SYDE 556/750: Simulating Neurobiological Systems\n", "\n", "\n", "Readings: [Stewart et al.](http://compneuro.uwaterloo.ca/publications/stewart2010.html)\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Biological Cognition -- Control\n", "\n", "- Lots of contemporary neural models are quite simple\n", " - Working memory, vision, audition, perceptual decision making, oscillations, etc.\n", "\n", "- What happens when our models get more complex?\n", " - I.e., what happens when the models:\n", " - Switch modalities?\n", " - Have a complex environment?\n", " - Have limited resources?\n", " - Can't do everything at once?\n", " \n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "- The brain needs a way to determine how to best use the finite resources it has.\n", "- Think about what happens when:\n", " - You have two targets to reach to at once (or 3 or 4)\n", " - You want to get to a goal that requires a series of actions\n", " - You don't know what the target is, but you know what modality it will be in\n", " - You don't know what the target will be, but you know where it will be\n", " \n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "- In all these cases, your brain needs to control the flow of information through it to solve the task.\n", "- Chapter 5 of [How to build a brain](http://compneuro.uwaterloo.ca/publications/eliasmith2013.html) is focussed on relevant neural models.\n", "- That chapter distinguishes two aspects of control:\n", " 1. determining what an appropriate control signal is\n", " 2. 
applying that signal to change the system\n", "- The first is a kind of decision making called 'action selection'\n", "- The second is more of an implementational question about how to gate information effectively (we've seen several possibilities for this already; e.g. inhibition, multiplication)\n", "- This lecture focusses on the first aspect of control" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Action Selection and the Basal Ganglia\n", "\n", "- Actions can be many different things\n", " - physical movements\n", " - moving attention\n", " - changing contents of working memory \n", " - recalling items from long-term memory" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Action Selection\n", "\n", "- How can we do this?\n", "- Suppose we're a critter that's trying to survive in a harsh environment\n", "- We have a bunch of different possible actions\n", " - go home\n", " - move randomly\n", " - go towards food\n", " - go away from predators\n", "- Which one do we pick?\n", " - Ideas?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Reinforcement Learning\n", "- [Reinforcement learning](http://en.wikipedia.org/wiki/Reinforcement_learning) is a biologically inspired computational approach to machine learning. 
It is based on the idea that creatures maximize reward, which seems to be the case (see, e.g., the Rescorla-Wagner model of Pavlov's experiments).\n", "- There have been a lot of [interesting connections](http://www.ncbi.nlm.nih.gov/pubmed/19897789) found between signals in these models and signals in the brain.\n", "- So, let's steal a simple idea from reinforcement learning:\n", "- Each action has a utility $Q$ that depends on the current state $s$\n", "  - $Q(s, a)$ (the action value)\n", "- The best action will then be the action that has the largest $Q$\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "- Note\n", "  - Lots of different variations on this\n", "  - $V(s)$ (the state value - expected reward given a state & policy)\n", "  - Softmax: $p(a_i) = e^{Q(s, a_i)/T} / \\sum_j e^{Q(s, a_j)/T}$ (instead of max)\n", "- In RL research, people come up with learning algorithms for adjusting $Q$ based on rewards\n", "- We won't worry about that for now (see the lecture on learning) and just use the basic idea\n", "  - There's some sort of state $s$\n", "  - For each action $a_i$, compute $Q(s, a_i)$, which is a function that we can define\n", "  - Take the biggest $Q$ and perform that action" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Implementation\n", "\n", "- One group of neurons to represent the state $s$\n", "- One group of neurons for each action's utility $Q(s, a_i)$\n", "  - Or one large group of neurons for all the $Q$ values\n", "\n", "- What should the output be?\n", "  - We could have $index$, which is the index $i$ of the action with the largest $Q$ value\n", "  - Or we could have something like $[0,0,1,0]$, indicating which action is selected\n", "  - Advantages and disadvantages?\n", "- The second option seems easier if we consider that we have to do action execution next..."
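, "\n", "\n", "The selection rule above can be sketched directly (a minimal NumPy sketch, not part of the original lecture; the four action vectors are assumed for illustration):\n", "\n", "```python\n", "import numpy as np\n", "\n", "# Assumed preferred-direction vector for each action (illustrative only)\n", "actions = np.array([[1, 0], [-1, 0], [0, 1], [0, -1]], dtype=float)\n", "\n", "def select_action(s, T=None):\n", "    # Q(s, a_i) = s . d_i, one utility value per action\n", "    Q = actions @ np.asarray(s, dtype=float)\n", "    if T is not None:  # softmax variant with temperature T\n", "        e = np.exp(Q / T)\n", "        return e / e.sum()\n", "    out = np.zeros(len(Q))  # one-hot output, e.g. [0,0,1,0]\n", "    out[np.argmax(Q)] = 1.0\n", "    return out\n", "\n", "print(select_action([0.8, 0.3]))  # -> [1. 0. 0. 0.]\n", "```\n", "\n", "Returning the one-hot vector (rather than the winning index) makes the downstream wiring for action execution straightforward, which is the point made above."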
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### A Simple Example\n", "\n", "- State $s$ is 2-dimensional (x,y plane)\n", "- Four actions (A, B, C, D)\n", "- Do action A if $s$ is near [1,0], B if near [-1,0], C if near [0,1], D if near [0,-1]\n", "  - $Q(s, a_A)=s \\cdot [1,0]$\n", "  - $Q(s, a_B)=s \\cdot [-1,0]$\n", "  - $Q(s, a_C)=s \\cdot [0,1]$\n", "  - $Q(s, a_D)=s \\cdot [0,-1]$\n" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Populating the interactive namespace from numpy and matplotlib\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "//anaconda/envs/python3/lib/python3.6/site-packages/IPython/core/magics/pylab.py:160: UserWarning: pylab import has clobbered these variables: ['e', 'positive']\n", "`%matplotlib` prevents importing * from pylab and numpy\n", " \"\\n`%matplotlib` prevents importing * from pylab and numpy\"\n" ] }, { "data": { "text/html": [ "\n", "
" ], "text/plain": [ "