{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Controlling a Cart in a 1D Space" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### The Challenge\n", "\n", "In this demo, we consider a cart that can move in a 1D space. At each time step the cart can be steered a bit to the left or right by a controller (the \"agent\"). The agent's knowledge about the cart's process dynamics (equations of motion) are known up to some additive Gaussian process noise. The agent also makes noisy observations of the position and velocity of the cart. Your challenge is to design an agent that steers the car to the zero position. (The agent should be specified as a probabilistic model and the control signal should be formulated as a Bayesian inference task). " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Set up environment\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "using Pkg;Pkg.activate(\"../probprog/workspace/\");Pkg.instantiate()\n", "using Random\n", "Random.seed!(87613) # Set random seed\n", "\n", "using LinearAlgebra\n", "using PyPlot\n", "using ForneyLab\n", "\n", "include(\"environment_1d.jl\") # Include environmental dynamics\n", "include(\"helpers_1d.jl\") # Include helper functions for plotting\n", "include(\"agent_1d.jl\") # Load agent's internal beliefs over external dynamics\n", "IJulia.clear_output()\n", ";" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Build model\n", "\n", "Here we specify the generative model for the agent. We define a state-space model that includes prior beliefs about desired future outcomes. 
The target future outcome is position $0$ (zero).\n", "\\begin{align*}\n", " p_t(o, s, u) &\\propto p(s_{t-1}) \\prod_{k=t}^{t+T} p(o_k | s_k)\\, p(s_k | s_{k-1}, u_k)\\, p(u_k)\\, \\tilde{p}(o_k)\\,.\n", "\\end{align*}\n", "\n", "We further detail the model by making the following assumptions:\n", "\\begin{align*}\n", " p(s_{t-1}) &= \\mathcal{N}(s_{t-1} | m_{s, t-1}, v_{s, t-1})\\\\\n", " p(s_k | s_{k-1}, u_k) &= \\mathcal{N}(s_k | s_{k-1} + u_k, \\gamma^{-1})\\\\\n", " p(o_k | s_k) &= \\mathcal{N}(o_k | s_k, \\phi^{-1})\\\\\n", " p(u_k) &= \\mathcal{N}(u_k | 0, \\upsilon) \\text{, for } k>t\\\\\n", " \\tilde{p}(o_k) &= \\mathcal{N}(o_k | 0, \\sigma) \\text{, for } k>t\\\\\n", " p(u_t) &= \\delta(u_t - \\hat{u}_t)\\\\\n", " \\tilde{p}(o_t) &= \\delta(o_t - \\hat{o}_t)\\,.\n", "\\end{align*}" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# Internal model parameters\n", "gamma = 100.0 # Transition precision\n", "phi = 10.0 # Observation precision\n", "upsilon = 1.0 # Control prior variance\n", "sigma = 1.0 # Goal prior variance\n", ";" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "T = 10 # Lookahead\n", "\n", "# Build internal model\n", "fg = FactorGraph()\n", "\n", "o = Vector{Variable}(undef, T) # Observed states\n", "s = Vector{Variable}(undef, T) # Noisy brain states\n", "u = Vector{Variable}(undef, T) # Control states\n", "\n", "@RV s_t_min ~ GaussianMeanVariance(placeholder(:m_s_t_min),\n", " placeholder(:v_s_t_min)) # Prior brain state\n", "u_t = placeholder(:u_t)\n", "@RV u[1] ~ GaussianMeanVariance(u_t, tiny)\n", "@RV s[1] ~ GaussianMeanPrecision(s_t_min + u[1], gamma)\n", "@RV o[1] ~ GaussianMeanPrecision(s[1], phi)\n", "placeholder(o[1], :o_t)\n", "\n", "s_k_min = s[1]\n", "for k=2:T\n", " @RV u[k] ~ GaussianMeanVariance(0.0, upsilon) # Control prior\n", " @RV s[k] ~ GaussianMeanPrecision(s_k_min + u[k], gamma) # State transition model\n", " @RV o[k] ~ 
GaussianMeanPrecision(s[k], phi) # Observation model\n", " GaussianMeanVariance(o[k], \n", " placeholder(:m_o, var_id=:m_o_*k, index=k-1),\n", " placeholder(:v_o, var_id=:v_o_*k, index=k-1)) # Goal prior\n", " s_k_min = s[k]\n", "end\n", ";" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Infer algorithm\n", "\n", "Next, we call upon the [ForneyLab](http://forneylab.org) package to generate a message passing algorithm to *infer* the next action. " ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "function step!(data::Dict, marginals::Dict=Dict(), messages::Vector{Message}=Array{Message}(undef, 59))\n", "\n", "messages[1] = ruleSPGaussianMeanVarianceOutNPP(nothing, Message(Univariate, PointMass, m=0.0), Message(Univariate, PointMass, m=1.0))\n", "messages[2] = ruleSPGaussianMeanVarianceOutNPP(nothing, Message(Univariate, PointMass, m=data[:m_s_t_min]), Message(Univariate, PointMass, m=data[:v_s_t_min]))\n", "messages[3] = ruleSPGaussianMeanVarianceOutNPP(nothing, Message(Univariate, PointMass, m=data[:u_t]), Message(Univariate, PointMass, m=1.0e-12))\n", "messages[4] = ruleSPAdditionOutNGG(nothing, messages[2], messages[3])\n", "\n", "...\n", "\n", "marginals[:u_2] = messages[1].dist * messages[59].dist\n", "\n", "return marginals\n", "\n", "end\n" ] } ], "source": [ "# Schedule message passing algorithm\n", "algo = messagePassingAlgorithm(u[2]) # Infer the marginal over the next control u[2]\n", "code = algorithmSourceCode(algo) # Generate algorithm source code\n", "\n", "eval(Meta.parse(code)) # Loads the step!() function for inference\n", "inspectSnippet(code) # Inspect a snippet of the algorithm code\n", ";" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that the inference algorithm consists entirely of a sequence of message updates. 
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Execute algorithm\n", "\n", "Now we run this message passing algorithm for each time step so as to infer the next action. And when we're done, we plot the position of the cart as a function of time. Note that the cart get steered to the target position." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "# Initial state\n", "s_0 = 2.0\n", ";" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "N = 20 # Total simulation time\n", "\n", "(execute, observe) = initializeWorld() # Let there be a world\n", "(infer, act, slide) = initializeAgent() # Let there be an agent\n", "\n", "# Step through action-perception loop\n", "u_hat = Vector{Float64}(undef, N) # Actions\n", "o_hat = Vector{Float64}(undef, N) # Observations\n", "for t=1:N\n", " u_hat[t] = act() # Evoke an action from the agent\n", " execute(u_hat[t]) # The action influences hidden external states\n", " o_hat[t] = observe() # Observe the current environmental outcome (update p)\n", " infer(u_hat[t], o_hat[t]) # Infer beliefs from current model state (update q)\n", " slide() # Prepare for next iteration\n", "end\n", ";" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "image/png": "", "text/plain": [ "Figure(PyObject