{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "colab": { "name": "Underactuated Robotics - The Simple Pendulum.ipynb", "provenance": [], "collapsed_sections": [], "toc_visible": true }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.9-final" } }, "cells": [ { "cell_type": "markdown", "metadata": { "id": "TKvYiJgnYExi" }, "source": [ "Welcome! If you are new to Google Colab/Jupyter notebooks, you might take a look at [this notebook](https://colab.research.google.com/notebooks/basic_features_overview.ipynb) first.\n", "\n", "**I recommend you run the first code cell of this notebook immediately, to start provisioning drake on the cloud machine, then you can leave this window open as you [read the textbook](http://underactuated.csail.mit.edu/pend.html).**\n", "\n", "# Notebook Setup\n", "\n", "The following cell will:\n", "- on Colab (only), install Drake to `/opt/drake`, install Drake's prerequisites via `apt`, and add pydrake to `sys.path`. This will take approximately two minutes on the first time it runs (to provision the machine), but should only need to reinstall once every 12 hours. If you navigate between notebooks using Colab's \"File->Open\" menu, then you can avoid provisioning a separate machine for each notebook.\n", "- import packages used throughout the notebook.\n", "\n", "You will need to rerun this cell if you restart the kernel, but it should be fast (even on Colab) because the machine will already have drake installed." ] }, { "cell_type": "code", "metadata": { "id": "A4QOaw_zYLfI" }, "source": [ "import importlib\n", "import sys\n", "from urllib.request import urlretrieve\n", "\n", "# Install drake (and underactuated).\n", "if 'google.colab' in sys.modules and importlib.util.find_spec('underactuated') is None:\n", " urlretrieve(f\"http://underactuated.csail.mit.edu/scripts/setup/setup_underactuated_colab.py\",\n", " \"setup_underactuated_colab.py\")\n", " from setup_underactuated_colab import setup_underactuated\n", " setup_underactuated(underactuated_sha='ffe2b28ed89637889c04405e5d7d2d98be3df5b6', drake_version='0.27.0', drake_build='release')\n", "\n", "server_args = []\n", "if 'google.colab' in sys.modules:\n", " server_args = ['--ngrok_http_tunnel']\n", "# Start a single meshcat server instance to use for the remainder of this notebook.\n", "from meshcat.servers.zmqserver import start_zmq_server_as_subprocess\n", "proc, zmq_url, web_url = start_zmq_server_as_subprocess(server_args=server_args)\n", " \n", "# Imports.\n", "import numpy as np\n", "import meshcat\n", "from ipywidgets import interact, FloatSlider, ToggleButton\n", "from IPython.display import display\n", "\n", "import pydrake.all\n", "from pydrake.all import (AddMultibodyPlantSceneGraph, DiagramBuilder, PlanarSceneGraphVisualizer, SceneGraph, Simulator)\n", "from pydrake.systems.jupyter_widgets import WidgetSystem\n", "from pydrake.examples.pendulum import PendulumGeometry, PendulumPlant\n", "\n", "import underactuated\n", "from underactuated.jupyter import AdvanceToAndVisualize, SetupMatplotlibBackend, running_as_notebook\n", "import underactuated.meshcat_utils as mutil\n" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "C1sdq2R88C16" }, "source": [ "# Dynamics of the Simple Pendulum\n", "\n", "I find it extremely useful to use simulation to get physical intuition about these systems. Let's make sure we understand how the simple pendulum moves when it is exposed to a torque.\n", "\n", "For the duration of this notebook, we'll use the equations of motion $$ml^2 \\ddot\\theta + b\\dot\\theta + mgl \\sin\\theta = u,$$ where $u$ is our torque input. \n", "\n", "## Basic simulation\n", "\n", "The pendulum is a core example in Drake. We could certainly load it from a .urdf file, but Drake offers a Pendulum implementation that makes it convenient to manipulate the parameters (and visualize the system with different parameters)." ] }, { "cell_type": "code", "metadata": { "id": "8wj7ZlyEw_AB" }, "source": [ "\n", "builder = DiagramBuilder()\n", "pendulum = builder.AddSystem(PendulumPlant())\n", "\n", "# Setup visualization\n", "scene_graph = builder.AddSystem(SceneGraph())\n", "PendulumGeometry.AddToBuilder(builder, pendulum.get_state_output_port(), scene_graph)\n", "visualizer = pydrake.systems.meshcat_visualizer.ConnectMeshcatVisualizer(\n", " builder, \n", " scene_graph=scene_graph, \n", " zmq_url=zmq_url)\n", "visualizer.set_planar_viewpoint()\n", "visualizer.vis.delete()\n", "\n", "# Setup slider input\n", "slider = FloatSlider(value=0.0, min=-5., max=5., step=0.1, description='u', continuous_update=True)\n", "torque_system = builder.AddSystem(WidgetSystem([slider]))\n", "builder.Connect(torque_system.get_output_port(0), pendulum.get_input_port(0))\n", "\n", "diagram = builder.Build()\n", "\n", "# Set up a simulator to run this diagram\n", "simulator = Simulator(diagram)\n", "context = simulator.get_mutable_context()\n", "\n", "stop_button = ToggleButton(value=False, description='Stop Simulation')\n", "display(stop_button)\n", "\n", "# Set the initial conditions\n", "context.SetContinuousState([0.5, 0]) # theta, thetadot\n", "context.SetTime(0.0)\n", "\n", "if running_as_notebook: # Then we're not just running as a test on CI.\n", " simulator.set_target_realtime_rate(1.0)\n", "\n", " while not stop_button.value:\n", " simulator.AdvanceTo(simulator.get_context().get_time() + 1.0)\n", " stop_button.value = False\n", "else:\n", " simulator.AdvanceTo(0.1)\n" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "aNHdvSloKUSn" }, "source": [ "After running the cell above, you should see an interactive slider that will control the torque, and be prompted to open a window for the visualizer (you'll have to open the new window if you run that cell again). Press the \"Stop Simulation\" button to regain control and continue in the notebook.\n", "\n", "Note: I've used a different visualizer here so that you can interact with the simulation as it runs, even on Google Colab. But the matplotlib-based visualizer will serve us well for most of the course." ] }, { "cell_type": "markdown", "metadata": { "id": "zKmlDpfeKUSo" }, "source": [ "# Autapse\n", "\n", "The simplest recurrent neural network model. $$\\dot{x} = -x + \\tanh(wx + u)$$" ] }, { "cell_type": "code", "metadata": { "id": "ftbxk7ddKUSo" }, "source": [ "\n", "def autapse(x, w=1, u=0):\n", " \"\"\"Args:\n", " w is feedback weight\n", " u is input\n", " \"\"\"\n", " return -x + np.tanh(w * x + u)\n", "\n", "\n", "Autapse = np.vectorize(autapse)\n", "xmax = 2.\n", "ymax = 1.\n", "x = np.arange(-xmax, xmax, 0.01)\n", "\n", "vis = meshcat.Visualizer(zmq_url=zmq_url, server_args=server_args)\n", "mutil.set_planar_viewpoint(vis, xmax=xmax, xmin=-xmax, ymin=-ymax, ymax=ymax)\n", "vis.delete()\n", "vis['/Grid'].set_property(\"visible\", True)\n", "vis['/Axes'].set_property(\"visible\", True)\n", "\n", "def update(w=1, u=0):\n", " vertices = np.vstack([x,0*x, Autapse(x, w=w, u=u)])\n", " vis[\"autpase\"].set_object(meshcat.geometry.Line(meshcat.geometry.PointsGeometry(vertices),meshcat.geometry.LineBasicMaterial(color=0x000000)))\n", " \n", " \n", "interact(update, w=(0,3,0.1), u=(-1.5,1.5,0.1));\n", "\n" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "0z4i5CF5KUSp" }, "source": [ "# Long short-term memory (LSTM)\n", "\n", "A recurrent neural network component.. [Wikipedia](https://en.wikipedia.org/wiki/Long_short-term_memory)\n", "\n", "We'll look at the [\"gated recurrent unit\" version](https://colah.github.io/posts/2015-08-Understanding-LSTMs/)...\n", "\\begin{align}\n", "i[n] &= \\sigma_g(A_{i} x[n] + B_{i} u[n] + c_i) & \\text{input gate} \\\\\n", "o[n] &= \\sigma_g(A_{o} x[n] + B_{o} u[n] + c_o) & \\text{output gate} \\\\\n", "f[n] &= \\sigma_g(A_{f} x[n] + B_{f} u[n] + c_f) & \\text{forget gate} \\\\\n", "x[n+1] &= f[n] \\circ x[n] + i[n] \\circ \\tanh(B_{x} u[n] + c_c) \\\\\n", "y[n] &= \\tanh(o[n] \\circ x[n])\n", "\\end{align}\n", "\n", "where the operator $\\circ$ denotes the Hadamard product (element-wise product). $\\sigma _{g}(x) = \\frac{1}{1+e^{-x}} \\in (0,1)$ is the sigmoid function and recall that $\\tanh(x) \\in (-1,1)$.\n", "\n", "In the example below, we will plot the continuous-time version of this (to stay consistent with the rest of the analysis in the chapter), $\\dot{x}$ vs $x$. To keep the number of parameters reasonable, I've assumed that there are three inputs -- with $$u = \\begin{bmatrix} u_{input} \\\\ u_{forget} \\\\ u_{x} \\end{bmatrix}, B_i = [ 1, 0 0 ], B_f = [ 0, 1, 0 ], B_x = [ 0, 0, 1].$$ Note that the output $y$ has no impact on the stability and is not represented below.\n" ] }, { "cell_type": "code", "metadata": { "id": "LIiZe1mFKUSq" }, "source": [ "\n", "def sigma(x):\n", " return 1./(1+np.exp(-x))\n", "\n", "def lstm(x, uf=0, ui=0, ux=0, af=1, bf=1, cf=0, ai=1, bi=1, ci=0, bx=1, cx=0):\n", " return - x + sigma(af*x + bf*uf + cf)*x + sigma(ai*x + bi*ui + ci) * np.tanh(bx*ux+cx)\n", "\n", "Lstm = np.vectorize(lstm)\n", "xmax = 10.\n", "ymax = 4.\n", "x = np.arange(-xmax, xmax, 0.01)\n", "\n", "vis = meshcat.Visualizer(zmq_url=zmq_url, server_args=server_args)\n", "mutil.set_planar_viewpoint(vis, xmax=xmax, xmin=-xmax, ymin=-ymax, ymax=ymax)\n", "vis.delete()\n", "vis['/Grid'].set_property(\"visible\", True)\n", "vis['/Axes'].set_property(\"visible\", True)\n", "\n", "def update(u_forget, u_input, u_x,\n", " a_forget, b_forget, c_forget,\n", " a_input, b_input, c_input,\n", " b_x, c_x):\n", " vertices = np.vstack([x,0*x, Lstm(x,\n", " uf=u_forget, ui=u_input, ux=u_x,\n", " af=a_forget, bf=b_forget, cf=c_forget,\n", " ai=a_input, bi=b_input, ci=c_input,\n", " bx=b_x, cx=c_x)])\n", " vis[\"autpase\"].set_object(meshcat.geometry.Line(meshcat.geometry.PointsGeometry(vertices),meshcat.geometry.LineBasicMaterial(color=0x000000)))\n", " \n", "interact(update,\n", " u_forget=(-10,10,0.1),\n", " u_input=(-10,10,0.1),\n", " u_x=(-10,10,0.1),\n", " a_forget=(0,2,0.1),\n", " b_forget=(0,2,0.1),\n", " c_forget=(-1,1,.1),\n", " a_input=(0,2,0.1),\n", " b_input=(0,2,0.1),\n", " c_input=(-1,1,.1),\n", " b_x=(0,2,0.1),\n", " c_x=(-1,1,.1),\n", " );\n" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "CvoRbIK1KUSq" }, "source": [ "I will make a few observations. First, I'll admit that the behavior here looks pretty strange to me! When \"forget gate\" is on (corresponding to a large negative input $u_{forget}$) and the \"input gate\" is off (corresponding to a large negative input $u_{input}$), the system has a stable fixed point close to zero. When the forget gate is off (large $u_{forget}$ -- yes, it feels backwards to me, too) then the system becomes an integrator which will accumulate the input $u_x$. Yes -- it's unstable! And no, it's not an artifact of the continuous-time implementation. The total output of the system is limited by the output $\\tanh$, but the internal state is unstable.\n", "\n", "The design of these units does not appear to have been made by someone who knew dynamics (nor graphical analysis). That they seem to work somewhat well in practice is fairly remarkable to me. Shouldn't we be able to do better?\n", "\n", "Note: The parameter space here is large and the interpretations subtle. If you find a better parameter regime and/or different interpretation, then I'd love to hear about it." ] }, { "source": [ "# Energy Shaping Controller\n", "\n", "First we will design the energy shaping controller (only), and plot the closed-loop phase portrait. Remember, this system is not actually stable at the upright. It is only attractive!" ], "cell_type": "markdown", "metadata": { "id": "DZ2m35mEKUSr" } }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "import numpy as np\n", "import matplotlib.pyplot as plt\n", "from copy import copy\n", "\n", "from pydrake.all import (DiagramBuilder, Saturation, SignalLogger,\n", " Simulator, wrap_to, VectorSystem)\n", "from pydrake.examples.pendulum import (PendulumParams, PendulumPlant)\n", "\n", "\n", "class EnergyShapingController(VectorSystem):\n", "\n", " def __init__(self, pendulum):\n", " VectorSystem.__init__(self, 2, 1)\n", " self.pendulum = pendulum\n", " self.pendulum_context = pendulum.CreateDefaultContext()\n", " self.SetPendulumParams(PendulumParams())\n", " \n", " def SetPendulumParams(self, params):\n", " self.pendulum_context.get_mutable_numeric_parameter(0).SetFromVector(params.CopyToVector())\n", " self.pendulum_context.SetContinuousState([np.pi, 0])\n", " self.desired_energy = self.pendulum.EvalPotentialEnergy(self.pendulum_context)\n", "\n", " def DoCalcVectorOutput(self, context, pendulum_state, unused, output):\n", " self.pendulum_context.SetContinuousState(pendulum_state)\n", " params = self.pendulum_context.get_numeric_parameter(0)\n", " theta = pendulum_state[0]\n", " thetadot = pendulum_state[1]\n", " total_energy = self.pendulum.EvalPotentialEnergy(self.pendulum_context) + self.pendulum.EvalKineticEnergy(self.pendulum_context)\n", " output[:] = (params.damping() * thetadot - .1 * thetadot *\n", " (total_energy - self.desired_energy))\n", "\n", "\n", "def PhasePlot(pendulum):\n", " phase_plot = plt.figure()\n", " ax = phase_plot.gca()\n", " theta_lim = [-np.pi, 3. * np.pi]\n", " ax.set_xlim(theta_lim)\n", " ax.set_ylim(-10., 10.)\n", "\n", " theta = np.linspace(theta_lim[0], theta_lim[1], 601) # 4*k + 1\n", " thetadot = np.zeros(theta.shape)\n", " context = pendulum.CreateDefaultContext()\n", " params = context.get_numeric_parameter(0)\n", " context.SetContinuousState([np.pi, 0])\n", " E_upright = pendulum.EvalPotentialEnergy(context)\n", " E = [E_upright, .1 * E_upright, 1.5 * E_upright]\n", " for e in E:\n", " for i in range(theta.size):\n", " v = ((e + params.mass() * params.gravity() * params.length() *\n", " np.cos(theta[i])) /\n", " (.5 * params.mass() * params.length() * params.length()))\n", " if (v >= 0):\n", " thetadot[i] = np.sqrt(v)\n", " else:\n", " thetadot[i] = float(\"nan\")\n", " ax.plot(theta, thetadot, color=[.6, .6, .6])\n", " ax.plot(theta, -thetadot, color=[.6, .6, .6])\n", "\n", " return ax\n", "\n", "\n", "builder = DiagramBuilder()\n", "\n", "pendulum = builder.AddSystem(PendulumPlant())\n", "ax = PhasePlot(pendulum)\n", "saturation = builder.AddSystem(Saturation(min_value=[-3], max_value=[3]))\n", "builder.Connect(saturation.get_output_port(0), pendulum.get_input_port(0))\n", "controller = builder.AddSystem(EnergyShapingController(pendulum))\n", "builder.Connect(pendulum.get_output_port(0), controller.get_input_port(0))\n", "builder.Connect(controller.get_output_port(0), saturation.get_input_port(0))\n", "\n", "#visualizer = pydrake.systems.meshcat_visualizer.ConnectMeshcatVisualizer(\n", "# builder, \n", "# scene_graph=scene_graph, \n", "# zmq_url=zmq_url)\n", "#visualizer.vis.delete()\n", "#visualizer.set_planar_viewpoint(xmin=-3.0, xmax=3.0, ymin=-3.0, ymax=4.0)\n", "\n", "logger = builder.AddSystem(SignalLogger(2))\n", "builder.Connect(pendulum.get_output_port(0), logger.get_input_port(0))\n", "\n", "diagram = builder.Build()\n", "simulator = Simulator(diagram)\n", "context = simulator.get_mutable_context()\n", "\n", "for i in range(5):\n", " context.SetTime(0.)\n", " simulator.Initialize()\n", " logger.reset()\n", " context.SetContinuousState(np.random.randn(2,))\n", " simulator.AdvanceTo(4)\n", " ax.plot(logger.data()[0, :], logger.data()[1, :])\n" ] }, { "source": [ "## Swing-up and balance\n", "\n", "Now we will combine our simple energy shaping controller with a linear controller that stabilizes the upright fixed point once we get close enough. We'll read more about this approach in the Acrobot and Cart-Pole notes." ], "cell_type": "markdown", "metadata": {} }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from pydrake.all import Linearize, LinearQuadraticRegulator\n", "\n", "\n", "def BalancingLQR(pendulum):\n", " context = pendulum.CreateDefaultContext()\n", "\n", " pendulum.get_input_port(0).FixValue(context, [0])\n", " context.SetContinuousState([np.pi, 0])\n", "\n", " Q = np.diag((10., 1.))\n", " R = [1]\n", "\n", " linearized_pendulum = Linearize(pendulum, context)\n", " (K, S) = LinearQuadraticRegulator(linearized_pendulum.A(),\n", " linearized_pendulum.B(), Q, R)\n", " return (K, S)\n", "\n", " \n", "class SwingUpAndBalanceController(VectorSystem):\n", "\n", " def __init__(self, pendulum):\n", " VectorSystem.__init__(self, 2, 1)\n", " (self.K, self.S) = BalancingLQR(pendulum)\n", " self.energy_shaping = EnergyShapingController(pendulum)\n", " self.energy_shaping_context = self.energy_shaping.CreateDefaultContext()\n", "\n", " # TODO(russt): Add a witness function to tell the simulator about the\n", " # discontinuity when switching to LQR.\n", "\n", " def DoCalcVectorOutput(self, context, pendulum_state, unused, output):\n", " xbar = copy(pendulum_state)\n", " xbar[0] = wrap_to(xbar[0], 0, 2. * np.pi) - np.pi\n", "\n", " # If x'Sx <= 2, then use the LQR controller\n", " if (xbar.dot(self.S.dot(xbar)) < 2.):\n", " output[:] = -self.K.dot(xbar)\n", " else:\n", " self.get_input_port(0).FixValue(self.energy_shaping_context, pendulum_state)\n", " output[:] = self.energy_shaping.get_output_port(0).Eval(self.energy_shaping_context)\n", " \n", "builder = DiagramBuilder()\n", "\n", "pendulum = builder.AddSystem(PendulumPlant())\n", "ax = PhasePlot(pendulum)\n", "saturation = builder.AddSystem(Saturation(min_value=[-3], max_value=[3]))\n", "builder.Connect(saturation.get_output_port(0), pendulum.get_input_port(0))\n", "controller = builder.AddSystem(SwingUpAndBalanceController(pendulum))\n", "builder.Connect(pendulum.get_output_port(0), controller.get_input_port(0))\n", "builder.Connect(controller.get_output_port(0), saturation.get_input_port(0))\n", "\n", "logger = builder.AddSystem(SignalLogger(2))\n", "builder.Connect(pendulum.get_output_port(0), logger.get_input_port(0))\n", "\n", "diagram = builder.Build()\n", "simulator = Simulator(diagram)\n", "context = simulator.get_mutable_context()\n", "\n", "for i in range(5):\n", " context.SetTime(0.)\n", " simulator.Initialize()\n", " logger.reset()\n", " context.SetContinuousState(np.random.randn(2,))\n", " simulator.AdvanceTo(4)\n", " ax.plot(logger.data()[0, :], logger.data()[1, :])\n", " \n", "ax.set_xlim(np.pi - 3., np.pi + 3.)\n", "ax.set_ylim(-5., 5.)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ] }