{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "dd522981",
   "metadata": {},
   "source": [
    "# Linear quadratic optimal control example\n",
    "\n",
    "Richard M. Murray, 20 Jan 2022 (updated 7 Jul 2024)\n",
    "\n",
    "This example works through the linear quadratic finite time optimal control problem.  We assume that we have a linear system of the form\n",
    "$$\n",
    "\\dot x = A x + Bu \n",
    "$$\n",
    "and that we want to minimize a cost function of the form\n",
    "$$\n",
    "\\int_0^T (x^T Q_x x + u^T Q_u u) dt + x^T P_1 x.\n",
    "$$\n",
    "We show how to compute the solution the Riccati ODE and use this to obtain an optimal (time-varying) linear controller."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "866842ea",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import scipy as sp\n",
    "import matplotlib.pyplot as plt\n",
    "import control as ct\n",
    "import control.optimal as opt\n",
    "import control.flatsys as fs\n",
    "import time"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "83a32e85",
   "metadata": {},
   "source": [
    "## System dynamics\n",
    "\n",
    "We use the linearized dynamics of the vehicle steering problem as our linear system.  This is mainly for convenient (since we have some intuition about it).  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "48c1bd7f-0db6-4488-af41-41f685280ec9",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Vehicle dynamics (bicycle model)\n",
    "\n",
    "# Function to take states, inputs and return the flat flag\n",
    "def _kincar_flat_forward(x, u, params={}):\n",
    "    # Get the parameter values\n",
    "    b = params.get('wheelbase', 3.)\n",
    "    #! TODO: add dir processing\n",
    "\n",
    "    # Create a list of arrays to store the flat output and its derivatives\n",
    "    zflag = [np.zeros(3), np.zeros(3)]\n",
    "\n",
    "    # Flat output is the x, y position of the rear wheels\n",
    "    zflag[0][0] = x[0]\n",
    "    zflag[1][0] = x[1]\n",
    "\n",
    "    # First derivatives of the flat output\n",
    "    zflag[0][1] = u[0] * np.cos(x[2])  # dx/dt\n",
    "    zflag[1][1] = u[0] * np.sin(x[2])  # dy/dt\n",
    "\n",
    "    # First derivative of the angle\n",
    "    thdot = (u[0]/b) * np.tan(u[1])\n",
    "\n",
    "    # Second derivatives of the flat output (setting vdot = 0)\n",
    "    zflag[0][2] = -u[0] * thdot * np.sin(x[2])\n",
    "    zflag[1][2] =  u[0] * thdot * np.cos(x[2])\n",
    "\n",
    "    return zflag\n",
    "\n",
    "# Function to take the flat flag and return states, inputs\n",
    "def _kincar_flat_reverse(zflag, params={}):\n",
    "    # Get the parameter values\n",
    "    b = params.get('wheelbase', 3.)\n",
    "    dir = params.get('dir', 'f')\n",
    "\n",
    "    # Create a vector to store the state and inputs\n",
    "    x = np.zeros(3)\n",
    "    u = np.zeros(2)\n",
    "\n",
    "    # Given the flat variables, solve for the state\n",
    "    x[0] = zflag[0][0]  # x position\n",
    "    x[1] = zflag[1][0]  # y position\n",
    "    if dir == 'f':\n",
    "        x[2] = np.arctan2(zflag[1][1], zflag[0][1])  # tan(theta) = ydot/xdot\n",
    "    elif dir == 'r':\n",
    "        # Angle is flipped by 180 degrees (since v < 0)\n",
    "        x[2] = np.arctan2(-zflag[1][1], -zflag[0][1])\n",
    "    else:\n",
    "        raise ValueError(\"unknown direction:\", dir)\n",
    "\n",
    "    # And next solve for the inputs\n",
    "    u[0] = zflag[0][1] * np.cos(x[2]) + zflag[1][1] * np.sin(x[2])\n",
    "    thdot_v = zflag[1][2] * np.cos(x[2]) - zflag[0][2] * np.sin(x[2])\n",
    "    u[1] = np.arctan2(thdot_v, u[0]**2 / b)\n",
    "\n",
    "    return x, u\n",
    "\n",
    "# Function to compute the RHS of the system dynamics\n",
    "def _kincar_update(t, x, u, params):\n",
    "    b = params.get('wheelbase', 3.)             # get parameter values\n",
    "    #! TODO: add dir processing\n",
    "    dx = np.array([\n",
    "        np.cos(x[2]) * u[0],\n",
    "        np.sin(x[2]) * u[0],\n",
    "        (u[0]/b) * np.tan(u[1])\n",
    "    ])\n",
    "    return dx\n",
    "\n",
    "def _kincar_output(t, x, u, params):\n",
    "    return x                            # return x, y, theta (full state)\n",
    "\n",
    "# Create differentially flat input/output system\n",
    "kincar = fs.FlatSystem(\n",
    "    _kincar_flat_forward, _kincar_flat_reverse, name=\"kincar\",\n",
    "    updfcn=_kincar_update, outfcn=_kincar_output,\n",
    "    inputs=('v', 'delta'), outputs=('x', 'y', 'theta'),\n",
    "    states=('x', 'y', 'theta'))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fbdd78c0-30e9-43f7-9e8d-198ae38c2988",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Utility function to plot lane change manuever\n",
    "def plot_lanechange(t, y, u, figure=None, yf=None):\n",
    "    # Plot the xy trajectory\n",
    "    plt.subplot(3, 1, 1, label='xy')\n",
    "    plt.plot(y[0], y[1])\n",
    "    plt.xlabel(\"x [m]\")\n",
    "    plt.ylabel(\"y [m]\")\n",
    "    if yf is not None:\n",
    "        plt.plot(yf[0], yf[1], 'ro')\n",
    "\n",
    "    # Plot the inputs as a function of time\n",
    "    plt.subplot(3, 1, 2, label='v')\n",
    "    plt.plot(t, u[0])\n",
    "    plt.xlabel(\"Time $t$ [sec]\")\n",
    "    plt.ylabel(\"$v$ [m/s]\")\n",
    "\n",
    "    plt.subplot(3, 1, 3, label='delta')\n",
    "    plt.plot(t, u[1])\n",
    "    plt.xlabel(\"Time $t$ [sec]\")\n",
    "    plt.ylabel(\"$\\\\delta$ [rad]\")\n",
    "\n",
    "    plt.suptitle(\"Lane change manuever\")\n",
    "    plt.tight_layout()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "de9d85f3",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Initial conditions\n",
    "x0 = np.array([-40, -2., 0.])\n",
    "u0 = np.array([10, 0])               # only used for linearization\n",
    "Tf = 4\n",
    "\n",
    "# Linearized dynamics\n",
    "sys = kincar.linearize(x0, u0)\n",
    "print(sys)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c5c0abe9",
   "metadata": {},
   "source": [
    "## Optimal trajectory generation\n",
    "\n",
    "We generate an trajectory for the system that minimizes the cost function above.  Namely, starting from some initial function $x(0) = x_0$, we wish to bring the system toward the origin without using too much control effort."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "02e9f87c",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Define the cost function and the terminal cost\n",
    "# (try changing these later to see what happens)\n",
    "Qx = np.diag([1, 1, 1])       # state costs\n",
    "Qu = np.diag([1, 1])          # input costs\n",
    "Pf = np.diag([1, 1, 1])       # terminal costs"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "62c76e5e",
   "metadata": {},
   "source": [
    "### Finite time, linear quadratic optimization\n",
    "\n",
    "The optimal solution satisfies the following equations, which follow from the maximum principle:\n",
    "\n",
    "$$\n",
    "  \\begin{aligned}\n",
    "    \\dot x &= \\left(\\frac{\\partial H}{\\partial \\lambda}\\right)^T\n",
    "      = A x + Bu, \\qquad & x(0) &= x_0, \\\\\n",
    "    -\\dot \\lambda &= \\left(\\frac{\\partial H}{\\partial x}\\right)^T\n",
    "      = Q_x x + A^T \\lambda, \\qquad\n",
    "      & \\lambda(T) &= P_1 x(T), \\\\\n",
    "    0 &= \\left(\\frac{\\partial H}{\\partial u}\\right)^T\n",
    "      = Q_u u + B^T \\lambda. &&\n",
    "  \\end{aligned}\n",
    "$$\n",
    "\n",
    "The last condition can be solved to obtain the optimal controller\n",
    "\n",
    "$$\n",
    "  u = -Q_u^{-1} B^T \\lambda,\n",
    "$$\n",
    "\n",
    "which can be substituted into the equations for the optimal solution.\n",
    "\n",
    "Given the linear nature of the dynamics, we attempt to find a solution\n",
    "by setting $\\lambda(t) = P(t) x(t)$ where $P(t) \\in {\\mathbb R}^{n \\times\n",
    "n}$.  Substituting this into the necessary condition, we obtain\n",
    "\n",
    "$$\n",
    "  \\begin{aligned}\n",
    "    & \\dot\\lambda =\n",
    "      \\dot P x + P \\dot x = \\dot P x + P(Ax - BQ_u^{-1} B^T P) x, \\\\\n",
    "    & \\quad\\implies\\quad\n",
    "      -\\dot P x - PA x + PBQ_u^{-1}B P x = Q_xx + A^T P x.\n",
    "  \\end{aligned}\n",
    "$$\n",
    "\n",
    "This equation is satisfied if we can find $P(t)$ such that\n",
    "\n",
    "$$\n",
    "  -\\dot P = PA + A^T P - P B Q_u^{-1} B^T P + Q_x,\n",
    "  \\qquad P(T) = P_1.\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b63aed88",
   "metadata": {},
   "source": [
    "To solve a final value problem with $P(T) = P_1$, we set the \"initial\" condition to $P_1$ and then invert time, so that we solve\n",
    "\n",
    "$$\n",
    "\\frac{dP}{d(-t)} = -\\frac{dP}{dt} = -F(P), \\qquad P(0) = P_1\n",
    "$$\n",
    "\n",
    "Solving this equation from time $t = 0$ to time $t = T$ will give us an solution that goes from $P(T)$ to $P(0)$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "02d74789",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Set up the Riccatti ODE\n",
    "def Pdot_reverse(t, x):\n",
    "    # Get the P matrix from the state by resizing\n",
    "    P = np.reshape(x, (sys.nstates, sys.nstates))\n",
    "    \n",
    "    # Compute the right hand side of Riccati ODE\n",
    "    Prhs = P @ sys.A + sys.A.T @ P + Qx - \\\n",
    "        P @ sys.B @ np.linalg.inv(Qu) @ sys.B.T @ P\n",
    "        \n",
    "    # Return P as a vector, *backwards* in time (no minus sign)\n",
    "    return Prhs.reshape((-1))\n",
    "\n",
    "# Solve the Riccati ODE (converting from matrix to vector and back)\n",
    "P0 = np.reshape(Pf, (-1))\n",
    "Psol = sp.integrate.solve_ivp(Pdot_reverse, (0, Tf), P0)\n",
    "Pfwd = np.reshape(Psol.y, (sys.nstates, sys.nstates, -1))\n",
    "\n",
    "# Reorder the solution in time\n",
    "Prev = Pfwd[:, :, ::-1] \n",
    "trev = Tf - Psol.t[::-1]\n",
    "\n",
    "print(\"Trange = \", trev[0], \"to\", trev[-1])\n",
    "print(\"P[Tf] =\", Prev[:,:,-1])\n",
    "print(\"P[0] =\", Prev[:,:,0])\n",
    "\n",
    "# Internal comparison: show that initial value is close to algebraic solution\n",
    "_, P_lqr, _ = ct.lqr(sys.A, sys.B, Qx, Qu)\n",
    "print(\"P_lqr =\", P_lqr)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f4fb1166",
   "metadata": {},
   "source": [
    "For solving the $x$ dynamics, we need a function to evaluate $P(t)$ at an arbitrary time (used by the integrator).  We can do this with the SciPy `interp1d` function:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "728f675b",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Define an interpolation function for P\n",
    "P = sp.interpolate.interp1d(trev, Prev)\n",
    "\n",
    "print(\"P(0) =\", P(0))\n",
    "print(\"P(3.5) =\", P(3.5))\n",
    "print(\"P(4) =\", P(4))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "eb30c3fa",
   "metadata": {},
   "source": [
    "We now solve the $\\dot x$ equations *forward* in time, using $P(t)$:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "84092dcd",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Now solve the state forward in time\n",
    "def xdot_forward(t, x):\n",
    "    u = -np.linalg.inv(Qu) @ sys.B.T @ P(t) @ x\n",
    "    return sys.A @ x + sys.B @ u\n",
    "\n",
    "# Now simulate from a shifted initial condition\n",
    "xsol = sp.integrate.solve_ivp(xdot_forward, (0, Tf), x0)\n",
    "tvec = xsol.t\n",
    "x = xsol.y\n",
    "print(\"x[0] =\", x[:, 0])\n",
    "print(\"x[Tf] =\", x[:, -1])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8488acad",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Finally compute the \"desired\" state and input values\n",
    "xd = x\n",
    "ud = np.zeros((sys.ninputs, tvec.size))\n",
    "for i, t in enumerate(tvec):\n",
    "  ud[:, i] = -np.linalg.inv(Qu) @ sys.B.T @ P(t) @ x[:, i]\n",
    "\n",
    "plot_lanechange(tvec, xd, ud)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "89483f4b",
   "metadata": {},
   "source": [
    "Note here that we are stabilizing the system to the origin (compared to some of other examples where we change langes and so the final $y$ position is $y_\\text{f} = 2$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7ed4c5eb",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}