### A Pluto.jl notebook ###
# v0.19.40
using Markdown
using InteractiveUtils
# This Pluto notebook uses @bind for interactivity. When running this notebook outside of Pluto, the following 'mock version' of @bind gives bound variables a default value (instead of an error).
macro bind(def, element)
	quote
		local iv = try Base.loaded_modules[Base.PkgId(Base.UUID("6e696c72-6542-2067-7265-42206c756150"), "AbstractPlutoDingetjes")].Bonds.initial_value catch; b -> missing; end
		local el = $(esc(element))
		global $(esc(def)) = Core.applicable(Base.get, el) ? Base.get(el) : iv(el)
		el
	end
end
# ╔═╡ b1a45f30-beec-4089-904c-488b86b56a9e
begin
	using Plots
	using LaTeXStrings
	using PlutoUI
	using SymEngine
end
# ╔═╡ d4bb171d-3c1c-463a-9360-c78bdfc83363
begin
	using Calculus
	# Calculus.jl can also compute gradients and hessians of multidimensional functions
	Calculus.gradient(x -> x[1]^2 * exp(3x[2]), ones(2)), Calculus.hessian(x -> x[1]^2 * exp(3x[2]), ones(2))
end
# ╔═╡ 86440ba5-4b5f-440b-87e4-5446217dd073
using ForwardDiff # one particular AD package in julia
# ╔═╡ 7a7fc4fc-be68-40d6-868b-d141a7054319
html"""
"""
# ╔═╡ 5c316980-d18d-4698-a841-e732f7632cec
html""
# ╔═╡ 53ef0bfc-4239-11ec-0b4c-23f451fff4a6
md"""
# Optimization 1
* This lecture reminds you of some optimization theory.
* The focus here is to illustrate use cases with julia.
* We barely scratch the surface of optimization, and I refer you to Nocedal and Wright for a more thorough exposition in terms of theory.
* This 2-part lecture is heavily based on [Algorithms for Optimization](https://mitpress.mit.edu/books/algorithms-optimization) by Kochenderfer and Wheeler.
This is a 2 part lecture.
## Optimization I: Basics
1. Intro
2. Conditions for Optima
3. Derivatives and Gradients
4. Numerical Differentiation
5. Optim.jl
## Optimization II: Algorithms
1. Bracketing
2. Local Descent
3. First/Second Order and Direct Methods
4. Constraints
## The Optimization Process
```
1. Problem Specification
2. Initial Design
3. Optimization Procedure:
    a) Evaluate Performance
    b) Good?
        i. yes: final design
        ii. no:
            * Change design
            * go back to a)
```
We want to automate step 3.
## Optimization Algorithms
* All of the algorithms we are going to see employ some kind of *iterative* procedure.
* They try to improve the value of the objective function over successive steps.
* The way the algorithm goes about generating the next step is what distinguishes algorithms from one another.
* Some algos only use the objective function
* Some use both objective and gradients
* Some add the Hessian
* and many variants more
## Desirable Features of any Algorithm
* Robustness: We want good performance on a wide variety of problems in their class, and starting from *all* reasonable starting points.
* Efficiency: They should be fast and not use an excessive amount of memory.
* Accuracy: They should identify the solution with high precision.
"""
# ╔═╡ 9b3eee98-e481-4fb6-98c2-6ac408dcfe54
md"""
## Optimisation Basics
* Recall our generic definition of an optimization problem:
$$\min_{x\in\mathbb{R}^n} f(x) \text{ s.t. } x \in \mathcal{X}$$
symbol | meaning
--- | ----
$x$ | *choice variable* or a *design point*
$\mathcal{X}$ | feasible set
$f$ | objective function
$x^*$ | *solution* or a *minimizer*
$x^*$ is a *solution* or *minimizer* of this problem if $x^*$ is *feasible* and $x^*$ minimizes $f$ over $\mathcal{X}$.
Maximization is just minimizing $(-1)f$:
$$\min_{x\in\mathbb{R}^n} f(x) \text{ s.t. } x \in \mathcal{X} \equiv \max_{x\in\mathbb{R}^n} -f(x) \text{ s.t. } x \in \mathcal{X}$$
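
As a quick numerical sanity check of this equivalence (a hypothetical one-dimensional example, not part of the definitions above), the grid point minimizing $f$ is the same one maximizing $-f$:

```julia
# hypothetical 1-D check: the minimizer of f equals the maximizer of -f
f(x) = (x - 2)^2                  # convex, minimum at x = 2
xs = range(0, 4, length = 401)
_, imin = findmin(f.(xs))
_, imax = findmax(-f.(xs))
xs[imin] == xs[imax]              # same grid point, x = 2
```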
"""
# ╔═╡ 6163277d-70d3-4a73-89df-65c329c2b818
md"""
#
"""
# ╔═╡ 843fef36-611c-4411-b31b-8a11e128881b
@bind B Slider(0:0.1:10,default = 3.0)
# ╔═╡ 3ba9cc34-0dcd-4c2e-b428-242e456bd436
let
	npoints = 100
	a,b = (0,10)
	x = range(a,b,length = npoints)
	f₀(x) = x .* sin.(x)
	plot(x, f₀.(x), leg=false,color=:black,lw = 2,title = "Finding the Max is Easy! Right?")
	xtest = x[x .<= B]
	fmax,ix = findmax(f₀.(xtest))
	scatter!([xtest[ix]], [fmax], color = :red, ms = 5)
	vline!([B],lw = 3)
end
# ╔═╡ 601e4aa9-e380-41a1-96a2-7089603889c3
md"""
## Constraints
* We often have constraints on problems in economics.
$$\max_{x_1,x_2} u(x_1,x_2) \text{ s.t. } p_1 x_1 + p_2 x_2 \leq y$$
* Constraints define the feasible set $\mathcal{X}$.
* It's better to write *weak inequalities* (i.e. $\leq$) rather than strict ones ($<$).
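
For instance, the budget set above is easy to encode as a feasibility check (the prices and income below are hypothetical numbers, purely for illustration):

```julia
# sketch: is the bundle (x1, x2) inside the budget set? (hypothetical numbers)
feasible(x1, x2; p1 = 1.0, p2 = 2.0, y = 10.0) = p1 * x1 + p2 * x2 <= y
feasible(2.0, 3.0)   # 2 + 6 = 8 ≤ 10, so true
feasible(4.0, 4.0)   # 4 + 8 = 12 > 10, so false
```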
"""
# ╔═╡ fefd6403-f46c-4eb1-b754-a85bcb75914c
md"""
## Example
$$\min_{x_1,x_2} -\exp(-(x_1 x_2 - 3/2)^2 - (x_2-3/2)^2) \text{ s.t. } x_2 \leq \sqrt{x_1}$$
"""
# ╔═╡ 2ac3348d-196a-4507-b2f7-c575e42d7e7b
let
	x = 0:0.01:3.5
	f0(x1,x2) = -exp.(-(x1.*x2 - 3/2).^2 - (x2-3/2).^2)
	c(z) = sqrt(z)
	p1 = surface(x,x,(x,y)->f0(x,y),xlab = L"x_1", ylab = L"x_2")
	p2 = contour(x,x,(x,y)->f0(x,y),lw=1.5,levels=[collect(0:-0.1:-0.85)...,-0.887,-0.95,-1],xlab = L"x_1", ylab = L"x_2")
	plot!(p2,c,0.01,3.5,label="",lw=2,color=:black,fill=(0,0.5,:blue))
	scatter!(p2,[1.358],[1.165],markersize=5,markercolor=:red,label="Constr. Optimum")
	plot(p1,p2,size=(900,300))
end
# ╔═╡ 09582278-6fed-4cac-9aaa-45cf0ac9fb6c
md"""
## Conditions for Local Minima
We can define *first and second order necessary conditions* (FONC and SONC). The terminology emphasizes that these conditions are *necessary* for optimality, but not *sufficient*.
### Univariate $f$
1. **FONC:** $f'(x^*) =0$
2. **SONC** $f''(x^*) \geq 0$ (and $f''(x^*) \leq 0$ for local maxima)
2. (**SOSC** $f''(x^*) > 0$ (and $f''(x^*) < 0$ for local maxima))
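
A quick finite-difference check of these conditions, for the hypothetical example $f(x) = (x-1)^2$ with known minimizer $x^* = 1$ (the step size `h` is an arbitrary choice):

```julia
# sketch: check FONC and SOSC numerically at x* = 1 for f(x) = (x-1)^2
f(x) = (x - 1)^2
h = 1e-6
df(x)  = (f(x + h) - f(x - h)) / (2h)          # central difference
d2f(x) = (f(x + h) - 2f(x) + f(x - h)) / h^2
abs(df(1.0)) < 1e-8   # FONC: derivative ≈ 0
d2f(1.0) > 0          # SOSC: second derivative > 0
```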
"""
# ╔═╡ 58cb6931-91a5-4325-8be4-6675f7e142ed
md"""
### Multivariate $f$
1. **FONC:** $\nabla f(x^*) =0$
2. **SONC** $\nabla^2f(x^*)$ is positive semidefinite (negative semidefinite for local maxima)
2. (**SOSC** $\nabla^2f(x^*)$ is positive definite (negative definite for local maxima))
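
Definiteness can be checked via eigenvalues; a minimal sketch for the hypothetical $f(x) = x_1^2 + x_2^2$, whose Hessian at every point is $2I$:

```julia
using LinearAlgebra
# sketch: the Hessian of f(x) = x₁² + x₂² is 2I everywhere
H = [2.0 0.0; 0.0 2.0]
all(eigvals(Symmetric(H)) .> 0)   # positive definite → SOSC holds at the FONC point (0,0)
```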
"""
# ╔═╡ 9a5cb736-237c-4b3d-9820-b05ec4c961d5
md"""
#
"""
# ╔═╡ 89de388b-4bd1-4814-8378-10bfd0ac3f3d
md"""
#
"""
# ╔═╡ 274f5fd9-904d-4f11-b4e1-93a37e206080
md"""
#
"""
# ╔═╡ 3af8c139-2e6c-4830-9b67-96f78356f521
md"""
## Example Time: Rosenbrock's Banana Function
A well-known test function for numerical optimization algorithms is the banana function introduced by Rosenbrock in 1960. It is defined by
$$f(\mathbf{x}) = (1-x_1)^2 + 5(x_2-x_1^2)^2$$
"""
# ╔═╡ dd9bfbb1-aecf-458f-9a05-a93ff78fd741
md"""
## How to write a julia function?
* We talked briefly about this - so let's try out the various forms:
* (and don't forget to [look at the manual](https://docs.julialang.org/en/v1/manual/functions/) as always!)
"""
# ╔═╡ 34b7e91e-d67f-4554-b985-b9100adda733
# long form taking a vector x
function rosen₁(x)
	(1-x[1])^2 + 5*(x[2] - x[1]^2)^2
end
# ╔═╡ 4d2a5726-2704-4b63-b334-df5175278b18
begin
	using Optim
	result = optimize(rosen₁, zeros(2), NelderMead())
end
# ╔═╡ 3270f9e3-e232-4752-949f-12f984581b19
# short form taking a vector x
rosen₂(x) = (1-x[1])^2 + 5*(x[2] - x[1]^2)^2
# ╔═╡ 2dbb5b13-790a-4ab7-95b1-b833c4cb027a
rosen₁([1.1,0.4]) == rosen₂([1.1,0.4])
# ╔═╡ f51233c4-ec66-4517-9109-5309601d1d87
md"""
* but the stuff with `x[1]` and `x[2]` is ugly to read
* no? 🤷🏿♂️ well I'd like to read this instead
$$f(x,y) = (1-x)^2 + 5(y-x^2)^2$$
* fear not. we can do better here.
"""
# ╔═╡ 1d698018-8b77-490d-ad3a-6c7001aa99ab
md"""
#
"""
# ╔═╡ 3729833f-80d4-4948-8d81-750008c8f16d
begin
	# long form taking an x and a y
	function rosen₃(x,y)
		(1-x)^2 + 5*(y - x^2)^2
	end
	# short form taking a vector x
	rosen₄(x) = (1-x[1])^2 + 5*(x[2] - x[1]^2)^2
end
# ╔═╡ 2eae1d35-df83-415f-87a5-1a5e0d1d649e
rosen₄([1,1])
# ╔═╡ 7172d082-e6d2-419b-8bb6-75e30f1b4dfe
md"""
ok fine, but it's often useful to keep data in a vector. Can we have the readability of the `x,y` formulation with the vector input?
➡️ We can! here's a cool feature called *argument destructuring*:
"""
# ╔═╡ e7841458-f641-48cf-8667-1e5b38cbd9f6
rosen₅((x,y)) = (1-x)^2 + 5*(y - x^2)^2 # the argument is a `tuple`, i.e. a single object!
# ╔═╡ abbc5a52-a02c-4f5b-bd1e-af5596455762
@which rosen₅([1.0, 1.3])
# ╔═╡ 95e688e2-9607-41a2-9098-626590bcf435
rosen₅( [1.0, 1.3] ) # assigns x = 1.0 , y = 1.3 inside the function
# ╔═╡ 8279fd8a-e447-49b6-b729-6e7b8883f5e4
md"""
#
Ok enough of that. Let's get a visual of the Rosenbrock function finally!
"""
# ╔═╡ ed2ee298-ac4f-4ae3-a9e3-300040a706a8
md"""
#
### Keyword Arguments
In fact, the numbers `1` and `5` in
$$f(x,y) = (1-x)^2 + 5(y-x^2)^2$$
are just *parameters*, i.e. the function definition can be changed by varying those. Let's get a version of `rosen()` which allows this, then let's investigate the plot again:
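
As a preview of the call syntax (a sketch of the definition given in the next cell):

```julia
# sketch: keyword arguments supply defaults that callers can override
rosenkw(x, y; a = 1, b = 5) = (a - x)^2 + b * (y - x^2)^2
rosenkw(1.0, 1.0)            # defaults: (1-1)^2 + 5*(1-1)^2 = 0.0
rosenkw(1.0, 1.0, a = 2.0)   # override a: (2-1)^2 = 1.0
```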
"""
# ╔═╡ 0bbaa5a8-8082-4697-ae98-92b2ae3769af
rosenkw(x,y ; a = 1, b = 5) = (a - x)^2 + b*(y - x^2)^2 # notice the ;
# ╔═╡ 5abc4cf1-7fe1-4d5e-9077-262984d07b4c
md"""
#
"""
# ╔═╡ dd0c1982-38f4-4752-916f-c05da365bade
md"""
* alright, not bad. but how can I change the a and b values now?
* One solution is to pass an *anonymous function* which will *enclose* the values for `a` and `b` (it is hence called a `closure`):
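
A minimal sketch of the closure idea (the values `a = 1.5`, `b = 2.0` are arbitrary choices):

```julia
# sketch: the anonymous function below *encloses* fixed values of a and b
rosenkw(x, y; a = 1, b = 5) = (a - x)^2 + b * (y - x^2)^2
f_closed = (x, y) -> rosenkw(x, y, a = 1.5, b = 2.0)
f_closed(1.5, 2.25)   # the minimum of this parameterization: 0.0
```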
"""
# ╔═╡ f655db71-18c6-40db-83c8-0035e37e6eda
md"""
#
"""
# ╔═╡ 202dc3b6-ddcb-463d-b8f2-a285a2ecb112
md"""
This wouldn't be a proper Pluto session if we didn't hook those values up to a slider, would it? Let's do it!
"""
# ╔═╡ 29d33b1f-8901-4fee-aa85-11adb6ebad1b
md"""
#
"""
# ╔═╡ 91fd09a1-8b3a-4772-b6a5-7b149d91eb4d
md"""
a = $(@bind a Slider(0.05:0.1:10.5, default=1, show_value=true))
"""
# ╔═╡ b49ca3b1-0d1b-4edb-8064-e8cd8d4db727
md"""
b = $(@bind b Slider(0.1:0.5:20, default=1, show_value=true))
"""
# ╔═╡ 86f0e396-f81b-45be-94a7-90e40a8ba251
md"""
## Finding Optima
Ok, tons of fun. Now let's see where the optimum of this function is located. In this instance, *optimum* means the *lowest value* on the $z$ axis. Let's project the 3D graph down into 2D via a contour plot to see this better:
"""
# ╔═╡ 9806ec5e-a884-41a1-980a-579915a33b8e
md"""
* The optimum is at point $(1,1)$ (I know it.)
* it's not great to see the contour lines on this plot though, so let's try a bit harder.
* Let's choose a different color scheme, and let's also be a bit smarter about the levels at which we draw the contour lines:
"""
# ╔═╡ 8300dbb5-0eb6-4f84-80c6-24c4443b1f29
md"""
## Derivatives and Gradients
* 😱
* You all know this, so no panic.
* The derivative of a univariate function $f$ at point $x$, $f'(x)$ gives the rate with which $f$ changes at point $x$.
* Think of a tangent line to a curve, known to economists as the omnipresent and omnipotent expression: `THE SLOPE`. Easy. Peanuts. 🥜
* Here is the definition of $f'$
$$f'(x) \equiv \lim_{h\to0}\frac{f(x+h)-f(x)}{h}$$
* Like, if I gave you a function like $u(c) = \frac{c^{1-\sigma}}{1-\sigma}$, I bet you could shoot back in your sleep that $u'(c) = \frac{\partial u(c)}{\partial c} = ?$
* Of course you know all the differentiation rules, so no problem. But a computer?
* In fact, there are several ways. Let's illustrate the easiest one first, called *finite differencing*:
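
Before the interactive version below, here is a standalone sketch contrasting a *forward* with a *central* difference (the step size `h` is an arbitrary choice):

```julia
# sketch: forward vs central differences for f(x) = sin(x); the truth is cos(x)
f(x) = sin(x)
h = 1e-5
fwd = (f(1.0 + h) - f(1.0)) / h            # O(h) error
ctr = (f(1.0 + h) - f(1.0 - h)) / (2h)     # O(h²) error
abs(ctr - cos(1.0)) < abs(fwd - cos(1.0))  # central is more accurate
```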
"""
# ╔═╡ edd64823-b054-4974-b817-853319a62bcd
u(c; σ = 2) = ((c)^(1-σ)) / (1-σ)
# ╔═╡ 986fcae1-138c-42f6-810e-e3c193f669bb
u(2.2)
# ╔═╡ b901c4aa-38f8-476a-8c9e-7eb523f59438
eps()
# ╔═╡ d4af5141-422b-4941-8dc7-f2b4b09029c0
md"""
ϵ = $(@bind ϵ Slider(-6:-1, show_value = true, default = -1))
"""
# ╔═╡ 3fd2f03a-fc52-4009-b284-0def00be601f
h = 10.0^ϵ
# ╔═╡ 27d955de-8d97-43e4-9176-aad5456eb797
let
	c = 2.2
	∂u∂c = (u(c + h) - u(c)) / h # definition from above!
	Dict(:finite_diff => ∂u∂c, :truth_Paolo => c^-2)
end
# ╔═╡ 645ef857-aff9-4dee-bfd6-72fe9d542375
md"""
## Multiple Dimensions
* Let's add some notation for functions of more than one dimension.
### $f$ that takes a vector and outputs a number
* Unless otherwise noted, we have $x \in \mathbb{R}^n$ as an $n$ element vector.
* The **gradient** of a function $f : \mathbb{R}^n \mapsto \mathbb{R}$ is denoted $\nabla f:\mathbb{R}^n \mapsto \mathbb{R}^n$ and it returns a vector
$$\nabla f(x) = \left(\frac{\partial f}{\partial x_1}(x),\frac{\partial f}{\partial x_2}(x),\dots,\frac{\partial f}{\partial x_n}(x) \right)$$
* So that's just taking the partial derivative wrt to *each* component in $x$.
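
A gradient can be approximated component by component; a sketch for the hypothetical $f(x) = x_1^2 + 3x_2$, whose gradient is $\left(2x_1, 3\right)$:

```julia
# sketch: central-difference gradient of f(x) = x₁² + 3x₂ at x₀ = (1, 2)
f(x) = x[1]^2 + 3x[2]
h = 1e-6
x0 = [1.0, 2.0]
e = [[1.0, 0.0], [0.0, 1.0]]                        # unit vectors
g = [(f(x0 + h*ei) - f(x0 - h*ei)) / (2h) for ei in e]
# g ≈ [2.0, 3.0]
```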
### $f$ that takes a vector and outputs *another vector* 🤪
* In this case we talk of the **Jacobian** matrix.
* If $f$ maps $n$ numbers (in) to $m$ numbers (out), *taking the derivative* means keeping track of how all $m$ outputs change as we change each of the $n$ input components in $x$.
* One particularly relevant matrix in optimization is the **Hessian**, which is the Jacobian of the gradient $\nabla f$.
* You can think of the Hessian as a function $H_f :\mathbb{R}^n \mapsto \mathbb{R}^{n\times n}$ that returns an $(n,n)$ matrix, where the elements are
$$H_f(x) = \left( \begin{array}{cccc}
\frac{\partial^2 f}{\partial x_1 \partial x_1}(x) & \frac{\partial^2 f}{\partial x_2 \partial x_1}(x) & \dots & \frac{\partial^2 f}{\partial x_n \partial x_1}(x) \\
\frac{\partial^2 f}{\partial x_1 \partial x_2}(x) & \frac{\partial^2 f}{\partial x_2 \partial x_2}(x) & \dots & \frac{\partial^2 f}{\partial x_n \partial x_2}(x) \\
\vdots & \vdots & \dots & \vdots \\
\frac{\partial^2 f}{\partial x_1 \partial x_n}(x) & \frac{\partial^2 f}{\partial x_2 \partial x_n}(x) & \dots & \frac{\partial^2 f}{\partial x_n \partial x_n}(x)
\end{array}\right)$$
* or you can just imagine taking the gradient from above, and then differentiating each element *again* with respect to all components of $x$.
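
The same finite-difference idea extends to a Jacobian; a sketch for the hypothetical map $f(x) = (x_1^2 x_2, \; 5x_1 + \sin x_2)$:

```julia
# sketch: central-difference Jacobian of f(x) = (x₁²x₂, 5x₁ + sin(x₂)) at (1, 2)
f(x) = [x[1]^2 * x[2], 5x[1] + sin(x[2])]
h = 1e-6
x0 = [1.0, 2.0]
e = [[1.0, 0.0], [0.0, 1.0]]
J = hcat([(f(x0 + h*ei) - f(x0 - h*ei)) / (2h) for ei in e]...)
# analytically: J = [2x₁x₂ x₁²; 5 cos(x₂)], i.e. [4 1; 5 cos(2)] at (1, 2)
```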
"""
# ╔═╡ 06ca10a8-c922-4252-91d2-e025ab306f02
md"""
## Time for a Proof! 😨
* We mentioned above the FOC and SOC conditions.
* We should be able to *prove* that the point (1,1) is an optimum, right?
* Let's do it! Everybody derive the gradient *and* the Hessian of the Rosenbrock function $$f(x,y) = (1-x)^2 + 5(y-x^2)^2$$ to show that $(1,1)$ is a candidate optimum! As homework! 😄
$$\left(\frac{\partial f(x,y)}{\partial x}, \frac{\partial f(x,y)}{\partial y}\right) = (0,0)$$
"""
# ╔═╡ ab589e93-a4ca-45be-882c-bc3da47e4d1c
md"""
### Calculus.jl package
* Meanwhile, here is a neat package to help out with finite differencing:
"""
# ╔═╡ b600aafb-7d23-417a-a8c9-597d95182469
md"""
## Approaches to Differentiation
1. We have seen *numerical differentiation*, or *finite differencing*. We saw the issues with choosing the right step size, and we need to evaluate the function many times, which is costly.
1. *Symbolic differentiation*: we can teach the computer the differentiation rules, declare *symbols*, and then manipulate those expressions. We'll do that next.
1. Finally, there is **Automatic Differentiation (AD)**. That's the 💣 future! More below.
"""
# ╔═╡ bf8dfa21-29e4-4d6e-a876-ba1a6ca313b1
md"""
## Symbolic Differentiation on a Computer
* If you can write down an analytic form of $f$, there are ways to *symbolically* differentiate it on a computer.
* This is as if you did the derivation on paper.
* Mathematica, Python, and Julia all have packages for that.
"""
# ╔═╡ 068dd98e-8507-4380-a4b2-f6fee80adaaa
begin
	x = symbols("x")
	f = x^2 + x/2 - sin(x)/x
	diff(f, x)
end
# ╔═╡ 4b3f4b1b-1b22-4e2e-be5b-d44d74d8da0e
md"""
## Automatic Differentiation (AD)
* Breaks down the actual `code` that defines a function and applies elementary differentiation rules, dissecting expressions via the chain rule:
$$\frac{d}{dx}f(g(x)) = \frac{df}{dg}\frac{dg}{dx}$$
* This produces **analytic** derivatives, i.e. there is **no** approximation error.
* Very accurate, very fast.
* The idea is to be able to *unpick* **expressions** in your code.
* **Machine Learning** depends very strongly on this technology.
* Let's look at an example
"""
# ╔═╡ 3e480576-ed7d-4f2d-bcd1-d7d1cbbeccf9
let
	c = 1.5
	∂u∂c = (u(c + h) - u(c)) / h # definition from above!
	(∂u∂c, c^-2, ForwardDiff.derivative(u,c))
end
# ╔═╡ bc52bf0c-6cd1-488d-a9c1-7a91a582dda9
md"""
* I find this mind blowing 🤯
#
### AD Example
Consider the function $f(x,y) = \ln(xy + \max(x,2))$. Let's get the partial derivative wrt $x$:
$$\begin{aligned} \frac{\partial f}{\partial x} &= \frac{1}{xy + \max(x,2)} \frac{\partial}{\partial x}(xy + \max(x,2)) \\
&= \frac{1}{xy + \max(x,2)} \left[\frac{\partial(xy)}{\partial x} + \frac{\partial\max(x,2)}{\partial x} \right]\\
&= \frac{1}{xy + \max(x,2)} \left[\left(y\frac{\partial(x)}{\partial x} + x\frac{\partial(y)}{\partial x}\right) + \left(\mathbf{1}(2>x)\frac{\partial 2}{\partial x} + \mathbf{1}(2<x)\frac{\partial x}{\partial x}\right) \right]\\
&= \frac{y + \mathbf{1}(2<x)}{xy + \max(x,2)}
\end{aligned}$$

At $x = 3, y = 2$ this evaluates to $\frac{2 + 1}{3\cdot 2 + 3} = \frac{1}{3}$.

AD implements this by propagating *dual numbers*: each quantity carries a value `v` together with a derivative `∂`, and elementary operations like `max` are overloaded to propagate both. A sketch of such an overload (this mirrors the idea behind `ForwardDiff`'s internals, not its exact code):

```julia
function Base.max(a::Dual, b::Dual)
	v = max(a.v, b.v)
	∂ = a.v > b.v ? a.∂ : a.v < b.v ? b.∂ : NaN
	return Dual(v, ∂)
end
function Base.max(a::Dual, b::Int)
	v = max(a.v, b)
	∂ = a.v > b ? a.∂ : a.v < b ? 0 : NaN  # a constant has zero derivative
	return Dual(v, ∂)
end
```
"""
# ╔═╡ d9238a26-e792-44fc-be3d-7d8ec7e0117d
let
	x = ForwardDiff.Dual(3,1);
	y = ForwardDiff.Dual(2,0);
	log(x*y + max(x,2))
end
# ╔═╡ eb2d7221-25b4-4836-b818-3ed944570040
md"""
... or just:
"""
# ╔═╡ 66f0d9bb-7d04-4e82-b9dd-55510971691b
ForwardDiff.derivative((x) -> log(x*2 + max(x,2)), 3) # y = 2
# ╔═╡ 4c60c221-545c-4050-bfea-211048a36bce
md"""
Of course this also works for more than one dimensional functions:
"""
# ╔═╡ 2d1f128c-bcfa-4017-9690-01f3f75c3efa
ForwardDiff.gradient(rosen₁, [1.0,1.0]) # notice: EXACTLY zero.
# ╔═╡ b4ade3a3-668e-495b-9b7b-ad45fdf2655b
ForwardDiff.hessian(rosen₁, [1.0,1.0]) # again, no rounding error.
# ╔═╡ 9431caba-619d-4104-a267-914a9bcc78ef
md"""
## Introducing [`Optim.jl`](https://github.com/JuliaNLSolvers/Optim.jl)
* Multipurpose unconstrained optimization package
* provides 8 different algorithms with/without derivatives
* univariate optimization without derivatives
* It comes with the workhorse function `optimize`
"""
# ╔═╡ 58f32a65-1ef8-4d9a-a874-00f7df563b3c
md"""
let's optimize the Rosenbrock function *without* any gradient or Hessian:
"""
# ╔═╡ 9f238c4a-c557-4c57-a24c-6d221d592a18
md"""
now with both hessian and gradient! we choose another algorithm:
"""
# ╔═╡ 278cc047-83ee-49b1-a0e3-d2d779c1bc17
md"""
function library
"""
# ╔═╡ 5f3ad56f-5f8f-4b51-b45c-46c37eaeced4
begin
	# gradient of rosen₁(x) = (1-x[1])^2 + 5*(x[2]-x[1]^2)^2
	function g!(G, x)
		G[1] = -2.0 * (1.0 - x[1]) - 20.0 * (x[2] - x[1]^2) * x[1]
		G[2] = 10.0 * (x[2] - x[1]^2)
	end
	# hessian of rosen₁
	function h!(H, x)
		H[1, 1] = 2.0 - 20.0 * x[2] + 60.0 * x[1]^2
		H[1, 2] = -20.0 * x[1]
		H[2, 1] = -20.0 * x[1]
		H[2, 2] = 10.0
	end
end
# ╔═╡ f061e908-0687-4375-84e1-386a0dd48b39
o = optimize(rosen₁, g!, h!, zeros(2), Newton())
# ╔═╡ eb65a331-c977-4b0f-8add-873bd89095f4
Optim.minimizer(o)
# ╔═╡ d146a1e2-8067-4e25-b0cd-2a041162acb9
function minmax()
	v = collect(range(-2, stop = 2, length = 30)) # values
	mini = [x^2 + y^2 for x in v, y in v]
	maxi = -mini # max is just negative min
	saddle = [x^2 + y^3 for x in v, y in v]
	Dict(:x => v, :min => mini, :max => maxi, :saddle => saddle)
end
# ╔═╡ 3722538e-76e9-4bab-bfa9-57eff72802b7
function mmplotter(s::Symbol; kws...)
	d = minmax()
	surface(d[:x],d[:x],d[s],title="$s",fillalpha=0.8,leg=false,fillcolor=:heat; kws...)
end
# ╔═╡ b059cb44-349a-48b5-a96e-62c4835fde10
mmplotter(:max)
# ╔═╡ 5b925811-6255-4e2e-b691-40869d65d6df
mmplotter(:min)
# ╔═╡ a88b6949-4b4a-4f5a-a9a2-c6978cd0f758
mmplotter(:saddle,camera = (30,50))
# ╔═╡ f368672a-5c78-4d2a-aea9-f2a2c1ee0a54
info(text) = Markdown.MD(Markdown.Admonition("info", "Info", [text]));
# ╔═╡ 63703f51-bf0a-42c1-b981-3191d88b4901
warning(text) = Markdown.MD(Markdown.Admonition("warning", "Warning", [text]));
# ╔═╡ fcc24d08-bb9a-482f-987e-e64184c8d6f2
warning(md"Keep in mind that there may be other (better!) solutions outside of your interval of attention.")
# ╔═╡ d4c22f7b-31f5-4f41-8731-2f6189d231b4
function rosendata(f::Function; npoints = 30)
	x = y = range(-2,stop = 2, length = npoints) # x and y axis
	rosenvals = [f(ix,iy) for ix in x, iy in y] # f evaluations
	(x,y,rosenvals)
end
# ╔═╡ 76a613f2-482f-4a4d-8236-debee05bef1b
function rosenplotter(f::Function)
	x,y,vals = rosendata(f) # get the data
	# plotting
	surface(x,y,vals, fillcolor = :thermal, colorbar = false,
		alpha = 0.9, xlab = "x", ylab = "y", zlab = "z", zlim = (0,180))
end
# ╔═╡ 3cf9be4d-fa76-4264-b9b6-ff66bcf5db0e
rosenplotter(rosen₃)
# ╔═╡ dc21cc4b-aedd-42d7-b2a8-f36dfecee6f4
rosenplotter(rosenkw)
# ╔═╡ 7fcebc5a-a8c7-47d8-90b0-7ee8cd579585
rosenplotter( (x,y) -> rosenkw(x,y, a=1.2, b=2 ) ) # notice the `,` when calling
# ╔═╡ ba891e20-db23-4b03-9495-19c19df940d3
rosenplotter( (x,y) -> rosenkw(x,y, a=a, b=b ))
# ╔═╡ 12629919-26d3-4434-9c23-9778364fe71a
let
	x,y,z = rosendata(rosenkw, npoints = 100) # default a,b
	contour(x,y,z, fill = false, color = :deep, levels = [collect(0:0.2:175)...])
	scatter!([1.0],[1.0], m=:c, c=:red, label = "(1,1)")
end
# ╔═╡ b1c207b7-9d70-453c-b554-1c91f59ada0a
let
	x,y,z = rosendata(rosenkw, npoints = 100) # default a,b
	loglevels = exp.(range(log(0.05), stop = log(175.0), length = 100))
	contour(x,y,z, fill = false, color = :viridis, levels = loglevels)
	scatter!([1.0],[1.0], m=:c, c=:red, label = "(1,1)")
end
# ╔═╡ 33e3b11c-b1b4-4c64-b742-734ebd06926e
danger(text) = Markdown.MD(Markdown.Admonition("danger", "Danger", [text]));
# ╔═╡ ca7d694b-182a-443d-b47d-1bfe4ed8039f
danger(md"""
You should **not** normally attempt to write a numerical optimizer yourself. Entire generations of applied mathematicians and other numerical pros have worked on these topics before you, so you should use their work:
1. Any optimizer you could come up with is probably going to perform below par, and be highly likely to contain mistakes.
2. Don't reinvent the wheel.
That said, it's very important that we understand some basics about the main algorithms, because your task is **to choose from the wide array of available ones**.""")
# ╔═╡ 2e3243dc-f489-4117-82f8-7d05f5188429
bigbreak = html"