Reading notes on the first 8 chapters of *The Art and Theory of Dynamic Programming*.

• Dynamic programming cannot be passively taught; it must be actively learned through considerable practice solving problems on one's own.

• Why dynamic programming is more art than science we do not know. But common sense, not mathematical manipulation, is what it takes to be successful in a course in dynamic programming.

• Goals:
• Primary: identify state and stage variables, and define and characterize the optimal value function.
• Secondary: use DP analytically to establish the structure of the optimal solution and the necessary conditions.
• Final: special techniques.
• Chapters
• 1-8: deterministic
• 9-14: stochastic
• 15: learning related
• When used in undergraduate classes: 1-4, 6(1-4), 9-11, 13, 15(1-5).

# Elementary Path Problems

## Introduction

• Definition: optimization procedure that is particularly applicable to problems requiring a sequence of interrelated decisions.

• Sequence of decisions to sequence of situations, to optimize some measure of value.

## A Simple Path Problem

• Problem: one-way street, Manhattan-style city, from A to B.

## The Dynamic-Programming Solution

• Define sub-problems, solve the original problem by using the results from sub-problems.

• So we can solve all the sub-problems recursively in some order.

## Terminology

• Define:
• optimal value function: the rule that assigns a value to each subproblem; its arguments identify the subproblem.
• optimal policy function: the rule that associates the best first decision with each subproblem.
• recurrence relation: a formula or set of formulas relating various values of the OVF.
• boundary conditions.
• Procedure:
• To solve a problem by dynamic programming, we choose the arguments of the optimal value function and define that function to allow the use of the principle of optimality to write a recurrence relation.
• Starting with the boundary conditions, we then use the recurrence relation to determine concurrently the optimal value and policy functions.
• When the optimal value and decision are known for the value of the argument that represents the original whole problem, the solution is completed and the best path can be traced out using the optimal policy function alone.
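The procedure above can be sketched on a small made-up grid (the arc costs below are assumed, not the book's): the OVF `f(i, j)` is the cheapest cost from intersection `(i, j)` to B, the boundary condition sits at B, and once the recurrence has filled in both functions, the policy function alone traces the best path.

```python
# Minimal sketch: move only right or down from A (top-left) to B
# (bottom-right), minimizing total arc cost. Costs are made-up data.

# right[i][j] = cost of the arc leaving (i, j) to the right
# down[i][j]  = cost of the arc leaving row i of column j downward
right = [[3, 1], [2, 4], [5, 1]]   # 3 rows, 2 rightward arcs each
down  = [[2, 6, 4], [3, 2, 5]]     # 2 downward layers, 3 columns each

N = 3  # grid has N x N intersections
INF = float("inf")
f = [[INF] * N for _ in range(N)]        # optimal value function
policy = [[None] * N for _ in range(N)]  # optimal policy function
f[N - 1][N - 1] = 0.0                    # boundary condition at B

# Recurrence: f(i, j) = min(right cost + f(i, j+1), down cost + f(i+1, j))
for i in reversed(range(N)):
    for j in reversed(range(N)):
        if i == N - 1 and j == N - 1:
            continue
        if j + 1 < N and right[i][j] + f[i][j + 1] < f[i][j]:
            f[i][j] = right[i][j] + f[i][j + 1]
            policy[i][j] = "R"
        if i + 1 < N and down[i][j] + f[i + 1][j] < f[i][j]:
            f[i][j] = down[i][j] + f[i + 1][j]
            policy[i][j] = "D"

# Trace the best path from A using the policy function alone.
i = j = 0
path = []
while (i, j) != (N - 1, N - 1):
    path.append(policy[i][j])
    if policy[i][j] == "R":
        j += 1
    else:
        i += 1
print(f[0][0], "".join(path))
```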

## Computational Efficiency

• The time complexity of the previous problem is `O(N^2)`.

## Forward Dynamic Programming

• We can reverse the roles of the start and end points to obtain another family of subproblems. This is called the forward dynamic-programming procedure.

## A More Complicated Example

• Problem: turning along the path incurs an additional cost.

## Solution of the Example

• Solution: add one argument (the direction of travel) to the optimal value function.
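A sketch of the augmented formulation, with an assumed turn penalty `t` and a uniform step cost (both made-up numbers): the OVF gains a direction argument, `f(i, j, d)` = cheapest cost from `(i, j)` to B given that the last move was in direction `d`.

```python
from functools import lru_cache

INF = float("inf")
t = 2.0     # assumed extra cost for each 90-degree turn
step = 1.0  # assumed uniform arc cost
N = 3       # N x N grid, B at the bottom-right corner

@lru_cache(maxsize=None)
def f(i, j, d):
    """Cheapest cost from (i, j) to B, given the last move was d ('R' or 'D')."""
    if (i, j) == (N - 1, N - 1):
        return 0.0
    best = INF
    if j + 1 < N:  # move right; pay the turn penalty if we were going down
        best = min(best, step + (t if d == "D" else 0.0) + f(i, j + 1, "R"))
    if i + 1 < N:  # move down; pay the turn penalty if we were going right
        best = min(best, step + (t if d == "R" else 0.0) + f(i + 1, j, "D"))
    return best

# At A no move has been made yet, so neither first direction is penalized.
print(min(f(0, 0, "R"), f(0, 0, "D")))
```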

## The Consultant Question

• The art of dynamic-programming formulation involves the proper choice of the arguments for the optimal value function.

• A good way to determine the right number of arguments for the OVF is to imagine "asking the consultant question."

## Stage and State

• In the OVF, the monotonic argument is called the stage variable; all the others are called state variables.

## The Doubling-Up Procedure

• Usually, the cost of a given step does not depend on the stage/time.

• The doubling-up procedure: the results for stage `M+N` can be computed by combining the results for stages `M` and `N`.
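A sketch of doubling-up with a made-up, stage-invariant step-cost matrix: if `f_M[u][v]` is the cheapest `M`-step cost from state `u` to state `v`, then `f_{M+N}[u][v] = min over w of f_M[u][w] + f_N[w][v]`, so the 8-stage answer needs only three min-plus "squarings" of the 1-stage table instead of seven single-stage passes.

```python
INF = float("inf")

def combine(fm, fn):
    """Min-plus product: cheapest (M+N)-step costs from M-step and N-step tables."""
    n = len(fm)
    return [[min(fm[u][w] + fn[w][v] for w in range(n)) for v in range(n)]
            for u in range(n)]

f1 = [[INF, 4, 1],   # one-step costs between 3 states (assumed data)
      [2, INF, 5],
      [INF, 3, INF]]

f2 = combine(f1, f1)  # cheapest 2-step costs
f4 = combine(f2, f2)  # cheapest 4-step costs
f8 = combine(f4, f4)  # cheapest 8-step costs: 3 combines instead of 7 stages
```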

# Equipment Replacement

## The Simplest Model

• Over `N` time periods, we start with a machine of age `y`. `c(i)` is the cost of operating a machine of age `i`; `p` is the price of a new machine; `t(i)` is the trade-in value of a machine of age `i`; `s(i)` is the salvage value of a machine of age `i` at the end of the `N` periods.

• The problem is to minimize the total cost during the next `N` time periods.

## Dynamic-Programming Formulation

• The optimal value function is straightforward and left as an exercise.
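One common version of the exercise (assumptions: a traded machine is immediately replaced by a new one operated for the period, and salvage income enters as negative cost at the horizon; all numbers below are made up):

```python
from functools import lru_cache

# f(i, y) = minimal net cost of periods i..N-1 given a machine of age y
#           at the start of period i
#   keep:    c(y) + f(i+1, y+1)
#   replace: p - t(y) + c(0) + f(i+1, 1)
#   f(N, y) = -s(y)   (salvage income at the horizon)
N = 5                          # planning horizon in periods
p = 10.0                       # price of a new machine (assumed numbers)
c = lambda y: 2.0 * y          # operating cost rises with age
t = lambda y: max(6.0 - y, 0)  # trade-in value falls with age
s = lambda y: max(5.0 - y, 0)  # salvage value at the horizon

@lru_cache(maxsize=None)
def f(i, y):
    if i == N:
        return -s(y)
    keep = c(y) + f(i + 1, y + 1)
    replace = p - t(y) + c(0) + f(i + 1, 1)
    return min(keep, replace)

print(f(0, 2))  # minimal net cost starting with an age-2 machine
```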

## Shortest-Path Representation of the Problem

• Using machine age as the y-axis and time as the x-axis, the problem can be represented as a shortest-path problem.

• TODO

## More Complex Equipment-Replacement Models

• Adds a third choice beyond "keep" and "replace": "overhaul".

• And other variants in exercises.

# Resource Allocation

## The Simplest Model

• Not really dynamic (i.e., it does not inherently require a sequence of interrelated decisions).

• `X` units of a resource are to be distributed among `N` activities. Each activity has a return function `r_i(x)`. How should the resource be allocated to maximize the total return?

## Dynamic-Programming Formulation

• To use dynamic programming, we treat the activities as stages.
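A sketch of the staging idea with made-up return tables and integer units of resource: `f_i(x)` = best return obtainable from the first `i` activities with `x` units available, with the recurrence `f_i(x) = max over a in 0..x of r_i(a) + f_{i-1}(x - a)` and boundary `f_0(x) = 0`.

```python
X = 5
returns = [              # returns[i][a] = r_i(a) for a units (assumed data)
    [0, 3, 5, 6, 6, 6],
    [0, 2, 5, 7, 8, 8],
    [0, 4, 5, 5, 5, 5],
]

f = [0] * (X + 1)        # f_0: no activities yield no return
for r in returns:        # add one activity (stage) at a time
    f = [max(r[a] + f[x - a] for a in range(x + 1)) for x in range(X + 1)]

print(f[X])              # best total return using all activities and X units
```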

## Miscellaneous Remarks

• This problem can be represented as a longest-path problem.

• Variants:
• forward DP procedure
• return function not monotonic
• doubling-up procedure when activities have the same return function
• two- or three-dimensional resource
• additional requirement: the total allocation for a subset of activities has a lower limit
• additional requirement: at most `M` of the allocations may be nonzero
• additional requirement: if `k` activities are nonzero, a cost `h(k)` is assessed.
• two kinds of return: money and public goodwill, maximize money while goodwill has a lower limit.

## Unspecified Initial Resources

• Two kinds of resource, each with a unit cost; given an initial amount of money, how should we buy resources and allocate them optimally?

• Variant: three kinds of resource.

## Lagrange Multipliers

• TODO: it's so brand new...

## Justification of the Procedure

• TODO: skipped

## Geometric Interpretation of the Procedure

• TODO: skipped

## More Than Two Constraints

• TODO: skipped

# The General Shortest-Path Problem

## Introduction

• Shortest path in a general directed graph.

## Acyclic Networks

• Graphs with no cycles.

• Topological order: the DP stages.

• Can find not only the shortest distance but also the longest.
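A sketch on an assumed acyclic graph: processing nodes in topological order makes each node a DP stage, and swapping `min` for `max` turns the shortest-path computation into a longest-path one.

```python
INF = float("inf")
# arcs[u] = list of (v, cost); node numbering is already a topological order
arcs = {0: [(1, 2), (2, 5)], 1: [(2, 1), (3, 6)], 2: [(3, 2)], 3: []}

def best_path(arcs, source, better, worst):
    """One pass in topological order; `better` is min or max."""
    f = {u: worst for u in arcs}
    f[source] = 0
    for u in sorted(arcs):            # topological order by construction
        for v, cost in arcs[u]:
            f[v] = better(f[v], f[u] + cost)
    return f

shortest = best_path(arcs, 0, min, INF)
longest = best_path(arcs, 0, max, -INF)
print(shortest[3], longest[3])
```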

## General Networks

• No natural ordering.

• Dijkstra's algorithm: path length serves as the DP order/stages.

• Bellman-Ford: after `k` iterations, the shortest paths using `k` or fewer arcs are determined.
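A sketch of that recurrence on an assumed graph: `f_k(v)` = length of the shortest path from the source to `v` using `k` or fewer arcs, with `f_0(source) = 0`.

```python
INF = float("inf")
n = 4
edges = [(0, 1, 4), (0, 2, 1), (2, 1, 2), (1, 3, 1), (2, 3, 7)]  # (u, v, cost)

f = [INF] * n
f[0] = 0                      # f_0: only the source is reachable with 0 arcs
for k in range(1, n):         # a shortest path uses at most n-1 arcs
    g = f[:]                  # f_{k-1} stays frozen while computing f_k
    for u, v, cost in edges:
        if f[u] + cost < g[v]:
            g[v] = f[u] + cost
    if g == f:                # no node updated: converged early
        break
    f = g
print(f)
```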

• Yen's algorithm:
• reversal: path `1-5-2-3-4` has two reversals, at `5` and `2`.
• the `i`th iteration finds the length of the shortest path from node `1` to each node `k` among all paths with `i-1` or fewer reversals.
• if the `i`th iteration updates no node, the procedure has converged; if convergence has not occurred by the `N`th iteration, a negative cycle exists.
• the most efficient of the three.
• All-pair shortest paths
• Floyd
• Dantzig: similar but longer to write
• Variants
• value of a path determined by its max/min arc
• value of a path is the product of its arc values
• travel time on an arc depends on the start time
• value of an arc depends on the previously traveled arc
• a Dijkstra-like algorithm connecting two disjoint subsets

# The Traveling-Salesman Problem

## Introduction

• Definition: a minimum-cost tour visiting all `N` vertices and returning to the start.

## Dynamic-Programming Formulation

• `f_i(j, S)` = the length of the shortest path from city `1` to city `j` via the set `S` of `i` intermediate cities.

• Complexity: `O(N^2 * 2^N)`.
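A sketch of that recurrence (the Held-Karp formulation) with a made-up 4-city distance matrix, cities numbered from `0` and the intermediate set `S` held as a bitmask:

```python
from functools import lru_cache

d = [[0, 2, 9, 10],       # d[u][v] = distance from city u to v (assumed data)
     [1, 0, 6, 4],
     [15, 7, 0, 8],
     [6, 3, 12, 0]]
N = len(d)

@lru_cache(maxsize=None)
def f(j, S):
    """Shortest path from city 0 to city j via exactly the intermediate set S."""
    if S == 0:
        return d[0][j]    # straight from the start city
    return min(f(k, S & ~(1 << k)) + d[k][j]
               for k in range(1, N) if S & (1 << k))

full = (1 << N) - 2       # bitmask of cities 1..N-1, excluding city 0
tour = min(f(j, full & ~(1 << j)) + d[j][0] for j in range(1, N))
print(tour)
```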

## A Doubling-Up Procedure for the Case of Symmetric Distances

• When the distance matrix is symmetric (assume `N` is even), we can compute all `f_i(j,S)` with `i <= (N-2)/2`, then combine pairs whose intermediate sets are complementary.

• Significantly reduced computing time.

## Other Versions of the Traveling-Salesman Problem

• Variant 1: without returning to city 1.

• Variant 2: visit other vertexes at least once, not exactly once.

• Exercises:
• different variants of arc values
• adding the requirement "city `i` must precede city `j`"
• using `O(2^N)` Dijkstra-like labels to solve the original problem

# Problems with Linear Dynamics and Quadratic Criteria

## Introduction

• Problems with a structure different from, but analogous to, the common structure of all linear-programming problems.

• Mathematical analysis leads to routine solution of problems with dozens of state variables.

• Newton-Raphson procedure to find the minimum/maximum of a function.

## A Linear Dynamics, Quadratic Criterion Model

• Linear dynamic system:
• `x(i)` state at stage `i`
• `y(i)` the decision
• `x(i+1)` = `g(i)x(i) + h(i)y(i)`
• `J` = `Σ[a(i)x^2(i)+c(i)y^2(i)] + lx^2(N)`
• Minimize `J`
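A standard derivation for this scalar model (the book's exact indexing may differ): guessing that the optimal value function is quadratic, `f_i(x) = k_i x^2`, closes the recurrence.

```latex
% Boundary condition fixes the last coefficient:
f_N(x) = l\,x^2 \quad\Rightarrow\quad k_N = l.
% One stage of the recurrence under the quadratic guess:
f_i(x) = \min_{y}\Bigl[a(i)\,x^2 + c(i)\,y^2
         + k_{i+1}\bigl(g(i)\,x + h(i)\,y\bigr)^2\Bigr].
% Setting the derivative with respect to y to zero gives a linear policy:
y^*(i) = -\,\frac{k_{i+1}\,g(i)\,h(i)}{c(i) + k_{i+1}\,h^2(i)}\;x,
% and substituting back confirms the quadratic form, with the backward recursion
k_i = a(i) + \frac{c(i)\,k_{i+1}\,g^2(i)}{c(i) + k_{i+1}\,h^2(i)}.
```

So the whole problem reduces to iterating a scalar recursion on the coefficients `k_i`, which is why such models solve routinely even with many state variables.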

# Discrete-Time Optimal-Control Problems

## Introduction

• The unbounded ("complete") knapsack problem
• Item `i`
• Weight of item: `w_i`
• Value of item: `v_i`
• Capacity `w`

## Algorithm 1

• Graph with capacities `0..w` as vertices.

• Each kind of item corresponds to one kind of arc.

• Find longest path in the graph.

## Algorithm 2

• Slightly improved version of algorithm 1: each solution has a unique representation, so no solution is evaluated multiple times.

## Algorithm 3

• Forward-looking procedure: uses the permanent (optimal) value of `f` at each capacity to update temporary values of `f` at larger capacities.
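A sketch of that forward-reaching pass on the unbounded knapsack with assumed item data: by the time the loop reaches capacity `x`, every update into `f[x]` came from a smaller capacity, so `f[x]` is already permanent and can safely push improvements downstream.

```python
items = [(2, 3), (3, 5), (4, 7)]  # (weight w_i, value v_i), assumed data
W = 10                            # total capacity

f = [0] * (W + 1)
for x in range(W + 1):            # f[x] is permanent when we reach it
    for w, v in items:
        if x + w <= W and f[x] + v > f[x + w]:
            f[x + w] = f[x] + v   # improve a temporary value downstream
print(f[W])
```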

## Algorithm 4

• Use breakpoints.