#+title: Chapter 1 --- Building Abstractions with Procedures #+date: <2021-07-04 Sun 09:15> #+author: thebesttv * conjure v. conjure v. / 'kʌndʒə(r) / 1. =~ / ~ sth=: make sth appear, disappear, or change as if by magic - Her grandfather taught her to *conjure*. 她的祖父教她*变魔术*。 - He could conjure coins *from behind* people's ears. 他可以从人们的耳朵后面变出硬币来。 - The magician conjured a rabbit out of his hat. - Thirteen years ago she found herself having to *conjure a career from thin air*. 13年前,她认识到自己得白手起家创造出一番事业来。 - They managed to *conjure a victory*. 他们出人意料地取得了胜利。 - He has conjured victories from worse situations than this. - He conjured a delicious meal out of a few leftovers. 他居然用几样吃剩的东西做出了可口的一餐。 - =~ up sth= - Every day a different chef will be conjuring up delicious dishes in the restaurant. 每天,饭店里会有一位不同的大厨像变戏法似的奉上可口的菜肴。 - He _conjured up a smile_ and reached out to squeeze her hand. 他马上露出笑脸,伸手去捏她的手。 - Dieting always seems to _conjure up images of_ endless salads. - Somehow we have to conjure up another $10,000. - Anne conjured up a most delicious home-made hot pot. 安妮魔术般地变出了一壶烫好的极醇美的自酿酒。 2. recall (an image); cause to feel or think of sth - she had forgotten how to conjure up the image of her mother's face. 她已想不起她母亲的脸长得什么样了。 - a special tune that conjures up a particular time and place. 令人想起特别时刻及场合的专用曲调。 3. call upon (a spirit or ghost) to appear by means of a magic ritual. 施魔法召唤(神灵,鬼魂)。 - they hoped to conjure up the spirit of their dead friend. 他们希望能施魔法召来已逝朋友的灵魂。 * Applicative order vs. normal order In /applicative order/, the interpreter first evaluates the operator & operands and then applies the resulting procedure to the resulting arguments. But in /normal order/, the interpreter would not evaluate the operands _until their values were needed_. Moreover, the value is evaluated every time it is needed, instead of once on entrance. For example, given some procedures: #+begin_src scheme (define (square x) (* x x)) (define (sum-of-squares a b) (+ (square a) (square b))) (define (f a) (sum-of-squares (+ a 1) (* a 2))) #+end_src and evaluate the expression =(f 5)=. In applicative order, the interpreter "evaluate the arguments and then apply": #+begin_src scheme (f 5) ; replace in body: a <- 5 (sum-of-squares (+ 5 1) (* 5 2)) ; evaluate operator & operands (sum-of-squares 6 10) ; replace in body: a <- 6, b <- 10 (+ (square 6) (square 10)) ; evaluate operator & operands (+ (* 6 6) (* 10 10)) (+ 36 100) 136 #+end_src In normal order, the interpreter "fully expand and then reduce": #+begin_src scheme (f 5) (sum-of-squares (+ 5 1) (* 5 2)) (+ (square (+ 5 1)) (square (* 5 2))) (+ (* (+ 5 1) (+ 5 1)) (* (* 5 2) (* 5 2))) ; fully expanded, start reducing (+ (* 6 6) (* 10 10)) (+ 36 100) 136 #+end_src In applicative order, each operator (and operand) is evaluated only once. The first operand of =sum-of-squares= (=(+ 5 1)=) is first evaluated to 6 and passed into the procedure body. However, in normal order, that operand is directly passed into the body of =sum-of-squares=, and duplicated in the body of =square=. As a result, =(+ 5 1)= is evaluated twice in normal order. Although normal order evaluation may introduce duplicated evaluation, its "evaluate until needed" nature allows *skipping the evaluation* of some of the arguments, for example: #+begin_src scheme (define (ite condition then else) ; if-then-else (if condition then else)) (ite (> x y) (heavy-computation) 0) #+end_src Even though =ite= is only a normal function (not a special form), when the condition =(> x y)= is not met, the =(heavy-computation)= is not evaluated and 0 is returned. For more information on different kinds of evaluation strategies, see this [[https://en.wikipedia.org/wiki/Evaluation_strategy][wikipedia article]]. The article [[https://sookocheff.com/post/fp/evaluating-lambda-expressions/][Normal, Applicative and Lazy Evaluation]] contains a more formal definition of normal-order evaluation. # TODO: add more content from ch3 & ch4, see footnote 16 * Functions vs special forms ** =define= ** =if= & =cond= In Scheme, =#f= is interpreted as false, and =#t= (or any other value other than =#f=) is true. #+begin_src scheme (> 1 2) ; #f (< 1 2) ; #t #+end_src #+begin_src scheme (define (abs x) (cond [(> x 0) x] [(= x 0) 0] [(< x 0) (- x)])) (define (abs x) (cond [(< x 0) (- x)] [else x])) #+end_src =else= is a special symbol that can be used as the final predicate of =cond=. In fact, any value other than =#f= can be used in place of =else=. ** =and= & =or= =and= and =or= are special forms, as not all operands are necessarily evaluated. However, =not= is an ordinary procedure, as it only takes and evaluates one operand. =and= returns the value of the first expression that evaluates to a false value, or the value of the last expression, if all expressions evaluate to true values. #+begin_src scheme (and (= 2 2) (> 2 1)) ; #t (and (= 2 2) (< 2 1)) ; #f (and 1 2 'c '(f g)) ; (f g) (and) ; #t #+end_src Similarly, =or= returns the first expression that evaluate to a true value, or the value of the last expression (=#f=), if all expressions evaluate to false values. #+begin_src scheme (or (= 2 2) (> 2 1)) ; #t (or (= 2 2) (< 2 1)) ; #t (or #f #f #f) ; #f (or 123 (/ 3 0)) ; 123 #+end_src Note that =(/ 3 0)= is not evaluated. * Find the smallest divisor #+begin_src scheme (define (smallest-divisor n) ; find the smallest divisior of n (define (find-divisior n test-divisior) (cond [(> (square test-divisior) n) n] [(divides? test-divisior n) test-divisior] [else (find-divisior n (+ test-divisior 1))])) (define (divides? a b) (= (remainder b a) 0)) (find-divisior n 2)) #+end_src * Accumulate, sum, prod =accumulate= takes an initial value (=null-value=) and a way to combine the running total with the new term (=combiner=). #+begin_src scheme ;;; recursive (define (accumulate combiner null-value term a next b) (if (> a b) null-value (combiner (accumulate combiner null-value term (next a) next b) (term a)))) ;;; iterative (define (accumulate combiner null-value term a next b) (define (iter a total) (if (> a b) total (iter (next a) (combiner total (term a))))) (iter a null-value)) #+end_src Both =sum= and =prod= can be defined in terms of =accumulate=. #+begin_src scheme (define (sum term a next b) (accumulate + 0 term a next b)) (define (prod term a next b) (accumulate * 1 term a next b)) #+end_src #+begin_src scheme (define (sum-cubes a b) (sum cube a 1+ b)) (define (sum-integers a b) (sum identity a 1+ b)) (define (pi-sum a b) (sum (lambda (x) (/ 1.0 (* x (+ x 2)))) a (lambda (x) (+ x 4)) b)) (define (fact n) (prod (lambda (x) x) 1 1+ n)) #+end_src * Local variables Functions take parameters, which can be used as local variables. Take for example the function: \[ f(x, y) = x(1+xy)^2 + y(1-y) + (1+xy)(1-y). \] Let $a = (1+xy)$, $b = (1-y)$, so $f(x, y) = x a^2 + y b + a b$. #+begin_src scheme (define (f x y) (define (f-helper a b) ; use parameters as local variables (+ (* x (square a)) (* y b) (* a b))) (f-helper (+ 1 (* x y)) ; a = 1 + xy (- 1 y))) ; b = 1 - y #+end_src The helper function is called only once, so it can be replaced with a lambda expression: #+begin_src scheme (define (f x y) ((lambda (a b) ; use lambda expression instead of named functions (+ (* x (square a)) (* y b) (* a b))) (+ 1 (* x y)) ; a = 1 + xy (- 1 y))) ; b = 1 - y #+end_src This is equivalent to using the =let= special form: #+begin_src scheme (define (f x y) (let ((a (+ 1 (* x y))) (b (- 1 y))) (+ (* x (square a)) (* y b) (* a b)))) #+end_src Another way to introduce local variable is using =define=. Local variables can be implemented as function parameters. #+begin_quote No new mechanism is required in the interpreter in order to provide local variables. A =let= expression is simply syntactic sugar for the underlying lambda application. #+end_quote Since =let= is only syntactic sugar, the local variables are calculated in the same way as function parameters, meaning: - They are computed in parallel, not in sequence. The expression #+begin_src scheme (let ([a 10] [b (+ a a)]) b) #+end_src results in error "Unbound variable: =a=". =b= cannot use the value of the preceding variable (parameter) =a=. - The symbols used in their computation are from the outer scope. As a result, the expression #+begin_src scheme (define x 2) ; [1] (let ([x 3] ; [2] [y (+ x 2)]) (* x y)) #+end_src has 12 as the result. The value of =y= is computed using the global variable =x= in [1] (outer scope), not [2]. * Fixed-point & Newton's method ** Fixed-point A number $x$ is called a /fixed point/ of a function $f$ if $f(x) = x$. For some function $f$ we can locate a fixed point by beginning with an initial guess and applying $f$ repeatedly, $$ f(x), \quad f(f(x)), \quad f(f(f(x))), \quad \ldots $$ until the value does not change very much. #+begin_src scheme (define (fixed-point f initial-guess) (define tolerance 0.001) (define (close-enough? a b) (< [abs (- a b)] tolerance)) (define (try guess) (let ([next (f guess)]) (if (close-enough? guess next) next (try next)))) (try initial-guess)) #+end_src To find $\sqrt{x}$ means finding the fixed point of the function $f(y) = x/y$. However, consider an initial guess $y_1$. The next guess is $y_2 = f(y_1) = x / y_1$, and the next one $y_3 = f(y_2) = x / (x / y_1) = y_1$. The guesses will oscillate between $y_1$ and $y_2$, never converging. Applying the technique of /average damping/ can solve this problem. Here =average-damp= is a procedure that takes a procedure =f= and returns another procedure---the average damped version of =f=. #+begin_src scheme (define (average x y) (/ (+ x y) 2)) (define (average-damp f) (lambda (x) (average x (f x)))) (define (sqrt x) (fixed-point (average-damp (lambda (y) (/ x y))) 1.0)) (sqrt 9) ; 3.000000001396984 #+end_src Notice that cube root is the fixed point of the function $f(y) = x / y^2$: #+begin_src scheme (define (cube-root x) (fixed-point (average-damp (lambda (y) (/ x (square y)))) 1.0)) (cube-root 27) ; 2.9998228753561564 #+end_src ** Newton's method If $g(x)$ is a differentiable function, then a solution of $g(x)=0$ is a fixed point of the function $f(x)$, where $$ f(x) = x - \frac{g(x)}{g'(x)}. $$ First we expression the idea of a derivative: $$ g'(x) = \frac{g(x + dx) - g(x)}{dx}. $$ Just like average damping, =deriv= transforms a function into another function: #+begin_src scheme (define (deriv g) (define dx 0.001) (lambda (x) (/ (- (g (+ x dx)) (g x)) dx))) #+end_src With the aid of =deriv=, we can express Newton's method as a fixed-point process. Here =newton-transform= converts the problem of finding $g(x) = 0$ to finding $f(x) = x$. #+begin_src scheme (define (newton-transform g) (lambda (x) (- x (/ (g x) ((deriv g) x))))) (define (newtons-method g guess) (fixed-point (newton-transform g) guess)) #+end_src Thus we can calculate $\sqrt{x}$: #+begin_src scheme (define (sqrt x) (newtons-method (lambda (y) (- (square y) x)) 1.0)) (sqrt 9) ; 3.0000000174227237 #+end_src Note that the resulting lambda expression in =newton-transform= calculates the derivative of $g$ *every time* it is called, since it does not save the result of =(deriv g)=. This is very inefficient. Using a local variable =dg= to hold the result so =deriv= is called only once: #+begin_src scheme (define (newton-transform g) (let ([dg (deriv g)]) (lambda (x) (- x (/ (g x) (dg x)))))) #+end_src ** =fixed-point-of-transform= We calculated =sqrt= using both the fixed point search and Newton's method: #+begin_src scheme ;;; fixed point (define (sqrt x) ; [1] (fixed-point (average-damp (lambda (y) (/ x y))) 1.0)) ;;; Newton's method (define (sqrt x) ; [2] (newtons-method (lambda (y) (- (square y) x)) 1.0)) #+end_src The latter [2] expands to: #+begin_src scheme (define (sqrt x) ; [3] (fixed-point (newton-transform (lambda (y) (- (square y) x))) 1.0)) #+end_src Both [1] and [3] have the same pattern---each method begins with a function and finds a fixed point of _some transformation of the function_ (=average-damp= or =newton-transform=). We can express this general idea itself as a procedure: #+begin_src scheme (define (fixed-point-of-transform g transform guess) (fixed-point (transform g) guess)) #+end_src Then the two methods become: #+begin_src scheme (define (sqrt x) (fixed-point-of-transform (lambda (y) (/ x y)) average-damp 1.0)) (define (sqrt x) (fixed-point-of-transform (lambda (y) (- (square y) x)) newton-transform 1.0)) #+end_src * Compose Let $f$ and $g$ be two one-argument functions. The composition $f$ after $g$ is $f(g(x))$: #+begin_src scheme (define (compose f g) (lambda (x) (f (g x)))) ((compose square 1+) 6) ; => (square (1+ 6)) => 49 #+end_src Applying a function $f$ $n$ times yields $$ f(f(\cdots f(x) \cdots)). $$ @@comment: the period must be inside the equation@@ We can either return $f$ when $n=1$, or return an identity function when $n=0$. The latter produces the correct result even when $n=0$. #+begin_src scheme (define (repeated f n) (if (= n 1) f (compose f (repeated f (- n 1))))) (define (repeated f n) (if (= n 0) identity (compose f (repeated f (- n 1))))) ((repeated 1+ 10) 5) ; 15 #+end_src Alternatively, there's an iterative implementation: #+begin_src scheme (define (repeated f n) (define (iter n res) (if (= n 0) res (iter (- n 1) (compose f res)))) (iter n identity)) ((repeated 1+ 10) 5) ; 15 #+end_src * =lambda= for recursion How to write a recursive function using only =lambda=? The main problem, of course, is how can a lambda expression call itself when it doesn't have a name for itself? [[https://www.scheme.com/tspl4/further.html#g55][Section 3.2]] of /The Scheme Programming Language/ gives the answer: simply pass the lambda procedure to itself: #+begin_src scheme (let ([sum (lambda (sum l) (if (null? l) 0 (+ (car l) (sum sum (cdr l)))))]) (sum sum '(1 2 3 4))) ; 10 #+end_src The =let= expression is essentially another =lambda=, here we give it a better name: #+begin_src scheme ((lambda (sum) (sum sum '(1 2 3 4))) (lambda (self l) (if (null? l) 0 (+ (car l) (self self (cdr l)))))) ; 10 #+end_src [[https://stackoverflow.com/a/66166000/11938767][Here]] is a factorial using two =lambda=s, only slight difference: #+begin_src scheme ((lambda (f x) (f f x)) (lambda (self n) (if (= n 0) 1 (* n (self self (- n 1))))) 5) ; 120 #+end_src [[https://stackoverflow.com/q/7719004/11938767][This]] stack overflow question uses three =lambda=s. The answers below has an [[https://gist.github.com/z5h/238891][explanation]] covering Y combinator. #+begin_src scheme (((lambda (x) (x x)) ; [1] (lambda (fact-gen) ; [2] (lambda (n) ; [3] (if (zero? n) 1 (* n ((fact-gen fact-gen) (- n 1))))))) 5) ; 120 #+end_src [3] is the factorial function. If [3] were given the name =fact=, then =(fact-gen fact-gen)= is just =fact= itself. [2] is a generator function whose parameter (=fact-gen=) is also a generator function (so [2] can use itself as parameter) and returns the factorial function. [1] takes a generator function ([2]) and applies the function to itself, thereby obtaining as return value the factorial function. [[https://stackoverflow.com/a/54359987/11938767][This]] answer uses /named =let=/: #+begin_src scheme ((lambda (n) (let sub ((i n) (z 1)) (if (zero? i) z (sub (- i 1) (* z i)) ))) 5 ) ; 120 #+end_src * Exercises ** Ex 1.3 --- the smallest of the three #+begin_quote Define a procedure that takes three numbers as arguments and returns the sum of the squares of the two larger numbers. #+end_quote When looking for the smallest value, the predicate _smaller or *equal to*_ (=<==) must be used. If only =<= is used, in evaluating =(f 2 2 3)=, the first two =and= condition will evaluate to false. The result would be =(sum-of-squares 2 2)=, which is very wrong. #+begin_src scheme (define (sum-of-squares a b) (+ (* a a) (* b b))) (define (f a b c) (cond [(and (<= a b) (<= a c)) (sum-of-squares b c)] [(and (<= b a) (<= b c)) (sum-of-squares a c)] [else (sum-of-squares a b)])) (f 2 2 3) ; 13 #+end_src In order to find the two larger ones out of three, a simpler solution: #+begin_src scheme (define (f a b c) (sum-of-squares (max a b) (max (min a b) c))) #+end_src For the first two numbers (=a=, =b=), at least one of them is in the result. So the bigger one (=(max a b)=) must be in the result. As for the smaller one (=(min a b)=), it needs to be compared with =c=. ** Ex 1.5 --- applicative-order & normal-order #+begin_quote Ben Bitdiddle has invented a test to determine whether the interpreter he is faced with is using applicative-order evaluation or normal-order evaluation. He defines the following two procedures: #+begin_src scheme (define (p) (p)) (define (test x y) (if (= x 0) 0 y)) #+end_src Then he evaluates the expression #+begin_src scheme (test 0 (p)) #+end_src What behavior will Ben observe with an interpreter that uses applicative-order evaluation? What behavior will he observe with an interpreter that uses normal-order evaluation? Explain your answer. (Assume that the evaluation rule for the special form =if= is the same whether the interpreter is using normal or applicative order: The predicate expression is evaluated first, and the result determines whether to evaluate the consequent or the alternative expression.) #+end_quote Using the substitution model, =(p)= infinitely expands to itself. Evaluating =(p)= will lead to an endless recursion. In applicative-order evaluation, the interpreter first evaluates all its operands, including =(p)=. So the whole expression will not evaluate to any result. However, in normal-order evaluation, not all operands will necessarily be evaluated (not until they are actually needed). The expression is first expanded into ~(if (= 0 0) 0 (p))~. Since the predicate is true, the =(p)= on the false branch is never needed. The whole expression evaluates to =0=. ** Ex 1.16 --- iterative fast exponentiation #+begin_quote Design a procedure that evolves an iterative exponentiation process that uses successive squaring and uses a logarithmic number of steps, as does =fast-expt=. (Hint: Using the observation that $(b^{n/2})^2 = (b^2)^{n/2}$, keep, along with the exponent $n$ and the base $b$, an additional state variable $a$, and define the state transformation in such a way that the product $a b^n$ is unchanged from state to state. At the beginning of the process a is taken to be $1$, and the answer is given by the value of $a$ at the end of the process. In general, the technique of defining an invariant quantity that remains unchanged from state to state is a powerful way to think about the design of iterative algorithms.) #+end_quote Original recursive code to compute $b^n$: #+begin_src scheme (define (fast-expt b n) (cond [(= n 0) 1] [(even? n) (square (fast-expt b (/ n 2)))] [else (* b (fast-expt b (- n 1)))])) #+end_src Iterative code: #+begin_src scheme (define (fast-expt b n) (define (iter b n prod) (cond [(= n 0) prod] [(even? n) (iter (square b) (/ n 2) prod)] [else (iter b (- n 1) (* prod b))])) (iter b n 1)) ;; the same thing: (define (fast-expt b n) (define (iter a b n) ; a * b^n (cond [(= n 0) a] [(even? n) (iter a (square b) (/ n 2))] [else (iter (* a b) b (- n 1))])) (iter 1 b n)) #+end_src ** Ex 1.44 --- order of application #+begin_quote The idea of smoothing a function is an important concept in signal processing. If $f$ is a function and $dx$ is some small number, then the smoothed version of $f$ is the function whose value at a point $x$ is the average of $f(x-dx)$, $f(x)$, and $f(x+dx)$. Write a procedure =smooth= that takes as input a procedure that computes $f$ and returns a procedure that computes the smoothed $f$. It is sometimes valuable to *repeatedly smooth a function* (that is, smooth the smoothed function, and so on) to obtain the n-fold smoothed function. Show how to generate the n-fold smoothed function of any given function using =smooth= and =repeated= from Exercise 1.43. #+end_quote The definition of =smooth= is quite easy: #+begin_src scheme (define (smooth f) (define dx 0.01) (define (average a b c) (/ (+ a b c) 3)) (lambda (x) (average (f (- x dx)) (f x) (f (+ x dx))))) ((smooth square) 2) ; 4.000066666666666 ((smooth (smooth square)) 2) ; 4.000133333333333 ((smooth (smooth (smooth square))) 2) ; 4.0001999999999995 #+end_src However, the repeated application of =smooth= should be written as: #+begin_src scheme (define (n-fold-smooth f n) ((repeated smooth n) f)) ((n-fold-smooth square 1) 2) ; 4.000066666666666 ((n-fold-smooth square 2) 2) ; 4.000133333333333 ((n-fold-smooth square 3) 2) ; 4.0001999999999995 #+end_src Not as: #+begin_src scheme (define (wrong f n) (repeated (smooth f) n)) ((wrong square 1) 2) ; 4.000066666666666 ((wrong square 2) 2) ; 16.00060000444444 ((wrong square 3) 2) ; 256.01926716889415 #+end_src The =wrong= implementation actually expands to: #+begin_src scheme ((smooth square) ((smooth square) 2)) ; 16.00060000444444 ((smooth square) ((smooth square) ((smooth square) 2))) ; 256.01926716889415 #+end_src ** Ex 1.45 --- n-th root Comput $\sqrt[n]{x}$ by calculating the fixed point of the function $x / y^{n-1}$ average damped $\lfloor \log_2 n \rfloor$ times. #+begin_src scheme (define (nth-root x n) (define (log2 n) (/ (log n) (log 2))) (let ([c (inexact->exact (floor (log2 n)))]) (fixed-point ((repeated average-damp c) (lambda (y) (/ x (expt y (- n 1))))) 1.0))) #+end_src ** Ex 1.46 --- iterative improvement #+begin_quote Several of the numerical methods described in this chapter are instances of an extremely general computational strategy known as /iterative improvement/. Iterative improvement says that, to compute something, we start with an initial guess for the answer, test if the guess is good enough, and otherwise improve the guess and continue the process using the improved guess as the new guess. Write a procedure =iterative-improve= that takes two procedures as arguments: a method for telling whether a guess is good enough and a method for improving a guess. =iterative-improve= should return as its value a *procedure* that takes a guess as argument and keeps improving the guess until it is good enough. Rewrite the =sqrt= procedure of Section 1.1.7 and the =fixed-point= procedure of Section 1.3.3 in terms of =iterative-improve=. #+end_quote #+begin_src scheme (define (iterative-improve good-enouth? improve) (define (try guess) (if (good-enouth? guess) guess (try (improve guess)))) try) (define (fixed-point f first-guess) ((iterative-improve (lambda (guess) (< [abs (- guess (f guess))] 0.00001)) f) first-guess)) (define (average-damp f) (lambda (x) (/ (+ x (f x)) 2))) (define (sqrt x) (fixed-point (average-damp (lambda (y) (/ x y))) 1.0)) (sqrt 9) ; 3.000000001396984 #+end_src