Exercise 5.45. By comparing the stack operations used by compiled code to the stack operations used by the evaluator for the same computation, we can determine the extent to which the compiler optimizes use of the stack, both in speed (reducing the total number of stack operations) and in space (reducing the maximum stack depth). Comparing this optimized stack use to the performance of a special-purpose machine for the same computation gives some indication of the quality of the compiler. a. Exercise 5.27 asked you to determine, as a function of n, the number of pushes and the maximum stack depth needed by the evaluator to compute n! using the recursive factorial procedure given above. Exercise 5.14 asked you to do the same measurements for the special-purpose factorial machine shown in figure 5.11. Now perform the same analysis using the compiled factorial procedure. Take the ratio of the number of pushes in the compiled version to the number of pushes in the interpreted version, and do the same for the maximum stack depth. Since the number of operations and the stack depth used to compute n! are linear in n, these ratios should approach constants as n becomes large. What are these constants? Similarly, find the ratios of the stack usage in the special-purpose machine to the usage in the interpreted version. Compare the ratios for special-purpose versus interpreted code to the ratios for compiled versus interpreted code. You should find that the special-purpose machine does much better than the compiled code, since the hand-tailored controller code should be much better than what is produced by our rudimentary general-purpose compiler. b. Can you suggest improvements to the compiler that would help it generate code that would come closer in performance to the hand-tailored version? ———————————————————————————————————————————————————————————————————————— a. From exercise 5.27, the recursive factorial in the ec-eval used 5n + 3 depth and 32n - 16 pushes. From exercise 5.14, "both the total pushes and the maximum stack depth are apparently equal to 2n - 2." With the compiled code: $ scheme --load load-eceval-compiler.scm 1 ]=> (load "ch5-compiler.scm") 1 ]=> (compile-and-go '(define (factorial n) (if (= n 1) 1 (* (factorial (- n 1)) n)))) (factorial 3) (total-pushes = 19 maximum-depth = 8) ;;; EC-Eval value: 6 (factorial 4) (total-pushes = 25 maximum-depth = 11) ;;; EC-Eval value: 24 pushes: 6n + 1 depth: 3n - 1 All together: depth pushes hand-written: 2n - 2 2n - 2 compiled: 3n - 1 6n + 1 ec-eval: 5n + 3 32n - 16 b. The handwritten factorial machine: (define fact-machine (make-machine '(continue n val) (list (list '= =) (list '- -) (list '* *)) '( (perform (op initialize-stack)) (assign continue (label fact-done)) fact-loop (test (op =) (reg n) (const 1)) (branch (label base-case)) (save continue) (save n) (assign n (op -) (reg n) (const 1)) (assign continue (label after-fact)) (goto (label fact-loop)) after-fact (restore n) (restore continue) (assign val (op *) (reg n) (reg val)) (goto (reg continue)) base-case (assign val (const 1)) (goto (reg continue)) fact-done (perform (op print-stack-statistics))))) The compiled version: 1 ]=> (pp (compile '(define (factorial n) (if (= n 1) 1 (* (factorial (- n 1)) n))) 'val 'next)) ((env) (val) ((assign val (op make-compiled-procedure) (label entry2) (reg env)) (goto (label after-lambda1)) ;; the factorial procedure entry point entry2 ;; compiled procedure overhead (assign env (op compiled-procedure-env) (reg proc)) (assign env (op extend-environment) (const (n)) (reg argl) (reg env)) (save continue) (save env) ;; here the "=" op could be open-coded ;; the next 18 lines correspond to a single (test (op =) (reg n) (const 1)) in the hand-written version (assign proc (op lookup-variable-value) (const =) (reg env)) (assign val (const 1)) (assign argl (op list) (reg val)) (assign val (op lookup-variable-value) (const n) (reg env)) (assign argl (op cons) (reg val) (reg argl)) (test (op primitive-procedure?) (reg proc)) (branch (label primitive-branch17)) compiled-branch16 (assign continue (label after-call15)) (assign val (op compiled-procedure-entry) (reg proc)) (goto (reg val)) primitive-branch17 (assign val (op apply-primitive-procedure) (reg proc) (reg argl)) after-call15 (restore env) (restore continue) (test (op false?) (reg val)) ;; the preceding line (test (op false?) (reg val)) is what finally actually makes the if-branching happen. ;; rather than evaluating the operator and operands, then evaluating the combination, then finally testing the truth-value of the result, a smarter compiler could determine that it has the value in a register already, and branch with a single test of the register's value against the constant as in the hand-written version. (branch (label false-branch4)) true-branch5 (assign val (const 1)) (goto (reg continue)) false-branch4 ;; the * and - operations here could be also be open-coded ;; this would also avoid saving proc on the stack (assign proc (op lookup-variable-value) (const *) (reg env)) (save continue) (save proc) (assign val (op lookup-variable-value) (const n) (reg env)) (assign argl (op list) (reg val)) (save argl) ;; rather than look up factorial here, the compiler could notice that it is bound in the immediate environment in which it appears and simply loop back to the entry point (assign proc (op lookup-variable-value) (const factorial) (reg env)) (save proc) (assign proc (op lookup-variable-value) (const -) (reg env)) (assign val (const 1)) (assign argl (op list) (reg val)) (assign val (op lookup-variable-value) (const n) (reg env)) (assign argl (op cons) (reg val) (reg argl)) (test (op primitive-procedure?) (reg proc)) (branch (label primitive-branch8)) compiled-branch7 (assign continue (label after-call6)) (assign val (op compiled-procedure-entry) (reg proc)) (goto (reg val)) primitive-branch8 (assign val (op apply-primitive-procedure) (reg proc) (reg argl)) after-call6 (assign argl (op list) (reg val)) (restore proc) (test (op primitive-procedure?) (reg proc)) (branch (label primitive-branch11)) compiled-branch10 (assign continue (label after-call9)) (assign val (op compiled-procedure-entry) (reg proc)) (goto (reg val)) primitive-branch11 (assign val (op apply-primitive-procedure) (reg proc) (reg argl)) after-call9 (restore argl) (assign argl (op cons) (reg val) (reg argl)) (restore proc) (restore continue) (test (op primitive-procedure?) (reg proc)) (branch (label primitive-branch14)) compiled-branch13 (assign val (op compiled-procedure-entry) (reg proc)) (goto (reg val)) primitive-branch14 (assign val (op apply-primitive-procedure) (reg proc) (reg argl)) (goto (reg continue)) after-call12 after-if3 after-lambda1 (perform (op define-variable!) (const factorial) (reg val) (reg env)) (assign val (const ok)))) So in this case, being smarter about environment lookups, register use, and order of operations should make the compiler produce code as tight as what can be written by hand. Because of the semantics of Scheme, not all of these optimizations may always be possible (for example the open-coding optimization made previously is not actually safe if set! is used, as mentioned in the text).