Compilers

Hoopoe

Source: Shantanu Kuveskar [CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0)

Due on Wednesday, May 1st at 11:59 PM. This is a compiler lab. If you have a partner, the two of you will complete this lab as a team. If you do not have a partner, you are effectively in a team by yourself.

If you are working with a partner, please be familiar with the Partner Etiquette guidelines. You and your partner share a single repository and, barring unusual circumstances, will receive the same grade. If you experience any difficulties in your partnership, please alert your instructor as soon as possible.

If you are working alone, please don’t hesitate to seek help if you get stuck. Since there are no ninjas for this course and you don’t have a partner, your instructor is the only interactive resource available to you and is happy to help you through any problems you might encounter.

In either case, be familiar with the academic integrity policy! You are permitted to discuss high-level details of your compiler project with other students, but you should never see or share anything that will be turned in for a grade. If you need help, please post on the Piazza forum or, if necessary, contact the instructor directly. Make sure to post privately if you are sharing code. If you have any doubts about what is okay and what is not, it’s much safer to ask than to risk violating the Academic Integrity Policy.

Overview

In this lab, we will extend the Gull compiler to include tail call optimization. We will call the resulting languge “Hoopoe”. Once you have completed this task, you will be able to compile arbitrary long-running computations using recursion for loops!

Transitioning to Hoopoe

As usual, you just need to merge a branch to get your Hoopoe starter code.

git fetch
git checkout hoopoe
git merge gull

After you merge, you’ll discover the usual updates in src/language/. The most relevant part for this lab is the change made in src/language/asts.ml: the EAppl constructor now takes three arguments rather than two. As a result of this, your code will not compile, but it won’t take much effort to get it back up and running.

The Hoopoe Language

The Hoopoe language is syntactically identical to Gull; there are no new forms of expressions or new language features available to the programmer. In fact, the only difference is the one described above: the abstract syntax of EAppl has changed from

| EAppl of expr * expr

| EAppl of expr * expr * bool

The additional boolean value indicates whether the EAppl in question is in a tail position: that is, whether the function call is the very last thing that this expression will do. The provided parser defaults to assuming that no calls are tail calls (since that is always safe). You will soon write a function to identify tail positions.

The first thing you should do in implementing Hoopoe is to modify your compiler so that this value is ignored during well-formedness and compilation. If you completed the Finch lab and have applications which are generated as part of closure conversion, use the value false for all EAppl you create. Finally, make your unit tests and run them to ensure that everything is still working.

Marking Tail Positions

Next, you need to mark tail positions in your AST. This is easiest to do as a function program -> program which examines each expression in the AST for tail calls that need to be marked. Then, call this function in the right place in your compiler. The particular approach you use does not matter as long as the tail position variable is being set correctly.

Tail Calls

Consider the following program:

def f x = x end
def g y = f y end
g 4

In Gull, which has no concept of tail calls, the following sequence of events would occur:

The main expression calls g, leaving the return pointer for the main expression on the stack.
The g function calls f, leaving the return pointer for g on the stack.
The f function copies its parameter into eax and returns to the point in g where it was called.
The g function finishes and returns to the point in the main expression where it was called.
The main expression yields the value returned by g, which was 4.

Tail call optimization begins with the observation that there was no point in returning to the body of g. We knew we would simply be returning its value, so the stack frame we’ve created for g is just taking up space. With tail call optimization, we want to bring about the following sequence of events instead:

The main expression calls g, leaving the return pointer for the main expression on the stack.
The g function tail calls f. It modifies the stack by removing its own stack frame and replacing its own arguments with new arguments for f. It then jumps to f rather than calling f.
The f function, unaware that this has happened, copies the parameter into eax. When it issues the ret instruction, the tail call by g has changed its return pointer to return directly to the point in the main expression were g was called.
The main expression yields the value returned by f, which was 4.

This kind of optimization is especially important in functional languages, which make frequent and thorough use of recursion. With tail call optimization, recursive tail calls consume no additional stack memory and so work much like while loops in imperative languages. Without tail call optimization, long-running recursive loops will overflow the stack.

Implementing Tail Call Optimization

In tail call optimization, there are three main participants:

The original caller. In the example above, this is the program’s main expression.
The tail caller. In the example above, this is g.
The tail callee. In the example above, this is f.

Not every call can be optimized. To apply our tail call optimization strategy, we must know the following facts:

The call is in a tail position. If this is not true, then we must return to the calling function afterwards, so we can’t just throw our stack frame away. Note that this information is available at compile time.
The tail callee is a function with equal or fewer parameters as the tail caller. If we tail call, we return directly to the original caller, who will remove arguments from the stack equal to the number of parameters of the tail caller (since that is who the original caller had called). If the tail callee has more parameters than the tail caller, we won’t have enough room for the tail callee’s arguments and we will not perform the tail call. Because we have higher-order functions, we will not know this information until runtime.

Since the tail call position is known at compile time, it is the easiest part to handle: for a EAppl which is not in a tail position, we simply generate the same assembly as we would in Gull.

If the EAppl is in a tail position, we need to generate code which might perform a tail call. Remember that EAppl refers to two values: a closure and an argument. We’ll have our compiler generate assembly code which uses the following algorithm:

If the closure is not yet full, create a new closure which includes this argument. This is the same as in Falcon and Gull.
If the closure is full, then
- If the number of parameters in the closure is greater than the number of parameters of this function, we perform a normal call (just as we would if the EAppl were not in a tail position).
- Otherwise, we perform a tail call.
  - Instead of pushing the arguments onto the stack, we replace our own arguments with them. (If we have more parameters than the target function, we must replace our other arguments with zeros.)
  - We tear down our stack (in a fashion similar to how we would at the end of a function declaration), returning it to (almost) the same condition as it was in when the original caller called us.
  - We jump to the function rather than calling it.

Note that the above algorithm requires that we know the number of parameters for the current function. You’ll probably need to modify your environment type to track this information and create a function in environment.ml that allows you to retrieve it. You are not required to perform tail calls in the bird_main expression; bird_main is handled a bit differently from your other functions, so it would require special effort to optimize and this work wouldn’t save that much stack memory overall.

It may be helpful to get this algorithm working for tail callees with an equal number of parameters first and then extend your implementation to tail callees with fewer parameters.

Testing Tail Call Optimization

Testing a compiler optimization is somewhat different than testing the compilation of a language feature. Our tests so far have treated the program as a black box and just checked its output. But an optimization should, by its very nature, not change what task a program accomplishes: it should just make the program smarter about how to accomplish that task.

In this case, however, we do know of an observable change in behavior. In Gull, long-running recursive functions will run out of stack memory and cause a segmentation fault. We can test the Hoopoe optimization by writing such programs and making sure they run correctly. Here are some examples of the sorts of tests you may want to run:

A very simplistic form of test is to ensure that your tail call optimization prevents a stack overflow. Code like the following may help with that:
```
def f x =
  if x = 1 then 1 else f (x-1)
end

f 10000000
```
Without tail call optimization, this program will try (and fail) to create ten million stack frames. With tail call optimization, it should run correctly.
Because tail call optimization manipulates the stack in a variety of ways, we have to make sure that parameters are copied correctly! We can do this by passing parameters into a long-running recursive function to make sure they can be used once we’ve reached the base case:
```
def f a x b =
  if x = 1 then (a,b) else f (a+1) (x-1) (b && b)
end

f 4 10000000 true
```
If tail call optimization damages the stack somehow, it’s pretty unlikely that this function will give the expected result without encountering a type error or segmentation fault!
We should also test that tail call optimization does the right thing when presented with a function that has too many parameters.
```
def f x y z =
  x + y + z
end

def g n =
  f n n n
end

g 4
```
The above code doesn’t test the tail call optimization directly. If the decision about parameters is being made incorrectly, though, then the above code will likely either corrupt a return pointer or use a return pointer in a calculation, which will be visible in the output of the program.
It’s a bit trickier to test whether we’re getting the benefits of tail call optimization for functions with fewer parameters. We can do this by creating two mutually recursive functions and having the function with more arguments consume a lot of stack memory, like so:
```
def f1 x =
  if x = 1 then
    1
  else
    f2 (x-1) false
end

def f2 x b =
  let mem1 = 1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1 in
  let mem2 = 1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1 in
  f1 x
end

f1 100000
```
Here, f2 generates numerous temporary stack variables, allocating hundreds of bytes of stack memory for each call. In the call to f1, however, the f2 stack frame should be replaced, which frees those hundreds of bytes. If your code is behaving properly, then this program will exit normally. If not, then you will likely get a segmentation fault when the greedy calls to f2 cause your program to use up all of its stack memory.
Make sure to run plenty of code (like that from your unit tests) which includes non-tail calls. For instance, consider
```
def plus1 x = x + 1 end
def plus2 x = plus1 (plus1 x) end
plus2 4
```
Within plus2, there are two calls to plus1. If the inner plus1 call is treated as a tail call (even though it’s not in a tail position), then plus1 will return directly to the main expression and so the second call to plus1 will never occur. If the above code returns 5, it’s likely you’re tail calling when you shouldn’t be.

Summary

To complete this assignment, you’ll need to

Process an AST to identify tail calls
When compiling tail calls, generate tail call optimization code

Once you’ve done this, you have a complete language runtime: your programs can run indefinitely, never running out of stack or heap memory if written correctly!

Submitting

To submit your lab, just commit and push your work. The most recent pushed commit as of the due date will be graded. For more information on when assignments are due, see the Policies page.

After each lab, you will complete a brief questionnaire found here. In most cases, the questionnaire should take less than a minute. This questionnaire is required and will be used as part of your participation grade.

If You Have Trouble…

…then please contact your instructor! Piazza is the preferred method, but you can reach out via e-mail as well. Good luck!