Homework 4: Programs as Data


Category: You will Instantly receive a download link for .zip solution file upon Payment


5/5 - (8 votes)

CSci 2041: Advanced Programming Principles
This homework will focus expression evaluation and other functions
over expressions represented as inductive data types in OCaml. The
goal of this assignment is to give you the experience of working with
symbolic data at an abstract level. It will also give you some
insight into how compilers and interpreters work. While the languages
we are implementing are quite simple, the principles being used are
also used in production implementation of mainstream programming

**Change log:**
+ Monday, March 5: (Sean), corrected example in Part 2, type inference,
to be “2 * (3 + (4 < 5))" (instead of "2 * (3 + (4 < 5)0"). + Tuesday, March 6: (Yuze Jiang, via Eric), fixed error in example evaluation for the `unparse` function. + Wednesday, March 7: (Eric), fixed some awkward wording description of type inference for the ``Eq`` expression form. + Thursday, March 8: (Duanyang), fixed typo in the type of ``get_type`` , changed from ``value -> type“ to “value -> typ“.

+ Friday, March 9: (Duanyang), In Part 1 Step 6, recursive functions, changed
“sumTo“ to match definition given in “expr_functions.ml“.

+ Monday, March 12: (Duanyang), added clarification to “unparse“

# Writing transparent code

We have discussed the importance of writing code in a manner that
makes it easier for others to understand it. We’ve even had a few
labs devoted to this topic in which you improved your own code. For
this assignment, part of your score will be based on how well you
write code that others can follow.

You should consider writing type annotations where they are useful to
others, consider using let-expressions to give intermediate values
names that help reader understand their intended use, not write lines
longer than 80 characters, along with using other means that we’ve
discussed for making your code easier to read.

You do not need to over-think this process, but take some care to make
your programs readable to others.

# Languages and files

For this assignment we will consider 3 different language – that is, 3
different versions of the “expr“ type.

1. The first is the “full” language with functions and recursive
let-expressions as defined in “expr_functions.ml“ in the
“Homework/Files/expr“ directory in the public class repository.

2. The second is a subset of this that does not include functions and
is defined in the “expr_let_typing.ml“ file in the same directory.

3. The third is a version of the “full” language with type annotations
on identifier declarations. This will be provided soon (by Mar 6
at the latest).

Copy each of these files into a directory named “Hwk_04“ in
your personal GitHub course repository.

Each of these files has a number of “assert“ expressions that you
can un-comment when you are ready to test your code. Additional
asserts may be provided to you later. Also, the feedback scripts will
be up in a few days to also help clarify some questions you may have.
But keep in mind that neither the asserts not the feedback tests are
intended to be exhaustive. You do need to read and understand the
assignment specifications without relying completely on these.

# Part 1: Problems over the first language in “expr_functions.ml“

The problems below should be solved in the “expr_functions.ml“

## Serialize expressions

This problem asks you to write a function named “serialize“ that
converts an expression to a “string“ that is a valid OCaml
expression of type “expr“.

For example,
serialize (Add (Val (Int 1), Mul (Val (Int 2), Val (Int 3))))
should evaluate to the string
“Add (Val (Int 1), Mul (Val (Int 2), Val (Int 3)))”

Constructs such as “Lambda“ and “Let“ that have “string“
components will need to “escape” the double quote characters that
need to be in the resulting string.

For example,
serialize (Lambda (“n”, Add (Id “n”, Val (Int 1))))
should evaluate to the string
“Lambda (\”n\”, Add (Id \”n\”, Val (Int 1)))”

To see a version of this string the way that OCaml prints “expr“
values, that is, one that is output with the quotes not escaped, use
“print_endline“. For example, type the following into utop or ocaml:
print_endline (serialize (Lambda (“n”, Add (Id “n”, Val (Int 1)))));;
You should see
Lambda (“n”, Add (Id “n”, Val (Int 1)))
– : unit = ()

You ~~cannot~~ can copy the “Lambda…“ line above and paste it into utop.
This will be an easy way to check that your “serialize“ function is
generating valid OCaml code.

Your “serialize“ function should compute EXACTLY the string that
OCaml displays in the utop environment when it displays the value of
an “expr“. The only distinction can be the newlines and indentation
that OCaml will add. So white-space will not matter in checking this.

Note that your function should only serialize values constructed by
“Int“ or “Bool“. Raise a “Failure“ exception stating
“”Will only serialize integer and Boolean values”“.

## Unparse expressions

This problem asks you to write a function named “unparse“ that
converts an “expr“ value into a string that is what we write
in OCaml for similar computations.

For example,
unparse (Add (Val (Int 1), Mul (Val (Int 2), Val (Int 3))))
should evaluate to
“(1 + (2 * 3))”
Notice that “unparse“ generates more parentheses than are needed.
This simplifies the implementation of “unparse“.

Your function should generate output that you can copy and paste into
the OCaml interpreter (in utop) to **compute the value** of the expression.

By copying the output “(1 + (2 * 3))“ from above, OCaml will display

~~You need to handle quotes in a manner similar to in “serialize“.~~
Constructs such as “Lambda“ and “Let“ that have “string“
components will need to generate the actual name without double quotes
in the resulting string.

For example,
unparse (Lambda (“n”, Add(Id “n”, Val (Int 1))))
should evaluate to
“(fun n -> (n + 1))”
To test your function, you can copy and paste your output into utop
and let OCaml evaluate the expressions. For example, you can paste
the output “(fun n -> (n + 1))“ in utop and you should get a function of type
“int -> int“.

For another example,
unparse (App (Lambda (“n”, Add (Id “n”, Val (Int 1))), Val (Int 4)))
should evaluate to
“((fun n -> (n + 1)) 4)”
If you paste “((fun n -> (n + 1)) 4)“ into utop, you
should get the value “5“.

Note that your function should only ~~serialize~~ unparse values constructed by
“Int“ or “Bool“. Raise a “Failure“ exception stating
“”Will only unparse integer and Boolean values”“.

## Free variables

Complete the implementation “freevars“ developed in class and copied
into the “expr_functions.ml“file.

Our “expr“ language defines the notion of free variables the same
way that OCaml does. If you are unclear what should be computed by
“freevars“ for a particular expressions you can use your “unparse“
function and just type it in to utop. Note that in utop only the
first free variable (called “Unbound value” by OCaml) is displayed.
But using this you can discover the semantics of “freevars“ if it is
not already clear.

## Expression evaluation

This problem asks you to complete the “eval“ function to evaluate
all forms of expression.

In “expr_functions.ml“ you will find the definitions of the
“expr“, “value“, and “environment“ types. You will also find an
implementation of “lookup“. There is also an “evaluate“ function
that simply calls “eval“ for an expression starting with the empty

Some parts of “eval“ are provided, but many of the interesting cases
need to be implemented by you.

### Step 1 – arithmetic expressions

To get stared, first ensure that “evaluate“ will work for the
arithmetic operations and integer constants. Much of the work for this
has been done in some of the in-class example already.

An example evaluation:

+ “evaluate (Add (Val (Int 1), Mul (Val (Int 2), Val (Int 3))))“
evaluates to “Int 7“

### Step 2 – logical and relational expressions

Logical and relational operations are also straightforward:

Some sample evaluations:

+ “evaluate (Eq (Val (Int 1), Mul (Val (Int 2), Val (Int 3))))“
evaluates to “Bool false“

+ “evaluate (Lt (Val (Int 1), Mul (Val (Int 2), Val (Int 3))))“
evaluates to “Bool true“

### Step 3 – conditional expressions
Conditional expressions should also pose not significant challenge.
For example

(If (Lt (Val (Int 1), Mul (Val (Int 2), Val (Int 3))),
Val (Int 4),
Val (Int 5)))
evaluates to “Int 4 “

### Step 4 – let expressions

We have implemented non-recursive let-expressions in a forms in
class. Adapting that work to this setting should be straightforward.

### Step 5 – non-recursive functions

We have spent some time in class discussing closures as the way to
represent the value of a lambda expression. The slides have several
examples of this, a few of which are reproduced here. There is
additional information in the lecture slides.

The values “inc“ and “add“ are defined as follows:
let inc = Lambda (“n”, Add(Id “n”, Val (Int 1)))

let add = Lambda (“x”,
Lambda (“y”, Add (Id “x”, Id “y”))

Some sample evaluations:

+ “evaluate inc“
evaluates to “Closure (“n”, Add (Id “n”, Val (Int 1)), [])“

+ “evaluate add“
evaluates to “Closure (“x”, Lambda (“y”, Add (Id “x”, Id “y”)), [])“

+ “evaluate (App (add, Val (Int 1)))“
evaluates to “Closure (“y”, Add (Id “x”, Id “y”), [(“x”, Int 1)])“

+ “evaluate (App ( (App (add, Val (Int 1))), Val (Int 2)))“
evaluates to “Int 3“

### Step 6 – recursive functions

Consider the “sumTo“ function we discussed in class. In OCaml,
we’d write this function as follows:
let rec sumTo = fun n ->
if n = 0 then 0 else n + sumTo (n-1)
in sumTo
To represent this function in our mini-OCaml language defined by the
“expr“ type, we’d represent the function as follows:
let sumToN : expr =
LetRec (“sumToN”,
Lambda (“n”,
If (Eq (Id “n”, Val (Int 0)),
Val (Int 0),
Add (Id “n”,
App (Id “sumToN”,
Sub (Id “n”, Val (Int 1))
Id “sumToN”
Here we’ve given the name “sumToN“ to the “expr“ value that
represent our “sumTo“ function.

+ “evaluate (App (sumToN, Val (Int 10)))“
evaluates to “Int 55“

# Part 2: Problems on the second language in “expr_let_typing.ml“

The problems below should be solved in the “expr_let_typing.ml“

## Serialize and unparse expressions

Copy the “serialize“ and “unparse“ functions you completed above
into the file for this work and remove the clauses for language
constructs (such as lambda expressions) that are not part of the
“expr_let“ language.

## Evaluation

Modify “eval“ to return a “value result“ instead of a “value“.

The “value“ type is as before, but we introduce a new “result“
type that indicated a successful computation or that some errors have
occurred. It is defined as follows:
type ‘a result = OK of ‘a
| Err of error list

There are 3 types of errors, as defined in the “error“ type. Your
“eval“ function will return a value constructed by “OK“ if there
are no errors detected during evaluation.

But if an error is detected, then that error must be propagated out as
the result. Even though the “Err“ constructor builds values from an
“error list“, when you use “Err“ in “eval“ you should only ever
give it a list with exactly one “error“. In a later part of this
assignment you will find an opportunity to report multiple errors.

The “error“ and “typ“ types are defined as follows:
type typ =
| IntType
| BoolType

type error =
| UnboundName of string
| IncorrectType of expr * typ * (typ list)
| DivisionByZero of expr

Our simple language only has values of that are either of type
“IntType“ or “BoolType“. When errors are reported they are one of
the three possible errors indicated by the three constructors:

1. When an identifier is not found in the environment, a
“UnboundName“ error is created.

2. When there is a type error at evaluation time, an “IncorrectType“
error is created. A error “IncorrectType (e, actual_type,
expected_types)“ can be read as “Expression “e“ has type
“actual_type“ but was expected to have one of the types in the
list “expected_types“.”

3. The “DivisionByZero“ error is created when the right
sub-expression of a “Div“ expression evaluates to zero. The
entire “Div“ expression is placed into this error, not just the
right sub-expression.

Below are some guidelines for implementing this new version of

### Values

Expressions constructed by “Val“ will also return an “OK“ result.

### Operators

An expression like “Add“ will first evaluate it left
sub-expression. If this produces and error, then that error is to be
reported. If, on the other hand, it produces a value, then “eval“
should evaluate the right sub-expression. If the right sub-expression
returns a value, then a value should be computed and returned. If the
right sub-expression produces an error then that error is to be

One exception: if the right sub-expression of “Div“ produces a value
of “Int 0“ then you should report a “DivisionByZero“ error. The
argument to this error construct should be the entire “Div“
expression, not just the part that evaluated to zero.

Another concern: the “Eq“ expression can operate on two integer
values or on two Boolean values to produce a value. If the first is
an integer and the second is not, then the error message should
indicate that the second expression was expected to have a type of
“IntType“. Similarly, if the first is a Boolean type then the
second is expected to have type “BoolType“.

You should also write a function named “get_type“ of type “value ->
typ“ useful in generating error messages and in later parts of this

### Let expressions and identifiers

Like binary operators, we only report errors on evaluated parts of the
expression. If the binding expression contains produces an error
instead of a value, then we cannot extend the environment with a
name/value binding to evaluate the body. Thus, in this case we just
report the error in the binding expression and do not evaluate the
body. If the binding expression does produce a value, then the result
of evaluating the let expression is simply the result of evaluating
the body in the extended environment.

## Type inference

For this part of the problem you will implement a function named
“check“ that checks an expression for unbound names and incorrect
types **without** evaluating the expression.

The “check“ function has the type
expr -> context -> typ result
type context = (string * typ) list

The aim of this function is to return the type (that is, “typ“) of
an expression or as many errors as feasible if some type or name
errors indicate that the expression does not have a type.

The principle here is that an expression can only be ascribed a type
if all of its sub-expressions can be given a type. Thus an expression
such as “2 * (3 + (4 < 5))" does not "have a type" because the addition sub-expression is erroneous. For this function, the ``Err`` constructor may be used with multiple ``error`` values in the list. Some guidelines for completing this function follow: ### Values Expressions such as ``Val (Int 4)`` or ``Val (Bool true)`` are correct in isolation and have the type ``IntType`` or ``BoolType`` respectively. ### Operators But an expression such as ``` Add (Val (Int 4), Val (Bool true)) ``` is erroneous and we cannot compute a type for it. Here we have a sub-expression (``Val (Bool true)``) that has a type, but that type does not meet the typing requirements of the construct (``Add``) that is a part of. We identify expressions that have a type (that is have no type errors) but fail to meet the typing requirements of the construct they are in, as being the expression that is erroneous. The type inference process for ``Add``, ``Sub``, ``Mul``, and ``Div`` are all exactly the same. Your code should be written so that no duplicate code is used in these cases. You should not check for division by zero errors in this function. Since only the trivial cases could be detected we will not attempt to find any of them. Thus, an expression such as ``` Div (Val (Int 5), Val (Int 0)) ``` will yield the type ``IntType``. That is, ``` check (Div (Val (Int 5), Val (Int 0))) [] ``` should produce the result ``OK IntType``. The equality construct ``Eq`` works on integers and Boolean values, some additional care must be taken here. If the first argument has type ``IntType`` then the second one must as well. If the first argument does not have a type (that is, there is some error), then we can still check that the second one has either an ``IntType`` or a ``BoolType``. Here, these are the only two options. When we add function types in a later problem, this will become a concern. Let-expressions should first check the binding expression for errors. If there are none, then the context can be extended and the body can be checked in that context. If the binding expression doesn't have a type (and thus has errors) we will just report those error and not check the body of the let. ## Relating ``eval`` and ``check`` In writing ``eval`` and ``check`` you should keep in mind that they are related. It should be the case that if the ``check`` function does not find any type errors (unbound name or incorrect types) then ``eval`` should compute a value that has the same type as the original expression. ``` P(e) = forall v in value, t in typ . check e [] = OK t =>
( (eval e [] = OK v && get_type v = t) ||
(eval e [] = Err [ DivisionByZero e’ ], for some expression e’)

Statements such as this are common in verification of programming
language. Together, these are called “progress” and “preservation”
statements. These state that if type checking succeeds, then
evaluation will result in a value (progress) and the resulting value
has the same types as the original expression (preservation).

We have the caveat, that type checking will not detect division by
zero. So it is possible for a specific type of error to occur. But
we are guaranteeing that the other two types or error will not occur
during evaluation.

While you are not (yet) asked to prove this property, your functions
should satisfy it.

# Part 3: Type checking the full language

~~This specification will appear soon – by March 6 at the latest. When
it is ready and announcement on Canvas will be sent.~~
This part is now extra credit.
You can find it [here](Hwk_04_Programs_as_Data_Extra_Credit.md).