CptS355 – Assignment 4 Part 2 An Interpreter for a Simple Postscript-like Language (SPS)

$30.00

Category: Tags: , , , , , , You will Instantly receive a download link for .zip solution file upon Payment || To Order Original Work Click Custom Order?

Description

5/5 - (4 votes)

The Problem
In this assignment you will write an interpreter in Python for a simplified PostScript-like language,
concentrating on key computational features of the abstract machine, omitting all PS features related to
graphics, and using a somewhat-simplified syntax. The simplified language, SPS, has the following
features of PostScript language:
Constants:
▪ integer constants, e.g. 1, 2, 3, -4, -5. We will represent integer constants as Python
“int” values in opstack and dictstack.
▪ boolean constants, e.g. , true , false. We will represent boolean constants as Python
“bool” values in opstack and dictstack.
▪ string constants, e.g. (CptS355): string delimited in parenthesis. We will represent string
constants as StrConstant objects (see elements.py file.) (Make sure to keep the
parenthesis delimiters around the string value when you initialize the StrConstant object.)
▪ name constants, e.g. /fact, /x. A name constant starts with a ‘/’ and letter followed by an
arbitrary sequence of letters and numbers. We will represent name constants as Python “str”
values in opstack and dictstack.
▪ names to be looked up in the dictionary stack, e.g., fact, x; as for name constants, without
the ‘/’
▪ code constants (i.e., code-arrays); code between matched curly braces { … }
Operators:
▪ built-in operators on numbers: add, sub, mul, mod, eq, lt, gt
▪ string creation operator: string; pops an integer value (e.g., n) from the operand stack and
creates a StrConstant object with ‘value’ of length ‘n’. Initializes the characters in the
StrConstant‘s ‘value’with ‘\0’ , i.e., ascii NUL character.
▪ built-in operators on string values: get, put, getinterval, putinterval,
length. These operators should support StrConstant arguments.
▪ dictionary creation operator: dict; pops an integer value from the operand stack and creates a
new empty dictionary on the operand stack (we will call this psDict). psDict ignores the
popped integer value.
▪ built-in operators on dictionary values: get, put, length. These operators should
support DictConstant arguments.
▪ built-in conditional operators: if, ifelse (you will implement if/ifelse operators in
Part2)
▪ built-in loop operator: for (you will implement for operator in Part 2).
▪ stack operators: dup, copy, count, pop, clear, exch, stack
▪ dictionary stack manipulation operators: begin, end.
▪ begin requires one dictionary operand on the operand stack; end has no operands.
▪ name definition operator: def. We will call this psDef.
▪ stack printing operator: stack. Prints contents of opstack without changing it.
Part 2 – Requirements
The starter code provided to you partially implements a REPL (read-evaluate-print-loop) tool for
interpreting PostScript code. In Part 2 you will complete the implementation of the following pieces in
the starter code:
I. The Reader: You need to complete the reader in psparser.py. You will convert the PostScript
tokens to their corresponding expression objects.
II. The Evaluator: You need to implement the `eval` methods of expression objects in
elements.py.
III. Executing (applying) code-arrays: You need to implement the apply method of CodeArray (in
elements.py). You will also implement if, ifelse operators (psIf and psIfelse), and
for loop operator (psFor). You will write these functions in the psbuiltins.py file from
Part 1.
Running the Interpreter
1. Read-Eval-Print. You can run the interpreter in the REPL mode and interpret the PostScript code on
the command prompt. The RELP tool is implemented in reply.py file. The interpreter reads
PostScript expressions, evaluates them, and displays the results. You can run the REPL tool by
executing the reply.py file, i.e.,
python repl.py (or python3 repl.py)
2. Load. Alternatively, you can run the interpreter on the “Loader” mode and run the loaded PostScript
programs all at once. The “Loader” loads PostScript code (given as strings), evaluates them, and
displays the results. You can run the “Loader” tool by running the following command on the
command line:
python load.py (or python3 load.py)
Implementation Overview
Here is a brief overview of each of the Read-Eval-Print Loop components in our interpreter. Refer to this
section as you work through the project as a reminder of how all the small pieces fit together!
Read: This step parses user input (a string of PostScript code) into our interpreter’s internal Python
representation of PostScript expressions.
Lexical analysis has already been implemented in the tokenize function in psparser.py. This function
returns a list of tokens. The `read` function turns this list into a `Buffer` (defined in buffer.py)
which allows to get the tokens one at a time.
Syntactic analysis happens in `read` and `read_expr` functions. Together, these mutually recursive
functions parse tokens and converts them into our interpreter’s internal Python representation of
PostScript expressions.
In our interpreter, there are four types of expressions. elements.py defines the classes that will
represent each of these expressions and they are all subclasses of the `Expr` class:
1. Literal: represents primitive constants : integers or booleans . The `value` attribute contains
the constant value the `Literal` refers to.
2. Name: represents names of variables, functions, or operators . The `var_name` attribute
contains the name of the variable as a Python string, e.g., ‘/sq’,’sq’,’add’,’def’. If the
`var_name` starts with a `/` character, Name represents a name constant, otherwise it
represents a variable reference , function call, or a built-in operator call.
3. StringExpr: represents strings. The `value` attribute is a Python string enclosed in PostScript
string delimiters, i.e., ‘(‘ and ‘)’. For example:'(CptS355)’
4. Block: represents body of a function or blocks of if, ifelse, and for operators. The `value`
attribute is a Python list that includes the tokens of the PostScript code-array (block) it
represents , e.g., [Literal(10), Literal(5),Name(mul)]
The `read` function calls the `read_expr` function for each token in the buffer and returns a list of
expression objects. You will complete `read_expr` function in psparser.py. `read_expr` creates and
returns the expression object (i.e., one of the above) for the given token.
In repl.py and load.py , the interpreter calls `read` function to parse the tokens in PostScript code
and turns them into a list of expressions.
Eval: This step evaluates PostScript expressions (represented in Python) to obtain values.
Each of the expression classes (i.e., `Literal`, `Name`, ` StringExpr`, and `Block`) implement their
own `eval` methods returning the value that the expression evaluates to. You will complete the
implementations of the `eval` methods for each of these classes in elements.py. See “II. The
Evaluator” section below for more details.
In repl.py and load.py, the interpreter calls the `eval` method of each expression to evaluate the
expressions in the expression list.
Print: This step prints the __str__ representation of the obtained value.
Loop: The logic for the loop is handled by the while loop in repl.py. You don’t need to make any
changes in repl.py.
The UML diagram on the next page illustrates the class level design for the interpreter.
I. The Reader
The first part of this assignment deals with reading and parsing SPS code. The reader is implemented in
psparser.py. To parse the SPS programs,
1. We first pass the SPS code to the lexer which convert the continuous input text to a list of tokens.
`tokenize` function in psparser.py implements the tokenizer. You do not need to make any
changes in this part.
Example: Given the PostScript code ,
“””
/square {dup mul} def
/mydict 1 dict def
mydict /in 1 put
mydict /out 100 put
mydict /in 10 put
mydict /in get
square
mydict /out get
eq
{(equal)} {(different)} ifelse
“””
`tokenize` will return :
[‘/square’, ‘{‘, ‘dup’, ‘mul’, ‘}’, ‘def’, ‘/mydict’, 1, ‘dict’, ‘def’, ‘mydict’,
‘/in’, 1, ‘put’, ‘mydict’, ‘/out’, 100, ‘put’, ‘mydict’, ‘/in’, 10, ‘put’,
‘mydict’, ‘/in’, ‘get’, ‘square’, ‘mydict’, ‘/out’, ‘get’, ‘eq’, ‘{‘, ‘(‘,
‘equal’, ‘)’, ‘}’, ‘{‘, ‘(‘, ‘different’, ‘)’, ‘}’, ‘ifelse’]
2. Next, we parse the tokens, i.e., convert the tokens into our interpreter’s internal Python
representation of SPS expressions. The `read` and `read_expr` functions in psparser.py handle
parsing.
– The `read` function turns the token list (the output of `tokenize`) into a `Buffer` object
(defined in buffer.py) which allows to get the tokens one at a time.
“””Parse an expression from a string. If the string does not contain an
expression, None is returned. If the string cannot be parsed, a SyntaxError
is raised.
“””
def read(s):
#reading one token at a time
src = Buffer(tokenize(s))
out = []
while src.current() is not None:
out.append(read_expr(src))
return out
– `read` calls the `read_expr` function to convert each expression to its corresponding expression
object. You need to complete the `read_expr` function and create the corresponding expression
object for each token.
o As explained in section “Implementation Overview”, in our interpreter there are four types of
expressions: Literal, Name, StringExpr, and Block which are subclasses of the `Expr` object
defined in elements.py.
o In the `read_expr` function, you can use the helper functions provided in psparser.py to
check the type of each token and to retrieve the tokens that will be included in the array or
code-array values. Note that the tokens between `(` and `)` belong to a string and those
between `{` and `} belong to a code array.
“”” Converts the next token in the given Buffer to an expression. “””
def read_expr(src):
token = src.pop_first()
if token is None:
raise SyntaxError(‘Incomplete expression’)
# TO-DO – complete the following; include each condition as an `elif` case.
# if the token is a literal return a `Literal` object having `value` token.
# if the token is a name, create a Name object having `var_name` token.
# if the token is an array delimiter (i.e., ‘(‘), get all tokens until the
# matching ‘)’ delimiter and combine them as a Python string
# separated by space; create a StrConstant object having this string value.
# if the token is a codearray delimiter (i.e., ‘{‘), get all tokens until
# the matching ‘}’ delimiter and combine them as a Python list;
# create a Block object having this list value.
else:
raise SyntaxError(“‘{}’ is not the start of an expression”.format(token))
Given the tokens,
[‘/square’, ‘{‘, ‘dup’, ‘mul’, ‘}’, ‘def’, ‘/mydict’, 1, ‘dict’, ‘def’, ‘mydict’,
‘/in’, 1, ‘put’, ‘mydict’, ‘/out’, 100, ‘put’, ‘mydict’, ‘/in’, 10, ‘put’,
‘mydict’, ‘/in’, ‘get’, ‘square’, ‘mydict’, ‘/out’, ‘get’, ‘eq’, ‘{‘, ‘(‘,
‘equal’, ‘)’, ‘}’, ‘{‘, ‘(‘, ‘different’, ‘)’, ‘}’, ‘ifelse’]
`read` should return:
[Name(/square), Block([Name(dup), Name(mul)]), Name(def), Name(/mydict), Literal(1),
Name(dict), Name(def), Name(mydict), Name(/in), Literal(1), Name(put), Name(mydict),
Name(/out), Literal(100), Name(put), Name(mydict), Name(/in), Literal(10), Name(put),
Name(mydict), Name(/in), Name(get), Name(square), Name(mydict), Name(/out),
Name(get), Name(eq), Block([StringExpr((equal))]), Block([StringExpr((different))]),
Name(ifelse)]
The above is the print of the __repr__ representation of the expression list `read` returns.
II. The Evaluator
Evaluator evaluates the PostScript expressions (represented as Python objects). In repl.py and
load.py, the interpreter calls the `eval` method of each expression to evaluate the expressions in the
expression list.
Each of the expression classes (i.e., Literal, Name, StringExpr, and Block) implement their own
`eval` method returning the value that the expression evaluates to. For example:
1. `Literal` object is evaluated to the integer or boolean value it represents (e.g. 15 or `True` or
`False`) and pushed onto the opstack.
2. `Name` object is evaluated according to the following:
a. If the `Name` represents a name constant (i.e., its `var_name` attribute starts with a
`/`), it will be evaluated to a Python string having value `var_name` . The evaluated
value will be pushed onto the opstack.
b. If the `Name` represents a built-in operator (i.e., its `var_name` attribute is one of the
built-in operator names), then we will evaluate it by executing the operator function
defined in psbuiltins.py in the current environment (opstack).
c. If the `Name` represents a variable or function, interpreter looks up the value of the
variable in the current environment (dictstack).
i. If the variable value is a function (`CodeArray`), it should be applied (i.e.,
executed) by calling its `apply` method.
ii. Otherwise, the variable value is a constant and it should be pushed onto the
opstack.
3. `StringExpr` object is evaluated to a `StrConstant` value. For example: a StringExpr with
`value` attribute ‘(CptS355)’ will be evaluated to StrConstant with `value` attribute
‘(CptS355)’. The evaluated StrConstant is pushed onto the stack.
4. `Block` object is evaluated to `CodeArray` value. For example: a `Block` with `value`
attribute [Name(dup), Name(mul)] will be evaluated to `CodeArray` with the same `value`
(i.e., [Name(dup), Name(mul)]. The evaluated `CodeArray` is pushed onto the stack.
III. Handling of code-arrays ; if/ifelse, for operators
Recall that a `Block` represents the blocks of if, ifelse, and for operators and the function
bodies. As explained above, when a `Block` is evaluated, a `CodeArray` object having the same list of
tokens is pushed to the opstack. Once a `CodeArray` is on the stack several things can happen:
– if it is the top item on the stack when a `def` is executed (i.e. the `CodeArray` is the body of a
function), it is stored as the value of the name defined by the def.
– if it is the body part of an if/ifelse operator, the `psIf` (or `psIfElse`) function will execute
(apply) the `CodeArray`. For the if operator, the CodeArray is executed only if the “condition”
argument for if operator is true. For the ifelse operator, if the “condition” argument is true, first
`CodeArray` is executed, otherwise the second `CodeArray` is executed .
– if it is the body part of a for operator, psFor function will execute (apply) the `CodeArray` as part
of the evaluation of the repeat loop.
– finally, when a `Name` is looked up and if its value is a `CodeArray`, then the expression represents
a function call and the CodeArray value will be executed (applied).
A `CodeArray` can be executed by calling its `apply` method. `apply` should evaluate all tokens in the
function body.
Testing
1. Testing your interpreter manually:
You can do manual tests by using the RELP tool, i.e., by typing PostScript code on the REPL command
line and checking the contents of the opstack and dictstack after the PostScript code is interpreted.
I suggest you print both the opstack and the dictstack when the “stack” operation called to help
with the debugging. You can print the opstack and dictstack in different colors to improve the
readability. `colors.py` file includes some predefined colors. To print a string `s` in green and then reset
the color back to default, you can use the print statement : print(OKGREEN+s+CEND)
2. Testing your interpreter using the loader:
You can load the PostScript code in the loader and interpret it. The given test inputs are already included
in `load.py`. The `for` loop in this file iterates over each test input included in `tests` list, interprets
them, and print the content of the `opstack` . When you add a new PostScript test input to this file,
make sure to add it to the `tests` list. The expected outputs for the given 32 tests are included in the
“expected_output” dictionary in load.py.
3. Automated tests:
Sample Python unittests testing for the same inputs are included in `tests_part2.py` file. You should
provide 5 additional Python unittests in addition to the provided tests. Make sure that the PostScript
code in your tests are different than the provide examples and your tests include several operators. You
will loose points if you fail to provide tests or if your tests are too simple.
Testing the parsing:
Before even attempting to run your full interpreter, make sure that your parsing is working correctly.
Make sure you get the correct parsed output for the testcases. The parsed tokens for the 31 tests are
provided in load.py – see the “parse_output” dictionary in load.py.