Description
In this assignment, you will investigate memory access patterns, simulate the operation of page tables and implement several page replacement algorithms. This will give you some practice working with the algorithms we have been talking about in class. This assignment is based on a virtual memory simulator that uses the simvaddr-*.ref memory reference traces located at /u/csc369h/fall/pub/a4/traces . The first task is to implement virtual-to-physical address translation and demand paging using a page table design of your choice. Then you will implement two different page replacement algorithms: exact LRU (Least Recently Used) and Clock. Before you start work, you should complete the set of readings about memory, if you haven’t done so already: Paging: Introduction (http://pages.cs.wisc.edu/~remzi/OSTEP/vm-paging.pdf)
Tutorial 7 Exercise: Memory reference traces
The Tutorial 7 Exercise (%24CANVAS_OBJECT_REFERENCE%24/assignments/g99295e10ed471a400741e73aef899c8e) provides an introduction to the memory address traces used in the simulator.
Part 1: Virtual to physical translation
Intro video (https://web.microsoftstream.com/video/697981d9-04de-406f-9aeed38107682433)
Setup
Log into MarkUs to create or update your repo and get the starter code. Remember that you cannot manually create a new a3 directory in your repo or MarkUs won’t see it. The traces from our sample programs at /u/csc369h/fall/pub/a4/traces will be interesting to run once you have some confidence that your program is working, but you will definitely want to create small traces by hand for testing. The format of the traces is reftype vaddr value as shown in the sample below.
Note that the page offset part of the addresses are all between 0 and 15 (0xf) to fit in the reduced simulated physical page frames.
For a write reference type (S or M), the value will be written to the virtual address. For a read reference type (L or I) the value is the expected value that should be read from the virtual address. It should always be the same as the value in the most recent preceding write reference to the same virtual address. We use this to check that the address translations and pagein/pageout operations are working correctly.
A sample trace snippet is shown below: S 309001 182 S 1fff000000 55 I 108005 0 S 308008 122 L 1fff000000 55 L 308008 122 I 4cc5000 0 L 5018008 0 Note that in our traces, the Instruction reference type is likely to always have a value of 0 because these addresses are not written to after the program starts executing. You will also see Load references with a value of 0 when the trimmed trace includes a Load from an address that has not yet been written to.
Task 1 – Address Translation and Paging
Implement virtual-to-physical address translation and demand paging using a pagetable design of your choice. The main driver for the memory simulator, sim.c , reads memory reference traces in the format produced by the simify-trace.py tool from trimmed, reduced valgrind memory traces (refer to the Tutorial 7 exercise (%24CANVAS_OBJECT_REFERENCE%24/assignments/g99295e10ed471a400741e73aef899c8e) for more information on how the traces are generated).
For each line in the trace, the program asks for the simulated physical page frame that corresponds to the given virtual address by calling find_physpage, and then reads from the simulated physical memory at the location given by the physical frame number and the page offset. If the access type is a write (‘M’ for modify or ‘S’ for store), it will also write the value from the trace to the location. You should read sim.c so that you understand how it works but you should not modify it.
The simulator is executed as ./sim -f -m -s -a where memory size and swapfile size are the number of frames of simulated physical memory and the number of pages that can be stored in the swapfile, respectively. The swapfile size should be as large as the number of unique virtual pages in the trace, which you should be able to determine easily based on your analysis from Tutorial 7.
There are four main data structures that are used: 1. unsigned char *physmem : This is the space for our simulated physical memory. We define a simulated page frame size of SIMPAGESIZE and allocate SIMPAGESIZE * “memory size” bytes for physmem. 2. struct frame *coremap : The coremap array represents the state of (simulated) physical memory. Each element of the array represents a physical page frame. It records if the physical frame is in use and, if so, a pointer to the page table entry for the virtual page that is using it. 3. struct pt_entry – a page table entry.
The format of a page table entry is up to you, but at a minimum it must record the frame number if the virtual page is in (simulated) physical memory and an offset into the swap file if the page has been written out to swap. It must also contain flags to represent whether the entry is Valid, Dirty, and Referenced. 4. swap.c : The swapfile functions are all implemented in this file, along with bitmap functions to track free and used space in the swap file, and to move virtual pages between the swapfile and (simulated) physical memory. The swap_pagein and swap_pageout functions take a frame number and a swap offset as arguments.
The simulator code creates a temporary file in the current directory where it is executed to use as the swapfile, and removes this file as part of the cleanup when it completes. It does not, however, remove the temporary file if the simulator crashes or exits early due to a detected error. You must manually remove the swapfile.XXXXXX files in this case. To complete this task, you will have to write code in pagetable.c and pagetable.h . Read the code and comments in this file — it should be clear where implementation work is needed and what it needs to do.
Basic round-robin and random replacement algorithms are already implemented for you, so you can test your translation and paging functionality independently of implementing the replacement algorithms. Efficiency: In a real operating system implementation, the memory space taken up for your page tables reduces the memory space available to store the pages of processes’ virtual address spaces. Hence, keeping page tables small is desirable.
Reducing the time complexity of page table lookups is also important. Your solution will be evaluated on correctness as well as space and time efficiency.
Task 2 – Replacement Algorithms
Using the starter code, implement exact LRU and CLOCK (with one ref-bit) replacement algorithms. You may find that you want to add fields to the struct frame for the different page replacement algorithms. You can add them in pagetable_generic.h , but please label them clearly.
Note: to test your page replacement algorithms, we will replace your pagetable.c with a solution version, so your page replacement algorithm must be contained to the provided functions. Once you are done implementing the algorithms you can use the provided simvaddr-*.ref traces and the autotester to check the results. For each algorithm, the tester will run the programs on memory sizes 50 and 100 and check the output against the expected results.
Efficiency: Page replacement algorithms must be fast, since page replacement operations can be critical to performance. Consequently, you must implement these policies with efficiency in mind. For example, we will give you the expected complexities for some of the policies: RR: init, evict, ref: O(1) in time and space CLOCK: init, ref: O(1) in time and space; evict: O(M) in time, O(1) in space, where M = size of memory LRU and MRU: evict, ref: O(1) in time and space; init: O(M) in time and space, where M = size of memory Important notes When we run the autotests on your code, your page replacement algorithms will be compiled with a different pagetable.c file (the one from the solution).
All the code of the page replacement algorithms must be in their separate .c files, not in pagetable.c (except for additions to struct frame in pagetable.h ). When a page is being evicted, there should be only 2 possibilities: (i) the page is dirty and needs to be written to the swap; and (ii) the page is clean and already has a copy in the swap. A newly initialized page (zero-filled) should be marked dirty on the very first access. CLOCK must use the “Referenced” flag stored in the page table entry. All the algorithms must utilize their ref() functions (if necessary) instead of adding any algorithm-specific code to pagetable.c .
There are functions in pagetable.c to get the values of the Valid, Dirty, and Referenced flags given a page table entry, which you should implement. Use these functions in your replacement algorithm implementations if you need to check any of these flags — do not assume a particular format for page table entries, or your replacement algorithms are unlikely to work with our solution version of pagetable.c .
The simulator and the page replacement algorithms must not produce any additional or different (from starter code) output (except for errors that should be printed to stderr ), otherwise the tests will fail. The simulator writes a value into the simulated physical memory pages for Store or Modify references, and checks that simulated physical memory contains the last written value on Load or Instruction references. If there is a mismatch, the simulator prints an error message.
These errors indicate that there is something wrong with the address translation implementation. For debugging, you will find it useful to implement the print_pagetable() function in pagetable.c. The function should print (at least) one line for each in-use page in the pagetable (valid in memory, or currently evicted to swap).
Other than that, what information it displays is up to you — we will only be testing that it produces at least the expected number of outputs lines. You can use the debug flag in sim.c to control extra output during development. (NOTE: remember to set it to false in your final submission).