CS 416 Project 3 – User-level Memory Management

$30.00

Category: Tags: , , , , You will Instantly receive a download link for .zip solution file upon Payment || To Order Original Work Click Custom Order?

Description

5/5 - (5 votes)

Assume you are building a new startup for Cloud Infrastructure (Amazing Systems); a competitor of Amazon
Cloud Services. As the CEO of your company, you decide to move page table and memory allocation to the
software. In this project’s part 1, you will build a user-level page table that can translate virtual addresses
to physical addresses by building a multi-level page table. In part 2, you will also design and implement a
translation lookaside buffer (TLB) cache to reduce translation cost. For evaluating your code, we will test
your implementation across different page sizes. We will mainly focus on the correctness of your code, the
page table and TLB state. For the extra-credit (see part 3), you will implement a 64-bit 4-level page table
design. Note that this project would require understanding bit manipulation. We have added a helper code
for you to learn and get used to bit manipulation (see 8. Sample Codes and Helpers).
Part 1. Implementation of Virtual Memory System (70 pts)
While you have used malloc() in the past, you might not have thought about how virtual pages are translated
to physical pages and how they are managed. The goal of the project is to implement “a_malloc()” (Amazing
malloc), which will return a virtual address that maps to a physical page. For simplicity, we will use a 32-bit
address space that can support 4GB address space. We will vary the physical memory size and the page size
when testing your implementation.
set_physical_mem(): This function is responsible for allocating memory buffer using mmap or malloc
that creates an illusion of physical memory (Linux http://man7.org/linux/man-pages/man2/mmap.2.htm).
Here, the physical memory refers to a large region of contiguous memory that is allocated using mmap() or
malloc() and provides your page table and memory manager an illusion of physical memory. Feel free to use
malloc for allocating other data structures required for managing your virtual memory.
translate(): This function takes input a page directory (address of the outer page table), a virtual address,
and returns the corresponding physical address. You have to work with two-level page tables. For example,
in a 4K page size configuration, each level uses 10-bits with 12-bits reserved for offset. For other page sizes X,
reserve log_2(X) bits for offset and divide the remaining bits into two levels. Unequal division is possible in
this case.
page_map(): This function walks the page directory to see if there is an existing mapping for a virtual
address. If the virtual address is not present, then a new entry will be added. Note that you will be using
this in a_malloc() to add page table entry.
a_malloc(): This function takes the number of bytes to allocate and returns a virtual address. Because you
are using a 32-bit virtual address space, you are responsible for address management. To make things simple,
assume for each allocation, you will allocate one or more pages (depending on the size of your allocation).
For example, for a_malloc(1byte) and a_malloc(1byte), you will allocate one page for each call. Similarly,
for a_malloc(4097 bytes), you will allocate 2 pages when the page size is 4KB. This simple allocation would
cause fragmentation. One way to reduce some fragmentation can be found below:
Reducing Fragmentation (Optional-No Extra Credit): One way is to combine multiple small allocations to one page. For example, the first call of a_malloc returns an address 0x1000. When you call a_malloc
again, you can return 0x1001 (if application asked for 1-byte) or any address within the page size boundary.
The next call to a_malloc (if the size requested ends up not fitting within the first page) would return 0x2000
or higher. Note: This is optional. Implement only if you have time. No extra points of this.
1
Keeping Track: You will keep track of which physical pages are already allocated and which pages are
free; use a virtual and physical page bitmap that represents a page (you must use a bitmap. Using any
other structure will lose points, ie. using an array of chars with ‘y’ or ‘n’ or an array of ints using
each index for a page). The get_next_avail() (see the code) function must return the next free available
page. You must implement bitmaps efficiently (allocating only one bit per page) to avoid wasting memory
(https://www.cprogramming.com/tutorial/bitwise_operators.html, also see the C Review [416 abridged] in
resources on Sakai for some simple examples).
a_free(): This call takes a virtual address and the number of bytes (int), and releases (frees) pages starting
from the page representing the virtual address. For example, for a 4K page size, a_free(0x1000, 5000) will
free two pages starting at virtual addresses 0x1000. Also, please ensure a_free() isn’t deallocating a page
that hasn’t been allocated yet! Note, a_free returns success only if all the pages are deallocated.
put_value(): This function takes a virtual address, a value pointer, and the size of the value pointer as
an argument, and directly copies them to physical pages. Again, you have to check for the validity of the
library’s virtual address. Look into the code for implementation hints.
get_value(): This function also takes the same argument as put_value(), but reads the data from the
virtual address to the value buffer. If you are implementing TLB, always check the presence of translation in
TLB before proceeding forward.
mat_mult(): This function receives two matrices mat1 and mat2 as an argument with a size argument
representing the number of rows and columns. After performing matrix multiplication, copy the result to
answer array. Take a look at the test example. After reading the values from two matrices, you will perform
multiplication and store them to an array. For indexing the matrices, use the following method:
A[i][j] = A[(i * size_of_rows * value_size) + (j * value_size)]
Important Notes: – You cannot change the function arguments/signature of the following: a_malloc(),
a_free(), put_value(), get_value(), mat_mult(). – Your code must be thread-safe and your code will be tested
with multi-threaded benchmarks.
We have included one sample test code (benchmark/multi_test.c) to check thread safety. This is just a
sample, so please feel free to use, extend, increase thread count (and add to Makefile) the sample code for
thread-safety checks.
Part 2. Implementation of a TLB (20 pts)
In this part, you will implement a direct-mapped TLB. Remember that a TLB caches virtual page number to
physical address. This part cannot be completed unless Part 1 is correctly implemented.
Logic:
Initialize a direct-mapped TLB when initializing a page table. For any new page that gets allocated, no
translation would exist in the TLB. So, after you add a new page table translation entry, also add a translation
to the TLB by implementing add_TLB().
Before performing a translation (in translate()), lookup the TLB to check if virtual to physical page translation
exists. If the translation exists, you do not have to walk through the page table for performing translation
(as done in Part 1). You must implement check_TLB() function to check the presence of a translation in the
TLB.
If a translation does not exist in the TLB, check whether the page table has a translation (using your part 1).
If a translation exists in the page table, then you could simply add a new virtual to physical page translation
in the TLB using the function add_TLB().
Number of entries in TLB: The number of entries in the TLB is defined in my_vm.h (TLB_ENTRIES).
However, your code should work for any TLB entry count (modified using TLB_ENTRIES in my_vm.h).
2
TLB entry size: Remember, each entry in a TLB performs virtual to physical page translation. So, each
TLB entry must be large enough to hold the virtual and physical page numbers.
TLB Eviction: As discussed in the class, the number of entries in a TLB are much lower than the number
of page table entries. So clearly, we cannot cache all virtual to physical page translations in the TLB.
Consequently, we must frequently evict some entries. A simple technique is to find the TLB index of a virtual
page, and replace an older entry in the index with a new entry. The TLB eviction must be part of the
add_TLB() function.
Expected Output: You must report the TLB miss rate by completing the print_TLB_missrate() function.
See the class slides for the definition of TLB miss rate.
Important Note: Your code must be thread-safe and your code will be tested with multi-threaded benchmarks.
Part 3. Extra Credit: Implementing a 4-level page table (10 pts)
For the extra-credit, you are required to implement a 4-level page table for 64-bit addressing by extending
the 2-level page table in Part 1 for 32-bit CPUs.
1. Before attempting to implement the extra-credit part, you must complete Part 1 and Part 2 of the
project.
2. We cannot evaluate Part 3 unless Part 1 and 2 are implemented correctly.
3. Your 4-level design must be thread-safe.
4. Note that for the 64-bit design to work, you will have to compile the code without the -m32 flag.
Therefore, we suggest to create “my_vm64.c” and compile the code without the 32-bit flag.
5. Note that the TLB entries will also store 64-bit addresses.
6. For more questions, email us.
4. Suggested Steps
• Step 1. Design basic data structures for your memory management library.
• Step 2. Implement set_physical_mem(), translate() and page_map(). Make sure they work.
• Step 3. Implement a_malloc() and a_free().
• Step 4. Test your code with matrix multiplication.
• Step 5. Implement a direct-mapped TLB if steps 1 to 4 work as expected.
5. Compiling and Benchmark Instructions
Please only use the given Makefiles for compiling. We have also provided a matrix multiplication benchmark
to test the virtual memory functions. Before compiling the benchmark, you have to compile the project code
first. Also, the benchmark does not display correct results until you implement your page table and memory
management library. The benchmark provides hints for testing your code. Make sure your code is thread safe.
We will focus towards testing your implementation for correctness.
6. Report (10 pts)
Besides the VM library, you also need to write a report for your work. The report must include the following
parts:
1. Detailed logic of how you implemented each virtual memory function.
2. Benchmark output for Part 1 and the observed TLB miss rate in Part 2.
3. Support for different page sizes (in multiples of 4K).
4. Possible issues in your code (if any).
3
5. If you implement the extra-credit part, description of the 4-level page table design and support for
different page sizes.
Because we are using a 32-bit page table (for Part 1 and 2), the code compiles with -m32 flag. Not all iLab
machines support -m32. Here’s a list of them that you could use.
kill.cs.rutgers.edu
cp.cs.rutgers.edu
less.cs.rutgers.edu
ls.cs.rutgers.edu
7. Submission
Submit the following files to Sakai, as is (Do not compress them):
1. my_vm.h
2. my_vm.c
3. my_vm64.c (only if attempting extra-credit Part 3)
4. A report in .pdf format, completed, and with both partners names and netids on the report
5. Any other support source files and Makefiles you created or modified
Please Note: Your grade will be reduced if your submission does not follow the above instruction.
8. Sample Codes and Helpers.
This project requires understanding bit manipulation. In addition to the C tutorials we posted in the first
week of the class, see a sample code (hw3-sample.c), which will help you understand bit manipulation.
In hw3-sample.c, you will see functions that show an example for the following operations. Remember, this is
just a sample for you to understand the basics. You are responsible for understanding and trying out more
complex bit manipulation (if required).
1) Extracting “N” top order bits (outer bits) from a 32-bit number
2) Extracting “M” middle-order bits from a 32-bit number
3) Setting “Kth bit” value in a character array
4) Reading “Kth bit” value in a character array
9. FAQs and Other Things To Note
1. Similar to project 2, we will update FAQs to a webpage, in addition to answering them in Piazza.
https://www.cs.rutgers.edu/~sk2113/course/cs416-sp21/project3-faq.html
2. MAX_MEMSIZE vs. MEMSIZE. Within the header file (my_vm.h), you will see two defined values: (1) MAX_MEMSIZE and (2) MEMSIZE. The difference between the two is that MAX_MEMSIZE
is the size of the virtual address space you should support, while MEMSIZE is how much “physical
memory” you should have. In this case MAX_MEMSIZE is defined as (4ULL * 1024 * 1024 * 1024)
which is 2ˆ32 bytes or 4GB, the amount of virtual memory that a 32-bit system can have. On the other
hand, MEMSIZE is defined as (1024 * 1024 * 1024) or 2ˆ30 bytes or 1GB, which is how much memory
you should mmap() or malloc() to serve as your “physical memory”.
3. Be mindful of values 2ˆ32 and up. Notice that the definition of MAX_MEMSIZE is casted as
an unsigned long long. This is because the library is compiled as a 32-bit program. With a 32-bit
architecture, an int/unsigned int/long/unsigned long are all 4 bytes, meaning they cal only represent
the 0 to 2ˆ32 – 1 different values. So when dealing with MAX_MEMSIZE or other values that equal or
larger than 2ˆ32, make sure to use unsigned long long to avoid any value truncation.
4
4. If you are using your personal computer for development and getting the following error, then refer to
this link: https://www.cyberciti.biz/faq/debian-ubuntu-linux-gnustubs-32-h-no-such-file-or-directory/
gnu/stubs-32.h: No such file or directory compilation terminated. make: *** …
5