6. Information Theory, Memory Management

SoftSys Lecture Notes, Monday 2/22/21 #

Intro: Cold Boot Attacks #

Here’s what Think OS Chapter 3 has to say about RAM:

On most current computers, main memory is volatile, which means that when the computer shuts down, the contents of main memory are lost.

In theory this is true, but in practice it isn’t:

(Video link: https://www.youtube.com/watch?v=gwhx2vqXHkI)

Stack Variables and Return Values #

Last time, we talked about stack frames. Each function call gets a frame on the stack that keeps (among other things) its local variables, temporary storage, and the return address of the function that called it.

When a function returns, its stack frame is “destroyed” in the sense that the memory in those addresses can be reclaimed for other things.

The return value of the function is passed in one of the special purpose registers, which is almost always %rax (or %eax, depending on what type the value is and whether you are on a 32-bit or 64-bit system).

If the return value is a pointer to a location on the function’s stack frame, you’ll probably get gibberish when you try to read from it, because the frame has been destroyed.

So something like this doesn’t work:

int *add(int x, int y) {
  int z = x + y;
  return &z;
}

By the time the program gets this return value and tries to dereference z (written *z), the data at that memory address might not be there any more. You’ll probaby get a segfault.

This is one reason why you see pointers created beforehand, and then passed into a function that writes to the location it points to:

void add(int x, int y, int *z) {
  *z = x + y;
}

int main() {
  int z;
  add(42, 47, &z);
  return 0;
}

This works because z lives on main’s stack frame instead of on add’s, so it stays valid even adter add returns.

Information Theory #

It’s helpful to get used to the number of bits required to represent something. If you have \(n\) possible outcomes, each of which is equally probably, each outcome gives you \(\log_2 n\) bits of information. Here are some examples:

  • 6-sided die roll: ~2.58 bits
  • Single decimal digit (0 - 9): ~3.32 bits
  • Letter of the English alphabet: ~4.70 bits

If you have multiple independent outcomes, you can usually just sum the bits together. So if you roll two 6-sided dice, then you have double the amount of information, ~5.17 bits, and if you have any two-digit number (including 00, 01, etc.), then you have ~6.64 bits.

In many “realistic” scenarios, this breaks down a bit. If you are looking at English text, all of the letters don’t appear with equal frequency. Assuming that the letter “e” appears 13% of the time, you would get ~2.94 bits from the letter, whereas assuming that the letter “z” appears 0.074% of the time, you would get ~10.4 bits. The less likely something is, the more information it carries. This is what Think OS refers to as self-information or surprisal.

(Letter probabilities from Wikipedia: https://en.wikipedia.org/wiki/Letter_frequency)

In general, the self-information of an event is \(-\log_2 p\) , where \(p\) is the probability of the event happening. If you have \(n\) equally likely outcomes, then the probability of one of those events has a self-information of \(-\log_2 \frac{1}{n}\) bits, which is equal to \(\log_2 n\) as we mentioned above.

To store something, you need to round up the number of bits to the next integer. So for a 6-sided die roll, you need 3 bits to store the result of the roll, rounding up from ~2.58 bits, because you need a whole number of bits.

Exercise #

Calculate the self-information of the following:

  • The outcome of a fair coin flip
    • 1 bit
  • The outcome of rolling a fair, 12-sided die
    • 3.58 -> 4 bits
  • The letter “q” in natural English text (using the source above)
    • 10.04 -> 11 bits
  • A two-letter “word” from the English alphabet (don’t worry about whether it’s an actual word or not)
    • 9.40 bits -> 10 bits
  • Getting a full house in poker (probabilities here: https://en.wikipedia.org/wiki/Poker_probability)
    • 9.439 -> 10 bits
  • A given telephone area code in the North American Numbering Plan, assuming all area codes are equally likely. Including area codes not in use, but Excluding area codes not allowed by the NANP (list here: https://en.wikipedia.org/wiki/List_of_North_American_Numbering_Plan_area_codes)
    • Area codes cannot start with a 0 or 1 (so instead of 1000 area codes, you only have 800)
    • 9.64 -> 10 bits?
    • I did log2(8*10*10)

Memory Management and Pages #

Virtual memory is an abstraction of the physical memory on a machine. Somewhat confusingly, when people talk about “physical memory” they usually are referring to RAM, but in this context, “physical memory” refers to disk space on a machine. RAM is used as a cache to speed up access to memory that you are likely to use in the near future.

Memory allocation is usually done in pages rather than at the byte level. This helps with caching memory from disk in RAM. The cost of accessing data on disk is much, much higher compared to RAM, so if you are going to disk, you might as well get a chunk of memory. Plus, there is locality: programs that access one part of memory usually end up accessing other memory nearby as well.

Each process has its own virtual address space, so every process thinks that it has access to much more memory than it has. So the computer needs to match a virtual address for a process to a physical address on disk.

To do this, it uses a memory management unit (MMU), which translates a virtual address into a physical address.

Exercise #

  1. First, find out how large the pages on your system are. You can use getconf PAGE_SIZE to do this. The result will be in bytes.
  2. Now figure out how much physical memory (disk space) your system has. You might already know this number, but if you are running in a virtual machine, this number will probably be different from what you have on your machine overall. (If you don’t know this number and want to approximate it, you can run the command sudo fdisk -l from your terminal. Everything that has a “Disk model” or “Disk identifier” is physical memory you should include.)
  3. Once you know how much physical memory you have, how many bits will it take to store an address in this memory?
  4. On your system, then, how many pages will fit in physical memory?
  5. Now find out if your system is 32-bit or 64-bit. If you don’t know, you can find out by running uname -m in your terminal. Usually, x86 indicates a 32-bit system, while x86_64 indicates a 64-bit system.
  6. If you have a 32-bit system, each process has a virtual address space of \(2^{32}\) addresses. On a 64-bit system, this number is \(2^{64}\) . How many pages will fit in your virtual memory space?
  7. Finally, run lscpu on your terminal. You should see a line that indicates how many bits are required for a physical and virtual address on your machine. Does this match what you calculated earlier?

Project #

You should start Project 1 in about a week. It’ll last just over a month, with the final report due on Wednesday, 3/31.

You can work in teams of up to 3 people. Use Slack if you want to find people to work with.

For this week, think about what you want to do. Here is a list of projects from Spring 2019. Here is a list of project resources that you might find helpful.

Proposals are due in one week, on Monday, 3/1. Your proposal should describe:

  • Project goals: the functionality and objectives of your software product.
  • Learning goals: what you want to get out working on this project.
  • Resources: what you are using to learn, what you still need, any plans to get what you need.
  • Starting tasks: at least three tasks to work on, who will do each one, and what completion of each task looks like.

Other important dates to be aware of:

  • Check-in due: Monday, 3/8
  • Update due: Mondey, 3/22
  • Final report due: Wednesday, 3/31

For Next Time #

  • Finish HW2.5 (due Wednesday 2/24)
  • Read Head First C Chapter 3 (due Wednesday 2/24)
  • Think about Project 1 ideas
  • Prepare for a quiz (Wednesday 2/24)