random page selection - yank any in-store page; not very efficient
optimal page replacement - furthest use in the future
not recently used
modified-reference bits -
periodically clear reference bits to find old, unreferenced pages
eject unreferenced pages before referenced pages, and unmodified pages
before modified pages
hardware implementation - still need to clear reference bits
software - interrupts and protection
fifo - oldest out next; no reference or modify checking
second chance - oldest unreferenced out
clock page replacement - second chance or fifo as a ring
LRU Replacement
The past as prologue.
Page queue ordered by access time.
Counters
Matrix
LRU simulations.
Not Frequently Used (NFU) adds the reference bit to a counter; it never forgets.
Aging shifts the counter right and inserts the reference bit at the left
(high-order) end, so old references fade away.
LRU problems.
There is no intra-interval granularity.
Resolution is finite and small.
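The aging estimator above can be sketched in a few lines of Python; the names (COUNTER_BITS, tick, victim) are illustrative, not from the notes.

```python
# Aging LRU estimator: on each clock tick, every page's counter is shifted
# right one bit and the page's reference bit is ORed into the top bit.
COUNTER_BITS = 8
TOP_BIT = 1 << (COUNTER_BITS - 1)

def tick(counters, ref_bits):
    """One aging interval: shift counters, fold in reference bits, clear them."""
    for page in counters:
        counters[page] = (counters[page] >> 1) | (TOP_BIT if ref_bits[page] else 0)
        ref_bits[page] = 0

def victim(counters):
    """Eject the page with the smallest counter (the LRU estimate)."""
    return min(counters, key=counters.get)
```

Note the resolution limit from the notes: two pages referenced in the same interval get the same top bit, and history older than COUNTER_BITS intervals is lost.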
Design Issues
working set model
local vs. global page allocations
page size
the memory-management interface
working set model
demand paging
locality of reference
working set
thrashing
prefetch
using counters to compute the working set
adding the working set to paging algorithms - wsclock
local vs. global page allocations
Fractional vs. weighted total allocations.
Managing by page-fault frequency.
Swapping is still useful to handle global thrashing.
page size
Set by hardware or software - Alpha: 8, 16, 32, or 64 KB pages.
Internal fragmentation and small segments argue for small pages.
Small page tables and paging overhead argue for large pages.
Page table overhead on context switches.
The Memory-Management Interface
Sharing pages.
Pinning pages.
Mapping files.
Distributed shared memory.
Overlay addressing
Big programs use addresses outside the range of physical storage.
Overlay relocation into regions solved the problem for overlays.
A similar technique can solve the problem for virtual memory.
Essentially relocate pages into page frames.
Relative Addressing and Relocation
A program with relative addresses has offsets from a base address.
Relocation involves adding a program base to all the offsets to produce
absolute addresses.
Link-load time relocation is inflexible.
Once a relative program is relocated to absolute addresses, it's
difficult or impossible to relocate again.
Relocation Registers
Hardware can help retain relocation flexibility throughout a process's
life.
Keep the base address in a relocation register and add each offset
to the base address.
The process can be relocated by moving it to new base address B'
and storing B' in the relocation register.
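A relocation register reduces to one addition per access; this tiny sketch (class and method names are illustrative) shows why re-relocation becomes a single store.

```python
# Relocation-register addressing: every relative offset is added to the
# base held in the register at access time.
class RelocationRegister:
    def __init__(self, base):
        self.base = base

    def absolute(self, offset):
        # Performed by hardware on every address issue.
        return self.base + offset

    def relocate(self, new_base):
        # Move the process: just update the register; offsets never change.
        self.base = new_base
```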
Relocating Virtual Addresses
Treat each page as a mini-program.
Addresses in the page are offsets from the start of the page.
Assume each page frame has a relocation register.
Virtual address can be relocated to absolute addresses.
The Virtual Relocation Calculation
The virtual-relocation calculation needs a base address and an offset.
The structure of the virtual-address space gives us both these values.
The base address requires a bit more work than does the offset.
Virtual Memory Structure
The regular page structure imposed on virtual memory induces a
structure on virtual addresses.
If a 2^n-unit virtual-address space is divided into 2^p
pages, p < n, how big is each page?
2^n/2^p = 2^(n-p) units.
Alternatively, if a 2^n-byte virtual-address space has 2^s-byte
pages, s <= n, how many pages are there?
2^n/2^s = 2^(n-s) pages
Virtual Address Structure
An n-bit virtual address can be split into two parts:
The high-order p-bit page index, and
The low-order (n - p)-bit page offset, (n - p = s).
For example, consider a 4-bit virtual-address split into four pages.
n = 4, p = 2, s = 4 - 2 = 2.
page 0: 0000 0001 0010 0011
page 1: 0100 0101 0110 0111
page 2: 1000 1001 1010 1011
page 3: 1100 1101 1110 1111
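The split falls out of two bit operations; a short Python sketch (the function name is illustrative):

```python
# Split an n-bit virtual address into a p-bit page index and an
# s = (n - p)-bit page offset using a shift and a mask.
def split(va, s):
    page = va >> s                 # high-order page index
    offset = va & ((1 << s) - 1)   # low-order page offset
    return page, offset
```

With the 4-bit example above (s = 2), address 1101 splits into page 11 (3) and offset 01 (1).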
The Page-Page Frame Mapping
Given the virtual address i*2^s + f, which page frame
holds page i?
Conceptually, we need an array A of size 2^p.
Page i is in page frame A[i].
If entry A[i] is empty, raise a page fault.
This array is called the page table.
The page table also contains other information for each page:
Read-write-execute permissions.
Access-modify status bits.
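The conceptual lookup can be sketched as follows (a flat page table as a Python list; None marks a non-resident page; names are illustrative):

```python
# Flat page-table translation: index by page, fault on empty entries,
# and rebuild the absolute address from frame and offset.
class PageFault(Exception):
    pass

def translate(page_table, va, s):
    page = va >> s
    offset = va & ((1 << s) - 1)
    frame = page_table[page]
    if frame is None:
        raise PageFault(page)      # page not resident
    return frame * (1 << s) + offset
```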
Page-Table Hardware
Virtual-address relocation has to be done on every address issue.
The memory-management unit (MMU) helps make sure virtual-address relocation is fast.
It holds the page table and does the math.
MMUs can be part of the CPU or off-CPU (but not too far away).
Given their size, they're usually off-CPU.
Page Table Size
Page tables have two unattractive properties:
They're big.
A 32-bit virtual address with 8K (= 2^13)-byte pages has
2^(32-13) = 2^19 = 512K entries.
A page-table entry is at least four bytes, for a total of 2 MB.
They're sparse.
There are few gigabyte programs, even with data.
Even if MMUs were big enough to hold a page table, it would be
wasteful.
Segmented Page Tables
A page table can be broken up into segments.
The MMU contains only the segments for code and data being accessed.
Page faulting becomes more complex, as does indexing.
Hierarchical Page Tables
Rather than a flat array of page-table segments, create a tree of
segments.
This packs page-table segments densely in virtual memory.
The tree depth is the page-table level, usually two or three.
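A two-level lookup can be sketched like this, with each level as a dict so absent segments need no storage (the function name and parameters are illustrative):

```python
# Two-level page-table translation: the page index is itself split into
# an outer index (selecting a segment) and an inner index (within it).
def translate2(outer, va, s, inner_bits):
    page = va >> s
    offset = va & ((1 << s) - 1)
    hi = page >> inner_bits                # outer (top-level) index
    lo = page & ((1 << inner_bits) - 1)    # inner (segment) index
    segment = outer.get(hi)               # missing segment: not allocated
    if segment is None or segment.get(lo) is None:
        raise KeyError("page fault")
    return segment[lo] * (1 << s) + offset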
The Translation Look-Aside Buffer
The MMU can't be too fast, and caching is a good way to speed things
up.
The cache for the MMU is called the Translation Look-Aside Buffer
(TLB).
Inverted Page Tables
Generally, systems have less than a full complement of physical
storage.
32-bit systems rarely have 4 GB of physical storage.
The number of page frames can be smaller than the number of pages in a
program.
This is less true than it used to be, now that larger primary stores
are more cost-effective.
Rather than index the page table by page index, which can be large,
index it by page-frame number, which will usually be smaller.
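Structurally, an inverted table is one entry per frame recording its occupant; a purely illustrative sketch (real implementations hash rather than scan linearly):

```python
# Inverted page table: frames[f] == (pid, page) if frame f holds that
# process's page, None if the frame is free. Translation searches for
# the (pid, page) pair instead of indexing by page.
def translate_inverted(frames, pid, page):
    for f, owner in enumerate(frames):
        if owner == (pid, page):
            return f
    return None  # not resident: page fault
```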
Page Replacement
What happens when a page needs to be brought in from disk and there are
no open page frames?
A resident page needs to be ejected to free up a page frame.
The page-replacement algorithm determines which pages to eject.
Random Page Replacement
Just pick a page frame at random and eject the resident page.
Simple but not too efficient.
It may pick a hot page that will have to be paged in again soon.
Optimal Page Replacement
The best page to eject is the one that will remain untouched for the
longest time.
It won't have to be paged in until sometime far into the future.
Always selecting the longest-untouched page results in optimal
(minimal) paging behavior.
Unfortunately, because it requires knowledge of the future, it's
impossible to implement.
But maybe we could make a good guess about future behavior.
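Though unimplementable online, the optimal choice is easy to simulate offline when the whole reference string is known; a sketch (names are illustrative):

```python
# Belady's optimal choice: among resident pages, eject the one whose
# next use in the reference string lies furthest in the future.
def optimal_victim(resident, refs, now):
    def next_use(p):
        for t in range(now, len(refs)):
            if refs[t] == p:
                return t
        return float("inf")  # never used again: the ideal victim
    return max(resident, key=next_use)
```

Simulating this gives a lower bound against which real algorithms like LRU can be measured.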
Least-Recently Used Replacement
Estimate future behavior from past behavior.
A page that hasn't been accessed probably won't be accessed.
Eject the page that has been unaccessed the longest.
This is known as Least Recently Used (LRU) replacement.
LRU replacement works well, but is expensive to implement.
It requires bits per page frame to keep time and lots of comparisons.
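Exact LRU can be kept with an ordered queue of pages, which shows why it's expensive: every access reorders the queue. A minimal sketch using Python's OrderedDict:

```python
# Exact LRU: each access moves the page to the back of the queue;
# the front of the queue is the least recently used page.
from collections import OrderedDict

class LRU:
    def __init__(self, nframes):
        self.nframes = nframes
        self.q = OrderedDict()

    def access(self, page):
        """Touch a page; return the ejected page, or None."""
        if page in self.q:
            self.q.move_to_end(page)   # hit: refresh recency
            return None
        ejected = None
        if len(self.q) == self.nframes:
            ejected, _ = self.q.popitem(last=False)  # eject LRU page
        self.q[page] = True
        return ejected
```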
Estimating LRU Replacement
There are lots of LRU estimators using fewer clock bits (sometimes only
one) and comparisons.
These are usually still expensive because the clock bits have to be
handled by OS software and not hardware.
The queue is a simple and common LRU estimator.
A page access brings the page to the head of the queue.
Unaccessed pages sink to the end of the queue.
A sampling rate (every nth access) reduces the need to requeue
on every access.
Second-Chance Queueing
Sampling may cause accessed pages to sink to the end of the queue.
Assign each page a second-chance bit, initially 0.
If the end-page second-chance bit is 1, eject it.
If the end-page second-chance bit is 0, requeue it and set the
second-chance bit to 1.
The second-chance bit is cleared whenever the page is accessed.
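Following the bit semantics described here (the bit marks a page that has already received its second chance), eviction can be sketched as:

```python
# Second-chance eviction: oldest page at the left of the queue.
# bits[p] == 1 means page p already had its second chance.
from collections import deque

def second_chance_evict(queue, bits):
    while True:
        page = queue.popleft()
        if bits[page]:
            return page       # already had its chance: eject
        bits[page] = 1        # grant a second chance
        queue.append(page)    # send it back around
```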
Clock replacement
Rather than maintain a queue of pages, form them into a ring and use a
pointer to mark the head of the queue.
A trailing second hand can serve as the second-chance bit scanner,
speeding up the rate at which pages are ejected.
The angle between the head and second scanner is a parameter of the
algorithm.
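A one-hand variant of the clock (simpler than the two-hand scheme above, which adds the trailing scanner) can be sketched as:

```python
# One-hand clock replacement: sweep the ring from the hand, clearing
# reference bits, until a page with a clear bit is found.
def clock_evict(pages, ref, hand):
    """pages: the ring (a list); ref[p]: reference bit; hand: ring index.
    Returns (victim page, new hand position)."""
    while True:
        page = pages[hand]
        if ref[page] == 0:
            return page, (hand + 1) % len(pages)
        ref[page] = 0                  # clear and move on
        hand = (hand + 1) % len(pages)
```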
Paging
If a page selected for ejection has been modified (is a dirty
page), it needs to be written out to disk to save the modifications.
Unmodified pages don't need to be written out; they can just be
overwritten.
Bias page-selection algorithms towards unmodified pages.
Paging Areas
Dirty pages cannot be written back to the files from which they came.
There may be more than one copy of the program running, and so more
than one copy of the modified page.
Dirty pages are written to a special file system called the paging
(or sometimes swapping) area.
The paging area is a special-purpose file system designed to provide
fast page IO.
Regular files can serve as paging areas too, with decreased IO
efficiency.
Page-Frame Allocation
How many page frames should a process get?
Local page-frame allocation partitions the page frames among
processes.
Each process gets a subset of non-shared page frames.
When a process fills its allotment of page frames, paging
begins.
Global page-frame allocation makes all page frames available to
all processes.
Each process gloms on to as many page frames as it can.
Paging begins when all page frames are gone.
Local allocation is fair but potentially wasteful; poor allocation
decisions may lead to excessive paging or unused page frames.
Global allocation achieves better utilization (fewer unused page
frames), but may cause a form of extreme paging known as thrashing.
Static vs. Dynamic Allocation
Static page-frame allocation allocates a fixed and unchanging
number of pages at process start.
Dynamic page-frame allocation allows the number of allocated page
frames to change with process behavior.