The Paging Game
[Editor's Note: At the request of many users as well as our own staff members, the following article is reprinted from the University of British Columbia Computing Centre Newsletter. It was written by J. Berryman. and appeared most recently in its entirety in Vol. 6, Number 2, February 1974. Minor changes have been made in keeping with the minor differences that exist between UBC's Computing Center and our own.]
Certain computer systems (MTS, for one) use a process called paging to increase the effective memory capacities of their machines. To help explain this process we have invented a game that is played by the same rules as those that control the paging process. Understanding the game is easier than understanding the actual paging process, because computers usually are surrounded by an air of confusion and mystery whereas games usually aren't. Part I gives the game rules, Part II gives the corresponding rules for the actual process, and Part III discusses the object of the game and how you can make the paging mechanism work best for your programs.
The Crating Game
- You can have up to eight million things. So can everybody else.
- Things are kept in crates that hold 4096 things each. Things in the same crate are called cratemates.
- Crates are stored either in the workshop or the warehouse. The workshop is almost always too small to hold all the crates.
- There is only one workshop and two warehouses. Everybody shares them.
- Each thing has its own thing number.
- What you do with a thing is to zark it. Everybody takes turns zarking.
- You can zark only your things, not anybody else's.
- Things can be zarked only when they're in the workshop.
- Only the Thing King knows whether a thing is in the workshop or in one of the warehouses.
- The longer a thing goes without being zarked, the grubbier it is said to become.
- The way you get your things is to ask the Thing King. The Thing King gives out things only in bunches of eight. This is to keep the royal overhead down.
- The way you zark one of your things is to give its thing number. If you give the number of a thing that happens to be in the workshop, it is zarked immediately. If not, the Thing King finds the grubbiest thing in the workshop, whether it is yours or somebody else's. He packs it off—in its crate, with all its cratemates—to the warehouse. In its place, he puts the crate containing your thing. Your thing then gets zarked, even though the Thing King had to go to some trouble for you. You never know that your thing wasn't in the shop all along.
- Each player's stock of things is numbered in the same range, 1 to 8,000,000. The thing King always knows who owns what thing and whose turn it is, so you can't ever accidentally zark someone else's thing, even if one of your things has the same thing number.
- Traditionally, the Thing King sits at a large, segmented table and is attended to by pages (the so-called table pages) whose job it is to help the King remember where all the things are.
- One consequence of Rule 13 is that everybody's thing numbers will be similar from game to game, regardless of the number of players.
- The Thing King has a few things of his own, some of which move back and forth from the warehouses to the workshop just like anybody else's, but some are just too heavy to move out of the workshop.
- With the given set of rules, oft-zarked things tend to get kept mostly in the workshop, while little-zarked things stay mostly in one of the warehouses. This is efficient stock control.
- Sometimes even the warehouses get full. The Thing King then has to start piling things on the dump out back. This makes the game slower because it takes a long time to get things from the dump when they are needed in the workshop. The Thing King follows a rule that minimizes this delay: When the warehouse begins to get too full, the Thing King uses his spare time to move the grubbiest things in the warehouses out to the dump. This means that the most infrequently zarked things will end up in the dump so that the Thing King won't have to get things from the dump very often. This keeps the game from getting too slow when there are a lot of players and a large number of things to store.
In Part I we said that paging was a process by which some computer systems, including ours, "stretch" the effective sizes of their memories. We described an imaginary game that was played by the same rules that govern the paging process. In this part we'll list the rules again, this time using computing terms instead of the terms of the game. Compare each paging rule given below with the corresponding game rule.
The Paging System
- Any computer job can have up to 8,388,608 bytes of memory at once.
- Memory is divided into pages of 4096 bytes each.
- Pages are stored either in the real memory or on a paging device. The real memory is almost always too small to hold all the pages.
- There is only one real memory (two million bytes of storage) and two principal paging devices (IBM 2301 drum storage units). All the jobs share them.
- Each byte has an address.
- What you do to bytes is to reference (i.e., store data in or fetch data from) them. Jobs take turns referencing.
- Jobs can reference only their own bytes, not anybody else's.
- Bytes can be referenced only when they're in real memory.
- Only the operating system (i.e., MTS, the supervisor, and paging device program) knows whether a byte is in real memory or on the paging device.
- The longer a byte (or a page) goes without being referenced, the less current it is said to become.
- The way a job gets bytes is to request them from the system. Memory is allocated in groups of eight bytes (doublewords) to make the system's recordkeeping process easier.
- The way a job references something is to give its address. Addresses are generally given by making them operands of machine language instructions, contents of hardware registers, etc. At any rate, if a job gives the address of a byte that is in real memory, the reference is completed immediately. If the byte is on a paging device, the system finds the least current page in real memory and writes it onto a paging device. It then reads into real memory the page containing the byte to be referenced. The reference is then completed, just as though nothing special had happened.
- All jobs' addresses are in the same range. The system knows whose memory is whose and which job is doing the referencing, so that one job cannot reference another job's bytes.
- The system keeps track of all the bytes (for a given user) by using one master table, called a segment table, and a bunch of little tables, called page tables.
- Because of Rule 13, the addresses that a job gets don't depend on the number of jobs currently on the computer.
- The system has its own storage, some of which migrates from paging devices to real memory and back just like ordinary memory, but some of which, for a variety of good reasons, stays in real storage all the time.
- Under the above rules, pages that are referenced often tend to stay in real memory, while little-used pages stay on a paging device. This makes efficient use of real storage, which is expensive.
- Sometimes the available space on the principal paging devices gets full. When this situation threatens, the system uses its unbusy moments to select the least current pages on the principal paging devices and move them to a disk pack. This method of using the disk pack minimizes the amount of disk referencing, so that the disk's slower speed won't impede system performance too much.
Real Memory Stretching
(Part I game terms are in parentheses, following their real counterparts.)
Notice that without paging, the total number of bytes (things) that a computer can store is equal to the capacity of its real memory (workshop) alone. With paging, the total number of stored bytes (things) is equal to the real memory (workshop) capacity plus the paging devices (warehouses) capacity plus the disk (dump) capacity. In our case, the approximate figures are:
Total Real Memory (Workshop) 2,000,000 bytes Total Drum (Warehouse) 7,200,000 bytes Total Disk (Dump) 100,000,000 bytes Total Available Memory for All Jobs at Any One Time 109,200,000 bytes
Paged memory is called virtual memory to distinguish it from ordinary (real) memory.
Virtual memory is MUCH cheaper than equal amounts of real memory. Real memory is at least ten times as expensive as drum and disk memory. Consequently, our virtual memory, which consists mostly of drum and disk storage, costs much less than an ordinary, 100 percent real memory of the same size.
In Part I we invented a game whose rules paralleled the real rules of memory paging; in Part II we gave the real rules. In this part we'll discuss the implications of those rules for programmers who are using a paging machine.
When writing programs, there is only one rule that need be followed:
What does this mean? It means that you should write your programs so that they refer to storage in as contiguous and non-scattered a way as possible. Let's take a pair of simple FORTRAN examples. We'll exaggerate the severity of the problem for the sake of explanation, then consider a more realistic example later.
Program A: DIMENSION X(2000) X(1)=50. X(1999)=40. X(2)=30. X(2000)=20. STOP END
Program B: DIMENSION X(2000) X(1)=50. X(2)=30. X(1999)=40. X(2000)=20. STOP END
Both of these programs do the same thing; Program B does it more efficiently. Why? Let's look at the array X in the context of paging. X is 2000 words long; each word takes 4 bytes, so X requires 8000 bytes of storage. Since each page is 4096 bytes of virtual memory, X takes a bit less than 2 pages. Since the elements of arrays like X are arranged in subscript order, X(1) and X(2) will fall in one page, while X(1999) and X(2000) will fall in another. We can draw a little map of X in storage:
If X starts near the end of a page, the storage map might look like this:
PAGE X(2) N
In either case, X(1) and X(2) will be in a different page from X(1999) and X(2000).
Now, let's see what happens when Program A runs.
- Statement / Action
- DIMENSION X(2000)
- Action: None. DIMENSION is non-executable.
- Action: X(1) is set to 50. However, the page containing X(1) must be in real storage before X(1) can be referenced. If the page happens to be on the paging device, it must first be read into real storage or "paged-in."
- Action: X(1999) is set to 40. However, since X(1999) is not in the same page as X(1), X(1999)'s page may first have to read from a paging device.
- Action: X(2) is set to 30. X(2) is in the same page as X(1); unfortunately, since some time has elapsed since X(1) was referenced, X(2)'s (and X(1)'s) page may have been "paged-out" to a paging device again. Remember that we have a timesharing system in which all jobs take turns using the same (or parts of the same) real storage, just as all the players in the crating game use the same workshop. Consequently, the whole paging process is more active than your job alone would have it be. So X(2)'s page might have to be reread from a paging device into real memory.
- Action: X(2000) is set to 20. X(2000) is in the same page as X(1999). Unfortunately, just as for X(2)'s and X(1)'s page, X(2000)'s page might have to be paged-in.
- Action: Program terminates.
- Action: None
Now we'll trace Program B.
- Statement / Action
- DIMENSION X(2000)
- Action: None
- Action: X(1) is set to 50. Like Program A, X(1)'s page may have to be read from a paging device.
- Action: X(2) is set to 30. Unlike Program A, X(2)'s page will almost certainly not have to be reread, since it was referenced only an instant ago.
- Action: X(1999) is set to 40. Like Program A, X(1999)'s page may well have to be read from a paging device.
- Action: X(2000) is set to 20. Like X(1)'s page, X(2000)'s page probably won't need to be reread.
- Action: Program terminates.
- Action: None
Program B is better than Program A because it tends to require a smaller number of page-reads. This is true because Program B localizes its references; it does all its work on one page before moving on to the next, whereas Program A bounces its references back and forth from page to page.
"So what?" someone says. "Page-reads are free, aren't they?" No, we say. Although the system doesn't charge for page-reads per se, each page-read uses a certain amount of CPU time, part of which you are charged for. Also, page-reads increase the elapsed time of a run by making the system (which operates much faster than the paging device) suspend your job until your page can be recalled into main memory from the paging device.
As we said, the example we've used isn't quantitatively accurate — in actual fact, Program A probably would have needed no more page-reads than Program B. The four statements would probably have been executed so quickly that once both pages came into real memory, either program — A or B —would have finished before the vagaries of the paging mechanism found it necessary to put any pages back onto the paging device.
However, for large arrays, these subtleties can grow into expensive disasters. How does this happen? Let's look at a more true-to-life example.
Program C: DIMENSION X(1000,250) DO 100 I=1,1000 DO 100 J=1,250 100 X(I,J)=0.0 STOP END
Program D: DIMENSION X(1000,250) D0 100 J=1,250 D0 100 I=1,1000 100 X(I,J)=0.0 STOP END
These two programs were compiled and run by your author. Program D (the right way) took 4.12 seconds of CPU time and required 17 page-reads. Program C was stopped before it had finished; when it was stopped, at about 1/8 toward completion, it had used 0.50 seconds of CPU time and had needed more than 3560 page-reads! The test was made on the 360/67 under conditions of moderately heavy loading. In a very busy time, the difference would have been greater; in a slack time, the difference would have been less. Why? Let's look at how these programs referenced their data.
Multi-dimensioned arrays in FORTRAN are stored internally in column-major order. This means that the array X in Programs C and D would have been stored in the following order:
That is, the array elements for column 1 (for all rows) are stored at the beginning addresses of the array; next, in ascending addresses, are all of the elements of column 2, and so forth. The elements of column 250 occupy the highest addresses in the region of storage allocated to the array X. Note that the total size of this array is 1,000,000 bytes— four bytes for each of 250,000 array elements. This amounts to just about one-half of the total real memory if the entire array were paged-in. It is also worthwhile noting that each column requires 4000 bytes, or nearly a whole page. Consequently, the first elements (i.e., the row-1 elements) of consecutive columns will be found at intervals of 4000 bytes. Thus, the chances are very good that the row-1 element of one column will be located on a different page from the row-1 element of the next column. It should be apparent that this argument is valid for rows 2, 3, and all the remaining rows of the array.
Programs C and D reference the array in the following respective orders:
Comparing these two orders with the order of the array shows that Program C references the array in a scattered way, skipping 999 array elements between references, while Program D references the array consecutively. Program D obeys the rule. Vive la difference!
The general rule for n-dimensioned FORTRAN array referencing is that the leftmost subscript should vary most rapidly, the next leftmost subscript should vary next most rapidly, etc.
In PL/1, arrays are stored internally in row-major order—the opposite of FORTRAN—so PL/1 arrays should be referenced with the rightmost subscript varying most rapidly, the next rightmost subscript next most rapidly, etc.
In general, you should try to program according to the rule whenever you can. For many small programs, the difference will be minimal; however, small programs can have large arrays! Programs C and D, after all, are "small" in the sense that they have few instructions. If you have a program that is inexplicably expensive to run, perhaps it is violating the rule too much or too often.
Remember, you can find out the total number of page-reads used by your job—they are printed in the accounting information (the "tailsheet") after each signoff. Moreover, if you issue the command $SET TDR=ON, the CPU time and number of page-reads used will be printed after the execution of each command. Try it! If a program is doing thousands of page-reads, it may well be wasting computing funds.
After all this, you might wonder why we use the paging system if people have to worry about it. The reason is that we have, in MTS, a timesharing computer system in which jobs can have up to about 8,000,000 bytes of storage whenever they need it. As we outlined in Part II, the cost of an equivalent system using non-paged memory would be outrageous. We just couldn't do it. So, in the language of Part I, even that most benevolent of monarchs, the Thing King, is imperfect; to keep ourselves happy, we must take his faults into account. ••
"The Paging Game" (or sometimes "The Thing King") describes an imaginary game that is played by the same rules that govern the virtual memory paging process that many computer systems use to "stretch" the effective sizes of their memories. As the article explains "Understanding the game is easier than understanding the actual paging process, because computers usually are surrounded by an air of confusion and mystery whereas games usually aren't." The article was written between 1972 and 1974 by Jeff Berryman, a systems programmer working on the Michigan Terminal System (MTS) operating system at the University of British Columbia Computing Centre. The article is often mistakenly attributed to MIT's Project MAC.
University computing center newsletters were widely shared in the 1970s and as a result the article was frequently reprinted and got into general circulation. It was widely distributed through the SHARE IBM User's Group and was included in at least two books, Charles T. Meadow's Applied Data Management [Wiley, 1976] and Peter van der Linden's Expert C Programming: Deep C Secrets [Prentice Hall, 1994]. A Google search for "Paging Game" "Thing King" in November 2012 returned over 360 results. Jeff Berryman has said that he considers the article to be in the public domain and so, as far as he is concerned, anybody can publish it anywhere they want. He also finds it funny that the only publication that ever got him any fame at all in the computing field was completely facetious. A discussion in the MTS Archive web site includes more information about the article, where and why it was written.
There are three parts to the article. Part I gives the game rules, Part II gives the corresponding rules for the actual process using computing terms instead of the terms of the game, and Part III discusses the object of the game and how one can make the paging mechanism work best for computer programs. It is common for part III and sometimes part II to be omitted when the article is reprinted. While the article describes virtual memory demand paging in a fairly generic way, all three parts of the article are slightly dependent on the details of the specific virtual memory architecture and hardware configuration at a specific site and time and Part III is slightly dependent on the computer operating system being used. As a result many different versions of the article were created with minor adjustments to more closely match the details of different vendors' virtual memory architectures, different sites' hardware configurations, and different operating systems.
The version above includes all three parts and is taken from the 19 June 1974 issue of the University of Michigan's Computing Center Newsletter (Vol.4, No.7, pages 2-6) which is available from the Hathi Trust Digital Library. Some details are based on the virtual memory architecture for the IBM System/360 Model 67 mainframe computer, on the hardware configuration at the University of Michigan in 1974, and on a few aspects of the Michigan Terminal System (MTS) time-sharing operating system that was in use at the University of British Columbia and at the University of Michigan. The University of Michigan allows portions of its Newsletter to be reprinted without permission so long as the source is clearly acknowledged. To avoid misunderstandings, any abridgments of reprinted material should be indicated by ellipsis marks (...) and editorial interpolations should be enclosed in brackets.
The Editor's note at the beginning of the article was added in 1974 by the editor of the Computing Center Newsletter at the University of Michigan.
In November 2012 the two storage maps in part III were adjusted slightly from the 1974 version, where the maps did not seem to match the associated text.