i. Assembly language (PL/1 coming along)
ii. Not much agreement about abstractions
iii. Not much rigor/correctness given to design
i. OS/360 was being designed – huge effort, thousands of programmers, late, buggy, not rigorously designed (e.g. corner cases involving interrupts very sloppy)
i. What services should it offer?
ii. What applications is it for?
iii. How should it be constructed internally?
1. As a bunch of libraries?
2. As a bunch of layers?
3. As a hierarchy?
4. As modules/subsystems?
i. What is right way to organize OS to provide
3. Simplicity / correctness
4. Handle I/O efficiently (abstracted from processes)
ii. How do you battle the complexity of:
1. Multiprogramming – e.g. different users, different tasks, different programs, different priorities
2. Interrupts; re-entrant code
3. Control: who controls things and how
4. Flexibility: not much is known about how to do things, want to have flexibility to change things in the future
iii. What are the right abstractions to provide?
1. e.g. processes, threads, messages, files, names
i. You all know Unix a bit, get it out of the way
ii. Provides a lens to look at other papers
iii. Important context of what OS looks like today
1. Often too long – don't need to write so much!
1. Unclear what problem they were solving.
1. Be concrete – not just that the artifact was lasting. What about ideas?
1. No evaluation, Not much motivation
a. Often true for industry projects
2. Why no hard links to directories? Why no file locking?
a. Hard to get right, can work around – emblematic of Unix approach
b. Circular structures can't be garbage collected with reference counts
3. Why not worry about quotas?
a. Hard to get right
b. Not needed in their environment
4. They don't address the requirements / computing environment
a. When written, everybody knew about it. People today don't write about what a PC is or how much it costs (very much).
5. Using C made the OS bigger.
a. QUESTION: comments?
i. Writers worked on Multics at Bell Labs – reacted to gross complexity & inefficiency of Multics
ii. Had very small computer to work with, wanted to use for their own purposes
1. Different from creating something for others; you know what you need and what you can sacrifice
2. Has to be usable; often drives out other goals such as abstraction
iii. Commercial OS at the time not extensible – you just got what you got and lived with it.
i. Everything (almost) is a file
1. Device access
2. Interprocess communication
4. QUESTION: Why important?
a. Don't need lots of APIs
b. Can have tools that operate on different things.
i. e.g. cat to a device
5. QUESTION: how do you provide shared access to a device, e.g. a printer?
a. Grant exclusive access to a daemon, let it do sharing.
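A minimal sketch of why "everything is a file" matters for tools: one copy routine, written only in terms of open/read/write, works on regular files and on devices alike. The function name `copy_bytes` is hypothetical, and the device paths `/dev/zero` and `/dev/null` in the usage note assume a Linux-like system.

```python
import os


def copy_bytes(src_path, dst_path, limit=4096):
    """Copy up to `limit` bytes from src to dst using only open/read/write.

    The code never asks whether a path names a regular file or a device;
    the kernel presents both through the same descriptor interface.
    The destination must already exist (no O_CREAT here).
    """
    src = os.open(src_path, os.O_RDONLY)
    dst = os.open(dst_path, os.O_WRONLY)
    try:
        data = os.read(src, limit)
        return os.write(dst, data)  # number of bytes written
    finally:
        os.close(src)
        os.close(dst)
```

For example, `copy_bytes("/dev/zero", "/dev/null", 16)` moves 16 bytes between two devices with the same call that would copy between two ordinary files.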
ii. Data is bytes
1. not records
2. Generally null-terminated strings
3. QUESTION: Why important?
a. A record format is hard to program to
b. Text format commonly recognized
iii. Uniform name space
1. no separate naming convention for directories, file names, different disks
a. d:\foo (dos, windows)
b. e.g. $pinot:sys$disk[swift.one]foo.doc;13 (VMS)
2. Mountable file systems into name space
a. but not quite transparent – wasn't worth the complexity for linking, mv'ing across mount points.
3. Separate name from contents
a. Name refers to an i-node number, not to a file directly
b. Expose implementation through links
4. QUESTION: why important?
a. Simplify name parsing in programs
b. Easy model to navigate from any one place to another (allows relative paths up and down)
1. Address space + kernel data structure + file descriptor table
2. System calls make it easy to spawn
v. Limited communication / synchronization mechanisms
1. Fork / wait
2. Pipes with child
vi. System shell exposing underlying kernel features
1. Fork / wait parallelism
3. Coroutine programming using messages / forks
vii. User IDs + root
1. Simple two level model
2. SetUID to amplify rights
3. QUESTION: why important?
a. Hard to get semi-privileged things right
b. Setuid makes it easy to have privileged subsystems as programs, e.g. login, passwd
c. Compare to Windows: no setuid – need trusted launcher or running process for trusted subsystem
d. No need for a separate mechanism for users to create their own trusted subsystems separate from the system (not possible on Windows easily)
1. Avoid problems that are hard to get right or require a lot of mechanism; e.g. hard links to directories, file usage quotas, moving files (or linking file) across mount points
2. Don't hide underlying mechanisms if they are useful
a. fork (easy to do based on context switching)
b. hard links (easy to do based on directory structure)
i. OS structure proposed by Unix
1. Two levels: kernel and user
2. Simple kernel for extensibility
3. Services implemented as setuid programs that run on demand
ii. File system
1. Layer of indirection between name and file – the inode
2. Metadata (but not name) stored in inode on file, not in directory
a. NOTE: is a layer of indirection between directory and file
b. Allows linking: one set of metadata
c. Slow to do ls -l
d. Makes charging hard – who pays, the directory owner or the file creator?
3. No file locking/ synchronization
a. QUESTION: What is the assumption here? Not much sharing
b. What can you do for safe updates?
i. Make a copy & then rename
ii. Have application-specific lock files (e.g. Emacs)
c. EXAMPLE OF Unix approach – keep kernel simple, make applications handle things
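The "make a copy & then rename" approach above can be sketched as follows. This is an illustration using modern Python calls, not the 1974 interface: `atomic_write` is a hypothetical helper name, and it assumes the temporary file and the target live on the same file system, so the rename never crosses a mount point.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace the contents of `path` with `data` without kernel file locking.

    Readers see either the old file or the new one, never a half-written
    mix, because the final step is a single rename.
    """
    dirname = os.path.dirname(os.path.abspath(path))
    # Create the copy next to the target so the rename stays on one file system.
    fd, tmp = tempfile.mkstemp(dir=dirname)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push the new contents to disk first
        os.replace(tmp, path)     # rename over the old name
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)        # do not leave the partial copy behind
        raise
```

The kernel provides only rename; the update policy lives entirely in the application, which is the trade-off the notes describe.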
4. I/O APIs make all I/O look synchronous, unbuffered
a. Relies on caching, write-behind in kernel for performance
b. No different APIs for sequential & random access – just seek (SUGGESTED by Multics)
c. All synchronous
i. QUESTION: is/was this a problem?
1. O.k. for small # of streams, but problem for networking on a server
5. Directories are also files, but can only be written by root
a. Directory entries contain name and inode number
b. Inode contains protection information, file statistics, reference count
i. QUESTION: what is result? Same access independent of path to file
c. Can only link to files on same disk
i. QUESTION: what problems? Transparency; disk boundaries arbitrary but visible to user (e.g. mv command)
d. Can check directory for consistency by looking at dir entries, inodes, blocks
e. QUESTION: what about performance?
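A small sketch of the name/inode separation described above: two directory entries for one inode share a single set of metadata, so access is the same whichever path is used. The helper name `link_demo` and the file names are hypothetical; the sketch assumes a file system that supports hard links.

```python
import os
import tempfile


def link_demo():
    """Create a file, hard-link it, and show that metadata lives in the inode."""
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a.txt")
        b = os.path.join(d, "b.txt")
        with open(a, "w") as f:
            f.write("hello")
        os.link(a, b)                    # second directory entry, same inode
        sa, sb = os.stat(a), os.stat(b)
        same_inode = sa.st_ino == sb.st_ino
        link_count = sa.st_nlink         # reference count kept in the inode
        os.chmod(a, 0o600)               # change protection through one name...
        mode_via_b = os.stat(b).st_mode & 0o777  # ...and observe it through the other
        os.unlink(a)                     # removes a name, not the file
        with open(b) as f:
            contents = f.read()
        return same_inode, link_count, mode_via_b, contents
```

Because the protection bits sit in the inode rather than in either directory entry, the `chmod` through one name is visible through the other, and the data survives until the last name is removed.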
1. QUESTION: How did unix do IPC? Why/ why not?
2. In general, no arbitrary communication between processes – no shared memory or messages or semaphores
3. Can use environment variables between processes
4. Can use shared files (but no locking!)
5. Can use signals to interrupt other processes
6. Related processes can use pipes, just like files
a. Combined with text data format, allows small programs to be combined into larger programs
b. No special communication api except pipe
c. Previously invented at Dartmouth, but not used
d. Specialized form of a co-routine
i. E.g. subroutine that does some work then yields and lets another run to do part of the work
e. ONLY MECHANISM FOR SYNCHRONIZATION (waiting for others) other than wait() for exit()
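A minimal sketch of a pipe between related processes, assuming a Unix-like system where `os.fork` is available. The function name `read_from_child` is hypothetical. The parent's blocking `read` is the synchronization: it waits until the child has written or exited.

```python
import os


def read_from_child(message):
    """Fork a child that writes `message` into a pipe; the parent reads it back."""
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: inherits both pipe ends, keeps only the write end.
        os.close(r)
        try:
            os.write(w, message)
        finally:
            os._exit(0)  # leave without running the parent's cleanup code
    # Parent: close the write end so read() returns EOF once the child exits.
    os.close(w)
    chunks = []
    while True:
        chunk = os.read(r, 4096)  # blocks until data arrives or EOF
        if not chunk:
            break
        chunks.append(chunk)
    os.close(r)
    os.waitpid(pid, 0)            # the other wait primitive: wait() for exit()
    return b"".join(chunks)
```

The pipe ends are ordinary file descriptors, which is why only a process and its descendants can share one: it has to be inherited across `fork`.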
a. Less focus on interactive vs. batch – one program can do both
iv. Process control: Fork / exec
1. QUESTION: why fork/exec?
a. History: had to swap out old process to run new process
b. Fork == leave copy in memory
c. Originally 27 lines of assembly
2. QUESTION: What is benefit?
a. Compare to CreateProcess (10 parameters)
i. Application name
ii. Command line
iii. Process ACL
iv. Thread ACL for first thread
v. Inherit Handle flag
vi. Creation Flags (11 flags regarding how processes are grouped)
vii. Environment pointer – environment variables
viii. Current directory string
ix. Startup info – 18 parameters
1. Windows size
2. Stdin, stdout, stderr handles
3. Desktop to create on
4. 9 flag values
x. Process information
1. Output value containing handles to new thread, process, process id, thread id
b. Fork allows you to control new process by running code before running exec
i. You only need to control yourself (e.g. close/open files, set environment) on Unix; on Windows you need to control others
c. General dichotomy:
i. Provide a hook to inject code to do whatever you want (Unix)
ii. Provide configuration options anticipating all possible needs (Windows)
d. General approach to doing things:
i. Unix: provide some code, e.g. shell script, or forked code, to set things up
ii. Windows: statically declare properties as parameters or name-value pairs (e.g. windows)
i. Windows allows more reasoning / control over what happens
ii. Unix allows more flexibility, compact representation – don't need to create flags to decide everything
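The "run code before exec" point above can be sketched like this, assuming a Unix-like system. The function name `run_redirected` is hypothetical. The child redirects its own standard output and then calls exec; no flag or parameter describing redirection is passed to any process-creation call.

```python
import os


def run_redirected(argv, out_path):
    """Run `argv` with stdout sent to `out_path`; return the exit status."""
    pid = os.fork()
    if pid == 0:
        # Child: ordinary code runs here, in the new process, before exec.
        try:
            fd = os.open(out_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
            os.dup2(fd, 1)               # make descriptor 1 (stdout) refer to the file
            os.close(fd)
            os.execvp(argv[0], argv)     # replaces this process image on success
        finally:
            os._exit(127)                # reached only if setup or exec failed
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
```

Anything the child can do to itself (close files, change directory, set environment variables) becomes a process-creation option for free, which is the contrast with the long CreateProcess parameter list.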
v. Protection: ACLs
1. Owner checked first, then group, then everyone
a. Allows denying a group – stops after first match
b. QUESTION: How much flexibility does this add?
i. A lot. Can make a group to give special access
ii. Not much – can only give special access to one group
iii. How limiting is this? How often do you want different access for user, two groups, and everyone else?
2. Owner has special rights to change ACLs
3. Can't give ownership away
4. Where is ACL/user/group stored – on directory or on inode? Should be in inode
5. Single superuser: ROOT IS EXCLUDED from protection checks
a. QUESTION: Is this a problem? What about sharing administration tasks?
b. A: use setuid programs & execute permission
a. Like entry capabilities, templates
i. Who should be able to use it?
ii. How should it be controlled?
iii. How can it be misused? What if you put it on the wrong program?
1. E.g. get administrator to run a file which puts SETUID on your file
iv. Need to ensure program can't be subverted, e.g. crashed at wrong time; it must verify all inputs
7. QUESTION: Is this enough? What can and can't it do?
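A sketch of the "owner checked first, then group, then everyone" rule from the protection notes above, written as a pure function over the nine permission bits. The name `allowed` and its parameters are hypothetical, and the superuser rule is simplified (real kernels treat execute permission specially).

```python
READ, WRITE, EXEC = 4, 2, 1


def allowed(mode, file_uid, file_gid, uid, gids, want):
    """Return True if user `uid` (member of groups `gids`) may do `want`.

    `mode` holds the permission bits, e.g. 0o640. Exactly one class of
    bits is consulted: the check stops at the first class that matches.
    """
    if uid == 0:
        return True                  # single superuser bypasses the check
    if uid == file_uid:
        bits = (mode >> 6) & 7       # owner bits
    elif file_gid in gids:
        bits = (mode >> 3) & 7       # group bits
    else:
        bits = mode & 7              # everyone else
    return (bits & want) == want
```

With mode `0o604`, a member of the file's group is denied read access even though everyone else is granted it, because the check stops at the group class. That is how first-match ordering allows denying a group.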
1. Previous systems: one process/user
a. Would replace shell to run command.
b. Exit command would relaunch shell
2. Runs commands, searches default path
a. Problem if path includes local directory
3. Simple extension / unification with underlying kernel primitives
a. E.g. & -> no wait called, ; -> wait called
4. I/O redirection allows batch operation or interaction of user programs w/o their involvement
5. Previous systems: one process per terminal; shell exits to run program and then restarts
6. Big ideas: commands as binary operators, taking an input and producing an output.
a. Allows executing them in sequence with pipes
b. Allows redirecting them for intermediate storage, batching
c. COMMENT: at time, one-input and one-output seemed too confining
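A sketch of what the shell does for a two-stage pipeline, using Python's standard `subprocess` module rather than the original shell's code. The function name `pipeline` is hypothetical. Neither command knows it is part of a pipeline; each just reads standard input and writes standard output.

```python
import subprocess


def pipeline(left, right):
    """Run the equivalent of `left | right` and return the final output bytes."""
    p1 = subprocess.Popen(left, stdout=subprocess.PIPE)
    p2 = subprocess.Popen(right, stdin=p1.stdout, stdout=subprocess.PIPE)
    p1.stdout.close()        # let p1 see a broken pipe if p2 exits early
    out, _ = p2.communicate()
    p1.wait()
    return out
```

For instance, `pipeline(["echo", "hello"], ["tr", "a-z", "A-Z"])` composes two small text tools, assuming `echo` and `tr` are on the search path.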
i. Many people quoted paper – smoothly processing a stream of programs. What was the real problem?
i. People read a lot into what was said
ii. E.g. preemption in scheduling. This was never stated – just context switching
iii. People missed a big idea - layering
i. Didn't solve deadlock. Is this a flaw or a limitation?
ii. Paper is not just a technical description, but an experience report – random bits of information. Is this information useful to anyone? If so, how should it be dispensed?
i. A: How do you build an OS?
ii. A: What are the right abstractions / organization for an OS?
i. hard to handle interrupts: save, restore information, make sure you don't access things in an interrupt you shouldn't
ii. hard to manage memory manually (but people did it!)
iii. Each layer gets to run on a virtual machine that removes one element of hardware and replaces it with a software abstraction
i. QUESTION: why layers?
1. Easy to reason about – you can only communicate with layers above/below
2. Logical: can provide an abstract machine to higher levels
3. Problem: what goes in the layer, what order?
4. NOTE: can also consider these as abstract modules, organized for now into layers
ii. 0 = processes and context switching
a. System is a set of sequential processes at undefined speed ratios, use semaphores for synchronization
i. Delaying a process can't affect its correctness
ii. QUESTION: is this a realistic model? Is it limited? What does it rule out? (e.g. unprotected reading/writing)
b. Q: What does this mean? How does it help?
c. A: Data doesn't disappear if you don't pick it up: buffering/blocking input and output
i. We take this for granted
ii. Previously, had to time code so that picked up data before next piece arrived; very sensitive to changes in timing of HW.
d. Impact: no synchronization that is not explicit, via timing. You always wait for things to happen
2. Process real-time clock interrupts
3. Hide multiple processors (should they exist!)
4. Handle processor allocation / context switching
5. Provides: virtual machine for sequential processes
6. Supported 5 user processes, 10 i/o processes
a. System processes structured as cyclic processes (producer/consumer):
i. wait for input
iii. produce output
b. To communicate:
i. Provide input
ii. wait for output
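The cyclic-process pattern above (wait for input, produce output; clients provide input, then wait for output) can be sketched with counting semaphores. This uses Python threads in place of THE's sequential processes, and the names `BoundedBuffer` and `run_cycle` are hypothetical. All waiting is explicit, so correctness does not depend on relative speeds.

```python
import threading
from collections import deque


class BoundedBuffer:
    """Buffer guarded by three semaphores (P = acquire, V = release)."""

    def __init__(self, capacity):
        self.items = deque()
        self.mutex = threading.Semaphore(1)        # mutual exclusion
        self.free = threading.Semaphore(capacity)  # empty slots
        self.filled = threading.Semaphore(0)       # items available

    def put(self, item):
        self.free.acquire()      # wait for a free slot
        self.mutex.acquire()
        self.items.append(item)
        self.mutex.release()
        self.filled.release()    # signal: one more item

    def get(self):
        self.filled.acquire()    # wait for an item
        self.mutex.acquire()
        item = self.items.popleft()
        self.mutex.release()
        self.free.release()      # signal: one more free slot
        return item


def run_cycle(inputs, capacity=2):
    """Feed inputs to a cyclic doubling process and collect its outputs."""
    inbox, outbox = BoundedBuffer(capacity), BoundedBuffer(capacity)

    def cyclic_process():
        while True:
            item = inbox.get()        # wait for input
            if item is None:          # sentinel: stop the cycle
                return
            outbox.put(item * 2)      # produce output

    worker = threading.Thread(target=cyclic_process)
    worker.start()
    results = []
    for x in inputs:
        inbox.put(x)                  # provide input
        results.append(outbox.get())  # wait for output
    inbox.put(None)
    worker.join()
    return results
```

Inserting delays anywhere in either thread changes only how long the run takes, not the results, which matches the "undefined speed ratios" model.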
iii. 1 = virtual memory – "segment controller"
a. Pages = unit of moving memory
b. Segments = unit of information
c. Separate, large segment address space
d. Segment variable in the core identifies whether segment is in core or on drum
i. Addresses independent of memory or disk address; gives flexibility
ii. Does not need consecutive allocation
e. Principle here: virtualizing a resource provides flexibility, hides details from upper levels
i. Is key technique – adding a layer of indirection for flexibility, scheduling, policy
2. synchronized access to drum/disk
3. provides virtual address space, automatic swapping
4. provides virtual machine with large virtual address space
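A toy sketch of the indirection the segment controller provides: callers name a page by (segment, page number) and never learn whether it currently sits in core or on the drum. The class name `SegmentController`, the FIFO eviction policy, and the dictionary-based "drum" are assumptions made for illustration, not details taken from the THE paper.

```python
class SegmentController:
    """Maps (segment, page) names to data held either in core or on a drum."""

    def __init__(self, core_frames):
        self.core_frames = core_frames  # how many pages fit in core
        self.core = {}                  # page name -> data (insertion-ordered)
        self.drum = {}                  # page name -> data

    def _ensure_in_core(self, page):
        if page in self.core:
            return
        if len(self.core) >= self.core_frames:
            victim = next(iter(self.core))          # assumed policy: evict oldest
            self.drum[victim] = self.core.pop(victim)
        # Bring the page in from the drum (None if it was never written).
        self.core[page] = self.drum.pop(page, None)

    def write(self, page, value):
        self._ensure_in_core(page)
        self.core[page] = value

    def read(self, page):
        self._ensure_in_core(page)
        return self.core[page]

    def where(self, page):
        """Expose the 'segment variable' for inspection only."""
        return "core" if page in self.core else "drum"
```

Upper layers call only `read` and `write`; swapping happens underneath, so the eviction policy could be changed without touching them.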
iv. 2 = virtual console
1. Handles connection of console keyboard to a process
2. Requires naming of process to be communicated with, on a "conversation" basis
3. Does message routing based on name of conversation,
4. Provides virtual private console (/dev/tty) to next level
5. Can be swapped out, because above segment controller layer
v. 3 = virtual devices
1. I/O devices are also sequential devices with synchronization
a. Hides timing details as much as possible
2. I/O devices abstracted as buffered input and unbuffered output streams.
3. I/O devices presented as two sequential processes (buffered input, so can read asynchronous) and unbuffered output
4. Above message interpreter so can send error messages to operator (e.g. load tape)
vi. 4 = user programs
vii. 5 = operator?
1. Where would networking go? What if you want to swap segments over a network?
2. Are these the only layers? What could you invert? E.g. move virtual devices under segment controller
3. Could you invert these layers? E.g. put virtual devices under virtual console?
4. Segment controller can't use virtual devices.
i. Each layer hides lower layer from upper layer
ii. Reduces potential interactions of n layers of m components from (nm)^2 to n*(m^2)
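A worked instance of the interaction count above, taking n = 6 layers (levels 0 through 5) and, purely as an assumed figure, m = 10 components per layer:

```latex
\begin{align*}
\text{unstructured: } (nm)^2 &= (6 \cdot 10)^2 = 3600 \\
\text{layered: } n \cdot m^2 &= 6 \cdot 10^2 = 600 \\
\frac{(nm)^2}{n \, m^2} &= n = 6
\end{align*}
```

The saving grows linearly with the number of layers, whatever m is.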
i. Using semaphores for mutual exclusion
1. Allows reasoning about concurrency
2. Compared to other structures, allows great flexibility in synchronization
a. E.g. synchronous programming with timers for polling
i. Test all possible combinations of inputs to a layer to verify correctness
i. Addresses truly virtual, not physical for disk or memory. Allows flexibility
i. Multi programmed, reentrant code hard to reason about.
ii. Even today's program verifiers can't really handle it
i. Timing doesn't matter
i. Is done today – sw firm in UK codes in a language that provides verification
ii. Several formally verified OS exist today
iii. Key: requires design for verification, not design then verification
i. hard to move functionality between layers
ii. Hard to add new functionality: what layer does a network go in? What if you want to do swapping over it?
i. No mention of it, how to track it between layers.
ii. Semaphore construct requires "harmonious" cooperation; not realistic
i. Hard to optimize across layers if data is hidden.