QUESTIONS

Transparency is provided by?
- Wrappers
- Object trackers

Why not put device drivers in user space?
- backward compatibility: device drivers are currently written with the assumption that the driver is trusted and will run in kernel space
- performance
  + if the driver reads kernel memory a lot --> good
  + if the driver writes a lot --> pays the cost (XPC, sync, copy)
  + even worse if we put the device driver at user level

Weakness of Nooks?
- deals with buggy extensions, not malicious ones
- cannot recover in some cases:
  + infinite loop
  + produces wrong result
  + ...
- recovery is limited:
  + only kill and restart, which works for most stateless drivers
  + may not work for some. For example, VFAT recovery may corrupt persistent data structures.
    Solution: sync before releasing resources (app specific)
- overhead (XPC, page table, object tracking ...)
  + NOTE: In the experiment section, performance varies depending on which device drivers are isolated ==> the overhead mostly comes from XPC...

Compare Nooks with other papers?
  + Efficient software ...
    "These technologies are attractive and might replace or augment some of Nooks’ isolation techniques. Nevertheless, in their proposed form, they deal only with the isolation problem, leaving unsolved the problems of transparent integration and recovery"

**IMPROVING the RELIABILITY of COMMODITY OPERATING SYSTEMS**
============================================================

# Useful summary:
- http://pages.cs.wisc.edu/~swift/papers/sosp-present.pdf
- http://pages.cs.wisc.edu/~swift/papers/acm-talk.pdf
- www.cs.fsu.edu/~awang/courses/cis6935_s2004/nooks.ppt

Goal of Nooks
- backward compatibility (rather than proposing a new architecture) ==> therefore needs transparency
- efficient
- recovery (which microkernels do badly)

Nooks
vs. Capability-based systems
- isolates using VM
- supports recovery (which is not mentioned in capability-based systems)
vs. microkernel
- same: uses separate address spaces
- deals with recovery:
  + microkernel: rebooting is often the way to recover from a crashed service
vs. Virtual Machine
- same: uses virtual memory for protection
- diff: a virtual machine does not deal with a faulty component in the VMM

# See Remzi notes for more

# Goals
- backward compatibility
  + target existing commodity systems
  + minimal modification to OSes and device drivers
- efficiency
  + performance overhead should be small
  + tolerate as many failures as possible

# Core principles
- Design for fault resistance, not fault tolerance
  + not seeking a complete solution for all possible extension errors
  + but eliminating most of them will substantially improve system reliability
- Design for mistakes, not abuse
  + i.e. does not deal with malicious device drivers

# High level idea
- put device drivers in lightweight protection domains
  + hence device drivers cannot corrupt kernel data
- calls between the kernel and device drivers are carefully checked and interposed using XPC:
  + check parameters
  + check data structures that may be modified
  + queue for performance
- each device driver has its own copy of the kernel data structures it needs to modify
  + needs copy and sync
- device drivers are buggy but not malicious

# Cons
- not completely fault tolerant
  + tolerates buggy code, but not malicious code
- overhead
  + space:
    ~ page tables for every device driver
    ~ local heap and stacks for each device driver
    ~ copies of kernel objects modified by each device driver
  + time:
    ~ a domain switch (kernel <--> device driver) requires a TLB flush (on x86), followed by TLB misses
    ~ synchronization of modified objects

# Functions

Isolation
---------
- each driver executes in its own lightweight kernel protection domain
  + write access to a limited portion of the kernel address space
- major tasks:
  + provide protection domain management, using page tables
    ~ create
    ~ manipulate
    ~ maintain
  + inter-domain control transfer
    ~ handled by XPC (Extension Procedure Call)
      • vs. LRPC: LRPC handles control and data transfer between mutually distrustful peers. XPC occurs between trusted domains but is asymmetric (i.e. the kernel has more rights to the extension's domain ...)
- lightweight protection domains:
  + provided by page tables with different access rights
    ~ the kernel can read/write anywhere in the address space
    ~ a device driver has read/write access to its own domain, but only read access to other extensions' domains and the kernel address space
  + each domain has
    ~ a synchronized copy of the kernel page table
    ~ a private domain-local heap
    ~ a private pool of stacks
    ~ memory-mapped I/O regions
    ~ kernel memory buffers (socket buffers, I/O blocks)
  NOTE: a malicious device driver can:
    1. reload the hardware page table base register (to change the page table)
    2. read kernel data structures
    3. build a new page table with write access to the kernel and switch to it ...
- Inter-domain control transfer
  + handled by XPC
  + transparent to both kernel and extensions (by using wrappers)
  + steps (handled by XPC):
    1. save the caller's context on the stack
    2. find a stack for the calling domain
    3. change the page table (i.e. the protection domain)
    4. call the function
    This is costly, hence queued/deferred XPC, e.g. batch transfer of message packets from the network driver to the kernel

Interposition
-------------
- Nooks interposes wrapper stubs between extensions and the kernel
  + provide transparency
  + enable control and data transfer in both directions
- Device drivers sometimes want to write to some kernel data structures; 2 ways:
  + for non-performance-critical object modifications, use XPC into the kernel
    ~ slow, good for infrequent use
    ~ needs to go back and forth
  + for performance-critical updates, create a shadow copy of the kernel object
    ~ modify this shadow copy in the extension domain
    ~ synchronize later, before and after XPCs into the extension
    ~ faster where updates are frequent (no need to go back and forth)

QUESTION: how to know which kernel objects an extension can modify?
A: Domain knowledge: change macros and inline functions that directly modify kernel objects into wrapped function calls.

Wrappers
--------
- Check parameters for validity. What to check?
  + pointer validity
  NOTE: this is not a perfect check; malicious drivers can still pass bad parameters
- Create copies of kernel objects
  + vs. LRPC: direct copy
  + no marshaling or unmarshaling
  + Why? Because drivers and the kernel share the kernel address space
- XPC into the kernel or extensions
NOTE:
- writing wrappers requires domain knowledge
- can use compiler techniques to generate wrapper skeletons

Synchronization
---------------
- need to synchronize copies of kernel objects with the originals
- How?
  + for simple objects, the synchronization code is in the wrappers
  + for complex objects, need to write explicit synchronization routines
- When?
  + before and after XPCs into the extension

Object trackers
---------------
- Object = kernel data structure accessed through a pointer
- record the addresses of all objects that extensions use
- Trackers help garbage collection:
  + when an extension crashes
- Tracking object lifetime also requires domain knowledge
  + some objects live only during the XPC call
  + some are explicitly allocated/deallocated by extensions

Failure Detection and Failure Recovery
--------------------------------------
- What types of failure can be detected?
  Y: Invalid memory access
  Y: Livelock
  Y: Invalid parameters
  N: produces wrong result
  N: infinite loop (timeout, hard to deal with)
- Recovery
  1. may disable interrupts for the device controlled by the extension, to prevent livelock
  2. may use specific recovery steps for each extension, but by default:
     + unload the device driver
     + cleanup: release all of its kernel and physical resources
     + reload and restart (this works for most device drivers, because they are stateless)
     For stateful extensions this may not work.
  3. signals tasks that are currently executing within the extension, or have called through the extension, to unwind
     NOTE: if a task is in a non-interruptible state and the sleeping task never wakes --> recovery is incomplete (but this is rare)
  NOTE: The recovery manager also releases, frees, or unregisters all objects that the object trackers keep track of. Each object type may have a recovery function.
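
Sketch of the default recovery path:
The notes above say that the object tracker records every kernel object an extension uses, and that the default recovery is unload, release everything tracked, then reload. The Python below is only a toy user-space simulation of that sequence, not Nooks code: the names `ObjectTracker`, `Extension`, `recover`, and `call_extension` are hypothetical, a Python exception stands in for a detected fault (e.g. an invalid memory access), and a per-object callback stands in for the per-type recovery function.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical type: a per-object recovery function that frees one object.
ReleaseFn = Callable[[int], None]


class ObjectTracker:
    """Records the address of every kernel object an extension is using."""

    def __init__(self) -> None:
        self.live: Dict[int, ReleaseFn] = {}

    def track(self, addr: int, release: ReleaseFn) -> None:
        self.live[addr] = release

    def untrack(self, addr: int) -> None:
        self.live.pop(addr, None)


class Extension:
    """Stand-in for one isolated device driver (assumed stateless)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.loaded = True
        self.tracker = ObjectTracker()


def recover(ext: Extension, log: List[str]) -> None:
    """Default recovery: unload, release all tracked objects, reload."""
    ext.loaded = False
    log.append(f"unload {ext.name}")
    # Copy the items first, because untrack() mutates the dictionary.
    for addr, release in list(ext.tracker.live.items()):
        release(addr)
        ext.tracker.untrack(addr)
    log.append(f"cleanup {ext.name}")
    ext.loaded = True
    log.append(f"reload {ext.name}")


def call_extension(ext: Extension, fn: Callable[[], object],
                   log: List[str]) -> Optional[object]:
    """Run fn as if it were a call into the extension.

    Any exception is treated as a detected failure and triggers recovery.
    Wrong results and infinite loops are not caught, as in the notes.
    """
    try:
        return fn()
    except Exception as fault:
        log.append(f"fault in {ext.name}: {fault}")
        recover(ext, log)
        return None


if __name__ == "__main__":
    events: List[str] = []
    freed: List[int] = []
    driver = Extension("demo_driver")  # hypothetical driver name
    driver.tracker.track(0x1000, freed.append)

    def buggy_entry_point() -> None:
        raise MemoryError("write outside domain")

    call_extension(driver, buggy_entry_point, events)
    print(events)
    print([hex(a) for a in freed])
```

The sketch also shows why this only suits stateless drivers: after `recover` the extension holds no objects at all, so any state it had built up is gone.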