* What operation cannot be made idempotent?
  - mkdir: if the first attempt succeeded but the reply was lost, the retry fails with "directory exists"
* File locking in NFS?
  - No: the server does not want to keep state
  - So file contents can be intermixed if clients write at the same time
* Another problem associated with remote, open files is that access permissions on the file can change while the file is open. How does NFS solve this?
  - "In the local case the access permission is only checked when the file is opened, but in the remote case permission is checked on every NFS call. This means that if a client program opens a file, then changes the permission bits so that it no longer has read permission, a subsequent read request will fail. To get around this problem we save the client credentials in the file table at open time, and use them in later file access requests."

**SUN NETWORK FILE SYSTEM**
===========================

# NOTE: the main goal of NFS is not scalability but easy recovery
# So why does NFS limit scalability?
- cache consistency: clients must frequently check with the server whether their cache is still valid
- server write semantics: writes must be committed to stable storage before the server returns
- security/locking (addressed later, in AFS)

# 0. Takeaways

0) Transparency: applications still use the POSIX interface
   - achieved by introducing the VFS layer

1) NFS focuses on simple and fast server crash recovery
   - Server: stateless, to simplify crash recovery
     + the server does not track anything about what clients are doing
     + every protocol request contains *all information* needed to complete that request
     + no shared state between server and client; otherwise recovery gets complicated
     + e.g., no in-memory file descriptor numbers; use a *file handle* instead:
       > contains the volume identifier, the inode number, and a generation number
       > lets the server locate the file
       > the generation number prevents a client from accidentally accessing a newly allocated file that reuses an old inode number
   - What the client keeps track of:
     + the mapping fd --> file handle
     + the current offset within the file
   - Handling server failure with *idempotent* operations (see the sketch below)
     + the client simply *retries* the request on timeout
       Why is that enough? Because the request is *idempotent*: executing it twice has the same result as executing it once. Simple, huh?
     + e.g., reads and writes carry an explicit offset, so replaying them yields the same result
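To make the retry policy concrete, here is a minimal sketch in C of an idempotent-read client loop. This is not Sun's code: `nfs_read_rpc`, `read_with_retry`, and the exact `file_handle` fields are hypothetical stand-ins (the real handle layout is opaque to clients). The point is that because the request names the file by handle and carries an explicit offset, sending the identical request again after a timeout is always safe.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Everything the server needs to locate the file -- no per-client state.
 * Field names are illustrative, not the real wire format. */
struct file_handle {
    uint32_t volume_id;    /* which exported volume */
    uint64_t inode_num;    /* which inode on that volume */
    uint32_t generation;   /* detects inode reuse (see server-side notes) */
};

/* Hypothetical stub standing in for the real RPC layer.
 * Returns bytes read, -1 on a real error, -2 on timeout. */
static int nfs_read_rpc(const struct file_handle *fh, uint64_t offset,
                        void *buf, size_t count)
{
    (void)fh; (void)offset;
    memset(buf, 0, count);          /* pretend the server replied */
    return (int)count;
}

/* The entire client-side recovery policy: on timeout, send the exact same
 * request again.  Whether the server crashed and rebooted or the reply was
 * merely lost, the retry is harmless because re-executing an idempotent
 * request yields the same result. */
static int read_with_retry(const struct file_handle *fh, uint64_t offset,
                           void *buf, size_t count, int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        int n = nfs_read_rpc(fh, offset, buf, count);
        if (n != -2)
            return n;               /* success, or an error worth reporting */
        /* timeout: just loop and retry the identical request */
    }
    return -1;                      /* give up after max_tries timeouts */
}

int main(void)
{
    struct file_handle fh = { .volume_id = 1, .inode_num = 42, .generation = 7 };
    char buf[4096];
    return read_with_retry(&fh, 0, buf, sizeof buf, 3) < 0;
}
```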
2) Client-side caching
   - OK: so far we have a simple design, easy recovery, and a failure-handling policy.
     But what about performance? How do we improve it?
     ==> Our old friend: caching at the client side.
   - What to cache?
     + data and metadata, so reads can be served from client memory
     + temporary write buffering
       + decouples application write() latency from the actual write
         (from the app's view, write() completes immediately; the real data moves to the server later)
   - Problem with client-side caching? *cache consistency*
     + e.g.:
       + client 1 reads a file and caches its contents (v1)
       + client 2 overwrites that file, and its write is buffered at client 2 (v2)
       + client 3 reads that file; it may receive v1 or v2, depending on whether
         client 2's update has propagated to the server yet
     ==> Hence two interesting sub-problems:
       + *update visibility*: when does an update from one client become visible to others?
         ==> solution: flush-on-close
       + *stale cache*: once client 2 flushes its changes to the server, client 1 holds a stale cache
         ==> solution: call getattr before using cached contents
     This is a design flaw: NFS does NOT design for the common case
     (the common case is that only one client accesses a given file; AFS fixes this)
   NFS SOLUTIONS:
   - flush-on-close semantics: updates become visible when the client closes the file
     Hence another node that open()s the file after the close will see the latest version
     ==> but performance still suffers for short-lived, temporary files
       (a client creates one and deletes it shortly after, yet the data must still go to the server)
   - call getattr to check whether the file has changed before using cached contents
     ==> new problem: getattr flooding
     Solution: an *attribute cache* whose entries time out every 3 seconds;
       within that window, just use the cached attributes without contacting the server
       (see the attribute-cache sketch at the end of these notes)
     ==> this still suffers an *inconsistency window*: another client may flush changes
       to the same file to the server during that window

3) Server side
   - cache reads in memory, but what to cache?
     + data: less useful to cache, since clients cache it anyway
     + metadata of the local server-side file system: useful to cache, because it is not cached at the clients
   - writes must be committed to disk before returning success to the client
     ==> hence a performance problem (solution: a battery-backed write cache, as in NetApp filers)
     + akin to the small-file problem: one small change requires updating the inode,
       the allocation bitmaps, and the data itself before returning
     + NetApp's solution: LFS-style writing -- buffer writes in non-volatile memory,
       then write whole chunks to disk, or use RAID-5
   - One tricky thing: the server uses its local file system to store data, and it hands
     clients file handles that embed an inode number.
     ==> What if an inode is removed and reused? When an old file handle comes back in,
       the server must be able to tell that this inode number now refers to a different file.
     Solution: add an inode *generation number* to the file handle (which requires changing
       the local file system to support this feature) -- see the sketch below.
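Here is a minimal sketch of that generation-number check, with hypothetical structures and a toy in-memory inode table standing in for the server's local file system (`local_fs_get_inode`, `resolve_handle`, and the field names are all made up for illustration). The server compares the generation stored in the incoming handle against the generation currently recorded in the inode; a mismatch means the inode was reallocated since the handle was issued, so the request is rejected (a stale-handle error, ESTALE in real NFS) instead of silently operating on the wrong file.

```c
#include <stddef.h>
#include <stdint.h>

struct inode {
    uint32_t generation;   /* bumped by the local FS each time this
                              inode number is reallocated to a new file */
    /* ... other on-disk fields elided ... */
};

struct file_handle {
    uint32_t volume_id;
    uint64_t inode_num;
    uint32_t generation;   /* copied from the inode when the handle was issued */
};

/* Toy stand-in for the server's local file system. */
static struct inode inode_table[8];

static struct inode *local_fs_get_inode(uint32_t volume_id, uint64_t inode_num)
{
    (void)volume_id;
    return inode_num < 8 ? &inode_table[inode_num] : NULL;
}

/* Returns NULL (=> report a stale-handle error to the client) when the
 * inode number in the handle now refers to a different file. */
static struct inode *resolve_handle(const struct file_handle *fh)
{
    struct inode *ino = local_fs_get_inode(fh->volume_id, fh->inode_num);
    if (ino == NULL || ino->generation != fh->generation)
        return NULL;
    return ino;
}

int main(void)
{
    struct file_handle fh = { .volume_id = 1, .inode_num = 3, .generation = 0 };
    struct inode *ok = resolve_handle(&fh);      /* matches: generation 0 */
    inode_table[3].generation++;                 /* inode 3 gets reused */
    struct inode *stale = resolve_handle(&fh);   /* now NULL: stale handle */
    return (ok != NULL && stale == NULL) ? 0 : 1;
}
```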
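And going back to the attribute cache from the client-caching section: a minimal sketch, again with hypothetical names (`getattr_rpc`, `cached_getattr`, `data_cache_valid`, and the 3-second `ATTR_TIMEOUT` taken from the notes above). It shows both halves of the behavior: getattr results are reused for up to 3 seconds to avoid flooding the server, which is exactly why a remote update that lands inside the window goes unnoticed until the entry expires.

```c
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

struct attrs {
    uint64_t size;
    time_t   mtime;        /* last modification time at the server */
};

struct attr_cache_entry {
    struct attrs attrs;    /* attributes from the last getattr */
    time_t fetched_at;     /* when we last asked the server */
};

#define ATTR_TIMEOUT 3     /* seconds: entries older than this are refetched */

/* Hypothetical stub for the getattr RPC to the server. */
static struct attrs getattr_rpc(void)
{
    struct attrs a = { .size = 0, .mtime = 0 };
    return a;
}

/* Only contacts the server when the cached entry has expired; any update
 * another client flushes during the window is invisible until then. */
static struct attrs cached_getattr(struct attr_cache_entry *e)
{
    time_t now = time(NULL);
    if (now - e->fetched_at >= ATTR_TIMEOUT) {
        e->attrs = getattr_rpc();      /* refresh from the server */
        e->fetched_at = now;
    }
    return e->attrs;                   /* within the window: trust the cache */
}

/* Before serving file data from the client cache, check that the server's
 * mtime still matches the mtime recorded when the data was cached. */
static bool data_cache_valid(struct attr_cache_entry *e, time_t cached_mtime)
{
    return cached_getattr(e).mtime == cached_mtime;
}

int main(void)
{
    struct attr_cache_entry e = { .fetched_at = 0 };   /* stale entry */
    return data_cache_valid(&e, 0) ? 0 : 1;            /* forces a refetch */
}
```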