12/12 -- When can errors occur? It is best to test each case directly.
No file exists: use your library to ...
open 'foo' O_WRONLY, write out 1 block, close, open, read
open 'foo' O_WRONLY, write 1, close, corrupt 1 block, open, read
open 'foo' O_WRONLY, write 10, close, corrupt 1 in middle, open, read all
open 'foo' O_WRONLY, write 10, close, corrupt 1 in middle, open, lseek to "bad" block, read 1 block
open 'foo' O_WRONLY, write 10, close, corrupt 2 in middle, open, read all (should fail)
open 'foo' O_WRONLY, write 10, close, corrupt 2 in middle, open, lseek to "bad" block, read 1 block (should fail)
open/write/close 'foo' (size 10), open, lseek to middle block, write 1, close, open and read back all
open/write/close 'foo' (size 10), open, lseek to middle block, write 1, close, corrupt 1 other block, open and read back all
Files kept open a long time: use your library to ...
open 10 block file, corrupt 1 block, read it, close
open 10 block file, corrupt 3 blocks, read them (should fail), close
open 10 block file, corrupt 1 block, read it, corrupt another, close
Truncating/Unlinking: use your library to ...
open/write/close 'foo' size 10 blocks, unlink, try to open it (should fail)
open/write/close foo size 10, truncate to size 0, open and read all
open/write/close foo size 10, truncate to size 1 block, open and read all
open/write/close foo size 10, truncate to size 3 blocks, corrupt 1 block, open and read all
open/write/close foo size 10, truncate to size 3 blocks, corrupt 2 blocks, open and read all (should fail)
Scale tests: use your library to ...
open/write/close 100 files of size 1 block, read them all back
open/write/close 10000 files of size 1 block, read them all back
open/write/close 100 files of size 10000 blocks, read them all back
open 100 files, write 1 block to each, close 100 files, read them all back
open 1000 files, write 1 block to each, close 1000 files, read all back
create 10000 files of size 1 block (this time without your library), read them all (with your library)
open/write/close 10000 files of size 3 blocks, corrupt 1 block of each, read them all back
open/write/close 10000 files of size 3 blocks, corrupt 2 blocks of each, read them all back (should all fail)
open 1 10-block file 10 times, read same file through each descriptor
open 1 10-block file 10 times, read/write file through each descriptor
File 'foo' (size 10 blocks) exists but has never been accessed through your library:
open 'foo', read all 1 block @ a time, close
open 'foo', read all, close, corrupt 1 block of 'foo', read all again
open 'foo', read all, close, corrupt 1 block, read just that block
open 'foo', read all, close, corrupt 2 blocks, try to read (should fail)
open 'foo', read all, 2 blocks @ a time
Left to you to test without guidance: error cases ...
e.g., open a file O_WRONLY (write only), try to read a block (all using your library), or run with FSPROTECT_HOME not set, or ...
12/6 -- One more simplification:
For truncate() and
ftruncate(), the file can only be shortened (assume it won't
be lengthened), and it can only be shortened to a size that
is a multiple of 4KB.
12/5 -- Complications:
Realize that a file (or its contents) can be deleted
with calls such as unlink(), open() with the O_TRUNC flag,
and the truncate() and ftruncate() calls. Your library should
handle such file deletion correctly.
12/5 -- Simplifications:
To make your life easier, you don't
have to handle files that are not a multiple of 4KB in size
(though you may wish to add checks to make sure the files are
as you expect). You also don't have to worry about reads and
writes that are not 4KB-aligned -- in other words, all reads/writes
to files that you are managing will be a multiple of 4KB in
size and will start at an offset that is a multiple of 4KB. Both of
these simplifications should remove a number of corner cases
from your code.
12/1 -- Dealing with open():
Open is a bit of a pain to deal
with because it takes a variable number of arguments. To
handle it, you should check out the following code example
here.
Objectives:
To learn more about file systems
To learn about checksums
To learn about RAID protection schemes
To learn how interposition works
How are you going to do it? Well, that's where things get interesting.
You are going to develop a file system library that interposes on
all important file-system related calls and uses checksums and parity
to ensure a higher level of data protection.
Files should be treated as a sequence of 4 KB blocks. Then, for each file
that you access, you should compute and store a parity block for that file.
Hence, each file should have a 4 KB parity block associated with it. If a
block is later determined to be "bad" (e.g., not accessible or corrupted),
you should then be able to transparently "reconstruct" the block from the
other blocks of the file plus the parity block.
This leaves a problem: how do we tell when a file block has gone bad?
Checksums are the answer. Specifically, for each 4KB block of the file,
you should compute and store an MD5 hash of that block. Later, when reading
the file back, you should read the MD5 hash too, and compare it to the hash
of the block you just read. If they don't match, you have a "bad" block,
and hence you need to use the parity block to reconstruct this block. If
you have more than one bad block, you are out of luck -- reads to the bad
blocks should simply return an error (return -1 on read and set errno to EIO).
So far we haven't said how you are going to get access to those file system
open(), read(), write(), and other relevant calls. One way would simply be
be to re-write the file system itself -- but that is a lot of work, and
requires anyone who wants this feature to use your new file system. Instead,
you will be building a dynamically-linked library (again!) which interposes
on important system calls in order to allow you to do the extra work you
need to do. You will call this library "libfsprotect.so" and it will add
checksums and parity to whatever file system you use it upon.
That's about it! Now, for some details.
For more information about md5, you might check out the following:
The unofficial MD5 homepage
Source code example (as above)
To do this, you will use the LD_PRELOAD functionality provided by the
dynamic linker (read the man page for more information ("man ld.so")).
For example, let's say you wanted to be passed control every time
the system call "close" was called by a process.
You would first build your library to define its own close() routine.
Here
is a simple example.
You then need to build the library, just like you would build a typical
dynamically-linked library.
prompt> gcc -shared -o libfsprotect.so -fpic -Wall libfsprotect.c
To "interpose" on the close() routine, use LD_PRELOAD as follows:
prompt> setenv LD_PRELOAD "./libfsprotect.so"
Then, simply run something that you know opens (and then closes) a file:
prompt> cat /dev/null
closing fd: 3
prompt>
You can also use "ldd" to find out which libraries an executable
is currently linked with:
prompt> ldd /s/std/bin/cat
./libfsprotect.so => ./libfsprotect.so (0xb75e6000)
libc.so.6 => /lib/tls/libc.so.6 (0xb749a000)
libdl.so.2 => /lib/libdl.so.2 (0xb7497000)
/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0xb75e9000)
prompt>
To turn off this interposition, simply unset the LD_PRELOAD variable:
prompt> unsetenv LD_PRELOAD
One major problem with this example is that it doesn't actually call the underlying close() routine, so the file descriptor is never really closed.
Here
is a simple example of a better version of close.
Also, make sure to link with libdl when you make your shared library.
prompt> gcc -shared -o libfsprotect.so -fpic -Wall libfsprotect.2.c -ldl
For your convenience, you can also add a "constructor" and "destructor" to your library, routines that run automatically when the library is loaded and unloaded.
Here
is a simple example of the close library
with a constructor and destructor.
From there, try to figure out which calls real applications use. Tools like
strace
(run "strace program" on the command line to see what system
calls "program" calls) can be very useful. Also, the program "objdump"
might be handy, to look for symbols in the executable (the -t flag is handy).
However, how much time you spend on this is up to you -- if you get
everything working for the basic interfaces (open(), close(), lseek(),
read(), write(), unlink(), ftruncate()/truncate()), that will be
good enough for full credit. Doing the fuller interface just
lets you run more standard programs on top of your library.
Note:
One interface you don't have to deal with is mmap().
mmap() allows processes to map a file into their address space
and then access it as if it were memory. Catching all reads and
writes to a file thus becomes the task of catching all loads and
stores to a memory region of a process, which is painful (at best).
Hence, in this project, you don't have to handle mmap().
Your job is to figure out how to manage this meta-data in a simple
and efficient manner. It is assumed that the user
has set an environment variable FSPROTECT_HOME to the directory where
this metadata should live. If this variable isn't set, you should
print an error message and exit. Use getenv() to read
this variable, and store any relevant information that you need
about the files you are protecting in this directory.
Note:
The way you manage your meta-data is one major design
aspect of this project. You have complete freedom here, but use that
freedom wisely. Think about what you need to store on disk in the
FSPROTECT_HOME directory (parity, checksums) and then design a scheme
that accomplishes that.
Bootstrapping:
One problem you will encounter is that the proper
meta-data (checksums, parity) for a file has not yet been created. In
this case, when the file is accessed, you should go ahead and "bootstrap"
the file -- that is, you should compute the checksums and parity for
the file, and store them properly in the FSPROTECT_HOME directory.
Dealing with Multiple Processes:
If multiple processes are
accessing protected data at the same time, there is a chance that
multiple concurrent updates will occur to your meta-data. Such
a race condition could lead to inconsistent meta-data, which would
then perhaps lead to corrupt data being given back to the user.
Design your system with file locking to avoid this problem.
Dealing with Crashes:
If a process using your library crashes
while updating data but (for example) before updating the checksums
or parity information, you could have an on-disk consistency problem,
which again could lead to the user receiving bad data. Design your
system to handle the case where a crash occurs in an untimely manner.